Wednesday 25 November 2020

Azure SQL EDGE - Advantages

 

Introduction

Azure has introduced a new service called “Azure SQL Edge” which facilitates the containerized hot storage capability at EDGE itself. As per the documentation “Azure SQL Edge is an optimized relational database engine geared for IoT and IoT Edge deployments”. This tool is really enriching the concept of intelligent edge and distributed processing by proving a wonderful option to a store high-speed stream of data to a tabular shape. Azure-SQL-Edge is using the same engine of SQL server with streaming capability.

Ref: https://docs.microsoft.com/en-us/azure/azure-sql-edge/release-notes


 Major Storage Challenge

Using Azure IoT Edge architecture, we can develop any type of application, which can run in Linux and windows. This containerized architecture is capable to handle the remote management of modules. Even though we are able to utilize efficient programming capability, we need to depend on an internet connection to send and receive data from the cloud system. In the case of IoT, we are dealing with high-frequency data, processing can only be effective if we store the relevant data at EDGE. The file storage at EDGE was not effective from many perspectives.

 Microsoft SQL Server

Microsoft SQL Server is a popular relational database system developed by Microsoft. MS SQL Server is widely using in many heavy data centric application. It is possible to access MS SQL server instance, which is running on premise to read and write data from an IoT EDGE Module. In such case, the SQL instance need to be managed separately.

Implemented Use case

Need to read 10 OPC tags from an OPC Server, which are changing in the 100 millisecond frequency. These tags are required to store to an SQL database at EDGE (AzureSQLEdge). 10 messages are expecting per second. It is required to find the average value of these 10 tags for 10 seconds and that average values need to send to the cloud.

OPC Tags: 10 Tags

One Message (10 Tags): 2 KB

Expected Volume: One OPC Tag will be changing 20 times in a second.

For 10 tags, 200 messages per second.

Azure SQL EDGE – On Deployment



 

OPCServer

A Kepware OPC server is set up in an Azure VM to generate OPC data. 10 tags are configured with 100-millisecond data change frequency. Required port to communicate to this OPC server (Port 49320) is opened in windows firewall and Azure VM firewall. For this this trial I have not enabled the certificate based communication in Kepware. It is in “Non” state. In general, the opc server’s connection url will look like as below.

opc.tcp://<IP Address of VM>:<Port number>

 OPC Reader

I used the Microsoft “opcpublisher” 2.5.3 IoT Edge module as an OPC Reader. This is an IoT EDGE Module, which can interact with the OPC Server in “pub/sub” mode. The connection string and tag names need to add to this module’s configuration file. Container create option of “opcreader” module is as below

{

"Hostname": "opc-publisher",

"Cmd": [

"--pf=/<user folder>/pn.json",

"--tc=/<user folder>/telemetryconfiguration.json",

"--aa",

"--di=10",

"--si=0",

"--ms=0",

"--mq=24000"

],

"HostConfig": {

"Binds": [

"/var/iiotedge:/appdata"

],

"LogConfig": {

"Type": "json-file",

"Config": {

"max-size": "100k",

"max-file": 3

}

}

}

}

The output of this module is a flat json array of below structure

[{

"tag": "RCTR_TEMP",

"tagval": 124.25,

"tagtime": "2020-11-18T11:45:20.0000000Z"

}, {

"tag": "M1_FLWRATE",

"tagval": 108.00000000000003,

"tagtime": "2020-11-18T11:45:20.0000000Z"

}]

 AzureSQLEDGE



This is the Azure Market place module for Azure SQL EDGE. Please refer this link to get the direction to deploy this module from Azure Market Place. During deployment below details are required


Environment Variables



Variable Name

Value

ACCEPT_EULA

Y

MSSQL_SA_PASSWORD

<Password>

MSSQL_LCID

1033

MSSQL_COLLATION

SQL_Latin1_General_CP1_CI_AS



Container Create Option

{

"HostConfig": {

"CapAdd": [

"SYS_PTRACE"

],

"Binds": [

"sqlvolume:/sqlvolume"

],

"PortBindings": {

"1433/tcp": [

{

"HostPort": "1433"

}

]

},

"Mounts": [

{

"Type": "volume",

"Source": "sqlvolume",

"Target": "/var/opt/mssql"

}

]

},

"Env": [

"MSSQL_AGENT_ENABLED=TRUE",

"ClientTransportType=AMQP_TCP_Only",

"PlanId=asde-developer-on-iot-edge"

]

}

Based on deployment, EDGE will download the latest image from the Microsoft container registry and will create the module. By default, SQL is using port 1433 for establishing the connection.

 SQL Server Management Studio


https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-ver15

SQL Server Management Studio (SSMS) is an integrated environment for managing any SQL infrastructure, from SQL Server to Azure SQL Database. SSMS provides tools to configure, monitor, and administer instances of SQL Server and databases. Use SSMS to deploy, monitor, and upgrade the data-tier components used by your applications, and build queries and scripts.

In my case, I have used one another Windows VM and have installed the SSMS in that VM. After it is required to open the port 1433 port in windows firewall of windows VM. Network inbound and outbound port rules should be opened in a windows machine and EDGE VM.




The Azure SQL Edge is behaving similar to a local SQL Database instance and SSMS can provide a better development experience when we need to design the tables, but this is not a mandatory tool to set up the SQL EDGE.

Set Up SQL EDGE

There are few objects required to setup sequentially to create a streaming job. Once the streaming created, the streaming job can insert the input data to the SQL server tables. The objects required to create a streaming job is mentioned below.

  1. MASTER KEY ENCRYPTION

Query

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Your Password>';

The database master key is a symmetric key used to protect the private keys of certificates and asymmetric keys that are present in the database. When it is created, the master key is encrypted by using the AES_256 algorithm and a user-supplied password.

  1. EXTERNAL FILE FORMAT

Query

If NOT Exists (select name from sys.external_file_formats where name = 'JSONFormat')

Begin

CREATE EXTERNAL FILE FORMAT [JSONFormat]

WITH ( FORMAT_TYPE = JSON)

End

  1. EXTERNAL DATA SOURCE

Query

If NOT Exists (select name from sys.external_data_sources where name = 'EdgeHub')

Begin

Create EXTERNAL DATA SOURCE [EdgeHub]

With(

LOCATION = N'edgehub://'

)

End

  1. External Stream - Input

Query

If NOT Exists (select name from sys.external_streams where name = 'opcInput')

Begin

CREATE EXTERNAL STREAM opcInput WITH

(

DATA_SOURCE = EdgeHub,

FILE_FORMAT = JSONFormat,

LOCATION = N'opcMeasures'

)

End



  1. DATABASE SCOPED CREDENTIAL

Query

If NOT Exists (select name from sys.database_scoped_credentials where name 'SQLCredential')

Begin

set @query = 'CREATE DATABASE SCOPED CREDENTIAL SQLCredential

WITH IDENTITY = ''sa'', SECRET = ''' + @SQL_SA_Password + ''''

Execute(@query)

End



  1. External Data Source

Query

If NOT Exists (select name from sys.external_data_sources where name = 'LocalSQLOutput')

Begin

CREATE EXTERNAL DATA SOURCE LocalSQLOutput WITH (

LOCATION = 'sqlserver://tcp:.,1433',CREDENTIAL = SQLCredential)

End

  1. External Stream – Output

Query

If NOT Exists (select name from sys.external_streams where name = 'opcOutput')

Begin

CREATE EXTERNAL STREAM opcOutput WITH

(

DATA_SOURCE = LocalSQLOutput,

LOCATION = N'pocdb.dbo.opcrawtelemetry'

)

End


  1. Create Stream Job

Query

EXEC sys.sp_create_streaming_job @name=N'opcdata',

@statement= N'Select i.tag as tag,i.tagval as tagval,i.tagtime as tagtime INTO opcOutput from opcInput i'


  1. Start Stream Job

Query

exec sys.sp_start_streaming_job @name=N'opcdata'

Once these steps are completed, we can route the message from the message generator to this SQLEdge module, and it is required to make sure the columns specified in the step 8 is available in the messages.

 Statistics

    EDGE Hardware

            Type: Azure VM

            OS: Ubuntu 18.04 LTS

            Spec: D2s V3

            RAM: 8 GB

            Vcpus : 2


    Data Used

            Number of OPC Tags used: 10 Tags

            Tag changing frequency: 100 millisecond

            Message size per Tag: 105 Byte

            Messages per Tag per second: 10 messages

            Total Number of messages per second: 100 messages

            Number of messages inserted to SQLEdge: Approximately 2million rows                      (1988976)

            Average latency during insert: 234 milliseconds

            Searching latency: xxxx <WIP>

            Data persistence: Yes in Docker volume. By default, it is configured to                 store the data files to the             physical location and removing and                 adding the SQLEdge module will not lead to data miss.                                       Volume mounting path should be same while re-deploying the module.

 Pricing of Azure SQL EDGE

        https://azure.microsoft.com/en-us/pricing/details/sql-edge/

PAY AS YOU GO

1 YEAR RESERVED

3 YEAR RESERVED

$10.001/device/month

$100/device/year

$60/device/year


 The Conclusion

 AzureSQLEdge is a promising feature providing by Azure. This can very well act as storage at EDGE side and can interact with various modules and other on premises applications. Along with the characteristics of SQL Server, this is having the capability to ingest high frequent data stream as well.