Introduction
Azure has introduced a new service called “Azure SQL Edge” which provides containerized hot storage capability at the edge itself. As per the documentation, “Azure SQL Edge is an optimized relational database engine geared for IoT and IoT Edge deployments”. This tool enriches the concept of the intelligent edge and distributed processing by providing an excellent option to store a high-speed stream of data in tabular form. Azure SQL Edge uses the same engine as SQL Server, with built-in streaming capability.
Ref: https://docs.microsoft.com/en-us/azure/azure-sql-edge/release-notes
Major Storage Challenge
Using the Azure IoT Edge architecture, we can develop applications that run on Linux and Windows. This containerized architecture is capable of handling the remote management of modules. Even though we have efficient programming capability at the edge, we still depend on an internet connection to send and receive data from the cloud. In IoT scenarios we deal with high-frequency data, and processing can only be effective if we store the relevant data at the edge. File-based storage at the edge was not effective from many perspectives.
Microsoft SQL Server
Microsoft SQL Server is a popular relational database system developed by Microsoft. It is widely used in many heavy, data-centric applications. It is possible to access an on-premises MS SQL Server instance to read and write data from an IoT Edge module, but in that case the SQL instance needs to be managed separately.
Implemented Use case
The requirement is to read 10 OPC tags from an OPC server, which change at a 100-millisecond frequency. These tags need to be stored in a SQL database at the edge (AzureSQLEdge); around 10 messages per second are expected. It is also required to compute the average value of these 10 tags over 10 seconds and send those averages to the cloud.
OPC Tags: 10 Tags
One Message (10 Tags): 2 KB
Expected Volume: one OPC tag can change up to 20 times per second, which would be up to 200 messages per second for 10 tags.
Azure SQL EDGE – On Deployment
OPC Server
A Kepware OPC server is set up in an Azure VM to generate OPC data. Ten tags are configured with a 100-millisecond data change frequency. The port required to communicate with this OPC server (port 49320) is opened in the Windows firewall and in the Azure VM firewall. For this trial I have not enabled certificate-based communication in Kepware; it is set to “None”. In general, the OPC server’s connection URL will look like the one below.
“opc.tcp://<IP Address of VM>:<Port number>”
OPC Reader
I used the Microsoft “opcpublisher” 2.5.3 IoT Edge module as the OPC reader. This is an IoT Edge module that can interact with the OPC server in “pub/sub” mode. The connection endpoint and tag names need to be added to this module’s configuration file (pn.json); a sample is sketched after the create options below. The container create option of the “opcreader” module is as below.
{
    "Hostname": "opc-publisher",
    "Cmd": [
        "--pf=/<user folder>/pn.json",
        "--tc=/<user folder>/telemetryconfiguration.json",
        "--aa",
        "--di=10",
        "--si=0",
        "--ms=0",
        "--mq=24000"
    ],
    "HostConfig": {
        "Binds": [
            "/var/iiotedge:/appdata"
        ],
        "LogConfig": {
            "Type": "json-file",
            "Config": {
                "max-size": "100k",
                "max-file": "3"
            }
        }
    }
}
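The pn.json (published nodes) file passed through the --pf argument tells opcpublisher which endpoint and nodes to subscribe to. The snippet below is only a sketch: UseSecurity is set to false to match the disabled certificate security mentioned above, and the Kepware-style node IDs (Channel1.Device1.RCTR_TEMP and so on) are assumptions that must be replaced with the node IDs exposed by your own OPC server.
[
    {
        "EndpointUrl": "opc.tcp://<IP Address of VM>:49320",
        "UseSecurity": false,
        "OpcNodes": [
            {
                "Id": "ns=2;s=Channel1.Device1.RCTR_TEMP",
                "OpcSamplingInterval": 100,
                "OpcPublishingInterval": 100
            },
            {
                "Id": "ns=2;s=Channel1.Device1.M1_FLWRATE",
                "OpcSamplingInterval": 100,
                "OpcPublishingInterval": 100
            }
        ]
    }
]
The sampling and publishing intervals are set to 100 milliseconds here to match the tag change frequency, and the flat tag/tagval/tagtime structure shown next is produced by the field mapping in telemetryconfiguration.json passed through the --tc argument.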
The output of this module is a flat JSON array with the structure below.
[
    {
        "tag": "RCTR_TEMP",
        "tagval": 124.25,
        "tagtime": "2020-11-18T11:45:20.0000000Z"
    },
    {
        "tag": "M1_FLWRATE",
        "tagval": 108.00000000000003,
        "tagtime": "2020-11-18T11:45:20.0000000Z"
    }
]
AzureSQLEDGE
This is the Azure Marketplace module for Azure SQL Edge. Please refer to this link for directions on deploying the module from the Azure Marketplace. The details below are required during deployment.
Environment Variables
Variable Name          Value
ACCEPT_EULA            Y
MSSQL_SA_PASSWORD      <Password>
MSSQL_LCID             1033
MSSQL_COLLATION        SQL_Latin1_General_CP1_CI_AS
Container Create Option
{
    "HostConfig": {
        "CapAdd": [
            "SYS_PTRACE"
        ],
        "Binds": [
            "sqlvolume:/sqlvolume"
        ],
        "PortBindings": {
            "1433/tcp": [
                {
                    "HostPort": "1433"
                }
            ]
        },
        "Mounts": [
            {
                "Type": "volume",
                "Source": "sqlvolume",
                "Target": "/var/opt/mssql"
            }
        ]
    },
    "Env": [
        "MSSQL_AGENT_ENABLED=TRUE",
        "ClientTransportType=AMQP_TCP_Only",
        "PlanId=asde-developer-on-iot-edge"
    ]
}
Based on this deployment, IoT Edge will download the latest image from the Microsoft container registry and create the module. By default, SQL Server uses port 1433 for establishing connections.
SQL Server Management Studio
https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-ver15
SQL Server Management Studio (SSMS) is an integrated environment for managing any SQL infrastructure, from SQL Server to Azure SQL Database. SSMS provides tools to configure, monitor, and administer instances of SQL Server and databases. Use SSMS to deploy, monitor, and upgrade the data-tier components used by your applications, and build queries and scripts.
In my case, I used another Windows VM and installed SSMS on it. After that, port 1433 needs to be opened in the Windows firewall of that VM, and the network inbound and outbound port rules should be opened on both the Windows machine and the Edge VM.
Azure SQL Edge behaves like a local SQL database instance, and SSMS provides a better development experience when designing tables, but it is not a mandatory tool for setting up SQL Edge.
Set Up SQL EDGE
A few objects need to be set up sequentially to create a streaming job. Once the streaming job is created, it can insert the input data into SQL Server tables. The objects required to create a streaming job are listed below.
MASTER KEY ENCRYPTION
Query
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Your Password>';
The database master key is a symmetric key used to protect the private keys of certificates and asymmetric keys that are present in the database. When it is created, the master key is encrypted by using the AES_256 algorithm and a user-supplied password.
EXTERNAL FILE FORMAT
Query
If NOT Exists (select name from sys.external_file_formats where name = 'JSONFormat')
Begin
CREATE EXTERNAL FILE FORMAT [JSONFormat]
WITH ( FORMAT_TYPE = JSON)
End
EXTERNAL DATA SOURCE
Query
If NOT Exists (select name from sys.external_data_sources where name = 'EdgeHub')
Begin
Create EXTERNAL DATA SOURCE [EdgeHub]
With(
LOCATION = N'edgehub://'
)
End
External Stream - Input
Query
If NOT Exists (select name from sys.external_streams where name = 'opcInput')
Begin
CREATE EXTERNAL STREAM opcInput WITH
(
DATA_SOURCE = EdgeHub,
FILE_FORMAT = JSONFormat,
LOCATION = N'opcMeasures'
)
End
DATABASE SCOPED CREDENTIAL
Query
If NOT Exists (select name from sys.database_scoped_credentials where name = 'SQLCredential')
Begin
    -- Declare the variables used to build the dynamic SQL; replace the placeholder with the SA password
    Declare @SQL_SA_Password varchar(200) = '<Your Password>'
    Declare @query varchar(max)
    set @query = 'CREATE DATABASE SCOPED CREDENTIAL SQLCredential
    WITH IDENTITY = ''sa'', SECRET = ''' + @SQL_SA_Password + ''''
    Execute(@query)
End
External Data Source
Query
If NOT Exists (select name from sys.external_data_sources where name = 'LocalSQLOutput')
Begin
CREATE EXTERNAL DATA SOURCE LocalSQLOutput WITH (
LOCATION = 'sqlserver://tcp:.,1433',CREDENTIAL = SQLCredential)
End
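The output stream created in the next step points to the table pocdb.dbo.opcrawtelemetry, so that database and table must exist before the streaming job runs. The script below is only a sketch: the column types are assumptions inferred from the sample JSON payload (tag, tagval, tagtime) and can be adjusted as needed.
-- Sketch: create the target database and table for the streaming job output.
-- Column names must match the fields selected by the streaming job; the types are assumed.
If DB_ID('pocdb') IS NULL
    CREATE DATABASE pocdb;
GO
USE pocdb;
GO
If OBJECT_ID('dbo.opcrawtelemetry') IS NULL
    CREATE TABLE dbo.opcrawtelemetry
    (
        tag     varchar(100) NOT NULL,
        tagval  float        NULL,
        tagtime datetime2(7) NULL
    );
GO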
External Stream – Output
Query
If NOT Exists (select name from sys.external_streams where name = 'opcOutput')
Begin
CREATE EXTERNAL STREAM opcOutput WITH
(
DATA_SOURCE = LocalSQLOutput,
LOCATION = N'pocdb.dbo.opcrawtelemetry'
)
End
Create Stream Job
Query
EXEC sys.sp_create_streaming_job @name=N'opcdata',
@statement= N'Select i.tag as tag,i.tagval as tagval,i.tagtime as tagtime INTO opcOutput from opcInput i'
Start Stream Job
Query
exec sys.sp_start_streaming_job @name=N'opcdata'
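The use case also calls for sending the 10-second average of each tag to the cloud. That part was not shown above, but a second streaming job along the lines below could handle it, assuming the TumblingWindow aggregation of the Stream Analytics query language is available in the SQL Edge streaming engine; the opcAvgOutput stream and the opcAverages EdgeHub output name are hypothetical.
-- Sketch: an additional EdgeHub output stream for the aggregated values (hypothetical name)
If NOT Exists (select name from sys.external_streams where name = 'opcAvgOutput')
Begin
CREATE EXTERNAL STREAM opcAvgOutput WITH
(
DATA_SOURCE = EdgeHub,
FILE_FORMAT = JSONFormat,
LOCATION = N'opcAverages'
)
End

-- Sketch: 10-second tumbling-window average per tag, based on message arrival time
EXEC sys.sp_create_streaming_job @name=N'opcavgdata',
@statement= N'Select i.tag as tag, AVG(i.tagval) as avgval INTO opcAvgOutput from opcInput i GROUP BY i.tag, TumblingWindow(second, 10)'

exec sys.sp_start_streaming_job @name=N'opcavgdata'
The aggregated messages would then appear on the SQL Edge module's opcAverages output and could be routed to $upstream with another edgeHub route.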
Once these steps are completed, we can route the messages from the OPC reader module to this SQL Edge module, and we need to make sure the columns referenced in the streaming job query (tag, tagval, tagtime) are available in the messages.
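The route itself goes into the $edgeHub section of the deployment manifest. The snippet below is a sketch that assumes the OPC reader module is named "opcreader" and the SQL module is named "AzureSQLEdge"; the input name must match the LOCATION (opcMeasures) of the opcInput external stream created earlier.
"routes": {
    "opcToSql": "FROM /messages/modules/opcreader/outputs/* INTO BrokeredEndpoint(\"/modules/AzureSQLEdge/inputs/opcMeasures\")"
}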
Statistics
EDGE Hardware
Type: Azure VM
OS: Ubuntu 18.04 LTS
Spec: D2s V3
RAM: 8 GB
Data Used
Number of OPC Tags used: 10 Tags
Tag changing frequency: 100 millisecond
Message size per Tag: 105 Byte
Messages per Tag per second: 10 messages
Total Number of messages per second: 100 messages
Number of messages inserted into SQLEdge: approximately 2 million rows (1,988,976); a verification query is included after this list.
Average latency during insert: 234 milliseconds
Searching latency: xxxx <WIP>
Data persistence: yes, in a Docker volume. By default, the module is configured to store the data files at a physical location, so removing and re-adding the SQLEdge module will not lead to data loss. The volume mounting path should remain the same when re-deploying the module.
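The insert count above can be cross-checked with a simple query against the destination table (assuming the pocdb.dbo.opcrawtelemetry table sketched earlier):
-- Count the rows ingested by the streaming job
SELECT COUNT(*) AS ingested_rows FROM pocdb.dbo.opcrawtelemetry;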
Pricing of Azure SQL EDGE
https://azure.microsoft.com/en-us/pricing/details/sql-edge/
PAY AS YOU GO            1 YEAR RESERVED         3 YEAR RESERVED
$10.001/device/month     $100/device/year        $60/device/year
The Conclusion
Azure SQL Edge is a promising service provided by Azure. It can very well act as storage on the edge side and can interact with various modules and other on-premises applications. Along with the characteristics of SQL Server, it has the capability to ingest high-frequency data streams as well.