Serverless Data Integration using Azure Data Factory

shashi
3 min read · Mar 10, 2022

Data is the building block of any business. It is needed to steer the business in the intended direction, and it arrives from many sources in many forms. The business always needs to process that data to extract insights that help its products and projects reach new heights. But acquiring the data is the tedious part, not to mention the wrangling, which is often said to take up around 70% of the time.

Data Acquisition

Data integration generally involves bringing structured or unstructured data together in one place, followed by any necessary transformation or wrangling. Many tools and packages can integrate data from a variety of sources. One such tool is Azure Data Factory.

Azure Data Factory: Serverless Data Integrator

Azure Data Factory, or ADF, is a serverless data integration and ETL service offered by Microsoft Azure. It supports a wide range of sources and sinks for extracting data and loading it into a destination.

Some of the popular connectors for ADF are listed below:

  • Amazon S3
  • Azure SQL
  • MongoDB
  • Amazon RDS
  • SQL Server
  • Azure Blob Storage
  • Azure Cosmos DB
  • …and many more

The complete list of supported connectors is available in the official Azure Data Factory documentation.

In ADF, data moves from the source to the sink through a pipeline. A pipeline extracts the data from the source and transforms it if necessary, then loads or inserts it into the sink. This entire process is known as ETL (extract, transform, load).

ADF provides several ways to build and deploy a pipeline:

- SDKs
- REST APIs
- Azure CLI (az)

Here are some commonly used Azure CLI commands:

Create a data factory

ADF create command
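The original command is not reproduced in the text, so the following is a minimal sketch of how a data factory might be created with the Azure CLI. It assumes the `datafactory` CLI extension is installed and that you are already logged in; the resource group, factory name, and location are hypothetical placeholders.

```shell
# Hypothetical placeholder values; replace them with your own.
RESOURCE_GROUP="exampleResourceGroup"
FACTORY_NAME="exampleFactoryName"
LOCATION="eastus"

# Wrapped in a function so that nothing runs until it is called.
# Assumes a logged-in Azure CLI with the datafactory extension
# (installable with: az extension add --name datafactory).
create_factory() {
  az datafactory create \
    --resource-group "$RESOURCE_GROUP" \
    --factory-name "$FACTORY_NAME" \
    --location "$LOCATION"
}
```

Calling `create_factory` sends the request; the resource group is assumed to exist already.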

Create a pipeline with a ForEach activity

for each activity
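As a rough sketch, a pipeline could be created from a JSON definition file along these lines. The names are hypothetical placeholders, and the command assumes that a file called forEach.json exists in the current directory and that the `datafactory` CLI extension is installed.

```shell
# Hypothetical placeholder values; replace them with your own.
RESOURCE_GROUP="exampleResourceGroup"
FACTORY_NAME="exampleFactoryName"
PIPELINE_NAME="examplePipeline"

# Wrapped in a function so that nothing runs until it is called.
# The @ prefix tells the Azure CLI to read the value from a file.
create_pipeline() {
  az datafactory pipeline create \
    --resource-group "$RESOURCE_GROUP" \
    --factory-name "$FACTORY_NAME" \
    --name "$PIPELINE_NAME" \
    --pipeline "@forEach.json"
}
```

Passing the definition as a file keeps the command readable compared with escaping the whole JSON document inline.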

Creating an activity in ADF requires a JSON definition, which can be passed as a file. The enclosed forEach.json looks as follows:
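The author's file is not included in the text, so the snippet below is only an illustrative sketch of what such a definition might contain. It writes a forEach.json with a single ForEach activity that iterates over a pipeline parameter and runs a Wait activity per item. The activity names, the `InputList` parameter, and the choice of a Wait activity as the inner step are all assumptions made for the example; a real pipeline would more likely wrap a Copy activity.

```shell
# Write an illustrative pipeline definition to forEach.json.
# The quoted 'EOF' keeps the shell from expanding anything inside.
# All names below (InputList, Example...Activity) are hypothetical.
cat > forEach.json <<'EOF'
{
  "parameters": {
    "InputList": { "type": "Array" }
  },
  "activities": [
    {
      "name": "ExampleForEachActivity",
      "type": "ForEach",
      "typeProperties": {
        "isSequential": true,
        "items": {
          "type": "Expression",
          "value": "@pipeline().parameters.InputList"
        },
        "activities": [
          {
            "name": "ExampleWaitActivity",
            "type": "Wait",
            "typeProperties": { "waitTimeInSeconds": 1 }
          }
        ]
      }
    }
  ]
}
EOF
```

The `items` expression supplies the collection to loop over, and `isSequential` controls whether the inner activities run one after another or in parallel.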

Delete a data factory (the optional -y flag skips the confirmation prompt)

ADF delete command
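A minimal sketch of the delete call, under the same assumptions as before (logged-in Azure CLI, `datafactory` extension installed, hypothetical placeholder names):

```shell
# Hypothetical placeholder values; replace them with your own.
RESOURCE_GROUP="exampleResourceGroup"
FACTORY_NAME="exampleFactoryName"

# Wrapped in a function so that nothing runs until it is called.
# -y answers the confirmation prompt automatically; drop it to be asked first.
delete_factory() {
  az datafactory delete \
    --resource-group "$RESOURCE_GROUP" \
    --factory-name "$FACTORY_NAME" \
    -y
}
```

Deleting a factory is not reversible, so leaving out `-y` is the safer default outside of scripts.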
