...

Azure Data Factory Solution – Case Study

Azure Data Factory Logo

How Smart Data Helped A Client Streamline Their Data Management

Using Azure Data Factory, Smart Data was able to create a customized solution for the client that streamlined their data management processes and provided valuable insights into their business operations.

Healthcare IT Services Case Study

Problem/Objective:

Data load tasks at client location are taken care through inhouse applications that are developed and maintained separately by each team. While teams are able to program it and maintain it without any problem, it is an added task for each enhancement that are done. Data Load applications also require development effort to make necessary changes and required testing and deployment as a regular app. Client is using different data sources which includes relations databases On Premise SQL Server, Azure SQL and Oracle. In addition to relation databases data is also funneled through excel files.

With business growing client anticipates increase in data transfer requirement in near future and was looking to automate ETL process or a stable product that could handle heavy load and process data on scheduled time and provide notification ability to concerned stake holders.

To change process and evaluate new client requested us to start with a small data load project that can read excel files, extract data and transfer data to Azure SQL database. Along with ETL process it was required that field engineers get access to shared folder on azure storage to drop files. Also, it was required that notification be sent out to stakeholders upon successful or unsuccessful transfer with detailed processing information.

REQUIREMENTS
• Process excel files (Read, extract data)
• Ingest data into a database
• Keep raw files
• Account for every file
• Ongoing low effort maintenance, cost-efficient and automated
• Send email at the end of successful or unsuccessful transaction

Technical Solution:

After reviewing the challenges, Smart Data team were able to provide solution and recommended a solution built on Azure Data Factory (ADF), Microsoft’s Extract-Transform-Load (ETL) solution for Azure.

While there are many ETL solutions that can run on any infrastructure, this is very much a native Azure service especially as client is already using Azure services for many of its products and deployments. It also easily ties into the other services Microsoft offers.

N

Azure Storage Container

Field Engineers to drop excel files on azure storage container. Proper access is also enabled.
N

Setup SQL Database (ADF access) user login.

A new user login was created in Destination SQL Database for ADF use to sink data in required tables.
N

Azure Key Vault

Azure Key Vault is used to store secrets such as database login information.
N

Azure Data Factory

ADF to setup ETL tasks and pipeline to process data.

  • Create Linked Services
    • Azure Storage container (Source files)
    • Azure Key Vault (To get secret values)
    • Azure SQL Database (Destination)
  • Create Datasets
    • Dataset setup for source and destination
  • Created Dataflows
    • Dataflows in ADF handles tasks of reading excel file from Azure Storage container, check for duplications of data, validate data, add conditions to sink valid data to SQL database and create a new file in azure storage with information of ignored data (invalid data).
    • Create Pipeline to run Dataflow that was created, add tasks for successful and unsuccessful email messages with information on how many records are updated to SQL and information about location of data that was not processed.
N

Provisioned Azure Logic apps

ADF do not have an inbuilt functionality to send emails. Logic apps is a viable solution to achieve mail functionality in azure and ADF can call post messages to provided logic apps url with required information, which can be used by Logic apps mail service.

N

Azure DevOps (Pipelines and repository)

Azure DevOps project which contains repository for Azure Data Factory. Azure DevOps is used to automate deployment of Azure Data Factory. Continuous deployment of ADF to different environments such as DEV, QA, Prod.

Outcome:

The client was very pleased with the ADF and Logic Apps solution for notifications. This Azure Data Factory solution provided a robust and futuristic solution freeing up development resources to work on actual product instead of worrying on developing applications for data load. Since ADF Is Azure native service it made it easy for team to get it approved by architecture team for use and they intend to extend the use of ADF for all ETL process going forward.

Get in touch

Ready to enhance your business with cutting-edge technology?

Address

8571 Gander Creek Dr.
Miamisburg, OH 45342

Email Us

info@smartdata.net

Phone

(937) 886-9166

Social Media