Azure Data Flow Conditional Flatten
In this task, we are going to flatten a JSON file stored in a container in our storage account.
In the JSON file, the data is an array of albums, each holding a nested array of tracks, so we are going to flatten it into individual track records.
This is the JSON file data:
{"albums":[{"name":"SomeSongs","tracks":[{"trackid":1,"name":"Song1"},{"trackid":2,"name":"Song2"}]},{"name":"EvenMoreSongs","tracks":[{"trackid":1,"name":"Song3"},{"trackid":2,"name":"Song4"}]}]}
1. Create a linked service to Azure Blob Storage
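For reference, a Blob Storage linked service is stored behind the Data Factory UI as a small JSON definition. The name and connection string below are placeholders; a minimal sketch looks roughly like this:
{
  "name": "AzureBlobStorageLS",
  "properties": {
    "type": "AzureBlobStorage",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=<storage-account>;AccountKey=<account-key>;EndpointSuffix=core.windows.net"
    }
  }
}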
2. Create datasets
We create two datasets: one for the input JSON file and one for the output JSON file.
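Each dataset is also backed by a JSON definition that references the linked service and points at a location in the container. The dataset names (here hypothetically InputAlbumsJson and OutputTracksJson), container, and file name are placeholders; the input dataset might look roughly like this, and the output dataset differs only in its location:
{
  "name": "InputAlbumsJson",
  "properties": {
    "linkedServiceName": {
      "referenceName": "AzureBlobStorageLS",
      "type": "LinkedServiceReference"
    },
    "type": "Json",
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "input",
        "fileName": "albums.json"
      }
    }
  }
}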
3. Create the data flow
Turn on the Data flow debug option at the top.
Set the source to the input JSON dataset.
Choose Flatten under Schema modifier for the transformation.
In the flatten settings, set Unroll by to albums.tracks (a trimmed sketch of the saved data flow definition follows these steps).
If we look at the data preview, we can see the data has been flattened into one row per track, matching the records shown earlier.
We need to set the sink options to output to a single file and give it a file name.
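Behind the designer, the data flow itself is saved as a JSON resource that wires the input dataset to the source, the output dataset to the sink, and lists the flatten transformation. The sketch below is trimmed and uses placeholder names; the unroll-by setting and column mapping configured above end up in a script section generated by the UI, which is omitted here:
{
  "name": "FlattenAlbumsDataFlow",
  "properties": {
    "type": "MappingDataFlow",
    "typeProperties": {
      "sources": [
        {
          "name": "sourceAlbums",
          "dataset": { "referenceName": "InputAlbumsJson", "type": "DatasetReference" }
        }
      ],
      "transformations": [
        { "name": "flattenTracks" }
      ],
      "sinks": [
        {
          "name": "sinkTracks",
          "dataset": { "referenceName": "OutputTracksJson", "type": "DatasetReference" }
        }
      ]
    }
  }
}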
Now we create the pipeline by dragging this data flow onto the pipeline canvas, and then we trigger it.
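The resulting pipeline wraps the data flow in an Execute Data Flow activity. A minimal sketch, again with placeholder names, looks like this:
{
  "name": "FlattenAlbumsPipeline",
  "properties": {
    "activities": [
      {
        "name": "RunFlattenDataFlow",
        "type": "ExecuteDataFlow",
        "typeProperties": {
          "dataFlow": {
            "referenceName": "FlattenAlbumsDataFlow",
            "type": "DataFlowReference"
          }
        }
      }
    ]
  }
}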
If we go to the container, we can see that a new file containing the flattened records has been created.
Done !!!
Data flows allow us to manipulate and reshape JSON files directly within Azure Data Factory.