
I have a nested JSON file stored in a Data Lake on Azure. It has this format:

    {"proto": "01",
     "type": "A",
     "description": "heartbeat",
     "geometry": {"y0_1": {"tag": "Normal", "probability": 0.40, "x": 39, "y": 13},
                  "y0_2": {"tag": "category_3", "probability": 0.8, "x": 48, "y": 13},
                  "y0_3": {"tag": "Normal", "probability": 0.9, "x": 27, "y": 10},
                  "Test": {"proba": 0.65}}}

I want to create an ADF pipeline (with triggers) to move it from the Data Lake to Azure SQL. The problem is that when I create a Copy activity, ADF doesn't recognize the mapping. It creates a table with 4 columns: proto, type, description, and a 4th one, geometry, which contains all the rest of the JSON file in a single row. What I want instead is an output table in this format:

    proto  type  description  tag         probability  x   y   proba
    01     A     heartbeat    Normal      0.40         39  13  0.65
    01     A     heartbeat    category_3  0.8          48  13  0.65
    01     A     heartbeat    Normal      0.9          27  10  0.65
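
To make the wanted mapping concrete, here is a minimal Python sketch of the transformation: one row per entry under geometry, with the top-level fields and the Test.proba value repeated on every row. The function name `flatten` and the `shared_key` parameter are hypothetical names used only for this illustration, and the sketch assumes every entry other than Test carries tag, probability, x and y. It only shows the target shape; it is not an ADF feature.

```python
import json

SAMPLE = """
{"proto": "01",
 "type": "A",
 "description": "heartbeat",
 "geometry": {"y0_1": {"tag": "Normal", "probability": 0.40, "x": 39, "y": 13},
              "y0_2": {"tag": "category_3", "probability": 0.8, "x": 48, "y": 13},
              "y0_3": {"tag": "Normal", "probability": 0.9, "x": 27, "y": 10},
              "Test": {"proba": 0.65}}}
"""


def flatten(doc, shared_key="Test"):
    """Return one flat row per geometry entry (hypothetical helper)."""
    geometry = doc.get("geometry", {})
    # The value under geometry.Test.proba is shared by every output row.
    proba = geometry.get(shared_key, {}).get("proba")
    rows = []
    for key, entry in geometry.items():
        if key == shared_key:
            continue  # "Test" holds the shared value, it is not a row itself
        rows.append({
            "proto": doc["proto"],
            "type": doc["type"],
            "description": doc["description"],
            "tag": entry["tag"],
            "probability": entry["probability"],
            "x": entry["x"],
            "y": entry["y"],
            "proba": proba,
        })
    return rows


rows = flatten(json.loads(SAMPLE))
for row in rows:
    print(row)
```

Because the loop iterates over whatever keys exist under geometry, the number of entries does not need to be known in advance.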

I tried to parse the JSON directly in SQL using CROSS APPLY, but I can't get ADF to copy the JSON from ADLS to SQL directly with the mapping I want. If anyone has some guidance or an idea I can follow, it would be much appreciated.

  • In my experience, Data Factory doesn't work well with nested JSON. Commented Sep 4, 2020 at 0:32
  • Hi @Zin, if the answer is helpful to you, you can accept it as the answer. This can benefit other community members. Thank you. Commented Sep 7, 2020 at 0:55

2 Answers


In my experience, Data Factory doesn't work well with nested JSON.

To get your expected output, you may need to create three Copy activities. Each activity uses the same source and sink, and the table has to be created in the sink database first.

Pipeline overview: [screenshot]

The only difference between the activities is the mapping setting.

Copy activity 1: copies geometry.y0_1 to the sink: [screenshot]

Copy activity 2: copies geometry.y0_2 to the sink: [screenshot]

Copy activity 3: copies geometry.y0_3 to the sink: [screenshot]

Output data in the sink table: [screenshot]

Alternatively, you could create a stored procedure in the database to process the JSON data, and select that stored procedure in the sink settings: [screenshot]


Thanks, Leon Yue, for your answer. The only issue is that my JSON is very long, and I can't manually create that many Copy activities.

1 Comment

Hi @Zin, as I said, Data Factory doesn't work well with nested JSON. Your request is very hard, or even impossible, to achieve. We can't map it to the sink table because geometry can't be treated as a collection.
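
The collection problem can be seen in the data itself: geometry is a JSON object keyed by y0_1, y0_2, ..., not a JSON array, so there is no list of items to iterate over. As a rough Python sketch, assuming the file could be pre-processed before the copy step (that step is an assumption, not something described above), the object could be reshaped into an array. The function name `geometry_as_array` and the added `name` and top-level `proba` fields are hypothetical, and whether the Copy activity then accepts the reshaped file would still depend on the mapping configuration.

```python
import json


def geometry_as_array(doc, shared_key="Test"):
    """Turn the keyed geometry object into a list of items (hypothetical helper)."""
    geometry = doc["geometry"]
    # Keep the original key as a "name" field so no information is lost.
    items = [
        {"name": key, **entry}
        for key, entry in geometry.items()
        if key != shared_key
    ]
    reshaped = {k: v for k, v in doc.items() if k != "geometry"}
    # Lift the shared value out of geometry so the array holds only row-like items.
    reshaped["proba"] = geometry.get(shared_key, {}).get("proba")
    reshaped["geometry"] = items
    return reshaped


doc = {
    "proto": "01",
    "type": "A",
    "description": "heartbeat",
    "geometry": {
        "y0_1": {"tag": "Normal", "probability": 0.40, "x": 39, "y": 13},
        "y0_2": {"tag": "category_3", "probability": 0.8, "x": 48, "y": 13},
        "y0_3": {"tag": "Normal", "probability": 0.9, "x": 27, "y": 10},
        "Test": {"proba": 0.65},
    },
}

reshaped = geometry_as_array(doc)
print(json.dumps(reshaped, indent=2))
```

After this reshaping, geometry is a plain array whose length follows the input, which is the shape a collection-style mapping expects.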
