Logo DWBI.org Login / Sign Up
Sign Up
Have Login?
Login
New Account?
Recovery
Go to Login
Pentaho Data Integration

Handling JSON source files in Pentaho Data Integration

Updated on Oct 03, 2020

This article will demonstrate how to read data from JSON based source files using Pentaho Data Integration. In order to read the source JSON based file we will be using the Json Input step. In this article we will read one complex JSON file with nested hierarchical JSON array.

Let us take a quick look into our first sample JSON file which has information regarding bakery items batter & toppings.

{
  "items":{
    "item":[
      {
        "id":"0001",
        "type":"donut",
        "name":"Cake",
        "ppu":0.55,
        "batters":{
          "batter":[
            {
              "id":"1001",
              "type":"Regular"
            },
            {
              "id":"1002",
              "type":"Chocolate"
            },
            {
              "id":"1003",
              "type":"Blueberry"
            },
            {
              "id":"1004",
              "type":"Devil's Food"
            }
          ]
        },
        "topping":[
          {
            "id":"5001",
            "type":"None"
          },
          {
            "id":"5002",
            "type":"Glazed"
          },
          {
            "id":"5005",
            "type":"Sugar"
          },
          {
            "id":"5007",
            "type":"Powdered Sugar"
          },
          {
            "id":"5006",
            "type":"Chocolate with Sprinkles"
          },
          {
            "id":"5003",
            "type":"Chocolate"
          },
          {
            "id":"5004",
            "type":"Maple"
          }
        ]
      },
      {
        "id":"0002",
        "type":"donut",
        "name":"Raised",
        "ppu":0.55,
        "batters":{
          "batter":[
            {
              "id":"1001",
              "type":"Regular"
            }
          ]
        },
        "topping":[
          {
            "id":"5001",
            "type":"None"
          },
          {
            "id":"5002",
            "type":"Glazed"
          },
          {
            "id":"5005",
            "type":"Sugar"
          },
          {
            "id":"5003",
            "type":"Chocolate"
          },
          {
            "id":"5004",
            "type":"Maple"
          }
        ]
      },
      {
        "id":"0003",
        "type":"donut",
        "name":"Old Fashioned",
        "ppu":0.55,
        "batters":{
          "batter":[
            {
              "id":"1001",
              "type":"Regular"
            },
            {
              "id":"1002",
              "type":"Chocolate"
            }
          ]
        },
        "topping":[
          {
            "id":"5001",
            "type":"None"
          },
          {
            "id":"5002",
            "type":"Glazed"
          },
          {
            "id":"5003",
            "type":"Chocolate"
          },
          {
            "id":"5004",
            "type":"Maple"
          }
        ]
      },
      {
        "id":"0004",
        "type":"bar",
        "name":"Bar",
        "ppu":0.75,
        "batters":{
          "batter":[
            {
              "id":"1001",
              "type":"Regular"
            }
          ]
        },
        "topping":[
          {
            "id":"5003",
            "type":"Chocolate"
          },
          {
            "id":"5004",
            "type":"Maple"
          }
        ],
        "fillings":{
          "filling":[
            {
              "id":"7001",
              "name":"None",
              "addcost":0
            },
            {
              "id":"7002",
              "name":"Custard",
              "addcost":0.25
            },
            {
              "id":"7003",
              "name":"Whipped Cream",
              "addcost":0.25
            }
          ]
        }
      },
      {
        "id":"0005",
        "type":"twist",
        "name":"Twist",
        "ppu":0.65,
        "batters":{
          "batter":[
            {
              "id":"1001",
              "type":"Regular"
            }
          ]
        },
        "topping":[
          {
            "id":"5002",
            "type":"Glazed"
          },
          {
            "id":"5005",
            "type":"Sugar"
          }
        ]
      },
      {
        "id":"0006",
        "type":"filled",
        "name":"Filled",
        "ppu":0.75,
        "batters":{
          "batter":[
            {
              "id":"1001",
              "type":"Regular"
            }
          ]
        },
        "topping":[
          {
            "id":"5002",
            "type":"Glazed"
          },
          {
            "id":"5007",
            "type":"Powdered Sugar"
          },
          {
            "id":"5003",
            "type":"Chocolate"
          },
          {
            "id":"5004",
            "type":"Maple"
          }
        ],
        "fillings":{
          "filling":[
            {
              "id":"7002",
              "name":"Custard",
              "addcost":0
            },
            {
              "id":"7003",
              "name":"Whipped Cream",
              "addcost":0
            },
            {
              "id":"7004",
              "name":"Strawberry Jelly",
              "addcost":0
            },
            {
              "id":"7005",
              "name":"Rasberry Jelly",
              "addcost":0
            }
          ]
        }
      }
    ]
  }
}

Please check the self explanatory implementation screenshots as below:

Transformation
Transformation
Json Input
Json Input
Json Input Field Paths
Json Input Field Paths
Filter
Filter
Select values - Batter
Select values - Batter
Json Input- Batter
Json Input- Batter
Json Input- Batter Fields
Json Input- Batter Fields
Sort
Sort
Select values- Toppings
Select values- Toppings
Json Input- Toppings
Json Input- Toppings
Json Input- Toppings Fields
Json Input- Toppings Fields
Sort
Sort
Merge Inner Join
Merge Inner Join
Select values- Fillings
Select values- Fillings
Json Input- Fillings
Json Input- Fillings
Json Input- Fillings Fields
Json Input- Fillings Fields
Sort
Sort
Merge Left Outer Join
Merge Left Outer Join
Select values
Select values
Table output
Table output
Table fields
Table fields
Execution Results - Preview data
Execution Results - Preview data