Pentaho Data Integration or Spoon is a GUI workspace that allows us to create jobs and transformations to extracts data from heterogeneous sources, transforms that data using built-in steps and functions to meet business requirements and then can load the data into a wide variety of target databases or file systems for further analysis.
In this article we will learn how to use Pentaho Data Integration Row Normalizer Step. Normalizer step is used to normalize data from pivot or denormalized table. It allows us to change how the relationship between rows is displayed. For each value in each pivot column, Row Normalizer produces a row in the output data set. We can create new field sets to specify more than one pivot columns. It basically convert Columns to Rows.
In this article we will learn how to use Pentaho Data Integration Row Denormaliser Step. Denormaliser step is used to denormalize or pivot data from normalized dataset or table. The Denormaliser step combines data from several rows into one row by creating new Target columns/fields. For each unique values for the Key field and each selected Group field, this step produces a target column in the output data set. It basically creates one row of data from several existing rows, i.e. to Pivot Rows to Columns.
This article will demonstrate how to read data from XML based source files using Pentaho Data Integration. In order to read the source XML based file we will be using the Get data from XML step. In this article we will read one simple XML file followed by complex nested hierarchical XML data file.
This article will demonstrate how to read data from JSON based source files using Pentaho Data Integration. In order to read the source JSON based file we will be using the Json Input step. In this article we will read one complex JSON file with nested hierarchical JSON array.
In this article we will learn how to use Pentaho Data Integration HTTP Client Step. Http Client step is used to send HTTP Requests to web URL's along with optional parameters/arguments and in turn receives the HTTP Response result from the the web methods.
This article will demonstrate how to write data to XML files using Pentaho Data Integration. In order to write to XML file we will be using the XML Output step. This XML Output step allows us to write rows from any source to one or more XML files.
Data integration is the process by which information from multiple databases is consolidated for use in a single application. ETL (extract, transform, and load) is the most common form of DI found in data warehousing. In this article we will look into the PDI architecture & product features which makes it a perfect ETL tool to fit in a typical DM/DW landscape. This article is meant primarily for Technical Architects, Consultants & Developers who are evaluating PDI capabilities.
As a continuation of the previous article, let us check some of the key PDI Steps. Below are few examples of in-built PDI transformation steps used widely in a DI/DW/DM landscape