Complex Xml Generation from Database
In this article we will generate complex XML tags/file directly from Database tables using simple SQL SELECT statement. Almost every database vendors provides support for XML read and write functionality. So let us exploit the available database xml functions. In this article we will demonstrate an use-case based on PostgreSQL database.
On a high-level, almost all databases have same kind of functions to generate xml from database tables, although the syntax may vary. In general any XML document have a document root, simple elements, complex elements, attributes & repeating element group. To start with the use case, let us first build the sample tables in our database schema. Next we will write a simple SELECT statement (based on KISS design principle) to generate a complex XML document tree. Note that writing data to the file is not in scope of this article. You may leverage an ETL tool to generate the corresponding data files, else use database specific file writer utility.
We are mainly going to using four types of XML database functions, to generate any kind of complex xml you can think of.
Below are the functions and their description.
- xmlattributes - Generate XML Attribute Tags
- xmlelement - Generate XML Element Tags- whether simple or complex
- xmlagg - Generate a block of aggregated XML block Tags, basically repeating groups / occurs many times.
- xmlroot - Concatenate the root/header of the document with the generated XML document.
Data Preparation/Environment Setup
Let's create a few sample database tables in PostgreSQL for our use case implementation.
Employee Table
drop table if exists emp;
create table emp(
empno serial,
ename varchar(60),
deptno integer,
sal integer);
insert into emp (ename,deptno,sal) values('Sam',10,1000);
insert into emp (ename,deptno,sal) values('Rock',20,2000);
insert into emp (ename,deptno,sal) values('Alen',10,3000);
select * from emp;
empno;ename;deptno;sal
1;"Sam";10;1000
2;"Rock";20;2000
3;"Alen";10;3000
Department Table
drop table if exists dept;
create table dept(
deptno integer,
dname varchar(60),
loc varchar(60));
insert into dept (deptno,dname,loc) values(10,'IT','India');
insert into dept (deptno,dname,loc) values(20,'SALES','Singapore');
select * from dept;
deptno;dname;loc
10;"IT";"India"
20;"SALES";"Singapore"
Address Table
drop table if exists empaddress;
create table empaddress(
id serial,
empno integer,
type varchar(60),
address varchar(60));
insert into empaddress (empno,type,address) values(1,'Local','Singapore');
insert into empaddress (empno,type,address) values(1,'Permanent','India');
insert into empaddress (empno,type,address) values(1,'Temporary','Malaysia');
insert into empaddress (empno,type,address) values(2,'Permanent','Singapore');
select * from empaddress;
id;empno;type;address
1;1;"Local";"Singapore"
2;1;"Permanent";"India"
3;1;"Temporary";"Malaysia"
4;2;"Permanent";"Singapore"
Referrals Table
drop table if exists referrals;
create table referrals(
id serial,
empno integer,
name varchar(60),
skill varchar(60));
insert into referrals (empno,name,skill) values(1,'Andy','Sales');
insert into referrals (empno,name,skill) values(1,'Bill','IT');
insert into referrals (empno,name,skill) values(1,'Candy','Marketing');
insert into referrals (empno,name,skill) values(1,'Dude','Sales');
insert into referrals (empno,name,skill) values(2,'Jack','HR');
insert into referrals (empno,name,skill) values(2,'Kane','Sales');
insert into referrals (empno,name,skill) values(3,'Zombie','Marketing');
select * from referrals;
id;empno;name;skill
1;1;"Andy";"Sales"
2;1;"Bill";"IT"
3;1;"Candy";"Marketing"
4;1;"Dude";"Sales"
5;2;"Jack";"HR"
6;2;"Kane";"Sales"
7;3;"Zombie";"Marketing"
So we have created four tables along with sample data. An Employee belongs to a Department. An Employee has many address reference. Also another table which keeps a track of referrals posted by the employees.
In this example, we will generate UUID for each xml document for an employee; for that we will use the below PostgreSql extension.
create extension "uuid-ossp";
select uuid_generate_v4();
Query Statement
Firstly let us check the SELECT statement used to generate the XML data tags.
select
emp.empno,
xmlroot(
xmlelement(name "Employee", --employee HEADER
xmlattributes(emp.empno as "EmployeeNumber"),
xmlelement(name "DocumentId", temp.uuid),
xmlelement(name "Name", emp.ename),
xmlelement(name "Department", --department
xmlattributes(dept.deptno as "DepartmentNumber"),
xmlelement(name "Name", dept.dname),
xmlelement(name "Location",
xmlelement(name "Country", dept.loc)
)
),
(select
xmlagg( --address DETAIL *Repeating group
xmlelement(name address,
xmlattributes('SN' || empaddress.seqnum as id),
xmlelement(name "Type", empaddress.type),
xmlelement(name "Country", empaddress.address)
)
)
from (select empaddress.*, row_number() over( order by empaddress.type ) seqnum
from empaddress empaddress
where emp.empno = empaddress.empno) empaddress
),
(select
xmlagg( --referrals DETAIL *Repeating group
xmlelement(name "Referrals",
xmlelement(name "ReferralId", referrals.id),
xmlelement(name "Name", referrals.name),
xmlelement(name "Skill", referrals.skill)
)
)
from referrals referrals
where emp.empno = referrals.empno
),
xmlelement(name "Salary",
xmlattributes('SGD' as "Currency"),
emp.sal
),
xmlelement(name "FileId", temp.uuid)
)
, VERSION '1.0" encoding="utf-8',STANDALONE YES)
as xml_tag
from emp emp --DRIVING TABLE
left outer join dept dept
on emp.deptno = dept.deptno
inner join ( select empno, uuid_generate_v4() uuid from emp ) temp
on emp.empno = temp.empno
Results
Before we start on the explanation of the xml functions used in the SELECT clause, let us first look at the resulting output. This will help to visualize and understand the data tags generated and how the corresponding XML functions are working.
Emp_1.xml
<?xml version="1.0" standalone="yes"?>
<Employee EmployeeNumber="1">
<DocumentId>f9031852-e536-4c1e-b829-fa673ce3b8f4</DocumentId>
<Name>Sam</Name>
<Department DepartmentNumber="10">
<Name>IT</Name>
<Location>
<Country>India</Country>
</Location>
</Department>
<address id="SN1">
<Type>Local</Type>
<Country>Singapore</Country>
</address>
<address id="SN2">
<Type>Permanent</Type>
<Country>India</Country>
</address>
<address id="SN3">
<Type>Temporary</Type>
<Country>Malaysia</Country>
</address>
<Referrals>
<ReferralId>1</ReferralId>
<Name>Andy</Name>
<Skill>Sales</Skill>
</Referrals>
<Referrals>
<ReferralId>2</ReferralId>
<Name>Bill</Name>
<Skill>IT</Skill>
</Referrals>
<Referrals>
<ReferralId>3</ReferralId>
<Name>Candy</Name>
<Skill>Marketing</Skill>
</Referrals>
<Referrals>
<ReferralId>4</ReferralId>
<Name>Dude</Name>
<Skill>Sales</Skill>
</Referrals>
<Salary Currency="SGD">1000</Salary>
<FileId>f9031852-e536-4c1e-b829-fa673ce3b8f4</FileId>
</Employee>
Emp_2.xml
<?xml version="1.0" standalone="yes"?>
<Employee EmployeeNumber="2">
<DocumentId>d627b697-909e-4aa0-b2c0-0f76e2dda898</DocumentId>
<Name>Rock</Name>
<Department DepartmentNumber="20">
<Name>SALES</Name>
<Location>
<Country>Singapore</Country>
</Location>
</Department>
<address id="SN1">
<Type>Permanent</Type>
<Country>Singapore</Country>
</address>
<Referrals>
<ReferralId>5</ReferralId>
<Name>Jack</Name>
<Skill>HR</Skill>
</Referrals>
<Referrals>
<ReferralId>6</ReferralId>
<Name>Kane</Name>
<Skill>Sales</Skill>
</Referrals>
<Salary Currency="SGD">2000</Salary>
<FileId>d627b697-909e-4aa0-b2c0-0f76e2dda898</FileId>
</Employee>
Emp_3.xml
<?xml version="1.0" standalone="yes"?>
<Employee EmployeeNumber="3">
<DocumentId>8d562578-aa7b-4fc5-971d-439c9cf9407a</DocumentId>
<Name>Alen</Name>
<Department DepartmentNumber="10">
<Name>IT</Name>
<Location>
<Country>India</Country>
</Location>
</Department>
<Referrals>
<ReferralId>7</ReferralId>
<Name>Zombie</Name>
<Skill>Marketing</Skill>
</Referrals>
<Salary Currency="SGD">3000</Salary>
<FileId>8d562578-aa7b-4fc5-971d-439c9cf9407a</FileId>
</Employee>
Functions in detail
Below are the explanation as well as the points to remember on the above SQL SELECT statement.
1. Element: We are using xmlelement function to generate an ELEMENT tag. In PostgreSql, the syntax that follows is NAME of the Element tag followed by its VALUE. In order to maintain the case of the Element name use double-quotes. E.g.
xmlelement(name "Name", dept.dname)
This will result a simple XML ELEMENT tag, with element name as "Name" with value of department name.
<Name>IT</Name>
In situations, where the Element is just a parent or hierarchical element, within which there may be other Attribute tags or child/sub child elements, just follow the same convention of the parent element NAME followed by the descendent elements/attribute inside the block/closure. E.g.
xmlelement(name "Employee",
xmlelement(name "Name", emp.ename)
)
This will result a parent complex ELEMENT tag, with child element name as "Name" with value of employee name.
<Employee>
<Name>Sam</Name>
</Employee>
2. Attribute: We are using xmlattribute function to generate an ATTRIBUTE tag. In PostgreSql, the syntax that follows is VALUE followed by its ATTRIBUTE NAME. In order to maintain the case of the Attribute name use double-quotes. E.g.
xmlelement(name "Salary",
xmlattributes('SGD' as "Currency"),
emp.sal
)
This will result a simple XML ELEMENT tag along with an ATTRIBUTE tag and VALUE.
<Salary Currency="SGD">1000</Salary>
3. Aggregate: We are using xmlagg function to generate XML elements that occur multiple times within the message's structure. These are basically repeating child elements corresponding to their parents. In situations, where you need to generate repeating groups use xmlagg. E.g.
select
xmlelement(name "Employee",
xmlelement(name "Name", emp.ename),
(select
xmlagg( --referrals DETAIL *Repeating group
xmlelement(name "Referrals",
xmlelement(name "ReferralId", referrals.id),
xmlelement(name "Name", referrals.name),
xmlelement(name "Skill", referrals.skill)
)
)
from referrals referrals
where emp.empno = referrals.empno
)
as xml_tag
from emp emp
This will result as below XML repeating groups.
<Employee
<Name>Sam</Name>
<Referrals>
<ReferralId>1</ReferralId>
<Name>Andy</Name>
<Skill>Sales</Skill>
</Referrals>
<Referrals>
<ReferralId>2</ReferralId>
<Name>Bill</Name>
<Skill>IT</Skill>
</Referrals>
<Referrals>
<ReferralId>3</ReferralId>
<Name>Candy</Name>
<Skill>Marketing</Skill>
</Referrals>
<Referrals>
<ReferralId>4</ReferralId>
<Name>Dude</Name>
<Skill>Sales</Skill>
</Referrals>
</Employee>
In cases, where you need to generate sequence within groups, use subquery with windowing function. You cannot directly use Windowing function inside xmlagg aggregate function.
Keep exploring the generation of any complex xml structures using xml db functions.