In this article we will generate complex XML tags/file directly from Database tables using simple SQL SELECT statement. Almost every database vendors provides support for XML read and write functionality. So let us exploit the available database xml functions. In this article we will demonstrate an use-case based on PostgreSQL database.

On a high-level, almost all databases have same kind of functions to generate xml from database tables, although the syntax may vary. In general any XML document have a document root, simple elements, complex elements, attributes & repeating element group. To start with the use case, let us first build the sample tables in our database schema. Next we will write a simple SELECT statement (based on KISS design principle) to generate a complex XML document tree. Note that writing data to the file is not in scope of this article. You may leverage an ETL tool to generate the corresponding data files, else use database specific file writer utility.

We are mainly going to using four types of XML database functions, to generate any kind of complex xml you can think of. Below are the functions and their description.

  1. xmlattributes - Generate XML Attribute Tags
  2. xmlelement - Generate XML Element Tags- whether simple or complex
  3. xmlagg - Generate a block of aggregated XML block Tags, basically repeating groups / occurs many times.
  4. xmlroot - Concatenate the root/header of the document with the generated XML document.

Data Preparation/Environment Setup

Let's create a few sample database tables in PostgreSQL for our use case implementation.

drop table if exists emp;

create table emp(
empno serial,
ename varchar(60),
deptno integer,
sal integer);

insert into emp (ename,deptno,sal) values('Sam',10,1000);
insert into emp (ename,deptno,sal) values('Rock',20,2000);
insert into emp (ename,deptno,sal) values('Alen',10,3000);

select * from emp;

empno;ename;deptno;sal
1;"Sam";10;1000
2;"Rock";20;2000
3;"Alen";10;3000
drop table if exists dept;

create table dept(
deptno integer,
dname varchar(60),
loc varchar(60));

insert into dept (deptno,dname,loc) values(10,'IT','India');
insert into dept (deptno,dname,loc) values(20,'SALES','Singapore');

select * from dept;

deptno;dname;loc
10;"IT";"India"
20;"SALES";"Singapore"
drop table if exists empaddress;

create table empaddress(
id serial,
empno integer,
type varchar(60),
address varchar(60));

insert into empaddress (empno,type,address) values(1,'Local','Singapore');
insert into empaddress (empno,type,address) values(1,'Permanent','India');
insert into empaddress (empno,type,address) values(1,'Temporary','Malaysia');
insert into empaddress (empno,type,address) values(2,'Permanent','Singapore');

select * from empaddress;

id;empno;type;address
1;1;"Local";"Singapore"
2;1;"Permanent";"India"
3;1;"Temporary";"Malaysia"
4;2;"Permanent";"Singapore"
drop table if exists referrals;

create table referrals(
id serial,
empno integer,
name varchar(60),
skill varchar(60));

insert into referrals (empno,name,skill) values(1,'Andy','Sales');
insert into referrals (empno,name,skill) values(1,'Bill','IT');
insert into referrals (empno,name,skill) values(1,'Candy','Marketing');
insert into referrals (empno,name,skill) values(1,'Dude','Sales');
insert into referrals (empno,name,skill) values(2,'Jack','HR');
insert into referrals (empno,name,skill) values(2,'Kane','Sales');
insert into referrals (empno,name,skill) values(3,'Zombie','Marketing');

select * from referrals;

id;empno;name;skill
1;1;"Andy";"Sales"
2;1;"Bill";"IT"
3;1;"Candy";"Marketing"
4;1;"Dude";"Sales"
5;2;"Jack";"HR"
6;2;"Kane";"Sales"
7;3;"Zombie";"Marketing"

So we have created four tables along with sample data. An Employee belongs to a Department. An Employee has many address reference. Also another table which keeps a track of referrals posted by the employees.

In this example, we will generate UUID for each xml document for an employee; for that we will use the below PostgreSql extension.

create extension "uuid-ossp";

select uuid_generate_v4();

Query Statement

Firstly let us check the SELECT statement used to generate the XML data tags.

select 
emp.empno, 
xmlroot(
xmlelement(name "Employee", 			--employee HEADER
    xmlattributes(emp.empno as "EmployeeNumber"),
    xmlelement(name "DocumentId", temp.uuid),
    xmlelement(name "Name", emp.ename),
    xmlelement(name "Department", 	        --department
    	xmlattributes(dept.deptno as "DepartmentNumber"),
        xmlelement(name "Name", dept.dname),
        xmlelement(name "Location", 
        	xmlelement(name "Country", dept.loc)
        )
    ),
    (select 
        xmlagg( 			--address DETAIL *Repeating group
		xmlelement(name address, 
			xmlattributes('SN' || empaddress.seqnum as id),
			xmlelement(name "Type", empaddress.type),
			xmlelement(name "Country", empaddress.address)           	
		)	
	)
	from (select empaddress.*, row_number() over( order by empaddress.type ) seqnum
			from empaddress empaddress
			where emp.empno = empaddress.empno) empaddress
    ),
    (select
	xmlagg( 			--referrals DETAIL *Repeating group
		xmlelement(name "Referrals", 
			xmlelement(name "ReferralId", referrals.id),
			xmlelement(name "Name", referrals.name),
			xmlelement(name "Skill", referrals.skill)           	
		)
	)
	from referrals referrals 
	where emp.empno = referrals.empno
    ),
    xmlelement(name "Salary", 
		xmlattributes('SGD' as "Currency"),
		emp.sal
    ),
    xmlelement(name "FileId", temp.uuid)
)
, VERSION '1.0" encoding="utf-8',STANDALONE YES) 
as xml_tag 
from emp emp --DRIVING TABLE
left outer join dept dept
on emp.deptno = dept.deptno
inner join ( select empno, uuid_generate_v4() uuid from emp ) temp
on emp.empno = temp.empno

Results

Before we start on the explanation of the xml functions used in the SELECT clause, let us first look at the resulting output. This will help to visualize and understand the data tags generated and how the corresponding XML functions are working.


Emp_1.xml

<?xml version="1.0" standalone="yes"?>
<Employee EmployeeNumber="1">
	<DocumentId>f9031852-e536-4c1e-b829-fa673ce3b8f4</DocumentId>
	<Name>Sam</Name>
	<Department DepartmentNumber="10">
		<Name>IT</Name>
		<Location>
			<Country>India</Country>
		</Location>
	</Department>
	<address id="SN1">
		<Type>Local</Type>
		<Country>Singapore</Country>
	</address>
	<address id="SN2">
		<Type>Permanent</Type>
		<Country>India</Country>
	</address>
	<address id="SN3">
		<Type>Temporary</Type>
		<Country>Malaysia</Country>
	</address>
	<Referrals>
		<ReferralId>1</ReferralId>
		<Name>Andy</Name>
		<Skill>Sales</Skill>
	</Referrals>
	<Referrals>
		<ReferralId>2</ReferralId>
		<Name>Bill</Name>
		<Skill>IT</Skill>
	</Referrals>
	<Referrals>
		<ReferralId>3</ReferralId>
		<Name>Candy</Name>
		<Skill>Marketing</Skill>
	</Referrals>
	<Referrals>
		<ReferralId>4</ReferralId>
		<Name>Dude</Name>
		<Skill>Sales</Skill>
	</Referrals>
	<Salary Currency="SGD">1000</Salary>
	<FileId>f9031852-e536-4c1e-b829-fa673ce3b8f4</FileId>
</Employee>


Emp_2.xml

<?xml version="1.0" standalone="yes"?>
<Employee EmployeeNumber="2">
	<DocumentId>d627b697-909e-4aa0-b2c0-0f76e2dda898</DocumentId>
	<Name>Rock</Name>
	<Department DepartmentNumber="20">
		<Name>SALES</Name>
		<Location>
			<Country>Singapore</Country>
		</Location>
	</Department>
	<address id="SN1">
		<Type>Permanent</Type>
		<Country>Singapore</Country>
	</address>
	<Referrals>
		<ReferralId>5</ReferralId>
		<Name>Jack</Name>
		<Skill>HR</Skill>
	</Referrals>
	<Referrals>
		<ReferralId>6</ReferralId>
		<Name>Kane</Name>
		<Skill>Sales</Skill>
	</Referrals>
	<Salary Currency="SGD">2000</Salary>
	<FileId>d627b697-909e-4aa0-b2c0-0f76e2dda898</FileId>
</Employee>

Emp_3.xml

<?xml version="1.0" standalone="yes"?>
<Employee EmployeeNumber="3">
	<DocumentId>8d562578-aa7b-4fc5-971d-439c9cf9407a</DocumentId>
	<Name>Alen</Name>
	<Department DepartmentNumber="10">
		<Name>IT</Name>
		<Location>
			<Country>India</Country>
		</Location>
	</Department>
	<Referrals>
		<ReferralId>7</ReferralId>
		<Name>Zombie</Name>
		<Skill>Marketing</Skill>
	</Referrals>
	<Salary Currency="SGD">3000</Salary>
	<FileId>8d562578-aa7b-4fc5-971d-439c9cf9407a</FileId>
</Employee>

Functions in detail

Below are the explanation as well as the points to remember on the above SQL SELECT statement.

  • Element: We are using xmlelement function to generate an ELEMENT tag. In PostgreSql, the syntax that follows is NAME of the Element tag followed by its VALUE. In order to maintain the case of the Element name use double-quotes. E.g.
    xmlelement(name "Name", dept.dname)
    

    This will result a simple XML ELEMENT tag, with element name as "Name" with value of department name.

    <Name>IT</Name>
    

    In situations, where the Element is just a parent or hierarchical element, within which there may be other Attribute tags or child/sub child elements, just follow the same convention of the parent element NAME followed by the descendent elements/attribute inside the block/closure. E.g.

    xmlelement(name "Employee", 			
        xmlelement(name "Name", emp.ename)
    )
    

    This will result a parent complex ELEMENT tag, with child element name as "Name" with value of employee name.

    <Employee>
    	<Name>Sam</Name>
    </Employee>
    
  • Attribute: We are using xmlattribute function to generate an ATTRIBUTE tag. In PostgreSql, the syntax that follows is VALUE followed by its ATTRIBUTE NAME. In order to maintain the case of the Attribute name use double-quotes. E.g.
        xmlelement(name "Salary", 
    		xmlattributes('SGD' as "Currency"),
    		emp.sal
        )
    

    This will result a simple XML ELEMENT tag along with an ATTRIBUTE tag and VALUE.

    <Salary Currency="SGD">1000</Salary>
    
  • Aggregate: We are using xmlagg function to generate XML elements that occur multiple times within the message's structure. These are basically repeating child elements corresponding to their parents. In situations, where you need to generate repeating groups use xmlagg. E.g.
        select 
        xmlelement(name "Employee", 
            xmlelement(name "Name", emp.ename),
            (select
    		xmlagg( 								--referrals DETAIL *Repeating group
    			xmlelement(name "Referrals", 
    				xmlelement(name "ReferralId", referrals.id),
    				xmlelement(name "Name", referrals.name),
    				xmlelement(name "Skill", referrals.skill)           	
    			)
    		)
    		from referrals referrals 
    		where emp.empno = referrals.empno
           )
        as xml_tag 
        from emp emp
    

    This will result as below XML repeating groups.

    <Employee
        <Name>Sam</Name>
        <Referrals>
            <ReferralId>1</ReferralId>
    	<Name>Andy</Name>
    	<Skill>Sales</Skill>
        </Referrals>
        <Referrals>
    	<ReferralId>2</ReferralId>
    	<Name>Bill</Name>
    	<Skill>IT</Skill>
        </Referrals>
        <Referrals>
    	<ReferralId>3</ReferralId>
    	<Name>Candy</Name>
    	<Skill>Marketing</Skill>
        </Referrals>
        <Referrals>
    	<ReferralId>4</ReferralId>
    	<Name>Dude</Name>
    	<Skill>Sales</Skill>
        </Referrals>
    </Employee>
    
  • In cases, where you need to generate sequence within groups, use subquery with windowing function. You cannot directly use Windowing function inside xmlagg aggregate function.

Keep exploring the generation of any complex xml structures using xml db functions.


Have a question on this subject?

Ask questions to our expert community members and clear your doubts. Asking question or engaging in technical discussion is both easy and rewarding.

Are you on Twitter?

Start following us. This way we will always keep you updated with what's happening in Data Analytics community. We won't spam you. Promise.

  • DW Implementation Using EDB Postgres

    PostgreSQL is an advanced open source relational database management system. EDB develops and integrates performance, security, and manageability enhancements into PostgreSQL to support enterprise-class workloads for its database, EDB Postgres...

  • Complex Xml Generation from Database

    In this article we will generate complex XML tags/file directly from Database tables using simple SQL SELECT statement. Almost every database vendors provides support for XML read and write functionality. So let us exploit the available database...

  • SQL- Data Definition Language

    The Data Definition Language (DDL) manages database table and index structure. The most basic commands of DDL are the CREATE, ALTER, RENAME, DROP and TRUNCATE statements.

  • Top 20 SQL Interview Questions with Answers

    SQL is a language for accessing and manipulating database standardized by ANSI. To be successful with database-centric applications (which includes most of the applications Data Warehousing domain), one must be strong enough in SQL. In this...

  • SQL- Data Manipulation Language

    The Data Manipulation Language (DML) manages data operation like INSERT, UPDATE, DELETE to database tables. The most basic commands of DML are the INSERT, UPDATE, DELETE and MERGE statements.

  • SQL- Data Query Language

    Structured Query Language (SQL) can be further subdivided in multiple parts based on its functionality. Data Query Language (DQL) mainly deals with SQL SELECT statement for retrieving data from a database

  • Introduction to ANSI SQL

    We human beings communicate with each other with the help of language. Similiarly, SQL stands for Structured Query Language is the language that a database understands, and we will communicate to the database using SQL. SQL is a 4TH generation...