Data Services Metadata Query Part 1
Data Services provides full access to the repository metadata tables and views for metadata analysis. To access this metadata either we can use SQL SELECT statements or use the metadata reporting from Management Console.
Internal Metadata
Let us first take a look into a few important internal metadata tables. This set of tables and views capture information about built-in metadata objects used by Data Services and the relationships between those objects.
1. AL_MACHINE_INFO
This table contains Data Services Server information like Server Name, Server Group Name, Machine Name, Port etc.
2. AL_VERSION
This table contains Data Services Metadata Repository information like Repository Version, Repository Type- <b>central</b>/NULL(local), Secure- ( 1/0 ) etc.
3. AL_USERS, AL_ASUSERS
These tables contains Data Services user and administrator information.
4. AL_REPOTYPE_NAMES
This static data table contains internal repository object names and their id used by Data Services to identify an object.For example:
REPOOBJTYPE_ID | REPOOBJTYPE_NAME |
0 | Plan* |
1 | Dataflow |
3 | Transform |
4 | File |
5 | Database Datastore |
7 | Table |
8 | Column |
10 | User |
11 | Project |
13 | Function |
29 | Domain |
36 | XML Message |
45 | Memory Table |
* Job/Workflow
5. AL_DATATYPE_NAMES
This static data table contains data type id and names internal to Data Services as below
DATATYPE_ID | DATATYPE_NAME |
0 | REAL |
1 | FLOAT |
2 | DOUBLE |
5 | INTEGER |
6 | DECIMAL |
9 | FIXCSTR |
11 | VARCSTR |
13 | DATE |
14 | TIME |
15 | INTERVAL |
16 | DATETIME |
21 | LONG |
23 | TIMESTAMP |
200 | NFIXCSTR |
201 | NVARCSTR |
6. AL_LANG
This table contains various Data Services objects information displayed in Data Services object library.( OBJECT GUID )
Column Name | Description |
OBJECT_KEY | Internal ID of the object. UNIQUE for each objects we create in Data Services. |
OBJECT_TYPE | ID for type of object as defined in AL_REPOTYPE_NAMES. |
NAME | Name of the object. |
GUID | Globally Unique Identifiers or Universal Unique Identifier (UUID) of the object. |
NORMNAME | Unique name for this object in this table. |
VERSION | Indicates the number of times this object has been updated. |
TYPE | Subtype of the object.* |
OWNER | For table objects, the owner. |
DATASTORE | For table objects, the datastore to which they belong. |
* Integer value 1 for Workflow and 0 for all other object types.
NOTE: Other important columns namely CHECKOUT_DT, CHECKOUT_REPO, CHECKIN_DT, CHECKIN_REPO, LABEL_DT, LABEL, COMMENTS, SEC_USER, SEC_USER_COUT are NULL for Local repositories but are relevant for Central repositories and they maintain the Version Control information.
SELECT
OBJECT_KEY, OBJECT_TYPE, NAME, GUID, NORMNAME,
VERSION, TYPE, OWNER, DATASTORE, STATE, LANGSIZE,
CHECKOUT_DT, CHECKOUT_REPO, CHECKIN_DT, CHECKIN_REPO,
LABEL_DT, LABEL, COMMENTS, SEC_USER, SEC_USER_COUT
FROM AL_LANG
Also this metadata table does not holds information for tables within Data stores. They are maintained on other views or tables like ALVW_OBJ_CINOUT or AL_SCHEMA.
Some related tables- AL_LANGTEXT, AL_LANGXMLTEXT which contains the text and XML representation of the objects.
Also check the Table AL_ATTR, which contains attributes of repository objects like description etc.
Query 1: The following query returns all the user defined Jobs in a data services repository.
Note: Here we are interested to get the Job name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "JOB NAME"
FROM AL_LANG A
WHERE A.OBJECT_TYPE = 0 AND A.TYPE = 0
AND A.VERSION = (
SELECT MAX(B.VERSION) FROM AL_LANG B
WHERE A.NORMNAME = B.NORMNAME )
AND A.NAME NOT IN( 'CD_JOB_d0cafae2', 'di_job_al_mach_info' )
ORDER BY A.NAME
Query 2: The following query returns all the user defined Workflows in a data services repository.
Note: Here we are interested to get the Workflow name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "WORKFLOW NAME"
FROM AL_LANG A
WHERE A.OBJECT_TYPE = 0 AND A.TYPE = 1
AND A.VERSION = (
SELECT MAX(B.VERSION) FROM AL_LANG B
WHERE A.NORMNAME = B.NORMNAME )
ORDER BY A.NAME
Query 3: The following query returns all the user defined Dataflows in a data services repository.
Note: Here we are interested to get the Dataflow name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "DATAFLOW NAME"
FROM AL_LANG A
WHERE A.OBJECT_TYPE = 1
AND A.VERSION = (
SELECT MAX(B.VERSION) FROM AL_LANG B
WHERE A.NORMNAME = B.NORMNAME )
AND A.NAME NOT IN( 'CD_DF_d0cafae2', 'di_df_al_mach_info' )
ORDER BY A.NAME
Query 4: The following query returns all the user defined Datastores in a data services repository.
Note: Here we are interested to get the Datastore name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "DATA STORE NAME"
FROM AL_LANG A
WHERE A.OBJECT_TYPE = 5
AND A.VERSION = (
SELECT MAX(B.VERSION) FROM AL_LANG B
WHERE A.NORMNAME = B.NORMNAME )
AND A.NAME <> 'CD_DS_d0cafae2'
ORDER BY A.NAME
Query 5: The following query returns all the user defined flatfile formats in a data services repository.
Note: Here we are interested to get the flatfile formats name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "FILE FORMAT NAME"
FROM AL_LANG A
WHERE A.OBJECT_TYPE = 4
AND A.VERSION = (
SELECT MAX(B.VERSION) FROM AL_LANG B
WHERE A.NORMNAME = B.NORMNAME )
AND A.NAME NOT IN( 'Transport_Format', 'di_ff_al_mach_info' )
ORDER BY A.NAME
Query 6: The following query returns all the user defined XML definitions in a data services repository.
Note: Here we are interested to get the XML schema name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "XML SCHEMA NAME"
FROM AL_LANG A
WHERE A.OBJECT_TYPE = 36
AND A.VERSION = (
SELECT MAX(B.VERSION) FROM AL_LANG B
WHERE A.NORMNAME = B.NORMNAME )
ORDER BY A.NAME
Query 7: The following query returns all the Data flow names along with their dataflow Description attributes if any, in the repository:
Using AL_ATTR table to get attributes of repository objects
SELECT DISTINCT O.OBJECT_KEY, O.NAME,
CASE WHEN A.PARENT_OBJ_TYPE = 1 THEN A.ATTR_VALUE ELSE NULL END
FROM AL_LANG O LEFT OUTER JOIN <b>AL_ATTR</b> A
ON O.OBJECT_KEY = A.PARENT_OBJID
WHERE O.OBJECT_TYPE = 1
AND A.ATTR_NAME = 'Description'
AND O.VERSION = ( SELECT MAX(B.VERSION) FROM AL_LANG B
WHERE O.NORMNAME = B.NORMNAME )
AND O.NAME NOT IN ( 'CD_DF_d0cafae2', 'di_df_al_mach_info' )
ORDER BY 2
7. AL_SETOPTIONS
This table contains option settings for all repository objects.
Column Name | Description |
PARENT_OBJ_ID | ID of the parent object to which this option belongs. |
PARENT_OBJ_TYPE | ID for type of object as defined in AL_REPOTYPE_NAMES. |
OPTION_NAME | Name of the option. |
OPTION_VALUE | Value of the option. |
OVERFLOW_KEY | KEY value pointing to a row in the <b>AL_OVERFLOW_ATTR</b> table. When an OPTION_VALUE exceeds 255 characters, Data Services adds the remaining characters to AL_OVERFLOW_ATTR and stores the row ID as OVERFLOW_KEY in the AL_SETOPTIONS table. |
Also check the related table AL_OVERFLOW_ATTR.
Query 8: The following query returns all the user defined Jobs with their execution properties/options in a data services repository.
Note: Here we are interested to get the Job name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "JOB NAME",B.OPTION_NAME,
B.OPTION_VALUE, B.OVERFLOW_KEY
FROM AL_LANG A, AL_SETOPTIONS B
WHERE A.OBJECT_KEY = B.PARENT_OBJID
AND B.PARENT_OBJTYPE = 0 AND A.TYPE = 0
AND A.VERSION = (
SELECT MAX(C.VERSION) FROM AL_LANG C
WHERE A.NORMNAME = C.NORMNAME )
AND A.NAME NOT IN( 'CD_JOB_d0cafae2', 'di_job_al_mach_info' )
ORDER BY A.NAME, B.OPTION_NAME
Query 9: The following query returns all the user defined Workflows with their properties/options in a data services repository.
Note: Here we are interested to get the Workflow name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "WORKFLOW NAME",B.OPTION_NAME,
B.OPTION_VALUE, B.OVERFLOW_KEY
FROM AL_LANG A, AL_SETOPTIONS B
WHERE A.OBJECT_KEY = B.PARENT_OBJID
AND B.PARENT_OBJTYPE = 0 AND A.TYPE = 1
AND A.VERSION = (
SELECT MAX(C.VERSION) FROM AL_LANG C
WHERE A.NORMNAME = C.NORMNAME )
ORDER BY A.NAME, B.OPTION_NAME
Query 10: The following query returns all the user defined Dataflows with their properties/options in a data services repository.
Note: Here we are interested to get the Dataflow name for the latest version. Query works in central repository to fetch the latest version of the object.
SELECT A.NAME AS "DATAFLOW NAME",B.OPTION_NAME,
B.OPTION_VALUE, B.OVERFLOW_KEY
FROM AL_LANG A, AL_SETOPTIONS B
WHERE A.OBJECT_KEY = B.PARENT_OBJID
AND B.PARENT_OBJTYPE = 1
AND A.VERSION = (
SELECT MAX(C.VERSION) FROM AL_LANG C
WHERE A.NORMNAME = C.NORMNAME )
AND A.NAME NOT IN( 'CD_DF_d0cafae2', 'di_df_al_mach_info' )
ORDER BY A.NAME, B.OPTION_NAME
Similarly, modify the above query, PARENT_OBJTYPE value to 4 or 5 or 36 to get the property informations for File Formats, Data Stores and XML Schemas correspondingly.
8. AL_USAGE
This table contains information about objects (parents) that contain (or call) other objects (children). This table is identical to view ALVW_PARENT_CHILD except it captures the entire call hierarchy. For example, if a table is used in a dataflow which is called from a workflow, then rows appear in this table that associate the workflow (parent) to the table (descendant). The Depth column is unique to this table which indicates the number of levels between objects in a parent/descendant relationship.
Note: We need to populate this table explicitly by using the Calculate Usage Dependencies command. From Data Services Designer go to Tools > Options > Designer > General and enable the 'Automatically calculate column mappings' option while saving Dataflow. Else from Management Console go to Impact and Lineage Analysis -> Settings -> Refresh Usage Data -> Calculate Column Mapping.
Alternatively, from Local Object Library, select the dataflow right-click and select Repository -> Calculate Usage Dependencies and Calculate Column Mappings.
ALVW_PARENT_CHILD view contains information about objects (parents) that contain (or call) other objects (children).
Column Name | Description |
PARENT_OBJ | Name of the calling object. |
PARENT_OBJ_TYPE | Type of the object (Job or dataflow etc.). |
PARENT_OBJ_DESC | The description associated with this object. |
PARENT_OBJ_KEY | Key in the <b>AL_LANG</b> table of the parent object. |
DESCEN_OBJ | Name of the descendant object.For transforms, the name of the output schema of the transform call (if the name of the transform is unique). If it is not unique, Data Services generates a unique numeric suffix and appends that to the given name. |
DESCEN_OBJ_TYPE | Type of the called object (Dataflow, table, function, file etc.). |
DESCEN_OBJ_DESC | Description associated with the called object. |
DESCEN_OBJ_KEY | Key in <b>AL_LANG</b> of the descendant object. |
DESCEN_OBJ_USAGE | This identifies how the child is used- Source, Target, Lookup table. |
DESCEN_OBJ_DS | This is applicable only to tables and files to identify the datastore of the child object. |
DESCEN_OBJ_OWNER | Owner of the child table. |
DEPTH | Indicates the number of levels between objects in a parent/descendant relationship as integer value. |
Name of the descendant object.
For transforms, the name of the output schema of the transform call (if the name of the transform is unique). If it is not unique, Data Services generates a unique numeric suffix and appends that to the given name.</td></tr>
Query 11: The following query returns all the user defined Jobs with their corresponding Target tables they populate in a data services repository.
SELECT
AL_USAGE.PARENT_OBJ,
AL_USAGE.PARENT_OBJ_DESC,
AL_USAGE.PARENT_OBJ_TYPE,
AL_USAGE.DESCEN_OBJ,
AL_USAGE.DESCEN_OBJ_DESC,
AL_USAGE.DESCEN_OBJ_TYPE,
AL_USAGE.DESCEN_OBJ_USAGE,
AL_USAGE.DESCEN_OBJ_DS,
AL_USAGE.DESCEN_OBJ_OWNER,
AL_USAGE.DEPTH
FROM AL_USAGE
WHERE AL_USAGE.PARENT_OBJ_TYPE = 'Job'
AND AL_USAGE.DESCEN_OBJ_TYPE = 'Table'
AND AL_USAGE.DESCEN_OBJ_USAGE = 'Target'
IMPORTANT NOTES
In order to check for the Job name that populates a particular Target table, use additional filter clause as AL_USAGE.DESCEN_OBJ = '[Target Table Name]'. Alternatively if we want to check the Target tables being loaded by a particular Job use filter clause as AL_USAGE.PARENT_OBJ = '[Job Name]'.
Similarly, to find out the Tables being used as Source or Lookups or in Table Comparison in a Job, modify the AL_USAGE.DESCEN_OBJ_USAGE to 'Source' or 'Lookup_ext()' or 'Comparison' accordingly.
Again in order to identify the tables being used in Key_Generation transform in a Job, modify the AL_USAGE.DESCEN_OBJ_USAGE to 'Key Generation'. So all the above cases are valid when the descendant object type is datastores or files.
( AL_USAGE.DESCEN_OBJ_TYPE = 'Datastore' or 'Table' or 'File Datastore' or 'XML Datastore' or'File' )
Query 12: The following query returns all the user defined Dataflows with their corresponding Target tables they populate in a data services repository.
SELECT
AL_USAGE.PARENT_OBJ,
AL_USAGE.PARENT_OBJ_DESC,
AL_USAGE.PARENT_OBJ_TYPE,
AL_USAGE.DESCEN_OBJ,
AL_USAGE.DESCEN_OBJ_DESC,
AL_USAGE.DESCEN_OBJ_TYPE,
AL_USAGE.DESCEN_OBJ_USAGE,
AL_USAGE.DESCEN_OBJ_DS,
AL_USAGE.DESCEN_OBJ_OWNER,
AL_USAGE.DEPTH
FROM AL_USAGE
WHERE AL_USAGE.PARENT_OBJ_TYPE = 'Dataflow'
AND AL_USAGE.DESCEN_OBJ_TYPE = 'Table'
AND AL_USAGE.DESCEN_OBJ_USAGE = 'Target'
IMPORTANT NOTES
In order to check for the Dataflow name that populates a particular Target table, use additional filter clause as AL_USAGE.DESCEN_OBJ = '[Target Table Name]'. Alternatively if we want to check the Target tables being associated with a particular Dataflow use filter clause as AL_USAGE.PARENT_OBJ = '[Dataflow Name]'.
Similarly, to find out the Tables being used as Source or Lookups or in Table Comparison in a Dataflow, modify the AL_USAGE.DESCEN_OBJ_USAGE to 'Source' or 'Lookup_ext()' or 'Comparison' accordingly.
Again in order to identify the tables being used in Key_Generation transform in a Dataflow, modify the AL_USAGE.DESCEN_OBJ_USAGE to 'Key Generation'. So all the above cases are valid when the descendant object type is datastores or files.
( AL_USAGE.DESCEN_OBJ_TYPE = 'Datastore' or 'Table' or 'File Datastore' or 'XML Datastore' or'File' )
Let us create a table with possible list of values for PARENT_OBJ_TYPE, DESCEN_OBJ_TYPE and DESCEN_OBJ_USAGE for quick reference.
PARENT_OBJ_TYPE | DESCEN_OBJ_TYPE | DESCEN_OBJ_USAGE |
Job | WorkFlow | Source |
WorkFlow | DataFlow | Target |
DataFlow | Datastore | Lookup_ext() |
SDK Transform | File Datastore | Comparison |
XML Datastore | Key Generation | |
File | NULL | |
Transform | Transform | |
SDK Transform | SDK Transform |
Query 13: The following query returns all the objects (Job/Worflow/Dataflow) that would be affected if we drop a Target table or to find Object Dependencies. This is used to produce a WHERE USED report for an object.
SELECT
AL_USAGE.PARENT_OBJ,
AL_USAGE.PARENT_OBJ_DESC,
AL_USAGE.PARENT_OBJ_TYPE,
AL_USAGE.DESCEN_OBJ,
AL_USAGE.DESCEN_OBJ_DESC,
AL_USAGE.DESCEN_OBJ_TYPE,
AL_USAGE.DESCEN_OBJ_USAGE,
AL_USAGE.DESCEN_OBJ_DS,
AL_USAGE.DESCEN_OBJ_OWNER,
AL_USAGE.DEPTH
FROM AL_USAGE
WHERE AL_USAGE.DESCEN_OBJ_TYPE = 'Table'
AND AL_USAGE.DESCEN_OBJ = '[Target Table Name]'
Related Table AL_PARENT_CHILD or View ALVW_PARENT_CHILD contains information about objects (parents) that contain (or call) other objects (children).
Query 14: The following query returns the user defined Dataflows that populates a particular Target table.
SELECT
PARENT_OBJ,
PARENT_OBJ_DESC,
PARENT_OBJ_TYPE,
DESCEN_OBJ,
DESCEN_OBJ_DESC,
DESCEN_OBJ_TYPE,
DESCEN_OBJ_USAGE,
DESCEN_OBJ_DS,
DESCEN_OBJ_OWNER
FROM AL_PARENT_CHILD
WHERE DESCEN_OBJ_TYPE = 'Table'
AND DESCEN_OBJ = '[Target Table Name]'
AND DESCEN_OBJ_USAGE = 'Target'
Alternatively, we can use the same query with View ALVW_PARENT_CHILD.
Query 15: The following query returns the user defined Dataflows associated with a particular Job.
SELECT
PARENT_OBJ,
PARENT_OBJ_DESC,
PARENT_OBJ_TYPE,
DESCEN_OBJ,
DESCEN_OBJ_DESC,
DESCEN_OBJ_TYPE,
DESCEN_OBJ_USAGE,
DESCEN_OBJ_DS,
DESCEN_OBJ_OWNER
FROM AL_PARENT_CHILD
WHERE PARENT_OBJ_TYPE = 'Job'
AND PARENT_OBJ = '[Job Name]'
AND DESCEN_OBJ_TYPE = 'Dataflow'
Alternatively, we can use the same query with View ALVW_PARENT_CHILD.