The Oracle external tables feature allows us to access data in external sources as if it is a table in the database. This is a very convenient and fast method to retrieve data from flat files outside Oracle database.

What is an Oracle External Tables?

The Oracle external tables feature allows us to access data in external sources as if it is a table in the database. External tables are read-only. No data manipulation language (DML) operations is allowed on an external table. An external table does not describe any data that is stored in the database.

So, how do I create an external table?

To create an external table in Oracle we use the same CREATE TABLE DDL, but we specify the type of the table as external by an additional clause - ORGANIZATION EXTERNAL. Also we need to define a set of other parameters called ACCESS PARAMETERS in order to tell Oracle the location and structure of the source data. To understand the syntax of all these, let's start by creating an external table right away. First we will connect to the database and create a directory for the external table.

CONN SYS/SYS_PASSWORD AS SYSDBA   
CREATE  OR REPLACE DIRECTORY ext_tab_dir AS  'C:\External_Tables'; 
GRANT READ,WRITE ON DIRECTORY ext_tab_dir TO scott; 

Flat File Structure

We will start by trying to load a flat file as an external table. Suppose the flat file is named employee1.dat with the content as:

empno,first_name,last_name,dob 
1234,John,Lee,"31/12/1978" 
7777,Sam,vichi,"19/03/1975"

So our CREATE TABLE syntax will be something like below

CREATE TABLE Example for External Table

CREATE TABLE emp_ext ( 
empno NUMBER(4), first_name CHAR(20), last_name CHAR(20), dob CHAR(10)
ORGANIZATION EXTERNAL ( 
TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS
( 
RECORDS DELIMITED BY NEWLINE 
SKIP 1    
BADFILE 'bad_%a_%p.bad' 
LOGFILE 'log_%a_%p.log' 
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'  LRTRIM  
MISSING FIELD VALUES ARE NULL  
REJECT ROWS WITH ALL NULL FIELDS 
( empno INTEGER EXTERNAL (4),
first_name CHAR(20),
last_name CHAR(20),
dob CHAR(10) DATE_FORMAT DATE MASK "dd/mm/yyyy") 
) 
LOCATION ( 'employee1.dat','employee2.dat' )
)
PARALLEL 
REJECT LIMIT 0;  


SELECT * FROM emp_ext; 

Now we can insert this temporary read only data to our oracle table say employee.

INSERT INTO employee ( empno, first_name, last_name, dob )  
( SELECT empno, first_name, last_name, dob FROM emp_ext ); 

Explanation of the above External Table Syntax

  • The SKIP no_rows clause allows you to eliminate the header of the file by skipping the first row.
  • The LRTRIM clause is used to trim leading and trailing blanks from fields.
  • The SKIP clause skips the specified number of records in the datafile before loading. SKIP can be specified only when nonparallel access is being made to the data.
  • The READSIZE parameter specifies the size of the read buffer. The size of the read buffer is a limit on the size of the largest record the access driver can handle. The size is specified with an integer indicating the number of bytes. The default value is 512KB (524288 bytes). You must specify a larger value if any of the records in the datafile are larger than 512KB.
  • The LOGFILE clause names the file that contains messages generated by the external tables utility while it was accessing data in the datafile. If a log file already exists by the same name, the access driver reopens that log file and appends new log information to the end. This is different from bad files and discard files, which overwrite any existing file. NOLOGFILE is used to prevent creation of a log file. If you specify LOGFILE, you must specify a filename or you will receive an error. If neither LOGFILE nor NOLOGFILE is specified, the default is to create a log file. The name of the file will be the table name followed by _%p.
  • The BADFILE clause names the file to which records are written when they cannot be loaded because of errors. For example, a record was written to the bad file because a field in the datafile could not be converted to the datatype of a column in the external table. Records that fail the LOAD WHEN clause are not written to the bad file but are written to the discard file instead. The purpose of the bad file is to have one file where all rejected data can be examined and fixed so that it can be loaded. If you do not intend to fix the data, then you can use the NOBADFILE option to prevent creation of a bad file, even if there are bad records. If you specify BADFILE, you must specify a filename or you will receive an error. If neither BADFILE nor NOBADFILE is specified, the default is to create a bad file if at least one record is rejected. The name of the file will be the table name followed by _%p.
  • With external tables, if the SEQUENCE parameter is used, rejected rows do not update the sequence number value. For example, suppose we have to load 5 rows with sequence numbers beginning with 1 and incrementing by 1. If rows 2 and 4 are rejected, the successfully loaded rows are assigned the sequence numbers 1, 2, and 3.

External Table Access Driver

An external table describes how the external table layer must present the data to the server. The access driver and the external table layer transform the data in the datafile to match the external table definition.

The access driver runs inside of the database server hence the server must have access to any files to be loaded by the access driver. The server will write the log file, bad file, and discard file created by the access driver. The access driver does not allow to specify random names for a file. Instead, we have to specify directory objects as the locations from where it will read the datafiles and write logfiles. A directory object maps a name with the directory name on the file system.

Directory objects can be created by DBAs or by any user with the CREATE ANY DIRECTORY privilege. After a directory is created, the user creating the directory object needs to grant READ or WRITE permission on the directory to other users.

Notes

  1. If we do not specify the type for the external table, then the ORACLE_LOADER type is used as a default.
  2. Using the PARALLEL clause while creating the external table enables parallel processing on the datafiles. The access driver then attempts to divide large datafiles into chunks that can be processed separately and parallely. With external table loads, there is only one bad file and one discard file for all input datafiles. If parallel access drivers are used for the external table load, each access driver has its own bad file and discard file.
  3. We can change the target file name with the alter ddl command as:
    ALTER TABLE emp_tab LOCATION ( 'newempfile.dat' ); 
    

  4. The SYS tables for Oracle External Tables are dba_external_tables, all_external_tables and user_external_tables.


Have a question on this subject?

Ask questions to our expert community members and clear your doubts. Asking question or engaging in technical discussion is both easy and rewarding.

Are you on Twitter?

Start following us. This way we will always keep you updated with what's happening in Data Analytics community. We won't spam you. Promise.

  • How to find out Which User is Running what SQL Query in Oracle database?

    Do you wonder how to get information on all the active query in the Oracle database? Do you want to know what query is executed by which user and how long is it running? Here is how to do it!

  • Learn Oracle Server Architecture in 10 minutes

    Here is an easy to understand primer on Oracle architecture. Read this first to give yourself a head-start before you read more advanced articles on Oracle Server Architecture.

  • Database Performance Tuning

    This article tries to comprehensively list down many things one needs to know for Oracle Database Performance Tuning. The ultimate goal of this document is to provide a generic and comprehensive guideline to Tune Oracle Databases from both...

  • Oracle Analytic Functions

    Oracle Analytic Functions compute an aggregate value based on a group of rows. It opens up a whole new way of looking at the data. This article explains how we can unleash the full potential of this.

  • Oracle External Tables

    The Oracle external tables feature allows us to access data in external sources as if it is a table in the database. This is a very convenient and fast method to retrieve data from flat files outside Oracle database.

  • How to find out Expected Time of Completion for an Oracle Query

    Too often we become impatient when Oracle Query executed by us does not seem to return any result. But Oracle (10g onwards) gives us an option to check how long a query will run, that is, to find out expected time of completion for a query.

  • Understanding Oracle QUERY PLAN - Part2 (Exploring SQL Joins)

    This is the second part of the article Understanding Oracle Query Plan. In this part we will deal with SQL Joins.

  • Oracle Spool

    Oracle Spool for SQLplus command line utility, can be used in conjunction with OS to export data from Oracle to flat file.

  • Oracle UTL_FILE

    The Oracle supplied PL/SQL package UTL_FILE used to read and write operating system files that are located on the database server.

  • Oracle AUTOTRACE Explained - A 10 Minute Guide

    AUTOTRACE is a beautiful utility in Oracle that can help you gather vital performance statistics for a SQL Query. You need to understand and use it for SQL Query Tuning. Here is how!