What is an Oracle External Tables?
The Oracle external tables feature allows us to access data in external sources as if it is a table in the database. External tables are read-only. No data manipulation language (DML) operations is allowed on an external table. An external table does not describe any data that is stored in the database.
So, how do I create an external table?
To create an external table in Oracle we use the same CREATE TABLE DDL, but we specify the type of the table as external by an additional clause - ORGANIZATION EXTERNAL. Also we need to define a set of other parameters called ACCESS PARAMETERS in order to tell Oracle the location and structure of the source data. To understand the syntax of all these, let's start by creating an external table right away. First we will connect to the database and create a directory for the external table.
CONN SYS/SYS_PASSWORD AS SYSDBA CREATE OR REPLACE DIRECTORY ext_tab_dir AS 'C:\External_Tables'; GRANT READ,WRITE ON DIRECTORY ext_tab_dir TO scott;
Flat File Structure
We will start by trying to load a flat file as an external table. Suppose the flat file is named employee1.dat with the content as:
empno,first_name,last_name,dob 1234,John,Lee,"31/12/1978" 7777,Sam,vichi,"19/03/1975"
So our CREATE TABLE syntax will be something like below
CREATE TABLE Example for External Table
CREATE TABLE emp_ext ( empno NUMBER(4), first_name CHAR(20), last_name CHAR(20), dob CHAR(10) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE SKIP 1 BADFILE 'bad_%a_%p.bad' LOGFILE 'log_%a_%p.log' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LRTRIM MISSING FIELD VALUES ARE NULL REJECT ROWS WITH ALL NULL FIELDS ( empno INTEGER EXTERNAL (4), first_name CHAR(20), last_name CHAR(20), dob CHAR(10) DATE_FORMAT DATE MASK "dd/mm/yyyy") ) LOCATION ( 'employee1.dat','employee2.dat' ) ) PARALLEL REJECT LIMIT 0; SELECT * FROM emp_ext;
Now we can insert this temporary read only data to our oracle table say employee.
INSERT INTO employee ( empno, first_name, last_name, dob ) ( SELECT empno, first_name, last_name, dob FROM emp_ext );
Explanation of the above External Table Syntax
- The SKIP no_rows clause allows you to eliminate the header of the file by skipping the first row.
- The LRTRIM clause is used to trim leading and trailing blanks from fields.
- The SKIP clause skips the specified number of records in the datafile before loading. SKIP can be specified only when nonparallel access is being made to the data.
- The READSIZE parameter specifies the size of the read buffer. The size of the read buffer is a limit on the size of the largest record the access driver can handle. The size is specified with an integer indicating the number of bytes. The default value is 512KB (524288 bytes). You must specify a larger value if any of the records in the datafile are larger than 512KB.
- The LOGFILE clause names the file that contains messages generated by the external tables utility while it was accessing data in the datafile. If a log file already exists by the same name, the access driver reopens that log file and appends new log information to the end. This is different from bad files and discard files, which overwrite any existing file. NOLOGFILE is used to prevent creation of a log file. If you specify LOGFILE, you must specify a filename or you will receive an error. If neither LOGFILE nor NOLOGFILE is specified, the default is to create a log file. The name of the file will be the table name followed by _%p.
- The BADFILE clause names the file to which records are written when they cannot be loaded because of errors. For example, a record was written to the bad file because a field in the datafile could not be converted to the datatype of a column in the external table. Records that fail the LOAD WHEN clause are not written to the bad file but are written to the discard file instead. The purpose of the bad file is to have one file where all rejected data can be examined and fixed so that it can be loaded. If you do not intend to fix the data, then you can use the NOBADFILE option to prevent creation of a bad file, even if there are bad records. If you specify BADFILE, you must specify a filename or you will receive an error. If neither BADFILE nor NOBADFILE is specified, the default is to create a bad file if at least one record is rejected. The name of the file will be the table name followed by _%p.
- With external tables, if the SEQUENCE parameter is used, rejected rows do not update the sequence number value. For example, suppose we have to load 5 rows with sequence numbers beginning with 1 and incrementing by 1. If rows 2 and 4 are rejected, the successfully loaded rows are assigned the sequence numbers 1, 2, and 3.
External Table Access Driver
An external table describes how the external table layer must present the data to the server. The access driver and the external table layer transform the data in the datafile to match the external table definition.
The access driver runs inside of the database server hence the server must have access to any files to be loaded by the access driver. The server will write the log file, bad file, and discard file created by the access driver. The access driver does not allow to specify random names for a file. Instead, we have to specify directory objects as the locations from where it will read the datafiles and write logfiles. A directory object maps a name with the directory name on the file system.
Directory objects can be created by DBAs or by any user with the CREATE ANY DIRECTORY privilege. After a directory is created, the user creating the directory object needs to grant READ or WRITE permission on the directory to other users.
- If we do not specify the type for the external table, then the ORACLE_LOADER type is used as a default.
- Using the PARALLEL clause while creating the external table enables parallel processing on the datafiles. The access driver then attempts to divide large datafiles into chunks that can be processed separately and parallely. With external table loads, there is only one bad file and one discard file for all input datafiles. If parallel access drivers are used for the external table load, each access driver has its own bad file and discard file.
- We can change the target file name with the alter ddl command as:
ALTER TABLE emp_tab LOCATION ( 'newempfile.dat' );
- The SYS tables for Oracle External Tables are dba_external_tables, all_external_tables and user_external_tables.