InstantDB - Importing and Exporting Data


Please email any bug reports, comments or suggestions to:

peter.hearty@ceasar.demon.co.uk


InstantDB provides some rudimentary ways to import data from text files, and some even more primitive ways of exporting it. Imports are achieved using the SQL statement:

IMPORT table_name FROM import_file USING schema_file [BUFFER rows]

The import file is the file containing the data to be imported. The schema file is a text file which describes the columns to be added to the new table, and how they are held in the import file.

The optional BUFFER keyword reduces the number of disk writes during an import. The specified number of rows is buffered in memory and written out only when the buffer fills or the import completes. Note that if a crash occurs during a buffered import, the safest action is to drop the table and start again.
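
From Java, an import would normally be issued through JDBC like any other statement. The following is a minimal sketch only: the ImportExample class and the table and file names are hypothetical, and writing the file names as double-quoted strings is an assumption based on the quoted file name in the SET EXPORT example in the Trace Levels section.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class ImportExample {

    /** Builds an IMPORT statement; a bufferRows value <= 0 omits the BUFFER clause. */
    static String buildImport(String table, String importFile,
                              String schemaFile, int bufferRows) {
        // Assumption: file names are written as double-quoted strings.
        StringBuilder sql = new StringBuilder("IMPORT ")
                .append(table)
                .append(" FROM \"").append(importFile).append('"')
                .append(" USING \"").append(schemaFile).append('"');
        if (bufferRows > 0) {
            sql.append(" BUFFER ").append(bufferRows);
        }
        return sql.toString();
    }

    /** Runs the import on an already open connection to the database. */
    static void runImport(Connection con, String table, String importFile,
                          String schemaFile, int bufferRows) throws SQLException {
        try (Statement st = con.createStatement()) {
            st.execute(buildImport(table, importFile, schemaFile, bufferRows));
        }
    }

    public static void main(String[] args) {
        // Prints the statement text only; no database is contacted here.
        System.out.println(
                buildImport("customers", "customers.txt", "import_schema.txt", 500));
    }
}
```

A larger BUFFER value trades memory for fewer disk writes, so it is probably best chosen with the size of the import file in mind.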

The schema file syntax is a subset of the syntax supported by Microsoft's JET (c) database engine. An example schema file, import_schema.txt, is held in the Sample sub-directory. The only things not shown in the sample schema file are the alternative FORMAT directives. The full syntax of the format line is:

FORMAT={FIXEDLENGTH|DELIMITED(delimiter)|TABDELIMITED|CSVDELIMITED|AUTO} [STRICT]

The STRICT keyword is used to allow column delimiters to take precedence over string delimiters. By default, if the import encounters either a single quote ', or a double quote ", this is taken as the start of a string constant. The string only terminates when a corresponding quote is found. The import will include any column delimiters in the string constant and will even read across newlines.

There are two reasons for this default behaviour.

  1. It allows column delimiters to be included in string constants.
  2. It allows files exported by Microsoft Access (c) to be imported. Access sometimes splits string constants across multiple lines when large strings are exported.

The STRICT keyword ensures that whatever is between the chosen column delimiters is always strictly included in the column. String delimiters are then ignored.
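
The difference between the default and STRICT behaviour can be illustrated with a small stand-alone sketch. This is not InstantDB's parser: the DelimiterDemo class is hypothetical, it handles a single record only, and dropping the surrounding quotes from the column value in the default mode is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class DelimiterDemo {

    /** Splits one record into columns, honouring quotes unless strict is set. */
    static List<String> split(String text, char delimiter, boolean strict) {
        List<String> columns = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char openQuote = 0; // 0 means "not inside a string constant"
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!strict && openQuote != 0) {
                // Inside a string constant: only the matching quote ends it,
                // so delimiters and newlines become part of the column.
                if (c == openQuote) {
                    openQuote = 0;
                } else {
                    current.append(c);
                }
            } else if (!strict && (c == '\'' || c == '"')) {
                openQuote = c; // start of a string constant
            } else if (c == delimiter) {
                columns.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c); // in strict mode this includes quote characters
            }
        }
        columns.add(current.toString());
        return columns;
    }

    public static void main(String[] args) {
        String record = "1,\"Smith, John\",42";
        System.out.println(split(record, ',', false).size()); // prints 3
        System.out.println(split(record, ',', true).size());  // prints 4
    }
}
```

In the default mode the comma inside the quoted name stays within one column, while in the strict mode the same record is cut at every comma and the quote characters are kept as ordinary data.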

The AUTO format indicates that a test table with the column properties given in the schema should be generated automatically. In this case the import file should contain a single line giving the number of rows to be generated.

The IMPORT command checks whether the table to be imported already exists. If it does, the imported data is added to the existing table.

Data is exported by using the SET EXPORT SQL command. All subsequent result sets are exported to the given file. The full syntax of the SET EXPORT command is:

SET EXPORT filename [CSVDELIMITED|FIXEDLENGTH] [COLNAMEHEADER] [ROWNUMBERS] [CONTROLCOL] [SUMMARYHEADER] [TRACE level [CONSOLE][TIME]]

The SET EXPORT command isn't really up to producing reports. This can probably be better achieved using a JDBC based reporting facility. A filename of NULL switches data export back off.

Note that the SET EXPORT command affects the current thread of execution only. Each thread must execute its own SET EXPORT command if it wishes to record SELECT statements.
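
A typical export from Java might look like the sketch below. The ExportExample class and its method names are hypothetical. Writing NULL unquoted, and draining the result set after the query, are assumptions rather than documented requirements. Everything runs on the calling thread, since SET EXPORT affects only the thread that executes it.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class ExportExample {

    /** Builds a SET EXPORT statement; a null file name switches export off. */
    static String setExport(String fileName, String... options) {
        StringBuilder sql = new StringBuilder("SET EXPORT ");
        // Assumption: NULL is written unquoted, real file names in double quotes.
        sql.append(fileName == null ? "NULL" : "\"" + fileName + "\"");
        for (String option : options) {
            sql.append(' ').append(option);
        }
        return sql.toString();
    }

    /** Exports the rows of one SELECT, then switches export back off. */
    static void exportQuery(Connection con, String fileName, String select)
            throws SQLException {
        try (Statement st = con.createStatement()) {
            st.execute(setExport(fileName, "CSVDELIMITED", "COLNAMEHEADER"));
            try (ResultSet rs = st.executeQuery(select)) {
                // Assumption: reading every row is harmless whether the export
                // is written at execute time or at fetch time.
                while (rs.next()) {
                    // nothing to do with the rows here
                }
            } finally {
                st.execute(setExport(null));
            }
        }
    }

    public static void main(String[] args) {
        // Prints the statement text only; no database is contacted here.
        System.out.println(setExport("results.csv", "CSVDELIMITED", "COLNAMEHEADER"));
        System.out.println(setExport(null));
    }
}
```

Switching export off in a finally block keeps later queries on the same thread from being written to the file by accident.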

Trace Levels

The SET EXPORT command serves a dual purpose. As well as directing result sets to a file for a particular thread, it can also be used to set the level of diagnostic tracing which is enabled.

Trace levels are organised as a bitmap. The various bits are defined via the following public final ints in the db.Trace class:

Constant      Value  Bit  Description
TR_EVENT          1    0  Major events such as database open and close
TR_SQL            2    1  Logs each SQL statement processed
TR_ERROR          4    2  Logs errors
TR_OPEN           8    3  Table open and close
TR_TRANS         16    4  Transaction processing and locking
TR_PROGRESS      32    5  Progress of imports and index builds
TR_CACHE         64    6  Cache activity
TR_MEM          128    7  Memory and garbage collection
TR_INDEX        256    8  Index activity
TR_PARSE        512    9  SQL parsing
TR_JDBC        1024   10  JDBC calls
TR_EXPORT      2048   11  Used internally by SET EXPORT
TR_TABLESCAN   4096   12  Table scans during SELECTs

For example,

SET EXPORT "fred.log" TRACE 18

sets tracing on for SQL processing (TR_SQL, 2) and transaction processing (TR_TRANS, 16). Note that, unlike the export of result sets, trace levels are set globally: if any one thread sets a trace bit on, all other threads are affected. This inconsistency in the SET EXPORT command is unfortunate, but it has turned out to be necessary for performance reasons.
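
Because the level is a bitmap, it is usually clearer to build it from named constants than to write the number directly. The sketch below copies the values from the table above into a hypothetical TraceLevels class so that it runs without InstantDB on the class path; in a real program the constants would come from db.Trace.

```java
import java.util.ArrayList;
import java.util.List;

final class TraceLevels {

    // Values copied from the trace level table; the real constants are in db.Trace.
    static final int TR_SQL = 2;
    static final int TR_TRANS = 16;

    // Constant names in bit order, bit 0 first.
    static final String[] NAMES = {
        "TR_EVENT", "TR_SQL", "TR_ERROR", "TR_OPEN", "TR_TRANS",
        "TR_PROGRESS", "TR_CACHE", "TR_MEM", "TR_INDEX", "TR_PARSE",
        "TR_JDBC", "TR_EXPORT", "TR_TABLESCAN"
    };

    /** Lists the names of the trace bits that are set in the given level. */
    static List<String> describe(int level) {
        List<String> names = new ArrayList<>();
        for (int bit = 0; bit < NAMES.length; bit++) {
            if ((level & (1 << bit)) != 0) {
                names.add(NAMES[bit]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        int level = TR_SQL | TR_TRANS; // 2 + 16 = 18
        System.out.println("SET EXPORT \"fred.log\" TRACE " + level);
        System.out.println(describe(level)); // prints [TR_SQL, TR_TRANS]
    }
}
```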

Including the optional CONSOLE keyword causes tracing to be directed to standard output as well. The TIME keyword causes the time (GMT) to be output with every trace line.

The properties traceLevel, traceFile and traceConsole determine the initial tracing which is produced (i.e. before a thread executes the SET EXPORT command).

