Writing Interoperable Applications with ODBC

Richard Schwartz
ODBC Program Management

Overview

This technical article provides information to application developers on writing interoperable ODBC-enabled applications. The first section begins by exploring the challenges inherent to heterogeneous data access. It looks at ways of approaching these challenges and discusses the open database connectivity (ODBC) solution. It then discusses the ODBC programming paradigm, which application developers need to work within when coding to ODBC. The second section provides a high-level "recipe" for writing interoperable applications. It enumerates the steps involved in the development process, from planning through coding and testing, and includes information about issues a development team needs to address at each step. The third section contains detailed scenarios that illustrate proper usage of the ODBC paradigm and programming interface. The final section ties together the main ideas of this technical article.

The reader is presumed to have application development experience, but no knowledge of the ODBC API or experience writing applications for heterogeneous data access is necessary. Readers who are familiar with ODBC and the issues involved in writing applications that access multiple DBMSs may want to skip to the more technical sections, starting with "A Recipe for Writing Interoperable Applications."

Introduction to Interoperability

Microsoft developed the open database connectivity (ODBC) interface as a means of providing applications with a single application programming interface (API) through which to access data stored in a wide variety of database management systems (DBMSs). Prior to ODBC, applications written to access data stored in a DBMS had to use the proprietary interface specific to that database. If application developers wanted to provide their users with heterogeneous data access (access to data in more than one data source), they needed to code to the interface of each data source. Applications written in this manner are difficult to code, difficult to maintain, and difficult to extend. ODBC was created to overcome these difficulties.

The ODBC architecture has four components: the application, which calls ODBC functions to submit SQL statements and retrieve results; the Driver Manager, which loads drivers on behalf of the application; the driver, which processes ODBC function calls, submits SQL requests to a specific data source, and returns results to the application; and the data source, which consists of the data the user wants to access together with its associated DBMS, operating system, and network platform.

This architecture is designed to permit maximum interoperability: It allows application developers to create an application without targeting a specific DBMS. End users can then use the application with the DBMS that contains their data by adding modules called database drivers, which are dynamic-link libraries (DLLs) available from the database vendor or third-party driver developers.

This technical article provides information to application developers on writing interoperable ODBC-enabled applications. Note that the concept of interoperability is a relative one. To one developer it might mean being able to access data in each of two known data sources using the same application. To a second developer, it might mean writing applications that work with DB2 today, but might at some future date need to work with ORACLE. To another developer it might mean providing access to any data stored in a desktop-database format. And to a fourth developer, interoperability might mean giving access to any data for which there exists an ODBC driver. ODBC can meet the interoperability needs of all of these developers.

If ODBC Is the Solution, What Is the Question?

Heterogeneous data access is very appealing to developers and end users alike. Independent software vendors (ISVs) would like to be able to provide users of their shrink-wrapped applications with the ability to use data that is stored in one of the many commercially available databases. Corporate developers want easy ways to access data that is stored in different DBMSs. These same developers want the ability to write code to one DBMS and reuse it on a second (or a third, or . . .). Database vendors see data access to their DBMS from popular applications as a way of increasing use of (and demand for) their product. And for end users, interoperability means there is a simple way to access their data and process it from a familiar, easy-to-use interface.

The Hurdles to Interoperability

The benefits of (and need for) heterogeneous data access are well-defined. However, the hurdles to accessing data in multiple DBMSs from a shrink-wrapped application, or writing code with one DBMS in mind and having it work on another DBMS without modification, are many. The crux of the problem is that each DBMS provides its own interface, its own manner of handling and storing data, its own version of SQL, its own set of functionality, and so forth. Any attempt to create interoperable applications has to address these differences head-on.

Let's look more closely at the hurdles to interoperability.

Programming interfaces

Each DBMS typically has its own call-level programming interface through which it communicates with client applications. For example, SQL Server provides an interface called DB-Library™, while ORACLE® Server provides the Oracle Call Interface (OCI). Any client application that wishes to access more than one DBMS must therefore be able to translate requests and data transfers into each interface it needs to access. After the application has been completed, if developers wish to provide access to a new DBMS, they need to learn a new programming interface, write and test new code for accessing the new DBMS, and recompile and redistribute the application.

Functionality

Although database users have come to expect a certain set of functionality for defining, updating, and working with their data (and there is much overlap from one product to another), there is no agreement on what the proper set of functionality is. Even databases that follow the relational database model, use SQL as a query language, and run on a client-server architecture have no consensus on functionality. On the contrary, database vendors take pride in the ways their product differs from the others on the market, and users purchase those products partly because of the extended functionality they offer. But what is good for a single database is a nightmare for interoperability. Interoperable applications must contend with the fact that they cannot know what functionality to expect from any given DBMS. The problem is further compounded by the fact that users want an application to access data in both relational and non-relational databases (be they hierarchical, object-oriented, ISAM, or just plain text), providing the same interface and functionality for each. This raises the question of whether there can be a "standard" set of functionality, and if not, what can be done about it.

SQL syntax and semantics

While most DBMSs support SQL, and while the SQL they support is typically based on an ANSI or ISO standard, each DBMS has its own idiosyncratic implementation of the language. Although these differences are often small, they are huge impediments to interoperability, as SQL statements must be tailor-made for each DBMS.

System catalogs

Relational database management systems (RDBMSs) use system catalogs to hold information about the data being stored, such as the column names, data types, and permissions. Applications need catalog information to create databases, to make changes to data, and to provide a clean user interface. Although ANSI has defined standard system views that contain this system data, these are not yet implemented by most database vendors. Until they are implemented, each RDBMS will continue to have an incompatible manner of providing information from its system catalog. Consider, for example, that the names of data types differ across DBMSs: SQL Server refers to alphanumeric data as CHAR, ORACLE uses CHARACTER, and Paradox calls it ALPHANUMERIC. Because of these differences, an application needs to make specific, individual reference to the system catalog of each DBMS it supports. Once again, there is the issue of writing DBMS-specific code, and the problem of portability.

Solving the Interoperability Problem

As stated above, ODBC was created to permit maximum interoperability between applications and databases. Its primary design goal is to make it easier for applications to communicate with multiple, different DBMSs. This eases the initial development burden, allows developers to write code that is easily extensible and portable, and broadens the range of databases an application can support.

ODBC achieves its goals through its driver-based architecture and by defining a set of standards that mitigate the differences among databases, thereby eliminating DBMS-specific code.

Addressing the issue of differing database functionality

In addressing the issue of differing functionality, there were a number of approaches for ODBC to choose from. One solution would have been to limit the functionality available through ODBC to only that which all DBMSs have in common. This solution is often referred to as the lowest-common-denominator approach. Under this approach, an application would be assured of a given set of functionality. Although this may seem appealing at first, it turns out that there is very little functionality that all databases have in common. Many application developers would have found this lowest common denominator to be insufficient for meeting their users' needs, and it would be unacceptable to users of advanced databases.

A second approach would have been to define a guaranteed functionality that was somewhat larger than the lowest common denominator. Any functionality that a DBMS could not provide would be handled by ODBC. ODBC would essentially need a database engine of its own to provide this functionality. (This is the model used by Microsoft® Access®.) Although the functionality provided by the engine would go further toward meeting the needs of end users, there are still applications that would find this solution too constricting, since not all of a DBMS's functionality would be utilized.

Still another approach would have been to not worry about providing common functionality at all. Instead, the ODBC API would provide a way to get to all features of all databases, but it would make no guarantees about any of it. This would maximize functionality available to applications, but at the cost to developers of having to write much conditional code to deal with functionality that might not be available.

In the end, ODBC chose a hybrid approach to handling database functionality. As stated above, ODBC provides a minimum level of functionality that all drivers are expected to support, while still utilizing as many features of a DBMS as possible. Application developers have to decide either to use only the minimum level of functionality or to write conditional code that tests for extended functionality. Applications that make the latter choice are responsible for handling any missing DBMS functionality in whatever way they see fit. The process of compensating for differing functionality is the heart of the "ODBC programming paradigm."

ODBC Programming Paradigm

ODBC has chosen to use as much functionality of the DBMS as possible. Because no DBMS provides all functionality, ODBC has provided a standard way for an application to check what functionality is available from any data source it is connected with. Available functionality is communicated by way of conformance levels and capabilities.

Even though DBMSs can differ greatly from one to the next, many of them have sets of functionality in common. Therefore, ODBC has defined conformance levels, both for the API and for SQL statements. There are three ODBC API conformance levels defined in the Programmer's Reference: Core, Level 1, and Level 2. Driver writers are encouraged to support Level 1 API (at a minimum), as most interactive applications require much of the functionality defined in that conformance level. But an application should not make any assumptions about the conformance level of a driver. Developers should use the SQLGetInfo function to explicitly check for API conformance.

Note Information about what functions are contained within each conformance level can be found in the ODBC Programmer's Reference in the documentation for SQLGetFunctions.

ODBC also defines three SQL conformance levels: Minimum, Core, and Extended. Applications should use SQLGetInfo to check for SQL conformance.

Note Information about SQL conformance levels can be found in Appendix C of the ODBC Programmer's Reference.

Although drivers are required to report their conformance levels, they are not limited to the functionality defined by those levels. On the contrary, driver writers are encouraged to support as much of the API and SQL grammar as they are able. In order to learn what functionality is provided by a given driver, an application can check for specific capabilities using SQLGetInfo.

ODBC also provides a standard means of executing extended functionality. For example, although most DBMSs provide scalar functions such as SUBSTRING and ABS, there is no consensus on the syntax of those functions. ODBC defines a canonical syntax that the application can use to access this extended functionality of the data source. Similarly, while most databases support time and date data types, there are many different ways to represent those data types. ODBC defines a standard way of representing times and dates.

By design, all ODBC applications have a similar structure. An application begins by connecting to a data source. Next, it queries the data source for information about conformance levels and individual capabilities that the application will take advantage of if available. Then it typically sends one or more SQL statements to the DBMS for processing. Next, the application retrieves the results sent by the DBMS, checking for errors along the way. Finally, the application terminates the connection when it has finished interacting with the data source.
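
To make this structure concrete, here is a minimal sketch in C using the ODBC 1.x-style handle functions. The data source name, login, table, and column are hypothetical, and error handling is reduced to a single macro; a real application would call SQLError and branch on the specific capabilities it cares about.

#include <windows.h>
#include <stdio.h>
#include "sql.h"
#include "sqlext.h"

/* Abort on any ODBC error; a production application would call SQLError here. */
#define CHECK(rc) if ((rc) != SQL_SUCCESS && (rc) != SQL_SUCCESS_WITH_INFO) goto cleanup

int main(void)
{
    HENV    henv  = SQL_NULL_HENV;
    HDBC    hdbc  = SQL_NULL_HDBC;
    HSTMT   hstmt = SQL_NULL_HSTMT;
    char    szName[31], szOuterJoins[2];
    SDWORD  cbName;
    SWORD   cb;
    RETCODE rc;

    /* 1. Connect to the data source (DSN, user, and password are hypothetical). */
    rc = SQLAllocEnv(&henv);                                        CHECK(rc);
    rc = SQLAllocConnect(henv, &hdbc);                              CHECK(rc);
    rc = SQLConnect(hdbc, (UCHAR *)"SalesDSN", SQL_NTS,
                    (UCHAR *)"user", SQL_NTS, (UCHAR *)"pwd", SQL_NTS);
    CHECK(rc);

    /* 2. Ask about the capabilities the application intends to use. */
    rc = SQLGetInfo(hdbc, SQL_OUTER_JOINS, szOuterJoins,
                    sizeof(szOuterJoins), &cb);                     CHECK(rc);

    /* 3. Send a SQL statement, and 4. retrieve the results. */
    rc = SQLAllocStmt(hdbc, &hstmt);                                CHECK(rc);
    rc = SQLExecDirect(hstmt, (UCHAR *)"SELECT name FROM employees", SQL_NTS);
    CHECK(rc);
    rc = SQLBindCol(hstmt, 1, SQL_C_CHAR, szName, sizeof(szName), &cbName);
    CHECK(rc);
    while ((rc = SQLFetch(hstmt)) == SQL_SUCCESS || rc == SQL_SUCCESS_WITH_INFO)
        printf("%s\n", szName);

cleanup:
    /* 5. Terminate the connection and free all handles. */
    if (hstmt != SQL_NULL_HSTMT) SQLFreeStmt(hstmt, SQL_DROP);
    if (hdbc  != SQL_NULL_HDBC)  { SQLDisconnect(hdbc); SQLFreeConnect(hdbc); }
    if (henv  != SQL_NULL_HENV)  SQLFreeEnv(henv);
    return 0;
}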

The primary feature of the ODBC programming paradigm is the shift from writing DBMS-dependent code to writing capability-dependent code. Any application that is not written for a particular driver, even one that only uses functionality listed as Core API or Minimum SQL, needs to actively query the DBMS for supported features and respond appropriately to the answers it receives.

A Recipe for Writing Interoperable Applications

This section examines the process of writing an interoperable application from the planning phase through development and testing. It gives a step-by-step account of the issues that need to be addressed, providing some specific solutions along the way. The primary focus of the section is on the planning phase. This is partly due to the scope of this paper, but mostly due to the belief that interoperable applications are the result of good planning.

Note that although the issues to address are given in a linear order, the reality of planning, developing, and testing the application tells us that these steps are not always taken in a specific order. Often more than one step will be addressed at a time. Do not take the "recipe" metaphor too literally, to the detriment of your development process.

Step 1: Understand ODBC and Its Applications

The first step in developing an ODBC-enabled application is understanding the ODBC programming paradigm, including the problems that ODBC was designed to solve. The discussion in the above sections describes this paradigm. Another source for this information is the ODBC Programmer's Reference.

Part of understanding ODBC is knowing when its use is appropriate and when it is best to look for other solutions. ODBC is not a panacea for all database programming problems. As discussed above, ODBC was designed primarily as a tool for heterogeneous data access. Although ODBC exposes much of the functionality of any given DBMS, and serves as an alternative to coding directly to a particular DBMS's API, it is more advantageous to use the proprietary DBMS interface for some applications, particularly those applications that perform administrative tasks on the database and those that are written for only one DBMS.

An example of an administrative application that is best written in the native API is SQL Administrator for SQL Server. This application allows a user to create SQL Server devices; create databases; move the log segment to a separate device; create and maintain log-in IDs, user names, and remote log-ins; obtain statistics information; view the error log; and provide other administrative functions in addition to being an ad-hoc querying tool. Almost all of this functionality can be achieved using the ODBC API, but if one knows enough about the SQL Server architecture to write SQL Administrator (a SQL Server-specific administrative utility), it is definitely easier to use the DB-Library API.

Step 2: Determine Which Databases You Need to Connect to and the Conformance Levels of Those Drivers

The second step in writing interoperable applications is to determine which databases you need to be able to connect to. Many applications use ODBC to access only one or two specific DBMSs. If you have a limited, known set of databases you need to connect to, you may be able to take advantage of features specific to those databases. This may cut down on the amount of capability-checking the application needs to perform, thereby lessening the application's complexity. On the other hand, if you are writing an off-the-shelf application, or you are uncertain which databases you need to communicate with, then you will need to program for maximum interoperability, making no assumptions about the databases you are connected to.

Once the data sources have been determined, it can be helpful to know the conformance levels of the drivers that access those data sources. For applications that are written for a limited number of data sources, knowing conformance levels can give you some idea of what functionality to expect, which might decrease the amount of work you need to do (particularly relating to Steps 3 and 4 below). For applications that need to be interoperable with many or unknown data sources, it is useful to know the general conformance levels of currently available drivers when planning the application.

Accessing a known and limited set of data sources

If the set of data sources to be accessed is known, as is often the case when writing custom applications, the developer needs to consider the constraints imposed by existing hardware and software and by cost considerations. For example, are drivers available for the data sources? Do these drivers require additional software, such as network components or gateways? Will the drivers work with the existing network and server hardware? Although ODBC allows applications to ignore these questions, developers writing software to work in a specific environment must ensure that all components are available and fit cost constraints before they begin writing code.

For example, the only commercially available ODBC driver for ORACLE at the time of this writing comes from Microsoft. To communicate with ORACLE, this driver uses SQL*Net, a network component from Oracle Corporation. Although SQL*Net is available in both TSR and DLL form, the Microsoft ORACLE driver can only use SQL*Net DLLs. Furthermore, although there is a SQL*Net TSR for DECnet™, there is not a SQL*Net DLL for DECnet. Therefore, before users can use an ODBC application to access ORACLE data, a company currently using DECnet and ORACLE must either change its network or wait for an ORACLE driver that can communicate across DECnet.

Overall driver conformance

Microsoft has encouraged driver writers to support Level 1 API conformance, and to date, most drivers do. In response, there are applications being written that require Level 1 conformance in order to connect with the driver (Microsoft Access being a notable example). As for SQL conformance, most server databases provide Core conformance, while most desktop databases (Microsoft Access, Paradox, text, and so on) provide only Minimum SQL conformance. This is because the desktop databases do not possess an engine to parse SQL statements, so the SQL support must be implemented in the driver.

Determining the conformance level of drivers

There are a number of ways to determine the conformance levels of ODBC drivers. First, you can check the documentation that comes with the drivers. Alternatively, you can access the drivers through ODBC Test (a utility provided in the ODBC SDK) and confirm conformance there. A shortcut could be to check the driver's conformance in the ODBC Driver Catalog, if it is listed.

Differences between desktop and server databases

Part of understanding the conformance issue is understanding the fundamental differences between data stored in a relational database on a server or mainframe and data stored in ISAM format in a desktop database. Developers who are familiar with relational databases need to understand that some of the basic concepts from the relational model, such as transactions and system catalogs, are not present in desktop databases. Also, as pointed out above, desktop databases do not use SQL as their native query language. On the other hand, developers familiar with desktop databases need to realize that some of the functionality they depend on, such as the rich capabilities for navigating (moving) through a result set, is not present in the relational model. Although ODBC makes interoperability between the two models possible, and developers should be able to present an interface to the user that masks the differences, there is an impact on the application developer. ODBC enables interoperability and decreases complexity, but it does not eliminate all of the issues.

Step 3: Determine the Functionality Requirements of the Application

Together with Step 4 below, this is the major focus of the planning phase. Functionality here refers to the capabilities of the driver you are connected to. It might be the same as the functionality your application offers the end user, but not necessarily. (Note: In some cases, the application will compensate for functionality not provided in the driver. See Step 4, below.) As discussed above, the fact that each DBMS provides a unique set of functionality is one of the major hurdles to creating interoperable applications. The functionality required by your application will determine which drivers it is compatible with (or, as seen below, the amount of work you need to do to make your application compatible).

When determining the functionality required by your application, you might want to consider each of the following:

It is virtually impossible to know all the ways in which DBMSs can vary from one to the next. The best way to get an idea is to look over the options for SQLGetInfo, the primary function provided for querying DBMS capabilities. Familiarity with SQLGetInfo greatly increases the likelihood of creating interoperable applications.

Step 4: Determine How to Handle Functionality That Is Not Supported

Many applications will require functionality that is not supported by the DBMS. There are essentially five options available to the application at that time:

Handling these situations involves weighing the tradeoffs. Will your users miss the functionality if you don't provide it? Are you eliminating important data sources by not communicating with them? Will your application be less interoperable if you provide the functionality yourself? Is there a clean way to implement an alternative without frustrating or confusing the end user? These are tough choices, but they are inevitable when writing complex, interoperable applications.

The Programmer's Reference from the ODBC SDK 1.0 provides a good illustration of this kind of situation: Assume your application displays data in a table. It uses SQLColumnPrivileges to determine which columns a user can update and dims those columns the user cannot update. Assume further that not all drivers support SQLColumnPrivileges. What are your options?

Step 5: Determine Appropriate Places for Data-Source-Specific Code

There are times when it is appropriate to write data-source-specific code in applications. Although this will probably not be an option for shrink-wrapped interoperable applications, there are cases in which it might be useful for applications written to known data sources. In particular, a developer may choose to write directly to the data source's interface to gain access to features of the DBMS that are not exposed through ODBC.

ODBC drivers generally support any SQL that the DBMS supports, including SQL that is not covered by ODBC SQL. If an application sends SQL to the driver that is not supported by ODBC, the driver will pass the statement through to the DBMS. In this manner, an application can use data-source-specific code.

For example, SQL Server supports a USE database statement, with which a user can change the current SQL Server database. An application developer writing an application specific to SQL Server can change the database by calling SQLExecDirect and sending it this statement. The SQL Server driver will pass the statement through to SQL Server, which will execute the statement.
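
A minimal sketch of that pass-through call, assuming a statement handle (hstmt) already allocated on a SQL Server connection; the database name is hypothetical:

    /* The driver does not parse the SQL Server-specific USE statement;
       it passes it through unchanged, and SQL Server changes the current
       database for this connection. */
    RETCODE rc;
    rc = SQLExecDirect(hstmt, (UCHAR *)"USE sales", SQL_NTS);
    if (rc != SQL_SUCCESS && rc != SQL_SUCCESS_WITH_INFO)
        ;   /* call SQLError on hstmt to retrieve the DBMS diagnostics */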

Here is another example of when an application may choose to write data-source-specific code. Suppose the application is a table builder that allows users to create tables and insert values into them. One of the things such an application has to keep in mind is domain integrity. Defaults are a common way of enforcing domain integrity constraints. Assume that the underlying drivers do not support the Integrity Enhancement Facility (IEF). Now the developer must use data-source-specific SQL for defining defaults.

In ORACLE, the SQL looks like this:

create table employees (name char(20), deptno number(5) default 100)

In SQL Server, the SQL looks like this:

create table employees (name char(20), deptno int)

create default deptno_default as 100

sp_bindefault deptno_default, 'employees.deptno'

The driver will pass these data-source-specific SQL statements through to the DBMS.

Step 6: Code the Application

Once the design decisions have been made, it is time to code the application. The primary obstacle to writing interoperable applications is using data-source-specific code. As seen above, there are some cases in which writing code particular to a data source is appropriate. However, much data-source-specific code is not written intentionally, and it generally comes in the form of making assumptions about the functionality of DBMSs.

Using SQLGetInfo

The key to avoiding driver-specific code is SQLGetInfo. There are more than 70 values for the SQLGetInfo fInfoType parameter, each representing a DBMS feature that might or might not be present. All applications need to check the capabilities of the driver before using a feature. Omitting this check opens the application up to interoperability problems. There are dozens of examples in which developers have neglected to check for capabilities, mostly because it did not occur to them that some functionality might not be supported. Here are some tips on avoiding data-source-specific code.

Multiple active statement handles per connection

Many developers assume that all drivers support multiple statement handles per connection. However, the SQL Server driver from Microsoft does not. This feature needs to be checked with an fInfoType value of SQL_ACTIVE_STATEMENTS.
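
A sketch of the check, assuming a connected hdbc and the sql.h and sqlext.h headers shown earlier:

    /* SQL_ACTIVE_STATEMENTS returns a 16-bit count; 0 means no stated limit. */
    UWORD cActiveStmts;
    SWORD cb;

    if (SQLGetInfo(hdbc, SQL_ACTIVE_STATEMENTS, &cActiveStmts,
                   sizeof(cActiveStmts), &cb) == SQL_SUCCESS
        && cActiveStmts == 1)
    {
        /* Only one active hstmt at a time (the SQL Server driver, for
           example): open a second connection rather than a second hstmt. */
    }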

Transaction capability

As noted above, some data sources, particularly desktop databases, do not support transactions. Others provide transaction support but have differing behavior in response to data definition language (DDL) statements within a transaction. Use an fInfoType value of SQL_TXN_CAPABLE to check transaction support and behavior.

Cursor behavior on COMMIT and ROLLBACK

No assumptions can be made about the behavior of a cursor following a COMMIT or ROLLBACK operation. In some cases cursors will be closed and deleted, in other cases they will be closed but not deleted, and in other cases they will be preserved. To find out about cursor behavior, use SQLGetInfo with fInfoType values of SQL_CURSOR_COMMIT_BEHAVIOR and SQL_CURSOR_ROLLBACK_BEHAVIOR.
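
One possible way to act on that information, again assuming a connected hdbc:

    /* The SQL_CB_* values describe what the driver does to cursors on COMMIT. */
    UWORD fCommitBehavior;
    SWORD cb;

    SQLGetInfo(hdbc, SQL_CURSOR_COMMIT_BEHAVIOR, &fCommitBehavior,
               sizeof(fCommitBehavior), &cb);
    switch (fCommitBehavior)
    {
    case SQL_CB_DELETE:     /* cursors closed, prepared statements dropped  */
    case SQL_CB_CLOSE:      /* cursors closed, prepared statements retained */
        /* re-execute (and possibly re-prepare) the statement after COMMIT  */
        break;
    case SQL_CB_PRESERVE:   /* cursor remains open and positioned */
        break;
    }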

Support for NOT NULL in CREATE TABLE statement

Some data sources, such as dBASE® and FoxPro®, do not support the NOT NULL column constraint of the CREATE TABLE statement. Use an fInfoType value of SQL_NON_NULLABLE_COLUMNS to check for availability.

Support for scalar functions

Virtually all data sources will support scalar functions, but no assumptions can be made about which ones are supported and what syntax is used. Call SQLGetInfo with fInfoType values of SQL_NUMERIC_FUNCTIONS, SQL_STRING_FUNCTIONS, SQL_TIMEDATE_FUNCTIONS, and so on, to determine what scalar functions are supported, and use the ODBC canonical syntax for scalar functions.
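
For scalar functions, SQLGetInfo returns a 32-bit bitmask per category. A sketch of checking for SUBSTRING before using the canonical syntax, assuming a connected hdbc:

    /* Each bit in the mask corresponds to one string function. */
    UDWORD fStringFuncs;
    SWORD  cb;

    if (SQLGetInfo(hdbc, SQL_STRING_FUNCTIONS, &fStringFuncs,
                   sizeof(fStringFuncs), &cb) == SQL_SUCCESS
        && (fStringFuncs & SQL_FN_STR_SUBSTRING))
    {
        /* Safe to send the ODBC canonical form, for example:
           SELECT {fn SUBSTRING(postal_code, 1, 5)} FROM customers */
    }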

Data type support

Do not make any assumptions about what types are supported by the DBMS and what their names are. Use SQLGetTypeInfo to find this information. Be sure to check their precision and other characteristics to find out about data-source-specific behavior.
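
A sketch of retrieving the type information, assuming a freshly allocated hstmt; the TYPE_NAME column (column 1) holds the DBMS-specific name to use in CREATE TABLE statements:

    char    szTypeName[129];
    SWORD   fSqlType;
    SDWORD  cbTypeName, cbSqlType;

    /* SQLGetTypeInfo returns a result set with one row per supported type. */
    SQLGetTypeInfo(hstmt, SQL_ALL_TYPES);
    SQLBindCol(hstmt, 1, SQL_C_CHAR,  szTypeName, sizeof(szTypeName), &cbTypeName);
    SQLBindCol(hstmt, 2, SQL_C_SHORT, &fSqlType,  0,                  &cbSqlType);
    while (SQLFetch(hstmt) == SQL_SUCCESS)
        printf("%s is the local name for ODBC SQL type %d\n", szTypeName, fSqlType);
    SQLFreeStmt(hstmt, SQL_CLOSE);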

Identifier naming conventions

Application developers should not assume that all data sources support the same identifier naming convention. In addition to a single identifier (usually a table or column name), identifiers can be extended by the owner's name and the use of qualifiers. ODBC supports one-, two-, and three-part naming conventions. To determine if, and in what statements, a data source supports owner names, an application calls SQLGetInfo with an fInfoType of SQL_OWNER_USAGE. To determine if, and in what statements, a data source supports qualifiers, an application calls SQLGetInfo with an fInfoType of SQL_QUALIFIER_USAGE. In addition, different data sources use different characters as the separator between a qualifier name and the element that follows it. To find out what character is being used, call SQLGetInfo with an fInfoType of SQL_QUALIFIER_NAME_SEPARATOR.

Once again, there is no way to know all the ways in which one DBMS differs from another. The best way to become familiar with the capability issues that your application might need to address is to look over the possible values for the fInfoType parameter of SQLGetInfo.

Using ODBC SQL

A second major step in writing generic code is to be certain to use the ODBC SQL syntax and data types. ODBC provides these standards to mitigate the incompatibilities among DBMSs. Consider the syntax for performing outer joins. (Note: ANSI has defined a standard syntax for outer joins, but this standard has not been implemented by most database vendors.) Here are two versions of the same request for an outer join. The first is written for ORACLE Server, the second for SQL Server:

SELECT EMPLOYEE.NAME, DEPT.NAME

FROM EMPLOYEE, DEPT

WHERE (EMPLOYEE.PROJID = 544) AND (EMPLOYEE.DEPTID = DEPT.DEPTID (+))


SELECT EMPLOYEE.NAME, DEPT.NAME

FROM EMPLOYEE, DEPT

WHERE EMPLOYEE.PROJID = 544 AND EMPLOYEE.DEPTID *= DEPT.DEPTID

Rather than writing conditional code to fine-tune this SQL SELECT statement for the appropriate DBMS, the programmer should send the SELECT statement using the SQL syntax defined in Appendix C of the ODBC Programmer's Reference. Using this syntax, the SQL statement sent to the DBMS will look like one of the following (the escape syntax is given first, the shorthand syntax second):

select employee.name, dept.deptname from 

--(*vendor(Microsoft),product(ODBC)oj employee left outer join 

dept on employee.deptid=dept.deptid*)-- 

where employee.projid =544


select employee.name, dept.deptname from 

{oj employee left outer join dept on employee.deptid=dept.deptid} 

where employee.projid =544

It is then the driver's responsibility to translate the ODBC SQL grammar to the SQL native to the data source. (Note that before using outer joins, the application must first call SQLGetInfo with an fInfoType value of SQL_OUTER_JOINS to determine if the data source even supports the feature.)

Performance issues

Somewhat off the topic of writing interoperable applications, but directly relevant to the topic of coding ODBC applications, are some misunderstandings developers have had about the proper use of certain ODBC constructs. The consequences of these misuses have been significant performance problems for these applications. To avoid future problems, here are descriptions of the proper use of two of the misunderstood constructs, prepared execution and opening and closing connections.

Prepared vs. direct execution

ODBC provides two ways to submit an SQL statement to the DBMS for processing: direct execution (using SQLExecDirect) and prepared execution (using SQLPrepare and SQLExecute). Prepared execution is useful if a statement will be executed many times, or if an application needs information about a result set prior to the execution of the statement. Under prepared execution, when it receives the SQLPrepare call the data source compiles the statement, produces an access plan, and returns the access plan to the driver. The data source then uses this plan each time it receives a SQLExecute call. For statements that are executed multiple times, prepared execution is a performance advantage because the access plan need only be created once. But for statements that are executed just once, prepared execution adds overhead, and hence there is a performance hit. Direct execution is the proper choice for statements that are executed a single time. Using the correct execution strategy is one way of optimizing application performance.
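
A sketch contrasting the two approaches, assuming a connected hstmt; the table and parameter are hypothetical, and SQLBindParameter is the ODBC 2.0 name for what ODBC 1.0 called SQLSetParam:

    long   deptno;
    SDWORD cbDeptno = 0;

    /* Prepared execution: the access plan is built once, then reused. */
    SQLPrepare(hstmt, (UCHAR *)"INSERT INTO departments (deptno) VALUES (?)", SQL_NTS);
    SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_INTEGER,
                     0, 0, &deptno, 0, &cbDeptno);
    for (deptno = 100; deptno <= 500; deptno += 100)
        SQLExecute(hstmt);

    /* Direct execution: the right choice for a statement run only once. */
    SQLExecDirect(hstmt, (UCHAR *)"DELETE FROM departments WHERE deptno = 300", SQL_NTS);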

Opening and closing connections

Opening and closing connections can be very time-consuming. Under ODBC 1.0, upon opening a connection, the driver manager loads the driver DLL and calls the driver's SQLAllocEnv and SQLAllocConnect functions, plus the driver's connect function corresponding to the connection option chosen by the application. Upon closing a connection, the driver manager unloads the DLL and calls all the disconnect functions: SQLDisconnect, SQLFreeConnect, and SQLFreeEnv. (Note: Under ODBC 2.0, the driver manager will not unload the DLL or call SQLFreeConnect or SQLFreeEnv upon receiving a SQLDisconnect call; instead it waits until the application calls SQLFreeConnect or requests a connection to a different data source.) For this reason, from a performance perspective, it is preferable to leave connections open, rather than closing and re-opening them each time a statement is executed. However, there is a cost to maintaining open, idle connections. Each connection consumes a significant amount of resources on the server, which can cause problems on PC-based DBMSs that have limited resources. Therefore, applications must use connections judiciously, weighing the potential costs of any connection strategy.

Step 7: Test the Application

In most ways, testing an ODBC application is no different than testing any other application. However, testing against multiple drivers is the only way to discover how interoperable an application truly is. For applications that are written for a known set of drivers, it is critical to test against all of them. For applications that are generically interoperable, application developers are advised to test with as many drivers as they can. At a minimum, developers should test with at least one driver with limited capabilities (many single-tier drivers fall into this category) and one driver having extensive capabilities (many multiple-tier drivers fall into this category).

The ODBC Test utility that comes with the ODBC SDK version 1.0 can be an aid to the testing process. It can be particularly helpful when tracking down interoperability problems with specific drivers: ODBC Test can help the application developer determine if the problem is with the application or with the driver. For the ODBC SDK version 2.0, there will be an expanded version of the Test tool, plus a second utility called ODBC Spy that will capture the communication between application and driver. ODBC Spy will be a valuable debugging tool, providing information on what ODBC functions were called in what sequence, including parameter and return information.

Application Development Scenarios

Although much of the discussion of interoperability pertains to all applications, the specific issues that a developer needs to address are very much dependent upon the nature of the application. ODBC-enabled applications are many and varied. However, for the purposes of this article, it is possible to categorize these applications according to the number of data sources they need to be able to connect with, and the amount of functionality they require of the data source.

Below are four scenarios that examine the interoperability issue in more detail. Each scenario typifies a subset of applications and discusses approaches to planning and testing.

Note These scenarios are presented solely to illustrate the process of determining how ODBC can best be employed to meet different types of data access challenges. They are by no means intended to represent complete product specifications.

Scenario 1: Merging Data from Two Known Data Sources

Description

A mid-sized manufacturing firm needs a small, custom application that accesses sales data stored in both Microsoft SQL Server and dBASE IV. The company's primary DBMS is SQL Server. It contains tables with information concerning inventory, personnel, customers, sales, and so forth. The dBASE database is used in the field by sales representatives to record information about individual sales. The application in question will run queries against SQL Server based on the data in dBASE and generate summary reports. Assume that there are no plans for using other DBMSs, so the application can ignore many interoperability issues.

Planning the application

The developer could accomplish his or her goal by writing directly to the SQL Server API and retrieving data directly from the dBASE files. However, assume this developer is not very familiar with the SQL Server API or the structure of dBASE files, and prefers to learn a single interface (ODBC) for accessing both data sources. Because the focus of the project is heterogeneous data access, the application seems well-suited for ODBC.

The databases this application needs to connect with are a fixed set: The company uses only SQL Server and dBASE IV. Because the SQL Server driver is multiple-tier and the dBASE driver is single-tier, the developer might have some concerns about the capabilities of the dBASE driver. However, the functionality requirements of the application are not very demanding. The drivers in question should pose no particular problems due to functionality deficiencies.

The dBASE database contains data regarding sales that the salespeople have made, both basic invoice information and notes particular to a given sale. In order to produce reports from this data, information is needed from SQL Server. For example, the dBASE database contains only product code numbers, but the report will need full product descriptions, which are located in the PRODUCTS database on SQL Server.

What are the specific functionality requirements of the application? From the perspective of ODBC, the application needs to do the following:

Because the drivers have adequate functionality, the developer need not be concerned about having to code around deficiencies. In fact, because the data sources are known and not subject to change, the application can take full advantage of any DBMS-specific features that it wants. For example, the application can prompt the user for the name of the local database using a standard File/Open dialog box with the file extension set for dBASE (.dbf), as opposed to asking the user to identify a data source using the standard ODBC SQL Data Sources dialog box.

Coding and testing the application

From the perspective of ODBC, the coding of the application is very straightforward. After the user selects a dBASE file from the File/Open dialog box, the application uses SQLAllocEnv to allocate an environment handle, SQLAllocConnect to get connection handles for both data sources, and SQLDriverConnect to connect with the drivers.

The application builds a SELECT statement to run against the selected dBASE file. Because the application knows the structure of the databases in advance (things such as the names of the columns and their data types), it need not query for this information. It allocates a statement handle with SQLAllocStmt, sends the SELECT statement to the DBMS via SQLExecDirect, and binds the columns using SQLBindCol. The results are retrieved one row at a time using SQLFetch. The data from each row of the result set is used to generate the SELECT statement for the SQL Server database. To retrieve data from SQL Server, the application needs to allocate another statement handle, generate the query, and bind and fetch results. One report is printed for each row of data fetched.

After all data have been retrieved, the application frees the statement handles by calling SQLFreeStmt, disconnects from the data sources by calling SQLDisconnect, and cleans up after itself by calling SQLFreeConnect and SQLFreeEnv to free the connection and environment handles.

Because the application is not designed to be interoperable, it need not pursue any ODBC-specific test strategies, other than making certain it works properly with the SQL Server and dBASE drivers.

Scenario 2: Generic Mail Merge Utility

Description

ProseWare Corporation, a small, independent software vendor, wants to use ODBC to increase the data access capabilities of the mail-merge feature of their low-end, Windows™-based, word-processing program. In the current release, the program can perform mail merge only with data in text format. They hope that ODBC will allow for mail merge with information stored in any desktop database format.

Planning the application

After examining the ODBC Programmer's Reference, the developers at ProseWare believe that ODBC will meet their needs. They will design their application to connect with all desktop databases for which there exists a driver (such as Microsoft Access, text, Microsoft Excel, and dBASE). As mail merge does not use any complex or advanced SQL statements and does not make great demands of the ODBC API, this will not be a difficult goal to meet.

Here is how the application's mail-merge feature works from the end user's perspective: After indicating that s/he wants to create a mail-merge document, the user is prompted for the data source using a standard File/Open dialog box. (ProseWare recognizes that their users are familiar with this dialog box, and will find it easier to use than an ODBC data source connection prompt). The file extensions (filters) are listed in a drop-down combo box in the dialog box. The user selects the data source type by choosing the appropriate file filter, and then selecting the file to connect with. For data sources that contain more than one table per file (such as Microsoft Access), the user receives a second dialog box through which s/he indicates the correct table.

The user now begins entering text into her/his document. To insert a data field, s/he selects "Insert Data Field" from the menu. This brings up another dialog box with the list of fields to choose from. The user selects a field, and it is placed in the document. In addition to inserting data fields, users can also indicate simple selection criteria by choosing "Selection Criteria" from the menu. This invokes another dialog box for entering the criteria. When they are finished composing their document, inserting fields, and entering selection criteria, users can select "Print Merge" from the menu, and the document is printed once for each record that meets the criteria.

In terms of ODBC functionality required by the DBMS, the application needs to connect to the data source and retrieve information about the columns it contains so it can populate the selection list. If it is unable to get column information from the DBMS, the application displays an error message indicating to the user that it is unable to communicate with the file. After the user has finished inserting fields and entering selection criteria, the application needs to send a SQL SELECT statement to the DBMS and retrieve the results. Once the results are retrieved, the application can terminate the connection.

Coding and testing the application

The ODBC logic for this feature is more complex than that for Scenario 1, but it is still fairly straightforward. As described above, the application uses a File/Open dialog box to allow the user to select a data source. After the user has selected a data source, the application uses SQLAllocEnv, SQLAllocConnect, and SQLDriverConnect to connect with the driver. SQLDriverConnect is called with the option of SQL_DRIVER_PROMPT in case the user needs to indicate further information to access the data source.

The file selected by the user may represent a single table, as is the case for Xbase, or it may represent an entire database, as is the case for Microsoft Access. To find out how the data source treats files, the application needs to call SQLGetInfo with an fInfoType of SQL_FILE_USAGE. In cases where the data source uses files as databases, the application needs to call SQLTables to generate a list of tables for the user to select from. Once the table is known (either because the file itself is a table or because the user has selected one from the list), the application calls SQLColumns to populate the drop-down combo box that displays the names of the columns the user will place in her/his document.
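
A sketch of this sequence, assuming a connected hdbc and an allocated hstmt:

    UWORD  fFileUsage;
    SWORD  cb;
    char   szTable[129];
    SDWORD cbTable;

    /* Does the driver treat a file as a table (Xbase) or as a database (Access)? */
    SQLGetInfo(hdbc, SQL_FILE_USAGE, &fFileUsage, sizeof(fFileUsage), &cb);

    if (fFileUsage == SQL_FILE_QUALIFIER)
    {
        /* The file is a whole database: list its tables so the user can pick one. */
        SQLTables(hstmt, NULL, 0, NULL, 0, NULL, 0, (UCHAR *)"TABLE", SQL_NTS);
        SQLBindCol(hstmt, 3, SQL_C_CHAR, szTable, sizeof(szTable), &cbTable);
        while (SQLFetch(hstmt) == SQL_SUCCESS)
            ;   /* add szTable (column 3, TABLE_NAME) to the selection list */
        SQLFreeStmt(hstmt, SQL_CLOSE);
    }
    /* SQLColumns is then called the same way to populate the field list. */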

SQLAllocStmt is called to allocate a statement handle. The application uses a SQL expression builder to transform the user's selection criteria into a SQL SELECT statement written with ODBC syntax. It keeps track of the columns that have been chosen and places them in the SELECT statement. The selection criteria are placed in the WHERE clause. Once the SQL SELECT statement is built, it is sent to the DBMS via a SQLExecDirect call.

After the SQL statement has been executed, the application needs to fetch the results and display the retrieved data in the document. To simplify the processing of the results, all data returned from the DBMS is cast to strings. The application uses SQLBindCol to bind the columns and SQLFetch to retrieve the rows. It can free the statement handle when all data have been returned.

When the user quits the mail-merge feature, the application calls SQLDisconnect to terminate the connection, SQLFreeConnect to free the connection handle, and SQLFreeEnv to free the environment handle.

When debugging the application, the ProseWare developers should test their application against all available desktop-database drivers from all vendors, since their goal is to be interoperable with all of them.

Scenario 3: Report Writer for SQL Server and ORACLE

Description

An independent consultant wants to write a single report writer to sell to two different clients, one of whom has SQL Server and the other, ORACLE Server. The report writer will perform cross-tabulation and feature a "wizard" for helping users generate reports in a step-by-step manner.

Planning the application

This seems to the consultant to be a perfect use of ODBC technology: Rather than write the same application twice to two different APIs, s/he can write the application once to reach both data sources. Although the application uses more DBMS functionality than the first two scenarios, neither the SQL Server nor the ORACLE driver has any trouble providing the necessary level of functionality.

The application enables the user to generate reports from a single table. The user can either use a previously saved report or create a new one. To create a new report, the user begins by selecting a data source and providing any extra information needed to make a connection (for example, password). Next the user is asked which table s/he wants to generate the report from. If the data source uses a three-part table-naming convention, the user is also prompted to select from a list of available table qualifiers. In the case of SQL Server, which uses the database name as the table qualifier, the user is given a list of the available databases. In the case of ORACLE, the user simply selects a table. Once the table has been selected, the user is prompted to provide each of the following in turn:

After the user provides the application with all the information, the application builds and executes the SQL SELECT statement and generates the report.

In order to provide the user interface described here, the application needs to gather information about the data in the data source. Some of the things it needs to know are:

Coding and testing the application

Both drivers are Level 1 API conformant, and hence provide support for SQLColumns and SQLTables, two functions the application needs in order to provide the described functionality. The wizard begins by prompting the user for connection information. Connections are made using SQLDriverConnect with a value of SQL_DRIVER_PROMPT for the fDriverCompletion parameter. This provides users with the standard SQL Data Sources dialog box.

After connection the user needs to select a table from the data source. The application displays a dialog box that contains two lists. The first list is the available qualifiers; the second is the available tables corresponding to the selected qualifier. If qualifiers are not supported, the first list will be dimmed. To set up the dialog box, the application first calls SQLGetInfo with an fInfoType of SQL_QUALIFIER_TERM to find out the native term for a qualifier. If the function returns a valid string, as is the case with SQL Server, it will use that string to label the list of qualifiers. It then calls SQLGetInfo with an fInfoType of SQL_QUALIFIER_NAME_SEPARATOR to find out what character is used between the qualifier and the table name when referring to the table. The separator character is used later when qualifying the table in the FROM clause of the SELECT statement. For SQL Server, the character is "." (a period). Now the application calls SQLTables to generate the list of qualifiers. The "current" qualifier is set by default to the first item in the list. Finally, using the current qualifier if it exists or the empty string if it does not, the application calls SQLTables again to populate the list of available tables. Note that the application needs to call SQLTables whenever the user selects a new qualifier from the qualifier list in order to repopulate the table list.

As the wizard moves on, it needs to display the columns available in the selected table. These can be populated by calling SQLColumns. In addition to providing the names of the columns, SQLColumns also provides information on the ODBC SQL data type of the column. Data type information is used elsewhere by the application, such as when prompting the user for fields to aggregate: Only numeric fields are offered as choices.

The application will want to make available a set of scalar functions that are supported by the DBMS. To check if a particular scalar function is supported, the application calls SQLGetInfo using the fInfoType values for scalar functions (SQL_NUMERIC_FUNCTIONS, SQL_STRING_FUNCTIONS, SQL_TIMEDATE_FUNCTIONS, and so on). These functions will then be placed in a list from which the user can make a selection. When the user selects a scalar function, s/he is prompted for whatever function arguments are required by the ODBC SQL syntax.

For example, suppose the user wishes to select only those rows in which the postal code is "97405", and s/he knows that some of the data contains the full nine-digit postal code. The user can use the SUBSTRING function to extract the first five digits from the POSTAL_CODE field. After the user selects SUBSTRING from the list, the application prompts for the function arguments: the string, the starting position, and the length of the substring.

Once all the information for the report has been obtained from the user, the application builds a SELECT statement using the ODBC SQL syntax. The SELECT statement includes WHERE and GROUP BY clauses, plus the aggregated fields. Scalar functions are placed in escape clauses, where they can be easily found and parsed by the driver. (Note: Escape clause syntax is used because DBMSs vary in their use of scalar functions. See "The Hurdles to Interoperability" above for further discussion.) For the postal code example, the SUBSTRING portion of the SELECT statement will look like this:

{fn SUBSTRING(POSTAL_CODE, 1, 5)}

The SELECT statement is sent to the DBMS the same way it was for the Mail Merge application described in Scenario 2. The data is fetched from the DBMS, and the report generated on the screen. If the user is satisfied with the report, s/he can print it out; otherwise, s/he can regenerate it. Before exiting, the user will have the opportunity to save the report. Upon exiting, or starting a new report, the application closes the connection.

When testing the application, the developer needs to be certain that it works for both data sources for which it was written.

Scenario 4: Generic Query Application

Description

Winmark Enterprises is developing a generic query application that provides users with a means of constructing complex queries against multiple DBMSs. It includes the ability to scroll through result sets; to create new tables; and to add, update, and delete data in the data source.

Planning the application

Winmark's query application pushes ODBC to its limits. It attempts to take advantage of all DBMS-specific functionality, and provide some missing functionality itself. Here are some of the features it provides, and the issues it needs to address in providing them:

Coding and testing the application

Exploring the full logic of an application of this magnitude is beyond the scope of this paper. Instead, here are some of the development issues the programmers would need to address in writing this application.

Creating tables

To provide support for table creation, the application needs to know the data types that the DBMS supports, which can be obtained through SQLGetTypeInfo. This same function provides the application with other information it needs, including the parameters for the data types and the DBMS-specific data type names. (Note: There has been some confusion about the result set returned by SQLGetTypeInfo. The data type names to be used in the CREATE TABLE statement are in the TYPE_NAME column. The LOCAL_TYPE_NAME column is provided for labelling purposes only.) In order to enforce limits such as the maximum number of columns in a table, characters in a column name, and characters in a table name, the application needs to call SQLGetInfo. Other information the application may want to gather about the characteristics of the data source includes how qualifiers can be used (SQLGetInfo with an fInfoType of SQL_QUALIFIER_USAGE), how owner names can be used (SQLGetInfo with an fInfoType of SQL_OWNER_USAGE), and whether the data source supports non-nullable columns (SQLGetInfo with an fInfoType of SQL_NON_NULLABLE_COLUMNS).

Updating data

To allow for updates, the application needs access to multiple active statement handles per connection, one for reading and one for writing. It needs to query the DBMS for this capability (by calling SQLGetInfo with an fInfoType of SQL_ACTIVE_STATEMENTS). If multiple active statements are not supported, the application can open a second connection under the covers, using the connection information it received to make the first connection. Before allowing updates, the application can check if the data source is read-only (SQLGetInfo with an fInfoType of SQL_DATA_SOURCE_READ_ONLY).

Deleting data

Here is the logic for supporting a DELETE, taking advantage of transactions if they are available, and working around their absence if they are not. When the user attempts to delete a row of the database, the application attempts to build a WHERE clause for the DELETE statement that uniquely identifies the row. To do so, the application first calls SQLSpecialColumns. If SQLSpecialColumns indicates that the row can be uniquely identified, the application can build a WHERE clause using the data from the SQL_BEST_ROWID column(s) and submit a DELETE statement.

If the record cannot be uniquely identified, the application will try to set the connection option to Manual Commit by calling SQLSetConnectOption with an fOption of SQL_AUTOCOMMIT and a vParam of FALSE. If this is successful, the application issues a searched DELETE statement using a WHERE clause that contains all the columns in the row. It then uses SQLRowCount to determine how many rows were deleted. If more than one row was deleted, the row was not uniquely identified, and the user needs to be made aware of how many items will be deleted. The application displays a dialog box informing the user that n rows will be deleted, and prompts for continuation. If the user chooses to continue, the application issues the COMMIT manually. Otherwise, the transaction is rolled back.

If the application is unable to set the connection option to Manual Commit, the application can perform a SELECT COUNT(*) using the same WHERE clause as the DELETE statement. If the number of rows selected is not equal to 1, the row was not uniquely identified, and the application displays the same alert box as above. If the row is unique, or the user says it is OK to delete more than one, then the application issues the DELETE statement.
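
A sketch of the manual-commit branch of this logic, assuming the henv, hdbc, and hstmt from the earlier sketches; the table, the WHERE clause, and UserConfirmsDelete (a prompt routine) are hypothetical:

    SDWORD crow;

    if (SQLSetConnectOption(hdbc, SQL_AUTOCOMMIT, SQL_AUTOCOMMIT_OFF) == SQL_SUCCESS)
    {
        /* Issue the searched DELETE, then see how many rows it touched. */
        SQLExecDirect(hstmt,
            (UCHAR *)"DELETE FROM customers WHERE name = 'Smith' AND city = 'Eugene'",
            SQL_NTS);
        SQLRowCount(hstmt, &crow);

        if (crow == 1 || UserConfirmsDelete(crow))   /* hypothetical dialog */
            SQLTransact(henv, hdbc, SQL_COMMIT);
        else
            SQLTransact(henv, hdbc, SQL_ROLLBACK);

        SQLSetConnectOption(hdbc, SQL_AUTOCOMMIT, SQL_AUTOCOMMIT_ON);
    }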

Data consistency

Because this application will be running in a multi-user environment, it needs to address the data consistency issues that arise when two users access the same data simultaneously. For example, suppose User 1 and User 2 have both retrieved data from the CUSTOMERS table. It is quite possible that when User 2 attempts to UPDATE one of the records on her/his screen, User 1 has already changed or deleted that record, since there is a lag time between the time when User 2 issued the SELECT and when s/he issued the UPDATE. Some approaches to handling these kinds of problems include:

Each method has its own advantages and disadvantages. Note that setting the transaction isolation level to SQL_TXN_SERIALIZABLE guarantees that such anomalies do not occur. However, most data sources implement serializable transactions by a locking protocol, so choosing this option can cause a significant reduction in concurrency. The actual approach chosen by an application depends on the cursor model it uses, the transaction and locking capabilities of the data source, and other factors.

Scalar functions and stored procedures

The application supports any scalar functions available from the data source. As described in Scenario 3, it calls SQLGetInfo to find out which are available, if any. The application also provides access to stored procedures if they are supported by the DBMS. It can check to see if the DBMS supports stored procedures by calling SQLGetInfo with an fInfoType of SQL_PROCEDURES. If they are not supported, the procedures feature can be disabled (by dimming a menu item). If procedures are supported, the application uses SQLProcedures to get the list of those that are available. Some procedures that are returned in this list may not be available to the user; this case can be tested by calling SQLGetInfo with an fInfoType of SQL_ACCESSIBLE_PROCEDURES. Before using a procedure, the application calls SQLProcedureColumns to gather important column and parameter information.

Testing

Because this is a generic application that is designed to connect with any driver, the developers need to test the application with as many drivers as they can procure.

Summary

In discussing ODBC as a tool for heterogeneous data access, this paper has described the hurdles to interoperability and how ODBC addresses them, explained the ODBC programming paradigm of capability-dependent code, provided a step-by-step recipe for planning, coding, and testing interoperable applications, and presented four scenarios that illustrate how that recipe applies to different kinds of applications.

The key to writing interoperable applications is to make no assumptions about the functionality or behavior of the DBMS. ODBC encourages driver writers to provide Level 1 API conformance, but not all drivers meet that conformance level. Application developers must query the driver for the capabilities it supports, and respond accordingly based on the information they get back. Although ODBC does not eliminate all conditional code from the application, by shifting the focus from DBMS-dependent code to capability-dependent code, ODBC provides a cost-effective solution for application developers who need to reach multiple DBMSs with a minimum of programming effort.

© 1995 by Microsoft Corporation. All rights reserved. Publication of the Microsoft Developer Network. All trademarks are the property of their respective owners.

