Replied: Fri, 13 Sep 2002 16:46:41 -0500
From: William Johnston <wejohnston@lbl.gov>
Organization: DOE Lawrence Berkeley National Lab / NASA Ames
To: Geoffrey Fox
Date: Fri, 13 Sep 2002 13:39:08 -0700
Subject: a database system that stores and manages XML data in its native form

Geoff;

FYI: http://www.sleepycat.com/xml/

Bill

Coming Soon: Berkeley DB XML
Sleepycat Software, Inc.

Berkeley DB XML

Sleepycat Software is currently developing a new product, built on top of the Berkeley DB database engine. The new product, Berkeley DB XML, is a database system that stores and manages XML data in its native form.

Berkeley DB XML is not yet available. However, design and implementation are well underway, and Sleepycat is soliciting interested parties to review plans and specifications and to participate in an early access program in the second half of 2002.

What is Berkeley DB XML?

Berkeley DB XML is a programmatic toolkit that specializes in the storage and retrieval of XML documents. Documents are stored in collections and queried using XPath.
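
The collection-and-XPath model described above can be illustrated with a small Python sketch. It is not the Berkeley DB XML API, which this announcement does not show. The `Collection` class and its `put` and `query` methods are hypothetical names. The query uses the limited XPath subset in Python's standard `xml.etree.ElementTree` module rather than full XPath 1.0.

```python
import xml.etree.ElementTree as ET


class Collection:
    """Hypothetical in-memory stand-in for a named document collection."""

    def __init__(self, name):
        self.name = name
        self._docs = {}  # document name -> raw XML text

    def put(self, doc_name, xml_text):
        ET.fromstring(xml_text)  # raises ParseError if not well-formed
        self._docs[doc_name] = xml_text

    def query(self, path):
        """Return the sorted names of documents with at least one match."""
        hits = []
        for doc_name, xml_text in self._docs.items():
            root = ET.fromstring(xml_text)
            if root.findall(path):
                hits.append(doc_name)
        return sorted(hits)


orders = Collection("orders")
orders.put("o1", "<order><item><sku>A1</sku></item></order>")
orders.put("o2", "<order><item><sku>B2</sku></item></order>")

# Documents containing an item whose sku child has the text 'A1'.
matches = orders.query("./item[sku='A1']")
```

This sketch scans every document on every query. Avoiding that scan is the role the announcement assigns to the XML Indexer.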

The key components of the system are:

the XML Storage Manager, which writes native XML data to Berkeley DB for storage;
the XPath Query Processor, which uses the XPath 1.0 specification to parse, plan, and optimize XPath queries, and which searches the repository for matching documents; and
the XML Indexer, which provides a number of XML indexing strategies to support efficient expression evaluation.

Berkeley DB XML is built on top of Berkeley DB, which provides fast, reliable, and scalable database support for mission-critical applications.

Distribution

Like the underlying Berkeley DB engine, Berkeley DB XML will be available as a source code distribution. The software will be distributed under an open source license.

Languages and Platforms

Berkeley DB XML provides both C++ and Java APIs. The source code is designed to be portable to a wide variety of hardware platforms and operating systems. The initial release will be tested on Windows NT/2000/XP, Linux, and Solaris 2.8.

Product Details

Embedded

The toolkit is provided as a library that is linked into the user's application. Because queries run inside the application itself, there is no communication overhead between processes or systems.

Document Storage

Documents are stored in collections. A single application may operate on many collections at the same time, and may combine data from different collections easily. Non-XML data may be included by creating ordinary Berkeley DB tables. Tables and collections may be used together, with full support for Berkeley DB transactions and recovery services, by multiple users at the same time.

Native Storage

Documents are returned to the user exactly as they were stored, including any extraneous white space. Some products decompose the document on storage and reconstitute it on retrieval, and can introduce subtle differences between what goes in and what comes out. Because Berkeley DB XML stores documents natively, this problem cannot arise.
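
The difference between native storage and decompose-and-reconstitute storage can be seen in a short Python sketch. It uses the standard `xml.etree.ElementTree` module as a stand-in for a product that parses on storage and serializes on retrieval. The dictionary `store` is a hypothetical placeholder for the real storage layer.

```python
import xml.etree.ElementTree as ET

# A document with an XML declaration, single-quoted attributes, and extra spaces.
original = "<?xml version='1.0'?>\n<note  id = '7' >\n  <to>Geoff</to>\n</note>\n"

# Native-style storage: keep the text exactly as received.
store = {}
store["note-7"] = original
returned = store["note-7"]

# Decompose-and-reconstitute storage: parse to a tree, then serialize again.
rebuilt = ET.tostring(ET.fromstring(original), encoding="unicode")

round_trip_exact = returned == original  # native copy is identical
rebuilt_differs = rebuilt != original    # declaration and spacing were lost
```

The rebuilt document is still equivalent XML, but it is no longer the same sequence of characters. That matters to applications that sign, hash, or diff their documents.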

Indexing

Individual collections may be indexed differently, to take advantage of different query patterns or to exploit the schema common to each collection. Each collection may be indexed in more than one way. Different indexing schemes support different XPath queries efficiently.
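
The following Python sketch shows one way a collection might carry more than one index, each serving a different query pattern. It is an assumption for illustration only: the announcement does not describe the actual indexing strategies, and `build_value_index` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

docs = {
    "p1": "<person><name>Ada</name><city>London</city></person>",
    "p2": "<person><name>Bill</name><city>Berkeley</city></person>",
    "p3": "<person><name>Keith</name><city>Berkeley</city></person>",
}


def build_value_index(documents, tag):
    """Map each text value of `tag` to the set of documents containing it."""
    index = defaultdict(set)
    for doc_name, xml_text in documents.items():
        for element in ET.fromstring(xml_text).iter(tag):
            index[element.text].add(doc_name)
    return index


# Two indexes over the same collection, each suited to a different query.
by_city = build_value_index(docs, "city")
by_name = build_value_index(docs, "name")

# Answers a query such as /person[city='Berkeley'] without scanning documents.
berkeley_docs = sorted(by_city["Berkeley"])
```

A query on `city` is served by one dictionary lookup, while a query on an unindexed element would still require a scan. This is why the choice of index depends on the expected query pattern.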

Query Processing

The Query Processor implements XPath 1.0. A cost-based query optimizer considers the indices that exist, the data volume that a query is likely to produce, and the cost of computation and disk I/O to select a query plan with the lowest run-time cost. Results may be computed eagerly or lazily. Lazy evaluation makes it easy to stream results to the user.
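
The distinction between eager and lazy evaluation can be sketched in Python with a generator. The function names and the `parsed` bookkeeping list are hypothetical, and the sketch makes no claim about how the Query Processor is actually implemented.

```python
import xml.etree.ElementTree as ET
from itertools import islice

docs = {f"d{i}": f"<reading><value>{i}</value></reading>" for i in range(1000)}

parsed = []  # records which documents were actually examined


def lazy_query(documents, path):
    """Yield matching document names one at a time, parsing on demand."""
    for doc_name, xml_text in documents.items():
        parsed.append(doc_name)
        if ET.fromstring(xml_text).findall(path):
            yield doc_name


def eager_query(documents, path):
    """Compute the complete result list before returning."""
    return list(lazy_query(documents, path))


# Lazy: the caller takes three results and no further work is done.
first_three = list(islice(lazy_query(docs, "./value"), 3))
examined_lazily = len(parsed)

# Eager: every document is examined before the caller sees anything.
parsed.clear()
all_hits = eager_query(docs, "./value")
examined_eagerly = len(parsed)
```

In this sketch the lazy query examines only as many documents as the caller consumes, so results can be streamed as they are found.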

Unicode UTF-8

Berkeley DB XML accepts UTF-8 encoded XML documents and XPath expressions.
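
As a small illustration of what UTF-8 input means in practice, the Python sketch below parses a UTF-8 encoded document and matches it with a path expression that contains a non-ASCII character. It uses the standard `xml.etree.ElementTree` module and its XPath subset as a stand-in, not Berkeley DB XML.

```python
import xml.etree.ElementTree as ET

# The document arrives as UTF-8 encoded bytes.
document_bytes = "<cities><city><name>Zürich</name></city></cities>".encode("utf-8")

# The path expression also arrives as UTF-8 and is decoded before use.
path_bytes = "./city[name='Zürich']".encode("utf-8")

root = ET.fromstring(document_bytes)  # no encoding declared, so UTF-8 is assumed
hits = root.findall(path_bytes.decode("utf-8"))
names = [city.find("name").text for city in hits]
```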

Threading

Berkeley DB XML is entirely thread-safe. As the library itself does not mandate the use of any particular threads package, you can use the one you like best or the one most natural to your application. You can build applications that are single-threaded or multi-threaded, as your application requires.
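
To make the thread-safety claim concrete, here is a Python sketch of several threads writing to one shared collection. The `SharedCollection` class is hypothetical. Its explicit lock stands in for whatever internal synchronization the library uses, which the announcement does not describe.

```python
import threading
import xml.etree.ElementTree as ET


class SharedCollection:
    """Hypothetical collection whose methods may be called from any thread."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def put(self, doc_name, xml_text):
        ET.fromstring(xml_text)  # check well-formedness outside the lock
        with self._lock:
            self._docs[doc_name] = xml_text

    def count(self):
        with self._lock:
            return len(self._docs)


collection = SharedCollection()


def writer(worker_id):
    # Each worker stores 100 documents under names unique to that worker.
    for i in range(100):
        collection.put(f"w{worker_id}-{i}", f"<doc><n>{i}</n></doc>")


threads = [threading.Thread(target=writer, args=(w,)) for w in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = collection.count()
```

With a thread-safe library, the application code would contain only the worker threads. The locking shown inside `SharedCollection` would be the library's responsibility.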

Berkeley DB XML is implemented as a layer above Berkeley DB. The user can choose which version of Berkeley DB is most suitable for a particular application: Data Store, Concurrent Data Store, Transactional Data Store, or High Availability.

Standards

The design and implementation of Berkeley DB XML are based on W3C standards work on XML, XML Namespaces, and XPath 1.0.

Join the Mailing List

Sleepycat maintains a mailing list of people interested in Berkeley DB XML. The list is a forum for discussion of technical issues. Membership on the list is open to the public, but requests to join must be approved by the XML project manager at Sleepycat. Only list members may post messages to the list.

To join the mailing list, send an email message to xml-request@sleepycat.com with the text

subscribe

in the message body. Any text in the message subject line will be ignored.