e-Science 2008 4th IEEE International Conference on e-Science

Workshops & Special Sessions

eScience for cheminformatics and drug discovery

SQMD: Architecture for Scalable, Distributed Database System built on Virtual Private Servers


Presenters and Authors
  • Kangseok Kim
  • Rajarshi Guha
  • Marlon Pierce
Abstract

Many scientific fields routinely generate huge datasets. In many cases, these datasets are not static but grow rapidly in size. Handling such datasets and supporting sophisticated queries over them requires scalable distributed database systems in which scientists can search the data efficiently. In this paper we present the architecture, implementation, and performance analysis of a scalable, distributed database system built on software-based virtualization environments. The system architecture makes use of software partitioning of the database based on data clustering, the SQMD (Single Query Multiple Database) mechanism, a web service interface, and virtualization software technologies. The system allows uniform access to concurrently distributed databases, using the SQMD mechanism based on the publish/subscribe paradigm. We highlight the scalability of our architecture by applying it to a database of 17 million chemical structures. In addition to simple identifier-based retrieval, we present performance results for shape similarity queries, which are extremely time-intensive with traditional architectures.
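
As an illustration only, the SQMD idea described in the abstract (one query published once, executed by several database partitions, results merged) might be sketched roughly as below. This is not the authors' implementation: the `QueryTopic` class, the `make_partition` helper, the `compounds` table, and the in-memory SQLite partitions are all hypothetical stand-ins, the in-process topic replaces a real publish/subscribe broker, and the rows are split arbitrarily rather than by data clustering.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


class QueryTopic:
    """Toy in-process publish/subscribe topic (hypothetical, not the paper's broker)."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        # Register one database partition's query handler.
        self._subscribers.append(handler)

    def publish(self, sql, params=()):
        # Deliver the same query to every subscriber concurrently,
        # then merge the partial result sets into a single list.
        workers = max(1, len(self._subscribers))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            partials = pool.map(lambda handler: handler(sql, params), self._subscribers)
            return [row for partial in partials for row in partial]


def make_partition(rows):
    """Create one in-memory partition (assumed schema) and return its query handler."""
    # check_same_thread=False lets a worker thread use this connection.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE compounds (cid INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO compounds VALUES (?, ?)", rows)
    conn.commit()

    def handle(sql, params):
        return conn.execute(sql, params).fetchall()

    return handle


if __name__ == "__main__":
    topic = QueryTopic()
    topic.subscribe(make_partition([(1, "aspirin"), (2, "caffeine")]))
    topic.subscribe(make_partition([(3, "ibuprofen")]))
    # A single query is answered by all partitions.
    print(topic.publish("SELECT cid, name FROM compounds WHERE cid >= ?", (2,)))
```

In this sketch the client sees one `publish` call regardless of how many partitions exist, which is the uniform-access property the abstract attributes to SQMD.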

Date and Time

Friday, December 12, 10–10:30 a.m.
