Parallel Relational Database Management Systems -- I Gang Cheng, Marek Podgorny April 1995 Northeast Parallel Architectures Center at Syracuse University Abstract of Parallel Relational Database Management Systems -- I This presentation contains the first two sections, Parallel Database Technology in Commercial Applications and Industry and Parallel Database Technology and Theory, of the full CPS616 Parallel Database Module. The first section sets the scene by motivating the need for parallel databases, while the second section reviews both sequential and parallel relational databases, looking at explicit examples (nCUBE and SP2 with Oracle and DB2). We also discuss database system architectures and review the SQL query language. Outline of Full Database Presentation This presentation has five sections: Parallel Database Technology in Commercial Applications and Industry Parallel Database Technology and Theory Parallel Database Projects at NPAC Parallel Oracle7 RDBMS -- A Case Study Parallel Database Benchmarking Section I: Parallel Database Technology in Commercial Applications and Industry Motivations for Parallel Databases -- I: Overview of Parallel Database Appeal scalability system expandability alleviates migration problems business branches of different sizes can use the same technology power and price/performance ratio nCUBE + Oracle capable of >100 tps at ~$2000/tps (TPC-B benchmark; compare with 425 tps for a 4*VAX cluster at $16,500/tps) simultaneous OLTP and complex Decision Support queries (single enterprise image) High availability Integrability parallel Oracle RDBMS integrates as easily as a conventional RDBMS does not require re-investment in application software Industry standard database language -- SQL (Structured Query Language) Parallelization is carried out at the RDBMS server side Migration to a parallel RDBMS is painless (compare to scientific applications) Motivations for Parallel Databases -- II: Inadequacies with
Current Mainframe Solutions Conventional (mainframe) systems are unable to support time-critical queries simultaneous OLTP and decision support systems gigantic scientific and commercial databases complex queries (approximate matching, pattern recognition ...) sophisticated optimization as in market segmentation Motivations for Parallel Databases -- III: Commercial versus Scientific Applications In the commercial business world, MPP systems Can find more profitable applications than in the scientific world. Since SQL is intrinsically parallel, one needs only parallelize the database system and not the many SQL applications; in contrast, for scientific computing one needs both to produce a parallel compiler and to parallelize each application at the language and algorithm level Have better cost/performance than mainframe systems Are mature and highly available NOW Motivations for Parallel Databases -- IV: Market Demand from Competitiveness Most commercial corporations re-organize their businesses to be more efficient with a smaller workforce need more computing capacity to grow their business and faster response times to provide better services need better system scalability and price/performance Application Areas for Parallel Databases: commercial, administrative, scientific High transaction rate applications banks, airlines, telecommunications, securities brokers, retailers ... Real-time applications stock trading, C3I (military, civil defense), air traffic, process control (manufacturing, accelerators, reactors/generators), real-time insurance claim analysis ... Complex queries decision support (market analysis), molecular biology, chemistry/pharmacology ... Massive amounts of data EOS, HEP, weather, census, global environment models, oil exploration, medical history files, multimedia support ...
General Classes of Commercial Applications OLTP (On-Line Transaction Processing) --- consists of performing many small independent requests submitted by many clients simultaneously, each operating on a small part of the database Performance needed is linear in database size DSS (Decision Support System) Performance needed can be exponential in database size for sophisticated optimization or queries Data Warehouse --- operations mixing DSS and OLTP Data mining --- uncover/extract meaningful information from massive data sources An Application Example --- Intelligent Business Systems 1) Objectives understand customer buying habits and product preferences predict market tendencies make efficient, productive and profitable decisions must be based on answers to questions on ALL historical business transaction data (lots of it!) Intelligent Business Systems --- 2) Typical Questions "Give me a breakdown of all customers likely to default in the coming year" "Do three things: predict year-end demand, tell me which customers will fuel that demand, and tell me why" "Define a highly concentrated micromarket (make special offers to customer segments that meet very select criteria) within my database" "Create a model that really explains why some customers renew their subscriptions and others don't" "Suggest ways of regrouping our customers into new market segments (for Direct Marketing)" "Find some customer-preference pattern I might not be aware of" Intelligent Business Systems --- 3) Major Technology Challenges Lots of historical data (and it is growing daily!)
High Performance Computing System and DBMS Intelligent Data Analysis (Optimization) Not possible on a mainframe system in terms of both performance and capacity Intelligent Business Systems --- 4) Solutions MPP system Parallel Relational Database Management System Intelligent Data Mining Algorithms Expert Systems Fractal Compression Neural Networks Modelling Genetic Algorithms Major Software and Hardware Vendors in Parallel Database Technology Oracle (Oracle7 with parallel server and parallel query options) Sybase (Navigation Server on AT&T 3600, SUN SMPs, and IBM SP2 (planned), SQL Server 10) Informix (INFORMIX-OnLine Dynamic Server) IBM (Oracle7, DB2/6000, DB2/2, SP2, IBM 390 Parallel Sysplex) SGI (Oracle7 on SMP Challenge database server) Teradata with AT&T and NCR Tandem (Himalaya K10000) Cray Research (SuperServer) SUN (Oracle7, DB2 on SUN's SMPs) AT&T and NCR TMC (CM5, Darwin, Oracle7, Decision/SQL, Parasort) KSR (Query Decomposer, Oracle7) Amdahl (Oracle7) nCUBE (Oracle7, nCUBE2) data-CACHE Corp (SQL dataMANAGER) Data General Corp (Oracle7, DSO) DEC (POLYCENTER Manager) HP (Oracle7, IMAGE/SQL, HP 3000) Encore Computer Corp (Oracle7 on Infinity 90) Some Current Major Commercial Users Kmart American Express AAA Airline Citibank Prudential Securities Bell Atlantic Corp BellSouth BT (British Telecom) Criminal Investigative Technology Inc. Check http://greatwall.npac.syr.edu:1963 for more info about parallel database technology in industry Parallel Database Technology and Theory Hardware Architectures for Parallel DBMS -- Generic System A typical hardware setup is: Hardware Architectures and Forms of Parallelism Three (hardware) architectures for parallel DBMS shared-memory (also called Symmetric Multiprocessing, SMP) all processors share direct access to a common global memory and to all disks. example: Oracle7 on SGI Challenge shared-disk each processor has a private memory but has direct access to all disks.
example: Oracle7 on IBM SP2 shared-nothing each memory and disk is owned by some processor that acts as a server for that data. Mass storage is distributed among processors by connecting one or more disks to each processor. example: DB2 on SP2 This is the "owner computes" rule as in much scientific computing (the HPF default implies that the processor where a variable is "stored" performs the computations setting that variable's value) Notes on the Shared-Nothing Architecture On a shared-nothing architecture, one or more disks are connected to each processor, depending on the machine's (parallel) I/O architecture, via direct connection or some I/O channels. Shared-nothing does not focus on the physical I/O connection architecture; rather, it refers to how the data is partitioned on the disk arrays and how the data is moved into the processor buffers for parallel query processing. In this architecture, data location determines where the data will be processed and how data is shared by processors other than its local processor (i.e. the processor which has the local disk holding the data if directly connected, as with DB2 on SP2, or the shortest path from disks to the process for other I/O architectures, as with Oracle on nCUBE). Unlike shared-disk, where data may be shared via the interconnection network when it is first read from a disk to a remote processor (buffer), if the data local to processor A is required by a remote processor B to perform some parallel query processing, A sends the data to B via the communication network. That is, the only shared medium in shared-nothing is the communication network. In most cases, data placement determines how and where (even when) parallel query processing is decomposed or partitioned. Data (or I/O) load balance determines the CPU load balance in the system. This is built into the query decomposer. e.g. suppose processor A gets 10% of the data local to its disk, while B gets the remaining 90%.
Then in a shared-nothing system, A spends 10% of the CPU time and is then idle, while B takes most of the CPU time. On the other hand, in a shared-disk system, A and B will process similar amounts of data, with some overhead from sending data from B to A. Shared-Data Architecture Combining features of the above three models, data sharing technology directly interfaces with the individual processors to enable each processor to cache shared data in its own local memory with full read/write capability (as in the shared-memory model). Data sharing maintains data integrity by ensuring cache coherency across all processors and providing efficient cross-system locking mechanisms (as in the shared-nothing model). All disk devices are accessible to each processor so that a workload may be scheduled to run on any available processor (as in the shared-disk model), instead of being run only on the processors with data affinity (as in the shared-nothing model). Example of a shared-data system: DB2 on IBM System/390 Parallel Sysplex. Note on the Oracle nCUBE2 Hybrid Architecture It is difficult to characterise Oracle7 on nCUBE2, as it mixes features of both the shared-disk and shared-nothing architectures by using an additional subcube as a giga-cache which lies between the nCUBE I/O system (including I/O nodes, multiple I/O channels and disk drives) and the compute-node subcube(s). Before data is read into the buffer on a compute processor, it is first cached in a Giga-cache node. So if you look just at the compute-subcube <--> Giga-cache subcube, it looks like a shared-nothing system, but if you look at the whole compute-subcube <--> Giga-cache subcube <--> disk-arrays, it looks like a shared-disk system. Strictly speaking, and compared to DB2 on SP2, it is not a shared-nothing system, as the data placement on the disk arrays has little to do with how query processing is parallelized or decomposed.
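The 10%/90% example above can be made concrete with a small sketch (illustrative Python, not vendor code; node names and figures are just the A/B example restated): under shared-nothing, each node scans only the rows on its own disks, so a skewed placement translates directly into a skewed CPU load.

```python
# Sketch: data placement determines CPU load on a shared-nothing system.
# The 10%/90% split mirrors the A/B example above; all figures are illustrative.

def place_rows(rows, fraction_on_a):
    """Split rows between two nodes according to a placement fraction."""
    cut = int(len(rows) * fraction_on_a)
    return {"A": rows[:cut], "B": rows[cut:]}

def shared_nothing_scan(placement):
    """Each node scans only its local rows, so its work = local row count."""
    return {node: len(local) for node, local in placement.items()}

rows = list(range(1000))
skewed = place_rows(rows, 0.10)   # A holds 10% of the data, B holds 90%
work = shared_nothing_scan(skewed)
print(work)                       # A finishes early and sits idle while B works
```

On a shared-disk system, by contrast, the scheduler could hand either node any row, so the same skewed placement need not skew the CPU load.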
This is also the reason why we found that data partition schemes have less I/O performance impact for Oracle7 on nCUBE than on SP2 (the latter being a shared-nothing system). Topological View of MPP Machines (with "Scalability" in mind) Shared-Memory and Shared-Disk interference limits scalability complicated and relatively expensive Shared-Nothing minimal interference between processors moves only questions and answers over the network can scale up to hundreds (maybe thousands) of processors composed of cheap, stock parts and a fast interconnecting network achieves near-linear improvements good for system scalability and portability Shared-Data little experience so far Dataflow Perspective Pipelined: "assembly line" the average query does not have that many steps some operational stages cannot be pipelined (e.g. sort) one operator takes longer than another (a kind of skew) This is task parallelism in scientific computing Partitioned: "divide and conquer" fairly straightforward to implement "tough" operations are divided, and the parts run in parallel This is classic data parallelism in scientific computing Combination: "best of both worlds" several partitions running in parallel each partition is, where possible, a short pipeline Parallelisms in Parallel Database Systems pipelined parallelism (inter-operator parallelism) query parallelism inter-query parallelism intra-operator parallelism partitioned (declustered) parallelism (I/O parallelism) transaction parallelism Data Partitioning --- How to divide data among multiple disks?
a common practice on large numbers of disks and mass storage systems (file striping) high I/O bandwidth from multiple disks and parallel I/O no need for specialized hardware important for query load balance on a shared-nothing system this is of course a central problem in scientific computing, and much of High Performance Fortran is devoted to data partitioning Basic Data Partitioning Schemes Major Approaches in Data Partitioning Round Robin (cyclic in HPF/scientific computing notation) data scattered cannot locate specific records Hashing (scattered decomposition in HPF style) scatters the data can locate specific records danger of data skew Range Partitioning (block in HPF style) data not scattered can locate specific records danger of data skew "related" data can be clustered Danger of Data Skew Arises on Shared-Nothing with Data Partitioning Pitfalls in data partitioning operation/application dependent danger of data skew arises on shared-nothing with data partitioning data skew is the focus of a number of research projects automatic OS file striping vs manual (application-specific) table striping Performance Metrics in Parallel Database Systems speedup (degree of parallelism) (fixed problem size) scaleup (scalability) -- transaction scaleup, batch scaleup, processor scaleup, I/O scaleup (problem size increases) sizeup (both) (scaled speedup in scientific computing increases problem and computer size by the same factor) Performance barriers startup (latency) interference (communication, locking) load balance (execution skew - all the execution occurs in one partition) data partition (data skew - all the data is placed in one partition) imbalance of system resources (CPUs and I/O channels) CPU-intensive or I/O-intensive queries Some Basic Terminology for the Relational Database Model Data structure relations (files, tables) tuples (records, rows) attributes (fields, columns) Relation operators scan (select-project) (a relation, a predicate, and an attribute list)
sort (reorder) aggregate operators (SUM, AVG, MAX, MIN, ...) insert/delete/update set operators (union, intersection, difference) join, merge and division embedded operators uniformity of the data and operators is the source of the data-flow execution model Examples of Typical Relational Operations Join operation: a SELECT operation that combines rows from two or more tables. Each returned row contains data from more than one table SQL: SELECT A.A1, A.A3, B.B1, B.B2 FROM A, B WHERE A.A1 = B.B1; Overview of Structured Query Language (SQL) a database language specifies the semantics of the various components of a DBMS: structures and operation of a data model, data definition, data access, security, programming language interface, and data administration Industry accepted standard, first introduced by ANSI in 1986; current SQL92 by ANSI and ISO; a new standard, SQL3, with enhancements in object-oriented data management, is under development. Portable to all RDBMS systems. built on the relational model, simple and powerful non-procedural 4GL language, only specifies "what-to-do", not "how-to-do"; extended to an object-oriented programming interface this extended model competes with fledgling object-oriented databases in industry Features of Structured Query Language (SQL) applications in SQL do not need to be rewritten to run on a parallel server (all parallelization of SQL programs is carried out by the server), compared to scientific areas where both parallel application algorithms and a parallel compiler are required data-independent --- two-phase database schema (logical and physical database) SQL in Oracle7 create tables in the database store information in tables select exactly the information you need from your database make changes to your data and the structure of underlying tables, and combine and calculate data to generate the information you need Major RDBMS Functionality Data Access --- SQL, Transactions, PL/SQL, Data Integrity Data Concurrency and Consistency --- Concurrency, Locking Data Security ---
Users and Schemas, Roles and Privileges, Profiles and Auditing Database Backup and Recovery --- Redo log, Rolling Forward/Back Distributed Processing --- Client-Server environment What is a 3GL or 4GL? Historians of programming languages and database people tend to classify programming languages by generations: 1GL --- machine language (in 0/1 binary) 2GL --- assembly language (in symbolic machine language) 3GL --- high-level programming language (structural, modular and usually procedural) 4GL --- non-procedural (declarative). Most AI languages are 4GLs. What is PL/SQL -- I? PL/SQL illustrates the low-level data access tools/languages that are available in most RDBMS. Basically, there are three levels of database access interfaces: 1. Highest level --- SQL Non-procedural language; only specify what-to-do, not how-to-do. A so-called fourth-generation language (4GL). No control structures (loop, if), no module (block) structure; very expressive and declarative. One other language example in this category is Prolog. What is PL/SQL -- II? 2. Middle level --- PL/SQL fills the gap between the highest-level SQL and lowest-level C programming. It is basically a procedural implementation of SQL access. Pure SQL is a subset of PL/SQL, where more C-like constructs are introduced (such as block structure, control statements, variables, modules, interface specifications, functions, etc.). But it does not have the rich data types and low-level operators of C, such as structure (record), enumeration, array, pointer (reference), and set datatypes, bitwise operators, and I/O and file handling. What is PL/SQL -- III? 3. Embedded SQL in C (or f77, Cobol) --- programming language interfaces Procedural high-level programming languages (often called 3GLs). All the capability and flexibility of C-level programming as a host language to embed the SQL statements as the data access for querying the database engine. Powerful in programming but with complex semantics and an inconvenient interface to SQL.
Usually you need to know many programming/library details to interface the two in programming. Good for pre-/post-processing around the query but bad for query programming. What is Data Integrity? A way to enforce error checking and formatting of data types. Usually the rules for data integrity are created when a table is specified. Examples: Null -- a data field may or may not be allowed to be Null. Unique -- a data field may or may not be required to be unique among all the other rows in a table Primary key -- a data field being a primary key must be unique Referential integrity -- inserting or updating a field's value in table A must depend on whether it matches the value of a field in another table B. Data integrity is the major way to enforce data consistency in the database. What are Schemas? Schemas are used to define the scope of different data objects to enforce data security and naming in a DBMS. A schema in a DB is a logical hierarchy of data storage structures. Major schema objects are: users, tablespaces, tables, views, indexes, clusters, synonyms, columns. Access to different schemas requires different roles and privileges for different users. What are Roles? A way to classify levels of database users. Just as sysadm usually has superuser privileges, roles are used to group users with different privileges. Because a DBMS system has many levels of operations which set many levels (kinds) of operation privileges, a role is another logical level of grouping to define/specify who will have what system privileges. It makes system management easier and more flexible. What are Profiles and Auditing? Profiles and auditing are the way to monitor (system logging) and record user database actions. They are used for: investigating suspicious activity and gathering data about specific activity for performance tuning and system statistics. These are just a DB way of saying logging.
As in other cases, the database field sometimes uses its own terms for the same thing in other fields. What is the Two-Phase Database Schema? For an RDBMS there are two levels of abstraction of how data is stored and represented in the system: 1) The logical database consists of the conceptual view of a DB system, which is described by an ER (entity-relationship) model, defined and accessed by SQL, and represented by tables, columns, views and other abstractions of data objects; 2) The physical database consists of the actual physical view of the DB, which is represented by files, blocks, indexes, clusters, partitions, etc. End-users and developers only need to deal with the logical level, while the DBMS and a DBA (database administrator) define and perform the mapping between the two levels. This is the reason why SQL achieves portability (compared to f77), from the viewpoint of a data-independent model. Example: Relational Joins General Structure of Parallel and Sequential Relational Joins Parallel Algorithm for Relational Joins Parallel Database Software Architecture Distributed Lock Manager Implementation of parallel cache management Supports transaction parallelism of multiple OLTP A simple and effective approach to port a sequential RDBMS to MPP and loosely coupled systems with a shared-nothing architecture supports parallel loading, parallel indexing, parallel insert/update and parallel recovery Major functionality keeps track of the current ownership of a resource accepts requests for resources from application processes notifies the requesting process when a resource is available grants exclusive access to a resource to a requesting process Note this can work fine in OLTP with many uncorrelated queries, but will not work in scientific computing where ALL updates are correlated and reserving a resource introduces a sequential bottleneck Parallel Database Software Architecture Parallel Query Optimizer The Parallel Query Optimizer builds a query execution plan employing all system resources on the
basis of data-partition knowledge; it can easily employ a few tens of processors to execute a single query with a high degree of parallelism. Both data-parallel and data-flow paradigms are used to achieve this goal Supports complex CPU-intensive or I/O-intensive queries for DSS near-linear or even super-linear speedup can be achieved (due to larger effective caches) Major focus for research --- optimized parallel algorithms needed carried out by a query coordinator on the RDBMS server (the query coordinator generates the optimal parallel plan rather than having the best serial plan simply executed in parallel). This optimization is performed by the coordinator without the need for any external information (DB2 and Oracle7 use different approaches) most relational operators are parallelizable, including full-table scan, sort, aggregate operators, join, merge System/server tuning plays an important role in achieving optimized performance in a parallel RDBMS --- benchmarking is an effective means of performing tuning
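The partitioned ("divide and conquer") join discussed above can be sketched in a few lines. This is a minimal single-process sketch in Python, not any vendor's implementation: the table and column names echo the earlier SELECT A.A1, A.A3 ... example, and a sequential loop stands in for the parallel workers. Both tables are hash-partitioned on the join key, so matching rows always land in the same partition, and each partition can then be joined independently on its own processor.

```python
# Sketch of a hash-partitioned parallel join (illustrative, single-process).
# Tables A and B and columns A1/A3/B1/B2 follow the SQL example in the text.

def hash_partition(rows, key, n):
    """Scatter rows into n partitions by hashing the join-key column."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts

def local_join(a_part, b_part, a_key, b_key):
    """Join one partition pair: build a hash table on A, probe with B."""
    table = {}
    for a in a_part:
        table.setdefault(a[a_key], []).append(a)
    return [(a, b) for b in b_part for a in table.get(b[b_key], [])]

def partitioned_join(a_rows, b_rows, a_key, b_key, n_workers=4):
    a_parts = hash_partition(a_rows, a_key, n_workers)
    b_parts = hash_partition(b_rows, b_key, n_workers)
    matches = []
    # Each partition pair is independent; a real system runs them on
    # separate processors instead of this sequential loop.
    for ap, bp in zip(a_parts, b_parts):
        matches.extend(local_join(ap, bp, a_key, b_key))
    return matches

A = [{"A1": i, "A3": i * 10} for i in range(5)]          # A1 in 0..4
B = [{"B1": i, "B2": chr(65 + i)} for i in range(3, 8)]  # B1 in 3..7
result = partitioned_join(A, B, "A1", "B1")              # matches at A1 = 3, 4
```

Because the same hash function routes equal keys to the same partition, no partition ever needs rows from another one; skewed key distributions, however, produce exactly the data-skew load imbalance described in the partitioning slides.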