NPAC Spring 1995 Seminar Series:
Gang Cheng, Marek Podgorny
NPAC, Syracuse University
February 3, SCI/TECH 1-218, 1pm

ABSTRACT

An overview of NPAC's parallel database hardware and software infrastructure and of our current activities in parallel database technology is presented. Our experience with Decision Support System (DSS) benchmarking is highlighted, followed by a technical discussion of the parallel Oracle 7 software on the nCUBE2 and IBM SP2. Issues in benchmarking DSS workloads using a parallel Relational Database Management System (RDBMS) are also presented.
. Conventional (mainframe) systems are unable to support
  . time-critical queries
  . simultaneous OLTP and decision support
  . gigantic scientific and commercial databases
  . complex queries (approximate matching, pattern recognition, ...)
. In the commercial business world, MPP systems
  . are mature and highly available NOW ("prime time")
  . have better cost/performance than mainframe systems
  . can find more profitable applications than in the scientific world
. Market demand
  . most commercial corporations are re-engineering their businesses to be more efficient with a smaller workforce
  . need better system scalability and price/performance
  . need more computing capacity to grow the business, and fast response to provide better services
. Appeal of parallel database technology
  . Scalability
    . system expandability alleviates migration problems
    . business branches of different sizes can use the same technology
  . Power and price/performance
    . simultaneous OLTP and complex decision support queries (single enterprise image)
    . nCUBE + Oracle capable of >100 tps at ~$2,000/tps (TPC-B benchmark; compare with 425 tps at $16,500/tps for a 4*VAX cluster)
  . High availability
  . Integrability
    . a parallel Oracle RDBMS integrates as easily as a conventional RDBMS
    . does not require re-investment in application software
    . industry-standard database language: SQL (Structured Query Language)
    . parallelization is carried out on the RDBMS server side
    . migration to a parallel RDBMS is painless (compare with scientific applications)
. Massive amounts of data
  . EOS, HEP, weather, census, global environment models, oil exploration, medical history files, multimedia support, ...
. OLTP (On-Line Transaction Processing) --- performing many small independent requests, submitted by many clients simultaneously and operating on a shared database (parallel server)
. DSS (Decision Support System) ---
. Data Warehouse --- operations mixing DSS and OLTP
. Data mining --- uncover/extract meaningful information from mass data sources

Application areas: commercial, administrative, scientific
. High transaction rate applications
  . banks, airlines, telecommunications, security brokers, retailers, ...
. Real-time applications
  . stock trading, C3I (military, civil defense), air traffic, process control (manufacturing, accelerators, reactors/generators), real-time insurance claim analysis, ...
. Complex queries
  . decision support (market analysis), molecular biology, chemistry/pharmacology, ...
Objectives:
. understand customer buying habits and product preferences
. predict market tendencies
. make efficient, productive, and profitable decisions
These must be based on answers to questions over ALL historical business transaction data (lots of it!)

Intelligent Business Systems
. Typical questions:
  . "Give me a breakdown of all customers likely to default in the coming year"
  . "Do three things: predict year-end demand, tell me which customers will fuel that demand, and tell me why"
  . "Define a highly concentrated micromarket (make special offers to customer segments that meet very select criteria) within my database"
  . "Create a model that really explains why some customers renew their subscriptions and others don't"
  . "Suggest ways of regrouping our customers into new market segments (for direct marketing)"
  . "Find some customer-preference pattern I might not be aware of"
. Major technology challenges in this application
  . lots of historical data (and it is growing daily!)
  . high-performance computing system and DBMS
  . intelligent data analysis (optimization)
  . not possible on a mainframe system, in terms of both performance and capacity
. Solutions:
  . MPP system
  . parallel Relational Database Management System
  . intelligent algorithms
. Oracle (Oracle7 with parallel server and parallel query options)
. Sybase (Navigation Server on AT&T 3600, SUN SMPs, and IBM SP2 (planned), SQL Server 10)
. Informix (INFORMIX-OnLine Dynamic Server)
. IBM (Oracle7, DB2/6000, DB2/2, SP2, IBM 390 Parallel Sysplex)
. SGI (Oracle7 on SMP Challenge database server)
. Teradata with AT&T and NCR
. Tandem (Himalaya K10000)
. Cray Research (SuperServer)
. SUN (Oracle7, DB2 on SUN's SMPs)
. AT&T and NCR
. TMC (CM5, Darwin, Oracle7, Decision/SQL, Parasort)
. KSR (Query Decomposer, Oracle7)
. Amdahl (Oracle7)
. nCUBE (Oracle7, nCUBE2)
. data-CACHE Corp (SQL dataMANAGER)
. Data General Corp (Oracle7, DSO)
. DEC (POLYCENTER Manager)
. HP (Oracle7, IMAGE/SQL, HP 3000)
. Encore Computer Corp (Oracle7 on Infinity 90)
. Kmart
. American Express
. AAA Airline
. Citibank
. Prudential Securities
. Bell Atlantic Corp
. BellSouth
. BT (British Telecom)
. Criminal Investigative Technology Inc.
. shared-memory (also called Symmetric Multiprocessing, SMP) --- all processors share direct access to a common global memory and to all disks; example: Oracle7 on the SGI Challenge
. shared-disks --- each processor has a private memory but has direct access to all disks; example: Oracle 7 on the IBM SP2
. shared-nothing --- each memory and disk is owned by some processor that acts as a server for that data; mass storage is distributed among processors by connecting one or more disks to each (good for system scalability and portability); example: Oracle 7 on the nCUBE2
. shared-data --- combines features of the above three models. Data sharing technology directly interfaces with the individual processors to let each processor cache shared data in its own local memory with full read/write capability (as in the shared-memory model). Data sharing maintains data integrity by ensuring cache coherency across all processors and provides efficient cross-system locking mechanisms (as in the shared-nothing model). All disk devices are accessible to each processor, so a workload may be scheduled to run on any available processor (as in the shared-disks model), instead of only on processors with data affinity (as in the shared-nothing model); example: DB2 on the IBM System/390 Parallel Sysplex
(four figures illustrating the four architectures)
(with "Scalability" in mind) . Shared-Memory and Shared-Disk . interference limits scalability . complicated and relatively expensive . Shared Nothing . minimal interference between processors . moves only questions and answers over the newtwork . can scale up to hundreds (maybe thousands) of processors . composed of cheap, stock parts and a fast interconnecting network . achieve near-linear improvements
. Pipelined: "assembly line" . average query does not have that many steps . some operational stages cannot be pipled (e.g. sort) . one operator is longer than the other (a kind of skew) . Partitioned: "divide and conquer" . fairly straight forward to implement . "tough" operations are divided, parts run in parallel . Combination: "best of both world" . several partitions running parallel . each partition is, where possible, a short pipeline
. pipelined parallelism (inter-operator parallelism)
. query parallelism
. inter-query parallelism
. intra-operator parallelism
. partitioned (declustered) parallelism (I/O parallelism)
. transaction parallelism
How to divide data among multiple disks?
. a common practice on systems with large numbers of disks and on mass storage systems (file striping)
. provides the high I/O bandwidth of multiple disks and parallel I/O
. no need for specialized hardware
. important for load balance on a shared-nothing system
. major approaches (see the sketch following this list):
  . Round Robin
    . data not scattered
    . cannot locate specific records
  . Hashing
    . scatters the data
    . can locate specific records
    . danger of data skew
  . Range Partitioning
    . data scattered
    . can locate specific records
    . danger of data skew
    . "related" data can be clustered
(figures)
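As an illustrative sketch (the notation is ours, not from the slides), the three placement rules for declustering a relation across n disks can be written as:

    % round robin: the i-th tuple in load order goes to disk i mod n
    disk(t_i) = i \bmod n
    % hashing: a hash function h applied to the partitioning attribute A
    disk(t) = h(t.A) \bmod n
    % range partitioning: boundaries v_0 < v_1 < ... < v_n split A's domain;
    % tuples with nearby values of A cluster on the same disk
    disk(t) = j \quad \text{such that} \quad v_{j-1} \le t.A < v_j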
. operation/application dependent
. the danger of data skew arises on shared-nothing systems with data partitioning
. data skew is the focus of a number of research projects
. automatic OS file striping vs. manual table striping
. speedup (degree of parallelism)
. scaleup (scalability) --- transaction scaleup and batch scaleup
. sizeup (both)
(definitions sketched below)
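These are the standard metrics of the parallel database literature; in symbols, with T(p, s) denoting elapsed time on p processors for a job of size s:

    \mathrm{speedup}(N) = \frac{T(1, S)}{T(N, S)}
    \qquad \text{(linear speedup: } \mathrm{speedup}(N) = N\text{; same problem, bigger system)}

    \mathrm{scaleup}(N) = \frac{T(1, S)}{T(N, N \cdot S)}
    \qquad \text{(linear scaleup: } \mathrm{scaleup}(N) = 1\text{; problem grows with the system)}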
. startup (latency)
. interference (communication, locking)
. load imbalance (execution skew --- all the execution occurs in one partition)
. data partitioning (data skew --- all the data is placed in one partition)
. imbalance of system configuration (CPUs vs. I/O channels)
. CPU-intensive vs. I/O-intensive queries
(a crude cost model follows)
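A crude illustrative model (ours, not from the slides) of how these barriers enter the elapsed time of a job of total work W on N processors:

    T(N) \approx T_{\mathrm{startup}}(N) + \frac{W}{N}
           + T_{\mathrm{interference}}(N) + T_{\mathrm{skew}}(N)

The useful term W/N shrinks with N while the startup and interference terms typically grow, so speedup eventually flattens; skew further bounds T(N) from below by the completion time of the largest partition.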
Data Structure
. relations (files, tables)
. tuples (records, rows)
. attributes (fields, columns)
(figure)
. scan (select-project) --- given a relation R, a predicate P, and an attribute list L, return the tuples of R that satisfy P, projected onto L (see the example below)
. sort (reorder)
. aggregate operators (SUM, AVG, MAX, MIN, ...)
. insert/delete/update
. set operators (union, intersection, difference)
. join, merge, and division
. embedded operators --- the source of the data-flow execution model
. uniformity of the data and operators
(figure showing an example of a join)
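For instance, the scan operator corresponds directly to a basic SQL query; the table and column names below are hypothetical:

    -- scan (select-project): relation R = orders, predicate P = (price > 100),
    -- attribute list L = (order_id, customer)
    SELECT order_id, customer     -- project onto L
    FROM   orders                 -- relation R
    WHERE  price > 100;           -- predicate P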
. Data Access --- SQL, transactions, PL/SQL, data integrity
. Data Concurrency and Consistency --- concurrency, locking
. Data Security --- users and schemas, roles and privileges, profiles and auditing
. Database Backup and Recovery --- redo log, rolling forward/back
. Distributed Processing
. a non-procedural language (4GL): specify only "what to do", not "how to do it"
. SQL applications need not be rewritten to run in parallel, in contrast to the scientific areas
. built on the relational model
. data-independent --- two-phase database schema (logical and physical database)
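A one-line illustration of the "what, not how" property, using a hypothetical table: the statement names only the desired result, and the RDBMS chooses access paths, join methods, and (in a parallel server) the degree of parallelism:

    -- Nothing here mentions indexes, scan order, or parallelism;
    -- the optimizer picks the execution strategy.
    SELECT   customer, SUM(price)
    FROM     orders
    GROUP BY customer;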
. Distributed Lock Manager (DLM)
  . the implementation vehicle of Parallel Cache Management
  . supports transaction parallelism for multiple OLTP streams
  . a simple and effective approach for porting a sequential RDBMS to MPP and loosely coupled systems with shared-nothing architecture
  . supports parallel loading, parallel indexing, parallel insert/update, and parallel recovery
  . major functions:
    . keeps track of the current "ownership" of a resource
    . accepts requests for resources from application processes
    . notifies the requesting process when a resource is available
    . grants a requesting process exclusive access to a resource
. Parallel Query Optimizer
  . builds a query execution plan employing all system resources; on the basis of data-partition knowledge, it can easily employ a few tens of processors to execute a single query with a high degree of parallelism; both data-parallel and data-flow paradigms are used to achieve this goal
  . supports complex CPU-intensive queries for DSS
  . near-linear or even super-linear speedup can be achieved
  . a major research focus --- optimized parallel algorithms are needed
  . parallelization is carried out by a query coordinator on the RDBMS server: the coordinator generates an optimal parallel plan rather than simply executing the best serial plan in parallel, and performs this optimization without the need for any external information (DB2 and Oracle7 use different approaches)
  . most relational operators are parallelizable, including full-table scan, sort, aggregate operators, join, and merge
  . system/server tuning plays an important role in achieving optimized performance in a parallel RDBMS --- benchmarking is an effective means (a plan-inspection example follows)
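One standard Oracle 7 facility for seeing what plan the optimizer chose is EXPLAIN PLAN; a minimal sketch, assuming a hypothetical orders table and the PLAN_TABLE created by the utlxplan.sql script shipped with Oracle (exact PLAN_TABLE columns vary by release):

    -- Record the optimizer's plan for a query under a chosen statement id.
    EXPLAIN PLAN
      SET STATEMENT_ID = 'dss_q1'
      FOR SELECT customer, SUM(price) FROM orders GROUP BY customer;

    -- Inspect the recorded plan steps.
    SELECT operation, options, object_name
    FROM   plan_table
    WHERE  statement_id = 'dss_q1'
    ORDER  BY id;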
nCUBE2
. 64 computing nodes, 2 GB RAM
. 32 disk I/O nodes, 256 MB RAM, 32 SCSI2 channels
. 96 x 2 GB disks, 192 GB total
. OS 3.3 (GFS) and 3.4 beta (PFS, NFS)
. Oracle 7.1.3 Parallel Server with PDQ
. Oracle Parallel Text Server 3.0 beta
. Video-On-Demand software
upcoming upgrade (Summer '95):
. nCUBE3, 96 nodes, 6 GB RAM
. 32 SCSI Fast/Wide channels, 1 TB on-line disk space
. 8 ATM DS3 network connections
IBM SP2
. 8 Thin and 4 Wide nodes, 256 MB RAM each
. 13/21 GB SCSI2 disks per node, 188 GB total
. TB2 high-performance switch
. parallel I/O supported via VSD software
database software:
. IBM parallel DB2 (beta)
. Oracle 7.1.3.2 Parallel Server (beta)
upcoming upgrade: switched Ethernet/FDDI/ATM hook-up to the NPAC backbone
. Planning, design, implementation, and interpretation of Decision Support benchmarks using parallel database technology
. An InfoMall product, supported by technology vendors
. Standard and customer-defined benchmarks
. NPAC provides state-of-the-art infrastructure, MPP and parallel database expertise, and trained personnel
. InfoMall assumes most of the risks and hazards involved in the benchmarking process
. Non-disclosure arrangements available
InfoMall project
NPAC teams with a small software vendor, a major software vendor, and a hardware vendor to provide an integrated Decision Support system for a number of Fortune 500 companies
NPAC role:
. help parallelize the code kernel of the Decision Support package
. interface the package to a parallel database system, and provide capacity planning based on customer performance requirements and NPAC parallel database expertise
. provide infrastructure and system support for the project duration
Project deliverable: an integrated turnkey DS system with all elements implemented on the same MPP platform
. Beta test sites with Oracle, nCUBE, and IBM for their parallel RDBMS products
. Direct and close contacts with parallel RDBMS developers at Oracle and IBM
. Hands-on expertise in large-scale (up to 100 GB database size) DSS benchmarking
. Expertise in parallel computing and applications, parallel system integration, and data visualization
. A high-performance RDBMS for loosely coupled and MPP systems
. currently supported systems: IBM SP2, nCUBE2, SGI SMP, SUN SMP, HP SMP, Amdahl, Encore, KSR, CM5
. also known as multi-instance Oracle --- multiple instances running against a single, shared database
. key technology: Parallel Cache Management and Parallel Query Processing
. an evolutionary approach built on the existing sequential Oracle RDBMS
  . first introduced with Oracle 6.2
  . version 6.2 = version 6.0 (sequential) + Parallel Cache Management
  . version 7.1 = version 6.2 + Parallel Query Processing
  . newest version 7.1.3 = sequential Oracle + Parallel Server Option + Parallel Query Option
  . the production release on the nCUBE2 shipped just a week ago
. all parallel implementation is transparent to users
. key components: a Distributed Lock Manager and a Parallel Query Optimizer
(fig.)
Oracle 7 Database Files
. Data Files --- contain all the database data; logical structures, such as tables and indexes, are physically stored in the data files
. Redo Log Files --- transaction logs that record all changes made to database information, for recovery
. Control Files --- record the physical structure of the database, including all data files and redo log files
. Every time the Oracle 7 Server is started, the System Global Area (SGA) is allocated and the Oracle 7 Server background processes are started. The combination of memory buffers and background processes is called an Oracle Server instance.
SGA (fig.)
. The SGA is a group of shared memory buffers allocated by the Oracle 7 Server for an instance
. The background processes asynchronously perform distinct tasks on behalf of all database users of the instance
(fig.)
. start an Oracle Server instance on each node
. each instance has its own set of redo log files
. all instances can concurrently execute transactions against the same database, and each instance can have multiple users executing transactions
. Multiple instances on different nodes share a single physical database
. The data is accessed by all nodes through a parallel file subsystem
. All instances share common data and control files, and each has its individual redo log files
. All instances can simultaneously execute transactions against the single database, and each instance can have multiple users executing transactions
. Data is kept consistent using PCM (Parallel Cache Management) technology, implemented by the Distributed Lock Manager
. Concurrency control of buffer caches among instances
. lets instances maintain independent, yet consistent, caches of blocks
. lets multiple nodes simultaneously read from and write to the database (disks)
. maximizes concurrent data access and minimizes (disk) I/O activity
. block-level PCM locks, with row-level transaction locking preserved
. a fault-tolerant DLM: if a node goes down, the surviving DLMs automatically recover and reconstruct the lock database on behalf of the failed DLM
(fig.)
Consider the following cache-coherency scenario without PCM:
Time 1: Node 1 reads block 1 from disk into its memory cache; Node 2 reads block 1 from disk into its memory cache
Time 2: Node 1 writes information to block 1; Node 2 writes information to block 1
Time 3: Node 1 writes block 1 to disk
Time 4: Node 3 reads block 1 from disk into its memory cache
In this situation, Nodes 1 and 3 hold the same version of block 1 in their caches, but Node 2's cache is incoherent.

Using the Oracle 7 Server's PCM locks:
. an instance acquires a PCM lock on a resource for update purposes
. all buffer caches must contain the same copy of a resource

PCM locks:
. ensure cache coherency by allowing instances to change blocks only while holding the lock
. work independently of transaction locks (row-level locking)
. ensure cache coherency using a minimum amount of communication
. are allocated to files when an instance is started

Consider the following scenario with PCM:
Time 1: Instance X attempts to update row 1 in data block n; X becomes owner of the PCM lock for block n; X updates row 1
Time 2: Instance Y attempts to update row 2 in data block n; Y requests the PCM lock from X; X releases the lock; Y becomes owner of the PCM lock for block n; Y updates row 2
Time 3: Instance X attempts to update row 3 in data block n; X requests the PCM lock from Y; Y releases the lock; X becomes owner of the PCM lock for block n; X updates row 3

. PCM locking is handled separately from transaction locking --- when instances wish to update the same data (e.g., instances X and Y both attempt to update row 1), normal transaction locking is involved.
. Parallel Query Processing
  . enables the parallel processing of long and complex queries with parallelizable operators such as scan, join, and sort
  . speeds up query response time (vs. throughput in the Parallel Server Option)
. Parallel Data Load
  . loads data using both direct-path and parallel methods
  . speeds up the loading of large data sets
. Parallel Index Creation
  . speeds up the creation of large indexes (see the sketch below)
. Parallel Recovery
  . enables concurrent recovery sessions
Best suited to data-intensive applications such as decision support systems and data warehouses
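A hedged sketch of parallel index creation (the Oracle 7.1 PARALLEL clause as we understand it; the table, index name, and degree are hypothetical):

    -- Build the index with 8 query server processes;
    -- DEGREE controls the parallelism within one instance.
    CREATE INDEX orders_cust_idx
      ON orders (customer)
      PARALLEL (DEGREE 8);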
Data flow (fig.):
1. A client process issues a query
2. The statement is parsed, and a shared server process executes the statement
3. The shared server process determines whether the statement can use parallel processing, and then enlists two or more query servers to execute the statement in parallel, partitioning the work of each operation equally among the query servers
4. The query servers process certain operations in the query independently and return sub-results to the shared server process
5. The shared server process reassembles the results from the query servers and returns the query's results to the client
1. Parse --- the query optimizer determines the optimal execution plan and parallelizes the query if a full scan is included in the plan (note that the query is parallelized dynamically at execution time, automatically adapting to changes in data partitioning and server configuration)
2. Execute --- perform the operations
3. Fetch (queries) --- return rows from the query server processes
The execute and fetch phases can be performed in parallel (pipelined parallelism) by multiple server processes
(fig., fig.)
. the shared server process is known as the Query Coordinator, which is responsible for parallelizing operations and interpreting the results from the query servers
. not all operators can be parallelized in Oracle 7.1.3 (see the list and sketch below)
. Table scans
. Sorts (for ORDER BY, GROUP BY, joins, DISTINCT)
. Joins (sort-merge, nested loops)
. Aggregates (GROUP BY, MIN, MAX, AVG, etc.)
. SET operations (e.g., UNION, INTERSECT, or MINUS) are NOT parallelized
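A sketch of a query exercising several of these operations at once; the /*+ PARALLEL */ hint syntax is Oracle's, while the table and degree are hypothetical (recall that TPC-D runs explicitly forbid optimizer hints):

    -- Parallel full scan of orders (degree 8), feeding a parallelized
    -- GROUP BY aggregate and an ORDER BY sort.
    SELECT /*+ FULL(o) PARALLEL(o, 8) */
           customer, AVG(price)
    FROM   orders o
    GROUP  BY customer
    ORDER  BY customer;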
. Oracle 7 on the nCUBE2 and IBM SP2 is actually implemented on shared-disk (vs. shared-nothing) architectures
. the database is shared by sharing disks through a parallel file system or parallel I/O system: on the nCUBE2, the GFS (Global File System) over a Gigcache Subcube and the nCUBE2's parallel I/O system; on the IBM SP2, the VSD (Virtual Shared Disk) parallel I/O system over the HPS
. automatic file striping and disk partitioning by the operating system on the nCUBE2
  . general OS and server level; less flexible, but relatively transparent to the DBA
  . data skew is less likely, due to the Gigcache
. manual file striping at the database table level (see the sketch below)
  . application-specific and flexible, but difficult to manage with a large number of files, and the optimal striping length is impossible to determine
  . data skew arises easily
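A minimal sketch of manual table-level striping (Oracle 7 DDL; the paths, sizes, and names are hypothetical): the tablespace is built from one data file per disk, and equal extent sizes encourage extent allocation to rotate across the files:

    -- One data file per physical disk.
    CREATE TABLESPACE stripe_ts DATAFILE
      '/disk1/stripe1.dbf' SIZE 500M,
      '/disk2/stripe2.dbf' SIZE 500M,
      '/disk3/stripe3.dbf' SIZE 500M,
      '/disk4/stripe4.dbf' SIZE 500M;

    -- Equal INITIAL/NEXT extents approximate round-robin striping by hand.
    CREATE TABLE lineitem_striped (
      order_id NUMBER,
      price    NUMBER
    )
    TABLESPACE stripe_ts
    STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);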
. Transaction Processing Performance Council benchmarks: TPC-A, TPC-B, TPC-C; TPC-D (not yet finalized)
. Wisconsin Benchmark (aka DeWitt)
. Set Query
. The Neal Nelson Database Benchmark (tm)
. SPEC
. The Customer --- or, worse, The Consultant
. few insert/update/delete transactions; read-only queries
. non-selective and data-intensive; few in number, large in size
. may return a large volume of data, e.g., a mailing list
. queries frequently scan a large percentage of the data in the database
. often used to mine data in a recursive, drill-down fashion
A common suite for DSS benchmarking makes it possible to compare:
. different software and hardware systems (such as Oracle 7 on the nCUBE2 vs. DB2 on the IBM SP2)
. different software products on the same machine (such as Oracle7 and DB2 on the IBM SP2)
. different machines in a compatible family (such as Oracle7 on the IBM SP1 and SP2, or on an RS/6000)
. different releases of a product on the same machine (such as Oracle 7.1.0, 7.1.2, and 7.1.3 on the nCUBE2)
Our TPC-D benchmarking project has covered most of the above cases
. The TPC-D definition requires that either one query stream (power test) or many (scale-up test) be run and timed against at least three sizes of a predefined database, with sizes growing exponentially (e.g., the 1 GB, 10 GB, 100 GB sequence)
. A query stream consists of 17 complex queries plus two concurrently running update operations
. No extensive database tuning is allowed --- optimizer hints are explicitly forbidden
. Timings for individual queries are not part of the TPC-D metric
. The database structure represents the most common DSS databases in applications well, i.e., the single largest table takes ~70% of the total DB size
Queries
. model complex queries against a relatively large and complex database
. all currently parallelizable relational operators (Oracle 7 Server) are used, including:
  Aggregate (Group By) - Q2, Q6, Q11, Q14, Q15
  Merge Join - Q8
  Nested Loops - all except Q1
  Sort (Group By) - all except Q2
  Sort (Join) - Q8, Q12
  Sort (Order By) - Q3, Q5, Q10, Q11, Q16
  Sort (Unique) - Q4
  Table Access (Full) - all
Problems with the TPC-D design
. Query stream design: Query 17 takes more than 95% of the time needed to execute the stream
. The overall design unnecessarily mimics the TPC OLTP benchmarks
. The metrics are more suitable for Data Warehouse applications than for ad hoc DS queries
. The benchmark is expensive to run (huge disk space requirements)
Is TPC-D suitable for parallel DBMS evaluation?
. No scalability performance metrics suitable for parallel data query evaluation, i.e., query time vs. number of instances (CPUs), and/or query time vs. available I/O resources (I/O channels, drivers, disks, and disk partitions)
. Scalability of a parallel database system relies on both CPU resources and I/O resources. By using multiple instances (CPUs), the parallel server/query decreases query response time and increases throughput. By using partitioned data on multiple disks (with multiple I/O channels), the parallel server/query achieves higher I/O bandwidth for large DBs. The two characteristics are equally important and must be benchmarked together. The current TPC-D specification defines neither performance metric.
. For DS applications, query stream execution time is largely irrelevant. It is far more important to understand the performance and scalability of elementary SQL operations and of typical queries encountered in customer applications.
. Loading and indexing may take a large portion of the overall time needed to prepare a large DB. Parallel loading and indexing benchmarks for large DSS databases should be defined and run together with the query benchmarks. Currently, neither is defined or even mentioned in the TPC-D specification.
. The TPC-D design contains flaws that make its use questionable even for sequential database systems
. Given the complexity of parallel database systems, attempts to reduce their performance estimate to a single number are most likely futile
. TPC-D defines no metrics useful for understanding the behavior of parallel RDBMSes; the TPC-D designers should re-examine academic research work pertinent to the field
. Future research work toward the definition of a DS benchmark for parallel databases seems important
. Data partitioning must be very carefully designed to balance I/O and CPU
. Parallel loading is great! --- almost linear speedup
. Backup may be important to save time (we did not have one)
. The most difficult and trouble-prone phase is database preparation/loading/indexing, which always involves storage management parameters
. You must be both a DBA and a sysadmin to get the job done quickly!
. In the worst case, you may need to bring down the whole parallel system a couple of times a day to get a clean system
. Benchmarking relies critically on system stability
. Be prepared for negative results (nothing may be obtained!)
. You need at least twice the disk space of your original database size
. Performance tuning is extremely difficult with the Oracle 7 parallel server
We regret that no performance data is given here, in compliance with the non-disclosure agreements with our sponsors and vendors.