HELP! * BLUE=global GREY=local Full HTML for

GLOBAL foilset Full Relational Database Presentation prepared for HPDC-4 Tutorial (Used in CPS616-1995)

Given by Gang Cheng Marek Podgorny (Geoffrey Fox) at HPDC-4 Tutorial/CPS616 Spring95 on August 1, Pentagon City Virginia. Foils prepared July 23,1995
Abstract * Foil Index for this file See also color IMAGE

Commercial Applications and Motivation for large Parallel Databases
Overview of base Technology including parallel architecture and review of sequential SQL
NPAC's Parallel Database Installation
Parallel Oracle on SP2 and nCUBE
DR-LINK high level frontend from Textwise
Some experiences with benchmarking --TPC-D
Linking of relational databases to the Web
  • Example of use with large UNIX mail system
  • Contrast of pluses and minuses of relational databases with distributed Web databnases

Table of Contents for full HTML of Full Relational Database Presentation prepared for HPDC-4 Tutorial (Used in CPS616-1995)


1 Tutorial on Parallel Relational Databases including Oracle, DB2PE on nCUBE2 and IBM SP2
2 Abstract of Parallel Relational Database Tutorial
3 Parallel Relational Database Management Systems -- I
4 Abstract of Parallel Relational Database Management Systems -- I
5 Outline of Full Database Presentation
6 Section I:
Parallel Database Technology in Commercial Applications and Industry

7 Motivations for Parallel Databases -- I:
Overview of Parallel Database Appeal

8 Motivations for Parallel Databases -- II:
Inadequacies with Current Mainframe Solutions

9 Motivations for Parallel Databases -- III:
Commercial versus Scientific Applications

10 Motivations for Parallel Databases -- IV:
Market Demand from Competitiveness

11 Application Areas for Parallel Database: commercial, administration, scientific
12 General Classes of Commercial Applications
13 An Application Example ---
Intelligent Business systems
1)Objectives

14 Intelligent Business systems ---
2)Typical Questions

15 Intelligent Business systems ---
3) Major Technology Challenges

16 Intelligent Business systems ---
4) Solutions

17 Major Software and Hardware vendors in Parallel Database Technology
18 Some Current Major Commercial Users
19 Parallel Database Technology and Theory
20 Hardware architectures for parallel DBMS
-- Generic System

21 Hardware Architectures and forms of Parallelism
22 Three (Hardware) architectures for parallel DBMS
23 Notes on Shared Nothing Architecture
24 Shared-data Architecture
25 Shared Data Architecture
26 Note on Oracle nCUBE2 Hybrid Architecture
27 Topological view of MP machines (with ÒScalabilityÓ in mind)
28 Dataflow perspective
29 Parallelisms in parallel database systems
30 Data Partitioning --- How to divide data among multiple disks ?
31 Basic Data Partitioning Schemes
32 Major Approaches in Data Partitioning
33 Danger of Data Skew arises on
Shared Nothing with Data Partitioning

34 Pitfalls in data partitioning
35 Performance Metrics In Parallel Database Systems
36 Performance barriers
37 Some basic terminology for relational database model
38 Examples of Typical Relational Operations
39 Join Operation: a SELECT operation that combines rows from two or more tables. Each returned row contains data from more than one table
40 Overview of Structure Query Language (SQL)
41 Features of Structure Query Language (SQL)
42 Major RDBMS functionality
43 What is a 3GL or 4GL?
44 What is PL/SQL -- I ?
45 What is PL/SQL -- II?
46 What is PL/SQL -- III?
47 What is Data Integrity?
48 What are Schemes?
49 What are Roles?
50 What are Profiles and Auditing?
51 What are Two-phase Database Schema?
52 Example: Relational Joins
53 General Structure of Parallel and Sequential Relational Joins
54 Parallel Algorithm for Relational Joins
55 Parallel Database Software Architecture
Distributed Lock Manager

56 Oracle Parallel Relational Database -- Screen1: Oracle Parallel Server
57 Oracle Parallel Relational Database -- Screen2: Parallel Cache Management
58 Parallel Database Software Architecture
59 Oracle Parallel Relational Database -- Screen3: Oracle's Parallel Query
60 Parallel Relational Database Management Systems II
61 Abstract of Second Part of Parallel Relational Database Systems Presentation
62 Parallel Database efforts at NPAC and in InfoMall
63 NPAC's Experience in Parallel Database Technology
64 nCUBE Database Hardware and Software Infrastructure in NPAC
65 IBM Database Hardware and Software Infrastructure in NPAC
66 InfoMall(NPAC) Parallel Database Evaluation and Benchmarking Service
67 Integrated Decision Support Systems
68 Overview of DR-LINK High Level Natural Language Database Interface
69 Features of DR-LINK High Level Natural Language Database Interface
70 Structure of DR-LINK High Level Database Interface
71 Parallel Oracle7 RDBMS -- A Case Study
72 What is the Oracle Parallel Server -- Introduction ?
73 Oracle Parallel RDBMS Architecture
Overall Structure

74 Oracle Parallel RDBMS Architecture
Parallel Data Query option:

75 Oracle Parallel RDBMS Architecture
Parallel Server option

76 What are the versions of the Oracle Parallel Server ?
77 What is the Oracle Parallel Server File Structure and Architecture?
78 Oracle7 Server Architecture
79 Oracle Parallel Relational Database Server on nCUBE2 -- Screen 1 on Oracle's Parallel Server for nCUBE2
80 Oracle Parallel Relational Database Server on nCUBE2 -- Screen 2 on Oracle's Parallel Server for nCUBE2
81 Oracle Parallel Relational Database Server on nCUBE2 -- Screen 3 on Oracle's Parallel Server for nCUBE2
82 Oracle Parallel Relational Database Server on nCUBE2 -- Screen 4 on Oracle's Parallel Server for nCUBE2
83 The Oracle Server Instance
84 Oracle7 Parallel Server Characteristics
85 The Oracle7 Parallel Server
86 Application Example of Parallel Server (functional parallelism)
87 Parallel Server Option
88 Oracle7 on IBM SPX Architecture
89 Parallel Cache Management (PCM) and Distributed Lock Manager(DLM)
90 Oracle Parallel Server On-Line Recovery
91 Example of failure due to lack of concurrency control
92 Discussion of PCM Lock Example
93 Contention for Shared Resources
94 Using Oracle 7 Server's PCM locks:
95 Consider the following PCM Example
96 Parallel Query Option
97 Parallel Query Processing: Overview
98 Query Processing
99 Three Processing Phases in Parallel Query
100 Example of parallel execution of a full scan
101 Parallel Query Execution
102 The query coordinator dynamically partitions the work among the query servers
103 Data Partitioning in Oracle 7.1.3
104 File Striping in Oracle
105 Benchmarking Parallel RDBMS
106 Industry Benchmarking Tests-I
107 Industry Benchmarking Tests-II
108 Characteristics of DSS Workloads
109 TPC-D - a Decision Support Benchmark?
Overall Structure

110 TPC-D - a Decision Support Benchmark?
Nature of Benchmark Examples

111 TPC-D - a Decision Support Benchmark? Nature of Queries
112 TPC-D - a Decision Support Benchmark?
Problems with TPC-D design

113 TPC-D - a Decision Support Benchmark?
Scalability Issues

114 TPC-D - a Decision Support Benchmark?
Relevance to Real Applications

115 TPC-D - a Decision Support Benchmark?
Overall Evaluation

116 Some Lessons Learned in NPAC's Benchmarking
117 Integration of Database Server and Web Technology
118 Abstract of Web Oracle Integration Presentation
119 Key points in Web Technology
120 Key points in database technology
121 The Synergy of Web Networking and Database Technologies
122 Weak Points in Web Technology
123 Weak Points in Database Technology
124 By integrating a database server into Web technology, a Web server can be enhanced with:
125 By integrating Web into database technology, a Database server can be enhanced with:
126 Interfacing Web server and a database server
-- Basic Features of WWW Client Server System

127 WWW Client Server Relationship
128 Basic features in database access -- a SQL-based query interface
129 Basic Client-Server model
130 Major approaches for interfacing web into DB server environment
131 Basic WWW DB Integration Model
132 Oracle approach to Web-RDBMS Integration
133 Procedural and Programmable mechanisms
134 Web-Oracle-Web (WOW) --- a Web-Oracle interface tool for developing gateways
135 Basic components in WOW
136 Client-Server Architecture for WOW
137 Web-oracle Mail Database: a case study of integrating Oracle7 database server into Web
138 The Entity-Relation Model
139 Functionality of the mail database
140 Web to Oracle Interface Screen 1: Home Page for the Web-Mail Database
141 Web to Oracle Interface Screen 2: Query the Mailbox
142 Web to Oracle Interface Screen 3: Query the Folder -- Choose a Folder
143 Web to Oracle Interface Screen 4: Query the Folder
144 Query by mail header
145 Web to Oracle Interface Screen 5: Query by Mail Header -- Choose a Folder
146 Web to Oracle Interface Screen 6: Query by mail header
147 Web to Oracle Interface Screen 14: Query Result -- Browse Mail Body --1
148 Web to Oracle Interface Screen 15: Query Result -- Browse Mail Body --2
149 Query by date
150 Web to Oracle Interface Screen 7: Query by Date --1
151 Web to Oracle Interface Screen 8: Query by Date --2
152 Query by Subject
153 Web to Oracle Interface Screen 9: Query by Subject --1
154 Web to Oracle Interface Screen 10: Query by Subject --2
155 Query by Sender
156 Web to Oracle Interface Screen 11: Query by Sender
157 Incorporating new mail into the Database from your UNIX mailbox --I
158 Web to Oracle Interface Screen 12: Incorporate New Mail from Web Page -- Choose a folder
159 Incorporating new mail into the Database from your UNIX mailbox --II
160 Web to Oracle Interface Screen 13: Incorporate New Mail
161 Format of Query results from all query options
162 Web to Oracle Interface Screen 6: Query by mail header
163 Web to Oracle Interface Screen 14: Query Result -- Browse Mail Body --1
164 Web to Oracle Interface Screen 15: Query Result -- Browse Mail Body --2
165 Query options to be added in the future to Oracle-based Web Mail System

This table of Contents Abstract



HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 1 Tutorial on Parallel Relational Databases including Oracle, DB2PE on nCUBE2 and IBM SP2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
HPDC4
Washington DC
August 1,1995
NPAC
Gang Cheng, Marek Podgorny with Geoffrey Fox
Syracuse University
111 College Place
Syracuse
NY 13244-4100

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 2 Abstract of Parallel Relational Database Tutorial

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Commercial Applications and Motivation for large Parallel Databases
Overview of base Technology including parallel architecture and review of sequential SQL
NPAC's Parallel Database Installation
Parallel Oracle on SP2 and nCUBE
DR-LINK high level frontend from Textwise
Some experiences with benchmarking --TPC-D
Linking of relational databases to the Web
  • Example of use with large UNIX mail system
  • Contrast of pluses and minuses of relational databases with distributed Web databnases

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 3 Parallel Relational Database Management Systems -- I

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Click here for subtitle
Gang Cheng, Marek Podgorny
April 1995
Northeast Parallel Architectures Center
at Syracuse University

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 4 Abstract of Parallel Relational Database Management Systems -- I

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
This presentation contains the first two sections
  • Parallel Database Technology in Commercial Applications and Industry
  • Parallel Database Technology and Theory
Of the full CPS616 Parallel Database Module
The first section sets the scene by motivating the need for paraalel databases while
The second section reviews both Sequentional and Parallel Relational Databases looking at explicit examples
  • nCUBE and SP2 with Oracle and DB2
We also discuss database system architectures and review
The SQL Query language

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 5 Outline of Full Database Presentation

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
This Presentation has five Sections
Parallel Database Technology in Commercial Applications and Industry
Parallel Database Technology and Theory
Parallel Database Projects at NPAC
Parallel Oracle7 RDBMS -- A Case Study
Parallel Database Benchmarking

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 6 Section I:
Parallel Database Technology in Commercial Applications and Industry

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 7 Motivations for Parallel Databases -- I:
Overview of Parallel Database Appeal

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 8 Motivations for Parallel Databases -- II:
Inadequacies with Current Mainframe Solutions

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Conventional (mainframe) systems unable to support
  • time-critical queries
  • simultaneous OLTP and decision support systems
  • gigantic scientific and commercial databases
  • complex queries (approximate matching, pattern recognition ...)
  • sophisticated optimization as in market segmentation

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 9 Motivations for Parallel Databases -- III:
Commercial versus Scientific Applications

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
In commercial business world, MPP systems
  • Can find more profitable applications than the scientific world,
  • As SQL intrinsically parallel, one needs only parallelize the database system and not the many SQL applications
    • in contrast for scientific computing, one needs to both produce a parallel compiler and parallelize each application at language and algorithm level
  • Have better cost/performance than the mainframe system
  • Are mature and highly available NOW

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 10 Motivations for Parallel Databases -- IV:
Market Demand from Competitiveness

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 11 Application Areas for Parallel Database: commercial, administration, scientific

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
High transaction rate applications
  • banks, airlines, telecommunication, security brokers, retailers ...
Real-time applications
  • stock trading, C3I (military, civil defense), air traffic, process control (manufacturing, accelerators, reactors/generators), real-time insurance claim analysis ...
Complex queries
  • decision support (market analysis), molecular biology, chemistry/pharmacology ...
Massive amount of data
  • EOS, HEP, weather, census, global environment models, oil exploration, medical history files, multimedia support ...

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 12 General Classes of Commercial Applications

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
OLTP (On Line Transaction Processing) --- consists of performing many small independent requests submitted by many clients simultaneously and operating
  • Performance needed is linear in database size
DSS (Decision Support System)
  • Performance needed can be exponential in database size for sophisticated optimization or queries
Data Warehouse --- operations mixing DSS and OLTP
Data mining --- uncover/extract meaningful information from massive data sources

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 13 An Application Example ---
Intelligent Business systems
1)Objectives

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
understand customer buying habits and product preference
predict market tendency
make efficient, productive and profitable decisions
must be based on answers to questions on ALL historical business transaction data (Lots of !)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 14 Intelligent Business systems ---
2)Typical Questions

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
ÒGive me a breakdown of all customers likely to default in the coming yearÓ
ÒDo three things: predict year-end demand, tell me which customers will fuel that demand, and tell me whyÓ
ÒDefine a highly concentrated micromarket (make special offers to customer segments that meet very select criteria) within my databaseÓ
ÒCreate a model that really explains why some customers renew their subscriptions and others donÕtÓ
ÒSuggest ways of regrouping our customers into new market segments (for Direct-Marketing)Ó
ÒFind some customer-preference pattern I might not be aware ofÓ

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 15 Intelligent Business systems ---
3) Major Technology Challenges

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Lots of Historical Data (and they are growing daily !)
High Performance Computing System and DBMS
Intelligent Data Analysis (Optimization)
Not possible on a mainframe system in terms of both performance and capacity

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 16 Intelligent Business systems ---
4) Solutions

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
MPP system
Parallel Relational Database Management System
Intelligent Datamining Algorithms
  • Expert Systems
  • Fractal Compression
  • Neural Networks
  • Modelling
  • Genetic Algorithms

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 17 Major Software and Hardware vendors in Parallel Database Technology

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Oracle (Oracle7 with parallel server and parallel query options)
Sybase (Navigation Server on AT&T 3600, SUN SMPs, and IBM SP2 (planned),SQL Server 10)
Informix (INFORMIX-OnLine Dynamic Server)
IBM (Oracle7, DB2/6000, DB2/2, SP2, IBM 390 Parallel Sysplex)
SGI (Oracle7 on SMP Challenge database server)
Taradata with AT&T and NCR
Tandem (Himalaya K10000
Cray Research (SuperServer)
SUN (Oracle7, DB2 on SUNÕs SMP)
AT&T and NCR
TMC (CM5, Darwin, Oracle7, Decision/SQL, Parasort)
KSR (Query Decomposor, Oracle7)
Amdahl (Oracle7)
nCUBE (Oracle7, nCUBE2)
data-CACHE Corp (SQL dataMANAGER)
Data General Corp (Oracle7, DSO)
DEC (POLYCENTER Manager)
HP (Oracle7, IMAGE/SQL, HP 3000)
Encore Computer Corp (Oracle7 on Infinity 90)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 18 Some Current Major Commercial Users

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Kmart
American Express
AAA Airline
Citibank
Prudential Securities
Bell Atlantic Corp
BellSouth
BT (British Telecom)
Criminal Investigative Technology Inc.
Check http://greatwall.npac.syr.edu:1963 for more info about parallel database technology in industry

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 19 Parallel Database Technology and Theory

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 20 Hardware architectures for parallel DBMS
-- Generic System

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
A typical hardware setup is:

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 21 Hardware Architectures and forms of Parallelism

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 22 Three (Hardware) architectures for parallel DBMS

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
shared-memory (also called Symmetric Multiprocessors SMP)
  • all processors share direct access to a common global memory and to all disks. example: Oracle7 on SGI Challenge
shared-disks
  • each processor has a private memory but has direct access to all disk. example: Oracle 7 on IBM SP2
shared-nothing
  • each memory and disk is owned by some processor that acts as a server for that data. Mass storage is distributed among processors by connecting one or more disk to each processor. example: DB2 on SP2
  • This is Òowner computeÓ rule as in much scientific computing (HPF default implies processor where variable ÒstoredÓ performs computations setting variableÕs value)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 23 Notes on Shared Nothing Architecture

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
On a shared-nothing architecture, one or more disks are connectedto each processor, depending on machineÕs (parallel) I/O architecture, via direct connecting or some I/O channels.
The shared-nothing does not focus on the physical I/O connection architectures, rather, it refers tohow the data is partitioned on the disk arrays and how the data are movedinto the procesor buffers for the parallel query processing.
In this architecture, data locations determines where the data will be processed and how data is shared by processors other than its local processor (ie. the processor which has the local disk holding the data if direct connection like DB2 on SP2, or the shortest path from disks to the process if other I/O architectures like Oracle on ncube).
Unlike shared-disk where data may be shared via the interconnection network when it is first readfrom a disk to a remote processor (buffer), if the data local to processorA is required by a remote processor B to perfom some parallel query processing, A sends the data to B via the communication network.
That is the only shared media for shared-nothing is the communication network.
In most case, data placement has determined how and where (even when) a parallel query processing is decomposed or partitioned.
Data (or I/O) load balance determines the CPU load balance in the system.
This is built in the query decomposer. eg.
if Processor A gets 10% data local to its disk, while B gets remaining 90%.
Then in a shared-nothing system, A spends 10% CPU and then is idle while B takes most of CPU time.
On the otherhand for a shared-disk system, A and B will process similiar amount of data with some overhead of sending data from B to A.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 24 Shared-data Architecture

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Combine features of the above three models, data sharing technology directly interfaces with the individual processors to enable each processor to cache shared data in its own local memory with full read/write capability (as in shared-memory model) Data sharing maintains data integrity by ensuring cache coherency across all processors and provide efficient cross-system locking mechanisms (as in shared-nothing model)All disk devices are accessible to each processor so that a workload may be scheduled to run on any available processor (as in shared-disks model), instead of being run only on the processors with data affinity (as in shared-nothing model). Example of shared-data system: DB2 on IBM System/390 Parallel Sysplex.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 25 Shared Data Architecture

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 26 Note on Oracle nCUBE2 Hybrid Architecture

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
It is difficult to characterise the Oracle7 on ncube2 as it mixed both shared-disk and shared-nothing architectures features by using an additonal subcube as a giga-cache which lies between ncube I/O system (including I/O nodes, multiple I/O channels and disks drives) and the compute-node subcube(s).
Before data is read into the buffer in compute-processor, they first are cached in a Giga-cache node.
So if you look just at the compute-subcube <--> Giga-cache subcube, it looks likea shared-nothing system,
But if you look at the whole compute-subcube <--> Giga-cache subcube <--> disk-arrays, it looks like a shared-disk system.
Strictly speaking and compared to DB2 on SP2, it is not a shared-nothing system as the data-placement on disk-arrays has little to do with how query processing is paralleliazed or decomposed.
This is also the reason why we found data partition schemes have less I/O performance impact for Oracle7 on nucbe than that on SP2 (the latter is a shared-disk system).

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 27 Topological view of MP machines (with ÒScalabilityÓ in mind)

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Shared-Memory and Shared-Disk
  • interference limits scalability
  • complicated and relatively expensive
Shared Nothing
  • minimal interference between processors
  • moves only questions and answers over the network
  • can scale up to hundreds (maybe thousands) of processors
  • composed of cheap, stock parts and a fast interconnecting network
  • achieve near-linear improvements
  • good for system scalability and portability
Shared Data
  • Little experience so far

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 28 Dataflow perspective

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Pipelined: Òassembly lineÓ
  • average query does not have that many steps
  • some operational stages cannot be pipelined (e.g. Sort)
  • one operator is longer than the other (a kind of skew)
  • This is task parallelism in scientific computing
Partitioned: Òdivide and conquerÓ
  • fairly straight forward to implement
  • ÒtoughÓ operations are divided, parts run in parallel
  • This is classic data parallelism in scientific computing
Combination: Òbest of both worldÓ
  • several partitions running parallel
  • each partition is, where possible, a short pipeline

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 29 Parallelisms in parallel database systems

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
pipelined parallelism (inter-operator parallelism)
query parallelism
  • inter-query parallelism
  • intra-operator parallelism
partitioned (declustered) parallelism (I/O parallelism)
transaction parallelism

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 30 Data Partitioning --- How to divide data among multiple disks ?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
a common practice on large number of disks and mass storage systems (file striping)
high I/O bandwidth of multiple disks and parallel I/O
no need for specialized hardware
important for query load balance on a shared-nothing system
this is of course a central problem in scientific computing and much of High Performance Fortran is devoted to data partitioning

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 31 Basic Data Partitioning Schemes

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 32 Major Approaches in Data Partitioning

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Round Robin ( cyclic in HPF/Scientific Computing Notation)
  • data scattered
  • cannot locate specific records
Hashing ( scattered decomposition in HPF style)
  • scatters the data
  • can locate specific records
  • danger of data skew
Range Partitioning (block cyclic in HPF style)
  • data not scattered
  • can locate specific records
  • danger of data skew
  • ÒrelatedÓ data can be clustered

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 33 Danger of Data Skew arises on
Shared Nothing with Data Partitioning

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 34 Pitfalls in data partitioning

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
operations/application dependent
danger of data skew arises on shared-nothing with data partitioning
data skew is the focus of a number of research projects
automatic OS file striping vs manual (application-specific) table striping

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 35 Performance Metrics In Parallel Database Systems

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
speedup (degree of parallelism)
  • (fixed problem size)
scaleup (scalability) -- transaction scaleup, batch scaleup, processor scaleup, I/O scaleup
  • (problem size increases)
sizeup (both)
  • (scaled speedup in scientific computing increases problem and computer size by same factor)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 36 Performance barriers

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
startup (latency)
interference (communication, locking)
load balance (execution skew - all the execution occurs in one partition)
data-partition (data skew - all the data is placed in one partition)
imbalance of system resources (CPUs and I/O channels)
CPU-intensive or I/O intensive queries

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 37 Some basic terminology for relational database model

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Data Structure
  • relations (files, tables)
  • tuples (records, rows)
  • attributes (fields, columns)
Relation operators
  • scan (select-project) (a relation, a predicate, and an attribute list)
  • sort (reorder)
  • aggregate operators (SUM,AVG,MAX,MIN,...)
  • insert/delete/update
  • set operators (union, intersection, difference)
  • join, merge and division
embedded operators
  • uniformity of the data and operators
  • source of data-flow execution model

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 38 Examples of Typical Relational Operations

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 39 Join Operation: a SELECT operation that combines rows from two or more tables. Each returned row contains data from more than one table

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
SQL: Select A.A1,A.A3,B.B1,B.B2 from A,B where A.A1=B.B1;

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 40 Overview of Structure Query Language (SQL)

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
a database language specifies the semantics of various components of a DBMS: structures and operation of a data model, data definition, data access, security, programming language interface, and data administration
Industry accepted Standard, first introduced by ANSI in 1986, current SQL92 by ANSI and ISO, new standard SQL3 with enhancements in object-oriented data management is undergoing. Portable to all RDBMS systems.
built on relational model, simple and powerful
non-procedural 4GL language, only specify Òwhat-to-doÓ, not Òhow-to-doÓ , extended to object-oriented programming interface
  • this extended model competes with fledging object-oriented database in industry

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 41 Features of Structure Query Language (SQL)

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
applications in SQL do not need to be rewritten to run on parallel server (all parallelizations of SQL programs are carried out by the server), compared to scientific areas where both parallel application algorithms and a parallel compiler are required
data-independent --- two-phase database schema (logical and physical database)
SQL in Oracle 7
  • create tables in the database
  • store information in tables
  • select exactly the information you need from your database
  • make changes to your data and the structure of underlying tables, and
  • combine and calculate data to generate the information you need

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 42 Major RDBMS functionality

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Data Access --- SQL, Transactions, PL/SQL, Data Integrity
Data Concurrency and Consistency --- Concurrency, Locking
Data Security --- Users and Schemes, Roles and Privileges, Profiles and Auditing
Database Backup and Recovery --- Redo log, Rolling Forward/Back
Distributed Processing --- Client-Server environment

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 43 What is a 3GL or 4GL?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Historians of programming languages and database people tend to classify programming languages by generations:
1GL --- machine language (in 0/1 binary)
2GL --- assembly language (in symbolic machine language)
3GL --- high-level programming language (structural, modular and usually procedural)
4GL --- non-procedural (declarative). Most AI languages are 4GL.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 44 What is PL/SQL -- I ?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
PL/SQL illustrates some low-level data access tools/languages that are available in most RDBMS.
Basically, there are three levels of database access interfaces:
1. highest level --- SQL
  • Non-procedure language, only specify what-to-do, not how-to-do.
  • So-called fouth-generation language (4GL). no control structures(loop,if)
  • No module (block) structure, very expressive or declarative.
  • One Other Language example in this category is Prolog.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 45 What is PL/SQL -- II?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
2. Middle-level --- PL/SQL
  • fill the gap between the highest SQL level and lowest C level programming.
  • It is basically a procedual implementation of SQL access.
  • Pure SQL is a subset of PL/SQL where more C-like constructs are introduced
  • Such as block-structure,control statments, variables,module,interface specification,functions, etc).
  • But it doesnt have the rich data-types and low-level operators as in C such as
  • Structure(record),enumeration,array, pointer(reference), and set datatypes, and bitwise-operator,I/O and file handling.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 46 What is PL/SQL -- III?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
3. Embedded SQL in C (or f77,Cobol) --- programming language interfaces
  • Procedural high-level programming langauges (often called 3GL).
  • All capability and flexibility in C-level programming as a host language to embed the SQL statement as the data-access for querying the database engine.
  • Poweful in programming but complex sematics and inconvenient interface to SQL.
  • Usually you need to know many programming/libary details to interface the both in programming.
  • Good for pre-/post-processing to the query but bad for query programming.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 47 What is Data Integrity?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
A way to enforce error-checking and formating in data-types. Usually the rules for data integrity is created when a table is specified. Examples:
  • Null -- a data field may or may not be allowed to be Null.
  • Unique -- a data field may or may not be allowed to be uniquely identified among all the other raws in a table
  • Primary key -- a data field being a primary key must be unique
  • Referential integrity -- inserting or updating a fieldÕs value on table A must depend on if it matches the value of a field on another table B.
Data integrity is the major way to enforce the data-consistency in the database.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 48 What are Schemes?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Schemes are used to define scope of different data objects to enforce data secuity and naming issues in a DBMS.
A Scheme in a DB is a logical hierarchy of data storage structures.
Major scheme objects are: users, tablespaces, tables, views, index, cluster, synonyms, columns.
Access to Different Schemes require different roles and privileges for different users.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 49 What are Roles?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
A way to classify level of database users. Like sysadm usually has superuser privileges, roles are used to group users with different privileges.
Because a DBMS system has many levels of operations which set many levels(kinds) of operation privilages,
Role is another logical level of grouping to define/specify who will have what system privileges.
It makes system management easier and more flexible.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 50 What are Profiles and Auditing?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Profiles and Auditing are the way to monitor (system logging) and record user database actions
They are used for:
  • investigating suspicious activity and .
  • gathering data about specific activity for performance tuning and system statistics.
These are just a DB way of saying logging.
As in other cases, Database field sometimes use its own different terms for the same thing in other fields.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 51 What are Two-phase Database Schema?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
For a RDBMS there are two levels of abstractions of how data being stored and represented in the system:
1) Logical database consists of the conceptual view of a DB system which is described by an ER (entity-relationship) model, defined and access by SQL and represented by tables,columns,views and other abstractions of data-object;
2) The physical database consists of the actual phyical view of the DB which is represented by files,blocks,indexes,clusters,partitions,etc.
End-user and developers only need to deal with the logical level,
While the DBMS and a DBA (database administrator) define and perform the mapping between the two levels.
This is the reason why SQL achieves portability (oompared to f77), from a viewpoint of a data independent model.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 52 Example: Relational Joins

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 53 General Structure of Parallel and Sequential Relational Joins

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 54 Parallel Algorithm for Relational Joins

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 55 Parallel Database Software Architecture
Distributed Lock Manager

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Implementation of Parallel Cache Management
Support transaction parallelism of multiple OLTP
A simple and effective approach to port sequential RDBMS to MPP and loosely coupled systems with shared-nothing architecture
support parallel loading, parallel indexing, parallel insert/update and parallel recovery
Major functionality
  • keep track of the current ÒownshipÓ of a resource
  • accepts requests for resources from application processes
  • notifies the requesting process when a resource is available
  • get exclusive access to a resource for a resource
Note this can work fine in OLTP with many uncorrelated queries but will not work in scientific computing where ALL updates are correclated and reserving a resource introduces a sequential bottleneck

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 56 Oracle Parallel Relational Database -- Screen1: Oracle Parallel Server

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 57 Oracle Parallel Relational Database -- Screen2: Parallel Cache Management

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 58 Parallel Database Software Architecture

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Parallel Query Optimizer
  • Parallel Query Optimizer builds a query execution plan employing all system resources on the basis of data-partition knowledge, it can easily employ few tens of processors to execution a single query with high degree of parallelism. Both data-parallel and data-flow paradigms are used to achieve this goal
  • Support complex CPU-intensive or I/O-intensive queries for DSS
  • near-linear or even super-linear speedup can be achieved (due to larger effective caches)
  • Major focus for research --- optimized parallel algorithms needed
  • carried out by a query coordinator on the RDBMS server (the query coordinator generates the optimal parallel plan rather than having the best serial plan simply executed in parallel. This optimization is performed by the coordinator without the need for any external information (DB2 and Oracle7 use different approaches)
  • most relational operators are parallizable, including full-table scan, sort, aggregate operators, join, merge
  • System/Server tuning plays an important role in achieving optimized performance in a parallel RDBMS --- benchmarking is an effective means of performing tuning

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 59 Oracle Parallel Relational Database -- Screen3: Oracle's Parallel Query

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 60 Parallel Relational Database Management Systems II

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Gang Cheng
Marek Podgorny
(and a bit of Geoffrey Fox)
gcheng@npac.syr.edu,marek@npac.syr.edu
NPAC
111 College Place
Syracuse
New York 13244-4100

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 61 Abstract of Second Part of Parallel Relational Database Systems Presentation

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
This discusses the last three parts of presentation
What is situation at NPAC including hardware, InfoMall activities and the DR-LINK product of Textwise (Liz Liddy)
Detailed Discussion of Parallel Oracle on SP2 and nCUBE
Results of NPAC's Benchmarking Activities
  • Especially TPCD work

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 62 Parallel Database efforts at NPAC and in InfoMall

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 63 NPAC's Experience in Parallel Database Technology

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Beta test sites with Oracle, nCUBE and IBM for their parallel RDBMS products
Direct contacts with PRDBMS developers at Oracle and IBM
Hands-on Expertise in large-scale (up to 100 GB DB size) DSS benchmarking
Experience in parallel computing and applications in industry, parallel system integration and data visualization

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 64 nCUBE Database Hardware and Software Infrastructure in NPAC

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
nCUBE 2
  • 64 computing nodes, 2 GB RAM
  • 32 disk I/O nodes, 256 MB RAM, 32 SCSI2 channels
  • 96 2 GB disks, 192 GB total
  • OS 3.3 (GFS) and 3.4 beta (PFS, NFS)
  • Oracle 7.1.3 Parallel Server with PDQ
  • Oracle Parallel Text Server 3.0 beta
  • Video-On-Demand software

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 65 IBM Database Hardware and Software Infrastructure in NPAC

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
IBM SP2
  • 8 Thin and 4 Wide nodes, 256 MB RAM each
  • 13/21 GB SCSI2 disks per node, 188 GB total
  • TB2 high-performane switch
  • parallel I/O supported via VSD software
  • database software
    • IBM parallel DB2 (beta)
    • Oracle 7.1.3.2 Parallel Server (beta)
Upcoming upgrade:
  • switched Ethernet/FDDI/ATM hook-up to NPAC backbone

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 66 InfoMall(NPAC) Parallel Database Evaluation and Benchmarking Service

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Planning, design, implementation, and interpretation of Decision Support benchmarks using parallel database technology
InfoMall product, supported by technology vendors
Standard and customer defined benchmarks
NPAC provides state-of-the-art infrastructure, MPP and parallel database expertise, and trained personnel
InfoMall assumes most of the risks and hazards involved in the benchmarking process
Non-disclosure arrangements available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 67 Integrated Decision Support Systems

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
InfoMall project
  • NPAC teams with a small software vendor, major software vendor, and hardware vendor to provide an integrated Decision Support system for a number of Fortune 500 companies
  • NPAC role:
    • help parallelizing the code kernel of the Decision Support package
    • interface the package to a parallel database system and provide capacity planning based on customer performance requirements and NPAC parallel database expertise
    • provide infrastructure and system support for project duration
  • Project deliverable: integrated turnkey DS system with all elements implemented of the same MPP platform

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 68 Overview of DR-LINK High Level Natural Language Database Interface

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
This research activity of Professor E. Liddy of Information Studies School at Syracuse University is being commercialized by InfoMall member -- the start-up company Textwise.
The Surface Realizations i.e. particular strings of words in documents or database quert, do not reflect the breadth, richness or precision of meaning intended by either
  • Author of Document
  • Presenter of Query
Simple word level representation cannot produce the recall and prescision possible when the full conceptual intent of queries and documents is the basis of retrieval
  • Thus DR-LINK matches on the concepts of interest to the user and not just the words of interest
DR-LINK achieves this deeper conceptual representation of texts by utilizing all the levels of language at which meaning is conveyed:
  • Lexical, Syntactic, Semantic and Discourse levels of language
  • DR-LINK combines these multiple levels of conceptual representation of text

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 69 Features of DR-LINK High Level Natural Language Database Interface

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Accepts lengthy ambiguous complex Natural Language Queries
Translates Natural Language queries into precise Boolean representations of the user's implied requirements for relevance of a document to query
Produces summary-level semantic SFC (Subject Field Codes) vector representations of query and documents for quick filtering of large databases
Catures Text Structure dimensions of time, source, state of completion, credibility, definiteness, intentionality and causality
Fulfills Proper Noun requirements of queries by:
  • Precise matching of focused Proper Noun requests
  • Accurate Proper Noun category-level matching
  • Expansion of categories to constituent members
Provides both high recall and high precision via controlled expansion of Complex Nominals
Evaluates Implicit and Explicit Semantics to assign similiarity between Query and Documents using its Integrated Matcher
Predicts how many documents need to be viewed to acieve the user-specified level of recall via the Cut-Off Criterion

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 70 Structure of DR-LINK High Level Database Interface

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Raw
Documents

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 71 Parallel Oracle7 RDBMS -- A Case Study

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 72 What is the Oracle Parallel Server -- Introduction ?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
A high-performance RDBMS for loosely coupled and MPP systems
Currently supported systems: IBM SP2, nCUBE2, SGI SMP, SUN SMP, HP SMP, Amdahl, Encore, KSR, CM5,
Also known as multi-instance Oracle: Multiple instances running against a single, shared database
A evolutionary approach on existing sequential Oracle RDBMS

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 73 Oracle Parallel RDBMS Architecture
Overall Structure

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Two independent options
  • Parallel Server (PS) - since v. 6.2
  • Parallel Data Query (PDQ) - since v. 7.1
Parallel Server option available for most Unix platforms
Parallel Data Query available for few SMPs
combination of PS and PDQ mostly in beta, available for a handful of SMP clusters and MPP machines

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 74 Oracle Parallel RDBMS Architecture
Parallel Data Query option:

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
parallelizes execution of a single SQL query
parallel execution carried out by concurrent Unix processes communicating via shared memory
single Oracle instance sufficient - no PS necessary
uses multiple CPUs on SMP architectures but can run on single CPU machines as well with modest performance benefits
essential for Decision Support
irrelevant for OLTP

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 75 Oracle Parallel RDBMS Architecture
Parallel Server option

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
lets multiple Oracle instances access the same database
requires a cluster of single CPU workstations, a cluster of SMPs, or an MPP to run
requires a Distributed Lock Manager to ensure data consistency
requires all disk storage accessible by each Oracle node
supports PDQ distributed across multiple instances
essential for OLTP and Òapplication partitioningÓ
relevant for scalability of ad hoc DS queries

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 76 What are the versions of the Oracle Parallel Server ?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
First introduced with Oracle 6.2
version 6.2 = version 6.0 (sequential) + Parallel Cache Management
version 7.1 = version 6.2 + parallel query processing
newest version 7.1.3 = sequential Oracle + Parallel Server Option + Parallel Query Option
production release on nCUBE is in Jan. 1995
All parallel implementation is transparent to users
Key technology: Parallel Cache Management and Parallel Query Processing
Key components: a Distributed Lock Manager and a Parallel Query Optimizer

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 77 What is the Oracle Parallel Server File Structure and Architecture?

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Oracle 7 Server Architecture
  • shown in next foil
Oracle 7 Database Files:
  • Data Files --- Contain all the database data; logical structure, such as tables and indexes, are physically stored in the data files
  • Redo Log Files --- Transaction logs that record all changes made to database information for recovery
  • Control Files --- File that record the physical structure of the database including all datafiles and redo log files

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 78 Oracle7 Server Architecture

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 79 Oracle Parallel Relational Database Server on nCUBE2 -- Screen 1 on Oracle's Parallel Server for nCUBE2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 80 Oracle Parallel Relational Database Server on nCUBE2 -- Screen 2 on Oracle's Parallel Server for nCUBE2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 81 Oracle Parallel Relational Database Server on nCUBE2 -- Screen 3 on Oracle's Parallel Server for nCUBE2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 82 Oracle Parallel Relational Database Server on nCUBE2 -- Screen 4 on Oracle's Parallel Server for nCUBE2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 83 The Oracle Server Instance

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Every time the Oracle 7 Server is started, the System Global Area (SGA) is allocated and the Oracle 7 Server background processes are started. The combination of memory buffers and background processes is called an Oracle Server Instance.
The SGA is a group of shared memory buffers allocated by the Oracle 7 Server for an instance
The background processes asynchronously perform distinct tasks on behalf of all database users of the instance

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 84 Oracle7 Parallel Server Characteristics

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Start an Oracle Server instance on each node
Each instance has its own set of redo log files
All instances can concurrently execute transactions against the same database and each instance can have multiple users executing transactions

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 85 The Oracle7 Parallel Server

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Each node of a parallel system runs an instance of the Oracle7 Server that accesses a shared database

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 86 Application Example of Parallel Server (functional parallelism)

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 87 Parallel Server Option

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Multiple Instances on different nodes share a single physical database
The data is accessed by all nodes through a parallel file subsystem
All instances have common data & controls files and their individual log files
All instances can simultaneously execute transactions against the single database, and each instance can have multiple users executing transactions
Data is kept consistent using PCM Technology
Implemented by the Distributed Lock Manager

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 88 Oracle7 on IBM SPX Architecture

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Oracle Components
  • Distributed Lock Manager
  • Oracle Parallel Server
IBM Components
  • High Performance Switch
  • Virtual System Disk (VSD)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 89 Parallel Cache Management (PCM) and Distributed Lock Manager(DLM)

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Concurrency control of buffer caches among instances
Maintain independent, yet consistent cache of blocks for all instances
Allow multiple nodes to simultaneous read/write to database (disks)
Maximizes concurrent data access and minimizes (disk) I/O activity
Block-level PCM locks, row-level locking preserved
A fault-tolerant DLM: if a node goes down, the DLMs will automatically recover and reconstruct the lock database on behalf of the failed DLM

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 90 Oracle Parallel Server On-Line Recovery

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Three possible failures

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 91 Example of failure due to lack of concurrency control

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Time Steps:
  • Step 1: Node 1 reads block 1 from disk and stores in Node 1 memory cache, Node 2 reads block 1 from disk and stores in Node 2 memory cache
  • Step 2: Node 1 writes information to block 1, Node 2 writes information to block 1
  • Step 3: Node 1 writes block 1 to disk
  • Step 4: Node 3 reads block 1 from disk and stores in Node 3 memory cache
In this situation, Node 1 and 3 have the same information in each memory cache, but Node 2 has an inconsistent cache.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 92 Discussion of PCM Lock Example

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
The example of previous and following foil, illustrates why concurrency control is needed in a parallel DB server environment.
It shows how contention for shared resources(data,disk access) is being handled under a multi-node parallel system sharing a single database.
The blackened boxes in the figure are the same blocks in each node (not rows !) requested by different nodes(instances) and located in different SGA bufferes.
Concurrency control needs to be implemented to ensure a consistant database,
i.e. the same block from the disk must be reflected as exactly the same copy in different buffers.
This is achieved by PCM locks.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 93 Contention for Shared Resources

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
3 Oracle Instances Accessing Shared database

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 94 Using Oracle 7 Server's PCM locks:

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
An instance acquires a PCM lock as a resource for update purposes
All buffer caches must contain the same copy of a resource
PCM blocks:
  • Ensure cache coherency by only allowing instances to change blocks while holding a lock
  • PCM lock work independently of transaction locks (row-level locking)
  • Ensure cache coherency using minimum amount of communication
  • PCM locks are allocated to files when a instance is started
  • Blocks are typical O/S disk blocks and of fixed size

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 95 Consider the following PCM Example

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Time
  • 1 Instance X attempts to update row 1 in (data block n)
  • Instance X becomes owner of PCM lock for data block n
  • Instance X update row 1
  • 2 Instance Y attempts to update row 2 in (data block n)
  • Instance Y requests the PCM lock from instance X for data block n
  • Instance X releases the PCM lock
  • Instance Y becomes owner of PCM lock for data block n
  • Instance Y update row 2
  • 3 Instance X attempts to update row 3 in (data block n)
  • Instance X requests the PCM lock from instance Y for data block n
  • Instance Y releases the PCM lock
  • Instance X becomes owner of PCM lock for data block n
  • Instance X update row 3
The PCM is handled separately from transaction lock -- when instance wish to update the same data (e.g.: Instance X and Y attempt to update Row 1), normal transaction locking is involved.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 96 Parallel Query Option

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Parallel Query Processing
  • enables the parallel processing of lon and complex queries with parallelizable operators such as scan,join and sorts
  • speeds up query response time (vs throughput in Parallel Server Option)
Parallel Data Load
  • Loads data in both a direct and parallel method
  • speeds up the loading of large data loads
Parallel Index Creation
  • speeds up the creation of large indexes
Parallel Recovery
  • enables concurrent recovery sessions

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 97 Parallel Query Processing: Overview

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Typical Data Flow with 5 steps:
  • 1 A client process issues a query
  • 2 The statement is parsed and a shared server process executes the statement
  • 3 The shared server process determines if the statement can utilize parallel processing, and then enlists two or more query servers to execute the statement in parallel. The shared server process partitions the work of each operation in the query equally among the query servers
  • 4 Query servers process certain operations in the query independently and return sub-results back to the shared server process
  • 5 The shared server process reassembles the results from the query servers and returns the query's results to the client

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 98 Query Processing

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 99 Three Processing Phases in Parallel Query

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
1 Parse --- a query optimizer determines optimal execution plan, parallelize query if a full scan was included in the execution plan (note the query is parallelized dynamically at execution time, automatically adapting changes in data partition and server configuration)
2 Execute --- perform operations
3 Fetch (Queries) --- return rows from query server processes
The execution and fetch phases can be performed in parallel ( pipelined parallelism) by multiple server processes

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 100 Example of parallel execution of a full scan

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
The shared query server is known as Query Coordinator who is responsible for parallelizing operations and interpreting the results from query servers
Not all operators can be parallelized in Oracle 7.1.3
Oracle Server Parallizable Operations
  • Table Scans
  • Sorts (for ORDER BY, GROUP BY, Joins, DISTINCT)
  • Joins (Sort-Merge, Nested Loops)
  • Aggregates (GROUP BY, MIN, MAX, AVG, etc )
  • SET operations (eg., UNION, INTERSECT or MINUS) are not parallelized
See following two figures

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 101 Parallel Query Execution

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Full Table Scan without Parallel Query Option
Multiple Query Servers Performing a Full Table Scan

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 102 The query coordinator dynamically partitions the work among the query servers

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 103 Data Partitioning in Oracle 7.1.3

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Oracle 7 on nCUBE2 and IBM SP2 are actually implemented as shared-disk (vs. shared-nothing) architecture
  • See earlier discussion of sophisticated nCUBE implementation
Shared database by sharing disks through parallel file system or parallel I/O system:
  • nCUBE -- the GFS (Global File System) over a Gigacache Subcube and nCUBE2's parallel I/O system;
  • IBM SP2 -- VSD (Virtual Shared Disk) parallel I/O system (over the HPS)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 104 File Striping in Oracle

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Automatic file striping and disk partitions by the operating system on nCUBE2
  • General OS and server level, less flexible, but relatively transparent to DBA
  • data skew is less possible, due to the Gigacache
Manually file striping at database table level on IBM SP2
  • Application-specific, flexible, but difficult to handle with large number of files and impossible to determine optimal striping length
  • data skew is highly possible

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 105 Benchmarking Parallel RDBMS

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 106 Industry Benchmarking Tests-I

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Transaction Processing Council
  • TPC A,B,C -- standards for OLTP benchmarks
    • TPCB OLTP without network
    • TPCA OLTP with a LAN or wide area network
    • TPCC Online Business Transaction Processing
  • TPCD -- not yet finalized DSS benchmark
Wisconsin Benchmark (aka Dewitt)
  • Initially a benchmark suite for DIRECT database machine but later used in academia and research for benchmarking parallel RDBMS with relational queries

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 107 Industry Benchmarking Tests-II

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Set Query
  • Complex queries and reporting for DSS applications and datamining applications
The Neal Nelson Database Benchmark (tm)
  • Multi-user UNIX business benchmarking for specific application domains and customers produced by Neal Nelson Associates
SPEC (Standard Performance Evaluation Corporation)
  • Industry body to coordinate Industry standard benchmarking in DB (and scientific) arena.
The Customer -- or worse >> The Consultant
  • Customized domain-specific benchmarking developed by cutomer's own applications and specifications

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 108 Characteristics of DSS Workloads

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Few insert/update/delete transactions, read-only queries.
Non-selective & data-intensive; few in number, large in size
May return a large volume of data: e.g.: mailing list
Queries frequently scan a large percentage of the data in the database
Often used to mine data in a recursive, drill-down fashion

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 109 TPC-D - a Decision Support Benchmark?
Overall Structure

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
A common suite for DSS benchmarking and comparing:
  • different software and hardware systems (such as Oracle 7 on nCUBE2 vs. DB2 on IBM SP2)
  • different software Products on the same machine (such as Oracle7 and DB2 on IBM SP2)
  • different machines in a compatible family (such as Oracle7 on IBM SP1 and SP2, or on a RS/6000)
  • different releases of a product on the same machine (such as Oracle 7.1.0, 7.1.2, 7.1.3 on nCUBE2)
Our TPCD benchmarking project has gone through most of the above examples

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 110 TPC-D - a Decision Support Benchmark?
Nature of Benchmark Examples

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
TPC-D definition requires that either one (power test) or many (scale-up test) query streams are run and timed against at least three sizes of a predefined database, with sizes growing exponentially (e.g., 1G, 10GB, 100GB sequence)
Query stream consists of 17 complex queries plus two concurrently running update operations
No extensive database tuning in allowed - optimizer hints are explicitly forbidden
timings for individual queries are not part of the TPC-D metric
Database structure well represents most common DSS DBs in applications, i.e. one largest table takes ~70% of the total DB size.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 111 TPC-D - a Decision Support Benchmark? Nature of Queries

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
model complex queries against a relatively large and complex database
All currently parallelizable relational operators (Oracle 7 Server) are used, including:
  • Aggregate (Group by) - Q2, Q6, Q11, Q14, Q15
  • Merge Join - Q8
  • Nested Loops - All except Q1
  • Sort (Group by) - All except Q2
  • Sort (Join) - Q8, Q12
  • Sort (Order by) - Q3, Q5, Q10, Q11, Q16
  • Sort (Unique) - Q4
  • Table Access (Full) - All

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 112 TPC-D - a Decision Support Benchmark?
Problems with TPC-D design

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Query stream design: Query 17 takes more than 95% of the time needed to execute the stream
Overal design unnecessarily mimics TPC OLTP benchmarks
Metrics more suitable for Data Warehouse applications than for ad hoc DS queries
Benchmark expensive to run (huge disk space requirements)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 113 TPC-D - a Decision Support Benchmark?
Scalability Issues

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Is TPC-D suitable for parallel DBMS evaluation?
  • No scalability performance metrics suitable for parallel data query evaluation i.e., query time vs. number of instances (CPUs), and/or, query time vs. available I/O resources ( I/O channels, drivers, disks and disk partitions)
Scalability of a parallel database system relies on both CPU resources and I/O resources.
By using multiple instances (CPUs), parallel server/query decreases query response time and increases throughput.
By using partitioned data on multiple disks (with multiple I/O channels), parallel server/query achieves higher I/O bandwidth for large DBs.
The two characteristics are equally important and must be benchmarked together. In current TPC-D specification, neither performance metrics is defined.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 114 TPC-D - a Decision Support Benchmark?
Relevance to Real Applications

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
For DS applications, query stream execution time is largely irrelevant. It is far more important to understand performance and scalability of elementary SQL operations and of typical queries encountered in customer application.
Loading and indexing may take a large portion of the overall time in preparing a large DB. Both parallel loading and indexing become important functionality of a parallel RDBMS. Their performance for large DSS databases should be defined and benchmarked, together with the query benchmarks. Currently none of them is defined or mentioned in the TPC-D specification.

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 115 TPC-D - a Decision Support Benchmark?
Overall Evaluation

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
TPC-D design contains flaws that make its use questionable even for sequential database systems
Given the complexity of the parallel database systems, attempts to reduce their performance estimate to one number are most likely futile
TPC-D defines no metrics useful for understanding behavior of parallel RDBMSes. TPC-D designers should re-examine academic research work pertinent to the field
Future research work towards definition of DS benchmark for parallel databases seems important

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 116 Some Lessons Learned in NPAC's Benchmarking

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Data partitioning must be very carefully designed to balance I/O and CPU
Parallel Loading is great ! --- almost linear speedup
Backup may be important to save time (we don't have)
The most difficult and trouble-making phase is the database preparation/loading/indexing phase which always involves storage management parameters
you must be both a DBA and sysadm and benchmarking relies critically on system stability
Be prepared to bring down the whole parallel system a couple of times a day
Be prepared that the results are negative (nothing may be obtained !)
you need at least twice the disk space of your original database size
performance tuning is extremely difficult with Oracle 7 parallel server

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 117 Integration of Database Server and Web Technology

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Gang Cheng
NPAC
111 College Place
Syracuse NY 13244-4100

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 118 Abstract of Web Oracle Integration Presentation

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
These foils were prepared by Gang Cheng and presented by Geoffrey Fox in CPS600
They contrast the strengths and weaknesses of Web and Database Information models
They describe Web, Conventional and DB-WEb linked Client Servers Models
They describe Oracle's WOW Oracle to Web Interface
They describe one of our experimental implementations which links mh mail to Oracle

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 119 Key points in Web Technology

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Internet as the major vehicle in networking industry
Driving force: information access and dissemination
WWW: Internet navigation and multimedia information access --- a killer networking application
Main WWW components today:
  • HTTP: the transport protocol
  • HTML: the makeup language
  • CGI: an interface to extend HTTP and HTML
  • Fill-Out Form: a set of HTML constructs to allow clients passing run-time input data to server through GUI interface
Client-server as the communication model
(Flat hierarchical UNIX) File system as the major file (data) management system
Focus on transparent networking navigation and a graphical user-interface for hypertext-based information dissemination

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 120 Key points in database technology

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Relational database management system (RDBMS) as the major data model in database industry
Driving force: information management, retrieval and processing
Database server: store, manipulate and manage data in a reliable and powerful information system
Main RDBMS components
  • a search engine: the database management system
  • SQL: the query language
  • embedded SQL interfaces: interfaces to extend the SQL non-procedure programming in high-level programming languages
  • database tools: various (GUI) application interfaces to the DBMS
  • client-server as the major communication model
  • customized proprietary network protocols built on TCP/IP as the major transport protocol
  • focus on file management and I/O processing for a powerful text-based searching and data manipulation system

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 121 The Synergy of Web Networking and Database Technologies

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
database techniques used in Web technology
  • data storage and data caching
  • index searching
  • data processing
networking techniques used in distributed database technology
  • distributed database
  • two-phase commit
  • data replication
  • client/server model

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 122 Weak Points in Web Technology

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
static, browser-oriented client
data (information in a narrow sense) repository
tedious to manage and update (HTML) files on the server side
difficult to keep most updated data automatically --- modifies usually done by a human using word processor or editing tools
A plain UNIX file system only supports primitive functions such as open,read/write and close.
  • More powerful operations on files are required for an information system which is supposed to provide information instead of raw data,
  • i.e., low-level data manipulation operators such as those provided in RDBMS: select, insert, update, commit, rollback,etc.
  • Meaningful data needs to be generated on-the-fly from raw data

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 123 Weak Points in Database Technology

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
a non-universal, non-portable remote networking access to database server
plain-text oriented, no multimedia
nonstandard front-end browser (most are line-based)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 124 By integrating a database server into Web technology, a Web server can be enhanced with:

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
a powerful backend text searching engine supporting complex queries
an efficient, reliable and sophisticated data management system supporting representation and organization of logical relationships among information entities
a dynamic data processing engine, provided with opportunity for optimized searching performance in large information systems
a time-dimension added HTML programming engine (see case study examples later)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 125 By integrating Web into database technology, a Database server can be enhanced with:

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
a transparent networking interface for remote database server access
a universal GUI-based search interface for hypertext-added data retrieval
a truly globally-shared data system for maximum information dissemination
an integrated multi-media information system

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 126 Interfacing Web server and a database server
-- Basic Features of WWW Client Server System

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Basic features in HTTP protocol -- a client-server model:
  • single transaction per connection:
  • stateless nature of the connection: end-user's previous connection run-time input will not be seen by current connection
client & server open the connection --->
client sends request messages to server --->
server locates requested information and sends
response back to client ----> client & server close the connection (note each URL represents a separate connection process)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 127 WWW Client Server Relationship

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Fill-out forms are currently used for the server to accept clients run-time input from a HTML page
Two different strategies for WWW client-server interactions:
  • Pre-edited HTML files (links) --- static and plain ASCII read-only files, identified by a file path; exist on disk; no need to process when used by the server (server performance is only determined by disk -> core memory I/O transfer)
  • CGI scripts --- programs (shell,perl,programming languages) to generate data (usually in HTML format) to be returned to the server (which will then be transferred to the requesting client in HTTP protocol); the resultant data are dynamically generated on the fly; pipes are used to feed the output (via STDOUT) to server; server performance is mostly determined by CPU processing)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 128 Basic features in database access -- a SQL-based query interface

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
SQL is the standard way to extract(query) data out from the database.
Low-level SQL-based interfaces are available on different vendors RDBMS
most commonly used one: Embedded SQL in C
SQL programs: static (compiled) SQL and dynamic (interpretive) SQL

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 129 Basic Client-Server model

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
clients send SQL statements to server --->
server parses and executes the SQL against the current database --->
returned rows are buffered in server's main memory --->
server fetches buffered rows and sends them to clients

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 130 Major approaches for interfacing web into DB server environment

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Embedded SQL in C + CGI (C program)
Form-based relational querying and text searching
Different approaches for different vendor's RDBMS systems (Sybase,Oracle,Informix, DB2, etc)
more details can be found at "http://greatwall.npac.syr.edu:1963/PDB/
web-db.html"

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 131 Basic WWW DB Integration Model

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Note: Web Server and DB client must be on the same machine,
while DB client and DB server can be on different machine and a proprietary networking protocol (e.g., SQL*NET for Oracle7) is used.
Inter-process Comunication (IPC) is used for the Web server to communicate with DB client on the same machine

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 132 Oracle approach to Web-RDBMS Integration

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Data and SQL engine
  • use tables, instead of static files, to store data
  • build a schema on data tables to define data and the relationships of different information entities (the relational model)
  • Web pages can be created, managed and delivered in more powerful way
  • SQL operators for data and file processing: Select data from tables to provide accurate and up-to-date information, and data can be entered, updated through a web page to keep a consistent information base

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 133 Procedural and Programmable mechanisms

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
use PL/SQL for data manipulation and processing.
PL/SQL is
  • Oracle's procedural language for SQL application development
  • support stored procedures, database triggers, optimized concurrent data access, and executable programs as meta-data (i.e., programs are stored together with and attached to the data objects to be manipulated on)
  • a database programming language with expressive capability between C and SQL: loops, decisions, modules and function calls, subroutines, etc.
  • database server is both a SQL engine and a compute engine
  • programmable HTML pages

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 134 Web-Oracle-Web (WOW) --- a Web-Oracle interface tool for developing gateways

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
in a specific environment: PL/SQL stored programs
major features:
  • a utility to develop CGI gateways for Web-servers with PL/SQL in an Oracle7 database
  • Wow procedures are executed within an Oracle database.
  • They automatically inherit the following characteristics of Oracle7 database server:
  • Scalability, Portability, National Language Support,
  • Multi Threaded Servers, Replication, Distribution, Administration (SNMP), Object encapsulation (PL/SQL packages)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 135 Basic components in WOW

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
A small shell-script to set environment-variables
A C stub called wowstub implemented as a normal CGI program
A PL/SQL package called wow
An optional, but strongly recommended set of PL/SQL packages called HTP and HTF. These encapsulate HTML formatting, and will make migration to future HTML versions easy
A stand-alone PL/SQL compiler with CGI enhancements. The compiler is a
stand-alone gateway that will allow you to execute pl/sql code directly from within an HTML document

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 136 Client-Server Architecture for WOW

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Note: the gateway wowstub program simply passes PL/SQL program name and input parameters gathered from forms to DB server.
The DB server does both SQL query and HTML processing/formatting

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 137 Web-oracle Mail Database: a case study of integrating Oracle7 database server into Web

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
implemented using wow
another Web interface to MH mailbox (similar to xmh but under web)
a enhanced backend mail archives server with advanced search capability
basic components
  • a MH mail -> Mail tables parser in perl
  • a Entity-Relation Model for mails
  • a PL/SQL package for basic data query and processing functionality
  • the CPS600 demonstration mail database is available at URL: http://kayak.npac.syr.edu:1963, under CPS600's Demo Mail Database

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 138 The Entity-Relation Model

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Note: field definition of each table is not shown here;
Inserting a parsed MH mail into the database will affect the tables:
folder,msg header,msg body and people

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 139 Functionality of the mail database

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
home page --- generated by a PL/SQL procedure which first stores the client's information of the current connection to a 'visitor' table and then generates links of query options. clients history can be easily pull over from the database. (fig. 1)
query the mailbox --- find out the summary information about the mailbox (fig. 2)
query the folder --- choose a folder from all folders currently available in the database (fig. 3), and find out summary information of the chosen folder (fig. 4)
All the above queries are summary information which are meta-data (accumulated or statistic data about basic tables) and produced at the time when clients requested (done by simple SQL calculation statements)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 140 Web to Oracle Interface Screen 1: Home Page for the Web-Mail Database

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 141 Web to Oracle Interface Screen 2: Query the Mailbox

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 142 Web to Oracle Interface Screen 3: Query the Folder -- Choose a Folder

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 143 Web to Oracle Interface Screen 4: Query the Folder

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 144 Query by mail header

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Client chooses all folders or a folder currently available in the database (fig. 5)
browse all mail headers (show msgid,date,sender,size in bytes and subject) (fig. 6)
show chosen mail's body after the mail header browsing (fig. 14)
optionally show the complete original MH mail in UNIX mailbox (fig. 15)
Note that full table scan only on the msg header table is required to list all the mail header in a folder (msg body is not touched during the header browsing).
This is Optimized browsing performance, compared to MH header browsing where header and body are stored in a single file and are not indexed

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 145 Web to Oracle Interface Screen 5: Query by Mail Header -- Choose a Folder

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 146 Web to Oracle Interface Screen 6: Query by mail header

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 147 Web to Oracle Interface Screen 14: Query Result -- Browse Mail Body --1

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 148 Web to Oracle Interface Screen 15: Query Result -- Browse Mail Body --2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 149 Query by date

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Most recent five mails currently in the db
mail related to now (the exact time as clients click the query-by-date page) in all folders:
  • the same hour/day/week/month/ (this hour/today/this week/this month)
  • one day before the same day (last hour/yesterday/last week/last month) (fig . 7)
mail related to today (the same day as clients click the query-by-date page) in selected folders:
  • N days/weeks/months ago from today (N >=0)
  • N hours ago from now
mail at any specified time or within any specified time range. Choose from hour(1-12), am/pm, day(1-31), month(1-12), year (fig. 8)
system clock will be consulted to form above queries

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 150 Web to Oracle Interface Screen 7: Query by Date --1

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 151 Web to Oracle Interface Screen 8: Query by Date --2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 152 Query by Subject

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Query by a keyword --- return mails containing the entered keyword (non-indexed pattern matching, future option: context searching using Oracle TextServer search engine) (fig. 9, 10)
the keyword can be a typed in or chosen from a table of previously entered keywords
search in all folders or a selected folder from currently available folders in the DB
search keyword occurrence in:
  • both mail header (subject line) and body, or
  • mail header only, or
  • mail body only
search with either case insensitive or sensitive for the keyword
typein keyword will or will not be added to the DB table for future use or keyword in the current DB table will or will not be removed from the DB
database insert/delete operations are used to add/remove keywords
in the table --- the power of remembering clients interactions on the web

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 153 Web to Oracle Interface Screen 9: Query by Subject --1

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 154 Web to Oracle Interface Screen 10: Query by Subject --2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 155 Query by Sender

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Query all mail sent from a selected sender, based on all senders gathered in the current mail database (fig. 11)
sender is uniquely identified by his/her userid
a specified folder can be used to narrow the searching, as in many other query options
Note the sender list is not pre-entered or edited. It is automatically built when a new mail is being incorporated into the database from the MH mail folder by the parser --- another dynamic feature of web-db integration

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 156 Web to Oracle Interface Screen 11: Query by Sender

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 157 Incorporating new mail into the Database from your UNIX mailbox --I

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
For security and database consistence reason, userid/password at the database level (not Mosaic page-level) is enforced to update the database (fig. 12)
Two kinds of folders: MH folder in UNIX file system and folder table in the database system. They hold the same information but in different places.
Original MH mail (in MH format) in a folder must be converted into the database tables (in Oracle table format) to enable all the above query options on the Oracle db server
Assume each database folder must have a unique MH folder in UNIX file system. The database folder is created when the first mail in the UNIX MH folder is being incorporated in the database

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 158 Web to Oracle Interface Screen 12: Incorporate New Mail from Web Page -- Choose a folder

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 159 Incorporating new mail into the Database from your UNIX mailbox --II

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Before incorporating new mail from a UNIX folder, the PL/SQL procedure first finds out current maximum msg id in the corresponding DB folder and only try to parse/incorporate those MH mails with msg id greater than the maximum msg id in DB (fig. 13)
If the the MH folder contains subfolders, new mails in all subfolders will be considered to incorporate, and this is a recursive process.
Implemented by both a perl script (for parsing) and PL/SQL procedures
A time-intensive process as insertions may be occurred on up to four DB tables which are indexed. But this process can be done in batch !
Incorporating can be done at the same time when other query options are requested by other clients on the web. The DB server will ensure the query concurrency and data consistency --- a reliable information system !

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 160 Web to Oracle Interface Screen 13: Incorporate New Mail

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 161 Format of Query results from all query options

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
List of mail headers satisfying the query (show msgid,date,sender,size in bytes and subject) (fig. 6)
Show chosen mail's body after the mail header browsing (fig. 14)
Optionally show the complete original MH mail in UNIX mailbox (fig. 15)

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 162 Web to Oracle Interface Screen 6: Query by mail header

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 163 Web to Oracle Interface Screen 14: Query Result -- Browse Mail Body --1

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 164 Web to Oracle Interface Screen 15: Query Result -- Browse Mail Body --2

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * Critical Information in IMAGE
Full HTML Index
Produced by Gang Cheng April 1995

There is a larger Better Quality Image available

HELP! * BLUE=global GREY=local HTML version of GLOBAL Foils prepared July 23,1995

Foil 165 Query options to be added in the future to Oracle-based Web Mail System

From Full Relational Database HPDC-4 Presentation HPDC-4 Tutorial/CPS616 Spring95 -- August 1, Pentagon City Virginia. * See also color IMAGE
Full HTML Index
Query the People
Query By Mail Type
Query By Date and Sender
Query By Date and Subject
Query By Date and Sender and Subject
Interface the mail database with WebMail tools (CPS600 server)
Another example using WOW for a similar database can be found at
http://kayak.npac.syr.edu:1963 under link TravelVenture Demo for "travel agency" application

Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

If you have any comments about this server, send e-mail to webmaster@npac.syr.edu.

Page produced by wwwfoil on Tue Feb 18 1997