Untitled presentation CPS 616 January-April 1996 Computational Science Track on Base Technology for the Information Age: Oracle RDBMS and Web Integration Module 1 - A Technical Introduction to Oracle RDBMS Instructor: Geoffrey Fox Version 13 March 96 teamed with Gang Cheng, Chao-Wei Ou Syracuse University 111 College Place Syracuse New York 13244-4100 Abstract of: Module 1 - A Technical Introduction to Oracle RDBMS This starts with a summary of four major modules (this is first one of the four) What is a Relational Database A Phone-List Example for Motivation and Illustration of Concept Major Schema Objects and Structures in Oracle with examples Data Integrity and Data Types in Oracle Data Concurrency and Consistency in Oracle Full Summary of CPS616 Oracle/Web Presentation - I The Oracle/web part of CPS616 consists of three presentation Sub-modules and One Lab Session with a Hands-on Tutorial Module 1 - A Technical Introduction to Oracle RDBMS A Phone-List Example for Motivation and Illustration of Concept Major Schema Objects and Structures in Oracle Data Integrity and Data Types in Oracle Data Concurrency and Consistency in Oracle Module 2 - Data Access and Database Programming in Oracle Oracle SQL and Examples Programming in PL/SQL Oraperl - Access Oracle RDBMS from Perl Full Summary of CPS616 Oracle/Web Presentation - II Module 3 - Integration of Web and Oracle RDBMS The Synergy of Web and Database Technologies Architectures of Web and Oracle RDBMS Integration A Technical Overview of The Oracle-Web Integration Examples of a Web-based Search Interface for the Phone List Database Module 4 - Hands-on Lab Session Basic Steps in Using Oracle Procedures to Build an Oracle Database Build the Web/Oracle Phone List Database Example By Yourself Practice Examples in the Lectures on An Oracle RDBMS Server Handout Materials are also Associated with Each Module for Details A Technical Introduction to Oracle RDBMS Major features in database and information management large databases and space management control many concurrent database users high transaction processing performance high availability controlled availability industry accepted standards manageable security database enforced integrity client/server (distributed processing) environments distributed database systems portability, compatibility and connectability Example Problem Description -I Through out the lectures, we use a simple example to illustrate important concepts, features and syntax in Oracle7 This example will also be used to demonstrate the Oracle/Web integration and in the lab session to build a web application with Oracle server This simple application is a phone list database, such as NPAC phone list Basic requirement is to build such a database so as to locate a personÕs phone number with different query criteria Example Problem Description -II Database design is a major area in RDBMS and system analysis. We will not discuss this key database issue, and assume that the phone list database is represented in two tables which represent two entities in the modeled application: person entity and phone entity Table 1 - personal information. Table name : person_info_table Example problem description -III Table 2 - phone number information. Table name : phone_list_table Example problem description -IV Basic Assumptions in the Entity-Relationship (E-R) Model: The pid is unique in person_info_table (can be viewed as SSN) One person may have multiple phone numbers of the same or different type A phone number can be shared by more than one persons The Relational Database Model A database model is a set of definitions describing how real-world data is conceptually represented as information in the computer The relational model - database collects related information in a set of tables. A table either contains the attributes of an entity or the relationships between entities In the example problem, people and phone numbers are entities; Title, uid etc. are attributes of the person entity; type is attribute of phone number entity pid is used to set up relationships between entities Examples of Attribute and Entity Relationship Tables Simple Attribute Tables for: Person Phone Room While a set of Tables could describe Many:Many(1) relationships Relational Rules -- 1 Rule 1: First Normal Form Rule - a table must have no multi-valued columns example: in phone_list_table pid phone no type 3,4 315-443-9209 office such a row is not allowed in RDBMS Relational Rules - 2 Rule 2: Access rows by content only rule - we can only retrieve rows/columns by their content (attribute values that exist), i.e., there is no order defined for the rows or columns Example: you cannot issue the query Ôget the 2nd row in the person_info_tableÕ, or Ôget the 1st column in all the rows of phone_list_tableÕ. You can only do Ôget the row from person_info_table whose pid equals to 2 Õ, or Ôget the column values of all the rows in phone_list_table where the column is named as phone no Õ Relational Rules - 3 Rule 3: The uniqueness rule - two rows in a table cannot be identical in all column values at once Example: the two rows are not allowed to coexist in the phone_list_table pid phone no type 3 315-443-9209 office 3 315-443-9209 office Relational Algebra All base RDBMS operations and SQL operators are built on relational algebra theory Set operations - union, intersection, difference, product Native relational operations - project, select, join (outer join, left outer join, right outer join, theta join), division Database Structure and Space Management in Oracle Database Structures - well-defined objects represented as tablespace and other schema objects that store the data of a database Database Operations - actions to manipulate the data and their structures Integrity Rules - relational rules to govern proper operations on data and their structures Major Schema Objects in Oracle - Tablespace I Tablespace - a database is divided into logical storage units called tablespace A tablespace is used to group related logical structure together Tablespaces commonly group all of an applicationÕs objects to simplify certain administrative operations It is Database Administrator (DBA)Õs job to define tablespaces and assign them to users Major Schema Objects in Oracle - Tablespace II This figure illustrates Each Oracle DB is logically divided into one or more tablespaces One or more data files are explicitly created for each tablespace to physically store the data of all logical structures in a tablespace The combined size of a tablespaceÕs data files is the total storage capacity of the tablespace (SYSTEMS has 2M while USERS has 4M) The combined storage capacity of a databaseÕs tablespaces is the total storage capacity of the database (6M in example above) Major Schema Objects in Oracle - III -- Table Table - the basic unit of data storage in Oracle. The tables of a database hold all of the user-accessible data Each table is defined with a table name and set of columns Each column is given a column name, a datatype (CHAR, VARCHAR2, DATE or NUMBER), and a width Once a table is created, valid rows of data can be inserted into it. The tableÕs rows can then be queried, deleted, or updated Define A Table Structure in Oracle - An Example To create the two tables for the phone-list database, we would have Major Schema Objects in Oracle -IV- View View - customized presentation of the data in one or more tables Views do not actually contain or store data They derive their data from the tables on which they are based, referred to as the base table Like tables, views can be queried, updated, inserted into, and deleted from, but with some restrictions All operations performed on a view actually affect the base tables of the view Major Use of Views in a Database I Provide an additional level of table security by restricting access to a predetermined set of rows and columns of a table. For example, a view of a table can be created so that columns with sensitive data (for example, SSN) are not included in the definition of the view Hide data complexity. For example, a single view can be used to create a join, which is a display of related columns or rows in multiple tables. However, the view hides the fact that this fact that this data actually originates from several tables Major Use of Views in a Database II Simplify commands for the user. For example, views allow users to select information from multiple tables without requiring the users to actually know how to perform a correlated subquery Present the data in a different perspective from that of the base table. For example, views provide a means to rename columns without affecting the tables on which the view is based Store complex queries. For example, a query might perform extensive calculations with table information. By saving this query as a view, the calculations are performed only when the view is queried Define a View in Oracle - An Example Suppose weÕd like to create a view (virtual table) called staff_info_table which has all the personal information of NPAC staff. This view will use the base table person_info_table with the position column has only staff related value Major Schema Objects in Oracle -V -- Sequences Sequence - a sequence generates a serial list of unique numbers within a single numeric column of a databaseÕs tables Sequences simplify application programming by automatically generating unique numerical ÒlabelsÓ for the rows of tables Use them for attributes such as pid -- person ID in phone list where must generate a unique label for every row Major Schema Objects in Oracle -VI - Procedures etc. Procedure or Function - a set of SQL and PL/SQL statements grouped together as an executable unit to perform a specific task Procedures and functions allow user to combine the ease and flexibility of SQL with the procedural functionality of a structured programming language Using PL/SQL, such procedures and functions can be defined and stored in the database for reuse Package - provides a method of encapsulating and storing related procedures, functions, and other package constructs together as a unit in the database. Packages provide DBA or developer organizational benefits, as well as increased functionality and database performance more details about procedure/function/package will be covered in ÔPL/SQLÕ section in second module Major Schema Objects in Oracle -VI- Synonym Synonym - an alias for a table, view, sequence, or program unit A synonym is not actually an object itself, but instead is a direct reference to an object A synonym can be public or private. A Private synonym is only visible to the user who created it. DBA most often create public synonyms that make the base schema object available for general, system-wide use by any database user Major Schema Objects in Oracle -VII-Synonym Synonyms are used to mask the real name and owner of an object provide public access to an object provide location transparency for tables, views, or program units of a remote database simplify the SQL statements for database users For example, to allow all users to share the WOW PL/SQL package called htp which is currently owned by a user Ôcps616Õ. A public synonym must be created and certain permissions must be granted: CREATE PUBLIC SYNONYM htp for cps616.htp GRANT execute on htp to public Major Schema Objects in Oracle -VIII - Indices Index, Cluster (described later) - are optional structures associated with tables, which can be created to increase the performance of data retrieval Indices are created on one or more columns of a table. An Oracle index provides a faster access path to efficiently locate requested rows Create an index in Oracle - An Example We knew that Last name may be the most frequently used column in query. An index on this column would improve most queries using this field, in particular when there are many rows in the person_info_table and most of them have distinct values on this column CREATE INDEX last_name_index ON person_info_table(last_name); When Should one Create Indices ? An Index can improve the performance of queries that select a small percentage of rows from a table. A general guideline is to create indexes on tables that are queried for less than 2% - 4% of the tableÕs rows, with the following assumptions: Rows with the same value for the column on which the query is based are uniformly distributed throughout the data blocks allocated to the table Rows in the table are randomly ordered with respect to the column on which the query is based Each data block allocated to the table contains at least 10 rows The table contains a relatively small number of columns Most queries on the table have relatively simple WHERE clauses How to Choose Columns to be Indexed Consider indexing columns that are used frequently in WHERE clauses Consider indexing columns that are used frequently to join tables in SQL statements Only index columns when only a few rows have the same value in the columns indexed Do not index columns with few distinct values Do not index columns that are frequently modified Do not index columns that only appear in WHERE clauses with functions or operator Consider indexing foreign keys (see later) of referential integrity constraints in cases in which a large number of concurrent INSERT, UPDATE, and DELETE statement access the parent and child tables When choosing whether to index a column, consider whether the performance gain for queries is worth the performance loss for INSERT, UPDATE, and DELETE statements and use of the space required to store the index Major Schema Objects in Oracle - IX Clusters are groups of one or more tables physically stored together because they share common columns and are often used together. Because related rows are physically stored together, disk access time improve Data Dictionary - each Oracle server has a data dictionary which is a set of tables and views that are used as a read-only reference about the database (meta-data), such as logical and physical structure of the database, users, tables definition, integrity constraints, etc. For example, you can see your disk space quota: select * from user_ts_quotas; Data Integrity in Oracle -I It is very important to guarantee that data adheres to certain rules which are determined by application For example, assume that a business rule says that no row in the phone_list_table should be allowed unless its person_id value found in the person_id column in the person_info_table (this is to ensure that a personÕs entity must exist first before the phone number information can be established) If an insert or update statement attempts to violate this integrity rule, Oracle must roll back the invalid statement and return an error to the application In order to achieve this protection, Oracle provides integrity constraints For example, a foreign key (see later) can be defined on the column person_id of phone_list_table which references to the primary key person_id of the person_info_table Data Integrity in Oracle -II Integrity Constraints - a declarative way to define a business rule for a column of a table. They are defined with a table and are stored as part of the tableÕs definition Following integrity constraints are supported by Oracle: NOT NULL - disallow nulls (empty entries) in a tableÕs column UNIQUE - disallow duplicate values in a column or set of columns PRIMARY KEY - disallow duplicate values and nulls in a column or set of columns FOREIGN KEY - requires each value in a column or set of columns match a value in a related tableÕs UNIQUE or PRIMARY KEY (FOREIGN KEY integrity constraints also define referential integrity actions that dictate what Oracle should do with dependent data if the data it references is altered) CHECK - disallows values that do not satisfy the logical expression of the constraint Data Integrity in Oracle -III For example, to enable necessary Integrity constraints, we should redefine the two tables in our example as follows : Datatypes in Oracle -I CHAR - stores fixed length character strings. 1 - 255 chars When a row is inserted into or updated in the table, the value for the CHAR column has a fixed length If a shorter value is given, the value is blank-padded to the fixed length If a longer value with trailing blanks is given, blanks are trimmed from the value to the fixed length If a value is too large, Oracle returns an error VARCHAR2 - stores variable-length character strings. 1 - 2000 chars For example, assume a column is declared VARCHAR2 with a maximum size of 50. If only 10 chars are given for the column in a particular row, the column in the rowÕs row piece only stores the 10 chars (10 bytes), not 50 CHAR and VARCHAR2 have different comparison semantics in string comparisons Datatypes in Oracle -II NUMBER - stores fixed and floating point numbers. Numbers of virtually any magnitude can be stored and are guaranteed portable among different systems operating Oracle, up to 38 digits of precision positive or negative numbers in the range 1 x 10 -130 to 9.99..9 x 10 125 DATE - stores point-in-time values (e.g. dates and times) in a table. It stores the year (including the century), the month, the day, the hours, the minutes, and the seconds. Julian dates are allowed for continuous dating from a common reference Oracle date arithmetic takes account the anomalies of the calendars used throughout history ROWID - every row in a non-clustered table is assigned a unique ROWID that corresponds to the physical address of a rowÕs row piece ROWIDs are the fastest means of accessing particular rows can be used to see how a table is physically organized they are unique identifiers for rows in a given table Datatypes in Oracle -III LONG - store variable-length character data containing up to 2 GB non-searchable, i.e., can not be queried as in where, like clause used mostly for text data to be converted when moving among different systems used in data dictionary to store the text of view definitions can be used in SELECT lists, SET clause of UPDATE, and VALUES clause of INSERT some restrictions in SQL statements RAW and LONG RAW - for data that is not to be interpreted by Oracle. Intended for binary data or byte strings Oracle allows some restricted implicit datatype conversions and provides a set of SQL functions for explicitly datatype conversions See associated resources for more details on Oracle datatypes Data Concurrency and Consistency in Oracle -I Any RDBMS must have the software mechanisms to achieve the following important requirements of an information management system : Data must be read and modified in a consistent fashion Data concurrency of a multi-user system must be maximized High performance is required for maximum productivity from the many users of the database system Concurrency control - simultaneous access of the same data by many users must ensure data integrity for concurrent updates, inserts, deletes and queries of the same data from many users as little interference as possible among concurrent transactions must be achieved and destructive interactions must be prevented. Oracle resolves such issues by using various types of locks and a multi-version consistency model based on the concept of a transaction Data Concurrency and Consistency in Oracle -II Transaction - a logical unit of work that comprises one or more SQL statements executed by a single user A transaction begins with the userÕs first executable SQL statement A transaction ends when it is explicitly committed or rolled back Transactions provide the database user or application developer with the capability of guaranteeing consistent changes to data, as long as the SQL statements within a transaction are grouped logically A transaction should consist of all of the necessary parts for one logical unit of work - no more and no less Data in all referenced tables are in a consistent state before the transaction begins and after it ends. Transactions should consist of only the SQL statements that comprise one consistent change to the data Oracle RDBMS Supports Read-Consistency -I Guarantees that the set of data seen by a statement is consistent with respect to a single point-in-time and does not change during statement execution (atomic operations or statement-level read consistency) Ensure that readers of database data do not wait for writers or other readers of the same data Ensure that writers of database data do not wait for readers of the same data Ensure that writers only wait for other writers if they attempt to update identical rows in concurrent transactions The simplest way to think of OracleÕs implementation of read-consistency is to imagine each user operating a private copy of the database, hence called the multi-version consistency model Oracle RDBMS Supports Read-Consistency -II When an update (written) occurs, the original data values changed by the update remains part of an uncommitted transaction, any user that later queries the modified data views the original data values Only when a transaction is committed are the changes of the transaction made permanent Only statements that start after another userÕs transaction is committed see the changes made by the committed transaction A transaction is key to OracleÕs RDBMS for providing read-consistency. This unit of committed (or uncommitted) SQL statements : dictates the start point for read-consistent views generated on behalf of readers controls when modified data can be seen by other transactions of the database for reading or updating Data Concurrency and Consistency in Oracle -I Locking - control concurrent access to data Intended to prevent destructive interaction between users accessing table data Locks are used to achieve two important database goals Consistency - ensure that the data a user is viewing or changing is not changed (by other users) until the user is finished with the data Integrity - ensures that the databaseÕs data and structures reflect all changes made to them in the correct sequence Data Concurrency and Consistency in Oracle -II Oracle provides automatic (implicit) locking and manual (explicit) locking Automatic locking - locking is performed automatically and requires no user action depending on the action requested, implicit locking occurs for SQL statements as necessary Row-level locking and table-level locking One or more Oracle lock managers maintain locks Two general types of locks - exclusive locks and shared locks Manual locking - explicitly issued locks by a user on certain resources override default locking can be at either row-level or table-level can be either exclusive or shared