CEWES WWW Site Management System Focused Effort

REQUIREMENTS/DESIGN

The WWW Site Management System (WSMS) is intended to facilitate the
day-to-day operation and maintenance of the CEWES MSRC and PET web
sites. Issues pertaining to the _content_ of the site, and tools
focused on content (HTML editors, graphical design tools, etc.), are
outside the scope of WSMS.

WSMS will provide "prototype", "production", and "archival" views of
the web site. Content developers will have relatively free access to
the prototype area, while materials will be staged from the prototype
area to the production area under the control of a limited number of
individuals with review authority. Old documents will be kept in the
archival area for future reference and will be available to content
developers on a read-only basis. The purpose of the archival area is
not to reconstruct the production web site as it existed at a certain
point in time, but rather to let content developers and webmasters
retrieve recent versions of documents for reference. Facilities will
be provided to cull the archive, automatically or manually, based on
the age of documents and/or the number of versions.

The security model includes access control lists (ACLs), which will
allow different groups of users to be given different access
permissions to each document or directory. By default, new files and
directories will inherit the ACL of their parent directory. ACLs will
function similarly to those in the AFS or DFS file systems. For
example, a "pet-cfd" group might be defined and given read/write
access to all of the files in the PET CFD area of the site. There
will be pre-defined groups representing users who have properly
authenticated themselves to the WSMS web server and users who have
not (analogous to the system:authuser and system:anyuser groups in
the AFS file system).

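
The ACL behavior described above can be illustrated with a minimal
sketch. It is not the WSMS design itself, which is planned around
Oracle; it only shows inheritance at creation time and group-based
checks. All names here (Node, create, allowed, the "wsms.authuser" and
"wsms.anyuser" group names, and the "read"/"write" rights) are
hypothetical. The sketch also assumes, as in AFS, that the any-user
group covers every visitor, authenticated or not.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical names for the two pre-defined groups; the real WSMS
# names are not specified in this document.
AUTHUSER = "wsms.authuser"   # any properly authenticated user
ANYUSER = "wsms.anyuser"     # assumption: every visitor, as in AFS


@dataclass
class Node:
    """A file or directory with an ACL mapping group name -> rights."""
    name: str
    acl: Dict[str, Set[str]] = field(default_factory=dict)
    children: Dict[str, "Node"] = field(default_factory=dict)

    def create(self, name: str) -> "Node":
        # A new entry starts with a copy of its parent's ACL; later
        # changes to the parent do not propagate to it.
        child = Node(name, {g: set(r) for g, r in self.acl.items()})
        self.children[name] = child
        return child


def groups_for(user_groups: Set[str], authenticated: bool) -> Set[str]:
    """Return the effective groups of a visitor."""
    groups = {ANYUSER}
    if authenticated:
        groups |= {AUTHUSER} | user_groups
    return groups


def allowed(node: Node, right: str, user_groups: Set[str],
            authenticated: bool) -> bool:
    """True if any effective group holds the requested right on node."""
    effective = groups_for(user_groups, authenticated)
    return any(right in node.acl.get(g, set()) for g in effective)


# Example: the PET CFD area is world-readable and writable by "pet-cfd".
root = Node("/", {ANYUSER: {"read"}})
cfd = root.create("pet-cfd-area")
cfd.acl["pet-cfd"] = {"read", "write"}
page = cfd.create("index.html")   # inherits the ACL of pet-cfd-area

print(allowed(page, "write", {"pet-cfd"}, authenticated=True))
print(allowed(page, "write", set(), authenticated=True))
```

Copying the parent ACL when an entry is created, rather than looking
it up on each access, keeps the check local to one node; whether WSMS
would copy or resolve dynamically is left open by the text above.
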
Metadata to be stored with each content file includes

  o author
  o access control list
  o date of last modification
  o 'expiration' date
  o reviewer
  o date of review
  o date of staging from production to archive

Content developers will interact with WSMS via a command-line
interface along the lines of RCS or CVS. They will create or modify
documents using their favorite tools on their favorite system. When
ready, they will move documents to the WSMS host machine and "check
in" their changes to the prototype area. They can also "check out"
documents from any of the three areas, and can request staging of
documents from the prototype area to the production area.

Each directory can have an authorized reviewer associated with it
(though in this case we expect just two, one for the MSRC tree and
one for the PET tree). The reviewer is notified when developers
request staging of documents from prototype to production. A
web-based interface will provide the reviewer with a check-off list
of documents to review.

WSMS will provide tools and reports to help understand and maintain
the structure of the site, including

  o lists of dead links
  o lists of orphaned files (those which are not referenced by any
    other file on the site)
  o lists of files according to dates kept in metadata

It will be up to the document owner and/or webmaster to rectify these
problems (in fact, they will not always be problems).

WSMS will provide grep- and sed-like capabilities for all files in
the database.

WSMS will provide hooks to allow external tools to be applied to sets
of files. For example, an HTML checker, or a tool to require that
every IMG tag has an ALT attribute, could be applied to a set of
files. Such tools could be applied at will by content developers to
files in the prototype area, or by the review authority prior to
staging from prototype to production.

WSMS will link with a search engine and will use user authentication
and access control lists to limit access to search results.
Primarily this will mean that the prototype and archive areas can be
searched only by properly authenticated users, but it will also be
possible to use ACLs to control access to the production area.

WSMS will _support_ dynamic content (direct database queries) as part
of the web site, but the implementation of that content is not part
of WSMS. This distinction is made because the location, access
control, and interoperation of the databases to be queried raise
issues that do not fall under the purview of a web site management
system. Desires were expressed for the following dynamic content:

  1) Software configuration report
  2) Training schedule (CEWES and coordinated)
  3) System usage taxonomy (not currently automated)

IMPLEMENTATION

There are two options for implementation of WSMS:

  1) WSMS implemented WITHOUT direct WWW access (using the Oracle
     database, but not the Oracle web server). In this case, the
     prototype and production WWW trees would be periodically
     exported to the web server machine.

  2) WSMS implemented WITH direct WWW access (using the Oracle web
     server). In this case, WSMS would serve content directly from
     the database. This is the recommended option.

HARDWARE REQUIREMENTS

  Sun SPARC system running Solaris 2.4 or 2.5
    OR
  SGI system running IRIX 5.3

  Minimum 256 MB memory recommended

  Hard disk requirements depend on data size (current MSRC+PET web
  site ~2.5 MB, plan about 4x for WSMS space due to multiple areas,
  metadata)

  Disk for search engines also depends on details. For example,
  current grid search engine requires ~250 MB.

ORACLE REQUIREMENTS

  Oracle Server 7.3.2 or higher
  SQL*Plus 3.3
  PL/SQL 2.3.2 or higher
  Pro*C 2.2 or higher: Oracle Common Libraries 7.3
  ConText Server 1.1.2.0.0 or higher
  Oracle Web Server 2.1 or higher (required for search engines and
    WSMS implementation option 2)

ADDITIONAL REQUIREMENTS

  SPARCworks C compiler 3.0.1 (Sun) or ??? (SGI)
  oraperl (perl scripting language with Oracle interface [both free])