Some Statistics about http://www.npac.syr.edu

Approx. NPAC agent performance:        36000 URLs in about 12 hours
Total URLs indexed in the DB:          36587
Total size of text files processed:    118 MB
Average size of a processed file:      3235 bytes

3048 URLs were found but not indexed. They are referenced from some
other URL but are not accessible, for one of these reasons:
  1) local documents
  2) restricted access
  3) dead links
  4) removed, but still referenced

Total number of unique words/numbers/symbols: 78455

By first character:

  Letter   Upper case   Lower case
  A           1856         1986
  B           1351         1310
  C           3162         3031
  D           1578         1992
  E           1194         1646
  F           1191         3363
  G           1041          923
  H           1199         1012
  I           1258         2326
  J            393          277
  K            566          383
  L           1109         1155
  M           2143         1973
  N           1145         1459
  O            684          829
  P           2108         2398
  Q            186          204
  R           1371         1943
  S           3367         3544
  T           1451         1717
  U            497          757
  V            595          504
  W            870          828
  X            193          226
  Y            125          112
  Z            135           88

  other (numbers and symbols): 11701, including:
    words beginning with a number : 10384
    words beginning with +        :   149
    words beginning with -        :   549
    words beginning with .        :   527

Of the 66754 words beginning with letters, 52651 are distinct when
case is ignored.

Total space used by Oracle to store both text and indexes: 150 MB
  - 75 MB for indexes
  - 75 MB for compressed text

(calculate: sole tablespace - total - freespace = 245 MB; sum total
extents -> 275 MB; 10% pctfree; last extent is the largest extent and
not completely full)
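
The per-letter breakdown above could be produced along the following lines. This is only a minimal sketch, not the code NPAC used: the function name `tally_by_first_char` is hypothetical, and it assumes that "unique" means case-sensitive uniqueness of whole tokens and that only ASCII letters count as letters. The constants at the end are copied from the figures reported above and simply check that they add up.

```python
from collections import Counter
import string


def tally_by_first_char(words):
    """Count unique tokens by the category of their first character.

    Assumption (not stated in the report): tokens are deduplicated
    case-sensitively before counting, and only ASCII letters are letters.
    """
    unique = {w for w in words if w}   # drop empty tokens, deduplicate
    upper = Counter()                  # keyed by letter, upper-case initial
    lower = Counter()                  # keyed by letter, lower-case initial
    other = Counter()                  # "digit" or the leading symbol itself
    letter_words = set()
    for w in unique:
        c = w[0]
        if c in string.ascii_uppercase:
            upper[c] += 1
            letter_words.add(w)
        elif c in string.ascii_lowercase:
            lower[c.upper()] += 1
            letter_words.add(w)
        elif c in string.digits:
            other["digit"] += 1
        else:
            other[c] += 1
    # Distinct words among letter-initial tokens when case is ignored.
    distinct_ci = len({w.lower() for w in letter_words})
    return upper, lower, other, distinct_ci


# Figures copied from the report, used only as a consistency check.
REPORTED_LETTER_WORDS = 66754
REPORTED_OTHER = 11701
REPORTED_TOTAL = 78455
assert REPORTED_LETTER_WORDS + REPORTED_OTHER == REPORTED_TOTAL

# Crawl rate implied by "36000 URLs in about 12 hours".
URLS_PER_HOUR = 36000 / 12
```

For example, calling `tally_by_first_char(["Apple", "apple", "APPLE", "banana", "42nd"])` counts two tokens under upper-case A, one each under lower-case A and B, one digit-initial token, and two distinct letter-initial words once case is ignored.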