Data Diversity
- Web: a huge distributed database of unstructured, non-relational, hierarchical (multimedia) information entities in various data formats (MIME types): HTML, plain text, PostScript, LaTeX, images, audio/video clips, etc.
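
As a rough illustration of this format diversity, the sketch below guesses MIME types from file names and groups them by top-level category. It uses Python's standard `mimetypes` module; the file names and the helpers `media_type` and `top_level_counts` are hypothetical and only serve the example. Guessing from a file extension is a heuristic, not a guarantee of what a server actually delivers.

```python
import mimetypes
from collections import Counter

# Hypothetical file names standing in for resources found on the Web.
resources = ["index.html", "notes.txt", "photo.jpg", "logo.gif"]

def media_type(name):
    """Return the guessed MIME type, or a generic fallback if unknown."""
    guessed, _encoding = mimetypes.guess_type(name)
    return guessed or "application/octet-stream"

def top_level_counts(names):
    """Count resources per top-level MIME category (text, image, ...)."""
    return Counter(media_type(n).split("/")[0] for n in names)

for name in resources:
    print(name, "->", media_type(name))
print(dict(top_level_counts(resources)))
```

Even this small sample mixes text and image categories, which is the heterogeneity a Web-scale system has to cope with.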
- Web repositories are heterogeneous, inconsistent, and incomplete