NPAC at Syracuse University
111 College Place
Syracuse NY 13244-4100
Phone: (315) 443-2163, email: gcf@npac.syr.edu
This is http://www.npac.syr.edu/users/gcf/edperf.html
We report on an analysis of the network and CPU performance needed by applications that we expect to be important in the networked electronic classroom of the future. The work was performed as part of the New York State funded
Living SchoolBook collaboration among the Syracuse University School of Education,
NPAC, Columbia's Teachers College, NYNEX, and Rome Laboratory. We find that a modern PC has the performance needed for real-time
decompression of both images and video, which reduces the required network bandwidth
by an order of magnitude. Thus we conclude that planning the classroom of
the future requires a careful tradeoff between network and CPU characteristics.
As part of the Living SchoolBook project, several novel K-12 educational
applications combining high performance computing and communications technology
were prototyped. Here we examine the two subprojects with the largest network
and CPU requirements: educational applications built around digital streaming video
and an interactive three-dimensional Geographical Information System (GIS) called
New York State -- The Interactive Journey. Each application was developed using pervasive Web technologies so that
production versions can be deployed broadly. Although these applications
are still being developed, they are complete systems and sufficient for
realistic performance measurement and evaluation.
In our initial versions, we concluded that one needed network performance
of over one megabit per second per session (i.e., per client system) to support both applications. However, we have
since developed and deployed new compression algorithms for both the map
(image) and video information. These decrease the needed network performance
by about a factor of twenty, at the cost of increasing the CPU performance
needed to decompress the images and video in real time.
We find that the new compressed versions of the applications give good performance
with per-session network performance in the ISDN (100 kilobits per second)
to POTS (20 kilobits per second) range, using a modern consumer PC (75 MHz
Pentium) or low-end workstation as the client. Note that the decompression time is independent of the network and
depends only on the client CPU. For the GIS application, decreasing the compression
ratio decreases the needed CPU decoding time but increases the needed network
bandwidth. The attached results show that, for a 75 MHz Pentium, the GIS is best
implemented with uncompressed images above roughly a 0.5 megabit per second network,
while the compressed implementation gives the best user performance at network
speeds below this.
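To make the tradeoff concrete, the following Python sketch models the total delivery time of one GIS map unit as download time plus (in the compressed case) a fixed decompression time. The 800 kilobyte unit size and roughly 20:1 compression ratio are taken from the measurements described below; the 12 second decompression time for a 75 MHz Pentium is an assumed illustrative value, chosen so that the crossover falls near the observed 0.5 megabit per second.

    # Illustrative model only, not the project's measurement code.
    MAP_UNIT_BITS = 800 * 1024 * 8   # one uncompressed map unit (~800 KB)
    COMPRESSION_RATIO = 20.0         # wavelet compression, roughly 20:1
    DECODE_SECONDS = 12.0            # assumed client decompression time (75 MHz Pentium)

    def time_uncompressed(bandwidth_bps):
        """Network-dominated case: just the raw download."""
        return MAP_UNIT_BITS / bandwidth_bps

    def time_compressed(bandwidth_bps):
        """Smaller download plus a fixed CPU decompression cost."""
        return MAP_UNIT_BITS / (COMPRESSION_RATIO * bandwidth_bps) + DECODE_SECONDS

    def crossover_bps():
        """Link speed above which sending uncompressed images is faster."""
        return MAP_UNIT_BITS * (1 - 1 / COMPRESSION_RATIO) / DECODE_SECONDS

    if __name__ == "__main__":
        for label, bps in [("POTS 20 kbit/s", 20e3), ("ISDN 100 kbit/s", 100e3),
                           ("0.5 Mbit/s", 0.5e6), ("Ethernet 10 Mbit/s", 10e6)]:
            print(f"{label:18s} uncompressed {time_uncompressed(bps):6.1f} s,"
                  f" compressed {time_compressed(bps):6.1f} s")
        print(f"crossover near {crossover_bps() / 1e6:.2f} Mbit/s")

Below the crossover the download dominates and the compressed path wins; above it the fixed decompression cost dominates and uncompressed delivery is faster.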
Similarly, suppose the client PC (or digital set-top box/network
PC) comes with hardware MPEG decompression. Then, given the needed 1.5 megabit
per second network, this would be the preferred delivery technology. Low bit
rate compression schemes with portable, easily deployed software decoders
can be used at a factor of 20 lower network speed than MPEG (roughly 75 kilobits
per second), but with the requirement of Pentium-class client CPUs.
These conclusions are also seen in evaluating the effectiveness of "caching"
Internet (Web) information on local disks. A set of detailed experiments
by schools connected to NASA Langley (in Virginia) with a specially modified
HTTP server showed that 95% of educational access was served from cached material,
so again we see a factor of 20 reduction in the naive estimate of needed
network bandwidth. Here one trades network bandwidth against the local disk
storage needed for a large cache.
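The factor of 20 follows directly from the hit rate: only cache misses have to cross the wide-area link. A one-line check, using the 95% hit rate reported for the Langley experiments:

    # Only the misses travel over the network, so the naive bandwidth
    # estimate shrinks by 1 / (1 - hit_rate).
    def bandwidth_reduction(hit_rate):
        return 1.0 / (1.0 - hit_rate)

    print(bandwidth_reduction(0.95))   # -> 20.0, i.e. a factor of 20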
Our work needs to be interpreted carefully, as one can indeed choose either
fast clients or fast networks. The growing interest in "network PCs" probably
only makes sense if they are supported by fast, reliable networks; however,
it is not clear whether these devices will be successful. Again we can contrast
schools in China and the USA. In China, essentially the only PCs are the
modern Pentium systems we used, so a tradeoff toward CPU power with modest network
bandwidth needs seems natural. In the USA, schools have a substantial investment
in older Apple and IBM PC architecture machines, which are quite inadequate
for the decompression tasks we employed. However, as with the network PC,
it is not clear that older PCs are sufficient anyway, as most applications
require substantial graphics and memory capabilities just to run a Web browser.
The detailed data are contained in a set of separate documents:
http://turais:8888/gis/tests/cpu2net.html is the overview, which references both spreadsheets (Excel or ASCII) and a set of graphs of application time versus network bandwidth.
This evaluation considers a single map unit covering a 25 kilometer square,
with digital elevation and texture data at the current 30 to 100 meter resolution,
needing about 800 kilobytes of memory in uncompressed form. For an ISDN
line, the results show that the network-dominated loading time of over 40 seconds
for the uncompressed data falls to just 4 seconds on an SGI R10000 workstation for
the wavelet-compressed texture map, with the time roughly equally split between
network transfer and CPU decompression. On a 36.6 kbaud modem line, the uncompressed
image takes over 160 seconds to download. On the other hand, a 75 MHz Pentium
PC takes a total of only 19 seconds for downloading and decompression of
the highest quality wavelet images. The results show that it is better to
use high quality (level) wavelet compression: at level 6 the image is
"too small" and the downloading time is less than the decompression time, which
is roughly independent of compression quality.
A realistic journey would involve several such map units, which could be
buffered in the client so that the user transitions smoothly between them
as map unit boundaries are crossed. Thus one needs to reduce the downloading/decompression
time so that the total is (much) less than the typical time a user spends in each
unit.
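A minimal sketch of this buffering idea follows; the fetch and decode callables, and the names download_unit, wavelet_decode and neighbours_of in the usage comments, are hypothetical placeholders rather than interfaces from our implementation. Neighbouring map units are downloaded and decompressed in background threads while the user explores the current unit.

    # Illustrative sketch under assumed interfaces, not the Living SchoolBook code.
    from concurrent.futures import ThreadPoolExecutor

    class MapUnitCache:
        def __init__(self, fetch, decode, workers=2):
            self._fetch = fetch          # fetch(unit_id) -> compressed bytes (assumed)
            self._decode = decode        # decode(bytes) -> decoded map unit (assumed)
            self._pool = ThreadPoolExecutor(max_workers=workers)
            self._pending = {}           # unit_id -> Future
            self._ready = {}             # unit_id -> decoded unit

        def prefetch(self, unit_ids):
            """Start background download + decompression of neighbouring units."""
            for uid in unit_ids:
                if uid not in self._ready and uid not in self._pending:
                    self._pending[uid] = self._pool.submit(
                        lambda u=uid: self._decode(self._fetch(u)))

        def get(self, unit_id):
            """Return a decoded unit, blocking only if prefetch has not finished."""
            if unit_id not in self._ready:
                future = self._pending.pop(unit_id, None)
                if future is None:       # never prefetched: fetch synchronously
                    return self._decode(self._fetch(unit_id))
                self._ready[unit_id] = future.result()
            return self._ready[unit_id]

    # Example use (hypothetical helpers): when the user enters unit (3, 4),
    # start prefetching its neighbours so boundary crossings stay smooth.
    #   cache = MapUnitCache(fetch=download_unit, decode=wavelet_decode)
    #   cache.prefetch(neighbours_of((3, 4)))
    #   terrain = cache.get((3, 4))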
It is well known that traditional MPEG video encoding requires both a high
performance network and a specialized card for hardware decompression on
the client machine. We have experimented with a different approach using
a new low bit rate compression scheme (H263) with both a high performance
(C, C++) decoder and a portable (Java) decoder on the client machine. The Java results
show many differences between vendors (Microsoft, Netscape) and machines
(PC, SGI), but these reflect the immaturity of this technology. We chose four
video sequences varying from "quiet" (golf, video conferencing) to active
scenes. The needed network performance varies from that of phone modems
for the quiet video to ISDN (100 kilobits per second) for the very active
(and hence rapidly changing) video. As video compression encodes differences
between frames, the more rapidly varying movies require more network bandwidth. However,
the decoder performance is roughly independent of the movie and can be achieved
in approximately real time on a 75 MHz Pentium PC. To be precise, our optimized
decoder on this PC runs at about twice real time and the best Java version at about
half real time today.
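This conclusion can be phrased as a simple feasibility test: playback works when the decoder keeps up with the 10 frame per second target and the link carries the clip's bit rate. The decoder speeds below are the measured 75 MHz Pentium figures quoted above; the per-clip bit rates are assumed placeholder values in the phone-modem-to-ISDN range discussed in the text.

    # Rough feasibility sketch, not a measurement tool.
    TARGET_FPS = 10.0   # real-time target used in these tests

    def playable(decode_fps, link_bps, clip_bps):
        """True if both the CPU and the network can sustain real-time playback."""
        return decode_fps >= TARGET_FPS and link_bps >= clip_bps

    decoders = {"optimized C/C++": 2.0 * TARGET_FPS,   # about twice real time
                "best Java":       0.5 * TARGET_FPS}   # about half real time
    links = {"phone modem": 28.8e3, "ISDN": 100e3}     # bits per second
    clips = {"quiet (conferencing)": 25e3,             # assumed bit rates
             "active scene":         100e3}

    for dec, fps in decoders.items():
        for link, bps in links.items():
            for clip, rate in clips.items():
                ok = "ok" if playable(fps, bps, rate) else "not ok"
                print(f"{dec:16s} / {link:11s} / {clip:20s}: {ok}")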
We conclude that modern clients with approximately ISDN-performance networks
will be sufficient for educational video applications (which will not require
entertainment-quality images), with video conferencing needing only conventional
phone-line modems.
Movie Sequences Tested
All sequences use (as is conventional in H263) one I-frame with remaining frames specified by differences
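As a purely conceptual illustration of why the active clips cost more bandwidth (this is not H263 itself), the sketch below encodes each frame after the first as a difference from its predecessor; a quiet scene produces mostly zero differences, while an active scene does not.

    # Conceptual toy example: tiny 4-pixel "frames" as plain lists.
    def frame_deltas(frames):
        """Encode a clip as one full reference frame plus per-frame differences."""
        reference, rest = frames[0], frames[1:]
        deltas = [[cur - prev for cur, prev in zip(f, frames[i])]
                  for i, f in enumerate(rest)]
        return reference, deltas

    def changed_pixels(deltas):
        """Crude cost proxy: how many pixel differences are non-zero."""
        return sum(1 for frame in deltas for d in frame if d != 0)

    quiet  = [[10, 10, 10, 10], [10, 10, 10, 10], [10, 11, 10, 10]]   # talking head
    active = [[10, 10, 10, 10], [90, 20, 70, 10], [30, 80, 10, 60]]   # fast motion

    for name, clip in [("quiet", quiet), ("active", active)]:
        _, deltas = frame_deltas(clip)
        print(name, "non-zero differences:", changed_pixels(deltas))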
H263 Decoders Tested:
The ActiveMovie decoder, written in C++, was ported from the C version,
which was an optimized commercial (Telenor) code. The ActiveMovie code
runs inside an ActiveMovie OCX, or, as Microsoft now calls it, an ActiveX
control. This contributes a rather large system overhead, but the
comparison with the Java applet is still fair, since in both cases the video
is playing inside a browser window.
Test platform: Pentium 75 MHz, with a GrafixStar 700 video card (a high-end device that speeds up video display by some 25%), running Windows NT 4.0 (build 1381).
Note that all frame rates above 10 are fully acceptable, as 10 frames per second is real-time performance. We averaged several runs and obtained reproducible results.