However, for real-time delivery of audio and video, TCP and other
reliable transport protocols such as XTP are inappropriate. The three
main reasons are:
An additional small disadvantage is that the TCP and XTP headers are
larger than a UDP header (40 bytes for TCP and XTP 3.6, 32 bytes for
XTP 4.0, compared to 8 bytes). Also, these reliable transport
protocols do not contain the necessary timestamp and encoding
information needed by the receiving application, so that they cannot
replace RTP. (They would not need the sequence number, as these
protocols ensure that no loss or reordering takes place.)
While LANs often have sufficient bandwidth and low enough losses not
to trigger these problems, TCP does not offer any advantages in that
scenario either, except for the recovery from rare packet losses.
Even in a LAN with no losses, TCP would suffer from the initial slow
start delay.
The question of the relationship of RTP and XTP appears to arise
frequently. (This may simply be due to the word 'transport' in both
protocol names.) However, XTP and RTP are not replacements for each
other. XTP is designed as a general, configurable network and
transport protocol for both reliable and unreliable data
communications. RTP has no reliability mechanisms (although these
could be added if desired for specific applications) and no flow
control like the rate control in XTP. RTP is not intended for
regular, reliable data transfer (where TCP or XTP might be used
instead). For real-time data, where retransmission is usually not
possible due to timing constraints, XTP would have to disable
retransmission. Flow/congestion control for real-time data is most
likely inappropriate as the rate of such sources is inherently given
and not modifiable on the time-scale of transport-protocol flow
control, as explained in the previous section. It should be noted
that RTP supports mechanisms that allow a form of congestion control
on longer time scales, e.g., by modifying the source encoder if
network congestion is detected.
RTP has no protocol state by itself and can thus be used over either
connection-less networks, such as IP/UDP, or connection-oriented
networks, such as XTP, ST-II or ATM (AAL3/4 or AAL5). Many real-time
multimedia applications use multicast with a large fan-out, e.g.,
several hundred to thousands for a lecture or concert.
Connection-oriented protocols like XTP have difficulty scaling to
such a large number of receivers.
XTP does not offer timing or content type (media) information, and
thus would need these services to be supplied by a layer such as RTP.
XTP provides no RTP-like direct feedback of the received
quality of service, and thus, again, would have to "import" it from
another protocol. Looking
at existing applications using XTP for real-time services confirms
that they need to add a layer similar in content to the RTP data part
"between" XTP and the actual media.
RTP (in particular, the data part) is tightly coupled to the
application, so that a kernel or library implementation makes little
sense. However, NeVoT can be used as a linkable library that
implements RTP for an audio tool, with a documented API. The sources
to NeVoT, rtpdump and vic also contain RTCP processing modules which
should be usable in other applications with minor modifications. Note
also that the specification itself contains numerous code fragments.
(Most of the other applications use older versions of RTP and
thus should not be relied upon for new development.)
A new version of VAT (currently in alpha-test) also implements RTP.
As soon as there are a sufficient number of stable applications using
RTP, it is anticipated that most Internet MBONE audio/video events
will be transmitted using RTP.
For conferencing over ISDN:
For conference control, application and data sharing, there are a
number of recommendations:
For near real-time distribution of audio, e.g., the on-demand delivery
of music or news:
CuSeeMe (for Windows PCs and the Macintosh) is a combined audio and
video tool using reflectors rather than IP-level multicast.
RealAudio writes what currently applies to all tools:
A survey can be found at
www.von.com.
RTP does not ensure real-time delivery. So how come it is
called a real-time protocol?
No end-to-end protocol, including RTP, can ensure in-time delivery.
This always requires the support of lower layers that actually have
control over resources in switches and routers. RTP provides
functionality suited for carrying real-time content, e.g., a timestamp
and control mechanisms for synchronizing different streams with timing
properties.
Is RTP an unreliable protocol? Are there any
mechanisms provided for error recovery in RTP?
As currently defined, RTP does not define any mechanisms for
recovering from packet loss. Such mechanisms are likely to be highly
dependent on the packet content. For example, for audio, it has been
suggested to add low-bit-rate
redundancy, offset in time. For other applications, retransmission of
lost packets may be appropriate. This requires no additions to RTP.
RTP probably has the necessary header information (like sequence
numbers) for some forms of error recovery by retransmission.
Can RTP run over IPng? ATM?
Yes. RTP contains no specific assumptions about the capabilities of
the lower layers, except that they provide framing. It contains no
network-layer addresses, so that RTP is not affected by addressing
changes. Any additional lower-layer capabilities such as security or
quality-of-service guarantees can obviously be used by applications
employing RTP. There are several implementations of video tools that
run RTP directly over AAL5. It should be noted that the RTCP CNAME
field is currently based on the assumption that hosts have
Internet-style domain names.
Why can't we just use TCP for audio and video?
For delivering audio and video for playback, TCP may be appropriate.
Also, with sufficiently long buffering and adequate average
throughput, near-real-time delivery using TCP can be successful, as
practiced by the Netscape WWW browser. TCP may often run over highly
lossy networks (e.g., the German X.25 network) with acceptable
throughput, even though the uncompensated losses would make audio or
video communication impossible.
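The role of buffering in the playback case can be made concrete with a small sketch. It is not taken from any of the tools mentioned in this FAQ; the function names, the arrival times and the 20 ms frame interval are assumptions chosen purely for illustration. Given the time at which each media frame is delivered to the application (for example, after a TCP retransmission has held back one frame and everything queued behind it), the sketch computes the smallest startup delay that lets the receiver play frames at a constant rate without running dry.

```python
def min_startup_delay(arrival_ms, frame_interval_ms):
    """Smallest startup delay (ms) that avoids any late frame.

    Frame i is scheduled for playout at
        arrival_ms[0] + delay + i * frame_interval_ms
    and must have arrived by then. All values are hypothetical
    integers in milliseconds.
    """
    if not arrival_ms:
        return 0
    start = arrival_ms[0]
    # Lateness of frame i relative to a zero-delay schedule.
    return max(t - start - i * frame_interval_ms
               for i, t in enumerate(arrival_ms))


def count_late_frames(arrival_ms, frame_interval_ms, startup_delay_ms):
    """Number of frames that arrive after their scheduled playout time."""
    if not arrival_ms:
        return 0
    start = arrival_ms[0]
    return sum(1 for i, t in enumerate(arrival_ms)
               if t > start + startup_delay_ms + i * frame_interval_ms)


# Illustrative trace: frames are generated every 20 ms; frame 3 is held
# back until t = 500 ms, and in-order delivery releases frames 4 and 5
# at the same moment.
arrivals = [0, 20, 40, 500, 500, 500]

needed = min_startup_delay(arrivals, 20)          # 440
late_short = count_late_frames(arrivals, 20, 100)  # 3 frames miss playout
late_enough = count_late_frames(arrivals, 20, needed)  # 0
```

With this trace, a 100 ms buffer leaves three frames late, while a 440 ms startup delay absorbs the stall completely. A delay of that size is paid once at the start of playback, but it would be added to every exchange in an interactive conversation.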
Can't we just use XTP?
Many of the arguments parallel those in the previous section.
Is there an RTP library or kernel implementation?
What are some of the differences between the VAT protocol
and RTP?
The VAT protocol was originally implemented in the VAT audio tool and
subsequently also in other audio tools such as NeVoT. It is currently
the most frequently used packet format for audio on the MBONE. The
VAT header format is only described in header files. (See the VAT and
NeVoT sources for details.) Many aspects of RTP and the VAT protocol
are similar, but RTP improves upon the VAT protocol in a number of
ways:
What are the differences between RTP version 1 and 2?
Version 1 is of historical interest only. Applications should not be
written for it. RTP version 2 is not backwards compatible with
version 1. If you care, you can find a definition of version 1 in
an old Internet draft.
Are there related ITU efforts?
Media formats:
Are there other efforts in using the Internet for
real-time audio and video?
Too many, some may say. VAT (versions 3.4 and earlier), one of the
early (recent) Internet audio applications, uses mostly the same audio
encodings as specified in the RTP profile, but a different protocol.
There are a number of "Internet telephones" (usually for PCs) using
proprietary audio coding and protocols, meant for point-to-point
connections:
If the packet loss is high, it may be due to a busy network. If this
is the case, there is little you can do to remedy the situation other
than to try connecting to the site at a later time.
Henning Schulzrinne