blob: 6dd7b99b09dd39ae5b0b237aa3cd7574f0cb9728 [file] [log] [blame] [view]
<!-- go/cmark -->
<!--* freshness: {owner: 'hta' reviewed: '2021-06-03'} *-->
# RTP in WebRTC
WebRTC uses the RTP protocol described in
[RFC3550](https://datatracker.ietf.org/doc/html/rfc3550) for transporting audio
and video. Media is encrypted using [SRTP](./srtp.md).
## Allocation of payload types
RTP packets have a payload type field that describes which media codec can be
used to handle a packet. For some (older) codecs like PCMU the payload type is
assigned statically as described in
[RFC3551](https://datatracker.ietf.org/doc/html/rfc3551). For others, it is
assigned dynamically through the SDP. **Note:** there are no guarantees on the
stability of a payload type assignment.
For this allocation, the range from 96 to 127 is used. When this range is
exhausted, the allocation falls back to the range from 35 to 63 as permitted by
[section 5.1 of RFC3550][1]. Note that older versions of WebRTC failed to
recognize payload types in the lower range. Newer codecs (such as flexfec-03 and
AV1) will by default be allocated in that range.
Payload types in the range 64 to 95 are not used to avoid confusion with RTCP as
described in [RFC5761](https://datatracker.ietf.org/doc/html/rfc5761).
## Allocation of audio payload types
Audio payload types are assigned from a table by the [PayloadTypeMapper][2]
class. New audio codecs should be allocated in the lower dynamic range [35,63],
starting at 63, to reduce collisions with payload types
## Allocation of video payload types
Video payload types are allocated by the
[GetPayloadTypesAndDefaultCodecs method][3]. The set of codecs depends on the
platform, in particular for H264 codecs and their different profiles. Payload
numbers are assigned ascending from 96 for video codecs and their
[associated retransmission format](https://datatracker.ietf.org/doc/html/rfc4588).
Some codecs like flexfec-03 and AV1 are assigned to the lower range [35,63] for
reasons explained above. When the upper range [96,127] is exhausted, payload
types are assigned to the lower range [35,63], starting at 35.
## Handling of payload type collisions
Due to the requirement that payload types must be uniquely identifiable when
using [BUNDLE](https://datatracker.ietf.org/doc/html/rfc8829) collisions between
the assignments of the audio and video payload types may arise. These are
resolved by the [UsedPayloadTypes][4] class which will reassign payload type
numbers descending from 127.
# Bitrate probing
Bandwidth estimation sometimes requires sending RTP packets to ramp up the
estimate above a certain treshold. WebRTC uses two techniques for that:
* Packets that only consist of RTP padding
* RTX packets
At the receiving end, both types of probing packets do not interfere with media processing.
After being taken into account for bandwidth estimation, probing packets only consisting
of padding can be dropped and RTX packets act as redundancy.
Using RTX for probing is generally preferred as padding probes are limited to 256 bytes
of RTP payload which results in a larger number of packets being used for probing which
is a disadvantage from a congestion control point of view.
## Padding probes
Padding probes consist of RTP packets with header extensions (either abs-send time or
the transport-wide-cc sequence number) and set the RTP "P" bit. The last byte of the
RTP payload which specifies the amount of padding is set to 255 and preceeded by 255
bytes of all zeroes. See [section 5.1 of RFC3550][1].
Padding probes use the RTX RTP stream (i.e. payload type, sequence number and timestamp)
when RTX is negotiated or share the same RTP stream as the media packets otherwise.
Padding probes are used either when
* RTX is not negotiated (such as for audio, less commonly for video) or
* no suitable original packet is found for RTX probing.
Padding probes should not be interleaved with packets of a video frame.
## RTX probes
RTX probes are resends of previous packets that use RTX retransmissions specified in
[RFC4588](https://www.rfc-editor.org/rfc/rfc4588) as payload format when negotiated with
the peer. These packets will have a different abs-send-time or transport-wide-cc sequence
number and use the RTX RTP stream (i.e. RTX payload type, sequence number and timestamp)
associated with the media RTP stream.
RTX probing uses recently sent RTP packets that have not yet been acknowledged by
the remote side. Sending these packets again has a small chance of being useful when the
original packet is lost and will not affect RTP processing at the receiver otherwise.
[1]: https://datatracker.ietf.org/doc/html/rfc3550#section-5.1
[2]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/media/engine/payload_type_mapper.cc;l=25;drc=4f26a3c7e8e20e0e0ca4ca67a6ebdf3f5543dc3f
[3]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/media/engine/webrtc_video_engine.cc;l=119;drc=b412efdb780c86e6530493afa403783d14985347
[4]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/pc/used_ids.h;l=94;drc=b412efdb780c86e6530493afa403783d14985347