Name: “Playout Delay” ; “RTP Header Extension to control Playout Delay”
SDP “a= name”: “playout-delay” ; this is also used in client/cloud signaling.
Status: This extension is defined here to allow for experimentation. Once experience has shown that it is useful, we intend to make a proposal based on it for standardization in the IETF.
On WebRTC, the RTP receiver continuously measures inter-packet delay and evaluates packet jitter. Besides this, an estimated delay for decode and render at the receiver is computed. The jitter buffer, the local time extrapolation and the predicted render time (based on predicted decode and render time) impact the delay on a frame before it is rendered at the receiver.
This document proposes an RTP extension to enable the RTP sender to try and limit the amount of playout delay at the receiver in a certain range. A minimum and maximum delay from the sender provides guidance on the range over which the receiver can smooth out rendering.
Thus, this extension aims to provide the sender’s intent to the receiver on how quickly a frame needs to be rendered.
The following use cases are addressed by this extension:
The playout delay on a frame represents the amount of delay added to a frame the time it is captured at the sender to the time it is expected to be rendered at the receiver. Thus playout delay is essentially:
Playout delay = ExpectedRenderTime(frame) - ExpectedCaptureTime(frame)
MIN and MAX playout delay in turn represent the minimum and maximum delay that can be seen on a frame. This restriction range is best effort. The receiver is expected to try and meet the range as best as it can.
A value of 0 for example is meaningless from the perspective of actually meeting the suggested delay, but it indicates to the receiver that the frame should be rendered as soon as possible. It is up-to the receiver to decide how to handle a frame when it arrives too late (i.e., whether to simply drop or hand over for rendering as soon as possible).
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ID | len=2 | MIN delay | MAX delay | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
12 bits for Minimum and Maximum delay. This represents a range of 0 - 40950 milliseconds for minimum and maximum (with a granularity of 10 ms). A granularity of 10 ms is sufficient since we expect the following typical use cases:
The header is attached to the RTP packet by the RTP sender when it needs to change the min and max smoothing delay at the receiver. Once the sender is informed that at least one RTP packet which has the min and max details is delivered, it MAY stop providing details on all further RTP packets until another change warrants communicating the details to the receiver again. This is done as follows:
RTCP feedback to RTP sender includes the highest sequence number that was seen on the RTP receiver. The RTP sender can track the sequence number on the packet that first had the playout delay extension and then stop sending the extension once the received sequence number is greater than the sequence number on the first packet containing the current values playout delay in this extension.