The main implementation of `webrtc::webrtc_pc_e2e::PeerConnectionE2EQualityTestFixture` is `webrtc::webrtc_pc_e2e::PeerConnectionE2EQualityTest`. Internally it owns the following main pieces:

*   `MediaHelper` - responsible for adding audio and video tracks to the peers.
*   `VideoQualityAnalyzerInjectionHelper` and `SingleProcessEncodedImageDataInjector` - used to inject video quality analysis and to properly match captured and rendered video frames. You can read more about them in the DefaultVideoQualityAnalyzer section.
*   `AudioQualityAnalyzerInterface` - used to measure audio quality metrics.
*   `TestActivitiesExecutor` - used to support the `ExecuteAt(...)` and `ExecuteEvery(...)` API of `PeerConnectionE2EQualityTestFixture`, which runs arbitrary actions during test execution, synchronized in time with the test call.
*   `QualityMetricsReporter`s added by the `PeerConnectionE2EQualityTestFixture` user.
*   `TestPeer` objects.

It also keeps a reference to a `webrtc::TimeController`, which is used to create all required threads, task queues, task queue factories and time related objects.
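For orientation, here is a minimal C++ sketch of this ownership structure. All class bodies are empty stand-ins for the components named above, not the real WebRTC declarations:

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for the components named above; the real classes
// live in the WebRTC source tree and have much richer interfaces.
class MediaHelper {};
class VideoQualityAnalyzerInjectionHelper {};
class AudioQualityAnalyzerInterface {};
class TestActivitiesExecutor {};
class QualityMetricsReporter {};
class TestPeer {};
class TimeController {};  // stand-in for webrtc::TimeController

// Sketch of what PeerConnectionE2EQualityTest owns (simplified).
class PeerConnectionE2EQualityTestSketch {
 public:
  explicit PeerConnectionE2EQualityTestSketch(TimeController& time_controller)
      : time_controller_(time_controller) {}

 private:
  TimeController& time_controller_;  // referenced, not owned
  std::unique_ptr<MediaHelper> media_helper_;
  std::unique_ptr<VideoQualityAnalyzerInjectionHelper> video_analyzer_helper_;
  std::unique_ptr<AudioQualityAnalyzerInterface> audio_analyzer_;
  std::unique_ptr<TestActivitiesExecutor> executor_;
  std::vector<std::unique_ptr<QualityMetricsReporter>> reporters_;  // user-added
  std::vector<std::unique_ptr<TestPeer>> peers_;
};
```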
Call participants are represented by instances of the `TestPeer` object, which are created by `TestPeerFactory`. `TestPeer` owns all objects related to its `webrtc::PeerConnection`, including the required listeners and callbacks. It also provides an API for the offer/answer and ICE candidate exchanges; for this purpose it internally uses an instance of `webrtc::PeerConnectionWrapper`.
The `TestPeer` also owns the `PeerConnection` worker thread. The signaling thread for all `PeerConnection`s is owned by `PeerConnectionE2EQualityTestFixture` and shared between all participants in the call. The network thread is owned by the network layer (either an emulated network provided by the Network Emulation Framework, or a network thread and `rtc::NetworkManager` provided by the user) and is supplied when a peer is added to the fixture via the `AddPeer(...)` API.
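A small sketch of this thread-ownership split, with `Thread` as a hypothetical stand-in for `rtc::Thread` (not the real fixture code):

```cpp
#include <memory>

namespace sketch {
class Thread {};  // stand-in for rtc::Thread

// Per-peer pieces: each TestPeer owns its own PeerConnection worker thread,
// but only borrows the signaling and network threads.
struct TestPeer {
  std::unique_ptr<Thread> worker_thread;  // owned by this peer
  Thread* signaling_thread = nullptr;     // shared, owned by the fixture
  Thread* network_thread = nullptr;       // owned by the network layer
};

// Fixture-level pieces: one signaling thread shared by all peers.
struct Fixture {
  std::unique_ptr<Thread> signaling_thread;
};
}  // namespace sketch
```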
`PeerConnectionE2EQualityTestFixture` gives the user the ability to provide different `QualityMetricsReporter`s, which listen to the `PeerConnection` `GetStats` API. Such reporters can then report whatever metrics the user wants to measure.
`PeerConnectionE2EQualityTestFixture` itself also uses this mechanism to measure:

*   Audio quality metrics
*   Audio/Video sync metrics (with the help of `CrossMediaMetricsReporter`)

The framework also provides a `StatsBasedNetworkQualityMetricsReporter` to measure network related WebRTC metrics and print raw emulated network statistics for debugging. This reporter should be added by the user via the `AddQualityMetricsReporter(...)` API if required.
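A simplified sketch of what such a user-provided reporter could look like. The interface below is a hypothetical stand-in: the real `QualityMetricsReporter` interface receives parsed `RTCStatsReport` objects rather than strings:

```cpp
#include <cstdio>
#include <string>

// Hypothetical simplified listener interface mirroring the idea: receive
// stats during the call, report accumulated metrics at the end.
class StatsListenerSketch {
 public:
  virtual ~StatsListenerSketch() = default;
  // Called once per second with the stats gathered from one PeerConnection.
  virtual void OnStatsReport(const std::string& peer_name,
                             const std::string& serialized_report) = 0;
  // Called when the test is over to emit the accumulated metrics.
  virtual void StopAndReportResults() = 0;
};

// Example reporter: counts how many stats reports were received.
class ReportCounter : public StatsListenerSketch {
 public:
  void OnStatsReport(const std::string& peer_name,
                     const std::string& /*serialized_report*/) override {
    ++count_;
    last_peer_ = peer_name;
  }
  void StopAndReportResults() override {
    std::printf("received %d reports, last from %s\n", count_,
                last_peer_.c_str());
  }

 private:
  int count_ = 0;
  std::string last_peer_;
};
```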
Internally, stats gathering is done by `StatsPoller`. Stats are requested once per second for each `PeerConnection`, and the resulting object is then provided to each stats listener.
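A hypothetical sketch of this polling loop; the real `StatsPoller` runs on WebRTC task queues and uses the asynchronous `GetStats` API rather than a blocking thread:

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Stand-in for a peer handle: a name plus a way to fetch its stats.
struct PeerHandle {
  std::string name;
  std::function<std::string()> get_stats;  // stand-in for GetStats
};

// Once per second, ask every peer for stats and fan the result out to
// every registered listener.
void PollStats(
    const std::vector<PeerHandle>& peers,
    const std::vector<std::function<void(const std::string&,
                                         const std::string&)>>& listeners,
    int seconds) {
  for (int i = 0; i < seconds; ++i) {
    for (const PeerHandle& peer : peers) {
      const std::string report = peer.get_stats();
      for (const auto& listener : listeners) {
        listener(peer.name, report);  // every listener sees every report
      }
    }
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }
}
```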
`PeerConnectionE2EQualityTest` provides the ability to test Simulcast and SVC for video. These features aren't supported in a P2P call and in general require a Selective Forwarding Unit (SFU), so special logic is applied to mimic SFU behavior in a P2P call. This logic lives in `SignalingInterceptor`, `QualityAnalyzingVideoEncoder` and `QualityAnalyzingVideoDecoder`, and consists of SDP modification during the offer/answer exchange plus special handling of video frames from unrelated Simulcast/SVC streams during decoding.
In the case of Simulcast we have a single video track which internally contains multiple video streams, for example low, medium and high resolution. A WebRTC client doesn't support receiving an offer with multiple streams in it, because usually an SFU will keep only a single stream for the client. To bypass this, the framework modifies the offer by converting the single track with three video streams into three independent video tracks. The sender will then think that it sends simulcast, while the receiver will think that it receives 3 independent tracks.
To achieve such behavior some extra tweaks are required:

*   The MID RTP header extension is removed from the original offer.
*   The RID RTP header extension in the original offer is replaced with the MID RTP header extension, so the ID that the sender sends as RID is parsed as MID on the receiver.
*   The answer has to be modified in the opposite way.

These modifications are sketched in the code example below, after the exchange steps. The exchange will look like this:

1.  Alice creates an offer.
2.  Alice sets the offer as her local description.
3.  The framework applies the modifications described above.
4.  Alice sends the modified offer to Bob.
5.  Bob sets the modified offer as his remote description.
6.  Bob creates an answer.
7.  Bob sets the answer as his local description.
8.  The framework applies the reverse modifications to the answer.
9.  Bob sends the modified answer to Alice.
10. Alice sets the modified answer as her remote description.
This mechanism imposes the constraint that RTX streams are not supported, because their packets don't carry the RID RTP header extension.
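A minimal sketch of the header-extension rewrite on the offer, assuming text-level SDP munging for illustration (the real `SignalingInterceptor` works on parsed session descriptions, not raw text):

```cpp
#include <string>
#include <vector>

// Drop the MID extension from the offer and re-label the RID extension as
// MID, so the receiver parses the sender's RIDs as MIDs.
std::vector<std::string> RewriteOfferExtmaps(
    std::vector<std::string> sdp_lines) {
  const std::string kMidUri = "urn:ietf:params:rtp-hdrext:sdes:mid";
  const std::string kRidUri = "urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id";
  std::vector<std::string> out;
  for (std::string& line : sdp_lines) {
    if (line.rfind("a=extmap:", 0) == 0) {
      if (line.find(kMidUri) != std::string::npos) {
        continue;  // remove the original MID extension
      }
      size_t pos = line.find(kRidUri);
      if (pos != std::string::npos) {
        line.replace(pos, kRidUri.size(), kMidUri);  // RID becomes MID
      }
    }
    out.push_back(line);
  }
  return out;
}
```

The answer would be run through the reverse mapping before being handed back to the sender.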
In the case of SVC, the framework updates the sender's offer before it is even set as the local description on the sender side. No changes to the answer are required after that.
An `ssrc` is a 32-bit random value generated in RTP to denote a specific source used to send media in an RTP connection. In the original offer, the video track section will look like this:
```
m=video 9 UDP/TLS/RTP/SAVPF 98 100 99 101
...
a=ssrc-group:FID <primary ssrc> <retransmission ssrc>
a=ssrc:<primary ssrc> cname:...
...
a=ssrc:<retransmission ssrc> cname:...
...
```
To enable SVC for such a video track, the framework adds extra `ssrc`s for each required SVC stream, like this:
```
a=ssrc-group:FID <Low resolution primary ssrc> <Low resolution retransmission ssrc>
a=ssrc:<Low resolution primary ssrc> cname:...
...
a=ssrc:<Low resolution retransmission ssrc> cname:...
...
a=ssrc-group:FID <Medium resolution primary ssrc> <Medium resolution retransmission ssrc>
a=ssrc:<Medium resolution primary ssrc> cname:...
...
a=ssrc:<Medium resolution retransmission ssrc> cname:...
...
a=ssrc-group:FID <High resolution primary ssrc> <High resolution retransmission ssrc>
a=ssrc:<High resolution primary ssrc> cname:...
...
a=ssrc:<High resolution retransmission ssrc> cname:...
...
```
The following line will also be added to the video track section of the offer:
```
a=ssrc-group:SIM <Low resolution primary ssrc> <Medium resolution primary ssrc> <High resolution primary ssrc>
```
It tells `PeerConnection` that this track should be configured as SVC. This approach utilizes the WebRTC Plan B offer structure to achieve SVC behavior, and it modifies the offer before setting it as the local description, which violates the WebRTC standard. It also adds the limitation that on lossy networks only the top resolution stream can be analyzed, because WebRTC won't try to repair low resolution streams in case of loss while it still receives a higher stream.
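A hypothetical sketch of this `ssrc` injection, generating the FID groups and the final SIM group for a given number of SVC streams (raw SDP text for illustration only; the real code operates on the parsed offer inside `SignalingInterceptor`):

```cpp
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Generate a primary and retransmission ssrc per SVC stream, emit the FID
// groups, and finish with the SIM group listing the primary ssrcs.
std::vector<std::string> BuildSvcSsrcLines(int num_streams,
                                           const std::string& cname) {
  std::mt19937 rng(std::random_device{}());
  std::uniform_int_distribution<uint32_t> dist;  // ssrc is a 32-bit random value
  std::vector<std::string> lines;
  std::string sim_group = "a=ssrc-group:SIM";
  for (int i = 0; i < num_streams; ++i) {
    const uint32_t primary = dist(rng);
    const uint32_t rtx = dist(rng);
    lines.push_back("a=ssrc-group:FID " + std::to_string(primary) + " " +
                    std::to_string(rtx));
    lines.push_back("a=ssrc:" + std::to_string(primary) + " cname:" + cname);
    lines.push_back("a=ssrc:" + std::to_string(rtx) + " cname:" + cname);
    sim_group += " " + std::to_string(primary);  // SIM lists primary ssrcs only
  }
  lines.push_back(sim_group);
  return lines;
}
```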
In the encoder, the framework propagates, for each encoded video frame, the information required for the fake SFU to know whether the frame belongs to an interesting simulcast stream/spatial layer or whether it should be “discarded”.
On the decoder side, frames that should be “discarded” by the fake SFU are auto-decoded into single-pixel images, and only the interesting simulcast stream/spatial layer goes into the real decoder and is then analyzed.
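A minimal sketch of that decoder-side behavior, with hypothetical types (the real logic lives in `QualityAnalyzingVideoDecoder` and works on WebRTC `EncodedImage`/`VideoFrame` objects):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical decoded-frame representation.
struct DecodedFrame {
  int width = 0;
  int height = 0;
  std::vector<uint8_t> pixels;
};

// Frames from uninteresting simulcast streams / spatial layers are replaced
// by a 1x1 image instead of being run through the real decoder.
DecodedFrame DecodeForAnalysis(
    const std::vector<uint8_t>& encoded,
    bool belongs_to_analyzed_layer,
    DecodedFrame (*real_decode)(const std::vector<uint8_t>&)) {
  if (!belongs_to_analyzed_layer) {
    // "Discarded" by the fake SFU: produce a cheap single-pixel frame.
    return DecodedFrame{1, 1, {0}};
  }
  return real_decode(encoded);  // only interesting layers hit the real decoder
}
```

Producing a single-pixel frame keeps the `PeerConnection` pipeline fed without spending decode time on streams that the analyzer will ignore anyway.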