Add conceptual docs for modules/video_coding

Bug: webrtc:12558
Change-Id: I6d258fcd6b666453397ce833d906efc7a6ce3dbc
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/215071
Commit-Queue: Rasmus Brandt <brandtr@webrtc.org>
Reviewed-by: Artem Titov <titovartem@webrtc.org>
Reviewed-by: Philip Eliasson <philipel@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#33754}
diff --git a/g3doc/sitemap.md b/g3doc/sitemap.md
index 8804aab..43e1440 100644
--- a/g3doc/sitemap.md
+++ b/g3doc/sitemap.md
@@ -22,6 +22,7 @@
             *   [APM](/modules/audio_processing/g3doc/audio_processing_module.md)
     *   Video
         *   [Adaptation](/video/g3doc/adaptation.md)
+        *   [Video coding](/modules/video_coding/g3doc/index.md)
     *   DataChannel
     *   PeerConnection
     *   Desktop capture
diff --git a/modules/video_coding/g3doc/index.md b/modules/video_coding/g3doc/index.md
new file mode 100644
index 0000000..6fdab6e
--- /dev/null
+++ b/modules/video_coding/g3doc/index.md
@@ -0,0 +1,177 @@
+<?% config.freshness.owner = 'brandtr' %?>
+<?% config.freshness.reviewed = '2021-04-15' %?>
+
+# Video coding in WebRTC
+
+## Introduction to layered video coding
+
+[Video coding][video-coding-wiki] is the process of encoding a stream of
+uncompressed video frames into a compressed bitstream, whose bitrate is lower
+than that of the original stream.
+
+### Block-based hybrid video coding
+
+All video codecs in WebRTC are based on the block-based hybrid video coding
+paradigm, which entails prediction of the original video frame using either
+[information from previously encoded frames][motion-compensation-wiki] or
+information from previously encoded portions of the current frame, subtraction
+of the prediction from the original video, and
+[transform][transform-coding-wiki] and [quantization][quantization-wiki] of the
+resulting difference. The output of the quantization process, quantized
+transform coefficients, is losslessly [entropy coded][entropy-coding-wiki] along
+with other encoder parameters (e.g., those related to the prediction process).
+A reconstruction is then formed by inverse quantizing and inverse
+transforming the quantized transform coefficients and adding the result to the
+prediction. Finally, in-loop filtering is applied and the resulting
+reconstruction is stored as a reference frame to be used to develop predictions
+for future frames.
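+
+The prediction, quantization, and reconstruction steps can be illustrated with
+a minimal sketch. It is not taken from any WebRTC codec wrapper: the names
+`kQuantStep`, `EncodedBlock`, `EncodeBlock` and `Reconstruct` are hypothetical,
+the block is one-dimensional, and the transform, entropy coding and in-loop
+filtering stages are left out for brevity.
+
+```cpp
+#include <cassert>
+#include <cmath>
+#include <cstddef>
+#include <cstdlib>
+#include <vector>
+
+// Hypothetical uniform quantizer step size, chosen only for illustration.
+constexpr int kQuantStep = 8;
+
+// Hypothetical container for what an encoder would entropy code.
+struct EncodedBlock {
+  std::vector<int> levels;  // Quantized residual samples.
+};
+
+// Encoder side: subtract the prediction, then quantize the difference.
+// The transform stage is treated as identity in this sketch.
+EncodedBlock EncodeBlock(const std::vector<int>& original,
+                         const std::vector<int>& prediction) {
+  assert(original.size() == prediction.size());
+  EncodedBlock out;
+  for (std::size_t i = 0; i < original.size(); ++i) {
+    const int residual = original[i] - prediction[i];
+    const double scaled = static_cast<double>(residual) / kQuantStep;
+    out.levels.push_back(static_cast<int>(std::lround(scaled)));
+  }
+  return out;
+}
+
+// Run by both encoder and decoder: inverse quantize and add the prediction.
+std::vector<int> Reconstruct(const EncodedBlock& block,
+                             const std::vector<int>& prediction) {
+  assert(block.levels.size() == prediction.size());
+  std::vector<int> reconstruction;
+  for (std::size_t i = 0; i < block.levels.size(); ++i) {
+    reconstruction.push_back(prediction[i] + block.levels[i] * kQuantStep);
+  }
+  return reconstruction;
+}
+```
+
+Because the encoder runs `Reconstruct` on the same quantized levels that the
+decoder receives, both sides predict future frames from identical reference
+data. With round-to-nearest quantization, each reconstructed sample in this
+sketch differs from the original by at most half of `kQuantStep`.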
+
+### Frame types
+
+When an encoded frame depends on previously encoded frames (i.e., it has one or
+more inter-frame dependencies), the prior frames must be available at the
+receiver before the current frame can be decoded. In order for a receiver to
+start decoding an encoded bitstream, a frame which has no prior dependencies is
+required. Such a frame is called a "key frame". For real-time-communications
+encoding, key frames typically compress less efficiently than "delta frames"
+(i.e., frames whose predictions are derived from previously encoded frames).
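+
+The dependency rule above can be sketched as a simple check. `FrameInfo`,
+`IsKeyFrame`, `IsDecodable` and `MarkDecoded` are hypothetical names used for
+illustration only and do not correspond to WebRTC classes.
+
+```cpp
+#include <cassert>
+#include <cstdint>
+#include <set>
+#include <vector>
+
+// Hypothetical minimal frame descriptor.
+struct FrameInfo {
+  int64_t id;
+  std::vector<int64_t> dependencies;  // Empty for a key frame.
+};
+
+// A key frame has no inter-frame dependencies.
+bool IsKeyFrame(const FrameInfo& frame) { return frame.dependencies.empty(); }
+
+// A frame is decodable once every frame it references has been decoded.
+bool IsDecodable(const FrameInfo& frame, const std::set<int64_t>& decoded) {
+  for (int64_t dependency : frame.dependencies) {
+    if (decoded.count(dependency) == 0) return false;
+  }
+  return true;
+}
+
+// Records a frame as decoded; callers must only pass decodable frames.
+void MarkDecoded(const FrameInfo& frame, std::set<int64_t>& decoded) {
+  assert(IsDecodable(frame, decoded));
+  decoded.insert(frame.id);
+}
+```
+
+With an empty `decoded` set, only a key frame passes `IsDecodable`, which is
+why a receiver needs one before it can start decoding.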
+
+### Single-layer coding
+
+In 1:1 calls, the encoded bitstream has a single recipient. Using end-to-end
+bandwidth estimation, the target bitrate can thus be well tailored for the
+intended recipient. The number of key frames can be kept to a minimum and the
+compressibility of the stream can be maximized. One way of achieving this is by
+using "single-layer coding", where each delta frame only depends on the frame
+that was most recently encoded.
+
+### Scalable video coding
+
+In multiway conferences, on the other hand, the encoded bitstream has multiple
+recipients each of whom may have different downlink bandwidths. In order to
+tailor the encoded bitstreams to a heterogeneous network of receivers,
+[scalable video coding][svc-wiki] can be used. The idea is to introduce
+structure into the dependency graph of the encoded bitstream, such that _layers_ of
+the full stream can be decoded using only available lower layers. This structure
+allows for a [selective forwarding unit][sfu-webrtc-glossary] to discard upper
+layers of the bitstream in order to achieve the intended downlink
+bandwidth.
+
+There are multiple types of scalability:
+
+* _Temporal scalability_: lower layers have a lower framerate (and bitrate) than the upper layer(s)
+* _Spatial scalability_: lower layers have a lower resolution (and bitrate) than the upper layer(s)
+* _Quality scalability_: lower layers have a lower bitrate than the upper layer(s), at the same framerate and resolution
+
+WebRTC supports temporal scalability for `VP8`, `VP9` and `AV1`, and spatial
+scalability for `VP9` and `AV1`.
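+
+As an illustration of temporal scalability, the sketch below assigns temporal
+layer ids using a three-layer pattern (T0, T2, T1, T2, repeating) and shows how
+a forwarding unit could drop upper layers. The pattern and the names
+`TemporalLayerId` and `ForwardedFrames` are assumptions made for this example,
+not the behavior of any particular WebRTC class.
+
+```cpp
+#include <cassert>
+#include <vector>
+
+// Temporal layer id for a frame index under an assumed three-layer pattern
+// that repeats every four frames: T0, T2, T1, T2.
+int TemporalLayerId(int frame_index) {
+  assert(frame_index >= 0);
+  static constexpr int kPattern[4] = {0, 2, 1, 2};
+  return kPattern[frame_index % 4];
+}
+
+// Returns the indices of the frames a forwarding unit would keep when it
+// forwards only frames at or below `max_temporal_layer`.
+std::vector<int> ForwardedFrames(int num_frames, int max_temporal_layer) {
+  std::vector<int> kept;
+  for (int i = 0; i < num_frames; ++i) {
+    if (TemporalLayerId(i) <= max_temporal_layer) kept.push_back(i);
+  }
+  return kept;
+}
+```
+
+Under this pattern, each temporal layer that is dropped halves the forwarded
+framerate: out of eight frames, all eight are kept for layer 2, four for
+layer 1, and two for layer 0.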
+
+### Simulcast
+
+Simulcast is another approach for multiway conferencing, where multiple
+_independent_ bitstreams are produced by the encoder.
+
+In cases where multiple encodings of the same source are required (e.g., uplink
+transmission in a multiway call), spatial scalability with inter-layer
+prediction generally offers superior coding efficiency compared with simulcast.
+When a single encoding is required (e.g., downlink transmission in any call),
+simulcast generally provides better coding efficiency for the upper spatial
+layers. The `K-SVC` concept, where spatial inter-layer dependencies are only
+used to encode key frames, for which inter-layer prediction is typically
+significantly more effective than it is for delta frames, can be seen as a
+compromise between full spatial scalability and simulcast.
+
+## Overview of implementation in `modules/video_coding`
+
+Given the general introduction to video coding above, we now describe some
+specifics of the [`modules/video_coding`][modules-video-coding] folder in WebRTC.
+
+### Built-in software codecs in [`modules/video_coding/codecs`][modules-video-coding-codecs]
+
+This folder contains WebRTC-specific classes that wrap software codec
+implementations for different video coding standards:
+
+* [libaom][libaom-src] for [AV1][av1-spec]
+* [libvpx][libvpx-src] for [VP8][vp8-spec] and [VP9][vp9-spec]
+* [OpenH264][openh264-src] for [H.264 constrained baseline profile][h264-spec]
+
+Users of the library can also inject their own codecs, using the
+[VideoEncoderFactory][video-encoder-factory-interface] and
+[VideoDecoderFactory][video-decoder-factory-interface] interfaces. This is how
+platform-supported codecs, such as hardware-backed codecs, are implemented.
+
+### Video codec test framework in [`modules/video_coding/codecs/test`][modules-video-coding-codecs-test]
+
+This folder contains a test framework that can be used to evaluate video quality
+performance of different video codec implementations.
+
+### SVC helper classes in [`modules/video_coding/svc`][modules-video-coding-svc]
+
+*   [`ScalabilityStructure*`][scalabilitystructure] - different
+    [standardized scalability structures][scalability-structure-spec]
+*   [`ScalableVideoController`][scalablevideocontroller] - instructs the video encoder on how
+    to create a scalable stream
+*   [`SvcRateAllocator`][svcrateallocator] - bitrate allocation to different spatial and temporal
+    layers
+
+### Utility classes in [`modules/video_coding/utility`][modules-video-coding-utility]
+
+*   [`FrameDropper`][framedropper] - drops incoming frames when the encoder systematically
+    overshoots its target bitrate
+*   [`FramerateController`][frameratecontroller] - drops incoming frames to achieve a target framerate
+*   [`QpParser`][qpparser] - parses the quantization parameter from a bitstream
+*   [`QualityScaler`][qualityscaler] - signals when an encoder generates encoded frames whose
+    quantization parameter is outside the window of acceptable values
+*   [`SimulcastRateAllocator`][simulcastrateallocator] - bitrate allocation to simulcast layers
+
+### General helper classes in [`modules/video_coding`][modules-video-coding]
+
+*   [`FecControllerDefault`][feccontrollerdefault] - provides a default implementation for rate
+    allocation to [forward error correction][fec-wiki]
+*   [`VideoCodecInitializer`][videocodecinitializer] - converts between different encoder configuration
+    structs
+
+### Receiver buffer classes in [`modules/video_coding`][modules-video-coding]
+
+*   [`PacketBuffer`][packetbuffer] - (re-)combines RTP packets into frames
+*   [`RtpFrameReferenceFinder`][rtpframereferencefinder] - determines dependencies between frames based on information in the RTP header, payload header and RTP extensions
+*   [`FrameBuffer`][framebuffer] - orders frames based on their dependencies to be fed to the decoder
+
+[video-coding-wiki]: https://en.wikipedia.org/wiki/Video_coding_format
+[motion-compensation-wiki]: https://en.wikipedia.org/wiki/Motion_compensation
+[transform-coding-wiki]: https://en.wikipedia.org/wiki/Transform_coding
+[motion-vector-wiki]: https://en.wikipedia.org/wiki/Motion_vector
+[mpeg-wiki]: https://en.wikipedia.org/wiki/Moving_Picture_Experts_Group
+[svc-wiki]: https://en.wikipedia.org/wiki/Scalable_Video_Coding
+[sfu-webrtc-glossary]: https://webrtcglossary.com/sfu/
+[libvpx-src]: https://chromium.googlesource.com/webm/libvpx/
+[libaom-src]: https://aomedia.googlesource.com/aom/
+[openh264-src]: https://github.com/cisco/openh264
+[vp8-spec]: https://tools.ietf.org/html/rfc6386
+[vp9-spec]: https://storage.googleapis.com/downloads.webmproject.org/docs/vp9/vp9-bitstream-specification-v0.6-20160331-draft.pdf
+[av1-spec]: https://aomediacodec.github.io/av1-spec/
+[h264-spec]: https://www.itu.int/rec/T-REC-H.264-201906-I/en
+[video-encoder-factory-interface]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/video_codecs/video_encoder_factory.h;l=27;drc=afadfb24a5e608da6ae102b20b0add53a083dcf3
+[video-decoder-factory-interface]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/api/video_codecs/video_decoder_factory.h;l=27;drc=49c293f03d8f593aa3aca282577fcb14daa63207
+[scalability-structure-spec]: https://w3c.github.io/webrtc-svc/#scalabilitymodes*
+[fec-wiki]: https://en.wikipedia.org/wiki/Error_correction_code#Forward_error_correction
+[entropy-coding-wiki]: https://en.wikipedia.org/wiki/Entropy_encoding
+[modules-video-coding]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/
+[modules-video-coding-codecs]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/codecs/
+[modules-video-coding-codecs-test]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/codecs/test/
+[modules-video-coding-svc]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/svc/
+[modules-video-coding-utility]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/utility/
+[scalabilitystructure]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/svc/create_scalability_structure.h?q=CreateScalabilityStructure
+[scalablevideocontroller]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/svc/scalable_video_controller.h?q=ScalableVideoController
+[svcrateallocator]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/svc/svc_rate_allocator.h?q=SvcRateAllocator
+[framedropper]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/utility/frame_dropper.h?q=FrameDropper
+[frameratecontroller]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/utility/framerate_controller.h?q=FramerateController
+[qpparser]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/utility/qp_parser.h?q=QpParser
+[qualityscaler]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/utility/quality_scaler.h?q=QualityScaler
+[simulcastrateallocator]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/utility/simulcast_rate_allocator.h?q=SimulcastRateAllocator
+[feccontrollerdefault]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/fec_controller_default.h?q=FecControllerDefault
+[videocodecinitializer]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/include/video_codec_initializer.h?q=VideoCodecInitializer
+[packetbuffer]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/packet_buffer.h?q=PacketBuffer
+[rtpframereferencefinder]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/rtp_frame_reference_finder.h?q=RtpFrameReferenceFinder
+[framebuffer]: https://source.chromium.org/chromium/chromium/src/+/master:third_party/webrtc/modules/video_coding/frame_buffer2.h?q=FrameBuffer
+[quantization-wiki]: https://en.wikipedia.org/wiki/Quantization_(signal_processing)