[getStats] Implement "media-source" audio levels, fixing Chrome bug.
Implements RTCAudioSourceStats members:
- audioLevel
- totalAudioEnergy
- totalSamplesDuration
In this CL description these are collectively referred to as the audio
levels.
The audio levels are removed from sending "track" stats (in Chrome,
these are now reported as undefined instead of 0).
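
For readers unfamiliar with how these members are meant to be consumed, here
is a minimal sketch of deriving an average audio level over an interval from
two stats snapshots: the square root of the energy delta divided by the
duration delta, following the webrtc-stats definition of totalAudioEnergy.
The AudioSourceSnapshot struct and AverageAudioLevel() are hypothetical names
used for illustration only; they are not part of this CL.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Hypothetical snapshot of two RTCAudioSourceStats members (illustration
// only, not a type from the WebRTC code base).
struct AudioSourceSnapshot {
  double total_audio_energy;      // Cumulative, unitless.
  double total_samples_duration;  // Cumulative, in seconds.
};

// Returns the average (RMS) audio level in [0,1] between two snapshots, or
// std::nullopt if no audio was measured during the interval.
std::optional<double> AverageAudioLevel(const AudioSourceSnapshot& older,
                                        const AudioSourceSnapshot& newer) {
  // Both counters are cumulative, so the newer snapshot must not be smaller.
  assert(newer.total_samples_duration >= older.total_samples_duration);
  assert(newer.total_audio_energy >= older.total_audio_energy);
  const double energy = newer.total_audio_energy - older.total_audio_energy;
  const double duration =
      newer.total_samples_duration - older.total_samples_duration;
  if (duration <= 0.0)
    return std::nullopt;
  return std::sqrt(energy / duration);
}
```

Unlike the instantaneous audioLevel, this value is independent of how often
getStats() is polled.
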
Background:
For sending tracks, audio levels were always reported as 0 in Chrome
(https://crbug.com/736403), while audio levels were correctly reported
for receiving tracks. This problem affected the standard getStats() but
not the legacy getStats(), which blocked some users from migrating to
the standard API. Native third_party/webrtc code was likely unaffected
because there the delivery of audio frames from device to send-stream
uses a different code path outside of chromium.
A recent PR (https://github.com/w3c/webrtc-stats/pull/451) moved the
send-side audio levels to the RTCAudioSourceStats, while keeping the
receive-side audio levels on the "track" stats. This allows an
implementation to report the audio levels even if samples are not sent
onto the network (for example, before an ICE connection has been
established), which partly matches the current implementation.
Changes:
1. Audio levels are added to RTCAudioSourceStats. Send-side audio
"track" stats are left undefined. Receive-side audio "track" stats
are not changed in this CL and continue to work.
2. Audio level computation is moved from the AudioState and
AudioTransportImpl to the AudioSendStream. This is because a) the
AudioTransportImpl::RecordedDataIsAvailable() code path is not
exercised in chromium, and b) per spec, audio levels should not be
calculated on a per-call basis, which is the scope the AudioState is
defined for.
3. The audio level computation is now performed in
AudioSendStream::SendAudioData(), a code path used by both native
and chromium code.
4. Comments are added to document behavior of existing code, such as
AudioLevel and AudioSendStream::SendAudioData().
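
To illustrate changes 2 and 3, here is a rough sketch of what per-send-stream
audio level accumulation could look like, assuming mono 16-bit PCM frames and
the spec's definition of totalAudioEnergy (duration-weighted squared level).
SendStreamAudioLevel and OnFrame() are hypothetical names; this is not the
actual AudioLevel or AudioSendStream code touched by this CL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

constexpr double kMaxInt16 = 32767.0;

// Hypothetical per-send-stream accumulator (illustration only).
class SendStreamAudioLevel {
 public:
  // Processes one mono frame of 16-bit PCM lasting |duration_seconds|.
  void OnFrame(const int16_t* samples, size_t count, double duration_seconds) {
    assert(duration_seconds > 0.0);
    if (count == 0)
      return;
    int peak = 0;
    double sum_squares = 0.0;
    for (size_t i = 0; i < count; ++i) {
      // Clamp so that -32768 maps to the same magnitude as 32767.
      const int magnitude = std::min(std::abs(static_cast<int>(samples[i])),
                                     32767);
      peak = std::max(peak, magnitude);
      sum_squares += static_cast<double>(magnitude) * magnitude;
    }
    peak_level_ = peak;
    // Mean square of the frame normalized to [0,1], weighted by duration.
    total_energy_ +=
        duration_seconds * sum_squares / (count * kMaxInt16 * kMaxInt16);
    total_duration_ += duration_seconds;
  }

  double audio_level() const { return peak_level_ / kMaxInt16; }  // [0,1]
  double total_energy() const { return total_energy_; }
  double total_duration() const { return total_duration_; }

 private:
  int peak_level_ = 0;  // [0,32767], peak of the most recent frame.
  double total_energy_ = 0.0;
  double total_duration_ = 0.0;
};
```

Keeping the accumulator on the send stream means every code path that feeds
frames into the stream contributes to the same counters.
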
Note:
As before this CL, the audio level is only calculated after an
AudioSendStream has been created. This means that before an
offer/answer (O/A) negotiation, audio levels are unavailable.
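
The consequence for stats consumers can be sketched as follows: the audio
level members stay undefined until a send stream (and therefore an SSRC)
exists. The structs and BuildAudioSourceStats() below are simplified,
hypothetical stand-ins for the collector logic, with std::optional modelling
"undefined" and SSRC 0 meaning "no SSRC" as in the current code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Simplified, hypothetical stand-in for cricket::VoiceSenderInfo.
struct VoiceSenderInfo {
  int audio_level;  // [0,32767]
  double total_input_energy;
  double total_input_duration;
};

// Simplified, hypothetical stand-in for RTCAudioSourceStats.
struct AudioSourceStats {
  std::optional<double> audio_level;  // [0,1]
  std::optional<double> total_audio_energy;
  std::optional<double> total_samples_duration;
};

AudioSourceStats BuildAudioSourceStats(
    uint32_t ssrc,
    const std::map<uint32_t, VoiceSenderInfo>& info_by_ssrc) {
  AudioSourceStats stats;
  if (ssrc == 0)
    return stats;  // No send stream yet: all members stay undefined.
  const auto it = info_by_ssrc.find(ssrc);
  if (it == info_by_ssrc.end())
    return stats;
  const VoiceSenderInfo& info = it->second;
  assert(info.audio_level >= 0 && info.audio_level <= 32767);
  stats.audio_level = info.audio_level / 32767.0;
  stats.total_audio_energy = info.total_input_energy;
  stats.total_samples_duration = info.total_input_duration;
  return stats;
}
```
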
According to spec, if we have an audio source, we should have audio
levels. An immediate solution to this would have been to calculate the
audio level at pc/rtp_sender.cc. The problem is that the
LocalAudioSinkAdapter::OnData() code path, while exercised in chromium,
is not exercised in native code. The issue of calculating audio levels
on a per-source basis rather than on a per-send-stream basis is left to
https://crbug.com/webrtc/10771, an existing "media-source" bug.
This CL can be verified manually in Chrome at:
https://codepen.io/anon/pen/vqRGyq
Bug: chromium:736403, webrtc:10771
Change-Id: I8036cd9984f3b187c3177470a8c0d6670a201a5a
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/143789
Reviewed-by: Oskar Sundbom <ossu@webrtc.org>
Reviewed-by: Stefan Holmer <stefan@webrtc.org>
Commit-Queue: Henrik Boström <hbos@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#28480}
diff --git a/pc/rtc_stats_collector.cc b/pc/rtc_stats_collector.cc
index ec917ae..0ccfd18 100644
--- a/pc/rtc_stats_collector.cc
+++ b/pc/rtc_stats_collector.cc
@@ -551,13 +551,6 @@
attachment_id);
audio_track_stats->remote_source = false;
audio_track_stats->detached = false;
- if (voice_sender_info.audio_level >= 0) {
- audio_track_stats->audio_level = DoubleAudioLevelFromIntAudioLevel(
- voice_sender_info.audio_level);
- }
- audio_track_stats->total_audio_energy = voice_sender_info.total_input_energy;
- audio_track_stats->total_samples_duration =
- voice_sender_info.total_input_duration;
if (voice_sender_info.apm_statistics.echo_return_loss) {
audio_track_stats->echo_return_loss =
*voice_sender_info.apm_statistics.echo_return_loss;
@@ -1395,18 +1388,38 @@
const auto& track = sender_internal->track();
if (!track)
continue;
- // TODO(hbos): The same track could be attached to multiple senders which
- // should result in multiple senders referencing the same media source
- // stats. When all media source related metrics are moved to the track's
- // source (e.g. input frame rate is moved from cricket::VideoSenderInfo to
- // VideoTrackSourceInterface::Stats), don't create separate media source
- // stats objects on a per-attachment basis.
+ // TODO(https://crbug.com/webrtc/10771): The same track could be attached
+ // to multiple senders which should result in multiple senders referencing
+ // the same media-source stats. When all media source related metrics are
+ // moved to the track's source (e.g. input frame rate is moved from
+ // cricket::VideoSenderInfo to VideoTrackSourceInterface::Stats and audio
+ // levels are moved to the corresponding audio track/source object), don't
+ // create separate media source stats objects on a per-attachment basis.
std::unique_ptr<RTCMediaSourceStats> media_source_stats;
if (track->kind() == MediaStreamTrackInterface::kAudioKind) {
- media_source_stats = absl::make_unique<RTCAudioSourceStats>(
+ auto audio_source_stats = absl::make_unique<RTCAudioSourceStats>(
RTCMediaSourceStatsIDFromKindAndAttachment(
cricket::MEDIA_TYPE_AUDIO, sender_internal->AttachmentId()),
timestamp_us);
+ // TODO(https://crbug.com/webrtc/10771): We shouldn't need to have an
+ // SSRC assigned (there shouldn't need to exist a send-stream, created
+ // by an O/A exchange) in order to read audio media-source stats.
+ // TODO(https://crbug.com/webrtc/8694): SSRC 0 shouldn't be a magic
+ // value indicating no SSRC.
+ if (sender_internal->ssrc() != 0) {
+ auto* voice_sender_info =
+ track_media_info_map->GetVoiceSenderInfoBySsrc(
+ sender_internal->ssrc());
+ if (voice_sender_info) {
+ audio_source_stats->audio_level = DoubleAudioLevelFromIntAudioLevel(
+ voice_sender_info->audio_level);
+ audio_source_stats->total_audio_energy =
+ voice_sender_info->total_input_energy;
+ audio_source_stats->total_samples_duration =
+ voice_sender_info->total_input_duration;
+ }
+ }
+ media_source_stats = std::move(audio_source_stats);
} else {
RTC_DCHECK_EQ(MediaStreamTrackInterface::kVideoKind, track->kind());
auto video_source_stats = absl::make_unique<RTCVideoSourceStats>(
@@ -1420,15 +1433,18 @@
video_source_stats->width = source_stats.input_width;
video_source_stats->height = source_stats.input_height;
}
- // TODO(hbos): Source stats should not depend on whether or not we are
- // connected/have an SSRC assigned. Related to
- // https://crbug.com/webrtc/8694 (using ssrc 0 to indicate "none").
+ // TODO(https://crbug.com/webrtc/10771): We shouldn't need to have an
+ // SSRC assigned (there shouldn't need to exist a send-stream, created
+ // by an O/A exchange) in order to get framesPerSecond.
+ // TODO(https://crbug.com/webrtc/8694): SSRC 0 shouldn't be a magic
+ // value indicating no SSRC.
if (sender_internal->ssrc() != 0) {
- auto* sender_info = track_media_info_map->GetVideoSenderInfoBySsrc(
- sender_internal->ssrc());
- if (sender_info) {
+ auto* video_sender_info =
+ track_media_info_map->GetVideoSenderInfoBySsrc(
+ sender_internal->ssrc());
+ if (video_sender_info) {
video_source_stats->frames_per_second =
- sender_info->framerate_input;
+ video_sender_info->framerate_input;
}
}
media_source_stats = std::move(video_source_stats);
diff --git a/pc/rtc_stats_collector_unittest.cc b/pc/rtc_stats_collector_unittest.cc
index 963a3bc..02f6654 100644
--- a/pc/rtc_stats_collector_unittest.cc
+++ b/pc/rtc_stats_collector_unittest.cc
@@ -1438,9 +1438,6 @@
cricket::VoiceSenderInfo voice_sender_info_ssrc1;
voice_sender_info_ssrc1.local_stats.push_back(cricket::SsrcSenderInfo());
voice_sender_info_ssrc1.local_stats[0].ssrc = 1;
- voice_sender_info_ssrc1.audio_level = 32767;
- voice_sender_info_ssrc1.total_input_energy = 0.25;
- voice_sender_info_ssrc1.total_input_duration = 0.5;
voice_sender_info_ssrc1.apm_statistics.echo_return_loss = 42.0;
voice_sender_info_ssrc1.apm_statistics.echo_return_loss_enhancement = 52.0;
@@ -1471,9 +1468,6 @@
expected_local_audio_track_ssrc1.remote_source = false;
expected_local_audio_track_ssrc1.ended = true;
expected_local_audio_track_ssrc1.detached = false;
- expected_local_audio_track_ssrc1.audio_level = 1.0;
- expected_local_audio_track_ssrc1.total_audio_energy = 0.25;
- expected_local_audio_track_ssrc1.total_samples_duration = 0.5;
expected_local_audio_track_ssrc1.echo_return_loss = 42.0;
expected_local_audio_track_ssrc1.echo_return_loss_enhancement = 52.0;
ASSERT_TRUE(report->Get(expected_local_audio_track_ssrc1.id()))
@@ -2219,6 +2213,9 @@
voice_media_info.senders.push_back(cricket::VoiceSenderInfo());
voice_media_info.senders[0].local_stats.push_back(cricket::SsrcSenderInfo());
voice_media_info.senders[0].local_stats[0].ssrc = kSsrc;
+ voice_media_info.senders[0].audio_level = 32767; // [0,32767]
+ voice_media_info.senders[0].total_input_energy = 2.0;
+ voice_media_info.senders[0].total_input_duration = 3.0;
auto* voice_media_channel = pc_->AddVoiceChannel("AudioMid", "TransportName");
voice_media_channel->SetStats(voice_media_info);
stats_->SetupLocalTrackAndSender(cricket::MEDIA_TYPE_AUDIO,
@@ -2231,6 +2228,9 @@
report->timestamp_us());
expected_audio.track_identifier = "LocalAudioTrackID";
expected_audio.kind = "audio";
+ expected_audio.audio_level = 1.0; // [0,1]
+ expected_audio.total_audio_energy = 2.0;
+ expected_audio.total_samples_duration = 3.0;
ASSERT_TRUE(report->Get(expected_audio.id()));
EXPECT_EQ(report->Get(expected_audio.id())->cast_to<RTCAudioSourceStats>(),
diff --git a/pc/rtc_stats_integrationtest.cc b/pc/rtc_stats_integrationtest.cc
index a05fa0e..adb986d 100644
--- a/pc/rtc_stats_integrationtest.cc
+++ b/pc/rtc_stats_integrationtest.cc
@@ -647,8 +647,13 @@
media_stream_track.jitter_buffer_delay);
verifier.TestMemberIsNonNegative<uint64_t>(
media_stream_track.jitter_buffer_emitted_count);
- verifier.TestMemberIsNonNegative<uint64_t>(
+ verifier.TestMemberIsPositive<double>(media_stream_track.audio_level);
+ verifier.TestMemberIsPositive<double>(
+ media_stream_track.total_audio_energy);
+ verifier.TestMemberIsPositive<uint64_t>(
media_stream_track.total_samples_received);
+ verifier.TestMemberIsPositive<double>(
+ media_stream_track.total_samples_duration);
verifier.TestMemberIsNonNegative<uint64_t>(
media_stream_track.concealed_samples);
verifier.TestMemberIsNonNegative<uint64_t>(
@@ -676,8 +681,12 @@
verifier.TestMemberIsUndefined(media_stream_track.jitter_buffer_delay);
verifier.TestMemberIsUndefined(
media_stream_track.jitter_buffer_emitted_count);
+ verifier.TestMemberIsUndefined(media_stream_track.audio_level);
+ verifier.TestMemberIsUndefined(media_stream_track.total_audio_energy);
verifier.TestMemberIsUndefined(
media_stream_track.total_samples_received);
+ verifier.TestMemberIsUndefined(
+ media_stream_track.total_samples_duration);
verifier.TestMemberIsUndefined(media_stream_track.concealed_samples);
verifier.TestMemberIsUndefined(media_stream_track.concealment_events);
verifier.TestMemberIsUndefined(
@@ -710,11 +719,6 @@
verifier.TestMemberIsUndefined(
media_stream_track.sum_squared_frame_durations);
// Audio-only members
- verifier.TestMemberIsNonNegative<double>(media_stream_track.audio_level);
- verifier.TestMemberIsNonNegative<double>(
- media_stream_track.total_audio_energy);
- verifier.TestMemberIsNonNegative<double>(
- media_stream_track.total_samples_duration);
// TODO(hbos): |echo_return_loss| and |echo_return_loss_enhancement| are
// flaky on msan bot (sometimes defined, sometimes undefined). Should the
// test run until available or is there a way to have it always be
@@ -903,6 +907,9 @@
bool VerifyRTCAudioSourceStats(const RTCAudioSourceStats& audio_source) {
RTCStatsVerifier verifier(report_, &audio_source);
VerifyRTCMediaSourceStats(audio_source, &verifier);
+ verifier.TestMemberIsPositive<double>(audio_source.audio_level);
+ verifier.TestMemberIsPositive<double>(audio_source.total_audio_energy);
+ verifier.TestMemberIsPositive<double>(audio_source.total_samples_duration);
return verifier.ExpectAllMembersSuccessfullyTested();
}