This visualisation uses “@d3/stacked-horizontal-bar-chart” to show the metadata coverage of the Mozilla Common Voice v9 dataset.
The original data is taken from the Common Voice cv-dataset repository (direct link).
- Splits by age range – shows how many clips have been provided by speakers of different age ranges for each locale (language)
- Splits by gender – shows how many clips have been provided by speakers of different genders for each locale (language)
- Average utterance duration by language – shows the average utterance length, in seconds, for each language
- Total hours versus validated hours by language – compares the total number of hours of recordings to the number of hours of validated recordings for each language
https://observablehq.com/@kathyreid/mozilla-common-voice-v9-dataset-metadata-coverage
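
As a rough illustration of the data preparation behind charts like these, the sketch below reshapes per-locale split proportions into the long-form rows (one row per locale and category) that a stacked horizontal bar chart typically consumes. It also computes the validated share of total hours. It is a minimal sketch under stated assumptions, not the notebook's actual code: the field names `clips`, `splits`, `totalHrs` and `validHrs`, the idea that splits are stored as proportions of clips, the helper names, the locale codes `xx` and `yy`, and all sample values are hypothetical.

```python
from typing import Dict, List


def splits_to_rows(locales: Dict[str, dict], split: str) -> List[dict]:
    """Turn per-locale split proportions into long-form rows.

    Assumes (hypothetically) that each locale record has a "clips" count
    and a "splits" mapping whose values are proportions of those clips.
    """
    rows = []
    for locale, stats in locales.items():
        total_clips = stats["clips"]
        for category, share in stats["splits"][split].items():
            # Treat an empty key as "no value declared" (an assumption).
            label = category or "undeclared"
            rows.append(
                {"locale": locale, "category": label, "clips": share * total_clips}
            )
    return rows


def validated_share(stats: dict) -> float:
    """Fraction of recorded hours that are validated (0.0 if no hours)."""
    total = stats["totalHrs"]
    return stats["validHrs"] / total if total else 0.0


# Invented sample data for illustration only; not real Common Voice figures.
sample = {
    "xx": {
        "clips": 1000,
        "splits": {"age": {"": 0.5, "twenties": 0.25, "thirties": 0.25}},
        "totalHrs": 2.0,
        "validHrs": 1.5,
    },
    "yy": {
        "clips": 400,
        "splits": {"age": {"": 0.75, "forties": 0.25}},
        "totalHrs": 0.0,
        "validHrs": 0.0,
    },
}

if __name__ == "__main__":
    for row in splits_to_rows(sample, "age"):
        print(row)
    for locale, stats in sample.items():
        print(locale, validated_share(stats))
```

Long-form rows keep the reshaping independent of the charting library: each row carries the bar (locale), the stack segment (category) and the segment length (clips).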