Revamp WebCodecs documentation #43697

@@ -6,43 +6,105 @@ page-type: web-api-overview

{{DefaultAPISidebar("WebCodecs API")}}{{AvailableInWorkers("window_and_dedicated")}}

The **WebCodecs API** enables web developers to encode and decode video and audio in the browser efficiently (using hardware acceleration) and with very low-level control (processing on a per-frame basis).

It is useful for web applications that do heavy media processing, or which require low-level control over the way media is encoded.
This includes browser-based video and audio editing, as well as live-streaming and video conferencing.

## Concepts

The WebCodecs API provides browser-native interfaces to represent raw video frames, encoded video frames, as well as raw and encoded audio.

|             | Video                            | Audio                            |
| ----------- | -------------------------------- | -------------------------------- |
| **Raw**     | {{domxref("VideoFrame")}}        | {{domxref("AudioData")}}         |
| **Encoded** | {{domxref("EncodedVideoChunk")}} | {{domxref("EncodedAudioChunk")}} |

The WebCodecs API also introduces the {{domxref("VideoDecoder")}} and {{domxref("VideoEncoder")}} interfaces, which transform `EncodedVideoChunk` objects into `VideoFrame` objects and vice versa.

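As a rough sketch of this transformation (the codec string and the `chunks` array, which would normally come from a demuxer, are hypothetical placeholders here):

```js
// Sketch only: `chunks` (an array of EncodedVideoChunk objects) would come
// from a demuxer, and the codec string must match how the chunks were encoded.
async function decodeChunks(chunks) {
  const decoder = new VideoDecoder({
    output(frame) {
      // Each decoded VideoFrame must be closed when you are done with it,
      // since it holds on to limited graphics memory.
      console.log(`Decoded frame at ${frame.timestamp} µs`);
      frame.close();
    },
    error(e) {
      console.error(e);
    },
  });

  decoder.configure({ codec: "vp09.00.40.08.00" }); // fully specified VP9 string
  for (const chunk of chunks) {
    decoder.decode(chunk);
  }
  await decoder.flush(); // wait until every queued chunk has been decoded
  decoder.close();
}
```
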
Likewise, the WebCodecs API also introduces the {{domxref("AudioDecoder")}} and {{domxref("AudioEncoder")}} interfaces, which transform `EncodedAudioChunk` objects into `AudioData` objects and vice versa.

Generally there is a 1:1 correspondence between the raw and encoded versions of each media type. Decoding a number of `EncodedVideoChunk` objects will yield the same number of `VideoFrame` objects (and this is also true for audio).

### Video

A `VideoFrame` represents a video frame, and is tied to actual pixel data in the device's graphics memory, as well as metadata such as the timestamp and duration (in microseconds), format, and resolution. A `VideoFrame` can be constructed from any image source, and can also be rendered to a {{domxref("HTMLCanvasElement", "canvas")}} using any of the canvas rendering methods.

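For example, a decoded frame can be painted onto a 2D canvas. A minimal sketch, assuming `ctx` is a `CanvasRenderingContext2D` and `frame` is a `VideoFrame` from a decoder's output callback:

```js
// Sketch: a VideoFrame is a valid CanvasImageSource, so it can be passed
// straight to drawImage(). `ctx` and `frame` are assumed to exist.
function drawFrame(ctx, frame) {
  ctx.canvas.width = frame.displayWidth;
  ctx.canvas.height = frame.displayHeight;
  ctx.drawImage(frame, 0, 0);
  frame.close(); // release the underlying graphics memory promptly
}
```
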
`EncodedVideoChunk` represents the encoded (compressed) version of the same frame, tied to binary data in regular memory, with the same metadata plus one additional field: `type`, which can be `"key"` or `"delta"`, indicating whether it corresponds to a [key frame](https://webcodecsfundamentals.org/basics/encoded-video-chunk/#key-frames). An `EncodedVideoChunk` typically stores 10 to 100 times less data than its corresponding raw `VideoFrame`.

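Constructing an `EncodedVideoChunk` by hand looks like this (a sketch; the byte source and timestamp values are illustrative placeholders):

```js
// Sketch: wrap raw compressed bytes (e.g., produced by a demuxer) in an
// EncodedVideoChunk so a VideoDecoder can consume them.
function makeChunk(bytes, timestampUs, isKey) {
  return new EncodedVideoChunk({
    type: isKey ? "key" : "delta", // whether this is a key frame
    timestamp: timestampUs, // microseconds
    data: bytes, // a BufferSource holding the compressed frame
  });
}
```
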
### Audio

An `AudioData` object represents a number of individual audio samples (1024 is a typical number). Audio sample data can be extracted as a {{jsxref("Float32Array")}} via the `copyTo()` method. There is no direct integration with the [Web Audio API](/en-US/docs/Web/API/Web_Audio_API); however, the extracted `Float32Array` samples can be copied directly into an {{domxref("AudioBuffer")}} for playback.

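For example, here is a sketch of copying samples into an `AudioBuffer`, assuming an `AudioData` object `audioData` and an `AudioContext` `audioCtx` (both hypothetical here), and assuming the samples can be read in `"f32-planar"` format:

```js
// Sketch: copy each channel of an AudioData object into a Web Audio
// AudioBuffer, one plane at a time.
function audioDataToBuffer(audioData, audioCtx) {
  const buffer = audioCtx.createBuffer(
    audioData.numberOfChannels,
    audioData.numberOfFrames,
    audioData.sampleRate,
  );
  for (let ch = 0; ch < audioData.numberOfChannels; ch++) {
    const samples = new Float32Array(audioData.numberOfFrames);
    audioData.copyTo(samples, { planeIndex: ch, format: "f32-planar" });
    buffer.copyToChannel(samples, ch);
  }
  return buffer;
}
```
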
Likewise, an `EncodedAudioChunk` represents the encoded (compressed) version of an `AudioData` object, containing compressed audio sample data.

### Processing model

The WebCodecs API uses an asynchronous [processing model](https://w3c.github.io/webcodecs/#codec-processing-model-section). Each instance of an encoder or decoder maintains an internal, independent processing queue. When queueing a substantial number of encoded chunks for decoding, or frames/samples for encoding, it's important to keep this model in mind.

Methods named {{domxref("VideoEncoder/configure", "configure()")}}, {{domxref("VideoEncoder/encode", "encode()")}}, {{domxref("VideoDecoder/decode", "decode()")}}, and {{domxref("VideoEncoder/flush", "flush()")}} operate asynchronously by appending control messages to the end of the queue, while methods named {{domxref("VideoEncoder/reset", "reset()")}} and {{domxref("VideoEncoder/close", "close()")}} synchronously abort all pending work and purge the processing queue. After `reset()`, more work may be queued following a call to `configure()`, but `close()` is a permanent operation. These methods work the same way for both audio and video encoders and decoders.

The `flush()` method can be used to wait for the completion of all work that was pending at the time `flush()` was called. However, it should generally only be called once all desired work is queued; it is not intended to force progress at regular intervals. Calling it unnecessarily will affect encoder quality and cause decoders to require the next input to be a key frame.

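One practical consequence of this queue model is backpressure: you can check {{domxref("VideoEncoder.encodeQueueSize", "encodeQueueSize")}} before encoding and drop frames when the queue is backed up. A sketch (the threshold of 2 is an arbitrary illustrative choice):

```js
// Sketch: drop frames instead of letting the encoder's internal queue grow
// without bound. The threshold (2) is an arbitrary example value.
function encodeWithBackpressure(encoder, frame, keyFrame = false) {
  if (encoder.encodeQueueSize > 2) {
    frame.close(); // too many frames in flight; drop this one
    return false;
  }
  encoder.encode(frame, { keyFrame });
  frame.close(); // encode() holds its own reference to the frame
  return true;
}
```
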
### Codecs

A codec is a specific algorithm for encoding (compressing) and decoding (decompressing) video or audio. There are several industry-standard codecs for video, and a separate set of codecs for audio. Here are the major ones supported by the WebCodecs API:

#### Video codecs

- H.264 (AVC)
  - : The most widely supported video codec. Most MP4 files use H.264.
- VP9
  - : Open source, developed by Google. Better compression than H.264. Commonly used on YouTube and in WebM files.
- AV1
  - : The newest open-source codec, with better compression than VP9. Broad decoder support; hardware encoder support is still limited.
- H.265 (HEVC)
  - : Better compression than H.264, but with significant gaps in browser support outside of Apple platforms.

#### Audio codecs

- Opus
  - : Open source, low-latency. The recommended choice for most WebCodecs audio encoding.
- AAC
  - : Widely supported. Common in MP4 files.
- MP3
  - : Broadly supported for decoding, but not available as an encoder in WebCodecs.
- PCM
  - : Uncompressed audio. No quality loss, but large file sizes.

The WebCodecs specification supports a particular set of codecs, and individual devices and browsers may only support a subset of those. Encoders and decoders must be configured with fully specified codec strings (such as `"vp09.00.40.08.00"` for VP9 or `"avc1.4d0034"` for H.264) rather than ambiguous codec names like `"vp9"` or `"h264"`. The [Codec selection guide](/en-US/docs/Web/API/WebCodecs_API/Codec_selection) provides guidance on choosing an appropriate codec string (see the [Codec Support Table](https://webcodecsfundamentals.org/datasets/codec-support-table/) on webcodecsfundamentals.org for a complete list of codec strings and their browser support).

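Support for a given codec string can be checked at runtime with the static `isConfigSupported()` method before configuring an encoder. A sketch (the candidate strings, dimensions, and bitrate are illustrative):

```js
// Sketch: probe a list of candidate codec strings and return the first one
// this browser/device can encode. Config values are illustrative.
async function findSupportedCodec(codecStrings) {
  for (const codec of codecStrings) {
    const { supported } = await VideoEncoder.isConfigSupported({
      codec,
      width: 1280,
      height: 720,
      bitrate: 1_000_000,
    });
    if (supported) {
      return codec;
    }
  }
  return null; // none of the candidates are supported
}
```

For example, `findSupportedCodec(["av01.0.04M.08", "vp09.00.40.08.00", "avc1.4d0034"])` would prefer AV1, then fall back to VP9 and H.264.
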
### Muxing and demuxing

The WebCodecs API only deals with encoding and decoding; encoded chunks are just binary data. It does not provide a built-in way to read `EncodedVideoChunk` objects from a video file, or to write them to a playable video file.

Reading encoded chunks from a video file is a completely different process called demuxing, and to fetch `EncodedVideoChunk` objects from a video file, you will need to use a demuxing library such as [Mediabunny](https://mediabunny.dev/) or [web-demuxer](https://github.com/bilibili/web-demuxer).

These libraries follow the video container specifications (e.g., WebM, MP4) to extract the track data and byte offsets for each encoded chunk, and provide methods for extracting the actual chunks from the raw file.

Likewise, to write a playable video file, you will need a muxing library, with [Mediabunny](https://mediabunny.dev/) being the primary option. Muxing libraries handle formatting the binary encoded data and placing it at the correct byte position in the output file stream according to the container specification, so that the output video is playable.

You can find more information on muxing and demuxing in the [Muxing and Demuxing guide](https://webcodecsfundamentals.org/basics/muxing/) on webcodecsfundamentals.org.

## Guides

- [Video processing concepts](/en-US/docs/Web/API/WebCodecs_API/Video_processing_concepts)
  - : A brief primer on video processing, covering codecs and containers, muxing and demuxing, and conceptual information that explains how the WebCodecs API implements these concepts.
- [Using the WebCodecs API](/en-US/docs/Web/API/WebCodecs_API/Using_the_WebCodecs_API)
  - : An in-depth guide to using the WebCodecs API, including how to instantiate and configure encoders and decoders, how to create and consume video frames, and how to extract samples from `AudioData`.
- [Codec selection](/en-US/docs/Web/API/WebCodecs_API/Codec_selection)
  - : The WebCodecs API requires codec strings, which are precise identifiers that specify not just the codec family but also the profile, level, and other parameters. This guide explains how codec strings work and how to choose the right codec for common use cases.

Comment on lines +100 to +107:

**Collaborator:** Guides are never listed - see https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API Normally you'd just list them in the sidebar and as See also section. @wbamberg Do you have any thoughts on listing guides in API Overview pages. To my mind it is an omission - sure they are guides and we separate our doc types, but it is useful to have them here in the main entry point.

**Collaborator:** No, they're not, but I agree that they should be. I think of API overview pages as like expanded versions of the sidebar really, so it makes sense to have guides listed (and I would have them first, to mirror the sidebar). I also think we ought to be stricter about not having too much exposition in them, and tend towards moving exposition into separate guide pages. API overview pages IMO are primarily navigational, although they do also provide context and motivation. e.g. I think the Fetch API page is OK, but the Service Worker API is too much.

**Contributor (author):** We could move much of the content in the index to an "Intro to WebCodecs" guide, and slim down to a much narrower concepts and usage on the main page, just talking about the core interfaces (VideoEncoder, AudioEncoder, VideoDecoder, AudioDecoder) and the core methods (currently in processing model), and move everything else (codecs, muxing, diagrams) into an intro or concepts guide?

**Collaborator:** Yes, something like this would be my preference, but I'm not the reviewer of this, Hamish is, so only if he agrees. And as you see, MDN is not consistent about doing this, so it might just reflect my personal prejudices rather than any considered policy. This is probably obvious, but to unpack a bit what I mean by "primarily navigational": I expect most of the time people land on this page, they are going to be looking to get somewhere else, rather than spend a lot of time reading stuff. And in a page like the Service Worker API, the exposition at the start makes it harder for people to find links to the places they might want to go, like the interface pages.

**Collaborator:** Thanks @wbamberg (and @sb2702). I broadly see API overviews the same way as you. The current structure of guide-like material in Web API overviews is a "sensible minimum" - ensuring that every API at least has some conceptual overview. It usually contains way too much information because it is easier to slightly extend the concepts than to generate a Using document or a separate concepts doc. My problem is that this is a more significant change in way of working. If we do this we need buy-in and a clear instruction to authors and reviewers about what they do instead. This is complicated because it might differ if you know you are getting lots of guides or if you are just doing an introduction. One way we might do this is to modify the templates to suggest that if more than three or four paragraphs, the current concepts section should be turned into a separate guide document "XXXX API Concepts". We could then prototype this for existing docs, such as Service Worker API. I'm back day after tomorrow. I'll think about this then and see how you respond. But for now, I think let's leave this as-is.

**Contributor (author):** Honestly, I have no preference either way. As a reader of the docs, I can see arguments for both. I'm happy to leave it as-is.

**Collaborator:** Note, I will think about this. But if we do it, we can still do it as a post process.

## Interfaces

@@ -73,34 +135,49 @@ used to demux audio and video data into {{domxref("EncodedAudioChunk")}} and {{d

## Examples

### Basic usage

To instantiate a `VideoEncoder`, we pass an object that specifies a callback function that will be called when `EncodedVideoChunk` instances are available for processing, and an error callback that will be called if an error occurs.
This is shown in the following code:

```js
const encoder = new VideoEncoder({
  output(chunk, meta) {
    // Do something with chunk, typically send to a muxing library
  },
  error(e) {
    console.warn(e);
  },
});
```

You then need to configure the encoder with the codec parameter and various other fields.

```js
encoder.configure({
  codec: "vp09.00.40.08.00", // See the codec selection guide
  width: 1280,
  height: 720,
  bitrate: 1_000_000, // 1 Mbps
  framerate: 30,
});
```

You can then start submitting frames to the encoder. You can construct a `VideoFrame` from a canvas element.

```js
for (let i = 0; i < 60; i++) {
  const frame = new VideoFrame(canvas, { timestamp: (i * 1e6) / 30 }); // 30 fps, in microseconds
  encoder.encode(frame, { keyFrame: i % 60 === 0 });
}
```

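Finally, once all frames have been queued, you would typically wait for any pending work to complete and then release the encoder. A sketch, wrapped in a helper function for clarity:

```js
// Sketch: flush() resolves once every queued frame has been encoded;
// close() then permanently releases the encoder.
async function finishEncoding(encoder) {
  await encoder.flush();
  encoder.close();
}
```
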
See [Using the WebCodecs API](/en-US/docs/Web/API/WebCodecs_API/Using_the_WebCodecs_API) for more examples.

## See also

- [Video processing with WebCodecs](https://developer.chrome.com/docs/web-platform/best-practices/webcodecs)
- [WebCodecs API Samples](https://w3c.github.io/webcodecs/samples/)
- [WebCodecsFundamentals](https://webcodecsfundamentals.org/)
- [Real-Time Video Processing with WebCodecs and Streams: Processing Pipelines](https://webrtchacks.com/real-time-video-processing-with-webcodecs-and-streams-processing-pipelines-part-1/)
- [Video Frame Processing on the Web – WebAssembly, WebGPU, WebGL, WebCodecs, WebNN, and WebTransport](https://webrtchacks.com/video-frame-processing-on-the-web-webassembly-webgpu-webgl-webcodecs-webnn-and-webtransport/)