On top of editorial updates, substantive changes since publication as a W3C Recommendation in November 2016 are the addition of a {{SourceBuffer/changeType()}} method to switch codecs, the ability to create and use {{MediaSource}} objects off the main thread in dedicated workers, and the removal of the createObjectURL()
extension to the {{URL}} object following its integration in the File API [[FILEAPI]]. For a full list of changes made since the previous version, see the commits.
The working group maintains a list of all bug reports that the editors have not yet tried to address.
Implementors should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the Candidate Recommendation stage should track the GitHub repository and take part in the discussions.
This specification allows JavaScript to dynamically construct media streams for <audio> and <video>. It defines a MediaSource object that can serve as a source of media data for an HTMLMediaElement. MediaSource objects have one or more SourceBuffer objects. Applications append data segments to the SourceBuffer objects, and can adapt the quality of appended data based on system performance and other factors. Data from the SourceBuffer objects is managed as track buffers for audio, video and text data that is decoded and played. Byte stream specifications used with these extensions are available in the byte stream format registry [[MSE-REGISTRY]].
This specification was designed with the following goals in mind:
This specification defines:
The [=track buffers=] that provide [=coded frames=] for the {{AudioTrack/enabled}} {{HTMLMediaElement/audioTracks}}, the {{VideoTrack/selected}} {{HTMLMediaElement/videoTracks}}, and the "showing" or "hidden" {{HTMLMediaElement/textTracks}}. All these tracks are associated with SourceBuffer objects in the {{MediaSource/activeSourceBuffers}} list.
A [=presentation timestamp=] range used to filter out [=coded frames=] while appending. The append window represents a single continuous time range with a single start time and end time. Coded frames with [=presentation timestamp=] within this range are allowed to be appended to the SourceBuffer while coded frames outside this range are filtered out. The append window start and end times are controlled by the {{SourceBuffer/appendWindowStart}} and {{SourceBuffer/appendWindowEnd}} attributes respectively.
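The filtering rule above can be sketched as follows. This is an illustrative simplification, not the spec's coded frame processing steps: frame objects, the `pts` field, and the function name are assumptions, and the full algorithm also handles frame splicing at window boundaries.

```javascript
// Sketch of append-window filtering: a coded frame is kept only when its
// presentation timestamp lies in [appendWindowStart, appendWindowEnd).
// Frames outside the window are filtered out rather than appended.
function filterByAppendWindow(frames, appendWindowStart, appendWindowEnd) {
  return frames.filter(
    (f) => f.pts >= appendWindowStart && f.pts < appendWindowEnd
  );
}
```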
A unit of media data that has a [=presentation timestamp=], a [=decode timestamp=], and a [=coded frame duration=].
The duration of a [=coded frame=]. For video and text, the duration indicates how long the video frame or text SHOULD be displayed. For audio, the duration represents the sum of all the samples contained within the coded frame. For example, if an audio frame contains 441 samples at 44100 Hz, the frame duration is 10 milliseconds.
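The audio duration arithmetic in the example above is simply sample count divided by sample rate; a minimal sketch (function name is illustrative):

```javascript
// Audio coded frame duration in seconds: sampleCount / sampleRateHz.
// E.g. 441 samples at 44100 Hz -> 0.01 s (10 milliseconds), as in the text.
function audioFrameDurationSeconds(sampleCount, sampleRateHz) {
  return sampleCount / sampleRateHz;
}
```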
The sum of a [=coded frame=] [=presentation timestamp=] and its [=coded frame duration=]. It represents the [=presentation timestamp=] that immediately follows the coded frame.
A group of [=coded frames=] that are adjacent and have monotonically increasing [=decode timestamps=] without any gaps. Discontinuities detected by the [=coded frame processing=] algorithm and {{SourceBuffer/abort()}} calls trigger the start of a new coded frame group.
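The grouping rule can be illustrated with a sketch that starts a new group whenever decode timestamps stop increasing monotonically or jump by more than a gap threshold. The exact discontinuity rules are byte-stream-format specific; the frame shape, `dts` field, and `maxGap` parameter here are assumptions for illustration.

```javascript
// Split frames (in append order) into coded frame groups: a new group starts
// when dts is not strictly increasing or jumps by more than maxGap seconds.
function groupByDecodeOrder(frames, maxGap) {
  const groups = [];
  let current = [];
  for (const f of frames) {
    const prev = current[current.length - 1];
    if (prev && (f.dts <= prev.dts || f.dts - prev.dts > maxGap)) {
      groups.push(current); // discontinuity: close the current group
      current = [];
    }
    current.push(f);
  }
  if (current.length) groups.push(current);
  return groups;
}
```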
The decode timestamp indicates the latest time at which the frame needs to be decoded assuming instantaneous decoding and rendering of this and any dependent frames (this is equal to the [=presentation timestamp=] of the earliest frame, in [=presentation order=], that is dependent on this frame). If frames can be decoded out of [=presentation order=], then the decode timestamp MUST be present in or derivable from the byte stream. The user agent MUST run the [=append error=] algorithm if this is not the case. If frames cannot be decoded out of [=presentation order=] and a decode timestamp is not present in the byte stream, then the decode timestamp is equal to the [=presentation timestamp=].
A sequence of bytes that contain all of the initialization information required to decode a sequence of [=media segments=]. This includes codec initialization data, [=Track ID=] mappings for multiplexed segments, and timestamp offsets (e.g., edit lists).
The [=byte stream format specifications=] in the byte stream format registry [[MSE-REGISTRY]] contain format specific examples.
A sequence of bytes that contain packetized & timestamped media data for a portion of the media timeline. Media segments are always associated with the most recently appended [=initialization segment=].
The [=byte stream format specifications=] in the byte stream format registry [[MSE-REGISTRY]] contain format specific examples.
A MediaSource object URL is a unique blob URL [[FILEAPI]] created by {{URL/createObjectURL()}}. It is used to attach a MediaSource object to an HTMLMediaElement.
These URLs are the same as a blob URL, except that anything in the definition of that feature that refers to Blob and File objects is hereby extended to also apply to MediaSource objects.
The [=origin=] of the MediaSource object URL is the [=origin=] of the [=relevant settings object=] of this MediaSource object during the call to {{URL/createObjectURL()}}.
For example, the [=origin=] of the MediaSource object URL affects the way that the media element is consumed by canvas.
The parent media source of a SourceBuffer object is the MediaSource object that created it.
The presentation start time is the earliest time point in the presentation and specifies the initial playback position and earliest possible position. All presentations created using this specification have a presentation start time of 0.
For the purposes of determining if {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}} contains a {{TimeRanges}} that includes the current playback position, implementations MAY choose to allow a current playback position at or after [=presentation start time=] and before the first {{TimeRanges}} to play the first {{TimeRanges}} if that {{TimeRanges}} starts within a reasonably short time, like 1 second, after [=presentation start time=]. This allowance accommodates the reality that muxed streams commonly do not begin all tracks precisely at [=presentation start time=]. Implementations MUST report the actual buffered range, regardless of this allowance.
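The allowance above amounts to a small tolerance check on where the first buffered range starts; a minimal sketch, assuming a 1-second tolerance and illustrative names (real implementations choose their own threshold):

```javascript
// May playback begin at the presentation start time? Allowed when the first
// buffered range starts within `tolerance` seconds of presentation start.
function canStartPlayback(firstBufferedStart, presentationStartTime = 0, tolerance = 1) {
  return firstBufferedStart - presentationStartTime <= tolerance;
}
```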
The presentation interval of a [=coded frame=] is the time interval from its [=presentation timestamp=] to the [=presentation timestamp=] plus the [=coded frame duration|coded frame's duration=]. For example, if a coded frame has a presentation timestamp of 10 seconds and a [=coded frame duration=] of 100 milliseconds, then the presentation interval would be [10, 10.1). Note that the start of the range is inclusive, but the end of the range is exclusive.
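The half-open interval semantics can be sketched directly (function name is illustrative):

```javascript
// Is `time` inside the coded frame's presentation interval [pts, pts + dur)?
// The start is inclusive and the end is exclusive, matching [10, 10.1) above.
function inPresentationInterval(time, pts, duration) {
  return time >= pts && time < pts + duration;
}
```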
The order that [=coded frames=] are rendered in the presentation. The presentation order is achieved by ordering [=coded frames=] in monotonically increasing order by their [=presentation timestamps=].
A reference to a specific time in the presentation. The presentation timestamp in a [=coded frame=] indicates when the frame SHOULD be rendered.
A position in a [=media segment=] where decoding and continuous playback can begin without relying on any previous data in the segment. For video this tends to be the location of I-frames. In the case of audio, most audio frames can be treated as a random access point. Since video tracks tend to have a more sparse distribution of random access points, the locations of these points are usually considered the random access points for multiplexed streams.
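A common use of random access points is locating where decode can begin for a target time. This is an illustrative helper, not a spec algorithm; it assumes a sorted list of RAP timestamps:

```javascript
// Find the latest random access point at or before `target`, i.e. the
// position where decoding can start in order to reach `target`.
// Returns null when no RAP precedes the target.
function nearestRapAtOrBefore(rapTimes, target) {
  let best = null;
  for (const t of rapTimes) {
    if (t <= target) best = t;
    else break; // rapTimes is sorted; later entries are all past the target
  }
  return best;
}
```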
The specific [=byte stream format specification=] that describes the format of the byte stream accepted by a SourceBuffer instance. The [=byte stream format specification=], for a SourceBuffer object, is initially selected based on the |type:DOMString| passed to the {{MediaSource/addSourceBuffer()}} call that created the object, and can be updated by {{SourceBuffer/changeType()}} calls on the object.
A specific set of tracks distributed across one or more SourceBuffer objects owned by a single MediaSource instance.
Implementations MUST support at least 1 MediaSource object with the following configurations:
MediaSource objects MUST support each of the configurations above, but they are only required to support one configuration at a time. Supporting multiple configurations at once or additional configurations is a quality of implementation issue.
A byte stream format specific structure that provides the [=Track ID=], codec configuration, and other metadata for a single track. Each track description inside a single [=initialization segment=] has a unique [=Track ID=]. The user agent MUST run the [=append error=] algorithm if the [=Track ID=] is not unique within the [=initialization segment=].
A Track ID is a byte stream format specific identifier that marks sections of the byte stream as being part of a specific track. The Track ID in a [=track description=] identifies which sections of a [=media segment=] belong to that track.
The MediaSource object represents a source of media data for an HTMLMediaElement. It keeps track of the {{MediaSource/readyState}} for this source as well as a list of SourceBuffer objects that can be used to add media data to the presentation. MediaSource objects are created by the web application and then attached to an HTMLMediaElement. The application uses the SourceBuffer objects in {{MediaSource/sourceBuffers}} to add media data to this source. The HTMLMediaElement fetches this media data from the MediaSource object when it is needed during playback.
Each {{MediaSource}} object has a [[\live seekable range]] internal slot that stores a {{TimeRanges}} object. It is initialized to an empty {{TimeRanges}} object when the {{MediaSource}} object is created, is maintained by {{MediaSource/setLiveSeekableRange()}} and {{MediaSource/clearLiveSeekableRange()}}, and is used in HTMLMediaElement Extensions to modify {{HTMLMediaElement}}.{{HTMLMediaElement/seekable}} behavior.
Each {{MediaSource}} object has a [[\has ever been attached]] internal slot that stores a {{boolean}}. It is initialized to false when the {{MediaSource}} object is created, and is set true in the extended {{HTMLMediaElement}}'s [=resource fetch algorithm=] as described in the [=attaching to a media element=] algorithm. The extended [=resource fetch algorithm=] uses this internal slot to conditionally fail attachment of a {{MediaSource}} using a {{MediaSourceHandle}} set on a {{HTMLMediaElement}}'s {{HTMLMediaElement/srcObject}} attribute.
enum ReadyState { "closed", "open", "ended" };
enum EndOfStreamError { "network", "decode" };
Terminates playback and signals that a network error has occurred.
JavaScript applications SHOULD use this status code to terminate playback with a network error. For example, if a network error occurs while fetching media data.
Terminates playback and signals that a decoding error has occurred.
JavaScript applications SHOULD use this status code to terminate playback with a decode error. For example, if a parsing error occurs while processing out-of-band media data.
[Exposed=(Window,DedicatedWorker)] interface MediaSource : EventTarget { constructor(); [ SameObject, Exposed=DedicatedWorker ] readonly attribute MediaSourceHandle handle; readonly attribute SourceBufferList sourceBuffers; readonly attribute SourceBufferList activeSourceBuffers; readonly attribute ReadyState readyState; attribute unrestricted double duration; attribute EventHandler onsourceopen; attribute EventHandler onsourceended; attribute EventHandler onsourceclose; static readonly attribute boolean canConstructInDedicatedWorker; SourceBuffer addSourceBuffer (DOMString type); undefined removeSourceBuffer (SourceBuffer sourceBuffer); undefined endOfStream (optional EndOfStreamError error); undefined setLiveSeekableRange (double start, double end); undefined clearLiveSeekableRange (); static boolean isTypeSupported (DOMString type); };
handle
of type MediaSourceHandle, [SameObject] readonly Contains a handle useful for attachment of a dedicated worker {{MediaSource}} object to an {{HTMLMediaElement}} via {{HTMLMediaElement/srcObject}}. The handle remains the same object for this {{MediaSource}} object across accesses of this attribute, but it is distinct for each {{MediaSource}} object.
This specification may eventually enable visibility of this attribute on {{MediaSource}} objects on the main Window context. If so, specification care will be necessary to prevent potential backwards incompatible changes, such as could happen if exceptions were thrown on accesses to this attribute.
On getting, run the following steps:
sourceBuffers
of type SourceBufferList, readonly Contains the list of SourceBuffer objects associated with this MediaSource.
activeSourceBuffers
of type SourceBufferList, readonly Contains the subset of {{MediaSource/sourceBuffers}} that are providing the {{VideoTrack/selected}} video track, the {{AudioTrack/enabled}} audio track(s), and the "showing" or "hidden" text track(s).
SourceBuffer objects in this list MUST appear in the same order as they appear in the {{MediaSource/sourceBuffers}} attribute; e.g., if only sourceBuffers[0] and sourceBuffers[3] are in {{MediaSource/activeSourceBuffers}}, then activeSourceBuffers[0] MUST equal sourceBuffers[0] and activeSourceBuffers[1] MUST equal sourceBuffers[3].
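The ordering invariant above can be sketched as a simple order-preserving filter. This is an illustrative model, not spec steps; the `isActive` predicate stands in for the track-state logic that determines membership:

```javascript
// activeSourceBuffers is the subset of sourceBuffers providing active tracks,
// in the same relative order as they appear in sourceBuffers.
function computeActiveSourceBuffers(sourceBuffers, isActive) {
  return sourceBuffers.filter(isActive); // filter preserves order
}
```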
The Changes to selected/enabled track state section describes how this attribute gets updated.
readyState
of type {{ReadyState}}, readonly Indicates the current state of the MediaSource object. When the MediaSource is created {{MediaSource/readyState}} MUST be set to {{ReadyState/"closed"}}.
duration
of type {{unrestricted double}} Allows the web application to set the presentation duration. The duration is initially set to NaN when the MediaSource object is created.
On getting, run the following steps:
On setting, run the following steps:
The [=duration change=] algorithm will adjust |new duration| higher if there is any currently buffered coded frame with a higher end time.
{{SourceBuffer/appendBuffer()}} and {{MediaSource/endOfStream()}} can update the duration under certain circumstances.
onsourceopen
of type {{EventHandler}} The event handler for the {{sourceopen}} event.
onsourceended
of type {{EventHandler}} The event handler for the {{sourceended}} event.
onsourceclose
of type {{EventHandler}} The event handler for the {{sourceclose}} event.
canConstructInDedicatedWorker
of type {{boolean}} Returns true.
This attribute enables main thread and dedicated worker feature detection of support for creating and using a {{MediaSource}} object in a dedicated worker, and mitigates the need for higher latency detection polyfills like attempting creation of a {{MediaSource}} object from a dedicated worker, especially if the feature is not supported.
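The intended detection pattern can be sketched as below. The global scope is injected so the same check works in a Window or a worker; the function name is illustrative, and in a real page the argument would be `self` or `globalThis`:

```javascript
// Feature-detect support for constructing MediaSource in a dedicated worker
// by checking the static canConstructInDedicatedWorker attribute, avoiding
// higher-latency polyfills that attempt construction inside a worker.
function supportsWorkerMediaSource(globalScope) {
  const MS = globalScope.MediaSource;
  return typeof MS === "function" && MS.canConstructInDedicatedWorker === true;
}
```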
addSourceBuffer
Adds a new SourceBuffer to {{MediaSource/sourceBuffers}}.
For example, a user agent MAY throw a {{QuotaExceededError}} exception if the media element has reached the {{HTMLMediaElement/HAVE_METADATA}} readyState. This can occur if the user agent's media engine does not support adding more tracks during playback.
removeSourceBuffer
Removes a {{SourceBuffer}} from {{MediaSource/sourceBuffers}}.
This should trigger {{AudioTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=AudioTrackList/removetrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to the {{AudioTrack}} object, at the |SourceBuffer audioTracks list|. If the {{AudioTrack/enabled}} attribute on the {{AudioTrack}} object was true at the beginning of this removal step, then this should also trigger {{AudioTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=AudioTrackList/change=] at the |SourceBuffer audioTracks list|.
This should trigger {{AudioTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=AudioTrackList/removetrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to the {{AudioTrack}} object, at the |HTMLMediaElement audioTracks list|. If the {{AudioTrack/enabled}} attribute on the {{AudioTrack}} object was true at the beginning of this removal step, then this should also trigger {{AudioTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=AudioTrackList/change=] at the |HTMLMediaElement audioTracks list|.
This should trigger {{VideoTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=VideoTrackList/removetrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to the {{VideoTrack}} object, at the |SourceBuffer videoTracks list|. If the {{VideoTrack/selected}} attribute on the {{VideoTrack}} object was true at the beginning of this removal step, then this should also trigger {{VideoTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=VideoTrackList/change=] at the |SourceBuffer videoTracks list|.
This should trigger {{VideoTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=VideoTrackList/removetrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to the {{VideoTrack}} object, at the |HTMLMediaElement videoTracks list|. If the {{VideoTrack/selected}} attribute on the {{VideoTrack}} object was true at the beginning of this removal step, then this should also trigger {{VideoTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=VideoTrackList/change=] at the |HTMLMediaElement videoTracks list|.
This should trigger {{TextTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=TextTrackList/removetrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to the {{TextTrack}} object, at the |SourceBuffer textTracks list|. If the {{TextTrack/mode}} attribute on the {{TextTrack}} object was "showing" or "hidden" at the beginning of this removal step, then this should also trigger {{TextTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=TextTrackList/change=] at the |SourceBuffer textTracks list|.
This should trigger {{TextTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=TextTrackList/removetrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to the {{TextTrack}} object, at the |HTMLMediaElement textTracks list|. If the {{TextTrack/mode}} attribute on the {{TextTrack}} object was "showing" or "hidden" at the beginning of this removal step, then this should also trigger {{TextTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=TextTrackList/change=] at the |HTMLMediaElement textTracks list|.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
|sourceBuffer| | {{SourceBuffer}} | ✘ | ✘ | The {{SourceBuffer}} object to be removed. |
endOfStream
Signals the end of the stream.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
|error| | {{EndOfStreamError}} | ✘ | ✔ | The error to signal, if any. |
setLiveSeekableRange
Updates {{MediaSource/[[live seekable range]]}} that is used in HTMLMediaElement Extensions to modify {{HTMLMediaElement}}.{{HTMLMediaElement/seekable}} behavior.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
|start| | {{double}} | ✘ | ✘ | The start of the range, in seconds measured from [=presentation start time=]. While set, and if {{MediaSource/duration}} equals positive Infinity, {{HTMLMediaElement}}.{{HTMLMediaElement/seekable}} will return a non-empty TimeRanges object with a lowest range start timestamp no greater than |start|. |
|end| | {{double}} | ✘ | ✘ | The end of range, in seconds measured from [=presentation start time=]. While set, and if {{MediaSource/duration}} equals positive Infinity, {{HTMLMediaElement}}.{{HTMLMediaElement/seekable}} will return a non-empty TimeRanges object with a highest range end timestamp no less than |end|. |
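The seekable behavior described in the table can be sketched as combining the live seekable range with what is buffered. This is a simplification of the HTMLMediaElement Extensions (which operate on {{TimeRanges}}), assuming a single buffered range and illustrative names:

```javascript
// For a live stream (duration === Infinity) with a live seekable range set,
// approximate the seekable range: its start is no greater than liveStart and
// its end is no less than liveEnd, extended by the buffered range.
function liveSeekable(liveStart, liveEnd, bufferedStart, bufferedEnd) {
  return [Math.min(liveStart, bufferedStart), Math.max(liveEnd, bufferedEnd)];
}
```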
clearLiveSeekableRange
Updates {{MediaSource/[[live seekable range]]}} that is used in HTMLMediaElement Extensions to modify {{HTMLMediaElement}}.{{HTMLMediaElement/seekable}} behavior.
isTypeSupported
, static Check to see whether the MediaSource is capable of creating SourceBuffer objects for the specified MIME type.
If true is returned from this method, it only indicates that the MediaSource implementation is capable of creating SourceBuffer objects for the specified MIME type. An {{MediaSource/addSourceBuffer()}} call SHOULD still fail if sufficient resources are not available to support the addition of a new SourceBuffer.
This method returning true implies that HTMLMediaElement.canPlayType() will return "maybe" or "probably" since it does not make sense for a MediaSource to support a type the HTMLMediaElement knows it cannot play.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
|type| | {{DOMString}} | ✘ | ✘ | The MIME type to check. |
Event name | Interface | Dispatched when... |
---|---|---|
sourceopen | {{Event}} | {{MediaSource}}'s {{MediaSource/readyState}} transitions from {{ReadyState/"closed"}} to {{ReadyState/"open"}} or from {{ReadyState/"ended"}} to {{ReadyState/"open"}}. |
sourceended | {{Event}} | {{MediaSource}}'s {{MediaSource/readyState}} transitions from {{ReadyState/"open"}} to {{ReadyState/"ended"}}. |
sourceclose | {{Event}} | {{MediaSource}}'s {{MediaSource/readyState}} transitions from {{ReadyState/"open"}} to {{ReadyState/"closed"}} or {{ReadyState/"ended"}} to {{ReadyState/"closed"}}. |
When a {{Window}} {{HTMLMediaElement}} is attached to a {{DedicatedWorkerGlobalScope}} {{MediaSource}}, each context has algorithms that depend on information from the other.
{{HTMLMediaElement}} is exposed only to {{Window}} contexts, but {{MediaSource}} and related objects defined in this specification are exposed in {{Window}} and {{DedicatedWorkerGlobalScope}} contexts. This lets applications construct a {{MediaSource}} object in either of those types of context and attach it to an {{HTMLMediaElement}} object in a {{Window}} context using a [=MediaSource object URL=] or a {{MediaSourceHandle}} as described in the [=attaching to a media element=] algorithm. A {{MediaSource}} object is not {{Transferable}}; it is only visible in the context where it was created.
The rest of this section describes a model for bounding information latency for attachments of a {{Window}} media element to a {{DedicatedWorkerGlobalScope}} {{MediaSource}}. While the model describes communication using message passing, implementations MAY choose to communicate in potentially faster ways, such as using shared memory and locks. Attachments to a {{Window}} {{MediaSource}} synchronously have the information already without communicating it across contexts.

A {{MediaSource}} that is constructed in a {{DedicatedWorkerGlobalScope}} has a [[\port to main]] internal slot that stores a {{MessagePort}} set up during attachment and nulled during detachment. A {{Window}} {{MediaSource}}'s {{MediaSource/[[port to main]]}} is always null.
An {{HTMLMediaElement}} extended by this specification and attached to a {{DedicatedWorkerGlobalScope}} {{MediaSource}} similarly has a [[\port to worker]] internal slot that stores a {{MessagePort}} and a [[\channel with worker]] internal slot that stores a {{MessageChannel}}, both set up during attachment and nulled during detachment. Both {{HTMLMediaElement/[[port to worker]]}} and {{HTMLMediaElement/[[channel with worker]]}} are null unless attached to a {{DedicatedWorkerGlobalScope}} {{MediaSource}}.

Algorithms in this specification that need to communicate information from a {{Window}} {{HTMLMediaElement}} to an attached {{DedicatedWorkerGlobalScope}} {{MediaSource}}, or vice versa, will use these internal ports implicitly to post a message to their counterpart, where the implicit handler of the message runs steps as described in the algorithms.
There are distinct mechanisms for attaching a {{MediaSource}} to a media element depending on where the {{MediaSource}} object was constructed, in a {{Window}} versus in a {{DedicatedWorkerGlobalScope}}:
Attaching a {{MediaSource}} that was constructed in a {{Window}} can be done by assigning a [=MediaSource object URL=] for that {{MediaSource}} to the media element {{HTMLMediaElement/src}} attribute or the src attribute of a <source> inside a media element. A [=MediaSource object URL=] is created by passing a MediaSource object to {{URL/createObjectURL()}}.
Though implementations MAY allow [=MediaSource object URL=] creation in a {{DedicatedWorkerGlobalScope}} for a {{MediaSource}} constructed in that worker, attempting to use that [=MediaSource object URL=] to attach to a media element using either the {{HTMLMediaElement/src}} attribute or the src attribute of a <source> inside a media element MUST fail in the media element's [=resource fetch algorithm=], as extended below.
Extending the object URL attachment mechanism to worker MediaSource object URLs would further propagate this idiom that is less preferred versus using srcObject, and would unnecessarily increase user agent interoperability risk and implementation complexity.
If the [=resource fetch algorithm=] was invoked with a media provider object that is a {{MediaSource}} object, a {{MediaSourceHandle}} object or a URL record whose object is a {{MediaSource}} object, then let mode be local, skip the first step in the [=resource fetch algorithm=] (which may otherwise set mode to remote) and continue the execution of the [=resource fetch algorithm=].
The first step of the [=resource fetch algorithm=] is expected to eventually align with selecting local mode for URL records whose objects are media provider objects. The intent is that if the HTMLMediaElement's src attribute or selected child <source>'s src attribute is a blob: URL matching a [=MediaSource object URL=] when the respective src attribute was last changed, then that MediaSource object is used as the media provider object and current media resource in the local mode logic in the [=resource fetch algorithm=]. This also means that the remote mode logic that includes observance of any preload attribute is skipped when a MediaSource object is attached. Even with that eventual change to [[HTML]], the execution of the following steps at the beginning of the local mode logic is still required when the current media resource is a MediaSource object.
At the beginning of the "Otherwise (mode is local)" section of the [=resource fetch algorithm=], execute the additional steps, below.
Relative to the action which triggered the media element's resource selection algorithm, these steps are asynchronous. The [=resource fetch algorithm=] is run after the task that invoked the resource selection algorithm is allowed to continue and a stable state is reached. Implementations may delay the steps in the "Otherwise" clause, below, until the MediaSource object is ready for use.
An attached MediaSource does not use the remote mode steps in the [=resource fetch algorithm=], so the media element will not fire "suspend" events. Though future versions of this specification will likely remove "progress" and "stalled" events from a media element with an attached MediaSource, user agents conforming to this version of the specification may still fire these two events as these [[HTML]] references changed after implementations of this specification stabilized.
The following steps are run in any case where the media element is going to transition to {{HTMLMediaElement/NETWORK_EMPTY}} and [=queue a task=] to [=fire an event=] named [=HTMLMediaElement/emptied=] at the media element. These steps SHOULD be run right before the transition.
If the attached {{MediaSource}} was constructed in a {{DedicatedWorkerGlobalScope}}, detachment uses an internal detach message posted to {{HTMLMediaElement/[[port to worker]]}}. The implicit handler for that detach notification runs the remainder of these steps in the {{DedicatedWorkerGlobalScope}} {{MediaSource}}.

Going forward, this algorithm is intended to be externally called and run in any case where the attached MediaSource, if any, must be detached from the media element. It MAY be called on HTMLMediaElement [[HTML]] operations like load() and [=resource fetch algorithm=] failures in addition to, or in place of, when the media element transitions to {{HTMLMediaElement/NETWORK_EMPTY}}. Resource fetch algorithm failures are those which abort either the resource fetch algorithm or the resource selection algorithm, with the exception that the "Final step" [[HTML]] is not considered a failure that triggers detachment.
Run the following steps as part of the "Wait until the user agent has established whether or not the media data for the new playback position is available, and, if it is, until it has decoded enough data to play back that position" step of the media element's seek algorithm:
The media element looks for [=media segments=] containing the |new playback position:double| in each SourceBuffer object in {{MediaSource/activeSourceBuffers}}. Any position within a {{TimeRanges}} in the current value of the {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}} attribute has all necessary media segments buffered for that position.
Per [[HTML]] logic, {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} changes may trigger events on the HTMLMediaElement.
The web application can use {{SourceBuffer/buffered}} and {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}} to determine what the media element needs to resume playback.
If the {{MediaSource/readyState}} attribute is {{ReadyState/"ended"}} and the |new playback position| is within a {{TimeRanges}} currently in {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}}, then the seek operation must continue to completion here even if one or more currently selected or enabled track buffers' largest range end timestamp is less than |new playback position|. This condition should only occur due to logic in {{SourceBuffer/buffered}} when {{MediaSource/readyState}} is {{ReadyState/"ended"}}.
The following steps are periodically run during playback to make sure that all of the SourceBuffer objects in {{MediaSource/activeSourceBuffers}} have [=enough data to ensure uninterrupted playback=]. Changes to {{MediaSource/activeSourceBuffers}} also cause these steps to run because they affect the conditions that trigger state transitions.
Having enough data to ensure uninterrupted playback is an implementation specific condition where the user agent determines that it currently has enough data to play the presentation without stalling for a meaningful period of time. This condition is constantly evaluated to determine when to transition the media element into and out of the {{HTMLMediaElement/HAVE_ENOUGH_DATA}} ready state. These transitions indicate when the user agent believes it has enough data buffered or it needs more data respectively.
An implementation MAY choose to use bytes buffered, time buffered, the append rate, or any other metric it sees fit to determine when it has enough data. The metrics used MAY change during playback so web applications SHOULD only rely on the value of {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} to determine whether more data is needed or not.
When the media element needs more data, the user agent SHOULD transition it from {{HTMLMediaElement/HAVE_ENOUGH_DATA}} to {{HTMLMediaElement/HAVE_FUTURE_DATA}} early enough for a web application to be able to respond without causing an interruption in playback. For example, transitioning when the current playback position is 500ms before the end of the buffered data gives the application roughly 500ms to append more data before playback stalls.
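The low-buffer heuristic in the example above can be sketched as a threshold check. This models only the illustrative 500 ms example, not a normative rule; names are hypothetical:

```javascript
// Decide whether to signal that more data is needed: true when the remaining
// buffered time ahead of the playback position drops below the threshold.
function needsMoreData(currentTime, bufferedEnd, thresholdSeconds = 0.5) {
  return bufferedEnd - currentTime < thresholdSeconds;
}
```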
Per [[HTML]] logic, {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} changes may trigger events on the HTMLMediaElement.
During playback {{MediaSource/activeSourceBuffers}} needs to be updated if the {{VideoTrack/selected}} video track, the {{AudioTrack/enabled}} audio track(s), or a text track {{TextTrack/mode}} changes. When one or more of these changes occur, the following steps need to be followed.
Also, when the {{MediaSource}} was constructed in a {{DedicatedWorkerGlobalScope}}, each change that occurs to a {{Window}} mirror of a track created previously by the implicit handler for an internal create track mirror message MUST also be made to the corresponding {{DedicatedWorkerGlobalScope}} track, using an internal update track state message posted to {{HTMLMediaElement/[[port to worker]]}} whose implicit handler makes the change and runs the following steps.
Likewise, each change that occurs to a {{DedicatedWorkerGlobalScope}} track MUST also be made to the corresponding {{Window}} mirror of the track, using an internal update track state message posted to {{MediaSource/[[port to main]]}} whose implicit handler makes the change to the mirror.
Follow these steps when {{MediaSource/duration}} needs to change to a |new duration:unrestricted double|.
Duration reductions that would truncate currently buffered media are disallowed. When truncation is necessary, use {{SourceBuffer/remove()}} to reduce the buffered range before updating {{MediaSource/duration}}.
This condition can occur because the [=coded frame removal=] algorithm preserves coded frames that start before the start of the removal range.
This algorithm gets called when the application signals the end of stream via an {{MediaSource/endOfStream()}} call or an algorithm needs to signal a decode error. This algorithm takes an |error:EndOfStreamError| parameter that indicates whether an error will be signalled.
This allows the duration to properly reflect the end of the appended media segments. For example, if the duration was explicitly set to 10 seconds and only media segments for 0 to 5 seconds were appended before endOfStream() was called, then the duration will get updated to 5 seconds.
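The duration update in this note can be illustrated with a small helper (hypothetical name; each inner array is a buffered [start, end] range in seconds):

```javascript
// At endOfStream(), an explicitly set duration that extends beyond the
// appended media shrinks to the highest buffered end time, as in the
// 10s -> 5s example above.
function durationAfterEndOfStream(explicitDuration, bufferedRanges) {
  let highestEnd = 0;
  for (const [start, end] of bufferedRanges) {
    if (end > highestEnd) highestEnd = end;
  }
  return Math.min(explicitDuration, highestEnd);
}
```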
This algorithm is used to run steps on {{Window}} from a {{MediaSource}} attached from either the same {{Window}} or from a {{DedicatedWorkerGlobalScope}}, usually to update the state of the attached {{HTMLMediaElement}}. This algorithm takes a |steps| parameter that lists the steps to run on {{Window}}.
Otherwise, post an internal mirror on window message to {{MediaSource/[[port to main]]}} whose implicit handler in {{Window}} will run |steps|. Return control to the caller without awaiting that handler's receipt of the message.
The MediaSourceHandle object represents a proxy for a {{MediaSource}} object that is useful for attaching a {{DedicatedWorkerGlobalScope}} {{MediaSource}} to a {{Window}} {{HTMLMediaElement}} using {{HTMLMediaElement/srcObject}} as described in the [=attaching to a media element=] algorithm.
This distinct object is necessary to attach a cross-context {{MediaSource}} to a media element because {{MediaSource}} objects themselves are not transferable since they are event targets.
Each {{MediaSourceHandle}} object has a [[\has ever been assigned as srcobject]] internal slot that stores a {{boolean}}. It is initialized to false when the {{MediaSourceHandle}} object is created, is set true in the extended {{HTMLMediaElement}}'s {{HTMLMediaElement/srcObject}} setter as described in HTMLMediaElement Extensions, and if true, prevents successful transfer of the {{MediaSourceHandle}} as described in the [=Transfer=] section.
{{MediaSourceHandle}} objects are {{Transferable}}, each having a [[\Detached]] internal slot that is used to ensure that once the handle object instance has been transferred, that instance cannot be transferred again.
[Transferable, Exposed=(Window,DedicatedWorker)]
interface MediaSourceHandle {};
The {{MediaSourceHandle}} [=transfer steps=] and [=transfer-receiving steps=] require the implementation to maintain an implicit internal slot referencing the underlying {{MediaSource}} to enable [=attaching to a media element=] using {{HTMLMediaElement/srcObject}} and consequent setup of an attachment's [=cross-context communication model=].
Implementors should be aware that assumption of "move" semantics implied by {{Transferable}} is not always reality. For example, extensions or internal implementations of postMessage using broadcast may cause unintended multiple recipients of a transferred {{MediaSourceHandle}}. For this reason, implementations are guided to not resolve which potential clone of a transferred {{MediaSourceHandle}} is still valid for attachment until and unless any handle for the underlying {{MediaSource}} object is used in the asynchronous portion of the media element's resource selection algorithm. This is similar to the existing behavior for attachment via [=MediaSource object URLs=], which can be cloned easily, where such a URL is valid for at most one attachment start (across all of its potentially many clones).
Implementations MUST support at most one attachment (load) via {{HTMLMediaElement/srcObject}} ever for the {{MediaSource}} object underlying a {{MediaSourceHandle}}, regardless of potential cloning of the {{MediaSourceHandle}} due to varying implementations of {{Transferable}}.
See [=attaching to a media element=] for how this is enforced during the asynchronous portion of the media element's resource selection algorithm.
{{MediaSourceHandle}} is only exposed on {{Window}} and {{DedicatedWorkerGlobalScope}} contexts, and cannot successfully transfer between different [=ECMAScript/agent clusters=] [[!ECMASCRIPT]]. Transfer of a {{MediaSourceHandle}} object can only succeed within the same [=ECMAScript/agent cluster=].
For example, transfer of a {{MediaSourceHandle}} object from either a {{Window}} or {{DedicatedWorkerGlobalScope}} to either a SharedWorker or a ServiceWorker will not succeed. Developers should be aware of this difference versus [=MediaSource object URLs=] which are {{DOMString}}s that can be communicated many ways. Even so, [=attaching to a media element=] using a [=MediaSource object URL=] can only succeed for a {{MediaSource}} that was constructed in a {{Window}} context. See also the integration of the [=ECMAScript/agent=] and [=ECMAScript/agent cluster=] formalisms for Web Application APIs [[HTML]] where related concepts such as [=dedicated worker agents=] are defined.
[=Transfer steps=] for a {{MediaSourceHandle}} object MUST include the following step:
enum AppendMode {
    "segments",
    "sequence",
};
[Exposed=(Window,DedicatedWorker)]
interface SourceBuffer : EventTarget {
    attribute AppendMode mode;
    readonly attribute boolean updating;
    readonly attribute TimeRanges buffered;
    attribute double timestampOffset;
    readonly attribute AudioTrackList audioTracks;
    readonly attribute VideoTrackList videoTracks;
    readonly attribute TextTrackList textTracks;
    attribute double appendWindowStart;
    attribute unrestricted double appendWindowEnd;
    attribute EventHandler onupdatestart;
    attribute EventHandler onupdate;
    attribute EventHandler onupdateend;
    attribute EventHandler onerror;
    attribute EventHandler onabort;
    undefined appendBuffer (BufferSource data);
    undefined abort ();
    undefined changeType (DOMString type);
    undefined remove (double start, unrestricted double end);
};
mode
of type {{AppendMode}}
Controls how a sequence of [=media segments=] is handled. This attribute is initially set by {{MediaSource/addSourceBuffer()}} after the object is created, and can be updated by {{SourceBuffer/changeType()}} or by setting this attribute.
On getting, return the initial value or the last value that was successfully set.
On setting, run the following steps:
If the {{MediaSource/readyState}} attribute of the [=parent media source=] is in the {{ReadyState/""ended""}} state then run the following steps:
updating
of type {{boolean}}, readonly
Indicates whether the asynchronous continuation of an {{SourceBuffer/appendBuffer()}} or {{SourceBuffer/remove()}} operation is still being processed. This attribute is initially set to false when the object is created.
buffered
of type {{TimeRanges}}, readonly
Indicates what {{TimeRanges}} are buffered in the {{SourceBuffer}}. This attribute is initially set to an empty {{TimeRanges}} object when the object is created.
When the attribute is read the following steps MUST occur:
Text [=track buffers=] are included in the calculation of |highest end time|, above, but excluded from the buffered range calculation here. They are not necessarily continuous, nor should any discontinuity within them trigger playback stall when the other media tracks are continuous over the same time range.
timestampOffset
of type {{double}}
Controls the offset applied to timestamps inside subsequent [=media segments=] that are appended to this {{SourceBuffer}}. The {{SourceBuffer/timestampOffset}} is initially set to 0, which indicates that no offset is being applied.
On getting, return the initial value or the last value that was successfully set.
On setting, run the following steps:
If the {{MediaSource/readyState}} attribute of the [=parent media source=] is in the {{ReadyState/""ended""}} state then run the following steps:
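As a non-normative sketch of the offset arithmetic in {{AppendMode/""segments""}} mode, each appended frame's timestamps are shifted by the current offset (the frame record shape here is hypothetical):

```javascript
// Apply a timestamp offset to parsed frames, as in "segments" mode.
// Frames are hypothetical {pts, dts, duration} records in seconds.
function applyTimestampOffset(frames, timestampOffset) {
  return frames.map(f => ({
    ...f,
    pts: f.pts + timestampOffset,
    dts: f.dts + timestampOffset,
  }));
}

// A segment authored to start at 0 can be placed at 30s on the timeline:
const shifted = applyTimestampOffset(
  [{ pts: 0, dts: 0, duration: 2 }, { pts: 2, dts: 2, duration: 2 }], 30);
```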
audioTracks
of type {{AudioTrackList}}, readonly
videoTracks
of type {{VideoTrackList}}, readonly
textTracks
of type {{TextTrackList}}, readonly
appendWindowStart
of type {{double}}
The [=presentation timestamp=] for the start of the [=append window=]. This attribute is initially set to the [=presentation start time=].
On getting, return the initial value or the last value that was successfully set.
On setting, run the following steps:
appendWindowEnd
of type {{unrestricted double}}
The [=presentation timestamp=] for the end of the [=append window=]. This attribute is initially set to positive Infinity.
On getting, return the initial value or the last value that was successfully set.
On setting, run the following steps:
onupdatestart
of type {{EventHandler}}
The event handler for the {{updatestart}} event.
onupdate
of type {{EventHandler}}
The event handler for the {{update}} event.
onupdateend
of type {{EventHandler}}
The event handler for the {{updateend}} event.
onerror
of type {{EventHandler}}
The event handler for the {{error}} event.
onabort
of type {{EventHandler}}
The event handler for the {{abort}} event.
appendBuffer
Appends the segment data in a {{BufferSource}} [[!WEBIDL]] to the {{SourceBuffer}}.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
|data| | {{BufferSource}} | ✘ | ✘ |
abort
Aborts the current segment and resets the segment parser.
changeType
Changes the MIME type associated with this object. Subsequent {{SourceBuffer/appendBuffer()}} calls will expect the newly appended bytes to conform to the new type.
If the {{MediaSource/readyState}} attribute of the [=parent media source=] is in the {{ReadyState/""ended""}} state then run the following steps:
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
|type| | {{DOMString}} | ✘ | ✘ |
remove
Removes media for a specific time range.
If the {{MediaSource/readyState}} attribute of the [=parent media source=] is in the {{ReadyState/""ended""}} state then run the following steps:
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
|start| | {{double}} | ✘ | ✘ | The start of the removal range, in seconds measured from [=presentation start time=]. |
|end| | {{unrestricted double}} | ✘ | ✘ | The end of the removal range, in seconds measured from [=presentation start time=]. |
A track buffer stores the [=track descriptions=] and [=coded frames=] for an individual track. The track buffer is updated as [=initialization segments=] and [=media segments=] are appended to the SourceBuffer.
Each [=track buffer=] has a last decode timestamp variable that stores the decode timestamp of the last [=coded frame=] appended in the current [=coded frame group=]. The variable is initially unset to indicate that no [=coded frames=] have been appended yet.
Each [=track buffer=] has a last frame duration variable that stores the [=coded frame duration=] of the last [=coded frame=] appended in the current [=coded frame group=]. The variable is initially unset to indicate that no [=coded frames=] have been appended yet.
Each [=track buffer=] has a highest end timestamp variable that stores the highest [=coded frame end timestamp=] across all [=coded frames=] in the current [=coded frame group=] that were appended to this track buffer. The variable is initially unset to indicate that no [=coded frames=] have been appended yet.
Each [=track buffer=] has a need random access point flag variable that keeps track of whether the track buffer is waiting for a [=random access point=] [=coded frame=]. The variable is initially set to true to indicate that a [=random access point=] [=coded frame=] is needed before anything can be added to the [=track buffer=].
Each [=track buffer=] has a track buffer ranges variable that represents the presentation time ranges occupied by the [=coded frames=] currently stored in the track buffer.
For track buffer ranges, these presentation time ranges are based on [=presentation timestamps=], frame durations, and potentially coded frame group start times for coded frame groups across track buffers in a muxed SourceBuffer.
For specification purposes, this information is treated as if it were stored in a [=normalized TimeRanges object=]. Intersected [=track buffer ranges=] are used to report {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}}, and MUST therefore support uninterrupted playback within each range of {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}}.
These coded frame group start times differ slightly from those mentioned in the [=coded frame processing=] algorithm in that they are the earliest [=presentation timestamp=] across all track buffers following a discontinuity. Discontinuities can occur within the [=coded frame processing=] algorithm or result from the [=coded frame removal=] algorithm, regardless of {{SourceBuffer/mode}}. The threshold for determining disjointness of [=track buffer ranges=] is implementation-specific. For example, to reduce unexpected playback stalls, implementations MAY approximate the [=coded frame processing=] algorithm's discontinuity detection logic by coalescing adjacent ranges separated by a gap smaller than 2 times the maximum frame duration buffered so far in this [=track buffer=]. Implementations MAY also use coded frame group start times as range start times across [=track buffers=] in a muxed SourceBuffer to further reduce unexpected playback stalls.
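The gap-coalescing approximation mentioned in this note might look like the following non-normative sketch (hypothetical helper; the 2× factor is the example threshold from the note, not a requirement):

```javascript
// Coalesce sorted, non-overlapping [start, end] ranges whose gaps are
// smaller than twice the maximum frame duration buffered so far, so
// that sub-frame gaps do not surface as disjoint track buffer ranges.
function coalesceRanges(ranges, maxFrameDuration) {
  const out = [];
  for (const [start, end] of ranges) {
    const last = out[out.length - 1];
    if (last && start - last[1] < 2 * maxFrameDuration) {
      last[1] = Math.max(last[1], end); // absorb the small gap
    } else {
      out.push([start, end]);
    }
  }
  return out;
}
```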
Event name | Interface | Dispatched when... |
---|---|---|
updatestart | {{Event}} | {{SourceBuffer}}'s {{SourceBuffer/updating}} transitions from false to true. |
update | {{Event}} | A {{SourceBuffer}}'s append or remove successfully completed. {{SourceBuffer}}'s {{SourceBuffer/updating}} transitions from true to false. |
updateend | {{Event}} | The append or remove of a {{SourceBuffer}} ended. |
error | {{Event}} | An error occurred during the append to a {{SourceBuffer}}. {{SourceBuffer/updating}} transitions from true to false. |
abort | {{Event}} | The {{SourceBuffer}}'s append was aborted by an {{SourceBuffer/abort()}} call. {{SourceBuffer/updating}} transitions from true to false. |
Each {{SourceBuffer}} object has an [[\append state]] internal slot that keeps track of the high-level segment parsing state. It is initially set to [=WAITING_FOR_SEGMENT=] and can transition to the following states as data is appended.
Append state name | Description |
---|---|
WAITING_FOR_SEGMENT |
Waiting for the start of an [=initialization segment=] or [=media segment=] to be appended. |
PARSING_INIT_SEGMENT |
Currently parsing an [=initialization segment=]. |
PARSING_MEDIA_SEGMENT |
Currently parsing a [=media segment=]. |
Each {{SourceBuffer}} object has an [[\input buffer]] internal slot that is a byte buffer that holds unparsed bytes across {{SourceBuffer/appendBuffer()}} calls. The buffer is empty when the {{SourceBuffer}} object is created.
Each {{SourceBuffer}} object has a [[\buffer full flag]] internal slot that keeps track of whether {{SourceBuffer/appendBuffer()}} is allowed to accept more bytes. It is set to false when the {{SourceBuffer}} object is created and gets updated as data is appended and removed.
Each {{SourceBuffer}} object has a [[\group start timestamp]] internal slot that keeps track of the starting timestamp for a new [=coded frame group=] in the {{AppendMode/""sequence""}} mode. It is unset when the SourceBuffer object is created and gets updated when the {{SourceBuffer/mode}} attribute equals {{AppendMode/""sequence""}} and the {{SourceBuffer/timestampOffset}} attribute is set, or the [=coded frame processing=] algorithm runs.
Each {{SourceBuffer}} object has a [[\group end timestamp]] internal slot that stores the highest [=coded frame end timestamp=] across all [=coded frames=] in the current [=coded frame group=]. It is set to 0 when the SourceBuffer object is created and gets updated by the [=coded frame processing=] algorithm.
The {{SourceBuffer/[[group end timestamp]]}} stores the highest [=coded frame end timestamp=] across all [=track buffers=] in a SourceBuffer. Therefore, care should be taken in setting the {{SourceBuffer/mode}} attribute when appending multiplexed segments in which the timestamps are not aligned across tracks.
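A non-normative sketch of how these slots interact in {{AppendMode/""sequence""}} mode: setting {{SourceBuffer/timestampOffset}} seeds the group start timestamp, and the next coded frame group is rebased so its first frame begins there (names and frame shape are hypothetical):

```javascript
// Rebase a coded frame group for "sequence" mode: the group is moved so
// its first frame starts at groupStartTimestamp, and the group end
// timestamp becomes the highest frame end time after rebasing.
function rebaseSequenceGroup(frames, groupStartTimestamp) {
  const offset = groupStartTimestamp - frames[0].pts;
  const rebased = frames.map(f => ({ ...f, pts: f.pts + offset }));
  const groupEndTimestamp = Math.max(...rebased.map(f => f.pts + f.duration));
  return { rebased, groupEndTimestamp };
}

// Frames authored at 100s..104s play back-to-back from 0s:
const { rebased, groupEndTimestamp } = rebaseSequenceGroup(
  [{ pts: 100, duration: 2 }, { pts: 102, duration: 2 }], 0);
```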
Each {{SourceBuffer}} object has a [[\generate timestamps flag]] internal slot that is a boolean that keeps track of whether timestamps need to be generated for the [=coded frames=] passed to the [=coded frame processing=] algorithm. This flag is set by {{MediaSource/addSourceBuffer()}} when the {{SourceBuffer}} object is created and is updated by {{SourceBuffer/changeType()}}.
When the segment parser loop algorithm is invoked, run the following steps:
If the {{SourceBuffer/[[append state]]}} equals [=WAITING_FOR_SEGMENT=], then run the following steps:
If the {{SourceBuffer/[[append state]]}} equals [=PARSING_INIT_SEGMENT=], then run the following steps:
If the {{SourceBuffer/[[append state]]}} equals [=PARSING_MEDIA_SEGMENT=], then run the following steps:
The frequency at which the coded frame processing algorithm is run is implementation-specific. The coded frame processing algorithm MAY be called when the input buffer contains the complete media segment or it MAY be called multiple times as complete coded frames are added to the input buffer.
When the parser state needs to be reset, run the following steps:
This algorithm is called when an error occurs during an append.
When an append operation begins, the following steps are run to validate and prepare the SourceBuffer.
If the {{MediaSource/readyState}} attribute of the [=parent media source=] is in the {{ReadyState/""ended""}} state then run the following steps:
If the {{SourceBuffer/[[buffer full flag]]}} equals true, then throw a {{QuotaExceededError}} exception and abort these steps.
This is the signal that the implementation was unable to evict enough data to accommodate the append or the append is too big. The web application SHOULD use {{SourceBuffer/remove()}} to explicitly free up space and/or reduce the size of the append.
When {{SourceBuffer/appendBuffer()}} is called, the following steps are run to process the appended data.
Follow these steps when a caller needs to initiate a JavaScript visible range removal operation that blocks other SourceBuffer updates:
The following steps are run when the [=segment parser loop=] successfully parses a complete [=initialization segment=]:
Each SourceBuffer object has a [[\first initialization segment received flag]] internal slot that tracks whether the first [=initialization segment=] has been appended and received by this algorithm. This flag is set to false when the SourceBuffer is created and updated by the algorithm below.
Each SourceBuffer object has a [[\pending initialization segment for changeType flag]] internal slot that tracks whether an [=initialization segment=] is needed since the most recent {{SourceBuffer/changeType()}}. This flag is set to false when the SourceBuffer is created, set to true by {{SourceBuffer/changeType()}} and reset to false by the algorithm below.
User agents MAY consider codecs that would otherwise be supported as "not supported" here if the codecs were not specified in the |type:DOMString| parameter passed to (a) the most recently successful {{SourceBuffer/changeType()}} on this {{SourceBuffer}} object, or (b) if no successful {{SourceBuffer/changeType()}} has yet occurred on this object, the {{MediaSource/addSourceBuffer()}} that created this {{SourceBuffer}} object.
For example, if the most recently successful {{SourceBuffer/changeType()}} was called with 'video/webm' or 'video/webm; codecs="vp8"', and a video track containing vp9 appears in the initialization segment, then the user agent MAY use this step to trigger a decode error even if the other two properties' checks, above, pass. Implementations are encouraged to trigger error in such cases only when the codec is indeed not supported or the other two properties' checks fail.
Web authors are encouraged to use {{SourceBuffer/changeType()}}, {{MediaSource/addSourceBuffer()}} and {{MediaSource/isTypeSupported()}} with precise codec parameters to more proactively detect user agent support. {{SourceBuffer/changeType()}} is required if the {{SourceBuffer}} object's bytestream format is changing.
If the {{SourceBuffer/[[first initialization segment received flag]]}} is false, then run the following steps:
User agents MAY consider codecs that would otherwise be supported as "not supported" here if the codecs were not specified in the |type:DOMString| parameter passed to (a) the most recently successful {{SourceBuffer/changeType()}} on this {{SourceBuffer}} object, or (b) if no successful {{SourceBuffer/changeType()}} has yet occurred on this object, the {{MediaSource/addSourceBuffer()}} that created this {{SourceBuffer}} object.
For example, MediaSource.isTypeSupported('video/webm;codecs="vp8,vorbis"') may return true, but if {{MediaSource/addSourceBuffer()}} was called with 'video/webm;codecs="vp8"' and a Vorbis track appears in the [=initialization segment=], then the user agent MAY use this step to trigger a decode error. Implementations are encouraged to trigger error in such cases only when the codec is indeed not supported.
Web authors are encouraged to use {{SourceBuffer/changeType()}}, {{MediaSource/addSourceBuffer()}} and {{MediaSource/isTypeSupported()}} with precise codec parameters to more proactively detect user agent support. {{SourceBuffer/changeType()}} is required if the {{SourceBuffer}} object's bytestream format is changing.
For each audio track in the [=initialization segment=], run the following steps:
If this {{SourceBuffer}} object's {{SourceBuffer/audioTracks}}.{{AudioTrackList/length}} equals 0, then run the following steps:
This should trigger {{AudioTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=AudioTrackList/addtrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to |new audio track|, at the {{AudioTrackList}} object referenced by the {{SourceBuffer/audioTracks}} attribute on this SourceBuffer object.
Otherwise, post an internal create track mirror message to {{MediaSource/[[port to main]]}} whose implicit handler in {{Window}} runs the following steps:
This should trigger {{AudioTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=AudioTrackList/addtrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to |mirrored audio track| or |new audio track|, at the {{AudioTrackList}} object referenced by the {{HTMLMediaElement/audioTracks}} attribute on the HTMLMediaElement.
For each video track in the [=initialization segment=], run the following steps:
If this {{SourceBuffer}} object's {{SourceBuffer/videoTracks}}.{{VideoTrackList/length}} equals 0, then run the following steps:
This should trigger {{VideoTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=VideoTrackList/addtrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to |new video track|, at the {{VideoTrackList}} object referenced by the {{SourceBuffer/videoTracks}} attribute on this SourceBuffer object.
Otherwise, post an internal create track mirror message to {{MediaSource/[[port to main]]}} whose implicit handler in {{Window}} runs the following steps:
This should trigger {{VideoTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=VideoTrackList/addtrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to |mirrored video track| or |new video track|, at the {{VideoTrackList}} object referenced by the {{HTMLMediaElement/videoTracks}} attribute on the HTMLMediaElement.
For each text track in the [=initialization segment=], run the following steps:
This should trigger {{TextTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=TextTrackList/addtrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to |new text track|, at the {{TextTrackList}} object referenced by the {{SourceBuffer/textTracks}} attribute on this SourceBuffer object.
Otherwise, post an internal create track mirror message to {{MediaSource/[[port to main]]}} whose implicit handler in {{Window}} runs the following steps:
This should trigger {{TextTrackList}} [[HTML]] logic to [=queue a task=] to [=fire an event=] named [=TextTrackList/addtrack=] using {{TrackEvent}} with the {{TrackEvent/track}} attribute initialized to |mirrored text track| or |new text track|, at the {{TextTrackList}} object referenced by the {{HTMLMediaElement/textTracks}} attribute on the HTMLMediaElement.
Per [[HTML]] logic, {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} changes may trigger events on the HTMLMediaElement. If transition from {{HTMLMediaElement/HAVE_NOTHING}} to {{HTMLMediaElement/HAVE_METADATA}} occurs, it should trigger HTMLMediaElement logic to [=queue a task=] to [=fire an event=] named [=HTMLMediaElement/loadedmetadata=] at the media element.
When complete [=coded frames=] have been parsed by the [=segment parser loop=] then the following steps are run:
For each [=coded frame=] in the [=media segment=] run the following steps:
Special processing may be needed to determine the presentation and decode timestamps for timed text frames since this information may not be explicitly present in the underlying format or may be dependent on the order of the frames. Some metadata text tracks, like MPEG2-TS PSI data, may only have implied timestamps. Format specific rules for these situations SHOULD be in the [=byte stream format specifications=] or in separate extension specifications.
Implementations don't have to internally store timestamps in a double precision floating point representation. This representation is used here because it is the representation for timestamps in the HTML spec. The intention here is to make the behavior clear without adding unnecessary complexity to the algorithm to deal with the fact that adding a timestampOffset may cause a timestamp rollover in the underlying timestamp representation used by the byte stream format. Implementations can use any internal timestamp representation they wish, but the addition of timestampOffset SHOULD behave in a similar manner to what would happen if a double precision floating point representation was used.
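As a concrete instance of the rollover concern above: an MPEG-2 TS byte stream carries 33-bit timestamps in 90 kHz units, which wrap roughly every 26.5 hours. A parser-side sketch (hypothetical helper, non-normative) can unwrap them so the spec-level double timeline keeps increasing:

```javascript
const PTS_WRAP = 2 ** 33;  // 33-bit MPEG-2 TS timestamp space
const PTS_CLOCK = 90000;   // 90 kHz ticks per second

// Unwrap a possibly rolled-over 33-bit tick count against the previous
// unwrapped value: if the raw value jumped backwards by more than half
// the wrap range, assume a rollover occurred and add one wrap period.
function unwrapTicks(prevUnwrapped, rawTicks) {
  let ticks = rawTicks + Math.floor(prevUnwrapped / PTS_WRAP) * PTS_WRAP;
  if (ticks < prevUnwrapped - PTS_WRAP / 2) ticks += PTS_WRAP;
  return ticks;
}

// Convert unwrapped ticks to the double-valued seconds timeline.
function ticksToSeconds(ticks) {
  return ticks / PTS_CLOCK;
}
```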
If {{SourceBuffer/timestampOffset}} is not 0, then run the following steps:
Some implementations MAY choose to collect some of these coded frames with |presentation timestamp| less than {{SourceBuffer/appendWindowStart}} and use them to generate a splice at the first coded frame that has a [=presentation timestamp=] greater than or equal to {{SourceBuffer/appendWindowStart}} even if that frame is not a [=random access point=]. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement.
Some implementations MAY choose to collect coded frames with |presentation timestamp| less than {{SourceBuffer/appendWindowEnd}} and |frame end timestamp| greater than {{SourceBuffer/appendWindowEnd}} and use them to generate a splice across the portion of the collected coded frames within the append window at time of collection, and the beginning portion of later processed frames which only partially overlap the end of the collected coded frames. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement. In conjunction with collecting coded frames that span {{SourceBuffer/appendWindowStart}}, implementations MAY thus support gapless audio splicing.
This is to compensate for minor errors in frame timestamp computations that can appear when converting back and forth between double precision floating point numbers and rationals. This tolerance allows a frame to replace an existing one as long as it is within 1 microsecond of the existing frame's start time. Frames that come slightly before an existing frame are handled by the removal step below.
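The tolerance check in this note amounts to the following comparison (hypothetical helper, non-normative):

```javascript
// A newly appended frame may replace an existing one when their
// presentation timestamps differ by at most 1 microsecond.
function withinReplacementTolerance(newStart, existingStart) {
  return Math.abs(newStart - existingStart) <= 1e-6;
}
```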
Removing all [=coded frames=] until the next [=random access point=] is a conservative estimate of the decoding dependencies since it assumes all frames between the removed frames and the next random access point depended on the frames that were removed.
The greater than check is needed because bidirectional prediction between coded frames can cause |presentation timestamp| to not be monotonically increasing even though the decode timestamps are monotonically increasing.
If the {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} attribute is {{HTMLMediaElement/HAVE_METADATA}} and the new [=coded frames=] cause {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}} to have a {{TimeRanges}} for the current playback position, then set the {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} attribute to {{HTMLMediaElement/HAVE_CURRENT_DATA}}.
Per [[HTML]] logic, {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} changes may trigger events on the HTMLMediaElement.
If the {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} attribute is {{HTMLMediaElement/HAVE_CURRENT_DATA}} and the new [=coded frames=] cause {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}} to have a {{TimeRanges}} that includes the current playback position and some time beyond the current playback position, then set the {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} attribute to {{HTMLMediaElement/HAVE_FUTURE_DATA}}.
Per [[HTML]] logic, {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} changes may trigger events on the HTMLMediaElement.
If the {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} attribute is {{HTMLMediaElement/HAVE_FUTURE_DATA}} and the new [=coded frames=] cause {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}} to have a {{TimeRanges}} that includes the current playback position and [=enough data to ensure uninterrupted playback=], then set the {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} attribute to {{HTMLMediaElement/HAVE_ENOUGH_DATA}}.
Per [[HTML]] logic, {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} changes may trigger events on the HTMLMediaElement.
Follow these steps when [=coded frames=] for a specific time range need to be removed from the SourceBuffer:
For each [=track buffer=] in this {{SourceBuffer}}, run the following steps:
If this [=track buffer=] has a [=random access point=] timestamp that is greater than or equal to |end|, then update |remove end timestamp| to that random access point timestamp.
Random access point timestamps can be different across tracks because the dependencies between [=coded frames=] within a track are usually different than the dependencies in another track.
For each removed frame, if the frame has a [=decode timestamp=] equal to the [=last decode timestamp=] for the frame's track, run the following steps:
Removing all [=coded frames=] until the next [=random access point=] is a conservative estimate of the decoding dependencies since it assumes all frames between the removed frames and the next random access point depended on the frames that were removed.
If this object is in {{MediaSource/activeSourceBuffers}}, the [=current playback position=] is greater than or equal to |start| and less than the |remove end timestamp|, and {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} is greater than {{HTMLMediaElement/HAVE_METADATA}}, then set the {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} attribute to {{HTMLMediaElement/HAVE_METADATA}} and stall playback.
Per [[HTML]] logic, {{HTMLMediaElement}}.{{HTMLMediaElement/readyState}} changes may trigger events on the HTMLMediaElement.
This transition occurs because media data for the current position has been removed. Playback cannot progress until media for the current playback position is appended or the selected/enabled tracks change.
This algorithm is run to free up space in this {{SourceBuffer}} when new data is appended.
A step is needed here to recognize that implementations MAY set {{SourceBuffer/[[buffer full flag]]}} to true if they predict that processing |new data| in addition to any bytes already in {{SourceBuffer/[[input buffer]]}} would exceed the capacity of the {{SourceBuffer}}. Such a step would enable implementations to push back more proactively before accepting |new data| that would overflow resources. In practice, at least one implementation already does this.
Implementations MAY use different methods for selecting |removal ranges| so web applications SHOULD NOT depend on a specific behavior. The web application can use the {{SourceBuffer/buffered}} attribute to observe whether portions of the buffered data have been evicted.
Follow these steps when the [=coded frame processing=] algorithm needs to generate a splice frame for two overlapping audio [=coded frames=]:
floor(x * sample_rate + 0.5) / sample_rate
).
For example, given the following values:
|presentation timestamp| and |decode timestamp| are updated to 10.0125 since 10.01255 is closer to 10 + 100/8000 (10.0125) than 10 + 101/8000 (10.012625)
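The rounding above can be checked directly. This is a worked, non-normative evaluation of the formula with the example values (the helper name is ours):

```javascript
// Snap a timestamp to the nearest sample boundary on the sample-rate grid,
// using the formula floor(x * sample_rate + 0.5) / sample_rate.
const snapToSampleGrid = (x, rate) => Math.floor(x * rate + 0.5) / rate;

console.log(snapToSampleGrid(10.01255, 8000)); // 10.0125 (sample 80100)
```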
Some implementations MAY apply fades to/from silence to coded frames on either side of the inserted silence to make the transition less jarring.
This is intended to allow |new coded frame| to be added to the |track buffer| as if |overlapped frame| had not been in the |track buffer| to begin with.
If the |new coded frame| is less than 5 milliseconds in duration, then coded frames that are appended after the |new coded frame| will be needed to properly render the splice.
See the [=audio splice rendering=] algorithm for details on how this splice frame is rendered.
The following steps are run when a spliced frame, generated by the [=audio splice frame=] algorithm, needs to be rendered by the media element:
Here is a graphical representation of this algorithm.
Follow these steps when the [=coded frame processing=] algorithm needs to generate a splice frame for two overlapping timed text [=coded frames=]:
This is intended to allow |new coded frame| to be added to the |track buffer| as if it hadn't overlapped any frames in |track buffer| to begin with.
SourceBufferList is a simple container object for SourceBuffer objects. It provides read-only array access and fires events when the list is modified.
[Exposed=(Window,DedicatedWorker)]
interface SourceBufferList : EventTarget {
    readonly attribute unsigned long length;
    attribute EventHandler onaddsourcebuffer;
    attribute EventHandler onremovesourcebuffer;
    getter SourceBuffer (unsigned long index);
};
length
of type {{unsigned long}}, readonly. Indicates the number of SourceBuffer objects in the list.
onaddsourcebuffer
of type {{EventHandler}}. The event handler for the {{addsourcebuffer}} event.
onremovesourcebuffer
of type {{EventHandler}}. The event handler for the {{removesourcebuffer}} event.
getter
Allows the SourceBuffer objects in the list to be accessed with an array operator (i.e., []).
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
index | {{unsigned long}} | ✘ | ✘ | |
Event name | Interface | Dispatched when... |
---|---|---|
addsourcebuffer | {{Event}} | When a {{SourceBuffer}} is added to the list. |
removesourcebuffer | {{Event}} | When a {{SourceBuffer}} is removed from the list. |
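The read-only indexed access described above can be demonstrated with a minimal stand-in (this is not the real interface; the class, its name, and the string stand-ins for {{SourceBuffer}} objects are all hypothetical). A `Proxy` forwards numeric indices to a backing array, mimicking the WebIDL indexed getter:

```javascript
// Non-normative mock of SourceBufferList's array-style access pattern.
class FakeSourceBufferList {
  #buffers = [];
  constructor(buffers) {
    this.#buffers = [...buffers];
    // Forward numeric property reads (e.g. list[0]) to the backing array.
    return new Proxy(this, {
      get: (target, prop) => {
        if (typeof prop === "string" && /^\d+$/.test(prop)) {
          return target.#buffers[Number(prop)];
        }
        return prop === "length" ? target.#buffers.length : target[prop];
      },
    });
  }
}

const list = new FakeSourceBufferList(["sb0", "sb1"]);
console.log(list.length, list[0], list[1]); // 2 "sb0" "sb1"
```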
A {{ManagedMediaSource}} is a {{MediaSource}} that actively manages its memory content. Unlike a plain {{MediaSource}}, a {{ManagedMediaSource}} lets the [=user agent=] evict content through the [=ManagedMediaSource/memory cleanup=] algorithm from its {{MediaSource/sourceBuffers}} (populated with {{ManagedSourceBuffer}}) for any reason.
[Exposed=(Window,DedicatedWorker)]
interface ManagedMediaSource : MediaSource {
    constructor();
    readonly attribute boolean streaming;
    attribute EventHandler onstartstreaming;
    attribute EventHandler onendstreaming;
};
On getting:
Event name | Interface | Dispatched when... |
---|---|---|
startstreaming | {{Event}} | {{ManagedMediaSource/streaming}} attribute changed from `false` to `true`. |
endstreaming | {{Event}} | {{ManagedMediaSource/streaming}} attribute changed from `true` to `false`. |
The following steps are run periodically, whenever the [=MediaSource/SourceBuffer Monitoring=] algorithm is scheduled to run.
Having enough managed data to ensure uninterrupted playback is an implementation-defined condition where the user agent determines that it currently has enough data to play the presentation without stalling for a meaningful period of time. This condition is constantly evaluated to determine when to transition the value of {{ManagedMediaSource/streaming}}. These transitions indicate when the user agent believes it has enough data buffered or needs more data, respectively.
Being able to retrieve and buffer data in an efficient way is an implementation-defined condition where the user agent determines that it can fetch new data in an energy-efficient manner while still achieving the desired memory usage.
[Exposed=(Window,DedicatedWorker)]
interface BufferedChangeEvent : Event {
    constructor(DOMString type, BufferedChangeEventInit eventInitDict);
    [SameObject] readonly attribute TimeRanges addedRanges;
    [SameObject] readonly attribute TimeRanges removedRanges;
};

dictionary BufferedChangeEventInit : EventInit {
    TimeRanges addedRanges;
    TimeRanges removedRanges;
};
[Exposed=(Window,DedicatedWorker)]
interface ManagedSourceBuffer : SourceBuffer {
    attribute EventHandler onbufferedchange;
};
An [=event handler IDL attribute=] whose [=event handler event type=] is {{bufferedchange}}.
Event name | Interface | Dispatched when... |
---|---|---|
bufferedchange | {{BufferedChangeEvent}} | The {{ManagedSourceBuffer}}'s buffered range changed following a call to {{SourceBuffer/appendBuffer()}}, {{SourceBuffer/remove()}}, or as a consequence of the user agent running the [=ManagedSourceBuffer/memory cleanup=] algorithm. |
The following steps are run upon completion of any operation on the {{ManagedSourceBuffer}} |buffer:ManagedSourceBuffer| that would cause |buffer|'s {{SourceBuffer/buffered}} to change, that is, once {{SourceBuffer/appendBuffer()}}, {{SourceBuffer/remove()}}, or the [=ManagedSourceBuffer/memory cleanup=] algorithm has completed.
Implementations can use different strategies for selecting |removal ranges| so web applications shouldn't depend on a specific behavior. The web application would listen to the {{bufferedchange}} event to observe whether portions of the buffered data have been evicted.
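An application observing {{bufferedchange}} (or polling {{SourceBuffer/buffered}}) can derive evicted ranges itself by subtracting the new buffered ranges from the old ones. The following non-normative sketch uses plain `[start, end)` pairs instead of a real {{TimeRanges}} object; all names are illustrative:

```javascript
// Return the portions of `before` that are no longer present in `after`.
// Both inputs are sorted, non-overlapping [start, end) pairs in seconds.
function subtractRanges(before, after) {
  const removed = [];
  for (const [bStart, bEnd] of before) {
    // Start from the whole old range and carve out every surviving range.
    let segments = [[bStart, bEnd]];
    for (const [aStart, aEnd] of after) {
      const next = [];
      for (const [s, e] of segments) {
        if (aEnd <= s || aStart >= e) { next.push([s, e]); continue; }
        if (aStart > s) next.push([s, aStart]); // left remainder
        if (aEnd < e) next.push([aEnd, e]);     // right remainder
      }
      segments = next;
    }
    removed.push(...segments);
  }
  return removed;
}

// 0-10s was buffered; after eviction only 0-4s and 6-10s remain.
console.log(subtractRanges([[0, 10]], [[0, 4], [6, 10]])); // [ [ 4, 6 ] ]
```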
This section specifies what the existing {{HTMLMediaElement}}.{{HTMLMediaElement/seekable}} and {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}} attributes on the {{HTMLMediaElement}} MUST return when a MediaSource is attached to the element, and what the existing {{HTMLMediaElement}}.{{HTMLMediaElement/srcObject}} attribute MUST do when it is set to a {{MediaSourceHandle}} object.
The {{HTMLMediaElement}}.{{HTMLMediaElement/seekable}} attribute returns a new static normalized {{TimeRanges}} object created based on the following steps:
This case is intended to handle implementations that may no longer maintain any previous information about buffered or seekable media in a MediaSource that was constructed in a DedicatedWorkerGlobalScope that has been terminated by {{Worker/terminate()}} or user agent execution of [=terminate a worker=] for the MediaSource's DedicatedWorkerGlobalScope, for instance as the eventual result of {{DedicatedWorkerGlobalScope/close()}} execution.
Should there be some (eventual) media element error transition in the case of an attached worker MediaSource having its context destroyed? The experimental Chromium implementation of worker MSE just keeps the element readyState, networkState and error the same as prior to that context destruction, though the seekable and buffered attributes each report an empty TimeRange.
The {{HTMLMediaElement}}.{{HTMLMediaElement/buffered}} attribute returns a static normalized {{TimeRanges}} object based on the following steps.
This case is intended to handle implementations that may no longer maintain any previous information about buffered or seekable media in a MediaSource that was constructed in a DedicatedWorkerGlobalScope that has been terminated by {{Worker/terminate()}} or user agent execution of [=terminate a worker=] for the MediaSource's DedicatedWorkerGlobalScope, for instance as the eventual result of {{DedicatedWorkerGlobalScope/close()}} execution.
Should there be some (eventual) media element error transition in the case of an attached worker MediaSource having its context destroyed? The experimental Chromium implementation of worker MSE just keeps the element readyState, networkState and error the same as prior to that context destruction, though the seekable and buffered attributes each report an empty TimeRange.
The overhead of recalculating and communicating |recent intersection ranges| so frequently is one reason for allowing implementation flexibility to query this information on-demand using other mechanisms such as shared memory and locks as mentioned in [=cross-context communication model=].
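The kind of intersection computation referenced above can be sketched non-normatively: intersect the buffered ranges of several source buffers, each represented here as a sorted list of `[start, end)` pairs (helper names are ours):

```javascript
// Intersect two sorted, non-overlapping range lists.
function intersectTwo(a, b) {
  const out = [];
  for (const [as, ae] of a)
    for (const [bs, be] of b) {
      const s = Math.max(as, bs), e = Math.min(ae, be);
      if (s < e) out.push([s, e]); // keep only non-empty overlaps
    }
  return out;
}

// Intersect the buffered ranges of all source buffers.
const intersectAll = (lists) => lists.reduce(intersectTwo);

console.log(intersectAll([[[0, 10]], [[2, 5], [7, 12]]])); // [ [ 2, 5 ], [ 7, 10 ] ]
```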
If a {{HTMLMediaElement}}.{{HTMLMediaElement/srcObject}} attribute is assigned a {{MediaSourceHandle}}, then set {{MediaSourceHandle/[[has ever been assigned as srcobject]]}} for that {{MediaSourceHandle}} to true as part of the synchronous steps of the extended {{HTMLMediaElement}}'s {{HTMLMediaElement/srcObject}} setter that occur before invoking the element's load algorithm.
This prevents that {{MediaSourceHandle}} object from ever being transferred again, enabling a clear synchronous exception if that is attempted.
MediaSourceHandle needs to be added to HTMLMediaElement's MediaProvider IDL typedef and related text involving media provider objects.
This section specifies extensions to the [[HTML]] {{AudioTrack}} definition.
[Exposed=(Window,DedicatedWorker)]
partial interface AudioTrack {
    readonly attribute SourceBuffer? sourceBuffer;
};
sourceBuffer
of type {{SourceBuffer}}, readonly, nullable
On getting, run the following step:
If the [=create track mirror=] handler in {{Window}} was used to create this track, then the {{Window}} copy of the track would return null for this attribute.
This section specifies extensions to the [[HTML]] {{VideoTrack}} definition.
[Exposed=(Window,DedicatedWorker)]
partial interface VideoTrack {
    readonly attribute SourceBuffer? sourceBuffer;
};
sourceBuffer
of type {{SourceBuffer}}, readonly, nullable
On getting, run the following step:
If the [=create track mirror=] handler in {{Window}} was used to create this track, then the {{Window}} copy of the track would return null for this attribute.
This section specifies extensions to the [[HTML]] {{TextTrack}} definition.
[Exposed=(Window,DedicatedWorker)]
partial interface TextTrack {
    readonly attribute SourceBuffer? sourceBuffer;
};
sourceBuffer
of type {{SourceBuffer}}, readonly, nullable
On getting, run the following step:
If the [=create track mirror=] handler in {{Window}} was used to create this track, then the {{Window}} copy of the track would return null for this attribute.
The bytes provided through {{SourceBuffer/appendBuffer()}} for a SourceBuffer form a logical byte stream. The format and semantics of these byte streams are defined in byte stream format specifications. The byte stream format registry [[MSE-REGISTRY]] provides mappings between a MIME type that may be passed to {{MediaSource/addSourceBuffer()}}, {{MediaSource/isTypeSupported()}} or {{SourceBuffer/changeType()}} and the byte stream format expected by a {{SourceBuffer}} using that MIME type for parsing newly appended data. Implementations are encouraged to register mappings for byte stream formats they support to facilitate interoperability. The byte stream format registry [[MSE-REGISTRY]] is the authoritative source for these mappings. If an implementation claims to support a MIME type listed in the registry, its SourceBuffer implementation MUST conform to the [=byte stream format specification=] listed in the registry entry.
The byte stream format specifications in the registry are not intended to define new storage formats. They simply outline the subset of existing storage format structures that implementations of this specification will accept.
Byte stream format parsing and validation is implemented in the [=segment parser loop=] algorithm.
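The registry's MIME-type-to-format mapping idea can be sketched as a simple lookup. The container types and format names below are real entries from the MSE byte stream format registry [[MSE-REGISTRY]], but the table and helper themselves are illustrative, not an API:

```javascript
// Non-normative sketch of the registry's container-type mapping.
const byteStreamFormats = new Map([
  ["video/webm", "WebM Byte Stream Format"],
  ["audio/webm", "WebM Byte Stream Format"],
  ["video/mp4", "ISO BMFF Byte Stream Format"],
  ["audio/mp4", "ISO BMFF Byte Stream Format"],
  ["audio/mpeg", "MPEG Audio Byte Stream Format"],
  ["video/mp2t", "MPEG-2 Transport Streams Byte Stream Format"],
]);

// Strip codec parameters and look up the container's byte stream format.
function formatForMimeType(mimeType) {
  const container = mimeType.split(";")[0].trim().toLowerCase();
  return byteStreamFormats.get(container) ?? null;
}

console.log(formatForMimeType('video/mp4; codecs="avc1.4d4015"'));
// "ISO BMFF Byte Stream Format"
```

In a browser, the authoritative check remains {{MediaSource/isTypeSupported()}} with the full MIME type including codecs.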
This section provides general requirements for all byte stream format specifications:
If the byte stream format covers a format similar to one covered in the in-band tracks spec [[INBANDTRACKS]], then it SHOULD try to use the same attribute mappings so that Media Source Extensions playback and non-Media Source Extensions playback provide the same track information.
The number and type of tracks are not consistent.
For example, if the first [=initialization segment=] has 2 audio tracks and 1 video track, then all [=initialization segments=] that follow it in the byte stream MUST describe 2 audio tracks and 1 video track.
Unsupported codec changes occur across [=initialization segments=].
See the [=initialization segment received=] algorithm, {{MediaSource/addSourceBuffer()}} and {{SourceBuffer/changeType()}} for details and examples of codec changes.
Video frame size changes. The user agent MUST support seamless playback.
This will cause the <video> display region to change size if the web application does not use CSS or HTML attributes (width/height) to constrain the element size.
Audio channel count changes. The user agent MAY support this seamlessly and could trigger downmixing.
This is a quality of implementation issue because changing the channel count may require reinitializing the audio device, resamplers, and channel mixers which tends to be audible.
This is intended to simplify switching between audio streams where the frame boundaries don't always line up across encodings (e.g., Vorbis).
For example, if I1 is associated with M1, M2, M3 then the above MUST hold for all the combinations I1+M1, I1+M2, I1+M1+M2, I1+M2+M3, etc.
Byte stream specifications MUST at a minimum define constraints which ensure that the above requirements hold. Additional constraints MAY be defined, for example to simplify implementation.
<script>
  function onSourceOpen(videoTag, e) {
    var mediaSource = e.target;

    if (mediaSource.sourceBuffers.length > 0)
      return;

    var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');

    videoTag.addEventListener('seeking', onSeeking.bind(videoTag, mediaSource));
    videoTag.addEventListener('progress', onProgress.bind(videoTag, mediaSource));

    var initSegment = GetInitializationSegment();

    if (initSegment == null) {
      // Error fetching the initialization segment. Signal end of stream with an error.
      mediaSource.endOfStream("network");
      return;
    }

    // Append the initialization segment.
    var firstAppendHandler = function(e) {
      var sourceBuffer = e.target;
      sourceBuffer.removeEventListener('updateend', firstAppendHandler);

      // Append some initial media data.
      appendNextMediaSegment(mediaSource);
    };
    sourceBuffer.addEventListener('updateend', firstAppendHandler);
    sourceBuffer.appendBuffer(initSegment);
  }

  function appendNextMediaSegment(mediaSource) {
    if (mediaSource.readyState == "closed")
      return;

    // If we have run out of stream data, then signal end of stream.
    if (!HaveMoreMediaSegments()) {
      mediaSource.endOfStream();
      return;
    }

    // Make sure the previous append is not still pending.
    if (mediaSource.sourceBuffers[0].updating)
      return;

    var mediaSegment = GetNextMediaSegment();

    if (!mediaSegment) {
      // Error fetching the next media segment.
      mediaSource.endOfStream("network");
      return;
    }

    // NOTE: If mediaSource.readyState == "ended", this appendBuffer() call will
    // cause mediaSource.readyState to transition to "open". The web application
    // should be prepared to handle multiple "sourceopen" events.
    mediaSource.sourceBuffers[0].appendBuffer(mediaSegment);
  }

  function onSeeking(mediaSource, e) {
    var video = e.target;

    if (mediaSource.readyState == "open") {
      // Abort current segment append.
      mediaSource.sourceBuffers[0].abort();
    }

    // Notify the media segment loading code to start fetching data at the
    // new playback position.
    SeekToMediaSegmentAt(video.currentTime);

    // Append a media segment from the new playback position.
    appendNextMediaSegment(mediaSource);
  }

  function onProgress(mediaSource, e) {
    appendNextMediaSegment(mediaSource);
  }
</script>

<video id="v" autoplay></video>

<script>
  var video = document.getElementById('v');
  var mediaSource = new MediaSource();
  mediaSource.addEventListener('sourceopen', onSourceOpen.bind(this, video));
  video.src = window.URL.createObjectURL(mediaSource);
</script>
<script>
  async function setUpVideoStream() {
    // Specific video format and codec
    const mediaType = 'video/mp4; codecs="mp4a.40.2,avc1.4d4015"';

    // Check if the type of video format / codec is supported.
    if (!window.ManagedMediaSource?.isTypeSupported(mediaType)) {
      return; // Not supported, do something else.
    }

    // Set up video and its managed source.
    const video = document.createElement("video");
    const source = new ManagedMediaSource();

    video.controls = true;

    await new Promise((resolve) => {
      video.src = URL.createObjectURL(source);
      source.addEventListener("sourceopen", resolve, { once: true });
      document.body.appendChild(video);
    });

    const sourceBuffer = source.addSourceBuffer(mediaType);

    // Set up the event handlers
    sourceBuffer.onbufferedchange = (e) => {
      console.log("onbufferedchange event fired.");
      console.log(`Added Ranges: ${timeRangesToString(e.addedRanges)}`);
      console.log(`Removed Ranges: ${timeRangesToString(e.removedRanges)}`);
    };

    source.onstartstreaming = async () => {
      const response = await fetch("./videos/bipbop.mp4");
      const buffer = await response.arrayBuffer();
      await new Promise((resolve) => {
        sourceBuffer.addEventListener("updateend", resolve, { once: true });
        sourceBuffer.appendBuffer(buffer);
      });
    };

    source.onendstreaming = async () => {
      // Stop fetching new segments here
    };
  }

  // Helper function...
  function timeRangesToString(timeRanges) {
    const ranges = [];
    for (let i = 0; i < timeRanges.length; i++) {
      ranges.push([timeRanges.start(i), timeRanges.end(i)]);
    }
    return "[" + ranges.map(([start, end]) => `[${start}, ${end})`) + "]";
  }
</script>

<body onload="setUpVideoStream()"></body>
VideoPlaybackQuality object and the {{HTMLVideoElement}} extension method getVideoPlaybackQuality() described in those previous revisions.