{% include "warning.html" %}

Introduction

In this article, I'll cover two techniques for streaming audio/video using a few of the newer multimedia capabilities of the web platform. The first is the MediaSource API, which lets JavaScript dynamically construct media segments and append them to an existing <audio> or <video> element. The second combines binary WebSockets with the Web Audio API to send, reconstruct, and schedule audio chunks at precise times for seamless playback.

Method 1: MediaSource API

An API designed with streaming in mind is the MediaSource API. It's an experimental feature that allows JavaScript to dynamically append media to an HTMLMediaElement.

The <audio> and <video> media elements are frighteningly trivial to use. That's why we like them! One sets a src attribute that points to a media file and boom, the browser does its thing decoding the file in whatever codec(s) it was created with. But we're dealing with entire files here. We have no control over what the browser does after setting that src. For example, what if we want to adaptively change the quality of video based on network conditions, or splice in different sections of video from multiple sources? Aw shucks. We can't! At least, not without multiple video elements and some JS hackery.

The MediaSource API is here to solve these issues. With it, we can tell an audio/video element to behave differently.

Feature detection

The MediaSource API is still experimental but is enabled by default in Chrome 23, with a vendor prefix:

function hasMediaSource() {
  return !!(window.MediaSource || window.WebKitMediaSource);
}

if (hasMediaSource()) {
  // Ready to (html5)rock!
} else {
  alert("Bummer. Your browser doesn't support the MediaSource API!");
}

Getting started

Using the MediaSource API starts off with our old buddy, HTML5 <video>:

<video controls autoplay></video>

Note: the examples in this section use <video>, but the same concepts apply to <audio>.

Next, create a MediaSource object:

window.MediaSource = window.MediaSource || window.WebKitMediaSource;
var ms = new MediaSource();

The source is going to be the brains behind our <video>. Instead of setting the video's src to a file URL, we're going to create a blob: URL that acts as a handle to the MediaSource. This makes the media element feel special. It knows we're going to do more with it than just feed it a URL. We're going to feed it video data! Yum.

The rest of the setup looks like this:

ms.addEventListener('webkitsourceopen', onSourceOpen.bind(ms), false);

// Use MediaSource to supply video data.
var video = document.querySelector('video');
video.src = window.URL.createObjectURL(ms); // blob URL pointing to the MediaSource.

function onSourceOpen(e) {
  // this.readyState === 'open'. Add source buffer that expects webm chunks.
  var sourceBuffer = ms.addSourceBuffer('video/webm; codecs="vorbis,vp8"');

  ....
}

Only the .webm container is supported at this time.

sourceopen (prefixed as webkitsourceopen in the code above) fires after setting the video's .src to a blob URL pointing to the media source. When this happens, the <video> is ready to accept incoming data and we can create a new SourceBuffer in the event callback. The MIME type passed to .addSourceBuffer() indicates what format the <video> should expect to be handed (webm in this case).

Once we have things set up, chunks of .webm can be dynamically added to the <video> by appending them to the SourceBuffer:

// Append a chunk of a webm file.
sourceBuffer.append(webmChunk);

This method takes a Uint8Array typed array.
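
If the bytes arrive as an ArrayBuffer (for example from a FileReader or an XHR), wrapping them is cheap: new Uint8Array(buffer) creates a view over the same memory rather than a copy. Here is a minimal sketch of that conversion. toChunk is a hypothetical helper name used for illustration, not part of the MediaSource API:

```javascript
// Hypothetical helper: wrap raw bytes in the Uint8Array that append() expects.
function toChunk(data) {
  if (data instanceof Uint8Array) {
    return data; // Already in the right form.
  }
  if (data instanceof ArrayBuffer) {
    return new Uint8Array(data); // A view over the same bytes, no copy.
  }
  throw new TypeError('Expected an ArrayBuffer or Uint8Array');
}

var buffer = new ArrayBuffer(4);
var chunk = toChunk(buffer);
chunk[0] = 0x1a; // Writes through to the underlying buffer.
```

The result of toChunk() could then be handed to sourceBuffer.append().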

Appending chunks of media

The previous example appended a single chunk of webm to our <video>. However, for the purposes of streaming, we need API functionality that lets us continuously append new video chunks as they come in from the server. Since most people don't have their media split into a bunch of pieces, there are a couple of ways to do this.

Using range requests

If your server supports it, you can request portions of a file using the Range header. Two APIs that support partial resources out of the box are the Google Drive API and the App Engine BlobStore API (via X-AppEngine-BlobRange).

You can set custom headers on an XHR request using setRequestHeader(). For instance, here's an example of requesting the first 500 bytes of a file:

var xhr = new XMLHttpRequest();
xhr.open('GET', '/path/to/video.webm', true);
xhr.responseType = 'arraybuffer'; // We want raw bytes, not a Blob or string.
xhr.setRequestHeader('Range', 'bytes=0-499'); // Ranges are inclusive: first 500 bytes.
xhr.onload = function(e) {
  // 206 Partial Content means the server honored the Range header.
  if (this.status == 206) {
    var initializationWebMChunk = new Uint8Array(this.response);
    sourceBuffer.append(initializationWebMChunk);
  }
};
xhr.send();

I'm calling the first chunk of a .webm file the "initialization chunk". This first portion contains the .webm container's header information. If your videos are constructed correctly, there's nothing special you need to do here. Just make sure this first chunk is indeed the first one you append.

Then for subsequent appends, request the appropriate byte range and go to town:

sourceBuffer.append(webMChunk2);
sourceBuffer.append(webMChunk3);
...
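
One way to work out the byte ranges for those subsequent requests is a small helper like the one below. It's only a sketch: rangeHeader, the 500-byte chunk size, and the 1200-byte file size are illustrative assumptions. Keep in mind that HTTP byte ranges are inclusive at both ends, so bytes=0-499 is exactly 500 bytes.

```javascript
// Hypothetical helper: build the Range header value for the n-th chunk
// (0-based) of a file, given a fixed chunk size in bytes.
function rangeHeader(index, chunkSize, fileSize) {
  var start = index * chunkSize;
  if (start >= fileSize) {
    return null; // Past the end of the file; nothing left to request.
  }
  // Ranges are inclusive, so the last byte is start + chunkSize - 1,
  // clamped to the final byte of the file.
  var end = Math.min(start + chunkSize, fileSize) - 1;
  return 'bytes=' + start + '-' + end;
}

var headers = [];
for (var i = 0, h; (h = rangeHeader(i, 500, 1200)) !== null; ++i) {
  headers.push(h);
}
// headers: ['bytes=0-499', 'bytes=500-999', 'bytes=1000-1199']
```

Each value can be passed to xhr.setRequestHeader('Range', ...). The helper assumes the total file size is already known, for instance from the Content-Range header of an earlier partial response.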

Slicing a file

The second way to dice a file is to do things ahead of time on the server. However, for demonstration purposes, we can do so client-side using the File APIs.

As an example, here's how to use XHR to request a file and slice it into pieces using File.slice():

var FILENAME = 'test.webm';
var NUM_CHUNKS = 5;

function get(url, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.responseType = 'blob';

  xhr.onload = function(e) {
    if (this.status == 200) {
      callback(this.response);
    }
  };

  xhr.send();
}

get(FILENAME, function(file) {
  var chunkSize = Math.ceil(file.size / NUM_CHUNKS);
  var fileNameParts = FILENAME.split('.');

  for (var i = 0; i < NUM_CHUNKS; ++i) {
    var startByte = chunkSize * i;

    var chunk = file.slice(startByte, startByte + chunkSize, file.type);

    var a = document.createElement('a');
    a.download = [fileNameParts[0] + i, fileNameParts[1]].join('.');
    a.textContent = 'Download chunk ' + i;
    a.title = chunk.size + ' bytes';
    // blob urls created from file parts use original file. See crbug.com/145156.
    a.href = window.URL.createObjectURL(chunk);
    document.body.appendChild(a);
  }
});

The important bits are:

  • .responseType is set to "blob" so the browser hands us the response as a Blob (a file-like object) rather than a string.
  • File.slice() is used to break up the file into NUM_CHUNKS pieces.
  • For each chunk, fashion a blob: URL using window.URL.createObjectURL() and create a downloadable anchor using a[download].

Closing the stream

When there's no more data to append, call .endOfStream() to indicate you're done. This also fires the sourceended event:

// Registering the listener first makes the ordering explicit.
ms.addEventListener('webkitsourceended', function(e) {
  // this.readyState === 'ended'
}, false);

ms.endOfStream();

Now we have everything needed for adaptive streaming. For that use case, we can detect network changes in JS and append higher/lower quality video chunks based on the connection.
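
As a rough illustration of the decision step, the sketch below picks a rendition from a measured throughput. Everything in it is an assumption made for the example: the RENDITIONS list and its bitrates, the 0.8 safety factor, and the idea of estimating throughput from how long the previous chunk took to download.

```javascript
// Hypothetical renditions, sorted from lowest to highest bitrate.
var RENDITIONS = [
  {name: 'low', bitsPerSecond: 500000},
  {name: 'medium', bitsPerSecond: 1500000},
  {name: 'high', bitsPerSecond: 4000000}
];

// Estimate throughput from the size and download time of the last chunk.
function measureThroughput(bytes, milliseconds) {
  return (bytes * 8) / (milliseconds / 1000); // bits per second
}

// Pick the highest rendition whose bitrate fits within a fraction of the
// measured throughput. Falls back to the lowest rendition.
function pickRendition(throughput, safetyFactor) {
  var budget = throughput * safetyFactor;
  var choice = RENDITIONS[0];
  for (var i = 0; i < RENDITIONS.length; ++i) {
    if (RENDITIONS[i].bitsPerSecond <= budget) {
      choice = RENDITIONS[i];
    }
  }
  return choice;
}

// A 250,000-byte chunk that took 1 second works out to 2,000,000 bits
// per second, which leaves room for the 'medium' rendition.
var next = pickRendition(measureThroughput(250000, 1000), 0.8);
```

Chunks for the chosen rendition would then be requested and appended as shown earlier.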

Example: Chunking a file and appending for continuous playback

To demonstrate appending video data onto a <video>, we need a .webm movie that's split into multiple pieces. Since you probably don't have one of these lying around, I've created the following script to do that for you. It uses file.slice() to break up a .webm file into NUM_CHUNKS pieces.

Select a .webm file:

Here, we're using XHR2 to pull down the entire webm movie. The important bits to note are:

Chunking a file

Todo

var NUM_CHUNKS = 5;
var FILE = '/static/videos/mediasource_test.webm';

window.MediaSource = window.MediaSource || window.WebKitMediaSource;
var ms = new MediaSource();

var video = document.querySelector('video');
video.src = window.URL.createObjectURL(ms);

ms.addEventListener('webkitsourceopen', function(e) {
  var sourceBuffer = ms.addSourceBuffer('video/webm; codecs="vorbis,vp8"');

  // Fetch the whole movie as a Blob. get() is the XHR helper defined earlier.
  get(FILE, function(file) {
    var chunkSize = Math.ceil(file.size / NUM_CHUNKS);

    // Read and append one chunk at a time so chunks are appended in order.
    (function readChunk(i) {
      var startByte = chunkSize * i;
      var chunk = file.slice(startByte, startByte + chunkSize);

      var reader = new FileReader();
      reader.onload = function(e) {
        sourceBuffer.append(new Uint8Array(e.target.result));
        console.log('appending chunk:' + i);
        if (i == NUM_CHUNKS - 1) {
          ms.endOfStream(); // Last chunk appended; close the stream.
        } else {
          readChunk(i + 1);
        }
      };
      reader.readAsArrayBuffer(chunk);
    })(0);
  });
}, false);
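
Chunks have to be appended in byte order, but asynchronous reads and network requests are not guaranteed to finish in the order they were started. One way to guard against that is a small reordering queue. The sketch below is an illustration under stated assumptions: OrderedAppender is a hypothetical helper, and its target only needs an append() method, so a stand-in object is used here in place of a real SourceBuffer.

```javascript
// Hypothetical helper: holds chunks that arrive early and appends them
// to `target` strictly in index order.
function OrderedAppender(target, totalChunks, onDone) {
  this.target = target;
  this.totalChunks = totalChunks;
  this.onDone = onDone;
  this.nextIndex = 0;
  this.pending = {};
}

OrderedAppender.prototype.add = function(index, chunk) {
  this.pending[index] = chunk;
  // Flush every chunk that is now contiguous with what was appended.
  while (this.pending.hasOwnProperty(this.nextIndex)) {
    this.target.append(this.pending[this.nextIndex]);
    delete this.pending[this.nextIndex];
    this.nextIndex++;
  }
  if (this.nextIndex === this.totalChunks && this.onDone) {
    var done = this.onDone;
    this.onDone = null; // Make sure the callback runs only once.
    done();
  }
};

// Stand-in for a SourceBuffer that records the first byte of each chunk.
var appended = [];
var fakeBuffer = {append: function(chunk) { appended.push(chunk[0]); }};
var finished = false;

var appender = new OrderedAppender(fakeBuffer, 3, function() {
  finished = true; // With a real MediaSource, call ms.endOfStream() here.
});

// Chunks arrive out of order: 1, then 0, then 2.
appender.add(1, new Uint8Array([1]));
appender.add(0, new Uint8Array([0]));
appender.add(2, new Uint8Array([2]));
// appended is now [0, 1, 2] and finished is true.
```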

TODO: Link to DashPlayer http://downloads.webmproject.org/adaptive-demo/adaptive/dash-player.html

Method 2: Binary WebSocket

Todo

http://www.smartjava.org/content/record-audio-using-webrtc-chrome-and-speech-recognition-websockets

Feature detection

Todo

Additional Resources