My js1k Entry: Making Music with the Audio Data API

I participated in js1k this year, and it was a lot of fun. I was disqualified for using a Firefox-specific feature, though, so I'll share my account here; otherwise I would have nothing at all to show for it, which would be very sad.

I'd never entered this competition before and didn't know exactly what to expect, but I thought it would be fun. Each year there's a different theme, and this year's theme was love. I brainstormed various ideas and ended up deciding to write a program that plays a love song. This is actually not easy, because the competition rules disallow linking to any asset: no mp3, no wav, no Flash. I am a music geek at heart, so the idea appealed to me; I also happened to be listening to a lot of old-fashioned love songs at the time.

Goals

Here are my goals for this competition - what I wanted to get out of it.

Song Choice

Although I was initially partial to You Ain't Livin' Till You're Lovin' by Tammi Terrell and Marvin Gaye, I later chose Unchained Melody by the Righteous Brothers - everyone knows it, it's a pretty song, and it's a simpler one, with fewer notes (thus saving me bytes).

Working with the "Enhanced" Audio API

I decided to use the Firefox-only Audio Data API (at the time still under review). I knew this was against the rules - entries have to run on all the major browsers except IE. I decided to do it anyway, both because my idea was so cool and because I believed this API was the only way to do what I wanted (which turned out not to be true - more on that later).

So, Firefox has a set of APIs for writing raw audio data to your speakers. There's mozSetup(channels, sampleRate), which tells the Audio element how many channels to use (2 for stereo, 1 for mono) and the sampling frequency (44100 for 44.1kHz).

To write data to the buffer you'd use the mozWriteAudio(buffer) method, passing it a buffer of type Float32Array. The whole process might look something like this

var audio = new Audio()
audio.mozSetup(2, 44100)
var samples = new Float32Array([0.242, 0.127, 0.0, -0.058, -0.242 /* ... */])
audio.mozWriteAudio(samples)

Simple, right? Except it's not. I first tried to write out the samples for the entire song in a single mozWriteAudio call. This doesn't work, because it can only write so much data at once - the method returns the number of samples it actually wrote, so that you can continue from where it left off the next time. What their example (under "Complete Example: Creating a Web Based Tone Generator") tells you to do is periodically write samples (using an async loop) up to a fixed-size buffer ahead of the current playhead. You'd use

audio.mozCurrentSampleOffset()

to get the current playhead position. Then you'd calculate the amount of data needed to fill up to your pre-buffer size

var toWrite = audio.mozCurrentSampleOffset() + prebufferSize - writePos

where writePos is the running total of samples you've written so far. The resulting code would look something like this

function writeBits(){
    // toWrite is the amount to write to fill up our pre-buffer
    var toWrite = audio.mozCurrentSampleOffset() + prebufferSize - writePos
    if (toWrite > 0){
        var data = new Float32Array(toWrite)
        for (var i = 0; i < toWrite; i++, writePos++){
            // write your samples to the `data` array here
        }
        audio.mozWriteAudio(data)
    }
    // Feed it data every 100ms
    setTimeout(writeBits, 100)
}

writeBits() // kick off the async loop

Generating the Song Samples

At first I tried using actual samples from the recording: I would down-sample a small section of the song and format it into a Javascript array literal. This failed miserably - even severely down-sampled, the data was still way too big to fit into 1k. The next approach was generating the notes from sine waves - which is more fun anyway!

Sine Waves

The formula for generating a sine wave of a certain frequency looks more or less like

for (var i = 0; i < toWrite; i++, writePos++){
    buffer[i] = Math.sin(writePos * Math.PI * 2 * freq / sampleRate)
}

This is for one single frequency, i.e. one note. If you want multiple notes (or tracks) playing at the same time, you just add the samples together.

for (var i = 0; i < toWrite; i++, writePos++){
    for (var j = 0; j < notes.length; j++){
        buffer[i] += Math.sin(writePos * Math.PI * 2 * notes[j] / sampleRate)
    }
}

Note Representation

Once I figured out that I couldn't generate the samples for the entire song at once, I decided on an event-based approach: each note I wanted to play would have a note-on event to start it and a note-off event to stop it, and at any instant I would keep track of the set of notes currently being played. This is just like the way MIDI works. So, I'd have an event list like

var events = [
    [0, 440],
    [11500, 440],
    [11500, 523.25],
    [22500, 523.25]
]

Each tuple in the events array has two values: a start time (here, the sample number) and a frequency. I didn't bother encoding whether an event is a note-on or a note-off: if the note isn't currently being played, I can assume it's a note-on, and vice versa. One advantage of this format is that supporting multiple simultaneous notes is trivial.
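To make the idea concrete, here's a sketch (not the actual entry code, which works incrementally) of how such an event list determines the set of sounding notes at any sample position:

```javascript
// Walk the events up to the current sample position and toggle each
// frequency in or out of the set of currently-sounding notes
function activeNotes(events, samplePos){
    var playing = []
    for (var i = 0; i < events.length; i++){
        var time = events[i][0], freq = events[i][1]
        if (time > samplePos) break // events are sorted by start time
        var idx = playing.indexOf(freq)
        if (idx === -1) playing.push(freq) // not playing yet: note-on
        else playing.splice(idx, 1)        // already playing: note-off
    }
    return playing
}
```

With the event list above, activeNotes(events, 5000) yields [440], and after sample 22500 nothing is playing.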

I started laying out the melody in this format but quickly found that not only was it space-inefficient (I blew past the 1k limit almost immediately), it was also inconvenient from the songwriter's perspective: I had to make several calculations every time I added or modified a note. Worse, it was very hard to move melodies around, because all the start times would have to change.

Because of these limitations, I rewrote the algorithm to take a format that was both easier on the songwriter and compressed much better. The new format consists of an array of tracks, and each track is an array of notes. A track looks like this

var trackA = [
    [11, 523.25],
    [1, 587.33],
    [12, 523.25],
    ...
]

The first value of each tuple here is the duration of the note rather than its start time, and the second value is the frequency - since durations are relative, the writer can move melodies around without modifying them. The unit of duration is the triplet. I chose the triplet because Unchained Melody has a triplet-ized feel - so the [12, 523.25] entry is a whole note, for example, because each measure of a 4/4 signature contains 12 triplets.
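The final code converts a duration to samples as duration * sampleRate / 3, i.e. one triplet lasts a third of a second. Flattening a track back into the absolute [startSample, frequency] form could look like this (an illustrative helper, not code from the entry):

```javascript
var sampleRate = 44100
  , samplesPerTriplet = sampleRate / 3 // the tempo the entry uses: 3 triplets/sec

// Flatten a duration-based track into [startSample, frequency] pairs
function toAbsolute(track){
    var events = [], pos = 0
    for (var i = 0; i < track.length; i++){
        events.push([pos, track[i][1]])
        pos += track[i][0] * samplesPerTriplet
    }
    return events
}
```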

Sound Envelope

When I finished the song and showed it to a friend, he mentioned that he heard a clicking noise at the start of each note. I noticed it too: it was caused by the abrupt start and end of each note's sine wave. To remove this artifact I needed to make the note edges gradual, i.e. the volume of a note shouldn't jump straight from zero to full with no in-between.

To do this I added a taper to the beginning and end of each note. First, though, I needed to keep track, for each note being played, of how long it has been playing and how much longer it has until it ends. I stored this info in a noteDur tuple, where the first value is the time played so far and the second is the time remaining until note-off. The sample calculation now looks like

buffer[i] = Math.sin(writePos * Math.PI * 2 * notes[j] / sampleRate) *
    (
        (noteDur[0] < taper ? noteDur[0] / taper : 1) *
        (noteDur[1] < taper ? noteDur[1] / taper : 1)
    )

taper is hardcoded to 5000 samples, which equates to about a tenth of a second at 44.1kHz. This got rid of the artifact.
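Factored out into a standalone helper (my restatement; the entry inlines the expression), the envelope gain is just two linear ramps multiplied together:

```javascript
var taper = 5000 // ramp length in samples, ~0.11s at 44.1kHz

// Gain in [0, 1]: ramps up over the first `taper` samples of a note
// and back down over its last `taper` samples
function envelope(played, remaining){
    return (played < taper ? played / taper : 1) *
           (remaining < taper ? remaining / taper : 1)
}
```

A note shorter than twice the taper never reaches full volume, which is fine for removing clicks.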

Byte Saving Techniques

I am by no means an expert at shaving bytes off Javascript programs. My approach was naive: make it work first, then make it fit.

Minification

I used Uglify to shrink my code - it consistently performed better than the Closure Compiler on this program. Here are a couple of tricks I used in conjunction with the minifier.

The Wrap/Unwrap Trick

One good way to evaluate how well your minifier did is to prettify the minified source using jsbeautifier. I found that the minifier doesn't rename global variables, so I wrapped all my code inside an IIFE (immediately-invoked function expression).

!function(){
    // my code
}()

This got the minifier to rename all the variables to one character. I then removed the IIFE from the minified source by hand, saving me all of 15 bytes!

Go to the Dark Side - Get Rid of var

When I really got stuck squeezing out the last few bytes to get under the 1k limit, inspiration struck: what if I just got rid of all the var statements? The program would certainly still work - all the variables would just become globals, and what's the harm in that? So, in the minified source, I used search/replace to remove every var in the program. Of course, I then re-ran the program to make sure it still worked.
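The search/replace itself can be as blunt as a one-line regex (a sketch; it assumes no string literal in the program happens to contain the text "var "):

```javascript
// Strip every `var ` from already-minified source, turning all
// declarations into implicit globals
function stripVars(src){
    return src.replace(/\bvar /g, '')
}
```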

Get Rid of Functions

I found that in every case I encountered, using a function increased file size. So, I removed all functions except where absolutely necessary (like for the async loop).

Get Rid of Variables

Getting rid of any variable that's used only once will save you bytes.

Try Different Approaches and Compare

If you are unsure if a technique will save bytes, try it and compare with what you have.

Be Willing to Kill Code

Give up code that's not "pulling its weight". In my case, I used to have a function that translates a note number on the musical scale to its corresponding frequency. This was overkill: the song didn't use that many distinct notes, so a handful of variables worked just as well as a translation layer and let me delete that code.
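For the curious, that kind of translation function would have looked something like the standard equal-temperament formula (a reconstruction, not the code I actually deleted):

```javascript
// Frequency of MIDI note n in 12-tone equal temperament, A4 (note 69) = 440Hz
function noteToFreq(n){
    return 440 * Math.pow(2, (n - 69) / 12)
}
```

For a song using only a dozen or so pitches, the named variables below cost fewer bytes than this function plus its call sites.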

Results

The code

!function(){
    // note frequencies stored for reuse
    var dol = 523.25
      , re = 587.33
      , mi = 659.26
      , so = 394.995
      , fa = 349.23
      , mib = 311.13
      , la_ = 220
      , ti = 246.94
      , mi_ = mi / 2
      , dol_ = dol / 2
      , la = la_ * 2
      
    // Melody sections
    var sectionA = [
        [11, dol],
        [1, re],
        [12, dol],
        [1, re],
        [1, mi],
        [9, dol],
        [1, dol],
        [8, so],
        [1, so],
        [2, mi],
        [1, re],
        [11, dol],
        [1, re],
        [9, dol],
        [2, mi_],
        [1, fa],
        [24, so]
    ]
    
    var sectionB = [
        [2, fa],
        [2, so],
        [1, la],
        [1, dol],
        [4, ti*2],
        [1, dol],
        [1, ti*2],
        [4, la],
        [1, fa],
        [1, la],
        [6, so],
        [2, fa],
        [2, so],
        [1, la],
        [1, dol],
        [4, re],
        [1, re],
        [1, dol],
        [1, mi],
        [11, so*2]
    ]
    
    var notes = [].concat(sectionA, sectionB, sectionB, sectionA)
    
    // Chords!
    var I = [dol_, mi_, so, mi_]
      , VIII = [la_, dol_, mi_, la]
      , IV = [fa/2, la_, dol_, fa]
      , V = [so/2, ti, re/2, ti]
    
    var compA = [I, I, VIII, VIII, IV, IV, V, V, I, I, VIII, VIII, V, V, V, V]
      , compB = [IV, V, IV, V, IV, V, I, I]
      , comp = []
    ![].concat(compA, compB, compB, compA).forEach(function(chord){
        [0,1,2,3,2,1].forEach(function(i){
            comp.push([1, chord[i]])
        })
    })
    
    var tracks = [notes, comp] // 2 tracks, melody and accompaniment
    
    var audio = new Audio()
      , sampleRate = 44100
      , trackCursors = [0, 0] // for each track, the index to the current note being played
      , currNoteDur = [] // This is to get at note duration info for the current notes being played
                         // to implement note envelope tapering
      , writePos = 0     // write position for the sample data
      , taper = 5000
      
    audio.mozSetup(1, sampleRate)

    function writeBits(){
        // toWrite is the amount to write to fill up our pre-buffer
        var toWrite = audio.mozCurrentSampleOffset() + sampleRate / 2 - writePos
        if (toWrite){
            var data = new Float32Array(toWrite)
            for (var i = 0; i < toWrite; i++, writePos++){
                for (var j = 0; j < 2; j++){
                    var cursor = trackCursors[j]
                      , currNote = tracks[j][cursor]
                    if (!currNote){ // no more notes to play
                        trackCursors[j] = 0 // loop back to the start of the track
                        continue
                    }
                    var noteDur = currNoteDur[j]
                    if (!noteDur){
                        noteDur = currNoteDur[j] = [0, currNote[0] * sampleRate / 3]
                    }else if (!noteDur[1]){
                        trackCursors[j]++
                        currNoteDur[j] = null
                        break
                    }else{
                        noteDur[1]--
                        noteDur[0]++
                    }
                    data[i] += Math.sin(writePos * Math.PI * 2 * currNote[1] / sampleRate) * 0.3 *
                        ( // Note envelope tapering to prevent artifacts at note edges
                            (noteDur[0] < taper ? noteDur[0] / taper : 1) *
                            (noteDur[1] < taper ? noteDur[1] / taper : 1)
                        )
                }
            }
            audio.mozWriteAudio(data)
        }
        // Feed it data every 100ms
        setTimeout(writeBits, 100)
    }

    writeBits() // kick off the async loop

}()

Minified code: 979 bytes

function C(){a=w.mozCurrentSampleOffset()+x/2-A;if(a){b=new Float32Array(a);for(c=0;c<a;c++,A++)for(d=0;d<2;d++){e=y[d],f=v[d][e];if(!f){y[d]=0;continue}g=z[d];if(!g)g=z[d]=[0,f[0]*x/3];else{if(!g[1]){y[d]++,z[d]=null;break}g[1]--,g[0]++}b[c]+=Math.sin(A*Math.PI*2*f[1]/x)*.5*(g[0]<B?g[0]/B:1)*(g[1]<B?g[1]/B:1)}w.mozWriteAudio(b)}setTimeout(C,100)}a=523.25,b=587.33,c=659.26,d=394.995,e=349.23,f=311.13,g=220,h=246.94,i=c/2,j=a/2,k=g*2,l=[[11,a],[1,b],[12,a],[1,b],[1,c],[9,a],[1,a],[8,d],[1,d],[2,c],[1,b],[11,a],[1,b],[9,a],[2,i],[1,e],[24,d]],m=[[2,e],[2,d],[1,k],[1,a],[4,h*2],[1,a],[1,h*2],[4,k],[1,e],[1,k],[6,d],[2,e],[2,d],[1,k],[1,a],[4,b],[1,b],[1,a],[1,c],[11,d*2]],n=[].concat(l,m,m,l),o=[j,i,d,i],p=[g,j,i,k],q=[e/2,g,j,e],r=[d/2,h,b/2,h],s=[o,o,p,p,q,q,r,r,o,o,p,p,r,r,r,r],t=[q,r,q,r,q,r,o,o],u=[];![].concat(s,t,t,s).forEach(function(a){[0,1,2,3,2,1].forEach(function(b){u.push([1,a[b]])})});v=[n,u],w=new Audio,x=44100,y=[0,0],z=[],A=0,B=5e3;w.mozSetup(1,x),C()

Here's the demo page (Firefox only).

After I finished my project, I browsed the other entries in the competition and found that several of them also had an audio component. They did it by outputting a wav file as a base64 data URL and feeding it to the Audio constructor

// from Chime Hero <http://js1k.com/2012-love/demo/1265>
player = new Audio('data:audio/wav;base64,UklGRqSIFQBXQVZFZm10IBAAAAA\
BAAEARKwAAESsAAABAAgAZGF0YYCI' + btoa('\0'+track+track+track+track));
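For reference, here's my rough reconstruction of the technique (not code from any particular entry): hand-assemble the 44-byte header for an 8-bit mono PCM wav as a binary string, append the sample bytes, and base64 the whole thing into a data URL.

```javascript
// Little-endian 32-bit int as a 4-char binary string
function le32(n){
    return String.fromCharCode(n & 255, n >> 8 & 255, n >> 16 & 255, n >> 24 & 255)
}

// Build an 8-bit mono PCM wav; `samples` is a string of 8-bit sample bytes
function makeWav(samples, rate){
    var fmt = 'fmt ' + le32(16) +
        String.fromCharCode(1, 0, 1, 0) + // PCM format, 1 channel
        le32(rate) + le32(rate) +         // sample rate, byte rate
        String.fromCharCode(1, 0, 8, 0)   // block align 1, 8 bits per sample
    var data = 'data' + le32(samples.length) + samples
    return 'RIFF' + le32(4 + fmt.length + data.length) + 'WAVE' + fmt + data
}

// In a browser: player = new Audio('data:audio/wav;base64,' + btoa(makeWav(track, 44100)))
```

btoa is a browser global; the commented line shows how the result would feed the Audio constructor, just as Chime Hero does.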

Gee, I wish I'd thought of that! Well, there's always next year, and I have no regrets. This project has been a lot of fun - it let me exercise both my creative and analytical sides, and I learned a lot from it. And, by virtue of writing it up and sharing the code, I've contributed something back to the community.
