Or, Processing the Sound Data of AIFF Files in AS3.

Finally! The main course. So far, we’ve loaded the bytes of an AIFF and started reading its chunks. We’ve focused on the COMM chunk previously, but now we’re ready for the SSND chunk. Once we’re in the SSND chunk, reading the sample data is really straightforward. You do need to be aware that there are eight extra bytes of information that are rarely used but will otherwise throw off your samples, and that you need to read the data one way or another depending on the number of channels and the bit depth (also known as the sample size in the AIFF spec).

The extra info: There are two pieces of extra information that, according to the documentation, are hardly ever used and should just be set to 0. But they exist, so you need to account for them. They are both unsigned 32-bit integers, so anything that moves the pointer ahead by 8 bytes will get you from the beginning of the chunk data to the beginning of the actual sample data. You may choose to store this information, in which case you can get the actual values using readUnsignedInt() twice, once for the offset and again for the blockSize. I’ll leave it to you to research the uses of these two bits of data.

Now, we’re ready to read the samples! The only catch left is to know how to read the samples. If the sound is a typical 16-bit sound, then you can use readShort() to get a sample’s value. If it’s an 8-bit sound, then you can use readByte() to get the value. If it’s a 24-bit sound…then I’d love some help in getting the value. We run into the same issue with the sample rate in the common chunk…ByteArray doesn’t give us a “read24BitInt()” method. I’m sure we can readByte() and then readShort() to read a full 24 bits, and then bit shift our way to a value. But for now I’m not worrying about that. I’d love to have that capability, so anyone reading want to lend a hand?
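Until someone chimes in with something better, here is a hedged sketch of how that bit shifting could work. The helper name is made up (ByteArray has no read24BitInt()), and plain array indexing stands in for the ByteArray reads — in AS3 you’d call readUnsignedByte() three times (or readUnsignedByte() followed by readUnsignedShort()) and combine the results the same way:

```javascript
// Sketch of a hypothetical "read24BitInt": assemble a signed, big-endian
// 24-bit value from three bytes, then sign-extend it.
function read24BitInt(bytes, pos) {
	// Combine three big-endian bytes into an unsigned 24-bit value.
	var value = (bytes[pos] << 16) | (bytes[pos + 1] << 8) | bytes[pos + 2];
	// Sign-extend: if bit 23 is set, the value is negative.
	if (value & 0x800000) {
		value -= 0x1000000;
	}
	return value;
}
```

The matching divisors would then be 8388608 for negative samples and 8388607 for positive ones, following the same pattern as the 8- and 16-bit cases below.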

The other thing to be aware of is that we have to know how many channels the sound has, and read a sample once for each channel. That is, if it’s a mono sound, we just read each sample in turn. But if it’s stereo, then we need to read a sample, and that’s the left channel, and then read the next sample, and that’s the right channel that aligns with the left channel we just read. The same goes for more channels.

Incidentally, this is what a “sample frame” is, which was one of the pieces of data in the common chunk. A single sample frame represents a single sample of sound. But a sample frame can contain one to many channels of samples. A stereo sample frame has two samples in it. A 5.1 surround sound sample frame has six samples in it.
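To make the interleaved layout concrete, the byte position of any given frame is just arithmetic on the COMM values (the function and its names are illustrative, not part of the spec):

```javascript
// Sketch: byte offset of sample frame i within the SSND sample data,
// given the channel count and sample size from the COMM chunk.
function frameOffset(frameIndex, numChannels, sampleSizeBits) {
	var bytesPerSample = sampleSizeBits / 8;      // e.g. 16-bit = 2 bytes
	var frameSize = numChannels * bytesPerSample; // one sample per channel
	return frameIndex * frameSize;
}
```

For example, frame 100 of a 16-bit stereo file starts 400 bytes into the sample data.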

For our purposes, I’m ignoring anything with more than two channels, and treating mono files as stereo files with the same samples in both channels.

Here’s a processSamples() function that stores Sample objects (which are simple data objects that contain an index and the left and right sample values) in a samples Array. There is some optimization going on, which makes things a little confusing, but basically we need to do one thing or another depending on the bit depth. I’m figuring that stuff out before the loop, rather than putting that conditional inside the loop. Something else I’m doing is sort of “neutralizing” the bit depth by dividing the actual sample value by the max value to get a number between -1 and 1 and storing that value. The trick, though, is that the max value differs not only between bit depths, but between positive and negative values.

var offset:uint;
var blockSize:uint;

var samples:Array;

function processSamples():void {
	// Read those two 4-byte unsigned integers that aren't really used.
	offset = binaryData.readUnsignedInt();
	blockSize = binaryData.readUnsignedInt();
	// Set up some "constant" vars that only need to be determined once.  readSample is a reference to a method
	// on the ByteArray.  Depending on the bit depth, we need to either readByte() or readShort().  It will always
	// be the same method to call, so rather than testing the bit depth in every loop iteration, we're testing it
	// once and storing a reference to it in a local variable.  You'll see it being used in the loop below.
	samples = new Array();
	var readSample:Function;
	var negativeDivisor:uint;
	var positiveDivisor:uint;
	switch (sampleSize) {
		case 8:
			readSample = binaryData.readByte;
			negativeDivisor = 128;
			positiveDivisor = 127;
			break;
		case 16:
			readSample = binaryData.readShort;
			negativeDivisor = 32768;
			positiveDivisor = 32767;
			break;
		default:
			throw new Error("Resolutions other than 8 and 16 are not supported.");
	}
	// Declare the vars that we'll be reusing in the loop.
	var peak:int, left:int, right:int, peakPercent:Number, leftPercent:Number, rightPercent:Number, divisor:int;
	var sample:Sample;
	// Loop over all of the sampleFrames.
	for (var i:Number = 0; i < sampleFrames; i++) {
		switch (channels) {
			case 1:
				// Read the data from the ByteArray.
				peak = readSample();
				// Determine the amount to divide by, pending sign.
				divisor = (peak < 0) ? negativeDivisor : positiveDivisor;
				// Divide the ByteArray value by the appropriate amount we just determined.
				peakPercent = peak / divisor;
				// Samples are "stereo," so just pass the mono value into both channels.
				sample = new Sample(i, peakPercent, peakPercent);
				break;
			case 2:
				left = readSample();
				right = readSample();
				divisor = (left < 0) ? negativeDivisor : positiveDivisor;
				leftPercent = left / divisor;
				divisor = (right < 0) ? negativeDivisor : positiveDivisor;
				rightPercent = right / divisor;
				sample = new Sample(i, leftPercent, rightPercent);
				break;
			default:
				throw new Error("Multi-channel AIFFs not supported.");
		}
		samples.push(sample);
	}
}

For reference, here is an implementation of the Sample class:

package {
	public class Sample {
		private var _index:uint;
		private var _left:Number;
		private var _right:Number;
		public function Sample(index:uint, left:Number, right:Number) {
			_index = index;
			_left = left;
			_right = right;
		}
		public function get index():uint			{ return _index; }
		public function get left():Number			{ return _left; }
		public function get right():Number			{ return _right; }
	}
}

Lastly, I need to admit that I lied. There is one final pitfall: On AIFF files of any significant size, the above loop will probably cause the Flash Player to hang and time out. This post is already too long, so I won’t get into details, but I got around this by processing increments of data in a TimerEvent. Obviously, this meant dispatching a COMPLETE event, along with PROGRESS events for feedback, and setting up a Timer and storing considerably more information that needs to persist between iterations of the Timer. It complicates things, but it’s sort of a tangential problem and not directly related to the technicalities of how to read an AIFF file. It is worth noting, though, that virtually every audio program I have takes a few seconds to load a sound and draw its waveform, so this is pretty much something to expect. Of course, it seems that ActionScript 3 takes significantly longer than, say, your typical C-based audio program. But it’s still pretty amazing to be able to do this at all in Flash.
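For the curious, the general shape of that workaround is a resumable “step” function that handles a fixed number of frames per call, with a Timer driving it until everything is processed. This is only a sketch of the idea, not my actual implementation, and the names and batch size are illustrative:

```javascript
// Sketch: process frames in batches so no single tick blocks long enough
// to hang the player.  Each call to step() handles at most batchSize
// frames and returns true once every frame has been processed.
function makeBatchProcessor(totalFrames, batchSize, processFrame) {
	var next = 0;
	return function step() {
		var end = Math.min(next + batchSize, totalFrames);
		while (next < end) {
			processFrame(next++);
		}
		return next >= totalFrames;
	};
}
```

In AS3, a Timer’s TIMER handler would call step() each tick, dispatching a PROGRESS event along the way and a COMPLETE event (and stopping the Timer) once step() returns true.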

In the next exciting episode, I’ll talk about how best to draw a waveform from the sample data that we’ve just gathered.