A Spec-tacular Failure

Aaron, I’m glad you brought up the issue of Functional vs. Technical specifications. Here’s a clarifying blurb by Joel Spolsky on that:

http://discuss.fogcreek.com/askjoel/default.asp?cmd=show&ixPost=4202&ixReplies=8

My way of thinking is that you just don’t write “technical specs” that cover the entire functionality of an application … that’s what the functional spec is for. Once you have a complete functional spec the only thing left to document is points of internal architecture and specific algorithms that are (a) entirely invisible (under the covers) and (b) not obvious enough from the functional spec.

Not only is the ID3 spec bad, the information in most songs isn’t terribly useful.

Hurrah for The Dead Milkmen! Damn hard music to find nowadays.

Husker Du is great though.

As someone who has implemented ID3v1 and ID3v2 parsers for a commercial embedded mp3 player, I have to completely agree. ID3v2 is completely stupid. First of all, the sync-safe integers are useless when frame contents don’t need to be sync-safe. Basically, every player needs to figure out how to skip the header. That breaks the whole point of using a sync-safe tag.
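For readers who have not met sync-safe integers: each byte carries only 7 bits of payload and the high bit stays clear, so the four size bytes can never resemble an MPEG sync pattern. A minimal Python sketch of the encoding follows; the function names are made up for illustration and do not come from any particular library.

```python
def decode_syncsafe(data: bytes) -> int:
    """Decode a 4-byte sync-safe integer (7 usable bits per byte)."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    value = 0
    for byte in data:
        if byte & 0x80:
            raise ValueError("high bit set; not a sync-safe byte")
        value = (value << 7) | byte
    return value


def encode_syncsafe(value: int) -> bytes:
    """Encode an integer below 2**28 as 4 sync-safe bytes, most significant first."""
    if not 0 <= value < (1 << 28):
        raise ValueError("value does not fit in 28 bits")
    return bytes((value >> shift) & 0x7F for shift in (21, 14, 7, 0))
```

The bit shifting is small, but every reader and writer has to get it right, which is the overhead being complained about here.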

Second, was there really a problem with having the tag at the end of the file? Editing the tags (updating them, whatever), is much simpler if the tag is added at the end of the file than inserted at the beginning. Yes, there is a provision for leaving padding space when you re-write the whole file to insert the new tag. However, lots of apps don’t do it. And if they do, it’s a much bigger waste of space than storing 2-byte unicode for strings.
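To illustrate why an end-of-file tag is cheap to edit, here is a rough sketch using the ID3v1 layout (a fixed 128-byte trailer starting with "TAG"). The helper names are hypothetical, and the sketch works on in-memory bytes rather than real files; on disk the same operation is an append or an in-place overwrite of the last 128 bytes, with no need to move the audio.

```python
def _field(text: str, width: int) -> bytes:
    """Encode text as Latin-1, truncated and NUL-padded to a fixed width."""
    return text.encode("latin-1", errors="replace")[:width].ljust(width, b"\x00")


def build_id3v1(title: str, artist: str, album: str, year: str,
                comment: str = "", genre: int = 255) -> bytes:
    """Build a 128-byte ID3v1 trailer: TAG, title, artist, album, year, comment, genre."""
    tag = (b"TAG" + _field(title, 30) + _field(artist, 30) + _field(album, 30)
           + _field(year, 4) + _field(comment, 30) + bytes([genre]))
    assert len(tag) == 128
    return tag


def set_id3v1(data: bytes, tag: bytes) -> bytes:
    """Return the file contents with the ID3v1 trailer replaced or appended."""
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        return data[:-128] + tag  # overwrite the existing trailer
    return data + tag  # no trailer yet: append, audio bytes untouched
```

Inserting a tag at the front, by contrast, means rewriting everything after it unless enough padding was left by the previous writer.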

I’m sure ID3v2 was designed by someone who had good intentions. While it does work, it could have been much simpler if they had dropped the sync-safe thing and put the tags at the end of the files. Retagging thousands of files would have been a much faster operation.

Specs are useful, but only if there is also a reference implementation. That way where the spec doesn’t explain it very well you can check with the reference code what should be happening.

It also means that the spec writers have to keep themselves grounded in reality as they also have to implement it themselves.

Needless to say, the ID3v2 freaks never wrote one. They certainly never wrote one that had to try to parse all of the incompatible versions they trotted out.

The reason for putting ID3v2 at the beginning of the MP3 is to support streaming media.

As I understood it the idea was to make the entire ID3v2 header sync safe so that players which didn’t understand ID3v2 at all would just coast through it looking for the sync pattern. It took them until v2.4 to actually achieve this (I think) - before this version the block headers could look like the sync pattern.

“and put the tags at the end of the files”

This is one of the very few things in the ID3v2 spec that did make sense to me. Putting the tags at the beginning does make it faster to read them; you only need to read in the first few hundred bytes to know the name, artist, etc of the recording.
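As a sketch of the read-side advantage: the ID3v2 header is 10 bytes ("ID3", two version bytes, one flags byte, four sync-safe size bytes), so a reader learns whether a tag exists and how large it is after a single small read. The class and function names below are illustrative, not from any library.

```python
import io
from typing import BinaryIO, NamedTuple, Optional


class ID3v2Header(NamedTuple):
    major: int
    revision: int
    flags: int
    size: int  # size field as stored, decoded from 4 sync-safe bytes


def read_id3v2_header(stream: BinaryIO) -> Optional[ID3v2Header]:
    """Read the first 10 bytes of a stream; return None if no ID3v2 tag starts there."""
    head = stream.read(10)
    if len(head) < 10 or head[:3] != b"ID3":
        return None
    size = 0
    for byte in head[6:10]:
        size = (size << 7) | (byte & 0x7F)
    return ID3v2Header(head[3], head[4], head[5], size)
```

With a network stream, the same 10 bytes arrive before any audio does, which is the streaming argument made above.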

It’s much better when you’re streaming a file over a slow connection, too, but the same economies of scale apply to every device.

Writing beginning tags is of course painful, but I think the tradeoff between fast read / slow write is the correct way to go. You’ll be reading 99.9% of the time anyway so why not optimize for that case?

Once upon a time specs were developed by engineers and they were complete and clear. Then specs became holy and many people started writing specs. And lo, specs became political.

Specs generally have no explanation for the why of anything, which is really what you need. With the proliferation of video we may have a solution. Set up a video camera in front of a white board and encourage developers to expound on what they have written and why they did it that way. If you have some people who don’t participate, hold an interview in front of the camera. Walk through the code with them. Put all the recordings/interviews on a server. I know it may take more time to search all the videos for the info you need, but at least you will have a chance of finding it, rather than having to try to figure out from the code why they did something, which can be near impossible.

I generally think the spec is ok. I’ve written code to implement all the common tags, just copying any unimplemented (in my app) tags as is.

The biggest problem, though, is the difference between V2.2, V2.3, and V2.4 specs.

To call the difference between 2.2 and 2.3 a minor revision was just crazy. V2.3 should have been V3.0. Try making a base class for V2 frames that can be extended to either V2.2 or V2.3. It just isn’t practical. The headers are different; the frames are different; the names are different. The only similarity is … hmmmmm… well, I can’t think of any similarities right now.
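A rough sketch of how far apart the frame headers are, going by the published layouts: V2.2 uses a 3-character ID and a 3-byte size (6 bytes total), V2.3 uses a 4-character ID, a 4-byte size and 2 flag bytes (10 bytes total), and V2.4 keeps the V2.3 shape but stores the size as a sync-safe integer. The names below are hypothetical.

```python
from typing import NamedTuple


class FrameHeader(NamedTuple):
    frame_id: str
    size: int
    flags: int       # always 0 for V2.2, which has no frame flags
    header_len: int  # bytes consumed by the frame header itself


def parse_frame_header(data: bytes, major: int) -> FrameHeader:
    """Parse one frame header according to the tag's major version (2, 3 or 4)."""
    if major == 2:
        if len(data) < 6:
            raise ValueError("V2.2 frame header needs 6 bytes")
        return FrameHeader(data[:3].decode("ascii"),
                           int.from_bytes(data[3:6], "big"), 0, 6)
    if major in (3, 4):
        if len(data) < 10:
            raise ValueError("V2.3/V2.4 frame header needs 10 bytes")
        raw = data[4:8]
        if major == 4:
            size = 0
            for byte in raw:  # sync-safe: 7 bits per byte
                size = (size << 7) | (byte & 0x7F)
        else:
            size = int.from_bytes(raw, "big")  # plain 32-bit big-endian
        return FrameHeader(data[:4].decode("ascii"), size,
                           int.from_bytes(data[8:10], "big"), 10)
    raise ValueError(f"unsupported ID3v2 major version: {major}")
```

Note that the same four size bytes decode to different values under V2.3 and V2.4, so the "minor" revision cannot be ignored either.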

While you might call the difference between V2.3 and V2.4 a minor revision, even that is not so clear.

One more point about the idea of using current implementations as the spec rather than the spec as the spec.

I wrote my ID3 library and media manager because of the flaws and usability issues with Media Player and iTunes. Neither did a great job - both were good in some areas and terrible in others.

But I needed my MP3 files to play in my car and in my truck, to play in my home stereo, in my Media Center, on several PC’s using iTunes or Media Player (depending on the preference of various family members), my PocketPC, and so on. The only way to accomplish that is to follow the standard. My own player follows the standard and uses all the capabilities of my ID3 library; others that don’t use the standard are limited but that is their own limitation, not a limitation of my software so it doesn’t cause us any grief.

I completely agree. I’m writing a tag editor in C# and this spec is an abortion. The unsync scheme is one of the many poor choices made IMO. A better solution would be as follows:

If a tag is present, the first 3 bytes of the file will be ‘ID3’.

The next 4 bytes represent the integer size, in bytes, of the entire tag (nSize). This makes it very simple for decoders to start scanning for sync bits at offset ‘nSize’. No more bit shifting nightmares for tag editors.
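The proposal above could be sketched roughly as follows. Two details are assumptions, since the comment does not specify them: the 4-byte size is taken to be big-endian, and "entire tag" is taken to include the 7 header bytes themselves. The function names are invented for the example, and this is the hypothetical layout from the comment, not real ID3v2.

```python
import struct

MAGIC = b"ID3"
HEADER_LEN = 7  # 3 magic bytes + 4 size bytes (assumed layout)


def write_simple_tag(payload: bytes) -> bytes:
    """Prefix a payload with the proposed header; nSize covers header and payload."""
    n_size = HEADER_LEN + len(payload)
    return MAGIC + struct.pack(">I", n_size) + payload  # big-endian is an assumption


def audio_offset(data: bytes) -> int:
    """Return the offset at which a decoder should start scanning for sync bits."""
    if len(data) < HEADER_LEN or data[:3] != MAGIC:
        return 0  # no tag: audio starts at the beginning
    (n_size,) = struct.unpack(">I", data[3:7])
    return n_size
```

A decoder that knows nothing else about the tag only needs `audio_offset` to skip it.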

Hello. My name is Mitch Honnert and I’m the author of UltraID3Lib, the library mentioned in the blog entry. I know I’m late to the discussion, but I thought I’d add my two cents…

I consider myself to be a supporter of the ID3 format, but other than a few quibbles, I’d have to say I agree with most of your criticisms. I noticed, however, that your comments focus mostly on the documentation of the ID3 format rather than the format itself. While the documentation has its flaws, I think the format itself is rather good. The ID3 spec may be overkill for the vast majority of tag users, but the design allows for an incremental implementation that leaves the choice of how deep to go into the format up to the developer.

So, yes, there are some obvious problems with the specification documentation. In fact, your blog entry has inspired me to lobby for incremental updates to the existing specs. My hope is not to change the specification in any material way, but just to rewrite some of the documentation in order to avoid the problems associated with ambiguous standards.

Mitchell S. Honnert

This might sound like a little, nit-picky point, but the ID3 spec does not make it clear whether the ‘length’ field in the ID3 header should include the header itself. My assumption has always been that it does not, and Windows Media Player seems to agree with me, but I have just been handed some files where it does, and these break my code :( So who knows? - I am still trying to figure it out. If anybody has a definitive answer I’d be pleased to hear it, but I don’t think there is one. Opinions seem to vary.

But this whole sorry little tale demonstrates something about the art of writing specifications. It is important to pay attention to detail, and the inexperienced often don’t.

Paul Sanders
http://www.alpinesoft.co.uk
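For what it is worth, the ID3v2 documents on id3.org describe the size as excluding the 10-byte header, but files written the other way clearly exist. One defensive approach, sketched below as a heuristic and not as anything the spec prescribes, is to try both interpretations and accept whichever offset lands on something that looks like an MPEG frame sync (0xFF followed by a byte whose top three bits are set). The function names are hypothetical, and the sketch ignores the optional V2.4 footer.

```python
from typing import Optional

HEADER_LEN = 10


def _looks_like_mpeg_sync(data: bytes, offset: int) -> bool:
    """True if an 11-bit MPEG frame sync pattern starts at offset."""
    return (offset + 1 < len(data)
            and data[offset] == 0xFF
            and (data[offset + 1] & 0xE0) == 0xE0)


def find_audio_start(data: bytes) -> Optional[int]:
    """Guess where audio begins, tolerating both readings of the size field."""
    if len(data) < HEADER_LEN or data[:3] != b"ID3":
        return 0 if _looks_like_mpeg_sync(data, 0) else None
    size = 0
    for byte in data[6:10]:  # sync-safe size, 7 bits per byte
        size = (size << 7) | (byte & 0x7F)
    # Try the documented reading first (size excludes the header),
    # then the reading where the size already includes the header.
    for candidate in (HEADER_LEN + size, size):
        if _looks_like_mpeg_sync(data, candidate):
            return candidate
    return None
```

This can still be fooled by audio data that happens to contain a sync-like byte pair at the wrong candidate offset, so it is a fallback, not a definitive answer.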

a bit late, but interesting discussion, nevertheless. i’ve noticed the ‘length’ field is under-specified as well. i’ve mailed the author of the spec - AFAIK without result (and certainly without any response).

but anyway, i think it’s not all that bad. it takes some time to understand the spec, but at least all info can be found on www.id3.org. i’ve been developing a kind of “music / tags validator” in the sense of the “w3c html validator” - and i found out mp3’s around the net are just a disaster. tags and various rubbish are found in the middle of the file (even id3v1), ape tags are used for mp3, last frames are truncated, some encoders even calculate the mp3 header CRC the wrong way, so all frames in the whole file seem damaged :) it’s lols … i guess it’s caused by the number of mp3 encoders (both hw and sw) written …

on the other hand, i love Ogg Vorbis. tags are simple and neat … and in 200GB of music, i have no single broken Ogg. of course, Ogg could learn from mp3’s mistakes.

So MP3, Zip, AES, MD5, VST, TCP/IP, RAID, PostScript, ClearType, and Kerberos are all useless technical specs, right?

Let’s not confuse the quality of a particular spec here with the usefulness of specs in general.

And functional specs are completely different from technical specs. Functional specs are merely “cover your ass” documents when working on an important project. Whether or not they’re well-written, well-read, or help at all in making the finished product, isn’t the point - they’re merely documented proof that you delivered what you promised. Good business sense is requisite in any profession, software included, and having a spec is simply good business sense, even if it’s nothing more than a project hangnail.

Does this post mean you prefer the old days where browsers had to be lenient in parsing HTML because all the other browsers had bugs? Having an actual HTML spec to follow is bad?

I like how every piece of media software ships with “reorganize my media” and “wreck my ID3 meta with garbage data downloaded from god knows where”.

What’s worse is these same programs that require the meta to function are always what mess it up.

Another problem solved by Google. Thank god for Google Music. Although I will always miss the Zune application.