The way that music files are tagged is all wrong. There must be a better solution.
Like (I suppose) most people reading this, I have a large (and growing) collection of music files that I've (legally, of course) ripped from my CDs. The ripping software that I use writes various information about the track into data "tags" that are stored in the file. It does this by querying an online database to find out the track name, artist, album, track number and (least usefully, in my opinion) genre of the track. There are many pieces of software for manipulating these tags and I've recently been working on a series of modules that manipulate this data in a format-independent manner (so it doesn't matter whether you have an Ogg Vorbis file or an MP3 file).
But I'm becoming more and more convinced that this approach needs some work. I store all of my music files in a directory structure on a (large) hard disk that I bought expressly for that purpose. I have a directory for each artist and within that a subdirectory for each album. The album subdirectory contains the actual music files and in the artist directory I also have a number of playlist files (.m3u) which represent the albums. An m3u file is pretty unintelligent. It simply contains the names of the files that make up the album in the correct order. Most people I've spoken to have a similar set-up with minor variations.
This setup can create a number of problems. Most of them stem from the fact that the same track can appear on a number of different albums. Under my current system I need to store each track once for each album that it appears on, and that wastes space.
Of course, I don't actually need to store a track multiple times. If a track appears on more than one album, I could just store it once and reference that version of the file in the m3u files for each of the other albums that contain that track. In fact I could lose the different directories for different albums and just have a big directory containing all the tracks by each artist and just use m3u files to reconstruct each album.
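To make the "store it once, reference it many times" idea concrete, here is a rough sketch in Python. The function name, the album names and the file names are all made up for illustration; the only thing it relies on is what an m3u file is in its simplest form, a list of file names in order.

```python
from pathlib import Path


def write_album_playlist(artist_dir: Path, album: str, tracks: list[str]) -> Path:
    """Write <album>.m3u into the artist directory.

    `tracks` holds file names relative to the artist directory, in album
    order. The same file name can appear in any number of playlists, so a
    track shared between albums only needs to exist on disk once.
    (Hypothetical helper, not part of any existing tool.)
    """
    playlist = artist_dir / f"{album}.m3u"
    playlist.write_text("\n".join(tracks) + "\n", encoding="utf-8")
    return playlist
```

Calling it twice with overlapping track lists gives two albums that both point at the same file, which is exactly the arrangement that runs into the tagging problem.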
The problem with that comes down to the data tags that I mentioned earlier. When I rip a track from one album, it is tagged with the name of that album and its track number on that album. When I try to link that same track to a different album there's no way that I can include the new album information in the data tags. So when I'm playing the new album, my music player will display the wrong information for tracks that were previously ripped from other albums. You might not see that as a huge problem, but it niggles me.
The core of the problem is that the data has been modelled incorrectly. It makes no sense to try and store all of this data in a file representing the track. You actually need to push some of the data into the file that represents the album. So you need to store the album name and the list of tracks in the m3u file for that album and remove the album name and the track number from the track file.
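As a minimal sketch of that model (in Python, with class and field names that are purely my own invention and not any existing tag or playlist format): the track knows only about itself, the album owns its name and the running order, and the track number falls out of the track's position in the album.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Track:
    # Only data that belongs to the recording itself lives here.
    title: str
    artist: str
    path: str


@dataclass
class Album:
    # The album name and the running order live with the album.
    name: str
    tracks: list[Track] = field(default_factory=list)

    def display(self, track: Track) -> str:
        """Text a player could show for `track` in the context of this album."""
        number = self.tracks.index(track) + 1  # position on *this* album
        return f"{self.name} #{number}: {track.artist} - {track.title}"
```

With this split, one Track object can sit in two Album objects and be displayed with a different album name and track number in each, without anything in the track itself changing.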
When you think about it, that's a much better way of doing it. In the general case, a track isn't associated with a particular album so storing that data in the track file really doesn't make much sense. It's as though the format was designed by someone who didn't understand data modelling[1]. I'm going to think about this a bit more over the next few days and see how easy it would be to implement it. Of course, the current standard is implemented in huge amounts of existing software, so getting a new standard implemented anywhere might be a bit of a bugger. We can try though.
[1] Something I've seen a few examples of recently - but that's a rant for another day.
