Feature Request: Duplicate Song Recognition

bad wolf
Registered Member
I'm not sure if this is a feature that's already been discussed or even in development, but I thought I'd suggest it anyway. One thing I noticed is that occasionally, if I have albums from an artist and then a compilation of some kind, I end up with 2 or 3 pretty much identical versions of the same song. Take Gnarls Barkley, for example: I have both the single Crazy and the album St. Elsewhere, so I have the song Crazy twice in my collection (this is in Amarok 1.x). This means that the two tracks have separate ratings and play counts.

My suggestion is to make Amarok detect duplicate songs like this. For example, such a feature would scan the collection, identify songs that have the same (or very similar) tag data (Title, Artist...) and then analyse the audio to see if it sounds the same (can that even be done?).
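
To make the tag-scanning half of that concrete, here is a rough sketch in Python. It is not Amarok code and the track dictionaries are made up for illustration: it simply groups tracks whose artist and title are identical once case, punctuation and stray whitespace are ignored. The audio comparison step is left out entirely.

```python
import re
from collections import defaultdict

def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())

def find_tag_duplicates(tracks):
    """Group file paths by normalized (artist, title); keep groups of 2+."""
    groups = defaultdict(list)
    for track in tracks:
        key = (normalize(track["artist"]), normalize(track["title"]))
        groups[key].append(track["path"])
    return {key: paths for key, paths in groups.items() if len(paths) > 1}

# Hypothetical collection entries (paths and the third track are invented).
collection = [
    {"artist": "Gnarls Barkley", "title": "Crazy", "path": "single/crazy.flac"},
    {"artist": "gnarls barkley", "title": "Crazy ", "path": "album/crazy.mp3"},
    {"artist": "Gnarls Barkley", "title": "Some Other Song", "path": "album/other.mp3"},
]

duplicates = find_tag_duplicates(collection)
for (artist, title), paths in duplicates.items():
    print(artist, "-", title, "->", paths)
```

Anything this finds would only be a candidate list to show the user, not something to act on automatically.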

It would then suggest these duplicate entries to the user and, after approval, 'link' the separate files in the Amarok database. What I mean by this is that the actual files would remain where they are. Say that in my previous example I had encoded the single CD in FLAC but the album in MP3, and then 'linked' the song Crazy. Both songs would share the same rating and play count for a start, but if I put the album into the playlist and let it play, then when it reached the linked song it would automatically play the best quality version available, in this case the version from the single.

I know this is hardly a critical feature, or even much of an inconvenience, but I thought I might as well suggest it. What does everyone think?
Registered Member
I would like to see this too. There has already been some effort to calculate the audio fingerprint and use that to get the correct tags, so it doesn't seem like that big of a jump to additionally store the result of the fingerprint calculation in the database, or in the comment section of the tag. I'm not sure what Picard does, but if Picard puts it into the tag somewhere, it seems like it would be good to do the same thing as Picard. I would like to see more widespread and reliable use of audio fingerprints, for more file types, somewhere along the way in the 2.x series.

When it comes to the actual dupe finding though, I would want some kind of interface that shows a side-by-side comparison of each file (file type, bitrate/quality level, file size, etc.) and, if possible, some kind of percentage indicator of how well the two songs match each other. That's a little more work, and hands seem like they will be full enough for the next couple of releases: just getting 2.0 out the door, then working on the 2.0.x cleanups and refinements.

If we do get the dupe stuff, it would be a nice bonus feature if we could see some percentage indicator of how well each file matches up to its respective entry in MusicBrainz for the release it claims to be part of, similar to the way Picard shows it.

Later, Seeker
Registered Member
Idea 2:

Since Amarok 2 has a big center area to display stuff.....

Create a plasmoid that will display all the relevant song data when you mouse over a track in the right or left pane. Relevant data being artist, track name, album, filetype, bitrate, file size, path to file, (more?).

Then, when potential dupes are searched for and identified, they could be loaded into a playlist grouped by artist and track name, and you would just need to mouse over each track listing and compare what gets displayed in the center pane to see which to keep.

Later, Seeker
Registered Member
I would really like to see this feature too.

But I don't think you need to look at the signature of the sound.  From my experience, you could just look for songs with similar titles, similar artist name and same length.  If all of those things are the same or very similar, you could remove the lower quality (smaller file size?) file from listings immediately.  If they're just very similar, you could ask the user.
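
A minimal sketch of that heuristic in Python, just to show the idea. The 0.85 similarity threshold, the 2 second length tolerance and the example tracks are all assumptions picked for illustration, not anything Amarok does. It uses `difflib.SequenceMatcher` from the standard library to score how similar two strings are.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0.0-1.0 similarity score for two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def classify_pair(a, b, max_length_diff=2, threshold=0.85):
    """Classify two tracks as 'duplicate', 'ask user' or 'different'."""
    # Alternate versions usually differ in length, so check that first.
    if abs(a["length"] - b["length"]) > max_length_diff:
        return "different"
    title = similarity(a["title"], b["title"])
    artist = similarity(a["artist"], b["artist"])
    if title == 1.0 and artist == 1.0:
        return "duplicate"
    if title >= threshold and artist >= threshold:
        return "ask user"
    return "different"

# Made-up tracks; lengths are in seconds.
single = {"artist": "Gnarls Barkley", "title": "Crazy", "length": 180}
album = {"artist": "Gnarls Barkley", "title": "Crazy", "length": 181}
typo = {"artist": "Gnarls Barkly", "title": "Crazy", "length": 180}
remix = {"artist": "Gnarls Barkley", "title": "Crazy", "length": 240}

print(classify_pair(single, album))  # same tags, near-identical length
print(classify_pair(single, typo))   # artist tag slightly off
print(classify_pair(single, remix))  # same tags, much longer track
```

The scores could also double as the percentage match indicator mentioned earlier in the thread.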
bad wolf
Registered Member
Also, sometimes you can get alternate versions of songs with slightly different parts to them, although they usually have slightly different lengths. But it must be made very easy (a few clicks) to undo any auto-pairing done by Amarok.
I think this is a horrible idea. I like the idea of cleaning out alternate versions of songs, but since I listen to my music album by album, deleting a song from my collection without my permission is a horrible idea. I spent over a year going through every song I had to decide whether I wanted it, and fixing every MP3 ID3 tag so it was right. I think it's bad policy to add anything that deletes without user permission. Now if it asked, song by song, before deleting, sure, that would be fine. But I think Amarok just happened to be the best program for fixing all your tags; for damn sure it wasn't by Mark's design, or else it would be important in Amarok 2. So if you want a program that makes it easy to fix your tags, use Amarok 1.*. But don't ask for Amarok to delete songs. I would yell so much if Amarok started deleting songs without my permission! It isn't that hard really; how many songs do you have?
bad wolf
Registered Member
qurk wrote: I think this is a horrible idea. I like the idea of cleaning out alternate versions of songs, but since I listen to my music album by album, deleting a song from my collection without my permission is a horrible idea. I spent over a year going through every song I had to decide whether I wanted it, and fixing every MP3 ID3 tag so it was right. I think it's bad policy to add anything that deletes without user permission. Now if it asked, song by song, before deleting, sure, that would be fine. But I think Amarok just happened to be the best program for fixing all your tags; for damn sure it wasn't by Mark's design, or else it would be important in Amarok 2. So if you want a program that makes it easy to fix your tags, use Amarok 1.*. But don't ask for Amarok to delete songs. I would yell so much if Amarok started deleting songs without my permission! It isn't that hard really; how many songs do you have?

Hey now, I never said anything about deletion. I do agree that letting a program automatically delete data is a very stupid idea indeed, but that's not what I suggested. Although I do see how I could have been misunderstood easily - let me try to explain what I mean.

Say, for example, you heard a song that you like, so you bought the single with it on. But later that same song was released on an album and you happened to get that album too, for whatever reason. If you rip both CDs to your Amarok collection you now have two versions of the same song. Now there's nothing structurally wrong with this, but the play count for both tracks will be wrong, and you have to rate/score each track individually.

I'm suggesting a feature where Amarok would look for songs like this: tracks that exist multiple times in its database, found by scanning for tags that are the same. Assume that Amarok did this and found these two songs from the single and the album. It would then, with permission from the user, 'link' them, meaning it would choose one of them as the preferred file on the disk. The choice may be random if both files are identical, or if one is better quality it will prefer that one. It will also synchronise the play count, rating and other data. Essentially Amarok will treat the two tracks as one, because they are the same: the same rating, the same play count.

Also, if in this example you ripped the album to Ogg but the single to FLAC, it would prefer the FLAC version. So if you played the album it would automatically play back the FLAC single version when it reached that track. At no point would it actually delete anything. All these changes would exist purely in the way Amarok handled its database of the files.
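
Roughly, the linking could look like the Python sketch below. This is only an illustration under my own assumptions: the format ranking, the choice to add up play counts and keep the highest rating, and the example numbers are all made up, and none of it reflects Amarok's real database. Note that nothing here touches the files themselves.

```python
# Assumption: lossless formats rank above lossy ones; bitrate breaks ties.
FORMAT_RANK = {"flac": 2, "ogg": 1, "mp3": 1}

def link_tracks(tracks):
    """Build one shared database entry for several copies of the same song."""
    preferred = max(
        tracks,
        key=lambda t: (FORMAT_RANK.get(t["format"], 0), t["bitrate"]),
    )
    return {
        "files": [t["path"] for t in tracks],                # every copy stays on disk
        "preferred": preferred["path"],                      # the copy that gets played
        "playcount": sum(t["playcount"] for t in tracks),    # assumed merge rule
        "rating": max(t["rating"] for t in tracks),          # assumed merge rule
    }

# Made-up entries for the single and album copies of the same song.
single = {"path": "single/crazy.flac", "format": "flac", "bitrate": 900,
          "playcount": 12, "rating": 4}
album = {"path": "album/crazy.ogg", "format": "ogg", "bitrate": 192,
         "playcount": 30, "rating": 5}

linked = link_tracks([single, album])
print(linked["preferred"], linked["playcount"], linked["rating"])
```

Undoing a link would just mean dropping the shared entry and going back to the per-file records, so those would need to be kept around.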

It could maybe be configured to suggest redundant files for deletion by the user to save disk space, even delete them for you if you wanted it to. But never without direct authorization from the user.