renaming tags with unicode creates strange id3 tags

news1234
Hi,

This is some quite strange behaviour:


I had some Russian music with bad ID3 tags (encoded in cp12... or KOI8-R),
so they were unreadable.

The filenames, however, were already encoded in UTF-8.


So here is what I did:

I opened a shell window and ran ls on my audio file.

In Amarok I clicked on Properties and copied and pasted the song title from the shell window.
Everything looked fine: I had a nice Cyrillic song title.

Problem:
Amarok modifies the ID3 tag (which I expected).
However, the result is completely unreadable and definitely not UTF-8 (nor any other Cyrillic encoding that I know of).

I have the impression that Amarok stores garbage in the ID3 tag, but stores the good name in the MySQL database and displays
the title from there.

Is this a known problem?
news1234
One piece of information that I forgot.

If I change a song title to one containing only ASCII characters, then no garbage is displayed, just the name that I entered.


Just to give you an example:

The Cyrillic band name DDT (if changed with Amarok)
displays as >@>=K in my terminal.
If I change it with id3, it displays correctly as ДДТ (Cyrillic capital D, D, T).
seeker5528
You have to have the right language files for the command line and for X/KDE. If the file names are UTF-8 and show up correctly both in a terminal window and in the file manager (Dolphin or Konqueror), you should be able to copy and paste the file name into the tag.

You can do this from a terminal window to the tag window in Amarok, but the file name and path should also be displayed in the tag dialog, so you could copy the name from there and paste it into the tag field. I do get the odd character here and there that shows up as a square box. I'm not sure what's up there, since when these tracks get scrobbled to Last.fm and I view the recently played tracks in my Last.fm profile, the names are displayed correctly in Firefox. That indicates to me that the characters are represented correctly in the tag.

With non-English tracks I download from Last.fm, the file names seem to be Unicode but the tags show all question marks. Since I don't have these keys on my keyboard, copy and paste is the only way I have to deal with it. These files are a range of Japanese, Chinese, Polish, Greek, Russian, etc., and only a few characters are displayed as a box instead of the correct character; these all seem to be characters with some kind of accent mark or something similar above and/or below the main body of the character. I have also downloaded files from other places that had an odd grouping of characters where an accented character should be. But as long as the name shows correctly in the file name or download location, or I can find a place on the internet that shows me the name correctly, copy and paste always works for me, whether it's from the file listing to a tag field in the tag dialog box, from the command line to the tag dialog box, or from Firefox to the tag dialog box.

If this is not your native language, I don't think you need the full language support, just the right fonts. Depending on your distribution, though, there may be some metapackages you can install that will pull in the other files needed to support the language.

Later, Seeker
news1234
Hi Seeker, thanks for your answer.

I have language support installed (for the file manager and the terminal).
If I copy and paste the file name from a terminal into Amarok, I see the correct name in Amarok, no problem.
However, if I display the tag again on the terminal command line, it is broken.
If I apply the file name as the ID3 tag from the command line, then the tag is fine.
However, Amarok ignores the ID3 change; it seems to have cached this data somehow and doesn't re-read the ID3 tag.
So I'm still at a loss there.


Did you ever try to check the ID3 tags that you renamed with Amarok using another program, for example 'id3 -l filename.mp3'?

I'd just be curious whether anybody else can observe this problem.
seeker5528
news1234 wrote: If I apply the file name as the ID3 tag from the command line, then the tag is fine.
However, Amarok ignores the ID3 change; it seems to have cached this data somehow and doesn't re-read the ID3 tag.
So I'm still at a loss there.


You have to get Amarok to re-read the tags. Some small change in a tag field that is not being used would do it. The menus also have an update option that looks for new or moved files, which might do it, but I think that isn't enough and a full rescan of the library would be needed.

news1234 wrote: Did you ever try to check the ID3 tags that you renamed with Amarok using another program, for example 'id3 -l filename.mp3'?


Other programs recognize it. I don't have this id3 program installed, but looking in Synaptic on Ubuntu I see there is an id3 program whose description doesn't say which tag versions it supports, and an id3v2 program that handles version 2 tags or converts from v1 to v2. Based on that, I would guess that the id3 program you are using at the command line only handles v1 tags, resulting in the discrepancies you are seeing between what it shows and what Amarok shows.
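
If you want to check which tag versions a file actually carries without relying on either tool, the marker bytes are easy to look for. This is only a rough sketch, assuming the usual layout (an ID3v2 tag at the very start of the file beginning with "ID3", and an ID3v1 tag in the last 128 bytes beginning with "TAG"). The function name id3_versions is made up for illustration, and the demo runs on a synthetic file, not a real MP3.

```python
import os
import tempfile


def id3_versions(path):
    """Return which ID3 tag versions appear to be present, judged by marker bytes."""
    found = []
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # An ID3v2 tag normally sits at the start of the file and begins with b"ID3".
        if f.read(3) == b"ID3":
            found.append("ID3v2")
        # An ID3v1 tag is the final 128 bytes of the file and begins with b"TAG".
        if size >= 128:
            f.seek(-128, os.SEEK_END)
            if f.read(3) == b"TAG":
                found.append("ID3v1")
    return found


# Synthetic stand-in for an MP3: ID3v2 marker, filler, then a 128-byte ID3v1 block.
with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as tmp:
    tmp.write(b"ID3" + b"\x00" * 200 + b"TAG" + b"\x00" * 125)

versions = id3_versions(tmp.name)
os.remove(tmp.name)
print(versions)
```

A file that reports both versions can show different text depending on which tag a given program reads, which would fit the discrepancy described above.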

Later, Seeker
news1234
Hi Seeker,

Thanks again for your answer. It put me on the right track.
I'm now using the command line tool eyeD3 (which reads v2 and v1 tags) instead of the id3 program, which, as you assumed, reads only v1 tags.

What happens is the following:

If I change an artist or a title in Amarok to a new name containing only ASCII characters, then Amarok changes the ID3v1 and ID3v2 tags correctly.

If, however, the new name contains Cyrillic characters, then the ID3v2 tag is correct (readable on a UTF-8 terminal with eyeD3) and the ID3v1 tag is broken (reading with eyeD3 -1 and with id3).
I tried setting the terminal encoding to UTF-8, KOI8-R, KOI8-U, CP1251 and some others, but the ID3v1 tags remain wrong. For example, the band name DDT (in Cyrillic) becomes just a single double quote (").
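
One possible explanation for the lone double quote, offered as an assumption and not confirmed against Amarok's tag-writing code: if the ID3v1 field is written as 8-bit text by keeping only the low byte of each Unicode code point, then Cyrillic letters (U+0400 to U+04FF) collapse into ASCII and control bytes, and no terminal encoding can bring them back. The sketch below simulates that; the helper names low_byte_truncate, printable_part and restore_cyrillic are hypothetical.

```python
def low_byte_truncate(text):
    """Simulate a lossy 8-bit write that keeps only the low byte of each code point."""
    return bytes(ord(ch) & 0xFF for ch in text)


def printable_part(raw):
    """Return only the printable ASCII bytes, roughly what a terminal would show."""
    return "".join(chr(b) for b in raw if 0x20 <= b < 0x7F)


def restore_cyrillic(garbled):
    """Undo the truncation, assuming every character came from the U+0400 block."""
    return "".join(chr(0x0400 + ord(ch)) for ch in garbled)


ddt = "\u0414\u0414\u0422"  # the band name in Cyrillic capitals
raw = low_byte_truncate(ddt)

print(raw)                  # two control bytes (0x14) followed by 0x22
print(printable_part(raw))  # only the double quote is visible
# Reversing the same mapping on the ">@>=K" string quoted earlier in the thread
# gives five lowercase Cyrillic letters (U+043E U+0440 U+043E U+043D U+044B).
print(restore_cyrillic(">@>=K") == "\u043e\u0440\u043e\u043d\u044b")
```

Under this assumption both symptoms reported in the thread (the single " and the >@>=K string) are consistent with truncated code points, not with a different Cyrillic code page.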

So I'll now use eyeD3 as my command line tool and will switch in Python from ID3 to eyeD3.

bye

N
tifff
I'm confused too (Amarok 1.4.10 on Intrepid): I imported a commercial MP3 data CD to hard disk and the ID3 tags showed up correctly in Amarok. But because TPE1 (Artist) was in uppercase, I wanted to change it to mixed case. I used id3v2 --TPE1 "New Artist" file and was surprised: the TPE1 tag just written was OK, but most others (TPE2, TALB, TIT2, TSRC) showed up as "Chinese" in Amarok from then on, while id3v2 --list still showed them correctly. My solution (as I used awk to perform the translation to mixed case over all files): I rewrote all tags, and then the display was OK in Amarok.
Is there a problem when changing only one tag value? Or do other programs allow (my assumption) different encodings for different tags?
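#!/bin/sh
# $Id: id3case.sh,v 1.2 2009-01-26 16:22:25 horst Exp $
# Rewrites the ID3v2 text frames of one MP3 file with id3v2, converting
# the artist, band, album and title frames to mixed case.
if [ ! -w "$1" ]; then
  echo "Alters tags using script"
  echo "Current setting: Title, Artist and Album tags converted to mixed case"
  echo "Usage: $0 file.mp3"
  echo "  e.g."
  echo "find somedir -name '*.mp3' -exec id3case.sh {} \;"
  exit 1
fi

# Note: tag values and the file name are passed to the shell inside double
# quotes, so names containing ", $ or backticks are not handled.
id3v2 -l "$1" | awk -v "file=$1" '
# Artist, band, album and title: capitalise each word, lower-case the rest.
/^(TPE1|TPE2|TALB|TIT2)/ {
  tag=$1
  sub(/^[^:]*: /, "")   # strip the "FRAME (description): " prefix
  for (i=1; i <= NF; i++) {
    fc=toupper(substr($i, 1, 1))
    rc=tolower(substr($i, 2))
    $i=fc rc
  }
  print "--" tag " " $0
  system("id3v2 --" tag " \"" $0 "\" \"" file "\"")
}
# ISRC and track number: rewritten unchanged so that id3v2 writes every frame.
/^(TSRC|TRCK)/ {
  tag=$1
  sub(/^[^:]*: /, "")
  print "--" tag " " $0
  system("id3v2 --" tag " \"" $0 "\" \"" file "\"")
}'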
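
On the question of different encodings for different tags: in ID3v2 every text frame (TPE1, TALB, TIT2, ...) starts with its own encoding byte, so the frames inside one tag can legitimately use different encodings. Text that looks "Chinese" is what you typically get when 8-bit text is decoded as UTF-16. Whether that is what happened between id3v2 and Amarok 1.4.10 here is only a guess. The sketch below shows the mechanism on hand-built frame bodies; decode_text_frame is a hypothetical helper, not part of any tagging library.

```python
# Encoding byte -> Python codec, following the ID3v2.3/2.4 text frame layout.
ENCODINGS = {
    0: "latin-1",    # ISO-8859-1
    1: "utf-16",     # UTF-16 with byte order mark
    2: "utf-16-be",  # UTF-16BE without BOM (ID3v2.4 only)
    3: "utf-8",      # UTF-8 (ID3v2.4 only)
}


def decode_text_frame(body):
    """Decode a text frame body: one encoding byte followed by the text."""
    codec = ENCODINGS[body[0]]
    return body[1:].decode(codec).rstrip("\x00")


# Two frames from the same imaginary tag, each with a different encoding byte.
artist = decode_text_frame(b"\x00New Artist")                  # Latin-1 frame
album = decode_text_frame(b"\x01" + "Album".encode("utf-16"))  # UTF-16 frame with BOM

# Misreading 8-bit text as UTF-16 pairs up the bytes into CJK-looking code points.
misread = b"New Artist".decode("utf-16-le")
looks_cjk = [0x4E00 <= ord(ch) <= 0x9FFF for ch in misread]

print(artist, album, looks_cjk)
```

Rewriting all frames with one tool, as the script above does, would make every frame use that tool's encoding, which may be why the display recovered.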

