Moderator
|
Since Amarok switched to using unicode exclusively for track tags, there has been a constant stream of queries about various codepages, and how to convert the tags.
I noticed this little app on kde-apps this morning, which should help quite a few people. MP3Unicode
Please note I'm not the dev of this app, nor is it an Amarok project, so if you have any questions or problems, the best thing would be to either post on the kde-apps page or contact the author. And remember, when trying anything new, always work on a COPY of your music first - NEVER your original. Hope this helps a few of you!
"There are two theories to arguing with women. Neither one works."
. If men could get pregnant, we'd learn the true meaning of "screaming nancyboy wuss" |
Registered Member
|
Nice idea, but I wonder how the end user is supposed to know what encoding they are coming from? It would be better if the tool could handle that part and read the command as "convert from whatever to unicode". Otherwise, I do not see how it would be useful.
|
Moderator
|
This would be an issue best addressed on the kde-apps page for the utility, as mentioned in the original message. Amarok has nothing to do with this project. It's merely listed here as a resource.
If anyone has other utilities they've used that are useful, perhaps they could also post them.
"There are two theories to arguing with women. Neither one works."
. If men could get pregnant, we'd learn the true meaning of "screaming nancyboy wuss" |
KDE Developer
|
The most fun part of encodings is that it's impossible to reliably know which one is in use (which is why web browsers often still have that huge list of possible encodings).
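A minimal sketch of why this is so, using only Python's standard codecs. The helper name `clean_decodings` and the sample word are made up for illustration; the point is that the same bytes decode without any error under several single-byte encodings, so the absence of errors says little about which encoding was actually used.

```python
def clean_decodings(raw, encodings):
    """Return {encoding: text} for every encoding that decodes raw without error."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            pass  # this encoding cannot represent these bytes
    return results


# A Russian word stored as CP1251 bytes, as an old tag might contain it.
raw = "Привет".encode("cp1251")

for name, text in clean_decodings(raw, ["utf-8", "cp1251", "koi8-r", "latin-1"]).items():
    print(f"{name}: {text}")
```

Only UTF-8 is strict enough to reject these bytes; the single-byte encodings all produce some text, and only one of the results is readable.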
Amarok Developer
|
Registered Member
|
Well, there IS a way to auto-guess the encoding. I'm using a patched taglib from the rus-xmms project (yes, the project was originally meant to let XMMS recode cp1251 tags, hence the name, but it has since become something more). See http://rusxmms.sourceforge.net/ for details.
|
Registered Member
|
If you want to know how to patch lame to produce utf-8 tags, see http://amarok.kde.org/forum/index.php/t ... 972.0.html
|
Registered Member
|
Also you can try this:
(It requires mutagen, as far as I remember). |
Registered Member
|
I have many Ukrainian/Russian songs, almost none of them UTF-8 encoded.
For these songs I at least know that the language is one of the two, so I thought about the following semi-automatic approach. Try several encodings, then:
- automatically reject all that result in decoding errors
- take the remaining encodings and run each result against a Ukrainian/Russian spell checker (I normally know the language up front)
- if there are multiple solutions (several without spelling errors, or none without spelling errors), prompt for the one to be chosen (ordered by fewest spelling mistakes)
- when a song has been accepted, add all words of the song and the band name to the spell checker
It would probably be faster to just display a list of candidate decodings for each file and select which one to keep.
Assuming there's no tool doing exactly what I need, I'll probably try to write something small and not user-friendly in Python:
- character recoding is part of Python (the function unicode() and the string method encode())
- the ID3 library can be used to read and modify ID3 tags
- the enchant library could be used to communicate with a spell checker
bye N |
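The semi-automatic approach from the post above could be sketched roughly like this in present-day Python 3, where `bytes.decode()` replaces the old `unicode()` function. This is only an illustration under assumptions: the function names are invented, the tag bytes are passed in directly instead of being read with an ID3 library, and a plain set of known words stands in for a real spell checker such as enchant.

```python
def candidate_decodings(raw, encodings):
    """Decode raw with each encoding, dropping those that raise decoding errors."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            continue  # rejected automatically
    return results


def count_unknown_words(text, known_words):
    """Stand-in for a spell checker: count words missing from known_words."""
    return sum(1 for word in text.lower().split() if word not in known_words)


def rank_decodings(raw, encodings, known_words):
    """Return (unknown_word_count, encoding, text) tuples, best candidate first."""
    decoded = candidate_decodings(raw, encodings)
    return sorted(
        (count_unknown_words(text, known_words), name, text)
        for name, text in decoded.items()
    )


# Hypothetical tag bytes and a tiny word list for demonstration.
raw_title = "Привет мир".encode("cp1251")
known = {"привет", "мир"}

for misses, name, text in rank_decodings(raw_title, ["utf-8", "cp1251", "koi8-r"], known):
    print(f"{misses} unknown word(s)  {name}: {text}")
```

When several candidates tie, or none is free of unknown words, the ranked list is what would be shown to the user to pick from, and the words of an accepted title could then be added to `known`.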
KDE Developer
|
Amarok 2 now has an encoding detector built in (borrowed from Firefox) which is pretty accurate.
--
Mark Kretschmann - Amarok Developer |
Registered Member
|
+1 to news1234.
Amarok 2 does not detect East European encodings in tags (KOI8-R, KOI8-U, CP1251...) http://www.picatom.com/r/capture1-6.html
flying_stranger, proud to be a member of KDE forums since 2008-Oct.
|
Registered Member
|
Mozilla's charset detector does not detect Thai correctly either. (Many Thai songs are TIS-620 encoded.)
The Amarok team removed my beloved "manual charset selection" feature from Amarok 1.4 (or 1.3? not sure), and I think Mozilla's charset detector was not ready for use. So I tried to bring the feature back, but by guessing the charset from the locale. I now have patches for Amarok 1.4.10 and 2.0.x. These are my patches: For Amarok 1.4.10: http://linux.thai.net/websvn/wsvn/softw ... tring.diff For Amarok 2.0.x: http://linux.thai.net/websvn/wsvn/softw ... ocale.diff They work pretty well for me (for the locale I am using). I don't expect the Amarok team to accept my patches, but please kindly consider using another method for detecting the charset instead of Mozilla's charset detector. Regards, donga. |
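For readers wondering what "guessing from the locale" might look like, here is a rough Python sketch. It is not taken from the patches above: the mapping table, the function names, and the choice to try UTF-8 first are all assumptions made for illustration.

```python
# Hypothetical mapping from a locale's language code to a likely legacy tag charset.
LEGACY_CHARSET_BY_LANGUAGE = {
    "th": "tis-620",
    "ru": "cp1251",
    "uk": "koi8-u",
}


def guess_charset_from_locale(locale_name, default="latin-1"):
    """Pick a legacy charset from a locale name such as 'th_TH.UTF-8'."""
    language = locale_name.split("_")[0].split(".")[0].lower()
    return LEGACY_CHARSET_BY_LANGUAGE.get(language, default)


def decode_tag(raw, locale_name):
    """Try UTF-8 first (an assumption), then fall back to the locale's legacy charset."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(guess_charset_from_locale(locale_name), errors="replace")


# A Thai word stored as TIS-620 bytes, decoded on a Thai locale.
print(decode_tag("เพลง".encode("tis-620"), "th_TH.UTF-8"))
```

The obvious limitation is that this only helps when the tags match the user's own locale; a Thai user with Russian tags would still need manual selection or a detector.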
KDE Developer
|
If you could please send your patch for 2.0.x to amarok-devel@kde.org, we will be happy to review it. 1.4.x is no longer maintained, so we will not patch it.
--
Mark Kretschmann - Amarok Developer |
KDE Developer
|
What's wrong with your patch?
Are there other methods of charset detection?
Amarok Developer
|