
OFX import not able to handle German umlauts (ü ö ä Ä Ü Ö ß)

grntbn
Registered Member
Hi,

In the past I used MS Money (99 German and Sunset), and I now want to switch to KMyMoney for Windows in German. I'm therefore currently testing version 4.8.1.1, set to German.

Now I have a problem with the German umlauts (ü ö ä Ä Ü Ö ß): when I import bank data via OFX, they show up as unreadable characters, for example "�" instead of "Ö".
• In MS Money these characters import without any problem from the same OFX file, so the OFX file is OK.
• Typing these characters manually into KMyMoney works fine.

Does anybody have an idea what the problem could be?
ipwizard
KDE Developer
Ah, my old friend encoding is in town again ;) . Can you provide us with the first few lines of your OFX file? They should look something like the following; the ENCODING and CHARSET lines are the ones that matter to us.

OFXHEADER:100
DATA:OFXSGML
VERSION:102
SECURITY:NONE
ENCODING:USASCII
CHARSET:1252
COMPRESSION:NONE
OLDFILEUID:NONE
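
For anyone who wants to check these two fields without opening the file in an editor, here is a minimal Python sketch. It only assumes the header is a run of KEY:VALUE lines ending at the first blank line or tag, as in the sample above. The function name read_ofx_v1_header is made up for this illustration; it is not how KMyMoney or libofx read the header.

```python
def read_ofx_v1_header(text):
    """Collect KEY:VALUE lines up to the first blank line or the first tag."""
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("<"):
            break  # end of the plain-text header block
        key, sep, value = line.partition(":")
        if sep:
            header[key] = value
    return header


# Sample header copied from the post above.
sample = """OFXHEADER:100
DATA:OFXSGML
VERSION:102
SECURITY:NONE
ENCODING:USASCII
CHARSET:1252
COMPRESSION:NONE
OLDFILEUID:NONE

<OFX>
"""

header = read_ofx_v1_header(sample)
print(header["ENCODING"], header["CHARSET"])
```

Files in the newer XML flavour have no such block; there the encoding is declared in the XML declaration on the first line instead.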


ipwizard, proud to be a member of the KMyMoney forum since its beginning. :-D
openSuSE Leap 15.4 64bit, KF5
grntbn
Hi,

these are the first lines of my OFX file.

<?xml version="1.0" encoding="utf-8" ?>
<?OFX OFXHEADER="200" VERSION="202" SECURITY="NONE" OLDFILEUID="NONE" NEWFILEUID="NONE"?>
<OFX>
<SIGNONMSGSRSV1>
<SONRS>
<STATUS>
<CODE>0
</CODE>
<SEVERITY>INFO
</SEVERITY>
</STATUS>
<DTSERVER>30180510142630.588
</DTSERVER>
<LANGUAGE>DEU
</LANGUAGE>
ipwizard
Then it seems your OFX file is lying (in other words: your bank screws up). Here's why: the file states in its very first line that the content is encoded in UTF-8

<?xml version="1.0" encoding="utf-8" ?>

and then later on uses ISO-8859-1 or ISO-8859-15. This usually causes those � to show up in place of e.g. an umlaut, because in ISO-8859-1 the umlaut is encoded as one byte but in UTF-8 as two bytes. Let's take the letter 'ä' (lowercase a-umlaut):

In ISO-8859-1 this is encoded as one byte with the value of 0xE4. In UTF-8 the same character is encoded as two bytes with the value 0xC3, 0xA4.
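
The byte values are easy to verify in Python. The sketch below is only an illustration: the word "Käse" and the helper name decode_as_declared are made up for the example, and Python's decoder stands in for whatever decoder the importer actually uses.

```python
def decode_as_declared(raw: bytes, declared: str) -> str:
    """Decode raw bytes with the declared encoding, marking bad bytes with U+FFFD."""
    return raw.decode(declared, errors="replace")


latin1_bytes = "ä".encode("iso-8859-1")
utf8_bytes = "ä".encode("utf-8")
print(latin1_bytes.hex(), utf8_bytes.hex())  # e4 c3a4

# Text written as ISO-8859-1 but read as UTF-8, as the file header demands:
raw = "Käse".encode("iso-8859-1")
print(decode_as_declared(raw, "utf-8"))       # K�se
print(decode_as_declared(raw, "iso-8859-1"))  # Käse
```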

Your file contains the former version (0xE4). When it is run through the UTF-8 decoder (which is what that first line asks for), the decoder gets confused: 0xE4 is the lead byte of a three-byte sequence (the range that includes CJK characters), and the following two bytes apparently do not form a valid sequence, so a � is sent to the output stream. You can possibly fix this if you replace

<?xml version="1.0" encoding="utf-8" ?>

with

<?xml version="1.0" encoding="iso-8859-1" ?>

in your OFX input file and try to process that file with KMyMoney. I would be interested in your results. If this does not help, it could be a problem in 4.8 that we already fixed in 5.0, as I see it working there with a correct file.
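
An alternative to editing the declaration is to leave it alone and convert the file content so that it really is UTF-8. The Python sketch below assumes the mislabelled bytes are ISO-8859-1; the function name to_real_utf8 and the fallback parameter are made up for this example, and whether the converted file imports correctly into KMyMoney 4.8 on Windows has not been tested here.

```python
def to_real_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw unchanged if it is valid UTF-8, otherwise transcode it.

    The fallback encoding is an assumption; use "cp1252" or "iso-8859-15"
    if that matches what the bank actually wrote.
    """
    try:
        raw.decode("utf-8")  # strict check, raises on invalid sequences
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


mislabelled = "<NAME>Öl</NAME>".encode("iso-8859-1")
fixed = to_real_utf8(mislabelled)
print(fixed.decode("utf-8"))
```

To apply it to a file, read the file in binary mode, pass the bytes through the function, and write the result to a new file in binary mode, so that no locale-dependent text conversion happens on the way.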


grntbn
Thank you for the detailed explanations and the suggestion for a fix.

Unfortunately it does not solve the problem.

Since there is no version 5 for Windows available yet, I'll set up Ubuntu on a VM Workstation and install the Linux version 5. Then we will know whether it's a problem with v 4.8 or not.
grntbn
Some additional info:

I’ve installed Ubuntu 16.04 on VM Workstation 12 and then KMyMoney with the following commands

$ sudo add-apt-repository ppa:claydoh/kmymoney2-kde4
$ sudo apt-get update
$ sudo apt-get install kmymoney


resulting in the installed version 4.8. The language was again set to German.

By the way: both Windows 10 and Ubuntu 16.04 are installed in English.

With this installation, both versions of the OFX file worked perfectly!

I’m quite unfamiliar with Linux, so I still have to find out how to install Version 5 on Ubuntu. But since the OFX files with Version 4.8 are working fine already, this will make no difference regarding my problem.

Any other idea what could cause the problem with the Windows version?
grntbn
Some more testing: the problem with the umlauts occurred on 3 different computers with Windows 10 English and on 1 with Windows 10 German.
ipwizard
I bet it might then be related to 4.8, though the only entry I find in the source history states

commit bde946b1e800e40414219397032b8ff484e17733
Author: Cristian Oneț <onet.cristian@gmail.com>
Date:   Tue Dec 1 12:43:25 2015 +0200

    Fix OFX direct connect data encoding.
   
    Write the byte array directly into the file without passing it trough
    QTextStream which uses QTextCodec::codecForLocale() to interpret the
    data. The encoding is handled by libofx.
   
    BUG: 353372
    FIXED-IN: 4.8.0
and that fix should be part of the version you are using. I don't know off the top of my head if there was anything else.


grntbn
Yes, related to 4.8 for Windows as it is working fine with 4.8 for Linux.

Many thanks for your effort!

