This forum has been archived. All content is frozen. Please use KDE Discuss instead.

problem with character encoding / MySQL

Tags: None
(comma "," separated)
klinnem
Registered Member
Posts
2
Karma
0
My goal is to import an existing database from a mysql server and use it with Kexi on the same server (via SSH Tunneling). Everything seems to work really fine, except for the character encoding.

Special and local characters are displayed as a rectangle with a question mark. I tried this on Ubuntu Linux 17.10, Arch Linux and even Cygwin/Windows.

It's very likely I'm missing something, but I have no such problems when working with the data in LibreOffice, Phpmyadmin, HeidiSQL and a PHP Script, I wrote years ago to access these research data with a browser. I do remember to have had problems years ago, but they looked different and I got around them somehow.
I already tried to convert an sql-dump with iconv from latin1 to utf8, reimport that and importing it with Kexi. It's puzzeling me, that this had no effect at (I expected at least to make it any worse). Currently I have no ideas what to try next. Any help would be appreciated.

Last edited by klinnem on Mon Feb 26, 2018 2:36 pm, edited 1 time in total.
User avatar
jstaniek
Moderator
Posts
1027
Karma
2
OS
klinnem wrote:My goal is to import an existing database from a mysql server and use it with Kexi on the same server (via SSH Tunneling). Everything seems to work really fine, except for the character encoding.

Special and local characters are displayed as a rectangle with a question mark. I tried this on Ubuntu Linux 17.10, Arch Linux and even Cygwin/Windows.

It's very likely I'm missing something, but I have no such problems when working with the data in LibreOffice, Phpmyadmin, HeidiSQL and a PHP Script, I wrote years ago to access these research data with a browser. I do remember to have had problems years ago, but they looked different and I got around them somehow.
I already tried to convert an sql-dump with iconv from latin1 to utf8, reimport that and importing it with Kexi. It's puzzeling me, that this had no effect at (I expected at least to make it any worse). Currently I have no ideas what to try next. Any help would be appreciated.


Hi Klinnem.
First step is to know if the character encoding is any kind of UTF or 'local' encoding. I think KEXI is expecting UTF-8 encoded strings, and then it is converting them to its unicode representation. If database has local (e.g. 8-bit) encoding, results like your may appear.

I am not saying this is the reason for sure. I am not saying in the future KEXI can't have option to specify source database encoding. It can be even detected (the way I linked below [1]) without bothering the user.

We need to know if your source database has non-UTF-8 encoding. Example advice for you to see that: [1] https://stackoverflow.com/questions/104 ... -column-is

If so, possible current workaround is to convert encoding on the source database (or on copy of the database).

It's planned that in future versions KEXI would no longer require to "import" server databases and would just connect to them. (It is not the case right now because such feature is much more fragile with regards to various database configurations options, there's just a lot to do and test)

PS: please always specify your KEXI version, not just on what OS it runs, because it is never clear what version runs on given OS.


Best regards,
Jarosław Staniek
• Qt Certified Specialist
KEXI - Open Source Visual DB Apps Builder
• Request a feature or fix for KEXI here
May I help you? Please mention your app's version and OS when asking for help
klinnem
Registered Member
Posts
2
Karma
0
Hello, thanks for your suggestions. My source database was encoded with latin1, but I also tried a simple db coded with utf8.
The information that kexi "expects" utf8 lead me to the test, what would happen when I let kexi "upload" or export a newly from within kexi created database to the MySQL Server. To my surprise kexi itself created a database with latin1_swedish_ci collation on the server. I guessed that this was determined by some kind of a "default" in the config of MySQL and that seemed to be correct.

I followed the suggestions from https://stackoverflow.com/questions/3513773/change-mysql-default-character-set-to-utf-8-in-my-cnf
to change the MySQL default behaviour by inserting this into my.cnf:

Code: Select all
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8


After restarting MySQL kexi could import/export my tables without a problem, even when the import source had collation latin1_swedish_ci.

So my problem is solved by that, maybe it could help anyone else.
It is possible that this is already part of a more modern default my.cnf than I was still using on my MySQL 5.5 and my MariaDB Server, so that not everyone will run into these problems.

Kexi Version: I used the actual appimage with 3.02, as kexi is broken in Arch Linux repository atm because of qt5 problems. I also tried the version delivered by Cygwin (3.0.2) and 3.0.2-2 from Ubuntu 17.10 repo.
User avatar
jstaniek
Moderator
Posts
1027
Karma
2
OS
Thanks for the details, Klinnem, maybe they would land in our FAQs one day :)

Defaults in MySQL is a long topic, it seems they as they are for historical reasons or because of performance. Maybe it makes sense for KEXI to create databases with forced UTF8 by default...

It's interesting the Cygwin project is packaging KEXI, it's completely independent effort :) That's OK, just we will have native Windows versions nevertheless.


Best regards,
Jarosław Staniek
• Qt Certified Specialist
KEXI - Open Source Visual DB Apps Builder
• Request a feature or fix for KEXI here
May I help you? Please mention your app's version and OS when asking for help


Bookmarks



Who is online

Registered users: Bing [Bot], gfielding, Google [Bot], markhm, sethaaaa, Sogou [Bot], Yahoo [Bot]