UPDATE: After posting this I found a good blog post about the same issue: "Turning MySQL data in latin1 to utf8 utf-8".

While migrating an old PHP-based web app to Ruby on Rails I came across an issue where the database and PHP were talking latin1, but the web interface output utf8 and the forms posted back were also in utf8. Part of the issue was that the HTML the application was outputting had no meta tag such as:
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"/>

and the headers sent by Apache were:
Content-Type: text/html; charset=UTF-8

After much trial and error I managed to work out how to migrate this data.

Use mysqldump on the table with the option --default-character-set=latin1 and output to a file.

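This file will have its text encoded as utf8, because the utf8 bytes that were stored in the latin1 columns are written out without any conversion.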
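
As a rough sketch of the dump step, the call could be wrapped like this. The function name dump_latin1 is made up for illustration, and user, database and table are placeholders in the same spirit as the import command later in this post.

```shell
# Hypothetical helper: dump one table while telling mysqldump to talk latin1,
# so the bytes stored in the table are written to the file unchanged.
# Usage: dump_latin1 USER DATABASE TABLE OUTFILE
dump_latin1() {
  mysqldump -u "$1" -p --default-character-set=latin1 "$2" "$3" > "$4"
}

# Example (prompts for the password):
#   dump_latin1 user database table table.sql
```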

Then edit the table definition in the file and change the DEFAULT CHARSET to utf8
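
If you would rather not edit the dump by hand, something like the following sed call may do the job. It is only a sketch: the function name is hypothetical, and it assumes the dump contains the literal text DEFAULT CHARSET=latin1. Dumps made with default options usually also carry a SET NAMES latin1 statement near the top, so the sketch rewrites that as well, on the assumption that it should match the new charset.

```shell
# Hypothetical helper: switch a dump file from latin1 to utf8 declarations.
# Note: this is a plain text substitution, so it would also touch row data
# that happens to contain exactly these strings.
# Usage: set_dump_charset_utf8 INFILE OUTFILE
set_dump_charset_utf8() {
  sed -e 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/' \
      -e 's/SET NAMES latin1/SET NAMES utf8/' \
      "$1" > "$2"
}

# Example:
#   set_dump_charset_utf8 table.sql table_utf8.sql
```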

It's then a good idea to run the file through iconv to drop any corrupted characters:

iconv -f utf-8 -t utf-8 table.sql -c -o table_fixed.sql
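
Since -c silently throws away anything it cannot convert, it can be worth checking first whether the file has any invalid bytes at all. A minimal sketch, with a made-up function name, relying on iconv exiting with a non-zero status at the first invalid sequence when -c is not given:

```shell
# Hypothetical helper: succeed only if FILE is valid UTF-8 from start to end.
# Usage: is_valid_utf8 FILE
is_valid_utf8() {
  iconv -f utf-8 -t utf-8 "$1" > /dev/null 2>&1
}

# Example:
#   is_valid_utf8 table.sql || echo "table.sql has invalid bytes; iconv -c will drop them"
```

If the check fails, comparing table.sql with table_fixed.sql using diff afterwards shows which lines lost characters.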

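It is also a good idea to make MySQL default to utf8 for everything. Edit my.cnf and,

under [client] and [mysqld], add: default-character-set=utf8 (on MySQL 5.5 and later the [mysqld] section no longer accepts this option; use character-set-server=utf8 there instead).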

Then you can reimport the fixed data back into MySQL:

mysql -u user -p -e "source table_fixed.sql" database

Of course now you need to make sure anything talking to the database is doing so in utf8
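
One way to check what a client connection is actually using is to ask the server for its character set variables. This is only a sketch with a hypothetical function name; user and database are placeholders.

```shell
# Hypothetical helper: print the character set variables for a connection.
# character_set_client, character_set_connection and character_set_results
# should all report utf8 if the client is talking utf8.
# Usage: show_charsets USER DATABASE
show_charsets() {
  mysql -u "$1" -p -e "SHOW VARIABLES LIKE 'character_set%'" "$2"
}

# Example (prompts for the password):
#   show_charsets user database
```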
