In attempts to optimize our relational database backend we were trying various RDBMS solutions including MySQL, Percona and PostgreSQL. It’s worth to say that we had a relatively large (25GB) database set with plenty of Unicode data, including special chars such as
\t and others. That made migration process even more sophisticated.
There are many articles all over the Internet explaining fine plans of migration from MySQL to PostgreSQL and only a few describe the backwards action. In this article I’ll try to only show the caveats we faced.
So our first migration was MySQL to Percona. That was amazingly simple: as easy as installing Percona (which in Debian 6.0 removes MySQL) and issuing
service mysql restart. That’s it. Percona picked up database files and
my.cnf flawlessly, just as they were built for it from the very beginning.
Though Percona proved to be much faster, we decided to take even more advantage and switch to PostgreSQL. Here is where challenges began.
Note: I will not explain schema and roles migration because we had schema for each RDBMS written from scratch; so this is out of scope of this article, as it only explains data migration.
First thing to do was to create a dump of existing data:
mysqldump --default-character-set=utf8 --skip-lock-tables --skip-opt --compatible=postgresql --no-create-info database_name
This prints the database dump to stdout, in some kind of postgre compatible format (please see MySQL 5.6 Documentation for details). This is true but quotes. Quotes are in MySQL, not ANSI SQL style. To make PostgreSQL understand this syntax, we had to prepend the dump with the following lines:
SET standard_conforming_strings = 'off'; SET backslash_quote = 'on';
You can find more information at Wikibooks: Converting MySQL to PostgreSQL.
Another issue was with Boolean fields. In MySQL these are represented with
tinyint(1) and in PostgreSQL there’s a special type for it. But PostgreSQL did not want to understand
true (only with quotes, which is kinda weird). So we had to make that columns
INT then cast to
CAST(column AS boolean) and save to a new column.
But staying with PostgreSQL did not show much benefit for us. The I/O rate has increased and the overall performance was mostly the same. There may be several reasons for it: we have so called “simple” queries (but much of them), PostgreSQL was not configured properly (though I spent quite a time with it) or even software bugs, but this is another story. So our task was to switch back to MySQL (Percona).
To dump the database:
pg_dump --data-only --inserts --no-privileges --no-tablespaces --schema=public database_name
INSERTs to stdout. Again same problem with quotes: the output is in ANSI format, while MySQL expects it’s own by default. No problem, prepend the dump with MySQL directive:
Also we had no problem with booleans: MySQL was okay to translate
tinyint(1). The only queries to worry about were calls to “pg_catalog.setval”, so we just grepped them out.