philipsoutham / py-mysql2pgsql Goto Github PK

Tool for migrating/converting from mysql to postgresql.

Home Page: http://packages.python.org/py-mysql2pgsql/

License: MIT License

Python 100.00%

py-mysql2pgsql's Introduction

py-mysql2pgsql - A tool for migrating/converting/exporting data from MySQL to PostgreSQL

This tool allows you to take data from an MySQL server (only tested on 5.x) and write a PostgresSQL compatible (8.2 or higher) dump file or pipe it directly into your running PostgreSQL server (8.2 or higher).

Attention

Currently there is no support for importing spatial data from MySQL.

Installation:

If you're like me you don't like random stuff polluting your python install. Might I suggest installing this in an virtualenv?

> virtualenv --no-site-packages ~/envs/py-mysql2pgsql
> source ~/envs/py-mysql2pgsql/bin/activate

Requirements:

Python 2.7
MySQL-python
psycopg2
PyYAML
termcolor (unless you're installing on windows)
pytz

On Windows

I have only done limited testing on this platform using Python 2.7. Here are the driver dependencies for windows, install these before attempting to install py-mysql2pgsql or it will fail.

From PyPI:

All dependencies should be automatically installed when installing the app the following ways

> pip install py-mysql2pgsql

From source:

> git clone git://github.com/philipsoutham/py-mysql2pgsql.git
> cd py-mysql2pgsql
> python setup.py install

Usage:

Looking for help?

> py-mysql2pgsql -h
usage: py-mysql2pgsql [-h] [-v] [-f FILE]

Tool for migrating/converting data from mysql to postgresql.

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Show progress of data migration.
  -f FILE, --file FILE  Location of configuration file (default:
                        mysql2pgsql.yml). If none exists at that path,
                        one will be created for you.

Don't worry if this is your first time, it'll be gentle.

> py-mysql2pgsql
No configuration file found.
A new file has been initialized at: mysql2pgsql.yml
Please review the configuration and retry...

As the output suggests, a file was created at mysql2pgsql.yml for you to edit. For the impatient, here is what the file contains.

# a socket connection will be selected if a 'socket' is specified
# also 'localhost' is a special 'hostname' for MySQL that overrides the 'port' option
# and forces it to use a local socket connection
# if tcp is chosen, you can use compression

mysql:
 hostname: localhost
 port: 3306
 socket: /tmp/mysql.sock
 username: mysql2psql
 password: 
 database: mysql2psql_test
 compress: false
destination:
 # if file is given, output goes to file, else postgres
 file: 
 postgres:
  hostname: localhost
  port: 5432
  username: mysql2psql
  password: 
  database: mysql2psql_test

# if only_tables is given, only the listed tables will be converted.  leave empty to convert all tables.
#only_tables:
#- table1
#- table2
# if exclude_tables is given, exclude the listed tables from the conversion.
#exclude_tables:
#- table3
#- table4

# if supress_data is true, only the schema definition will be exported/migrated, and not the data
supress_data: false

# if supress_ddl is true, only the data will be exported/imported, and not the schema
supress_ddl: false

# if force_truncate is true, forces a table truncate before table loading
force_truncate: false

# if timezone is true, forces to append/convert to UTC tzinfo mysql data
timezone: false

# if index_prefix is given, indexes will be created whith a name prefixed with index_prefix
index_prefix:

Pretty self explanatory right? A couple things to note, first if destination -> file is populated all output will be dumped to the specified location regardless of what is contained in destination -> postgres. So if you want to dump directly to your server make sure the file value is blank.

Say you have a MySQL db with many, many tables, but you're only interested in exporting a subset of those table, no problem. Add only the tables you want to include in only_tables or tables that you don't want exported to exclude_tables.

Other items of interest may be to skip moving the data and just create the schema or vice versa. To skip the data and only create the schema set supress_data to true. To migrate only data and not recreate the tables set supress_ddl to true; if there's existing data that you want to drop before importing set force_truncate to true. force_truncate is not necessary when supress_ddl is set to false.

Note that when migrating, it's sometimes possible to knock your sequences out of whack. When this happens, you may get IntegrityErrors about your primary keys saying things like, "duplicate key value violates unique constraint." See this page for a fix

Due to different naming conventions in mysql an postgresql, there is a chance that the tool generates index names that collide with table names. This can be circumvented by setting index_prefix.

One last thing, the --verbose flag. Without it the tool will just go on it's merry way without bothering you with any output until it's done. With it you'll get a play-by-play summary of what's going on. Here's an example.

> py-mysql2pgsql -v -f mysql2pgsql.yml
START PROCESSING table_one
  START  - CREATING TABLE table_one
  FINISH - CREATING TABLE table_one
  START  - WRITING DATA TO table_one
  24812.02 rows/sec [20000]  
  FINISH - WRITING DATA TO table_one
  START  - ADDING INDEXES TO table_one
  FINISH - ADDING INDEXES TO table_one
  START  - ADDING CONSTRAINTS ON table_one
  FINISH - ADDING CONSTRAINTS ON table_one
FINISHED PROCESSING table_one

START PROCESSING table_two
  START  - CREATING TABLE table_two
  FINISH - CREATING TABLE table_two
  START  - WRITING DATA TO table_two

  FINISH - WRITING DATA TO table_two
  START  - ADDING INDEXES TO table_two
  FINISH - ADDING INDEXES TO table_two
  START  - ADDING CONSTRAINTS ON table_two
  FINISH - ADDING CONSTRAINTS ON table_two
FINISHED PROCESSING table_two

Data Type Conversion Legend

Since there is not a one-to-one mapping between MySQL and PostgreSQL data types, listed below are the conversions that are applied. I've taken some liberties with some, others should come as no surprise.

MySQL	PostgreSQL
char	character
varchar	character varying
tinytext	text
mediumtext	text
text	text
longtext	text
tinyblob	bytea
mediumblob	bytea
blob	bytea
longblob	bytea
binary	bytea
varbinary	bytea
bit	bit varying
tinyint	smallint
tinyint unsigned	smallint
smallint	smallint
smallint unsigned	integer
mediumint	integer
mediumint unsigned	integer
int	integer
int unsigned	bigint
bigint	bigint
bigint unsigned	numeric
float	real
float unsigned	real
double	double precision
double unsigned	double precision
decimal	numeric
decimal unsigned	numeric
numeric	numeric
numeric unsigned	numeric
date	date
datetime	timestamp without time zone
time	time without time zone
timestamp	timestamp without time zone
year	smallint
enum	character varying (with check constraint)
set	ARRAY[]::text[]

Conversion caveats:

Not just any valid MySQL database schema can be simply converted to the PostgreSQL. So when you end with a different database schema please note that:

Most MySQL versions don't enforce NOT NULL constraint on date and enum fields. Because of that NOT NULL is skipped for this types. Here's an excuse for the dates: http://bugs.mysql.com/bug.php?id=59526.

About:

I ported much of this from an existing project written in Ruby by Max Lapshin over at https://github.com/maxlapshin/mysql2postgres. I found that it worked fine for most things, but for migrating large tables with millions of rows it started to break down. This motivated me to write py-mysql2pgsql which uses a server side cursor, so there is no "paging" which means there is no slow down while working it's way through a large dataset.

py-mysql2pgsql's People

Contributors

Stargazers

Watchers

Forkers

foxxtrot anentropic scw inactivist qnub robbyt fvbock xevix gadventures igniteflow fintanmm typecode mlissner cewing rupa108 rosscdh anandology david-barbion dreamct lxyu nikicat gms8994 garianin akinsgre jesper ccurvey imhuzi funkyan opesok pareis rraub cazlo stephenmcd johncosta pfote kjmcclain frodosamoa anonymouzz stask miquelbar simon-liu jeckep labeneator dyson ahmed-aly-aut blackyboy eyepea waweber deep9 sureshantony mustafamagdi ry4nz demofly rgegriff gstarke nicramarcin camillobruni mattcarp cerdman pgcd tarkatronic priestd09 nepherhotep spootdev mgreschke edwardsamuel emailgregn leture pombredanne deathweaselx86 varsha-arun lukecyca philsong zhao9jack crossdot mr-tron tuky champierre josesanch piyusql programmer-ke pathcl food52 fmarchenko iblagojevic pavanv blazwin clot09 max-morozenko nguyendn marten-cz ksdtech hellotech searchinfluence skoval00 alexmirtoff bobspct adventuresoflucylu nyzsirt mrmichalis

py-mysql2pgsql's Issues

Conversion error with NULL default for timestamp

Hi,

We love this tool, but had to patch it to make it work with NULL defaults for timestamp fields. We added these two lines after line 77 in mysql2pgsql/lib/postgres_writer.py:

if not column['default']:
    return None, 'timestamp without time zone'

TypeError: argument of type 'NoneType' is not iterable

i have used this program few months ago and never happened it...

TypeError: connect() argument 3 must be string, not int

Hello ! Get error from the topic.

Using this conf:

# if a socket is specified we will use that
# if tcp is chosen you can use compression
mysql:
 hostname: 127.0.0.1
 port: 3306
 socket: /tmp/mysql.sock
 username: root
 password: 12345
 database: podcast
 compress: false
destination:
 # if file is given, output goes to file, else postgres
 file: 
 postgres:
  hostname: 127.0.0.1
  port: 5432
  username: postgres
  password: 12345
  database: podcast

# if tables is given, only the listed tables will be converted.  leave empty to convert all tables.
#only_tables:
#- table1
#- table2
# if exclude_tables is given, exclude the listed tables from the conversion.
#exclude_tables:
#- table3
#- table4

# if supress_data is true, only the schema definition will be exported/migrated, and not the data
supress_data: false

# if supress_ddl is true, only the data will be exported/imported, and not the schema
supress_ddl: false

# if force_truncate is true, forces a table truncate before table loading
force_truncate: false

add longblob support for longblob in postgres writer

~/repos/py-mysql2pgsql: git diff
diff --git a/mysql2pgsql/lib/postgres_writer.py b/mysql2pgsql/lib/postgres_writer.py
index 3165520..16109fd 100644
--- a/mysql2pgsql/lib/postgres_writer.py
+++ b/mysql2pgsql/lib/postgres_writer.py
@@ -104,7 +104,7 @@ class PostgresWriter(object):
return default, 'time with time zone'
else:
return default, 'time without time zone'

       elif column['type'] in ('blob', 'binary', 'varbinary'):

       elif column['type'] in ('longblob', 'blob', 'binary', 'varbinary'):
         return default, 'bytea'
     elif column['type'] in ('tinytext', 'mediumtext', 'longtext', 'text'):
         return default, 'text'

Exluding columns

Hello,

I have django gis Point field. This is not implemented, so I want exclude specific column, not table.

Maybe there is are way to do this?

Handling of BINARY(1) columns

We have a table with a column defined as BINARY(1). This doesn't seem to be defined currently. This should probably just be an int4 column.

Thoughts?

Failed to load some table data

I have issues with several tables but don't know the reason. The error is always the same, like:

Traceback (most recent call last):
File "/home/dgambin/venv/py-mysql2pgsql/bin/py-mysql2pgsql", line 59, in
mysql2pgsql.Mysql2Pgsql(options).convert()
File "/home/dgambin/venv/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/mysql2pgsql.py", line 31, in convert
Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
File "/home/dgambin/venv/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/converter.py", line 49, in convert
self.writer.write_contents(table, self.reader)
File "/home/dgambin/venv/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/init.py", line 86, in decorated_function
ret = f(_args, *_kwargs)
File "/home/dgambin/venv/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_db_writer.py", line 193, in write_contents
self.copy_from(f, '"%s"' % table.name, ['"%s"' % c['name'] for c in table.columns])
File "/home/dgambin/venv/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_db_writer.py", line 116, in copy_from
columns=columns
psycopg2.extensions.QueryCanceledError: COPY from stdin failed: error in .read() call
CONTEXT: COPY table_name, line 1

Any clue?

Old version on pypi

On the python package Index there is still the old version 0.1.5:
https://pypi.python.org/pypi/py-mysql2pgsql/
Can you please update to 0.1.6.

Thanks Roman

Table and Column comments are not being migrated to postgresql

It looks like this script is not migrating over any of the comments from mysql.
Using the query 'EXPLAIN tablename' only gets the column Name, Type, Nullity, Key, Default, and Extra.
If we were to use 'SHOW FULL COLUMNS FROM tablename' we would also get the Comments for each column (and a few other bits).
There is also the comments on the table itself to be migrated.

Getting Error when converting bit(1) NOT NULL DEFAULT b'0',

here is the stack trace

Traceback (most recent call last):
  File "/usr/local/bin/py-mysql2pgsql", line 5, in <module>
    pkg_resources.run_script('py-mysql2pgsql==0.1.6', 'py-mysql2pgsql')
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 489, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 1207, in run_script
    execfile(script_filename, namespace, namespace)
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/EGG-INFO/scripts/py-mysql2pgsql", line 38, in <module>
    mysql2pgsql.Mysql2Pgsql(options).convert()
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/mysql2pgsql.py", line 37, in convert
    Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/converter.py", line 32, in convert
    self.writer.write_table(table)
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/__init__.py", line 91, in decorated_function
    ret = f(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_file_writer.py", line 65, in write_table
    table_sql, serial_key_sql = super(PostgresFileWriter, self).write_table(table)
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 230, in write_table
    primary_keys, serial_key, maxval, columns = self.table_attributes(table)
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 206, in table_attributes
    columns.write('  %s,\n' % self.column_description(column))
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 26, in column_description
    return '"%s" %s' % (column['name'], self.column_type_info(column))
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 126, in column_type_info
    default, column_type = get_type(column)
  File "/Library/Python/2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 62, in get_type
    default = (" DEFAULT %s" % ('true' if int(column['default']) == 1 else 'false')) if t(default) else None
ValueError: invalid literal for int() with base 10: "b'0'"

mysql schema

CREATE TABLE `campaigns` (
  `name` varchar(255) NOT NULL,
  `description` text,
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `xAdCampaingID` varchar(50) DEFAULT NULL,
  `created_at` datetime NOT NULL,
  `updated_at` datetime NOT NULL,
  `deleted_at` datetime DEFAULT NULL,
  `updated_by` int(11) unsigned DEFAULT NULL,
  `created_by` int(11) unsigned NOT NULL,
  `guid` varchar(255) DEFAULT NULL,
  `launch_date` datetime NOT NULL,
  `end_date` datetime NOT NULL,
  `summary_notes` text NOT NULL,
  `attribution_view_through_max_window` float DEFAULT '30',
  `attribution_click_through_max_window` float DEFAULT '30',
  `ui_kpi_impression_count` double(11,2) DEFAULT '0.00',
  `ui_kpi_conversion_rate` double(11,2) DEFAULT '0.00',
  `ui_kpi_click_rate` double(11,2) DEFAULT '0.00',
  `ui_kpi_reach_rate` double(11,2) DEFAULT '0.00',
  `ui_kpi_engagement_rate` double(11,2) NOT NULL DEFAULT '0.00',
  `ui_kpi_video_rate` double(11,2) DEFAULT '0.00',
  `actual_impression_count` int(11) NOT NULL DEFAULT '0',
  `actual_click_count` int(11) NOT NULL DEFAULT '0',
  `actual_expand_count` int(11) NOT NULL DEFAULT '0',
  `actual_engagement_count` int(11) NOT NULL DEFAULT '0',
  `ui_kpi_expand_rate` double(11,2) DEFAULT '0.00',
  `actual_click_rate` double(11,2) NOT NULL DEFAULT '0.00',
  `actual_expand_rate` double(11,2) NOT NULL DEFAULT '0.00',
  `actual_engagement_rate` double(11,2) NOT NULL DEFAULT '0.00',
  `expected_impressions` int(11) DEFAULT '0',
  `advertiser_id` int(11) unsigned DEFAULT NULL,
  `is_madison` tinyint(11) NOT NULL DEFAULT '0',
  `instance_id` int(11) unsigned NOT NULL,
  `agency_id` int(11) unsigned DEFAULT NULL,
  `brand_id` int(11) unsigned DEFAULT NULL,
  `budget` decimal(20,2) DEFAULT NULL,
  `currency` varchar(255) DEFAULT 'USD',
  `deleted_by` int(11) unsigned DEFAULT NULL,
  `timezone` varchar(100) NOT NULL DEFAULT 'America/New_York',
  `indexed_at` datetime DEFAULT NULL,
  `actual_conversion_rate` double(10,2) unsigned NOT NULL DEFAULT '0.00',
  `actual_conversion_count` int(11) NOT NULL DEFAULT '0',
  `actual_reach_rate` double(10,2) unsigned NOT NULL DEFAULT '0.00',
  `actual_reach_count` double(11,2) NOT NULL DEFAULT '0.00',
  `actual_video_rate` double(10,2) unsigned NOT NULL DEFAULT '0.00',
  `actual_video_count` double(11,2) NOT NULL DEFAULT '0.00',
  `insertion_order` varchar(100) NOT NULL DEFAULT '',
  `impression_enabled` bit(1) NOT NULL DEFAULT b'0',
  `click_enabled` int(11) unsigned NOT NULL DEFAULT '0',
  `expand_enabled` int(11) unsigned NOT NULL DEFAULT '0',
  `engagement_enabled` bit(1) NOT NULL DEFAULT b'0',
  `conversion_enabled` bit(1) NOT NULL DEFAULT b'0',
  `reach_enabled` bit(1) NOT NULL DEFAULT b'0',
  `video_enabled` bit(1) NOT NULL DEFAULT b'0',
  `budget_enabled` bit(1) NOT NULL DEFAULT b'0',
  `metrics_updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `old_attribution_id` int(11) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_campaign_deleted` (`deleted_at`) USING BTREE,
  KEY `idx_guid` (`guid`) USING BTREE,
  KEY `updated_by` (`updated_by`),
  KEY `created_by` (`created_by`),
  KEY `deleted_by` (`deleted_by`),
  KEY `instance_id` (`instance_id`),
  KEY `agency_id` (`agency_id`),
  KEY `advertiser_id` (`advertiser_id`),
  KEY `campaigns_ibfk_7` (`brand_id`),
  KEY `old_attribution_id` (`old_attribution_id`),
  CONSTRAINT `campaigns_ibfk_1` FOREIGN KEY (`updated_by`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `campaigns_ibfk_2` FOREIGN KEY (`created_by`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `campaigns_ibfk_3` FOREIGN KEY (`deleted_by`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `campaigns_ibfk_5` FOREIGN KEY (`instance_id`) REFERENCES `instances` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `campaigns_ibfk_6` FOREIGN KEY (`agency_id`) REFERENCES `agencies` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `campaigns_ibfk_7` FOREIGN KEY (`brand_id`) REFERENCES `advertiser_brands` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `campaigns_ibfk_8` FOREIGN KEY (`advertiser_id`) REFERENCES `advertisers` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1274 DEFAULT CHARSET=utf8;

Autogenerated index name length exceeds 63 characters

When migrating, indexes are created with autogenerated name in Writer. PG has a limit of column (and index) name to 63 characters, so when you have first 63 characters of generated index equal for some indexes, you get an error while importing.

psql:/projects/mysql_2_pg.sql:8093973: ERROR:  cannot drop index X because constraint X on table Y requires it
HINT:  You can drop constraint X on table Y instead.
psql:/projects/mysql_2_pg.sql:8093974: ERROR:  relation "X" already exists

Can you have an option to just re-use index names from current database? Or any other thoughts on this?

Thanks.

Mssing quotation marks for columns with ADD PRIMARY KEY sql statement

Hi all,

A small fix for primary keys using columns with funny names, in postgres_writer.py, write_indexes(), the column_names join() must use quotation marks

'column_names': ', '.join('"%s"' % col for col in primary_index[0]['columns'])

Thanks

from datetime to timestamp with timezone

the field datetime is translated into timestamp without timezone.
Django fails localizing timestamp without timezone.
I noticed that there is an option
timezone: true
but I don't see the difference: My timestamps are still "without timezone"

Missing escape causes case-sensitivity

When the tool generates Postgres SQL, most identifiers (tables, columns, key names etc) are properly escaped with double quotes. This forces Postgres to interpret the literals without case conversion. However, when generating the primary key for a converted table, with a statement such as

ALTER TABLE "inv_types" ADD CONSTRAINT "inv_types_typeID_pkey" PRIMARY KEY(typeID);

...the name of the key is not escaped with double quotes. Postgres downcases the unescaped name, and the operation fails if the column name has mixed case characters because the referred column does not exist (as it was created originally in mixed case with double quotes, but referred to in forced lower case without double quotes).

Mangle case-sensitive schemas

Sorry if this is not the right place, couldnt find another... :)
I need to migrate a MySQL database to a PostgreSQL server. On mysql I have all lower-case database, tables and field names, but on Postgres I have CamelCase names. This is the bacula (linux backup tool) database, and thus I cannot change any definition on both servers. When I run py-mysql2pgsql it tells me:

"psycopg2.ProgrammingError: ERROR: relation "BaseFiles" does not exist"

or something like that in pt_BR.
Couldn't your script handle this, or can you suggest a workaround?

Tks, best regards.

Tinyint converted to Bool

Hi there
I am using MySQL 5.5 and PG 9.2. The script is converting tinyint columns into boolean, which is not desired. Is there a way to get the script to not do that?

Problem with mixed case sequence names

Due to the current behaviour of Postgres, sequence names have to be in lower case, otherwise it throws an exception. Can this be fixed as a matter of urgency? Soon I will be having to migrate a very large database and it would be extremely helpful, btw I have evaluated 4 commercial solutions and 4 open source ones and py-mysql2pgsql is hands down the best.

typo in the README.md file

Perhaps

"As the output suggests, I file was created at mysql2pgsql.yml for you to edit."

should instead be something like:

"As the output suggests, a file was created at mysql2pgsql.yml for you to edit."

Unsigned bigint conversion issue

Hi!

Have to start with that I really like your script! I have however stumbled upon one issue.

Unsigned bigint gets converted to numeric, which is fine. However it is using the precision from the bigint as the precision for the numeric. That causes some problems because precision is usually 11 for mysql bigint, however that is not affecting any operation at all in mysql for bigint.

 M indicates the maximum display width for integer types. The maximum 
 display width is 255. Display width is unrelated to the range of values a 
 type can contain, as described in Section 11.2, “Numeric Types”. For 
 floating-point and fixed-point types, M is the total number of digits that 
  can be stored.

(From http://dev.mysql.com/doc/refman/5.6/en/numeric-type-overview.html)

So this script converts bigint(11) to numeric(11), then inserts of values that have more than 11 digits (as allowed by mysql) will fail.

The solution would be to always convert unsigned bigint to numberic(20) as maximum value for unsigned bigint is 18446744073709551615, i.e 20 digits.

I could provide a patch for this if you think this would be a good solution.

inserted literal empty single quotes '' into empty char fields

Great script, thanks! (I previously tried the Ruby one and couldn't get it to work)

I just migrated my db and the only problem I see is it has inserted literal pair of single quotes '' into char and text fields where the value was empty in the mysql db.

Any idea where to look to fix this?

Exception: unknown datetime(6)

While running my conversion I got the following error

 153, in table_attributes
    columns.write('  %s,\n' % self.column_description(column))
  File "/usr/local/lib/python2.7/site-packages/mysql2pgsql/lib/postgres_writer.py", line 18, in column_description
    return '"%s" %s' % (column['name'], self.column_type_info(column))
  File "/usr/local/lib/python2.7/site-packages/mysql2pgsql/lib/postgres_writer.py", line 107, in column_type_info
    default, column_type = get_type(column)
  File "/usr/local/lib/python2.7/site-packages/mysql2pgsql/lib/postgres_writer.py", line 105, in get_type
    raise Exception('unknown %s' % column['type'])
Exception: unknown datetime(6)

I digged a bit deeper to find the offending table by printing the contents of column:

({
    'length': 6, 
    'table_name': u'django_migrations', 
    'name': u'applied', 
    'auto_increment': False, 
    'default': None, 
    'decimals': None, 
    'null': False, 
    'type': u'datetime(6)', 
    'primary_key': False
})

And the info from mysql:

mysql> DESCRIBE django_migrations;
+---------+--------------+------+-----+---------+----------------+
| Field   | Type         | Null | Key | Default | Extra          |
+---------+--------------+------+-----+---------+----------------+
| id      | int(11)      | NO   | PRI | NULL    | auto_increment |
| app     | varchar(255) | NO   |     | NULL    |                |
| name    | varchar(255) | NO   |     | NULL    |                |
| applied | datetime(6)  | NO   |     | NULL    |                |
+---------+--------------+------+-----+---------+----------------+

I don't have a clue what the length on a datetime field should do, but I can provide a fast fix by checking column['type'] with a regex to avoid the above exception.

Quoting for table names

My PostgreSQL 9.3 complained about table names in the generated .sql file.
I had to add quotes around table names in functions

def column_comment(self, tablename, column)

and

def write_constraints(self, table)

Thanks.

Will always use socket when host is 'localhost'

The generated template and the README should mention that the script will always
try to connect to mysql using the socket if hostname is "localhost" (this is a "feature" of the python mysql library).

If you want to connect to localhost via TCP, you must change the hostname to "127.0.0.1".

Keyword: unix_socket

Blob data corrupted after migration

I'm moving Blob data from MySQL 5.6.10 to Postgresql 9.3.1.0 using Python 2.7.5

The MySQL data contains PDF files as Blobs. After the migration, the files are corrupted.

I haven't started doing any debugging of the py-mysql2pgsql scripts, other than just checking that the data is good in MySQL and bad in Postgres immediately after doing the conversion.

But I was hoping this was something other people might have run into.

ImportError: No module named psycopg2.extensions

Hi
i am new in Python, so excuse my question

In windows 7,
I have been trying to install at first : psycopg2-windows : https://github.com/nwcell/psycopg2-windows

I run in DOS mode :
setup.py install

Result:
D:\soft\python\psycopg2-windows-0.1.0\psycopg2-windows-0.1.0>setup.py install
C:\Python\Python27\lib\distutils\dist.py:267: UserWarning: Unknown distribution
option: 'summary'
warnings.warn(msg)
running install
running build
running install_egg_info
Writing C:\Python\Python27\Lib\site-packages\psycopg2-2.5.2-py2.7.egg-info

Then , i have been trying to install py-mysql2pgsql
I run in DOS:
D:\soft\python\py-mysql2pgsql-master>setup.py install

Result :
...
Finished processing dependencies for py-mysql2pgsql==0.1.6

When i run the py-mysql2pgsql script, i got a : ImportError: No module named psycopg2.extensions

I run :
python py-mysql2pgsql -f
D:\business_automation\processus_db\maintenance\export_mysql_import_postgresql\mysql_local_pg_local.yml

Result :
C:\Python\Python27\Scripts>python py-mysql2pgsql -f D:\business_automation\processus_db\maintenance\export_mysql_import_postgresql\mysql_local_pg_local.yml
Traceback (most recent call last):
File "py-mysql2pgsql", line 4, in import('pkg_resources').run_script('py-mysql2pgsql==0.1.6', 'py-mysql2pgsql')
File "C:\Python\Python27\lib\site-packages\pkg_resources.py", line 517, in run_script self.require(requires)[0].run_script(script_name, ns)
File "C:\Python\Python27\lib\site-packages\pkg_resources.py", line 1436, in run_script exec(code, namespace, namespace)
File "c:\python\python27\lib\site-packages\py_mysql2pgsql-0.1.6-py2.7.egg\EGG-INFO\scripts\py-mysql2pgsql", line 5, in import mysql2pgsql
File "C:\Python\Python27\lib\site-packages\py_mysql2pgsql-0.1.6-py2.7.egg\mysql2pgsql__init__.py", line 2, in
from .mysql2pgsql import Mysql2Pgsql
File "C:\Python\Python27\lib\site-packages\py_mysql2pgsql-0.1.6-py2.7.egg\mysql2pgsql\mysql2pgsql.py", line 7, in
from .lib.postgres_file_writer import PostgresFileWriter
File "C:\Python\Python27\lib\site-packages\py_mysql2pgsql-0.1.6-py2.7.egg\mysql2pgsql\lib\postgres_file_writer.py", line 6, in
from .postgres_writer import PostgresWriter
File "C:\Python\Python27\lib\site-packages\py_mysql2pgsql-0.1.6-py2.7.egg\mysql2pgsql\lib\postgres_writer.py", line 7, in
from psycopg2.extensions import QuotedString, Binary, AsIs
ImportError: No module named psycopg2.extensions

What did i miss there ?
The setup.py install isn't sufficient to get all dependencies ?

Thanks

tinyints being converted

Hi,
I was attempting to migrate a Moodle database from MySQL to Postgres. (Moodle is a php framework for managing and delivering e-learning content). I created the Postgres database using the Moodle installation procedure and tried to transfer data only, ignoring DDL, from one to the other. The Moodle versions were exactly the same between servers. Here's the issue ... The transfer failed because tinyint column values were converted from 1 and 0 values into 't' and 'f' characters in the py-mysql2pgsql output file (representing true and false maybe?). Postgres COPY command refused to import the 't' and 'f' characters into numeric fields. The fix was relatively easy, querying the MySQL data dictionary for 'tinyint' columns like so ...
mysql> use information_schema mysql> tee alter_holchem_copy.sql mysql> select CONCAT('ALTER TABLE ',TABLE_NAME, ' CHANGE ', COLUMN_NAME, ' ', COLUMN_NAME, ' INTEGER;') from columns where table_schema = 'holchem_copy' and data_type = 'tinyint'; mysql> exit;

I then formatted the output with
sed -i 's/..\(.*\)|/\1/' alter_holchem_copy.sql

Running this script converted all the MySQL tinyint's into int's (larger therefore no data loss) and subsequently py-mysql2pgsql copied the values between databases perfectly.

I wondered if the 1/0 -> t/f conversion for tinyint columns was by design or is there a parameter I can use to control this behaviour? I did view the MySQL data in three different clients (Navicat, phpMyAdmin, and the CLI mysql tool) to make sure the values were stored as 1/0 in the MySQL database, not as t/f.

Centos 6
py-mysql2pgsql version 0.1.5
MySQL 5.5.30
Postgres 8.4.13

Cheers,
Chris

Boolean fields stored as hexadecimal values incorrectly converted

I have a mysql db that appears to store boolean values as the hexadecimal values '\x01' for true and '\x00' for false. The postgres writer is incorrectly leaving these values in place and PostgreSQL is throwing errors when trying to load the data.

I've got a fix for the problem and will issue a pull request in a short while

psycopg2.IntegrityError: null value in datetime

psycopg2.IntegrityError: null value in column "datewritten" violates not-null constraint

Trying to move the following table over:

CREATE TABLE `articles` (
   `index` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
   `title` tinytext,
   `datewritten` date NOT NULL,
   PRIMARY KEY (`index`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `articles` (`index`, `title`, `datewritten`)
VALUES (1,'Hello World','0000-00-00');

Error when writing data to postgres

Hi,

I get this error when migrating a mysql db:

Traceback (most recent call last):
File "/usr/local/bin/py-mysql2pgsql", line 5, in
pkg_resources.run_script('py-mysql2pgsql==0.1.6', 'py-mysql2pgsql')
File "/usr/local/lib/python2.7/dist-packages/setuptools-1.1-py2.7.egg/pkg_resources.py", line 492, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/lib/python2.7/dist-packages/setuptools-1.1-py2.7.egg/pkg_resources.py", line 1350, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/EGG-INFO/scripts/py-mysql2pgsql", line 38, in
mysql2pgsql.Mysql2Pgsql(options).convert()
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/mysql2pgsql.py", line 31, in convert
Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/converter.py", line 51, in convert
self.writer.write_contents(table, self.reader)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/init.py", line 86, in decorated_function
ret = f(_args, *_kwargs)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 193, in write_contents
self.copy_from(f, '"%s"' % table.name, ['"%s"' % c['name'] for c in table.columns])
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 116, in copy_from
columns=columns
psycopg2.extensions.QueryCanceledError: COPY from stdin failed: error in .read() call
CONTEXT: COPY tupdate, line 1

I'm running mysql 5.5, postgresql 9.2 and python 2.7.3.

I've read other posts on this issue and I've tried py-mysql2pgsql 0.1.5 with the same result.

Any help would be greatly appreciated.

Thanks,

Serge

NULL ENUM values causing error

Get the following err when trying to convert a database with a non-defined ENUM:

psycopg2.IntegrityError: new row for relation "debug" violates check constraint "debug_gender_check"
CONTEXT:  COPY debug, line 2: "2

The test table dump:

CREATE TABLE `debug` (
  `index` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `firstname` tinytext,
  `gender` enum('Male','Female') DEFAULT NULL,
  PRIMARY KEY (`index`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `debug` (`index`, `firstname`, `gender`)
VALUES
    (1,'Bluma','Female'),
    (2,'','');

psycopg2.InterfaceError: can't encode unicode string to latin-1

With MySQL-python-1.2.3-0.3.c1.1.el6.i686, python-psycopg2-2.0.14-2.el6.i686, mysql-5.1.71-1.el6.i686, postgresql-8.4.18-1.el6_4.i686 (output to file)

Traceback (most recent call last):
  File "/var/lib/pgsql/py-mysql2pgsql/bin/py-mysql2pgsql", line 38, in <module>
    mysql2pgsql.Mysql2Pgsql(options).convert()
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/mysql2pgsql.py", line 31, in convert
    Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/converter.py", line 31, in convert
    self.writer.write_table(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/__init__.py", line 90, in decorated_function
    return f(*args, **kwargs)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_file_writer.py", line 65, in write_table
    table_sql, serial_key_sql = super(PostgresFileWriter, self).write_table(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 177, in write_table
    primary_keys, serial_key, maxval, columns = self.table_attributes(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 153, in table_attributes
    columns.write('  %s,\n' % self.column_description(column))
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 18, in column_description
    return '"%s" %s' % (column['name'], self.column_type_info(column))
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 107, in column_type_info
    default, column_type = get_type(column)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 40, in get_type
    default = (' DEFAULT %s' % QuotedString(column['default']).getquoted()) if t(column['default']) else None
psycopg2.InterfaceError: can't encode unicode string to latin-1
(py-mysql2pgsql)postgres@luther: py-mysql2pgsql$

bigint to integer conversion

Script convert column:

id bigint(20) NOT NULL AUTO_INCREMENT

into:

id | integer | not null default nextval('market_category_id_seq'::regclass)

UTF8 surrogate pairs will cause the dump to crash

simple reproduce:

try to dump '\xed\xa0\xbd' to postgresql will raise this error:

ERROR:  invalid byte sequence for encoding "UTF8": 0xed 0xa0 0xbd

while this string is not a proper utf-8, it's accepted in both python2 and mysql. python3 fixed though.

for more info, refer to this question in stackoverflow:
http://stackoverflow.com/a/3510831/543492

Purge open Issues and Pull Requests

Hi,

This tool is great, at least better than the ruby equivalent (mysql2pgsql), but I had some issues during a migration that was already fixed on other forkes, and sometimes, resolved the same bug in several forkes.

For example, the empty SQL error:

Do you need some help to close and merge interesting things to prepare a new release ?
I understand that you don't have time anymore to work on this project, I can help you, if you want.

Regards.

Feature Request: Comments

I was curious if it would be possible to include support for table / column comments to be transferred?

If this is already supported, is there anything special I need to do to get those transferred?

psycopg2.ProgrammingError: syntax error at or near

Hello, I've just tried to use py-mysql2pgsql for the first time. I didn't get very far :) I got this error:

psycopg2.ProgrammingError: syntax error at or near "1"
LINE 1: DROP SEQUENCE IF EXISTS 1_products_idItemNum_seq CASCADE;

I don't know anything about Python, but I took the above generated DROP SEQUENCE command and dropped it into PgAdminIII and confirmed it gives the same error. Then, still in PgAdminIII, I modified the command with some double quotes, like so:

DROP SEQUENCE IF EXISTS "1_products_idItemNum_seq" CASCADE;

This time it executed correctly.

Would it be possible to get py-mysql2pgsql to generate quoted names like the above?

Thanks

psycopg2.ProgrammingError: can't execute an empty query

I get this err on the 1st table (dumped the sql that it was trying to execute each time)

STARTING <<<<<<<<<<

START CREATING TABLES
START - CREATING TABLE city
DROP SEQUENCE IF EXISTS "city_id_seq" CASCADE;
CREATE SEQUENCE "city_id_seq" INCREMENT BY 1
NO MAXVALUE NO MINVALUE CACHE 1;
SELECT pg_catalog.setval('"city_id_seq"', 8, true);
DROP TABLE IF EXISTS "city" CASCADE;
CREATE TABLE "city" (
"id" integer DEFAULT nextval('"city_id_seq"'::regclass) NOT NULL,
"name" character varying(32) NOT NULL,
"pathname" character varying(32) NOT NULL,
"latitude" numeric(9, 6) NOT NULL,
"longditude" numeric(9, 6) NOT NULL
)
WITHOUT OIDS;

Traceback (most recent call last):
File "/usr/local/bin/py-mysql2pgsql", line 5, in
pkg_resources.run_script('py-mysql2pgsql==0.1.6', 'py-mysql2pgsql')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/EGG-INFO/scripts/py-mysql2pgsql", line 38, in
mysql2pgsql.Mysql2Pgsql(options).convert()
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/mysql2pgsql.py", line 37, in convert
Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/converter.py", line 32, in convert
self.writer.write_table(table)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/init.py", line 91, in decorated_function
ret = f(_args, *_kwargs)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 155, in write_table
self.execute(sql)
File "/usr/local/lib/python2.7/dist-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 109, in execute
cur.execute(sql, args)
psycopg2.ProgrammingError: can't execute an empty query

Just adding an if len(sql): to the execute function fixes it but no idea where the blank sql would come from.

Mysql connection exception if password is int

Hi!
If we use password as int in config file, your script can't create connection. Because all opts for create connection should be string.
Thanks

enum field with non-ascii values

Traceback (most recent call last):
File "/usr/local/bin/py-mysql2pgsql", line 5, in
pkg_resources.run_script('py-mysql2pgsql==0.1.2', 'py-mysql2pgsql')
File "build/bdist.freebsd-9.0-RELEASE-amd64/egg/pkg_resources.py", line 489, in run_script
File "build/bdist.freebsd-9.0-RELEASE-amd64/egg/pkg_resources.py", line 1207, in run_script
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/EGG-INFO/scripts/py-mysql2pgsql", line 59, in
mysql2pgsql.Mysql2Pgsql(options).convert()
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/mysql2pgsql.py", line 31, in convert
Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/lib/converter.py", line 31, in convert
self.writer.write_table(table)
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/lib/init.py", line 90, in decorated_function
return f(_args, *_kwargs)
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 151, in write_table
table_sql, serial_key_sql = super(PostgresDbWriter, self).write_table(table)
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 179, in write_table
primary_keys, serial_key, maxval, columns = self.table_attributes(table)
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 155, in table_attributes
columns.write(' %s,\n' % self.column_description(column))
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 18, in column_description
return '"%s" %s' % (column['name'], self.column_type_info(column))
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 109, in column_type_info
default, column_type = get_type(column)
File "/usr/local/lib/python2.7/site-packages/py_mysql2pgsql-0.1.2-py2.7.egg/mysql2pgsql/lib/postgres_writer.py", line 40, in get_type
default = (' DEFAULT %s' % QuotedString(column['default']).getquoted()) if t(column['default']) else None
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-5: ordinal not in range(256)

Table causing this is:

create table export_test ( `quality` enum('Эконом','Стандарт','Премиум') DEFAULT 'Эконом' COMMENT 'Качество товаров в категории');

Anothery typo in the README.md file: "py-mysql2pgsql -v -f mysql2pgsql" should be "py-mysql2pgsql -v -f mysql2pgsql.yml"

In the code, I think:

py-mysql2pgsql -v -f mysql2pgsql

should be

py-mysql2pgsql -v -f mysql2pgsql.yml

AttributeError: 'NoneType' object has no attribute 'group'

Traceback (most recent call last):
File "/usr/local/bin/py-mysql2pgsql", line 38, in
mysql2pgsql.Mysql2Pgsql(options).convert()
File "/usr/local/lib/python2.7/dist-packages/mysql2pgsql/mysql2pgsql.py", line 31, in convert
Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
File "/usr/local/lib/python2.7/dist-packages/mysql2pgsql/lib/converter.py", line 22, in convert
tables = [t for t in (t for t in self.reader.tables if t.name not in self.exclude_tables) if not self.only_tables or t.name in self.only_tables]
File "/usr/local/lib/python2.7/dist-packages/mysql2pgsql/lib/converter.py", line 22, in
tables = [t for t in (t for t in self.reader.tables if t.name not in self.exclude_tables) if not self.only_tables or t.name in self.only_tables]
File "/usr/local/lib/python2.7/dist-packages/mysql2pgsql/lib/mysql_reader.py", line 200, in
return (self.Table(self, t[0]) for t in self.db.list_tables())
File "/usr/local/lib/python2.7/dist-packages/mysql2pgsql/lib/mysql_reader.py", line 87, in init
self._load_indexes()
File "/usr/local/lib/python2.7/dist-packages/mysql2pgsql/lib/mysql_reader.py", line 162, in _load_indexes
index['columns'] = [re.search(r'(\w+)', col).group(1) for col in match_data.group(2).split(',')]
AttributeError: 'NoneType' object has no attribute 'group'

Long index row problem.

Recently seen this one:

psycopg2.OperationalError: index row size 3656 exceeds maximum 2712 for index "opins_text"
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

Looks like not just any index that can be created in MySQL could be easily replicated to postgres. Any thoughts on this? I think about detecting such conditions, yelling loud and ignoring index creation.

Inconsistent application of quotes in references leads to case issues

In the following sql generated automatically by py-mysql2pgsql, the key field will not get found as does not have quotes unlike when the field was created.

from http://www.postgresql.org/docs/9.3/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS :

Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. ... If you want to write portable applications you are advised to always quote a particular name or never quote it.)

ALTER TABLE "cic_map" ADD CONSTRAINT "cic_map_CIC_Code_pkey" PRIMARY KEY(CIC_Code);

error runn py-mysql2pgsql

I'm getting the following error running this on a linux box. Any ideas? thanks…..

py-mysql2pgsql -v -f sage3.sql
Traceback (most recent call last):
File "/usr/local/bin/py-mysql2pgsql", line 4, in
import('pkg_resources').run_script('py-mysql2pgsql==0.1.5', 'py-mysql2pgsql')
File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 534, in run_script
File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 1438, in run_script
File "/usr/local/lib64/python2.6/site-packages/py_mysql2pgsql-0.1.5-py2.6.egg/EGG-INFO/scripts/py-mysql2pgsql", line 38, in
mysql2pgsql.Mysql2Pgsql(options).convert()
File "/usr/local/lib64/python2.6/site-packages/py_mysql2pgsql-0.1.5-py2.6.egg/mysql2pgsql/mysql2pgsql.py", line 18, in init
self.file_options = Config(options.file, True).options
File "/usr/local/lib64/python2.6/site-packages/py_mysql2pgsql-0.1.5-py2.6.egg/mysql2pgsql/lib/config.py", line 33, in init
super(Config, self).init(config_file_path)
File "/usr/local/lib64/python2.6/site-packages/py_mysql2pgsql-0.1.5-py2.6.egg/mysql2pgsql/lib/config.py", line 18, in init
self.options = load(open(config_file_path))
File "/usr/local/lib64/python2.6/site-packages/yaml/init.py", line 71, in load
return loader.get_single_data()
File "/usr/local/lib64/python2.6/site-packages/yaml/constructor.py", line 37, in get_single_data
node = self.get_single_node()
File "/usr/local/lib64/python2.6/site-packages/yaml/composer.py", line 36, in get_single_node
document = self.compose_document()
File "/usr/local/lib64/python2.6/site-packages/yaml/composer.py", line 58, in compose_document
self.get_event()
File "/usr/local/lib64/python2.6/site-packages/yaml/parser.py", line 118, in get_event
self.current_event = self.state()
File "/usr/local/lib64/python2.6/site-packages/yaml/parser.py", line 193, in parse_document_end
token = self.peek_token()
File "/usr/local/lib64/python2.6/site-packages/yaml/scanner.py", line 128, in peek_token
self.fetch_more_tokens()
File "/usr/local/lib64/python2.6/site-packages/yaml/scanner.py", line 220, in fetch_more_tokens
return self.fetch_value()
File "/usr/local/lib64/python2.6/site-packages/yaml/scanner.py", line 580, in fetch_value
self.get_mark())
yaml.scanner.ScannerError: mapping values are not allowed here
in "sage3.sql", line 3, column 20

TimeZone usage not documented

I just fixed up installation after TimeZone support merging. However README.rst misses documentation on how this thing works. I'm filing this issue just in case me or @qnub will forget about this.

Parallelism

Hi,
1st, Thank you very much for an amazing tool. it worked like a charm
2nd, it would be great if the migration of the data can be set with an amount of parallelism. so i can copy X tables at once.

What I have done so far is for the big tables i used Enterprise DB migration tools but for all the DDL and small tables used the normal tools.

Thanks again!

AttributeError: OptionParser instance has no attribute 'add_argument'

Below is the error I'm receiving. it turns out that the add_argument function call for OptionParser should be add_option.

Traceback (most recent call last):
File "/usr/bin/py-mysql2pgsql", line 5, in
pkg_resources.run_script('py-mysql2pgsql==0.1.2', 'py-mysql2pgsql')
File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 467, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 1200, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/lib/python2.6/site-packages/py_mysql2pgsql-0.1.2-py2.6.egg/EGG-INFO/scripts/py-mysql2pgsql", line 36, in
parser.add_argument(
AttributeError: OptionParser instance has no attribute 'add_argument'

clang error unknown argument '-mno-fused-madd'

We install mysql2pgsql as part of our boxen installation and recently the installation failed with following error.

Xcode 5.1.1

building '_mysql' extension

creating build/temp.macosx-10.8-intel-2.7

clang -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe -Dversion_info=(1,2,5,'final',1) -D__version__=1.2.5 -I/opt/boxen/homebrew/Cellar/mysql/5.6.10-boxen3/include -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c _mysql.c -o build/temp.macosx-10.8-intel-2.7/_mysql.o -Os -g -fno-strict-aliasing
clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]

clang: note: this will be a hard error (cannot be downgraded to a warning) in the future

Found this and it explained the problem pretty well.

COPY from stdin failed: error in .read() call

Hi,
i'm trying to dump a big mysql database into a postgresql database, and i'm ending up with the following error

  >> py-mysql2pgsql -v -f mysql2pgsql.yml 
  >>>>>>>>>> STARTING <<<<<<<<<<

START CREATING TABLES
  START  - CREATING TABLE PING
  FINISH - CREATING TABLE PING
  START  - CREATING TABLE error
  FINISH - CREATING TABLE error
  START  - CREATING TABLE ping
  FINISH - CREATING TABLE ping
  START  - CREATING TABLE product
  FINISH - CREATING TABLE product
  START  - CREATING TABLE updateRequest
  FINISH - CREATING TABLE updateRequest
DONE CREATING TABLES
START WRITING TABLE DATA
  START  - WRITING DATA TO PING
  8438.17 rows/sec [1440000]    
  FINISH - WRITING DATA TO PING
  START  - WRITING DATA TO error

  FINISH - WRITING DATA TO error
  START  - WRITING DATA TO ping
Traceback (most recent call last):
  File "/usr/bin/py-mysql2pgsql", line 5, in <module>
    pkg_resources.run_script('py-mysql2pgsql==0.1.6', 'py-mysql2pgsql')
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 499, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1239, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/lib/python2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/EGG-INFO/scripts/py-mysql2pgsql", line 38, in <module>
    mysql2pgsql.Mysql2Pgsql(options).convert()
  File "/usr/lib/python2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/mysql2pgsql.py", line 31, in convert
    Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
  File "/usr/lib/python2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/converter.py", line 51, in convert
    self.writer.write_contents(table, self.reader)
  File "/usr/lib/python2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/__init__.py", line 86, in decorated_function
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 193, in write_contents
    self.copy_from(f, '"%s"' % table.name, ['"%s"' % c['name'] for c in table.columns])
  File "/usr/lib/python2.7/site-packages/py_mysql2pgsql-0.1.6-py2.7.egg/mysql2pgsql/lib/postgres_db_writer.py", line 116, in copy_from
    columns=columns
psycopg2.extensions.QueryCanceledError: COPY from stdin failed: error in .read() call
CONTEXT:  COPY ping, line 693

No further information on the postgresql receiving side but :

ERROR:  COPY from stdin failed: error in .read() call
CONTEXT:  COPY ping, line 693
STATEMENT:  COPY "ping"("id","ccVersion","detectedOs","login","number","pingType","productVersion","receivedAt","uuid","version","product_id") FROM stdin WITH DELIMITER AS '      ' NULL AS '\\N'

Any idea ?

Regards,

Olivier.

Abuse of "in" at mysql_reader.py

Hint:

>>> 'varchar' in 'enum("varchar")'
True

If we really need to test if some string starts with another one we should do:

>>> 'enum("varchar")'.startswith('varchar')
False