Saturday, June 11, 2011

Migrating a database from Microsoft Access to MySQL



There are basically four ways to migrate a database from Access to MySQL:
  1. Use a MySQL (or other) utility program to pull the structure and data out of the .mdb file.
  2. Use Access to push the structure and data into a MySQL database via ODBC.
  3. Create the new empty MySQL database and its table structures manually with hand-coding; export the data from Access to a text file; import from the text file to MySQL.
  4. Use a commercial converter program.

1) Pull the structure and data out of an .mdb file

Windows

I found the now-deprecated MySQL Migration Toolkit to be very helpful for migrating about 15 databases from Access to MySQL. The version I had was for Windows. After I changed a couple of global settings in Access and a couple of specific settings in the .mdb file to be migrated, the Migration Wizard read from the .mdb file, created corresponding table definitions using MySQL data types equivalent to the Access ones, and transferred all the table data.
Although it was nearly 100% unsuccessful at migrating Access Queries to MySQL Views (I did that by manually copying the SQL code out of Access and hand-editing it), the Migration Toolkit's ability to migrate the tables saved time and was a lot better than ho help at all. Unfortunately, the Migration Toolkit has been discontinued with no replacement available yet in MySQL Workbench, so right now "no help at all" is what's available, but that could change at any time.  
If you are still using an old version of MySQL and its GUI Tools package that has the Migration Toolkit, I'd suggest using it to do your migrations before upgrading to Workbench.  

Linux

The MDB Tools package for Linux (one Ubuntu version) has programs to extract table structures and data directly from an Access .mdb file. I tested the programs but haven't used them for a migration.
mdb-export - Export data in an MDB database table to CSV format.
mdb-schema - Generate table creation SQL code. Output formats: access, sybase, oracle, postgres, mysql.
mdb-sql    - Interactive SQL interface to MDB Tools.
mdb-tables - Get listing of tables in an MDB database.
mdb-ver    - Return the format of a given MDB database.
The mdb-schema program has an output option called "mysql" that isn't mentioned in the documentation. That's the one to use. The command line version of MDB Tools worked for me without errors; I also have an Ubuntu Gnome GUI version that only crashes.

2) Use Access to push the structure and data into a MySQL database

Access-related documentation and help refer often to how an ODBC connection allows you to "access a data source" and query it. It took me a while to realize that it allows both read and write access to the data source, which means it gives you the ability to push the data into a second database.
The advantages of this method are that you don't have to manually create the table definition with SQL (as with method 3, below), and you don't have to transfer the data manually.
The drawbacks are: 1) It's complicated for the relatively small amount of automation it achieves. Nonetheless, it's satisfying after the last mouse click to see your database magically appear in MySQL. 2) Access doesn't know the MySQL-specific data types, so the generic SQL ones it chooses might or might not be the exact best choices. If you disagree with the ones chosen, you don't have a script that you can revise and rerun. You need to use an ALTER TABLE command to edit the definitions in-place in MySQL.
These instructions were created in Windows XP SP3. Vista or 7 might have different names for things.
  1. Make sure you have the MySQL ODBC driver installed.
     
  2. Log into MySQL (I'll assume as the "root" user), and create the new database name:

    mysql -uroot -p
    CREATE DATABASE yournewdatabasename;


    That's the last you'll see of MySQL for a while...
     
  3. Click start > Control Panel > Administrative Tools > Data Sources (ODBC) to start the process of creating an ODBC data source (DSN = Data Source Name) containing the information that Access needs for connecting to your MySQL server.
     
  4. Click the File DSN tab.

    I've seen tutorials showing steps to create a User DSN and a System DSN, but because I am the only user who will use this connection, I chose to use a File DSN instead. The reason is that the data for a File DSN is stored in a file, but the others are stored in the Windows registry. A while ago, I created a User DSN. I recently uninstalled the old version of the MySQL ODBC driver so I could install the new one. Now Windows refuses to delete my old User DSN, telling me to "please" reinstall the old ODBC driver! I'm left with obsolete connection data in my registry that I'll probably never get rid of. To delete a File DSN, Windows just deletes the file that stores the connection data. If I create a new User or System DSN now, what are the chances I'll remember to delete all my Data Sources before installing the next version of the driver? Zero.
     
  5. Click Add.
     
  6. In the list of drivers, select the MySQL ODBC driver. For me, it's MySQL ODBC 5.1 Driver. Click Next.
     
  7. In the Create New Data Source box, click Browse. Navigate to \My Documents\. We want the file to be written to a folder only accessible by your Windows user. In My Documents, there should be a folder called "My Data Sources". Select that. If that folder doesn't exist, you can either try creating it, or select any other convenient folder. Then enter the file name. We can use something like mysqlodbc51. Click Save, then Next, then Finish. Now the MySQL driver itself will ask for some input.
     
  8. Assuming your MySQL server is on your local PC, Server = localhost. The standard and usual MySQL port = 3306. User = root (because it has privileges to do anything we need to do). Password = (your MySQL root password). As soon as you've entered the root password, you'll find that there are entries available in the Database box, but it's not necessary to select a database now.
     
  9. Click Test. It should say "Connection successful".
     
  10. The Details box shows lots of options you don't need to change.
     
  11. Click OK, then OK.
     
  12. We've created a generic ODBC connection to MySQL using the root user. You could create different connections for different users or databases, but this will do for our purpose. You'll be able to reuse this connection for migrating multiple databases.
     
  13. Open in Access the .mdb file to be migrated. Find the table to be migrated (must do one table at a time). Right-click on it and select Export...
     
  14. In "Save as type", select ODBC Databases() (at bottom of list).
     
  15. In the Export dialog, enter the name you want the table to have in MySQL. Click OK.
     
  16. In the next dialog (Select Data Source), select the File Data Source tab and navigate to the folder where you saved your new Data Source file earlier (\My Documents\My Data Sources\).
     
  17. Select the data source you created previously. Click OK.
     
  18. In the MySQL Connector/ODBC dialog, enter your MySQL root password.
     
  19. The new database you created in Step 2 should appear in the list of databases. Select it.
     
  20. When you click OK, the table and its data will be magically exported into the MySQL copy of the database. (Sometimes magic takes work.)
     
  21. You can confirm success by going back to your open MySQL Command Line Client window and typing these SQL commands (the case of MySQL keywords is not significant, but it is conventional to show them in upper case):

    USE yournewdatabasename;
    SHOW TABLES;
    DESCRIBE yourmigratedtablename;
    SELECT * FROM yourmigratedtablename LIMIT 10;

     
  22. There is no way to automatically migrate Queries, Forms, Reports, Macros, or Modules.
     
  23. I migrate queries manually, without converting them to MySQL Views:
     
  24. Create a folder somewhere in My Documents, named after the database you are migrating. I created a master folder for all databases, with individual database-named folders inside it. Each subfolder stores the set of .sql files (migrated queries) for one database.
     
  25. In Access, click the Queries tab, then the query you want to migrate. Then click the Design button.
     
  26. When the QBE grid window opens, click the View > SQL View menu item or the menu item in its corresponding button (farthest left button in my toolbar layout).
     
  27. In the database folder from Step 24, create a new text file named after the query you are migrating, with an extension of ".sql".
     
  28. Copy the SQL code out of the SQL View window, and paste it into the new .sql file. Save the file.
     
  29. You can now run this query using the MySQL batch processing command. MySQL needs to know the exact path to the file. You can do this either by a) using the cd command (change directory, in both Linux and Windows) to switch to the folder containing your .sql file before you launch the MySQL Command Line Client, or b) providing the full path to the .sql file on the line below; in Windows, use forward slashes instead of the usual backslashes to separate the parts of the path:

    SOURCE C:/Documents and Settings/Owner/My Documents/MyDBQueries/dbname/test.sql;
     
  30. If you get errors, it is because the query used something that was Access-specific and needs revision to the MySQL dialect of SQL.
Once I got into the routine of it, my query migration method became easier to do than to describe. It seems cumbersome, but I find it easier to deal with queries as text files than it was in the forced layout and storage method in Access. I can put comments in the code, revise a query by editing its text file, create a new query from an old one by copying and revising the file, and look at how I did something in a query in a different database without opening the database and navigating to its queries. This is my query template:
/* 07-14-2010  (last modified date) 
Description what it does.
Notes:

*/
WARNINGS;
USE databasename;

-- This is an alternative comment format.

SQL code goes here;

3) Do the migration with manual hand-coding and data export/import

This option certainly gives you the most control, and it has the advantage that if the export/import doesn't go as expected, you can simply revise your script or the data format, and try again, as many times as needed.
  1. Use the first script in the "Example .sql files" section below to create a template script containing the SQL commands for creating your database and its tables.
     
  2. Open your database table in Access Design View so you can refer to its field (column) data type definitions.
     
  3. Open the MySQL documentation page about data types so you can determine the MySQL equivalents of the Access ones.
     
  4. Do the data type translations to customize your template script for this particular database and tables.
     
  5. Run the .sql script in MySQL to create the database and tables.
     
  6. From Access, export each table's data into a CSV or tab-delimited text file.
     
  7. In MySQL, use the LOAD DATA INFILE command (in an .sql script file so you can revise it if necessary) to read the data into each table. It has features that allow you to set fields in a different order than they appear in the input file, or omit fields, and to perform transformations on the input data.
     
  8. If it didn't go as planned, revise your table-creation and/or LOAD DATA INFILE scripts as needed, and try again until it works.
     
  9. Queries must be migrated manually using the steps starting at #22 in Section 2, above, or by some alternative method.
     
  10. That's "all"!

4) Use a commercial migration toolkit

It looks like there are quite a few companies offering migration solutions for sale. I haven't tried any of them.

Example .sql files

In addition to their purpose of defining and populating a database with one table, these example .sql files also demonstrate the following:
  • Each SQL command must be terminated with a semicolon.
  • Whitespace and line breaks are not significant, so you can format a command in a way that makes it easier to read.
  • A single line of text that begins with two dashes is a comment.
  • You can also use C-style for single or multi-line comments: /* comment */
1) The first file creates a database with one table:
-- ----------------------------------------------------------------------
-- MySQL Migration Toolkit (manually edited)
-- SQL Create Script
-- ----------------------------------------------------------------------
-- Automatically show all warnings and errors from now on.
WARNINGS;

SET FOREIGN_KEY_CHECKS = 0;

-- DROP DATABASE IF EXISTS `broadcast`;
CREATE DATABASE IF NOT EXISTS `broadcast`
  CHARACTER SET latin1 COLLATE latin1_swedish_ci;
USE `broadcast`;
-- -------------------------------------
-- Tables

DROP TABLE IF EXISTS `broadcast`.`stations`; 
CREATE TABLE `broadcast`.`stations` 
(
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `notheard` CHAR(1) NULL,
  `approx` CHAR(1) NULL,
  `freq` DOUBLE NULL,
  `band` CHAR(2) NULL,
  `call` VARCHAR(6) NULL,
  `time` TINYINT UNSIGNED NULL,
  `city` VARCHAR(25) NULL,
  `network` VARCHAR(10) NULL,
  `notes` VARCHAR(80) NULL,
  PRIMARY KEY (id)
)
ENGINE = INNODB;

SET FOREIGN_KEY_CHECKS = 1;

-- ----------------------------------------------------------------------
-- EOF
2) The second file (which could be made part of the first file) inserts 4 rows of data into the table:
WARNINGS;

-- Save old setting, then disable foreign key checks
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;

INSERT INTO `broadcast`.`stations`
  (`notheard`, `approx`, `freq`, `band`, `call`, `time`, `city`, `network`, `notes`)
VALUES 
  (NULL, NULL, 640.0, 'am', 'KFI', 4, 'los angeles', NULL, 'talk'),
  (NULL, NULL, 810.0, 'am', 'KGO', 4, 'san francisco', 'ABC', 'talk'),
  (NULL, NULL, 1160.0, 'am', 'KSL', 4, 'salt lake city', 'CBS', 'talk,RadioMysteryTheater10pm'),
  ('N', NULL, 1070.0, 'am', 'KNX', NULL, 'los angeles', NULL, 'news, talk?');

-- Restore foreign key checks to whatever value it had before starting.
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;

-- End of script
3) This shows an alternate method of populating the table. Microsoft Access exported the above data into a CSV (comma separated values) text file like this. I manually added the leftmost field (a primary key field called `id` with the values 1,2,3,4) to illustrate a point later:
1,,,640.00,"am","KFI",4,"los angeles",,"talk"
2,,,810.00,"am","KGO",4,"san francisco","ABC","talk"
3,,,1160.00,"am","KSL",4,"salt lake city","CBS","talk,RadioMysteryTheater10pm"
4,"N",,1070.00,"am","KNX",,"los angeles",,"news, talk?"

As you can see, Access writes NULL values as "nothing" between the commas. MySQL can't interpret this format. It needs to find something there that indicates a null value. It can be either \N or NULL. So I had to edit the text file to turn it into this:
1,\N,\N,640.00,"am","KFI",4,"los angeles",\N,"talk"
2,\N,\N,810.00,"am","KGO",4,"san francisco","ABC","talk"
3,\N,\N,1160.00,"am","KSL",4,"salt lake city","CBS","talk,RadioMysteryTheater10pm"
4,"N",\N,1070.00,"am","KNX",\N,"los angeles",\N,"news, talk?"

Given the above input file, the following script loads the data into the table. Notice that database, table, and field names are enclosed in the standard MySQL backtick delimiters, while text variables are enclosed in ordinary single quotes:
WARNINGS;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;

-- This is the command that reads the data from the file.
LOAD DATA LOCAL INFILE 'stations.txt'
INTO TABLE `broadcast`.`stations`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\r\n'
(@ignoreid,`notheard`, `approx`, `freq`, `band`, `call`, `time`, `city`, `network`, `notes`);

SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;

The first parameter in the field list, @ignoreid, is a variable rather than a field name. MySQL will read the first value from each line into that variable, but I don't instruct it to do anything with it; thus, the first value on each line is ignored. That's because it's an auto-increment primary key field in the table. Rather than create the possibility of a conflict with rows already in the table, I ignore the field, allowing MySQL to assign a new sequential value for that field for the incoming rows. That wouldn't really be a necessary precaution. I did it to demonstrate the feature. You can use the variable in other ways, too, such as perform a calculation with it and assign the result to a field of the table.

0 comments:

Post a Comment