This is a migration guide I wrote solely for my own benefit. Although I don’t think I will need it a second time, writing down detailed reproduction steps and revising them to reflect the actual steps is a basic skill of a support engineer (by training).
The Problems
The basic export/import from MovableType to WordPress works and is pretty straightforward. Only a few minor issues prevent a smooth transition.
Problem 1: URL change
My MovableType uses a naming system I created, not the default. The Individual Entry Archive page URL was:
<$MTArchiveDate format="%Y%m%d"$>_<$MTEntryTitle dirify="1"$>.htm
After exporting, the URL is built from the MovableType base name, which is shortened, rather than from the full title. For example, the URL:
http://home.wangjianshuo.com/archives/20060127_long_vacation_of_spring_festival_comes.htm
becomes
http://home.wangjianshuo.com/archives/20060127_long_vacation.htm
The solution:
Unlike most of the solutions found on the Internet, I am going to change the base_name in the export file itself. To do that, I will edit the MTOS-4.38-en/lib/MT/ImportExport.pm file and change the line
BASENAME:<$MTEntryTitle dirify="1"$>
to
BASENAME: <$MTArchiveDate format="%Y%m%d"$>_<$MTEntryTitle dirify="1"$>
This should solve the problem: WordPress imports each post with a base name that preserves the old URL.
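To make the target base name concrete, here is a minimal Python sketch of what the modified BASENAME line should produce. The dirify function below is only a simplified stand-in for MovableType's dirify="1" filter (an assumption, not its real implementation), and the post title is inferred from the example URL above.

```python
import re
from datetime import date


def dirify(title: str) -> str:
    """Simplified approximation of MovableType's dirify="1" filter (assumption)."""
    s = title.lower()
    s = re.sub(r"[^\w\s]", "", s)          # drop punctuation
    return re.sub(r"\s+", "_", s.strip())  # whitespace runs become one underscore


def basename(entry_date: date, title: str) -> str:
    """Date prefix plus dirified title, matching the old archive file names."""
    return f"{entry_date:%Y%m%d}_{dirify(title)}"


# Title inferred from the example URL in this post.
print(basename(date(2006, 1, 27), "Long Vacation of Spring Festival Comes"))
# 20060127_long_vacation_of_spring_festival_comes
```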
Dash Problem
This solves the problem for the old posts imported from MovableType. For new posts, the problem is the "-". WordPress uses a dash instead of an underscore to replace non-alphabetic characters. I just need to go to the wp-includes/formatting.php file and change every dash to an underscore in the function sanitize_title_with_dashes. This solves the problem for future posts and keeps them consistent with the older ones.
Update July 25, 2012
Be sure to comment out the following line:
preg_replace('|-+|', '_', $title);
This is because WordPress leaves an existing "-" in a title as it is, and replacing it with an underscore breaks the URLs of many earlier articles.
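The intended behavior after this update can be sketched in Python. This is a hypothetical, much simplified slug function, not the code in wp-includes/formatting.php: whitespace turns into underscores, while hyphens already present in the title are left untouched.

```python
import re


def underscore_slug(title: str) -> str:
    """Hypothetical simplified slug: spaces become "_", existing "-" is kept."""
    s = title.lower()
    s = re.sub(r"[^a-z0-9\s_-]", "", s)  # keep letters, digits, whitespace, "_" and "-"
    s = re.sub(r"\s+", "_", s.strip())   # whitespace runs become one underscore
    return s                             # no step rewrites "-" into "_"


print(underscore_slug("State-of-the-art Review"))
# state-of-the-art_review
```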
Problem 2: Encoding difference
The default encoding of MovableType was ISO 8859-1, and WordPress uses UTF-8 (the right choice). The steps in the migration plan solved the problem. Without them, the problem I met was that the content after a special character such as code x92, which stands in for a single quote ', was cut off.
WordPress uses UTF-8 as the default encoding. So if your MT blog uses ISO 8859-1 (Latin-1) to encode posts, convert the posts to UTF-8 before importing, to ensure that all characters display properly.
On *nix and OS X you can use the iconv program to convert your import.txt file:
$ iconv -f ISO-8859-1 -t UTF-8 import.txt > import_new.txt
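Where iconv is not available, the same conversion can be sketched in Python. This is only an illustration of the step above; the function names are made up for this example, and the file names are the ones used in the iconv command.

```python
from pathlib import Path


def latin1_bytes_to_utf8(data: bytes) -> bytes:
    """Reinterpret ISO-8859-1 bytes as text and re-encode them as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """File-level wrapper, equivalent in spirit to the iconv command above."""
    Path(dst).write_bytes(latin1_bytes_to_utf8(Path(src).read_bytes()))


# Usage (file names taken from the iconv example):
# convert_file("import.txt", "import_new.txt")
```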
After I did the conversion, I went the extra mile and used the following commands in vim to change all the annoying x92, x93, and x95 codes to their proper characters:
:%s/[|]/'/g (119 and 1623 replaced)
:%s/[|]/"/g (979 and 897 replaced)
:%s//o /g (334 replaced)
:%s/[|]/-/g (323 and 264 replaced)
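The same clean-up can be sketched in Python. The post names only x92, x93, and x95; pairing the full range x91 to x97 with these replacements is an assumption based on the Windows-1252 layout (curly quotes, bullet, dashes). It also assumes the file has already gone through the ISO-8859-1 to UTF-8 conversion, so the stray bytes appear as the control characters U+0091 to U+0097.

```python
# Assumed mapping, modeled on the Windows-1252 meaning of bytes 0x91-0x97.
C1_FIXES = {
    "\x91": "'", "\x92": "'",   # curly single quotes -> straight quote
    "\x93": '"', "\x94": '"',   # curly double quotes -> straight quote
    "\x95": "o ",               # bullet -> "o " as in the vim command above
    "\x96": "-", "\x97": "-",   # en dash and em dash -> hyphen
}


def fix_c1(text: str) -> str:
    """Replace leftover C1 control characters with plain ASCII equivalents."""
    return text.translate(str.maketrans(C1_FIXES))


print(fix_c1("It\x92s \x93fine\x94 \x96 ok"))
# It's "fine" - ok
```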
If the original MT blog uses UTF-8, none of this is a problem, although the exported file is not directly readable in a text editor on a Mac.
Problem 3: Convert Line Breaks
I use a blank line (two line breaks) to separate paragraphs, but after the import into WordPress it becomes a single line break, and the two paragraphs run together.
It turned out that bug 16147 describes exactly this problem and fixes it. Just go to the importer.php file and remove the following line:
if( !empty($line) )
Note that because WordPress loads importer plugins automatically, /wp-content/plugins/movabletype-importer/movabletype-importer.php does not exist in the downloaded package.
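Why that check collapses paragraphs can be shown with a small Python illustration. This is not the importer's code, only a sketch of the effect of skipping empty lines while collecting a post body; the function name and sample text are made up.

```python
def join_body(lines, skip_empty):
    """Collect body lines, optionally dropping empty ones (the buggy behavior)."""
    kept = []
    for line in lines:
        if skip_empty and not line:
            continue  # the blank line that separated two paragraphs is lost here
        kept.append(line)
    return "\n".join(kept)


body = ["First paragraph.", "", "Second paragraph."]
print(repr(join_body(body, skip_empty=True)))   # 'First paragraph.\nSecond paragraph.'
print(repr(join_body(body, skip_empty=False)))  # 'First paragraph.\n\nSecond paragraph.'
```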
Problem 4: The Chinese Title
Using the sanitized title as part of the URL is a good way to keep it unique, but Chinese titles cause problems. You cannot just use the encoded Chinese title the way WordPress does, because it results in a huge number of % signs and digits in the URL. MovableType's original, less considered approach actually worked very well by just taking a few letters such as a or e out of the encoded title, but reproducing it needs some research.
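To see how quickly an encoded Chinese title grows, here is a short Python sketch using the standard percent-encoding from urllib. The two-character title is just an example chosen for this illustration.

```python
from urllib.parse import quote

title = "春节"        # example title, two Chinese characters
slug = quote(title)   # percent-encode the UTF-8 bytes

print(slug)                    # %E6%98%A5%E8%8A%82
print(len(title), len(slug))   # 2 18
```

Each Chinese character takes three UTF-8 bytes, and each byte becomes three URL characters, so the slug is nine times as long as the title.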
Configuration
After the migration, there is some configuration work to do. Going through the admin tabs one by one gives a good idea of what is needed. Here is an outline:
1. Configure the upload file folder.
2. The URL slug: use %year%%monthnum%%day%_%postname%.htm
3. Enable XML-RPC.
4. Change display name for default user.
Done
That should be all I have to do. I did some quick research and fixed the problems quickly. I am going to do the actual migration next weekend. Then you will see a brand-new blog.