
Migrate blog from Wordpress to Orchard CMS

As I mentioned before, I'm working on a replacement for this blog. I'm going to migrate to Orchard CMS, mostly to learn that content management system. This has been a pleasant experience so far. Before I can release my new blog I need to move all my content from Wordpress to Orchard CMS. Someone (not me) should really think about writing a module to make this a pleasant journey. I did this manually, because I only expect to do it once.

Export from Wordpress

Wordpress export XML

Wordpress has an export function. You find it in the Tools menu. This is cool, except that the format is some weird kind of RSS that is extended with Wordpress' own XML elements. Fine. Let's see what we can do about this.

<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="WordPress/2.9.2" created="2011-05-24 06:45"-->
<rss version="2.0"
 xmlns:excerpt="http://wordpress.org/export/1.0/excerpt/"
 xmlns:content="http://purl.org/rss/1.0/modules/content/"
 xmlns:wfw="http://wellformedweb.org/CommentAPI/"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:wp="http://wordpress.org/export/1.0/">

<channel>
  <title>Mint</title>
  <link>http://mint.litemedia.se</link>
  <description>building a .NET application</description>
  <pubDate>Tue, 24 May 2011 06:28:55 +0000</pubDate>
  <generator>http://wordpress.org/?v=2.9.2</generator>
  <language>en</language>
  <wp:wxr_version>1.0</wp:wxr_version>
  <wp:base_site_url>http://mint.litemedia.se</wp:base_site_url>
  <wp:base_blog_url>http://mint.litemedia.se</wp:base_blog_url>
  ...
</channel>
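Because the export mixes plain RSS elements with namespaced `wp:` ones, it can help to take stock of the file before transforming anything. The following is a minimal sketch using Python's standard library, not part of my actual migration. It assumes each `item` carries `wp:post_type` and `wp:status` children (the former does not appear in the excerpt above), and the function name `tally_items` and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace URI as declared in the export header (WXR 1.0).
NS = {"wp": "http://wordpress.org/export/1.0/"}


def tally_items(wxr_xml):
    """Count <item> elements by (post type, status)."""
    root = ET.fromstring(wxr_xml)
    counts = Counter()
    for item in root.findall("./channel/item"):
        # Assumption: items have wp:post_type and wp:status children.
        post_type = item.findtext("wp:post_type", default="?", namespaces=NS)
        status = item.findtext("wp:status", default="?", namespaces=NS)
        counts[(post_type, status)] += 1
    return counts


# Tiny made-up export, only to show the shape of the data.
SAMPLE = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.0/">
<channel>
  <item><title>A</title><wp:post_type>post</wp:post_type><wp:status>publish</wp:status></item>
  <item><title>B</title><wp:post_type>post</wp:post_type><wp:status>draft</wp:status></item>
  <item><title>C</title><wp:post_type>page</wp:post_type><wp:status>publish</wp:status></item>
</channel>
</rss>"""

if __name__ == "__main__":
    for (post_type, status), n in sorted(tally_items(SAMPLE).items()):
        print(post_type, status, n)
```

Running something like this on the real file gives a rough idea of how many items will survive a filter on published content.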

This is your whole Wordpress blog exported in one file. You will have drafts, spam comments and even pages in there. All you're probably interested in is published posts and approved comments. If you want to perform a bulk action on your blog data, you should do it now. I'm thinking of changing all the absolute paths from the old blog address to the new blog address with a quick search and replace. I forgot to do that, and now have to go through 180 blog posts manually. Not that I mind very much. I had planned to do that anyway.
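For the record, the search and replace I skipped could be as small as the sketch below. It is a plain string replacement, so it assumes the old address always appears in exactly one spelling. The helper name `rewrite_base_url` is hypothetical, and the new address `http://litemedia.info` is my assumption about where the blog ends up.

```python
from typing import Tuple


def rewrite_base_url(text: str, old_base: str, new_base: str) -> Tuple[str, int]:
    """Return the text with old_base replaced by new_base, plus the number of hits."""
    return text.replace(old_base, new_base), text.count(old_base)


# Made-up fragment of an exported post body.
SAMPLE = (
    '<content:encoded><![CDATA[<a href="http://mint.litemedia.se/2011/05/some-post/">link</a> '
    '<img src="http://mint.litemedia.se/wp-content/uploads/a.png" />]]></content:encoded>'
)

if __name__ == "__main__":
    new_text, hits = rewrite_base_url(
        SAMPLE, "http://mint.litemedia.se", "http://litemedia.info"
    )
    print(hits, "replacements")
    print(new_text)
```

Note that a blanket replacement like this also rewrites the channel's own `link` and base URL elements in the export header, which is probably harmless but worth knowing.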

Import into Orchard CMS

Orchard export XML

Go to the modules gallery, then find and install the Orchard Team Import Export module. As the name of the module surely reveals, it lets you import and export data in Orchard. To find out what kind of XML schema Orchard uses, try writing a couple of blog posts and comments, then export them. It should look something like this:

<!--Exported from Orchard-->
<Orchard>
  <Recipe>
    <Name>Generated by Orchard.ImportExport</Name>
    <Author>Mikael Lundin</Author>
  </Recipe>
  <Data>
    <Comment Id="/Identifier=6066991b882a488da366d1f64e19d36d" Status="Published">
      <CommentPart Author="Mikael Lundin" UserName="Mikael Lundin" Email="myemail@home.se" Status="Approved" CommentDateUtc="2011-05-22T10:54:33Z" CommentText="This is comment number 1. It has no line breaks." CommentedOn="/Route.Slug=what-to-do-before-release" CommentedOnContainer="/Route.Slug=blog" />
      <CommonPart Owner="/User.UserName=Mikael Lundin" CreatedUtc="2011-05-22T10:54:33Z" PublishedUtc="2011-05-22T10:54:33Z" ModifiedUtc="2011-05-22T10:54:33Z" />
      <IdentityPart Identifier="6066991b882a488da366d1f64e19d36d" />
    </Comment>
    <Comment Id="/Identifier=11376d3721144b2ebb51d4da880592f6" Status="Published">
      <CommentPart Author="Mikael Lundin" UserName="Mikael Lundin" Email="myemail@home.se" Status="Approved" CommentDateUtc="2011-05-22T10:55:00Z" CommentText="Here comes comment number two.&#xD;&#xA;It has several line breaks.&#xD;&#xA;&#xD;&#xA;Saluté!" CommentedOn="/Route.Slug=what-to-do-before-release" CommentedOnContainer="/Route.Slug=blog" />
      <CommonPart Owner="/User.UserName=Mikael Lundin" CreatedUtc="2011-05-22T10:55:00Z" PublishedUtc="2011-05-22T10:55:00Z" ModifiedUtc="2011-05-22T10:55:00Z" />
      <IdentityPart Identifier="11376d3721144b2ebb51d4da880592f6" />
    </Comment>
    <BlogPost Id="/Route.Slug=first" Status="Published">
      <TagsPart Tags="" />
      <CommentsPart CommentsShown="true" CommentsActive="true" />
      <RoutePart Title="New blog on litemedia.info" Slug="first" Path="first" />
      <CommonPart Owner="/User.UserName=Mikael Lundin" Container="/Route.Slug=blog" CreatedUtc="2011-04-16T08:18:28Z" PublishedUtc="2011-04-17T19:59:18Z" ModifiedUtc="2011-04-17T19:59:18Z" />
      <BodyPart Text="&lt;p&gt;I will move all the blog posts from mint.litemedia.se to litemedia.info&lt;/p&gt;&#xD;&#xA;&lt;p&gt;I hope this will result in&lt;/p&gt;&#xD;&#xA;&lt;ul&gt;&#xD;&#xA;&lt;li&gt;More readers&lt;/li&gt;&#xD;&#xA;&lt;li&gt;Easier management&lt;/li&gt;&#xD;&#xA;&lt;li&gt;Better design&lt;/li&gt;&#xD;&#xA;&lt;/ul&gt;" />
    </BlogPost>
    <BlogPost Id="/Route.Slug=what-to-do-before-release" Status="Published">
      <TagsPart Tags="tag1,tag2,tag3,tag4" />
      <CommentsPart CommentsShown="true" CommentsActive="true" />
      <RoutePart Title="What to do before release" Slug="what-to-do-before-release" Path="what-to-do-before-release" />
      <CommonPart Owner="/User.UserName=Mikael Lundin" Container="/Route.Slug=blog" CreatedUtc="2011-04-19T17:46:23Z" PublishedUtc="2011-05-23T19:17:38Z" ModifiedUtc="2011-05-23T19:17:38Z" />
      <BodyPart Text="&lt;p&gt;Things that needs to be done before release of the blog&lt;/p&gt;..." />
    </BlogPost>
  </Data>
</Orchard>
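If the export is long, a quick way to see the shape of it is to list which parts and attributes each content type uses. This is only a sketch with Python's standard library. It assumes the layout shown above (content items directly under `Data`, parts as their children), and `summarize_parts` and the trimmed sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def summarize_parts(recipe_xml):
    """Map content type -> part name -> set of attribute names seen."""
    root = ET.fromstring(recipe_xml)
    summary = {}
    for content in root.findall("./Data/*"):
        parts = summary.setdefault(content.tag, {})
        for part in content:
            parts.setdefault(part.tag, set()).update(part.attrib.keys())
    return summary


# Trimmed, made-up recipe in the same layout as the export above.
SAMPLE = """<Orchard>
  <Data>
    <Comment Id="/Identifier=1" Status="Published">
      <CommentPart Author="A" CommentText="Hi" CommentedOn="/Route.Slug=first" />
      <IdentityPart Identifier="1" />
    </Comment>
    <BlogPost Id="/Route.Slug=first" Status="Published">
      <TagsPart Tags="" />
      <RoutePart Title="First" Slug="first" Path="first" />
      <BodyPart Text="&lt;p&gt;Hello&lt;/p&gt;" />
    </BlogPost>
  </Data>
</Orchard>"""

if __name__ == "__main__":
    for content_type, parts in summarize_parts(SAMPLE).items():
        for part, attrs in parts.items():
            print(content_type, part, sorted(attrs))
```

The resulting list is effectively the checklist of attributes the transformation has to produce.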

Now we have our data in Wordpress xml format, and we would like to transform it into Orchard xml format to import it into our new blog. For that we will use my favorite tool.

Transforming the export data into import data

We use XSLT to transform from one XML format into another. We could use ordinary scripting, but XSLT makes it so easy. Here's the script that I used. Excuse me for the VBScript part, but I got lazy and took the simple way out when I had to transform date formats.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:wp="http://wordpress.org/export/1.0/"
 xmlns:msxml="urn:schemas-microsoft-com:xslt" 
 xmlns:vb="#VBCustomScript"
 xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <xsl:output method="xml" indent="yes"/>

    <!-- Yes this is ugly but I didn't have the energy to solve it with Xslt 1.0 -->
    <msxml:script language="VBScript" implements-prefix="vb">
        <![CDATA[
        function gmtToUtc(str)
            gmtToUtc = Split(str)(0) & "T" & Split(str)(1) & "Z"
        end function
        ]]>
    </msxml:script>

    <!-- A function to join values together with a separator.
         Example: join(tags, ',') -->
    <xsl:template name="join">
        <xsl:param name="list" />
        <xsl:param name="separator"/>

        <xsl:for-each select="$list">
            <xsl:value-of select="." />
            <xsl:if test="position() != last()">
                <xsl:value-of select="$separator" />
            </xsl:if>
        </xsl:for-each>
    </xsl:template>

    <!-- Main entry point -->
    <xsl:template match="/">
        <Orchard>
            <Recipe>
                <Name>Transformed export from Wordpress</Name>
                <Author>Mikael Lundin</Author>
            </Recipe>
            <Data>
                <!-- Comments: Only approved ones -->
                <xsl:apply-templates select="//channel/item/wp:comment[wp:comment_approved='1']" />
                <!-- Blog items: Only published ones -->
                <xsl:apply-templates select="//channel/item[wp:status='publish']"/>
            </Data>
        </Orchard>
    </xsl:template>

    <!-- Render a comment -->
    <xsl:template match="wp:comment">
        <!-- Comment publish date -->
        <xsl:variable name="date" select="vb:gmtToUtc(string(wp:comment_date_gmt))" />
        <!-- Parent identifier -->
        <xsl:variable name="parentSlug" select="../wp:post_name" />
        <!-- The comment ID -->
        <xsl:variable name="identity" select="wp:comment_id" />

        <Comment Id="/Identifier={$identity}" Status="Published">
            <CommentPart Status="Approved" CommentDateUtc="{$date}" CommentedOnContainer="/Route.Slug=blog" CommentedOn="/Route.Slug={$parentSlug}">
                <xsl:attribute name="Email"><xsl:value-of select="wp:comment_author_email"/></xsl:attribute>
                <xsl:attribute name="Author"><xsl:value-of select="wp:comment_author"/></xsl:attribute>
                <xsl:attribute name="CommentText"><xsl:value-of select="wp:comment_content/text()"/></xsl:attribute>
            </CommentPart>
            <CommonPart CreatedUtc="{$date}" PublishedUtc="{$date}" ModifiedUtc="{$date}" />
            <IdentityPart Identifier="{$identity}" />
        </Comment>
    </xsl:template>

    <!-- Render Blog item -->
    <xsl:template match="item">
        <!-- Blog post identifier -->
        <xsl:variable name="slug" select="wp:post_name" />
        <!-- Publish date -->
        <xsl:variable name="date" select="vb:gmtToUtc(string(wp:post_date_gmt))" />

        <BlogPost Id="/Route.Slug={$slug}" Status="Published">
            <TagsPart>
                <!-- Render parts -->
                <xsl:attribute name="Tags">
                    <xsl:call-template name="join">
                        <xsl:with-param name="list" select="category[@domain='tag']/@nicename" />
                        <xsl:with-param name="separator" select="','" />
                    </xsl:call-template>
                </xsl:attribute>
            </TagsPart>
            <CommentsPart CommentsShown="true" CommentsActive="true" />
            <RoutePart Slug="{$slug}" Path="{$slug}">
                <xsl:attribute name="Title">
                    <xsl:value-of select="title" />
                </xsl:attribute>
            </RoutePart>
            <CommonPart Owner="/User.UserName=Mikael Lundin" Container="/Route.Slug=blog" CreatedUtc="{$date}" ModifiedUtc="{$date}" PublishedUtc="{$date}" />
            <BodyPart>
                <xsl:attribute name="Text">
                    <xsl:value-of select="content:encoded/text()" />
                </xsl:attribute>
            </BodyPart>
        </BlogPost>
    </xsl:template>
</xsl:stylesheet>
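To make the two helpers in the stylesheet easier to follow, here is what they compute, written as a small Python sketch. It mirrors the VBScript function (split the date on whitespace, glue the halves together with `T` and `Z`) and the `join` template. It assumes the Wordpress dates look like `2011-05-22 10:54:33`, and the Python names are mine, not anything Wordpress or Orchard provides.

```python
def gmt_to_utc(value):
    """Turn 'YYYY-MM-DD HH:MM:SS' (already GMT) into 'YYYY-MM-DDTHH:MM:SSZ'."""
    date, time = value.split()  # same idea as Split(str) in the VBScript helper
    return f"{date}T{time}Z"


def join(values, separator=","):
    """Concatenate values with a separator, like the 'join' template."""
    return separator.join(values)


if __name__ == "__main__":
    print(gmt_to_utc("2011-05-22 10:54:33"))        # 2011-05-22T10:54:33Z
    print(join(["tag1", "tag2", "tag3", "tag4"]))   # tag1,tag2,tag3,tag4
```

Neither helper does any time zone arithmetic. The export already provides GMT values, so only the notation changes.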

Now add the following line at the top of the Wordpress XML file, on line 2, under the XML declaration.

<?xml-stylesheet type="text/xsl" href="import.xslt" ?>

At this point it would be pretty simple to create an Orchard module that imports from Wordpress, but I just want to solve my problem and move on. That is why I open my Wordpress XML file in Internet Explorer and let it do the transformation for me. Press F12 and you will be able to save the whole transformed file to disk.

internet explorer dev tool
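If Internet Explorer is not at hand, the "ordinary scripting" route is also possible. Below is a rough Python sketch of the same transformation using only the standard library. It is not what I ran, and it makes assumptions: the element names follow the Wordpress export format (`wp:post_name`, `wp:comment_date_gmt` and so on), the owner and container values are copied from my Orchard export, and the function names and sample data are invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the Wordpress export header.
NS = {
    "wp": "http://wordpress.org/export/1.0/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}


def utc(value):
    """'2011-05-22 10:54:33' -> '2011-05-22T10:54:33Z' (input assumed to be GMT)."""
    date, time = value.split()
    return f"{date}T{time}Z"


def child_text(node, path):
    """Text of a child element, or '' when it is missing."""
    return node.findtext(path, default="", namespaces=NS)


def transform(wxr_xml, owner="/User.UserName=Mikael Lundin", container="/Route.Slug=blog"):
    """Build an Orchard recipe string from a Wordpress export string."""
    channel = ET.fromstring(wxr_xml).find("channel")
    orchard = ET.Element("Orchard")
    recipe = ET.SubElement(orchard, "Recipe")
    ET.SubElement(recipe, "Name").text = "Transformed export from Wordpress"
    data = ET.SubElement(orchard, "Data")

    # Comments: only approved ones.
    for item in channel.findall("item"):
        for c in item.findall("wp:comment", NS):
            if child_text(c, "wp:comment_approved") != "1":
                continue
            ident = child_text(c, "wp:comment_id")
            when = utc(child_text(c, "wp:comment_date_gmt"))
            comment = ET.SubElement(data, "Comment", {"Id": f"/Identifier={ident}", "Status": "Published"})
            ET.SubElement(comment, "CommentPart", {
                "Status": "Approved",
                "CommentDateUtc": when,
                "CommentedOnContainer": container,
                "CommentedOn": "/Route.Slug=" + child_text(item, "wp:post_name"),
                "Email": child_text(c, "wp:comment_author_email"),
                "Author": child_text(c, "wp:comment_author"),
                "CommentText": child_text(c, "wp:comment_content"),
            })
            ET.SubElement(comment, "CommonPart", {"CreatedUtc": when, "PublishedUtc": when, "ModifiedUtc": when})
            ET.SubElement(comment, "IdentityPart", {"Identifier": ident})

    # Blog items: only published ones.
    for item in channel.findall("item"):
        if child_text(item, "wp:status") != "publish":
            continue
        slug = child_text(item, "wp:post_name")
        when = utc(child_text(item, "wp:post_date_gmt"))
        tags = [c.get("nicename") for c in item.findall("category[@domain='tag']") if c.get("nicename")]
        post = ET.SubElement(data, "BlogPost", {"Id": f"/Route.Slug={slug}", "Status": "Published"})
        ET.SubElement(post, "TagsPart", {"Tags": ",".join(tags)})
        ET.SubElement(post, "CommentsPart", {"CommentsShown": "true", "CommentsActive": "true"})
        ET.SubElement(post, "RoutePart", {"Slug": slug, "Path": slug, "Title": child_text(item, "title")})
        ET.SubElement(post, "CommonPart", {"Owner": owner, "Container": container,
                                            "CreatedUtc": when, "ModifiedUtc": when, "PublishedUtc": when})
        ET.SubElement(post, "BodyPart", {"Text": child_text(item, "content:encoded")})

    return ET.tostring(orchard, encoding="unicode")


# Made-up export: one published post with an approved and a spam comment, plus a draft.
SAMPLE = """<rss version="2.0"
 xmlns:content="http://purl.org/rss/1.0/modules/content/"
 xmlns:wp="http://wordpress.org/export/1.0/">
<channel>
  <item>
    <title>Hello</title>
    <category domain="tag" nicename="tag1"><![CDATA[Tag 1]]></category>
    <category domain="tag" nicename="tag2"><![CDATA[Tag 2]]></category>
    <content:encoded><![CDATA[<p>Body</p>]]></content:encoded>
    <wp:post_date_gmt>2011-04-17 19:59:18</wp:post_date_gmt>
    <wp:post_name>hello</wp:post_name>
    <wp:status>publish</wp:status>
    <wp:comment>
      <wp:comment_id>7</wp:comment_id>
      <wp:comment_author>Ann</wp:comment_author>
      <wp:comment_author_email>ann@example.com</wp:comment_author_email>
      <wp:comment_date_gmt>2011-05-22 10:54:33</wp:comment_date_gmt>
      <wp:comment_content>Nice post</wp:comment_content>
      <wp:comment_approved>1</wp:comment_approved>
    </wp:comment>
    <wp:comment>
      <wp:comment_id>8</wp:comment_id>
      <wp:comment_date_gmt>2011-05-22 11:00:00</wp:comment_date_gmt>
      <wp:comment_content>Buy pills</wp:comment_content>
      <wp:comment_approved>spam</wp:comment_approved>
    </wp:comment>
  </item>
  <item>
    <title>Draft</title>
    <wp:post_name>draft</wp:post_name>
    <wp:status>draft</wp:status>
  </item>
</channel>
</rss>"""

if __name__ == "__main__":
    print(transform(SAMPLE))
```

The script follows the stylesheet's order (comments first, then posts) and lets the XML library take care of escaping the HTML bodies into attribute values.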

Now you can use that file to import your goodies into Orchard. Don't forget to also move wp-content into your media library and change all the media links accordingly. Not very hard, was it?
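Changing the media links can be scripted too. The sketch below rewrites absolute `wp-content` URLs to a path in the new site. The target folder `/Media/Default/wp-content` is purely an assumption about where the files land in the Orchard media library, and `rewrite_media_links` is a hypothetical helper name.

```python
import re

# Hypothetical target folder; adjust to wherever the files end up in Orchard.
NEW_MEDIA_ROOT = "/Media/Default/wp-content"

# Matches the old blog's wp-content root over http or https.
_WP_CONTENT = re.compile(r"https?://mint\.litemedia\.se/wp-content")


def rewrite_media_links(html, new_root=NEW_MEDIA_ROOT):
    """Point links at the old wp-content folder to the new media root."""
    # A function is used as replacement so new_root is inserted literally.
    return _WP_CONTENT.sub(lambda match: new_root, html)


if __name__ == "__main__":
    body = '<img src="http://mint.litemedia.se/wp-content/uploads/2011/05/a.png" />'
    print(rewrite_media_links(body))
```

Links that do not point into `wp-content` are left alone, so regular post links still need their own pass.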
