Migrating Content

There has been a great thread on the CM Professionals mailing list about automated tools for migrating content. When the same topic was discussed a few months ago, the general consensus was that the hope of effortlessly migrating into a new CMS (either from an old CMS or from a static site) is unrealistic. A few people pointed out that this limitation is not necessarily a bad thing, because there is value in evaluating every piece of content as it is moved to the new site. Slurping content out of one website and spitting it into another is likely to simply move the mess rather than achieve the goal of creating a more useful and manageable web resource.

The dialog was re-energized today when a new member introduced himself as working for a company called Vamosa. Automated content migration is what they do. The process that Vamosa practices is not a simple turn-key one. That was reassuring, because anyone who claimed otherwise would be lying. A typical Vamosa project takes roughly 3-4 months. It consists of measuring the gap between the state of the current content and the target system, then creating rules for parsing and mapping the content. Once everything is set up, the system can migrate 10,000 pages per hour. I assume that it takes several tries, because that is my experience even with simple relational database migrations.
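
To make the "rules for parsing and mapping" step a little more concrete, here is a minimal sketch of what such rules might look like. This is my own illustration, not Vamosa's tooling: the rule names, the regular expressions, and the `migrate_page` and `gap_report` helpers are all hypothetical, and real legacy HTML would need a proper parser rather than regexes.

```python
import re

# Hypothetical parsing rules: target field name -> regex with one capture group.
RULES = {
    "title": re.compile(r"<h1[^>]*>(.*?)</h1>", re.S | re.I),
    "body": re.compile(r'<div class="content">(.*?)</div>', re.S | re.I),
}


def migrate_page(html):
    """Apply every rule to one page.

    Returns (record, missing): the extracted target fields and the
    names of the fields whose rule did not match.
    """
    record, missing = {}, []
    for field, pattern in RULES.items():
        match = pattern.search(html)
        if match:
            record[field] = match.group(1).strip()
        else:
            missing.append(field)
    return record, missing


def gap_report(pages):
    """Count, per target field, how many source pages the rules fail on."""
    gaps = {field: 0 for field in RULES}
    for html in pages:
        _, missing = migrate_page(html)
        for field in missing:
            gaps[field] += 1
    return gaps
```

Running something like `gap_report` over the whole source site before the real migration is one way to see why several tries are usually needed: each pass shows which rules still miss pages, and the rules get adjusted until the counts are acceptable.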

This seems like an interesting service to look into. Some legacy systems may be more suitable for this solution than others. For example, if the target is highly structured but the source has no structure and no uniformity of layout on which to base parsing rules, prospects would be dim. However, if the source has uniformity and structure, there would seem to be potential.

Other posters in the dialog had ideas for other tools that would help in content migration. For example, if Microsoft were to make a version of Word that was more of a content entry tool than a desktop publishing/layout tool, there might be hope that good, structured XML content could come out of it. I hear that Information Mapping’s Content Mapper product does just that, although I have not used it yet. Another email talked about the need for tools to plan and manage the content migration process. That sounds interesting too.

Note: If you are a CM Professional, saw a point that you made on the mailing list in this blog post, and want attribution, please email me and I will quote and attribute. I just kept it anonymous for privacy reasons.

One Response to “Migrating Content”

  1. Anonymous says:

    It is common to hear the argument that automated migration of very unstructured content is not practical. This is why Vamosa developed what it calls ‘source classification’.
    Source Classification of content exposes the structure of all the pages so that they can be understood more clearly. It establishes a range of page-type complexity in the source content which will contribute to mapping the existing content to the new structure/templates.
    Vamosa’s Source Classification is analogous to ‘source templates’:
    • All pages are analysed for content, navigation, headers, footers, etc.
    • Pages which have a similar make-up are grouped together
    • The Vamosa product has a default setting which uses a fixed parameter set to give a high-level classification
    • Vamosa source classification parameters can then be tuned to reduce or increase the number of source templates created
    Why is it significant?
    • Each source classification template type is mapped through a Vamosa rule to a ‘target template’
    • The reconciliation process of these derived source templates to the discrete number of target templates is an important part of any automated migration
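
As a rough illustration of the grouping idea described in this comment (and not Vamosa's actual algorithm, which is not described here), pages can be bucketed by a structural signature. In the sketch below the signature is simply the set of tags a page uses, and the hypothetical `with_classes` flag stands in for a tunable parameter: turning it on makes signatures finer and so produces more source templates. All names are assumptions made for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class _TagCollector(HTMLParser):
    """Collects the distinct (tag, class) pairs seen in a page."""

    def __init__(self, with_classes):
        super().__init__()
        self.with_classes = with_classes
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        # Ignore the class attribute unless the finer setting is requested.
        cls = dict(attrs).get("class") if self.with_classes else None
        self.seen.add((tag, cls))


def signature(html, with_classes=False):
    """Return a hashable structural fingerprint for one page."""
    collector = _TagCollector(with_classes)
    collector.feed(html)
    collector.close()
    return frozenset(collector.seen)


def classify(pages, with_classes=False):
    """Group pages (a dict of url -> html) whose signatures are identical.

    Each returned list of URLs plays the role of one 'source template'.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[signature(html, with_classes)].append(url)
    return list(groups.values())
```

Each resulting group would then be mapped by a rule to one of the target templates. Exact-match grouping like this is deliberately crude; a real classifier would presumably tolerate small differences between pages.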
