I just published my CMS Selection Workshop handout on Scribd. The handout contains:
Examples of how to specify content types
Example usage scenarios
An example RFP table of contents
An example vendor demo agenda
This is the second installment of a series of articles on content management assessments.
Nearly all content management systems have flexible repositories that allow you to control how content is structured. This is important because every organization's content is different, just as every website is different. But this flexibility can also be a liability because it creates opportunity for bad design as well as good design. Content can be under-structured, over-structured, or wrongly structured. Asking the right questions will help you assess how effectively your content was modeled.
If your content contributors spend a lot of time formatting, it may mean that your content is not structured enough. Look out for "formatting patterns" where a content contributor repeats a format pattern to give the illusion of structure. For example, if the by-line and date-lines of articles are stored in the main body and distinguished by a particular font treatment, those should probably be fielded elements. The risk of letting this go is that it reduces flexibility. The only way to change how a by-line looks is to go into each article and change the formatting.
While structure is generally a good thing, too much can be stifling to the overall goal of content: communication. Most content contributors need a blank palette to rearrange and massage information until it tells the right story. Filling out endless form fields breaks that flow and never allows the creative process to take root. A good example is forcing a user to enter the body of a semi-structured content asset into a series of paragraph fields rather than one rich text field. A series of paragraph fields makes it hard for the author to re-arrange content and split and merge paragraphs. A good clue that your content entry interface is overly cumbersome is if content contributors wait until the very end of the editorial process to enter content into the CMS.
On one CMS selection, I received a requirement for a global search and replace. The reason for this requirement was a recent episode where the organization's phone number changed and they needed to update every press release because the contact number was embedded in the body field below the rest of the article. Not everything that a visitor sees on the page has to be entered by a content contributor. Some of the page can be hard-coded into display templates. Other parts of the page can be managed as global content components. In this case, it would have been better to embed the phone number in the press release display template below the body copy. If the phone number was likely to change frequently, a global content component would allow a content contributor to edit this information in one place and without the assistance of a template developer.
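The press release case can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not any particular CMS's API: `GLOBAL_COMPONENTS`, `PRESS_RELEASE_TEMPLATE`, and `render_press_release` are hypothetical names, the phone numbers are made up, and Python's standard `string.Template` stands in for a display template.

```python
from string import Template

# Hypothetical global content components: edited in one place,
# shared by every page that references them.
GLOBAL_COMPONENTS = {"press_contact_phone": "555-0100"}

# Hypothetical display template for a press release. The contact line
# lives in the template, below the body copy, not inside each article.
PRESS_RELEASE_TEMPLATE = Template(
    "$headline\n\n$body\n\nPress contact: $press_contact_phone"
)


def render_press_release(release: dict) -> str:
    """Merge one release's own fields with the shared components at render time."""
    return PRESS_RELEASE_TEMPLATE.substitute(release, **GLOBAL_COMPONENTS)


releases = [
    {"headline": "New office opens", "body": "We have moved downtown."},
    {"headline": "Quarterly results", "body": "Revenue grew this quarter."},
]

# The phone number changes: one edit, and every release picks it up.
GLOBAL_COMPONENTS["press_contact_phone"] = "555-0199"
pages = [render_press_release(r) for r in releases]
```

Because the number never enters the body field, no global search and replace is needed when it changes.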
A content management system is not doing its job if contributors are not effectively reusing content. The classic example is an article list and detail page. If content is structured properly, the article headline exists in only one place: in the article. The list page queries the articles and lists their headlines. If the headline exists in two places, it needs to be managed in two places. Now, this is an obvious example and it is shown in nearly every CMS demo that I have seen. A less obvious example is a promotional element: some block of display that promotes some messaging or a product. You want to manage those as components that can be re-used in different places on the site. I call this keeping your content DRY. With better reuse, you can manage less content and still support the same visitor experience. The downside to reuse is that it makes specialization hard: you can't change the wording in one place without affecting all other occurrences of that content. You need to make decisions about whether specializing a piece of content is worth the additional management overhead.
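To make the list and detail example concrete, here is a small sketch. It assumes an in-memory list in place of a real repository, and the `Article` type, `render_list`, and `render_detail` are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Article:
    slug: str
    headline: str
    body: str


# Hypothetical in-memory repository standing in for the CMS content store.
ARTICLES = [
    Article("launch", "Product launch announced", "Full story text."),
    Article("hiring", "We are hiring editors", "Full story text."),
]


def render_detail(article: Article) -> str:
    """Detail page: shows the headline and body of one article."""
    return f"{article.headline}\n{article.body}"


def render_list(articles: list[Article]) -> str:
    """List page: queries the articles and stores no headlines of its own."""
    return "\n".join(f"- {a.headline} (/articles/{a.slug})" for a in articles)


# The headline is edited once, on the article itself.
ARTICLES[0].headline = "Product launch delayed"
listing = render_list(ARTICLES)
detail = render_detail(ARTICLES[0])
```

The same structure also shows the trade-off: there is no way to give the list page its own wording for one headline without adding a second field to manage.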
A hallmark of an un-maintained content management solution is when content contributors start to get creative and misuse content elements to achieve the behavior that they want. Deane Barker has an excellent presentation on this called "Just Put That in the Zip Code Field". This warning sign means that the requirements for the solution have changed but the solution has not adapted. It is a big issue when your content contributors care about the website more than the support team does. On the plus side, this means that you have established ownership over the content. On the negative side, the organization as a whole is either unwilling or incapable of supporting that ownership. If you see these signs, you need to get in gear and honor content contributors' commitment by taking better care of the solution.
The good news is that most content management systems will allow you to correct content modeling inefficiencies, but with certain limitations. First, content management systems tend to have different strategies for managing content (page based or object based), and the extent of available data types varies from product to product (again, check out Deane's presentation). Second, going from more structure to less is easier than going from less structure to more. You can automatically concatenate multiple fields into one; it is harder to arbitrarily split large unstructured elements into multiple structured elements. Third, some content management systems have better utilities for doing bulk content transformations than others.
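The asymmetry between merging and splitting can be illustrated with a short sketch. Both functions are hypothetical, and the splitting rule (a first line that starts with "By ") is an assumed convention chosen only for this example; real content rarely follows one rule so consistently.

```python
from typing import Optional


def merge_fields(paragraphs: list[str]) -> str:
    """More structure to less: joining fields is mechanical and always works."""
    return "\n\n".join(paragraphs)


def split_body(body: str) -> dict[str, Optional[str]]:
    """Less structure to more: a heuristic guess that can fail.

    Assumes the by-line, if present, is the first line and starts with "By ".
    """
    first_line, _, rest = body.partition("\n")
    if first_line.startswith("By "):
        return {"byline": first_line[3:].strip(), "body": rest.strip()}
    # No recognizable pattern: leave the body alone and flag it for a person.
    return {"byline": None, "body": body}


merged = merge_fields(["First paragraph.", "Second paragraph."])
clean = split_body("By Jane Doe\nThe article text.")
messy = split_body("by jane doe, staff writer. The article text.")
```

Every variation in how contributors formatted the by-line needs its own rule or a manual pass, which is why the second direction costs so much more than the first.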
As a quick little exercise, try to prototype an alternative content model and ask your content contributors to give feedback. They may feel like they have a whole new lease on their content.
My general advice for content modeling is to structure things as much as you can without annoying your editors. More structure means more re-usability because presentation templates have more control over what to put where. A good example is names of people. If you have one field for name, you can't show the name in alternative ways (such as "Joe Smith" and "Smith, Joe") and you can't do things like sort by last name. It's better to put this information into two fields.
So, you ask, what do you do when a person has one name as in Pink, Madonna, or Unknown? If you have a person with one name, use the last name field. Otherwise, when you sort by last name and then first name, all the one-namers come to the top. This actually came up on two different projects in the last 6 months so I figured I would pass it on.
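A small sketch of the two-field approach, including the one-name case. The `Person` type and its method names are hypothetical; the point is only that the single name goes in the last name field so that it sorts alongside everyone else.

```python
from dataclasses import dataclass


@dataclass
class Person:
    last_name: str
    first_name: str = ""  # left empty for people with a single name

    def display(self) -> str:
        """'Joe Smith' style; a single name is shown on its own."""
        return f"{self.first_name} {self.last_name}".strip()

    def sortable(self) -> str:
        """'Smith, Joe' style; no stray comma for a single name."""
        if self.first_name:
            return f"{self.last_name}, {self.first_name}"
        return self.last_name


def by_last_then_first(person: Person) -> tuple[str, str]:
    return (person.last_name.lower(), person.first_name.lower())


people = [Person("Smith", "Joe"), Person("Madonna"), Person("Adams", "Ann")]
names = [p.sortable() for p in sorted(people, key=by_last_then_first)]

# Counter-example: the single name entered in the first name field instead.
misfiled = [Person("Smith", "Joe"), Person("", "Madonna"), Person("Adams", "Ann")]
misfiled_first = sorted(misfiled, key=by_last_then_first)[0]
```

With the name in the last name field, Madonna sorts between Adams and Smith. In the counter-example the empty last name sorts ahead of everything, so the one-namers come to the top.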
A few days ago I read Deane Barker's excellent post Editors Live in the Holes (go ahead and read the post and then come back) and I have been thinking about it ever since. I have had the same experience several times and it is a good reminder for developers to pay special attention to configuring and testing the rich text editor. As Deane points out, it is too easy for developers to disregard "the holes" as a contributor problem, not a system problem.
To get it right, the holes need to be jointly owned by the designers, developers, and content contributors. Designers need to design for flexibility. Developers need to do everything they can to make contributors successful. But this raises something of a chicken and egg problem — at least for new CMS implementations (as opposed to migrations). In these projects, content entry typically occurs after the system is considered complete. This means that the designer and developer need to anticipate what rich text capabilities (formatting controls and the styles that control the display of rich text) the contributors will need. This is particularly important in the ever-present "generic page" content type that is typically used for the many one-off (odd ball) pages that exist in any website.
I have found two good techniques to get around this problem. First, it is good to test the rich text editor with a few of the more challenging one-off pages on the site. Take a page with embedded images and objects (like perhaps a Google map) and formatting and try to reproduce it in the rich text editor. Don't disable the rich text editor and edit the source. That is cheating. If it turns out you can't do it without pulling your hair out, you need to come up with a workaround. If it is a really important page, you might need to develop a special content type and/or presentation template that does some of the work. If you find that there are too many challenging one-off pages to choose from, you might step back and consider enforcing more uniformity between pages. Otherwise, you will probably not be getting all of the value (content reuse and manageability) out of a CMS.
The second technique is to build a "style guide" page and place it in some discrete area on the site. The style guide page is a generic page that contains examples of all the stylings that are available to the contributor. For example, every heading level, paragraphs, lists (ordered and unordered), tables, embedded images, etc. The content contributor can visit this page to get an idea of what is possible and then open it in edit mode to see how the formatting was executed. The process of building and reviewing the style guide page is a useful forum to get designers, developers, and contributors together to collaborate and align. The fact that it is so tangible grounds everyone in the real capabilities of the platform. The style guide page is also the first place to check updates or enhancements to styles after launch.
At the end of the day, designers, developers, and contributors all want the site to be a success. They can't just claim victory on their little piece ("the mockups were approved," "we got out of QA," or "I got my page to preview!"). Editors may live in the holes but everyone has to keep the holes clean.
After over 10 years of working in content management, I have come to realize that there is only one way to learn the value of managing structured information: the hard way. And that way is only 50% effective. People can intellectually accept concepts like content re-use and content/layout separation, but in the heat of the moment, few can resist the siren song of a word processor and the clipboard. Pasting a bunch of text into a rich text area (and then re-formatting it) provides so much more instant gratification than data entry into the fields of a structured content form. It is only after a number of painful global content changes that people come to realize that all that painstaking WYSIWYG work has a very short shelf life. It is not until a migration onto another platform that one becomes aware of all that semi-redundant content. But that realization only happens around half the time. The other half of the time the site's unmanageability is blamed on the CMS. A clear sign that the content manager didn't make the connection is when there is a requirement that the new CMS have a global search and replace feature.
As someone who has seen many companies succeed and fail (and really fail) with content management, it is easy for me to notice these patterns. But that doesn't mean that I can make anyone short-circuit his/her learning process. If I were able to forcefully impose a highly structured content model on a client, all they would notice was the complexity of the content entry forms. They would take for granted the downstream benefits. The best you can do is gently guide and hope that guidance will lead to recognition when the site becomes unmanageable. I don't get too worked up about it. If I get frustrated, I can just talk to my friends in the DITA/XML advocate community. Their pain in working with technical documentation teams is way worse.
In the software development world, we have the concept of DRY (Don't Repeat Yourself). The idea is "every piece of knowledge must have a single, unambiguous, authoritative representation within a system." I call the opposite of DRY WET (Write Everything Thrice) or DAMP (Developer Accepts Maintenance Problems; hat tip to Brian Kelly). This means copying and pasting code (rather than referencing it) or writing the same data over and over again. Part of the development process is recognizing patterns and coming up with ways to reduce redundancy. Good developers are constantly thinking about maintaining the code they write because they will inevitably need to add a feature or fix a bug. And the feedback cycle is really short for developers. You write a bit of code, test it, fix it, write some more code, test that and the first code you wrote, fix it, and so on. If you did anything stupid, the time you have to wait before suffering for it is usually short. I am not saying that all developers practice DRY, but they have a better track record than content contributors.
Most content contributors don't have that short feedback loop. Too often, content is considered a "set it and forget it" initiative. You publish and move on. But I am seeing two positive trends in the content management industry that may shorten the feedback loop. First, there has been some great thought leadership around solving the "post launch paradigm". Second, many CMS vendors are building in analytics and multivariate testing functionality that encourages the content manager to constantly tweak a website to maximum performance. My hope is that awareness of this functionality will compel buyers to think of their content in a more dynamic way — something that evolves and improves like software. Then maybe we will hear content managers talking about their websites being DRY, WET, or DAMP.
Deane Barker, over at Gadgetopia, has posted slides from his presentation "Just put that in the zip code field". He gave the talk at the Web Content 2009 conference in Chicago. Unfortunately, I was not able to attend the conference and missed seeing Deane present. However, knowing that I am as passionate about this stuff as he is, Deane and I did talk at great length on content modeling during the days leading up to the conference. Oh, the war stories we told. Those conversations inspired me to write this post on pages and objects.
The reason why I find this topic so important (aside from the fact that I am a recovering DBA myself) is that content modeling capability is one of those difficult to change characteristics of a content management system. It is what I call a "load bearing wall" in the customization of a CMS. That is, while it may be possible to remediate a content modeling limitation, all the buttressing required may make such an effort impractical. Content modeling architecture is so difficult to change, in fact, that the products themselves tend to live with what they have and change very little in this area. Products that do change how they model content usually take a while to stabilize as they work out the nuances of how to generate entry forms and validation routines and the appropriate templating syntax to access the elements.
Because of all this, content modeling is a critical part of my CMS selection process. Part of my demo process requires the suppliers to implement a content model specification that is based on the client's own content. Deane's presentation also gives useful tips on what to look for in a CMS. In particular, I look for the ability to support specific data types and structures. Don't know what that means? Then take a few minutes and click through Deane's presentation. Or, better yet, look for an opportunity to see Deane present it live. You might see me there too.
Back in prehistoric times, I was implementing a custom CMS for a very large computer manufacturer. The data model drew a distinction between "pages" and "objects" and I remember having a difficult time understanding (and then, later, explaining) the difference. At a high level, objects were items of content (like a "product") and pages were containers (like a landing page that lists collections of objects). The areas where my simplistic explanation tended to break down were 1) the notion of a detail page that just displayed one object and 2) unmanaged listing pages that just automatically listed objects. These were the cases where you would have pages on the site that do not map to "pages" in the repository. If you were to practice the page/object model in its truest form, you would create a page asset to wrap objects for every page on the site. This didn't make sense when you had a product catalog with thousands of items (objects) in it. Over time, the page content type became less and less used until it was defined strictly as a tool for building landing (also known as category) pages. That made sense because the site was really about the products and displaying them in lots of different ways. But if this had been a project to build a typical brochure site (where contributors focused on managing things like the "about" page or the "services" page) we might have gone in the other direction and focused on pages. Objects would have diminished to components that could be re-used across pages.
Another way to look at it is to ask who owns the pages: the contributor or the display tier? In a page based model the contributor owns the page. He places the page in a site hierarchy, gives it a URL, and then fills it with content. In the object based model, the contributor feeds in the content and the display tier (the controllers and the views) has the logic to decide what content to show and how to show it. As I mentioned in my computer manufacturer website story, the object based approach tends to do better with sites that have more content than is practical to manually organize onto pages. As an extreme example, imagine if www.google.com were a website that someone had to manage. No matter how many editors they hired, it would be impossible for an editorial staff to manage every single result page. Maybe they could go in and fix the descriptions of a few index entries here and there.
When I look at web content management systems today, I see similar stories to my prehistoric custom web content management system experience. Every web content management system on the market today grew out of some project to build a website and then was abstracted to build more websites. A WCMS is either conceived as a product and then heavily shaped by its earliest customers or it starts life as an in-house project and then is abstracted into something that could be resold. Those initial uses leave their imprint and become part of the product's DNA. That doesn't mean that products are necessarily limited in their use. As it matures, a product runs into a diverse range of potential customers that forces it to broaden its capabilities. In the real world, of course, web sites are a combination of pages and objects and the contributor needs at least some level of control of both. However, this digital DNA does help determine which problems a product solves more naturally (or comfortably, or intuitively) than others. Page based systems need to figure out a clean way to manage "placeless" content and object based systems need to figure out a simple way to manage basic pages. Some of these additions feel more awkward than others.
The only way to really appreciate the difference in approach and its implications for you is to see the products demonstrated managing a site like yours with content like yours. I like the scenario approach where you (or the vendor) build a prototype using your content types and then test it with your typical usage scenarios. It is only then that you will see how well it addresses your balance of objects and pages.
Lisa Welchman has a great post on CMS Watch Trend Watch that describes the phenomenon of Web Content Management System buyers seeing a CMS as just a wrapper around the WYSIWYG editor. I can't even begin to say how true that is. Recently, a client was having his first look at the WCM system that we were implementing (after not participating in any of the prototyping or reviewing the incremental builds that we had been doing over several weeks) and he left me a voicemail saying "the administrative interface is all wrong. We have major problems." As you might expect, I was very concerned. Anyway, it turned out that all of his issues were around the WYSIWYG editor. To him, the WYSIWYG editor was the CMS. It turned out that it was misconfigured and everything is OK now. Phew! That would have been bad.
Way back when, I remember the debate over WYSIWYG editors between the CM purists and the user facing pragmatists. The purists didn't want any markup (formatting) in content and the pragmatists were trying to appease users who wanted to make web pages with as much control as writing a document in Microsoft Word. The compromise was to give users free control over small portions of the page. However, given the attention that the editors are getting, it appears that the balance is shifting.
I think this is natural as WCM goes "down market" to run small web sites that had at one time been just static HTML. Small websites have fewer authors and less content so they do not need as much centralized control or content reuse. Strict adherence to content management best practices is less critical. All that is needed is a reduced dependence on the HTML literate webmaster. The CMS becomes more of an HTML editing and deployment tool. Ironically, the HTML savvy webmaster that managed the static site frequently becomes the sole user of the CMS. I would argue that this is not true content management. But maybe not all CMS buyers need to manage content. They just need to manage their website. Still, if you are spending hundreds of thousands on a CMS, presumably you have real content management needs and you should be looking at more than just the WYSIWYG editor.