Content Here

Where content meets technology

Jun 12, 2008

CM Professionals Interview

The June 2008 edition of the CM Pros newsletter has a short interview with yours truly. I was asked:

  • What were the drivers behind your latest report, "Open Source Web Content Management in Java"?

  • You have published numerous reports addressing the intelligent selection of content management solutions. In your opinion, what is the biggest mistake that enterprises make when trying to identify and select a solution?

  • Based on current trends in content management strategies, where do you see this ever-growing and ever-changing field headed over the next five years?

Click through to see my answers.

Jun 09, 2008

Alfresco web content management team moving on

A few weeks ago I slid in a subtle mention that the Alfresco WCM team had some significant departures (I'll say who they were now: chief WCM architect Jon Cox and lead programmer Britt Park). Since then I have been wondering to myself how long the remaining WCM leadership (that is, Kevin Cochrane) would last. Well, I don't need to wonder any more. John Newton just announced on his blog that Kevin is leaving Alfresco "to pursue other opportunities in the US."

Kevin has been a very important member of the Alfresco team. He fought hard to bring an understanding of the web to a bunch of Documentum alumni (to Alfresco's credit, they recognized WCM as their blind spot and invited him in to fight that fight). His energy and experience were critical in product development and sales. Alfresco's WCM offering has come a long way since it was initially introduced. Right now, many of its challenges stem from being stuck in an Enterprise Content Management user interface. It may be that Kevin was running up against similar constraints.

I guess the next thing to wonder about is how Alfresco will backfill for the valuable role that Kevin played. Hopefully they will bring in someone with similar wisdom and passion. But that will be hard to do.

I am sure Kevin will have a similar (or even greater) impact in his new role wherever that may be.

Jun 06, 2008

The End of Print Media

Steve Ballmer predicts the end of printed media in 10 years: "there will be no media consumption left in 10 years that is not delivered over an IP network." I guess you wind up saying things like that when you force yourself to repeat "I will not underestimate the Internet" 100 times a day.

If this is true, I wonder if that will lead to better (portable and stationary) electronic displays or a huge jump in the home printer business.

On the content production side, I wonder whether content will still initially be produced for a print format (creating print-ready PDFs using InDesign/K4) or in a more presentation-neutral contribution environment (like the classic web content management model).

Jun 04, 2008

Summize

[Screenshot: Summize]

As a follow-up to an earlier post about tracking all that chatter happening around content, I thought I would mention a new(ish) site called Summize. I had heard the name before but never bothered to look. Then I read this post on Dave Kellogg's blog that described what it is all about.

Summize is a search engine for Twitter. Twitter itself has a really weak search engine, so this is a welcome service. Summize is pretty cool but I think it falls short of piecing together a conversation about a topic. Since each tweet is 140 characters or less, tweets can be less meaningful out of context. For example, if personA tweets "I am trying product X" and someone else says in reply "@personA I tried it too and it totally sucked," Summize probably won't be able to associate the second tweet with the first.
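The threading gap can be illustrated with a small sketch (the tweets, names, and approach here are made up for illustration; this is not how Summize actually works). A plain keyword search finds only the original tweet, while following the @-mention back to the original author stitches the reply into the conversation:

```python
import re

# Hypothetical tweets for illustration (not real Summize data).
tweets = [
    {"user": "personA", "text": "I am trying product X"},
    {"user": "personB", "text": "@personA I tried it too and it totally sucked"},
]

def keyword_search(tweets, term):
    """Naive keyword match: all a simple tweet search engine can do."""
    return [t for t in tweets if term.lower() in t["text"].lower()]

def thread_by_mention(tweets):
    """Group replies with the tweet they answer by following @-mentions."""
    threads = {t["user"]: [t] for t in tweets if not t["text"].startswith("@")}
    for t in tweets:
        m = re.match(r"@(\w+)", t["text"])
        if m and m.group(1) in threads:
            threads[m.group(1)].append(t)
    return threads

# Searching for "product X" misses the negative reply entirely...
print(len(keyword_search(tweets, "product X")))   # 1
# ...but mention-based threading attaches it to the original tweet.
print(len(thread_by_mention(tweets)["personA"]))  # 2
```

Real threading is harder than this, of course: replies don't always lead with the @-mention, and the reply may land days after the original tweet.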

One thing that Summize is good for is seeing trendy topics. For instance, in the screenshot you can see that Plurk was at the top of the list. If you religiously follow Twitter, you know that over the last 7 days everyone has been talking about it as a possible successor to Twitter. If you don't use Twitter, you wouldn't know a Plurk from a Pluck.

FriendFeed does have a search engine and you can subscribe to the results over RSS. The nice thing about FriendFeed is that you can see comments too.

[Screenshot: FriendFeed search]

Jun 02, 2008

Sacha Chua on Drupal

Sacha Chua, whose blog keeps me hopeful that smart and creative people can thrive in huge companies not called "Google", has started to play with Drupal. I like how she starts by addressing the configuration management problem. Her method is to have a script that blows away everything and then reloads from an installation profile (note: like many Drupal modules, the Profile Wizard module is not available yet for 6.x; however, many of its features should be included in the Drupal 7 core). Most people wait to tackle configuration management until it is too late.


Sacha is just getting started with Drupal.  We will have to see if she moves her blog off of WordPress.

Jun 02, 2008

Social Media Traffic Patterns

Louis Gray has written a post about a trend that has been on my mind a lot recently: traffic and other activity shifting from the information source to social and aggregation sites. Louis has some compelling data showing that people are increasingly using sites like FriendFeed and Twitter to comment on provocative articles and blog posts.

The overall trend started with blogs and social bookmarking sites but is really accelerating with FriendFeed and tools like Twhirl that bring content right to your desktop and make it amazingly easy to comment.

[Screenshot: Twhirl 0.8.1]

There have been a number of interesting discussions about who owns your comments and how to control the conversation. Many of the ideas are summarized in this blog post on Read Write Web and the related comments. Not surprisingly, there are even more comments about this post on FriendFeed.

There are interesting implications for publishers. Publishers that depend on advertising revenues are justifiably concerned that traffic is being pulled away from their sites. To be sure, some traffic is lost, especially when the conversation about the topic goes astray and eclipses the topic itself. But there is also a gain in traffic as people click through to the article to form their own informed opinion on the topic (although fewer people do this than you would expect; often they just comment on the other comments).

If a publisher is more interested in mindshare than eyeballs and advertising revenue, this trend is a great opportunity because it gives higher visibility to the idea or business. In particular, companies that publish content for marketing purposes benefit. As Web 2.0 marketing strategies take hold, more and more companies are trying viral communication to get their message out. Of course, because no one controls it, the attitude can easily turn negative. Some companies are hurt more by negative commenting than others. Companies that create physical products (and shipped software) are hurt the most because they can't easily address people's complaints. SaaS companies can use the feedback to make their products better and then jump into the conversation and tell everyone that they were listening and made the correction. Many businesses can take P.T. Barnum's attitude ("I don't care what the newspapers say about me as long as they spell my name right.") because any publicity is good publicity. Publishers that hope to prevent the expression of negative opinion by disabling the commenting feature have no hope of controlling the conversation.

The good news is that conversations are happening digitally and out in the open (rather than verbally around a water cooler in some office park or through point-to-point emails), so they can be tracked. The bad news is that tracking and monitoring a conversation that is happening all over the web is a challenge. Conversations on Twitter are fleeting but the impression left in the reader's mind is much longer lived. You can't bookmark a tweet (and even if you could, you might not be able to follow the link because Twitter is often down). You can bookmark and link to a FriendFeed post (and its related comments) but Google doesn't index these detail pages very well (it only indexes the listing pages, which rapidly change). It seems like the best thing you can do is subscribe to searches on each of the services, and more of them seem to be popping up every minute. Another good strategy is to pay close attention to web server logs (using an analytics package) and look for traffic spikes and referrers. The spikes are likely to be more subtle than a Digg, Reddit, or StumbleUpon reference but the significance from a marketing perspective may be more profound depending on what is said about you.
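The log-watching strategy can be sketched in a few lines (the log entries below are invented, and the referrer field position follows the common "combined" log format; real logs vary by server and configuration):

```python
import re
from collections import Counter

# Hypothetical combined-format access log lines.
log = [
    '1.2.3.4 - - [02/Jun/2008] "GET /post HTTP/1.1" 200 512 "http://friendfeed.com/search?q=content" "-"',
    '5.6.7.8 - - [02/Jun/2008] "GET /post HTTP/1.1" 200 512 "http://twitter.com/someone" "-"',
    '9.9.9.9 - - [02/Jun/2008] "GET /post HTTP/1.1" 200 512 "http://friendfeed.com/search?q=content" "-"',
]

def referrer_hosts(lines):
    """Tally referring hosts so subtle social-media spikes stand out."""
    hosts = Counter()
    for line in lines:
        # The referrer is the first quoted http(s) URL in the line;
        # capture just the host portion (everything before the path).
        m = re.search(r'"https?://([^/"]+)', line)
        if m:
            hosts[m.group(1)] += 1
    return hosts

print(referrer_hosts(log).most_common())
```

An analytics package does this (and much more) for you; the point is simply that a referrer tally surfaces a FriendFeed or Twitter conversation that would never show up as a Digg-sized traffic spike.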

This is a fascinating topic. Please discuss (wherever you want to).

May 27, 2008

Drupal and Alfresco

Jeff Potts has a nice post about how Alfresco and Drupal complement each other today but will wind up competing as Alfresco develops its front-end capability (Alfresco is all back end; Drupal is all front end). Jeff, who is writing a book on Alfresco and was the 2007 North American Contributor of the Year for Alfresco, lists five areas where we can expect Alfresco to improve on the front end.

When a project moves, it is important to consider where it is coming from (and potentially what it is leaving behind) as well as where it is going. Sometimes a product's heritage holds it back; other times the product leaves a trail of frustrated customers. Both Drupal and Alfresco are on the move. Drupal has evolved to be more suitable for very large, commercial websites. Several people commented that Drupalcon 2008 in Boston saw a much greater commercial presence than earlier events. Drupal employers and Drupal-related businesses are now active in the Drupal community and are helping to shape its future. But the Drupal community still has to consider the many small non-profit, departmental, and personal sites running on the platform. Many of the improvements designed to help the enterprise will also help the small guy. Other times there may be conflict around priorities (do you spend more resources on improving caching and clustering or on making it easier to install on Plesk or cPanel?).

Alfresco is facing similar questions but it is coming from the other direction. Its early base was the "ECM for the rest of us" market - medium to large companies doing simple document management. While the latest improvements on the WCM side are very exciting to people wanting to build innovative Web 2.0-style web applications, they may be less relevant to the core customer base. Alfresco has a lot of resources, but it still needs to make choices around prioritization.

It is all a matter of not being able to be all things to all people. Alfresco and Drupal are both fortunate to be in the positions that they are in. They both have achieved high levels of success in their original target market and are looking to expand their range. But it is not all about expansion, it is also about focus - what to focus on and what to turn away from.

May 21, 2008

Content is not Data

David Nuescheler, CTO of Day Software and spec lead for the Java Content Repository specifications JSR 170 and 283, likes to say "everything is content." This is a bold statement that is intended to provoke thought, but I think that it is also a reaction to a prevailing view among technologists and database vendors that everything (including content) is just data. While it is true that content, when stored electronically, is just a bunch of 0's and 1's, if you think that content is just data, you need to get out of the server room because that is not how your users see it. There are four main reasons why.

  1. Content has a voice. Put another way, content is trying to communicate something. Like me writing this blog post, a content author tries to express an idea, make a point, or convince someone of something. Communication is hard and requires a creative process, so authoring content takes much more time than recording data. Content is personal. If the author is writing on behalf of a company, there may need to be approvals to ensure the voice and opinion of the company are being represented. The author may refer to raw data to support his point, but he is interpreting it. For example, even a graph of data may reflect some decisions about what data to include and how to show them. Because content has a voice, content is subjective. We consider the authority and perspective of the author when we decide whether we can trust it.

  2. Content has ownership. Data usually do not have a copyright but content does. The people who produce content, like reports, movies, and music, get understandably annoyed when people copy and redistribute their work. While data can be licensed, it is less common. Often data are distributed widely so that more people can provide insight into what they mean. Interestingly, when content is digitally stored as data on a disk, we think less about it in terms of content. For example, we are OK with data backups of copyrighted material even though creating copies is forbidden.

  3. Content is intended for a human audience. While content management purists strive for a total separation of content and presentation, content authors care about how content is being presented. They may have a lot of control over presentation and obsess over every line wrap, or they may only get to choose which words are bolded or italicized. They will only semantically tag phrases in a document if they know that it will make for a richer experience for the audience. Presentation is not just for vanity's sake. Presentation, when done well, helps the audience understand the content by giving cues as to how things are organized and what is important. While the Semantic Web is all about machines understanding web content, at the end of the day, the machines are just agents trying to find useful information for human eyeballs (and eardrums). Content is authored with the audience in mind while data is just recorded.

  4. Content has context. In addition to who wrote the content, where it appears also matters. We care greatly how content is classified and organized because we want to make it easier to find. A database table doesn't care about the order of its rows (it is up to the application to determine how they should be sorted). Content contributors really care about where their assets fall in lists (everything from index pages to search results).

These distinctions may seem totally academic but I think they have real implications for the technologies that manage content. Because content is much more than "unstructured data," we can't think about the tools we use to manage and store it just in terms of big text fields in a relational database and forms to update these rows. Content is a personal experience for both the author and the audience and the technology that intermediates needs to be sensitive to that. Every once in a while there is a meme about "content management" becoming an irrelevant term because it will be subsumed into other more process or industry oriented disciplines. If that does happen, it is critical that certain content technology features and concepts carry over.

  1. Versioning. Content goes through a life cycle of evolution and refinement as groups of contributors work together to find the best way to convey the information and ideas. Some content assets (like policies and procedures) are updated hundreds of times over many years as information changes. Other assets go through many rapid iterations over a shorter period of time (such as an intensely negotiated contract). Often participants in a content life cycle need to know just what has changed. For example, a copyeditor can save time by proofreading only the changes since the previous copy edit. A translator may not need to re-translate an asset if only a minor edit was made. Sometimes the history of changes can give insight into the intent behind the content. Versioning is not just for reverting to older versions. A robust versioning system has features like version comparison and annotations.

  2. Control over the delivery. To effectively communicate, you need to tune your delivery to your audience. WYSIWYG editing and preview both try to give a content contributor the perspective of their audience. WYSIWYG editing gives a non-technical contributor some control over the styling of text. It is important that the WYSIWYG editor gives an accurate representation (as in the same CSS styles) of what a visitor will see. Single page preview puts the content into the context of a page by executing rendering logic. The more complex the rendering logic, the more difficult it is to control what the user sees. For example, if there is some logic to automatically display relevant related content, the preview environment has to have the same content, rendering code, and user session information as the production environment. Oftentimes, this is hard to do. I have had clients really struggle over controlling dynamic rendering logic. For example, a relevance engine automatically associated inappropriate images with articles or showed the same related content multiple times. Some users also like to see how articles show up on dynamic indices and search results. In these complex delivery tiers, preview is a lot more like software QA than simple visual verification - you need to test all the scenarios and parameters. A good practice is to delineate pages or sections that you want full editorial control over and other (less important) sections that are not worth the manual effort of controlling.

  3. Feedback. You can't communicate in a vacuum. You need feedback. However, most content contributors lob their content over the wall and then forget about it. When you are speaking in front of a group you can gauge reaction and make adjustments. As the web turns into a conversation, the content contributor needs to be listening as much as they are telling. Most content contributors underuse web analytics. The more accessible this information can be made, the better. Many web content management systems integrate analytics packages and have nice features like analytics overlays over rendered pages. However, these features are not used enough. More commonly, an analytics report will be circulated around to people who don't understand how to read it. Comments and voting can also be a powerful medium for adjusting and reacting to feedback either by direct response or by using knowledge of the audience in subsequent articles.

  4. Metadata. While metadata storage is trivial, capturing and using this information is a challenge. Metadata such as source and ownership are critical to tell the audience where the asset comes from (its voice and authority) and how it can be legally used. Metadata is also important for classification and context. Content contributors are notoriously bad at metadata entry: they either neglect or abuse it. Automation is part of the solution, but a good process involves humans with responsibility for metadata (bring on the librarians!). The best way to leverage and exchange metadata is through standards-based formats. Industry-oriented formats (like NITF) are important because they have a standard set of metadata built in. Microformats are also useful for highlighting specific bits of standard information within rendered web pages. While most WCM platforms can produce these outputs through their templating tier, very few do any validation of the output. Reviewers just visually validate what they see on a preview page.

  5. Usability. Most of all, the system needs to be easy to use. Creating content is hard work no matter how you do it. Any system that distracts a user from the creative process of developing content, or complicates it, is bound to be unpopular and the first excuse for failure. The ideal content management system disappears from the user's consciousness by being familiar and frictionless - you don't need to think about it and it gives you immediate results. For many people, that is Microsoft Word (until Word tries to outsmart you and take over your document), and I have already mentioned the disturbing amount of web content that originates in MS Word. For some, blogging tools are approaching this level of usability. For others, in-context editing achieves it. In many cases users get so familiar with a tool that they forget they are using it even if the tool is hard to learn at first (I am reminded of this when my fingers just automatically type the right commands in vi). This usually only happens when you have specialists operating the CMS rather than a distributed authoring model where all the contributors enter their own content.
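The version-comparison feature mentioned in point 1 above is, at its simplest, a text diff between two revisions. Here is a minimal sketch using Python's standard difflib module (the policy text is invented, and this stands in for whatever comparison feature a real CMS provides):

```python
import difflib

# Two hypothetical revisions of a policy document.
v1 = """Employees may work remotely two days per week.
Manager approval is required."""
v2 = """Employees may work remotely three days per week.
Manager approval is required."""

# unified_diff emits only the changed lines plus context - exactly what a
# copyeditor or translator needs in order to review an incremental edit.
diff = list(difflib.unified_diff(
    v1.splitlines(), v2.splitlines(),
    fromfile="version 1", tofile="version 2", lineterm="",
))
print("\n".join(diff))
```

The unchanged "Manager approval" line appears only as context; the copyeditor reviews one changed sentence instead of re-reading the whole policy.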

If you are building an application that also needs to manage content, don't just think of the content in terms of CRUD for semi-structured data. Luckily, components and frameworks are available to incorporate into your architecture. The Open Source Web Content Management in Java report covers Alfresco, Hippo, and Jahia from this perspective. Recently, I have been playing around with the JCR Cup distribution of Day's CRX that bundles Apache Sling (very cool!). Commercial, back-end focused products like Percussion Rhythmyx and Refresh Software SR2 certainly play in this area. People used to deploy Interwoven Teamsite for this but I think it is too expensive to be used in this way. Bricolage is an open source back-end only WCM product written in Perl. But accurate preview and content staging can be complicated in decoupled architectures. Drupal and Plone are also quite popular as content centric frameworks for building applications but they tend to dominate the overall architecture (unless you use Plone with Enfold Entransit).

You have plenty of options that will allow you to avoid brewing your own content management functionality. Consider them!
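On the metadata point above, even a home-grown publish-time check is better than silently publishing assets with missing fields. A minimal sketch (the required field names here are illustrative, not taken from NITF or any other standard):

```python
# Illustrative required fields - not from NITF or any real standard.
REQUIRED = {"title", "author", "date", "copyright_holder"}

def validate_metadata(asset):
    """Return a list of metadata problems instead of publishing silently."""
    problems = [f"missing: {field}" for field in sorted(REQUIRED - asset.keys())]
    problems += [f"empty: {k}" for k in sorted(REQUIRED & asset.keys()) if not asset[k]]
    return problems

article = {"title": "Content is not Data", "author": "Seth Gottlieb", "date": "2008-05-21"}
print(validate_metadata(article))  # ['missing: copyright_holder']
```

Wiring a check like this into the publishing workflow (and routing failures to a human, librarian or otherwise) is the part most implementations skip.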

May 19, 2008

Advice for vendors dealing with independent analysts

Alan Pelz-Sharpe, from CMS Watch, has written some great advice for Analyst Relations professionals in an article called "Advice for vendors dealing with independent analysts." The only thing I would add to his list of dos and don'ts is to use the information from the evaluation to make the product better.

One of the nice things about writing about open source software is that many of the products that I cover do not have analyst relations people. Instead, I talk to developers, committers, and CTOs. The big difference is this: an AR person's job is to make the product look good; a developer's job is to make the product be good. Nearly all of the products that I reviewed in Open Source Web Content Management in Java were extremely gracious about the criticism. Part of it is that they are grateful for the coverage. But a bigger part of it is that they are used to interacting directly with the community and getting direct feedback. Most open source developers know that software doesn't get better by convincing yourself that it is great. It gets better through continuous improvement that uses criticism as a catalyst for creative solutions.

It doesn't make sense for software vendors to reject (essentially) free feedback that can be used to make their product better and pay "tier one" analyst firms to try to delude themselves and the market with their own hype. In the end, the truth (as experienced by the user) always comes out. In a Web 2.0 world where everyone has a voice, it comes out even faster.
