<!-- Content Here -->

Where content meets technology

Jan 10, 2011

Repository-Based vs. Presentation-Based Search

Search is probably the most common visitor-facing requirement in web content management system implementations. Usually the requirement is written in terse form such as "basic search" or "advanced search." But there are many nuances that need to be accounted for. There are essentially two approaches to implementing search requirements: repository-based search and page-based search.

A repository-based search indexes content items in the content repository. A page-based search indexes the pages of the site. This distinction is more important than you might think — especially if the site design heavily re-uses content. Here is an example. Let's say your site has pictures that are presented in slideshows. The picture content type has a caption that is searchable. A page-based search for a word in the caption of a picture will return the slideshow(s) where the picture is used. A repository-based search will return the picture item itself — but what if there is no detail page for the picture content type? You might have to do something like create a fake detail page that redirects the user to a slideshow page. Another difference is that a page-based search will index text that is hard coded into the presentation templates. For example, you might have your hours of operation in the footer of every page of the site and a "Visit" page that contains the hours plus directions. If a visitor types "hours" into a page-based search engine, he will get every page on the site in the results. A repository-based search engine will return the "Visit" page.
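The difference can be sketched with a toy index. This is only an illustration: the item IDs, page URLs, footer text, and helper names below are all hypothetical, and a real CMS or search engine would work quite differently under the hood. The point is just that one index is built from content items and the other from rendered pages.

```python
from collections import defaultdict

# Hypothetical content items in the repository (IDs and text are illustrative).
ITEMS = {
    "visit": {"type": "page_content", "text": "Hours and directions for your visit"},
    "pic1": {"type": "picture", "text": "Sunset over the harbor"},  # the caption
    "about": {"type": "page_content", "text": "About our museum"},
}

# Text hard coded into the presentation template, not stored in the repository.
FOOTER = "Hours: 9 to 5 daily"

# Hypothetical pages: each URL renders one or more items plus the footer.
PAGES = {
    "/visit": ["visit"],
    "/about": ["about"],
    "/slideshows/harbor": ["pic1"],
}


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return {word.strip(".,:;").lower() for word in text.split()}


def build_repository_index():
    """Map each word to the content items whose own text contains it."""
    index = defaultdict(set)
    for item_id, item in ITEMS.items():
        for word in tokenize(item["text"]):
            index[word].add(item_id)
    return index


def build_page_index():
    """Map each word to the pages whose rendered output contains it."""
    index = defaultdict(set)
    for url, item_ids in PAGES.items():
        rendered = " ".join(ITEMS[i]["text"] for i in item_ids) + " " + FOOTER
        for word in tokenize(rendered):
            index[word].add(url)
    return index


repo_index = build_repository_index()
page_index = build_page_index()

print(sorted(repo_index["hours"]))   # only the "visit" item
print(sorted(page_index["hours"]))   # every page, because of the footer
print(sorted(repo_index["harbor"]))  # the picture item itself
print(sorted(page_index["harbor"]))  # the slideshow page that uses the picture
```

Note that the repository index returns the picture item, which has no URL of its own in this sketch. That is the "no detail page" problem described above.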

Generally speaking, the search functionality that comes out of the box in a CMS is repository-based. This is necessary because content contributors need a repository-based search to navigate the repository and find content to work on. Some of this content has not yet even been published on the site. Whether you need a page-based search engine for your visitors to use will depend on the nature of your site. Most types of websites do better with a page-based visitor search because a page is a good enough proxy for a piece of content and page-based search engines are generally easier to set up (look how easy it is to set up Google Custom Search). However, page-based search doesn't work well for all sites. In an eCommerce site that has a product catalog, you want to index the products themselves, not all the pages where the products are promoted. If you have requirements for a fielded search, like finding calendar events that occur within a date range, you will also need a repository-based search that indexes individual fields.

So, next time you are thinking about search, think about whether you want the search engine to index the pages on your site or the content that is being presented in those pages. As with all requirements, the best way to capture search requirements is through scenarios that present real-world examples.

Jan 04, 2011

Grading your CMS Implementation at J. Boye

After taking a break from conferences in 2010, I am planning on doing more speaking in 2011. I am starting off my 2011 conference campaign with one of my favorite conferences: J. Boye in Philadelphia. As I have mentioned before, the J. Boye conferences are excellent. The sessions are informative and practitioner-oriented, but what really makes the conferences great is the social element. More than any other conference that I know of, the J. Boye team plans and promotes social activities that encourage attendees to get to know each other and exchange ideas.

I will be presenting a session called Grading your CMS implementation. There is greater potential to understand requirements in the months after a CMS rollout than leading up to and during the implementation. The only form of failure is not using that information to make corrections. However, most organizations label the selection/implementation project as either a qualified success or a total failure and suffer until the next CMS procurement (when they will repeat the cycle). They don't treat it as a milestone in a larger content initiative, which they should. My presentation will show how to create a feedback loop to correct issues in three areas: expectation rationality, platform suitability, and project execution. Organizations preparing for a CMS initiative will learn how to set their project up for success and avoid common pitfalls.

Hopefully, I will see you at J. Boye in Philadelphia this May (3rd-5th). If you don't catch my session, I am sure I will see you at one of the social events.

Jan 03, 2011

Open source and the other kind of software

When talking about open source software, it is often necessary to mention software that is not open source. Most people use the terms "commercial" or "proprietary" to classify the other kind of software. But neither of these words is right because open source software can be both commercial and proprietary.

Commercial means relating to commerce. Many open source software applications are built by software companies for the distinct purpose of engaging in commerce. We even talk of "commercial" as a classification of open source software: "commercial open source software." Open source applications can also be proprietary. Proprietary just means it is owned by someone. It is up to the owner to decide how much he wants to share his property. Putting an open source license on a software application doesn't make it a public good. An open source license is like any other license in that it states terms of use. You need to own the software in order to put a license on it and enforce it. The owner of an open source licensed application can also change the license to another license that either does or doesn't qualify as open source.

Given the inadequacy of "commercial" and "proprietary," I find myself using the terms "closed source" or non-FOSS (FOSS is an abbreviation for Free/Open Source Software). Non-FOSS is probably more accurate because many software companies practice a "shared source" model where they make the source code available to their customers. This doesn't meet the open source definition but it does invalidate the description of being "closed." If we want to be totally technical about it, we could call software that isn't open source "OSDNS" (Open Source Definition Noncompliant Software) but that doesn't exactly roll off the tongue.

As academic as this little naming exercise appears, I think that it does reveal the nuances of software licenses more than "good open source, bad proprietary." Open source is just a classification of software licenses — a classification that is not always useful when it comes to assessing the suitability of the software. Open source licenses don't make software better, altruistic, communal, collaborative, or anything else. The ecosystem that supports the software defines what it is. The license is just one aspect of how the ecosystem operates. Interestingly, there is a meme within the Plone community to move away from the "open source" branding to something like "community shared development," which intends to focus on how the software is developed and managed (rather than its licensing) and differentiate from commercial open source.

Dec 23, 2010

11 Content Management Wishes for 2011

I don't generally write (or even read for that matter) upcoming year prediction posts. They seem more for the benefit of industry watchers than practitioners in the trenches. This, I know: some vendors will flourish; others will get acquired; early adopters will believe themselves to be leading lasting trends; many of them won't be; many employees will be frustrated that their companies are not following these trends; the rest of the world will take little notice. Now onto my hopes for the new year. I wish these things not just for the trendsetters but for everyone working in content management.

  1. I hope that expertise follows the technology. We have seen a trend of corporate web publishing moving out of the Information Technology group and into the Marketing department. This is great but I haven't seen marketing departments develop the talent to manage these technologies. People in marketing don't have enough experience hiring and managing technical people, enforcing engineering rigor, and buying technology services. This is particularly problematic because the content management products they are buying have a huge upside if you understand how to implement and maintain their sophisticated features.

  2. I want "Creative Developer" to emerge as the dominant role in most web teams. I have had the pleasure of working with some talented people who are good enough with design, HTML, CSS, AJAX and can also hold their own on a software development team. They can maintain their own development environments, can get around a command line, can pick up server-side coding, and have a genuine interest in engineering. For a while I got spoiled and didn't have an idea of how rare these individuals are. I would like to see this skill set become the norm rather than the exception.

  3. I want analytics to become a core competency. Nearly all companies practice web analytics but very few have achieved the level of mastery it takes to leverage this information. Setting and striving for metrics-based goals isn't baked into the culture. Traffic trends are more of a curiosity than a call to action. Typically companies are smartest about analytics immediately after purchasing a product and sending a team member to training. Then the intelligence decays as other operational aspects take priority or the person leaves. Often I see companies on the market for a new analytics system when they could do just as well by retraining on their current system.

  4. I wish everyone would dump IE6 support. Can we just do this already? By continuing to support IE6, we are not only wasting money and stifling innovation but we are also enabling companies to force their employees to use outdated, insecure technology.

  5. I wish technology companies would get more sustainable. The current pattern for technology companies is to: 1) burn lots of cash to build a product; 2) build a user base; 3) cash out by being bought by some other company. Customers are drawn in during phase 2 and then get screwed in phase 3. We see this in enterprise software (Vignette, RedDot, Merant) and in consumer software and services too (most recently with Delicious). Knowing this, I prefer solutions in which I can see a sustainable business model from the start.

  6. I want a focus on content strategy to become the norm. Last year was big for content strategy. The field has been gaining awareness and key visionaries are starting to emerge. My hope is that this discipline continues to be incorporated into standard business practices. Teams need to start with the content rather than the container. We will know we are close when we stop seeing lorem ipsum in wireframes (another blog post I need to write).

  7. I wish for the return of the corporate librarian. Too many companies believe that they can buy technology to play the role of a corporate librarian. Rather than invest time to organize and curate information, they leave it where it is and hope enterprise search will find it. The problem is that there winds up being so much duplicated content that the search engine can't effectively prioritize and recommend what it should in a search result. Either every knowledge worker needs training in basic library sciences, or companies need corporate librarians to help teams organize and share their information.

  8. I hope the social intranet gets real. Up until recently, companies have been operating under the myth that if you launch an internal Facebook, employees will jump in with the same enthusiasm that they poured into the real thing. That hasn't happened and it isn't going to happen. Nobody wants to invest their entire social being into their employer's system where it can only be seen by co-workers. However, I think the social intranet concept has more legs than its predecessor, which was to create a shared corporate brain (repository) that employees would voluntarily dump all their knowledge into and be able to leverage when they needed it. All the science says that knowledge and learning don't work that way. People learn by experience. The learning happens faster with immediate feedback. Someone else's feedback is almost as good as the natural feedback from a good or bad decision. If the social intranet can simply connect people who are learning skills with people who have been through the learning process, there will be huge returns. The challenge is to align the rewards to create a culture that prioritizes those types of exchanges.

  9. I hope publishers develop a sustainable web business model. The foundation of the web is interconnectedness. Visitors bounce around from site to site and share links with each other. Traditional publishing has focused on building a captive audience: getting the full attention of a customer and renting some of that attention to advertisers. That is easier to do with a paper magazine or newspaper in the hands of a person in a comfy chair than it is in a browser under the control of an attention-challenged, click-happy web user. Publishers need to either figure out a way to monetize a non-captured audience or re-capture their audience. Apps and pay walls (the current infatuations of publishers) attempt the latter but they take the publisher out of the web ecosystem. I don't think that is going to work over the long term. This problem is more than a year away from getting solved but I would like to see some progress during 2011.

  10. I hope Net Neutrality is preserved. I am getting dangerously close to industry-watcher territory here, but the threat to net neutrality really concerns me. It wouldn't be a problem if there were real competition between internet service providers, but most consumers have only one or two broadband service options. If the government is going to allow (and enable) these monopolies to exist, it has a responsibility to regulate their monopolistic behavior.

  11. I hope content professionals continue to be passionate about their craft and earn their place in upper management. In order for that to happen, we will need to realize the rhetoric of content as a strategic asset. This means demonstrating how better content management can make businesses more effective and competitive.

    As you can see, we all have our work cut out for us. Get some rest over the next couple of days and get ready for 2011. I expect great things from you all!

    Happy holidays.

    Dec 15, 2010

    Content Modeling: People Names

    My general advice for content modeling is to structure things as much as you can without annoying your editors. More structure means more re-usability because presentation templates have more control over what to put where. A good example is names of people. If you have one field for name, you can't show the name in alternative ways (such as "Joe Smith" and "Smith, Joe") and you can't do things like sort by last name. It's better to put this information into two fields.

    So, you ask, what do you do when a person has one name as in Pink, Madonna, or Unknown? If you have a person with one name, use the last name field. Otherwise, when you sort by last name and then first name, all the one-namers come to the top. This actually came up on two different projects in the last 6 months so I figured I would pass it on.
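A minimal sketch of the two-field model follows. The `Person` class and its method names are hypothetical, not taken from any particular CMS. It shows both display formats and what happens to one-namers when you sort.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """A person content item with the name split into two fields."""
    last_name: str
    first_name: str = ""  # left empty for people who go by a single name

    def display(self):
        """Render as "First Last", or just the single name."""
        return f"{self.first_name} {self.last_name}".strip()

    def sortable(self):
        """Render as "Last, First", or just the single name."""
        if self.first_name:
            return f"{self.last_name}, {self.first_name}"
        return self.last_name


people = [
    Person("Smith", "Joe"),
    Person("Madonna"),  # single name stored in the last name field
    Person("Adams", "Ann"),
    Person("Pink"),
]

# Sort by last name, then first name, ignoring case.
by_last_name = sorted(people, key=lambda p: (p.last_name.lower(), p.first_name.lower()))
print([p.sortable() for p in by_last_name])
# ['Adams, Ann', 'Madonna', 'Pink', 'Smith, Joe']

# Contrast: storing the single name in first_name leaves last_name empty,
# and an empty string sorts before everything else.
misfiled = sorted([("Smith", "Joe"), ("", "Madonna")])
print(misfiled[0])  # ('', 'Madonna')
```

With the single name in the last name field, Madonna and Pink fall into their natural alphabetical positions instead of bunching up at the top of the list.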

    Dec 09, 2010

    Jackalope: A PHP Port of the JCR

    Kas Thomas (from Adobe/Day) writes that a PHP port of the JCR (called Jackalope) is near completion. A big part of the project was to translate the JCR specification (which, like Java, is statically typed) to PHP's dynamic typing model. The result is a derivative specification called the PHPCR.

    PHP developers have had access to JCR repositories for quite some time through Apache Sling (which puts a nice REST interface in front of a JCR). What Jackalope brings to the table is a PHP-based, in-process API that may be faster than hitting a REST interface. However, that doesn't mean PHP developers can totally forget about Java. The current implementation is an adaptor that connects to a JCR (Apache Jackrabbit) through WebDAV (so HTTP is still in the mix too). The next phase of development will swap out the JCR storage backend with a basic database, thereby removing the JVM from the picture entirely.

    Nov 15, 2010

    When building communities, don't fall into the tool trap.

    Chris Grams has an excellent post reminding us to avoid the tool trap when forming a community. That advice is so obvious yet so rarely practiced. Building communities is hard work and the temptation to experiment with technology is a distraction — a distraction so strong that it can obscure the fact that the community has no reason to exist. No matter what platform, the community will inevitably fail if it doesn't have the necessary ingredients: people with passion and a common set of interests and goals.

    Technology can be an enabler but it can never be a driver. A robust community will survive with bad technology. I would say to intentionally use bad tools when you start a community. This approach will cause an unviable community to fail faster and save unnecessary investment. If a community is starting to grow despite bad tools, then you know you really have something worth investing in. By the way, if anyone wants to join my Julio Lugo fan club, fax me your fax number.

    Oct 20, 2010

    CMS Pricing

    Over the years, I have had a number of really interesting discussions about software pricing with both vendors and customers. Pricing software (be it a license fee, an annual subscription price, or whatever other source of revenue) is a complex problem on both sides of the transaction. Vendors are always trying to tweak their pricing to increase sales and revenue. Customers are constantly trying to figure out what prices mean in terms of value and overall costs. The result of these efforts is that prices are overly complicated and never seem fair.

    One of the best explanations of software economics is Joel Spolsky's epic article Camels and Rubber Duckies. To quote:

    One of the biggest questions you're going to be asking now is, "How much should I charge for my software?" When you ask the experts they don't seem to know. Pricing is a deep, dark mystery, they tell you. The biggest mistake software companies make is charging too little, so they don't get enough income, and they have to go out of business. An even bigger mistake, yes, even bigger than the biggest mistake, is charging too much, so they don't get enough customers, and they have to go out of business.

    Joel goes on to talk about microeconomic theory of demand and supply and discusses how software is tricky because there is no intrinsic limit on supply. Which brings him to the classic Joel conclusion.

    Take my advice, offered about 20 pages back: charge $0.05 for your software. Unless it does bug tracking, in which case the correct price is $30,000,000. Thank you for your time, and I apologize for leaving you even less able to price software than you were when you started reading this.

    As great as Joel's article is, it only focuses on keeping the financials of a software business in the black. The article ignores another important aspect for software vendors: making customers successful. Sure, making a software sale is great. But to be viable in the long term, a software vendor needs to create value for the customer and then capture a reasonable portion of that value (surplus in microeconomic parlance). A good pricing model will charge customers in proportion to the amount of value they get from the software, but measuring value is not easy in content management software. Different pricing models use different data as proxy measures of value. The problem is that these proxies are not an exact representation of value and customers have an incentive to manipulate them to lower the cost of the software. To make matters worse, the behavior that prices encourage is often sub-optimal and risky. Customers consciously underuse or misuse products in order to stay in a lower pricing tier. This undermines the goal of making a customer successful.

    Let's dig a little deeper into the unintended consequences of content management software pricing models. In the world of content management, there are essentially three ways software is priced: by named users, by servers (or CPUs) or concurrent users, and by volume of content. All of these models try to get customers who use the software more to pay more. This enables a CMS vendor to sell into small departments as well as make large enterprise-level deals. But all of these models can encourage bad behavior. Let's look into each one.

    • By named users. This seems like a great model. When a customer has lots of users on a system, it means it is leveraging the product for high value things like distributed authoring. The product likely has become a critical part of many jobs — making it an important part of the operational infrastructure. Here the value is that lots of people are able to edit content. However, I frequently see this model drive customers to do foolish things like having shared accounts on the system (that is, 10 different people log in as a user named "User1"). I have also seen customers have dedicated content entry people to save on user accounts. If you don't have a user account, you just mail your content to someone who does. Both of these behaviors undermine all sorts of functionality like workflow and auditing. Any sense of accountability is gone.
    • By servers or concurrent users. I lump these two models together because they both attempt to tap into the intensity of usage. This is slightly better than the first option (named users) because it does not penalize customers who have lots of occasional users. Customers with more activity pay more than customers who are more static. Here the value is the amount of management activity the software enables. If the customer's business depends on actively managing its content, the CMS is creating a lot of value. But these models can encourage users to under-power their implementations. Customers will low-ball their infrastructure. When a customer buys too few CPU licenses, the system will be slow (what software vendor wants a reputation for being slow?) or there will be no fail-over for upgrades or crashes (what software vendor wants a reputation for being unstable?). When customers have not bought enough concurrent user licenses, I have seen employees repeatedly trying to log in to snatch up a session when one becomes available. What a waste of time!
    • By content volume. While the first two models are common to all types of enterprise software, pricing by volume of content may be unique to content management software. I only know of a few vendors that sell this way but, frankly, I think this may be the best of the three options. Companies with more content derive more value from their CMS, or at least they should. If a customer has a large volume of low-value content, maybe that is a problem that should be addressed for other reasons. Additional software fees are trivial compared to the cost of keeping around unneeded content: redundancy, out-of-date content, and poor findability all add up to great losses in efficiency. A customer will lose more efficiency to these issues than is gained by having the CMS. Bob Boiko's book Laughing at the CIO: A Parable and Prescription for IT Leadership gives excellent strategies for thinking about the value of information. If high software costs force a better content management strategy, great! However, things don't always work out that way. I have even seen customers manage content outside of the content management system (like on a file system or in some other CMS) to avoid having it count against quota. I have also seen customers model their content to be coarser-grained to reduce the number of items. For example, rather than create a content item for every glossary term (which would increase reusability) the customer might create one glossary page with terms listed as paragraphs in the rich text editor.

    All of these pricing models have their benefits and risks but they don't necessarily dictate success or failure. As I often say, a successful software acquisition establishes a partnership that delivers balanced value to both the customer and the supplier. The pricing model is an important aspect of that partnership because it establishes the terms of exchanging that value. But like all terms, pricing is negotiable and negotiation works best when both sides seek a win-win result. If the customer is sensitive to the vendor's need to generate revenue to sustain its business, and the vendor is sensitive to the customer's need to make cost-effective expenditures, mutually beneficial terms can be reached.

    Oct 14, 2010

    The Interruptible Programmer

    I just read Steve Streeting's excellent post Work 2.0 – the interruptible programmer where he counters the conventionally assumed benefit of long hours of uninterrupted programming. It took a bad back to make Steve realize that programming for long hours at a time was not only unhealthy, but it was also unproductive. I have to agree with his conclusion. You need to break up your work to alternate between the two ways of thinking, which the book Pragmatic Thinking and Learning: Refactor Your Wetware calls the L-Mode and R-Mode (Seriously, buy the book if your job requires you to use your brain).

    While I agree with the conclusion, I do have a problem with the word "interruptible." Breaks are great, but I still think that interruptions are bad. The difference between the two is that breaks can be planned; interruptions are not, and they can happen at really disruptive times in your thought process. As Steve eloquently describes it, "When you’re in that Zone, you’re juggling a whole bunch of context in your head, adjusting it on the fly, and maintaining and tweaking connections between issues constantly." I love that juggling metaphor because that is what it feels like. When a colleague stops by your desk to tell you that he just sent you an email, all the balls that you were juggling fall to the ground and it can take a really long time to get them back in the air. The article lists some useful techniques to help you freeze that context so the balls don't roll too far when an interruption happens, but they all require some proactive habits. If the interruption happens right before you save your context, you lose everything since the last save. It is a little like auto-save in Microsoft Word: it is great at mitigating loss but computer crashes still suck. Similarly, regularly auto-saving your context makes you more interruptible (that is, less damage is done) but that doesn't mean interruptions are good. The article even makes the point that you should table tangential tasks rather than interrupting yourself.

    When I am really slammed with work, I try to practice the Pomodoro Technique and I think that Steve's context saving, prioritizing, and pacing strategies are very compatible with it. In particular, as your 25-minute timer runs down on a work session, it is a really good time to spend a few minutes saving context while you are still in the zone. I will have to try that, but that won't make me interruptible.

    Oct 12, 2010

    Reference Checks

    Without a doubt the tool that software selections most neglect is the reference call. More often than not, reference calls are treated as a mere formality — a box to check after the decision has been made just in case someone asks. When checking references after the decision has been made, the reference checker is just going through the motions hoping against the unlikely chance that something negative turns up. Any hints at issues or other useful information are summarily ignored. When done this way, reference calls are probably not worth doing at all.

    But reference calls don't have to be this unproductive. When done properly they can provide critical information about what it is like to partner with a software supplier and use its software. After all, isn't that what a software selection is all about? Doubters of the efficacy of reference checks point to the fact that the software supplier is in control of the process. Vendors only connect you to customers that they have been successful with. This can be a valid point. But I would counter with two facts: one, it is desirable to know what types of clients and engagements a supplier has been successful with; and two, every software integration/implementation project has its challenges and it is useful to know what they were and how the supplier overcame them.

    To make reference calls as effective as possible, I recommend the following guidelines:

    • When to call? Call references as early in the process as you can. Usually this is once you have established your short list of products and you have qualified your organization as a legitimate buyer. A lack of suitable references is a great way to rule out products that are not worth considering. The vendors that you work with may balk at connecting you with references this early. This is understandable since these people are their customers and they don't want to annoy them by repeatedly asking them to talk to unqualified prospects. You need to work out some kind of compromise here. Limit your request to one really relevant customer rather than a long list from which you will call two or three. You might be able to connect to one of these customers through your own network. Also, if you have benefited from a reference call, "pay it forward" by offering to be a reference for products your company owns.

    • Which Customers? Be very selective as to which references you speak with. You only want to talk to companies that are using the software to solve similar problems. When the software salesman flashes the customer logo slide (you know the one I am talking about), don't be dazzled by the household brands that you recognize. Really dig into what the customers are doing with the software. If you work for a university and you see another university's logo on the slide, stop the conversation and ask what that university is doing with the software. You might learn that only a tiny department uses the product. In that case, don't put much stock in that "customer success story." If, as your employer wishes to do, the customer has standardized all departmental websites onto the platform, put that customer on your list of references to request. In addition to learning about their use of the product, you might also learn about some organizational challenges that would be relevant to your initiative. It is a red flag if the vendor can't point you to any customers who have used its software to do what you want to do.

    • Which Contacts? Only talk to reference contacts who are intimately involved with the implementation and day-to-day usage. Senior level managers tend to be sheltered from the detail that you want to know. They usually only hear about usability issues if the users are picketing down the halls. If they did hear about a usability issue, they may not know if it was related to the product or the implementation. Ideal reference contacts are project managers who were involved with the planning, execution, and rollout of the implementation. They know what "out of the box" features took weeks to "configure."

    • What to ask? It is important to have a list of questions to guide the conversation but don't be afraid to go off script. Look for clues of problems and dig in to unravel hidden issues. As a baseline, you want to know what they use the software for, when and to whom they first rolled it out, what version they are on, how they handle workflow and user segmentation, and how actively they maintain the solution. Ask open-ended questions because every one of those topics can be a source of issues.

    If you follow these guidelines, reference calls can become a central part of your selection process rather than a useless formality. You will get first-hand knowledge from someone who has been down the path you are considering and can avoid spending too much time looking at products that have not demonstrated an ability in your problem domain.
