<!-- Content Here -->

Where content meets technology

Mar 23, 2015

Double International Snapshot

Last year I wrote a post about two deployment patterns to prevent source language bleed through when using a transaction proxy. If you don't want to click through on the link, source language bleed through can happen when a visitor views a page with text that has not been translated yet. Most translation proxies have an option to use machine translations for untranslated text but this is not always desirable.

In our experience, the "International Snapshot" pattern works very well as long as the customer is willing to freeze their content while the site is being translated. Without a content freeze it impossible to make the site completely translated. You never know when a content editor is going to publish a new string. It is also inefficient because the translators might need to translate text that is only on the site for a few hours. For example, a translator might be sent a task to translate a sentence with a typo that will be corrected in a matter of minutes.

But not everyone is willing to interrupt their publishing flow. Thanks to the wonderful world of content management systems, people are used to being able to publish whenever they want. To accommodate customers who have no tolerance for bleed through or content freezes, we have started to apply a new pattern that I call the "Double International Snapshot." This pattern is identical to the "International Snapshot" pattern but has an added layer of a translation instance which effectively freezes the content for the translators.

Double International Snapshot

In this pattern, you have three instances of the site:

  1. The Original Site that is serving your source language visitors today. This site can be updated regularly for the benefit of source language visitors.
  2. A "Translation Snapshot" that is a point-in-time snapshot of the Original Site. This site provides a stable environment for new content discovery and staging translations.
  3. An "International Snapshot" that is a point-in-time snapshot of the Translation Snapshot. This is the site that your target language visitors will hit through the proxy.

The translation process works like this. When you think it is time to refresh the translated sites, you refresh the Translation Snapshot. This freezes the content for the translators so that they can complete their work without new content constantly appearing. The Translation Snapshot can be finalized and go through all sorts of testing to make sure that it as perfect as it needs to be. Once you are satisfied that the Translation Snapshot is production-ready, you refresh the International Snapshot from the Translation Snapshot. Because the translations for the new content are already in the proxy, there will be no bleed-through.

The Double International Snapshot allows content editors to constantly publish new content to the source language site without compromising the stability of the translation environment. The translated sites can be fully verified for translation completeness and quality before going live. This pattern is an ideal solution for companies that don't want to risk source language bleed through but are not willing to compromise publishing freedom. But these benefits are not without cost, you do need to host three versions of your website and you need to build mechanisms for maintaining these snapshots. The Translation Snapshot site can be relatively low powered because it will get only a limited amount of traffic. If your site is simple and does not have any runtime server-side logic, both snapshots could be static HTML files hosted on Amazon S3. This can be done pretty easily with GNU wget.

Feb 04, 2013

Preventing Source Language Bleed-Through

When considering a translation proxy to localize a website, people often ask "if we change the content on the site, what will visitors see while the translators are working on the new translations?" Unless you are using machine translation as a backstop for untranslated content, your source language is going to show through. This is because a translation proxy doesn't cache full pages. Instead, they apply translations one fragment at a time. If it doesn't have a translation for a fragment, the translation proxy will register it as a new string to translate and send the original text through. This is a necessary trade-off to support dynamic websites. You can shorten the period of bleed-through by increasing your translation responsiveness and throughput, but to prevent bleed entirely you need apply one of the following techniques.

These two techniques are really just variations on the same theme of pointing the proxy at two versions of your site: one for content discovery and another one for production.

International Snapshot

The "international snapshot" approach is where you create a non-current snapshot of your production site for international audiences. Depending on your translation velocity, this could be anywhere from a day old to a month old or more. Content discovery and translation is still done on the live site. But when an external visitor browses the site, they are getting a proxy-translated view of the snapshot. There is no bleed-through from the snapshot site because by the time the snapshot is refreshed, all of the translation rules have already been published. Here is a picture

International Snapshot Model

This is what we are doing for the new www.lionbridge.com website. We wrote some scripts to do a full snapshot of the site (code, database, and other files) to another virtual host on the same servers. If you are using a more sophisticated platform than Wordpress, you could probably set up another publishing target and just publish there when you want to update your snapshot.

Translate Staging

The "translate staging" option follows the same pattern as international snapshot. With translate staging, however, you do your content discovery and translation on the staging server (which has content which hasn't been published yet) so your translations are ready before you publish your content. Once your translation queue is down to zero, you can publish your source content to your live site and the translation rules will be waiting to be used.

Translate Staging Model

With "translate staging," you need to make sure:

  1. Your staging site is publicly accessible. You will not be able to view the site through the proxy unless it is available on the public internet. For security reasons, you can limit access with a firewall and also add authentication to the staging site.

  2. Your staging site only contains translation ready content. If you have are using your staging site for your editorial process, you might wind up discovering draft content that will never be published. Since this is obviously a waste of time and money, it's best to create a staging instance that only has content that is ready for publishing.

Here is a summary of the process

Lifecycle of a content update

What to choose?

Which approach is right for you depends on your requirements. If, because of regulations or communications policy, you are required to publish new content to all languages simultaneously, you need to go with the translate staging option. If, like it is with us, you can allow your localized sites to lag behind the primary site, the international snapshot option is usually most attractive. And, of course, letting source content leak through may be acceptable if your site is less formal. The most important thing is that you have options.

Feb 16, 2012

Introduction to Translation Proxies

Recently, I have been doing a lot of work with translation proxies. A translation proxy is a service that sits in front of your website and applies translations so that different audience segments can see your content in their local languages. You point various DNS entries to the proxy so that request goes through the proxy service to your native language website. On the return trip, the proxy applies translation rules to the HTML in the response. It’s a little like when Google Chrome asks you to run Google Translate on the foreign language page you are looking at. However, unlike Google Translate, these translations are done by professional translators and there is a multi-step workflow to ensure that the translations express correct information and tone. Of the three ways that Global Marketing Operations can localize websites, translation proxy is the fastest growing option.

Translation Proxy Architecture

This approach has several benefits. Among them:

  • There is no need for the core publishing platform to have multi-lingual capabilities

    The publishing platform behind the primary language site may not support localization features such as storing different localized versions of a piece of content or being able to easily localize display templates. Many CMS implementations start with a localization-capable product but then cut corners, which prohibits adding alternate languages down the road. Building a site for localization takes planning, skill, and investment that cannot be justified if localization is less than a drop-dead requirement at the time of the initial implementation. Retrofitting is a lot of work.

  • You can translate both content and template text in one place

    In a typical translation scenario, we need to translate both the content (which the content contributors manage) and the presentation templates (code that the developers manage). See this screenshot for an annotated view (sorry for the color palette choice. I understand if you have a sudden craving for a Big Mac).

    Template vs. Content

    With other localization approaches, you need to localize these aspects separately: the content through an editorial workflow and the template text through a software development life-cycle. In the translation proxy approach, all of this text can be localized together.

  • You can leverage your main site’s interactive functionality.

    Because requests are proxied back to your primary language site, all interactive features (including search, contact forms, and even commerce transactions) will continue to be supported.

  • Your content stays secure.

    Because the translation proxy only sees what regular visitors see, there is no risk to the security of the main site. The one thing you want to watch out for is transactional functionality. You want to make sure that the proxy service does not collect sensitive information that visitors submit to the website and also can create rules to ignore sensitive information coming back to the visitor.

Like with any technology, translation proxies are not ideal for every case. Here are some reasons why a translation proxy may not be right for you.

  • Local markets cannot create original content in the proxy.

    A translation proxy only translates (or, in some cases filters) content from the main site. There is no clean way to add original content to a localized site managed by a translation proxy. This is perhaps the biggest show-stopper for some companies. If you have a local marketing team that wants the autonomy to develop original content for their local markets, a translation proxy is not going to help. They need to publish their content somewhere before it can be served through the proxy. Perhaps they could create areas on the main site for their specific regions but then they wouldn't be leveraging the proxy.

  • A translation proxy will not allow you to neglect your localized sites.

    If you want to save money by putting up simple, (crappy,) static sites for local markets, a translation proxy is not a great solution. The translation proxy model is intended for managing localized versions of the primary site. Any new or changed content will display in the primary language until it is translated. You can contain translation costs by excluding certain sections from your localized websites (for example, you may not want to translate your blog or press releases). Still, with the proxy model, you need to consider the translation costs when you add new content to your website. One work-around could be to create a trimmed down, rarely-updated "international site" on your CMS and use the translation proxy to localize it for all your foreign markets.

  • Translation volumes are difficult to predict.

    Unlike traditional translation projects, a translation proxy service is monitoring a constantly changing website so scope is going to be fluid. This is not an “invest and then rest” type of service. It requires a level of attention and planning commensurate with the level of work that is done on the main site. If you are serious about reaching into new markets, this investment should seem reasonable to you. It’s still cheaper than hiring in-country marketers to produce original content. Look for suppliers that offer a capacity model where you pay a fixed fee based on expected usage. This will protect you against a large per-word translation bill after your primary language authors have been particularly prolific.

While the translation proxy approach is not optimal for every business or website, its benefits are very compelling for many. A translation proxy may be worth considering if you learn that your sophisticated interactive website was not built with localization in mind and you want to serve new markets with the same level of experience that you serve your home market.