
Where content meets technology

Apr 04, 2012

Keeping Django Settings Generic

Warning: Geeky Coding Post

I have worked on a number of Django projects and settings.py is always a problem. This is the configuration file with information like how to connect to the database and what Django apps to install. Some of this information is sensitive (like your database password or an API key) and you don't necessarily want it in your source code control system. Other information is instance-specific like the host that your database is sitting on.

In the past, I have managed the production settings in the core settings.py file and then imported an override file at the bottom that overwrites certain settings with my local settings. The code looks something like this:



try:
    from settings_local import *
except ImportError:
    pass

Then all you have to do is ignore your settings_local.py from your source code management system and exclude the file when bundling up a deployment package. This works pretty well except when you don't want to store a production password in your SCMS or if you have lots of different settings for different environments that you are deploying to.

What you want to do is have those settings living permanently as part of your target environment(s) and just deploy a neutrally configured application. Last week I stumbled on a good technique for this that I thought I would share. The strategy comes from an article on Tim Sawyer's Drumcoder blog: Apache Environment Variables and mod_wsgi. The article shows how you can access environment variables from settings.py.
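
As a rough illustration of the idea (this is not the code from the Drumcoder article), a WSGI entry point can copy selected keys from the per-request WSGI environ, which is where mod_wsgi passes values set with Apache's SetEnv, into os.environ so that settings.py can read them. The `APP_` prefix, the `copy_wsgi_env` wrapper, and the `demo_app` stand-in for the real Django handler are all assumptions made for this sketch.

```python
import os

# Hypothetical convention: only keys starting with this prefix are copied.
ENV_PREFIX = "APP_"


def copy_wsgi_env(app, prefix=ENV_PREFIX):
    """Wrap a WSGI app so prefixed request-environ keys land in os.environ."""
    def wrapper(environ, start_response):
        for key, value in environ.items():
            # The WSGI environ also holds non-string objects (streams etc.).
            if key.startswith(prefix) and isinstance(value, str):
                os.environ[key] = value
        return app(environ, start_response)
    return wrapper


def demo_app(environ, start_response):
    """Stand-in for the real Django WSGI handler (an assumption)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [os.environ.get("APP_DBHOST", "localhost").encode("utf-8")]


# mod_wsgi looks for a module-level callable named `application`.
application = copy_wsgi_env(demo_app)
```

Because the values only arrive with the first request, this approach relies on the settings module being loaded after the wrapper has run at least once.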

After following these directions, I was able to do things like:



import os

# Read each value from the environment; fall back to a development default
# when the variable is missing or empty.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': os.environ.get('APP_DBNAME') or 'dbname',
        'USER': os.environ.get('APP_DBUSER') or 'dbuser',
        'PASSWORD': os.environ.get('APP_DBPASS') or 'password',
        'HOST': os.environ.get('APP_DBHOST') or 'localhost',
    }
}

I am surprised that I limped along so long before using this technique. An alternative method would be to install a Python package on the server (in the Python libraries) that just contains these constants. But this just seemed easier to me.
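
One possible refinement of this technique, sketched here under assumptions, is a small helper that reads a setting from the environment and can refuse to start when a sensitive value has no safe default. The names `env` and `MissingSetting` and the `environ` parameter (handy for testing) are hypothetical and not part of Django.

```python
import os


class MissingSetting(Exception):
    """Raised when a required environment variable is absent or empty."""


def env(name, default=None, required=False, environ=None):
    """Return the named environment value, a default, or raise if required."""
    source = os.environ if environ is None else environ
    value = source.get(name)
    if value:
        return value
    if required:
        raise MissingSetting("Set the %s environment variable" % name)
    return default


# Example with a stand-in environment instead of the real os.environ.
fake_environ = {"APP_DBHOST": "db.example.internal"}
host = env("APP_DBHOST", default="localhost", environ=fake_environ)
name = env("APP_DBNAME", default="dbname", environ=fake_environ)
```

Marking the production password as required would turn a silent fallback to a default password into an error at startup.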

Mar 01, 2011

The Placeholder Application Controller Pattern

One of the main benefits of using a coupled (aka "frying") web content management system (WCMS) is that you get a web application development framework with which to build dynamic content-driven applications. Like nearly all modern web application development frameworks, a coupled CMS provides an implementation of the MVC (Model-View-Controller) pattern. For the less technical reader, the MVC pattern is used in just about all software that provides a user interface. The essence is that the model (the data), the view (how the information is presented and interacted with), and the controller (the business logic of the application) are separate concerns and keeping them separate makes the software more maintainable. I will let Wikipedia provide a further explanation of MVC.

The MVC implementation that comes with a typical WCMS is less flexible than a generic, all-purpose web application framework. Your WCMS delivery tier makes a lot of assumptions because it knows that you are primarily trying to publish semi-structured content to some form of document format: probably an HTML page, but possibly some XML-based syndication format. Your WCMS knows roughly what your model looks like (it is a semi-structured content item from the repository that has only the data types that the repository supports); it has a certain way of interpreting URLs; and the output is designed to be cached for rapid page loads. Those assumptions are handy because they make less work for the developer. In fact, most of the work on a typical WCMS implementation is done in the templates. There is hardly any work done in the model or controller logic.

But there are times when the assumptions of the CMS are broken. Anyone who has implemented a relatively sophisticated website on a CMS has had the experience of either overloading or working around the MVC pattern that comes with the CMS. The approach that I want to talk about here is what I call the PAC (Placeholder Application Controller) pattern. In a nutshell, this pattern de-emphasizes the roles of the model and controller and overloads the view (template) to support logic for a mini-application. The content item goes from being the model to a placeholder on the website. The controller is used more like a switchboard to dial up the right template.

Here is a common example. Let's say that you are building a web site for an insurance company. Most of the pages on the site are pretty much static. But there is one page that has a calculator where a visitor enters some information about what he wants to insure and gets back a recommended coverage amount and estimated premiums. It would be pretty silly to try to manage all the data that drives the coverage calculator as content in the CMS. Instead, you would probably want to write the calculator in client-side JavaScript, copy it into a presentation template and then assign that presentation template to a blank content page with the title "Coverage Calculator." The Coverage Calculator page in the content repository is really just a placeholder that gives your JavaScript application a URL on the site.
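
To make the division of labor concrete, here is a toy sketch of the pattern in Python. It is not how any particular WCMS works: the `ContentItem` shape, the template functions, and the `controller` switchboard are all hypothetical, the calculator runs server-side here rather than in client-side JavaScript, and the coverage rule is a made-up number used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A content item as a repository might store it (hypothetical shape)."""
    title: str
    template: str
    fields: dict = field(default_factory=dict)


def article_template(item, params):
    """Ordinary template: only formats the item's own fields."""
    return "<h1>%s</h1><p>%s</p>" % (item.title, item.fields.get("body", ""))


def calculator_template(item, params):
    """PAC-style template: the item only contributes a title and a URL slot."""
    insured_value = float(params.get("insured_value", 0))
    # Made-up rule for illustration only; not a real actuarial formula.
    recommended = insured_value * 0.8
    return "<h1>%s</h1><p>Recommended coverage: %.2f</p>" % (item.title, recommended)


TEMPLATES = {"article": article_template, "calculator": calculator_template}


def controller(item, params=None):
    """Switchboard: look up the assigned template and hand over control."""
    return TEMPLATES[item.template](item, params or {})


placeholder = ContentItem(title="Coverage Calculator", template="calculator")
page = controller(placeholder, {"insured_value": "1000"})
```

Note that the placeholder item carries no body at all; everything interesting happens inside the template it is assigned to.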

To a lesser extent, home pages often implement the PAC pattern. In this case, the home page might be a simple empty placeholder page that is assigned a powerful template that queries and features content from across the site. When the controller grabs the template and model, it may only think that it is rendering the content that is managed in the home page asset. Little does the controller know, the template is going to take over and start to act like a controller — grabbing other models and applying other templates to them.

Placeholder Application Controller is one of those patterns that, once you think about it, you realize you use it all the time. It is convenient and practical but be careful with it because it is easy to get carried away. The main risk of the PAC pattern is that you are going against the grain of the WCMS architecture. Templates are supposed to be for formatting only. You may be pushing the templating language a little farther than it was intended to go and your code may become unmanageable. You also may be short-circuiting the security controls provided by the controller. Some WCMS platforms have a pluggable architecture that allows third-party modules (programmed in something other than the template language) to step in and play the roles of model, view, and controller. This helps keep the architecture cleaner but there will always be some limitations on how these modules are allowed to work. After a certain point, you will be better off going with a generic web application framework that affords you more flexibility and just using the WCMS to publish content into your custom web application. But that is a much larger undertaking.

Feb 09, 2011

Comparison Between Drupal and Django

This article comparing Drupal to Django is pretty old but I just noticed it. There is a nice summary in the conclusion:

Drupal represents a middle ground between framework and CMS that we’ve chosen not to take. Drupal is far more capable than a CMS like WordPress, but also much less flexible than a pure framework. But more importantly, the facts that Drupal isn’t object-oriented, isn’t MVC/MTV, doesn’t have an ORM, and is generally less flexible than a pure framework, not to mention our preference for working in Python over PHP, all contribute to our decision not to use it.

I reached a similar conclusion on a recent project.

Apr 05, 2010

Django Action Item Follow Up

While moderating a comment on my "10 Django Master Class action items" post, I was inspired to evaluate how I am doing on these action items and whether they are helping. Below is a brief summary of my progress; but first a little background. Recently, I had the rare opportunity to rebuild (from the ground up) an application that I wrote for a client. The context was that the first version of the application was a prototype that I built to help demonstrate an idea to potential investors and customers. The prototype served its purpose excellently. It was able to evolve alongside the idea as my client got feedback and refined the value proposition. We came out of the prototyping phase with a strong vision and an excited group of investors and beta customers. To minimize costs I avoided refactoring the application and cut a lot of corners. By the end of the prototype phase, the idea had changed so much that we were really faking functionality by overloading different features. Still, for a ridiculously small investment, my client was able to develop and market test an idea. And now I get to build the application for real and apply the best practices that I learned about in the Django master class. Here is what I am doing and how it is working out.

  1. Use South for database migrations (adopted). I have grown so attached to South that I find it hard to imagine life without it. This is especially important because I am managing different environments and the object model is changing as I add new features.

  2. Use PostgreSQL rather than MySQL (adopted). I am steadily getting more comfortable with PostgreSQL. pgAdmin has been really helpful as I get up to speed with the syntactical differences from MySQL. So far, the biggest differences have been in user management and permissions.

  3. Use VirtualEnv (adopted). VirtualEnv + VirtualEnv Wrapper has been great. For a little while I was working on both the prototype and the actual application. VirtualEnv made it easy for me to switch back and forth. This will also be helpful when I upgrade to Django 1.2.

  4. Use PIP (adopted). I really like how you can do a "pip freeze" to create a requirements file that you can use to build up an environment.

  5. Break up functionality into lots of small re-usable applications (adopted). The prototype had one app. The production app that I am building has 6. One of the apps contains all the branding for the application and some tag libraries. Templates in other apps load a base template from my "skin" app. The best part of using this strategy is in testing and database migrations because you can test and migrate a project one app at a time. The hardest thing for me to figure out is how to manage inter-dependencies and coupling. One strategy that has worked well for me is to focus dependencies on just a couple of applications. For example, I have a profile application which manages user profiles (extended from the base django.contrib.auth.models.User model). I have other apps that relate to people but I am careful to create foreign key relationships to the User model rather than my profile model.

  6. Use Fabric for deployments (adopted). One word. AWESOME! I have scripts to set up a server and deploy my project without having to ssh onto the server. The scripts were not that hard to write. I took inspiration from some great posts (here and here). Now I can reliably push code (and media) with one local command. I am managing the development of another site running a PHP CMS and I am strongly considering having the team use Fabric for that as well.

  7. Use Django Fixtures (adopted). Managing fixtures in JSON has turned out to be really easy. I typically have two fixtures for each app: initial_data.json and <app_name>_test_data.json. initial_data.json mainly contains data for lookup tables. It is run automatically when syncdb (the Django command to update the database schema) is run. I typically create these files with the dumpdata command and then edit them manually.

  8. Look into the Python Fixture module (not adopted). I looked into this module but, to be honest, editing the JSON files is pretty easy so I don't see the need for it.

  9. Use django.test.TestCase more for unit testing (adopted). I have been doing a considerable amount of test driven development (TDD). It all started when I wanted to rewrite the core functionality but I needed to wait for someone else to re-build the HTML in the presentation templates. Now I have around 130 unit tests that I run before I commit any code. Focusing on unit testing has made me write code that is more atomic and easier to test. Now I think "how will I test this?" before I write any code.

  10. Use the highest version of Python that you can get away with (adopted). A big motivator for me here was when I upgraded my workstation to Snow Leopard which ships with Python 2.6.3. Getting 2.6.3 on my server was a little more complicated. I wound up using a host that comes with Ubuntu Karmic Koala which also comes with 2.6.3. I am really pleased with Ubuntu and it seems like most of the Django community is going that way.
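
For readers who have not seen one, the JSON fixtures mentioned in item 7 are lists of serialized objects, each with a `model`, a `pk`, and a `fields` mapping. The following is a minimal sketch of generating such a file for a lookup table; the `profiles.country` model and its `name` field are hypothetical, and in practice the dumpdata command produces this structure for you.

```python
import json


def lookup_fixture(model, names):
    """Build fixture records for a simple lookup table with a `name` field."""
    return [
        {"model": model, "pk": pk, "fields": {"name": name}}
        for pk, name in enumerate(names, start=1)
    ]


# Hypothetical app label and model name, for illustration only.
records = lookup_fixture("profiles.country", ["Canada", "Mexico"])
fixture_text = json.dumps(records, indent=2)
print(fixture_text)
```

Saving the printed text as initial_data.json inside an app's fixtures directory is what lets syncdb load it automatically, as described in item 7.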

I feel really lucky for the opportunity to rewrite an application and apply lessons learned. Too often you are stuck managing code that you (or someone else) wrote before you knew what you were doing. That is, before the functionality of the application was fully understood; before a feature of the API was available or known; before a more elegant solution was discovered. I am sure that I will continue to learn new things and want to apply them and I plan to continually refactor as long as I am involved with this project. But this full-reset has been a great experience.

Mar 25, 2010

The Onion's Migration from Drupal to Django

There is a great Reddit thread on The Onion's migration from Drupal to Django. The Onion was one of the companies that I interviewed for the Drupal for Publishers report. One of the things I mention in the report is that The Onion was running on an early version (4.7) of Drupal. The Onion was one of the first high traffic sites to adopt Drupal and the team had to hack the Drupal core to achieve the scalability that they needed. While versions 5 and 6 of Drupal made substantial performance improvements, The Onion's version was too far forked to cleanly upgrade.

Still, The Onion benefited greatly from using Drupal. They were able to minimize up-front costs by leveraging Drupal's native functionality and adapt the solution as their needs changed. Scalability was a challenge but it was a manageable one. Even though forking the code base was not ideal, it was a better alternative than running into a brick wall and having to migrate under duress. The Drupal community also benefited from the exposure and learning that came from The Onion using Drupal. Everybody won. How often can you say that?

I can understand the choice of Django 1.1 (current) over a hacked version of Drupal 4.7. Having built sites in both Drupal and Django, I can also see the appeal of using Django over Drupal 6.16 (current). Django is a more programming-oriented framework and The Onion has programmers. Django is designed to be as straightforward and "Pythonic" as possible. Drupal tries to make it possible to get things done without writing any code at all; and if you can avoid writing code in Drupal, you should. As a programming framework, Drupal has more indirection and asserts more control over the developer. The Onion's staff of programmers clearly appreciate the programmatic control that Django affords and they are quite happy with their decision.

Nov 03, 2009

10 Django Master Class action items

Edit: I wrote a follow-up post describing how I was doing with these action items. Enjoy!

A couple of weeks ago I attended Jacob Kaplan-Moss's Django Master Class in Springfield, Virginia. It was a great class and I walked out with a bunch of ideas for making better use of Django. What follows is a set of action items that I created for myself. Jacob was not this prescriptive in his presentation. These are just my personal decisions based on how he explained things.

  1. Use South for database migrations (complete). Unlike Rails, Django has no native system for synchronizing the database schema with code changes. Django will create your initial database schema for you but you need to modify the tables with SQL whenever your models change. South gives Django Rails-like migrations, which consist of methods to alter the database and also roll back changes. I ported a new application I am working on over to use South and am very impressed. Jacob gave some great advice to keep your schema migrations separate from your data migrations. For example, if you are renaming a field: you would create one migration to add the field; a second migration to move the data to the new field; and a third migration to delete the old field. Doing this will make your migrations safer and easier to roll back.

  2. Use PostgreSQL rather than MySQL (complete). Jacob didn't talk disparagingly about MySQL but it was clear to me that PostgreSQL is what the cool kids are using. That is not to say there are not disagreements over what DB is best. I have been using MySQL for years but two things won me over. In the class, I learned that table alterations in MySQL are not transactional so if your South database migration fails, you can't roll back so easily. The second factor came after the class when I was reading all these blog posts panicking about what will come of MySQL now that Oracle owns it. I agree with most pundits that Oracle doesn't have a great reason to invest in MySQL. My comfort level working with PostgreSQL is growing but it's going to take a while to get as comfortable with the commands and syntax as I am with MySQL.

  3. Use VirtualEnv (complete). One thing about Python that always seemed hacky to me was the whole "site-packages" thing. I don't like how all your Python projects tend to share the same libraries. In Java, you are much more deliberate with your CLASSPATH. The class introduced me to virtualenv and its sister project virtualenvwrapper. This creates a virtual sandbox where you can manage libraries separately from your main Python installation. It is brilliant.

  4. Use PIP (complete). I was pretty haphazard about what tools I used to install Python packages. I admit that I didn't really know the difference between setuptools and easy_install. The Master Class nicely explained the different options and it seems like PIP is emerging as the Python package manager of choice.

  5. Break up functionality into lots of small re-usable applications (in process). Much of the advice from the class is summarized in James Bennett's DjangoCon 2008 talk: Reusable Apps. Watch the video and be convinced.

  6. Use Fabric for deployments (not started). My normal m.o. for deploying code has been to shell over to a server and svn export from my Subversion server. In multiple server environments, I would usually have some kind of rsync setup. However, in one of my client projects (using Java), I started using AntHill Pro (plus Ant) for both continuous integration and deployment. From that experience I saw the light on automated deployments. Fabric is primitive compared to AntHill Pro (it doesn't have a cool web-based UI) but it does allow you to run scripts remotely on other hosts. It's like Capistrano for Python. In the next phase of development, I will definitely be using this.

  7. Use Django Fixtures (not started). I am really embarrassed to say that I have avoided using Fixtures for loading lookup and test data. Instead, I have been doing horrid things with SQL and objects.create(). I am looking forward to reforming my errant ways. Fixtures allow you to create a data file that Django will load for you. It offers three format options: JSON, YAML, or XML. Jacob recommends YAML if you can be assured that you have access to PyYAML, otherwise go with JSON which is nearly as readable.

  8. Look into the Python Fixture module (not started). This straight Python module seems to be an alternative to the Django fixtures system. It is more oriented towards test data and looks a little like using mock objects. I need to dig in a little more before I make up my mind about it.

  9. Use django.test.TestCase more for unit testing (not started). I need to do more with unit tests. I have had some good experiences with writing DocTests but I should use the Django unit test framework more. This will allow me to use fixtures more too! Plus with Django 1.1, startapp even creates a tests.py for you. How can I resist an empty .py file?

  10. Use the highest version of Python that you can get away with (in progress). In the class, Jacob made the good point that every version of Python gets feature and performance improvements. Why not go with the latest stable version like 2.6? Snow Leopard did it for me. I will try to upgrade my server as soon as I can get away with it.
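
The three-step field rename described in item 1 can be sketched without South at all. The example below uses Python's built-in sqlite3 module and plain SQL rather than South's migration API, so treat it as an illustration of the sequencing only; the `person` table and the `name` and `full_name` columns are hypothetical.

```python
import sqlite3


def add_new_column(conn):
    """Migration 1 (schema): add the new column alongside the old one."""
    conn.execute("ALTER TABLE person ADD COLUMN full_name TEXT")


def copy_data(conn):
    """Migration 2 (data): copy values from the old column to the new one."""
    conn.execute("UPDATE person SET full_name = name")


def drop_old_column(conn):
    """Migration 3 (schema): drop the old column by rebuilding the table."""
    conn.execute("CREATE TABLE person_new (id INTEGER PRIMARY KEY, full_name TEXT)")
    conn.execute(
        "INSERT INTO person_new (id, full_name) SELECT id, full_name FROM person"
    )
    conn.execute("DROP TABLE person")
    conn.execute("ALTER TABLE person_new RENAME TO person")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person (name) VALUES ('Ada')")

# Apply each migration as its own committed step.
for step in (add_new_column, copy_data, drop_old_column):
    step(conn)
    conn.commit()

columns = [row[1] for row in conn.execute("PRAGMA table_info(person)")]
rows = conn.execute("SELECT id, full_name FROM person").fetchall()
```

Keeping the data copy in its own step means a failure there leaves both columns in place, which is what makes the sequence easier to roll back.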

If you can make it to the next Django Master Class, I highly recommend you go. Otherwise, you should look into these resources and make your own educated decisions about whether to use them.

Mar 26, 2009

Book Review: Django 1.0 Template Development

I just finished reading Scott Newman's book Django 1.0 Template Development. This is the second Django book that I have read (the first was The Definitive Guide to Django) and I am very impressed by the number (and quality) of Django books that have been published. 21% of the respondents to a recent "This Week in Django" poll said that they learned Django from reading a book (65% learned from the online documentation). Considering that until recently there were no Django books, this is significant.

Django 1.0 Template Development lives up to its title by focusing on the template layer of the Django web application framework although it does go through some basics of setting up your project and some of the details of the Django request handling pipeline. There is very little coverage of models - just enough to give the sample project some data to work with.

There is good coverage of how templates are loaded and guidelines for developing views [1], with plenty of tips on leveraging Django's many convenience features (like generic views) and organizing code for better manageability. There are examples for using and writing custom middleware, filters, and tags [2] with special attention paid to best practices in security. A whole chapter is devoted to working with Django's pagination system. Explanations are well supported with the underlying theory and with examples that demonstrate the details of Django's behavior.

The one area where I was hoping for a little more depth was performance optimization. Django gives the developer a lot of options for how to design the application. For example, in addition to the typical template "include" syntax, Django also supports template inheritance (where a child template can extend and override blocks of a page from its parent). There is not much information on the performance implications of deep template hierarchies. The caching chapter gives a nice overview of Django's different caching options and engines and general guidelines but perhaps the art of really tuning a site is the topic for another book.

I would highly recommend Django 1.0 Template Development for anyone who wants to efficiently build a clean and manageable template layer for a Django project, and in particular for a developer who needs to make the display tier flexible and extensible (such as the book's example of managing a separate site skin for mobile browsers). Although the preface recommends the reader have a working knowledge of Django and Python, I don't think that is really necessary. There is just enough information to help the developer understand the overall Django framework but the emphasis is definitely on displaying data.

Notes:

  • [1] In Django, the "view" is the code that gathers and preprocesses the data for the template to render.
  • [2] These are important for a template developer because Django deliberately limits the amount of logic you can put into a template to force developers to keep templates clean and make code more reusable. Logic belongs in filters (that manipulate data), tags (that do more complex logic), and middleware (where you inject additional functionality into the request/response cycle).