Wednesday, December 28, 2005

ZORacle Part III: Connecting Zope to an Oracle database

Database Adaptors and SQL Methods

If you have ever evaluated Zope, you have probably heard that, in addition to using the included object database (the ZODB), Zope can access most SQL-compliant relational databases. The typical framework for doing this is Database Adaptors and ZSQL Methods. Database Adaptors, commonly called DAs, are plug-in products that give you a connection to a particular database. For example, there is a popular DA for MySQL called ZMySQLDA. Once you have a DA, you can use the Zope Management Interface (ZMI) to create a ZSQL Connection to connect to the database and ZSQL Methods to execute queries against that database. The whole process is pretty well documented here. What you get when you go this route is a published Zope object that gives you a result set of data based on a query, or a handle on a query that can update the database (using SQL update, delete, or insert commands). ZSQL Methods are most useful if you want to create a page in Zope (written in either ZPT or DTML) that displays information stored in a relational database.

Unfortunately, there are not a lot of DA options for Oracle. The main open source one is DCOracle2, which is no longer active although there are still many people using it. My experience with compiling DCOracle2 for Oracle 10g looked like this. Even after that, it was still flaky. The best instructions I found are here, but before you start with DCOracle2, look at SQL Relay's DA. If you have money to spend, you can also look at eGenix mxODBC, which talks to Oracle over ODBC and is not too bad at $120 per server... unless you are running on *NIX. In that case you need to buy an Oracle ODBC driver, which may cost $1,599.00 per server (Windows Oracle ODBC drivers are free). The mxODBC DA is the most configurable DA that I have seen.
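For a sense of what a ZSQL Method contains, here is a minimal example body (the table and column names are made up for illustration); it is ordinary SQL with dtml-sqlvar tags marking the parameters:

```sql
select title, body, modified
from articles
where article_id = <dtml-sqlvar article_id type=int>
```

The method's id and its argument list (here just article_id) are entered in the ZMI form; calling the published object with article_id=42 runs the query against the ZSQL Connection and returns a result set.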

However, if you want to do heavier lifting with a relational database, this framework is a little weak because you probably don't want to manage all your SQL logic in the ZMI, especially if you want to access the data from Python-based classes sitting on a file system (deployment gets difficult here). So the next level in working with relational data is to use a DA and a regular SQL method in your Python code. That might look a little like this (note: all code samples are meant to be illustrative; I have left out important bits that are required for the code to run):

def __init__(self, context, map):
    self._context = context
    self._map = map

def selectProperties(self, pid):
    # Build the ZSQL Method on the fly. CONNECTION_NAME is the id of
    # the ZSQL Connection set up in the ZMI.
    setattr(self, '_selectProperties',
            SQL('_selectProperties', '', CONNECTION_NAME, 'propertyId',
                "SELECT %s FROM %s WHERE PROPERTY_ID = "
                "<dtml-sqlvar propertyId type=int>"
                % (self._fieldList(self._map), RDBDAO.TABLE_NAME)))

    # Wrap the method in the acquisition context of the calling class.
    method = self._selectProperties.__of__(self._context)
    return method(propertyId=pid)

def setLocals(self):
    results = self.selectProperties(self._context.propertyId)
    columns = results.names()

    if len(results) < 1:
        raise Exception, "Can't find my database row."

    for record in results:
        for column in columns:
            if column.lower() in self._propertyTypes:
                if record[column] is not None:
                    self._rdata[column.lower()] = self._conversion.toZope(
                        record[column], self._propertyTypes[column.lower()])
                else:
                    zLOG.LOG('RDBDAO', DEBUG, "no value for ", column.lower())

In the selectProperties method we create a new method called _selectProperties, which is a SQL Method. Don't get too caught up in the syntax. The only thing I would call out is the use of "<dtml-sqlvar>", which does things like apply proper quoting and escape bad characters when the type is string. The CONNECTION_NAME variable is actually just a string that matches the name of the ZSQL Connection you set up to point to the database you want to talk to. If you put the ZSQL Connection right at the root folder, your object will have a reference to it through Acquisition. However, this doesn't happen if the object does not exist within the acquisition hierarchy or is so brand new that the acquisition context has not yet been set. So we wrap the _selectProperties method in another method called selectProperties, which just calls the private _selectProperties method using the context of the calling class that was passed in the __init__ method - hence the syntax __of__(self._context). Then the setLocals method runs the query and puts the results in a local dictionary. The _conversion object contains methods to do data conversion, like handling dates.
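To make the quoting point concrete, here is a rough, hypothetical sketch of what "<dtml-sqlvar>" does for a string-typed value (this is not the actual Zope implementation, which also rejects dangerous characters like backslashes and NULs):

```python
def sql_quote_string(value):
    # Double any embedded single quotes and wrap the result in quotes
    # so the value can be safely embedded in a SQL statement.
    # Simplified illustration only.
    return "'%s'" % value.replace("'", "''")

print(sql_quote_string("O'Reilly"))   # -> 'O''Reilly'
```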

Notice how there is no syntax for opening and closing a connection. That is all handled in the background by the ZSQL Connection. This example does not do a database update. If it did, you might see a SQL method that issued a query of "commit." Depending on the implementation of the DA, connection pooling and other configuration tends to be extremely simple. There are very few parameters to adjust, which gives you little control over how your application manages connections. Still, this method of data access goes pretty far - as long as the DA behaves reliably. Where this framework starts to fall down is when you work with really long strings such as CLOBs (Character Large Objects). The problem arises because SQL Methods only accept simple SQL statements. The update code would look like this:

def updateProperties(self, propertyId, args):
    setattr(self, '_updateProperties',
            SQL('_updateProperties', '', CONNECTION_NAME,
                'propertyId ' + ' '.join([k for k in args]),
                self.sqlUpdateString(args)))

    method = self._updateProperties.__of__(self._context)
    return method(propertyId=propertyId, **args)

def sqlUpdateString(self, map):
    """This method creates an update statement based on the property map.
    There may be more properties than we will be setting when we execute
    this statement, but that is taken care of by the optional argument on
    the dtml-sqlvar tag.
    """
    sqlString = "UPDATE " + RDBDAO.TABLE_NAME
    sqlString += " SET " + ", ".join(
        ["%s = %s" % (k.upper(),
                      self._conversion.setValue(k, DBTYPE, self._propertyTypes[k]))
         for k in map])
    sqlString += ", MODIFY_INSTANCE='%s:%s'" % (os.getenv('HOSTNAME'), INSTANCE_HOME)
    sqlString += " WHERE PROPERTY_ID = <dtml-sqlvar propertyId type=int>"
    return sqlString

Here the _conversion class puts in the appropriate <dtml-sqlvar> syntax as well as doing some additional data conversions. In Oracle, a SQL statement can only be up to a certain number of characters - I can't remember how many - but if you try to issue a query like "UPDATE tableA SET fieldA = '[some multiple-thousand-character string such as the body of an article]'", you will get an error. MySQL tends to be a little more forgiving, but there are limits. In order to set large string values, you need to use "bound variables" so that the update query gets assembled on the database side. DAs and SQL Methods don't do this very well (actually, they don't do it at all).
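The idea behind bound variables is easy to demonstrate with any database API that supports them; this sketch uses Python's built-in sqlite3 as a stand-in for Oracle. The statement text stays short because the large value travels separately as a bind parameter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, body TEXT)")

# A value far too large to paste into the statement text as a literal.
body = "x" * 50000

# The ? placeholders are bound variables: the query string itself never
# contains the 50,000-character value, so no statement-length limit
# applies to it.
conn.execute("INSERT INTO article (id, body) VALUES (?, ?)", (1, body))

stored = conn.execute(
    "SELECT length(body) FROM article WHERE id = 1").fetchone()[0]
print(stored)   # -> 50000
```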

SQL Relay

To get around this limitation, we used SQL Relay, which allowed our Python DAO to talk directly to the database without going through the Zope DA pathway. Although SQL Relay also has a DA, we didn't use it because it does not support bound variables. The Python client libraries, which we used in our code, do support binding. SQL Relay consists of several components:

  • A set of connection daemons that hold connections to the database open
  • A listener daemon that listens on a specified port and forwards requests to a connection
  • A client that talks to the listener
  • A cache manager daemon that maintains the query cache and removes stale result sets

You can put the listener on the database server or on the application server. Our design had everything but the clients sitting on the relational database server. The system is also extremely configurable through editing various XML files.
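What the connection daemons buy you is essentially connection pooling: connections are opened once and handed out to clients on demand. A minimal sketch of that idea in plain Python (using sqlite3 as a stand-in database, with class and method names of my own invention) looks like this:

```python
import sqlite3
from queue import Queue

class ConnectionPool:
    # A bare-bones pool: open `size` connections up front, hand them
    # out on acquire(), take them back on release().  SQL Relay's
    # daemons add retry logic, query caching, and network transport
    # on top of this basic idea.
    def __init__(self, factory, size=2):
        self._pool = Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self):
        return self._pool.get()   # blocks while all connections are out

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), size=2)
conn = pool.acquire()
result = conn.execute("SELECT 1 + 1").fetchone()[0]
pool.release(conn)
print(result)   # -> 2
```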

So now, with SQL Relay, our update code looks a little like this:

def __init__(self, context, map, connection):
    """The connection object is passed in from DBTransactionManager.

    The user name and password used there are not the database username
    and password. They are usernames and passwords set up in SQL Relay
    for clients to use. See the SQL Relay configuration documentation
    for more.
    """
    self._context = context
    self._map = map
    self._connection = connection

def sqlUpdateString(self, datamap):
    """This method creates an update statement based on the property map.
    There may be more properties than we will be setting when we execute
    this statement, but that is taken care of when we bind the values.
    """
    sqlString = "UPDATE " + RDBDAO.TABLE_NAME + " SET "
    params = []
    for k in datamap.keys():
        # setValue emits either a literal value or a bind placeholder
        params.append(" %s=%s" % (
            k.upper(),
            self._conversion.setValue(k, DBTYPE, self._propertyTypes[k])))
    sqlString += ",".join(params)
    sqlString += " WHERE PROPERTY_ID = %s" % self._context.propertyId
    return sqlString

def persist(self):
    datamap = self._map
    query = self.sqlUpdateString(datamap)

    cur = sqlrcursor(self._connection)
    cur.prepareQuery(query)
    for k in datamap:
        if k.upper() == 'BODY':
            # CLOB values must travel as bound variables, not literals
            body = self._conversion.setValue(
                k, DBTYPE, self._propertyTypes[k])
            cur.inputBindClob(k.upper(), body, len(body))
    cur.executeQuery()

    if cur.affectedRows() > 0:
        zLOG.LOG('RDBDAO', DEBUG,
                 "Updated %d rows " % cur.affectedRows())
    else:
        zLOG.LOG('RDBDAO', ERROR,
                 "Database returned error: %s " % cur.errorMessage())
        raise DBError, cur.errorMessage()

With this setup, we were able to build a robust DAO that talks to an Oracle database and can handle all sorts of data types, including CLOBs and BLOBs. We also have a configurable database connection framework that can be used to interface with several different databases. If I have an opportunity to work on another Zope project that needs relational database connectivity, SQL Relay (either the DA or the client libraries) will be the first option I try.