ZODB is packaged using the standard distutils tools.
You will need Python 2.2 or higher. Since the code is packaged using
distutils, it is simply a matter of untarring or unzipping the release
package, and then running python setup.py install.
You'll need a C compiler to build the packages, because there are various C extension modules. Binary installers are provided for Windows users.
Download the ZODB tarball containing all the packages for both ZODB and ZEO from http://www.zope.org/Products/ZODB3.2. See the README.txt file in the top level of the release directory for details on building, testing, and installing.
You can find information about ZODB and the most current releases in the ZODB Wiki at http://www.zope.org/Wikis/ZODB.
The ZODB is conceptually simple. Python classes subclass a
Persistent class to become ZODB-aware.
Instances of persistent objects are brought in from a permanent
storage medium, such as a disk file, when the program needs them, and
remain cached in RAM. The ZODB traps modifications to objects, so
that when a statement such as obj.size = 1 is executed, the
modified object is marked as ``dirty''. On request, any dirty objects
are written out to permanent storage; this is called committing a
transaction. Transactions can also be aborted or rolled back, which
results in any changes being discarded, dirty objects reverting to
their initial state before the transaction began.
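The dirty-marking idea can be illustrated with a small toy in plain Python. This is only a sketch of the concept, not how the ZODB is implemented: the TrackedObject class and its commit() and abort() methods are hypothetical names, and the "storage" is just a dictionary copy kept in memory.

```python
class TrackedObject:
    """Toy object that notices attribute assignment (hypothetical, not ZODB)."""

    def __init__(self, **attrs):
        # Bypass our own __setattr__ so that setup does not count as a change.
        object.__setattr__(self, "_committed", dict(attrs))
        object.__setattr__(self, "_dirty", False)
        for name, value in attrs.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        # Every ordinary assignment marks the object as dirty.
        object.__setattr__(self, name, value)
        object.__setattr__(self, "_dirty", True)

    def _public_state(self):
        # Attributes without a leading underscore are the "real" data.
        return {k: v for k, v in vars(self).items() if not k.startswith("_")}

    def commit(self):
        # Remember the current state as the new committed state.
        object.__setattr__(self, "_committed", self._public_state())
        object.__setattr__(self, "_dirty", False)

    def abort(self):
        # Discard changes and restore the last committed state.
        for name in list(self._public_state()):
            object.__delattr__(self, name)
        for name, value in self._committed.items():
            object.__setattr__(self, name, value)
        object.__setattr__(self, "_dirty", False)


obj = TrackedObject(size=0)
obj.size = 1                  # assignment is trapped
was_dirty = obj._dirty
obj.abort()                   # change is discarded
size_after_abort = obj.size
obj.size = 2
obj.commit()                  # change becomes the new baseline
obj.abort()                   # nothing left to undo
size_after_commit = obj.size
```

The snapshot here is a shallow copy, so mutable attribute values would be shared between the live and committed state; a real database writes a serialized copy to storage instead.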
The term ``transaction'' has a specific technical meaning in computer science. It's extremely important that the contents of a database don't get corrupted by software or hardware crashes, and most database software offers protection against such corruption by supporting four useful properties, Atomicity, Consistency, Isolation, and Durability. In computer science jargon these four terms are collectively dubbed the ACID properties, forming an acronym from their names.
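The ZODB provides all of the ACID properties. Definitions of the ACID properties are:
Atomicity means that the changes made during a transaction are all-or-nothing: either all of them are applied, or none are.
Consistency means that each transaction takes the database from one valid state to another; a transaction that would leave the data in an invalid state is not committed.
Isolation means that two programs or threads running in different transactions cannot see each other's changes until those transactions are committed.
Durability means that once a transaction has been committed, a subsequent crash will not cause the committed data to be lost or corrupted.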
There are 3 main interfaces supplied by the ZODB: Storage, DB, and Connection classes. The DB and Connection interfaces both have single implementations, but there are several different classes that implement the Storage interface.
Preparing to use a ZODB requires 3 steps: you have to open the Storage, then create a DB instance that uses the Storage, and then get a Connection from the DB instance. All this is only a few lines of code:
    from ZODB import FileStorage, DB

    storage = FileStorage.FileStorage('/tmp/test-filestorage.fs')
    db = DB(storage)
    conn = db.open()
Note that you can use a completely different data storage mechanism by changing the first line that opens a Storage; the above example uses a FileStorage. In section 3, ``How ZEO Works'', you'll see how ZEO uses this flexibility to good effect.
Making a Python class persistent is straightforward; the class only needs to subclass the Persistent class, as shown in this example:
    import ZODB
    from Persistence import Persistent

    class User(Persistent):
        pass
The apparently unnecessary import ZODB statement is needed for the
following from...import statement to work correctly, since the ZODB
code does some magical tricks with importing.
The Persistent base class is an ExtensionClass class. As a result, it is not compatible with new-style classes or types in Python 2.2 and up.
For simplicity, in the examples the User class will simply be used as a holder for a bunch of attributes. Normally the class would define various methods that add functionality, but that has no impact on the ZODB's treatment of the class.
The ZODB uses persistence by reachability; starting from a set of root objects, all the attributes of those objects are made persistent, whether they're simple Python data types or class instances. There's no method to explicitly store objects in a ZODB database; simply assign them as an attribute of an object, or store them in a mapping, that's already in the database. This chain of containment must eventually reach back to the root object of the database.
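Persistence by reachability can be pictured as a graph walk that starts at the root. The following is a plain-Python sketch of that idea only; the reachable() helper and the stand-in User class are hypothetical and do not reflect the ZODB's internals, which work on persistent references rather than on in-memory containers.

```python
def reachable(root):
    """Return every object reachable from root through dict values,
    list or tuple items, and instance attributes (toy traversal)."""
    seen = {}                      # id(obj) -> obj, also keeps objects alive
    stack = [root]
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue               # already visited; also guards against cycles
        seen[id(obj)] = obj
        if isinstance(obj, dict):
            stack.extend(obj.values())
        elif isinstance(obj, (list, tuple)):
            stack.extend(obj)
        elif hasattr(obj, "__dict__"):
            stack.extend(vars(obj).values())
    return list(seen.values())


class User:
    """Plain stand-in for a persistent class."""
    def __init__(self, name):
        self.name = name


root = {}
alice, bob, orphan = User("alice"), User("bob"), User("orphan")
alice.friends = [bob]              # bob is reachable through alice
root["userdb"] = {"alice": alice}  # alice is reachable from the root

found = reachable(root)
names = sorted(o.name for o in found if isinstance(o, User))
```

In this sketch alice and bob are found because a chain of containment leads back to the root, while orphan is not, which mirrors the rule that an object never attached to the root is never stored.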
As an example, we'll create a simple database of users that allows retrieving a User object given the user's ID. First, we retrieve the primary root object of the ZODB using the root() method of the Connection instance. The root object behaves like a Python dictionary, so you can just add a new key/value pair for your application's root object. We'll insert an OOBTree object that will contain all the User objects. (The BTree module is also included as part of Zope.)
    dbroot = conn.root()

    # Ensure that a 'userdb' key is present
    # in the root
    if not dbroot.has_key('userdb'):
        from BTrees.OOBTree import OOBTree
        dbroot['userdb'] = OOBTree()

    userdb = dbroot['userdb']
Inserting a new user is simple: create the User object, fill it with data, insert it into the BTree instance, and commit this transaction.
    # Create new User instance
    newuser = User()

    # Add whatever attributes you want to track
    newuser.id = 'amk'
    newuser.first_name = 'Andrew' ; newuser.last_name = 'Kuchling'
    ...

    # Add object to the BTree, keyed on the ID
    userdb[newuser.id] = newuser

    # Commit the change
    get_transaction().commit()
When you import the ZODB package, it adds a new function, get_transaction(), to Python's collection of built-in functions. get_transaction() returns a Transaction object, which has two important methods: commit() and abort(). commit() writes any modified objects to disk, making the changes permanent, while abort() rolls back any changes that have been made, restoring the original state of the objects. If you're familiar with database transactional semantics, this is all what you'd expect.
Because the integration with Python is so complete, it's a lot like having transactional semantics for your program's variables, and you can experiment with transactions at the Python interpreter's prompt:
    >>> newuser
    <User instance at 81b1f40>
    >>> newuser.first_name           # Print initial value
    'Andrew'
    >>> newuser.first_name = 'Bob'   # Change first name
    >>> newuser.first_name           # Verify the change
    'Bob'
    >>> get_transaction().abort()    # Abort transaction
    >>> newuser.first_name           # The value has changed back
    'Andrew'
Practically all persistent languages impose some restrictions on programming style, warning against constructs they can't handle or adding subtle semantic changes, and the ZODB is no exception. Happily, the ZODB's restrictions are fairly simple to understand, and in practice it isn't too painful to work around them.
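The summary of rules is as follows:
If you modify a mutable object that is the value of an object's attribute, the ZODB can't catch that, and won't mark the object as dirty. Either set the dirty bit yourself, or use a ZODB-aware list or mapping type.
Certain special methods don't work when they're defined on ExtensionClasses, most notably __cmp__ and the reversed versions of binary arithmetic operations such as __radd__ and __rsub__.
Persistent classes may define __getattr__, __delattr__, or __setattr__ methods, but if such a method modifies the object, it has to mark the object as dirty itself.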
Let's look at each of these rules in detail.
The ZODB uses various Python hooks to catch attribute accesses, and
can trap most of the ways of modifying an object, but not all of them.
If you modify a User object by assigning to one of its
attributes, as in userobj.first_name = 'Andrew', the ZODB will
mark the object as having been changed, and it'll be written out on
the following commit().
The most common idiom that isn't caught by the ZODB is
mutating a list or dictionary. If User objects have an
attribute named friends containing a list, calling
userobj.friends.append(otherUser) doesn't mark userobj as
modified; from the ZODB's point of view, userobj.friends was only
read, and its value, which happened to be an ordinary Python list,
was returned. The ZODB isn't aware that the object returned was
subsequently modified.
This is one of the few quirks you'll have to remember when using the ZODB; if you modify a mutable attribute of an object in place, you have to manually mark the object as having been modified by setting its dirty bit to true. This is done by setting the _p_changed attribute of the object to true:
    userobj.friends.append(otherUser)
    userobj._p_changed = 1
An obsolete way of doing this that's still supported is calling the __changed__() method instead, but setting _p_changed is the preferred way.
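To see why an in-place append goes unnoticed, consider a toy class that flags itself on attribute assignment. This is a sketch in plain Python and not the ZODB's actual mechanism; the Tracked class is hypothetical, and its _p_changed attribute is an ordinary flag that merely borrows the ZODB's name.

```python
class Tracked:
    """Toy object whose __setattr__ hook sets a change flag (not ZODB)."""

    def __init__(self):
        # Set up state without triggering the hook below.
        object.__setattr__(self, "_p_changed", False)
        object.__setattr__(self, "friends", [])

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name != "_p_changed":
            # Any assignment other than to the flag itself marks a change.
            object.__setattr__(self, "_p_changed", True)


userobj = Tracked()

# Reading userobj.friends and mutating the returned list never calls
# __setattr__ on userobj, so the flag stays false.
userobj.friends.append("otherUser")
after_append = userobj._p_changed

# Marking the object by hand is the workaround described above.
userobj._p_changed = True
after_manual_mark = userobj._p_changed

# Rebinding the attribute, by contrast, does go through the hook.
other = Tracked()
other.friends = other.friends + ["otherUser"]
after_assignment = other._p_changed
```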
You can hide the implementation detail of having to mark objects as
dirty by designing your class's API to not use direct attribute
access; instead, you can use the Java-style approach of accessor
methods for everything, and then set the dirty bit within the accessor
method. For example, you might forbid accessing the friends
attribute directly, and add a get_friend_list() accessor and
an add_friend() modifier method to the class. add_friend()
would then look like this:
    def add_friend(self, friend):
        # Append the new friend, then mark this object as modified
        self.friends.append(friend)
        self._p_changed = 1
Alternatively, you could use a ZODB-aware list or mapping type that handles the dirty bit for you. The ZODB comes with a PersistentMapping class, and I've contributed a PersistentList class that's included in my ZODB distribution, and may make it into a future upstream release of Zope.
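The idea behind such a list type can be sketched with the standard library's collections.UserList. The TrackedList class below is a hypothetical illustration, not the PersistentList implementation; it sets a plain _p_changed flag on itself and covers only a handful of mutating methods.

```python
from collections import UserList


class TrackedList(UserList):
    """Toy list that flags itself whenever it is mutated (hypothetical)."""

    def __init__(self, initial=None):
        super().__init__(initial)
        self._p_changed = False

    def _mark(self):
        self._p_changed = True

    # Each mutator delegates to UserList, then records the change.
    def append(self, item):
        super().append(item)
        self._mark()

    def extend(self, other):
        super().extend(other)
        self._mark()

    def pop(self, index=-1):
        item = super().pop(index)
        self._mark()
        return item

    def __setitem__(self, index, value):
        super().__setitem__(index, value)
        self._mark()

    def __delitem__(self, index):
        super().__delitem__(index)
        self._mark()


friends = TrackedList(["alice"])
before = friends._p_changed
friends.append("bob")          # no manual flag setting needed by the caller
after = friends._p_changed
contents = list(friends)
```

A complete wrapper would also need to cover insert(), remove(), sort(), reverse(), and the in-place operators; they are omitted here to keep the sketch short.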
Don't bother defining certain special methods on ExtensionClasses, because they won't work. Most notably, the __cmp__ method on an ExtensionClass will never be called. Neither will the reversed versions of binary arithmetic operations, such as __radd__ and __rsub__.
This is a moderately annoying limitation. It means that the
PersistentList class can't implement comparisons with regular
sequence objects, and therefore statements such as
if perslist==[] don't do what you expect; instead of performing
the correct comparison, they return some arbitrary fixed result, so
the if statement will always be true or always be false. There is no
good solution to this problem at the moment, so all you can do is
design class interfaces that don't need to overload
__cmp__ or the __r*__ methods.
This limitation is mostly Python's fault. As of Python 2.1, the most
recent version at this writing, the code which handles comparing two
Python objects contains a hard-wired check for objects that are class
instances, which means that type(obj) == types.InstanceType.
The code inside the Python interpreter looks like this:
    /* Code to compare objects v and w */
    if (PyInstance_Check(v) || PyInstance_Check(w))
            return PyInstance_DoBinOp(v, w, "__cmp__", "__rcmp__", do_cmp);
    /* Do usual Python comparison of v,w */
    c = PyObject_Compare(v, w);
While ExtensionClasses try to behave as much like regular Python
instances as possible, they are still not instances, and
type() doesn't return the InstanceType object, so
no attempt is ever made to call __cmp__. Perhaps Python 2.2
will repair this.
Recent versions of ZODB allow writing persistent classes that have __getattr__, __delattr__, or __setattr__ methods. The one minor complication is that the machinery for automatically detecting changes to the object is disabled while the __getattr__, __delattr__, or __setattr__ method is executing. This means that if such a method modifies the object, it should mark the object as dirty by setting the object's _p_changed attribute to true.
Now that we've looked at the basics of programming using the ZODB, we'll turn to some more subtle tasks that are likely to come up for anyone using the ZODB in a production system.
Ideally, before making a class persistent you would get its interface right the first time, so that no attributes would ever need to be added, removed, or have their interpretation change over time. It's a worthy goal, but also an impractical one unless you're gifted with perfect knowledge of the future. Such unnatural foresight can't be required of any person, so you have to be prepared to handle such structural changes gracefully. In object-oriented database terminology, this is a schema update. The ZODB doesn't have an actual schema specification, but you're changing the software's expectations of the data contained by an object, so you're implicitly changing the schema.
One way to handle such a change is to write a one-time conversion program that will loop over every single object in the database and update them to match the new schema. This can be easy if your network of object references is quite structured, making it easy to find all the instances of the class being modified. For example, if all User objects can be found inside a single dictionary or BTree, then it would be a simple matter to loop over every User instance with a for statement. This is more difficult if your object graph is less structured; if User objects can be found as attributes of any number of different class instances, then there's no longer any easy way to find them all, short of writing a generalized object traversal function that would walk over every single object in a ZODB, checking each one to see if it's an instance of User.
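A one-time conversion over a single container might look like the following sketch. It uses a plain dictionary as a stand-in for the BTree and an ordinary class as a stand-in for the persistent User class; the schema change itself (splitting a hypothetical name attribute into first_name and last_name) and the upgrade_user() helper are invented purely for illustration.

```python
class User:
    """Plain stand-in for the persistent User class."""
    pass


def upgrade_user(user):
    """Convert one object from the old layout to the new one.

    Returns True if the object was changed, False if it was already current.
    """
    if not hasattr(user, "name"):
        return False               # already uses the new schema
    first, _, last = user.name.partition(" ")
    user.first_name = first
    user.last_name = last
    del user.name                  # drop the obsolete attribute
    return True


# Stand-in for the 'userdb' container holding every User object.
userdb = {}

old_style = User()
old_style.name = "Andrew Kuchling"
userdb["amk"] = old_style

new_style = User()
new_style.first_name = "Bob"
new_style.last_name = "Smith"
userdb["bob"] = new_style

# Loop over every instance and count how many needed converting.
converted = sum(upgrade_user(user) for user in userdb.values())
```

Because the helper skips objects that are already in the new layout, the loop can be rerun safely if it is interrupted. Against a real database the conversion would finish by committing the transaction so that the modified objects are written out.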
Some OODBs support a feature called extents, which allows quickly finding all the instances of a given class, no matter where they are in the object graph; unfortunately the ZODB doesn't offer extents as a feature.