This is a feature matrix and comparison of the various ZODB storages.
Status: currently incomplete. It includes contributions from Toby Dickenson, Shane Hathaway, Jeremy Hylton, Barry Warsaw, and Paul Winkler. Please update the other columns, add new columns, and in particular add new rows if any other areas of comparison are important.
Please note that this information is for comparison only. The documentation for individual storages is likely to contain more detail than is presented here.
FileStorage | DirectoryStorage | BDBStorage | Ape |
---|---|---|---|
Maintainer | |||
Jeremy Hylton | Toby Dickenson | none | Shane Hathaway |
History | |||
ZODB's first storage. | Based on an idea prototyped in 2000; became stable in 2002. | First versions released in 2001. Added to the standard ZODB in version 3.2. Removed in ZODB 3.3 for lack of support. | Currently in alpha release. |
Installed Users | |||
All ZODB users. | Tens | ? | A handful |
Active Developers | |||
Tens | 1 | None | 1-2 |
License | |||
ZPL | LGPL | ZPL | ZPL |
Supported Platforms | |||
Everywhere. | Unix only; only Linux is well tested. | Everywhere. | Everywhere. |
ZODB3 | |||
yes | yes | yes | yes |
ZODB4 | |||
yes | not yet | yes | not yet |
ZEO Compatibility | |||
yes | ZEO 2 and later | yes | coming soon |
Implementation Overview | |||
Data.fs is a log file. New object revisions are appended. An in-memory index maps from oid to seek position within this file. | Uses ordinary files and directories to store revisions of ZODB objects; one file per revision per object, plus one file per transaction. | Uses the Sleepycat edition of BerkeleyDB. Approximately 20 tables are used to store the storage structures. Some tables are reference counted, with reference counts stored in other tables. | Serializes and stores objects using components. By writing or configuring components, you can store ZODB objects in virtually any kind of database. Alternatively, you can use one of the default configurations to store objects on the filesystem, in PostgreSQL, or in MySQL. |
Design Goals | |||
Extreme portability. Simple file format. | Fault tolerance. Manageability. Disaster-Preparedness. Simple implementation. | Leverage the famous robustness, performance, and scalability of BerkeleyDB. | Apelib bridges the gap between ZODB and relational data storage. It lets developers store ZODB objects in arbitrary databases and arbitrary formats, without changing application code. It combines the advantages of orthogonal persistence with relational storage. |
Variants | |||
Many people have hacked on FileStorage for their own purposes. PartitionedFileStorage by Shane Hathaway splits Data.fs into chunks, for easier manageability. | DirectoryStorage.Full is the only general-purpose production-quality variant. | bsddb3Storage.Full supports undo and versions, while bsddb3Storage.Minimal does not. ZODB4's MemoryStorage is built on the BDBStorage code, with an in-memory compatibility layer to the bsddb module API. | Filesystem storage, relational storage, and custom storage. |
External Dependencies | |||
If necessary, a filesystem (plus backup tools, etc.) that can deal with one huge Data.fs file; some systems still have trouble at the 2 GB limit. | A robust filesystem that will not dump files in lost+found after a power loss, and which can handle bucketloads of small files. The developers use ReiserFS on Linux. | Sleepycat's Berkeley DB 4.0.14 and Robin Dunn's PyBSDDB Python bindings. | None. |
Supports Undo | |||
Yes | Yes | Yes, in Full variant. | Not currently. |
Supports Versions | |||
Yes | No | Yes, in Full variant. | No. |
Performance | |||
Blindingly fast. Nothing beats FileStorage, so long as its index fits comfortably in memory. | A little slower than BDBStorage, but not enough to be significant except in a benchmark. | A little slower than FileStorage, but not enough to be significant except in a benchmark. | Depends on the component configuration. |
Memory Requirements | |||
It needs to keep its index in memory, which grows in proportion to storage size. As a general rule, allow between 2% and 10% of the Data.fs size in RAM, or performance will slow to a crawl. | Small, and independent of storage size. The operating system provides caching. | Small, and independent of storage size. BerkeleyDB provides many ways to tune how memory is used for caching. | Independent of storage size. |
Disk Space Efficiency | |||
Data.fs is the benchmark for disk space. During packing it uses double the disk space, as a new file is built in Data.fs.new. | Roughly 30% more space than Data.fs. | ? | Depends on the component configuration. |
Packing? | |||
Yes; needs occasional packing to remove old revisions and unreferenced objects. | Like FileStorage, but slower. DirectoryStorage has a fail-safe packing implementation that prevents a packing bug from causing permanent data loss. (The history of FileStorage shows that packing is error-prone.) | Reference counts are used to delete unreferenced objects, except when there are cycles. Full needs packing to remove old revisions and cycles of unreferenced objects; Minimal needs it just for cycles. This packing can be performed automatically. | Not needed when storing on the filesystem, but relational databases will need some kind of packing or reference counting. |
Online Backup | |||
Yes, using repozo.py; full and incremental backups. Incremental backups write a file that contains only the changes, but repozo has to read the whole storage into memory before producing it. (A third option saves the I/O cost, but exposes some risk of corrupt backups.) | Yes, using the backup.py script; full and incremental. Incremental backups write a file that contains the changes, and only the changed files need to be read while creating it. Backup events are transactional; you will not be left with a half-complete backup file if power is lost during or after the backup process. | Yes, as BerkeleyDB. | Expected to be provided by the database. |
Online Replication | |||
Note that it is possible to implement real-time replication, regardless of storage, using ZRS. ZRS has only been tested extensively with FileStorage. | |||
No. | Yes: a standard script replicates changes into an identical storage directory on another machine. Efficient and transactional. | No. | No. |
Storage Checking Tools | |||
A variety of tools distributed as standard with ZODB, but not with Zope. Many can be run without shutting down the storage. | checkds.py, a unified checking tool. Can be run without shutting down the storage. | BerkeleyDB provides tools for checking the tables, but there is no way to check storage-level details. | Depends on the database. No extra tools. |
Storage Dumping Tools | |||
fsdump.py | dumpdsf.py | No (?) | None needed. The data is not encoded. |
Startup/Shutdown time | |||
On shutdown the index is written to disk, and reloaded on startup. This takes time proportional to storage size, but is quick. After an unclean shutdown it needs to rebuild the index, which is slower: it may need to rebuild the full index (very slow) or just the changes since it was last updated. | Small, and independent of storage size. | Small, and independent of storage size. | Small, and independent of storage size. |
Recovery after unexpected power loss | |||
Automatically truncates the file to the last complete transaction, and copies the tail into a file for later inspection. PartitionedFileStorage may need some manual tweaking if you are very unlucky. | Automatically deletes any partial transactions. | Automatically deletes any partial transactions. | Expected to be provided by the database. |
What happens if my disk eats a chunk of my data? | |||
Use fsrecover.py. This will remove the offending records. Attempts to access them in an application will raise a POSKeyError exception, which must be fixed manually from within the application. | Restore a backup or switch to a replica. You do have one, don't you? If not, it may be possible to salvage some data from the storage. This process is untested. | ? | Depends on the database. |
Administrative Overhead | |||
Administrators need to be careful when running tools that modify the storage while the storage is running. | Separate download and install. | As Berkeley DB; learn the Sleepycat tools. | You are expected to know your database. |
Lines of Code (wc -l, excluding tests, including tools) | |||
5000 | 4800 | 3500 | 7000 |
Other Features | |||
None | Automatically checks for dangling references on commit, to avoid several possible causes of POSKeyError exceptions. Access to object revision files through the filesystem. All records have an MD5 checksum. | None | Ape component configurations are reusable for purposes other than persistent data storage. You can use them for exporting, importing, merging, synchronizing, and versioning objects. Ape also has a lot of functionality that does not depend on ZODB. |
Missing Features | |||
None | Does not report storage size or number of objects. | None | It is not yet easy to configure components. Also, when storing data on the filesystem, Ape doesn't yet poll the filesystem for changes to cached objects. |
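To make the Implementation Overview row for FileStorage concrete, here is a minimal, hypothetical sketch (not the real FileStorage code, and not the real Data.fs record format) of the same shape: an append-only log file plus an in-memory dict mapping each oid to the seek position of its latest record.

```python
import os
import pickle


class AppendLogStorage:
    """Toy append-only object store in the spirit of FileStorage:
    new revisions are appended to one file, and an in-memory index
    maps each oid to the file offset of its latest revision."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # oid -> offset of latest revision
        # Rebuild the index by scanning the whole log, as FileStorage
        # must after an unclean shutdown.
        if os.path.exists(path):
            with open(path, "rb") as f:
                while True:
                    pos = f.tell()
                    try:
                        oid, _ = pickle.load(f)
                    except EOFError:
                        break
                    self.index[oid] = pos

    def store(self, oid, data):
        # Writes are appends; later revisions shadow earlier ones.
        with open(self.path, "ab") as f:
            pos = f.tell()
            pickle.dump((oid, data), f)
        self.index[oid] = pos

    def load(self, oid):
        # Reads are one seek plus one record read.
        with open(self.path, "rb") as f:
            f.seek(self.index[oid])
            _, data = pickle.load(f)
            return data
```

The real Data.fs uses transaction and data-record headers rather than pickled tuples, but the trade-offs in the table follow from this shape: appends make writes fast, the index must fit in memory, and old revisions accumulate until a pack.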
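The Packing row notes that BDBStorage's reference counts cannot reclaim cycles of unreferenced objects. A minimal mark-and-sweep sketch (hypothetical, not BDBStorage's actual pack code) shows why a separate pack step is still needed: objects unreachable from the root are garbage even when every member of a cycle still has a nonzero reference count.

```python
def pack(references, root=0):
    """Keep only objects reachable from the root.

    `references` maps each oid to the oids its current revision
    references. Returns the packed mapping; anything dropped is
    garbage, including cycles that reference counting would keep
    alive forever.
    """
    reachable = set()
    stack = [root]
    while stack:
        oid = stack.pop()
        if oid in reachable:
            continue
        reachable.add(oid)
        stack.extend(references.get(oid, ()))
    return {oid: refs for oid, refs in references.items()
            if oid in reachable}
```

Here oids 3 and 4 reference each other, so each has a refcount of one, yet neither is reachable from the root; only a traversal-based pack can delete them.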
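The Online Backup row mentions an option that saves the I/O cost of incremental backups. Because an append-only storage only ever grows at the tail, an incremental backup can, in principle, copy just the bytes added since the previous run. This is a hedged sketch of that idea only; repozo.py's actual file format and verification steps differ.

```python
def incremental_backup(src_path, dest_path, last_size):
    """Copy only the bytes appended to src_path since the previous
    backup, which ended at offset `last_size`. Returns the new size
    to remember for the next incremental run."""
    with open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        src.seek(last_size)
        dest.write(src.read())
        return src.tell()
```

The risk the table alludes to is visible here: nothing checks that the bytes before `last_size` are still what was backed up earlier, so corruption in the already-backed-up region goes unnoticed unless the tool rereads the whole file.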