The Humble Hash as DB
December 18, 2000
One of the most frequently used data structures in Perl is the venerable
hash.
Simply put, in a hash structure each item is referenced by a
"key" which is associated with a "value". Also known as a key/value
pair, a hash item is elementary Perl:
my %car = ( 'make'  => 'Nissan',
            'model' => 'Maxima',
            'year'  => '1997',
            'color' => 'evergreen'
          );
Here we have a typical hash, named %car, with four keys:
make, model, year, and color. Each key has an
associated value, such as 1997 or evergreen. It's not a
big leap to see this hash as a very simple database. In
fact, we can see the hash %car as a database table holding a single
record, with each key as a field name and each value as the contents
of that field.
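To make the table analogy concrete, here is a minimal sketch that walks the %car hash and prints each key as a field name next to its value. The fixed-width layout, the @report array, and the sorted key order are choices made for this illustration, not something Perl requires.

```perl
use strict;
use warnings;

my %car = ( 'make'  => 'Nissan',
            'model' => 'Maxima',
            'year'  => '1997',
            'color' => 'evergreen'
          );

# Hashes have no inherent order, so sort the keys for stable output.
my @report;
foreach my $field (sort keys %car) {
    # Left-justify the field name in a 6-character column.
    push @report, sprintf("%-6s %s", $field, $car{$field});
}
print "$_\n" for @report;
```

Each printed line reads like one column of a record: the field name on the left, its stored value on the right.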
There are two reasons we might want to save this hash to disk. For
one, we might want to remember its contents to consult or even modify
on future invocations of a script. This is a form of state
preservation. Second, the hash might be very large -- suppose it has
1,000 or 40,000 keys -- and working with the whole hash in memory
could be an inefficient use of resources.
Using Perl's dbmopen function, we can "tie" this hash to a
disk file, rather than keep it in memory. It's important to note that
you need to tie the hash to disk before you begin using it -- a
pre-populated hash won't simply save its existing data to disk once
it becomes tied. dbmopen typically accepts three
parameters:
dbmopen (%hashName, "filename", $permissions_mode)
Imagine then that we want to create the %car hash in a
disk-based file:
my %car=();
dbmopen (%car, "car_data", 0666) ||
    die "Could not open or create database: $!";
%car = ( 'make'  => 'Nissan',
         'model' => 'Maxima',
         'year'  => '1997',
         'color' => 'evergreen'
       );
dbmclose (%car);
First we initialize the hash so that it is empty, and then the
hash is tied to a disk file. Using the dbmopen function we
create a disk file named car_data. Note that on Unix systems
the resulting file may automatically grow a ".db" extension onto the
filename, while on Windows it may not; with some DBM libraries the data
is instead split across a pair of ".dir" and ".pag" files. The
permissions mode is applied to the database file only if it is
created from scratch, and on Unix it is further restricted by the
process umask -- exactly how this number behaves will depend
on the operating system.
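Because a pre-populated hash is not written out when it becomes tied, data that already lives in memory has to be copied across after the tie. The following is a minimal sketch of that idea, not the only way to do it. The names %in_memory, %on_disk and %check are hypothetical, and the file is placed in a temporary directory (via the core File::Temp module) purely so the example cleans up after itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical in-memory data we want to persist.
my %in_memory = ( 'make' => 'Nissan', 'model' => 'Maxima' );

# Work in a throwaway directory so the sketch leaves nothing behind.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/car_data";

# Tie a separate, empty hash to disk first...
my %on_disk;
dbmopen(%on_disk, $file, 0666)
    or die "Could not open or create database: $!";

# ...then copy the pairs across; each assignment goes to the DBM file.
$on_disk{$_} = $in_memory{$_} for keys %in_memory;
dbmclose(%on_disk);

# Reopen the file to confirm the data survived the close.
my %check;
dbmopen(%check, $file, 0666)
    or die "Could not reopen database: $!";
my $saved_model = $check{model};
dbmclose(%check);

print "model: $saved_model\n";
```

Copying key by key, rather than assigning a whole list to the tied hash, also leaves any pairs already stored in the file untouched.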
This particular example illustrates only creating a new hash and
saving it to disk. In fact, the real utility of a DBM file is that we
can use the hash while it resides on disk, just as if it
resided in memory.
my %car=();
dbmopen (%car, "car_data", 0666) ||
    die "Could not open or create database: $!";
print $car{model}."\n";
dbmclose (%car);
The example above assumes that the %car hash already
contains data on disk, as if we had run the previous example first.
This script simply looks up the value stored under the model key and
prints it -- the result in this case would be 'Maxima'. Once you tie a
hash to the DBM file using dbmopen, you simply proceed to use
that hash just as any hash, despite the fact that it resides on the
disk. If you add new data to the hash, it will automatically be
stored to the disk.
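As a rough sketch of what "just as any hash" means in practice, the example below simulates three separate runs of a script against the same DBM file: one that creates it, one that changes, adds and deletes keys, and one that reads everything back. The mileage field, the %seen hash and the temporary directory are assumptions made for this illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );   # throwaway location for this sketch
my $file = "$dir/car_data";

# First "run": create the database and store two fields.
my %car;
dbmopen(%car, $file, 0666) or die "Could not open or create database: $!";
$car{make}  = 'Nissan';
$car{color} = 'evergreen';
dbmclose(%car);

# Second "run": change one value, add a key, delete another.
dbmopen(%car, $file, 0666) or die "Could not open database: $!";
$car{color}   = 'silver';      # overwrite an existing value
$car{mileage} = 42000;         # hypothetical new field
delete $car{make};             # remove a key from the disk file
dbmclose(%car);

# Third "run": walk the file with each(), one pair at a time.
my %seen;
dbmopen(%car, $file, 0666) or die "Could not open database: $!";
while (my ($field, $value) = each %car) {
    $seen{$field} = $value;
    print "$field = $value\n";
}
dbmclose(%car);
```

For a DBM file with many thousands of keys, each is usually the gentler way to iterate, since keys and values build their entire list in memory before the loop starts.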