Getting Deep with Hashes
December 18, 2000
All of the hashes we've seen today have been one-level. Put
another way, each key is simply associated with a scalar value. This
works great with all the DBMs we've seen for storing these hashes
in disk-based files.
But the game changes when you want to store a multi-level hash.
After all, Perl doesn't restrict the value of a key to a plain string
or number -- through references, the value could be a list of
scalars, or another hash, or a list of hashes, or a hash of lists,
and so on ad infinitum. In fact, Perl hashes can be arbitrarily
complex with many levels, and this is what makes Perl hashes
especially powerful (and sometimes mind-numbingly confusing). None of
the DBM approaches we've seen thus far can properly store such a
multi-level hash, though: a DBM file holds only flat strings, so a
nested structure does not survive the trip to disk.
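To see the problem concretely, here is a minimal sketch that ties a hash to a plain DBM and tries to store a nested hash in it. It assumes SDBM_File (which ships with Perl) as the plain DBM; the file name plain_data and the temporary directory are made up for the illustration.

```perl
use strict;
use warnings;
use Fcntl;                         # O_CREAT and O_RDWR constants
use SDBM_File;                     # a plain DBM bundled with Perl
use File::Temp qw(tempdir);

# Work in a throwaway directory (hypothetical location for this sketch)
my $dir = tempdir(CLEANUP => 1);

my %plain;
tie(%plain, 'SDBM_File', "$dir/plain_data", O_CREAT|O_RDWR, 0666)
    or die "Could not open or create database: $!";

# Try to store a second-level hash under a single key
$plain{'JN1HU11P1HX875232'} = { make => 'Nissan', model => 'Maxima' };

# What comes back is the reference turned into text, not the data
my $stored = $plain{'JN1HU11P1HX875232'};
print "Stored value: $stored\n";
print "Still a hash reference? ", (ref($stored) ? "yes" : "no"), "\n";

untie %plain;
```

The stored value reads something like HASH(0x...): the DBM kept only the text form of the reference, and the make and model never reached the file.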
Fortunately, there is a solution to everything. Well, not
everything, but at least to this problem.
MLDBM is actually a module which sits on top of one of the other
DBMs we've seen today, and lets you store multi-level hashes
transparently, as if they were single-level hashes. The key to this,
not to steal any of MLDBM's thunder, is rather simple, and it's called
serialization.
A serializer is an algorithm which essentially "flattens" a
multi-level data structure into a single scalar value. Of course,
like a compressed text file, you can't actually use the data
structure in its serialized state, but it is a way to make such
complex data portable. There are several serializers available for
Perl, notably Data::Dumper and Storable. MLDBM essentially lets
you hook up one of these serializers with your preferred DBM, so
that the serializing and de-serializing happens without your
intervention.
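As a rough illustration of what a serializer does on its own, the sketch below uses Storable's freeze and thaw functions to flatten a small nested structure into one scalar and then rebuild it. The sample data, including the options list, is invented for the example.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# A small nested structure: a hash that contains a list
my %car = (
    make    => 'Nissan',
    model   => 'Maxima',
    options => ['sunroof', 'cd player'],
);

# freeze() flattens the whole structure into a single scalar
my $flat = freeze(\%car);
print "Serialized into one scalar of ", length($flat), " bytes\n";

# thaw() rebuilds an independent copy from that scalar
my $copy = thaw($flat);
print "Restored: $copy->{make} $copy->{model}, ",
      "first option: $copy->{options}[0]\n";
```

The frozen scalar is exactly the kind of value a DBM can hold, which is why pairing a serializer with a DBM works.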
By default, MLDBM uses the SDBM database with the Data::Dumper
serializer. But I prefer Storable, because it is faster. I also
prefer DB_File for a DBM, and since we've seen that in use, let's
illustrate using MLDBM with DB_File and Storable to tie a multi-level
hash to disk. That's quite a mouthful.
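    use strict;
    use warnings;
    use MLDBM qw(DB_File Storable);   # DB_File as the DBM, Storable as the serializer
    use Fcntl;                        # supplies the O_CREAT and O_RDWR constants

    my %car = ();

    # Tie %car to the file "car_data", stored as a BTree
    tie(%car, "MLDBM", "car_data", O_CREAT|O_RDWR, 0666, $DB_File::DB_BTREE)
        or die "Could not open or create database: $!";

    # Each top-level key is a VIN; each value is a hash describing the car
    $car{'JN1HU11P1HX875232'} = {
        'make'  => 'Nissan',
        'model' => 'Maxima',
        'year'  => '1997',
        'color' => 'evergreen'
    };
    $car{'1GNDM15Z2HB187252'} = {
        'make'  => 'Chevrolet',
        'model' => 'Astro',
        'year'  => '1999',
        'color' => 'black'
    };

    # Dig into the second level of the hash for each VIN
    print "Inventory contains:\n";
    foreach (keys %car) {
        print $_ . ":\t" . $car{$_}{make} . " " . $car{$_}{model} . "\n";
    }

    untie(%car);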
In the use MLDBM line we express our preference for the DB_File
DBM and the Storable serializer. From there, we tie the hash and work
with it rather normally. The arguments after "MLDBM" in the tie call
are handed through to DB_File, so in this case we again create a
BTree hash.
Our %car hash is a two-level hash, because the values of the
top-level keys are themselves hashes. Each top-level key is a vehicle
identification number, or VIN, representing a fictional inventory.
The value for each VIN is a hash describing the car. A short output
loop near the end of this script illustrates how we can dig into the
hash levels.
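There's nothing special about the code for managing this hash, and
that's the point. Thanks to MLDBM, we deal with a multi-level hash
much like any other, and the fact that it is being stored to disk,
serialized, and de-serialized is almost entirely transparent. The one
wrinkle, noted in MLDBM's own documentation, is that changing a nested
value in place does not reach the disk: you fetch the inner hash,
modify it, and store it back under its key.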
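One place where the tied hash behaves differently from an ordinary one is updating a value on the second level. The following sketch shows the fetch, modify, and store pattern that MLDBM's documentation recommends. It assumes MLDBM and DB_File are installed, and it writes to a temporary directory invented for the illustration.

```perl
use strict;
use warnings;
use Fcntl;
use File::Temp qw(tempdir);
use MLDBM qw(DB_File Storable);

# Throwaway location for the database (hypothetical, for this sketch only)
my $dir = tempdir(CLEANUP => 1);
my $vin = 'JN1HU11P1HX875232';

my %car;
tie(%car, 'MLDBM', "$dir/car_data", O_CREAT|O_RDWR, 0666, $DB_File::DB_BTREE)
    or die "Could not open or create database: $!";

$car{$vin} = { make => 'Nissan', model => 'Maxima', color => 'evergreen' };

# A direct nested assignment changes only a temporary, de-serialized copy
$car{$vin}{color} = 'silver';
my $after_direct = $car{$vin}{color};
print "After direct assignment: $after_direct\n";

# Fetch the inner hash, modify it, then store the whole value back
my $record = $car{$vin};
$record->{color} = 'silver';
$car{$vin} = $record;
my $after_store = $car{$vin}{color};
print "After fetch, modify, store: $after_store\n";

untie %car;
```

The direct assignment leaves the stored color as evergreen, because only whole values assigned to a top-level key pass through the serializer on their way to disk.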