Databases: Fielded Versus Amorphous
What is a Database, Exactly?
Simply put, a database is just a file containing a bunch of data. This data may be in the form of strings or numerics or a combination of the two.
The laziest way to build a database is to use fielded data. Let's say you want to build a data file of the employees at your company. You would field the information, perhaps like this:
Name - allow 40 characters
In your fielded database, a name like Joe T. Hu would still require 40 bytes of storage space, despite being only 9 characters long. When written to your data file, Joe's name would be padded with 31 spaces, so that it would be the full 40 characters long.
The advantage of fielded databases is that you can parse it by character count. In the example above, you would know that if all the files are packed one after another in the data file, every 59 characters equals one employee record. So you could simply "break" the file every 59 bytes, print the data to screen and you would have a neat report:
Joe T. Hu 555-55-5555 12/12/85
The problem with fielded databases is that they are disk hogs. In a typical database, easily 60% of the storage space on disk is filled with spaces!
Disks have really gotten cheap in the past few years. Even a 4 gigabyte disk doesn't cost that much anymore. So lazy programmers are to be excused if they say, "So what if it's a bit inefficient? It's no big deal!"
And it is hard to argue against the logic for local systems.
But when you are delivering that database to a visitor to your site through a dial-up modem that may only be running at 14.4 kbaud (1800 bytes per second) -- sending 2 of every 3 bytes of a fielded database as spaces is a real waste of the visitor's time, and is increasing your server load and bandwidth consumption unnecessarily. If you are billed for long distance telephone time, you will find a strong economic reason to converve bandwidth.
You may think that bandwidth consumption is no big deal to you personally. If you have a site that has low numbers of visitors and server hits, you'd be right.
But if your site begins getting the number of hits it really deserves, once the world discovers just how neat it is, you will be in for a very unpleasant surprise.
Most access providers cut you a bit of slack. They probably "give" you a gig of download each month from your site at no extra charge.
But if your site becomes a really popular destination, you're going to need lots more bandwidth than that -- more on the order of 5 to 10 gigabytes per day -- 150 to 300 gigs per month. Wait until you get the server bill for that "overage"! All this stuff about conserving bandwidth will suddenly become very important to you -- personally.
The Solution: The Amorphous Database
As we saw with the fielded database, 60% or so of the space occupied by the data is wasted. That's where the amorphous database shines.
An amorphous database is identical to the fielded database from the perspective that it stores the information you want, ready for retrieval.
The difference is simple: it doesn't pad the data. In fact, it is far more flexible, because you don't have to worry about a name being longer than the number of characters you allocated for that field.
You can have an empty field which will occupy no space and in the next record, an entry that takes up 122 characters.
You simply pick a character that is not being used in your database, like a tilde (~) and specify it as a field separator.
In the example above for a fielded database, we set up three fields - one for name, one for social security number and one for hire date.
So, for an amorphous database, each record will contain three tildes -- one to serve as a field terminator for each field.
Joe T. Hu~555-55-5555~12/12/85~Mary E. Richards~444-44-4444~09/01/92~
Despite appearing on two lines in this text (if your browser type size is set to 12 pt or greater) the line above is just one long line.
Parsing this database is actually just as simple as parsing a fielded one. You simply look for the tildes until you are done.
Because the database is now one long string, (without delving into the mysteries of contiguous - random - and non-contiguous - sequential -files) you can now pull any specific record very quickly with a simple .indexOf("name"); call.
From time to time, depending upon what you want to do, you may wish to include a record terminator in your amorphous database as well. It will be useful if you wish to load the records into an array without parsing the string all the way down to its individual field components. We might use an asterisk (*) for example, which would change our file like this:
Joe T. Hu~555-55-5555~12/12/85~*Mary E. Richards~444-44-4444~09/01/92~*
Delivering Databases with Your Own Pages
The fastest download of database info is as straight text. The most practical way to store the database is in a "hidden" form element.
A "hidden" form element, on the other hand is not subject to these limitations. You can store a 30k or more database on just a single text line in your "hidden" form element as a VALUE="database here"
Then, as we saw in the last chapter, you may simply call it into a string with this call:
Then you can parse that string into records and/or fields, read it into an array if it is desirable to do so, or simply pick pieces of it out to display. This is particularly useful for site searches.
This example may help you with some of the "tricks" to formulate your own site searches: