The actual block size used by KEYEDIO differs from the block size you specify because space must be added to round the record size up to an exact multiple of 6 characters, to provide room for the relative keys of an INDEXEDNOTRESTRICTED file, and to provide for the 10 words of header information in each block. The actual block size determines how much save memory the KEYEDIO file occupies. If the calculated actual block size is not satisfactory, you might need to adjust the specified block size; that is, the value of the BLOCKSIZE attribute.
Make sure the block size is large enough to store more than one key in each block; otherwise, an error occurs.
The specified block size is saved in the KEYEDIO file and is returned as the value of the BLOCKSIZE file attribute when the indexed file is open. If this attribute is interrogated when the file is closed, it always returns the value of 30, which is the value that the KEYEDIO library uses when creating the file. (This is a side effect of the fact that the KEYEDIO library manipulates the file using DIRECT I/O.)
Effect of Block Size on Processor Time
The proper specification of block size is extremely important to the performance of applications that use indexed files because the actual block size (calculated according to the algorithm described in “Calculating Actual Block Size”) is used not only for storing the data but also as the size of the key index tables used to access the data. The size of these tables and the number of records in the file determine how many tables must be searched in order to find a particular record. Each additional table that must be searched increases the processor and I/O time that is required to access a record.
The most efficient access is obtained when only a single table must be searched in order to find the key. A single-table search requires that the block size be large enough to hold the keys for all the records in the file; thus, this block size is usually not a practical choice except for files with a small number of records.
The next most efficient access is obtained when only two tables, a coarse table and a fine table, must be searched to find the key. A two-table search requires a block size large enough to hold a number of keys equal to the square root of the number of records in the file. A block size of this value is generally the most suitable choice for all but very small or very large indexed files. A block size smaller than this square root value requires more than two table searches and noticeably increases the time required for random accesses to the file.
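The relationship between keys per block and search depth can be sketched in Python. This is an illustration inferred from the one-table and two-table cases described above, not KEYEDIO's own logic. The function name `table_levels` and the extension to more than two levels are assumptions.

```python
def table_levels(num_records: int, keys_per_block: int) -> int:
    """Estimate how many index tables must be searched to find a key.

    Assumption: a file needs the smallest number of levels L for which
    keys_per_block ** L >= num_records. This matches the one-table and
    two-table (square root) cases in the text; deeper levels are extrapolated.
    """
    if num_records < 1:
        raise ValueError("num_records must be at least 1")
    if keys_per_block < 2:
        # The text requires that a block hold more than one key.
        raise ValueError("a block must hold more than one key")
    levels = 1
    capacity = keys_per_block
    while capacity < num_records:
        capacity *= keys_per_block
        levels += 1
    return levels


print(table_levels(10_000, 10_000))  # every key fits in a single table
print(table_levels(10_000, 100))     # 100 is the square root of 10,000
print(table_levels(10_000, 99))      # just below the square root
```

The third call shows why falling even slightly short of the square root matters: under this model, the file needs one more level of tables.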
Effect of Block Size on Save Memory
The buffers used by KEYEDIO occupy save memory. The amount of save memory to be used for a given indexed file can be approximated by multiplying the actual block size (calculated according to the algorithm given in “Calculating Actual Block Size”) by the number of buffers to be used for the file.
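As a rough illustration of this approximation, the following Python sketch multiplies an actual block size by a buffer count. It assumes the actual block size is expressed in words, as in the block size algorithm, and uses the 6-characters-per-word conversion that the algorithm implies. The function name and the sample values are hypothetical.

```python
CHARS_PER_WORD = 6  # the text converts characters to words by dividing by 6


def save_memory_words(actual_block_size_words: int, num_buffers: int) -> int:
    """Approximate save memory, in words, used by the buffers of one file."""
    return actual_block_size_words * num_buffers


# Hypothetical example: a 360-word actual block size with 4 buffers.
words = save_memory_words(360, 4)
print(words, "words, or about", words * CHARS_PER_WORD, "characters")
```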
If the save memory requirements for block sizes that provide one- or two-level access to data are too great, calculate a new block size that provides three- or four-level access. Use the algorithm given under “Calculating User-Specified Block Size (2 Level)”, but in step 2 compute the cube root or fourth root of the number of records instead of the square root.
Calculating Actual Block Size
The actual block size used by KEYEDIO differs from the BLOCKSIZE attribute value you specify because space must be added to round the record size up to an exact multiple of 6 characters, to provide room for the relative keys of an INDEXEDNOTRESTRICTED file, and to provide for the 10 words of header information in each block.
KEYEDIO uses the following algorithm to compute the actual block size for a file:
1. Divide the specified BLOCKSIZE by the specified record size (MAXRECSIZE), truncating any remainder. This gives the records per block value.
2. Round the specified record size up to the next multiple of 6 characters, if it is not already an exact multiple of 6 characters. Convert this record size to the number of words required to hold the record by dividing by 6.
3. If this is an INDEXEDNOTRESTRICTED file, add 1 word to the record size in words to allow space for the relative key.
4. Compute a trial block size by multiplying the record size in words (calculated in steps 2 and 3) by the specified records per block (calculated in step 1). Then add 10 words to provide space for the header information in each block.
5. Calculate the actual block size by rounding the trial block size (from step 4) up to the next multiple of 30 words, if it is not already an exact multiple of 30 words.
Once the actual block size has been calculated, as many records as can fit are placed in each block. That is, if the rounding process of step 5 adds enough space to the block for additional records, that space will be used, and the actual records per block will be greater than the specified records per block calculated in step 1.
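The five steps above, together with the final records-per-block adjustment, can be sketched in Python as follows. This is a minimal sketch of the algorithm as described here, not KEYEDIO's code. It assumes BLOCKSIZE and MAXRECSIZE are both given in characters; the function and parameter names, and the guard against a block smaller than one record, are assumptions added for illustration.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple (unchanged if already a multiple)."""
    return -(-value // multiple) * multiple


def actual_block_size(blocksize_chars: int, maxrecsize_chars: int,
                      indexed_not_restricted: bool) -> tuple[int, int]:
    """Return (actual block size in words, actual records per block)."""
    # Step 1: specified records per block, truncating any remainder.
    specified_rpb = blocksize_chars // maxrecsize_chars
    if specified_rpb < 1:
        # Assumption: the text does not say what happens in this case.
        raise ValueError("BLOCKSIZE is smaller than the record size")
    # Step 2: round the record up to a multiple of 6 characters, then to words.
    record_words = round_up(maxrecsize_chars, 6) // 6
    # Step 3: one extra word for the relative key.
    if indexed_not_restricted:
        record_words += 1
    # Step 4: trial block size, including the 10-word block header.
    trial_words = record_words * specified_rpb + 10
    # Step 5: round up to a multiple of 30 words.
    actual_words = round_up(trial_words, 30)
    # Any space added by step 5 is used for additional records.
    actual_rpb = (actual_words - 10) // record_words
    return actual_words, actual_rpb


print(actual_block_size(1800, 80, False))
print(actual_block_size(1800, 80, True))
```

With a BLOCKSIZE of 1800 characters and 80-character records, step 1 gives 22 records per block. In the second call, the rounding in step 5 leaves room for a 23rd record, which illustrates how the actual records per block can exceed the specified value.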
Calculating User-Specified Block Size (2 Level)
To calculate the proper block size for an indexed file, assuming the two-level table search is desired, make the following calculation:
1. Calculate the number of records the file will contain over its lifetime.
2. Compute the square root of the number of records. Then multiply this value by an adjustment factor to allow for the fact that not all the tables will be completely filled. The result of this computation is the desired number of keys per block.
   The value of the adjustment factor is determined by the way the file is created and updated. If the file is created sequentially with the entries for all the keys in ascending order, and few records will be added later, a small adjustment factor of 1.1 can be used. If the file is created sequentially, but more records are to be added, use an adjustment factor of about 1.3 (or greater, if many records will be added). If the file is created with the entries for some of the keys occurring in random order, use an adjustment factor of 2.0.
3. Compute the size of the largest key entry by performing the following steps:
   a. Find the size of the largest key in the record.
   b. Round this size up to the next multiple of 6 characters, if it is not already a multiple of 6 characters.
   c. Add 6 characters to provide space for the key entry's pointer to the data record.
4. Compute the desired block size by multiplying the desired number of keys per block (from step 2) by the size of the largest key entry (from step 3).
5. Round this desired block size up to the next multiple of the record size, if it is not already a multiple of the record size. This last step ensures that the block size chosen is suitable for storing the data records as well as the keys.
The block size calculated by this procedure provides two-level access, but its impact on the system must be determined before deciding that this block size is the correct block size to use. Consider, in particular, the effects of the block size on memory usage. Refer to “Effect of Block Size on Save Memory” in this section.
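The procedure can be sketched in Python as shown below. This is a sketch under stated assumptions rather than a definitive implementation: all sizes are taken to be in characters, the number of keys per block is rounded up to a whole number (the text does not say how to round), and the product is rounded to 9 decimal places first to suppress floating-point noise. The function name, the parameter names, and the `levels` parameter are hypothetical; `levels` covers the cube root and fourth root variants mentioned under “Effect of Block Size on Save Memory”.

```python
import math


def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple (unchanged if already a multiple)."""
    return -(-value // multiple) * multiple


def user_block_size(num_records: int, adjustment: float,
                    largest_key_chars: int, record_size_chars: int,
                    levels: int = 2) -> int:
    """Return a BLOCKSIZE value, in characters, for the desired table depth."""
    # Step 2: root of the record count times the adjustment factor.
    root = num_records ** (1.0 / levels)
    # Assumption: round up to whole keys, after trimming float noise.
    keys_per_block = math.ceil(round(root * adjustment, 9))
    # Step 3: largest key rounded up to 6 characters, plus a 6-character pointer.
    key_entry_chars = round_up(largest_key_chars, 6) + 6
    # Step 4: desired block size.
    desired = keys_per_block * key_entry_chars
    # Step 5: round up to a multiple of the record size.
    return round_up(desired, record_size_chars)


# 10,000 records, 10-character keys, 100-character records.
for factor in (1.1, 1.3, 2.0):
    print(factor, user_block_size(10_000, factor, 10, 100))
```

The result is only the specified block size. The actual block size, and therefore the save memory used for each buffer, still has to be derived from it as described under “Calculating Actual Block Size”.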
Calculating Actual Area Size
KEYEDIO uses the following algorithm to calculate the actual AREASIZE of KEYEDIO files:
1. Converts the declared AREASIZE value to the number of blocks by dividing the AREASIZE value by the number of records per block.
2. Multiplies the result by the actual block size in segments.
3. If the resulting AREASIZE value is too small, assigns a size sufficient for one block per area.
The KEYEDIO file does not actually contain an area equal to the product of AREASIZE times AREAS, because many of the blocks in the file are used to hold index tables for the keys.