Database Systems: The Complete Book- P7

Chia sẻ: Thanh Cong | Ngày: | Loại File: PDF | Số trang:50

Thêm vào BST

Báo xấu

163
lượt xem 8
download

Download Vui lòng tải xuống để xem tài liệu đầy đủ

Database Systems: The Complete Book- P7: Database Systems and Database Design and Application courses offered at the junior, senior and graduate levels in Computer Science departments. Written by well-known computer scientists, this introduction to database systems offers a comprehensive approach, focusing on database design, database use, and implementation of database applications and database management systems

Chủ đề:

Bình luận(0) Đăng nhập để gửi bình luận!

Lưu

Nội dung Text: Database Systems: The Complete Book- P7

CHAPTER 12. REPRESENTING DATA EL.EMENTS 12.3. REPRESENTING BLOCK AAiD RECORD ADDRESSES 581 logical physical do not know in advance how many records the block will hold, and we do not Logical address have to allocate a fixed amount of the block header to the table initially. - offset table-) header -- ' unused -+ address record record 4 record 3 record 1 Figure 12.6: A map table translates logical to physical addresses t t I to reserve some bytes to represent the host, others to represent the storage unit, and so on, a rational address notation would use considerably more than Figure 12.7: A block with a table of offsets telling us the position of each record 10 bytes for a system of this scale. within the block The address of a record is now the physical address of its block plus the offset 12.3.2 Logical and Structured Addresses of the entry in the block's offset table for that record. This level of indirection within the block offers many of the advantages of logical addresses, without the One might wonder what the purpose of logical addresses could be. All the infor- need for a global map table. mation needed for a physical address is found in the map table, and following logical pointers to records requires consulting the map table and then going 1 can move the record around within the block, and all we have to do % to the physical address. However, the level of indirection involved in the map is change the record's entry in the offset table; pointers to the record will table allows us considerable flexibility. For example, many data organizations still be able to find it. require us to move records around, either within a block or from block to block. If we use a map table, then all pointers to the record refer to this map table, We can even allow the record to move to another block, if the offset table and all we have to do when ure move or delete the record is to change the entry entries are large enough to hold a '.forwarding address" for the record. for that record in the table. Many combinations of logical and physical addresses are possible as well, Finally, we have an option, should the record be deleted, of leaving in its yielding structured address schemes. For instance, one could use a physical offset-table entry a tombstone, a special value that indicates the record has address for the block (but not the offset within the block), and add the key value been deleted. Prior to its deletion, pointers to this record may have been for the record being referred to. Then, to find a record given this structured stored a t various places in the database. After record deletion, following address, we use the physical part to reach the block containing that record, and a pointer to this record leads to the tombstone, whereupon the pointer xe examine the records of the block to find the one with the proper key. can either be replaced by a null pointer, or the data structure otherwise Of course, to survey the records of the block, we need enough information modified to reflect the deletion of the record. Had we not left the tomb- to locate them. The simplest case is when the records are of a known, fixed- stone. the pointer might lead to some new record. with surprising, and length type, with the key field at a known offset. Then, we only have to find in erroneous. results. the block header a count of how many records are in the block, and xve know exactly where to find the key fields that might match the key that is part of the 12.3.3 Pointer Swizzling address. However, there axe many other ways that blocks might be organized so that we could survey the records of the block; we shall cover others shortly. Often, pointers or addresses are part of records. This situation is not typical A similar, and very useful, combination of physical and logical addresses is for records that represent tuples of a relation, but it is common for tuples to keep in each block an oflset table that holds the offsets of the records within that represent objects. Also, modern object-relational database systems allow the block, as suggested in Fig. 12.7. Notice that the table grows from the front attributes of pointer type (called references), so even relational systems need the end of the block, while the records are placed starting at the end of the block. ability to represent pointers in tuples. Finally, index structures are composed This strategy is useful when the records need not be of equal length. Then, we of blocks that usually have pointers within them. Thus, we need to study Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
582 CHAPTER 12. REPRESENTING DATA ELEMENTS 12.3. R EPRESEiVTIArG BLOCIC AND RECORD ADDRESSES 583 DBaddr mem-addr database Ownership of Memory Address Spaces address In this section we have presented a view of the transfer between secondary and main memory in which each client owns its own memory address space, and the database address space is shared. This model is common in object-oriented DBMS's. However, relational systems often treat the memory memory address space as shared; the motivation is to support recovery address and concurrency as we shall discuss in Cliapters 17 and 18. A useful compromise is to have a shared memory address space on the server side, with copies of parts of that space on the clients' side. Figure 12.8: The translation table turns database addresses into their equiva- That organization supports recovery and concurrency, while also allowing lents in memory processing to be distributed in "scalable" way: the more clients the more processors can be brought to bear. To a ~ o i d cost of translating repeatedly from database addresses to mem- the ory addresses, several techniques have been developed that are collectively known as pointer swizzling. The general idea is that when we move a block the management of pointers as blocks are moved between main and secondary from secondary to main memory, pointers within the block may be "s~vizzled," memory; we do so in this section. that is, translated from the database address space to the virtual address space. As we mentioned earlier, every block, record, object, or other referenceable Thus, a pointer actually consists of: data item has two forms of address: 1. Its address in the server's database address space, which is typically a 1. A bit l indicating whether the pointer is currently a database address or a sequence of eight or so bytes locating the item in the secondary storage (swizzled) memory address. of the system. We shall call this address the database address. 2. An address in virtual memory (provided that item is currently buffered 2. The database or memory pointer, as appropriate. The same space is used in virtual memory). These addresses are typically four bytes. lVe shall for ~vllirheveraddress form is present at the moment. Of course. not all refer to such an address as the m e m o r y address of the item. the space may be used when the memory address is present, because it is typically shorter than the database address. I?-hen in secondary storage, we surely must use the database address of the item. However, when the item is in the main memoiy, we can refer to the item Exatnple 12.7: Figure 12.9 shoxvs a simple situation in which the Block 1has by either its database address or its memory address. It is more efficient to put a record ri-ith pointers to a second record or; the same block and to a record on memory addresses wherever an item has a pointer, because these pointers can another block. The figure also sho~vs what might happen n-hen Block 1 is copied be followed using single machine instructions. to memory. The first pointer. which points within Block 1, can be stvizzled so In contrast, following a database address is much more time-consuming. \I-e ' it points directly to the memory address of the target record. need a table that translates from all those database addresses that are currently in virtual memory to their current memory address. Such a translation table is However. if Block 2 is not in memory at this time. then we cannot sn-izzlethe suggested in Fig. 12.8. It may be reminiscent of the map table of Fig. 12.6 that iecond pointer: it must remain unslvizzled. pointing to the database address of translates between logical and physical addresses. Ho~vever: its target. Should Block 2 be brought to memory later. it becomes theoretically possible to swizzle the second pointer of Block 1. Depending on the swizzling a) Logical and physical addresses are both representations for the database strategy used. there n ~ a y may not be a list of such pointers that are in or address. In contrast, memory addresses in the translation table are for memory. referring to Block 2; if so; then we have the option of sx-izzling the copies of the corresponding object in memory. pointer at that time. b) .Ill addressable items in the database have entries in the map table, while only those items currently in memory are mentioned in the translation There are several strategies we can use to determine ~vhento sn-izzle point- table. ers. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
r8p q CH-APTER 12. REPRESENTING DATA ELEMENTS 12.3. REPRESENTING BLOCK AND RECORD ADDRESSES 585 Disk Memory If we try to follow a pointer P from a block, and we find that pointer P is still unswizzled, i.e., in the form of a database pointer, then we need to niake . . Read into memory sure the block B containing the item that P points to is in memory (or else why are we following that pointer?). We consult the translation table to see if s Swizzle database address P currently has a memory equivalent. If not, we copy block B into a memory buffer. Once B is in memory, we can "swizzle" P by replacing its database form by the equivalent memory form. Block 1 Swizzling o n Demand Another approach is to leave all pointers unswizzled when the block is first brought into memory. We enter its address, and the addresses of its pointers, into the translation table, along with their memory equivalents. If and when II Unswizzled u we follow a pointer P that is inside some block of memory, we swizzle it, using the same strategy that we followed when we found an unswizzled pointer using Block 2 automatic swizzling. The difference between on-demand and automatic swizzling is that the latter Figure 12.9: Structure of a pointer when swizzling is used tries to get all the pointers swizzled quickly and efficiently when the block is loaded into memory. The possible time saved by swizzling all of a block's Automatic Swizzling pointers a t one time must be weighed against the possibility that some swizzled pointers will never be followed. In that case, any time spent swizzling and As soon as a block is brought into memory, we locate d l its pointers and unswizzling the pointer will be wasted. addresses and enter them into the translation table if they are not already An interesting option is to arrange that database pointers look like invalid there. These pointers include both the pointers from records in the block to memory addresses. If so, then we can allow the computer to follow any pointer elseivhere and the addresses of the block itself and/or its records. if tliese are as if it were in its memory form. I the pointer happens to be unswizzled, then f addressable items. We need some mechanism to locate the pointers within the the memory reference will cause a hardware trap. If the DBMS provides a block. For example: function that is invoked by the trap, and this function "swizzles" the pointer in the manner described above, then we can follow swizzled pointers in single 1. If the block holds records with a known schema, the schema will tell us instructions, and only need to do something more time consuming when the where in the records the pointers are found. pointer is unswizzled. 2. If the block is used for one of the index structures we shall discuss in N o Swizzling Chapter 13. then the block will hold pointers at known locations. Of course it is possible newr to swizzle pointers. We still need the translation 3. We may keep within the block header a list of where the pointers are. table, so the pointers may be followed in their unswizzled form. This approach does offer the advantage that records cannot be pinned in memory, as discussed When we enter into the translation table the addresses for the block just in Section 12.3.5, and decisions about which form of pointer is present need not moved into memory. and/or its records, we know where in memory the block be made. has been buffered. We ma?; thus create the translation-table entry for tliese database addresses straightfor~vardly. When I\-e inscrt one of these database P r o g r a m m e r Control of Swizzling addresses -4 into the translatio~ltable, we may find it in the table already. because its block is currently in memory. In this case, we replace -4 in the block In some applications, it may be known by the application programmer whether just moved to memory by the corresponding memory address, and we set the the pointers in a block are likely to be follo~ved.This programmer may be able '.swizzledT bit to true. On the other hand, if . is not yet in the translation 4 to specify explicitly that a block loaded into memory is to have its pointers table. then its block has not been copied into main memory. We therefore slvizzled, or the programmer may call for the pointers to be swizzled only as cannot swizzle this pointer and leave it in the block as a database pointer. needed. For example, if a programmer knows that a block is likely to be accessed Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
586 CHAPTER 12. REPRESENTING DATA ELEMENTS 12.3. REPRESENTING BLOCK AND RECORD ADDRESSES 587 heavily, such as the root block of a B-tree (discussed in Section 13.3), then the its main-memory buffer. The reason is that, should we follow the pointer in pointers would be swizzled. However, blocks that are loaded into memory, used B1, will lead us to the buffer, which no longer holds Bz; in effect, the pointer it once, and then likely dropped from memory: would not be swizzled. has become dangling. A block, like B2, that is referred to by a swizzled pointer from somewhere else is therefore pinned. When we write a block back to disk, we not only need to "unswizzle" any 12.3.4 Returning Blocks to Disk pointers in that block. We also need to make sure it is not pinned. If it is When a block is moved from memory back to disk, any pointers within that pinned, we must either unpin it, or let the block remain in memory, occupying block must be "unswizzled"; that is, their memory addresses must be replaced space that could otherwise be used for some other block. To unpin a block by the corresponding database addresses. The translation table can be used that is pinned because of swizzled pointers from outside, we xllust "unswizzle" to associate addresses of the two types in either direction, so in principle it is any pointers to it. Consequently, the translation table must record, for each possible to find, given a memory address, the database address to which the database address whose data item is in memory, the places in memory where memory address is assigned. swizzled pointers to that item exist. TWO possible approaches are: However, we do not want each unswizzling operation to require a search of the entire translation table. While we have not discussed the implementation of 1. Keep the list of references to a memory address as a linked list attached this table, we might imagine that the table of Fig. 12.8 has appropriate indexes. to the entry for that address in the translation table. If we think of the translation table as a relation, then the problem of finding 2. If memory addresses are significantly shorter than database addresses, we the memory address associated with a database address x can be expressed as can create the linked list in the space used for the pointers themselves. the query: That is, each space used for a database pointer is replaced by SELECT memAddr (a) The swizzled pointer, and FROM TranslationTable WHERE dbAddr = x; (b) Another pointer that forms part of a linked list of all occurrences of this pointer. For instance, a hash table using the database address as the key might be appropriate for an index on the dbAddr attribute; Chapter 13 suggests many Figure 12.10 suggests how all the occurrences of a memory pointer y could be linked, starting at the entry in the translation table for database possible data structures. address x and its corresponding memory address y. If we want to support the reverse query, SELECT dbAddr FROM TranslationTable WHERE memAddr = y; need to have an index on attribute memAddr as well. Again, Chapter 13 then ~c-e suggests data structures suitable for such an index. Also, Section 12.3.5 talks about linked-list structures that in some circumstances can be used to go from I Swizzled pointer a memory address to all main-memory pointers to that address. 12.3.5 Pinned Records and Blocks Translation table A block in memory is said to be pinned if it cannot at the moment be written Figure 12.10: .A linked list of occurrences of a swizzled pointer back to disk safely. A bit telling whether or not a block is pinned can be located in the header of the block. There are many reasons why a block could be pinned, including requirements of a recovery system as discussed in Chapter 17. Pointer 12.3.6 Exercises for Section 12.3 swizzling introduces an important reason why certain blocks must be pinned. If a block B1has within it a swizzled pointer to some data item in block Bg, * Exercise 12.3.1 : If we represent physical addresses for the Megatron 747 disk then n-e must be very careful about moving block B2 back to disk and reusing by allocating a separate byte or bytes to each of the cylinder, track within Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
588 CHAPTER 12. REPRESENTING DATA ELE114E1VTS a cylinder, and block within a track, how many bytes do we need? Make a reasonable assumption about the maximum number of blocks on each track; I 12.4. VARI-4BLGLEArGTH DATA AAD RECORDS pointers to it. For specificity, assume the deletion on any day always occurs 589 before the insertions. If the block is initially empty, after how many days will there be no room to insert any more records? recall that the Megatron 747 has a variable number of sectorsltrack. Exercise 12.3.2: Repeat Exercise 12.3.1 for the Megatron 777 disk described ! Exercise 12.3.8: Repeat Exercise 12.3.7 on the assumption that each day in Exercise 11.3.1 there is one deletion and 1.1 insertions on the average. Exercise 12.3.3: I£ we wish to represent record addresses as well as block Exercise 12.3.9: Repeat Exercise 12.3.7 on the assumption that instead of addresses, we need additional bytes. Assuming we want addresses for a single deleting records, they are mored to another block and must be given an 8-byte Megatron 747 disk as in Exercise 12.3.1, how many bytes would we need for forwarding address in their offset-table entry. Assume either: record addresses if we: ! a) All offset-table entries are given the maximum number of bytes needed in * a) Included the number of the byte within a block as part of the physical an entry. address. !! b) Offset-table entries are allowed to vary in length in such a way that all b) Used structured addresses for records. Assume that the stored records entries can be found and interpreted properly. have a 4-byte integer as a key. * Exercise 12.3.10: Suppose that if we swizzle all pointers automatically, we Exercise 12.3.4: Today, IP addresses have four bytes. Suppose that block can perform the swizzling in half the time it would take to swizzle each one addresses for a world-wide address system consist of an IP address for the host, separately. If the probability that a pointer in main memory xvill be followed at a device number between 1 and 1000, and a block address on an individual least once is p, for what values of p is it more efficient to swizzle automatically device (assumed to be a Megatron 747 disk). How many bytes would block than on demand? . addresses require? ! Exercise 12.3.11 : Generalize Exercise 12.3.10 to include the possibility that Exercise 12.3.5 : In I P version 6, I P addresses are 16 bytes long. In addition, we never swizzle pointers. Suppose that the important actions take the following we may want to address not only blocks, but records, which may start at any times, in some arbitrary time units: byte of a block. However, devices will have their own IP address, so there will be no need to represent a device within a host, as we suggested was necessary i. On-demand swizzling of a pointer: 30. in Exercise 12.3.4. How many bytes would be needed to represent addresses in these circumstances, again assuming devices were Xegatron 747 disks? ii. dutomatic swizzling of pointers: 20 per pointer. ! Exercise 12.3.6: Suppose we wish to represent the addresses of blocks on a iii. Following a sn-izzled pointer: 1. Megatron 747 disk logically, i.e., using identifiers of k bytes for some k. We also need to store on the disk itself a map table, as in Fig. 12.6, consisting of pairs iv. Following an unswizzled pointer: 10. of logical and physical addresses. The blocks used for the map table itself are not part of the database, and therefore do not have their own logical addresses Suppose that in-memory pointers are either not follorved (probability 1 - p) in the map table. Assuming that physical addresses use the minimum possible or are follon-ed k times (probability p). For what values of k and p do no- number of bytes for physical addresses (as calculated in Exercise 12.3.1), and srvizzling, automatic-swizzling, and on-demand-sn-izzling each offer the best logical addresses likewise use the minimum possible number of bytes for logical average performance? addresses, how many blocks of 4096 bytes does the map table for the disk occupy? 12.4 Variable-Length Data and Records *! Exercise 12.3.7: Suppose that we have 4096-byte blocks in which we store v records of 100 bytes. The block header consists of an offset table, as in Fig. 12.7. Until now, we have made the simplifying assumptions that every data item has using 2-byte pointers to records within the block. On an average day. two a fised length, that records have a fixed schema, and that the schema is a list of records per block are inserted, and one record is deleted. h deleted record must fixed-length fields. Howerer, in practice, life is rarely so simple. We may wish have its pointer replaced by a "tombstone," because there may be da~lgling to represent: Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
590 CHAPTER 12. REPRESENTING DATA ELEMENTS 12.4. T'I1RIABLELENGTH DATA AND RECORDS 591 1. Data items whose size varies. For instance, in Fig. 12.1 we considered a like. We shall always put the name before the address. Thus, no pointer to Moviestar relation that had an address field of up to 255 bytes. While the beginning of the name is needed; that field will always begin right after the there might be some addresses that long, the vast majority of them will fixed-length portion of the record. 0 probably be 50 bytes or less. We could probably save more than half the space used for storing MovieStar tuples if we used only as much space as other header information the actual address needed. record length 2. Repeating fields. If we try to represent a many-many relationship in a to address record representing an object, we shall have to store references to as many I I l l objects as are related to the given object. . . . . . . . . , . . . , . . ibirthdate j . . name i address 3. Variable-format records. Sometimes we do not know in advance what the . . . . fields of a record will be, or how many occurrences of each field there will be. For example, some movie stars also direct movies, and we might Figure 12.11: A MovieStar record with name and address implemented as want to add fields to their record referring to the movies they directed. variable-length character strings Likewise, some stars produce movies or participate in other ways, and we might wish to put this information into their record as well. However, since most stars are neither producers nor directors, we would not want to reserve space for this information in every star's record. 12.4.2 Records With Repeating Fields 4. Enormous fields. Modern DBMS's support attributes whose value is a A similar situation occurs if a record contains a variable number of occurrences of a field F, but the field itself is of fixed length. It is sufficient to group all very large data item. For instance, we might want to include a p i c t u r e attribute with a movie-star record that is a GIF image of the star. -1 occurrences of field F together and put in the record header a pointer to the first. We can locate all the occurrences of the field F as follows. Let the number movie record might have a field that is a 2-gigabyte MPEG encoding of the movie itself, as well as more mundane fields such as the title of the of bytes del-oted to one instance of field F be L. We then add to the offset for movie. These fields are so large, that our intuition that records fit within the field F all integer multiples of L, starting a t 0, then L, 2L, 3L, and so on. Eventually, we reach the offset of the field following F. whereupon we stop. blocks is contradicted. other header information 12.4.1 Records With Variable-Length Fields record length to address I - If one or more fields of a record have variable length, then the record must , to movie pointers contain enough information to let us find any field of the record. A simple 1 I . ' . ' . .. . . .. . . . , but effective scheme is to put all fixed-length fields ahead of the variable-length t '. t i. t . . . . . . . . . . . . . . , . . . . . . :, name i address i. i. i . i . i i i . . . . ;, fields. We then place in the record header: . . . ,. .. . . , . . . , . . . , . L A 1. The length of the record. 2. Pointers to (i.e., offsets of) the beginnings of all the variable-length fields. pointers to movies However, if the variable-length fields always appear in the same order. then the first of them needs no pointer; we know it immediately follo~vs Figure 12.12: -1record with a repeating group of references to movies the fiscd-length fields. Example 12.8: Suppose that w-e have movie-star records with name, address: Example 12.9 : Suppose that we redesign our movie-star records to hold only gender, and birthdate. \Ve shall assume that the gender and birthdate are the name and address (which are variable-length strings) and pointers to all fixed-length fields, taking 4 and 12 bytes, respectively. However, both name the movies of the star. Figure 12.12 shows how this type of record could be and address will be represented by character strings of xhatever length is ap- represented. The header contains pointers to the beginning of the address fieid propriate. Figure 12.11 suggests what a typical movie-star record would look (we assume the name field always begins right after the header) and to the Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
. 592 CHAPTER 12. REPRESENTING DATA ELEMENTS 12.4. K4RIABLELENGTH DATA AND RECORDS record header information Representing Null Values I I to name length of name Tuples often have fields that may be NULL. The record format of Fig. 12.11 to address offersa convenient way to represent NULL values. I a field such as address f length of address is null, then we put a null pointer in the place where the pointer to an to movie references address goes. Then, we need no space for an address, except the place for the pointer. This arrangement can save space on average, even if address Record is a fixed-length field but frequently has the value NULL. first of the movie pointers. The length of the record tells us how many movie . . . . . . . . .. .. .. pointers there are. . . . . . . . . . . .. address name . . . . . . . . . . . .. .. . An alternative representation is to keep the record of fixed length, and put the variabklength portion - be it fields of variable length or fields that sepeat an indefinite number of times - on a separate block. In the record itself we Additional space keep: Figure 12.13: Storing variable-length fields separately from the record 1. Pointers to the place where each repeating field begins, and 2. Either how many repetitions there are, or where the repetitions end. 12.4.3 Variable-Format Records Figure 12.13 shows the layout of a record for the problem of Example 12.9, but with the variable-length fields name and address, and the repeating field An even more complex situation occurs when records do not have a fixed starredrn (a set of movie references) kept on a separate block or blocks. schema. That is, the fields or their order are not completely determined by There are advantages and disadvantages to using indirection for the variable- the relation or class whose tuple or object the record represents. The simplest length components of a record: representation of sariable-format records is a sequence of tagged fields,each of which consists of: Keeping the record itself fixed-length allows records to be searched more efficiently, minimizes the overhead in block headers, and allows records to 1. Information about the role of this field, such as: be moved within or among blocks with minimum effort. (a) The attribute or field name, On the other hand, storing variable-length components on another block increases the number of disk I/07s needed to examine all components of (b) The type of the field, if it is not apparent from the field name and a record. some readily available schema information, and (c) The length of the field, if it is not apparent from the type. A compromise strategy is to keep in the fixed-length portion of the record enough space for: 2. The value of the field. 1. Some reasonable number of occurrences of the repeating fields, There are at least tn-o reasons why tagged fields would make sense. 2. A pointer to a place where additional occurrences could be found, and 1. Information-integrationapplicattons. Sometimes, a relation has been con- 3. X count of how many additional occurrences there are. structed from several earlier sources, and these sources hare different kinds If there are fewer than this number, some of the space would be unused. If there of information; see Section 20.1 for a discussion. For instance, our niovie- are more than can fit in the fixed-length portion, then the pointer to additional star information may h a ~ come from several sources, one of which records e space will be nonnull, and we can find the additional occurrences by following birthdates and the others do not, some gire addresses, others not, and so this pointer. on. If there are not too many fields, 1%-e probably best off leaving NULL are Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
594 CHAPTER 12. REPRESENTING DATA ELEMENTS 12.4. VARIABLELENGTH D.4TA AND RECORDS 595 those values we do not know. However, if there are many sources, with For both these reasons, it is sometimes desirable to allow records to be split many different kinds of information, then there may be too many NULL'S, across two or more blocks. The portion of a record that appears in one block is and we can save significant space by tagging and listing only the nonnull called a record fragment. A record with two or more fragments is called spanned, fields. and records that do not cross a block boundary are unspanned. If records can be spanned, then every record and record fragment requires 2. Records with a very flexible schema. If many fields of a record can repeat some extra header information: and/or not appear a t all, then even if we know the schema, tagged fields may be useful. For instance, medical records may contain information 1. Each record or fragment header must contain a bit telling whether or not about many tests, but there are thousands of possible tests, and each it is a fragment. patient has results for relatively few of them. 2. If it is a fragment, then it needs bits telling whether it is the first or last Example 12.10 : Suppose some movie stars have information such as movies fragment for its record. directed, former spouses, restaurants owned, and a number of other fixed but unusual pieces of information. In Fig. 12.14 we see the beginning of a hypothet- ical movie-star record using tagged fields. We suppose that single-byte codes 3. If there is a next and/or previous fragment for the same record, then the are used for the various possible field names and types. Appropriate codes are fragment needs pointers to these ot,her fragments. indicated on the figure, along with lengths for the two fields shown, both of which happen to be of type string. Example 12.11: Figure 12.15 suggests how records that were about GO% of a block in size could be stored with three records for every two blocks. The header for record fragment 2a contains an indicator that it is a fragment, an indicator I code for name 1 code for restaurant owned that it is the first fragment for its record, and a pointer to nest fragment, 2b. code for string type code for string type Similarly, the header for 2b indicates it is the last fragment for its record and 1length 7 length . holds a back-pointer to the previous fragment 2a. . . . . . , , . . . . . , . . . . . . N;. s j. 14; Clint ~astwood . S;16; Hog's Breath 1% . . R! . . . . . , . . . , .. . . . . block header Figure 12.14: A record with tagged fields : recor Ii 2-bd ;i record 3 12.4.4 Records That Do Not Fit in a Block We shall now address another problem whose importance has been increasing t as DBMS's are more frequently used to manage datatypes with large values: block 1 block 2 often values do not fit in one block. Typical examples are video or audio "clips." Often, these large values have a vaiiable length, but even if the length is fixed Figure 12.15: Storing spanned records across blocks for all values of the type, we need to use some special techniques to represent these values. In this section we shall consider a technique called '.spanned records" that can be used to manage records that are larger than blocks. The management of extremely large values (megabytes or gigabytes) is addressed in 12.4.5 BLOBS Section 12.4.5. Spanned records also are useful in situations where records are smaller than Xow, let us consider the representation of truly large values for records or fields blocks, but packing whole records into blocks wastes significant amounts of of records. The common esamples include images in ~arious formats (e.g., GIF, space. For instance, the waste space in Example 12.6 was only 7%, but if or JPEG), movies in formats such as IIPEG, or signals of all sorts: audio, radar, records are just slightly larger than half a block, the wasted space can approach and so on. Such values are often called binary, large objects, or BLOBS. When 50%. The reason is that then we can pack only one record per block. a field has a BLOB as value, we must rethink at least two issues. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 12. REPRESENTIArG DATA ELEMENTS 12.4. VARIABLE-LENGTH DAT4 -4iVD RECORDS 597 Storage of BLOBS cholesterol test requires 16 bytes for a date and an integer result of the test. Show the layout of patient records if: A BLOB must be stored on a sequence of blocks. Often we prefer that these blocks are allocated consecutively on a cylinder or cylinders of the disk, so the a) The repeating tests are kept with the record itself. BLOB may be retrieved efficiently. However, it is also possible to store the BLOB on a linked list of blocks. b) The tests are stored on a separate block, with pointers to them in the lloreo\rer, it is possible that the BLOB needs to be retrieved so quickly record. (e.g., a movie that must be played in real time), that storing it on one disk does not allow us to retrieve it fast enough. Then, it is necessary to stripe the Exercise 12.4.4 : Starting with the patient records of Exercise 12.4.1, suppose BLOB across several disks, that is, to alternate blocks of the BLOB among we add fields for tests and their results. Each test consists of a test name, a these disks. Thus, several blocks of the BLOB can be retrieved simultaneously. date, and a test result. Assume that each such test requires 40 bytes. Also, increasing the retrieval rate by a factor approximately equal to the number of suppose that for each patient and each test a result is stored with probability disks involved in the striping. P. Retrieval of BLOBS a) Assuming pointers and integers each require 4 bytes, what is the average Our assumption that when a client wants a record, the block containing the number of bytes devoted to test results in a patient record, assuming that record is passed from the database server to the client in its entirety may not all test results are kept within the record itself, as a variable-length field? hold. We may want to pass only the "small" fields of the record, and allow the client to request blocks of the BLOB one a t a time, independently of the rest of b) Repeat (a), if test results are represented by pointers within the record the record. For instance, if the BLOB is a 2-hour movie, and the client requests to test-result fields kept elselvhere. that the movie be played, the BLOB could be shipped several blocks at a time to the client, at just the rate necessary to play the movie. ! c) Suppose we use a hybrid scheme, where room for k test results are kept In many applications, it is also important that the client be able to request within the record, and additional test results are found by following a interior portions of the BLOB without having to receive the entire BLOB. pointer to another block (or chain of blocks) where those results are kept. Examples would be a request to see the 45th minute of a movie, or the ending As a function of p. what value of k minimizes the amount of storage used of an audio clip. If the DBMS is to support such operations, then it requires a for test results? suitable index structure, e.g., an index by seconds on a movie BLOB. !! d) The antount of space used by the repeating test-result fields is not the only issue. Let us suppose that the figure of merit 1%-ewish to minimize 12.4.6 Exercises for Section 12.4 is the number of bytes used. plus a penalty of 10,000 if we have to store * Exercise 12.4.1 : .A patient record consists of the follolving fixed-length fields: some results on another block (and therefore will require a disk I/O for the patient's date of birth, social-security number, and patient ID, each 10 bytes many of the test-result accesses we need to do. Under this assumption, long. It also has the following variable-length fields: name, address, and patient what is the best value of k as a function of p? history. If pointers within a record require 4 bytes, and the record length is a $-byte integer, how many bytes. esclusire of the space needed for the variable- *!! Exercise 12.4.5: Suppose blocks have 1000 bytes available for the storage of length fields, are needed for the record? You may assume that no alignment of records, and 1%-ewish to store on them fixed-length records of length r , where fields is required. 500 < r 5 1000. The value of r includes the record header, but a record fragment requires an additional 16 bytes for the fragment header. For what * Exercise 12.4.2: Suppose records arc as in Exercise 12.4.1, and the variable- values of r can we improve space utilization by spanning records? length fields name. address. and history each have a length that is unifornlly distributed. For the name. the range is 10-30 bytes; for address it is 20-80 !! Exercise 12.4.6: An NPEG movie uses about one gigahyte per hour of play. bytes, and for history it is 0-1000 bytes. What is the average length of a If we carefully organized several mox-ies on a Megatron 747 disk, ho~vmany patient record? could we deliver with only small delay (say 100 milliseconds) from one disk. Exercise 12.4.3: Suppose that the patient records of Exercise 12.4.1 are aug- Use the tinling estimates of Example 11.5: but remember that )pu can choose mented by an additional repeating field that represents cholesterol tests. Each how the movies are laid out on the disk. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
598 CHAPTER 12. REPRESENTING DATA ELEMENTS 12.5. RECORD MODIFlCATlONS 599 12.5 Record Modifications However, there may be no room in the block for the new record, in which case we have to find room outside the block. There are two major approaches Insertions, deletions, and update of records often create special problems. These to solving this problem, as well as combinations of these approaches. problems are most severe when the records change their length, but they come up even when records and fields are all of fixed length. 1. Find space on a "nearby" block. For example, if block B1 has no available space.for a record that needs to be inserted in sorted order into that block, 12.5.1 Insertion then look at the following block B2 in the sorted order of the blocks. If there is room in B2, move the highest record(s) of B1 to B2, and slide the First, let us consider insertion of new records into a relation (or equivalently, records around on both blocks. However, if there are external pointers to into the current extent of a class). If the records of a relation are kept in records, then we have to be careful to leave a forwarding address in the no particular order, we can just find a block with some empty space, or get offset table of B1 to say that a certain record has been moved to Bz and a new block if there is none, and put the record there. Usually, there is some where its entry in the offset table of B2 is. Allowing forwarding addresses mechanism for finding all the blocks holding tuples of a given relation or objects typically increases the amount of space needed for entries of the offset of a class, but we shall defer the question of how to keep track of these blocks table. until. Section 13.1. There is more of a problem when the tuples must be kept in some fixed 2. Create a n overflow block. In this scheme, each block B has in its header order, such a s sorted by their primary key. There is good reason to keep records a place for a pointer to an overflow block where additional records that sorted, since it facilitates answering certain kinds of queries, as we shall see in theoretically belong in B can be placed. The overflow block for B can Section 13.1. If we need to insert a new record, we first locate the appropriate point to a second overflow block, and so on. Figure 12.17 suggests the block for that record. Fortuitously, there may be space in the block to put the structure. We show the pointer for overflow blocks as a nub on the block, new record. Since records must be kept in order, we may have to slide records although it is in fact part of the block header. around in the block to make space available at the proper point. If we need to slide records, then the block organization that me showed in Fig. 12.7, which we reproduce here as Fig. 12.16, is useful. Recall from our discussion in Section 12.3.2 that we may create an "offset table" in the header of each block, with pointers to the location of each record in the block. A pointer to a record from outside the block is a "structured address," that is, Block B overflow block the block address and the location of the entry for the record in the offset table. for B +-- - offset table-) header --tf unused - Figure 12.17: A block and its first overflow block - 12.5.2 Deletion record record 4 record 3 record 1 When we delete a record, we may be able to reclaim its space. If we use an offset table as in Fig. 12.16 and records can slide around the block. then we 4 C 4 can compact the space in the block so there is aln-ays one unused region in the center. as suggested by that figure. If we cannot slide records, we should maintain an available-space list in the Figure 12.16: An offset table lets us slide records xithin a block to ilinke room block header. Then we shall knon where. arid how large, the available regions for new records are, n-hen a new record is inserted into the block. Sote that the block header normally does not need to hold the entire available space list. It is sufficient to If we can find room for the inserted record in the block at hand, then we put the list head in the block header, and use the available regions themsell-es simply slide the records within the block and adjust the pointers in the offset to hold the links in the list. much as we did in Fig. 12.10. table. The new record is inserted into the block, and a new pointer to the When a record is deleted, we may be able to do away with an overflow block. record is added to the offset table for the block. If the record is deleted either from a block B or from any block on its overflow Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
600 CHAPTER 12. REPRESENTING D.4TA ELEMENTS 12.5. RECORD MODIFIC.~TIOIYS chain, we can consider the total amount of used space on all the blocks of that 12.5.3 Update chain. If the records can fit on fewer blocks, and we can safely move records When a fixed-length record is updated, there is no effect on the storage system, among blocks of the chain, then a reorganization of the entire chain can be because we know it can occupy exactly the same space it did before the update. performed. However, when a variable-length record is updated, we have all the problems However, there is one additional complication involved in deletion, which we associated with both insertion and deletion, except that it is never necessary to must remember regardless of what scheme we use for reorganizing blocks. There create a tombstone for the old version of the record. may be pointers to the deleted record, and if so, we don't want these pointers If the updated record is longer than the old version, then we map need to dangle or wind up pointing to a new record that is put in the place of the to create more space on its block. This process may involve sliding records deleted record. The usual technique, which we pointed out in Section 12.3.2, is or even the creation of an overflow block. If variable-length portions of the to place a tombstone in place of the record. This tombstone is permanent; it record are stored on another block, as in Fig. 12.13, then we may need to move must exist until the entire database is reconstructed. elements around that block or create a new block for storing variable-length Where the tombstone is placed depends on the nature of record pointers. fields. Conversely, if the record shrinks because of the update, me have the If pointers go to fixed locations from which the location of the record is found, same opportunities as with a deletion to recover or consolidate space, or to then we put the tombstone in that fixed location. Here are two examples: eliminate overflow blocks. 1. We suggested in Section 12.3.2 that if the offset-table scheme of Fig. 12.16 12.5.4 Exercises for Section 12.5 were used, then the tombstone could be a null pointer in the offset table, Exercise 12.5.1 : Suppose we have blocks of records sorted by their sort key since pointers to the record were really pointers to the offset table entries. field and partitioned among blocks in order. Each block has a range of sort keys that is known from outside (the sparse-index structure in Section 13.1.3 is 2. If we are using a map table, as in Fig. 12.6, to translate logical record an example of this situation). There are no pointers to records from outside, so addresses to physical addresses, then the tombstone can be a null pointer it is possible to move records between blocks if \ye wish. Here are some of the in place of the physical address. ways we could manage insertions and deletions. If we need to replace records by tombstones, it would be wise to have at the i. Split blocks whenever there is an overflow. Adjust the range of sort keys very beginning of the record header a bit that serves as a tombstone; i.e., it is for a block when we do. 0 if the record is not deleted, while 1 means that the record has been deleted. ii. Keep the range of sort keys for a block fixed: and use overflow blocks as Then, only this bit must remain where the record used to begin, and subsequent needed. Keep for each block and each overflow block an offset table for bytes can be reused for another record, as suggested by Fig. 12.18.~\$'hen we the records in that block alone. follow a pointer to the deleted record, the first thing we see is the "tombstone" bit telling us that the record was deleted. We then know not to look at the iii. Same as (ii), but keep the offset table for the block and all its overflow following bytes. blocks in the first block (or overflow blocks if the offset table needs the space). Note that if more space for the offset table is needed. n-e can move records from the first block to an overflow block to make room. t 1 iv. Same as (ii), but keep the sort key along. n-ith a pointer in the offset tables. i record 2 2:. Same as (iii); but keep the sort key along with a pointer in the offset table. Figure 12.18: Record 1 can be replaced, but the tombstone remains: record 2 has no tombstone and can be seen when we follow a pointer to it -1nslver the following questions: * a) Compare methods (i) and (ii) for the average numbers of disk 110's 3~o~ve\.er, field-alignment problem discussed in Section 12.2.1 may force us to leave the needed to retrieve the record, once the block (or first block in a chain four bytes or more unused. with overflow blocks) that could have a record 1~-ith given sort key is a Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 12. REPRESEXTING DATA ELEhIEiVTS 12.7. REFERENCES FOR CHAPTER 12 603 found. Are there any disadvantages to the method with the fewer average + BLOBS: Very large values, such as images and videos, are called BLOBS disk I/O's? (binary, large objects). These values must be stored across many blocks. Depending on the requirements for access, it may be desirable to keep the b) Compare methods (ii) and (iib) for their average numbers of disk 110's per BLOB on one cylinder, to reduce the access time for the BLOB, or it may record retrival, as a function of b, the total number of blocks in the chain. be necessary to stripe the BLOB across several disks, to allow parallel Assume that the offset table takes 10%of the space, and the records take retrieval of its content.% the remaining 90%. + Offset Tables: To support insertions and deletions of records, as well as ! c) Include methods (iv) and (v) in the comparison from part (b). Assume records that change their length due to modification of varying-length that the sort key is 119 of the record. Note that we do not have to repeat fields, we can put in the block header an offset table that has pointers to the sort key in the record if it is in the offset table. Thus, in effect, the each of the records in the block. offset table uses 20% of the space and the remainders of the records use 80% of the space. + Overflow Blocks: Also to support insertions and growing records, a block may have a link to an overflow block or chain of blocks, wherein are kept Exercise 12.5.2 : Relational database systems have always preferred to use some records that logically belong in the first block. fixed-length tuples if possible. Give three reasons for this preference. + Database Addresses: Data managed by a DBMS is found among several storage devices, typically disks. To locate blocks and records in this stor- l2.6 Summary of Chapter 12 age system, we can use physical addresses, which are a description of the device number, cylinder, track, sector(s), and possibly byte within a + Fields: Fields are the most primitive data elements. Many, such as in- sector. We can also use logical addresses, which are arbitrary character tegers or fixed-length character strings, are simply given an appropriate strings that are translated into physical addresses by a map table. number of bytes in secondary storage. Variable-length character strings are stored either in a fixed sequence of bytes containing an endmarker, + Structured Addresses: We may also locate records by using part of the physical address, e.g., the location of the block whereon a record is found, or in an area for varying strings, with a length indicated by an integer at plus additional information such as a key for the record or a position in the beginning or an endmarker at the end. the offset table of a block that locates the record. + Records: Records are composed of several fieldsplus a record header. The + Pointer Swizzling: When disk blocks are brought to main memory, the header contains information about the record, possibly including such database addresses need to be translated to memory addresses, if pointers matters as a timestamp, schema information, and a record length. are to be followed. The translation is called swizzling, and can either be + Variable-Length Records: If records contain one or more variable-length done automatically, when blocks are brought to memory, or on-demand, fields or contain an unknown number of repetitions of a field, then addi- when a pointer is first followed. tional structure is necessary. A directory of pointers in the record header + Tombstones: When a record is deleted, pointers to it will dangle. A can be used to locate variable-length fields within the record. Alterna- tombstone in place of (part of) the deleted record warns the system that tively, we can replace the variable-length or repeating fields by (fised- the record is no longer there. length) pointers to a place outside the record where the field's value is kept. + Pinned Blocks: For various reasons, including the fact that a block may contain swizzled pointers, it may be unacceptable to copy a block from + Blocks: Records are generally stored within blocks. A block header. with memory back to its place on disk. Such a block is said to be pinned. If the information about that block. consumes some of the space in the block. pinning is due to slvizzled pointers. then they must be unswizzled before I\-ith the remainder occupied by one or more records. returning the block to disk. + Spanned Records: Generally, a record exists within one block. However, if records are longer than blocks, or we wish to make use of left,overspace 12.7 References for Chapter 12 nithin blocks, then we can break records into two or more fragments, one on each block. .! fragment header is then needed to link the fragments of - The classic 1968 text on the subject of data structures [2] has been updated a record. recently. [ I has information on structures relevant to this chapter and also .] Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 12. REPRESENTING DATA ELEMENTS Chapter 13. Tombstoner as a technique for dealing with deletion is from [3]. [I] covers data reoresentation issues, such as addresses and swizzling in the context of -. object-oriented DBMS's. 1. . . G. Cattell, Object Data Management, Addison-Wesley, Reading 1994. ?VIA, 2. D. E. Knuth, The Art of Computer Programming, Vol. I, Fundamental Algorithms, Third Edition, Addison-Wesley, Reading M.4, 1997. 3. D. Lomet, "Scheme for invalidating free references," IBM J. Research and Index Structures Development 19:l (1975), pp. 26-35. 4. G. Wiederhold, File Organization for Database Design, McGraw-Hill, New York, 1987. Having seen the options available for representing records, we must now consider how whole relations, or the extents of classes, are represented. It is not sufficient simply to scatter the records that represent tuples of the relation or objects of the extent aniong various blocks. To see mhy, ask how Ive would answer even the simplest query, such as SELECT * FROM R. ifre would have to examine every block in the storage system and hope there is enough information in block headers to identify where in the block records begin and enough information in record headers to tell in what relation the record belongs. A slightly better organization is to reserve some blocks, perhaps several xvhole cylinders, for a given relation. All blocks in those cylinders may be assumed to hold records that represent tuples of our relation. Now; at least we can find the tuples of the relation without scanning the entire data store. However. this organization offers no help should we want to answer the next-simplest query, such as SELECT * FROM R WHERE a=10. Section 6.6.6 in- troduced us to the importance of creating indexes on a relation, in order to speed up the discovery of those tuples of a relation that have a particular value for a particular attribute. As suggested in Fig. 13.1. an index is any data struc- ture that takes as input a property of records - typically the value of one or more fields - and finds the records with that property "quickly." In particu- lar, an index lets us find a record without having to look at more than a small fraction of all possible records. The field(s) on whose values the index is based is called the search key. or just "key" if the index is understood. Many different data structures can serve as indexes. In the remainder of this chapter n.e consider the follo~\-ingmethods: 1. Simple indexes on sorted files. 2. Secondary indexes on unsorted files. 3. B-trees, a commonly used way to build indexes on any file. 4. Hash tables, another useful and important index structure. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 13. INDEX STRUCTURES 13.1. 11vDEXES ON SEQ UENTML FILES value -+ Index - - - ) records - records Figure 13.1: An index takes a value for some field(s) and finds records with the matching value Keys and More Keys There are many meanings of the term "key." We used it in Section 7.1.1 to mean the primary key of a relation. In Section 11.4.4 we learned about Figure 13.2: -4sequential file .'sort keys," the attribute(s) on which a file of records is sorted. Now, we shall speak of "search keys," the attribute(s) for which we are given values and asked to search, through an index, for tuples with matching In this file, the tuples are sorted by their primary key. IVe imagine that keys \ralues. We try to use the appropriate adjective - "primary," "sort," or are integers; n-e show only the key field, and we make the atjpical assumption "search" - when the meaning of "key" is unclear. However, notice in that there is room for only two records in one block. For instance, the first sections such as 13.1.2 and 13.1.3 that there are many times when the block of the file holds the records with keys 10 and 20. In this and many other three kinds of keys arc one and the same. examples, we use integers that are sequential multiples of 10 as keys, although there is surely no requirement that keys be multiples of 10 or that records with all n~ultiples 10 appear. of 13.1 Indexes on Sequential Files 13.1.2 Dense Indexes We begin our study of index structures by considering what is probably the Sow that Re have our records sorted, we can build on them a dense mda, simplest structure: A sorted file, called the data file,is given another file, called which is a sequence of blocks holding only the keys of the records and pointers the rndm file. consisting of key-pointer pairs. A search key K in the index file to the records themselves; the pointers are addresses in the sense discussed in is associated with a pointer to a data-file record that has search key K. These Section 12.3. The index is called "dense" because every key from the data file indexes can be "dense," meaning there is an entry in the index file for every is represented in the index. In comparison, "sparse" indexes, to be discussed in record of the data file, or "sparse," meaning that only some of the data records Section 13.1.3. normally keep only one key per data block in the index. are represented in the index, often one index entry per block of the data file. The index blocks of the dense indes maintain these keys in the same sorted order as in the file itself. Since keys and pointers presumably take much less 13.1.1 Sequential Files space than complete records. we expect to use many fewer blocks for the index than for the file itself. The index is especially advantageous when it. but r~ot One of the silllplest index types relies on the file being sorted 011the attribute(s) the data file. can fit in main memory. Then, by using the index, we can find of the index. Such a file is called a sequenteal file. This structure is especially any record given its search key, with only one disk 1/0 per lookup. useful when the search key is the primary key of the relation, although it can be used for other attributes. Figure 13.2 suggests a relation represented as a Example 13.1 : Figure 13.3 suggests a dense index on a sorted file that begins sequential file. as Fig. 13.2. For convenience, we have assumed that the file continues with a Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 13. INDEX STRUCTURES 13.1. INDEXES ON SEQUENTI-4L FILES key every 10 integers, although in practice we would not expect to find such a regular pattern of keys. We have also assumed that index blocks can hold only Locating Index Blocks four key-pointer pairs. Again, in practice we would find typically that there We have assumed that some mechanism exists for locating the index [yere many more pairs per block, perhaps hundreds. blocks, from which the individual tuples (if the index is dense) or blocks of the data file (if the index is sparse) can be found. Many ways of locating the index can be used. For example, if the index is small, we may store it in reserved locations of memory or disk. If the index is larger, we can build another layer of index on top of it as \ire discuss in Section 13.1.4 and keep that in fixed locations. The ultimate extension of this idea is the B-tree of Section 13.3, where a-e need to know the location of only a single root block. 1 Example 13.2 : Imagine a relation of 1,000,000 tuples that fit ten to a 4096- byte block. The total space required by the data is over 400 megabytes, proba- bly too much to keep in main memory. However, suppose that the key field is 30 bytes, and pointers are 8 bytes. Then with a reasonable amount of block-header space we can keep 100 key-pointer pairs in a 4096-byte block. A dense index therefore requires 10,000 blocks, or 40 megabytes. We might be able to allocate main-memory buffers for these blocks, depending on what else we needed in main memory, and how much main memory there was. Fur- 1 ther. log2(10000) is about 13, so we only need to access 13 or 14 blocks in a Index file Data file binary search for a key. And since all binary searches 15-ould start out accessing only a small subset of the blocks (the block in the middle: those at the 114 and Figure 13.3: A dense index (left) on a sequential data file (right) 314 points, those at 118, 318; 518, and 718, and so on), even if u-e could not afford to keep the tvhole index in memory, we might be able to keep the most The first index block contains pointers to the first four records, the second important blocks in main memory, thus retrieving the record for any key with block has pointers to the next four, and so on. For reasons that we shall significantly fewer than 14 disk I/O's. discuss in Section 13.1.6, in practice we may not want to fill all the index blocks completely. The dense index supports queries that ask for records with a given search 13.1.3 Sparse Indexes key value. Given key value K , we search the index blocks for K , and when we If a dense index is too large, tve can use a similar structure, called a sparse index, find it, we follow the associated pointer to the record with key K . It might that uses less space at the expense of somewhat more time to find a record given appear that we need to examine every block of the index, or half the blocks of its key. - sparse index, as seen in Fig. 13.4, holds only one key-pointer per data 1 the index, on average, before we find I
C H A P T E R 13. INDEX STRUCTCrRES 13.1. INDEXES ON SEQUENTIAL FILES 611 to do many disk I/O's to get to the record we want. By putting an index on the index, we can make the use of the first level of index more efficient. Figure 13.5 extends Fig. 13.4 by adding a second indes level (as before, we assume the unusual pattern of keys every 10 integers). The same idea would let us place a third-level index on the second level, and so on. However, this idea has its limits, and we prefer the B-tree structure described in Section 13.3 over building many levels of index. Figure 13.4: -4 sparse index on a sequential file Example 13.4: A sparse index can require many fewer blocks than a dense index. Using the more realistic parameters of Example 13.2, where there are 100.000 data blocks and 100 key-pointer pairs fit on one index block, we need only 1000 index blocks if a sparse index is used. W w the index uses only four o megabytes, an amount that surely could be allocated in main memory. On the other hand, the dense index allows us to answer queries of the form Figure 13.5: Adding a second level of sparse indes "does there exist a record with key value I(?" without having to retrieve the block containing the record. The fact that K exists in the dense index is enough In this example. the first-level index is sparse. although 11-ecould have chosen to guarantee the existence of the record with key I(. On the other hand, the a dense index for the first level. Howel-er. the second and higher levels must same query, using a sparse index, requires a disk 1 / 0 to retrieve the block on be sparse. The reason is that a dense index on an index would have exactly which key I( rnight be found. as many key-pointer pairs as the first-level indcs. and therefore n-ould take exactly as much space as the first-level index. -4 second-level dense index thus To find the record with key I(, given a sparse index, we search the indes for introduces additional structure for no advantage. the largest key less than or equal to K. Since the index file is sorted by key, a modified binary search will locate this entry. We follon. the associated pointer Example 13.5: Continuing xith a study of the hypothetical relation of Ex- to a data block. Now, ~ v e must search this block for the record with key Ii. ample 13.4, suppose we put a second-lel-el index on the first-level sparse index. Of course the block must have enough format information that the records and Since the first-level index occupies 1000 blocks. and we can fit 100 key-pointer their contents can be identified. Any of the techniques from Sections 12.2 and pairs in a block. xve need 10 blocks for the second-level indes. 12.4 can be used. as appropriate. It is very likely that these 10 blocks can remain buffered in memory. If so. then to find the record with a given key I(. lve look up in the second-level index 13.1.4 Multiple Levels of Index to find the largest key less than or equal to X. The associated pointer leads to a block B of the first-level index that nil1 surely guide us to the desired record. An index itself can cover many blocks, as we saw in Exanlples 13.2 and 13.4. iVe read block B into memory if it is not already there: this read is the first Even if we use a binary search to find the desired index entry, we still may need disk I/O we need to do. ?Ve look in block B for the greatest key less than or Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
612 CHAPTER 13. INDEX STRUCTURES 13.1. I3DEXES ON SEQUEArTIAL FILES 613 equal to K, and that key gives us a data block that will contain the record with Example 13.6 : Suppose we want to find all the records with search key 20 in key I( if such a record exists. That block requires a second disk 110, and we Fig. 13.6. \ire find the 20 entry in the index and follow its pointer to the first are done, having used only two I/O's. record with search key 20. We then search forward in the data file. Since we are at the last record of the second block of this file, we move forward to the Indexes With Duplicate Search Keys third block.' We find the first record of this block has 20, but the second has 13.1.5 30. Thus, we need search no further; we have found the only two records with Until this point we have supposed that the search key, upon which the index is search key 20. 0 based, was also a key of the relation, so there could be at most one record with any key value. However, indexes are often used for nonkey attributes, so it is Figure 13.7 shows a sparse index on the same data file as Fig. 13.6. The possible that more than one record has a given key value. If we sort the records sparse index is quite conventional; it has key-pointer pairs corresponding to the by the search key, leaving records with equal search key in any order, then we first search key on each block of the data file. can adapt the previous ideas when the search key is not a key of the relation. Perhaps the simplest extension of previous ideas is to have a dense index with one entry with key K for each record of the data file that has search key K. That is, we allow duplicate search keys in the index file. Finding all the records with a given search key K is thus simple: Look for the first I< in the index file, find all the other K's, which must immediately follow, and pursue all the a5sociated pointers to find the records with search key K. A slightly more efficient approach is to have only one record in the dense index for each search key I ' This key is associated with a pointer to the first i. of the records with K. To find the others, move forward in the data file to find any additional records with K ; these must follow immediately in the sorted order of the data file. Figure 13.6 illustrates this idea. Figure 13.7: A sparse index indicating the lowest search key in each block To find the records with search key K in this data structure, we find the last entry of the index, call it E l , that has a key less than or equal to I
614 CH-APTER 13. INDEX STRUCTURES 13.1. ISDEXES OK SEQUENTIAL FLLES US to the second and third data blocks, and it is on these two blocks that we If K = -30; rule selects the third entry. Its pointer leads us to the third the find records with search key 20. data block. A-herethe records with search key 30 begin. Finally, if K = 25, For another example, if K = 10, then El is the second entry of the first then part (b) of the selection rule indicates the second index entry. We are thus index block, and Ez doesn't exist because we never find a smaller key. Thus. led to the wcond data block. If there were any records with search key 25, a t we follow the pointers in all index entries up to and including the second. That least one n-ould have to follow the records with 20 on that block, because n-e takes us to the first two data blocks, where we find all of the records with search know that rhe first new key in the third data block is 30. Since there are no key 10. 25's, we fail in our search. 13.1.6 Managing Indexes During Data Modifications Until this point, we have sho~vn data files and indexes as if they were sequences of blocks. fully packed with records of the appropriate type. Since data evolves with time. n-e expect that records will be inserted, deleted, and sometimes updated. .sa result, an organization like a sequential file will evolve so that I what once fit in one block no longer does. 'IQe can use the techniques discussed in Section 12.5 to reorganize the data file. Recall that the three big ideas from that section are: 1. Create overflow blocks if extra space is needed, or delete overflow blocks if enough records are deleted that the space is no longer needed. Overflow bloch do not have entries in a sparse index. Rather, they should be co~idered extensions of their primary block. as 2. Ins;cad of overflo~v blocks, we may be able to insert new blocks in the seqwntial order. If 1-e do, then the new block needs an entry in a sparse indtz 1 .should remember that changing an index can create the same % kirw& of problems on the index file that insertions and deletions to the d a ~ a c~eate. we create new index blocks. then these blocks must be file If Figure 13.8: A sparse index indicating the lowest new search key in each block loci.-ed someho~v. e.g.. with another level of index as in Section 13.1.1. A slightly different scheme is shown in Fig. 13.8. There, the index entry for 3. I\-1:tn there is no room to insert a tuple into a block. we can sometimes a data block holds the smallest search key that is new; i.e., it did not appear in slit; tuples to adjacent blocks. Conversely. if adjacent blocks grow too a prerious block. If there is no new search key in a block, then its index entr? em?::-. they can be combined. holds the lone search key found in that block. Under this scheme, we can find the records with search key I( by looking in the index for the first entry whose Hon-eyer. when changes occur to the data file, we nlust often cliange the key is either indes to &apt. The correct approach depends on 15-hetherthe indes is dense or sparse. z d on which of the three strategies enumerated above is used. However, a) Equal to IC; or one general principle should be remembered: b) Less than Ii, but the next key is great,er than I
616 CHAPTER 13. INDEX STRUCTURES 13.1. INDEXES ON SEQUENTIAL FILES 617 deleting empty blocks of the sequential file, inserting, deleting, and moving records. Notice that we assume only empty blocks can be created or destroyed. Preparing for Evolution of Data In~particular,if we want to delete a block that contains records, we must first delete the records or move them to another block. Since it is common for relations or class extents to grow with time, it is often ~ i s to distribute extra space among blocks - both data and index e blocks. If blocks are, say, 75% full to begin with, then we can run for .Action Dense Index Sparse Index some time before having to create overflow blocks or slide records between Create empty overflow block none none blocks. The ad\-antage to having no o~erflo~v blocks, or few overflow blocks, Delete empty overflow block none none is that the average record access then requires only one disk 1 0 The more 1. Create empty sequential block none insert overflo~v blocks, the higher will be the average number of blocks we need Delete empty sequential block none delete to look at in order to find a given record. Insert record insert update(?) Delete record delete update(?) Slide record update update(?) Example 13.9 : First, let us consider the deletion of a record from a sequential file with a dense index. We begin with the file and index of Fig. 13.3. Suppose Figure 13.9: How actions on the sequential file affect the index file that the record with key 30 is deleted. Figure 13.10 shorn-s the result of the deletion. In this table, we notice the following: Creating or destroying an empty overflow block has no effect on either type of index. It has no effect on a dense index, because that index refers to records. It has no effect on a sparse index, because it is only the primary blocks, not the overflow blocks, that have entries in the sparse index. Creating or destroying blocks of the sequential file has no effect on a dense index, again because that index refers to records, not blocks. It does affect a sparse index, since we must insert or delete an index entry for the block created or destroyed, respectively. Inserting or deleting records results in the same action on a dense indes: a key-pointer pair for that record is inserted or deleted. However, there is typically no effect on a sparse index. The exception is ~vhen record the is the first of its block, in which case the corresponding key value in the Figure 13.10: Deletion of record ivith search key 30 in a dense index sparse index must be updated. Thus, \I-e have put a question mark after "update" for these actions in the table of Fig. 13.9, indicating that the First. the record 30 is deleted from the sequential file. \Ve assume that there update is possible, but not certain. are possible pointers from outside the block to records in the block, so we have elected not to slide the remaining record, 10,forn-ard in the block. Rather, we Similarly. sliding a record, ~vhetherivithin a block or between blocks. suppose that a tombstone has been left in place of the record 30. results in an update to the corresponding entry of a dense index, but only In the indes. n-e deiete the key-pointer pair for 30. n suppose that there P affects a sparse index if the moved record \\-as or becomes the first of its cannot be pointers to index records from outside. so there is no need to leave a block. tombstone for the pair. Therefore, 11-ehave taken the option to consolidate the index block and move follo\ving records of the block forward. 0 Ke shall illustrate the family of algorithms implied by these rules in a series of examples. These examples involve both sparse and dense indexes and both Example 13.10 : Sow, let us consider two deletions from a file with a sparse "record sliding" and overflow-block approaches. index. \Ye begin with the structure of Fig. 13.1 and again suppose that the Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
618 CHAPTER 13. INDEX STRUCTURES 13.1. INDEXES ON SEQUELVTIAL FILES record with key 30 is deleted. We also assume that there is no impediment to sliding records around within blocks - either we know there are no pointers to records from anywhere, or we are using an offset table as in Fig. 12.16 to support such sliding. The effect of the deletion of record 30 is shorn in Fig. 13.11. The record has been deleted, and the following record, 40, slides forward to consolidate the block at the front. Since 40 is now the first key on the second data block, we need to update the index record for that block. We see in Fig. 13.11 that the key associated with the pointer to the second data block has been updated from 30 to 40. Figure 13.12: Deletion of record with search key 40 in a sparse index place. To fit record 20 on the second block and keep records sorted, we slide record 40 back in the second block and put 20 ahead of it. Our last step is to modify the index entries of the changed blocks. We might have to change the key in the index pair for block 1, but we do not in this case, because the inserted record is not the first in its block. \ire do, however, change the key in the index entry for the second data block. since the first record of that block, which used to be 40. is now 20. Figure 13.11: Deletion of record with search key 30 in a sparse index Example 13.12: The problem with the strategy exhibited in Example 13.11 is that we were lucky to find an empty space in an adjacent data block. Had Kow, suppose that record 40 is also deleted. ?\'e see the effect of this action in the record with key 30 not been deleted previously. 11-ewould have searched in Fig. 13.12. The second data block now has no records at all. If the sequential file vain for an empty space. In principle. we would have had to slide every record i stored on arbitrary blocks (rather than, say, consecutive blocks of a cylinder), s from 20 to the end of the file back until Ire got to the end of the file and could then we may link the unused block to a list of available space. create an additional block. We complete the deletion of record 40 by adjusting the index. Since the second data block no longer exists, we delete its entry from the index. \Ve also Because of this risk, it is often wiser to allow overflorv blocks to supplement show in Fig 13.12 the consolidation of the first index block, by moving forward the space of a primary block that has too many records. Figure 13.14 sl~o~\-s the following pairs. That step is optional. the effect of inserting a record with key 15 into the structure of Fig. 13.11. As in Example 13.11, the first data block has too many records. Instead of sliding Example 13.11: Now. let us consider the effect of an insertion. Begin at records to the second block, xse create an overflow block for the data block. We Fig. 13.11, where rve have just deleted record 30 from the file with a sparse index, have s1101rn in Fig. 13.11 a "nub" on each block. representing a place in the but the record 40 remains. We now insert a record with key 15. Consulting the block header n-here a pointer to an orerfloxv block may be placed. Any number sparse index, \re filld that this record belongs in the first data block. But that of overflow blocks may 1 e linked in a chain using these pointer spaces. 1 block is full; it holds records 10 and 20. In our example. record 1.5 is inserted in its rightful place, after record 10. One thing we can do is look for a nearby block with some extra space, and in Record 20 slides to the overflow block to make room. S o changes to the index this case we find it in the second data block. We thus slide records back~ard in are necessary, since the first record in data block 1 has not changed. Sotice that the file to make room for record 15. The result is shown in Fig. 13.13. Record no index entry is made for the overflow block, which is considered an estension 20 has been moved from the first to the second data block, and 15 put in its of data block 1, not a block of the sequential file on its elm. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.