next up previous
Next: Floating Buffer Tree Up: Efficient Batch Query Processing Previous: R-Trees

Subsections


Buffered R-Tree

The Buffer-Tree

In [4] a technique for designing EM data structures that supported batch operations in an I/O efficient manner was presented. The technique, the buffer tree, was used to develop EM version of one dimensional data structures including a segment tree, priority queue, and search tree. In this technique operations are performed on a high fan-out tree structure in a lazy manner with main-memory sized 'buffers' attached to the tree's internal nodes.

Figure 3: A buffer-tree.
Image RTREE

The buffer tree is an (a,b)-tree [15] with a = m/4 and b = m on O(n) leaves containing $ O(B)$ records. Attached to each internal node is a buffer of size O(m) . All leaves are at the same and the tree has hieght O(logm n). The buffer tree can handle insertion, deletion and query operations. Rather than execute an operation immediately, operations are performed in a lazy manner. Operations are collected at the root of the tree until B operations have been received at which time they are inserted as a block into the buffer of the root node. When the buffer of the root (or any internal node) becomes full (contains > m blocks of operations) the buffer is emptied and the operations in the buffer are flushed to the next buffer level. To empty a buffer the buffer is first sorted and operations that cancel each other (ie. an insertion and deletion with no intervening queries) are removed. Following the sorting/clean up the elements of the buffer are then routed from the current buffer node, v to the buffers at the next buffer level in the subtree of v. If during the emptying process buffers at the next level become full they too will be emptied.

Buffered R-Tree

The buffered R-Tree uses the lazy buffering techniques developed in the buffer-tree to provide efficient implementations of various bulk operations on R-Tree. The buffered R-Tree is constructed in the following manner. Take a standard R-Tree and starting from (but excluding) the leaf level augmented by adding buffers of size m/2 to all nodes at every 'th level. Nodes with associated buffers are termed buffer nodes (see Fig. 4).

Operations such as insert, delete and query on the buffered R-Tree are then performed in a lazy manner. As with the buffer tree when an operation is performed rather than execute the operation the operation is stored in a block of memory associated with the root of the tree. When B operations have been collected at the root they are stored in the root node's buffer. When the buffer associated with the root node runs full (contains more than m/4 operations), a buffer emptying process is performed on the root node's buffer. The buffer emptying process for a node, v is executed by loading into main memory the subtree rooted at v down to the next level of buffer nodes and routing all operations in the buffer of v to the appropriate buffers at the next buffer level. A buffer emptying process may result in some buffers at the next buffer level running-full, and in this case those buffers are also emptied, and so on down to the leaves of the tree. When an operation reaches a leaf the appropriate action is taken, for example reporting the intersected rectangles if the operation is a query.

For a batch of operations the buffered R-Tree can achieve significant I/O savings over one-by-one execution of the operations. The key to this improvement lies in the fact that during each buffer emptying process a large number of operations are pushed down levels in the tree. The number of buffer nodes that an operation can be routed to is bounded by nodes and the entire sub-tree from v (the buffer node being emptied) to the next buffer level fits in m/2 blocks in main memory (see Fig. 4). Thus the entire routing process can be performed in main memory with O(m) I/O's to write the M/4 operations to m/4 buffers. To route operations all the way down the O(logBn) levels in the tree uses O(1/B*logmn) I/Os.

Figure 4: The buffered R-tree shown on left with buffers at every 'th level. During the buffer emptying process the subtree (shaded) connecting the current buffer node v with the next buffer level in the tree is loaded into main memory and queries are routed through this subtree.
Image BufferRTree


next up previous
Next: Floating Buffer Tree Up: Efficient Batch Query Processing Previous: R-Trees
Craig Dillabaugh 2007-04-26