Persistent Binary Search Trees

This is an implementation of a persistent binary search tree for the GLib open-source data structures library.

My implementation uses a Treap [1] as the underlying binary search tree, and then uses the node-copying method as desribed by Driscol, et al.[2] to make it persistent. My motivation for developing this, and for my choices in how/why to implement it are outlined in my initial email to the gtk-devel mailing list on the topic:

Implementation Details

Treap

g_tree_lookup_related()

Parent Pointers

Persistent Nodes

The size of a persistent GTreeNode is roughly twice the size of an upstream node, when persistence is not being used. We removed balance, left_child and right_child from upstream, which saves 4+2*2 bytes in my test environment. However we add a version integer, data and v pointers, the nv integer, ref_count and the stolen flag. When multiple versions are used, some of these things are shared between copies of a node, but without persistence they all belong solely to a single node. In my 64-bit test environment, these all together add 48 bytes to a node structure, giving each node a total size of upstream-removed+added = 44-12+48 = 80 bytes.

When persistence is being used, each node's table of GTreeNodeVersion structures will contain four sets of version+pointers instead of one. This grows each node by 3*(3*8+4)=84 bytes, to a total size of 164 bytes.

During development, I considered using a single GTreeNode for all persistent versions of a node, removing the need for the data pointer and redirection to find a node's key or value. This node would then have an ever-increasing table of sets of version+pointers. However, it is required that at each node in a path, the next parent/child pointer can be found in O(1) time, and so an index into the target node's table would need to be stored along with each GTreeNode pointer. This change would be equivalent to increasing the size of every GTreeNode pointer inside the tree, increasing the total space used in the end, while also increasing the complexity of the code. Because of this I chose to instead allocate a new GTreeNode each time a node needs to add a pointer to its table, v, but does not have room.

For testing purposes, I applied a patch to both source trees in order to allow me to count the number of rotations being performed. The compiler I used is the 64-bit version of gcc (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu4).

This creates two installations of GLib in my home directory, one which uses the persistent GTree, and one which uses the upstream non-persistent GTree, both compiled without extra debug info, and with optimization level -O2.

Results

I used these test programs, test-single.c, test-few.c, test-many.c and Makefile, along with this script test-all, to compare the two implementations. Note that dynamic linking is required for this test to correctly use both GLib versions. When comparing the upstream and persistent implementations, the test uses the same random seed, so that the same keys are inserted into each tree, and the same queries are made, etc. The two implementations are compared using three different random seeds for each set of elements inserted into them, giving three different sets of key values to be inserted, queried, and deleted. The set of elements is 10⁶ (one million) elements, and each successive set grows by 10⁵. My full test results are available at the end, with the results summarized in the tables below.

In the first table we compare the upstream GTree implementation to the persistent GTree, when only using funtionality in the upstream version, or without making use of persistence. The full results show clearly what is summarized in this table. Inserting an element in the persistent tree takes less than 1.4 times longer than in upstream, which is a fairly small constant amount, independent of the number of elements in the tree. The different number of rotations is because we use a Treap versus the AVL tree in upstream. The AVL tree does slightly more rotations, creating a slightly better balanced tree in these tests. However, when multiple versions are used later, the balance of the persistent GTree improves significantly, leading me to believe that the hashing priority function in the Treap implementation has room for improvement in generating more random-like priorities. The query time in the persistent GTree is nearly identical on average to the upstream tree, and this difference is independent of the size of the tree, though we can see the Treaps unpredictability in the full results, as adding more elements to the tree can improve the query time. Deletions require more rotations in the treap, but generally take less time than in the upstream AVL tree, possibly because they don't need to modify nodes all the way to the root. The only real impact from using the persistent GTree comes in the memory footprint. By using the persistent tree with only 1 version, the memory foot print approximately doubles, as the structure has additional pointers and variables which are not all needed until additional versions are added to the tree.

GTree	Insert usec/n	Insert rot/n	Query usec/q	Delete usec/n	Delete rot/n	Size bytes/n
upstream	0.367236092	0.999985586	0.221296837	0.222647011	0	40.00282069
persistent	0.51106131	0.848266276	0.232498244	0.217069724	0.000245793	80.002842759

Table 1: Averaged results comparing upstream and persistent GTree implementations, without using persistence.

In the second table we see the cost of using versions in a persistent GTree. From the table it is clear that the costs for using persistence are quite reasonable. The time to insert an element increases, but only by a constant amount, completely independent of the size of the tree, or the number of versions in the tree. This cost comes from the fact that finding the right child pointer in each node along a path requires a search for the correct version, and that more memory allocations may be done when inserting and rotating the new node into place. The number of rotations per insert increases, about on par with the upstream GTree, and query time drops significantly. This appears to be due to priority function in the Treap giving better pseudo-random values since we are doing more memory allocations, pushing memory addresses further apart. The time to delete and the number of rotations needed to delete also drop as the tree is more balanced, and there is only a constant amount of increased overhead due to having multiple versions.

The last two columns in this table, the time to destroy the tree, and the size of the tree, are both averaged over the number of elements inserted into the tree plus the number of versions in the tree divided by four. The number of versions is divided by four because a new piece of memory is allocated for a node after it is modified over four different versions. It is clear that the size of the GTree, and thus the time to destroy it, should be proportional to the number of elements and the number of versions, and the results show the constant factor relating to them. When using one version, or n versions, regardless of the number of elements inserted or number of versions, the size value shown in this table remains constant, and is very close to the size of a single node, implying that on average one node is added to the GTree structure per insert, even when adding a new version.

Versions	Insert usec/n	Insert rot/n	Query usec/q	Delete usec/n	Delete rot/n	Destroy usec/(n+v/4)	Size bytes/(n+v/4)
1	0.51106131	0.848266276	0.232498244	0.217069724	0.000245793	0.461401185	80.002841379
n	0.714867195	0.999991379	0.179864611	0.169232782	0.000152621	0.709837425	165.667978372

Table 2: Averaged results comparing persistent GTree with and without using persistence.

Conclusions

The method of persistence described by [2] does work well in practice, producing a data structure that remains fast, and without requiring excessive memory overhead, especially as the number of versions stored increases.

Having a single tree structure in a library that provides both non-persistent and persistent functionality allows for a single interface to use a binary search trees, allowing for easy migration for application developers to using persistence, and requires only one tree implementation to be maintained within the library. For a pointer-based data structure, the persistent GTree is near-optimal in terms of average speed and space (within reasonable constant factors), and so it adds considerable value to make the tree implementation persistent. Persistence does, however, require that the tree remain a pointer-based structure in order to work effectively. Whereas a non-persistent tree implementation may be moved to an optimal cache-oblivious data structure such as this cache-oblivious B-tree. A cache-oblivious data structure has the potential to out-perform a pointer-based structure in practice because of their efficiency using the memory heirarchy at each level (i.e. each level of CPU cache, and main memory if the data structure exceeds the physical memory size and must partially reside on disk). For this reason, it may be desirable to keep a persistent search tree as a separate implementation in a data structures library, so that the performance of the non-persistent tree may be further improved without breaking its interface in teh future.

Availability

LibADDS - Open Source Advanced Data Structures Library

Full Test Results

The full results can be seen here. The file test-output.txt contains timing results, while the various mem.type.seed.num files contain massif heap profiling output for each type of GTree, with num elements inserted into it. The show-peaks script can be used to show the peak memory usage of all the massif output files. These entire results can be seen more easily in the table below.

GTree	Persistence	n (elements)	n Inserts (seconds)	n Inserts (rotations)	q (queries)	q Queries (seconds)	n Deletes (seconds)	n Delete (rotations)	Destroy (seconds)	Size (Bytes)
upstream	1 version	1000000	0.360289	999980	10000000	2.080198	0.210322	2	0.106857	40004090
upstream	1 version	1000000	0.361794	999980	10000000	2.082794	0.233572	2	0.106804	40004090
upstream	1 version	1000000	0.357193	999980	10000000	2.094601	0.217113	2	0.108319	40004090
upstream	1 version	1100000	0.402876	1099979	11000000	2.400965	0.242350	2	0.121526	44004090
upstream	1 version	1100000	0.396211	1099979	11000000	2.448406	0.241803	2	0.116967	-
upstream	1 version	1100000	0.414815	1099979	11000000	2.474160	0.249779	2	0.117525	-
upstream	1 version	1200000	0.444638	1199979	12000000	2.615538	0.275522	2	0.127294	48004090
upstream	1 version	1200000	0.442083	1199979	12000000	2.689526	0.264476	2	0.126977	-
upstream	1 version	1200000	0.439604	1199979	12000000	2.612322	0.269549	2	0.127768	-
upstream	1 version	1300000	0.482051	1299979	13000000	2.827375	0.316597	2	0.139125	52004090
upstream	1 version	1300000	0.494820	1299979	13000000	2.866409	0.307072	2	0.137906	52004090
upstream	1 version	1300000	0.473036	1299979	13000000	3.078726	0.286598	2	0.137678	52004090
upstream	1 version	1400000	0.506551	1399979	14000000	3.139297	0.307938	2	0.148062	56004090
upstream	1 version	1400000	0.510169	1399979	14000000	3.394319	0.308239	2	0.148790	-
upstream	1 version	1400000	0.509399	1399979	14000000	3.036422	0.308315	2	0.148292	-
upstream	1 version	1500000	0.546010	1499979	15000000	3.461912	0.329610	2	0.158792	60004090
upstream	1 version	1500000	0.556470	1499979	15000000	3.254288	0.335219	2	0.159493	-
upstream	1 version	1500000	0.547850	1499979	15000000	3.253508	0.331286	2	0.159697	-
upstream	1 version	1600000	0.581448	1599979	16000000	3.470931	0.352028	2	0.183812	64004090
upstream	1 version	1600000	0.584080	1599979	16000000	3.472722	0.352213	2	0.171622	64004090
upstream	1 version	1600000	0.585932	1599979	16000000	3.482003	0.351957	2	0.169921	64004090
upstream	1 version	1700000	0.623676	1699979	17000000	3.910369	0.374371	2	0.182281	68004090
upstream	1 version	1700000	0.625404	1699979	17000000	3.698552	0.388158	2	0.183890	-
upstream	1 version	1700000	0.621446	1699979	17000000	3.917655	0.373996	2	0.180336	-
upstream	1 version	1800000	0.694512	1799979	18000000	4.045737	0.408714	2	0.194383	72004090
upstream	1 version	1800000	0.673112	1799979	18000000	4.043627	0.396976	2	0.191511	-
upstream	1 version	1800000	0.659407	1799979	18000000	3.984874	0.396548	2	0.190505	-
upstream	1 version	1900000	0.706396	1899979	19000000	4.135065	0.417999	2	0.203039	76004090
upstream	1 version	1900000	0.703275	1899979	19000000	4.133898	0.418379	2	0.200840	-
upstream	1 version	1900000	0.700046	1899979	19000000	4.157925	0.418446	2	0.201358	-
persistent	1 version	1000000	0.508180	999982	10000000	2.510019	0.205469	284	0.464308	80004122
persistent	1 version	1000000	0.506266	999992	10000000	2.334333	0.193481	284	0.462331	80004122
persistent	1 version	1000000	0.524450	999982	10000000	2.457708	0.202558	284	0.462485	80004122
persistent	1 version	1100000	0.555531	1099982	11000000	2.671495	0.262468	308	0.512474	88004122
persistent	1 version	1100000	0.550718	1099982	11000000	2.675488	0.257931	308	0.507280	-
persistent	1 version	1100000	0.552077	1099982	11000000	2.609668	0.255387	308	0.513149	-
persistent	1 version	1200000	0.598701	1199981	12000000	2.972277	0.285345	334	0.558572	96004122
persistent	1 version	1200000	0.598763	1199981	12000000	3.140364	0.282181	334	0.559919	-
persistent	1 version	1200000	0.599944	1199981	12000000	2.971698	0.284266	334	0.554764	-
persistent	1 version	1300000	0.662663	1299983	13000000	2.909076	0.261226	362	0.596416	104004122
persistent	1 version	1300000	0.652815	1299983	13000000	3.042617	0.274130	362	0.603411	104004122
persistent	1 version	1300000	0.655600	1299983	13000000	2.909452	0.262575	362	0.598228	104004122
persistent	1 version	1400000	0.705562	1399967	14000000	5.839270	0.535780	646	0.645641	112004122
persistent	1 version	1400000	0.707307	1399967	14000000	5.997081	0.538280	646	0.643281	-
persistent	1 version	1400000	0.706213	1399967	14000000	5.809312	0.540373	646	0.644155	-
persistent	1 version	1500000	0.762084	1499986	15000000	2.826020	0.237013	287	0.691678	120004122
persistent	1 version	1500000	0.758042	1499986	15000000	2.813801	0.235883	287	0.696014	-
persistent	1 version	1500000	0.775162	1499986	15000000	2.812567	0.235078	287	0.691312	-
persistent	1 version	1600000	0.820830	1599979	16000000	4.531829	0.396939	393	0.742256	128004122
persistent	1 version	1600000	0.814305	1599979	16000000	4.334736	0.381579	393	0.733683	128004122
persistent	1 version	1600000	0.849087	1599979	16000000	4.363295	0.385533	393	0.734313	128004122
persistent	1 version	1700000	0.867889	1699986	17000000	3.187751	0.383622	389	0.779726	136004122
persistent	1 version	1700000	0.888863	1699986	17000000	3.189962	0.386470	389	0.779591	-
persistent	1 version	1700000	0.867457	1699986	17000000	3.201530	0.391508	389	0.787242	-
persistent	1 version	1800000	0.939098	1799986	18000000	3.359148	0.340578	340	0.825325	144004122
persistent	1 version	1800000	0.927377	1799986	18000000	3.446167	0.346883	340	0.829043	-
persistent	1 version	1800000	0.932494	1799986	18000000	3.360240	0.341237	340	0.834492	-
persistent	1 version	1900000	0.968590	1899989	19000000	2.902857	0.246160	221	0.876204	152004122
persistent	1 version	1900000	0.997419	1899989	19000000	2.909748	0.246421	221	0.875488	-
persistent	1 version	1900000	0.977680	1899989	19000000	3.047227	0.246179	221	0.874040	-
persistent	n versions	1000000	0.710953	999993	10000000	1.050974	0.145661	220	0.885344	207055410
persistent	n versions	1000000	0.741220	999993	10000000	1.099554	0.148827	220	0.917061	207055410
persistent	n versions	1000000	0.709816	999993	10000000	1.147848	0.152289	220	0.912249	207055410
persistent	n versions	1100000	0.787665	1099986	11000000	2.107660	0.192017	286	1.000884	227770762
persistent	n versions	1100000	0.789245	1099986	11000000	2.094027	0.188168	286	0.991551	-
persistent	n versions	1100000	0.776268	1099986	11000000	2.148659	0.191696	286	0.986304	-
persistent	n versions	1200000	0.848047	1199994	12000000	1.105762	0.136333	154	1.059789	248511186
persistent	n versions	1200000	0.873276	1199994	12000000	1.115378	0.140403	154	1.074009	-
persistent	n versions	1200000	0.845735	1199994	12000000	1.104125	0.136145	154	1.059633	-
persistent	n versions	1300000	0.915516	1299990	13000000	1.814951	0.171752	179	1.145020	269244210
persistent	n versions	1300000	0.917226	1299990	13000000	1.873896	0.172155	179	1.150497	269244210
persistent	n versions	1300000	0.916638	1299990	13000000	1.850338	0.172039	179	1.147329	269244210
persistent	n versions	1400000	0.996215	1399989	14000000	2.169372	0.168801	178	1.233657	289947082
persistent	n versions	1400000	0.986927	1399989	14000000	2.264875	0.177648	178	1.238888	-
persistent	n versions	1400000	0.996217	1399989	14000000	2.150762	0.169822	178	1.247902	-
persistent	n versions	1500000	1.100537	1499987	15000000	2.646327	0.224882	165	1.327333	310568090
persistent	n versions	1500000	1.070097	1499987	15000000	2.719206	0.229292	165	1.324975	-
persistent	n versions	1500000	1.062905	1499987	15000000	2.701209	0.227369	165	1.322224	-
persistent	n versions	1600000	1.144136	1599986	16000000	3.266689	0.297797	257	1.417095	331346554
persistent	n versions	1600000	1.133403	1599986	16000000	3.032173	0.278553	257	1.413353	331346554
persistent	n versions	1600000	1.134945	1599986	16000000	3.038172	0.279440	257	1.413571	331346554
persistent	n versions	1700000	1.211630	1699981	17000000	4.240850	0.379905	363	1.500128	352043538
persistent	n versions	1700000	1.227347	1699981	17000000	4.245171	0.376465	363	1.512506	-
persistent	n versions	1700000	1.211536	1699981	17000000	4.241356	0.376154	363	1.507197	-
persistent	n versions	1800000	1.288130	1799986	18000000	3.387680	0.356044	305	1.588621	372776850
persistent	n versions	1800000	1.286037	1799986	18000000	3.402412	0.357107	305	1.600261	-
persistent	n versions	1800000	1.288242	1799986	18000000	3.391532	0.355223	305	1.586708	-
persistent	n versions	1900000	1.360413	1899983	19000000	4.276822	0.385997	318	1.681350	393468426
persistent	n versions	1900000	1.388762	1899983	19000000	4.278769	0.386114	318	1.677843	-
persistent	n versions	1900000	1.377639	1899983	19000000	4.274557	0.387528	318	1.674128	-

Table 3: Full test results