The more I read about binary search trees, the more I thought about them, and so down the rabbit hole I went. When you read everything you can find about a topic for several years, eventually you develop a deep enough understanding to compose new ideas from different sources.
This project is the result of many rejected ideas and countless experiments. I tried my best to distill everything I consider important enough to keep, and will continue to develop other ideas as they come along. I had no intention to implement anything other than weight-balanced strategies to support positional access, but when I read about rank-balanced trees I knew I had to take on the challenge to implement them.
I've been in contact with various authors along the way, specifically Bob Tarjan and Salvador Roura, but have otherwise not received any feedback yet. Implementing all the algorithms was incredibly hard work, by far the hardest work I've ever done, so I hope that others may find value in their presentation.
There is still so much work that could be done, but there comes a time when working on something alone begins to yield diminishing returns. I hope to continue this project as an open-source collaborative effort, so please feel free to ask questions or suggest changes, however small.
Instead, within the scope of binary search trees specifically, what interesting things might we uncover by taking a closer look? In a hypothetical future, hardware might be so different, and caches so large and advanced, that today's best practices no longer apply.
Binary search trees today are probably only viable for absolutely massive collections, and at that scale there is almost certainly a better, bespoke solution available. Exploring in the abstract, without necessarily thinking about the hardware, has been a fun sandbox to play in. The fact that we still teach them suggests that they provide value beyond practical application.
Thank you for your feedback, you are absolutely right. In most practical use-cases today, a B-tree would likely achieve better results.
All trends are going in the wrong direction for binary trees to ever become useful again, and some of those limits are physics, not design.
- Branch predictors hate binary trees.
- CPU pipelines hate pointer chasing.
- Larger caches benefit binary trees less than alternatives such as "cache-oblivious" layouts.
- The number of operations wasted per cache miss due to memory latency has been going up, not down.
Etc...
> even consider memory hierarchy at all.
There's an often overlooked physics aspect here: physical memory requires physical space. As you add "N" bytes of memory, the layout of the circuits needs to expand into that space. If laid out in a flat plane, as is typical, this is a shape with radius roughly proportional to √N. For a space-filling volumetric design, it is at best ∛N.
Access to a piece of data requires a round-trip that takes time proportional to the distance taken through the computer. This is a simple consequence of the speed of light.
This means that unpredictable or random memory access time scales as ∛N to √N, not the constant "1" as typically assumed in big O analysis!
This is inescapable for computers made from physical components instead of pencil & paper abstractions.
If you include this term, a lot of algorithms traditionally considered efficient suddenly look much worse. Conversely, algorithms preferred by developers with a knack for performance tuning don't incur such a big penalty.
BSTs are also useful as an analytical tool because they're just B-trees with B = 2, so a lot of the insight into their structure and algorithms extrapolates to N-ary search trees, a.k.a. B-trees.
4 2 6 1 3 5 7
is that right, and do you mean to make this arbitrarily large?

i feel like this is less cache-friendly than a naive b-tree, even without rebalancing. if your cache-line size is 128 bytes, your keys are 4 bytes, and your b-tree nodes are 15 keys and 16 (4-byte!) pointers, you can reach any of 1048576 keys in 5 cache-line fills, of which probably 2 were already in your cache, so it's really 3. by contrast, the flat-array binary search tree above is 20 levels deep. the first 5 levels are in one cache line (because you don't waste any space on pointers), but every key in the following 15 levels is in its own cache line, so you have probably 14 cache-line fills.
14 is like a lot more than 3 in my mind maybe
conceivably you have done this in practice and can tell me what i'm overlooking
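here's the level arithmetic spelled out, under the same assumptions (128-byte lines, 4-byte keys, 16-way b-tree nodes, 2^20 keys):

```go
package main

import "fmt"

// levels returns the smallest h with fanout^h >= n, i.e. roughly how
// many node levels are needed to index n keys.
func levels(n, fanout int) int {
	h, reach := 0, 1
	for reach < n {
		reach *= fanout
		h++
	}
	return h
}

func main() {
	const n = 1 << 20        // 1048576 keys
	const lineSlots = 128 / 4 // 4-byte keys in a 128-byte line: 32 slots

	// b-tree: each 16-way node (15 keys + 16 children) is one line fill.
	btreeFills := levels(n, 16)

	// pointer-free flat-array binary tree: the top levels(lineSlots, 2)
	// levels (31 nodes) share one line; each deeper level costs a line.
	topLevels := levels(lineSlots, 2)
	binaryFills := 1 + levels(n, 2) - topLevels

	fmt.Println(btreeFills, binaryFills)
}
```

this counts 16 distinct lines touched for the binary layout; with the top couple of lines already cached you land around the ~14 fills estimated above, versus 5 (or effectively ~3) for the b-tree.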
The Charter font was a particularly good find as part of the "transitional" system font stack. I'll definitely use it for other projects in the future.
You have my respect for making this and setting an example that I will definitely strive towards for my own work.
It's ANSI C with a bespoke macro generics system, not Go; it does not cover as many balancing rules, and it is not written up as nicely. It does do something only weakly represented in your Go implementations, AFAICT: "binary symmetry", the reflection about left-right intrinsic to most ideas in this area. (Mehlhorn has a book that does this, but that is the only other source I know.) It also has API considerations like seek-edit separation and how to handle in-tree duplicate keys (FIFO/LIFO/etc., as opposed to values which are also collections).
Also, re: B-trees among several comments here - edit-heavy B-trees with very large nodes (driven by IO bandwidth-delay products) need some "mini-scalable" structure for their nodes, since shifting (on average) half the entries can be costly. That mini-scale structure could be another B-tree, or it could be a binary search tree, perhaps adjusted to use 2-byte pointers into the little arena that is a node, if a 64K node is enough. I once heard Sybase (now Microsoft SQL Server?) used skip lists. Anyway, this may be a remaining use case for binary trees even in the presence of a B-tree.
Re: binary symmetry, if I'm understanding correctly, another author that makes use of the symmetry is Ben Pfaff in libavl [1]. At the top of [2], which seems a bit misplaced now, I wrote:
>A choice was made to not unify the symmetric cases using the direction-based technique of Ben Pfaff and others because it makes the logic more difficult to follow even though there would be less code overall.
The choice of Go was to provide implementations that are both reliable to benchmark (though not as robust as C or Rust, for example) and easy to read. I would like to further reduce abstraction by decomposing common parts such that all the strategies are "one-file" references. This is effectively the opposite of what the macro-based implementation achieves. Both have value, of course.
[1] https://adtinfo.org/libavl.html/BST-Node-Structure.html
[2] https://github.com/rtheunissen/bst/blob/main/trees/avl_botto...
Along that abstraction line, it perhaps bears emphasizing that it is not all subjective. E.g., being able to have 1-byte or 2-byte pointers can really shrink the per node space overhead from 16B to 2..4B, possibly total node size from 20B to 8B for a 4B payload (like a float32) or 2.5x overall space saving, an objective and interesting amount that might well keep a working set resident in faster memory. Indeed, even allowing any number of bits like 20 bits*2 is thinkable. Of course, that limits the number of nodes to the address space of the numbers, but that can be ok (such as inside 1 B-tree node or when populations are elsewise easily bounded). But then you pretty much must abstract "allocation/dereferencing" as done in that generic C package or analogously. (Others elsethread were discussing memory overheads vs. B-trees, but not at this API level and more related to tree/indirection depth in uncached worst cases.)
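A minimal sketch of that small-pointer idea (names and layout are mine, not taken from that C package): nodes live in a flat slice and links are uint16 indices, so each link costs 2 bytes instead of 8, at the price of abstracting allocation and dereferencing exactly as described.

```go
package main

import "fmt"

// node is addressed by 2-byte indices into a flat arena instead of
// 8-byte machine pointers: 8 bytes total for a 4-byte payload.
// Index 0 is reserved as the nil sentinel.
type node struct {
	key         float32
	left, right uint16
}

type arena struct {
	nodes []node // nodes[0] is the sentinel; at most 65535 real nodes
}

func newArena() *arena { return &arena{nodes: make([]node, 1)} }

// alloc appends a fresh node and returns its 2-byte handle.
func (a *arena) alloc(key float32) uint16 {
	a.nodes = append(a.nodes, node{key: key})
	return uint16(len(a.nodes) - 1)
}

// insert is a plain unbalanced BST insert over handles, just to show
// that "dereferencing" becomes indexing. The child handle is stored
// only after the recursive call returns, because alloc may have
// reallocated the backing slice in the meantime.
func (a *arena) insert(root uint16, key float32) uint16 {
	if root == 0 {
		return a.alloc(key)
	}
	if key < a.nodes[root].key {
		child := a.insert(a.nodes[root].left, key)
		a.nodes[root].left = child
	} else {
		child := a.insert(a.nodes[root].right, key)
		a.nodes[root].right = child
	}
	return root
}

func main() {
	a := newArena()
	var root uint16
	for _, k := range []float32{4, 2, 6, 1, 3, 5, 7} {
		root = a.insert(root, k)
	}
	fmt.Println(len(a.nodes) - 1) // number of nodes allocated
}
```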
Anyway, I just thought that package might provide color along some other less traveled roads in this space - I was not trying to direct or redirect your own effort. It's a nice write-up of what you have so far.
After reading the last paragraph:
"Any current red-black tree implementation in practice can be replaced by a logarithmic weight-balance tree to achieve better balance, better performance, and get positional access as a bonus, all with a simpler algorithm."
... it is clear the article is advocating for these, and the title should hence be something like "Advantages of logarithmic weight-balanced trees over other balanced trees", immediately followed by an explanation of what these are and what other people call them.
The repository linked at the top includes a lot of material, which makes it hard to find the logarithmic weight-balanced tree implementation among all the other kinds of trees in there. It would be helpful to have a separate logarithmic weight-balanced tree Go module for people looking to study its source code.
40: L. Frias, Extending STL maps using LBSTs, 2005.
They are definitely part of the class of weight-balanced trees, using the exact same algorithms as BB[α] trees, which are the classic weight-balanced trees.
When I started this project, it was unclear that the conclusion would advocate for them. In fact, I did not even know about them when I started. My intention was to focus on the exploration more than the conclusion. I'm in the process of writing another conclusion around relaxed balance as a concept, which could just as well be the title.
I appreciate this feedback very much. Perhaps the paper could mention specifically that Roura and Frias call these logarithmic binary search trees, abbreviated as LBST. The repository's README could also provide an index to each tree in the source.
For reference: the weight-balanced algorithms can be found in [1], [2] and [3]. The logarithmic weight-balance rules are defined in [4].
[1] https://github.com/rtheunissen/bst/blob/main/trees/wbst_bott...
[2] https://github.com/rtheunissen/bst/blob/main/trees/wbst_topd...
[3] https://github.com/rtheunissen/bst/blob/main/trees/wbst_rela...
[4] https://github.com/rtheunissen/bst/blob/main/trees/lbst.go
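Concretely, the "logarithmic" part means balancing on ⌊log2⌋ of subtree weights, which in code reduces to an integer bit-length comparison rather than the multiplications of BB[α]. A rough sketch of the flavor of the test (my paraphrase, not the exact conditions in [4] - see lbst.go for those):

```go
package main

import (
	"fmt"
	"math/bits"
)

// logSize is the integer binary logarithm of a subtree's weight,
// i.e. the position of its highest set bit.
func logSize(weight uint) int {
	return bits.Len(weight)
}

// tooHeavy reports whether the left subtree outweighs the right by
// more than one binary order of magnitude. The point of a logarithmic
// rule is that this needs no multiplication, only a bit-length
// comparison. (Illustrative only; the real rule in lbst.go differs
// in its exact conditions.)
func tooHeavy(left, right uint) bool {
	return logSize(left) > logSize(right)+1
}

func main() {
	fmt.Println(tooHeavy(8, 3)) // 4 bits vs 2 bits: out of balance
	fmt.Println(tooHeavy(5, 3)) // 3 bits vs 2 bits: fine
}
```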
Thanks for the extra pointers!
There are also LLRB trees, which would be interesting to see compared within this framework. All the red-black trees are implemented as rank-balanced trees, so there is no concept of "color" exactly, but the authors do mention the left-leaning 2-3 rule and the left-leaning red-black rule -- I just haven't implemented those yet.
See Pg. 5 of https://citeseerx.ist.psu.edu/document?type=pdf&doi=52330eed...