> Vector search, on the other hand, requires reading a lot of small vectors, which might create a large overhead. It is especially noticeable if we use binary quantization, where the size of even large OpenAI 1536d vectors is compressed down to 192 bytes. Dataset size: 2M 768d vectors (~6 GB raw data), binary quantization, 650 MB RAM limit. All benchmarks are made with minimal RAM allocation to demonstrate disk cache efficiency.
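The quoted 192-byte figure is just arithmetic: binary quantization keeps one sign bit per dimension, so 1536 dimensions pack into 1536/8 = 192 bytes. A minimal sketch (the variable names are mine, not from the post):

```python
import numpy as np

dim = 1536
vec = np.random.randn(dim).astype(np.float32)  # raw vector: 1536 * 4 = 6144 bytes

bits = vec > 0              # binary quantization: one sign bit per dimension
packed = np.packbits(bits)  # 8 bits per byte -> 1536 / 8 = 192 bytes

assert packed.nbytes == 192
```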
Nonsense.
There is no performance benefit that immutable structures offer that cannot be had by mutable ones. It is asinine to assert otherwise, especially when literally every single measurement ever done demonstrates that immutable structures perform orders of magnitude slower.
You do not need immutability to create a slow-moving cache. Immutability is not where any semblance of performance is coming from here.
If they didn't succumb to idiotic bullshit nonsense, they wouldn't have even needed this post. If you see any of your senior architects reading Medium, fire them immediately.
Immutable structures can, in a single use case, match mutable ones for performance, and that single case is:
- read speed on aligned, flattened data
But that ignores the fact that getting to a point of aligned, flattened data with immutable structures is incredibly slow.
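The trade-off described above can be sketched in a few lines (a toy illustration, not code from either post): lookups on a flattened, sorted array are cache-friendly and O(log n), but producing a new immutable version for every update means copying the whole thing, O(n) per insert.

```python
import bisect

# Immutable, flattened, sorted data: reads are fast and cache-friendly.
base = tuple(range(0, 1_000_000, 2))  # even numbers only

def contains(arr, x):
    """O(log n) membership test via binary search."""
    i = bisect.bisect_left(arr, x)
    return i < len(arr) and arr[i] == x

# But every "update" must copy the entire array to stay immutable: O(n).
def insert_immutable(arr, x):
    i = bisect.bisect_left(arr, x)
    return arr[:i] + (x,) + arr[i:]  # full copy per insert

new = insert_immutable(base, 3)
assert contains(new, 3) and not contains(base, 3)
```

A mutable structure (a hash table, a B-tree) takes the O(n) copy off the write path entirely, which is the point being made.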