Edit: their blog post (https://subq.ai/how-ssa-makes-long-context-practical) does go pretty in-depth about it
Edit 2: the fact that they're going straight for an end-to-end coding product on day 1 is very ambitious. Other speed/efficiency-oriented AI companies (Cerebras and Inception come to mind) still don't have a first-party coding product after years. IMO this is absolutely the right way to go if they really do have the big breakthrough they're claiming.
- They are admitting that this is built on top of a Chinese model[1]
- They committed a huge chart crime with the Y axis of a chart comparing to Opus on their website, which I can't find anymore (too embarrassing to keep?). The delta between their score (81%) and Opus (87%) on SWE-bench was heavily minimized
- They named the company subquadratic, but in places they said O(1) linear scaling. At O(1) you could support far more than a 12M-token context window. Even at O(log n).
I hope this is real but I doubt...
I see in the linked post they mention O(n), not O(1). O(1) would basically be impossible: the cost would stay constant no matter how long the input is, even though just reading n tokens already takes O(n).
The name subquadratic is actually good and makes sense to me, because attention in today's models is usually O(n^2) in context length. Anything that grows slower than O(n^2), including O(n log n) and O(n), is sub-quadratic.
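To put rough numbers on the difference, here is a minimal sketch comparing idealized operation counts at a few context lengths. It assumes every constant factor is 1, which no real system satisfies, so only the ratios mean anything. The function names are made up for illustration, and 12M is just the context size mentioned upthread.

```python
import math


def op_counts(n: int) -> dict[str, float]:
    """Idealized operation counts for n tokens (all constant factors assumed to be 1)."""
    return {
        "O(n^2)": float(n) ** 2,
        "O(n log n)": n * math.log2(n),
        "O(n)": float(n),
        "O(log n)": math.log2(n),
    }


def quadratic_to_linear_ratio(n: int) -> float:
    """How many times more work the quadratic case does than the linear case."""
    counts = op_counts(n)
    return counts["O(n^2)"] / counts["O(n)"]


# Compare a typical window, a 1M window, and the 12M window from the thread.
for n in (128_000, 1_000_000, 12_000_000):
    formatted = {name: f"{value:.3g}" for name, value in op_counts(n).items()}
    print(f"n={n:>10,}", formatted)
```

Under these assumptions the gap between quadratic and linear is a factor of n itself, so it widens as the window grows.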
Meanwhile, O(log n) would be logarithmic, as the "log" in the name indicates. But we have a long way to go there. Maybe with a double tokenizer plus extensive caching it could be possible...
What I mean here is tokenizing the user input, then capturing the intent, then caching intent -> response. That way, the next time you see the same intent, you don't need to run the full transformer inference. The lookup could then have logarithmic time complexity.
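A rough sketch of that intent-caching idea, to make it concrete. Everything here is hypothetical. `extract_intent` is a stand-in that only lowercases and collapses whitespace, where a real system would need a model for this step. `fake_model` is a placeholder for the expensive inference call. The cache is a plain dict, so a hit is O(1) on average in the number of cached intents, while a tree or sorted index would give O(log n). Either way the saving only applies on cache hits, and extracting the intent still has to read the whole input.

```python
from typing import Callable


def extract_intent(user_input: str) -> str:
    """Hypothetical stand-in for intent capture: lowercase and collapse whitespace."""
    return " ".join(user_input.lower().split())


class IntentCache:
    """Maps intent -> response so repeated intents skip the expensive call."""

    def __init__(self, run_full_inference: Callable[[str], str]) -> None:
        self._infer = run_full_inference
        self._responses: dict[str, str] = {}
        self.misses = 0  # number of times the expensive call was actually made

    def respond(self, user_input: str) -> str:
        intent = extract_intent(user_input)
        if intent not in self._responses:
            # Cache miss: pay for full inference once and remember the result.
            self.misses += 1
            self._responses[intent] = self._infer(user_input)
        return self._responses[intent]


def fake_model(prompt: str) -> str:
    """Placeholder for the full transformer inference (an assumption, not a real API)."""
    return f"answer to: {prompt.strip()}"


cache = IntentCache(fake_model)
first = cache.respond("How do I reverse a list?")
second = cache.respond("  how do I   REVERSE a list? ")  # same intent, served from cache
```

The hard part this sketch skips is the intent extraction itself: two prompts only share a cache entry if they normalize to exactly the same key.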
It seems at or above SOTA on the given benchmarks, doesn’t have context rot, is orders of magnitude faster, and uses less compute than current transformer models. I suppose it’s just an announcement and we can’t test it ourselves yet.
I am happy to answer any questions!
Do you anticipate having any kind of public accessible chat interface for testing in the near future?
Also, what, if any, benefits are there for smaller context windows? Is there still a material improvement in cost to serve under say 256k? I'm curious about the broader implications for the space beyond improvements for very large context windows.
Can you back up your claims?
Why did you not release the white paper in parallel with the product?
Feels really fishy.
Yes, this product doesn't exist.
And the last time a company claimed something similar it disappeared after taking money from investors.
no published benchmarks
no paper
no demonstrations of capabilities
Also, holy moly, the astroturfing.
But I'll still keep an eye on what they show up with in the coming months. Sounds intriguing.
I don't know if this will help for things like understanding code, where all the relevant parts can be in the 1000-line file that we are analyzing, and where every token is relevant to understanding recursion, loops, function calls, etc.
This sounds like it would be great to do SSA before passing things along to a code model like Claude Code.
Let me know if I misunderstood