Reducing problems to document ranking is effectively a type of test-time search - also very interesting!
I wonder if this approach could be combined with GRPO to create more efficient chain of thought search...
https://github.com/BishopFox/raink?tab=readme-ov-file#descri...
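To make the "ranking as search" framing concrete, here's a minimal sketch: reduce "find the best candidate" to sorting with a pairwise comparator, where each comparison would normally be one LLM call. The `llm_prefers` function below is a hypothetical stand-in (keyword overlap instead of a model) so the sketch runs offline; it is not raink's actual API.

```python
from functools import cmp_to_key

def llm_prefers(query, a, b):
    # Hypothetical stand-in for an LLM pairwise judgment. Here we just
    # score by naive keyword overlap with the query so the sketch runs
    # without any API. Returns negative if a should rank above b.
    def score(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return (score(b) > score(a)) - (score(b) < score(a))

def rank(query, docs):
    # The reduction: sorting with an LLM comparator turns ranking into
    # test-time search, one model call per comparison.
    return sorted(docs, key=cmp_to_key(lambda a, b: llm_prefers(query, a, b)))

docs = [
    "memory safety bug in the parser",
    "print the help text",
    "off-by-one in length check",
]
ranked = rank("memory safety bug", docs)
```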
The LLM companies work on the LLMs, while tens of thousands of startups and established companies work on applying what already exists.
It's not either/or.
I am trying to grok why we want to find the fix - is it to understand what was done so we can exploit unpatched instances in the wild?
Also, re "identifying candidate functions for fuzzing targets": if every function is a document, I get where the list of documents comes from, but what is the query? How do I say "find me the function most suitable for fuzzing"?
Apologies if that’s brusque - trying to fit new concepts in my brain :-)
Maybe you even use the LLM to find vulnerable snippets at the beginning, but a multi-class classifier or embedding model will be way better at runtime.
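A minimal sketch of that runtime path, assuming you've precomputed one embedding centroid per class from LLM-labeled examples (the 2-d vectors here are toy values, not real embeddings): classification becomes one cheap cosine comparison per class instead of a model call.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def classify(embedding, centroids):
    # Nearest-centroid over precomputed class embeddings: no LLM call
    # at runtime, just one vector comparison per class.
    return max(centroids, key=lambda label: cosine(embedding, centroids[label]))

# Toy 2-d centroids standing in for averaged real embeddings.
centroids = {"vulnerable": [0.9, 0.1], "benign": [0.1, 0.9]}
label = classify([0.8, 0.2], centroids)
```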
awesome-generative-information-retrieval > Re-ranking: https://github.com/gabriben/awesome-generative-information-r...
How'd it perform compared to listwise?
Also curious whether you tried schema-based querying of the LLM (function calling / structured output). I recently tried to have a discussion about this exact topic with someone who posted about pairwise ranking with LLMs.
https://lobste.rs/s/yxlisx/llm_sort_sort_input_lines_semanti...
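For what I mean by schema-based querying, here's a sketch: constrain the pairwise verdict to a JSON schema so parsing is a `json.loads` plus a sanity check instead of regexing free text. The schema shape and field names below are illustrative assumptions, not any particular provider's format.

```python
import json

# Illustrative schema for a pairwise-ranking response: the model must
# pick a winner and may report a confidence in [0, 1].
PAIRWISE_SCHEMA = {
    "type": "object",
    "properties": {
        "winner": {"type": "string", "enum": ["A", "B"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["winner"],
}

def parse_pairwise(raw):
    # With structured output the model is constrained to this shape,
    # so a load plus one sanity check replaces brittle text parsing.
    out = json.loads(raw)
    if out.get("winner") not in ("A", "B"):
        raise ValueError("model returned an invalid winner")
    return out

result = parse_pairwise('{"winner": "A", "confidence": 0.8}')
```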
Should be "document ranking reduces to these hard problems".
I never knew why the convention is like that; it seems backwards to me as well, but that's how it is.
At least bother to read the discussion in the sibling comments.