Are embeddings a hack? Is building out tooling and databases and APIs and companies around embeddings all going to be for naught as soon as there's a solid LLM/API with a big enough context window?
Yes - embeddings are a hack:
No - there won't be anything like a "real API" unless there's a new discovery or a shift in the way LLMs are constructed. It's not theoretically impossible, but there's no clear way to get guaranteed results from present-day LLMs; all they do is output guesses from their input text (the prompt combined with the user's text).
Expanding the context window seems like one approach, but if you're trying to get an answer about your company's documentation, why would you need to feed it through the entirety of GPT-X?
Here's a relevant quote: https://simonwillison.net/2023/Apr/15/ted-sanders-openai/
Analogous, more or less, to a human with general experience (base training), experience with your code base (fine tuning), and the ability to reference the current code base directly (embedding-based search/recall). All three have a role, they are complementary rather than mutually exclusive.
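To make the "embedding-based search/recall" part concrete, here's a minimal sketch of the retrieval step. Note the `embed` function here is a toy bag-of-words stand-in; a real system would call an embedding model to get dense vectors, but the ranking-by-similarity shape is the same:

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words term counts. A real pipeline would
    # use a model-produced dense vector here instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Rank documents by similarity to the query and return the top k;
    # these would then be pasted into the LLM prompt as context.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "How to deploy the billing service",
    "Vacation policy for new employees",
    "Rotating the API keys for the billing service",
]
print(retrieve("billing deploy steps", docs))
```

The point is that retrieval narrows your documentation down to a few relevant chunks before the LLM ever sees it, which is why a bigger context window doesn't obviously make this step obsolete.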