A full NLP system would include speech recognition, TTS, a large language model, and a vector search engine. The language model should be multi-modal, multi-lingual and multi-task, "multi-multi-model" for short haha. I'm wondering when we'll have this stack as a default on all OSes. We want to be able to search, transcribe, generate speech, run NLP tasks on the language model, and integrate with external APIs via intent detection.
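To make the intent-detection part concrete, here is a minimal sketch of how an utterance could be routed to an external API: embed the utterance, compare it to an example phrase per intent, and pick the closest one. The intent names and example phrases are hypothetical, and a real system would use a language model's embeddings rather than the toy bag-of-words vectors here.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would call a language model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical intents, each mapped to an example phrase and an API to call.
INTENTS = {
    "play_music": "play some music for me",
    "set_timer": "set a timer for ten minutes",
    "get_weather": "what is the weather like today",
}

def detect_intent(utterance):
    # Route to the intent whose example phrase is most similar to the utterance.
    scores = {name: cosine(embed(utterance), embed(phrase))
              for name, phrase in INTENTS.items()}
    return max(scores, key=scores.get)
```

With this sketch, `detect_intent("play music")` resolves to `play_music`, and the OS-level stack would then dispatch the request to whichever external API handles that intent.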
On the search side there are lots of vector search companies - Weaviate, Deepset Haystack, Milvus, Pinecone, Vespa, Vald, GSI and Qdrant. But vector search has not been generally deployed on most systems yet; people are only just discovering it. Large language models are still difficult to run locally, and all of these models require plenty of RAM and GPU, so the barrier to entry is still high.
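At its core, what all of these engines productize is nearest-neighbour lookup over embedding vectors. A minimal in-memory sketch, assuming dense vectors are already produced by some model, looks like this; real engines such as the ones above replace the linear scan with approximate structures like HNSW or IVF to scale.

```python
import heapq
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class VectorIndex:
    # Naive brute-force index: stores (id, vector) pairs and ranks
    # everything against the query. Fine for a sketch, not for scale.
    def __init__(self):
        self.items = []

    def add(self, item_id, vector):
        self.items.append((item_id, vector))

    def search(self, query, k=3):
        # Return the top-k (score, id) pairs by cosine similarity.
        scored = ((cosine(query, vec), item_id) for item_id, vec in self.items)
        return heapq.nlargest(k, scored)
```

For example, after adding vectors for documents "a" and "b", a query close to "a" returns "a" first; the hard part the vector search companies solve is doing this over billions of vectors without scanning them all.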