Looks like only the GUI aspects of the UI are self-hosted, while the text and speech aspects (and the bulk of the computation and IP) are provided by two SaaS services.
Self-hosted (and to some degree open) ML models are what a lot of people actually want, so we should probably be careful when saying "self-hosted" right now, so as not to disappoint people or confuse the discussion about what we want.
However, sophisticated readers familiar with ChatGPT will know the model and weights haven't been released, and absent a leak/hack/release by OpenAI, a completely self-hosted ChatGPT is impossible. Eventually we'll almost certainly see a completely self-hosted ChatGPT equivalent (similar to DALL-E vs. Stable Diffusion), but that's another thread for another time.
Based on my native-speaker parsing of English, "Self-hosted ChatGPT UI" is accurate, and I'm not sure how else I would write it to disambiguate between a self-hosted UI and a completely self-hosted ChatGPT with a UI.
"Show HN: I made a self-hosted UI for the ChatGPT API"
"Show HN: I made a self-hosted UI for a local GPT model"
But more to the point, a fully self-hosted solution (LLaMA), even running on a cellphone, is entirely believable. Look at some of the recent developments with llama.cpp and Stanford's Alpaca over the last week.
Integrating with Alpaca, Llama, ChatGLM, OpenChatBox and whatever comes next should be straightforward once people figure out reliable and fast methods to run the models locally.
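If those local runtimes end up speaking a chat-style HTTP API, the integration point can be as small as a backend table. A sketch, where the "local" URL and model name are pure assumptions (whatever HTTP wrapper you happen to run around llama.cpp, Alpaca, etc.):

```javascript
// Hypothetical backend table: same chat-message shape, different endpoint.
// The "local" URL and model name are assumptions, not a real server.
var BACKENDS = {
  openai: { url: "https://api.openai.com/v1/chat/completions", model: "gpt-3.5-turbo" },
  local:  { url: "http://localhost:8080/v1/chat/completions", model: "llama-7b-q4" }
};

// Build the request without sending it, so UI code stays backend-agnostic.
function buildChatRequest(backend, messages) {
  var b = BACKENDS[backend];
  return { url: b.url, body: JSON.stringify({ model: b.model, messages: messages }) };
}
```

The UI then only ever constructs messages; switching models is a one-word change.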
Assuming the model were available, how big are the models and what kind of hardware is necessary to run an instance?
Also, expect performance of a couple of seconds per token in that setup; for now you need something involving GPUs.
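For rough numbers: the memory needed just to hold the weights is parameter count times bytes per parameter, which is why 4-bit quantization is what makes the laptop/phone scenario plausible. A back-of-the-envelope calculation:

```javascript
// Approximate RAM needed just to hold the weights:
// parameter count (in billions) * bytes per parameter, expressed in GiB.
function weightMemoryGiB(paramsBillions, bytesPerParam) {
  return paramsBillions * 1e9 * bytesPerParam / Math.pow(2, 30);
}

// LLaMA-7B at 4-bit (0.5 bytes/param) is roughly 3.3 GiB,
// while 65B at fp16 (2 bytes/param) is roughly 121 GiB.
```

This ignores activation memory and KV cache, so treat it as a floor, not a sizing guide.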
Edit: I think the title was changed. dang, can you please show revision history? Otherwise I can't discuss properly.
I'm really interested in finding a frontend with LangChain integration that can switch between chat mode and doc mode or something along those lines. It would be great to have a more versatile tool for communication and collaboration.
Do any of you have any recommendations or know of any projects that fit this description?
https://old.reddit.com/r/OpenAI/comments/11k19en/i_made_an_a...
I'm building a chrome extension called SublimeGPT[1] to do exactly that. Right now, you can log in to your existing ChatGPT account, go to any page, and open a chat overlay. Next version will have the context options.
  function __summarize(api_key) {
    var selection = window.getSelection().toString();
    if (selection.length == 0) return;

    var xhr = new XMLHttpRequest();
    xhr.open("POST", "https://api.openai.com/v1/chat/completions");
    xhr.setRequestHeader("Content-Type", "application/json");
    xhr.setRequestHeader("Authorization", "Bearer " + api_key);

    // Replace the current page with a minimal "loading" view.
    window.scrollTo({top: 0});
    document.body.innerHTML = "asking...";
    document.body.style.backgroundColor = "white";
    document.body.style.color = "black";
    document.body.style.fontFamily = "monospace";
    document.body.style.fontSize = "16px";
    document.body.style.margin = "auto";
    document.body.style.padding = "1rem";
    document.body.style.maxWidth = "60rem";

    xhr.onreadystatechange = function() {
      if (xhr.readyState == 4) {
        if (xhr.status == 200) {
          var response = JSON.parse(xhr.responseText);
          var summary = response.choices[0].message.content;
          document.body.innerHTML = summary;
        } else {
          // Show the API's error message if the body is parseable JSON.
          try {
            var e = JSON.parse(xhr.responseText);
            document.body.innerHTML = e.error.message;
          } catch (e) {
            document.body.innerHTML = "error asking.. check the console";
            console.log(xhr);
          }
        }
      }
    };

    var data = JSON.stringify({
      "model": "gpt-3.5-turbo",
      "messages": [
        {"role": "system", "content": "Summarize the following text as if you are Richard Feynman"},
        {"role": "user", "content": selection}
      ]
    });
    xhr.send(data);
  }
(I have it as a bookmarklet here https://gist.github.com/jackdoe/ce5a60b97e6d8487553cb00aa43f... — replace "YOUR API KEY HERE" with your key.)

Edit: It seems like it is just using the API instead of the web interface, and thus charging my account each time. I originally thought it was injecting into the free web interface. But is changing the system prompt going to get me banned?
[0]: https://platform.openai.com/docs/guides/chat/introduction
That way it works in any app automatically. Seamless system-wide clipboard read is a big ask though, so ideally you'd want a self-hosted model like llama.cpp.
How would one create a "domain expert" version of this? The idea would be to feed the model a bunch of specialized, domain-specific content, and then use an app like this as the UX for that.
I guess you could also try to tack on an extra layer before the actual API call, and make your own system that injects key bits of info into the prompt from a more specific data set. But at this rate of new releases from OpenAI, it might be a safe bet to wait a couple of weeks until they update the fine-tune API.
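That extra layer can be sketched very crudely: score each snippet from your domain corpus by keyword overlap with the question, then splice the best matches into the system prompt. (Real systems use embeddings for the matching; this keyword version is the simplest possible stand-in, and all the names here are illustrative.)

```javascript
// Illustrative prompt-augmentation layer: pick the snippets most relevant
// to the question (by naive keyword overlap) and prepend them as context.
function buildAugmentedMessages(question, snippets, maxSnippets) {
  var terms = new Set(question.toLowerCase().match(/\w+/g) || []);
  var best = snippets
    .map(function (s) {
      var words = s.toLowerCase().match(/\w+/g) || [];
      var score = words.filter(function (w) { return terms.has(w); }).length;
      return { text: s, score: score };
    })
    .filter(function (x) { return x.score > 0; })
    .sort(function (a, b) { return b.score - a.score; })
    .slice(0, maxSnippets || 2)
    .map(function (x) { return x.text; });
  return [
    { role: "system", content: "Answer using only this context:\n" + best.join("\n---\n") },
    { role: "user", content: question }
  ];
}
```

The resulting messages array drops straight into the same chat/completions call the UI already makes.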
https://dev.to/dhanushreddy29/fine-tune-gpt-3-on-custom-data...
It is a good approach, but using the word "fine-tuning" for it is confusing, given that OpenAI has an actual fine-tuning process, which works in a very different way.
True, and the free version does it a lot, almost on purpose.
The paid version is a lot faster and doesn't break as often, but it still breaks (e.g. for the last two days, the chat list in the sidebar disappeared and it showed a message saying "don't worry, your chats will show up eventually").
I can't wait to test this! As others have mentioned, the "free" chat frontend is slow and the "Plus" one is not much better. Also, at $20/month, based on my usage, it's actually more expensive than using the API.
The last hurdle: as ChatGPT is not GDPR compliant, it would be really interesting/useful to find a way to "hide" the queries from OpenAI and prevent the use of your input in future training: basically, a self-hosted, non-leaking ChatGPT.
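Short of a fully local model, one partial mitigation is scrubbing obvious identifiers client-side before the prompt ever leaves the machine. A sketch only: these two regexes are illustrative, nowhere near a complete PII filter, and certainly not GDPR compliance by themselves.

```javascript
// Replace obvious identifiers (emails, US-style phone numbers) before the
// text is sent to the API. Illustrative only; real redaction needs much more.
function redact(text) {
  return text
    .replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[EMAIL]")
    .replace(/\b\d{3}[- .]?\d{3}[- .]?\d{4}\b/g, "[PHONE]");
}
```

You would call this on the selection/prompt string right before building the request body.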