So you’d basically install Ollama, download one of the GGUF builds of this model off HuggingFace, and create a Modelfile, since this model isn’t in the default Ollama registry; then Ollama can answer prompts with it. Modelfiles are very simple and are modeled on Dockerfiles. It takes like 15 seconds to write one if you aren’t messing with the various parameters.
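A minimal Modelfile sketch, assuming you've already downloaded a GGUF build (the filename here is illustrative; adjust it to whatever file you actually grabbed):

```
# Modelfile: point FROM at the GGUF you downloaded (filename is illustrative)
FROM ./granite-8b-code-instruct.Q4_K_M.gguf

# optional tuning; the defaults are usually fine
PARAMETER temperature 0.2
```

Then register and run it with `ollama create granite-code -f Modelfile` and `ollama run granite-code` (the name "granite-code" is just whatever you want to call it locally).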
Once it’s in Ollama, just get one of the various GPT plugins for VSCode and point it at the Ollama URL (http://localhost:11434 by default). I use continue.dev, but there are many.
Continue replaces tab autocomplete with LLM completions and adds a chat panel on the right, where keyboard shortcuts let you copy code into the prompt and ask it to edit or generate code, or answer questions about existing code.
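For reference, hooking Continue up to a local Ollama model is roughly this in its config.json (field names and the model name are from my setup and may differ by version; check their docs):

```json
{
  "models": [
    {
      "title": "Granite 8B (local)",
      "provider": "ollama",
      "model": "granite-code",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Granite autocomplete",
    "provider": "ollama",
    "model": "granite-code"
  }
}
```

Here "granite-code" is whatever name you gave the model when you created it in Ollama.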
the server is here: https://github.com/ggerganov/llama.cpp/tree/master/examples/...
And you can search for any GGUF on HuggingFace.
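If you'd rather skip Ollama entirely, that's just a couple of commands (the repo and filenames below are illustrative, and the server binary has been called `./server` or `./llama-server` depending on the llama.cpp version):

```shell
# grab a GGUF from HuggingFace (repo/filename are illustrative)
huggingface-cli download TheBloke/SomeModel-GGUF somemodel.Q4_K_M.gguf --local-dir .

# serve it over HTTP on port 8080
./llama-server -m somemodel.Q4_K_M.gguf -c 4096 --port 8080
```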
Where would I start if I wanted to use a model programmatically? Say I'm building a chatbot: I have a large dataset of replies I want the model to mimic, and I'd want to do this in Python. Of course, I'd probably use a different model than Granite.
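One simple starting point, assuming you run the model locally through Ollama: hit its HTTP chat API from Python and prime the model with a handful of your example replies in the system prompt. This is a few-shot sketch (the model name is an assumption, and mimicking a genuinely large dataset would point toward fine-tuning instead), but it's enough to get a loop going:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default chat endpoint


def build_payload(example_replies, user_message, model="llama3"):
    """Build a /api/chat request that primes the model with example
    replies so it imitates their tone (few-shot prompting)."""
    system = ("Reply in the same tone and style as these examples:\n"
              + "\n".join(f"- {r}" for r in example_replies))
    return {
        "model": model,     # whatever model you've pulled into Ollama
        "stream": False,    # one JSON object back instead of a stream
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user_message},
        ],
    }


def chat(example_replies, user_message, model="llama3"):
    """POST the payload to Ollama and return the assistant's reply text."""
    data = json.dumps(build_payload(example_replies, user_message, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


# usage (needs a running Ollama instance):
#   chat(["no worries!", "happy to help :)"], "My order never arrived.")
```

The payload shape matches Ollama's /api/chat endpoint; with `"stream": False` you get a single JSON response whose reply text sits under `message.content`.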
> Our process to prepare code pretraining data involves several stages. First, we collect a combination of publicly available datasets (e.g., GitHub Code Clean, Starcoder data), public code repositories, and issues from GitHub
Citation needed
All I've seen from them in my professional experience is legacy mainframe maintenance. Not shovelware, but very far from hardcore tech.
They've been doing "AI" for ages, notably Watson over the last couple of decades or so.
I've not seen any proper evaluations for Granite against, say, Llama or Mistral.
Until we do it's probably too early to say they can't compete, at least in some areas where others perform poorly.
Previous Granite models were on the level of the first LLaMA in my benchmarks.
I'm expecting this version to be roughly comparable to Llama 2.
Did you even read the benchmarks they post at that link? Assuming they're not outright lying, their 8B model beats Llama/Mistral models of the same size on coding tasks.