But the actual model architecture is slightly different; it is based on Pythia.
I guess what is needed is a pythia.cpp (see https://github.com/ggerganov/llama.cpp/issues/742)