For anyone interested in playing around with the internals of LLMs without needing to worry about having the hardware to train locally, a couple of projects I've found really fun and educational:
- Implement speculative decoding for two different-sized models that share a tokenizer [0]
- Enforce structured outputs through constrained decoding (a great way to dive deeper into regex parsing as well).
- Create a novel sampler using entropy or other information about token probabilities
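The speculative decoding project is a good illustration of how little machinery the core algorithm needs. Here's a minimal sketch of the accept/reject step in plain Python, using fixed toy probability tables in place of real draft and target models (all names here are made up for illustration; a full implementation would also sample one bonus token from the target after all k drafts are accepted):

```python
import random

# Toy vocabulary and next-token distributions standing in for real
# models. In practice these would come from a small, fast draft model
# and a large, slow target model that share a tokenizer.
VOCAB = ["the", "cat", "sat"]

def draft_probs(prefix):
    # Hypothetical draft model: a slightly off-target distribution.
    return {"the": 0.5, "cat": 0.3, "sat": 0.2}

def target_probs(prefix):
    # Hypothetical target model: the distribution we must match exactly.
    return {"the": 0.4, "cat": 0.4, "sat": 0.2}

def speculative_step(prefix, k=4, rng=random):
    """Propose up to k tokens with the draft model, then accept/reject
    each one so the output is distributed exactly as if sampled from
    the target model alone (the standard speculative-sampling rule)."""
    out = list(prefix)
    for _ in range(k):
        q = draft_probs(out)
        p = target_probs(out)
        tok = rng.choices(list(q), weights=list(q.values()))[0]
        # Accept the draft token with probability min(1, p(tok)/q(tok)).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)
        else:
            # On rejection, resample from the residual max(0, p - q),
            # renormalized, then stop accepting further draft tokens.
            resid = {t: max(0.0, p[t] - q[t]) for t in p}
            z = sum(resid.values())
            if z > 0:
                out.append(rng.choices(list(resid),
                                       weights=list(resid.values()))[0])
            else:  # p == q everywhere; fall back to sampling from p
                out.append(rng.choices(list(p),
                                       weights=list(p.values()))[0])
            break
    return out
```

The point of the accept/reject rule is that even though the cheap draft model does most of the sampling, the tokens that come out are distributed according to the expensive target model.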
The real value of open LLMs, at least for me, has been that they aren't black boxes: you can open them up and take a look inside. For all the AI hype, it's a bit of a shame that so few people seem to really be messing around with the insides of LLMs.
The models used in this experiment - deepseek-r1:8b, mistral:7b, qwen3:8b - are tiny. It's honestly a miracle that they produce anything that looks like working code at all!
I'm not surprised that the conclusion was that writing without LLM assistance would be more productive in this case.
That isn't an open source model, but a quantized version of GLM-4.5, an open-weight model. I'd say there's hope yet for small, powerful open models.
Right now, you need the bigger models for good responses, but in a year's time?
So the whole exercise was a bit of a waste of his time; the target moves too quickly at present. This isn't a time to be clutching your pearls about running your own models unless you want to do something shady with AI.
And just as video streaming was advanced by the porn industry, a lot of people are watching the, um, "thirsty" AI enthusiasts for the big advances in small models.
Be one of the few humans still pretty good at using their own brains for those problems LLMs can't solve, and you will be very employable.
If something cannot be reproduced from sources which are all distributed under an OSI license, it is not Open Source.
Non public sources of unknown license -> Closed source / Proprietary
No training code, no training sources -> Closed source / Proprietary
OSI public source code -> Open Source / Free Software
These terms are very well defined. https://opensource.org/osd
They do continue to require the core freedoms, most importantly "Use the system for any purpose and without having to ask for permission". That's why a lot of the custom licenses (Llama etc) don't fit the OSI definition.
Poor move IMO. Training data should be required to be released for a model to be considered open source. Without it, all I can do is tweak weights, etc. Without the training data I can't truly reproduce the model, inspect the data for biases, audit the model for fairness, or make improvements and redistribute them (a core open source ethos).
Keeping the training data closed means it's not truly open.
For a (somewhat extreme) example, what if I use the model to write children's stories, and suddenly it regurgitates Mein Kampf? That would certainly ruin the day.
Yes. And you're using them wrong.
From the OSD:
> The source code must be the preferred form in which a programmer would modify the program.
So, what's the preferred way to modify a model? You get the weights and then run fine-tuning with a relatively small amount of data. Which is way cheaper than re-training the entire thing from scratch.
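The cost asymmetry can be made concrete with a toy, framework-free sketch (pure Python, nothing here is any real library's API): "pretraining" fits a tiny logistic model from scratch over thousands of updates, while "fine-tuning" just continues gradient descent from the existing weights for a handful of cheap steps on new data.

```python
import math

def train(data, w=0.0, b=0.0, steps=100, lr=0.5):
    """One-feature logistic regression via plain SGD. Starting from
    w = b = 0 stands in for pretraining from scratch; passing in
    existing weights stands in for fine-tuning."""
    for _ in range(steps):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

# "Pretrain" from scratch on a larger dataset (the expensive part):
# 100 points, labels flip at x = 0.55.
base = [(x / 10, 1.0 if x > 5 else 0.0) for x in range(-50, 50)]
w, b = train(base, steps=200)  # 20,000 updates

# "Fine-tune": nudge the model to score x = 0.3 as positive, starting
# from the pretrained weights, with only 50 cheap updates and no
# access to the original training set.
tune = [(0.3, 1.0)]
w2, b2 = train(tune, w, b, steps=50)
```

The analogy is loose, but the shape of the argument is the same: the second phase reuses the expensive artifact and only pays for a small amount of new data and compute.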
---
The issue is that normal software has no way to modify the binary artifacts short of completely recreating them, while AI models not only allow it but make it far cheaper than recreating from scratch. The development lifecycle has nodes that don't exist for normal software.
Which means that really, AI models need their own different terminology that matches that difference. Say, open-weights and open-data or something.
Kinda like how Creative Commons is a thing because software development lifecycle concepts don't map very well to literature or artwork either.
> https://www.llama.com/ - "Industry Leading, Open-Source AI"
> https://www.llama.com/llama4/license/ - “Llama Materials” means, collectively, Meta’s proprietary Llama 4
Either the team that built the landing page (Marketing dept?) is wrong, or the legal department is wrong. I'm pretty sure I know who I'd bet on to be more correct.
The sad part is that it's working. It's almost like Meta is especially skilled at mass public manipulation.
That's why we keep being annoying about "Free Software."
Can we all please stop confusing Free/Libre Open Source with Open Source?
https://www.gnu.org/philosophy/open-source-misses-the-point....
Maybe if we'd focused on communicating the ethics, the world wouldn't be so unaware of the differences.
I was attempting to point out that when software is called Open Source and actually is based on OSI-licensed sources, then people are likely talking about Free Software.
Too much communicating of the ethics would have bogged down the useful legal work.
My take is, Free Software actually won and we're in a post-that world.