It would be more accurate to say I packaged it. llamafile is a project I did for Mozilla Builders where we compiled llama.cpp with Cosmopolitan Libc so that LLMs can run as single-file portable binaries.

https://builders.mozilla.org/

Last year I concatenated the Gemma weights onto llamafile, called it gemmafile, and it got hundreds of thousands of downloads.

https://x.com/JustineTunney/status/1808165898743878108

I currently work at Google on Gemini, improving TPU performance. The point is that if you want to run this stuff 100% locally, you can. I and others did a lot of work to make that possible.
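For anyone curious how the "concatenation" works: a llamafile executable is also a ZIP archive, so GGUF weights can be appended to it with the zipalign tool from the llamafile repo, which keeps the tensor data page-aligned so it can be mmap()'d straight out of the binary. A rough sketch of the packaging step (file names here are placeholders, not the actual gemmafile release):

```shell
# Start from a stock llamafile binary; the same file doubles as a ZIP archive.
cp llamafile gemma.llamafile

# Append the GGUF weights uncompressed (-j0) so the tensor data stays
# page-aligned inside the archive and can be mapped directly into memory.
# zipalign is built from the llamafile source tree; see its README for details.
zipalign -j0 gemma.llamafile gemma-weights.Q4_K_M.gguf

# The result is one self-contained executable that runs the model locally:
./gemma.llamafile -p "Hello"
```

This is only a sketch of the general approach; the exact flags and build steps are documented in the llamafile project itself.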