Open Euro LLM: Open LLMs for Transparent AI in Europe - https://news.ycombinator.com/item?id=42922989 - Feb 2025 (279 comments)
So on average 1.87 million per participating institution which might amount to funding ~5 PhD students per institution. Not bad for a training program.
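For what it's worth, the arithmetic is easy to check. This is a rough sketch, not an official budget breakdown: the count of 20 institutions is an assumption taken from the press release wording quoted elsewhere in the thread, the currency is assumed to be EUR, and all names below are made up for illustration.

```python
# Back-of-the-envelope check of the per-institution funding figure.
# Assumptions (hypothetical, not confirmed by the project):
#   - 20 participating institutions ("a consortium of 20 research institutions")
#   - amounts are in EUR
#   - "~5 PhD students per institution" is the commenter's own rough estimate

PER_INSTITUTION_EUR = 1_870_000    # average share quoted in the comment
INSTITUTIONS = 20                  # assumption, see above
PHD_STUDENTS_PER_INSTITUTION = 5   # commenter's estimate


def implied_total(per_institution: int, institutions: int) -> int:
    """Total grant implied by an average share and a participant count."""
    return per_institution * institutions


def implied_cost_per_student(per_institution: int, students: int) -> float:
    """Funding per PhD student over the whole project, if all money went to students."""
    return per_institution / students


total = implied_total(PER_INSTITUTION_EUR, INSTITUTIONS)
per_student = implied_cost_per_student(PER_INSTITUTION_EUR, PHD_STUDENTS_PER_INSTITUTION)

print(f"Implied total grant: {total:,} EUR")
print(f"Implied funding per PhD student: {per_student:,.0f} EUR")
```

The per-student figure covers the whole project duration and ignores compute, overhead and staff other than PhD students, so treat it as an upper bound on what a student position could cost, not as a salary.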
The project was awarded the Sovereignty Seal, an EU mark of excellence, before it even started. This is truly in accordance with European values, where we reward participation and proclamation. I don't think we will ever hear from this project again.
Congratulations to the participants of the consortium for receiving this large EU grant. Thoughts and prayers to the students who will be writing the deliverable progress reports.
Yes. In my experience the government is happy with "looks good doesn't work" as long as it truly looks good.
Jesus the ideology in this place runs so thick with some people.
As a PhD ("Doktorand") student or graduate, you will get around 45,000 to 60,000 EUR in most jobs. Maybe some mega-corps like BMW or Siemens (or consulting, IB, etc.) pay more, but the vast majority of jobs with a "research background" in Germany will NEVER land you near 100,000 EUR or more.
---
As someone who is generally skeptical of programs like this (and a European), I see two remarkable and timely things about this one:
- This project doesn't just allocate money to universities or one large company; it includes top research institutions as well as startups and GPU time on supercomputing clusters. The participants are very well connected (e.g. also supported by HF, Together and the like, with European roots).
- DeepSeek has just shown that you probably can't beat the big labs with these resources, but you can stay sufficiently close to the frontier to make a dent.
Europe needs to try this. Will this close the gap to the US/China? Probably not. But it could be a catalyst for competitive open-source models and partially revitalize AI in Europe. Let's see.
PS: on Twitter there was a screenshot yesterday that in a new EU draft, "accelerate" was used six times. Maybe times are changing a little bit.
Disclaimer: Our company is part of this project, so I might be biased.
---
I hope the next time this is on HN, it's with some cool release and not a PR :).
(@mods please delete if copy-quoting not allowed)
[0] https://news.ycombinator.com/item?id=43119913
[1] https://sites.google.com/view/eurollm
1. large multilingual dataset
2. open science approach
3. competitive performance
Here is the HF blogpost that introduced it in December 2024 (along with various benchmarks): https://huggingface.co/blog/eurollm-team/eurollm-9b
The project's lead has summarized the situation succinctly in their LinkedIn post [0]
I hope the different communities collaborate openly, share their expertise, and don't decide to reinvent the wheel every time a new project gets funded. What next? "OpenEuroLLM with real cheese"?
[0] https://www.linkedin.com/posts/andre-martins-31476745_ai-art...
Deliverables:
- A series of models of different sizes for optimal effectiveness and efficiency (1B, 9B and 22B) trained on 4T tokens
- A multimodal model which can process and understand speech or text input
- Full project codebase available to the public with detailed data and model descriptions
I can't find the codebase yet though
No sarcasm, sorry.
Should it be done in secret? How did they manage to create CERN? Maybe there were no Reddit-like people commenting back then?
This really reads like a parody. Press release, “a consortium of 20 research institutions”, “awarded the STEP (Strategic Technologies for Europe Platform) seal”. Lots of grandiose self-congratulations. All with nothing to run, download or try of course.
https://openai.com/news/company-announcements/
> a consortium of 20 research institutions
https://aimagazine.com/machine-learning/google-invests-in-ai...
> awarded the STEP (Strategic Technologies for Europe Platform) seal
https://openai.com/index/strengthening-americas-ai-leadershi...
> Lots of grandiose self-congratulations
https://x.com/sama/status/1891533802779910471
> All with nothing to run, download or try of course.
This has... a statement of intent to try to copy that product. Not remotely the same.
Yes, it sounds like a parody or an Onion piece. We know the European search engine, cloud, and blockchain never got anywhere. I don't even believe that anybody ever really tried.
Now you have to put yourself in their head for 2 minutes and here is what I noticed by knowing a few of them (the "EU type").
In their perception of reality, it seems they really believe that if they declare something, it is real. This is why they get so deranged if you dare point to the facts or just ask questions. It seems they really believe they succeeded in all those projects. If they say it, it exists.
I am not really satisfied by the explanations we usually hear: they are incompetent, it is corruption or even insanity (some sort of mass hysteria that would take root in some institutions).
What I am wondering is: is there a concept in philosophy, or some similar pattern in previous civilisations, that could help us understand what is going on with the EU?
Because Gaia-X or OpenEuroLLM is one thing, but it is worrisome they now believe they can raise an army and declare war on everybody.
NOT when it comes to the level of violence and repression or quality of living. Those two things are world-class.
But in the sense that there's a more or less unelected political establishment that's
a) Recursive: It does things only to show them off to itself.
b) Not exposed to real-world consequences.
c) Has a non-falsifiable pretext to validate whatever it does and to caution against undoing it. For the Soviets, it was anti-capitalism. For the EU, it's some notion of safety or sustainability.
d) Inadvertently benefits itself and other elites and harms the people they pretend to protect.
My hope is that as a democratic institution, the EU is capable of reform.
And honestly, people don't _want_ the European bureaucracy to move fast. Case in point: the USA.
I'm German and yes, I would absolutely want it to move faster. And I guess you are an American?
Am I the only one who doesn't see any link to any model? Too many words, no actual outcome.
> "The models will be developed within Europe's robust regulatory framework...
The title should say “effort to train a model” or “plans”, not “series”! A series without having even one?
> truly open > including data, documentation, training and testing code, and evaluation metrics; including community involvement
> compliant > under EU regulations, OpenEuroLLM will provide a series of transparent and performant LLMs
> diverse > for European languages and other socially and economically interesting ones, preserving linguistic and cultural diversity
The first one seems good, but the second two seem to be pretty beside the point of creating models that compete with the cutting edge of China and the USA.
Others have responded to your "diversity" point, but making sure to train on adequate amounts of data in all EU languages is valuable, especially because LLMs are so prone to generating convincing BS when working close to the edges of their training set. If this exists, people in Malta are going to want to use it, so better for it to generate good Maltese than gibberish that sort of looks like Maltese, right?
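Hier spricht man Deutsch. [German: "German is spoken here."]
A 600 km à l'ouest, on parle français. [French: "600 km to the west, French is spoken."]
50 km na wschód, Polska. [Polish: "50 km to the east, Poland."]
360 χλμ βόρεια, Δανέζικα, Σουηδικά; 250 χλμ νότια, Τσεχία; 750 χλμ νοτιοανατολικά, Ουγγρικά; και τα λοιπά. [Greek: "360 km north, Danish, Swedish; 250 km south, Czechia; 750 km southeast, Hungarian; and so on."]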
Europe has a need that the other models aren't bothered by; they can do it, but more by happenstance than on purpose.
I guess some people are surprised police might get involved in a defamation case because in the US it's not a crime but a civil wrong? Which means you can't get help from the police to identify the person who made a defamatory tweet? Or something?
Also, if someone says something that could threaten my safety (either directly or through inciting others) I would very much like them to get a visit from the police. This situation is so easily avoided by not being a dick to people.
Yeah, if you are from the Netherlands and want the police showing up at your door, mention on Twitter that you want to shoot Mr. Wilders. Threatening to take someone's life has repercussions. How peculiar!
(Please don't do it. Example is just illustrative. Actually, I know a website with a forum where this happened approx 20 years ago. Server got seized. They didn't log. FDE, but obviously got broken at some point.)
Freedom of speech isn't that you can spout whatever you want and not face repercussions.
Besides that, there's Popper.
Furthermore, there's this thing called the chilling effect. You might wanna ask GOP Senators and Congressmen about that.
I have faith in LLMs and AI, as long as it is reproducible and transparent. Right now, when I use Mistral, it refers to sources. A step in the right direction.
What EU regulations? It is a moving target, and nobody knows exactly what applies. It would be nice to provide a list of regulations with references, and some test suite or checklist to verify that a given use of AI actually complies with them.
Right now, if I integrate a spellchecker into my app, I have no idea if I am breaking any EU AI regulations!!!
https://www.ai.se/en/ai-labs/natural-language-understanding/...
They've cooperated with a research agency known in part for their Prolog implementation, i.e. they've been at it since the last massive "AI" hype cycle.
It could even be a nice idea for a startup. All the data is publicly available...
It's basically a classic EU research push: first you try to regulate the new technology to oblivion, and then, when it becomes apparent that this stifles development in the EU, you bankroll many different projects with EU grants, often with limited success.
That project too seems dormant lately.
They just announce things and then the train leaves the station.
I don't believe anything was ever made public.
Although, as a Dutch person, I'd like to point out Brooklyn technically is Dutch. ;)
EU citizens badly need AI systems that are open and privacy-respecting. Getting together this rather large coalition of experts with quite some money and (importantly) access to compute power is a nice first step.
Let them play around, train some models, fail-and-get-up-again, start over, write papers and hopefully get some useful output. Remember, for the involved PhD students it will also be a learning experience!
Yes, it's only the first step. But jeez, it's a press release indicating the start of a scientific collaboration! Let's hold back on the negativity for a couple of years, until after they've had a chance.
I, for one, hope this will lead to success and wish the team the best.
There are plenty of AI systems that are open and privacy-respecting. In fact, any model you run on your own hardware is privacy-respecting. And open, for whatever that means.
And you'd see the same reaction if an "OpenMurica" LLM were announced. It's just weird and cringy to attach patriotism to something like this.
Did you expect OpenAI to release GPT in the press release that announced its creation as a company? Bullshit Silicon Valley startups do big press releases based on literally nothing all the time, but all of a sudden this is an issue if an academic European institution does it?
I hope the post is being bot-raided because otherwise I'll have to accept that the quality of thought on HN has gone down. I get the typical biased US-elitism that is pervasive on this website, but these reactions are just plain dumb.
I don't think it's (just) bots. I think it's the current strain of Silicon Valley arrogance, not unlike what helped create the current political landscape in America.
In real life: you either train your LLM on Anna's Archive, or get left behind with a sub-par model.
Especially given how notoriously bad the EU AI regulations are.