> It's not an excuse. Anybody with half a working brain should've been able to tell that this was going to happen. You can't regulate a field in its infancy and expect it to ever function.
As I said, the core of the AI Act was written with supervised ML in mind, not generative ML, since generative ML wasn't a big deal pre-ChatGPT.
> You mean it falls on anyone that tries to compete with a model. There's a random 10^25 FLOP compute rule in there. The B300 does 2,500-3,750 TFLOPS at FP16. 200 of these can hit that compute number in 6 months, which means that in a few years' time pretty much every model is going to hit that.
As I also said, the foundation model stuff (including this FLOP threshold) is incredibly stupid. I agree with you on that, but my point is that the core of the AI Act was meant to cover the ML systems built since roughly 2010.
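For what it's worth, the arithmetic in the quote holds up. A quick back-of-envelope check in Python (taking the quoted B300 figure at face value and assuming 100% utilisation, which is generous):

```python
# Does 200 B300-class GPUs running for six months land near the
# EU AI Act's 10^25 FLOP training-compute threshold?
flops_per_gpu = 3_000e12          # ~3,000 TFLOPS FP16, midpoint of the quoted 2,500-3,750
gpus = 200
seconds = 6 * 30 * 24 * 3600      # roughly six months of wall-clock time
total_flop = flops_per_gpu * gpus * seconds
print(f"{total_flop:.2e} FLOP")   # -> 9.33e+24, right at the 1e25 threshold
```

So yes: a modest-by-frontier-standards cluster crosses the line in half a year, and the threshold is fixed while hardware keeps getting faster.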
> The copyright rule and having to disclose what was trained on also means that it will be impossible to have enough training data for an EU model. And this even applies to people that make the model free and open weights.
Again, you're talking about generative stuff (which makes sense, given how absurdly misleading the name is these days), whereas I'm talking about the original AI Act, which I read well before ChatGPT happened.
The training data disclosure is a tradeoff: copyright is far too invasive (IMO), and it's good that this information can be used for other purposes. That said, I personally would be very worried about an ML team that couldn't tell me what data went into their model. The data is core to every ML/AI approach, so that lack of understanding would make me very sceptical of any performance claims.
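And to be clear, keeping that record is basic engineering, not a research problem. A minimal sketch of a per-shard provenance manifest (the shard name, source URL, and licence tag are all hypothetical, just for illustration):

```python
import hashlib
import json

def manifest_entry(shard_bytes: bytes, name: str, source: str, licence: str) -> dict:
    """Hash one training shard and record where it came from."""
    return {
        "name": name,
        "sha256": hashlib.sha256(shard_bytes).hexdigest(),
        "source": source,
        "licence": licence,
    }

# One JSON line per shard; the accumulated file is the disclosure artifact.
shard = b"example training text"             # stand-in for real shard contents
entry = manifest_entry(
    shard,
    "web_crawl_000.txt",                     # hypothetical shard name
    "https://example.org/some-corpus",       # hypothetical source URL
    "CC-BY-4.0",                             # hypothetical licence tag
)
with open("training_manifest.jsonl", "a") as out:
    out.write(json.dumps(entry) + "\n")
```

If a team can't produce something like this, they genuinely don't know what went into their model.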
Let's be real: the AI companies don't want to say what's in their models because of the rampant copyright infringement, not because of any technical incapability.