> It's not an excuse. Anybody with half a working brain should've been able to tell that this was going to happen. You can't regulate a field in its infancy and expect it to ever function.
As I said, the core of the AI Act was written with supervised ML in mind, not generative ML, since generative ML wasn't a big deal pre-ChatGPT.
> You mean it falls on anyone that tries to compete with a model. There's a random 10^25 FLOP compute rule in there. The B300 does 2,500-3,750 TFLOPS at FP16. 200 of these can hit that compute number in 6 months, which means that in a few years' time pretty much every model is going to hit that.
As I also said, the foundation model stuff (including this FLOP threshold) is incredibly stupid. I agree with you on that, but my point is that the core of the AI Act was meant to cover the ML systems built since roughly 2010.
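For what it's worth, the arithmetic in the quote holds up. A quick back-of-envelope check in Python (taking the quoted B300 figure at face value and assuming 100% utilisation, which is generous):

```python
# Does 200 B300-class GPUs running for six months land near the
# EU AI Act's 10^25 FLOP training-compute threshold?
flops_per_gpu = 3_000e12          # ~3,000 TFLOPS FP16, midpoint of the quoted 2,500-3,750
gpus = 200
seconds = 6 * 30 * 24 * 3600      # roughly six months of wall-clock time
total_flop = flops_per_gpu * gpus * seconds
print(f"{total_flop:.2e} FLOP")   # -> 9.33e+24, right at the 1e25 threshold
```

So yes: a modest-by-frontier-standards cluster crosses the line in half a year, and the threshold is fixed while hardware keeps getting faster.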
> The copyright rule and having to disclose what was trained on also means that it will be impossible to have enough training data for an EU model. And this even applies to people that make the model free and open weights.
Again, you're talking about generative stuff (which makes sense, given how absurdly misleading the name is these days), whereas I'm talking about the original AI Act, which I read well before ChatGPT happened.
The training data disclosure is a tradeoff: copyright is far too invasive (IMO), and it's good that this information can be used for other purposes. That said, I personally would be very worried about an ML team that couldn't tell me what data went into their model. The data is core to every ML/AI approach, so that lack of understanding would make me very sceptical of any performance claims.
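And to be clear, keeping that record is basic engineering, not a research problem. A minimal sketch of a per-shard provenance manifest (the shard name, source URL, and licence tag are all hypothetical, just for illustration):

```python
import hashlib
import json

def manifest_entry(shard_bytes: bytes, name: str, source: str, licence: str) -> dict:
    """Hash one training shard and record where it came from."""
    return {
        "name": name,
        "sha256": hashlib.sha256(shard_bytes).hexdigest(),
        "source": source,
        "licence": licence,
    }

# One JSON line per shard; the accumulated file is the disclosure artifact.
shard = b"example training text"             # stand-in for real shard contents
entry = manifest_entry(
    shard,
    "web_crawl_000.txt",                     # hypothetical shard name
    "https://example.org/some-corpus",       # hypothetical source URL
    "CC-BY-4.0",                             # hypothetical licence tag
)
with open("training_manifest.jsonl", "a") as out:
    out.write(json.dumps(entry) + "\n")
```

If a team can't produce something like this, they genuinely don't know what went into their model.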
Let's be real: the AI companies don't want to say what's in their models because of the rampant copyright infringement, not because of any technical incapability.