Perhaps also provide other researchers with vetted access. There are a lot of groups trying to evaluate these things systematically - for example "Faith and Fate", "Jumbled thoughts" and "Emergent abilities are a Mirage" were all very good papers published this year that really highlighted the hype in LLM evaluation.
Everyone can see that modern LLMs have some great capabilities. The flexibility you can get in an interface by doing intent detection and categorization with an LLM is great, and it is so much easier and quicker than using previous techniques. It's more expensive, but that's improving rapidly. I firmly believe that a new era of great new systems with better interfaces and more functionality will be built on LLMs and other models from this wave of Big Data / Big Model AI, but these are not the precursors of AGI.
The problem with the looky looky AGI bunkum show is that it's pulling money into crappy projects that are going to fail hard, and this will then stop a lot of money going into projects that could be successful fast. I am seeing the shape of the dotcom boom/bust in what's happening. Microsoft and Intel used dotcom to build and maintain their monopoly positions; I think AWS, MS and Google will do the same this time. I think we will see a wave of new companies like Amazon that will "fail": some of them will really fail and disappear, some will half fail like Sun did, but some will go on to build monopolies anew. In the meantime the technology will evolve not for the greater good but instead to serve purposes like advertising distribution that are trivial compared to the benefit we could have seen. Overall, we will not capitalise on the potential of what we have for several decades, ironically because of the failures of capitalism. Children will die, wars will be fought, but some of us will have nice sweatpants and fun playing paddleball in the sunshine while it all happens.
When historians write this up in 100 years they won't really see any of this - they will just see a huge surge of innovation. The dead have no voices...