Getting LLMs to a reliability rate that is on par with or superior to human performance is very achievable.
Source?