Stanford University’s Institute for Human-Centered AI (HAI) is pushing for changes in how AI skills are assessed. In a range of benchmark tests, the institute has found that AI capabilities now match those of an average person. Its annual AI Index reports examine the leading AI models and what they can do.
The latest review, a roughly 500-page report published in April, highlights that the best AI models now compete with humans on a variety of tasks, including image classification, language understanding, and visual reasoning. Progress has been especially rapid over the past decade, as benchmarks such as MATH, a collection of competition-level math problems, make clear.
For example, in a recent evaluation, OpenAI’s best model solved 84.3 percent of the problems on the MATH benchmark, a stark contrast to the 6.9 percent the best systems could solve in 2021. Despite these impressive gains, challenges remain: large language models still make errors and “hallucinate,” confidently producing plausible but false statements.
As language models and image generators continue to improve, Stanford researchers are now designing new tests that can better compare AI systems and pinpoint the areas where human skills still surpass those of artificial intelligence. The arrival of new models such as GPT-5 is expected to shape these tests further and to shed light on both the progress and the remaining challenges in the field.
In short, researchers at Stanford HAI conclude that AI performance now rivals that of an average person on many benchmarks. Yet for all the recent progress, real limitations remain, and addressing them will require new tests, and new models for assessing AI skills, in the years ahead.