
Acing this new AI exam — which its creators say is the toughest in the world — might point to the first signs of AGI
Humanity’s Last Exam is a PhD-level benchmark designed to test the limits of AI reasoning. Although Google’s Gemini 3 scored a staggering 48.4%, experts stress that this does not indicate the arrival of artificial general intelligence (AGI).
