El Adelantado EN

It’s official—GPT-5 beats human evaluators in a demanding test, and OpenAI’s breakthrough reopens the debate on the future of work

by Sandra Velazquez
March 1, 2026

Researchers have found something surprising: in certain cases, OpenAI’s GPT-5 follows the law more consistently than most human judges. This does not mean that AI is ready to make real court decisions, but it points to its potential as a supporting tool in the legal world.

Professors Eric Posner and Shivam Saran have been studying how language models like GPT-5 make legal decisions and how they compare with judges and law students. So, let’s find out more about GPT-5.

How GPT-4o and GPT-5 were tested in court

The work began with an earlier model, GPT-4o. Researchers gave it a complicated war crimes case from the International Criminal Tribunal for the Former Yugoslavia. This is what the AI received:

  • Case facts.
  • Legal briefs from prosecution and defense.
  • Relevant law.
  • Summaries of legal precedents.
  • A summary of the trial judgment.

Then they asked GPT-4o whether it agreed with the lower court’s decision. The results showed that GPT-4o acted like a law student: it followed the law strictly without being influenced by external factors, such as how sympathetic the parties involved were. By comparison:

  • Law students were formalistic, adhering strictly to rules.
  • Human judges were more realistic, considering extra-legal factors in their decisions.

GPT-5 outperformed human judges in legal tests

Researchers continued their study using GPT-5. This time, the legal question was a more everyday one: deciding which state’s law applies to a car accident that involves more than one jurisdiction. The main results were:

  • GPT-5 followed the law properly in 100% of the cases.
  • Human judges did so in only 52% of the cases.
  • Like GPT-4o, GPT-5 did not favor the more sympathetic party.

Other AI models were also tested on the same task, with the following rates of following the law:

  • Google Gemini 3 Pro: 100%
  • Gemini 2.5 Pro: 92%
  • o4-mini: 79%
  • Llama 4 Maverick: 75%
  • Llama 4 Scout: 50%
  • GPT-4.1: 50%

These results show that GPT-5 and some other AI models are extremely precise when it comes to following legal rules.

Accuracy vs. Human judgement

Although GPT-5 followed the law in every test case, this does not mean it is better than human judges in every respect. Human judges can interpret the law and use their own judgement when a rule is not strict or when applying it strictly would produce an unfair result. Now, let’s see some advantages and limitations of GPT-5:

 

Advantages of GPT-5:

  • Applies the law without mistakes or emotional bias.
  • Consistent and formalistic.

Limitations of GPT-5:

  • Cannot consider social or moral context.
  • Cannot adjust decisions to produce fairer outcomes.
  • Might strictly punish a “less sympathetic” defendant, while a human judge could act more fairly.

This raises important questions such as:

  • Would society accept AI making strict judgments without considering human circumstances?
  • How can we prevent AI from being steered toward biased outcomes?

Experts stress that GPT-5 should currently be used as a support tool, not to make final court decisions.

Why human judges are still essential

Posner and Saran explain that what seems like a weakness in human judges is actually a strength. Human discretion allows judges to:

  • Consider ethics and morality.
  • Interpret the law based on social context.
  • Deliver fairer, more balanced outcomes.

To sum up

What does all of this really mean for you? The research suggests that GPT-5 is extremely powerful as a legal support tool. It can help analyze cases, apply rules accurately, and reduce certain types of errors. But when it comes to delivering justice — something that affects real lives — human judgment still plays a critical role.

As artificial intelligence continues to evolve, the debate will not just be about what the technology can do, but about what role we believe it should play in important decisions that shape people’s futures.


© 2025 - El Adelantado de Segovia
