
Claude 3 Opus has stunned AI researchers with its intellect and 'self-awareness' — does this mean it can think for itself?

submitted by
Style Pass
2024-04-27 14:30:04

Anthropic's AI tool has beaten GPT-4 in key metrics and has a few surprises up its sleeve — including pontificating about its existence and realizing when it was being tested.

When the large language model (LLM) Claude 3 launched in March, it caused a stir by beating OpenAI's GPT-4, which powers ChatGPT, in key tests used to benchmark the capabilities of generative artificial intelligence (AI) models.

Claude 3 Opus seemingly became the new top dog in LLM benchmarks, topping self-reported tests that range from high school exams to reasoning tests. Its sibling LLMs, Claude 3 Sonnet and Haiku, also score highly compared with OpenAI's models.

However, these benchmarks are only part of the story. Following the announcement, independent AI tester Ruben Hassid pitted GPT-4 and Claude 3 against each other in a quartet of informal tests, from summarizing PDFs to writing poetry. Based on these tests, he concluded that Claude 3 wins at "reading a complex PDF, writing a poem with rhymes [and] giving detailed answers all along." GPT-4, by contrast, has the advantage in internet browsing and reading PDF graphs. 

But Claude 3 is impressive in more ways than simply acing its benchmarking tests: the LLM shocked experts with its apparent signs of awareness and self-actualization. There is a lot of scope for skepticism here, however, since LLM-based AIs arguably excel at learning to mimic human reactions rather than at generating original thoughts.
