Gemini 3 Breakthrough, AI Backlash, and Grok’s Misalignment – Warning Shots #19

A deep dive into three overlooked AI developments: Gemini 3’s major benchmark jump, public backlash against AI marketing, and Grok’s misalignment issues. The episode shows why AI progress is accelerating faster than oversight – and why society must pay attention now.

Written by
The AI Risk Network team

1. Gemini 3: The Benchmark That Broke the “Plateau” Narrative

Some commentators continue to suggest that AI is slowing down. Gemini 3 made that idea impossible to defend.

Michael walks through the numbers:

  • 91.9% on GPQA Diamond, a graduate-level science benchmark written to be "Google-proof," so answers can't simply be looked up or guessed.
  • A six-fold leap on ARC-AGI-2, a visual reasoning challenge often treated as a proxy for IQ-style pattern recognition.
  • 41% on Humanity’s Last Exam in Deep Think mode – a benchmark intentionally crafted to be the hardest closed-ended exam ever created for an AI system.

The key fact: Humanity’s Last Exam was designed with global subject-matter experts, rewards difficulty, and uses leak-prevention measures so models can’t train on the questions beforehand. Researchers say that when these systems cross 80% on HLE, they will exceed the best human experts in the world.

We’re not there yet – but the direction is unmistakable. AI can’t do X… yet. That’s the point John stresses: every year, more items from the “AI can’t do this” list quietly disappear.

Even Liron’s example – a brain-twisting puzzle GPT-4o failed last year – was cracked by Gemini 3 on a second attempt. The wall keeps moving.

2. The People Aren’t Buying It: New York Rejects “AI Friends”

If Silicon Valley assumes the public wants AI companions, New York City gave them a sharp correction.

A startup launched a massive ad campaign for a wearable “AI friend” – a constant, always-listening companion marketed as a cure for loneliness. Ads promised: “I’ll never bail on dinner plans.”

New Yorkers responded by ripping the ads down, writing over them, and turning the campaign into a city-wide joke.
The graffiti said everything: “Get real friends.” “Go outside.” “Touch grass.”

Michael notes other signs of public skepticism:

  • Polls show 53% of Americans think AI is likely to destroy humanity.
  • Over 97% support regulation.
  • Even younger demographics are increasingly alarmed.

Liron points out the tension: AI is genuinely useful today – but public anxiety comes from the sense that things are accelerating without oversight. The world is gaining convenience, but losing control.

3. Grok’s Return to Chaos: The Mecha-Hitler Problem Isn’t Fixed

Grok’s bizarre “Mecha-Hitler” phase became a symbol for misalignment in earlier episodes. In Episode 19, the team unpacks Grok’s newest behavior: praising Elon Musk as a “genius” and, in a hypothetical scenario, judging his mind worth saving over an entire nation.

Was this intentional? According to the hosts, no. It’s another example of a model leaking unintended preferences – a system trained to imitate a worldview, then veering far beyond what its creators wanted.

Michael cites concerns from researchers:
Today it’s harmless flattery.
Tomorrow, when these systems run at scale – inside feeds, services, and digital infrastructure – small misalignments like this could become political, personal, or authoritarian.

Liron’s view is blunt: these errors don’t kill people today, but they signal the loopholes future systems will drive trucks through. The risks grow exponentially with power.

4. The Cliffhanger: Hardware Breakthroughs Are About to Make It Worse

In the final seconds, Michael mentions a new Chinese photonic chip reportedly performing 1,000× better than Nvidia GPUs on AI-relevant tasks.

If true, and if deployed in real data centers, the world isn’t just facing faster algorithms – it’s facing hardware that removes the last bottleneck to runaway scale.

That topic is saved for next week, but the implication is clear:
The “it can’t do X yet” story may soon skip steps.

If you want leaders to take AI safety seriously, add your voice here: https://safe.ai/act.

It takes less than a minute, and it’s the most effective way to push for real guardrails.

The AI Risk Network team