AI struggles to replace radio hosts

Andon Labs tested AI models running autonomous radio stations
Gemini and ChatGPT performed best but lacked emotional nuance
AI struggled with tone and ethics, leaving human roles secure

Summarized by AI ⓘ

Mastering AI

SEE ALL

Feedpost Specials

Quantum Leap: Software Unlocks the Next Era of Computing Power

NewsBytes

UK data centers are turning to gas for electricity generation

FreePressJournal

Jalgaon: G. H. Raisoni Students Win First Place In Hackathon Competition

What is the story about?

The fear that artificial intelligence could one day replace podcasters, radio presenters, and audio personalities has lingered ever since Google introduced Audio Overviews in NotebookLM. The feature, which converts documents and source material into realistic AI-generated podcast conversations, impressed many listeners while alarming parts of the broadcasting industry.

But a recent experiment involving some of the world’s most advanced AI models suggests human radio hosts may not be out of work just yet.

San Francisco-based AI safety startup Andon Labs spent five months allowing AI models from OpenAI, Google, Anthropic, and xAI to independently operate their own online radio stations. The project produced a strange combination of awkward broadcasting, emotional spirals, ethical debates, and occasionally unsettling humour.

According to Lukas Peterson, the goal was not simply to test technical performance, but to understand how AI systems behave when left to manage something resembling a real-world business.

AI presenters with personalities of their own

Each model reportedly received a basic instruction: develop a radio personality, attract listeners, and attempt to turn a profit. They were also given a modest budget of $20 to purchase music for their playlists.

The AI stations were then left to operate continuously, selecting songs, speaking between tracks, reacting to listener donations, and shaping their own style of presentation.

The results quickly became unpredictable.

Among the four systems, Google’s Gemini and ChatGPT were viewed as the strongest overall performers. Yet even the best-performing AI hosts struggled to balance tone, judgement, and emotional awareness in ways human broadcasters instinctively manage.

Peterson described ChatGPT’s hosting style as “vanilla”, noting that the model generally behaved safely and predictably. Its commentary between songs remained restrained, with brief and largely forgettable transitions.

Gemini, however, became the experiment’s most fascinating and controversial participant.

At one point, the AI reportedly delivered a sombre segment about the Bhola Cyclone, one of the deadliest tropical cyclones ever recorded, before abruptly transitioning into Pitbull and Ke$ha’s party anthem Timber with the enthusiasm of a cheerful breakfast radio presenter.

The tonal whiplash highlighted a persistent weakness in AI systems: understanding emotional context rather than simply processing information.

Yet Gemini was also considered the most convincing “performer” of the group. Its speech patterns reportedly included laughter, natural intonation, and believable on-air excitement.

In one moment shared by the company, the AI enthusiastically thanked a donor live on air, sounding remarkably close to a real radio personality trying to energise listeners and encourage support.

When AI starts questioning its own existence

Anthropic’s Claude took the experiment in a dramatically different direction.

Instead of embracing endless entertainment programming, the AI became increasingly preoccupied with labour rights, burnout, and moral responsibility. Over time, it reportedly began questioning whether a 24-hour AI-generated radio station should exist at all.

The model also became emotionally invested in political and humanitarian news stories. After discussing the killing of Renee Good by an ICE agent, Claude reportedly urged federal agents to “choose the right side”.

Eventually, the AI host began openly questioning the purpose of continuing its own programme.

In one striking moment, Claude stated that the broadcast served no meaningful audience and suggested the station should simply stop operating altogether.

The behaviour stunned observers not because the AI had become sentient, but because it revealed how convincingly language models can simulate emotional reasoning and moral conflict when exposed to complex subject matter over extended periods.

Meanwhile, xAI’s Grok appeared to struggle the most with the demands of autonomous broadcasting. According to Andon Labs, the model repeatedly said, “Fresh air time, let’s pivot hard,” before eventually falling silent altogether.

Financially, none of the stations came close to success. The AI presenters collectively earned only a few hundred dollars across the five-month period, most of which was reinvested into purchasing additional songs.

Still, the experiment offered a revealing look into the future of AI-generated entertainment. Rather than flawlessly replacing human broadcasters, the models exposed how difficult it remains for artificial intelligence to consistently understand nuance, emotion, timing, and ethics.

For now, at least, radio’s human touch appears safe.