SKYNET SMILES, LAUNCHES NEW PODCAST: Will AI make the podcast bro irrelevant?

Sometimes it feels, paradoxically, like AI has stopped advancing — maybe you feel that right now — but that’s only because time has slowed down for us, as we have become dazzled by all the amazing new AI leaps, occurring on an almost weekly basis.

For example, ChatGPT, which first brought AI truly into the public gaze, was only launched in late 2022, not even two years ago. Since then we have had GPT-3.5, GPT-4, Gemini, Copilot, Mistral, GPT-4o, and Claude's Opus and Sonnet. We’ve also had excellent music-making AI from the likes of Udio (here’s one of my favorite AI songs); likewise we’ve had great picture-making AI, good video-making AI, plus eerily humanoid AI robots (from Tesla and others). On top of that we’ve had AIs so verbally lifelike — for example, Microsoft’s short-lived “Sydney” — that people have seriously wondered if AIs are now conscious or sentient. Or self-aware in some other way we cannot quite comprehend.

The latest advance, unveiled by Google, is not as epochal as a truly conscious machine, but it should still blow your mind. It certainly blew mine when I encountered it the other day.

* * * * * * * *

The amazing feature is the so-called “Audio Overview” in Google’s NotebookLM, which you get by hitting a button marked “Generate.” Depending on the length of the text you have submitted, the machine will mull for a few minutes and then produce a two-person podcast based on the submitted words: an audio debate that could last five minutes, fifteen, or longer.

The fake human podcasters will vividly critique the text — pulling it apart, though more often enthusing about its virtues (like nearly all AIs, they tend towards flattery). And the podcast is highly convincing: the male and female voices sound extremely humanlike (and American). The podcasters joke and laugh, they swap stories, they occasionally digress (but not too much).

The best way to demonstrate this tech is to show you. Here is an article, about Keir Starmer, which I wrote for The Spectator.

And here is the synthesized podcast discussing it.

The breezy conversational tone between the two (synthesized) speakers, the “ums” and “ahs”: this is pretty astonishing stuff. And yet, perhaps it shouldn’t surprise us. If AI can be given a series of prompts and generate, in a minute or two, a half-decent digital illustration (the sort of thing I would have labored over for hours a decade ago, chopping up and assembling Shutterstock images in Photoshop), why shouldn’t it do the same thing with sound?

So before it starts building HAL 9000s and Terminator robots and begins its path towards total world domination, where does AI go next?