Episode 196: Virtual Voice Audiobooks?


In this week’s episode, I answer a reader’s question about whether or not I will use KDP’s new Virtual Voice program to create AI-narrated audiobooks.

This coupon code will get you 50% off the audiobook of DRAGONSKULL: SHIELD OF THE KNIGHT (as excellently narrated by Brad Wills) at my Payhip store:

SPRINGSHIELD

The coupon code is valid through April 30th, 2024. So if you need a new audiobook for spring, we’ve got you covered!

TRANSCRIPT

00:00:00 Introduction and Writing Update

Hello, everyone. Welcome to Episode 196 of the Pulp Writer Show. My name is Jonathan Moeller. Today is April the 12th, 2024 and today we are talking about whether or not I will use Amazon Virtual Voice to produce audiobooks. Before we get to our main topic, we will have Coupon of the Week, some writing updates, and then a few random questions from readers. First up, let’s do Coupon of the Week. This episode will go live on Tax Day in the US, so let’s have a discount on an audiobook. This week’s coupon code will get you 50% off the audiobook of Dragonskull: Shield of the Knight, as excellently narrated by Brad Wills, at my Payhip store. That code is SPRINGSHIELD and that is SPRINGSHIELD again and that of course will be available in the show notes. This coupon code is valid through April 30th, 2024, so if you need a new audiobook for spring, we have got you covered.

Now let’s have some updates on my current writing projects. I am very nearly almost done with Wizard-Thief. I’m hoping to finish up edits shortly and actually publish it on either April the 15th or April the 16th. So when this episode goes out, I may be publishing it literally as you are listening to this. This book, like Half-Elven Thief, will be available on Amazon and Kindle Unlimited.

Next up, my main project after Wizard-Thief is published is going to be Cloak of Titans, the 11th book in the Cloak Mage series. It’s not the end of the series; I’m planning that there’s probably going to be about 15 books with four more after this one. But we are going to be blowing up a lot of the subplots in this book. So while this is not the end of the series, it will definitely feel like the end of a lot of plot arcs. I’m 22,000 words into it, and if all goes well, I’m hoping it will be out sometime toward the end of May and will be available on Amazon, Barnes and Noble, Kobo, Google Play, Apple Books, Smashwords, and Payhip.

In audiobook news, recording is underway for Ghosts in the Veils and if all goes well, that should be out sometime toward the end of May. So those are the updates on my current writing projects.

00:02:09 Reader Questions and Comments/Question of the Week

Before we get to the Question of the Week, let’s have a couple of unrelated questions from readers. Cameron wrote in to ask: I’m just interested in knowing how the name Calliande came about and does it have any meaning? I originally thought up the name Calliande because I wanted a unique and distinctive sounding name for the character, and so I was looking at various French and Welsh names that started with C and were about that length. I was rearranging the letters and swapping the vowels out and I came across that name and I thought, you know, that works, we’re going with that.

Amusingly, when I first wrote Frostborn, in my head it was pronounced Callian-DAY. But then when I did the first couple of Frostborn audiobooks from Podium way back at the end of the 2010s, Steven Crossley, the narrator, pronounced it Calli-AND, and ever since then, because that’s the direction he went, the official pronunciation has been Calli-AND and that’s the way it’s been pronounced in all subsequent audio appearances of that character.

Our next question is from Scott, who asks about a screenshot of the PC game Pillars of Eternity I posted on Facebook the other day. Scott says: that’s on my Steam Wish List, but I haven’t gotten to it yet. What do you think of it? I like it. I am enjoying it. I’ve had it since 2014, but I’ve decided the time has finally come to buckle down and finish it. If you played the original Baldur’s Gate back in the ‘90s or Knights of the Old Republic or Icewind Dale or Planescape: Torment back in the ‘90s, then you will enjoy Pillars of Eternity. It’s definitely worth playing. It’s also an old enough game now that it should work on most systems, and the game’s owned by Microsoft now, so if you have Xbox Game Pass, you can play it as part of your subscription on your Xbox.

Now it’s time for Question of the Week, which we ask to have interesting discussions and maybe find out some good suggestions for things we might not have thought of otherwise. And so this week’s question: if you listen to podcasts, what podcasts do you listen to most frequently? No wrong answers, obviously.

MacKenzie says I have four podcasts on consistent subscription: The Art of Manliness, The Black Pants Legion, A Delta Green actual play podcast (content warning for very dark humor), and of course The Pulp Writer Show. Thanks, MacKenzie.

Maaike says: currently just one, Kick in the Creatives hosted by Sarah Busby and Tara Roskell. Through them I discovered a few more, but I haven’t really found the time to listen to them just yet. I don’t know whether I can write or not, but I do know I can draw and paint, so that’s what I’m focusing on. Thinking of doing NaNoWriMo though to see how my writing is. It’s definitely worth trying NaNoWriMo just once for the experience so you can see whether you enjoy it or not.

Michael says only two regularly, The Pulp Writer Show (thanks, Michael!) and the Legend of the Bones, which is an epic, gritty D&D solo play narrative where the dice rule.

Perry says The Pulp Writer Show (not sure if you’ve heard of it) and The Self-Publishing Show (currently on episode 33 of 400 plus).

Anne Marie says Cabinet of Curiosities by Aaron Mahnke.

Jesse says: mostly Critical Role.

Justin says: I didn’t start listening to podcasts until I went full time with my current job in 2021. I listen to a bunch now, but most of my regular listens are The Glass Cannon Network, What Culture Wrestling, What Culture Gaming, and Adeptus Ridiculous.

It’s interesting how I actually haven’t heard of most of these podcasts, which I guess goes to show how diverse and widespread the podcast ecosystem is, where if you have a podcast that can be very famous in a specific niche, it might be like THE podcast in that niche, but anyone who’s not familiar with that particular subject of interest may have never heard of the podcast.

For myself, I did not really start listening to podcasts until 2019, which is when I started listening to some of the self-publishing ones. In the past few years, I have also discovered retro video game podcasts. In that time, I’ve mostly listened to The Sell More Books Show and the Remember The Game podcast about retro video games, which is quite funny (but it does have some foul language, so if you check that out, be aware of it).

00:06:40 Main Topic: Amazon Virtual Voice Audiobooks

Now on to our main topic of the week: Amazon Virtual Voice audiobooks. This was prompted by a question from Reader PML, who wrote in to ask: several of my favorite authors have opted to use AI Virtual Voice to release some of their older titles in audio format. I emailed you a while back hoping for more audio releases for Caina and Nadia. You indicated that audio publishing is expensive and you preferred to release the titles that were not short in length. I totally understand, but I wondered if you have considered releasing your back titles using Virtual Voice. The performance is not bad and I would really enjoy listening to all the books featuring Caina and Nadia. I don’t know what the pricing scale is, but it’s probably quite a bit less than a live reader. So thank you, PML for that question and for listening to all those audiobooks.

If you are not familiar with the term, Virtual Voice is Amazon’s new program for creating AI narrated audiobooks. Will I be using Virtual Voice to turn some of my older titles to audiobooks? No. Why? So there’s three levels to my answer here. One, is it ethical to use AI for audiobook narration? Two, is AI narration good enough for audiobook narration? Three, does this help visually impaired listeners? I should mention that I have in fact experimented quite a bit with AI narrated audiobooks. Part of the reason I did this was because I wanted to understand the technology so I had an informed opinion about it.

Google Play beat Amazon to the punch about two years ago, and I experimented with turning the Silent Order series into audiobooks with their technology, since I don’t think the Silent Order series sells well enough to support audiobooks. After that experiment, I didn’t think the AI generated audiobooks were good enough to sell in good conscience and just because you’re selling something doesn’t mean anyone will buy it. More on that to come. So instead, I put those AI narrated audiobooks on YouTube for free. That said, I did turn on AdSense for the audiobooks, so I made a satisfactory, if small bit of money from YouTube ads in 2023.

Overall, the response from people who listen to those audiobooks seemed to be that they loved the story (thanks, everyone!), but they hated the artificial voice. Like if they had actually paid for it instead of listening to it for free on YouTube, I could just imagine the complaints. I think a lot of the authors who create Virtual Voice audiobooks and audiobooks using similar products from Google Play or other companies will be disappointed by the response they get for those audiobooks. Like I’ve said before, audiobooks are basically self-publishing on hard mode. But if you’re coming to the market with an AI generated audiobook, it will be even harder to sell than one voiced by a human who knows what he or she is doing.

So with that sort of background in mind, let’s go on to the details for the answer to my question. One: is it ethical to use AI for audiobook narration? Ethics in AI is a bottomless quagmire of an Internet discussion. Overall, in my personal opinion, I think AI technology creates vastly more problems than it solves and is really nothing more than a very fancy autocomplete. I also suspect there’s a bit of a speculative bubble to AI technology like there was with cryptocurrency and NFTs. For a while, all the Galaxy Brain influencer people thought crypto and NFTs were the future, and then the bubble burst and a significant portion of everything connected to crypto and NFTs turned out to be a big old scam and all the Galaxy Brains migrated over to touting AI. I suspect a lot of the AI technology rushed out now has the same speculative bubble effect and when the bubble bursts, some companies are going to be out billions since they spent all that money building infinite crap generators.

A lot of people are rushing to shove AI into stuff because it’s trendy and not because it’s useful, like how (this is a 100% true story), the Washington State Lottery decided for whatever reason to put an AI image generator on its site, which it had to pull down hastily when that image generator started creating deepfake nude images of its users. It is also amusing how some of the really pro-AI Galaxy Brains like to say that the US needs to develop AI or else the Chinese will get it first, as if having an infinite crap generator to make deepfake nudes will somehow determine geopolitical dominance in the 21st century. But all that said, I don’t think AI is going to go away. The US courts seem (so far at least) consistent in their opinion that AI isn’t plagiarism but isn’t copyrightable, and there’s a wide range of useful activity in the not copyrightable but not plagiarism space. This might change if something gets all the way up to the Supreme Court or if Congress passes some legislation on that or the EU puts out new regulations that the companies have to follow because the EU is such a big part of their market. But for now, that seems to be the position.

AI can do useful things that crypto and NFTs can’t. Like for example, suppose you’re applying to 40 different jobs and you can use ChatGPT or Microsoft Copilot to crank out 40 different customized cover letters for your job applications. Given how messed up the job market is at the moment, I could hardly blame someone for doing that. And you see examples of people using generative AI not to create artwork, but to handle data processing type chores (like the cover letters) in clever ways that don’t seem to cross any moral or ethical boundaries. So I suspect everyone will have to examine their own consciences and decide where their own line is for generative AI.

For me, I decided I’m not going to sell anything that I didn’t make myself, or in the case of an audiobook, was made by a human I hired. If I’m selling something, it was 100% written by Jonathan Moeller or 100% narrated by a human I hired, and the cover image doesn’t include any AI generated art elements. This is also true of books and stories I give away for free, like my permafree series starters. That’s where I’ve decided my line is going to be with AI usage.

I have used AI images for Facebook ads, since ads are low resolution anyway and you often have to change out the image every week or so. Ad images are essentially disposable, and I’ve heard people say AI art is also disposable, so why not use the disposable products of AI art for ad images?

Number two, is AI narration good enough for audiobook narration? All of my criticisms of AI aside, AI voice or Virtual Voice isn’t a new technology. It’s just improved text to speech synthesis technology and text to speech has been around since the late 1960s. The AI part just makes the synthetic voice sound closer to an actual human voice than the more obviously artificial tones of older technology. It’s also pretty good at imitating the real human voice by now, which is why you can go on YouTube and see comedy videos of President Biden trying to make his way through Skyrim or something. Is this AI narration good enough to support creating a paid audiobook? Well, kind of, sorta. It’s good enough now that it creates a near perfect imitation of a human voice.

The trouble is that the voice is so perfect that it triggers the uncanny valley effect, which is when you encounter something that almost seems human but isn’t. It’s also really bad at emotion. The best narrators make it sound like they’re telling a story, and that means varying the emotion of the voice at appropriate times, even if you’re not trying to create a distinctive voice for each character. Text to speech simply isn’t very good at that. That’s part of the reason I won’t use Virtual Voice. I don’t feel the end product is of high enough quality to sell. Give it away for free on YouTube? Sure. But sell? Definitely not. It would be good enough for very dry nonfiction things like legal casebooks, geological and oil surveys, that kind of thing. A nonfiction book that required varied emotion, like a war memoir, for instance, or a comedic travelogue, would not work at all well with AI narration.

And finally, number three: does this help visually impaired listeners? While I don’t want to use AI narration to create paid audiobooks, I would like to see the technology become more ubiquitous and more integrated with ereader apps and operating systems. I think the mission of technology is to help us overcome or ameliorate the inherent frailties of the human condition. That is the best and most ethical use of technology. So I would like to see AI narration eventually become just a button in the ereader app for visually impaired listeners. Like you hit the read aloud button and then the computer reads to you in a voice of your choosing. You’ll still have the option to buy a human narrated audiobook if available, but the option to have the device read to you would be there if you want or need to use it.

We’re already kind of there, technology-wise. All the major operating systems for computer and mobile have read aloud functions. It’s just not implemented consistently across platforms and the voices aren’t always very good. I won’t use Virtual Voice or AI narration to create any audiobooks for sale. Unless something drastically changes in the field, I don’t think I’m going to change my mind on that, though of course anything is possible. In the spirit of full disclosure, as of right now (as of this recording on April 12th, 2024), I have agreements with four different narrators to produce four different audiobooks, so I think I am literally putting my money where my mouth is.

So that is it for this week. Thank you for listening to The Pulp Writer Show. I hope you found the show useful. A reminder that you can listen to all back episodes on https://thepulpwritershow.com. If you enjoyed the podcast, please leave a review on your podcasting platform of choice. Stay safe and stay healthy and see you all next week.

Written by: Jonathan Moeller

One Comment

  1. Jim E.
    April 16, 2024

    How about a pronunciation guide for the names of the dragons in the Cloak series.
