Sounding out

Podcasts are great. Here’s how you can get started

When our hosting provider Substack added free podcast hosting, we jumped in feet first and recorded audio episodes of all our essays. What makes Substack perfect for us is that the essay can be published in full and sits below the audio player. Reading (words on a page) is, for us, the most emotionally satisfying and cognitively beneficial way of increasing our knowledge, so we really wanted the podcast to be an accoutrement to the writing rather than the centrepiece. Substack also gives us an RSS feed, and all you then have to do is submit that feed link to a podcast directory such as Apple or Spotify.

There are plenty of other podcast hosting services out there: Craig Mod uses Simplecast for his excellent SW945 podcast. Transistor is probably the most polished of the lot, with analytics and lots of great embedded controls. These services will set you back mucho dinero, so you will need to gauge the value of throwing yourself in with the podcasting lot.

The other option is to create a DIY RSS feed and store your audio files on Amazon S3 or another server that supports HTTP HEAD requests. The podcast feed resembles a regular RSS feed except for the iTunes-specific tags. You can follow John Peart’s tutorials on making a podcast for his static site if you want to give it a go.
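To give a feel for what a DIY feed involves, here is a minimal sketch in Python using only the standard library. It is an illustration rather than a complete feed: the function name `build_feed`, the episode fields, and the example.com URLs are all hypothetical, and directories such as Apple ask for more tags (artwork and a category, for instance) than are shown here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

# Namespace used by the iTunes-specific podcast tags.
ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"
ET.register_namespace("itunes", ITUNES_NS)


def build_feed(title, link, description, episodes):
    """Return a bare-bones podcast RSS 2.0 document as a string."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    ET.SubElement(channel, "language").text = "en"
    ET.SubElement(channel, f"{{{ITUNES_NS}}}explicit").text = "false"

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "guid").text = ep["url"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure points at the audio file; length is its size in bytes.
        ET.SubElement(item, "enclosure", {
            "url": ep["url"],
            "length": str(ep["size_bytes"]),
            "type": "audio/mpeg",
        })
        ET.SubElement(item, f"{{{ITUNES_NS}}}duration").text = ep["duration"]

    body = ET.tostring(rss, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical episode data, for illustration only.
feed = build_feed(
    "Sounding out",
    "https://example.com/",
    "Audio versions of our essays.",
    [{
        "title": "Episode 1",
        "url": "https://example.com/audio/episode-1.mp3",
        "size_bytes": 12345678,
        "duration": "00:12:30",
        "published": datetime(2020, 1, 6, 9, 0, tzinfo=timezone.utc),
    }],
)
print(feed)
```

Save the output as something like feed.xml next to your audio files and that URL is what you would submit to the directories.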

But first you have to record the damn thing. If you’re like us and are blessed with thin, tinny, nasal voices, you probably need to consider alternatives to self-recording. Some very professional services, such as voices.com, will use actors (or at least people with sonorous baritone or sultry soprano tones) to voice your words, or you can always try fiverr.com. The other way is to use one of the AI-powered, natural-voice text-to-speech converters. We use Google’s Wavenet to podcastify our essays, but there are others like Amazon Polly. Both support Speech Synthesis Markup Language (SSML), which we’ll come to in a second, and let you download sound files for chunks of text. We chose Wavenet because it’s just more convenient to use. All you need to do is create a Google Cloud Console account and get your Wavenet API key. Add the Wavenet for Chrome extension to, well, your Chrome browser, and configure it with your API key. Then, this is how we go about converting entire essays to speech:

  1. We paste the entire essay into an online plain-text processor, like editpad.

  2. We tidy up the text, adding periods and other punctuation that the AI processor will need. For example, the processor will run sentences together if we miss a period between paragraphs, say in block quotes.

  3. Since the AI can’t really distinguish between heteronyms, we replace such problematic words with their phonetic spellings. For example, for “read” we write “reed” for the present tense and “red” for the past tense and past participle.

  4. When we have fixed the text to be AI-ready, we simply select all of it, right-click, and then download our MP3 file using the extension.
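If you would rather script the tidying than do it by hand, steps 2 and 3 can be roughly sketched in Python. This is not how the Chrome extension works internally; the function names, the respelling table, and the 4500-byte chunk size are our assumptions for illustration (check your text-to-speech service for its actual per-request limit). Respellings are given as whole phrases because only you know which “read” is which.

```python
import re


def tidy_for_tts(text, respellings=None):
    """Close every paragraph with punctuation and apply phonetic respellings."""
    respellings = respellings or {}
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    fixed = []
    for p in paragraphs:
        p = re.sub(r"\s+", " ", p)  # collapse stray line breaks and double spaces
        # Ignore trailing quotes and brackets when checking for end punctuation.
        core = p.rstrip("\"'”’)]")
        if not core.endswith((".", "!", "?", "…", ":")):
            p += "."
        fixed.append(p)
    out = "\n\n".join(fixed)
    # Hypothetical table: exact phrase as written -> phrase as it should sound.
    for written, spoken in respellings.items():
        out = out.replace(written, spoken)
    return out


def chunk_text(text, max_bytes=4500):
    """Split text into pieces of at most max_bytes (UTF-8), breaking after sentences.

    A single sentence longer than max_bytes is kept whole rather than cut mid-word.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate.encode("utf-8")) > max_bytes:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


essay = "A block quote with no full stop\n\nI read it yesterday."
ready = tidy_for_tts(essay, {"I read it yesterday": "I red it yesterday"})
for piece in chunk_text(ready):
    print(piece)
```

Each piece can then be pasted (or sent) to the converter on its own and the resulting audio files joined in order.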

SSML lets us vary the tone by adding pauses and emphasis, as well as changing the rate, pitch, and volume. You’ll need to enclose the text that the SSML tags will modify within <speak> tags.

<speak>
Mary had a little lamb <break time="3s"/>Whose fleece was white as snow.
</speak>
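Writing SSML by hand gets fiddly once an essay contains ampersands or angle brackets, so here is a small Python sketch that builds the markup with proper escaping. The helper names (`text`, `pause`, `emphasis`, `prosody`, `speak`) are hypothetical conveniences of ours; the tags and attributes they emit (`break`, `emphasis`, `prosody`) come from the SSML standard, though you should confirm which values your text-to-speech service accepts.

```python
from xml.sax.saxutils import escape, quoteattr


def text(words):
    """Escape plain words so characters like & and < are safe inside SSML."""
    return escape(words)


def pause(seconds):
    """A silent break of the given length in seconds."""
    return f'<break time="{seconds}s"/>'


def emphasis(words, level="moderate"):
    """Stress the words; standard levels include strong, moderate and reduced."""
    return f"<emphasis level={quoteattr(level)}>{escape(words)}</emphasis>"


def prosody(words, rate=None, pitch=None, volume=None):
    """Change speaking rate, pitch and/or volume for the enclosed words."""
    options = (("rate", rate), ("pitch", pitch), ("volume", volume))
    attrs = "".join(f" {name}={quoteattr(value)}" for name, value in options if value)
    return f"<prosody{attrs}>{escape(words)}</prosody>"


def speak(*parts):
    """Wrap already-built parts in the required <speak> root element."""
    return "<speak>" + "".join(parts) + "</speak>"


ssml = speak(
    text("Mary had a little lamb "),
    pause(3),
    prosody("Whose fleece was white as snow.", rate="slow"),
)
print(ssml)
```

The design choice here is that every helper returns finished markup, so plain words must go through `text()` before being handed to `speak()`.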

The most useful SSML tags are: TBC…