This post is part of a series about my podcast:
- The Big Idea
- Recording Gear
- Editing (this post)
- Publishing
- What’s Next
To celebrate the first anniversary of my podcast, The Informed Life, I’ve been writing about my podcasting setup. The first post was about the thinking that led up to the show. The second was about my recording setup. This post is about an important (and time-consuming) part of the process: editing each episode.
As I mentioned in the first post, I had limited experience with audio production before I started podcasting. But I had enough experience with other media (video, writing) to know that I wouldn’t be able to release episodes precisely as recorded. They’d need to be edited before I posted them. When people talk, we ramble. We cough. We pause to think about what we want to say. As I mentioned previously, I was aiming for thirty-minute episodes. This time constraint would require that I cut material from the source audio files.
So I knew editing would be a part of producing each episode, and I evaluated audio editing software before starting the show. There are several audio editing tools available, ranging from open source (and free) like Audacity, to professional (and expensive) like Logic Pro X. One of my guiding principles was not to spend too much money on the show, so I was wary of investing in high-end tools. My experience with cross-platform open source software made me think Audacity might be a powerful tool but wouldn’t feel like a native macOS application, which is something I find very distracting.
At this point, you may be wondering: why not use GarageBand, an audio editor that comes bundled for free with Macs? I evaluated GarageBand, but it was missing a key feature: the ability to automatically split a track wherever it contains a stretch of silence. Breaking the track up this way lets the editor work with discrete blocks of speech rather than one continuous recording, which saves lots of time.
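For the curious, here is a rough sketch of what splitting on silence can look like in Python. It is only an illustration of the general idea, not how Ferrite or any other editor actually does it. The function name `split_on_silence`, the `threshold` and `min_silence` parameters, and the toy sample values are all my own assumptions; a real tool would work on audio frames and measure silence in milliseconds.

```python
def split_on_silence(samples, threshold=0.1, min_silence=3):
    """Return (start, end) index pairs for blocks of non-silent audio.

    Hypothetical sketch: a sample counts as quiet when its absolute
    amplitude is below `threshold`, and a run of at least `min_silence`
    quiet samples ends the current block. `end` is exclusive.
    """
    blocks = []
    block_start = None  # index where the current block of speech began
    quiet_run = 0       # number of consecutive quiet samples seen so far

    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if block_start is None:
                block_start = i
            quiet_run = 0
        else:
            quiet_run += 1
            if block_start is not None and quiet_run >= min_silence:
                # Close the block at the first quiet sample of this run.
                blocks.append((block_start, i - quiet_run + 1))
                block_start = None

    if block_start is not None:
        # The recording ended mid-block; drop any trailing quiet samples.
        blocks.append((block_start, len(samples) - quiet_run))
    return blocks


# Toy data: two bursts of "speech" separated by a long pause.
audio = [0, 0, 0.5, 0.6, 0, 0.4, 0, 0, 0, 0, 0.3, 0.2, 0]
print(split_on_silence(audio))
```

Note that the single quiet sample inside the first burst does not cause a split, because it is shorter than `min_silence`. That is the behavior you want for the short pauses between words.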
I settled on Ferrite, an iOS-based audio editor. Ferrite had many of the features I wanted in an audio editor but was relatively inexpensive. It includes the ability to detect silences and split the track at those points. The one downside to Ferrite, of course, is that it works on a different platform. I’d be recording audio tracks on macOS and editing them on my iPad. This workflow required moving large audio files between the two, which was more of an inconvenience than an obstacle. On the other hand, using an iPad-based editor would make it easier for me to work on the show in different locales such as coffee shops and public transport.
This Ferrite-based setup was in place when I recorded my first show, the interview with Lou Rosenfeld. Everything went as planned: I recorded our conversation using Zoom’s built-in call recording feature, saved the audio file to Dropbox, and opened the file in Ferrite on my iPad. Six hours later, I was done, and horrified. I wouldn’t be able to produce a show every other week if each episode took around six hours to edit. Of course, I made many rookie mistakes. I was learning a new tool, after all. But still, the process seemed inevitably time-consuming: I’d have to listen to the whole episode over and over, stopping to make significant changes, then little changes. Frankly, at that point, I was doubting my ability to continue producing the show at all, given my busy work schedule.
Then, serendipitously, I discovered a tool that has made it possible for me to edit the show more effectively: Descript. It’s perhaps the primary reason I’ve been able to continue producing the show at all. When I explain what it does, you’ll see why.
Descript’s primary interface looks like a text editor. You drag and drop audio files onto this editor, at which point it offers to transcribe them for you. The transcripts, which are generated by machine learning algorithms and are therefore very quick, are astonishingly accurate. Descript also offers to identify different speakers and to tag each part of the transcript according to who says what. The kicker is that you can then make changes to the transcript that affect the underlying audio file. For example, if you select an entire paragraph of text and delete it, Descript removes that sequence from the audio track. As with most text editors, you can copy and paste, delete, undo, etc. In other words, Descript turns audio editing, which is usually an auditory process, into a visual process. This mode switch makes editing much faster since I don’t have to listen to the whole thing to make changes. The tool is cloud-based, so in theory the work could be distributed between different team members, a potential boon to productivity.
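If you’re wondering how deleting text could possibly remove audio, here is a minimal sketch of the general idea in Python. I don’t know how Descript implements this internally, so treat everything here as my assumption: the `Word` class, the `keep_ranges` function, and the timestamps are all made up for illustration. The premise is that a transcript stores start and end times for each word, so deleting words tells the editor which stretches of audio to leave out.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its position in the audio (hypothetical)."""
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


def keep_ranges(words, deleted):
    """Return the (start, end) time ranges of audio that survive an edit.

    `deleted` is a set of indexes into `words` that were removed from the
    transcript. Consecutive kept words are merged into one range, so the
    natural pauses between them are preserved.
    """
    ranges = []
    for i, word in enumerate(words):
        if i in deleted:
            continue
        previous_kept = i > 0 and (i - 1) not in deleted
        if previous_kept:
            # Extend the current range to cover this word as well.
            ranges[-1] = (ranges[-1][0], word.end)
        else:
            ranges.append((word.start, word.end))
    return ranges


transcript = [
    Word("So", 0.0, 0.3),
    Word("um", 0.4, 0.7),
    Word("welcome", 0.8, 1.3),
    Word("back", 1.4, 1.8),
]

# Deleting the filler word "um" (index 1) from the text...
print(keep_ranges(transcript, deleted={1}))
```

The result is a list of time ranges that an audio engine could stitch together into the edited episode. Selecting and deleting a whole paragraph works the same way, just with more indexes in `deleted`.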
That said, Descript isn’t perfect. The tool is still new and evolving. I had frustrating moments with earlier versions, which sometimes would stop syncing, mistranscribe files, lose work, etc. (It’s now much more reliable, which is why I feel comfortable recommending it at this point.) There are still some minor delays when editing files, caused (I assume) by the challenges inherent in keeping the audio and its transcript aligned, all the while syncing everything to the internet in the background. Finally, it’s not a cheap tool. (But it pays for itself, given how much time it’s saving me.)
Even though Descript has made the bulk of the editing faster, I still import the resulting file into Ferrite for final cleanup. I’m not thrilled that I still have to do this, but Ferrite does a better job of mixing everything. (Among other things, this is where I graft in the show intro and its background music.) That said, the process is much faster since I’m no longer doing the primary editing in an audio-first editor. At this point, I couldn’t imagine producing the show without Descript. Meanwhile, Descript seems to be adding features that turn it into a more fully featured audio editor, so I could envision getting rid of Ferrite at some point.
So there you have it, a peek into my podcast editing setup and process. In the next entry in this series, I will tell you about how I publish the show so you can hear it on your phone.