Warning: Deprecation Notice

As of , Web Captioner has been sunset. I've left this article up, but unless someone hosts a fork of the original source code, this article won't work anymore. If you find viable alternatives, please let me know on Mastodon!

Introduction

Yesterday, I participated in a Twitter Space with JavaScript Jam to chat about accessibility and web standards. Ordinarily when I do this kind of speaking, I try to make sure there are, at minimum, automatic captions for the live event (yes, despite automatic captions' serious limitations) and polished captions for any recordings afterwards. What I didn't realize, until Nic Steenhout pointed it out, was that Twitter Spaces no longer provides captions. They used to, but Twitter's new management has since disabled that, rolling back significant progress.

Ideally, live chats like this will happen on platforms that provide captions out of the box, built directly into the interface. That ensures the captions are discoverable, and that users won't have to leave the application window to get them. If you're looking to ensure participants can get captions, this is where you should start: by seeking out platforms that explicitly support captions.

If you absolutely can't get in-app captions to work or move to another platform, then what follows is the last-ditch fallback I ended up going with: using Web Captioner to generate a shareable link to automatic captions that you can pass along to participants in your Twitter Space, Discord voice chat, or other uncaptioned live audio.

What is Web Captioner?

Web Captioner is a website that uses Google Chrome's built-in speech recognition to provide a simple speech-to-text display. Currently, that speech recognition functionality is only in Google Chrome, so Chrome is required for whoever is recording the audio.
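To give a feel for what "Chrome's built-in speech recognition" looks like from a web page, here's a minimal TypeScript sketch using the browser's Web Speech API. This is my own illustration, not Web Captioner's actual source code. The helper names (`splitTranscript`, `startCaptions`) and the small interfaces are made up for this example, and the sketch assumes it runs in a browser that exposes `SpeechRecognition` or `webkitSpeechRecognition`.

```typescript
// Minimal structural types so the sketch doesn't depend on DOM typings.
// They mirror the array-like shape of the Web Speech API's result objects.
interface CaptionAlternative {
  transcript: string;
}
interface CaptionResult {
  readonly isFinal: boolean;
  readonly [index: number]: CaptionAlternative;
}
interface CaptionResultList {
  readonly length: number;
  readonly [index: number]: CaptionResult;
}

// Hypothetical helper: separate finalized text from still-changing interim text.
function splitTranscript(results: CaptionResultList): {
  finalText: string;
  interimText: string;
} {
  let finalText = "";
  let interimText = "";
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    const best = result ? result[0] : undefined; // first alternative is the top guess
    if (!result || !best) continue;
    if (result.isFinal) {
      finalText += best.transcript;
    } else {
      interimText += best.transcript;
    }
  }
  return { finalText, interimText };
}

// Hypothetical helper: start recognition and report text as it arrives.
// Returns false when the browser doesn't expose speech recognition.
function startCaptions(
  onUpdate: (finalText: string, interimText: string) => void
): boolean {
  const scope = globalThis as any;
  const Recognition = scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
  if (!Recognition) return false;

  const recognition = new Recognition();
  recognition.continuous = true; // keep listening across pauses
  recognition.interimResults = true; // surface in-progress guesses too
  recognition.onresult = (event: { results: CaptionResultList }) => {
    const { finalText, interimText } = splitTranscript(event.results);
    onUpdate(finalText, interimText);
  };
  recognition.start();
  return true;
}
```

In a page, you might call `startCaptions` with a callback that writes both strings into an element on screen. The recognizer listens to whatever audio input Chrome is set to use, which is why the loopback setup matters so much for this approach.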

Web Captioner seems largely designed for local use cases, such as a classroom setting, where you could show the transcription on a big screen. Web Captioner also has options to integrate with software like Zoom, OBS Studio, or vMix to provide true closed captions. Crucially for this jerry-rigged solution, Web Captioner also provides an experimental feature for creating shareable links to your captions.

Step 1: Set Up Audio Loopback

The specifics of this step depend a lot on your operating system.

Web Captioner depends on Chrome's ability to capture audio. As a result, while Web Captioner can pick up your microphone just fine, it won't be able to transcribe the audio from anyone else on the call out of the box. If you want to transcribe the full call, you'll need to set up Google Chrome to use the system's entire audio output as an input.

To do this, you'll need to install and set up some audio loopback software, since operating systems can get pretty weird about using audio outputs from some applications as inputs in others. Web Captioner themselves have some steps you can walk through for setting up loopback software.

Audio loopback can get really messy, particularly if you also plan to use your own microphone. If you can get away with it, I'd recommend a two-device setup: one device for your microphone, and one for listening.

If you already have audio loopback software set up to provide device audio output as an audio input source, then you should be able to use that setup just fine. For instance, I use shinywhitebox's SWB Audio App for streams, and I know of other people who use BlackHole.

Step 2: Get Chrome to Use Your Audio Output

Once you've set up your loopback software to provide device audio output as a new audio input source, we need to get Google Chrome to use that device audio in place of your microphone.

To set this, go to Chrome's microphone settings at chrome://settings/content/microphone, and find the microphone dropdown. Choose the audio source you created during Step 1. On my Mac, it had "(Virtual)" at the end to make it easier to find. I'm not sure whether Windows would do the same thing.

If your device audio isn't available from the dropdown, you might need to restart Chrome.
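If you'd like to double-check that Chrome can actually see your loopback device, the browser's `navigator.mediaDevices.enumerateDevices()` call lists the audio inputs available to a page. Below is a rough TypeScript sketch you could adapt. The helper names and the `InputDevice` interface are mine, and the "(Virtual)" hint is only what I saw on my Mac, so treat it as an assumption and pass in whatever your loopback software calls its device.

```typescript
// Minimal shape of the entries returned by enumerateDevices().
interface InputDevice {
  kind: string;
  label: string;
  deviceId: string;
}

// Hypothetical helper: keep only audio inputs whose label contains the hint.
function findLoopbackInputs(
  devices: readonly InputDevice[],
  hint: string = "(Virtual)" // assumption: your device label may differ
): InputDevice[] {
  const needle = hint.toLowerCase();
  return devices.filter(
    (device) =>
      device.kind === "audioinput" &&
      device.label.toLowerCase().includes(needle)
  );
}

// Hypothetical helper: ask the browser for its devices and filter them.
// Resolves to an empty list outside of a browser.
async function listLoopbackInputs(hint?: string): Promise<InputDevice[]> {
  const mediaDevices = (globalThis as any).navigator?.mediaDevices;
  if (!mediaDevices) return [];
  // Note: browsers leave labels blank until the page has microphone permission.
  const devices: InputDevice[] = await mediaDevices.enumerateDevices();
  return findLoopbackInputs(devices, hint);
}
```

If `listLoopbackInputs()` comes back empty on a page that already has microphone permission, that's a decent sign Chrome hasn't picked up the device yet, and a restart is worth trying.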

Step 3: Test the Transcription in Web Captioner

Time for the big moment of truth! Let's make sure Web Captioner can accurately pick up your device audio now.

Go to Web Captioner, and click either of the bright blue Start Captioning buttons in the header:

Web Captioner homepage, featuring an astonishingly bright yellow hero section with two blue 'Start Captioning' call-to-action buttons.

This will take you to the mostly blank, black screen of the transcription interface:

The Web Captioner transcription interface. The page is mostly a blank, black space. At the bottom of the page, a footer contains the Web Captioner logo, a yellow 'Start Captioning' button, and a profile menu toggle kebab

Click the yellow Start Captioning button in the footer. In another tab, window, or application, play some audio with some dialogue. I used a YouTube video for this. Hop back to Web Captioner and confirm that Web Captioner is transcribing the audio. If you're using Web Captioner on the same device you're chatting from, try speaking into the mic to confirm whether your mic audio is also getting transcribed.

Assuming this worked, it's all smooth sailing from here! ⛵

Step 4: Generate a Shareable Link

Next up, we'll have Web Captioner generate a link to our live captions that we can share with participants.

You'll need to be signed in to save settings and to generate the link. First, click the Settings menu icon in the very bottom right corner of the captioner's interface, and then click Sign in. Follow the steps to authenticate into Web Captioner with an account.

Next, we're going to enable the experimental Share feature. This experiment is currently hidden away, so to get to it, you'll need to visit https://webcaptioner.com/captioner/settings/experiments?add=share directly. When you do, you'll be greeted with a popup like this:

Web Captioner settings page with a modal dialog on top, which asks: Do you want to add this experiment? This feature is still in the oven and may not work right. Three checked checkboxes are used to confirm the user understands the experiment may not work perfectly and things might break, the experiment could go away at any time, and that the user agrees to give feedback. The bottom of the dialog has a Cancel button and an Add Experiment button.

Check each checkbox provided, and then click the Add Experiment button to proceed.

Return to the captioner interface. Next to the Start Captioning button, there should be a new button with an icon that looks like a radio tower. Click it to open up a new popup:

Web Captioner's captioning interface. In the bottom right, a modal dialog titled 'Share Captions' offers settings for share links, including whether the interface will display a link back to the stream or site, or whether it will provide a custom welcome message. Additional settings allow for generating a random link (expires in 48 hours) or a custom vanity link (never expires).

Use these settings to configure your share link as needed. I wasn't able to get the custom vanity link to work, but that could have just been a temporary issue. When you're ready, click Get Link.

Step 5: Share the Link

Before the live event starts, promote the captions link, making it as easy as possible to find. After all, these captions are only as useful as they are discoverable. You should also draw attention to the link during the event itself. I set up a memorable, intuitive redirect (benmyers.dev/captions) so I could mention the captions link on air without having to spell out the randomly-generated string of letters. For the purposes of the Twitter Space, we also pinned a tweet to the top of the Space with a link to the captions.
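How you set up a redirect like that depends entirely on how your site is hosted, so I can't give a one-size-fits-all recipe. As a rough sketch, the logic boils down to something like the TypeScript below. The `captionsRedirect` function, the `RedirectResponse` shape, and the example share URL are all hypothetical. You'd wire the result into whatever server or hosting platform you use.

```typescript
// Shape of the response we'd hand back to whatever server framework is in use.
interface RedirectResponse {
  status: number;
  headers: { Location: string };
}

// Hypothetical helper: send /captions to the current share link.
// Returns null for any other path so normal routing can continue.
function captionsRedirect(
  pathname: string,
  shareUrl: string
): RedirectResponse | null {
  // Treat "/captions" and "/captions/" the same.
  const normalized = pathname.replace(/\/+$/, "") || "/";
  if (normalized !== "/captions") return null;

  // 302 (temporary) rather than 301 (permanent): randomly generated share
  // links expire, so browsers shouldn't cache where this points.
  return { status: 302, headers: { Location: shareUrl } };
}
```

The temporary redirect is the main design choice here. Because a random share link expires after 48 hours, you'll likely point the same memorable URL at a different link for your next event.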

When the event starts, be sure to click Start Captioning to kick off the transcription.

Step 6: Post-Event Wrap-Up

After the event is done, you'll want to return to Web Captioner and stop the captions. You should probably do this before you or any other hosts say something you don't want broadcast to the world 😅

While in the Web Captioner interface, you can export a transcript! This is especially helpful if you plan to upload a recording of the event. To export your transcript, pop open the Settings menu in the bottom right corner of the captioner interface again, and click the button with the floppy disk Save icon:

Web Captioner's captioning interface, with the Settings menu open. A button with a floppy disk-like Save icon has a tooltip which reads: Save transcript.

From there, you'll be able to export your transcript as either a text file or a Word document:

A modal dialog reads 'Save to File,' and offers buttons to export as a text file or a Word document.

Using this transcript elsewhere?

If you're planning to use this transcript alongside an upload of the event, please be sure to clean it up and correct it first. The exported transcript will have plenty of mistranscribed words, as well as weird mixes of prematurely cut off sentences alongside run-on sentences. You'll also need to clearly indicate who's speaking, and probably add in any non-dialogue audio cues as well.

Finally, you might want to put your Chrome instance back in its default microphone state, as well as disable any audio loopback software you have running, so that your audio experience for day-to-day app usage is back to normal.

Conclusion

If you can use a platform that supports captioning out of the box, please do so. It'll be far more reliable than running finicky audio loopback software, depending on continued support for a hidden experimental feature inside Web Captioner, and requiring listeners to have a separate window up to follow the conversation. However, if you've exhausted other options, a shareable link like this could work in a pinch.

You may also be interested in /u/mossonrok's Reddit post, where they go into using a similar approach on macOS, with a focus on Discord voice chats.