A Quick Look at the React Speech Recognition Hook

Overview

React Speech Recognition is a React library built around a hook that uses the Web Speech API to convert speech picked up by the machine’s microphone into text that your React components can use.

The library exposes two things:

useSpeechRecognition, a React hook that gives a component access to a transcript of speech picked up from the user’s microphone.
SpeechRecognition, an object that manages the global state of the Web Speech API and exposes functions to turn the microphone on and off.

Prerequisites

This version requires React 16.8 or later so that React hooks can be used; please see the library's full README for more information.

Note: This library uses the Web Speech API. Browser support for this API is currently limited, with Chrome offering the best experience. As of June 2020, these browsers support the API:

Chrome (desktop): this is by far the smoothest experience
Microsoft Edge
Chrome (Android): be warned that there can be an annoying beeping sound when the microphone is turned on. This is part of the Android OS and cannot be controlled from the browser
Android webview
Samsung Internet
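
If you want to check for support yourself before loading anything speech-related, a minimal sketch could look like the one below. It relies on the two global names the Web Speech API is exposed under (SpeechRecognition, and the prefixed webkitSpeechRecognition used by Chrome). The function names and the globalObject parameter are my own, made up for illustration so the check can be tried outside a browser; in a real page you would pass window.

```javascript
// Returns the browser's speech recognition constructor, or null if it is absent.
// `globalObject` is a hypothetical parameter; in a browser you would pass `window`.
function getSpeechRecognitionConstructor(globalObject) {
  if (!globalObject) {
    return null;
  }
  // Chrome exposes the API under the `webkit` prefix.
  return globalObject.SpeechRecognition || globalObject.webkitSpeechRecognition || null;
}

// Convenience wrapper that answers the yes/no question.
function supportsSpeechRecognition(globalObject) {
  return getSpeechRecognitionConstructor(globalObject) !== null;
}
```

The library already offers its own check, which we will use later in this tutorial, so treat this only as a peek at what such a check involves.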

What We’ll Make

I will be making a simple voice memo app, with basic voice commands, that runs in the browser. If you would like to follow this tutorial, please be ready to work with the create-react-app boilerplate.

Let's Get Started

Step 1: Setting up the workspace

Create a new React app with create-react-app:

npx create-react-app dictaphone

Modify the App.js file and add the dictaphone component like so:

import React from 'react';
import Dictaphone from './dictaphoneSetup.js';
function App() {
  return (
    <div className="App">
      <header className="Dictaphone-Tester">
        <Dictaphone />
      </header>
    </div>
  );
}
export default App;

NOTE: We haven't built the Dictaphone component yet; we'll jump into that next!

In the root directory of your app, install the library using:

npm i react-speech-recognition

Create a file named dictaphoneSetup.js in your src directory to house the Dictaphone component (the name has to match the import in App.js), and import the necessary dependencies:

import React, { useEffect, useState } from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';

And that's it for our basic setup!

Step 2: Setting up the dictaphone

First, we will build the skeleton of our component:

import React, { useEffect, useState } from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';
const Dictaphone = () => {
  return (
    <div></div>
  );
};
// The exported name must match the component declared above.
export default Dictaphone;

Next, we need to pull several values out of useSpeechRecognition. The ones we will need are transcript, interimTranscript, finalTranscript, resetTranscript, and listening. You can destructure them like this:

const {
  transcript,
  interimTranscript,
  finalTranscript,
  resetTranscript,
  listening,
} = useSpeechRecognition();

We will also need to enable voice commands through this hook, so pass it a commands option (we will define the commands array in Step 4):

} = useSpeechRecognition({ commands });
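
To make the relationship between the three transcript values a little more concrete, here is a minimal sketch in plain JavaScript. It assumes that transcript is simply the final text followed by whatever is still being recognised; combineTranscripts is a hypothetical helper of mine, not something the library exports.

```javascript
// Joins the settled text with the in-progress text, skipping empty parts.
// Assumption: `transcript` behaves like final text followed by interim text.
function combineTranscripts(finalTranscript, interimTranscript) {
  return [finalTranscript, interimTranscript]
    .map((part) => part.trim())
    .filter((part) => part !== '')
    .join(' ');
}
```

Under that assumption, interimTranscript keeps changing while you are mid-sentence, and its words move over to finalTranscript once the browser has settled on them.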

Moving swiftly onward, we need to add several functions, some built into the hook and some into React. First, we will use useEffect to log the final transcript to the console.

useEffect(() => {
  if (finalTranscript !== '') {
    console.log('Got final result:', finalTranscript);
  }
}, [interimTranscript, finalTranscript]);

Then, we will add a listening function to start our dictaphone and throw in a quick conditional that logs a warning if the browser is not compatible with this API. These would look like:

if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
  // Log the hint before bailing out; nothing is rendered in unsupported browsers.
  console.log('Your browser does not support speech recognition software! Try Chrome desktop, maybe?');
  return null;
}
const listenContinuously = () => {
  SpeechRecognition.startListening({
    continuous: true,
    language: 'en-GB',
  });
};
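
If you would rather not hard-code the language, one option is a small factory like the sketch below. makeListenContinuously is a hypothetical helper of mine, and its recognizer parameter is a stand-in for the SpeechRecognition object imported from react-speech-recognition; all it needs is a startListening method that accepts the same options used above.

```javascript
// Builds a "start listening" function for a given language.
// `recognizer` is a stand-in for the imported SpeechRecognition object.
function makeListenContinuously(recognizer, language = 'en-GB') {
  return () => recognizer.startListening({ continuous: true, language });
}
```

In the component you could then write something like const listenInFrench = makeListenContinuously(SpeechRecognition, 'fr-FR'); and hand that to a button instead.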

Step 3: Building the dictaphone controls and page elements

We will add some page elements to control our dictaphone and print the words to the page. We will need three buttons (stop, listen, and reset) to control the dictaphone and reset the transcript. We may also want to add an indicator that tells the user whether the dictaphone is listening. Within the div returned by Dictaphone, create the following elements:

A span that houses a ternary conditional, which uses the listening value from useSpeechRecognition to show whether the dictaphone is currently accessing the microphone. Add a 'listening:' label for clarity (see the code below for the exact markup).
A reset button that will use the built-in function resetTranscript to (you guessed it) reset the transcript.
A listen button and a stop button that will use the functions listenContinuously and SpeechRecognition.stopListening respectively to control the dictaphone.
One last span that houses our transcript.

Your return statement should look similar to this:

return (
  <div>
    <div>
      <span>
        listening:
        {' '}
        {listening ? 'on' : 'off'}
      </span>
      <div>
        <button type="button" onClick={resetTranscript}>Reset</button>
        <button type="button" onClick={listenContinuously}>Listen</button>
        <button type="button" onClick={SpeechRecognition.stopListening}>Stop</button>
      </div>
    </div>
    <div>
      <span>{transcript}</span>
    </div>
  </div>
);

If you run npm start and go to localhost:3000 in your browser, the dictaphone should now be working.

Step 4: Adding commands

Now that our dictaphone is working, let's add some commands. We won't do anything too complicated but instead access the functions we already have at our disposal.

Within the Dictaphone component, above the call to useSpeechRecognition, declare an array called commands. This will be an array of objects containing two properties each:

a command (string or regular expression)
a callback

A voice command that resets the transcript should look like this:

const commands = [
  {
    command: 'reset',
    callback: () => resetTranscript()
  },
  {
    command: 'clear',
    callback: () => resetTranscript()
  }
]

When the dictaphone picks up the words 'reset' or 'clear' from the user's speech, it will execute the callback associated with the command.
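
To give a feel for what happens with that array, here is a simplified sketch of a command matcher in plain JavaScript. This is not the library's actual implementation. runMatchingCommands is a hypothetical function of mine, and I am assuming that string commands are compared case-insensitively against the whole spoken phrase while regular expressions are tested as written.

```javascript
// Runs the callback of every command that matches the spoken phrase.
// Returns how many callbacks ran. Matching rules here are assumptions.
function runMatchingCommands(commands, phrase) {
  const spoken = phrase.trim();
  let matches = 0;
  commands.forEach(({ command, callback }) => {
    const isMatch = command instanceof RegExp
      ? command.test(spoken)
      : command.toLowerCase() === spoken.toLowerCase();
    if (isMatch) {
      callback();
      matches += 1;
    }
  });
  return matches;
}
```

The point of the sketch is only that each entry is checked in turn and every matching callback fires, which is why two different phrases can share the same callback.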

Let's add a response feature. First, add React's useState hook to hold a message:

const [message, setMessage] = useState('');

If we then render the message in the output, we can add commands that generate a response. For instance, if I were to say ‘Hello’, the app would print back 'Hi there!' or something to that effect. Below, I've written some simple commands.

const commands = [
  {
    command: 'reset',
    callback: () => resetTranscript()
  },
  {
    command: 'shut up',
    callback: () => setMessage('I wasn\'t talking.')
  },
  {
    command: 'Hello',
    callback: () => setMessage('Hi there!')
  },
]
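
Once you have more than a few canned replies, writing each object by hand gets repetitive. One possible way around that is to build the response commands from a lookup table, as in the sketch below. buildResponseCommands is a hypothetical helper of mine; the objects it returns have the same command and callback shape as the ones above.

```javascript
// Turns a { phrase: reply } table into an array of command objects.
// `setMessage` is whatever function should receive the reply text.
function buildResponseCommands(responses, setMessage) {
  return Object.entries(responses).map(([command, reply]) => ({
    command,
    callback: () => setMessage(reply),
  }));
}
```

Inside the component, something like buildResponseCommands({ Hello: 'Hi there!' }, setMessage) would give you entries you can spread into the commands array next to the reset command.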

With that, our quick dictaphone is complete! Take a look at this code where it all comes together:

import React, { useEffect, useState } from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';
const Dictaphone = () => {
  const [message, setMessage] = useState('');
  // Voice commands: each entry pairs a phrase with the callback to run.
  const commands = [
    {
      command: 'reset',
      callback: () => resetTranscript()
    },
    {
      command: 'shut up',
      callback: () => setMessage('I wasn\'t talking.')
    },
    {
      command: 'Hello',
      callback: () => setMessage('Hi there!')
    },
  ];
  const {
    transcript,
    interimTranscript,
    finalTranscript,
    resetTranscript,
    listening,
  } = useSpeechRecognition({ commands });
  // Log each finished chunk of speech to the console.
  useEffect(() => {
    if (finalTranscript !== '') {
      console.log('Got final result:', finalTranscript);
    }
  }, [interimTranscript, finalTranscript]);
  // All hooks are called above this early return, so their order stays stable.
  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    console.log('Your browser does not support speech recognition software! Try Chrome desktop, maybe?');
    return null;
  }
  const listenContinuously = () => {
    SpeechRecognition.startListening({
      continuous: true,
      language: 'en-GB',
    });
  };
  return (
    <div>
      <div>
        <span>
          listening:
          {' '}
          {listening ? 'on' : 'off'}
        </span>
        <div>
          <button type="button" onClick={resetTranscript}>Reset</button>
          <button type="button" onClick={listenContinuously}>Listen</button>
          <button type="button" onClick={SpeechRecognition.stopListening}>Stop</button>
        </div>
      </div>
      <div>
        {message}
      </div>
      <div>
        <span>{transcript}</span>
      </div>
    </div>
  );
};
export default Dictaphone;