
A Quick Look at the React Speech Recognition Hook

Learning and exploring the react-speech-recognition hook basics.

Krista Singleton
October 29, 2020
7 min read

Overview

React Speech Recognition is a React hook that uses the Web Speech API to convert speech picked up by the machine's microphone into text that the app's React components can use.

There are two main pieces in this framework:

  • useSpeechRecognition, a React hook that gives a component access to a transcript of speech picked up from the user’s microphone.
  • SpeechRecognition, an object that manages the global state of the Web Speech API, exposing functions to turn the microphone on and off.

Prerequisites

This version requires React 16.8 or later so that React hooks can be used; please see the full framework README here for more information.

Note: This framework uses the Web Speech API. Browser support for this API is currently limited, with Chrome offering the best experience. As of June 2020, these browsers support the API:

  • Chrome (desktop): this is by far the smoothest experience
  • Microsoft Edge
  • Chrome (Android): be warned that there can be an annoying beeping sound when turning the microphone on. This is part of the Android OS and cannot be controlled from the browser
  • Android webview
  • Samsung Internet
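
Because support varies by browser, it can help to check for the underlying API before relying on it. The snippet below is a minimal sketch of such a check against the raw Web Speech API, not part of react-speech-recognition. The function name getSpeechRecognitionConstructor and its globalObject parameter are hypothetical, chosen so the check can also be run outside a browser.

```javascript
// Returns the browser's speech recognition constructor, or null if none exists.
// `globalObject` is a hypothetical parameter (normally `window`).
function getSpeechRecognitionConstructor(globalObject) {
  if (!globalObject) {
    return null;
  }
  // Chrome exposes the API under the vendor-prefixed name webkitSpeechRecognition.
  return globalObject.SpeechRecognition || globalObject.webkitSpeechRecognition || null;
}

// In a browser, usage might look like this:
// const supported = getSpeechRecognitionConstructor(window) !== null;
```

Later in this tutorial we use the library's own browserSupportsSpeechRecognition function instead, which is the simpler choice inside a component.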

What We’ll Make

I will be making a simple voice memo app with basic voice commands that run in the browser. If you would like to follow this tutorial, please be ready to work with the create-react-app boilerplate.


Let's Get Started

Step 1: Setting up the workspace

  1. Create a new react app with create-react-app.
npx create-react-app dictaphone
  2. Modify the App.js file and add the dictaphone component like so:
import Dictaphone from './dictaphoneSetup.js'

function App() {
  return (
    <div className="App">
      <header className="Dictaphone-Tester">
        <Dictaphone />
      </header>
    </div>
  );
}

export default App;

NOTE: We haven't built the Dictaphone component yet; we'll jump into that next!

  3. In the root directory of your app, install the react hook using:
npm i react-speech-recognition
  4. Create a file named dictaphoneSetup.js in your src directory to house the Dictaphone component (the name must match the import in App.js), and import the necessary dependencies:
import React, { useEffect, useState } from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';

And that's it for our basic setup!

Step 2: Setting up the dictaphone

  1. First, we will build the skeleton of our component:
import React, { useEffect, useState } from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';

const Dictaphone = () => {
 return (
   <div></div>
 );
};

export default Dictaphone;
  2. Next, we need to pull certain values out of useSpeechRecognition. The values we will need are transcript, interimTranscript, finalTranscript, resetTranscript, and listening. You can do so like this:

    const {
      transcript,
      interimTranscript,
      finalTranscript,
      resetTranscript,
      listening,
    } = useSpeechRecognition();

    We will also need to enable commands through this hook, so pass it a commands array (we will define this array in Step 4; until then, call useSpeechRecognition() with no arguments):

    } = useSpeechRecognition({ commands });
  3. Moving swiftly onward, we need to add several functions that are built into the hook and React. First, we will use useEffect to log the final transcript to the console.

    useEffect(() => {
      if (finalTranscript !== '') {
        console.log('Got final result:', finalTranscript);
      }
    }, [interimTranscript, finalTranscript]);
  4. Then, we will add a listening function to start our dictaphone and throw in a quick conditional to alert the user if their browser is not compatible with this API. These functions look like this:
 // Log a warning and render nothing when the browser lacks speech recognition support.
 if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
   console.log('Your browser does not support speech recognition software! Try Chrome desktop, maybe?');
   return null;
 }

 // Keep listening until explicitly stopped, using British English.
 const listenContinuously = () => {
   SpeechRecognition.startListening({
     continuous: true,
     language: 'en-GB',
   });
 };
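
If you would rather have a single button than separate listen and stop controls, the same two library calls can be wrapped in a toggle. This is only a sketch: toggleListening and its recognition parameter are hypothetical names, and the object passed in is assumed to expose startListening and stopListening as used above.

```javascript
// Hypothetical helper: starts or stops the microphone based on the current state.
// `recognition` is assumed to be the SpeechRecognition object from the library
// (or any object with the same startListening/stopListening functions).
function toggleListening(recognition, listening, language = 'en-GB') {
  if (listening) {
    recognition.stopListening();
    return;
  }
  // Same options as listenContinuously above, with a configurable language.
  recognition.startListening({ continuous: true, language });
}

// Inside the component, usage might look like this:
// <button type="button" onClick={() => toggleListening(SpeechRecognition, listening)}>Toggle</button>
```

Passing the recognition object in as an argument keeps the helper easy to try out with a stub outside the browser.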

Step 3: Building the dictaphone controls and page elements

We will add some page elements to control our dictaphone and print the words to the page. We will need three buttons (stop, listen, and reset) to control the dictaphone and reset the transcript. We may also want an indicator that tells the user whether the dictaphone is listening. Within Dictaphone's return div, create the following elements:

  1. A span that houses a ternary conditional, which uses the listening value from useSpeechRecognition to show whether the dictaphone is currently accessing the microphone. Add a 'listening:' label for clarity (see the return statement below for the exact markup).
  2. A reset button that uses the built-in function resetTranscript to (you guessed it) reset the transcript.
  3. Listen and stop buttons that use the functions listenContinuously and SpeechRecognition.stopListening, respectively, to control the dictaphone.
  4. One last span that houses our transcript.

Your return statement should look similar to this:

  return (
    <div>
      <div>
        <span>
          listening:
          {' '}
          {listening ? 'on' : 'off'}
        </span>
        <div>
          <button type="button" onClick={resetTranscript}>Reset</button>
          <button type="button" onClick={listenContinuously}>Listen</button>
          <button type="button" onClick={SpeechRecognition.stopListening}>Stop</button>
        </div>
      </div>
      <div>
        <span>{transcript}</span>
      </div>
    </div>
  );

Run npm start and go to localhost:3000 in your browser; the dictaphone should now be working.

Step 4: Adding commands

Now that our dictaphone is working, let's add some commands. We won't do anything too complicated but instead access the functions we already have at our disposal.

Within the dictaphone component, declare an array called commands. This will be an array of objects containing two properties each:

  • a command (string or regular expression)
  • a callback

A voice command that resets the transcript should look like this:

 const commands = [
   {
     command: 'reset',
     callback: () => resetTranscript()
   },
   {
     command: 'clear',
     callback: () => resetTranscript()
   }
 ]

When the dictaphone picks up the words 'reset' or 'clear' from the user's speech, it will execute the callback associated with the command.
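
To build some intuition for what happens when a phrase is checked against the commands array, here is a simplified model in plain JavaScript. It is not the library's actual implementation. It assumes that a string command matches only when it equals the whole spoken phrase (ignoring case) and that a regular expression command matches when it tests true. The name runMatchingCommands is hypothetical.

```javascript
// Simplified model of command matching; NOT the library's real implementation.
// Assumption: a string command must equal the whole phrase, ignoring case.
function runMatchingCommands(commands, phrase) {
  const spoken = phrase.trim();
  let matched = 0;
  commands.forEach(({ command, callback }) => {
    const isMatch = command instanceof RegExp
      ? command.test(spoken)
      : command.toLowerCase() === spoken.toLowerCase();
    if (isMatch) {
      callback();
      matched += 1;
    }
  });
  // Return how many callbacks ran, which is handy when experimenting.
  return matched;
}

// Example: both phrases below trigger a callback that counts resets.
let resets = 0;
const exampleCommands = [
  { command: 'reset', callback: () => { resets += 1; } },
  { command: /^clear( everything)?$/i, callback: () => { resets += 1; } },
];
runMatchingCommands(exampleCommands, 'Reset');
runMatchingCommands(exampleCommands, 'clear everything');
```

The real hook does this matching for you; the sketch only illustrates why each entry pairs a command with a callback.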

Let's add a response feature. First, add a React state hook to hold the response:

const [message, setMessage] = useState('');

If we then render message in the component's output, we can add commands that generate a response. For instance, if I were to say 'Hello', the app would print back 'Hi there!' or something to that effect. Below, I've written some simple commands.

 const commands = [
   {
     command: 'reset',
     callback: () => resetTranscript()
   },
   {
     command: 'shut up',
     callback: () => setMessage('I wasn\'t talking.')
   },
   {
     command: 'Hello',
     callback: () => setMessage('Hi there!')
   },
 ]
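
As the list of responses grows, writing each command object by hand gets repetitive. One option, sketched below, is to generate the response commands from a plain object that maps phrases to replies. buildResponseCommands is a hypothetical helper, not part of the library; it only produces the same command and callback shape used above.

```javascript
// Hypothetical helper: turns a { phrase: reply } map into the
// [{ command, callback }] shape used in this tutorial.
function buildResponseCommands(responses, setMessage) {
  return Object.entries(responses).map(([phrase, reply]) => ({
    command: phrase,
    // Each callback stores its reply through the setter passed in.
    callback: () => setMessage(reply),
  }));
}

// Inside the component, usage might look like this:
// const commands = [
//   { command: 'reset', callback: () => resetTranscript() },
//   ...buildResponseCommands({ Hello: 'Hi there!' }, setMessage),
// ];
```

The reset command stays handwritten because it calls resetTranscript rather than setting a message.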

With that, our quick dictaphone is complete! Take a look at this code where it all comes together:

import React, { useEffect, useState } from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';

const Dictaphone1 = () => {
 const [message, setMessage] = useState('');
 const commands = [
   {
     command: 'reset',
     callback: () => resetTranscript()
   },
   {
     command: 'shut up',
     callback: () => setMessage('I wasn\'t talking.')
   },
   {
     command: 'Hello',
     callback: () => setMessage('Hi there!')
   },
 ]
 const {
   transcript,
   interimTranscript,
   finalTranscript,
   resetTranscript,
   listening,
 } = useSpeechRecognition({ commands });

 useEffect(() => {
   if (finalTranscript !== '') {
     console.log('Got final result:', finalTranscript);
   }
 }, [interimTranscript, finalTranscript]);
 // Log a warning and render nothing when the browser lacks speech recognition support.
 if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
   console.log('Your browser does not support speech recognition software! Try Chrome desktop, maybe?');
   return null;
 }

 const listenContinuously = () => {
   SpeechRecognition.startListening({
     continuous: true,
     language: 'en-GB',
   });
 };
 return (
   <div>
     <div>
       <span>
         listening:
         {' '}
         {listening ? 'on' : 'off'}
       </span>
       <div>
         <button type="button" onClick={resetTranscript}>Reset</button>
         <button type="button" onClick={listenContinuously}>Listen</button>
         <button type="button" onClick={SpeechRecognition.stopListening}>Stop</button>
       </div>
     </div>
     <div>
       {message}
     </div>
     <div>
       <span>{transcript}</span>
     </div>
   </div>
 );
};

export default Dictaphone1;

Thanks for following along, and I hope this helps!


