How to Implement React Native Speech to Text Using Native Modules and Hooks

A practical way to implement react native speech to text functionality is to bridge the iOS Speech Framework and Android SpeechRecognizer through the @react-native-voice/voice library, then wrap the imperative native API in a declarative custom hook built on React’s useState, useEffect, and useCallback primitives.

React Native applications access device hardware through a bridge architecture that connects JavaScript logic to native platform APIs. When implementing react native speech to text features, developers can pair the @react-native-voice/voice community package with React’s built-in hooks, whose public entry points live in packages/react/src/ReactHooks.js within the facebook/react repository, so that native listeners are torn down with the component and do not leak memory.

Why React Hooks Are Essential for Speech Recognition

The React hook system provides the foundation for managing side effects and state in functional components. In the facebook/react source code, the core hook primitives that enable this pattern are exposed from packages/react/src/ReactHooks.js (line numbers shift between React versions):

  • useState (lines 66‑71): Returns a stateful value and a function to update it, triggering a re-render when the transcript changes.
  • useEffect (lines 87‑101): Accepts a function containing imperative side effects, such as attaching native event listeners. That function may return a cleanup function, which React runs before the effect re-runs and when the component unmounts.
  • useCallback (defined in the same file): Memoizes callback functions so their identity stays stable between renders, which prevents unnecessary re-registrations of native event handlers.

By wrapping the imperative @react-native-voice/voice API in a custom hook built with these primitives, you let React run the teardown for you: the cleanup function returned from useEffect releases native resources when the component unmounts.
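The subscribe-then-cleanup contract can be tried outside React. The sketch below is plain JavaScript: createFakeRecognizer is a hypothetical stand-in for a native module, and subscribeEffect has the same shape as a useEffect callback (perform the side effect, return its cleanup). Neither name comes from React or @react-native-voice/voice.

```javascript
// Hypothetical stand-in for a native recognizer: it only stores handlers.
function createFakeRecognizer() {
  const handlers = new Set();
  return {
    // Registers a handler and returns a function that removes it again.
    addListener(fn) {
      handlers.add(fn);
      return () => handlers.delete(fn);
    },
    // Simulates the native side delivering a recognized phrase.
    emit(text) {
      handlers.forEach(fn => fn(text));
    },
    listenerCount() {
      return handlers.size;
    },
  };
}

// Same shape as a useEffect callback: do the side effect, return the cleanup.
function subscribeEffect(recognizer, onResult) {
  const remove = recognizer.addListener(onResult);
  return () => remove();
}

const recognizer = createFakeRecognizer();
const received = [];

const cleanup = subscribeEffect(recognizer, text => received.push(text));
recognizer.emit('hello'); // delivered while subscribed
cleanup(); // what React would call on unmount
recognizer.emit('ignored'); // no handler left to receive this
```

After cleanup runs, the recognizer holds no handlers, which is the property the real hook relies on to avoid leaks.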

Step‑by‑Step Implementation Guide

1. Install the Voice Recognition Library

Add the community-maintained package to your project:

npm install @react-native-voice/voice

For React Native 0.60 and above, autolinking handles the native module setup automatically. For iOS, execute npx pod-install to install the required CocoaPods dependencies.

2. Configure Platform Permissions

Speech recognition requires explicit user permissions on both iOS and Android.

iOS – Add the following keys to Info.plist:

<key>NSSpeechRecognitionUsageDescription</key>
<string>This app uses speech recognition to convert your voice to text.</string>
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to record your voice.</string>

Android – Add to AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />

For Android 6.0 (API 23) and above, request runtime permissions using PermissionsAndroid from react-native or the react-native-permissions library.
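A runtime request can come back with more than a yes or no. The sketch below assumes the three string values that PermissionsAndroid.RESULTS exposes (granted, denied, never_ask_again) and maps each to a follow-up action. nextStepForPermission and its return values are hypothetical names chosen for illustration; in an app you would compare against PermissionsAndroid.RESULTS rather than a local copy.

```javascript
// Local copy of the result strings, so the sketch runs without react-native.
const RESULTS = {
  GRANTED: 'granted',
  DENIED: 'denied',
  NEVER_ASK_AGAIN: 'never_ask_again',
};

// Hypothetical helper: decides what the UI should do after a permission request.
function nextStepForPermission(result) {
  switch (result) {
    case RESULTS.GRANTED:
      return 'start'; // safe to begin recognition
    case RESULTS.NEVER_ASK_AGAIN:
      return 'open-settings'; // the system dialog will not be shown again
    default:
      return 'explain-and-retry'; // denied once; the user can be asked again
  }
}
```

Treating never_ask_again separately matters because repeating the request in that state shows no dialog, so the only way forward is to send the user to the app's system settings.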

3. Create the useSpeechToText Hook

Create a custom hook that abstracts the native voice API. This implementation relies on the React hooks exported from packages/react/src/ReactHooks.js so that listeners are registered once and torn down with the component.

import {useEffect, useState, useCallback} from 'react';
import Voice from '@react-native-voice/voice';

export default function useSpeechToText() {
  const [transcript, setTranscript] = useState('');
  const [isListening, setIsListening] = useState(false);
  const [error, setError] = useState(null);

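  // Memoized handlers keep a stable identity, so the effect below runs once
  const onSpeechResults = useCallback(e => {
    // e.value lists alternative transcriptions of one utterance; keep the first
    const best = e.value?.[0];
    if (best) {
      setTranscript(prev => (prev ? `${prev} ${best}` : best));
    }
  }, []);

  const onSpeechError = useCallback(e => {
    // e.error is an object ({code, message}); store a string so it can be rendered
    setError(e.error?.message ?? 'Speech recognition failed');
    setIsListening(false);
  }, []);

  // Recognition can end on its own after a pause, so keep isListening in sync
  const onSpeechEnd = useCallback(() => setIsListening(false), []);

  // Set up and clean up native listeners using useEffect
  useEffect(() => {
    Voice.onSpeechResults = onSpeechResults;
    Voice.onSpeechError = onSpeechError;
    Voice.onSpeechEnd = onSpeechEnd;

    // React runs this cleanup on unmount or before the effect re-runs
    return () => {
      // Release the native recognizer, then detach the handlers set above
      Voice.destroy().then(Voice.removeAllListeners).catch(() => {});
    };
  }, [onSpeechResults, onSpeechError, onSpeechEnd]);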

  // Public API methods
  const startListening = useCallback(async () => {
    try {
      setError(null);
      await Voice.start('en-US'); // Parameterize language code as needed
      setIsListening(true);
    } catch (e) {
      setError(e.message);
    }
  }, []);

  const stopListening = useCallback(async () => {
    try {
      await Voice.stop();
      setIsListening(false);
    } catch (e) {
      setError(e.message);
    }
  }, []);

  const reset = useCallback(() => {
    setTranscript('');
    setError(null);
  }, []);

  return {
    transcript,
    isListening,
    error,
    startListening,
    stopListening,
    reset,
  };
}

Key Implementation Details:

  • useState (as implemented in ReactHooks.js lines 66‑71) manages the transcript, the listening flag, and the error state.
  • useCallback memoizes the native event handlers to prevent unnecessary re-registrations on every render.
  • useEffect (lines 87‑101 in ReactHooks.js) handles the side effect of attaching native listeners and returns a cleanup function that invokes Voice.destroy(), ensuring native resources are released when the component unmounts.
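The three pieces of state change together in a small number of ways, so one alternative to separate useState calls is a single reducer that could be passed to useReducer. The sketch below is plain JavaScript and is not part of the hook above; the action names (started, results, ended, failed, reset) are hypothetical, and it assumes the first entry in the results array is the preferred transcription.

```javascript
const initialState = {transcript: '', isListening: false, error: null};

// Pure transition function: (state, action) -> next state.
function speechReducer(state, action) {
  switch (action.type) {
    case 'started':
      return {...state, isListening: true, error: null};
    case 'results': {
      // Keep only the first alternative and append it to the transcript.
      const best = (action.alternatives || [])[0];
      if (!best) return state;
      const transcript = state.transcript
        ? `${state.transcript} ${best}`
        : best;
      return {...state, transcript};
    }
    case 'ended':
      return {...state, isListening: false};
    case 'failed':
      return {...state, isListening: false, error: action.message};
    case 'reset':
      return {...state, transcript: '', error: null};
    default:
      return state;
  }
}

// Replay one recognition session through the reducer.
const finalState = [
  {type: 'started'},
  {type: 'results', alternatives: ['turn on the lights', 'turn on the light']},
  {type: 'ended'},
].reduce(speechReducer, initialState);
```

Because the reducer is a pure function, the transitions can be unit tested without a device, a simulator, or the native module.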

4. Build the UI Component

Consume the custom hook in a functional component to render the speech interface:

import React from 'react';
import {
  View,
  Text,
  Button,
  StyleSheet,
  PermissionsAndroid,
  Platform,
  Alert,
} from 'react-native';
import useSpeechToText from './useSpeechToText';

export default function SpeechScreen() {
  const {
    transcript,
    isListening,
    error,
    startListening,
    stopListening,
    reset,
  } = useSpeechToText();

  // Request Android microphone permission at runtime
  async function requestPermission() {
    if (Platform.OS !== 'android') return true;
    
    const granted = await PermissionsAndroid.request(
      PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
      {
        title: 'Microphone Permission',
        message: 'This app needs microphone access to convert speech to text.',
        buttonPositive: 'OK',
      },
    );
    return granted === PermissionsAndroid.RESULTS.GRANTED;
  }

  const handleStart = async () => {
    const hasPermission = await requestPermission();
    if (!hasPermission) {
      Alert.alert('Permission Denied', 'Cannot start speech recognition without microphone access.');
      return;
    }
    startListening();
  };

  return (
    <View style={styles.container}>
      <Text style={styles.title}>Speech-to-Text Demo</Text>
      
      <Text style={styles.transcript}>
        {transcript || 'Tap "Start" and speak...'}
      </Text>
      
      {error && <Text style={styles.error}>Error: {error}</Text>}
      
      <View style={styles.buttonRow}>
        <Button
          title={isListening ? 'Stop Listening' : 'Start Listening'}
          onPress={isListening ? stopListening : handleStart}
        />
        <Button title="Clear" onPress={reset} />
      </View>
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: 'center',
    padding: 20,
  },
  title: {
    fontSize: 24,
    fontWeight: 'bold',
    marginBottom: 20,
    textAlign: 'center',
  },
  transcript: {
    fontSize: 18,
    marginVertical: 20,
    minHeight: 100,
  },
  error: {
    color: 'red',
    marginBottom: 10,
  },
  buttonRow: {
    flexDirection: 'row',
    justifyContent: 'space-around',
  },
});

Implementation Notes:

  • The component delegates all speech logic to the custom hook, maintaining a clean separation of concerns.
  • Platform-specific permission handling is isolated to the component level, while the hook remains platform-agnostic.
  • State updates flow unidirectionally from the hook to the UI, following React’s declarative paradigm.

Key React Source Files

Understanding the internals of React’s hook system helps explain why this implementation pattern is robust. The primitives used in the speech-to-text hook are exported from packages/react/src/ReactHooks.js in the facebook/react repository. That file contains thin wrappers that forward each call to the active dispatcher; the logic that schedules effects and runs their cleanup functions lives in the react-reconciler package.

Together, these show that React’s effect cleanup mechanism directly supports the pattern of subscribing to native events in useEffect and returning a cleanup function to destroy the voice recognizer.

Summary

  • Bridge Architecture: React Native speech to text requires bridging native platform APIs (iOS Speech Framework and Android SpeechRecognizer) into JavaScript.
  • Hook Abstraction: Wrap the imperative @react-native-voice/voice API in a custom hook using useState, useEffect, and useCallback from packages/react/src/ReactHooks.js so that listeners are registered once and cleaned up automatically.
  • Resource Management: Always invoke Voice.destroy() in the useEffect cleanup function, which React runs on unmount, to prevent memory leaks and release native resources.
  • Permission Handling: Request runtime microphone permissions on Android using PermissionsAndroid and declare required keys in Info.plist for iOS before initiating speech recognition.

Frequently Asked Questions

How do I handle speech recognition errors in React Native?

Assign a handler wrapped in useCallback to Voice.onSpeechError and have it update an error state variable created with useState. Memoizing the handler keeps its identity stable, so it does not trigger unnecessary re-registrations of native listeners. Always clear the error state when restarting recognition to avoid displaying stale error messages in the UI.
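Raw error payloads are rarely suitable for display. The sketch below turns one into a readable message, under two assumptions: that the payload is either a string or an object with optional code and message fields, and that on Android the code (or the start of the message) is one of the numeric SpeechRecognizer error constants. describeSpeechError is a hypothetical helper, not part of @react-native-voice/voice.

```javascript
// Numeric error constants from android.speech.SpeechRecognizer.
const ANDROID_ERRORS = {
  1: 'Network timeout',
  2: 'Network error',
  3: 'Audio recording error',
  4: 'Server error',
  5: 'Client-side error',
  6: 'No speech detected',
  7: 'No match found',
  8: 'Recognizer is busy',
  9: 'Insufficient permissions',
};

// Hypothetical helper: converts an error payload into a displayable string.
function describeSpeechError(error) {
  if (!error) return null;
  if (typeof error === 'string') return error;

  // Assumed shape: {code?: string, message?: string}; try to read a leading number.
  const code = parseInt(error.code ?? error.message, 10);
  if (Number.isInteger(code) && ANDROID_ERRORS[code]) {
    return ANDROID_ERRORS[code];
  }
  // Unknown or non-numeric code: fall back to whatever message was supplied.
  return error.message || 'Speech recognition failed';
}
```

Storing the returned string in state, rather than the raw payload, also keeps the value safe to render inside a Text element.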

Can I use react native speech to text in the background?

Standard speech recognition APIs on both iOS and Android require the app to be in the foreground with an active audio session. Background speech-to-text is restricted by platform privacy policies and typically requires special entitlements on iOS or foreground services on Android that are beyond the scope of standard community libraries like @react-native-voice/voice.

What is the difference between react-native-voice and @react-native-voice/voice?

@react-native-voice/voice is the actively maintained fork of the original react-native-voice package. It supports modern React Native versions (0.60+) with autolinking, includes TypeScript definitions, and receives regular updates for iOS and Android compatibility. New projects should always use @react-native-voice/voice rather than the legacy package.

How do I prevent memory leaks when using speech recognition?

Always return a cleanup function from your useEffect hook that calls Voice.destroy(). React runs that cleanup when the component unmounts or when the effect dependencies change, so the recognizer and its audio resources are released. To also detach the JavaScript event handlers, chain Voice.removeAllListeners after Voice.destroy().
