Speech Provider - v0.1.4

    speech-provider

    A unified interface for browser speech synthesis and Eleven Labs voices.

    # Using npm
    npm install speech-provider

    # Using yarn
    yarn add speech-provider

    # Using bun
    bun add speech-provider

    Full API documentation is available at https://osteele.github.io/speech-provider/.

    import { getVoiceProvider } from 'speech-provider';

    // Each call below is an independent alternative; the names differ only
    // so that the snippet compiles as a single block.

    // Use browser voices only
    const provider = getVoiceProvider({});

    // Use Eleven Labs voices if API key is available
    const elevenLabsProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

    // Use Eleven Labs with custom cache duration
    const cachedProvider = getVoiceProvider({
      elevenLabsApiKey: 'your-api-key',
      cacheMaxAge: 86400, // Cache for 1 day (value is in seconds)
    });

    // Use Eleven Labs with volume normalization
    const normalizedProvider = getVoiceProvider({
      elevenLabsApiKey: 'your-api-key',
      normalizeVolume: true, // Enable volume normalization
    });

    // Get voices for a specific language
    const voices = await provider.getVoices({ lang: 'en-US', minVoices: 1 });

    // Get default voice for a language
    const defaultVoice = await provider.getDefaultVoice({ lang: 'en-US' });

    // Create and play an utterance
    if (defaultVoice) {
      const utterance = defaultVoice.createUtterance('Hello, world!');
      utterance.onstart = () => console.log('Started speaking');
      utterance.onend = () => console.log('Finished speaking');
      utterance.start();
    }

    • Unified interface for both browser speech synthesis and Eleven Labs voices
    • Automatic fallback to browser voices when Eleven Labs API key is not provided
    • Type-safe API with TypeScript support
    • Simple voice selection by language
    • Event listeners for speech start and end events
    • Efficient caching of Eleven Labs API responses using the browser's Cache API
    • Configurable cache duration for Eleven Labs responses
    • Audio volume normalization for Eleven Labs voices to ensure consistent volume levels

    This package is used in Mandarin Sentence Practice, a web application for practicing Mandarin Chinese with listening and translation exercises. The app uses this package to provide high-quality text-to-speech for Mandarin sentences, with automatic fallback to browser voices when Eleven Labs is not available.

    The package includes an interactive example in the examples directory that demonstrates both browser and Eleven Labs voice providers. To run it:

    1. View the live demo, or
    2. Open examples/demo.html directly in a browser, or
    3. Run bunx serve examples and open http://localhost:3000/demo.html

    The example includes:

    • API key management for Eleven Labs
    • Provider selection (Browser/Eleven Labs)
    • Language selection with system language detection
    • Voice selection with descriptions
    • Example sentences in multiple languages
    • Text-to-speech controls
    • Volume normalization for Eleven Labs voices

    getVoiceProvider creates a voice provider based on the available API keys. It falls back to browser speech synthesis if no API key is provided.

    function getVoiceProvider(options: {
      elevenLabsApiKey?: string | null;
      cacheMaxAge?: number | null; // Cache duration in seconds (default: 1 hour). Set to null to disable caching.
      normalizeVolume?: boolean; // Enable volume normalization for Eleven Labs voices (default: false)
    }): VoiceProvider;
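
The fallback rule described above can be sketched as a small pure function. This is an illustration only, not the package's implementation: resolveProviderChoice, ProviderOptions, and ProviderChoice are hypothetical names, and treating an empty string as "no key" is an assumption. The defaults (one hour, normalization off) are taken from the comments in the signature.

```typescript
// Hypothetical sketch; none of these names are part of speech-provider.
interface ProviderOptions {
  elevenLabsApiKey?: string | null;
  cacheMaxAge?: number | null; // seconds; null disables caching
  normalizeVolume?: boolean;
}

type ProviderChoice =
  | { kind: 'browser' }
  | {
      kind: 'elevenlabs';
      apiKey: string;
      cacheMaxAge: number | null;
      normalizeVolume: boolean;
    };

const DEFAULT_CACHE_MAX_AGE = 60 * 60; // 1 hour in seconds (documented default)

function resolveProviderChoice(options: ProviderOptions): ProviderChoice {
  const apiKey = options.elevenLabsApiKey;
  // Assumption: a missing, null, or empty key means "use browser voices".
  if (!apiKey) return { kind: 'browser' };
  return {
    kind: 'elevenlabs',
    apiKey,
    // Substitute the default only when the option is omitted;
    // an explicit null (caching disabled) is preserved.
    cacheMaxAge:
      options.cacheMaxAge === undefined ? DEFAULT_CACHE_MAX_AGE : options.cacheMaxAge,
    normalizeVolume: options.normalizeVolume ?? false,
  };
}
```

Keeping the decision in one place like this makes it easy to see that an omitted cacheMaxAge and an explicit null are handled differently.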

    createElevenLabsVoiceProvider creates an Eleven Labs voice provider with optional configuration.

    function createElevenLabsVoiceProvider(
      apiKey: string,
      baseUrl?: string,
      options?: {
        validateResponses?: boolean;
        printVoiceProperties?: boolean;
        cacheMaxAge?: number | null; // Cache duration in seconds (default: 1 hour). Set to null to disable caching.
        normalizeVolume?: boolean; // Enable volume normalization for more consistent audio levels (default: false)
      }
    ): VoiceProvider;

    The library implements efficient caching for Eleven Labs API responses using the browser's Cache API:

    • Browser voices are cached automatically by the browser's speech synthesis engine
    • Eleven Labs responses are cached using the browser's Cache API with a default duration of 1 hour
    • Cache duration can be configured when creating the provider
    • Cached responses are automatically invalidated after the specified duration
    • Cache can be disabled by setting cacheMaxAge: null in the provider options
    • The Cache API stores request/response pairs directly, which makes it a better fit than IndexedDB for caching network requests
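
The expiry behavior in the list above can be illustrated with a small max-age cache. This is a sketch under stated assumptions, not the package's code: it uses an in-memory Map as a stand-in for the browser's Cache API, MaxAgeCache is a hypothetical name, and the clock is injectable only so the behavior can be checked without waiting. A max age of null or 0 is treated as "caching disabled", matching the configuration options described here.

```typescript
// Hypothetical in-memory stand-in for the browser Cache API, used only to
// illustrate max-age expiry. None of these names come from speech-provider.
interface CacheEntry<T> {
  value: T;
  storedAtMs: number;
}

class MaxAgeCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private maxAgeSeconds: number | null;
  private now: () => number;

  constructor(maxAgeSeconds: number | null, now: () => number = () => Date.now()) {
    this.maxAgeSeconds = maxAgeSeconds;
    this.now = now;
  }

  get(key: string): T | undefined {
    const maxAge = this.maxAgeSeconds;
    if (maxAge === null || maxAge <= 0) return undefined; // caching disabled
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() - entry.storedAtMs > maxAge * 1000) {
      this.entries.delete(key); // expired: drop the entry and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    const maxAge = this.maxAgeSeconds;
    if (maxAge === null || maxAge <= 0) return; // caching disabled
    this.entries.set(key, { value, storedAtMs: this.now() });
  }
}
```

A caller would check get first and only request new audio on a miss, storing the result with set.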

    Examples of cache configuration:

    // Each call below is an independent alternative; the names differ only
    // so that the snippet compiles as a single block.

    // Use default 1-hour cache
    const defaultCacheProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

    // Cache for 1 day
    const dailyCacheProvider = getVoiceProvider({
      elevenLabsApiKey: 'your-api-key',
      cacheMaxAge: 86400, // 24 hours in seconds
    });

    // Cache for 1 week
    const weeklyCacheProvider = getVoiceProvider({
      elevenLabsApiKey: 'your-api-key',
      cacheMaxAge: 604800, // 7 days in seconds
    });

    // Disable caching (preferred approach)
    const uncachedProvider = getVoiceProvider({
      elevenLabsApiKey: 'your-api-key',
      cacheMaxAge: null,
    });

    // Alternative way to disable caching
    const zeroMaxAgeProvider = getVoiceProvider({
      elevenLabsApiKey: 'your-api-key',
      cacheMaxAge: 0,
    });

    The library includes a volume normalization feature for Eleven Labs voices to ensure consistent audio levels:

    • Automatically normalizes audio volume during playback using the Web Audio API
    • Helps maintain consistent volume levels across different voices and utterances
    • Uses a dynamics compressor to balance loud and quiet sections of audio
    • Can be enabled by setting normalizeVolume: true in the provider options
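
As a rough illustration of what a dynamics compressor does to audio levels, here is the static gain curve of a simple hard-knee compressor. It is a sketch under stated assumptions, not the package's code: compressDb is a hypothetical helper, the threshold and ratio values are made up for the example, and a real compressor (including the Web Audio API's DynamicsCompressorNode) also applies a knee, attack, and release that this ignores.

```typescript
// Static gain curve of a hard-knee compressor, working in decibels.
// thresholdDb and ratio are illustrative parameters, not the library's settings.
function compressDb(inputDb: number, thresholdDb: number, ratio: number): number {
  if (ratio < 1) throw new RangeError('ratio must be >= 1');
  // Below the threshold the signal passes through unchanged.
  if (inputDb <= thresholdDb) return inputDb;
  // Above the threshold, every `ratio` dB of input yields 1 dB of output.
  return thresholdDb + (inputDb - thresholdDb) / ratio;
}

// Convert between linear amplitude (0..1] and decibels relative to full scale.
const toDb = (amplitude: number): number => 20 * Math.log10(amplitude);
const fromDb = (db: number): number => Math.pow(10, db / 20);

const loudPeak = compressDb(-6, -24, 4); // -19.5
const quietPassage = compressDb(-30, -24, 4); // -30 (below threshold, unchanged)
```

With a threshold of -24 dB and a 4:1 ratio, the gap between the loud peak and the quiet passage shrinks from 24 dB to 10.5 dB, which is the "balancing" effect the list describes.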

    Examples of volume normalization configuration:

    // Each call below is an independent alternative; the names differ only
    // so that the snippet compiles as a single block.

    // Enable volume normalization
    const normalizedProvider = getVoiceProvider({
      elevenLabsApiKey: 'your-api-key',
      normalizeVolume: true,
    });

    // Enable volume normalization with custom cache settings
    const normalizedCachedProvider = getVoiceProvider({
      elevenLabsApiKey: 'your-api-key',
      normalizeVolume: true,
      cacheMaxAge: 86400, // 24 hours in seconds
    });

    // Direct use of createElevenLabsVoiceProvider (baseUrl left undefined)
    const directProvider = createElevenLabsVoiceProvider('your-api-key', undefined, {
      normalizeVolume: true,
    });

    interface VoiceProvider {
      name: string;
      getVoices({ lang, minVoices }: { lang: string; minVoices: number }): Promise<Voice[]>;
      getDefaultVoice({ lang }: { lang: string }): Promise<Voice | null>;
    }

    interface Voice {
      name: string;
      id: string;
      lang: string;
      provider: VoiceProvider;
      description: string | null;
      createUtterance(text: string): Utterance;
    }
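
Because getDefaultVoice resolves to null when no voice is available, a caller may want to try several languages in order of preference. The helper below is a hypothetical sketch built only on the documented method shapes; firstAvailableVoice and the trimmed local interfaces are assumptions for the example, not part of the package.

```typescript
// Local copies of the documented shapes, trimmed to what this sketch needs.
interface UtteranceLike {
  start(): void;
  stop(): void;
}
interface VoiceLike {
  name: string;
  lang: string;
  createUtterance(text: string): UtteranceLike;
}
interface VoiceProviderLike {
  name: string;
  getDefaultVoice(args: { lang: string }): Promise<VoiceLike | null>;
}

// Hypothetical helper: returns the default voice for the first language in
// `langs` that the provider can serve, or null if none of them match.
async function firstAvailableVoice(
  provider: VoiceProviderLike,
  langs: string[],
): Promise<VoiceLike | null> {
  for (const lang of langs) {
    const voice = await provider.getDefaultVoice({ lang });
    if (voice !== null) return voice;
  }
  return null;
}
```

For example, firstAvailableVoice(provider, ['zh-CN', 'zh-TW', 'en-US']) would prefer a Mainland Mandarin voice and only fall back to the later entries when the earlier ones yield null.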

    interface Utterance {
      start(): void;
      stop(): void;
      set onstart(callback: () => void);
      set onend(callback: () => void);
    }
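
The callback-style onend hook can be wrapped in a Promise so that callers can await the end of speech. This is a minimal sketch, not part of the package: speakAndWait and SpeakableUtterance are hypothetical names, and the local interface models onstart and onend as plain assignable properties for simplicity. Whether onend also fires after stop() is not stated here, so the promise may never settle if playback is stopped early.

```typescript
// Local stand-in for the documented Utterance shape; the callbacks are
// modeled as plain assignable properties for this sketch.
interface SpeakableUtterance {
  start(): void;
  stop(): void;
  onstart: () => void;
  onend: () => void;
}

// Hypothetical helper: start an utterance and resolve once it reports its end.
function speakAndWait(utterance: SpeakableUtterance): Promise<void> {
  return new Promise<void>((resolve) => {
    // Register the end handler before starting so the event cannot be missed.
    utterance.onend = () => resolve();
    utterance.start();
  });
}
```

With a helper like this, sentences can be played back to back by awaiting each call in turn instead of chaining onend callbacks by hand.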

    Copyright 2025 by Oliver Steele

    Released under the MIT License.