Speech Provider - v0.1.4

speech-provider

A unified interface for browser speech synthesis and Eleven Labs voices.

Installation

# Using npm
npm install speech-provider

# Using yarn
yarn add speech-provider

# Using bun
bun add speech-provider

Documentation

Full API documentation is available at https://osteele.github.io/speech-provider/.

Usage

import { getVoiceProvider } from 'speech-provider';

// Use browser voices only
const provider = getVoiceProvider({});

// Use Eleven Labs voices when an API key is provided
const elevenLabsProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Use Eleven Labs with custom cache duration
const cachedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // Cache for 1 day
});

// Use Eleven Labs with volume normalization
const normalizedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  normalizeVolume: true // Enable volume normalization
});

// Get voices for a specific language
const voices = await provider.getVoices({ lang: 'en-US', minVoices: 1 });

// Get default voice for a language
const defaultVoice = await provider.getDefaultVoice({ lang: 'en-US' });

// Create and play an utterance
if (defaultVoice) {
  const utterance = defaultVoice.createUtterance('Hello, world!');
  utterance.onstart = () => console.log('Started speaking');
  utterance.onend = () => console.log('Finished speaking');
  utterance.start();
}

Features

Unified interface for both browser speech synthesis and Eleven Labs voices
Automatic fallback to browser voices when Eleven Labs API key is not provided
Type-safe API with TypeScript support
Simple voice selection by language
Event listeners for speech start and end events
Efficient caching of Eleven Labs API responses using the browser's Cache API
Configurable cache duration for Eleven Labs responses
Audio volume normalization for Eleven Labs voices to ensure consistent volume levels

Used In

This package is used in Mandarin Sentence Practice, a web application for practicing Mandarin Chinese with listening and translation exercises. The app uses this package to provide high-quality text-to-speech for Mandarin sentences, with automatic fallback to browser voices when Eleven Labs is not available.

Examples

The package includes an interactive example in the examples directory that demonstrates both browser and Eleven Labs voice providers. To run it:

View the live demo, or
Open examples/demo.html directly in a browser, or
Run bunx serve examples and open http://localhost:3000/demo.html

The example includes:

API key management for Eleven Labs
Provider selection (Browser/Eleven Labs)
Language selection with system language detection
Voice selection with descriptions
Example sentences in multiple languages
Text-to-speech controls
Volume normalization for Eleven Labs voices

API

`getVoiceProvider(options)`

Creates a voice provider based on the available API keys. Falls back to browser speech synthesis if no API keys are provided.

function getVoiceProvider(options: {
  elevenLabsApiKey?: string | null;
  cacheMaxAge?: number | null; // Cache duration in seconds (default: 1 hour). Set to null to disable caching.
  normalizeVolume?: boolean; // Enable volume normalization for Eleven Labs voices (default: false)
}): VoiceProvider;
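
Because the fallback depends on whether `elevenLabsApiKey` is set, it can help to tidy a key read from user input or storage before passing it in. The following is a minimal sketch, not part of the package: `ProviderOptions` and `buildProviderOptions` are hypothetical names, and the sketch assumes that a blank key should be treated as absent.

```typescript
// Hypothetical local mirror of the options shown in the signature above.
interface ProviderOptions {
  elevenLabsApiKey?: string | null;
  cacheMaxAge?: number | null;
  normalizeVolume?: boolean;
}

// Hypothetical helper: trims the raw key and maps blank or missing input
// to null, so that an unusable key selects the browser voices instead.
function buildProviderOptions(
  rawKey: string | null | undefined,
  extra: Omit<ProviderOptions, 'elevenLabsApiKey'> = {}
): ProviderOptions {
  const trimmed = rawKey?.trim() ?? '';
  return { ...extra, elevenLabsApiKey: trimmed === '' ? null : trimmed };
}
```

The result could then be passed straight through, for example `getVoiceProvider(buildProviderOptions(storedKey, { normalizeVolume: true }))`, where `storedKey` is whatever value the application has saved.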

`createElevenLabsVoiceProvider(apiKey, baseUrl?, options?)`

Creates an Eleven Labs voice provider with optional configuration.

function createElevenLabsVoiceProvider(
  apiKey: string,
  baseUrl?: string,
  options?: {
    validateResponses?: boolean;
    printVoiceProperties?: boolean;
    cacheMaxAge?: number | null; // Cache duration in seconds (default: 1 hour). Set to null to disable caching.
    normalizeVolume?: boolean; // Enable volume normalization for more consistent audio levels (default: false)
  }
): VoiceProvider;

Caching

The library implements efficient caching for Eleven Labs API responses using the browser's Cache API:

Browser voices are cached automatically by the browser's speech synthesis engine
Eleven Labs responses are cached using the browser's Cache API with a default duration of 1 hour
Cache duration can be configured when creating the provider
Cached responses are automatically invalidated after the specified duration
Cache can be disabled by setting cacheMaxAge: null in the provider options
The Cache API stores request/response pairs directly, which suits network responses better than IndexedDB

Examples of cache configuration:

// Use default 1-hour cache
const defaultCacheProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Cache for 1 day
const dailyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // 24 hours in seconds
});

// Cache for 1 week
const weeklyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 604800 // 7 days in seconds
});

// Disable caching (preferred approach)
const uncachedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: null
});

// Alternative way to disable caching
const zeroAgeProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 0
});
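
The expiry rules described in this section can be modeled without a browser. This is a sketch under stated assumptions, not the library's implementation: `TtlCache` is a hypothetical in-memory stand-in for the Cache API, and it assumes that `cacheMaxAge` is in seconds, defaults to one hour, and that `null` or `0` disables caching.

```typescript
const DEFAULT_MAX_AGE_SECONDS = 3600; // 1 hour, matching the documented default

// Hypothetical in-memory cache that expires entries after maxAgeSeconds.
class TtlCache<V> {
  private entries = new Map<string, { value: V; storedAtMs: number }>();
  private maxAgeSeconds: number | null;
  private now: () => number;

  // The clock is injectable so that expiry can be exercised in tests.
  constructor(
    maxAgeSeconds: number | null = DEFAULT_MAX_AGE_SECONDS,
    now: () => number = Date.now
  ) {
    this.maxAgeSeconds = maxAgeSeconds;
    this.now = now;
  }

  // Caching is off when the max age is null or not positive.
  private isEnabled(): boolean {
    return this.maxAgeSeconds !== null && this.maxAgeSeconds > 0;
  }

  set(key: string, value: V): void {
    if (!this.isEnabled()) return;
    this.entries.set(key, { value, storedAtMs: this.now() });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!this.isEnabled() || !entry) return undefined;
    const ageMs = this.now() - entry.storedAtMs;
    if (ageMs > (this.maxAgeSeconds as number) * 1000) {
      this.entries.delete(key); // stale: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }
}
```

A miss (including an expired entry) would be the point at which a real provider fetches from the Eleven Labs API again and stores the fresh response.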

Volume Normalization

The library includes a volume normalization feature for Eleven Labs voices to ensure consistent audio levels:

Automatically normalizes audio volume during playback using the Web Audio API
Helps maintain consistent volume levels across different voices and utterances
Uses a dynamics compressor to balance loud and quiet sections of audio
Can be enabled by setting normalizeVolume: true in the provider options

Examples of volume normalization configuration:

// Enable volume normalization
const normalizedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  normalizeVolume: true
});

// Enable volume normalization with custom cache settings
const normalizedCachedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  normalizeVolume: true,
  cacheMaxAge: 86400 // 24 hours in seconds
});

// Create the Eleven Labs provider directly (the second argument is the optional baseUrl)
const directProvider = createElevenLabsVoiceProvider('your-api-key', undefined, {
  normalizeVolume: true
});
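
As a rough illustration of the dynamics-compressor approach described in this section, the sketch below routes a source node through a compressor before the output. It is not the library's code: the function name, the `...Like` interfaces, and the parameter values are all assumptions. The interfaces are minimal stand-ins modeled on the Web Audio API's `AudioContext` and `DynamicsCompressorNode`, so the wiring can be read and tested outside a browser.

```typescript
// Minimal structural stand-ins for the Web Audio API types used here.
interface AudioNodeLike {
  connect(destination: AudioNodeLike): unknown;
}
interface AudioParamLike {
  value: number;
}
interface CompressorLike extends AudioNodeLike {
  threshold: AudioParamLike; // level (dB) above which compression starts
  knee: AudioParamLike;      // width (dB) of the soft transition
  ratio: AudioParamLike;     // input change (dB) per 1 dB of output change
  attack: AudioParamLike;    // seconds to start reducing gain
  release: AudioParamLike;   // seconds to restore gain
}
interface AudioContextLike {
  destination: AudioNodeLike;
  createDynamicsCompressor(): CompressorLike;
}

// Hypothetical helper: source -> compressor -> output.
// The parameter values are illustrative, not the library's settings.
function connectThroughCompressor(
  ctx: AudioContextLike,
  source: AudioNodeLike
): CompressorLike {
  const compressor = ctx.createDynamicsCompressor();
  compressor.threshold.value = -24;
  compressor.knee.value = 30;
  compressor.ratio.value = 12;
  compressor.attack.value = 0.003;
  compressor.release.value = 0.25;
  source.connect(compressor);
  compressor.connect(ctx.destination);
  return compressor;
}
```

In a browser, the `source` node would typically come from a call such as `audioContext.createMediaElementSource(audioElement)`.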

`VoiceProvider` Interface

interface VoiceProvider {
  name: string;
  getVoices({ lang, minVoices }: { lang: string; minVoices: number }): Promise<Voice[]>;
  getDefaultVoice({ lang }: { lang: string }): Promise<Voice | null>;
}
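
One way to use `getDefaultVoice` is to try several language tags in order of preference. The sketch below relies only on the method shown in the interface above. `findFirstVoice` is a hypothetical helper, and `VoiceLike` and `VoiceProviderLike` are trimmed local copies of the package's types so that the snippet stands alone.

```typescript
// Trimmed local copies of the documented interfaces (assumed shapes).
interface VoiceLike {
  name: string;
  lang: string;
}
interface VoiceProviderLike {
  getDefaultVoice(args: { lang: string }): Promise<VoiceLike | null>;
}

// Hypothetical helper: returns the default voice for the first language
// that has one, or null if none of the languages is available.
async function findFirstVoice(
  provider: VoiceProviderLike,
  langs: string[]
): Promise<VoiceLike | null> {
  for (const lang of langs) {
    const voice = await provider.getDefaultVoice({ lang });
    if (voice) return voice;
  }
  return null;
}
```

For example, `await findFirstVoice(provider, ['zh-CN', 'zh-TW', 'en-US'])` would prefer a Mainland Mandarin voice and fall back to the later tags only if needed.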

`Voice` Interface

interface Voice {
  name: string;
  id: string;
  lang: string;
  provider: VoiceProvider;
  description: string | null;
  createUtterance(text: string): Utterance;
}

`Utterance` Interface

interface Utterance {
  start(): void;
  stop(): void;
  set onstart(callback: () => void);
  set onend(callback: () => void);
}
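
Since an utterance reports completion through `onend`, it can be wrapped in a Promise when speech needs to be sequenced with other async work. This is a minimal sketch: `speakAndWait` and `UtteranceLike` are hypothetical names, and the sketch assumes `onend` fires once playback finishes. It does not handle errors, and whether `onend` also fires after `stop()` is not stated here.

```typescript
// Trimmed local copy of the documented interface (assumed shape).
// onend is modeled as a plain assignable property for simplicity.
interface UtteranceLike {
  start(): void;
  onend: () => void;
}

// Hypothetical helper: starts the utterance and resolves when it ends.
function speakAndWait(utterance: UtteranceLike): Promise<void> {
  return new Promise<void>((resolve) => {
    utterance.onend = () => resolve();
    utterance.start();
  });
}
```

With this helper, two sentences could be spoken back to back by awaiting `speakAndWait(voice.createUtterance(first))` before creating the second utterance.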

License

MIT
