# <a href="https://palabra.ai"><img src="https://avatars.githubusercontent.com/u/199107821?s=32" alt="Palabra AI" align="center"></a> Palabra AI TypeScript Library

🌍 A TypeScript library for Palabra AI's real-time speech-to-speech translation API.
🚀 Break down language barriers and enable seamless communication across 25+ languages.

## Overview 📋

🎯 The `@palabra-ai/translator` TypeScript library enables you to integrate real-time speech translation into your Web applications.
Whether you're building a new application, enhancing an existing product, or streamlining business processes, this library has the tools you need.
With Palabra AI, you can:

* ⚡ Translate live speech in real time, making conversations smooth and natural
* 🎙️ Preserve the original speaker's voice and tone in translated speech
* 🔄 Convert spoken language instantly into accurate, readable text — great for captions, accessibility, and analysis

## Installation

```bash
pnpm add @palabra-ai/translator
yarn add @palabra-ai/translator
```

## Prerequisites

- A modern web browser (uses WebRTC and Web Audio APIs)
- [Palabra API credentials](https://docs.palabra.ai/docs/auth/obtaining_api_keys)

## Quick Start

Follow the steps below to run your first translation using Palabra AI's TypeScript library.

### 1. Get a local audio track

Provide a function that returns a `MediaStreamTrack` from the user's microphone. The library includes a helper for this:

```ts
import { getLocalAudioTrack } from '@palabra-ai/translator';
// ...
await client.stopPlayback();
await client.stopTranslation();
```

> [!NOTE]
> Browsers may restrict audio playback initiated without user interaction.
> Each browser may also define user interaction differently.
> (For example, Safari on iOS is restrictive.)
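
A common way to satisfy autoplay policies is to start playback only from an explicit user gesture. A minimal sketch, assuming a `client` created as in the Quick Start (the button id is illustrative):

```ts
// Start translated-audio playback only in response to a click,
// so the browser treats it as user-initiated audio
document.querySelector('#start-playback')?.addEventListener('click', async () => {
  await client.startPlayback();
});
```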

## API Reference

See TypeScript types for full API documentation.

## PalabraClient

The `PalabraClient` class is the main entry point for integration with the Palabra API.
It manages connection setup, session lifecycle, audio handling, transcription and translation events, and playback of translated speech.

### Features

Key features of `PalabraClient`:

* Connects to the Palabra API
* Manages translation sessions, language settings, and session configuration
* Emits events for transcription and translation results
* Plays translated audio in the browser

### Constructor

```ts
new PalabraClient(options: PalabraClientData)
```

#### Parameters

* `auth`: Authentication data (either `clientId`/`clientSecret` or `userToken`)
* `translateFrom`: Source language code (e.g., 'en')
* `translateTo`: Target language code (e.g., 'es')
* `handleOriginalTrack`: Function returning the original audio track (`MediaStreamTrack`)
* `apiBaseUrl` (optional): API URL (defaults to Palabra cloud)
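
For example, a client authenticated with a user token might be constructed like this (the token and language codes are placeholder values):

```ts
import { PalabraClient, getLocalAudioTrack } from '@palabra-ai/translator';

const client = new PalabraClient({
  auth: { userToken: 'YOUR_USER_TOKEN' }, // placeholder credential
  translateFrom: 'en',
  translateTo: 'es',
  handleOriginalTrack: getLocalAudioTrack, // helper returning the microphone track
});
```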

### Public Methods

- `startTranslation(): Promise<boolean>`
Starts a translation session and connects the audio stream.
Returns `true` on success.

- `stopTranslation(): Promise<void>`
Stops a translation session and disconnects the transport.

- `startPlayback(): Promise<void>`
Enables playback of translated audio in the browser.
Changes the target language for translation on the fly.

- `addTranslationTarget(langCode: TargetLangCode): Promise<void>`
Adds a target language for translation.

- `removeTranslationTarget(langCode: TargetLangCode | TargetLangCode[]): Promise<void>`
Removes one or more target languages from translation.
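
Managing multiple targets with these two methods might look like this (the language codes are illustrative):

```ts
// Add French as an additional translation target
await client.addTranslationTarget('fr');

// Later, remove one or more targets in a single call
await client.removeTranslationTarget(['fr', 'es']);
```
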
Unmutes the original audio track (microphone).

- `cleanup(): Promise<void>`
Stops translation and playback, releases resources, and resets the client to its initial state.

## Events

The `PalabraClient` class provides events that let you track connection status, receive audio tracks, and handle transcription and translation results.

You can use these events to update your UI, handle errors, and get real-time updates during the speech processing flow — from connecting to receiving translated audio and text.

```ts
client.on(EVENT_REMOTE_TRACKS_UPDATE, (tracksData) => {
  // ...
});
```

`EVENT_REMOTE_TRACKS_UPDATE` - An update has occurred to the set of remote audio tracks. (Use this event to access new audio streams.)
`EVENT_ROOM_CONNECTED` - The WebRTC room connection was established.
`EVENT_ROOM_DISCONNECTED` - The WebRTC room connection closed or lost.
`EVENT_CONNECTION_STATE_CHANGED` - The connection state has changed (e.g., connecting, connected, disconnected).
`EVENT_DATA_RECEIVED` - Custom data or messages have been received from the server via the WebRTC data channel.
`EVENT_START_TRANSLATION` - The translation process has started.
`EVENT_STOP_TRANSLATION` - The translation process has stopped.
`EVENT_TRANSCRIPTION_RECEIVED` - The full transcription (recognized text) of the source audio has been received.
`EVENT_TRANSLATION_RECEIVED` - The full, written translation of the source audio has been received.
`EVENT_PARTIAL_TRANSLATED_TRANSCRIPTION_RECEIVED` - A partial translation of the transcription has been received.
`EVENT_PARTIAL_TRANSCRIPTION_RECEIVED` - A partial transcription has been received. (Useful for real-time updates.)
`EVENT_PIPELINE_TIMINGS_RECEIVED` - Timing or performance data about the translation pipeline has been received. (Useful for diagnostics and/or analytics.)
`EVENT_ERROR_RECEIVED` - An error in the translation or streaming process has occurred.
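
For live captions, the partial events are the ones to subscribe to. A sketch (the shape of the event payloads is an assumption, not part of the documented API — inspect them in your app):

```ts
client.on(EVENT_PARTIAL_TRANSCRIPTION_RECEIVED, (data) => {
  // Render intermediate text as the user speaks; payload shape is hypothetical
  console.log('partial transcription:', data);
});

client.on(EVENT_ERROR_RECEIVED, (error) => {
  console.error('translation error:', error);
});
```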

## Usage Examples

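
### Basic usage

A minimal end-to-end flow, assembled from the constructor parameters and public methods documented above (credentials and language codes are placeholders):

```ts
import { PalabraClient, getLocalAudioTrack } from '@palabra-ai/translator';

const client = new PalabraClient({
  auth: { clientId: 'YOUR_CLIENT_ID', clientSecret: 'YOUR_CLIENT_SECRET' }, // placeholders
  translateFrom: 'en',
  translateTo: 'es',
  handleOriginalTrack: getLocalAudioTrack,
});

await client.startTranslation(); // establish the session and connect audio
await client.startPlayback();    // play translated audio in the browser

// ... when finished
await client.stopPlayback();
await client.stopTranslation();
```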

### Advanced: Output translated audio to a custom `<audio>` element

Listen for the `EVENT_REMOTE_TRACKS_UPDATE` event to get the translated audio tracks and play them in your own `<audio>` element:

```ts
import { PalabraClient, getLocalAudioTrack } from '@palabra-ai/translator';
// ... create and start the client as shown above ...

const startPlayback = () => {
  // ...
};
```

---

The examples above show how to integrate Palabra's real-time translation into any web application and control audio output as needed.

## Development Setup

The repository is a monorepo containing two main packages:
- `@palabra-ai/translator`: The main library package (`packages/lib`)
- `dev-app`: A Vue.js development application for testing the library (`packages/dev-app`)

### Prerequisites

- [Node.js](https://nodejs.org/en) (latest LTS version recommended)
- [pnpm](https://pnpm.io/)

### Installation

```bash
pnpm install
```

### Running in Development Mode

#### Library Development

Run the library in watch mode (auto-rebuild on changes):

```bash
cd packages/lib
pnpm dev
```

#### Development Application

Run the dev app with hot-reload:

```bash
cd packages/dev-app
pnpm dev
```

Open `http://localhost:5173` in your browser to view the dev app.

### More Commands

#### Library Package (`packages/lib`)

- `pnpm build` - Build the library
- `pnpm test` - Run tests
- `pnpm lint` - Run linting

#### Development App (`packages/dev-app`)

- `pnpm build` - Build for production
- `pnpm dev` - Run dev application
- `pnpm dev` - Run dev app

## Supported Languages

### Speech Recognition Languages

🇸🇦 Arabic (AR), 🇨🇳 Chinese (ZH), 🇨🇿 Czech (CS), 🇩🇰 Danish (DA), 🇳🇱 Dutch (NL), 🇬🇧 English (EN), 🇫🇮 Finnish (FI), 🇫🇷 French (FR), 🇩🇪 German (DE), 🇬🇷 Greek (EL), 🇮🇱 Hebrew (HE), 🇭🇺 Hungarian (HU), 🇮🇹 Italian (IT), 🇯🇵 Japanese (JA), 🇰🇷 Korean (KO), 🇵🇱 Polish (PL), 🇵🇹 Portuguese (PT), 🇷🇺 Russian (RU), 🇪🇸 Spanish (ES), 🇹🇷 Turkish (TR), 🇺🇦 Ukrainian (UK)

### Translation Languages

🇸🇦 Arabic (AR), 🇧🇬 Bulgarian (BG), 🇨🇳 Chinese Mandarin (ZH), 🇨🇿 Czech (CS), 🇩🇰 Danish (DA), 🇳🇱 Dutch (NL), 🇬🇧 English UK (EN_GB), 🇺🇸 English US (EN_US), 🇫🇮 Finnish (FI), 🇫🇷 French (FR), 🇩🇪 German (DE), 🇬🇷 Greek (EL), 🇮🇱 Hebrew (HE), 🇭🇺 Hungarian (HU), 🇮🇩 Indonesian (ID), 🇮🇹 Italian (IT), 🇯🇵 Japanese (JA), 🇰🇷 Korean (KO), 🇵🇱 Polish (PL), 🇵🇹 Portuguese (PT), 🇧🇷 Portuguese Brazilian (PT_BR), 🇷🇴 Romanian (RO), 🇷🇺 Russian (RU), 🇸🇰 Slovak (SK), 🇪🇸 Spanish (ES), 🇲🇽 Spanish Mexican (ES_MX), 🇸🇪 Swedish (SV), 🇹🇷 Turkish (TR), 🇺🇦 Ukrainian (UK), 🇻🇳 Vietnamese (VN)

## License
Expand Down