esp_tts_parse_pinyin crashes with assert failure instead of returning error (AIS-2253) #186

@cnadler86

Description

Checklist

  • Checked the issue tracker for similar issues to ensure this is not a duplicate
  • Read the documentation to confirm the issue is not addressed there and your configuration is set correctly
  • Tested with the latest version to ensure the issue hasn't been fixed

How often does this bug occur?

always

Expected behavior

int esp_tts_parse_pinyin(esp_tts_handle_t tts_handle, const char *pinyin);

The function should:

  1. Return 0 if parsing fails (syllable not found in voice set)
  2. Return 1 if parsing succeeds
  3. NOT crash the system under any circumstances
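For reference, caller code that relies on this contract would look roughly like the sketch below. It is written against a function pointer so that it builds off-target. The names parse_pinyin_fn, try_parse_pinyin and demo_parser are hypothetical and not part of esp-tts; on the device, esp_tts_parse_pinyin would be called directly where the pointer is used.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in with the same shape as
 * int esp_tts_parse_pinyin(esp_tts_handle_t, const char *). */
typedef int (*parse_pinyin_fn)(void *tts_handle, const char *pinyin);

/* Returns 1 if the parser accepted the text, 0 if it was rejected.
 * This only works if the parser returns 0 on failure instead of asserting. */
static int try_parse_pinyin(parse_pinyin_fn parse, void *tts_handle,
                            const char *pinyin)
{
    if (parse == NULL || pinyin == NULL || pinyin[0] == '\0') {
        return 0;
    }
    if (parse(tts_handle, pinyin) == 0) {
        fprintf(stderr, "pinyin rejected: %s\n", pinyin);
        return 0; /* caller can skip playback or fall back to other text */
    }
    return 1;
}

/* Demo parser for off-target testing: accepts only the syllable "ma1". */
static int demo_parser(void *tts_handle, const char *pinyin)
{
    (void)tts_handle;
    return strcmp(pinyin, "ma1") == 0 ? 1 : 0;
}
```

With a parser that honours the documented return values, try_parse_pinyin(demo_parser, NULL, "ni3 hao3") returns 0 and the application keeps running, which is the behaviour expected from the real function.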

Actual behavior (suspected bug)

The function crashes with an assert failure before returning:

E (16568) tts_parser: Fail to search ni3 hao3 in voice set xiaole_20220719

assert failed: esp_tts_utt_append esp_tts_parser.c:93 (syll_idx>=0)

Error logs or terminal output

Steps to reproduce the behavior

Code Example

#include "esp_tts.h"
#include "esp_tts_voice_template.h"
#include "esp_partition.h"

// Initialize voice set from partition
const esp_partition_t* part = esp_partition_find_first(
    ESP_PARTITION_TYPE_DATA, 
    ESP_PARTITION_SUBTYPE_DATA_FAT, 
    "voice_data"
);

spi_flash_mmap_handle_t mmap;
uint16_t *voicedata;
esp_partition_mmap(part, 0, part->size, ESP_PARTITION_MMAP_DATA, 
                   (const void**)&voicedata, &mmap);

esp_tts_voice_t *voice = esp_tts_voice_set_init(&esp_tts_voice_template, voicedata);
esp_tts_handle_t tts_handle = esp_tts_create(voice);

// Test 1: Pinyin with tone numbers (crashes)
esp_tts_stream_reset(tts_handle);
int result = esp_tts_parse_pinyin(tts_handle, "ni3 hao3");
// Never reaches here - system crashes with assert

// Test 2: Pinyin without tone numbers (also crashes)
esp_tts_stream_reset(tts_handle);
result = esp_tts_parse_pinyin(tts_handle, "ni hao");
// Never reaches here - system crashes with assert

// Test 3: Single syllable (also crashes)
esp_tts_stream_reset(tts_handle);
result = esp_tts_parse_pinyin(tts_handle, "ma");
// Never reaches here - system crashes with assert

Project release version

latest

System architecture

Intel/AMD 64-bit (modern PC, older Mac)

Operating system

Linux

Operating system version

WSL ubuntu 22.04

Shell

Bash

Additional context

Decoded Backtrace

0x420c6149: esp_tts_utt_append at esp_tts_parser.c:93
0x420c678f: esp_tts_parser_pinyin at esp_tts_parser.c:488  
0x420c58b4: esp_tts_parse_pinyin at esp_tts.c:144

Console Output

input:ni3 hao3
data:ni3 hao3
word_num:1, i:8
E (16568) tts_parser: Fail to search ni3 hao3 in voice set xiaole_20220719

A fatal error occurred. The crash dump printed below may be used to help
determine what caused it.

assert failed: esp_tts_utt_append esp_tts_parser.c:93 (syll_idx>=0)

Backtrace: [see above]

ELF file SHA256: 8e92259d3

Rebooting...

Analysis

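The issue appears to be in the precompiled library libesp_tts_chinese.a. Judging by the backtrace, esp_tts_utt_append() uses assert(syll_idx >= 0) instead of returning an error when a syllable is not found in the voice set, so the failure that the parser has already logged cannot be reported to the caller.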

The error flow is:

  1. esp_tts_parse_pinyin() is called
  2. esp_tts_parser_pinyin() tries to find syllables
  3. esp_tts_utt_append() is called with syll_idx = -1 (not found)
  4. Assert fires → System crashes
  5. Function never returns the documented error code
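The flow above can be modelled off-target to show what error propagation could look like. This is only a sketch: the library source is not public, so utterance_t, voice_set_find, utt_append_checked and parse_syllable are hypothetical names and do not reflect the real implementation. The point is that the append step returns a failure code for a negative index instead of asserting.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYLLABLES 16

/* Hypothetical utterance buffer; the real structure is not public. */
typedef struct {
    int syll_idx[MAX_SYLLABLES];
    size_t count;
} utterance_t;

/* Hypothetical voice-set lookup: returns the syllable index, or -1 if absent. */
static int voice_set_find(const char *const *syllables, size_t n,
                          const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(syllables[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* Append that reports failure instead of asserting.
 * Returns 1 on success, 0 on a negative index or a full buffer. */
static int utt_append_checked(utterance_t *utt, int syll_idx)
{
    if (utt == NULL || syll_idx < 0 || utt->count >= MAX_SYLLABLES) {
        return 0;
    }
    utt->syll_idx[utt->count++] = syll_idx;
    return 1;
}

/* Look up one syllable and append it, passing any failure to the caller. */
static int parse_syllable(utterance_t *utt, const char *const *syllables,
                          size_t n, const char *name)
{
    int idx = voice_set_find(syllables, n, name);
    return utt_append_checked(utt, idx);
}
```

In this model, looking up the whole string "ni3 hao3" as a single syllable (which is what the log line "Fail to search ni3 hao3" suggests happened) makes parse_syllable return 0 and leaves the utterance unchanged, so the top-level function could return the documented error code.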
