analyze() returns empty string for reading field on all morphemes #1

@aptend

Description

The API docs and Morpheme interface document a reading field ("Reading in katakana", e.g. "タベ"), but analyze() always returns "" for every morpheme in v0.9.3.

Reproduction

import { Suzume } from "@libraz/suzume"
const s = await Suzume.create()
console.log(s.version) // "0.9.3"

for (const m of s.analyze("食べる")) {
  console.log(m.surface, "→ reading:", JSON.stringify(m.reading))
}
// 食べる → reading: ""
// Expected: reading: "タベル"

for (const m of s.analyze("東京に行きました")) {
  console.log(m.surface, "→ reading:", JSON.stringify(m.reading))
}
// 東京 → reading: ""  (expected: "トウキョウ")
// に → reading: ""
// 行き → reading: ""  (expected: "イキ")
// まし → reading: ""
// た → reading: ""
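
To make the failure easy to assert in a test, the check can be reduced to a small helper that counts morphemes with an empty reading. This is only a sketch: `summarizeReadings` is a hypothetical name, and the sample array is copied from the output observed above rather than produced by a live `analyze()` call. It assumes only that each morpheme exposes `surface` and `reading`.

```javascript
// Hypothetical helper: report which morphemes came back without a reading.
// `morphemes` is any array of objects shaped like { surface, reading }.
function summarizeReadings(morphemes) {
  const missing = morphemes
    .filter((m) => !m.reading) // "" (or undefined) counts as missing
    .map((m) => m.surface);
  return { total: morphemes.length, missing };
}

// Sample data copied from the v0.9.3 output shown above, not a live call.
const observed = [
  { surface: "東京", reading: "" },
  { surface: "に", reading: "" },
  { surface: "行き", reading: "" },
  { surface: "まし", reading: "" },
  { surface: "た", reading: "" },
];

const report = summarizeReadings(observed);
console.log(`${report.missing.length}/${report.total} morphemes have no reading`);
// prints "5/5 morphemes have no reading"
```

With a real instance, the same helper could be called as `summarizeReadings(s.analyze("東京に行きました"))`.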

Environment

  • @libraz/suzume@0.9.3
  • Node.js v22.18.0
  • Also tested in browser (Obsidian/Electron)

Analysis

Looking at the C++ source, analyzer.cpp does copy edge.reading into the morpheme:

if (!edge.reading.empty()) {
    morpheme.reading = std::string(edge.reading);
}

But the embedded minimal dictionary does not seem to contain reading data, so edge.reading is always empty.

Question

Is reading support a planned feature? Or is there a way to provide reading data (e.g. via user dictionary or a supplementary dictionary)?

We are building a furigana annotation tool and would love to use Suzume for its small size, but reading data is essential for our use case.
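
For context, here is a rough sketch of the application-side fallback we are using until this is resolved. Everything in it is our own assumption, not part of Suzume: `supplementaryReadings`, `katakanaToHiragana`, and `resolveReading` are hypothetical names, and the lookup table is hand-written. Keying on the surface form alone is naive, since a reading can depend on context, which is why we would prefer readings from the analyzer itself.

```javascript
// Hypothetical application-side table: surface form -> katakana reading.
// Hand-maintained; not provided by Suzume.
const supplementaryReadings = new Map([
  ["東京", "トウキョウ"],
  ["行き", "イキ"],
]);

// Convert katakana (U+30A1..U+30F6) to hiragana by shifting each
// code point down by 0x60. Other characters are left unchanged.
function katakanaToHiragana(text) {
  return text.replace(/[\u30A1-\u30F6]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x60)
  );
}

// Prefer the analyzer's reading; fall back to the table; otherwise "".
function resolveReading(morpheme, table = supplementaryReadings) {
  const katakana = morpheme.reading || table.get(morpheme.surface) || "";
  return katakanaToHiragana(katakana);
}

console.log(resolveReading({ surface: "東京", reading: "" }));
// prints "とうきょう"
```

If `reading` were populated, the table would never be consulted and only the katakana-to-hiragana step would remain.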
