Description
The API docs and Morpheme interface document a reading field ("Reading in katakana", e.g. "タベ"), but analyze() always returns "" for every morpheme in v0.9.3.
Reproduction
import { Suzume } from "@libraz/suzume"
const s = await Suzume.create()
console.log(s.version) // "0.9.3"
for (const m of s.analyze("食べる")) {
  console.log(m.surface, "→ reading:", JSON.stringify(m.reading))
}
// 食べる → reading: ""
// Expected: reading: "タベル"
for (const m of s.analyze("東京に行きました")) {
  console.log(m.surface, "→ reading:", JSON.stringify(m.reading))
}
// 東京 → reading: "" (expected: "トウキョウ")
// に → reading: ""
// 行き → reading: "" (expected: "イキ")
// まし → reading: ""
// た → reading: ""
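For checking longer inputs, a small helper along these lines can list which surfaces came back without a reading. This is only a sketch: `MorphemeLike` and `surfacesWithoutReading` are names made up for this report, and the interface covers just the two fields used above, not the full Morpheme interface.

```typescript
// Hypothetical minimal shape: only the two fields used in this report.
interface MorphemeLike {
  surface: string;
  reading: string;
}

// Collects the surface of every morpheme whose reading is the empty string.
function surfacesWithoutReading(morphemes: readonly MorphemeLike[]): string[] {
  const missing: string[] = [];
  for (const m of morphemes) {
    if (m.reading === "") {
      missing.push(m.surface);
    }
  }
  return missing;
}
```

Passing the morphemes from `s.analyze(...)` as an array should then return every surface in the input on v0.9.3, given the behavior shown above.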
Environment
- @libraz/suzume@0.9.3
- Node.js v22.18.0
- Also tested in browser (Obsidian/Electron)
Analysis
Looking at the C++ source, analyzer.cpp does copy edge.reading into the morpheme:
if (!edge.reading.empty()) {
  morpheme.reading = std::string(edge.reading);
}
But the embedded minimal dictionary does not seem to contain reading data, so edge.reading is always empty.
Question
Is reading support a planned feature? Or is there a way to provide reading data (e.g. via user dictionary or a supplementary dictionary)?
We are building a furigana annotation tool and would love to use Suzume for its small size, but reading data is essential for our use case.
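To make the second question concrete, this is roughly the kind of caller-side fallback we would otherwise have to maintain ourselves. It is a sketch under assumptions, not a Suzume feature: `withFallbackReadings` and the `fallback` map are hypothetical names, and a plain surface-to-reading lookup cannot disambiguate words with more than one reading.

```typescript
// Hypothetical minimal shape: only the two fields used in this report.
interface MorphemeLike {
  surface: string;
  reading: string;
}

// Fills empty readings from a caller-supplied surface -> katakana table.
// The table is an assumption of this sketch; Suzume does not provide it.
function withFallbackReadings(
  morphemes: readonly MorphemeLike[],
  fallback: ReadonlyMap<string, string>,
): MorphemeLike[] {
  return morphemes.map((m) => ({
    surface: m.surface,
    // Keep an analyzer-provided reading when there is one.
    reading: m.reading !== "" ? m.reading : (fallback.get(m.surface) ?? ""),
  }));
}
```

Native reading data in the dictionary, or an officially supported way to supply it, would be far preferable to a lookup table like this.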