Synthesized audio is incorrect when synthesized within an installed VS Code extension #28

@deinhofer

Description

I have added some test infrastructure to the extension project so that I can easily compare the audio output of different providers for the same text; see Running Commandline Tests.

I have also added some debug messages to js-tts-wrapper to check what is actually sent to the API; see PR#27.

When I compare the log messages for examples/Simple-Speechmarkdown-Examples.smd, I get two different results:

Run node script locally

Running the Node script against the locally linked js-tts-wrapper with node test/src/test-simple-smd.js, I get:

Azure SSML warnings: [
  "Engine 'azure' requires xmlns attribute in <speak> tag.",
  "Engine 'azure' requires version attribute in <speak> tag."
]
AzureTTSClient.synthToBytes - TTS text <speak xml:lang="en-US" version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"><voice name="zh-CN-XiaoxiaoMultilingualNeural"><prosody rate="medium" pitch="medium" volume="100%">MS Azure test
There is a short pause <break time="500ms"/>, before I continue.
I can make text <emphasis level="strong">important</emphasis> or use very emphasised and slightly emphasised.
<prosody rate="x-slow">I can speak text slow</prosody> and <prosody rate="x-fast">I can speak text fast</prosody>.
<prosody pitch="high">I can speak text high</prosody> and <prosody pitch="low">I can speak text low</prosody>.
<prosody volume="loud">I can speak text loud</prosody> and <prosody volume="soft">I can speak text soft</prosody>.</prosody></voice></speak>, Options: {"useSpeechMarkdown":true,"format":"mp3"}

Run within installed extension

When I run it within the installed extension v0.0.20-debug1, I get the following log message:

console.ts:139 [Extension Host] Converting Speech Markdown to SSML for Azure.
console.ts:139 [Extension Host] oA.synthToBytes - TTS text <speak xml:lang="en-US" version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"><voice name="zh-CN-XiaoxiaoMultilingualNeural"><prosody rate="medium" pitch="medium" volume="100%">There is a short pause <break time="500ms"/>, before I continue.
I can make text ++important++ or use (very emphasised)[emphasis:'strong'] and (slightly emphasised)[emphasis:'reduced'].
(I can speak text slow)[rate:'x-slow'] and (I can speak text fast)[rate:'x-fast'].
(I can speak text high)[pitch:"high"] and (I can speak text low)[pitch:"low"].
(I can speak text loud)[volume:"loud"] and (I can speak text soft)[volume:"soft"].</prosody></voice></speak>, Options: {"useSpeechMarkdown":true,"format":"mp3"}
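
To spot this kind of payload automatically, a small check like the one below could be run on the text that is about to be sent to the API. This is only a sketch: `findUnconvertedSpeechMarkdown` is a hypothetical helper, not part of js-tts-wrapper or speechmarkdown-js, and it only looks for the two Speech Markdown forms that appear in the log above (`++text++` and `(text)[key:'value']`).

```javascript
// Hypothetical helper (not part of js-tts-wrapper): returns every fragment
// of the payload that still looks like Speech Markdown, in document order.
// Only the two forms seen in the log above are covered:
//   ++text++             (emphasis shorthand)
//   (text)[key:'value']  (modifier syntax)
function findUnconvertedSpeechMarkdown(payload) {
  const pattern = /\+\+[^+\n]+\+\+|\([^()\n]+\)\[[^\]\n]+\]/g;
  return payload.match(pattern) ?? [];
}

// One line from each log, copied from the two runs above.
const fromLocalRun =
  'I can make text <emphasis level="strong">important</emphasis> or use very emphasised and slightly emphasised.';
const fromExtension =
  "I can make text ++important++ or use (very emphasised)[emphasis:'strong'] and (slightly emphasised)[emphasis:'reduced'].";

console.log(findUnconvertedSpeechMarkdown(fromLocalRun).length);  // 0
console.log(findUnconvertedSpeechMarkdown(fromExtension).length); // 3
```

An empty result for the local run and a non-empty one for the extension run would match what the logs show: the conversion step is announced in the extension, but the text inside the SSML envelope is still Speech Markdown.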

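In the installed extension, the Speech Markdown is sent to Azure unconverted inside the SSML envelope, even though the log reports that the conversion is taking place. The VS Code extension also has speechmarkdown-js as a dependency; maybe that is the reason? Or is it something in the packaging process?
It seems that either the peer dependency behaves differently, or you load some libraries dynamically in a way that behaves differently when executed within the extension.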
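
One way to narrow this down might be to check whether the module can be loaded at all from the runtime in question. The sketch below assumes the library is loaded with a dynamic `import()`, which I have not confirmed for js-tts-wrapper; `canLoad` is a hypothetical helper name. To be meaningful it would have to run from inside the packaged extension, since bundlers may resolve dynamic imports differently than plain Node does.

```javascript
// Hypothetical helper: reports whether a module can be loaded from the
// current runtime via dynamic import(). Whether js-tts-wrapper loads
// speechmarkdown-js this way is an assumption, not something I verified.
async function canLoad(moduleName) {
  try {
    await import(moduleName);
    return true;
  } catch (err) {
    console.warn(`Could not load ${moduleName}: ${err.message}`);
    return false;
  }
}

// Log the result for the peer dependency in question.
canLoad('speechmarkdown-js').then((ok) => {
  console.log(`speechmarkdown-js loadable: ${ok}`);
});
```

If this prints `false` inside the extension host but `true` in the local Node run, the peer dependency is probably missing from the packaged extension rather than behaving differently.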
