Conversation
Protect against ValueError exception when a line doesn't have an = like an empty line or a comment
Seems like this would accept "wrong" configurations where comments are not started with e.g. #. I implemented this solution: https://github.com/muisje/subgen/blob/LanguageCode-improvements/load_env_variables.py
And here's an example file that shows what I think should be parsed (and is in my branch): https://github.com/muisje/subgen/blob/LanguageCode-improvements/subgen.env.example
That looks nice. You should open a PR to get it integrated.
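For reference, a minimal sketch of the stricter behaviour discussed above: blank lines and lines starting with # are skipped, and any other line without an = is reported instead of being silently accepted. The name `parse_env_lines` and the choice to raise `ValueError` are assumptions for illustration, not the code in either branch.

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines into a dict (hypothetical helper)."""
    variables = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Blank lines and '#' comments are legal and simply ignored.
        if not line or line.startswith("#"):
            continue
        # partition() never raises; sep is empty when there is no '='.
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {number}: expected KEY=VALUE, got {raw!r}")
        variables[key.strip()] = value.strip()
    return variables


example = ["# Subgen settings", "", "WEBHOOKPORT=9000", "NAMESUBLANG = aa"]
print(parse_env_lines(example))
```

Using `partition` instead of `split("=")` with tuple unpacking is what avoids the unhandled exception on lines without an =, while the explicit check keeps malformed lines from passing as comments.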
subgen.py
Outdated
webhookport = int(os.getenv('WEBHOOKPORT', 9000))
word_level_highlight = convert_to_bool(os.getenv('WORD_LEVEL_HIGHLIGHT', False))
word_level_highlight = convert_to_bool(os.getenv('WORD_LEVEL_HIGHLIGHT', True))
segment_level = convert_to_bool(os.getenv('SEGMENT_LEVEL', False))
What's the use case here? SEGMENT_LEVEL is True by default.
subgen.py
Outdated
text=appended_text,
words=[], # Empty list for words
id=lastSegment.id + 1
words=[
Beyond being done differently, what does it change?
subgen.py
Outdated
}
)
if output == 'srt':
    return Response(
Reluctant to implement Response instead of StreamingResponse. I see no downside to StreamingResponse, whereas Response will start to have performance issues on returned files exceeding 10-50 MB.
StreamingResponse was really slow delivering the response; changing to Response made it much faster. It was just a quick solution for me, don't read much into it.
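One possible explanation for the slowdown, offered as an assumption and not as a diagnosis of this PR: a streaming response sends every item its iterator yields as a separate chunk, so if the body is an iterator over a str, each character becomes its own chunk. The stdlib-only sketch below just counts chunks for that case versus fixed-size blocks; `iter_blocks`, the sample subtitle text, and the 64 KiB block size are all hypothetical.

```python
def iter_blocks(text, block_size=64 * 1024):
    """Yield UTF-8 encoded blocks of at most block_size characters (hypothetical helper)."""
    for start in range(0, len(text), block_size):
        yield text[start:start + block_size].encode("utf-8")


# Made-up subtitle body, repeated to reach a few hundred kilobytes.
srt_body = "1\n00:00:00,000 --> 00:00:01,000\nHello\n\n" * 10_000

# Iterating a str yields one character at a time.
per_character_chunks = sum(1 for _ in iter(srt_body))
# Iterating fixed-size blocks yields far fewer, larger chunks.
per_block_chunks = sum(1 for _ in iter_blocks(srt_body))

print(per_character_chunks, per_block_chunks)
```

If that is what is happening, passing a block iterator like this one to StreamingResponse might keep streaming for large files without the per-character overhead.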
subgen.py
Outdated
'Source': 'Transcribed using stable-ts from Subgen!',
}
)
elif output == 'json':
Not opposed to implementing, what's the use case?
bleh, sorry for the mess up. I pushed changes to my main unrelated to this PR. I've reverted to the original changeset. That'll teach me to use a branch on my fork. But to answer the question, I am doing some post-processing using the word timings so having the raw result JSON is handy for that.
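To illustrate the kind of post-processing described, here is a sketch that pulls word timings out of a result JSON. The layout (a top-level `segments` list whose entries carry a `words` list with `word`, `start` and `end` fields) is an assumption based on typical Whisper-style output; the sample data and the `word_gaps` helper are made up.

```python
import json

# Hand-written sample in the assumed result layout.
sample = json.loads("""
{"segments": [
  {"start": 0.0, "end": 1.4, "text": " Hello there",
   "words": [{"word": " Hello", "start": 0.0, "end": 0.5},
             {"word": " there", "start": 0.9, "end": 1.4}]},
  {"start": 3.0, "end": 3.6, "text": " Bye",
   "words": [{"word": " Bye", "start": 3.0, "end": 3.6}]}
]}
""")


def word_gaps(result):
    """Return (previous_word, next_word, gap_seconds) for each adjacent word pair."""
    # Flatten words across segments; segments without word timings contribute nothing.
    words = [w for seg in result["segments"] for w in seg.get("words", [])]
    return [
        (a["word"].strip(), b["word"].strip(), round(b["start"] - a["end"], 3))
        for a, b in zip(words, words[1:])
    ]


print(word_gaps(sample))
```

This kind of per-word data is lost in plain SRT output unless word-level highlighting is enabled, which is why a raw JSON output can be useful.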
subgen.py
Outdated
namesublang = os.getenv('NAMESUBLANG', '')
webhookport = int(os.getenv('WEBHOOKPORT', 9000))
word_level_highlight = convert_to_bool(os.getenv('WORD_LEVEL_HIGHLIGHT', False))
word_level_highlight = convert_to_bool(os.getenv('WORD_LEVEL_HIGHLIGHT', True))
This needs to stay default False, otherwise lots of people will complain when their subs start highlighting words as they are displayed.
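For context on why the default matters: `os.getenv` returns the fallback unchanged when the variable is unset, so the second argument decides the behaviour for everyone who never sets WORD_LEVEL_HIGHLIGHT. A minimal sketch, assuming `convert_to_bool` does a case-insensitive match against a few truthy strings (the real helper in subgen.py may differ):

```python
import os


def convert_to_bool(value):
    """Assumed behaviour: truthy only for a small set of case-insensitive strings."""
    return str(value).strip().lower() in ("true", "on", "1", "y", "yes")


# Simulate a user who never set the variable: the fallback value is used.
os.environ.pop("WORD_LEVEL_HIGHLIGHT", None)
unset_result = convert_to_bool(os.getenv("WORD_LEVEL_HIGHLIGHT", False))

# Simulate an explicit opt-in: the environment value wins over the fallback.
os.environ["WORD_LEVEL_HIGHLIGHT"] = "True"
opted_in_result = convert_to_bool(os.getenv("WORD_LEVEL_HIGHLIGHT", False))

print(unset_result, opted_in_result)
```

With a fallback of False, only users who opt in get word highlighting; flipping the fallback to True would change the output for every existing installation that leaves the variable unset.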
subgen.py
Outdated
if not force_language:
    force_language = LanguageCode.from_string(result.language)
result.to_srt_vtt(name_subtitle(file_path, force_language), word_level=word_level_highlight)
result.to_srt_vtt(name_subtitle(file_path, force_language), word_level=word_level_highlight, segment_level=segment_level)
Same as above, what's the use case for changing segment_level to false?