Conversation
austin-carnahan
left a comment
There was a problem hiding this comment.
Looks good! I had a hard time finding anything to improve on!
S23/b-sai/main.py
Outdated
| if letter == 'I': | ||
| lower_letter = 'ı' | ||
| elif language.startswith(('gd', 'gv', 'ga')): | ||
| if idx == 1 and (letter in ['A', 'E', 'I', 'O', 'U', 'Á', 'É', 'Í', 'Ó', 'Ú', "Ó"] or ord(letter) in [211]) and word[0] in ['n', 't'] and (len(word)-idx >= 2 and ord(word[idx+1]) != 771): |
There was a problem hiding this comment.
I think this conditional statement is a little long - it took me a minute to tease out what it's saying. Consider breaking it up on multiple lines or adding some inline documentation to explain.
S23/b-sai/main.py
Outdated
| elif language.startswith('el'): | ||
| if letter == 'Σ' and idx == len(word)-1: | ||
| lower_letter = 'ς' | ||
| elif language.startswith(("zh", "th", "ja")): |
There was a problem hiding this comment.
It's not going to save a ton of time, but if you aren't making any changes to the letters in these languages -- you don't need to loop through each character. Consider moving this conditional to the top of your loop and exiting early.
| """ | ||
| result = "" | ||
|
|
||
| if language.startswith(("zh", "th", "ja")): |
There was a problem hiding this comment.
There are 3-letter language code permitted in BCP-47, so "startswith" won't work here, e.g. "jam" is "Jamaican Creole English".
| result = "" | ||
|
|
||
| if language.startswith(("zh", "th", "ja")): | ||
| return word.lower() |
There was a problem hiding this comment.
And the point was in these cases to not bother calling lower() as an optimization.
| if language == 'tr' or language == 'az': | ||
| if letter == 'I': | ||
| lower_letter = 'ı' | ||
| elif language.startswith(('gd', 'gv', 'ga')): |
There was a problem hiding this comment.
Same issue as above; this won't work.
| is_2nd_letter = idx == 1 | ||
| is_exception_letter = letter in [ | ||
| 'A', 'E', 'I', 'O', 'U', 'Á', 'É', 'Í', 'Ó', 'Ú', "Ó"] | ||
| is_letter_o_latin = ord(letter) in [211] |
There was a problem hiding this comment.
Magic numbers, here and 771 below! Unreadable.
| word, language, actual = test.split("\t") | ||
| predicted = to_lowercase(word, language) | ||
| if predicted != actual: | ||
| print(f"COuldn't convert {word} in {language}!") |
| is_letter_o_latin = ord(letter) in [211] | ||
| is_beginning_exception = word[0] in ['n', 't'] | ||
| is_not_last = len(word)-idx > 1 | ||
| if is_2nd_letter and (is_exception_letter or is_letter_o_latin) and is_beginning_exception and (is_not_last and ord(word[idx+1]) != 771): |
There was a problem hiding this comment.
The Unicode business needs a bit more work; will discuss in class.
No description provided.