Skip to content

code sample for lab day#189

Open
akumar2709 wants to merge 1 commit intokscanne:masterfrom
akumar2709:code_review
Open

code sample for lab day#189
akumar2709 wants to merge 1 commit intokscanne:masterfrom
akumar2709:code_review

Conversation

@akumar2709
Copy link
Copy Markdown

Sorry I won't be present for the code review lab. But this is my submission for the programming assignment.

#function converts word to lower case and provides special rules for irish and turkish since they are not covered by the book
def toLower(input_string : str, language: str) -> str:
irish_vowels = "AEIOUÁÉÍÓÚ"
tilda_ord = 771
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is something more general going on here that you'll want to handle in general, not just for this one combining diacritics. We'll discuss in class when we get into Unicode more deeply, or you can stop by office hours if you're curious.

# ord 771 since that means that there is a tild on the second alphabet since that could be achieved by either a singular unicode or by 2 unicodes
# this is the reason we only check at postion[2] since if there is a tilda at [2] it means there can't be a vowel at [1]

if(language.find("ga") + 1):
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!= -1 would be more readable. Also, this isn't exactly correct, since there are 3-letter language codes in ISO 639-3, and so this would work for "gaa", for example (Ga language spoken in Ghana).

# this is the reason we only check at postion[2] since if there is a tilda at [2] it means there can't be a vowel at [1]

if(language.find("ga") + 1):
irish_lower = lambda input_string : (input_string[0] + "-" + input_string[1:]).lower() \
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit odd to do this as a lambda. Why not just apply these lines of code directly to the object with the same name?

return irish_lower(input_string)

# Turkish and Azerbaijani follow similar rules to standard covertion with just a different i
elif((language.find("tr") + 1) or (language.find("az") + 1)):
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue with 3-letter codes.


# Turkish and Azerbaijani follow similar rules to standard covertion with just a different i
elif((language.find("tr") + 1) or (language.find("az") + 1)):
return input_string.lower().replace("i", "ı")
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this behaves the way we want. Did you test with both dotted and non-dotted uppercase?

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You really need a lot more tests, since they might turn up additional bugs.

return input_string.lower().replace("i", "ı")

else:
return input_string.lower()
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem description asked for an optimization for languages without case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants