Skip to content

CI: obs-crowdin-sync Python Rewrite#10578

Closed
ghost wants to merge 3 commits intomasterfrom
unknown repository
Closed

CI: obs-crowdin-sync Python Rewrite#10578
ghost wants to merge 3 commits intomasterfrom
unknown repository

Conversation

@ghost
Copy link

@ghost ghost commented Apr 21, 2024

Description

Rewrites the obs-crowdin-sync repository in Python and adds the action to this repository.

Changes that come along with it:

  • Translates the UI/cmake/linux/com.obsproject.Studio.metainfo.xml.in file.
  • Creates a pull request instead of pushing directly: https://github.com/Vainock/obs-studio/pull/4
  • Changes the name of English to English (USA) for en-US.
  • Lowers the requirements for new languages to be added to the UI/data/locale.ini.
  • Host the Language string in a separate Language Name file on Crowdin, remove it here.

Makes the organization/repository secret CROWDIN_SYNC_GITHUB_PAT obsolete.

Motivation and Context

The obs-crowdin-sync repository was too much for what it solves.

This solution is better for integration testing, easier to extend, easier for error analysis in the log because you don't have a huge bundled script and much cleaner to look at.

Pushing directly to the repo is bad, creating a PR is safer and transparent.

How Has This Been Tested?

Locally and https://github.com/Vainock/obs-studio/pull/4. The update-crowdin-locales action was tested in a obs-studio and obs-browser fork.

Types of changes

  • Tweak (non-breaking change to improve existing functionality)

Checklist:

  • My code has been run through clang-format.
  • I have read the contributing document.
  • My code is not on the master branch.
  • The code has been tested.
  • All commit messages are properly formatted and commits squashed where appropriate.
  • I have included updates to all appropriate documentation.

@ghost ghost mentioned this pull request Apr 22, 2024
6 tasks
@WizardCM WizardCM self-requested a review May 4, 2024 23:09
@tytan652
Copy link
Collaborator

tytan652 commented May 9, 2024

Blocked by #10658, once merged please update your code to only translate UI/cmake/linux/com.obsproject.Studio.metainfo.xml.in and not UI/xdg-data/com.obsproject.Studio.metainfo.xml.in.

The latter is in the deprecated CMake path, the former is the new one.

@ghost
Copy link
Author

ghost commented May 11, 2024

@tytan652 Should it also update UI/cmake/linux/com.obsproject.Studio.desktop instead of UI/xdg-data/com.obsproject.Studio.desktop?

@tytan652
Copy link
Collaborator

tytan652 commented May 11, 2024

Yes

I think the change about going away from xdg-data folder has completely gone unseen. We found out about it recently through metainfo regression in CI.

@ghost
Copy link
Author

ghost commented May 11, 2024

Ready for review.

@ghost ghost changed the title CI: Rewrite obs-crowdin-sync in Python & add to repository CI: obs-crowdin-sync Python Rewrite May 13, 2024
@ghost ghost marked this pull request as draft June 7, 2024 14:08
@ghost ghost marked this pull request as ready for review June 9, 2024 16:04
@RytoEX RytoEX self-assigned this Oct 4, 2024
@RytoEX RytoEX requested a review from PatTheMav October 4, 2024 23:13
@ghost ghost mentioned this pull request Oct 5, 2024
6 tasks


def get_dir_id(name: str) -> int:
crowdin_api.source_files.list_directories(filter=name)["data"][0]["data"]["id"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is list_directories assured to always return a dictionary with an non-empty array with key data, which itself will contain objects that always contain an object at key data which has a key id?

If the answer is "no" or "maybe" use a chain of get calls instead:

directories = crowdin_api.source_files.list_directories(filter=name).get("data", [])

if len(directories) > 0:
  id = directories[0].get("data", {}).get("id", None)
  if id is not None:
    return id
  else:
    raise Exception("No id field on directory data")
else:
  raise Exception("No crowdin source file directories found")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have checked this, and the Crowdin documentation explicitly states that ["data"][0]["data"] are all guaranteed ("required") in the response, if the response is 200.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be the first time a response header states 200 and the result is an empty string, or it's truncated, or it's malformed, or it's actually a piece of HTML.

There is no good reason not to write code that is guaranteed to succeed instead of code that just doesn't seem to fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True enough. What I also just realised is that function is missing a return.



def add_to_crowdin_storage(file_path: str) -> int:
return crowdin_api.storages.add_storage(open(file_path))["data"]["id"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above re. avoidance of ValueErrors.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I try...except this theses object accesses, what should I do in the except block? I caught the error, but should I raise an exception again? Just log it?

My only idea would be to try...except the upload() function in the main function (https://github.com/Vainock/obs-studio/blob/ba977d75bd6f8e1546f0d2824d6714110073af31/.github/scripts/utils.py/update-crowdin-locales.py#L101) for all KeyErrors and ValueErrors. This would cover all instances where these exceptions can occur in that file. My problem is that I can only return the error code from the main function. Do you have an idea for how to solve that?

Copy link
Member

@PatTheMav PatTheMav Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to catch exceptions right where they happen but where you can handle them, so it'd be fine to not handle them on lower levels and catch them on the top-most level and log the error there and return from the main function with an error code.

The exception to that would be if you want to be able to more specifically identify the error to have more specific error messages (so you want to differentiate which interaction resulted in a ValueError).

This would also enable you to raise errors with error messages in the inner functions instead of having to log error messages, because you're catching them in main and print the error there.

You usually raise the exceptions in the lowest levels, ignore them in the middle layers which just pipe data back and forth and catch them at the levels the user interacts with (because it's where you'd need to compose error messages, display them to users, ask them how to handle the error, etc.).

It's considered good practice to document this though:

    def someFunction(someArgument: someType) -> Int
       """This does thing

       Returns:
           Int: The number of thing.

       Raises:
            MyException - In case something happen
       """

if "exportOptions" not in source_file:
continue

export_path: str = source_file["exportOptions"]["exportPattern"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-dimensional direct access will lead to ValueError if key doesn't return dictionary with the expected key. Use chained get calls to ensure the access does not fail and use a fallback like None to check for errors.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the dictionary access fails, there is no reason to recover from there on. If there is no exportPattern in exportOptions, then the configuration on Crowdin needs to be updated. The code depends a lot on the configuration on Crowdin and assumes a specific structure. I support making the code flexible to the API results, but have trouble understanding what makes encapsulating all possible situations with error handling because of possible misconfiguration on Crowdin or broken responses beneficial.
Adding optional debug logging for all responses makes in my head more sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time you access an API or process data you're not in control of, you have to ensure to either safely evaluate the state of the data before accessing it to stay in control of program execution (otherwise you openly rely on undefined state and you can make no assumptions of the states your code can end up in).

If any issue while accessing that data is considered a possible error, at the very least wrap the access in a try: [...] except (ValueError, KeyError) as e: block because the code can potentially break on both of these errors because of an invalid access violation. You can then handle the error by logging an appropriate error function, clean up, and exit the program with an error code under your control.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comments for update-crowdin-locales.py regarding globals, lack of a main function and command-line arguments, as well as error-prone deep access of nested dictionary keys without proper error handling (at the very least a try.. except block catching a ValueError would be necessary).



def write_locale_file(build: CrowdinBuild) -> None:
FILE_PATH = "UI/data/locale.ini"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use upper case naming for anything but actual constants.

If these are actually constants, they should either be declared global in the file scope or encapsulated (the latter is more elegant because it doesn't pollute the file scope, but not strictly necessary):

class OBSCrowdinConstants:
  FILE_PATH = ""
[..]

def write_locale_file(build: CrowdinBuild) -> None:
  [..]
  with open(OBSCrowdinConstants.FILE_PATH, encoding="utf-8") as file:
    [...]

@ghost ghost closed this by deleting the head repository Oct 30, 2024
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants