Skip to content

feat: add translation script and GitHub Action for mobile i18n autofill#108

Open
markrizkalla wants to merge 9 commits intoopenMF:devfrom
markrizkalla:feature/auto-localization-on-pull-request
Open

feat: add translation script and GitHub Action for mobile i18n autofill#108
markrizkalla wants to merge 9 commits intoopenMF:devfrom
markrizkalla:feature/auto-localization-on-pull-request

Conversation

@markrizkalla
Copy link

@markrizkalla markrizkalla commented Jan 31, 2026

  • Create translate.py to automate Android string resource translations using the Google Gemini API, featuring placeholder/markup preservation, change detection via snapshots, and batch processing.
  • Add .github/workflows/mobile-i18n-autofill-pr.yml to trigger translation autofill on pull requests to the development branch.
  • Support translation for 24 locales using the gemma-3-27b-it model.
  • Implement automated validation of translated Android resources and direct commits of updates to PR branches.

Summary by CodeRabbit

  • New Features

    • Automated mobile localization: PRs touching Android resources trigger AI-assisted translations for multiple locales, validate generated resources, preserve formatting/placeholders/comments, and provide a concise summary.
    • Local/CLI translation tool: run end-to-end Android string translations locally with snapshotting, validation, placeholder safety, and merge-ready outputs.
  • Chores

    • CI workflow added to autofill translations on PRs and push standardized commits back to the PR branch when changes occur.

- Create `translate.py` to automate Android string resource translations using the Google Gemini API, featuring placeholder/markup preservation, change detection via snapshots, and batch processing.
- Add `.github/workflows/mobile-i18n-autofill-pr.yml` to trigger translation autofill on pull requests to the development branch.
- Support translation for 24 locales using the `gemma-3-27b-it` model.
- Implement automated validation of translated Android resources and direct commits of updates to PR branches.
@coderabbitai
Copy link

coderabbitai bot commented Jan 31, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a GitHub Actions workflow and a new translate.py tool that auto-discovers Android strings.xml, uses Gemini to generate translations for specified locales with snapshot-driven diffs, validates resources via Gradle, and commits translation updates back to the PR head when produced. (33 words)

Changes

Cohort / File(s) Summary
GitHub Actions Workflow
.github/workflows/mobile-i18n-autofill-pr.yml
New workflow triggered on PRs to dev for cmp-android/**, feature/**, the workflow file, or translate.py. Checks out PR head, sets up JDK21 & Python3.11, installs deps, runs translate.py --apply with locales/model/batch-size, validates with Gradle, and commits/pushes translation changes if present.
Translation Engine & CLI
translate.py
New, comprehensive translator: GeminiTranslator (batch/single, retries, backoff, throttling), dataclasses and exceptions, snapshot-driven change detection, placeholder/tag freeze/unfreeze, XML read/write preserving comments/attributes, sanitization/validation, create/merge locale files, summary reporting, and a CLI with mode/locales/model options.
Documentation
docs/TRANSLATE.md
New documentation describing usage, CLI flags, behaviors (apply/check), implementation details (snapshots, placeholders, rate limiting), and requirements for translate.py.
Workflow Targets / Android resources
cmp-android/..., feature/...
Workflow scans and may modify strings.xml under these directories; generated locale XMLs are validated via ./gradlew :cmp-android:processDemoDebugResources before being committed to the PR branch.

Sequence Diagram

sequenceDiagram
    participant PR as "GitHub PR"
    participant GHA as "GitHub Actions"
    participant Script as "translate.py"
    participant API as "Gemini API"
    participant FS as "Filesystem/Repo"
    participant Gradle as "Gradle"

    PR->>GHA: PR created/updated (paths matched)
    GHA->>GHA: Checkout PR head, setup JDK21 & Python3.11
    GHA->>Script: Invoke translate.py (--apply, locales, model)
    Script->>FS: Discover source `strings.xml`, load snapshots
    FS-->>Script: Return source entries
    Script->>Script: Freeze placeholders/tags, batch requests
    Script->>API: Send batch translation requests
    API-->>Script: Return translated frozen texts
    Script->>Script: Unfreeze, validate, sanitize translations
    Script->>FS: Write/merge locale XMLs and update snapshots
    Script-->>GHA: Return has_changes flag
    GHA->>Gradle: Run resource validation task
    Gradle-->>GHA: Return validation result
    alt has_changes == true
        GHA->>FS: Commit & push translation changes to PR head
    end
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I hopped through strings with gentle care,
Froze tags and placeholders in the air,
Sent batches to Gemini’s bright machine,
Unwrapped them neat, XML kept clean,
Pushed translations home — a small rabbit’s scene.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 71.01% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: adding a translation script and GitHub Action for mobile i18n autofill.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In @.github/workflows/mobile-i18n-autofill-pr.yml:
- Around line 79-89: The workflow step that runs the push uses direct
interpolation of `${{ github.event.pull_request.head.ref }}` in the shell which
can lead to injection; change the job step to pass the PR head ref as an
environment variable (e.g., PR_HEAD_REF) and then use that env var in the run
block and quote it when invoking the push (the git push command in the "Commit
and Push changes" step), so replace direct interpolation with an env reference
and ensure the push uses a quoted ref; keep the existing git config, git add,
and git commit commands and only adjust how the ref is supplied to git push and
referenced in the run block.

In `@translate.py`:
- Around line 1305-1311: The create_batches function currently uses range with
step=batch_size which will raise if batch_size <= 0; add an explicit guard at
the start of create_batches to validate batch_size is a positive integer (e.g.,
if batch_size <= 0: raise ValueError("batch_size must be > 0")). Keep the rest
of the function unchanged so it still returns [items[i:i + batch_size] for i in
range(0, len(items), batch_size)].
- Line 145: TOKEN_SEQUENCE_RE is using "$$" anchors so findall() never matches
tokens like "[[PH_0]]" — update the regex used by TOKEN_SEQUENCE_RE to match
literal double-brackets and the PH/TAG prefix (e.g. change the pattern to use
"\[\[(?:PH|TAG)_\d+\]\]" or equivalent) so findall() returns the token sequence
and the order validation that uses TOKEN_SEQUENCE_RE will run; update the
TOKEN_SEQUENCE_RE declaration accordingly.
- Around line 1471-1504: Track per-source locale failures and only advance the
snapshot when there are zero failures: add a boolean (e.g., source_had_errors)
and set it true inside the locale loop whenever a locale_result reports failures
(check whichever field your result type exposes, e.g., locale_result.failed or
locale_result.errors or locale_result.failure_count) in the same place you
inspect locale_result.newly_translated; then include source_had_errors in the
snapshot gating logic (alongside source_had_translations and
snapshot_needs_update) so save_snapshot(snapshot_path, source_entries) is only
called when no locale failures occurred for that source.

- Modify `get_snapshot_path` to resolve snapshot storage at the module level when a `src` directory is present.
- Refactor snapshot filename generation to use underscores for path separators.
- Update the license URL reference in the file header template.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@translate.py`:
- Line 1481: Remove the unnecessary f-string prefix on the logger.debug call
(the string passed to logger.debug has no placeholders); locate the logger.debug
invocation (the line containing logger.debug(f"  Source strings changed since
last snapshot")) and change it to a normal string literal without the leading f
so the message is logged as plain text.
🧹 Nitpick comments (10)
translate.py (10)

700-707: Remove unused source_entries parameter.

The source_entries parameter is passed but never used—the function iterates over source_root instead. This dead parameter adds confusion about the function's contract.

♻️ Suggested fix
 def _create_from_source(
     target_xml: Path,
     translations: Dict[str, str],
-    source_entries: List[StringEntry],
     source_root: ET._Element,
     validate: bool,
     warn_unknown_tags: bool,
 ) -> int:

Also update the call site at lines 694-697:

     return _create_from_source(
-        target_xml, translations, source_entries,
+        target_xml, translations,
         source_root, validate, warn_unknown_tags
     )

810-811: Chain the exception for better tracebacks.

Using raise ... from e preserves the original exception context, making debugging easier.

♻️ Suggested fix
-            raise XmlWriteError(f"Written file is malformed: {target_xml}: {e}")
+            raise XmlWriteError(f"Written file is malformed: {target_xml}: {e}") from e

839-847: Consider making the copyright year dynamic.

The hardcoded Copyright 2026 will become stale. Consider using the current year dynamically or accepting it as a parameter.

♻️ Suggested fix
+from datetime import datetime
+
 def _fix_xliff_namespaces_in_file(target_xml: Path) -> None:
     # ... existing code ...
+    current_year = datetime.now().year
     copyright_header = f'''<!--
-    Copyright 2026 Mifos Initiative
+    Copyright {current_year} Mifos Initiative
 
     This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0.

887-887: Remove unused root parameter.

The root parameter is never used—the function only uses elem and elem.getparent().

♻️ Suggested fix
-def _remove_element_preserve_whitespace(root: ET._Element, elem: ET._Element) -> None:
+def _remove_element_preserve_whitespace(elem: ET._Element) -> None:
     """Remove element while preserving surrounding whitespace structure."""

Update call sites (e.g., line 790):

-        _remove_element_preserve_whitespace(root, elem)
+        _remove_element_preserve_whitespace(elem)

947-949: Use underscore for unused loop variable.

The comments variable from the unpacked tuple is not used in the loop body.

♻️ Suggested fix
-    for comments, strings in source_sections:
+    for _, strings in source_sections:
         source_order.extend(strings)

1066-1067: Chain the exception for better tracebacks.

♻️ Suggested fix
-                raise XmlWriteError(f"Written file is malformed: {target_xml}: {e}")
+                raise XmlWriteError(f"Written file is malformed: {target_xml}: {e}") from e

1307-1309: Use logger.exception to include traceback.

When logging in an exception handler, logger.exception automatically includes the traceback, which aids debugging failed translations.

♻️ Suggested fix
         except TranslationError as e:
-            logger.error(f"Failed to translate '{key}': {e}")
+            logger.exception(f"Failed to translate '{key}': {e}")
             return None

1397-1399: Use logger.exception to include traceback.

♻️ Suggested fix
         except TranslationError as e:
-            logger.error(f"    Batch {batch_idx} failed: {e}")
+            logger.exception(f"    Batch {batch_idx} failed: {e}")
             result.errors.append(f"Batch {batch_idx} failed: {e}")

1525-1525: Remove unused mode parameter.

The mode parameter is passed but never used in print_summary.

♻️ Suggested fix
-def print_summary(result: ProcessingResult, mode: str) -> None:
+def print_summary(result: ProcessingResult) -> None:
     """Print formatted processing summary."""

Update the call site at line 1662:

-    print_summary(result, config.mode)
+    print_summary(result)

1634-1641: Consider warning when overriding user-provided values.

When the user explicitly provides --request-delay or --batch-size, silently overriding their values for Gemma models may be unexpected. The current logging only fires when values are changed, but consider making it clearer that user-provided values are being adjusted.

♻️ Suggested enhancement
     if "gemma" in args.model.lower():
         if request_delay < 4.0:
+            if args.request_delay < 4.0:
+                logger.warning(f"Overriding --request-delay {args.request_delay}s → {4.0}s for Gemma model")
             request_delay = 4.0
-            logger.info(f"Adjusted request_delay to {request_delay}s for Gemma model")
         if batch_size > 15:
+            if args.batch_size > 15:
+                logger.warning(f"Overriding --batch-size {args.batch_size} → {15} for Gemma model")
             batch_size = 15
-            logger.info(f"Adjusted batch_size to {batch_size} for Gemma model")

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/mobile-i18n-autofill-pr.yml:
- Around line 16-19: The i18n-autofill job runs for forked pull requests and
will fail when accessing secrets (secrets.GEMINI_API_KEY, secrets.PAT_TOKEN);
guard it to same-repo PRs by adding a job-level condition on the i18n-autofill
job (reference the job name i18n-autofill) that checks the pull request head
repo equals the base repo (e.g., github.event.pull_request.head.repo.full_name
== github.repository) so the job only runs for same-repo PRs.

Updated workflow to use pull_request_target for better access control and added checks for changes before committing.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.github/workflows/mobile-i18n-autofill-pr.yml:
- Around line 20-21: The workflow comment references adding an environment but
no environment is configured; update the workflow job (the i18n-autofill job
definition) to include an environment key (e.g., add environment: i18n-autofill
under the job) so repository environment protection rules can apply, and then
configure the corresponding protection settings in the repo’s Environments UI;
ensure you modify the job named i18n-autofill (or the appropriate jobs: block)
to include the environment property.
- Around line 3-15: The workflow uses pull_request_target while checking out and
running forked PR code (see pull_request_target, translate.py, gradlew) which
can exfiltrate secrets like GEMINI_API_KEY; fix by removing execution of
untrusted fork code under the base-repo context—either restrict the job to
same-repo PRs using a conditional on
github.event.pull_request.head.repo.full_name == github.repository so only
trusted code runs with permissions, or convert to a safe two-workflow pattern:
use pull_request (no secrets) to validate/build artifacts and then a
workflow_run in the base repo to run the privileged steps (translate.py/gradlew)
with secrets, and ensure checkout actions never fetch the forked head when
running with write permissions.
🧹 Nitpick comments (1)
.github/workflows/mobile-i18n-autofill-pr.yml (1)

41-44: Consider pinning Python dependency versions for reproducibility.

Installing google-genai and lxml without version constraints may lead to unexpected behavior if breaking changes are released.

       - name: Install Python dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install google-genai lxml
+          pip install google-genai==1.0.0 lxml==5.1.0  # Pin to tested versions

Alternatively, add a requirements.txt file to the repository for better dependency management.

@biplab1
Copy link
Collaborator

biplab1 commented Feb 4, 2026

@coderabbitai full review.

@coderabbitai
Copy link

coderabbitai bot commented Feb 4, 2026

@biplab1, I will conduct a full review of the PR including the translation script and GitHub Actions workflow.

✅ Actions performed

Full review triggered.

Copy link
Collaborator

@biplab1 biplab1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. This can be merged.

@biplab1
Copy link
Collaborator

biplab1 commented Feb 5, 2026

@niyajali If everything looks good on your end, feel free to merge this.

@biplab1
Copy link
Collaborator

biplab1 commented Feb 5, 2026

@markrizkalla Can you please mark all the conversations as resolved?

@niyajali
Copy link
Collaborator

niyajali commented Feb 5, 2026

@markrizkalla, thanks for your pull request, and @biplab1, thank you for the review. Unfortunately, I can't merge it yet as it requires some improvements. For instance, the current workflow hardcodes specific locales for translation generation. This will cause issues if the project doesn't support these locales or if additional locales are added. Additionally, we need to trigger this workflow based on user preference rather than on every pull request.

You are welcome to address other matters, and I will handle this when my schedule allows. Thank you!

- Introduce `StringArrayEntry`, `PluralsEntry`, and `SourceResources` dataclasses to handle complex Android resource types.
- Implement flattening logic to convert arrays and plurals into translatable `StringEntry` objects with unique key suffixes (`__item_N`, `__plural_QUANTITY`).
- Update snapshot logic to hash and track changes for string-arrays and plurals.
- Refactor XML parsing and writing to support merging and creating translations for all resource types while preserving source structure.
- Expand file discovery to include `arrays.xml` alongside `strings.xml`.
- Update translation prompts to provide guidance on handling plural forms and array item keys.
…matting

- Implement `_cleanup_orphaned_translations` to remove localized strings, arrays, and plurals that no longer exist in the source English file.
- Add logic to remove orphaned comments preceding deleted resource elements.
- Improve XML whitespace management with `_normalize_resource_whitespace` to prevent empty line buildup during updates.
- Enhance placeholder detection to include `\n` and `\t` as frozen tokens.
- Update GitHub Action to support manual triggers (`workflow_dispatch`) with configurable parameters for locales, models, and batch sizes.
- Expand GitHub Action to include `cmp-navigation/` in automated commits.
- Fix XML header and resource tag indentation in generated files.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (5)
translate.py (3)

147-147: ⚠️ Potential issue | 🟠 Major

Token-order validation regex is broken.

Line 147 uses anchors ($$) instead of matching literal [[...]] tokens, so FrozenText.validate() token order checks won’t trigger correctly.

Proposed fix
-TOKEN_SEQUENCE_RE = re.compile(r"$$\[(?:PH|TAG)_\d+$$\]")
+TOKEN_SEQUENCE_RE = re.compile(r"\[\[(?:PH|TAG)_\d+\]\]")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@translate.py` at line 147, The regex TOKEN_SEQUENCE_RE is using "$$" anchors
instead of matching the literal token delimiters, so FrozenText.validate() won't
detect token sequences; replace the pattern with one that matches literal
double-brackets and the PH/TAG tokens (e.g., use a raw-string regex like
/\[\[(?:PH|TAG)_\d+\]\]/ compiled into TOKEN_SEQUENCE_RE) so the validator can
find occurrences of [[PH_123]] and [[TAG_456]] correctly; ensure
TOKEN_SEQUENCE_RE remains a compiled re.Pattern and update any dependent code if
it relied on the old anchors.

1755-1760: ⚠️ Potential issue | 🟡 Minor

Guard batch_size before using it as range step.

Line 1760 can throw when batch_size <= 0.

Proposed fix
 def create_batches(
     items: List[Tuple[str, FrozenText]],
     batch_size: int,
 ) -> List[List[Tuple[str, FrozenText]]]:
     """Split items into batches of specified size."""
+    if batch_size <= 0:
+        raise ValueError("batch_size must be > 0")
     return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@translate.py` around lines 1755 - 1760, The create_batches function uses
batch_size as the step in range(0, len(items), batch_size) and will raise a
ValueError if batch_size <= 0; add an explicit guard at the start of
create_batches to validate that batch_size is an int > 0 and raise a clear
ValueError (e.g., "batch_size must be > 0") if not, so callers get a
deterministic error and the range call is safe.

1936-1956: ⚠️ Potential issue | 🟠 Major

Do not advance source snapshot when any locale failed.

Line 1951-Line 1956 can still save snapshots after partial locale failures, which can prevent changed strings from being retried later.

Proposed fix
-        source_had_translations = False
+        source_had_translations = False
+        source_had_failures = False
@@
             if locale_result.newly_translated > 0:
                 source_had_translations = True
+            if locale_result.failed > 0 or locale_result.errors:
+                source_had_failures = True
@@
         if config.mode == "apply":
-            if source_had_translations:
+            if source_had_translations and not source_had_failures:
                 should_save_snapshot = True
                 save_reason = "Updated"
-            elif snapshot_needs_update:
+            elif snapshot_needs_update and not source_had_failures:
                 should_save_snapshot = True
                 save_reason = "Synced"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@translate.py` around lines 1936 - 1956, The code currently sets
should_save_snapshot based only on source_had_translations and
snapshot_needs_update; update the loop that calls process_locale to also track
whether any locale failed (e.g., set a boolean any_locale_failed if
locale_result indicates failures such as a failed_count or errors on the
returned locale_result). Then change the save decision (where
should_save_snapshot and save_reason are set) to require that no locales failed
— only set should_save_snapshot when source_had_translations or
snapshot_needs_update AND any_locale_failed is false. Reference process_locale,
locale_result (e.g., locale_result.newly_translated and
locale_result.failed/failed_count/errors) and the variables
should_save_snapshot/snapshot_needs_update/source_had_translations when making
the change.
.github/workflows/mobile-i18n-autofill-pr.yml (2)

37-40: ⚠️ Potential issue | 🟠 Major

Guard labeled PR execution to same-repo branches when secrets are required.

For pull_request labeled events, this job can run on fork PRs where GEMINI_API_KEY is unavailable, leading to avoidable failures.

Proposed fix
   i18n-autofill:
     if: >-
       github.event_name == 'workflow_dispatch' ||
-      github.event.label.name == 'needs-translation'
+      (
+        github.event_name == 'pull_request' &&
+        github.event.label.name == 'needs-translation' &&
+        github.event.pull_request.head.repo.full_name == github.repository
+      )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/mobile-i18n-autofill-pr.yml around lines 37 - 40, The
current if-condition allows labeled pull_request events from forks where secrets
like GEMINI_API_KEY are unavailable; update the workflow conditional to only
allow labeled pull_request runs when the PR branch is in the same repository by
adding a check that github.event.pull_request.head.repo.full_name ==
github.repository (while keeping existing checks for workflow_dispatch and the
label). Modify the condition that now references github.event_name and
github.event.label.name so it requires either workflow_dispatch OR (the label
AND (not a pull_request OR the pull_request head repo equals the repository)) —
target the conditional in the mobile-i18n-autofill-pr.yml job definition.

89-97: ⚠️ Potential issue | 🟠 Major

Do not interpolate PR head ref directly inside shell commands.

Line 97 uses ${{ github.event.pull_request.head.ref ... }} directly in run, which is an injection risk.

Proposed fix
       - name: Commit and Push changes
+        env:
+          PR_HEAD_REF: ${{ github.event.pull_request.head.ref || github.ref_name }}
         run: |
           git config user.name "github-actions[bot]"
           git config user.email "github-actions[bot]@users.noreply.github.com"
@@
-            git push origin HEAD:${{ github.event.pull_request.head.ref || github.ref_name }}
+            git push origin "HEAD:${PR_HEAD_REF}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/mobile-i18n-autofill-pr.yml around lines 89 - 97, The
workflow currently interpolates the PR head ref directly inside the run shell
block (git push origin HEAD:${{ github.event.pull_request.head.ref ||
github.ref_name }}), which is an injection risk; fix by computing the target ref
using a workflow expression into an env (e.g., TARGET_REF: ${{
github.event.pull_request.head.ref || github.ref_name }}) at the step level and
then reference that env inside run (git push origin HEAD:"$TARGET_REF"),
ensuring the variable is quoted; update the step using the symbols git push
origin HEAD:${{ ... }} and the run block that sets git config / git commit to
use the env variable instead of direct expression interpolation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/mobile-i18n-autofill-pr.yml:
- Around line 35-100: The i18n-autofill job is implemented inline but must be
converted to call the reusable workflow from openMF/mifos-x-actionhub@v1.0.8;
replace the inline job named i18n-autofill with a single uses:
openMF/mifos-x-actionhub@v1.0.8 entry (keeping the job-level if and runs-on as
supported) and map the current env/inputs (LOCALES, MODEL, BATCH_SIZE,
REPO_ROOT, GEMINI_API_KEY) into the reusable workflow's with: inputs, and remove
the inline steps (Checkout repository, Set up JDK 21, Set up Python 3.11,
Install Python dependencies, Run translation autofill, Validate Android
resources compile, Commit and Push changes) so that those actions are executed
by the reused workflow; ensure input names in the with: block correspond to the
reusable workflow's expected input keys and preserve the same default values
from the current env section.

In `@docs/TRANSLATE.md`:
- Around line 80-81: The docs mention a CLI flag (--max-retries) that doesn’t
exist and uses “backoff” instead of the verb “back off”; update either
translate.py to expose a --max-retries argument in parse_args() (add an int arg
with a sensible default and wire it into the retry logic) or remove the
--max-retries mention from TRANSLATE.md and instead describe the retry behavior
generically, and change “backoff” to “back off” in the Rate Limit Handling
section; reference translate.py, parse_args(), and the --max-retries token when
making the change.

In `@translate.py`:
- Around line 1957-1964: The current logic sets should_save_snapshot and calls
save_snapshot_full even when config.mode == "check", causing writes during a
read-only check; change the flow so that when config.mode == "check" you never
set should_save_snapshot nor call save_snapshot_full (instead surface a
failure/diagnostic), i.e. move the save_snapshot_full(snapshot_path,
source_resources) and the save_reason logging behind a guard that excludes
config.mode == "check" (or explicitly check config.mode != "check" before
setting should_save_snapshot/save_reason), and ensure functions/variables
referenced (config.mode, snapshot, should_save_snapshot, save_snapshot_full,
snapshot_path, save_reason) are used only when writes are allowed.

---

Duplicate comments:
In @.github/workflows/mobile-i18n-autofill-pr.yml:
- Around line 37-40: The current if-condition allows labeled pull_request events
from forks where secrets like GEMINI_API_KEY are unavailable; update the
workflow conditional to only allow labeled pull_request runs when the PR branch
is in the same repository by adding a check that
github.event.pull_request.head.repo.full_name == github.repository (while
keeping existing checks for workflow_dispatch and the label). Modify the
condition that now references github.event_name and github.event.label.name so
it requires either workflow_dispatch OR (the label AND (not a pull_request OR
the pull_request head repo equals the repository)) — target the conditional in
the mobile-i18n-autofill-pr.yml job definition.
- Around line 89-97: The workflow currently interpolates the PR head ref
directly inside the run shell block (git push origin HEAD:${{
github.event.pull_request.head.ref || github.ref_name }}), which is an injection
risk; fix by computing the target ref using a workflow expression into an env
(e.g., TARGET_REF: ${{ github.event.pull_request.head.ref || github.ref_name }})
at the step level and then reference that env inside run (git push origin
HEAD:"$TARGET_REF"), ensuring the variable is quoted; update the step using the
symbols git push origin HEAD:${{ ... }} and the run block that sets git config /
git commit to use the env variable instead of direct expression interpolation.

In `@translate.py`:
- Line 147: The regex TOKEN_SEQUENCE_RE is using "$$" anchors instead of
matching the literal token delimiters, so FrozenText.validate() won't detect
token sequences; replace the pattern with one that matches literal
double-brackets and the PH/TAG tokens (e.g., use a raw-string regex like
/\[\[(?:PH|TAG)_\d+\]\]/ compiled into TOKEN_SEQUENCE_RE) so the validator can
find occurrences of [[PH_123]] and [[TAG_456]] correctly; ensure
TOKEN_SEQUENCE_RE remains a compiled re.Pattern and update any dependent code if
it relied on the old anchors.
- Around line 1755-1760: The create_batches function uses batch_size as the step
in range(0, len(items), batch_size) and will raise a ValueError if batch_size <=
0; add an explicit guard at the start of create_batches to validate that
batch_size is an int > 0 and raise a clear ValueError (e.g., "batch_size must be
> 0") if not, so callers get a deterministic error and the range call is safe.
- Around line 1936-1956: The code currently sets should_save_snapshot based only
on source_had_translations and snapshot_needs_update; update the loop that calls
process_locale to also track whether any locale failed (e.g., set a boolean
any_locale_failed if locale_result indicates failures such as a failed_count or
errors on the returned locale_result). Then change the save decision (where
should_save_snapshot and save_reason are set) to require that no locales failed
— only set should_save_snapshot when source_had_translations or
snapshot_needs_update AND any_locale_failed is false. Reference process_locale,
locale_result (e.g., locale_result.newly_translated and
locale_result.failed/failed_count/errors) and the variables
should_save_snapshot/snapshot_needs_update/source_had_translations when making
the change.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eab5c7c and a33d7f1.

📒 Files selected for processing (3)
  • .github/workflows/mobile-i18n-autofill-pr.yml
  • docs/TRANSLATE.md
  • translate.py

Comment on lines +35 to +100
jobs:
i18n-autofill:
if: >-
github.event_name == 'workflow_dispatch' ||
github.event.label.name == 'needs-translation'
runs-on: ubuntu-latest

env:
LOCALES: ${{ inputs.locales || 'ar,bn,de,en,es,fa,fr,gu,hi,hu,id,km,kn,ml,mr,ms,my,pl,pt,ru,si,sw,te,ur' }}
MODEL: ${{ inputs.model || 'gemma-3-27b-it' }}
BATCH_SIZE: ${{ inputs.batch-size || '15' }}
REPO_ROOT: ${{ inputs.repo-root || '.' }}

steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
ref: ${{ github.event.pull_request.head.ref || github.ref }}
token: ${{ secrets.GITHUB_TOKEN }}

- name: Set up JDK 21
uses: actions/setup-java@v4
with:
java-version: '21'
distribution: 'temurin'

- name: Set up Python 3.11
uses: actions/setup-python@v5
with:
python-version: '3.11'

- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
pip install google-genai lxml

- name: Run translation autofill
env:
GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
run: |
python translate.py \
--mode apply \
--repo-root "$REPO_ROOT" \
--locales "$LOCALES" \
--model "$MODEL" \
--batch-size $BATCH_SIZE

- name: Validate Android resources compile
run: |
./gradlew :cmp-android:processDemoDebugResources

- name: Commit and Push changes
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"

git add cmp-android/ cmp-navigation/ feature/

if ! git diff --cached --quiet; then
git commit -m "chore: auto-generate mobile i18n translations"
git push origin HEAD:${{ github.event.pull_request.head.ref || github.ref_name }}
else
echo "No changes to commit."
fi
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Workflow violates repository standard for reusable workflows.

This job is implemented inline, but workflow files in this repo must delegate to reusable workflows from openMF/mifos-x-actionhub@v1.0.8.

As per coding guidelines .github/workflows/*.{yml,yaml}: Configure GitHub Actions workflows to use reusable workflows from openMF/mifos-x-actionhub@v1.0.8.

🧰 Tools
🪛 actionlint (1.7.11)

[error] 89-89: "github.event.pull_request.head.ref" is potentially untrusted. avoid using it directly in inline scripts. instead, pass it through an environment variable. see https://docs.github.com/en/actions/reference/security/secure-use#good-practices-for-mitigating-script-injection-attacks for more details

(expression)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/mobile-i18n-autofill-pr.yml around lines 35 - 100, The
i18n-autofill job is implemented inline but must be converted to call the
reusable workflow from openMF/mifos-x-actionhub@v1.0.8; replace the inline job
named i18n-autofill with a single uses: openMF/mifos-x-actionhub@v1.0.8 entry
(keeping the job-level if and runs-on as supported) and map the current
env/inputs (LOCALES, MODEL, BATCH_SIZE, REPO_ROOT, GEMINI_API_KEY) into the
reusable workflow's with: inputs, and remove the inline steps (Checkout
repository, Set up JDK 21, Set up Python 3.11, Install Python dependencies, Run
translation autofill, Validate Android resources compile, Commit and Push
changes) so that those actions are executed by the reused workflow; ensure input
names in the with: block correspond to the reusable workflow's expected input
keys and preserve the same default values from the current env section.

Removed unnecessary check for snapshot creation in 'check' mode.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants