PR#401

Closed
atoox-git wants to merge 10 commits into santifer:main from atoox-git:main

@atoox-git atoox-git commented Apr 21, 2026

What does this PR do?

Related issue

Type of change

  • Bug fix
  • New feature
  • Documentation / translation
  • Refactor (no behavior change)

Checklist

  • I have read CONTRIBUTING.md
  • I linked a related issue above (required for features and architecture changes)
  • My PR does not include personal data (CV, email, real names)
  • I ran node test-all.mjs and all tests pass
  • My changes respect the Data Contract (no modifications to user-layer files)
  • My changes align with the project roadmap

Questions? Join the Discord for faster feedback.

Summary by CodeRabbit

  • New Features

    • Native France Travail API integration with OAuth2 authentication
    • Multiple French job portal adapters (APEC, HelloWork, Jobijoba, RégionsJob, Welcome to the Jungle)
    • New workflow modes: motivation letter generation, follow-up tracking, interview preparation, rejection pattern analysis, project evaluation, and company research
  • Documentation

    • Complete French localization with setup guides and usage documentation
    • Privacy policy and legal notices for France-specific compliance
    • Example CVs and job postings for reference
  • Configuration

    • French portal scanning template with job title filtering and scraping policies
  • Tests

    • Golden test cases and validation runner for quality assurance

bb-gi and others added 10 commits April 20, 2026 21:13
Based on santifer/career-ops@411afb3.
Forked as atoox-git/career-ops-fr.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- README.fr.md: full translation + adaptation to FR portals
- CLAUDE.md: in-place translation, all rules preserved
- docs/SETUP.fr.md: FR installation guide
- 9 modes translated in modes/fr/ (scan, deep, contact, entretien,
  formation, patterns, projet, relance, _profile.template)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tion, CV, examples

- templates/portals-fr.example.yml: 9 FR portals in 3 tiers
- modes/fr/lettre-motivation.md: new mode written from scratch
- templates/cv-template-fr.html: FR Europass CV template
- examples/: 2 CVs + 5 anonymized FR job offers
- package.json: added claude:eval and ollama:eval scripts (stub)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- adapters/portals/france-travail.mjs : OAuth2 client_credentials, search, getById, normalizeToJD
- adapters/portals/_shared.mjs: robots.txt, 3 s rate limit, 24 h cache, browser singleton
- 5 Playwright adapters: APEC, WTTJ, HelloWork, Jobijoba, RégionsJob
- scripts/test-api-francetravail.mjs: manual test script
- docs/API-FRANCE-TRAVAIL.md: step-by-step guide

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- tests/golden/cases.json: 5 FR cases (dev, sales, manager, tradesperson, apprenticeship)
- tests/run-golden.mjs: structural regression runner
- .github/workflows/test-fr.yml: CI on fr/** branches and PRs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- NOTICE.md: attribution to Santiago + Bertrand copyright
- MENTIONS-LEGALES.md: French legal notices
- PRIVACY.md: privacy policy (local-first, GDPR)
- .env.example: France Travail variables added
- .gitignore: portal cache + Windows nul
- CHANGELOG-FR.md: full record of divergences from upstream

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test-fr.yml file will be added via the GitHub interface or after
the token is updated with the workflow scope.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
10 sections: installation, configuration, first evaluation,
understanding the report, daily commands, recommended workflow,
FR archetypes, customization, troubleshooting, resources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Variant A (storytelling), B (concise), C (community).
Includes publishing tips and a version for posting as a comment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 21, 2026

📝 Walkthrough

This pull request introduces a comprehensive French localization and fork (career-ops-fr) of the upstream career-ops project, adding France-specific job portal integrations (France Travail OAuth2 API, APEC, HelloWork, Jobijoba, RégionsJob, Welcome to the Jungle), Playwright-based web scraping utilities, extensive French-language documentation and modes, French CV and job offer examples, a French CV HTML template, configuration templates, golden test cases, and supporting scripts.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Configuration & Environment**<br>`.env.example`, `.gitignore`, `package.json` | Added France Travail API environment variables (FRANCE_TRAVAIL_CLIENT_ID, FRANCE_TRAVAIL_CLIENT_SECRET, FRANCE_TRAVAIL_SCOPE), cache directory ignore patterns, updated npm scripts (claude:eval, ollama:eval), added keywords, updated author/contributors/repository metadata to reference the French fork. |
| **Shared Portal Utilities**<br>`adapters/portals/_shared.mjs` | Introduced comprehensive Playwright tier-2 adapter infrastructure: robots.txt checking, per-domain rate limiting, local JSON caching with 24h TTL, browser singleton management, and centralized fetchWithPlaywright orchestration handling cache, robots validation, rate limits, page navigation, extraction, and error/fallback handling. |
| **Playwright Portal Adapters**<br>`adapters/portals/apec.mjs`, `adapters/portals/hellowork.mjs`, `adapters/portals/jobijoba.mjs`, `adapters/portals/regionsjob.mjs`, `adapters/portals/welcometothejungle.mjs` | Five new portal adapters implementing CSS selector-based job field extraction (title, company, location, contract type, salary, description) and delegating to shared fetchWithPlaywright orchestration for each platform (APEC, HelloWork, Jobijoba, RégionsJob, Welcome to the Jungle). |
| **France Travail API Integration**<br>`adapters/portals/france-travail.mjs`, `scripts/test-api-francetravail.mjs`, `docs/API-FRANCE-TRAVAIL.md` | Implemented OAuth2 client_credentials adapter for France Travail "offres d'emploi v2" API with token caching/refresh, authenticated search/fetch, offer normalization to markdown JD format, plus manual test script and step-by-step setup documentation. |
| **French Documentation & Legal**<br>`CHANGELOG-FR.md`, `CLAUDE.md`, `MENTIONS-LEGALES.md`, `NOTICE.md`, `PRIVACY.md`, `README.fr.md`, `docs/SETUP.fr.md`, `docs/GUIDE-UTILISATION.md`, `docs/POST-LINKEDIN-LANCEMENT.md` | Added comprehensive French documentation covering fork changelog, fork-specific French defaults, legal notices, licensing/attribution, privacy/data handling, full README with tech badges and quick-start, installation guide, usage guide, and LinkedIn launch announcement templates. |
| **French Modes**<br>`modes/fr/_profile.template.md`, `modes/fr/contact.md`, `modes/fr/deep.md`, `modes/fr/entretien.md`, `modes/fr/formation.md`, `modes/fr/lettre-motivation.md`, `modes/fr/patterns.md`, `modes/fr/projet.md`, `modes/fr/relance.md`, `modes/fr/scan.md`, `modes/fr/README.md` | Translated and adapted 10 core modes into French with locale-specific guidance (profile template, LinkedIn contact playbook, company research, interview prep, training evaluation, motivation letter generation, rejection pattern detection, portfolio projects, follow-up cadence, job portal scanning); updated mode directory README with translated-mode inventory and French terminology glossary. |
| **French Examples & Templates**<br>`examples/cv-exemple-1.md`, `examples/cv-exemple-2.md`, `examples/offres-exemple/alternance-dev-web.md`, `examples/offres-exemple/commercial-b2b-saas.md`, `examples/offres-exemple/dev-fullstack-pme.md`, `examples/offres-exemple/manager-equipe-tech.md`, `examples/offres-exemple/technicien-chauffagiste.md`, `templates/cv-template-fr.html`, `templates/portals-fr.example.yml` | Added two French CV examples (fullstack developer, B2B sales), five French job offer examples covering five market archetypes (apprenticeship, B2B SaaS commercial, fullstack startup, tech lead fintech, HVAC technician), a French CV HTML template with embedded fonts and print-safe styling, and a French portal configuration example with Playwright/API tiers, title filtering, search queries, and scraping policies. |
| **Golden Tests**<br>`tests/golden/cases.json`, `tests/run-golden.mjs` | Added 5 golden test case fixtures mapping archetype examples to expected score ranges, and a Node.js test runner validating case structure, required fields, file existence, and score ranges with aggregated pass/fail reporting. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

📄 docs, 🔴 core-architecture, 🔧 scripts

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'PR' is vague and non-descriptive, providing no meaningful information about the changeset. | Replace the generic title with a concise, specific summary of the main change. Example: 'Add French localization and France Travail API integration' or 'Introduce FR fork with localized modes, API adapters, and documentation'. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 40

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.env.example:
- Line 22: The FRANCE_TRAVAIL_SCOPE value in .env.example contains a space which
breaks shell-style loaders; update the FRANCE_TRAVAIL_SCOPE entry to wrap the
whole scope string in quotes (e.g., FRANCE_TRAVAIL_SCOPE="...") so tools treat
it as a single value and preserve spaces; modify the FRANCE_TRAVAIL_SCOPE line
accordingly.

In `@adapters/portals/_shared.mjs`:
- Around line 40-93: checkRobotsTxt currently caches decisions by domain only
(robotsCache) and returns early when it finds the first matching Disallow,
letting that single check decide every path on the host; update checkRobotsTxt
to either (A) compute and cache per-path keys (e.g., `${domain}:${path}`) when
storing results in robotsCache, or (B) change the parsing logic in
checkRobotsTxt to first collect all Disallow rules for the wildcard group
(accumulate into an array) and only after parsing all lines determine if the
requested path matches any disallow before caching a domain-scoped result;
modify the cache set/get usage around robotsCache, and keep the existing
abort/timeout and error handling intact.
- Around line 173-180: The fetchWithPlaywright() flow opens a new BrowserContext
via newPage() but only calls page.close(), leaving contexts (and
cookies/storage) alive; in the finally block of fetchWithPlaywright() replace
the page.close() call with page.context().close() so the entire BrowserContext
(and all pages) is disposed; locate references to newPage(),
fetchWithPlaywright(), and the finally cleanup where page.close() is invoked and
change that single cleanup call to page.context().close().
- Around line 25-31: newPage() creates a browser context that is never closed,
causing a resource leak per job; update newPage() and fetchWithPlaywright() to
ensure the created context (from browser.newContext()) is closed after use (use
a try/finally around page creation/usage so both page.close() and
context.close() are called, and ensure browser.close() is invoked where scan
lifecycle ends, e.g., in scan.mjs shutdown); additionally, harden getDomain()
(or validate in fetchWithPlaywright()) to verify the URL uses http: or https:
before returning hostname to avoid processing non-web protocols.
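
Option (B) for checkRobotsTxt could look roughly like the sketch below. It is an illustration under simplifying assumptions, not the PR's actual code: the helper names collectDisallowRules and isPathDisallowed are hypothetical, and the parser ignores Allow rules, wildcard patterns, and groups that list several User-agent lines.

```javascript
// Hypothetical helper: collect every Disallow prefix from the "User-agent: *" group.
// Simplified on purpose: no Allow rules, no wildcards, no multi-agent groups.
function collectDisallowRules(robotsTxt) {
  const rules = [];
  let inWildcardGroup = false;
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.split('#')[0].trim(); // drop comments
    if (!line) continue;
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      inWildcardGroup = value === '*';
    } else if (field === 'disallow' && inWildcardGroup && value) {
      rules.push(value); // keep going: do not return on the first rule
    }
  }
  return rules;
}

// Hypothetical helper: decide per path, after all rules have been collected.
function isPathDisallowed(rules, path) {
  return rules.some((prefix) => path.startsWith(prefix));
}
```

With this split, the domain-scoped cache can store the rule array once per host, and each requested path is evaluated against it separately.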

In `@adapters/portals/apec.mjs`:
- Around line 43-44: The adapter export fetch() currently forwards arbitrary
URLs to fetchWithPlaywright (e.g., apec.fetch -> fetchWithPlaywright(url,
extractFromPage, 'APEC')), which is an SSRF risk; add a host-allowlist
validation step before calling fetchWithPlaywright: create or use a shared
function (e.g., isAllowedHost(url, allowedSuffixes)) that parses the URL,
rejects non-http(s), rejects literal IPs/localhost/private ranges, and ensures
hostname endsWith the adapter's allowed domain suffix (for APEC use its suffix),
then throw an error if validation fails; call this validator at the start of
fetch() in apec.mjs (and apply the same pattern to hellowork.mjs, jobijoba.mjs,
regionsjob.mjs, welcometothejungle.mjs) so only validated hosts are passed to
fetchWithPlaywright.
- Around line 19-34: The extractFromPage function may run before async-rendered
job fields appear, causing nulls; mirror the Welcome to the Jungle pattern by
awaiting page.waitForSelector(SELECTORS.title, { timeout: 10000 }).catch(() =>
{}) at the start of extractFromPage (before page.evaluate) so the scraper gives
the page time to render; keep using SELECTORS and preserve the existing
page.evaluate logic for extraction.
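
A minimal sketch of the suggested isAllowedHost validator follows. The suffix 'apec.fr' in the usage note is an assumption for illustration, and the function only rejects IP literals and localhost by inspection; it does not resolve DNS, so a hostname that resolves to a private address would still need a separate check.

```javascript
// Sketch of a host allowlist check (assumed signature: isAllowedHost(url, allowedSuffixes)).
function isAllowedHost(rawUrl, allowedSuffixes) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  // Only plain web protocols are accepted.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') return false;

  const host = parsed.hostname.toLowerCase();
  // Reject localhost, IPv6 literals (bracketed by URL), and IPv4 literals.
  const isIpv4Literal = /^\d{1,3}(\.\d{1,3}){3}$/.test(host);
  if (host === 'localhost' || host.startsWith('[') || isIpv4Literal) return false;

  // Accept the bare suffix or any subdomain of it, nothing else.
  return allowedSuffixes.some(
    (suffix) => host === suffix || host.endsWith('.' + suffix)
  );
}
```

An adapter's fetch() could then start with something like `if (!isAllowedHost(url, ['apec.fr'])) throw new Error('URL not allowed')` before handing the URL to fetchWithPlaywright.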

In `@adapters/portals/france-travail.mjs`:
- Around line 88-90: Replace the hard throw on res.status === 429 with logic
that reads the Retry-After header (e.g., const retryAfter =
res.headers.get('Retry-After')), parses it to milliseconds, and either sleep and
retry the request once or return a structured rate-limit result containing
rate_limit_ms; update the branch that currently throws "Rate limit France
Travail atteint..." (the res.status === 429 block) to use the parsed delay and
follow the same retry/resiliency pattern used by other Tier-2 adapters instead
of immediately throwing.
- Around line 132-134: The getById function currently interpolates id directly
into the path passed to apiCall (getById), which allows slashes or reserved
chars to break or redirect the request; update getById to URI-encode the id (use
encodeURIComponent or equivalent) before building the `/offres/${id}` path so
the request path is safe against path-traversal/SSRF and reserved characters are
escaped; ensure the encoded value is used everywhere that id was previously
interpolated into URLs.
- Around line 66-98: The apiCall and getToken functions can hang because their
fetch calls have no timeout; implement a reusable fetchWithTimeout(url, options,
timeoutMs) helper that uses AbortController to abort after a configurable
timeout (e.g. 15000–30000 ms) and properly clears timers, then replace the raw
fetch calls inside apiCall and getToken with fetchWithTimeout(url, { headers:
{...}, ...otherOptions }, timeoutMs); ensure the replacement preserves headers
and response handling and that abort errors are surfaced or converted to a clear
timeout Error so callers can handle timeouts consistently.
- Around line 13-60: getToken currently mutates tokenCache without
synchronization so concurrent callers each trigger a fresh OAuth request; change
it to cache the in-flight fetch promise (e.g. add a tokenFetchPromise variable
or a promise field on tokenCache) inside getToken so multiple callers await the
same promise instead of issuing parallel requests. Implementation: on
cache-miss, if tokenFetchPromise exists return await tokenFetchPromise;
otherwise set tokenFetchPromise = (async () => { perform fetch, on success set
tokenCache.accessToken and tokenCache.expiresAt, return token; })(); await it,
and in finally on rejection clear tokenFetchPromise so future calls can retry.
Keep the existing expiry check in getToken and ensure tokenFetchPromise is
cleared on error.
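
The in-flight promise pattern described for getToken can be sketched as follows. This is an illustration, not the adapter's code: createTokenProvider and requestToken are hypothetical names, and requestToken stands in for whatever performs the real OAuth call and is assumed to resolve to `{ accessToken, expiresInSec }`.

```javascript
// Sketch: concurrent callers share one token request instead of issuing several.
function createTokenProvider(requestToken, now = () => Date.now()) {
  let cache = { accessToken: null, expiresAt: 0 };
  let inFlight = null; // promise of the request currently running, if any

  return async function getToken() {
    // Existing expiry check: reuse a token that is still valid.
    if (cache.accessToken && now() < cache.expiresAt) return cache.accessToken;
    // Another caller already started a request: await the same promise.
    if (inFlight) return inFlight;

    inFlight = (async () => {
      try {
        const { accessToken, expiresInSec } = await requestToken();
        cache = { accessToken, expiresAt: now() + expiresInSec * 1000 };
        return accessToken;
      } finally {
        // Cleared on success and on failure, so a rejected request can be retried.
        inFlight = null;
      }
    })();
    return inFlight;
  };
}
```

Passing the fetcher in as a parameter keeps the sketch testable without network access; the real adapter could equally keep a module-level tokenFetchPromise as the comment suggests.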

In `@adapters/portals/jobijoba.mjs`:
- Around line 8-15: The SELECTORS in adapters/portals/jobijoba.mjs are
intentionally generic and must not be used in production until validated; add a
clear guard to prevent accidental deployment by introducing a validation
flag/check (e.g., a constant like JOBIJOBA_SELECTORS_VALIDATED or an env var)
and make the module fail-fast or disable the adapter at startup (throw an error
or return no-op) when that flag is false, and add a comment referencing
SELECTORS so reviewers can re-run validation once Bertrand provides a real URL;
also mirror this guard for the identical TODO in hellowork.mjs so both adapters
are protected.

In `@adapters/portals/regionsjob.mjs`:
- Around line 18-33: The extractFromPage function currently calls page.evaluate
immediately and may read client-rendered fields too early; before calling
page.evaluate in extractFromPage, await a stable client-rendered selector (e.g.
await page.waitForSelector(SELECTORS.title)) to ensure the job details are
present; keep the wait targeted to SELECTORS.title (or the most reliable text
selector), use a reasonable timeout, then run the existing page.evaluate block
unchanged so getText reads the rendered DOM reliably.

In `@adapters/portals/welcometothejungle.mjs`:
- Around line 39-40: The fetch export currently forwards any input URL to
fetchWithPlaywright without validation; add an SSRF host allowlist check in the
fetch function: parse the incoming url (using the URL constructor), verify its
hostname (and optionally port) is in the approved set (e.g., welcome host and
allowed subdomains), and throw or return a rejected response when it is not
allowed before calling fetchWithPlaywright; update the same fetch function
(referencing fetch, fetchWithPlaywright, and extractFromPage) so the Playwright
navigation only runs for validated hosts.

In `@CHANGELOG-FR.md`:
- Around line 34-37: Update the CHANGELOG-FR.md entry under "Qualité (Phase 6)"
so the CI status is accurate: either remove the bullet referencing
`.github/workflows/test-fr.yml` or change its text to indicate the workflow is
temporarily disabled/pending due to OAuth scope (e.g., "CI (test-fr.yml) —
workflow temporarily disabled/pending due to OAuth scope"). Edit the line that
mentions `.github/workflows/test-fr.yml` accordingly to avoid implying golden
tests run on PRs.

In `@CLAUDE.md`:
- Around line 297-300: Update the CLAUDE.md section that currently states
"GitHub Actions s'exécutent à chaque PR" and "Protection de branche sur `main` :
les vérifications de statut doivent passer" to reflect the current reality:
either restore the removed workflow in this PR or explicitly state that the CI
workflow is temporarily disabled and status checks are not currently enforced;
adjust the "GitHub Actions" bullet and the "Protection de branche" bullet
accordingly so they no longer claim CI runs on every PR or that status checks
are required until the workflow is restored.
- Line 71: Update the report header keys in the `reports/` description so they
use the canonical machine-readable labels `**URL:**` and `**Legitimacy:**`
(replace the current French `**URL :**` and `**Légitimité :** {tier}`
occurrences), ensuring the documentation text for `reports/` and any other
occurrence (e.g., the line currently showing `**Légitimité :** {tier}`) uses
those exact labels so validators and downstream tooling can parse headers
correctly.

In `@docs/API-FRANCE-TRAVAIL.md`:
- Around line 83-89: Update the endpoints documentation to show the full OAuth
token URL and API base plus the relative paths: replace the relative token path
`POST /connexion/oauth2/access_token` with the full URL `POST
https://entreprise.francetravail.fr/connexion/oauth2/access_token?realm=/partenaire`,
add a clear API base entry `Base URL: https://api.francetravail.io/`, and list
the resource paths `GET /offresdemploi/v2/offres/search` and `GET
/offresdemploi/v2/offres/{id}` as relative to that base; also add the required
scope `api_offresdemploiv2` next to the token entry so callers know to request
that scope.
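
To make the documented endpoints concrete, here is a hedged sketch that builds the token request and an offer URL from the values quoted in this comment. The URLs and scope are copied from the text above and are not verified against the live API; buildTokenRequest and offerUrl are hypothetical helper names, and sending the credentials in the form body (rather than via HTTP Basic auth) is an assumption.

```javascript
// Values taken from the review comment; treat them as unverified.
const TOKEN_URL =
  'https://entreprise.francetravail.fr/connexion/oauth2/access_token?realm=/partenaire';
const API_BASE = 'https://api.francetravail.io/';

// Hypothetical helper: describe the client_credentials token request.
function buildTokenRequest(clientId, clientSecret, scope = 'api_offresdemploiv2') {
  const body = new URLSearchParams({
    grant_type: 'client_credentials',
    client_id: clientId,
    client_secret: clientSecret,
    scope,
  });
  return {
    url: TOKEN_URL,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: body.toString(),
    },
  };
}

// Hypothetical helper: resolve an offer path against the API base,
// URI-encoding the id so reserved characters cannot alter the path.
function offerUrl(id) {
  return new URL(
    `offresdemploi/v2/offres/${encodeURIComponent(id)}`,
    API_BASE
  ).toString();
}
```

The returned `url` and `options` can be passed straight to `fetch(url, options)`; keeping the request construction separate from the network call makes it easy to inspect what would be sent.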

In `@docs/GUIDE-UTILISATION.md`:
- Around line 175-179: Update the "Score global" guidance so thresholds match
the project policy that scores below 4.0 are strongly discouraged: modify the
listed bands in the "Score global" section to read "4.5+ → Fonce, c'est un
excellent match", "4.0–4.4 → Bon match, candidate", and "< 4.0 → Le système
déconseille de candidater" (and remove or replace the 3.5–3.9 band wording).
Ensure the wording explicitly states "Strongly discourage applications with
scores below 4.0/5; do not apply unless there is a specific overriding reason"
so the documentation aligns with the agreed policy.

In `@docs/POST-LINKEDIN-LANCEMENT.md`:
- Line 10: Update the prose example fenced code blocks so they are marked as
plain text (change the opening fences from ``` to ```text) in the
docs/POST-LINKEDIN-LANCEMENT.md examples shown (the blocks around line 10 and
also the similar example fences at the other noted positions). Locate each prose
example code fence and replace its opening triple-backtick with ```text so
markdownlint treats them as text without changing rendering.
- Around line 20-23: Replace absolute local-only privacy claims like "Sans
inscription. Sans données qui partent dans le cloud." (and the similar phrases
at lines referenced) with a nuanced statement that explains local storage by
default but notes that analysis may be sent to the configured AI provider for
processing; e.g., keep the UX promise of no signup and quick results while
adding a brief qualifier such as "Traitement local par défaut; si un fournisseur
AI externe est configuré, certaines données (offre/CV) peuvent être envoyées
pour analyse" so the file sections containing those phrases (the short privacy
lines around the top and the repeated instances at the other referenced
locations) are updated accordingly.

In `@docs/SETUP.fr.md`:
- Around line 38-40: Update the setup step that copies the portal template so it
uses the French-specific template: replace the current copy command that
references templates/portals.example.yml with one that copies
templates/portals-fr.example.yml to portals.yml (i.e., use
templates/portals-fr.example.yml -> portals.yml) so the French setup initializes
portals.yml with the France-specific defaults.

In `@examples/cv-exemple-1.md`:
- Line 9: The markdown heading "### Développeur Fullstack — TechPME SAS (Paris,
15 salariés)" should have blank lines before and after it to follow markdown
best practices; update the example at that heading (and the other job title
heading referenced around line 17) by inserting a single empty line above and
below each "###" job title heading so adjacent paragraphs/lists render
correctly.

In `@examples/cv-exemple-2.md`:
- Line 9: Add a blank line before and after each job title heading to satisfy
markdownlint and improve readability; specifically update the heading "###
Account Executive Senior — CloudSoft (Paris, 120 salariés)" (and the other
similar job title headings in this file) so there is an empty line above the
"### ..." line and an empty line immediately following it.

In `@MENTIONS-LEGALES.md`:
- Around line 24-26: Update the "Les seules données transmises à des tiers sont
:" disclosure to match the France Travail adapter's actual parameters by listing
all fields that may be sent (search keywords, department, commune INSEE code,
ROME code, contract type, pagination and similar search filters) and explicitly
state that requests to the chosen AI provider (Anthropic, Google, etc.) may
include the full offer text and the user's CV or resume when those are used;
edit the existing bullet points in the MENTIONS-LEGALES.md section that
currently start with "Les requêtes envoyées à l'API France Travail..." and "Les
requêtes envoyées au fournisseur d'IA..." to reflect this expanded, precise
parameter list and AI payloads.

In `@modes/fr/_profile.template.md`:
- Around line 5-13: Update the localized template text in
modes/fr/_profile.template.md to clearly state it is a system/template file (not
the user-owned profile) and warn users not to edit it; then implement startup
behavior to silently copy modes/_profile.template.md to modes/_profile.md on
first run if modes/_profile.md does not exist so users edit the persistent file
instead; ensure any logic that currently treats modes/*/_profile.template.md as
the editable profile is changed to prefer modes/_profile.md and only read
templates from modes/_profile.template.md (and localized variants) as defaults.

In `@modes/fr/contact.md`:
- Around line 19-45: The CTAs for each contact type (under the headings
Recruteur, Hiring Manager, Pair (referral), Intervieweur (pré-entretien)) are
only in English; add French variants immediately alongside the English examples
so the FR mode provides localized phrasing—e.g. for Recruteur add «Je peux vous
envoyer mon CV si cela vous convient», for Hiring Manager add «J’aimerais en
savoir plus sur la façon dont votre équipe gère [défi spécifique]», for Pair add
«Je travaille sur des sujets similaires chez [entreprise], j’aimerais connaître
votre avis sur [sujet]», and for Intervieweur add «Au plaisir d’échanger le
[date]» (or equivalent FR sentences); ensure each CTA line includes both EN and
FR versions or at minimum one explicit FR sample per contact type so francophone
users have clear localized examples.
- Line 32: Replace the incorrect word "genuïne" in the "**Phrase 1 (Intérêt)**"
line with a proper French term such as "authentique" (or "sincère") so the
string reads e.g. "Référence authentique à son travail — article de blog, talk,
projet open source, ou publication"; update that exact phrase in the
modes/fr/contact.md content.

In `@modes/fr/deep.md`:
- Line 43: Update the text that references the project profile path so it points
to the actual configured locations: replace "profile.yml" with
"config/profile.yml" and mention the additional personalization file
"modes/_profile.md" (e.g., change the sentence "lire cv.md et profile.yml pour
l'expérience spécifique" to reference "lire cv.md, config/profile.yml et
modes/_profile.md" or equivalent).
- Line 7: The fenced code block currently uses an unlabelled triple-backtick
fence (```); update the opening fence to include a language token (e.g., change
the opening ``` to ```text) so the block becomes a labeled plain-text fence and
avoids markdownlint errors—locate the prompt block in modes/fr/deep.md where the
triple-backticks are used and add the "text" language identifier to the opening
fence.

In `@modes/fr/entretien.md`:
- Around line 119-127: Update the report header template in
modes/fr/entretien.md to include the required metadata fields **URL:** and
**Legitimacy:** (formatted exactly as "**URL:** {URL de l'offre d'emploi
originale ou N/A}" and "**Legitimacy:** {Verified/Not verified}") placed above
the existing **Rapport :** line so the header reads: title, URL, Legitimacy,
Rapport, Recherché le, Sources; ensure the field names and placement match the
coding guideline for all generated `**/*.md` reports.

In `@modes/fr/lettre-motivation.md`:
- Around line 113-115: The docs currently instruct reusing
templates/cv-template.html for PDF export but that CV template contains
hardcoded CV sections (Professional Summary, Core Competencies, Work Experience,
Projects, Education, Certifications, Skills) which don't fit a cover letter;
either add a new templates/lettre-motivation-template.html with appropriate
cover-letter structure (header, recipient, 3–4 body paragraphs, closing,
signature placeholders) and ensure the PDF export path uses this template, or
update modes/fr/lettre-motivation.md to document exactly which template
variables must be populated (with an example of rendered output) if you truly
want to reuse templates/cv-template.html. Ensure changes reference the template
names (templates/cv-template.html, templates/lettre-motivation-template.html)
and adjust any export logic that picks the template to choose the new
lettre-motivation template for cover letters.

In `@modes/fr/relance.md`:
- Around line 169-174: Add the required top-of-file header fields "URL:" and
"Legitimacy:" to relance.md’s front matter/header so the document complies with
the guidelines; set URL to the canonical document or route that references this
cadence and set Legitimacy to the appropriate classification (e.g., "official",
"internal", or "community") per project conventions, ensuring the rest of the
content (the cadence table) remains unchanged.

In `@modes/fr/scan.md`:
- Around line 100-143: Renumber the top-level ordered list so each step has a
unique ordinal: change the first "6. Niveau 3 — Requêtes WebSearch" → "6.", keep
"Filtrer par titre" → "7.", change "Dédupliquer" → "8.", change "7.5. Vérifier
la vivacité des résultats WebSearch (Niveau 3)" → "9." and shift the following
steps accordingly (so the current "8. Pour chaque offre..." → "10.", "9." →
"11.", "10." → "12.", "11." → "13."). Also update the in-text cross-reference
"continuer à l'étape 8" to point to the new correct step number "continuer à
l'étape 9". Ensure headings like "Niveau 3 — Requêtes WebSearch", "Filtrer par
titre", "Dédupliquer", and "Vérifier la vivacité des résultats WebSearch (Niveau
3)" reflect the new numbering.

In `@package.json`:
- Line 32: The author field in package.json contains mojibake ("Fern��ndez");
update the "author" value to use the correct UTF-8 name string so npm metadata
is correct—replace the current value with "Santiago Fernández de Valderrama
<hi@santifer.io> (https://santifer.io)" in the author property.

In `@PRIVACY.md`:
- Line 24: Update the PRIVACY.md entry for "API France Travail" to accurately
list all search parameters actually sent by the adapter (as implemented in
adapters/portals/france-travail.mjs): include commune, codeROME (code ROME),
typeContrat (type de contrat), and range in addition to mots-clés de recherche
and code département so the privacy notice matches the implementation.

In `@README.fr.md`:
- Around line 91-94: Update the French README to instruct users to copy the
France-specific portal template instead of the generic one: change the copy
command that references templates/portals.example.yml to use
templates/portals-fr.example.yml (output still to portals.yml) so French users
get the France Travail/APEC/HelloWork defaults; apply the same change wherever
the generic template is referenced (e.g., the other occurrence of
templates/portals.example.yml).

In `@templates/cv-template-fr.html`:
- Around line 404-407: The template contains Mustache-style conditional blocks
(e.g., {{#DRIVING_LICENSE}}...{{/DRIVING_LICENSE}}, and similar blocks for
INTERESTS, AVAILABILITY, MOBILITY) but the PDF pipeline only does token
replacement, so those markers will appear verbatim; fix by removing the Mustache
section markers and leaving plain placeholders (e.g., replace the entire
{{#DRIVING_LICENSE}}...{{/DRIVING_LICENSE}} block with a single unconditional
{{DRIVING_LICENSE}} token) so the upstream agent can supply empty or real
values, or alternatively wire a Mustache renderer into the HTML filling step
(ensure generate-pdf.mjs invokes the renderer before PDF generation and passes
the data map) and update references to DRIVING_LICENSE, INTERESTS, AVAILABILITY,
and MOBILITY accordingly.
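
The effect can be illustrated with a small sketch. It assumes the pipeline performs plain `{{TOKEN}}` substitution, as the comment states; fillTokens is a hypothetical stand-in, not the actual code in generate-pdf.mjs.

```javascript
// Hypothetical token-only filler: replaces {{KEY}} and nothing else.
function fillTokens(html, data) {
  return html.replace(/\{\{([A-Z_]+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match
  );
}

// Mustache section markers are not plain tokens, so they survive untouched.
const withSections =
  '{{#DRIVING_LICENSE}}<p>{{DRIVING_LICENSE}}</p>{{/DRIVING_LICENSE}}';
const leaked = fillTokens(withSections, { DRIVING_LICENSE: 'Permis B' });

// A single unconditional token is filled cleanly, even with an empty value.
const plain = '<p>{{DRIVING_LICENSE}}</p>';
const filled = fillTokens(plain, { DRIVING_LICENSE: '' });
```

Here `leaked` still contains the `{{#DRIVING_LICENSE}}` and `{{/DRIVING_LICENSE}}` markers, while `filled` is just `<p></p>`, which is why the comment suggests either plain placeholders or a real Mustache renderer.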

In `@templates/portals-fr.example.yml`:
- Around line 185-200: The template's schema doesn't match what scan.mjs
expects. Update templates/portals-fr.example.yml to follow the scanner contract:
add a tracked_companies array (with entries in the shape scan.mjs reads from
config.tracked_companies), and change each search_queries item from
{ query, source } to { name, query, enabled }. source may be kept as optional
metadata, either folded into name or in a separate field, as long as name,
query, and enabled are all present. Also convert the existing portals[]
adapter/tier/auth blocks into tracked_companies entries, or document them as
optional metadata, so scan.mjs does not silently ignore them. The file's keys
must exactly match config.tracked_companies and search_queries so the scanner
iterates over them and returns results.
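
A quick way to catch this kind of mismatch early is a shape check on the parsed YAML. The sketch below is an assumption-laden illustration: `validatePortalsConfig` is a hypothetical name, and the expected shape is taken from the comment above (a `tracked_companies` array, `search_queries` items with `name`, `query`, `enabled`), not from scan.mjs itself.

```javascript
// Hypothetical validator: the expected shape comes from the review comment,
// not from reading scan.mjs. `config` is the already-parsed YAML object.
function validatePortalsConfig(config) {
  const errors = [];
  if (!Array.isArray(config?.tracked_companies)) {
    errors.push('tracked_companies must be an array');
  }
  if (!Array.isArray(config?.search_queries)) {
    errors.push('search_queries must be an array');
  } else {
    config.search_queries.forEach((item, i) => {
      if (typeof item?.name !== 'string') errors.push(`search_queries[${i}].name missing`);
      if (typeof item?.query !== 'string') errors.push(`search_queries[${i}].query missing`);
      if (typeof item?.enabled !== 'boolean') errors.push(`search_queries[${i}].enabled missing`);
    });
  }
  return errors;
}

// The { query, source } shape flagged in the review fails the check:
console.log(validatePortalsConfig({ search_queries: [{ query: 'dev', source: 'apec' }] }));
```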

In `@tests/run-golden.mjs`:
- Around line 44-63: The validation currently only checks presence of
testCase.expectedScore?.min/max but doesn't ensure they're numbers; update the
checks around expectedScore (references: expectedScore, expectedScore.min,
expectedScore.max, checks) to first verify both values are finite numbers (use
typeof === 'number' && Number.isFinite(...) or coerce and test with
Number.isFinite), push an error if not numeric, and only then perform the
ordering (min > max) and range (min < 1 || max > 5) validations; keep the
existing error messages but add a new message like 'expectedScore.min/max not a
finite number' when the numeric check fails.
- Around line 49-52: The test fixture loader currently joins ROOT and
testCase.input directly (see testCase.inputType, inputPath, existsSync),
allowing absolute paths or “../” to escape the repo; fix by resolving and
validating the resulting path before checking existence: compute a resolved path
using path.resolve(ROOT, testCase.input) (or equivalent), ensure the resolved
path is inside ROOT (e.g., resolvedPath === ROOT or resolvedPath.startsWith(ROOT
+ path.sep)), and if the check fails push the same “fichier introuvable : …”
error (or a guard error) instead of calling existsSync on an escaped path; only
then call existsSync on the validated resolved path.
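
For the first item, the numeric guard could be sketched as follows. `validateExpectedScore` is a hypothetical standalone function; apart from the message suggested in the comment, the error strings are placeholders, since the runner's existing messages are not shown here.

```javascript
// Hypothetical standalone version of the expectedScore checks.
// Only the "not a finite number" message comes from the review comment;
// the other two messages are placeholders.
function validateExpectedScore(expectedScore) {
  const errors = [];
  const min = expectedScore?.min;
  const max = expectedScore?.max;
  const isFiniteNumber = (v) => typeof v === 'number' && Number.isFinite(v);

  // Type check first: ordering and range checks are meaningless otherwise.
  if (!isFiniteNumber(min) || !isFiniteNumber(max)) {
    errors.push('expectedScore.min/max not a finite number');
    return errors;
  }
  if (min > max) errors.push('expectedScore.min > expectedScore.max');
  if (min < 1 || max > 5) errors.push('expectedScore out of range 1-5');
  return errors;
}

console.log(validateExpectedScore({ min: 3, max: 4.5 })); // []
console.log(validateExpectedScore({ min: '3', max: 5 })); // numeric check fails
```

Returning early after the type check keeps a string such as `'3'` from slipping through the `min > max` comparison via implicit coercion.
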

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 63f28639-df3c-4677-8d8c-857f8d56be4b

📥 Commits

Reviewing files that changed from the base of the PR and between 411afb3 and a8f9610.

📒 Files selected for processing (43)
  • .env.example
  • .gitignore
  • CHANGELOG-FR.md
  • CLAUDE.md
  • MENTIONS-LEGALES.md
  • NOTICE.md
  • PRIVACY.md
  • README.fr.md
  • adapters/portals/_shared.mjs
  • adapters/portals/apec.mjs
  • adapters/portals/france-travail.mjs
  • adapters/portals/hellowork.mjs
  • adapters/portals/jobijoba.mjs
  • adapters/portals/regionsjob.mjs
  • adapters/portals/welcometothejungle.mjs
  • docs/API-FRANCE-TRAVAIL.md
  • docs/GUIDE-UTILISATION.md
  • docs/POST-LINKEDIN-LANCEMENT.md
  • docs/SETUP.fr.md
  • examples/cv-exemple-1.md
  • examples/cv-exemple-2.md
  • examples/offres-exemple/alternance-dev-web.md
  • examples/offres-exemple/commercial-b2b-saas.md
  • examples/offres-exemple/dev-fullstack-pme.md
  • examples/offres-exemple/manager-equipe-tech.md
  • examples/offres-exemple/technicien-chauffagiste.md
  • modes/fr/README.md
  • modes/fr/_profile.template.md
  • modes/fr/contact.md
  • modes/fr/deep.md
  • modes/fr/entretien.md
  • modes/fr/formation.md
  • modes/fr/lettre-motivation.md
  • modes/fr/patterns.md
  • modes/fr/projet.md
  • modes/fr/relance.md
  • modes/fr/scan.md
  • package.json
  • scripts/test-api-francetravail.mjs
  • templates/cv-template-fr.html
  • templates/portals-fr.example.yml
  • tests/golden/cases.json
  • tests/run-golden.mjs

Comment thread .env.example
# Free API: https://francetravail.io — see docs/API-FRANCE-TRAVAIL.md
FRANCE_TRAVAIL_CLIENT_ID=your_client_id_here
FRANCE_TRAVAIL_CLIENT_SECRET=your_client_secret_here
FRANCE_TRAVAIL_SCOPE=api_offresdemploiv2 o2dsoffre

⚠️ Potential issue | 🟡 Minor

Quote the scope value containing spaces.

The current example is fragile for .env tooling and shell-style loading because the value contains a space.

🧹 Proposed fix
-FRANCE_TRAVAIL_SCOPE=api_offresdemploiv2 o2dsoffre
+FRANCE_TRAVAIL_SCOPE="api_offresdemploiv2 o2dsoffre"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-FRANCE_TRAVAIL_SCOPE=api_offresdemploiv2 o2dsoffre
+FRANCE_TRAVAIL_SCOPE="api_offresdemploiv2 o2dsoffre"
🧰 Tools
🪛 dotenv-linter (4.0.0)

[warning] 22-22: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.env.example at line 22, The FRANCE_TRAVAIL_SCOPE value in .env.example
contains a space which breaks shell-style loaders; update the
FRANCE_TRAVAIL_SCOPE entry to wrap the whole scope string in quotes (e.g.,
FRANCE_TRAVAIL_SCOPE="...") so tools treat it as a single value and preserve
spaces; modify the FRANCE_TRAVAIL_SCOPE line accordingly.

Comment on lines +25 to +31
function getDomain(url) {
try {
return new URL(url).hostname;
} catch {
return url;
}
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Find portal adapters that navigate/fetch caller-provided URLs outside the shared guard.
# Expect: Portal adapters delegate to fetchWithPlaywright, or validate URLs before page.goto/fetch.

rg -n -C4 '\b(page\.goto|fetch\s*\(|new URL\s*\(|fetchWithPlaywright\s*\()' --iglob '*.mjs' adapters

Repository: santifer/career-ops

Length of output: 7027


🏁 Script executed:

cd adapters/portals && fd -t f "\.mjs$" -exec grep -l "export.*function fetch" {} \;

Repository: santifer/career-ops

Length of output: 320


🏁 Script executed:

# Find where these portal adapters are imported and called
rg -n "from.*portals/" --type mjs | head -20

Repository: santifer/career-ops

Length of output: 90


🏁 Script executed:

# Check if there's a main entry point that processes URLs before passing to adapters
rg -n "fetchWithPlaywright|portals.*fetch" --type mjs -B2 -A2 | head -40

Repository: santifer/career-ops

Length of output: 90


🏁 Script executed:

# Look for where URLs originate (job postings, API, user input, etc.)
fd -t f "\.mjs$" -path "*/job*" -o -path "*/fetch*" | head -20

Repository: santifer/career-ops

Length of output: 231


🏁 Script executed:

# Find where portal adapters are imported
rg "portals/(welcometothejungle|regionsjob|hellowork|jobijoba|apec)" -g "*.mjs"

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

# Search for the main entry point that calls these adapters
rg -n "fetch.*url|fetchWithPlaywright" -g "*.mjs" | grep -v "adapters/portals/_shared.mjs" | head -30

Repository: santifer/career-ops

Length of output: 1666


🏁 Script executed:

# Check what's in the root or main directory
git ls-files | grep -E "^[^/]*\.mjs$|^(index|main|run|fetch|job|scan).*\.mjs$"

Repository: santifer/career-ops

Length of output: 341


🏁 Script executed:

# Look for where URLs are passed to these functions - search for job listing/URL processing
rg -n "portal.*fetch|fetch.*url" -g "*.mjs" -B3 -A3 | head -50

Repository: santifer/career-ops

Length of output: 2457


🏁 Script executed:

# Find where portal adapter functions are actually called
rg "from.*portals" -g "*.mjs" -A5 | head -40

Repository: santifer/career-ops

Length of output: 789


🏁 Script executed:

# Check scan.mjs to see where the URLs come from and if they're validated
sed -n '280,310p' scan.mjs

Repository: santifer/career-ops

Length of output: 961


🏁 Script executed:

# Check what targets/companies are and where they come from
sed -n '200,295p' scan.mjs | head -60

Repository: santifer/career-ops

Length of output: 1963


🏁 Script executed:

# Search for any URL validation or sanitization functions
rg -n "validate.*url|sanitize|protocol|hostname" -g "*.mjs" -i

Repository: santifer/career-ops

Length of output: 435


🏁 Script executed:

# Check where the URLs passed to portal adapters come from
# Search for how jobs/job.url are generated from API responses
rg -n "PARSERS|job\.url|extractFromPage" -g "*.mjs" -B2 -A2 | head -50

Repository: santifer/career-ops

Length of output: 2369


🏁 Script executed:

# Check portals.yml to understand the data source
cat portals.yml 2>/dev/null | head -40

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

# Look at the browser initialization to understand the execution context
sed -n '1,100p' adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 2903


🏁 Script executed:

# Check if Chromium browser has any network restrictions or sandboxing
rg -n "browser\s*=" -g "*.mjs" -B5 -A5

Repository: santifer/career-ops

Length of output: 1948


🏁 Script executed:

# Check how fetchWithPlaywright closes resources
sed -n '200,250p' adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 1288


🏁 Script executed:

# Check if the context is closed anywhere
rg -n "context.*close|page.*close" adapters/portals/_shared.mjs -A2 -B2

Repository: santifer/career-ops

Length of output: 165


🏁 Script executed:

# Look at the full newPage and getBrowser implementation
sed -n '150,190p' adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 1081


🏁 Script executed:

# Check where portals.yml comes from and if it can be user-controlled
cat portals.yml 2>/dev/null || echo "File not found"

Repository: santifer/career-ops

Length of output: 77


🏁 Script executed:

# Check how scan.mjs processes and validates job.url from API responses
sed -n '1,120p' scan.mjs

Repository: santifer/career-ops

Length of output: 3447


🏁 Script executed:

# Verify if there's any validation or filter on job.url before passing to portal adapters
rg -n "job\.url|newOffers" scan.mjs -B3 -A3 | head -60

Repository: santifer/career-ops

Length of output: 1624


🏁 Script executed:

# Check if portal adapters are ever called with user-supplied URLs
rg -n "export.*function fetch" adapters/portals/*.mjs -A3 | grep -A3 "fetch"

Repository: santifer/career-ops

Length of output: 1397


🏁 Script executed:

# Verify context lifecycle - check if context is ever closed elsewhere
rg -n "\.close\(\)" adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 129


🏁 Script executed:

# Check for any shutdown/cleanup hooks that close contexts
rg -n "closeBrowser|process\.on|exit" adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 105


🏁 Script executed:

# Verify the context leak severity - count how many times newPage is called
rg -n "newPage\|fetchWithPlaywright" adapters/portals/ -c

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

# Verify how many requests typically happen and whether context accumulation would be noticeable
# Check scan.mjs to see how many times fetchWithPlaywright might be called
sed -n '320,330p' scan.mjs

Repository: santifer/career-ops

Length of output: 262


🏁 Script executed:

# Check if there's any call to closeBrowser in scan.mjs
rg -n "closeBrowser" scan.mjs

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

# Verify the actual context leak - show the newPage() implementation again
sed -n '173,180p' adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 263


🏁 Script executed:

# Verify if portal adapters are used in any long-running processes
rg -n "import.*portals|fetchWithPlaywright" -g "*.mjs" --type-list

Repository: santifer/career-ops

Length of output: 452


🏁 Script executed:

# Check if scan.mjs is a one-off script or a daemon
head -20 scan.mjs | grep -E "setInterval|daemon|loop|watch"

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

# Double-check the context leak: verify context is never stored or closed
rg -n "context\s*=" adapters/portals/_shared.mjs -B2 -A2

Repository: santifer/career-ops

Length of output: 247


🏁 Script executed:

# Final verification: check the exact context leak impact
# Count how many portal adapters are actually used
rg -n "export.*function fetch" adapters/portals/*.mjs | wc -l

Repository: santifer/career-ops

Length of output: 64


🏁 Script executed:

# Verify scan.mjs execution flow - is it single-run or could loop?
tail -20 scan.mjs

Repository: santifer/career-ops

Length of output: 678


🏁 Script executed:

# Confirm context lifecycle one more time
sed -n '223,246p' adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 656


Close browser contexts to prevent resource leak in newPage().

The newPage() function at line 173 creates a browser.newContext() but the context is never closed. Only page.close() is called in fetchWithPlaywright() (line 245). Each job extraction leaks one context, which accumulates across the scan. Additionally, browser.close() is never called in scan.mjs.

Fix by closing the context:

Resource cleanup
export async function newPage() {
  const browser = await getBrowser();
  const context = await browser.newContext({
    userAgent: USER_AGENT,
    locale: 'fr-FR',
  });
-  return context.newPage();
+  const page = await context.newPage();
+  // Store context reference on page for cleanup
+  page._context = context;
+  return page;
}

export async function fetchWithPlaywright(url, extractFn, portalName) {
  // ... existing code ...
  const page = await newPage();
  try {
    // ... existing code ...
    return { jd, source: portalName };
  } finally {
    await page.close();
+   if (page._context) await page._context.close();
  }
}

Optional hardening: the URLs currently come from trusted APIs (Greenhouse, Ashby, Lever), but consider restricting the protocol to http: or https: in getDomain() or fetchWithPlaywright() as defense in depth.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/portals/_shared.mjs` around lines 25 - 31, newPage() creates a
browser context that is never closed, causing a resource leak per job; update
newPage() and fetchWithPlaywright() to ensure the created context (from
browser.newContext()) is closed after use (use a try/finally around page
creation/usage so both page.close() and context.close() are called, and ensure
browser.close() is invoked where scan lifecycle ends, e.g., in scan.mjs
shutdown); additionally, harden getDomain() (or validate in
fetchWithPlaywright()) to verify the URL uses http: or https: before returning
hostname to avoid processing non-web protocols.
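
The protocol hardening mentioned above could be as small as the guard below. `isHttpUrl` is a hypothetical name and the boolean return convention is an assumption; the sketch relies only on the standard `URL` class that `getDomain()` already uses.

```javascript
// Hypothetical guard: returns true only for absolute http(s) URLs.
function isHttpUrl(url) {
  try {
    const { protocol } = new URL(url);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // not a parseable absolute URL
  }
}

console.log(isHttpUrl('https://www.apec.fr/offre/123')); // true
console.log(isHttpUrl('file:///etc/passwd'));            // false
console.log(isHttpUrl('not a url'));                     // false
```

Calling such a guard at the top of `fetchWithPlaywright()` would reject `file:`, `javascript:`, and unparseable inputs before the browser navigates anywhere.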

Comment on lines +40 to +93
export async function checkRobotsTxt(url) {
const domain = getDomain(url);
const parsedUrl = new URL(url);

if (robotsCache.has(domain)) {
return robotsCache.get(domain);
}

try {
const robotsUrl = `${parsedUrl.protocol}//${domain}/robots.txt`;
const res = await fetch(robotsUrl, {
headers: { 'User-Agent': USER_AGENT },
signal: AbortSignal.timeout(5000),
});

if (!res.ok) {
// Pas de robots.txt = tout est autorisé
const result = { allowed: true };
robotsCache.set(domain, result);
return result;
}

const text = await res.text();
const path = parsedUrl.pathname;

// Parsing simplifié : chercher Disallow pour User-agent: *
const lines = text.split('\n');
let inWildcard = false;

for (const line of lines) {
const trimmed = line.trim().toLowerCase();
if (trimmed.startsWith('user-agent:')) {
inWildcard = trimmed.includes('*');
}
if (inWildcard && trimmed.startsWith('disallow:')) {
const disallowed = trimmed.replace('disallow:', '').trim();
if (disallowed && path.startsWith(disallowed)) {
const result = { allowed: false, reason: `robots.txt Disallow: ${disallowed}` };
robotsCache.set(domain, result);
return result;
}
}
}

const result = { allowed: true };
robotsCache.set(domain, result);
return result;
} catch {
// Timeout ou erreur réseau = autoriser (fail-open)
const result = { allowed: true };
robotsCache.set(domain, result);
return result;
}
}

⚠️ Potential issue | 🟠 Major

Cache robots decisions per path, or parse all matching groups before caching.

robotsCache is keyed only by domain, so the first checked URL decides access for every path on that host. If /public-job is checked first, a later /private-job can bypass a Disallow; if a disallowed path is checked first, valid paths are blocked.

Proposed minimal fix
 export async function checkRobotsTxt(url) {
   const domain = getDomain(url);
   const parsedUrl = new URL(url);
+  const cacheKey = `${domain}${parsedUrl.pathname}`;
 
-  if (robotsCache.has(domain)) {
-    return robotsCache.get(domain);
+  if (robotsCache.has(cacheKey)) {
+    return robotsCache.get(cacheKey);
   }
@@
-      robotsCache.set(domain, result);
+      robotsCache.set(cacheKey, result);
@@
-          robotsCache.set(domain, result);
+          robotsCache.set(cacheKey, result);
@@
-    robotsCache.set(domain, result);
+    robotsCache.set(cacheKey, result);
@@
-    robotsCache.set(domain, result);
+    robotsCache.set(cacheKey, result);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/portals/_shared.mjs` around lines 40 - 93, checkRobotsTxt currently
caches a single allow/deny decision per domain (robotsCache), so the first URL
checked decides every later path on that host; update checkRobotsTxt to either
(A) cache decisions under a per-path key (e.g., `${domain}${parsedUrl.pathname}`)
when storing results in robotsCache, or (B) collect all Disallow rules for the
wildcard group into an array, cache that rule list per domain, and evaluate each
requested path against the cached rules instead of caching the boolean result;
modify the cache set/get usage around robotsCache accordingly, and keep the
existing abort/timeout and error handling intact.
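
A rough sketch of the rule-list variant, with the network fetch left out: parse the wildcard group once, cache the rules per domain, and test each path against them. All names here (`parseWildcardDisallows`, `isPathAllowed`, `checkPath`, `rulesByDomain`) are hypothetical. Like the original parser, it is deliberately simplified: it ignores `Allow` lines and `*`/`$` patterns, and it treats each `User-agent` line as starting its own group. Unlike the original, it does not lowercase the rule values, because URL paths are case-sensitive.

```javascript
// Collect the Disallow prefixes of the "User-agent: *" group(s).
function parseWildcardDisallows(text) {
  const rules = [];
  let inWildcard = false;
  for (const rawLine of text.split('\n')) {
    const line = rawLine.split('#')[0].trim(); // drop trailing comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') inWildcard = value === '*';
    else if (field === 'disallow' && inWildcard && value) rules.push(value);
  }
  return rules;
}

// A path is allowed when no cached Disallow prefix matches it.
function isPathAllowed(rules, path) {
  return !rules.some((prefix) => path.startsWith(prefix));
}

// Cache the rule list per domain, never the boolean decision.
const rulesByDomain = new Map();
function checkPath(domain, robotsText, path) {
  if (!rulesByDomain.has(domain)) {
    rulesByDomain.set(domain, parseWildcardDisallows(robotsText));
  }
  return isPathAllowed(rulesByDomain.get(domain), path);
}

const robots = 'User-agent: *\nDisallow: /private\n\nUser-agent: bot\nDisallow: /';
console.log(checkPath('example.com', robots, '/public-job'));  // true
console.log(checkPath('example.com', robots, '/private-job')); // false
```

The second call is answered from the cache, yet still returns the right result for its own path, which is the behaviour the domain-keyed boolean cache cannot provide.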

Comment on lines +173 to +180
export async function newPage() {
const browser = await getBrowser();
const context = await browser.newContext({
userAgent: USER_AGENT,
locale: 'fr-FR',
});
return context.newPage();
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect Playwright context creation/closing in portal utilities.
# Expect: Every browser.newContext() has a matching context.close() path.

rg -n -C4 '\bnewContext\s*\(|\.context\(\)\.close\s*\(|\.close\s*\(' --iglob '*.mjs'

Repository: santifer/career-ops

Length of output: 2368


🏁 Script executed:

# Read the full context around the flagged lines to see both occurrences
sed -n '165,250p' adapters/portals/_shared.mjs | cat -n

Repository: santifer/career-ops

Length of output: 2761


🏁 Script executed:

# Find all newPage() calls in the codebase to understand usage patterns
rg -n 'newPage\s*\(' --iglob '*.mjs' -A 5 -B 2

Repository: santifer/career-ops

Length of output: 2198


🏁 Script executed:

# Check how pages are closed after being created
rg -n 'page\.close\s*\(|context\s*\(' adapters/portals/_shared.mjs -B 3 -A 1

Repository: santifer/career-ops

Length of output: 164


🌐 Web query:

Playwright page.context() API close context vs page.close()

💡 Result:

page.context().close() closes the entire BrowserContext and all Pages within it. page.close() closes only the specific Page, leaving the BrowserContext and other Pages open.

From the official Playwright docs:

  • BrowserContext.close: "Closes the browser context. All the pages that belong to the browser context will be closed." Note: the default browser context cannot be closed.
  • Page.close: closes the specific page. The optional runBeforeUnload parameter controls whether to run beforeunload handlers (default false). The page's context remains open.

Key differences:

  • Scope: context.close() affects all pages in the context; page.close() affects only that page.
  • Use case: use page.close() to close individual tabs/pages while keeping the session (cookies, storage) alive. Use context.close() for full isolation cleanup, ensuring artifacts like HARs/videos are saved gracefully before browser.close().


🏁 Script executed:

# Search for any other references to context or page lifecycle in the shared module
rg -n 'context|page\.' adapters/portals/_shared.mjs | head -30

Repository: santifer/career-ops

Length of output: 252


🏁 Script executed:

# Check if there are any comments or documentation about context management
rg -n '(context|memory|leak|cleanup)' adapters/portals/_shared.mjs -i

Repository: santifer/career-ops

Length of output: 143


🏁 Script executed:

# Find all calls to newPage() to assess impact scope
rg -n '\.newPage\s*\(|newPage\s*\(' --iglob '*.mjs' -B 3 -A 8

Repository: santifer/career-ops

Length of output: 3113


🏁 Script executed:

# Check if there are any tests or other files that might call newPage indirectly
rg -n 'fetchWithPlaywright|newPage' --iglob '*.mjs' | grep -v node_modules

Repository: santifer/career-ops

Length of output: 1425


Close Playwright contexts to prevent memory leaks from accumulated cookies and storage.

Each newPage() call in fetchWithPlaywright() creates a new BrowserContext (line 175), but only the page is closed in the finally block (line 245). Contexts remain in the browser instance and accumulate cookies, storage, and routes across multiple portal requests.

Replace page.close() with page.context().close() in the finally block. Per Playwright docs, closing the context automatically closes all pages within it, ensuring proper cleanup of isolation artifacts.

Lifecycle fix
 export async function fetchWithPlaywright(url, extractFn, portalName) {
   // ...
   const page = await newPage();
   try {
     // extraction logic
   } finally {
-    await page.close();
+    await page.context().close();
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/portals/_shared.mjs` around lines 173 - 180, The
fetchWithPlaywright() flow opens a new BrowserContext via newPage() but only
calls page.close(), leaving contexts (and cookies/storage) alive; in the finally
block of fetchWithPlaywright() replace the page.close() call with
page.context().close() so the entire BrowserContext (and all pages) is disposed;
locate references to newPage(), fetchWithPlaywright(), and the finally cleanup
where page.close() is invoked and change that single cleanup call to
page.context().close().
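
An alternative to patching the `finally` block is to make the context lifetime explicit with a small wrapper. `withPage` below is a hypothetical helper, not existing code in `_shared.mjs`; it assumes only that `browser` exposes Playwright's `newContext()` and that the returned context exposes `newPage()` and `close()`.

```javascript
// Hypothetical helper: owns the BrowserContext for the duration of `fn`.
// `contextOptions` would carry the same userAgent/locale used by newPage().
async function withPage(browser, contextOptions, fn) {
  const context = await browser.newContext(contextOptions);
  try {
    const page = await context.newPage();
    return await fn(page);
  } finally {
    // Closing the context also closes every page that belongs to it.
    await context.close();
  }
}

// Sketch of a call site (names from the reviewed code, wiring assumed):
// const jd = await withPage(await getBrowser(), { locale: 'fr-FR' }, (page) =>
//   page.goto(url).then(() => extractFn(page)));
```

Because the context is created and closed in the same function, a thrown extraction error cannot leave it behind.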

Comment thread adapters/portals/apec.mjs
Comment on lines +19 to +34
async function extractFromPage(page) {
return page.evaluate((sel) => {
const getText = (selector) => {
const el = document.querySelector(selector);
return el ? el.textContent.trim() : null;
};

return {
title: getText(sel.title),
company: getText(sel.company),
location: getText(sel.location),
contractType: getText(sel.contractType),
salary: getText(sel.salary),
description: getText(sel.description),
};
}, SELECTORS);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /repo && git ls-files | grep -E "(apec|fetch.*playwright)" | head -20

Repository: santifer/career-ops

Length of output: 118


🏁 Script executed:

find /repo -type f -name "*.mjs" | grep -E "(apec|playwright)" | head -20

Repository: santifer/career-ops

Length of output: 103


🏁 Script executed:

find /repo -type f -name "apec.mjs"

Repository: santifer/career-ops

Length of output: 103


🏁 Script executed:

pwd && ls -la

Repository: santifer/career-ops

Length of output: 5125


🏁 Script executed:

find . -type f -name "apec.mjs" 2>/dev/null

Repository: santifer/career-ops

Length of output: 90


🏁 Script executed:

ls -la adapters/portals/ 2>/dev/null || find . -type d -name "adapters" 2>/dev/null

Repository: santifer/career-ops

Length of output: 622


🏁 Script executed:

cat -n adapters/portals/apec.mjs

Repository: santifer/career-ops

Length of output: 1932


🏁 Script executed:

cat -n adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 9448


🏁 Script executed:

rg -A 5 "fetch\(" adapters/portals/*.mjs | head -40

Repository: santifer/career-ops

Length of output: 2164


🏁 Script executed:

rg "apec\.mjs\|apec" --type=mjs | grep -E "(import|from|require)" | head -20

Repository: santifer/career-ops

Length of output: 90


🏁 Script executed:

grep -r "hostname\|protocol\|host" adapters/portals/ --include="*.mjs"

Repository: santifer/career-ops

Length of output: 223


🏁 Script executed:

rg "allowlist\|whitelist\|SSRF\|domain.*check\|validate.*url" adapters/ --type=js -B 2 -A 2

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

rg "getDomain\|hostname" adapters/portals/_shared.mjs -B 3 -A 3

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

grep -r "apec\.apec\|apec\.org\|apec\.fr" . --include="*.mjs" --include="*.json"

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

rg "apec.*fetch\|from.*apec" . --type=js -B 2 -A 2

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

cat scan.mjs | head -100

Repository: santifer/career-ops

Length of output: 2829


🏁 Script executed:

rg "fetchWithPlaywright\|adapters/portals" scan.mjs -B 3 -A 3

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

rg "from.*apec\|import.*apec" . --type=js

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

rg "fetchWithPlaywright" . --type=js -B 2 -A 2

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

rg "SSRF\|host.*allow\|domain.*check" . --type=js -i

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

ls -la modes/

Repository: santifer/career-ops

Length of output: 1638


🏁 Script executed:

rg "getDomain\|newPage\|page\.goto" adapters/portals/_shared.mjs -B 1 -A 5

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

grep -n "url\|domain\|hostname\|protocol" adapters/portals/_shared.mjs | head -30

Repository: santifer/career-ops

Length of output: 1384


🏁 Script executed:

cat -n adapters/portals/_shared.mjs | sed -n '200,230p'

Repository: santifer/career-ops

Length of output: 1077


🏁 Script executed:

rg "apec.*fetch\|from.*apec\|import.*apec" . -A 2 -B 2

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

grep -rn "import.*adapters\|from.*adapters" . --include="*.mjs" | head -20

Repository: santifer/career-ops

Length of output: 174


🏁 Script executed:

cat modes/scan.md | head -80

Repository: santifer/career-ops

Length of output: 4780


🏁 Script executed:

cat -n adapters/portals/apec.mjs | sed -n '10,20p'

Repository: santifer/career-ops

Length of output: 701


🏁 Script executed:

rg "whitelist|allowlist|TRUSTED.*HOST|HOST.*CHECK" adapters/ -i

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

rg "page\.evaluate|waitForSelector" adapters/portals/ -B 2 -A 5

Repository: santifer/career-ops

Length of output: 2666


🏁 Script executed:

cat -n adapters/portals/welcometothejungle.mjs | head -40

Repository: santifer/career-ops

Length of output: 1859


🏁 Script executed:

rg "page\.goto\|newPage" adapters/portals/_shared.mjs -B 3 -A 3

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

grep -n "TRUSTED\|HOST\|ALLOW" CLAUDE.md | head -20

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

cat -n adapters/portals/welcometothejungle.mjs | sed -n '1,40p'

Repository: santifer/career-ops

Length of output: 1859


🏁 Script executed:

rg "SELECTORS\." adapters/portals/welcometothejungle.mjs -B 2 -A 2

Repository: santifer/career-ops

Length of output: 291


🏁 Script executed:

grep -n "TODO\|vérifier" adapters/portals/apec.mjs

Repository: santifer/career-ops

Length of output: 138


🏁 Script executed:

python3 << 'EOF'
# Verify that Playwright's waitForSelector handles comma-separated selectors
import re

selectors = 'h1[class*="offer-title"], h1[class*="detailOffre"], h1'
print(f"Selector string: {selectors}")
print(f"Type: comma-separated CSS (valid for document.querySelector and Playwright)")
print(f"Playwright handles: ✓ Single selector")
print(f"Playwright handles: ✓ Comma-separated selectors (will wait for ANY to match)")
EOF

Repository: santifer/career-ops

Length of output: 325


🏁 Script executed:

cat -n adapters/portals/_shared.mjs | sed -n '225,236p'

Repository: santifer/career-ops

Length of output: 472


Add selector wait before extraction, matching Welcome to the Jungle pattern.

APEC uses unverified, fallback-heavy CSS selectors. If the job detail renders asynchronously (after domcontentloaded), extractFromPage runs immediately and returns { title: null }, triggering a manual fallback even though the page is valid.

Welcome to the Jungle already implements this pattern (line 20):

await page.waitForSelector(SELECTORS.title, { timeout: 10000 }).catch(() => {});

Apply the same approach to APEC:

⏱️ Proposed wait before extraction
 async function extractFromPage(page) {
+  await page.waitForSelector(SELECTORS.title, { timeout: 10000 }).catch(() => {});
+
   return page.evaluate((sel) => {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@adapters/portals/apec.mjs` around lines 19 - 34, The extractFromPage function
may run before async-rendered job fields appear, causing nulls; mirror the
Welcome to the Jungle pattern by awaiting page.waitForSelector(SELECTORS.title,
{ timeout: 10000 }).catch(() => {}) at the start of extractFromPage (before
page.evaluate) so the scraper gives the page time to render; keep using
SELECTORS and preserve the existing page.evaluate logic for extraction.

Comment thread README.fr.md
Comment on lines +91 to +94
# 3. Configurer
cp config/profile.example.yml config/profile.yml # Édite avec tes infos
cp templates/portals.example.yml portals.yml # Personnalise les entreprises


⚠️ Potential issue | 🟠 Major

Use the France-specific portal template in the French README.

The PR adds a French portal template, but the README tells users to copy the generic templates/portals.example.yml. New FR users may miss the France Travail/APEC/HelloWork defaults described in this same document.

📘 Proposed documentation fix
-cp templates/portals.example.yml portals.yml       # Personnalise les entreprises
+cp templates/portals-fr.example.yml portals.yml    # Personnalise les portails FR
-Copie `templates/portals.example.yml` vers `portals.yml` et ajoute tes propres entreprises.
+Copie `templates/portals-fr.example.yml` vers `portals.yml` et ajoute tes propres entreprises.

Also applies to: 179-179

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.fr.md` around lines 91 - 94, Update the French README to instruct
users to copy the France-specific portal template instead of the generic one:
change the copy command that references templates/portals.example.yml to use
templates/portals-fr.example.yml (output still to portals.yml) so French users
get the France Travail/APEC/HelloWork defaults; apply the same change wherever
the generic template is referenced (e.g., the other occurrence of
templates/portals.example.yml around line 179).

Comment thread templates/cv-template-fr.html
Comment on lines +404 to +407
{{#DRIVING_LICENSE}}
<span class="separator">|</span>
<span>{{DRIVING_LICENSE}}</span>
{{/DRIVING_LICENSE}}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Find template replacement/rendering code and conditional-section support.
# Expect: A Mustache/Handlebars-like renderer, or explicit handling for {{`#KEY`}}...{{/KEY}} blocks.

rg -n -C4 '\{\{#|\{\{/[A-Z_]+|replace\s*\(|mustache|handlebars|template' --type=js --type=mjs

Repository: santifer/career-ops

Length of output: 90


🏁 Script executed:

#!/bin/bash
# Find the generate-pdf.mjs file and look for template rendering logic
find . -name "generate-pdf.mjs" -o -name "*.mjs" | head -20

Repository: santifer/career-ops

Length of output: 554


🏁 Script executed:

#!/bin/bash
# Search for template processing logic without relying on mjs file type
rg -n -C4 '\{\{#|\{\{/[A-Z_]+|mustache|handlebars|Handlebars|Mustache' --type=js

Repository: santifer/career-ops

Length of output: 211


🏁 Script executed:

#!/bin/bash
# Look for package.json to understand dependencies
find . -name "package.json" -type f | xargs cat

Repository: santifer/career-ops

Length of output: 1545


🏁 Script executed:

#!/bin/bash
# Check file size before reading
wc -l ./generate-pdf.mjs

Repository: santifer/career-ops

Length of output: 85


🏁 Script executed:

#!/bin/bash
# Search for template processing/replacement logic
rg -n 'replace|template|html|fill|render' ./generate-pdf.mjs -i -A3 -B1

Repository: santifer/career-ops

Length of output: 3307


🏁 Script executed:

#!/bin/bash
# Search for template filling/rendering logic in all .mjs files
rg -n 'readFile|template|fill|render|\.html|DRIVING_LICENSE|NAME|EMAIL' --type-list | head -5

Repository: santifer/career-ops

Length of output: 452


🏁 Script executed:

#!/bin/bash
# Look for how templates are actually used - search for references to cv-template
rg -n 'cv-template|templates.*html|\.html' -A2 -B2

Repository: santifer/career-ops

Length of output: 15167


🏁 Script executed:

#!/bin/bash
# Check the templates/README.md mentioned in the review to understand the expected placeholder handling
cat ./templates/README.md

Repository: santifer/career-ops

Length of output: 3256


🏁 Script executed:

#!/bin/bash
# Check if cv-template-fr.html exists and examine it
find . -name "*cv-template*" -type f

Repository: santifer/career-ops

Length of output: 151


🏁 Script executed:

#!/bin/bash
# Read the French template to see what conditional syntax is actually used
head -100 ./templates/cv-template-fr.html

Repository: santifer/career-ops

Length of output: 2717


🏁 Script executed:

#!/bin/bash
# Search for the upstream template filling logic
rg -n 'readFileSync|template|fill|replace.*{{' --type-list

Repository: santifer/career-ops

Length of output: 452


🏁 Script executed:

#!/bin/bash
# Check file size of cv-template-fr.html
wc -l ./templates/cv-template-fr.html

Repository: santifer/career-ops

Length of output: 98


🏁 Script executed:

#!/bin/bash
# Read lines 400-410 to see the DRIVING_LICENSE section mentioned in the review
sed -n '400,410p' ./templates/cv-template-fr.html

Repository: santifer/career-ops

Length of output: 385


🏁 Script executed:

#!/bin/bash
# Search for where templates are actually filled (likely in Python or batch prompts)
rg -n 'readFile.*template|fill.*template|replace.*{{' -i

Repository: santifer/career-ops

Length of output: 234


🏁 Script executed:

#!/bin/bash
# Search for where the template is actually filled with data
rg -n 'template' -i --type md | grep -i fill | head -20

Repository: santifer/career-ops

Length of output: 540


🏁 Script executed:

#!/bin/bash
# Look at batch-prompt.md to see how the template is actually populated
sed -n '220,280p' ./batch/batch-prompt.md

Repository: santifer/career-ops

Length of output: 2626


🏁 Script executed:

#!/bin/bash
# Check if there's any Node.js code that does template filling
rg -n 'readFile|template' --type js | head -30

Repository: santifer/career-ops

Length of output: 211


🏁 Script executed:

#!/bin/bash
# Look for DRIVING_LICENSE and other conditional sections in the template
rg -n '{{#|{{/' ./templates/cv-template-fr.html

Repository: santifer/career-ops

Length of output: 145


🏁 Script executed:

#!/bin/bash
# Compare with the English template to see if it also uses conditionals
rg -n '{{#|{{/' ./templates/cv-template.html | head -20

Repository: santifer/career-ops

Length of output: 145


🏁 Script executed:

#!/bin/bash
# Search for any Handlebars or Mustache mention in documentation
rg -i 'handlebars|mustache|conditional|#[A-Z]' ./batch ./modes ./docs

Repository: santifer/career-ops

Length of output: 1715


🏁 Script executed:

#!/bin/bash
# Search for conditional markers - use literal strings instead of regex
grep -n 'DRIVING_LICENSE\|INTERESTS\|AVAILABILITY\|MOBILITY' ./templates/cv-template-fr.html

Repository: santifer/career-ops

Length of output: 679


🏁 Script executed:

#!/bin/bash
# Check the English template for same pattern
grep -n 'DRIVING_LICENSE\|INTERESTS\|AVAILABILITY\|MOBILITY' ./templates/cv-template.html

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Look for where the template gets filled - search for any JS/Node code that reads and fills templates
find . -name "*.mjs" -type f | xargs grep -l "template\|html" | head -10

Repository: santifer/career-ops

Length of output: 172


🏁 Script executed:

#!/bin/bash
# Check if doctor.mjs or other utilities have template handling logic
grep -n 'template\|mustache\|handlebars\|conditional' ./doctor.mjs -i | head -20

Repository: santifer/career-ops

Length of output: 125


🏁 Script executed:

#!/bin/bash
# Look at batch-prompt.md more carefully - specifically the template placeholders section
sed -n '265,320p' ./batch/batch-prompt.md

Repository: santifer/career-ops

Length of output: 2572


🏁 Script executed:

#!/bin/bash
# Search for any mentions of conditional rendering or {{#...}} in documentation
grep -rn 'conditional\|{{#\|Mustache\|Handlebars' ./batch ./modes ./docs --include="*.md"

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Check if there's any Handlebars/Mustache processing in the GEMINI.md or CLAUDE.md guides
grep -i 'handlebars\|mustache\|conditional\|#[A-Z_]*}}' ./GEMINI.md ./CLAUDE.md

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Look at test-all.mjs to see if it provides any clues about template processing
sed -n '140,160p' ./test-all.mjs

Repository: santifer/career-ops

Length of output: 646


Remove Mustache-style conditionals from cv-template-fr.html—the rendering pipeline does not support them.

The template uses {{#DRIVING_LICENSE}}...{{/DRIVING_LICENSE}}, {{#INTERESTS}}, {{#AVAILABILITY}}, and {{#MOBILITY}} blocks (lines 404–407, 495–506), but the PDF generation workflow only performs simple token replacement ({{NAME}} → value). No Mustache or Handlebars library is in the dependency chain, and generate-pdf.mjs reads pre-filled HTML files without processing conditionals. These markers will render literally in the final PDF.

Either replace conditionals with unconditional placeholders filled by the upstream AI agent, or integrate a Mustache renderer into the template-filling step.
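
To illustrate the second option, the sketch below handles `{{#KEY}}...{{/KEY}}` sections and plain `{{KEY}}` tokens with two regular expressions. `fillTemplate` is a hypothetical helper, not a function in this repository. It covers only flat, non-nested sections and does no HTML escaping, so a real Mustache library would be the safer choice for anything beyond this.

```javascript
// Hypothetical helper, not part of career-ops. Handles flat sections only.
function fillTemplate(html, data) {
  // 1. Keep the body of {{#KEY}}...{{/KEY}} when data[KEY] is truthy, else drop it.
  const withSections = html.replace(
    /\{\{#([A-Z_]+)\}\}([\s\S]*?)\{\{\/\1\}\}/g,
    (_, key, body) => (data[key] ? body : '')
  );
  // 2. Replace the remaining {{KEY}} tokens; unknown keys become empty strings.
  return withSections.replace(/\{\{([A-Z_]+)\}\}/g, (_, key) => data[key] ?? '');
}

const snippet =
  '<span>{{NAME}}</span>' +
  '{{#DRIVING_LICENSE}} | <span>{{DRIVING_LICENSE}}</span>{{/DRIVING_LICENSE}}';

console.log(fillTemplate(snippet, { NAME: 'Ada', DRIVING_LICENSE: 'Permis B' }));
console.log(fillTemplate(snippet, { NAME: 'Ada' }));
```

With a value supplied, the separator and the licence span are both kept; without one, the whole section disappears, so no `{{#...}}` marker reaches the PDF.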

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@templates/cv-template-fr.html` around lines 404 - 407, The template contains
Mustache-style conditional blocks (e.g.,
{{`#DRIVING_LICENSE`}}...{{/DRIVING_LICENSE}}, and similar blocks for INTERESTS,
AVAILABILITY, MOBILITY) but the PDF pipeline only does token replacement, so
those markers will appear verbatim; fix by removing the Mustache section markers
and leaving plain placeholders (e.g., replace the entire
{{`#DRIVING_LICENSE`}}...{{/DRIVING_LICENSE}} block with a single unconditional
{{DRIVING_LICENSE}} token) so the upstream agent can supply empty or real
values, or alternatively wire a Mustache renderer into the HTML filling step
(ensure generate-pdf.mjs invokes the renderer before PDF generation and passes
the data map) and update references to DRIVING_LICENSE, INTERESTS, AVAILABILITY,
and MOBILITY accordingly.

Comment thread templates/portals-fr.example.yml
Comment on lines +185 to +200
# -- Requêtes de recherche pour le scanner --

search_queries:
  # [PERSONNALISER] Adapte les requêtes à tes rôles cibles
  - query: "développeur fullstack CDI"
    source: france-travail
  - query: "ingénieur logiciel Paris"
    source: france-travail
  - query: "commercial B2B SaaS"
    source: france-travail
  - query: "tech lead CDI"
    source: apec
  - query: "développeur senior"
    source: welcometothejungle
  - query: "alternance développeur"
    source: france-travail

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm scan.mjs still only reads tracked_companies + the old search_queries shape.
rg -nP -C2 '\b(tracked_companies|search_queries|title_filter|portals)\b' scan.mjs
# Check whether a new loader handles adapter/tier from portals-fr.example.yml.
rg -nP -C3 '\b(adapter|tier|auth|base_url)\b' --type=js --type=mjs

Repository: santifer/career-ops

Length of output: 1083


🏁 Script executed:

# First, check the actual structure of the two template files
echo "=== templates/portals-fr.example.yml ==="
cat templates/portals-fr.example.yml

echo -e "\n\n=== templates/portals.example.yml ==="
cat templates/portals.example.yml

Repository: santifer/career-ops

Length of output: 38250


🏁 Script executed:

# Search for adapter/tier/auth/source references to see if new schema is handled
rg -n '\b(adapter|tier|auth)' --type=yaml
echo "---"
# Also check for source field references  
rg -n '\bsource\b' --type=js

Repository: santifer/career-ops

Length of output: 1566


🏁 Script executed:

# Search for code that reads adapter field or portals config structure
rg -n "adapter" --type=js --type=mjs -C2 | head -50

echo "---"

# Check if scan.mjs was modified to handle portals: instead of tracked_companies
rg -n "portals:" --type=js --type=mjs -C2

echo "---"

# Check for any references to tier field in code
rg -n "\btier\b" --type=js --type=mjs -C2

Repository: santifer/career-ops

Length of output: 171


🏁 Script executed:

# Check git status to see if scan.mjs was recently modified
git log --oneline -20 -- scan.mjs scan-auth.mjs

echo "---"

# Check the actual current scan.mjs to see if it loads portals or tracked_companies
sed -n '258,290p' scan.mjs

Repository: santifer/career-ops

Length of output: 1327


🏁 Script executed:

# Check if scan-auth.mjs is a new scanner that handles the portals: schema
head -50 scan-auth.mjs

echo "---"

# Check if adapter files exist
find . -type f -name "*.mjs" -path "*/adapters/*" 2>/dev/null | head -10

echo "---"

# Check what scan-auth.mjs reads from config
rg -n "config\." scan-auth.mjs | head -30

Repository: santifer/career-ops

Length of output: 452


🏁 Script executed:

# Check if scan-auth.mjs exists or if the learning was referring to a different PR
git log --oneline -10

echo "---"

# List files in repo root and check what was added
git ls-files | grep -E "(scan|auth)" 

echo "---"

# Check the actual structure of adapters
find adapters -type f | head -20

Repository: santifer/career-ops

Length of output: 475


🏁 Script executed:

# Check what files the adapter files are actually doing
head -30 adapters/portals/france-travail.mjs

echo "---"

head -30 adapters/portals/_shared.mjs

Repository: santifer/career-ops

Length of output: 2031


🏁 Script executed:

# Search for code that iterates through portals: config or loads adapters
rg -n "portals\[" --type js
rg -n "\.adapter" --type js
rg -n "import.*adapter" --type js

echo "---"

# Check if there's any reference to 'source' field in search_queries
rg -n "\.source" --type js | grep -i search

echo "---"

# Look for any file that might be a new scanner using the FR template schema
ls -la *.mjs *.js 2>/dev/null | grep -E "scan|crawler"

Repository: santifer/career-ops

Length of output: 742


🏁 Script executed:

# Check git diff or git show to see what changed in this PR
git diff HEAD~1 HEAD -- templates/portals-fr.example.yml | head -100

echo "---"

# Check if there's a new scan-fr.mjs or similar
git ls-files | grep -i "scan.*fr"

Repository: santifer/career-ops

Length of output: 112


🏁 Script executed:

# Check git status to understand the state of the repo
git status

echo "---"

# Check if the adapters are actually being called/imported anywhere
rg -n "france-travail|adapters/portals" --type js

echo "---"

# Check GUIDE-UTILISATION to see how portals-fr.example.yml is documented
cat docs/GUIDE-UTILISATION.md | grep -A 10 -B 5 "portals-fr"

Repository: santifer/career-ops

Length of output: 1591


🏁 Script executed:

# Check if the adapters are standalone scripts or modules meant to be used with scan.mjs
head -5 adapters/portals/apec.mjs

echo "---"

# Check the current state - is portals.yml supposed to exist?
cat portals.yml 2>/dev/null || echo "portals.yml does not exist"

echo "---"

# Look at the learning about Step 3 - what does it say about portals.yml vs portals-fr.example.yml
cat CLAUDE.md 2>/dev/null | head -20

Repository: santifer/career-ops

Length of output: 1585


Schema mismatch — templates/portals-fr.example.yml is incompatible with scan.mjs.

Following the documented onboarding (GUIDE-UTILISATION.md step 2.3), users copy templates/portals-fr.example.yml to portals.yml. However, scan.mjs expects a different schema and will fail silently:

  1. Missing tracked_companies: scan.mjs iterates config.tracked_companies (line 265), which the FR template omits entirely — scan.mjs processes an empty list and returns no results.
  2. Incompatible search_queries shape: scan.mjs expects { name, query, enabled }, but the FR template uses { query, source }.
  3. Orphaned adapter modules: The template defines portals[] with adapter, tier, auth fields, and adapter files exist in adapters/portals/, but scan.mjs does not consume them.

Either (a) extend scan.mjs to load the new portals: schema and dispatch to adapter modules, or (b) align the FR template to follow the tracked_companies/search_queries contract of the existing scanner. The current state breaks the documented flow.
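
For option (b), one possible shape of the conversion is sketched below. `normalizeQueries` is a hypothetical helper. The target shape `{ name, query, enabled }` comes from the finding above, while the derived `name` format and the `enabled: true` default are assumptions made for illustration.

```javascript
// Hypothetical converter from the FR template shape to the scanner shape
// described in this review. Name format and enabled default are assumptions.
function normalizeQueries(queries) {
  return queries.map((q) => ({
    name: q.name ?? `${q.source ?? 'generic'}: ${q.query}`,
    query: q.query,
    enabled: q.enabled ?? true,
    source: q.source ?? null, // kept as optional metadata
  }));
}

const frQueries = [
  { query: 'développeur fullstack CDI', source: 'france-travail' },
  { query: 'tech lead CDI', source: 'apec' },
];

console.log(normalizeQueries(frQueries));
```

Keeping `source` as an extra field leaves room for a later adapter dispatch (option a) without changing the template again.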

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@templates/portals-fr.example.yml` around lines 185 - 200, The template's
schema doesn't match what scan.mjs expects; update
templates/portals-fr.example.yml to follow the scanner contract by adding a
tracked_companies array (populate entries matching config.tracked_companies used
by scan.mjs) and change each search_queries item from { query, source } to the
shape scan.mjs expects: { name, query, enabled } (keep source as optional
metadata inside name or a separate field if you want but ensure
query/name/enabled exist). Also remove or convert the existing portals[]
adapter/tier/auth blocks into either entries under tracked_companies or
documented optional metadata so scan.mjs won't ignore them. Ensure the file's
keys exactly match config.tracked_companies and search_queries so the scanner
iterates and returns results.

Comment thread tests/run-golden.mjs
Comment on lines +44 to +63
  if (!testCase.expectedScore?.min || !testCase.expectedScore?.max) {
    checks.push('expectedScore.min/max manquant');
  }

  // Vérifier que le fichier d'entrée existe
  if (testCase.inputType === 'file') {
    const inputPath = join(ROOT, testCase.input);
    if (!existsSync(inputPath)) {
      checks.push(`fichier introuvable : ${testCase.input}`);
    }
  }

  // Vérifier la cohérence du score
  if (testCase.expectedScore) {
    if (testCase.expectedScore.min > testCase.expectedScore.max) {
      checks.push('expectedScore.min > expectedScore.max');
    }
    if (testCase.expectedScore.min < 1 || testCase.expectedScore.max > 5) {
      checks.push('expectedScore hors plage [1, 5]');
    }

⚠️ Potential issue | 🟠 Major

Validate score types before comparing ranges.

expectedScore.min/max can be strings and still bypass parts of this validation. Since this runner is the integrity gate for golden cases, require finite numbers before checking order and range.
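
The bypass can be reproduced in isolation. The sketch below mirrors the original checks in a standalone function (`originalChecks`, a name chosen here for illustration, with English messages) and compares it with a stricter variant. `Number.isFinite` does not coerce its argument, so it rejects strings, `undefined`, and `NaN` in one test.

```javascript
// Mirrors the original checks from run-golden.mjs so the bypass can be observed.
function originalChecks(score) {
  const checks = [];
  if (!score?.min || !score?.max) checks.push('missing');
  if (score) {
    if (score.min > score.max) checks.push('min > max');
    if (score.min < 1 || score.max > 5) checks.push('out of range');
  }
  return checks;
}

// Stricter variant: require finite numbers before comparing order and range.
function strictChecks(score) {
  const { min, max } = score ?? {};
  if (!Number.isFinite(min) || !Number.isFinite(max)) return ['not a finite number'];
  const checks = [];
  if (min > max) checks.push('min > max');
  if (min < 1 || max > 5) checks.push('out of range');
  return checks;
}

console.log(originalChecks({ min: '2', max: '4' })); // strings slip through: []
console.log(strictChecks({ min: '2', max: '4' }));   // ['not a finite number']
```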

🧪 Proposed validation fix
-  if (!testCase.expectedScore?.min || !testCase.expectedScore?.max) {
-    checks.push('expectedScore.min/max manquant');
-  }
+  const scoreMin = testCase.expectedScore?.min;
+  const scoreMax = testCase.expectedScore?.max;
+  if (
+    typeof scoreMin !== 'number' ||
+    !Number.isFinite(scoreMin) ||
+    typeof scoreMax !== 'number' ||
+    !Number.isFinite(scoreMax)
+  ) {
+    checks.push('expectedScore.min/max manquant ou non numérique');
+  }
@@
-  if (testCase.expectedScore) {
-    if (testCase.expectedScore.min > testCase.expectedScore.max) {
+  if (typeof scoreMin === 'number' && typeof scoreMax === 'number') {
+    if (scoreMin > scoreMax) {
       checks.push('expectedScore.min > expectedScore.max');
     }
-    if (testCase.expectedScore.min < 1 || testCase.expectedScore.max > 5) {
+    if (scoreMin < 1 || scoreMax > 5) {
       checks.push('expectedScore hors plage [1, 5]');
     }
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-  if (!testCase.expectedScore?.min || !testCase.expectedScore?.max) {
-    checks.push('expectedScore.min/max manquant');
-  }
-  // Vérifier que le fichier d'entrée existe
-  if (testCase.inputType === 'file') {
-    const inputPath = join(ROOT, testCase.input);
-    if (!existsSync(inputPath)) {
-      checks.push(`fichier introuvable : ${testCase.input}`);
-    }
-  }
-  // Vérifier la cohérence du score
-  if (testCase.expectedScore) {
-    if (testCase.expectedScore.min > testCase.expectedScore.max) {
-      checks.push('expectedScore.min > expectedScore.max');
-    }
-    if (testCase.expectedScore.min < 1 || testCase.expectedScore.max > 5) {
-      checks.push('expectedScore hors plage [1, 5]');
-    }
+  const scoreMin = testCase.expectedScore?.min;
+  const scoreMax = testCase.expectedScore?.max;
+  if (
+    typeof scoreMin !== 'number' ||
+    !Number.isFinite(scoreMin) ||
+    typeof scoreMax !== 'number' ||
+    !Number.isFinite(scoreMax)
+  ) {
+    checks.push('expectedScore.min/max manquant ou non numérique');
+  }
+  // Vérifier que le fichier d'entrée existe
+  if (testCase.inputType === 'file') {
+    const inputPath = join(ROOT, testCase.input);
+    if (!existsSync(inputPath)) {
+      checks.push(`fichier introuvable : ${testCase.input}`);
+    }
+  }
+  // Vérifier la cohérence du score
+  if (typeof scoreMin === 'number' && typeof scoreMax === 'number') {
+    if (scoreMin > scoreMax) {
+      checks.push('expectedScore.min > expectedScore.max');
+    }
+    if (scoreMin < 1 || scoreMax > 5) {
+      checks.push('expectedScore hors plage [1, 5]');
+    }
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/run-golden.mjs` around lines 44 - 63, The validation currently only
checks presence of testCase.expectedScore?.min/max but doesn't ensure they're
numbers; update the checks around expectedScore (references: expectedScore,
expectedScore.min, expectedScore.max, checks) to first verify both values are
finite numbers (use typeof === 'number' && Number.isFinite(...) or coerce and
test with Number.isFinite), push an error if not numeric, and only then perform
the ordering (min > max) and range (min < 1 || max > 5) validations; keep the
existing error messages but add a new message like 'expectedScore.min/max not a
finite number' when the numeric check fails.

Comment thread tests/run-golden.mjs
Comment on lines +49 to +52
  if (testCase.inputType === 'file') {
    const inputPath = join(ROOT, testCase.input);
    if (!existsSync(inputPath)) {
      checks.push(`fichier introuvable : ${testCase.input}`);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Report file inputs in golden cases that are absolute paths or escape the repo root.
python - <<'PY'
import json
from pathlib import Path

cases_path = Path("tests/golden/cases.json")
if not cases_path.exists():
    print("tests/golden/cases.json not found")
    raise SystemExit(0)

root = Path.cwd().resolve()
cases = json.loads(cases_path.read_text())
for case in cases:
    if case.get("inputType") != "file":
        continue
    raw = case.get("input", "")
    resolved = (root / raw).resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        print(f"{case.get('id', '<missing id>')}: escapes repo root -> {raw}")
PY

Repository: santifer/career-ops

Length of output: 45


🏁 Script executed:

# Verify the file exists and check the code around lines 49-52
wc -l tests/run-golden.mjs

Repository: santifer/career-ops

Length of output: 86


🏁 Script executed:

# Read the actual code to verify the snippet and surrounding context
sed -n '40,60p' tests/run-golden.mjs

Repository: santifer/career-ops

Length of output: 846


🏁 Script executed:

# Check imports at the top of the file to see what's currently imported
head -20 tests/run-golden.mjs

Repository: santifer/career-ops

Length of output: 719


🏁 Script executed:

# Verify ROOT definition
rg "const ROOT\s*=" tests/run-golden.mjs -A 1

Repository: santifer/career-ops

Length of output: 154


🏁 Script executed:

# Check if there are other file path operations in run-golden.mjs that might have similar issues
rg "join\(|readFileSync|existsSync|writeFileSync" tests/run-golden.mjs -n

Repository: santifer/career-ops

Length of output: 422


Block path traversal in test fixture loader.

testCase.input from cases.json is passed directly to join(ROOT, testCase.input) without validation. An absolute path or ../ sequence can escape the repository and read arbitrary files. Resolve the path and reject escapes before calling existsSync:

Path guard
-import { join } from 'path';
+import { join, resolve, relative, isAbsolute } from 'path';
@@
   if (testCase.inputType === 'file') {
-    const inputPath = join(ROOT, testCase.input);
+    const inputPath = resolve(ROOT, testCase.input);
+    const relativeInputPath = relative(ROOT, inputPath);
+    if (relativeInputPath.startsWith('..') || isAbsolute(relativeInputPath)) {
+      checks.push(`chemin hors dépôt : ${testCase.input}`);
+      continue;
+    }
     if (!existsSync(inputPath)) {
       checks.push(`fichier introuvable : ${testCase.input}`);
     }
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-  if (testCase.inputType === 'file') {
-    const inputPath = join(ROOT, testCase.input);
-    if (!existsSync(inputPath)) {
-      checks.push(`fichier introuvable : ${testCase.input}`);
+  if (testCase.inputType === 'file') {
+    const inputPath = resolve(ROOT, testCase.input);
+    const relativeInputPath = relative(ROOT, inputPath);
+    if (relativeInputPath.startsWith('..') || isAbsolute(relativeInputPath)) {
+      checks.push(`chemin hors dépôt : ${testCase.input}`);
+      continue;
+    }
+    if (!existsSync(inputPath)) {
+      checks.push(`fichier introuvable : ${testCase.input}`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/run-golden.mjs` around lines 49 - 52, The test fixture loader currently
joins ROOT and testCase.input directly (see testCase.inputType, inputPath,
existsSync), allowing absolute paths or “../” to escape the repo; fix by
resolving and validating the resulting path before checking existence: compute a
resolved path using path.resolve(ROOT, testCase.input) (or equivalent), ensure
the resolved path is inside ROOT (e.g., resolvedPath === ROOT or
resolvedPath.startsWith(ROOT + path.sep)), and if the check fails push the same
“fichier introuvable : …” error (or a guard error) instead of calling existsSync
on an escaped path; only then call existsSync on the validated resolved path.

@santifer
Owner

@atoox-git, thanks for putting this together — completing modes/fr/ with the missing modes (contact, deep, entretien, formation, patterns, projet, relance, scan) plus adding 6 French portal adapters (APEC, France Travail, HelloWork, Jobijoba, RegionsJob, Welcome to the Jungle), legal pages, and localized examples is real work with real value. The French market is on our roadmap.

I can't merge this as-is though:

  • The PR is titled "PR" with an empty description and unchecked template — I have no way to review 40+ files of intent
  • It modifies CLAUDE.md and package.json without discussion — those need issues first per CONTRIBUTING.md
  • modes/fr/lettre-motivation.md maps to a feature (cover letter generation) that's still an open proposal in issue #199 — we shouldn't land localized copies before the English version is decided

Could you split this into focused PRs so the work actually lands? For example:

  1. One PR per portal adapter (France Travail first — the API is solid)
  2. One PR for the missing modes/fr/ translations (exclude lettre-motivation.md until #199, "Proposal: Add cover letter generation workflow", is resolved)
  3. One PR for the French legal pages (PRIVACY, NOTICE, MENTIONS-LEGALES)

Each one small, linked to an issue, with a clear description. That way your work lands instead of staying blocked.

@santifer santifer closed this Apr 22, 2026