
Refactor FedEx, Goodyear, HCL Technologies, and Inetum scrapers #685

Merged
lalalaurentiu merged 1 commit into peviitor-ro:main from lalalaurentiu:main
Mar 25, 2026

Conversation

@lalalaurentiu
Collaborator

This pull request refactors and improves the job scraping scripts for several companies, focusing on simplifying HTTP requests, improving city/county normalization, and enhancing the robustness and maintainability of the code. The most important changes are grouped below by theme.

Refactoring and Simplification of HTTP Requests:

  • Replaced custom Scraper class usage with direct requests library calls in hcltechnologies.py and inetum.py, streamlining HTTP requests and making the code more standard and maintainable. [1] [2]
  • Updated the fedex.py script to use a session with an initial GET request to set cookies, followed by paginated POST requests to fetch job data, cleaning up redundant code and improving reliability.
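
As an illustration of the session-then-paginate flow described for fedex.py, here is a minimal sketch. It is not the code from this PR: the function name `fetch_all_jobs`, the `offset`/`limit` payload keys, the `"jobs"` response key, and the stop condition are all assumptions made for the example. The `session` argument is expected to behave like a `requests.Session`.

```python
def fetch_all_jobs(session, landing_url, api_url, page_size=20, max_pages=100):
    """Collect jobs from a paginated POST endpoint (hypothetical API shape)."""
    # Initial GET so the session stores whatever cookies the site sets.
    session.get(landing_url, timeout=30)

    jobs = []
    for page in range(max_pages):
        # Assumed payload keys; the real endpoint may use different names.
        payload = {"offset": page * page_size, "limit": page_size}
        response = session.post(api_url, json=payload, timeout=30)
        response.raise_for_status()

        # Fall back to an empty list if the "jobs" key is missing or null.
        batch = response.json().get("jobs") or []
        jobs.extend(batch)

        # A short page is taken to mean there is nothing left to fetch.
        if len(batch) < page_size:
            break
    return jobs
```

With the real library this would be called as `fetch_all_jobs(requests.Session(), landing_url, api_url)`. Passing the session in keeps the cookie handling in one place and makes the loop easy to exercise with a stub.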

City and County Normalization Improvements:

  • Introduced a normalize_city helper in fedex.py to standardize city/county extraction, with special handling for "Judetul Cluj" and "Bucharest" cases.
  • In inetum.py, improved city and county extraction from job location text, including special handling for Bucharest and remote/hybrid job types.
  • Added explicit city and county assignment for Goodyear jobs in Bucharest.
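
The normalization described above could look roughly like the sketch below. Only the helper name `normalize_city` and the two special cases ("Judetul Cluj" and "Bucharest") come from this PR. The values they map to, the `detect_remote` helper, and the `"on-site"` default are assumptions for illustration.

```python
# Hypothetical alias table: the PR names these two special cases but does
# not state what they are mapped to.
SPECIAL_CASES = {
    "judetul cluj": ("Cluj-Napoca", "Cluj"),
    "bucharest": ("Bucuresti", "Bucuresti"),
}


def normalize_city(raw):
    """Return a (city, county) pair for a raw location string."""
    # Collapse repeated whitespace and tolerate None.
    cleaned = " ".join((raw or "").split())
    if not cleaned:
        return ("", "")
    special = SPECIAL_CASES.get(cleaned.lower())
    if special:
        return special
    # Assumed fallback: keep the city as given and leave the county empty,
    # since this sketch has no county lookup.
    return (cleaned, "")


def detect_remote(text):
    """Return the work modes mentioned in a location text (hypothetical)."""
    lowered = (text or "").lower()
    modes = [mode for mode in ("remote", "hybrid") if mode in lowered]
    # Assumed default when neither keyword appears.
    return modes or ["on-site"]
```

For example, `normalize_city("  Judetul   Cluj ")` returns `("Cluj-Napoca", "Cluj")` under the assumed mapping, and `detect_remote("Bucharest (Hybrid)")` returns `["hybrid"]`.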

Job Data Structuring and Robustness:

  • Standardized the structure of job entries across scripts, ensuring consistent inclusion of fields like city, county, and remote. [1] [2] [3] [4]
  • Improved handling of missing or malformed data, such as defaulting to empty lists or fallback values if certain fields are absent. [1] [2] [3]
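
To make the "consistent fields with fallbacks" idea concrete, here is a small sketch of a job-entry builder. The `city`, `county`, and `remote` fields are the ones named in this PR. The other keys (`job_title`, `job_link`, `company`, `country`), the raw input keys, and both function names are hypothetical.

```python
def as_list(value):
    """Coerce a possibly missing value into a list, dropping empty items."""
    if value is None or value == "":
        return []
    if isinstance(value, (list, tuple)):
        return [item for item in value if item]
    return [value]


def build_job(raw, company):
    """Build a job entry with the same keys regardless of what `raw` holds."""
    return {
        # Hypothetical keys; missing values fall back to an empty string.
        "job_title": (raw.get("title") or "").strip(),
        "job_link": raw.get("url") or "",
        "company": company,
        "country": "Romania",
        # Fields named in the PR; missing values fall back to empty lists.
        "city": as_list(raw.get("city")),
        "county": as_list(raw.get("county")),
        "remote": as_list(raw.get("remote")),
    }
```

Because every entry carries the same keys, downstream code can read `job["city"]` or `job["remote"]` without checking whether a particular scraper happened to set them.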

Code Cleanup:

  • Removed unused imports, redundant variables, and legacy code, resulting in cleaner and more readable scripts. [1] [2] [3]

Pagination and Performance Enhancements:

  • Improved pagination logic for fetching job listings in fedex.py and hcltechnologies.py, ensuring all available jobs are retrieved efficiently. [1] [2]

…prove city and county handling, enhance API requests, and streamline job data extraction
@lalalaurentiu lalalaurentiu merged commit fe2d835 into peviitor-ro:main Mar 25, 2026
1 check passed