JuanPabloDiaz
Collaborator

#110

This pull request introduces a new feature for scraping conference details from a provided URL. The changes include adding a new endpoint, implementing the scraping logic, and updating the routes to support the feature.

API Enhancements:

  • Added a new scrape action in Api::ConferencesController to handle requests for scraping conference details from a URL. The action validates the presence of the URL, calls the scraper service, and returns the result or an error message. (app/controllers/api/conferences_controller.rb, R2-R20)
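
The control flow described for the scrape action can be sketched without Rails so it runs on its own. This is an illustration, not the code in the PR: `scrape_action` is a hypothetical name, the status codes and error messages are assumptions, and the scraper is any object that responds to `call(url)` and returns a hash (with an `:error` key on failure).

```ruby
# Framework-free sketch of the action's flow: validate, call the scraper,
# return the result or an error. Returns [status, body] instead of rendering.
def scrape_action(params, scraper:)
  url = params["url"].to_s.strip
  # Assumed status: 400 when the URL is missing or blank.
  return [400, { error: "url is required" }] if url.empty?

  result = scraper.call(url)
  # Assumed status: 422 when the scraper reports a failure.
  return [422, { error: result[:error] }] if result[:error]

  [200, result]
rescue StandardError => e
  # Anything unexpected becomes a generic server error.
  [500, { error: e.message }]
end
```

In a controller, each returned pair would presumably map to `render json: body, status: status`.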

Service Implementation:

  • Introduced a ScraperService in the ConferenceScraper module that uses Nokogiri and open-uri to scrape conference details (title, date, location) from the provided URL. The service includes basic error handling and customizable selectors for extracting data. (app/services/conference/scraper_service.rb, R1-R24)

Route Updates:

  • Added a new POST /conferences/scrape route to the conferences resource for invoking the scrape functionality. (config/routes.rb, R22)
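
For reference, a collection route of this shape is commonly declared as below. This is a sketch only: the `api` namespace is inferred from the controller name, and any options on the existing `resources :conferences` line are not shown in this description.

```ruby
# config/routes.rb (sketch; surrounding routes omitted)
Rails.application.routes.draw do
  namespace :api do
    resources :conferences do
      # POST /api/conferences/scrape -> Api::ConferencesController#scrape
      post :scrape, on: :collection
    end
  end
end
```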

Collaborator Author

JuanPabloDiaz commented Jul 15, 2025

Status update: the endpoint is working as expected. Example request:
curl -X POST http://localhost:3000/api/conferences/scrape -H 'Content-Type: application/json' -d '{"url": "https://www.rubyconf.org/"}'

Response:

{
  "title": "RubyConf 2024 – Chicago, IL – November 13–15, 2024: Sign-up to learn when more information about RubyConf is available!",
  "date": null,
  "location": "Location\n    Hilton Chicago Downtown\n    Hilton Chicago boasts a convenient central location on South Michigan Avenue, overlooking Grant Park and Lake Michigan. With many sights and attractions within a short walk from the hotel, it’s the perfect base for your stay in Chicago.\n    Book your room at the Hilton Chicago at the conference rate $219.00 per night (+tax and fees) by 11:59 PM CST on October 18th, 2024.\n    \n      \n  Book Your Room"
}

Member

nimzco commented Jul 16, 2025

Given the example you're giving, wouldn't you want something more parsed and robust?
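
One possible reading of "more parsed", sketched against the example response above: pull a structured date range out of the title and reduce the location blob to its first meaningful line. The helper names `extract_date_range` and `first_meaningful_line` are hypothetical, and the "Month D–D, YYYY" pattern is an assumption based on that single example; other sites will format dates differently.

```ruby
require "date"

# Matches ranges like "November 13–15, 2024" (en dash or hyphen).
DATE_RANGE = /([A-Z][a-z]+)\s+(\d{1,2})\s*[–\-]\s*(\d{1,2}),\s*(\d{4})/

# Returns ISO 8601 start and end dates, or nil if no valid range is found.
def extract_date_range(text)
  match = DATE_RANGE.match(text.to_s)
  return nil unless match

  month, first_day, last_day, year = match.captures
  start_date = Date.strptime("#{month} #{first_day} #{year}", "%B %d %Y")
  end_date = Date.strptime("#{month} #{last_day} #{year}", "%B %d %Y")
  { start_date: start_date.iso8601, end_date: end_date.iso8601 }
rescue ArgumentError
  # The first capture was not a month name, or the day was out of range.
  nil
end

# Returns the first non-empty line that is not a bare heading such as "Location".
def first_meaningful_line(text, skip: ["Location"])
  text.to_s.lines.map(&:strip).reject { |line| line.empty? || skip.include?(line) }.first
end

title = "RubyConf 2024 – Chicago, IL – November 13–15, 2024: Sign-up to learn more"
p extract_date_range(title)
# prints {start_date: "2024-11-13", end_date: "2024-11-15"} (hash inspect format varies by Ruby version)
p first_meaningful_line("Location\n    Hilton Chicago Downtown\n    Hilton Chicago boasts a convenient central location")
# prints "Hilton Chicago Downtown"
```

Regex extraction like this is brittle across sites; it only shows the shape a normalized response could take, with `date` no longer null for the RubyConf example.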
