@tobias-hotz
Contributor

@tobias-hotz tobias-hotz commented Nov 14, 2025

Currently, the CKAN harvester puts the HTML into the abstract of the ISO metadata unchanged.
With this patch, the `html2text` function is used to clean up the HTML first. Additionally, the function was improved to replace more HTML elements (such as links and paragraphs) with plain-text alternatives.
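
For illustration only, here is a minimal sketch of what such a cleanup could look like. It is not the code from this patch: the class name `Html2TextSketch`, the regular expressions, and the choice to render a link as `label (URL)` are all assumptions, and only the name `html2text` comes from the description above.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch, not the harvester's actual implementation. */
public class Html2TextSketch {
    // Assumed rendering: <a href="URL">label</a> becomes "label (URL)".
    private static final Pattern LINK = Pattern.compile(
        "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']*)[\"'][^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // A closing paragraph tag becomes a blank line, a line break a newline.
    private static final Pattern PARAGRAPH_END =
        Pattern.compile("</p\\s*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern LINE_BREAK =
        Pattern.compile("<br\\s*/?>", Pattern.CASE_INSENSITIVE);
    // Any tag that is still present afterwards is dropped.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    public static String html2text(String html) {
        if (html == null) {
            return "";
        }
        String text = LINK.matcher(html).replaceAll("$2 ($1)");
        text = PARAGRAPH_END.matcher(text).replaceAll("\n\n");
        text = LINE_BREAK.matcher(text).replaceAll("\n");
        text = TAG.matcher(text).replaceAll("");
        // Decode a few common entities; "&amp;" goes last to avoid double decoding.
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
        // Collapse runs of spaces and tabs, and cap consecutive newlines at two.
        text = text.replaceAll("[ \\t]+", " ")
                   .replaceAll(" ?\\n ?", "\n")
                   .replaceAll("\\n{3,}", "\n\n");
        return text.trim();
    }

    public static void main(String[] args) {
        System.out.println(html2text(
            "<p>See <a href=\"https://example.org/data\">the data</a>.</p><p>Second paragraph</p>"));
    }
}
```

A regex-based approach like this is only adequate for simple, well-formed fragments; an HTML parser would be the more robust choice for arbitrary input.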

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chunks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

@PascalLike PascalLike self-requested a review November 19, 2025 16:13