Graph Sync Template

Badges: Template Smoke | Python 3.12 | Copier 9 | uv | wordlift-sdk 8.0.16+

Copier template for bootstrapping worai graph sync projects with the current WordLift SDK v8 cloud-flow contract.

Why This Template

Use this repository when you need a new graph sync project with the WordLift runtime contract, GitHub Actions workflow, profile scaffolding, and local examples already aligned.

This template gives you:

Quick Start

Generate from the local checkout:

copier copy . ../my-graph-project

Generate from GitHub:

copier copy gh:wordlift/graph-sync-template my-graph-project

For offline or automation-friendly generation, skip API-key validation explicitly:

copier copy --data validate_api_key=false gh:wordlift/graph-sync-template my-graph-project

Template Contract

Required inputs

  • api_key
  • source_type: one of urls, sitemap, or google_sheets

Source-specific inputs (by source_type)

  • urls: urls
  • sitemap: sitemap_url, optional sitemap_url_pattern
  • google_sheets: sheets_url, sheets_name, sheets_service_account
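
The input contract above can be sketched as a small pre-flight check. This is an illustration only: the helper name missing_inputs and the idea of passing the answers as a plain dict are assumptions, not part of the template.

```python
# Hypothetical helper: maps each source_type to the inputs listed above.
REQUIRED_BY_SOURCE = {
    "urls": ["urls"],
    "sitemap": ["sitemap_url"],  # sitemap_url_pattern is optional
    "google_sheets": ["sheets_url", "sheets_name", "sheets_service_account"],
}


def missing_inputs(answers: dict) -> list[str]:
    """Return the names of required inputs that are absent or empty."""
    source_type = answers.get("source_type")
    if source_type not in REQUIRED_BY_SOURCE:
        raise ValueError(f"unknown source_type: {source_type!r}")
    required = ["api_key", *REQUIRED_BY_SOURCE[source_type]]
    return [name for name in required if not answers.get(name)]
```

Calling missing_inputs with a sitemap source and no sitemap_url would report that one name, which makes it easy to fail fast before running copier copy in automation.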

Runtime defaults baked into the template

  • overwrite = true
  • materialization_backend = "worph"
  • canonical_id_strategy = "dependency_graph"
  • concurrency = 4
  • ingest_loader = "web_scrape_api"
  • ingest_timeout_ms = 120000
  • google_search_console = false
  • profiles = ["default"]
  • default_profile = "default"
  • validate_api_key = true

What Generation Does

During copier copy, the template:

  • validates the WordLift API key against /accounts/me by default
  • derives the runtime package name from the returned dataset_uri
  • renames the local runtime package from acme_kg to <dataset>_graph_sync
  • writes secrets to a local .env instead of tracked config
  • sets [project].name in the generated pyproject.toml from the destination directory name
  • scaffolds profiles/<profile>/mappings, templates, and postprocessors
  • removes .copier-answers.yml and excludes copier.yml from generated output

If validation is skipped or the API is unreachable, the fallback package name is acme_graph_sync.

Generated Project Shape

Generated projects include:

  • worai.toml
  • .github/workflows/graph-sync.yml
  • .env
  • profiles/<profile>/mappings
  • profiles/<profile>/templates
  • profiles/<profile>/postprocessors

Generated projects do not include template-maintenance assets such as:

Runtime Compatibility

The template is aligned to the SDK v8 cloud-flow contract:

  • wordlift-sdk>=8.0.16,<9.0.0
  • lxml>=5.4.0 (required by the bundled YouTube postprocessor example)
  • explicit ingest_source
  • explicit ingest_loader
  • explicit ingest_timeout_ms
  • no legacy web_page_import_mode or web_page_import_timeout fallback keys
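
A configuration can be checked against the key rules above with a few lines of Python. This is a sketch under assumptions: it treats the configuration as a flat dict, and contract_problems is a hypothetical helper rather than an SDK function.

```python
LEGACY_KEYS = ("web_page_import_mode", "web_page_import_timeout")
EXPLICIT_KEYS = ("ingest_source", "ingest_loader", "ingest_timeout_ms")


def contract_problems(config: dict) -> list[str]:
    """List violations: legacy keys present, then explicit keys missing."""
    problems = [f"legacy key present: {k}" for k in LEGACY_KEYS if k in config]
    problems += [f"missing explicit key: {k}" for k in EXPLICIT_KEYS if k not in config]
    return problems
```

An empty list means the configuration satisfies the key rules; version pins such as wordlift-sdk>=8.0.16,<9.0.0 are outside the scope of this check.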

Static Template Rules

Static entity templates in generated projects must follow these constraints:

  • one static template file defines exactly one subject node
  • no blank nodes
  • explicit IRIs only
  • schema:url and schema:sameAs must be URL literals
  • filenames use depth prefixes such as 10_, 20_, 30_
  • exported root IRIs in exports.toml(.j2) remain stable and unhashed

Default scaffold examples:

  • profiles/default/templates/20_organization.ttl.j2
  • profiles/default/templates/20_website.ttl.j2
  • profiles/default/templates/40_organization_postal_address.ttl.j2
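
The filename rule can be checked mechanically. The sketch below assumes a depth prefix is exactly two digits followed by an underscore, as in the examples above, and that sorting by depth is a useful way to read the prefix; both are assumptions, and depth_of is a hypothetical helper.

```python
import re

# Assumption: a depth prefix is two digits followed by an underscore.
PREFIX = re.compile(r"^(\d{2})_(.+)$")


def depth_of(filename: str) -> int:
    """Return the numeric depth prefix, or raise if the name lacks one."""
    match = PREFIX.match(filename)
    if match is None:
        raise ValueError(f"missing depth prefix: {filename}")
    return int(match.group(1))


templates = [
    "40_organization_postal_address.ttl.j2",
    "20_website.ttl.j2",
    "20_organization.ttl.j2",
]
# Sort by depth first, then by name so the order is deterministic.
ordered = sorted(templates, key=lambda name: (depth_of(name), name))
print(ordered)
```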

Development

This repository uses:

  • Python 3.12
  • uv for dependency management
  • pytest for verification

Install dependencies:

uv sync --dev

Run the template-maintenance test suite:

uv run pytest -q

Run the render smoke check:

uv run scripts/smoke_render_template.sh

Maintainer Macros

  • deploy release [major|minor|patch] (default: patch)
    • scripts/deploy_release.sh [major|minor|patch]
    • bumps the version, refreshes dependencies and the lockfile, requires docs/spec updates, then commits, tags, and pushes (including tags)
  • upgrade project
    • scripts/upgrade_project.sh
    • updates wordlift-sdk to the latest release, updates .github/workflows/graph-sync.yml to the latest wordlift/graph-sync tag, then runs deploy release patch
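
For readers unfamiliar with the major|minor|patch argument, the bump can be illustrated as below. This assumes plain MAJOR.MINOR.PATCH semantic versioning; it does not reproduce what scripts/deploy_release.sh actually does, and bump is a hypothetical function.

```python
def bump(version: str, part: str = "patch") -> str:
    """Bump a MAJOR.MINOR.PATCH version string; lower parts reset to zero."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("1.4.2"))           # 1.4.3
print(bump("1.4.2", "minor"))  # 1.5.0
print(bump("1.4.2", "major"))  # 2.0.0
```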

Repository Map

CI

The template-maintenance workflow lives in .github/workflows/template-smoke.yml. It:

  • installs dependencies with uv
  • runs uv run pytest -q
  • runs uv run scripts/smoke_render_template.sh

Generated projects receive .github/workflows/graph-sync.yml, which exposes profile-based manual dispatch and reusable workflow inputs.
