PINS-ML: Exploring PINS Appeals Data with SPARQL → pandas

This repo is a small, reproducible workspace for discovering and profiling Planning Inspectorate (PINS) appeals data published via the Open Data Communities (ODC) SPARQL endpoint (<https://opendatacommunities.org/sparql>), and shaping it into notebook-friendly pandas tables.

What this project does

  • Connects to ODC’s SPARQL endpoint and runs a set of reusable queries against the PINS graph.
  • Builds a compact data profile of the graph:
    • total rows (triples),
    • unique triples,
    • unique appeals (defined pragmatically as “subjects having …/pins-appeals/CaseRef”),
    • a candidate unique ID predicate (key-like; typically CaseRef),
    • date ranges across the whole graph and for domain dates (e.g., DeciDate),
    • sample triple fetches (paged).
  • Exposes the results as pandas DataFrames and saves key summaries under data/ for reuse.
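The SPARQL-to-pandas step can be sketched roughly as follows. This is an illustrative helper, not the repo's actual code (the function name `bindings_to_df` and the sample payload are invented here); it assumes the endpoint returns the standard SPARQL 1.1 JSON results format:

```python
import pandas as pd

def bindings_to_df(results: dict) -> pd.DataFrame:
    """Flatten a SPARQL 1.1 JSON results payload into a DataFrame.

    Each binding maps a variable to {"type": ..., "value": ...};
    we keep only the plain values, one column per variable.
    """
    cols = results["head"]["vars"]
    rows = [
        {var: b[var]["value"] for var in cols if var in b}
        for b in results["results"]["bindings"]
    ]
    return pd.DataFrame(rows, columns=cols)

# Minimal illustrative payload (shape per the SPARQL 1.1 JSON results spec)
sample = {
    "head": {"vars": ["p", "count"]},
    "results": {"bindings": [
        {"p": {"type": "uri", "value": "http://example.org/CaseRef"},
         "count": {"type": "literal", "value": "42"}},
    ]},
}
df = bindings_to_df(sample)
```

Note that all values come back as strings; numeric columns such as counts still need an explicit `pd.to_numeric` cast downstream.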

Repository structure

```
├── notebooks/
│   ├── PINS-ML_SPARQL_to_Pandas.ipynb   # main analysis & queries
│   └── fallback_investigation.ipynb     # connectivity diagnostics & fixes
├── data/
│   ├── pins_predicate_counts.csv        # export: predicate → count (+ pretty columns)
│   └── pins_predicate_counts.parquet    # same table as parquet
└── README.md
```

Why these folders exist

  • notebooks/ — the working analysis. Everything runs through a single function run_sparql(query, endpoint=ENDPOINT, ...) -> pd.DataFrame and the predefined PREFIXES and GRAPH_IRI. No hidden helpers are required to reproduce the results.
  • data/ — lightweight derived artifacts (e.g., predicate counts) generated by the notebooks so you can view outputs between runs, share small tables, and avoid re-querying when not needed.
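Persisting a derived table to `data/` might look like the snippet below (the `pred_counts` contents are illustrative placeholders; only the filenames match the repo):

```python
from pathlib import Path
import pandas as pd

# Illustrative predicate-count table; the notebook builds the real one via SPARQL.
pred_counts = pd.DataFrame({
    "URI": ["http://opendatacommunities.org/def/pins-appeals/CaseRef"],
    "count": [1234],
})

Path("data").mkdir(exist_ok=True)
pred_counts.to_csv("data/pins_predicate_counts.csv", index=False)
try:
    pred_counts.to_parquet("data/pins_predicate_counts.parquet", index=False)
except ImportError:
    pass  # parquet export needs pyarrow or fastparquet installed
```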

Notebooks

1) PINS-ML_SPARQL_to_Pandas.ipynb (core analysis)

End-to-end, notebook-ready queries that all go through the shared run_sparql() helper:

  • Graph profiling
    • Total rows (triples)
    • Unique triples (distinct ?s ?p ?o)
    • Unique appeals via CaseRef
  • Predicate discovery
    • Predicate frequency table (?p, count)
    • Pretty columns: URI, Namespace, Predicate (derived from URI tail)
  • Key candidate (unique ID)
    • Finds predicates that are single-valued per subject and globally unique across appeals
  • Date ranges
    • Whole-graph min/max for any xsd:date or xsd:dateTime
    • Domain date ranges (e.g., …/pins-appeals/DeciDate)
    • Per-predicate date coverage (how widely each date appears across appeals)
  • Triple sampling
    • Fetch N triples (paged, stable ordering)
    • Fetch all triples for one appeal (deterministic pick or specific CaseRef)
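Two of the steps above are easy to sketch. The predicate-frequency query groups triples by predicate, and the "pretty columns" step splits each predicate URI at its last `/` or `#`. Both snippets below are illustrative: the graph IRI is a placeholder (the notebook's `GRAPH_IRI` holds the real one), and the helper name `split_uri` is invented here:

```python
GRAPH_IRI = "http://example.org/pins-graph"  # placeholder; the notebook defines the real IRI

PREDICATE_COUNTS = f"""
SELECT ?p (COUNT(*) AS ?count)
WHERE {{ GRAPH <{GRAPH_IRI}> {{ ?s ?p ?o }} }}
GROUP BY ?p
ORDER BY DESC(?count)
"""

def split_uri(uri: str) -> tuple[str, str]:
    """Split a URI into (Namespace, Predicate) at its last '#' or '/'."""
    for sep in ("#", "/"):
        if sep in uri:
            head, _, tail = uri.rpartition(sep)
            return head + sep, tail
    return "", uri

# e.g. a PINS-style predicate URI
ns, pred = split_uri("http://opendatacommunities.org/def/pins-appeals/CaseRef")
```

Applied column-wise (`df["URI"].map(split_uri)`), this yields the Namespace/Predicate pretty columns.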

2) fallback_investigation.ipynb (diagnostics)

  • SPARQLWrapper vs requests fallback behaviour
  • TLS/CA bundle fix (why requests worked while SPARQLWrapper failed, and how we aligned them)
  • Lightweight probes (GET vs POST) to confirm transport and headers
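A lightweight GET-vs-POST probe can be built along these lines. The requests are prepared but not sent, so the sketch works offline; the `Accept` header value is one reasonable choice for JSON results, not necessarily the notebook's exact configuration:

```python
import requests

ENDPOINT = "https://opendatacommunities.org/sparql"
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
HEADERS = {"Accept": "application/sparql-results+json"}

# Per the SPARQL 1.1 Protocol: GET puts the query in the URL,
# POST sends it as a form-encoded body.
get_req = requests.Request(
    "GET", ENDPOINT, params={"query": QUERY}, headers=HEADERS,
).prepare()

post_req = requests.Request(
    "POST", ENDPOINT, data={"query": QUERY}, headers=HEADERS,
).prepare()

# Actually sending is then e.g. requests.Session().send(get_req, timeout=60)
```

Comparing the two transports (and their TLS verification settings) is what surfaced the CA-bundle mismatch between SPARQLWrapper and requests.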

Notebook constants (already defined in the notebooks)

  • ENDPOINT – the ODC SPARQL endpoint
  • GRAPH_IRI – the PINS graph IRI
  • PREFIXES – the exact prefix set used throughout
  • run_sparql(query, endpoint=ENDPOINT, timeout=60, verbose=True) -> pd.DataFrame – always returns a DataFrame; tries SPARQLWrapper first, then falls back to requests
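The fallback behaviour can be sketched with stub transports (names below are illustrative; in the notebook the two transports are a SPARQLWrapper call and a plain requests call):

```python
import pandas as pd

def run_with_fallback(query, primary, fallback, verbose=True) -> pd.DataFrame:
    """Try `primary`; on any failure fall back to `fallback`.

    Always returns a DataFrame — empty if both transports fail.
    """
    for name, transport in (("primary", primary), ("fallback", fallback)):
        try:
            return transport(query)
        except Exception as exc:
            if verbose:
                print(f"{name} transport failed: {exc}")
    return pd.DataFrame()

# Demo with stubs: the primary raises (as SPARQLWrapper did on the
# TLS/CA issue), the fallback answers.
def broken(_q):
    raise ConnectionError("TLS verify failed")

def working(_q):
    return pd.DataFrame({"s": ["a"], "p": ["b"], "o": ["c"]})

df = run_with_fallback("SELECT ...", broken, working)
```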

Notes & acknowledgements

  • Data originates from Open Data Communities (ODC) under their published terms.

  • This repo contains queries and lightweight derived tables, not bulk data dumps.

  • Thanks to the ODC team for making PINS data available via SPARQL.

This is the end of phase 1.

About

Planning Inspectorate Data Playbook
