
Methodology

Last reviewed: April 2026

Trust, transparency, and data quality sit at the centre of how we build. This page sets out where the data comes from, how we process it, what we standardise, and where the limits are.

We can't document every technical detail — some of it is how the product works, and publishing the full recipe wouldn't serve users or us. What we can do is describe the steps we go through, the checks we run, and the places we know the data is thinner or harder. If you want to know how something on the platform works and it isn't covered here, tell us.

1. How we extract arguments from decision letters

Arguments, policy citations, and pieces of inspector reasoning shown on the platform are linked back to the passage in the original decision letter they came from. "See source" opens that passage.

Getting extraction right has been a significant investment on the platform. A planning appeal decision letter is a long-form PDF written in specialist language. Inspectors summarise arguments from multiple parties, cite NPPF paragraphs in shorthand, reference local plan policies by code, and arrive at reasoning that can run across several paragraphs. Turning that reliably into structured, searchable data is not a solved problem in the wider industry. We've treated it as the core of the product.

How the pipeline works. Extraction is automated. We tested a range of language models, open and commercial, and ran benchmarks on known-difficult cases to choose the most accurate model for each extraction task: argument identification, policy attribution, decision metadata. Where one model was stronger on one task and weaker on another, we routed accordingly. We re-benchmark when new models are released and when prompts change.
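
To make the shape of this concrete, here is a minimal sketch of per-task routing. The task labels, model identifiers, and prompt versions are illustrative placeholders, not our production configuration.

```python
# Illustrative sketch of per-task model routing. The task names, model
# identifiers, and prompt versions are placeholders, not real configuration.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Route:
    model: str           # model chosen for this task after benchmarking
    prompt_version: str  # prompts are versioned so re-benchmarking is reproducible

# One route per extraction task; updated when re-benchmarking shows a better fit.
ROUTES: dict[str, Route] = {
    "argument_identification": Route(model="model-a", prompt_version="v12"),
    "policy_attribution":      Route(model="model-b", prompt_version="v7"),
    "decision_metadata":       Route(model="model-a", prompt_version="v3"),
}

def run_extraction(task: str, letter_text: str,
                   call_model: Callable[[str, str, str], str]) -> str:
    """Dispatch an extraction task to the model that benchmarked best for it."""
    route = ROUTES[task]
    return call_model(route.model, route.prompt_version, letter_text)
```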

Guardrails at each stage. LLM extraction can produce plausible-sounding text that isn't in the source — a failure mode the industry calls hallucination, and one we've built the pipeline around reducing. Extractions are graded for confidence, and checked against the source text. Low-confidence results don't reach the platform. Known-harder cases are flagged for review. If extraction quality drops below our internal thresholds during a batch run, the batch halts rather than proceeding. Where applicable, the raw model output is retained alongside the validated extraction, so we can audit what the model produced and what we accepted. We'd rather drop an extraction than show something we can't stand behind.
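
The sketch below illustrates the kind of source-grounding check involved: an extraction is accepted only if its quoted passage actually appears in the letter and its confidence clears a threshold, and processing halts if too many extractions fail. The thresholds and field names are assumptions for illustration, not our internal values, and the halt check is shown per letter for brevity.

```python
# Illustrative guardrail sketch: verify an extracted passage is present in the
# decision letter and drop low-confidence results. Thresholds and field names
# are assumptions, not the platform's internal values.

from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8    # assumed value
MAX_FAILURE_RATE = 0.1        # assumed value: halt above this

@dataclass
class Extraction:
    passage: str      # text the model says it quoted from the letter
    confidence: float # model-graded confidence for this extraction

def normalise(text: str) -> str:
    return " ".join(text.split()).lower()

def is_grounded(extraction: Extraction, letter_text: str) -> bool:
    """Accept only if confident and the passage is verbatim in the source."""
    return (
        extraction.confidence >= CONFIDENCE_THRESHOLD
        and normalise(extraction.passage) in normalise(letter_text)
    )

def validate(extractions: list[Extraction], letter_text: str) -> list[Extraction]:
    accepted = [e for e in extractions if is_grounded(e, letter_text)]
    failure_rate = 1 - len(accepted) / max(len(extractions), 1)
    if failure_rate > MAX_FAILURE_RATE:
        # Halt rather than publish output we can't stand behind.
        raise RuntimeError("extraction quality below threshold; run halted")
    return accepted
```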

Manual spot-checking. We spot-check by hand. On the hundreds of excerpts we've tested this way (copy the excerpt, search the PDF), the passage appears in the decision letter. We can't check every extraction across 216,000+ appeals, which is why "See source" is there. The decision letter is always the authority; check any passage you intend to rely on before using it in professional work.

Where it's still hard. Decision letters have categories of passage we know the pipeline finds difficult, and we actively track them. Nested attribution is the most common: an inspector writing "the Council considers that the proposal would harm the character of the area" is reporting the Council's view, not the inspector's own finding, and a naive read can miss the attribution. Rejected claims that read like affirmative ones are another: an inspector writing "while the appellant contends the development would be sustainable, I find no persuasive evidence in support" is dismissing the contention, not endorsing it. Long specialist passages (heritage assessments, Green Belt proportionality tests, sustainability balancing exercises) can resist clean extraction because the inspector's conclusion depends on the whole passage rather than a quotable sentence. For these categories, we use specific prompts and validation checks and monitor performance on them as models and prompts change. When confidence is low, the extraction doesn't make it onto the platform; we'd rather show nothing than show something wrong.
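
As an illustration of the kind of pattern involved, the sketch below flags passages containing attribution or rejection cues for closer checking. The phrase lists are examples only; the production pipeline relies on task-specific prompts and validation checks rather than this simple heuristic.

```python
# Illustrative heuristic only: flag passages whose findings may be attributed
# to a party rather than the inspector, or where a claim is recited and then
# rejected. The phrase lists are examples, not the platform's actual checks.

import re

ATTRIBUTION_CUES = [
    r"\bthe council considers\b",
    r"\bthe appellant contends\b",
    r"\bthe appellant argues\b",
]
REJECTION_CUES = [
    r"\bi find no persuasive evidence\b",
    r"\bi am not persuaded\b",
    r"\bi do not accept\b",
]

def needs_review(passage: str) -> bool:
    """Return True if the passage shows cues that make a naive read risky."""
    text = passage.lower()
    attributed = any(re.search(p, text) for p in ATTRIBUTION_CUES)
    rejected = any(re.search(p, text) for p in REJECTION_CUES)
    return attributed or rejected
```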

"Key" arguments. The platform flags arguments the inspector treated as decisive. An updated methodology for this signal arrives by end of April 2026. Until then, "Key" badges use the earlier approach.

2. Sources and coverage

The platform covers planning appeals decided across the four UK jurisdictions. Each has its own appellate body, its own publication practices, and its own data characteristics.

Each body publishes some fields directly: case references, decision dates, outcomes, local authority. The richer fields (arguments, policy citations, inspector names, reasoning) are extracted from the decision letter by the pipeline described in Section 1.

England, Planning Inspectorate (PINS). The largest source by volume and the richest in content. We ingest from both of the publication systems PINS operates. Most of the platform's extracted data comes from England, and insights features (allow rates, policy effectiveness, inspector statistics) are currently England-only.

Wales, Planning and Environment Decisions Wales (PEDW). Welsh appeals were handled by PINS until devolution in 2021; PEDW has run them since. We submitted Freedom of Information requests to PEDW and PINS in Q1 2026 for pre-devolution Welsh casework; both were refused. PEDW coverage therefore starts at the devolution handover. Outcome data has known gaps: a share of decisions are published without a structured outcome recorded at source. We don't infer outcomes we can't confirm.

Scotland, Planning and Environmental Appeals Division (DPEA). DPEA handles major Scottish planning appeals. Householder appeals moved to Local Review Bodies at individual councils and are no longer part of DPEA's caseload; we hold historic Scottish householder appeals from before that change but not current ones. Case-type and procedure taxonomies differ from England's; we standardise for cross-jurisdiction search while preserving the original values.

Northern Ireland, Planning Appeals Commission (PAC NI). The smallest volume and the thinnest metadata. Procedure type is not consistently published, which is why procedure filters are hidden for Northern Ireland cases. Decision letters and outcomes are covered where available.

Errors and original records. Source data from all four bodies contains errors: typos, miscoded fields, malformed postcodes. We preserve the original record as published; the appellate body's published decision is the legal fact and we don't rewrite it. Occasionally we enrich a record where an error would otherwise hide a case from users: for example, a decision published as dismissed that the letter itself confirms was allowed. Where we do this, the case returns in searches for the corrected outcome but still carries the original label on the record, so you can see both.

What "covered" means. A case is on the platform when the decision letter has been published by the appellate body and has been ingested by the platform. Publication lag varies by jurisdiction and is covered under Updates and freshness. Appeals withdrawn before decision, pre-decision correspondence, and internal working papers are not published and are not on the platform.

3. Data standardisation

Different jurisdictions describe the same thing in different ways. England records a "Householder" appeal; Scotland's equivalent sits under a different taxonomy. Outcomes are labelled differently: "Allowed" in one system, a longer phrase in another. If search matched these raw values literally, a cross-jurisdiction query would return fewer results than it should.

We standardise three fields for cross-jurisdiction search: outcome, case type, and procedure. For each, we maintain a canonical vocabulary and a mapping from each source's raw values to it. Filters on the platform use the canonical values. The original value from the source is preserved on every record and visible in the case detail, so you can see what the appellate body actually published.
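
A minimal sketch of what this mapping looks like for outcomes. The raw values and the handling shown are illustrative examples, not the full mapping.

```python
# Illustrative sketch of outcome standardisation. The raw source values below
# are examples; the canonical vocabulary matches the outcome filters on the
# platform, but the mapping shown is not the full production table.

OUTCOME_MAP = {
    "allowed": "Allowed",
    "appeal allowed": "Allowed",
    "dismissed": "Dismissed",
    "appeal dismissed": "Dismissed",
    "allowed in part": "Split Decision",
}

def standardise_outcome(raw_value: str) -> dict:
    """Map a source outcome to the canonical vocabulary, keeping the original."""
    canonical = OUTCOME_MAP.get(raw_value.strip().lower())
    if canonical is None:
        # Unmapped values are queued for review rather than silently dropped.
        return {"raw": raw_value, "canonical": None, "needs_review": True}
    return {"raw": raw_value, "canonical": canonical, "needs_review": False}
```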

Inspector names are standardised. Inspector names are extracted from the decision letter by the pipeline in Section 1. Because the same inspector appears under different forms across decisions (with or without titles, initials, post-nominal qualifications), we normalise each extracted name and resolve it to a canonical inspector identity. This is what makes inspector profile pages and the inspector filter work: a search for one inspector returns every decision by that inspector, regardless of how their name appeared on any given letter. The raw extracted name and the verbatim passage it came from are both preserved alongside the canonical identity. Where a name variant can't be resolved confidently, it's held for manual review rather than merged by guess.
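
A minimal sketch of the normalisation step; the title and post-nominal lists are illustrative, and the real resolution logic is more involved.

```python
# Illustrative sketch of inspector name normalisation. The title and
# post-nominal lists are simplified examples, not the platform's actual logic.

import re

TITLES = {"mr", "mrs", "ms", "dr"}
POST_NOMINALS = {"ba", "ma", "bsc", "msc", "mrtpi"}

def normalise_name(raw_name: str) -> str:
    """Strip titles, post-nominal letters, and punctuation for matching."""
    tokens = re.sub(r"[.,]", " ", raw_name.lower()).split()
    kept = [t for t in tokens if t not in TITLES and t not in POST_NOMINALS]
    return " ".join(kept)

def resolve_inspector(raw_name: str, known: dict[str, str]) -> str | None:
    """Return a canonical inspector id, or None to hold the name for review."""
    return known.get(normalise_name(raw_name))  # no confident match -> manual review
```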

Local planning authority is also standardised, using a different mechanism because LPAs change over time (councils merge, are abolished, are renamed). The successor/predecessor rollup that this enables is substantial enough to warrant its own section; see Section 4.

Unmapped values for outcome, case type, and procedure are tracked rather than silently dropped. When an appellate body introduces a new value we haven't seen before, it goes into a review queue and is added to the mapping before it affects search results. This applies to ongoing changes as well as historical gaps.

Standardisation is applied at ingestion, not at query time. Search is fast and consistent as a result, but a change to the mapping takes effect going forward; historical records aren't silently reclassified without a backfill.

Kept as published. Planning references and decision letter text are not standardised. They appear on records as the appellate body published them.

4. Local planning authority handling

Local authorities change. Councils merge, get abolished, get renamed, gain new responsibilities. A planning appeal decided in 2012 by Bournemouth Borough Council sits in the database under Bournemouth. The council itself no longer exists; its planning functions passed to Bournemouth, Christchurch and Poole Council in 2019. A planning professional researching precedents for a current BCP case expects to find that 2012 decision. Making that work takes significant effort.

Canonical authority identifiers. Every appeal record carries a reference to a canonical LPA identity. The canonical list for England is seeded from the Government's planning-data register. Wales, Scotland, and Northern Ireland use jurisdiction-specific identifiers. Raw LPA names from the source are preserved on every record alongside the canonical reference, and we record how each mapping was established.

Resolution at ingestion. When a new appeal enters the platform, the raw LPA name is looked up against our mapping table and the canonical reference is populated automatically. Unmapped names don't block ingestion; the case enters with a null reference and the raw name triggers a drift alert for operator review. New LPAs, renames, and ceased authorities are picked up this way.
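
In outline, the lookup works like the sketch below; the field names and alert mechanism are illustrative.

```python
# Illustrative sketch of LPA resolution at ingestion. Field names and the
# alert mechanism are assumptions for illustration.

from typing import Callable

def resolve_lpa(raw_lpa_name: str, lpa_mapping: dict[str, str],
                alert: Callable[[str], None]) -> str | None:
    """Look up the canonical LPA reference; unmapped names don't block ingestion."""
    canonical_ref = lpa_mapping.get(raw_lpa_name.strip().lower())
    if canonical_ref is None:
        # The case enters with a null reference; the raw name raises a drift alert.
        alert(f"unmapped LPA name: {raw_lpa_name!r}")
    return canonical_ref
```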

Predecessor and successor rollup (England). The canonical list tracks which authorities succeeded which. A current authority's profile page and statistics aggregate cases across its predecessor councils: North Yorkshire's allow rate, for example, includes cases decided by the six districts it replaced in 2023. The aggregation happens at query time, not by rewriting records. Historical cases keep their original LPA name and predecessor reference; no case is re-attributed. The LPA profile page lists predecessors and their case counts so you can always see what's included.
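
A minimal sketch of how query-time rollup works; the predecessor table shown is a single example entry, the field names are assumptions, and no stored record is modified.

```python
# Illustrative sketch of query-time predecessor rollup. The predecessor table
# is an example entry and the record fields are assumptions; historical
# records are never rewritten.

PREDECESSORS: dict[str, set[str]] = {
    # current canonical LPA id -> canonical ids of the councils it replaced
    "bournemouth-christchurch-poole": {"bournemouth", "christchurch", "poole"},
}

def lpa_scope(canonical_lpa: str) -> set[str]:
    """The set of LPA ids a profile page or widened filter aggregates over."""
    return {canonical_lpa} | PREDECESSORS.get(canonical_lpa, set())

def allow_rate(cases: list[dict], canonical_lpa: str) -> float:
    """Compute an allow rate across an authority and its predecessors at query time."""
    scope = lpa_scope(canonical_lpa)
    in_scope = [c for c in cases if c["lpa"] in scope]
    allowed = [c for c in in_scope if c["outcome"] == "Allowed"]
    return len(allowed) / max(len(in_scope), 1)
```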

LPA filter widening. When you filter a search by a modern authority, the platform also returns cases from its predecessors. This widening is deliberate and matches how professionals research a current authority's track record. It isn't currently indicated in the search interface; the LPA profile page is the canonical place to see the predecessor set.

Outside England. Canonical LPA identity is in place for all four jurisdictions; successor rollup is England-only. Authority reorganisation elsewhere in the UK is less frequent and the structured data to support rollup isn't in the same shape. Cases are searchable by their recorded LPA.

5. Search and insights

Search covers the full text of every decision letter, the structured fields on each appeal, and every extracted argument and policy citation.

"Most relevant" ranking. The default sort ranks results by how closely they match the query, with a bias toward matches in positions that typically matter most to planning research. We don't publish the ranking formula. "Newest first" is available as an alternative and sorts by decision date alone; results may differ from the relevance view because the ranking inputs differ.

Outcome filter default. When no outcome is selected, the search returns all outcomes, including Withdrawn, Invalid, and Pending. To restrict to terminal outcomes (Allowed, Dismissed, Split Decision), select them explicitly. The default is deliberate: many useful research queries turn up cases at any stage, and hiding pending cases would miss live material.

LPA filter widening. When you filter by a modern authority, search also returns cases from its predecessor councils (see Section 4). This isn't indicated on the search interface; the LPA profile page is the canonical place to see which predecessors are included.

Insights statistics and minimum sample size. The insights pages (LPA profile, NPPF policy effectiveness, procedure impact, inspector statistics) report rates and distributions computed from the underlying case set. Where a specific combination of filters would produce a sample too small to be meaningful — for example, a particular case type at a particular authority with only a handful of decisions — the statistics fall back to a broader grouping, typically the case type at the national level or the authority's overall rate. The intention is to avoid presenting a rate computed from a handful of cases as if it were representative. Sample sizes are shown alongside the statistic where space allows.
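
A minimal sketch of the fallback; the threshold and field names are assumptions, not our internal values.

```python
# Illustrative sketch of the small-sample fallback. The threshold and the
# record fields are assumptions, not the platform's internal values.

MIN_SAMPLE = 30  # assumed threshold

def rate_with_fallback(specific_cases: list[dict], broader_cases: list[dict]) -> dict:
    """Report a rate from the specific filter set, falling back to a broader grouping."""
    if len(specific_cases) >= MIN_SAMPLE:
        sample, grouping = specific_cases, "specific"
    else:
        # Too few decisions to be representative; fall back to the broader set.
        sample, grouping = broader_cases, "fallback"
    allowed = sum(1 for c in sample if c["outcome"] == "Allowed")
    return {"rate": allowed / max(len(sample), 1), "n": len(sample), "grouping": grouping}
```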

6. Updates and freshness

The platform ingests new appeals automatically as each jurisdiction publishes them. Jurisdictions are checked on their own schedules, picked to match how often each body publishes.

Lag from decision to platform. In practice, an English PINS decision typically appears on the platform within a day of the appellate body publishing it. PEDW, DPEA, and PAC NI publish less frequently and less predictably, so their observed lag is longer, typically within a week. These are the lags we see, not service commitments; publication from the appellate bodies is irregular, and when a source publishes in a burst or pauses for a period, the platform's lag follows. We're reading from the same public sources planning professionals would consult directly.

Rates, counts, and result sets on the platform reflect the decisions we've ingested so far. For the most recent period, more decisions may still be arriving: figures for this month or this quarter will typically grow as decisions catch up.

Structured fields land before extracted content. Case references, decision dates, outcomes, and local authorities are available immediately after ingestion. Arguments, policy citations, and inspector reasoning are added by the extraction pipeline (Section 1) and can take longer. A result without extracted arguments won't always be a case with no arguments; it may be a case still being processed, a case whose decision document we couldn't locate or access, a case where extracted content didn't clear our quality thresholds, or occasionally a decision that genuinely had no extractable arguments. Where structured fields are present, the case is still searchable on reference, outcome, LPA, date, and the other published fields.

7. Where to go from here

We've written this page because the professionals who rely on this data deserve to know how it was put together. The detail above is the working; two things are the conclusion.

First: the original decision letter is always the authority. Everywhere the platform shows extracted content, a link back to the source is there for a reason. Before citing, arguing from, or advising on any case, open the decision letter and check.

Second: the legal position on what the platform is and isn't responsible for sits in the Terms of Service. This page is methodology; the Terms are the contract.
