
Equal Pay as a Data Operation: A Union's Claims Base and a Valuation Engine It Can Argue With

How Datamise built the equal-pay work behind GMB Union's campaigns — an owned, relational claims base and the 38-finding audit that...

Peter McCann Strain • 20 May 2026 • 12 min read
Founder Engineering • Legal Infrastructure

An equal-pay campaign is not won in a courtroom. It is won, or lost, in the data that feeds the courtroom — and most of that data starts its life in a spreadsheet that nobody fully trusts.

That is the problem GMB Union brought to Datamise, the consultancy I co-founded in 2023. Not "build us an app." Something more specific: the union runs equal-pay claims for thousands of low-paid women across UK local government, and the operational system behind those claims had to be something the union itself could own, understand, and keep running long after any vendor went away.

The work split into two halves that mirror the campaign itself. First, the operation: a relational claims base the union owns, and an audit that found 38 things wrong with it before they could quietly cost real claims. Then the arithmetic: SettleMise, a separate product that turns a messy equal-pay dataset into a valuation a solicitor can actually interrogate. Neither replaces legal judgement. Both make the work around that judgement trustworthy.

Why ownership was the requirement, not a preference

Equal-pay claims arise because, for decades, women in roles such as care, cleaning, catering, and school support were graded below male-dominated roles of equal value. Correcting that means filing claims at scale: a single regional campaign can hold thousands of claimants, each with identity, membership status, employment history, role data, consent, correspondence, tribunal filings, and a settlement state that changes over months.

A campaign of that shape is a data operation before it is anything else. And a data operation a union depends on cannot live inside a proprietary platform whose roadmap, pricing, and data handling sit with a supplier. A licensed claims portal would have traded the union's operational control for a maintenance contract — the wrong trade for mission-critical legal work. So GMB built on Airtable: a relational base the union's own staff could see into, edit, and clone for each new regional campaign. Datamise was brought in to build, operate, and harden it. That conviction — that mission-driven organisations should not have to choose between affordability, quality, and control of their own systems — became one of the founding reasons my co-founder and I started Datamise.

The shape of a claim

Every claim in the base moves through the same eight stages, from a claimant filling in a form to the file being closed. The base is built around that lifecycle: tables, views, formulas, and automations all exist to move a claim from one stage to the next without losing it.

[Interactive figure: claim lifecycle]

Each stage is a handoff, and every handoff is a place a record can be dropped, duplicated, or mismatched. Intake comes in through a JotForm. Membership is checked against the union's legacy membership system. Pre-ACAS validates that a claim has the fields it needs before the external ACAS conciliation step. ACAS and ET are external filings. Leigh Day is the instructed law firm, reached today by a manual export. Then the result is pushed back to the legacy system and the claim is archived.
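
The lifecycle can be sketched as an ordered list of stages with one transition rule. The stage names below are assumptions read off the paragraph above, not the labels used in the GMB base, and the functions are hypothetical helpers for illustration only.

```typescript
// Hypothetical stage names inferred from the prose; the real base may label them differently.
const STAGES = [
  "intake",              // form submission
  "membership-check",    // matched against the legacy membership system
  "pre-acas",            // required-field validation
  "acas",                // external conciliation filing
  "employment-tribunal", // external tribunal filing
  "law-firm-handoff",    // manual export to the instructed firm
  "legacy-sync",         // result pushed back to the legacy system
  "archived",
] as const;

type Stage = (typeof STAGES)[number];

// Returns the next stage, or null once the claim is archived.
function nextStage(current: Stage): Stage | null {
  const i = STAGES.indexOf(current);
  return STAGES[i + 1] ?? null;
}

// A handoff is valid only if it moves exactly one step forward.
function isValidHandoff(from: Stage, to: Stage): boolean {
  return nextStage(from) === to;
}
```

Making the order explicit means a record that skips a stage, or arrives at one twice, can be detected at the handoff rather than discovered later.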

Nothing here is exotic. The difficulty is volume and repetition: the same eight-stage pipeline, cloned and re-run across region after region, where a small structural flaw is not one flaw — it is a flaw copied into every campaign that inherits the template.

The gold base, and the risk hiding inside it

GMB's base is run as a gold standard template: one canonical base that gets cloned for each new regional campaign. That model is genuinely good practice, and it concentrates risk. A clean template means every campaign starts strong; a subtle defect in the template means every campaign starts broken in exactly the same way.

By early 2026 the base was overdue for review. It had been built competently in 2023, then maintained episodically as campaigns came and went, until a cascade of script failures took it offline. Datamise was asked to audit it — every table, view, automation, formula, integration, and interface — benchmarked against current Airtable best practice, with findings prioritised and a roadmap to fix them. The audit was a Datamise deliverable, completed in April 2026: a prioritised findings register plus a recommended roadmap. The fixes themselves were not part of this deliverable and had not shipped at the time of writing.

[Interactive figure: audit findings]

What 38 findings actually told us

38 findings, 21 of them critical. What they shared mattered more than the count: three patterns ran through almost all of them.

The base joins tables with string matching. Across seven tables, there were zero linked-record fields. Every relationship between a claimant, their membership record, and their live campaign row was reconstructed by matching text. The join key was a formula called "Full Name & DoB" that strips all spaces from a name and date of birth and compares the result — reused as the join in 9 of the base's 13 scripts. Any whitespace drift between two tables, a trailing or double space, silently produces a mismatch. The base also held a "Full Name" formula that called Airtable's SUBSTITUTE with one argument where the function needs three. It had been quietly wrong the whole time.
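
A small sketch of why a text key is fragile where a record link is not. Everything here is illustrative: the field names are invented, and the choice of a non-breaking space as the drifting character is an assumption made for the example, not a detail from the audit.

```typescript
// Illustrative shapes; field names are hypothetical.
interface Member { id: string; fullName: string; dob: string }
interface Claim { fullName: string; dob: string; memberId?: string }

// Mimics a "strip spaces and concatenate" key.
// Assumption for this sketch: only plain ASCII spaces are removed.
function textKey(fullName: string, dob: string): string {
  return (fullName + dob).split(" ").join("");
}

const member: Member = { id: "recA1", fullName: "Ann Smith", dob: "1970-03-02" };

// The same person, but the name arrived with a non-breaking space (U+00A0).
const claim: Claim = { fullName: "Ann\u00A0Smith", dob: "1970-03-02", memberId: "recA1" };

// Text join: fails silently, with no error raised anywhere.
const textMatch = textKey(claim.fullName, claim.dob) === textKey(member.fullName, member.dob);

// Linked-record join: compares stable record IDs, so formatting drift is irrelevant.
const linkMatch = claim.memberId === member.id;

console.log({ textMatch, linkMatch }); // { textMatch: false, linkMatch: true }
```

The two names look identical on screen, which is what makes this class of mismatch hard to spot by eye.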

Scripts do work the data model should be doing. With no linked records, there are no native rollups or lookups, so scripts emulate them, expensively. One script full-scans all 2,091 Active Members on every new submission. Another, a fuzzy address check, runs an O(M·N·t²) Levenshtein cross-scan (each of M records compared against each of N, at a cost per pair that grows with the square of the address length t) that will blow straight through Airtable's 30-second automation limit on any clone with more than a few thousand records.
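
A minimal sketch of the alternative to a per-submission full scan, assuming members can be keyed by some stable value. The `Member` shape and the key format are invented for illustration. In Airtable itself the equivalent is a linked record with a native lookup, not script code.

```typescript
interface Member { id: string; key: string }

// Full scan: every lookup walks the member list, O(N) per submission.
function findByScan(members: Member[], key: string): Member | undefined {
  return members.find((m) => m.key === key);
}

// Index built once: each later lookup is a single Map read.
function buildIndex(members: Member[]): Map<string, Member> {
  const index = new Map<string, Member>();
  for (const m of members) {
    if (!index.has(m.key)) index.set(m.key, m); // keep the first occurrence, as find() would
  }
  return index;
}

// Synthetic data, sized like the member count quoted above.
const members: Member[] = Array.from({ length: 2091 }, (_, i) => ({
  id: `rec${i}`,
  key: `member-${i}`,
}));
const memberIndex = buildIndex(members);

// Both approaches agree on the answer; only the cost per lookup differs.
console.log(findByScan(members, "member-2090")?.id === memberIndex.get("member-2090")?.id); // true
```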

Several scripts can lose data, not just slow down. Four findings were data-loss-class. A membership-update script sorts records by subtracting ISO date strings, which produces NaN, which produces arbitrary ordering, and then deletes the record it decides is stale. A deduplication script has a broken multi-select comparison that generates false matches, so genuine new submissions are auto-deleted and vanish without a trace. A third script depends on a table that does not exist in the base, so it throws on every run. None of these had caused visible disaster yet, only because the base was offline. Every one fires the moment automations are switched back on in a clone.
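
The date-sorting defect is easy to reproduce. The sketch below shows what JavaScript does when ISO date strings are subtracted, then one possible safer comparator. The record shape and IDs are invented, and this is not the audited script. Under the ECMAScript specification, a comparator result of NaN is treated as 0, so the records are never actually ordered by date.

```typescript
interface MembershipRow { id: string; updated: string } // `updated` holds an ISO 8601 date string

// What JavaScript does when two date strings are subtracted: each operand is
// coerced with Number(), and Number("2025-11-03") is NaN.
const brokenDifference = Number("2025-11-03") - Number("2024-06-17");
console.log(Number.isNaN(brokenDifference)); // true

// Safer comparator: parse to epoch milliseconds, and fail loudly on an
// unparseable date instead of letting NaN decide which record looks stale.
function byUpdatedAscending(a: MembershipRow, b: MembershipRow): number {
  const ta = Date.parse(a.updated);
  const tb = Date.parse(b.updated);
  if (Number.isNaN(ta) || Number.isNaN(tb)) {
    throw new Error(`Unparseable date on ${a.id} or ${b.id}`);
  }
  return ta - tb;
}

const rows: MembershipRow[] = [
  { id: "recB", updated: "2025-11-03" },
  { id: "recA", updated: "2024-06-17" },
  { id: "recC", updated: "2026-01-09" },
];

// Sort a copy so the source records are left untouched.
const oldestFirst = rows.slice().sort(byUpdatedAscending).map((r) => r.id);
console.log(oldestFirst); // [ 'recA', 'recB', 'recC' ]
```

Throwing on a bad date is a deliberate choice here: a script that is about to delete the "stale" record should stop when it cannot tell which one that is.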

There were claimant-facing failures too, and these are the ones that sting. A literal XXX placeholder was never substituted in one email subject line, so real claimants received mail headed "GMB XXX Equal Pay Claim" for a year. Two early emails were addressed to GMB's own no-reply mailbox instead of the claimant. And 27 unread failure notifications had piled up: validation scripts had been failing on most submissions between October 2025 and February 2026, then were simply switched off. That backlog of silent failures is why the base went dark. A scaled system rarely dies from one dramatic break; it dies from small failures that fire silently, accumulate, and are eventually muted because nobody has time to read the alerts.

The fix is architectural, and it is scoped

The audit's recommended roadmap does not patch symptoms. It changes the model: introduce linked records across the three member tables so a claim points at a real Active Members row instead of guessing by name; normalise contracts out of the 122-field Live Campaign table into a clean child table; consolidate five separate validator scripts into one on-create orchestrator and five overlapping dedupe systems into one pipeline; make deduplication non-destructive (flag-and-review, never silent delete); and pin a twelve-item clone-time checklist inside the base so every regional clone is set up the same way instead of from memory.
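
Flag-and-review deduplication can be sketched as a function that only reports suspects and never deletes. The record shape, the `joinKey` field, and the matching rule are assumptions for illustration. The multi-select comparison is written order-insensitively, which is one plausible way to avoid false matches, not a description of the audited script.

```typescript
// `claimTypes` models a multi-select field; all names here are hypothetical.
interface Submission { id: string; joinKey: string; claimTypes: string[] }
interface DuplicateFlag { keepId: string; suspectId: string; reason: string }

// Order-insensitive comparison of multi-select values.
function sameSelections(a: string[], b: string[]): boolean {
  const setB = new Set(b);
  return new Set(a).size === setB.size && a.every((v) => setB.has(v));
}

// Returns flags for human review. Nothing is deleted or mutated.
function flagDuplicates(submissions: Submission[]): DuplicateFlag[] {
  const firstSeen = new Map<string, Submission>();
  const flags: DuplicateFlag[] = [];
  for (const s of submissions) {
    const earlier = firstSeen.get(s.joinKey);
    if (earlier === undefined) {
      firstSeen.set(s.joinKey, s);
    } else if (sameSelections(earlier.claimTypes, s.claimTypes)) {
      flags.push({ keepId: earlier.id, suspectId: s.id, reason: "same key and same selections" });
    }
  }
  return flags;
}
```

Because the output is a list of flags, a wrong match costs a reviewer a few seconds instead of costing a claimant their submission.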

The audit's own estimate to bring the base to 2026 best practice is about 14 days of engineering (one week to clear every critical defect, one to two more to cover the architecture), with an overall verdict that the base is "fundamentally sound" and roughly two weeks from modern. That is a qualified assessment, scoped to what the findings register actually measured. The single highest-leverage recommendation is also the cheapest to state: commit to a genuine clean-template model, so every future campaign inherits the correctness instead of inheriting the bug.

The other half: a valuation a reviewer can disbelieve

The claims base keeps the operation legible and owned. It is not where a claim's settlement value is calculated — that is a distinct problem, and Datamise built a separate product for it.

On 20 May 2026, Vercel reported the SettleMise deployment READY. Browser sign-in was broken anyway. The build was green, the dashboard rendered, and a reviewer still could not get into a case until an auth-proxy configuration was found and fixed. A valuation engine that nobody can log into is not a valuation engine; it is a screenshot. That gap — between "the platform says it is up" and "a human can actually use it" — is the whole discipline behind the product.

Because the hard part of a valuation engine has nothing to do with arithmetic. Equal-pay claims live or die on a settlement figure, and a figure on its own is not evidence: a solicitor, an accountant, or a tribunal cannot rely on a number they cannot interrogate. They need to see the data that fed it, the assumptions that shaped it, the rows that were excluded and why. The hard product question is whether the calculation is legible enough to be argued with.

A walkthrough on synthetic, fictional demo data. It shows the product workflow, not a real claimant dataset.

SettleMise is a web app that turns equal-pay datasets into reviewable valuation schedules. It does not decide liability, it does not replace professional judgement, and it is not "tribunal-ready" out of the box. What it does is make a calculation inspectable, so the humans who carry the legal and accounting responsibility can do their job with the work shown.

Why a spreadsheet is the wrong tool here

A spreadsheet can produce a total while hiding everything around it. Column names drift between source files. Comparator job titles match ambiguously, or not at all. Dates need interpretation. Tax, interest, pension, and employer-side assumptions get baked into formulas where nobody can see them. And the worst failure mode is a row that is unsafe to calculate getting silently swept into a SUM, so the total looks complete when it is quietly wrong.

So SettleMise treats valuation as a workflow with checkpoints, not a single formula. A user signs in, opens a case workspace, and uploads three datasets: claims data, a pay-grade table, and a comparator table. Before anything is calculated, the app surfaces a review pass over file structure, fuzzy column mapping, missing fields, date-order inference, pay-grade normalisation, and per-row warnings; unsupported formats are rejected outright rather than half-parsed. Only then does the user confirm the calculation assumptions — claim period, valuation date, interest, tax, NI bands, pension, gross-up, bonus, overtime, weeks per year — each an explicit, recorded input. Nothing important hides in a default.
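
One way to enforce "nothing hides in a default" is to refuse to calculate until every assumption has been supplied. The sketch below is illustrative only: the key names are invented to mirror the list above and are not SettleMise's actual schema.

```typescript
// Hypothetical keys, one per assumption named in the text.
const REQUIRED_ASSUMPTIONS = [
  "claimPeriod",
  "valuationDate",
  "interestRate",
  "taxRate",
  "niBands",
  "pensionRate",
  "grossUp",
  "includeBonus",
  "includeOvertime",
  "weeksPerYear",
] as const;

type AssumptionKey = (typeof REQUIRED_ASSUMPTIONS)[number];
type AssumptionInput = Partial<Record<AssumptionKey, unknown>>;

// Lists every assumption the user has not explicitly supplied.
// An explicit `false` or `0` counts as supplied; only absent or empty values are missing.
function missingAssumptions(input: AssumptionInput): AssumptionKey[] {
  return REQUIRED_ASSUMPTIONS.filter((k) => {
    const value = input[k];
    return value === undefined || value === null || value === "";
  });
}

// Usage: a calculation would run only when this list is empty.
console.log(missingAssumptions({ weeksPerYear: 52, includeBonus: false }).length); // 8
```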

The calculation engine then produces row-level outputs and totals alongside review records: missing pay grades, unconfirmed comparator matches, invalid dates, no-claim rows, issue summaries. The governing principle is simple. An exact, confident join is allowed to calculate; an ambiguous case becomes a review item, not a silent assumption. Comparator matching is where that earns its keep: SettleMise tries an exact normalised-title match first, and if that fails it does not guess — it surfaces the near-misses for a human to confirm and excludes the unconfirmed match from the total. The total you get is the total of what the system was sure about. Everything else is waiting, visibly, for review.
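
The exact-match-first rule can be sketched in a few lines. This is an illustration under stated assumptions, not SettleMise's code: the row shape, the `weeklyShortfall` field, the word-overlap similarity measure, and the 0.5 threshold are all invented for the example.

```typescript
interface ClaimRow { id: string; comparatorTitle: string; weeklyShortfall: number }
interface ReviewItem { rowId: string; reason: string; nearMisses: string[] }

// Lowercase, replace punctuation with spaces, collapse and trim whitespace.
function normaliseTitle(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Share of words in common (Jaccard index); a stand-in similarity measure.
function similarity(a: string, b: string): number {
  const wordsA = new Set(a.split(" "));
  const wordsB = new Set(b.split(" "));
  let shared = 0;
  wordsA.forEach((w) => { if (wordsB.has(w)) shared += 1; });
  return shared / (wordsA.size + wordsB.size - shared);
}

function valueRows(rows: ClaimRow[], comparatorTitles: string[]) {
  const known = new Set(comparatorTitles.map(normaliseTitle));
  let total = 0;
  const review: ReviewItem[] = [];
  for (const row of rows) {
    const key = normaliseTitle(row.comparatorTitle);
    if (known.has(key)) {
      total += row.weeklyShortfall; // exact, confident join: allowed to calculate
    } else {
      // No guess is made: near-misses are surfaced and the row stays out of the total.
      const nearMisses = comparatorTitles.filter((t) => similarity(key, normaliseTitle(t)) >= 0.5);
      review.push({ rowId: row.id, reason: "unconfirmed comparator match", nearMisses });
    }
  }
  return { total, review };
}
```

The design choice is that the similarity measure only ever populates the review list. It has no path to the total, so a poor threshold produces extra review work, never a wrong figure.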

The implementation is a Next.js 16 / React 19 / TypeScript app with Clerk authentication and Upstash Redis-backed case records. Case access is owner-scoped; mutating routes are guarded by same-origin and fetch-site checks; case artifacts are encrypted at rest and persisted with an operational audit trail that records calculation runs, analysis runs, and export downloads while keeping raw claimant rows out of the audit stream. The schedule can leave the app in the formats a professional actually works in, a tribunal-style PDF among them — "style" because the people who carry that responsibility are the ones who certify it, not the tool.
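
A same-origin and fetch-site guard on mutating routes might look like the sketch below. It is an assumption about the general technique, not the product's middleware: the function name, the `HeaderReader` interface, and the `appOrigin` parameter are invented, and the fallback behaviour when `Sec-Fetch-Site` is absent is one possible policy among several.

```typescript
// Minimal read-only view of request headers; the standard Headers class satisfies it.
interface HeaderReader { get(name: string): string | null }

// Returns true only when a mutating request appears to come from the app's own pages.
// `appOrigin` is a placeholder such as "https://app.example.test", not a real deployment URL.
function isTrustedMutation(method: string, headers: HeaderReader, appOrigin: string): boolean {
  const safeMethods = ["GET", "HEAD", "OPTIONS"];
  if (safeMethods.indexOf(method.toUpperCase()) !== -1) return true; // non-mutating: nothing to guard

  // Sec-Fetch-Site is set by the browser and cannot be overridden by page scripts.
  // If it is present, anything other than "same-origin" is rejected.
  const fetchSite = headers.get("sec-fetch-site");
  if (fetchSite !== null && fetchSite !== "same-origin") return false;

  // Origin must be present on a mutating request and match the app exactly.
  return headers.get("origin") === appOrigin;
}
```

Checking both headers is deliberately redundant: older clients that omit `Sec-Fetch-Site` are still held to the `Origin` match.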

What it is, and what it is not

  • SettleMise does not make legal decisions and does not determine liability. It computes and exposes a valuation; a human decides what it means.
  • It does not replace solicitors, accountants, experts, or reviewers. It is built to support their review, not substitute for it.
  • It is not "certified" or "tribunal-ready." It produces a tribunal-style schedule for people who carry that responsibility to check.
  • The synthetic demo is fictional. It shows the workflow and the product's shape, nothing about any named client or sector.

A valuation engine that overclaimed any one of those would be unusable for the exact people it is meant to serve.

The two halves fit together cleanly. One keeps the operation legible and owned; the other makes the calculation inspectable. And the lesson that runs through both is the same one the 20 May deployment taught in miniature: a green build and a usable system are not the same thing, and a settlement figure is only worth as much as a reviewer's ability to reach it, take it apart, and stand behind it. The 27 muted alerts that took the claims base offline are the same failure wearing different clothes — work that reports success while quietly going wrong. The whole point of both products is to make that failure visible while there is still time to fix it.

"Datamise has set a gold standard in professionalism and exceptional software development. Their commitment to quality of work has been evident in every stage of the project, with top-notch solutions and attention to detail."

Megan Fisher, GMB Union (datamise.co.uk)

