Recruitment AI Safety & Fairness

Recruitment AI is high-stakes because model outputs affect opportunity. Peter builds systems that test fairness before deployment and preserve human oversight where it matters.

Company-bound platform

Aurum at SThree

Controlled hiring simulations for fairness review, red-team testing, and audit records.

Review surfaces

Evaluator diagnostics

Review surfaces make disagreement and failure modes visible before deployment.

Fairness signals

Bounded fairness signals

Fairness findings are framed as risk signals, not legal certification.

Red-team workflow

Adversarial harness

Adversarial testing is integrated with provenance and human oversight.

What the platform tests

Aurum creates controlled hiring simulations, then helps reviewers inspect how recruitment AI behaves under ordinary, fairness-sensitive, and adversarial conditions.

  • Review surfaces make disagreement and failure modes visible rather than hiding them behind one confidence score.
  • Fairness diagnostics are separated from compliance claims so weak signals do not become false certainty.
  • Adversarial testing is treated as review material with provenance, not a standalone certification claim.

Why governance changes the architecture

Recruitment AI safety needs more than a model score. It also needs traceability, audit artifacts, small-sample caveats, and explicit points where a human stays accountable.

  • Red-team mechanics remain private because Aurum is employer work for SThree, but the public posture is clear: attack findings must be bounded and reviewable.
  • Demo materials will use safe fixtures and avoid private prompts, workflows, attack recipes, or operational details.
  • Fairness findings on synthetic data are framed as failure discovery and regression testing, not a guarantee of deployed-world fairness.
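To illustrate the small-sample caveat mentioned above, here is a minimal Python sketch that puts a Wilson score interval around an observed selection rate. It is not taken from Aurum. The function name, the 95% z-value, and the example counts are assumptions chosen for illustration only.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    `z` defaults to 1.96 (roughly 95% coverage); that choice is an
    assumption for this sketch, not a documented platform setting.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Illustrative counts only: the same 30% observed rate at two sample sizes.
small_low, small_high = wilson_interval(3, 10)
large_low, large_high = wilson_interval(300, 1000)
print(f"n=10:   [{small_low:.3f}, {small_high:.3f}]")
print(f"n=1000: [{large_low:.3f}, {large_high:.3f}]")
```

With ten simulated candidates the interval is wide enough that an apparent gap between two groups may be noise, which is one reason a finding at that scale is better read as a risk signal than as a conclusion.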

Self-improvement with claim discipline

The Aurum self-improvement research treats evaluator optimization as a governance problem as much as a modelling one. It works against a synthetic proxy oracle, not human hiring ground truth.

  • A DSPy input-leakage audit keeps the proxy label and prior scores out of the evaluator-visible inputs, and the audit is mechanically checkable.
  • MultiStep+GEPA led Phase 4 on fitness and separation, but the top-3 ensemble beat it on rank metrics in all five seeds, so the result is reported as a trade-off rather than a clear win.
  • A claim ledger and frozen manifest keep the work from claiming real-world hiring validity, legal sign-off, or universal GEPA superiority.
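One way an input-leakage audit like the one in the first bullet could be made mechanically checkable is a plain field-name check that runs before any evaluation. The sketch below is a hypothetical illustration, not the Aurum implementation. It does not use the DSPy API, and the field names `proxy_label` and `prior_score` are assumed stand-ins for whatever the real pipeline calls them.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical names: the source does not say what the proxy label
# or prior-score fields are actually called.
FORBIDDEN_FIELDS = frozenset({"proxy_label", "prior_score"})


@dataclass(frozen=True)
class LeakageReport:
    """Result of one audit: which forbidden fields were visible."""

    leaked: tuple[str, ...]

    @property
    def passed(self) -> bool:
        return not self.leaked


def audit_visible_inputs(
    visible_fields: Iterable[str],
    forbidden: Iterable[str] = FORBIDDEN_FIELDS,
) -> LeakageReport:
    """Compare evaluator-visible field names against a forbidden set.

    Names are compared case-insensitively after stripping whitespace,
    so "Prior_Score " still counts as a leak.
    """
    visible = {name.strip().lower() for name in visible_fields}
    blocked = {name.strip().lower() for name in forbidden}
    return LeakageReport(leaked=tuple(sorted(visible & blocked)))


# Illustrative field lists only.
clean = audit_visible_inputs(["cv_text", "job_spec"])
leaky = audit_visible_inputs(["cv_text", "Prior_Score"])
print(clean.passed, leaky.passed, leaky.leaked)
```

A name-based check only catches direct exposure. It would not detect a label that has been paraphrased into free text, so a sketch like this is best read as a regression guard rather than proof that no leakage exists.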