£1,800 → £10
SThree CV skill tagging
Per-million-CV cost cut 180× by moving inference to retrieval, filters, and a 7 kB head.
Production AI economics
Peter treats LLMs as tools, not default plumbing: he uses them where they create leverage and removes them where cheaper signals carry the job.
+12.03 pp
A three-line top-k cap by risk score was the single largest precision lift in the programme.
4–8 s → ~50 ms
A ~100× faster tagging path replaced per-skill runtime LLM validation.
174× fewer calls
A four-line patch cut Tier-5 rescue calls 174× with a +0.00 pp precision change.
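The headline ratios follow directly from the before-and-after figures quoted above. A quick check, taking the ~50 ms latency at face value (the variable names are illustrative only):

```python
# Cost per million CVs, in GBP, as quoted in the headline figure.
cost_before_gbp = 1800
cost_after_gbp = 10
cost_ratio = cost_before_gbp / cost_after_gbp

# Latency: 4-8 s before, roughly 50 ms after (treated here as exactly 50 ms).
latency_after_ms = 50
speedup_low = 4000 / latency_after_ms
speedup_high = 8000 / latency_after_ms

print(f"cost ratio: {cost_ratio:.0f}x")
print(f"speedup range: {speedup_low:.0f}x to {speedup_high:.0f}x")
```

The speedup range brackets the rounded ~100× figure.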
Cost-aware private assistant
Routine paths stay efficient, while deeper reasoning is reserved for tasks where it changes the result.
Personal Assistant AI adds a private-product example of model-cost discipline: a memory-heavy assistant should not spend the same reasoning budget on every request.
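One way to picture this kind of budget discipline is a small router that picks a reasoning tier before any model is called. This is a minimal sketch under stated assumptions: the function `choose_budget`, the tier names, the `needs_memory_synthesis` flag, and the 40-word threshold are all hypothetical and are not taken from the Personal Assistant AI project.

```python
def choose_budget(text: str, needs_memory_synthesis: bool = False) -> str:
    """Pick a hypothetical reasoning tier for one request.

    Assumption: long requests, or ones that must combine several stored
    memories, are the cases where deeper reasoning changes the result.
    """
    word_count = len(text.split())
    if needs_memory_synthesis or word_count > 40:
        return "deep"
    return "routine"


# Short lookups stay on the cheap path; synthesis requests escalate.
print(choose_budget("What time is my meeting tomorrow?"))
print(choose_budget("Plan my week around last month's notes.",
                    needs_memory_synthesis=True))
```

In practice the routing signal could be anything cheap to compute; the point of the sketch is only that the decision happens before the expensive call.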
The Skills Taxonomy project is a clean case of knowing when an LLM is the wrong production primitive. LLMs still do offline labelling and evaluation; the inference path became cheaper, faster, and more inspectable.
The real move was not a bigger model. It was ranking and pruning the candidates retrieval had already produced.
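A top-k cap of this kind can be very small. The sketch below is illustrative, not the project's code: it assumes each retrieved candidate carries a risk score where a higher score means a more likely false positive, and `cap_top_k`, the sample skills, and the scores are all hypothetical.

```python
def cap_top_k(candidates, k):
    """Keep at most k (skill, risk) pairs, preferring the lowest risk."""
    ranked = sorted(candidates, key=lambda pair: pair[1])
    return ranked[:k]


# Hypothetical retrieval output: (skill, risk score) pairs.
candidates = [
    ("python", 0.05),
    ("snake handling", 0.90),
    ("sql", 0.10),
    ("leadership", 0.40),
]

kept = cap_top_k(candidates, k=3)
print([skill for skill, _ in kept])  # ['python', 'sql', 'leadership']
```

The cap adds no model call; it only reorders and truncates what retrieval already returned, which is why it can lift precision at near-zero cost.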
The logistic-regression head shows how a small supervised layer can carry earlier LLM-judge labels into a cheap inference boundary when the task is well-shaped.
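The pattern can be sketched in pure Python: labels produced offline by an LLM judge become training targets, and inference reduces to a dot product and a sigmoid. Everything specific here is an assumption for illustration: the two features (retrieval similarity, exact-match flag), the toy rows, and the helper names `train_head` and `predict` are hypothetical, and the real head's features and training setup are not described in the source.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_head(rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression head by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = pred - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    """Inference path: one dot product and one sigmoid, no LLM call."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical features per candidate: [retrieval similarity, exact match].
rows = [[0.90, 1.0], [0.80, 1.0], [0.85, 0.0],
        [0.30, 0.0], [0.20, 0.0], [0.25, 1.0]]
# Labels assumed to come from an offline LLM judge (1 = valid skill tag).
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_head(rows, labels)
print(predict(weights, bias, [0.88, 1.0]) > 0.5)  # True
print(predict(weights, bias, [0.22, 0.0]) > 0.5)  # False
```

The trained artefact is just a weight vector and a bias, which is what makes a head of this type small enough to ship and easy to inspect.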
Cost wins only count if the quality story holds. The Skills Taxonomy work names its limits as part of the result.
Continue exploring
How durable memory, proactive support, and product boundaries shape a private assistant case study.
The full narrative of removing the LLM from SThree’s CV skill-tagging hot path.
How production AI economics fits the broader engineering profile.
Judge limitations, benchmark discipline, and bounded evaluator claims.
The adjacent lesson: more orchestration is not a substitute for controlled tests.