Deep learning and vision

Computer Vision For Space, Medical Imaging, And Generative Media

Peter has built computer-vision systems where labels were scarce, domain shift was a central risk, and geometric or biological validity counted as much as model choice.

6DoF pose estimation

Satellite pose estimation

Rendezvous-vision pipelines in the ELSA-M mission context.

FID 109 → 49

AstroGAN

A sim-to-real translation network that more than halved the synthetic-to-real FID.

89% · 8 cell classes

Oxford histology pipeline

Whole-slide nuclei detection, cell classification, tissue segmentation, and feature extraction.

Mean quality 7.9/10

Generative media pipeline

Photo restoration, ArcFace identity checks, image-to-video generation, and film assembly.

Markush OCSR

Praviar chemical-structure vision

Reads compound and Markush structures straight out of pharmaceutical-patent drawings, validated on a held-out set.

Space vision

At Astroscale, Peter worked where classical vision had underperformed and synthetic-to-real transfer was the central risk.

  • Built 6DoF pose-estimation systems from ResNet baselines through FPN, RANSAC-PnP, Vision Transformers, and multi-task detection/pose/keypoint/segmentation designs.
  • Designed a custom 3D projection loss that projects the satellite model into image space to align supervision with geometric pose error.
  • Measured the sim-to-real gap with per-domain image statistics on SPEED+, then built AstroGAN, a UVCGANv2-based translation network, cutting FID from 109 to 49 by epoch 500.
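
To make the projection-loss idea above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the loss used at Astroscale: it assumes a pinhole camera with no lens distortion, a pose given as a unit quaternion plus translation, and a mean pixel distance as the error. The names `quat_to_rot`, `project`, and `projection_loss` are hypothetical, and a training version would be written in a differentiable framework rather than with Python lists.

```python
import math

def quat_to_rot(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalise defensively
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def project(points, q, t, fx, fy, cx, cy):
    """Project 3D model points into pixel coordinates (assumed pinhole model)."""
    R = quat_to_rot(q)
    pixels = []
    for p in points:
        # Rotate and translate the model point into the camera frame.
        xc = sum(R[0][k] * p[k] for k in range(3)) + t[0]
        yc = sum(R[1][k] * p[k] for k in range(3)) + t[1]
        zc = sum(R[2][k] * p[k] for k in range(3)) + t[2]
        if zc <= 0:
            raise ValueError("point is behind the camera")
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels

def projection_loss(points, q_pred, t_pred, q_true, t_true, intrinsics):
    """Mean pixel distance between the model projected under two poses."""
    pred = project(points, q_pred, t_pred, *intrinsics)
    true = project(points, q_true, t_true, *intrinsics)
    dists = [math.hypot(u1 - u2, v1 - v2)
             for (u1, v1), (u2, v2) in zip(pred, true)]
    return sum(dists) / len(dists)
```

The appeal of a loss of this shape is that rotation and translation errors are scored in the same unit (pixels), so the two terms do not need a hand-tuned weighting against each other.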

Medical imaging research

The Oxford DPhil required computer vision at whole-slide scale: nuclei detection, cell classification, tissue segmentation, and biological interpretation.

  • Built a RetinaNet-based nuclei detector and an 8-class cell classifier reaching 89% accuracy on mouse placental tissue.
  • Recovered the 8 known cell types at ~0.99 across ARI, AMI, and V-measure using Spectral Clustering on UMAP-reduced embeddings.
  • Trained a SimCLR contrastive model on a 3-million-sample unlabelled dataset to probe finer cell substructure.
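
For readers unfamiliar with the agreement scores quoted above, the following is a minimal sketch of one of them, the adjusted Rand index (ARI), in plain Python. It is illustrative only and not the evaluation code from the DPhil; the function name `adjusted_rand_index` is hypothetical, and in practice a library implementation such as scikit-learn's `adjusted_rand_score` would normally be used.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: trivially in agreement

    # Count pairs that fall together in both, in truth only, in prediction only.
    contingency = Counter(zip(labels_true, labels_pred))
    sum_both = sum(comb(c, 2) for c in contingency.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_both - expected) / (max_index - expected)
```

ARI is invariant to how clusters are numbered, so a clustering that recovers the known cell types under different cluster ids still scores 1.0, while a random assignment scores close to 0.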

Chemical and document vision

Praviar extends the vision work into a legal-adjacent product: pharmaceutical patents hide their most important claims in drawings that are never transcribed.

  • Built an optical chemical-structure-recognition cascade that reads compound and Markush (generic) structures straight out of patent drawings, validated on a held-out set of pharmaceutical-patent chemical drawings.
  • Treats patent intelligence as a multimodal problem, so the system does not walk past the exact diagram that decides freedom to operate.
  • The model and provider stack stays product-confidential; the disclosable signal is that vision is a first-class product layer, not a text-only afterthought.

Creative image-to-video engineering

A private family-film project adds a different vision signal: generative media built as an inspectable engineering workflow.

  • Built a 50+ module pipeline around colour-managed 16-bit scans, FLUX 2 [pro] restoration, and Kling 3.0 image-to-video generation.
  • Used ArcFace identity checks, a VLM judge, and five rounds of prompt iteration to keep generated motion stable, achieving mean quality 7.9/10.
  • The local six-stage restoration pipeline was built and tested but bypassed; the film shipped on hosted FLUX restoration.
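
An identity check of the kind mentioned above can be sketched as a cosine-similarity gate over face embeddings. This is a minimal illustration, not the project's code: it assumes the embeddings have already been produced by a face model such as ArcFace, the function names are hypothetical, and the `threshold=0.5` default is a placeholder, not a value taken from the pipeline.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length embedding")
    return dot / (norm_a * norm_b)

def identity_preserved(ref_embedding, frame_embeddings, threshold=0.5):
    """Check every generated frame against the reference photo's embedding.

    Returns (ok, worst), where worst is the lowest similarity seen.
    The threshold is an assumed placeholder and would need tuning.
    """
    if not frame_embeddings:
        raise ValueError("no frames to check")
    sims = [cosine_similarity(ref_embedding, e) for e in frame_embeddings]
    worst = min(sims)
    return worst >= threshold, worst
```

Gating on the worst frame rather than the average means a single frame where the face drifts is enough to flag a clip for regeneration.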
