Face Recognition¶

To recognize faces, PhotoPrism uses a multi-stage AI pipeline that detects faces, generates embeddings, and clusters similar faces so they can be easily organized by person.

PhotoPrism supports two interchangeable detection engines that you can switch between depending on your hardware and accuracy requirements. The ONNX-based engine provides significantly improved detection of faces that are occluded, at angles, or in challenging lighting conditions.

How It Works¶

The face recognition pipeline consists of three stages:

Detection - Locate faces in photos using Pigo or ONNX SCRFD engines
Embedding - Generate 512-dimensional vectors to characterize each face using TensorFlow based on FaceNet
Clustering - Group similar faces using the DBSCAN algorithm so they can be assigned to people

Detection Engines¶

PhotoPrism offers two face detection engines with different characteristics:

Pigo¶

Pigo is a fast, CPU-only cascade classifier based on pixel intensity comparisons. It provides good performance for straightforward face detection scenarios and maintains PhotoPrism's historical detection behavior.

Best for:

Quick processing of well-lit, front-facing portraits
Lower resource consumption

ONNX SCRFD 0.5g¶

ONNX Runtime-backed CNN model that delivers higher recall on challenging faces. This engine:

Detects faces that are partially occluded (covered by hands, objects, etc.)
Works better with off-axis or angled faces
Handles difficult lighting conditions more effectively
Consumes 720px thumbnails (model input 640px)
Schedules work on the meta/vision workers
Defaults to half the available CPUs (minimum 1 thread)

The ONNX engine is automatically enabled when FACE_ENGINE=auto and the bundled SCRFD model is present. The prebuilt runtime targets glibc ≥ 2.27 on x86_64/arm64 architectures.

Best for:

Maximum face detection accuracy
Photos with challenging angles or lighting
Group photos with partially obscured faces

Configuration¶

We recommend that only advanced users and developers change these parameters.

Detection Settings¶

Environment Variable	CLI Flag	Default	Description
PHOTOPRISM_FACE_ENGINE	--face-engine	auto	Detection engine (`auto`, `pigo`, `onnx`). `auto` uses ONNX when available
PHOTOPRISM_FACE_ENGINE_THREADS	--face-engine-threads	runtime.NumCPU()/2 (≥1)	Number of ONNX inference threads; ignored by Pigo
PHOTOPRISM_FACE_ANGLE	--face-angle	-0.3,0,0.3	Detection angles in radians for Pigo multi-angle scanning
PHOTOPRISM_FACE_SIZE	--face-size	50	Minimum size of faces in `PIXELS` (20-10000)
PHOTOPRISM_FACE_SCORE	--face-score	9.0	Minimum face `QUALITY` score (1-100)
PHOTOPRISM_FACE_OVERLAP	--face-overlap	42	Face area overlap threshold in `PERCENT` (1-100)

Clustering Settings¶

It is strongly recommended that you run the "photoprism faces reset" command in a terminal to remove existing clusters and mappings after changing any of the clustering parameters, as otherwise inconsistencies may result in unexpected behavior or errors.

Environment Variable	CLI Flag	Default	Description
PHOTOPRISM_FACE_CLUSTER_SIZE	--face-cluster-size	80	Minimum size of automatically clustered faces in `PIXELS` (20-10000)
PHOTOPRISM_FACE_CLUSTER_SCORE	--face-cluster-score	15	Minimum `QUALITY` score of automatically clustered faces (1-100)
PHOTOPRISM_FACE_CLUSTER_CORE	--face-cluster-core	4	`NUMBER` of faces forming a cluster core (1-100)
PHOTOPRISM_FACE_CLUSTER_DIST	--face-cluster-dist	0.64	Similarity `DISTANCE` of faces forming a cluster core (0.1-1.5)
PHOTOPRISM_FACE_MATCH_DIST	--face-match-dist	0.46	Similarity `OFFSET` for matching faces with existing clusters (0.1-1.5)

Tuning Tips¶

A reasonable range for the similarity distance between face embeddings is between 0.60 and 0.70, with a higher value being more aggressive and leading to larger clusters with more false positives.
To cluster a smaller number of faces, you can reduce the kernel to 3 or 2 similar faces.
The ONNX engine typically provides better results on challenging photos (angles, occlusions, lighting) compared to Pigo.
Use FACE_ENGINE=auto to automatically select the best available engine.

Face Embeddings¶

After detection, PhotoPrism generates 512-dimensional embedding vectors using TensorFlow to characterize each face. These embeddings are used to:

Match faces across different photos
Cluster similar faces using the DBSCAN algorithm
Assign faces to people with manual confirmation

Normalization¶

All face embeddings are L2-normalized to unit length (‖x‖₂ = 1) at:

Creation time (after TensorFlow inference)
Midpoint calculation when merging clusters
Deserialization when loading from the database

This normalization ensures that Euclidean distance comparisons are equivalent to cosine similarity, aligning with FaceNet research standards.

Commands¶

Learn more about CLI commands ›

Performance Improvements¶

Recent optimizations have significantly improved face detection and clustering:

Benchmark	Before	After	Improvement
Embedding Distance	~242 ns/op	~155 ns/op	36% faster
Embeddings Midpoint	~194 µs/op, 528 KB	~99 µs/op, 4 KB	49% faster, 99% less memory
Face Matching (1024 faces)	1024 comparisons	~16 comparisons	98% fewer evaluations
Cluster Materialization	~29.8 µs/op, 384 allocs	~14.8 µs/op, 64 allocs	50% faster, 83% fewer allocations

These improvements mean:

Faster indexing of large photo collections
Lower memory usage during face clustering
Quicker face matching when organizing people
More efficient distance calculations