Label Generation1¶
PhotoPrism’s built-in classification relies on TensorFlow models such as Nasnet. With the new Ollama integration, you can generate labels via multimodal LLMs.
The Ollama integration is under active development, so the configuration, commands, and other details may change or break unexpectedly. Please keep this in mind and notify us when something doesn't work as expected. Thank you for your help in keeping this documentation updated!
Ollama Setup Guide¶
Follow the steps below to connect PhotoPrism directly to an Ollama instance and replace (or augment) the default Nasnet classifier with labels generated by a vision-capable LLM.
Step 1: Install Ollama¶
To run Ollama on the same server as PhotoPrism, add the ollama service to the services section of your compose.yaml (or docker-compose.yml) file, as shown in the example below.2
Alternatively, most of the compose.yaml configuration examples on our download server already have Ollama preconfigured, so you can start it with the following command (remove profiles: ["ollama"] from the ollama service to start it by default, without using --profile ollama):
docker compose --profile ollama up -d
compose.yaml
services:
  photoprism:
    ## The ":preview" build gives early access to new features:
    image: photoprism/photoprism:preview
    ...
  ## Ollama Large-Language Model Runner (optional)
  ## Run "ollama pull [name]:[version]" to download a vision model
  ## listed at <https://ollama.com/search?c=vision>, for example:
  ## docker compose exec ollama ollama pull gemma3:latest
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    stop_grace_period: 15s
    ## Insecurely exposes the Ollama service on port 11434
    ## without authentication (for private networks only):
    # ports:
    #   - "11434:11434"
    environment:
      ## Ollama Configuration Options:
      OLLAMA_HOST: "0.0.0.0:11434"
      OLLAMA_MODELS: "/root/.ollama" # model storage path (see volumes section below)
      OLLAMA_MAX_QUEUE: "100" # maximum number of queued requests
      OLLAMA_NUM_PARALLEL: "1" # maximum number of parallel requests
      OLLAMA_MAX_LOADED_MODELS: "1" # maximum number of loaded models per GPU
      OLLAMA_LOAD_TIMEOUT: "5m" # maximum time for loading models (default "5m")
      OLLAMA_KEEP_ALIVE: "5m" # duration that models stay in memory (default "5m")
      OLLAMA_CONTEXT_LENGTH: "4096" # maximum input context length
      OLLAMA_MULTIUSER_CACHE: "false" # optimize prompt caching for multi-user scenarios
      OLLAMA_NOPRUNE: "false" # disables pruning of model blobs at startup
      OLLAMA_NOHISTORY: "true" # disables readline history
      OLLAMA_FLASH_ATTENTION: "false" # enables the experimental flash attention feature
      OLLAMA_KV_CACHE_TYPE: "f16" # cache quantization (f16, q8_0, or q4_0)
      OLLAMA_SCHED_SPREAD: "false" # allows scheduling models across all GPUs
      OLLAMA_NEW_ENGINE: "true" # enables the new Ollama engine
      # OLLAMA_DEBUG: "true" # shows additional debug information
      # OLLAMA_INTEL_GPU: "true" # enables experimental Intel GPU detection
      ## NVIDIA GPU Hardware Acceleration (optional):
      # NVIDIA_VISIBLE_DEVICES: "all"
      # NVIDIA_DRIVER_CAPABILITIES: "compute,utility"
    volumes:
      - "./ollama:/root/.ollama"
    ## NVIDIA GPU Hardware Acceleration (optional):
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: "nvidia"
    #           capabilities: [ gpu ]
    #           count: "all"
Note that the NVIDIA Container Toolkit must be installed for GPU hardware acceleration to work. Experienced users may also run Ollama on a separate, more powerful server.
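If the toolkit is not yet installed on the Docker host, the following commands are a sketch of the setup on Debian or Ubuntu, assuming the NVIDIA apt repository has already been added (see the NVIDIA documentation for other distributions):
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker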
Ollama does not enforce authentication by default. Only expose port 11434 inside trusted networks or behind a reverse proxy that adds access control.
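If the API only needs to be reachable from the Docker host itself, one option is to bind the published port to the loopback interface instead of the insecure mapping shown above; this uses a standard Docker Compose port syntax:
    ports:
      - "127.0.0.1:11434:11434"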
Step 2: Download Models¶
Pull at least one multimodal model that can return structured JSON (for example, gemma3:latest):
docker compose exec ollama ollama pull gemma3:latest
Other viable options already validated in our tests include qwen2.5vl:3b, qwen2.5vl:3b-q8_0, and qwen2.5vl:3b-fp16. Stick to lightweight quantizations when running on CPUs and reserve FP16 variants for GPUs.
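Once the download has finished, you can list the locally available models to confirm that the pull succeeded:
docker compose exec ollama ollama list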
Step 3: Configure PhotoPrism¶
Create or update the vision.yml file in the PhotoPrism storage directory (inside the container: /photoprism/storage/config/vision.yml). Reuse your existing TensorFlow defaults but add an Ollama-based label model as shown below:
vision.yml
Models:
- Type: labels
Default: true # Nasnet fallback
- Type: nsfw
Default: true
- Type: face
Default: true
- Type: caption
Default: true
- Type: labels
Name: gemma3:latest
Engine: ollama
Run: newly-indexed
System: |
You are a PhotoPrism vision model. Follow the provided schema exactly.
Prompt: |
Analyze the image and list the most relevant labels. Provide the response in English
unless this prompt specifies a different language.
Service:
Uri: http://ollama:11434/api/generate
Thresholds:
Confidence: 10
Prompt localization
Keep the base instructions in English and append the desired language (e.g., “Respond in German”). This mirrors our recommendation for caption prompts and tends to be more reliable than asking the model to output multiple languages simultaneously.
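As a sketch, a localized variant of the label prompt from the configuration above could look like this, with the final sentence being the appended language instruction (the exact wording is only an illustration):
    Prompt: |
      Analyze the image and list the most relevant labels.
      Respond in German.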
PhotoPrism evaluates models from the bottom of the list upwards, so placing the Ollama entry after the Nasnet fallback ensures it is picked first while Nasnet remains available as a backup.
PhotoPrism records Ollama-generated labels with the ollama metadata source automatically, so you do not need to request a specific source field in the schema or pass --source to the CLI unless you want to override the default.
Scheduling Options¶
- Run: newly-indexed (recommended) runs the Ollama model via the metadata worker shortly after new photos are indexed. This keeps the indexing pipeline responsive while still updating metadata within minutes.
- Run: on-index executes during the primary indexing process. Only use this for lightweight or local models; external calls will otherwise lengthen imports noticeably.
- Run: manual disables automatic execution so you can invoke the model explicitly via photoprism vision run -m labels, as shown in the sketch below.
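As a sketch of the manual variant, only the Run value of the Ollama entry from Step 3 changes (the model name and engine are the assumptions from that example):
  - Type: labels
    Name: gemma3:latest
    Engine: ollama
    Run: manual
The model is then triggered explicitly with:
docker compose exec photoprism photoprism vision run -m labels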
Step 4: Restart PhotoPrism¶
Run the following commands to restart photoprism and apply the new settings:
docker compose stop photoprism
docker compose up -d
Test the integration with:
docker compose exec photoprism photoprism vision run -m labels --count 5 --force
Troubleshooting¶
Verify Active Configuration¶
docker compose exec photoprism photoprism vision ls
Ensure the output lists both Nasnet (default) and your Ollama label model. If the custom entry is missing, double-check the YAML indentation, file name (vision.yml, not .yaml), or environment overrides.
Schema or JSON Errors¶
If PhotoPrism logs vision: invalid label payload from ollama, the model returned data that didn't match the expected structure. Confirm that:
- The adapter injected schema instructions (keep System/Prompt intact or reuse the defaults).
PhotoPrism may fall back to the existing TensorFlow Nasnet model when the Ollama response cannot be parsed.
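To inspect the raw model output, you can also query the Ollama API directly. This sketch assumes the commented-out ports mapping from Step 1 has been enabled (or that the command runs on a host inside the same Docker network); the prompt text is only an illustration:
curl -s http://localhost:11434/api/generate \
  -d '{"model": "gemma3:latest", "prompt": "Return a short list of labels for a beach photo as JSON.", "stream": false}'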
Latency & Timeouts¶
Structured responses introduce additional parsing overhead. If you encounter timeouts:
- Increase the global service timeout (e.g., ServiceTimeout in advanced deployments) if needed.
- Reduce image resolution (Resolution: 500) or use smaller models.
- Keep Options.Temperature low to encourage deterministic output (see the sketch below).
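As a sketch, the Ollama entry from Step 3 could be tuned as follows, assuming Resolution and Options.Temperature are set directly on the model entry; the values are examples rather than recommendations:
  - Type: labels
    Name: gemma3:latest
    Engine: ollama
    Run: newly-indexed
    Resolution: 500
    Options:
      Temperature: 0.1
    Service:
      Uri: http://ollama:11434/api/generate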
GPU Considerations¶
When Ollama uses GPUs, long-running sessions might degrade over time due to VRAM fragmentation. Restart the Ollama container to recover performance:
docker compose down ollama
docker compose up -d ollama
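To check how much memory is in use before restarting, you can query the container directly; nvidia-smi is only available inside the container when the NVIDIA runtime from Step 1 is active:
docker compose exec ollama ollama ps
docker compose exec ollama nvidia-smi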
1. Available to all users with the next stable version, see our release notes for details.
2. Unrelated configuration details have been omitted for brevity.