
Label Generation1

PhotoPrism’s built-in classification relies on TensorFlow models such as Nasnet. With the new Ollama integration, you can generate labels via multimodal LLMs.

The Ollama integration is under active development, so the configuration, commands, and other details may change or break unexpectedly. Please keep this in mind and notify us when something doesn't work as expected. Thank you for your help in keeping this documentation updated!

Ollama Setup Guide

Follow the steps below to connect PhotoPrism directly to an Ollama instance and replace (or augment) the default Nasnet classifier with labels generated by a vision-capable LLM.

Step 1: Install Ollama

To run Ollama on the same server as PhotoPrism, add the ollama service to the services section of your compose.yaml (or docker-compose.yml) file, as shown in the example below.2

Alternatively, most of the compose.yaml configuration examples on our download server already have Ollama preconfigured, so you can start it with the following command (remove profiles: ["ollama"] from the ollama service to start it by default, without using --profile ollama):

docker compose --profile ollama up -d
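
In those preconfigured examples, the profile assignment is part of the ollama service definition, so the lines you would remove look roughly like this:

  ollama:
    image: ollama/ollama:latest
    profiles: ["ollama"]  # remove this line so the service starts without --profile ollama
    ...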

compose.yaml

services:
  photoprism:
    ## The ":preview" build gives early access to new features:
    image: photoprism/photoprism:preview
    ...

  ## Ollama Large-Language Model Runner (optional)
  ## Run "ollama pull [name]:[version]" to download a vision model
  ## listed at <https://ollama.com/search?c=vision>, for example:
  ## docker compose exec ollama ollama pull gemma3:latest
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    stop_grace_period: 15s
    ## Insecurely exposes the Ollama service on port 11434
    ## without authentication (for private networks only):
    # ports:
    #  - "11434:11434"
    environment:
      ## Ollama Configuration Options:
      OLLAMA_HOST: "0.0.0.0:11434"
      OLLAMA_MODELS: "/root/.ollama"  # model storage path (see volumes section below)
      OLLAMA_MAX_QUEUE: "100"         # maximum number of queued requests
      OLLAMA_NUM_PARALLEL: "1"        # maximum number of parallel requests
      OLLAMA_MAX_LOADED_MODELS: "1"   # maximum number of loaded models per GPU
      OLLAMA_LOAD_TIMEOUT: "5m"       # maximum time for loading models (default "5m")
      OLLAMA_KEEP_ALIVE: "5m"         # duration that models stay in memory (default "5m")
      OLLAMA_CONTEXT_LENGTH: "4096"   # maximum input context length
      OLLAMA_MULTIUSER_CACHE: "false" # optimize prompt caching for multi-user scenarios
      OLLAMA_NOPRUNE: "false"         # disables pruning of model blobs at startup
      OLLAMA_NOHISTORY: "true"        # disables readline history
      OLLAMA_FLASH_ATTENTION: "false" # enables the experimental flash attention feature
      OLLAMA_KV_CACHE_TYPE: "f16"     # cache quantization (f16, q8_0, or q4_0)
      OLLAMA_SCHED_SPREAD: "false"    # allows scheduling models across all GPUs.
      OLLAMA_NEW_ENGINE: "true"       # enables the new Ollama engine
      # OLLAMA_DEBUG: "true"            # shows additional debug information
      # OLLAMA_INTEL_GPU: "true"        # enables experimental Intel GPU detection
      ## NVIDIA GPU Hardware Acceleration (optional):
      # NVIDIA_VISIBLE_DEVICES: "all"
      # NVIDIA_DRIVER_CAPABILITIES: "compute,utility"
    volumes:
      - "./ollama:/root/.ollama"
    ## NVIDIA GPU Hardware Acceleration (optional):
    # deploy:
    #  resources:
    #    reservations:
    #      devices:
    #        - driver: "nvidia"
    #          capabilities: [ gpu ]
    #          count: "all"

Note that the NVIDIA Container Toolkit must be installed for GPU hardware acceleration to work. Experienced users may also run Ollama on a separate, more powerful server.
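
If you have enabled the GPU options above, you can check that the Ollama container actually sees your GPU by running nvidia-smi inside it (this assumes the NVIDIA Container Toolkit is installed and the device reservation is uncommented):

docker compose exec ollama nvidia-smi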

Ollama does not enforce authentication by default. Only expose port 11434 inside trusted networks or behind a reverse proxy that adds access control.

Step 2: Download Models

Pull at least one multimodal model that can return structured JSON (for example, gemma3:latest):

docker compose exec ollama ollama pull gemma3:latest

Other options validated in our tests include qwen2.5vl:3b, qwen2.5vl:3b-q8_0, and qwen2.5vl:3b-fp16. Stick to lightweight quantizations when running on CPUs and reserve the FP16 variant for GPUs.
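
For example, to pull one of the Qwen variants and then list the models available to Ollama:

docker compose exec ollama ollama pull qwen2.5vl:3b
docker compose exec ollama ollama list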

Step 3: Configure PhotoPrism

Create or update the vision.yml file in the PhotoPrism storage directory (inside the container: /photoprism/storage/config/vision.yml). Reuse your existing TensorFlow defaults but add an Ollama-based label model as shown below:

vision.yml

Models:
  - Type: labels
    Default: true             # Nasnet fallback
  - Type: nsfw
    Default: true
  - Type: face
    Default: true
  - Type: caption
    Default: true
  - Type: labels
    Name: gemma3:latest
    Engine: ollama
    Run: newly-indexed
    System: |
      You are a PhotoPrism vision model. Follow the provided schema exactly.
    Prompt: |
      Analyze the image and list the most relevant labels. Provide the response in English
      unless this prompt specifies a different language.
    Service:
      Uri: http://ollama:11434/api/generate
Thresholds:
  Confidence: 10

Prompt localization

Keep the base instructions in English and append the desired language (e.g., “Respond in German”). This mirrors our recommendation for caption prompts and tends to be more reliable than asking the model to output multiple languages simultaneously.
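
For example, a German-language variant of the label model from Step 3 could keep the English instructions and only add the target language at the end of the prompt (a sketch, adjust the wording as needed):

    Prompt: |
      Analyze the image and list the most relevant labels.
      Respond in German.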

PhotoPrism evaluates models from the bottom of the list upwards, so placing the Ollama entry after the Nasnet fallback ensures it is picked first while Nasnet remains available as a backup.

PhotoPrism records Ollama-generated labels with the ollama metadata source automatically, so you do not need to request a specific source field in the schema or pass --source to the CLI unless you want to override the default.

Scheduling Options

  • Run: newly-indexed (recommended) runs the Ollama model via the metadata worker shortly after new photos are indexed. This keeps the indexing pipeline responsive while still updating metadata within minutes.
  • Run: on-index executes during the primary indexing process. Only use this for lightweight or local models; external calls will otherwise lengthen imports noticeably.
  • Run: manual disables automatic execution so you can invoke the model explicitly via photoprism vision run -m labels (see the example after this list).
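
For example, to run the Ollama label model only on demand, you could change the Run value of the entry from Step 3 to manual (a sketch; the other fields stay as configured):

  - Type: labels
    Name: gemma3:latest
    Engine: ollama
    Run: manual
    Service:
      Uri: http://ollama:11434/api/generate

You can then trigger it explicitly with:

docker compose exec photoprism photoprism vision run -m labels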

Step 4: Restart PhotoPrism

Run the following commands to restart photoprism and apply the new settings:

docker compose stop photoprism
docker compose up -d

Test the integration with:

docker compose exec photoprism photoprism vision run -m labels --count 5 --force
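
To watch the worker process the test batch and spot errors early, follow the PhotoPrism logs in a second terminal:

docker compose logs -f photoprism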

Troubleshooting

Verify Active Configuration

docker compose exec photoprism photoprism vision ls

Ensure the output lists both Nasnet (default) and your Ollama label model. If the custom entry is missing, double-check the YAML indentation, file name (vision.yml, not .yaml), or environment overrides.
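
It can also help to print the configuration file exactly as PhotoPrism sees it inside the container, which quickly reveals mount or indentation problems:

docker compose exec photoprism cat /photoprism/storage/config/vision.yml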

Schema or JSON Errors

If PhotoPrism logs vision: invalid label payload from ollama, the model returned data that did not match the expected structure. Confirm that the adapter injected the schema instructions, i.e. keep System/Prompt intact or reuse the defaults.

PhotoPrism may fall back to the existing TensorFlow Nasnet model when the Ollama response cannot be parsed.
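
To rule out problems on the model side, you can also query the model directly from the Ollama container. This is only a quick sanity check and does not exercise PhotoPrism's schema handling:

docker compose exec ollama ollama run gemma3:latest "Return a short JSON object with a labels array."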

Latency & Timeouts

Structured responses introduce additional parsing overhead. If you encounter timeouts:

  • Increase the global service timeout (e.g., ServiceTimeout in advanced deployments) if needed.
  • Reduce image resolution (Resolution: 500) or use smaller models.
  • Keep Options.Temperature low to encourage deterministic output (see the sketch after this list).
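
Applied to the label model entry from Step 3, such tuning might look like the sketch below. The Resolution and Options fields follow the hints above, so verify the exact field names against your PhotoPrism version:

  - Type: labels
    Name: gemma3:latest
    Engine: ollama
    Run: newly-indexed
    Resolution: 500        # smaller input images reduce inference time
    Options:
      Temperature: 0.1     # assumed nesting for Options.Temperature
    Service:
      Uri: http://ollama:11434/api/generate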

GPU Considerations

When Ollama uses GPUs, long-running sessions might degrade over time due to VRAM fragmentation. Restart the Ollama container to recover performance:

docker compose down ollama
docker compose up -d ollama

  1. Available to all users with the next stable version, see our release notes for details. 

  2. Unrelated configuration details have been omitted for brevity.