Setup and Runtime Layout¶

Last updated: 2026-05-28

Filesystem Layout¶

The local AI runtime root is:

/srv/localai

Current layout:

/srv/localai/
  .cache/
  .npm/
  config/
  documents/
    inbox/
    processing/
    archive/
    failed/
  llama.cpp/
  logs/
  models/
    qwen3-coder-next-q4_k_m/
    qwen3-embedding-0.6b-q8_0/
    qwen3.5-9b-vlm-q4_k_m/
  rag/
    app/
    static/
    templates/
  venv/

The stack runs as the localai user and group. Runtime-owned files under /srv/localai are owned by localai:localai.

Repository Copies¶

The server repository keeps deployable source/config copies:

config/litellm.yaml
rag/
systemd/

The deployed runtime copies are:

/etc/localai/litellm.yaml
/srv/localai/rag
/etc/systemd/system/localai-*.service
/etc/systemd/system/localai-rag-ingest.timer

Keep the repository copies updated when changing runtime behavior. The wiki should document the deployed state, not only the plan.

Runtime Config¶

Runtime config lives under:

/etc/localai

Files:

/etc/localai/localai.env
/etc/localai/litellm.yaml

Permissions verified:

/etc/localai              root:localai  drwxr-x---
/etc/localai/localai.env  root:localai  -rw-r-----
/etc/localai/litellm.yaml root:localai  -rw-r-----

Do not print or document secret values from these files.

Environment Variables¶

The environment file defines these variable names:

LITELLM_API_KEY
LITELLM_BASE_URL
LITELLM_MASTER_KEY
LITELLM_SALT_KEY
LOCALAI_CHAT_MODEL
LOCALAI_DOCUMENT_ROOT
LOCALAI_EMBED_MODEL
LOCALAI_RAG_DATABASE_URL
LOCALAI_VISION_MODEL

Expected non-secret defaults and meanings:

Variable	Purpose
`LITELLM_API_KEY`	Bearer key used by local services when calling LiteLLM
`LITELLM_BASE_URL`	Base URL for LiteLLM, normally `http://127.0.0.1:4000/v1`
`LITELLM_MASTER_KEY`	LiteLLM master key
`LITELLM_SALT_KEY`	LiteLLM salt key
`LOCALAI_CHAT_MODEL`	Default RAG chat model, normally `local/qwen-coder`
`LOCALAI_DOCUMENT_ROOT`	Document root, normally `/srv/localai/documents`
`LOCALAI_EMBED_MODEL`	Embedding model, normally `local/embed-engineering`
`LOCALAI_RAG_DATABASE_URL`	PostgreSQL connection string
`LOCALAI_VISION_MODEL`	Vision model, normally `local/qwen-vision-fast`

LiteLLM Config Shape¶

LiteLLM is configured with three model routes:

model_list:
  - model_name: local/qwen-coder
    litellm_params:
      model: openai/local/qwen-coder
      api_base: http://127.0.0.1:8010/v1
      api_key: os.environ/LITELLM_API_KEY

  - model_name: local/qwen-vision-fast
    litellm_params:
      model: openai/local/qwen-vision-fast
      api_base: http://127.0.0.1:8011/v1
      api_key: os.environ/LITELLM_API_KEY

  - model_name: local/embed-engineering
    litellm_params:
      model: openai/local/embed-engineering
      api_base: http://127.0.0.1:8012/v1
      api_key: os.environ/LITELLM_API_KEY

litellm_settings:
  drop_params: true
  request_timeout: 600
  num_retries: 1

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY

The deployed file may use LiteLLM's exact environment interpolation syntax. Do not replace environment references with literal secret values.

SELinux Contexts¶

SELinux persistent file contexts were added for the runtime paths:

/srv/localai/documents(/.*)?              var_lib_t
/srv/localai/llama.cpp/build/bin(/.*)?    bin_t
/srv/localai/logs(/.*)?                   var_log_t
/srv/localai/models(/.*)?                 var_lib_t
/srv/localai/rag(/.*)?                    usr_t
/srv/localai/venv/bin(/.*)?               bin_t
/srv/localai/venv/lib(/.*)?               lib_t
/srv/localai/venv/lib64(/.*)?             lib_t

If files are moved into these paths manually, restore contexts:

sudo restorecon -Rv /srv/localai /etc/localai

If a service starts manually but fails under systemd, check SELinux denials before changing service logic:

sudo journalctl -t setroubleshoot --since "30 minutes ago"
sudo ausearch -m avc -ts recent

Build and Hardware Checks¶

The llama.cpp build is under:

/srv/localai/llama.cpp

Useful checks:

/srv/localai/llama.cpp/build/bin/llama-cli --list-devices

Verified Vulkan device output included:

Vulkan0: AMD Radeon 8060S Graphics (RADV STRIX_HALO)

As of 2026-05-28, Fedora reported about 62 GiB system memory. llama.cpp device enumeration reported the Radeon Vulkan device with about 97495 MiB total and about 37612 MiB free at the time of the check. Treat those values as operational snapshots, not guaranteed capacity.

Database Layout¶

The local RAG database is:

localai_rag

Required PostgreSQL extensions:

vector
pg_trgm

Primary tables:

rag_documents
rag_chunks

The schema is stored in:

rag/schema.sql
/srv/localai/rag/schema.sql