Setup and Runtime Layout¶
Last updated: 2026-05-28
Filesystem Layout¶
The local AI runtime root is:
/srv/localai
Current layout:
/srv/localai/
.cache/
.npm/
config/
documents/
inbox/
processing/
archive/
failed/
llama.cpp/
logs/
models/
qwen3-coder-next-q4_k_m/
qwen3-embedding-0.6b-q8_0/
qwen3.5-9b-vlm-q4_k_m/
rag/
app/
static/
templates/
venv/
The stack runs as the localai user and group. Runtime-owned files under /srv/localai are owned by localai:localai.
Repository Copies¶
The server repository keeps deployable source/config copies:
config/litellm.yaml
rag/
systemd/
The deployed runtime copies are:
/etc/localai/litellm.yaml
/srv/localai/rag
/etc/systemd/system/localai-*.service
/etc/systemd/system/localai-rag-ingest.timer
Keep the repository copies updated when changing runtime behavior. The wiki should document the deployed state, not only the plan.
Runtime Config¶
Runtime config lives under:
/etc/localai
Files:
/etc/localai/localai.env
/etc/localai/litellm.yaml
Permissions verified:
/etc/localai root:localai drwxr-x---
/etc/localai/localai.env root:localai -rw-r-----
/etc/localai/litellm.yaml root:localai -rw-r-----
Do not print or document secret values from these files.
Environment Variables¶
The environment file defines these variable names:
LITELLM_API_KEY
LITELLM_BASE_URL
LITELLM_MASTER_KEY
LITELLM_SALT_KEY
LOCALAI_CHAT_MODEL
LOCALAI_DOCUMENT_ROOT
LOCALAI_EMBED_MODEL
LOCALAI_RAG_DATABASE_URL
LOCALAI_VISION_MODEL
Expected non-secret defaults and meanings:
| Variable | Purpose |
|---|---|
LITELLM_API_KEY |
Bearer key used by local services when calling LiteLLM |
LITELLM_BASE_URL |
Base URL for LiteLLM, normally http://127.0.0.1:4000/v1 |
LITELLM_MASTER_KEY |
LiteLLM master key |
LITELLM_SALT_KEY |
LiteLLM salt key |
LOCALAI_CHAT_MODEL |
Default RAG chat model, normally local/qwen-coder |
LOCALAI_DOCUMENT_ROOT |
Document root, normally /srv/localai/documents |
LOCALAI_EMBED_MODEL |
Embedding model, normally local/embed-engineering |
LOCALAI_RAG_DATABASE_URL |
PostgreSQL connection string |
LOCALAI_VISION_MODEL |
Vision model, normally local/qwen-vision-fast |
LiteLLM Config Shape¶
LiteLLM is configured with three model routes:
model_list:
- model_name: local/qwen-coder
litellm_params:
model: openai/local/qwen-coder
api_base: http://127.0.0.1:8010/v1
api_key: os.environ/LITELLM_API_KEY
- model_name: local/qwen-vision-fast
litellm_params:
model: openai/local/qwen-vision-fast
api_base: http://127.0.0.1:8011/v1
api_key: os.environ/LITELLM_API_KEY
- model_name: local/embed-engineering
litellm_params:
model: openai/local/embed-engineering
api_base: http://127.0.0.1:8012/v1
api_key: os.environ/LITELLM_API_KEY
litellm_settings:
drop_params: true
request_timeout: 600
num_retries: 1
general_settings:
master_key: os.environ/LITELLM_MASTER_KEY
The deployed file may use LiteLLM's exact environment interpolation syntax. Do not replace environment references with literal secret values.
SELinux Contexts¶
SELinux persistent file contexts were added for the runtime paths:
/srv/localai/documents(/.*)? var_lib_t
/srv/localai/llama.cpp/build/bin(/.*)? bin_t
/srv/localai/logs(/.*)? var_log_t
/srv/localai/models(/.*)? var_lib_t
/srv/localai/rag(/.*)? usr_t
/srv/localai/venv/bin(/.*)? bin_t
/srv/localai/venv/lib(/.*)? lib_t
/srv/localai/venv/lib64(/.*)? lib_t
If files are moved into these paths manually, restore contexts:
sudo restorecon -Rv /srv/localai /etc/localai
If a service starts manually but fails under systemd, check SELinux denials before changing service logic:
sudo journalctl -t setroubleshoot --since "30 minutes ago"
sudo ausearch -m avc -ts recent
Build and Hardware Checks¶
The llama.cpp build is under:
/srv/localai/llama.cpp
Useful checks:
/srv/localai/llama.cpp/build/bin/llama-cli --list-devices
Verified Vulkan device output included:
Vulkan0: AMD Radeon 8060S Graphics (RADV STRIX_HALO)
As of 2026-05-28, Fedora reported about 62 GiB system memory. llama.cpp device enumeration reported the Radeon Vulkan device with about 97495 MiB total and about 37612 MiB free at the time of the check. Treat those values as operational snapshots, not guaranteed capacity.
Database Layout¶
The local RAG database is:
localai_rag
Required PostgreSQL extensions:
vector
pg_trgm
Primary tables:
rag_documents
rag_chunks
The schema is stored in:
rag/schema.sql
/srv/localai/rag/schema.sql