What Is Looma?
Looma is a local-first AI chat client where conversation behavior and context management are programmable.
Most AI applications treat conversations as fixed systems. You can choose a model, but the application decides how prompts are built, how context is managed, and how tools are used.
Looma takes a different approach: the platform provides the infrastructure (models, storage, and tools), while strategies define how conversations work.
Conversations Are Context Systems
Large language models are fundamentally context-limited systems. Every response depends on what information is placed into the prompt. This means every AI application must constantly answer questions like:
- What conversation history should be included?
- What information should be removed to fit token limits?
- When should knowledge be retrieved?
- When should the system call tools?
Most applications hide this logic inside the product.
Looma exposes it.
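The first two questions above come down to history selection under a token budget. Here is a minimal sketch of one way to answer them, assuming a crude 4-characters-per-token estimate; the names (`Message`, `estimateTokens`, `selectHistory`) are illustrative, not Looma's actual API:

```typescript
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

// Crude token estimate: roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walk backwards from the newest message, keeping as many
// recent messages as fit within the budget.
function selectHistory(history: Message[], budget: number): Message[] {
  const selected: Message[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].content);
    if (used + cost > budget) break;
    selected.unshift(history[i]);
    used += cost;
  }
  return selected;
}
```

Recency-based selection is only one policy; the point of making this logic programmable is that a strategy could just as easily rank messages by relevance instead.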
Programmable Chat Flow
In Looma, the behavior of a conversation is defined by a strategy.
A strategy is a small TypeScript module that controls how the system interacts with the model.
Strategies define the chat flow of a conversation:
- how prompts are constructed
- which context should be included
- how history is selected
- when tools are used
- when memories are written or retrieved
- anything else the conversation flow requires
Because strategies run inside an isolated engine, they can extend the system without modifying the core application.
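To make this concrete, a strategy module might look roughly like the sketch below. The interface shape here is an assumption for illustration, not Looma's real strategy API:

```typescript
// Hypothetical inputs a strategy receives for each turn.
interface StrategyContext {
  userMessage: string;
  history: { role: string; content: string }[];
}

// Hypothetical shape of a strategy module.
interface Strategy {
  name: string;
  buildPrompt(ctx: StrategyContext): string;
}

// A minimal strategy: include only the last two turns of history.
const recentTurns: Strategy = {
  name: "recent-turns",
  buildPrompt(ctx) {
    const recent = ctx.history.slice(-2);
    const lines = recent.map((m) => `${m.role}: ${m.content}`);
    lines.push(`user: ${ctx.userMessage}`);
    return lines.join("\n");
  },
};
```

Because the module only builds text from the context it is handed, it can run inside an isolated worker without access to the rest of the application.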
Local-First by Design
Looma is designed around a local-first architecture.
Conversations, files, and indexes are stored in a single local file. This keeps the system fast and private while still allowing cloud models when needed.
The platform handles the heavy infrastructure work:
- model provider integration
- token budgeting
- tool execution
- file ingestion
- retrieval-augmented generation
Strategies focus only on defining conversation behavior.
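The retrieval step behind retrieval-augmented generation, for example, typically reduces to ranking stored chunks by embedding similarity. A minimal sketch, assuming cosine similarity over precomputed embeddings; `Chunk` and `topK` are hypothetical names, not part of Looma:

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// A stored document fragment with its precomputed embedding.
interface Chunk {
  text: string;
  embedding: number[];
}

// Return the k chunks most similar to the query embedding.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```

The platform owning this machinery means a strategy only decides *when* to retrieve and what to do with the results, not how the index works.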
System Architecture
Looma is organized into three layers.
Host Platform
Handles the UI, database, model APIs, and system lifecycle.
Strategy Engine
Runs programmable strategies inside isolated worker threads.
Memory & Ingest System
Manages document ingestion, vector indexing, and semantic search.
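The separation between these layers can be sketched as narrow interfaces: a strategy sees only a small surface of the host platform and the memory system. All names below are assumptions for illustration, not Looma's real interfaces:

```typescript
// What the Host Platform exposes to a strategy (illustrative).
interface HostPlatform {
  callModel(prompt: string): string;
}

// What the Memory & Ingest System exposes (illustrative).
interface MemorySystem {
  search(query: string): string[];
}

// A strategy is a function wired to those two narrow interfaces.
type StrategyFn = (
  userMessage: string,
  host: HostPlatform,
  memory: MemorySystem
) => string;

// Example: retrieve relevant context, then call the model.
const ragStrategy: StrategyFn = (msg, host, memory) => {
  const context = memory.search(msg).join("\n");
  return host.callModel(`${context}\n\nuser: ${msg}`);
};
```

Keeping the interfaces this narrow is what lets the Strategy Engine run strategies in isolated worker threads: a strategy can influence the conversation without touching the UI, database, or model credentials directly.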