Praheri is a sovereign financial-crime platform built on a typed ontology — a living digital twin of accounts, transactions, claims and companies. A local Llama traverses it to expose fraud rings, drafts the regulatory report, and proposes governed actions a human approves — every step audited, nothing leaving the building.
The console runs on-prem by design: your data never leaves your machine.
RBI data-localization makes the cloud copilot a non-starter — and a black-box API can't be audited, fine-tuned, or trusted with a freeze decision. Praheri is the alternative: the entire intelligence layer runs on-prem on open-weight Llama, over a typed model of the institution's own world, with a human in the loop on every consequential action.
Borrowing Palantir's framing: the ontology is a semantic layer — the nouns and how they relate — and a kinetic layer — how decisions are written back as governed actions. Your data stops being rows in a table and becomes a live, connected map of reality that both analysts and AI can reason over.
Every real-world entity becomes a typed object — an Account, a Transaction, a Claim, a Company — carrying its own properties. An object type is the table; an object is the row.
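The type-versus-object distinction can be sketched with Python dataclasses. This is a minimal illustration, not Praheri's actual schema: the class names follow the entities named above, while every field name and sample value (`account_id`, `kyc_status`, `amount_inr`, and so on) is an assumption made up for the example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Account:
    """Object type: plays the role of the table definition."""
    account_id: str   # hypothetical property names throughout
    holder_name: str
    kyc_status: str


@dataclass(frozen=True)
class Transaction:
    """A second object type with its own properties."""
    txn_id: str
    amount_inr: float
    channel: str


# Objects: each instance plays the role of one row (values are made up).
acct = Account(account_id="ACC-001", holder_name="A. Rao", kyc_status="verified")
txn = Transaction(txn_id="TXN-9001", amount_inr=49_500.0, channel="UPI")

# The type carries the schema; the object carries the values.
schema = [f.name for f in fields(Account)]
print(schema)
print(acct)
```

Freezing the dataclasses is a design choice for the sketch: objects cannot be changed in place, which fits the idea that changes arrive only through governed actions.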
Relationships are first-class and typed — sent →, serviced_by →, owns →. The links are the structure, so tracing a ring six hops out is a native graph traversal, not a recursive SQL nightmare.
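To make the traversal claim concrete, here is a minimal sketch of a hop-bounded breadth-first walk over typed links. The link names `sent`, `serviced_by` and `owns` come from the text; the triple representation, the object IDs, the function names, and treating `sent` as an account-to-account link are simplifying assumptions, not the engine's real storage model.

```python
from collections import deque

# Typed links as (source_id, link_type, target_id) triples. IDs are made up.
LINKS = [
    ("CO-1", "owns", "ACC-1"),
    ("ACC-1", "sent", "ACC-2"),
    ("ACC-2", "sent", "ACC-3"),
    ("ACC-3", "sent", "ACC-4"),
    ("ACC-4", "serviced_by", "BR-7"),
]


def build_index(links):
    """Group outgoing links by source so each hop is a dictionary lookup."""
    index = {}
    for source, link_type, target in links:
        index.setdefault(source, []).append((link_type, target))
    return index


def traverse(index, start, max_hops=6, link_types=None):
    """Breadth-first walk; returns {object_id: hops from start}."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # hop budget used up on this branch
        for link_type, target in index.get(node, []):
            if link_types is not None and link_type not in link_types:
                continue  # only follow the requested link types
            if target not in seen:
                seen[target] = seen[node] + 1
                queue.append(target)
    return seen


index = build_index(LINKS)
print(traverse(index, "CO-1", max_hops=6))
print(traverse(index, "ACC-1", link_types={"sent"}))
```

Filtering by link type is what the typing buys: the same walk can follow only money movement, only ownership, or everything, without changing the traversal code.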
The model never mutates data directly. It proposes an action — freeze an account, file an STR, escalate a KYC review — which routes to a human approver and, on approval, is applied and audited. This is the closed loop: read the world → decide → act → the record updates.
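The propose, approve, apply, audit loop described above might look roughly like the following. This is a sketch under stated assumptions: `ProposedAction`, `ActionGate`, the `freeze_account` action name, and the in-memory state and log are all hypothetical stand-ins, not Praheri's real classes or storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ProposedAction:
    action_type: str        # hypothetical name, e.g. "freeze_account"
    target_id: str
    proposed_by: str        # the model, never the approver
    status: str = "pending"


class ActionGate:
    """The only code path allowed to change state; every step is logged."""

    def __init__(self, state):
        self.state = state          # object_id -> properties dict
        self.audit_log = []

    def _audit(self, event, action, actor):
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "action": action.action_type,
            "target": action.target_id,
            "actor": actor,
        })

    def propose(self, action):
        # Proposing records intent only; state is untouched.
        self._audit("proposed", action, action.proposed_by)
        return action

    def decide(self, action, approver, approve):
        if action.status != "pending":
            raise ValueError("action has already been decided")
        if not approve:
            action.status = "rejected"
            self._audit("rejected", action, approver)
            return
        if action.action_type != "freeze_account":
            raise ValueError(f"no handler for {action.action_type}")
        action.status = "approved"
        self._audit("approved", action, approver)
        self.state[action.target_id]["frozen"] = True  # the write-back
        self._audit("applied", action, approver)


state = {"ACC-1": {"frozen": False}}
gate = ActionGate(state)
action = gate.propose(ProposedAction("freeze_account", "ACC-1", "llama"))
frozen_before_approval = state["ACC-1"]["frozen"]
gate.decide(action, approver="mlro", approve=True)
print([entry["event"] for entry in gate.audit_log])
```

The point of the shape is that the model only ever holds a `ProposedAction`; the mutation lives inside `decide`, which cannot run without a named approver.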
Ontology-Augmented Generation feeds the model structured, typed, linked objects — not prose that mentions them. So the model reasons over ground-truth structure, and every claim it makes cites a real object ID. The fraud ring isn't described; it's traversed.
With conventional text retrieval, ten fuzzy case notes come back as prose. The model re-parses English to infer who connects to whom, and may invent a link that was never written or miss one that was.
With Ontology-Augmented Generation, the same ring comes back as typed objects with their actual links. The model follows the edges, sees the cycle, and names every node by ID. The structure is the evidence.
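A rough sketch of what "the structure is the evidence" can mean in code: the retrieved subgraph is serialized as structured context, the cycle is found by following edges, and any ID the model cites is checked against the retrieved objects. All function names, the ID pattern and the sample data are assumptions for illustration; the source does not describe how Praheri implements these steps.

```python
import json
import re

# A retrieved subgraph: typed objects plus typed links (made-up IDs).
OBJECTS = {
    "ACC-1": {"type": "Account"},
    "ACC-2": {"type": "Account"},
    "ACC-3": {"type": "Account"},
}
LINKS = [
    ("ACC-1", "sent", "ACC-2"),
    ("ACC-2", "sent", "ACC-3"),
    ("ACC-3", "sent", "ACC-1"),
]


def build_context(objects, links):
    """Serialize objects and links as JSON so the prompt carries structure."""
    return json.dumps({
        "objects": objects,
        "links": [{"from": s, "type": t, "to": d} for s, t, d in links],
    }, indent=2)


def find_cycle(links, link_type="sent"):
    """Depth-first search for a cycle along one link type (small graphs only)."""
    graph = {}
    for source, kind, target in links:
        if kind == link_type:
            graph.setdefault(source, []).append(target)

    def dfs(node, path):
        for nxt in graph.get(node, []):
            if nxt in path:
                return path[path.index(nxt):]  # the loop itself
            found = dfs(nxt, path + [nxt])
            if found:
                return found
        return None

    for start in graph:
        cycle = dfs(start, [start])
        if cycle:
            return cycle
    return None


def unknown_ids(answer, objects):
    """IDs cited in the model's answer that are not in the retrieved objects."""
    cited = set(re.findall(r"\b[A-Z]+-\d+\b", answer))
    return sorted(cited - set(objects))


context = build_context(OBJECTS, LINKS)
ring = find_cycle(LINKS)
print(ring)
print(unknown_ids("Funds loop ACC-1 -> ACC-2 -> ACC-9.", OBJECTS))
```

The citation check is the part that prose retrieval cannot offer: an ID that does not resolve to a retrieved object can be rejected mechanically.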
The whole stack runs on-prem on open-weight Llama. Compliance by construction, not by promise.
The model proposes; a human approves; everything is audited. No mutation without a governed action.
Swap the ontology cartridge and a new sector lights up — zero engine code changed.
This pipeline is unchanged across all verticals.
Each sector is a configuration — a set of typed objects, the signals to detect, and the governed actions a human approves. The engine underneath never changes.
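One way to picture a sector as configuration is a plain data record that a sector-agnostic function consumes. The sketch below is an assumption-heavy illustration: `SectorConfig`, the sector names, the signal names and the action names are invented for the example and are not taken from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectorConfig:
    """Everything sector-specific lives here, as data."""
    name: str
    object_types: tuple
    signals: tuple
    actions: tuple


# Two hypothetical cartridges; all names are illustrative.
BANKING = SectorConfig(
    name="banking",
    object_types=("Account", "Transaction", "Company"),
    signals=("circular_transfers",),
    actions=("freeze_account", "file_str", "escalate_kyc_review"),
)
INSURANCE = SectorConfig(
    name="insurance",
    object_types=("Claim", "Company"),
    signals=("shared_service_provider",),
    actions=("escalate_claim_review",),
)


def propose(config, action_type, target_type):
    """Engine-side check: identical code for every sector."""
    if target_type not in config.object_types:
        raise ValueError(f"{target_type} is not an object type in {config.name}")
    if action_type not in config.actions:
        raise ValueError(f"{action_type} is not a governed action in {config.name}")
    return {
        "sector": config.name,
        "action": action_type,
        "target_type": target_type,
        "status": "pending_approval",
    }


print(propose(BANKING, "freeze_account", "Account"))
print(propose(INSURANCE, "escalate_claim_review", "Claim"))
```

Note that `propose` never branches on the sector name; swapping `BANKING` for `INSURANCE` changes behavior only through the data passed in.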
The live console runs the full loop: triage an alert, traverse the ontology, expose the fraud ring, draft the report, propose a freeze, approve it as the MLRO, and read the audit trail — all on-prem, with no network egress.
Sovereign by design — the console can't be hosted on a public cloud without breaking the premise. It runs on your own machine, with no network egress.
The console is open source and runs entirely on-prem on open-weight Llama via Ollama. Clone it, point it at a local model, and the full investigation loop runs with no network calls.
# 1 · clone the repo
git clone https://github.com/surajsrivastava94/praheri-sovereign-aip.git
cd praheri-sovereign-aip

# 2 · python env + dependencies (Python 3.11+)
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 3 · serve Ollama locally, then pull the open-weight models
#     (pulling needs a running server; skip `ollama serve` if the Ollama service is already up)
ollama serve &
ollama pull llama3.1:8b
ollama pull nomic-embed-text

# 4 · generate the synthetic bank + planted fraud rings
python -m praheri.generate
python -m praheri.generate_verticals

# 5 · launch the console → http://localhost:8501
streamlit run app/streamlit_app.py
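Before launching the console, it can help to confirm that the local model answers. The sketch below talks to Ollama's local HTTP endpoint (`/api/generate` on the default port 11434) using only the standard library. The helper names are hypothetical, and this is not necessarily how the console itself calls the model.

```python
import json
import urllib.request

# Ollama's default local endpoint; the request never leaves localhost.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.1:8b"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="llama3.1:8b", timeout=120):
    """Send the prompt and return the generated text (needs Ollama running)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running and the model pulled, calling `ask_local_model("Reply with the single word: ready")` should return a short completion; a connection error usually means the Ollama server is not up.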