A sovereign financial-crime platform: a local Llama reasons over a typed ontology of your accounts, claims and companies to expose fraud rings and draft the regulatory report — every action human-approved, nothing leaving the building. We want one design partner to harden it — we think that's Jio Financial Services.
A hosted walkthrough on synthetic data — feel the real clicks. The production console runs air-gapped on your own hardware, zero egress. Run it locally →
RBI data-localization rules out a cloud-hosted copilot for this data, and a black-box API can't be audited, fine-tuned, or trusted with a freeze decision. The alternative is a model of the institution's own world that reasons in-building, sovereign and capable at once.
So Praheri runs the entire intelligence layer on-prem, on open-weight Llama, with a human on every consequential action.
Your data stops being rows in a table and becomes a live, connected map of reality that analysts and AI reason over together.
Borrowing Palantir's framing, the ontology has two layers: a semantic layer (the nouns and how they relate) and a kinetic layer (how decisions are written back as governed actions).
Every entity is a typed object — Account, Transaction, Claim, Company — with its own properties.
Relationships are first-class and typed — sent →, owns →. Tracing a ring six hops out is a native traversal, not a recursive SQL nightmare.
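To make the idea of typed objects and a native traversal concrete, here is a minimal sketch in plain Python. The class names, field names, IDs and the `within_hops` helper are illustrative assumptions on synthetic data, not Praheri's actual schema or API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    id: str
    holder: str


@dataclass(frozen=True)
class Company:
    id: str
    name: str


@dataclass(frozen=True)
class Link:
    src: str   # object ID the edge starts from
    kind: str  # typed relationship, e.g. "sent" or "owns"
    dst: str   # object ID the edge points to


def within_hops(links, start, max_hops):
    """Breadth-first traversal: map each reachable object ID to its hop distance."""
    outgoing = {}
    for link in links:
        outgoing.setdefault(link.src, []).append(link.dst)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in outgoing.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Synthetic objects and links, invented for this example.
objects = {o.id: o for o in [
    Account("ACC-1", "holder-1"), Account("ACC-2", "holder-2"),
    Account("ACC-3", "holder-3"), Account("ACC-4", "holder-4"),
    Company("CO-1", "shell-co-1"),
]}
links = [
    Link("ACC-1", "sent", "ACC-2"),
    Link("ACC-2", "sent", "ACC-3"),
    Link("CO-1", "owns", "ACC-3"),
    Link("ACC-3", "sent", "ACC-4"),
]

reach = within_hops(links, "ACC-1", max_hops=2)
types = {oid: type(objects[oid]).__name__ for oid in reach}
```

Raising `max_hops` to six is the same call with a different argument, which is the point of treating the traversal as a native operation rather than a recursive query.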
The model never mutates data. It proposes — freeze, file an STR, escalate KYC — a human approves, and it's applied and audited.
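The propose, approve, apply and audit loop can be sketched as a small state machine. Everything named here (`ProposedAction`, `Ledger`, the action kinds, the `mlro-01` approver) is a hypothetical illustration of the pattern, not the platform's real interface.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProposedAction:
    kind: str        # hypothetical kinds: "FREEZE", "FILE_STR", "ESCALATE_KYC"
    target_id: str   # ID of the object the action would touch
    rationale: str
    status: str = "PROPOSED"


@dataclass
class Ledger:
    frozen: set = field(default_factory=set)
    audit: list = field(default_factory=list)

    def _log(self, event, action, actor):
        # Every step is appended to the audit trail with a UTC timestamp.
        self.audit.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "kind": action.kind,
            "target": action.target_id,
            "actor": actor,
        })

    def propose(self, action, actor="model"):
        # The model can only record a proposal; no state changes here.
        self._log("PROPOSED", action, actor)
        return action

    def approve(self, action, approver):
        # State changes only on this path, which requires a human approver.
        if action.status != "PROPOSED":
            raise ValueError("only a pending proposal can be approved")
        action.status = "APPROVED"
        self._log("APPROVED", action, approver)
        if action.kind == "FREEZE":
            self.frozen.add(action.target_id)
        self._log("APPLIED", action, approver)
        return action


ledger = Ledger()
action = ledger.propose(ProposedAction("FREEZE", "ACC-2", "hub of a circular flow"))
frozen_before_approval = set(ledger.frozen)
ledger.approve(action, approver="mlro-01")
events = [entry["event"] for entry in ledger.audit]
```

The design choice worth noting is that `propose` has no code path that mutates `frozen`; the only mutation lives inside `approve`.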
Ontology-Augmented Generation feeds the model typed, linked objects — not prose that mentions them — so every claim cites a real object ID. The ring isn't described; it's traversed.
Case notes come back as prose. The model re-parses English to infer who connects to whom — and may invent a link, or miss one.
The same ring comes back as typed objects with real links. The model follows the edges, sees the cycle, names every node by ID. The structure is the evidence.
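One way to picture Ontology-Augmented Generation: hand the model a structured payload instead of prose, then check that every ID in its answer exists in that payload. The sketch below assumes a made-up ID scheme (`ACC-`, `TXN-`, `CLM-`, `CO-` prefixes) and hypothetical helper names; the model's answer is a hard-coded string so the example runs without a model.

```python
import json
import re

# Hypothetical ID scheme, assumed for this example only.
ID_PATTERN = re.compile(r"\b(?:ACC|TXN|CLM|CO)-\d+\b")


def build_context(objects, links):
    """Serialize typed objects and typed links as JSON for the prompt."""
    return json.dumps({"objects": objects, "links": links}, indent=2)


def unknown_citations(answer, objects):
    """Return IDs the answer cites that are not in the supplied objects."""
    known = {obj["id"] for obj in objects}
    cited = set(ID_PATTERN.findall(answer))
    return sorted(cited - known)


objects = [
    {"id": "ACC-1", "type": "Account"},
    {"id": "ACC-2", "type": "Account"},
    {"id": "ACC-3", "type": "Account"},
]
links = [
    {"src": "ACC-1", "kind": "sent", "dst": "ACC-2"},
    {"src": "ACC-2", "kind": "sent", "dst": "ACC-3"},
    {"src": "ACC-3", "kind": "sent", "dst": "ACC-1"},
]

context = build_context(objects, links)

# Stand-ins for model output: one grounded answer, one with an invented link.
grounded = "Funds cycle ACC-1 to ACC-2 to ACC-3 and back to ACC-1."
invented = "Funds cycle ACC-1 to ACC-2 and on to ACC-9."

grounded_check = unknown_citations(grounded, objects)
invented_check = unknown_citations(invented, objects)
```

A non-empty result from `unknown_citations` is a cheap, deterministic signal that the narrative mentions something the ontology never supplied.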
In BFSI the pain, the regulation and the data-shape all point at the same architecture. Three forces just converged:
Laundering, staged-claim rings and shell ownership are patterns of connection, not properties of a single row. A relationship-native ontology is the right data model; flat tables aren't.
RBI data-localization and the DPDP Act rule out a frontier-cloud copilot for this data. The only compliant path is on-prem, open-weight, auditable — and that constraint is our moat.
Llama-class models now classify a typology and draft a filing-grade narrative on a single on-prem box. What used to need a frontier API runs inside the building — sovereign and capable, no longer a trade-off.
Graph algorithms and rules find the ring better than any LLM, so they do the detection, in deterministic Python. Llama does the two jobs that graph models and traditional fraud models structurally can't.
Ring traversal, structuring / circular-flow detection and the risk floor are pure code — reproducible, auditable, no hallucination. A fired signal is a confirmed typology the model can escalate but never downgrade to CLEAR.
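A minimal sketch of the two ideas in that paragraph: circular-flow detection as plain graph code, and a risk floor the model can raise but not lower. The severity scale, function names and edge data are assumptions for illustration; the real detectors are presumably richer.

```python
# Hypothetical severity scale; only the ordering matters for this sketch.
SEVERITY = {"CLEAR": 0, "REVIEW": 1, "SUSPICIOUS": 2, "CRITICAL": 3}


def find_cycle(edges, start):
    """Depth-first search for one directed cycle through `start`, or None."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, []):
            if nxt == start:
                return path + [start]  # closed the loop back to the origin
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None


def apply_risk_floor(rule_level, model_level):
    """Keep whichever level is more severe, so the model cannot downgrade."""
    return max(rule_level, model_level, key=SEVERITY.__getitem__)


# Synthetic money-movement edges (sender, receiver).
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]

ring = find_cycle(edges, "A")
rule_level = "SUSPICIOUS" if ring else "CLEAR"

# The model tries to clear the case; the floor holds.
after_downgrade_attempt = apply_risk_floor(rule_level, "CLEAR")
# The model escalates; that is allowed.
after_escalation = apply_risk_floor(rule_level, "CRITICAL")
```

Because both functions are deterministic, the same edges always produce the same ring and the same floor, which is what makes the result reproducible in an audit.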
1 · It writes the report a human can sign. Graph output is a diagram and a score; the officer must file plain English. (A filing-grade STR for FIU-IND, citing real IDs.)
2 · One brain works for every department. Swap the settings file, not the model — no per-sector retraining. (Zero per-domain feature engineering or labelled set.)
A graph model outputs a diagram and a risk score, but the compliance officer has to file a written report that names each account and explains, in words, why it's suspicious — grounded in the bank's own rules. That's what an LLM does and a graph model can't.
And a traditional fraud model must be custom-built and re-trained per area — one for banking, another for insurance — each with its own data and team. The same Llama handles all of them by swapping the config.
The whole stack runs on-prem on open-weight Llama. Compliance by construction, not by promise.
The model proposes; a human approves; everything is audited. No mutation without a governed action.
Swap the ontology cartridge and a new sector lights up — zero engine code changed.
This pipeline is unchanged across all verticals.
Each sector is a configuration — typed objects, the signals to detect, the governed actions a human approves. The engine never changes. AML and Insurance are where we'd start with JFS — they sit directly on Jio Payments Bank, Jio Finance and Jio Insurance Broking. The other four prove the same engine already reaches across the group.
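The cartridge idea can be sketched as data handed to a single unchanged function. The `Cartridge` shape, the signal names, the thresholds and the action names below are all invented for illustration and should not be read as Praheri's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cartridge:
    sector: str
    object_types: tuple  # typed objects this sector defines
    signals: dict        # signal name -> detector function over records
    actions: tuple       # governed actions a human may approve


def run_engine(cartridge, records):
    """Sector-agnostic loop: run each configured detector, then propose actions."""
    fired = [name for name, detect in cartridge.signals.items() if detect(records)]
    proposals = list(cartridge.actions) if fired else []
    return {"sector": cartridge.sector, "signals": fired, "proposed_actions": proposals}


# Hypothetical AML cartridge: three or more transfers just under an assumed limit.
aml = Cartridge(
    sector="aml",
    object_types=("Account", "Transaction"),
    signals={
        "structuring": lambda recs: sum(
            1 for r in recs if 45_000 <= r["amount"] < 50_000
        ) >= 3,
    },
    actions=("FREEZE", "FILE_STR"),
)

# Hypothetical insurance cartridge: three or more claims routed to one garage.
insurance = Cartridge(
    sector="insurance",
    object_types=("Claim", "Garage"),
    signals={
        "shared_garage": lambda recs: len(recs) >= 3
        and len({r["garage"] for r in recs}) == 1,
    },
    actions=("HOLD_PAYOUT", "REFER_SIU"),
)

aml_result = run_engine(aml, [{"amount": 49_000}, {"amount": 48_500}, {"amount": 49_900}])
ins_result = run_engine(insurance, [{"garage": "G1"}, {"garage": "G2"}, {"garage": "G1"}])
```

Note that `run_engine` contains no sector-specific branch; adding a vertical in this sketch means writing another `Cartridge`, not touching the function.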
JFS spans every sector this engine already runs — payments, lending, insurance, asset management — at Reliance scale, greenfield enough to build compliance-native. We're proposing a design partnership to harden Phase 1 on your real workflows.
Retail money movement at Reliance scale is the natural home of mule networks and structuring. → Praheri AML: expose the ring, draft the STR for FIU-IND, propose the freeze — human-approved, audited.
Claims exposure is where organized fraud erodes the loss ratio. → Praheri Insurance SIU: surface staged-accident and garage-collusion rings across otherwise-unrelated claims before payout.
As the group expands, the same engine extends to wealth suitability, corporate UBO resolution and procurement controls — zero engine code per sector. One platform, the whole JFS surface.
Deploy on-prem against a masked slice of Jio Payments Bank transactions. Prove the ring-to-STR loop on real workflow shape. Success metric agreed up front. No data leaves JFS.
Iterate the two Phase-1 cartridges on live JFS ontology and MLRO/SIU feedback. Wire the real approval + audit workflow. This is where we co-build — your analysts shape the product.
Extend the same engine to Jio Finance lending EWS, JioBlackRock suitability, group UBO and procurement — each a config cartridge, not a new build. One sovereign intelligence layer for all of JFS.
The demo console runs the full loop — triage, traverse, expose the ring, draft the report, propose a freeze, approve as MLRO, read the audit trail — right here in your browser, on synthetic data.
This is a hosted walkthrough on synthetic data. The production console is sovereign by design — it runs air-gapped on your own hardware, with no network egress. Run the real one locally →
Open source, on-prem, open-weight Llama via Ollama. Clone it, point at a local model, and the full loop runs with no network calls.
# 1 · clone the repo
git clone https://github.com/surajsrivastava94/praheri-sovereign-aip.git
cd praheri-sovereign-aip

# 2 · python env + dependencies (Python 3.11+)
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 3 · pull the open-weight models, then serve them locally
ollama pull llama3.1:8b
ollama pull nomic-embed-text
ollama serve

# 4 · generate the synthetic bank + planted fraud rings
python -m praheri.generate
python -m praheri.generate_verticals

# 5 · launch the console → http://localhost:8501
streamlit run app/streamlit_app.py
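Before launching the console, it can help to confirm the local model answers on localhost. The sketch below uses only the Python standard library and Ollama's default local endpoint (`http://localhost:11434/api/generate`); the helper names are hypothetical, and this is not necessarily how the Praheri code itself talks to Ollama.

```python
import json
import urllib.request

# Ollama's default local address; adjust if your install uses another port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.1:8b"):
    """Build a non-streaming generate request aimed at the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.1:8b"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Building the request needs no running server, so it is safe to inspect first.
request = build_request("Reply with the single word: ready")
```

With `ollama serve` running, calling `generate("Reply with the single word: ready")` should return text from the local model; the only address involved is localhost.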