Skip to content

Claw chat (RAG)

Claw is the chat interface on the Claw page. Ask a free-form question — in English or Chinese — and Claw answers by retrieving your own past signals and synthesising them with DeepSeek through its OpenAI-compatible API.

Current retrieval state (2026-04-22)

The hybrid pipeline below is half-live: the BM25 / Postgres side works and is what Claw currently uses to answer questions. The dense / ChromaDB side is designed and wired up to the retrieval code, but embedding_worker.py is still a TODO: Implement stub — so no signal is actually embedded yet, and dense retrieval returns an empty list on every query. RRF fusion therefore degrades to BM25-only for now. Finishing the embedder is tracked as follow-up work.

The retrieval pipeline

Your question
    ├────── dense search ──────► ChromaDB  (planned; embedder stub)
    │       paraphrase-multilingual embeddings
    └────── BM25 keyword search ──► Postgres tsvector
            query tokenised → "term1 | term2 | …" (OR-match)
            over summary_en + summary_zh + tickers
              Reciprocal Rank Fusion (RRF, k=60)
         top-6 fused signals, RLS-filtered to yours
        prompt assembly → DeepSeek LLM → answer
               Row written to rag_query_logs

BM25 tokeniser

User queries come in as natural-language sentences ("Should I buy NVDA right now?"). Postgres's websearch_to_tsquery defaults to AND-matching every word, which almost never hits anything. We pre-tokenise in rag/bm25_search._build_or_tsquery:

  1. Extract \w+ tokens, lowercase.
  2. Drop single-character tokens and a small English stopword set.
  3. Dedup while preserving order.
  4. Join with | and hand to to_tsquery.

"Should I buy NVDA right now?" → "buy | nvda" → matches any signal mentioning either word, ranked by ts_rank. The LLM then picks the relevant subset from the top-6.

What gets retrieved

  • Only your own signals. Both dense and lexical paths enforce user isolation — Chroma via a where={"user_id": ...} metadata filter, Postgres via the same RLS policy that protects the dashboard. You cannot accidentally query another user's history.
  • Fused, not concatenated. Dense retrieval catches semantic paraphrases (a Chinese question finding an English signal, or "supply chain" matching "downstream manufacturers"). BM25 catches exact tickers and names that dense embeddings can miss. RRF merges the two ranked lists with no need to calibrate their very different score scales.
  • Bilingual. The embedder (paraphrase-multilingual-MiniLM-L12-v2, 384-dim) puts English and Chinese summaries into the same semantic space, so a query in either language retrieves both languages' content.

What the answer looks like

Claw is prompted to:

  • Cite facts it uses with bracketed numbers [1], [2] keyed to the retrieved signals.
  • Reply in the same language as the question.
  • Say "the signals don't support that" when the retrieval didn't yield anything relevant — no hallucinated specifics.
  • Stay under ~250 words unless the question requires a list.

Every call is logged to rag_query_logs with:

  • The query text and the answer
  • The list of retrieved signal ids
  • Top-1 cosine similarity (useful as a retrieval-quality proxy — below 0.30 usually means the question was out of scope)
  • Model name and latency

When Claw is most useful

  • "Why did NVDA drop today?" — pulls your recent NVDA-tagged signals and synthesises a why-narrative.
  • "Summarise this week's FLASH signals" — retrieves all FLASH-tier signals in the window and returns a roll-up.
  • "What's the Chinese semiconductor policy situation?" — exactly the kind of multi-signal, multi-agent aggregation that a single card can't convey.
  • "最近 AI 芯片需求方面有什么信号?" — full Chinese round-trip works.

When it's not

  • Current price / quote. Claw reads signals, not live market data. Use the dashboard's index bar or stock card.
  • Cross-user aggregation. RLS hides everyone else's signals from your session — Claw has no "global" view.
  • Advice. Claw refuses to pretend to be a licensed advisor.

See also

  • Scoring — understand the retrieval prior
  • Agents — what populates the knowledge base