RAG & Knowledge Ingestion

Srasta turns controlled knowledge into governed context.

Enterprise retrieval is not just “upload documents and search.” Srasta ingests approved sources into tenant-scoped collections, builds dense and sparse indexes, retrieves context through policy-aware APIs, and keeps document, vector, prompt, response, and audit data inside the customer perimeter.

Ingestion Path

Knowledge enters Srasta through an explicit pipeline.

The ingestion scripts read files on disk, filter eligible content, chunk text, generate embeddings, fit sparse retrieval parameters, tag tenant scope, create indexes, and insert batches into Milvus.

[Figure: Srasta RAG knowledge ingestion pipeline. Customer-controlled knowledge plane: ingest, index, retrieve, govern, audit. Stages in order: approved sources, filter + chunk, embed + BM25, Milvus collection, hybrid retrieve, scoped context. Annotations: ingest auth (token + tenant check); dense + sparse indexing (HNSW, inverted index); request controls (RBAC, persona, audit).]

What Ingest Does

Documents become searchable evidence, not uncontrolled prompt stuffing.

Srasta’s default ingestion path is designed for repeatable enterprise operation. Operators can fully refresh a collection or run incremental updates based on git changes, while keeping tenant tagging and collection naming explicit.

File filtering

Eligible source types include code, markdown, YAML, JSON, Terraform, shell, text, Dart, TypeScript, TOML, HCL, and Nomad files.

Safe exclusions

Common generated or dependency directories are skipped, and credential-named files are excluded from ingest.
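The filtering and exclusion rules above can be sketched roughly as follows. This is a minimal illustration, not Srasta's ingest script: the extension set, the skipped directory names, and the credential name hints are all assumed values chosen for the example.

```python
from pathlib import PurePosixPath

# Hypothetical values for illustration; the real lists live in the ingest scripts.
ELIGIBLE_SUFFIXES = {".py", ".md", ".yaml", ".yml", ".json", ".tf", ".sh",
                     ".txt", ".dart", ".ts", ".toml", ".hcl", ".nomad"}
SKIPPED_DIRS = {".git", "node_modules", "dist", "build", "__pycache__"}
CREDENTIAL_HINTS = ("secret", "credential", "password", ".env", "id_rsa")


def is_eligible(path: str) -> bool:
    """Return True if a file should be ingested under the assumed rules."""
    p = PurePosixPath(path)
    # Skip anything that sits under a generated or dependency directory.
    if any(part in SKIPPED_DIRS for part in p.parts[:-1]):
        return False
    # Skip files whose name suggests they hold credentials.
    name = p.name.lower()
    if any(hint in name for hint in CREDENTIAL_HINTS):
        return False
    # Finally, accept only known source types.
    return p.suffix.lower() in ELIGIBLE_SUFFIXES
```

Checking exclusions before the extension test means a credential-named YAML or JSON file is rejected even though its type is otherwise eligible.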

Chunking

Text is chunked by configurable size and overlap, with stable chunk identifiers derived from file path and chunk index.
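As a rough sketch of this step, the function below splits text into overlapping character windows and derives each chunk ID from the file path and chunk index only. Character-based windows, the default sizes, and the truncated SHA-256 ID scheme are assumptions made for the example; the source does not specify them.

```python
import hashlib


def chunk_text(path: str, text: str, size: int = 1000, overlap: int = 200):
    """Split text into overlapping windows and attach stable chunk IDs."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        body = text[start:start + size]
        # The ID depends only on path and chunk index, so re-ingesting the
        # same file produces the same IDs (assumed scheme: truncated SHA-256).
        chunk_id = hashlib.sha256(f"{path}:{index}".encode()).hexdigest()[:16]
        chunks.append({"chunk_id": chunk_id, "source": path, "text": body})
        # Stop once a window reaches the end, avoiding a redundant tail chunk.
        if start + size >= len(text):
            break
    return chunks
```

Because the IDs do not depend on chunk content, an updated file maps onto the same IDs, which makes it straightforward to delete or overwrite stale chunks for a given path.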

Embedding

Dense vectors are generated through the configured embedding service. Ollama is the default, and the RAG API runtime also supports TEI.

Sparse search

BM25 sparse vectors are fit for each corpus and saved for query-time use, preserving keyword precision for codes, IDs, and names.
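To illustrate what "fitting" BM25 to a corpus means, here is a minimal pure-Python sketch. It is not the library Srasta uses: the tokenizer, the `k1` and `b` defaults, and the class and function names are assumptions. Fitting learns the vocabulary, inverse document frequencies, and average document length; those are the parameters that would be saved and reloaded at query time.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    # Keep underscores so identifiers such as config keys stay whole tokens.
    return re.findall(r"[a-z0-9_]+", text.lower())


class BM25:
    """Minimal BM25 that emits sparse vectors as {term_id: weight} dicts."""

    def __init__(self, k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        self.vocab, self.idf, self.avgdl = {}, {}, 0.0

    def fit(self, corpus):
        docs = [tokenize(d) for d in corpus]
        # Document frequency: number of documents containing each term.
        df = Counter(term for doc in docs for term in set(doc))
        self.vocab = {term: i for i, term in enumerate(sorted(df))}
        n = len(docs)
        self.avgdl = sum(len(d) for d in docs) / max(n, 1)
        self.idf = {self.vocab[t]: math.log(1 + (n - c + 0.5) / (c + 0.5))
                    for t, c in df.items()}
        return self

    def encode_document(self, text: str) -> dict:
        tokens = tokenize(text)
        # Length normalization relative to the fitted average length.
        norm = self.k1 * (1 - self.b + self.b * len(tokens) / (self.avgdl or 1.0))
        return {self.vocab[t]: tf * (self.k1 + 1) / (tf + norm)
                for t, tf in Counter(tokens).items() if t in self.vocab}

    def encode_query(self, text: str) -> dict:
        return {self.vocab[t]: self.idf[self.vocab[t]]
                for t in set(tokenize(text)) if t in self.vocab}


def sparse_dot(a: dict, b: dict) -> float:
    """BM25 score as the dot product of a query vector and a document vector."""
    return sum(w * b[i] for i, w in a.items() if i in b)
```

With this tokenizer, a query for `payment_gateway` scores only documents that contain that exact identifier, and scores zero against a document that mentions "payment" and "gateway" as separate words. That is the keyword precision the sparse side is meant to preserve.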

Indexing

Milvus stores dense vectors, sparse vectors, source path, chunk ID, document text, and tenant ID, and builds indexes on the dense vector, sparse vector, and tenant ID fields.

Retrieval Path

Hybrid retrieval balances meaning and exactness.

Dense semantic retrieval

  • Finds conceptually similar content even when words differ.
  • Works well for policy, architecture, and natural-language questions.
  • Backed by vector embeddings stored in Milvus.

BM25 sparse retrieval

  • Preserves exact matches for identifiers, filenames, services, and config keys.
  • Reduces failures where semantic similarity misses literal tokens.
  • Combined with dense search through weighted ranking.

Srasta’s RAG API uses a hybrid search strategy, with default top-k controls and support for optional reranking and context compression before model inference.
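The weighted combination of dense and sparse results can be sketched as below. This is an illustrative standalone version: min-max normalization, the 0.7/0.3 weights, and the function names are assumptions, and the actual ranking in a deployment may normalize and weight scores differently.

```python
def min_max(scores: dict) -> dict:
    """Rescale scores to [0, 1] so dense and sparse scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 1.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def weighted_fusion(dense: dict, sparse: dict,
                    w_dense: float = 0.7, w_sparse: float = 0.3,
                    top_k: int = 5) -> list:
    """Fuse per-chunk scores from both retrievers and return the top_k."""
    d, s = min_max(dense), min_max(sparse)
    # A chunk missing from one retriever contributes 0 from that side.
    fused = {cid: w_dense * d.get(cid, 0.0) + w_sparse * s.get(cid, 0.0)
             for cid in d.keys() | s.keys()}
    # Sort by score descending, then by chunk ID for a deterministic order.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Normalizing first matters because raw BM25 scores and vector similarities live on different scales; without it, one retriever would dominate regardless of the weights.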

Scoping Controls

Retrieval scope is selected deliberately.

A single deployment can carry multiple collections. Operators and clients can scope retrieval by header, model prefix, path-based URL, explicit environment configuration, or a tenant-aware Helm release strategy.

Header scoping

Programmatic clients can use request headers to choose one or more collections.

Model prefix

IDE and OpenAI-compatible clients can prefix the model name with a collection.

Path routing

Gateway paths can map URLs to collection-scoped RAG access.
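The three request-level mechanisms above (header, model prefix, path) could be resolved along these lines. Everything specific here is hypothetical and chosen only for illustration: the `X-Collections` header name, the `collection/model` prefix separator, the `/rag/<collection>/...` path layout, and the precedence order are not taken from Srasta's API.

```python
def resolve_scope(headers: dict, model: str, path: str):
    """Return (collections, model_name).

    Assumed precedence: header, then model prefix, then URL path.
    An empty list means no explicit scope was requested.
    """
    # Hypothetical header carrying a comma-separated collection list.
    # Note: this plain dict lookup is case-sensitive.
    header = headers.get("X-Collections", "")
    from_header = [c.strip() for c in header.split(",") if c.strip()]

    # Hypothetical "collection/model" convention for OpenAI-compatible clients.
    prefix, sep, bare_model = model.partition("/")
    if not sep:
        prefix, bare_model = "", model

    # Hypothetical gateway layout: /rag/<collection>/...
    parts = [p for p in path.split("/") if p]
    from_path = parts[1] if len(parts) >= 2 and parts[0] == "rag" else ""

    if from_header:
        return from_header, bare_model
    if prefix:
        return [prefix], bare_model
    if from_path:
        return [from_path], bare_model
    return [], bare_model
```

Stripping the prefix before forwarding the request means the downstream model route sees the plain model name regardless of how the client selected its collection.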

Tenant-aware collections

Production values support release- or namespace-scoped Milvus collections for tenant isolation.
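One way to derive a scoped collection name from a namespace and release is sketched below. The naming pattern and the `knowledge` base name are assumptions for illustration; the sanitizing step reflects Milvus's rule that collection names use only letters, digits, and underscores and start with a letter or underscore.

```python
import re


def collection_name(namespace: str, release: str, base: str = "knowledge") -> str:
    """Build a Milvus-safe collection name scoped to namespace and release."""
    raw = f"{namespace}_{release}_{base}"
    # Kubernetes names commonly contain hyphens, which Milvus does not accept.
    name = re.sub(r"[^A-Za-z0-9_]", "_", raw)
    # Names must start with a letter or underscore.
    if not re.match(r"[A-Za-z_]", name):
        name = "_" + name
    return name[:255]
```

For example, `collection_name("tenant-a", "srasta-prod")` yields `tenant_a_srasta_prod_knowledge`. Note that this mapping is lossy: `tenant-a` and `tenant_a` would collide, so a real scheme would need to rule that out.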

Governance

Knowledge access follows the same control model as inference.

Authentication

Ingest can validate tokens against Srasta API before writing vectors.

Authorization

OIDC and role enforcement can require tenant match and role permissions.

Policy scanning

Compliance profiles can scan input and output for sensitive patterns.

Audit trail

Requests are recorded with actor, path, outcome, persona, session, tenant, roles, and timing.

Observability

Prometheus metrics and Langfuse traces help operators understand retrieval and generation behavior.

Customer perimeter

Documents, embeddings, responses, audit records, and backups stay inside customer-controlled infrastructure by default.
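As an illustration of how the authorization check and the audit trail fit together, the sketch below evaluates a tenant and role match and emits one audit record per request. The record fields follow the list above; the claim names (`sub`, `tenant`, `roles`, `persona`, `session`), the `rag:query` role, and the function names are hypothetical.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass
class AuditRecord:
    actor: str
    path: str
    outcome: str
    persona: str
    session: str
    tenant: str
    roles: list
    duration_ms: float


def authorize(claims: dict, tenant: str, required_role: str) -> bool:
    """Allow only when the token's tenant matches and the role is present."""
    return claims.get("tenant") == tenant and required_role in claims.get("roles", [])


def handle(claims: dict, tenant: str, path: str, required_role: str = "rag:query"):
    """Return (allowed, audit_line); a record is written for denials too."""
    start = time.perf_counter()
    allowed = authorize(claims, tenant, required_role)
    record = AuditRecord(
        actor=claims.get("sub", "unknown"),
        path=path,
        outcome="allowed" if allowed else "denied",
        persona=claims.get("persona", ""),
        session=claims.get("session", ""),
        tenant=tenant,
        roles=list(claims.get("roles", [])),
        duration_ms=(time.perf_counter() - start) * 1000,
    )
    return allowed, json.dumps(asdict(record), sort_keys=True)
```

Recording denied requests alongside allowed ones is a deliberate choice in this sketch, since failed access attempts are often the entries an auditor most wants to see.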

Operations

Operators need both full refresh and incremental paths.

Full replacement is useful for clean baselines. Incremental ingest is useful when a git-backed corpus changes frequently and operators want to delete stale chunks and insert only changed content.

Full refresh

Drops and recreates a collection, rebuilds dense and sparse indexes, and inserts a fresh corpus snapshot.

Incremental ingest

Uses git diff state to identify added, modified, deleted, and renamed files, then updates affected chunks.
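A minimal sketch of this planning step, assuming the change list comes from `git diff --name-status` (tab-separated status and paths, with renames reported as `R<score>`, old path, new path). The function name and the delete-then-reinsert strategy for modified files are assumptions, not a description of Srasta's script.

```python
def plan_incremental(name_status: str):
    """Turn `git diff --name-status` output into (paths_to_delete, paths_to_ingest).

    paths_to_delete: files whose existing chunks should be removed.
    paths_to_ingest: files that should be chunked, embedded, and inserted.
    """
    delete, ingest = set(), set()
    for line in name_status.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        status = fields[0][:1]  # "R100" -> "R"
        if status == "A":
            ingest.add(fields[1])
        elif status == "M":
            # Modified: drop stale chunks, then re-ingest the new content.
            delete.add(fields[1])
            ingest.add(fields[1])
        elif status == "D":
            delete.add(fields[1])
        elif status == "R":
            # Renamed: chunks are keyed by path, so the old path is removed.
            delete.add(fields[1])
            ingest.add(fields[2])
    return sorted(delete), sorted(ingest)
```

The input could be produced with something like `git diff --name-status <last_ingested_commit> HEAD`, where the last ingested commit is state the operator tracks between runs.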

Auto-discovery

When explicit collection lists are not set, the RAG API can discover available Milvus collections at query time.

Day-2 tuning

Operators can tune top-k, embedding host, model route, compression threshold, reranker, and policy profiles.

FAQ

RAG and Knowledge Ingestion FAQ

Where are ingested documents and vectors stored?

Ingested source files remain in the customer-controlled environment, while chunks and vectors are stored in Milvus inside the deployment. Optional object storage such as MinIO is used for platform storage and session persistence when configured.

How does Srasta retrieve knowledge?

Srasta combines dense semantic search with BM25 sparse keyword search in Milvus, then ranks results with a weighted hybrid strategy before injecting scoped context into the inference request.

How are tenants separated?

Ingest supports tenant IDs, the RAG API enforces tenant and role checks when OIDC authorization is enabled, and SaaS-style Helm values use namespace- or release-scoped Milvus collection naming for isolation.

Can operators choose which knowledge source to search?

Yes. Collections can be scoped by request header, model prefix, path-based routing, or deployment configuration. If no explicit list is set, Srasta can auto-discover available Milvus collections.

Next

Use retrieval as governed context, then measure the outcome.

Knowledge ingestion gives Srasta a controlled retrieval base. Measure Loop shows which documents, prompts, routes, policies, and workflows actually improve enterprise decisions.
