# Knowledge Base (RAG) (Reference: https://docs.iqra.bot/build/knowledge/rag)
The **Knowledge Base** is your agent's long-term memory. It allows the AI to "read" your proprietary documents (PDFs, Manuals, Policies) and answer questions based *only* on that data.
Iqra AI uses a **RAG (Retrieval-Augmented Generation)** pipeline. We do not retrain the model; we inject relevant information into the context window in real time.
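To make "injecting information into the context window" concrete, here is a minimal sketch of how retrieved chunks might be placed ahead of the user's question. The function name `build_prompt` and the prompt wording are illustrative assumptions, not Iqra AI's actual prompt template.

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble a prompt that places retrieved chunks ahead of the user question."""
    # Number each chunk so the model (and a human reader) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "How many vacation days do new hires get?",
    ["New hires receive 15 vacation days per year.", "Unused days expire in March."],
)
print(prompt)
```

The model's weights never change in this flow; only the text it sees for this one request does.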
## The Ingestion Pipeline
It is important to understand how we treat your data. **We do not store your raw files.** Once uploaded, a document is converted into text, chunked, and embedded.
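The three steps (convert to text, chunk, embed) can be sketched roughly as follows. Everything here is a simplified assumption: `toy_embed` is a hash-based stand-in for a real embedding model, `ingest` and `StoredChunk` are hypothetical names, and real extraction would parse PDF or DOCX rather than decode UTF-8 bytes.

```python
import zlib
from dataclasses import dataclass


@dataclass
class StoredChunk:
    text: str
    embedding: list[float]


def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Stand-in for a real embedding model: hashes words into a small count vector."""
    vector = [0.0] * dims
    for word in text.lower().split():
        vector[zlib.crc32(word.encode("utf-8")) % dims] += 1.0
    return vector


def ingest(raw_bytes: bytes, chunk_size: int = 40) -> list[StoredChunk]:
    """Extract text, split it into chunks, and embed each chunk."""
    text = raw_bytes.decode("utf-8")  # placeholder for real PDF/DOCX extraction
    pieces = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Only chunks and their embeddings are returned; raw_bytes is not retained.
    return [StoredChunk(text=piece, embedding=toy_embed(piece)) for piece in pieces]


stored = ingest(b"Refunds are issued within 14 days. Contact support to start a return.")
for chunk in stored:
    print(repr(chunk.text), chunk.embedding)
```

Note that the function returns only derived data, which mirrors the point above that the raw file is not what gets stored.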
***
## 1. Creating a Group
Documents are organized into **Groups** (e.g., "HR Policies", "Technical Manuals"). Retrieval settings are defined at the Group level.
### Chunking Strategies
Choose how your documents are split into chunks:
#### General Chunking
Splits text linearly based on character count.
* **Best for:** Simple text files, FAQs.
* **Settings:** Max Chunk Length (e.g., 500 chars), Overlap.
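As a rough illustration of General Chunking, the sketch below splits text by character count with a configurable overlap. The function name `chunk_text` and its parameter names are assumptions chosen to mirror the "Max Chunk Length" and "Overlap" settings; the platform's actual splitter may differ (for example, by respecting word boundaries).

```python
def chunk_text(text: str, max_len: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap` chars."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be >= 0 and smaller than max_len")

    step = max_len - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_len])
        if start + max_len >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", max_len=10, overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means the last three characters of each chunk reappear at the start of the next one, so a phrase that straddles a boundary is still intact in at least one chunk.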
#### Parent-Child Chunking
**High Precision.** Splits documents into small "Child" chunks for precise searching, but retrieves the larger "Parent" chunk for context.
* **Best for:** Complex documents where a single sentence loses meaning without its surrounding paragraph.
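The parent-child idea can be sketched as follows: search over sentence-sized children, then return the paragraph each child came from. This is a toy under stated assumptions: paragraphs stand in for "Parent" chunks, sentences for "Child" chunks, and a simple word-overlap count replaces real vector search. The function names are hypothetical.

```python
import re


def build_parent_child_index(document: str) -> tuple[list[str], list[tuple[str, int]]]:
    """Return (parents, children); each child is (sentence, index of its parent)."""
    parents = [p.strip() for p in document.split("\n\n") if p.strip()]
    children = []
    for parent_id, parent in enumerate(parents):
        # Split the paragraph into sentences after ., ! or ?
        for sentence in re.split(r"(?<=[.!?])\s+", parent):
            if sentence:
                children.append((sentence, parent_id))
    return parents, children


def retrieve_parent(query: str, parents: list[str], children: list[tuple[str, int]]) -> str:
    """Score the small child chunks, but return the larger parent for context."""
    query_words = set(re.findall(r"\w+", query.lower()))

    def overlap(child: tuple[str, int]) -> int:
        return len(query_words & set(re.findall(r"\w+", child[0].lower())))

    _, parent_id = max(children, key=overlap)
    return parents[parent_id]


doc = (
    "The X200 router supports mesh mode. It must be enabled in the admin panel.\n\n"
    "The warranty lasts two years. It does not cover water damage."
)
parents, children = build_parent_child_index(doc)
result = retrieve_parent("Is water damage covered?", parents, children)
print(result)
```

The matching sentence alone ("It does not cover water damage.") does not say what "it" is; returning the parent paragraph restores that the subject is the warranty.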
### Retrieval Configuration
* **Vector Search:** Matches semantic meaning (concepts).
* **Full Text:** Matches exact keywords.
* **Hybrid Search (Recommended):** Combines both scores for best results.
* **Reranking:** Re-orders the top results using a high-precision model to ensure the most relevant chunk is first.
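One common way to "combine both scores" is a weighted sum, sketched below. This page does not specify how Iqra AI fuses scores, so the weight `alpha`, the function name `hybrid_rank`, and the assumption that both score sets are already normalized to the range 0 to 1 are all illustrative. Reranking is omitted because it depends on a separate model.

```python
def hybrid_rank(
    vector_scores: dict[str, float],
    keyword_scores: dict[str, float],
    alpha: float = 0.5,
) -> list[tuple[str, float]]:
    """Blend semantic and keyword scores; alpha weights the vector side."""
    chunk_ids = set(vector_scores) | set(keyword_scores)
    combined = {
        # A chunk missing from one search contributes 0.0 for that side.
        cid: alpha * vector_scores.get(cid, 0.0)
        + (1 - alpha) * keyword_scores.get(cid, 0.0)
        for cid in chunk_ids
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


ranked = hybrid_rank(
    vector_scores={"chunk-a": 0.9, "chunk-b": 0.4},
    keyword_scores={"chunk-b": 1.0, "chunk-c": 0.5},
)
for chunk_id, score in ranked:
    print(chunk_id, round(score, 2))
```

In this example `chunk-b` ranks first because it scores on both sides, even though `chunk-a` has the highest vector score alone.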
***
## 2. Managing Documents
Once a group is created, you can upload and manage the data.
### Upload & Pre-processing
Upload PDF, DOCX, TXT, or MD files. You can enable **Cleaning Rules** to automatically strip URLs, emails, or excessive whitespace during extraction.
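Cleaning rules of this kind are typically regular-expression passes over the extracted text. The sketch below shows one plausible version; the patterns, the function name `clean_text`, and its flags are assumptions for illustration and are not the platform's actual rules.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def clean_text(
    text: str,
    strip_urls: bool = True,
    strip_emails: bool = True,
    collapse_whitespace: bool = True,
) -> str:
    """Apply each enabled cleaning rule in turn to the extracted text."""
    if strip_urls:
        text = URL_PATTERN.sub("", text)
    if strip_emails:
        text = EMAIL_PATTERN.sub("", text)
    if collapse_whitespace:
        text = re.sub(r"[ \t]+", " ", text)     # runs of spaces/tabs -> one space
        text = re.sub(r"\n{3,}", "\n\n", text)  # 3+ newlines -> one blank line
        text = text.strip()
    return text


cleaned = clean_text(
    "Contact   hr@example.com or visit https://example.com/policy  for details."
)
print(cleaned)  # Contact or visit for details.
```

Whitespace is collapsed last so that the gaps left behind by removed URLs and emails are cleaned up as well.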
### Chunk Management (Crucial)
After processing, the file exists only as a **List of Text Chunks**.
* **Edit Chunks:** If the PDF parser messed up a table, you can click on the chunk and fix the text manually.
* **Add Chunks:** You can manually add a text block (e.g., a quick policy update) without uploading a file.
* **Delete Chunks:** Remove irrelevant footers or legal disclaimers that confuse the AI.
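Conceptually, the three operations treat a processed file as an editable list of text blocks. The in-memory `ChunkStore` below is a hypothetical model of that idea, not the platform's storage layer or API; a real store would presumably also recompute the embedding whenever a chunk's text changes.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ChunkStore:
    """Toy stand-in for a document's chunk list, keyed by chunk id."""
    chunks: dict[int, str] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def add(self, text: str) -> int:
        chunk_id = next(self._ids)
        self.chunks[chunk_id] = text
        return chunk_id

    def edit(self, chunk_id: int, new_text: str) -> None:
        if chunk_id not in self.chunks:
            raise KeyError(chunk_id)
        self.chunks[chunk_id] = new_text

    def delete(self, chunk_id: int) -> None:
        del self.chunks[chunk_id]


store = ChunkStore()
table_id = store.add("Plan | Price Basic 10 Pro 25")        # badly parsed table
footer_id = store.add("Confidential. Do not distribute.")   # irrelevant footer

store.edit(table_id, "Basic plan: $10/month. Pro plan: $25/month.")
store.delete(footer_id)
store.add("Policy update: remote work is allowed on Fridays.")
print(store.chunks)
```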
***
## 3. Connecting to Agent
A Knowledge Base is useless if the Agent can't access it.
You must link your Knowledge Base Group to an Agent in the **[Agent Studio](/build/agent/intelligence)**.
Simply linking the KB doesn't mean the agent searches it every time. You must define a **Search Strategy** (e.g., *Always Search*, *Smart Classifier*, or *Script Tool*).
Read the **[Agent Intelligence Guide](/build/agent/intelligence#knowledge-base)** to configure *when* the agent searches.
***
## Roadmap: Future Capabilities
We are actively expanding our Knowledge Engine.
### Dynamic Data Sources
**Live Sync.** Instead of manual uploads, connect to **Google Drive**, **Notion**, or a **Website URL**. The system will periodically re-crawl and re-index the data to keep the agent up to date automatically.
### GraphRAG
**Knowledge Graph.** Moving beyond simple vectors. We plan to map relationships between entities (e.g., "Product A *is compatible with* Product B"). This allows the agent to answer complex reasoning questions that standard RAG fails at.