# Knowledge Base (RAG)

(Reference: https://docs.iqra.bot/build/knowledge/rag)

The **Knowledge Base** is your agent's long-term memory. It allows the AI to "read" your proprietary documents (PDFs, Manuals, Policies) and answer questions based *only* on that data.

Iqra AI uses a **RAG (Retrieval Augmented Generation)** pipeline. We do not retrain the model; we inject relevant information into the context window in real-time.

## The Ingestion Pipeline [#the-ingestion-pipeline]

It is important to understand how we treat your data. **We do not store your raw files.** Once uploaded, a document is converted into text, chunked, and embedded.

***

## 1. Creating a Group [#1-creating-a-group]

Documents are organized into **Groups** (e.g., "HR Policies", "Technical Manuals"). Retrieval settings are defined at the Group level.

### Chunking Strategies [#chunking-strategies]

How should we split your documents?

**General Chunking**

Splits text linearly based on character count.

* **Best for:** Simple text files, FAQs.
* **Settings:** Max Chunk Length (e.g., 500 chars), Overlap.

**Parent-Child Chunking**

**High Precision.** Splits documents into small "Child" chunks for precise searching, but retrieves the larger "Parent" chunk for context.

* **Best for:** Complex documents where a single sentence loses meaning without its surrounding paragraph.

### Retrieval Configuration [#retrieval-configuration]

* **Vector Search:** Matches semantic meaning (concepts).
* **Full Text:** Matches exact keywords.
* **Hybrid Search (Recommended):** Combines both scores for best results.
* **Reranking:** Re-orders the top results using a high-precision model to ensure the most relevant chunk is first.

***

## 2. Managing Documents [#2-managing-documents]

Once a group is created, you can upload and manage the data.

### Upload & Pre-processing [#upload--pre-processing]

Upload PDF, DOCX, TXT, or MD files.
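
After upload, each file is converted to text and split according to the group's chunking strategy. The two strategies from section 1 can be sketched roughly as follows. This is a minimal illustration, not Iqra AI's actual implementation: the function names (`general_chunks`, `parent_child_chunks`, `retrieve_parent`), the use of blank lines as parent boundaries, and the substring match standing in for vector search are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


def general_chunks(text: str, max_len: int = 500, overlap: int = 50) -> list[str]:
    """General Chunking: split linearly by character count.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of the two chunks.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be >= 0 and smaller than max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_len])
        if start + max_len >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


@dataclass
class ChildChunk:
    text: str
    parent_id: int  # index of the parent chunk this child was cut from


def parent_child_chunks(text: str, child_len: int = 100) -> tuple[list[str], list[ChildChunk]]:
    """Parent-Child Chunking: paragraphs are parents, small slices are children.

    Assumption: parents are separated by blank lines.
    """
    parents = [p.strip() for p in text.split("\n\n") if p.strip()]
    children = [
        ChildChunk(text=piece, parent_id=i)
        for i, parent in enumerate(parents)
        for piece in general_chunks(parent, max_len=child_len, overlap=0)
    ]
    return parents, children


def retrieve_parent(query: str, parents: list[str], children: list[ChildChunk]) -> Optional[str]:
    """Search the small children, but return the larger parent for context.

    A case-insensitive substring match stands in for real vector search here.
    """
    for child in children:
        if query.lower() in child.text.lower():
            return parents[child.parent_id]
    return None


if __name__ == "__main__":
    print(general_chunks("abcdefghij", max_len=4, overlap=1))
    doc = (
        "Refunds are issued within 14 days. Contact billing to start one."
        "\n\nShipping takes 3 days."
    )
    parents, children = parent_child_chunks(doc, child_len=20)
    print(retrieve_parent("14 days", parents, children))
```

In the second call the query matches a 20-character child, yet the full refund paragraph is returned. That is the point of the parent-child design: the search stays precise while the model still receives the surrounding context.
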
You can enable **Cleaning Rules** to automatically strip URLs, emails, or excessive whitespace during extraction.

### Chunk Management (Crucial) [#chunk-management-crucial]

After processing, the file exists only as a **List of Text Chunks**.

* **Edit Chunks:** If the PDF parser garbled a table, you can click on the chunk and fix the text manually.
* **Add Chunks:** You can manually add a text block (e.g., a quick policy update) without uploading a file.
* **Delete Chunks:** Remove irrelevant footers or legal disclaimers that confuse the AI.

***

## 3. Connecting to Agent [#3-connecting-to-agent]

A Knowledge Base is useless if the Agent can't access it. You must link your Knowledge Base Group to an Agent in the **[Agent Studio](/build/agent/intelligence)**.

Simply linking the KB doesn't mean the agent searches it every time. You must define a **Search Strategy** (e.g., *Always Search*, *Smart Classifier*, or *Script Tool*). Read the **[Agent Intelligence Guide](/build/agent/intelligence#knowledge-base)** to configure *when* the agent searches.

***

## Roadmap: Future Capabilities [#roadmap-future-capabilities]

We are actively expanding our Knowledge Engine.

**Dynamic Data Sources**

**Live Sync.** Instead of manual uploads, connect to **Google Drive**, **Notion**, or a **Website URL**. The system will periodically re-crawl and re-index the data to keep the agent up to date automatically.

**GraphRAG**

**Knowledge Graph.** Moving beyond simple vectors. We plan to map relationships between entities (e.g., "Product A *is compatible with* Product B"). This allows the agent to answer complex reasoning questions that standard RAG struggles with.
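
To make the entity-relationship idea concrete, here is a minimal sketch of how such relationships could be stored as (subject, relation, object) triples and followed across several hops. It is purely illustrative: GraphRAG is a roadmap item, so nothing here reflects Iqra AI's design. The names `build_graph` and `reachable`, the sample products, and the choice to treat "is compatible with" as transitive are all assumptions made for the example.

```python
from collections import defaultdict, deque

# A relationship is a (subject, relation, object) triple.
Triple = tuple[str, str, str]


def build_graph(triples: list[Triple]) -> dict[str, list[tuple[str, str]]]:
    """Index triples by subject so outgoing edges can be looked up quickly."""
    graph: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def reachable(graph: dict[str, list[tuple[str, str]]], start: str, relation: str) -> list[str]:
    """Breadth-first search that follows only edges labelled `relation`.

    Returns every entity reachable from `start`, in discovery order.
    """
    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        node = queue.popleft()
        for rel, target in graph.get(node, []):
            if rel == relation and target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found


if __name__ == "__main__":
    triples = [
        ("Product A", "is compatible with", "Product B"),
        ("Product B", "is compatible with", "Product C"),
        ("Product A", "is made by", "Acme"),
    ]
    graph = build_graph(triples)
    print(reachable(graph, "Product A", "is compatible with"))
```

No single chunk states that Product A relates to Product C, which is why a chunk-by-chunk similarity search can miss it. Following two edges in the graph surfaces the connection.
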