RAG (For Your Document) - Lakshya Tripathi Portfolio

Overview

A Retrieval-Augmented Generation (RAG) application that lets users upload documents and query them in natural language. Built with Streamlit, LangChain, and Groq AI, it answers questions from the uploaded content and attributes each answer to its sources.

Key Features

Multi-format Support: PDF, CSV, and TXT files
Configurable Chunking: Adjustable text chunking with overlap
Local Vector Storage: ChromaDB integration
Semantic Search: Powered by Sentence Transformers
AI-Powered Responses: Groq's Gemma2-9b-it model
Source Attribution: Track answer sources

RAG Pipeline Architecture

Document Upload → Text Splitting → Embeddings → Vector Store → Similarity Search → LLM Context

The system splits each uploaded document into chunks, converts the chunks into embeddings, and stores them in a vector database. At query time it runs a similarity search over the stored embeddings and passes the best-matching chunks to a large language model as context for the answer.
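
The retrieval half of this flow can be illustrated with a small self-contained sketch. It is not the application's code: the real system uses Sentence Transformers for embeddings and ChromaDB for storage, whereas this sketch swaps in a bag-of-words vector and an in-memory list so that it runs with only the standard library. The names `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, store: list, k: int = 3) -> list:
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


chunks = [
    "ChromaDB stores the embeddings on local disk.",
    "Streamlit renders the upload sidebar.",
    "Groq serves the language model that writes the answer.",
]
store = [(c, embed(c)) for c in chunks]  # toy "vector store": (chunk, vector) pairs

question = "Where are embeddings stored?"
context = top_k(question, store, k=1)

# The retrieved chunk becomes the context that is sent to the LLM.
prompt = f"Context:\n{context[0]}\n\nQuestion: {question}"
```

Only the first chunk shares a word with the question, so it ranks highest and ends up in the prompt. A real embedding model would also match on meaning rather than exact words.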

Technology Stack

Streamlit, LangChain, Groq AI, Gemma2-9b-it, Sentence Transformers, ChromaDB, Python

Installation

# Clone the repository
git clone https://github.com/lakshya1410/RAG_for_your_document.git
cd RAG_for_your_document

# Install dependencies
pip install -r requirements.txt

# Create .env file with your Groq API key
echo "GROQ_API_KEY=your_groq_api_key_here" > .env

Note: Get your free Groq API key from console.groq.com
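
A minimal sketch of how a script might confirm the key is available before calling Groq. It assumes the .env file contains the single KEY=value line written above. The helper name `read_env_value` is hypothetical, and the application itself may load the file differently (for example with a dotenv library).

```python
import os
from pathlib import Path
from typing import Optional


def read_env_value(name: str, env_path: str = ".env") -> Optional[str]:
    """Return `name` from the process environment, else from a simple KEY=value file."""
    if os.environ.get(name):
        return os.environ[name]
    path = Path(env_path)
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip('"').strip("'") or None
    return None


key = read_env_value("GROQ_API_KEY")
if key is None:
    print("GROQ_API_KEY not found: create a .env file with a valid API key")
```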

Usage

streamlit run main.py

1. Upload documents (PDF, CSV, TXT) via the sidebar
2. Configure chunk size (default: 1000) and overlap (default: 100)
3. Click "Submit & Process"
4. Ask questions and get AI-generated answers with source attribution
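
To show what the chunk size and overlap settings control, here is a minimal character-based sketch. It illustrates the general sliding-window technique and is not the application's splitter, which may also respect paragraph or sentence boundaries. `split_with_overlap` is a hypothetical name.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list:
    """Slide a fixed-size window over `text`; consecutive chunks share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # distance between chunk start positions
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


text = "x" * 2500
chunks = split_with_overlap(text)  # defaults from this document: 1000 / 100
sizes = [len(c) for c in chunks]
```

With the defaults, a 2,500-character text yields three chunks of 1000, 1000, and 700 characters, each starting 900 characters after the previous one. A larger overlap preserves more context across chunk boundaries at the cost of storing more duplicated text.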

Programmatic Usage

from rag_pipeline import process_files, ask_question

# Process documents
with open('document.pdf', 'rb') as f:
    process_files([f], chunk_size=1000, chunk_overlap=100)

# Ask questions
answer, sources = ask_question("What is the main topic?", k=3)

Troubleshooting

"GROQ_API_KEY not found": Create .env file with valid API key
"Vector store is empty": Upload and process documents first
"Error processing file": Ensure supported format (PDF, CSV, TXT) and valid encoding

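The format and encoding problem can be caught before uploading. A minimal sketch, assuming text-based files (CSV, TXT) are expected to be UTF-8. The application's actual validation rules and accepted encodings are not documented here, and `check_upload` is a hypothetical helper.

```python
from pathlib import Path

SUPPORTED = {".pdf", ".csv", ".txt"}  # formats listed in this document


def check_upload(path: str) -> str:
    """Return "ok" or a short reason why the file is likely to fail processing."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix not in SUPPORTED:
        return f"unsupported format: {suffix or '(none)'}"
    data = p.read_bytes()
    if suffix == ".pdf":
        # PDF files normally begin with the "%PDF-" signature.
        return "ok" if data.startswith(b"%PDF-") else "missing PDF header"
    try:
        data.decode("utf-8")  # assumption: text files should be UTF-8
    except UnicodeDecodeError:
        return "text is not valid UTF-8"
    return "ok"
```

Running `check_upload` on each file before clicking "Submit & Process" separates format problems from issues in the pipeline itself.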