09 January 2026

#RAG

Key Concepts


| No. | Topic | Sub-Topics |
|-----|-------|------------|
| 1 | RAG | What is RAG, Why RAG, RAG vs LLM-only, RAG use cases, RAG limitations |
| 2 | LLM Fundamentals for RAG | Transformer basics, Context window, Tokens, Prompt-response flow, Hallucinations |
| 3 | Text Embeddings | What are embeddings, Vector representation, Embedding models, Dimensionality, Similarity meaning |
| 4 | Embedding Models | OpenAI embeddings, SentenceTransformers, Multilingual embeddings, Trade-offs, Model selection |
| 5 | Vector Database Basics | Vector DB concept, ANN search, Indexing basics, Metadata storage, Vector lifecycle |
| 6 | Vector DB Tools | FAISS, Pinecone, Weaviate, Milvus, ChromaDB |
| 7 | Distance Metrics | Cosine similarity, Dot product, Euclidean distance, Trade-offs, Metric selection |
| 8 | Chunking Strategies | Fixed chunking, Semantic chunking, Chunk size, Overlap, Parent-child chunks |
| 9 | Document Ingestion | PDF ingestion, Text files, HTML ingestion, Cleaning text, Normalization |
| 10 | Indexing Pipeline | Embedding generation, Batch indexing, Metadata tagging, Versioning, Index updates |
| 11 | Retrieval Basics | Top-k retrieval, Similarity threshold, Recall vs precision, Retrieval latency, Query flow |
| 12 | Hybrid Search | Dense search, Sparse search, Keyword search, BM25, Hybrid ranking |
| 13 | Metadata Filtering | Structured filters, Access control, User-based filtering, Time filters, Security filters |
| 14 | Prompt Engineering for RAG | Prompt templates, Context injection, Instructions, Citations, Answer formatting |
| 15 | Naive RAG Architecture | Single retriever, Single prompt, Context stuffing, Limitations, Failure cases |
| 16 | Advanced RAG Architecture | Multi-retriever, Reranking, Compression, Query rewriting, Modular design |
| 17 | Reranking Techniques | Cross-encoders, Relevance scoring, Latency trade-off, Top-n rerank, Quality boost |
| 18 | Context Optimization | Token limits, Context pruning, Compression, Redundancy removal, Ordering chunks |
| 19 | Multi-hop Retrieval | Complex queries, Query decomposition, Iterative retrieval, Chain-of-thought, Examples |
| 20 | Agentic RAG | LLM agents, Tool calling, Planner-executor, Memory, Autonomous retrieval |
| 21 | Structured Data RAG | SQL integration, CSV data, APIs, Knowledge graphs, Hybrid retrieval |
| 22 | RAG with LangChain | Retrievers, Chains, Vector stores, Memory, RAG pipelines |
| 23 | RAG with LlamaIndex | Indexes, Query engines, Node parsing, Storage context, Tools |
| 24 | Evaluation of RAG | Retrieval metrics, Answer quality, Faithfulness, Relevance, Latency |
| 25 | RAGAS Framework | Faithfulness score, Context recall, Answer relevance, Ground truth, Automation |
| 26 | Security in RAG | Prompt injection, Data leakage, RBAC, PII handling, Secure retrieval |
| 27 | Scalability & Performance | Index sharding, Caching, Async retrieval, Load balancing, Cost control |
| 28 | Production Deployment | API design, Model hosting, Vector DB hosting, Monitoring, Logging |
| 29 | Monitoring & Feedback | User feedback, Drift detection, Retrieval errors, Continuous improvement, Alerts |
| 30 | Enterprise RAG Use Cases | Chatbots, Search engines, Knowledge assistants, Analytics, Decision support |
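
As a rough illustration of the "Distance Metrics" and "Retrieval Basics" topics listed above, the sketch below scores a query vector against a few document vectors with cosine similarity and keeps the top k. The document IDs and the three-dimensional vectors are made-up toy values, not output from any real embedding model, and the function names `cosine_similarity` and `top_k` are hypothetical. A real system would typically delegate this search to a vector database with an ANN index rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Brute-force top-k: score every document, sort by similarity, keep the best k."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy "embeddings" (assumed values for illustration only).
docs = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_times": [0.1, 0.9, 0.1],
    "returns_faq": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

results = top_k(query, docs, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

With these toy vectors, the two refund-related documents rank ahead of the shipping document because their vectors point in nearly the same direction as the query.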

Interview Questions

Basic Level

  1. What is Retrieval-Augmented Generation (RAG)?
  2. Why is RAG needed for LLM applications?
  3. What problems does RAG solve?
  4. What are the core components of a RAG system?
  5. What is retrieval in RAG?
  6. What is generation in RAG?
  7. How is RAG different from fine-tuning?
  8. How is RAG different from prompt engineering?
  9. What is a knowledge base in RAG?
  10. What type of data can RAG consume?
  11. What are embeddings?
  12. Why are embeddings used in RAG?
  13. What is a vector database?
  14. What are some examples of vector databases?
  15. What is semantic search?
  16. What is similarity search?
  17. What distance metrics are commonly used?
  18. What is cosine similarity?
  19. What is text chunking?
  20. Why is chunking important in RAG?
  21. What is a context window?
  22. What is prompt grounding?
  23. What is hallucination in LLMs?
  24. How does RAG reduce hallucinations?
  25. What are common RAG use cases?

Intermediate Level

  1. Explain the end-to-end RAG workflow.
  2. How are embeddings generated?
  3. Which embedding models are commonly used?
  4. What is embedding dimensionality?
  5. How does chunk size affect retrieval?
  6. What is chunk overlap?
  7. What is metadata filtering?
  8. What is hybrid search?
  9. What is the difference between sparse and dense retrieval?
  10. How does keyword search differ from vector search?
  11. What is top-k retrieval?
  12. How do you decide the value of k?
  13. What is reranking?
  14. Why is reranking important?
  15. What is prompt templating in RAG?
  16. How is retrieved context injected into prompts?
  17. What is the latency challenge in RAG?
  18. How do you improve RAG response speed?
  19. What is document indexing?
  20. How do you update knowledge base data?
  21. What is FAISS?
  22. What is Pinecone?
  23. What is Weaviate?
  24. What is ChromaDB?
  25. What role does LangChain play in RAG?

Advanced Level

  1. What are the different RAG architectures?
  2. What is naive RAG?
  3. What is advanced RAG?
  4. What is agentic RAG?
  5. What is multi-hop retrieval?
  6. What is query rewriting?
  7. What is a self-query retriever?
  8. What is parent-child chunking?
  9. What is the difference between document-level and chunk-level retrieval?
  10. What is contextual compression?
  11. How do you handle long documents in RAG?
  12. How does RAG integrate with structured data?
  13. How can SQL databases be used in RAG?
  14. What is retrieval evaluation?
  15. What metrics are used to evaluate RAG?
  16. What is recall vs precision in RAG?
  17. What is MMR (Maximal Marginal Relevance)?
  18. How does MMR help improve answer quality?
  19. What is data skew in retrieval?
  20. How do you handle stale data?
  21. How do you implement real-time RAG?
  22. How is access control handled in RAG?
  23. How do you secure sensitive documents?
  24. How does multilingual RAG work?
  25. What are common RAG failure patterns?
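
To make the MMR questions above more concrete, the following is a small sketch of greedy MMR selection: at each step it picks the candidate that maximizes `lambda_mult * relevance - (1 - lambda_mult) * redundancy`, where redundancy is the highest similarity to anything already selected. The vectors and document IDs are toy values chosen for illustration, and the names `cosine`, `mmr`, and `lambda_mult` are assumptions of this sketch rather than any particular library's API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr(query_vec, candidates, k=2, lambda_mult=0.5):
    """Greedy Maximal Marginal Relevance over a dict of {doc_id: vector}.

    lambda_mult=1.0 ranks purely by relevance to the query;
    lower values penalize candidates similar to already-selected ones.
    """
    selected = []
    remaining = set(candidates)
    while remaining and len(selected) < k:
        best_id, best_score = None, -math.inf
        for doc_id in sorted(remaining):  # sorted for deterministic tie-breaking
            relevance = cosine(query_vec, candidates[doc_id])
            redundancy = max(
                (cosine(candidates[doc_id], candidates[s]) for s in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1.0 - lambda_mult) * redundancy
            if score > best_score:
                best_id, best_score = doc_id, score
        selected.append(best_id)
        remaining.remove(best_id)
    return selected


# Toy vectors: two near-duplicate refund documents and one distinct document.
candidates = {
    "refund_policy": [1.0, 0.1],
    "refund_policy_copy": [1.0, 0.0],
    "shipping_times": [0.5, 0.9],
}
query = [1.0, 0.3]

diverse = mmr(query, candidates, k=2, lambda_mult=0.5)
relevance_only = mmr(query, candidates, k=2, lambda_mult=1.0)
print(diverse)
print(relevance_only)
```

In this toy setup, pure relevance ranking returns both near-duplicate refund documents, while the balanced setting swaps the duplicate for the less similar shipping document, which is the diversity effect MMR is meant to provide.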

Expert Level

  1. How do you design a production-grade RAG system?
  2. How does RAG scale to millions of documents?
  3. What are the trade-offs between RAG and fine-tuning?
  4. How do you optimize RAG for low latency?
  5. How do you debug poor RAG responses?
  6. What causes irrelevant retrieval?
  7. How do you improve retrieval accuracy?
  8. How do context limits impact RAG?
  9. What strategies help reduce token usage?
  10. How do you prevent prompt injection in RAG?
  11. How do you measure answer faithfulness?
  12. What is the RAGAS evaluation framework?
  13. How do you monitor RAG systems in production?
  14. How do you build feedback loops?
  15. What is continuous indexing?
  16. How do you version embeddings?
  17. How do you migrate vector databases safely?
  18. How do you control RAG operational costs?
  19. How do you handle LLM upgrades?
  20. How does RAG enable explainability?
  21. What is citation-based RAG?
  22. How does RAG work with AI agents?
  23. What are emerging RAG patterns?
  24. What are the limitations of RAG?
  25. Explain enterprise-level RAG use cases.

Related Topics