1. Basic/Naive RAG
Description: Simple retrieve-then-generate approach
Flow: Query → Retrieve relevant documents → Generate response using retrieved context
Components: Single vector database, basic embedding model, LLM for generation
Use Case: Simple Q&A applications, proof of concepts
Limitations: No query optimization, basic relevance matching
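The flow above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory `DOCS` list stands in for a vector database, and `build_prompt` only prepares the text that would be sent to an LLM. All of these names are assumptions made for the example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would call an embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank every document by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    # The generation step would send this prompt to an LLM.
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

DOCS = [
    "RAG retrieves documents before generating an answer",
    "Vector databases store embeddings for similarity search",
    "Bananas are rich in potassium",
]
QUERY = "how does RAG generate an answer"

prompt = build_prompt(QUERY, retrieve(QUERY, DOCS, k=1))
print(prompt)
```

Nothing in this sketch rewrites the query or re-scores results, which is exactly the limitation noted above.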
2. Advanced RAG
Description: Enhanced retrieval with pre/post-processing
Features: Query rewriting, document re-ranking, result filtering
Components: Query optimizer, multiple retrieval strategies, re-ranking models
Improvements: Better relevance, reduced hallucination, contextual understanding
Use Case: Production applications requiring higher accuracy
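A rough sketch of the pre- and post-processing stages, under simplifying assumptions: query rewriting is reduced to synonym expansion from a hypothetical `SYNONYMS` table, and `overlap_score` is a term-overlap stand-in for a trained re-ranking model.

```python
SYNONYMS = {"car": ["automobile", "vehicle"]}  # hypothetical expansion table

def rewrite_query(query: str, synonyms: dict[str, list[str]]) -> str:
    # Pre-processing: append known synonyms to the original query terms.
    words = query.lower().split()
    extra = [s for w in words for s in synonyms.get(w, [])]
    return " ".join(words + extra)

def overlap_score(query: str, doc: str) -> float:
    # Stand-in for a re-ranking model: fraction of query terms present in the doc.
    q = set(query.split())
    return len(q & set(doc.lower().split())) / len(q) if q else 0.0

def rerank_and_filter(query: str, candidates: list[str],
                      min_score: float = 0.2, top_k: int = 3) -> list[str]:
    # Post-processing: drop weak candidates, then order the rest by score.
    scored = [(overlap_score(query, d), d) for d in candidates]
    kept = [(s, d) for s, d in scored if s >= min_score]
    return [d for s, d in sorted(kept, key=lambda p: p[0], reverse=True)[:top_k]]

candidates = [
    "automobile repair manual",
    "the car and vehicle maintenance guide",
    "cooking pasta at home",
]
expanded = rewrite_query("car maintenance", SYNONYMS)
print(rerank_and_filter(expanded, candidates))
```

The unrelated document is filtered out, and the document matching three of the four expanded terms is ranked first.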
3. Modular RAG
Description: Flexible, component-based architecture
Features: Interchangeable modules (retrieval, generation, reasoning)
Components: Pluggable retrievers, generators, evaluators, orchestrators
Benefits: Customizable pipelines, easier testing and optimization
Use Case: Complex applications with varying requirements
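One way to picture interchangeable modules is a pipeline that depends only on small interfaces. The sketch below is an assumption about how such a design might look; `Retriever`, `Generator`, `KeywordRetriever`, `EchoGenerator`, and `Pipeline` are illustrative names, not part of any framework.

```python
from dataclasses import dataclass
from typing import Protocol

class Retriever(Protocol):
    def retrieve(self, query: str) -> list[str]: ...

class Generator(Protocol):
    def generate(self, query: str, context: list[str]) -> str: ...

class KeywordRetriever:
    # One pluggable retriever; a vector-based one could replace it unchanged.
    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    def retrieve(self, query: str) -> list[str]:
        terms = set(query.lower().split())
        return [d for d in self.docs if terms & set(d.lower().split())]

class EchoGenerator:
    # Placeholder generator that reports what it received instead of calling an LLM.
    def generate(self, query: str, context: list[str]) -> str:
        return f"{query} -> {len(context)} passage(s)"

@dataclass
class Pipeline:
    retriever: Retriever
    generator: Generator

    def run(self, query: str) -> str:
        return self.generator.generate(query, self.retriever.retrieve(query))

pipeline = Pipeline(KeywordRetriever(["alpha beta", "gamma"]), EchoGenerator())
print(pipeline.run("beta"))
```

Because `Pipeline` only sees the two protocols, each module can be tested or swapped in isolation.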
4. Agentic RAG
Description: Agent-driven retrieval with dynamic decision-making
Features: Multi-step reasoning, adaptive retrieval strategies, tool usage
Components: Planning agents, retrieval agents, synthesis agents
Capabilities: Complex query decomposition, iterative refinement
Use Case: Research assistance, complex problem-solving
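The plan, retrieve, decide, and synthesize loop can be illustrated with a single toy agent. This is a heavily simplified sketch: an LLM would normally make the planning and follow-up decisions, whereas here they are plain keyword rules, and `CORPUS`, `plan`, `run_agent`, and `synthesize` are hypothetical names.

```python
CORPUS = {
    "crag": "CRAG adds a web fallback to rag",
    "rag": "RAG combines retrieval with generation",
    "web": "Web search covers fresh facts",
}

def plan(question: str, corpus: dict[str, str]) -> list[str]:
    # Planning step: pick the corpus topics that the question mentions.
    words = set(question.lower().replace("?", "").split())
    return [topic for topic in corpus if topic in words]

def run_agent(question: str, corpus: dict[str, str], max_steps: int = 4) -> list[str]:
    queue, seen, evidence = plan(question, corpus), set(), []
    while queue and len(evidence) < max_steps:
        topic = queue.pop(0)
        if topic in seen:
            continue
        seen.add(topic)
        passage = corpus[topic]  # retrieval step
        evidence.append(passage)
        # Decision step: follow topics named in the passage that are not covered yet.
        mentioned = set(passage.lower().split())
        queue.extend(t for t in corpus if t in mentioned and t not in seen)
    return evidence

def synthesize(question: str, evidence: list[str]) -> str:
    # Synthesis step: a real system would prompt an LLM with the evidence.
    return f"{question} | evidence: " + " / ".join(evidence)

print(synthesize("What is CRAG?", run_agent("What is CRAG?", CORPUS)))
```

The `max_steps` budget bounds the iterative refinement so the loop cannot run indefinitely.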
5. Hierarchical RAG
Description: Multi-level retrieval with document hierarchy
Structure: Document → Section → Paragraph → Sentence levels
Features: Coarse-to-fine retrieval, contextual preservation
Benefits: Better long-document understanding, improved context relevance
Use Case: Technical documentation, legal documents, research papers
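Coarse-to-fine retrieval can be sketched with two levels, section then paragraph. The example assumes a document already split into titled sections, and uses term overlap in place of embedding similarity; `score`, `coarse_to_fine`, and `DOCUMENT` are illustrative names.

```python
def score(query: str, text: str) -> int:
    # Number of distinct query terms that appear in the text.
    return len(set(query.lower().split()) & set(text.lower().split()))

def coarse_to_fine(query: str, document: dict[str, list[str]]) -> tuple[str, str]:
    # Coarse pass: choose the section whose title plus body matches best.
    best_section = max(
        document,
        key=lambda title: score(query, title + " " + " ".join(document[title])),
    )
    # Fine pass: choose the best paragraph inside that section only.
    best_paragraph = max(document[best_section], key=lambda p: score(query, p))
    return best_section, best_paragraph

DOCUMENT = {
    "Installation": ["Install the package with pip", "Python 3.10 or newer is required"],
    "Configuration": ["Set the api key in the config file", "Timeouts default to thirty seconds"],
}
print(coarse_to_fine("default timeouts", DOCUMENT))
```

Returning the section title along with the paragraph keeps the surrounding context available to the generator.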
6. Fusion/Hybrid RAG
Description: Combines multiple retrieval methods
Approaches: Dense + sparse retrieval, multiple embedding models
Techniques: Reciprocal rank fusion, weighted combination
Benefits: Improved recall and precision, robust retrieval
Use Case: Diverse content types, comprehensive search requirements
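Reciprocal rank fusion scores each document as the sum of 1 / (k + rank) over every ranking in which it appears. A minimal sketch follows; the constant k = 60 is the value used by Cormack et al. (2009), and the two input rankings are made-up examples standing in for dense and sparse retriever output.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Sum 1 / (k + rank) for each document across all rankings (rank starts at 1).
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

dense = ["d2", "d1", "d3"]   # hypothetical dense-retriever ranking
sparse = ["d1", "d4", "d2"]  # hypothetical sparse (keyword) ranking
fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

Because only ranks are used, the method needs no score normalization between retrievers, which is why it is a common default before trying a weighted combination.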
7. Self-RAG
Description: Self-reflective retrieval with quality assessment
Features: Retrieval necessity prediction, relevance scoring, response verification
Components: Reflection tokens, quality critics, adaptive triggering
Benefits: Reduced unnecessary retrievals, improved factual accuracy
Use Case: High-stakes applications requiring reliability
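The control flow of these three checks can be sketched as below. Note the large simplification: in the Self-RAG paper the model itself is trained to emit reflection tokens, while this sketch replaces each decision with a keyword heuristic. `needs_retrieval`, `relevance`, `is_supported`, and the `generate` callback are hypothetical names.

```python
from typing import Callable, Optional

FACTUAL_CUES = {"who", "when", "where", "which", "what"}

def tokens(text: str) -> set[str]:
    return set(text.lower().replace("?", "").split())

def needs_retrieval(query: str) -> bool:
    # Stand-in for retrieval necessity prediction.
    return bool(FACTUAL_CUES & tokens(query))

def relevance(query: str, passage: str) -> float:
    # Stand-in for relevance scoring: share of query terms found in the passage.
    q = tokens(query)
    return len(q & tokens(passage)) / len(q) if q else 0.0

def is_supported(answer: str, passage: str) -> bool:
    # Stand-in for response verification: every answer term appears in the passage.
    return tokens(answer) <= tokens(passage)

def self_rag(query: str, passages: list[str],
             generate: Callable[[str, Optional[str]], str],
             min_relevance: float = 0.3) -> tuple[str, str]:
    if not needs_retrieval(query):
        return generate(query, None), "no_retrieval"
    relevant = [p for p in passages if relevance(query, p) >= min_relevance]
    for passage in sorted(relevant, key=lambda p: relevance(query, p), reverse=True):
        answer = generate(query, passage)
        if is_supported(answer, passage):
            return answer, "supported"
    return generate(query, None), "unsupported"

def stub_generate(query: str, passage: Optional[str]) -> str:
    # Placeholder for an LLM call: echo the passage, or admit uncertainty.
    return passage if passage else "I am not sure"

PASSAGES = ["RAG was introduced in 2020", "Bananas are yellow"]
print(self_rag("When was RAG introduced?", PASSAGES, stub_generate))
```

The second element of the result records which path was taken, which is useful when auditing how often retrieval was skipped.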
8. Corrective RAG (CRAG)
Description: Error-correcting retrieval with web fallback
Features: Retrieval quality assessment, web search integration, knowledge correction
Flow: Assess retrieval → Correct if needed → Generate with verified knowledge
Benefits: Handles knowledge gaps, improves factual accuracy
Use Case: Dynamic knowledge domains, fact-checking applications
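The assess-then-correct flow can be sketched with two thresholds that sort a retrieval into correct, incorrect, or ambiguous. The thresholds, the overlap-based `overlap` scorer, and the `web_search` callback are assumptions for illustration; a real system would use a trained retrieval evaluator and an actual search API.

```python
from typing import Callable

def overlap(query: str, passage: str) -> float:
    # Stand-in for a retrieval evaluator: share of query terms found in the passage.
    q = set(query.lower().split())
    return len(q & set(passage.lower().split())) / len(q) if q else 0.0

def assess(query: str, passages: list[str], upper: float = 0.6, lower: float = 0.2) -> str:
    best = max((overlap(query, p) for p in passages), default=0.0)
    if best >= upper:
        return "correct"
    if best < lower:
        return "incorrect"
    return "ambiguous"

def corrective_retrieve(query: str, passages: list[str],
                        web_search: Callable[[str], list[str]]) -> tuple[str, list[str]]:
    verdict = assess(query, passages)
    if verdict == "correct":
        return verdict, passages                    # trust the local retrieval
    if verdict == "incorrect":
        return verdict, web_search(query)           # discard it and fall back to the web
    return verdict, passages + web_search(query)    # ambiguous: combine both sources

def fake_web_search(query: str) -> list[str]:
    # Hypothetical stand-in for a web search API.
    return [f"web result for: {query}"]

print(corrective_retrieve("bedrock agents pricing", ["cats sleep a lot"], fake_web_search))
```

Generation then proceeds with whichever passage list the correction step returns.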
9. Adaptive RAG
Description: Dynamic strategy selection based on query complexity
Features: Query classification, strategy routing, adaptive processing
Strategies: No retrieval, single-step, multi-step, iterative
Benefits: Lower cost and latency on simple queries, with deeper retrieval reserved for complex ones
Use Case: Mixed query types, production optimization
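Query classification and routing might look like the sketch below. The classifier here is a hand-written rule set used purely for illustration (a production system would more likely use a small trained classifier), it covers three of the four strategies listed above, and the handler functions are hypothetical placeholders.

```python
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how"}

def classify(query: str) -> str:
    words = query.lower().replace("?", "").split()
    if len(words) <= 3 and not QUESTION_WORDS & set(words):
        return "no_retrieval"   # short, non-question input such as a greeting
    if "and" in words or "compare" in words:
        return "multi_step"     # compound or comparative question
    return "single_step"

# Hypothetical handlers; each would wrap a different retrieval pipeline.
HANDLERS = {
    "no_retrieval": lambda q: f"LLM only: {q}",
    "single_step": lambda q: f"one retrieval pass: {q}",
    "multi_step": lambda q: f"decompose then retrieve: {q}",
}

def route(query: str) -> str:
    return HANDLERS[classify(query)](query)

for example in ("hello there", "What is RAG?", "Compare Self-RAG and CRAG"):
    print(classify(example), "|", route(example))
```

Routing simple inputs away from retrieval is where the cost savings come from.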
10. GraphRAG
Description: Graph-based knowledge representation and retrieval
Features: Entity relationships, graph traversal, community detection
Components: Knowledge graphs, graph databases, relationship reasoning
Benefits: Complex relationship understanding, multi-hop reasoning
Use Case: Knowledge-intensive domains, relationship-heavy queries
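Multi-hop retrieval over a knowledge graph can be sketched as a bounded breadth-first traversal that collects relationship triples to pass to the generator. The tiny `GRAPH` below is toy data, and `multi_hop` is an illustrative helper rather than any graph database API.

```python
from collections import deque

# Toy graph: node -> list of (relation, neighbor) edges.
GRAPH = {
    "Self-RAG": [("builds_on", "RAG")],
    "RAG": [("uses", "DPR")],
    "DPR": [("is_a", "dense retriever")],
}

def multi_hop(graph: dict[str, list[tuple[str, str]]], start: str,
              max_hops: int = 2) -> list[tuple[str, str, str]]:
    # Collect (subject, relation, object) triples within max_hops of the start node.
    seen, triples = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for relation, neighbor in graph.get(node, []):
            triples.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return triples

print(multi_hop(GRAPH, "Self-RAG", max_hops=2))
```

Raising `max_hops` widens the context at the cost of more, and often less relevant, triples.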
11. Multi-Modal RAG
Description: Retrieval across text, images, audio, video
Features: Cross-modal embeddings, multi-modal fusion, diverse content types
Components: Vision encoders, audio processors, multi-modal LLMs
Use Case: Rich media applications, comprehensive content search
12. Conversational RAG
Description: Context-aware retrieval for multi-turn conversations
Features: Conversation history integration, context tracking, query refinement
Components: Memory management, context windows, dialogue state tracking
Use Case: Chatbots, virtual assistants, interactive applications
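A bounded conversation memory that folds recent turns into the retrieval query is one simple way to integrate history. The sketch below is an assumption about a minimal design; production systems often have an LLM rewrite the follow-up into a standalone question instead of concatenating text. `ConversationMemory` is a hypothetical class.

```python
from collections import deque

class ConversationMemory:
    def __init__(self, max_turns: int = 3) -> None:
        # Keep only the most recent turns so the context window stays bounded.
        self.turns: deque[tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def contextualize(self, query: str) -> str:
        # Prefix the new query with recent user turns so that references
        # like "they" can still match the right documents at retrieval time.
        history = " ".join(user for user, _ in self.turns)
        return f"{history} {query}".strip()

memory = ConversationMemory(max_turns=2)
memory.add("Tell me about Bedrock Agents", "Bedrock Agents orchestrate tools.")
print(memory.contextualize("How do they call Lambda?"))
```

The `maxlen` bound silently drops the oldest turn, a crude form of memory management.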
13. Temporal RAG
Description: Time-aware retrieval considering temporal relevance
Features: Temporal embeddings, time-based filtering, recency weighting
Components: Time-aware indexing, temporal ranking, freshness scoring
Use Case: News applications, time-sensitive information, evolving knowledge
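Recency weighting is often implemented as a decay applied to the similarity score. The sketch assumes exponential decay with a configurable half-life; the 30-day default, the `freshness` and `temporal_rank` helpers, and the sample hits are all illustrative choices.

```python
from datetime import date

def freshness(published: date, today: date, half_life_days: float = 30.0) -> float:
    # Exponential decay: the weight halves every half_life_days.
    age_days = max((today - published).days, 0)
    return 0.5 ** (age_days / half_life_days)

def temporal_rank(hits: list[tuple[str, float, date]], today: date,
                  half_life_days: float = 30.0) -> list[str]:
    # hits holds (doc_id, similarity, published_date); final score = similarity * freshness.
    scored = [(sim * freshness(pub, today, half_life_days), doc_id)
              for doc_id, sim, pub in hits]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

today = date(2024, 6, 30)
hits = [
    ("old", 0.9, date(2024, 1, 2)),   # more similar but about six months old
    ("new", 0.6, date(2024, 6, 25)),  # less similar but five days old
]
print(temporal_rank(hits, today))
```

A shorter half-life suits news-style content, while a longer one suits slowly evolving reference material.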
14. Federated RAG
Description: Retrieval across multiple distributed knowledge sources
Features: Cross-source querying, source-specific optimization, result aggregation
Components: Multiple vector stores, federation layer, source routing
Use Case: Multi-tenant systems, organizational silos, diverse data sources
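Result aggregation across sources has to cope with scores on different scales. The sketch below normalizes each source's scores by its own top hit before merging, which is one simple assumption among several possible choices; `federated_search` and the two fake sources are hypothetical.

```python
from typing import Callable

Search = Callable[[str], list[tuple[str, float]]]  # returns (doc_id, raw_score) pairs

def federated_search(query: str, sources: dict[str, Search],
                     top_k: int = 3) -> list[tuple[str, str]]:
    merged: list[tuple[float, str, str]] = []
    for name, search in sources.items():
        hits = search(query)
        if not hits:
            continue
        # Normalize by the source's best score so different scales are comparable.
        top = max(score for _, score in hits)
        merged += [(score / top if top else 0.0, name, doc_id) for doc_id, score in hits]
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return [(name, doc_id) for _, name, doc_id in merged[:top_k]]

SOURCES = {
    "wiki": lambda q: [("w1", 10.0), ("w2", 5.0)],     # e.g. BM25-style scores
    "tickets": lambda q: [("t1", 0.8), ("t2", 0.6)],   # e.g. cosine similarities
}
print(federated_search("login error", SOURCES))
```

Keeping the source name in each result lets the generator cite where a passage came from.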
15. Streaming RAG
Description: Real-time retrieval and generation for continuous data
Features: Incremental updates, real-time indexing, streaming responses
Components: Stream processing, dynamic indexing, real-time APIs
Use Case: Live data feeds, real-time monitoring, continuous learning
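Incremental indexing means a new or updated record becomes searchable without rebuilding the index. The sketch uses an in-memory inverted index as a stand-in for a real-time vector or search index, and a generator to mimic a streamed response; `StreamingIndex` and `stream_answer` are illustrative names.

```python
from collections import defaultdict
from typing import Iterator

class StreamingIndex:
    def __init__(self) -> None:
        self.postings: dict[str, set[str]] = defaultdict(set)
        self.docs: dict[str, str] = {}

    def add(self, doc_id: str, text: str) -> None:
        # Remove stale postings first so that updates replace the old version.
        for term in set(self.docs.get(doc_id, "").lower().split()):
            self.postings[term].discard(doc_id)
        self.docs[doc_id] = text
        for term in set(text.lower().split()):
            self.postings[term].add(doc_id)

    def search(self, query: str) -> list[str]:
        ids: set[str] = set()
        for term in query.lower().split():
            ids |= self.postings.get(term, set())
        return sorted(ids)

def stream_answer(passages: list[str]) -> Iterator[str]:
    # Yield the response word by word, as a streaming API would.
    for passage in passages:
        yield from passage.split()

index = StreamingIndex()
index.add("e1", "cpu usage high")
index.add("e2", "disk usage normal")
print(index.search("usage"))
index.add("e1", "cpu usage normal")  # an update arrives on the stream
print(index.search("high"))
```

Each event is searchable as soon as `add` returns, with no batch re-indexing step.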
Pattern Selection Guide:
Query Complexity: Simple → Basic RAG; Complex → Agentic RAG
Accuracy Requirements: High → Self-RAG, CRAG; Standard → Advanced RAG
Data Types: Text → Basic RAG; Mixed → Multi-Modal RAG
Scale: Small → Basic RAG; Enterprise → Modular/Federated RAG
Real-time Needs: Static → Basic RAG; Dynamic → Streaming/Adaptive RAG
Relationship Importance: Entity-heavy → GraphRAG; Document-focused → Hierarchical RAG
Each pattern addresses specific use cases and requirements, and modern implementations often combine multiple patterns for optimal performance.
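The selection criteria above can be encoded as a lookup table, which makes the point about combining patterns concrete. This is only a restatement of the guide in code; the key names (`query_complexity`, `accuracy`, and so on) are invented for the example, and Basic and Naive RAG are treated as the same pattern.

```python
GUIDE = {
    "query_complexity": {"simple": ["Basic RAG"], "complex": ["Agentic RAG"]},
    "accuracy": {"high": ["Self-RAG", "CRAG"], "standard": ["Advanced RAG"]},
    "data_types": {"text": ["Basic RAG"], "mixed": ["Multi-Modal RAG"]},
    "scale": {"small": ["Basic RAG"], "enterprise": ["Modular RAG", "Federated RAG"]},
    "realtime": {"static": ["Basic RAG"], "dynamic": ["Streaming RAG", "Adaptive RAG"]},
    "relationships": {"entity_heavy": ["GraphRAG"], "document_focused": ["Hierarchical RAG"]},
}

def recommend(**requirements: str) -> list[str]:
    # Collect the patterns for each stated requirement, without duplicates.
    picks: list[str] = []
    for criterion, level in requirements.items():
        for pattern in GUIDE[criterion][level]:
            if pattern not in picks:
                picks.append(pattern)
    return picks

print(recommend(query_complexity="complex", accuracy="high", relationships="entity_heavy"))
```

A requirement set that touches several criteria returns several patterns, which mirrors how real systems mix them.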
1. Basic/Foundational RAG
Primary Paper:
Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., ... & Kiela, D. (2020). "Retrieval-augmented generation for knowledge-intensive NLP tasks." Advances in Neural Information Processing Systems, 33, 9459-9474.
Venue: NeurIPS 2020
Citations: 3,000+ (highly influential)
Related Work (DPR, the dense retriever used in the original RAG model):
Karpukhin, V., Oguz, B., Min, S., Lewis, P., Wu, L., Edunov, S., ... & Yih, W. T. (2020). "Dense passage retrieval for open-domain question answering." Proceedings of EMNLP 2020. arXiv:2004.04906.
2. Self-RAG
Primary Paper:
Asai, A., Wu, Z., Wang, Y., Sil, A., & Hajishirzi, H. (2023). "Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection." International Conference on Learning Representations (ICLR), 2024.
Venue: ICLR 2024 (accepted)
arXiv: arXiv:2310.11511
3. Corrective RAG (CRAG)
Primary Paper:
Yan, S. Q., Gu, J. C., Zhu, Y., & Ling, Z. H. (2024). "Corrective Retrieval Augmented Generation." arXiv preprint arXiv:2401.15884.
Venue: arXiv preprint (2024)
arXiv: arXiv:2401.15884
4. GraphRAG (Academic Foundations)
Core Concept Papers:
Yasunaga, M., Ren, H., Bosselut, A., Liang, P., & Leskovec, J. (2021). "QA-GNN: Reasoning with language models and knowledge graphs for question answering." NAACL-HLT, 2021.
Venue: NAACL 2021
Zhang, Y., Li, Y., Cui, L., Cai, D., Liu, L., Fu, T., ... & Shi, S. (2023). "Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models." arXiv preprint arXiv:2309.01219.
Microsoft's Implementation (Technical Report):
Edge, D., Trinh, H., Cheng, N., Bradley, J., Chao, A., Mody, A., ... & Larson, J. (2024). "From Local to Global: A Graph RAG Approach to Query-Focused Summarization." arXiv preprint arXiv:2404.16130.
5. Fusion/Hybrid RAG
Rank Fusion Foundations:
Cormack, G. V., Clarke, C. L., & Buettcher, S. (2009). "Reciprocal rank fusion outperforms condorcet and individual rank learning methods." Proceedings of the 32nd international ACM SIGIR conference.
Venue: SIGIR 2009
Dense-Sparse Hybrid:
Gao, L., Ma, X., Lin, J., & Callan, J. (2023). "Precise Zero-Shot Dense Retrieval without Relevance Labels." Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics.
Venue: ACL 2023
Lin, S. C., Yang, J. H., & Lin, J. (2020). "Distilling dense representations for ranking using tightly-coupled teachers." arXiv preprint arXiv:2010.11386.
6. Multi-Modal RAG (Emerging Academic Work)
Foundation Papers:
Wei, C., Chen, Y., Chen, H., Hu, H., Zhang, G., Fu, J., Ritter, A., & Chen, W. (2023). "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (introduces the M-BEIR benchmark). arXiv preprint arXiv:2311.17136.
Li, J., Li, D., Xiong, C., & Hoi, S. (2022). "BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation." ICML, 2022.
Venue: ICML 2022
7. Conversational RAG
Academic Foundations:
Qu, C., Yang, L., Qiu, M., Croft, W. B., Zhang, Y., & Iyyer, M. (2019). "BERT with history answer embedding for conversational question answering." Proceedings of the 42nd International ACM SIGIR Conference.
Venue: SIGIR 2019
Anantha, R., Vakulenko, S., Tu, Z., Longpre, S., Pulman, S., & Chappidi, S. (2021). "Open-domain question answering goes conversational via question rewriting." NAACL-HLT, 2021.
Venue: NAACL 2021
NLP: ACL, EMNLP, NAACL, EACL
ML: ICLR, NeurIPS, ICML
IR: SIGIR, ECIR, CIKM, WSDM
AI: AAAI, IJCAI
These papers represent the academic foundation. Many "industry patterns" (like Modular RAG, Agentic RAG) are implementation frameworks described mainly in industry sources such as:
LangChain/LangSmith documentation
LlamaIndex research papers (often arXiv, not peer-reviewed)
OpenAI, Anthropic technical reports
AWS, Google Cloud, Microsoft technical documentation
Agentic RAG is essentially a combination/application of:
Basic RAG (Lewis et al., 2020) - for retrieval mechanism
ReAct (Yao et al., 2022) - for reasoning and action planning
Tool Learning (various papers) - for dynamic tool selection
Multi-step reasoning - from planning literature
Academic Status:
The components are academically established
The specific combination as "Agentic RAG" is primarily industry terminology
Industry Sources:
LangChain/LangGraph documentation
LlamaIndex agent frameworks
OpenAI Assistants API documentation
Anthropic Claude function calling
AWS Bedrock Agents
1. Basic/Naive RAG
AWS Services: Bedrock Knowledge Bases, OpenSearch, Kendra
Implementation: Direct integration with foundation models
Status: Production-ready, well-documented
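A minimal sketch of calling a Bedrock knowledge base from Python is shown below. It assumes the `retrieve_and_generate` operation of the boto3 `bedrock-agent-runtime` client; the knowledge base ID and model ARN are placeholders, and `FakeClient` is a hypothetical stand-in so the example runs without AWS credentials.

```python
def ask_knowledge_base(client, kb_id: str, model_arn: str, question: str) -> str:
    # In real use: client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    )
    return response["output"]["text"]

class FakeClient:
    # Hypothetical stand-in that mimics only the response fields used above.
    def retrieve_and_generate(self, **kwargs):
        self.last_request = kwargs
        return {"output": {"text": f"answer to: {kwargs['input']['text']}"}}

client = FakeClient()
print(ask_knowledge_base(client, "KB_ID_PLACEHOLDER", "MODEL_ARN_PLACEHOLDER", "What is RAG?"))
```

Passing the client in as a parameter keeps the function easy to unit test, as the fake client shows.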
2. Advanced RAG
AWS Services: Bedrock Knowledge Bases with chunking strategies, Kendra intelligent ranking
Features: Query preprocessing, result re-ranking, metadata filtering
Status: Supported through configuration options
3. Agentic RAG
AWS Services: Bedrock Agents (dedicated service)
Capabilities: Function calling, tool orchestration, multi-step reasoning
Integration: Lambda functions, API Gateway, other AWS services
Status: Native support, actively developed
4. Multi-Modal RAG
AWS Services: Bedrock (Claude 3 multimodal models), Rekognition, Textract
Support: Text + image retrieval and generation
Status: Supported through multimodal foundation models
5. Conversational RAG
AWS Services: Bedrock with conversation memory, DynamoDB for session storage
Features: Context preservation, multi-turn conversations
Status: Supported through application design patterns
6. Self-RAG
Implementation: Custom logic using Bedrock APIs + Lambda
Components: Retrieval quality assessment, response verification
Status: Possible but requires significant custom development
7. Corrective RAG (CRAG)
Implementation: Bedrock + custom orchestration + web search APIs
Components: Quality assessment, fallback to web search
Status: Achievable through multi-service architecture
8. GraphRAG
AWS Services: Neptune (graph database) + Bedrock
Implementation: Custom integration between graph queries and LLM generation
Status: Technically possible, limited native support
9. Fusion/Hybrid RAG
AWS Services: OpenSearch (BM25 keyword and k-NN vector search) + Kendra (managed semantic search) + custom ranking
Implementation: Multi-retriever setup with result fusion
Status: Achievable through architecture design
10. Hierarchical RAG
Implementation: Custom chunking strategies in Bedrock Knowledge Bases
Features: Document structure preservation, multi-level retrieval
Status: Limited native support, mostly custom implementation
11. Modular RAG
Status: Framework concept, not a specific AWS service feature
Implementation: Achievable through microservices architecture
12. Adaptive RAG
Implementation: Custom routing logic + multiple Bedrock configurations
Status: Requires significant custom orchestration
13. Federated RAG
Implementation: Cross-account/cross-region custom setup
Status: Architecturally possible but complex
14. Temporal RAG
Implementation: Custom time-aware indexing + metadata filtering
Status: Limited native temporal awareness
15. Streaming RAG
AWS Services: Kinesis + Lambda + Bedrock for real-time processing
Status: Possible through event-driven architecture
Bedrock Knowledge Bases:
Vector storage (OpenSearch, Pinecone, Redis)
Automatic chunking and embedding
Metadata filtering and hybrid search
Integration with S3, SharePoint, Confluence, Salesforce
Bedrock Agents:
Function Calling: Lambda integration
Action Groups: Custom tool definitions
Memory: Conversation persistence
Orchestration: Multi-step reasoning and planning
Guardrails: Safety and content filtering
Foundation Models on Bedrock:
Anthropic Claude: Text and multimodal
Amazon Titan: Text embeddings and generation
Cohere: Text generation and embeddings
Meta Llama: Text generation
Stability AI: Image generation
AWS Building Blocks by Pattern:
Basic RAG: Bedrock Knowledge Bases + Foundation Models
Agentic RAG: Bedrock Agents + Lambda functions
Conversational RAG: Bedrock + DynamoDB + API Gateway
Hybrid RAG: Kendra + OpenSearch + custom fusion logic
Multi-Modal RAG: Bedrock multimodal models + S3 + Rekognition
Self-RAG: Custom orchestration with quality assessment
1. Basic/Naive RAG
Paper: "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" (Lewis et al., 2020, NeurIPS)
Status: Foundational paper, highly cited (3,000+ citations)
2. Self-RAG
Paper: "Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection" (Asai et al., 2023, ICLR 2024)
Status: Peer-reviewed, significant impact
3. Corrective RAG (CRAG)
Paper: "Corrective Retrieval Augmented Generation" (Yan et al., 2024, arXiv:2401.15884)
Status: arXiv preprint
4. GraphRAG
Papers: Multiple works, including QA-GNN (Yasunaga et al., 2021) and Microsoft's GraphRAG technical report (Edge et al., 2024)
Status: Mixed - concept is established, specific implementations vary
5. Fusion/Hybrid RAG
Papers: "Precise Zero-Shot Dense Retrieval without Relevance Labels" (Gao et al., 2023) and related fusion techniques
Status: Rank fusion methods well-established in IR literature
6. Modular RAG - More of an engineering pattern from LangChain, LlamaIndex
7. Agentic RAG - Emerging from agent frameworks, not formalized academically yet
8. Hierarchical RAG - Implementation pattern, limited formal research
9. Adaptive RAG - Engineering optimization, some academic work emerging
10. Multi-Modal RAG - Active research area but fragmented approaches
Others - Mostly industry terminology or implementation patterns
ACL, EMNLP, NAACL - Main NLP venues
ICLR, NeurIPS, ICML - ML conferences with RAG research
SIGIR, ECIR - Information retrieval conferences
arXiv - Preprints (not peer-reviewed but influential)