author     Aditya <bluenerd@protonmail.com>  2025-03-01 09:27:13 +0530
committer  Aditya <bluenerd@protonmail.com>  2025-03-01 09:27:13 +0530
commit     dae35f43ed65196949bae6d46028370f7ed28549 (patch)
tree       38b7bf6c09ac6222f9173d905668db7b684e3b62
parent     2d6ee2589c06007a0f24527c38158a54377209b9 (diff)
add literature review
-rw-r--r--  literature review.ods                                                         bin  0 -> 48861 bytes
-rw-r--r--  literature review1.ods                                                        bin  0 -> 52479 bytes
-rw-r--r--  papers/Retrieval-Augmented Generation for AI-Generated Content A Survey.pdf   bin  0 -> 4281893 bytes
-rw-r--r--  table.html                                                                    795
4 files changed, 795 insertions, 0 deletions
diff --git a/literature review.ods b/literature review.ods
new file mode 100644
index 0000000..5888c2c
--- /dev/null
+++ b/literature review.ods
Binary files differ
diff --git a/literature review1.ods b/literature review1.ods
new file mode 100644
index 0000000..b2eddf9
--- /dev/null
+++ b/literature review1.ods
Binary files differ
diff --git a/papers/Retrieval-Augmented Generation for AI-Generated Content A Survey.pdf b/papers/Retrieval-Augmented Generation for AI-Generated Content A Survey.pdf
new file mode 100644
index 0000000..1ce4e36
--- /dev/null
+++ b/papers/Retrieval-Augmented Generation for AI-Generated Content A Survey.pdf
Binary files differ
diff --git a/table.html b/table.html
new file mode 100644
index 0000000..19e4e78
--- /dev/null
+++ b/table.html
@@ -0,0 +1,795 @@
+<table border="1" cellspacing="0" cellpadding="5">
+  <thead>
+    <tr>
+      <th>Paper Title</th>
+      <th>Approach</th>
+      <th>Datasets Used</th>
+      <th>Results</th>
+      <th>Key Contributions</th>
+    </tr>
+  </thead>
+  <tbody>
+    <!-- Paper 1 -->
+    <tr>
+      <td>
+        <ul>
+          <li>A Neural Corpus Indexer for Document Retrieval (Wang et al., 2022)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>End-to-end seq2seq network with a Prefix-Aware Weight-Adaptive (PAWA) decoder</li>
+          <li>Query generation network and hierarchical k-means indexing</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>NQ320k</li>
+          <li>TriviaQA</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>+21.4% relative improvement in Recall@1 on NQ320k</li>
+          <li>+16.8% improvement in R-Precision on TriviaQA</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Unifies training and indexing</li>
+          <li>Introduces a novel decoder and realistic query–document pair generation for enhanced retrieval performance</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 2 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Active Retrieval Augmented Generation (Jiang et al., 2023)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Dynamic, iterative retrieval integrated into generation (FLARE)</li>
+          <li>Detects low-confidence tokens and retrieves additional context</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Knowledge-intensive tasks (e.g., multi-hop QA, open-domain summarization)</li>
+          <li>Specific datasets not detailed</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Significant performance improvements in complex, long-form generation tasks</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Introduces a forward-looking, active retrieval mechanism</li>
+          <li>Moves beyond static, single-time retrieval methods</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 3 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Atlas: Few-shot Learning with Retrieval Augmented Language Models (Izacard et al., 2022)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Dual-encoder retrieval combined with a sequence-to-sequence generator</li>
+          <li>Joint pre-training of both components</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Natural Questions</li>
+          <li>MMLU, KILT benchmarks</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Over 42% accuracy on Natural Questions with only 64 training examples</li>
+          <li>Outperforms much larger models (e.g., the 540B-parameter PaLM) by 3%</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Demonstrates effective few-shot learning with minimal data</li>
+          <li>Offers an adaptable document index</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 4 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Benchmarking Large Language Models in Retrieval-Augmented Generation (Chen et al., 2024)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Evaluation benchmark (RGB) assessing how LLMs (e.g., ChatGPT, ChatGLM, Vicuna) use retrieved evidence</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Evaluation tasks in English and Chinese under varying noise conditions</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Accuracy drops under noise: e.g., ChatGPT falls from 96.33% to 76%</li>
+          <li>Multi-document integration remains challenging (accuracy drops to 43–55%)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Provides a rigorous benchmark for RAG settings</li>
+          <li>Highlights error detection and rejection behaviors in LLMs</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 5 -->
+    <tr>
+      <td>
+        <ul>
+          <li>C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models (Kang et al., 2024)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Conformal risk analysis to certify generation risks</li>
+          <li>Establishes an upper bound (“conformal generation risk”)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>AESLC</li>
+          <li>CommonGen, DART, E2E</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Consistently lower conformal generation risks compared to non-retrieval models</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Extends conformal prediction methods to RAG</li>
+          <li>Provides a framework for risk certification in generation</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 6 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Can Knowledge Graphs Reduce Hallucinations in LLMs? A Survey (Agrawal et al., 2024)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Survey categorizing KG integration methods into:
+            <ul>
+              <li>Knowledge-aware inference</li>
+              <li>Knowledge-aware learning</li>
+              <li>Knowledge-aware validation</li>
+            </ul>
+          </li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Aggregated studies across multiple tasks</li>
+          <li>No single dataset</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Up to 80% enhancement in answer correctness in certain settings</li>
+          <li>Improved chain-of-thought reasoning</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Comprehensively categorizes KG-based augmentation methods</li>
+          <li>Addresses hallucination reduction in LLMs</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 7 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Dense Passage Retrieval for Open-Domain Question Answering (Karpukhin et al., 2020)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Dual-encoder dense vector representations for semantic matching</li>
+          <li>Utilizes in-batch negative training</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Natural Questions</li>
+          <li>Other open-domain QA benchmarks</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Top-20 accuracy of 78.4% on Natural Questions (vs. 59.1% for BM25)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Introduces dense retrieval techniques</li>
+          <li>Significantly improves semantic matching in QA systems</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 8 -->
+    <tr>
+      <td>
+        <ul>
+          <li>DocPrompting: Generating Code by Retrieving the Docs (Zhou et al., 2023)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Retrieval of documentation to guide code generation</li>
+          <li>Focuses on documentation rather than NL–code pairs</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>CoNaLa (Python)</li>
+          <li>Curated Bash dataset</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>52% relative gain in pass@1</li>
+          <li>30% relative gain in pass@10 on CoNaLa</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Highlights the importance of documentation retrieval</li>
+          <li>Boosts code generation accuracy and generalization</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 9 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Document Language Models, Query Models, and Risk Minimization for Information Retrieval<br>(Ponte & Croft, 1998; Berger & Lafferty, 1999; Lafferty & Zhai, 2001)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Combines unigram language models</li>
+          <li>Statistical translation methods</li>
+          <li>Markov chain query models and Bayesian risk minimization</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>TREC collections</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Significant improvements over traditional vector space models</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Laid the foundation for integrating document language models, query models, and risk minimization</li>
+          <li>Influenced modern retrieval methods</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 10 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Evaluating Retrieval Quality in Retrieval-Augmented Generation (Salemi & Zamani, 2024)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>eRAG: uses the downstream LLM to generate document-level relevance labels</li>
+          <li>Labels correlate with downstream performance</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Various downstream RAG tasks</li>
+          <li>Exact datasets not specified</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Kendall’s tau increased from 0.168 to 0.494</li>
+          <li>Up to 50× less memory and a 2.468× speedup</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Proposes a novel evaluation metric aligning retrieval quality with end-task performance</li>
+          <li>Reduces computational overhead</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 11 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Fine Tuning vs. Retrieval Augmented Generation for Less Popular Knowledge (Soudani et al., 2024)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Comparative analysis of fine-tuning (FT) versus RAG</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Not explicitly specified</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>RAG achieves higher accuracy for low-frequency entities</li>
+          <li>Hybrid FT+RAG yields the best results for smaller models</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Highlights benefits of RAG over traditional fine-tuning</li>
+          <li>Effective for less popular or emerging knowledge</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 12 -->
+    <tr>
+      <td>
+        <ul>
+          <li>How Much Knowledge Can You Pack Into the Parameters of a Language Model? (Roberts et al., 2020)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Fine-tuning with salient span masking (SSM) as a pre-training objective</li>
+          <li>Applied to open-domain QA</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Natural Questions</li>
+          <li>WebQuestions</li>
+          <li>TriviaQA</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Larger models outperform smaller ones</li>
+          <li>Significant performance gains with SSM</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Contrasts closed-book vs. open-book QA</li>
+          <li>Demonstrates task-specific pre-training benefits</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 13 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Learning Transferable Visual Models From Natural Language Supervision<br>(Radford et al., 2021; Brown et al., 2020; Deng et al., 2009)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Contrastive Language-Image Pre-training (CLIP)</li>
+          <li>Joint image and text encoders</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>400M (image, text) pairs</li>
+          <li>Evaluated on ImageNet and other benchmarks</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Competitive zero-shot performance on ImageNet</li>
+          <li>Robust to natural distribution shifts</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Bridges visual and textual modalities</li>
+          <li>Enables transferable visual representations via contrastive learning</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 14 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering<br>(Izacard & Grave, 2021; Roberts et al., 2020)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Fusion-in-Decoder: independently encodes multiple passages</li>
+          <li>Aggregates evidence in the decoder</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Natural Questions</li>
+          <li>TriviaQA</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>State-of-the-art Exact Match scores</li>
+          <li>Performance scales with more retrieved passages</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Combines retrieval with generation to synthesize evidence</li>
+          <li>Improves open-domain QA accuracy</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 15 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Precise Zero-Shot Dense Retrieval without Relevance Labels (Gao et al., 2023)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>HyDE: a two-step process</li>
+          <li>Generates hypothetical documents using instruction-following LMs</li>
+          <li>Applies unsupervised contrastive encoding</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Various tasks: web search, QA, fact verification (multi-language settings)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Outperforms existing unsupervised dense retrieval models</li>
+          <li>Competitive with fine-tuned models</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Enables effective zero-shot retrieval without explicit relevance labels</li>
+          <li>Leverages hypothetical document generation</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 16 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Re2G: Retrieve, Rerank, Generate (Lewis et al., 2020; Guu et al., 2020)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Integrated framework combining retrieval, reranking, and generation</li>
+          <li>Uses knowledge distillation</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Various tasks (exact datasets not specified)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Enhanced selection of relevant passages</li>
+          <li>Improved overall performance across tasks</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Unifies retrieval, reranking, and generation in an end-to-end framework</li>
+          <li>Improves evidence selection</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 17 -->
+    <tr>
+      <td>
+        <ul>
+          <li>REALM: Retrieval-Augmented Language Model Pre-Training<br>(Guu et al., 2020; Devlin et al., 2018; Raffel et al., 2019)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Two-step process: retrieval followed by masked language model prediction</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Natural Questions</li>
+          <li>WebQuestions</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>4–16% absolute accuracy improvements on open-domain QA benchmarks</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Integrates retrieval into pre-training</li>
+          <li>Enhances prediction accuracy and model interpretability</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 18 -->
+    <tr>
+      <td>
+        <ul>
+          <li>REST: Retrieval-Based Speculative Decoding<br>(He et al., 2024; Miao et al., 2023; Chen et al., 2023)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Uses a non-parametric retrieval datastore to construct draft tokens for speculative decoding</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>HumanEval</li>
+          <li>MT-Bench</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>1.62× to 2.36× speedup in token generation compared to standard autoregressive decoding</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Improves generation speed without additional training</li>
+          <li>Allows seamless integration with various LLMs</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 19 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Retrieval Augmentation Reduces Hallucination in Conversation<br>(Roller et al., 2021; Maynez et al., 2020; Lewis et al., 2020b; Shuster et al., 2021; Dinan et al., 2019b; Zhou et al., 2021)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Integration of retrieval mechanisms into dialogue systems</li>
+          <li>Fetches relevant documents for improved factuality</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Knowledge-grounded conversational datasets</li>
+          <li>Specific names not provided</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Reduced hallucination rates</li>
+          <li>Higher factual accuracy compared to standard models</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Demonstrates more reliable, factually grounded conversational responses</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 20 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Retrieval Augmented Code Generation and Summarization<br>(Parvez et al., 2021; Karpukhin et al., 2020; Feng et al., 2020; Guo et al., 2021; Ahmad et al., 2021)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>REDCODER framework: a two-step process combining retrieval with generation</li>
+          <li>Uses pre-trained code models</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Code generation benchmarks (e.g., CoNaLa, CodeXGLUE)</li>
+          <li>Evaluated via BLEU, Exact Match, CodeBLEU</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Significant improvements in BLEU, Exact Match, and CodeBLEU scores</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Enhances code generation and summarization</li>
+          <li>Effectively retrieves relevant code snippets and integrates pre-trained models</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 21 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks<br>(Lewis et al., 2020; Karpukhin et al., 2020; Guu et al., 2020; Petroni et al., 2019)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Combines retrieval with generative modeling to synthesize external knowledge</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Various knowledge-intensive benchmarks (e.g., Natural Questions, TriviaQA)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Outperforms extractive and closed-book models in accuracy and robustness</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Balances internal model knowledge with external retrieval</li>
+          <li>Provides accurate and comprehensive answers</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 22 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Retrieval-Enhanced Machine Learning (Zamani et al., 2022)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>REML framework: joint optimization of a prediction model and a retrieval model</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Applied in domain adaptation and few-shot learning scenarios</li>
+          <li>Datasets not specified</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Improves model generalization, scalability, and interpretability</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Offloads memorization to a retrieval system</li>
+          <li>Supports dynamic updates to knowledge bases</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 23 -->
+    <tr>
+      <td>
+        <ul>
+          <li>Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection (Asai et al., 2024)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Adaptive retrieval with self-reflection using “reflection tokens”</li>
+          <li>Structured self-critique</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Open-domain QA, reasoning, and fact verification tasks</li>
+          <li>Specific datasets not provided</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Outperforms state-of-the-art models in factuality and citation accuracy</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Introduces self-critique into the RAG pipeline</li>
+          <li>Enables adaptive retrieval and improved output reliability</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 24 -->
+    <tr>
+      <td>
+        <ul>
+          <li>The Probabilistic Relevance Framework: BM25 and Beyond<br>(Robertson & Sparck Jones, 1977; Robertson et al., 1994; Robertson & Zaragoza, 2009; Sparck Jones et al., 2000)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Probabilistic relevance modeling using term frequency, inverse document frequency, and document length normalization</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>TREC collections and other ad-hoc retrieval tasks</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Demonstrated robust performance as a benchmark for relevance estimation</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Provides the theoretical foundation for modern IR systems</li>
+          <li>Basis for the widely adopted BM25 scoring function</li>
+        </ul>
+      </td>
+    </tr>
+    <!-- Paper 25 -->
+    <tr>
+      <td>
+        <ul>
+          <li>TIARA: Multi-grained Retrieval for Robust Question Answering over Large Knowledge Base<br>(Gu et al., 2021; Ye et al., 2021; Chen et al., 2021; Raffel et al., 2020; Devlin et al., 2019)</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Multi-grained retrieval integrating entity, exemplary logical form, and schema retrieval</li>
+          <li>Uses constrained decoding</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>GrailQA</li>
+          <li>WebQuestionsSP</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Significant improvements in compositional and zero-shot generalization</li>
+          <li>Outperforms previous methods</li>
+        </ul>
+      </td>
+      <td>
+        <ul>
+          <li>Addresses KBQA challenges by retrieving multiple granularities of context</li>
+          <li>Enhances accuracy and reliability of logical form generation</li>
+        </ul>
+      </td>
+    </tr>
+  </tbody>
+</table>
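
A few of the approaches summarized in table.html are concrete enough that a short sketch pins down what the row means. First, the forward-looking retrieval loop of FLARE (Paper 2): generate a tentative sentence, and if any of its tokens fall below a confidence threshold, retrieve with that sentence as the query and regenerate it. The sketch below only captures this control flow; `retrieve` and `lm_step` are hypothetical stand-ins for a retriever and for one sentence of LM decoding that also reports its minimum token probability, and the threshold value is illustrative, not taken from the paper.

```python
def flare_generate(question, retrieve, lm_step, threshold=0.8, max_sents=8):
    """FLARE-style active retrieval (sketch): generate a tentative sentence;
    if any token is low-confidence, retrieve with that sentence as the query
    and regenerate it using the fresh context."""
    context = retrieve(question)            # initial retrieval with the question
    answer = []
    for _ in range(max_sents):
        sentence, min_token_prob = lm_step(question, context, answer)
        if sentence is not None and min_token_prob < threshold:
            context = retrieve(sentence)    # forward-looking query
            sentence, _ = lm_step(question, context, answer)
        if sentence is None:                # the model chose to stop
            break
        answer.append(sentence)
    return " ".join(answer)

# Toy stand-ins so the sketch runs end to end.
steps = iter([
    ("RAG interleaves retrieval with generation.", 0.95),
    ("FLARE retrieves again when token confidence drops.", 0.60),  # triggers retrieval
    ("With fresh context the sentence is regenerated confidently.", 0.97),
    (None, 1.0),
])
print(flare_generate("what is FLARE?",
                     retrieve=lambda q: [f"passage about {q}"],
                     lm_step=lambda q, ctx, ans: next(steps)))
```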
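The “in-batch negative training” of DPR (Paper 7) is also compact enough to show directly: with B matched query–passage pairs per batch, each query treats the other B−1 passages as negatives, and the loss is cross-entropy over the dot-product similarity matrix with the positives on the diagonal. A minimal NumPy sketch, with random vectors standing in for real encoder outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
B, dim = 4, 8                     # batch size, embedding dimension
q = rng.normal(size=(B, dim))     # query embeddings (stand-ins for encoder output)
p = rng.normal(size=(B, dim))     # passage embeddings; p[i] is the positive for q[i]

sim = q @ p.T                                   # (B, B) dot-product similarities
sim = sim - sim.max(axis=1, keepdims=True)      # shift rows for numerical stability
log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))  # row log-softmax
loss = -np.mean(np.diag(log_probs))             # NLL of the positive on the diagonal
print(f"in-batch negative loss: {loss:.4f}")
```

The design point is that the B×B similarity matrix is computed anyway, so each batch yields B·(B−1) negatives essentially for free.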
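HyDE (Paper 15) is a two-step recipe that a short sketch makes concrete: embed a generated hypothetical answer rather than the query itself, then retrieve real documents near that embedding. Both functions below are hypothetical stand-ins (the paper uses an instruction-following LM and an unsupervised contrastive encoder such as Contriever); only the control flow is the point.

```python
import numpy as np

def generate_hypothetical_doc(query: str) -> str:
    """Stand-in for an instruction-following LM prompted to 'write a passage
    that answers the query' (hallucinated details are acceptable here)."""
    return f"A passage that plausibly answers: {query}"

def encode(text: str, dim: int = 16) -> np.ndarray:
    """Stand-in for an unsupervised contrastive encoder: a random projection
    keyed on the text (deterministic within a run), purely for illustration."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

def hyde_retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    hypo = generate_hypothetical_doc(query)   # step 1: hypothetical document
    qvec = encode(hypo)                       # step 2: encode it, not the query
    return sorted(corpus, key=lambda d: -float(encode(d) @ qvec))[:k]

corpus = ["notes on dense retrieval", "notes on BM25", "notes on speculative decoding"]
print(hyde_retrieve("how does dense retrieval work?", corpus))
```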
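Finally, Paper 24 describes the probabilistic relevance framework only in words (term frequency, inverse document frequency, document length normalization). As a concrete reference, here is a minimal, self-contained sketch of the Okapi BM25 scoring function those components combine into; the toy corpus and the k1/b defaults are common illustrative choices, not values taken from the cited papers.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 over whitespace-tokenized docs (k1, b: common defaults)."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    terms = query.lower().split()
    # Document frequency of each query term.
    df = {t: sum(1 for d in tokenized if t in d) for t in set(terms)}
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for t in terms:
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)  # smoothed IDF
            denom = tf[t] + k1 * (1 - b + b * len(d) / avgdl)      # length normalization
            score += idf * tf[t] * (k1 + 1) / denom                # TF saturation
        scores.append(score)
    return scores

docs = ["retrieval augmented generation survey",
        "dense passage retrieval for open-domain question answering",
        "speculative decoding with a retrieval datastore"]
print(bm25_scores("dense retrieval", docs))
```

Here k1 controls how quickly repeated term occurrences saturate and b controls how strongly long documents are penalized; tuning them per collection is part of the “and beyond” the title refers to.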