
LitLLMs, LLMs for Literature Review: Are We There Yet?

APA

Sahu, G. (2025). LitLLMs, LLMs for Literature Review: Are We There Yet? Perimeter Institute for Theoretical Physics. https://pirsa.org/25040076

MLA

Sahu, Gaurav. LitLLMs, LLMs for Literature Review: Are We There Yet? Perimeter Institute for Theoretical Physics, 8 Apr. 2025, https://pirsa.org/25040076.

BibTeX

          @misc{scivideos_PIRSA:25040076,
            doi = {10.48660/25040076},
            url = {https://pirsa.org/25040076},
            author = {Sahu, Gaurav},
            keywords = {},
            language = {en},
            title = {LitLLMs, LLMs for Literature Review: Are We There Yet?},
            publisher = {Perimeter Institute for Theoretical Physics},
            year = {2025},
            month = {apr},
            note = {PIRSA:25040076, see \url{https://scivideos.org/index.php/pirsa/25040076}}
          }
          

Gaurav Sahu, Mila - Quebec Artificial Intelligence Institute

Talk number: PIRSA:25040076
Source repository: PIRSA
Talk type: Conference

Abstract

Literature reviews are an essential component of scientific research, but they remain time-intensive and challenging to write, especially given the recent influx of research papers. In this talk, we will explore the zero-shot abilities of recent Large Language Models (LLMs) in assisting with the writing of literature reviews, starting from a paper's abstract. We will decompose the task into two components: (1) retrieving related works given a query abstract, and (2) writing a literature review based on the retrieved results, and analyze how effective LLMs are at each. For retrieval, we will discuss a novel two-step search strategy that first uses an LLM to extract meaningful keywords from the abstract of a paper and then retrieves potentially relevant papers by querying an external knowledge base. Additionally, we will study a prompting-based re-ranking mechanism with attribution and show that re-ranking doubles the normalized recall compared to naive search methods, while providing insights into the LLM's decision-making process. We will then discuss the two-step generation phase, which first outlines a plan for the review and then executes the steps of that plan to generate the actual review. To evaluate different LLM-based literature review methods, we create test sets from arXiv papers using a protocol designed for rolling use with newly released LLMs, avoiding test-set contamination in zero-shot evaluations. We will also see a quick demo of LitLLM in action towards the end.
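
To make the retrieval stage concrete, here is a minimal sketch of the two-step search strategy described in the abstract. The function names (extract_keywords, retrieve_candidates), the prompt wording, and the use of the Semantic Scholar Graph API as the external knowledge base are illustrative assumptions rather than the talk's exact setup; llm stands for any callable that sends a prompt to a chat model and returns its text reply.

import requests

# Assumption for illustration: the Semantic Scholar Graph API serves as the
# external knowledge base queried with the LLM-extracted keywords.
S2_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def extract_keywords(query_abstract: str, llm) -> str:
    """Step 1: ask the LLM to distill the abstract into a short keyword query."""
    prompt = (
        "Extract a concise search query (3-6 keywords) that captures the main "
        "topic of the following abstract:\n\n" + query_abstract
    )
    return llm(prompt).strip()


def retrieve_candidates(keywords: str, limit: int = 20) -> list[dict]:
    """Step 2: query the external knowledge base with the extracted keywords."""
    resp = requests.get(
        S2_SEARCH_URL,
        params={"query": keywords, "fields": "title,abstract,year", "limit": limit},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("data", [])

A typical call would chain the two steps, e.g. candidates = retrieve_candidates(extract_keywords(abstract, llm)).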
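
The prompting-based re-ranking with attribution can likewise be sketched as a single LLM call that reorders the retrieved candidates and justifies each placement. The prompt text and JSON output schema below are assumptions made for illustration; the actual prompts used in the talk may differ.

import json

RERANK_PROMPT = """You are given a query abstract and a numbered list of candidate papers.
Rank the candidates from most to least relevant to the query abstract.
For every candidate, give a one-sentence justification (attribution) for its rank.
Answer only with a JSON list of objects with fields "index", "rank", and "reason".

Query abstract:
{query}

Candidates:
{candidates}
"""


def rerank_with_attribution(query_abstract: str, candidates: list[dict], llm):
    """Reorder retrieved papers with an LLM and keep its stated reasons."""
    numbered = "\n".join(
        f"[{i}] {c['title']}: {c.get('abstract') or ''}" for i, c in enumerate(candidates)
    )
    raw = llm(RERANK_PROMPT.format(query=query_abstract, candidates=numbered))
    ranking = json.loads(raw)  # assumes the model followed the JSON-only instruction
    ranking.sort(key=lambda item: item["rank"])
    return [(candidates[item["index"]], item["reason"]) for item in ranking]

Keeping the per-paper reasons alongside the new order is what provides the attribution, i.e. visibility into the LLM's decision-making.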
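
The two-step generation phase follows the same plan-then-execute pattern. The sketch below first asks for a numbered outline that assigns citations to points, then asks the model to expand that outline into prose; generate_review and the prompt wording are hypothetical names chosen for this example.

def generate_review(query_abstract: str, ranked_papers: list[tuple[dict, str]], llm):
    """Two-step generation: draft an outline (plan), then expand it into the review."""
    papers_block = "\n".join(
        f"[{i + 1}] {paper['title']}: {paper.get('abstract') or ''}"
        for i, (paper, _reason) in enumerate(ranked_papers)
    )
    # Step 1: plan. Ask for a short numbered outline that assigns citations to points.
    plan = llm(
        "Write a numbered outline for the related-work section of a paper with the "
        "abstract below, citing the listed papers by their [number].\n\n"
        f"Abstract:\n{query_abstract}\n\nPapers:\n{papers_block}"
    )
    # Step 2: execute. Expand each outline point into prose, keeping the citations.
    review = llm(
        "Following this outline exactly, write the related-work section in full prose. "
        "Keep the [number] citations used in the outline.\n\n"
        f"Outline:\n{plan}\n\nAbstract:\n{query_abstract}\n\nPapers:\n{papers_block}"
    )
    return plan, review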
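
Finally, the rolling evaluation protocol rests on one core idea: only papers that appeared after a model's training-data cutoff can serve as uncontaminated zero-shot test items. The abstract does not spell out the selection details, so the following is a minimal sketch under an assumed metadata schema.

from datetime import date


def build_rolling_test_set(papers: list[dict], model_cutoff: date, size: int = 100) -> list[dict]:
    """Keep only arXiv papers submitted after the LLM's training-data cutoff,
    so zero-shot evaluation cannot be contaminated by memorized test items.

    `papers` is assumed to be a list of dicts with at least a 'submitted'
    date and an 'abstract' field (hypothetical schema for this sketch).
    """
    fresh = [p for p in papers if p["submitted"] > model_cutoff and p.get("abstract")]
    fresh.sort(key=lambda p: p["submitted"], reverse=True)
    return fresh[:size]

Because the filter is just a date comparison, the same procedure can be re-run whenever a newly released LLM (with a later cutoff) needs a fresh, uncontaminated test set.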