Large Language Models for Accelerating Scientific Research Work in Industry: RAG and Agents
Abstract
Large Language Models (LLMs) are changing how scientific work is planned, carried out, and communicated, especially in the life sciences, drug development, and pharmaceuticals. This paper focuses on what is actually working in practice: how teams use LLMs to navigate scientific literature, extract relevant information from clinical or research text, and support decision-making in drug discovery. We break down the core components behind these systems, including retrieval-augmented generation (RAG), hybrid vector search, and agentic LLM orchestration, and discuss how to measure both the retrieval process and the performance of the entire system. We also explore newer directions in this space, particularly agentic RAGs, in which LLMs do not just answer questions but actively decide what steps to take, using tools they can run when needed. These steps include retrieving papers from sources such as PubMed, arXiv, bioRxiv, or Google Scholar, querying structured data, and chaining together reasoning steps to dig deeper into a problem.
Keywords
Large Language Models (LLMs), retrieval-augmented generation (RAG), scientific research, AI assistant research, agentic RAGs, LLM agents