Posts | Arc AI - Marco Menner

ai research codebase llm rag

Approaches for Efficiently Providing Large Codebases to LLMs

January 6, 2025

This post summarizes key insights from a recent academic paper I co-authored on methods for making large codebases usable by Large Language Models (LLMs). The original paper was written in German and focuses on the technical and structural challenges of applying LLMs in software engineering, especially when working with large repositories.

Overview

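The paper, titled “Ansätze zur effizienten Bereitstellung großer Codebasen für Large Language Models” (in English: "Approaches for Efficiently Providing Large Codebases to Large Language Models"), presents a structured review of current approaches in the field. We focus particularly on Retrieval-Augmented Generation (RAG), chunking strategies, and graph-based code representations as ways to optimize context management, that is, deciding which parts of a repository are placed in the model's context.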
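To make the terms chunking and retrieval more concrete, here is a minimal sketch in Python. It is not taken from the paper: the function names `chunk_by_definition` and `retrieve` are hypothetical, the sample source is invented for illustration, and plain keyword overlap stands in for the embedding similarity a real RAG pipeline would typically use.

```python
import ast
import re


def chunk_by_definition(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno is 1-based and end_lineno is inclusive (Python 3.8+).
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append({"name": node.name, "text": text})
    return chunks


def retrieve(chunks: list[dict], query: str, k: int = 1) -> list[dict]:
    """Return the k chunks sharing the most words with the query.

    Assumption: word overlap is a toy stand-in for embedding similarity.
    """
    query_words = set(re.findall(r"[a-z]+", query.lower()))

    def score(chunk: dict) -> int:
        # Splitting on non-letters also breaks snake_case names into words.
        chunk_words = set(re.findall(r"[a-z]+", chunk["text"].lower()))
        return len(query_words & chunk_words)

    return sorted(chunks, key=score, reverse=True)[:k]


# Invented sample "codebase" used only for this illustration.
EXAMPLE = '''
def load_config(path):
    return open(path).read()

def send_email(recipient, body):
    print("sending", recipient, body)
'''

chunks = chunk_by_definition(EXAMPLE)
top = retrieve(chunks, "send email to recipient")
print(top[0]["name"])  # prints "send_email"
```

Only the retrieved chunk, rather than the whole file, would then be passed to the model. Splitting along syntactic boundaries keeps each chunk a complete function or class, which is the main reason to prefer it over fixed-size text windows in this sketch.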