agent-library-usage

This repository contains the artifacts and full results for the research paper A Study of Library Usage in Agent-Authored Pull Requests, accepted at the 23rd International Conference on Mining Software Repositories (MSR '26), April 13-14, 2026, Rio de Janeiro, Brazil, and available on arXiv.

abstract

Coding agents are becoming increasingly capable of completing end-to-end software engineering workflows that previously required a human developer, including raising pull requests (PRs) to propose their changes. However, we still know little about how these agents use libraries when generating code, a core part of real-world software development. To fill this gap, we study 26,760 agent-authored PRs from the AIDev dataset to examine three questions: how often do agents import libraries, how often do they introduce new dependencies (and with what versioning), and which specific libraries do they choose? We find that agents often import libraries (29.5% of PRs) but rarely add new dependencies (1.3% of PRs); and when they do, they follow strong versioning practices (75.0% specify a version), an improvement on direct LLM usage where versions are rarely mentioned. Generally, agents draw from a surprisingly diverse set of external libraries, contrasting with the limited "library preferences" seen in prior non-agentic LLM studies. Our results offer an early empirical view into how AI coding agents interact with today's software ecosystems.

dataset

This work is part of the MSR 2026 Mining Challenge, analysing the AIDev dataset, the first large-scale, openly available dataset of agent-authored pull requests from real-world GitHub repositories. The dataset was introduced by Li et al. and captures the emergence of autonomous coding agents in software engineering, providing a unique opportunity to study how AI teammates interact with real-world codebases and software ecosystems.

Dataset Version: This research utilises AIDev dataset revision eee0408a277826d88fc0ca5fa07d2fc325c96af1 (November 2025 snapshot).

installation

The code requires Python 3.11 or later. Check which version you have with the command below; if it is missing or older than 3.11, download and install a newer release from the official Python website.

python --version
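The same check can be made from inside Python, which helps when `python` on your PATH is not the interpreter you expect. This is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of this repository.

```python
import sys

# Minimum version stated in this README.
MINIMUM = (3, 11)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


status = "OK" if meets_minimum() else "too old"
print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```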

Now clone the repository code:

git clone https://github.com/itsluketwist/agent-library-usage

Once cloned, install the requirements locally in a virtual environment:

python -m venv .venv

. .venv/bin/activate

pip install .
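If you are unsure whether the virtual environment was active when you ran the install, the interpreter can report it directly. The sketch below uses the standard comparison of `sys.prefix` and `sys.base_prefix`; the function name `in_virtual_environment` is hypothetical and not part of this repository.

```python
import sys


def in_virtual_environment(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python installation it was created from.
    return prefix != base_prefix


print("virtual environment active:", in_virtual_environment())
```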

usage

After installation, all analysis is run through Jupyter notebooks in the notebooks/ directory. Run the notebooks in order:

  1. 01_download_dataset.ipynb - Download and prepare the AIDev dataset
  2. 02_explore_languages.ipynb - Identify programming languages in the dataset
  3. 03_analyze_library_usage.ipynb - Analyse library usage patterns across all languages
  4. 04_generate_latex_tables.ipynb - Generate LaTeX tables for the research paper

Each notebook is self-contained and documents its purpose and outputs.
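The notebooks can also be run non-interactively in the same order. The sketch below is one possible approach, not the authors' workflow: it assumes the `jupyter nbconvert` command is available in the active environment, and the helper names (`ordered_notebooks`, `nbconvert_command`, `run_all`) are illustrative only.

```python
import subprocess
from pathlib import Path


def ordered_notebooks(directory):
    """Return notebooks named like `01_name.ipynb`, sorted by their numeric prefix."""
    return sorted(Path(directory).glob("[0-9][0-9]_*.ipynb"), key=lambda p: p.name)


def nbconvert_command(notebook):
    """Build the command that executes one notebook and saves the outputs in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        str(notebook),
    ]


def run_all(directory="notebooks"):
    """Execute every notebook in order, stopping at the first failure."""
    for notebook in ordered_notebooks(directory):
        print(f"Running {notebook.name}")
        # check=True raises CalledProcessError if a notebook fails to execute.
        subprocess.run(nbconvert_command(notebook), check=True)
```

Calling `run_all()` from the repository root would execute the four notebooks listed above in sequence. Because each step depends on the outputs of the previous one, the sketch stops as soon as a notebook fails.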

structure

development

We use a few extra processes to keep code quality high. First clone the project and create a virtual environment, as described above. Then install the project in editable mode, together with the development dependencies:

pip install --editable ".[dev]"

tests

This project includes unit tests to check that the code behaves correctly. Run them with pytest:

pytest tests

linting

We use pre-commit to lint the code. Run it with:

pre-commit run --all-files

dependencies

We use uv for dependency management. First add any new dependencies to requirements.in, then lock their versions with uv:

uv pip compile requirements.in --output-file requirements.txt --upgrade

citation

If you use this work in your research, please cite our paper:

ACM Reference Format:

Lukas Twist and Jie M. Zhang. 2026. A Study of Library Usage in Agent-Authored Pull Requests. In 23rd International Conference on Mining Software Repositories (MSR '26), April 13-14, 2026, Rio de Janeiro, Brazil. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3793302.3793562

BibTeX:

@inproceedings{twist2026AgentLibraryUsage,
  title = {{A Study of Library Usage in Agent-Authored Pull Requests}},
  author = {Twist, Lukas and Zhang, Jie M.},
  booktitle = {Proceedings of the 23rd International Conference on Mining Software Repositories},
  series = {MSR '26},
  location = {Rio de Janeiro, Brazil},
  year = {2026},
  month = {April},
  publisher = {ACM},
  doi = {10.1145/3793302.3793562},
}

acknowledgments

In a fitting twist of irony, this repository, which analyses how AI coding agents use libraries, was itself developed with assistance from Claude Code, an AI coding agent. All code was thoroughly reviewed and validated by the authors, who remain responsible for the scientific interpretations and conclusions.