Add Rust tool-calling bindings for llama.cpp#977

Open
MegalithOfficial wants to merge 8 commits into utilityai:main from MegalithOfficial:main

Conversation

@MegalithOfficial MegalithOfficial commented Mar 30, 2026

This PR addresses #864 by filling in the missing Rust-side tool-calling pieces that llama.cpp already supports.

It adds:

  • typed tool definitions
  • chat template application with tools
  • typed OpenAI-compatible options and tool choice handling
  • typed parsed responses and streaming deltas
  • JSON schema to grammar conversion helper
  • tests and example updates

The goal here was to make tool calling usable from Rust without forcing everything through raw JSON strings, while still keeping the raw JSON APIs available for users who prefer them.
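As a rough illustration of the shape involved (this is not the PR's actual API; the `Tool` struct and its method are invented for the sketch), a typed tool definition ultimately renders down to the OpenAI-compatible tool JSON that llama.cpp's chat templates expect:

```rust
// Illustrative only: a typed tool definition and the OpenAI-compatible
// JSON it would serialize to. Plain string formatting keeps the sketch
// free of serde; none of these names are the crate's real API.
struct Tool {
    name: &'static str,
    description: &'static str,
    parameters_schema: &'static str, // raw JSON Schema for the arguments
}

impl Tool {
    fn to_json(&self) -> String {
        format!(
            r#"{{"type":"function","function":{{"name":"{}","description":"{}","parameters":{}}}}}"#,
            self.name, self.description, self.parameters_schema
        )
    }
}

fn main() {
    let tool = Tool {
        name: "get_weather",
        description: "Get the current weather for a city",
        parameters_schema:
            r#"{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}"#,
    };
    let json = tool.to_json();
    assert!(json.contains(r#""name":"get_weather""#));
    println!("{json}");
}
```

The point of the typed layer is exactly this: users write the struct, not the nested JSON by hand, while the raw-string path stays available underneath.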

For verification, I ran cargo test -p llama-cpp-2 --lib for the library changes and used cargo check on the updated examples to make sure the typed APIs and example flows still compiled cleanly. I also exercised the reasoning and tool-calling path directly with:

cargo run --release --example tools_reasoning -- --continous hf-model unsloth/Qwen3.5-4B-GGUF Qwen3.5-4B-Q4_K_M.gguf

I also added a --continous flag to the reasoning example so it can take the model’s initial tool call, inject a mock tool result back into the conversation, and then generate the follow-up assistant response in the same run.

@MegalithOfficial MegalithOfficial marked this pull request as draft March 30, 2026 13:17
@MegalithOfficial MegalithOfficial marked this pull request as ready for review March 30, 2026 13:20
Comment on lines +18 to +19
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
Contributor

these are pretty big deps (compile time wise) to drag in by default, put this behind a feature flag.
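One way to address this (a sketch, assuming the feature is simply named `serde`; the PR may have chosen a different gate) is Cargo's optional-dependency syntax, so neither crate compiles unless the user opts in:

```toml
# Hypothetical Cargo.toml fragment: gate the typed helpers behind a
# `serde` feature so the deps stay out of default builds.
[features]
serde = ["dep:serde", "dep:serde_json"]

[dependencies]
serde = { version = "1.0", features = ["derive"], optional = true }
serde_json = { version = "1.0", optional = true }
```

With this layout, `cargo build` stays serde-free and users who want the typed layer enable it with `--features serde`.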

Contributor

MarcusDunn commented Mar 30, 2026

looks good. throw this behind a feature flag (I don't want serde pulled in by default) or better yet, just accept string refs. We don't do anything with the json to justify the dep beyond better types.

Contributor

@MarcusDunn MarcusDunn Mar 30, 2026


I do not think this belongs in this crate. Others can implement provider-specific logic.

Author


Okay. When I first created this, I thought it would be better to provide typed functions on the Rust side instead of making users build and parse JSON strings by hand. This is especially important since the whole reason for this PR was #864 and the underlying llama.cpp support is already there. I agree that this started mixing two layers: the actual missing bindings needed for tool calling, and a more opinionated Rust/OpenAI convenience layer on top.

I've updated the PR to focus on the binding surface needed for #864. The typed serde-based helpers are no longer part of the library API, but the raw JSON/string-based path (passing tools into chat templates, getting prompts and grammar back, and parsing OpenAI-compatible responses) is still there. I also removed the serde/serde_json dependencies from the main crate and updated the examples to use the raw flow instead.
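The raw flow described above can be sketched end to end. Every function here is a placeholder standing in for the real binding calls (none of these names are the crate's actual API); only the order of operations mirrors the PR: tools in, prompt and grammar out, then parse the model's reply.

```rust
// Placeholder sketch of the raw string-based tool-calling flow.
fn render_prompt(messages_json: &str, tools_json: &str) -> String {
    // Real code: apply the model's chat template with the tools attached.
    format!("{messages_json}\n[available tools: {tools_json}]")
}

fn grammar_for(_tools_json: &str) -> String {
    // Real code: convert each tool's JSON schema to a llama.cpp grammar.
    String::from("root ::= tool-call")
}

fn parse_tool_call(raw_reply: &str) -> Option<&str> {
    // Real code: parse the OpenAI-compatible response; here we only
    // check that the reply looks like a JSON object.
    let trimmed = raw_reply.trim();
    trimmed.starts_with('{').then_some(trimmed)
}

fn main() {
    let tools = r#"[{"type":"function","function":{"name":"get_weather"}}]"#;
    let prompt = render_prompt(r#"[{"role":"user","content":"Weather in Paris?"}]"#, tools);
    let grammar = grammar_for(tools);
    let reply = r#"{"name":"get_weather","arguments":{"city":"Paris"}}"#;
    assert!(prompt.contains("available tools"));
    assert!(!grammar.is_empty());
    assert!(parse_tool_call(reply).is_some());
}
```

Because every step takes and returns plain strings, callers can bring their own JSON library (serde_json or otherwise), which is what keeps serde out of the crate's default dependency set.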

@MegalithOfficial
Author

I have updated the READMEs.

@MarcusDunn
Contributor

lgtm. am I correct in thinking this adds very little to the core lib? the main contributions are better docs/examples and the test?
