Add Rust tool-calling bindings for llama.cpp #977
MegalithOfficial wants to merge 8 commits into utilityai:main from
Conversation
llama-cpp-2/Cargo.toml
Outdated
```toml
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
```
these are pretty big deps (compile time wise) to drag in by default, put this behind a feature flag.
looks good. throw this behind a feature flag (I don't want serde pulled in by default) or better yet, just accept string refs. We don't do anything with the json to justify the dep beyond better types.
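For context, the feature-gated layout the reviewers are asking for could look roughly like this in `llama-cpp-2/Cargo.toml`. This is a sketch of the pattern, not the PR's final layout, and the feature name `serde` is illustrative:

```toml
# Hypothetical feature gate; the feature name is illustrative.
[features]
serde = ["dep:serde", "dep:serde_json"]

[dependencies]
# Marked optional so neither dep is compiled unless the feature is enabled.
serde = { version = "1.0", features = ["derive"], optional = true }
serde_json = { version = "1.0", optional = true }
```

Downstream users would then opt in with `llama-cpp-2 = { version = "...", features = ["serde"] }`, and the default build avoids the serde compile-time cost entirely.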
I do not think this belongs in this crate. Others can implement provider-specific logic.
Okay. When I first created this, I thought it would be better to provide typed functions on the Rust side instead of making users build and parse JSON strings by hand. This is especially important since the whole reason for this PR was #864 and the underlying llama.cpp support is already there. I agree that this started mixing two layers: the actual missing bindings needed for tool calling, and a more opinionated Rust/OpenAI convenience layer on top.
I've updated the PR to focus on the binding surface needed for #864. The typed serde-based helpers are no longer in the library API, but the raw JSON/string-based path for passing tools into chat templates, getting prompts and grammar back, and parsing OpenAI-compatible responses is still there. I also removed serde/serde_json as default dependencies of the main crate and updated the examples to use that raw flow instead.
I have updated the READMEs.
lgtm. am I correct in thinking this adds very little to the core lib? the main contributions are better docs/examples and the test?
This PR addresses #864 by filling in the missing Rust-side tool-calling pieces that llama.cpp already supports. It adds the bindings needed to pass tools into chat templates, get prompts and grammar back, and parse OpenAI-compatible responses.
The goal here was to make tool calling usable from Rust without forcing everything through raw JSON strings, while still keeping the raw JSON APIs available for users who prefer them.
For verification, I ran `cargo test -p llama-cpp-2 --lib` for the library changes and used `cargo check` on the updated examples to make sure the typed APIs and example flows still compiled cleanly, and I exercised the reasoning and tool-calling path directly by running the updated example. I also added a `--continous` flag to the reasoning example so it can take the model's initial tool call, inject a mock tool result back into the conversation, and then generate the follow-up assistant response in the same run.