main into nvidia branch #4

Merged
meghsat merged 108 commits into nvidia_benchmark_nvidia-smi from main
Jan 6, 2026
Conversation

meghsat (Owner) commented Jan 6, 2026

No description provided.

jeremyfowers and others added 30 commits November 7, 2025 07:51
* C++: Improve linux CLI experience

* Show error messages on cli failures

* Fix status/stop on linux run

* Clean up the run command

* extend testing
* fix: bypass OpenAI client for chat and completion requests to avoid Pydantic serialization issues

* fix: remove unused json import in wrapped_server.py

* chore: update .gitignore to include Claude, Bruno, and Cursor files

* fix: update OpenAI package version and adjust API calls in wrapped_server.py

* fix: add logprobs parameter to response output in Server class
* add dll copying to path

* upgrade to new rai whl

* update version check
* Rev version numbers for 8.2.1 / 9.0.1-beta release

* Enable the stats endpoint

* Remove unused code

* add prompt tokens to stats

* update docs

* update docs again
…e-sdk#547)

* C++: env vars to override cli defaults

* Enable custom backend arguments for llamacpp

* Update docs
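The "env vars to override cli defaults" commit above describes a common layering: an explicit command-line flag wins, otherwise an environment variable applies, otherwise a built-in default. The actual change is in the project's C++ CLI and is not shown on this page, so the following is only a minimal Python sketch of that precedence. The variable names `EXAMPLE_HOST` and `EXAMPLE_PORT`, the program name, and the default values are all hypothetical.

```python
import argparse
import os


def build_parser(env=None):
    """Build a parser whose defaults can be overridden by environment variables.

    EXAMPLE_HOST and EXAMPLE_PORT are hypothetical names used for illustration;
    they are not taken from the project.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="example-server")
    # Precedence: explicit flag > environment variable > built-in default.
    parser.add_argument("--host", default=env.get("EXAMPLE_HOST", "localhost"))
    parser.add_argument(
        "--port", type=int, default=int(env.get("EXAMPLE_PORT", "8000"))
    )
    return parser


def parse_args(argv, env=None):
    """Parse argv against the given environment mapping (os.environ if omitted)."""
    return build_parser(env).parse_args(argv)
```

Passing the environment as a mapping keeps the precedence easy to exercise without touching the real process environment, for example `parse_args(["--port", "7000"], env={"EXAMPLE_PORT": "9000"})`.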
…emonade-sdk#548)

* C++: Make it easier to change the llamacpp build

* Clean up install directory
* Remove python nsis installer and related tests

* Remove the 'beta' tag from the C++ server

* rev version numbers

* Update install instructions

* Remove most instances of lemonade-server-dev

* Documentation updates

* Add python deprecation notice

* Improve deprecation notice

* Fix make_http_request() bug

* Change default host to localhost and fix bugs

* Status function checks ipv4 and 6
* Update open-hands.md

* Update open-hands.md
vgodsoe and others added 29 commits December 10, 2025 16:04
* Update runner labels

* Update labels in documentation

* Suggested changes
…emonade-sdk#729)

* Fix bug that resulted in user-model registrations

* Eliminate redundant methods and data structures

* Fix model pull bugs

---------

Co-authored-by: Daniel Holanda <holand.daniel@gmail.com>
* Overhaul flm install flow

* Adjust model list cache for correctness

* Fix FLM and rev lemonade version

* Change FLM driver req to .304. Update website.
* More helpful 'not-found' errors

* Better error message when models are filtered

* Show 'model load failed' errors

---------

Co-authored-by: Daniel Holanda <holand.daniel@gmail.com>
…e-sdk#731)

* Show embeddings well

* cleaner

* cleaner rerank

* Better reranker

* Better embeddings

* Great embedding and reranking

* transcription

* transcriber

* nit

* Cleanup styles merge

* Fix chat bar
* Add --extra-models-dir option

* Refine the language
* Touch up the documentation

* Apply suggestions from code review

Co-authored-by: Ramakrishnan Sivakumar <ramkrishna2910@gmail.com>

---------

Co-authored-by: Ramakrishnan Sivakumar <ramkrishna2910@gmail.com>
…container (lemonade-sdk#723)

* add: added docker build/run instruction for lemonade cpp

* add: updated ubuntu to support the new cmake version requirements

* add: added reference to devcontainers

* add: added note regarding folder structure

---------

Co-authored-by: Jeremy Fowers <80718789+jeremyfowers@users.noreply.github.com>
This reverts commit 88ab33c.
…emonade-sdk#764)

* add LEMONADE_DISABLE_MODEL_FILTERING env var

* Add Ryzen AI Z2 to the supported NPU processors regex
* Streamline the new user experience

* bug fix

* new file
* add support for recipe options

* add documentation for recipe_options.json

* corrected documentation

* add logging for recipe options
* add support for API key

* document LEMONADE_API_KEY

* do not require auth on HTTP OPTIONS method
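The last three commits above add API-key support and exempt the HTTP OPTIONS method from authentication. One likely reason for the exemption is that browsers send CORS preflight requests as OPTIONS without an `Authorization` header, so requiring a key there would block browser clients. The server's real implementation is not shown on this page. The sketch below is a hypothetical Python illustration that assumes a `Bearer` token scheme and takes the configured key as a parameter (it might, for instance, come from the `LEMONADE_API_KEY` environment variable that the commits document).

```python
import hmac


def is_authorized(method, headers, api_key):
    """Return True if the request may proceed.

    Hypothetical helper: the Bearer scheme and this function's shape are
    assumptions for illustration, not the project's actual code.
    """
    if not api_key:
        # No key configured: authentication is disabled.
        return True
    if method.upper() == "OPTIONS":
        # CORS preflight requests carry no credentials, so let them through.
        return True
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.strip().encode(), api_key.encode())
```

A caller would run this check before routing, returning a 401 response when it yields `False`.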
meghsat merged commit 7541eac into nvidia_benchmark_nvidia-smi Jan 6, 2026
18 of 34 checks passed