Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard.
A lightweight chat terminal interface for the llama.cpp server, written in C++, with many features and Windows/Linux support.
Local LLM proxy, DevOps-friendly.
A Bash script that automatically launches llama-server, detects available .gguf models, and selects the number of GPU layers to offload based on your free VRAM.
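For illustration, a minimal sketch of how such a launcher might work. It assumes an NVIDIA GPU queried via nvidia-smi; the model directory, the flat ~250 MiB-per-layer estimate, and the host/port are assumptions for the sketch, not details taken from the script above.

```bash
#!/usr/bin/env bash
# Hypothetical launcher sketch: find a .gguf model, estimate how many layers
# fit in free VRAM, then start llama-server with that offload count.

MODEL_DIR="${MODEL_DIR:-$HOME/models}"

# Pick the first .gguf file found under MODEL_DIR.
model=$(find "$MODEL_DIR" -name '*.gguf' | head -n 1)
[ -z "$model" ] && { echo "no .gguf models found in $MODEL_DIR" >&2; exit 1; }

# Free VRAM in MiB via nvidia-smi (NVIDIA GPUs only).
free_vram=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | head -n 1)

# Crude assumed heuristic: ~250 MiB per offloaded layer, capped at 99 (all layers).
layers=$(( free_vram / 250 ))
[ "$layers" -gt 99 ] && layers=99

exec llama-server -m "$model" -ngl "$layers" --host 127.0.0.1 --port 8080
```

A real script would likely read the layer count and tensor sizes from the model's metadata rather than using a flat per-layer estimate.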
A simple web application for real-time AI vision analysis using SmolVLM-500M-Instruct with live camera feed processing and text-to-speech.