# Description

Creates a chat-based LLM app, similar to what you would expect from ChatGPT or Gemini (but worse :D).

Features:

- Support for local (HuggingFace) or remote (OpenRouter API) backends
- Multiple models available for each backend
- Tunable generation parameters (temperature, max tokens, etc.)
- Text + image input (local backend only)
- System prompts to tune the model's output format
- LLM chat with conversation history
- Save/load conversations, or anonymous chat
- Chat with code highlighting
- Retrieval-Augmented Generation (RAG) based on a langchain + FAISS embeddings database (see the sketch below)
- A "nice" frontend interface

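The RAG feature follows the usual langchain + FAISS pattern. Here is a minimal sketch of that pipeline; the embedding model, chunk sizes, and sample text are illustrative, not the app's actual settings, and import paths may differ across langchain versions:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter

# split raw documents into overlapping chunks (sizes are illustrative)
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text("...your document text...")

# embed the chunks and index them in a FAISS vector store
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
db = FAISS.from_texts(chunks, embeddings)

# at query time, retrieve the top-k chunks to prepend to the LLM prompt
context_docs = db.similarity_search("user question", k=3)
```
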
---

# Setup

```bash
cd ~/dev-playground/llm_app/
# create the conda environment
conda env create -f environment.yml
# install backend packages
pip install -e .

# install frontend packages
cd ~/dev-playground/llm_app/llm_app/frontend
npm install
```
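
To sanity-check the editable install, a quick import inside the activated environment should work (assuming, from the repo layout, that the package is named `llm_app`):

```python
# run inside the activated conda env; package name assumed from the repo layout
import llm_app
print(llm_app.__file__)
```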

---

# Usage

Open two terminals: one to run the LLM backend service and another to run the frontend.

Run the LLM backend:

```bash
# NOTE: if you plan on using the OpenRouter API backend, first export the API key:
# export OPENROUTER_API_KEY="<API_KEY>"
conda activate env_llm_app
cd ~/dev-playground/llm_app/llm_app/backend
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
# your backend should now be running on http://localhost:8000
# you can check the Swagger API docs at http://localhost:8000/docs
```
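
Since the backend serves Swagger docs, it is presumably FastAPI, which also exposes its schema at `/openapi.json` by default; a quick way to confirm the service is up and list its routes:

```python
import requests

# FastAPI serves the OpenAPI schema at /openapi.json by default
resp = requests.get("http://localhost:8000/openapi.json", timeout=5)
resp.raise_for_status()
print(sorted(resp.json()["paths"]))  # the backend's available endpoints
```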

Run the frontend interface:

```bash
cd ~/dev-playground/llm_app/llm_app/frontend
npm run dev
# your frontend should now be running on http://localhost:5174
```

---

# Results

<details>
<summary> Interface </summary>
  
</details>

<details>
<summary> Text and Image inputs </summary>
  
</details>

---

# Ideas / TODOs

- [ ] Display in the frontend the sources used in the RAG context

- [ ] Support web search (sketch below)
  - for each query, first search the web and download the first n pages
  - store the pages in the RAG embeddings database
  - pass additional context from the downloaded pages to the LLM

- [ ] Support speech-to-text / text-to-speech models for voice input/output (sketch below)
  - change the frontend to enable voice/audio inputs
  - add a voice-to-text intermediate model
  - pass the text to the LLM to generate an output
  - optionally convert the output back to audio using a text-to-speech model

- [ ] Real-time streaming responses using WebSockets (sketch below)

- [ ] Create a docker compose setup to run both frontend and backend, and deploy it somewhere

- [ ] Debug leftover issues in the frontend/backend communication
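
For the web-search TODO, a sketch of the indexing step: download one result page and add its chunks to the existing FAISS store. `index_page` and the `vector_db` handle are hypothetical names, and the search-provider call itself is left out:

```python
import requests
from langchain.text_splitter import RecursiveCharacterTextSplitter

def index_page(url: str, vector_db) -> None:
    """Hypothetical helper: fetch one result page and add it to the RAG store."""
    html = requests.get(url, timeout=10).text  # a real version would strip HTML tags
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_text(html)
    # tag each chunk with its source URL so the frontend could display it later
    vector_db.add_texts(chunks, metadatas=[{"source": url}] * len(chunks))
```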
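
For the voice TODO, the HuggingFace `transformers` ASR pipeline would cover the voice-to-text step; the model choice is illustrative, and the reverse direction could use a text-to-speech pipeline in the same way:

```python
from transformers import pipeline

# speech -> text via the HuggingFace ASR pipeline (model choice is illustrative)
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
text = asr("question.wav")["text"]
# `text` would then be sent to the LLM like any typed message
```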
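
For the streaming TODO, FastAPI's WebSocket support fits naturally with the existing uvicorn setup. A minimal self-contained sketch, where `stream_tokens` is a stand-in for whatever token generator the backend's real LLM wrapper exposes:

```python
import asyncio

from fastapi import FastAPI, WebSocket

app = FastAPI()

async def stream_tokens(prompt: str):
    # stand-in for the backend's real LLM generator (hypothetical)
    for token in ["Hello", " ", "world"]:
        await asyncio.sleep(0.1)  # simulate generation latency
        yield token

@app.websocket("/ws/chat")
async def chat_ws(websocket: WebSocket):
    await websocket.accept()
    prompt = await websocket.receive_text()
    # push tokens to the client as they are generated
    async for token in stream_tokens(prompt):
        await websocket.send_text(token)
    await websocket.close()
```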

---

# References