Conversation
Added:
- HydraInfer: Hybrid disaggregated scheduling for MLLM serving
- Paper on disaggregated multi-stage MLLM inference with GPU scheduling

Both papers added to the Multi-Modal Serving Systems section.

Co-authored-by: Jiachen LIU <AmberLJC@users.noreply.github.com>
PR Review: Add new multimodal LLM serving papers

Thank you for the contribution! I've reviewed the changes and have the following feedback:

✅ Strengths
This PR adds two new papers to the Multi-Modal Serving Systems section.
Closes #46
Generated with Claude Code
Note

Updates the README.md paper list:
- Adds "HydraInfer: Hybrid Disaggregated Scheduling for Multimodal Large Language Model Serving" to the Multi-Modal Serving Systems section
- Adds "Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing" to the Multi-Modal Serving Systems section

Written by Cursor Bugbot for commit 473028e.