aswanthmanoj / llm-continuous-batching-simulator
This project is forked from hitpoint6/llm-continuous-batching-simulator.
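Simulates how LLM serving engines such as vLLM use Python's asyncio.Queue to achieve dynamic (continuous) batching: the batch is re-formed at the iteration level, so requests can join or leave between generation steps rather than waiting for a whole batch to finish.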
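The idea can be illustrated with a minimal sketch. This is not the repository's code and not vLLM's actual implementation; the names `Request`, `engine_loop`, `generate`, `max_batch_size`, and `trace` are hypothetical, and the model forward pass is replaced by `asyncio.sleep(0)`. It assumes only the standard library: clients push requests onto an asyncio.Queue, and a single engine task admits waiting requests and retires finished ones on every iteration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Request:
    # Hypothetical request record: a prompt, a token budget, and a future for the result.
    prompt: str
    max_tokens: int
    done: asyncio.Future
    tokens: list = field(default_factory=list)


async def engine_loop(queue: asyncio.Queue, max_batch_size: int, trace: list) -> None:
    active = []
    while True:
        # Block only when idle; otherwise admit waiting requests without stalling the batch.
        if not active:
            active.append(await queue.get())
        while len(active) < max_batch_size and not queue.empty():
            active.append(queue.get_nowait())

        # One simulated decode iteration: every active request gains one token.
        await asyncio.sleep(0)  # stand-in for a model forward pass (assumption)
        trace.append([r.prompt for r in active])
        for r in active:
            r.tokens.append(f"{r.prompt}-t{len(r.tokens)}")

        # Retire finished requests right away so their slots free up for the next iteration.
        still_running = []
        for r in active:
            if len(r.tokens) >= r.max_tokens:
                r.done.set_result(r.tokens)
            else:
                still_running.append(r)
        active = still_running


async def generate(queue: asyncio.Queue, prompt: str, max_tokens: int) -> list:
    # Client side: enqueue a request, then wait for the engine to resolve its future.
    req = Request(prompt, max_tokens, asyncio.get_running_loop().create_future())
    await queue.put(req)
    return await req.done


async def main():
    queue = asyncio.Queue()
    trace = []  # records which prompts were batched together in each iteration
    engine = asyncio.create_task(engine_loop(queue, max_batch_size=2, trace=trace))
    results = await asyncio.gather(
        generate(queue, "a", 1),
        generate(queue, "b", 3),
        generate(queue, "c", 2),
    )
    engine.cancel()
    return results, trace


if __name__ == "__main__":
    results, trace = asyncio.run(main())
    for step, batch in enumerate(trace):
        print(f"iteration {step}: {batch}")
```

With `max_batch_size=2`, request `c` waits in the queue during the first iteration and joins the batch in the second, as soon as `a` has produced its single token and left. With static batching, `c` would instead wait until `b` had also finished.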