Comparisons here: https://github.com/lapp0/lm-inference-engines/

See also: MLC-LLM, ExLlamaV2, gpt-fast

2024-01-09 TGI Inferencing Cost
https://www.reddit.com/r/LocalLLaMA/comments/192silz/llm_comparison_using_tgi_mistral_falcon7b/
https://blog.salad.com/llm-comparison-tgi-benchmark/