Visual
Big screen capture of a script running the prompt
Script
Early 2025 - best practices for LLMs and image generation
ML "performance" can mean quality or how fast it goes. Three things to measure: quality, throughput, and latency (specifically time to first token, TTFT).
Home vs Production
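A minimal sketch of how the latency and throughput numbers can be derived from timestamps, assuming we have recorded when the request was sent and when each generated token arrived. The names `RequestMetrics` and `summarize_request` are hypothetical, not taken from any benchmarking tool.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RequestMetrics:
    ttft_s: float               # time to first token, in seconds
    decode_tokens_per_s: float  # tokens per second after the first token
    total_latency_s: float      # request start to last token, in seconds


def summarize_request(request_start_s: float,
                      token_times_s: Sequence[float]) -> RequestMetrics:
    """Compute TTFT, decode throughput, and total latency for one request.

    token_times_s holds the arrival time of each generated token,
    on the same clock as request_start_s.
    """
    if not token_times_s:
        raise ValueError("need at least one generated token")
    ttft = token_times_s[0] - request_start_s
    total = token_times_s[-1] - request_start_s
    # Decode throughput excludes the first token, which is dominated by
    # prompt processing (prefill) rather than token-by-token generation.
    decode_window = token_times_s[-1] - token_times_s[0]
    n_decode = len(token_times_s) - 1
    tps = n_decode / decode_window if decode_window > 0 else 0.0
    return RequestMetrics(ttft, tps, total)


# Made-up timestamps: first token at 0.5 s, then one token every 0.25 s.
metrics = summarize_request(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(metrics)
```

Separating TTFT from decode throughput matters because the two are limited by different things: TTFT mostly by prompt processing, decode speed mostly by how fast weights can be read.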
Compute, memory bandwidth, memory capacity
bs=1
Biggest model
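A back-of-the-envelope sketch for the bs=1 case and the "biggest model" question. It assumes a dense model where every weight is read once per generated token, so decode speed is bounded by memory bandwidth divided by weight size, and it ignores KV cache and activation traffic. The function names, the 20% overhead allowance, and the hardware numbers are all hypothetical placeholders.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in bytes."""
    return n_params * bits_per_weight / 8


def fits_in_memory(n_params: float, bits_per_weight: float,
                   memory_bytes: float, overhead_fraction: float = 0.2) -> bool:
    """Rough check: weights plus an assumed overhead must fit in memory.

    overhead_fraction is an illustrative allowance for KV cache and
    activations, not a measured value.
    """
    needed = weight_bytes(n_params, bits_per_weight) * (1 + overhead_fraction)
    return needed <= memory_bytes


def max_decode_tokens_per_s(n_params: float, bits_per_weight: float,
                            bandwidth_bytes_per_s: float) -> float:
    """Upper bound on bs=1 decode speed when memory bandwidth is the limit."""
    return bandwidth_bytes_per_s / weight_bytes(n_params, bits_per_weight)


# Hypothetical device: 24 GB of memory, 1 TB/s of memory bandwidth.
MEMORY_BYTES = 24e9
BANDWIDTH_BYTES_PER_S = 1e12

for bits in (16, 8, 4):
    size_gb = weight_bytes(7e9, bits) / 1e9
    bound = max_decode_tokens_per_s(7e9, bits, BANDWIDTH_BYTES_PER_S)
    fits = fits_in_memory(7e9, bits, MEMORY_BYTES)
    print(f"7B @ {bits}-bit: {size_gb:.1f} GB, fits={fits}, <= {bound:.0f} tok/s")
```

Real numbers will come in below this bound; the point is only to show why fewer bits per weight helps both fitting a model and decoding faster at batch size 1.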
Which Models to use?
- Llama 2 7B
- Llama 3
- Qwen
llama.cpp: llama-bench
Go further: ShareGPT, SGLang
Coding Model
Quants
TorchTune
Training
Quality: MixEval, lighteval
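# Expose the active conda env's libraries and headers; avoids a trailing ":" when the variable was unset
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
export CPATH=$CONDA_PREFIX/include${CPATH:+:$CPATH}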