To make a model that is both SOTA and fast:
- Distillation
  - Distilling Step-by-Step: https://arxiv.org/abs/2305.02301
  - https://github.com/google-research/distilling-step-by-step
  - https://snorkel.ai/llm-distillation-techniques-to-explode-in-importance-in-2024/
  - https://snorkel.ai/llm-distillation-demystified-a-complete-guide/
  - https://github.com/Tebmer/Awesome-Knowledge-Distillation-of-LLMs
  - https://github.com/predibase/llm_distillation_playbook
  - MiniLLM: https://arxiv.org/abs/2306.08543
  - Survey on knowledge distillation of LLMs: https://arxiv.org/abs/2402.13116
  - https://arxiv.org/pdf/2402.15000
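The common core of the distillation resources above is training a small student on a teacher's temperature-softened output distribution. A minimal framework-free sketch of the classic soft-target loss (toy logits; function names are my own, and in practice this term is mixed with a hard-label cross-entropy loss):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T flattens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2

# A student that matches the teacher has zero loss; a mismatched one does not.
loss_match = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
loss_off = distillation_loss([0.1, 1.0, 2.0], [2.0, 1.0, 0.1])
```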
- Sparsification
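Sparsification (pruning) zeroes out low-importance weights so the model is cheaper to store and, with sparse kernels or structured patterns, faster to serve. A toy unstructured magnitude-pruning sketch (illustrative only, not any specific library's API):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured)."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Threshold = magnitude of the k-th smallest weight; ties at the
    # threshold are also dropped in this simple version.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Prune half the weights: the three smallest magnitudes become zero.
pruned = magnitude_prune([0.9, -0.05, 0.4, 0.01, -0.7, 0.02], sparsity=0.5)
```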
- Speculative Decoding Techniques
  - https://pytorch.org/blog/hitchhikers-guide-speculative-decoding/
  - Speculative decoding (Leviathan et al.): https://arxiv.org/abs/2211.17192
  - Medusa: https://github.com/FasterDecoding/Medusa
  - Medusa paper: https://arxiv.org/abs/2401.10774
  - https://arxiv.org/abs/2402.05109
  - https://arxiv.org/pdf/2403.09919
  - S3D (Skippy Simultaneous Speculative Decoding)
  - Lookahead Decoding
  - Ouroboros (Speculative + Lookahead Decoding)
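The speculative variants above all share one draft-then-verify loop: a cheap draft model proposes several tokens, the target model checks them, and the longest agreeing prefix is kept, so outputs match the target model exactly while amortizing its cost. A greedy toy sketch (the model stubs and function names are hypothetical stand-ins; a real implementation verifies all draft positions in one batched forward pass):

```python
def speculative_decode(target_next, draft_next, prompt, max_new, k=4):
    """Greedy speculative decoding with a draft window of k tokens.

    target_next / draft_next: callables mapping a token sequence to the
    next token (stand-ins for the large and small models).
    """
    tokens = list(prompt)
    while len(tokens) < len(prompt) + max_new:
        # 1. Draft model speculates k tokens autoregressively (cheap).
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2. Target model verifies each drafted position; keep the
        #    longest prefix where both models agree.
        accepted = 0
        for i in range(k):
            if target_next(tokens + draft[:i]) == draft[i]:
                accepted += 1
            else:
                break
        tokens += draft[:accepted]
        # 3. On mismatch (or full acceptance) the target's own next
        #    token comes "for free" from the verification pass.
        if len(tokens) < len(prompt) + max_new:
            tokens.append(target_next(tokens))
    return tokens[: len(prompt) + max_new]

# Toy "models": the target counts upward; the draft agrees but caps at 5,
# so early steps accept whole windows and later steps fall back to step 3.
target = lambda seq: seq[-1] + 1
draft = lambda seq: min(seq[-1] + 1, 5)
out = speculative_decode(target, draft, [0], max_new=8)
```

With greedy decoding the output is token-for-token identical to running the target model alone; the draft model only changes how many target calls are needed.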