Large Language Models (LLMs) are a type of generative AI that powers chatbot systems like ChatGPT.
If you've never tried one, you can test many of these services for free (although most of this site is aimed at those with more familiarity with these types of systems).
All of these offer free access, though many require user registration. Remember that everything you type is sent online to third parties, so don't share anything you'd want to keep private.
Hosted Services
Proprietary Model Chat
- OpenAI ChatGPT - the free version uses GPT-3.5 and is not as smart, but it's good for dipping your toe in.
- Anthropic Claude - an interesting alternative with a super long context window (short-term memory), so it can interact w/ very long files for summarization, etc.
- Google Bard - now running Gemini Pro, which is roughly at ChatGPT 3.5's level
- Perplexity.ai - the best free service w/ web search capabilities
- You.com - another search and chat provider
- Phind - a proprietary coding model (fine-tuned from Code Llama 34B)
Open Model Chat
- HuggingFace Chat - chat for free w/ some of the best open models
- ChatNBX - talk to some of the best open models and compare
- Perplexity Labs - hosts some open models to try as well
- DeepSeek Coder - an open-source coding model (https://deepseekcoder.github.io/) developed by a Chinese company; currently one of the strongest coding models
Open Model Testing
- Hugging Face Spaces - there's a huge number of "Spaces" with hosted models for testing. These typically run on small/slow instances with queues, so they're mainly good for simple testing.
- Vercel SDK Playground - lets you easily A/B test a small selection of models
- nat.dev Playground - a better selection of models
- Comparisons
Paid Open Models
These sites provide paid API hosting for open models; they vary in quality, throughput, and price:
- https://openrouter.ai/
- https://www.anyscale.com/endpoints
- https://replicate.com/
- https://www.together.ai/
- https://www.fireworks.ai/
- https://octo.ai/pricing/
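Most of these providers expose an OpenAI-compatible REST API, so switching between them is largely a matter of changing the base URL and model name. As a rough sketch (the base URL, model name, and environment variable below are illustrative assumptions; check each provider's docs for actual endpoints and model IDs), querying one from Python with only the standard library might look like:

```python
import json
import os
import urllib.request

# Illustrative values - substitute your chosen provider's endpoint and model.
BASE_URL = "https://openrouter.ai/api/v1"
MODEL = "mistralai/mixtral-8x7b-instruct"


def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def send_chat_request(payload: dict, api_key: str) -> dict:
    """POST the payload to the provider's chat completions endpoint."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_chat_request(MODEL, "Say hello in one sentence.")
    # Only sends a real request if an API key is set in the environment:
    api_key = os.environ.get("OPENROUTER_API_KEY")
    if api_key:
        result = send_chat_request(payload, api_key)
        print(result["choices"][0]["message"]["content"])
```

Because the payload builder is a pure function, you can sanity-check request shapes without spending any API credits.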
GPU Rental
For good primers on hardware:
- https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
- https://github.com/stas00/ml-engineering/tree/master/compute/accelerator
- H100s: https://gpus.llm-utils.org/h100-gpu-cloud-availability-and-pricing/
You can also run LLMs locally on most modern computers (although larger models require strong GPUs).
The easiest (virtually one-click, no command line futzing) way to test out some models is with LM Studio (Linux, Mac, Windows). Other alternatives include:
- Nomic’s GPT4All (Github) - Windows, Mac, Linux
- Ollama (Github) - Mac, Linux
- Jan - https://jan.ai/
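As a taste of how simple these local runners are, here is what pulling and prompting a model with Ollama looks like from the terminal (the model name is just an example; see Ollama's model library for what's available):

```shell
# Download a model to your machine (name is an example)
ollama pull mistral

# One-shot prompt from the command line; omit the quoted
# prompt to start an interactive chat session instead
ollama run mistral "Explain what a context window is in one sentence."
```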
If you are more technical:
- oobabooga (text-generation-webui) - think of it as the automatic1111 of LLMs
- koboldcpp - more oriented for character/roleplay, see also SillyTavern
- openplayground
- llama.cpp
- exllama
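If you want to see what the friendlier tools are doing under the hood, llama.cpp is a good starting point since it builds from source with no heavy dependencies. A minimal CPU-only build-and-run sketch (the model path is a placeholder; you must supply your own GGUF model file):

```shell
# Clone and build llama.cpp (CPU-only by default;
# see the repo README for GPU build flags)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Run inference on a local GGUF model:
#   -m  path to the model file (placeholder below)
#   -p  the prompt
#   -n  max number of tokens to generate
./main -m ./models/your-model.gguf -p "Hello, world" -n 128
```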
Other chat clients: https://news.ycombinator.com/item?id=39532367
Most of the guides in this HOWTO section will assume that you are:
- On a UNIXy platform (Linux, macOS, WSL)
- Familiar w/ the command line and comfortable installing apps, etc.
- Somewhat familiar with Python and git
Global Recommendations:
Install Mambaforge and create a new conda environment any time you install a package that has many dependencies (e.g., create a separate exllama, autogptq, or lm-eval environment).
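In practice, the recommendation above might look like the following (the environment name and Python version are just examples; pick whatever the project you're installing requires):

```shell
# Create an isolated environment for a dependency-heavy project
mamba create -n exllama python=3.10

# Activate it ("mamba activate" requires having run "mamba init" first;
# "conda activate exllama" also works)
mamba activate exllama

# Install the project's requirements inside the environment,
# keeping them separate from your other environments
pip install -r requirements.txt
```

If an install goes sideways, you can delete the whole environment with `mamba env remove -n exllama` without affecting anything else.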
Other Resources
- https://github.com/LykosAI/StabilityMatrix
- https://github.com/comfyanonymous/ComfyUI