Search Results

36 total results found

Japanese LLMs

LLMs

There has been a steady stream of open Japanese LLMs being trained, but on average they are far behind their English counterparts. The most promising open models for conversation and instruction are currently the ELYZA Llama2-based models. GPT-4 and gpt-3.5-turbo are sti...

OmniQuant

LLMs Quantization

Summary OmniQuant (omnidirectionally calibrated quantization) is a quantization technique published (2023-08-25) by Wenqi Shao and Mengzhao Chen from the General Vision Group, Shanghai AI Lab. Instead of hand-crafted quantization parameters, OmniQuant uses tra...
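
The snippet above is only a teaser, so here is a minimal sketch of the underlying idea of learnable weight clipping (trainable quantization parameters optimized against layer reconstruction error). It is an illustration under simplified assumptions, not OmniQuant's actual code; all shapes, bit widths, and hyperparameters below are made up.

```python
# Sketch: per-channel clipping factors are trainable parameters, optimized so the
# fake-quantized layer reproduces the full-precision layer's output on calibration data.
import torch
import torch.nn as nn

class LearnableClipQuant(nn.Module):
    """Fake-quantize a frozen weight with a learnable per-channel clipping factor."""
    def __init__(self, weight: torch.Tensor, n_bits: int = 4):
        super().__init__()
        self.register_buffer("weight", weight)             # frozen FP weight [out, in]
        self.n_bits = n_bits
        self.gamma = nn.Parameter(torch.full((weight.shape[0], 1), 4.0))  # sigmoid(4) ~ 0.98

    def forward(self) -> torch.Tensor:
        clip = torch.sigmoid(self.gamma) * self.weight.abs().amax(dim=1, keepdim=True)
        scale = clip / (2 ** (self.n_bits - 1) - 1)
        w_c = torch.maximum(torch.minimum(self.weight, clip), -clip)
        # straight-through estimator: round in the forward pass, identity in backward
        w_int = w_c / scale + (torch.round(w_c / scale) - w_c / scale).detach()
        return w_int * scale                                # fake-quantized weight

# Train the clipping factors to minimize layer reconstruction error on calibration data.
torch.manual_seed(0)
w_fp = torch.randn(128, 512)
x_cal = torch.randn(256, 512)                               # fake calibration activations
quant = LearnableClipQuant(w_fp)
opt = torch.optim.Adam(quant.parameters(), lr=1e-2)
for _ in range(200):
    loss = torch.nn.functional.mse_loss(x_cal @ quant().T, x_cal @ w_fp.T)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("reconstruction MSE:", loss.item())
```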

OpenAI API Compatibility

LLMs

Most inference packages have their own REST API, but an OpenAI-compatible API makes it easy to use a variety of clients and to switch between providers. https://github.com/abetlen/llama-cpp-python General llama.cpp Python wrapp...
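
A minimal sketch of what the compatibility buys you: the stock openai Python client can talk to a local llama-cpp-python server just by changing base_url. The port, model name, and prompt below are placeholders.

```python
# Sketch: talking to a local OpenAI-compatible server (e.g. one started with
# `python -m llama_cpp.server --model ./model.gguf`) through the standard client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="local-model",  # ignored or pattern-matched by most local servers
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```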

Prompting

LLMs

Prompting https://www.promptingguide.ai/ https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api https://oneusefulthing.substack.com/p/power-and-weirdness-how-to-use-bing Prompt Format Most instruct/chat fine-tunes u...
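
As a sketch of the prompt-format point: rather than hand-writing a model's special tokens, the chat template shipped with its tokenizer can render the expected format. The model name here is only an example of a chat fine-tune that includes a template.

```python
# Sketch: render a chat fine-tune's expected prompt format from its chat template.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")  # example model
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain perplexity in one sentence."},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # shows the role markup and special tokens this particular model expects
```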

Translation

LLMs

Google MADLAD-400 10.7B model NLLB 54B model https://research.facebook.com/publications/no-language-left-behind/ https://ai.meta.com/blog/nllb-200-high-quality-machine-translation/ https://github.com/facebookresearch/fairseq/tree/nllb/ https://w...
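
A minimal sketch of running one of these translation models locally, using the small distilled NLLB-200 checkpoint (chosen for illustration; the 54B MoE model from the links is far larger). The example sentence and language codes are assumptions.

```python
# Sketch: Japanese -> English translation with a small NLLB-200 checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/nllb-200-distilled-600M"
tok = AutoTokenizer.from_pretrained(name, src_lang="jpn_Jpan")
model = AutoModelForSeq2SeqLM.from_pretrained(name)

inputs = tok("猫はソファーの上で寝ています。", return_tensors="pt")
out = model.generate(
    **inputs,
    forced_bos_token_id=tok.convert_tokens_to_ids("eng_Latn"),  # target language code
    max_new_tokens=64,
)
print(tok.batch_decode(out, skip_special_tokens=True)[0])
```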

Interpretability

LLMs Research

Language Models Implement Simple Word2Vec-style Vector Arithmetic https://arxiv.org/pdf/2305.16130.pdf

Fine Tuning Mistral

Logbook

We'll try to fine-tune Mistral 7B. Training Details The Mistral AI Discord has a #finetuning channel with some info/discussion: dhokas: here are the main parameters we used for the instruct model : optimizer: adamw, max_lr: 2.5e-5, warmup steps: 50, tota...
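
A sketch of how the quoted parameters might map onto Hugging Face TrainingArguments. Only the optimizer, learning rate, and warmup steps come from the quote above; everything else (output path, batch size, schedule, epochs) is a placeholder assumption.

```python
# Sketch: training config mirroring the quoted Mistral instruct hyperparameters.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mistral-7b-finetune",     # placeholder
    optim="adamw_torch",                  # "optimizer: adamw"
    learning_rate=2.5e-5,                 # "max_lr: 2.5e-5"
    warmup_steps=50,                      # "warmup steps: 50"
    lr_scheduler_type="cosine",           # assumption, not from the quote
    per_device_train_batch_size=1,        # assumption
    gradient_accumulation_steps=16,       # assumption
    num_train_epochs=1,                   # assumption
    logging_steps=10,
)
```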

StyleTTS 2 Setup Guide

HOWTO Guides

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models Samples: https://styletts2.github.io/ Paper: https://arxiv.org/abs/2306.07691 Repo: https://github.com/yl4579/StyleTTS2 Style...

Comparing Quants

Logbook

https://github.com/mit-han-lab/smoothquant https://neuralmagic.com/blog/fast-llama-2-on-cpus-with-sparse-fine-tuning-and-deepsparse/ Future Project for different quants Perplexity https://oobabooga.github.io/blog/posts/perplexities/ https://oobabooga.github....
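
A minimal sketch of the perplexity idea for comparing quants: score the same held-out text with each build and compare the numbers. The model name and text below are placeholders; a real comparison would use a fixed corpus (e.g. wikitext-2) and the actual quantized models.

```python
# Sketch: chunked perplexity over a held-out text for a causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"                                     # placeholder model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

text = "The quick brown fox jumps over the lazy dog. " * 200  # placeholder corpus
ids = tok(text, return_tensors="pt").input_ids
max_len = model.config.n_positions                # context window (1024 for gpt2)

nlls, n_tokens = [], 0
for start in range(0, ids.size(1), max_len):
    chunk = ids[:, start:start + max_len]
    if chunk.size(1) < 2:
        break
    with torch.no_grad():
        loss = model(chunk, labels=chunk).loss    # mean NLL per predicted token
    nlls.append(loss * (chunk.size(1) - 1))
    n_tokens += chunk.size(1) - 1
print("perplexity:", torch.exp(torch.stack(nlls).sum() / n_tokens).item())
```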