LLM-Performance-Improvement-Paper
LIMA: Less Is More for Alignment
Textbooks Are All You Need
GPT-3: Language Models are Few-Shot Learners
[4][5] Teacher Model & Zero-Shot Labelling (a labelling sketch follows this list)
Textbooks Are All You Need
GPT Self-Supervision for a Better Data Annotator
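The recipe shared by the papers in this section is simple: a strong teacher model annotates unlabeled text zero-shot, and the outputs become training data for a smaller student. A rough sketch of that loop; `teacher` stands in for any LLM completion call, and the prompt wording and label set are our own assumptions, not taken from the papers:

```python
# Hedged sketch of zero-shot labelling with a teacher LLM.
# `teacher` is a placeholder callable: prompt string -> completion string.
def zero_shot_label(texts, teacher, labels=("positive", "negative")):
    dataset = []
    for text in texts:
        prompt = (f"Classify the text as one of: {', '.join(labels)}.\n"
                  f"Text: {text}\nLabel:")
        answer = teacher(prompt).strip().lower()
        if answer in labels:  # discard malformed teacher outputs
            dataset.append({"text": text, "label": answer})
    return dataset
```

The membership check is a crude stand-in for the quality and consistency filters these papers actually apply before training the student.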
[6][7] RLHF & Preference Optimization (a DPO loss sketch follows this list)
Training language models to follow instructions with human feedback
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Scaling Laws for Reward Model Overoptimization
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
RRHF: Rank Responses to Align Language Models with Human Feedback without tears
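Several entries above revolve around the DPO objective, which replaces an explicit reward model with a preference loss on log-probability ratios. A minimal sketch of that loss, assuming per-sequence log-probs have already been gathered for the chosen and rejected responses under the policy and a frozen reference model (function and argument names are illustrative):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit reward of a response is beta * log(pi_theta / pi_ref).
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Bradley-Terry preference loss on the reward margin.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```

beta controls how far the policy may drift from the reference model; the DPO paper typically uses values around 0.1.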
[8] Instruction Understanding & Prompt-Answer Matching (a templating sketch follows this list)
Training language models to follow instructions with human feedback
LIMA: Less Is More for Alignment
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Scaling Instruction-Finetuned Language Models
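A recurring mechanical step in these instruction-tuning papers is rendering (instruction, input, output) records into prompt/completion pairs for supervised fine-tuning. A hedged sketch; the template wording below is our own, not a template from FLAN or InstructGPT:

```python
# Illustrative prompt templating for instruction tuning; assumed schema:
# {"instruction": ..., "input": ..., "output": ...}.
TEMPLATE = "Instruction: {instruction}\nInput: {input}\nResponse:"

def to_training_example(record):
    prompt = TEMPLATE.format(instruction=record["instruction"],
                             input=record.get("input", ""))
    # Loss is usually applied to the completion tokens only.
    return {"prompt": prompt, "completion": " " + record["output"]}
```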
[9] Self-Instruct from GPT (a minimal loop sketch follows this list)
SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions
WizardLM: Empowering Large Language Models to Follow Complex Instructions
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation
Alpaca: A Strong, Replicable Instruction-Following Model
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources
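The common loop in this section: seed a pool with human-written tasks, prompt the model with a few of them to elicit new instructions, filter, and repeat. A minimal sketch; `generate` is a placeholder for any LLM completion call, and the novelty filter is far cruder than the ROUGE-L overlap check SELF-INSTRUCT actually uses:

```python
import random

def self_instruct(seed_tasks, generate, rounds=3, per_round=8):
    pool = list(seed_tasks)
    for _ in range(rounds):
        # Show the model a few existing instructions to elicit new ones.
        examples = random.sample(pool, k=min(4, len(pool)))
        prompt = ("Come up with one new task instruction in the style of "
                  "these examples:\n" + "\n".join(examples))
        candidates = (generate(prompt).strip() for _ in range(per_round))
        # Crude exact-match novelty filter (the paper uses ROUGE-L instead).
        pool.extend(c for c in candidates if c and c not in pool)
    return pool
```

Evol-Instruct (WizardLM, WizardCoder) replaces the "write a similar task" prompt with rewriting prompts that make existing instructions deeper or more constrained.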
[10] Principle-Driven Alignment (a prompting sketch follows below)
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
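The idea, roughly: prepend a fixed set of human-written principles, let the model answer under them, and keep the resulting (question, answer) pairs as fine-tuning data. A toy sketch with invented principle wording; `generate` again stands in for an LLM call:

```python
# Hedged sketch of principle-driven self-alignment; the principles below
# are illustrative, not the ones used in the paper.
PRINCIPLES = [
    "1. Answer helpfully and stay on topic.",
    "2. Decline requests for harmful or deceptive content.",
]

def principled_responses(questions, generate):
    header = "You must follow these principles:\n" + "\n".join(PRINCIPLES)
    pairs = []
    for q in questions:
        answer = generate(f"{header}\n\nQuestion: {q}\nAnswer:")
        # The self-generated pairs later serve as supervised fine-tuning data.
        pairs.append({"instruction": q, "response": answer})
    return pairs
```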