LLM-Performance-Improvement-Paper
LIMA: Less Is More for Alignment
Textbooks Are All You Need
GPT-3: Language Models are Few-Shot Learners
[4][5] Teacher Model & Zero-Shot Labelling (a labelling sketch follows this list)
Textbooks Are All You Need
GPT Self-Supervision for a Better Data Annotator
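The recipe shared by the papers in this section is simple: a strong teacher model annotates unlabeled text zero-shot, and the outputs become training data for a smaller student. A rough sketch of that loop; `teacher` stands in for any LLM completion call, and the prompt wording and label set are our own assumptions, not taken from the papers:

```python
# Hedged sketch of zero-shot labelling with a teacher LLM.
# `teacher` is a placeholder callable: prompt string -> completion string.
def zero_shot_label(texts, teacher, labels=("positive", "negative")):
    dataset = []
    for text in texts:
        prompt = (f"Classify the text as one of: {', '.join(labels)}.\n"
                  f"Text: {text}\nLabel:")
        answer = teacher(prompt).strip().lower()
        if answer in labels:  # discard malformed teacher outputs
            dataset.append({"text": text, "label": answer})
    return dataset
```

The membership check is a crude stand-in for the quality and consistency filters these papers actually apply before training the student.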
[6][7] RLHF & Preference Optimization (a DPO loss sketch follows this list)
Training language models to follow instructions with human feedback
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Scaling Laws for Reward Model Overoptimization
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
RRHF: Rank Responses to Align Language Models with Human Feedback without tears
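Several entries above revolve around the DPO objective, which replaces an explicit reward model with a preference loss on log-probability ratios. A minimal sketch of that loss, assuming per-sequence log-probs have already been gathered for the chosen and rejected responses under the policy and a frozen reference model (function and argument names are illustrative):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit reward of a response is beta * log(pi_theta / pi_ref).
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Bradley-Terry preference loss on the reward margin.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```

beta controls how far the policy may drift from the reference model; the DPO paper typically uses values around 0.1.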
[8] Instruction Understanding & Prompt-Answer Matching (a templating sketch follows this list)
Training language models to follow instructions with human feedback
LIMA: Less Is More for Alignment
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Scaling Instruction-Finetuned Language Models
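A recurring mechanical step in these instruction-tuning papers is rendering (instruction, input, output) records into prompt/completion pairs for supervised fine-tuning. A hedged sketch; the template wording below is our own, not a template from FLAN or InstructGPT:

```python
# Illustrative prompt templating for instruction tuning; assumed schema:
# {"instruction": ..., "input": ..., "output": ...}.
TEMPLATE = "Instruction: {instruction}\nInput: {input}\nResponse:"

def to_training_example(record):
    prompt = TEMPLATE.format(instruction=record["instruction"],
                             input=record.get("input", ""))
    # Loss is usually applied to the completion tokens only.
    return {"prompt": prompt, "completion": " " + record["output"]}
```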
[9] Self-Instruct from GPT (a minimal loop sketch follows this list)
SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions
WizardLM: Empowering Large Language Models to Follow Complex Instructions
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation
Alpaca: A Strong, Replicable Instruction-Following Model
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources
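The common loop in this section: seed a pool with human-written tasks, prompt the model with a few of them to elicit new instructions, filter, and repeat. A minimal sketch; `generate` is a placeholder for any LLM completion call, and the novelty filter is far cruder than the ROUGE-L overlap check SELF-INSTRUCT actually uses:

```python
import random

def self_instruct(seed_tasks, generate, rounds=3, per_round=8):
    pool = list(seed_tasks)
    for _ in range(rounds):
        # Show the model a few existing instructions to elicit new ones.
        examples = random.sample(pool, k=min(4, len(pool)))
        prompt = ("Come up with one new task instruction in the style of "
                  "these examples:\n" + "\n".join(examples))
        candidates = (generate(prompt).strip() for _ in range(per_round))
        # Crude exact-match novelty filter (the paper uses ROUGE-L instead).
        pool.extend(c for c in candidates if c and c not in pool)
    return pool
```

Evol-Instruct (WizardLM, WizardCoder) replaces the "write a similar task" prompt with rewriting prompts that make existing instructions deeper or more constrained.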
[10] Principle-Driven Alignment (a prompting sketch follows below)
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
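The idea, roughly: prepend a fixed set of human-written principles, let the model answer under them, and keep the resulting (question, answer) pairs as fine-tuning data. A toy sketch with invented principle wording; `generate` again stands in for an LLM call:

```python
# Hedged sketch of principle-driven self-alignment; the principles below
# are illustrative, not the ones used in the paper.
PRINCIPLES = [
    "1. Answer helpfully and stay on topic.",
    "2. Decline requests for harmful or deceptive content.",
]

def principled_responses(questions, generate):
    header = "You must follow these principles:\n" + "\n".join(PRINCIPLES)
    pairs = []
    for q in questions:
        answer = generate(f"{header}\n\nQuestion: {q}\nAnswer:")
        # The self-generated pairs later serve as supervised fine-tuning data.
        pairs.append({"instruction": q, "response": answer})
    return pairs
```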