A curated list of research-oriented skills that are usable in OpenAI Codex.
The goal is simple: if a skill is listed here, it should either be from the official Codex ecosystem or from a repository that explicitly supports Codex or the open Agent Skills workflow that Codex can use.
- What Are Codex Skills?
- Inclusion Rules
- How To Use This List
- Skill List
- Installation and Usage
- License
- References
Codex skills are folder-based instruction bundles that teach Codex how to handle a task more reliably.
A typical skill includes:
- a `SKILL.md` file with trigger rules and workflow guidance
- optional scripts, templates, and references
- a stable folder structure that Codex can discover from standard skill locations
In practice, a good skill works like a reusable research playbook: Codex loads it when needed, follows the instructions, and combines that guidance with your local repository context.
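The folder layout described above can be sketched as a small scaffold script. This is a minimal sketch, not an official template: the function name `scaffold_skill`, the skill name `example-research-skill`, the `scripts/` and `references/` subfolders, and the `name` and `description` front-matter fields are all assumptions made for illustration. Check an upstream `SKILL.md` for the fields a given repository actually uses.

```shell
# Hypothetical helper: create a minimal skill folder under a given root.
# The front-matter fields (name, description) are assumed, not confirmed
# by this list.
scaffold_skill() {
  local root="$1" name="$2"
  local dir="$root/$name"

  # Optional folders for helper scripts and reference material.
  mkdir -p "$dir/scripts" "$dir/references"

  # Write a placeholder SKILL.md with trigger and workflow stubs.
  printf '%s\n' \
    '---' \
    "name: $name" \
    'description: Placeholder; describe when the skill should trigger.' \
    '---' \
    '' \
    "# $name" \
    '' \
    'Describe the workflow steps here.' \
    > "$dir/SKILL.md"

  # Print the created path so the caller can inspect it.
  printf '%s\n' "$dir"
}

# Example: write the skeleton into a throwaway directory.
demo_root="$(mktemp -d)"
scaffold_skill "$demo_root" "example-research-skill"
```

Writing into a temporary directory keeps the experiment away from any real skill location until the result has been reviewed.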
This list only keeps skills that satisfy at least one of the following:
- official OpenAI Codex skills
- repositories that explicitly document Codex support
- repositories built around the open Agent Skills format that Codex can consume with minimal adaptation
This list intentionally removes:
- skills that are exclusive to other platforms, for example Claude Code-only skills
- document-oriented workflows that depend on platform-built-in capabilities and do not translate cleanly into reusable Codex skills
- repositories whose Codex compatibility is unclear
This list works best as a research-workflow index. It helps narrow the search space first, then sends you back to the upstream SKILL.md for the actual details.
If you are new to the list, a quick task-based pass is usually enough:
- For research workflow design, task decomposition, and context management, start with sections 1 and 2.
- For paper writing, literature organization, and evidence summaries, start with sections 3 and 4.
- For demos, observability, data work, or experiment pipelines, start with sections 5 and 6.
- If a skill looks promising, open the upstream `SKILL.md` before deciding whether to install or rely on it.
| Skill | What It Does | Link |
|---|---|---|
| `project-development` | Helps scope LLM projects, assess task-model fit, and design practical research-agent architectures. | Agent-Skills-for-Context-Engineering |
| `notion-research-documentation` | Research across Notion and turn scattered notes into briefs, comparisons, and reports with citations. | openai/skills |
| `brainstorming-research-ideas` | Guides structured ideation for discovering defensible, high-impact research directions. | AI-Research-SKILLs |
| `creative-thinking-for-research` | Applies cognitive-science creativity frameworks to help generate genuinely novel research ideas. | AI-Research-SKILLs |
| Skill | What It Does | Link |
|---|---|---|
| `context-fundamentals` | Explains how context works in agent systems and how to design high-signal task context. | Agent-Skills-for-Context-Engineering |
| `context-degradation` | Helps diagnose lost-in-the-middle, distraction, poisoning, and other context failure modes. | Agent-Skills-for-Context-Engineering |
| `context-compression` | Teaches how to summarize and compress long-running research or coding sessions without losing critical state. | Agent-Skills-for-Context-Engineering |
| `advanced-evaluation` | Covers LLM-as-a-judge workflows, rubric design, and bias-aware automated evaluation. | Agent-Skills-for-Context-Engineering |
| Skill | What It Does | Link |
|---|---|---|
| `doc` | Codex-oriented DOCX workflow for creating or editing research reports with rendering and layout checks. | openai/skills |
| `notion-research-documentation` | Useful for writing research briefs, comparison notes, and structured evidence summaries. | openai/skills |
| `hugging-face-paper-publisher` | Publishes papers, links them to models or datasets, and generates professional research article pages. | huggingface/skills |
| `ml-paper-writing` | Writes publication-ready ML/AI/Systems papers with citation verification and venue-aware structure. | AI-Research-SKILLs |
| Skill | What It Does | Link |
|---|---|---|
| `notion-research-documentation` | Synthesizes multi-source findings into literature summaries and cited research notes. | openai/skills |
| `llamaindex` | Builds document-ingestion and retrieval pipelines for paper corpora, notes, and private research archives. | AI-Research-SKILLs |
| `faiss` | Provides high-performance dense retrieval for large paper or note embedding collections. | AI-Research-SKILLs |
| `sentence-transformers` | Generates strong semantic embeddings for literature search, clustering, and retrieval. | AI-Research-SKILLs |
| Skill | What It Does | Link |
|---|---|---|
| `gradio` | Builds interactive demos and polished research interfaces for models, ablations, and prototypes. | huggingface/skills |
| `hugging-face-trackio` | Tracks and visualizes training metrics with real-time dashboards. | huggingface/skills |
| `langsmith-observability` | Adds tracing, evaluation, and monitoring for LLM apps and experiment pipelines. | AI-Research-SKILLs |
| `phoenix-observability` | Open-source observability for tracing, experiments, and real-time analysis of AI systems. | AI-Research-SKILLs |
| `stable-diffusion-image-generation` | Useful for generating figures, visual concepts, teaser graphics, or multimodal assets for presentations. | AI-Research-SKILLs |
This is an extra category I recommend keeping because modern research workflows depend on it heavily.
| Skill | What It Does | Link |
|---|---|---|
| `hugging-face-datasets` | Creates, manages, queries, and transforms datasets on Hugging Face Hub. | huggingface/skills |
| `hugging-face-evaluation` | Adds and manages structured evaluation results for models and benchmarks. | huggingface/skills |
| `hugging-face-model-trainer` | Trains or fine-tunes LLMs with TRL on Hugging Face Jobs. | huggingface/skills |
| `hugging-face-jobs` | Runs compute jobs on Hugging Face infrastructure for evaluation, generation, or training workflows. | huggingface/skills |
| `evaluating-llms-harness` | Runs standardized academic benchmarks such as MMLU, HumanEval, GSM8K, and TruthfulQA. | AI-Research-SKILLs |
| `serving-llms-vllm` | Serves LLMs with high-throughput inference and OpenAI-compatible endpoints. | AI-Research-SKILLs |
This repository is a curated list, not a unified marketplace. You usually install a skill from its original source repository into a Codex skill directory.
Common Codex skill locations include:
- repository scope: `.codex/skills/<skill-name>/`
- user scope: `~/.codex/skills/<skill-name>/`
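To see which skills are already present in the two locations named above, a short listing helper can be enough. This is a sketch under stated assumptions: the function name `list_skills` is hypothetical, and it treats any direct subfolder containing a `SKILL.md` as a skill, which may not match every rule Codex applies during discovery.

```shell
# Hypothetical helper: print "<skill-name><TAB><root>" for every direct
# subfolder of the given roots that contains a SKILL.md file.
list_skills() {
  local root skill_file
  for root in "$@"; do
    # Skip roots that do not exist on this machine.
    [ -d "$root" ] || continue
    for skill_file in "$root"/*/SKILL.md; do
      # An unmatched glob stays literal, so confirm the file is real.
      [ -f "$skill_file" ] || continue
      printf '%s\t%s\n' "$(basename "$(dirname "$skill_file")")" "$root"
    done
  done
}

# Check the repository-scope and user-scope locations.
list_skills ".codex/skills" "$HOME/.codex/skills"
```

Run it from the repository root so that the relative `.codex/skills` path resolves to the repository scope.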
Example 1: install an official Codex-oriented skill from openai/skills

```shell
mkdir -p ~/.codex/skills
cd /tmp
git clone --depth 1 https://github.com/openai/skills.git
cp -R skills/skills/.curated/notion-research-documentation ~/.codex/skills/
```

Example 2: install a research skill from AI-Research-SKILLs

```shell
mkdir -p ~/.codex/skills
cd /tmp
git clone --depth 1 https://github.com/Orchestra-Research/AI-Research-SKILLs.git
cp -R AI-Research-SKILLs/20-ml-paper-writing ~/.codex/skills/ml-paper-writing
```

Once the folder is in a valid Codex skill location, you can invoke it naturally in your prompt.
Examples:
```text
Use the ml-paper-writing skill to turn this repo into a NeurIPS-style draft.
Use brainstorming-research-ideas to generate three defensible project directions.
Use notion-research-documentation to turn these source notes into a cited literature brief.
Use evaluating-llms-harness to benchmark this checkpoint on MMLU and GSM8K.
Use gradio to build a polished demo for this paper artifact.
```

- Pick one skill for one clear bottleneck.
- Start with a narrow task, not a giant workflow.
- Read the upstream `SKILL.md` before relying on output.
- For academic work, it is worth manually checking citations, claims, equations, dataset handling, and benchmark settings, especially when copyright, privacy, attribution, or research-integrity requirements matter.
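One lightweight way to follow the advice about reading the upstream `SKILL.md` first is to print its header from the cloned repository before copying anything into a skill directory. The sketch below assumes the file opens with a front-matter block delimited by `---` lines, which is an assumption about the skill format and not something this list guarantees. The function name `show_front_matter` is hypothetical.

```shell
# Hypothetical helper: print the lines between the first two "---"
# delimiters of a file, or nothing if the file does not start with "---".
show_front_matter() {
  awk '
    NR == 1 && $0 != "---" { exit }                   # no front matter
    $0 == "---"            { if (++n == 2) exit; next }  # delimiter line
    n == 1                 { print }                  # inside the block
  ' "$1"
}

# Example, assuming the clone from the installation steps is in /tmp:
# show_front_matter /tmp/skills/skills/.curated/notion-research-documentation/SKILL.md
```

If the helper prints nothing, open the file directly; the skill may describe its triggers in the body instead.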
The content of this repository is released under the MIT License.
Third-party skills linked from this list keep their own licenses. Always check the original repository before installing or redistributing anything.
If you notice a dead link, a compatibility change, or a clearly better entry for the list, a short issue or PR is enough.