Commit 90516ca

Merge branch 'main' into docs/vector_search_with_hub_as_backend

2 parents 9c0ce48 + 9318e2e

File tree: 59 files changed, +113120 −8863 lines


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 .vscode
 .idea/
 .venv/
+.env
 
 **/.ipynb_checkpoints
 **/.DS_Store
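
Side note on the new rule: the notebooks touched by this commit read `HUGGINGFACEHUB_API_TOKEN` from the environment, and ignoring `.env` lets contributors keep that token in a local file without risking a commit. A minimal sketch of the pattern, assuming the `python-dotenv` package and a hypothetical local `.env` file (neither is added by this commit):

```python
# .env (ignored by git) would contain:  HUGGINGFACEHUB_API_TOKEN=hf_...
import os

from dotenv import load_dotenv  # pip install python-dotenv
from huggingface_hub import login

load_dotenv()  # copies key=value pairs from .env into os.environ
login(os.getenv("HUGGINGFACEHUB_API_TOKEN"))
```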

notebooks/en/_toctree.yml

Lines changed: 19 additions & 1 deletion
@@ -54,6 +54,8 @@
       title: Building RAG with Custom Unstructured Data
     - local: fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format
       title: Fine-tuning LLM to Generate Persian Product Catalogs in JSON Format
+    - local: finetune_t5_for_search_tag_generation
+      title: Fine-tuning T5 for Automatic GitHub Tag Generation with PEFT
     - local: llm_gateway_pii_detection
       title: LLM Gateway for PII Detection
     - local: information_extraction_haystack_nuextract
@@ -76,8 +78,18 @@
       title: Scaling Test-Time Compute for Longer Thinking in LLMs
     - local: fine_tuning_llm_grpo_trl
       title: Post training an LLM for reasoning with GRPO in TRL
-    - local: medical_rag_and_Reasoning
+    - local: trl_grpo_reasoning_advanced_reward
+      title: TRL GRPO Reasoning with Advanced Reward
+    - local: medical_rag_and_reasoning
       title: HuatuoGPT-o1 Medical RAG and Reasoning
+    - local: fine_tune_chatbot_docs_synthetic
+      title: Documentation Chatbot with Meta Synthetic Data Kit
+    - local: optuna_hpo_with_transformers
+      title: Hyperparameter Optimization with Optuna and Transformers
+    - local: function_calling_fine_tuning_llms_on_xlam
+      title: Fine-tuning LLMs for Function Calling with the xLAM Dataset
+
+
 
 - title: Computer Vision Recipes
   isExpanded: false
@@ -118,6 +130,12 @@
       title: Structured Generation from Images or Documents Using Vision Language Models
     - local: fine_tuning_granite_vision_sft_trl
       title: Fine-tuning Granite Vision with TRL
+    - local: fine_tuning_vlm_object_detection_grounding
+      title: Fine tuning a VLM for Object Detection Grounding using TRL
+    - local: fine_tuning_vlm_mpo
+      title: Fine-Tuning a Vision Language Model with TRL using MPO
+    - local: fine_tuning_vlm_grpo_trl
+      title: Post training an VLM for reasoning with GRPO using TRL
 
 - title: Search Recipes
   isExpanded: false

notebooks/en/agent_data_analyst.ipynb

Lines changed: 3 additions & 3 deletions
@@ -42,17 +42,17 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
-    "from smolagents import HfApiModel, CodeAgent\n",
+    "from smolagents import InferenceClientModel, CodeAgent\n",
     "from huggingface_hub import login\n",
     "import os\n",
     "\n",
     "login(os.getenv(\"HUGGINGFACEHUB_API_TOKEN\"))\n",
     "\n",
-    "model = HfApiModel(\"meta-llama/Llama-3.1-70B-Instruct\")\n",
+    "model = InferenceClientModel(\"meta-llama/Llama-3.1-70B-Instruct\")\n",
     "\n",
     "agent = CodeAgent(\n",
     "    tools=[],\n",

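Both hunks above are the same mechanical migration: smolagents renamed `HfApiModel` to `InferenceClientModel` (and the stale `execution_count` is reset to `null`). A self-contained sketch of the updated cell, assuming a smolagents release recent enough to ship the new name and a valid token in the environment:

```python
import os

from huggingface_hub import login
from smolagents import CodeAgent, InferenceClientModel

login(os.getenv("HUGGINGFACEHUB_API_TOKEN"))

# InferenceClientModel is the renamed HfApiModel: same role, a wrapper that
# calls the model through Hugging Face's Inference API.
model = InferenceClientModel("meta-llama/Llama-3.1-70B-Instruct")

agent = CodeAgent(
    tools=[],  # the data-analyst agent relies on generated code, not extra tools
    model=model,
)
```
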
notebooks/en/agent_rag.ipynb

Lines changed: 5 additions & 5 deletions
@@ -219,7 +219,7 @@
     "- *`tools`*: a list of tools that the agent will be able to call.\n",
     "- *`model`*: the LLM that powers the agent.\n",
     "\n",
-    "Our `model` must be a callable that takes as input a list of [messages](https://huggingface.co/docs/transformers/main/chat_templating) and returns text. It also needs to accept a `stop_sequences` argument that indicates when to stop its generation. For convenience, we directly use the `HfApiModel` class provided in the package to get a LLM engine that calls our [Inference API](https://huggingface.co/docs/api-inference/en/index).\n",
+    "Our `model` must be a callable that takes as input a list of [messages](https://huggingface.co/docs/transformers/main/chat_templating) and returns text. It also needs to accept a `stop_sequences` argument that indicates when to stop its generation. For convenience, we directly use the `InferenceClientModel` class provided in the package to get a LLM engine that calls our [Inference API](https://huggingface.co/docs/api-inference/en/index).\n",
     "\n",
     "And we use [meta-llama/Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct), served for free on Hugging Face's Inference API!\n",
     "\n",
@@ -232,9 +232,9 @@
     "metadata": {},
     "outputs": [],
     "source": [
-    "from smolagents import HfApiModel, ToolCallingAgent\n",
+    "from smolagents import InferenceClientModel, ToolCallingAgent\n",
     "\n",
-    "model = HfApiModel(\"meta-llama/Llama-3.1-70B-Instruct\")\n",
+    "model = InferenceClientModel(\"meta-llama/Llama-3.1-70B-Instruct\")\n",
     "\n",
     "retriever_tool = RetrieverTool(vectordb)\n",
     "agent = ToolCallingAgent(\n",
@@ -263,15 +263,15 @@
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"font-weight: bold\">How can I push a model to the Hub?</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
-    "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">╰─ HfApiModel - meta-llama/Llama-3.1-70B-Instruct ────────────────────────────────────────────────────────────────╯</span>\n",
+    "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">╰─ InferenceClientModel - meta-llama/Llama-3.1-70B-Instruct ────────────────────────────────────────────────────────────────╯</span>\n",
     "</pre>\n"
     ],
     "text/plain": [
     "\u001b[38;2;212;183;2m╭─\u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m \u001b[0m\u001b[1;38;2;212;183;2mNew run\u001b[0m\u001b[38;2;212;183;2m \u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╮\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[1mHow can I push a model to the Hub?\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
-    "\u001b[38;2;212;183;2m╰─\u001b[0m\u001b[38;2;212;183;2m HfApiModel - meta-llama/Llama-3.1-70B-Instruct \u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╯\u001b[0m\n"
+    "\u001b[38;2;212;183;2m╰─\u001b[0m\u001b[38;2;212;183;2m InferenceClientModel - meta-llama/Llama-3.1-70B-Instruct \u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╯\u001b[0m\n"
     ]
    },
    "metadata": {},

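The markdown cell in the first hunk pins down the contract the `model` argument must satisfy: a callable over a list of chat messages that also accepts `stop_sequences`. `InferenceClientModel` is the packaged convenience; a hedged sketch of roughly what it saves you from writing by hand, built on `huggingface_hub.InferenceClient` (illustrative only — the real class returns smolagents' own message objects rather than a bare string):

```python
from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Llama-3.1-70B-Instruct")

def homemade_model(messages, stop_sequences=None):
    """Bare-bones stand-in for InferenceClientModel: messages in, text out."""
    response = client.chat_completion(
        messages=messages,
        stop=stop_sequences,  # forward the agent's stop sequences
        max_tokens=1024,
    )
    return response.choices[0].message.content
```
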
notebooks/en/agent_text_to_sql.ipynb

Lines changed: 8 additions & 8 deletions
@@ -160,7 +160,7 @@
     "\n",
     "We use the `CodeAgent`, which is `transformers.agents`' main agent class: an agent that writes actions in code and can iterate on previous output according to the ReAct framework.\n",
     "\n",
-    "The `llm_engine` is the LLM that powers the agent system. `HfApiModel` allows you to call LLMs using Hugging Face's Inference API, either via Serverless or Dedicated endpoint, but you could also use any proprietary API: check out [this other cookbook](agent_change_llm) to learn how to adapt it."
+    "The `llm_engine` is the LLM that powers the agent system. `InferenceClientModel` allows you to call LLMs using Hugging Face's Inference API, either via Serverless or Dedicated endpoint, but you could also use any proprietary API: check out [this other cookbook](agent_change_llm) to learn how to adapt it."
    ]
   },
   {
@@ -169,11 +169,11 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "from smolagents import CodeAgent, HfApiModel\n",
+    "from smolagents import CodeAgent, InferenceClientModel\n",
     "\n",
     "agent = CodeAgent(\n",
     "    tools=[sql_engine],\n",
-    "    model=HfApiModel(\"meta-llama/Meta-Llama-3-8B-Instruct\"),\n",
+    "    model=InferenceClientModel(\"meta-llama/Meta-Llama-3-8B-Instruct\"),\n",
     ")"
    ]
   },
@@ -189,15 +189,15 @@
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"font-weight: bold\">Can you give me the name of the client who got the most expensive receipt?</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
-    "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">╰─ HfApiModel - meta-llama/Meta-Llama-3-8B-Instruct ──────────────────────────────────────────────────────────────╯</span>\n",
+    "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">╰─ InferenceClientModel - meta-llama/Meta-Llama-3-8B-Instruct ──────────────────────────────────────────────────────────────╯</span>\n",
     "</pre>\n"
     ],
     "text/plain": [
     "\u001b[38;2;212;183;2m╭─\u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m \u001b[0m\u001b[1;38;2;212;183;2mNew run\u001b[0m\u001b[38;2;212;183;2m \u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╮\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[1mCan you give me the name of the client who got the most expensive receipt?\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
-    "\u001b[38;2;212;183;2m╰─\u001b[0m\u001b[38;2;212;183;2m HfApiModel - meta-llama/Meta-Llama-3-8B-Instruct \u001b[0m\u001b[38;2;212;183;2m─────────────────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╯\u001b[0m\n"
+    "\u001b[38;2;212;183;2m╰─\u001b[0m\u001b[38;2;212;183;2m InferenceClientModel - meta-llama/Meta-Llama-3-8B-Instruct \u001b[0m\u001b[38;2;212;183;2m─────────────────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╯\u001b[0m\n"
     ]
    },
    "metadata": {},
@@ -396,15 +396,15 @@
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"font-weight: bold\">Which waiter got more total money from tips?</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
     "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span> <span style=\"color: #d4b702; text-decoration-color: #d4b702\">│</span>\n",
-    "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">╰─ HfApiModel - Qwen/Qwen2.5-72B-Instruct ────────────────────────────────────────────────────────────────────────╯</span>\n",
+    "<span style=\"color: #d4b702; text-decoration-color: #d4b702\">╰─ InferenceClientModel - Qwen/Qwen2.5-72B-Instruct ────────────────────────────────────────────────────────────────────────╯</span>\n",
     "</pre>\n"
     ],
     "text/plain": [
     "\u001b[38;2;212;183;2m╭─\u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m \u001b[0m\u001b[1;38;2;212;183;2mNew run\u001b[0m\u001b[38;2;212;183;2m \u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╮\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[1mWhich waiter got more total money from tips?\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
     "\u001b[38;2;212;183;2m│\u001b[0m \u001b[38;2;212;183;2m│\u001b[0m\n",
-    "\u001b[38;2;212;183;2m╰─\u001b[0m\u001b[38;2;212;183;2m HfApiModel - Qwen/Qwen2.5-72B-Instruct \u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╯\u001b[0m\n"
+    "\u001b[38;2;212;183;2m╰─\u001b[0m\u001b[38;2;212;183;2m InferenceClientModel - Qwen/Qwen2.5-72B-Instruct \u001b[0m\u001b[38;2;212;183;2m───────────────────────────────────────────────────────────────────────\u001b[0m\u001b[38;2;212;183;2m─╯\u001b[0m\n"
     ]
    },
    "metadata": {},
@@ -740,7 +740,7 @@
     "\n",
     "agent = CodeAgent(\n",
     "    tools=[sql_engine],\n",
-    "    model=HfApiModel(\"Qwen/Qwen2.5-72B-Instruct\"),\n",
+    "    model=InferenceClientModel(\"Qwen/Qwen2.5-72B-Instruct\"),\n",
     ")\n",
     "\n",
     "agent.run(\"Which waiter got more total money from tips?\")"

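Every hunk in this notebook is the same rename, but each agent here depends on a `sql_engine` tool defined earlier in the notebook, outside the lines shown. For orientation, a sketch of such a tool with smolagents' `@tool` decorator and SQLAlchemy (table setup omitted; the names and details are assumptions, not the notebook's exact code):

```python
from smolagents import tool
from sqlalchemy import create_engine, text

# Assumption: an in-memory SQLite database; the notebook populates its own
# receipts table before this point.
engine = create_engine("sqlite:///:memory:")

@tool
def sql_engine(query: str) -> str:
    """Performs SQL queries on the table. Returns a string of the result.

    Args:
        query: The query to perform. This should be correct SQL.
    """
    output = ""
    with engine.connect() as con:
        rows = con.execute(text(query))
        for row in rows:
            output += "\n" + str(row)
    return output
```

Wired up as in the final hunk, `CodeAgent(tools=[sql_engine], model=InferenceClientModel("Qwen/Qwen2.5-72B-Instruct"))` lets the model write a query, inspect the returned string, and iterate until `agent.run(...)` can answer.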