[E2E] Align `pytorch/benchmarks/dynamo` configs and `torchbench` dependencies to match what `torch-xpu-ops` uses #5390

anmyachev · 2025-10-27T10:11:26Z

Inspired by intel/torch-xpu-ops@779f899

CI:

https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/18837835379/job/53742874238 (huggingface)
Cannot access gated repo for url https://huggingface.co/google/gemma-2-2b/resolve/main/config.json. Access to model google/gemma-2-2b is restricted. You must have access to it and be authenticated to access it. Please log in.
https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/18854550236/job/53799063239 (torchbench; more models started working, for example: torchrec_dlrm)

Test all models: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/18856801306 (see summary)

Summary:

=========================================
Summary of only failed models:
Real failed models: 3 [['meta-llama/Llama-3.2-1B', 'eager_fail_to_run'], ['google/gemma-2-2b', 'eager_fail_to_run'], ['CamemBert', 'eager_fail_to_run']]
Real failed models: 3 [['google/gemma-2-2b', 'eager_fail_to_run'], ['meta-llama/Llama-3.2-1B', 'eager_fail_to_run'], ['CamemBert', 'eager_fail_to_run']]
Real failed models: 4 [['google/gemma-2-2b', 'eager_fail_to_run'], ['meta-llama/Llama-3.2-1B', 'eager_fail_to_run'], ['openai/whisper-tiny', 'fail_accuracy'], ['CamemBert', 'eager_fail_to_run']]
Real failed models: 3 [['meta-llama/Llama-3.2-1B', 'eager_fail_to_run'], ['CamemBert', 'eager_fail_to_run'], ['google/gemma-2-2b', 'eager_fail_to_run']]
Real failed models: 3 [['CamemBert', 'eager_fail_to_run'], ['meta-llama/Llama-3.2-1B', 'eager_fail_to_run'], ['google/gemma-2-2b', 'eager_fail_to_run']]
Real failed models: 1 [['CamemBert', 'eager_fail_to_run']]
Real failed models: 1 [['CamemBert', 'eager_fail_to_run']]
Real failed models: 1 [['CamemBert', 'eager_fail_to_run']]
Real failed models: 1 [['CamemBert', 'eager_fail_to_run']]
Real failed models: 1 [['CamemBert', 'eager_fail_to_run']]
Real failed models: 1 [['convit_base', 'eager_fail_to_run']]
Real failed models: 1 [['convit_base', 'eager_fail_to_run']]
Real failed models: 2 [['convit_base', 'eager_fail_to_run'], ['sebotnet33ts_256', 'fail_accuracy']]
Real failed models: 1 [['convit_base', 'eager_fail_to_run']]
Real failed models: 2 [['maml_omniglot', 'eager_fail_to_run'], ['functorch_maml_omniglot', 'eager_fail_to_run']]
Real failed models: 3 [['detectron2_fasterrcnn_r_50_fpn', 'eager_1st_run_OOM'], ['functorch_maml_omniglot', 'eager_fail_to_run'], ['maml_omniglot', 'eager_fail_to_run']]
Real failed models: 2 [['maml_omniglot', 'eager_fail_to_run'], ['functorch_maml_omniglot', 'eager_fail_to_run']]
Real failed models: 2 [['functorch_maml_omniglot', 'eager_fail_to_run'], ['maml_omniglot', 'eager_fail_to_run']]
Real failed models: 6 [['detectron2_fasterrcnn_r_50_dc5', 'eager_1st_run_OOM'], ['functorch_maml_omniglot', 'eager_fail_to_run'], ['detectron2_fasterrcnn_r_101_c4', 'eager_1st_run_OOM'], ['detectron2_fasterrcnn_r_50_c4', 'eager_1st_run_OOM'], ['maml_omniglot', 'eager_fail_to_run'], ['detectron2_fasterrcnn_r_101_dc5', 'eager_1st_run_OOM']]
Real failed models: 2 [['functorch_maml_omniglot', 'eager_fail_to_run'], ['maml_omniglot', 'eager_fail_to_run']]
Real failed models: 2 [['functorch_maml_omniglot', 'eager_fail_to_run'], ['maml_omniglot', 'eager_fail_to_run']]
Real failed models: 2 [['functorch_maml_omniglot', 'eager_fail_to_run'], ['maml_omniglot', 'eager_fail_to_run']]
Real failed models: 2 [['functorch_maml_omniglot', 'eager_fail_to_run'], ['maml_omniglot', 'eager_fail_to_run']]
Real failed models: 2 [['functorch_maml_omniglot', 'eager_fail_to_run'], ['maml_omniglot', 'eager_fail_to_run']]
ERROR: Found failed models!
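The summary repeats the same models across many jobs, so it can help to collapse it into one entry per model. The following is a minimal sketch, not part of the CI scripts: it assumes every relevant line has the `Real failed models: N [[model, status], ...]` shape shown above, and the function name `collect_failures` is made up for illustration.

```python
import ast
from collections import defaultdict

PREFIX = "Real failed models:"


def collect_failures(log_text):
    """Map each failed model to the set of statuses reported for it."""
    failures = defaultdict(set)
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX):
            continue  # skip separators and the final ERROR line
        # The remainder looks like: "2 [['a', 'x'], ['b', 'y']]"
        _count, _, payload = line[len(PREFIX):].strip().partition(" ")
        # The payload is a Python-style list literal, so literal_eval can read it.
        for model, status in ast.literal_eval(payload):
            failures[model].add(status)
    return dict(failures)


# A few lines copied from the summary above.
sample = """\
Real failed models: 1 [['CamemBert', 'eager_fail_to_run']]
Real failed models: 2 [['convit_base', 'eager_fail_to_run'], ['sebotnet33ts_256', 'fail_accuracy']]
Real failed models: 1 [['convit_base', 'eager_fail_to_run']]
ERROR: Found failed models!
"""

for model, statuses in sorted(collect_failures(sample).items()):
    print(model, sorted(statuses))
```

Collecting statuses into a set per model keeps a model that fails in different ways across jobs visible as a single entry.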

Error checking:

meta-llama/Llama-3.2-1B and google/gemma-2-2b work locally. The failure comes from the Hugging Face token used in CI.
CamemBert is not expected to work according to intel/torch-xpu-ops@779f899, so it can be ignored.
openai/whisper-tiny (fail_accuracy). This might be a Triton problem, but it is a new model that has not been tested before (I checked here: https://github.com/intel/torch-xpu-ops/actions/runs/17338511026/job/49263018812), so it is neither a regression nor a blocker.
sebotnet33ts_256 (fail_accuracy). This might be a Triton problem. However, it was already present in the previous iteration of validation: https://github.com/intel/torch-xpu-ops/actions/runs/17338511026/job/49263026473#step:18:7802, so it is neither a regression nor a blocker.
functorch_maml_omniglot and maml_omniglot are fixed in [E2E] Align torchbench dependencies to match what torch-xpu-ops uses #5398. The problem was in the environment.
convit_base. The error looks like this and is not related to Triton:

    return F.linear(input, self.weight, self.bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16

detectron2_fasterrcnn_r_50_fpn. The problem was already present in the previous iteration of validation: https://github.com/intel/torch-xpu-ops/actions/runs/17329304896/job/49900834883#step:14:41819
The other Detectron models (eager_1st_run_OOM) worked before; the cause of the failure is unknown.
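The triage above follows one rule: a failure is ignored if it is expected, tolerated if it already failed in the previous validation run, and otherwise treated as new. A minimal sketch of that rule follows. The function `triage` and the three input sets are hypothetical and filled in by hand from the notes above; nothing here reads real CI data.

```python
def triage(current, baseline, expected):
    """Split failed models into expected, pre-existing, and new failures."""
    report = {"expected": [], "pre_existing": [], "new": []}
    for model in sorted(current):
        if model in expected:
            report["expected"].append(model)      # documented as unsupported
        elif model in baseline:
            report["pre_existing"].append(model)  # failed before, not a regression
        else:
            report["new"].append(model)           # needs investigation
    return report


# Illustrative inputs taken from the notes above.
current = {
    "CamemBert",
    "sebotnet33ts_256",
    "detectron2_fasterrcnn_r_50_fpn",
    "detectron2_fasterrcnn_r_50_c4",
}
expected = {"CamemBert"}
baseline = {"sebotnet33ts_256", "detectron2_fasterrcnn_r_50_fpn"}

for bucket, models in triage(current, baseline, expected).items():
    print(bucket, models)
```

With these inputs only detectron2_fasterrcnn_r_50_c4 lands in the `new` bucket, which matches the observation that the other Detectron models worked before.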

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev · 2025-10-27T11:18:54Z

https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/18837835379/job/53742874238 (huggingface)
Cannot access gated repo for url https://huggingface.co/google/gemma-2-2b/resolve/main/config.json. Access to model google/gemma-2-2b is restricted. You must have access to it and be authenticated to access it. Please log in.

It looks like HUGGING_FACE_HUB_TOKEN needs to be updated:

intel-xpu-backend-for-triton/.github/workflows/e2e-reusable.yml

Line 199 in eff6a02

HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}

@kwasd could you take a look? This has P0 priority.
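A quick way to rule out a missing token before the benchmark starts is to check the environment up front. This is only a sketch: `find_hf_token` is a hypothetical helper, and it assumes the token arrives through `HF_TOKEN` or the older `HUGGING_FACE_HUB_TOKEN` variable. It detects an absent or empty token only; an expired or unauthorized token would still fail on gated models such as google/gemma-2-2b.

```python
import os

# Variable names checked in order; HF_TOKEN is the newer spelling.
TOKEN_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")


def find_hf_token(env=None):
    """Return (variable_name, token) for the first non-empty token variable."""
    env = os.environ if env is None else env
    for name in TOKEN_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None, None


name, _token = find_hf_token()
if name is None:
    print("No Hugging Face token set; gated models will fail to download.")
else:
    print(f"Using Hugging Face token from {name}.")
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.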


anmyachev linked an issue Oct 27, 2025 that may be closed by this pull request

[E2E][Accuracy] google/gemma-2-2b and meta-llama/Llama-3.2-1B Huggingface models are broken #5389

Closed

[E2E] Align benchmarks/dynamo configs to match what torch-xpu-ops uses

90fd146

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev force-pushed the amyachev/issue5389 branch from b6e5d0a to 90fd146 Compare October 27, 2025 10:15

anmyachev added 2 commits October 27, 2025 11:18

install rsync

b599938

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

update torch-xpu-ops pin

260d5f1

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev added 3 commits October 27, 2025 18:29

add '--disable-cudagraphs'

21cde65

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

update torchbench installation

2908c90

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

return 'pip install -e .'

020a92f

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev commented Oct 27, 2025

.github/workflows/e2e-reusable.yml (outdated, resolved)

Apply suggestion from @anmyachev

eedf6ab

anmyachev changed the title ~~[E2E] Align pytorch/benchmarks/dynamo configs to match what torch-xpu-ops uses~~ [E2E] Align pytorch/benchmarks/dynamo configs and torchbench dependencies to match what torch-xpu-ops uses Oct 27, 2025

anmyachev requested review from kwasd and whitneywhtsang October 27, 2025 21:44

anmyachev marked this pull request as ready for review October 27, 2025 21:44

don't use symlink in load action by default

5325768

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev commented Oct 28, 2025

.github/workflows/e2e-reusable.yml (resolved)

anmyachev added 3 commits October 28, 2025 15:29

Apply suggestion from @anmyachev

f4d5b13

define 'HF_TOKEN' also

e1027c2

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

move HF tokens to global scope

91e9746

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>

anmyachev marked this pull request as draft October 29, 2025 10:30

anmyachev removed a link to an issue Oct 29, 2025

[E2E][Accuracy] google/gemma-2-2b and meta-llama/Llama-3.2-1B Huggingface models are broken #5389

Closed

anmyachev mentioned this pull request Oct 29, 2025

Triton pin update in PyTorch #5407

Open

anmyachev closed this Oct 29, 2025

anmyachev deleted the amyachev/issue5389 branch October 29, 2025 13:46
