
Conversation

@Wauplin (Contributor) commented Nov 25, 2025

Offline mode / env variables changes:

  • The is_offline_mode helper now takes its value directly from huggingface_hub.constants, without the intermediate _is_offline_mode variable. It looks like that variable was only useful for testing purposes. In practice it's much better to rely solely on the hfh one, so that if a user monkeypatches it, it is monkeypatched globally in huggingface_hub, transformers, and other libraries (see the sketch after this list).
  • related to Harmonize HF environment variables + other cleaning #27564, I've removed some deprecated env variables so that we rely only on huggingface_hub.constants.HF_HUB_CACHE. It is no longer possible to define a cache specific to transformers (which is on purpose^^):
    • TRANSFORMERS_CACHE
    • PYTORCH_TRANSFORMERS_CACHE
    • PYTORCH_PRETRAINED_BERT_CACHE
  • updated the offline tests to make them cleaner (use unittest.mock.patch, less logging, use pytest.raises, etc.)
  • updated the docs to mention only HF_HUB_CACHE / HF_HOME / XDG_CACHE_HOME
  • removed transformers.utils.hub.torch_cache_home (not used anywhere)
  • removed transformers.utils.hub.default_cache_path (not used anywhere, but exposed at root)
  • removed transformers.utils.hub.HUGGINGFACE_CO_EXAMPLES_TELEMETRY (not used anywhere)
  • removed transformers.utils.hub.HUGGINGFACE_CO_PREFIX (not used anywhere, but exposed at root)
  • removed transformers.utils.hub.HUGGINGFACE_CO_RESOLVE_ENDPOINT (was used in some legacy parts and exposed at root; no longer used)

Basically, "do not parse high-level env variables from transformers".
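
For illustration, here's a minimal sketch of the new behavior (it mirrors the diff further down; the monkeypatching lines are just to demonstrate the point):

```python
from huggingface_hub import constants


def is_offline_mode():
    # Read the flag at call time instead of caching it in a module-level
    # variable, so any later change to the constant is picked up.
    return constants.HF_HUB_OFFLINE


# Patching the single source of truth now flips offline mode everywhere,
# for huggingface_hub, transformers, and any other library reading it:
constants.HF_HUB_OFFLINE = True
assert is_offline_mode()
```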


PushToHubMixin changes

Also some changes in PushToHubMixin:

  • removed the deprecated organization and repo_url arguments from PushToHubMixin._create_repo
  • therefore removed PushToHubMixin._create_repo entirely (it had become just an alias, so not worth keeping)
  • removed the deprecated organization argument from PushToHubMixin
  • removed the deprecated repo_path_or_name argument from PushToHubMixin
  • removed ignore_metadata_errors from PushToHubMixin. This parameter was added in #28405, but more as a copy-paste (see comment) than as a deliberate decision. In practice, if we ignore errors while loading the model card, we won't be able to push the card back to the Hub. Better to fail early here than to offer an option that fails later. (Also removed it from the create_and_tag_model_card helper.)
  • since push_to_hub no longer uses **kwargs, removed it from the method signature
  • made the arguments of push_to_hub keyword-only to avoid confusion, and reordered them. Only repo_id can be positional since it's the main argument. Same as what is done on the huggingface_hub side (see the sketch after this list).
  • removed the use_temp_dir argument from push_to_hub; we now use a tmp dir in all cases
  • removed the working_or_temp_dir helper (not used anymore)
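
As a rough sketch, the resulting calling convention looks like this (parameter names other than repo_id and the body are illustrative, not the exact transformers implementation):

```python
import tempfile


class PushToHubMixin:
    def push_to_hub(self, repo_id: str, *, commit_message: str | None = None, token: str | None = None):
        # The bare `*` makes everything after `repo_id` keyword-only, and the
        # upload always goes through a temporary directory (no local saving).
        with tempfile.TemporaryDirectory() as tmp_dir:
            self.save_pretrained(tmp_dir)
            ...  # upload the folder content to `repo_id` on the Hub


# model.push_to_hub("org/model", "a message")                # TypeError now
# model.push_to_hub("org/model", commit_message="a message")  # OK
```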

Open questions!

There are a few more things I would like to clean up as well, either in this PR or in a follow-up one. Not 100% sure though, as one might disagree with me^^

  • IMO push_to_hub should not save to a local dir at all. Currently, if the model-name part of repo_id is also a local folder, the model is saved into it and then pushed to the Hub, unless use_temp_dir=True is passed. This logic is super unintuitive in my opinion (and likely dates back to when we pushed from a git repo). It also requires extra logic to determine which files have been updated (and push only those). All in all, quite a lot of complexity for little benefit. For the record, PyTorchModelHubMixin always uses a tmp dir.
    • => Done in this PR ✔️
  • do we still want to allow saving with safe_serialization=False? The v5 release feels like a very good opportunity to definitively get rid of non-safetensors serialization. WDYT?
    • => yes, but in a future PR
  • in push_to_hub(...) all parameters are currently positional, meaning the order counts, which makes it harder to remove or add one. I would strongly suggest using * to explicitly require keyword arguments. This is what we do in huggingface_hub and in other places in transformers. Let's take advantage of v5 to introduce this breaking change?
    • => Done in this PR ✔️
  • currently max_shard_size defaults to 5GB. Recommendations are now to use <50GB chunks (the recommendation increased a lot since Xet became the default). Should we increase the default value to e.g. 48GB? (Not 50, to avoid files ending up slightly larger than 50.) See the example after this list.
    • => maybe, depends on I/O performance
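
For reference, the shard size can already be overridden per call, so users who want larger chunks don't need to wait for a default change (the repo name below is made up):

```python
# Upload with 48GB shards instead of the current 5GB default:
model.push_to_hub("my-username/my-model", max_shard_size="48GB")

# The same parameter works for local serialization:
model.save_pretrained("./my-model", max_shard_size="48GB")
```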

I'm willing to take care of all of these changes if approved. Sorry to bring this up so late in the release cycle; for some reason I thought I had already taken a look at this, but that doesn't seem to be the case^^ Let me know @huggingface/transformers-core-maintainers

@Wauplin Wauplin requested a review from a team November 25, 2025 12:35
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@LysandreJik (Member) commented

Thanks @Wauplin!

  1. IMO push_to_hub should not save to a local dir at all. Currently, if the model-name part of repo_id is also a local folder, the model is saved into it and then pushed to the Hub, unless use_temp_dir=True is passed. This logic is super unintuitive in my opinion (and likely dates back to when we pushed from a git repo). It also requires extra logic to determine which files have been updated (and push only those). All in all, quite a lot of complexity for little benefit. For the record, PyTorchModelHubMixin always uses a tmp dir.
    • => my suggestion is to entirely remove use_temp_dir and always use a tmp dir

Yes, this sounds good to me, cc @SunMarc for the trainer

  1. do we still want to allow saving with safe_serialization=False? The v5 release feels like a very good opportunity to definitively get rid of non-safetensors serialization. WDYT?

Yes, this could be cool! Maybe a bit of work is necessary here, so it might be better to land it in RC1 rather than RC0.

  1. in push_to_hub(...) all parameters are currently positional, meaning the order counts, which makes it harder to remove or add one. I would strongly suggest using * to explicitly require keyword arguments. This is what we do in huggingface_hub and in other places in transformers. Let's take advantage of v5 to introduce this breaking change?

No problem for me!

  1. currently max_shard_size defaults to 5GB. Recommendations are now to use <50GB chunks (the recommendation increased a lot since Xet became the default). Should we increase the default value to e.g. 48GB? (Not 50, to avoid files ending up slightly larger than 50.)

This is the only one I'm not sure about -> having smaller shards means it's easier to parallelize loading from checkpoints. I'll let @ArthurZucker complete this, as he's been working with this quite a bit lately.

@ArthurZucker (Collaborator) left a comment

I have not really seen the exact line, but it's something we do need to study: smaller shards -> better I/O / less file locking, potentially?

I am probably wrong, as we can now read a file in parallel (which we did not do before), so it should be fine on main, but let's test it out please.

@Wauplin (Contributor, Author) commented Nov 25, 2025

Thanks for the quick answers! I'll add 1. and 3. to this PR since they're small-ish and quite related.

For 2. (safe serialization), I'll start a new PR later this week.

For 4., I'd rather not touch it for now. It seems like an interesting topic, but it's lower priority and requires investigation.

@SunMarc (Member) commented Nov 25, 2025

Yes, this sounds good to me, cc @SunMarc for the trainer

Happy to have this. I'm not sure we use use_temp_dir at all in the Trainer, as we have our own push_to_hub.

@Wauplin (Contributor, Author) commented Nov 25, 2025

Thanks for your feedback @LysandreJik @ArthurZucker @SunMarc! I've addressed the remaining open questions, so the PR should now be in a final state for review :) If I see anything new, I'll handle it in a separate PR.

@LysandreJik (Member) left a comment

Thank you @Wauplin

Comment on lines 90 to 135
```diff
-_is_offline_mode = huggingface_hub.constants.HF_HUB_OFFLINE
-
-
 def is_offline_mode():
-    return _is_offline_mode
+    return constants.HF_HUB_OFFLINE


-torch_cache_home = os.getenv("TORCH_HOME", os.path.join(os.getenv("XDG_CACHE_HOME", "~/.cache"), "torch"))
-default_cache_path = constants.default_cache_path
-
-# Determine default cache directory. Lots of legacy environment variables to ensure backward compatibility.
+# Determine default cache directory.
 # The best way to set the cache path is with the environment variable HF_HOME. For more details, check out this
 # documentation page: https://huggingface.co/docs/huggingface_hub/package_reference/environment_variables.
 #
 # In code, use `HF_HUB_CACHE` as the default cache path. This variable is set by the library and is guaranteed
 # to be set to the right value.
-#
-# TODO: clean this for v5?
-PYTORCH_PRETRAINED_BERT_CACHE = os.getenv("PYTORCH_PRETRAINED_BERT_CACHE", constants.HF_HUB_CACHE)
-PYTORCH_TRANSFORMERS_CACHE = os.getenv("PYTORCH_TRANSFORMERS_CACHE", PYTORCH_PRETRAINED_BERT_CACHE)
-TRANSFORMERS_CACHE = os.getenv("TRANSFORMERS_CACHE", PYTORCH_TRANSFORMERS_CACHE)

 HF_MODULES_CACHE = os.getenv("HF_MODULES_CACHE", os.path.join(constants.HF_HOME, "modules"))
 TRANSFORMERS_DYNAMIC_MODULE_NAME = "transformers_modules"
 SESSION_ID = uuid4().hex

-# Add deprecation warning for old environment variables.
-for key in ("PYTORCH_PRETRAINED_BERT_CACHE", "PYTORCH_TRANSFORMERS_CACHE", "TRANSFORMERS_CACHE"):
-    if os.getenv(key) is not None:
-        warnings.warn(
-            f"Using `{key}` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.",
-            FutureWarning,
-        )
-
-
 S3_BUCKET_PREFIX = "https://s3.amazonaws.com/models.huggingface.co/bert"
 CLOUDFRONT_DISTRIB_PREFIX = "https://cdn.huggingface.co"

 _staging_mode = os.environ.get("HUGGINGFACE_CO_STAGING", "NO").upper() in ENV_VARS_TRUE_VALUES
-_default_endpoint = "https://hub-ci.huggingface.co" if _staging_mode else "https://huggingface.co"
-
-HUGGINGFACE_CO_RESOLVE_ENDPOINT = _default_endpoint
-if os.environ.get("HUGGINGFACE_CO_RESOLVE_ENDPOINT", None) is not None:
-    warnings.warn(
-        "Using the environment variable `HUGGINGFACE_CO_RESOLVE_ENDPOINT` is deprecated and will be removed in "
-        "Transformers v5. Use `HF_ENDPOINT` instead.",
-        FutureWarning,
-    )
-    HUGGINGFACE_CO_RESOLVE_ENDPOINT = os.environ.get("HUGGINGFACE_CO_RESOLVE_ENDPOINT", None)
-HUGGINGFACE_CO_RESOLVE_ENDPOINT = os.environ.get("HF_ENDPOINT", HUGGINGFACE_CO_RESOLVE_ENDPOINT)
-HUGGINGFACE_CO_PREFIX = HUGGINGFACE_CO_RESOLVE_ENDPOINT + "/{model_id}/resolve/{revision}/{filename}"
-HUGGINGFACE_CO_EXAMPLES_TELEMETRY = HUGGINGFACE_CO_RESOLVE_ENDPOINT + "/api/telemetry/examples"
```

Member commented:

Magnificent

"""
try:
# Check if the model card is present on the remote repo
model_card = ModelCard.load(repo_id, token=token, ignore_metadata_errors=ignore_metadata_errors)
Member commented:

Are you sure we don't want to ignore metadata errors when it makes sense?

@Wauplin (Contributor, Author) replied:

Pretty sure, I would say. To be honest, I'm not sure anyone really used this parameter: it's hidden, and even when you get an error, nothing tells you to use it. Also, if a metadata error happens, we won't be able to push the model card back to the Hub afterwards.
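
For context, a quick sketch of the failure mode being discussed (the repo id is made up; ModelCard.load is the huggingface_hub API):

```python
from huggingface_hub import ModelCard

# With ignore_metadata_errors=True, a card whose YAML front matter is invalid
# still loads, but the broken metadata is dropped, so pushing the card back
# would overwrite the repo's card with a degraded version. Failing early on
# load avoids that.
card = ModelCard.load("some-org/some-model", ignore_metadata_errors=True)
```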

@github-actions commented

[For maintainers] Suggested jobs to run (before merge)

run-slow: conditional_detr, convnext, deformable_detr, dit, dpt, glpn, got_ocr2, groupvit, internvl, llama, maskformer

@Wauplin Wauplin enabled auto-merge (squash) November 27, 2025 14:54
@LysandreJik LysandreJik disabled auto-merge November 27, 2025 14:54
@LysandreJik LysandreJik merged commit f624084 into main Nov 27, 2025
11 of 24 checks passed
@LysandreJik LysandreJik deleted the cleanup-offline-mode-and-cache-dir branch November 27, 2025 14:54