Update dependency sentence-transformers to v5 (#3)
Open
mend-for-github-com[bot] wants to merge 1 commit into main
This PR contains the following updates:
`sentence-transformers`: `==2.2.2` → `==5.3.0`

Release Notes
huggingface/sentence-transformers (sentence-transformers)
v5.3.0 - Improved Contrastive Learning, New Losses, and Transformers v5 Compatibility
This minor version brings several improvements to contrastive learning:
`MultipleNegativesRankingLoss` now supports alternative InfoNCE formulations (symmetric, GTE-style) and optional hardness weighting for harder negatives. Two new losses are introduced: `GlobalOrthogonalRegularizationLoss` for embedding space regularization and `CachedSpladeLoss` for memory-efficient SPLADE training. The release also adds a faster hashed batch sampler, fixes `GroupByLabelBatchSampler` for triplet losses, and ensures full compatibility with the latest Transformers v5 versions. Install this version with `pip install sentence-transformers==5.3.0`.
Updated MultipleNegativesRankingLoss (a.k.a. InfoNCE)
`MultipleNegativesRankingLoss` received two major upgrades: support for alternative InfoNCE formulations from the literature, and optional hardness weighting to up-weight harder negatives.

Support other InfoNCE variants (#3607)
`MultipleNegativesRankingLoss` now supports several well-known contrastive loss variants from the literature through the new `directions` and `partition_mode` parameters. Previously, this loss only supported the standard forward direction (query → doc). You can now configure which similarity interactions are included in the loss:
- `"query_to_doc"` (default): For each query, its matched document should score higher than all other documents.
- `"doc_to_query"`: The symmetric reverse: for each document, its matched query should score higher than all other queries.
- `"query_to_query"`: For each query, all other queries should score lower than its matched document.
- `"doc_to_doc"`: For each document, all other documents should score lower than its matched query.

The `partition_mode` parameter controls how scores are normalized: `"joint"` computes a single softmax over all directions, while `"per_direction"` computes a separate softmax per direction and averages the losses. These combine to reproduce several loss formulations from the literature:
- Standard InfoNCE (default, unchanged behavior)
- Symmetric InfoNCE (Günther et al. 2024): adds the reverse direction so both queries and documents are trained to find their match
- GTE improved contrastive loss (Li et al. 2023): adds same-type negatives (query <-> query, doc <-> doc) for a stronger training signal, especially useful with pairs-only data
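The variants above can be sketched framework-free; the helper below is illustrative only (hypothetical names, plain Python instead of the library's PyTorch code, and the library's exact candidate bookkeeping may differ). The variants reduce to which candidate scores enter the softmax denominator and whether the directions share one softmax:

```python
import math

def candidates(sim, i, direction):
    # Candidate scores for anchor i under one direction; sim[i][j] is the
    # query-i / document-j similarity and (i, i) is the matched pair.
    n = len(sim)
    if direction == "query_to_doc":
        return [sim[i][j] for j in range(n)]
    if direction == "doc_to_query":
        return [sim[j][i] for j in range(n)]
    raise ValueError(direction)

def info_nce(sim, directions=("query_to_doc",), partition_mode="joint"):
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = math.exp(sim[i][i])
        if partition_mode == "joint":
            # Single softmax over the candidates of every active direction.
            denom = sum(math.exp(s) for d in directions for s in candidates(sim, i, d))
            total += -math.log(pos / denom)
        else:  # "per_direction": one softmax per direction, losses averaged
            per_dir = [
                -math.log(pos / sum(math.exp(s) for s in candidates(sim, i, d)))
                for d in directions
            ]
            total += sum(per_dir) / len(per_dir)
    return total / n
```

On a symmetric similarity matrix, `"per_direction"` with both directions matches the single-direction loss, while `"joint"` is larger because its denominator pools candidates from both directions.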
Hardness-weighted contrastive learning (#3667)
Adds optional hardness weighting to `MultipleNegativesRankingLoss` and `CachedMultipleNegativesRankingLoss`, inspired by Lan et al. 2025 (LLaVE). This up-weights harder negatives in the softmax by adding `hardness_strength * stop_grad(cos_sim)` to selected negative logits. The feature is off by default (`hardness_mode=None`), so existing behavior is unchanged.
The `hardness_mode` parameter controls which negatives receive the penalty:
- `"in_batch_negatives"`: Penalizes in-batch negatives only (positives and hard negatives from other samples). Works with all data formats, including pairs-only.
- `"hard_negatives"`: Penalizes explicit hard negatives only (columns beyond the first two). Only active when hard negatives are provided.
- `"all_negatives"`: Penalizes both in-batch and hard negatives, leaving only the positive unpenalized.
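The weighting idea can be illustrated with a minimal, framework-free sketch (hypothetical function, not the library API): each selected negative logit is shifted by `hardness_strength` times its own similarity, and in the real loss that added term is wrapped in a stop-gradient so it acts as a constant during backpropagation:

```python
import math

def hardness_weighted_nll(pos_sim, neg_sims, hardness_strength=0.0):
    # One anchor's InfoNCE-style negative log-likelihood where each selected
    # negative logit s becomes s + hardness_strength * s; higher-similarity
    # (harder) negatives therefore get a bigger bump before the softmax.
    shifted = [s + hardness_strength * s for s in neg_sims]
    denom = math.exp(pos_sim) + sum(math.exp(s) for s in shifted)
    return -math.log(math.exp(pos_sim) / denom)
```

With `hardness_strength=0.0` this reduces to the unweighted loss, matching the "off by default" behavior.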
New loss: GlobalOrthogonalRegularizationLoss (#3654)

Introduces `GlobalOrthogonalRegularizationLoss` (Zhang et al. 2017), a regularization loss that encourages embeddings to be well-distributed in the embedding space. It penalizes two things: (1) a high mean pairwise similarity across unrelated embeddings, and (2) a high second moment of similarities (which indicates clustering). This loss is meant to be combined with a primary contrastive loss like `MultipleNegativesRankingLoss`. By wrapping both losses in a single module, you can share embeddings and only require one forward pass.
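A minimal sketch of the two penalty terms, assuming unit-normalized embeddings with dot-product similarity (illustrative only; the library's exact formulation may differ). Random unit vectors in `d` dimensions have an expected similarity of 0 and an expected squared similarity of 1/d, so only the excess above 1/d is penalized:

```python
def gor_penalty(embeddings):
    # Global Orthogonal Regularization idea (Zhang et al. 2017): over all
    # pairs of (assumed unrelated) embeddings, penalize a high mean
    # similarity and a high second moment of similarities.
    d = len(embeddings[0])
    sims = [
        sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
        for i in range(len(embeddings))
        for j in range(i + 1, len(embeddings))
    ]
    m1 = sum(sims) / len(sims)                 # first moment of similarities
    m2 = sum(s * s for s in sims) / len(sims)  # second moment
    return m1 ** 2 + max(0.0, m2 - 1.0 / d)
```

For perfectly orthogonal embeddings the penalty is zero; clustered embeddings raise both moments and therefore the penalty.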
New loss: CachedSpladeLoss for memory-efficient SPLADE training (#3670)

Introduces `CachedSpladeLoss`, a gradient-cached version of `SpladeLoss` that enables training SPLADE models with larger batch sizes without additional GPU memory. It applies the GradCache technique at the `SpladeLoss` wrapper level, so both the base loss and the regularizers receive pre-computed embeddings; no changes to existing base losses or regularizers are needed.

Faster NoDuplicatesBatchSampler with hashing (#3611)
Adds a `NO_DUPLICATES_HASHED` batch sampler option, which uses the existing `NoDuplicatesBatchSampler` with `precompute_hashes=True`. This pre-computes xxhash 64-bit values for each sample, providing significant speedups for large batch sizes at a small memory cost. Requires the `xxhash` library.
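A simplified sketch of the precompute-then-deduplicate idea (illustrative, not the library's sampler code; stdlib `blake2b` stands in for the xxhash 64-bit hashes so the sketch has no extra dependency):

```python
import hashlib

def sample_hash(sample):
    # 64-bit content hash per sample, computed once up front: comparing
    # integers per batch is much cheaper than comparing raw sample values.
    payload = "\x1f".join(str(v) for v in sample).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(payload, digest_size=8).digest(), "big")

def no_duplicate_batches(samples, batch_size):
    # Greedy batching that defers any sample whose hash is already present
    # in the current batch, so no batch contains duplicate content.
    hashes = [sample_hash(s) for s in samples]  # the precompute step
    remaining = list(range(len(samples)))
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for pos, idx in enumerate(remaining):
            if len(batch) == batch_size:
                deferred.extend(remaining[pos:])
                break
            if hashes[idx] in seen:
                deferred.append(idx)  # duplicate within this batch: retry later
            else:
                batch.append(idx)
                seen.add(hashes[idx])
        batches.append(batch)
        remaining = deferred
    return batches
```

Duplicates may still land in different batches; the guarantee is only within each batch, matching the sampler's purpose.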
GroupByLabelBatchSampler improvements for triplet losses (#3668)

Fixes a critical issue where `GroupByLabelBatchSampler` produced ~99% single-class batches, causing zero gradients with triplet losses. The sampler now uses round-robin interleaving: each label emits 2 samples per round, and the label visit order is reshuffled every round. This guarantees that every batch contains multiple distinct labels, each with at least 2 samples.
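The fixed strategy can be sketched as follows (illustrative only, not the library code):

```python
import random
from collections import defaultdict

def label_round_robin_batches(labels, batch_size, seed=0):
    # Every round, visit the labels in a freshly shuffled order and let each
    # label emit up to 2 samples, so each batch mixes several labels with at
    # least 2 samples apiece: the condition triplet losses need for
    # non-zero gradients.
    rng = random.Random(seed)
    queues = defaultdict(list)
    for idx, label in enumerate(labels):
        queues[label].append(idx)
    order = []
    while any(queues.values()):
        round_labels = [label for label in queues if queues[label]]
        rng.shuffle(round_labels)  # reshuffle the label visit order per round
        for label in round_labels:
            emitted, queues[label] = queues[label][:2], queues[label][2:]
            order.extend(emitted)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
```

With four samples each of two labels and `batch_size=4`, every batch ends up with exactly two samples per label regardless of the shuffle.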
Transformers v5 compatibility

This release includes full compatibility updates for Transformers v5:
- `_nested_gather` method (#3664)
- Allow both `warmup_steps` and `warmup_ratio` until Transformers v4 support is dropped (#3645)

Minor Features
- Replace `requests` dependency with optional `httpx` dependency by @tomaarsen in #3618

Bug Fixes
- `MultipleNegativesRankingLoss` when `num_negatives=None` by @fuutot in #3636
- `MultipleNegativesRankingLoss` by @fuutot in #3641

Performance Improvements
Training Script Migrations (v2 to v3)
Documentation
All Changes
- [feat] Support excluding prompt tokens with pooling with left-padding tokenizer by @tomaarsen in #3598
- [tests] Relax the CI branches by @tomaarsen in #3610
- [compat] Expand test suite to full transformers v5 by @tomaarsen in #3615
- [deps] Replace requests dependency with optional httpx dependency by @tomaarsen in #3618
- [feat] Add triplets/n-tuple support to AnglE by @tomaarsen in #3609
- `http_get` with `load_dataset` - `wiki1m_for_simcse` and `STSbenchmark` by @omkar-334 in #3635
- [tests] Use 120s HF Hub timeout for tests by @tomaarsen in #3637
- `MultipleNegativesRankingLoss` when `num_negatives=None` by @fuutot in #3636
- Migrate `2_programming_train_bi-encoder.py` from v2 to v3 by @omkar-334 in #3629
- Migrate `train_simcse_from_file.py` from v2 to v3 by @omkar-334 in #3631
- `ContrastiveTensionLoss` and `ContrastiveTensionLossInBatchNegatives` by @omkar-334 in #3639
- `http_get` with `load_dataset` - `askubuntu` and `all-nli` by @omkar-334 in #3638
- `batch_size` args to CE evaluators by @omkar-334 in #3643
- `trec` dataset and migrate `training_batch_hard_trec.py` from v2 to v3 by @omkar-334 in #3624
- Migrate `train_stsb_ct.py` from v2 to v3 by @omkar-334 in #3626
- Migrate `train_ct_from_file.py` from v2 to v3 by @omkar-334 in #3625
- Migrate `train_askubuntu_ct-improved.py` from v2 to v3 by @omkar-334 in #3628
- Migrate `train_stsb_ct_improved` from v2 to v3 by @omkar-334 in #3627
- Migrate `train_askubuntu_simcse.py` from v2 to v3 by @omkar-334 in #3630
- Migrate `train_stsb_simcse.py` from v2 to v3 by @omkar-334 in #3648
- Migrate `train_askubuntu_ct.py` from v2 to v3 by @omkar-334 in #3647
- Migrate `train_ct-improved_from_file.py` from v2 to v3 by @omkar-334 in #3646
- `DenoisingAutoEncoderLoss.py` by @omkar-334 in #3652
- `model.fit` in test files by @omkar-334 in #3653
- [feat] Add support for T5Gemma and T5Gemma2 models by @tomaarsen in #3644
- [compat] Allow for both warmup_steps and warmup_ratio until transformers v4 support is dropped by @tomaarsen in #3645
- [feat] Introduce GlobalOrthogonalRegularizationLoss by @tomaarsen in #3654
- [compat] Introduce Transformers v5.2 compatibility: trainer _nested_gather moved by @tomaarsen in #3664
- [perf] Speed up NoDuplicatesBatchSampler iteration (NO_DUPLICATES and NO_DUPLICATES_HASHED) by @hotchpotch in #3658
- [fix] `GroupByLabelBatchSampler` to guarantee multi-class batches for triplet losses by @MrLoh in #3668
- [feat] Introduce `CachedSpladeLoss` for memory-efficient SPLADE training by @yjoonjang in #3670
- [docs] Add tips for adjusting batch size to improve processing speed by @tomaarsen in #3672
- [docs] CE trainer: Removed IterableDataset from train and eval dataset type hints by @tomaarsen in #3676
- [loss] Disallow query_to_query/doc_to_doc with partition_mode="per_direction" due to negative loss by @tomaarsen in #3677
- [feat] Add hardness-weighted contrastive learning to losses by @yjoonjang in #3667
- [fix] Fix model card generation with set_transform with new column names by @tomaarsen in #3680
- [tests] Add slow reproduction tests for most common models by @tomaarsen in #3681

New Contributors
A big thanks to my repeat contributors, a lot of this release originated from your contributions. Much appreciated!
Full Changelog: huggingface/sentence-transformers@v5.2.3...v5.3.0
v5.2.3 - Compatibility with Transformers v5.2 training
This patch release introduces compatibility with Transformers v5.2.
Install this version with `pip install sentence-transformers==5.2.3`.
Transformers v5.2 Support
Transformers v5.2 has just been released, and it updated its `Trainer` in such a way that training with Sentence Transformers would start failing on the logging step. Pull request #3664 resolved this issue. If you're not training with Sentence Transformers, then older versions of Sentence Transformers are also compatible with Transformers v5.2.
All Changes
- [compat] Introduce Transformers v5.2 compatibility: trainer _nested_gather moved by @tomaarsen (#3664)

Full Changelog: huggingface/sentence-transformers@v5.2.2...v5.2.3
v5.2.2 - Replace mandatory `requests` dependency with optional `httpx` dependency
This patch release replaces the mandatory `requests` dependency with an optional `httpx` dependency. Install this version with `pip install sentence-transformers==5.2.2`.
Transformers v5 Support
Transformers v5.0 and the `huggingface_hub` versions it requires have dropped support for `requests` in favor of `httpx`. The former was also used in `sentence-transformers`, but not listed explicitly as a dependency. This patch removes the use of `requests` in favor of `httpx`, although the latter is now optional and not automatically imported. This should also save some import time. Importing Sentence Transformers no longer crashes if `requests` is not installed.

All Changes
- [deps] Replace requests dependency with optional httpx dependency by @tomaarsen (#3618)

Full Changelog: huggingface/sentence-transformers@v5.2.1...v5.2.2
v5.2.1 - Joint Transformers v4 and v5 compatibility
This patch release adds support for the full Transformers v5 release.
Install this version with `pip install sentence-transformers==5.2.1`.
Transformers v5 Support
Sentence Transformers v5.2.0 already introduced support for the Transformers v5.0 release candidates; this release adds support for the full release. The intention is to maintain backward compatibility with v4.x. The library includes dual CI testing for both versions for now, allowing users to upgrade to the newest Transformers features when ready. In future versions, Sentence Transformers may start requiring Transformers v5.0 or higher.
All Changes
Full Changelog: huggingface/sentence-transformers@v5.2.0...v5.2.1
v5.2.0 - CrossEncoder multi-processing, multilingual NanoBEIR evaluators, similarity scores in `mine_hard_negatives`, Transformers v5 support
This minor release introduces multi-processing for CrossEncoder (rerankers), multilingual NanoBEIR evaluators, similarity score outputs in `mine_hard_negatives`, Transformers v5 support, the Python 3.9 deprecation, and more. Install this version with `pip install sentence-transformers==5.2.0`.
CrossEncoder Multi-processing
The `CrossEncoder` class now supports multiprocessing for faster inference on CPU and multi-GPU setups. This brings CrossEncoder functionality in line with the existing multiprocessing capabilities of `SentenceTransformer` models, allowing you to use multiple CPU cores or GPUs to speed up both the `predict` and `rank` methods when processing large batches of sentence pairs.

The implementation introduces these new methods, mirroring the SentenceTransformer approach:
- `start_multi_process_pool()`: Initialize a pool of worker processes.
- `stop_multi_process_pool()`: Clean up the worker pool.

Usage is straightforward with the new `pool` parameter. Alternatively, pass a list of devices to `device` to have `predict` and `rank` automatically create a pool behind the scenes.

This enhancement is particularly beneficial for CPU-based deployments and enables multi-GPU reranking in the `mine_hard_negatives` function, making hard negative mining faster for large datasets.

Multilingual NanoBEIR Support
The NanoBEIR evaluators now support custom dataset IDs, allowing for evaluation on non-English NanoBEIR collections. All three NanoBEIR evaluators (dense, sparse, and cross-encoder) support this functionality with a simple `dataset_id` parameter.
There are already supported translations for French, Arabic, German, Spanish, Italian, Portuguese, Norwegian, Swedish, Serbian, Korean, Japanese, and 22 Bharat languages in the NanoBEIR collection. Contact me (@tomaarsen) if you have found or created another translation and would like to get it added to the collection!
Similarity Scores in Hard Negatives Mining
The `mine_hard_negatives` function now includes an `output_scores` parameter that lets you export similarity scores alongside the mined negatives, with a corresponding output layout for each of the `output_format` options. For context, `labels` are binary values denoting whether the relevant pair was labeled as a positive or not, whereas `scores` are similarity scores from the `SentenceTransformer` or `CrossEncoder` model.

Additionally, the `n-tuple-scores` format has been replaced with the cleaner `output_format="n-tuple"` combined with `output_scores=True`.
Transformers v5 Support
Sentence Transformers now supports the latest Transformers v5.0 release while maintaining backward compatibility with v4.x. The library includes dual CI testing for both versions for now, allowing users to upgrade to the newest Transformers features when ready. In future versions, Sentence Transformers may start requiring Transformers v5.0 or higher.
Pillow now Optional
The Pillow library is now an optional dependency rather than a required one, reducing installation size for users who don't work with image-based models. Users who need image functionality can install it via `pip install sentence-transformers[image]` or directly with `pip install pillow`.

Python 3.9 Deprecation
Following Python's deprecation schedule, Sentence Transformers v5.2.0 has deprecated support for Python 3.9. Users are encouraged to upgrade to Python 3.10 or newer to continue receiving updates and new features.
Minor Changes
- `labels` argument in the loss that's used to train (#3506).
- The `sentence-transformers[onnx]` and `sentence-transformers[onnx-gpu]` extras now rely on the new `optimum-onnx` package with `optimum >= 2.0.0`.

All Changes
- [tests] Loosen safetensors test rtol/atol by @tomaarsen in #3572
- [deprecation] Deprecate Python 3.9, upgrade ruff by @tomaarsen in #3573
- [fix] Correct condition for restoring layer embeddings in TransformerDecorator/AdaptiveLayerLoss by @emapco in #3560
- [chore] Rename master to main, update outdated URLs by @tomaarsen in #3579
- [tests] Increase atol/rtol from 1e-6 to 1e-5 for higher test consistency by @tomaarsen in #3578
- [feat] Allow transformers v5.0, add CI for transformers >= v5 by @tomaarsen in #3586
- [deps] Use optimum-onnx now that both optimum-onnx and optimum-intel can use optimum==2.0.0 by @tomaarsen in #3587

New Contributors
An extra thanks to @Samoed, @NohTow, and @raphaelsty for engaging in valuable discussions in the pull requests, @omkar-334 for finding all kinds of open issues where possible, and @marquesafonso for working on a solid PR for multilingual NanoBEIR that we didn't end up going for.
Additionally, a big thanks to @milistu from Serbian-AI-Society, @NohTow & @raphaelsty from LightOn, @mlabonne and Fernando Fernandes Neto from LiquidAI, @lbourdois from CATIE-AQ and Arun Arumugam for creating the NanoBEIR translations that are supported out of the gate.
Full Changelog: huggingface/sentence-transformers@v5.1.2...v5.2.0
v5.1.2 - Sentence Transformers joins Hugging Face; model saving/loading improvements and loss compatibility
This patch celebrates the transition of Sentence Transformers to Hugging Face, and improves model saving, loading defaults, and loss compatibilities.
Install this version with `pip install sentence-transformers==5.1.2`.