1 file changed (+10 -5)

- # pyspark_huggingface
+ <p align="center">
+   <img alt="Hugging Face x Spark" src="https://pbs.twimg.com/media/FvN1b_2XwAAWI1H?format=jpg&name=large" width="352" style="max-width: 100%;">
+   <br/>
+   <br/>
+ </p>

  <p align="center">
      <a href="https://github.com/huggingface/pyspark_huggingface/releases"><img alt="GitHub release" src="https://img.shields.io/github/release/huggingface/pyspark_huggingface.svg"></a>
      <a href="https://huggingface.co/datasets/"><img alt="Number of datasets" src="https://img.shields.io/endpoint?url=https://huggingface.co/api/shields/datasets&color=brightgreen"></a>
  </p>

+ # Spark Data Source for Hugging Face Datasets
+
  A Spark Data Source for accessing [🤗 Hugging Face Datasets](https://huggingface.co/datasets):

- - Stream datasets directly from Hugging Face to your Spark application
- - Select subsets and splits
- - Apply projection and predicate filters for Parquet datasets
- - Push Spark DataFrames as Parquet files the Hugging Face Dataset Hub
+ - Stream datasets from Hugging Face as Spark DataFrames
+ - Select subsets and splits, apply projection and predicate filters
+ - Save Spark DataFrames as Parquet files to Hugging Face
  - Fully distributed
  - Authentication via `huggingface-cli login` or tokens
  - Compatible with Spark 4 (with auto-import)
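
The read and write features listed in the updated README could be exercised roughly as follows. This is a minimal sketch and is not part of the diff: the format name `"huggingface"`, the option names `config`, `split`, and `columns`, the JSON encoding of the column list, and the helper names `hf_read_options`, `read_hf_dataset`, and `write_hf_dataset` are all assumptions made for illustration, so check them against the package's own documentation before relying on them.

```python
import json


def hf_read_options(config=None, split=None, columns=None):
    """Build string-valued reader options (option names are assumptions)."""
    options = {}
    if config is not None:
        options["config"] = config  # dataset subset
    if split is not None:
        options["split"] = split  # e.g. "train" or "test"
    if columns is not None:
        # Assumed encoding: projection passed as a JSON list of column names.
        options["columns"] = json.dumps(list(columns))
    return options


def read_hf_dataset(spark, repo_id, **kwargs):
    """Load a Hub dataset as a Spark DataFrame (hypothetical helper)."""
    # Assumption: importing the package registers the data source;
    # the README suggests Spark 4 can auto-import it instead.
    import pyspark_huggingface  # noqa: F401

    reader = spark.read.format("huggingface").options(**hf_read_options(**kwargs))
    return reader.load(repo_id)


def write_hf_dataset(df, repo_id):
    """Save a Spark DataFrame to the Hub as Parquet (hypothetical helper)."""
    df.write.format("huggingface").save(repo_id)
```

Both Spark helpers expect an existing `SparkSession` and a prior login (for example via `huggingface-cli login`). Keeping the option building in a pure function makes it easy to inspect what would be sent to the reader without starting Spark.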