Commit 85d9aa1

update naming convention
1 parent b424900 commit 85d9aa1

1 file changed: +23 -0 lines changed

README.md

Lines changed: 23 additions & 0 deletions
@@ -58,6 +58,29 @@ For production use, consider these official data source implementations built wi
|--------------------------|-----------------------------------------------------------------------------------------------|----------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|
| **HuggingFace Datasets** | [@huggingface/pyspark_huggingface](https://github.com/huggingface/pyspark_huggingface) | Production-ready Spark Data Source for 🤗 Hugging Face Datasets | • Stream datasets as Spark DataFrames<br>• Select subsets/splits with filters<br>• Authentication support<br>• Save DataFrames to Hugging Face<br> |

## Data Source Naming Convention

When creating a custom data source with the Python Data Source API, follow these conventions for its `short_name`, the format string that users pass to `spark.read.format(...)`.

### Recommended Approach
66+
- **Use the system name directly**: Use lowercase system names like `huggingface`, `opensky`, `googlesheets`, etc.
67+
- This provides clear, intuitive naming that matches the service being integrated
68+
### Conflict Resolution
- **If there is a naming conflict**: use the format `pyspark.datasource.<system_name>`.
- Example: `pyspark.datasource.salesforce` if `salesforce` conflicts with an existing name.

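The two rules above can be sketched as a small helper. This is only an illustration: `resolve_short_name` and `taken_names` are hypothetical names that do not exist in this repository or in PySpark, and stripping whitespace from the system name is an assumption based on examples such as `googlesheets`.

```python
def resolve_short_name(system_name, taken_names):
    """Pick a short name for a data source (hypothetical helper).

    Prefers the lowercase system name and falls back to the
    `pyspark.datasource.<system_name>` form when that name is taken.
    """
    # Assumption: whitespace is dropped, so "Google Sheets" becomes "googlesheets".
    direct = "".join(system_name.split()).lower()
    if direct not in taken_names:
        return direct
    return f"pyspark.datasource.{direct}"


# No conflict: the system name is used directly.
print(resolve_short_name("OpenSky", set()))
# Conflict: the namespaced format is used instead.
print(resolve_short_name("salesforce", {"salesforce"}))
```

The first call prints `opensky` and the second prints `pyspark.datasource.salesforce`.
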
### Examples from this repository
```python
# Direct system naming (preferred)
spark.read.format("github").load()        # GithubDataSource
spark.read.format("googlesheets").load()  # GoogleSheetsDataSource
spark.read.format("opensky").load()       # OpenSkyDataSource

# Namespaced format (when conflicts exist)
spark.read.format("pyspark.datasource.opensky").load()
```

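For context, the string passed to `format(...)` is the one a data source class returns from its `name()` class method in PySpark's Python Data Source API. The sketch below shows where each naming style would be declared. The `WeatherDataSource` classes and the `weather` system are hypothetical and not part of this repository, `schema()` and `reader()` are omitted for brevity, and the fallback base class exists only so the snippet runs without PySpark installed.

```python
try:
    from pyspark.sql.datasource import DataSource
except ImportError:
    # Minimal stand-in so this sketch runs without PySpark installed.
    class DataSource:
        def __init__(self, options=None):
            self.options = options or {}

        @classmethod
        def name(cls):
            return cls.__name__


class WeatherDataSource(DataSource):
    """Hypothetical source that uses the system name directly (preferred)."""

    @classmethod
    def name(cls):
        return "weather"


class NamespacedWeatherDataSource(DataSource):
    """Hypothetical variant for when "weather" is already taken."""

    @classmethod
    def name(cls):
        return "pyspark.datasource.weather"


# With a live SparkSession (not created here), and once schema() and
# reader() are implemented, usage would look like:
#   spark.dataSource.register(WeatherDataSource)
#   spark.read.format("weather").load()
```
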
## Contributing
We welcome and appreciate any contributions to enhance and expand the custom data sources: