Merged
2 changes: 1 addition & 1 deletion data_collections_api/__init__.py
Original file line number Diff line number Diff line change
@@ -3,4 +3,4 @@
from __future__ import annotations

__version__ = "0.1.0"
__author__ = "Jacob Wilkins, Elliot Kasoar, Jas Kalaya, Alin Elena"
__author__ = "Jacob Wilkins, Elliot Kasoar, Jas Kalayan, Alin Elena"
9 changes: 6 additions & 3 deletions docs/source/conf.py
@@ -38,13 +38,16 @@
"sphinx.ext.mathjax",
"sphinx.ext.viewcode",
"sphinxcontrib.contentui",
"myst_parser",
"myst_nb",
]

nb_execution_mode = "off"

source_suffix = {
".rst": "restructuredtext",
".txt": "markdown",
".md": "markdown",
".txt": "myst-nb",
".md": "myst-nb",
".ipynb": "myst-nb",
}

apidoc_modules = [
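
Because the scraped diff has lost its `+`/`-` markers, the following is a minimal sketch of how this fragment of `docs/source/conf.py` would read after the change. It assumes, based on the 6-addition/3-deletion count, that `"myst_parser"` and the two `"markdown"` suffix entries are the removed lines; only the entries visible in the hunk are shown.

```python
# Sketch of the post-change state of the hunk above (not the full conf.py).
extensions = [
    "sphinx.ext.mathjax",
    "sphinx.ext.viewcode",
    "sphinxcontrib.contentui",
    "myst_nb",  # assumed to replace "myst_parser"
]

# Render notebooks as committed; do not execute cells during the docs build.
nb_execution_mode = "off"

# Every Markdown-like source, including notebooks, is routed through myst-nb.
source_suffix = {
    ".rst": "restructuredtext",
    ".txt": "myst-nb",
    ".md": "myst-nb",
    ".ipynb": "myst-nb",
}
```

With `nb_execution_mode = "off"`, the tutorial notebook's shell cells (cloning, `conda`, uploads) are shown in the docs but never run at build time.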
10 changes: 10 additions & 0 deletions docs/source/deposition_tutorial/index.rst
@@ -0,0 +1,10 @@
Tutorials
=========

This section lists tutorials on using ``data_collections_api``.

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   tutorial
35 changes: 35 additions & 0 deletions docs/source/deposition_tutorial/record.yaml
@@ -0,0 +1,35 @@
metadata:
  creators:
    - affiliations:
        - name: # name of institution
      person_or_org:
        given_name: # given name
        family_name: # family name
        identifiers:
          - identifier: # identifier number
            scheme: orcid
        name: # family name, given name
        type: personal
  title: # title
  description: # <p>A description of the data being uploaded</p>
  identifiers:
    - identifier: # link to publication
  publication_date: # 'YYYY-MM-DD'
  subjects:
    - subject: # <subject>
    - subject: # <subject>
  publisher: PSDI
  resource_type:
    id: model
  rights:
    - id: cc-by-4.0
  version: v1
custom_fields:
  dsmd:
    - # field1: 'value'
      # field2: 'value'
access:
  files: public
  record: public
files:
  enabled: true
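
As a rough illustration of what filling in this template involves, the sketch below checks a record (already loaded into a Python dict) for a few of the template's fields and for the `YYYY-MM-DD` date format. It is not the package's own validator: `check_record`, `REQUIRED_METADATA_KEYS`, and the example values are hypothetical, and the real schema may require more or accept other date forms.

```python
from datetime import datetime

# Hypothetical subset of the template's metadata keys, chosen for illustration.
REQUIRED_METADATA_KEYS = ("creators", "title", "publication_date", "resource_type")


def check_record(record: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    metadata = record.get("metadata") or {}

    # Flag template fields that are absent or still empty.
    for key in REQUIRED_METADATA_KEYS:
        if not metadata.get(key):
            problems.append(f"metadata.{key} is missing or empty")

    # The template asks for a 'YYYY-MM-DD' publication date.
    date = metadata.get("publication_date")
    if date:
        try:
            datetime.strptime(str(date), "%Y-%m-%d")
        except ValueError:
            problems.append(f"metadata.publication_date {date!r} is not YYYY-MM-DD")
    return problems


# Illustrative record with an invalid month in the date.
example = {
    "metadata": {
        "creators": [{"person_or_org": {"name": "Doe, Jane", "type": "personal"}}],
        "title": "Example dataset",
        "publication_date": "2025-13-01",
        "resource_type": {"id": "model"},
    }
}
print(check_record(example))
```

A check like this only catches obvious gaps before submission; the `data_collections validate` command shown in the tutorial remains the authoritative test.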
281 changes: 281 additions & 0 deletions docs/source/deposition_tutorial/tutorial.ipynb
@@ -0,0 +1,281 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "732db279",
"metadata": {},
"source": [
"# Deposit Data to the PSDI data-collections Repository"
]
},
{
"cell_type": "markdown",
"id": "04991dae",
"metadata": {},
"source": [
"This tutorial shows the steps involved in using the data-collections API to deposit data for review to the PSDI data-collections repository, which is built using InvenioRDM. PSDI (Physical Sciences Data Infrastructure) is an initiative to connect and provide data services for the physical sciences. One such service is a data repository for a collection of communities within the physical sciences to share their data."
]
},
{
"cell_type": "markdown",
"id": "49c23f56",
"metadata": {},
"source": [
"## Prerequisites"
]
},
{
"cell_type": "markdown",
"id": "13ab93b5",
"metadata": {},
"source": [
"### Create an access to token "
]
},
{
"cell_type": "markdown",
"id": "dc534f6c",
"metadata": {},
"source": [
"To use the data-collections-API for uploading data to an InvenioRDM instance, users first need to create an account on the repository instance. Once access is gained, a personal token can be created, usually using the following steps on the web interface of the instance:"
]
},
{
"cell_type": "markdown",
"id": "08ef72f6",
"metadata": {},
"source": [
"``Login > Account > Applications > Personal access tokens: Add New Token > set a Token name > Create > Copy Access token and store securely``"
]
},
{
"cell_type": "markdown",
"id": "dc1f8fdb",
"metadata": {},
"source": [
"In more detail, for the data-collections repository, steps are as follows:"
]
},
{
"cell_type": "markdown",
"id": "f192f754",
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"source": [
"1. Once logged in, click on Account to display the dropdown menu and choose the “Applications” option\n",
"\n",
"![screenshot01](images/screenshot01.png)"
]
},
{
"cell_type": "markdown",
"id": "0407ca30",
"metadata": {},
"source": [
"2. Add a new personal access token \n",
"\n",
"![screenshot02](images/screenshot02.png)\n"
]
},
{
"cell_type": "markdown",
"id": "831eb0b3",
"metadata": {},
"source": [
"3. Name the token, click create and save the subsequently displayed token securely, never share this token. This token will be used to access the repository via the API.\n",
"\n",
"![screenshot03](images/screenshot03.png)"
]
},
{
"cell_type": "markdown",
"id": "9ed6fe76",
"metadata": {},
"source": [
"### Software Installation"
]
},
{
"cell_type": "markdown",
"id": "402cb25b",
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"source": [
"If you are running this notebook as a container, data-collections-API and its dependencies are already installed and you can continue to the next section. Otherwise, the API can be installed into a python environment by cloning the repository containing the code and install this into a python environment, as shown below."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1269b106",
"metadata": {
"vscode": {
"languageId": "powershell"
}
},
"outputs": [],
"source": [
"# clone repository\n",
"! git clone https://github.com/PSDI-UK/data-collections-API\n",
"\n",
"# Create and activate a new python environment\n",
"! conda create -n data-collections-API-env python==3.13\n",
"! conda activate data-collections-API-env\n",
"\n",
"# Install the data-collections-API to your new python environment \n",
"%cd data-collections-API\n",
"! pip install ."
]
},
{
"cell_type": "markdown",
"id": "1a22383d",
"metadata": {},
"source": [
"Open this notebook whilst in your python environment when using the data-collections-API."
]
},
{
"cell_type": "markdown",
"id": "1eb4d079",
"metadata": {},
"source": [
"## Submit Data for Review to the PSDI data-collections"
]
},
{
"cell_type": "markdown",
"id": "fe445986",
"metadata": {},
"source": [
"### Submission file template"
]
},
{
"cell_type": "markdown",
"id": "f85f4ccc",
"metadata": {},
"source": [
"To submit data to data-collections, a metadata file is required along with the files you wish to upload. A template for the metadata required to submit a record to the data-collections repository can be found in the `record.yaml` file."
]
},
{
"cell_type": "markdown",
"id": "97378ed5",
"metadata": {},
"source": [
"### Choosing a community"
]
},
{
"cell_type": "markdown",
"id": "c7d8de33",
"metadata": {},
"source": [
"The deposition process for each community follows the same steps, however each community has its own domain specific metadata that can be populated in the submission file.\n",
"\n",
"The domain-specific metadata (DSMD) section varies between communities, please see what metadata terms are available for your community, either by exporting an existing record uploaded to the community and viewing the DSMD list, or contact your community directly for this list."
]
},
{
"cell_type": "markdown",
"id": "90a1ec13",
"metadata": {},
"source": [
"Once your metadata file is filled in, you can validate it via:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0184ce6",
"metadata": {
"vscode": {
"languageId": "powershell"
}
},
"outputs": [],
"source": [
"! data_collections validate record.yaml"
]
},
{
"cell_type": "markdown",
"id": "b97ca429",
"metadata": {},
"source": [
"Once your metadata is validated, you can submit your data for review by setting the variables below and using the `data_collections upload` command."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0abc5c70",
"metadata": {
"vscode": {
"languageId": "powershell"
}
},
"outputs": [],
"source": [
"REPOSITORY_URL=\"https://data-collections.psdi.ac.uk/api\" # URL for data-collections API\n",
"TOKEN=\"XXX\" # token generated in previous steps\n",
"METADATA_PATH=\"record.yaml\" # path to your metadata file\n",
"DATA_PATH=\"my_data/*\" # path to your data\n",
"COMMUNITY=\"biosimdb\" # set applicable community name"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "603f3010",
"metadata": {
"vscode": {
"languageId": "powershell"
}
},
"outputs": [],
"source": [
"! data_collections upload --api-url {REPOSITORY_URL} --api-key {TOKEN} --metadata-path {METADATA_PATH} --files {DATA_PATH}\n",
"--community {COMMUNITY}"
]
},
{
"cell_type": "markdown",
"id": "0df7c072",
"metadata": {},
"source": [
"Once your record is submitted for review, you will be able to see the status of the record as a request on your dashboard in [data collections](https://data-collections.psdi.ac.uk).\n",
"\n",
"![screenshot04](images/screenshot04.png)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "data-collections-API",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
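
For readers who would rather drive the upload from a plain Python script than from notebook `!` cells, here is a minimal sketch that assembles the same command line as an argument list. The flags are the ones used in the notebook; `build_upload_command` and the example file names are hypothetical, and passing several paths after `--files` is an assumption based on the notebook's `my_data/*` glob.

```python
import shlex


def build_upload_command(api_url, token, metadata_path, files, community):
    """Assemble the argument list for ``data_collections upload``."""
    if not files:
        raise ValueError("at least one data file is required")
    return [
        "data_collections", "upload",
        "--api-url", api_url,
        "--api-key", token,
        "--metadata-path", metadata_path,
        "--files", *files,  # assumption: the flag accepts several paths
        "--community", community,
    ]


command = build_upload_command(
    api_url="https://data-collections.psdi.ac.uk/api",
    token="XXX",  # placeholder, as in the notebook
    metadata_path="record.yaml",
    files=["my_data/a.dat", "my_data/b.dat"],  # hypothetical file names
    community="biosimdb",
)
# Print the shell-quoted command for inspection instead of running it.
print(shlex.join(command))
```

To run it for real, the list can be passed to `subprocess.run(command, check=True)`. Reading the token from an environment variable (the variable name is up to you) keeps it out of the notebook and out of version control.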
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -15,5 +15,6 @@ Project to allow simplified editing and construction of Invenio data for the PSD

cli
schema
deposition_tutorial/index
schemas/index
API Documentation <api/modules>
4 changes: 1 addition & 3 deletions docs/source/scripts/schema_gen.py
@@ -29,9 +29,7 @@
:maxdepth: 1
:caption: Schemas:

{schemas}

"""
{schemas}"""


def get_arg_parser() -> argparse.ArgumentParser:
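
The template change above moves the closing `"""` directly after `{schemas}`. A small sketch of the effect, using a hypothetical stand-in for the template (`INDEX_TEMPLATE`, the entry names, and the three-space indent are all assumptions, since the scraped diff has lost the original names and whitespace):

```python
# Hypothetical stand-in for the toctree template in schema_gen.py.
INDEX_TEMPLATE = """\
.. toctree::
   :maxdepth: 1
   :caption: Schemas:

{schemas}"""

entries = ["base", "community"]  # hypothetical schema page names
rendered = INDEX_TEMPLATE.format(
    schemas="\n".join(f"   {name}" for name in entries)
)

# The rendered text now ends with the last toctree entry, with no blank lines after it.
print(repr(rendered[-12:]))
```

Any trailing newline in the generated file would then have to come from whatever code writes it out, not from the template itself.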
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -41,8 +41,8 @@ docs = [
"sphinxcontrib-contentui<1.0.0,>=0.2.5",
"furo==2025.9.25",
"numpydoc>=1.9.0",
"myst-parser",
"jsonschema-markdown",
"myst-nb",
]
lint = ["pre-commit<5.0.0,>=4.2.0", "ruff==0.13.3", "numpydoc>=0.19.0"]
test = ["pytest==8.3.4", "pytest-cov==5.0.0"]