Need for IPIP: Data Onboarding via HTTP POST (and future ipfs:// POST|PUT)

## Problem statement

HTTP Gateways are the most successful way to retrieve content-addressed data. The successful use of HTTP for retrieval proves that IPFS does not replace HTTP, but augments it by providing verifiability and resiliency. IPFS over HTTP brings more value than the sum of its parts.

Removing the need for implementation-specific RPC APIs (like the one in Kubo) not only allowed faster adoption of CIDs on the web, but also enabled alternative implementations of IPFS (like Iroh in Rust) to test compliance and benchmark themselves against each other.

While HTTP Gateways are the standard HTTP-based answer to retrieving data stored with IPFS (including verifiable [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) and [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) responses), data onboarding over HTTP is currently done with vendor-specific APIs.

As of 2023 Q1, the status quo is pretty bad from the end user's and developer’s perspective: every IPFS implementation, including online services that provide storage and pinning, exposes its own opinionated HTTP API for onboarding data to IPFS.

## Why we need IPIP for HTTP Data Onboarding

To illustrate, some prominent examples (2022 Q4):

<details>
  <summary>Click to expand :see_no_evil: </summary>

- Implementations
    - Kubo RPC (AKA legacy /api/v0/..)
        - Is often used as a “standard HTTP API upload template” because it has commands for all onboarding needs:
            - [https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-add](https://web.archive.org/web/20221201011916/https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-add) – files and directories
                - FLAG: it uses custom form-data handling that requires a special library for directory uploads, which is an awful papercut for someone expecting a simple upload with “curl” ([http://web.archive.org/web/20221201011916/https://docs.ipfs.tech/reference/kubo/rpc/#request-body](http://web.archive.org/web/20221201011916/https://docs.ipfs.tech/reference/kubo/rpc/#request-body))
                - FLAG: Kubo RPC was never designed to be used in a browser context, and there are known bugs around the way it handles uploads (example: [https://github.com/ipfs/kubo/issues/5168](https://github.com/ipfs/kubo/issues/5168))
            - [https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-block-put](https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-block-put) – raw block
            - [https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-dag-put](https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-dag-put) – JSON-like documents and custom DAGs (DAG-JSON and DAG-CBOR)
            - [https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-dag-import](https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-dag-import) – arbitrary bags of blocks in CAR format
    - JS-IPFS
        - Reimplements most of the Kubo RPC and exposes it over HTTP, but diverged a long time ago and is no longer 1:1
        - FLAG: In addition to HTTP, JS-IPFS exposes selected commands over gRPC-over-WebSockets to work around browser issues caused by Kubo RPC ([https://web.archive.org/web/20220528152743/https://github.com/ipfs/js-ipfs/tree/master/packages/ipfs-grpc-server#why](https://web.archive.org/web/20220528152743/https://github.com/ipfs/js-ipfs/tree/master/packages/ipfs-grpc-server#why))
    - IPFS Cluster
        - Acts as a reverse proxy for Kubo RPC, but also has its own commands and adds special behavior on top of what Kubo RPC does:
            - [https://web.archive.org/web/20220911053755/https://ipfscluster.io/documentation/reference/api/](http://web.archive.org/web/20220911053755/https://ipfscluster.io/documentation/reference/api/) – the `/add` endpoint uses UnixFS by default, but also accepts CARs when the HTTP POST request is made with `?format=car`, and it only accepts CARs with a single root.
- Online services
    - Pinata
        - [https://web.archive.org/web/20220930091452/https://docs.pinata.cloud/pinata-api/pinning/pin-file-or-directory](https://web.archive.org/web/20220930091452/https://docs.pinata.cloud/pinata-api/pinning/pin-file-or-directory) – onboarding file or directory
        - [https://web.archive.org/web/20220817122725/https://docs.pinata.cloud/pinata-api/pinning/pin-json](https://web.archive.org/web/20220817122725/https://docs.pinata.cloud/pinata-api/pinning/pin-json) – onboarding JSON document
    - web3storage
        - [http://web.archive.org/web/20220914153854/https://web3.storage/docs/reference/http-api/](http://web.archive.org/web/20220914153854/https://web3.storage/docs/reference/http-api/) – file and CAR uploads
        - note: no block API (impossible to import DAG-CBOR without the overhead of single-block-CAR for every CID)
    - Infura
        - [http://web.archive.org/web/20220429202905/https://docs.infura.io/infura/networks/ipfs/http-api-methods/add](http://web.archive.org/web/20220429202905/https://docs.infura.io/infura/networks/ipfs/http-api-methods/add) – file and directory import API that is a carbon copy of Kubo’s internal RPC API
        - [http://web.archive.org/web/20220429203039/https://docs.infura.io/infura/networks/ipfs/http-api-methods/block_put](http://web.archive.org/web/20220429203039/https://docs.infura.io/infura/networks/ipfs/http-api-methods/block_put) – raw block import that is a carbon copy of Kubo’s internal RPC API
        - note: no CAR import
    - TODO: source more examples

</details>

And the CAR upload API insanity circa 2024 Q1:

- https://discuss.ipfs.tech/t/uploading-cars-and-user-generated-cids/17592

This state of things introduces an artificial barrier to adoption: the user needs to learn what APIs are available, and then “pick winners” – decide which implementations and services are the most future-proof. And even then, many choices are burdened by the legacy of Kubo RPC and its degraded performance and DX/UX in web browsers.
 
## Goal: create data onboarding protocol for both HTTP and native IPFS

The intention here is to create an IPIP with a vendor-agnostic protocol for onboarding data that:
- is easy to use and implement in HTTP (`POST https://`)
    - does not require any libraries or documentation,
    - and is as easy to work with from JS with the `fetch` API as it is on the command line with `curl`
- follows the retrieval story, where `ipfs://` behavior is analogous to subdomain gateways
  - :point_right: what we want is a protocol that can be represented as both `POST https://` AND `POST ipfs://` APIs
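
A minimal sketch of what "no libraries needed" could look like from the client side: the payload is the request body, and a single `Content-Type` header describes it. The endpoint URL and the use of `application/octet-stream` are assumptions made for illustration only; no endpoint layout or required headers have been defined yet. The request is only constructed here, not sent.

```python
from urllib.request import Request

# Hypothetical endpoint: no URL layout has been specified for onboarding.
ONBOARDING_URL = "https://onboarding.example.net/"


def build_onboarding_request(payload: bytes, content_type: str,
                             url: str = ONBOARDING_URL) -> Request:
    """Build (but do not send) a plain HTTP POST with the payload as the body."""
    return Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": content_type},
    )


req = build_onboarding_request(b"hello ipfs", "application/octet-stream")
# urllib normalizes header names, hence the "Content-type" spelling below.
print(req.get_method(), req.full_url, req.get_header("Content-type"))

# Roughly the same request with curl (same hypothetical URL):
#   curl -X POST --data-binary @hello.txt \
#        -H "Content-Type: application/octet-stream" https://onboarding.example.net/
```

The point of the sketch is that nothing beyond a body and one header is involved, which is the bar the `fetch` and `curl` requirement sets.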

## IPIP scope

We want two IPIPs: one for onboarding data with HTTP POST, and one for authoring (modifying/patching) it with HTTP PUT.
This lets us ship the most useful part, onboarding, first, and then add authoring as an optional extension that services may support but don't have to (for example, if they only onboard data to Filecoin).

For now, the focus is on POST.

### POST Requests (Onboarding)


> 👉 This is the minimal scope we need to cover from day one, ensuring every use case has a vendor-agnostic spec.



- **Delegated**
    - Single file (UnixFS) or single (DAG-)CBOR/JSON document
    - Arbitrary directory tree (UnixFS)
        - Option A: TAR stream
            - open question: how does this handle an interrupted upload? Can the server tell that some data is missing?
        - Option B: custom form-data? (think twice, we have lessons learned around RPC at `/api/v0/add` in Kubo)
- **Native**
    - Raw block
    - CAR stream
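
On the open question under Option A: the tar format itself gives a receiver one way to notice a cut-off stream, because a well-formed archive ends with two consecutive zero-filled 512-byte blocks. The sketch below walks the header blocks of an in-memory archive and reports whether that end-of-archive marker was reached. It is an illustration rather than proposed server behavior; it assumes plain ustar headers with octal size fields, and the helper names are hypothetical.

```python
import io
import tarfile

BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def make_tar(files: dict) -> bytes:
    """Build a small ustar archive in memory from {name: content}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def tar_is_complete(data: bytes) -> bool:
    """Return True only if the end-of-archive marker is reached before data runs out."""
    zero = bytes(BLOCK)
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if header == zero:
            # The marker is two zero blocks in a row.
            return data[offset + BLOCK:offset + 2 * BLOCK] == zero
        # The size field is 12 bytes at offset 124, stored as octal text (assumption: ustar).
        size = int(header[124:136].rstrip(b"\0 ") or b"0", 8)
        # Member data is padded up to the next multiple of 512 bytes.
        offset += BLOCK + ((size + BLOCK - 1) // BLOCK) * BLOCK
    return False


archive = make_tar({"dir/a.txt": b"hello", "dir/b.txt": b"world"})
complete = tar_is_complete(archive)
interrupted = tar_is_complete(archive[:700])    # cut off inside the first member
cut_at_member = tar_is_complete(archive[:1024])  # cut off exactly between members
print(complete, interrupted, cut_at_member)
```

A cut that lands exactly on a member boundary is the interesting case: the bytes received so far are valid, and only the missing marker reveals that the upload did not finish.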

The working code for this will be a reference implementation that replaces or updates the legacy [`Gateway.Writable` feature in Kubo](https://github.com/ipfs/kubo/blob/master/docs/config.md#gatewaywritable) with the above feature set.
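
One practical difference between the Delegated and Native groups above can be sketched under the assumption that "Native" means the client sends bytes that are already content-addressed: for a raw block, the client can derive the CID before uploading and compare it with whatever the server reports. The helper below builds a CIDv1 with the `raw` codec (0x55) and a sha2-256 multihash (0x12), encoded as lowercase base32 with the `b` multibase prefix. The function name is hypothetical, and how a server would return the CID is not defined here.

```python
import base64
import hashlib


def raw_block_cid(block: bytes) -> str:
    """Return the CIDv1 (raw codec, sha2-256) of a block as a base32 string."""
    digest = hashlib.sha256(block).digest()
    # Layout: <version 0x01><codec raw 0x55><hash sha2-256 0x12><digest length 0x20><digest>
    cid_bytes = bytes([0x01, 0x55, 0x12, len(digest)]) + digest
    # Multibase "b" is RFC 4648 base32, lowercase, without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


cid = raw_block_cid(b"hello ipfs")
print(cid)
```

With a Delegated upload of a file or directory, by contrast, the resulting CID depends on how the server chunks and lays out the DAG, so the client generally learns it only from the response.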

### PUT/PATCH/DELETE Requests (Authoring)

This will be a separate IPIP, but it is flagged here as a long-term plan that should feel idiomatic too.

- TBD: **Delegated** vs **Native**
- Critical: ensure no surprises, UX/DX is paramount. Needs research and analysis.
  - One idea is to keep it limited to patching UnixFS paths and DAG-JSON/CBOR documents.
  - Another idea is to have syntax parity with the JSON-based [IPLD Patch](https://ipld.io/specs/patch/) and use the same JSON syntax as the [`dag diff`](https://github.com/ipfs/kubo/issues/4801) and [`dag patch`](https://github.com/ipfs/kubo/issues/4782) commands.

## References

- Revisit the [concept of Writable Gateways](https://discuss.ipfs.io/t/writeable-http-gateways/210?u=lidel)
- https://github.com/ipfs/go-ipfs/blob/master/docs/config.md#gatewaywritable
- https://discuss.ipfs.io/t/writeable-http-gateways/210
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Location#pointing_to_a_new_document_http_201_created
- WIP private IPIP draft: https://www.notion.so/protocollabs/wip-IPIP-Data-Onboarding-with-HTTP-POST-4c394b8ebb774f2d87d34466019257fc
- Alex prototyped some REST APIs in https://github.com/ipfs/specs/pull/224/files (while this was intended to be an update to Kubo RPC, the document includes some ideas around patching files and directories)
- https://docs.api.video/vod/delegated-upload-tokens as prior art where [opaque token can be used with standard tools like curl](https://docs.api.video/vod/delegated-upload-tokens#upload-a-video-with-delegated-tokens)

