Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPIP-378: Delegated Routing HTTP POST API #378

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open

Conversation

masih
Copy link
Member

@masih masih commented Feb 13, 2023

Previous work reduced the scope of IPIP-337 to read operations only. The changes here re-introduce the write operations originally written by @guseggert back into IPIP-337.

The specification documents the ability to provide Bitswap records over PUT POST requests with advisory TTL. An implementation of the specification is already present in go-libipfs boxo.

See:

Previous work reduced the scope of IPIP-337 to read operations only. The
changes here re-introduce the write operations originally written by
@guseggert back into IPIP-337.

The specification documents the ability to provide Bitswap records over
`PUT` requests with advisory TTL. An implementation of the specification
is already present in go-libipfs.

See:
 - #370
@lidel lidel changed the title Reintroduce provide records via PUT to HTTP delegated routing IPIP-378: Delegated Content Routing PUTs over HTTP API Feb 13, 2023
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick first pass with some questions.

Also, please add quick memo to IPIP/ directory based on https://github.com/ipfs/specs/blob/main/IPIP/0000-template.md
It creates space for us to discuss Compatibility with existing nodes (which already create provider records and already sign them for DHT).

routing/DELEGATED_CONTENT_ROUTING_HTTP.md Outdated Show resolved Hide resolved
routing/DELEGATED_CONTENT_ROUTING_HTTP.md Outdated Show resolved Hide resolved
Comment on lines 228 to 230
Note that this only supports Peer IDs expressed as identity multihashes. Peer IDs with older key types that exceed 42 bytes are not verifiable since they only contain a hash of the key, not the key itself.
Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block via IPFS).
If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why we don't include PublicKey field from the start?
If we make it an optional, opaque bytes field, and say it should be deserialized as the protobuf from libp2p peerid specs, we have both solved RSA problem, and don't need to maintain PublicKeyType registry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could include PublicKey in the peer schema on out-of-band GET to /routing/v1/peers/{peerid} (IPIP-417).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with either of these paths.
@masih are you okay with this suggestion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me 👍

Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block via IPFS).
If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme.

The `Payload` field is a string, not a proper JSON object, to prevent its contents from being accidentally parsed and re-encoded by intermediaries, which may change the order of JSON fields and thus cause the record to fail validation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have to mangle Payload JSON anyway, what is the benefit of introducing it as a new struct?
It was probably discussed before elsewhere, but why can't we use exactly the same protobuf and signature DHT uses?

Reusing DHT record payload would save CPU time on IPFS nodes that publish to both DHT and IPNI (signature would be generated only once, not twice).

Copy link
Member Author

@masih masih Feb 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the benefit of introducing it as a new struct?

I suspect the reason for this is better human readability? The beginning of this document (merged as part of IPIP-337 states "human-readable encodings of types are preferred".

save CPU time on IPFS nodes

We can make the same argument about HTTP delegated routing itself. Unless there is evidence to suggest that we should be heavily optimising this, my vote would be to favour human readability.

@guseggert what are your thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reusing DHT record payload

@lidel can I ask where I can find the DHT record payload specification? I think we can add this to the spec via a specific request Content-Type media type, similar to #379.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guillaumemichel this is related to #345 – can you point us where up-to-date protobufs related to CID announcements on DHT lives these days? Unsure how feasible it is to reuse the announcement's wire format here.

Could we write down and publish the basic wire format information like protobuf section about ipns record, so this IPIP can refer to spec website, and not some go code 🙏


@masih I think Content-Type reuse similar to what we did #379 would help in some cases, but not a silver bullet here. There is way way more CIDs than IPNS records, so some way of doing bulk PUT is still desired, but it should not be tied to a single protocol.

When IPFS node speaks more than bitswap (we already plan to do HTTP next to bitswap in many places), doing a PUT of the same CIDs once for every protocol duplicates the cost on the client.

I don't think there is anything special about Schema: bitswap-announce, having dedicated PUT for this one protocol feels odd, PUT for HTTP will look exactly the same.

Maybe instead of having Schema: bitswap-announce we should have Schema: provider-announce with Protocols list, which could then have both bitswap and HTTP?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having dedicated PUT for this one protocol feels odd, PUT for HTTP will look exactly the same

I agree 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After IPIP-417 we are in a better place now.

I think we should avoid protocol-specific schema for bitswap and use peer schema from IPIP-417. It allows for passing optional protocol-specific metadata for things other than bitswap, allowing client to do single PUT with all info.

@@ -118,7 +160,7 @@ limits, allowing every site to query the API for results:

```plaintext
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, OPTIONS
Access-Control-Allow-Methods: GET, PUT, OPTIONS
```

## Known Transfer Protocols
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 this section grows way faster than the rest of the spec.

Would it be ok to move "Known Transfer Protocols" to a separate file (KNOWN_TRANSFER_PROTOCOLS.md)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend doing this in a separate PR to reduce lines changes what are irrelevant to what this PR is trying to introduce.

If this is not a blocker for this PR, I'd be happy to take this as a TODO and get it done once this PR is merged.

This is to facilitate a place for discussing compatibility with the
existing nodes.
@masih masih requested a review from a team as a code owner February 14, 2023 11:15
@masih
Copy link
Member Author

masih commented Feb 14, 2023

please add quick memo to IPIP/

Thanks @lidel for the review. I have added a .md file under /IPIP; is that what you had in mind?


- `Signature`: a multibase-encoded signature of the sha256 hash of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload` JSON. Signing details for specific key types should follow [libp2p/peerid specs](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#key-types), unless stated otherwise.

- Servers may ignore this field if they do not require signature verification.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@masih is cid.contact doing verification of this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cid.contact does not support any PUT over http delegated routing.

Comment on lines 197 to 199
"Schema": "bitswap",
"Signature": "<signature>",
"Payload": "<payload>"
Copy link
Member

@lidel lidel May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 this creates one bitswap schema for GET and the other bitswap for PUT. Both under the same name, but with different fields. Bit nasty and error prone.

@masih is this ossified, or is there still time to fix this and use different schema for PUTs?

Suggested change
"Schema": "bitswap",
"Signature": "<signature>",
"Payload": "<payload>"
"Schema": "bitswap-announce",
"Signature": "<signature>",
"Payload": "<payload>"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still time. Changing thanks


```json
{
"Keys": ["cid1", "cid2"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@masih what is the max length of this list?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking is to make the maximum length be derived from whatever the server supports as max request body. A user should be able to find this out via OPTIONS request usually?

If that makes sense I am happy to document this in the spec. Unless you think we need some explicit max length for that array?

Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@masih I'm picking up the work here, thank you for your patience, we had some resourcing issues.

  • This PR needs to be rebased on top of ./src/routing/http-routing-v1.md
  • Updated to the latest IPIP template and moved to ./src/ipips/

but that can be done later.

For now, quick questions below 🙏

```

- `Keys` is a list of the CIDs being provided
- `Timestamp` is the current time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the format? Unix epoch? What resolution (ms? ns?)

Would ASCII string that follows notation from [rfc3339] (1970-01-01T00:00:00.000000001Z) be less ambiguous?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on following standardized format that is easily human-parseable and doesn't have ambiguity.


- `Keys` is a list of the CIDs being provided
- `Timestamp` is the current time
- `AdvisoryTTL` is the time by which the caller expects the server to keep the record available
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a timestamp of a duration? If a duration, is it miliseconds?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a duration, maybe get that into the name.
Also embed the units in the parameter name so there is no ambiguity?

- `Timestamp` is the current time
- `AdvisoryTTL` is the time by which the caller expects the server to keep the record available
- If this value is unknown, the caller may use a value of 0
- `ID` is the peer ID that was used to sign the record
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 there are 3 peerid types, two of them are legacy (see the end of this spec).

I'd like to avoid legacy from /routing/v1 and use CIDv1 with libp2p-key codec everywhere. @masih any concerns around this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No concerns 👍

Comment on lines 228 to 230
Note that this only supports Peer IDs expressed as identity multihashes. Peer IDs with older key types that exceed 42 bytes are not verifiable since they only contain a hash of the key, not the key itself.
Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block via IPFS).
If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could include PublicKey in the peer schema on out-of-band GET to /routing/v1/peers/{peerid} (IPIP-417).

Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block via IPFS).
If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme.

The `Payload` field is a string, not a proper JSON object, to prevent its contents from being accidentally parsed and re-encoded by intermediaries, which may change the order of JSON fields and thus cause the record to fail validation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guillaumemichel this is related to #345 – can you point us where up-to-date protobufs related to CID announcements on DHT lives these days? Unsure how feasible it is to reuse the announcement's wire format here.

Could we write down and publish the basic wire format information like protobuf section about ipns record, so this IPIP can refer to spec website, and not some go code 🙏


@masih I think Content-Type reuse similar to what we did #379 would help in some cases, but not a silver bullet here. There is way way more CIDs than IPNS records, so some way of doing bulk PUT is still desired, but it should not be tied to a single protocol.

When IPFS node speaks more than bitswap (we already plan to do HTTP next to bitswap in many places), doing a PUT of the same CIDs once for every protocol duplicates the cost on the client.

I don't think there is anything special about Schema: bitswap-announce, having dedicated PUT for this one protocol feels odd, PUT for HTTP will look exactly the same.

Maybe instead of having Schema: bitswap-announce we should have Schema: provider-announce with Protocols list, which could then have both bitswap and HTTP?

Copy link
Member Author

@masih masih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel this PR has aged long enough that it seems it needs to be rewritten, specially to accommodate transport agnostic publication of CIDs. There is the missing DHT record specs issue but i don't think that's a blocker for now while we iterate over this?

If you agree I revise this PR entirely, considering the original was extracted from IPIP-337 written by Gus.

Thoughts?


```json
{
"Keys": ["cid1", "cid2"],
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking is to make the maximum length be derived from whatever the server supports as max request body. A user should be able to find this out via OPTIONS request usually?

If that makes sense I am happy to document this in the spec. Unless you think we need some explicit max length for that array?

Normally, if the Peer ID contains only a hash of the key, then the key is obtained out-of-band (e.g. by fetching the block via IPFS).
If support for these Peer IDs is needed in the future, this spec can be updated to allow the client to provide the key and key type out-of-band by adding optional `PublicKey` and `PublicKeyType` fields, and if the Peer ID is a CID, then the server can verify the public key's authenticity against the CID, and then proceed with the rest of the verification scheme.

The `Payload` field is a string, not a proper JSON object, to prevent its contents from being accidentally parsed and re-encoded by intermediaries, which may change the order of JSON fields and thus cause the record to fail validation.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having dedicated PUT for this one protocol feels odd, PUT for HTTP will look exactly the same

I agree 👍

@lidel
Copy link
Member

lidel commented Jun 13, 2023

@masih yes, fine to revise this 🙏

Either same PR, or close this one and open a new one to start with clean slate if that's easier
(changes against https://github.com/ipfs/specs/blob/main/src/routing/http-routing-v1.md + https://github.com/ipfs/specs/raw/main/ipip-template.md copied to https://github.com/ipfs/specs/tree/main/src/ipips)

@lidel
Copy link
Member

lidel commented Oct 5, 2023

@masih just checking-in, how does the future of this IPIP look like? (asking in context of ipni/index-provider#403 (comment))
Something you or someone from IPNI is willing to DRI, or is it ok to park till 2024 or until someone from other team have bandwidth to take over?

(fwiw we want to support PUTs in /routing/v1 exposed by Kubo, but this is most likely a 2024 roadmap item)

@masih
Copy link
Member Author

masih commented Oct 6, 2023

park till 2024 or until someone from other team have bandwidth to take over

Yes please. I don't have bandwidth to work on this just now unfortunately.

Comment on lines 125 to 127
TODO: is below a sensible limit?

There SHOULD be no more than 100 `Providers` per request.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this distinction implies we won't want to use this same protocol to sync overall routing tables between nodes - if there are more than 100 known providers for data, would we expect a different protocol to be used?

Copy link
Member

@lidel lidel Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could:

src/routing/http-routing-v1.md Outdated Show resolved Hide resolved
src/routing/http-routing-v1.md Show resolved Hide resolved
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed first stab at reviving this based on rebase from #452 and recent conversations. Feedback welcome.

src/routing/http-routing-v1.md Outdated Show resolved Hide resolved
Comment on lines 414 to 426
#### Use in PUT responses

Server MAY return additional TTL information if the TTL is not provided in the request,
or if server policy is to provide TTL different than the requested one.

```json
{
"Schema": "announcement",
"Payload": {
"TTL": 17280000
}
}
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we create a separate announcement-response schema, or using announcement is ok? @hacdias what is a smaller headache for implementers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel I think it is fine to use the same one, and mention that when getting it as response, there may be an Error field too for https://github.com/ipfs/specs/pull/378/files#r1418770825.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. Maybe overthinking this IPIP, but I am trying to keep some future doors open.

This IPIP is an opportunity to create a signed routing record convention that could be reused outside of HTTP API. For example, Amino DHT could start accepting signed announcements from this spec in addition to routing puts where PeerID matched one from libp2p connection, allowing light clients to delegate DHT puts.

I think with switch to DAG-CBOR done in 05dceba it is generic enough to use it for both (a signed DAG-CBOR map similar to IPNS).

But for errors, it felt like mixing abstractions, so I've added explicit error schema in ccbc085.

Comment on lines 395 to 400
- `Signature` is a string with multibase-encoded binary signature that provides integrity and authenticity of the `Payload` field.
- Signature is created by following below steps:
1. Convert `Payload` to deterministic, ordered [DAG-JSON](https://ipld.io/specs/codecs/dag-json/spec/) map notation
2. Prefix the DAG-JSON bytes with ASCII string `PUT /routing/v1 announcement:`
3. Sign the bytes with the private key of the Peer ID specified in the `Payload.ID`.
- Signing details for specific key types should follow [libp2p/peerid specs](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#key-types), unless stated otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By normalizing Payload to DAG-JSON here, as a step in signature generation, we no longer have to have Payload sent as base64-encoded string, and can have human-readable JSON object instead.

I think this is a net positive change, as it makes debugging/tracing easier, payload is no longer obfuscated on the wire.

@willscott @masih @hacdias – any concerns with this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No concerns other than the usual pitfalls of verifying signature of JSON object, which a reference implementation should clarify.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks reasonable to me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds fine

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel what do we do with empty fields? Are they not set? Are they set as null? Are they set as empty value (depending on their value, e.g., empty array)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty fields should not be present, just to remove ambiguity on the default value.

Copy link
Member

@lidel lidel Jan 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized going with DAG-JSON introduces inconsistency across specs.
IPNS Records do the same signature dance but normalize key-value map with DAG-CBOR. If we can, we should do the same here.

Let's switch to DAG-CBOR, this way there is no ambiguity, implementers don't try to be clever and reuse JSON strings, and signature does not require JSON representation in non-JSON context.

Suggested change
- `Signature` is a string with multibase-encoded binary signature that provides integrity and authenticity of the `Payload` field.
- Signature is created by following below steps:
1. Convert `Payload` to deterministic, ordered [DAG-JSON](https://ipld.io/specs/codecs/dag-json/spec/) map notation
2. Prefix the DAG-JSON bytes with ASCII string `PUT /routing/v1 announcement:`
3. Sign the bytes with the private key of the Peer ID specified in the `Payload.ID`.
- Signing details for specific key types should follow [libp2p/peerid specs](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#key-types), unless stated otherwise.
- `Signature` is a string with multibase-encoded binary signature that provides integrity and authenticity of the `Payload` field.
- Signature is created by following below steps:
1. Convert `Payload` to deterministic, ordered [DAG-CBOR](https://ipld.io/specs/codecs/dag-cbor/spec/) map notation
- Intention here is to use similar signature normalization as with DAG-CBOR `Data` field in IPNS Records, allowing for partial code and dependency reuse.
2. Prefix the DAG-CBOR bytes with ASCII string `routing-record:` to avoid signature reuse attacks.
3. Sign the bytes with the private key of the Peer ID specified in the `Payload.ID`.
- Signing details for specific key types should follow [libp2p/peerid specs](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#key-types), unless stated otherwise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to DAG-CBOR in 05dceba

src/routing/http-routing-v1.md Show resolved Hide resolved
src/routing/http-routing-v1.md Outdated Show resolved Hide resolved
- `Payload`: is a DAG-JSON-compatible object with a subset of the below fields
- `CID` is the CID being provided.
- This field is not presend when used for `PUT /routing/v1/peers`
- `Scope` is an optional hint that provides semantic meaning about announced identifies:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field should also not be present for PUT /routing/v1/peers, right?

Copy link
Member

@lidel lidel Jan 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, do you think it will be less error-prone if we have separate schema for peers?

Copy link
Member

@hacdias hacdias Jan 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's necessary. The only potential footgun is the validation and the marshalling: what do we do with empty fields? We should talk about it. This is important for both validation and marshalling, both for requests and responses since we're using the same schema, and the payload is always encoded.

Following suggestion from
#378 (comment)

Fixed outdated links and statements.
@lidel lidel changed the title IPIP-378: Delegated Content Routing PUTs over HTTP API IPIP-378: Delegated Routing HTTP POST API Feb 5, 2024
lidel added a commit that referenced this pull request Feb 5, 2024
@lidel
Copy link
Member

lidel commented Feb 6, 2024

Pushed some editorial changes based on feedback, highlights:

  • /providers and /peers switched from PUT to POST
  • Signature generation uses DAG-CBOR instead of DAG-JSON
  • /peers envelope field renamed from Providers to Peers
  • 2MiB limit per announcement

There is wip work on reference implementation in ipfs/boxo#539

The Reframe API was superseded by `/routing/v1`.
See [IPIP-337/Backwards Compatibility](https://specs.ipfs.tech/ipips/ipip-0337/#backwards-compatibility).

#### Forwards Compatibility
Copy link
Member

@lidel lidel Feb 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

As this IPIP is right now, the routing announcement signature is used only for the remote HTTP server to decide if announcement should be accepted or not. This blocks announcing with PeerID that you don't own.

However, when user asks for providers, they won't get original payload+signature back, and won't be able to verify provider records themselves.

We don't have to solve or require it in this IPIP, but we should add section to "Forwards Compatibility" explaining if/how signature type introduced in this IPIP could be stored and returned to end clients when someoneone wants that functionality.

If we don't write it down, we will be having multiple competing conventions, because people who want end-to-end routing integrity will invent something.


Initial thought on this is that we already have a concept of opaque metadata fields for opaque protocols in Peer Schema.

If we reuse it here, this section could hint at doing something like:

{
  "Schema": "peer",
  "ID": "bafz...",
  "Addrs": ["/ip4/..."],
  "Protocols": ["transport-bitswap", "signed-routing-v1"],
  "signed-routing-v1": {
    "Payload": "[mbase64-dag-cbor-blob]", 
    "Signature": "[mbase64-blob]"
  }
}

Any concerns @masih @willscott @hacdias ?

This is just for "Forwards Compatibility" section, don't expect this to be implemented by cid.contact any time soon (since storing payload and signatyre will add cost), but if we have it, we could also implement it in someguy, allowing people end-to-end integrity when they self-host own router (e.g. in smaller private swarms).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No immediate concern on my part 👍

Copy link

@MarcoPolo MarcoPolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a look at this. Overall seems reasonable. I'd push for using JWS for authentication of the payload instead of the proposed DAG-CBOR strategy.

:::warn

Since non-streaming results have to be buffered before sending,
server SHOULD be no more than 100 `Providers` per `application/json` response.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo or awkward phrasing?

@@ -225,7 +319,7 @@ limits, allowing every site to query the API for results:

```plaintext
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, OPTIONS
Access-Control-Allow-Methods: GET, POST, PUT, OPTIONS

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is PUT still used?


- Signature is created by following below steps:
1. Convert `Payload` JSON to deterministic, ordered [DAG-CBOR](https://ipld.io/specs/codecs/dag-cbor/spec/) map notation
- Specification intention here is to use similar signature normalization as with DAG-CBOR `Data` field in IPNS Records, allowing for partial code and dependency reuse.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we expecting to interoperate with IPNS records? If not, I'm curious if we should instead use JWS on the JSON payloads instead of this. JWS is more widely used and available than our DAG-CBOR encoding. It's also a well defined standard.

This also has the benefit of simplifying the verification. Minimal work is needed to authenticate the message. Importantly you do not have to parse the message into DAG-CBOR before authenticating the message, which could be a possible attack vector.

If we want to authenticate over more than just the payload we can look at RFC 9421 for a standard on how to sign HTTP messages. But I don't think we need to.

Copy link
Contributor

@gobengo gobengo Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 RFC9421 because it is content-type agnostic. (ActivityPub/Mastodon use this)

And by use JWS do you mean this JSON serialization of it? (iiuc a JSON object with payload and signatures properties)
https://datatracker.ietf.org/doc/html/rfc7515#section-7.2.1
+0 that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever serialization makes the most sense. Why +0 to the idea?

Copy link
Contributor

@gobengo gobengo Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+0 meaning 'no objection' and '+' because I trust you have more context than I do so I'm happy to defer to what you think is best. But personally it seems like choosing RFC9421 it would provide similar functionality for JSON but also every other media type, and then maybe wouldn't need to do RFC7515 at all.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. I don't have a preference on RFC 7515 or RFC 9421. But I do have a preference for using an IETF standard here instead of the proposed DAG-CBOR solution.

Copy link
Member

@lidel lidel May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for feedback.

Additional context is that in the future, a light client may want to use /routing/v1 POST to delegate announcements of signed records to routing systems like IPNI or Amino DHT, and reuse signature (ideally, client signs once). This means we want to implement a signature type that would be easy to implement and support by Amino DHT nodes as well.

A hesitation around using something like RFC9421 is that it brings more complexity: users of IPFS stack likely already have libp2p crypto due to PeerIDs and dag-cbor due to things like IPNS.
In contrast, the RFC9421 is relatively new and requires custom implementation (not in go yet: golang/go#41046).

It feels we don't want to rush signatures until we have time to work on delegated Amino DHT, making sure we have end-to-end, real world use case for them.

My current thinking is either parking the entire thing, or reducing the scope of this IPIP, moving signatures to separate IPIP, and shipping basic POST here without signatures specified, as a DRAFT. It will still be useful, there are use cases where delegated routing is done in a controlled environment, where signatures introduce unnecessary penalty, but comes with risk of us making something that is then difficult to sign.

Thoughts?


Each object in the `Providers` list is a *write provider record* entry.

Server SHOULD accept representing writes is [Announcement Schema](#announcement-schema).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Server SHOULD accept representing writes is [Announcement Schema](#announcement-schema).
Server SHOULD accept representing writes as [Announcement Schema](#announcement-schema).

- `Timestamp` is the current time, formatted as an ASCII string that follows notation from [rfc3339](https://specs.ipfs.tech/ipns/ipns-record/#ref-rfc3339).

- `TTL` is caching and expiration hint informing the server how long to keep the record available, specified as integer in milliseconds.
- If this value is unknown, the caller may skip this field or set it to 0. The server's default will be used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this scenario, are we expecting clients to POST all the records they want reprovided after TTL expires?


- `TTL` in response is the time at which the server expects itself to drop the record
- If less than the `TTL` in the request, then the client SHOULD repeat announcement earlier, before the announcement TTL expires and is forgotten by the routing system
- If greater than the `TTL` in the request, then the server client SHOULD save resources and not repeat announcement until the announcement TTL expires and is forgotten by the routing system
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- If greater than the `TTL` in the request, then the server client SHOULD save resources and not repeat announcement until the announcement TTL expires and is forgotten by the routing system
- If greater than the `TTL` in the request, then the client SHOULD save resources and not repeat announcement until the announcement TTL expires and is forgotten by the routing system

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🏃 In Progress
Status: 🏃‍♀️ In Progress
Development

Successfully merging this pull request may close these issues.

9 participants