Support pathing semantics #82

Closed
hannahhoward opened this issue Feb 9, 2023 · 3 comments · Fixed by #119

@hannahhoward
Collaborator

What

Lassie should support the following semantics for paths, which match the gateway semantics:

lassie fetch CID/path/to/file

This should return a CAR containing the blocks for CID, the intermediate CIDs between CID and the root CID of file, and all of the CIDs that make up file.

lassie fetch CID/path/to/subdir

This should return a CAR containing the blocks for CID, the intermediate CIDs between CID and the root CID of subdir, and all of the CIDs that make up the directory tree for subdir, but NOT any files or directories associated with subdir. Ultimately this means just a single block unless the directory is a HAMT.

lassie fetch CID/path/to/subdir?all

This should return a CAR containing the blocks for CID, the intermediate CIDs between CID and the root CID of subdir, and all of the CIDs that make up the directory tree for subdir, AND any files or directories associated with subdir, AND any additional subdirectories, recursively.
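To make the three request shapes concrete, here is a minimal sketch of splitting a fetch argument into a root CID, path segments, and the all flag. The `FetchTarget` type and `ParseFetchTarget` function are hypothetical names invented for illustration; they are not part of lassie, and the root CID is treated as an opaque string rather than being validated.

```go
package main

import (
	"fmt"
	"strings"
)

// FetchTarget is a hypothetical type for this sketch, not a lassie type.
type FetchTarget struct {
	Root     string   // root CID, kept as an opaque string (no CID validation here)
	Segments []string // path segments below the root; empty when there is no path
	All      bool     // true when the "all" query parameter is present
}

// ParseFetchTarget splits an argument of the form "CID/path/to/thing?all".
func ParseFetchTarget(arg string) (FetchTarget, error) {
	// Separate the path part from the optional query string.
	spec, query, _ := strings.Cut(arg, "?")

	var target FetchTarget
	for _, param := range strings.Split(query, "&") {
		if param == "all" {
			target.All = true
		}
	}

	parts := strings.Split(spec, "/")
	if parts[0] == "" {
		return FetchTarget{}, fmt.Errorf("missing root CID in %q", arg)
	}
	target.Root = parts[0]

	// Skip empty segments so a trailing or doubled slash is tolerated.
	for _, seg := range parts[1:] {
		if seg != "" {
			target.Segments = append(target.Segments, seg)
		}
	}
	return target, nil
}

func main() {
	args := []string{
		"bafyExampleRoot/path/to/file",
		"bafyExampleRoot/path/to/subdir?all",
	}
	for _, arg := range args {
		target, err := ParseFetchTarget(arg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("root=%s segments=%v all=%v\n", target.Root, target.Segments, target.All)
	}
}
```

Whether the terminus is a file or a directory cannot be known from the argument alone; that is only discovered during traversal, which is why the sketch records nothing more than the path and the flag.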

Selector conversion

I believe there is a relatively straightforward conversion between this structure and selectors:

For each segment of the path, I think we need an ExploreInterpretAs with KnownADL "unixfs" followed by an ExploreFields.

At the end of the path, I believe we want an ExploreInterpretAs "unixfs-preload" followed by a "Matcher" selector, except in the case of the all query parameter, where I believe we can simply use an ExploreRecursiveAll selector.
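The conversion described above can be sketched as a flat list of selector layers, outermost first. This is a plain-data model only: `Step`, `SelectorPlan`, and `Describe` are hypothetical names, and the code deliberately does not call go-ipld-prime. The selector kind names are copied from the description above rather than from any library API.

```go
package main

import (
	"fmt"
	"strings"
)

// Step is a hypothetical description of one selector layer.
// It is not a go-ipld-prime type.
type Step struct {
	Kind string // selector kind as named in the issue text
	Arg  string // ADL name or field name; empty when the kind takes none
}

// SelectorPlan lists the selector layers for a path, outermost first.
func SelectorPlan(segments []string, all bool) []Step {
	var plan []Step
	// Each path segment: reify the node as UnixFS, then step into the named field.
	for _, seg := range segments {
		plan = append(plan,
			Step{Kind: "ExploreInterpretAs", Arg: "unixfs"},
			Step{Kind: "ExploreFields", Arg: seg},
		)
	}
	if all {
		// With the all query parameter, explore everything below the terminus.
		return append(plan, Step{Kind: "ExploreRecursiveAll"})
	}
	// Otherwise preload the terminus as UnixFS and match it.
	return append(plan,
		Step{Kind: "ExploreInterpretAs", Arg: "unixfs-preload"},
		Step{Kind: "Matcher"},
	)
}

// Describe renders a plan as a single readable line.
func Describe(plan []Step) string {
	parts := make([]string, 0, len(plan))
	for _, s := range plan {
		if s.Arg == "" {
			parts = append(parts, s.Kind)
		} else {
			parts = append(parts, s.Kind+"("+s.Arg+")")
		}
	}
	return strings.Join(parts, " -> ")
}

func main() {
	fmt.Println(Describe(SelectorPlan([]string{"path", "to", "subdir"}, false)))
	fmt.Println(Describe(SelectorPlan([]string{"path", "to", "subdir"}, true)))
}
```

A real implementation would nest these layers using the selector builder rather than listing them, but the ordering of layers is the part this sketch is meant to show.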

Parameter passing

We will need to add either a path and query string or a selector to the RetrievalRequest type. It's implementer's choice and I'm not sure I have an opinion myself, though early conversion to a selector locks us into selector traversal for bitswap (see https://github.com/ipfs/go-fetcher for what this ends up looking like).

@willscott
Contributor

rvagg added a commit that referenced this issue Feb 24, 2023
rvagg added a commit that referenced this issue Feb 24, 2023
@rvagg rvagg mentioned this issue Feb 24, 2023
@rvagg
Member

rvagg commented Feb 24, 2023

Mostly being done in #119

Current team understanding/agreement of what we're aiming for (initially at least) for Rhea:

  1. All requests are CID + path, where path may be empty.
  2. The path is interpreted as UnixFS where possible (plain IPLD where not)
  3. All blocks from the CID to the path terminus are included in the returned data
  4. Requests may either be “full” or “shallow”
    1. A “full” request will attempt to fetch and return all blocks that make up the DAG that exists below the path terminus
    2. A “shallow” request will attempt to fetch and return all blocks that make up the DAG that represents whatever the path terminates on:
      1. In the case of a UnixFS file, all blocks that make up the file
      2. In the case of a UnixFS directory, all blocks that make up just the directory—if it’s a HAMT, then all blocks of the HAMT, but no more.
      3. In the case of non-UnixFS data, just the terminus block.
    3. Exactly how “full” or “shallow” is passed to lassie is TBD; a “depth=int” query parameter doesn’t seem to make sense. Candidates:
      1. ?full + none for shallow
      2. ?depthType=full + ?depthType=shallow (or none, shallow default?)
      3. ?complete=full + ?complete=shallow (or none, shallow default?)
  5. Range requests are an upstream concern—it is assumed the entire resource will be fetched and just the range requested is served back to the user.
  6. HEAD requests are just a special case of range requests that return the first 1024 bytes of the UnixFS data, and are therefore an upstream concern.
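As a rough illustration of items 3 and 4 in the list above, the following sketch maps a request scope and the kind of node the path terminates on to the groups of blocks that should come back. `Scope`, `Terminus`, and `IncludedBlocks` are hypothetical names for this sketch only, and the returned strings are descriptions rather than real block sets. How the scope reaches lassie is still TBD per the list, so no query parameter is parsed here.

```go
package main

import "fmt"

// Scope is a hypothetical enum for the "full" vs "shallow" distinction.
type Scope int

const (
	Shallow Scope = iota
	Full
)

// Terminus is a hypothetical enum for what the path ends on.
type Terminus int

const (
	UnixFSFile Terminus = iota
	UnixFSDirectory
	PlainIPLD
)

// IncludedBlocks describes which groups of blocks a response should contain.
func IncludedBlocks(scope Scope, terminus Terminus) []string {
	// Blocks along the path are always included, whatever the scope.
	included := []string{"all blocks from the root CID to the path terminus"}

	if scope == Full {
		return append(included, "the entire DAG below the path terminus")
	}

	switch terminus {
	case UnixFSFile:
		included = append(included, "all blocks that make up the file")
	case UnixFSDirectory:
		included = append(included, "the directory blocks only (every HAMT block if sharded)")
	default:
		included = append(included, "the terminus block only")
	}
	return included
}

func main() {
	for _, line := range IncludedBlocks(Shallow, UnixFSDirectory) {
		fmt.Println(line)
	}
}
```

Range and HEAD handling are left out on purpose, since items 5 and 6 treat them as upstream concerns.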

@willscott
Contributor

rvagg added a commit that referenced this issue Feb 27, 2023
rvagg added a commit that referenced this issue Feb 28, 2023
rvagg added a commit that referenced this issue Feb 28, 2023

3 participants