Getting to Filecoin Scale?

Adapted from a Discord post that I should have just put here.
Things that need to happen before we can put massive filesystems into WNFS and slap 'em on Filecoin in a friendly way:

discussed here: Compression/rabin on roadmap? Scaling? - #3 by laudiacay

- compression (probably should do this, maybe outside WNFS)
  - compression over an entire filesystem, with knowledge of old versions…? (this might be hard/long-term, and probably isn't necessary just yet)
- content-defined chunking (see the sketch after this list)
- deduplication (perhaps this should be handled outside of WNFS, I'm not sure yet)
- WNFS versioning/migration/backwards compat (seems like it's almost ready)
- splitting the PrivateForest into two PrivateForests: one for directory structure, file metadata, and symlinks, another for the actual file chunks (this one might be controversial, and I'm happy to explain my logic for why I think it's imperative)
- performance testing for various parts (we already have a harness and some test cases that will be helpful once we've integrated things; it's in the dataprep repo)
- CAR packing with intelligent block ordering to improve "cache access properties" (not needing to unseal 8 sectors in order to crawl one 5 GB directory) (currently WNFS outputs blocks in a way that prevents this; something to think about, whether it should go inside or outside WNFS)
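For anyone unfamiliar, here's roughly what content-defined chunking looks like in code. This is a minimal sketch of a FastCDC-style gear hash written for this post; it's not taken from rs-wnfs or the dataprep repo, and all the constants are made up for illustration:

```rust
// Minimal content-defined chunking sketch (FastCDC-style gear hash).
// Nothing here comes from rs-wnfs or dataprep; constants are illustrative.

const MASK: u64 = (1 << 14) - 1;    // boundary test => ~16 KiB average chunks
const MIN_CHUNK: usize = 4 * 1024;  // don't cut degenerately small chunks
const MAX_CHUNK: usize = 64 * 1024; // force a cut eventually

/// Deterministic pseudo-random table mapping each byte value to 64 bits
/// (splitmix64 mixing), so the rolling hash reacts strongly to content.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0;
    for entry in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *entry = z ^ (z >> 31);
    }
    table
}

/// Split `data` at content-defined boundaries: identical runs of bytes
/// produce identical chunks regardless of their offset in the file,
/// which is what makes deduplication across versions work.
fn chunk(data: &[u8]) -> Vec<&[u8]> {
    let gear = gear_table();
    let mut chunks = Vec::new();
    let (mut start, mut hash) = (0usize, 0u64);
    for (i, &byte) in data.iter().enumerate() {
        // The left shift ages old bytes out of the hash, so the boundary
        // decision only depends on a sliding window of recent input.
        hash = (hash << 1).wrapping_add(gear[byte as usize]);
        let len = i + 1 - start;
        if (len >= MIN_CHUNK && hash & MASK == 0) || len >= MAX_CHUNK {
            chunks.push(&data[start..=i]);
            start = i + 1;
            hash = 0;
        }
    }
    if start < data.len() {
        chunks.push(&data[start..]);
    }
    chunks
}
```

The payoff versus fixed-size chunking: insert one byte near the start of a file and fixed-size boundaries all shift, so every chunk re-uploads; with the rolling hash, boundaries resynchronize shortly after the edit, and dedup across versions keeps working.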


Interesting! I’d love to hear more on this for sure. Splitting them that way does leak certain kinds of metadata, and we’ve been thinking about actually doing the opposite: shoving even more data roots into a single private forest. Certainly not against exploring other directions though!

We should chat more in Discord. My concern is more or less: “unseal one sector and decrypt” → “it’s an IPLD link to contents in another sector” → “unseal another sector and decrypt” → repeat.

This could lead to days of non-concurrent unsealing work for certain directory structures.

I’d propose putting the metadata (directory structure + file metadata) into one private forest and the actual file contents into another, so you can request the destination PieceCID for unsealing directly.

This split makes sense in a lot of ways: keep your metadata locally + put the file chunks onto IPFS, or keep your top-level CID + keep your metadata on IPFS + keep your file chunks in Filecoin.

This is also sort of how Dropbox, iCloud, and Google Drive do it: they split the metadata (kept locally on your machine) from the file data (kept on servers), so you can navigate directly to the file you want to look at instead of making a lot of costly network requests every time you open a directory.

You could still have the chunks be in a flat namespace.

edit: not PrivateForest, I meant BlockStore here.

Hmm, it’s possible that we’re talking about different things. Do you perhaps mean maintaining a secondary index into the filesystem? Separating out the file headers will probably get you a lot less than you’d otherwise expect, unless you encrypt the entire filesystem with a single key — which may be all that you need for your use case? :thinking:

Also perhaps worth calling out: I haven’t had a chance to catch up with @matheus23 et al. since your recent call, so it’s very likely that I’m missing some context.

So I think what I mean is: abusing WNFS slightly to use one blockstore for PrivateNode::store (having this be IPFS), and another blockstore to store the contents of the PrivateForest (having this be Filecoin).

What I meant is one blockstore for the PrivateNode/PrivateFile/PrivateDirectory serialization,

and one for the content stream/serialization of the FileContent::External/PrivateForest contents, like this:
rs-wnfs/file.rs at 703d1c193e5510d14652c97567f1b2f57b878d01 · wnfs-wg/rs-wnfs · GitHub
https://github.com/wnfs-wg/rs-wnfs/blob/e38d039d3886f8590e00c7f87a530ca207f8a713/wnfs/src/private/forest.rs#L73 (spelled out to evade the rule that new users are only allowed 2 links per post)
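To make the two-blockstore idea concrete, here's a rough sketch. The `BlockStore` trait below is a simplified synchronous stand-in, not the real rs-wnfs trait (which is async and has changed between releases), and `SplitStore`/`MemoryStore` are names I just made up:

```rust
// Rough sketch only: `BlockStore` is a simplified stand-in for the real
// (async) rs-wnfs trait, and `SplitStore`/`MemoryStore` are made-up names.
// The point is the routing: metadata blocks go to one backend (IPFS-ish,
// hot), content blocks to another (Filecoin-ish, cold).

use std::collections::HashMap;

type Cid = String; // stand-in for a real CID type

trait BlockStore {
    fn put_block(&mut self, bytes: Vec<u8>) -> Cid;
    fn get_block(&self, cid: &Cid) -> Option<&[u8]>;
}

/// Which half of the filesystem a block belongs to.
enum BlockKind {
    Metadata, // PrivateNode / PrivateFile / PrivateDirectory serialization
    Content,  // FileContent::External content chunks
}

/// Routes each write to one of two backends, so walking the directory
/// tree never has to touch (or unseal) the cold content store.
struct SplitStore<M: BlockStore, C: BlockStore> {
    metadata: M,
    content: C,
}

impl<M: BlockStore, C: BlockStore> SplitStore<M, C> {
    fn put(&mut self, kind: BlockKind, bytes: Vec<u8>) -> Cid {
        match kind {
            BlockKind::Metadata => self.metadata.put_block(bytes),
            BlockKind::Content => self.content.put_block(bytes),
        }
    }
}

/// Toy in-memory backend, just so the sketch is self-contained.
struct MemoryStore(HashMap<Cid, Vec<u8>>);

impl BlockStore for MemoryStore {
    fn put_block(&mut self, bytes: Vec<u8>) -> Cid {
        let cid = format!("block-{}", self.0.len()); // real code hashes the bytes
        self.0.insert(cid.clone(), bytes);
        cid
    }
    fn get_block(&self, cid: &Cid) -> Option<&[u8]> {
        self.0.get(cid).map(|v| v.as_slice())
    }
}
```

The routing itself is trivial; the point is that a directory walk only ever hits the metadata backend, and the content backend (the Filecoin side) is only touched once you already know which PieceCID you want.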

I briefly thought there was some metadata in the PrivateForests. This may actually just be a library usage change, but still a potentially problematic one (it leaks a bit of information about which blocks are metadata/directory structure/file contents).

I’m not sure if the below is helpful or not; I’m going to tag along to the biweekly call on Friday, but also feel free to toss something in my calendar if you think it would be helpful to have a catch-up before then :slight_smile:

Yeah, it depends on a lot of factors I think. The previous version of WNFS maintained a “skeleton” index that would help you avoid round trips, but it’s not Byzantine fault tolerant, and we dropped it in the upgrade (but can always add indices later).

I guess I have a couple of questions about the Banyan intended use case (because WNFS is currently approaching a lot of use cases, this can help me narrow down where data or SDK changes could occur). I’m guessing that you don’t need concurrent writes, the ability to merge filesystems, or the ability to maintain history.

Do you need recursive encryption — i.e. give someone access to /Photos but not /Documents?

In the current data layout, the file hierarchy is kept directly in directory nodes, and temporal information (e.g. the data you need to retrieve updates) is kept separately, so that you can share a single snapshot or some range. The file hierarchy gives you nice key management properties, but it means that walking a path requires inspecting each node as you walk the graph. If I’m understanding you correctly, the concern is that this will hop across a bunch of Filecoin sectors, which is a lot of work to retrieve and unseal.

Correct me if I’m wrong, but it sounds like the concern is mainly around sector locality. If you have hierarchical access control (i.e. someone can access /Photos but not /Documents), then sharding the file system by directory (or some other depth-first means) is probably viable; a sketch of that is below. We also have a concept of symlinks, which could be a lightweight way of referencing different sectors.
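Toy code only, with no real Filecoin or WNFS APIs; the `Node` shape and the 32 GiB sector budget are just assumptions for illustration:

```rust
// Toy illustration of depth-first sector packing; `Node` is hypothetical,
// and real sectors would record CIDs rather than byte counters.

struct Node {
    size: u64,          // serialized size of this node's block(s), in bytes
    children: Vec<Node>,
}

const SECTOR: u64 = 32 * 1024 * 1024 * 1024; // 32 GiB

/// Walk the tree depth first, opening a new sector only when the current
/// one is full. Siblings and descendants stay adjacent, so crawling one
/// directory touches few sectors instead of hopping across many.
fn pack(node: &Node, sectors: &mut Vec<u64>) {
    let fits = sectors
        .last()
        .map_or(false, |used| used + node.size <= SECTOR);
    if !fits {
        sectors.push(0);
    }
    *sectors.last_mut().unwrap() += node.size;
    for child in &node.children {
        pack(child, sectors);
    }
}
```

A subtree larger than a sector still spills over, but the depth-first order keeps each directory's blocks as contiguous as the budget allows, which is the property you want for crawling one directory cheaply.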

As soon as you start sharding things by what “belongs together”, you start exposing relationships between data — but that’s sometimes a valid tradeoff. In RhizomeDB, we’re making a similar tradeoff because otherwise the number of round trips on single database rows would be unreal :stuck_out_tongue:

Ok, I think we’re understanding each other :tada: textual communication hard 4 me. Fine to talk Friday about this; these are things we want to have on our 3-6 month roadmap.

Our roadmap to get to E2E (or E2E-ish) is:

1. First, they just put their keys in our webserver and we do normal cybersecurity best practices.
2. Second, they put their key manifest files in WNFS (these describe how to recover CAR files to the original filesystem, including age identity private keys), maybe we do something multisig-looking somewhere, and we use it with webnative to get a key-manifest-management platform that looks something like the photo-sharing demo you have live right now.
3. Third, we put the whole filesystem into WNFS and then put it into Filecoin.
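Purely to illustrate the shape of step two, a hypothetical manifest; every field name here is invented to mirror the description above, not taken from our actual format (assumes serde for serialization):

```rust
// Hypothetical shape for a key manifest file; every field name is made up
// to mirror the prose above, not taken from Banyan's real format.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct KeyManifest {
    /// Root CIDs of the CAR files that together hold the filesystem.
    car_roots: Vec<String>,
    /// Parameters needed to reassemble chunks into the original files.
    chunking: String,
    /// age identity for decryption; in practice this would be wrapped by
    /// something multisig-looking rather than stored in the clear.
    age_identity: String,
}
```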

For V3, ignoring concurrent writes but allowing FS merges and maintaining history would probably work. Concurrent writes would eventually be nice.

Recursive encryption would be necessary for good sharing UX… Symlinks might be helpful, but splitting the forests would be better.

Re: “belongs together”: agreed about leakage in this case, but on the flip side… here’s a funny thing to think about (probably not important yet, just fun):

Giving the way you bundle blocks into Filecoin pieces good spatial/temporal/usage locality properties could actually shield information. If a group of people all frequently access 32 GiB of data that could fit into a single piece, but it’s instead split across 32 different pieces because we serialized their FS in an unlucky order, then the permutations of which group members access which subsets of those pieces over time give high-information-content public fingerprints for which data is accessed together, and reveal relationships in their work patterns.

Basically, the more you have “these particular pieces are unsealed/downloaded together a lot!”, the more information you are leaking. A good design would prevent that by making sure things are downloaded together as infrequently as possible, and when they are, it’s a common pattern probably corresponding to downloading a relatively-whole sub-DAG.

I’m not sure what would give a decent solution to this issue, heuristically or computationally, tbh (I think it’s NP-complete even with full information about future access patterns), but generally keeping files and directories together in Filecoin pieces would be a helpful protective measure. :sweat_smile:
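For flavor, one naive greedy heuristic in that direction; all of it (the sizes, the co-access counts, the merge rule) is made up for illustration, not something we've built:

```rust
// Toy greedy packer: repeatedly merge the two groups with the strongest
// co-access signal that still fit within one piece. Everything here
// (sizes, co-access counts, the merge rule) is illustrative.

use std::collections::HashMap;

const PIECE: u64 = 32 * 1024 * 1024 * 1024; // 32 GiB piece budget

/// `sizes[b]` is block b's size in bytes; `coaccess[&(a, b)]` counts how
/// often blocks a and b (with a < b) were fetched in the same session.
fn pack(sizes: &[u64], coaccess: &HashMap<(usize, usize), u64>) -> Vec<Vec<usize>> {
    // Start with one group per block.
    let mut groups: Vec<Vec<usize>> = (0..sizes.len()).map(|b| vec![b]).collect();
    loop {
        let mut best: Option<(usize, usize, u64)> = None;
        for i in 0..groups.len() {
            for j in (i + 1)..groups.len() {
                let size: u64 =
                    groups[i].iter().chain(&groups[j]).map(|&b| sizes[b]).sum();
                if size > PIECE {
                    continue; // merged group would not fit in one piece
                }
                // Total co-access weight between the two groups.
                let score: u64 = groups[i]
                    .iter()
                    .flat_map(|&a| groups[j].iter().map(move |&b| (a.min(b), a.max(b))))
                    .filter_map(|key| coaccess.get(&key))
                    .sum();
                if score > 0 && best.map_or(true, |(_, _, s)| score > s) {
                    best = Some((i, j, score));
                }
            }
        }
        match best {
            // j > i, so removing j leaves index i valid.
            Some((i, j, _)) => {
                let merged = groups.swap_remove(j);
                groups[i].extend(merged);
            }
            None => return groups, // nothing co-accessed still fits: done
        }
    }
}
```

Quadratic and nowhere near optimal, but it captures the goal: blocks that are fetched together get sealed together, so access patterns stop spelling out who is working on what.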
