IPFS watcher: A service that notifies on data changes

bnchrch · October 2, 2019, 5:03pm

Description

Build a small self-hostable service that will watch a given CID on IPFS (I think this is better described as a PeerId on IPNS; @expede, is that right?) and send a request to a given webhook when the file has changed.

User Impact

Who would want to use this and why?
There are many reasons people would want to know about a data change:

Create an RSS feed from an IPFS blog
Get notified on form submission (if ipfs-forms is completed)
Get notified when deploy processes end (“v1.2.3 of Staging is now live”)
Keep an eye on sensitive files that should not change
Force an IPFS node to keep the document in cache

Features

A first version should allow a user to:

Self Host

Users should be able to easily host this service on the infrastructure of their choosing.

Setup quickly

We should allow users to deploy this to Heroku with one click.

Watch many and notify multiple

A user should be able to set multiple CIDs to be watched, each with its own polling interval. Detected changes should also be able to go out to one or more webhooks per CID.
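To make the "watch many, notify multiple" idea concrete, here is a minimal sketch of the polling step. Everything in it is hypothetical: the `Watch` fields, the `poll_due` function, and the injected `resolve` and `notify` callables (which would wrap an IPFS/IPNS lookup and an HTTP POST in a real service) are illustration names, not an existing API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Watch:
    """One watched identifier, its polling interval, and its webhooks (all hypothetical)."""
    name: str
    interval_seconds: float
    webhooks: List[str]
    last_seen: Optional[str] = None  # last value the name resolved to
    next_poll_at: float = 0.0        # earliest time the next poll is allowed


def poll_due(
    watches: List[Watch],
    now: float,
    resolve: Callable[[str], str],
    notify: Callable[[str, dict], None],
) -> List[Tuple[str, str, str]]:
    """Poll every watch that is due and return the (name, old, new) changes found."""
    changes = []
    for watch in watches:
        if now < watch.next_poll_at:
            continue  # this watch has its own interval and is not due yet
        watch.next_poll_at = now + watch.interval_seconds
        current = resolve(watch.name)
        # The first poll only records a baseline; later polls compare against it.
        if watch.last_seen is not None and current != watch.last_seen:
            payload = {"name": watch.name, "old": watch.last_seen, "new": current}
            for url in watch.webhooks:
                notify(url, payload)  # one change fans out to every webhook
            changes.append((watch.name, watch.last_seen, current))
        watch.last_seen = current
    return changes
```

A real service would call `poll_due` from a timer loop and pass the current time in; keeping the resolver and notifier as parameters makes the change detection easy to test without a network.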

Future Extensions

Hosted

We should offer to host this for users, with a range of free and paid plans.

OrbitDB

Extend this to include OrbitDB table changes (e.g. “Pet 123 added/deleted/updated”), so that people can know when data has changed in their database.

IPFS PubSub

Extend this to also send notifications on publishes to IPFS channels in ipfs-pubsub.

expede · October 6, 2019, 5:53am

TL;DR

IPNS-over-pubsub does this for us already
It’s slow to start up at the moment
We may be able to work around that while we’re the DNS updater

Detailed Version

and send a request to a given webhook when the file has changed.

So, yes, I agree that this is a good feature, but I want to clarify a technical detail. You can’t edit a file at a CID; they’re immutable.

In a traditional RESTful system, you have {path => content} where the path is an arbitrary string. On IPFS (or with content addressing generally), you have {hash(content) => content}. You can never “create”, “edit”, or “destroy” a file at a CID. All CIDs “exist” already, but you either have access to a copy or you don’t.
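As a rough illustration of {hash(content) => content}, here is a toy content-addressed store. It is a sketch only: the `ContentStore` class is hypothetical, and it uses a bare SHA-256 hex digest as the key, whereas real IPFS CIDs wrap the hash in multihash/multibase encoding.

```python
import hashlib
from typing import Dict, Optional


class ContentStore:
    """Toy {hash(content) => content} store; the key is derived from the bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        # Simplification: plain SHA-256 hex, not a real CID.
        key = hashlib.sha256(content).hexdigest()
        self._blobs[key] = content
        return key

    def get(self, key: str) -> Optional[bytes]:
        # Either we hold a copy of the content for this key or we don't.
        return self._blobs.get(key)


store = ContentStore()
v1 = store.put(b"hello")
v2 = store.put(b"hello, world")  # an "edit" produces a different key
```

After this runs, `v1` still maps to `b"hello"` and `v2` is a separate key. Nothing at `v1` changed, which is why a watcher needs a mutable pointer (such as an IPNS name) rather than a CID to observe.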

How to Get the End Result

DNS

When we receive a request to update DNS, we push out a pubsub request over web2-style WebRTC/sockets/&c

IPNS

We have this feature in web3-land! What we’re describing is actually what the “experimental” (i.e. the only usable) version of IPNS, IPNS-over-pubsub, does. In our testing, it takes between 2 and 120 seconds to initially subscribe to a channel (while it syncs the global head), but once you’re in, you get pushed live updates basically instantaneously.

If a client is bootstrapped to us directly, it’ll be on the lower end of that range, but (e.g.) 10000ms is still a heck of a long time on the web. This goes for all IPFS-based pubsub AFAICT, not just IPNS-over-pubsub.

Speeding Things Up

We can speed this up a few ways, but may need to tap into some lower-level primitives in libp2p. Right now, pubsub waits for n confirmations from distinct peers to make sure that it’s at the correct head. This is the correct thing to do on paper, but in practice the network topology won’t be spread thin. We can make the trade-off of eventual consistency, and let the client decide how long they’re willing to wait. With zero data to back me up, I’d bet that new events would gossip through the network on the order of seconds. Being a few seconds behind the global head with some delay probability is not really a huge problem for the 95% use case, IMO.
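The trade-off can be sketched with a toy model. This is not how libp2p pubsub is implemented; `pick_head`, its parameters, and the `(arrival_time, peer_id, sequence)` report shape are all assumptions made up for illustration. The client accepts the newest head it has heard about once its own deadline passes, and is told whether that head reached the required number of distinct-peer confirmations.

```python
from typing import Iterable, Optional, Set, Tuple

Report = Tuple[float, str, int]  # (arrival time in seconds, peer id, sequence number)


def pick_head(
    reports: Iterable[Report],
    required_confirmations: int,
    max_wait_seconds: float,
) -> Tuple[Optional[int], bool]:
    """Return (best sequence seen, whether it was fully confirmed in time).

    `reports` must be ordered by arrival time.
    """
    best_seq: Optional[int] = None
    confirming_peers: Set[str] = set()
    for arrived_at, peer_id, seq in reports:
        if arrived_at > max_wait_seconds:
            break  # the client's patience ran out; use what we have
        if best_seq is None or seq > best_seq:
            # A newer head resets the confirmation count.
            best_seq, confirming_peers = seq, {peer_id}
        elif seq == best_seq:
            confirming_peers.add(peer_id)
        if len(confirming_peers) >= required_confirmations:
            return best_seq, True
    return best_seq, False
```

With a short `max_wait_seconds`, a client gets a probably-current head quickly and can keep listening for later updates; with a long one, it gets the stricter guarantee at the cost of startup time.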

Thoughts?

bnchrch · October 7, 2019, 1:17pm

Nice follow-up! I don’t have a lot to add here, other than that for the above use cases I don’t think a 10000ms delay would be a problem. The use cases outlined don’t depend on instant notification to be useful.

When we receive a request to update DNS, we push out a pubsub request over web2-style WebRTC/sockets/&c

Absolutely. This infrastructure would also:

Allow us to be open by default (feel free to subscribe to these pushes yourself)
Give us the ability to manage complexity for a fee. For example, we could create our own service that consumes these events and directs the information to other external services of our customers’ choice.