Just floating an idea about #PID and #FAIR that has been in the back of my mind for some time. In the blockchain world, @IpfSbot creates PIDs or addresses (#multihash) based on the content of the digital object. Wouldn't it be great if the scientific community embraced this pattern?
A little bit of context: creating PIDs for your digital assets is the first #FAIR principle (F1) and an absolute cornerstone for rolling out a #datastrategy. In the scientific world this is typically solved by creating handles through a party that guarantees persistence, e.g. #DOI.
Independently, I have been following the #IPFS (ipfs.io) #opensource community over the last few years, which has a really interesting architecture for the decentralized internet. It's a mix of #git-like Merkle trees, p2p technology (like BitTorrent), and cryptographic hashes.
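To make the Merkle tree part concrete, here is a minimal Python sketch. It is a generic binary Merkle tree, not the actual IPFS block layout; the chunk contents and the rule for an unpaired node are assumptions chosen for illustration.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(chunks: list[bytes]) -> bytes:
    """Hash each chunk, then hash pairs of hashes until one root remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [sha256(c) for c in chunks]
    while len(level) > 1:
        # Combine neighbours pairwise into the next level up.
        nxt = [sha256(level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        # Assumption: an unpaired last node is carried up unchanged.
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Hypothetical chunks standing in for pieces of a large file.
chunks = [b"ACGT" * 4, b"TTGA" * 4, b"CCAT" * 4]
root = merkle_root(chunks)
tampered = merkle_root([chunks[0], b"TTGC" * 4, chunks[2]])

print(root.hex())
print(root != tampered)  # changing any chunk changes the root
```

The point of the structure is that one short root hash commits to every chunk underneath it, so peers can exchange and verify a file piece by piece.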
So, if you have a large file, like a genome in a deterministic format, you can compute its hash and use that to uniquely identify that exact file. Anyone who wants to download it can just broadcast the hash to peer nodes, ask whether they have it, and fetch it: docs-beta.ipfs.io/concepts/what-…
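A rough Python sketch of that idea: hash the bytes with SHA-256, wrap the digest in the multihash layout (function code, digest length, digest) and print it in base58. The sample bytes and helper names are made up for illustration. As far as I understand it, IPFS itself hashes a chunked Merkle structure rather than the raw file, so this string would not match what an IPFS node reports for the same file.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc(data: bytes) -> str:
    """Encode bytes with the Bitcoin base58 alphabet."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def sha256_multihash(content: bytes) -> str:
    digest = hashlib.sha256(content).digest()
    # Multihash layout: <hash function code><digest length><digest>.
    # 0x12 is the code for sha2-256; 0x20 is the digest length (32 bytes).
    return base58btc(bytes([0x12, 0x20]) + digest)


# Hypothetical stand-in for a large genome file in a deterministic format.
genome = b">chr1\nACGTACGTACGT\n"
address = sha256_multihash(genome)
print(address)
```

The same bytes always give the same address, and any change to the file gives a different one, which is what makes the hash usable as a PID for that exact file.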
You can also 'pin' a file or folder in your node to permanently host it. Combine that with a name service like IPNS and you have a decentralized internet: if you have a network of computers with part of the internet cached or pinned, you could still browse it on a trip in deep space.
Why could the scientific community benefit from this? Universities often have large compute and storage clusters, but files still get copied many times or even lost because of broken links and missing metadata. We are relying on someone in the middle: a #repository or cloud provider.
If scientists used #IPFS (or #IPLD, as the umbrella standard) to persist and exchange data files, the system would guarantee by design that as long as someone in the network has a file, you can retrieve it. Repositories could then focus on more interesting things than storing files:
producing relevant metadata, curating collections, etc. It would also likely save the scientific community a lot of storage and maintenance costs. I've brought this to the attention of @GOFAIRofficial and @FAIRsFAIR_EU and hope we can investigate further.
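The "as long as someone has the file" property can be shown with a toy model in Python. This is not the IPFS API: the `Peer` class, the `fetch` function and the sample data are all hypothetical, and real peer discovery is far more involved. It only shows that a content address lets you accept a file from any peer and verify it yourself.

```python
import hashlib
from typing import Optional


class Peer:
    """Hypothetical node that keeps blocks keyed by their SHA-256 hex digest."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: dict[str, bytes] = {}

    def pin(self, content: bytes) -> str:
        address = hashlib.sha256(content).hexdigest()
        self.blocks[address] = content
        return address


def fetch(address: str, peers: list[Peer]) -> Optional[bytes]:
    """Ask each peer in turn; accept only bytes whose hash matches the address."""
    for peer in peers:
        content = peer.blocks.get(address)
        if content is not None and hashlib.sha256(content).hexdigest() == address:
            return content
    return None


data = b"sample,value\ns1,0.42\n"     # made-up dataset
university_a, university_b = Peer("a"), Peer("b")
address = university_a.pin(data)
university_b.pin(data)                # a second copy held elsewhere

del university_a.blocks[address]      # the original host loses the file
print(fetch(address, [university_a, university_b]) == data)
```

Because the address is derived from the content, it does not matter which node answers: a wrong or corrupted copy simply fails the hash check.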
One last note: ipld.io is multi-protocol, so it's not just about files. Researcher identities (e.g. using #DID, w3.org/TR/did-core/), university identities, even peer review processes and certifications, and of course payments could also be de-intermediated.
Imagine what a fully decentralized #peerreview process would look like: you would self-sign the paper and data, publish the address to relevant repositories/journals/communities/twitter feeds, and invite others to review. Reviewers could sign with verifiable academic credentials while
still remaining anonymous if they want to, and their reviews could then be picked up by curators (from traditional journals but also modern science curators like @EricTopol). I'm not an #openaccess expert but this would really be #openscience, and a lot of the pieces are available today!