NIP Relay discovery via DHT #2018
Good idea. Maybe it could be done without protocol interventions, because relay discovery could be facilitated in many different ways.
I don't understand DHTs very well, so take this with a grain of salt, but here are a few thoughts:
- This seems a little convoluted: is it for relay discovery, NIP 65 event discovery, or is it a general purpose sha256-based DHT? I see how making it general purpose would incentivise more nodes to be run, but it seems weird to have such heterogeneous data types living in the same table.
- Related, poisoning attacks are mitigated by connecting to relays and verifying signatures, but this seems like a limitation on the general-purpose use of the DHT, since some use cases might not have the same kind of validations.
- Addresses or event ids seem like natural keys (rather than using the hash of a pubkey to point to a nip 65 event, use a hash of the event's address).
Thank you for the simple description of the DHT functionality. Why hash the npub in order to get the user id? Isn't it better to just use the pubkey? We don't want that because it's not evenly distributed? My honest thoughts about this are: 1) we don't really need DHTs, because hardcoding some well-known relays is ok, just like it's ok to hardcode the IP addresses of some DNS servers; 2) pubkeys are already very cryptic to share, so sharing an But of course more decentralization always helps, so I like the idea of having a DHT for this, just in case. But then wouldn't it be better to have an alternative system? Not involving existing relays, but only dedicated DHT nodes?
Hey all, Thanks for your time giving feedback.
Pkarr is based on the BitTorrent Mainline DHT which uses ed25519 keys which are incompatible with secp256k1 that Nostr uses. So people would need a way to find out a user's ed25519 key from the npub. Could still be interesting though. Putting a Mainline DHT proxy into Nostr relays would unlock some interesting use-cases.
Yep exactly. It's a simple way to get a deterministic lookup from
Normally when you implement a DHT you have to specify the table storage mechanism. BitTorrent goes into detail about this for example in BEP0005 and BEP0044. In our case, relays already have the storage protocol sorted out in the form of the generic event storage that relays already implement. The "DHT storage" here is simply writing an event to the relay using the existing websocket mechanism, which is already very open ended and heterogeneous. Restricting that somehow would be more complicated than just leaving it how it works already.
Again, this is already happening as a normal part of the protocol. The NIP is supposed to point out that this is working and there's nothing to change or implement. That should probably be more clearly worded.
You could do that too, yes. You could share the hash of the event's address and the user could look it up that way.
Yes good point! We can just use the pubkey.
I guess the difference is that hardcoding the IP addresses of DNS servers happens at the OS level, whereas I'm thinking at the application level. Applications generally don't have IP addresses hardcoded. On the other hand, BitTorrent apps do hardcode some torrent trackers for bootstrapping the DHT. I guess hardcoding a couple of big relays is similar to that.
Wow, I had no idea I could do this, thank you. It's right there in nostr-tools README. 😅 I am building small web apps where the user can sync their data via relays. At the moment the user has to share an
The good thing about relays is they exist, in stark contrast with the majority of decentralization ideas (and this DHT proposal). An interesting alternative would be a client-based DHT which goes via the relays without needing any changes to relay software. Will have a think about this. One other thing I realized is a client can still compute a Thanks very much for your responses. I think I can actually solve my main issues without needing a relay DHT for now. Maybe this proposal can serve as a source of ideas for any future DHT developments.
Yes. Pubky App uses intermediate servers, "homeservers", which have access to the DHT (since there is no library that makes this work from browsers), so one solution would be a "bridge" that does that for Web apps. However, such a solution defeats the purpose.
I don't have a lot to say about the details of the DHT implementation, but I think if this were paired with a query that could route DHT queries through a relay, and we could add that NIP support to several of the relay servers, it would let clients which can't directly query the DHT use a random public relay that does talk to the DHT as the gateway. The big relays then become more like discovery gateways and less concentrated. This is more open than the homeserver, lets us have DHTs for npub -> relay discovery, and shouldn't be hard to implement.
Ok, I figured out a way to drastically under-engineer this and make it 100% client side. It works like this:
This basically achieves most of what I wanted with this draft NIP, without changing relay software. Bootstrapping the relay list and keeping it updated can be done infrequently, and the cached list can be used over and over to determine a deterministic set of relays to associate with an npub, or any string. My primary use case is decentralized private web apps where I don't want the hardcoded relays to be relied on so hard. I made a small zero-dependency client-side library that uses this technique to build a DHT mapping pubkeys to relay sets using client-only queries (depends on ws but only in Node). https://github.com/chr15m/nostr-dht Browser demo: https://chr15m.github.io/nostr-dht/ (paste your npub) Node library & demo:
import { discoverRelays, getClosestRelays } from 'nostr-dht';
const relays = await discoverRelays(bootstrapRelays);
const closestRelays = await getClosestRelays(npub, relays);

@fiatjaf @staab I would love your feedback if you're not too busy. I'd be happy for this code to end up in nostr-tools if it turns out not to be batshit crazy.
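For readers unfamiliar with the idea, the "deterministic set of relays for an npub" step can be sketched roughly as follows. This is not the nostr-dht source: it assumes a Kademlia-style XOR distance between the SHA-256 of the key and the SHA-256 of each relay URL, and the helper names (`sha256`, `xorDistance`, `closestRelays`) are hypothetical.

```javascript
const { createHash } = require("node:crypto");

// SHA-256 of a UTF-8 string, returned as a Buffer.
function sha256(text) {
  return createHash("sha256").update(text, "utf8").digest();
}

// XOR two equal-length buffers; the result acts as a Kademlia-style distance.
function xorDistance(a, b) {
  const out = Buffer.alloc(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] ^ b[i];
  return out;
}

// Hypothetical helper (an assumption, not the library's API):
// return the `count` relays whose hashed URL is closest to the hashed key.
function closestRelays(key, relayUrls, count = 3) {
  const target = sha256(key);
  return relayUrls
    .map((url) => ({ url, distance: xorDistance(target, sha256(url)) }))
    // Buffer.compare orders equal-length buffers as big-endian numbers.
    .sort((x, y) => Buffer.compare(x.distance, y.distance))
    .slice(0, count)
    .map((entry) => entry.url);
}
```

Because the ordering depends only on the hashes, any two clients holding the same cached relay list would pick the same relays for a given key, whatever order the list is stored in.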
You can also use all the event hints to build a bigger relay list of options for each key. On the new Amethyst, I track all pubkey hints and nprofile URIs inside contents from all event kinds and put them into a 4 MB bloom filter that can hold the relay list of about 1 million keys (with around 10 hint relays each). In that way, I discover even fringe community relays the user is using and can hit them to download stuff I can't find on the user's own outbox relays. It's probably not the most effective data structure, but it works quite well for the memory limitations in mobile apps.
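As a rough illustration of the data structure described above, here is a minimal Bloom filter keyed on (pubkey, relay) pairs. It is a sketch only, not Amethyst's implementation: the class name `HintBloomFilter`, the FNV-1a hash, the double-hashing scheme, and the sizes used are all assumptions.

```javascript
// Minimal Bloom filter sketch. Hashing and sizing are illustrative assumptions,
// not Amethyst's actual implementation.
class HintBloomFilter {
  constructor(bitCount, hashCount) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // 32-bit FNV-1a over the string's UTF-16 code units, with a seed mixed in.
  static fnv1a(text, seed) {
    let hash = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < text.length; i++) {
      hash ^= text.charCodeAt(i);
      hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash;
  }

  // Derive hashCount bit positions from two base hashes (double hashing).
  positions(item) {
    const h1 = HintBloomFilter.fnv1a(item, 0);
    const h2 = (HintBloomFilter.fnv1a(item, 0x9e3779b9) | 1) >>> 0;
    const out = [];
    for (let i = 0; i < this.hashCount; i++) {
      out.push((h1 + i * h2) % this.bitCount);
    }
    return out;
  }

  // Record that `relayUrl` was seen as a hint for `pubkey`.
  add(pubkey, relayUrl) {
    for (const pos of this.positions(`${pubkey}|${relayUrl}`)) {
      this.bits[pos >> 3] |= 1 << (pos & 7);
    }
  }

  // false means "definitely not recorded"; true means "probably recorded".
  mightContain(pubkey, relayUrl) {
    return this.positions(`${pubkey}|${relayUrl}`).every(
      (pos) => (this.bits[pos >> 3] & (1 << (pos & 7))) !== 0
    );
  }
}
```

To recover the candidate relays for a key, a client would test that key against every relay URL it knows about and keep the ones where `mightContain` returns true, accepting occasional false positives.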
@vitorpamplona wow that sounds comprehensive. Relay lists for 1 million keys in 4 MB is insane! 🤯 I did some basic spot-check testing to see how many relays each kind yielded:
Once I saw the basic power-law curve with Actually since it's so easy to make compound filters I might as well update it to use the top 3 kinds. 🤔 For small web apps I guess we don't have access to anywhere near the number of events Amethyst is seeing.
@vitorpamplona I'm curious if you have a sense for how many total relays there are out there? I wonder how nostr.watch builds its list?
A regular session on Amethyst will capture about 850 normalized, unique, non-DM relay hints, but if you remove all the local relays (.local, 127..), which are not usually accessible, the number drops to around 527. Now, that is just the view of one user and their follows/feeds. Our Push Notification server pulls events from a total of 3105 unique inbox relays (DM + 10002).
Also, I am getting ready to automatically re-broadcast kind 10002 events to relays where we couldn't find them. Our users pick their preferred index relays, and if the app sees a kind 10002 that we know their indexers don't have (we already have an empty EOSE for that filter), it will automatically send it back there. We do something similar today when any relay sends an outdated replaceable. This new algorithm just adds relays that didn't send anything into the mix. Meaning, libraries and clients can automatically "fix" any indexing logic that we like, to make it easier for clients to find the user's information.
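The re-broadcast rule described above can be sketched as a small pure function. This is not Amethyst's code: the function name and the shape of `responses` are assumptions made for illustration, and only `created_at` (the NIP-01 event timestamp) is taken from the protocol.

```javascript
// Hypothetical helper, not Amethyst's code.
// `responses` maps relay URL -> the kind 10002 event that relay returned before EOSE,
// or null when EOSE arrived with no event. Relays that never answered are simply absent.
function relaysNeedingRebroadcast(latestEvent, responses) {
  const targets = [];
  for (const [relayUrl, seen] of Object.entries(responses)) {
    const missing = seen === null; // empty EOSE: the relay does not have the event
    const outdated = seen !== null && seen.created_at < latestEvent.created_at;
    if (missing || outdated) targets.push(relayUrl);
  }
  return targets;
}
```

A client would then publish `latestEvent` to each returned relay; relays that already hold the newest version are left alone.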
Whoa! So that's a lot of small private instances that are largely invisible to the network except for their owners? BTW in nostr-dht I'm not just removing 127 + localhost but a bunch of other invalid relays I saw. See isValidRelay for all checks: https://github.com/chr15m/nostr-dht/blob/f7f7601e525b0fb5a1a686454b5fae621a540f44/nostr-dht.js#L44 I also removed non-SSL servers so that the DHT only uses high quality nodes. Thanks for working on Nostr tech. 🙏
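A minimal sketch of this kind of filtering, covering only the checks named in this thread (non-TLS relays, localhost, the 127.x loopback range, and .local hosts). The function name `isLikelyPublicRelay` is hypothetical, and the real isValidRelay linked above performs additional checks not reproduced here.

```javascript
// Hypothetical filter covering only the checks named in this thread;
// the real isValidRelay in nostr-dht performs additional checks.
function isLikelyPublicRelay(relayUrl) {
  let parsed;
  try {
    parsed = new URL(relayUrl);
  } catch {
    return false; // not a parseable URL
  }
  if (parsed.protocol !== "wss:") return false; // drop non-TLS relays
  const host = parsed.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".local")) return false;
  if (host.startsWith("127.")) return false; // IPv4 loopback range
  return true;
}
```

Applied as `relayUrls.filter(isLikelyPublicRelay)`, this keeps only relays that other clients have a reasonable chance of reaching.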
I think the title of the PR is wrong; it should be "DHT proxy via Nostr". In DTAN-server we use the DHT to discover relays (other dtan-server instances) using the SHA1("nostr:2003") info hash. I expected this PR to be the same as that, but it isn't.
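For reference, deriving an info hash like the one mentioned above might look like this in Node. It is a sketch that assumes the topic string is hashed as plain UTF-8 with no extra framing; how DTAN-server actually encodes the input is not shown in this thread, and `topicInfoHash` is a hypothetical name.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: SHA-1 of a topic string, hex-encoded (40 characters),
// which is the size of a Mainline DHT info hash.
function topicInfoHash(topic) {
  return createHash("sha1").update(topic, "utf8").digest("hex");
}

const infoHash = topicInfoHash("nostr:2003");
```

Every instance that computes the same hash can announce itself under it and look up the others, which is how a fixed string can act as a shared rendezvous point.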
This NIP defines a distributed hash table (DHT) protocol for Nostr relays to enable decentralized relay discovery. The protocol allows a client to deterministically locate the NIP-65 relay list for a participating npub without first connecting to the npub's NIP-65 relay set and without centralizing on popular relays.
This aims to solve a chicken-and-egg problem with npub relay discovery, further decentralizing the network and increasing censorship resistance. The DHT can be trivially extended by clients for other uses and event types in future, without modifying the relay implementation further.
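For context, once a client knows which relay to ask, fetching and reading a NIP-65 relay list looks roughly like the sketch below. The helper names are hypothetical and not part of this draft; the REQ message shape follows NIP-01, and the kind 10002 "r" tags with optional read/write markers follow NIP-65.

```javascript
// Hypothetical helpers, not part of the draft NIP.
// Build the NIP-01 REQ message asking a relay for one pubkey's NIP-65 relay list.
function relayListRequest(subscriptionId, pubkeyHex) {
  const filter = { kinds: [10002], authors: [pubkeyHex], limit: 1 };
  return JSON.stringify(["REQ", subscriptionId, filter]);
}

// Extract read and write relay URLs from a kind 10002 event's "r" tags.
// An "r" tag without a marker counts as both read and write.
function parseRelayList(event) {
  const read = [];
  const write = [];
  for (const tag of event.tags) {
    if (tag[0] !== "r" || typeof tag[1] !== "string") continue;
    const marker = tag[2];
    if (marker !== "write") read.push(tag[1]);
    if (marker !== "read") write.push(tag[1]);
  }
  return { read, write };
}
```

The chicken-and-egg problem is that this request can only be sent to a relay the client already knows about, which is the step the DHT lookup is meant to supply.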
Demo
A simplified interactive demo that gives an idea of how the DHT would work is available here:
https://chr15m.github.io/nostr-relay-research/
Stats
On the same repository as the demo above I ran some research on relays in the wild:
Per the stats above, if this NIP were implemented in one or two of the top four relays, a sufficiently large set of DHT nodes would be available to get pretty good decentralization during relay discovery (and other future uses).
Though the sample size is small, the COUNT result shows a long-tailed power-law distribution where a large number of NIP-65 events are centralized on a few relays. If this NIP is deployed it would hopefully lead to a less skewed distribution, as users will have less incentive to centralize on popular relays.
Endorsement
I'd like to offer this endorsement in support of my PR.