
Conversation

@rklaehn (Contributor) commented Sep 10, 2025

No description provided.

rklaehn marked this pull request as draft September 10, 2025 13:14
n0bot added this to iroh Sep 10, 2025
github-project-automation moved this to 🏗 In progress in iroh Sep 10, 2025
@flub (Contributor) left a comment

Partial review only, but I noticed more changes already so submitting now.


For that reason, mature systems such as the mainline DHT restrict value size to 1000 bytes, and we are going to limit value size to 1024 bytes or 1KiB.

You could write a DHT to store arbitrary values, but in almost all cases the value should have some relationship with the key. E.g. for mainline, the value in most cases is a set of socket addresses where you can download the data for the SHA1 hash of the key. So in principle you could validate the value by checking whether you can actually download the data from the socket addresses it contains. In some mainline extensions, like bep_0044, the key is the SHA1 hash of an Ed25519 public key, and the value contains the actual public key, a signature computed with the corresponding private key, and some user data. Again, it is possible to validate the value based on the key: if the SHA1 hash of the public key contained in the value does not match the lookup key, the value is invalid for that key.
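
(For illustration: a minimal sketch of the kind of validation described above, assuming the `sha1` and `ed25519-dalek` crates. The struct layout is invented here, and real bep_0044 signs a bencoded form that also includes a sequence number and optional salt.)

```rust
use ed25519_dalek::{Signature, VerifyingKey};
use sha1::{Digest, Sha1};

/// A bep_0044-style mutable value: the lookup key is the SHA1 hash of the
/// Ed25519 public key contained in the value.
struct MutableValue {
    public_key: [u8; 32],
    signature: [u8; 64],
    user_data: Vec<u8>,
}

fn validate(key: &[u8; 20], value: &MutableValue) -> bool {
    // The value is only valid for this key if the key is the SHA1 hash
    // of the public key it contains...
    if Sha1::digest(value.public_key).as_slice() != key.as_slice() {
        return false;
    }
    // ...and the signature checks out against that public key.
    let Ok(vk) = VerifyingKey::from_bytes(&value.public_key) else {
        return false;
    };
    let sig = Signature::from_bytes(&value.signature);
    vk.verify_strict(&value.user_data, &sig).is_ok()
}
```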
@flub (Contributor):

You sort of seem to suggest that enforcing only storing values that can be validated is a good idea. But you don't actually go as far as suggesting that. What is the aim of this paragraph? It may need to be a bit more strongly worded in its suggestion/recommendation?

@rklaehn (Contributor, Author):

I think it is a good idea, and mainline kinda sorta does it. All the extensions have validatable values (bep_0044 and the one for immutable data).

For the main use case, you cannot store arbitrary socket addrs for a SHA1 hash, but only the socket addrs of your own node as seen from the callee. The only free parameter is the port number.

I think I want to do this as well, but for blake3 providers you either let the DHT node do BLAKE3 probes (costly), or you store a signed record or something. Not quite worked out yet.

@rklaehn (Contributor, Author):

So, possibility 1: values for content discovery are just node ids. Tracker can lazily do BLAKE3 probes to make sure the data is there, and then purge values if not.

Possibility 2: values are a signed promise by the announcer that the data is there. Tracker can check the signature, but that does not really tell us if the promise is upheld. If you want to know if the data is there, you have to check.
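
(For illustration: a rough sketch of what possibility 2 could look like, assuming Ed25519 node keys via the `ed25519-dalek` crate. All names and the record layout are invented; nothing here is settled.)

```rust
use ed25519_dalek::{Signature, Signer, SigningKey, Verifier, VerifyingKey};

/// A signed promise by the announcer that it can provide the data.
struct SignedAnnounce {
    content: [u8; 32],   // BLAKE3 hash being announced
    node_id: [u8; 32],   // Ed25519 public key of the announcer
    timestamp: u64,      // so the tracker can expire stale promises
    signature: [u8; 64],
}

/// The exact bytes that get signed: content hash, node id, timestamp.
fn announce_msg(content: &[u8; 32], node_id: &[u8; 32], timestamp: u64) -> Vec<u8> {
    let mut msg = Vec::with_capacity(72);
    msg.extend_from_slice(content);
    msg.extend_from_slice(node_id);
    msg.extend_from_slice(&timestamp.to_be_bytes());
    msg
}

impl SignedAnnounce {
    fn sign(key: &SigningKey, content: [u8; 32], timestamp: u64) -> Self {
        let node_id = key.verifying_key().to_bytes();
        let msg = announce_msg(&content, &node_id, timestamp);
        Self { content, node_id, timestamp, signature: key.sign(&msg).to_bytes() }
    }

    /// The tracker can check the signature, but as noted above this only
    /// proves who made the promise, not that the data is actually there.
    fn verify(&self) -> bool {
        let Ok(vk) = VerifyingKey::from_bytes(&self.node_id) else { return false };
        let msg = announce_msg(&self.content, &self.node_id, self.timestamp);
        vk.verify(&msg, &Signature::from_bytes(&self.signature)).is_ok()
    }
}
```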

@flub (Contributor) left a comment

Did get a little further... but not yet to the end.


And that's it. That is the entire RPC protocol. Many DHT implementations also add a `Ping` call, but since querying the routing table is so cheap, if you want to know whether a node is alive you might as well ask it for the closest nodes to some random key and get some extra information for free.
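
(The actual message types are not quoted here, so the following is a generic Kademlia-style stand-in, assuming the `rand` crate, meant only to illustrate why a dedicated `Ping` is redundant.)

```rust
/// Hypothetical request types for a minimal DHT; the post's real protocol
/// may differ in names and details.
enum Request {
    /// "Which nodes do you know that are closest to this key?"
    FindNode { key: [u8; 32] },
    /// Closest nodes plus any values stored under the key.
    Get { key: [u8; 32] },
    /// Store a value under the key.
    Put { key: [u8; 32], value: Vec<u8> },
}

/// A liveness check is just a FindNode for a random key: if the node
/// answers, it is alive, and we learn part of its routing table for free.
fn ping_request() -> Request {
    Request::FindNode { key: rand::random() }
}
```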

## RPC client
@flub (Contributor):

You could skip this entire section for a blog post; it would also shorten it. This is more tutorial material. To me the protocol is the important bit, this is just boilerplate.

@rklaehn (Contributor, Author):

Maybe publish the crate (under a better name) and just point to the docs / impl?

@flub (Contributor):

Yeah, that's also a good option if you can point to the code in a repo somewhere. Though maybe @b5 is interested in a tutorial as well to turn into another video?


## Storage implementation

The first thing we would have to do to implement this protocol would be the storage part. For this experiment we will just use a very simple memory storage. This might even be a good idea for production! We have a limited value size, and DHTs are not persistent storage anyway. DHT records need to be continuously republished, so if a DHT node goes down it will just be repopulated with values shortly after becoming online again.
@flub (Contributor):

Suggested change:
- The first thing we would have to do to implement this protocol would be the storage part. For this experiment we will just use a very simple memory storage. This might even be a good idea for production! We have a limited value size, and DHTs are not persistent storage anyway. DHT records need to be continuously republished, so if a DHT node goes down it will just be repopulated with values shortly after becoming online again.
+ The first thing we would have to do to implement this protocol would be the storage part. For this experiment we will use a very simple memory storage. This might even be a good idea for production! We have a limited value size, and DHTs are not persistent storage anyway. DHT records need to be continuously republished, so if a DHT node goes down it will just be repopulated with values shortly after becoming online again.

Writing tip: you can almost always strike "just" from any text and it becomes better. (yeah, my own drafts still contain that word a lot too...)
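
(For reference: a minimal sketch of such an in-memory store. The 1024-byte cap comes from the text above; the expiry policy and all names are invented.)

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

const MAX_VALUE_SIZE: usize = 1024;
const EXPIRY: Duration = Duration::from_secs(2 * 60 * 60);

/// All values stored by this node, keyed by the 32-byte DHT key.
#[derive(Default)]
struct MemStorage {
    /// key -> value -> time the value was last (re)published
    values: HashMap<[u8; 32], HashMap<Vec<u8>, Instant>>,
}

impl MemStorage {
    fn put(&mut self, key: [u8; 32], value: Vec<u8>) -> bool {
        if value.len() > MAX_VALUE_SIZE {
            return false;
        }
        self.values.entry(key).or_default().insert(value, Instant::now());
        true
    }

    fn get(&self, key: &[u8; 32]) -> Vec<Vec<u8>> {
        self.values
            .get(key)
            .map(|values| values.keys().cloned().collect())
            .unwrap_or_default()
    }

    /// Called periodically: anything that has not been republished within
    /// EXPIRY is dropped, which is all the persistence a DHT node needs.
    fn expire(&mut self, now: Instant) {
        self.values.retain(|_, values| {
            values.retain(|_, published| now.duration_since(*published) < EXPIRY);
            !values.is_empty()
        });
    }
}
```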


So let's define the routing table. First of all, we need some simple integer arithmetic like `xor` and `leading_zeros` for 256-bit numbers. There are various crates that provide this, but since we don't need anything fancy like multiplication or division, we just quickly implemented it inline.

The routing table itself is just a 2D array of node ids. Each row (k-bucket) has a small fixed upper size, so we are going to use the [ArrayVec] crate to prevent allocations. For each node id we keep a tiny bit of extra information: a timestamp of when we last saw any evidence that the node actually exists and responds, to decide which nodes to check for liveness.
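
(For illustration: a sketch of the kind of inline 256-bit helpers this refers to; the actual code in the post may differ.)

```rust
/// Bitwise xor of two 256-bit ids, the Kademlia distance metric.
fn xor(a: &[u8; 32], b: &[u8; 32]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = a[i] ^ b[i];
    }
    out
}

/// Number of leading zero bits of a 256-bit id.
fn leading_zeros(x: &[u8; 32]) -> u32 {
    let mut n = 0;
    for byte in x {
        if *byte == 0 {
            n += 8;
        } else {
            return n + byte.leading_zeros();
        }
    }
    n
}

/// Which k-bucket a node falls into: the longer the shared prefix with our
/// own id, the higher the bucket index. (A node never stores its own id,
/// so the all-zero distance does not come up.)
fn bucket_index(own_id: &[u8; 32], node_id: &[u8; 32]) -> usize {
    leading_zeros(&xor(own_id, node_id)) as usize
}
```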
@flub (Contributor):

I haven't really been paying much attention to this till now, so there might be other occurrences. But perhaps we should stick to "node ID" in text when referring to the iroh NodeId. It's mostly a stylistic question though.


Hence the `Box`, and you will sometimes have to jump through some hoops initializing a `Buckets` struct.

The problem would go away if we were to use Vec instead of ArrayVec, but that would mean that the routing table data is spread all over the heap depending on heap fragmentation at the time of allocation.
@flub (Contributor):

Suggested change:
- The problem would go away if we were to use Vec instead of ArrayVec, but that would mean that the routing table data is spread all over the heap depending on heap fragmentation at the time of allocation.
+ The problem would go away if we were to use `Vec` instead of `ArrayVec`, but that would mean that the routing table data is spread all over the heap depending on heap fragmentation at the time of allocation.

Again a stylistic question, I haven't been consistently looking out for these.
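
(A sketch of the layout under discussion, assuming the `arrayvec` crate. `K` and the field names are illustrative, not the post's actual definitions.)

```rust
use arrayvec::ArrayVec;
use std::time::Instant;

/// Max entries per k-bucket; 20 is the classic Kademlia choice.
const K: usize = 20;

#[derive(Clone, Copy)]
struct Node {
    node_id: [u8; 32],
    /// Last time we saw evidence that this node is alive.
    last_seen: Instant,
}

/// One fixed-capacity bucket per possible shared-prefix length.
struct Buckets([ArrayVec<Node, K>; 256]);

struct RoutingTable {
    own_id: [u8; 32],
    /// The 2D array is large and fixed-size, so it goes on the heap; this is
    /// the Box (and the initialization hoops) mentioned above.
    buckets: Box<Buckets>,
}

impl RoutingTable {
    fn new(own_id: [u8; 32]) -> Self {
        // (Box::new still builds the value on the stack first; good enough
        // for a sketch.)
        let buckets = Box::new(Buckets(std::array::from_fn(|_| ArrayVec::new())));
        Self { own_id, buckets }
    }
}
```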

@ramfox (Member) commented Sep 12, 2025

Okay, this is great. Note, I did not read for grammar or correctness, just read it while thinking about structure.

My suggestion here would be to immediately lean into the fact that this is going to be a series and already split this into 3 separate posts.

The first post would be about what a DHT is and what properties we want in a DHT. We can link to code as a preview for what is to come, and add a conclusion that describes the next few blog posts.

The second post would be the implementation post, ending with something to the effect of "so now we have an implementation, how do we know it works? Next blog post we will describe how to test our DHT implementation and tips for testing iroh networks on the order of 100k nodes locally by using irpc in our protocols."

The third post would be illustrating the way you tested the DHT using irpc. The conclusion can describe some topics for future posts on DHTs.

Each post should have a table of contents at the top with links to each blog post in the series as they get published, as well as a link to the next post in the series at the bottom of each page.

@rklaehn what are your thoughts on splitting it this way? Your post is already basically structured in this fashion, it would just be splitting it explicitly.

@rklaehn (Contributor, Author) commented Sep 12, 2025, replying to @ramfox:

Yeah, makes sense. I wonder if the first part is interesting enough, but we can follow it up quickly with at least the second part.
