Conversation

@alpe (Contributor) commented Aug 7, 2025

🚧 Draft - WIP

  • Sequencer fetches direct-TXs from DA and detects missed inclusion windows
  • Fullnode fetches direct-TXs from DA and detects missed inclusion windows: sets fallback mode

Fallback mode behaviour for both components is not included yet.

alpe added 7 commits August 1, 2025 14:46
* main:
  fix: docs deployment (#2528)
  chore: merge docs repo into this repo (#2526)
  optimize: implement head index for BatchQueue.Next() to avoid O(n) slice re-slicing (#2519)
  build(deps): Bump github.com/rs/zerolog from 1.33.0 to 1.34.0 (#2525)
  chore: migrate from ipfs/go-log to zerolog across codebase (#2516)
  fix: proto inclusion (#2515)
  build(deps): Bump github.com/docker/docker from 28.2.2+incompatible to 28.3.3+incompatible in /test/docker-e2e in the go_modules group across 1 directory (#2503)
  refactor: replace Split in loops with more efficient SplitSeq (#2502)
  feat: expose ev-node P2P and RPC ports in docker compose (#2513)
  feat: create grpc execution environemnt (#2490)
@gemini-code-assist (bot) left a comment

Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request represents a significant "spike" to explore and implement the concept of direct transactions within our system. My primary goal is to establish a robust pipeline for transactions submitted directly to the Data Availability (DA) layer, ensuring their eventual inclusion even if they bypass the standard mempool. This involves introducing new components for processing, queuing, and adapting the execution layer to handle these transactions, alongside mechanisms for forced inclusion and a fallback mode to enhance network resilience.

Highlights

  • Introduction of Direct Transactions: I've introduced the concept of "direct transactions" (DirectTX), which are transactions submitted directly to the Data Availability (DA) layer, bypassing the traditional mempool. This allows for guaranteed inclusion under certain conditions.
  • New Direct Transaction Processing Pipeline: A new pkg/directtx package has been created, housing core components such as the DirectTxReaper (to poll DA for transactions), Extractor (to deduplicate and process them), Sink (to manage a persistent queue), and ExecutionAdapter (to integrate direct transactions with the existing executor); a rough sketch of how these pieces could fit together follows this list.
  • Integration into Node Architecture: The block.Manager and node.FullNode components have been updated to integrate this new direct transaction processing. This includes handling direct transactions from DA blobs and adapting the execution flow.
  • Forced Inclusion Mechanism: I've implemented logic for 'forced inclusion' of direct transactions, controlled by new configurable parameters (min_da_delay and max_inclusion_delay). This ensures that direct transactions are included within a specified time window, even if the sequencer is slow or unresponsive.
  • Fallback Mode for Sequencer Failures: A 'fallback mode' has been introduced to handle scenarios where a sequencer fails to include a direct transaction within its designated window. This mechanism aims to ensure chain liveness and data integrity even in the face of sequencer misbehavior or downtime.
  • New End-to-End Testing: Comprehensive end-to-end tests have been added to validate the forced inclusion of direct transactions, including a test case for the full node's behavior in fallback mode.
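
For orientation, here is a rough sketch of how these components could hang together. This is not code from the PR; apart from the component names DirectTxReaper, Extractor, Sink, and ExecutionAdapter, all interfaces, methods, and fields below are assumptions made purely for illustration.

// Hypothetical sketch of the directtx pipeline; component names come from the
// PR summary, everything else (signatures, fields) is assumed.
package directtx

import (
	"context"
	"time"
)

// DAFetcher is what the DirectTxReaper would poll: blobs in the direct-TX
// namespace at a given DA height.
type DAFetcher interface {
	FetchDirectTXs(ctx context.Context, daHeight uint64) ([][]byte, error)
}

// Queue is what the Sink would manage: a persistent queue of pending direct
// TXs, consumed by the sequencer (or, in fallback mode, by full nodes).
type Queue interface {
	Push(ctx context.Context, tx []byte, seenAtDAHeight uint64, seenAt time.Time) error
	Pending(ctx context.Context, maxBytes uint64) ([][]byte, error)
	MarkIncluded(ctx context.Context, txs [][]byte) error
}

// Pipeline wires reaper -> extractor (dedup) -> sink; the ExecutionAdapter
// then reads from the queue when wrapping the executor.
type Pipeline struct {
	Fetcher DAFetcher
	Queue   Queue
}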

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a mechanism for handling direct transactions submitted to the DA layer, which is a significant feature. It includes a new directtx package with components like a Reaper, Extractor, Sink, and an ExecutionAdapter to integrate with the existing block production flow. A fallback mode is also introduced for when the sequencer fails to include a direct transaction. My review focuses on the correctness and robustness of this new functionality. I've identified some critical issues in error handling and the fallback mechanism's design that could lead to network instability. Additionally, there are several areas where hardcoded values should be made configurable to improve flexibility and maintainability.

Comment on lines 60 to 69
func (d *ExecutionAdapter) ExecuteTxs(ctx context.Context, txs [][]byte, blockHeight uint64, timestamp time.Time, prevStateRoot []byte) (updatedStateRoot []byte, maxBytes uint64, err error) {
    if d.directTXProvider.MarkTXIncluded(ctx, txs, blockHeight, timestamp) != nil {
        return nil, 0, err
    }
    if !IsInFallbackMode(ctx) && // bypass check in fallback mode
        d.directTXProvider.HasMissedDirectTX(ctx, blockHeight, timestamp) != nil {
        return nil, 0, err
    }
    return d.nested.ExecuteTxs(ctx, txs, blockHeight, timestamp, prevStateRoot)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There is a critical bug in the error handling within ExecuteTxs. When d.directTXProvider.MarkTXIncluded or d.directTXProvider.HasMissedDirectTX returns an error, the function returns (nil, 0, err). Because err is a named return value that is nil at that point in the execution, the error is effectively swallowed and not propagated up the call stack. This can mask critical issues during transaction execution.

Suggested change

Current:

func (d *ExecutionAdapter) ExecuteTxs(ctx context.Context, txs [][]byte, blockHeight uint64, timestamp time.Time, prevStateRoot []byte) (updatedStateRoot []byte, maxBytes uint64, err error) {
    if d.directTXProvider.MarkTXIncluded(ctx, txs, blockHeight, timestamp) != nil {
        return nil, 0, err
    }
    if !IsInFallbackMode(ctx) && // bypass check in fallback mode
        d.directTXProvider.HasMissedDirectTX(ctx, blockHeight, timestamp) != nil {
        return nil, 0, err
    }
    return d.nested.ExecuteTxs(ctx, txs, blockHeight, timestamp, prevStateRoot)
}

Suggested:

func (d *ExecutionAdapter) ExecuteTxs(ctx context.Context, txs [][]byte, blockHeight uint64, timestamp time.Time, prevStateRoot []byte) (updatedStateRoot []byte, maxBytes uint64, err error) {
    if err := d.directTXProvider.MarkTXIncluded(ctx, txs, blockHeight, timestamp); err != nil {
        return nil, 0, err
    }
    if !IsInFallbackMode(ctx) { // bypass check in fallback mode
        if err := d.directTXProvider.HasMissedDirectTX(ctx, blockHeight, timestamp); err != nil {
            return nil, 0, err
        }
    }
    return d.nested.ExecuteTxs(ctx, txs, blockHeight, timestamp, prevStateRoot)
}
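
The pitfall described above is easy to reproduce in isolation. A minimal, standalone illustration (names are made up, not from the PR):

package main

import (
	"errors"
	"fmt"
)

func mark() error { return errors.New("mark failed") }

// buggy mirrors the pattern flagged above: err is a named return value that
// is still nil when the early return fires, so the real error is dropped.
func buggy() (out []byte, err error) {
	if mark() != nil {
		return nil, err // err is nil here, not the error returned by mark()
	}
	return []byte("ok"), nil
}

func main() {
	_, err := buggy()
	fmt.Println(err) // prints <nil> even though mark() failed
}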

Comment on lines +33 to +34
func (f *FallbackBlockProducer) CreateFallbackBlock(ctx context.Context) ([][]byte, error) {
    //todo: we need deterministic header and data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The CreateFallbackBlock function is intended to produce deterministic blocks when the sequencer is down. However, the //todo comment correctly identifies that the current implementation is not deterministic regarding the block header and data. If different full nodes enter fallback mode, they will likely produce different blocks, leading to a network fork. This is a critical flaw in the fallback mechanism that must be addressed to ensure chain consistency during a sequencer outage.
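
One possible direction, sketched only to make the determinism requirement concrete (none of these names or rules are from the PR): derive every field of the fallback block from inputs all full nodes already agree on, such as the previous header, the DA height, and the canonically ordered pending direct TXs.

package directtx

import (
	"crypto/sha256"
	"time"
)

// fallbackHeader is a stand-in type for this sketch only.
type fallbackHeader struct {
	Height     uint64
	Time       time.Time
	ParentHash [32]byte
	DataHash   [32]byte
}

// deterministicFallbackHeader derives all fields from shared inputs, so any
// full node producing the fallback block independently gets identical bytes.
// The fixed one-second offset stands in for "derive the timestamp from the DA
// block time", which would be the more realistic choice.
func deterministicFallbackHeader(prev fallbackHeader, prevHash [32]byte, txs [][]byte) fallbackHeader {
	h := sha256.New()
	for _, tx := range txs { // txs must already be in a canonical order
		h.Write(tx)
	}
	var dataHash [32]byte
	copy(dataHash[:], h.Sum(nil))
	return fallbackHeader{
		Height:     prev.Height + 1,
		Time:       prev.Time.Add(time.Second),
		ParentHash: prevHash,
		DataHash:   dataHash,
	}
}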

Comment on lines +927 to +931
if errors.Is(err, directtx.ErrDirectTXWindowMissed) {
    // the sequencer failed to include a direct TX within its window, either by censoring or downtime
    m.fallbackMode = true
    return types.State{}, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The fallbackMode flag is set to true when ErrDirectTXWindowMissed occurs, but there appears to be no mechanism to reset it to false. This means that once a node enters fallback mode, it could remain in that mode indefinitely. This could have serious implications for the node's behavior and performance. A strategy for exiting fallback mode should be implemented, whether it's automatic after a certain condition is met or requires manual intervention.
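
To make the concern concrete, an exit path could be as small as clearing the flag when a fresh, valid sequencer block (or an explicit back-online signal) is observed on the DA layer again. A hypothetical sketch, not code from the PR:

package directtx

// fallbackState is illustrative only; the real Manager keeps a bool today.
type fallbackState struct {
	active                bool
	lastSequencerDAHeight uint64
}

// observe is called for every DA height scanned. If a valid sequencer block
// shows up again, leave fallback mode; otherwise stay in it.
func (s *fallbackState) observe(daHeight uint64, validSequencerBlock bool) {
	if validSequencerBlock {
		s.lastSequencerDAHeight = daHeight
		s.active = false
	}
}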

var directTXSeq coresequencer.DirectTxSequencer
if nodeConfig.Node.Aggregator {
    // Aggregator nodes use the full DirectTxSequencer that can sequence direct transactions
    directTXSeq = single.NewDirectTxSequencer(sequencer, logger, datastore, 100, nodeConfig.ForcedInclusion) // todo (Alex): what is a good max value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The maximum size for the direct transaction sequencer queue is hardcoded to 100. This value should be configurable to allow for tuning in different deployment environments. A // todo comment already indicates that this needs to be addressed.
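
A minimal way to lift the limit into configuration could look like the sketch below; the field name, tag, and default here are assumptions, not part of the PR.

package config

// ForcedInclusion sketches a configurable queue bound; 100 (the value that is
// hardcoded today) becomes the default when nothing is set.
type ForcedInclusion struct {
	MaxQueueSize int `mapstructure:"max_queue_size"`
}

const defaultDirectTxQueueSize = 100

func (c ForcedInclusion) QueueSize() int {
	if c.MaxQueueSize <= 0 {
		return defaultDirectTxQueueSize
	}
	return c.MaxQueueSize
}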

Comment on lines +552 to +555
const ( // copied from da/jsonclient/internal
    defaultGovMaxSquareSize = 64
    defaultMaxBytes         = defaultGovMaxSquareSize * defaultGovMaxSquareSize * share.ContinuationSparseShareContentSize
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The constants defaultGovMaxSquareSize and defaultMaxBytes are copied from another package. This can lead to inconsistencies if the original values in the source package change. To improve maintainability, these constants should be imported from a shared package or the source package should be updated to export them.
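
For illustration, one way to remove the copy would be a small shared package that both call sites import; the package name is made up and the share import path is an assumption.

// Hypothetical shared location for the DA size limits, so the values exist
// exactly once. Package name and the share import path are assumptions.
package dalimits

import "github.com/celestiaorg/go-square/v2/share"

const (
	GovMaxSquareSize = 64
	MaxBytes         = GovMaxSquareSize * GovMaxSquareSize * share.ContinuationSparseShareContentSize
)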

@@ -118,7 +134,10 @@ func newFullNode(
return nil, err
}

reaper := block.NewReaper(
const maxBlockBytes = 1024 * 1024 // todo (Alex): what is a good default? make configurable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The maxBlockBytes is hardcoded to 1024 * 1024. This value should be configurable to allow for tuning based on network requirements and performance characteristics. A // todo comment already highlights this.
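
As with the queue size above, a small helper that falls back to the current 1 MiB value would keep behaviour unchanged while making the limit tunable; the names here are illustrative only, not from the PR.

package node

// defaultMaxBlockBytes preserves the value that is currently hardcoded.
const defaultMaxBlockBytes uint64 = 1024 * 1024

// resolveMaxBlockBytes returns the configured limit, or the default when the
// config leaves it unset.
func resolveMaxBlockBytes(configured uint64) uint64 {
	if configured == 0 {
		return defaultMaxBlockBytes
	}
	return configured
}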

Comment on lines +53 to +57
txs, err := d.directTXProvider.GetPendingDirectTXs(ctx, d.daHeightSource.GetDAIncludedHeight(), d.maxBlockBytes)
if err != nil {
    return nil, err
}
return append(txs, otherTx...), nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In GetTxs, direct transactions are fetched and appended to mempool transactions. This could lead to a situation where a large number of direct transactions fill up the entire block, effectively starving transactions from the mempool. The // todo comment also raises this concern. Consider implementing a strategy to reserve a portion of the block for mempool transactions to ensure fairness and prevent one channel from dominating block space.
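
One way to make the reservation concrete is to split the byte budget before fetching, e.g. capping direct TXs at a fixed fraction of the block; the ratio and names below are illustrative, not from the PR.

package directtx

// splitBudget reserves part of the block for mempool transactions so direct
// TXs cannot occupy the whole block. The 50% share is an arbitrary example.
func splitBudget(maxBlockBytes uint64) (directBudget, mempoolBudget uint64) {
	directBudget = maxBlockBytes / 2
	mempoolBudget = maxBlockBytes - directBudget
	return directBudget, mempoolBudget
}

GetTxs could then call GetPendingDirectTXs with directBudget instead of d.maxBlockBytes, leaving the remainder for mempool transactions.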

@@ -0,0 +1,19 @@
# direct TX

- What format can we expect for the direct TX on the DA? some wrapper type would be useful to unpack and distinct from
Member

this is an open design question. we could either set up some sort of frontend like the Optimism portal (https://docs.optimism.io/app-developers/tutorials/transactions/send-tx-from-eth#trigger-the-transaction) that allows users to submit direct TXs themselves,

or

just bake the functionality of submitting a direct tx, or a set of them, into any full node, which can publish a set of txs to the namespace in some format deserializable by the sequencer.


- What format can we expect for the direct TX on the DA? some wrapper type would be useful to unpack and distinct from
random bytes
- Should we always include direct TX although they may be duplicates to mempool TX? spike: yes, first one wins.
Member

yes, first one wins. even if there's a duplicate, the duplicate would just fail automatically due to usual tx replay protections in state machines like nonces

- What format can we expect for the direct TX on the DA? some wrapper type would be useful to unpack and distinct from
random bytes
- Should we always include direct TX although they may be duplicates to mempool TX? spike: yes, first one wins.
- Should we fill the block space with directTX if possible or reserve space for mempool TX
Member

IIUC, this case is when there are a lot of direct txs and mempool txs at the same time, so there needs to be a decision on what to do here. We can limit the number of direct txs in each evolve block to some fixed number (can be made dynamic later) that follows an inclusion window, so that the sequencer's mempool txs are still the ones included in an evolve block the fastest.

random bytes
- Should we always include direct TX although they may be duplicates to mempool TX? spike: yes, first one wins.
- Should we fill the block space with directTX if possible or reserve space for mempool TX
- What should we do when a sequencer was not able to add a direct-TX within the time window?
Member

A sequencer following the rules of the network should be able to add a direct tx within the time window. If there are too many direct txs to include in a time window, then everyone can see that and those extra direct txs would just be ignored. The responsibility of telling a user whether their direct tx will be included, or has been included, can be passed on to the frontends.

- Should we always include direct TX although they may be duplicates to mempool TX? spike: yes, first one wins.
- Should we fill the block space with directTX if possible or reserve space for mempool TX
- What should we do when a sequencer was not able to add a direct-TX within the time window?
- Do we need to trace the TX source chanel? There is currently no way to find out if a sequencer adds a direct TX "to
Member

No, I don't think it's necessary to trace the source channel; maybe it's useful for the sequencer itself though.

The sequencer has preferred execution rights in an evolve block, so it can include whatever transactions it wants really, since only valid txs will actually be executed. If the sequencer adds a direct tx X before it's even included in the DA layer, then every full node can see that X was included already, so that's perfectly valid. I do not see how this can be harmful.

- What should we do when a sequencer was not able to add a direct-TX within the time window?
- Do we need to trace the TX source chanel? There is currently no way to find out if a sequencer adds a direct TX "to
early". Can be a duplicate from mempool as well.
- What is a good max size for a sequencer/fallback block? Should this be configurable or genesis?
Member

What do you mean by a good max size? The max size of an evolve block is equal to the max blob size acceptable by the DA layer (minus any fixed serialization overhead). I think this is already set to the DA max blob size in Evolve's current version and is only relevant to the sequencer. Maybe making it configurable is useful in the future, but I don't see how it's related to direct txs. Can you please elaborate?

- Do we need to trace the TX source chanel? There is currently no way to find out if a sequencer adds a direct TX "to
early". Can be a duplicate from mempool as well.
- What is a good max size for a sequencer/fallback block? Should this be configurable or genesis?
- Do we need to track the sequencer DA activity? When it goes down with no direct TX on DA there are no state changes.
Member

Yes, in order to decide when to enter fallback mode, full nodes need to assess the liveness of the sequencer. The liveness of the sequencer depends on whether it is posting evolve blocks on the DA layer.

Gossiping evolve blocks over P2P shouldn't be considered the way to check liveness, since it's vulnerable to P2P split-network attacks like eclipse attacks. Opting to listen over P2P is purely optional for a full node, so we should assume the case of a full node that only listens to DA for evolve blocks.
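
A sketch of the kind of DA-based liveness check described here; the threshold and names are illustrative only, not from the PR.

package directtx

// livenessTracker records the DA height at which a full node last saw a
// sequencer block and signals fallback once the silence exceeds a threshold.
type livenessTracker struct {
	lastSequencerDAHeight uint64
	maxSilentDAHeights    uint64 // e.g. derived from max_inclusion_delay
}

func (t *livenessTracker) observeSequencerBlock(daHeight uint64) {
	if daHeight > t.lastSequencerDAHeight {
		t.lastSequencerDAHeight = daHeight
	}
}

func (t *livenessTracker) shouldEnterFallback(currentDAHeight uint64) bool {
	return currentDAHeight > t.lastSequencerDAHeight+t.maxSilentDAHeights
}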

- What is a good max size for a sequencer/fallback block? Should this be configurable or genesis?
- Do we need to track the sequencer DA activity? When it goes down with no direct TX on DA there are no state changes.
When it goes down and direct TX missed their time window, fullnodes switch to fallback mode anyway.
- How do we restore the chain from recovery mode?
Member

Assuming by recovery mode you mean fallback mode here: there doesn't necessarily need to be a socially coordinated fork. Let's say the original sequencer was just experiencing downtime and the rest of the network is in fallback mode. If the original sequencer comes back online, it first needs to sync to the rest of the network that's in fallback mode to get to the tip of the chain, and it can then signal, by posting something to the DA layer, that it's back online. The rest of the network's full nodes can see this signal and exit fallback mode. They can then wait for the original period of time that triggers fallback mode, giving the sequencer a chance to start posting Evolve blocks to the DA layer again.

The alternative, if the original sequencer is down forever for some reason, like its signing keys being lost, is introducing a socially coordinated fork like you suggested.
