Skip to content

Conversation

@bshastry
Copy link
Contributor

PR description

This PR adds transaction tracing support to the block-test subcommand in evmtool, enabling detailed debugging of block test execution with opcode-level traces.

New Feature: -t/--trace flag

The block-test command now supports a -t or --trace flag that outputs detailed transaction execution traces in JSON format compatible with go-ethereum. This enables developers to:

  • Debug complex block test failures at the opcode level
  • Analyze gas consumption patterns
  • Inspect stack and memory state during execution
  • Compare execution behavior across different Ethereum clients

Usage

# Basic tracing
evmtool block-test -t test.json

# With stack and memory traces
evmtool block-test -t --trace-stack --trace-memory test.json

Example Trace Output

Traces are output to stderr in JSON format:

{"pc":7,"op":62,"gas":"0xf3c12b","gasCost":"0x0","memSize":0,"depth":1,"refund":0,"opName":"RETURNDATACOPY","error":"Out of bounds"}

Implementation Details

  • Integrates with existing TracerManager and StandardJsonTracer infrastructure
  • Correctly handles all transaction statuses per Ethereum protocol:
    • INVALID transactions (wrong nonce, insufficient balance) → reject block
    • FAILED transactions (REVERT, out of gas, stack underflow) → include in block with receipt
    • SUCCESSFUL transactions → include in block with receipt
  • Trace format matches go-ethereum for compatibility with existing analysis tools
  • Behavior aligns with Besu's standard block import path and reference implementations

Testing

Manually tested with fuzzer derived blocktests (about 800 of them)

Both tests include transactions with runtime errors (stack underflow, out of gas, invalid opcodes) that are correctly traced and handled.

Fixed Issue(s)

N/A - This is a new feature enhancement, not fixing a reported issue.


Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests (N/A - no database changes)

Locally, you can run these tests to catch failures early:

  • spotless: ./gradlew spotlessApply
  • unit tests: ./gradlew build (evmtool has no specific unit tests for this command)
  • acceptance tests: ./gradlew acceptanceTest (not required for evmtool changes)
  • integration tests: ./gradlew integrationTest (not required for evmtool changes)
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests (not applicable)
  • hive tests: Engine or other RPCs modified? (N/A - evmtool only)

Notes

  • This is a developer tooling enhancement for evmtool, not a change to core Besu functionality
  • Manually tested with complex Ethereum reference test cases
  • No changes to APIs, consensus, networking, or database
  • Documentation may be beneficial to show developers how to use the new tracing capability

@bshastry bshastry force-pushed the feature/blocktest-tracing branch from 6224044 to d35158c Compare October 15, 2025 09:38
Implements tracing functionality for the block-test subcommand in evmtool,
allowing developers to debug block test execution with detailed transaction
traces including opcodes, gas usage, stack, and memory state.

New features:
- Added -t/--trace-transactions flag to enable tracing during block test execution
- Added --trace-memory, --trace-stack, --trace-returndata, --trace-storage options
- Added --trace-output option to specify output file (default: stderr)
- Integrated with StandardJsonTracer infrastructure
- Traces are output in JSON format compatible with go-ethereum

Implementation:
The processBlockWithTracing() method processes transactions with full tracing
while maintaining correct transaction status handling. Transactions are
categorized as INVALID, FAILED, or SUCCESSFUL per Ethereum protocol:
- INVALID transactions (wrong nonce, insufficient balance) reject the block
- FAILED transactions (reverted execution) are included with receipts
- SUCCESSFUL transactions are included with receipts

This distinction ensures blocks with reverted transactions are correctly
accepted, matching behavior of the standard block import path and reference
implementations like go-ethereum.

Conflict resolution:
This commit was rebased onto main which added test summary reporting (hyperledger#9246).
Both features are now merged:
- Test summary reporting tracks pass/fail counts across all tests
- Transaction tracing provides detailed execution traces when enabled
- Both features work independently and complement each other

Usage:
  evmtool block-test -t --trace-stack --trace-memory <test-file.json>
  evmtool block-test -t --trace-output=trace.jsonl <test-file.json>

Tested with Ethereum reference test suite including complex scenarios with
failed transactions. Trace output format matches go-ethereum for compatibility
with existing tooling.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Bhargava Shastry <[email protected]>
@bshastry bshastry force-pushed the feature/blocktest-tracing branch from d35158c to bd27f72 Compare October 17, 2025 10:27
Signed-off-by: Bhargava Shastry <[email protected]>
Signed-off-by: Bhargava Shastry <[email protected]>
@lu-pinto
Copy link
Member

lu-pinto commented Nov 3, 2025

Hi @bshastry first of all, thank you for your PR. I'm wondering what's the rationale for adding tracing support to block-test? Doesn't the state-test suffice for what you need? Take a look at an example here: https://github.com/hyperledger/besu/blob/main/ethereum/evmtool/src/test/resources/org/hyperledger/besu/evmtool/state-test/clz-opcode.json This already supports tracing out-of-the-box.
What's your use case?

@bshastry
Copy link
Contributor Author

bshastry commented Nov 4, 2025

Hi @bshastry first of all, thank you for your PR. I'm wondering what's the rationale for adding tracing support to block-test? Doesn't the state-test suffice for what you need? Take a look at an example here: https://github.com/hyperledger/besu/blob/main/ethereum/evmtool/src/test/resources/org/hyperledger/besu/evmtool/state-test/clz-opcode.json This already supports tracing out-of-the-box. What's your use case?

Hi @lu-pinto Thank you for your feedback. The rationale for this PR is to be able to compare clients' execution outcome across transactions, right now we only do this at the statetest level i.e., a pre state, a single tx, and a post state. The idea is to generalize this to a pre state, a sequence of blocks, and a post state and compare the resulting trace between clients. This PR permits such a trace to be obtained. We use this infrastructure to then obtain and compare traces for multi block processing (still at the tx level, but it can shed light on state effects that are not captured by statetests) between besu and say geth and nethermind. This is usually orchestrated by a fuzzer such as goevmlab that is already capable of comparing clients' processing at the statetest level, but if this PR is merged then even at the blocktest level.

@Option(
names = {"--trace-storage"},
description = "Include storage changes in traces")
private boolean traceStorage = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I inherited from the parentCommand instead of creating new options.


@Option(
names = {"--trace-output"},
description = "Output file for traces (default: stderr)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you choose stderr? I think this is a great option to add to all subcommands actually, I would prefer if it were moved to EvmToolCommand instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fuzzers read from process pipes to save disk I/O for performance reasons (maximizes test throughput). stdout may contain diagnostics, hence stderr. Could we may be address the addition of trace-output to evmtoolcommand in a separate PR?

Copy link
Member

@lu-pinto lu-pinto Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fuzzers read from process pipes to save disk I/O for performance reasons (maximizes test throughput). stdout may contain diagnostics, hence stderr.

But if you have the option to output to a file you can just pipe the file to the fuzzer no? My main problem is inconsistency with other commands as they all go to System.out by default.

re: moving option: sure, let's treat the move to the parent command separately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I see you are only sending traces to this output but all the other prints are going to parentCommand.out. Why not replacing parentCommand.out with the PrintStream based on this option and send everything there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fuzzer consumes the trace by using a strict jsonl parser. However, if the trace were to be sent to parentCommand.out, the fuzzer's jsonl parser errors out because non jsonl lines (info/error logging that is sent to parentCommand.out is interspersed with actual traces). Hence the separation (and use of stderr).

Considering test_name
Block 1 (0x1234...) Imported in 1.23 ms (456.78 MGas/s)
{"pc":0,"op":96,"gas":"0x5f5e100","gasCost":"0x1",...}
{"pc":2,"op":96,"gas":"0x5f5e0ff","gasCost":"0x2",...}
Block 2 (0x5678...) Imported in 0.98 ms (789.01 MGas/s)
{"pc":4,"op":96,"gas":"0x5f5e0fd","gasCost":"0x1",...}
...
Chain import successful - test_name

{"test":"test_name","pass":true,"fork":"mainnet","duration":1234,...}

@bshastry bshastry force-pushed the feature/blocktest-tracing branch from 2241260 to 1acbdd4 Compare November 4, 2025 11:48
@Option(
names = {"-t", "--trace-transactions"},
description = "Enable transaction tracing to stderr")
private boolean enableTracing = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be doing the same as final Boolean showJsonResults = false; from EvmToolCommand

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, fixed it.

output.close();
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you created this class mostly because of the close? I don't think you need it, since you are already using an autoFlush PrintStream?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't work consider using a PrintStream instead of a PrintWriter with a FileOutputStream

} else {
parentCommand.out.println("Chain import successful - " + test);
}
} finally {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally block adds unnecessary complexity IMO. In Java most exceptions are checked so you have to handle them, for a case like this where you are running tests if there's an unchecked one it's best to let them propagate and crash the app. What would be the benefit of getting a summary of results if there's a developer error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the finally block. Clean up happens at the end of normal execution flow.

parentCommand.out.printf(
"Block %d (%s) Rejected (correctly)%n",
block.getHeader().getNumber(), block.getHash());
// Original block import logic (from main branch)
Copy link
Member

@lu-pinto lu-pinto Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these comments about? (from main branch) / (from feature branch) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, fixed

if (enableTracing && tracerManager != null) {
// Process block with tracing
importResult =
processBlockWithTracing(
Copy link
Member

@lu-pinto lu-pinto Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it adds duplicate code which is hard to sanity check and maintain in the future. Would you be happy to call blockImporter.importBlock(context, block, validationMode, validationMode); if you could add in the tracer before calling into it?
I think it might be doable and if you are happy with that I can give it a shot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made this change, please let me know if there are any issues.

bshastry and others added 5 commits November 4, 2025 14:16
This commit addresses code review feedback by refactoring the block-test
tracing implementation to use Besu's standard BlockImportTracerProvider
plugin infrastructure instead of custom transaction processing logic.

Changes:
- Remove duplicate tracing option definitions (now uses parent command's
  inherited options via @ParentCommand)
- Create BlockTestTracerProvider implementing BlockImportTracerProvider
- Create BlockAwareOperationTracerAdapter to wrap StreamingOperationTracer
- Remove processBlockWithTracing method (~138 lines of duplicate code)
- Remove finally block complexity - cleanup happens at end of normal flow
- Remove all merge conflict comments (// from feature branch, etc.)
- Use standard blockImporter.importBlock() for both tracing and non-tracing
- Register tracer provider with ServiceManager when tracing is enabled

Benefits:
- Eliminates code duplication between tracing and non-tracing paths
- Consistent with Besu's plugin architecture
- Transaction boundaries handled automatically in AbstractBlockProcessor
- Better maintainability - single source of truth for block processing
- Cleaner separation of concerns

Tested with:
- Block tests with tracing enabled (memory, storage, stack)
- Block tests without tracing
- Verified test summary metrics (gasUsed, txCount, blockCount)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Bhargava Shastry <[email protected]>
Address PR review feedback about resource management in BlockTestTracerManager.
The close() method was unconditionally closing all PrintWriters, which is
problematic when wrapping System.err (a shared system resource).

Changes:
- Add shouldCloseOutput parameter to BlockTestTracerManager constructor
- Only close file-based PrintWriters, not System.err
- Document stdout/stderr separation rationale (JSONL purity for fuzzers)
- Simplify tracing enablement using parentCommand.showJsonResults

Technical context:
- Trace output (pure JSONL) must be separated from test status messages
- Test status goes to stdout (parentCommand.out), traces to stderr or file
- Mixing them would break JSONL parsers used by fuzzing tools
- File writers need explicit close() for proper resource cleanup
- System.err should never be closed as it's shared across the JVM

Tested with:
- File output: 53,932 trace lines written and file properly closed
- Stderr output: Traces correctly written without closing System.err

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: Bhargava Shastry <[email protected]>
.getByName(spec.getNetwork());

final MutableBlockchain blockchain = spec.getBlockchain();
ProtocolContext context = spec.getProtocolContext();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there was a compilation issue here. Fixed in my latest commit

Comment on lines 465 to 486
@Override
public void traceStartBlock(
final WorldView worldView,
final org.hyperledger.besu.plugin.data.BlockHeader blockHeader,
final BlockBody blockBody,
final Address miningBeneficiary) {
// No-op: StreamingOperationTracer doesn't need block-level events
}

@Override
public void traceEndBlock(
final org.hyperledger.besu.plugin.data.BlockHeader blockHeader, final BlockBody blockBody) {
// No-op: StreamingOperationTracer doesn't need block-level events
}

@Override
public void traceStartBlock(
final WorldView worldView,
final org.hyperledger.besu.plugin.data.ProcessableBlockHeader processableBlockHeader,
final Address miningBeneficiary) {
// No-op: StreamingOperationTracer doesn't need block-level events
}
Copy link
Member

@lu-pinto lu-pinto Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to define this, the implemented interface already has empty methods. Removed in my latest commit

currentTracer =
new StreamingOperationTracer(
output,
OpCodeTracerConfigBuilder.create()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are creating an OpcodeTracerConfig from scratch so you need to specify every single field. Tracing was failing because traceOpcodes option was not defined. Fixed now in my latest commit.

*/
public StreamingOperationTracer createTracer() {
currentTracer =
new StreamingOperationTracer(
Copy link
Member

@lu-pinto lu-pinto Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: doesn't make much sense to me - why do you use a StreamingOperationTracer if you only flush at the end? Is only flushing at the end a requirement?

lu-pinto
lu-pinto previously approved these changes Nov 11, 2025
Copy link
Member

@lu-pinto lu-pinto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good now @bshastry! I also made some simplifications in the code in my last commit.

I will add another commit to print the trace as the code is being traced - that way we can avoid flushing at the end. If this does not work for you feel free to remove the commit.

Copy link
Member

@lu-pinto lu-pinto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bshastry feel free to keep working on it or say the word and I'll merge it

@macfarla macfarla assigned bshastry and unassigned lu-pinto Nov 11, 2025
@macfarla
Copy link
Contributor

@bshastry are you ready for this to be merged?

@bshastry
Copy link
Contributor Author

@bshastry are you ready for this to be merged?

I'll do one final pass. Hopefully will be done by the end of the week. Sorry to keep you waiting.

@macfarla
Copy link
Contributor

macfarla commented Nov 18, 2025

@bshastry are you ready for this to be merged?

I'll do one final pass. Hopefully will be done by the end of the week. Sorry to keep you waiting.

no worries - just checking you weren't waiting on us! I'll change it to draft while you're still working on it, you can switch it back to ready for review when you're done

@macfarla macfarla marked this pull request as draft November 18, 2025 00:30
@bshastry bshastry marked this pull request as ready for review November 18, 2025 09:03
@bshastry
Copy link
Contributor Author

bshastry commented Nov 18, 2025

@lu-pinto @macfarla This PR is ready to be merged. I have run a moderate fuzzer test suite on the PR with about 1000 different block tests to ensure the tracing works as expected (and can be compared against other EL clients). Thank you for helping me with this PR 🙏

@macfarla macfarla assigned lu-pinto and unassigned bshastry Nov 18, 2025
@lu-pinto lu-pinto enabled auto-merge (squash) November 18, 2025 10:56
@lu-pinto lu-pinto merged commit 5f8ceec into hyperledger:main Nov 18, 2025
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants