Skip to content

Conversation

@ehigham
Copy link
Member

@ehigham ehigham commented Dec 5, 2024

Changes include:

  • Remove custom benchmark() decorator - a test being in 'benchmark/' is sufficient to indicate that it's a benchmark.
  • Add xtimeout marker to gracefully handle benchmarks that exceed time limits
  • Set default data directory from environment variable HAIL_BENCHMARK_DIR
  • Add configurable maximum duration for benchmark trials
  • Implement a failure counter to stop benchmarking after multiple consecutive failures
  • Set a temporary directory handling for each benchmark so that consecutive runs don’t run out of disk space
  • Enhance error reporting and output formatting
  • Hack to clean up Spark temporary files between benchmark runs. I didn’t find a nicer way of doing this from a quick glance at the spark docs.

This change has no impact on the GCP hail batch deployment as it is limited to refactoring of query benchmark infrastructure. Changes only affect how benchmarks are run and reported, not the actual functionality being tested.

@ehigham ehigham force-pushed the ehigham/benchmark-in-batch-variability branch from e47b948 to df2624c Compare December 5, 2024 19:58
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 33b6b9d to 54cd48a Compare December 5, 2024 19:58
@ehigham ehigham force-pushed the ehigham/benchmark-in-batch-variability branch from df2624c to b6c0201 Compare December 5, 2024 21:16
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 54cd48a to 93e81d9 Compare December 5, 2024 21:16
@ehigham ehigham force-pushed the ehigham/benchmark-in-batch-variability branch from b6c0201 to 5828ffd Compare December 5, 2024 21:19
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 93e81d9 to 2344aed Compare December 5, 2024 21:19
@ehigham ehigham force-pushed the ehigham/benchmark-in-batch-variability branch from 5828ffd to d8e0986 Compare December 5, 2024 21:42
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 2344aed to e8cb1fb Compare December 5, 2024 21:42
@ehigham ehigham changed the base branch from ehigham/benchmark-in-batch-variability to ehigham/tmpdir-per-query December 5, 2024 21:42
@ehigham ehigham force-pushed the ehigham/tmpdir-per-query branch from 5c2b0f3 to ada7cfa Compare December 5, 2024 21:52
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from e8cb1fb to c4a8c24 Compare December 5, 2024 21:52
@ehigham ehigham changed the base branch from ehigham/tmpdir-per-query to ehigham/benchmark-in-batch-variability December 5, 2024 21:52
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from c4a8c24 to 687437c Compare December 5, 2024 22:00
@ehigham ehigham changed the base branch from ehigham/benchmark-in-batch-variability to ehigham/tmpdir-per-query December 5, 2024 22:00
@ehigham ehigham marked this pull request as ready for review December 5, 2024 22:02
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 687437c to c10bd00 Compare December 5, 2024 22:12
@ehigham ehigham force-pushed the ehigham/tmpdir-per-query branch from ada7cfa to 92c63d7 Compare December 6, 2024 23:13
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from c10bd00 to 578a974 Compare December 6, 2024 23:13
@ehigham ehigham force-pushed the ehigham/tmpdir-per-query branch from 92c63d7 to 4e259cd Compare December 12, 2024 21:52
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 578a974 to 624cfbb Compare December 12, 2024 21:52
@ehigham ehigham force-pushed the ehigham/tmpdir-per-query branch from 4e259cd to a44a1f9 Compare December 12, 2024 22:01
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 624cfbb to 4ff2e4d Compare December 12, 2024 22:01
@ehigham ehigham force-pushed the ehigham/tmpdir-per-query branch from a44a1f9 to 908af43 Compare December 12, 2024 22:03
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 4ff2e4d to 30172a1 Compare December 12, 2024 22:03
@ehigham ehigham force-pushed the ehigham/tmpdir-per-query branch from 908af43 to 71ad800 Compare December 16, 2024 18:50
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 30172a1 to 4c313eb Compare December 16, 2024 18:50
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch 5 times, most recently from db3c740 to f639115 Compare September 19, 2025 17:18
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch 11 times, most recently from 904283a to e0980e6 Compare October 1, 2025 17:52
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from e0980e6 to ae6d51c Compare October 8, 2025 18:07
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch 2 times, most recently from 341f1dd to 744cbba Compare October 24, 2025 19:04
Comment on lines -29 to -32
# Note: the below does not work on batch due to docker/ssl problems
# dest = os.path.join(data_dir, filename)
# urlretrieve(url, dest)
subprocess.check_call(['curl', url, '-Lfs', '--output', f'{data_dir / filename}'])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, you're deleting this comment. Have we every diagnosed why we can't use python to download these files?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No clue. urlretrieve seems to be deprecated now so I'm not going to look into it.
This is tested and seems to work - not sure if looking into a python only implementation is worth it.

items = collect.items

if include is not None:
diff = set(include) - set([nodeid for nodeid, *_ in items])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a set comprehension.

Suggested change
diff = set(include) - set([nodeid for nodeid, *_ in items])
diff = set(include) - {nodeid for nodeid, *_ in items}

b.sc.cancelAllJobs()

yield (max_duration, True, traceback.format_exc())
raise TimeoutError(f'Timed out after {max_duration}s')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an analyzer warning here. Is there a reason that we aren't re-raising?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - __Timeout is an internal implementation detail. I'll supress the warning.

@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 744cbba to 9000864 Compare October 29, 2025 16:44
@ehigham ehigham requested a review from chrisvittal October 29, 2025 16:45
@ehigham ehigham force-pushed the ehigham/benchmark-suite-improvements branch from 9000864 to 27fabe0 Compare October 29, 2025 18:20
@hail-ci-robot hail-ci-robot merged commit 052c838 into main Oct 29, 2025
2 of 3 checks passed
@hail-ci-robot hail-ci-robot deleted the ehigham/benchmark-suite-improvements branch October 29, 2025 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants