LST: Introducing a condition before checking cuts in the T5 counting kernel #49164

kk428 · 2025-10-14T18:26:51Z

With dynamic memory allocation implemented, the high-pT QCD sample I am looking at causes LST to crash. There is an overwhelmingly large number of fake tracks that need to be accounted for. As is, it attempts to allocate memory for millions of (mostly fake) T5s. We expect the true number to be around 300,000, compared to 60,000 in the ttbar sample.

We could implement a cut that would remove many of the fake tracks, but there is a chance that this would impact the efficiency. Instead, I included a condition here that performs the standard set of cuts in the counting kernel if the corresponding triplet is densely connected. Most triplets are connected to only a handful of potential quintuplets, whereas the ones leading to a large number of fake tracks have O(1000) connections. So, the condition checks whether the number of inner and outer connections is less than 1000.
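
The condition described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `TripletSketch`, `isDenselyConnected`, and `countT5Slots` are hypothetical names, the reading of "densely connected" as either connection count reaching 1000 is an assumption, and the sparse path here simply counts every candidate, whereas the real counting kernel may apply lighter selections.

```cpp
// Hypothetical stand-in for a triplet and its connectivity; not the LST data layout.
struct TripletSketch {
  unsigned int nInnerConnections;
  unsigned int nOuterConnections;
  unsigned int nCandidates;  // candidate T5s this triplet could seed
};

constexpr unsigned int kDenseThreshold = 1000;  // value quoted in the description

// Assumption: a triplet is "densely connected" unless both counts are below the threshold.
inline bool isDenselyConnected(const TripletSketch& t) {
  return !(t.nInnerConnections < kDenseThreshold && t.nOuterConnections < kDenseThreshold);
}

// Returns the number of T5 slots to reserve for one triplet.
// passesCuts(i) stands in for the full set of T5 cuts applied to candidate i.
template <typename Cuts>
unsigned int countT5Slots(const TripletSketch& t, Cuts passesCuts) {
  if (!isDenselyConnected(t)) {
    // Sparse case: keep the cheap count and leave the cuts to the creation step.
    return t.nCandidates;
  }
  // Dense case: run the cuts while counting so that fakes do not inflate the allocation.
  unsigned int n = 0;
  for (unsigned int i = 0; i < t.nCandidates; ++i) {
    if (passesCuts(i)) {
      ++n;
    }
  }
  return n;
}
```

With this shape, sparsely connected triplets keep the cheap count and only the densely connected ones pay for the full cuts, which would be consistent with the ttbar timing being unaffected.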

Normally, the full set of cuts would be performed only in the creation kernel. The cuts now included in the counting kernel do not affect the timing for a ttbar sample, as none of its events meet the densely connected condition.

Here is a comparison in performance and timing on standalone:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
[target branch]
   avg     15.2      0.4      0.4      0.5      0.6      0.3      0.6      0.3      0.9      0.0      19.3       3.9+/-  0.8      19.3   explicit[s=1]
   avg      1.2      0.6      0.6      0.7      0.8      0.3      0.9      0.4      1.3      0.0       6.8       5.3+/-  1.1       6.9   explicit[s=2]
   avg      2.1      0.8      1.0      1.2      1.3      0.4      1.5      0.7      1.9      0.0      10.9       8.4+/-  1.5       2.8   explicit[s=4]
   avg      2.4      1.3      1.5      1.8      1.8      0.6      2.0      0.9      2.6      0.0      14.8      11.8+/-  2.3       2.5   explicit[s=6]
   avg      3.5      1.7      2.0      2.2      2.5      0.7      2.9      1.2      3.3      0.0      20.1      15.8+/-  3.6       5.1   explicit[s=8]
[this PR]
   avg     24.3      0.4      0.4      0.5      0.9      0.3      0.6      0.3      0.9      0.0      28.7       4.1+/-  1.7      28.8   explicit[s=1]
   avg      1.2      0.6      0.6      0.7      1.1      0.3      1.0      0.5      1.2      0.0       7.2       5.7+/-  1.9       3.6   explicit[s=2]
   avg      2.5      0.8      1.0      1.2      1.6      0.4      1.5      0.7      1.9      0.0      11.8       8.8+/-  2.4       3.0   explicit[s=4]
   avg      3.0      1.3      1.4      1.8      2.3      0.6      2.1      1.0      2.7      0.0      16.3      12.7+/-  3.4       2.8   explicit[s=6]
   avg      3.8      1.7      2.0      2.4      3.0      0.7      2.9      1.3      3.3      0.0      21.2      16.7+/-  3.9       2.7   explicit[s=8]
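
As a rough reading of the tables above, the relative change in any column can be computed directly from the printed averages. The helper below is a hypothetical sketch; its two inputs are the T5 values for explicit[s=1] copied from the tables.

```cpp
#include <cstdio>

// Hypothetical helper: percent change from a reference value to an updated value.
inline double relativeChangePercent(double reference, double updated) {
  return (updated - reference) / reference * 100.0;
}

// T5 column averages for explicit[s=1], copied from the tables above.
constexpr double kT5TargetBranch = 0.6;
constexpr double kT5ThisPR = 0.9;

// Prints the relative change of the T5 step between the two builds.
inline void printT5Change() {
  std::printf("T5 change: %+.1f%%\n", relativeChangePercent(kT5TargetBranch, kT5ThisPR));
}
```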

cmsbuild · 2025-10-14T18:27:10Z

cms-bot internal usage

cmsbuild · 2025-10-14T18:28:31Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49164/46457

There are other open Pull requests which might conflict with changes you have proposed:
- File RecoTracker/LSTCore/interface/alpaka/Common.h modified in PR(s): Line Segment Tracking (LST): Replace pT3 Rphi Chi-Squared Cut with DNN #49163

cmsbuild · 2025-10-14T18:28:58Z

A new Pull Request was created by @kk428 for master.

It involves the following packages:

RecoTracker/LSTCore (reconstruction)

@cmsbuild, @jfernan2, @mandrenguyen can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @VinInn, @VourMa, @dgulhan, @elusian, @felicepantaleo, @gpetruc, @mmasciov, @mmusich, @mtosi, @rovere this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

slava77 · 2025-10-14T21:08:04Z

test parameters:

enable_tests = gpu
workflows_gpu = 29634.704,29834.704
workflows = 29634.703,29834.703,29834.755,29634.757,29834.757
relvals_opt = -w upgrade,standard
relvals_opt_gpu = -w upgrade,standard

slava77 · 2025-10-14T21:09:11Z

@cmsbuild please test

cmsbuild · 2025-10-15T05:06:56Z

+1

Size: This PR adds an extra 44KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4492e0/48684/summary.html
COMMIT: 4b229f3
CMSSW: CMSSW_16_0_X_2025-10-14-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S,NVIDIA_T4
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/49164/48684/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially removed 4 lines from the logs
Reco comparison results: 8 differences found in the comparisons
DQMHistoTests: Total files compared: 58
DQMHistoTests: Total histograms compared: 4329520
DQMHistoTests: Total failures: 26
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4329474
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 57 files compared)
Checked 243 log files, 210 edm output root files, 58 DQM output files
TriggerResults: no differences found

AMD_MI300X Comparison Summary

Summary:

You potentially removed 1 lines from the logs
Reco comparison results: 246 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 178926
DQMHistoTests: Total failures: 27046
DQMHistoTests: Total nulls: 12
DQMHistoTests: Total successes: 151868
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: no differences found

AMD_W7900 Comparison Summary

Summary:

You potentially removed 1 lines from the logs
Reco comparison results: 269 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 178926
DQMHistoTests: Total failures: 30103
DQMHistoTests: Total nulls: 11
DQMHistoTests: Total successes: 148812
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: no differences found

NVIDIA_H100 Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 264 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 178926
DQMHistoTests: Total failures: 28432
DQMHistoTests: Total nulls: 10
DQMHistoTests: Total successes: 150484
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: no differences found

NVIDIA_L40S Comparison Summary

Summary:

You potentially added 1 lines to the logs
Reco comparison results: 257 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 178926
DQMHistoTests: Total failures: 27417
DQMHistoTests: Total nulls: 12
DQMHistoTests: Total successes: 151497
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: no differences found

NVIDIA_T4 Comparison Summary

Summary:

You potentially added 5 lines to the logs
Reco comparison results: 237 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 178926
DQMHistoTests: Total failures: 27382
DQMHistoTests: Total nulls: 8
DQMHistoTests: Total successes: 151536
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: no differences found

jfernan2 · 2025-10-15T12:01:24Z

assign heterogeneous

cmsbuild · 2025-10-15T12:01:41Z

New categories assigned: heterogeneous

@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

jfernan2 · 2025-10-15T15:51:37Z

@kk428 could you please add to the description any link to performance or timing studies of this change? Thanks

kk428 · 2025-10-15T19:42:28Z

> @kk428 could you please add to the description any link to performance or timing studies of this change? Thanks

I included a link to the CI tests done on the SegmentLinking fork. Let me know if you need anything else.

Edit: I changed the description from a link to a timing comparison to the timing comparison itself.

jfernan2 · 2025-10-17T09:51:28Z

+1

slava77 · 2025-10-21T14:51:55Z

@cms-sw/heterogeneous-l2
please clarify the status/plans of your review of this PR.
Thank you.

fwyzard · 2025-10-21T15:25:20Z

RecoTracker/LSTCore/src/alpaka/Quintuplet.h

+                } else {
+                  int quintupletModuleIndex = alpaka::atomicAdd(
+                      acc, &quintupletsOccupancy.nQuintuplets()[lowerModule1], 1u, alpaka::hierarchy::Threads{});
+                  //this if statement should never get executed!


What does it mean

> this if statement should never get executed!

?
The if statement is always going to be executed once the code enters this branch.
Do you mean that the condition should never be true?

kk428 · 2025-10-21

I believe it means that the condition should never be true. This comment was left over from where I copied this block of code from, but I don't think it's really necessary, so I removed it.
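
The excerpt above reserves a slot with an atomic add and then checks a condition that is not shown in the quoted lines. A common pattern for this kind of reservation is sketched below using `std::atomic` instead of alpaka; the capacity guard, `ModuleSlotsSketch`, and `storeCandidate` are assumptions for illustration and are not taken from Quintuplet.h.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical per-module storage: a fixed number of slots plus a fill counter.
struct ModuleSlotsSketch {
  std::atomic<unsigned int> nFilled{0};
  std::vector<int> slots;

  explicit ModuleSlotsSketch(std::size_t capacity) : slots(capacity, -1) {}
};

// Reserves the next slot atomically and stores the candidate there.
// Returns false if the reservation falls outside the allocated range.
inline bool storeCandidate(ModuleSlotsSketch& m, int candidateId) {
  // fetch_add returns the previous value, which is the index reserved for this caller.
  const unsigned int index = m.nFilled.fetch_add(1u, std::memory_order_relaxed);
  if (index >= m.slots.size()) {
    // Assumed guard: reached only if the earlier count was too small.
    // Note that nFilled has still been incremented past the capacity here.
    return false;
  }
  m.slots[index] = candidateId;
  return true;
}
```

If the counting pass reserves at least as many slots as the creation pass fills, the guard never fires, which would match the reading that the condition should never be true.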

cmsbuild · 2025-10-21T18:44:46Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49164/46508

cmsbuild · 2025-10-21T18:45:06Z

Pull request #49164 was updated. @cmsbuild, @fwyzard, @jfernan2, @makortel, @mandrenguyen can you please check and sign again.

fwyzard · 2025-10-21T20:57:59Z

Thanks

fwyzard · 2025-10-21T20:58:40Z

+heterogeneous

Although, as I pointed out before, I don't think this kind of change requires a review by @cms-sw/heterogeneous-l2.

slava77 · 2025-10-21T21:21:23Z

@cmsbuild please test

cmsbuild · 2025-10-22T02:13:24Z

+1

Size: This PR adds an extra 36KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4492e0/48761/summary.html
COMMIT: 788d8b7
CMSSW: CMSSW_16_0_X_2025-10-21-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_W7900,NVIDIA_H100,NVIDIA_L40S,NVIDIA_T4
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49164/48761/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially added 2 lines to the logs
Reco comparison results: 4 differences found in the comparisons
DQMHistoTests: Total files compared: 58
DQMHistoTests: Total histograms compared: 4329400
DQMHistoTests: Total failures: 88
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4329292
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 57 files compared)
Checked 243 log files, 210 edm output root files, 58 DQM output files
TriggerResults: no differences found

AMD_W7900 Comparison Summary

Summary:

You potentially added 5 lines to the logs
Reco comparison results: 219 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 180174
DQMHistoTests: Total failures: 31474
DQMHistoTests: Total nulls: 14
DQMHistoTests: Total successes: 148686
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: found differences in 1 / 11 workflows

NVIDIA_H100 Comparison Summary

Summary:

You potentially removed 6 lines from the logs
Reco comparison results: 253 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 180174
DQMHistoTests: Total failures: 27921
DQMHistoTests: Total nulls: 5
DQMHistoTests: Total successes: 152248
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: no differences found

NVIDIA_L40S Comparison Summary

Summary:

You potentially added 7 lines to the logs
Reco comparison results: 253 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 180174
DQMHistoTests: Total failures: 27950
DQMHistoTests: Total nulls: 10
DQMHistoTests: Total successes: 152214
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: no differences found

NVIDIA_T4 Comparison Summary

Summary:

You potentially removed 7 lines from the logs
Reco comparison results: 249 differences found in the comparisons
DQMHistoTests: Total files compared: 12
DQMHistoTests: Total histograms compared: 180174
DQMHistoTests: Total failures: 34176
DQMHistoTests: Total nulls: 9
DQMHistoTests: Total successes: 145989
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 11 files compared)
Checked 46 log files, 48 edm output root files, 12 DQM output files
TriggerResults: found differences in 1 / 11 workflows

jfernan2 · 2025-10-22T07:31:19Z

+1

cmsbuild · 2025-10-22T07:31:46Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @sextonkennedy, @ftenchini (and backports should be raised in the release meeting by the corresponding L2)

jfernan2 · 2025-10-22T07:44:29Z

+1

ftenchini · 2025-10-22T09:58:38Z

+1

Added T5 condition for densely connected modules in creation and counting kernels

4b229f3

cmsbuild added this to the CMSSW_16_0_X milestone Oct 14, 2025

cmsbuild added reconstruction-pending pending-signatures tests-pending orp-pending code-checks-pending tracking labels Oct 14, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Oct 14, 2025

cmsbuild added tests-started and removed tests-pending labels Oct 14, 2025

cmsbuild added tests-approved and removed tests-started labels Oct 15, 2025

cmsbuild added the heterogeneous-pending label Oct 15, 2025

cmsbuild added reconstruction-approved and removed reconstruction-pending labels Oct 17, 2025

fwyzard reviewed Oct 21, 2025

cmsbuild added reconstruction-pending tests-pending code-checks-pending and removed code-checks-approved labels Oct 21, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Oct 21, 2025

cmsbuild added heterogeneous-approved and removed heterogeneous-pending labels Oct 21, 2025

cmsbuild added tests-started and removed tests-pending labels Oct 21, 2025

cmsbuild added tests-approved and removed tests-started labels Oct 22, 2025

cmsbuild added reconstruction-approved fully-signed and removed reconstruction-pending pending-signatures labels Oct 22, 2025

cmsbuild added orp-approved and removed orp-pending labels Oct 22, 2025

cmsbuild merged commit bcba1a3 into cms-sw:master Oct 22, 2025
23 checks passed

cmsbuild mentioned this pull request Oct 22, 2025

[CLANG_X][llvm]Update to version 21.1.4 cms-sw/cmsdist#10142

Merged
