ATLAS blog post for GSoC 2025 #1755

yolannel · 2025-09-05T09:07:44Z

First draft proposed for the GSoC 2025 blog post for Neural (De)compression for High Energy Physics under the ATLAS project proposal.

netlify · 2025-09-05T09:07:49Z

✅ Deploy Preview for earnest-hotteok-b1e1bf ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | `7243389` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/earnest-hotteok-b1e1bf/deploys/68e7c7cf6d881500080e86d9 |
| 😎 Deploy Preview | https://deploy-preview-1755--earnest-hotteok-b1e1bf.netlify.app |

yolannel · 2025-09-05T09:11:30Z

@maszyman Hi Maciej - I think as I'm external to this repo, I can't directly request a reviewer. Pinging you here so that you can add yourself as a reviewer to this draft PR and provide any feedback prior to marking this ready for review!

maszyman

Thanks a lot Yolanne for this nice report (and your contribution in general)!

Please have a look at my comments inline.

maszyman · 2025-09-05T11:17:35Z

_gsocblogs/2025/blog_ATLAS_YolanneLee.md

+
+## Introduction
+
+In high-energy physics experiments such as those at CERN’s ATLAS project, immense volumes of data are generated. This project explores the feasibility of “precision upsampling”: using deep generative models to reconstruct high-precision floating-point data from aggressively compressed representations. I had the opportunity to work on this topic with the support and supervision of Maciej Szymański and Peter Van Gemmeren with the ATLAS Software & Computing group.


It would be nice if you could mention Argonne National Laboratory here as well :-)

maszyman · 2025-09-05T12:35:57Z

_gsocblogs/2025/blog_ATLAS_YolanneLee.md

+
+Another approach under development is to treat the task as an inpainting problem, as commonly seen in image generation, where some part of an image may be blacked out and an inpainting model is designed to "fill in the blanks". In our case, we have not only the new theoretical bounds but also the first $23-n$ bits of data that are retained after truncation: this is valuable information which, in statistical tests, is often already a 'good-enough' approximation of the uncompressed data. The challenge is then only to "fill in" the remaining $n$ truncated bits, which is an even more bounded problem space and would minimize unexpected upsampling artifacts by constraining any correction term to lie within the allowable $n$ bits of change.
+
+While this project has not yet conclusively found a candidate model for precision upsampling, work is ongoing and will continue beyond the timeline of the GSoC project toward a working pipeline built on the work performed up to this point. In short, autoencoders, variational autoencoders, and some simple flow matching models have been implemented and tested, with performance measured using simple MSE loss as well as distribution-based metrics such as KL divergence. The pipeline of the model was being tested in Jupyter notebook files, but I have begun to move them to modular python files to facilitate further work.


I would still add the plots showing the results for the models you implemented, even if they are not as good as one could hope for.

I think it's important to clearly show what has been achieved, try explaining why, and propose the next steps (which you did in the previous paragraph basically).

> but I have begun to move them to modular python files to facilitate further work.

I would drop that. Instead, you could say a few words about your repo: what's there, how to use it, etc.
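For readers following the thread, the bit-level setup behind the inpainting paragraph quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the PR or the author's repository: it assumes IEEE 754 float32 values (23 explicit mantissa bits) and truncation by zeroing the `n` least-significant mantissa bits. The names `truncate_mantissa` and `fill_low_bits` are hypothetical.

```python
import numpy as np

MANTISSA_BITS = 23  # explicit mantissa bits in an IEEE 754 float32


def truncate_mantissa(x: np.ndarray, n: int) -> np.ndarray:
    """Zero the n least-significant mantissa bits (hypothetical helper)."""
    if not 0 <= n <= MANTISSA_BITS:
        raise ValueError("n must be between 0 and 23")
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    mask = np.uint32((0xFFFFFFFF << n) & 0xFFFFFFFF)  # keeps the top 32-n bits
    return (bits & mask).view(np.float32)


def fill_low_bits(truncated: np.ndarray, low_bits: np.ndarray, n: int) -> np.ndarray:
    """Combine the retained bits with an n-bit pattern (hypothetical helper).

    In an inpainting setup, low_bits would come from a model; here it is
    just an array of integers. Only the lowest n bits can change, so the
    result always stays inside the interval implied by the truncation.
    """
    kept = np.asarray(truncated, dtype=np.float32).view(np.uint32)
    low = np.asarray(low_bits, dtype=np.uint32) & np.uint32((1 << n) - 1)
    return (kept | low).view(np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=1000).astype(np.float32)
    n = 8

    t = truncate_mantissa(x, n)
    # Relative size of the truncation error, bounded by 2**(n - 23).
    bound = np.abs(t) * np.float32(2.0 ** (n - MANTISSA_BITS))
    print("max |x - t|:", np.abs(x - t).max())
    print("bound holds:", bool(np.all(np.abs(x - t) <= bound)))

    # Supplying the true low bits recovers the original values exactly.
    true_low = x.view(np.uint32) & np.uint32((1 << n) - 1)
    print("exact recovery:", bool(np.array_equal(fill_low_bits(t, true_low, n), x)))
```

Because a predicted bit pattern is masked to `n` bits before being merged, even a poor prediction cannot move a value further from the original than the truncation itself could, which is the "bounded problem space" the paragraph describes.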

maszyman · 2025-10-07T07:50:03Z

Hi @yolannel

Is it ready for the 2nd review?

Cheers,
Maciej

yolannel · 2025-10-07T10:24:29Z

Hi Maciej, Yes - please take a look at your convenience! Best, Yolanne

maszyman

Looks good!

Please have a look at a couple of remaining small items.

yolannel · 2025-10-09T14:34:46Z

@maszyman apologies for the small errors, I hadn't been too careful and should have caught those myself. I removed the latex and also uploaded the headshot; please see the modifications and thanks for the second eye!

maszyman

Many thanks @yolannel !

I think this is ready to go, could you please undraft?

First draft of blog post for GSoC 2025.

5f5521c

maszyman self-assigned this Sep 5, 2025

maszyman self-requested a review September 5, 2025 09:20

maszyman added the GSoC Related to Google Summer of Code activity label Sep 5, 2025

maszyman requested changes Sep 5, 2025

maszyman reviewed Sep 5, 2025

maszyman reviewed Sep 5, 2025

yolannel and others added 5 commits September 27, 2025 15:32

Apply direct spelling conventions/fixes from code review

3fefcc2

Co-authored-by: Maciej Szymański <[email protected]>

Addressed comments from review; added visualisations for key attempted interim models.

6c0136f

Merge pull request #1 from yolannel/pr-1755

df574ab

Addressed comments from review; added visualisations for key attempte…

additional small fixes for some phrasing clarity and a typo.

eee5105

Merge branch 'main' of github.com:yolannel/hsf.github.io

bcd4d6f

maszyman requested changes Oct 9, 2025

maszyman reviewed Oct 9, 2025

uploaded profile; fixed latex rendering

7243389

maszyman approved these changes Oct 9, 2025

yolannel marked this pull request as ready for review October 9, 2025 23:18


