-
Notifications
You must be signed in to change notification settings - Fork 17
New delnflux #78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New delnflux #78
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd keep the current function names. Imo, the numpy_ prefix doesn't help understanding the code.
pyfv3/stencils/delnflux.py
Outdated
|
|
||
| def numpy_corner_y(self, d2, nord_data: Quantity): | ||
| for k in range(nord_data.shape[0]): | ||
| if nord_data.data[k] > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this now missing the previous comparison if nord > current_nord ? While the first two invocations hard-code current_node=0, the last two are run in a loop with current_nord between 0 and max(nord), which should - unless all nord values are the same - filter some edge computations, no?
pyfv3/stencils/delnflux.py
Outdated
| def numpy_corner_y(self, d2, nord_data: Quantity): | ||
| for k in range(nord_data.shape[0]): | ||
| if nord_data.data[k] > 0: | ||
| d2[0, 0, k] = d2[5, 0, k] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgive my ignorance, but is there a way that this could be written as a loop as in copy_corners() rather than explicitly writing out every element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is. This re-write is due to ongoing performance engineering work on CPU - it's not necessarily going to stay that way in the long run.
It's also a studying case for us of a few things:
- a code that triggers heavy micro-op reject in stencils mode leading to poor performance
- a case of code that needs to "escape" the SDFG and remain in the full program optimizer purview
|
Superseded by NOAA-GFDL/NDSL#240 and PR #87. |
WIP for dace orchestration