Skip to content

Conversation

@nabilbrice
Copy link

Relevant Issue(s)/PR(s)

Closes #943

Provide an overview of the implemented solution or the fix and elaborate on the modifications.

The fold_events function is modified, first to optimise the binned statistics calculation, now using np.bincount instead of the previous scipy.stats.binned_statistic method. This has improved the computation speed in my own local testing (from something taking >20 seconds to ~7 seconds), but results may vary. Second, in order to modernise the style of the fold_events function, the opts.pop and error checking is now changed to use optional parameters with defaults. The use of optional parameters preserves the opts.pop functionality.

The main change for this PR is to introduce a new variances parameter to phase_dispersion_search that enables weighting observations. This change is propagated to the stat_func within and the fold_events.

So far, my own local testing shows that if variances is not supplied, the PDM gives the same results as before, except that if bins were empty before, the resulting PDM would be NaN. Now it is 0 as a result of the modified algorithm. The tests have been changed to reflect this.

Is there a new dependency introduced by your contribution? If so, please specify.

No. In fact, the use of scipy in fold_events has been removed.

Any other comments?

To be added.

Copy link
Member

@matteobachetti matteobachetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @nabilbrice, thanks for the PR! A very nice addition to the code. There are a few issues with the PR, which I think should be fixed before merging.

  1. You made some breaking changes when eliminating the pop instructions, so a few tests are failing. I always suggest to avoid changing the code in parts that are not strictly related to the main change you are proposing, even if it appears like a no-nonsense "modernization". But if you want to go this way, make sure that all tests pass.
  2. The fact of returning nan is also not necessarily a bug, it signals that the computation has failed in a way that 0 doesn't do.
  3. You did not add a new test using the new functionality.

I'll also request a review from Copilot, it might find some additional inconsistencies.



def pdm_profile_stat(profile, sample_var, nsample):
def pdm_profile_stat(profile, sum_dev2, nsample):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is changing the name of the variable needed? Also, is the variable something different from before?

prof = np.array([2, 2, 2, 2])
np.testing.assert_array_almost_equal(ef_profile_stat(prof), 0)

def test_pdm_stat(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please test that the new keyword to folding_search also works.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the phase dispersion minimization (PDM) search functionality by introducing weighted events support and optimizing the binned statistics calculation. The main changes improve computational performance and enable proper handling of observational uncertainties through variance weighting.

  • Introduces a variances parameter to phase_dispersion_search for weighted PDM calculations
  • Optimizes fold_events function by replacing scipy.stats.binned_statistic with np.bincount for improved performance
  • Modernizes function parameters by replacing opts.pop() pattern with explicit optional parameters

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
stingray/pulse/search.py Adds variances parameter to phase_dispersion_search and updates PDM statistic calculation logic
stingray/pulse/pulsar.py Refactors fold_events function signature, optimizes PDM binning algorithm with np.bincount, and updates pdm_profile_stat
stingray/pulse/tests/test_pulse.py Updates tests to reflect new PDM behavior where empty bins return 0 instead of NaN
docs/changes/944.feature.rst Adds changelog entry for the weighted PDM enhancement

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

return pdm_profile_stat(profile, sigma, len_flux)
flux = kwargs["weights"]
len_flux = len(flux)
# Get the local (possibly modified copy of variances) from kwargs rather than as a closure
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is unclear. The code is not getting a 'modified copy' but rather retrieving the variances parameter from kwargs. Consider clarifying: 'Get variances parameter from kwargs for local use in PDM calculation'

Suggested change
# Get the local (possibly modified copy of variances) from kwargs rather than as a closure
# Get variances parameter from kwargs for local use in PDM calculation

Copilot uses AI. Check for mistakes.
sum_in_bin = np.bincount(bin_indices, weights=weights * event_weights, minlength=nbin)
sos_in_bin = np.bincount(bin_indices, weights=weights**2 * event_weights, minlength=nbin)
# put it together to make sum-of-squared deviations: sosdev; avoid division by zero
raw_profile = sos_in_bin - sum_in_bin**2 / np.maximum(n_in_bin, 1)
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The magic number 1 in np.maximum should be documented or extracted as a constant. This prevents division by zero when bins are empty, but this behavior should be explicitly documented in a comment.

Copilot uses AI. Check for mistakes.
Comment on lines +411 to +412
# nsample - nbin for the in_bin sum_dev2 because there are nbin means used
# nsample - 1 for the grand sum_dev2 because there is 1 grand mean used
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment uses 'sum_dev2' which is not a variable name in this function. It should reference 'profile' and 'sum_dev2' parameter names for clarity.

Suggested change
# nsample - nbin for the in_bin sum_dev2 because there are nbin means used
# nsample - 1 for the grand sum_dev2 because there is 1 grand mean used
# nsample - nbin for the in-bin sum of squared deviations (from profile) because there are nbin means used
# nsample - 1 for the grand sum of squared deviations (from sum_dev2) because there is 1 grand mean used

Copilot uses AI. Check for mistakes.
@codecov
Copy link

codecov bot commented Oct 30, 2025

Codecov Report

❌ Patch coverage is 63.63636% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.76%. Comparing base (991e488) to head (1df730c).

Files with missing lines Patch % Lines
stingray/pulse/search.py 0.00% 8 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (991e488) and HEAD (1df730c). Click for more details.

HEAD has 10 uploads less than BASE
Flag BASE (991e488) HEAD (1df730c)
16 6
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #944       +/-   ##
===========================================
- Coverage   96.06%   82.76%   -13.31%     
===========================================
  Files          48       48               
  Lines        9895     9899        +4     
===========================================
- Hits         9506     8193     -1313     
- Misses        389     1706     +1317     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

…cted keyword argument (default behaviour) instead of ValueError, which was chosen error from previous implementation of old style kwargs.pop checking
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proposal: Observation weighting for PDM

2 participants