- 
                Notifications
    You must be signed in to change notification settings 
- Fork 170
Pdm weighting #944
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Pdm weighting #944
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @nabilbrice, thanks for the PR! A very nice addition to the code. There are a few issues with the PR, which I think should be fixed before merging.
- You made some breaking changes when eliminating the popinstructions, so a few tests are failing. I always suggest to avoid changing the code in parts that are not strictly related to the main change you are proposing, even if it appears like a no-nonsense "modernization". But if you want to go this way, make sure that all tests pass.
- The fact of returning nanis also not necessarily a bug, it signals that the computation has failed in a way that 0 doesn't do.
- You did not add a new test using the new functionality.
I'll also request a review from Copilot, it might find some additional inconsistencies.
|  | ||
|  | ||
| def pdm_profile_stat(profile, sample_var, nsample): | ||
| def pdm_profile_stat(profile, sum_dev2, nsample): | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is changing the name of the variable needed? Also, is the variable something different from before?
| prof = np.array([2, 2, 2, 2]) | ||
| np.testing.assert_array_almost_equal(ef_profile_stat(prof), 0) | ||
|  | ||
| def test_pdm_stat(self): | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please test that the new keyword to folding_search also works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the phase dispersion minimization (PDM) search functionality by introducing weighted events support and optimizing the binned statistics calculation. The main changes improve computational performance and enable proper handling of observational uncertainties through variance weighting.
- Introduces a variancesparameter tophase_dispersion_searchfor weighted PDM calculations
- Optimizes fold_eventsfunction by replacingscipy.stats.binned_statisticwithnp.bincountfor improved performance
- Modernizes function parameters by replacing opts.pop()pattern with explicit optional parameters
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description | 
|---|---|
| stingray/pulse/search.py | Adds variancesparameter tophase_dispersion_searchand updates PDM statistic calculation logic | 
| stingray/pulse/pulsar.py | Refactors fold_eventsfunction signature, optimizes PDM binning algorithm withnp.bincount, and updatespdm_profile_stat | 
| stingray/pulse/tests/test_pulse.py | Updates tests to reflect new PDM behavior where empty bins return 0 instead of NaN | 
| docs/changes/944.feature.rst | Adds changelog entry for the weighted PDM enhancement | 
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| return pdm_profile_stat(profile, sigma, len_flux) | ||
| flux = kwargs["weights"] | ||
| len_flux = len(flux) | ||
| # Get the local (possibly modified copy of variances) from kwargs rather than as a closure | 
    
      
    
      Copilot
AI
    
    
    
      Oct 10, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is unclear. The code is not getting a 'modified copy' but rather retrieving the variances parameter from kwargs. Consider clarifying: 'Get variances parameter from kwargs for local use in PDM calculation'
| # Get the local (possibly modified copy of variances) from kwargs rather than as a closure | |
| # Get variances parameter from kwargs for local use in PDM calculation | 
| sum_in_bin = np.bincount(bin_indices, weights=weights * event_weights, minlength=nbin) | ||
| sos_in_bin = np.bincount(bin_indices, weights=weights**2 * event_weights, minlength=nbin) | ||
| # put it together to make sum-of-squared deviations: sosdev; avoid division by zero | ||
| raw_profile = sos_in_bin - sum_in_bin**2 / np.maximum(n_in_bin, 1) | 
    
      
    
      Copilot
AI
    
    
    
      Oct 10, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic number 1 in np.maximum should be documented or extracted as a constant. This prevents division by zero when bins are empty, but this behavior should be explicitly documented in a comment.
| # nsample - nbin for the in_bin sum_dev2 because there are nbin means used | ||
| # nsample - 1 for the grand sum_dev2 because there is 1 grand mean used | 
    
      
    
      Copilot
AI
    
    
    
      Oct 10, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment uses 'sum_dev2' which is not a variable name in this function. It should reference 'profile' and 'sum_dev2' parameter names for clarity.
| # nsample - nbin for the in_bin sum_dev2 because there are nbin means used | |
| # nsample - 1 for the grand sum_dev2 because there is 1 grand mean used | |
| # nsample - nbin for the in-bin sum of squared deviations (from profile) because there are nbin means used | |
| # nsample - 1 for the grand sum of squared deviations (from sum_dev2) because there is 1 grand mean used | 
…viations computation, each part using only single pass of numpy bincount
…ct that it is the sum of squares rather than the already summed values then squared
…pical broadcasted operations continue to work
…e_stat rather than split between it and the stat_fun in the separate module
4b6bf21    to
    1df730c      
    Compare
  
    | Codecov Report❌ Patch coverage is  
 
 Additional details and impacted files@@             Coverage Diff             @@
##             main     #944       +/-   ##
===========================================
- Coverage   96.06%   82.76%   -13.31%     
===========================================
  Files          48       48               
  Lines        9895     9899        +4     
===========================================
- Hits         9506     8193     -1313     
- Misses        389     1706     +1317     ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
 | 
…cted keyword argument (default behaviour) instead of ValueError, which was chosen error from previous implementation of old style kwargs.pop checking
Relevant Issue(s)/PR(s)
Closes #943
Provide an overview of the implemented solution or the fix and elaborate on the modifications.
The
fold_eventsfunction is modified, first to optimise the binned statistics calculation, now usingnp.bincountinstead of the previousscipy.stats.binned_statisticmethod. This has improved the computation speed in my own local testing (from something taking >20 seconds to ~7 seconds), but results may vary. Second, in order to modernise the style of thefold_eventsfunction, theopts.popand error checking is now changed to use optional parameters with defaults. The use of optional parameters preserves theopts.popfunctionality.The main change for this PR is to introduce a new
variancesparameter tophase_dispersion_searchthat enables weighting observations. This change is propagated to thestat_funcwithin and thefold_events.So far, my own local testing shows that if
variancesis not supplied, the PDM gives the same results as before, except that if bins were empty before, the resulting PDM would beNaN. Now it is0as a result of the modified algorithm. The tests have been changed to reflect this.Is there a new dependency introduced by your contribution? If so, please specify.
No. In fact, the use of
scipyinfold_eventshas been removed.Any other comments?
To be added.