Pdm weighting #944

nabilbrice · 2025-10-02T10:26:59Z

Relevant Issue(s)/PR(s)

Closes #943

Provide an overview of the implemented solution or the fix and elaborate on the modifications.

The fold_events function is modified in two ways. First, the binned statistics calculation now uses np.bincount instead of scipy.stats.binned_statistic. In my local testing this reduced the run time from more than 20 seconds to about 7 seconds, but results may vary. Second, to modernise the style of fold_events, the opts.pop calls and their error checking are replaced by optional parameters with defaults, which preserves the opts.pop functionality.
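As a rough illustration of the np.bincount approach described above, the sketch below sums a series in equal-width phase bins. The helper name `binned_sum` and the phase-to-bin mapping are assumptions made for the example, not the code in this PR.

```python
import numpy as np


def binned_sum(phases, values, nbin):
    """Sum `values` in `nbin` equal-width phase bins over [0, 1).

    Hypothetical helper for illustration only.
    """
    phases = np.asarray(phases, dtype=float) % 1.0
    values = np.asarray(values, dtype=float)
    # Map each phase to an integer bin index; the clip guards against
    # a value rounding up to nbin.
    bin_indices = np.minimum((phases * nbin).astype(int), nbin - 1)
    # One pass over the data; minlength keeps empty trailing bins in the output.
    return np.bincount(bin_indices, weights=values, minlength=nbin)


phases = np.array([0.05, 0.10, 0.30, 0.55, 0.60, 0.95])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
sums = binned_sum(phases, values, nbin=4)
```

Because np.bincount only needs integer indices and an optional weights array, the same call pattern can produce counts, sums, and sums of squares.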

The main change in this PR is a new variances parameter for phase_dispersion_search that enables weighting of observations. The parameter is passed through to the inner stat_func and to fold_events.
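To make the idea of weighting by variances concrete, here is a minimal sketch that assumes inverse-variance weights (w = 1/variance), a common convention that may or may not match what this PR does internally. The helper name `weighted_bin_means` is hypothetical.

```python
import numpy as np


def weighted_bin_means(bin_indices, values, variances, nbin):
    """Inverse-variance weighted mean of `values` in each bin.

    Assumption: weights are 1/variance. Hypothetical helper, not part of the PR.
    """
    values = np.asarray(values, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    sum_w = np.bincount(bin_indices, weights=w, minlength=nbin)
    sum_wx = np.bincount(bin_indices, weights=w * values, minlength=nbin)
    # Empty bins have zero total weight; mark them NaN rather than divide by zero.
    means = np.full(nbin, np.nan)
    filled = sum_w > 0
    means[filled] = sum_wx[filled] / sum_w[filled]
    return means


bin_indices = np.array([0, 0, 1, 1])
values = np.array([1.0, 3.0, 1.0, 3.0])
variances = np.array([1.0, 1.0, 1.0, 3.0])
means = weighted_bin_means(bin_indices, values, variances, nbin=3)
```

In bin 1 the noisier point (variance 3) pulls the mean less than the precise one, so the weighted mean sits closer to 1 than to 3.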

So far, my local testing shows that when variances is not supplied, the PDM gives the same results as before, with one exception: where bins were empty, the resulting PDM used to be NaN and is now 0 as a result of the modified algorithm. The tests have been changed to reflect this.

Is there a new dependency introduced by your contribution? If so, please specify.

No. In fact, the use of scipy in fold_events has been removed.

Any other comments?

To be added.

matteobachetti

Hello @nabilbrice, thanks for the PR! A very nice addition to the code. There are a few issues with the PR, which I think should be fixed before merging.

You made some breaking changes when eliminating the pop instructions, so a few tests are failing. I always suggest avoiding changes to parts of the code that are not strictly related to the main change you are proposing, even if they look like a no-nonsense "modernization". But if you want to go this way, make sure that all tests pass.
Returning nan is also not necessarily a bug: it signals that the computation has failed, which 0 does not.
You did not add a new test using the new functionality.

I'll also request a review from Copilot, it might find some additional inconsistencies.
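The breaking change mentioned in this review can be reproduced in isolation. In the sketch below, both functions and the error message are hypothetical stand-ins: an `**opts` function that pops known keys and raises ValueError on leftovers, and an explicit-keyword function where Python itself raises TypeError.

```python
def fold_old_style(times, **opts):
    """Hypothetical old-style signature: pop known keys, reject the rest."""
    nbin = opts.pop("nbin", 16)
    if opts:
        raise ValueError(f"Unidentified keyword(s): {sorted(opts)}")
    return nbin


def fold_new_style(times, nbin=16):
    """Hypothetical new-style signature: explicit optional parameter."""
    return nbin


def raised(func, **kwargs):
    """Return the exception type raised by func, or None if it succeeds."""
    try:
        func([0.0], **kwargs)
    except Exception as exc:
        return type(exc)
    return None


old_error = raised(fold_old_style, wrong_key=1)  # ValueError from the manual check
new_error = raised(fold_new_style, wrong_key=1)  # TypeError from Python's call machinery
```

Any test that asserts on the exception type for an unknown keyword therefore needs updating when the signature style changes.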

matteobachetti · 2025-10-10T07:49:47Z

stingray/pulse/pulsar.py



-def pdm_profile_stat(profile, sample_var, nsample):
+def pdm_profile_stat(profile, sum_dev2, nsample):


Is changing the name of the variable needed? Also, is the variable something different from before?

matteobachetti · 2025-10-10T07:52:04Z

stingray/pulse/tests/test_pulse.py

        prof = np.array([2, 2, 2, 2])
        np.testing.assert_array_almost_equal(ef_profile_stat(prof), 0)

    def test_pdm_stat(self):


Please test that the new keyword to folding_search also works.

Copilot

Pull Request Overview

This PR enhances the phase dispersion minimization (PDM) search functionality by introducing weighted events support and optimizing the binned statistics calculation. The main changes improve computational performance and enable proper handling of observational uncertainties through variance weighting.

Introduces a variances parameter to phase_dispersion_search for weighted PDM calculations
Optimizes fold_events function by replacing scipy.stats.binned_statistic with np.bincount for improved performance
Modernizes function parameters by replacing opts.pop() pattern with explicit optional parameters

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File	Description
stingray/pulse/search.py	Adds `variances` parameter to `phase_dispersion_search` and updates PDM statistic calculation logic
stingray/pulse/pulsar.py	Refactors `fold_events` function signature, optimizes PDM binning algorithm with `np.bincount`, and updates `pdm_profile_stat`
stingray/pulse/tests/test_pulse.py	Updates tests to reflect new PDM behavior where empty bins return 0 instead of NaN
docs/changes/944.feature.rst	Adds changelog entry for the weighted PDM enhancement

Copilot · 2025-10-10T07:57:25Z

stingray/pulse/search.py

-        return pdm_profile_stat(profile, sigma, len_flux)
+        flux = kwargs["weights"]
+        len_flux = len(flux)
+        # Get the local (possibly modified copy of variances) from kwargs rather than as a closure


This comment is unclear. The code is not getting a 'modified copy' but rather retrieving the variances parameter from kwargs. Consider clarifying: 'Get variances parameter from kwargs for local use in PDM calculation'

Suggested change

-        # Get the local (possibly modified copy of variances) from kwargs rather than as a closure
+        # Get variances parameter from kwargs for local use in PDM calculation

Copilot · 2025-10-10T07:57:25Z

stingray/pulse/pulsar.py

+        sum_in_bin = np.bincount(bin_indices, weights=weights * event_weights, minlength=nbin)
+        sos_in_bin = np.bincount(bin_indices, weights=weights**2 * event_weights, minlength=nbin)
+        # put it together to make sum-of-squared deviations: sosdev; avoid division by zero
+        raw_profile = sos_in_bin - sum_in_bin**2 / np.maximum(n_in_bin, 1)


The magic number 1 in np.maximum should be documented or extracted as a constant. This prevents division by zero when bins are empty, but this behavior should be explicitly documented in a comment.
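A small unweighted sketch of the guard under discussion, assuming plain counts for `n_in_bin` (the PR's version also involves `event_weights`, which is omitted here). It uses the identity sum((x - mean)^2) = sum(x^2) - (sum x)^2 / n per bin, and shows one way to flag empty bins with NaN instead of 0. The function name is hypothetical.

```python
import numpy as np


def sum_sq_dev_per_bin(bin_indices, values, nbin):
    """Per-bin sum of squared deviations from the bin mean (unweighted sketch)."""
    values = np.asarray(values, dtype=float)
    n_in_bin = np.bincount(bin_indices, minlength=nbin)
    sum_in_bin = np.bincount(bin_indices, weights=values, minlength=nbin)
    sos_in_bin = np.bincount(bin_indices, weights=values**2, minlength=nbin)
    # An empty bin has sum 0 and count 0; clamping the count to 1 turns
    # 0/0 into 0/1, so the bin contributes exactly 0.
    sosdev = sos_in_bin - sum_in_bin**2 / np.maximum(n_in_bin, 1)
    return sosdev, n_in_bin


bin_indices = np.array([0, 0, 2, 2, 2])
values = np.array([1.0, 3.0, 2.0, 4.0, 6.0])
sosdev, n_in_bin = sum_sq_dev_per_bin(bin_indices, values, nbin=4)

# Optional: mark empty bins as NaN so that they stay distinguishable from
# genuinely zero-scatter bins.
sosdev_nan = np.where(n_in_bin > 0, sosdev, np.nan)
```

Whether empty bins should read as 0 or NaN is a design choice; the masked variant keeps the signal that a bin had no data.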

Copilot · 2025-10-10T07:57:26Z

stingray/pulse/pulsar.py

+    # nsample - nbin for the in_bin sum_dev2 because there are nbin means used
+    # nsample - 1 for the grand sum_dev2 because there is 1 grand mean used


The comment uses 'sum_dev2' which is not a variable name in this function. It should reference 'profile' and 'sum_dev2' parameter names for clarity.

Suggested change

-    # nsample - nbin for the in_bin sum_dev2 because there are nbin means used
-    # nsample - 1 for the grand sum_dev2 because there is 1 grand mean used
+    # nsample - nbin for the in-bin sum of squared deviations (from profile) because there are nbin means used
+    # nsample - 1 for the grand sum of squared deviations (from sum_dev2) because there is 1 grand mean used
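The degrees-of-freedom bookkeeping in these comments can be sketched as a ratio of two variance estimates, in the spirit of the standard PDM statistic. The function below is a hypothetical illustration built only from what the comments state; it is not the body of pdm_profile_stat.

```python
import numpy as np


def pdm_theta(in_bin_sum_dev2, grand_sum_dev2, nsample):
    """Ratio of pooled in-bin variance to total variance (illustrative sketch).

    in_bin_sum_dev2 : per-bin sums of squared deviations from each bin mean
    grand_sum_dev2  : sum of squared deviations from the grand mean
    """
    in_bin_sum_dev2 = np.asarray(in_bin_sum_dev2, dtype=float)
    nbin = in_bin_sum_dev2.size
    # nbin bin means were estimated, so nsample - nbin degrees of freedom remain.
    pooled_var = in_bin_sum_dev2.sum() / (nsample - nbin)
    # One grand mean was estimated, so nsample - 1 degrees of freedom remain.
    total_var = grand_sum_dev2 / (nsample - 1)
    return pooled_var / total_var


values = np.array([1.0, 3.0, 1.0, 3.0])
grand_sum_dev2 = np.sum((values - values.mean()) ** 2)

# Bins {1, 1} and {3, 3}: no scatter inside the bins.
theta_aligned = pdm_theta([0.0, 0.0], grand_sum_dev2, nsample=4)
# Bins {1, 3} and {1, 3}: all of the scatter sits inside the bins.
theta_mixed = pdm_theta([2.0, 2.0], grand_sum_dev2, nsample=4)
```

A trial period that folds the signal cleanly drives the ratio towards 0, while a wrong period leaves it near or above 1.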

codecov · 2025-10-30T11:36:54Z

Codecov Report

❌ Patch coverage is 63.63636% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.76%. Comparing base (991e488) to head (1df730c).

Files with missing lines	Patch %	Lines
stingray/pulse/search.py	0.00%	8 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (991e488) and HEAD (1df730c).

HEAD has 10 uploads less than BASE

Flag	BASE (991e488)	HEAD (1df730c)
	16	6

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #944       +/-   ##
===========================================
- Coverage   96.06%   82.76%   -13.31%     
===========================================
  Files          48       48               
  Lines        9895     9899        +4     
===========================================
- Hits         9506     8193     -1313     
- Misses        389     1706     +1317

matteobachetti requested changes Oct 10, 2025

matteobachetti requested a review from Copilot October 10, 2025 07:56

Copilot AI reviewed Oct 10, 2025

Nabil Brice added 12 commits October 30, 2025 11:29

simplified pdm profile calculation

f62f49a

changed to using np.var with ddof=1 for unbiased estimator of variance

ae7f029

optimised pdm profile computation through splitting sum of squared deviations computation, each part using only single pass of numpy bincount

370bc5d

changed variable name from sumsquared -> sumofsquares to better reflect that it is the sum of squares rather than the already summed values then squared

6dfb20d

added prelim weighted pdm search

fbd90fd

modernised fold_events alongside enabling weighted pdm with new variances keyword

a2b219a

added type casting of weights to numpy array in pdm mode to ensure typical broadcasted operations continue to work

4332dcc

refactored so all of the degrees of freedom are handled in pdm_profile_stat rather than split between it and the stat_fun in the separate module

c6adab6

amended comments and lines for clarity

89924ac

amended tests to work with new implementations

1b5a77d

reformatted according to black

019aad6

added changelog entry

1df730c

nabilbrice force-pushed the pdm_weighting branch from 4b6bf21 to 1df730c Compare October 30, 2025 11:34

changed test_search_wrong_key_fails to expect a TypeError from unexpected keyword argument (default behaviour) instead of ValueError, which was chosen error from previous implementation of old style kwargs.pop checking

1c79fcc
