
Conversation


@cetagostini cetagostini commented Aug 19, 2025

Description

Expanded SensitivityAnalysis to support arbitrary PyMC models, improved documentation with usage examples, and added flexible sweep and filtering logic. The class now works directly with PyMC models and InferenceData, supports multi-dimensional inputs, and provides Jacobian-based marginal effects with optional xarray output and idata extension.

Legacy code using pandas was removed in favor of a more general and robust approach.

ref code: https://colab.research.google.com/drive/16xn_lGYflxcyKZilSEGr0A6ShJuKSStK#scrollTo=1cbXjzqurd1c

To-dos:

  • Make compatible with plot suite to keep current user behaviour.
  • Modify test suite to check new API outputs.
  • Validate new outputs against current outputs.

New features (a minimal sketch follows this list):

  1. PyTensor-based implementation.
  2. Sweeps can be run over original-scale variables.
  3. Several channels can be run in parallel.
  4. Option to run with several PyTensor backends (more speed).
  5. More flexible filtering around dimensions.
  6. Compatible with any arbitrary forward-pass transformation for any node (channels, controls, or arbitrary effects).
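
To make the intent concrete, here is a minimal sketch of the underlying mechanic (not the actual SensitivityAnalysis API; the variable names, the tanh stand-in for the saturation transform, and the sweep grid are purely illustrative):

```python
import numpy as np
import pytensor
import pytensor.tensor as pt
from pytensor.gradient import jacobian

# Hypothetical shared input standing in for a pm.Data variable such as "channel_data"
x_shared = pytensor.shared(np.array([1.0, 2.0, 3.0]), name="channel_data")
beta = 0.8  # assumed effect size

# Stand-in for the model's forward-pass transformation (any saturation curve would do)
response = beta * pt.tanh(x_shared)

# Jacobian of the response with respect to the input gives the marginal effects
jac = jacobian(response, wrt=x_shared)
jac_fn = pytensor.function(inputs=[], outputs=jac)

original_x = x_shared.get_value()
sweep_values = np.linspace(0.5, 1.5, 5)  # multiplicative sweep on the original scale
marginal_effects = []
for s in sweep_values:
    x_shared.set_value(original_x * s)
    marginal_effects.append(jac_fn())
x_shared.set_value(original_x)  # restore the original data
```

The real class additionally handles posterior samples from InferenceData, multi-dimensional inputs, and packaging results as xarray, but the shared-data + symbolic Jacobian + compiled-function pattern above is the gist of what the description refers to.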

Related Issue

  • Closes #
  • Related to #

Checklist

@cetagostini cetagostini added this to the 0.16.0 milestone Aug 19, 2025
@cetagostini cetagostini self-assigned this Aug 19, 2025
@cetagostini cetagostini added enhancement New feature or request help wanted Extra attention is needed major API breaking changes API model components Related to the various model components labels Aug 19, 2025
@github-actions github-actions bot added the MMM label Aug 19, 2025

codecov bot commented Aug 19, 2025

Codecov Report

❌ Patch coverage is 11.44578% with 147 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.70%. Comparing base (fa11344) to head (381586f).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pymc_marketing/mmm/plot.py | 3.84% | 75 Missing ⚠️ |
| pymc_marketing/mmm/sensitivity_analysis.py | 18.39% | 71 Missing ⚠️ |
| pymc_marketing/mmm/multidimensional.py | 0.00% | 1 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (fa11344) and HEAD (381586f). Click for more details.

HEAD has 2 fewer uploads than BASE.

| Flag | BASE (fa11344) | HEAD (381586f) |
|---|---|---|
|  | 23 | 21 |
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1899       +/-   ##
===========================================
- Coverage   92.12%   70.70%   -21.43%     
===========================================
  Files          67       67               
  Lines        8079     8106       +27     
===========================================
- Hits         7443     5731     -1712     
- Misses        636     2375     +1739     

☔ View full report in Codecov by Sentry.

@cetagostini cetagostini requested a review from PabloRoque August 19, 2025 20:55
List of variable names to intervene on.
varinput : str
Name of the pm.Data variable (e.g., "channel_data").
Expected shape: (date, *dims, channel)
Contributor Author

I'll change this here; it should say: (date, *dims, arbitrary_dim), matching the varinput dims.

)
if extend_idata:
self._add_to_idata(xr_result)
return stacked
Contributor Author

Here we should return either the xr result or None. I'll change this.

coords["sweep"] = np.asarray(sweep_values)

# Map remaining dims using model coords and filters
# Compute axis offsets: after (sample, sweep), axes align with dims_order
Contributor

I tend to dislike combined=True and a stacked sample dim instead of the usual ('chain', 'draw'), because at times there are issues down the line if you want to merge xarrays.
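
For concreteness, a minimal sketch of the friction (assuming an existing idata with a posterior group; names are illustrative): anything stacked over sample has to be unstacked back into ('chain', 'draw') before it will line up with the rest of the posterior.

```python
import arviz as az

# Assumed: idata is an arviz.InferenceData with a posterior group
result = az.extract(idata, combined=True)  # stacked dim: sample = (chain, draw)

# Downstream code that expects ("chain", "draw") dims (e.g. xr.merge with other
# posterior-derived xarrays) needs the stacked dim undone first:
result_chain_draw = result.unstack("sample")
```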

@github-actions github-actions bot added the tests label Aug 28, 2025
@cetagostini cetagostini removed this from the 0.16.0 milestone Aug 28, 2025
@cetagostini
Contributor Author

I need @drbenvincent's help to understand how to reformulate the plots.

@cetagostini
Contributor Author

Thanks @ErikRingen for the feedback. @drbenvincent, I just need you to adjust the tests and follow a bit more closely what we want to show, but that should be all.

Contributor

@drbenvincent drbenvincent left a comment

Just bear in mind that I am very much a beginner when it comes to PyTensor.

Here's a question - originally raised by @ErikRingen - that I'm picking up on as well. As far as I can tell, this goes straight for the marginal effect via the Jacobian, right? The original functionality first computed the outcome for different sweep values and then computed the gradient as a function of the sweep values (the marginal effect).

In discussion with an LLM, it seems we need something like this?

```python
from pytensor import function

# Assuming resp_graph, jac_fn, data_shared, original_X, and sweep_values are
# already defined as in the current implementation.

# Add this to get actual outcome values
resp_fn = function(inputs=[], outputs=resp_graph)

# Then in the sweep loop:
for s in sweep_values:
    data_shared.set_value(original_X * s)
    actual_values = resp_fn()      # Actual outcome values
    marginal_effects = jac_fn()    # Marginal effects (current implementation)
```

My only other thought (at this time) is that there seem to be some assumptions about dimensions etc. We don't have the useful named dimensions that we have in xarray, so perhaps we need some extra validation here?
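
Something along these lines, perhaps (purely a sketch; the function name and the use of model.named_vars_to_dims are my suggestion, not what the PR currently does):

```python
import pymc as pm


def validate_sweep_input(model: pm.Model, varinput: str, expected_dims: tuple[str, ...]) -> None:
    """Check that the intervened variable exists and its registered dims match expectations."""
    if varinput not in model.named_vars:
        raise KeyError(f"{varinput!r} is not a variable in the model")
    actual_dims = model.named_vars_to_dims.get(varinput)
    if actual_dims is not None and tuple(actual_dims) != tuple(expected_dims):
        raise ValueError(
            f"{varinput!r} has dims {tuple(actual_dims)}, expected {tuple(expected_dims)}"
        )
```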

Actually, another thought. This seems very general. So in theory this could feasibly be ported to pymc-extras or pymc to enable generic sensitivity / marginal effects for arbitrary models? Not saying we should do that now, but it's perhaps worth considering, and thinking about the feasibility?

@cetagostini
Contributor Author

After talking with @drbenvincent, I found out I'm doing something fundamentally different, so I'll be restructuring all the code 😃

Updated sensitivity analysis logic in pymc_marketing/mmm/sensitivity_analysis.py and related plotting in plot.py. Adjusted notebook outputs and execution counts for reproducibility. Modified tests to align with new sensitivity analysis behavior.

@github-actions github-actions bot added the docs Improvements or additions to documentation label Sep 22, 2025
@cetagostini
Contributor Author

@drbenvincent @ErikRingen

We are ready for review; the current style does the same as the old one, with some new changes.

@cetagostini cetagostini marked this pull request as ready for review September 22, 2025 19:20