Refactor and enhance SensitivityAnalysis for MMM #1899
base: main
Conversation
Expanded SensitivityAnalysis to support arbitrary PyMC models, improved documentation with usage examples, and added flexible sweep and filtering logic. The class now works directly with PyMC models and InferenceData, supports multi-dimensional inputs, and provides Jacobian-based marginal effects with optional xarray output and idata extension. Legacy code using pandas was removed in favor of a more general and robust approach.
Codecov Report

```
@@            Coverage Diff             @@
##             main    #1899      +/-   ##
===========================================
- Coverage   92.12%   70.70%   -21.43%
===========================================
  Files          67       67
  Lines        8079     8106      +27
===========================================
- Hits         7443     5731    -1712
- Misses        636     2375    +1739
```
```
    List of variable names to intervene on.
varinput : str
    Name of the pm.Data variable (e.g., "channel_data").
    Expected shape: (date, *dims, channel)
```
I'll change it here; it should say `(date, *dims, arbitrary_dim)`, so that it matches the `varinput` dims.
```python
)
if extend_idata:
    self._add_to_idata(xr_result)
return stacked
```
Here we should return either the xr result or `None`. I'll change this.
```python
coords["sweep"] = np.asarray(sweep_values)
```

```python
# Map remaining dims using model coords and filters
# Compute axis offsets: after (sample, sweep), axes align with dims_order
```
I tend to dislike `combined=True` and a `sample` dim instead of the usual `('chain', 'draw')`, because at times there are issues down the line if you want to merge xarrays. I need @drbenvincent's help to understand how to reformulate the plots.
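To illustrate the concern with synthetic data (not this PR's actual output): collapsing `('chain', 'draw')` into a flat `sample` dim, as `arviz.extract`-style helpers do, loses the original structure unless the stacked MultiIndex is carried along, which is where merges can get awkward:

```python
import numpy as np
import xarray as xr

# Synthetic posterior-like array with the usual (chain, draw) dims.
da = xr.DataArray(
    np.random.rand(4, 250, 3),
    dims=("chain", "draw", "channel"),
    coords={"chain": range(4), "draw": range(250), "channel": list("abc")},
)

# Flatten chain/draw into a single `sample` dim (MultiIndex under the hood).
stacked = da.stack(sample=("chain", "draw"))
assert stacked.sizes["sample"] == 4 * 250

# Recover the original layout.
roundtrip = stacked.unstack("sample")
assert roundtrip.sizes == {"chain": 4, "draw": 250, "channel": 3}
```

Merging a `sample`-stacked array with a `(chain, draw)`-shaped one requires one of the two to be converted first, which is the friction being pointed at.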
Thanks @ErikRingen for the feedback. @drbenvincent, I just need you to adjust the test and follow a bit more closely what we want to show, but that should be all.
Just bear in mind that I am very much a beginner when it comes to PyTensor.
Here's a question, originally raised by @ErikRingen, that I'm picking up on as well. As far as I can tell, this goes straight for the marginal effect via the Jacobian, right? The original functionality first computed the outcome for different sweep values, and then computed the gradient as a function of the sweep values (the marginal effect).
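As a plain-NumPy sketch of that original two-step approach (toy response function and hypothetical names, purely for illustration):

```python
import numpy as np

# Multiplicative sweep over the input, e.g. scaling spend from 50% to 150%.
sweep_values = np.linspace(0.5, 1.5, 11)

# Toy saturating outcome as a stand-in for the model's expected response.
def outcome(scale):
    return 1 - np.exp(-2.0 * scale)

# Step 1: compute the outcome at each sweep point.
outcomes = np.array([outcome(s) for s in sweep_values])

# Step 2: marginal effect = gradient of the outcome along the sweep axis.
marginal_effects = np.gradient(outcomes, sweep_values)
```

The Jacobian route instead differentiates the graph directly at each sweep point, skipping the finite-difference step.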
From a discussion with an LLM, it seems we need something like this?
```python
# Add this to get actual values (assumes resp_graph, data_shared,
# original_X, and jac_fn are already defined as in the PR):
resp_fn = function(inputs=[], outputs=resp_graph)

# Then in the sweep loop:
for s in sweep_values:
    data_shared.set_value(original_X * s)
    actual_values = resp_fn()    # actual outcome values
    marginal_effects = jac_fn()  # marginal effects (current implementation)
```
My only other thought (at this time) is that there seem to be some assumptions about dimensions etc. We don't have the useful named dimensions that we do in xarray, so perhaps we need some extra validation here?
Actually, another thought. This seems very general, so in theory it could feasibly be ported to pymc-extras or pymc to enable generic sensitivity / marginal effects for arbitrary models? Not saying we should do that now, but it's perhaps worth considering, and thinking about the feasibility?
After talking with @drbenvincent, I found out I'm doing fundamentally different things, so I'll be restructuring all the code 😃
Updated sensitivity analysis logic in pymc_marketing/mmm/sensitivity_analysis.py and related plotting in plot.py. Adjusted notebook outputs and execution counts for reproducibility. Modified tests to align with new sensitivity analysis behavior.
Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
We are ready for review; the current style does the same as the old one, with some new changes.
Description
ref code: https://colab.research.google.com/drive/16xn_lGYflxcyKZilSEGr0A6ShJuKSStK#scrollTo=1cbXjzqurd1c
to-do's:
new features:
Related Issue
Checklist
pre-commit.ci autofix to auto-fix.