Skip to content

Conversation

@AdrianSosic
Copy link
Collaborator

@AdrianSosic AdrianSosic commented Oct 6, 2025

Say hello to our new settings management framework!
For all details, check out the new user guide page.

Fork: https://adriansosic.github.io/baybe-dev/latest/userguide/settings.html

Highlights

  • Unified and streamlined settings handling enabled by the new Settings class, which brings settings control right to your fingertips – everywhere, at all times.
  • All settings can be conveniently manipulated from within Python or via their corresponding environment variables.
  • Settings values are now validated, avoiding runtime or even silent bugs.
  • ... and so are the environment variables themselves. No more silent failures due to misspelled environment variables!
  • Settings can be easily altered via direct assignment:
      from baybe import active_settings
    
      active_settings.use_fpsample = True
  • (Joint) settings configurations can be stored in variables and either activated upon instantiation or at a later point in time:
    from baybe import Settings
    
    # Active immediately
    Settings(preprocess_dataframes=False, use_fpsample=True).activate()
    
    # Active later
    s = Settings(preprocess_dataframes=False, use_fpsample=True)
    # ... stuff
    s.activate()
  • Previous settings configurations can be restored with a single line of code, making it easy to temporarily change settings for a specific task:
    s.restore_previous()
  • The latter can also be done using context managers:
    with Settings(preprocess_dataframes=False, use_fpsample=True):
        # ... do stuff
  • ... or even decorators:
    @Settings(preprocess_dataframes=False, use_fpsample=True)
    def my_function(...):
        # ... do stuff
  • Random seed control is now possible using the same mechanisms:
    active_settings.random_seed = 42
    
    Settings(random_seed=42).activate()
    
    with Settings(random_seed=42):
      # ... do stuff
    
    @Settings(random_seed=42)
    def my_function(...):
        # ... do stuff

@AdrianSosic AdrianSosic self-assigned this Oct 6, 2025
@AdrianSosic AdrianSosic added enhancement Expand / change existing functionality new feature New functionality labels Oct 6, 2025
@Scienfitz Scienfitz added this to the 0.15.0 milestone Oct 8, 2025
@AdrianSosic AdrianSosic force-pushed the feature/settings branch 8 times, most recently from d7806b7 to 2c440be Compare October 14, 2025 13:44
@AdrianSosic AdrianSosic force-pushed the feature/settings branch 2 times, most recently from 8170c03 to 24d7af1 Compare October 15, 2025 07:52
@AdrianSosic AdrianSosic force-pushed the feature/settings branch 2 times, most recently from 8fd2bf3 to f8e81cb Compare October 15, 2025 08:59
@AdrianSosic AdrianSosic marked this pull request as ready for review October 15, 2025 09:53
Copilot AI review requested due to automatic review settings October 15, 2025 09:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a unified settings management framework to BayBE that consolidates previously scattered configuration mechanisms into a single Settings class. The new system provides flexible control over framework behavior through direct assignment, context managers, decorators, and environment variables, with automatic validation to prevent runtime bugs. Additionally, it brings random seed control into the settings system for reproducible experiments.

  • Unified settings management through the new Settings class with multiple activation patterns
  • Environment variable validation and deprecation warnings for legacy variable names
  • Integration of random seed control into the settings system with automatic state management

Reviewed Changes

Copilot reviewed 65 out of 65 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
baybe/settings.py New settings management framework with Settings class and global active_settings instance
baybe/init.py Exports Settings and active_settings for user access
baybe/utils/validation.py Adds preprocess_dataframe function consolidating validation and normalization
baybe/utils/boolean.py Adds AutoBool and UncertainBool enums moved from utils.basic
baybe/utils/basic.py Adds cache_to_disk utility and removes UncertainBool (moved to utils.boolean)
baybe/utils/random.py Deprecates set_random_seed and temporary_seed functions
tests/test_settings.py Comprehensive test suite for the new settings system
tox.ini Updates environment variable names to new format

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +709 to +711
def normalize_input_dtypes(
df: pd.DataFrame, objects: Iterable[Parameter | Target], /
) -> pd.DataFrame:
Copy link

Copilot AI Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function signature changed from generic Iterable[_T] to specific Iterable[Parameter | Target] which could be a breaking change for code that relied on the more generic typing.

Copilot uses AI. Check for mistakes.
skip_install = True
setenv =
SMOKE_TEST = true
BAYBE_USE_POLARS_FOR_CONSTRAINTS = false
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Scienfitz @AVHopp: unclear why this is suddenly needed. There is one example that fails during doc building.

"JackiePhos": r"FC(F)(F)C1=CC(P(C2=C(C3=C(C(C)C)C=C(C(C)C)C=C3C(C)C)C(OC)=CC=C2OC)"
r"C4=CC(C(F)(F)F)=CC(C(F)(F)F)=C4)=CC(C(F)(F)F)=C1",
"SCHEMBL15068049": r"C[C@]1(O2)O[C@](C[C@]2(C)P3C4=CC=CC=C4)(C)O[C@]3(C)C1",
"SCHEMBL15068049": "C[C@]1(O2)O[C@](C[C@]2(C)P3C4=CC=CC=C4)(C)O[C@]3(C)C1",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea how to fix this

```python
from baybe import active_settings

active_settings.random_seed = 1337
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AVHopp, @Scienfitz, @kalama-ai

Do we want to enable attribute-like assignment like this one?

While you can change several settings one at a time using direct assignment, a more
convenient way is to instantiate a {class}`baybe.settings.Settings` object, which allows
to define multiple settings at once. In order for the settings to take effect,
call its :meth:`~baybe.settings.Settings.activate` method:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AVHopp, @Scienfitz, @kalama-ai

This is probably the biggest design choice to make for this PR. Do we want

  1. that settings are only activated when requested (i.e. either using an explicit activate call or using on of the context/decorator options) or (<-- current design)
  2. Should they activate immediately by default and one needs to specify otherwise if not desired, e.g. by passing an additional delay_activation=True flag?

Both variants are equally valid and there are reasons for both of them. Also, the additional syntax just moves from one case to the other, i.e. having an additional activate call vs passing an additional flag. So it's just a matter of taste in the end. Most striking "reasons" for either of the two

  1. First option behaves like a regular class, i.e. no fancy stuff happening until explicitly requested --> perhaps less surprising (which was probably the reason why I went with it)?
  2. On the other hand, activating settings immediately is probably the thing a user wants in 99% of cases, and delayed activation is rather rare.

Note that, for example, polars does the second approach using its apply_on_context_enter flag (which delays activation) – it was one of the packages I've screened when I collected ideas how to design the approach but our is sort of a mix of different ideas.

Let me know your thoughts

@@ -0,0 +1,213 @@
# Settings
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AVHopp, @Scienfitz, @kalama-ai

This is just my proposal for the naming, but we can discuss if better approaches exist. To be decided:

  • name of the module (currently: a public settings.py)
  • name of the settings class (currently: Settings). I think other packages oftentimes use small letters because they put the user experience first, i.e. a user would e.g. expect to call baybe.settings(...) without having to worry that this is actually instantiating class behind the scenes
  • name of the "global object" (currently: active_settings).

Reasons for the current choice: all names are sufficiently different and rather explicit so that there is no confusion.

But happy to hear your input


## Available Settings

For more information on which settings are available, their roles and allowed value
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to show something like a table here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Expand / change existing functionality new feature New functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants