JasMehta08
Contributor

This Pull request:

Changes or fixes:

This PR refactors and improves the dataset printing functionality in RooFit. The previous RooDataHist::printDataHistogram function was limited to 1D, had a redundant interface, and was not consistently available for all dataset types.

This contribution addresses these issues by:

  1. Renaming the function to the more general printContents().
  2. Moving the function's interface to the RooAbsData base class as a pure virtual function, ensuring all future dataset types must implement it.
  3. Providing a new, robust implementation for both RooDataHist and RooDataSet.
  4. Making the new implementation fully multi-dimensional, with correct handling of mixed data types (RooRealVar, RooCategory).

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

Testing

The new feature was validated with a local test script, which confirms that the new printContents() function works correctly for both RooDataSet and RooDataHist under various conditions, including:

  • Multi-dimensional data (3D + a category).
  • Correct handling of mixed data types (RooRealVar and RooCategory).
  • Robustness with empty datasets.
  • Correct printing of non-integer event weights.
  • Successful redirection of output to a file.

The full test script is provided below for review.

#include "RooRealVar.h"
#include "RooCategory.h"
#include "RooDataSet.h"
#include "RooDataHist.h"
#include <iostream>
#include <fstream>

using namespace RooFit;

void tougher_test() {
    std::cout << "\n=== Starting tougher test for the new printContents() feature ===\n" << std::endl;

    // --- 1. Define Observables (3D + Category) ---
    RooRealVar x("x", "x observable", -5, 5);
    RooRealVar y("y", "y observable", -5, 5);
    RooRealVar z("z", "z observable", -5, 5);
    RooCategory cat("cat", "event category");
    cat.defineType("type1");
    cat.defineType("type2");
    cat.defineType("type3"); // extra category
    RooArgSet observables(x, y, z, cat);

    // ====================================================================
    // --- 2. Edge case: Empty dataset ---
    // ====================================================================
    RooDataSet emptyData("empty", "Empty dataset", observables);
    std::cout << "--- Testing RooDataSet::printContents() on empty dataset ---\n";
    emptyData.printContents();

    // ====================================================================
    // --- 3. RooDataSet with mixed weights ---
    // ====================================================================
    RooDataSet dataSet("dataSet", "Complex 3D Weighted Data", observables, WeightVar("w"));

    // Add events with different weights
    x = 1.1; y = -1.1; z = 0.5; cat.setLabel("type1");
    dataSet.add(observables, 3.5);   // normal positive weight

    x = -2.2; y = 2.2; z = -1.5; cat.setLabel("type2");
    dataSet.add(observables, 0.0);   // zero weight

    x = 3.3; y = 3.3; z = 2.5; cat.setLabel("type3");
    dataSet.add(observables, -1.0);  // negative weight (allowed in some fits)

    std::cout << "\n--- Testing RooDataSet::printContents() with mixed weights ---\n";
    dataSet.printContents();
    std::cout << "... RooDataSet test complete ...\n" << std::endl;

    // ====================================================================
    // --- 4. RooDataHist from the dataset ---
    // ====================================================================
    x.setBins(3);
    y.setBins(2);
    z.setBins(2);

    RooDataHist dataHist("dataHist", "Binned Data", observables, dataSet);

    std::cout << "--- Testing RooDataHist::printContents() with 3D histogram ---\n";
    dataHist.printContents();
    std::cout << "... RooDataHist test complete ...\n" << std::endl;

    // ====================================================================
    // --- 5. File output test ---
    // ====================================================================
    std::ofstream outFile("test_output.txt");
    if (!outFile.good()) {
        std::cerr << "Error: Could not open test_output.txt for writing.\n";
        return;
    }

    outFile << "--- RooDataHist written to file ---\n";
    dataHist.printContents(outFile);
    outFile.close();
    std::cout << "Successfully wrote RooDataHist contents to 'test_output.txt'.\n" << std::endl;

    // ====================================================================
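    // --- 6. Many entries test ---
    // ====================================================================
    // Note: no WeightVar is declared for this dataset, so the weights passed
    // to add() below are ignored and every entry is stored with weight 1.
    RooDataSet largeData("largeData", "Large dataset test", observables);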

    for (int i = 0; i < 50; ++i) {
        x = (i % 10) - 5;
        y = (i % 5) - 2.5;
        z = (i % 7) * 0.5;
        cat.setIndex(i % 3);
        largeData.add(observables, 1.0 + (i % 3) * 0.25);
    }

    std::cout << "--- Testing RooDataSet::printContents() on large dataset (50 entries) ---\n";
    largeData.printContents();
    std::cout << "... Large dataset test complete ...\n" << std::endl;

    std::cout << "=== Tougher test finished ===\n" << std::endl;
}

The Test Result

Processing tougher_test.cpp...

=== Starting tougher test for the new printContents() feature ===

--- Testing RooDataSet::printContents() on empty dataset ---
Contents of RooDataSet "empty"
(dataset is empty)

--- Testing RooDataSet::printContents() with mixed weights ---
Contents of RooDataSet "dataSet"
  Entry 0: x=1.1, y=-1.1, z=0.5, cat=type1, weight=3.5
  Entry 1: x=-2.2, y=2.2, z=-1.5, cat=type2, weight=0
  Entry 2: x=3.3, y=3.3, z=2.5, cat=type3, weight=-1
... RooDataSet test complete ...

--- Testing RooDataHist::printContents() with 3D histogram ---
Contents of RooDataHist "dataHist"
  Bin 0: x=-3.33333, y=-2.5, z=-2.5, cat=type1, weight=0 +/- [0,0]
  Bin 1: x=-3.33333, y=-2.5, z=-2.5, cat=type2, weight=0 +/- [0,0]
  Bin 2: x=-3.33333, y=-2.5, z=-2.5, cat=type3, weight=0 +/- [0,0]
  Bin 3: x=-3.33333, y=-2.5, z=2.5, cat=type1, weight=0 +/- [0,0]
  Bin 4: x=-3.33333, y=-2.5, z=2.5, cat=type2, weight=0 +/- [0,0]
  Bin 5: x=-3.33333, y=-2.5, z=2.5, cat=type3, weight=0 +/- [0,0]
  Bin 6: x=-3.33333, y=2.5, z=-2.5, cat=type1, weight=0 +/- [0,0]
  Bin 7: x=-3.33333, y=2.5, z=-2.5, cat=type2, weight=0 +/- [0,0]
  Bin 8: x=-3.33333, y=2.5, z=-2.5, cat=type3, weight=0 +/- [0,0]
  Bin 9: x=-3.33333, y=2.5, z=2.5, cat=type1, weight=0 +/- [0,0]
  Bin 10: x=-3.33333, y=2.5, z=2.5, cat=type2, weight=0 +/- [0,0]
  Bin 11: x=-3.33333, y=2.5, z=2.5, cat=type3, weight=0 +/- [0,0]
  Bin 12: x=2.22045e-16, y=-2.5, z=-2.5, cat=type1, weight=0 +/- [0,0]
  Bin 13: x=2.22045e-16, y=-2.5, z=-2.5, cat=type2, weight=0 +/- [0,0]
  Bin 14: x=2.22045e-16, y=-2.5, z=-2.5, cat=type3, weight=0 +/- [0,0]
  Bin 15: x=2.22045e-16, y=-2.5, z=2.5, cat=type1, weight=3.5 +/- [3.5,3.5]
  Bin 16: x=2.22045e-16, y=-2.5, z=2.5, cat=type2, weight=0 +/- [0,0]
  Bin 17: x=2.22045e-16, y=-2.5, z=2.5, cat=type3, weight=0 +/- [0,0]
  Bin 18: x=2.22045e-16, y=2.5, z=-2.5, cat=type1, weight=0 +/- [0,0]
  Bin 19: x=2.22045e-16, y=2.5, z=-2.5, cat=type2, weight=0 +/- [0,0]
  Bin 20: x=2.22045e-16, y=2.5, z=-2.5, cat=type3, weight=0 +/- [0,0]
  Bin 21: x=2.22045e-16, y=2.5, z=2.5, cat=type1, weight=0 +/- [0,0]
  Bin 22: x=2.22045e-16, y=2.5, z=2.5, cat=type2, weight=0 +/- [0,0]
  Bin 23: x=2.22045e-16, y=2.5, z=2.5, cat=type3, weight=0 +/- [0,0]
  Bin 24: x=3.33333, y=-2.5, z=-2.5, cat=type1, weight=0 +/- [0,0]
  Bin 25: x=3.33333, y=-2.5, z=-2.5, cat=type2, weight=0 +/- [0,0]
  Bin 26: x=3.33333, y=-2.5, z=-2.5, cat=type3, weight=0 +/- [0,0]
  Bin 27: x=3.33333, y=-2.5, z=2.5, cat=type1, weight=0 +/- [0,0]
  Bin 28: x=3.33333, y=-2.5, z=2.5, cat=type2, weight=0 +/- [0,0]
  Bin 29: x=3.33333, y=-2.5, z=2.5, cat=type3, weight=0 +/- [0,0]
  Bin 30: x=3.33333, y=2.5, z=-2.5, cat=type1, weight=0 +/- [0,0]
  Bin 31: x=3.33333, y=2.5, z=-2.5, cat=type2, weight=0 +/- [0,0]
  Bin 32: x=3.33333, y=2.5, z=-2.5, cat=type3, weight=0 +/- [0,0]
  Bin 33: x=3.33333, y=2.5, z=2.5, cat=type1, weight=0 +/- [0,0]
  Bin 34: x=3.33333, y=2.5, z=2.5, cat=type2, weight=0 +/- [0,0]
  Bin 35: x=3.33333, y=2.5, z=2.5, cat=type3, weight=-1 +/- [1,1]
... RooDataHist test complete ...

Successfully wrote RooDataHist contents to 'test_output.txt'.

An event weight/error was passed but no weight variable was defined in the dataset 'largeData'. The weight will be ignored.
An event weight/error was passed but no weight variable was defined in the dataset 'largeData'. The weight will be ignored.
An event weight/error was passed but no weight variable was defined in the dataset 'largeData'. The weight will be ignored.
An event weight/error was passed but no weight variable was defined in the dataset 'largeData'. The weight will be ignored.
An event weight/error was passed but no weight variable was defined in the dataset 'largeData'. The weight will be ignored.
--- Testing RooDataSet::printContents() on large dataset (50 entries) ---
Contents of RooDataSet "largeData"
  Entry 0: x=-5, y=-2.5, z=0, cat=type1, weight=1
  Entry 1: x=-4, y=-1.5, z=0.5, cat=type2, weight=1
  Entry 2: x=-3, y=-0.5, z=1, cat=type3, weight=1
  Entry 3: x=-2, y=0.5, z=1.5, cat=type1, weight=1
  Entry 4: x=-1, y=1.5, z=2, cat=type2, weight=1
  Entry 5: x=0, y=-2.5, z=2.5, cat=type3, weight=1
  Entry 6: x=1, y=-1.5, z=3, cat=type1, weight=1
  Entry 7: x=2, y=-0.5, z=0, cat=type2, weight=1
  Entry 8: x=3, y=0.5, z=0.5, cat=type3, weight=1
  Entry 9: x=4, y=1.5, z=1, cat=type1, weight=1
  Entry 10: x=-5, y=-2.5, z=1.5, cat=type2, weight=1
  Entry 11: x=-4, y=-1.5, z=2, cat=type3, weight=1
  Entry 12: x=-3, y=-0.5, z=2.5, cat=type1, weight=1
  Entry 13: x=-2, y=0.5, z=3, cat=type2, weight=1
  Entry 14: x=-1, y=1.5, z=0, cat=type3, weight=1
  Entry 15: x=0, y=-2.5, z=0.5, cat=type1, weight=1
  Entry 16: x=1, y=-1.5, z=1, cat=type2, weight=1
  Entry 17: x=2, y=-0.5, z=1.5, cat=type3, weight=1
  Entry 18: x=3, y=0.5, z=2, cat=type1, weight=1
  Entry 19: x=4, y=1.5, z=2.5, cat=type2, weight=1
  Entry 20: x=-5, y=-2.5, z=3, cat=type3, weight=1
  Entry 21: x=-4, y=-1.5, z=0, cat=type1, weight=1
  Entry 22: x=-3, y=-0.5, z=0.5, cat=type2, weight=1
  Entry 23: x=-2, y=0.5, z=1, cat=type3, weight=1
  Entry 24: x=-1, y=1.5, z=1.5, cat=type1, weight=1
  Entry 25: x=0, y=-2.5, z=2, cat=type2, weight=1
  Entry 26: x=1, y=-1.5, z=2.5, cat=type3, weight=1
  Entry 27: x=2, y=-0.5, z=3, cat=type1, weight=1
  Entry 28: x=3, y=0.5, z=0, cat=type2, weight=1
  Entry 29: x=4, y=1.5, z=0.5, cat=type3, weight=1
  Entry 30: x=-5, y=-2.5, z=1, cat=type1, weight=1
  Entry 31: x=-4, y=-1.5, z=1.5, cat=type2, weight=1
  Entry 32: x=-3, y=-0.5, z=2, cat=type3, weight=1
  Entry 33: x=-2, y=0.5, z=2.5, cat=type1, weight=1
  Entry 34: x=-1, y=1.5, z=3, cat=type2, weight=1
  Entry 35: x=0, y=-2.5, z=0, cat=type3, weight=1
  Entry 36: x=1, y=-1.5, z=0.5, cat=type1, weight=1
  Entry 37: x=2, y=-0.5, z=1, cat=type2, weight=1
  Entry 38: x=3, y=0.5, z=1.5, cat=type3, weight=1
  Entry 39: x=4, y=1.5, z=2, cat=type1, weight=1
  Entry 40: x=-5, y=-2.5, z=2.5, cat=type2, weight=1
  Entry 41: x=-4, y=-1.5, z=3, cat=type3, weight=1
  Entry 42: x=-3, y=-0.5, z=0, cat=type1, weight=1
  Entry 43: x=-2, y=0.5, z=0.5, cat=type2, weight=1
  Entry 44: x=-1, y=1.5, z=1, cat=type3, weight=1
  Entry 45: x=0, y=-2.5, z=1.5, cat=type1, weight=1
  Entry 46: x=1, y=-1.5, z=2, cat=type2, weight=1
  Entry 47: x=2, y=-0.5, z=2.5, cat=type3, weight=1
  Entry 48: x=3, y=0.5, z=3, cat=type1, weight=1
  Entry 49: x=4, y=1.5, z=0, cat=type2, weight=1
... Large dataset test complete ...

=== Tougher test finished ===
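
The bin numbering in the RooDataHist output above is consistent with a row-major layout over (x, y, z, cat) with 3, 2, 2 and 3 bins respectively, where cat varies fastest. The sketch below reproduces the two non-empty bin numbers and the uniform bin centers. The helper names globalBin and binCenter are hypothetical, and the layout is inferred from the printed output rather than taken from the RooFit source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; not part of the RooFit API.

// Row-major flattening: the first dimension varies slowest, the last fastest.
std::size_t globalBin(const std::vector<std::size_t> &idx,
                      const std::vector<std::size_t> &nBins) {
    assert(idx.size() == nBins.size());
    std::size_t bin = 0;
    for (std::size_t d = 0; d < nBins.size(); ++d) {
        bin = bin * nBins[d] + idx[d];
    }
    return bin;
}

// Center of bin i when [lo, hi] is split into n uniform bins.
double binCenter(double lo, double hi, std::size_t n, std::size_t i) {
    const double width = (hi - lo) / static_cast<double>(n);
    return lo + (static_cast<double>(i) + 0.5) * width;
}
```

For example, the event (x=1.1, y=-1.1, z=0.5, type1) has per-dimension indices (1, 0, 1, 0), giving 1*12 + 0*6 + 1*3 + 0 = 15, and the event (x=3.3, y=3.3, z=2.5, type3) has indices (2, 1, 1, 2), giving 35. These match the two bins with non-zero weight in the log. The printed x=2.22045e-16 for the middle x bin is nominally 0; the tiny offset looks like floating-point rounding in the center computation.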

This PR fixes #7822

github-actions bot commented Sep 8, 2025

Test Results

    20 files      20 suites   3d 21h 53m 13s ⏱️
 3 658 tests  3 650 ✅   0 💤 8 ❌
71 482 runs  71 374 ✅ 100 💤 8 ❌

For more details on these failures, see this check.

Results for commit 871f3a3.

@JasMehta08
Contributor Author

@guitargeek I went through the logs of the failing tests, and the failures are due to errors in the CI runs themselves rather than in the code (timeout errors, file-not-found errors, etc.). Could you please rerun them or have a look?

Thanks!

@JasMehta08
Contributor Author

JasMehta08 commented Sep 13, 2025

@guitargeek Are there any comments, or anything that I should change in this PR?

Thanks!

Contributor

@guitargeek guitargeek left a comment

This is perfect. Thank you so much for this work, and also for including the demo script in the PR description! Very good that you are also handling the distinction between RooRealVar and RooCategory correctly and have a fallback for potential other types 👍

@guitargeek guitargeek merged commit 3e28f0f into root-project:master Sep 16, 2025
22 of 27 checks passed

Successfully merging this pull request may close these issues.

[RF] RooDataHist::printDataHistogram should be renamed and moved
3 participants