Commit 7d9a245

rstz authored and copybara-github committed
Prepare release of TF-DF 1.9.0 and update installation instructions
PiperOrigin-RevId: 615495475
1 parent 632d813 commit 7d9a245

10 files changed: +263 -99 lines changed

CHANGELOG.md

Lines changed: 8 additions & 1 deletion
@@ -1,6 +1,6 @@
 # Changelog

-## 1.9.0rc0 - 2024-02-26
+## 1.9.0 - 2024-03-12

 ### Fix

@@ -10,8 +10,15 @@
 ### Features

 - Compatibility with TensorFlow 2.16.0rc0.
+- Expose new parameter sparse_oblique_max_num_projections.
 - Using tf_keras instead tf.keras in examples, documentation.
 - Support NAConditions for fast engine.
+- Faster model loading for models with many features and dense oblique
+  conditions.
+
+### Documentation
+
+- Clarified documentation of parameters for oblique splits.

 ## 1.8.1 - 2023-11-17

README.md

Lines changed: 0 additions & 2 deletions
@@ -68,8 +68,6 @@ The following resources are available:
 - [Issue tracker](https://github.com/tensorflow/decision-forests/issues)
 - [Known issues](documentation/known_issues.md)
 - [Changelog](CHANGELOG.md)
-- [TensorFlow Forum](https://discuss.tensorflow.org) (on
-  discuss.tensorflow.org)
 - [More examples](documentation/more_examples.md)

 ## Installation

WORKSPACE

Lines changed: 3 additions & 3 deletions
@@ -11,9 +11,9 @@ load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
 # absl used by tensorflow.
 http_archive(
     name = "org_tensorflow",
-    strip_prefix = "tensorflow-2.15.0",
-    sha256 = "9cec5acb0ecf2d47b16891f8bc5bc6fbfdffe1700bdadc0d9ebe27ea34f0c220",
-    urls = ["https://github.com/tensorflow/tensorflow/archive/v2.15.0.zip"],
+    strip_prefix = "tensorflow-2.16.1",
+    sha256 = "c729e56efc945c6df08efe5c9f5b8b89329c7c91b8f40ad2bb3e13900bd4876d",
+    urls = ["https://github.com/tensorflow/tensorflow/archive/v2.16.1.tar.gz"],
     # Starting with TF 2.14, disable hermetic Python builds.
     patch_args = ["-p1"],
     patches = ["//third_party/tensorflow:tf.patch"],

configure/setup.py

Lines changed: 11 additions & 6 deletions
@@ -21,20 +21,20 @@
 from setuptools.command.install import install
 from setuptools.dist import Distribution

-_VERSION = "1.9.0rc0"
+_VERSION = "1.9.0"

 with open("README.md", "r", encoding="utf-8") as fh:
   long_description = fh.read()

 REQUIRED_PACKAGES = [
     "numpy",
     "pandas",
-    "tensorflow~=2.16.0rc0",
+    "tensorflow~=2.16.1",
     "six",
     "absl_py",
     "wheel",
     "wurlitzer",
-    "tf_keras~=2.16.0rc2",
+    "tf_keras~=2.16",
 ]


@@ -84,8 +84,10 @@ def get_tag(self):
     name="tensorflow_decision_forests",
     version=_VERSION,
     author="Google Inc.",
-    author_email="[email protected]",
-    description="Collection of training and inference decision forest algorithms.",
+    author_email="[email protected]",
+    description=(
+        "Collection of training and inference decision forest algorithms."
+    ),
     long_description=long_description,
     long_description_content_type="text/markdown",
     url="https://github.com/tensorflow/decision-forests",
@@ -113,7 +115,10 @@ def get_tag(self):
     packages=setuptools.find_packages(),
     python_requires=">=3.9",
     license="Apache 2.0",
-    keywords="tensorflow tensor machine learning decision forests random forest gradient boosted decision trees",
+    keywords=(
+        "tensorflow tensor machine learning decision forests random forest"
+        " gradient boosted decision trees"
+    ),
     install_requires=REQUIRED_PACKAGES,
     include_package_data=True,
     zip_safe=False,
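
The `~=` pins changed in `REQUIRED_PACKAGES` are PEP 440 "compatible release" clauses. The following is a rough sketch of what the two new pins accept, assuming plain numeric versions only (no pre-releases or local versions). The helper `compatible_release` is a made-up name for this illustration and is not part of setuptools or pip.

```python
def _parse(version: str) -> tuple[int, ...]:
    """Parse a plain numeric version such as "2.16.1" into (2, 16, 1)."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(spec: str, candidate: str) -> bool:
    """Approximate PEP 440 `~=spec` for numeric-only versions.

    `~=X.Y.Z` means `>= X.Y.Z` and `== X.Y.*`;
    `~=X.Y` means `>= X.Y` and `== X.*`.
    """
    base = _parse(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two version components")
    cand = _parse(candidate)
    prefix = base[:-1]
    # Pad with zeros so that "2.17" compares like "2.17.0".
    width = max(len(base), len(cand))
    padded_base = base + (0,) * (width - len(base))
    padded_cand = cand + (0,) * (width - len(cand))
    return padded_cand >= padded_base and padded_cand[: len(prefix)] == prefix


if __name__ == "__main__":
    print(compatible_release("2.16.1", "2.16.2"))  # tensorflow~=2.16.1
    print(compatible_release("2.16.1", "2.17.0"))
    print(compatible_release("2.16", "2.17.0"))    # tf_keras~=2.16
```

Under this reading, `tensorflow~=2.16.1` stays within the 2.16 series, while `tf_keras~=2.16` also admits later 2.x releases.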

documentation/installation.md

Lines changed: 104 additions & 53 deletions
@@ -10,15 +10,15 @@
 * [Table of Contents](#table-of-contents)
 * [Installation with Pip](#installation-with-pip)
 * [Build from source](#build-from-source)
+* [Technical details](#technical-details)
 * [Linux](#linux)
-* [Setup](#setup)
-* [Compilation](#compilation)
+* [Docker build](#docker-build)
+* [Manual build](#manual-build)
 * [MacOS](#macos)
-* [Setup](#setup-1)
-* [Building / Packaging (Apple CPU)](#building---packaging-apple-cpu)
+* [Setup](#setup)
+* [Arm64 CPU](#arm64-cpu)
 * [Cross-compiling for Intel CPUs](#cross-compiling-for-intel-cpus)
-* [Final note](#final-note)
-* [Troubleshooting](#troubleshooting)
+* [Windows](#windows)

 <!--te-->

@@ -44,24 +44,74 @@ python3 -c "import tensorflow_decision_forests as tfdf; print('Found TF-DF v' +

 ## Build from source

+### Technical details
+
+TensorFlow Decision Forests (TF-DF) implements custom ops for TensorFlow and
+therefore depends on TensorFlow's ABI. Since the ABI can change between
+versions, any TF-DF version is only compatible with one specific TensorFlow
+version.
+
+To avoid compiling and shipping all of TensorFlow with TF-DF, TF-DF
+links against libtensorflow shared library that is distributed with TensorFlow's
+Pip package. Only a small part of Tensorflow is compiled and compilation only
+takes ~10 minutes on a strong workstation (instead of multiple hours when
+compiling all of TensorFlow). To ensure this works, the version of TensorFlow
+that is actually compiled and the libtensorflow shared library must match
+exactly.
+
+The `tools/test_bazel.sh` script configures the TF-DF build to ensure the
+versions of the packages used match. For details on this process, see the source
+code of this script. Since TensorFlow compilation changes often, it only
+supports building with the most recent TensorFlow versions and nightly.
+
+**Note**: When distributing builds, you may set the `__git_version__` string in
+`tensorflow_decision_forests/__init__.py` to identify the commit you built from.
+
 ### Linux

-#### Setup
+#### Docker build
+
+The easiest way to build TF-DF on Linux is by using TensorFlow's build
+[Build docker](https://github.com/tensorflow/build). Just run the following
+steps to build:
+
+```shell
+./tools/start_compile_docker.sh  # Start the docker, might require root
+export RUN_TESTS=1  # Whether to run tests after build
+export PY_VERSION=3.9  # Python version to use for build
+# TensorFlow version to compile against. This must match exactly the version
+# of TensorFlow used at runtime, otherwise TF-DF may crash unexpectedly.
+export TF_VERSION=2.16.1  # Set to "nightly" for building with tf-nightly
+./tools/test_bazel.sh
+```
+
+This places the compiled C++ code in the `bazel-bin` directory. Note that this
+is a symbolic link that is not exposed outside the container (i.e. the build is
+gone after leaving the container).
+
+For building the wheels, run
+```shell
+tools/build_pip_package.sh ALL_VERSIONS INSTALL_PYENV
+```
+
+This will install [Pyenv](https://github.com/pyenv/pyenv) and
+[Pyenv-virtualenv](https://github.com/pyenv/pyenv-virtualenv) inside the docker
+and use it to install Python in all supported versions for building. The wheels
+are placed in the `dist/` subdirectory.
+
+#### Manual build
+
+Building TF-DF without the docker might be harder, and the team is probably not
+able to help with this.

 **Requirements**

-- Bazel >= 3.7.2
+- Bazel >= 6.3.0
 - Python >= 3
 - Git
-- Python packages: numpy tensorflow pandas
-
-Instead of installing the dependencies by hands, you can use the
-[TensorFlow Build docker](https://github.com/tensorflow/build). If you choose
-this options, install Docker:
+- Pyenv, Pyenv-virtualenv (only if packaging for many Python versions)

-- [Docker](https://docs.docker.com/get-docker/).
-
-#### Compilation
+**Building**

 Download TensorFlow Decision Forests as follows:

@@ -71,31 +121,22 @@ git clone https://github.com/tensorflow/decision-forests.git
 cd decision-forests
 ```

-**Optional:** TensorFlow Decision Forests depends on
+*Optional:* TensorFlow Decision Forests depends on
 [Yggdrasil Decision Forests](https://github.com/google/yggdrasil-decision-forests)
 . If you want to edit the Yggdrasil code, you can clone the Yggdrasil repository
 and change the path accordingly in
 `third_party/yggdrasil_decision_forests/workspace.bzl`.

-**Optional:** If you want to use the docker option, run the
-`start_compile_docker.sh` script and continue to the next step. If you don't
-want to use the docker option, continue to the next step directly.
-
-```shell
-# Optional: Install and start the build docker.
-./tools/start_compile_docker.sh
-```
-
 Compile and run the unit tests of TF-DF with the following command. Note that
-`test_bazel.sh` is configured for `python3.8` and the default compiler on your
-machine. Edit the file directly to change this configuration.
+`test_bazel.sh` is configured for the default compiler on your machine. Edit the
+file directly to change this configuration.

 ```shell
 # Build and test TF-DF.
-./tools/test_bazel.sh
+RUN_TESTS=1 PY_VERSION=3.9 TF_VERSION=2.16.1 ./tools/test_bazel.sh
 ```

-Create and test a pip package with the following command. Replace python3.8 by
+Create and test a pip package with the following command. Replace python3.9 by
 the version of python you want to use. Note that you don't have to use the same
 version of Python as in the `test_bazel.sh` script.

@@ -154,25 +195,28 @@ For MacOS systems with ARM64 CPU, follow these steps:

 1. Prepare your environment

-    ```
+    ```shell
     git clone https://github.com/tensorflow/decision-forests.git
     python3 -m venv venv
-    source venv/source/activate
+    source venv/bin/activate
     ```

 1. Decide which Python version and TensorFlow version you want to use and run

-    ```
+    ```shell
     cd decision-forests
-    export TF_VERSION=2.15.0  # Change to the TensorFlow Version you need.
-    export PY_VERSION=3.9  # Change to the Python you need.
-    export RUN_TESTS=1  # Change to 0 if you want to skip tests.
-    ./tools/test_bazel.sh  # Takes ~15 minutes on a modern Mac.
+    bazel clean --expunge  # Remove old builds (esp. cross-compiled).
+    export RUN_TESTS=1  # Whether to run tests after build.
+    export PY_VERSION=3.9  # Python version to use for build.
+    # TensorFlow version to compile against. This must match exactly the version
+    # of TensorFlow used at runtime, otherwise TF-DF may crash unexpectedly.
+    export TF_VERSION=2.16.1
+    ./tools/test_bazel.sh  # Takes ~15 minutes on a modern Mac.
     ```

-1. Package the code.
+1. Package the build.

-    ```
+    ```shell
     # Building the packages uses different virtualenvs through Pyenv.
     deactivate
     # Build the packages.
@@ -188,36 +232,43 @@ machines with Intel CPUs as follows.

 1. Prepare your environment

-    ```
+    ```shell
     git clone https://github.com/tensorflow/decision-forests.git
     python3 -m venv venv
     source venv/source/activate
     ```

 1. Decide which Python version you want to use and run

-    ```
+    ```shell
     cd decision-forests
-    export TF_VERSION=2.15.0  # Change to the TensorFlow Version you need.
-    export PY_VERSION=3.9  # Change to the Python you need.
-    export RUN_TESTS=0  # Cross-compiled packages cannot be tested.
-    export MAC_INTEL_CROSSCOMPILE=1
-    ./tools/test_bazel.sh  # Takes ~15 minutes on a modern Mac.
+    bazel clean --expunge  # Remove old builds (esp. cross-compiled).
+    export RUN_TESTS=0  # Cross-compiled builds can't run tests.
+    export PY_VERSION=3.9  # Python version to use for build.
+    # TensorFlow version to compile against. This must match exactly the version
+    # of TensorFlow used at runtime, otherwise TF-DF may crash unexpectedly.
+    export TF_VERSION=2.16.1
+    export MAC_INTEL_CROSSCOMPILE=1  # Enable cross-compilation.
+    ./tools/test_bazel.sh  # Takes ~15 minutes on a modern Mac.
     ```

-1. Package the code.
+1. Package the build.

-    ```
+    ```shell
     # Building the packages uses different virtualenvs through Pyenv.
     deactivate
     # Build the packages.
     ./tools/build_pip_package.sh ALL_VERSIONS_MAC_INTEL_CROSSCOMPILE
     ```

-1. The packages can be found in `decision-forests/dist/`.
+1. The packages can be found in `decision-forests/dist/`. Note that they have
+    not been tested and it would be prudent to test them before distribution.
+
+### Windows

-## Final note
+A Windows build has been successfully produced in the past, but is not
+maintained at this point. See `tools/test_bazel.bat` and `tools/test_bazel.sh`
+for (possibly outdated) pointers for compiling on Windows.

-Compiling TF-DF relies on the TensorFlow Pip package *and* the TensorFlow Bazel
-dependency. Only a small part of TensorFlow will be compiled.
-Compiling TF-DF on a single powerful workstation takes ~10 minutes.
+For Windows users, [YDF](https://ydf.readthedocs.io) offers official Windows
+builds and most of the functionality (and more!) of TF-DF.
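
The build snippets in this file repeat that the TensorFlow version compiled against must exactly match the one used at runtime. One way to guard against a mismatch is sketched below, assuming the build version is recorded somewhere by the packager. The function name `check_tf_matches_build` is hypothetical and not part of TF-DF, and the sketch only looks up the `tensorflow` distribution, so it does not cover `tf-nightly` builds.

```python
import importlib.metadata
from typing import Callable


def check_tf_matches_build(
    build_tf_version: str,
    get_version: Callable[[str], str] = importlib.metadata.version,
) -> None:
    """Raise RuntimeError unless the installed TensorFlow equals the build version.

    `get_version` is injectable so the check can be exercised without
    TensorFlow installed; by default it queries the installed distributions.
    """
    installed = get_version("tensorflow")
    if installed != build_tf_version:
        raise RuntimeError(
            f"TF-DF was built against TensorFlow {build_tf_version}, "
            f"but TensorFlow {installed} is installed."
        )


if __name__ == "__main__":
    # Same value as TF_VERSION in the build commands of this commit.
    check_tf_matches_build("2.16.1")
```

Failing early with a readable message is an alternative to the undefined-symbol errors described in `documentation/known_issues.md`.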

documentation/known_issues.md

Lines changed: 18 additions & 8 deletions
@@ -17,16 +17,26 @@ TensorFlow Decision Forests is not yet available as a Windows Pip package.
 [Windows Subsystem for Linux (WSL)](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux)
 on your Windows machine and follow the Linux instructions.

+## Incompatibility with Keras 3
+
+Compatibility with Keras 3 is not yet implemented. Use tf_keras or a TensorFlow
+version before 2.16.
+
+## Untested for conda
+
+While TF-DF might work with Conda, this is not tested and we currently do not
+maintain packages on conda-forge.
+
 ## Incompatibility with old or nightly versions of TensorFlow

-TensorFlow [ABI](https://en.wikipedia.org/wiki/Application_binary_interface) is
-not compatible in between releases. Because TF-DF relies on custom TensorFlow
+TensorFlow's [ABI](https://en.wikipedia.org/wiki/Application_binary_interface)
+is not compatible in between releases. Because TF-DF relies on custom TensorFlow
 C++ ops, each version of TF-DF is tied to a specific version of TensorFlow. The
 last released version of TF-DF is always tied to the last released version of
 TensorFlow.

-For reasons, the current version of TF-DF might not be compatible with older
-versions or with the nightly build of TensorFlow.
+For these reasons, the current version of TF-DF might not be compatible with
+older versions or with the nightly build of TensorFlow.

 If using incompatible versions of TF and TF-DF, you will see cryptic errors such
 as:
@@ -37,16 +47,16 @@ tensorflow_decision_forests/tensorflow/ops/training/training.so: undefined symbo

 - Use the version of TF-DF that is compatible with your version of TensorFlow.

-Note that TF-DF is not compatible with Keras 3 at this time.
-
 ### Compatibility table

 The following table shows the compatibility between
 `tensorflow_decision_forests` and its dependencies:

 tensorflow_decision_forests | tensorflow
 --------------------------- | ---------------
-1.6.0                       | 2.14.0
+1.9.0                       | 2.16.1
+1.8.0 - 1.8.1               | 2.15.0
+1.6.0 - 1.7.0               | 2.14.0
 1.5.0                       | 2.13.0
 1.3.0 - 1.4.0               | 2.12.0
 1.1.0 - 1.2.0               | 2.11.0
@@ -72,7 +82,7 @@ does.

 **Workarounds:**

-- Use a model that support distribution strategies (e.g.
+- Use a model that supports distribution strategies (e.g.
   `DistributedGradientBoostedTreesModel`), or downsample your dataset so that
   it fits on a single machine.
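
The compatibility table updated in this file can also be read as a lookup. The sketch below copies only the rows visible in this diff (older rows fall outside the hunk) and treats each "a - b" entry as an inclusive range, which is an assumption. `required_tensorflow` is an illustrative name, not a TF-DF API.

```python
from typing import Optional

# (first TF-DF version, last TF-DF version, TensorFlow version).
# Rows copied from the hunk in documentation/known_issues.md; rows for
# older releases are not visible in this commit and are omitted.
_COMPAT_ROWS = [
    ((1, 9, 0), (1, 9, 0), "2.16.1"),
    ((1, 8, 0), (1, 8, 1), "2.15.0"),
    ((1, 6, 0), (1, 7, 0), "2.14.0"),
    ((1, 5, 0), (1, 5, 0), "2.13.0"),
    ((1, 3, 0), (1, 4, 0), "2.12.0"),
    ((1, 1, 0), (1, 2, 0), "2.11.0"),
]


def required_tensorflow(tfdf_version: str) -> Optional[str]:
    """Return the TensorFlow version paired with a TF-DF release, if listed."""
    key = tuple(int(part) for part in tfdf_version.split("."))
    for first, last, tf_version in _COMPAT_ROWS:
        # Assumption: ranges in the table are inclusive on both ends.
        if first <= key <= last:
            return tf_version
    return None


if __name__ == "__main__":
    print(required_tensorflow("1.9.0"))
    print(required_tensorflow("1.8.1"))
```

A `None` result means the release is not among the rows copied here, not necessarily that it is unsupported.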
