Skip to content

Commit c340913

Browse files
arrdemmyrrlyn
andauthored
feat(py_venv): activate-less interpreter (#629)
This patch reworks how the interpreter shim operates to eliminate the data dependency on the `activate` script having been evaluated and move the parameters which previously came via the shell environment. Because the behavior of `activate` isn't actually specified, we previously assumed that it was safe to just extend `activate`. This isn't true at all it turns out. Under normal operation the interpreter initialization as implemented in `Modules/getpath.py` provides special behavior when the interpreter's observed filename (`argv[0]`) is a symlink as it is in a normal non-relocatable virtualenv. In this case the interpreter and will search for a `pyvenv.cfg` relative to the "interpreter" and implicitly activate the venv as needed. Our `./bin/python` being an actual binary interrupts this which means we're on our own to make `getpath` and `site` do the needful. This should be as simple as setting the `home=` key in the `pyvenv.cfg`, but there isn't a principled way other than going through the runfiles library to locate where the interpreter lands in the runfiles tree relative to the virtualenv. Ideally we'd set the `home=` key to a relative path and mostly move on. Instead due in part to bzlmod munging we have to do the repo mapping dance which can't be done statically. So we need to do `rlocation` somewhere. The solution this patch comes to is moving that `rlocation` logic out of the `activate` script and into the launcher itself so that the launcher can search for the needed `$PYTHONHOME` and desired interpreter without the user having previously loaded the required parameters into the env via `activate`. Fixes #594. Fixes #631. ### Changes are visible to end-users: yes - Searched for relevant documentation and updated as needed: yes - Breaking change (forces users to change their own code or config): yes - Suggested release notes appear below: yes The `py_venv_binary` and friends no longer depend on their virtualenvs being `activated` via the shell. Their interpreters can safely be invoked directly as under an IDE or in a Spark environment. The `py_venv_binary` now no longer includes NEITHER the user's site-packages NOR the interpreter's site-packages directory unless the `enable_system_site_packages` or `enable_user_site_packages` attributes are explicitly set. This prevents accidental non-hermetic imports. ### Test plan - Covered by existing test cases - Manual testing; please provide instructions so we can reproduce: ```shellsession ❯ aspect build //py/tests/py-internal-venv:test && bazel-bin/py/tests/py-internal-venv/test.runfiles/aspect_rules_py/py/tests/py-internal-venv/.test/bin/python -c 'import sys; from pprint import pprint; pprint(sys.path)' INFO: Analyzed target //py/tests/py-internal-venv:test (2 packages loaded, 922 targets configured). INFO: From Compiling Rust bin shim_macos_aarch64_build (1 files): warning: fields `cfg` and `version_info` are never read --> py/tools/venv_shim/src/main.rs:45:5 | 43 | struct PyCfg { | ----- fields in this struct 44 | root: PathBuf, 45 | cfg: PathBuf, | ^^^ 46 | version_info: String, | ^^^^^^^^^^^^ | = note: `PyCfg` has a derived impl for the trait `Debug`, but this is intentionally ignored during dead code analysis = note: `#[warn(dead_code)]` on by default warning: 1 warning emitted INFO: Found 1 target... Target //py/tests/py-internal-venv:test up-to-date: bazel-bin/py/tests/py-internal-venv/test INFO: Elapsed time: 10.289s, Critical Path: 6.72s INFO: 4 processes: 1 internal, 3 darwin-sandbox. INFO: Build completed successfully, 4 total actions "bazel-bin/py/tests/py-internal-venv/test.runfiles/MANIFEST" ['', '/private/var/tmp/_bazel_arrdem/93bfea6cdc1153cc29a75400cd38823a/external/python_toolchain_aarch64-apple-darwin/lib/python39.zip', '/private/var/tmp/_bazel_arrdem/93bfea6cdc1153cc29a75400cd38823a/external/python_toolchain_aarch64-apple-darwin/lib/python3.9', '/private/var/tmp/_bazel_arrdem/93bfea6cdc1153cc29a75400cd38823a/external/python_toolchain_aarch64-apple-darwin/lib/python3.9/lib-dynload', '/Users/arrdem/Documents/work/aspect/rules_py/bazel-bin/py/tests/py-internal-venv/test.runfiles/aspect_rules_py/py/tests/py-internal-venv/.test/lib/python3.9/site-packages', '/Users/arrdem/Documents/work/aspect/rules_py/bazel-bin/py/tests/py-internal-venv/test.runfiles/aspect_rules_py/py/tests/py-internal-venv', '/Users/arrdem/Documents/work/aspect/rules_py/bazel-bin/py/tests/py-internal-venv/test.runfiles/aspect_rules_py'] ``` ### TODO - [x] Need to add automated tests covering entrypoint bypass - [x] Need to add automated tests covering the usersite behavior - [x] Need to add automated tests covering the system site behavior --------- Co-authored-by: Alex Payne <[email protected]>
1 parent c964fb4 commit c340913

File tree

32 files changed

+1259
-220
lines changed

32 files changed

+1259
-220
lines changed

.aspect/workflows/config.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
# See https://docs.aspect.build/workflows/configuration
22
tasks:
33
- test:
4+
queue: aspect-large
45
- finalization:
56
queue: aspect-small
67
notifications:

Cargo.Bazel.lock

Lines changed: 33 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
{
2-
"checksum": "81528401c6d6fde79ca22def4e50d7ebc47f7f182a83a106d38e9e1279da39a5",
2+
"checksum": "f1be4baa5924515e3e3bfe91c4afbfa084b2751aec2f7560df597efbcdc4dc3b",
33
"crates": {
44
"addr2line 0.24.2": {
55
"name": "addr2line",
@@ -14740,6 +14740,37 @@
1474014740
],
1474114741
"license_file": "LICENSE"
1474214742
},
14743+
"runfiles 0.1.0": {
14744+
"name": "runfiles",
14745+
"version": "0.1.0",
14746+
"package_url": "https://github.com/aspect-build/rules_py",
14747+
"repository": null,
14748+
"targets": [
14749+
{
14750+
"Library": {
14751+
"crate_name": "runfiles",
14752+
"crate_root": "src/lib.rs",
14753+
"srcs": {
14754+
"allow_empty": true,
14755+
"include": [
14756+
"**/*.rs"
14757+
]
14758+
}
14759+
}
14760+
}
14761+
],
14762+
"library_target_name": "runfiles",
14763+
"common_attrs": {
14764+
"compile_data_glob": [
14765+
"**"
14766+
],
14767+
"edition": "2021",
14768+
"version": "0.1.0"
14769+
},
14770+
"license": "Apache 2",
14771+
"license_ids": [],
14772+
"license_file": null
14773+
},
1474314774
"rust-netrc 0.1.1": {
1474414775
"name": "rust-netrc",
1474514776
"version": "0.1.1",
@@ -27114,6 +27145,7 @@
2711427145
"binary_crates": [],
2711527146
"workspace_members": {
2711627147
"py 0.1.0": "py/tools/py",
27148+
"runfiles 0.1.0": "py/tools/runfiles",
2711727149
"unpack_bin 0.1.0": "py/tools/unpack_bin",
2711827150
"venv_bin 0.1.0": "py/tools/venv_bin",
2711927151
"venv_shim 0.1.0": "py/tools/venv_shim"

Cargo.lock

Lines changed: 5 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@ members = [
55
"py/tools/venv_bin",
66
"py/tools/unpack_bin",
77
"py/tools/venv_shim",
8+
"py/tools/runfiles",
89
]
910

1011
[workspace.package]

MODULE.bazel

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,7 @@ crate.from_cargo(
8787
"//py/tools/unpack_bin:Cargo.toml",
8888
"//py/tools/venv_bin:Cargo.toml",
8989
"//py/tools/venv_shim:Cargo.toml",
90+
"//py/tools/runfiles:Cargo.toml",
9091
],
9192
)
9293
use_repo(crate, "crate_index")

WORKSPACE

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -284,6 +284,7 @@ crates_repository(
284284
"//py/tools/unpack_bin:Cargo.toml",
285285
"//py/tools/venv_bin:Cargo.toml",
286286
"//py/tools/venv_shim:Cargo.toml",
287+
"//py/tools/runfiles:Cargo.toml",
287288
],
288289
)
289290

e2e/smoke/WORKSPACE.bazel

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,7 @@ crates_repository(
3939
"@aspect_rules_py//py/tools/unpack_bin:Cargo.toml",
4040
"@aspect_rules_py//py/tools/venv_bin:Cargo.toml",
4141
"@aspect_rules_py//py/tools/venv_shim:Cargo.toml",
42+
"@aspect_rules_py//py/tools/runfiles:Cargo.toml",
4243
],
4344
)
4445

py/private/py_venv/py_venv.bzl

Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -118,6 +118,10 @@ def _py_venv_base_impl(ctx):
118118
arguments = [
119119
"--location=" + venv_dir.path,
120120
"--venv-shim=" + py_shim.bin.bin.path,
121+
# Post-bzlmod we need to record the current repository in case the
122+
# user tries to consume a `py_venv_binary` across repo boundaries
123+
# which could cause repo mapping to become relevant.
124+
"--repo=" + (ctx.label.repo_name or ctx.workspace_name),
121125
"--python=" + to_rlocation_path(ctx, py_toolchain.python) if py_toolchain.runfiles_interpreter else py_toolchain.python.path,
122126
"--pth-file=" + site_packages_pth_file.path,
123127
"--env-file=" + env_file.path,
@@ -129,6 +133,14 @@ def _py_venv_base_impl(ctx):
129133
py_toolchain.interpreter_version_info.major,
130134
py_toolchain.interpreter_version_info.minor,
131135
),
136+
"--include-system-site-packages=" + ({
137+
True: "true",
138+
False: "false",
139+
}[ctx.attr.include_system_site_packages]),
140+
"--include-user-site-packages=" + ({
141+
True: "true",
142+
False: "false",
143+
}[ctx.attr.include_system_site_packages]),
132144
] + (["--debug"] if ctx.attr.debug else []),
133145
inputs = rfs.merge_all([
134146
ctx.runfiles(files = [
@@ -290,6 +302,42 @@ A collision can occur when multiple packages providing the same file are install
290302
"debug": attr.bool(
291303
default = False,
292304
),
305+
"include_system_site_packages": attr.bool(
306+
default = False,
307+
doc = """`pyvenv.cfg` feature flag for the `include-system-site-packages` key.
308+
309+
When `True`, the user's site directory AND the interpreter's site directory will
310+
be included into the runtime pythonpath.
311+
312+
When `False`, only the virtualenv's site directory and the interpreter's core
313+
libraries will be included into the runtime pythonpath.
314+
315+
`False` is obviously preferable as it increases hermeticity, but the choice of
316+
`False` cases for instance a `pip` or `setuptools` bundled into the interpreter
317+
to be unusable. Many libraries assume these packages will always be available
318+
and may not reliably declare their dependencies such that Bazel will satisfy
319+
them, so choosing isolation could expose packaging errors.
320+
321+
""",
322+
),
323+
"include_user_site_packages": attr.bool(
324+
default = False,
325+
doc = """`pyvenv.cfg` feature flag for the `aspect-include-user-site-packages` extension key.
326+
327+
When `True`, the user's site directory directory will be included into the
328+
runtime pythonpath.
329+
330+
When `False`, only the virtualenv's site directory and the interpreter's core
331+
libraries will be included into the runtime pythonpath.
332+
333+
`False` is obviously preferable as it increases hermeticity, but the choice of
334+
`False` cases for instance a `pip` or `setuptools` bundled into the interpreter
335+
to be unusable. Many libraries assume these packages will always be available
336+
and may not reliably declare their dependencies such that Bazel will satisfy
337+
them, so choosing isolation could expose packaging errors.
338+
339+
""",
340+
),
293341
})
294342

295343
_attrs.update(**_py_library.attrs)
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
load("//py/private/py_venv:defs.bzl", "py_venv_test")
2+
3+
py_venv_test(
4+
name = "test",
5+
srcs = [
6+
"test.py",
7+
],
8+
imports = [
9+
".",
10+
],
11+
include_system_site_packages = False,
12+
main = "test.py",
13+
deps = [
14+
"@pypi_cowsay//:pkg",
15+
],
16+
)
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
import os
2+
import sys
3+
import site
4+
5+
print("---")
6+
print("__file__:", __file__)
7+
print("sys.prefix:", sys.prefix)
8+
print("sys.executable:", sys.executable)
9+
print("site.PREFIXES:")
10+
for p in site.PREFIXES:
11+
print(" -", p)
12+
13+
# The virtualenv module should have already been loaded at interpreter startup
14+
assert "_virtualenv" in sys.modules
15+
print(site.PREFIXES)
16+
17+
assert len(site.getsitepackages()) == 1

0 commit comments

Comments
 (0)