I have been seeing random segfaults on CI runs for PySR with Julia 1.11 + PythonCall.jl + Python 3.13 for the past several months. For example: https://github.com/MilesCranmer/PySR/actions/runs/14533490695/job/40777540350. I do not recall seeing this specific segfault when using Python 3.10-3.12. It has been going on for a while, but I wanted to wait and see if it went away. It has not, and Python 3.13 usage is increasing, so I am trying to prioritise fixing it.
At the moment I only see it happening on macos-latest. A couple of months ago I also saw it on windows-latest and ubuntu-latest. I am not sure if there is a specific pattern: some days all the CI runs fine, other days two or more jobs fail. You can see the history of nightly runs here; most of the failing jobs are due to such segfaults. (Note that julia is installed from the binary; the conda-forge is just used for Python. I don't believe conda to be related.)
The actual error traceback is:
[2916] signal 11 (2): Segmentation fault: 11
in expression starting at none:0
jl_object_id__cold at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/builtins.c:441
smallintset_rehash at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/smallintset.c:218
Shown in detail below.
I wonder if this is related to #57333, which was closed after being fixed in #57392 on Julia 1.12. Perhaps a backport is needed for Julia 1.11, @vtjnash?
Here is the full traceback from [this CI run](https://github.com/MilesCranmer/PySR/actions/runs/14533490695/job/40777540350):
[2916] signal 11 (2): Segmentation fault: 11
in expression starting at none:0
jl_object_id__cold at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/builtins.c:441
smallintset_rehash at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/smallintset.c:218
jl_smallintset_insert at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/smallintset.c:197
jl_idset_put_idx at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/./idset.c:104
jl_as_global_root at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/staticdata.c:2548
inst_datatype_inner at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/jltypes.c:2115
inst_type_w_ at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/jltypes.c:2584
ijl_instantiate_unionall at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/jltypes.c:1487 [inlined]
ijl_apply_type at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-11/src/jltypes.c:1421
widenconst at ./compiler/typelattice.jl:700
jfptr_widenconst_37429.1 at /Users/runner/miniconda3/envs/pysr-test/julia_env/pyjuliapkg/install/lib/julia/sys.dylib (unknown line)
#408 at ./compiler/typeutils.jl:51
anymap at ./compiler/utilities.jl:43 [inlined]
argtypes_to_type at ./compiler/typeutils.jl:51
abstract_call_known at ./compiler/abstractinterpretation.jl:2199
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2423
abstract_eval_call at ./compiler/abstractinterpretation.jl:2438
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2454
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2752
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3044
typeinf_local at ./compiler/abstractinterpretation.jl:3331
typeinf_nocycle at ./compiler/abstractinterpretation.jl:3413
_typeinf at ./compiler/typeinfer.jl:244
typeinf at ./compiler/typeinfer.jl:215
typeinf_edge at ./compiler/typeinfer.jl:923
abstract_call_method at ./compiler/abstractinterpretation.jl:660
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:102
abstract_call_known at ./compiler/abstractinterpretation.jl:2200
abstract_call at ./compiler/abstractinterpretation.jl:2282
abstract_call at ./compiler/abstractinterpretation.jl:2275
abstract_call at ./compiler/abstractinterpretation.jl:2423
abstract_eval_call at ./compiler/abstractinterpretation.jl:2438
abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2454
abstract_eval_statement at ./compiler/abstractinterpretation.jl:2752
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:3068
typeinf_local at ./compiler/abstractinterpretation.jl:3331
builtin_exec at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
cfunction_vectorcall_FASTCALL_KEYWORDS at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
_PyEval_EvalFrameDefault at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
object_vacall at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
PyObject_CallMethodObjArgs at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
PyImport_ImportModuleLevelObject at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
_PyEval_EvalFrameDefault at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
PyEval_EvalCode at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
builtin_exec at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
cfunction_vectorcall_FASTCALL_KEYWORDS at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
_PyEval_EvalFrameDefault at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
object_vacall at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
PyObject_CallMethodObjArgs at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
PyImport_ImportModuleLevelObject at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
_PyEval_EvalFrameDefault at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
PyEval_EvalCode at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
run_eval_code_obj at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
run_mod at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
_PyRun_SimpleStringFlagsWithName at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
Py_RunMain at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
pymain_main at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
main at /Users/runner/miniconda3/envs/pysr-test/bin/python3.13 (unknown line)
Allocations: 2925273 (Pool: 2925102; Big: 171); GC: 4
/Users/runner/work/_temp/a299f5a8-12f5-49f2-b61d-eb4071b29970.sh: line 2: 2916 Segmentation fault: 11 python -c "import pysr"
Error: Process completed with exit code 139.
I unfortunately cannot reproduce this locally. But it happens frequently enough in CI that it seems worth fixing. Tips to get a reproducer are appreciated.
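Since the crash is intermittent, one way to hunt for a reproducer is to repeat the failing command many times in fresh processes and count how often it dies. Below is a minimal sketch of that idea, assuming a POSIX machine (macOS or Linux). The helper name `count_segfaults` is hypothetical and not part of PySR or juliacall, and this is only a suggestion, not something known to trigger the bug.

```python
import signal
import subprocess
import sys


def count_segfaults(cmd, runs=50):
    """Run `cmd` in a fresh process `runs` times; return how many runs died from SIGSEGV.

    Hypothetical helper for illustration only. Assumes POSIX semantics.
    """
    crashes = 0
    for i in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True)
        # On POSIX, subprocess reports death-by-signal as a negative return code,
        # so a segfault appears as -11 (a shell reports the same event as 128 + 11 = 139).
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
            # Print the tail of stderr so the Julia backtrace is not lost.
            print(f"run {i}: segfault\n{result.stderr[-2000:]}", file=sys.stderr)
    return crashes
```

For the failing CI step this could be called as `count_segfaults([sys.executable, "-c", "import pysr"], runs=200)`, either locally or in a dedicated CI job on macos-latest. Each run starts a new interpreter, so Julia is initialised from scratch every time, which matches how the CI step fails.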
I also have nightly CI that runs on Python 3.12, across operating systems, and I see no such errors: https://github.com/MilesCranmer/PySR/actions/runs/14548066793. It is only interactions with 3.13 that cause such errors. PySR connects Julia to Python using juliacall 0.9.24.