[mypyc] feat: new primitive for int.to_bytes #19674

Open
wants to merge 27 commits into
base: master
Changes from 23 commits
Commits
2058f81
[mypyc] feat: new primitive for `int.to_bytes`
BobTheBuidler Aug 16, 2025
f62bfd2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2025
813c510
add headers
BobTheBuidler Aug 16, 2025
4be77d6
Merge branch 'to-bytes' of https://github.com/BobTheBuidler/mypy into…
BobTheBuidler Aug 16, 2025
cb93329
cover all arg combos
BobTheBuidler Aug 16, 2025
6ff1c9b
CPyLong_ToBytes header
BobTheBuidler Aug 16, 2025
166b6f4
;
BobTheBuidler Aug 16, 2025
be2d2de
fix name
BobTheBuidler Aug 16, 2025
9ff3a7c
fix _PyLong_AsByteArray compile err
BobTheBuidler Aug 16, 2025
8399619
Update ir.py
BobTheBuidler Aug 17, 2025
db7b483
define header
BobTheBuidler Aug 17, 2025
ed05c00
Merge branch 'to-bytes' of https://github.com/BobTheBuidler/mypy into…
BobTheBuidler Aug 17, 2025
a0c147b
fix ir
BobTheBuidler Aug 17, 2025
5bab265
Update ir.py
BobTheBuidler Aug 17, 2025
b346bbf
fix ir
BobTheBuidler Aug 17, 2025
8d523a0
Merge branch 'to-bytes' of https://github.com/BobTheBuidler/mypy into…
BobTheBuidler Aug 17, 2025
85652a2
fix ir
BobTheBuidler Aug 17, 2025
8e97165
add ir test
BobTheBuidler Aug 17, 2025
3660a01
fix py 3.10
BobTheBuidler Aug 17, 2025
6a8a83c
use _PyLong_AsByteArray on all pythons
BobTheBuidler Aug 17, 2025
ec95008
fix: py3.13 and 3.14
BobTheBuidler Aug 17, 2025
95fa8b0
optimize if check
BobTheBuidler Aug 17, 2025
78f4ca1
Merge branch 'master' into to-bytes
BobTheBuidler Aug 20, 2025
326484b
Update CPy.h
BobTheBuidler Aug 21, 2025
245a122
Update int_ops.c
BobTheBuidler Aug 21, 2025
d2994e1
Update int_ops.c
BobTheBuidler Aug 21, 2025
1138448
Merge branch 'master' into to-bytes
BobTheBuidler Aug 23, 2025
2 changes: 2 additions & 0 deletions mypyc/lib-rt/CPy.h
@@ -139,6 +139,8 @@ CPyTagged CPyTagged_Remainder_(CPyTagged left, CPyTagged right);
CPyTagged CPyTagged_BitwiseLongOp_(CPyTagged a, CPyTagged b, char op);
CPyTagged CPyTagged_Rshift_(CPyTagged left, CPyTagged right);
CPyTagged CPyTagged_Lshift_(CPyTagged left, CPyTagged right);
PyObject *CPyTagged_ToBytes(CPyTagged self, Py_ssize_t length, PyObject *byteorder, int signed_flag);
PyObject *CPyLong_ToBytes(PyObject *v, Py_ssize_t length, const char *byteorder, int signed_flag);

PyObject *CPyTagged_Str(CPyTagged n);
CPyTagged CPyTagged_FromFloat(double f);
56 changes: 56 additions & 0 deletions mypyc/lib-rt/int_ops.c
@@ -581,3 +581,59 @@ double CPyTagged_TrueDivide(CPyTagged x, CPyTagged y) {
    }
    return 1.0;
}

// int.to_bytes(length, byteorder, signed=False)
PyObject *CPyTagged_ToBytes(CPyTagged self, Py_ssize_t length, PyObject *byteorder, int signed_flag) {
    PyObject *pyint = CPyTagged_StealAsObject(self);
    if (!PyLong_Check(pyint)) {
Member

On second thought, all these type checks look unnecessary; normally the Python wrappers should do them. You can probably verify this by adding some run tests with Anys in them.

Contributor Author

what, like this?

def f(x: Any) -> bytes:
    return int.to_bytes(x)

Contributor Author

Based on @JukkaL's response to a similar question on #19673, I think we can safely remove this check, since CPyTagged_StealAsObject guarantees the type.

Collaborator

I think CPyTagged_StealAsObject is not correct here, since it transfers ownership of the parameter, which can cause a double free. CPyTagged_AsObject returns a new reference, which you can decref at the end of the function.

Member

@BobTheBuidler

> what, like this?

First, not just self. Second, I think you should try something more like this:

def to_bytes(n: int, length: int, byteorder: str = "little", signed: bool = False) -> bytes:
    return n.to_bytes(length, byteorder, signed=signed)

x: Any = "no"
bad: Any = "way"
to_bytes(x, bad)

and check that a TypeError is raised even before execution reaches your specialized code.
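
For comparison, the following is a minimal sketch of how interpreted CPython rejects badly typed arguments. It calls the unbound `int.to_bytes` so that a wrong `self` is also reported as a TypeError. The helper name `raises_type_error` is hypothetical and not part of the PR. Whether the compiled wrapper raises the same error before reaching the primitive is exactly what the suggested run test would establish.

```python
from typing import Any


def raises_type_error(n: Any, length: Any, byteorder: Any) -> bool:
    """Return True if int.to_bytes rejects the arguments with TypeError."""
    try:
        # Unbound call: a non-int first argument fails the descriptor check.
        int.to_bytes(n, length, byteorder)
    except TypeError:
        return True
    return False


x: Any = "no"
bad: Any = "way"
assert raises_type_error(x, bad, "little")   # wrong self and wrong length
assert raises_type_error(1, bad, "little")   # length is not an integer
assert raises_type_error(1, 2, 5)            # byteorder is not a str
assert not raises_type_error(1, 2, "little") # well-typed call succeeds
```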

        Py_DECREF(pyint);
Member

Actually, I think these decrefs may be wrong. Can you add a test with a long integer (something that doesn't fit in 64 bits)?
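
A test along these lines could use a value such as `1 << 70`. The sketch below is plain interpreted Python showing the results CPython produces, which the compiled primitive would be expected to match. The names `BIG` and `roundtrip` are illustrative and not part of the PR.

```python
# Illustrative only: BIG and roundtrip are hypothetical names, not from the PR.
BIG = 1 << 70  # needs 71 bits, so it does not fit in a 64-bit machine word


def roundtrip(n: int, length: int, byteorder: str, signed: bool = False) -> int:
    """Encode n with int.to_bytes and decode it again with int.from_bytes."""
    data = n.to_bytes(length, byteorder, signed=signed)
    return int.from_bytes(data, byteorder, signed=signed)


# Bit 70 lands in the most significant of 9 bytes (bit 6 of that byte, 0x40).
assert BIG.to_bytes(9, "big") == b"\x40" + b"\x00" * 8
assert BIG.to_bytes(9, "little") == b"\x00" * 8 + b"\x40"
assert roundtrip(-BIG, 9, "little", signed=True) == -BIG
```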

        PyErr_SetString(PyExc_TypeError, "self must be int");
        return NULL;
    }
    if (!PyUnicode_Check(byteorder)) {
        Py_DECREF(pyint);
        PyErr_SetString(PyExc_TypeError, "byteorder must be str");
        return NULL;
    }
    const char *order = PyUnicode_AsUTF8(byteorder);
    if (!order) {
        Py_DECREF(pyint);
        return NULL;
    }
    PyObject *result = CPyLong_ToBytes(pyint, length, order, signed_flag);
    Py_DECREF(pyint);
    return result;
}


// Helper for CPyLong_ToBytes (Python 3.2+)
PyObject *CPyLong_ToBytes(PyObject *v, Py_ssize_t length, const char *byteorder, int signed_flag) {
    // This is a wrapper for PyLong_AsByteArray and PyBytes_FromStringAndSize
    unsigned char *bytes = (unsigned char *)PyMem_Malloc(length);
Collaborator

I think you can avoid allocating a temporary buffer if you call PyBytes_FromStringAndSize with NULL as the first argument, which creates an uninitialized bytes object. You can then pass a pointer to the contents of the bytes object to _PyLong_AsByteArray.

    if (!bytes) {
        PyErr_NoMemory();
        return NULL;
    }
    int little_endian;
    if (strcmp(byteorder, "big") == 0) {
        little_endian = 0;
    } else if (strcmp(byteorder, "little") == 0) {
        little_endian = 1;
    } else {
        PyMem_Free(bytes);
        PyErr_SetString(PyExc_ValueError, "byteorder must be either 'little' or 'big'");
        return NULL;
    }
#if PY_VERSION_HEX >= 0x030D0000  // 3.13.0
    int res = _PyLong_AsByteArray((PyLongObject *)v, bytes, length, little_endian, signed_flag, 1);
#else
    int res = _PyLong_AsByteArray((PyLongObject *)v, bytes, length, little_endian, signed_flag);
#endif
    if (res < 0) {
        PyMem_Free(bytes);
        return NULL;
    }
    PyObject *result = PyBytes_FromStringAndSize((const char *)bytes, length);
    PyMem_Free(bytes);
    return result;
}
29 changes: 28 additions & 1 deletion mypyc/primitives/int_ops.py
@@ -21,6 +21,7 @@
    RType,
    bit_rprimitive,
    bool_rprimitive,
    bytes_rprimitive,
    c_pyssize_t_rprimitive,
    float_rprimitive,
    int16_rprimitive,
@@ -31,7 +32,14 @@
    str_rprimitive,
    void_rtype,
)
from mypyc.primitives.registry import binary_op, custom_op, function_op, load_address_op, unary_op
from mypyc.primitives.registry import (
    binary_op,
    custom_op,
    function_op,
    load_address_op,
    method_op,
    unary_op,
)

# Constructors for builtins.int and native int types have the same behavior. In
# interpreted mode, native int types are just aliases to 'int'.
@@ -305,3 +313,22 @@ def int_unary_op(name: str, c_function_name: str) -> PrimitiveDescription:
    c_function_name="PyLong_Check",
    error_kind=ERR_NEVER,
)

# int.to_bytes(length, byteorder)
method_op(
    name="to_bytes",
    arg_types=[int_rprimitive, int_rprimitive, str_rprimitive],
    extra_int_constants=[(0, bool_rprimitive)],
    return_type=bytes_rprimitive,
    c_function_name="CPyTagged_ToBytes",
    error_kind=ERR_MAGIC,
)

# int.to_bytes(length, byteorder, signed)
method_op(
    name="to_bytes",
    arg_types=[int_rprimitive, int_rprimitive, str_rprimitive, bool_rprimitive],
    return_type=bytes_rprimitive,
    c_function_name="CPyTagged_ToBytes",
    error_kind=ERR_MAGIC,
)
3 changes: 2 additions & 1 deletion mypyc/test-data/fixtures/ir.py
@@ -4,7 +4,7 @@
import _typeshed
from typing import (
    TypeVar, Generic, List, Iterator, Iterable, Dict, Optional, Tuple, Any, Set,
    overload, Mapping, Union, Callable, Sequence, FrozenSet, Protocol
    overload, Mapping, Union, Callable, Sequence, FrozenSet, Protocol,
)

_T = TypeVar('_T')
@@ -82,6 +82,7 @@ def __lt__(self, n: int) -> bool: pass
    def __gt__(self, n: int) -> bool: pass
    def __le__(self, n: int) -> bool: pass
    def __ge__(self, n: int) -> bool: pass
    def to_bytes(self, length: int, order: str, signed: bool = False) -> bytes: pass

class str:
    @overload
23 changes: 23 additions & 0 deletions mypyc/test-data/irbuild-int.test
@@ -210,3 +210,26 @@ L0:
    r0 = CPyTagged_Invert(n)
    x = r0
    return x

[case testIntToBytes]
def f(x: int) -> bytes:
    return x.to_bytes(2, "big")
def g(x: int) -> bytes:
    return x.to_bytes(4, "little", True)
[out]
def f(x):
    x :: int
    r0 :: str
    r1 :: bytes
L0:
    r0 = 'big'
    r1 = CPyTagged_ToBytes(x, 4, r0, 0)
    return r1
def g(x):
    x :: int
    r0 :: str
    r1 :: bytes
L0:
    r0 = 'little'
    r1 = CPyTagged_ToBytes(x, 8, r0, 1)
    return r1
9 changes: 9 additions & 0 deletions mypyc/test-data/run-integers.test
@@ -572,3 +572,12 @@ class subc(int):
[file userdefinedint.py]
class int:
    pass

[case testIntToBytes]
def to_bytes(n: int, length: int, byteorder: str, signed: bool = False) -> bytes:
    return n.to_bytes(length, byteorder, signed=signed)
def test_to_bytes() -> None:
    assert to_bytes(255, 2, "big") == b'\x00\xff'
    assert to_bytes(255, 2, "little") == b'\xff\x00'
    assert to_bytes(-1, 2, "big", True) == b'\xff\xff'
Collaborator

Add overflow error tests.
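
As a rough sketch of the cases such tests might cover, the snippet below shows where interpreted CPython raises OverflowError: a value too large for the requested length, a negative value with `signed=False`, and a value just outside the signed range. The helper name `overflows` is hypothetical and not part of the PR. A run test would check that the compiled primitive raises in the same cases.

```python
def overflows(n: int, length: int, byteorder: str, signed: bool = False) -> bool:
    """Return True if n.to_bytes(...) raises OverflowError for these arguments."""
    try:
        n.to_bytes(length, byteorder, signed=signed)
    except OverflowError:
        return True
    return False


assert overflows(256, 1, "big")                    # too big for one unsigned byte
assert overflows(-1, 2, "big")                     # negative with signed=False
assert overflows(128, 1, "big", signed=True)       # above the signed 8-bit maximum
assert overflows(-129, 1, "little", signed=True)   # below the signed 8-bit minimum
assert not overflows(255, 1, "big")                # largest unsigned 8-bit value
assert not overflows(-128, 1, "little", signed=True)  # smallest signed 8-bit value
```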

    assert to_bytes(0, 1, "big") == b'\x00'
Member

Maybe also test calling to_bytes() function from interpreted code.