-
-
Notifications
You must be signed in to change notification settings - Fork 3k
[mypyc] feat: new primitive for int.to_bytes
#19674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I will keep this open for a day or two in case @JukkaL has some comments.
// Helper for CPyLong_ToBytes (Python 3.2+) | ||
PyObject *CPyLong_ToBytes(PyObject *v, Py_ssize_t length, const char *byteorder, int signed_flag) { | ||
// This is a wrapper for PyLong_AsByteArray and PyBytes_FromStringAndSize | ||
unsigned char *bytes = (unsigned char *)PyMem_Malloc(length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can avoid allocating a temporary buffer if you call PyBytes_FromStringAndSize
with NULL
as the first argument to make the object uninitialized. You can then pass a pointer to the contents of the bytes
object to _PyLong_AsByteArray
.
def test_to_bytes() -> None: | ||
assert to_bytes(255, 2, "big") == b'\x00\xff' | ||
assert to_bytes(255, 2, "little") == b'\xff\x00' | ||
assert to_bytes(-1, 2, "big", True) == b'\xff\xff' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add overflow error tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the second look I think I have some more questions.
mypyc/lib-rt/int_ops.c
Outdated
// int.to_bytes(length, byteorder, signed=False) | ||
PyObject *CPyTagged_ToBytes(CPyTagged self, Py_ssize_t length, PyObject *byteorder, int signed_flag) { | ||
PyObject *pyint = CPyTagged_StealAsObject(self); | ||
if (!PyLong_Check(pyint)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the second thought, all these type checks look unnecessary, normally Python wrappers should do them. You can probably verify this by adding some run
tests with Any
s in them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what, like this?
def f(x: Any) -> bytes:
return int.to_bytes(x)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think CPyTagged_StealAsObject
is not correct there, since it will transfer the ownership of the parameter, and this can cause a double free. CPyTagged_AsObject
will return a new reference which you can decref at the end of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what, like this?
First, not just the self
, second I think you should try more something like this
def to_bytes(n: int, length: int, byteorder: str = "little", signed: bool = False) -> bytes:
return n.to_bytes(length, byteorder, signed=signed)
x: Any = "no"
bad: Any = "way"
to_bytes(x, bad)
and check that a TypeError
will be given even before getting to your specialized code.
assert to_bytes(255, 2, "big") == b'\x00\xff' | ||
assert to_bytes(255, 2, "little") == b'\xff\x00' | ||
assert to_bytes(-1, 2, "big", True) == b'\xff\xff' | ||
assert to_bytes(0, 1, "big") == b'\x00' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also test calling to_bytes()
function from interpreted code.
mypyc/lib-rt/int_ops.c
Outdated
PyObject *CPyTagged_ToBytes(CPyTagged self, Py_ssize_t length, PyObject *byteorder, int signed_flag) { | ||
PyObject *pyint = CPyTagged_StealAsObject(self); | ||
if (!PyLong_Check(pyint)) { | ||
Py_DECREF(pyint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think these decrefs may be wrong, can you add a test with a long integer (something that doesn't fit in 64 bits)?
This PR adds a new primitive for all arg combinations of
int.to_bytes