Skip to content

[mypyc] feat: new primitive for int.to_bytes #19674

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

BobTheBuidler
Copy link
Contributor

@BobTheBuidler BobTheBuidler commented Aug 16, 2025

This PR adds a new primitive for all arg combinations of int.to_bytes

@BobTheBuidler BobTheBuidler marked this pull request as ready for review August 17, 2025 16:39
Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I will keep this open for a day or two in case @JukkaL has some comments.

// Helper for CPyLong_ToBytes (Python 3.2+)
PyObject *CPyLong_ToBytes(PyObject *v, Py_ssize_t length, const char *byteorder, int signed_flag) {
// This is a wrapper for PyLong_AsByteArray and PyBytes_FromStringAndSize
unsigned char *bytes = (unsigned char *)PyMem_Malloc(length);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can avoid allocating a temporary buffer if you call PyBytes_FromStringAndSize with NULL as the first argument to make the object uninitialized. You can then pass a pointer to the contents of the bytes object to _PyLong_AsByteArray.

def test_to_bytes() -> None:
assert to_bytes(255, 2, "big") == b'\x00\xff'
assert to_bytes(255, 2, "little") == b'\xff\x00'
assert to_bytes(-1, 2, "big", True) == b'\xff\xff'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add overflow error tests.

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the second look I think I have some more questions.

// int.to_bytes(length, byteorder, signed=False)
PyObject *CPyTagged_ToBytes(CPyTagged self, Py_ssize_t length, PyObject *byteorder, int signed_flag) {
PyObject *pyint = CPyTagged_StealAsObject(self);
if (!PyLong_Check(pyint)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the second thought, all these type checks look unnecessary, normally Python wrappers should do them. You can probably verify this by adding some run tests with Anys in them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what, like this?

def f(x: Any) -> bytes:
    return int.to_bytes(x)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on @JukkaL response to a similar question on #19673 I think we can safely remove this check since CPyTagged_StealAsObject guarantees the type

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CPyTagged_StealAsObject is not correct there, since it will transfer the ownership of the parameter, and this can cause a double free. CPyTagged_AsObject will return a new reference which you can decref at the end of the function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BobTheBuidler

what, like this?

First, not just the self, second I think you should try more something like this

def to_bytes(n: int, length: int, byteorder: str = "little", signed: bool = False) -> bytes:
    return n.to_bytes(length, byteorder, signed=signed)

x: Any = "no"
bad: Any = "way"
to_bytes(x, bad)

and check that a TypeError will be given even before getting to your specialized code.

assert to_bytes(255, 2, "big") == b'\x00\xff'
assert to_bytes(255, 2, "little") == b'\xff\x00'
assert to_bytes(-1, 2, "big", True) == b'\xff\xff'
assert to_bytes(0, 1, "big") == b'\x00'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also test calling to_bytes() function from interpreted code.

PyObject *CPyTagged_ToBytes(CPyTagged self, Py_ssize_t length, PyObject *byteorder, int signed_flag) {
PyObject *pyint = CPyTagged_StealAsObject(self);
if (!PyLong_Check(pyint)) {
Py_DECREF(pyint);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think these decrefs may be wrong, can you add a test with a long integer (something that doesn't fit in 64 bits)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants