"sys.getsizeof(int)" returns an unreasonably large value?

Question

I want to check the size of the int data type in Python:

    import sys
    sys.getsizeof(int)

It comes out to be "436", which doesn't make sense to me. Anyway, I want to know how many bytes (2,4,..?) int will take on my machine.

Note that Python has support for very large numbers, which means you almost certainly won't get 2 or 4 or anything even close. What are you really trying to do? Figure out if you're running a 32-bit or 64-bit OS/CPU?
– Lasse V. Karlsen, Commented Apr 28, 2012 at 17:04

senderle · Accepted Answer · 2023-04-30 16:30:53Z

The short answer

You're getting the size of the class, not of an instance of the class. Call int to get the size of an instance:

    >>> sys.getsizeof(int())
    28

If that size still seems a little bit large, remember that a Python int is very different from an int in (for example) C. In Python, an int is a fully-fledged object. This means there's extra overhead.

Every Python object contains at least a refcount and a reference to the object's type in addition to other storage; on a 64-bit machine, just those two things alone take up 16 bytes! The int internals (as determined by the standard CPython implementation) have also changed over time, so that the amount of additional storage taken depends on your version.
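As a rough check of the overhead described above, the following sketch prints the pointer size, the size of a bare object header, and the sizes of an int instance and of the int class. The variable names are illustrative, and the exact numbers depend on the CPython version and build.

```python
import struct
import sys

# Size of a C pointer on this build (8 on 64-bit, 4 on 32-bit).
pointer_size = struct.calcsize("P")

# A plain object() carries only the header: a refcount plus a type pointer.
bare_object_size = object.__basicsize__

print(f"pointer size:       {pointer_size} bytes")
print(f"bare object header: {bare_object_size} bytes")
print(f"int instance (1):   {sys.getsizeof(1)} bytes")
print(f"int class:          {sys.getsizeof(int)} bytes")
```

The last line is the value the question was measuring: the size of the type object, which is far larger than any small int instance.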

`int` objects in CPython 3.11

Integer objects are internally PyLongObject C types representing blocks of memory. The code that defines this type is spread across multiple files. Here are the relevant parts:

    typedef struct _longobject PyLongObject;

    struct _longobject {
        PyObject_VAR_HEAD
        digit ob_digit[1];
    };

    #define PyObject_VAR_HEAD      PyVarObject ob_base;

    typedef struct {
        PyObject ob_base;
        Py_ssize_t ob_size; /* Number of items in variable part */
    } PyVarObject;

    typedef struct _object PyObject;

    struct _object {
        _PyObject_HEAD_EXTRA
        union {
            Py_ssize_t ob_refcnt;
    #if SIZEOF_VOID_P > 4
            PY_UINT32_T ob_refcnt_split[2];
    #endif
        };
        PyTypeObject *ob_type;
    };

    /* _PyObject_HEAD_EXTRA is nothing on non-debug builds */
    #  define _PyObject_HEAD_EXTRA

    typedef uint32_t digit;

If we expand all the macros and replace all the typedef statements, this is the struct we end up with:

    struct PyLongObject {
        Py_ssize_t ob_refcnt;
        PyTypeObject *ob_type;
        Py_ssize_t ob_size; /* Number of items in variable part */
        uint32_t ob_digit[1];
    };

uint32_t means "unsigned 32-bit integer" and uint32_t ob_digit[1]; means an array of 32-bit integers is used to hold the (absolute) value of the integer. The "1" in "ob_digit[1]" means the array should be initialized with space for 1 element.
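To see where the digit array starts without compiling any C, the expanded struct can be mirrored with ctypes. This is only a model for illustration: LongObjectModel is a hypothetical name, it is not CPython's own definition, and it assumes Py_ssize_t has the same width as a pointer on your platform.

```python
import ctypes


class LongObjectModel(ctypes.Structure):
    """Hypothetical ctypes mirror of the expanded struct shown above."""

    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),    # reference count
        ("ob_type", ctypes.c_void_p),       # pointer to the type object
        ("ob_size", ctypes.c_ssize_t),      # number of digits in use
        ("ob_digit", ctypes.c_uint32 * 1),  # first element of the digit array
    ]


# Offset of ob_digit = bytes taken by the three header fields.
header_bytes = LongObjectModel.ob_digit.offset
digit_bytes = ctypes.sizeof(ctypes.c_uint32)

print("offset of ob_digit:", header_bytes)
print("bytes per digit:   ", digit_bytes)
print("header + one digit:", header_bytes + digit_bytes)
```

The sketch uses the field offset rather than ctypes.sizeof(LongObjectModel), because the size of the whole struct may include trailing alignment padding.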

So we have the following bytes to store an integer object in Python (on a 64-bit system):

8 bytes (64 bits, Py_ssize_t, signed) for ob_refcnt - the reference count
8 bytes (64 bits, PyTypeObject*) for ob_type - the pointer to the int class itself
8 bytes (64 bits, Py_ssize_t, signed) for ob_size - which stores how many 32-bit integers are used to store the integer

and finally a variable-length array (with at least 1 element) of

4 bytes (32 bits) to store each part of the integer

The comment that accompanies this definition summarizes Python 3.11's representation of integers. Zero is represented not by a digit with value zero but by an object whose size field (ob_size) is zero (the allocated digit array always has at least one element, though). Negative numbers are represented by objects with a negative size attribute! The comment further explains that only 30 bits of each uint32_t are used for storing the value.
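The digit parameters are exposed at runtime through sys.int_info, so you can check them on your own build. The sketch below also compares the reported size of a few values with their negations. If the sign lives outside the digit array, as described above, the two should match.

```python
import sys

# Build-time parameters of CPython's integer representation.
bits_per_digit = sys.int_info.bits_per_digit
sizeof_digit = sys.int_info.sizeof_digit
print(f"{bits_per_digit} value bits are stored in each {sizeof_digit}-byte digit")

# Negating a value should not change its memory footprint.
for value in (1, 2**30, 2**60):
    print(value, sys.getsizeof(value), sys.getsizeof(-value))
```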

    >>> sys.getsizeof(0)
    28
    >>> sys.getsizeof(1)
    28
    >>> sys.getsizeof(2 ** 30 - 1)
    28
    >>> sys.getsizeof(2 ** 30)
    32
    >>> sys.getsizeof(2 ** 60 - 1)
    32
    >>> sys.getsizeof(2 ** 60)
    36
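The sizes listed above follow from simple arithmetic: a 24-byte header plus 4 bytes for every 30 bits of magnitude, with at least one digit always allocated. Here is a minimal sketch of that calculation. The helper names digits_needed and predicted_size are made up for this example, and the default arguments assume the 64-bit layout described in this answer.

```python
import sys


def digits_needed(n, bits_per_digit=30):
    """Number of internal digits for n; at least one is always allocated."""
    bit_length = abs(n).bit_length()
    # Ceiling division without floating point.
    return max(1, -(-bit_length // bits_per_digit))


def predicted_size(n, header_bytes=24, digit_bytes=4, bits_per_digit=30):
    """Predicted footprint in bytes under the assumed 64-bit layout."""
    return header_bytes + digit_bytes * digits_needed(n, bits_per_digit)


for n in (0, 1, 2**30 - 1, 2**30, 2**60 - 1, 2**60):
    print(f"bits={n.bit_length():>2}  predicted={predicted_size(n)}  "
          f"reported={sys.getsizeof(n)}")
```

The predicted and reported columns can differ on other CPython versions or on 32-bit builds; in that case, pass the values from sys.int_info instead of the defaults.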

On CPython 3.10 and older, sys.getsizeof(0) incorrectly returns 24 instead of 28; this was a bug that has since been fixed. Python 2 had a second, separate integer type that worked a bit differently but was broadly similar.

You will get slightly different results on a 32-bit system.
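If you are unsure which kind of build you are running, here is a quick check using only the standard library. It reports the width of the interpreter build, which is not necessarily the width of the OS or CPU.

```python
import struct
import sys

# Width of a C pointer in this interpreter build.
pointer_bits = struct.calcsize("P") * 8

# sys.maxsize is the largest Py_ssize_t value; it exceeds 2**32 only on 64-bit builds.
is_64bit_build = sys.maxsize > 2**32

print(f"pointer width: {pointer_bits} bits")
print(f"64-bit build:  {is_64bit_build}")
```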

Thanks! But my system returns 12 for the size. Is that in bytes? Kinda larger than I thought if it is in bytes, and unreasonable if in bits...
@HailiangZhang, that's 12 bytes -- or 24 bytes in my case. A Python int is very different from an int in (for example) C. In Python, an int is a fully-fledged object. This means there's extra overhead. See here for a fairly detailed discussion of the (CPython) int internals.
@HailiangZhang, also, this describes the structure of the PyObject_HEAD macro that you see in the page linked to above. Every Python object contains at least a refcount and a reference to the object's type in addition to other storage.
If you want to store a large number of compactly-stored 32-bit (or fewer-bit) integers, see the array module.
On a 64-bit system, Python allocates objects on 8-byte boundaries, for efficiency reasons. Thus if you allocate two integer objects in a row, the ids, which reflect/reveal the object memory address, are likely to have a difference of 32, rather than 28, thanks to this alignment. You can see this effect if you ask for id(2)-id(1). (There is an internal table of integer objects for values -5 through 256, which is why the id of the integer object for two is larger than that of the integer object for one, even though it occurs first on the line -- they were both pre-allocated.)
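As a rough illustration of the array module suggestion in the comments, this sketch compares a list of int objects with an array of packed C ints. The totals are approximate: the list figure counts every element as a separate object, although small values are shared, and the exact numbers vary by platform and version.

```python
import array
import sys

values = list(range(10_000))

# Type code "i" stores each element as a signed C int, not as a Python object.
packed = array.array("i", values)

# The list holds pointers; each int object is counted separately here.
list_total = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
packed_total = sys.getsizeof(packed)

print(f"list + int objects: {list_total} bytes")
print(f"array('i'):         {packed_total} bytes")
print(f"bytes per element in the array: {packed.itemsize}")
```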
