#internals-and-peps
valgrind might catch it too?
valgrind probably wouldn't catch it unless you also run with PYTHONMALLOC=malloc
but if you did that, then yes, valgrind should catch a write to freed memory
if you're lucky and the memory didn't get allocated to some other python object
yeah.
but if it did, you'd eventually catch a write to freed memory when that object gets decremented one too many times
though at that point the error report might be far from the buggy line of code.
is there some reason why object() doesn’t have __dict__ and can’t have custom attributes set on it?
performance. Creating dicts is slow, requiring every object to have one is inefficient
it could probably be solved in other ways, but that's the reason it is what it is.
so how can its subclasses have __dict__ by default when it doesn’t
local cpython clone built in debug mode caught it ```py
test_refcnt((X(),None))
2
1:0
-2459565876494606883
\cpython-main\Python\ceval.c:4697: _Py_NegativeRefcount: Assertion failed: object has negative ref count
<object at 000001FCA5288460 is freed>
Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
Python runtime state: initialized
Current thread 0x00001a60 (most recent call first):
File "<stdin>", line 1 in <module>
subclasses can add stuff on top of what a parent class has. They just can't take stuff away.
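a minimal sketch of that, assuming CPython 3.x. `Plain`, `Slotted` and `can_set_attribute` are made-up names for illustration, not anything from the stdlib. It shows that a bare object() has no __dict__, a plain subclass gains one, and a subclass declaring empty __slots__ stays dict-less:

```python
# Hypothetical classes, used only to illustrate the point above.
class Plain:            # implicitly subclasses object; instances get a __dict__
    pass


class Slotted:          # empty __slots__ opts out of the per-instance dict
    __slots__ = ()


def can_set_attribute(obj) -> bool:
    """Return True if an arbitrary new attribute can be set on obj."""
    try:
        obj.note = "x"
    except AttributeError:
        return False
    return True


base = object()
plain = Plain()
slotted = Slotted()

for name, obj in [("object()", base), ("Plain()", plain), ("Slotted()", slotted)]:
    print(name, hasattr(obj, "__dict__"), can_set_attribute(obj))
```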
if you decremented the struct's ob_refcnt without calling Py_DECREF or Py_DecRef, does that still trigger GC?
or... is it just UB?
it's not magic.
no, DECREF has a branch that checks if the refcount is 0, and if so, destroys the object
nothing is watching the reference count field and observing writes to it
if you set it to 0 manually, the object won't be destroyed
but something bad may happen
if you set it to 1, the next decref call would destroy it
but setting it to 0 won't do anything but set it to 0.
ok so despite that it seems like PyGC_Collect() works
and yes, definitely UB.
i decided to put test_refcnt() as a builtin so here's the new code
static PyObject *
builtin_test_refcnt(PyObject *self, PyObject *o)
{
    printf("before decref: %zd\n", Py_REFCNT(o));
    Py_DECREF(o);
    printf("after decref: %zd\n", Py_REFCNT(o));
    PyGC_Collect();
    printf("after collect: %zd\n", Py_REFCNT(o));
    Py_RETURN_NONE;
}
output ```py
test_refcnt(X())
before decref: 2
after decref: 1
after collect: -2459565876494606883
C:\Users\rog\cpython-main\Python\ceval.c:4697: _Py_NegativeRefcount: Assertion failed: object has negative ref count
<object at 00000130CE586860 is freed>
Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
Python runtime state: initialized
Current thread 0x000017ec (most recent call first):
File "<stdin>", line 1 in <module>
so collection works ig
so essentially an object should never have 0 refcount right?
since the gc happens when there is 1 remaining, and it's already destroyed before becoming 0
yes
Include/object.h lines 555 to 563
static inline void Py_DECREF(PyObject *op)
{
_Py_DECREF_STAT_INC();
// Non-limited C API and limited C API for Python 3.9 and older access
// directly PyObject.ob_refcnt.
if (--op->ob_refcnt == 0) {
_Py_Dealloc(op);
}
}```
yep, exactly - as soon as it drops to 0, it's destroyed.
it's slightly confusing to say "GC" here because in CPython that usually refers to the cyclical GC, which doesn't directly deal with refcounts
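a rough way to see the two mechanisms side by side from pure Python, assuming CPython (other implementations may defer destruction). `Node` and the `events` list are just illustration names. The first object dies as soon as its last reference goes away; the second is kept alive by a reference cycle until the cyclic collector runs:

```python
import gc
import weakref


class Node:
    """Hypothetical class; plain instances support weak references."""


events = []

# Case 1: no cycle. Dropping the last reference destroys the object at once.
a = Node()
weakref.finalize(a, events.append, "a destroyed")
del a
after_del = list(events)

# Case 2: a reference cycle. The refcount never reaches 0 on its own,
# so only the cyclic GC can reclaim the object.
gc.disable()                      # keep automatic collection out of the way
try:
    b = Node()
    b.self_ref = b                # b -> b.__dict__ -> b
    weakref.finalize(b, events.append, "b destroyed")
    del b
    before_collect = list(events)
    gc.collect()
    after_collect = list(events)
finally:
    gc.enable()

print(after_del, before_collect, after_collect)
```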
what's interesting is that it gets left alone when reference count is not 1
what if it is -1
nasal demons
that's not supposed to ever happen. The code is written to assume that never happens. If that happens, things break.
@gray galleon you see this fatal error?
that's what happens
only in debug builds though
if you're running with a debug build.
define "work"
abort
checking for negative refcounts is extra work that's unnecessary in a well-behaved program
so (most) assertions are disabled in release builds.
!e 
from ctypes import *
import sys
x = [1, 2]
c_ssize_t.from_address(id(x)).value = -2
print(sys.getrefcount(x))
print(x)
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | -1
002 | [1, 2]
isn’t it just as simple as --op->ob_refcnt <= 0
i wonder how many more things i can break with that
that's extra code with a branch in a very hot code path
adding that will likely make python measurably (a few %) slower
aren’t both of these just comparisons
or does the compiler have an optimization for equality comparisons
deallocating an object with a negative refcount isn't any safer than the current behavior
you want a different path for == 0 and < 0
== 0 should deallocate, < 0 should abort and tell you you have a bug
< 0 = bug ok
actually, why is ob_refcnt a signed integer anyways?
Hi, I'm trying to convert a .py file to an .exe.
I tried using pyinstaller, but it seems to give me this error:
"The term 'pyinstaller' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included , verify that the path is correct and try again."
I used the "pip install pyinstaller" command and it still doesn't work.
good question. I think the GC may use the extra bits for some bookkeeping
this is not a help channel; see #❓|how-to-get-help
I'm pretty sure the GC does its bookkeeping with 2 fat pointers it adds before every object. I would bet ob_refcnt is probably signed purely for the ability to ensure no negative ref counts on debug builds
one of those temporarily holds a refcount plus some flags though. if the refcount could use the full word there'd be no space for the flags
Oh true
da
i ran pip from the python interpreter
Are there any performance optimizations related to the typing system?
if I understand you right, no. There is mypyc which does just that, and the new adaptive interpreter is going to be effectively the same for longer-running programs.
type hints aren't really used in the specializing adaptive interpreter in 3.11+ but they are used by stuff like Cython and MyPyC
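quick illustration that the interpreter itself does nothing with the hints (the function name `add` is just an example). The annotations are stored on the function and otherwise ignored at runtime:

```python
def add(a: int, b: int) -> int:
    # The annotations are metadata only; nothing checks or uses them here.
    return a + b


result = add("x", "y")            # str arguments are accepted without complaint
hints = add.__annotations__       # the hints are still available for tools

print(result, hints)
```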
@flat gazelle @rose schooner Thanks!
what exactly are these 2 types for
https://github.com/python/cpython/blob/3.11/Include/cpython/dictobject.h#L5-L6
Include/cpython/dictobject.h lines 5 to 6
typedef struct _dictkeysobject PyDictKeysObject;
typedef struct _dictvalues PyDictValues;```
They're both in the PyDictObject struct
https://github.com/python/cpython/blob/3.11/Include/cpython/dictobject.h#L21
Include/cpython/dictobject.h line 21
PyDictKeysObject *ma_keys;```
ah so this is the struct def for PyDictKeysObject, is there a reason it's like this instead of like the other PyObject structs?
https://github.com/python/cpython/blob/3.11/Objects/dictobject.c#L453-L463
Objects/dictobject.c lines 453 to 463
static PyDictKeysObject empty_keys_struct = {
1, /* dk_refcnt */
0, /* dk_log2_size */
0, /* dk_log2_index_bytes */
DICT_KEYS_UNICODE, /* dk_kind */
1, /* dk_version */
0, /* dk_usable (immutable) */
0, /* dk_nentries */
{DKIX_EMPTY, DKIX_EMPTY, DKIX_EMPTY, DKIX_EMPTY,
DKIX_EMPTY, DKIX_EMPTY, DKIX_EMPTY, DKIX_EMPTY}, /* dk_indices */
};```
seems like it's not a PyObject. I think it's a separate object from the dict itself so that multiple dicts can share keys
what do you need help with?
do you have info on what happens to inner references when a container object is destructed? I've been looking in the collections (list / dict) but couldn't find anything
it decrefs them, e.g. in _list_clear
is that called by the GC?
ah okay that makes sense 👍
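you can watch this from Python with sys.getrefcount, as a CPython-specific sketch (the absolute numbers vary, only the differences matter). The list holds one reference per slot, and destroying the list gives them all back:

```python
import sys

item = object()
base = sys.getrefcount(item)      # baseline (includes the call's own temporary)

container = [item, item, item]    # each slot holds one reference to item
held = sys.getrefcount(item)

del container                     # list is deallocated and decrefs its elements
after = sys.getrefcount(item)

print(base, held, after)
```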
what is the ascii font text pls
- incorrect channel
- it's a combination of different characters that was probably autogenerated so you can't get it as an individual "font"
Box-drawing characters, also known as line-drawing characters, are a form of semigraphics widely used in text user interfaces to draw various geometric frames and boxes. Box-drawing characters typically only work well with monospaced fonts. In graphical user interfaces, these characters are much less useful as it is more simple and appropriate t...
How can I use annotation which refers to self? i.e
class Node:
    def __init__(self, value) -> None:
        self.value = value
        self.prev: Optional[Node] = None
I know that this is wrong, but Is there any way?
what you have actually works, why do you think it's wrong?
Self makes a difference if you care about subclasses of Node
Yes, It is what I was searching for. Thanks!
you don't need to annotate the self parameter of a method; typing tools already know that the type is the type of the instance of the class that the method was called on
the example does not annotate self
ah - I assumed that's what they were asking for Self for - you think they're asking for it for Optional[Node]?
I assumed they were asking about self.prev 🙂
you're probably right.
If I do this as I wrote the code, or don't annotate it, my linter throws err when I assign any value of Self type later.
to self.prev?
Optional[Node] is probably more correct, actually, unless you know that every element in the list will be of the same type (that is, that Node won't be subclassed, or that the list won't be made up of nodes of two different types)
It also works. Thanks!
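for reference, a small sketch of the Optional[Node] version. The `link_after` method is made up for illustration; `from __future__ import annotations` is one way (assumed here) to let signatures mention Node before the class body has finished executing:

```python
from __future__ import annotations

from typing import Optional


class Node:
    def __init__(self, value: int) -> None:
        self.value = value
        # Annotations inside a function body are not evaluated at runtime,
        # so referring to Node here works with or without the __future__ import.
        self.prev: Optional[Node] = None

    def link_after(self, other: Node) -> Node:
        """Hypothetical helper: make `other` the predecessor of this node."""
        self.prev = other
        return self


first = Node(1)
second = Node(2).link_after(first)
print(second.prev is first, first.prev)
```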
How does python allocate memory for variables and methods?
python objects are allocated to some place in the heap
when it is no longer referenced it is deleted and memory is freed
this applies to all objects
including methods
Is there a checkable parent class of ctypes types like c_ssize_t, c_bool, etc.?
c_int seems to inherit _SimpleCData, which inherits _CData
but neither _SimpleCData nor _CData seem to be importable from ctypes
I've currently got
def is_ctypes_type(x) -> bool:
    try:
        POINTER(x)
        return True
    except TypeError:
        return False
not sure if there's a better way 🥴
I can't figure a way of directly importing those types, but some hackery with _CData = ctypes.c_uint.mro()[-2] seems to actually allow me to have a reference to _CData directly 🥴
!e ```py
import ctypes
_CData = ctypes.c_uint.mro()[-2]
c_types = [
ctypes.c_bool,
ctypes.c_ubyte,
ctypes.c_uint16,
ctypes.c_int64,
ctypes.c_float,
ctypes.c_void_p,
]
print([issubclass(t, _CData) for t in c_types])
print(_CData)
@umbral plume :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | [True, True, True, True, True, True]
002 | <class '_ctypes._CData'>
🥴 interesting
now what to do with mypy 😩
delet
stdlib/ctypes/__init__.pyi line 70
class _CData(metaclass=_CDataMeta):```
but I can't import it
within if TYPE_CHECKING you can
hm...
also I guess with delayed evaluation via from __future__ import annotations it doesn't need to be resolvable
though I was hoping to be able to isinstance it at runtime as well
you can do if TYPE_CHECKING: from ctypes import _CData else: your mro() trick
The MRO trickery's only needed if you're planning on checking types at runtime, else just importing postponed annotations should be enough
i should've seen that coming, the else clause being labelled as unreachable by pylance 
!e
from _ctypes import _SimpleCData
print(_SimpleCData)
@warm breach @umbral plume
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
<class '_ctypes._SimpleCData'>
Looks like _CData isn't exposed but _SimpleCData is
thanks this seems the least cursed 👍
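putting the TYPE_CHECKING suggestion and the mro() trick together, a hedged sketch. The exact stub location of _CData is an assumption taken from the typeshed line quoted above and may differ between typeshed versions; the mro() index relies on CPython's current ctypes class hierarchy:

```python
import ctypes
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only seen by the type checker; assumed to exist in the stubs.
    from ctypes import _CData
else:
    # Runtime fallback: c_uint -> _SimpleCData -> _CData -> object
    _CData = ctypes.c_uint.mro()[-2]


def is_ctypes_type(tp: object) -> bool:
    """Return True if tp is a class derived from the ctypes base _CData."""
    return isinstance(tp, type) and issubclass(tp, _CData)


print(is_ctypes_type(ctypes.c_ssize_t), is_ctypes_type(int))
```

unlike _SimpleCData, this also matches Structure, array and pointer types, since they derive from _CData directly.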
I'm going to get reamed for this one I know but: is there a way to override how one byte code instruction is handled
!pep 523
though I think you have to replace the entire interpreter
With 523 you have to interpret all the code associated with the frame? Is that what you mean by "replace the entire interpreter"?
You can theoretically use a trace function to set frame.f_trace_bytecode and then do something with that
Depends what you want to do
Ya I hit upon that but that API doesn't let you do something different does it? It's just a hook. You still have to return control to the runtime for evaluation don't you? Maybe I'm wrong
You do, but depending on what you want to change the opcode to do you may be able to manipulate it (using a combination of frames and ctypes)
Ya. Funny ouroboros thing is when exploring that API I tried to set a breakpoint in the callback but of course that doesn't work 😜
Hmm how does PEP 659 work
what kind of structure is a user defined class?
do classes inheriting builtin types have any relation to their original struct, like PyListObject?
class Foo(list):
...
i think it's now allocated on the heap
a dynamic type object
so what exactly is at the address of an instance object of a custom class

*heap
the object
its type is the custom class
the PyObject struct
a PyObject consists of:
- reference count
- reference to type (class)
- object slots (in an object without slots, references to __dict__ and __weakref__)
Effectively it's going to be the base class' struct, with the __dict__ pointer appended to the end if the original doesn't have one. Or each __slots__ slot directly.
hm that makes sense
it seems I could still read the user classes inheriting built-ins with the built-in's struct
I guess that's reading the front part
Yep, it's the same, so the C code will just use the same pointer offsets to get to each value.
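a read-only sketch of "reading the front part" with ctypes. This assumes a default CPython build (not free-threaded, no Py_TRACE_REFS), where a variable-size object starts with ob_refcnt, ob_type and ob_size. `PyVarObjectHead` and `Foo` are illustration names:

```python
import ctypes


class PyVarObjectHead(ctypes.Structure):
    # Assumed header layout of PyVarObject on a default CPython build.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
    ]


class Foo(list):
    pass


foo = Foo([1, 2, 3])

# In CPython, id() is the object's address, so the list header can be
# overlaid on the subclass instance and read with the same offsets.
head = PyVarObjectHead.from_address(id(foo))

print(head.ob_type == id(Foo), head.ob_size)
```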
What's the difference between heap types and everything else?
I think everything is allocated in heap, so every type is heap type?
no
there are static types that are either built-ins or custom C extension types
Consider int for example
Small instances are static allocated.
Every other instance is created on heap because it can take arbitrary big amount of memory.
So, int is a static type?
Same logic for tuple, list, str, ...
bool can be allocated in static memory
Built-in classes are also mostly statically allocated afaik.
@pliant tusk do you have any idea wtf is going on here 😔
from ctypes import POINTER, pythonapi, py_object
from einspect.structs.py_dict import PyDictObject
PyDict_GetItem = pythonapi["PyDict_GetItem"]
PyDict_GetItem.argtypes = (POINTER(PyDictObject), py_object)
PyDict_GetItem.restype = py_object
d = {"A": "hi", "B": 200}
obj = PyDictObject.from_object(d)
print(PyDict_GetItem(obj, "A"))
hi
Process finished with exit code 0
d = {"A": "doggo", "B": 200}
obj = PyDictObject.from_object(d)
print(PyDict_GetItem(obj, "A"))
doggo
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
so um, PyDict_GetItem returning "hi" works, but changed to "doggo" it segfaults 🥴
!e
from ctypes import *
class PyObject(Structure):
    _fields_ = [
        ("ob_refcnt", c_long),
        ("ob_type", py_object),
    ]
class PyDictObject(PyObject):
    _fields_ = [
        ("ma_used", c_ssize_t),
        ("ma_version_tag", c_uint64),
        ("ma_keys", POINTER(c_void_p)),
        ("ma_values", POINTER(c_void_p)),
    ]
PyDict_GetItem = pythonapi["PyDict_GetItem"]
PyDict_GetItem.argtypes = (POINTER(PyDictObject), py_object)
PyDict_GetItem.restype = py_object
d = {"A": "9694-d84ea2f28502", "B": "a840-2e41464912cc"}
obj = PyDictObject.from_address(id(d))
print(PyDict_GetItem(obj, "A"))
@warm breach :x: Your 3.11 eval job has completed with return code 139 (SIGSEGV).
9694-d84ea2f28502
but if you change those two dict values to a short string like "cat" "dog" it returns without a segmentation fault...?
I'm so confused
not sure of the details, but I bet there's a refcounting bug that somehow gets hidden if the string is on a freelist/interned because it's short
defining the dict like this seems to be the same
d = {
"A": sys.intern("9694-d84ea2f28502"),
"B": sys.intern("a840-2e41464912cc"),
}
but if I call Py_IncRef(obj) before the PyDict_GetItem call, it works without the segfault
but I don't think PyDict_GetItem steals a reference of the dict object?
and also how is the dict reference increase fix related to the string length of the value...? 
well one part of the puzzle is that PyDict_GetItem returns a borrowed ref https://docs.python.org/3/c-api/dict.html#c.PyDict_GetItem
I think with your ctypes code the return value gets automatically decrefed, so you must incref it to compensate because you don't actually own a ref
is that what borrowing means? It will decref the object it returns?
that still doesn't make sense though, increfing the string (return value) doesn't do anything, the case where it's fixed is by increfing the dictionary itself
no, it means you don't own a reference to the object returned. I think with the ctypes wrapper the DECREF will happen because the Python code that gets the return value does think it owns a reference
!e
import sys
from ctypes import *
class PyObject(Structure):
    _fields_ = [
        ("ob_refcnt", c_long),
        ("ob_type", py_object),
    ]
class PyDictObject(PyObject):
    _fields_ = [
        ("ma_used", c_ssize_t),
        ("ma_version_tag", c_uint64),
        ("ma_keys", POINTER(c_void_p)),
        ("ma_values", POINTER(c_void_p)),
    ]
PyDict_GetItem = pythonapi["PyDict_GetItem"]
PyDict_GetItem.argtypes = (POINTER(PyDictObject), py_object)
PyDict_GetItem.restype = py_object
Py_IncRef = pythonapi["Py_IncRef"]
Py_IncRef.argtypes = (py_object,)
text = "9694-d84ea2f28502"
d = {"A": text, "B": "a840-2e41464912cc"}
obj = PyDictObject.from_address(id(d))
print("text refs:", sys.getrefcount(text))
Py_IncRef(text)
print("text refs:", sys.getrefcount(text))
print(PyDict_GetItem(obj, "A"))
print("text refs:", sys.getrefcount(text))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | text refs: 5
002 | text refs: 6
003 | 9694-d84ea2f28502
004 | text refs: 5
actually yeah I think that's exactly what's going on 👍 thanks
not sure why IncRefing the dictionary makes that not segfault though
actually py_object is just completely unusable with that api function, it tries to autocast the NULL pointer returned on missing keys and segmentation faults
seems the only option is to make restype = POINTER(PyObject) and cast the pointer into a py_object myself and catch the ValueError on null pointer access
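one possible shape for that workaround, as a sketch rather than the definitive fix. It uses restype = c_void_p (a variation on POINTER(PyObject), chosen here so NULL comes back as None) and only turns the address into an object after the NULL check. `dict_get` is a made-up helper name:

```python
import ctypes

# pythonapi[...] returns a fresh function object, so these settings stay local.
PyDict_GetItem = ctypes.pythonapi["PyDict_GetItem"]
PyDict_GetItem.argtypes = (ctypes.py_object, ctypes.py_object)
PyDict_GetItem.restype = ctypes.c_void_p   # raw address; NULL becomes None


def dict_get(d: dict, key, default=None):
    """Hypothetical wrapper around the borrowed-reference PyDict_GetItem."""
    addr = PyDict_GetItem(d, key)
    if addr is None:                       # missing key: C function returned NULL
        return default
    # .value creates a new strong reference, so the borrowed one is never
    # handed to Python code that would later decref it.
    return ctypes.cast(addr, ctypes.py_object).value


d = {"A": "9694-d84ea2f28502", "B": "a840-2e41464912cc"}
print(dict_get(d, "A"), dict_get(d, "missing", "n/a"))
```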
Is there a - sane - way to access the raw data going in and out of a SSLSocket? I'm working on a server that's supposed to communicate with a client that I don't have the SSL config for. So far I'm logging the client hello with a regular Socket (since SSLSocket hides the entire handshake) and trying to figure out why I get a "no shared cipher" with a field client but not with a test client. But I feel like it'd be easier going (for this and other potential errors) if I could make the test client better mimic the field client with the logged data.
This isn't correct. Small integers are also heap allocated. Those heap-allocated pointers just happen to be cached in a statically allocated array.
but "heap type" versus "static type" refers to the structure that represents the type itself, not instances of the type. Any type created with a call to the Python class statement or the type() constructor is a heap type. Types defined in C code may be heap types, or may not be - if the implementation uses static PyTypeObject ..., then it's not a heap type, it's a static type.
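if you want to check this from Python, a small CPython-specific sketch: Py_TPFLAGS_HEAPTYPE is bit 9 of tp_flags in CPython's headers, and `type.__flags__` exposes those flags. `is_heap_type` and the example classes are illustration names:

```python
Py_TPFLAGS_HEAPTYPE = 1 << 9      # value from CPython's Include/object.h


def is_heap_type(tp: type) -> bool:
    """Return True if the type object itself was allocated dynamically."""
    return bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE)


class UserDefined:                 # created by a class statement
    pass


Dynamic = type("Dynamic", (), {})  # created by calling type()

for tp in (UserDefined, Dynamic, int, list):
    print(tp.__name__, is_heap_type(tp))
```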
Pretty sure the only static instances (other than static classes) are constants like True, False, None, Ellipsis
I believe so, yeah.
i think in new versions all cached ints (-5..256) are allocated statically
really? Hm, could be that changed since I last looked...
Since 3.11a4, then. OK, cool, I stand corrected.
They used to be dynamically initialized at startup right? We've talked about this quirk in the past.
Commit message does seem to imply this: "Consequently they are no longer initialized dynamically during runtime init."
why use Py_IncRef when you can Py_INCREF from _ctypes
all ints in range(-_PY_NSMALLNEGINTS, _PY_NSMALLPOSINTS)
because those values can change
what is the latest version of protobuf can python2 support?
what's difference 
latter is just more direct
what about ob_refcnt += 1 
from einspect.structs.py_dict import PyDictObject
d = {"A": 10, "B": 20}
py_dict = PyDictObject.from_object(d)
item_ptr = py_dict.GetItem("A")
item = item_ptr.contents
item.ob_refcnt += 1
obj = item.into_object().value
print(obj)
>> 10
actually might have to try except the item = item_ptr.contents as well, since PyDict_GetItem returns a null pointer on missing keys
ok ig
are you sure about that?
Modules/_ctypes/callproc.c lines 1787 to 1793
static PyObject *
My_Py_INCREF(PyObject *self, PyObject *arg)
{
Py_INCREF(arg); /* that's what this function is for */
Py_INCREF(arg); /* that for returning it */
return arg;
}```
https://github.com/python/cpython/blob/main/Modules/_ctypes/callproc.c#L1795-L1800 also isn't this just return arg;
Modules/_ctypes/callproc.c lines 1795 to 1800
static PyObject *
My_Py_DECREF(PyObject *self, PyObject *arg)
{
Py_DECREF(arg); /* that's what this function is for */
Py_INCREF(arg); /* that's for returning it */
return arg;```
semantically the same as that, yeah. Probably optimized to the same.
almost certainly optimized to the same, in fact.
is there a way to type hint a pointer[NULL] as returned by functions like https://docs.python.org/3/c-api/list.html#c.PyList_GetItem
how does one even make a null ctypes.pointer in python?
void pointer to 0
aka ctypes.c_void_p(None)
actually I guess the pointer type doesn't change
it's always LP_PyObject, just happens to be null
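small sketch of what a null typed pointer looks like in ctypes, using c_int instead of a PyObject struct to keep it self-contained. Calling the pointer type with no arguments gives a NULL pointer, it is falsy, and dereferencing it raises ValueError:

```python
import ctypes

IntPtr = ctypes.POINTER(ctypes.c_int)

null_ptr = IntPtr()                         # no target given: a NULL pointer
real_ptr = ctypes.pointer(ctypes.c_int(5))  # same pointer type, non-null

null_is_falsy = not null_ptr
same_type = type(null_ptr) is type(real_ptr)

try:
    null_ptr.contents                       # dereferencing NULL
    raised = False
except ValueError:
    raised = True

print(null_is_falsy, same_type, raised, real_ptr.contents.value)
```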
Are you supposed to subclass typing.Protocol? Like I have this example, and I am surprised that issubclass(FooBar, HasFoo) == False but isinstance(FooBar(), HasFoo) == True
from typing import Protocol, runtime_checkable
from functools import singledispatch
@runtime_checkable
class HasFoo(Protocol):
foo: str
class FooBar:
def __init__(self):
self.foo = 'bar'
So its like, should I not be subclassing HasFoo or should I?
runtime_checkable isn't (and really can't be) reliable for instance attributes. In your case, the class doesn't actually have a foo attribute
It would be good if runtime_checkable can also check that instance have some attributes.
Every variable annotated in the protocol should be an attribute of the instance or a property on its class
But it is tricky in issubclass(). There is no way to check that instances will have some attr later (it can work for properties and attr from slots, but not for dynamic attrs)
Im trying to make functools.singledispatch work for protocols as requested here https://discuss.python.org/t/functools-singledispatch-should-support-pep-544-protocols/21912?u=melendowski
Not sure if it’s an omission or intentional. An example from the stackoverflow question: from typing import Protocol from functools import singledispatch class HasFoo(Protocol): foo: str class FooBar: def init(self): self.foo = 'bar' @singledispatch def f(_) -> None: raise NotImplementedError @f.register def _(it...
I was going to apply runtime_checkable to HasFoo and add an isinstance check in functools.singledispatch
You are saying thats a bad idea?
probably not a good idea in my opinion, but I'm not sure singledispatch is a good idea anywhere so I might be biased
Originally I thought this should just be a different dispatch decorator
haha ok. I don't really use it either but it seemed simple at first
typing.runtime_checkable also doesn't check the attribute type, only that the attribute exists it seems, so its less useful for this
yes, and checking the runtime type would be too complicated so I'd be opposed to changing that
So I'd assume then, if checking the runtime type on the attribute is too complicated, then making a dispatch algorithm for structural types, which did check the runtime type on the attribute is a no go
well, one thought there is that runtime_checkable protocols are most useful for methods where it's unlikely the signature will actually be different
e.g. technically to be an Iterable you need to have an __iter__ method with a particular signature
but just checking hasattr(obj, "__iter__") (which is what @runtime_checkable does) is a pretty good proxy for that
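a short sketch of what the runtime check does and doesn't look at, reusing HasFoo and FooBar from above. `WrongType` is an extra made-up class. isinstance only checks that the attribute exists, not its type, and the class itself never has the attribute:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasFoo(Protocol):
    foo: str


class FooBar:
    def __init__(self):
        self.foo = "bar"


class WrongType:                      # hypothetical: foo exists but is not a str
    def __init__(self):
        self.foo = 123


instance_ok = isinstance(FooBar(), HasFoo)        # attribute exists on the instance
wrong_type_ok = isinstance(WrongType(), HasFoo)   # existence is all that is checked
class_has_attr = hasattr(FooBar, "foo")           # only set in __init__, not on the class

print(instance_ok, wrong_type_ok, class_has_attr)
```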
how does python implement abstract classes
how does it check if all methods are implemented
has dict unpacking assignment been suggested before?
maybe look into match cases
nvm doesn't support it
actually it is supported by match case but you can't unpack-assign a key
yes, several times on python-ideas
https://github.com/python/cpython/blob/main/Lib/test/test_asyncio/test_futures2.py#L71-L74 am I correct in assuming this is an oversight, and this test will never be skipped, either failing due to the class not building due to attribute error, or passing due to a correct _CTask being present (or failing in case of an incorrect asyncio impl)?
Lib/test/test_asyncio/test_futures2.py lines 71 to 74
@unittest.skipUnless(hasattr(tasks, '_CTask'),
'requires the C _asyncio module')
class CFutureTests(FutureTests, unittest.IsolatedAsyncioTestCase):
cls = tasks._CTask```
nevermind, that should indeed skip since the decorator evaluates first
how to write in this format
!code
Here's how to format Python code on Discord:
```py
print('Hello world!')
```
These are backticks, not quotes. Check this out if you can't find the backtick key.
the decorator will be evaluated first, but that doesn't prevent the class from being evaluated
Hmmm, yeah, that actually won't work. If the decorator raises a Skip, it skips the whole module
Fun
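the evaluation order is easy to check with a toy decorator (all names here are made up). The decorator expression is evaluated first, then the class body runs, and only then is the decorator applied to the finished class, so an AttributeError in the body happens before any skip logic sees the class:

```python
order = []


def decorator_factory():
    order.append("decorator expression evaluated")

    def apply(cls):
        order.append("decorator applied")
        return cls

    return apply


@decorator_factory()
class Example:
    # The class body executes before apply() is called on the class.
    order.append("class body executed")


print(order)
```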
Can anyone help with this PEP ?? https://peps.python.org/pep-0582/
denball is trying to say, it works better here to ask your question directly. People might not volunteer as experts, and sometimes non-experts can help.
What do you mean by "helping with this PEP"? Help getting it accepted?
dk if it belongs here but do you know if something changed in ipython?
In [92]: def g(*x):
...: upper_frame = sys._getframe(1)
...: code = inspect.getsource(upper_frame)
...: print(code)
calling with
g(1,
2)
only yields the first line- g(1,
note this happens only on the global scope
when calling from inside a function it works as expected
excuse me?
what
Did it not used to do this?
im not sure actually. opened an issue, see what they say
The thing is that code can only be one line if I remember correctly, you'll see the same issue with the following code ```python
def err():
    raise RuntimeError()
err(
)
Traceback (most recent call last):
  File "C:\Users\%username%\Projects\test.py", line 4, in <module>
    err(
  File "C:\Users\%username%\Projects\test.py", line 2, in err
    raise RuntimeError()
RuntimeError
Well, not necessarily but you may see it that way. Is it not correct though? The actual function call is on that line, even though the parameters and closing parenthesis is on multiple lines
it is
just feels awkward when not talking in tracebacks
Yes, I'd like to see it accepted
i've done something that gets the whole thing before
#1045907088309755975 message maybe adapt this for a function
this seems to work ```py
from dis import findlinestarts
from itertools import islice, takewhile
from linecache import getline
from sys import _getframe
def g(*x):
    lasti = (frame := _getframe(1)).f_lasti
    lineno = frame.f_lineno
    lines = [*takewhile(lambda x: x[0] != lasti, findlinestarts(frame.f_code))]
    last_mention = ~next(
        (i for i, x in enumerate(reversed(lines)) if i and x[1] == lineno),
        0)
    if 'file' not in frame.f_globals:
        file = frame.f_code.co_filename
    if last_mention == -1:
        code = getline(file, lineno)
        end_lineno = lineno
    else:
        end_lineno = max(x[1] for x in lines[last_mention:])
        is_old = end_lineno == lineno + 1
        code = ''.join([getline(file, i) for i in range(lineno, end_lineno+is_old)])
        end_lineno -= not is_old
    print(code.rstrip('\r\n'))
for the most part ```py
In [30]: g('wow'
...: "it's"
...: + str(25 *
...: sum(range(7)))
...: )
g('wow'
"it's"
- str(25 *
python pass arguments by reference am i right
always, yes
Python has no concept of "value". All objects are allocated on the heap and so all objects are a reference to the object on the heap.
sorta. it's not quite the same. a lot of people like to explain it as "pass by assignment"
this is an odd way to put it. Python variables are names that refer to values. Python has plenty of values.
for some reason, CS has convinced people that there are only two styles of passing arguments: by value or by reference. Python, JavaScript, Ruby, and Java use another mechanism.
In C++, "by reference" means you can do this: x = 1; set_to_2(x); assert(x == 2). You can't do that in Python.
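the Python side of that, as a minimal sketch (the function names mirror the C++ example and are otherwise made up). Rebinding the parameter is invisible to the caller, while mutating the object it refers to is visible:

```python
def set_to_2(x):
    x = 2                 # rebinds the local name only


def append_2(items):
    items.append(2)       # mutates the object the caller also refers to


x = 1
set_to_2(x)

nums = [1]
append_2(nums)

print(x, nums)
```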
how did CS do that 🤔. just seems like people using familiar terms from C or C++
i'm not sure, but it's a really common question: "Is Python pass-by-value, or pass-by-reference?"
i wonder if the people asking are always C++ programmers.
maybe they are
probably not
then where did the idea come from?
if i had to guess, poor teaching/learning materials
right, based on C++ supremacy
well yeah. C and C++ are commonly taught in school
that doesn't mean people asking are always C++ programmers, though
right, it's an idea that has infected CS teaching overall.
idk why you keep relating it back to CS
what would you call it? You're talking about "taught in school."
meanwhile, wikipedia lists 8 styles: https://en.wikipedia.org/wiki/Evaluation_strategy
In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a parameter-passing strategy that defines the kind of value that is passed to the function for each parameter (the binding strategy) and whether to evaluate the parameters of a function ca...
👀
from einspect import view
x = 8
view(x).value = 2
assert x == 2
ctypes doesn't count.
where source
ok
to be fair the only reason it wasn't possible before was due to the mutability of the underlying value referenced by x, not really intrinsically unchangeable
you mean the immutability
would be nice if this provides a function to change the size of a tuple
or any immutable object thereof
fsvo "nice"
it does 👀
how
but that's writing on unallocated memory
it has to
there's view._pyobject.Resize which calls the resize api but requires you to have ref count == 1
i'm gonna go make a function for the exact purpose of resizing a tuple
you can also call the RESIZE macro but that will reallocate the tuple
store ref count, set it to 1, call it, set refcount back to what it was before
()i could just iterate over gc.get_referents
yeah I really don't think that's mutating the tuple anymore :p
i guess
you can do it but if there were actually references, the original reference would point to garbage
the reason they require 1 reference is since it reallocates the tuple
k i gotta go sleep
aren’t numbers immutable in python
which is why you can’t do that
but in C++, with pass-by-reference, you can.
in C++, pass-by-reference is a reference to a variable. python's are references to values.
so we can debate why you can do it in C++ and not Python, but the fact remains that the two languages have different mechanisms.
That but also in other languages you can rebind what x refers to, regardless of mutability
assuming ints were mutable in python, you still could only change the value of x, not make it another type, for example
or, c++ has no concept of mutable vs immutable values, it has a concept of const variables.
I guess python doesn't really have a language standard for mutability?
just runtime devices to prevent it in some cases
Check out the podcast with Lex
bruh
a frozen=True dataclass is completely mutable for example, even the dataclass constructor mutates it to make the class
with the creator of python
doesn't it? int/float/string/tuple are immutable, list/dict are mutable.
other types that have descriptor-level locks are "more" immutable
Got to learn so much about python from its creator
it is immutable after construction
for c descriptor locked types, yes, not a dataclass
due to __setattr__
!e
from dataclasses import dataclass
@dataclass(frozen=True)
class Data:
    x = 5
d = Data()
object.__setattr__(d, "x", 'hi')
print(d.x)
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
hi
!e then you have enum which places a descriptor lock, making it "more" immutable
from enum import Enum
class Data(Enum):
    A = 1
object.__setattr__(Data, "B", 'hi')
@warm breach :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 6, in <module>
003 | TypeError: can't apply this __setattr__ to EnumType object
Python is very flexible, but usually used in conventional ways
true yeah, I was just thinking there's no real standard way of making something immutable in python since apparently libraries do it different
i see what you mean. immutability for user-defined classes isn't a language feature, so it's hacks all the way down
actually I think there kind of is with Py_TPFLAGS_IMMUTABLETYPE on the type struct
that's CPython implementation, not language.
wait
so in python each object has an id
variables store that id
and that id is passed to arguments
set_to_2(x) can change object at x but cannot change x itself to a completely different object because the object id is passed rather than the variable itself
python’s model is “pass by object id”
hope that makes sense
thank you for coming to my ted talk
🙂
i think its called "call by value where the value is a reference" or call by sharing
there are a bunch of names people try to give it unfortunately 🙂
"call by pointer"
it is quite underrated even though many languages use it lmao
true
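A small sketch of the "call by sharing" behaviour described above. The function names `rebind` and `mutate` are made up for illustration: rebinding a parameter never affects the caller's name, while mutating the shared object is visible to the caller.

```python
def rebind(x):
    # Rebinds the local name only; the caller's variable is untouched.
    x = 2

def mutate(items):
    # Mutates the object that both the caller and the callee refer to.
    items.append(2)

n = 1
rebind(n)

values = [1]
mutate(values)

print(n, values)
```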
also do you know what is up with
x = object.__dict__
reveal_type(x)
>> Revealed type is "builtins.dict[builtins.str, Any]"
pyright seems to also reveal it to dict[str, Any]
but the runtime type is a MappingProxyType[str, Any]
so I'm starting to think it's an issue with typeshed or something
but https://github.com/python/typeshed/blob/0c196791fc40171bb9f6dadf9fb19705d82efa77/stdlib/builtins.pyi#L147-L148 it seems correctly defined?
stdlib/builtins.pyi lines 147 to 148
```py
    @property
    def __dict__(self) -> types.MappingProxyType[str, Any]: ...  # type: ignore[override]
```
ah. hm
I guess that behavior is quite weird for typing
that the type changes just due to inheritance
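For reference, a quick runtime check of the mismatch being discussed. The `Plain` class is a hypothetical stand-in. At runtime, `__dict__` looked up on a class is a read-only mapping proxy, while `__dict__` on an ordinary instance is a plain `dict`; this only shows the runtime side and does not explain what the type checkers do.

```python
import types

# Plain is a hypothetical example class used only for this check.
class Plain:
    pass

class_level = object.__dict__       # looked up on the class `object`
subclass_level = Plain.__dict__     # looked up on a user-defined class
instance_level = Plain().__dict__   # looked up on an instance

print(type(class_level).__name__, type(instance_level).__name__)
```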
set pythons minor version to 0
bad mistake
it causes the parsers feature_version to break
and be set... to 0
whats the pep for class XYZ[EFG]:
!pep 695
what will it have
generics, enums and optimizations
aren’t generics and enums already there
python has no enum keyword
the idea of an enum is that its an unique list
there's an enum module in the standard library though
what he means is no dedicated syntax
Why is dedicated syntax needed though?
Yeah, I don't think it is needed. Plus you lose out on a lot of customization.
what does generics mean? This sounds like you are trying to turn it into Java or Go?
I think he means generics as in typing. Hence he was interested in pep695 which bakes the syntax into the language instead of having to subclass typing.Generic
everyone needs a hobby I guess 🙂
like f that
can you say more about what you mean by generics?
@scenic path hello, please don't spam
what does NEWLOCALS flag mean
class MyName[T]
I think it's not really useful any more. In 2.7 it was set for all non-module code types though I'm not sure I fully understand why
python should have range literals tbh
given the language’s heavy reliance on range its not hard to see why. it means that a dedicated opcode can be made for range instead of a function call therefore can be slightly faster (just like how f-strings are faster than format)
it should be as readable as range (especially in simpler cases) while being more concise
"heavy reliance" on range?
oh no i mean common usage and use cases of it
if you are using it to iterate over indexes, you probably should be just iterating over the object directly or using enumerate() instead
other than in that case, I don't think that there are any ultra common cases in which you'd use range
fair
It's not actually required to have it be syntax to make it faster - the specialising interpreter is already doing that: https://github.com/python/cpython/blob/b0ea28913e3bf684ef847a71afcdfa8224bab63d/Python/bytecodes.c#L2541
Python/bytecodes.c line 2541
inst(FOR_ITER_RANGE) {```
What's the difference between bytecodes.c and ceval.c? Both files contain a huge switch-case for opcodes
sounds like bytecodes.c is used to generate some header which is then included in ceval.c?..
https://github.com/python/cpython/blob/b0ea28913e3bf684ef847a71afcdfa8224bab63d/Tools/cases_generator/generate_cases.py#L1-L5
Tools/cases_generator/generate_cases.py lines 1 to 5
"""Generate the main interpreter switch.
Reads the instruction definitions from bytecodes.c.
Writes the cases to generated_cases.c.h, which is #included in ceval.c.
"""```
the opcodes in Python/ceval.c from <=3.11 were moved to Python/bytecodes.c as pseudo-opcodes or something in 3.12 to allow for easy generating of actual instructions, super instructions, and specialized instructions
There’s now a header in each instruction indicating how it affects the stack, so the push/pops can be automated.
nice, embedding Forth into a C code generation program
I don't even know what level of meta this is
is there variable pinning in python match?
i read the spec and there isn’t any reference to it
wdym by variable pinning?
instead of being captured as a pattern variable, pinned variable will be evaluated and become a match constant instead
ah, you mean something like ```py
from math import pi
def is_pi(x):
    match x:
        case $pi$:
            return "yep"
    return "sorry mate"
```
Yeah that is not a thing
it only applies if you use dotted notation, like case math.pi
like this ```py
ONE = 1
match n:
    case ^ONE:
        print('number one')
``` becomes this ```py
match n:
    case 1:
        print('number one')
```
yeah
you'll have to use an additional == check
hmm
Include/cpython/unicodeobject.h line 99
Py_hash_t hash; /* Hash value; -1 if not set */```
what does it mean for a PyUnicodeObject.hash to be "not set" at -1?
I think that means the hash hasn't been computed yet.
right - strings are immutable, so their hashes are cached the first time they're needed. If it's set to -1, then the hash hasn't yet been computed. Otherwise, the already-computed value is reused.
Objects/unicodeobject.c lines 11018 to 11024
```c
    if (_PyUnicode_HASH(self) != -1)
        return _PyUnicode_HASH(self);
    x = _Py_HashBytes(PyUnicode_DATA(self),
                      PyUnicode_GET_LENGTH(self) * PyUnicode_KIND(self));
    _PyUnicode_HASH(self) = x;
    return x;
```
ah interesting
apparently one of the checks for whether a string can be mutated by CPython is that hash is -1
i just felt like turning it into a single ternary ```c
return (_PyUnicode_HASH(self) != -1 ? _PyUnicode_HASH(self) :
        (_PyUnicode_HASH(self) =
            _Py_HashBytes(PyUnicode_DATA(self),
                          PyUnicode_GET_LENGTH(self) * PyUnicode_KIND(self))));
```
though uh, is there a point where they're pre-calculated?
not that I know of, but perhaps?
it seems the value is usually populated in all cases
except when I run a pytest then it becomes -1 
why not calculate it when it's created?
why have a "lazy" hash?
because hashing is slow and expensive, and most strings will never be hashed
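The same lazy-caching pattern can be sketched in pure Python. `LazyHashed` and its `computations` counter are hypothetical names for illustration, not CPython code. Using -1 as the "not computed" marker works because `hash()` never returns -1 in CPython.

```python
class LazyHashed:
    """Wraps immutable bytes and computes the hash only on first use."""

    def __init__(self, data: bytes):
        self._data = data
        self._hash = -1          # sentinel: hash not computed yet
        self.computations = 0    # counts how often the real hash ran

    def __hash__(self):
        if self._hash != -1:
            return self._hash    # reuse the cached value
        self.computations += 1
        self._hash = hash(self._data)
        return self._hash

obj = LazyHashed(b"hello")
first = hash(obj)
second = hash(obj)
print(first == second, obj.computations)
```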
!e
from ctypes import c_int64
s = "hi"
s_hash = c_int64.from_address(id(s)+8*3)
print(s_hash)
print(hash(s))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | c_long(8187090164556928390)
002 | 8187090164556928390
!e hm, string literals might be hashed in advance. But ```py
from ctypes import c_int64
r = "h"
s = r + "i"
s_hash = c_int64.from_address(id(s)+8*3)
print(s_hash)
print(hash(s))
@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | c_long(-1)
002 | 1011790998455305619
!e ```py
print("hello" + "world" is "helloworld")
@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | <string>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
002 | True
that gets AST-optimized into a string literal
yeah constant folding
!e ```py
a = "hello "
b = "world"
c = "hello world"
print(a + b is c)
@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.
False
without constant folding it no longer works
in this case, are there ever actually two references to the same string? or does the instance from the left get deleted, and then the same memory address is used to create it again on the right?
no, they exist at the same time
the thing in the left gets optimized to helloworld which was probably interned
my guess is that after constant folding the constant interning process starts
<dis>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
0 0 RESUME 0
1 2 PUSH_NULL
4 LOAD_NAME 0 (print)
6 LOAD_CONST 0 ('helloworld')
8 LOAD_CONST 0 ('helloworld')
10 IS_OP 0
12 PRECALL 1
16 CALL 1
26 RETURN_VALUE
then that constant is put into the co_consts tuple
so they end up being the same obj
and both left and right use that constant
it loads the same constant twice
is there a cache of strings that are currently in memory, and cpython checks to see if an equivalent string exists when a new one is created?
no, this happens at compile time, @rose schooner is explaining part of the process
hmm
but if you actually create the string at runtime (e.g. with "o".join(["hell", "w", "rld"])) you get a different instance
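A sketch of the runtime case, with illustrative variable names. Whether `literal is built` holds is an implementation detail (the chat reports a different instance on CPython), so the only identity relied on here is the one `sys.intern` guarantees.

```python
import sys

literal = "helloworld"
built = "o".join(["hell", "w", "rld"])   # assembled at runtime, same characters

same_value = literal == built
same_object = literal is built           # implementation detail, not guaranteed either way
interned_same = sys.intern(built) is sys.intern(literal)

print(same_value, same_object, interned_same)
```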
how does 0 argument super work
does it get the object directly from the caller
!e
print('Hello')
@atomic grove :x: Your 3.11 eval job has completed with return code 1.
001 | File "<string>", line 1
002 | > print('Hello')
003 | ^
004 | SyntaxError: invalid syntax
!e
print('µStack')
@atomic grove :white_check_mark: Your 3.11 eval job has completed with return code 0.
µStack
.
It gets the first local variable from the upper frame afaik
so like ```py
outerframe = inspect.currentframe().f_back
super(type(outerframe.f_locals[outerframe.f_code.co_varnames[0]]))
in python code yes that's probably the equivalent
but in C code you don't do .f_back because there are no frames involving C code
no frames involving C code
wdym
so like when you call built-in functions defined in C they don't get a frame of their own
i think
so like ```py
print(2) # <module frame>
... # <inside builtin_print> <module frame>
... # <inside PyObject_Print> <module frame>
2
the equivalent in python code would need accounting for the function's own frame
C doesn't have any "built in functions" unless you're including type coercion
"A builtin function in C" refers to a builtin python function implemented in C
not anything built into c
It's not as built-into the language as much as you may think. Python technically turns each method into a closure so these are placed as cells (from my understanding, think variables the closure closes over).
The no-argument version of super() attempts to read these and supply the arguments you'd normally pass
Source: Dataclasses issue with super() when class is replaced with slots=True (creates a new class, otherwise dataclasses mutates the class)
There's also the special-casing of the name super, which, when used, counts as a use of the name __class__ (see https://github.com/python/cpython/blob/5e1adb4f8861f2a5969952d24c8ad0ce8ec0a8ec/Python/symtable.c#L1710-L1715), which, when present, adds a closure containing the class the method is defined in.
Python/symtable.c lines 1710 to 1715
```c
    /* Special-case super: it counts as a use of __class__ */
    if (e->v.Name.ctx == Load &&
        st->st_cur->ste_type == FunctionBlock &&
        _PyUnicode_EqualToASCIIString(e->v.Name.id, "super")) {
        if (!symtable_add_def(st, &_Py_ID(__class__), USE, LOCATION(e)))
            VISIT_QUIT(st, 0);
```
!e
class Yes:
    def okay(self):
        super
class No:
    def okay(self):
        duper
for cls in Yes, No:
    print(cls.__name__, cls.okay.__closure__)
    if cls.okay.__closure__:
        print("\t", cls.okay.__closure__[0].cell_contents)
@quick snow :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | Yes (<cell at 0x7fa524e4a0b0: type object at 0x55a3369e1f80>,)
002 | <class '__main__.Yes'>
003 | No None
(this is how zero-argument super() gets the class, not the instance)
right - super() evaluates to super(__class__, first_arg_passed_to_the_function)
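A short sketch of that equivalence, using hypothetical `Base` and `Child` classes. Both methods resolve the same parent method, and in both cases the compiler gives the function a `__class__` closure cell holding the defining class.

```python
class Base:
    def greet(self):
        return "base"

class Child(Base):
    def greet(self):
        return "child"

    def implicit(self):
        return super().greet()                 # zero-argument form

    def explicit(self):
        return super(__class__, self).greet()  # what the zero-argument form stands for

c = Child()
cell_class = Child.implicit.__closure__[0].cell_contents
print(c.implicit(), c.explicit(), cell_class)
```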
!e This leads to the surprising effect that unlike pretty much everywhere else in Python, you can't alias super and expect it to just work:
duper = super
class Wat:
    def __init__(self):
        duper().__init__()
Wat()
@quick snow :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 5, in <module>
003 | File "<string>", line 4, in __init__
004 | RuntimeError: super(): __class__ cell not found
!e ```py
duper = super
class Wat:
    def __init__(self):
        duper().__init__()
        __class__
Wat()
```
@raven ridge :warning: Your 3.11 eval job has completed with return code 0.
[No output]
that's fun 🙂
Exactly, use super or __class__ anywhere in the method and it will work :D
See also: Super considered super https://www.youtube.com/watch?v=xKgELVmrqfs
i managed to recreate super() in python https://github.com/thatbirdguythatuknownot/sniplections/blob/main/pysuper.py
Im a beginner btw
hello guys
hello
can u help me
can you help me
I am learning python
so I have some example
i can`t solve it
hi anyone know any beginner friendly projects/resources to learn pandas / ML in general?
Objects/boolobject.c lines 197 to 205
```c
struct _longobject _Py_FalseStruct = {
    PyVarObject_HEAD_INIT(&PyBool_Type, 0)
    { 0 }
};

struct _longobject _Py_TrueStruct = {
    PyVarObject_HEAD_INIT(&PyBool_Type, 1)
    { 1 }
};
```
is there a reason True and False are 2 different structs 
@dusk comet :white_check_mark: Your 3.11 eval job has completed with return code 0.
False
but they are same type
yes, they are
struct _longobject
if you want to create two objects, you should do this: ```c
int x = 0;
int y = 1;
``` not this: ```c
int x = 0;
int x = 1;
```
guys i only started python today can you pls help me
i will send error
#Coffee menu
menu = "Black Coffee, Espresso, Latte, Cappucino, Frappuccino"
#Ask the customer is they would like from the menu and store it in the variable order.
order = input ("name + what would you like from our menu today? Here is what we are serving." + menu)
#Ask the customer how many coffees they would like and store it in the variable QUANTITY quantity - input("How many coffees would you like?\n") #Set the price for coffee
#if order == "Frappuccino":
#price = 13
#else:
price = 8
if order == "frappucino":
price = 13
if order == "cappucino":
price = 10
if order == "latte":
price = 8
if order == "Espresso":
price = 10
if order == "black coffee":
price = 8
print("price = " +(str(price)) < ----this will not work it says eof error
The variable names are _Py_FalseStruct and _Py_TrueStruct, and they both have the type struct _longobject
They're structs because they're just statically allocated objects, as opposed to heap allocating them.
ah okay makes sense 👍
why can’t i subclass bool
because there's a flag preventing you from doing that
or maybe not
It wouldn't make any sense, since then there'd be two True objects...
if you subclass bool and not override __del__, you'd get an abort (probably)
https://github.com/python/cpython/blob/main/Objects/boolobject.c#L145-L149
Objects/boolobject.c lines 145 to 149
```c
static void _Py_NO_RETURN
bool_dealloc(PyObject* Py_UNUSED(ignore))
{
    _Py_FatalRefcountError("deallocating True or False");
}
```
there's just a deliberate decision to ensure that True and False are the only two bool values. If you could subclass bool, that wouldn't be the case anymore.
oh there is a flag to allow you to do that
but bool doesn't have it
Objects/typeobject.c lines 2366 to 2371
if (!_PyType_HasFeature(base_i, Py_TPFLAGS_BASETYPE)) {
PyErr_Format(PyExc_TypeError,
"type '%.100s' is not an acceptable base type",
base_i->tp_name);
return NULL;
}```
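The check above can be observed from Python. This sketch (the `MyBool` name is hypothetical) just captures the error that the missing `Py_TPFLAGS_BASETYPE` flag produces when the class statement runs.

```python
try:
    # The class statement itself fails, because bool lacks Py_TPFLAGS_BASETYPE.
    class MyBool(bool):
        pass
except TypeError as exc:
    message = str(exc)
else:
    message = None

print(message)
```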
just want to make a bool wrapper that prints true instead of True
maybe i subclass int instead
You could just implement a function that returns str(x)/repr(x), but changes bool values.
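One possible shape for that helper, as a sketch; the name `show` is made up here. It special-cases `bool` first, since `bool` is a subclass of `int`, and falls back to `repr()` for everything else.

```python
def show(x):
    """Like repr(), but renders booleans as lowercase true/false."""
    if isinstance(x, bool):
        return "true" if x else "false"
    return repr(x)

print(show(True), show(False), show(1), show("True"))
```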
seems to be some weird repr/str behavior if you copy bools
!e
from ctypes import memmove
x = 4500
memmove(id(x), id(True), True.__sizeof__())
print(x, int(x), type(x))
print(bool(x))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | False 1 <class 'bool'>
002 | True
x has copied memory from True and works like True except prints as False 
Objects/boolobject.c line 12
PyObject *res = self == Py_True ? &_Py_ID(True) : &_Py_ID(False);```
anything that fails for is True will automatically be False
also what on earth is up with google searches for cpython 😩
>>> x is True
false
you can search using https://github.com/python/cpython/search?q=<query>
but it's case sensitive
still can't search past versions though
trying to find where this is in pre-3.11 https://github.com/python/cpython/blob/3.11/Include/internal/pycore_dict.h#L87-L125
struct _dictkeysobject
nvm it's case insensitive but it's a word match
is this a speed optimization?
probably
so True and False are just 2 static instances of int 1 and int 0
yes
you can view the blame and see the commit where they were added
that'll probably tell you where it was moved from
_Py_ID(True) and _Py_ID(False) are an optimization (cached constants for the strings "True" and "False"), but the rest of it is the obvious way to write that method, not an optimization
given that there are only 2 bools, the obvious and straightforward way to implement repr() is to return "True" if the object is True and "False" otherwise
0 is falsy, every other integer is truthy. Bools are integers and False == 0.
So i think it is obvious to return "False" if object is False, and "True" otherwise
!cban 1058560915760492565 spam
:incoming_envelope: :ok_hand: applied ban to @cursive cypress permanently.
Does anybody know why nerdy was banned
hm, but __int__ on bools actually converts the values the struct holds
while repr checks the pointer by id 
wonder why it wasn't just one or the other
what function are you referring to? AFAICT from a quick glance, bool inherits __int__ from int
ah yeah I see, that makes sense
I guess they're just ints with overridden __repr__, |, ^, and &?
!e print(bool.__int__ is int.__int__)
@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.
True
basically, yep - https://github.com/python/cpython/blob/main/Objects/boolobject.c#L108-L192 shows what's overridden
!e do you know why these are different then
print(bool.__basicsize__)
print(int.__basicsize__)
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 32
002 | 24
it appears tp_basicsize should be the one from _longobject?
!e also what even is 32 referring to? It's neither the size of True nor False (since those just have the same size as 1 and 0)
print((1).__sizeof__())
print(True.__sizeof__())
print((0).__sizeof__())
print(False.__sizeof__())
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 28
002 | 28
003 | 24
004 | 24
tp_basicsize also accounts for alignment. Maybe that's the difference?
Objects/longobject.c lines 6282 to 6283
offsetof(PyLongObject, ob_digit), /* tp_basicsize */
sizeof(digit), /* tp_itemsize */```
`Objects/boolobject.c` lines 156 to 157
```c
sizeof(struct _longobject),
0,```
so at least part of the difference is explained by the tp_itemsize - int is variable-length, bool is fixed length
but why is bool __basicsize__ 32 
Oh, I think it makes sense.
Look at offsetof(PyLongObject, ob_digit)
tp_basicsize for an int isn't set to the size of struct _longobject!
It's shorter because it only has space out to where the digits start.
the bool one makes sense to me - it's not variable length, so it takes up exactly 32 bytes, which is the size of a struct _longobject with exactly 1 digit
hm?
but False has no digit array right?
so it's 4 bytes smaller than True
that seems true, but tp_basicsize is static
I'm not sure why they don't make bool variable like they do for int
It's not going to inherit successfully from int if it has no digit array.
longobject's tp_basicsize is 24 though, representing int 0
Stuff is going to read undefined memory then.
nah, ob_size for False is 0
actually - no, False does have a digit array. https://github.com/python/cpython/blob/main/Objects/boolobject.c#L199
Objects/boolobject.c line 199
{ 0 }```
that's the array.
does a 0 length array affect the size of a struct? 
It's not 0 length.
oh wait what
no, the struct's size is what it is, though C allows you to over-allocate and access memory past the end of the last field of the struct as though it's part of the last field, if the last field is an array. That's the trick that's being played with the digit ob_digit[1]; in the struct.
Include/cpython/longintrepr.h lines 79 to 82
```c
struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};
```
so there's always space in a _longobject for one digit, but there can be space for more if extra space is malloc'ed for it.
So... I dunno why False.__sizeof__() is reporting 24 rather than 32. Seems wrong to me.
!e print(int.__eq__(0, False), int.__eq__(1, True))
@lone sun :white_check_mark: Your 3.11 eval job has completed with return code 0.
True True
okay so it seems all my confusion comes from __sizeof__
!e
print((0).__sizeof__())
print((1).__sizeof__())
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 24
002 | 28
I always thought 0 has an ob_digit array of size 0 as this would suggest
but apparently it is size 1 with [0] of 0
so what is __sizeof__ saying here 
!e print(int.__sizeof__ is bool.__sizeof__)
@lone sun :white_check_mark: Your 3.11 eval job has completed with return code 0.
True
yeah, __sizeof__ is just lying about that, I think. It's computing:
For a type with variable-length instances, the instances must have an ob_size field, and the instance size is tp_basicsize plus N times tp_itemsize, where N is the “length” of the object.
But I don't think it's legal to under-allocate that last array field. I don't think it can legally be 0 bytes long.
that calculation makes a lot of assumptions tbh
I think tuples actually will have an 0 size array for ob_size=0?
but int(0) definitely seems to have an 1 size array holding 0
The relevant line is longobject.c 5882:
res = offsetof(PyLongObject, ob_digit) + Py_ABS(Py_SIZE(self))*sizeof(digit);
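That formula can be mirrored from Python as a rough sketch. `predicted_sizeof` is a hypothetical helper, and the digit count is derived from `sys.int_info.bits_per_digit` rather than read from `ob_size`. Zero and the bools are deliberately left out, since their reported size is exactly what is being questioned here.

```python
import sys

def predicted_sizeof(n: int) -> int:
    """Header size plus one tp_itemsize per digit, for a nonzero int."""
    bits = sys.int_info.bits_per_digit
    ndigits = max(1, -(-abs(n).bit_length() // bits))   # ceiling division
    return int.__basicsize__ + ndigits * int.__itemsize__

for n in (1, 2**30, 2**100, -5):
    print(n.bit_length(), n.__sizeof__(), predicted_sizeof(n))
```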
actually - https://en.wikipedia.org/wiki/Flexible_array_member - apparently it can be 0 in C99
C struct data types may end with a flexible array member with no specified size:
Typically, such structures serve as the header in a larger, variable memory allocation:
so maybe 0 actually does literally allocate zero bytes for the array. But False definitely doesn't.
I think this is a bug. int.__sizeof__ doesn't work properly for True and False because it relies on Py_SIZE, which ultimately relies on ob_size. But ob_size is zero for bool objects even though they have a non-empty array at the end.
yeah. ob_size for the two bool objects ought to be 1, I think.
Include/cpython/tupleobject.h lines 5 to 11
```c
typedef struct {
    PyObject_VAR_HEAD
    /* ob_item contains space for 'ob_size' elements.
       Items must normally not be NULL, except during construction when
       the tuple is not yet visible outside the function that builds it. */
    PyObject *ob_item[1];
} PyTupleObject;
```
empty tuples also have a 1-length array holding 0
not sure why the sizeof calculation assumes the array is 0 when ob_size is 0
That would make them look like variable-length objects, though, when they're not. To me the problem is that bool is reusing an int method in a situation where it's not appropriate. I think bool should get its own __sizeof__.
you can't tell that from the struct definition - you'd need to see a place where an empty tuple is allocated
well, they are variable-length-ish. Their base class is variable-length, so treating bools as a variable-length thing whose length is always 1 probably breaks less stuff than treating them as a fixed-length thing which happens to extend a variable-length thing
either way, might be worth a bug report if no one has reported it already.
I recall there's been various discussions about these variable-length end arrays, it's tricky because the support varies between C99, C11, C++ and MSVC...
So getting code that works everywhere is tricky.
Include/internal/pycore_global_objects.h line 58
PyTupleObject tuple_empty;```
so it seems it would have a 1 length array
true enough, but hasn't support for pre-C99 been dropped now?
yeah, looks like it, then.
Okay, good point. Making them true variable length objects would require at least changing how tp_basicsize is calculated for a bool, but it might be the right thing to do.
Since bool inherits from int, making bool have a fixed size is pretty non-intuitive.
It's probably not a big deal, since we're literally talking about like 8 bytes total across the whole program?
This might be inherited from when int and long were different?
Bool is kinda special.
Ionite pointed out that tuple has the same issue. I think __sizeof__ under-reports the size of most zero-length variable-length things.
there's a minimum length of 1, for all of the types with a [1] array as their last field.
But this only affects the empty tuple, which is a singleton, right?
Actually tupleobject.c says there's an empty tuple instance for each tuple subclass.
So if you create lots and lots of namedtuples then maybe this is an issue?
I bet it affects empty list as well.
It doesn't, because lists don't use the tp_itemsize feature since they can change their size.
This is only going to affect objects where you can malloc the data right into the object struct.
so yeah - pretty minor bug. Still, though, a bug, I think.
In the worst case it might affect everything that's a PyVarObject. Like bytes.
python/cpython#100637 👀
This doesn't affect bytes. It's true that a bytes object's variable length array always has at least one element, but all bytes objects are null-terminated. The null terminator isn't counted in ob_size, but it is counted in PyBytesObject_SIZE, which is effectively the fixed-length part of a bytes object.
So that's good.
hm, would the more correct fix be to make 0 length tuples and ints actually have an array of size 0? or changing sizeof
I'd let the core devs weigh in on that, honestly. The [1] instead of [] is for compatibility with pre-C99 compilers, I think - and if so, allocating 0 bytes for that array that's declared with a static size of 1 is probably undefined behavior
you can get away with over allocating, but not under allocating
could the struct declaration be size 0 instead?
for those pre-C99 compilers, no. But CPython has now dropped support for pre-C99 compilers, I think - so perhaps it could be [] instead
but that'd likely need lots of other changes scattered throughout the code base.
The C standard says that you can't write [0], and I think [] would be an incomplete type.
Oh, I see, it's variable length arrays where [] is an incomplete type.
seems like it had a long history
I just found out that the lazy importing PEP was rejected 😦 https://peps.python.org/pep-0690/
My hopes of TensorFlow not making a help dialogue taking 15 seconds are gone
is SWAP used a lot in bytecode?
It seems to go back to https://bugs.python.org/issue25221.
tomllib.load should support text files in a future version of Python
currently, a workaround is tomllib.loads(file.read())
!d tomllib.load
tomllib.load(fp, /, *, parse_float=float)```
Read a TOML file. The first argument should be a readable and binary file object. Return a [`dict`](https://docs.python.org/3/library/stdtypes.html#dict "dict"). Convert TOML types to Python using this [conversion table](https://docs.python.org/3/library/tomllib.html#toml-to-py-table).
*parse\_float* will be called with the string of every TOML float to be decoded. By default, this is equivalent to `float(num_str)`. This can be used to use another datatype or parser for TOML floats (e.g. [`decimal.Decimal`](https://docs.python.org/3/library/decimal.html#decimal.Decimal "decimal.Decimal")). The callable must not return a [`dict`](https://docs.python.org/3/library/stdtypes.html#dict "dict") or a [`list`](https://docs.python.org/3/library/stdtypes.html#list "list"), else a [`ValueError`](https://docs.python.org/3/library/exceptions.html#ValueError "ValueError") is raised.
A [`TOMLDecodeError`](https://docs.python.org/3/library/tomllib.html#tomllib.TOMLDecodeError "tomllib.TOMLDecodeError") will be raised on an invalid TOML document.
Is there a feed of some sort for new PEPs ? And, hopefully one for PEP resolution?
Probably the PEPs category on https://discuss.python.org/ is the closest.
It intentionally does not.
The upstream implementation of tomllib was/is tomli which requires an IO[bytes] object as Python's universal newlines feature goes against the TOML spec.
I'm struggling to understand this rationale. Are embedded CR legal in a string in TOML?
the restriction feels very pedantic to me
like in a multiline string? I'm not sure, but it seems like no? https://toml.io/en/v1.0.0#string
It seems like the TOML spec allows either Unix or Windows line endings (LF or CRLF), while Python's universal newlines feature allows Unix, Windows, or legacy Mac (CR, used by macOS prior to OS X). But legacy Mac line endings haven't been used for 20 years, and it seems like the only negative impact of accepting text files processed with Python's universal newlines mode is that CR characters in the file that aren't part of a line terminator would be incorrectly recognized as line breaks. So the impact of accepting a universal newlines text file is just that some things that aren't valid TOML would be parsed as though they are valid TOML (as though every CR was replaced by a NL, basically)
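To make the effect concrete, here is a small sketch with made-up bytes: reading through a text wrapper in universal newlines mode (`newline=None`) turns CRLF and a lone CR alike into `\n`, so by the time a parser sees the text the bare CR is indistinguishable from a real line break.

```python
import io

# Made-up input: one CRLF line ending, one lone CR, one LF.
raw = b"a = 1\r\nb = 2\rc = 3\n"

reader = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline=None)
text = reader.read()

print(repr(text))
```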
that's exactly tomli's argument (100% spec compliance) but yeah it seems overkill
to be fair, universal newlines should probably drop CR support at some point. The legacy mac text file format hasn't been used in over 20 years. Universal newlines support for it is just tech debt at this point, not a useful feature for Python devs
stdlib/ctypes/__init__.pyi line 275
# TODO These methods cannot be annotated correctly at the moment.```
is there any hope of ctypes.Array value annotations ever working?
imo we should just get rid of the automatic type casting of "simple types" that ctypes does, very annoying for usage and typing
but that would be a pretty major change to ctypes
!e also the ctypes "auto simple type unboxing" makes it impossible to use many pythonapis that return a PyObject pointer or null
from ctypes import *
PyDict_GetItem = pythonapi["PyDict_GetItem"]
PyDict_GetItem.argtypes = (py_object, py_object)
PyDict_GetItem.restype = py_object
d = {"A": 1, "B": 2}
print(PyDict_GetItem(d, "C"))
@warm breach :warning: Your 3.11 eval job has completed with return code 139 (SIGSEGV).
[No output]
if it didn't autocast, we would have gotten a py_object(<NULL>) and been able to catch a ValueError on accessing py_object.value, but currently this autocast gets us a segmentation fault
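One way to avoid the crash, sketched under the assumption that this runs on CPython where `ctypes.pythonapi` is available: declare the result as `c_void_p`, which ctypes converts to `None` for NULL, and only cast to `py_object` after checking. The variable names are illustrative.

```python
import ctypes

get_item = ctypes.pythonapi["PyDict_GetItem"]
get_item.argtypes = (ctypes.py_object, ctypes.py_object)
get_item.restype = ctypes.c_void_p       # NULL comes back as None instead of crashing

d = {"A": 1, "B": 2}

missing = get_item(d, "C")               # key absent: NULL, so None
found = get_item(d, "A")                 # key present: address of the value object

# Only turn the address back into an object once we know it is not NULL.
value = ctypes.cast(found, ctypes.py_object).value if found is not None else None

print(missing, value)
```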
You can make your own subclass of c_void_p with a custom value getter. Subclasses disable the unboxing iirc
!e you can actually just subclass py_object
from ctypes import *
class py_object(py_object):
def __repr__(self):
try:
return f'py_object({self.value})'
except ValueError:
return f'py_object(<NULL>)'
PyDict_GetItem = pythonapi["PyDict_GetItem"]
PyDict_GetItem.argtypes = (py_object, py_object)
PyDict_GetItem.restype = py_object
d = {"A": 1, "B": 2}
print(PyDict_GetItem(d, "C"))```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
py_object(<NULL>)
👀 do you know if I can subclass pointer in the same way
because I've been using regex to hack at __annotations__ to make it work with typing.get_type_hints
would be nice to add a __class_getitem__ onto ctypes.pointer but it seems that class is "final" and not subclassable
both pointer and POINTER are functions, not classes
they construct classes at runtime, so it would be tricky to add in the __class_getitem__
wait what
!e ```py
from ctypes import *
print(type(pointer), type(POINTER))
```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
<class 'builtin_function_or_method'> <class 'builtin_function_or_method'>
ah
*you could add __class_getitem__ to builtin_function_or_method with fishhook, but that seems out of scope for what you are trying to do
would that affect every builtin?
yea, thats why i said it would be out of scope
ctypes may provide typing information that makes it look like a class
src/einspect/protocols/type_parse.py lines 43 to 44
RE_PY_OBJECT = re.compile(r"^(py_object)(\[(.*)])$")
RE_POINTER = re.compile(r"^(pointer)(\[(.*)])$")```
i mean you could shim pointer calls with your own function
It's supposed to be in the typehint though
I think I tried using my own thing and it broke inferencing for IDE / mypy
src/einspect/structs/py_tuple.py lines 31 to 33
@bind_api(pythonapi["PyTuple_GetSlice"])
def GetSlice(self, start: int, stop: int) -> pointer[PyTupleObject[_VT]]:
"""Return a slice of the tuple."""```
like @bind_api here needs to inspect pointer[PyTupleObject[_VT]] and use it to make restype = POINTER(PyTupleObject)
you could try to use _ctypes._Pointer
@warm breach you could try using _ctypes._Pointer and add __class_getitem__ to that using fishhook
!e
does hook not work on class methods?
from fishhook import hook
from _ctypes import _Pointer
from ctypes import pointer, c_uint32
@hook(_Pointer)
def __class_getitem__(cls, item):
    return cls
t = _Pointer[c_uint32]
@warm breach :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 10, in <module>
003 | TypeError: __class_getitem__() missing 1 required positional argument: 'item'
ah I think it's static?
you need to wrap __class_getitem__ yourself, its a classmethod
!e ```py
from fishhook import hook
from _ctypes import _Pointer
from ctypes import pointer, c_uint32
@hook(_Pointer)
@classmethod
def __class_getitem__(cls, item):
    return cls
t = _Pointer[c_uint32]
@pliant tusk :warning: Your 3.11 eval job has completed with return code 0.
[No output]
@warm breach normally, __class_getitem__ is automatically wrapped on class creation, but since fishhook modifies in place you dont get the implicit wrapping
!e ```py
class A:pass
A.__class_getitem__ = lambda *a:a
print(A[int])``` see, it is only passed int
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
(<class 'int'>,)
!e ```py
class A:pass
A.__class_getitem__ = classmethod(lambda *a:a)
print(A[int])```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
(<class '__main__.A'>, <class 'int'>)
I chose not to add the implicit wrapping inside fishhook in case it is ever added natively, so for now fishhook hooks behave the same way as setattr on a regular class
python/cpython#100663 guess this might be finally getting fixed after 14 years 👀
first related issue was in 2008 it seems for python 3.0 python/cpython#47940
this feels like the time me and godly fixed the list allocation problem
link? 👀
python/cpython#31816
that also fixed python/cpython#87740
there was a follow-up after that covered by python/cpython#92914 though
apparently also found python/cpython#100659 while discussing the int/bool sizeof under-reporting yesterday
i wonder how many issues python had with memory since it was created
even until now there's something to discover and fix
Pardon me for necroing an old issue, but someone pointed out the surprising behavior of len being called twice by list(iterable), and it caught my curiosity.
the someone is me \😎
!Git
!tags return
Return Statement
A value created inside a function can't be used outside of it unless you return it.
Consider the following function:
def square(n):
    return n * n
If we wanted to store 5 squared in a variable called x, we would do:
x = square(5). x would now equal 25.
Common Mistakes
>>> def square(n):
...     n * n  # calculates then throws away, returns None
...
>>> x = square(5)
>>> print(x)
None
>>> def square(n):
...     print(n * n)  # calculates and prints, then throws away and returns None
...
>>> x = square(5)
25
>>> print(x)
None
Things to note
• print() and return do not accomplish the same thing. print() will show the value, and then it will be gone.
• A function will return None if it ends without a return statement.
• When you want to print a value from a function, it's best to return the value and print the function call instead, like print(square(5)).
You may be looking for the #bot-commands channel for testing things out
pydis collaboration
how can i get an object by its id
ctypes stuff
!e ```py
from ctypes import cast, py_object
G = "abc"
print(cast(id(G), py_object).value)
@rose schooner :white_check_mark: Your 3.11 eval job has completed with return code 0.
abc
keep in mind that the address may not be valid any more because it may be GC'd
python-c interop with ctypes be like
in cpython, as an implementation detail, that's possible - but it's not possible in Python in general.
is id even guaranteed to be an int? oh, it is
This is an integer which is guaranteed to be unique and constant for this object during its lifetime.
yes, but that's all that's guaranteed.
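if you control where the objects get created, a portable alternative is to keep your own lookup table instead of casting addresses. rough sketch below; `Node`, `register` and `lookup` are made-up names for illustration, and it only works for types that support weak references (so not plain str or int):

```python
import weakref


class Node:
    """Hypothetical example class; any weak-referenceable type works."""

    def __init__(self, name):
        self.name = name


# Maps id(obj) -> obj without keeping the object alive.
_registry = weakref.WeakValueDictionary()


def register(obj):
    """Remember obj under its id and hand it back."""
    _registry[id(obj)] = obj
    return obj


def lookup(obj_id):
    """Return the live object with that id, or None if unknown or collected."""
    return _registry.get(obj_id)


node = register(Node("abc"))
found = lookup(id(node))
print(found.name)
```

unlike the ctypes cast, a stale id gives you None here rather than a read from freed memory.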
is there some difference between how extension modules are loaded on macos and linux? what i'm seeing is an extension module that depends on symbols in another extension module works on macos and doesn't work on linux (with "missing symbol" errors). like does macos load extensions using RTLD_GLOBAL while linux load extensions using RTLD_LOCAL? that seems highly unlikely since it's dramatically different
ctypes.DEFAULT_MODE
The default mode which is used to load shared libraries. On OSX 10.3, this is RTLD_GLOBAL, otherwise it is the same as RTLD_LOCAL.
wow i'm shocked
is this only 10.3 or since 10.3
Hmm I'd have to guess only 10.3 since the docs are usually very explicit
thoughts on how this formatting looks? 👀
from einspect import view
ls = [1, 2, 3]
print(view(ls).info())
PyListObject (at 0x10486dc40):
   ob_refcnt: Py_ssize_t = 2
   ob_type: *PyTypeObject = ptr[0x105e20890] -> list
   ob_size: Py_ssize_t = 3
   ob_item: **PyObject = ptr[0x104852450] -> Array([
      ptr[0x105f06940] -> 1
      ptr[0x105f06960] -> 2
      ptr[0x105f06980] -> 3
      ptr[NULL]
   ])
   allocated: c_long = 4
PLEASE HELP IN THIS
I need to mount a DVD drive but it says Sorry this couldnt be mounted and all
I tried many ways I could find on net to solve this
Please suggest other ways this is impt
not here pls
When doing *args in a function's parameter list, args becomes a tuple: ```pycon
>>> def foo(*args):
...     print(args)
...
>>> foo(*[1,2,3])
(1, 2, 3)
```
Yet when doing unpacking assignments, like `*var = something`, `var` becomes a list: ```pycon
>>> x, *y, z = range(5)
>>> print(x, y, z)
0 [1, 2, 3] 4
```
What's with the inconsistency? The two situations appear similar in syntax and purpose, yet result in two different data types. Wouldn't it make more sense if either both situations created tuples, or both situations created lists?
https://peps.python.org/pep-3132/
Make the starred target a tuple instead of a list. This would be consistent with a function’s *args, but make further processing of the result harder.
in functions, the reasoning is probably lost to time at this point, that's quite an old feature. The starred assignment unpacking is designed for algorithms which need to do such operations, which are likely to want to mutate the unpacked sequence, as per the PEP.
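to make the difference concrete, a tiny sketch (the name `collect` is just for illustration): *args packs into a tuple, while a starred assignment target is always a list, which you can then mutate in place as the PEP had in mind.

```python
def collect(*args):
    # Extra positional arguments arrive packed in a tuple.
    return args


packed = collect(1, 2, 3)

# The starred target of an unpacking assignment is a list.
first, *rest = range(5)
rest.append(99)  # fine: lists are mutable, the args tuple would not be

print(type(packed).__name__, type(rest).__name__)
```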
Python Enhancement Proposals (PEPs)
https://www.pypy.org/posts/2023/01/string-concatenation-quadratic.html
so is joining a generator fast bc of the optimization or because of inherent ability of the generator?
str.join doesn't do string concatenation internally, but a more efficient algorithm
haven't checked the code but I believe there's an internal "writer" type that manages the space, increasing space as needed
so you don't incur the quadratic behavior of repeated copying
oh tnx ill go look at the impl
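for reference, the usual idiom looks something like this. sketch only, `build_report` and `build_report_lazy` are hypothetical names. as far as I know CPython's str.join materializes a generator into a list first, so the generator form is mostly about convenience, not an extra speedup:

```python
def build_report(lines):
    """Collect the pieces in a list, then join once at the end."""
    parts = []
    for number, line in enumerate(lines, start=1):
        parts.append(f"{number}: {line}")
    return "\n".join(parts)


def build_report_lazy(lines):
    """Same result, feeding join a generator expression instead."""
    return "\n".join(
        f"{number}: {line}" for number, line in enumerate(lines, start=1)
    )


print(build_report(["alpha", "beta"]))
```

either way each piece is copied into the result once, instead of re-copying the growing string on every `+=`.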
Fair enough, I suppose that sorta changes the question to simply "why does *args produce a tuple instead of a list?" The earliest docs I could find are for 1.4, and they feature *args, so it's a hella old thing that's probably just stuck around for compatibility.
It's a real shame too, because it feels like since then, there's been more of an unwritten convention that anywhere you're using variable-length tuples, you're better off replacing them with lists
it's probably helpful for perf that it's a tuple, because tuples are immutable so you can share them
and *args usually just gets forwarded into another call
a reason I can think of is to make it clear that passing varargs always copies.
Does that actually happen I wonder
it doesn't at least to the extent that the object id remains the same, the tuples may reuse the underlying buffer, but I strongly doubt that
can python objects share underlying memory like that?
probably not all things considered, without an explicit refcounted reference to the underlying memory
I think this code is not forwarding same tuple but is creating new tuple:
def f(*args):
    g(*args)
Or there is optimization for this case?
yeah, seems like it currently does copy into a new tuple
They can share some internal objects.
from my understanding, tuples are essentially a struct consisting of their length, followed by an array of references to their elements at the same location in memory. Since the data's stored physically as part of the tuple, wouldn't it have to be copied on each copying of the tuple?
why would you need to copy the tuple? it's immutable
!e
def f(*args):
    print("f", id(args))
    g(*args)
def g(*args):
    print("g", id(args))
f(1, "foo", 3)
@quick snow :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | f 140247978191360
002 | g 140247978195776
*args always seems to create a new tuple, even when it doesn't have to: ```py
x = (1,2,3,4)
def foo(*args):
    print(x is args)
foo(*x)
```
Prints out False, implying it's creating a new yet identical tuple
it does, but it would be a legal optimization if it didn't
some discussion in https://github.com/faster-cpython/ideas/issues/445
nice
hm, copy on write seems like a very foreign concept just to optimize args / kwargs. Since args / kwargs are usually pretty small in size, I wonder if this would really have a performance increase
my initial feeling is that such a change would actually decrease speed due to the object wrappers, albeit prevent some copy allocations
yes, for dicts it's probably difficult, but I feel like the args tuple could be shared
Yeah I don't think the data is large enough to really warrant it
why isn't **kwargs a mapping proxy but a dict
either that or *args should be a list
design wise I think immutable kwargs isn't the worst idea, since args is already immutable, but it would break a lot of existing code
does a lot of code rely on mutable kwargs
yes, the issue I linked above has some examples
I've definitely written code that mutates kwargs
where
https://github.com/bloomberg/memray/blob/4bbb311d25cebc41f8970a1180d932e0b77f1d2b/src/memray/commands/common.py#L163 is an example from one of my projects.
src/memray/commands/common.py line 163
kwargs["merge_threads"] = not args.split_threads```
actually, using kwargs.pop() is the idiomatic way to implement cooperative multiple inheritance in __init__
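a minimal sketch of that pattern, for anyone who hasn't seen it. the class names are invented for the example; each `__init__` pops the keywords it owns and forwards the rest along the MRO:

```python
class Base:
    def __init__(self, **kwargs):
        # Every cooperating class should have consumed its own keywords by now.
        if kwargs:
            raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
        super().__init__()


class Named(Base):
    def __init__(self, **kwargs):
        self.name = kwargs.pop("name")  # mutates this call's own kwargs dict
        super().__init__(**kwargs)


class Sized(Base):
    def __init__(self, **kwargs):
        self.size = kwargs.pop("size", 0)
        super().__init__(**kwargs)


class Widget(Named, Sized):
    pass


w = Widget(name="knob", size=3)
print(w.name, w.size)
```

this relies on kwargs being a fresh, mutable dict per call, which is exactly what an immutable mapping would break.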
you could just convert the mappingproxy to dict?
how often are mutating kwargs needed anyways
it's not a question of whether they're needed - mutable data types aren't ever necessary; there are languages that only have immutable data types.
It'd break a bunch of stuff to change it in Python at this point, though, so there would need to be a very good justification for changing it
no im saying how often is kwargs mutation found in code
if its not often used then there is little cost in changing
- kwargs is mutable even though it shouldn't be
- people exploit it
- now can't change it to immutable 😭
why shouldn't it be mutable?
consistency with *args
function arguments shouldn't be changed during a call anyways
reassigned
there's a very common pattern where you reassign function arguments: mutable default values
ok
def append_to(element, to=None):
    if to is None:
        to = []
    to.append(element)
    return to
for instance.
anyone know where __class_getitem__ is implemented for built-in generics like list tuple
Objects/listobject.c line 2862
{"__class_getitem__", Py_GenericAlias, METH_O|METH_CLASS, PyDoc_STR("See PEP 585")},```
hm, wonder if we can make PyCPointer_Type subscriptable as a generic
typeshed lists ctypes.pointer as a generic and shows type hints as "pointer[c_ssize_t]" etc. But you can't actually subscript that type at runtime
which means typing.get_type_hints also errors
using POINTER() feels like a bad workaround, it's a runtime evaluated type instead of something concrete. And it won't work with further generics
POINTER(Array[c_uint])
hm. That seems like something that might be worth trying to get into CPython itself, actually
__class_getitem__ is pretty new, and most likely no one thought of this as a potential use case for it - but it seems like a pretty sane one...
at least, at surface level. I haven't given it much thought, maybe there's some reason why it doesn't make any sense, but it sounds reasonable at first blush.
the only thing is currently apparently pointer is a function, not a class
not sure if it would be breaking to change it to a class with __new__ instead
or is it possible to just add that method to the c function type 
Ah, hm. That does make it tougher. Maybe this is a good argument for making it a class... Dunno, I rarely touch ctypes
it is probably a function because it has to return an instance of a class that it generates at runtime (and metaclasses in C are tricky)
hm. If it were a class, the class's __new__ could return an instance of the dynamically generated (sub?) class rather than an instance of the base class, and __class_getitem__ could likewise return the dynamically generated class rather than the base class. Maybe.
again, I'm just spitballing 🙂
tbh i think that a lot of ctypes should be rewritten in python, and just the bare minimum should happen in C. it would mean that the source code would contain examples on how to use the ctypes code in useful ways, and would probably be more intuitive to extend
how can i compile to a code object with parameters
compiling with compile gives a code object with no parameters
compile code with a function in it and grab the function's code object out of co_consts?
hmm ok
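something along these lines should do it. just a sketch; the source string and the name `add` are placeholders:

```python
import types

source = "def add(a, b):\n    return a + b\n"

# Compiling the module-level source yields a code object with no parameters...
module_code = compile(source, "<sketch>", "exec")

# ...but the function's own code object is stored among its constants.
func_code = next(
    const
    for const in module_code.co_consts
    if isinstance(const, types.CodeType) and const.co_name == "add"
)

# Wrap it in a function object (with empty globals) to call it.
add = types.FunctionType(func_code, {})
print(func_code.co_argcount, add(2, 3))
```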
Python/bltinmodule.c line 1229
/* map object ************************************************************/```
can someone eli5 https://github.com/python/cpython/blob/main/Python/bltinmodule.c#L1340-#L1381
why does map need its own little stack and in general (besides __reduce__ ig?), why isn't map a generator function?
Python/bltinmodule.c line 1340
static PyObject *```
the stack is used to call the mapped function using the vectorcall protocol, which uses a C array plus size to pass the args
it's not a generator function because it's implemented in C
couldn't some builtins like map be implemented separately in python?
also
is there not a way in c to make a generator function?
in theory yes, builtins could be implemented in Python, though it might cause some bootstrapping issues
and there's little reason to switch map to a Python version; it will likely cause some slowdown and there's no compelling advantage
yea makes sense
only advantage is ig a simpler impl (as you'd expect map to just be a function)
sure, but the implementation cost isn't that big a deal considering how many people would use this code
yea. well thanks jelle
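for comparison, a pure-Python version would be roughly this. `py_map` is a hypothetical name, and it's not an exact replacement: the real map is a class, not a generator, and checks its arguments eagerly.

```python
def py_map(func, *iterables):
    """Lazily apply func across the iterables, stopping at the shortest."""
    for args in zip(*iterables):
        yield func(*args)


print(list(py_map(pow, [2, 3], [3, 2])))
```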
does *args always get packed and unpacked each time it is passed through functions?
yup ```py
>>> x = (1,2,3)
>>> def foo(*args):
...     print(x is args)
...
>>> foo(*x)
False
```
any tips to avoid this?
avoiding packing/unpacking when not necessary, i suppose
https://github.com/omkarxpatel/Spotify-Playlist-Generator anyone have clues how i can turn this web based?
is it ever planned for isinstance to be able to handle generic types?
that's not really possible
What would it do with custom types?
I guess there could be a Dunder for it 🤔
You mean like a typing.NewType?..if so that requires a type be passed as the second argument. So why couldn't they defer to that during a isinstance check?
NewType is erased at runtime so it's not possible for isinstance to support it
and something like isinstance(x, Iterable[int]) isn't safely possible at runtime
Hey all, I don't need programming help per se but don't understand the best practices of Python projects since they seem to be structured dramatically differently than JS projects (no nested relative imports, no sibling imports). I THINK I've got the idea, and don't want to blow up this chat, but if anyone wants to proselytize how they structure projects in Python, my help post is waiting for you. https://discord.com/channels/267624335836053506/1060965255276138598
It's kind of a philosophical question so wasn't sure if it was best here or in help.
Do I understand correctly what you mean? Do you want to make if isinstance(x, list[int]): traverse the list and check if each element is an int?
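if that's the goal, you'd have to write the traversal yourself today. sketch below, assuming Python 3.9+ for `list[int]`; `is_list_of` is a made-up helper, and it only makes sense for containers you can iterate without consuming:

```python
def is_list_of(value, item_type):
    """Check the container type, then every element, explicitly."""
    return isinstance(value, list) and all(
        isinstance(item, item_type) for item in value
    )


# isinstance refuses parameterized generics outright.
try:
    isinstance([1, 2], list[int])
    generic_check_allowed = True
except TypeError:
    generic_check_allowed = False

print(generic_check_allowed, is_list_of([1, 2], int), is_list_of([1, "a"], int))
```

for something like Iterable[int] this can't work in general, since checking the elements of an iterator would consume it.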
Why are the typing and types modules separate?
Is it because types has been around in the stdlib for longer than typing, which was added in 3.5?
types is a module for creating types, typing is a module for static type hints
No I mean isinstance(x, typing.NewType("SpecialInt", int)) is just an alias for int and defer back to it
That isn't correct, because the whole point of NewType is that the original type isn't directly compatible. It also would confuse two NewTypes with the same base...
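you can see the erasure directly. small sketch, `UserId` and `OrderId` are example names:

```python
from typing import NewType

UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)

uid = UserId(5)

# At runtime the value is a plain int; nothing records which NewType made it.
print(type(uid) is int, uid == OrderId(5))

# And isinstance cannot take the NewType itself as its second argument.
try:
    isinstance(uid, UserId)
    newtype_check_allowed = True
except TypeError:
    newtype_check_allowed = False
```

so if isinstance fell back to the base type, a UserId and an OrderId would be indistinguishable, which defeats the purpose.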
hey i just posted a question but wanted to see if i could get a quick answer here, i downgraded from 3.11 to 3.10 and now my webui is still trying to read from the 3.11 version. how do i tell it to read the older version i just installed. 3.11 isnt' there and i uninstalled it using the program.
is python going to have a jit soon
pypy?
maybe
no a jit for cpython
hmm
The plan is for it to get a simple one within a year or two.
@pliant tusk currently does fishhook.hook just change the PyTypeObject's tp_flags to be mutable and call setattr?
what actually happens when setattr is called after that?
like if you setattr(int, "__add__", custom_add) does the new function actually overwrite the original type's PyNumberMethods *tp_as_number and binaryfunc nb_add?
or does it just modify int.__dict__?
I think type.__setattr__ has some logic that updates slots if you are assigning to dunder.
So setattr(X, '__add__', custom_add) is calling X.__setattr__ which is implemented in type, so slots are updated.
I have no proof of that, but it seems logical
class X: ...
X.__add__ = custom_add # slots are updated
X() + X() # works because slots are updated
get_cls_dict(X)['__sub__'] = custom_sub # slots are not updated because this dict doesn't know that it is part of a class
X() - X() # TypeError because the `__sub__` slot was never filled in
This should also work for builtin classes if you somehow unlock them
hm..
seems int.__setattr__ just resolves to object.__setattr__
so I guess it's this? https://docs.python.org/3/c-api/object.html#c.PyObject_SetAttr
PyObject_SetAttr is just an alias for setattr
object.__setattr__ has no public API i think
>>> int.__setattr__
<slot wrapper '__setattr__' of 'object' objects>
>>> object.__setattr__
<slot wrapper '__setattr__' of 'object' objects>
>>> type.__setattr__
<slot wrapper '__setattr__' of 'type' objects>
``` hmm
ah yes, it works as it should
X.a = b calls type(X).__setattr__(X, 'a', b) and not X.__setattr__('a', b)
so int.__add__ = custom_add calls type(int).__setattr__(int, '__add__', custom_add) which is equivalent to type.__setattr__(int, '__add__', custom_add)
type.__setattr__ should contain logic for updating slots
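quick way to see this on an ordinary class, no unlocking needed. sketch only; `X` and `custom_add` are placeholder names:

```python
class X:
    pass


def custom_add(self, other):
    return "added"


# Class attribute assignment goes through type.__setattr__, which also
# refreshes the matching C-level slot, so the + operator sees it.
X.__add__ = custom_add
print(X() + X())

# Storing the same name on an instance only touches the instance dict;
# operators look the method up on the type, so + is unaffected.
x = X()
x.__add__ = lambda other: "instance"
print(x + x)

# Built-in types are immutable by default, so the assignment is rejected.
try:
    int.__add__ = custom_add
    builtin_patch_allowed = True
except TypeError:
    builtin_patch_allowed = False
```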
yeah I'm trying to see where type.__setattr__ is defined in the source but

Objects/object.c lines 1012 to 1013
int
PyObject_SetAttr(PyObject *v, PyObject *name, PyObject *value)```
here's PyObject_SetAttr but it just seems to call tp->tp_setattro or tp->tp_setattr
Objects/typeobject.c line 4333
type_setattro(PyTypeObject *type, PyObject *name, PyObject *value)```
It allocates all the tp_as_* structs for a given class and all of its subclasses recursively. Also has some custom code to allow for hooking the base object type
And of course you have the orig code too
!e ```py
from fishhook import unlock
unlock(str)
str.__sub__ = lambda a, b:print(a, b)
'A' - 'B' ```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
A B
It allocates all the tp_as_* structs for a given class and all of its subclasses recursively.
wait why is this necessary if setattr works fine after unlocking? 
unlock() also allocates those structs
type_setattro assumes that if a class is mutable, it must have all of those structs defined; some versions don't bother to check, so it will segfault
https://github.com/chilaxan/fishhook/blob/master/fishhook/fishhook.py#L34-L42 is part of what makes fishhook special, cause that's what lets it work on multiple different versions with different tp_as layouts without any major changes
fishhook/fishhook.py lines 34 to 42
def get_structs(htc=type('',(),{'__slots__':()})):
    htc_mem = getmem(htc)
    last = None
    for ptr, idx in sorted([(ptr, idx) for idx, ptr in enumerate(htc_mem)
            if id(htc) < ptr < id(htc) + sizeof(htc)]):
        if last:
            offset, lp = last
            yield offset, ptr - lp
        last = idx, ptr```
hm... so these things? https://github.com/python/cpython/blob/3.11/Include/cpython/object.h#L240-L247
Include/cpython/object.h lines 240 to 247
typedef struct _heaptypeobject {
    /* Note: there's a dependency on the order of these members
       in slotptr() in typeobject.c . */
    PyTypeObject ht_type;
    PyAsyncMethods as_async;
    PyNumberMethods as_number;
    PyMappingMethods as_mapping;
    PySequenceMethods as_sequence; /* as_sequence comes after as_mapping,```
Yea
get_structs calculates the offsets and sizes required at runtime using pointer math
or these also I guess https://github.com/python/cpython/blob/3.11/Include/cpython/object.h#L165-L167
Include/cpython/object.h lines 165 to 167
PyNumberMethods *tp_as_number;
PySequenceMethods *tp_as_sequence;
PyMappingMethods *tp_as_mapping;```
yeah my thing doesn't work with this 
from einspect.structs.py_type import PyTypeObject, Py_TPFLAGS_IMMUTABLE
PyTypeObject.from_object(int).tp_flags &= ~Py_TPFLAGS_IMMUTABLE
int.__getitem__ = lambda self, index: "abc"
x = 1
print(x[0])
TypeError: 'int' object is not subscriptable
I assume since int didn't have tp_as_sequence allocated or something
ah yup
t = PyTypeObject.from_object(int)
print(t.tp_as_mapping[0])
~~~~~~~~~~~~~~~^^^
ValueError: NULL pointer access
doesn't work still 
from ctypes import pointer
from einspect.structs.object_h import PyMappingMethods
from einspect.structs.py_type import PyTypeObject, Py_TPFLAGS_IMMUTABLE
t = PyTypeObject.from_object(int)
t.tp_flags &= ~Py_TPFLAGS_IMMUTABLE
t.tp_as_mapping = pointer(PyMappingMethods())
def __getitem__(self, index):
    return index
t.__getitem__ = __getitem__
print((1).__getitem__)
AttributeError: 'int' object has no attribute '__getitem__'
or am I allocating the mappingmethod wrong
You are setting __getitem__ on t, not on int

