#internals-and-peps
ah, that's not even the one I was thinking of - I was thinking of async def coroutines vs @types.coroutine generator coroutines
i've never even heard of the latter 
!d types.coroutine
`types.coroutine(gen_func)`
This function transforms a [generator](https://docs.python.org/3/glossary.html#term-generator) function into a [coroutine function](https://docs.python.org/3/glossary.html#term-coroutine-function) which returns a generator-based coroutine. The generator-based coroutine is still a [generator iterator](https://docs.python.org/3/glossary.html#term-generator-iterator), but is also considered to be a [coroutine](https://docs.python.org/3/glossary.html#term-coroutine) object and is [awaitable](https://docs.python.org/3/glossary.html#term-awaitable). However, it may not necessarily implement the `__await__()` method.
If *gen\_func* is a generator function, it will be modified in-place.
If *gen\_func* is not a generator function, it will be wrapped. If it returns an instance of [`collections.abc.Generator`](https://docs.python.org/3/library/collections.abc.html#collections.abc.Generator "collections.abc.Generator"), the instance will be wrapped in an *awaitable* proxy object. All other types of objects will be returned as is.
New in version 3.5.
huh. is this how they did async before async and await?
oh wait, async and await were 3.4 right
generators were the legacy way of doing coroutines. @types.coroutine is a way to bridge the gap between those legacy generator coroutines and the modern async def ones by adapting them to look and behave more like the modern ones
i see. so you would just add that and you could use await on them
yep
i thought async coroutines are implemented using generators
no, though they use similar machinery under the hood.
Python Enhancement Proposals (PEPs)
You also still need this for the bottom trap of every single async event loop. Remember: await becomes yield from - not yield - and doesn't actually yield anything until one of the coroutines (generators) hits a yield point.
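Event loops need this so that they can actually yield, so usually you have some kind of wrapper function like:
```python
import types

@types.coroutine
def trap(value):
    # A bare yield: hands `value` to the event loop and returns whatever it sends back.
    return (yield value)
```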
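A minimal sketch of how such a trap reaches the loop. The names `worker` and `run` are made up for illustration, and `run` is a toy driver, not how asyncio or any real event loop is written:

```python
import types


@types.coroutine
def trap(value):
    # The only real yield point: `value` travels up to whoever drives the coroutine.
    return (yield value)


async def worker():
    # `await` delegates to trap(); nothing is yielded until trap's bare yield runs.
    reply = await trap("need-data")
    return f"got {reply}"


def run(coro):
    """Toy driver (hypothetical): answers every trapped request with 'pong'."""
    requests = []
    to_send = None  # the first send() must be None to start the coroutine
    while True:
        try:
            request = coro.send(to_send)
        except StopIteration as stop:
            return requests, stop.value
        requests.append(request)
        to_send = "pong"


requests, result = run(worker())
print(requests, result)
```

The driver sees the trapped value directly, even though `worker` never uses `yield` itself.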
I second this. It will render a lot of the school manuals that many 10 year olds read completely useless, though.
you can always use it as a variable name and ignore the style guidelines for it
what does annoy a little is whenever syntax highlighters highlight it
i know you can, I just genuinely think it's a bad idea
maybe now with mypy it's less of an issue since you'd probably get pretty bad mypy errors immediately when you make a mistake of that kind
in the past I remember genuinely wasting some time on it though
Hello guys, i have a few questions:
is setup.py still used?
do you have to use setup.py to build a wheel?
is poetry viable? or can i still stick to pip for my packages?
is there a fully-comprehensive guide on how to use pip to distribute packages?
do you have to hand code the .toml?
I have a lot of confusion...
open question at:
https://discord.com/channels/267624335836053506/1047961312451371059
!e ```py
import dis
dis.dis("f(1, a=1, b=2)")
```
@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 0 0 RESUME 0
002 |
003 | 1 2 PUSH_NULL
004 | 4 LOAD_NAME 0 (f)
005 | 6 LOAD_CONST 0 (1)
006 | 8 LOAD_CONST 0 (1)
007 | 10 LOAD_CONST 1 (2)
008 | 12 KW_NAMES 2
009 | 14 PRECALL 3
010 | 18 CALL 3
011 | 28 RETURN_VALUE
@gray galleon :white_check_mark: Your 3.10 eval job has completed with return code 0.
001 | 1 0 LOAD_NAME 0 (f)
002 | 2 LOAD_CONST 0 (1)
003 | 4 LOAD_CONST 0 (1)
004 | 6 LOAD_CONST 1 (2)
005 | 8 LOAD_CONST 2 (('a', 'b'))
006 | 10 CALL_FUNCTION_KW 3
007 | 12 RETURN_VALUE
somehow the keyword names can’t be seen in python 3.11
Is there a tool I can use to see the PVM code emitted from a python script
im pretty sure dis just doesnt know where to look for them yet
they are still located in the co_consts tuple
https://github.com/python/cpython/blob/3.11/Lib/dis.py#L364 yea dis just checks if the op is LOAD_CONST
Lib/dis.py line 364: `if op == LOAD_CONST:`
.
just drafted a pull request to fix this
what does KW_NAMES and PRECALL do
does KW_NAMES create internal pair data types that will be used as keyword args
if so then how about PRECALL
KW_NAMES indeed sets a variable to the tuple of names, which is then consumed by CALL_*. That way the location of the constants can be a separate oparg to the rest of the values. PRECALL existed as part of the specialisation system. What it does is handle adjusting the arguments if a method is being loaded or if a class is being called, meaning CALL doesn't need to care about that, and can be specialised independently - there are currently 18 different call opcodes depending on what the callable is.
Actually, PRECALL is now gone in 3.12, it was judged as not being worth it.
https://github.com/python/cpython/pull/92925
call_shape.kw_names ig
dot is python’s way to make internal variables lol
like .0 is listcomp implicit argument
It's a C local variable in the interpreter loop.
We know CALL is coming immediately after to use it.
Search for kwnames here: <https://github.com/python/cpython/blob/main/Python/bytecodes.c#L2876-L2881>
Python/bytecodes.c lines 2876 to 2881
```c
// stack effect: ( -- )
inst(KW_NAMES) {
    assert(kwnames == NULL);
    assert(oparg < PyTuple_GET_SIZE(consts));
    kwnames = GETITEM(consts, oparg);
}
```
@unkempt rock I think dicts and sets share some of the same implementation, but I'm not entirely sure
hi wookie
do you get like pings when people send messages here?
I wasn't even typing, how did you know I was here 👀
did you accidentally start that typing event broadcast self-bot of yours?
I think it must be that if you have text in the message box for a channel whenever you go to that channel it says you're typing, that's a bit silly.
this channel has become internals-of-discord-client-and-peps
Lazy imports got rejected, I'm sad
why?
!pep 690
still a draft
where are pep decisions announced, anyway?
the python-dev mailing list, i believe
there seems to have been some more support for hash(None) being a constant
i didn’t know python 3.12 is already in development
like 3.11 was just released recently
well, yeah - that's why 3.12 is under development.
any feature that didn't make it in before the 3.11 feature freeze would become part of 3.12
so 3.12 is just a continuation of 3.11
right as a new python major version is fully released, it branches and a new version continues the main branch
that's probably what they call a "feature freeze" or something
What is it currently?
!e print(hash(None))
@boreal umbra :white_check_mark: Your 3.11 eval job has completed with return code 0.
8790177024592
!e print(hash(None))
@boreal umbra :white_check_mark: Your 3.11 eval job has completed with return code 0.
8764759286608
Is it based on the id that None gets at interpreter start time?
its based on its id(), the default for all objects
sure - in the same way as the next version of any application is a continuation of the previous version. In the case of CPython there's a release schedule, so even though people always want to get new changes in, there's cutoff points once a year where they say that no new features will be added to the next version, so any new features start getting added to a branch for the version after that.
discuss.python.org nowadays
Decision on PEP 690 - Lazy Imports The Python Steering Council has decided to reject PEP 690 on Lazy Imports. We agree with the widely accepted sentiment that faster Python startup time is desirable. Large command line tools in particular suffer as that is a human user experience. Lazy imports, as proposed, are one of many potential mechanism...
PEP 638 could make a way to simulate Perl's unless
b = False
unless! b: # same as `if not(b):`
    print("Good")
I don't want pep 638 to happen. I understand the point of macros in C to reduce verbosity but I really don't think they're needed in python
macros certainly sound cool, but i really don't think that they really suit python that well
one of the things that python is well known for is its really easy-to-read syntax - adding a new feature that allows developers to essentially create their own syntax is an easy way to reduce that readability greatly
Rust macros are pretty cool. But it's a very different language
like in the PEP, they give an example of creating a macro that simultaneously creates a dict of k:v pairs, as well as a dict consisting of the inverse v:k pairs:
bijection! color_to_code, code_to_color:
    "red" = 1
    "blue" = 2
    "green" = 3
already, i'm not a fan of macros enabling writing stuff like "red" = 1, seemingly assigning a literal to another literal
main use of macros is for something like:
if DEBUG_MODE:
    assert <expensive-to-check-invariant>, ...
I don't want this to have any cost if DEBUG_MODE is false.
the one compelling argument is saying that certain libraries like numba would benefit from being directly given the AST of a function, rather than its bytecode (which is apparently what happens right now when you use a numba JIT decorator), which is something macros would enable, but i don't know enough about those types of libraries to really say much on the matter
__debug__ already exists
well, you can already get the AST of a function by fetching its code with inspect and parsing it with ast
...also, asserts don't run if you use the -O flag
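A small sketch of what `__debug__` and `-O` do, using `compile()`'s `optimize` argument instead of restarting the interpreter. `SOURCE` and `run` are made-up names for the demo:

```python
SOURCE = '''
log = []
if __debug__:
    log.append("debug-only block ran")
assert False, "assert ran"
'''


def run(optimize):
    """Compile SOURCE at the given optimization level and report what executed."""
    code = compile(SOURCE, "<demo>", "exec", optimize=optimize)
    namespace = {}
    try:
        exec(code, namespace)
        assert_message = None
    except AssertionError as exc:
        assert_message = str(exc)
    return namespace["log"], assert_message


normal = run(0)     # same as running plain `python`
optimized = run(1)  # same as running `python -O`
print(normal)
print(optimized)
```

With `optimize=1` both the `if __debug__:` block and the `assert` are dropped at compile time, so neither costs anything at runtime.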
There is an argument to be made that @dataclass et al make more sense as macros. But whether that's enough of a merit to pile on even more complexity, IDTS. match had the strong, compelling usecase of argument normalisation, and even then it only barely got in.
I don't think there is enough bad python code which would be significantly improved with macros
presumably you want some of your asserts to run in all cases and some only in "debug". But __debug__ does seem to be a solution, I didn't know about that.
For libraries like lark and kivy, external files or Multiline strings are IMO fine
aside, you don't benefit from removing a simple if check very often
unless it's some kind of very tight loop
the PEP also doesn't really delve that deep into what an implementation of a macro looks like, like it'd be nice to see what their idea of an actual implementation of that bijection! macro looks like, for example
Actually, disabling debug seems to remove all of your asserts, not just the "expensive" ones, so macros are still marginally better.
Anyway, it's a fairly minor point I agree, the cost of evaluating if DEBUG_MODE isn't going to be particularly large.
less than the typical function call at any rate
i think macros’ main function is making domain specific languages with pythonic syntax
like so
```py
recipe:
    ingredients:
        "ingredient 1"
        "ingredient 2"
        ...
    steps:
        ...
```
don't forget that macro calls have ! at the end, so it would be
recipe!:
    ingredients!:
        "ingredient 1"
        "ingredient 2"
        ...
    steps!:
        ...
You can get pretty close with existing syntax:
class ApplePie(Recipe):
    with ingredients:
        "Ingredient 1"
        "ingredient 2"
    with steps:
        ...
(Alternatively: lists, or functions inside)
most of the time, you need to extend a class, or use a with statement
You can even do:
steps: (
    "Do thing",
    "Do other thing",
)
Wait. What feature allows you to access those strings in the with block?
__prepare__
oh wait, that still won't give you strings
i can do:
class ApplePie(Recipe):
    __ingredients__
    Milk
    Eggs
using prepare
class ApplePie(Recipe):
    Ingredients
    - Milk
    - Eggs
    a: "Preheat oven to 200 degrees"
    b: "Mince the garlic in a bowl"
Is any library actually using it like this?
Not for any practical purpose I don't think
I've used it for easy_z3 but that's more of a poc of the tech
wait how
metaclass __prepare__ that returns a weird mapping
tell me more about that
the __prepare__ method on the metaclass returns the mapping object in the class namespace
so the Ingredients line does something like ns = metaclass.__prepare__(); ns["Ingredients"]
so your __prepare__ method has to return an object with a __getitem__ that records all the times it was called
and it has to return some object that implements __neg__ so the - Milk line works
then the a: "Preheat oven" line, that's a variable annotation
I think that gets compiled to something like ns["__annotations__"]["a"] = "Preheat oven"
so your weird mapping can record that too
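A rough sketch of the `__prepare__` trick described above, covering only the name lookups and the `- Milk` lines (the annotation part is left out, since how annotations are stored differs between Python versions). All class names here are invented for the demo:

```python
class Item:
    """Placeholder handed back for any undefined name in the class body."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __neg__(self):
        # `- Milk` first looks up Milk, then negates the result.
        self.log.append(("neg", self.name))
        return self


class RecordingNamespace(dict):
    """The 'weird mapping': records every lookup of a name that was never assigned."""

    def __init__(self):
        super().__init__()
        self.log = []

    def __missing__(self, key):
        # Let dunders such as __name__ fall through to globals as usual.
        if key.startswith("__") and key.endswith("__"):
            raise KeyError(key)
        self.log.append(("lookup", key))
        return Item(key, self.log)


class RecipeMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return RecordingNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        cls.recorded = list(namespace.log)
        return cls


class Recipe(metaclass=RecipeMeta):
    pass


class ApplePie(Recipe):
    Ingredients
    - Milk
    - Eggs


print(ApplePie.recorded)
```

One caveat with this sketch: every undefined-in-the-body name is captured, so builtins like `print` would also be swallowed inside such a class body.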
I actually wrote a real use for this once, it lets you write serializable algebraic datatypes like
```py
class Maybe(ADT):
    Nothing(tag=0)
    Just(value=Any, tag=1)
```
taxonomy/adt.py line 36: `class _ADTNamespace(MutableMapping[str, Any]):`
!pypi dont :P
Hey, I had made something similar, kinda
https://gist.github.com/decorator-factory/b2fd85ef8248c9230835461c1ec24597
but more cursed, including meta-metaclasses
So I like pep695 but don't like the new soft keyword of type why not just make it typedef? https://discuss.python.org/t/pep-695-type-parameter-syntax/21646?u=melendowski
PEP 695 is posted. It proposes to add an improved syntax for specifying type parameters within a generic class, function, or type alias. It also introduces a new statement for declaring type aliases. This PEP has already gone through several cycles of discussions in the typing-sig forum and public (virtual) meet-ups in the Python typing commun...
creating a new (hard) keyword breaks any code that happens to use it as a name
It could be typedef and still be soft
typedef doesn't give off big python vibes
who cares about not giving python vibes
Why is
type ListOrSet[T] = list[T] | set[T]
needed, couldn't it be
ListOrSet[T] = list[T] | set[T]
?
I do think type foo = bar looks very unlike Python, and more like C, Java, ...
Why not
ListOrSet[T]: type = list[T] | set[T]
?
indexing an undefined variable
your choices are to either have implicit T (current solution), or define ListOrSet first (ugly solution), or use some other syntactic structure (such as the one proposed)
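For comparison, a sketch of the "define it first" route with an explicit `TypeVar`, plus what happens if you try the bare subscript assignment. The variable names are just for illustration:

```python
from typing import TypeVar, Union

# The pre-PEP 695 spelling: T has to exist before the alias can mention it.
T = TypeVar("T")
ListOrSet = Union[list[T], set[T]]
IntListOrSet = ListOrSet[int]

# The bare form is an ordinary subscript assignment, so the right-hand side
# is evaluated first and trips over the undefined T.
try:
    exec("ListOrSet[T] = list[T] | set[T]", {})
except NameError as exc:
    error = str(exc)
else:
    error = None

print(IntListOrSet)
print(error)
```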
core developers typically
fwiw I'm not all sold on the soft keyword yet
something like ListOrSet: TypeAlias[T] = list[T] | set[T] would also work although a tiny bit verbose
I don't think type aliases are exactly a first-class citizen
ListOrSet: type[T] = list[T] | set[T]
type[T] has different denotations, though
also this is difficult to read by static analyzers as a type alias (as opposed to a non-type assignment)
ListOrSet: _[T] = list[T] | set[T] 
I get the vibe that the new soft keyword is trying to sneak in via the merits of the rest of the pep :>
honestly, I am strongly doubtful something only usable for typechecking is getting into the python core
I'm not sure if this is the right place to ask but this is a strange question, and google isn't much help. I've run into a situation where certain frames of the call stack are "missing" in the debugger and the traceback. Has anyone seen this before? This is a minimal example:
from sqlmodel import Field, SQLModel
class A(SQLModel, table=True):
    id: int = Field(primary_key=True)
a = A(id=5)
breakpoint()
a.dict()
When you step into the dict method. It actually skips several frames and jumps into a different function call deep in the call stack (namely _calculate_keys()). You don't actually see the dict method. You can force an error by changing the library code, and use pdb.post_mortem to trick your way into the missing frame. Pdb will tell you you're in the dict method, but if you use the longlist command, it says EOF, it can't find any code. What's going on here? For more context, dict is a pydantic method, it's a normal python method and isn't a C extension or anything weird as far as I can tell. (SQLModel inherits BaseModel).
pydantic is compiled with Cython
(IIRC)
Oh wow, I didn't see any trace of Cython in the pydantic repo but it turns out you're right. It said so in the installation docs which I didn't bother reading. Should've RTFM. Thanks.
Why is (*a,) 2 times slower than tuple(a)?
```
C:\Users\denba>py -m timeit -s "x = [*range(10**6)]" "(*x,)"
20 loops, best of 5: 16.7 msec per loop
C:\Users\denba>py -m timeit -s "x = [*range(10**6)]" "(*x,)"
20 loops, best of 5: 16.9 msec per loop
C:\Users\denba>py -m timeit -s "x = [*range(10**6)]" "tuple(x)"
50 loops, best of 5: 8.64 msec per loop
C:\Users\denba>py -m timeit -s "x = [*range(10**6)]" "tuple(x)"
50 loops, best of 5: 8.46 msec per loop
```
0 0 RESUME 0
1 2 BUILD_LIST 0
4 LOAD_NAME 0 (x)
6 LIST_EXTEND 1
8 LIST_TO_TUPLE
10 RETURN_VALUE
seems like it copies stuff twice: first into a list, then into a tuple
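A quick way to reproduce the comparison from a script. This is only a sketch: opcode names and timings differ between CPython versions and machines, so it prints them rather than assuming particular values:

```python
import dis
import timeit

x = list(range(10_000))

# Both spellings build the same tuple.
same = (*x,) == tuple(x)

# List the opcodes each expression compiles to on this interpreter.
unpack_ops = [ins.opname for ins in dis.get_instructions("(*x,)")]
call_ops = [ins.opname for ins in dis.get_instructions("tuple(x)")]

# Time both spellings against the same list.
t_unpack = timeit.timeit("(*x,)", globals={"x": x}, number=200)
t_call = timeit.timeit("tuple(x)", globals={"x": x}, number=200)

print("(*x,)   ", unpack_ops, f"{t_unpack:.4f}s")
print("tuple(x)", call_ops, f"{t_call:.4f}s")
```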
Now i see. [*a] and list(a) take the same amount of time:
```
C:\Users\denba>py -m timeit -s "x = [*range(10**6)]" "[*x]"
50 loops, best of 5: 8.54 msec per loop
C:\Users\denba>py -m timeit -s "x = [*range(10**6)]" "list(x)"
50 loops, best of 5: 8.58 msec per loop
```
Oh wow this is surprisingly horrible
That technically means that [*x] is slower, as the alternative list(x) does a global lookup and function call on top of actually converting it to a list
why do all comparison operators come from 1 instruction?
why not BINARY_GREATER, BINARY_EQUAL et cetera
Actually in CPython there's no other way to build a tuple
there is also BUILD_TUPLE
🤔
but that can't work with unpacking
because a list looks like this:
(header) (length) (ptr)
                    |
                    V
                   [ ptr0 ptr1 ptr2 ptr3 ptr4 ...]
And in a tuple, the items are embedded directly into the memory location, kinda like this
(header) (length) (ptr0) (ptr1) (ptr2) ...
yeah I meant a variable-length tuple
or rather, a non-known-in-advance
i wonder if there would be any speed benefits from making the underlying array for lists a tuple, so that operations like tuple([*iter]) would not require the copying from the list to the tuple
then what if the list is mutated
it would only be allowed if the list has a refcount that proves it isnt referred to
and you might be able to optimize stuff like t = tuple(t) where t is a list
or you could check the refcount of the referred to tuple on mutations and copy if needed
then code like
x = [1,2,3,4]
y = tuple(x) # y would point to x->tup
del x
would only require one backing array
Not just that it isn't referred to, but that it can't ever be referred to - right?
otherwise you have to worry about things like
```py
x = [1,2,3,4]
y = tuple(x) # y would point to x->tup
x[0] = 42
```
right so list.__setitem__ would have a check like if l->tup->ob_refcount > 1 {reallocate()}
ah, I see
and this check should compile down to a double dereference so that should be fast even when a realloc isnt needed (ie when there isnt a copy of the list)
so that would make all lists in every Python program 16 bytes larger, and every list mutation or deletion a bit slower, and in exchange would make it faster to construct a tuple from a list.
seems possible, now that I understand what you're proposing, but I'm not convinced it's a good tradeoff.
yea i don't know if it would be a good idea, just would be interesting
note that it would be possible to optimize (*x,) even without doing that. A possible implementation of (*x,) would be to call x.__length_hint__() first and, if it exists, pre-allocate a tuple of the hinted size and start unpacking directly into it. If __length_hint__ doesn't exist, or if the hint was wrong and there are more items than the hint suggested, it could fall back to the current way.
does it not call __length_hint__ now inside of UNPACK?
i know the tuple constructor calls it
this might be worse for small iterables because of the extra cost of calling __length_hint__
__length_hint__ isn't a slot I believe so calling it requires going through the method dict
indeed, I probably should have put "optimize" in some scare quotes there.
the LIST_EXTEND might, but my point is that it doesn't necessarily have to BUILD_LIST / LIST_EXTEND / LIST_TO_TUPLE - it could just start with a tuple if __length_hint__ exists and is correct.
not necessarily better, and I'm not sure this is a common enough operation to be worth any special optimization, but that's one route that could work.
another would be assuming that unpacking is much more common for small iterables than large ones - it could start with a tuple of size 10, unpack into that, and if it fills it then it could realloc.
at the C level, tuples can be resized, so the fact that it doesn't know the final size isn't even necessarily a problem.
list_extend (which I think is what LIST_EXTEND ends up calling) does use __length_hint__
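For reference, a small sketch of what a length hint looks like from Python, via `operator.length_hint`. `Countdown` is a made-up iterator for the demo:

```python
import operator


class Countdown:
    """Iterator that can report how many items it has left."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n

    def __length_hint__(self):
        return self.n


# List iterators report how many items remain.
hint_list_iter = operator.length_hint(iter([1, 2, 3]))
# A custom iterator can supply its own estimate.
hint_custom = operator.length_hint(Countdown(5))
# Generators have no hint, so the supplied default comes back.
hint_generator = operator.length_hint((i for i in range(3)), 8)

print(hint_list_iter, hint_custom, hint_generator)
print(tuple(Countdown(5)))
```

The hint is only an estimate, so a consumer that pre-allocates from it still has to cope with it being wrong.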
why are None, True and False in CamelCase?
You mean Pascal case. To differentiate between built-in variables and defined variables (plus other reasons prob)
https://peps.python.org/pep-0285/ says that True and False were chosen rather than true and false for consistency with None
Other languages (C99, C++, Java) name the constants “false” and “true”, in all lowercase. For Python, I prefer to stick with the example set by the existing built-in constants, which all use CapitalizedWords: None, Ellipsis, NotImplemented (as well as all built-in exceptions). Python’s built-in namespace uses all lowercase for functions and types only.
At first True and False weren't keywords actually
Just noticed that python is around 40% faster on WSL than on Windows
ubuntu 22.04, Windows 10, CPython 3.11 (two different installations, one in windows, one in wsl), same benchmark
speed differs on different benchmarks, it is in range 20-50%
Why?
what an oddly authoritative answer. Not super consistent - why should str be lower and Exception be pascal? - but authoritative nonetheless.
True, False = False, True
The more u know
yep
in fact NotImplemented is still not a keyword
!e
NotImplemented = 42
print(NotImplemented)
@grave jolt :white_check_mark: Your 3.11 eval job has completed with return code 0.
42
Built-in exceptions are classes so not sure what that comment is about 🤷♂️
Ellipsis too
At first True and False didn't exist at all, until surprisingly recently. Later they existed, but weren't keywords. Now they're keywords.
Might be cmake? Since it doesn't ship by default with windows
you broke the matrix
hi, i just want to check, is it only me or anyone else...
i feel very anxious to read code if there isn't an empty line every 4,5 lines
im using black, it enforces empty lines before/after every function/class
but im also adding empty lines inside functions to separate logic and syntactic blocks of code
Good visual cues help when reading code!
Which is why a single line with a big line of comments above it stands out as Very Important Do Not Touch
I noticed that itertools is missing a chunk_by (or chunked) implementation as an undo for the chain function. Is there a technical reason (beyond no one created a PEP) for this?
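For what it's worth, a chunking helper is only a few lines on top of `islice`. This is a sketch with a made-up name (`chunked`), not an itertools function; Python 3.12 added `itertools.batched`, which yields tuples instead of lists:

```python
from itertools import chain, islice


def chunked(iterable, size):
    """Yield successive lists of up to `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    # islice pulls at most `size` items; an empty list means we are done.
    while chunk := list(islice(iterator, size)):
        yield chunk


chunks = list(chunked(range(7), 3))
round_trip = list(chain.from_iterable(chunks))
print(chunks)
```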
I think I might have an idea as to why this is the case. Originally in like Python 1.X builtin types were totally different to classes, so you couldn't inherit etc. int(), str(), list() etc were merely functions which did a conversion, but were eventually replaced by the actual classes when that became possible. If you go to the earliest copy of the docs (Python 1.4), you can see that they're described as functions, and the builtin types in their section don't get given a specific identifier:
https://docs.python.org/release/1.4/lib/node26.html#SECTION00330000000000000000
I wondered if it was related to the class vs type distinction that used to exist, but Exception would have been a type as well, AFAICT
I like the theory that it might be related to whether or not a given type could be subclassed back in the long long ago
I was checking for 3.11, didn't see that. Thanks
Well exceptions were originally string values - that's why the exc_info tuple exists, and why except Exception1, Exception2: is a syntax error... Oh wow. They weren't even a specific class apparently, just a string compared by identity - you'd be raising like ("IOError", "Invalid permissions", <traceback>). So then the rule makes sense, CamelCase would be used for constant names, including None.
https://docs.python.org/release/1.4/lib/node25.html#SECTION00320000000000000000
.
of all the bad ideas you suggested this one may or may not actually make some sense
well first of all that reduces the work
they all just call the same thing (PyObject_RichCompare()) anyway
the mode of comparison is just passed through the oparg
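You can check this from Python with `dis`. A small sketch (the helper name `opnames` is made up; the exact surrounding opcodes depend on the CPython version):

```python
import dis


def opnames(source):
    """Return the opcode names a source snippet compiles to."""
    return [ins.opname for ins in dis.get_instructions(source)]


lt_ops = opnames("a < b")
eq_ops = opnames("a == b")

# Both comparisons use the same instruction; only its oparg differs.
print(lt_ops)
print(eq_ops)
```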
Hello all - I wanted to get some clarification around a stylistic choice. For multiline arglists, I adopted a style long ago (that I thought was PEP-8 compliant, but now I can't seem to find where I got it...):
def some_function(
        first_argument: int,
        second_argument: int,
        third_argument: int,
    ) -> int:
    """Get the product of three integers."""
    return first_argument * second_argument * third_argument
I've seen the variant where the closing ) of the function declaration is on the same line as the final parameter, and I've seen it where it's on its own line but in line with the def keyword. I also know that double-indenting the multiline arglist is a common style. What I can't seem to find is whether it's atypical to indent the closing ) in line with the function body in these situations. To me, this is very readable, but I can definitely imagine counterarguments to it. Thoughts?
does anyone know if Read After Frees are considered critical python vulns? And they use some obviously weird code so the code structures probably wouldn't show up by accident
yes, a read after free is bad; it can lead to segfaults and security issues
yea ik its bad, im just trying to determine if i should report it using the security email, because its not the kind of code that can be hit by accident. (it requires freeing a buffer inside of a dunder)
probably fine to just report as a bug? there's a bunch of similar issues like https://github.com/python/cpython/issues/87353 where there were segfaults when doing something weird that haven't been treated as security issues. But maybe it's better to be safe than sorry
C moment 😔
https://github.com/python/cpython/issues/97592 is another one in the same style
I am more used to this style
def some_function(
    first_argument: int,
    second_argument: int,
    third_argument: int,
) -> int:
    """Get the product of three integers."""
    return first_argument * second_argument * third_argument
I guess it's a bit easier to scan because the closing line is on a different level. And there's a tiny bit more space
But I guess it's more of a personal preference (or a preference of your team)
my team lead uses double indents for arguments, with blocks etc. and it drives me crazy
PEP8 suggests some styles but doesn't force one
Right, right - and yeah I used to do it the way you do. I don't know why I landed on the double-indent style. Probably some SA SO discussion on it a few years ago.
SA?
Er, meant SO -- Stack Overflow
o
oh how I wish this was in ABCs
It doesn't matter much. What's more important is to just be consistent
hmm, this isn't the channel I thought it was
kinda inconsistent 😉
yep
what exactly is it in abstract classes that causes a TypeError on initialization with missing abstract methods? I wasn't able to find any code in the abc module for it, just the code that collects them into cls.__abstractmethods__, and the raised type error's traceback doesn't even include the path to where it was raised from
It's apparently built into object.__new__ - https://github.com/python/cpython/blob/4246fe977d850f8b78505c982f055d33d52ff339/Objects/typeobject.c#L4971-L4976
Objects/typeobject.c lines 4971 to 4976
```c
PyErr_Format(PyExc_TypeError,
             "Can't instantiate abstract class %s "
             "without an implementation for abstract method%s '%U'",
             type->tp_name,
             method_count > 1 ? "s" : "",
             joined);
```
That's very weird. It makes every object creation slower, even if they are not instances of ABC-related classes.
by one easily predicted branch based on a single bit check
Objects/typeobject.c line 4937
```c
if (type->tp_flags & Py_TPFLAGS_IS_ABSTRACT) {
```
anyway it is not related to object itself
why code related to some module is included into object.__new__?
also this:
```py
>>> type.__dict__['__abstractmethods__']
<attribute '__abstractmethods__' of 'type' objects>
```
https://peps.python.org/pep-3119/ says that making object.__new__ check Py_TPFLAGS_IS_ABSTRACT is an optimization over checking whether __abstractmethods__ is non-empty
it is making creation of instances of possibly abstract classes faster, but it is slowing down all other calls to object.__new__
hm - does it need to be done this way to support other metaclasses?
like, does it need to be possible to create an abstract class without using ABCMeta so that you can have an abstract class with a different metaclass?
Py_TPFLAGS_IS_ABSTRACT and related things are implementation details, so if you somehow created abstract class without abc you screwed up
the PEP seems to be structured to say that abstract classes are a first class feature of Python, and that the abc module is just a helper, not the only way of creating abstract classes
there is no public API (Python nor C) that allows us to create abstract classes
even Py_TPFLAGS_IS_ABSTRACT is not documented in Py_TPFLAGS list
this definition from glossary is not even related to object creation, it can be implemented in pure python without messing with any type flags
hm, indeed...
oh wow, that's incredibly weird
mildly cursed 
!e
Here's a fun way to crash Python
from abc import ABC
class X(ABC):
    pass
X.__abstractmethods__ = iter(int, 1)
print(X())
@grave jolt :warning: Your 3.11 eval job timed out or ran out of memory.
[No output]
I think I meant to reply to @raven ridge
you can indeed create your own abstract class without the abc module
Py_TPFLAGS_IS_ABSTRACT is actually set by CPython based on the existence of `__abstractmethods__`, so no, you don't need to use the abc module
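A sketch of that on CPython, with no `abc` import at all. It relies on the undocumented `__abstractmethods__` attribute, so treat it as an implementation detail; `Shape` is a made-up class:

```python
class Shape:
    def area(self):
        raise NotImplementedError


# Assigning a non-empty collection here makes CPython flag the class as abstract.
Shape.__abstractmethods__ = frozenset({"area"})

try:
    Shape()
except TypeError as exc:
    message = str(exc)
else:
    message = None

print(message)
```

The exact wording of the error differs between versions, but it comes from `object.__new__`, not from anything in the `abc` module.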
there is an error when I try to install the noise library for perlin noise. I use python 3.10 and am using pip to install it. There are some threads online about this but I am too stupid to understand these, can someone help fix this.
you leaked your bot token. you should reset it
Hey @mild sonnet, I removed your message because it had your bot token. Please change it as soon as possible.
If you have a question about discord.py, see #❓|how-to-get-help or #discord-bots
Though __abstractmethods__ is undocumented, so it's not clear that you're meant to...
follow the link
should stdlib imports come before third-party imports which comes before user packages import? this doesn't seem to be common but I do that
that's a common convention yes
I want to add a readonly property to the base object class. The property will be the address in memory. I know id() can do that, but it would be cool to have that alternative method. Since it must be written in C, because Python can't use extensions like C#, how would I go about adding such a property?
I will try but I don't think it will do much. It is an error with python itself
well you have to download build tools (provided by the link) to make it work
it's not an error within python itself
You could use fishhook to do that
!e ```py
from fishhook import hook

@hook.property(object)
def addr(self):
    return id(self)

print(int.addr)
```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
140174779776640
So, would this allow me to in that script do for example if I have a list that is [1, 2, 3] and the var name is list1. I can do list1.addr and it will print the memory location?
Yea, it adds a property named addr to every instance of object
This is a description I found. What does this all mean? And you can have static classes? And what are heap classes. I know what the heap is. But what are heap classes?
Static classes are classes defined in C
Heap classes are mutable by default (like all python classes and some very specific c classes)
I wrote that description a while ago and it was mostly referring to the memory location of a given class (heap or static memory)
But since then python has added the functionality for C classes in the heap, but fishhook still works
Wdym. C classes aren’t mutable? Like I can’t change properties of their classes? And wait C doesn’t have classes?
U MADE IT?
Yea it's one of my more fully fledged personal projects
HOLY CRAP
How long have u been programming for?
That’s a rlly advanced project.
I released the first version of fishhook in 2020, and I think I started learning python a year or 2 prior
Damn. So how exactly does it work?
And how did u learn all the low level stuff u needed to do it?
Did u read cpython internals and learn C?
Yea I read the internals a lot
I see. Did you learn C before python?
it takes at least a year or two of programming to get most concepts right
the rest is learned at any time
Or C as u went along?
Hm I see. That’s just so cool.
Very true
No I learned more from #esoteric-python
same
There was another user who showed me early on how you could change the builtin int.__add__ with ctypes (fully manually) and it got me interested in doing it in a more general way
That's how fishhook came about
I don't know if the other user is still here, they used to be Juan or something
You are faster at searching than I am lmao
Yea they are now @twilit garnet. And yea that got me interested in the internals
i just searched in:esoteric-python int ctypes
Ah I did int .value = ctypes was smarter
i got interested when some code object i was building segfaulted because i removed a POP_TOP
skorb pointed to the line that caused the error and i got to modifying the internals to show a detailed error #esoteric-python message
Lol I show up like 5 messages after that one
i think the bug with your load_addr() thingy also gave motivation to start learning the internals
Oh yea how it used LOAD_DEREF?
I think I remember you asking for a ton of details on how load_addr worked
Oh fr? What is that?
What’s that?
Sry I had to go, I’m doing be and stuff 😅
Also, ur lib has so much memory stuff and unlocking and locking. How does that work? @pliant tusk
Those functions toggle some type flags to trick python into thinking a given type is mutable/immutable
How did u figure that out?!
!e ```py
from fishhook import lock
class A:pass
lock(A)
A.a = 1 # fails ``` for fun you can do this too
And know u had to do that?
@pliant tusk :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 7, in <module>
003 | TypeError: cannot set 'a' attribute of immutable type 'A'
Lots of reading typeobject.c
What is that?
And how did u know to read rhag specifically?
That
The file that controls how types work
I looked specifically at the implementation of setattr
And how it sets slot functions to the correct pointers
Slot functions?
And what do the pointers ur referencing do?
Slot functions specifically control dunder functions
So stuff like __add__
So the first version of fishhook was written to calculate those slot pointers, but now I just abuse setattr to set them for me
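A small sketch of why slots matter: implicit dunder calls such as len() go through the type, so an attribute set on the instance is ignored, while setting it on the class lets setattr fill in the slot. The class `A` is just a placeholder for illustration.

```python
class A:
    pass


a = A()
a.__len__ = lambda: 5            # instance attribute: len() never looks here

try:
    len(a)
    instance_dunder_used = True
except TypeError:
    instance_dunder_used = False

A.__len__ = lambda self: 5       # type attribute: setattr updates the slot
type_dunder_result = len(a)

print(instance_dunder_used, type_dunder_result)
```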
I see.
Where did u learn all this low level stuff? @pliant tusk
Lots of trial and error
How long did it take u to make the lib?
And was it a consistent two years of python before u made it?
A few weeks probably
That’s IT?
By the time u started working on it, were u well versed in how u were gonna do it?
I had written some small specific stuff so I understood how to do it. Fishhook was just a new strategy meant to be more general
I see. Well its a fantastic lib. Did u learn a lot of python in Uni if u did that?
I was still in HS when I released that lib lol
WHAT
U making me sad now 😭
💀
i'm... still in (j)hs
!e but what about 👀
import ctypes
from ctypes import pythonapi
from fishhook import lock

def setattr(obj, name, value):
    get_dict = pythonapi._PyObject_GetDictPtr
    get_dict.restype = ctypes.POINTER(ctypes.py_object)
    get_dict.argtypes = (ctypes.py_object,)
    ptr = pythonapi._PyObject_GetDictPtr(obj)
    ptr.contents.value[name] = value

class A:
    pass

lock(A)
setattr(A, "x", "what 🤔")
print(A.x)
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
what 🤔
I think a lot of people here still are
hehehe hello!
Not as as advanced XD
!pep 701 just saw that f-strings to formal grammar is in a draft. hope it gets approved soon 🥳
yea fair that would still work, just like that would work on other immutable classes (although you need to call pythonapi.PyType_Modified(py_object(cls)) after altering that dictionary to avoid crashes due to a corrupted cache)
hey, what do you think of fishhook, since your example way back kind of inspired it
I am really looking for the way it can work like a more powerful version of JavaScript template literals without the Python f-string syntax restrictions:
// Node.js v18.9.0 console
> `${`this`}` // example with delimiter collision
'this'
> `${"\n"}` // example with backslash
'\n'
> `${
... 'this' // comment
... }` // example with comment
'this'
Consistent 3 years?
wdym consistent?
hello! i need to come back to this place fr, i'm just so starved of free time to do so with uni and stuff now haha. i'm getting old!
old people gang
it looks really cool! i'll definitely need to look further into it, i'd love to play around with it hehehe
hey i joined this server when i was 15 and now i'm 20, what a fever dream
I need code that first gets a list (the output of a subprocess), then, inside a while True loop, calls the same function again to get the list; if elements have been removed or added, the program should terminate
how can I do that
I have some code but it doesn't work properly. I can also show the code
!e
import sys
str_ = "abc123def"
print(sys.getrefcount(str_))
int_ = 123500
print(sys.getrefcount(int_))
list_ = [1, 2, 3]
print(sys.getrefcount(list_))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 4
002 | 4
003 | 2
anyone know why some objects like new strs and ints start with a refcount of 4?
I guess one is the code object's co_consts, one is the variable, and one is the getrefcount() function
but that leaves one
also even deling the string only removes 1 reference
gc.get_referrers() only finds co_consts
!e
import sys
import ctypes
str_ = "abc123def"
str_id = id(str_)
print(sys.getrefcount(str_))
del str_
print(sys.getrefcount(ctypes.cast(str_id, ctypes.py_object).value))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 4
002 | 3
hm, seems to be the same
but if I make it a non constant it does go back to 2 references
like
str_ = "abc 123 def"
str_ *= 2
so yeah I think it's the co_consts, but why does that increase refcount by 2 
so is there a __code__ or co_consts even for module level statements? Is it accessible in runtime python code?
yeah everything is executed from a code object. Not sure how to get to it for a module though
Out[352]: <code object <module> at 0x7fb083b9c400, file "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/dis.py", line 1>
this seems to do it
ah cool thanks
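A minimal sketch of the point above: module-level statements compile to a code object too, and its co_consts holds a reference to each literal. The source string and the `<demo>` filename are made up for illustration.

```python
source = 'str_ = "abc123def"\nprint(str_)\n'

# Compiling in "exec" mode produces the same kind of code object
# that the interpreter builds for a module.
code = compile(source, "<demo>", "exec")

print(code.co_name)      # name given to module-level code objects
print(code.co_consts)    # the string literal is stored here
```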
also yeah, if I define the string in a function, as expected it's 2 references + 1 from co_consts = 3
!e
import sys
def foo():
return "abc 123 def"
print(sys.getrefcount(foo()))
s2 = "500 def 123"
print(sys.getrefcount(s2))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 3
002 | 4
but somehow module level variables have 1 more reference from somewhere
I would have guessed some globals() dictionary but that doesn't explain the reference staying after del
Maybe parser/compiler holds references to all string literals to use the same object every time some particular string literal is used in code?
Like keep going?
i don't remember taking any breaks so probably?
Damn.
Rn I’m starting to learn C, and working on discord bots. Hopefully to make a lib which can add certain things to python which it lacks. What else shld i do?
you're asking what does Python lack that you can add to your library?
Like, what other low level projects can I do.
Include/cpython/tupleobject.h lines 5 to 11
typedef struct {
    PyObject_VAR_HEAD
    /* ob_item contains space for 'ob_size' elements.
       Items must normally not be NULL, except during construction when
       the tuple is not yet visible outside the function that builds it. */
    PyObject *ob_item[1];
} PyTupleObject;```
is there a way to define a ctypes.Structure for a PyTupleObject?
since ob_item and the length of the tuple is only known at runtime
!e ```py
from ctypes import *

class PyTupleObject(Structure):
    _fields_ = [
        ('ob_refcount', c_ssize_t),
        ('ob_base', py_object),
        ('ob_size', c_ssize_t),
        ('_ob_items', py_object*0)
    ]

    @property
    def ob_items(self):
        items_addr = addressof(self._ob_items)
        return (py_object * self.ob_size).from_address(items_addr)

c = (0,1,2)
c_S = PyTupleObject.from_address(id(c))
c_S.ob_items[0] = c
print(c)```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
((...), 1, 2)
what does the *0 do 👀
it just makes that slot a 0-length py_object array
ah hm
basically a placeholder for the later addressof
I was going to dynamically define the structure at runtime when size is known
not sure if that or address calculation is better
you should use the address calculation
(at least in my opinion)
because it means that if you do something like type(c_S).from_address(id(other_tuple)) it will still work
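The address-calculation trick can be tried without touching interpreter memory. A minimal sketch, assuming a made-up `Packet` layout (a count followed by that many ints) backed by an ordinary ctypes buffer:

```python
import ctypes


class Packet(ctypes.Structure):
    # Hypothetical layout: a length field followed by a variable-length array.
    _fields_ = [
        ("count", ctypes.c_ssize_t),
        ("_items", ctypes.c_int * 0),   # zero-length placeholder marks the offset
    ]

    @property
    def items(self):
        # Build an array type of the real length at the placeholder's address.
        base = ctypes.addressof(self._items)
        return (ctypes.c_int * self.count).from_address(base)


n = 4
raw = ctypes.create_string_buffer(
    ctypes.sizeof(Packet) + n * ctypes.sizeof(ctypes.c_int)
)
pkt = Packet.from_buffer(raw)   # view the buffer through the struct
pkt.count = n
for i in range(n):
    pkt.items[i] = i * 10

values = list(pkt.items)
print(values)
```

Because the length is read from the struct each time, the same class works for any buffer with this layout, which is the advantage mentioned above over defining a new Structure per size.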
also do you know what's going on with pythonapi.PyTuple_GetItem, it seems to return an address not matching the actual objects
!e
from ctypes import *

class PyTupleObject(Structure):
    _fields_ = [
        ('ob_refcount', c_ssize_t),
        ('ob_base', py_object),
        ('ob_size', c_ssize_t),
        ('_ob_items', py_object*0)
    ]

    @property
    def ob_items(self):
        items_addr = addressof(self._ob_items)
        return (py_object * self.ob_size).from_address(items_addr)

tup = (0,1,2)
obj = PyTupleObject.from_address(id(tup))
print(id(tup[0]))
print(pythonapi.PyTuple_GetItem(obj, 0))
@warm breach :x: Your 3.11 eval job has completed with return code 139 (SIGSEGV).
139821935707976
👀 wut
!e you need to make sure to set the argtypes and return types properly, like so ```py
from ctypes import *

pythonapi.PyTuple_GetItem.restype = py_object
pythonapi.PyTuple_GetItem.argtypes = (py_object, c_ssize_t)

class PyTupleObject(Structure):
    _fields_ = [
        ('ob_refcount', c_ssize_t),
        ('ob_base', py_object),
        ('ob_size', c_ssize_t),
        ('_ob_items', py_object*0)
    ]

    @property
    def ob_items(self):
        items_addr = addressof(self._ob_items)
        return (py_object * self.ob_size).from_address(items_addr)

tup = (0,1,2)
obj = PyTupleObject.from_address(id(tup))
print(id(tup[0]))
print(pythonapi.PyTuple_GetItem(tup, 0))```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 140197582378824
002 | 0
ah I see, nice
why isn't pythonapi typed 😔
also is that possible to work with the structure? instead of the actual tuple reference
because it is generated from the pythondll so it does not have type info
yes
!e ```py
from ctypes import *

class PyTupleObject(Structure):
    _fields_ = [
        ('ob_refcount', c_ssize_t),
        ('ob_base', py_object),
        ('ob_size', c_ssize_t),
        ('_ob_items', py_object*0)
    ]

    @property
    def ob_items(self):
        items_addr = addressof(self._ob_items)
        return (py_object * self.ob_size).from_address(items_addr)

pythonapi.PyTuple_GetItem.restype = py_object
pythonapi.PyTuple_GetItem.argtypes = (POINTER(PyTupleObject), c_ssize_t)

tup = (0,1,2)
obj = PyTupleObject.from_address(id(tup))
print(id(tup[0]))
print(pythonapi.PyTuple_GetItem(obj, 0))```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 139972054882120
002 | 0
so argtypes can be anything and it'll try to cast to that?
*just need to remember to use POINTER(struct) if you are passing it to something that expects a pyobject (which is just a pointer to a given structure)
ish? you need to make sure it matches the abi of the function implementation
like if I do restype = ctypes.c_long that would be the pointer before casting to pyobject?
yea, but i would use ctypes.c_void_p
actually I guess that is same as not defining restype?
is there a predefined "default" restype or does it do something special if its not defined
yea i think it is c_int
argtypes does something special if it is not defined: it lets ctypes handle the func like a varargs function.
so like printf (if you wanted to call libc.printf manually), you couldnt set argtypes if you want it to retain its varargs functionality
@warm breach you can also define a new interface using PYFUNCTYPE or CFUNCTYPE and then make copies of the functions from pythonapi if you want the same function with different interfaces
py_size = CFUNCTYPE(py_object, py_object, c_ssize_t)
struct_size = CFUNCTYPE(py_object, POINTER(PyTupleObject), c_ssize_t)
getitem_with_pyobj = py_size(pythonapi.PyTuple_GetItem)
getitem_with_struct = struct_size(pythonapi.PyTuple_GetItem)
👀 yeah that's useful thanks
no prob, feel free to ping me if you have any more ctypes questions
Include/cpython/tupleobject.h lines 22 to 25
static inline Py_ssize_t PyTuple_GET_SIZE(PyObject *op) {
    PyTupleObject *tuple = _PyTuple_CAST(op);
    return Py_SIZE(tuple);
}```
is it POINTER(py_object) for argtypes?
and then I tried casting the id into a pyobject, but maybe I am misunderstanding what py_object.from_address() does because supplying an id of a python object causes a segmentation fault 
so py_object is a bit of a misnomer, it's actually a pointer to a PyObject. and from_address basically looks for that pointer at the address you pass it (the id), so it is parsing the refcount as a pointer (and then crashing)
ah..
_ctypes.PyObj_FromPtr will do what you want
so.. the id int sort of is the py_object?
if you wanted to do py_object.from_address(arg) then arg would have to be the address of a pointer to your object
that is why you can access obj.ob_base with py_object.from_address like so py_object.from_address(id(obj) + 8)
for 1 that would give back int
I was previously doing
ctypes.cast(id, ctypes.py_object).value
to get a python object from id
not sure if that or _ctypes.PyObj_FromPtr is less cursed
yea that also works, as ctypes.cast goes straight from the address passed in
either works
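A sketch of the difference described above. It assumes a default CPython build (not free-threaded, no Py_TRACE_REFS) where the object struct starts with ob_refcnt followed by the type pointer; the offset is computed from that assumption.

```python
import ctypes

obj = ["some", "list"]
addr = id(obj)   # in CPython, id() is the object's address

# cast() treats the integer itself as the PyObject* value.
same = ctypes.cast(addr, ctypes.py_object).value

# from_address() instead reads a PyObject* stored *at* the given address.
# Skipping the refcount field lands on the type pointer (assumed layout).
offset = ctypes.sizeof(ctypes.c_ssize_t)
tp = ctypes.py_object.from_address(addr + offset).value

print(same is obj, tp)
```

Calling `py_object.from_address(addr)` directly would interpret the refcount as a pointer, which is the crash discussed above.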
also is there anywhere stuff like PyObj_FromPtr is documented? even just a list of them
i have no idea, I found all of this stuff by reading the source code
!e do you know what's wrong with the types here 😔
import _ctypes
import ctypes
from ctypes import py_object
py_size = ctypes.CFUNCTYPE(ctypes.c_ssize_t, py_object)
PyTuple_Size = py_size(ctypes.pythonapi.PyTuple_Size)
tup = (1, 2, 3)
tup_id = id(tup)
py_obj = py_object(_ctypes.PyObj_FromPtr(tup_id))
print(py_obj)
size = PyTuple_Size(py_obj)
print(size)
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | py_object((1, 2, 3))
002 | Exception ignored on calling ctypes callback function: <_FuncPtr object at 0x7fb67e394120>
003 | Traceback (most recent call last):
004 | File "<string>", line 15, in <module>
005 | ctypes.ArgumentError: argument 1: <class 'TypeError'>: Don't know how to convert parameter 1
006 | 140421788063592
!e If I use the direct assignment instead of CFUNCTYPE it works fine with the same types
import _ctypes
import ctypes
from ctypes import py_object
PyTuple_Size = ctypes.pythonapi.PyTuple_Size
PyTuple_Size.argtypes = (py_object,)
PyTuple_Size.restype = ctypes.c_ssize_t
tup = (1, 2, 3)
tup_id = id(tup)
py_obj = py_object(_ctypes.PyObj_FromPtr(tup_id))
print(py_obj)
size = PyTuple_Size(py_obj)
print(size)
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | py_object((1, 2, 3))
002 | 3
actually I was wrong about copying the function, you do it like this ```py
ctypes.pythonapi['PyTuple_Size']```
CFUNCTYPE(...)(callable) is used to wrap python callables
yea
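A short sketch of the corrected approach: indexing pythonapi creates a fresh function object each time, while attribute access returns a cached one, so each indexed copy can carry its own argtypes/restype.

```python
from ctypes import c_ssize_t, py_object, pythonapi

# Item access builds a new foreign-function object on every lookup.
size_a = pythonapi["PyTuple_Size"]
size_b = pythonapi["PyTuple_Size"]

# Attribute access is cached on the library object.
cached_1 = pythonapi.PyTuple_Size
cached_2 = pythonapi.PyTuple_Size

# Configure only one copy; the others keep their defaults.
size_a.argtypes = (py_object,)
size_a.restype = c_ssize_t

result = size_a((1, 2, 3))
print(result, size_a is size_b, cached_1 is cached_2)
```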
Hi, my name is Mari-eve
Include/cpython/longintrepr.h
ah nice 👍
my cursed shenanigans can continue
there's a multitude of places to search for some of the structs, particularly those of int, dict, and type(sys._getframe())
btw remember that most internal structs are contained in Include/cpython/
yeah I think I just never opened that file since I thought it was something to do with reprs 🥴
wow
scale of 0-10 how cursed would this be as a library
v1 = view(view)
with v1.unsafe():
    v2 = view(view.unsafe.__exit__)
    with v2.unsafe():
        v2.value = None
        v1.value = None
assert False
What is this sorcery
well, value is a special method I have for IntView, the basic objects don't have it
but there is this:
view(None).move_to(view(5))
print(5)
now with a <<= overload for move_from
tup = (1, 2, 3, 4)
print(type(tup), tup)
with view(tup).unsafe() as v:
    v[0] = "👀"
    v[4] = "🤔"
    v.size += 1
    print(type(tup), tup)
    v <<= 5
    print(type(tup), tup)
    v <<= ["what", "is", "this"]
    print(type(tup), tup)
<class 'tuple'> (1, 2, 3, 4)
<class 'tuple'> ('👀', 2, 3, 4, '🤔')
<class 'int'> 5
<class 'list'> ['what', 'is', 'this']
oh extending a tuple at runtime like that is not safe 😬
yeah I suppose it's just writing into unowned memory? or whatever is beyond the tuple struct 
is it possible to expand / resize a PyObject struct in place? I'm assuming no
You could create a new tuple object, and then rewire all references.
not safely or consistently
is that actually possible?
re-directing references
Yes, I did that for a project once (although we limited our types of references): https://github.com/L3viathan/batchable
TL;DR: You can call gc.get_referrers() and then change those (in more or less complicated ways, depending on mutability etc.)
The interesting part is just the Proxy class and its replace() method
!e ```py
import gc

def replace_tuple(self, new):
    for container in gc.get_referrers(self):
        if isinstance(container, dict):
            for k, v in container.items():
                if v is self:
                    container[k] = new
        elif isinstance(container, list):
            for i, v in enumerate(container):
                if v is self:
                    container[i] = new
        elif isinstance(container, tuple):
            for i, v in enumerate(container):
                if v is self:
                    temp = list(container)
                    temp[i] = new
                    replace_tuple(container, tuple(temp))
        elif isinstance(container, set):
            container.remove(self)
            container.add(new)

x = (1,2,3)
replace_tuple(x, (0,1,2,3,4))
print(x)```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
(0, 1, 2, 3, 4)
though this doesn't find constants that are still the same object (id) right?
!e
import gc

def replace_tuple(self, new):
    for container in gc.get_referrers(self):
        if isinstance(container, dict):
            for k, v in container.items():
                if v is self:
                    container[k] = new
        elif isinstance(container, list):
            for i, v in enumerate(container):
                if v is self:
                    container[i] = new
        elif isinstance(container, tuple):
            for i, v in enumerate(container):
                if v is self:
                    temp = list(container)
                    temp[i] = new
                    replace_tuple(container, tuple(temp))
        elif isinstance(container, set):
            container.remove(self)
            container.add(new)

x = (1, 2, 3)
print(id(x))

def fn():
    t = (1, 2, 3)
    print(id(t))
    return t

replace_tuple(x, ("replaced", "tuple"))
print(x)
print(fn())
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 139823796873536
002 | ('replaced', 'tuple')
003 | 139823796873536
004 | (1, 2, 3)
correct, it only loops through naive containers.
wont work for interned ints, for example
their references are stored in array and to do that you should find and change that array
You can look at the referrers of an interned int, and replace them. Where would it matter that I don't find that array? Yes — they would not get garbage-collected, that's true. But if I try to replace all references to 111 to point to 99 instead, that (mostly) works.
Afaik you cannot get referrers for ints because they are not tracked by gc
!e
import gc
for x in gc.get_referrers(111):
print(type(x), id(x))
@quick snow :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | <class 'list'> 140222762929152
002 | <class 'tuple'> 140222762985792
Huh neat, idk why I thought that didn't work
>>> import sys, gc
>>> sys.getrefcount(111) - 10**9
15
>>> [*map(type, gc.get_referrers(111))]
[<class 'list'>, <class 'tuple'>, <class 'tuple'>, <class 'list'>, <class 'tuple'>, <class 'dict'>, <class 'dict'>, <class 'dict'>, <class 'list'>]
>>> len([*map(type, gc.get_referrers(111))])
9
``` `gc.get_referrers` is not returning all referrers, only some of them
^ah that is why I thought it didn't work
what about set refcnt to 1 (then adjusted for the ctypes call) and call ctypes.pythonapi._PyTuple_Resize?
that returns you a new pointer and invalidates the old one, right?
so if your refcount isn't actually 1 all the other references are now pointing to garbage

!e seems fine if you hold the only reference to the tuple
from ctypes import *

IncRef = pythonapi["Py_IncRef"]
IncRef.argtypes = (py_object,)
DecRef = pythonapi["Py_DecRef"]
DecRef.argtypes = (py_object,)
Resize = pythonapi["_PyTuple_Resize"]
Resize.argtypes = (POINTER(py_object), c_ssize_t)
SetItem = pythonapi["PyTuple_SetItem"]
SetItem.argtypes = (py_object, c_ssize_t, py_object)

def get_tup():
    t = py_object(("cat", "dog"))
    print(t.value)
    print(id(t.value))
    DecRef(t)
    for item in ["snake", "py", "rs"]:
        size = len(t.value)
        IncRef(item)
        DecRef(t)
        Resize(byref(t), size+1)
        SetItem(t, size, item)
        IncRef(t)
        print(t.value)
        print(id(t.value))
    IncRef(t)
    return t.value

print(get_tup())
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | ('cat', 'dog')
002 | 140502162303552
003 | ('cat', 'dog', 'snake')
004 | 140502162303552
005 | ('cat', 'dog', 'snake', 'py')
006 | 140502162315904
007 | ('cat', 'dog', 'snake', 'py', 'rs')
008 | 140502162315904
009 | ('cat', 'dog', 'snake', 'py', 'rs')
^ though I'm not sure why the id only changes sometimes
iirc the PyTuple_Resize always creates a new tuple with small tuples?
Overallocating?
I finally figured it out, the extra reference is from the one sent to sys.getrefcount, so it'll always be 1 more than the actual refcount before call 😔
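A quick sketch of that conclusion. Absolute numbers vary by Python version, so this only compares differences: each extra real reference shifts the reported count by one, and the argument's temporary reference is included in every reading.

```python
import sys

obj = object()

baseline = sys.getrefcount(obj)     # includes the temporary argument reference
holders = [obj, obj]                # two additional real references
with_holders = sys.getrefcount(obj)
del holders
after_del = sys.getrefcount(obj)

print(baseline, with_holders, after_del)
```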
The memory allocator itself often over-allocates (or rather allocates in chunks), so sometimes realloc will just extend the region and keep the same pointer.
That's my hypothesis
actually speaking of which, do you know how to interpret the result from sys.getsizeof(<some tuple>)?
say we have
x = (1, 2, 3)
and my understanding of the struct is:
class PyTupleObject(Structure):
    ob_refcnt: c_ssize_t        # 8 bytes
    ob_type: POINTER(c_void_p)  # 8 bytes
    ob_size: c_ssize_t          # 8 bytes
    ob_item: c_ssize_t * N      # 8 bytes * N
so 8 + 8 + 8 + 8 * 3 = 40 bytes?
but sys.getsizeof(x) gives 64
I thought it was alignment but isn't 40 already 8 byte aligned?
Hmmmm
() -> 40
(1,) -> 48
(1, 2) -> 56
(1, 2, 3) -> 64
so adding elements adds 8 bytes as expected
but it starts at 40 bytes...?
Ahh I think I get it
Every object also contains its size IIRC
Hence another 8-byte field
Wait, no
I mean, considering 0 size tuple it should just be 3x8=24 from the struct.
so there's another 16 bytes from somewhere
!e also int, which works pretty much the same way except with a uint32 array, starts at 24 as expected when its empty (0)
from sys import getsizeof
x = 0
print(getsizeof(x))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
24
I think it could be the GC data, since they are GC tracked.
!e that doesn't explain the empty one though, it's specifically not tracked
import gc
t = ()
print(gc.is_tracked(t))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
False
It'd be allocated with the capability to, since that's like a type flag.
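A small sketch separating the two notions in play here: whether the type supports GC (a type flag) and whether a particular instance is currently tracked. The flag value is taken from CPython's Include/object.h; the helper name is made up.

```python
import gc

Py_TPFLAGS_HAVE_GC = 1 << 14   # value from CPython's Include/object.h


def type_supports_gc(tp):
    """True if instances of tp are allocated with room for a GC header."""
    return bool(tp.__flags__ & Py_TPFLAGS_HAVE_GC)


for value in ((), (1, 2), [], 5):
    print(repr(value), type_supports_gc(type(value)), gc.is_tracked(value))
```

The empty tuple is the interesting row: its type carries the flag, yet the instance itself is reported as untracked in the run above.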
so (1, 2, 3).__sizeof__() returns 48
Python/sysmodule.c line 1705
return (size_t)size + _PyType_PreHeaderSize(Py_TYPE(o));```
it's added to that
...
fml I questioned everything except the getsizeof implementation
what even is a _PyType_PreHeaderSize
Include/internal/pycore_object.h lines 278 to 283
static inline size_t
_PyType_PreHeaderSize(PyTypeObject *tp)
{
    return _PyType_IS_GC(tp) * sizeof(PyGC_Head) +
        _PyType_HasFeature(tp, Py_TPFLAGS_PREHEADER) * 2 * sizeof(PyObject *);
}```
tuple matches the requirement for _PyType_IS_GC(tp)
so add the size of this (equivalent to 2 pointers of size, so 16 bytes in 64-bit) https://github.com/python/cpython/blob/main/Include/internal/pycore_gc.h#L12-L20
Include/internal/pycore_gc.h lines 12 to 20
typedef struct {
    // Pointer to next object in the list.
    // 0 means the object is not tracked
    uintptr_t _gc_next;

    // Pointer to previous object in the list.
    // Lowest two bits are used for flags documented later.
    uintptr_t _gc_prev;
} PyGC_Head;```
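The arithmetic above can be checked from Python by subtracting `__sizeof__` from `sys.getsizeof`. A sketch; `pre_header` is a made-up helper name, and the exact tuple overhead depends on the build (the 3.11 numbers in this thread suggest 16 bytes on 64-bit).

```python
import ctypes
import sys


def pre_header(obj):
    """Bytes that sys.getsizeof adds on top of the type's own __sizeof__."""
    return sys.getsizeof(obj) - type(obj).__sizeof__(obj)


pointer = ctypes.sizeof(ctypes.c_void_p)

int_overhead = pre_header(5)             # int is not a GC type
tuple_overhead = pre_header((1, 2, 3))   # tuple is a GC type
per_item = sys.getsizeof((1, 2, 3)) - sys.getsizeof((1, 2))

print(int_overhead, tuple_overhead, per_item, pointer)
```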
curious
is that struct actually preallocated together in memory as the tuple struct?
wdym?
PyGC_Head
is that allocated together with the object
or does it have nothing to do with the original struct and is maintained independent by the GC
if you had Py_TRACE_REFS defined at compilation it may be
otherwise it's just this
I think I'll just call the dunder __sizeof__ for struct size then
should be fairly safe...?
i guess
pretty sure it's really in front of the tuple itself in memory
what's your use case? for GC-tracked types you are going to need that GC header
just trying to get a memory copy of the entire object at some address, and wondering how many bytes I have to copy for the entire object
very cursed but this essentially copies memory from an object (500) into another object (9015)
import view
x = 9015
v = view(x)
with v.unsafe():
    v <<= 500
print(x)
print(9015)
500
500
I'm actually not sure what happens to the GC header if it is copied to another object, and the original object is garbage collected... 
does the GC try to free the new memory it got copied to as well...?
oh yeah, copying the GC header would make this worse
in any case currently I'm memmoveing from the object address + size from __sizeof__, so if the GC header is before the object address struct I guess it never gets moved?
yes, you'll probably get trouble if you copy from a GC-tracked type into a non-GC-tracked type
since the GC will treat whatever memory is in front of the object as the header
yeah that seems to be the combination that will cause a segfault
!e
from ctypes import memmove
obj = 5
src = ("dog", "cat")
memmove(id(obj), id(src), src.__sizeof__())
print(obj)
@warm breach :x: Your 3.11 eval job has completed with return code 139 (SIGSEGV).
('dog', 'cat')
!e whereas the reverse seems fine
from ctypes import memmove
obj = ("dog", "cat")
src = 5
memmove(id(obj), id(src), src.__sizeof__())
print(obj)
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
5
that's probably also UB, won't it free() the wrong pointer?
which free, on the object that gets overwritten?
hm, I think? still not fully understanding the segfault for when a non gc tracked type gets overwritten with memory from a tracked type (no header)
the original GC and freeing of the source object should be unaffected right?
the GC header is a doubly-linked list pointing to other GC-tracked objects. when the object is GCed, the interpreter will need to follow the links in the DLL to remove the GCed objects from the DLL
it decides whether to do that based on whether the type opts in to GC
but in your case, there is a GC-tracked object (according to its type) that doesn't actually have a GC header in front of it, just whatever random memory happens to be there
so 💥
ah that makes perfect sense thanks
I thought lack of header just means the GC won't touch it
for the reverse case, when the object is GCed, Python will (probably) ultimately call free() on the pointer. If the object is GC-tracked, it should actually do free(pointer - sizeof(GC header)), but in this case it won't because it thinks there's no GC header
what will happen if you free() the wrong pointer? no idea, probably depends on your malloc implementation
Hello
I am new here
Nd wanting to learn python for data analytics
I just want the complete roadmap and perfect problem practicing websites
Can anyone help me pls
!e so copying the header kind of seems to work 🥴
from ctypes import memmove, pythonapi, py_object
Py_IncRef = pythonapi["Py_IncRef"]
Py_IncRef.argtypes = (py_object,)
obj = 500
src = (1, 2)
Py_IncRef(py_object(src))
memmove(id(obj) - 16, id(src) - 16, src.__sizeof__() + 16)
print(500)
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
(1, 2)
I don't think so, that will mess up the doubly linked list
ah right 
and write into memory that you don't own
also yeah that 😔
possibly change the value of 499?
actually probably not, it's outside the range of ints that are allocated in the small ints array
ah are the small ints contiguous in memory?
!e
from ctypes import memmove, pythonapi, py_object
Py_IncRef = pythonapi["Py_IncRef"]
Py_IncRef.argtypes = (py_object,)
obj = 10
src = (1, 2)
Py_IncRef(py_object(src))
memmove(id(obj) - 16, id(src) - 16, src.__sizeof__() + 16)
print(8)
print(9)
@warm breach :x: Your 3.11 eval job has completed with return code 139 (SIGSEGV).
001 | 8
002 | 0
003 | free(): invalid pointer
👀 interesting
it's -3 to 252 I think
to 255 at least
to make bytes/bytearray indexing fast
and from -5 iirc
>>> x = -5
>>> y = -5
>>> x is y
True
>>> x = -6
>>> y = -6
>>> x is y
False
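Rather than guessing the bounds, they can be probed. A sketch that relies on CPython identity behavior (an implementation detail, not a language guarantee); the helper names and the probe limit are made up.

```python
def is_cached(n):
    # int(str(n)) builds the value at runtime, so the compiler cannot
    # merge both operands into a single shared constant.
    return int(str(n)) is int(str(n))


def cached_range(limit=2000):
    """Walk outward from 0 while identical values share one object."""
    low = 0
    while low > -limit and is_cached(low - 1):
        low -= 1
    high = 0
    while high < limit and is_cached(high + 1):
        high += 1
    return low, high


low, high = cached_range()
print(low, high)
```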
someone trying to make segfaults in python (real)
what
🍪
remember that some types __sizeof__ return a value that is not representative of contiguous memory
!e
print('lists have constant size due to inner pointer:', list.__basicsize__)
print('__sizeof__ adds in length of pointed to array', [1,2,3].__sizeof__())
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | lists have constant size due to inner pointer: 40
002 | __sizeof__ adds in length of pointed to array 72
👀
also, if you want a way to call __sizeof__ on any object, use this ```py
def sizeof(obj):
    return type(obj).__sizeof__(obj)```
I never really understood how that worked
how sometimes some dunders are not accessible in the instance
and only by class call
if they are defined on an instance and on the type (like in classes)
!e py print(list.__sizeof__())
@pliant tusk :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 1, in <module>
003 | TypeError: unbound method list.__sizeof__() needs an argument
!e py print(type(list).__sizeof__(list))
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
408
but even object instance have it as instance method no?
!e
print(object().__sizeof__())
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
16
yea, types don't, because of how they work
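A sketch of why going through type(obj) works for both instances and types: the lookup starts on the object's type, the same way implicit dunder calls do. For a class like list, that type is the metaclass.

```python
def sizeof(obj):
    # Look the method up on the type, not on the object itself.
    return type(obj).__sizeof__(obj)


list_instance_size = sizeof([1, 2, 3])   # uses list.__sizeof__
list_type_size = sizeof(list)            # uses type.__sizeof__

try:
    list.__sizeof__()                    # finds list's own method, unbound
    unbound_call_failed = False
except TypeError:
    unbound_call_failed = True

print(list_instance_size, list_type_size, unbound_call_failed)
```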
ah hm
haven't tried to make a type struct yet
wonder what happens if you move one in memory 👀
The pep692 discussion has me a little annoyed. Like yeah let's just keep annotations more and more verbose by requiring the use of Unpack instead of introducing a new syntax
I've noticed that people who seemingly have never used typing beyond anything basic tend to comment on the typing pep discussions as against them
Huh, have the core devs moved on from "no syntax specifically for type hints"
PEP 646 had some syntax specifically for type hints
and PEP 692 might still get it too
are there any plans for the return variable type hint to be exposed to a called function?
like here func() can somehow know that its caller has annotated its return type as int
x: int = func()
you can look at outer scope of a function with some shenanigans
What would it do with it though?
I guess you could have a library where that influences runtime-behavior
like
x: str = get_input()
y: int = get_input() # parses to an int instead
z: list[int] = get_input() # splits input to a list of ints
though I guess that would make it impossible to use without an assignment
though perhaps when used without an assignment, you receive the type hint of the callable it was called in(kind of confusing though, maybe just return None if no type hints available)
from typing import get_assignment
def fn():
    print(get_assignment())
abc: list[int] = fn()
>> ('abc', list[int])
xyz = fn()
>> ('xyz', None)
fn()
>> (None, None)
Well, you can just do abc = fn(list[int])
but not as magical right 
also I suppose you may be able to get the type hint of a caller as a function as well
from typing import get_assignment
def outer(x, i: int):
    ...
def fn():
    print(get_assignment())
outer(fn())
>> ('x', None)
outer(i=fn())
>> ('i', int)
x = get_input(type=str)
y = get_input(type=int)
z = get_input(type=list[int])
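For what it's worth, a minimal sketch of how the explicit `type=` version could look. `parse_input` is a hypothetical name, it takes the text as an argument instead of reading stdin, and splitting on whitespace for `list[...]` is just an assumption made for the example.

```python
from typing import get_args, get_origin

def parse_input(text, type=str):
    """Convert text according to an explicitly passed type (hypothetical helper)."""
    if get_origin(type) is list:
        # list[int] -> split on whitespace and convert every item
        (item_type,) = get_args(type)
        return [item_type(part) for part in text.split()]
    return type(text)

x = parse_input("hello", type=str)
y = parse_input("42", type=int)
z = parse_input("1 2 3", type=list[int])
```

Not as magical as reading the caller's annotation, but it also works when there is no assignment at all.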
Is it possible to get local annotations without re-parsing source code of function?
that's gonna be really hard to do
unless there's a thing in inspect that can do it well
yeah it's pretty cursed 😔
also what to even annotate the return type as
import inspect
def f():
    parent_frame = inspect.currentframe().f_back
    line = inspect.getframeinfo(parent_frame).code_context[0]
    print(line)
x: int = f() # x: int = f()
don't think it works in repl
but i suppose you can parse this line for annotations using ast and etc
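A rough sketch of that parsing step, assuming the whole statement fits on one line and assigns to a single plain name. `assignment_info` is a made-up helper, not anything from `typing` or `inspect`; it returns the annotation as source text rather than evaluating it.

```python
import ast

def assignment_info(line):
    """Return (target_name, annotation_source) for one line of source.

    Hypothetical helper: only handles a single-name target on one line.
    """
    try:
        tree = ast.parse(line.strip())
    except SyntaxError:
        return None, None
    if not tree.body:
        return None, None
    stmt = tree.body[0]
    if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
        # annotated assignment such as "abc: list[int] = fn()"
        return stmt.target.id, ast.unparse(stmt.annotation)
    if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)):
        # plain assignment such as "xyz = fn()"
        return stmt.targets[0].id, None
    return None, None

annotated = assignment_info("abc: list[int] = fn()  # some comment")
plain = assignment_info("xyz = fn()")
bare = assignment_info("fn()")
```

Feeding it the `code_context` line from the frame trick above would give roughly the `get_assignment()` behaviour sketched earlier, minus multi-line statements and the REPL.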
when will python break gil
or it never will?
Do you think having {} be an empty set would gain any traction? If {3,5,2} is a set, then {} should be an empty set
A dictionary is a way more common use case of {} so it makes sense that it's an empty dict
And if {1: 2} is a dict, then {} should be an empty dict. Dicts are much more commonly used than sets, and in any case, even if this was better, it's a gigantic breaking change that is never gonna happen.
Fair enough
Hmm
What if it was an object that counted as both an empty set and an empty dict, then when added to or an operation was applied, converts into one of those?
that would be difficult to implement and difficult to understand for users
It would be great fun to implement though! But if it would be difficult to understand or if there was overlapping functions I can see why it would cause issues.
In #esoteric-python they would definitely like it
Maybe I'll make an implementation and play around with it. Who knows, may be coming to a module near you 😛
maybe in a statically typed language with type inference that would be fun
haskell does do polymorphic literals, and I don't hate it, but it does lead to some odd happenings. I do think the approach python takes with the numeric tower makes the most sense and leads to the fewest edge cases.
I tried it briefly, it gets a bit annoying because it's not easy to do self.__class__ = dict
obviously its too late to change anything now but i think {} and {:} for empty sets and dicts wouldnt have been terrible
{}[::]
Full slice of empty set is empty dict
{*()} ← empty set literal
is there any performance benefits of using tuples even when the size is not known ahead of time
afaik the main incentive for creating tuples is that their size are known at compile time
I wouldn't worry about the performance of tuples vs alternate data structures, it is very unlikely it would matter. The main reason to use tuples is to get hashes by their contents.
their main use is for multiple return values?
what library use tuples to index a dict or set?
ah fair, I meant more main use for tuples where you couldn't use a list for the same thing
Well, for that you could just use a list
The only real difference from a list is the hashability
functools.lru_cache does something similar, although not quite
The Python impl does use the hash of a tuple though
Also, a set of tuples is a common pattern to implement a game board or other set of points. Like a Game of Life
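A small sketch of that pattern: the board is just a set of live `(x, y)` tuples, which only works because tuples hash by content. The `step` function and the blinker below are illustrative, not taken from any library.

```python
from itertools import product

def step(live):
    """Advance a Game of Life board stored as a set of (x, y) tuples."""
    counts = {}
    for x, y in live:
        for dx, dy in product((-1, 0, 1), repeat=2):
            if dx or dy:
                # count this live cell as a neighbour of each adjacent cell
                cell = (x + dx, y + dy)
                counts[cell] = counts.get(cell, 0) + 1
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

blinker = {(0, 1), (1, 1), (2, 1)}   # horizontal bar
after_one = step(blinker)            # flips to a vertical bar
after_two = step(after_one)          # and back again
```

With lists instead of tuples the membership test `cell in live` and the dict keys would both fail, since lists are unhashable.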
whats the difference between LOAD_ATTR and LOAD_METHOD
they seem to do the same thing
Tuple hash is not cached, it is calculated on every hash(tup) call
Frozenset's hashes, for example, are cached
It works exactly like LOAD_ATTR, but it can take a shortcut if the attribute is a Python function
In that case it doesn't have to create a new bound_method object; it can put the object and the function itself on the stack instead.
PRECALL (or CALL, idk) can look at the top two items on the stack: if neither is null, they are the object and the function (and the function can be called directly with the known first argument plus all the other passed arguments); otherwise it is the result of an ordinary attribute access (and it can be called as usual).
When you do a.b(), it is likely to be a call to a function (builtin or Python), so it is possible to skip creating the bound_method object and call the function directly.
a.b(c) in this case works like A.b(a, c) and not like this: bound_method(A.b, a)(c)
Huh, TIL
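To make the equivalence concrete at the Python level (this only shows the observable behaviour, not the bytecode itself; the class `A` here is a throwaway example):

```python
class A:
    def b(self, c):
        return (self, c)

a = A()
bound = a.b                  # plain attribute access builds a bound method object
via_bound = bound(1)         # the bound_method(A.b, a)(c) route
via_function = A.b(a, 1)     # what the method-call fast path effectively does
```

Both routes give the same result; the fast path just avoids allocating the intermediate bound method when the access is immediately followed by a call.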
just use set() unless you're code golfing and wanna save some characters between set and some identifier before it
it's faster
(I know, I was just trolling. {*()} isn't even usually shorter than set().)
what if it was
PRECALL does that in 3.11 but in 3.12 PRECALL is gone and CALL handles that too
uh, I don't think tuple sizes are known at compile time
unless you mean tuple literals with other intrinsic literals which get inlined
well tuple sizes are not really "known" at compile time
unless you mean the tuples created by the BUILD_TUPLE or LOAD_CONST opcodes
yes i mean BUILD_TUPLE and interned tuple
i've inherited code and they were still using setup.py and distutils, but after researching the state of python packaging a bit i'm unsure how to bring the code up to date
am i supposed to still use setup.py but have a pyproject.toml in addition to it?
or am i supposed to use setup.cfg or some combination of all 3?
Would be cool if you could at least do {,} - the same as (,) for tuples
tuples can do () already
huge backwards incompatibility
how is it faster though? 
set involves a lookup of a global
oh wow, it is faster
that's... interesting
{*()} isn’t explicitly optimised, so it ends up building an empty set, loading the empty tuple, updating the set with the contents (building an iterator probably), then finally giving you the result.
it uses the new 3.11 opcode specializations and a function designed exactly for looking up globals or builtins
It's actually faster on 3.9 as well
well that's a little surprising
Is it? {*()} isn't evaluated during compilation so it creates a set, loads the tuple, and updates the set with the tuple.
set() is a global lookup, but those got faster before 3.11 (in 3.10, though, I think, and not 3.9?).
having it faster in 3.9 is the surprising part
set() needs to do
LOAD_NAME 0 (set)
CALL_FUNCTION 0
while {*()} is doing
BUILD_SET 0
LOAD_CONST 0 (())
SET_UPDATE 1
I guess the extra LOAD_CONST and SET_UPDATE is enough to offset LOAD_NAME being slow?
probably
BUILD_SET 0 can do the work all by itself
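If anyone wants to check the listings on their own interpreter, something like this prints the opcode names for both expressions. Exact opcode names differ between versions, so the only thing assumed here is that `{*()}` goes through `BUILD_SET` and `set()` does not.

```python
import dis

def opnames(src):
    """Return the opcode names for a single expression."""
    code = compile(src, "<demo>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]

unpack_ops = opnames("{*()}")
call_ops = opnames("set()")
same_result = eval("{*()}") == set()

print(unpack_ops)
print(call_ops)
```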
yeah might as well inline it
probably will be faster 🥴
!e
import dis
import timeit
from contextlib import suppress
def inline(code: str):
    lines = [ln for ln in code.strip('\n').splitlines() if ln]
    code = ""
    var_locals, names, consts = {}, {}, {None: 0}
    for line in lines:
        s = line.split(maxsplit=2)
        n_code = dis.opmap[s[0]]
        mem = int(s[1]) if len(s) > 1 else 0
        cache = "0000" * dis._inline_cache_entries[n_code]
        code += f"{n_code:02x}{mem:02x}{cache}"
        if len(s) < 3:
            continue
        data_field = map(str.strip, s[2][1:-1].split(','))
        if n_code in dis.haslocal:
            for value in data_field:
                var_locals[value] = 0
        elif n_code in dis.hasconst:
            for value in data_field:
                with suppress(NameError):
                    value = eval(value)
                consts[value] = 0
        elif n_code in dis.hasname:
            for value in data_field:
                if value != "NULL":
                    names[value] = 0
    return (lambda: 0).__code__.replace(
        co_code=bytes.fromhex(code),
        co_consts=tuple(consts.keys()),
        co_names=tuple(names.keys()),
        co_varnames=tuple(var_locals.keys()),
        co_nlocals=len(var_locals),
    )
fn2 = inline("""
RESUME 0
LOAD_GLOBAL 1 (NULL, range)
LOAD_CONST 1 (300000)
PRECALL 1
CALL 1
GET_ITER
FOR_ITER 4
STORE_FAST 0 (_)
BUILD_SET 0
STORE_FAST 1 (x)
JUMP_BACKWARD 5
LOAD_CONST 0 (None)
RETURN_VALUE
""")
def fn():
    for _ in range(300_000):
        x = set()
print("set():")
timeit.main(['-s', "from __main__ import fn", "fn()"])
print("inlined BUILD_SET:")
timeit.main(['-s', "from __main__ import fn2", "eval(fn2)"])
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | set():
002 | 10 loops, best of 5: 32.1 msec per loop
003 | inlined BUILD_SET:
004 | 10 loops, best of 5: 27.4 msec per loop
Can anyone remind me how Python grows containers? What algorithm is used to determine the new size?
what type of container?
there's list, set, and dict and they each have different growing patterns
Oh they do? How does a set grow?
you add to it
it just multiplies the current capacity by a constant once a certain amount of elements are stored
along with a few other operations
Is that constant different for lists, sets, and dictionaries?
IIRC dicts and sets can only double?
yeah. I think for lists it's 9/8 and for dicts it's 3
actually lists don't even multiply https://github.com/python/cpython/blob/main/Objects/listobject.c#L70
Objects/listobject.c line 70
```c
new_allocated = ((size_t)newsize + (newsize >> 3) + 6) & ~(size_t)3;
```
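Here is that one line ported to Python, as a sketch. It only models the expression quoted above (roughly newsize * 9/8 + 6, rounded down to a multiple of 4); the surrounding C function has extra special cases that are ignored here, and `over_allocate` and `capacities` are names made up for the example.

```python
def over_allocate(newsize):
    """Python port of the quoted over-allocation expression."""
    return (newsize + (newsize >> 3) + 6) & ~3

def capacities(n_appends):
    """Capacities a list would pass through if it grew one append at a time."""
    allocated = 0
    seen = []
    for size in range(1, n_appends + 1):
        if size > allocated:          # out of room: resize
            allocated = over_allocate(size)
            seen.append(allocated)
    return seen

growth = capacities(64)
print(growth)
```

So it is additive plus a 1/8 proportional term, which is where the "9/8" figure comes from.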
Objects/setobject.c lines 242 to 247
```c
/* Find the smallest table size > minused. */
/* XXX speed-up with intrinsics */
size_t newsize = PySet_MINSIZE;
while (newsize <= (size_t)minused) {
    newsize <<= 1; // The largest possible value is PY_SSIZE_T_MAX + 1.
}
```
how can a hash table grow not in powers of 2?
left bitshift by 1, so multiplication by 2
actually idk why they're bitshifting...
kinda confusing tbh
seems like it rounds up to the nearest power of 2
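A quick Python sketch of that loop, next to a closed form using `int.bit_length()` (which is roughly what a count-leading-zeros intrinsic would give you). `PySet_MINSIZE` is 8 in CPython; the function names are made up for the example.

```python
PySet_MINSIZE = 8

def table_size_loop(minused):
    """Direct port of the quoted loop: smallest table size > minused."""
    newsize = PySet_MINSIZE
    while newsize <= minused:
        newsize <<= 1
    return newsize

def table_size_bits(minused):
    """Same result without a loop: next power of two strictly above minused."""
    return max(PySet_MINSIZE, 1 << minused.bit_length())

agree = all(table_size_loop(n) == table_size_bits(n) for n in range(10_000))
```

Note it is "strictly greater than", so `minused == 8` already jumps to 16.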
if __builtin_clzll() was supported on all the compilers CPython supports, that's probably what they'd use
then there's https://github.com/python/cpython/blob/main/Objects/dictobject.c#L592-L629 for dicts
which i assume also grows according to powers of 2
Yeah it has to grow by a power of something
!e ```py
set(map((l:=iter([0])).__setstate__, l))
```
@pliant tusk :warning: Your 3.11 eval job timed out or ran out of memory.
[No output]
@pliant tusk do you know if there's a way to refer to the own class's pointer in an instance method? Currently I'm doing:
class PyTupleObject(ctypes.Structure):
    _fields_ = ...
PyTupleObject.GetItem = pythonapi["PyTuple_GetItem"]
PyTupleObject.GetItem.argtypes = (ctypes.POINTER(PyTupleObject), Py_ssize_t)
PyTupleObject.GetItem.restype = ctypes.py_object
I would like to do something like this but I guess I wouldn't have a way to refer to the class itself in the class definition?
class PyTupleObject(ctypes.Structure):
    _fields_ = ...
    GetItem = pythonapi["PyTuple_GetItem"]
    GetItem.argtypes = '?'
the class body runs before the class is actually created, so you may be outta luck, outside of some lazy evaluation kinda stuff.
!e Yea you cannot refer to a class inside its own definition. You could create a class property ```py
from ctypes import *
class bind(property):
    def __init__(self, func, restype=c_int, argtypes=[]):
        self.func = func
        self.restype = restype
        self.argtypes = argtypes
    def __set_name__(self, owner, name):
        self.name = name
    def __get__(self, owner_self, owner_cls):
        # ... (Ellipsis) is a placeholder meaning "pointer to the owning class"
        self.func.restype = self.restype if self.restype is not ... else POINTER(owner_cls)
        self.func.argtypes = [c if c is not ... else POINTER(owner_cls) for c in self.argtypes]
        # replace the descriptor with the configured function after first access
        setattr(owner_cls, self.name, self.func)
        return self.func
class PyTupleObject(Structure):
    _fields_ = [
        ('ob_refcount', c_ssize_t),
        ('ob_type', py_object),
        ('ob_size', c_ssize_t),
        ('_ob_items', py_object*0)
    ]
    @property
    def ob_items(self):
        return (py_object * self.ob_size).from_address(addressof(self._ob_items))
    GetItem = bind(pythonapi['PyTuple_GetItem'], restype=py_object, argtypes=[..., c_ssize_t])
t = ('a', 'b', 'c')
t_s = PyTupleObject.from_address(id(t))
print(PyTupleObject.GetItem(t_s, 0))
```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
a
yea you could just use a standard property
or skip the overwriting part and do some more stuff in __get__
hm...
is there a difference between this and casting the address of the struct into a py_object and calling the pythonapi with that
the only difference is how the call takes place from python
I think the py_object cast creates a reference?
py_object(obj) creates a reference, py_object.from_address does not
wait how does py_object.from_address work again
I remember you said it wasn't the id or addressof(struct) right?
it needs the address of a pointer to a given object
or you can do cast(id(obj), py_object) which also works and does not make a new reference
(when you use it with .value it will add a reference)
!e I guess this isn't too terrible? 
from functools import partial
from ctypes import *
class bind(property):
    def __init__(self, func, restype=c_int, argtypes=[]):
        self.func = func
        self.restype = restype
        self.argtypes = argtypes
        self.func_set = False
    def __set_name__(self, owner, name):
        self.name = name
    def __get__(self, owner_self, owner_cls):
        if not self.func_set:
            self.func.restype = self.restype if self.restype is not ... else POINTER(owner_cls)
            self.func.argtypes = [c if c is not ... else POINTER(owner_cls) for c in self.argtypes]
            self.func_set = True
        if owner_self is None:
            return self.func
        return partial(self.func, owner_self)
class PyTupleObject(Structure):
    _fields_ = [
        ('ob_refcount', c_ssize_t),
        ('ob_type', py_object),
        ('ob_size', c_ssize_t),
        ('_ob_items', py_object * 0)
    ]
    @property
    def ob_items(self):
        return (py_object * self.ob_size).from_address(addressof(self._ob_items))
    GetItem = bind(pythonapi['PyTuple_GetItem'], restype=py_object, argtypes=[..., c_ssize_t])
t = ('a', 'b', 'c')
ts = PyTupleObject.from_address(id(t))
print(PyTupleObject.GetItem(ts, 0))
print(ts.GetItem(0))
t2 = (1, 2)
ts2 = PyTupleObject.from_address(id(t2))
print(PyTupleObject.GetItem(ts2, 1))
print(ts2.GetItem(1))
@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | a
002 | a
003 | 2
004 | 2
I guess caching is not possible if accepting use both as class and instance?
not sure how I feel about it returning a functools.partial type though
It's not that bad
kind of? [] on a tuple will look in the tuple type's slots table
but don't classes like that also support __getitem__ on class?
or does the bytecode already make that __class_getitem__
Abuse __class_getitem__
Beat me to it
I think the bytecode calls PyObject_GetItem and that tries __class_getitem__ if there's no slot for getitem
Objects/abstract.c line 147
```c
PyObject *
```
ah interesting, never knew that method existed
since object class doesn't have __getitem__
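A tiny illustration of the `__class_getitem__` fallback discussed above, using a throwaway class. It only shows the visible behaviour: subscripting the class works, subscripting an instance does not, because there is no `__getitem__`.

```python
class Table:
    def __class_getitem__(cls, key):
        # called for Table[key], i.e. subscripting the class itself
        return f"{cls.__name__}[{key!r}]"

on_class = Table["x"]

try:
    Table()["x"]          # instances have no __getitem__
except TypeError as exc:
    instance_error = str(exc)
```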
another fun fact here is that there are two slots for getitem, one for mappings (mp_subscript) and one for sequences (sq_item). I wonder when the sequence one is actually used though, because anything that accepts slices must use the mp_subscript slot. Seems like at least direct calls to PySequence_GetItem from C code will use sq_item.
Lib/functools.py lines 997 to 1007
```py
with self.lock:
    # check if another thread filled cache while we awaited lock
    val = cache.get(self.attrname, _NOT_FOUND)
    if val is _NOT_FOUND:
        val = self.func(instance)
        try:
            cache[self.attrname] = val
        except TypeError:
            msg = (
                f"The '__dict__' attribute on {type(instance).__name__!r} instance "
                f"does not support item assignment for caching {self.attrname!r} property."
```
is this the standard way for caching instance attributes?
also is the thread lock there really important given the GIL?
yes, it's the stdlib way
the lock was a mistake, let me find the issue
user code that relies on the locking might break
does such code exist? we're not sure, but backward compatibility is important
isn't it still a race condition on which thread acquires that lock before setting the attribute?
whereas before it was a race condition which thread acquires the GIL for setting the dict
is there a difference in those 2 things?
the body of the property could be doing something that requires only a single thread to be able to access it
or maybe if computing the property is really expensive, you never want two threads to do it at once
ah I see, like during the computation before the cache, hm
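For reference, a short sketch of what the caching looks like from the user's side (single-threaded, so it says nothing about the lock). The `Report` class is made up for the example.

```python
from functools import cached_property

class Report:
    def __init__(self):
        self.computations = 0

    @cached_property
    def total(self):
        # pretend this is expensive; it should run once per instance
        self.computations += 1
        return sum(range(1000))

r = Report()
first = r.total
second = r.total          # served from the instance __dict__, no recomputation
```

After the first access the value sits in `r.__dict__["total"]`, which shadows the descriptor on later lookups.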
realized there was no possible way to type hint the duality of the class / instance descriptor. So now I have this, bind decorator types the pythonapi funcptr using the type hints
@struct
class PyTupleObject(PyVarObject[_Tuple]):
    _ob_item_0: Py_ssize_t * 0
    @bind_api(pythonapi["PyTuple_GetItem"])
    def GetItem(self, index: int) -> object:
        """Return the item at the given index."""
    @bind_api(pythonapi["PyTuple_SetItem"])
    def SetItem(self, index: int, value: object) -> None:
        """Set a value to a given index."""
Even mypy is happy now 👍
from einspect.structs import PyTupleObject
instance = PyTupleObject.from_object(("abc", "def"))
reveal_type(instance.GetItem)
ret = instance.SetItem(1, 650)
reveal_type(ret)
note: Revealed type is "def (index: builtins.int) -> builtins.object"
note: Revealed type is "None"
Success: no issues found in 1 source file
type-safe unsafe python 
Nice
can anything bad happen if you temporarily decrease an object's refcount to 1 👀
from einspect.structs import PyTupleObject
x = ("a", "b")
print(x)
ls = [x, x, x]
tup = PyTupleObject.from_object(x)
tup.ob_refcnt -= 5
tup.SetItem(0, 900)
tup.SetItem(1, "hi")
tup.ob_refcnt += 5
print(tup.into_object().value)
print(ls)
('a', 'b')
(900, 'hi')
[(900, 'hi'), (900, 'hi'), (900, 'hi')]
it shouldn't trigger gc.. right?
if something else holds a ref to it and decrefs it, you'll lose the object
yeah that's pretty bad, would calling PyTuple_SET_ITEM directly be slightly safer 🥴
actually is there even a difference in calling that and just directly modifying the pointer array at ob_item
seems like it just does that anyways?https://github.com/python/cpython/blob/3.11/Include/cpython/tupleobject.h#L33-L37
Include/cpython/tupleobject.h lines 33 to 37
```c
static inline void
PyTuple_SET_ITEM(PyObject *op, Py_ssize_t index, PyObject *value) {
    PyTupleObject *tuple = _PyTuple_CAST(op);
    tuple->ob_item[index] = value;
}
```
I don't think you can call static inline directly
Yea you cannot
(Mind you, it's very easy to reimplement with the ctypes structure that @warm breach already has defined)
this library seems to re-export it
https://github.com/brandtbucher/pycapi/blob/master/pycapi.c#L7543-L7553
pycapi.c lines 7543 to 7553
```c
static PyObject *
capi_PyTuple_SET_ITEM(PyObject *Py_UNUSED(self), PyObject *args)
{
    PyObject *arg0;
    Py_ssize_t arg1;
    PyObject *arg2;
    if (!PyArg_ParseTuple(args, "OnO:PyTuple_SET_ITEM", &arg0, &arg1, &arg2)) {
        return NULL;
    }
    PyTuple_SET_ITEM(arg0, arg1, arg2);
    if (PyErr_Occurred()) {
```
and does seem to sort of work
from pycapi import PyTuple_SET_ITEM
t = (1, 2, 3)
ls = [t, t]
PyTuple_SET_ITEM(t, 1, "what")
print(ls)
[(1, 'what', 3), (1, 'what', 3)]
I don't think it's really any safer than anything else though
Yea you can export it with additional C code, static inline means that there isn't a function pointer associated with the function after compilation, the assembly is just interposed into the functions where it is used
also what exactly is an "immortal interned" string 👀
I haven't really come across any that have that state
so it's deprecated? hm
https://github.com/python/cpython/blob/3.11/Objects/unicodeobject.c#L15589
Objects/unicodeobject.c line 15589
```c
"PyUnicode_InternImmortal() is deprecated; "
```
import library vs from library import * - why would someone use from .. * and is there a reason to / use case where it's best practice?
there isn't really, but it's up to your judgement
star imports make it confusing which attribute is overridden and where names come from
python static inference is already not great so it's as confusing as it gets without any help from the ide either
The only legitimate reason I can think of is an __init__.py that contains only imports; something like
from .models import *
from .tasks import *
from .utils import do_thing
Is __del__ guaranteed to be called except at the interpreter exit?
When the reference count reaches zero, yes.
you shouldn't rely on __del__. it might not be called, and running code there can cause problems.
I thought __del__ would always run, it just wasn't deterministic when
!d object.__del__
```py
object.__del__(self)
```
Called when the instance is about to be destroyed. This is also called a finalizer or (improperly) a destructor. If a base class has a [`__del__()`](https://docs.python.org/3/reference/datamodel.html#object.__del__ "object.__del__") method, the derived class’s [`__del__()`](https://docs.python.org/3/reference/datamodel.html#object.__del__ "object.__del__") method, if any, must explicitly call it to ensure proper deletion of the base class part of the instance.
It is possible (though not recommended!) for the [`__del__()`](https://docs.python.org/3/reference/datamodel.html#object.__del__ "object.__del__") method to postpone destruction of the instance by creating a new reference to it. This is called object *resurrection*. It is implementation-dependent whether [`__del__()`](https://docs.python.org/3/reference/datamodel.html#object.__del__ "object.__del__") is called a second time when a resurrected object is about to be destroyed; the current [CPython](https://docs.python.org/3/glossary.html#term-CPython) implementation only calls it once.
Mm. Just after that, it says
It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.
It's not deterministic when it runs or which thread it runs in, it's not guaranteed to be called when the interpreter is shutting down, and if it is called when the interpreter is shutting down it might not be able to find global variables or modules that it needs.
Also, it won't get run if someone pauses the cycle collecting GC with gc.disable()
only for cycles though
yeah, refcount destructors don't need any gc
I think you can pretty much use weakref.finalize to replace it for running tasks on GC
which is what stdlib implementations like tempfile uses for cleanup, haven't really seen __del__ used in stdlib
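A minimal sketch of the `weakref.finalize` pattern, assuming CPython. `Resource` and the `log` list are invented for the example; the one thing to watch is that the callback and its arguments must not reference the object itself, or it can never be collected.

```python
import gc
import weakref

class Resource:
    def __init__(self, name, log):
        self.name = name
        # The callback must not hold a reference to self, or the object never dies.
        self._finalizer = weakref.finalize(self, log.append, f"closed {name}")

    def close(self):
        self._finalizer()  # runs the callback at most once

log = []

r1 = Resource("first", log)
del r1                     # finalizer fires when the object is collected
gc.collect()

r2 = Resource("second", log)
r2.close()                 # explicit cleanup
r2.close()                 # second call is a no-op
still_alive = r2._finalizer.alive
```

The explicit `close()` plus a finalizer as a safety net is roughly the shape of what `tempfile` does.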
hey how can i solve this error
its run fine in vs code but in jupyter...
hey, please post in #1035199133436354600 @quasi oriole
ok
does python have callcc?
IIUC yield or async/await is essentially a form of call/cc
🤔
I never understood how shutting down works. Why some globals can be missing? Why it is deleting globals in weird order (underscored first)?
Why not just call __del__ on every object (even if refcount is not zero) and then just free all memory?
What is call/cc?
EDIT: https://en.wikipedia.org/wiki/Call-with-current-continuation
it does normally, but since you can keep an object alive by keeping a reference in __del__ you can postpone the GC of that reference
but at interpreter shutdown it needs to go at some order to shut things down, you can't have an object alive forever
ideally your weakref.finalize or __del__ should have run well before interpreter shutdown
Hi everyone, I need some work on python to get hands-on practice. Thanks
I never understood how shutting down works. Why some globals can be missing?
At shutdown, the interpreter destroys every module that was imported. One step in destroying a module object is clearing its globals dict. If clearing that globals dict causes an object to be garbage collected (because a global variable in that dict owned the last live reference to an object), then that object's __del__ will run. If that object's __del__ tries to use a global variable from a module whose globals dict has already been cleared, it will get a NameError, because the variable it's trying to access no longer exists as a global variable for that module.
Why it is deleting globals in weird order (underscored first)?
It has to pick an order to delete things in. This documented order tries to make it possible for you to work around those NameErrors, by guaranteeing that certain globals will be cleared before others - so if your global is one of those cleared in the first pass (something whose name starts with _) then its __del__ can safely refer to globals that have not yet been cleared (things whose names don't start with _).
Why not just call __del__ on every object (even if refcount is not zero) and then just free all memory?
Memory is the least interesting resource we can talk about here. When the process is about to end, there's (almost) no reason to free memory at all. Freeing memory one allocation at a time is slow, and the OS kernel will reclaim all the memory allocated to the process when the process dies, anyway. Other resources like opened files, sockets, message queues, etc are the real reason why it's worthwhile for the interpreter to try to clean things up when shutting down. And if you call __del__ on every object in a random order, you'll still have the same problem: things will try to use an object after its __del__ has run. And instead of getting a NameError you might get an OSError, for instance, for trying to write to a closed file.
^ and of course there's also the possibility the interpreter never gets to shutdown
Memory is the least interesting resource we can talk about here
But it must be cleared if i embedded interpreter in my app, used it once and then finalized it. If memory is not cleared, i will get a memory leak. In other cases i agree, there is no reason to free memory.
And instead of getting a NameError you might get an OSError, for instance
Yeah, this is tricky. Wrapping those errors in try-except or if gc.is_finalized(x) can't solve all problems. I can't come up with a better idea.
I think, relying on __del__ of object with several references is bad. __del__ is good for one-use or one-reference objects (like file descriptors). In other cases it is better to use .close() or something like that manually.
Are there any use cases of __del__ that is used to finalize at shutting down time? (there are no such cases in stdlib, but maybe there are some in other libs)
What's the difference between __del__ and weakref.finalize? Weakrefs make reference graph more sparse and they are not using global names, but are there any benefits of using weakref.finalize?
weakref.finalize probably offers more specific behavior on all implementations
But it must be cleared if i embedded interpreter in my app, used it once and then finalized it. If memory is not cleared, i will get a memory leak.
I said "almost" 🙂
Yeah, embedding the interpreter is one case where it's worthwhile to free memory (since the process might not be dying, and someone might initialize a new interpreter). And running under a tool that's checking for memory leaks is another, since it's a way for the interpreter to indicate to the tool that it hadn't lost track of some memory.
objects being GC'd when their refcount hits 0 is a cpython implementation detail. Along with __del__ getting called immediately after GC
I think, relying on __del__ of object with several references is bad. __del__ is good for one-use or one-reference objects (like file descriptors).
An object rarely if ever knows how many references to it there will be. And things with only 1 reference are quite rare.
I suppose there's objects created in ctypes.py_object having 1 reference exactly
it's not that it can't happen, just that it's quite a special case, not the norm. And not something that you generally have control over unless you're writing C code (or ctypes)
Specifically, weakref.finalize finalizers are called from atexit, which happens before the interpreter starts destroying module globals. They're guaranteed to fire at a time before this tricky stuff about module globals being destroyed applies. And they're guaranteed to fire, while __del__ is explicitly not.
!e though, assuming interpreter is still alive when gc collects 😔
import ctypes
import weakref
class Foo:
    def __init__(self):
        self._finalize = weakref.finalize(self, self.cleanup)
    @classmethod
    def cleanup(cls):
        print("Important cleanup tasks")
    def __del__(self):
        print("Important cleanup tasks")
f = Foo()
ctypes.py_object.from_address(-1).value
print(f)
@warm breach :warning: Your 3.11 eval job has completed with return code 139 (SIGSEGV).
[No output]
They also don't run if your machine is unplugged and the battery dies
more news at 11
they also don't run after os._exit() obviously 😄
>>> import os
>>> x = type('',(),dict(__del__=lambda*a:print('IMPORTANT CLEANUP')))()
>>> del x
IMPORTANT CLEANUP
>>> x = type('',(),dict(__del__=lambda*a:print('IMPORTANT CLEANUP')))()
>>> os._exit(1)
*nothing*
my repl is absolutely broken
same with {*iter(int,1)}
same with all the other unpacks
happens for <builtin iterable>(<infinite iterator not implemented in python>) or an unpack of the infinite iterator
can this crash python ```py
class A:
    a = None
while True:
    class B(A):
        pass
    class C(A):
        pass
    B.a = C
    C.a = B
```
MemoryError probably
!e ```py
class A:
    a = None
while True:
    class B(A):
        pass
    class C(A):
        pass
    B.a = C
    C.a = B
```
@gray galleon :warning: Your 3.11 eval job timed out or ran out of memory.
[No output]
no, it is not even using a lot of memory
it is creating new classes and old are GC'd
aren’t those classes circularly referenced
class A:
    a = 1
while 1:
    class A(A): ... # memory leaking
A.a # very very slow
when new classes are created, old are no longer referenced by anything (except each other), so they can be GC'd
aren’t those circular classes have non zero reference count
they are referencing each other, yes
but GC can figure out that nothing is referencing these two classes (except these classes itself), so they can be safely GC'd
It is very similar to this example:
```py
a, b = [], []
a.append(b)
b.append(a)
a, b
([[<Recursion on list with id=2168809581952>]], [[<Recursion on list with id=2168809582784>]])
del a, b
```
huh? ```py
# refcounts in comments
a = [b := []] # a: 1, b: 2
b.append(a) # a: 2, b: 2
del b # a: 1, b: 1
del a # a: 0, b: 0
```
can the gc detect self referenced object (like the class A(A): example)
no
# refcounts in comments
a = [b := []] # a: 1, b: 2
b.append(a) # a: 2, b: 2
del b # a: 2, b: 1 a is still 2 because it is referenced in namespace and in b
del a # a: 1, b: 1 reference cycle
# at some point after that GC will run and collect this cycle:
# a: 0, b: 0
yes, it can detect even more complicated isolated structures
like this? ```py
class A:
    def __init__(self):
        self.self = self
A()
```
it can detect everything
so in python you can get memory leak if you:
- have memory leak in C
- have no actual memory leak (you are still holding a reference somewhere, but don't know about it/don't use it)
- messed with internals and broke something (regular #esoteric-python stuff)
```py
acc = ()
while True:
    acc = acc, acc
```
this actually crashes
sure. That uses infinite memory.
CPython has two different garbage collection methods. Objects are destroyed instantly if their reference count ever drops to 0, and they're destroyed if the cycle-collecting GC runs and detects that those objects are part of a cycle, and that cycle has no references into it from outside the cycle.
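Both mechanisms can be watched from Python with weak references. This is a sketch assuming CPython (immediate destruction on refcount zero is an implementation detail); `Node` is a throwaway class, and automatic collection is switched off for the duration so the cycle collector only runs when asked.

```python
import gc
import weakref

class Node:
    pass

# Plain refcounting: the object dies as soon as the last reference goes away.
n = Node()
plain_ref = weakref.ref(n)
del n
plain_freed_immediately = plain_ref() is None

# A cycle: the two nodes keep each other's refcount above zero.
gc.disable()  # keep the cycle collector from running on its own during the demo
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
cycle_alive_before_collect = cycle_ref() is not None
gc.collect()  # the cycle collector finds the isolated pair and frees it
cycle_alive_after_collect = cycle_ref() is not None
gc.enable()
```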
so it works as well as traditional tracing gcs
it is a proper GC, yes.
an object belongs to a cycle if it can't be discovered in a tree traversal from the scope's symbol table, right?
if it can be discovered in a traversal from its own children
https://devguide.python.org/internals/garbage-collector/#identifying-reference-cycles describes the algorithm
right. but that doesn't tell you if that cycle is disconnected from the rest of the reference graph. so if an object has a non-zero reference count, but there's some graph traversal whereby it can't be visited (I'm not totally sure what that is--I'll have to read the link godly just dropped), it must be part of a disconnected cycle.
(I hope I'm not coming off as confrontational. I'm just trying to clarify what I meant, which is hard when I don't fully understand what I'm talking about 😛 )
oh yes, that does make sense. though I think there are other GC roots than the scope's symbol table
so i tested gc.collect() using a C extension
static PyObject *
a_c_test_refcnt(PyObject *self, PyObject *o)
{
    PyObject *s = PyTuple_GET_ITEM(o, 0);
    PyObject *non = PyTuple_GET_ITEM(o, 1);
    Py_INCREF(non);
    Py_DECREF(o);
    PyGC_Collect();
    printf("%zd\n", Py_REFCNT(s));
    return non;
}
(takes a tuple in the form of (obj, None) because i have no idea why all the basic python structures/objects, including Py_None, point to garbage addresses)
seems to work ```py
>>> from a_c import test_refcnt as t
>>> class X:
...     def __init__(s):s.s=s
...
>>> t((X(),None))
-2459565876494606883
```
it doesn't start from roots at all. It just iterates over a linked list of all GC-aware objects.
I think you are reading garbage there since you're incorrectly DECREFing the tuple
wdym?
the only reference gets passed to the function right
Py_DECREF(o); is invalid. You're decrementing the reference count of a tuple that you don't own a reference to.
the reference gets passed to the function right
a borrowed reference is.
where else does it go anyway?
oh clever, it knows which things are reachable because they have references from outside the GC
yep.
when your function is called, Py_REFCNT(o) should show 1 - there is a single reference to o, owned by the caller of a_c_test_refcnt
so the caller in this case is the program
and it creates a temporary tuple that it passes to CALL
yes. after the call returns, it DECREFs the tuple
the caller is PyObject_CallObject or something like that
but yes - the thing that creates the temporary tuple is the thing that's also supposed to destroy it (unless you increase its reference count, saving a reference to it for yourself)
hm
it should've aborted then because of negative reference count?
when you decrement its reference count, a) it drops a reference that you didn't own (likely destroying that object), and b) when the calling frame drops the reference that it did own, it either decrements it to -1, or it decrements some entirely unrelated object if the pointer was reused.
or it segfaults if a new allocation reused that same pointer, though that's unlikely to happen in your example.
I think python -Xdev would catch your negative reference count
it is -X dev
or maybe it just writes into some now-unused memory and it's harmless
it exits fine
the space isn't needed