#internals-and-peps
https://github.com/python/peps/pull/3523 for reference
Well, it's part of the new interpreters stdlib thing for sure.
I'm just looking for a more detailed answer, it sounds interesting.
is it open for new contributors if nobody is assigned? (but it is not in the "up for grabs")
e.g this: https://github.com/python/cpython/issues/111964
For example this as well: https://github.com/python/cpython/issues/111968, or this https://github.com/python/cpython/issues/112050
o.0 i thought deque was already thread safe
it is thread safe with the GIL
I still don't know if I can work on them.
It's super common for libraries that use the C API to implement their thread safety in terms of the guarantees the GIL gives. As CPython exists today, if your C API function doesn't release the GIL or decrement any ref counts to 0 or call any functions written in Python (or call another function that does one of those things), it can't be preempted. That makes all sorts of multi threaded code much easier to write
Ping Sam to ask, perhaps?
Is there any pep for a "weak" lru_cache that can be used naively to memoize methods without keeping each instance of the class alive?
You can set it in __init__
def __init__(self):
    self.get_foo = lru_cache()(self.get_foo)
I don't think there's a pep, but it's definitely something I've wanted before. I usually just reimplement it so it supports methods
Oh, so each instance gets memoized version of the method at instance time? clever
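A minimal sketch of that per-instance trick, assuming a made-up `Widget` class with a `get_foo` method (both names are hypothetical). Each instance wraps its own bound method, so the cache lives on the instance rather than on the class. The instance and its cached wrapper reference each other, so it is the cyclic garbage collector, not plain reference counting, that eventually frees them.

```python
import gc
import weakref
from functools import lru_cache


class Widget:  # hypothetical class, for illustration only
    def __init__(self):
        self.calls = 0
        # Wrap the bound method; the cache is stored on this instance only.
        self.get_foo = lru_cache(maxsize=None)(self.get_foo)

    def get_foo(self, x):
        self.calls += 1
        return x * 2


w = Widget()
w.get_foo(21)
w.get_foo(21)            # second call is served from the per-instance cache
calls_made = w.calls

ref = weakref.ref(w)
del w
gc.collect()             # instance <-> cached wrapper form a reference cycle
alive_after_collect = ref() is not None
```

The trade-off is that every instance pays for its own cache object, and the instance is only released once the cycle collector runs.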
Clever code should be avoided though
There were some long discussions earlier this year:
Nice, we have some office-made decorators for cached_property and cached_method, but it would be nice to not need such things.
I read a nice blog post or website/article about this issue a year or two ago, but I can't seem to find it. I should save those good reads somewhere ...
#bot-commands message
does this look like a bug?
error range is 2 times longer than needed, i guess that's because these characters need 2 bytes in unicode to represent them
On my local Python install it does it correctly. So either fixed on main, or system-dependent.
Edit: Yeah, bug (?) not present on my local machine under 3.12 either.
probably caused by a different unicode library version or something similar, that doesn't properly count non-ascii char width
```py
>>> b'жжжжжж'
File "<stdin>", line 1
b'жжжжжж'
^^^^^^^^^
SyntaxError: bytes can only contain ASCII literal characters
```
tested on 3.12 + windows, there is no bug
weird
yeah I think this was somehow fixed in 3.12 by some of Pablo's work around error locations
File "<string>", line 1
b'жжжжжж'
^
SyntaxError: bytes can only contain ASCII literal characters
% python3.12 -c "b'жжжжжж'"
File "<string>", line 1
b'жжжжжж'
^^^^^^^^^
SyntaxError: bytes can only contain ASCII literal characters
> py -2 -c "b'жжж'"
> py -3.7 -c "b'жжж'"
File "<string>", line 1
SyntaxError: bytes can only contain ASCII literal characters.
> py -3.8 -c "b'жжж'"
File "<string>", line 1
SyntaxError: bytes can only contain ASCII literal characters.
> py -3.9 -c "b'жжж'"
File "<string>", line 1
b'жжж'
^
SyntaxError: bytes can only contain ASCII literal characters.
> py -3.10 -c "b'жжж'"
File "<string>", line 1
b'жжж'
^
SyntaxError: bytes can only contain ASCII literal characters
> py -3.11 -c "b'жжж'"
File "<string>", line 1
b'жжж'
^^^^^^
SyntaxError: bytes can only contain ASCII literal characters
> py -3.12 -c "b'жжж'"
File "<string>", line 1
b'жжж'
^^^^^^
SyntaxError: bytes can only contain ASCII literal characters
<=3.8 dont have error locations (at least for stdin code)
3.9 is bugged
>=3.10 are fine
at the same time snekbox running 3.12 has this bug too
!e b'жжж'
@dusk comet :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 1
002 | b'жжж'
003 | ^^^^^^^^^
004 | SyntaxError: bytes can only contain ASCII literal characters
!e
b'This string has 4 instances of ж: жжж'
@grave jolt :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 1
002 | b'This string has 4 instances of ж: жжж'
003 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
004 | SyntaxError: bytes can only contain ASCII literal characters
!e
b'This string has 3 instances of ж: жж'
@grave jolt :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 1
002 | b'This string has 3 instances of ж: жж'
003 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
004 | SyntaxError: bytes can only contain ASCII literal characters
!e
import os
print(os.environ)
@grave jolt :white_check_mark: Your 3.12 eval job has completed with return code 0.
environ({'LANG': 'en_US.UTF-8', 'OMP_NUM_THREADS': '5', 'OPENBLAS_NUM_THREADS': '5', 'MKL_NUM_THREADS': '5', 'VECLIB_MAXIMUM_THREADS': '5', 'NUMEXPR_NUM_THREADS': '5', 'PYTHONDONTWRITEBYTECODE': 'true', 'PYTHONIOENCODING': 'utf-8:strict', 'PYTHONUNBUFFERED': 'true', 'PYTHONUSERBASE': '/snekbox/user_base', 'HOME': 'home', 'LC_CTYPE': 'C.UTF-8'})
Maybe it has something to do with some terminals potentially treating ж as double width?..
then why is snekbox broken?
it saves input to a file
Running it in docker doesn't reproduce the bug, so nothing in my system should impact it 🤔
!source e
Run Python code and get the results.
snekbox actually runs it in nsjail hmmm
!e ```py
import sys
print(sys.getdefaultencoding())```
@dusk comet :white_check_mark: Your 3.12 eval job has completed with return code 0.
utf-8
there i was using windows terminal
just checked it on cmd.exe, works in exactly the same way, only 3.9 is bugged
when reading from files in windows it uses a different one by default
edit; nvm guess that this is not what you're talking about
I'm on linux, and I get this on 3.9
>>> b'foo barжжжжжжжжжжжж'
File "<stdin>", line 1
b'foo barжжжжжжжжжжжж'
^
SyntaxError: bytes can only contain ASCII literal characters.
the caret is always 1 char after the literal
```
>py -3.9
Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> b'жжжжжж'
File "<stdin>", line 1
b'жжжжжж'
^
SyntaxError: bytes can only contain ASCII literal characters.
```
🤔
snekbox runs code in nsjail, maybe that's different
!e
exec('b"жжж"')
@grave jolt :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 1, in <module>
003 | exec('b"жжж"')
004 | File "<string>", line 1
005 | b"жжж"
006 | ^^^^^^
007 | SyntaxError: bytes can only contain ASCII literal characters
if you run a file it bugs
$ cat foo.py
b"жжж"
$ python foo.py
File "/tmp/foo.py", line 1
b"жжж"
^^^^^^^^^
SyntaxError: bytes can only contain ASCII literal characters
So there's probably a bug when printing the traceback
ж aside, this code confused me for a moment
self._line = line
if lookup_line:
    self.line
oh wait, no no no
@dusk comet the SyntaxError already has the wrong end_offset
$ cat foo.py
b"жжжжжжжж"
$ cat main.py
try:
    import foo
except SyntaxError as exc:
    print(exc.offset, exc.end_offset)
$ python main.py
1 20
so the bug is probably somewhere in the C code
i guess offsets are calculated using byte position, not unicode-aware position
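That guess can be checked with a little arithmetic. The sketch below rebuilds the same kind of literal (a bytes literal holding eight Cyrillic "ж" characters, as in the foo.py example) and compares its length in characters with its length in UTF-8 bytes. It says nothing about where in the C code the offset is computed; it only shows which of the two counts the reported number lines up with.

```python
source = 'b"' + "ж" * 8 + '"'           # bytes literal with eight "ж"

n_chars = len(source)                    # code points in the literal
n_bytes = len(source.encode("utf-8"))    # UTF-8 bytes; each "ж" takes 2

print(n_chars, n_bytes)
```

With 1-based offsets, the reported `end_offset` of 20 is one past the UTF-8 byte length of the literal rather than one past its character length, which fits the byte-position explanation.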
AAAAH
guys i want to revers the stack how ?
O
:incoming_envelope: :ok_hand: applied timeout to @unkempt rock until <t:1700715334:f> (10 minutes) (reason: mentions spam - sent 6 mentions).
The <@&831776746206265384> have been alerted for review.
:incoming_envelope: :ok_hand: applied timeout to @wet ridge until <t:1700715409:f> (10 minutes) (reason: chars spam - sent 5982 characters).
The <@&831776746206265384> have been alerted for review.
what is happening?
spam raid?
<@&831776746206265384> suspiciously high amount of off-topic messages
!silence
✅ silenced current channel for 10 minute(s).
!cban 962960473010102282 harassment
:ok_hand: applied ban to @round robin permanently.
!e
@dusk comet here's some fun confusion with your bug
ооаоаа_ммм = 42 + return + deploy
@grave jolt :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 1
002 | ооаоаа_ммм = 42 + return + deploy
003 | ^^^^^^
004 | SyntaxError: invalid syntax
if I got that error I might be scratching my head
this triggers the fuzzy underline under return for other variable names too. vsc warns of funny characters for the o's and a's, but are you hiding an RTL character somewhere in there?
These are cyrillic letters
it's a bug where a SyntaxError moves the caret one char too far for cyrillic letters
!e
the_bug = b"fooж"
@grave jolt :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 1
002 | the_bug = b"fooж"
003 | ^^^^^^^^
004 | SyntaxError: bytes can only contain ASCII literal characters
!e
the_bug = b"fooжжж"
@grave jolt :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 1
002 | the_bug = b"fooжжж"
003 | ^^^^^^^^^^^^
004 | SyntaxError: bytes can only contain ASCII literal characters
well thats weird
it is not a SyntaxError's fault
col_offsets are assigned incorrectly during parsing, which causes Pablo's error messages to use wrong values
>>> import ast
>>> m = ast.parse('"жжж"')
>>> e = m.body[0]
>>> e.col_offset, e.end_col_offset
(0, 8)
>>> m = ast.parse('"zzz"')
>>> e = m.body[0]
>>> e.col_offset, e.end_col_offset
(0, 5)
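For context, the `ast` documentation describes `col_offset` and `end_col_offset` as UTF-8 byte offsets, so the `(0, 8)` above is the documented unit for a literal with three two-byte characters. A small sketch of converting such offsets to character positions follows; `byte_to_char_offset` is a hypothetical helper name, not something from the standard library.

```python
import ast


def byte_to_char_offset(line: str, byte_offset: int) -> int:
    """Convert a UTF-8 byte offset within `line` to a character offset."""
    return len(line.encode("utf-8")[:byte_offset].decode("utf-8"))


source = '"жжж"'
node = ast.parse(source).body[0]

# Offsets as reported by the parser (UTF-8 bytes).
byte_span = (node.col_offset, node.end_col_offset)
# The same span expressed in characters.
char_span = tuple(byte_to_char_offset(source, off) for off in byte_span)

print(byte_span, char_span)
```

Anything that draws carets under source text needs the character span (or display width), so a missed conversion of this kind would produce an underline that is too long for non-ASCII input.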
should we ping Pablo here? I think he should have solid understanding of what is happening, because he implemented this feature initially in 3.11
worth reporting a cpython bug
Pablo's not super active here, so you'd be more likely to catch his notice with a bug report than a ping here
you should get him addicted
I pasted this example to him and said it was brought up here. We'll see if he takes the bait
if you have some questions, feel free to ask - though it'd be better in #python-discussion than here
Thanks
Hello guys, is there an app where your code is always on the cloud? Something like git but it is always saved up there, so I can just start coding on my PC and then jump to my laptop and instantly continue working?
!ot
#ot2-never-nester’s-nightmare
Please read our off-topic etiquette before participating in conversations.
you mean github?
or gitlab or gitea or whichever you prefer
that is what those do
yeah but those require you to push the changes, and then pull the changes. I want something instant.
ras
potentially look into Syncthing? But you'd have to set your devices up to be on the same network/vpn
Any idea where these functions are defined in the code? I couldn't find it with github's project browser. https://docs.python.org/3/c-api/refcounting.html
they're largely macros, not functions. I recommend searching for something like "#define Py_REFCNT"
should be in some .h file in Include/
`Include/object.h` line 277
```c
static inline Py_ssize_t Py_REFCNT(PyObject *ob) {
```
`Include/object.h` line 783
```c
# define Py_INCREF(op) Py_INCREF(_PyObject_CAST(op))
```
how would that increment the reference?
`Include/object.h` line 734
```c
static inline Py_ALWAYS_INLINE void Py_INCREF(PyObject *op)
```
that's a bit weird - it's a macro function calling an inline function of the same name
there is a weird cpp (not c++, but the C preprocessor) feature: macros are not recursive, so if a macro has already been expanded once in a given place, it won't be expanded again there
so Py_INCREF expands once, but refuses to expand twice, so then the compiler has to go and look for a function with this name
but it is not incrementing any values.
or at least I can't see it.
hello
`Include/object.h` line 774
```c
op->ob_refcnt++;
```
aq
<@&831776746206265384> he is sending random discord invite links in DMs
contact @summer lichen
huh, what does that have to do with the macro? I mean it's a bit complicated.
it's annoying that a bunch of people are spamming on this channel.
this particular function is very messy because of a lot of preprocessor things
i believe there are several "preprocessor"-code paths:
- it calls something on lines 742/744 related to limited api
- does nogil magic on lines 756/759
- does something weird on line 768
- does simple increment on line 774
oh okay, thank you.
sorry, I'll stop spamming
I was talking about the fact that some people just say hello and then send random lines that my brain can't interpret.
!pep 554
!pep 734
i dont really understand the advantages of using subinterpreters in provided examples in pep 734
i dont see any advantages over just using threads
(assuming nogil is a thing)
the goal is that subinterpreters can run at the same time independently of each other
subinterpreters-based parallelism would allow for better "IPC" between interpreters over multiprocessing
very similar to threading, no?
so, if i understand it correctly, current python thread runs roughly line by line, and when a line is running, another threads must wait until the line is finished?
yeah, kinda
GIL ensures that at any given moment only one thread can execute python code
Is this lock acquired once per function or once per bytecode?
once per bytecode instruction
threads do roughly this:
- acquire GIL
- execute one bytecode instruction (or more - i dont remember)
- drop GIL
- repeat
ok... so, when i view it using dis, it acquire GIL before every line of it and release GIL after each line of it....
im not sure that it really drops/reacquires GIL after every instruction
maybe it does it once in 100 instructions or something like that
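On the "once in 100 instructions" guess: Python 2 did use an instruction-count check interval (100 by default), but since CPython 3.2 the GIL hand-off is time based. The setting is exposed as the switch interval; the sketch below only reads it.

```python
import sys

# Seconds a running thread may keep the GIL before it is asked to release it
# when other threads are waiting (0.005 unless it has been changed).
interval = sys.getswitchinterval()
print(interval)
```

`sys.setswitchinterval()` changes the value, though this is a CPython tuning knob rather than a language guarantee.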
Whatchu coding
if there is only main thread, GIL is constantly reacquired (because nobody else can hold it)
if there are several of them - it probably will be reacquired by the same thread (because of reasons i dont quite understand), so you dont have "smooth" cpu utilization that is split equally across all threads
if there is GIL - in 99% cases yes
what is that 1% ?
CALL can call to C function that drops GIL, then acquires it again and calls other python function that breaks your invariants, and then it returns and you finish CALL instruction with different data you started with
or something crazy like that
you can ask in #esoteric-python about that
im not sure that this will work at all, but i hope you got the idea
so, now that we will have subinterpreters soon, might that change many assumptions? for example, could i still think list is thread safe? before that, i could assume so because every read/write to a list only needs one bytecode instruction, which is atomic.
what if 2 subinterpreters append something to the same list at the same time.....
as of a version ago I think it only happens at points where it could branch, e.g. CALL and jumps
but that's obviously an implementation detail that might change
No, definitely not. The GIL can be released underneath all sorts of instructions.
!e import dis; dis.dis("a+1")
@raven ridge :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | 0 0 RESUME 0
002 |
003 | 1 2 LOAD_NAME 0 (a)
004 | 4 LOAD_CONST 0 (1)
005 | 6 BINARY_OP 0 (+)
006 | 10 RETURN_VALUE
That BINARY_OP for example might call a __add__ method, with its own bytecode, and the GIL might be dropped between two of the __add__ method's bytecode instructions
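A small sketch of that point, using a made-up `Money` class (the name is hypothetical). The expression `a + 1` contains a single binary-operation instruction, but when `a` defines `__add__` in Python, executing that one instruction runs the whole method body, which is itself a sequence of bytecode instructions.

```python
import dis


class Money:  # hypothetical class, for illustration only
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Ordinary Python code: several bytecode instructions run here,
        # and a thread switch can happen between any two of them.
        total = self.cents + other
        return Money(total)


# Instructions of the outer expression and of the method it may invoke.
outer_ops = [ins.opname for ins in dis.get_instructions("a + 1")]
inner_ops = [ins.opname for ins in dis.get_instructions(Money.__add__)]

print(outer_ops)
print(inner_ops)
```

So "one bytecode instruction" is only a safe unit of atomicity when the operands are known to be built-in types whose operation does not call back into Python code.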
Two subinterpreters cannot write to the same list at all, because objects can't be shared across interpreters, at least as things stand today. Interpreters need to send serialized messages among themselves instead, like how multiprocessing works
I still don't really get the point of subinterpreters. When it's usable from Python isn't it basically the same as multiprocessing, from a user-perspective? Perhaps with slightly less overhead? In which scenarios would I choose subinterpreters over multiprocessing?
I believe the dream of subinterpreters is that you can have the strong isolation of multiprocessing, meaning you have fewer hard-to-debug race conditions than with threading because of less shared state. But at the same time, the overhead of subinterpreters should be much lower than with multiprocessing, meaning (1) it's much cheaper to spawn new interpreters than new processes, and (2) you'd be able to run many subinterpreters at the same time even if you only had a few cores.
In theory it should be the best of all worlds for certain tasks
I'm not sure how close we are to that dream. I know we're close to being able to easily spawn subinterpreters from pure Python code, but I think the overhead of subinterpreters might still be a bit higher than it theoretically could be? I'm not an expert, though
I think the main wide appeal was single process parallelism, but that's not going to be that useful without the gil so you only get the separation from them as a benefit
thank you. i find that part in pep now. only shareable object can be shared. and only the value is sent. not the object....
may i know the difference points and the relation between the new queue and the existing ones?
there are too many types of queue
if we cannot share objects between them, isnt it just like subprocessing?🤔
multiprocessing can be useful in case of a memory leak
you can spawn and kill new leaky workers periodically, and OS will clear all memory for you
I think another major limitation for subinterpreters is that extension modules need to specifically support it, and that's not always trivial to do. If I recall correctly the datetime module cannot be used in a subinterpreter yet!
There's a lot of cool potential for subinterpreters long term. Viewed today, without the plans for the future and the other PEPs it will interact with, it could easily seem like "just another multiprocessing". But once things stabilize around this and the removal of the GIL, it has a lot of potential for anything that would currently fork, and for quite a few other things currently using process pools.
There's plans to make it an error to fork after spawning threads, which will make multiprocessing much harder to use on POSIX platforms than it is today. Also, multiprocessing is prone to breaking in hard to detect and recover from ways (one process from a process pool gets killed, let's say, or one fails to start, or the parent process dies while children are still running, etc). Also, multiprocessing is one of the few ways to truly leak resources with Python. You can easily leak memory that stays leaked even after every involved process is dead, and the only way to recover it is to reboot your computer. Even if the only thing we get from subinterpreters is something that's exactly like multiprocessing except with only one process, that actually solves a lot of problems
Also, multiprocessing is one of the few ways to truly leak resources with Python. You can easily leak memory that stays leaked even after every involved process is dead, and the only way to recover it is to reboot your computer.
can you explain more?
is it related to shared memory?
Yep, exactly. Most memory is freed when the process that allocated it dies, but shared memory by its very nature is not scoped to any particular process
and OS allows that?
Yes, that's how shared memory works
Python tries to clean up after itself, but that process can fail, and if it does you won't get some of those resources back until you reboot
so shared memory is kinda persistent between runs (if there is no cleanup/reboot)
you can create shared memory, die and then use the same memory in next process
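A rough sketch of that persistence using `multiprocessing.shared_memory`, assuming a POSIX system (on Windows a segment goes away once its last handle is closed, so the second attach would fail). For simplicity both steps run in one process; the second attach uses only the segment's name, which is all a later process would need.

```python
from multiprocessing import shared_memory

# Create a named segment and write to it.
first = shared_memory.SharedMemory(create=True, size=16)
name = first.name
first.buf[:5] = b"hello"
first.close()                     # drop our mapping; the segment itself remains

# Attach again purely by name, as a later process could.
second = shared_memory.SharedMemory(name=name)
recovered = bytes(second.buf[:5])
second.close()
second.unlink()                   # without this the segment would outlive us
```

Python also runs a resource tracker that tries to unlink forgotten segments at exit; the leak described above is what remains when that cleanup does not get to run.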
may i know the difference points and the relation between the new queue and the existing ones?
as said in that pep, queue.Queue/asyncio.Queue/multiprocessing.Queue
The new queue will work between multiple interpreters, and will not use shared memory
could old queues be used here?
multiprocessing.Queue could, but then you'd lose most of the biggest advantages of multiple subinterpreters, and go back to a state where it's possible to leak system wide resources and need to pickle everything and pay a high overhead for passing objects around
if i created a interpreters.Queue and put some thing in it, and get it back in the same interpreter, they only keeps the value instead of the object, too?
in == out ✅
in is out (unsure)
Yes
thank you a lot.
but, u said that multiprocessing one need to pickle every thing.....
isnt that the latest one need to do that too?
they need to pickle to make it shareable.
No, it can share underlying data structures
tuple.
No, tuple needs to be recreated, but str and int might not
could it share a tuple and one of the tuple s element is the tuple itself?
i created one last time. and any attempt to hash it makes python crash....
!e
def f():
    import marshal
    tuplis=[]
    def ltot(inpl):
        if id(inpl) in tuplis:
            return b"r"+tuplis.index(id(inpl)).to_bytes(4,"little")
        else:
            tuplis.append(id(inpl))
            ret=b"\xa9"+len(inpl).to_bytes(1,"little")
            for i in inpl:
                ret+=ltot(i)
            return ret
    def conv(inpl):
        tuplis.clear()
        return marshal.loads(ltot(inpl))
    return conv
conv = f()  # this function can create tuples with the same structure as the input nested lists.
a=[];a.append(a);
x=conv(a)
print(x)
@hasty turtle :white_check_mark: Your 3.12 eval job has completed with return code 0.
((...),)
this is not possible without using bugs
or internals
As an implementation detail it certainly could, though I doubt that would make sense
i will add a hash(x) to the end of my code.
!e
def f():
    import marshal
    tuplis=[]
    def ltot(inpl):
        if id(inpl) in tuplis:
            return b"r"+tuplis.index(id(inpl)).to_bytes(4,"little")
        else:
            tuplis.append(id(inpl))
            ret=b"\xa9"+len(inpl).to_bytes(1,"little")
            for i in inpl:
                ret+=ltot(i)
            return ret
    def conv(inpl):
        tuplis.clear()
        return marshal.loads(ltot(inpl))
    return conv
conv = f()  # this function can create tuples with the same structure as the input nested lists.
a=[];a.append(a);
x=conv(a)
print({x:1})
@hasty turtle :warning: Your 3.12 eval job has completed with return code 139 (SIGSEGV).
[No output]
The new queue is different from all the existing ones in that it works differently (attempts to share underlying data rather than objects) and allows different types to be sent
i wrote that code last time. it is a function that can makes nested lists becomes nested tuples with the same structure.
for example,
a=[];a.append(a);
x=conv(a)
makes a tuple of it self.
isnt that interesting?
for example, maybe it becomes 2 "str" objects, but it points to the same c string inside?
Right
there is also a different way to achieve this
ctypes?
that is boring
i dont remember how it is done exactly, but it works roughly like this:
- make generator
- pass this generator to the tuple(...) constructor
- firstly tuple object is created, and then it is populated by items from generator
- now in your generator look into gc.get_objects(), find tuple that was just created and yield it
- now tuple is appended to itself
```py
>>> def f():
...     yield "abc123"
...     yield next(x for x in gc.get_objects()[::-1] if x == ("abc123",))
...
>>> tuple(f())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in f
  File "<stdin>", line 3, in <genexpr>
  File "D:\Programs\Python\3.12\Lib\importlib\metadata\_text.py", line 74, in __eq__
    return self.lower() == other.lower()
                           ^^^^^^^^^^^
AttributeError: 'tuple' object has no attribute 'lower'
```
i cant reproduce it :(
here it is
mine is more easy and customizable.
if you want 2 tuple be in each other , you can
a=[];b=[]
a.append(b);b.append(a)
conv(a)
you are kinda changing type of lists to tuple
so your method can create all structures of tuples
exactly.
the __length_hint__ trick lets you put whatever you want inside of the tuple
!e ```py
import gc
class magic:
    def __length_hint__(self):
        return 2
    def __iter__(self):
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:
                    0 in obj
                except SystemError:
                    yield obj
                    yield obj
                    break
self_tuple = tuple(magic())
print(self_tuple)```
@pliant tusk :white_check_mark: Your 3.12 eval job has completed with return code 0.
((...), (...))
you can make it contain itself twice if you want
i guess that it is because that function tuple() creates an tuple first, then append items into it. (that is not python code so it is doable.)
then you use gc.get_objects() to get the unfinished tuple and add itself into it.....@pliant tusk@dusk comet
Yes, exactly
by the way, py byte code allows creating empty set directly. but py code syntax doesnt allow that...
{*()}
0 RESUME 0
1 BUILD_SET 0
LOAD_CONST 0 (())
SET_UPDATE 1
RETURN_VALUE
:(
nah that's still multiple bytecodes
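For anyone who wants to try the `{*()}` spelling, a short sketch: it checks that the expression really produces an empty set and prints the bytecode of both spellings so they can be compared on whatever interpreter version is at hand.

```python
import dis

empty = {*()}                 # unpack an empty tuple into a set display
print(type(empty), len(empty))

# Compare the bytecode of the two spellings on the running interpreter.
dis.dis("{*()}")
dis.dis("set()")
```

The `set()` form has to look up the name `set` at run time and call it, while the display form builds the set directly, at the cost of the extra update step shown in the dump above.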
I think Serhiy suggested special casing that construct in the compiler but we didn't merge it
could we remove them?
yes. I think the pushback was that this construct is too rare to special case
i always avoid using set(). i feel it's not good-looking.
and it depends on the name "set"
Does {*()} feel good looking?
much better
Interesting. I think it's much worse
do you have that link? i want to read how to do that.
sorry can't find it. github/discourse search for "{*()}" isn't great
Yet another reason not to recommend that as the set constructor
found this python-ideas thread: https://mail.python.org/archives/list/python-ideas@python.org/thread/TUCSR7C5K7A4Z62564CTYRR2C6VNGEQ4/?sort=date
I feel like there was a later attempt to special-case that construct in the compiler though
that's not mentioned in that thread
another interesting thing of python is that although super is not a key word or soft key word, it is different from others.....
if i write 整数=int;字符串=str, i can use 整数 or 字符串 exactly the same way as int/str.
but super is not. my_super=super does not makes my_super() works.....
but if that method has the name super inside, even if
a=1
super
b=2
makes my_super work 🤪
Yeah, adding a __class__ cell which super uses is conditional on the name __class__ or super being used in the body
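A runnable sketch of that behaviour on CPython. The class names are made up for illustration; the only difference between the two subclasses is a bare mention of the name `super` in the method body, which is enough for the compiler to add the `__class__` cell that zero-argument `super()` looks for.

```python
my_super = super  # plain alias; the compiler only looks for the name "super"


class Base:
    def hi(self):
        return "Base.hi"


class NoCell(Base):
    def hi(self):
        return my_super().hi()      # no "super" or "__class__" in this body


class WithCell(Base):
    def hi(self):
        super                        # a bare mention is enough
        return my_super().hi()


try:
    NoCell().hi()
    error = None
except RuntimeError as exc:          # zero-argument super() finds no cell
    error = str(exc)

result = WithCell().hi()
freevars = (NoCell.hi.__code__.co_freevars, WithCell.hi.__code__.co_freevars)

print(error, result, freevars)
```

The cell shows up in `co_freevars` of the method's code object, which makes it easy to inspect whether a given method got one.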
who come up with this...
Yeah, it is not great, but the practical benefits of not having to write super(Cls, self) just about everywhere are pretty significant.
compiler simply checks for super name in methods
how could i write a function that gets the cell of the place it is called too.....
you couldn't, it is special cased in the compiler
there is a function deep somewhere which just compares strings against __class__ and super
why isnt compiler adding __class__ cell to every method?
yeah in symtable.c
it costs some memory
then make "super" a key word. like None, True , False.
do not make it half a way
then its a breaking change
is it a lot?
iirc, cells are stored in tuples, so it costs one tuple + one cell object which can be shared across all methods in same class
i believe it is a lot smaller than code object itself
the weirdest keyword-like thing in my opinion is __debug__
it is at the same time a global variable that you cant assign to, and it can be used as attr name, but you still cant assign to it
>>> __debug__
True
>>> __debug__ = False
File "<stdin>", line 1
SyntaxError: cannot assign to __debug__
>>>
>>> class X: ...
...
>>> x = X()
>>>
>>> x.__debug__ = False
File "<stdin>", line 1
SyntaxError: cannot assign to __debug__
>>> x.__dict__['__debug__'] = False
>>> x.__debug__
False
hm that SyntaxError on attribute writes feels like a bug
possibly it's to prevent setting the attribute on modules?
Practicality beats purity
what a hidden feature
and golly gee was being more verbose than java due to needing to restate the typename 4 times per class not practical (slight hyperbole, but still)
i m not saying that.
it is compiled to LOAD_CONST, but for some reason it also exists in builtins:
```py
>>> dis('__debug__')
0 0 RESUME 0
1 2 RETURN_CONST 0 (True)
>>> builtins.__dict__['__debug__']
True
```
I don't hate just doing the bare minimum to make the one pain point less painful, rather than trying to figure out how to make super a keyword without breaking everything
i will just leave it there
```py
>>> dis('__debug__.x = a')
0 0 RESUME 0
1 2 LOAD_NAME 0 (a)
4 LOAD_CONST 0 (True)
6 STORE_ATTR 1 (x)
16 RETURN_CONST 1 (None)
>>> dis('x.__debug__.x = a')
0 0 RESUME 0
1 2 LOAD_NAME 0 (a)
4 LOAD_NAME 1 (x)
6 LOAD_ATTR 4 (__debug__)
26 STORE_ATTR 1 (x)
36 RETURN_CONST 0 (None)
```
i think adding __class__ cell to every method makes most sense
yes, it is a bit memory-unfriendly, but we are doing python, so it is not a huge concern
i noticed that my function's bytecode gets an attribute f.__code__._co_code_adaptive after some use.
i want to store it into a file, and use it to replace f.__code__.co_code the next time i run that py file.
but it failed... it changed to the non-adaptive version....
i guess code objects store more statistics information than just _co_code_adaptive
should we attempt to store and reuse the _co_code_adaptive?
you can try, it would be cool
i m new to this channel. this channel is great.
i see it like this:
- python loads regular .pyc file
- does stuff that specializes functions
- at shutdown gather all code objects and store them into .pyc file carefully
at next launch python should pick up specialized bytecode versions
(there is a lot of pitfalls, it is not a trivial thing to do)
not if you want your code to be safe and portable
i want to learn more python to say goodbye to the python ππ stage.
then this is definitely a good way to learn more about how the interpreter works
the "adaptive" part is relatively new (Python 3.11); see !pep 659 for more
emmmmm....🤔 before i read it, let me have a guess: maybe loading an attribute then calling it can be reduced to 1 move, and in that move, if that attribute is a function object then directly call it with self before the args?
to skip creating the method object?
this was a thing long before 3.11
there was a LOAD_METHOD / PREPARE_CALL / CALL triplet (or something like that), that did almost the same thing as you described
interesting, almost all of that got fixed in Python 3
inb4 it just redirects to https://python.org
archive server START!
i made a gh issue on this i think
like just using {*()} and removing those 2 opcodes
Huh, coerce() predates me... I don't know that one
I think it was useless in late Python 2 already. I remember seeing that it was removed in Python 3 but I don't think I ever used it in Python 2
(I only ever used 2.7)
Indeed, it's there in 2.7:
coerce(x, y)
Return a tuple consisting of the two numeric arguments converted to a common type, using the same rules as used by arithmetic operations.
Huh.
raw_input(): use sys.stdin.readline()
Glad that one didn't happen - I don't think that would've been beginner friendly at all
but note that in Python 2 input() automatically eval()ed the result
glad we don't have that any more
we ended up instead renaming raw_input() to input() which was probably the right call
I do remember the main point for python being easy when I was like... 12-14 was that you didn't need to do string-to-int when reading user input, it just became int automatically. I remember being confused about having to do that when I started learning py3, but eventually I figured the situation out.
Yeah, definitely better that we got rid of the implicit eval - that was cute for ints, but terrible for strings, for instance
i still think mine is more customizable and the gc one is kind of UB to me🤔
is that undefined behavior?🤔
it's defined that __length_hint__ is used to preallocate tuples, and it's defined that tuples need to be in the GC before you start pulling from the iterator (in case the iterator raises an exception, so that things don't get leaked). Grabbing the tuple as it is being built is not quite UB, but it's not recommended (the gc module notes that you can get references to partial objects)
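For reference, the hint itself is harmless and is only an estimate (PEP 424). A small sketch with a made-up `Hinted` class (the name is hypothetical): `operator.length_hint()` reads the hint, and `tuple()` still produces the right result when the hint turns out to be wrong.

```python
import operator


class Hinted:  # hypothetical class, for illustration only
    def __length_hint__(self):
        return 2                    # only an estimate; may be wrong

    def __iter__(self):
        return iter("abc")          # actually yields three items


hint = operator.length_hint(Hinted())
items = tuple(Hinted())

print(hint, items)
```

The self-referential tuple trick depends on the partially filled tuple being reachable through the `gc` module while it is being built, not on the hint being inaccurate.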
they're both UB. The Python interpreter assumes that tuples cannot be self-referential, and anything that you do that lets you put a tuple in itself is undefined behavior. I'd go further, in fact - I'd say that anything that lets you put a tuple inside itself is a bug. But not every bug is practical to fix
this probably is fixable, FWIW. There's definitely no language guarantee that "tuples need to be in the GC before you start pulling from the iterator". AFAIU, there's not even a language guarantee that the gc module exists. It's a CPython implementation detail
there wouldn't be anything incorrect about PySequence_Tuple creating a new tuple instance without putting it on the GC list, as long as it guarantees that it gets added to the GC list before a successful return. Until the function returns a reference is known to be on the stack, and the GC only "needs" to know about stuff that might be part of an unreachable cycle. While there's a reference on the stack, it's known that this tuple can't possibly be part of an unreachable cycle
fair enough
I think this bug could actually be fixed pretty easily
I think the fix is just a call to PyObject_GC_UnTrack(result) after https://github.com/python/cpython/blob/3.12/Objects/abstract.c#L2056 and a call to PyObject_GC_Track(result) before https://github.com/python/cpython/blob/3.12/Objects/abstract.c#L2099
`Objects/abstract.c` line 2056
```c
result = PyTuple_New(n);
```
`Objects/abstract.c` line 2099
```c
return result;
```
emmmmm
there is a built-in function id().
what if I send that list's id to the sub interpreter, then use ctypes to get that list from the id?
yes, you can use ctypes to break the isolation guarantees of the interpreters. If you do that, you might cause crashes or security vulnerabilities or any number of other issues
so I will not do that.(unless for fun)
in general if you use ctypes any safety guarantees go out of the window
I understand that. I just want to know if it is possible to get that because I can't test it as 3.13 has not come out
you can use _xxsubinterpreters in 3.12
(with no stability guarantees)
thank you. can't wait to play with it
im reading pep734
it says that builtin objects will be shared
does it mean that if i do dict.foo in two different subinterpreters i will get the same object?
what if foo method is fishhook'd, will i get the same function object?
if so, then i can share arbitrary data through foo.__dict__ which doesn't feel safe at all
fishhook
doesn't feel safe at all
Well..
The interpreter assumes these are immutable, and unless fishhook goes out of its way to change how builtins are looked up, this assumption would be broken and you'd have shared mutable state that the interpreter thought was immutable. This could lead to safety issues. I suspect that fishhook could adapt for this and only allow changing it for the current subinterpreter by changing how it operates, but I have gone out of my way to change APIs rather than use fishhook, so I can't say I'm super well-versed in how it currently functions.
there is also a way to get underlying object dict by abusing mappingproxy's bug with | operator
@pliant tusk what do you think?
it's a known annoying issue
I will likely have to instantiate new instances of built-in types to get that to work properly with subinterpreters
I was wondering, how could subinterpreters potentially mix with nogil? I remember when the nogil PoC just dropped, the discussion was that subinterpreters can be considered as an alternative to nogil (despite the overhead). So, if Python eventually moves to nogil, will subinterpreters be obsolete?
yeah, i dont see any point in using subinterpreters if nogil is available
the only usecase i can think of is implementing some kind of REPL where UI and thing that executes commands are placed in two subinterpreters, so they dont affect each other too much
off the top of my head:
- it will be much easier for extension modules to support multiple interpreters than to support nogil, so it may be many years before important 3rd party libraries properly support nogil without race conditions
- the plan is that there will be several CPython versions where the GIL is optional but enabled by default, so libraries that want to guarantee parallelism regardless of how the user runs the interpreter may want to use multiple interpreters since they're expected to be available everywhere much sooner
it will be much easier for extension modules to support multiple interpreters than to support nogil
it's not clear to me that this is true. Subinterpreter support requires modules to migrate to the new-style extension initialization API, while nogil requires better locking in places that are currently protected by the GIL. I believe Sam got numpy to work with nogil, while subinterpreter support for numpy sounds far off (https://github.com/numpy/numpy/issues/24755#issuecomment-1729061333)
that's interesting. I'd think that moving to the new-style extension initialization API would be much easier than finding and fixing all of the places where someone made assumptions about the GIL guaranteeing atomicity of some operation throughout the codebase
the latter seems much more open-ended. It's comparatively really easy to tell when you've successfully switched to the new initialization API and removed all your static variables
true, but to support a per-interpreter GIL, you similarly have to find all places where you were implicitly relying on the GIL to make your operations safe
so the worst case scenario for a low-level extension maintainer would be to support 1) vanilla, 2) subs, 3) nogil?
I mean, at the same time
I don't think that's necessarily true - if you were implicitly relying on the GIL to prevent other threads from modifying an object that you're holding a reference to, you get that for free with a per-interpreter GIL
yes
that's true, there are some cases where a per-interpreter GIL is less work to support. However, if you're relying on the GIL to protect a C global, the per-interpreter GIL will still cause you trouble
yeah, that's fair. I haven't seen as much code trying to do that, but I'm sure there's some out there
virtually everything that uses borrowed references is a potential memory safety bug in a nogil world, right?
I think so. (It's also quite likely to be a potential bug right now)
yeah, right now you have to be careful not to execute anything that could release the GIL, but in a nogil world I don't think it's possible to safely use any C API function that returns a borrowed reference to any object that another thread might have a reference to
the status quo is that it's difficult to do safely, the nogil state might be that it's literally impossible unless the borrowed reference is borrowed from a newly constructed object (and even that's not necessarily enough, given things like that self-referential tuple hack being discussed yesterday)
(though maybe we don't need to worry about that one, and people who abuse gc.get_objects() deserve what they get, heh)
here's an idea to make one specific common borrowed ref safe: https://github.com/capi-workgroup/api-evolution/issues/38
that's interesting. I need to think about that a bit more to figure out how I feel about it. I guess it's useful that the reference to the type object can't be removed out from under you and leave you with a pointer to freed memory, but at the same time I'm struggling to think of a case where it's useful to hold a pointer to the type an object used to be, and that's fundamentally what this is enabling... It's nice that it's adding memory safety for code that holds onto the type object after the object's type has changed, but that code is probably still wrong, since using the old type for anything is probably incorrect, or at least suboptimal
but anyway, there's a whole lot more interesting cases for borrowed references than that. it'll be much more common that another thread modifies a dict after you've called PyDict_GetItem on it, or things like that
I think the end state has to be that (virtually?) everything returning borrowed references is replaced by something that returns strong references instead. Which means the deprecation of some things in the stable ABI
it seems to me that nearly every piece of C API code that iterates over a dict or list or tuple passed in by the caller will be memory unsafe in a nogil world
memory safety shmemory shmafety
cool, let us know when you're done
be sure to block off another weekend to rewrite all existing C extensions in Rust too
Ah yes of course. This is a good timeline. See you guys back here in two weeks. Promise.
#![feature(rustc_private)]
use embed_c::embed_c;
embed_c! {
    int add(int x, int y) {
        return x + y;
    }
}
fn main() {
    let x = unsafe { add(1, 2) };
    println!("{}", x);
}
just wrap every CPython module into this and rename the files to .rs
gradual typing was introduced to gradually annotate all the code
today we introduce gradual rustification

we already have RustPython. It's especially nice if you fancy an occasional stack overflow. Still, a great project
why is concurrent module empty
why wasn't concurrent.futures named concurrent_futures or something like that
what is the point in one redundant namespace?
today i found this page....
https://mail.python.org/archives/list/python-ideas@python.org/message/GRMNMWUQXG67PXXNZ4W7W27AQTCB6UQQ/
!e
import dis
dis.dis("frozenset({1, 2, 3})")
@hasty turtle :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | 0 0 RESUME 0
002 |
003 | 1 2 PUSH_NULL
004 | 4 LOAD_NAME 0 (frozenset)
005 | 6 BUILD_SET 0
006 | 8 LOAD_CONST 0 (frozenset({1, 2, 3}))
007 | 10 SET_UPDATE 1
008 | 12 CALL 1
009 | 20 RETURN_VALUE
it turns out that it goes like this:
frozenset->set->frozenset
now i fully agree that they should add a frozenset literal....
that sounds like a really niche optimization
You can do frozenset((1, 2, 3)) to avoid building the set
or better, if this really impacts your performance, create the frozenset once in a global constant
for sets with only constants, yes
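A minimal sketch of the two suggestions above, assuming every member is a constant. `ALLOWED`, `is_allowed` and `opnames` are hypothetical names made up for illustration; the exact bytecode differs between CPython versions, so treat the `dis` part as something to inspect rather than rely on.

```python
import dis

# Built once at import time, so callers never rebuild the set.
ALLOWED = frozenset((1, 2, 3))


def is_allowed(value):
    """Membership test against the module-level constant."""
    return value in ALLOWED


def opnames(source):
    """Return the opcode names the compiler emits for a source string."""
    return [ins.opname for ins in dis.get_instructions(source)]


# Passing a tuple avoids building an intermediate mutable set first.
tuple_ops = opnames("frozenset((1, 2, 3))")
set_ops = opnames("frozenset({1, 2, 3})")
```

Comparing `tuple_ops` and `set_ops` on your own interpreter shows whether the extra set construction is still there.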
!cban 1148549789886201937 7d advertising spam across multiple channels
:incoming_envelope: :ok_hand: applied ban to @frozen flicker until <t:1702027661:f> (7 days).
Python's standard logging API violates PEP-8 and this #PEP proposes to fix this. Feedback and criticism of all sort is welcome.
In general: sure, why not. A few notes:
- logging isn't the only stdlib module that violates modern standards. Why only logging?
- replacing the names in __all__ breaks backwards compatibility
- does this even need a PEP?
I suspect it needs a PEP because it will be controversial. Generally the core devs have refused changes that only improved conformance to pep8.
Maybe it could work as a third-party package?
Like a module that exports its own patched logging/...
another module that violates pep8 is unittest
i cant recall any other module, is there any?
built-in classes have lowercase names, does that count?
lets rename list to List
replacing the names in __all__ breaks backwards compatibility
yes, this is true. I would recommend adding the new PEP-8 compliant names.
logging isn't the only stdlib module that violates modern standards. Why only logging?
I wasn't aware of any others. Consider this PEP submission a more constructive way to approach the rant on Python logging I started a few days ago - 98 comments, and only 1 effective downvote
Really, Python logging needs a high-level interface like what loguru provides so that 99% of the cases dont involve even seeing the things you see in the basic or advanced cookbook.
Generally the core devs have refused changes that only improved conformance to pep8.
So this means it will be difficult to find a core dev sponsor for this.
in a way it does. both unittest and logging violate it in the same way, adopting hungarian notation.
i m wondering if it is necessary that namedtuple needs eval/exec.
why it is used there and is there a way to not use it?
Yes it will, and if you like loguru better than logging, why not just use loguru?
same goes for dataclasses 
it's not hungarian, it's Java-style, because both libraries are basically copied from the Java ones.
why not? there is nothing wrong in using eval/exec to generate functions
it is a lot more convenient and reliable than constructing function manually
will it become slower?
why?
no matter how you create functions, if they are equivalent, they will have the same performance
im pretty sure doing eval/exec on string is faster than constructing ast + exec'ing it/constructing function and code object manually
- fear of missing out... standard logging is widely used in many important modules such as Flask and SQLAlchemy. So The Next Great Thing had better have similar usage conventions
- This is not just about me. It's about saying one thing to people learning Python (obey PEP-8) and then they read standard docs and see blatant violation of PEP-8. This is a very stunning thing to deal with.
- We want a more appealing tutorial for Python logging. If it takes 5 lines of code to do what loguru can do in 1, then there is a problem. A different problem than this PEP addresses, but it still is an issue because it is not attractive to see a tutorial with so much boilerplate to do simple things.
Anyone familiar with PEP8 would not see this as stunning. Prior to any of the actual conventions in PEP8, there's this whole spiel:
In particular: do not break backwards compatibility just to comply with this PEP!
Some other good reasons to ignore a particular guideline:
When applying the guideline would make the code less readable, even for someone who is used to reading code that follows this PEP.
To be consistent with surrounding code that also breaks it (maybe for historic reasons) - although this is also an opportunity to clean up someone else's mess (in true XP style).
Because the code in question predates the introduction of the guideline and there is no other reason to be modifying that code.
When the code needs to remain compatible with older versions of Python that don't support the feature recommended by the style guide.
So a PR to be a breaking change to fit with PEP8 ironically violates it
so that it can generate functions with the right signature. I made a change a few years ago to reduce how much it uses eval/exec, and that greatly improved performance
patches logging and adds high-level functions to make it as easy to use as loguru or structlog (the 2 leading 3rd party logging libraries).
could we change (fix) the signature later by editing the function object?
i mean, after using *args,**kwargs to define it.
I think we also need to generate all the self.x = x code, if I recall correctly
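A rough sketch of why exec() is convenient here, assuming the field names are trusted, valid identifiers. `make_init` and `Point` are hypothetical names for illustration, and this is not how the stdlib actually implements namedtuple or dataclasses; it only shows the general technique of generating source text so the resulting function has a real signature and the `self.x = x` assignments.

```python
import inspect


def make_init(fields):
    """Build an __init__ whose signature lists each field by name."""
    args = ", ".join(fields)
    body = "\n".join(f"    self.{name} = {name}" for name in fields) or "    pass"
    source = f"def __init__(self, {args}):\n{body}\n"
    namespace = {}
    exec(source, namespace)  # compile the generated source into a real function
    return namespace["__init__"]


class Point:
    __init__ = make_init(["x", "y"])


p = Point(1, y=2)
signature_text = str(inspect.signature(Point.__init__))
```

A `*args, **kwargs` version would need its signature patched afterwards and would still have to do the assignments in a loop at call time, which is the trade-off being discussed.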
another one is tkinter, of course
you're welcome to look at the code and see if you can get it to work without exec(), it might be a nice performance optimization if possible
i just watch a video guide about the GIL that came out today. ~5 hours before.
it pointed out that, since 3.10, almost no bytecode instruction tries to give the GIL back, except call/jump
and in the end of the video, it also said that it is an implementation detail.
and dont rely on it.
@feral island (sorry for pinging you) I saw that you reviewed the pep(shorthand keyword arguments thingy). I know that it has not been accepted yet, but should I send a PR so that if it's accepted it can be just merged? It is more organised and can be linked on the Pep page.
yeah that's fine, just mark it as draft. It can be mentioned in the PEP as reference implementation
okay thank you.
The proposal to update the test suite is wrong. It's not enough to update all the tests to refer to your new names, because then you lose test coverage for the old names. And it's not enough to test only the old names, because that wouldn't prove that the aliases work. Instead, you'd need to duplicate every test into one using the new names and one using the old, or something like that
And it seems pretty unreasonable to me to drop the documentation for the old, very widely used names. That will make existing legacy code far harder for random people to pick up and maintain. I think you'd need to keep both names in the documentation.
i think it is reasonable to leave only new names in the documentation, but add notes like "this method was renamed from doThing to do_thing" and add some info for search engines so they will return do_thing link if you searched for doThing
that seems strictly worse than properly documenting both names. It takes up more space in the documentation, it's harder to notice, it won't be present in the documentation index, it'll break things using intersphinx to refer to particular methods, it will break @fallen slate's !d command, etc
and there's existing precedent in the stdlib for documenting multiple functions with a single documentation paragraph - https://docs.python.org/3/library/os.html#os.execl for instance
ultimately this seems like an all-around bad idea to me, but if you're going to do it, it should be done in a way that doesn't break things
oh, i forgot that this is the option
then yes, documenting both versions using same paragraph with double header is a lot better than what i proposed
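On the test-coverage point raised above, one way to cover both spellings without copy-pasting every test is to loop over the names with subTest. This is only a sketch: `Logger`, `set_level` and `setLevel` here are a made-up stand-in class, not the real logging API.

```python
import unittest


class Logger:
    """Hypothetical class with a new snake_case name and a legacy alias."""

    def set_level(self, level):
        self.level = level

    setLevel = set_level  # legacy camelCase alias kept for compatibility


class SetLevelTests(unittest.TestCase):
    def test_both_names(self):
        # Run the same assertions through the new name and the old alias.
        for name in ("set_level", "setLevel"):
            with self.subTest(name=name):
                logger = Logger()
                getattr(logger, name)(10)
                self.assertEqual(logger.level, 10)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SetLevelTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` reports a failure per name, so a broken alias is still visible separately from a broken new name.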
found a read after free within stringlib join function, it requires some contrived conditions but still pretty neat: https://github.com/python/cpython/issues/112625
!e ```py
def ReadAfterFree(size, do):
    b = bytearray(size)
    class T:
        def __iter__(self):
            b.clear()
            self.v = do()
            yield b''
            yield b''
    c = b.join(t:=T())
    return memoryview(c).cast('P'), t.v

leak, obj = ReadAfterFree(bytearray.__basicsize__, lambda: bytearray(8))
print('bytearray:', obj)
print('leaked memory of buffer:', leak.tolist())
```
@pliant tusk :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | bytearray: bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00')
002 | leaked memory of buffer: [1, 139934072983744, 8, 9, 139934062675024, 139934062675024, 0]
@feral island ^ you might think this is neat. I'm trying to think of an elegant fix for it, since stringlib is mostly dynamic and I don't think putting checks inside it would be a good idea, do you think it would be sufficent to just increment bytearray->ob_exports before calling stringlib join and then decrementing it after?
wow that's pretty awful. I'll have to look at the code
hm if I'm not mistaken this is actually not hard to fix in stringlib, you just have to put the PySequence_Fast call before the STRINGLIB_STR and STRINGLIB_LEN calls
`Objects/stringlib/join.h` line 25
```c
seq = PySequence_Fast(iterable, "can only join an iterable");
```
though I guess there is code later in that function that could also call into Python and therefore clear the bytearray
e.g. the PyObject_GetBuffer call on line 71
so your solution is safer
other possible options would be strdup'ing the separator in stringlib, or pessimizing bytearray by copying it into a bytes at the start (or doing that in the caller, alternatively)
though the ob_exports option sounds simpler if it's sufficient
yes, the ob_exports option feels a little bad in the sense that we're throwing an error when we could do something broadly sensible instead that doesn't throw an error
but the failure case here is contrived enough that I don't feel too bad about throwing an error
well, switching the implementation to essentially do bytes(the_bytearray).join would fix it, but at the cost of pessimizing every bytearray.join
though otoh, I'd wager that bytearray.join is really rare in the wild, anyway
I'm not sure I've ever seen it...
.join with a non-constant string/bytes is quite rare I think
yeah. some_bytes.join(some_bytearray) doesn't seem weird to me at all, but some_bytearray.join(anything) feels weird
there might be some special case where it's useful, but I can't easily think of one
I made a pull request with the ob_exports fix and it seemed to do the trick
I think the best solution would be to adapt stringlib to use the buffer api for mutable strings
But that would be a lot more work, and way more to test
!e Ironically, bytes.join(bytearray) fails because join expects sequences of bytes. ```py
b''.join(bytearray(8))
```
@pliant tusk :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 1, in <module>
003 | b''.join(bytearray(8))
004 | TypeError: sequence item 0: expected a bytes-like object, int found
!e `b''.join(b'ab')` also fails
@pliant tusk :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 1, in <module>
003 | b''.join(b'ab')
004 | TypeError: sequence item 0: expected a bytes-like object, int found
we need a bytearrayarray
I think you can do it with memoryview shapes?
!e
m = memoryview(bytearray(range(8)))
m2 = m.cast('c', shape=(4, 2))
print(m2.tolist())
@pliant tusk :white_check_mark: Your 3.12 eval job has completed with return code 0.
[[b'\x00', b'\x01'], [b'\x02', b'\x03'], [b'\x04', b'\x05'], [b'\x06', b'\x07']]
mom, I want numpy
we have numpy at home
numpy at home:
hey, I was wondering if this is some C error?
def f():
    try: f()
    finally: raise Exception
f()
object address : 0000029FD73E9900
object refcount : 3
object type : 00007FFC91FE9D10
object type name: Exception
object repr : Exception()
lost sys.stderr
well, it's creating an exception chain that's sys.getrecursionlimit() deep. I'd suspect that it's then failing to print that exception, probably by hitting the recursion limit while trying...
ah yes, shown by this
def f():
    try: f()
    finally: raise Exception from None
f()
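A small non-recursive sketch of the chaining mechanics involved, assuming nothing beyond standard exception attributes. `chained`, `unchained` and `capture` are hypothetical helper names; this only shows how `from None` changes the chain, not the crash itself.

```python
def chained():
    try:
        raise ValueError("inner")
    finally:
        raise Exception("outer")  # implicitly chained to the ValueError


def unchained():
    try:
        raise ValueError("inner")
    finally:
        raise Exception("outer") from None  # context is kept but not displayed


def capture(func):
    """Call func and return the exception it raises."""
    try:
        func()
    except Exception as exc:
        return exc


implicit = capture(chained)
suppressed = capture(unchained)
```

With `from None` the `__context__` attribute is still set, but `__suppress_context__` tells the traceback printer not to walk it.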
guys im new to python and i wanted to install scipy to use lambertw() but my editor (IDLE Python) refused to do so. So i switched to Visual Studio Code and now i think im getting scizo
i cant install it here either im confusion
ah ty
i have a feeling i didnt do this correctly eventhough the outputs are correct can someone help pls
this is not the correct channel
consider asking in #algos-and-data-structs or in #how-to-get-help
Does Cpython have a "token on fly" lexer? like it doesn't store the tokens in a dynamic array it just returns it back when the parser needs it or actually stores it?
Most compilers/interpreters use the former method.
CPython uses the PEG parser now, so there isn't really a separate lexer at all AFAIK.
well lexer.c and tokenizer.c are kinda confusing.
there is a separate lexer/tokenizer, see https://devguide.python.org/internals/parser/#tokenization
ah, neat
any one who have a good course for python ?
how do I resolve Cannot find the reference resize in (__init.py__) in cv2
in pycharm
ask in #python-discussion
Can anyone think of any material concerns or bad interactions running subprocess.run instances within an openMPI instance [within a SGE job] to call CPU heavy, specifically nonconcurrent, shell based / nix shell accessible commands?
if yes, what actually would be the "correct approach" within python, assuming the tool is nontrivial to import or recreate within python, but accessible via the shell.
When I implemented above it worked fine but I ran into an intermittent error and when I mentioned the issue to a more senior dev I got an unexpectedly strongly negative reaction ("subprocess is terrible and unreliable")
would like to understand what pieces are at play if anyone can throw me links to the right direction, I'm willing to do my own reading
oh wow
python/cpython#112732
can we include help text from subparser in help message https://docs.python.org/3/library/argparse.html#sub-commands
do i need to override format_help method of a default formatter to do this
where can i get design diagram of python argparse module , its very complex
Why was this actually fixed? https://github.com/python/cpython/issues/112125
Isn't doing filter(None.__ne__, things) bogus anyway? That's literally how equality dunders work
bogus?
I mean, it is most definitely the wrong way to do this
if a core dev used it, i'm not sure if it is
why?
it's also pretty backwards incompatible
__eq__, __ne__, __lt__, __gt__ etc. all have NotImplemented as a valid return value
don't they?
well, sure - that's not the issue
the issue is that None.__ne__(None) should be False
```py
filter(None.__ne__, things)
```
seems to be a less verbose version of
```py
(x for x in things if x != None)
```
Would you call the latter bogus?
The latter uses != which does the correct thing (check the other way around if the first comparison returned NotImplemented)
i feel like they should only be NotImplemented if their flipped version also returns NotImplemented
otherwise the dunder should be equal to not'ing the flipped dunder
hm, true - but I wouldn't expect None.__ne__ to ever return NotImplemented
For compatibility. It's not a great pattern and I wouldn't recommend using it, but we try not to break people for no reason, and in this case the breakage was very much accidental
yeah, perhaps
maybe people should configure their linters to ban these calls
I've only seen WPS prohibit direct dunder calls
!e
a = 0.5
b = 3
print(a.__add__(b))
print(b.__add__(a))
@grave jolt :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | 3.5
002 | NotImplemented
!e
a = 0.5
b = 3
print(a.__eq__(b))
print(b.__eq__(a))
@grave jolt :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | False
002 | NotImplemented
Actually, why would it ever be valid to return False while __eq__/__ne__ing values of different types?
doesn't that break commutativity? (a != b <=> b != a)
True == 2 validly returns False
True != 1 also validly returns False
!e
class Foo(int):
    __hash__ = None
    def __eq__(self, other):
        return other is True
print(True == Foo())
@grave jolt :white_check_mark: Your 3.12 eval job has completed with return code 0.
False
Copying stuff from IPython is unnecessarily complicated
isn't that also the same for any repl
Yes
seems like nobody thought of a decent solution
Oh wait, they did and I am a dummy
In [6]: %history 5
class Foo(int):
    __hash__ = None
    def __eq__(self, other):
        return other is True
print(True == Foo())
til
is that because of the common int parent?..
yeah
If the operands are of different types, and right operand's type is a direct or indirect subclass of the left operand's type, the reflected method of the right operand has priority, otherwise the left operand's method has priority. Virtual subclassing is not considered.
So I have recently opened an idea on the python ideas board (https://discuss.python.org/t/add-ability-to-force-alignment-of-ctypes-structure/39109), and also an issue on github (https://github.com/python/cpython/issues/112433)
There has been a little interest but not too much, but I am not sure if this is just because people have no opinion on this feature or what.
Are there any further steps I need to do to try and make this new feature a reality? I have already made some POC code changes, and have partially written some tests for it locally, so I'm definitely happy to do pretty much all the work for it, just need some direction on how to move forward I guess...
Make a PR and hope a core dev reviews it. It might take a while since not a lot of people actively work on ctypes
ok sure, thanks. I assume to have a better chance it would be good to have all the documentation and test changes already done before making the PR...
!e
import marshal
marshal.loads(b'\xa9\x01<\x01\x00\x00\x00r\x00\x00\x00\x00')
@hasty turtle :warning: Your 3.12 eval job has completed with return code 139 (SIGSEGV).
[No output]
!e
import marshal
print(marshal.loads(b'\xa9\x01r\x00\x00\x00\x00'))
@hasty turtle :white_check_mark: Your 3.12 eval job has completed with return code 0.
((...),)
That would be a spurious lint, it's perfectly safe to do so within the same type, and only an issue for dunders across different types. I'd much prefer people use something like (0).__le__ than either bind a lambda (lambda i: i > 0) or have to do functools.partial(operator.le, 0) It would be cool if python could detect all of these and replace them with the same thing automatically so that there wasn't a reason to prefer not having that lambda, but limitations of interpreted languages and mutable functions...
there's plenty of things you can do in python which aren't necessarily unsafe to do, yet various linters will still have rules to flag up, including subjective style decisions
While that may be true, I don't think that's always a good design decision. We shouldn't have automated tools that are supposed to be good at detecting issues (linters) point people to a potentially worse way of doing something. If that pattern is something CPython devs don't want to see, then there should be efforts to make sure the alternatives are as good (They aren't).
i'm trying to decipher this discussion. What is the bad thing to do that linters complain about, and what is the worse way they point to?
using bound methods of a constant isn't inherently bad, a hypothetical lint to warn on that would lead to worse performance for no improvement in behavior in the cases it is fine in.
(see the thing I replied to not realizing at first how far back that discussion started)
the linter isn't complaining about using bound methods of a constant, it's complaining that using obj1.__ne__(obj2) isn't equivalent to obj1 != obj2
at least not in general, if obj1 and obj2 are different types. If you run obj1.__ne__(obj2) only type(obj1) gets a chance to decide whether they're unequal, if you run obj1 != obj2 then type(obj2) gets a chance to weigh in if type(obj1) says "I don't know"
(and that's a subtle bug; I didn't catch it right away even when @grave jolt was deliberately drawing my attention to it)
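A short sketch of the difference described above, using int and float since that is the pair from the earlier eval output. The variable names are just for illustration.

```python
# Calling the dunder directly: only int gets a say, and int does not
# know how to compare itself with a float.
direct = (1).__ne__(1.0)

# Using the operator: when int answers NotImplemented, Python falls back
# to float's reflected comparison, which does know about ints.
via_operator = 1 != 1.0

# The other direction works either way, because float handles ints itself.
float_first = (1.0).__ne__(1)

# NotImplemented is not False, so a filter() predicate built from the
# direct call would not drop the "equal" element.
direct_is_sentinel = direct is NotImplemented
```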
and yet the point remains. There are places where it is fine (when all the objects are of the same type) and there are real reasons to use that.
sure. But in this case - the case that started the conversation, filter(None.__ne__, iterable) - it would be nonsensical if all of the objects were the same type
it wouldn't make any sense to run that if you know that all of the elements of the iterable are None
I agree with you that linters are sometimes overly opinionated about stylistic stuff, but I don't think this is one of those cases. This is way more likely to be a bug than not be a bug, and it's reasonable for linters to flag things that are likely wrong even if they can't be certain it's wrong
the GH issue points out that that's a common pattern, been in use for a while without issue, and was only broken because of an underlying implementation change. I'd consider it more broken if any other type lied and had objects it said it equaled None
I think you're conflating two different things, though. Everyone agrees that the change to None.__ne__ was an unintentional breaking change that should be reverted.
That's a totally separate question from whether obj1.__ne__(obj2) is something that linters should warn about
I'm not conflating things here, I think the only cases where obj1.__ne__(obj2) would be used and would break involve other design issues that should be considered the root issue.
it requires an issue like mixed data types that don't know about each other, but one adds the reverse comparison dunders
like int and float
I consider mixing ints and floats a much bigger issue than this. Even without this that can cause issues if people are unaware of it.
well - type checkers disagree with you, fwiw. They explicitly allow passing ints to functions documented as accepting floats
yes, but the other way around is problematic, and we're talking about just a pool of mixed objects here
!e no, not necessarily. ```py
from typing import Iterable
def drop_element(x: float, it: Iterable[float]) -> Iterable[float]:
    return filter(x.__ne__, it)

print(list(drop_element(1, [1.0, 2.0, 3.0])))
```
Python numeric types are designed in a way that allows an int to be a float/fraction/decimal/complex/etc
@raven ridge :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | /home/main.py:5: DeprecationWarning: NotImplemented should not be used in a boolean context
002 | print(list(drop_element(1, [1.0, 2.0, 3.0])))
003 | [1.0, 2.0, 3.0]
using not equal on a float is already a design issue
no, it isn't. It's fine when two floats are equal
you don't need an epsilon if you know what the value is
you only need an epsilon if you want to treat two unequal things as "close enough", and here, we don't.
and anyway, note the DeprecationWarning raised there. Python is telling you that this will become an error in a future version
that's not what's becoming an error, lol
what do you mean?
it's telling you that using filter(x.__ne__, iterable) will raise an exception in the future when x is an int and iterable contains a float
(again, despite the fact that type checkers will not flag that bug)
hello godlygeek
python type checkers are wrong about a lot of things
The implicit bool(NotImplemented) is the future error. The root cause is not necessarily x.__ne__, but the combination of using it in a situation where it could return NotImplemented.
yes, but that's any situation where the objects being compared are of different types
@urban sandal did you say None.__ne__(x) is a common pattern?
see related GH issue, im not getting to much more into this.
comes up with use of filter
it's common enough that people noticed when they upgraded to 3.12, but I doubt it's a common pattern in general
I had never seen it before, but searching GitHub turns up more than I would have expected.
do any actual linters complain about code like None.__ne__? I see some theoretical discussion but I don't remember seeing any real-world lint rules warning about it.
good question. ruff doesn't complain even with all rules enabled
pylint complains about `return x.__ne__(y)` with C2801: Unnecessarily calls dunder method __ne__. Use != operator. (unnecessary-dunder-call) but it doesn't complain about ```py
return filter(x.__ne__, y)
```
and "unnecessary" isn't really the issue here, the issue is that __ne__ isn't semantically equivalent to !=
The alternative to that is functools.partial(operator.ne, x). There's no point in doing that if you already know you don't need to consider it. lambda y, x=x: y != x is strictly worse, but hey, no imports, so such a hypothetical lint would (based on other similar cases) push people to that.
It not being semantically equivalent isn't necessarily a bug. (one could argue it a misfeature of the reverse dunders though)
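To make the difference concrete, here is a minimal sketch of the int/float case discussed above. The variable names are only illustrative; the point is that calling the dunder directly skips the reflected-operand fallback that `!=` performs, while `operator.ne` goes through the full protocol.

```python
import operator
from functools import partial

x = 1
items = [1.0, 2.0, 3.0]

# int.__ne__ does not know how to compare against a float, so it returns
# NotImplemented rather than a bool. The != operator would then try the
# reflected float.__ne__, but a direct dunder call never gets that far.
direct = x.__ne__(1.0)
print(direct)    # NotImplemented
print(x != 1.0)  # False

# operator.ne is the != operator as a function, so it is safe to pass to filter().
kept = list(filter(partial(operator.ne, x), items))
print(kept)      # [2.0, 3.0]
```

`filter(x.__ne__, items)` is deliberately not run here, since it relies on the truthiness of NotImplemented, which is the behaviour the DeprecationWarning is about.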
Something I would quite like would be an operator.not_None function. Semantically, it would be exactly the same as not_None = partial(operator.is_not, None) -- but I feel like it's common enough to want to filter out Nones from a container that it might make sense for it to be provided directly in the operator module. It would be nice to be able to do for x in filter(not_None, iterable):
i know its not the same but filter(None, it) usually works for that
Tricky
filter(bool, it) does the same thing as filter(None, it) but is much more readable imo
filter(None, it) is rarely used, it uses weird default behaviour that is understandable only by reading docs and it also can be confused with something like filter(op.is_none, it) or something like that
It often does, but as you say, it's not the same. As well as None, it'll also filter out other falsey objects like 0, "", [], {} and False
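A short sketch of the difference, using the `partial(operator.is_not, None)` spelling from the message above. The name `not_none` and the sample data are made up for illustration.

```python
import operator
from functools import partial

# Equivalent to: lambda value: None is not value
not_none = partial(operator.is_not, None)

values = [0, None, "", "a", None, [], 3]

# Drops only the None entries; falsey values such as 0, "" and [] survive.
without_none = list(filter(not_none, values))
print(without_none)  # [0, '', 'a', [], 3]

# filter(None, ...) keeps only truthy items, so it drops much more.
truthy_only = list(filter(None, values))
print(truthy_only)   # ['a', 3]
```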
I've deleted your message as part of rule 6
Please familiarize yourself with the #rules of this server.
I know that CPython is testing Python's syntax in python with unittest frameworks, but was wondering what it used before it could implement the tests in its "own" language. Any ideas?
you can look back in the git history
1990 holy shit
Python is old
GvR is older
The pyramids in Egypt are older
(also something is weird about that early history, because there are many commits that look like parts of "Initial Revision")
guido also wrote --Guido (last modified 10 Sep 90) in readme, but commit is dated "9 aug 1990"
guido is time traveler probably
ofc
It's Guido
Would be strange if he wasn't
worth noting that CPython also predates git. It was in Mercurial until a few years ago, and either CVS or SVN before that, and possibly others before that. So the old history has gone through several layers of translation from one protocol to another, some of that may not have worked well
There probably just weren't tests early on. There's still possibly some corners of the stdlib that aren't tested because they were added before we routinely tested things and nobody has bothered since
I think unittest might originally have been a third-party project, but not 100% sure of that
Lib/unittest/__init__.py lines 2 to 3
Python unit testing framework, based on Erich Gamma's JUnit and Kent Beck's
Smalltalk testing framework (used with permission).```
Lib/unittest/__init__.py lines 29 to 30
Copyright (c) 1999-2003 Steve Purcell
Copyright (c) 2003-2010 Python Software Foundation```
Thanks, that means there were at least 9 years of Python without unittest
unittest more like udontneedtest
They might have used some c testing frameworks for testing the API.
in 1990? doubtful
anyway, not too much point arguing about what "should" have been done 30 years ago
I was just wondering tho.
I guess that's not hard to do?
Is it possible to remove these two byte code instructions when optimizing, if set_update is set from a constant and the constant is empty?
yes, pretty easy
Whose idea was it to basically port junit?
Hypothetical question: Can an ownership-like model (inspired by Rust but different) solve the same problems that the GIL solves in the context of multi-threading memory management?
For example, lets say every time a main thread opens its first sub thread an additional "owner thread" is created to own the data.
- The main thread and all other threads can then have a pointer to owner thread for reading purposes.
- To synchronize data between threads, the Thread object has a method called commit() that can be called on any attribute (attributes of an object are objects too and can have methods inserted).
- Committing an attribute to the owner thread can use a queue to request the owner to overwrite its data (or take ownership of attributes of that name associated with the thread pool and replace the values in the threads that have that attribute).
- If a variable is changed without committing, those changes only exist for the local thread.
- If only the main thread is still active, the owner thread transfers ownership to the main thread and closes.
I am not suggesting this as a feature, my question is hypothetical:
Can a system that uses rules, such as an ownership system, solve the same problems as the GIL for multiprocessing memory management?
PS. Practically, I would include this in the garbage collector rather than have an "owner thread". The example is only meant for illustration.
PPS. A garbage collector would only have to keep track of thread names, a commit keyword also makes more sense than a method
How do you resolve conflicting changes? Stuff like "thread 1 wants to decrement X, thread 2 wants to increment X"
With this system you can only express "set X to 7"
And who calls commit and when?
If committing a change uses a Queue then the Queue will establish a schedule. Most importantly in the context of this hypothetical, synchronization only happens when explicitly requested by calling commit.
The way I understand it, the nogil change does something similar where almost all objects are owned by one thread. This "ownership" has different semantics than in Rust (other threads can still change the object, just via a slower path).
That's why I used a thread for my example, so memory management operations happen in parallel on the owner thread and other threads only need to submit changes to queue
Suppose I want to use this system for a reference counter. The count is now at 1. Thread 1 wants to increment a refcounter, thread 2 wants to decrement a refcounter. How would that work?
(at least with regards to reference counting)
Thanks
If you want to look more into it, look up "biased reference counting"
In that case, when thread 1 commits +1 it gets added to the queue and if thread 2 commits -1 afterwards then that gets added to the queue. When the queue resolves it will make the changes in that order and the only problem you have left is if you read in between those operations (this happens with a multiprocessing Queue anyway if you request the queue size, so it is acceptable but not perfect)
Thank you
That is very useful
What if the -1 thread adds its change first?
Then the object will temporarily have a 0 refcount so it will die immediately
I know this is a python server but can I get some Java help
It's very simple im pretty sure im just new to coding
#ot2-never-nester's-nightmare
Please read our off-topic etiquette before participating in conversations.
https://discord.gg/P3UmanFmvK might be a more suitable server for that question too
Oh I'm sorry
HI
!e ```py
class Foo:
    bar = 'value'
    def func(self):
        nonlocal bar
        bar = 'another value'
    def func2(self):
        nonlocal bar
        return bar
```
@pliant tusk :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 4
002 | nonlocal bar
003 | ^^^^^^^^^^^^
004 | SyntaxError: no binding for nonlocal 'bar' found
!e it works for wrapped functions, i dont see why it shouldnt work for class scopes ```py
def outer():
    bar = 'value'
    def func():
        nonlocal bar
        bar = 'another value'
    def func2():
        nonlocal bar
        return bar```
@pliant tusk :warning: Your 3.12 eval job has completed with return code 0.
[No output]
You cannot have closures capture class variables at all
even just ```py
class X:
    x = 1
    def f(self): return x
```
self.x/bar works fine there in each broken example, what's the context for wanting this?
class scope doesnt exist after class is created
whatever scope is copied to dict and stored internally somewhere in the type object
metaclasses can replace scopes with their own fancy objects, that is cool during class creation, but doesn't make a lot of sense afterwards
mostly just wondering why it didn't work (as i believe it did in the past at some point)
I do not believe it ever did, fails in py2.7 and py3.5, and the more common example of this not working, ```py
x = 1
class X:
    x = 2
    def f(self): return x
assert X().f() == 1
```has been going around for quite a while now.
huh, i could have sworn i wrote some cursed code that relied on it at some point
guess not
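For reference, a small sketch of what does work. A bare name inside a method skips the class body and resolves in the enclosing function or module scope, so class attributes have to be reached through the instance or the class. The method names are invented for the example.

```python
x = 1  # module-level name


class X:
    x = 2  # class attribute; ends up in X.__dict__ once the class body finishes

    def bare_name(self):
        return x              # resolves to the module-level x, not X.x

    def via_instance(self):
        return self.x         # ordinary attribute lookup finds the class attribute

    def rebind(self, value):
        type(self).x = value  # rebinding has to go through the class object


obj = X()
print(obj.bare_name(), obj.via_instance())
obj.rebind(3)
print(X.x, obj.via_instance())
```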
Hello
def foo[T](): pass
class Bar[T]: pass
>>> statistics.fmean(timeit.repeat("foo[int]", globals=globals()))
0.0814359413983766
>>> statistics.fmean(timeit.repeat("Bar[int]", globals=globals()))
0.9336082836030982
```whys 2 so much slower than 1?
What is foo[int] supposed to do? It crashes.
!e
def foo[T](): pass
foo[int]
@quick snow :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 2, in <module>
003 | foo[int]
004 | ~~~^^^^^
005 | TypeError: 'function' object is not subscriptable
it makes a generic alias like GenericAlias(foo, (int,))
Python 3.13.0a0 (heads/function-subscript:97220ab6aa, Dec 5 2023, 10:50:30) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def foo[T](): pass
...
>>> foo[int]
__main__.foo[int]```
Oh, 3.13
how is it implemented? using function.__getitem__ ?
if so, i suspect that dispatch to cls.__class_getitem__ is very slow
yeah just function.__getitem__
hmm, wonder why its so slow on a class
its probably got specialisation actually for the function case
>>> class X:
...     def __getitem__(self, key):
...         return GenericAlias(self, key)
...
>>> x = X()
>>> GenericAlias = type(list[int])
>>> n = 10**6
>>> min(timeit.repeat('Foo[int]', globals=globals(), number=n)) / n * 10**9 # ns
634.6786000067368
>>> min(timeit.repeat('x[int]', globals=globals(), number=n)) / n * 10**9 # ns
124.98309998773038
``` i can kinda reproduce it on 3.12
difference is not so big i guess because `X.__getitem__` is pure python function
hmm, yeah its not specialisation, it must just be really slow for classes
where is ceval.c located in new versions? i forgot
>>> def foo():
... foo[int]()
>>> dis.dis(foo)
1 0 RESUME 0
2 2 LOAD_GLOBAL 1 (NULL + foo)
12 LOAD_GLOBAL 2 (int)
22 BINARY_SUBSCR
26 CALL 0
34 POP_TOP
36 RETURN_CONST 0 (None)
>>> class Bar[T]: pass
>>> def bar():
... Bar[int]()
...
>>> dis.dis(bar)
1 0 RESUME 0
2 2 LOAD_GLOBAL 1 (NULL + Bar)
12 LOAD_GLOBAL 2 (int)
22 BINARY_SUBSCR
26 CALL 0
34 POP_TOP
36 RETURN_CONST 0 (None)```they have the same bytecode
what is the specialized bytecode?
there is BINARY_SUBSCR_GETITEM, but im not sure what it does
its probably just cause its calling through to typing._generic_class_getitem
Hello everyone, what should I do if my smart friend deleted all the files that have the word python and now he doesn't have textures in many games?xD
oh its not even returning an instance of types.GenericAlias which is probably why
>>> my_cool_list = list
>>> statistics.fmean(timeit.repeat("my_cool_list[int]", globals=globals()))
0.16742044179700316```looks slightly better
its only 2x rather than 10x
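A quick way to see the two code paths from Python, sketched with the pre-3.12 `Generic[T]` spelling so it runs on older interpreters too (`Bar` and `T` are just example names). Subscripting a builtin yields the C-implemented `types.GenericAlias`, while a user-defined generic class goes through the pure-Python machinery in `typing` and returns a different alias type.

```python
import types
from typing import Generic, TypeVar

T = TypeVar("T")


class Bar(Generic[T]):
    pass


builtin_alias = list[int]  # built by the C-level Py_GenericAlias
user_alias = Bar[int]      # built by typing's Python-level __class_getitem__

print(type(builtin_alias))
print(type(user_alias))
print(isinstance(user_alias, types.GenericAlias))
```

The extra Python-level call for `Bar[int]` is a plausible source of the slowdown measured above, though this sketch only shows the type difference, not the timing.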
Objects/abstract.c line 183
return Py_GenericAlias(o, key);```
`Objects/listobject.c` line 2918
```c
{"__class_getitem__", Py_GenericAlias, METH_O|METH_CLASS, PyDoc_STR("See PEP 585")},```
`Objects/genericaliasobject.c` line 986
```c
Py_GenericAlias(PyObject *origin, PyObject *args)```
oh wait, i unconfused myself
type[T] and list[T] call the same thing - types.GenericAlias
ye
but if you make your own instance it goes into typevarobject.h or .c
so if you check that you see that it goes to typing._generic_class_getitem
>>> type.__class_getitem__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'type' has no attribute '__class_getitem__'
``` huh
i'm officially confused
such a mess
Isn't __class_getitem__ just the hack so you can define it on the class instead of the metaclass?
So it would be just __getitem__ on type
but types metaclass is type no?
But type is cheating
And even if you disregard that: It doesn't need to use __class_getitem__ because it can just go via __getitem__.
there is no __class_getitem__ slot there: https://github.com/python/cpython/blob/main/Objects/typeobject.c#L5239
and this method doesnt exist in X.__dict__, but somehow attr access works: ```py
>>> '__class_getitem__' in X.__dict__
False
>>> X.__dict__
mappingproxy({ '__module__': '__main__',
               '__type_params__': (T,),
               '__getitem__': X.__getitem__,
               '__orig_bases__': (typing.Generic[T],),
               '__dict__': <slot X.__dict__>,
               '__weakref__': <slot X.__weakref__>,
               '__doc__': None,
               '__parameters__': (T,)})
>>> X.__class_getitem__
X.__class_getitem__
```
Objects/typeobject.c line 5239
static PyMethodDef type_methods[] = {```
right 🤦
Objects/typeobject.c lines 5359 to 5360
0, /* tp_as_sequence */
0, /* tp_as_mapping */```
Objects/typeobject.c lines 5164 to 5166
/* We have no guarantee that bases is a real tuple */
Py_ssize_t i, n;
n = PySequence_Size(bases); /* This better be right */```
so i can provide instance of tuple subclass with bad behaviour?
>>> X = type('bad_tuple',(tuple,),dict(__len__=lambda*_:-1))
>>> type('',X([int]),{}).__bases__.__class__
bad_tuple
>>> len(X([1,2,3]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: __len__() should return >= 0
doesnt even do anything exciting for out of bounds +ve
sadly it is bypassing some overriden methods :(
>>> X = type('bad_tuple',(tuple,),dict(__len__=lambda*_:10,__getitem__=lambda*_:'hi',__iter__=lambda*_:iter('world')))
>>> type('',X([int]),{}).__bases__
('w', 'o', 'r', 'l', 'd') # this is from pprint.pprint formatting
>>> str(type('',X([int]),{}).__bases__)
'(int,)' # this is what actually is stored
>>> type('',X([int]),{})('42')
42
ok, so this is gonna be a weird one... So I am playing around with injecting python into a running process using pymem. This works well, but I have noticed an issue where certain keypresses don't work correctly. Long story short, I am reasonably certain that by injecting python the towupper function which resides in ucrtbase.dll is being somehow modified so that it doesn't return the correct values (or at least the values that the game expects...)
I am wondering if anyone knows why this could be happening. It definitely happens when python is injected, as I can make that happen on demand and there is no issue until I inject it.
I do see that python seems to have some checks around ucrtbase on windows so I wonder if it's doing something internally to make this function not operate the same way...
You can pass adaptive=True to dis.dis to get the specialised bytecode that would be used by the specialising adaptive interpreter
yeah, i know, thank you
i guess my wording was ambiguous 🥲, i was asking Gobot about bytecode of their function
This sounds like a locale issue. CPython calls setlocale during initialization, whatever process you're using maybe assumes a different locale?
Hello guys, this might be a stupid question, but does anyone know where I can find the source code of python3-devel? I'm more interested in how the whole process works: once you have python installed on your pc, you can always access it regardless of where you are. How does that work?
Cannot operate on silence internals-and-peps; it is currently locked and in use by another operation. Please wait for it to finish and try again later.
β silenced current channel for 8 minute(s).
Cannot operate on silence internals-and-peps; it is currently locked and in use by another operation. Please wait for it to finish and try again later.
!silence
!silence
β current channel is already silenced.
Cannot operate on silence internals-and-peps; it is currently locked and in use by another operation. Please wait for it to finish and try again later.
β unsilenced current channel.
Thanks! This was indeed the issue! I thought I had tried this, but I actually imported the dll the game imports and used that function, I was able to fix the issue by setting the locale also using this dll!
Hi, is this true ?
https://stackoverflow.com/questions/34149013/what-does-it-mean-that-python-is-stack-based
If so, will it ever get refactored ?
it is true that CPython is a stack based VM
Are there plans to refactor it into a more efficient model ? How much effort would it take ?
there were experiments on register-based VM in faster-python project
i dont know any additional details
the current VM is currently undergoing optimizations for the stack-based part
there are also layouts preparing for the hybrid stack/register-based VM
Uhmm, I want to start reading the CPython source code. Would this stuff be a good place to start ?
I'd imagine that the upgrade is not that convoluted if the code is well modularized. Tho full disclaimer, I know nothing about this stuff.
i dont know where all the parts of the VM are located, but the part that executes the bytecode is in the Python/bytecodes.c file, iirc
Got it, already have questions, but its basic stuff so I'm gonna see how far I can go with gpt4
TIL: on 3.12 there is literally no difference in perfomance between accesing __slots__'ed attribute vs normal attribute
there was little difference (~10-20%) on previous versions, but on 3.12 i can't even measure it
columns are: setup code, measured code, time spent in measured code
im using timeit for it
D:\>py -3.8 _.py
class X:__slots__=("a",)\nx=X();x.a=1 x.a 20.21 ns
class X:...\nx=X();x.a=1 x.a 22.31 ns
class X:__slots__=("a",)\nx=X() x.a = 1 25.09 ns
class X:...\nx=X() x.a = 1 31.11 ns
class X:__slots__=("a",) X() 47.4 ns
class X:... X() 48.29 ns
D:\>py -3.9 _.py
class X:__slots__=("a",)\nx=X();x.a=1 x.a 20.13 ns
class X:...\nx=X();x.a=1 x.a 22.03 ns
class X:__slots__=("a",)\nx=X() x.a = 1 23.7 ns
class X:...\nx=X() x.a = 1 30.57 ns
class X:__slots__=("a",) X() 48.49 ns
class X:... X() 50.09 ns
D:\>py -3.10 _.py
class X:__slots__=("a",)\nx=X();x.a=1 x.a 23.41 ns
class X:...\nx=X();x.a=1 x.a 25.21 ns
class X:__slots__=("a",)\nx=X() x.a = 1 27.02 ns
class X:...\nx=X() x.a = 1 31.86 ns
class X:__slots__=("a",) X() 50.13 ns
class X:... X() 55.4 ns
D:\>py -3.11 _.py
class X:__slots__=("a",)\nx=X();x.a=1 x.a 14.99 ns
class X:...\nx=X();x.a=1 x.a 15.35 ns
class X:__slots__=("a",)\nx=X() x.a = 1 13.69 ns
class X:...\nx=X() x.a = 1 13.86 ns
class X:__slots__=("a",) X() 62.01 ns
class X:... X() 71.82 ns
D:\>py -3.12 _.py
class X:__slots__=("a",)\nx=X();x.a=1 x.a 13.78 ns
class X:...\nx=X();x.a=1 x.a 13.77 ns
class X:__slots__=("a",)\nx=X() x.a = 1 13.74 ns
class X:...\nx=X() x.a = 1 13.75 ns
class X:__slots__=("a",) X() 61.98 ns
class X:... X() 74.9 ns
good news i guess
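The script behind those numbers wasn't posted, so the following is only a guess at how such a measurement could be set up with timeit. The class names, the `ns_per_op` helper and the repeat counts are all assumptions; absolute numbers will differ per machine and Python version.

```python
import timeit


class Slotted:
    __slots__ = ("a",)


class Plain:
    pass


def ns_per_op(stmt, obj, number=200_000):
    """Best-of-5 time for one execution of stmt, in nanoseconds."""
    best = min(timeit.repeat(stmt, globals={"x": obj}, number=number, repeat=5))
    return best / number * 1e9


results = {}
for cls in (Slotted, Plain):
    obj = cls()
    obj.a = 1
    results[cls.__name__, "x.a"] = ns_per_op("x.a", obj)
    results[cls.__name__, "x.a = 1"] = ns_per_op("x.a = 1", obj)

for (name, stmt), ns in results.items():
    print(f"{name:8} {stmt:8} {ns:6.2f} ns")
```

Taking the minimum of several repeats, as above, reduces noise from other processes, which matters when the differences are a fraction of a nanosecond.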
also i noticed that '...%s...' % (x,) formatting is as fast as f'...{x}...', but i think this was a thing for a long time (a couple of years)
why not just '...%s...' % x
this is not optimized because there is no way to statically determine that x is not a tuple
so it defers to slow old method
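A tiny illustration of why the bare `% x` form can't be treated the same way: if `x` happens to be a tuple, `%` unpacks it as the argument list instead of formatting it as a single value.

```python
x = (1, 2)

# Wrapping in a one-element tuple always formats x as a single value.
print("value: %s" % (x,))  # value: (1, 2)
print(f"value: {x}")       # value: (1, 2)

# Without the wrapper the tuple is unpacked into two arguments for one
# placeholder, which fails at run time.
error = None
try:
    "value: %s" % x
except TypeError as exc:
    error = str(exc)
print(error)
```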
ok
hi i have a question
these are the structs for cpython's tuple and list objects
typedef struct {
PyObject_VAR_HEAD
PyObject *ob_item[1];
} PyTupleObject;
typedef struct {
PyObject_VAR_HEAD
PyObject **ob_item;
Py_ssize_t allocated;
} PyListObject;
why the difference in how ob_item is declared?
PyTupleObject.ob_item is part of the tuple memory allocation (further enforcing its immutability) but PyListObject.ob_item is a realloc()-able pointer for the items of a list
yes but that requires the whole tuple to be realloc()'d
Objects/tupleobject.c line 899
_PyTuple_Resize(PyObject **pv, Py_ssize_t newsize)```
it's a naughty C trick, you access a given element of a tuple like type.ob_item[10] for example - making sure to allocate more space than the PyTupleObject struct itself requires.
this is why it requires that Py_REFCNT(*pv) == 1
but i thought that was to avoid breaking code that relied on immutable tuples
tuples are always immutable
i still don't see how ob_item plays into this
PyTupleObject.ob_item being declared as an array basically makes it part of the tuple structure itself
im learning c but i gathered that along the way

so it can't be changed without reallocating the whole tuple structure
what about lists then?
can you just point to a different vector?
is that the entire point
PyListObject.ob_item being declared as a pointer (to a pointer of an object) makes it reference another piece of memory separate from the list structure
it's reassignable
ohh
unlike PyTupleObject.ob_item
ok i get it
i didn't know this lol
makes sense tho
ish
wait no it doesn't how does that work
if you want a diagram, a tuple looks like this in memory ```
[header]
. . . . . p0   p1   p2
          |    |    |
          v    v    v
         obj  obj  obj
```
and a list is like ```
[header]
. . . . . ptr cap
           |
           v
          p0   p1   p2
          |    |    |
          v    v    v
         obj  obj  obj
```
i thought structs were just pointers with members
A struct describes a fixed-size value and names its members, and the members are stored contiguously in memory
then pointers to structs are pointers to pointers?
You can make the last element of a struct be variable-sized
Yeah well, that's kind of a hack
for lists, the last element isn't ob_item
That's afaik only an optional C feature. You can't do what CPython is doing here as per the language, compilers just special case that specific pattern.
A list stores a pointer to an array with the items. But a tuple stores the items directly after the object header
It's in C99
And made optional in further standards along with VLAs AFAIK.
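The inline-versus-pointer layout is visible from Python as well. This is a CPython-specific sketch: `__itemsize__` reflects the type's `tp_itemsize`, i.e. how many bytes each element adds to the object's own allocation.

```python
import ctypes
import sys

pointer_size = ctypes.sizeof(ctypes.c_void_p)

# Tuples store their item pointers inside the object, so each element adds
# one pointer to the object itself. Lists keep their items in a separate
# buffer, so the list object is fixed-size.
print(tuple.__itemsize__, list.__itemsize__)

# Three more elements means three more pointers in the tuple allocation.
tuple_growth = sys.getsizeof((1, 2, 3)) - sys.getsizeof(())
print(tuple_growth == 3 * pointer_size)

# A list can grow without the list object moving; only the buffer that
# ob_item points at gets reallocated.
items = []
identity = id(items)
for i in range(1000):
    items.append(i)
print(id(items) == identity)
```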
ig im just really confused how stuff looks in memory then. i thought structs members were just pointers to different areas in memory
Ah, no. A struct contains its actual members. Those members can be pointers, but don't have to be.
well basically they're the equivalent of arrays (but with members that may have different types)
static arrays
oh i am just really lost then
i don't think they're laid out how i guessed then
i based that guess on a very bad little test run
ok thanks guys that was a nice wakeup call
Imagine a struct like this ```c
struct Point {
    uint64_t x;
    uint64_t y;
};
```
It will be laid out as a single 16-byte value, where the first 8 bytes are `x`, and the second 8 bytes are `y`
So it will literally occupy 16 consecutive memory cells, like an array
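The same layout can be checked from Python with ctypes, which mirrors C struct layout for the platform it runs on. This `Point` class is a stand-in for the C struct above.

```python
import ctypes
import sys


class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_uint64), ("y", ctypes.c_uint64)]


print(ctypes.sizeof(Point))            # 16
print(Point.x.offset, Point.y.offset)  # 0 8

# The raw bytes of an instance are just the two members back to back.
raw = bytes(Point(1, 2))
first = int.from_bytes(raw[:8], sys.byteorder)
second = int.from_bytes(raw[8:], sys.byteorder)
print(first, second)                   # 1 2
```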
no i got it now
didn't even know that byte padding was a thing
which is where most of the confusion came from
hey does anyone know how to download Spotify playlists using python script
Not really a question for this channel, use #python-discussion for general stuff, but: check out https://spotipy.readthedocs.io/en/2.22.1/
Kind of late but downloading spotify video's is illegal and can even have sentences Linked to it in some places since it breaks drm
what do you think about for x in y if z: statement syntax?
# imaginary syntax:
for var in iterable if condition:
    do_stuff()
# inspired by this:
[expr for var in iterable if condition]
# works like this:
for var in iterable:
    if condition:
        do_stuff()
Kind of cute, but making it harder to skim code. Reminds me of Ruby a bit, where you can do stuff like
return 42 unless false
in odin you can do this: `n1 := caller_2() or_return`; this returns iff caller_2() failed, otherwise it continues
https://odin-lang.org/docs/overview/#or_return-operator
why would `for var in iterable if condition:` be better than ```py
for var in iterable:
    if condition:
```
?
I think there's no chance of that proposal ever getting accepted
things that allow more terse code just for the heck of it are incredibly unlikely to be accepted
If you're worried about indentation levels you can also just do
for var in iterable:
    if not condition:
        continue
    ...
by this argument, it would be better to write ```py
def get_area(self): return self.length * self.width
```
if you can write the same code in less lines without losing readability, i think it is a win
your get_area example joins two lines together, it is not simpler than "normal" code, it only hurts readability a bit
my proposal gets rid of for: if:/for:if not:continue boilerplate, and it adds new syntax, so there is no need to think about these compound statements - you have one simple statement that is baked into the syntax
this proposal looks familiar to me. I think you can find it in the mailing list archives (if you can think of something to search for).
your proposal is nearly equivalent to asking for Python to support `for var in iterable: if condition: do_stuff()` and that's syntactically disallowed. Your proposal is exactly that minus one `:`
in other words, we could have something today that gives both of your stated advantages (1 line instead of 2, no extra indent level) by just allowing an if statement on the same line as a for statement, but that's specifically and deliberately blocked as things stand today
people can already one line this and I've seen people do so, even at the cost of a pointless list:
[do_stuff() for var in iterable if condition]
might have better luck getting that specific pattern special cased in the interpreter to never create the list
might as well do:
for var in (var in iterable if condition):
    do_stuff(var)
You mean
for var in (var for var in iterable if condition):
    do_stuff(var)
yes, i do.
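For comparison, the three spellings that already work today side by side. `numbers` and `is_even` are placeholder names standing in for `iterable` and `condition`.

```python
numbers = range(10)


def is_even(n):
    return n % 2 == 0


# 1. Nested if: one extra indentation level.
nested = []
for n in numbers:
    if is_even(n):
        nested.append(n)

# 2. Guard clause: the loop body stays at one level.
guarded = []
for n in numbers:
    if not is_even(n):
        continue
    guarded.append(n)

# 3. Looping over a generator expression: lazy, so no throwaway list is built.
filtered = []
for n in (n for n in numbers if is_even(n)):
    filtered.append(n)

print(nested == guarded == filtered)  # True
```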
http://www.catb.org/esr/structure-packing/ Worth a read
thanks
python-ideas already discussed this before
not sure where i found it but i did encounter it before
This is very nice
Code readability >
One line is comfortable but not that much readable
I agree that what you have there is the better version. I don't want to encourage pointless one-liners, writing more is fine, it's the same ideas split vertically instead of horizontally, all within a reasonable amount of total screenspace. Was pointing out that doesn't stop people from doing it, even when it comes with other reasons not to beyond the readability. (like the pointless list that could be too large to fit in memory if iterable is large enough and lazily generated...)
Edit: No, I meant the written out form and got a bit of it mixed up.
was more to point to the absurdity as a reminder that we don't actually need more terseness all the time
btw, I would not approve of the for-loop-over-generator that I suggested. Write it out.
I found this discussion that has a bunch of links to other similar discussions https://discuss.python.org/t/conditional-loop-filtering/27317/2 also found the mailing link https://mail.python.org/archives/list/python-ideas@python.org/thread/7IXPROG2ANAFZTHZC4G3HRVXSKRIPXDL/
import re
for _ in range(10):
    pattern = re.compile(r'\d+')
Does re optimize this such that the actual work of compiling \d+ is done only once, and any calls to compile a pattern represented by the same string retrieves the existing one from a cache?
Lib/re/__init__.py lines 374 to 377
@functools.lru_cache(_MAXCACHE)
def _compile_template(pattern, repl):
# internal: compile replacement pattern
return _sre.template(pattern, _parser.parse_template(repl, pattern))```
ooh sorry maybe its not that one but the _compile above
You should be careful though: despite the internal cache, if your program uses a lot of regexes, you can often still get a large speedup from compiling the patterns once and reusing them. I got a 9% speedup for one of the tools CPython uses in CI just by doing that: https://github.com/sphinx-contrib/sphinx-lint/pull/82
oh hey, I use sphinx-lint - thanks for that, @merry bramble
You're very welcome!
is that speedup from the actual pattern compilation, or from eliminating additional evaluations of re.compile(...)?
that's from avoiding unnecessarily re-compiling the same pattern over and over
jw, but if there's supposed to be a cache, then shouldn't the cache just return the already compiled pattern instead of recompiling it again?
the cache has a size limit, so if your program uses a lot of regexes, you might wind up evicting things from the cache that you actually need, and needing to re-compile them later
Yeah, the cache is an LRU cache: once it reaches its maximum size, the regexes that you used least recently are evicted from the cache
If you're dynamically generating lots of regexes (sphinx-lint is), then you can pretty quickly use up the cache, meaning the regexes that you actually are reusing over and over aren't actually present in the cache a lot of the time, and unnecessarily get recompiled over and over
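The usual way to avoid depending on the internal cache is to compile once and keep the pattern object around. A minimal sketch, with a made-up `NUMBER` pattern and `count_numbers` helper:

```python
import re

# Compiled once at import time. Holding the pattern object means lookups
# never go through re's internal cache, so eviction can't affect it.
NUMBER = re.compile(r"\d+")


def count_numbers(lines):
    """Count runs of digits across all lines using the precompiled pattern."""
    return sum(len(NUMBER.findall(line)) for line in lines)


total = count_numbers(["a1 b22", "no digits here", "333"])
print(total)  # 3
```

This also skips the per-call cache lookup that `re.findall(pattern_string, ...)` would do, which is a small saving of its own in tight loops.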
Demo: ```
$ python -m timeit -s 'import re' '[re.match("x" * i, "") for i in range(100)]'
2000 loops, best of 5: 221 usec per loop
$ python -m timeit -s 'import re' '[re.match("x" * i, "") for i in range(100)]'
2000 loops, best of 5: 179 usec per loop
$ python -m timeit -s 'import re' '[re.match("x" * i, "") for i in range(200)]'
1000 loops, best of 5: 375 usec per loop
$ python -m timeit -s 'import re' '[re.match("x" * i, "") for i in range(300)]'
500 loops, best of 5: 565 usec per loop
$ python -m timeit -s 'import re' '[re.match("x" * i, "") for i in range(400)]'
1 loop, best of 5: 737 usec per loop
$ python -m timeit -s 'import re' '[re.match("x" * i, "") for i in range(500)]'
1 loop, best of 5: 908 usec per loop
$ python -m timeit -s 'import re' '[re.match("x" * i, "") for i in range(600)]'
1 loop, best of 5: 313 msec per loop
$ python -m timeit -s 'import re' '[re.match("x" * i, "") for i in range(700)]'
1 loop, best of 5: 458 msec per loop
note the giant jump between creating 500 regexes and 600 - it leaps up from hundreds of microseconds per call to hundreds of milliseconds, 3 orders of magnitude slower
so if you don't have a lot of regexes / use the same one then you're probably fine, no?
and you know that the libraries you're using also don't use regexes internally
ah never mind, i blinded over the
if your program uses a lot of regexes
in alexs message
this in particular is a pretty big ask. It's not enough that you know that you don't use many different regexes, you need to know the internals of every library you're using to know whether each library does
if you don't know that, then you can't make any guarantees about whether the total number of distinct regexes used by the program exceeds the cache size or not, or the likelihood of your regex getting evicted from the cache to make room for regexes used by some library
that's a very good point, never thought of that
<@&831776746206265384>
Could you please send me this message 10 times in private?
If I don't respond increase it by 2 every minute.
@graceful oar Review our rules. We don't allow advertising on this server, so I'm deleting your message.
... what?
sarcasm.
I ran into a case with this a while back that made me think about whether it was worth using a different eviction policy than LRU, but I ended up thinking it probably didn't occur too often and people can just not rely on the cache...
Could possibly be worth switching to an LFU eviction policy, would be less likely that people run into this kind of issue, but the cache semantics are a little more complex, may not be worth it
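A rough sketch of why an LRU policy falls off a cliff like this, assuming the access pattern is a repeated loop over the same keys. It uses `functools.lru_cache` as a stand-in, not `re`'s real internal cache, and the size 512 is an assumption picked to sit between the 500 and 600 in the timings above.

```python
from functools import lru_cache


def hit_rate(cache_size, distinct_keys, rounds=3):
    """Cycle over distinct_keys keys several times and report the hit rate."""

    @lru_cache(maxsize=cache_size)
    def fake_compile(key):
        # Stand-in for an expensive step such as compiling a regex.
        return key

    for _ in range(rounds):
        for key in range(distinct_keys):
            fake_compile(key)

    info = fake_compile.cache_info()
    return info.hits / (info.hits + info.misses)


# Working set fits in the cache: every round after the first is all hits.
print(hit_rate(512, 500))
# Working set is slightly too big: each key is evicted just before it is
# needed again, so there are no hits at all.
print(hit_rate(512, 600))
```

With a cyclic access pattern, LRU always evicts exactly the entry that will be needed soonest, which is the kind of case where a frequency-based policy could behave differently.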
Really? Advertising in PEPs channel...
Consider whether necroposting about something from 48 hours ago is adding anything of value as well.
@summer lichen
<@&831776746206265384> 1009175270613319802 asking for help with unethical/illegal topics (and also not topical to this channel). This is the only thing they have mentioned in their 5 msgs in the server.
@unkempt rock, kindly refer to rule 5. We will not help with malicious topics.
!rule 5
5. Do not provide or request help on projects that may violate terms of service, or that may be deemed inappropriate, malicious, or illegal.
i guess this guy really needs a python script that can crack wifi passwords
they should write a PEP for it then
Bro really wants that free car dealership Wi-Fi
OK It was just for lab
Hello, I am working on a fully static build of python and I noticed that the "internal" static libs (libmpdec.a and libexpat.a) are not getting installed anywhere. Is there a reason for this? Could this be added? Maybe only for the case when the static lib is enabled and shared libs are disabled?
Is this Pokémon? What does it mean?
Hi everyone, I have a question: from what I've observed (not from here), some people have a negative opinion on classes / OOP. Why is that? At least in my experience it's been fairly straightforward. I've used traits before in Rust and that somehow made it unbearable.
very long, but this video covers it pretty well: https://www.youtube.com/watch?v=QM1iUe6IofM
An explanation of why you should favor procedural programming over Object-Oriented Programming (OOP).
that is, you don't like Java/C++ style OOP after having used Rust?
Or you found traits unbearable?
found traits more difficult to learn / understand / read
This is specifically about inheritance, but it sums it up nicely: https://youtu.be/hxGOiiR9ZKg?si=Ehbnru9l7ZYMMNYw
Let's discuss the tradeoffs between Inheritance and Composition
It's less about it being confusing or whatever, but how easy it is to change the code
i.e. coupling
Traits are basically just ABCs
I found this interesting. Thanks for sharing. My biggest takeaway was "don't do deep levels of inheritance" ("deep" is slightly ill-defined) and "don't try to over-encapsulate"
Or maybe even that encapsulation is pointless and really the crux of it all
The bigger thing for all of that is: don't do premature encapsulation or inheritance. Don't start trying to solve problems that don't exist yet just because they might exist later.
Oh yeah, people always talk about premature optimization, don't see much talk about premature abstraction
I've seen this video a few times, but I still don't think I get it... why can't objects pass references to other objects?
it's also not quite clear what the narrator means by OOP, because I have no clue what it is
Keeping in mind that the video is using a perspective of a stricter, lower level language, not one like python where everything is treated as an object, the point made there is that passing objects undermines the encapsulation rationale for oop.
I've never been able to take that video seriously (Object-Oriented Programming is Bad). That entire section of the video on not passing references is pointless since they later say themselves no-one codes like that in practice. To me the entire middle part of the video falls apart as just a rant on frustrations with the programming industry and not an actual critique or guide. Everything they complain about is something that just happens in large systems of code regardless of the paradigm used, since in the end dealing with state and io is hard. Even with pure functional code, you can still make a mess of state encapsulation once you start currying since that is still data, just stored differently.
I don't think it's a good explanation of the position. The argument isn't wrong per se, but it's not a pragmatic or useful explanation of why you would avoid oop, and there's a level of purism in the argument that falls apart the moment you acknowledge hardware.
and it relies on agreeing with many definitions that would be accepted in functional programming circles but that weren't agreed upon, or at least presented, first
I do think people are too quick to encapsulate everything in a class rather than having a function which can operate on only what's needed, but there are better ways to make the argument than that video does.
A video that may make the argument slightly better (and more pragmatically) would be this one: https://www.youtube.com/watch?v=tD5NrevFtbU but it isn't even going after oop in entirety, but after certain patterns of factoring oop code.
People should keep in mind that this is from the perspective of someone teaching a course on performance and game dev, and that the performance reasons alone are not universally the important factor people should consider, but there are a few things in it that apply beyond that, and a few of the things that make it easier for the compiler to do the right thing also help with locality of ideas and cognitive reasoning about a program's design.
I never understood the process. From what I remember, to make a trait and properly use it:
a) make a trait with its function
b) make an impl
c) do a dyn with it
"do a dyn with it" that is definitely not a required step, traits are super useful even without dynamic dispatch
Are you familiar with ABCs (or Protocols for that matter) in Python?
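In case it helps, a small sketch of both on the Python side. All the class names here are made up, and the mapping to Rust is only a loose analogy: the ABC is the "declare the interface, then implement it explicitly" flavor, the Protocol is the structural one where no inheritance is needed.

```python
import math
from abc import ABC, abstractmethod
from typing import Iterable, Protocol


class Shape(ABC):
    """Nominal interface: subclasses must inherit and implement area()."""

    @abstractmethod
    def area(self) -> float:
        ...


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


class HasArea(Protocol):
    """Structural interface: anything with a matching area() qualifies."""

    def area(self) -> float:
        ...


class Circle:
    # Does not inherit from anything, but still satisfies HasArea.
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


def total_area(shapes: Iterable[HasArea]) -> float:
    # Method calls are always resolved at runtime in Python,
    # so there is no separate "dyn" step to opt into.
    return sum(shape.area() for shape in shapes)


print(total_area([Square(2), Circle(1)]))
```

Trying to instantiate `Shape()` directly raises `TypeError`, which is the ABC's way of enforcing that `area` gets implemented.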
Sup fools.
At what stages does the LEGB behavior happen in interpretation and compilation?
When and how is the pycore_flowgraph.h used?
Python compiler and interpreter hit with the black magic stick or no?
edit your message to be a bit more respectful
Sup fools.
what's ironic about it?
It's bold faced apparent irony, it's like calling you guys smart.
.... yeah sure
Cuz this is the place for the devs who know way more about python.
Smh
Which doesn't mean "squish my horse"
It means "shake my head"
ok
@silk slate what is an "unoptimized search" and where do you see one?
I heard about the LEGB behavior and saw someone test it. LEGB seems unoptimized so I started reading through compile.c to find flags for optimization but didn't get far yet.
I've never added anything to a stuff, so figured maybe it's a good opportunity.
what do you mean by "seems unoptimized"?
The search happens every time, yee?
Why?
The search doesn't even need to always happen.
why is that?
Can map the stuff and see if it can change per some branch
Wish I could send a voice message to quickly say it all
But I'm too lazy to type it
you're going to have to use more words, and more specific words.
ok, never mind then.
Yee
you're not going to get an answer from the core devs without putting some more effort.
I'll just read source then.
they have more important things to do than try to read your mind and decipher what little you've said.
Ok
@silk slate definitely also look into how dynamic python can get. names can come and go.
core devs too busy copy pasting meta nogil code :/
lolol
yee i get it.
LEGB behavior searches for variables. you can check the rest of a branch and decide if higher level variable addresses can or are expected to change, and store if a change has occurred and optimize this behavior.
or
idk theres all kinds of ways
point is i saw someone test it and show a 10% performance difference copying a global in.
and i thought
thats stupid
so i started reading source and learning more about the behavior.
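For context, "copying a global in" presumably means something like the sketch below: bind the global to a local name once, then use the local inside the loop. The names are made up, and the size of any difference depends heavily on the Python version and machine, so this makes no claim about reproducing the 10% figure.

```python
import timeit

SCALE = 3  # hypothetical module-level global


def sum_scaled_global(values):
    total = 0
    for v in values:
        total += v * SCALE  # SCALE is looked up as a global each iteration
    return total


def sum_scaled_local(values):
    scale = SCALE  # one global lookup, then a plain local variable
    total = 0
    for v in values:
        total += v * scale
    return total


def compare(n=10_000, repeat=5, number=20):
    """Return (global_time, local_time), each the best of `repeat` runs."""
    data = list(range(n))
    t_global = min(timeit.repeat(lambda: sum_scaled_global(data),
                                 repeat=repeat, number=number))
    t_local = min(timeit.repeat(lambda: sum_scaled_local(data),
                                repeat=repeat, number=number))
    return t_global, t_local


if __name__ == "__main__":
    t_global, t_local = compare()
    print(f"global: {t_global:.4f}s  local: {t_local:.4f}s")
```

Both functions compute the same result; only the kind of name lookup inside the loop differs.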
im on my laptop now, so i can type faster.
pro tip: when you think, "that's stupid" consider that it might be more complex than you realize.
it could be that you have a way to improve things, that will be great.
well yeah thats why im reading source and about the bytecode and stuff.
but the core devs are not dumb.
neither am i
have you disassembled any bytecode? That would be a place to start if you haven't yet.
ugh gross. thats like the ghidra approach
bleehhh gonna throw up
i hate ghidra
reverse engineering so slow
i have no idea what ghidra is, i'm just talking about reading the bytecode that python is compiled to.
Ghidra is a reverse engineering tool, not in any way related to what you said.
ghidra decompiles stuff and maybe sometimes can turn to source
you dont need to decompile the bytecode
it is, because what youre doing is the same essentially.
!d dis
Source code: Lib/dis.py
The dis module supports the analysis of CPython bytecode by disassembling it. The CPython bytecode which this module takes as an input is defined in the file Include/opcode.h and used by the compiler and the interpreter.
no, it's not.
its reading the lowest stuff to infer some higher level pattern
thats backwards. ill just read source.
thanks
it's reading the bytecode to understand what happens when a local is accessed, a global is accessed, etc.
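A minimal sketch of what that looks like with `dis`. The two functions are made up for illustration, and the exact opcode names vary between CPython versions, so no expected output is shown.

```python
import dis

counter = 0  # hypothetical module-level global


def read_global():
    # The compiler sees no local binding for `counter` in this function,
    # so it emits a global lookup.
    return counter


def read_local(counter):
    # Here `counter` is a parameter, so the compiler knows it is a local.
    return counter


def opnames(func):
    """List the opcode names the compiler emitted for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


print(opnames(read_global))
print(opnames(read_local))
```

The point is that the choice between a local-style load and a global-style load is already made at compile time, per name, per function; `dis.dis(read_global)` prints the same information in a more readable table.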

