#internals-and-peps

rich nimbus
#

that was it! It contained a pycache folder, hah. Thanks for the help!

cerulean pond
#

Hi

rich nimbus
#

just a curiosity, but is there a (practical) way to write python opcode mnemonics (à la writing assembly) and execute them on the interpreter?

#

and if not, is there an impractical way?

rich nimbus
#

ah yes, seems like there are projects for this. answered my own question!

spark magnet
#

@rich nimbus can you say more about why you want to write bytecode directly? it changes from version to version.

grave jolt
rich nimbus
spark magnet
#

ok!

molten kelp
#

Hi, I'm Arsalan from Pakistan. I'm a beginner in Python and excited to learn with you all!

boreal umbra
sturdy timber
quick snow
#

The filtering for versions on the trends site doesn't seem to work: Despite a selected version like 3.13 it still shows 3.11, 3.12, and 3.13 builds in the graph.

raven ridge
#

Either way, this is still fake data. There's a reason this hasn't been announced yet 😅

raven ridge
#

Ah, I see. Yeah, I'm pretty sure that's just because the chart is still being populated by mock data

haughty gazelle
#

What is peps

#

And internals

#

LIFO

#

😎

limpid forum
#

Is this a bug? We just talked about it in #python-discussion and I decided to experiment and it doesn't work

https://peps.python.org/pep-3101/

The object.__format__ method is the simplest: It simply converts the object to a string, and then calls format again:

class object:
    def __format__(self, format_spec):
        return format(str(self), format_spec)

But with a simple custom class that defines __str__:

>>> format(Pair(1,2),"^10")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported format string passed to Pair.__format__
>>> format(str(Pair(1,2)),"^10")
'  (1, 2)  ' 

And even with string itself passed to object's dunder:

>>> object.__format__("aaa","^10")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported format string passed to str.__format__
>>> str.__format__("aaa","^10")
'   aaa    '

Tested on 3.12.11.
It would seem object is doing something to the format spec instead of just passing it like the pep says?

CC @vernal narwhal and @ornate wyvern because maybe you're curious too

ornate wyvern
#

from a first look

#

^10 in my opinion adds those spaces

#

notice that the first string is 10 length
the second is 9

limpid forum
ornate wyvern
#

i see what u mean

#

🤔

limpid forum
# ornate wyvern ^10 in my opinion adds those spaces

Literally this vs the quote you originally posted in pygen

>>> format(Pair(1,2),"^10")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported format string passed to Pair.__format__
>>> format(str(Pair(1,2)),"^10")
'  (1, 2)  ' 

If object's format is calling format(str(obj), format_spec) like pep said, those two should work the same. But one doesn't work

ornate wyvern
#

u mean why with str it works and without it doesn't

uneven raptor
#

I think it's just that PEP 3101 is out of date, the canonical docs for object.__format__ say this:

The default implementation by the object class should be given an empty format_spec string. It delegates to __str__().

limpid forum
ornate wyvern
#

let me try as well

uneven raptor
#

ah, yup:

Changed in version 3.4: The __format__ method of object itself raises a TypeError if passed any non-empty string.

#

PEP 3101 was written for 3.0. it's not updated for any future changes made to __format__.
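
A minimal sketch of the behaviour discussed above. The definition of `Pair` was never posted, so the class here is a guess that only defines `__str__`; `FormattablePair` is a hypothetical subclass that opts back in to what PEP 3101 describes by delegating to `str` by hand.

```python
class Pair:
    """Guess at the class from the question: defines only __str__."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __str__(self):
        return f"({self.a}, {self.b})"


class FormattablePair(Pair):
    """Hypothetical variant that restores the behaviour PEP 3101 describes."""

    def __format__(self, format_spec):
        # Delegate to str.__format__, which understands specs like "^10".
        return format(str(self), format_spec)


# An empty spec is fine: object.__format__ just returns str(self).
empty_spec_result = format(Pair(1, 2))

# A non-empty spec reaches object.__format__ and raises TypeError (3.4+).
try:
    format(Pair(1, 2), "^10")
except TypeError as exc:
    error_message = str(exc)

centered = format(FormattablePair(1, 2), "^10")
print(repr(empty_spec_result), repr(centered), error_message)
```

The error message names `Pair.__format__` even though the method that raises is inherited from `object`, which is the wording complaint raised later in the thread.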

ornate wyvern
#

what

#

if passed any non empty string?

limpid forum
ornate wyvern
#

he passed str(Pair(1,2)) and it worked

limpid forum
limpid forum
ornate wyvern
#

i call everyone he, no offense

#

but i can call u she though if u want

limpid forum
ornate wyvern
#

i dont feel comfortable using they

limpid forum
#

Tough luck. That is misgendering on purpose and against the rules

ornate wyvern
#

🤷‍♂️

limpid forum
ornate wyvern
uneven raptor
#

seemingly we were more lenient with changes to builtins back in 3.4

ornate wyvern
#

so object's format is literally doing nothing?

#

accepts only empty strings?

uneven raptor
#

yes

ornate wyvern
#

...

#

well, problem solved then

limpid forum
# uneven raptor there wasn't any. see https://bugs.python.org/issue7994

Shouldn't pep be edited to mention that this part of described behaviour is no longer valid? Because the rest of the pep is still important, just that part about object's format dunder is not

Also imo error message could be clearer - it says {type(obj)}.__format__ got unsupported format string - suggesting that this class has format dunder, but imo should say just object.__format__ got non-empty format_spec (like the issue says - raise when it's not empty)

uneven raptor
#

nope, final PEPs are kept solely as historical documents. they're not updated for new amendments to the feature they added.

limpid forum
#

Huh, not even the header to mention which pep changed them? Although here it wasn't changed by a PEP...

uneven raptor
#

if it were a PEP that changed it, then it would say Superseded-by in the PEP headers. a lot of PEPs have a note saying something like "this is a historical document, the up to date documentation is at <docs url>", but I personally don't think it's worth adding it here (on the basis that it would probably inspire copycat PRs).

worldly venture
#

i like the way the IETF handles things like this with the errata system

limpid forum
#

Imo it's still worth adding if it's valid superseding, and editing the header was forgotten. Because as I said, the quotes from this pep were flying in #python-discussion as if it was valid and I was just like "huh, I didn't know that" and tested it and then arrived here 😄

uneven raptor
#

I mean, this applies to all PEPs, especially old ones

feral island
#

PEP 3101 should probably get a header pointing to current str formatting docs

#

we do that for a number of other PEPs

toxic leaf
#

Yo

lofty hornet
#

And refer to people with correct pronouns and just ask if you are unsure about it

#

Before saying anything

boreal umbra
#

@lofty hornet the moderators have already responded to what you're referring to. There's no need to pile on.

lofty hornet
#

Mb

boreal umbra
#

Thank you for your concern for our rules. Just send a message to @summer lichen if you see something that you think we should respond to.

balmy plank
#

hi

#

iam looking for dev

#

who can help me?

boreal umbra
#

@balmy plank you can look at #❓|how-to-get-help if you have a question. you can't try to hire people here, though.

balmy plank
#

I'm terribly sorry.

merry venture
#

is that a CPython implementation detail?

feral island
merry venture
sinful osprey
# merry venture method_cache() implementation supporting subclasses. feedback's welcome lol http...

At a glance, it looks like it locks on a per-closure basis instead of a per-instance basis, which pre-3.12 functools.cached_property did as well (though its lock was technically per descriptor). There was enough feedback about how much of a performance issue it was that the steering council agreed that it should be removed. Not sure it's a big deal if you don't intend for method_cache to be used for something performance-critical, but who knows.

Relevant links:

merry venture
#

I'm seeing that I must have belittled the case

#

thanks, will think about a different approach

feral island
#

Quite possibly you should just not do locking. My takeaway from the locking saga with functools.cached_property was that the locking added more problems than it solves

#

If the worst thing that can happen is the cached method might execute multiple times in some cases, that seems fine

sinful osprey
#

The various links have people’s attempts at per-instance locking as well, if you want to try that.

#

Could also just document that the decorated function should use a lock if a user thinks thread-safety is important for their purposes.
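
A rough sketch of the "just don't lock" option. The original `method_cache()` was not posted here, so `cached_method` below is a hypothetical stand-in limited to zero-argument methods; it stores the result on the instance and accepts that two racing threads might both compute the value.

```python
import functools


def cached_method(func):
    """Hypothetical lock-free cache for zero-argument methods.

    Results live in the instance's __dict__, so each instance has its own
    cache and nothing is shared between instances.
    """
    attr = f"_cached_{func.__name__}"

    @functools.wraps(func)
    def wrapper(self):
        try:
            return self.__dict__[attr]
        except KeyError:
            # Two threads can both get here and both call func; the last
            # write wins. That is the accepted trade-off for having no lock.
            value = self.__dict__[attr] = func(self)
            return value

    return wrapper


class Report:
    def __init__(self):
        self.calls = 0

    @cached_method
    def total(self):
        self.calls += 1
        return 42


report = Report()
first, second = report.total(), report.total()
print(first, second, report.calls)
```

If a caller does need exactly-once execution, they can take their own lock inside the decorated method, which matches the "document it" suggestion above.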

winged sphinx
late vortex
#

!resources

fallen slateBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

halcyon trail
# feral island Quite possibly you should just not do locking. My takeaway from the locking saga...

I don't know if these problems are cached_property specific, but I've been noticing lately that a teammate of mine uses @lru_cache decorators on functions a lot, instead of just writing a simple class that would open the relevant data once and then access it.
lru_cache seems even worse because then your state is global, whereas with cached_property I would guess it's just stored in the instance of the object

#

and it just feels... idk. Maybe adds a tiny bit of convenience short term, but I can't help the nagging feeling that someday it's either going to surprise me with really unintuitive behavior, or it's just going to make debugging a headache

#

I'm curious if other people feel like @lru_cache should be used very sparingly, or think it's fine.
For me, i feel like I'd mostly only really use it when I had a bunch of code already written with a function, and wanted to speed it up without rewriting it.

feral island
#

I've used it (or other caching decorators) frequently. It can definitely add a layer of complexity, but caching can also solve a lot of problems

halcyon trail
#

fair enough, I should maybe be more open minded then. It just feels like quite a bit of magic to save what is typically a handful of lines of code. but maybe the magic isn't an issue often enough that this matters.

spark magnet
feral island
halcyon trail
#

Well, to be fair, if I was not using the decorator I wouldn't have a lazy global

#

I'd have a class, whose init function took an argument explicitly saying which data to load

#

And then an accessor
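
To make the two shapes being compared concrete, here is a small sketch. `load_table` and `Table` are hypothetical names, and the dictionary literal is a stand-in for whatever file or database read the real code does.

```python
import functools


@functools.lru_cache(maxsize=None)
def load_table(name):
    """Hypothetical loader: the cache is module-level state keyed by name."""
    return {"name": name, "rows": [1, 2, 3]}  # stand-in for reading a file


class Table:
    """Explicit alternative: the loaded data lives on the instance."""

    def __init__(self, name):
        self.name = name
        self._data = {"name": name, "rows": [1, 2, 3]}  # loaded once, here

    def rows(self):
        return self._data["rows"]


load_table("users")
load_table("users")
info = load_table.cache_info()  # one miss, then one hit
load_table.cache_clear()        # dropping the cached state is a global action

table = Table("users")
rows = table.rows()
print(info, rows)
```

With the class, the lifetime of the data follows the lifetime of the object; with the decorator, it lasts until someone calls `cache_clear()` or the process exits.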

lapis bramble
#

Hello everyone I'm not sure if this is the right place to ask but I was wondering about the use of _asyncio_future_blocking
As far as I know, it's set when the future is awaited (or yielded from same thing) and the yielded future is brought back up to __step_run_and_handle_result. I was told that "the flag is there to avoid waking up the event loop when the future is done but no one is there to await it" but I only see wake up mechanisms (run_forever looks like just a while True and run_until_complete just adds a callback to stop run_forever which it calls internally)

#

The only wakeup mechanism I see is on individual tasks, and tasks that receive a future from coroutines register their wakeup as a done callback on the future, but I don't see why _asyncio_future_blocking == True is needed for that, since when the task receives a non-blocking future it raises a RuntimeError anyway

#

I'm doing my best to understand but this is all a bit obscure to me haha

lapis bramble
#

Just tried to look up all instances of that in the Lib folder, I don't see what it's used for haha
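
One thing the flag visibly does can be shown with a small experiment. This is only a sketch, and it does not settle the "avoid waking the loop" claim quoted above: `YieldsFutureDirectly` is a hypothetical awaitable that yields a future without going through `Future.__await__`, so the flag stays `False` and the task reports a bare `yield` instead of registering its wakeup callback.

```python
import asyncio


class YieldsFutureDirectly:
    """Hypothetical awaitable that yields a future without setting the flag."""

    def __init__(self, fut):
        self.fut = fut

    def __await__(self):
        # Future.__await__ sets fut._asyncio_future_blocking = True before
        # yielding; this bare yield skips that step on purpose.
        yield self.fut
        return self.fut.result()


async def main():
    loop = asyncio.get_running_loop()

    # Normal path: `await fut` goes through Future.__await__.
    good = loop.create_future()
    loop.call_soon(good.set_result, "done")
    normal_result = await good

    # Bare-yield path: the task sees the flag is False and throws a
    # RuntimeError back into the coroutine instead of waiting on the future.
    bad = loop.create_future()
    try:
        await YieldsFutureDirectly(bad)
    except RuntimeError as exc:
        error = str(exc)
    else:
        error = None
    bad.cancel()
    return normal_result, error


normal_result, error = asyncio.run(main())
print(normal_result)
print(error)
```

So at minimum the flag lets a task tell "this future was properly awaited" apart from "this future was yielded by mistake"; the task resets it to `False` once it has hooked up its wakeup callback.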

mild roost
#

Hi, please help me. I understand Python concepts, but I'm facing difficulty while solving problems

boreal umbra
cursive lily
#

Hello from the future. What problems did snakemake cause for you? Would you recommend something else for configuring a data processing workflow?

cursive lily
rich nimbus
#

hey everyone, where can I find a reference for python bytecode? for learning & exploratory purposes, I know it can change between versions (on that note, could be for any version that's still supported)

feral island
#

also the InternalDocs/ folder in the cpython repo maybe, not sure how much the bytecode is covered

#

you'll probably be able to find various tutorials and stuff on the internet, but a lot of aspects of the bytecode change significantly between versions
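
For exploring, the standard library's `dis` module is probably the quickest way in, and its documentation lists the opcodes for the version you are reading about. A minimal sketch (the `add` function is just an example):

```python
import dis


def add(a, b):
    return a + b


# Human-readable listing; the exact opcodes differ between CPython versions.
dis.dis(add)

# The same information as data, handy for poking at it programmatically.
opnames = [instr.opname for instr in dis.get_instructions(add)]
print(opnames)
```

Running the same snippet under two interpreter versions is a quick way to see how much the instruction set moves between releases.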

raven ridge
#

!pban 1319768663485583391 scan spam

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @nimble dust permanently.

blissful bluff
#

I am making a program which caches files (up to 100MB), sent over network one kilobyte at a time
How should I allocate memory for the cached file?
Should I allocate the memory for the entire file beforehand?

```py
while len(data) == 1000:
    head = conn.recv(8)
    data = conn.recv(int.from_bytes(head[1:8], "big"))
    cache[i : i+len(data)] = data
    i += len(data)
```
or should I allocate memory as data is received?
```py
cache = bytes(0)
while len(data) == 1000:
    head = conn.recv(8)
    data = conn.recv(int.from_bytes(head[1:8], "big"))
    cache += data
```
#

The first method would be the C way of doing it but I don't know how bytes objects are stored in memory, and I'm worried about memory fragmentation.

raven ridge
blissful bluff
raven ridge
#

It'll be contiguous for both, but the first method allocates it only once, and the second reallocates on each append, so that each iteration of the loop creates a new contiguous array that's 1000 bytes larger, and then deallocates the previous contiguous array

raven ridge
#

Which can lead to fragmentation, in addition to just being generally less efficient
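
The difference between the two approaches can be sketched without a socket. The chunk list below is a stand-in for data arriving from the network, and both function names are made up for the illustration.

```python
def fill_preallocated(chunks, total_size):
    """Copy chunks into one buffer that is allocated up front."""
    cache = bytearray(total_size)
    view = memoryview(cache)  # slicing a memoryview does not copy
    offset = 0
    for chunk in chunks:
        view[offset : offset + len(chunk)] = chunk
        offset += len(chunk)
    return cache


def grow_by_concatenation(chunks):
    """Rebuild an immutable bytes object on every append."""
    cache = bytes(0)
    for chunk in chunks:
        cache += chunk  # allocates a new, larger bytes object each time
    return cache


# Stand-in for data arriving from the network in 1000-byte pieces.
chunks = [bytes([i % 256]) * 1000 for i in range(50)]
total = sum(len(c) for c in chunks)

preallocated = fill_preallocated(chunks, total)
concatenated = grow_by_concatenation(chunks)
print(len(preallocated), bytes(preallocated) == concatenated)
```

If the total size is not known in advance, collecting the chunks in a list and calling `b"".join(chunks)` once at the end is a common middle ground.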

raven ridge
# blissful bluff I am making a program which caches files (up to 100MB), sent over network one ki...

also - I didn't have time to bring it up earlier, but - it doesn't seem like you're handling recv() correctly here. If this is using UDP, you're missing any handling for dropped packets, so I assume that it's TCP. But if it's TCP, any recv() call can return less data than you asked for (even if the client sent the data in the chunk sizes you're expecting). You need to loop until you get the amount that you want (or an empty bytes object, indicating the connection dropped before the client sent everything you were waiting for). You need something like (untested, off the top of my head)
```py
def read_exactly(sock, n):
    buf = bytearray(n)
    read = 0
    while read != n:
        chunk = sock.recv(n - read)
        if not chunk:
            raise EOFError  # Or however you want to handle this
        buf[read : read + len(chunk)] = chunk
        read += len(chunk)
    return buf

cache = bytearray(size_of_file)
i = 0
while i != size_of_file:
    head = read_exactly(conn, 8)
    data = read_exactly(conn, int.from_bytes(head[1:8], "big"))
    cache[i : i+len(data)] = data
    i += len(data)
```

boreal umbra
#

I think someone said that Astral eventually intends for uv to have a solution for dependency conflicts. Could that be done through a custom implementation of importlib, and would it be possible for uv to monkeypatch it at runtime?

raven ridge
#

I don't think the idea relies on a custom implementation of importlib, I think it just depends on a finder and loader on sys.meta_path that searches for modules in a different place depending on what module is doing the importing. Plus some way to deconflict things in sys.modules I guess

#

and they could register that sys.meta_path entry using a .pth file in uv-created environments, or with a custom sitecustomize.py, or something.

#

the hard part isn't finding multiple copies of a library at different versions, or loading the right one depending on what module is doing the importing. The hard part is handling extension modules, which get loaded as shared objects and which need to not be able to find symbols from a different version of the library (or its dependencies)
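
A minimal sketch of the `sys.meta_path` hook mentioned above, to show the mechanism only. `RedirectingFinder` and the `demo_dep` module are hypothetical; this version maps a module name to a fixed file and does not try to pick a copy based on which module is importing, nor does it touch the extension-module problem.

```python
import importlib.abc
import importlib.util
import sys
import tempfile
from pathlib import Path


class RedirectingFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder: serve selected module names from chosen files."""

    def __init__(self, locations):
        self.locations = dict(locations)

    def find_spec(self, fullname, path=None, target=None):
        location = self.locations.get(fullname)
        if location is None:
            return None  # not ours, let the next finder try
        return importlib.util.spec_from_file_location(fullname, location)


with tempfile.TemporaryDirectory() as tmp:
    # Pretend this file is "version 2" of a dependency stored off sys.path.
    module_file = Path(tmp) / "demo_dep_v2.py"
    module_file.write_text("VERSION = '2.0'\n")

    finder = RedirectingFinder({"demo_dep": str(module_file)})
    sys.meta_path.insert(0, finder)
    try:
        import demo_dep
        version = demo_dep.VERSION
    finally:
        # Undo the global changes so the sketch leaves no trace behind.
        sys.meta_path.remove(finder)
        sys.modules.pop("demo_dep", None)

print(version)
```

A real tool would install a finder like this at startup (for example from a `.pth` file, as suggested above) and would still have to decide what to put in `sys.modules`, which holds one entry per module name.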

velvet patrol
# raven ridge also - I didn't have time to bring it up earlier, but - it doesn't seem like you...
from __future__ import annotations
import dataclasses, ssl, struct, hmac, hashlib, socket, time, tempfile, secrets
from typing import Final

MAX_CHUNK: Final = 2 * 1024 * 1024
MAX_FILE: Final = 50 * 1024 * 1024
SOCK_TIMEOUT: Final = 10
TRANSFER_DEADLINE: Final = 120
VALID_CMD: Final = 0x01
HMAC_KEY: bytes = secrets.token_bytes(32)


@dataclasses.dataclass
class ProtoError(Exception):
    msg: str


def read_exactly(sock: socket.socket, n: int) -> bytes:
    sock.settimeout(SOCK_TIMEOUT)
    buf = bytearray(n)
    mv = memoryview(buf)
    read = 0
    while read < n:
        chunk = sock.recv_into(mv[read:], n - read)
        if not chunk:
            raise ProtoError("peer closed early")
        read += chunk
    return bytes(buf)


def make_tls_server_socket(bind: tuple[str, int], cert: str, key: str) -> socket.socket:
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    ctx.load_cert_chain(certfile=cert, keyfile=key)
    raw = socket.create_server(bind, reuse_port=True)
    return ctx.wrap_socket(raw, server_side=True)  # type: ignore


def receive_file(conn: socket.socket) -> bytes:
    start = time.monotonic()
    pre = read_exactly(conn, 16)
    size, n_chunks, salt = struct.unpack("!II8s", pre)
    if not (0 < size <= MAX_FILE):
        raise ProtoError("bad file size")
    if n_chunks == 0 or size // n_chunks > MAX_CHUNK:
        raise ProtoError("bad chunk meta")
    mac = hmac.new(HMAC_KEY + salt, digestmod=hashlib.blake2s)

    with tempfile.SpooledTemporaryFile(max_size=MAX_FILE) as tmp:
        for expected_idx in range(n_chunks):
            if time.monotonic() - start > TRANSFER_DEADLINE:
                raise ProtoError("transfer timeout")

            header = read_exactly(conn, 9)
            cmd, idx, length = struct.unpack("!BII", header)
            if cmd != VALID_CMD:
                raise ProtoError(f"invalid opcode {cmd:#x}")
            if idx != expected_idx:
                raise ProtoError("out-of-order / replayed chunk")
            if not (0 < length <= MAX_CHUNK) or tmp.tell() > size - length:
                raise ProtoError("chunk length invalid")

            chunk = read_exactly(conn, length)
            tmp.write(chunk)
            mac.update(header)
            mac.update(chunk)

        received = read_exactly(conn, mac.digest_size)
        if not hmac.compare_digest(mac.digest(), received):
            raise ProtoError("HMAC mismatch – tamper detected")

        tmp.seek(0)
        data = tmp.read()

    if len(data) != size:
        raise ProtoError("size mismatch after readback")
    return data
raven ridge
#

just skimming, but that looks much more reasonable

velvet patrol
#

Yeah, unfortunately it's python though. There is much more value in making the same thing in go

raven ridge
#

hm? why?

#

IO-bound stuff should work just about equally well in Python as in Go

#

Python can wait for the remote just as fast as Go can, and copy bytes into a buffer almost as fast

velvet patrol
#

Yeah true, Python can wait just as fast on I/O. But it reacts slower, since it’s interpreted and has to juggle the asyncio event loop and callbacks. Plus, the GIL gets in the way of real multithreading. This means if you're caching a bunch of files at once, you'd need to use multiprocessing or tools like Numba. Still, even then, Go’s goroutines are just way more flexible and scale way better.

#

Uvloop might make the event loop just as good, although I haven't benchmarked it

#

Nope it doesn't. Unfortunately the event loop and callbacks are still not as good.

feral island
velvet patrol
feral island
#

asyncio in CPython itself was optimized, while uvloop hasn't moved much over the last few years

#

haven't looked at benchmarks though, just something I heard discussed at work

velvet patrol
#

In October 2024 they fixed compatibility issues with python 3.12.5 and improved signal handling with cpython. So I'm not sure what issues there may be or if the gap narrowed but Uvloop is just a C type implementation of asyncio that runs it in a wrapper pretty much

#

I haven't seen any benchmarks that prove it to be still not worth using

blissful bluff
# raven ridge also - I didn't have time to bring it up earlier, but - it doesn't seem like you...

I had this exact issue with the client not receiving data, and found a similar solution and made an equivalent of read_exactly().
I have not had any such problem with the server, even after hundreds of megabytes of test data sent without issues, though if I do I will replace all recv() calls with a more reliable function.
Thanks for the concern though, and another method of handling sockets' unpredictable behavior

boreal rivet
#

what is the use of socket programming??

static hinge
#

inter-process and network communication

#

What's that have to do with internals?

quiet crane
#

Any thoughts, ideas or knowledge on pytest/cpython diff in usage of __test__?

GitHub

A file level __test__ = False successfully stops pytest from collecting any test from that file, however when running pytest --doctest-modules on the same file, this error appears: /sw/Python/Ubunt...

GitHub

Bug report Bug description: First opened this issue at pytest-dev but was directed here. pytest-dev/pytest#11730 A file level __test__ = False successfully stops pytest from collecting any test from...

thorn flume
#

so like

#

theoretically

#

if i made an implementation of python that was completely compatible with cpython

#

except

#

i had a slight implementation difference

#

like say

#

not implementing pep 456

#

it wouldnt be a python implementation

#

?

sour thistle
# thorn flume it wouldnt be a python implementation

there isn't a well defined rule for what can or cannot be called 'a python implementation'

if it can pass the CPython tests suite, run most Python programs in the wild and supports pure python pypi packages it's probably fair to call it a python implementation

raven ridge
#

I think it's fair to call something a python implementation even if it fails most of those tests. I'd say that MicroPython and CircuitPython are clearly Python implementations, despite missing a giant chunk of the stdlib

feral island
#

Is PEP 456 just an example or do you really want to get more hash collisions in your Python code?
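
For context, PEP 456 is the change that made the str/bytes hash algorithm pluggable, with SipHash as the default. Which algorithm a given interpreter was built with can be checked at runtime; a small sketch:

```python
import sys

# sys.hash_info reports how this interpreter hashes str and bytes.
hash_algorithm = sys.hash_info.algorithm
print(hash_algorithm, sys.hash_info.hash_bits)
```

An alternative implementation could report a different value here and still run ordinary Python code, since programs are not supposed to depend on specific hash values.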

thorn notch
#

particle physics programming
how do you know the hashes exist if they don't collide?

waxen siren
#

Schrödinger's hash

rose schooner
#

pep 798, yay

boreal umbra
#

!pep 798

fallen slateBOT
boreal umbra
gilded flare
wanton flame
#

Status: draft. That means it's time to discuss and refine it before sending to the Steering Council. Discuss link is at the top of the PEP.

shy slate
#

I have an implementation for a HybridLock to work with multiple threads and asyncio at the same time, although i now also want to allow sync access through that lock, but i am worried about the possible caveats it can have

also i do want to confirm that whenever a thread will access this lock will it block that thread (which is what i want and expect) or will it block the thread in which the lock instance has been created (which is not what i want)

Src: https://mystb.in/7e54be880f8b663f35
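
On the "which thread gets blocked" question, here is a sketch using a plain `threading.Lock`. It assumes the sync path of the HybridLock ends up in a `threading` primitive, which cannot be confirmed from here; for those primitives it is the thread calling `acquire()` that waits, regardless of where the lock object was created.

```python
import threading

lock = threading.Lock()  # created in the main thread
lock.acquire()           # and currently held by it
events = []


def worker():
    # This call blocks the worker thread (up to the timeout), not the
    # thread that created the lock.
    got_it = lock.acquire(timeout=0.2)
    events.append(("worker acquired", got_it))
    if got_it:
        lock.release()


thread = threading.Thread(target=worker)
thread.start()
events.append(("main still running", True))  # main is not blocked meanwhile
thread.join()
lock.release()
print(events)
```

Calling a blocking `acquire()` from inside a running event loop would stall that loop's thread, so the asyncio side normally needs a non-blocking path of its own.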

shy slate
#

what we are mostly worried about is the cancellations part, it could maybe cause unexpected behaviour

nimble field
#

Guys

#

Have You Heard Of Panda3D

#

Its A Game Engine Using Python

shy slate
#

i wonder why the stdlib doesnt provide such a lock

winged orbit
#

In the core.py podcast, they mention that the 3.15 profiler supports asyncio. I want to see how this works under the hood, but when I check Lib/profile/sample.py I don’t see anything useful + I don’t see any relevant PRs with some basic searches.

What am I missing?

Ultimately I would love for this builtin async support to work for more than asyncio, which is why I asked here. (Crossposted from a help channel because I doubt I’ll get any good responses before the question gets automatically closed)

prime estuary
# winged orbit In the core.py podcast, they mention that the 3.15 profiler supports asyncio. I ...

I think it's to do with the _remote_debugging module (PEP 768) having support for iterating tasks, it might do that automatically? Also this issue: https://github.com/python/cpython/issues/91048

GitHub

BPO 46892 Nosy @gvanrossum, @DinoV, @asvetlov, @1st1, @kumaraditya303, @itamaro, @mpage Note: these values reflect the state of the issue at the time it was migrated and might not reflect the curre...

raven ridge
#

There was a PyCon US talk about this, actually!

raven ridge
# winged orbit In the core.py podcast, they mention that the 3.15 profiler supports asyncio. I ...

asyncio in Python 3.14 introduces a new powerful feature: introspecting a running asyncio program from another OS process. This changes everything—now you can debug and profile your asyncio code in production with no performance penalty. Join us for a fun ride as we show how this magic works under the hood and how you can use it. Learn about t...

#

Both hilarious and informative 😃

winged orbit
#

Hm I thought that one was specifically asyncio debugging, I’ll take a look. Thanks!

uneven raptor
#

one million hertz

raven ridge
winged orbit
#

yeah I thought there was async support but couldn't find any mention in the code/in the docs and no obvious prs, so I was wondering what I was missing

raven ridge
#

the new Lib/profile/async_pstats_collector.py there seems to be what you're interested in

winged orbit
#

thanks that's exactly what I was looking for! I wonder why there's no PR -- was this feature dropped temporarily to get the profiler in first due to some issues?

raven ridge
#

no, the other way around

#

the sampling profiler was added first, and this is a new feature being added to the sampling profiler, building on top of the previous work

winged orbit
#

oh I just assumed because this was adding e.g. the sampling profiler to the whats new (so I assumed w/o asyncio was just a cherry pick or something)

raven ridge
#

uh... hm. 🤷‍♂️ 😄

quick snow
#

!warn 653398159418195971 This is not a job board, see #rules (rule 9)

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied warning to @high ice.

late jungle
#

is there a PEP somewhere that defines python to be strictly interpreted and not AOT compiled to machine code?

flat gazelle
#

shouldn't be, tho many PEPs implicitly make this assumption in the implementation section.

uneven raptor
#

plenty of things AOT compile python to machine code

#

nuitka and mypyc off the top of my head

#

technically pypy

raven ridge
#

Technically Cython

clear hill
#

also the distinction between an “interpreted” and “compiled” language is pretty squishy to the point where the terms aren’t very useful IMO

#

the CPython interpreter has a bytecode compiler, for example
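
The compile step is even exposed as a builtin, which makes the point easy to see. A minimal sketch (the source string is arbitrary):

```python
source = "total = sum(range(5))"

# compile() turns source text into a code object holding bytecode...
code = compile(source, "<demo>", "exec")

# ...and exec() has the interpreter run that bytecode.
namespace = {}
exec(code, namespace)
print(type(code).__name__, len(code.co_code), namespace["total"])
```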

lusty scroll
#

funny, I was just thinking that this morning as well

feral island
# raven ridge Technically Cython

Curious what makes you say "technically" here. Cython compiles Python to C code and then machine code, nothing technical about it. Is it because Cython semantics don't exactly match standard Python semantics?

#

It has a mode that compiles unmodified Python code in addition to the ones where you write in a .pyx file and get more syntax

clear hill
#

recent versions of Cython also parse type annotations, which can lead to speedups just like mypyc

raven ridge
quiet crane
feral island
quiet crane
#

I thought generators were recommended over creating lists. Is this only the case sometimes?

feral island
#

they'll definitely save you memory and create more concise code

rose schooner
feral island
#

if your argument is that the generator is faster, you should benchmark first before making the argument
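
A quick way to run that benchmark yourself. This is a sketch in the spirit of the script from the linked issue, not a copy of it; the list size, the `e % 2` filter and the repeat count are arbitrary choices, and the numbers will vary by machine and Python version.

```python
from timeit import timeit

big_list = list(range(10_000))


def count_with_generator():
    # Streams one item at a time into sum(); no intermediate list.
    return sum(1 for e in big_list if e % 2)


def count_with_list():
    # Builds the filtered list first, then asks for its length.
    return len([e for e in big_list if e % 2])


generator_time = timeit(count_with_generator, number=200)
list_time = timeit(count_with_list, number=200)
print(f"generator: {generator_time:.3f}s  list: {list_time:.3f}s")
```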

rose schooner
quiet crane
#

Ok, so for the lint suggestion to ruff in the above link, is there one of the forms to always prefer?

#

And there is no case that can win both memory efficiency and speed? I mean, this is just counting 🤔

feral island
#

write a for loop 😄

quiet crane
#

🙃

feral island
#

there's also been talk in CPython of inlining the generator execution with functions like any() and sum(), that would likely make the generator win on speed

#

Interestingly if I add this to @brisk jay 's script in that issue it's even slower

print(
    "sum generator bools",
    timeit(lambda: sum(e % 2 for e in big_list), number=10_000),
)
#

likely because sum() has a fast path for summing just ints

rose schooner
#

those are just ints though

#

so shouldn't it be faster?

feral island
#

bool is a subclass of int but the fast path is likely only for exact ints

rose schooner
#

e % 2?

feral island
#

since subclasses of ints could be doing funky business in __add__

#

oh you're right

#

oh the issue is I'm summing more things

rose schooner
#

oh

#

that might be it actually

feral island
#

5000 ones and 5000 zeroes instead of just 5000 ones
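
Spelled out with lists so the sizes are visible (a sketch; the list size matches the 5000/5000 split mentioned above but is otherwise an assumption about the script):

```python
big_list = list(range(10_000))

# Unfiltered: sum() receives one value per element (5000 ones, 5000 zeroes).
unfiltered = [e % 2 for e in big_list]

# Filtered: sum() receives only the 5000 ones.
filtered = [1 for e in big_list if e % 2]

same_total = sum(unfiltered) == sum(filtered)
print(len(unfiltered), len(filtered), same_total)

# e % 2 on ints yields plain ints, not bools, even though bool subclasses int.
remainder_type = type(big_list[1] % 2)
print(remainder_type, issubclass(bool, int), type(True) is int)
```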

rose schooner
#

branchless somehow loses out over branchful in python :p

#

benefits(?) of an object-only language

quiet crane
#

Maybe we can crosslink the ruff issue with the cpython issue you mention @jelle? Seems like some testcase/benchmark should perhaps be added?

feral island
feral island
brisk jay
#

That's why in my micro-benchmark in the linked issue above I also included a len-check on an array of booleans, just in case it mattered (it made a micro difference, could just be due to not storing big numbers)

#

Also e % 2 is just a stand-in for a filter function. It cuts the list in half for filtering which I figured should be good enough for a quick sanity check.

quiet crane
feral island
swift imp
#

Like __add__ then if that doesn't work try __radd__ ?

feral island
#

i believe it does some tricks for numerical stability

swift imp
#

Oh wow

raven ridge
#

Huh, indeed:

Changed in version 3.12: Summation of floats switched to an algorithm that gives higher accuracy and better commutativity on most builds.

#

TIL. I knew about math.fsum but not that sum gained some of those smarts
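
A small sketch of what that changelog entry is about. The hand-written loop shows the rounding error that plain left-to-right addition accumulates, and `math.fsum` shows the correctly rounded answer; what the builtin `sum` prints depends on the Python version, so no result is claimed for it here.

```python
import math

values = [0.1] * 10

# Naive left-to-right addition accumulates rounding error.
naive_total = 0.0
for value in values:
    naive_total += value

exact_total = math.fsum(values)  # correctly rounded result
builtin_total = sum(values)      # depends on the Python version

print(naive_total, exact_total, builtin_total)
```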

glass mulch
#

Got a special JIT build segfaulting where it wasn't before. Bisected it to https://github.com/python/cpython/pull/136307. Does anything in that PR seem suspicious? The diff I apply to get a special build is (and it works without this patch):

index 8b7f12bf03d..329b1f615e3 100644
--- a/Include/internal/pycore_optimizer.h
+++ b/Include/internal/pycore_optimizer.h
@@ -116,12 +116,12 @@ PyAPI_FUNC(void) _Py_Executors_InvalidateCold(PyInterpreterState *interp);

 // Used as the threshold to trigger executor invalidation when
 // trace_run_counter is greater than this value.
-#define JIT_CLEANUP_THRESHOLD 100000
+#define JIT_CLEANUP_THRESHOLD 10000

 // This is the length of the trace we project initially.
 #define UOP_MAX_TRACE_LENGTH 800

-#define TRACE_STACK_SIZE 5
+#define TRACE_STACK_SIZE 10

 int _Py_uop_analyze_and_optimize(_PyInterpreterFrame *frame,
     _PyUOpInstruction *trace, int trace_len, int curr_stackentries,
@@ -152,7 +152,7 @@ static inline uint16_t uop_get_error_target(const _PyUOpInstruction *inst)
 }

 // Holds locals, stack, locals, stack ... co_consts (in that order)
-#define MAX_ABSTRACT_INTERP_SIZE 4096
+#define MAX_ABSTRACT_INTERP_SIZE 8192

 #define TY_ARENA_SIZE (UOP_MAX_TRACE_LENGTH * 5)

@@ -163,7 +163,7 @@ static inline uint16_t uop_get_error_target(const _PyUOpInstruction *inst)
 // progress (and inserting a new ENTER_EXECUTOR instruction). In practice, this
 // is the "maximum amount of polymorphism" that an isolated trace tree can
 // handle before rejoining the rest of the program.
-#define MAX_CHAIN_DEPTH 4
+#define MAX_CHAIN_DEPTH 16

 /* Symbols */
 /* See explanation in optimizer_symbols.c */
diff --git a/Python/optimizer.c b/Python/optimizer.c
index 8d01d605ef4..33c00169a60 100644
--- a/Python/optimizer.c
+++ b/Python/optimizer.c
@@ -456,7 +456,7 @@ BRANCH_TO_GUARD[4][2] = {


 #define CONFIDENCE_RANGE 1000
-#define CONFIDENCE_CUTOFF 333
+#define CONFIDENCE_CUTOFF 100

 #ifdef Py_DEBUG
 #define DPRINTF(level, ...) \

glass mulch
#

I found the related change, but it makes no sense. It's just adding an identifier to Include/internal/pycore_global_strings.h and respective generated files.

round path
glass mulch
#

Turns out this diff is enough to segfault main:

index 454c8dde031..c49652adc27 100644
--- a/Include/internal/pycore_backoff.h
+++ b/Include/internal/pycore_backoff.h
@@ -99,7 +99,7 @@ backoff_counter_triggers(_Py_BackoffCounter counter)
 // Must be larger than ADAPTIVE_COOLDOWN_VALUE, otherwise when JIT code is
 // invalidated we may construct a new trace before the bytecode has properly
 // re-specialized:
-#define JUMP_BACKWARD_INITIAL_VALUE 4095
+#define JUMP_BACKWARD_INITIAL_VALUE 344
 #define JUMP_BACKWARD_INITIAL_BACKOFF 12
 static inline _Py_BackoffCounter
 initial_jump_backoff_counter(void)

If we define JUMP_BACKWARD_INITIAL_VALUE to 345 instead of 344, no more segfault.

glass mulch
#

False alarm, 345 won't segfault during compilation, but will on normal use :/

#

True boundary numbers seem to be 703 for no crash, 702 for crash.

#

Adding a bunch of new identifiers doesn't change that number 🤔

glass mulch
#

Trying to get to the REPL is what makes the crashy build crash. python -m _pyrepl will crash, python -m random or python -m http.server will run to completion. 🤔 🤔

glass mulch
glass mulch
#

Crash happens on unpatched main too, patching just makes it happen faster.

boreal umbra
#

@atomic siren your message was removed for asking for a job. Please read the #rules

quiet crane
#

Does a hashable object need to have a "correct" __eq__ implemented?
In CPython, dict lookup seems to short-circuit using is. Is this specified in the Python language or an implementation detail? Is CPython's behavior OK? Why?

quick snow
#

The only required property is that objects which compare equal have the same hash value.
If a class does not define an __eq__() method it should not define a __hash__() operation either.

#

I think having equal objects hash to different values just results in UB, anything CPython does is fair game then.

quiet crane
#

I have objects with the hash function set to id() and __eq__ raising TypeError at the moment

#

But I'm not confident with this code. 😅

#

And the short-circuiting in dict really surprised me
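
A minimal sketch of the behaviour being asked about, assuming CPython. The NoCompare class is hypothetical and only mirrors the setup described above (hash from id(), __eq__ raising TypeError). The language reference describes membership tests on built-in containers in terms of `x is e or x == e`, so the identity shortcut is at least partly specified, but treat the exact order of checks as CPython's choice.

```python
class NoCompare:
    """Hypothetical class: hashable by identity, but == is forbidden."""

    def __hash__(self):
        return id(self)

    def __eq__(self, other):
        raise TypeError("NoCompare objects cannot be compared")


a = NoCompare()
cache = {a: "cached result"}

# Same object as the stored key: the identity check succeeds first,
# so __eq__ is never called during the lookup.
found = cache[a]

# An explicit comparison still goes through __eq__ and raises.
try:
    a == a
    compared = True
except TypeError:
    compared = False
```

The lookup only reaches __eq__ when two distinct objects share a hash value, which is why the raising __eq__ rarely fires for id()-based hashes.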

raven agate
#

The requirement is that objects that compare equal should return the same hash value. This implies that they need an __eq__ method, otherwise you cannot compare them and determine whether or not they are equal. It also means that two objects that are different (but equal) need a __hash__ method that returns the same value. Using id() as a hash value will return different ids for different objects and so the requirement will not be met.

clear hill
#

I just noticed 3.14 rc2 comes out aug 26th, but 25/8 is a better pi approximation 😦

#

smh

static hinge
#

you mean 22/7?

sturdy timber
uneven raptor
clear hill
#

22/7 is even better but august only has so many options...

#

in other news: neat! CPython is getting a built-in sampling profiler apparently. https://discuss.python.org/t/pep-799-a-dedicated-profilers-package-for-organizing-python-profiling-tool/100898

static hinge
#

!pep 800 is also neat

fallen slateBOT
ruby elm
#

I've recently run across this while working on a Ruff issue, and I was wondering if anyone here has some insight:
Is the interaction between f-string format specs and whether or not the string is raw documented anywhere?

What I mean is f"{A():\xFF}" vs rf"{A():\xFF}"
It looks like this changed in 3.12 with PEP 701. On 3.11 and prior, f"{A():\xFF}" is not equal to rf"{A():\xFF}", the rawness does affect the format spec (ÿ vs \xFF). On 3.12 and later, f"{A():\xFF}" is the same as rf"{A():\xFF}", both giving ÿ.
I've looked everywhere I can think of in the docs, as well as PEP 498 and PEP 701, but I can't find anything talking about this edge case. I also could not find any issues about it on the CPython github.

PS ~>uvx python@3.11 -c 'A=type("",(),{"__format__":lambda _,f:f});print(f"{A():\xFF}"==rf"{A():\xFF}")'
False
PS ~>uvx python@3.12 -c 'A=type("",(),{"__format__":lambda _,f:f});print(f"{A():\xFF}"==rf"{A():\xFF}")'
True
feral island
#

probably worth opening an issue on cpython

ruby elm
lusty scroll
#

if I wanted to download the CPython source and then check out the tag for 3.13.5, that wouldn't be difficult to do, would it?

clear hill
#

no, you can also use pyenv to automate all that: pyenv install 3.13.5

#

if you want a build with your checkout anyway

lusty scroll
#

that worked great thank you

#

sorry if this has been answered but I've been looking around the code and I can't figure out where different types are added as virtual base classes

#

this would be helpful for knowing how to make instanceof checks

feral island
lusty scroll
feral island
#

That's done with .register calls, I think they're mostly in _collections_abc.py. But classes can also present as subclasses of these ABCs through the __subclasshook__ which generally just checks for the existence of some methods.
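
A small sketch of the two mechanisms mentioned above, using only collections.abc. The Box and Ring classes are hypothetical. It illustrates observable behaviour and is not a tour of where CPython makes its own .register calls.

```python
from collections.abc import Sequence, Sized


class Box:
    """Hypothetical class that only defines __len__."""

    def __len__(self):
        return 0


class Ring:
    """Hypothetical class with no relevant methods or bases."""


# Sized relies on __subclasshook__: having __len__ is enough.
box_is_sized = isinstance(Box(), Sized)

# tuple is a virtual subclass of Sequence: issubclass says yes,
# but Sequence does not appear among tuple's real base classes.
tuple_is_sequence = issubclass(tuple, Sequence)
sequence_in_mro = Sequence in tuple.__mro__

# Explicit registration turns Ring into a virtual subclass as well.
before = issubclass(Ring, Sequence)
Sequence.register(Ring)
after = issubclass(Ring, Sequence)
```

Registration only affects isinstance and issubclass; Ring gains none of Sequence's mixin methods.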

lusty scroll
#

The source code says:

Note that the new implementation hides internal registry and caches,
previously accessible via private attributes _abc_registry,
_abc_cache, and _abc_negative_cache. There are three debugging
helper methods that can be used instead _dump_registry,
_abc_registry_clear, and _abc_caches_clear.
That was in NEWS.txt

#

however I've been unable to find any actual usage of _dump_registry searching the codebase and github (I don't have particularly great technique for searching, especially github)

feral island
#

NEWS.txt might describe the original implementation, not necessarily how things work now

lusty scroll
#

I thought maybe if I read all the type hint related PEPs in order that might help some

feral island
#

ABCs were not originally meant for type hints, and are still a somewhat independent system

#

Reading PEPS can be helpful but their text doesn't necessarily reflect current behavior exactly

lusty scroll
#

ah, ok, yeah they seem a bit like a shadow type system but they're fully incorporated to instance of one way or another, it seems to me

#

so it seemed to at some point get added into runtime behavior, maybe for type checkers?

feral island
#

What is "it" here?

lusty scroll
#

which one? haha

#

the runtime recognition of how the typing helper classes get incorporated in I guess

#

I'm trying to get an example

feral island
#

The runtime behavior came first

#

Type checkers don't see that; they know that e.g. list is a Sequence because in typeshed we pretend it's an actual base class

fallen slateBOT
#

stdlib/builtins.pyi line 1084

class list(MutableSequence[_T]):
lusty scroll
#

actually, it looks like Any cannot be used with isinstance 😄

#

I'm no longer sure, the only thing that passed any of my checks was object

#

but I'm reading your GitHub ref, thank you

#

strange

#
>>> l = list
>>> l.__subclasses__()
[<class '_frozen_importlib._List'>, <class 'functools._HashedSeq'>]
>>> l[0].__subclasses__()
[<class '_frozen_importlib._List'>, <class 'functools._HashedSeq'>]
>>> l[1].__subclasses__()
[<class '_frozen_importlib._List'>, <class 'functools._HashedSeq'>]
#

I think I messed it up, one moment

#

ok maybe not, I just still get those two

#

I'm probably missing something

feral island
#

Note that gives you subclasses, not base classes

lusty scroll
#

maybe it wasn't list but something else fairly simplistic

feral island
#

In a fresh 3.15 REPL I get

>>> list.__subclasses__()
[<class '_frozen_importlib._List'>, <class 'traceback.StackSummary'>]
>>> import email.message
>>> list.__subclasses__()
[<class '_frozen_importlib._List'>, <class 'traceback.StackSummary'>, <class 'email.header._Accumulator'>]
lusty scroll
#

oh ok, well, at least I'm not crazy

feral island
#

So yes, importing email gives you another subclass of list

lusty scroll
#

how does one use abc.ABCMeta._dump_registry() ?

#

everything I've tried has said that the class has no member _abc_meta

feral island
# lusty scroll how does one use `abc.ABCMeta._dump_registry()` ?
>>> collections.abc.Sequence._dump_registry()
Class: collections.abc.Sequence
Inv. counter: 25
_abc_registry: {<weakref at 0x102f56040; to 'type' at 0x102945418 (memoryview)>, <weakref at 0x102f55e80; to 'type' at 0x10294ac48 (tuple)>, <weakref at 0x102f55fd0; to 'type' at 0x102948e50 (range)>, <weakref at 0x102f55ef0; to 'type' at 0x10294ec60 (str)>, <weakref at 0x102f55f60; to 'type' at 0x1029316b8 (bytes)>}
_abc_cache: set()
_abc_negative_cache: {<weakref at 0x102f55da0; to 'type' at 0x102945418 (memoryview)>}
_abc_negative_cache_version: 12
lusty scroll
#

ohh it's a method on one of the virtual bases

#

that makes sense, thanks

quiet crane
raven agate
# quiet crane So objects that I want to forbid comparison for, can/should never be hashable?

If you do not provide an __eq__ method, then objects will be compared using their "id". This is a built in default. If you want to use an object as a key in a dict, then you will need to provide both an __eq__ method and a __hash__ method. It is up to you how they work. If you want to "forbid" (your word) an object from being a key, then you can have your __hash__ method return NotImplemented.
It is unclear, at this point, what your objective may be. If you want to ensure that your object can never be compared and found equal, then write your __eq__ method so that it always returns False.
If you need further clarification, you will need to state what you are trying to achieve.
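
For reference, the data model documents setting `__hash__ = None` as the way to mark instances unhashable, and a class that defines __eq__ without __hash__ gets that setting implicitly. A minimal sketch with hypothetical class names:

```python
class Node:
    """Hypothetical class whose instances must not be dict keys or set members."""

    __hash__ = None


class OnlyEq:
    """Hypothetical class that defines __eq__ but no __hash__."""

    def __eq__(self, other):
        return NotImplemented


# Using a Node as a key fails with TypeError.
try:
    {Node(): 1}
    node_hashable = True
except TypeError:
    node_hashable = False

# Defining __eq__ alone sets __hash__ to None on the class.
only_eq_hash = OnlyEq.__hash__
```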

quiet crane
# raven agate If you do not provide an `__eq__` method, then objects will be compared using th...

I want to prevent two objects of my class from being compared with the equality operator, like a == b. So I did

    raise TypeError("..")

I did this because there is no logical reason to compare them, and there have already been mistaken comparisons of these objects where there is an actual risk of incorrect logic.

However, some of the use cases are as keys in memoize caches. If I just remove the memoization, I see a 3x runtime increase (e.g. 20 seconds -> 1 minute)

#

This is old messy code I'm looking at. Trying to figure out what is worth doing, and how.

raven agate
quiet crane
#

(if they are used as keys: as long as all hash conflicts are due to the objects being the same object, the id() check will short-circuit and never call __eq__. But I believe this is an implementation detail of CPython as I haven't found any documentation about it. Also I don't want to rely on this behavior anyway 😅)

quiet crane
quiet crane
raven agate
quiet crane
#

You can with id(). It's not nice, but code "works out". I don't want to do it but all of these questions spurred from lack of detailed knowledge about dict.

And getting the last puzzle pieces was difficult due to dict short-circuiting with an is (id()) check.

#

The problem is I won't forbid comparing these objects, because there is too much code relying on comparing them in different places of the code. Some are straight up comparisons and easy to deal with. Some are as keys in memoize caches or as items in sets - not easy to deal with.

raven agate
#

It would be interesting to know WHY you do not want two objects to be able to be compared. In the meantime I would try something like this:

def __eq__(self, other):
    # Raise (not return) the exception so that == actually fails.
    raise TypeError(f'Cannot compare objects of type {type(self)}')
faint river
#

is pip shared between python versions on windows?

#

I guess my real question is, if so, is the cache per-version or shared?

quiet crane
#

The underlying code that a lot of other stuff is implemented on top of is very badly designed.

It's a graph structure (mostly a tree with some sibling pointers). Each node is constructed lazily. And (very!) unfortunately a new object might be constructed for a node that has already been constructed. Suddenly we have two objects representing the same node. Should they compare equal or not? I want to forbid comparison to force the user code to understand this issue.

The ideal thing would be to fix the underlying structure to never have this situation appear.

raven agate
quiet crane
boreal umbra
#

@knotty pond your message was removed for surveying, which is not allowed.

rain trellis
#

Not sure where else to float this - but the scoping of sys.remote_exec seems a little confusing...

#

The docs say that the injected code is "executed by the target processes' main thread", which I (maybe wrongly) assumed would mean in the main thread's global namespace

#

Maybe a clarification in the docs? Or maybe I just need a worked example of importing and interacting with existing objects

#

Anyway, seemed minor to open an issue about and couldn't figure out where this might go on Discuss, so just tossing it out here

raven ridge
raven ridge
rain trellis
#

Thank you! Just looking to examine/modify variables in existing/imported modules at runtime:

#foo.py - run a loop
import time
x = 0
while True:
    x += 1
    print(x)
    time.sleep(1)
# inject_me.py
x = -9999
# main.py
import sys
sys.remote_exec('pid-of-foo.py', 'inject_me.py')
boreal umbra
#

I wish we called it "module scope" and not "global scope"

raven ridge
#

if you'd instead done import foo, you'd want to instead do

import foo
foo.x = -9999

rain trellis
#

Ahhhh thank you very much
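
A local analogy for the scoping point, assuming (as the exchange above suggests) that the injected script runs with its own namespace rather than the target's module globals. It uses plain exec instead of sys.remote_exec so it can run anywhere, and the module name foo_demo is hypothetical.

```python
import sys
import types

# Stand-in for a module that the target process has already imported.
foo_demo = types.ModuleType("foo_demo")
foo_demo.x = 0
sys.modules["foo_demo"] = foo_demo

# Script A: a bare assignment only binds a name in the script's own namespace.
exec("x = -9999", {"__name__": "injected"})
after_bare_assignment = foo_demo.x

# Script B: import the module and assign to the attribute on it.
exec("import foo_demo\nfoo_demo.x = -9999", {"__name__": "injected"})
after_module_assignment = foo_demo.x
```

The import in script B returns the module object already cached in sys.modules, which is why the assignment is visible to the rest of the program.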

boreal umbra
lusty scroll
#

python-gdb.py is super useful. is there a way that people use to get it automatically sourced every time they debug python?

clear hill
#

somehow python 3.14 doesn’t catch a ValueError that 3.13 does

raven ridge
#

hm. very foggy guess... what do you see if you do:

import pandas as pd
obj1 = pd.DataFrame({'0': [1, 1, 1, 1], '1': [1, 1, 1, 1]})
obj2 = pd.DataFrame({'0': [1, 1, 1, 1], '1': [1, 1, 1, 1]})
try:
    obj1 and obj2
except BaseException as exc:
    print(f"{exc = }")
    print(f"{type(exc) = }")
    print(f"{type(exc) is ValueError = }")

clear hill
#

oh i think I’m hitting the bug that needed a bytecode tweak for rc2…

raven ridge
#

ah, indeed, it does look like that

clear hill
#
Hudson River Trading

At HRT, we’ve found that centralizing our codebase facilitates cross-team collaboration and rapid deployment of new projects. Therefore, the majority of our software development takes place in a monorepo, and our Python ecosystem is set up such that internal modules are importable everywhere. Unfortunately, the convenience of this arrangement ...

static hinge
#

Great name

grave jolt
#

!pep 802

fallen slateBOT
grave jolt
#

new pep just dropped

mint cove
#

{/}

uneven raptor
#

people seem to really dislike it, unfortunately

mint cove
#

I wrote the PEP as I’m sympathetic to the idea and I think it’s worth exploring how/if we can resolve the syntax hole for sets, which have sometimes felt a little unloved by the core language. Unsurprisingly, people have lots of thoughts!

grave jolt
#

_ _ _ _ _ _ _ _ _ _ _ _
| 🚲🚲🚲🚲 🏍️ |

mint cove
grave jolt
#

it's a bike shed

mint cove
#

Of course, foolish of me

grave jolt
#

should I have coloured it differently to make it clearer?

mint cove
#

I did think the Tour de France finished several weeks ago

grave jolt
#

What is the empty frozenset literal going to look like? f{/}?

grave jolt
#

that's pretty unique syntax

mint cove
# grave jolt should I have coloured it differently to make it clearer?

Ideally, of course, the shed would be a time machine and/or magic wand and we would ‘fix’ all the choices from the past. The argument has been made that the key distinction of dictionaries is the presence of a colon (:), so the notation ought to be {:} for empty maps and {} for empty sets. This, though, will never happen.

#

(At least in the Python language; it’s fun to design your own language when you’ve anywhere between half an hour and ten years of downtime)

mint cove
grave jolt
#

a time machine is already built into the shed

mint cove
#

Fabulous, now we can properly focus on colour schemes

boreal umbra
feral island
#

write the PEP

clear hill
#

i’ll join you on the frozendict PEP - or dict.freeze()

swift imp
#

Or frozenmap never made it to the steering council

jade raven
feral island
#

the frozendict PEP is stuck

#

(or was it frozenmap? I forgot)

uneven raptor
#

there’s both apparently, PEP 416 (frozendict, rejected) and PEP 603 (collections.frozenmap, draft)

mint cove
topaz dock
#

Hi I can sell my last script in python dm me for more information !!

uneven raptor
mint cove
#

Conceptually?

feral island
#

Lots of people use it

mint cove
#

I use loads of frozen sets

feral island
#

Always hard to make statements like this though, it really depends on what kinds of Python code you see

clear hill
#

for multithreaded applications, if you know it’s frozen the thread safety story is trivial

mint cove
#

Useful to have the guarantee it won’t change (hashability is nice but I don’t really use it)

uneven raptor
#

yeah, i'm generalizing. i've personally never seen a frozenset in a codebase, other than CPython

feral island
#

As far as I'm concerned nobody uses complex or Decimal either. But I'm sure there are domains where they are useful and common.

clear hill
#

i’ve used frozendicts fwiw

#

dict.freeze is appealing to me too, if it returns a frozen dict

mint cove
clear hill
#

but that doesn’t actually freeze the underlying dict though

uneven raptor
#

wouldn't it make more sense to just revive PEP 351 if freezability is the desire

clear hill
#

there’s also the project verona folks and the deep immutability stuff too, but that was received even less warmly than {\}

mint cove
#

This is HAMT vs ‘immutable’ though — I think the argument is that the former has certain nice properties

clear hill
#

lol one downside of {\} is you need to escape the backslash to type it in markdown

mint cove
#

You can use it today from one of the _testinternalcapi modules

clear hill
#

there’s also rpds-py on pypi

#

ah sorry

mint cove
#

Backslash isn’t currently a legal token in Python, forwards slash is

#

PEP 803: backslashes!

mint cove
uneven raptor
#

my frustration with frozensets is that they have to wrap another container. it's like if you had to do tuple([...]) to make a tuple. still useful, just generally not worth the effort.

#

that, and we've kind of slept on frozensets internally. the empty frozenset() is not an immortal singleton, unlike all other immutable types

glass mulch
#

Oh boy, a 1MB MRE isn't quite ideal

clear hill
#

TIL warnings.warn grew a skip_file_prefixes kwarg that is strictly better than the old stacklevel kwarg

static hinge
#

A contextmanager would be better in case of wrapper libraries

rain trellis
#

Another sys.remote_exec question - is it possible to get remote execution to happen 'sooner' while the program is awaiting asyncio.sleep() or a select?

I.e. if i have a script running:

import asyncio
import os

async def _a():
    # Print own PID and a counter for convenience
    x = 0
    while True:
        x += 1
        print(f"{os.getpid(): <8} {x: > 8}")
        await asyncio.sleep(10)

def a():
    asyncio.run(_a())

if __name__ == "__main__":
    a()

and I do sys.remote_exec on that process, is there a way to avoid waiting for the end of the asyncio.sleep for the remote execution to happen?

raven ridge
raven ridge
#

if you control the code of the process you're attaching to, you can do tricky things like register a no-op signal handler using loop.add_signal_handler, though, and then send that signal after sys.remote_exec returns to trigger the event loop to run the signal handler, which will cause the eval loop to run, which will cause the script injected by sys.remote_exec to run
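
A rough, Unix-only sketch of the no-op signal handler trick described above. The choice of SIGUSR1 and the wakeups list are assumptions for illustration, and the signal is sent from inside the same process here so the example is self-contained; in the real scenario another process would send it after sys.remote_exec returns.

```python
import asyncio
import os
import signal

wakeups = []  # only here so the demo can show the handler ran


def _noop_wakeup():
    # The handler does no real work; its only purpose is to make the
    # event loop wake up and run Python code again.
    wakeups.append("woke")


async def main():
    loop = asyncio.get_running_loop()
    loop.add_signal_handler(signal.SIGUSR1, _noop_wakeup)
    # Simulate an external sender delivering the signal mid-sleep.
    loop.call_later(0.05, os.kill, os.getpid(), signal.SIGUSR1)
    await asyncio.sleep(0.2)
    loop.remove_signal_handler(signal.SIGUSR1)


asyncio.run(main())
```

loop.add_signal_handler must be called from the main thread, and it is not available on Windows event loops.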

#

though I imagine that if you controlled the code of the process you're attaching to, you wouldn't be using sys.remote_exec in the first place, heh

rain trellis
raven ridge
#

yeah, that's definitely true, but it's not the approach I'd take. I'd instead set up a completely separate side channel with your first remote_exec, and then use that side channel for every future command you want to run

#

that's what we do in PDB's new remote mode, for instance - the PDB client spawns a TCP server, then runs a sys.remote_exec command telling the remote process being attached to that it should establish a connection to that server and accept PDB commands that come over it

rain trellis
#

Oh yeah I've seen - I've been into the guts of _PdbClient and _PdbServer this week. But you still have the same issue with debugging async apps - if the remote app is waiting on a selector (or just generally awaiting), you don't get a response back from the Server until the selector times out/returns.

#

Even after pdb is attached

raven ridge
#

well that's just down to the particular side-channel you're using

#

like, if you inject a separate thread, then nothing that's happening in the event loop thread will cause your thread to stop (short of acquiring the GIL and never releasing it, at least)

rain trellis
#

Huh, that's an interesting thought... and from a thread in the same process, I think it should be easy to throw signals at the main thread to spin the eval loop, with something like pthread_kill - I was running into some issues with a Python distributable that didn't seem to build in pidfd_send_signal

raven ridge
#

pidfd is still relatively new, there's definitely platforms out in the wild that don't support it yet

#

but anyway, the best overall structure is definitely using sys.remote_exec for one-time setup of some sort of side channel, and then using that side channel from there on out

rain trellis
#

Makes sense to me!

#

I may come back with some more questions - I’m working on essentially a TUI wrapper for remote pdb to add some quality of life features. I’m trying to keep it compatible with stdlib pdb so that existing processes can be attached to, so long as they’re running the same Python version

raven ridge
#

I'm the person who contributed remote pdb, so you can feel free to ping me here if you've got more questions 🙂

real elm
#

Yo can

#

Some one help me to learn faster python

swift imp
#

Oh

#

Wow okay, just read the pep. NEVER MIND

raven ridge
#

Pablo and Ivona (who gave the PyCon presentation together) are both teammates of mine. PyCon would only allow a max of two presenters per talk, and I don't enjoy giving talks anywhere near as much as Pablo does 😄

#

the 3 of us worked together on PEP 768 and the C portion of sys.remote_exec, but remote PDB in particular was almost entirely me. Pablo did an early prototype of it, but the design that he had couldn't handle a lot of PDB's features, and I don't think a single line of it survived to the final version

swift imp
raven ridge
#

that early prototype was just done using 2 FIFOs, having everything typed in the client be sent directly to the remote process, and everything written by the remote process be printed by the client. That's simple and elegant, but it has lots of subtle problems - how does tab completion work? how does line editing work? how does the remote know whether syntax highlighting should be enabled? how does the remote know how wide the client's terminal is and where to wrap listings?

#

really, the inability to square that with tab completion is what killed it. Lots of other things could be papered over, but the approach where the client sends complete lines to the server over a fifo is just totally incompatible with tab completion

swift imp
#

Ah

#

pdb has been getting a lot of love lately

#

remote access, and Tian fixed a long standing issue from reading commands from a .pdbrc that I just love

#

I still don't know how one develops pdb, I tried and it was difficult to walk through the code with pdb. Like debugging pdb with itself

raven ridge
#

print()

swift imp
#

Yeah I've developed a habit of never print debugging bc I could always use pdb

#

and it felt like I was doing something wrong

#

it's the one time it's appropriate I suppose

raven ridge
#

!otn a who debugs the debuggers

fallen slateBOT
#

:ok_hand: Added who-debugs-the-debuggers to the names list.

raven ridge
#

I have no idea how @spark magnet manages to develop coverage.py, honestly. The few times I've tried to make changes there, I've immediately gotten frustrated with the inability to use PDB and coverage together. I think he's said that he has tons of conditionally enabled print statements just left in the library for debugging things

#

debugging anything that uses sys.settrace is just a nightmare

swift imp
#

I do suggest people use the new .pdbrc commands feature though for programmatic debugging

#

Its nice

spark magnet
#

and btw, it's a pain in the ass debugging coverage problems sometimes

raven ridge
#

and you say that as someone who knows how things work, heh. it is super frustrating to need to debug unfamiliar code and be unable to use a debugger

spark magnet
spark magnet
raven ridge
raven ridge
#

oh thank god 😆

#

pytest-cov has a .pth file - is that what's changing the behavior, I wonder?

#

I wonder if the failure still reproduces if you mv $VIRTUAL_ENV/lib/python*/site-packages/pytest-cov.pth{,.bak}

spark magnet
#

i want to talk about this, but am on a work thing

raven ridge
#

well, I should be too, and you nerd sniped me 😆

#

chat later, then!

paper echo
# cursive lily Hello from the future. What problems did snakemake cause for you? Would you reco...

I don't remember why I thought it sucked, but I felt like it solved a problem that I didn't have, and didn't solve the problem that I had. I didn't need or want a special syntax with Python source blocks in it, basically just swapping out shell for Python in a Makefile. I wanted (and still want) a general-purpose DAG-based task runner for which "success" doesn't have to mean emitting a file to disk. Imagine Airflow but without a server and scheduler and all that: just operators, DAGs, runs thereof, and a database to track state (maybe just a directory of files like Git). That tool still doesn't exist to my knowledge in 2025. But Dbt + DVC can get pretty close.

#

Snakemake is fine if all you want is "Make but for data processing in Python"

spark magnet
spark magnet
#

and this is one of those problems that debuggers aren't good at: processes forking, starting, stopping, and the interactions that happen at those times.

#

could I get a debugger to step into a .pth file? i wouldn't try.

merry venture
raven ridge
#

I can't tell: is it proposed to change the behavior of collections.namedtuple as well, or is the proposal to only change typing.NamedTuple?

#

I have definitely seen code in the wild - and probably written some myself - that does

class MyClass(collections.namedtuple("_MyClass", "foo bar baz")):
    """
    Some docstring
    """

#

the docs for collections.namedtuple even give an example of doing this

swift imp
#

I feel like people go pretty far to be able to attach docstrings everywhere that python should make better support for them.

Between class variables, module constants, and that namedtuple example, makes me think there should be better support.

grave jolt
#

like what Rust has

boreal umbra
#

I was in a conversation with nedbat and Guido at pycon two years ago. Guido said that docstrings were supposed to be for maintainers of code and that they're now overused.

Though "docstrings are for maintainers" seems to conflict with the help function being a thing.

grave jolt
#

It's common for libraries to generate autodocumentation from docstrings

#

It's also very convenient to read the documentation for the thing I'm using in my editor, either by hovering or going-to-definition.

#

Otherwise I'd have to open a search engine, find the documentation site for the library I'm using, and find the thing on that website

spark magnet
swift imp
spark magnet
swift imp
#

Which I think makes sense given u can strip them out to save memory

#

Not that I have any reason to not believe guido

clear hill
boreal umbra
# spark magnet we should have done a better job capturing what was discussed

I think it's possible that since he was speaking off the cuff, he was misstating his own opinion when he said "docstrings are for maintainers". I remember people in the conversation were talking about sphinx, and he said he thinks only documenting each module, level, and class individually doesn't actually document how the software is intended to be used holistically, and that he thinks code examples in docstrings are annoying.

merry venture
#

i'll reply to the comments in the thread soon, they're very right to call out some existing & valid use cases. it may not be worth it

glass mulch
spark magnet
obtuse compass
#

Instead of sending screenshots, I can just send this, and anyone can correct me on my code, as I'm a beginner and would love to learn

frigid bison
#

I am writing a library for parsing python bytecode. I resolve the extended args so that it's easier to work with but this also means I have to regenerate them if I want to convert the bytecode back to a bytearray that python can understand. I wrote integration tests that "rewrite" the whole standard lib and check if I have a 1:1 input and output after parsing and writing it back using my lib. During this process I found an interesting edge case in a file in Python 3.13 and I wondered if anyone might have an idea why python generates this bytecode. See below for what I found.

TLDR:
Python 3.13 generates this piece of bytecode:

           1344 EXTENDED_ARG             1
           1346 JUMP_BACKWARD          256 (to 838)
my library outputs this:
           1344 JUMP_BACKWARD          255 (to 838)

It's fairly obvious that these mean exactly the same thing. So my question is: why does Python generate this useless extended arg? It looks a bit like a chicken-and-egg problem: the extended arg is the reason for its own existence. Adding it pushes the oparg past 255, which in turn requires an extended arg. Does anyone know why it would generate such bytecode?
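
The equivalence can be checked with a little arithmetic. This is a sketch under two assumptions that match the listing above: every code unit is 2 bytes, and JUMP_BACKWARD is followed by one inline cache entry in 3.13. The helper names are hypothetical.

```python
CODE_UNIT = 2  # bytes per instruction or cache entry
JUMP_BACKWARD_CACHES = 1  # assumed inline cache entries for 3.13


def combine_oparg(extended_args, low_byte):
    """Fold EXTENDED_ARG prefixes (most significant first) into one oparg."""
    oparg = 0
    for byte in [*extended_args, low_byte]:
        oparg = (oparg << 8) | byte
    return oparg


def jump_backward_target(offset, oparg):
    """Byte offset that a JUMP_BACKWARD located at `offset` lands on."""
    next_instr = offset + CODE_UNIT * (1 + JUMP_BACKWARD_CACHES)
    return next_instr - CODE_UNIT * oparg


# EXTENDED_ARG 1 followed by a low byte of 0 encodes oparg 256.
with_prefix = jump_backward_target(1346, combine_oparg([1], 0))
# Without the prefix the jump sits one code unit earlier, so 255 suffices.
without_prefix = jump_backward_target(1344, combine_oparg([], 255))
```

Both forms land on offset 838: dropping the prefix moves the jump back by one code unit, which shrinks the distance by exactly one.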

feral island
frigid bison
feral island
#

Maybe an off-by-one?

#

Or some combination of optimizations to the bytecode that cause us to generate suboptimal code

frigid bison
#

yeah I think the latter is more likely the case

#

I have seen that happen before while writing this lib

#

it might also be because the jump target (838) is itself a jump with an extended arg

#

so perhaps the way that's calculated afterwards causes this bug

fallen slateBOT
#

Python/assemble.c line 730

/* XXX: This is an awful hack that could hurt performance, but
frigid bison
#

this part right here

#

this comment has also been here for a long time 😅

frigid bison
#

I made my library replicate the bug but it seems like python doesn't do it consistently everywhere so there's no way to get a perfect 1:1 output without preserving the original extended args

verbal comet
#

QR code tool is done

icy sun
#

What do men truly want?

boreal umbra
icy sun
#

My bad g

lusty scroll
verbal comet
fallen slateBOT
#

src/v313/ext_instructions.rs line 617

pub fn to_instructions(&self) -> Instructions {
frigid bison
#

You can run the integration tests to try it and see the error

lusty scroll
frigid bison
#

I'm not sure what you're trying to find/fix though. I think it's an issue with python, not my library

subtle pine
#

Hi does anyone know efficient way to define TabularCPD for the real data which has observed nodes. Doesn't contain latent nodes data. Using pgmpy or pomegranate. Please

clear hill
#

This channel is for discussing internals of CPython, I’d take your question to another channel

faint river
#

how does one go about finding the implementation for operators?

#

I want to see how dunders are called

grave jolt
#

you mean the implementation of an operator for a type, or how binary operators execute in the interpreter?

faint river
#

interpreter

grave jolt
#

I think it's complicated nowadays, with the adaptive interpreter and new opcodes

gilded flare
#

just search the operator itself and maybe it's there

faint river
#

looks like it doesn't have attribute operator

#

.

gilded flare
#

oh, yeah that's a complicated thing

gilded flare
# faint river `.`

main function should be PyObject_GetAttr() in Objects/object.c i think

faint river
gilded flare
gilded flare
faint river
#

not the default impl necessarily

fallen slateBOT
#

Objects/typeobject.c line 10565

/* There are two slot dispatch functions for tp_getattro.
gilded flare
#

tl;dr _Py_slot_tp_getattro() when __getattribute__() is overridden and __getattr__() doesn't exist, _Py_slot_tp_getattr_hook() when a __getattr__() hook is present
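
A small sketch of the Python-level behaviour behind those two slot functions. The class names are made up for illustration: `__getattribute__` runs on every lookup, while `__getattr__` is only consulted as a fallback once normal lookup has failed.

```python
class OnlyGetattribute:
    """Overrides __getattribute__ and defines no __getattr__."""

    def __getattribute__(self, name):
        # Called for every attribute access on the instance.
        return f"intercepted {name}"


class WithGetattrHook:
    """Defines __getattr__, which only runs when normal lookup fails."""

    present = "found normally"

    def __getattr__(self, name):
        return f"fallback for {name}"


print(OnlyGetattribute().anything)
print(WithGetattrHook().present)
print(WithGetattrHook().missing)
```

Which C function ends up in `tp_getattro` is a CPython detail; the observable difference is only the one shown here.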

clear hill
#

err, a new operator, along with a dunder method

grave jolt
#

> All of the material is completely up to date for Python 3.9, the latest version of the Python programming language.

Seems like this page hasn't been updated in ages

clear hill
#

eh, 3.9 isn’t that old

grave jolt
#

That's true, it's not even EOL yet

clear hill
#

updating a book like this is a lot of work! just thought I’d share even though it’s a little old because it fit your ask almost perfectly

grave jolt
#

Yeah the book is perfectly fine

gilded flare
clear hill
#

I doubt this stuff has changed much and even if it has the gist will be right

gilded flare
#

might be fine

opal tapir
#

Hi

#

Hi guys

unkempt rock
#

yo guys

#

I am just a 13 year old kid trying to learn some python

boreal umbra
halcyon skiff
sour thistle
#

<@&831776746206265384>

#

I'm surprised discord automatic filters did not catch that

boreal umbra
#

please don't post the same thing in a bunch of channels. please make sure that all your messages are on-topic for the one that you post in.

winged sphinx
warm agate
wise berry
#

hello everyone, I've been really interested in performance improvements, and I'd love to contribute to the cpython project.

boreal umbra
wintry ferry
#

@fast elbow how do you make a fast game on python ?

dapper lily
merry venture
#
chrome berry
#

hi

meager nacelle
#

Trying to implement __annotate__ for dataclasses' generated __init__ method does make me wish there was an option to get all PEP-649/749 annotations in an unevaluated format.

meager nacelle
#

My understanding of PEP-749 was that VALUE_WITH_FAKE_GLOBALS was added so you could raise a NotImplementedError if you did not support it or another format. However this doesn't appear to work currently, because call_annotate_function calls the function with VALUE_WITH_FAKE_GLOBALS in an environment with fake globals where NotImplementedError doesn't exist and neither does Format.

#

I'm creating an issue for this but it was fairly confusing until I realised what was going on. format == Format.VALUE evaluates as True even though format is Format.VALUE_WITH_FAKE_GLOBALS because Format.VALUE is a _Stringifier so the whole expression is converted to a _Stringifier.
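
A toy illustration of why that comparison looks true. `FakeStringifier` is a hypothetical stand-in written for this example, not the real `annotationlib` class: its `__eq__` returns another placeholder object rather than a bool, and an object with no `__bool__` or `__len__` is truthy.

```python
class FakeStringifier:
    """Toy placeholder: comparisons build a new placeholder instead of a bool."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        other_text = other.text if isinstance(other, FakeStringifier) else repr(other)
        return FakeStringifier(f"{self.text} == {other_text}")


fake_value = FakeStringifier("Format.VALUE")

# int.__eq__ returns NotImplemented here, so Python falls back to the
# reflected FakeStringifier.__eq__, which hands back another placeholder.
result = (2 == fake_value)

print(type(result).__name__)
print(bool(result))  # truthy, whatever the real format number was
```

So an `if format == Format.VALUE:` branch would be taken in such an environment regardless of the value passed in.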

merry venture
#

oh, although that's keyed by filename

#

in natural uses this can't grow bigger than all python files imported i guess... shouldn't be a concern

fallen slateBOT
#

Python/compile.c lines 1047 to 1048

case YIELD_VALUE:
    return 0;
frigid bison
#

why does the stack_effect function in 3.10 say this opcode has no impact on the stack?

#

YIELD_VALUE
Pops TOS and yields it from a generator.

#

It should be -1 no?

grave jolt
#

(through the send method of a generator)

#

If the result of the yield expression is unused, then that value is dropped, similarly to how e.g. the result of a call can be unused

#

!e

import dis

@dis.dis
def f():
    x = yield 42
    yield x
fallen slateBOT
frigid bison
grave jolt
#
gen = f()
print(gen.send(None))  # 42, equivalent to next(gen)
print(gen.send(69))  # 69
print(gen.send(None))  # StopIteration
weak hound
#

I hate unit testing so much

boreal umbra
weak hound
frigid bison
merry venture
#

is there a tool for cpython that figures out what re-setup is needed between any 2 revisions so that when hunting regressions i can only put a repro and make git bisect do the rest, without writing a sophisticated script?

sometimes between revisions there are no changes to the C/Tool part so make doesn't have to be run
sometimes it's only make -j -s, sometimes make regen-cases, sometimes perhaps make regen-all.

it largely depends on which was the previous revision, if there was no cleaning.

#

oh

#

although i can just narrow down the bisection to those revisions that touch certain files...

#

and then assume that a certain degree of recompilation is always needed

#

sometimes i don't know whether i'm hunting a python regression or a cpython regression though. but that should be easy to spot in the first place. it's a "depends case-by-case"

#

think the easiest approach is to identify the area of regression and then filter out the set of bisected revisions to those that can be said need certain steps to full rebuild

#

but there can be changes in between that change other parts as well that next revisions depend upon... i think?

#

oh, right. i don't need make regen-cases, such things are included in commits.

unreal haven
#

Hows python3.13 without Gil for you guys

grave jolt
#

pretty difficult when underwater

static hinge
clear hill
merry venture
#

spawning multiple asyncio event loops

#

cool!

merry venture
#
white nexus
#

> in 3.7, PEP 563 introduced the from __future__ import annotations directive, which turns all annotations into strings. This directive is now considered deprecated and it is expected to be removed in a future version of Python. However, this removal will not happen until after Python 3.13, the last version of Python without deferred evaluation of annotations, reaches its end of life in 2029. In Python 3.14, the behavior of code using from __future__ import annotations is unchanged.

This is directly at odds with the messaging about how __future__ is considered stable and a feature will NEVER be deleted?

#

(please ping reply)

quick snow
# white nexus > in 3.7, PEP 563 introduced the from __future__ import annotations directive, w...

Where do you see that messaging? Features are deleted all the time (just search for "removals" in the "What's new" pages of various Python versions).
It's true that the docs claim that "No feature description will ever be deleted from __future__", but I think that refers to future imports that have become the default never being removed (e.g. generators, with_statement). annotations is a bit of a special case, because it's the only future that has never (and will never) become the default. annotations will (probably) be removed eventually, but that's a long time in the future (the earliest possible date would be autumn 2029). As for the feature itself: it's just not needed anymore, since forward references are now possible without the import.

white nexus
#

Hm

feral island
white nexus
feral island
#

we could make it a no-op eventually instead, can discuss that again in a few years

white nexus
feral island
#

it should generally not in 3.14

white nexus
#

As in, if I have a file with that feature and it no longer has that feature, will that file continue to work

#

I think it depends on if the third parties support it

raven ridge
#

huh. Come to think of it - I wonder if there should be a PYTHON_IGNORE_FROM_FUTURE_IMPORT_ANNOTATIONS environment variable that allows people to test their code with the new semantics without needing to modify the code...

#

that'd let people test proactively to make sure that it won't cause them any problems to drop the import when Python 3.13 hits end of life

white nexus
#

fwiw i'm very happy that inspect.get_annotations was made into an alias

#

this means that we probably support python 3.14 annotations without changes

feral island
raven ridge
#

for turning off futures in general, I don't think there's any need. For the specific case of turning off the one future feature that's going to go away without ever becoming the default, there is a pretty compelling need

#

we're telling people right now that they ought to be able to drop from __future__ import annotations when they drop support for 3.13, but if 3.13 isn't EOL until ~5 years from now, surely we don't want the first time that they test without from __future__ import annotations to be when they try dropping it from their code in 5 years...

#

imagine they find a problem at that point - when 3.14 is already out of bug fix support. Not a great situation to find themselves in...

white nexus
#

or anyone in tbh

#

could be bad for the ecosystem and then we have another abi3 on our hands

raven ridge
#

I guess the one redeeming thing is that string annotations aren't going away, so if someone does find a problem when they remove from __future__ import annotations in 5 years, they can probably fix it by just adding quotes

#

I guess maybe that's enough of a reason to not worry about it

grave jolt
grave jolt
#

Why doesn't IndexError for built-in sequences say what the index and the length were? That seems kinda useful

feral island
white nexus
#

!d IndexError

fallen slateBOT
#

exception IndexError
Raised when a sequence subscript is out of range. (Slice indices are silently truncated to fall in the allowed range; if an index is not an integer, [`TypeError`](https://docs.python.org/3/library/exceptions.html#TypeError) is raised.)
shy slate
#

Is string interning skipped only for dynamically created strings, or also for very large strings? Or is there some other case?

raven ridge
#

I think it only occurs by default for names and module constants

shy slate
#

It is happening for any string i make except when the string is made dynamically

#

I tried large strings, they were interned too, so idk what the limit is or when a string does not get interned

raven ridge
shy slate
raven ridge
#

just module constants

#

like string literals

shy slate
#

x = "hello"
y = "hello"

id(x) == id(y) returns true

raven ridge
#

yep - "hello" is a module constant in that case

shy slate
#

Yeah so a string made inside a function will also be interned right?

raven ridge
#

a string literal will be, yes

#

any string that's a constant in the bytecode

shy slate
#

Yeah so any string literal of any size is interned right? And dynamically generated strings like "h" * 100 isnt

raven ridge
#

yes

shy slate
#

Okay thnx

raven ridge
#

that's all implementation details, though

#

don't write code that relies on it 🙂

#

I don't think there's any guarantee that anything will ever be interned
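
A minimal sketch of the difference, assuming CPython. Only the `sys.intern()` line shows behaviour that is actually documented; whether the two plain strings share an object is an implementation detail, so no result is claimed for it.

```python
import sys

literal = "hello world"
# Built at run time, so it does not come from the code object's constants.
dynamic = " ".join(["hello", "world"])

print(literal == dynamic)  # equal by value
print(literal is dynamic)  # identity here is an implementation detail

# sys.intern() returns the canonical interned copy, so identity then holds.
print(sys.intern(literal) is sys.intern(dynamic))
```

If code genuinely needs identity comparisons on strings, going through `sys.intern()` explicitly is the supported route.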

shy slate
#

Also for integers the range is -5 to 256, was this chosen in light of ascii fitting under this bracket

shy slate
shy slate
raven ridge
#

yep, except then 🙂

raven ridge
#

I doubt it had any relation to ascii, but I'm not sure. I think it was just a small range of commonly used integers

shy slate
#

Hm well 256 is the range for 8 bits or 1 byte so each ascii character can be almost represented with those cached

#

As for -5 it looks like a safe lower hand since mostly just -1 is used for array indexing and all

raven ridge
shy slate
#

🤔 but internally ascii is just gonna convert to numbers so maybe for that?

#

Or is it direct binary conversion

raven ridge
#

those internal numbers won't be Python int objects, they'll be C integers

shy slate
#

Correct

raven ridge
#

so they wouldn't benefit from the small integer cache

#

they'd be an entirely different type of thing that's never cached

shy slate
#

Fair

radiant garden
#

i think moreso anything to do with byte strings
e.g. iteration or indexing yields an int in byte range

shy slate
#

Hm PyLongObject pointers

raven ridge
#

and in Python 2 iterating over a byte string gave byte strings

#

and likewise for indexing a byte string

radiant garden
#

so be it

shy slate
#

Also just one more confirmation

If you do
x = 10
x = 12

Setting x to 12 uses new memory and not the one 10 was in and the 10 is still there in memory where it was and it will be garbage collected

raven ridge
#

because of the small integer cache, it will not be garbage collected

#

that cache owns references to interned copies of -5 through 256

shy slate
#

Yea i mean except for that range

#

Yea yea i was talking about in general except for this range

#

Sorry the x = 10 example was wrong

#

Just consider it x = 2000 and then x = 3000

#

So 3000 will not be stored where 2000 was right? A new allocation will happen and 2000 will be gc'd later right?

raven ridge
#

for some period of time, both things exist

#

it's easier to see this if you think of an object with an explicit constructor, instead of an integer. Imagine you've got:

foo = MyClass()
foo = MyClass()

After the first line executes, there's a MyClass instance in memory, and foo refers to it. Then the second call to MyClass() executes, at which point there are two MyClass instances in memory, one that's referred to by foo and one that's a temporary. Then the foo = part executes, and foo now refers to the new MyClass instance that up until now was a temporary, and now nothing is referring to the original MyClass instance

#

it could never reuse the same memory because, while the second line is in the middle of executing, both objects need to exist simultaneously
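
A small sketch that makes the overlap visible. `Tracker` is a made-up class for this example; it counts live instances and records the count while each constructor runs.

```python
class Tracker:
    live = 0  # instances constructed and not yet finalized

    def __init__(self):
        Tracker.live += 1
        # How many instances exist while this constructor is running.
        self.live_during_init = Tracker.live

    def __del__(self):
        Tracker.live -= 1


foo = Tracker()
foo = Tracker()  # the first instance is still bound to foo while this runs

print(foo.live_during_init)  # 2: both objects existed at the same time
print(Tracker.live)          # what this shows depends on when the first one is finalized
```

On CPython the first instance is normally finalized as soon as `foo` is rebound, thanks to reference counting, but that timing is not something other implementations promise.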

shy slate
#

Correct so a new allocation happens and the old one will be gcd

#

It wont overwrite the same memory

shy slate
raven ridge
#

think about:

foo = [1, 2, 3]
bar = foo
foo = [4, 5]

At the end of that, bar refers to [1, 2, 3] and foo refers to [4, 5]. If the new value was somehow written to the original object, it would cause bar to refer to the wrong thing.

#

also, think about

import random

class MyClass:
    def __init__(self, name):
        self.name = name
        if random.random() < 0.05:
            raise RuntimeError("unlucky")

5% of the time, that constructor fails. If you do

foo = MyClass("name1")
foo = MyClass("name2")

it can't change the .name attribute of the first MyClass instance, because there's still a chance that the constructor will fail, and so the second foo = won't happen, even though the second MyClass() call did happen

gilded flare
#

this is also mostly the reason your "hello" example worked earlier

#

all constants in the same scope are stored in the code object's constant tuple

#

and all code objects, except the module code object, are referenced by a parent code object

gilded flare
#
def f():
    x = "hello"
    x = "bye"
    return x

print(f())  # bye

in basic terms, "hello" isn't GC'ed for the entirety of this code's execution

#

when it is assigned to x, a reference to it simply gets pulled (copied) from the function's code object's constant tuple

#

and when it gets replaced with "bye", its number of references just gets decreased by one

#

until the end of that print() call, when the code is exiting, "hello" (and all other constants within the code) are alive and not GC'ed, since they are stored in the code object and the code object needs to be alive for the code to execute
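
To see this directly, the constants can be read off the function's code object. A quick sketch, assuming CPython; the exact contents and order of `co_consts` vary between versions, so only membership is checked.

```python
def f():
    x = "hello"
    x = "bye"
    return x


consts = f.__code__.co_consts
print(consts)

# Both literals are held by the code object for as long as f exists,
# no matter which one x currently refers to.
print("hello" in consts and "bye" in consts)
```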

gilded flare
shy slate
#

The only thing i wanted to confirm was that new memory is made since obviously we have no guarantee of the old memory being that big to store this new stuff and also what godlygeek said, so memory allocation happens everytime

raven ridge
shy slate
#

But doesnt the allocator automatically look for already allocated free space

feral island
#

things might or might not get reallocated in the same place. There's no guarantees about that and you shouldn't rely on any such behavior

raven ridge
#

and it might not even make it to the allocator. Some dead objects can be reused (the "free lists")

gilded flare
#

unless you mean "gc'd" as in "reference decremented" and not completely destructed/freed

frigid bison
#

I am breaking my head over this. When python 3.11 calculates the max stack size of the following function it calculates a max stack of 8. I don't get how it reaches 8, if I manually calculate it I get 7 which happens at index 110. Can someone check what I'm doing wrong?
https://paste.pythondiscord.com/UDVQ

#

I am recompiling the bytecode and my tool also reaches 7, so I think this is a bug in cpython maybe

#

or I'm just missing something

frigid bison
#

so I think it's a bug they fixed along the way

gilded flare
fallen slateBOT
shy slate
shy slate
shy slate
raven ridge
#

yes. But, depending on the data type, it may not make a request to the allocator at all

#

it may just have some already partially constructed instances of that type sitting around for reuse

#

that's an optimization that the interpreter plays for types that are very frequently constructed and destroyed

shy slate
shy slate
#

does anyone have a fix for this i am using socket.getfqdn and it is returning the ipv6 and hence socket.gethostbyname_ex fails, in other project my db is trying to connect and again this issue is being raised in macos

Mac OS Version: 26
Python Version: 3.13

#

this may not be the correct channel to post this in, but this isnt getting attention anywhere else idk why, not in python-help neither the original cpython issue is getting any attention, its open since 2018

glass mulch
#

The cpython issue has gotten attention in the past, but it's hard to tackle because the issue is not reproducible for the devs that tried to. It might help figuring out whether there's some commonality for the pooled group of people that had the same issue.

compact kiln
#

If anyone is willing to teach me, I need to learn about how all of these work

shy slate
#

Only 1 macbook doesnt have this error which is of my friend from germany

#

Tbh i dont think its hard to find a macbook where this happens

#

Also this may be very wrong but i think its a combination of macbook and your wifi provider which causes this

glass mulch
# shy slate Tbh i dont think its hard to find a macbook where this happens

That's the issue in my opinion. The reporters believe it's trivial to reproduce, the developers believe it's trivial to not reproduce, and no progress happens. Figuring out what's similar among everyone you know that has the issue and contrasting that to your friend who doesn't might be our best bet to diagnose this.

shy slate
#

Also i dont know how but psutil lib figures out a private ip correctly while this socket thing has issues

glass mulch
shy slate
#

Ugh

glass mulch
#

Oh boy, we hit a segfault in the fuzzer (as opposed to in the fuzzing script), I'm never going to be able to reproduce it 😭

AddressSanitizer:DEADLYSIGNAL
=================================================================
==393769==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000020 (pc 0x55a51dacc61e bp 0x7ffd6e813260 sp 0x7ffd6e8131b0 T0)
==393769==The signal is caused by a READ memory access.
==393769==Hint: address points to the zero page.
    #0 0x55a51dacc61e in Py_INCREF Include/refcount.h:279
    #1 0x55a51dacc61e in _Py_NewRef Include/refcount.h:527
    #2 0x55a51dacc61e in _PySet_NextEntryRef Objects/setobject.c:2817
    #3 0x7fe2d6f4ceb3 in save_set Modules/_pickle.c:3498
    #4 0x7fe2d6f40713 in save Modules/_pickle.c:4415
    #5 0x7fe2d6f43cf9 in batch_dict_exact Modules/_pickle.c:3355
    #6 0x7fe2d6f45c5a in save_dict Modules/_pickle.c:3417
    #7 0x7fe2d6f406f6 in save Modules/_pickle.c:4411
    #8 0x7fe2d6f4366c in batch_dict_exact Modules/_pickle.c:3333
    #9 0x7fe2d6f45c5a in save_dict Modules/_pickle.c:3417
    #10 0x7fe2d6f406f6 in save Modules/_pickle.c:4411
    #11 0x7fe2d6f43cf9 in batch_dict_exact Modules/_pickle.c:3355
    #12 0x7fe2d6f45c5a in save_dict Modules/_pickle.c:3417
    #13 0x7fe2d6f406f6 in save Modules/_pickle.c:4411
    #14 0x7fe2d6f43cf9 in batch_dict_exact Modules/_pickle.c:3355
    #15 0x7fe2d6f45c5a in save_dict Modules/_pickle.c:3417
    #16 0x7fe2d6f406f6 in save Modules/_pickle.c:4411
    #17 0x7fe2d6f43cf9 in batch_dict_exact Modules/_pickle.c:3355
    #18 0x7fe2d6f45c5a in save_dict Modules/_pickle.c:3417
    #19 0x7fe2d6f406f6 in save Modules/_pickle.c:4411
    #20 0x7fe2d6f412d5 in dump Modules/_pickle.c:4611
    #21 0x7fe2d6f41df2 in _pickle_dump_impl Modules/_pickle.c:7744
    #22 0x7fe2d6f42388 in _pickle_dump Modules/clinic/_pickle.c.h:724
Python/generated_cases.c.h:2361

SUMMARY: AddressSanitizer: SEGV Include/refcount.h:279 in Py_INCREF
==393769==ABORTING
merry venture
#

walking on landmines

clear hill
#

increfing an invalid object inside of pickle? wha?

frigid bison
#

any idea why Python would calculate a max stack size of 1 for this function?

#

is the minimum 1 for some reason?

#

This is Python 3.13 btw

#

it comes from here

gilded flare
frigid bison
#

no RETURN_CONST has no stack usage

gilded flare
#

otherwise it might just naturally have a minimum max stack size of 1

frigid bison
#

that's the whole point of the opcode really, because returning None was so common that they made a special opcode for it

gilded flare
#

special opcode for returning constants*, but yea

#

maybe it's just that minimum 1 stack size

#

what method is used to calculate it?

feral island
#

why do you care about the stack size? the answer to your questions might very well be that it's not important for CPython's purposes to be very precise

#

if I understand correctly, if we underestimate stack size, things explode, but if we overestimate we just waste some memory

frigid bison
#

yes correct, but I'm writing a library that allows for bytecode modification so I have to recalculate the stacksize and I was doing some integration tests where I noticed this discrepancy between my "algorithm" and what cpython outputs

#

so I wondered if there was any reason for this behaviour, so that I myself don't underestimate the stack size

gilded flare
#

i don't think i can access github dev..

#

argcount?

fallen slateBOT
#

Python/flowgraph.c line 757

calculate_stackdepth(cfg_builder *g)
frigid bison
fallen slateBOT
#

Objects/codeobject.c lines 503 to 505

if (con->stacksize == 0) {
    con->stacksize = 1;
}
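
A quick way to observe the resulting value from Python, assuming CPython. The function name is arbitrary, and the exact number depends on the version and on how the compiler emits the return, so the check is only that it never drops below 1.

```python
def returns_nothing():
    pass


stacksize = returns_nothing.__code__.co_stacksize
print(stacksize)
print(stacksize >= 1)
```

For a bytecode-rewriting tool, clamping a computed depth of 0 up to 1 would match what that snippet from codeobject.c does.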
merry venture
fierce coral
#

When python v2.3 came out, the MRO was switched from depth-first, left-to-right resolution of the class precedence list, to C3 linearization. What problems exist with the former?
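
The usual illustration is diamond inheritance. A sketch, where `depth_first_mro` is a hypothetical helper written for this example (a naive depth-first, left-to-right walk keeping the first occurrence), compared against the real `__mro__`:

```python
class A:
    def who(self):
        return "A"


class B(A):
    pass


class C(A):
    def who(self):
        return "C"


class D(B, C):
    pass


def depth_first_mro(cls):
    """Naive depth-first, left-to-right order, keeping first occurrences."""
    order = []

    def visit(c):
        if c not in order:
            order.append(c)
            for base in c.__bases__:
                visit(base)

    visit(cls)
    return order


print([c.__name__ for c in depth_first_mro(D)])  # A comes before C
print([c.__name__ for c in D.__mro__])           # C3: C comes before A
print(D().who())
```

In the naive order the shared base `A` is reached before `C`, so `C`'s override of `who` would be shadowed by the class it is overriding. C3 keeps every class ahead of its own bases.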

feral island
shy slate
deep dirge
clear hill
shy slate
clear hill
#

I agree, FWIW I own that book and it’s pretty good

#

there are also some really good cpython internals blog posts i found once…

#

i haven’t actually looked at that since 2023 so no idea if it’s gone downhill

#

also you’re 90% of the way there if you just always check for errors in your C code 😝

shy slate
shy slate
#

for me running

CPPFLAGS="-I$(brew --prefix zlib)/include" \
LDFLAGS="-L$(brew --prefix zlib)/lib" \
./configure --with-openssl=$(brew --prefix openssl) --with-pydebug

gives me this

configure: error: Unexpected output of 'arch' on OSX
boreal umbra
#
# class A has a method `bar`
a1 = A()
a2 = A()

a1.bar  # x
a1.bar  # y 
a2.bar  # z

This creates a method three times. How much machinery is shared between x and y, and between x and z? Are there any optimizations?

#

CC @ornate wyvern

ornate wyvern
#

i'm not sure what u're asking

#

it does create the method 3 times, thats true

#

if a1.bar caches bar, the next time it'd use the same bound method, not create a new one

#

as for a2, it'd be the same, create on the first call, use the cached one next

#

@boreal umbra

boreal umbra
ornate wyvern
#

i've implemented cache for get

#

the way i implemented it works like this

boreal umbra
#

I'm asking how it really works, in the internals-and-peps channel.

ornate wyvern
#

how it really works is that it creates a new method on every lookup

boreal umbra
#

I know, and I'm asking what optimizations or shared machinery there is.

ornate wyvern
#

oh alright, that i dont know, others can maybe answer that

#

if anyone knows something, i'd appreciate a ping

feral island
#

I believe it creates a new bound method object every time. However, usually you'd of course do a1.bar() or similar, and in that case, the specializing adaptive interpreter creates specialized code that optimizes out the creation of the bound method object.

#
>>> class A:
...     def bar(self): pass
...     
>>> def f():
...     x = A()
...     x.bar()
...     
>>> for _ in range(100):
...     f()
...     
>>> import dis
>>> dis.dis(f, adaptive=True)
  1           RESUME_CHECK             0

  2           LOAD_GLOBAL_MODULE       1 (A + NULL)
              CALL_NON_PY_GENERAL      0
              STORE_FAST               0 (x)

  3           LOAD_FAST_BORROW         0 (x)
              LOAD_ATTR_METHOD_WITH_VALUES 3 (bar + NULL|self)
              CALL_PY_EXACT_ARGS       0
              POP_TOP
              LOAD_CONST               0 (None)
              RETURN_VALUE
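
As for what x, y and z share when the bound methods are actually materialized, a short sketch: each lookup builds its own small wrapper object, but all of them point at the same underlying function.

```python
class A:
    def bar(self):
        pass


a1 = A()
a2 = A()

x = a1.bar
y = a1.bar
z = a2.bar

print(x is y)                    # False: two separate bound method objects
print(x == y)                    # True: same function bound to the same instance
print(x.__func__ is z.__func__)  # True: the function object itself is shared
print(x.__self__ is a1, z.__self__ is a2)
```

So the per-lookup cost is one wrapper holding two references (`__func__` and `__self__`); the code and function objects are never duplicated.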
clear hill
#

it’s what I use with brew on my Mac for all the work I’ve done in the past year and a half adding free-threaded support across the ecosystem

uneven raptor
#

i’m pretty sure method objects are on a freelist too, so the creation of them isn’t that expensive

shy slate
clear hill
#

makes sense! I skipped the “building python” bits because I’d already figured out pyenv, I’ve used it for more than ten years at this point

shy slate
#

fair

shy slate
grave jolt
#

at least on linux

shy slate
feral island
shy slate
#

it has that ./configure thing which gives me an error

#

configure: error: Unexpected output of 'arch' on OSX

feral island
#

that's not something I've seen before, might be worth reporting as a bug and trying to figure out what's special about your setup

#

lots of people definitely are compiling successfully on osx with those instructions

#

so there must be something slightly unusual about your setup

grave jolt
shy slate
#

i searched and there are reports about apple silicon chips not being able to compile versions around 3.9

shy slate
feral island
#

oh yeah that might predate mac aarch64 support

grave jolt
shy slate
#

mmm so what do i do

feral island
#

probably best to just compile the latest version, things will be a little different but probably you should be able to get things to match up

shy slate
#

3.9.0b1 is the version to be precise

shy slate
clear hill
#

you could set up a linux VM or docker image

feral island
#

so you should just be able to check out a more recent commit on the 3.9 branch, things won't be that different

clear hill
#

i use orbstack for docker on macs, it’s pretty neat

shy slate
#

fair when was the parser added i think 3.9.1 shouldnt be much different

clear hill
#

ah that will also work too

feral island
#

that would have been in a different major version, I think 3.10

grave jolt
#

patch versions usually don't change a lot of things

shy slate
#

correct

grave jolt
#

(i think they're supposed to be backwards compatible)

shy slate
#

yeah 3.9.1 it is then, ill try

grave jolt
feral island
#

lol

#

but security

clear hill
#

do you care a lot about the parsing details? otherwise I’d probably still use the main branch of cpython

#

or the 3.14 branch

shy slate
#

i dont care, just wanted to follow the book since it takes in account the old parser

grave jolt
shy slate
#

otherwise i already have the new version compiled

clear hill
#

I don’t know how much in that book is out of date, but you’d probably learn something finding out where it is and why

#

and you’d come out the other side having a better handle on the current codebase

shy slate
#

thats fair

#

but it was more for showing something to others, so i needed to be compatible, for myself i have the newer version compiled

#

hm 3.9.1 worked thnx a bunch

plain dome
sour thistle
#

!rule 6 9

fallen slateBOT
#

6. Do not post unapproved advertising.

9. Do not offer or ask for paid work of any kind.

elder pivot
#

HI 👋

#

I want to read Fluent Python. Do you have any opinions about it? Is it a good book?

#

I have made some simple projects in Python, like a stock price downloader that converts data into CSV or JSON. I also made some videos using Manim

#

So basically, I’m still a beginner.

winged sphinx
quick snow
#

I was not a fan of the previous PEP, this one seems fine. I'm not sure I like lazy from foo import bar over from foo lazy import bar, but eh..

#

I also don't understand the point of the syntax restrictions, presumably this could also work inside functions etc.

glass mulch
steel solstice
grave jolt
#

Right now it's just a lookup in an array

#

(and also there isn't a big use case for it)

grave jolt
clear hill
#

but also thank you whoever decided that

#

* imports: let’s not

#

I still need to fix one spot in NumPy that does that and shouldn’t

grave jolt
#

It provides that nostalgic feeling of "which one of these damn header files this comes from?"

quick snow
grave jolt
#

...well, nostalgic to C developers, I am just getting acquainted with C

clear hill
#

before numpy 2.0, the entire public API was set up using * imports and that meant that we leaked all kinds of implementation details into the API and also as a bonus made it impossible to find where things are defined.

#

well, not impossible

grave jolt
#

yeah, I frequently look at library code on github and it's frustrating

clear hill
#

just harder for no reason besides laziness

quick snow
#

But couldn't you just put the placeholder objects in the locals array and resolve them the same way they get resolved when it's a normal namespace?

clear hill
grave jolt
#

hm, I guess it could be specialized into a different instruction

#

like it would be specialized for globals

#

but it would be extra work and extra code, and I'm assuming there isn't a usecase for lazy imports within a function

grave jolt
steel solstice
grave jolt
#

If you want the import to be done conditionally, then it won't run if the function is not called

steel solstice
#

But can't you just put the import in the conditional?

grave jolt
#

wdym

grave jolt
clear hill
#

yeah, lazy imports lets us abandon all the delayed import hacks we’ve accumulated like this

regal glen
#

I think this is the biggest reason to like lazy imports.
Having all imports that a file might use defined at the top is very nice.
It can sometimes be annoying to search a 10k+ LOC file for all the imports it randomly does. (Not hard if I am explicitly looking for it, but it is extra mental space I would rather not have to take up)

white nexus
#

Will there be a specific importlib util for proper lazy loading?

#

Will I be able to use importlib.import_module(..., lazy=True)

#

Partially initialized modules....

> A module may contain a __lazy_modules__ attribute, which is a sequence of fully qualified module names (strings) to make potentially lazy (as if the lazy keyword was used). This attribute is checked on each import statement to determine whether the import should be made potentially lazy. When a module is made lazy this way, from-imports using that module are also lazy, but not necessarily imports of sub-modules.
>
> The normal (non-lazy) import statement will check the global lazy imports flag. If it is “enabled”, all imports are potentially lazy (except for imports that can’t be lazy, as mentioned above.)

Will this raise an error if defined after the first import statement? I think it should. Otherwise a user might be confused as to why it's not working.

#

> When the module is first reified, it’s removed from sys.lazy_modules (even if there are still other unreified lazy references to it).

And then

> If reification fails (e.g., due to an ImportError), the exception is enhanced with chaining to show both where the lazy import was defined and where it was first accessed (even though it propagates from the code that triggered reification). This provides clear debugging information:

If accepted, this seems to leave a small question: what would the following code error with, and where?

# main.py
lazy import json
import util
data = json.load(...) # reified here
util.dump_json(data)

# util.py
lazy from json import dumsp

def dump_json(...):
  dumsp(...)
merry venture
winged sphinx
meager nacelle
#

Downside is that supporting old python means I have to maintain all of my hacks anyway
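
For older Pythons, the standard library docs already describe a lazy-import recipe built on `importlib.util.LazyLoader`. A sketch along those lines; the helper name `lazy_import` follows that recipe, and `colorsys` is just an arbitrary stdlib module picked for the demo. This is not the PEP 810 mechanism, only the kind of hack that has to stay around for now.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose real loading is deferred until first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # with LazyLoader this only arms the deferred load
    return module


colorsys = lazy_import("colorsys")
# The attribute access below is what triggers the actual import.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))
```

Unlike the proposed `lazy import`, a missing module fails immediately here, because `find_spec` has to locate it up front.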

sinful osprey
#

I wonder how much effort it would take to create a third-party “backport” of PEP 810. I’d guess it’s possible with PEP 523 hooks, but perhaps infeasible.

static hinge
#

Though I'm not sure how we would make globals() return the unreified instance

grave jolt
#

import hooks?

sinful osprey
#

Idk if the same general approach would be enough to emulate 810.

#

… looking at the reference implementation, maybe not, lol. Benching the idea for now.

winged sphinx
#

There's some use cases where we defer importing (import inside a function) because of optional features that depend on packages that may or may not be installed; would that fail fast as a lazy import? Or does the proposed lazy import accept everything and defer module not found till later?

grave jolt
#

So if you do:
```py
lazy import json
print(json)
```

#

The actual module is loaded on first use of that name.
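If failing fast on a missing optional dependency matters, one option that works on current Python versions is to probe for the module without importing it. A small sketch, assuming top-level module names only; is_available is a hypothetical helper, not something PEP 810 defines:

```py
import importlib.util

def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None

# json ships with CPython; the second name is assumed not to be installed.
print(is_available("json"))
print(is_available("surely_not_an_installed_module_xyz"))
```

For dotted names, find_spec imports the parent package first, so this probe is only import-free for top-level modules.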

winged sphinx
#

Oh that's good. I wonder if this would lead to deprecating importing outside global scope, curious if there's any valid use cases left

grave jolt
#

idk,
```py
def try_load_windowsthing(self):
    try:
        import windowsthing
    except ModuleNotFoundError:
        pass
    else:
        self.set_thing(windowsthing)
```

#

maybe you're fiddling with sys.path before loading this function

grave jolt
#

oh, reading the globals doesn't count as using the object

feral island
grave jolt
#

yeah, I realized

#

maybe the PEP could specify more precisely what it means to "use" the imported name

#

idk

#

oh wait, globals() are mentioned explicitly in the Q&A of the PEP

feral island
#

I feel they put some things in the FAQ section that maybe should have gone in the spec section

#

but shrug

grave jolt
#

I am just bad at reading

radiant garden
#

I get the rationale of "external introspection should generally be unaware of the change but internal introspection should be aware of it, as that's the code that's most able to change" but I too think it'd be better to put it somewhere other than the faq

merry venture
#

would you find this small option useful?

❯ ./python -c 'import sys; sys.version_info'

❯ ./python -pc 'import sys; sys.version_info'
sys.version_info(major=3, minor=15, micro=0, releaselevel='alpha', serial=0)

❯ ./python --help | rg -m 1 -- '-p'
-p     : print the result of the program passed in as string (use with -c)

(the interface can be a little different, i'm just asking about the idea)

#
soft snow
winged sphinx
#

I like the idea that I could run a notebook cell via command line, without having to add in a trailing print statement.

white nexus
#

was about to ask how this worked but after implementing a repl recently no yeah that's exactly how I would implement it I think
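A rough sketch of the REPL-style approach hinted at here, using the ast module: run everything except a trailing expression statement, then evaluate that expression and print its repr. This is not the implementation of the proposed -p flag; run_and_print_last is a hypothetical name used only for illustration:

```py
import ast

def run_and_print_last(source, namespace=None):
    """Execute source; if it ends in an expression, print and return its value."""
    ns = {} if namespace is None else namespace
    tree = ast.parse(source, mode="exec")
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the trailing expression so it can be evaluated separately.
        last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<string>", "exec"), ns)
        value = eval(compile(last, "<string>", "eval"), ns)
        if value is not None:  # mirror the REPL, which stays quiet for None
            print(repr(value))
        return value
    exec(compile(tree, "<string>", "exec"), ns)
    return None

run_and_print_last("import sys; sys.version_info")
```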

#

sigh guess I'm doing a small refactor to my eval command shortly

faint river
#

!e 3.14

```py
import annotationlib

def f(x: int): ...
print(annotationlib.get_annotations(f, format=annotationlib.Format.STRING))
print(f.__annotate__(annotationlib.Format.STRING))
```
fallen slate BOT

:x: Your 3.14 pre-release eval job has completed with return code 1.

001 | {'x': 'int'}
002 | Traceback (most recent call last):
003 |   File "/home/main.py", line 5, in <module>
004 |     print(f.__annotate__(annotationlib.Format.STRING))
005 |           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
006 |   File "/home/main.py", line 3, in __annotate__
007 |     def f(x: int): ...
008 | NotImplementedError
faint river
#

why is the second one not implemented?

meager nacelle
#

Because the implementation is in annotationlib.call_annotate_function

#

You're not intended to call the __annotate__ functions directly
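A sketch of the supported route, going through annotationlib.call_annotate_function instead of calling __annotate__ directly. annotationlib only exists on Python 3.14+, so the hypothetical helper string_annotations below returns None on older interpreters:

```py
import sys

def f(x: int): ...

def string_annotations(func):
    """Return func's annotations as strings on 3.14+, or None on older versions."""
    if sys.version_info < (3, 14):
        return None  # annotationlib is not available before 3.14
    import annotationlib
    # call_annotate_function supplies the machinery that the compiler-generated
    # __annotate__ function lacks for the STRING format.
    return annotationlib.call_annotate_function(
        func.__annotate__, annotationlib.Format.STRING
    )

print(string_annotations(f))
```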

feral island
#

otherwise I'd have had to write STRING in raw bytecode which would have been exciting

meager nacelle
#

Nice spam

boreal umbra
#

thought experiment: what if global del were a compound keyword that deleted the actual object from memory and set all its remaining references to None

#

though come to think of it, I doubt there's a mechanism to get all the references to an object that isn't traversing the whole reference graph
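For comparison, two things CPython offers today, shown as a sketch with a made-up Node class: gc.get_referrers, which finds referrers by scanning the objects tracked by the garbage collector, and weakref.ref, where a reference reports None once its target is gone (the closest existing behaviour to "remaining references become None"):

```py
import gc
import weakref

class Node:
    """Illustrative class; plain instances support weak references."""

def demo():
    obj = Node()
    holder = [obj]
    # get_referrers scans gc-tracked containers to find who refers to obj.
    found_holder = any(r is holder for r in gc.get_referrers(obj))
    ref = weakref.ref(obj)
    holder.clear()  # drop the strong reference held by the list
    del obj         # drop the last strong reference
    gc.collect()    # not needed on CPython here, but harmless
    return found_holder, ref() is None

print(demo())
```

The difference from the thought experiment is that weak references are opt-in: ordinary strong references keep the object alive instead of being nulled out.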

radiant garden
#

with a name like that it might as well leave the pyobject pointers dangling 😆

boreal umbra
#

That's just how it could be done without introducing a new keyword. But soft keywords are popular now.

#

If python has to check before accessing any object that it's still there, every time it follows a reference, that kills the idea

static hinge
boreal umbra
static hinge
boreal umbra
static hinge
#

Theoretical effect

boreal umbra
#

I guess bing_shrug

static hinge
#

We'll need to introduce a borrow checker to typing