#internals-and-peps


grave jolt
#

You probably meant ```diff
- class TimerPrinter[**P, R]:
+ class TimerPrinter:
-     def __call__(self, f: Callable[P, R], /) -> Callable[P, R]:
+     def __call__[**P, R](self, f: Callable[P, R], /) -> Callable[P, R]:
```
merry venture
#

was just typing that, yes

#

the P and R cannot belong to TimerPrinter -- they're the function's type args

grave jolt
#

When you had class TimerPrinter[**P, R], it means that the entire class is generic.
When you instantiate e.g.: tp = TimerPrinter("spam"), what type is tp? I don't know, and mypy also doesn't know (it doesn't do enough bidirectional inference for that)

mint cove
#

Thanks @grave jolt, I had erred in assuming that, since it was a method, I should put the type parameters on the parent class.

#

Can confirm this fixes the issue at hand, thanks both!

grave jolt
#

For example, if list had a map method, this would be the signature ```py
class list[A]:
    def map[B](self, func: Callable[[A], B], /) -> list[B]:
        ...
```

mint cove
#

So class-level type parameters are for data held in the class, rather than (only) passing through it? Makes sense.

grave jolt
#

something like that, yes

marsh sorrel
#

hey all

why don't we have a logical "xor" in the python syntax?
not important enough?

#

of course you can make it work with and and or, but it's more verbose

prime estuary
#

You can also use != for booleans. Indeed it’s pretty uncommon to need, also you can’t have the short circuiting behaviour of the other logical ops, so it wouldn’t add much?

gilded flare
#

boolean is not boolean or the above is already an option too

prime estuary
#

That’s another good point. True xor True gives False so this would have to either return bools or do ~a, would be weird and inconsistent with the other logical ops.

marsh sorrel
#

yeah I've been using != with parenthesis

i also know about operator.xor in the stdlib

radiant garden
#

I don't think returning the function's arguments (like and and or do) makes sense for xor because it can't do any short circuiting.

regal glen
#

Only in the sense that it can consume more than just booleans and spit out an answer. But otherwise... why is it not sufficient?

sour thistle
regal glen
#

Yea, fair enough I guess

marsh sorrel
# sour thistle _logical_ `xor` would be False for `1 xor 2` the same way `1 and 2` is not the s...

exactly, ^ is bitwise xor

but i mean, the C language doesn't have a logical xor either.
has logical and && and bitwise and &, logical or || and bitwise or |
but xor only has the bitwise operator ^

after some googling it seems there's no benefit in logical xor because there's no short-circuiting with the xor operation - of course, because we need to check both values in order to get the xor result
I was just curious, thanks all =)
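
A small sketch of the options mentioned above, for comparison. The helper name `logical_xor` is made up for this example; `!=` on the truth values gives a logical xor, while `^` and `operator.xor` are bitwise on ints:

```python
import operator


def logical_xor(a, b):
    """True when exactly one operand is truthy; always returns a bool."""
    # Both operands must be evaluated, so there is nothing to short-circuit.
    return bool(a) != bool(b)


both_truthy = logical_xor(1, 2)      # logical: both truthy, so False
bitwise = 1 ^ 2                      # bitwise: 0b01 ^ 0b10 == 0b11
same_as_caret = operator.xor(1, 2)   # operator.xor is just ^
on_bools = operator.xor(True, False)
```

On actual `bool` operands `^` and `!=` agree, which is why the bitwise operator is usually enough once the inputs are known to be booleans.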

clear hill
#

fun fact, this is deprecated but ~True is -2 lol

#
>>> ~True
<python-input-0>:1: DeprecationWarning: Bitwise inversion '~' on bool is deprecated and will be removed in Python 3.16. This returns the bitwise inversion of the underlying int object and is usually not what you expect from negating a bool. Use the 'not' operator for boolean negation or ~int(x) if you really want the bitwise inversion of the underlying int.
-2
>>> ~False
<python-input-1>:1: DeprecationWarning: Bitwise inversion '~' on bool is deprecated and will be removed in Python 3.16. This returns the bitwise inversion of the underlying int object and is usually not what you expect from negating a bool. Use the 'not' operator for boolean negation or ~int(x) if you really want the bitwise inversion of the underlying int.
-1
>>> import numpy as np
>>> ~np.True_
np.False_
>>> ~np.False_
np.True_
#

~True being truthy is a big footgun
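
To spell out the footgun without triggering the DeprecationWarning, here is a short sketch that inverts the underlying int explicitly (which is what the warning text recommends if the bitwise result is really wanted):

```python
flag = True

# Bitwise inversion of the underlying int: ~1 == -2, and ~0 == -1.
inverted_true = ~int(flag)
inverted_false = ~int(False)

# Both results are non-zero ints, so both are truthy.
still_truthy = bool(inverted_true) and bool(inverted_false)

# Logical negation is what `not` does.
negated = not flag
```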

pearl tendon
#

so would ~True be False?

#

ohh nvm its just -2 im dumb lol

merry venture
#

do you think this is worth fixing? i guess not, but maybe...

>>> import os
... os.posix_spawn("/bin/echo", ["echo"], [])
...
Traceback (most recent call last):
  File "<python-input-2>", line 2, in <module>
    os.posix_spawn("/bin/echo", ["echo"], [])
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'keys'
#

it's the known PyMapping_Check caveat, it doesn't really bother that much most of the time

#

i guess in this case posix_spawn should be as fast as possible, so anything other than PyMapping_Check (like, checking for collections.abc.Mapping?) would likely be worse
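
To illustrate the caveat, here is a rough CPython-only sketch that calls `PyMapping_Check` through `ctypes.pythonapi` and compares it with an `isinstance` check against `collections.abc.Mapping`. The helper name `c_mapping_check` is made up for this example:

```python
import collections.abc
import ctypes

# CPython-specific: look up the C API function in the running interpreter.
_PyMapping_Check = ctypes.pythonapi.PyMapping_Check
_PyMapping_Check.argtypes = [ctypes.py_object]
_PyMapping_Check.restype = ctypes.c_int


def c_mapping_check(obj) -> bool:
    """Return what the C-level PyMapping_Check reports for obj."""
    return bool(_PyMapping_Check(obj))


list_passes_c_check = c_mapping_check([])                      # lists support obj[key]
list_is_abc_mapping = isinstance([], collections.abc.Mapping)  # but are not Mappings
dict_passes_c_check = c_mapping_check({})
int_passes_c_check = c_mapping_check(42)
```

The C check only asks whether the type supports subscripting, so a list gets past it and the failure surfaces later when `.keys()` is looked up.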

silent mirage
#

that function is intentionally low level, even if other functions in cpython use *_Check

raven ridge
#

That shouldn't make anything noticeably slower, since it's only on the error path

grave jolt
#

!e

file = open("banana.py", "w")
file.write("""
weird = '''
Multi-line string that contains a stinky surrogate character: \\udead
'''
""".lstrip())
file.close()

import banana
print(len(banana.weird))
fallen slateBOT
grave jolt
#

!e

file = open("banana.py", "w")
file.write("""
'''
Module-level docstring that contains a stinky surrogate character: \\udead
'''

weird = "totally normal"
""".lstrip())
file.close()

import banana
print(len(banana.weird))
fallen slateBOT
grave jolt
#

Which of these is a bug?

#

Or maybe it's documented somewhere that module level docstrings cannot have surrogates?

#

(Where are module-level docstrings even documented?)

#

it seems to only happen in 3.13+, this works fine in 3.12

grave jolt
sturdy timber
#

Related, this seems to be out of date: https://docs.python.org/3/tutorial/controlflow.html#documentation-strings

The Python parser does not strip indentation from multi-line string literals in Python, so tools that process documentation have to strip indentation if desired.

Even though it looks like the example after was updated to show the indentation being removed.

grave jolt
#

good catch

grave jolt
#

Why does the Python CLA bot say it can "act on my behalf"? 🤨

sturdy timber
#

I think that is shown for all GitHub apps, and should be read as "act on my behalf, under the constraints of the previous permissions", which in many cases isn't really acting on your behalf at all.

oak vortex
#

or is there now mark and sweep?

#

is there any more recent resource validating that?

sturdy timber
oak vortex
#

That's amazing thanks

#

I'm apparently shit at googling

sturdy timber
#

some other nice docs there that I didn't realise existed, cool

frigid bison
#

especially if you're trying to search in specific branches

oak vortex
#

ah yeah makes sense, it's not only me then

#

i always have trouble finding .md i remember seeing in the past

clear hill
#

there’s been a bit of an effort to improve the dev docs recently

spark magnet
#

i unfortunately started a refactor of the devguide that is hanging out there confusing people 🙁

grave jolt
# grave jolt I opened an issue: <https://github.com/python/cpython/issues/142411>

I wonder if someone could review this docs-only PR 👀 https://github.com/python/cpython/pull/142413
Not sure if completely removing the unindent paragraph from the tutorial is a good idea. But since the docstrings are now automatically unindented, it seems inappropriate to put these verbose rules in the tutorial (it's probably only interesting if you're doing some metaprogramming and/or making python tooling, so you're way out of the tutorial)

spark magnet
#

(I can put that on the PR)

grave jolt
#

(since UnicodeDecodeError happens to be a ValueError)

spark magnet
grave jolt
#

The doc currently says

This function raises SyntaxError if the compiled source is invalid, and ValueError if the source contains null bytes.

#

but now ValueError can appear in some other conditions, that's what I meant

spark magnet
#

Sure, make that sentence broader

grave jolt
#

Like "if the source contains null bytes or any docstrings cannot be encoded as UTF-8"? Seems a bit too specific
i think what we need to convey is that if you're calling compile and expect it to sometimes fail, you should really catch both SyntaxError and ValueError
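
From the caller's side, that advice could look roughly like this. `try_compile` is a hypothetical helper name, and returning `None` on failure is just one possible convention:

```python
def try_compile(source, filename="<input>"):
    """Compile source in exec mode, or return None if it is rejected."""
    try:
        return compile(source, filename, "exec")
    except (SyntaxError, ValueError):
        # UnicodeEncodeError and UnicodeDecodeError are ValueError subclasses,
        # so the surrogate case is covered by this clause as well.
        return None


ok = try_compile("x = 1")
bad_syntax = try_compile("x =")
bad_surrogate = try_compile("\udead")
```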

spark magnet
grave jolt
#

🤔 cannot be read as source?

#

that feels weird, it's valid Python, but compile can't understand it?

#

The core issue is that it's not defined anywhere that this is invalid (even if it probably doesn't make sense for it to be valid), it's just an implementation quirk

spark magnet
grave jolt
fallen slateBOT
grave jolt
#

!e And for surrogates, having a surrogate in the source code is understandably invalid (seems like this case of raising ValueError was not documented either, would be a good change too) ```py
compile("\udead", "test.py", "exec")
```

fallen slateBOT
# grave jolt !e And for surrogates, having a surrogate in the source code is understandably ...

:x: Your 3.14 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "/home/main.py", line 1, in <module>
003 |     compile("\udead", "test.py", "exec")
004 |     ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
005 | UnicodeEncodeError: 'utf-8' codec can't encode character '\udead' in position 0: surrogates not allowed
grave jolt
#

!e
My issue is dealing with the case where the _source_ is perfectly fine UTF-8 bytes (or a string that's encodable as UTF-8), but a docstring contains a representation of a surrogate code point ```py
source = """
def fn(): "\\udead"
"""
print(source.encode("utf-8"))  # source is valid as utf-8
compile(source, "test.py", "exec")
``` (also it reports a position of 0 without saying where in the _source_ it is)
fallen slateBOT
# grave jolt !e My issue is dealing with the case where the _source_ is perfectly fine UTF-8 ...

:x: Your 3.14 eval job has completed with return code 1.

001 | b'\ndef fn(): "\\udead"\n'
002 | Traceback (most recent call last):
003 |   File "/home/main.py", line 5, in <module>
004 |     compile(source, "test.py", "exec")
005 |     ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
006 | UnicodeEncodeError: 'utf-8' codec can't encode character '\udead' in position 0: surrogates not allowed
spark magnet
#

i don't think we have to be really specific about what exceptions get raised when.

#

the only reason to mention SyntaxError is that it's an unusual exception.

grave jolt
#

so just say "this function can raise SyntaxError or ValueError"?

#

I'm trying to see this from the perspective of a caller. I want to call this function, what exception should I potentially catch? (i.e.: what are the failure modes)

spark magnet
grave jolt
#

well, if I write except Exception:, it will also catch a typo like cmopile, or passing the wrong type, like compile(42, "answer.py", "exec")

fallen slateBOT
#

IPython/core/async_helpers.py lines 149 to 155

```py
try:
    code = compile(
        cell, "<>", "exec", flags=getattr(ast, "PyCF_ALLOW_TOP_LEVEL_AWAIT", 0x0)
    )
    return inspect.CO_COROUTINE & code.co_flags == inspect.CO_COROUTINE
except (SyntaxError, MemoryError):
    return False
```
grave jolt
#

The authors probably read the docs for compile, concluded that the cell can't contain null bytes, so there's no reason to catch ValueError

spark magnet
#

MemoryError is the odd thing there, but repls are weird.

grave jolt
#

and IPython should change the except block to be except Exception:?

spark magnet
grave jolt
#

So "this function will raise SyntaxError or ValueError if the source is invalid" or something like that?

#

maybe the docs should stay as is, and this new corner case should just be a fun easter egg to discover

#

(and the fact that null bytes do not in fact raise ValueError)

#

that'd be the best option, because I can just close my PR 😎

spark magnet
#

no, the PR has docstring indentation, which is good (should have been two PRs)

grave jolt
#

I think I'm slowly uncovering some kind of pandora's box

#

!e This is even more fun ```py
import ast

tree = ast.parse("banana.answer = 42")
tree.body[0].targets[0].value.id = "ban\x00ana"
c = compile(tree, "<string>", "exec")
exec(c)
``` i'm trolling python at this point
fallen slateBOT
grave jolt
#

I think you might be in the wrong channel

pulsar mulch
#

Ohh

#

Sorry

grave jolt
fallen slateBOT
# grave jolt !e This is even more fun ```py import ast tree = ast.parse("banana.answer = 42"...

:x: Your 3.14 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "/home/main.py", line 6, in <module>
003 |     exec(c)
004 |     ~~~~^^^
005 |   File "<string>", line 1, in <module>
006 | UnicodeEncodeError: 'utf-8' codec can't encode character '\udcff' in position 3: surrogates not allowed
grave jolt
#

so it seems like UnicodeDecodeError may appear at various places when you might not expect it

merry bramble
#

No one expects the Spanish inquisition UnicodeDecodeError

grave jolt
#

!otn a nobody-expects-the-UnicodeDecodeError

fallen slateBOT
#

:ok_hand: Added nobody-expects-the-𝖴nicode𝖣ecode𝖤rror to the names list.

raven ridge
#

it's a UnicodeEncodeError that's happening here, not a UnicodeDecodeError (and tbf, UnicodeEncodeError is way more unexpected)

#

!otn d nobody-expects-the-UnicodeDecodeError

fallen slateBOT
#

:ok_hand: Removed nobody-expects-the-𝖴nicode𝖣ecode𝖤rror from the names list.

raven ridge
#

!otn a nobody-expects-the-UnicodeEncodeError

fallen slateBOT
#

:ok_hand: Added nobody-expects-the-𝖴nicode𝖤ncode𝖤rror to the names list.

raven ridge
grave jolt
raven ridge
#

I think it'd be reasonable to go with:

This function will raise a SyntaxError if the source is syntactically invalid. It will raise a ValueError if the source is syntactically valid but cannot be compiled.

#

It's weird that there are cases where syntactically valid code can't be compiled, but it is apparently the case...

grave jolt
#

fun side fact: you can put the ASCII bytes #coding:utf-16 at the start of a Python file, and Python will try to decode the file as UTF-16 and (obviously) fail

raven ridge
#

presumably with a UnicodeDecodeError in that case

grave jolt
#

no, SyntaxError

raven ridge
#

huh.... I bet that internally is catching a UnicodeDecodeError and raising a SyntaxError instead. That's even more argument in favor of catching the UnicodeEncodeError in compile and raising a SyntaxError instead

#

you can make that argument based just on consistency, based on that...

grave jolt
#

Yes, that's what I think should've happened. But is it worth changing?

raven ridge
#

probably, yeah

#

at least, I think so 🙂

#

it's very weird to me that the alternative is documenting that compile() sometimes fails to compile syntactically valid Python code.

feral island
grave jolt
#

Yes

#

otherwise how do you read the coding comment?

feral island
#

the alternative I could think of is to read to the first (ASCII) newline and decode the rest according to the given encoding

#

but obviously that would give you a file that few other tools would be able to read

grave jolt
#

yeah

#

and when you save it it will be all screwed up

quick snow
#

I wish we could have ~~deprecated~~ removed the whole # coding: thing with the move to Python 3 (and always assume UTF-8), but I guess Python 3k was a bit too early for that..

grave jolt
#

given that compile stopped raising ValueError for null bytes in source in python 3.11 and nobody complained, maybe nobody actually cares about the precise interface of compile, and users should just assume to always catch SyntaxError and ValueError when calling compile?

raven ridge
#

what users really need to know is what exceptions it can raise that can be caused by the source string, versus which ones are not user-controlled. If the user can provide a string which provokes a ValueError, I do think that needs to be documented.

grave jolt
#

It was already documented that ValueError can be raised if the source contained null bytes (which isn't true since python 3.11, but whatever)

#

It is a bit unfortunate that passing an incorrect mode (other than "eval", "exec", "single") or incorrect flags will also raise a ValueError... but that's standard library exceptions for you, I guess
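
A quick sketch of how those failure modes separate, using a made-up helper `failure_kind` that reports which exception type `compile` raised. The point is only that a bad mode lands in the same `ValueError` bucket as source-related problems, while a wrong argument type does not:

```python
def failure_kind(*args):
    """Call compile(*args) and name the exception family it raised, if any."""
    try:
        compile(*args)
    except SyntaxError:
        return "SyntaxError"
    except ValueError:
        return "ValueError"
    except TypeError:
        return "TypeError"
    return None


valid = failure_kind("x = 1", "<s>", "exec")
broken_source = failure_kind("x =", "<s>", "exec")
bad_mode = failure_kind("x = 1", "<s>", "bogus")      # caller error, still ValueError
wrong_type = failure_kind(42, "answer.py", "exec")    # caller error, TypeError
```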

feral island
#

Looks like in theory it can also raise OverflowError but perhaps not in any realistic situation

#

Like if name mangling produces a name that is over PY_SSIZE_T_MAX long

grave jolt
#

i'd say that's covered by this clause

raven ridge
#

but even if not, it tells end users what they need to know

#

I think it's very weird to say that a certain piece of code is syntactically valid even though it's impossible to compile it or include it in a Python module, and I'd hope most core devs would agree with that position, heh

#

the alternative is kind of saying that the code is syntactically valid with -OO but not without, which seems way weirder to me... surely the syntax of the language doesn't depend on compiler optimization level

feral island
#

I'm with you there

grave jolt
#

anyway... I think I'll let the core developers figure it out, I don't really care what the final behavior is

#

I should probably spend my time fixing the IPython crash

raven ridge
merry venture
#

compiling strings will always ignore the encoding header

#

compiling bytes won't

#

there's PyCF_IGNORE_COOKIE but idk if you can pass it via compile, it may not be in the compile bitmask
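
A small sketch of the str versus bytes difference, assuming a latin-1 coding cookie on UTF-8 encoded source (the helper `run` and the variable names are invented for the example). The same text yields different string contents depending on which form is compiled:

```python
# A tiny module with a latin-1 cookie; the literal holds U+00E9.
source_text = "# coding: latin-1\ns = '\u00e9'\n"
# The same module as UTF-8 bytes, so U+00E9 becomes the two bytes C3 A9.
source_bytes = source_text.encode("utf-8")


def run(source):
    """Compile and execute source, returning the value bound to `s`."""
    namespace = {}
    exec(compile(source, "<demo>", "exec"), namespace)
    return namespace["s"]


from_str = run(source_text)     # cookie ignored: s is the single character
from_bytes = run(source_bytes)  # cookie honoured: bytes C3 A9 read as latin-1
```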

merry venture
# raven ridge That shouldn't make anything noticeably slower, since it's only on the error pat...

agree! though probably not worth it
i think this might be a similar case to the recent https://github.com/python/cpython/issues/142396

GitHub

Bug report Bug description: {}.pop([]) Raises a KeyError, not a TypeError as I would expect. However {1: 2}.pop([]) Raises TypeError: cannot use 'list' as a dict key (unhashable type: '...

raven ridge
#

Yeah, that is pretty similar.

winged sphinx
boreal umbra
#

Just bought my PyCon ticket meow_party this will also be my first time in California where I leave the airport.

spark magnet
median palm
#

👀

#

i wonder if i can abuse my club budget to buy myself a pycon ticket

#

call it a field trip or something

boreal umbra
median palm
#

i spent it all on food 🥀 🥀

#

are there like discounts if we buy like a bulk amount of tickets and have a huge group come

boreal umbra
raven ridge
#

the flights and hotels will certainly cost more than the conference tickets

#

there are travel grants available if the cost is a hardship, though - but only so many to go around.

boreal umbra
#

Yeah. Looks like my company can only cover the conference ticket and my time this year (though the time is the most important part to me)

median palm
#

if it's not too far from the bay I could probably stay with a friend who lives there

feral island
#

You mean the San Francisco Bay? PyCon is in southern California, not realistic to commute there from the Bay Area

raven ridge
#

~7.5 hour drive, give or take 🙂

median palm
#

😔

neat cypress
#

if I want to test the cpython jit, do I need to build from source?

#

answering myself, no, I can do PYTHON_JIT=1 python3.14 -c "import sys; print(sys._jit.is_enabled())"
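
The same check from inside a script, as a hedged sketch: `sys._jit` is a private module that only exists on newer CPython versions (3.14 in the command above), so this guards the lookup. The function name `jit_status` is made up:

```python
import sys


def jit_status() -> str:
    """Describe the JIT state of the running interpreter."""
    jit = getattr(sys, "_jit", None)
    if jit is None:
        # Older CPython or another implementation: nothing to query.
        return "unavailable"
    return "enabled" if jit.is_enabled() else "disabled"


status = jit_status()
print(status)
```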

hybrid relic
#

Huh that's weird I thought the compiler was off in the build system by default

round path
charred wagon
grave jolt
crimson hatch
#

Looks like it was not backported

#

The bot will remove the needs-backport labels once the backport is done

grave jolt
#

If miss islington is having skill issue and cannot backport automatically, that means I should just create the backport PRs manually?

#

(unrelated PR)

#

or is there some bot command magic?

crimson hatch
#

Looks like Łukasz wanted to backport https://github.com/python/cpython/pull/127532 first. But yes, if miss-islington cannot cleanly backport you need to make a PR using the cherry picker CLI as described in this comment https://github.com/python/cpython/pull/134392#issuecomment-2898078744
Backporting the other PR may allow miss-islington to backport the one you linked.

GitHub

Closes #127529
All good to go. ConnectionAbortedError now continues instead of returning. Improves OpenBSD performance. Full writeup in the issue.
I've left InterruptedError grouped with Bl...

GitHub

Issue: asyncio.create_unix_server has an off-by-one error concerning the backlog parameter #90871

#

I'll comment to follow up

grave jolt
#

yo it worked

#

I think I got cherry_picker into a weird state with ctrl+c... but the remove-section thing worked ```
$ cherry_picker f6b6a99aa5d63702b8e8101864ae08e615131702 3.14 3.13
🐍 🍒 ⛏
^C
Aborted!
$ cherry_picker --abort
🐍 🍒 ⛏
Run state cherry-picker.state=FETCHING_UPSTREAM in Git config is not known.
Perhaps it has been set by a newer version of cherry-picker. Try upgrading.
Valid states are: BACKPORT_PAUSED, UNSET. If this looks suspicious, raise an issue at https://github.com/python/cherry-picker/issues/new.
As the last resort you can reset the runtime state stored in Git config using the following command: git config --local --remove-section cherry-picker
```

crimson hatch
#

Yeah cherry-picker is not atomic, you may want to reset state if it failed

boreal umbra
grave jolt
#

!pypi cherry-picker

fallen slateBOT
#

Backport CPython changes from main to maintenance branches

Released on August 21, 2025.

boreal umbra
# grave jolt No, it's

so that's what they use to get bugfixes into all currently-supported python versions, using the same commits?

grave jolt
#

Usually it's done automatically by the miss-islington bot. But sometimes it can't figure it out due to merge/rebase conflicts

worldly jacinth
#

hi everyone .. I just started to learn python ... so that's what I've reached until now

#

in future I wanna work at backend ... so I saw many videos on YouTube that say .. u can get onto backend from python ... so any advice

grave jolt
worldly jacinth
#

oh .. sorry.. ok .. thank u

boreal umbra
#

This person is looking for ways to contribute to the PSF, if anyone knows.

spark magnet
boreal umbra
spark magnet
unkempt rock
#

@spark magnet Hi It's me

hybrid relic
#

Does a ThreadPoolExecutor in a with block such as this one automatically make the main thread wait for all tasks on other threads to finish before continuing execution on the main thread?

with concurrent.futures.ThreadPoolExecutor(max_workers=24) as executor:
    sid = 0
    for cr in creatures:
        futures[future_count] = executor.submit(single_iteration, sid, cr, equivalent_seconds * 240)
        print(futures[future_count])
        sid += 1
        future_count += 1
# When this point is reached, have all other threads already completed execution?
raven ridge
hybrid relic
#

Thanks!
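
For reference, leaving the `with` block calls `executor.shutdown(wait=True)`, so the main thread blocks there until every submitted task has finished. A small sketch with a hypothetical `slow_square` task standing in for `single_iteration`:

```python
import concurrent.futures
import time


def slow_square(n: int) -> int:
    """Stand-in for real work; sleeps briefly, then returns n squared."""
    time.sleep(0.05)
    return n * n


with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(slow_square, n) for n in range(8)]

# The with block has exited, so shutdown(wait=True) has already returned.
all_done = all(future.done() for future in futures)
results = [future.result() for future in futures]
```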

oak vortex
#

How do the finalizers run here:

import gc

class A:
    def __init__(self):
        self.other: None | B = None
        self.value = 10

    def __del__(self):
        if self.other != None:
            print("A.__del__ sees other =", self.other.value)

class B:
    def __init__(self):
        self.other: None | A = None
        self.value = 15

    def __del__(self):
        if self.other != None:
            print("B.__del__ sees other =", self.other.value)

a = A()
b = B()
a.other = b
b.other = a

del a 
del b

gc.collect()

In the python docs (https://github.com/python/cpython/blob/main/InternalDocs/garbage_collector.md), it says "Call the finalizers (tp_finalize slot) and mark the objects as already finalized to avoid calling finalizers twice if the objects are resurrected or if other finalizers have removed the object first."

What is this notion of resurrected object? Does it apply here?

gilded flare
#

since neither a nor b are made reachable to anywhere other than each other, it doesn't apply here

#

based on a couple of searches on the internet and reading the gc notes, at least

oak vortex
#

Ah ok I see
But then here, how does one finalizer refer to the other object? Or maybe Python first runs the finalizers and only then after calling all the finalizers it frees the allocated memory?
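
As far as the linked garbage_collector.md describes it, the collector calls the finalizers for the whole unreachable group first and only clears the objects afterwards, which is why each `__del__` can still read the other object. A small sketch of the example above that records what each finalizer sees (the `Node` class and the `log` list are names invented here, and the order of the two entries is not something to rely on):

```python
import gc

log = []


class Node:
    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.other = None

    def __del__(self):
        # The partner object is still intact when this runs.
        if self.other is not None:
            log.append((self.name, self.other.value))


a = Node("a", 10)
b = Node("b", 15)
a.other = b
b.other = a  # reference cycle: refcounts alone never reach zero

del a, b
gc.collect()  # the cycle collector finds the pair and finalizes both
```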

spark magnet
oak vortex
#

Out of curiosity and also because I'm interviewing these days I guess although that probably won't come up

spark magnet
oak vortex
#

Nah for sure they won't

#

It's just I'm preparing by reading some general stuff about GC and then I got curious

boreal umbra
#

I'm the "make sure the candidate actually knows python" person for my department, and I'd never ask about __del__

spark magnet
#

tbh, i have asked a ladder of questions that start very simple and get esoteric, but I make clear at the beginning that I expect we'll get to "I don't know", and that is fine, I'm just trying to gauge your level of knowledge, none of it is a deal-breaker.

oak vortex
#

I wasn't trying to say that'd be necessary to know for my interviews

#

I was genuinely curious

spark magnet
crisp locust
grave jolt
crisp locust
#

Python is moving to use Rust instead of C

grave jolt
#

There's a proposal (not even a PEP yet) to add the ability to write accelerator modules for CPython in Rust.

crisp locust
#

Just a FYI

grave jolt
#

Rust allows you to invoke unsafe functions, which require you to carefully uphold invariants to avoid undefined behaviour (like you always have to do in C). That has never been a secret.

spark magnet
grave jolt
#

(Linux also issues CVEs very conservatively; any CPython issue that involves out of bounds access, data races, use after free or similar would each get a separate CVE)
see: https://social.kernel.org/notice/B1JLrtkxEBazCPQHDM

Rust is is not a "silver bullet" that can solve all security problems, but it sure helps out a lot and will cut out huge swatches of Linux kernel vulnerabilities as it gets used more widely in our codebase.

That being said, we just assigned our first CVE for some Rust code in the kernel: https://lore.kernel.org/all/2025121614-CVE-2025-68260-558d@gregkh/ where the offending issue just causes a crash, not the ability to take advantage of the memory corruption, a much better thing overall.

Note the other 159 kernel CVEs issued today for fixes in the C portion of the codebase, so as always, everyone should be upgrading to newer kernels to remain secure overall.

crimson hatch
spark magnet
#

Sounds like they should not be taken seriously.

raven ridge
#

What a weird, reactionary take. Rust prevents certain bugs by construction, in the same way as Python prevents certain bugs by construction. This code explicitly opted out of the constraints which provide those guarantees, and it has a bug - the developer's reasoning for why it's safe to bypass those constraints was incorrect. That's... not interesting. Someone could have easily implemented exactly the same bug in C instead of in Rust

#

Never having heard of The Lunduke Journal before, it's immediately obvious that this is clickbait

spark magnet
#

@crisp locust you might want to find better sources.

hybrid relic
#

Rust is fine, but I think that comment above should be stressed to some people who are a little too into Rust

Rust is is not a "silver bullet" that can solve all security problems

#

Being insulted for not knowing how to read Rust code by language supremacists (The kind of developers that treat languages as "Which one is the best" instead of seeing them for what they are; Different tools for different purposes) who latched on to Rust as "The best" is unfortunately not a new thing for me

hybrid relic
crimson hatch
#

Rust is is not a "silver bullet" that can solve all security problems
Certainly true! Even formally verified code can have logic errors. But rates of security issues and crashes have been shown to be a lot lower, across all projects I have seen that have adopted Rust.

crimson hatch
#

There is however no concrete plan to make Rust required at the moment

crimson hatch
hybrid relic
hybrid relic
#

I can't find a "Thanks" emoji to react to that message so have a thumbs up instead

hybrid relic
crimson hatch
#

Yep! I see this as one great experiment. I hope it is successful like Rust for Linux (or perhaps even more so), but we'll see!

hybrid relic
#

Wish your proposal all the best!

uneven raptor
#

TIL how horribly broken __static_attributes__ is:

self = object()


class Test:
    @staticmethod
    def test1():
        self.a = 1

    def test2(self):
        if False:
            self.b = 2

    @staticmethod
    def test3():
        del self
        self.c = 3

    @staticmethod
    def test4():
        self = object()
        self.d = 4

    def test5(this):
        this.e = 5

    @staticmethod
    def test6():
        def inner(self):
            self.f = 6


print(Test.__static_attributes__)  # ('a', 'b', 'c', 'd', 'f')
grave jolt
#

does it... literally just tell you all the attributes looked up on a name called self?

uneven raptor
#

attributes assigned on anything called "self", yeah
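
A reduced version of the same observation, as a sketch. `__static_attributes__` only exists on Python 3.13+, so the lookup is guarded, and the class `Point` is made up for the example:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def rename(this, label):
        # Assigned through a first parameter that is not literally named
        # "self", so (per the output above) it is not picked up.
        this.label = label


# None on interpreters that predate the attribute.
static_attrs = getattr(Point, "__static_attributes__", None)
print(static_attrs)
```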

grave jolt
#

that's so janky

quick snow
#

Reminiscent of the magic of super() which also checks for a string match.

merry venture
#

found a good one

(3.13+) run python | xargs and press Ctrl+C for terminal flood

#

i guess the new repl shouldn't even run if stdout is not a tty?
but it only checks stdin

boreal umbra
#

I'm working on my pycon talk proposal (please don't flame me for waiting til the last minute). should I include code examples that I might refer to in the slides, or is that more granular than is helpful?

spark magnet
boreal umbra
#

also, if my proposal doesn't get accepted this year, do I get any feedback? In either case, would it be rational to re-submit it next year during the mentorship period?

spark magnet
boreal umbra
#

Well it's submitted now. Fingers crossed.

weak hawk
#

<@&831776746206265384>

orchid karma
#

!clban 1236408674763669526 scam

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @grizzled gazelle permanently.

hybrid relic
#

Kek

clear hill
fallen slateBOT
#

Include/cpython/pyhash.h line 19

```c
#define PyHASH_INF 314159
```
static hinge
#

Who needs math.pi when you have PI = hash(float("inf")) / 1e5?

gilded flare
#

hash(None) is set to 0xfca86420 too, though that's less interesting than pi
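
The infinity constant is easy to check from Python, since it is also exposed as `sys.hash_info.inf`. A quick sketch (variable names are arbitrary):

```python
import sys

inf_hash = hash(float("inf"))
neg_inf_hash = hash(float("-inf"))

# The joke from above: five decimal places of pi, courtesy of the hash.
approx_pi = inf_hash / 1e5
```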

swift imp
#

Isn't it randomized per interpreter session

#

The hash seed that is

grave jolt
#

If you have a hash table with an internal capacity of e.g. 64, it's only going to use the last 6 bits of the hash. If two values happen to have the same last 6 bits, it's called a "hash collision", and hash tables must be able to account for them. That's why dict and set use both __hash__ and __eq__
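
A sketch of why both methods are needed, using a deliberately bad `__hash__` that sends every key to the same place. The class name `SameBucket` and the 64-slot table size are just illustrative; CPython's real table sizing and probing are more involved:

```python
class SameBucket:
    """Every instance hashes to 42, so all keys collide."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42

    def __eq__(self, other):
        return isinstance(other, SameBucket) and self.name == other.name


# The dict still keeps the keys apart, because __eq__ breaks the tie.
table = {SameBucket("a"): 1, SameBucket("b"): 2}
looked_up = table[SameBucket("a")]

# Different hashes can also share their low bits: with 64 slots only
# the last 6 bits pick the starting slot.
mask = 64 - 1
slot_of_1 = hash(1) & mask
slot_of_65 = hash(65) & mask
```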

primal cypress
grave jolt
#

There are special hash tables where hash collisions are impossible. That can be the case when you have a known possible set of inputs (e.g.: keywords in a programming language), so you can pre-compute a hash function in advance that you know will never lead to collisions (called a perfect hash function). But that's a special case

#

(obviously you can't do that for an arbitrary dict)

clear hill
#

also hash(-1) and hash(-2) are the same

#

because of fun with C 🙂

boreal umbra
grave jolt
#

because -1 is a signal that something went wrong or something like that?

clear hill
#

yeah because a C function that returns ints needs to signal errors

grave jolt
#

!e

class Apple:
    def __hash__(self): return -69
class Banana:
    def __hash__(self): return -1
print(hash(Apple()))
print(hash(Banana()))
fallen slateBOT
clear hill
#

so it can’t return -1
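
In CPython this shows up as -1 being quietly replaced by -2, both for ints and for user-defined `__hash__` methods. A short sketch mirroring the classes above:

```python
class Apple:
    def __hash__(self):
        return -69


class Banana:
    def __hash__(self):
        return -1  # reserved as the C-level error marker


apple_hash = hash(Apple())    # passed through unchanged
banana_hash = hash(Banana())  # -1 is remapped
int_hashes = (hash(-1), hash(-2))
```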

hybrid relic
#

I recently came across an old blog that purported to have invented a super fast interpreter dispatch mechanism based on call threading, and when I put it to the test it genuinely seems like the blog wasn't lying, even though the blog was working with 32 bit x86 and I was testing it on 64 bit x86 almost 20 years later with so many things different and probably didn't implement it 100% correctly, it zooms along in test code: https://godbolt.org/z/6or8z6WPj

But the downside is the mechanism invented in the blog was incredibly ugly and messy, and I unfortunately don't know how to improve it to take advantage of newer features that we have now compared to the limitations the blog was working with in 2008

#

A bit of a shame really if it wasn't so ugly I could genuinely see this being proposed to faster-cpython

round path
# hybrid relic I recently came across an old blog that purported to have invented a super fast ...

Cool. CPython doesn't use switch-case though except on MSVC. On GCC, it uses computed gotos. On new Clang, you can also use a form of indirect call threading via tail calls, which achieves the same thing as what you're describing. If you're on Windows, I just merged indirect call threading for Python 3.15 for VS 2026, you might want to try it out https://fidget-spinner.github.io/posts/no-longer-sorry.html

winged sphinx
#

I also think that entire paragraph is wholesome and exactly the good-faith way we should all act when trying to make something better.

round path
winged sphinx
#

Also, one note: this sent me on a 'what is the GHC calling convention' side quest; if you have a good link, it might be nice to add

clear hill
round path
clear hill
#

oh hey it’s #1 on HN, nice christmas present!

spark magnet
clear hill
#

There’s certainly a less rude way to make that point. But that is not the way of the orange site. Lobste.rs is, unfortunately, only situationally and marginally better but I still prefer the community there.

spark magnet
clear hill
#

I think it’s fair to say that until recently interpreter performance wasn’t getting a lot of focus and now it is. There’s no need to bring counterfactuals into it though.

grave jolt
#

imagine how much energy has been expended compiling C++ and Rust hyperlemon

#

(ignoring for now the amount of energy expended on using LLMs)

spark magnet
#

How about how much energy is expended shipping memes around? There's so much we could pick on.

faint river
#

hey nedbat please can you be careful to not pick on so many things we are trying to conserve energy from message sending

boreal umbra
grave jolt
#

that does use a lot of Python though

#

(but the root cause, of course, is that a lot of those models don't really need to exist in the first place)

boreal umbra
#

For generative AI, the amount of execution that's happening in the pure python part is negligible

#

Python is just telling the NVIDIA chips what to do, and that's the overwhelming majority of the cost.

hybrid relic
#

I can't comment about tail call dispatch though, since I don't know how it's properly implemented

#

The frustrating thing is I reached out to the author of the blog recently and he revealed that he had vastly improved the original design (Which at the time was working with the limitations of 32 bit) to be significantly faster and take advantage of all the goodies that 64 bit architectures offer, as much as it can be without resorting to handwritten assembly, but he had been hired by several companies to build interpreters for them in the meantime, so he can't share this improved design with me, only give a general overview of how the implementation works

#

Of course, I'm nowhere near smart enough to understand how to build such an advanced system from a description alone, so I can't work on faster Interpreters for programming language implementations that I care about

round path
hybrid relic
#

Haha thanks! That is fair, it does depend on a lot of experience too, which helps understanding. Well, that and you have to know a ton about processors and how instruction sequences interact to yield performance!

#

I can't help but feel like it's a missed opportunity though, I'm thinking of what could have been if Darek's (The author of the blog) new designs were publicly available

round path
hybrid relic
#

I will caution you it is an extremely long read

round path
hybrid relic
simple cypress
#

I want to contribute to the https://github.com/python/cpython/issues/116738 so I can help with PEP 703.. I am asking if someone would be willing to review my PR or show me how they contribute to a project, I can watch a bunch of videos but I am a hands on person. Thanks.

GitHub

Feature or enhancement Proposal: We should audit every built-in module for thread safety and make any necessary fixes. This can be done separately from tagging the modules as safe using Py_mod_gil;...

simple cypress
maiden dune
#

i saw someone saying the GIL is getting removed. is that right? i thought it was just going to be optional. did that change course towards a complete gilectomy recently?

hybrid relic
#

I think it's just optional for now no?

spark magnet
maiden dune
#

or speculative timeline at least

spark magnet
maiden dune
spark magnet
winged sphinx
clear hill
uneven raptor
#

I did ask about this at the LS this year and it seems the plan is to always keep the GIL optional

winged sphinx
clear hill
#

Details still being worked out but there will likely be an opaque PyObject ABI that lets you build extensions compatible with both builds on 3.15 and newer

maiden dune
#

and then e.g. relegate the work on non ft compatible extensions to the GIL'd subinterpreters

clear hill
#

maybe?

#

there’s also pep 795

bitter pulsar
#

"Hey guys, how's it going?".

winged sphinx
#

!warn 1212383120192438294 Don't advertise in this server

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied warning to @ocean crest.

tall raven
#

i want to make web pages for my hackathon competition, and i am a beginner!
hey there, anyone knows about vibe coding?

boreal umbra
#

Hello @tall raven , your message is off topic for this channel. Try asking in #python-discussion

tall raven
#

sorry, i am new here

boreal umbra
# tall raven sorry, i am new here

Welcome 💚
Be sure to read the description of each channel before using it for the first time. That helps us make sure everyone can have interesting conversations.

tall raven
#

from now i will be taking care of it!

hybrid relic
#

Is there a way to check if ThreadPoolExecutor is genuinely running threads in parallel on multiple cores

#

I'm using free threaded Python but it strongly seems like it's still running one thread at a time, even with -Xgil=0
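One way to check, sketched under a few assumptions: `sys._is_gil_enabled()` is only available on 3.13+, `burn` is a made-up CPU-bound function that touches no shared objects, and the timing loop is a crude measurement, not a benchmark.

```python
import sys
import sysconfig
import time
from concurrent.futures import ThreadPoolExecutor


def gil_status():
    # Py_GIL_DISABLED is 1 on free-threaded builds; missing or 0 otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on 3.13+; assume the GIL is on if absent.
    probe = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = probe() if probe is not None else True
    return free_threaded_build, gil_enabled


def burn(n):
    # Pure-Python CPU work with no shared state.
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(workers, n=500_000):
    # Every worker gets the same amount of work.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(burn, [n] * workers))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    print("free-threaded build, GIL enabled:", gil_status())
    for workers in (1, 2, 4):
        elapsed, _ = timed_run(workers)
        print(f"{workers} threads: {elapsed:.2f}s")
```

Since each thread does the same amount of work, wall time that stays roughly flat as the thread count grows suggests real parallelism; wall time that grows in step with the thread count suggests the work is being serialized somewhere. It is worth calling `gil_status()` after all your imports have run.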

grave jolt
#

(might be different on windows, where the task manager reports the percentage relative to all of your cores, not one core? not sure)

#

you could be contending on some shared resource even without the GIL, i suppose

raven ridge
#

if, for instance, all of your threads are trying to append results to the same list or the same dict, they'd all be contending for that list or dict's lock
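A small sketch of the two shapes, with made-up function names. `run_shared` has every thread appending to one list; `run_local` has each task return its own list and merges once at the end, so the workers never touch a common container.

```python
from concurrent.futures import ThreadPoolExecutor


def squares(chunk):
    # Each task builds and returns its own list; nothing is shared.
    return [x * x for x in chunk]


def run_shared(chunks):
    shared = []  # every append from every thread goes through this one list

    def worker(chunk):
        for x in chunk:
            shared.append(x * x)

    with ThreadPoolExecutor() as pool:
        list(pool.map(worker, chunks))
    return shared


def run_local(chunks):
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(squares, chunks))
    # Merge once, on the calling thread.
    return [y for part in parts for y in part]


chunks = [range(i, i + 1000) for i in range(0, 8000, 1000)]
```

Both produce the same values (the shared version in a nondeterministic order); whether the difference in contention matters is something to measure on your own workload.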

#

What does this have to do with the channel's topic?

#

!pban 1456527216287420570 spam

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @wild igloo permanently.

clear hill
#

helps to have builds with debug symbols

#

can you share more context? you might be hitting a bug in OSS software FWIW

hybrid relic
hybrid relic
clear hill
#

numpy built from source with debug symbols too then

#

and that SO link might be relevant in that case too

#

if pybullet isn’t pure python, that too

#

any context you can share will be helpful, doesn’t matter if the script is long

hybrid relic
#

I could drop the script here, but it's a little over 700 lines long... Or maybe just the relevant parts that run in ThreadPoolExecutor is better?

#

Yeah that might be better

clear hill
#

the whole thing in a gist, along with a profile is best

#

building numpy from my fast-cache branch might be interesting too

hybrid relic
#

I'll see if I can get the profiler working before I make the script available since profile data along with the script is ideal, thanks!

halcyon trail
#

Question about the current state of no gil; is the single threaded performance practically the same, or is it a bit worse because of needing to use atomics or things like that?

raven ridge
#

A bit worse. Fine grained locks take more work than coarse grained locks, and all of the optimizations that could make it into the with-gil version do

halcyon trail
#

this came up a bit indirectly - someone posted an article comparing free-threaded python parallelism to multiprocessing. The article ended up making a relatively nuanced recommendation about which to use.
this somewhat struck me as strange initially because in principle there's never a reason for multiprocessing to be faster (or at least, practically never).
but then I kind of realized that if the single threaded performance is still worse, then you could still run into situations like that

#

Interesting. but should be really small, right? Like 1-2% or something like that?

#

I remember vaguely that earlier gil-ectomy attempts ran into this issue before and (also vaguely) recall bigger numbers for the single threading penalty being thrown around, and it seemed like there was a lot of opposition to making the change if the single threaded penalty was substantial

raven ridge
halcyon trail
#

ah gotcha, okay, so not totally trivial

#

so it's very much plausible that on the right kind of workload multiprocessing would still beat multithreading

raven ridge
#

But also for nuance, single threaded performance is also improving version over version

halcyon trail
#

yeah for sure, I know performance has been improving in general in python. This isn't a concern about python's evolution so much as trying to understand how relevant multiprocessing is likely to be in the future

#

because obviously single computer multiprocessing is extremely niche in most languages

raven ridge
#

in principle there's never a reason for multiprocessing to be faster

Hm. That might depend on whether you're using the fork start method, too. Fork and CoW is fast, and ends with distinct objects that can be accessed in parallel without contention - but fork is no longer the default multiprocessing start method, and the new default makes it more expensive to share data

halcyon trail
#

AFAIU, CoW from forking a process isn't fine-grained, so as soon as you write stuff, won't it copy everything?

#

if you're just reading something, then you can also just read from multiple threads without contention. but that might not be the case in python as it depends on the interpreter, not sure.

#

when i say "in principle" I mean for a reasonably written native program basically I suppose

feral island
#

so it won't copy everything, but something like some MBs of data

#

in Python that often tends to happen quickly because of refcounts
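The refcount part is easy to see from Python itself. A minimal sketch: simply holding extra references to an object changes a counter stored in that object's header, which is a write to the memory page the object lives on.

```python
import sys

payload = [object() for _ in range(3)]
first = payload[0]

before = sys.getrefcount(first)
readers = [first, first]  # "just reading": keep two more references around
after = sys.getrefcount(first)

print(before, after)  # the count goes up even though the object was never modified
```

After a fork, that kind of write is what forces the kernel to copy the page, even though the program never logically changed the data.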

clear hill
#

3.14t is about as fast or a little slower in single-threaded use than 3.14

#

depends on OS and CPU

grave jolt
#

!cleanban @frosty wadi some sort of scam

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @frosty wadi permanently.

uneven raptor
#

there was one library that claimed to be faster for single-threaded code on 3.14t than 3.14, but I can't find it

round path
clear hill
#

I don’t personally notice any difference when I switch back and forth while 3.13t was noticeably slower

round path
#

Not to mention, if you're using C extensions heavily, the noticeable perf penalty is even lower

halcyon trail
halcyon trail
clear hill
#

that’s true but amdahl’s law means the GIL always ends up as a scaling bottleneck

reef plaza
#

Hi, I’m Zakaria from Algeria, I want to learn web dev!

glass mulch
#

Hi Zakaria, this is the wrong channel for that. You should find help at #web-development . Good luck!

oak vortex
#

I haven’t really followed things closely around the GIL. Will making the garbage collector thread-safe, and hence allowing the GIL to be unlocked, always be off by default, since a thread-safe collector is likely to always be more expensive?

#

Hopefully my question makes sense (I know that today the answer is yes, I’m wondering what are the “plans” for later versions of Python)

uneven raptor
#

the free-threaded build will likely be the default someday, yeah. performance isn’t really the problem at this point; the last hurdle is that free-threaded python is ABI incompatible with GIL-icious python, so the switch will require an ABI break.

raven ridge
#

I haven't heard of anyone proposing a garbage collector that doesn't require stop-the-world

crimson hatch
#

It's tricky because it would likely require an ABI break and inserting barriers around increfs/decrefs

halcyon trail
#

But then, there's tons of code where the time spent in python (as opposed to the C extension) is totally negligible to start with (otherwise it wouldn't be used), so the GIL isn't likely to ever be a bottleneck

#

obviously it just depends on the code

clear hill
#

the GIL means there’s always going to be a nonzero amount of time running on only one thread at a time; amdahl’s law says scaling is only perfect when there’s no code running on only one execution context. It’s not theoretical either, read the intro to PEP 703.

swift imp
#

this is weird

quick snow
#

@thorny elm Not the place for this, please read channel descriptions before posting.

thorny elm
#

what???

hybrid relic
unkempt rock
#

me love some arm architecture

simple cypress
# spark magnet oh, yes, that could be, i don't know the internal details of what happens if the...

https://github.com/python/cpython/issues/108219

I was looking into physics simulations with cpython and that is when I learned about its limitations, someone said rust python can do parallelism, i only know what i have been learning for my projects.

I am not sure how to contribute to the python project so I came to this discord to learn where I can and try not to add any burdens.

thanks for any mentorship

GitHub

Feature or enhancement The steering council has accepted PEP 703. This is intended as a top-level issue to keep track of integration status. The "up for grabs" list contains issues that n...

#

i know i can go without the GIL now but I wanted to help make it optional and contribute to python since it's the main language that i know and want to know more, so i am slithering out of my burrow and presenting myself for reality checks.

halcyon trail
# clear hill the GIL means there’s always going to be a nonzero amount of time running on onl...

maybe I misread what you wrote. if by "ends up" you mean "if you increase the number of cores and threads on a single machine enough then eventually the GIL will be the bottleneck", then yes, that's true.
it will not end up always being the bottleneck in real software because machines only have so many cores. If you spend 1% of your time in python (i.e. a given thread only has the GIL for 1% of the time because 99% of the time it's inside a C extension that has released the GIL), and you have 20 cores and threads, then the GIL will not end up being a bottleneck in that situation

clear hill
#

I think we agree, but in practice very few workloads really do only spend 1% of their time outside extensions

#

more often you're lucky to get it down to 10% - that's just my personal experience though
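To put rough numbers on the 1% and 10% figures, here is Amdahl's law as a tiny function. Treating "time spent holding the GIL" as the serial fraction is a simplifying assumption for this sketch, not a precise model of GIL behaviour.

```python
def amdahl_speedup(serial_fraction, workers):
    # Amdahl's law: the serial part does not shrink as workers are added.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


for serial in (0.01, 0.10):
    speedup = amdahl_speedup(serial, 20)
    print(f"{serial:.0%} serial, 20 threads: {speedup:.1f}x")
```

With 20 threads that works out to about 16.8x at 1% serial and about 6.9x at 10% serial, so both of you have a point: 1% barely hurts at that core count, while 10% already costs most of the machine.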

#

and not all extensions always release the GIL

#

numpy doesn't, for example, for small arrays

halcyon trail
#

yes, that's fair

#

At work, we (like a lot of finance shops) use computational graph engines a lot for modelling, and this is exactly one of the selling points of those approaches

#

the graph can be wired in python, but once you start telling it to execute, it just goes through the entire graph in C++ and never kicks back to python again until it's complete

#

numpy is not great because it kicks back into python a lot, unless you do a single massive matrix operation or something like that all at once

clear hill
#

Has the idea of a zero-copy one-way conversion from list to tuple ever come up? With subinterpreters and free-threading it seems like something that would be nice for a lot of use-cases.

#

like the new take_bytes

raven ridge
#

I don't see how it could be done in a zero-copy way... tuple and list have different memory layouts

#

a tuple contains an array of pyobject pointers.
a list contains a pointer to an array of pyobject pointers.
the only way to turn a list into a tuple would either be to
a) make a new type of tuple that contains a pointer to an array of pyobject pointers, plus a flag indicating whether the tuple is using the traditional contains-an-array layout or the new contains-a-pointer-to-an-array layout, or
b) prepend the tuple's fixed fields at the start of the array (which isn't going to be possible, in general - there's not padding before every array we could grow into)

grave jolt
#

and then somehow (?) make it work with GC and such

raven ridge
#

that would be for converting from tuple to list, not from list to tuple

grave jolt
#

ohh

#

i haven't mastered reading yet

raven ridge
#

but that would indeed work for zero-copy conversions from tuple to list, assuming the ownership problem could be solved

#

I guess that'd force that either there are no references to the tuple or that the list would need a copy if you mutate it

#

but anyway, yeah, that's the opposite of what @clear hill wants

grave jolt
#

basically, point to a mutable tuple

raven ridge
#

yep, that'd work

grave jolt
#

would be pretty silly

raven ridge
#

yep 😄

grave jolt
#

does it have to be a list? lists really are everywhere, but we could have a speciallist_that_turns_into_box

#

like an ugly caterpillar eventually turning into a colorful butterfly

grave jolt
# raven ridge yep, that'd work

In Rust, a similar-ish operation (but involving a copy) would be Vec<T>.into() -> Rc<[T]> or Vec<T>.into() -> Arc<[T]>: you build a vector up and then put the slice directly behind a reference-counted pointer to shrink it and avoid a level of indirection. Not sure what the C++ equivalent even is

#

Rc<[*mut PyObject]> is pretty much a CPython tuple

boreal umbra
#

apparently there were 1015 pycon talk proposals
and there's how many slots?

winged sphinx
halcyon trail
#

(it doesn't actually exist, largely because of allocator related subtleties probably, but it's what would be the equivalent)

grave jolt
halcyon trail
#

std::shared_ptr<T[]> became a thing in C++17

#

"it doesn't actually exist" is in reference to the conversion

#

there's just no way in C++ currently to get std::vector to "release" its data

#

so, you cannot implement a conversion like the one above yourself, and AFAIK it doesn't already exist anyway

clear hill
#

maybe we want a tuple builder API or something like that

#

it's much more ergonomic (and faster!) to build a list from a list comprehension or by appending in-place than to build a tuple in the same way
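For anyone who wants to measure that, a minimal sketch comparing two common ways of ending up with a tuple. The function names and `N` are made up here, and timings will vary by version and machine, so treat any result as a local data point.

```python
import timeit

N = 100_000


def via_list():
    # Build with a comprehension, then copy the pointers into a tuple.
    return tuple([i for i in range(N)])


def via_generator():
    # tuple() has to pull items out of the generator one at a time.
    return tuple(i for i in range(N))


if __name__ == "__main__":
    for fn in (via_list, via_generator):
        print(fn.__name__, timeit.timeit(fn, number=20))
```

Both return the same tuple; the only difference is how the intermediate storage gets built.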

#

but I think you answered my question: the memory layout issue makes this complicated

clear hill
#

maybe frozenlist, actually

#

I always thought frozenlist isn't worth doing because tuple is a thing, but maybe having list.to_frozen that returns a frozenlist and takes the storage from the list is worth it because of these memory layout issues

#

it would be gnarly but I think you could do that in an extension too, although you wouldn't be able to have a nice list method

raven ridge
#

granted it's less ergonomic than appending to a list because you need to manage the resizes by the growth factor yourself, but it is as efficient, just at the cost of a bit more work

clear hill
#

No, I'm talking about Python

#

let's say someone wants to do something like

import numpy as np

l = [i for i in range(100_000)]
arr = np.array(l)

I'm thinking about what happens if another thread has a reference to l and mutates it while the array is getting created. So we need to add some locking, which we only need to do for mutable data containers.

#

I want to give people an escape hatch, but right now it's kind of annoying to write Python code that creates tuples for things like this

raven ridge
#

I want to give people an escape hatch
why? shouldn't uncontended access to the critical section be very close to free?

#

I'd expect np.array to just do something like

PyObject *make_array(PyObject *lst)
{
    Py_BEGIN_CRITICAL_SECTION(lst);
    PyObject *ret_array = iterate_elements_and_make_array(lst);
    Py_END_CRITICAL_SECTION();
    return ret_array;
}
clear hill
#

What if multiple threads are trying to simultaneously create an array from the same input data? You could imagine a simulation that takes a seed and a set of initial conditions and then runs that many times in a thread pool.

#

there's no notion of a reader-writer critical section

#

so concurrent reads block each other

raven ridge
#

sure, that's true - but is that actually a common use case? common enough that the obvious workaround of telling people to store a reference to tuple(the_list) and pass that to array instead wouldn't work?

clear hill
#

or convert it to an array first

raven ridge
#

right

clear hill
#

I guess converting to a tuple before farming out to a thread pool works, but to me it seems nicer not to copy unnecessarily
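That workaround looks roughly like this. `simulate` is a stand-in for a real simulation and the input data is made up; the point is only that the snapshot is taken once, before any worker starts.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(seed, initial_conditions):
    # Stand-in for a real simulation: it only reads its inputs.
    return seed + sum(initial_conditions)


conditions = [1.0, 2.0, 3.0]
snapshot = tuple(conditions)  # one copy, made before any worker starts

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(simulate, seed, snapshot) for seed in range(8)]
    conditions.append(99.0)  # a later mutation cannot affect the snapshot
    results = [f.result() for f in futures]
```

Every task sees the same three values no matter when it runs relative to the `append`.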

raven ridge
#

fwiw, memcpy is very fast. I wouldn't worry too much about the "zero copy" aspect of what you're asking about. I'd expect that copying the array of PyObject* is pretty cheap, and the slow and expensive part is updating the refcount for each of them, incrementing it to indicate that the tuple now holds a reference to each of those objects, and then decrementing it whenever the list dies
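You can see both halves of that from Python: `tuple(some_list)` copies pointers rather than objects, and each element's reference count goes up as a side effect. A small sketch:

```python
import sys

items = [object() for _ in range(4)]
before = [sys.getrefcount(obj) for obj in items]

as_tuple = tuple(items)  # copies 4 pointers, not 4 objects
after = [sys.getrefcount(obj) for obj in items]

same_objects = all(a is b for a, b in zip(items, as_tuple))
counts_went_up = all(n_after > n_before for n_before, n_after in zip(before, after))
```

The elements are the very same objects, and each one had its counter bumped, which is the scattered-writes cost being described.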

#

which is to say, if this is worth optimizing, a middle ground would be something that doesn't get rid of the memcpy, but does get rid of the reference count manipulation, by having the newly created tuple steal the references from the list

clear hill
#

or frozenlist

#

to the extent that tuple isn’t already frozenlist

raven ridge
#

right - same deal, the frozenlist could steal the references from the original list (a la a C++ move constructor)

clear hill
#

that’s what I was getting at with a list.freeze() method

#

the frozenlist would steal the storage and references and the original list would be empty

raven ridge
#

copying the big-ish contiguous array should be pretty cheap, updating all of those refcounts all over the place with no locality of reference much less so

clear hill
#

you could only do it for a uniquely referenced list

#

I think

#

but that makes sense for the builder pattern

raven ridge
#

nah, you could do it for any list - it's just lock the list's critical section, memcpy the list's array into the new tuple or frozenlist, set the list's size to 0, and then unlock the list

#

(and notably, numpy could provide a method that does this! it wouldn't necessarily need to be in cpython)

raven ridge
#

I think this?
```c
PyObject *list_to_tuple(PyObject *lst)
{
    PyObject *ret = NULL;

    Py_BEGIN_CRITICAL_SECTION(lst);
    Py_ssize_t size = PyList_Size(lst);
    if (size >= 0) {  // otherwise an exception has been set
        ret = PyTuple_New(size);
        if (ret) {  // otherwise an exception has been set
            for (Py_ssize_t i = 0; i < size; ++i) {
                // PyTuple_SET_ITEM steals the reference the list was holding
                PyTuple_SET_ITEM(ret, i, PyList_GET_ITEM(lst, i));
            }
            // the list gives up ownership of all of its items
            Py_SET_SIZE((PyVarObject*)lst, 0);
        }
    }
    Py_END_CRITICAL_SECTION();
    return ret;
}
```

raven ridge
#

Pretty sure that accomplishes the goal and uses only public APIs. The one part of that that's tricky is that I don't think it's documented that PyListObject is a PyVarObject... it is and always has been, but in theory that could change, and this code would break if it did

#

it's probably frowned upon to poke in the list's internal state like this, heh

swift imp
#

That may have gotten rid of the frozen list idr but tuples have specific semantics

clear hill
#

it requires deep knowledge of how python stores string data

#

I’m kidding about that ask but also sort of not kidding 🙂

raven ridge
#

I know very little about numpy but I do know a fair bit about how CPython stores strings (and how that changes from version to version!) - I could try to help

clear hill
#

right now StringDType just returns Python strings when you access a scalar, which technically breaks numpy’s type system

#

^ this PR did it by implementing a numpy scalar that isn’t a PyUnicode subtype but it really needs to be a PyUnicode subtype otherwise the transition period is annoying for users

#

optimally, it would be a subclass of both str and np.generic, which is the base class for all the numpy scalar types

raven ridge
#

hm, so, it looks like you can subclass from both str and np.generic, which sort of surprises me - but given that you can, what's the problem?

#

what do you need to add to str in your subclass to make it a valid np.generic and to make it work with numpy?

#

and why do you need to know the layout of the data inside the str? I don't think I follow that yet...

#

I can see why you'd need to get at the UTF-8 bytes inside the str, but that's just a call to PyUnicode_AsUTF8AndSize

clear hill
#

Doesn’t that make a copy?

raven ridge
#

no

#

it's a reference to a (lazily populated) field inside the str

clear hill
#

I’m having trouble remembering all the details about why I needed to know about the internal representation, it’s been a while since I looked closely

#

if we don’t then great

raven ridge
#

for any str, you can get at a UTF-8 representation of that str with PyUnicode_AsUTF8AndSize, and the returned pointer has the same lifetime as the str itself

#

and that'll work just fine for subclasses of str

clear hill
#

and hopefully sometime in the future Python can store the data internally as UTF-8 🙂

raven ridge
#

it does, sometimes

#

ascii strings are internally stored as UTF-8

#

since ascii is a subset of UTF-8, the one buffer is used for both the array-of-codepoints representation and the UTF8-byte-array representation

#

and most strings in most programs are ascii, so most of the time there's only one buffer

#

basically: a str internally has an array of unicode codepoint values. If every codepoint in the string is less than 256 that's an array of 1 byte numbers, otherwise if every codepoint in the string is less than 65536 it's an array of 2 byte numbers, otherwise it's an array of 4 byte numbers.

a str also has an array of utf-8 bytes. If every codepoint in the string is less than 128 that's a pointer to the array of codepoints, because ascii is a subset of UTF-8. Otherwise, it's a pointer to a separate heap allocated array owned by the str and lazily populated the first time it's needed
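The 1/2/4-byte widths are visible from Python via `sys.getsizeof`. A rough sketch, CPython-specific, with `bytes_per_char` being a made-up helper that measures how much the object grows per extra character:

```python
import sys


def bytes_per_char(ch, n=50):
    # Growth in object size when n more copies of ch are added.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n


samples = [
    ("ASCII", "a"),
    ("Latin-1", "\u00e9"),     # e with acute accent, below 256
    ("BMP", "\u20ac"),         # euro sign, below 65536
    ("astral", "\U0001F600"),  # emoji, needs 4 bytes per codepoint
]

for label, ch in samples:
    print(label, bytes_per_char(ch))
```

On CPython this prints 1.0, 1.0, 2.0 and 4.0 for the four samples.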

halcyon trail
#

wasn't there talk of moving to plain utf-8 with an array of character indices

clear hill
#

Now I just need to figure out how to do multiple inheritance in C…

boreal umbra
#

Inheritance is legitimized code obfuscation

raven ridge
uneven raptor
#

yeah, one of the big drawbacks of static types is that you cannot use multiple inheritance with them

#

I’d like to see a proper static type deprecation someday (with the exception of builtin types) but way too many people refuse to migrate to heap types right now

worthy pagoda
#

hi guys does anyone know how to open a python file in visual studio? i dont quite understand

winged sphinx
prime sage
#

how can I propose a change to the struct module to help with typing?
I want the API for struct unpack to add a param for the return type.
for example:

foo1, foo2, foo3 = struct.unpack("!iHi", bin, type=tuple[int, int, int]) # foo1, foo2, foo3 are automatically inferred as ints
boreal umbra
prime sage
raven ridge
#

this only needs a change in type checkers. The return type is implied by the format string, they could already do this if they cared to (assuming the format string is a literal)

#

and it's better to infer the return type from the format string than to have the user supply it separately, because otherwise they could go out of sync. You want the type checker to catch it if you update the format string and don't update the unpacking, for instance
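Until type checkers do that, one workable pattern (a sketch, with `HEADER` and `unpack_header` being names invented here) is to keep the format string and the annotation right next to each other in a small wrapper, so they are at least edited together:

```python
import struct

# Network byte order: signed 32-bit, unsigned 16-bit, signed 32-bit.
HEADER = struct.Struct("!iHi")


def unpack_header(data: bytes) -> tuple[int, int, int]:
    # The annotation lives beside the format string it has to match.
    first, second, third = HEADER.unpack(data)
    return first, second, third


packed = HEADER.pack(-1, 65535, 7)
foo1, foo2, foo3 = unpack_header(packed)
```

This does not stop the two from drifting apart, which is exactly the weakness described above; it only makes the drift easier to spot in review.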

#

this proposal would be bad for the same reason as having max(values, type=int) would be bad

spark magnet
prime sage
quick snow
#

!clban 1450903571691339871 Only here to advertise

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @split parcel permanently.

hybrid relic
#

There's so many of them

meager nacelle
#

This came up in a help post, but dataclasses currently has some unexpected behaviour around how it can 'inherit' default values:

from dataclasses import dataclass

class A:
    a = 42

@dataclass
class B(A):
    a: int

print(B())  # B(a=42)

Currently all of mypy, pyright, ty and pyrefly think this will error. I think it probably should be an error but I'm not sure if this would be considered a bug or a 'feature' - it's not documented and no tests rely on it.

grave jolt
#

the combination of dataclasses and inheritance is almost always puzzling to me

meager nacelle
#

This kind of just falls through because dataclasses is using getattr(cls, a_name, MISSING) to get the default values, so it'll pick up things from any parent class
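A minimal sketch of that lookup in isolation, reusing the `A`/`B` classes from the earlier example. The `getattr` call here mimics the one described; it is not a copy of the dataclasses source.

```python
from dataclasses import MISSING, dataclass, fields


class A:
    a = 42


@dataclass
class B(A):
    a: int


# Roughly the lookup performed for each annotated name:
found = getattr(B, "a", MISSING)  # walks the MRO, so it finds A.a

# ...and that value ends up recorded as the field's default.
recorded_default = fields(B)[0].default
```

Both `found` and `recorded_default` are 42, which is why `B()` works with no arguments.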

#

Notably, attrs doesn't do this

#

This means it also picks up properties for instance

grave jolt
#

!e 🥴

from dataclasses import dataclass

class A:
    def a(self): pass

@dataclass
class B(A):
    a: int

print(B())
fallen slateBOT
meager nacelle
#

You can get the same thing if you use a property if you make the dataclass slotted

#

If you don't make it slotted you can't create an instance because the property has no setter, if you do make it slotted the slot has replaced the property so it "works"

meager nacelle
#

The type checkers all give some variant of: main.py:10: error: Missing positional argument "a" in call to "B" [call-arg] (mypy for example)

halcyon trail
#

Though I will say it's definitely useful at times

#

If you do combine dataclasses and inheritance you should probably always turn off equality and turn on kw_only

grave jolt
#

there's also __replace__ which broke type checking on generic data classes

meager nacelle
#

I guess it's just the question of is this a bug to be fixed, or if there's an argument to be had over changing the behaviour or documenting it

halcyon trail
#

I'm not sure if there's really anything to change. the code is a really bad idea but it's not wrong per se, and dataclasses have always been very "loose" with inheritance

meager nacelle
#

Well this specific instance is clearly undesirable, but it's more to illustrate what the problem is when you have something more like the original example where someone was trying to override a property with a field

#

!e

from dataclasses import dataclass

@dataclass
class B:
    @property
    def p(self) -> int:
        ...

@dataclass
class C(B):
    p: int
    q: int
#

Ah the actual full error is too long and got truncated

#

You get this: TypeError: non-default argument 'q' follows default argument

#

It also leads to (further) inconsistent behaviour between slotted and unslotted classes

#

!e

from dataclasses import dataclass

for slotted in [True, False]:
    @dataclass(slots=slotted)
    class A:
        answer: int = 42

    @dataclass(slots=slotted)
    class B(A):
        answer: int

    print(f"slots={slotted}")
    try:
        print(B())
    except TypeError as e:
        print(e)
        print()
fallen slateBOT
halcyon trail
meager nacelle
#

I mostly want it to behave the way all of the type checkers already think it behaves.

halcyon trail
#

so, lets say you actually have a property in the base class, and its computed so there's no backing field

#

in the derived dataclass I would have a field with a different name, and use that to back the property
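A sketch of that arrangement, with made-up class names. The base class exposes a computed property; the derived dataclass stores the value in a differently named field and overrides the property to read from it.

```python
from dataclasses import dataclass


class Shape:
    @property
    def area(self) -> float:
        raise NotImplementedError


@dataclass
class FixedArea(Shape):
    _area: float  # the field has its own name, so it never collides with the property

    @property
    def area(self) -> float:
        return self._area


shape = FixedArea(2.5)
```

Because the field is called `_area`, the dataclass machinery never sees the inherited `area` property when it looks for defaults.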

meager nacelle
#

My expectation is just that this should error, but it should error when it tries to set self.p because the property is there and has no setter.

#

It shouldn't inconsistently appear to inherit default values based on whether there are slots

uneven raptor
#

it’s probably not changeable, it looks enough like a feature that people are probably relying on it somewhere

meager nacelle
#

I wouldn't propose just snap changing the behaviour - it would need to be a warning first if it was going to be changed.

I'm aware there are people who have relied on it, or tried to. See this thread about adding it to dataclass_transform - https://discuss.python.org/t/dataclass-transform-add-inherit-defaults-option/89531

However, they're assuming it's "dataclasses inherits defaults" where it's actually "dataclasses will assume any value present as a class attribute is intended as the default, unless it's a slot".

#

Interestingly Pydantic's "dataclasses" used to behave like Python's ones in this respect, but no longer do.

#

(No they still do I just missed something when testing)

#

I assume they're just adding things onto regular dataclasses? BaseModel doesn't behave like dataclasses (neither does attrs).

inland halo
#

can anyone explain to me what the use of if __name__ == "__main__" is in code

sour thistle
#

!ifmain

fallen slateBOT
#
`if __name__ == '__main__'`

This is a statement that is only true if the module (your source code) it appears in is being run directly, as opposed to being imported into another module. When you run your module, the __name__ special variable is automatically set to the string '__main__'. Conversely, when you import that same module into a different one, and run that, __name__ is instead set to the filename of your module minus the .py extension.

Example

# foo.py

print('spam')

if __name__ == '__main__':
    print('eggs')

If you run the above module foo.py directly, both 'spam' and 'eggs' will be printed. Now consider this next example:

# bar.py

import foo

If you run this module named bar.py, it will execute the code in foo.py. First it will print 'spam', and then the if statement will fail, because __name__ will now be the string 'foo'.

Why would I do this?

  • Your module is a library, but also has a special case where it can be run directly
  • Your module is a library and you want to safeguard it against people running it directly (like what pip does)
  • Your module is the main program, but has unit tests and the testing framework works by importing your module, and you want to avoid having your main code run during the test
sour thistle
quick snow
#

See above, this is not on-topic for this channel.

dusky olive
#

Good evening; I am VERY new to programming (no edu. and failing at being self taught in a program that I MUST work within (REDCap)). Does anyone know REDCap and is willing to answer a few specific questions? Not even sure if this place is the right one to ask. Couldn't find a REDCap specific place to ask. Also, I looked up what language REDCap uses and basically I am told all the languages.
Anyhow, I am looking for how to turn a value into a NUL Value.

  1. Can that be done
  2. If so how
  3. If wrong forum, I am sorry and am willing to accept any guidance
#

says I am new here and need to say "hi" so hi and stuff

#

if([item_6]=0,somethingsomethingnulvalue)

rose nebula
#

Hello all. I have a CPython PR up to add custom header support to the stdlib HTTP module. It's undergone several rounds of review - I've made many requested changes and the PR is up for re-review, but there's been no activity since before the Holiday season. I realize things are perhaps a little slow right now, but are there any core devs that are able to give it a look? https://github.com/python/cpython/pull/135057 . Thanks for the help!

GitHub

As proposed in #135056, Add a --cors command line argument to the stdlib http.server module, which will add an Access-Control-Allow-Origin: * header to all responses.
Invocation:
python -m http.ser...

weak hawk
clear hill
#

amazingly, this works on the GIL-enabled build somehow

#

I ran into it today because on the free-threaded build it causes a bus error

#

tp_new for a static type is in the data segment of the binary, isn't it?

raven ridge
#

tp_new and ob_refcnt are both in the data section

clear hill
#

I wonder why it bus errors on 3.14t then

#

ohhh I know

#

lol its PyObject layout is wrong

raven ridge
#

Ah, yeah. I should've said, I didn't realize you didn't know that

clear hill
#

I wasn’t paying attention to that bit

#

but of course doing this via ctypes requires a model for PyObject

quick snow
#

!warn 759279650828058708 We have filters for a reason. Do not post surveys here.

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied warning to @fallen tiger.

bronze rapids
#

I need a person who can apply for me.

raven ridge
timid imp
#

Test
internet connection

boreal umbra
timid imp
spark magnet
#

it would be really useful to hook into changes to sys.path....

raven ridge
spark magnet
# raven ridge What's the use case?

coverage.py tries to avoid measuring third-party code because it slows things down needlessly. But finding third-party is a bit ad-hoc, and depends on what is in sys.path. But sys.path is changed by test runners.

#

I think I have it all fixed now, maybe?

#

i keep a copy of sys.path, and when a decision has to be made, if sys.path has changed, i re-consider the world to decide where third-party code is.
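
A rough sketch of that snapshot-and-compare idea. This is not coverage.py's actual code; PathWatcher is a hypothetical helper name.

```python
import sys


class PathWatcher:
    """Remember sys.path and report when it no longer matches."""

    def __init__(self) -> None:
        self._snapshot = list(sys.path)

    def changed(self) -> bool:
        # Compare by value, so both in-place edits and rebinding
        # sys.path to a new list are noticed.
        current = list(sys.path)
        if current != self._snapshot:
            self._snapshot = current
            return True
        return False


watcher = PathWatcher()
sys.path.append("/hypothetical/extra/dir")
if watcher.changed():
    print("sys.path changed, re-deciding what counts as third-party")
sys.path.remove("/hypothetical/extra/dir")
```

The cost is one list comparison per decision, which is why a real hook would still be nicer.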

uneven raptor
#

are you worried about changes to the sys.path attribute itself, or changes to the list? if it's the latter, I think you can make a list type that tracks modifications and set it to sys.path
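
One way the "list type that tracks modifications" could look, as a sketch only. TrackedPath and its version counter are made-up names, and whether every consumer of sys.path is happy with a list subclass is an assumption.

```python
class TrackedPath(list):
    """list subclass that bumps a counter on every in-place mutation."""

    def __init__(self, *args):
        super().__init__(*args)
        self.version = 0


def _counting(name):
    original = getattr(list, name)

    def wrapper(self, *args, **kwargs):
        self.version += 1
        return original(self, *args, **kwargs)

    wrapper.__name__ = name
    return wrapper


# Wrap the mutating list methods after the class exists, so the
# initial population in __init__ is not counted.
for _name in ("append", "extend", "insert", "remove", "pop", "clear",
              "sort", "reverse", "__setitem__", "__delitem__",
              "__iadd__", "__imul__"):
    setattr(TrackedPath, _name, _counting(_name))

tracked = TrackedPath(["/some/dir"])
tracked.append("/another/dir")
tracked.insert(0, "/first/dir")
print(tracked.version)  # 2
```

In real use you would assign it with sys.path = TrackedPath(sys.path) and compare version numbers later. This catches mutation of the list but not someone rebinding sys.path to a fresh list.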

#

if it's the former, I think it's still possible, just more difficult: in C, you can call PyDict_AddWatcher/PyDict_Watch on sys.__dict__ and then watch modifications to path

spark magnet
proper sedge
#

Hello

edgy flicker
proper sedge
#

Hi

edgy flicker
proper sedge
#

In Egypt

#

Where are you from

edgy flicker
proper sedge
#

Ok

edgy flicker
#

so are you programmer or what ...?

quick snow
#

Hello, this channel is about the internals of Python, not an off-topic channel.

proper sedge
#

Python

edgy flicker
edgy flicker
winged sphinx
#

!ot

fallen slateBOT
winged sphinx
#

Or off topic for off topic.

edgy flicker
#

okay sorry

faint river
smoky wave
#

looks so much better now

spark magnet
faint river
agile meteor
#

Hello

#

Someone can play my game

#

İ send it on dö

#

Dm

ruby schooner
proper sedge
#

Hello

oak sequoia
#

Hello

jovial flame
#

@civic acorn Your message was removed for violating rules 6 and 9 regarding advertising and paid work.

quiet crane
#

If Python were documented inside its source code, it would be much easier for me to explore the first-party libraries. But today, showing the signature with lsp.hover or looking at the definition with lsp.goto_definition yields next to nothing. Why?

#

hover on argparse.ArgumentParser().parse_args() method

#

goto_definition takes me to /usr/lib/python3.13/argparse.py, but there is no docstring:

deep dirge
static hinge
#

I think that's just a pyright thing, and the stdlib stubs don't include docs

#

based-pyright can merge them

quiet crane
#

(I'm not using pyright but python-lsp-server)

deep dirge
quiet crane
deep dirge
quiet crane
#

There is also pydoc3 -b, but that seems to show the same content as the source code does

For example only the signature:

add_mutually_exclusive_group(self, **kwargs)
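
A small sketch for checking this from Python itself rather than through an editor. undocumented_methods is a hypothetical helper, and the result depends on the Python version, so no output is shown.

```python
import argparse
import inspect


def undocumented_methods(cls):
    """Return names of public methods on cls that have no docstring."""
    return sorted(
        name
        for name, member in inspect.getmembers(cls, inspect.isfunction)
        # Check __doc__ directly: inspect.getdoc() would fall back to
        # docstrings inherited from base classes.
        if not name.startswith("_") and not member.__doc__
    )


print(undocumented_methods(argparse.ArgumentParser))
```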
quiet crane
spark magnet
deep dirge
quiet crane
# spark magnet right: that method has no docstring.

This is what I mean. 🙂

Is there any world where the awesome online doc content and the method docstring could be delivered from the same source of information. (To avoid the duplicated work @stan mentioned)

quiet crane
spark magnet
deep dirge
#

Also note that we can’t copy the online docs source directly, it uses rST markup.

quiet crane
quiet crane
spark magnet
spark magnet
quiet crane
#

Like, have a basic docstring, and include that as part of the html documentation. But I guess that would just split the documentation text into 2 parts and actually cause a lot of maintenance burden anyway

#

Seems like I should download the html docs and add a binding in my editor to open that.

spark magnet
quiet crane
deep dirge
spark magnet
quiet crane
quiet crane
#

I ended up doing a mapping to directly search the python documentation for the word under my cursor:

autocmd FileType python nnoremap <expr> <leader>D '<cmd>Open https://docs.python.org/3/search.html?q='.expand('<cword>').'<cr>'
steel solstice
spark magnet
uneven raptor
#

I mean, most editors can minimize those comments

#

personally I would love documentation generated from docstrings but I won't fight that battle right now

frank abyss
#

hello

#

i have learned basic python

#

now what should i do next ?

spark verge
#

!res

fallen slateBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

spark verge
#

But your message is almost off topic for this channel

spark magnet
spark magnet
spark magnet
# boreal umbra why not?

since the suggestion was a guess at what they were interested in, i was also willing to take a guess. I think perhaps 1% of new learners will be interested in the internals.

#

esp after "basic python"

boreal umbra
#

ah. I thought maybe it was a general rebuke of the book.

spark magnet
#

no, i'm sure it's a fine book, though it's necessarily stuck at 3.9.

astral gazelle
#

Have internals changed a lot since 3.9?

gilded flare
#

other than that it's probably alright

#

i haven't read the book though

round path
# astral gazelle Have internals changed a lot since 3.9?

I read the book. The internals are pretty different now. The interpreter got revamped due to PEP 659, threading due to free threading, GC due to incremental GC and FT GC.

I don't recommend reading the book if you want to dive into modern cpy dev, the devguide and internaldocs are more up to date. However, it is a good snapshot of 3.9 at that point of time.

uneven raptor
#

I read the book a few years before I got into cpython development. it's great for piquing interest, but yeah, wouldn't recommend for actually learning about the codebase.

boreal umbra
uneven raptor
#

I wonder what percentage of readers become core devs. probably less than 1% or something, right? unless the book has fewer sales than I thought

orchid thicket
#

What are thoughts on allowing set literals in type expressions to allow writing Literal[X, Y] as {X, Y}?

Basically all of the same ways that the set union operator, |, can be used in annotating types, I'd like for set literals to do the same.

So instead of a verbose

from typing import Literal

def get_fruit(fruit: Literal['apple', 'banana', 'cherry']) -> str:
    return fruit

you can have

from typing import Literal

def get_fruit(fruit: {'apple', 'banana', 'cherry'}) -> str:
    return fruit

My main motivation behind this is how verbose annotating instances of generics or aliasing them is.

ColorName = Literal["red", "yellow", "green", "cyan", "blue", "magenta"]

class Color[C: ColorName]:
    def __init__(self, name: C) -> None: ...

Red = Color[{"red"}]  # instead of Color[Literal["red"]]
Cmy = Color[{"cyan", "magenta", "yellow"}]
orchid thicket
meager nacelle
#

I'm not a huge fan. Currently for defined types a | b is Union[a, b] at runtime. This can't be the case for {'apple', 'banana', 'cherry'} which is a valid set at runtime.
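
To make the runtime point concrete, here is a small sketch comparing what ends up in __annotations__ today. The function names are illustrative.

```python
from typing import Literal, get_args


def with_literal(fruit: Literal["apple", "banana"]) -> str:
    return fruit


def with_set(fruit: {"apple", "banana"}) -> str:
    return fruit


literal_ann = with_literal.__annotations__["fruit"]
set_ann = with_set.__annotations__["fruit"]

print(get_args(literal_ann))  # ('apple', 'banana')
print(type(set_ann))          # <class 'set'>
```

The set version is already legal syntax, but runtime tools see an ordinary unordered set with no marker saying it was meant as a Literal.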

orchid thicket
#

yeah, i think it couldn't and shouldn't work for assignment because of this. but in any instance of type expression

foo: {1, 2, 3} = 1
def bar(baz: {'a', 'b'}) -> {'c'}: ...
Spam = Eggs[{True}]

i'm for it

meager nacelle
#

I don't think just saying it's a type expression gets you around that with how things work at runtime.

feral island
#

oh sorry you're proposing Literal instead of it meaning a set

#

which is also another argument against: it might not be obvious what it means

meager nacelle
#

Yeah there's that too, initially I thought sets.

#

My current annoyance with annotations is that STRING annotations currently end up evaluating the annotations due to needing to check VALUE_WITH_FAKE_GLOBALS in a non-fake globals environment and I kinda wish they didn't

orchid thicket
meager nacelle
#

Tuples are also in that why can't we page

orchid thicket
#

This would also just be generally more useful to more people since way less people use Literal when annotating in my experience

orchid thicket
#

ye i'm so fine with annoyingly having to deal with Literal if this gets in instead. way more useful imo

steel solstice
#

With literal a more common syntax fwiw is just union operators on the string/int literals themselves

orchid thicket
#

yeah, but it's unfortunately already taken by union. this would be kinda a mathy correctness alternative

orchid thicket
#

and the | operator has meaning with int
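
For reference, a quick sketch of what | already does on plain literal values at runtime, which is why it can't simply be reused for this:

```python
# On ints, | is bitwise OR and produces another int.
print(1 | 2)  # 3

# On strs, | is not defined at all.
try:
    "apple" | "banana"
except TypeError as exc:
    print(exc)
```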

feral island
#

though of course in all cases, there's a tradeoff (broadly, it makes runtime introspection harder)

#

we could decide we can live with that tradeoff or find another way to support runtime introspection, like Imogen's AST introspection

steel solstice
#

If only type hints were always strings or type context existed

meager nacelle
#

I think I've soured on AST runtime transformations as I can't see it not splitting things and being worse for performance

#

At least if they're being transformed at runtime

boreal umbra
#

!clban 1341358376680034376 some kind of claude AI gift code scam

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @onyx oasis permanently.

harsh sail
#

Hi

quick snow
#

!warn 1293650689796472892 Don't post rule-5-breaking videos. Also don't post off-topic stuff in non-offtopic channels.

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied warning to @empty sundial.

boreal umbra
#

Good luck to everyone who might hear about their pycon proposal today

grave jolt
#

A bit disappointed that we won't get a HAMT, but nice nonetheless

quick snow
# boreal umbra what's HAMT?

Hash Array Mapped Trie, see https://peps.python.org/pep-0603/
It's already part of Python as part of the contextvars implementation.

grave jolt
# boreal umbra what's HAMT?

To shortly outline why I'm sad: HAMTs allow you to efficiently derive changed values by using structural sharing. So if you have

fruits1 = ham({"apple": 0, "banana": 1, "cherry": 2})
fruits2 = fruits1.replace("banana", 69)

the replace operation is O(log N) instead of O(N)
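
To illustrate structural sharing, here is a deliberately simplified sketch. It is a path-copying binary search tree, not a real HAMT (no hashing, no bitmap-compressed nodes, no balancing), and Node, replace and lookup are made-up names.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Node:
    key: str
    value: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def replace(node, key, value):
    """Return a new tree with key set; untouched subtrees are reused."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return Node(node.key, node.value, replace(node.left, key, value), node.right)
    if key > node.key:
        return Node(node.key, node.value, node.left, replace(node.right, key, value))
    return Node(key, value, node.left, node.right)


def lookup(node, key):
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    raise KeyError(key)


fruits1 = None
for k, v in [("banana", 1), ("apple", 0), ("cherry", 2)]:
    fruits1 = replace(fruits1, k, v)

fruits2 = replace(fruits1, "cherry", 69)

print(lookup(fruits1, "cherry"), lookup(fruits2, "cherry"))  # 2 69
print(fruits2.left is fruits1.left)  # True: the "apple" subtree is shared
```

Only the nodes on the path to the changed key are copied, so the old version stays valid and the new one costs roughly the tree depth rather than the full size.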

#

This kind of structure is present in many functional languages. Having it in the stdlib would make functional programming practical for more kinds of things in Python

steel solstice
quick snow
grave jolt
#

don't let the boss man tell you not to do from _testinternalcapi import hamt

steel solstice
#

Yeah that

#

I'm not gonna say what I searched on the internet but it didn't turn up that

#

Or anything python related...

merry bramble
grave jolt
grave jolt
#

for legal reasons I have to state that this is a derivative work of this

meager nacelle
#

Reminded me that I had some help pages on classes that looked a bit like that due to weirdness with inspect.signature on some __init__ descriptors that I don't quite understand.

#

If you put a descriptor in place of __init__, inspect.signature will call .__get__(cls, type) and not .__get__(None, cls). I never figured out if this was a bug or I didn't understand something 🤔.

#

I get that it removes self for the purpose of getting a signature, but the function you'd get by doing that makes no sense.

meager nacelle
meager nacelle
#

I do actually like finally having lazy imports - it looks like it might finally let us have dataclasses with reasonable import performance.

#

I also see frozendict just got merged

spark magnet
meager nacelle
#

I mean within dataclasses

meager nacelle
meager nacelle
#

But there is also the case for libraries that want to do something specific if they encounter a dataclass but don't construct one themselves

#

They could potentially lazy import and use is_dataclass

#

Might be a case for making the annotationlib import lazy inside dataclasses for that too

#

There was already the case that configparser was going to use dataclasses, but using them tripled the import time so it didn't in the end.

quick snow
#

What's the plan in general for lazy imports in the stdlib?

deep dirge
meager nacelle
#

It would definitely be nice to remove some of the ugly in-line imports but we do need to be careful not to break the eager mode completely.

meager nacelle
winged sphinx
#

Understood, was just linking to the pep approval, I'm excited for it

fallen slateBOT
#

Lib/typing.py lines 1931 to 1936

@functools.cache
def _lazy_load_getattr_static():
    # Import getattr_static lazily so as not to slow down the import of typing.py
    # Cache the result so we don't slow down _ProtocolMeta.__instancecheck__ unnecessarily
    from inspect import getattr_static
    return getattr_static
meager nacelle
#

Yeah I'd seen that one, typing has about 3 different hacks for lazy imports iirc

#

I'm not actually sure how eager mode is implemented, ideally we could have a form of import that is always lazy but slightly more awkward to use?

meager nacelle
#
lazy import annotationlib
typing = __lazy_import__("typing")

print(globals()["annotationlib"])
print(globals()["typing"])

print(annotationlib)
print(typing)

With -Xlazy_imports=none

<module 'annotationlib' from '/home/ducksual/src/cpython/Lib/annotationlib.py'>
<lazy_import 'typing'>
<module 'annotationlib' from '/home/ducksual/src/cpython/Lib/annotationlib.py'>
<module 'typing' from '/home/ducksual/src/cpython/Lib/typing.py'>
merry bramble
merry bramble
meager nacelle
#

I'd like to know it's an intended 'feature' first - and check it always holds

#

Or maybe it's better to just make the PR and someone else can decide that

fleet frost
clear hill
#

do lazy imports have any C hooks? Might be neat to expose in PyO3. I recently added preliminary 3.15 support.

uneven raptor
#

I wouldn't be surprised if libraries ignored support for it

fleet frost
#

it's basically a kill switch to force eager imports, mostly for debugging

uneven raptor
#

yeah, but why?

fleet frost
uneven raptor
#

no, I don't, but I don't see how it'd be helpful in debugging side effects. it could basically tell you "your lazy imports break something, but I have no idea where or why!"

#

anyways, I'm sure people already discussed this on the forum, I won't bother

meager nacelle
#

I'm not sure what the actual practical use case is, but the flag was included in the PEP so it should probably be at least somewhat usable.

boreal umbra
#

has anyone gotten a notification about their pycon talk proposal today?

fleet frost
merry bramble
fleet frost
naive saddle
#

Realistically, we'd set the global sys(?) attribute to disable lazy imports, but effectively, we can't really use lazy imports.

uneven raptor
#

oh, is the problem in your deps?

naive saddle
#

We can't control what our vendored dependencies do. Actually, I'd imagine some of our dependencies would want to use lazy imports, but 'cause pip vendors them (and has a unique security perimeter), they can't be used.

uneven raptor
naive saddle
#

and for the difference between lazy imports and imports in functions, both are unsafe for us

uneven raptor
#

ugh, this seems like something that could have been solved with an audit event rather than a global filter system

uneven raptor
meager nacelle
#

IIRC this is an issue for pip because it's installing into the same folder it's running from so one pip install <malicious package> can both replace a part of pip (or vendored dependency) and subsequently trigger a code branch that imports the malicious replacement?

uneven raptor
#

right, but ISTM that could have been solved by installing an audit event that prevents lazy import so pip (or its deps) don’t do it by accident

meager nacelle
naive saddle
naive saddle
uneven raptor
#

yeah, that seems like a CVE waiting to happen. it’d be nice if pip had a different way of avoiding the problem, like preserving sys.path or something like that

quick snow
#

I don't really understand this.. 1. Can't packages anyways execute arbitrary code when being installed? (I guess not when installing wheels specifically?) 2. Doesn't this mean that using -X lazy_imports=none prevents wheels from executing code at install, but not from executing code when pip is run a second time?

#

Ah, I should just read the ticket:

For example, pip install --only-binary :all: could be used in a trusted context, before using the installed packages in an untrusted context (e.g. different stages in a build pipeline).

merry bramble
meager nacelle
#

The now closed PR #internals-and-peps message I brought up does demonstrate that if you patch out all of the lazy imports the stdlib at least will break. Not sure what the case is for pip's other dependencies.

meager nacelle
#

I remembered in an earlier stage testing pip with the equivalent of -X lazy_imports=all and it did give a reasonable start time improvement so it's kind of sad to not be able to get that due to this issue.

#

That said -X lazy_imports=all also appears to be slightly broken at the moment

uneven raptor
meager nacelle
#

Under the 'all' mode it seems some __getattr__ lazy imports don't always work? ./python -X lazy_imports=all -c "from typing import Match; print(Match)"

meager nacelle
#

Ah, no I see - it's broken with normal lazy imports too.

meager nacelle
#

The lazy from import puts a lazy_import object in the dict, which blocks the getattr from being called. I created an issue for it.
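
The shadowing can be reproduced on any current Python without lazy imports, since a module-level __getattr__ (PEP 562) only runs when the normal lookup in the module dict fails. In this sketch, demo is a throwaway module standing in for typing, and placeholder stands in for the lazy_import object.

```python
import re
import types

calls = []


def _module_getattr(name):
    # PEP 562 hook: consulted only if the name is not in the module dict.
    calls.append(name)
    if name in {"Match", "Pattern"}:
        return getattr(re, name)
    raise AttributeError(name)


demo = types.ModuleType("demo")
demo.__getattr__ = _module_getattr

print(demo.Match is re.Match)  # True: the hook supplied the real object

placeholder = object()
demo.__dict__["Pattern"] = placeholder
print(demo.Pattern is placeholder)  # True: the dict entry wins
print(calls)  # ['Match'], the hook never saw 'Pattern'
```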

#

python -c "import typing; lazy from typing import Match; print(typing.Match)"

fleet frost
meager nacelle
#

I'm just surprised it creates a lazy object in typing at all.

#

I would have expected it just to create the object in __main__ and attempting to use the object would then try to do the from import.

#

./python -c "import typing; print(typing.__dict__.get('Match')); lazy from typing import Match; print(typing.__dict__['Match'])"

None
<lazy_import 'typing.Match'>
#

I'm using typing.Match as that was the import that failed when I tried to pip list with forced lazy imports.

clear hill
#

On CPython’s main branch, PyMem_Raw apis have been updated to use mimalloc instead of the system allocator on free-threaded builds. For a ufunc-intensive benchmark of NumPy that does a lot of allocations for temporary arrays on worker threads, we see substantially improved scaling. This will benefit all projects that use the PyMem_Raw apis, which I find pretty neat.

fleet frost
clear hill
#

also this all came as a result of a human asking a question on stackoverflow where they didn’t understand why multiprocessing beats multithreading

#

stackoverflow: it’s still a thing!

slate spire
swift imp
crimson hatch
#

I have been thinking about this problem a bit, and I've been thinking about a design where python runs in venvs by default. I have some experimenting and noodling to do on the design though

meager nacelle
#

I think it (pip not running in the environment it is installing into) is kind of a mildly spicy take, not quite spicy enough to make you reach for a glass of milk.

#

Without breaking established workflows I'm not sure how painful it would be though.

#

I have a TUI tool for managing runtimes and venvs, part of the design is essentially that you should not be using your python runtimes outside of a venv. You can open a REPL in a runtime, but if you want a shell with a default Python you can install things in it has to be a venv.

naive saddle
#

I'm trying to get some actual resources to work on pip, but I don't need to tell anyone here how that story goes, of course.

meager nacelle
#

Yeah maintainer resources are always a problem, I figured that was why just removing lazy imports was the resolution to the previous issue

naive saddle
#

FWIW, I did try to experiment with eager imports but deferred execution (where the finder/loader finds and reads the source code at import-time, but the module isn't executed until use), but I didn't observe any nontrivial improvements in startup time.
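
For anyone wanting to try something in this direction, the stdlib already ships importlib.util.LazyLoader, which finds the module up front and defers executing it until first attribute access. This sketch follows the recipe in the importlib docs. It is not necessarily what was tried above, and unlike the experiment described it does not read the source early.

```python
import importlib.util
import sys


def lazy_import(name):
    """Locate the module now, but postpone running it until first use."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # only sets up the lazy module object
    return module


colorsys = lazy_import("colorsys")   # found, not executed yet
print(colorsys.rgb_to_hls(0, 0, 0))  # first attribute access runs the module
```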

#

So I'm not really convinced even if we had the resources to support lazy imports, it'd be worth it.

meager nacelle
#

I did run some pip commands with an earlier version of the lazy imports and remember there being some significant improvement, but the current main is broken if you try to force lazy imports for pip.

naive saddle
#

significant improvement just running pip or an actual pip command?

meager nacelle
#

pip list I think

naive saddle
#

hm, interesting

naive saddle
meager nacelle
#

Install I expect to be smaller, but that's also where I expect you'd disable lazy imports

#

At least without getting into reimplementing how pip works

naive saddle
#

Honestly, the performance of pip show/list isn't really a concern.

meager nacelle
#

idk, it's just annoyingly unnecessarily slower as it's spending a fair chunk of its time importing modules it will never use

#

But that's also true of all the other packaging tools written in python

naive saddle
#

And the bigger improvements will come from faster installation metadata reading, which is blocked on either JSON distribution metadata (needs a PEP) or someone contributing a faster/C accelerated email parser in the stdlib.

meager nacelle
naive saddle
#

I mean, I get that people generally want faster tools, but out of all pip's performance woes, this is not something we're really worried about.

#

There is something to be said about keeping lazy imports for non-destructive commands (and auto-completion), but we'll cross that bridge when 3.15 is near release.

meager nacelle
#

It's fair enough if you don't have the maintainer bandwidth, pip isn't even the worst among the tools.

#

I just don't think the lazy_imports=none flag is the best or even a good tool to deal with the lazy imports issue.

naive saddle
#

well, we need something

naive saddle
meager nacelle
#

I've mentioned this in the DPO thread, but I think what you need right now would just be a way to force all existing lazy imports to resolve. Which PEP-810 lazy imports make easier than the current condition.

naive saddle
#

sure

#

I have no opinion on PEP 810 TBH.

#

I only got involved in this thread in the first place because people were curious as to why -X lazy_imports is even a thing.

meager nacelle
#

IE, before you do the install process, you just loop over every module dict, and .resolve() every lazy import type.
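
A very rough sketch of that loop. Big assumption: the lazy proxy type is exposed as types.LazyImportType and has a resolve() method, as I understand PEP 810 to describe. This is unverified against a 3.15 build, and on interpreters without that type the function does nothing. resolve_pending_lazy_imports is a made-up name.

```python
import sys
import types

# Assumption: PEP 810's proxy type lives at types.LazyImportType.
_LAZY_TYPE = getattr(types, "LazyImportType", None)


def resolve_pending_lazy_imports() -> int:
    """Force every lazy import found in a loaded module's dict."""
    resolved = 0
    if _LAZY_TYPE is None:
        return resolved
    seen = set()
    while True:
        # Resolving imports can load new modules, so keep sweeping
        # until a pass finds no module we have not visited.
        pending = [
            m for m in list(sys.modules.values())
            if isinstance(m, types.ModuleType) and id(m) not in seen
        ]
        if not pending:
            return resolved
        for module in pending:
            seen.add(id(module))
            for value in list(vars(module).values()):
                if isinstance(value, _LAZY_TYPE):
                    value.resolve()  # assumed method name, see above
                    resolved += 1


print(resolve_pending_lazy_imports())
```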

#

(and so on for all of the new modules that have been imported)

naive saddle
#

ok!

#

I'm really the wrong person to talk about this since I really don't care (I'm sure we'll care once we get around to supporting 3.15)

#

but I have zero desire to be involved in a standards discussion, hah

meager nacelle
#

Without PEP-810 imports, the imports can be lurking in functions and classes. If everyone uses PEP-810 imports then they'll all be at module level (PEP-810 imports are only allowed at module level)

naive saddle
#

I don't think people are going to be removing function imports any time soon

#

pip still supports 3.9+ FWIW

meager nacelle
#

Oh yeah, but the idea is that it makes things better rather than worse.

#

New lazy imports are easier to handle than old ones

#

Might take a few years though

slate spire
#

i only vaguely had in mind something like the installed "pip" and what "python -m pip" winds up actually invoking being a pip run from a dedicated venv, passing along all of the relevant python sys config and install/env info paths for the python being installed into. but i predict plumbing all of that could be... complicated. (i haven't looked inside the pip code box in eons).

#

uv happens to sidestep this given it isn't implemented in python so solved plumbing this information out of necessity.

#

In my mind this is similar to "native compilation" vs "cross compilation" toolchain wise. Projects naturally start out presuming native and encounter pain later when they need to move to cross compilation because the presumption of local runtime & tools & env matching execution runtime & tools & env is baked in implicitly all over the place.

#

I realize this is all about maintainer bandwidth in the end.

naive saddle
#

None of this is particularly hard (as uv has shown), but A) developing that plumbing and, this is the real kicker, B) managing that transition is impossible with our resources

#

uv was much better designed from scratch

meager nacelle
#

Helps to not have all the existing workflows to break

uneven raptor
swift imp
swift imp
uneven raptor
#

faster

swift imp
# uneven raptor faster

It's that slow it requires a rewrite? Almost all stdlib could be made faster if an extension. Why is email so special

uneven raptor
#

email is complicated and prone to security problems, so rewriting it in C would generally be too much of a maintenance burden. rust is, hopefully, easier to write than C

raven ridge
#

Huh. I've never looked at the module, what makes it hard to have a fast pure Python implementation?

uneven raptor
naive saddle
#

It's probably not that slow, but the thing is a lot of basic packaging operations involve parsing these metadata files so the overhead adds up.

swift imp
meager nacelle
#

That's not overly surprising, assuming they're sdists they can execute arbitrary code and by default pip will attempt to build from sdist.

#

The issue with the lazy import trick was that it would work even with --only-binary :all: which shouldn't execute arbitrary code on its own.

raven ridge
#

iow this is a point that only really matters to security researchers who want to build untrusted code. It doesn't really make a practical difference for anyone else

#

If your goal was to install some package (which turned out to be malicious), and then run the Python from the environment you installed it into, you're still pwned

clear hill
crimson hatch
#

Actually, I guess I could see an atomic store maybe causing troubles on aarch64?

clear hill
#

is the test that’s failing even multithreaded?

#

no, I don’t think so

#

i wish github actions were less opaque 🤷‍♂️

crimson hatch
#

Yeah I was imagining maybe something wrong with the C11 atomics implementation in ubuntu 22.04 on ARM, maybe?

#

But probably something else going on

whole jewel
whole jewel
# whole jewel

This (improper recognition of error and/or location) only seems to appear when typed in REPL, but not when run from file.

deep dirge
#

This was fixed recently by Pablo, it was an issue in traceback.py IIRC

boreal umbra
#

I often wish that the default iteration behavior for a dict is key-value pairs (a la dict.items)

quick snow
boreal umbra
quick snow
#

Yeah I hate that idea ducky_devil

boreal umbra
#

it's okay 🫂

merry bramble
boreal umbra
spark magnet
#

implicit iteration is handy, but can be confusing. I would have been ok with needing to use .keys(), .values(), or .items() in order to iterate a dict. And strings could have required .chars()

quick snow
#

!tempban 1324082847384080437 3d You've had multiple warnings. Asking for paid work is, as you know, not acceptable here.

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @tight field until <t:1772146112:f> (3 days).

swift imp
boreal umbra
static hinge
#

Alternatively, dict shouldn't be iterable at all.

clear hill
#

in a different language maybe but that would break even more code than making print a function did

#

unfortunately the case for basically any backward incompatible change to a builtin

feral island
#

If we were starting from scratch I'd be with @spark magnet and only allow iterating over .items(), .keys() etc., not the dict itself, but at this point that's way too disruptive a change to ever make

zenith topaz
#

That's one for python 4

swift imp
#

Why I think .items should be the base iteration not the keys

raven ridge
#

Do you think if key in the_dict: should work?

swift imp
# raven ridge Do you think `if key in the_dict:` should work?

I don't think base iteration and the containment check need to be similar

You're hinting this piece from the data model page?

It is recommended that both mappings and sequences implement the __contains__() method to allow efficient use of the in operator; for mappings, in should search the mapping’s keys; for sequences, it should search through the values. It is further recommended that both mappings and sequences implement the __iter__() method to allow efficient iteration through the container; for mappings, __iter__() should iterate through the object’s keys; for sequences, it should iterate through the values.

swift imp
#

If it didn't, how would you check for composite keys?

raven ridge
#

For every container type that I can think of, `found = x in y` is logically equivalent to
```py
found = False
for e in y:
    if e == x:
        found = True
        break
```
and to `found = x in list(y)`

I expect breaking that symmetry would subtly break lots of things

#

Fundamentally, the in operator checks whether a collection contains a particular thing, and a for-each loop loops over all the things in a collection. It would be weird and asymmetrical for the "things" in a dict to be kv pairs for the purposes of a for-each loop, but keys for the purposes of a containment check.
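A small sketch of that symmetry as it stands today (the dict contents are arbitrary example data):

```py
d = {"a": 1, "b": 2}

# Today both the loop and the containment check deal in keys.
looped = [x for x in d]
all_contained = all(x in d for x in d)

# A key-value pair is not "in" the dict itself...
pair_in_dict = ("a", 1) in d

# ...but the items view is consistent with itself: it iterates
# over pairs, and its containment check expects pairs too.
pair_in_items = ("a", 1) in d.items()
every_item_contained = all(item in d.items() for item in d.items())
```

So each of `d`, `d.keys()` and `d.items()` keeps `x in y` and `for x in y` in agreement about what `x` is.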

clear hill
#

yeah, you’d have to turn off __contains__ and do key in d.keys() explicitly

#

it probably made more sense when keys and items returned lists instead of lazy iterators

boreal umbra
swift imp
#

KeysView and ValuesView, isn't it?

#

from typing module

boreal umbra
#

I think so

swift imp
#

says deprecated

boreal umbra
#

Is the actual type depreciated, or just the hint?

#

Deprecated*

swift imp
swift imp
raven ridge
swift imp
swift imp
#

I'm not a language designer and don't want to be, so I digress

raven ridge
#

I think that it's quite nice from a language design perspective that x in y and for x in y are analogous - that x is the same data type in both cases. I think that if you want for x in y to loop over kv pairs if y is a dict, that x in y would also need to expect x to be a kv pair

clear hill
#

it still works this way

boreal umbra
swift imp
#

I didn't know ItemsView supported the set operations
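For anyone else who hadn't seen it, a minimal sketch of the set operations on views (the two dicts are made-up example data). The items view only behaves like a set when the values are hashable, and `values()` does not support these operators:

```py
old = {"host": "localhost", "port": 8080, "debug": True}
new = {"host": "localhost", "port": 9090}

# Key-value pairs present in both dicts.
unchanged = old.items() & new.items()

# Pairs that exist in old but not (with the same value) in new.
only_in_old = old.items() - new.items()

# The keys view supports the same operators.
shared_keys = old.keys() & new.keys()
removed_keys = old.keys() - new.keys()
```

Each result is a plain `set`, not another view.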

swift imp
#

I didn't know that you could hold on to a view, update the originating dict, and have the view reflect the update

#

I always assumed they were throw away

raven ridge
# boreal umbra In what sense are they lazy?

The sense that they do not eagerly copy the keys/values/items from the underlying dictionary when they are constructed, and so reflect changes to the underlying dictionary made after they are constructed

#

The Nth key to be yielded when iterating over d.keys() is not determined until the Nth time the iterator is advanced
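A quick sketch of that behaviour, with arbitrary example keys:

```py
d = {"a": 1}
keys = d.keys()  # no copy of the keys is made here

d["b"] = 2
after_insert = list(keys)  # the view sees the new key

del d["a"]
after_delete = list(keys)  # and it sees the deletion too

length = len(keys)  # len() is also computed from the live dict
```

The view is just a window onto `d`, so it is never out of date.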

boreal umbra
swift imp
#

I suppose that's not surprising though?

boreal umbra
swift imp
#

maybe I'm thinking about it wrong, but the KeysView has to in some way know the full sequence of keys for that RuntimeError to have occurred, right?
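For reference, a small sketch that triggers the error in question. As far as I know, CPython's iterator does not need the full key sequence up front: it notes the dict's size when it is created and compares on each step. That is an implementation detail, not a language guarantee:

```py
d = {"a": 1, "b": 2}
seen = []

try:
    for key in d.keys():
        seen.append(key)
        d[key + "_copy"] = 0  # changes the dict's size mid-iteration
except RuntimeError as exc:
    message = str(exc)
else:
    message = None
```

The loop yields the first key normally and only fails when the iterator is advanced again after the insert.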