#internals-and-peps


thick hemlock
#

from reading some ruff rules it sounds like something in their realm of complexity

#

but i think you're right, i don't know if there's a better way than hoping a linter would catch it

#

but maybe if an Iterator wasn't an Iterable this would be trivial? Since then it would be safe to assume that __next__ does have side effects but __iter__ doesn't? see my next comment for better explanation

flat gazelle
#

There are iterables which are sort of fundamentally single-use, for example a file
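A minimal sketch of that single-use behaviour. `io.StringIO` stands in for an open file here (an assumption, so the snippet needs nothing on disk); the second pass over it yields nothing, while a list can be walked again.

```python
import io

# io.StringIO stands in for an open text file (assumption for the example).
fake_file = io.StringIO("first\nsecond\n")

first_pass = [line for line in fake_file]
second_pass = [line for line in fake_file]  # the stream is already at EOF

numbers = [1, 2, 3]
list_first = [n for n in numbers]
list_second = [n for n in numbers]  # a list hands out a fresh iterator each time

print(first_pass, second_pass)
print(list_first, list_second)
```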

#

julia has an iteration API that looks something like `new_iterator, value = next(old_iterator)`, which is great, since you can make `next` a side-effect free function. But well, an iterable going over every line of a file can't really be reusable like that.
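A rough Python sketch of that shape, modelled on the line quoted above rather than on Julia's actual API. `PureIterator` and `pure_next` are hypothetical names; the point is only that `pure_next` returns a new iterator and leaves the old one untouched.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class PureIterator(NamedTuple):
    """Hypothetical immutable iterator: a sequence plus a position."""
    items: Sequence[int]
    index: int = 0


def pure_next(it: PureIterator) -> Optional[Tuple[PureIterator, int]]:
    """Return (new_iterator, value), or None when exhausted; never mutates `it`."""
    if it.index >= len(it.items):
        return None
    return PureIterator(it.items, it.index + 1), it.items[it.index]


start = PureIterator((10, 20, 30))
after_first, value = pure_next(start)

# `start` is unchanged, so asking again gives the same answer.
_, value_again = pure_next(start)
```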
thick hemlock
#

I think lists in java are not iterable but will return an iterable that is NOT themselves

#

meaning if you required (using intersection types or smthn) for something with __iter__ to not have __next__ then that would work, right?

flat gazelle
#

I'm a bit lost, work for what?

thick hemlock
#

i meant that a file would be an Iterable but a list wouldn't be

#

sorry, maybe im confusing how iterators work so this is non-sense

flat gazelle
thick hemlock
#

yes

#

i meant the opposite my bad

#

so a list would have __iter__ that returns a new object which has __next__, but a file would have only __next__
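For comparison, a small sketch of what today's Python does: a list already behaves this way, while a file-like object has both methods and its `__iter__` returns itself. `io.StringIO` is used as a stand-in for a file (an assumption for the example).

```python
import io

numbers = [1, 2, 3]
list_iter = iter(numbers)

list_iter_is_list = list_iter is numbers            # separate iterator object
list_has_next = hasattr(numbers, "__next__")        # lists have no __next__
iterator_returns_self = iter(list_iter) is list_iter

stream = io.StringIO("a\nb\n")
stream_returns_self = iter(stream) is stream        # file-like: its own iterator

print(list_iter_is_list, list_has_next, iterator_returns_self, stream_returns_self)
```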

flat gazelle
#

IG a for loop would then try to call __iter__, and if it fails, call __next__ directly?
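A sketch of that imagined loop, purely to make the idea concrete. `hypothetical_for` and `NextOnly` are made-up names, and this is not how CPython's `for` statement works today.

```python
from typing import Any, Callable, List


def hypothetical_for(obj: Any, body: Callable[[Any], None]) -> None:
    """Imagined protocol: use __iter__ if present, else call __next__ on obj itself."""
    make_iter = getattr(type(obj), "__iter__", None)
    iterator = make_iter(obj) if make_iter is not None else obj
    while True:
        try:
            item = type(iterator).__next__(iterator)
        except StopIteration:
            break
        body(item)


class NextOnly:
    """Hypothetical single-use source: defines __next__ but no __iter__."""

    def __init__(self, limit: int) -> None:
        self.remaining = limit

    def __next__(self) -> int:
        if self.remaining == 0:
            raise StopIteration
        self.remaining -= 1
        return self.remaining


seen: List[int] = []
hypothetical_for(NextOnly(3), seen.append)  # falls back to __next__
hypothetical_for([7, 8], seen.append)       # goes through __iter__
```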

radiant topaz
#

honestly i would have expected a different type of exception to denote this, like StopIteration vs DeadIteration or something

raven ridge
#

collections.abc.Collection is probably the best type there. Or, you can update the code to work with any iterable: ```py
def iterate_twice(inp: ...):
    inp = list(inp)  # copy once so both loops see every element
    for a in inp:
        pass

    for b in inp:
        pass
```

raven ridge
thick hemlock
#

yeah I know

#

see this quote

thick hemlock
sturdy timber
thick hemlock
#

damn

#

cool

#

btw on that topic

#

is that something there's a PEP for? I remember reading about "...when we get negation types..." but couldn't find anything about it

sturdy timber
#

Not sure what the current state of the proposal is

thick hemlock
#

thanks!

raven ridge
thick hemlock
#

yes, I meant to say that it's too late now since it would be breaking

#

just talking ideally

raven ridge
#

It's not clear to me that what you're proposing would be ideal, either

#

for line in file: is a nice pattern, and you wouldn't be able to do that if files didn't have a __iter__

halcyon trail
#

which proposal is that?

final geode
#

Looks great! A few things I noticed while reading:

  • "The internals of the pull request itself.": Due to a GitHub bug, the PR looks empty. You should probably just link to the actual commit for this: https://github.com/python/cpython/commit/f6d9e5926b6138994eaa60d1c36462e36105733d
  • "the flags for the windows builds are the same": They aren't, but they're similar. They're spelled --experimental-jit, --experimental-jit-off, and --experimental-jit-interpreter
  • "--enable the-experimental-jit=yes-off": This option should be --enable-experimental-jit=yes-off. Probably also worth calling out the PYTHON_JIT=1 environment variable here (PYTHON_JIT=0 will also turn the JIT off in the other modes).
  • "--enable-experimental-jit=interpreter": Maybe worth clarifying that this runs the same code as the JIT, but without actually jitting anything. Also worth calling out that it doesn't require LLVM, it works anywhere, and it's quite a bit slower.
  • I don't see any mention of needing to install LLVM. Probably worth calling this out and linking to the Tools/jit/README.md for instructions.
  • "*Except in frames that are owned by the C Stack, i.e. C extensions calling into Python.": That's not really what "owned by the C stack" means. It's pretty subtle and not too important here, so I'd just leave it out (basically, these are "secret" shim frames that we sneak in whenever we re-enter the interpreter loop).
  • "and creates an nice HTML summary": It creates a markdown summary, not HTML.
  • "--enable-experimental-jit jit": Repeated "jit" here.
  • For the deeper dive, might be worth directing interested people to the jit_stencils.h file now sitting in their build directory. 🙂
thick hemlock
halcyon trail
#

i think it's a mistake to conflate Iterable vs Iterator, with single pass vs multi pass iteration

#

I mean most languages have something similar to Iterable and Iterator, and the distinction is never related to single vs multi pass.
Iterable vs Iterator is just for things like for loops, and other things (e.g. flat_map type functions), to understand when a type can be iterated, without needing to implement the iterator API directly on the type
most languages don't really do great at expressing that something can only be used once - in Rust you can express this, and the way it gets expressed is basically by what the type of the argument is in the Iterable (IntoIterator) to Iterator transformation

#

Like, for a Vec<T> (the equivalent of python's list[T]), IntoIterator is implemented separately for:

  • Vec<T> - yields an Iterator<T>
  • &Vec<T> - yields an Iterator<&T>
  • &mut Vec<T> - yields an Iterator<&mut T>
    In the first case, it really is single pass as the transformation of the Vec into an Iterator consumes the Vec - you can't iterate a second time
thick hemlock
#

I'm not really sure about rust, but I'll give the Java example. This is the first stackoverflow answer to Iterable vs Iterator in Java:

An Iterable is a simple representation of a series of elements that can be iterated over. It does not have any iteration state such as a "current element". Instead, it has one method that produces an Iterator.

An Iterator is the object with iteration state. It lets you check if it has more elements using hasNext() and move to the next element (if any) using next().

Typically, an Iterable should be able to produce any number of valid Iterators.

#

The last line explicitly mentions multi-pass, but even if it wasn't there, the distinction is that the Iterable does not have an iteration state. An object with an iteration state, by definition, will only be iterable once.

#

@halcyon trail ^

spark magnet
raven ridge
#

the last line says "typically" exactly because some Java iterables are consumed by iterating over them

#

which is exactly the same situation as in Python

halcyon trail
#

not much of a Java dev, but yes, what godlygeek says is what I would expect

thick hemlock
spark magnet
raven ridge
#

I'm not sure that iterators being iterables makes any difference. You're assuming that an iterable can be used to produce any number of iterators. That's not guaranteed by either language.

spark magnet
#

@thick hemlock i missed the start of this discussion, so I might be missing where you are headed.

halcyon trail
#

the problem is basically this: what do you do when you have a class that holds some state, outside of the iteration state.
you want to be able to iterate over it - but iterating over it will "consume" the associated resources?
An example of this would be something like a socket - once you read the data it's gone, you don't even have the option to re-open it like a file.

thick hemlock
#

I do think that makes a difference in relation to typing, no?

I mean, assuming these 2 interfaces weren't related, like in Java, in this function

def foo(it: Iterable):
    pass

it is, not guaranteed, but typically, safe to iterate it twice as it has no state.

however in this:

def bar(it: Iterator):
   pass

you can assume that once you exhaust it, you can't iterate it again

halcyon trail
#

that's why both python and Java, as godly says, "occasionally" have classes that are Iterable, but can only be Iterated once.

#

you don't want to make them directly Iterators because someone may not intend to iterate over them immediately

#

so you wouldn't want to store that Iterator state

thick hemlock
#

so a function that accepts something like a socket would require an Iterator, but a function that doesn't want to consume it after iterating will ask for an Iterable

halcyon trail
#

but there's no way to not consume it - that's my point

thick hemlock
spark magnet
#

@thick hemlock tbh, i think you are putting too much on the types here, and "Iterator extends Iterable" is not a Python-style sentence.

raven ridge
halcyon trail
thick hemlock
halcyon trail
#

the point is that when you create an Iterator, you get to call a function that returns a different type, and can do any initialization you need

#

as an example you might want to create a buffer

thick hemlock
spark magnet
halcyon trail
spark magnet
#

@thick hemlock there are no other guarantees. __iter__ might return the same object every time.

halcyon trail
#

laziness isn't free either

raven ridge
# raven ridge > assuming these 2 interfaces weren't related, ... in this function ... it is, n...

if you type the function as accepting an Iterator, you may or may not be able to iterate over it twice (but probably can't).
If you type the function as accepting an Iterable, you may or may not be able to iterate over it twice (but probably can).
Aren't those two things equivalent, as far as how generic code would need to deal with them? You can't assume that it's consumed in either case, you can't assume that it's repeatable in either case.

halcyon trail
#

Like, if you really want this in the type system so bad, then the actual solution would be to have another interface: MultiIterable
MultiIterable has the same API as Iterable, i.e. you can get an Iterator from it

#

except it conceptually "promises" that you can iterate over it multiple times

halcyon trail
#

as it happens, MultiIterable already exists, and it's called Collection 🤷‍♂️

raven ridge
halcyon trail
#

it's not exact technically since Collection also requires a length, but like in 99% of cases this is true

#

If you want to accept something in generic code that you are guaranteed to be able to iterate over multiple times

#

Collection[Whatever] is your best bet
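A quick runtime sketch of which objects count as a `Collection` (it needs `__len__`, `__iter__` and `__contains__`). The `squares` generator is just an illustrative single-pass source.

```python
from collections.abc import Collection, Iterable


def squares():
    """Illustrative single-pass source."""
    for n in range(3):
        yield n * n


candidates = {
    "list": [1, 2, 3],
    "set": {1, 2, 3},
    "dict": {"a": 1},
    "range": range(3),
    "generator": squares(),
    "list iterator": iter([1, 2, 3]),
}

# Everything here is Iterable, but only the sized containers are Collections.
is_collection = {name: isinstance(obj, Collection) for name, obj in candidates.items()}
is_iterable = {name: isinstance(obj, Iterable) for name, obj in candidates.items()}

for name in candidates:
    print(f"{name}: Iterable={is_iterable[name]} Collection={is_collection[name]}")
```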

spark magnet
#

@thick hemlock are you trying to guarantee re-iterability in the type system?

halcyon trail
#

either that or simply accept an Iterator and construct a list from it

thick hemlock
# spark magnet <@217353364076232705> there are no other guarantees. `__iter__` might return th...

I'm aware there's no guarantees, it can do something completely wild and return a random set of elements on each call.

However, if the inheritance tree would be different, the iterator would have only __iter__, and not __next__. The actual iteration in a for loop will only happen using __next__ in this imaginary world, so it's much less likely for it to hold state about the current iteration if it doesn't have __next__

thick hemlock
halcyon trail
#

if the for loop iteration only used next then you'd need to manually call iter yourself

spark magnet
thick hemlock
raven ridge
#

the current status quo is that, if you need to do two passes over the thing, you need to copy all of the elements into a temporary (regardless of whether the thing is an iterator or non-iterator iterable).
In the world you're proposing, that's all still true, right?

halcyon trail
raven ridge
#

so - what's the point, then, if it doesn't change anything about how you interact with the object?

halcyon trail
#

short of deliberate obfuscation, it's hard to understand how someone will be constructing a Collection[Foo] that doesn't allow iterating more than once

raven ridge
thick hemlock
#

Maybe that's the gap then? I'm not sure of the difference. I thought the ABCs define this tree?

raven ridge
#

the ABCs are just a convenient way for you to implement your own containers

grave jolt
#

or was that your point?

raven ridge
#

that was exactly my point

#

if it needs to work with any iterable, you need to cope with the possibility that the iterable is only able to be iterated once (like data streamed over a socket). That's true even if you only allow non-iterator iterables

spark magnet
raven ridge
#

the overwhelming majority of iterators don't inherit from collections.abc.Iterator

halcyon trail
#

I guess mypy et al just "magic" that in

thick hemlock
halcyon trail
#

ah actually you'd use the ones from typing I suppose

raven ridge
halcyon trail
#

or are the ones from typing deprecated now? I forget

spark magnet
halcyon trail
grave jolt
#

type checkers treat collections.abc.Iterator and friends as protocols
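The runtime ABCs behave structurally for these simple cases too. A small sketch with a hypothetical `Countdown` class that inherits from nothing but still passes the `isinstance` check, because it defines `__iter__` and `__next__`:

```python
from collections.abc import Iterator


class Countdown:
    """Hypothetical iterator that does not inherit from any ABC."""

    def __init__(self, start: int) -> None:
        self.current = start

    def __iter__(self) -> "Countdown":
        return self

    def __next__(self) -> int:
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


counts_as_iterator = isinstance(Countdown(3), Iterator)
inherits_from_abc = Iterator in Countdown.__mro__
values = list(Countdown(3))

print(counts_as_iterator, inherits_from_abc, values)
```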

raven ridge
halcyon trail
spark magnet
grave jolt
halcyon trail
#

Honestly if I had ignored the python inhibition against "star imports" and made my team a small prelude of types, I'd be in better shape

thick hemlock
halcyon trail
grave jolt
#

yeah it's the same pep585

spark magnet
halcyon trail
#

was Union to | there too

raven ridge
thick hemlock
steel solstice
spark magnet
halcyon trail
thick hemlock
raven ridge
halcyon trail
#

yes, that's an assumption. Nothing about the Collection API says anything about locks.

#

I just don't really see the point dragging threading matters into this.

raven ridge
#

sure, fine

halcyon trail
#

that's ultimately not handled by the type system, either way

thick hemlock
#

btw, even the collection API doesn't guarantee multiple passes are safe as far as i'm concerned

spark magnet
halcyon trail
#

if you receive a Collection[Foo] and your function doesn't otherwise mutate it, it's reasonable to assume you can iterate over it twice, IMHO

raven ridge
#

they're proposing that a file should not have __iter__

halcyon trail
#

and get the same results

#

assuming Foo is like, some simple piece of data or something

thick hemlock
spark magnet
#

i guess I am wrong: an open file is an iterator

thick hemlock
#

hm?

halcyon trail
#

@grave jolt so far I have "best practices" changes in two python versions - 3.8 and 3.9. I will find more...

thick hemlock
#

in the current tree, being an Iterator means it's also an Iterable

halcyon trail
#

sorry - 3.9 and 3.10

halcyon trail
# thick hemlock <@512649489300062228>

I mean it depends what you mean by "guarantee" - as I said, in practice I think it is guaranteed, given the nature of a collection, unless someone is abusing it

grave jolt
#

import typing and import typing as t fans fr in shambles after the collections.abc.* thing

raven ridge
halcyon trail
#

but again - semantic requirements are up to people to agree on

thick hemlock
halcyon trail
#

Is accessing something via the Mapping API guaranteed not to mutate it?

#

In a sane world yes, but not in the python standard library
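A sketch of the standard-library case in question, `collections.defaultdict`: it is a `Mapping`, yet a plain lookup of a missing key inserts that key. The `peek` helper is a hypothetical function written only for this example.

```python
from collections import defaultdict
from collections.abc import Mapping


def peek(mapping: Mapping, key: str) -> int:
    """Looks read-only as far as the Mapping annotation is concerned."""
    try:
        return mapping[key]
    except KeyError:
        return -1


counts = defaultdict(int)
size_before = len(counts)
result = peek(counts, "missing")  # __getitem__ creates the entry
size_after = len(counts)

print(result, size_before, size_after)
```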

halcyon trail
#

good luck finding a language or a type system that can save you from that.

grave jolt
#

Haskell

halcyon trail
#

i knew it

#

i knew someone would say it

#

get out

grave jolt
#

can't have bad code if you don't have any

halcyon trail
#

😂

#

@thick hemlock it's already pretty hard to abuse Collection - it has a length, which means you have to already know exactly how many elements you have, which if you're retrieving it from a file/socket/database etc, you pretty much never do.

thick hemlock
thick hemlock
raven ridge
thick hemlock
#

give me a sec to read on that as im not familiar.

raven ridge
#

it represents a blob of data that may or may not correspond to a file, but if it does, iterating over the thing would be done by iterating over the file

thick hemlock
#

The doc says this:

The Resource type is defined as Union[str, os.PathLike].

raven ridge
#

ah, sorry, I meant importlib.resources.abc.Traversable (that happens to not be iterable, but imagine it was)

thick hemlock
#

either way, the point is, that if you know that iterating your data will consume it, then you should only implement __next__ and not __iter__. This is a way for the person writing the class to communicate that

raven ridge
#

and what if you don't know?

#

something like zip()

thick hemlock
#

in the current state, even if it's consumable, you will implement __next__, and then __iter__ returning yourself, right?

raven ridge
#

yep, you're required to
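A short sketch of why, in current Python: an object with only `__next__` cannot be used in a `for` loop at all. `NextOnly` and `ProperIterator` are hypothetical names for the example.

```python
class NextOnly:
    """Defines __next__ but deliberately omits __iter__."""

    def __init__(self) -> None:
        self.values = [1, 2]

    def __next__(self) -> int:
        if not self.values:
            raise StopIteration
        return self.values.pop(0)


class ProperIterator(NextOnly):
    """Adds the conventional __iter__ that returns self."""

    def __iter__(self) -> "ProperIterator":
        return self


try:
    for _ in NextOnly():  # the for statement calls iter() first, which fails
        pass
    loop_error = None
except TypeError as exc:
    loop_error = exc

first_value = next(NextOnly())  # next() on its own still works
collected = list(ProperIterator())
```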

thick hemlock
raven ridge
#

so everything generic that wraps an arbitrary iterable would implement __next__ and not __iter__

thick hemlock
#

for readers jumping in the middle: this is a discussion about a change of behavior, im not saying this is the current state - I will try to say __new_next__ and __new_iter__ to make the distinction clearer as this is getting confusing

halcyon trail
#

I'm just curious what about this is better than MultiIterable?

#

Like, as long as we're changing existing stuff - that seems strictly better

raven ridge
thick hemlock
thick hemlock
#

basically this idea lets me annotate whether it's consumed or not after iteration

halcyon trail
#

like, we all agree that by virtue of its API, an Iterator is always single pass.
so now the question is whether the Iterable promises that it can produce more than one Iterator that are in some sense "the same", i.e. each of them will yield the same sequence of results

#

the very obvious way to do that is to have a separate type MultiIterable

#

it has the same syntactic requirement as Iterable

raven ridge
halcyon trail
#

but it has the semantic requirement on users, that you expect it to be able to produce multiple iterators

#

This seems strictly better than "implement next directly on it but not iter"

raven ridge
thick hemlock
halcyon trail
thick hemlock
#

okay wait let me think about the multiiterable thing

halcyon trail
#

In practical terms, as I've said, Collection is already basically MultiIterable, and it adds a new piece of API (len) so you can tell them apart structurally

#

technically technically, you could imagine cases where you can iterate well multiple times but not have a length but this is getting super niche

halcyon trail
thick hemlock
#

yes I guess that is the upside and downside

#

if you're talking about an API addition that won't break anything then obviously your idea is better

#

but at least in my mindset, it doesn't "naturally" happen, a.k.a someone can forget to mark it

#

with the API I propose as long as you do what the API says you should in __new_next__ and __new_iter__, you can annotate that purpose for everything

halcyon trail
#

well, it does happen naturally in the sense of - if you can iterate something twice, that generally means you already have the elements in memory, which means you can have a length

#

so the length ends up being a market

#

*marker

#

like, yes, someone could somehow implement a Collection that doesn't really iterate the same way twice - but they can also do that with your new_next stuff too

#

can't stop people from writing bad code

#

The point is that if you want a reasonable type to accept into a function that ensures you can iterate multiple times -

def foo(x: ?):
    for e in x:
        if e.blub():
            return e
    print("No blub found, printing everything!")
    for e in x:
        print(e)

it's perfectly reasonable today to annotate this as x: Collection[Foo]

thick hemlock
#

also, i can think of one concrete use case @raven ridge - if this is not annotatable then the person calling my function may call list() before passing it to be sure, and i, inside my function, will call list() again to be sure

halcyon trail
#

at least, I wouldn't have a problem with this

thick hemlock
halcyon trail
#

it is annotatable though...

thick hemlock
#

not if it doesnt have a length?

halcyon trail
#

And how common is this? What are your examples of types that can be iterated multiple times but don't have a length?

#

Even if such a type exists - it still solves the original problem pretty well. If you keep annotating Collection[Foo] down the call stack, then worst case the very top caller will have to do list(weird_multi_iterable_type_without_len)

#

so you create a single redundant list, rather than doing it repeatedly 🤷‍♂️

#

but again - I'd be curious where all these examples of multi iterable types that don't have length are

grave jolt
#

Multi-iterable as in, Iterables but not Iterators?

halcyon trail
#

I guess it's not totally clear exactly what it means.
you could have something that's Iterable, that lets you iterate over a bunch of results from a database.
maybe technically you can get iterators from it each time - but each time you do, you're just doing a new query on the database.

#

so you can "multi iterate" over it, but you probably shouldn't do it light

#

*lightly

thick hemlock
halcyon trail
#

the idea of Collection is basically - the only time it's really cheap to iterate multiple times is when it's already all in memory. And when that's the case - you have a length.

thick hemlock
halcyon trail
#

if you don't have a length then you don't already have the elements and you're probably doing significant work to retrieve them. So creating a list and then multi-iterating over that is generally better anyhow.

#

I mean the simplest and most common explanation is just that your colleague wanted to iterate multiple times without overhead - but also wanted to be able to pass both, say, lists and sets

raven ridge
thick hemlock
#

this would be useful because sometimes you want to iterate the result-set twice

thick hemlock
halcyon trail
#

i mean it sounds like you're talking about lazy caching, are you not?

thick hemlock
#

I am

halcyon trail
#

I mean yeah, there exists uses for such things - it just tends to be very niche

thick hemlock
#

I think it can be useful in any IO consumption which certainly isn't niche

halcyon trail
#

lazy caching for something like this could be very error prone - you could keep the RepeatableCursor around and try to re-iterate expecting fresh results, but get the cache

#

i mean, no?

thick hemlock
#

not sure what you mean?

halcyon trail
#

it's not useful for "any IO consumption" - it's only useful if you decide to do lazy caching

#

lazy caching of IO is relatively niche - there's a lot of issues around invalidation

raven ridge
thick hemlock
thick hemlock
halcyon trail
#

i think if you're in this ultra specific situation then you would simply annotate with the exact type

grave jolt
halcyon trail
#

x: RepeatableCursor and then you simply know that its doing caching for you

thick hemlock
halcyon trail
#

Like, the point of things like Collection etc aren't that they're the only protocols you can have - they're just the most useful and most common

thick hemlock
halcyon trail
#

also to your point, if you're already doing laziness

#

then your RepeatableCursor can implement len just fine

#

it simply internally has a list, which it needs anyway

#

and if len is called and the list hasn't been populated yet, it populates it

#

and then returns the length of the list
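A minimal sketch of such a wrapper, assuming the simplest possible policy: the first use of `__iter__`, `__len__` or `__contains__` drains the whole source into a list. `RepeatableCursor` is the hypothetical name from this discussion, not a real library class, and a real cursor would likely fill its cache more lazily.

```python
from collections.abc import Collection, Iterable, Iterator
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class RepeatableCursor(Generic[T]):
    """Hypothetical caching wrapper around a single-use source."""

    def __init__(self, source: Iterable[T]) -> None:
        self._source: Optional[Iterator[T]] = iter(source)
        self._cache: List[T] = []

    def _fill(self) -> None:
        # Drain the source once; later calls are no-ops.
        if self._source is not None:
            self._cache.extend(self._source)
            self._source = None

    def __iter__(self) -> Iterator[T]:
        self._fill()
        return iter(self._cache)

    def __len__(self) -> int:
        self._fill()
        return len(self._cache)

    def __contains__(self, item: object) -> bool:
        self._fill()
        return item in self._cache


cursor = RepeatableCursor(n * n for n in range(4))  # generator: single-use
first_pass = list(cursor)
second_pass = list(cursor)
```

Because it has `__len__`, `__iter__` and `__contains__`, an instance also passes `isinstance(cursor, Collection)`.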

thick hemlock
#

actually an SQL cursor would probably want to explicitly avoid implementing len, since it's much more expensive than doing COUNT on the SQL side

#

so this cursor would probably not encourage you to get the length by consuming the results, unless you actually need the results

halcyon trail
#

i mean unless you expect its a common use case that someone creates a RepeatableCursor, and never uses it

#

it doesn't really matter

#

you could also simply cache the length separately, if you prefer

#

Even in this niche example it's not clear there's any actual problem here 🤷‍♂️

thick hemlock
#

but again this example is very niche

halcyon trail
#

🤨

#

it is

thick hemlock
#

to be clear, I did not at any point suggest this was a very common use case for me

halcyon trail
#

the worst thing about the status quo is probably the simple fact that some Iterables are "consuming", and you can iterate them twice, and mypy will not complain about it

#

is there a mypy bot here

thick hemlock
halcyon trail
#

no it's fine, I'm not a mod but I don't think you violated any rules or anything like that

thick hemlock
halcyon trail
#

it was all completely civil

thick hemlock
#

good to know!

grave jolt
halcyon trail
#

LOL!

thick hemlock
#

lmao

#

too real

grave jolt
#

Oh here's another example. You might choose to make a linked list iterable, but not provide length via __len__.

thick hemlock
#

interesting

#

yeah actually I do think, to generalize, a use case where you wouldn't want to annotate a Collection is one where a length is expensive

#

that can happen in a couple scenarios and data structures

grave jolt
#

Well, you can store a length in a linked list, I'm pretty sure deque does this

#

Although if you're manipulating nodes directly, it might not make sense to do that

#

(also if you're using the same class for lists with and without cycles)
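A sketch of that case: a hypothetical singly linked list that can be iterated any number of times but defines no `__len__`, so it is an `Iterable` without being a `Collection`.

```python
from collections.abc import Collection, Iterable


class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


class LinkedList:
    """Hypothetical linked list: re-iterable, but deliberately has no __len__."""

    def __init__(self, head=None):
        self.head = head

    def __iter__(self):
        # Each call starts a fresh walk from the head.
        node = self.head
        while node is not None:
            yield node.value
            node = node.next_node


chain = LinkedList(Node(1, Node(2, Node(3))))
first_walk = list(chain)
second_walk = list(chain)
```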

jade raven
#

!e ```py
from typing import Iterable

class example(list):
    def len(self) -> Iterable:
        return [1, 2, 3]

print(len(example())) # 3
```
huh, is the parent return of int being enforced
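A sketch of what the runtime does in the two possible readings of that snippet. A method literally named `len` is never consulted by the built-in `len()`; if the method was meant to be `__len__` (an assumption), CPython rejects a non-integer result with a `TypeError`, independent of any annotation. The class names are made up.

```python
class NamedLen(list):
    def len(self):            # ordinary method; built-in len() never calls it
        return [1, 2, 3]


class DunderLen(list):
    def __len__(self):        # this is what built-in len() calls
        return [1, 2, 3]      # not an int, so len() rejects it


plain_length = len(NamedLen())  # falls back to list's own length

try:
    len(DunderLen())
    length_error = None
except TypeError as exc:
    length_error = exc

print(plain_length)
print(type(length_error).__name__)
```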

fallen slateBOT
halcyon trail
#

maybe not for some kind of exotic multithreaded skiplist or something, idk

thick hemlock
#

as a good practice though, I think it's fair to ask to be able to annotate the bare minimum your function needs. You could also annotate a lot of thing as list when you don't actually need all that functionality, but you shouldn't

halcyon trail
#

well, Collection is extremely close to the bare minimum here - there's very good concrete reasons why something that is cheaply multi pass iterable will basically always have length

#

but it does come partly back to the structural typing

#

I suppose you could do this with a "marker function" that does nothing, except satisfy the structural subtyper lol
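One way that marker idea could look, as a sketch only. `MultiIterable`, `multi_pass_marker`, `Playlist` and `play_twice` are all hypothetical names; nothing like this exists in `typing` or `collections.abc`.

```python
from typing import Iterator, List, Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class MultiIterable(Protocol[T_co]):
    """Hypothetical protocol: Iterable plus a do-nothing marker method."""

    def __iter__(self) -> Iterator[T_co]: ...

    def multi_pass_marker(self) -> None: ...


class Playlist:
    def __init__(self, tracks) -> None:
        self._tracks = list(tracks)

    def __iter__(self):
        return iter(self._tracks)

    def multi_pass_marker(self) -> None:
        """Marker only: promises that every __iter__ call restarts from the top."""


def play_twice(items: MultiIterable[str]) -> List[str]:
    # Safe to loop twice because the protocol promises multi-pass iteration.
    return [*items, *items]


played = play_twice(Playlist(["a", "b"]))
```

The downside raised in the discussion still applies: the promise is only a convention, and a plain `list` would not satisfy this protocol unless it were wrapped.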

thick hemlock
#

at least from my POV, the reason the collections.abc tree is so big, is exactly to be able to annotate the bare minimum

grave jolt
grave jolt
thick hemlock
#

true

grave jolt
#

see quicknir's above example of defaultdict potentially breaking the expectations of some users about Mapping or dict

thick hemlock
#

and i do think as a community we should strive for that set of un-annotatable things to be as small as it can

grave jolt
#

well, unfortunately, Python is turing complete 🙂

#

Haskell/Agda/other languages with advanced type systems and immutability guarantees might do a bit more, but still

thick hemlock
#

well, not small as it can be then, but, uncommon as it can be

grave jolt
#

If you formalize all the properties (expected by users informally) of a typical class like starlette.Request, you'll need to write some kind of research paper

halcyon trail
#

(outside of Haskell and leetcode maybe)

thick hemlock
grave jolt
#

I thought that just records a history of requests and responses

thick hemlock
#

yes but for that you need to standardize what a request object looks like

halcyon trail
#

btw Wolf, since we talked about that most common error, here's roughly how you scan a file line by line in Rust, similar to what we write in python

    let file = File::open("foo.txt")?;
    let reader = BufReader::new(&file);

    for line in reader.lines() {
        println!("{}", line?);
    }

    // trying to use reader again would be a compiler error
thick hemlock
halcyon trail
#

the mistake we talked about in python, where you can iterate a file twice and not get a complaint

thick hemlock
halcyon trail
#

doesn't happen in Rust

thick hemlock
#

how does rust know though

halcyon trail
#

@thick hemlock so, this goes back to what I was saying about the signature of IntoIterator and such

thick hemlock
#

what in the signature of intoiterator says that?

halcyon trail
#

in this case, its the signature of lines() - it takes "self" (the BufReader) by value

#

in Rust, taking by value means you "consumed" the thing

#

"moved" from it

thick hemlock
#

hmmm

#

interesting

halcyon trail
#

yeah

thick hemlock
#

how would partially consuming work then?

halcyon trail
#

there is no partially consuming

thick hemlock
#

in python terms, if I hit break while consuming a file, i can consume the rest
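A small sketch of that Python behaviour, with `io.StringIO` standing in for an open file (an assumption for the example): after a `break`, a second loop over the same object picks up where the first one stopped.

```python
import io

# io.StringIO stands in for an open text file (assumption for the example).
stream = io.StringIO("header\nrow 1\nrow 2\n")

header = None
for line in stream:
    header = line.rstrip("\n")
    break  # stop after the first line

# The same object keeps its position, so this continues with the rest.
remaining = [line.rstrip("\n") for line in stream]

print(header, remaining)
```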

halcyon trail
#

so, if you wrote for line in reader.lines() { .. } and then did a break, you wouldn't be able to consume the rest

#

you'd need to hang onto the iterator

thick hemlock
#

ohh okay

#

feel like this would be possible as static error in ruff if __new_iter__ and __new_next__ existed

halcyon trail
#

well, the same as if MultiIterable existed

#

🙂

halcyon trail
#

to "consume the rest" in rust you'd need to do something like

    let mut lines = reader.lines();
    while let Some(line) = lines.next() {
        ... // break early
    }
    // lines is still usable here
grave jolt
# thick hemlock but also I don't see how this is related lol

I meant that the real code has classes with more complex interactions than just iterating over a thing. There are loads of expected properties that are hard to formalize. For example (github link):

  • calling request.stream() from two different async tasks and using the iterator simultaneously is a really bad idea
  • calling request.body() and request.json() in two different tasks is a bad idea because they both call request.stream()
fallen slateBOT
#

starlette/requests.py line 199

class Request(HTTPConnection):
halcyon trail
#

all the static type checking has costs too; before I wrote that snippet I had to double check a couple of signatures

grave jolt
halcyon trail
#

All you really need is the ability to model "consume" and then you can have as many distinct static states as you want for some entity

grave jolt
#

Yeah, that's what I mean by linear typing

#

readJson : ( [L]RequestStarted ) -> ( JSON | NetworkError | JsonError, RequestConsumed )

halcyon trail
#

yes, I know, I'm addressing the second part of what you said

#

it seemed like you were saying that X would be mitigated by the linear typing, but you can imagine something else with a series of complex states

#

which made it sound like linear typing wouldn't be sufficient to model a series of complex states

grave jolt
#

Yeah I don't understand what my point was

#

I think I meant more like, you might have runtime requirements that would not be practically modeled with a series of distinct states

#

Like if you have a bunch of continuous parameters (numbers/strings/something else that I haven't thought about) that need to align in some way

halcyon trail
#

oh for sure

#

and it might be practical, but also just not worthwhile

#

type systems aren't the answer for everything

#

even the stuff i've shown in rust, it's cool but unless you're otherwise committed to avoiding GC I think the cost is pretty high

grave jolt
# halcyon trail and it might be practical, but also just not worthwhile

Yeah, like if you have a combination of states you can have stuff like ```rs
trait ThermalState {}
struct Cooling; impl ThermalState for Cooling {}
struct Heating; impl ThermalState for Heating {}

trait AlarmState {}
struct NoAlarms; impl AlarmState for NoAlarms {}
struct MaintenanceAlarm; impl AlarmState for MaintenanceAlarm {}
struct JustRun; impl AlarmState for JustRun {}

struct NuclearReactor<Alarm: AlarmState, Thermal: ThermalState> {
    /* ... */
}
```

#

it might be worthwhile in a nuclear reactor, but not in a rubber ducky store

#

assuming none of the duckies are nuclear powered

halcyon trail
#

Well, you can play games with combinations of traits in many languages

#

What is unique to rust here is

impl Cooling {
    fn start_heating(self) -> Heating { ... }
}
#

The point being that start_heating takes self by value, so it forces the state into a new type

#

cooling will be unusable after you call start_heating on it

grave jolt
#

yep

rain trellis
#

I'm sure it's old hat to many of you, but I ran across this error message when building Python today and laughed out loud in astonishment. Very nifty Easter egg.

grave jolt
#

.xkcd 2200

neon troutBOT
#

ERROR: We've reached an unreachable state. Anything is possible. The limits were in our heads all along. Follow your dreams.

halcyon trail
#

That's a great one

rose schooner
final geode
thick hemlock
#

that is awesome

rain trellis
rain trellis
# rain trellis Thank you for the awesome feedback, you and <@843510746222428191> both! I'm impl...

I'm going to hold off on publishing until gh-120437 is fixed - building on linux with --enable-experimental-jit and --with-pydebug on the 3.13 branch breaks something.

The bug is triggered by running ./python -m ensurepip, but I would think it's a JIT issue and not an ensurepip issue, that's just where I happened to stumble on it.

https://github.com/python/cpython/issues/120437

signal rivet
#

Hi there! I was unable to find any existing proposal or idea discussion, maybe this is the place where I can find help or thoughts. I'm curious if there is any particular reason for raise being not an expression but statement. Why users can't write code like:

do_something(on_event=lambda e: raise SomeError)
valid_x = x if is_valid(x) else (raise ValueError("invalid"))

Maybe it would be possible to change this to the expression, or allow function-like syntax (like it was for print function years ago) - so one can do raise(something) to force expression behavior. What do you think?

#

Some languages in the wild already have such behavior, e.g. in Scala

val x = if true then 1 else throw RuntimeException("boom")
// val x: Int = 1
grave jolt
# signal rivet Hi there! I was unable to find any existing proposal or idea discussion, maybe t...

I think this is one of the reasons raise isn't an expression: you would be able to create more complex expressions that potentially do an important action (raise an exception). Instead of:
```py
if is_valid(x):
    valid_x = x
else:
    raise ValueError("x")
```
or just
```py
if not is_valid(x):
    raise ValueError("invalid")
```
If you like this style, you can always make your own function:
```py
def throw(exc):
    raise exc
```

#

Scala is a more "expression oriented" language so maybe it makes sense there
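
#

A minimal sketch of how a helper like the `throw` function mentioned above fits into expression positions. The `is_valid` rule, `validate`, and `on_event` are hypothetical names made up for the illustration:

```python
from typing import NoReturn


def throw(exc: BaseException) -> NoReturn:
    """Raise `exc`; as a function call it can appear inside any expression."""
    raise exc


def is_valid(x: int) -> bool:
    # Hypothetical validity rule, purely for illustration.
    return x >= 0


def validate(x: int) -> int:
    # Conditional expression whose else branch raises via the helper.
    return x if is_valid(x) else throw(ValueError("invalid"))


# A lambda body must be an expression, so the helper works there too.
on_event = lambda e: throw(RuntimeError(f"unhandled event: {e}"))
```

The trade-off is an extra frame for `throw` in the traceback.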

signal rivet
#

yeah, statements are something you can't compose...

feral island
grave jolt
#

You could do the same as yield and require (raise) though

signal rivet
#

@feral island > raise without arguments is legal syntax
That's a point 🤔

#

Thank you!

feral island
#

In Python 2 raise x, y, z was also legal (I think it's type, value, traceback), that would have made an expression form even harder. Probably still fixable by requiring parentheses though. Python 3 simplifies the syntax, but this may help explain why it wasn't made an expression when the syntax was first added.

thick hemlock
grave jolt
#

EAFPers in shambles

neat delta
peak spoke
#

I still think exceptions are a bit more annoying to deal with than they should be

#

Though not sure what I'd do about it apart from better documenting what can be raised in the stdlib

thick hemlock
#

never liked EAFP cause of the verbosity

flat gazelle
#

What an odd thing to say in the rejection notice, the glossary prefers EAFP https://docs.python.org/3/glossary.html#term-EAFP

alpine rose
#

if raise was literally a function call, you'd also create a lot of cyclic references

raven ridge
#

Wouldn't the function just drop itself from the traceback, in order to maintain backwards compatibility with today's tracebacks? And wouldn't dropping the frame representing the raise() call itself resolve the circular references you're talking about? Or are you talking about something other than the local variables of the raise() function

rose schooner
#

why, for a function with this signature:
```py
def foo(ham, *eggs): ...
```
is ham a valid keyword argument?
eggs wouldn't be able to receive any arguments when ham is given as a keyword argument
```py
foo(1, ham=2) # multiple values for 'ham'
foo(ham=1, 2) # positional argument follows keyword argument
```

spark verge
grave jolt
rose schooner
grave jolt
#

ah

#

for some reason I misread *args as **args

radiant garden
#

you can consider eggs a funny kind of default argument that doesn't let you assign it directly :)
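
#

A small sketch of the behavior being discussed: `ham` can be passed by keyword, in which case `eggs` is just empty. `bar` is a hypothetical variant using the positional-only marker `/` (Python 3.8+), which is one way to forbid the keyword form:

```python
def foo(ham, *eggs):
    return ham, eggs


def bar(ham, /, *eggs):
    # The "/" marker makes `ham` positional-only.
    return ham, eggs


print(foo(ham=1))    # keyword form is accepted; eggs is the empty tuple
print(foo(1, 2, 3))  # extra positional arguments land in eggs

try:
    foo(1, ham=2)    # `ham` is bound twice
except TypeError as exc:
    print("TypeError:", exc)

try:
    bar(ham=1)       # positional-only parameter passed as keyword
except TypeError as exc:
    print("TypeError:", exc)
```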

thick hemlock
#

I was also under the impression this is something Python "recommends"

#

But looking at your link I don't think this is recommended

#

EAFP and LBYL are both mentioned

halcyon trail
#

I gotta say it's wild to see how hard exceptions are still working (not just in python) to solve the same issues for so long. Multiple exceptions, transporting across threads, using conveniently as expressions

#

I feel like non exception solutions have solved the propagation issue a lot better and more easily

thick hemlock
halcyon trail
#

Particularly in conjunction with a propagation operator

thick hemlock
#

Can you explain how this propagation operator works in "Python terms"? I'm not too familiar with Rust

halcyon trail
#

Sure

#

Say foo returns a result.
let x = foo()?;
Expands very roughly to

let f = foo();
if f.is_err() return f.err();
let x = f.ok();
#

I'm going out of my way to avoid pattern matching here
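
#

In Python terms, a rough sketch of the same expansion. The `Ok`/`Err` classes and the `parse_int`/`double` functions are hypothetical stand-ins, not an existing library; the `isinstance` check followed by an early return is the part that `?` abbreviates in Rust:

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass
class Ok(Generic[T]):
    value: T


@dataclass
class Err(Generic[E]):
    error: E


# Hypothetical Result alias: either a success value or an error value.
Result = Union[Ok[T], Err[E]]


def parse_int(text: str) -> Result[int, str]:
    try:
        return Ok(int(text))
    except ValueError:
        return Err(f"not an integer: {text!r}")


def double(text: str) -> Result[int, str]:
    r = parse_int(text)
    if isinstance(r, Err):  # explicit propagation: hand the error to the caller
        return r
    x = r.value
    return Ok(x * 2)
```

Without an operator for it, every fallible call needs those two lines.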

quick snow
#

But that means you have to handle the (potential) error right where you call the function. Exceptions let you handle them where it makes sense.

halcyon trail
#

The only difference is that propagating happens explicitly via a single character instead of implicitly

quick snow
native flame
#

i think there is value in indicating which lines can possibly error

thick hemlock
halcyon trail
halcyon trail
#

Erase the type, and have handlers downcast to handle particular kinds of errors

thick hemlock
#

can you use some pattern matching magic to only propagate on specific values?

halcyon trail
#

You can do stuff like that too, but it's probably easier to start by looking at Result code that is more analogous to exceptions

radiant garden
#

in the case of rust, you'd use a dynamic dispatch pattern a la

fn handles_some_errors() -> anyhow::Result<()> {
    let db = connect_and_do_things(); // returns result with multiple errors
    if let Some(ClusterError { cluster }) = db.downcast_ref() {
        println!("damn that failed, {cluster} is down");
    }
    // do other things with db
}
thick hemlock
#

Hmm okay I'll look at some code examples

radiant garden
#

for maximum ergonomics

halcyon trail
#

Pretty much

#

anyhow::Result is a class that only has one type parameter, for the result

#

It carries any error

thick hemlock
#

I do dislike exceptions because they break the control flow

halcyon trail
#

Yeah, so there's also that part of it, with Result you need ? to propagate. So it's visible. But I do think that's a trade off.

#

But things like handling multiple errors or using fallible things as expressions are just pure wins I think

thick hemlock
halcyon trail
#

? is a magic language level operator

#

Without something like ?, propagating Result is kind of verbose

#

And you propagate errors a lot more than you handle them

grave jolt
#

Stupid question. Is an iterator always an iterable? why/why not?

#

The glossary says:

Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted.
but I vaguely remember someone saying that it's not the case

flat gazelle
#

There was a rather extensive discussion on this, but an iterator should be an iterable, but certain language constructs will work despite an iterator missing __iter__. However, some won't.
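
#

A minimal sketch of that difference in CPython, using two hypothetical classes. `CountDown` has `__next__` but deliberately no `__iter__`; it still works when handed out by another object's `__iter__`, but not when iterated directly:

```python
class CountDown:
    """Has __next__ but deliberately lacks __iter__ (an incomplete iterator)."""

    def __init__(self, n):
        self.n = n

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


class Box:
    def __iter__(self):
        return CountDown(3)


# Works: iter() only requires that the returned object has __next__.
print(list(Box()))

it = iter(Box())
print(next(it))

# Fails: a for loop calls iter() on its argument, and CountDown has no __iter__.
try:
    for item in it:
        pass
except TypeError as exc:
    print("TypeError:", exc)
```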

grave jolt
flat gazelle
grave jolt
#

what a snarky canadian

halcyon trail
#

Lol!

woven sundial
#

man i need help i don't understand this

sour thistle
#

!e shouldn't this raise a TabError instead? (using exec since discord converts tabs to spaces)

exec("""if True:
\N{TAB}1
\N{SPACE}2""")

Doing it in the opposite order (space first, tab second) gives a TabError

fallen slateBOT
sour thistle
#

!e
```py
exec("""if True:
\N{SPACE}1
\N{TAB}2""")
```

fallen slateBOT
quick snow
#

!e I'm guessing it would show that error if there was a matching outer indentation level:

exec("""if True:
\N{SPACE}if True:
\N{TAB}1
\N{SPACE}2""")
fallen slateBOT
quick snow
#

Ah, well, it doesn't reach it in this case.

#

!e what if there's more spaces?

s, t = " \t"
exec(f"""if True:
{t}1
{s * 9}2""")
fallen slateBOT
boreal umbra
grave jolt
#

What happens if you use a fake name in the CLA?

feral island
#

(it is basically honor system)

grave jolt
#

what if I just don't want to reveal my real name to the PSF

sturdy timber
#

@elder blade asked the PSF about that a while ago and they said that the email and address are most important and using an internet alias for the name is fine.

#

I think that's what I did

#

(That advice was to a specific instance of someone asking rather than a general policy afaik though, so it may be worth asking the PSF first yourself if you want to use a fake name)

elder blade
#

Almost leaked my home address sending the whole document, so the pseudonym will have to do

#

I suppose it's better to sign it under your pseudonym, because they will be able to know which signed CLA is which contributor... in hindsight though I am not sure I won anything writing "Bluenix" instead of my real name 🤷‍♂️

#

Here's a reminder to everyone to always store copies of all documents you sign! 😅 It should be easy-as-cake to find if necessary

raven ridge
sturdy timber
#

Hmm, if you sign using the CLA bot do you even have to enter your name and physical address now 🤔

quick snow
thick hemlock
#

this seemed off topic for the discourse, so if you don't mind me asking, @feral island, is there a specific reason you chose to do PEP 749 for the implementation tweaks instead of just halting acceptance of PEP 649 to when a draft PR was out?

#

or was it just too much work to draft a PR before an official acceptance? cause the current situation seems quite odd with PEP 749

feral island
#

I am not the author of PEP 649, so I don't feel comfortable editing it

thick hemlock
#

oh I see

feral island
#

Most of what's in PEP 749 I'd feel comfortable changing myself in an implementation PR, but the creation of a new stdlib module and the proposal for the future of from __future__ import annotations feel important enough that they should be called out in a PEP

thick hemlock
#

yeah no doubt the annotationlib thing should get proper approval

#

Is Larry also helping you implement 749? wondering if the question is relevant to him or maybe this is just a weird case with the author vs implementor

feral island
#

I haven't heard from him for some time

thick hemlock
#

oh, I see :(

#

well yeah given the situation that makes a lot more sense

thick hemlock
#

But overall I'm happy that you did formalize it since the discussion there seems productive

#

thanks for the answer!

glass arch
halcyon trail
#

i honestly still like Optional[Foo] over Foo | None

rose schooner
#

so 1 space is technically an unindent

thick hemlock
rose schooner
thick hemlock
thick hemlock
halcyon trail
#

why? also typing is still crazy ubiquitous, Sequence and Mapping should be two of the most common type annotations

thick hemlock
#

Okay, to rephrase, I prefer to use typing's verbose forms the least I can. I don't actually mind the import time

#

What I meant to say is that I don't have anything specific against Optional I just prefer the "native" syntax in all places even if it's a bit less explicit.

#

same goes for the new typevars

halcyon trail
#

idk, Optional[Foo] is very very marginally more verbose than Foo | None

#

and it's also just like the vastly more common way to do it across languages

fallen slateBOT
#

Parser/lexer/lexer.c line 499

```c
else /* col < tok->indstack[tok->indent] */ {
```
rose schooner
#

col (of the space indent) is 1, as opposed to the previous col (of the tab indent), which is 8

#

so the python lexer takes that as a dedent

thick hemlock
#

maybe just feeling "first-class"?

halcyon trail
#

i guess, I mean i think it's a pretty different situation

#

in one case you just drop the capitalization and avoid an import

rose schooner
fallen slateBOT
#

Parser/lexer/lexer.c line 506

```c
if (col != tok->indstack[tok->indent]) {
```
rose schooner
#

so it raises an IndentationError instead of TabError
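
#

A small sketch that reproduces the two cases from this thread, as observed on current CPython. `indentation_error_type` is a hypothetical helper; it reports the exact class because TabError is a subclass of IndentationError:

```python
def indentation_error_type(source: str) -> str:
    """Compile `source` and return the name of the exact exception class raised."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:  # also catches TabError, its subclass
        return type(exc).__name__
    return "no error"


# Tab-indented body followed by a one-space line: treated as a dedent
# that matches no outer indentation level.
tab_then_space = "if True:\n\t1\n 2"

# Space-indented body followed by a tab-indented line: the indentation
# compares differently depending on tab size, so it is ambiguous.
space_then_tab = "if True:\n 1\n\t2"

print(indentation_error_type(tab_then_space))
print(indentation_error_type(space_then_tab))
```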

thick hemlock
#

Yeah obviously the list annotation has no downsides, but I was comparing it to show the fact that I really care about that change even though it doesn't mean anything.

halcyon trail
#

unironically I'm not a huge fan of the list/dict thing - tuple I don't mind so much.
it basically gives a higher precedence to a type annotation that is less likely to be correct

rose schooner
fallen slateBOT
thick hemlock
#

because of the Sequence thing?

halcyon trail
#

yeah

feral island
thick hemlock
#

so you're saying because the list import is "easier to reach" than other collections.abc's it makes lazy people make wrong annotations?

halcyon trail
#

yes

#

this was already a problem before

#

honestly I made this mistake - I was new to python typing and I thought List was appropriate to use a lot, I didn't yet know about Sequence

thick hemlock
#

in my experience, people who care about annotations usually try to find the best annotation whether it requires an import or not

#

people who don't will use a list annotation for most things anyway

#

(and that's okay)

#

my worldview on it is basically that if you don't really care about typing you shouldn't need to import typing

halcyon trail
#

idk - I'm a living breathing counter-example 🙂

thick hemlock
#

and if you care it's okay that you import it

thick hemlock
#

honestly the collection inheritance tower was a bit overwhelming to me when starting

#

so in a way I'm glad it's not all a builtin? Maybe only Iterable I would want because of generators

thick hemlock
#

you said that as a new user

#

you overused list

halcyon trail
#

new to python typing specifically - sure

#

It's not that I didn't "bother" with Sequence - I just didn't really know about it

#

I'm familiar with the "tower" from other languages - e.g. Kotlin

thick hemlock
#

that's fair

halcyon trail
#

anyhow, list being first-class and Sequence not makes it "feel" like you should annotate with list a lot and Sequence more rarely but it's actually the reverse

#

it would be funny to do a survey of Optional[Foo] vs Foo | None on the python reddit or something

thick hemlock
#

I agree, I just think there is value for common types to be "discovered" first as to create a gradual introduction to typing

thick hemlock
#

I hope eventually it will happen, this could really fit on it.

halcyon trail
#

from a quick google skim or two it seems like Foo | None is more popular - or at least people think it's better because it's newer

#

who knows

#

I want @feral island 's opinion 😛

thick hemlock
#

I think that FastAPI recommends Foo | None in their docs

feral island
#

It's better because it's less confusing. Optional[] does not mean the argument is optional

halcyon trail
#

I think whatever FastAPI recommends doing the opposite seems appealing

thick hemlock
halcyon trail
thick hemlock
#

I mean

#

that has caused confusion

feral island
#

I've definitely seen people write foo: Optional[int] = 0

thick hemlock
#

if I create a dataclass like so:


@dataclass
class A:
   param: Optional[int]
feral island
#

because the argument is optional

thick hemlock
#

it's not optional

halcyon trail
feral island
halcyon trail
#

hard for me to tell I guess what you mean

thick hemlock
#

they meant that the Optional annotates you don't need to pass a value

feral island
#

The parameter is an optional one that takes an int and doesn't have a reason to accept None. But people write foo: Optional[int] = 0
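
#

To make the distinction concrete, a short sketch with three hypothetical functions. `Optional[int]` only says the value may be None; whether the argument can be omitted depends on the default, which `inspect.signature` exposes:

```python
import inspect
from typing import Optional


def needs_value(x: Optional[int]) -> None:
    """x may be None, but the caller must still pass something."""


def has_default(x: int = 0) -> None:
    """x can be omitted, but None is not an accepted value."""


def both(x: Optional[int] = None) -> None:
    """x can be omitted and may also be None."""


def is_omittable(func, name: str) -> bool:
    # A parameter can be left out only if it declares a default.
    param = inspect.signature(func).parameters[name]
    return param.default is not inspect.Parameter.empty


for f in (needs_value, has_default, both):
    print(f.__name__, is_omittable(f, "x"))
```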

thick hemlock
halcyon trail
#

I don't find this confusing, nor do I think it's particularly compelling that this would confuse people - but obviously, everything will confuse someone

#

I'm sure someone has been confused by this, N > 1 times

thick hemlock
halcyon trail
#

no, I don't think it's confusing

thick hemlock
#

I mean I would expect this annotation to mean "passing param is optional, you don't have to"

#

that's what it says

halcyon trail
#

I wouldn't 🤷‍♂️

#

it says that it's optional whether or not param contains an int

#

i think maybe this is a thing for people who are less used to static typing, primarily

thick hemlock
#

To me it reads as "This dataclass accepts a param which is optional, and its type is an int"
I do understand that it actually reads "This dataclass accepts a param which is optionally an int", but it's confusing

halcyon trail
#

like, most statically typed languages have names for this kind of thing that is either Option or something very similar to Option (like Maybe) - nobody in Haskell is getting confused that a function parameter of Maybe type, only maybe has to be passed.

neat delta
#

i do wonder how often Optional appears without assigning a default value

thick hemlock
#

I remember it being called Nullable in some language

#

don't know which

spark magnet
#

these days you'd say param: int | None

thick hemlock
halcyon trail
#

Kotlin

rose schooner
# sour thistle !e shouldn't this raise a `TabError` instead? (using exec since discord converts...

!e here's all the cases where TabError can be raised:
```py
tab = '\t'
space = ' '
space_8 = space * 8
for i, example in enumerate((
    # col (8) == TOIS (8), altcol (8) != TOAIS (1)
    f"""if 1:
{tab}1
{space_8}2""",
    # col (8) > TOIS (1), altcol (1) == TOAIS (1)
    f"""if 1:
{space}1
{tab}2""",
    # col (8) > TOIS (2), altcol (1) < TOAIS (2)
    f"""if 1:
{space}if 2:
{space * 2}if 3:
{tab}4""",
    # col (8) < TOIS (16), altcol (8) != TOAIS (1)
    f"""if 1:
{tab}if 2:
{tab * 2}3
{space_8}4""",
)):
    try:
        exec(example)
    except TabError:
        print(f"TabError raised for example {i}")
    else:
        break
else:
    print("All examples raised TabError")
```

thick hemlock
halcyon trail
#

but "nullable" often differs from optional subtly

#

it's not 100% consistent, but often

rose schooner
#

mk wait

halcyon trail
#

python's "optional" is really closer to "nullable" fwiw

fallen slateBOT
neat delta
feral island
halcyon trail
neat delta
#

Nullable isn't great either, as python doesn't have a Null, but Noneable is terrible

rose schooner
fallen slateBOT
thick hemlock
spark magnet
#

this kind of debate is one of the reasons in the docs we're leaning to using more English to describe types instead of strict annotations.

halcyon trail
#

@thick hemlock usually, optional types support nesting, nullable types do not.
Optional[Optional[Foo]] in python is meaningless - Union types in python are automatically collapsed
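
#

A quick check of that collapsing with the standard `typing` module (nothing here is hypothetical apart from picking `int` as the example type):

```python
from typing import Optional, Union

# Nesting adds nothing: the inner None is merged into the outer union.
print(Optional[Optional[int]] == Optional[int])

# Duplicate members are dropped as well.
print(Union[int, None, None] == Optional[int])
```

So there is no way to tell "no value" apart from "a value that is itself absent", which a nestable option type can express.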

thick hemlock
halcyon trail
#

which is actually a big headache sometimes

feral island
halcyon trail
#

Uhm I'm underestimating what percentage of people know the difference between nullable and optional types? I don't think so.

thick hemlock
#

Underestimating the people that annotate it wrong because they don't know the difference

halcyon trail
#

that's totally different, I don't think my comment that you responded to was saying what you think it was

spark magnet
halcyon trail
#

I was replying to ned

halcyon trail
thick hemlock
thick hemlock
halcyon trail
#

^

feral island
#

But it's not any different from others. You can still talk about "union with None"

thick hemlock
#

I agree and I also don't know if it's actually needed in the stdlib, but a term for it would be useful for me

halcyon trail
#

I mean in a purely mechanical sense, sure, it's not different. Practically it is - that's why you see distinctions around it in almost every language.

thick hemlock
halcyon trail
#

if you're using for example type annotations to help you deserialize something

#

it would be extremely common to have special handling for Optional[Foo]

feral island
feral island
halcyon trail
#

I have too, and python really makes you suffer

#

because you can have a type annotation like Optional[Union[Foo, Bar]] - okay, you tell your framework how to deserialize Union[Foo, Bar], right?
And now it has the generic logic for Optional[T], it knows how to deserialize T - life is good, right?

thick hemlock
halcyon trail
#

not actually though because python just collapses your type
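
#

A sketch of what a framework actually sees when it introspects such an annotation. `Foo`, `Bar`, and `FooOrBar` are hypothetical names; `get_origin` and `get_args` are the standard introspection helpers:

```python
from typing import Optional, Union, get_args, get_origin


class Foo: ...


class Bar: ...


FooOrBar = Union[Foo, Bar]
annotation = Optional[FooOrBar]

# The annotation is one flat union, not "Optional wrapping a Union".
print(get_origin(annotation) is Union)
print(get_args(annotation))

# The inner union is gone, so a handler registered for FooOrBar
# cannot be found by looking at the members.
print(FooOrBar in get_args(annotation))
```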

#

anyhow - it is what it is I guess. but I will take note that Foo | None is the idiomatic recommendation, plus what Ned said makes me reconsider

#

(plus recalling my own bad experiences)

#

we just have a mixed C++ and python codebase, and in C++ it's std::optional - so Optional comes very naturally

thick hemlock
#

if we were to continue that earlier topic, are there more things in typing you wish you had as a builtin?

#

I personally would like Literal to be easier to write, but i don't really know how that would work

#

Maybe Any but that ship has sailed because of any()

#

but also maybe that's too permissive to even have as something that should be easy to reach

feral island
#

Yes, it runs into the same issue that you're making something easy that arguably shouldn't be easy

thick hemlock
#

mhm

#

Any is sometimes useful for me when transitioning codebases without annotations

feral island
#

I'm not entirely sure I agree with that argument myself

thick hemlock
#

haha

#

want to share why?

feral island
#

It feels like it goes against "practicality beats purity". People often have to use Any in practice

halcyon trail
#

heck, python doesn't define clear rules for it so I'm not sure that I understand how it works in python - despite being very very familiar with how it works in another language (C++)

#

The only place I'd ever consider using literal in my own code is to type legacy - otherwise use an enum or something

thick hemlock
halcyon trail
#

it's wrong but until relatively recently it's the only thing mypy really supported

thick hemlock
#

i love the recursive types being supported

#

but even specifying here another level would be useful

halcyon trail
#

there was a whole discussion here a while ago, where I brought this up, and people were saying that properly typing json as a recursive union didn't matter 😛

thick hemlock
#

and it goes with enums

halcyon trail
#

you can elaborate though I probably won't agree

#

Oh pydantic again, bleh

thick hemlock
#

This is the basic example in pydantic's docs:

class Cat(BaseModel):
    pet_type: Literal['cat']
    meows: int


class Dog(BaseModel):
    pet_type: Literal['dog']
    barks: float


class Lizard(BaseModel):
    pet_type: Literal['reptile', 'lizard']
    scales: bool


class Model(BaseModel):
    pet: Union[Cat, Dog, Lizard] = Field(..., discriminator='pet_type')
    n: int
#

discriminator is an OpenAPI thing that basically lets pydantic change its behavior from "try parsing as cat, try parsing as dog, try parsing as lizard" to "check the discriminator field and only parse as that"

#

which leads to much better error messages
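
#

To illustrate the idea only (this is plain Python, not how pydantic implements it), a sketch of tag-based dispatch. `PET_TYPES` and `parse_pet` are hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class Cat:
    meows: int


@dataclass
class Dog:
    barks: float


# Hypothetical registry mapping discriminator values to classes.
PET_TYPES = {"cat": Cat, "dog": Dog}


def parse_pet(data: dict):
    """Pick the class from data["pet_type"] instead of trying each one in turn."""
    fields = dict(data)  # copy so the caller's dict is left untouched
    tag = fields.pop("pet_type", None)
    if tag not in PET_TYPES:
        raise ValueError(
            f"unknown pet_type {tag!r}; expected one of {sorted(PET_TYPES)}"
        )
    return PET_TYPES[tag](**fields)


print(parse_pet({"pet_type": "cat", "meows": 3}))
```

With a tag there is one precise error to report, instead of one failure per candidate class.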

halcyon trail
#

this is just not a good way to do this - it's giving every single Cat a pet_type field which is entirely redundant

#

well, I guess it's not necessarily, because pydantic is kind of its own weird little world

#

but in the rest of python, that's what that would mean

thick hemlock
#

You do need this pet_type though

halcyon trail
#

you don't need every instance to have it - it's the property of the class

#

python instances already know their class

thick hemlock
#

I mean, in Python you don't, but this is not relevant to just pydantic, any serialization of this to JSON will require this information

halcyon trail
#

in json you will need it, yes

#

the python objects don't need it

thick hemlock
#

yes

#

serialization is very useful though

halcyon trail
#

but the types above annotate as though every instance of Cat has it 🤷‍♂️

#

my point is that you can achieve this in other ways - I've written generic code that serializes and deserializes Unions

thick hemlock
#

This is also useful in non-discriminator use cases. Consider an enum for, let's say, an image processing type. I want my implementation to use that enum but to annotate it only supports 3 of the 5 processing types

#

so I would use a literal here to say that

#

I don't know any better way
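
#

A sketch of that use case. The enum members and the `process` function are hypothetical; `Literal` does accept enum members, and since annotations are not enforced at runtime the function also checks explicitly:

```python
from enum import Enum
from typing import Literal, get_args


class ImageProcessingType(Enum):  # hypothetical enum with five members
    BLUR = "blur"
    SHARPEN = "sharpen"
    ROTATE = "rotate"
    CROP = "crop"
    INVERT = "invert"


# Inline subset: only three of the five members are supported here.
Supported = Literal[
    ImageProcessingType.BLUR,
    ImageProcessingType.SHARPEN,
    ImageProcessingType.ROTATE,
]


def process(kind: Supported) -> str:
    # Type checkers flag unsupported members; this guards at runtime too.
    if kind not in get_args(Supported):
        raise ValueError(f"unsupported processing type: {kind}")
    return f"applied {kind.value}"


print(process(ImageProcessingType.BLUR))
```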

thick hemlock
halcyon trail
#

btw, what the rest of the python world would probably do in that example is something like pet_type: ClassVar[str] = "cat"

#

the point is that the Literal annotation doesn't actually give you anything, doesn't improve type checking in any way

#

it's just a string fed to pydantic internals for serialization/deserialization
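
#

A sketch of the ClassVar approach using a standard dataclass rather than pydantic. `to_json_dict` is a hypothetical serializer; the tag lives on the class and is only attached at the serialization boundary:

```python
from dataclasses import asdict, dataclass, fields
from typing import ClassVar


@dataclass
class Cat:
    pet_type: ClassVar[str] = "cat"  # class-level tag, not an instance field
    meows: int


def to_json_dict(pet) -> dict:
    # Read the tag from the class and merge it with the instance fields.
    return {"pet_type": type(pet).pet_type, **asdict(pet)}


print([f.name for f in fields(Cat)])  # ClassVar entries are not dataclass fields
print(to_json_dict(Cat(meows=2)))
```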

thick hemlock
#

that is fair and I'm not gonna re-ignite the discussion about runtime usage of annotations vs type checking

halcyon trail
#

you just create a new enum, that has 3 of the 5 types, and a function to convert between them (or throw)

#

it's pretty much exactly the same thing, except the enum is being explicit about the fact that it's effectively a new type
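
#

A sketch of the separate-enum approach. All names are hypothetical, including `FastProcessingType` for the three-member subset; the conversion function either narrows or raises:

```python
from enum import Enum


class ImageProcessingType(Enum):  # hypothetical full enum
    BLUR = "blur"
    SHARPEN = "sharpen"
    ROTATE = "rotate"
    CROP = "crop"
    INVERT = "invert"


class FastProcessingType(Enum):  # hypothetical subset, declared as its own type
    BLUR = "blur"
    SHARPEN = "sharpen"
    ROTATE = "rotate"


def to_fast(kind: ImageProcessingType) -> FastProcessingType:
    """Convert to the narrower enum, raising ValueError for unsupported members."""
    try:
        return FastProcessingType(kind.value)  # lookup by shared value
    except ValueError:
        raise ValueError(f"{kind} is not supported here") from None


print(to_fast(ImageProcessingType.BLUR))
```

The cost is the extra name and keeping the shared values in sync between the two enums.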

thick hemlock
#

I don't really agree

#

that creates a lot more enums

halcyon trail
#

The problem is that people who aren't very familiar with typing tend to fall into this trap - static checks good, so more static checks better, so they try to heavily use Literal for things like this
and then it doesn't always behave in ways they understand - there are things that may be obvious to a user, but which python's static type system cannot understand

thick hemlock
#

what behaviors are hidden here? I've always viewed it as a pretty simple feature

halcyon trail
#

people who have actually worked with statically typed languages, where these things need to be 100% correct (they're not just annotations after all) tend to understand that pulling values into a type system is no joke

thick hemlock
#

Python is different in that though

#

you can't just treat it as it's statically typed

halcyon trail
#

Yes, because the static type system isn't terribly well defined

#

so when you work with Literal in particular, mypy vs pylance vs whatever will very often give you different answers

#

I've seen it a lot

thick hemlock
#

I know this is oversimplifying and ignoring a looot of use cases

#

but really all I wanted with this example is for someone to get a warning when they explicitly pass a value they shouldn't

#

I don't want to create a new type for this

halcyon trail
#

you already did?

thick hemlock
#

wdym?

halcyon trail
#

Literal[1, 2, 3] is a type

thick hemlock
#

sure, but it's inline

#

it's the easiest thing for me to do

halcyon trail
#

sure - but it's still a type. So just say - you don't want to write a few lines out of band 🙂

#

i was about to say that in the end most people's reasoning is just to save a few characters

#

not very compelling

thick hemlock
#

I don't know why you don't find that compelling

halcyon trail
#

and it's less nice for the person using your function - they can get auto completion on both the enum, and the enumerators

halcyon trail
#

saving a few characters to me just isn't a big deal. Having to consider the ramifications of a more poorly defined type system, less readability, less help for tooling - all these matter more

thick hemlock
#

I feel like you have to take into account that the typing system is not like in a static language. The person can simply annotate it as being the enum ImageProcessingType and not give any annotation to what values of it are supported. This is often the difference in PRs between telling them to "just add a literal annotation" which wouldn't get pushback, or to create a new enum for this which would.

#

I am not saving a few characters to codegolf

#

I'm saving characters to encourage people to use type hinting.

#

When it's simpler and shorter to annotate something - people will do it more. Easier annotations create better annotations.

halcyon trail
#

I don't have an issue with people not type hinting

thick hemlock
#

I have seen a lot of people avoid TypeVar because of this, and I'm so glad PEP 695 passed

halcyon trail
#

It's still saving a few characters no matter how you slice it - there's much bigger obstacles to typing a codebase than this

#

If people choose not to type things because of this - they're going to have a lot of untyped code 🙂

thick hemlock
#

I'm talking about situations where the codebase is typed but I want a new function to be typed better, if possible.

#

I would pass the PR if the annotation was simply ImageProcessingType and the first line would check the value and raise an exception if needed

#

but I would prefer if it specified the values

#

it's not "typed" vs "untyped", it's how well it's typed

halcyon trail
#

An enum is obviously typed at the call site. Passing "foo" it's very far from clear it's one of 3 legal values

thick hemlock
#

Also it matters what characters you're saving. That type being inline saves you from needing to find another name for the enum

#

which is often hard for these subsets

halcyon trail
#

Anyhow I just don't think we'll agree, and I think the convo reinforces my previous beliefs.

#

Hard for the writer but good for the reader 🙂

thick hemlock
#

I agree with the sentiment

#

it's just not always practical

#

something something practicality beats purity

halcyon trail
#

Uhm you can disagree with me but please don't call it impractical

#

It's fully practical

#

I do it. It's very easy.

thick hemlock
#

what do you mean? I didn't say it's impractical, I said it's not always practical

#

there are situations where I can do what you suggest

#

in others, not

halcyon trail
#

There's no situation where you can't do it - it may take a couple extra lines, yes, but you can

thick hemlock
#

Again, I think that's idealistic. Codebases are not either "typed" or "untyped". It's a spectrum

halcyon trail
#

This is becoming bad faith so let's stop

thick hemlock
#

Oh, sorry

#

I didn't mean for that to happen at all, I was actually okay with the discussion

#

sorry it felt like that :(

#

I did not mean at all to say anything in bad faith, sorry if any of my messages sounded like it

#

Have a good day/night!

grave jolt
#

There is an ergonomic issue with Literal regarding type inference.
If you write x = [Color.red, Color.green], x is inferred as list[Color]. But it would be silly to infer the type of ["red", "green"] as list[Literal["red", "green"]]

#

Literals are extremely popular in TypeScript and you can do e.g. ["a", "b"] as const and it will infer it as readonly ["a", "b"].
However, TypeScript has a lot of uses for literals outside of just enum values, so that probably warrants some extra tooling

gray galleon
#

does python exponentiation use exponentiation by squaring for integer exponents

sharp plover
#

If the exponent is larger than that, it uses "left-to-right k-ary sliding window exponentiation" (algorithm 14.85 in the same book).

#

I'm not sure I totally understand the second algorithm, but the first algorithm is the exponentiation by squaring that you already know, except working from the most significant bit of the exponent down to the least significant.

#

Essentially this:
```py
def pow(base, exp):
    # Sketch for exp >= 1; walks the exponent's bits from most to least significant.
    result = 1
    bit = 1 << (exp.bit_length() - 1)
    while bit > 0:
        result *= result
        if bit & exp:
            result *= base
        bit >>= 1
    return result
```

fallen slateBOT
#

Objects/longobject.c line 4848

```c
long_pow(PyObject *v, PyObject *w, PyObject *x)
```
gray galleon
#

why doesn't 'ba'[:10] cause an index error

grave jolt
#

It would be a huge pain to have to write user_input[:min(10, len(user_input))] instead

#

in other words, you almost never want it to error in the case when you have more elements
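
#

A short demonstration of the difference between slicing and indexing; `slice.indices` shows the clamped bounds that slicing uses for a sequence of a given length:

```python
text = "ba"

# Slicing clamps out-of-range bounds to the sequence length.
print(text[:10])
print(slice(None, 10).indices(len(text)))  # (start, stop, step) after clamping

# Indexing a single out-of-range position is an error.
try:
    text[10]
except IndexError as exc:
    print("IndexError:", exc)
```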

faint river
#

range has similar cutoff

#

!e print(*range(0, 7, 4))

fallen slateBOT
grave jolt
#

(unlike for example list/string indexing, where JavaScript made a mistake in the early days, and returned undefined when the element doesn't exist at an index)

halcyon trail
#

I mean, fwiw, I think it's perfectly reasonable to error here too

#

and some languages do

faint river
#

which

halcyon trail
#

rust for example

grave jolt
#

Java I think

halcyon trail
#

i probably prefer erroring tbh

#

if you want to say "I want up to index 10 or the length, whichever is smaller" - why not be explicit and say that

#

but I don't find it a hugely common use case

gray galleon
halcyon trail
#

python is about having a mantra that says that it's explicit 😛

grave jolt
#

Well, Rust is different in that it doesn't treat strings as "arrays of characters for some definition of 'character' (since we're using unicode)"

halcyon trail
#

it's nothing to do with strings really - we can just talk about vectors as well

#
```rs
fn main() {
    let v = vec![1,2,3];
    let u = &v[..5];
}
```

this will panic in rust

grave jolt
#

yeah that is true

halcyon trail
#

another reason why it makes sense in rust is that rust offers non-throwing/panicking API for slicing.
the same way that python has a .get for say dict, that does not throw

#

so, this behavior is useful in conjunction with that

```rs
fn main() {
    let v = vec![1,2,3];
    if let Some(u) = &v.get(..5) {
        println!("{:?}", u);
    }
    else {
        println!("Oops!");
    }
}
```
rose schooner
halcyon trail
#

this prints "Oops"

#

v.get and v[] have to be consistent with each other

#

if you implicitly cut off you would not be able to easily branch on whether the slice fit. It would also feel less consistent with regular indexing.

halcyon trail
#

I think it makes more sense for the default behavior to "fail fast"

#

that's also part of the zen of python 😉

#

Errors should never pass silently.

grave jolt
#

I guess that's the tradeoff that was made when slicing was first developed. Kinda hard to go one way or another for me

halcyon trail
#

the problem I guess for me is that in python the "erroring" version is so awkward that nobody will ever write it- people will just say "yeah, i'm confident these strings have at least 10 characters"

#

I think you literally have to do

```py
assert len(s) >= 10
x = s[:10]
```
#

in rust, the default errors, so you a) have no choice, and b) it's still a one liner s[..min(10, s.len())]

rose schooner
#

s.limit(10)

#

if only..

grave jolt
#

unless you want a slice of the first 10 bytes for some reason

gray galleon
#

s.chars()[..10]?

grave jolt
#

no, chars() is an iterator

gray galleon
#

so to use slice you need to turn it into list smh

halcyon trail
#

You can just get the byte index of the nth character

grave jolt
#

ah, and then split_at

#

that is also possible

#

s.split_at(s.char_indices().nth(10).map(|(i, _)| i).unwrap_or(s.len())).0??

halcyon trail
#

I'm confused

#

Just use char indices and then slice

grave jolt
#

Still, ```rs
{
    // char_indices() yields (byte_index, char) pairs, so keep only the index
    let split_pos = s.char_indices().nth(10).map(|(i, _)| i).unwrap_or(s.len());
    &s[..split_pos]
}
```

#

I'm actually not quite sure what people use string slicing for most of the time in Python

gray galleon
#

for slicing strings

halcyon trail
#

Slicing utf 8 strings by character just isn't a cheap operation

#

So Python is hiding the cost somehow

gray galleon
halcyon trail
#

If str has O(1) access then that just means that it's storing a map from characters to bytes

#

I.e. it's using extra space

#

Basically it's just storing that char_indices you get above on the fly

halcyon trail
#

Then I'm not sure what to say, obviously there's something going on

#

Utf-8 is a variable length encoding

#

How is Python getting you the nth character in O(1) time?

#

Uh the message you linked to literally says it wastes space

#

Just not the kind of waste I was suggesting

#

Basically it's not really a UTF-8 string

gray galleon
#

no it's not

#

it uses latin-1, ucs2 or ucs4 depending on the characters in the string

feral island
#

Yeah strings aren't stored as UTF-8 internally

halcyon trail
#

Yeah I was a bit surprised by that

raven ridge
#

they're stored as either ascii, latin1, ucs2, or utf32 depending on the largest codepoint in them. and the utf-8 representation is cached the first time it's needed.
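
One way to see the width selection from Python is to compare `sys.getsizeof` for strings of different lengths. This is a rough sketch that relies on a CPython implementation detail, so the numbers may differ on other implementations; `bytes_per_char` is a made-up helper name.

```python
import sys


def bytes_per_char(ch: str, n: int = 1000) -> float:
    """Estimate storage per code point by comparing two string sizes."""
    # Both strings contain only `ch`, so they share the same internal width
    # and header size; the difference is purely the extra n characters.
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) / n


for ch in ("a", "\u00e9", "\u20ac", "\U0001f600"):
    print(f"U+{ord(ch):04X}: {bytes_per_char(ch):g} byte(s) per character")
```

Because the width is chosen from the largest code point, a single wide character makes every character in that string take the wider size.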

halcyon trail
#

But anyhow, different trade offs

#

What rust is doing makes sense for rust. Most times you're doing slices it's after finding a character or something like that and you have the byte index

feral island
halcyon trail
#

Yeah so basically what he is proposing is closer to how I already assumed (incorrectly) it worked

#

Oh wow this is very recent

cold hull
#

Does the file size matter for the python interpreter?
I mean is there a size x lines of code (or bytes) where it's better to have two files and one import the other rather than all code being on one file?

feral island
#

Personally I'd start thinking about splitting up a file at 1000 lines or so, or just if it feels like it's becoming hard to understand

cold hull
#

I won't edit the file, it's generated

feral island
#

If it's generated code I wouldn't worry about it

cold hull
#

cool, thanks

dusk comet
#

if it is generated you can run into another set of issues
python parser is written in C, and there are some assumptions about parsed code
one of them is that code is not nested very deeply (current limit is somewhere about 30, iirc)
if you violate these assumptions, parser can crash the process

#

my knowledge was a bit outdated
this (with 99 levels) works perfectly: ```py
if 1:
    if 1:
        if 1:
            if 1:
                if 1:
                    if 1:
                        if 1:
                            if 1:
                                ... # a lot more lines
                                    print(42)
```

with 100 levels it errors: `IndentationError: too many levels of indentation`
i guess it was changed several versions ago
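
A sketch for probing the limit without writing the nesting by hand: generate the source and try to compile it. The helper names are hypothetical, and the exact threshold and exception type may vary between CPython versions, so the script only reports what it observes.

```python
def nested_ifs(depth: int) -> str:
    """Build source with `depth` nested `if 1:` blocks, one space per level."""
    lines = [" " * level + "if 1:" for level in range(depth)]
    lines.append(" " * depth + "pass")
    return "\n".join(lines) + "\n"


def compiles(depth: int) -> bool:
    try:
        compile(nested_ifs(depth), "<generated>", "exec")
    except (SyntaxError, RecursionError, MemoryError) as exc:
        # IndentationError is a subclass of SyntaxError; the other two are
        # caught defensively in case a different limit is hit first.
        print(f"depth {depth}: {type(exc).__name__}: {exc}")
        return False
    return True


for depth in (10, 99, 100, 200):
    print(depth, compiles(depth))
```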
cold hull
#

it will not be nested very deeply, 10 levels maybe of blocks of ifs and classes and whatever
it's just long, not sophisticated

dusk comet
cold hull
#

I'd try to solve it another way long before 20 levels of that :}

#

But I get what you mean

dusk comet
#

if performance is important and you want to micro-optimize the code, there are other things you could do
for example: if you have a lot of try-except/with blocks in one function, and exceptions occur pretty often, you can split this function into several to make exception handling faster
this is because in recent versions "zero-cost try-except" was implemented, that stored all information about try-except blocks (and with blocks, because they are basically the same thing) in one table. And if exception occurs, this table is decoded to figure out where is code that handles this exception. More try-except -> bigger table -> slower exception handling
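
To get a feel for the table being described, the sketch below compares the size of the exception table for a function with one handler and a function with three. It assumes the `co_exceptiontable` attribute that code objects have on CPython 3.11+ and falls back to an empty table elsewhere; it shows table size only, not lookup time.

```python
def one_handler(x):
    try:
        return 1 / x
    except ZeroDivisionError:
        return 0


def three_handlers(x):
    try:
        a = 1 / x
    except ZeroDivisionError:
        a = 0
    try:
        b = int(x)
    except (TypeError, ValueError):
        b = 0
    try:
        c = [1, 2, 3][x]
    except (IndexError, TypeError):
        c = 0
    return a + b + c


def table_size(func) -> int:
    # co_exceptiontable is assumed to exist on CPython 3.11+ only.
    return len(getattr(func.__code__, "co_exceptiontable", b""))


print(table_size(one_handler), table_size(three_handlers))
```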

cold hull
#

it is not important now, and if/when it will be I'll write C or something else probably

rain trellis
#

I've been working on a fresh branch for superinstructions... suffice to say something's gone a little wrong 😅

#

For some reason, with more than one instruction, it's not unrolling the switch that handles multiple ops. Basically:

```c
// -D_JIT_OPCODES={1,2} -D_NUM_UOPS=2

opcodes[] = _JIT_OPCODES;
for (int i = 0; i < _NUM_UOPS; i++) {
    uopcode = opcodes[i];
    switch (uopcode) {
        #include "executor_cases.c.h"
    }
}
```
gray galleon
#

#include in a switch block ducky_concerned

fallen slateBOT
#

Tools/jit/template.c line 112

```c
#include "executor_cases.c.h"
```
crimson hatch
#

Does anyone know of a way to specify in argument clinic that an argument should be a mapping?

#

I know I can do a dictionary subclass check but as far as I can tell protocols are unsupported by argument clinic

raven ridge
#

Given that PyMapping_Check basically documents that it's impossible to check whether or not an arbitrary object is a mapping, I'm guessing the answer is "no", though I've never used argument clinic and I don't know for sure

#

!d PyMapping_Check

fallen slateBOT
#

```c
int PyMapping_Check(PyObject *o)
```
*Part of the [Stable ABI](https://docs.python.org/3/c-api/stable.html#stable).* Return `1` if the object provides the mapping protocol or supports slicing, and `0` otherwise. Note that it returns `1` for Python classes with a [`__getitem__()`](https://docs.python.org/3/reference/datamodel.html#object.__getitem__) method, since in general it is impossible to determine what type of keys the class supports. This function always succeeds.
feral island
#

The two C-implemented functions I could think of that accept a mapping (str.format_map and dict.update) don't use argument clinic 🙂

crimson hatch
#

Ok! Sounds like I should just take object and do the check in the implementation

grave jolt
#

Lots of objects have a __getitem__ (sequences, generic classes, probably something else)

crimson hatch
#

Yeah that is true

dusk comet
#

list[T] is a mapping from int to T

#

change my mind

faint river
#

With further restrictions

#

you could base the definition off of collections.abc.Mapping, which should make lists not mappings

grave jolt
grave jolt
#
In [19]: dict.update??
Docstring:
D.update([E, ]**F) -> None.  Update D from dict/iterable E and F.
If E is present and has a .keys() method, then does:  for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does:  for k, v in E: D[k] = v
In either case, this is followed by: for k in F:  D[k] = F[k]
Type:      method_descriptor
#

(Something with just getitem and keys is not a mapping according to a glossary)

feral island
#

These informally defined "protocols" tend to be very squishy

#
...     def keys(self):
...         return range(len(self))
...         
>>> dict(mymap([1, 2, 3]))
{0: 1, 1: 2, 2: 3}
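
A self-contained sketch of the same point: an object with only `keys()` and `__getitem__` is accepted by `dict()`, yet it is not a `collections.abc.Mapping`. The class `KeysAndGetitem` is hypothetical and exists only for this illustration.

```python
from collections.abc import Mapping


class KeysAndGetitem:
    """Hypothetical class: not a Mapping, but enough for dict() to consume."""

    def __init__(self, items):
        self._items = list(items)

    def keys(self):
        return range(len(self._items))

    def __getitem__(self, index):
        return self._items[index]


obj = KeysAndGetitem([1, 2, 3])
as_dict = dict(obj)  # dict() sees .keys() and then looks up obj[key] for each key

print(as_dict)
print(isinstance(obj, Mapping), isinstance([], Mapping), isinstance({}, Mapping))
```

An `isinstance(x, Mapping)` check is therefore stricter than what `dict()` or `dict.update` actually require.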
grave jolt
grave jolt
jade raven
#

when did python3 drop int objects and just use c-long objects and make that better instead?

naive saddle
#

Hmm.. are there any more resources on using llvm-bolt with CPython other than the docs ? I am running into this issue on Ubuntu 24.04 and I can't find any good links for this error :/

BOLT-INFO: 62781 instructions were shortened
BOLT-INFO: removed 158 empty blocks
BOLT-INFO: UCE removed 918 blocks and 55709 bytes of code
BOLT-INFO: padding code to 0x1400000 to accommodate hot text
BOLT-ERROR: library not found: /usr/lib/libbolt_rt_instr.a
make[1]: *** [Makefile:854: profile-bolt-stamp] Error 1
make[1]: Leaving directory '/home/ichard26/Downloads/Python-3.12.4-bolted'
make: *** [Makefile:883: bolt-opt] Error 2
#

I installed llvm-bolt from apt.

rose schooner
jade raven
rose schooner
#

ok nvm

#

they just did a different way to spell the current implementation on <=3.11

jade raven
#

can features be removed after an (No new features beyond this point.) point?

rose schooner
#

they'd probably make it to the next minor release

jade raven
#

3.13.0 beta 1: Wednesday, 2024-05-08 (No new features beyond this point.)
like i see this, can these features be removed before an RC?

rose schooner
#

i mean a feature removal is sort of a "new feature"

#

no major changes to the minor version are expected to come after the beta freeze

#

immediately after beta of the current dev minor is released the main repo would shift versioning to the next minor release

#

some bug fixes could be backported but that's about it

thick hemlock
#

Maybe if a feature is found to be too problematic to release it can be pushed back?

#

Shouldn't be something that regularly happens though

thick hemlock
#

You should just say the specific situation though

feral island
jade raven
#

or really not set in stone until final release?

feral island
#

only final releases are final

#

however, during the RC phase the ABI is final

thick hemlock
#

interesting

#

why does that get a different treatment

feral island
thick hemlock
#

ah okay

#

sounds like a generally sensible motivation yeah

#

I'll read about it more though

jade raven
thick hemlock
#

weird how that applies only to the ABI since other things can break as well

#

but there are probably good reasons for it

feral island
#
jade raven
feral island
feral island
#

I guess that's mostly for historical reasons

naive saddle
#

Well Black is actually OK as the psf/ organisation is meant to house community projects anyway.

feral island
#

I think the main reason for this policy was that Łukasz unilaterally put black in /python/ 😄

naive saddle
#

Haha :)

fallen slateBOT
#

fixtures/sitetree_menus.json line 2006

```json
"title": "Alternative Implementations",
```
safe basalt
#

Wow thanks embed

#

Better

jade raven
south kayak
feral island
safe basalt
# south kayak wait is this an official server?

Basically
It's where the core team hangs out
But not everyone likes it
Some of the core team still only uses Discuss/mailing lists/etc

There's three Discord servers
The docs one and the PyPA ones are public
And then there's a private one for core devs

south kayak
#

that's fair

naive saddle
#

I'll note that the PyPA and CPython project are independent of each other, although there's considerable overlap in core team membership.

crimson hatch
#

Huh so the devguide docs for asan/ubsan seem rather (to put it lightly!) out of date: https://devguide.python.org/development-tools/clang/

I tried just setting -fsanitize=address and -fsanitize=memory, and when I build CPython I get a bunch of errors from frozen modules, which I assume is because they are immortals if I understand correctly?

crimson hatch
#

I should update the dev guide based on this

jade raven
#
```py
from tkinter.filedialog import askopenfile
```
seems like i found a bug for 3.13t
#

python3.13t crashes entirely on this import

dusk comet
#

what is 3.13t ?

#

version with JIT enabled?

crimson hatch
#

Free threaded I presume

jade raven
#

free threaded yeah

crimson hatch
jade raven
crimson hatch
#

Ah

#

Their code does import tkinter.filedialog

jade raven
merry venture
#

a case when Exception > BaseException
don't you think this should change?

#

oh, it did.

#

interesting.

#

i can't reproduce this on 3.12.

#

in 3.11 i can still reproduce this behavior.

#

I'm wondering if that isn't a breaking change. maybe someone based off their workflow on the fact that this allowed to dodge base exceptions? clueless

torpid ember
torpid ember
grave jolt
fallen slateBOT
#

✅ deleted

merry venture
radiant garden
#

amazing

native flame
#

how does object()+x manage to call x.__radd__ if there's no default implementation of object.__add__ which returns NotImplemented

#

there is a default impl for the comparison operators though

merry bramble
#

This is generally true for all binary-operation dunders

native flame
#

oh, i see

#

is that not true for <, >?

merry bramble
#

I think it is, but not 100% sure. I don't know why there's the default impl for those on object, but not for the other dunders
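
A quick sketch of what is being described: `object` defines no `__add__`, yet `object() + x` still reaches `x.__radd__`, while the comparison dunders do exist on `object` and return `NotImplemented`. The class `Reflected` is made up for the example.

```python
class Reflected:
    """Hypothetical class that only implements the reflected addition."""

    def __radd__(self, other):
        return "radd called"


result = object() + Reflected()          # left operand has no __add__ at all
has_add = hasattr(object, "__add__")     # no default implementation
has_lt = hasattr(object, "__lt__")       # comparisons do have one
lt_default = object().__lt__(1)          # and it declines by returning NotImplemented

try:
    object() + object()                  # neither side can handle the operation
    add_failed = False
except TypeError:
    add_failed = True

print(result, has_add, has_lt, lt_default, add_failed)
```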

native flame
rain trellis
#

Is there a way to enable PyStats builds for windows? I don't see it in the build.bat options

#

I don't particularly need it, but I'm wondering if it's something that I need to worry about while poking at PyStats

native flame
fallen slateBOT
#

Objects/typeobject.c line 6412

```c
object_richcompare(PyObject *self, PyObject *other, int op)
```
feral island
glass mulch
#

Greetings! I'd like to add a way to detect which REPL (basic or PyREPL) is in use, mostly to be able to test that the PYTHON_BASIC_REPL environment variable is working as it should. That's a very weak use case, could you help me come up with better ones?

I've described a couple of lame use cases on a Discourse thread[1], but I'm bad at coming up with interesting ones. Maybe it would help to conditionally enhance the REPL with readline/history recording if it was detected the basic one was in use?

[1] https://discuss.python.org/t/add-a-way-to-detect-whether-the-basic-or-the-new-repl-pyrepl-is-in-use/56109/4

thick hemlock
#

Why would you like to do this if you can't find strong use cases?

#

what's the motivation?

glass mulch
#

I added a unit test that only checks that PYTHON_BASIC_REPL is set on os.environ when passed as an env var, it should ideally check that the right REPL was chosen. And not checking that bothers me. Not a good reason, I know 😦

feral island
#

That sounds like good enough reason to put something in test.support within CPython at least

#

Not sure about making it a public API, that would require a use case external to our own test suite

brisk sparrow
#

Is there a discord server where I can unban myself

glass mulch
feral island
#

also here's a possible way to tell the difference ```
% PYTHON_BASIC_REPL=1 ./python.exe
Python 3.14.0a0 (heads/pep649-inspect-dirty:6e078fd344, Jun 12 2024, 19:54:37) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys._getframe(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    sys._getframe(1)
    ~~~~~~~~~~~~~^^^
ValueError: call stack is not deep enough
>>> ^D
% ./python.exe
Python 3.14.0a0 (heads/pep649-inspect-dirty:6e078fd344, Jun 12 2024, 19:54:37) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys._getframe(1)
<frame at 0x102dc75b0, file '/Users/jelle/py/cpython/Lib/code.py', line 91, code runcode>
```

glass mulch
merry bramble
#

If we're doing "cursed ways to detect whether the REPL is written in pure Python or not", here's my contender:

```
~/dev/cpython (main)⚡ % PYTHON_BASIC_REPL=1 ./python.exe
Python 3.14.0a0 (heads/main:ead676516d, Jun 25 2024, 10:30:23) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import inspect
>>> len(inspect.stack()) > 1
False
>>> exit()
~/dev/cpython (main)⚡ % ./python.exe
Python 3.14.0a0 (heads/main:ead676516d, Jun 25 2024, 10:30:23) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import inspect
>>> len(inspect.stack()) > 1
True
```
#

Oh, this actually might be a non-cursed way of figuring out whether the REPL is written in pure Python or not @glass mulch: pure-Python modules always have a __file__ dunder:

```
~/dev/cpython (main)⚡ % PYTHON_BASIC_REPL=1 ./python.exe
Python 3.14.0a0 (heads/main:ead676516d, Jun 25 2024, 10:30:23) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> hasattr(sys.modules["__main__"], "__file__")
False
>>> exit()
~/dev/cpython (main)⚡ % ./python.exe
Python 3.14.0a0 (heads/main:ead676516d, Jun 25 2024, 10:30:23) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> hasattr(sys.modules["__main__"], "__file__")
True
```
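
Wrapped up as a helper, the check from the transcript could look like the sketch below. The function name is hypothetical, and the result is only a heuristic: it distinguishes the two REPLs shown in these sessions, and it also returns `True` when the code is simply run as a script, so it is only meaningful inside an interactive session.

```python
import sys


def main_module_has_file() -> bool:
    """Return True if the __main__ module has a __file__ attribute."""
    return hasattr(sys.modules["__main__"], "__file__")


print(main_module_has_file())
```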
glass mulch
#

Non-cursed is even better! Let me test something real quick...

#

That does solve the "how was Python initialized REPL-wise" part 🙂

raven ridge
#

it may work for the two possible REPLs that exist today, but definitely doesn't work in general for all possible REPLs

merry bramble
raven ridge
#

yeah, definitely true - I didn't see that was the context, I assumed this was for a library

#

for CPython's test suite, it's definitely true that the set of possible REPLs is known 🙂

thick hemlock
#

on that note I just love the new REPL

#

I have been missing that as a built-in feature for so long lol

halcyon trail
#

is it actually better than ipython

#

I can't even remember the last time I used the "standard" python repl

thick hemlock
#

I use ipython where I can

halcyon trail
#

i'm curious what improvements it actually brings

#

I just skimmed this

thick hemlock
#

but I use the REPL a lot on servers and generally a lot of things that aren't my main computer

halcyon trail
#

I mean all this is stuff that's ancient history

#

in ipython

thick hemlock
halcyon trail
#

i guess it depends what you're doing, what workflows you use, etc

thick hemlock
#

yeah ofc

#

I understand that for a lot of people ipython is gonna be good enough

halcyon trail
#

you can throw micromamba on that server and have python, ipython, and all the third party packages that your heart desires inside of 2 minutes

thick hemlock
#

those servers are not always connected to the internet

halcyon trail
#

then yeah, that's annoying

#

we have servers like that as well, but micromamba is what my work uses for C++ and python environments, so we have our own conda-forge channel hosted, so that works.

thick hemlock
#

also sometimes for troubleshooting I want to run a REPL in a docker container