#internals-and-peps
here are the rules i know i missed, but i'll have to research that form feed one, hadn't heard of it
Furthermore, if a line ends in a backslash (\) or any pair of delimiters is open (( and ), [ and ], { and }), no newline token is emitted and no indent or dedent tokens are generated for the following line.
btw wrt the (?) i found
A formfeed character may be present at the start of the line; it will be ignored for the indentation calculations above. Formfeed characters occurring elsewhere in the leading whitespace have an undefined effect (for instance, they may reset the space count to zero).
the behavior is undefined
for completeness, this is the revised version of my "exception" list:
There are a few exceptions to the indentation rules:
- If a line only contains spaces, tabs, formfeeds, or comments, then no newline token is generated, and that line does not contribute to the addition or removal of a level to the indentation stack.
- If a line ends in a backslash (\), no newline token is emitted and no indent or dedent tokens are generated for the following line.
- If any pair of delimiters is open (( and ), [ and ], { and }), no newline, indent, or dedent tokens are emitted.
- All formfeed characters in the leading whitespace are simply ignored and do not contribute to the indentation levels. This behavior is left undefined by the reference, but is done for consistency with the rule that formfeed characters at the start of a line must be ignored for indentation calculations.
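A quick way to poke at the first three rules is the stdlib `tokenize` module. This is only a sketch: the `token_names` helper and the sample sources are made up for illustration, and `tokenize` reports the non-logical line breaks as `NL` tokens rather than `NEWLINE`. The formfeed rule is not exercised here, since the reference leaves it undefined.

```python
import io
import tokenize


def token_names(source):
    """Return the token type names the tokenizer produces for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# Open bracket: the second line is indented, but no INDENT is generated.
bracketed = token_names("x = (1,\n        2)\n")

# Backslash continuation: same story for the line after the backslash.
continued = token_names("y = 1 + \\\n        2\n")

# A whitespace-only line and an oddly indented comment inside a block:
# neither one pushes or pops an indentation level.
commented = token_names("if x:\n    pass\n   \n        # odd comment\n    pass\n")

print(bracketed)
print(continued)
print(commented)
```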
Why is it not allowed to subclass an enum with members?
Why can't i extend some enum through subclassing?
Allowing subclassing of enums that define members would lead to a violation of some important invariants of types and instances.
For example, if you have
```py
from enum import Enum, auto

class Color(Enum):
    RED = auto()
    BLUE = auto()
    GREEN = auto()
```
there's an expectation that there are only three instances of the type `Color`. This would be violated if you were allowed to subclass `Color`
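For anyone who wants to see the restriction in action, here is a small sketch. `ExtendedColor`, `Describable` and `Shape` are made-up names; the exact error message differs between Python versions, so only the exception type is relied on.

```python
from enum import Enum, auto


class Color(Enum):
    RED = auto()
    BLUE = auto()
    GREEN = auto()


# Subclassing an enum that already has members is rejected at class creation.
try:
    class ExtendedColor(Color):
        PURPLE = auto()
except TypeError as exc:
    extend_error = exc
else:
    extend_error = None


# An enum without members can still be subclassed, e.g. to share behaviour.
class Describable(Enum):
    def describe(self):
        return f"{type(self).__name__}.{self.name}"


class Shape(Describable):
    CIRCLE = auto()
    SQUARE = auto()


print(repr(extend_error))
print(len(Color), Shape.CIRCLE.describe())
```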
do you have some use case where you'd like to do this?
No, im just curious
Also typecheckers are able to ensure that all cases are handled in match-case
well, type checkers live in their own world
What's strange about 24:00? 🥸 Japan uses time data reaching up to 27:00 as I've seen. For example "Open Fri 18:00-27:00"
i don't understand how the REPL is started
for me the pyreadline3 module is doing the REPL stuff (is that true on a fresh python?)
how does the interpreter know which module to import to start the REPL?
how can i override this behaviour?
I think you're looking for the console module?
Oh wait, you're looking for the code built-in @dusk comet
!docs code.InteractiveConsole
```
class code.InteractiveConsole(locals=None, filename='<console>')
```
Closely emulate the behavior of the interactive Python interpreter. This class builds on [`InteractiveInterpreter`](https://docs.python.org/3/library/code.html#code.InteractiveInterpreter "code.InteractiveInterpreter") and adds prompting using the familiar `sys.ps1` and `sys.ps2`, and input buffering.
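To get a feel for the class without a terminal attached, something like the sketch below drives it line by line through `push()`. The `namespace` dict is just an illustration; this shows the `code` module only and makes no claim about how the real interpreter starts its own REPL.

```python
import code

namespace = {}
console = code.InteractiveConsole(locals=namespace)

# push() feeds one line at a time and returns True while more input is needed.
needs_more_simple = console.push("x = 40 + 2")
needs_more_block = console.push("def f():")
console.resetbuffer()  # abandon the unfinished block

print(needs_more_simple, needs_more_block, namespace["x"])

# console.interact() would start the blocking read-eval-print loop on stdin.
```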
At which stage does python call this module?
And how can i run a different repl?
Right now my repl is handled by the pyreadline3 module. When was this module started?
you're looking for what happens when you just run python and get the REPL? Seems like it calls pymain_run_stdin in Modules/main.c, but haven't traced further than that
I think the readline stuff happens in tokenizer.c where it ends up calling PyOS_Readline
Thank you, i will look into that
I'm curious, has anyone ever come across the pattern of a "dismissable" context manager
it seems like it could be helpful for writing code that transfers ownership of something across a scope boundary, while being exception safe
```py
def make_foo():
    with Dismissable(...) as (dismiss, foo):
        foo.do_one_thing()
        dismiss()
        return foo
```
The idea being that Dismissable wraps the expression in ..., which is a context manager, with another context manager
All this does is store an additional bool, which starts true, and gives you the option to change it to false.
the original context manager's exit is run conditionally on that bool
it gives back a pair, which is the "dismiss" handle, and the original context manager's object
in this example, we transfer ownership of foo to someone calling make_foo(), but also if an exception is thrown by foo.do_one_thing(), we still correctly cleanup
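A minimal sketch of what such a wrapper could look like, going only by the description above. `Dismissable` is not a stdlib class, and `Resource` is a hypothetical stand-in for whatever is being managed.

```python
class Dismissable:
    """Wrap a context manager so its __exit__ can be switched off."""

    def __init__(self, cm):
        self._cm = cm
        self._armed = True  # the extra bool described above

    def _dismiss(self):
        self._armed = False

    def __enter__(self):
        # Hand back the "dismiss" handle plus whatever the wrapped manager yields.
        return self._dismiss, self._cm.__enter__()

    def __exit__(self, exc_type, exc, tb):
        if self._armed:
            return self._cm.__exit__(exc_type, exc, tb)
        return False


class Resource:
    """Hypothetical stand-in for a file or connection."""

    def __init__(self, fail=False):
        self.fail = fail
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.closed = True
        return False

    def do_one_thing(self):
        if self.fail:
            raise RuntimeError("setup failed")


def make_foo(resource):
    with Dismissable(resource) as (dismiss, foo):
        foo.do_one_thing()
        dismiss()
        return foo


ok = make_foo(Resource())
broken = Resource(fail=True)
try:
    make_foo(broken)
except RuntimeError:
    pass

print(ok.closed, broken.closed)
```

On the success path the resource comes back still open, and on the failure path it has been cleaned up before the exception propagates.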
what does "transfer ownership mean"? I'm confused what problem this is solving.
sorry let me make the example more concrete
```py
def make_file_with_header(path):
    with Dismissable(open(path, 'w')) as (dismiss, file):
        file.write("Header line\n")
        dismiss()
        return file
```
usage
```py
with make_file_with_header(path) as file:
    ...
```
what does dismiss() do?
dismiss stops the original nested context manager from running
sounds like contextlib.ExitStack.pop_all might meet your needs
i.e. the __exit__ on the object returned by open
to keep it from closing the file? Why use a context manager at all if you want to avoid the close?
@raven ridge let me check it out
they want it to be closed if an exception is raised, but not if not (in that example)
if you don't use a context manager, and file.write throws
it will leak
at least, as far as I understand
but the caller will also be using a context manager, which will try to close the file.
because the user's context manager will never run
will it?
the expression to initialize the context manager will never complete
I don't think it will but maybe I'm wrong
oh, hmmm maybe not?
I don't really see how it could tbh
but if the caller's doesn't close it, what will?
that's my point
the inner one will close it
that's why we create the inner context manager, but dismiss it just prior to exiting our function
but if there's no exception, it will return a closed file, which seems useless.
it won't, because we called dismiss() 🙂
that's what dismiss() does, it prevents the context manager from running exit
I think this is equivalent to:
```py
import contextlib

def make_file_with_header(path):
    with contextlib.ExitStack() as stack:
        file = stack.enter_context(open(path, 'w'))
        file.write("Header line\n")
        stack.pop_all()
        return file
```
but notice that we only get to call dismiss() if we make it there
ok, and if there's no exception, how will the caller close the file?
if there's no exception, the user's context manager will run normally
sorry, i'm mixed up
why does the caller have a context manager?
because you're returning a file, the user needs to control when it's closed
and why does this function return a file? It's either closed, or an exception happened.
because the user is going to keep writing to it
when will the caller close the file?
no, because dismiss() prevents it from closing
the idea of make_file_with_header is that it either a) throws an exception and closes the file, or b) exits normally, returning a file that the user needs to close
how will the user close it?
with a context manager
wouldn't that context manager close it in the exception case also?
i mean if you don't use dismiss(). get rid of the inner context manager, open the file, and return it. period.
if it's still not clear it may just be simpler to look at the naive code first, and see why it's wrong
just don't take ownership and have the user pass you an open file if that's the intent?
then the file can leak
no, because the outer context manager will close it if there's an exception.
i don't think it will
why wouldn't it?
because the context object is never initialized
ie.
```py
with open(...) as fp:  # user code
    your_function(fp)
```
there's no object to call exit on
```py
def make_file_with_header(path):
    file = open(path, 'w')
    file.write("Header line\n")  # let's say this throws
    return file

...

with make_file_with_header(path) as file:
    ...
```
here's the scenario
i think there's a way to organize your context managers so that this works without dismiss()
the general pattern I have seen more looks closer to this
```py
@contextmanager
def wav_datafile(path):
    with open(path, 'w') as file:
        write_header(file)
        yield file

with wav_datafile(...) as f:
    f.write(first_sample)
```
we can test this
but I don't see how make_file_with_header(path) as file can possibly clean up the file
when the function throws, it never receives anything
a 20-line test file will test it out
sure
but dismiss is an interesting idea
I'll test it
don't be.... dismissive 😛
just kidding
but I think what godlygeek mentioned is probably close enough
nice
This really doesn't seem like a problem with context managers and just code that should be structured slightly differently. 🤔
as for the pattern you mentioned @flat gazelle I am not sure that's the same
sorry, I don't agree, you can't really just approach this by saying "well why not just pass the file descriptor" etc
obviously there are functions in python that open in one function and close/exit in another
you try to avoid it because it's more awkward for the language, but it's not 100% avoidable
But I can, and I absolutely think it's the cleaner approach here for your example. I don't need open to write a header, I write the header independently from opening the file
but the whole point is to combine the functionality
and yes, you can, but you'd just be wrong
because this is obviously just a single example
it could be opening a database connection and doing some "pre work" that you always want done
then can you give me an example where this is actually needed?
it could be a million things
yeah, I do not want to write
```py
with create() as f:
    setup_f(f)
```
for every usage of `f`
I want to factor that out into a separate "function"
I just gave you another, if your goal here is just to go through every example and say why you think it would be structured differently... that's not a game I want to play tbh
I think it suffices that sometimes you will want to structure your code that way
but a function is inadequate since cleanup is required
yeah, exactly
I need to look more carefully at your pattern btw
so you make a new contextmanager, which reuses the old one, and just adds new behaviour
you dont have to, you can have
```py
@contextmanager  # im being slightly lazy here
def setup(path):
    try:
        with open(...) as f:
            do_thing(f)
            yield f
    finally:
        pass
```
this isn't a matter of needing to repeat yourself
which is what I have shown
(which I still like less than dismiss, but yeah it's an option)
```py
@contextmanager
def setup_and_create():
    with create() as f:
        setup_f(f)
        yield f
```
is probably python's one obvious way to do it
I dont think there's any benefit in adding another way to do this here, the existing ways work, and there are many options here for them based on your actual needs already.
nah there's no actual needs here, just code that should be refactored anyway right 😉
A function that does nothing if a flag is set always rubs me the wrong way, but that is probably just my silliness.
as far as you've shown, I'd agree with that even if you intended it as a dig here.
Adding this would violate expectations about contextmanagers just to add another way to do something.
currently, __exit__ running and when it runs is one of the strongest guarantees python actually has for cleanup. changing the semantics of when it might run, and that you can intentionally bypass it and break cleanup doesn't seem like a good idea to me if it isn't solving something not already solved.
.... in the end literally all that happened is that folks suggested an alternative way of implementing it. not that the original idea or requirement didn't make sense.
@spark magnet btw
```py
class Bar:
    def __init__(self):
        print("enter")

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        print("exit")

def make_bar():
    b = Bar()
    raise RuntimeError("oh no")

with make_bar() as x:
    print(x)
```
this prints "enter"
but not "exit"
it's odd that you printed enter in __init__ instead of __enter__
that's the case I was original thinking of
sorry
you're right, it is strange
if I fixed it to make sense it would almost certainly print init but not enter
right
Exceptions inside __enter__ will not call __exit__
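A small sketch to make that concrete, with the prints replaced by an `events` list so the order is easy to inspect. `Noisy` is a made-up class for illustration.

```python
events = []


class Noisy:
    def __init__(self, fail_in_enter):
        self.fail_in_enter = fail_in_enter
        events.append("init")

    def __enter__(self):
        events.append("enter")
        if self.fail_in_enter:
            raise RuntimeError("enter failed")
        return self

    def __exit__(self, exc_type, exc, tb):
        events.append("exit")
        return False


# Case 1: __enter__ raises, so the with body and __exit__ never run.
try:
    with Noisy(fail_in_enter=True):
        events.append("body")
except RuntimeError:
    pass
failed_enter = list(events)

# Case 2: the body raises, and __exit__ does run.
events.clear()
try:
    with Noisy(fail_in_enter=False):
        raise ValueError("body failed")
except ValueError:
    pass
failed_body = list(events)

print(failed_enter)
print(failed_body)
```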
yeah, it's not even in enter though
tbh, contextmanagers overall have this problem, where the object is one of AfterInit AfterEnter AfterExit, but you generally want AfterEnter AfterExit only
this is especially obvious when you use async context managers, where the constructor just does something - generally assigns init arguments as attributes, and then __enter__ has to do actual setup logic
The fact that ExitStack.pop_all exists is evidence that this pattern - deciding against calling some cleanup and making it someone else's responsibility instead - has some value. Or at least, it was thought to have value at the point when it was added to the standard library 🙂
lakmatiol's solution though, in this particular case seems to be the cleanest
to be honest, I've definitely written code exactly like that before, I just never really thought carefully about whether it was fully exception safe
and it was hard to convince myself just looking at it, that it was
I tested it and it is 🙂
It's almost like @contextmanager is another "color" of function 😛
in a way IG
it's just syntactic sugar for defining a class that implements the context manager protocol yourself
I'd never allow code using it to be checked in in review, just because someone came up with a solution years ago doesn't mean we don't have better ways to do this, and didn't also have them at the time with less eyes and experience on it.
Pass stateful resources into things that need them, and this works in both synchronous and asynchronous settings.
🤷♂️ That's your prerogative. In my experience, good engineers don't tend to trade in absolutes, though.
There's plenty of old smelly design in the standard library, no need to use it just because it exists
There are a few things I'd trade in absolutes on, and this is one of them. passing resources down into what needs them is the best way to ensure they live for exactly as long as they are needed, are scoped properly, and are cleaned up properly.
if a better method comes along, I'll update my view on that, but for now, it's where things are.
there's literally nothing wrong with the solution given by lakmatiol (and later yourself)
and it literally solves exactly the same problem here
the method I provided is passing it down, it's just wrapping it in the process.
Like, sorry but you've just been unnecessarily negative on the idea, then on the implementation... not really sure what the point is
```py
@contextmanager
def open_files(paths):
    with ExitStack() as stack:
        fs = [stack.enter_context(open(path)) for path in paths]
        yield fs
```
I don't really see the advantage of `pop_all` over doing something along these lines.
IG if you want to leave the context manager world to make a full object with __enter__ and __exit__
I'm negative on it because I've seen the effects of people trying to avoid using context managers (And especially asynccontextmanagers) and skipping proper cleanup through buggy code
the method you and lakmatiol provided is still doing an ownership transfer; because the client needs to use with
...
nobody was trying to avoid using a context manager here
but even the original PEP proposed context managers via the decorator and yield, rather than by making it a class
the thing I suggested was itself a context manager. I just didn't realise it was superseded by @contextmanager, which is itself basically a generic way to easily wrap context managers (and other bits of code)
the proposed dismiss thing you had is in the vein of skipping __exit__ (ie avoiding the contextmanager)
it is the canonical way to make context managers, contrary to what code in the wild would make you think
but Dismissable itself is a context manager...
well, https://github.com/python/cpython/blob/1858db7cbdbf41aa600c954c15224307bf81a258/Lib/test/test_zoneinfo/test_zoneinfo.py#L58-L87 for an example. I can't easily see a different way to accomplish this by passing resources down instead.
I'm definitely not saying it was a great idea; I mostly posted here to see if there were better solutions
but saying I was trying to avoid context managers is just flat out incorrect
I would argue that is a situation in which you are using a bad API that doesn't support context managers
sometimes I really like @contextmanager, but there's been a few times where I've just found it confusing, and written a class explicitly instead.
i'm not sure I could honestly tell you exactly why.
fair enough
i think both are pretty legitimate ways to write them, generally speaking.
in this case though, yeah, @contextmanager is very clearly the way to go
most context managers I have put in a class felt weird since you pretty much have to arbitrarily split code into __init__ and __enter__. Haven't run into a case where there is clear separation (outside of async where the separation is "only one of them is allowed async")
at any rate I learned that the pattern where you wrap a context manager with another @contextmanager function is 100% exception safe, which I never really realized before, so I'm calling that a win for me 😛
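For the record, a test along these lines is one way to check that claim. It is only a sketch: `Resource`, `bad_setup` and `good_setup` are hypothetical stand-ins, and `setup_and_create` here takes its pieces as arguments so the three cases can share it.

```python
from contextlib import contextmanager


class Resource:
    """Hypothetical stand-in for a file; records whether it was closed."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.closed = True
        return False


@contextmanager
def setup_and_create(resource, setup):
    with resource as r:
        setup(r)
        yield r


def good_setup(r):
    pass


def bad_setup(r):
    raise RuntimeError("setup failed")


# Case 1: setup raises before the yield is ever reached.
setup_fails = Resource()
try:
    with setup_and_create(setup_fails, bad_setup):
        pass
except RuntimeError:
    pass

# Case 2: the caller's with body raises.
body_fails = Resource()
try:
    with setup_and_create(body_fails, good_setup):
        raise ValueError("body failed")
except ValueError:
    pass

# Case 3: nothing raises; the resource is open inside the body, closed after.
happy = Resource()
with setup_and_create(happy, good_setup) as r:
    open_inside_body = not r.closed

print(setup_fails.closed, body_fails.closed, open_inside_body, happy.closed)
```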
thanks
This is going to get into a semantic argument about what it means to avoid a context manager, and I dont think that's useful here.
dismiss as proposed, as well as the use of exitstack in the way it was proposed waffles about who is actually responsible for cleanup, and offers a way which has the potential to be broken by mistakenly calling dismiss.
It would arguably be more useful to have a way to make __exit__ idempotent if the goal is allowing anyone to handle cleanup, as that still has only running it once, but you could have it be handled as a real guarantee that the last exit must cleanup if it hasn't been yet.
and have non context manager analogues
yeah...
dismiss as proposed, as well as the use of exitstack in the way it was proposed waffles about who is actually responsible for cleanup
I don't agree. Both dismiss as proposed and exitstack as proposed are about transferring ownership of a resource from the function where it's created to someone else.
That was the original framing of the question. There's no waffling involved in that.
pop_all seems more like a way to talk to an API which doesn't use context managers
which is... fine
there are plenty of bad APIs around
like unittest
and some which even avoid them for legitimate reasons I am sure
nah, that would be the case if it returned a callable - it returns an ExitStack instead, which can be entered - which means it does allow moving resources from one context manager to another.
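A tiny sketch of that hand-off, using a plain callback instead of a real resource so the ordering is visible (`log` and `moved` are illustrative names):

```python
from contextlib import ExitStack

log = []

with ExitStack() as stack:
    stack.callback(log.append, "cleanup")
    moved = stack.pop_all()  # the callback now belongs to `moved`

# Leaving the first with block did not run the callback.
after_first_with = list(log)

# The returned ExitStack is itself a context manager and can be entered.
with moved:
    log.append("body")

print(after_first_with, log)
```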
even the doc example instantly makes it into a callable, to be fair
sure, you can use it that way, but I strongly doubt anyone actually does that.
except in neither case is there a guarantee anymore that cleanup will be run. It encourages code that is more likely to handle something incorrectly, instead of just making it the caller's responsibility to use a context manager and having cleanup be guaranteed
mm, true enough.
sometimes, the cleanup is on some library that doesn't give you flexibility to attach a context manager.
sure. That's a necessary part of handing ownership of a resource to something other than your caller.
this is a pretty interesting topic, fwiw. The C++ folk I talk to tend to generally point out the shortcomings that with, and its many, many similar incarnations across GC languages, have.
I generally agree that these things have real shortcomings, I just think it's a relatively low price of admission, given how nice GC is when you can afford it.
huh, that's actually an interesting point. So many of these abstractions are designed around the fact moves don't exist
You can have RAII without moves (e.g. C++ until 12 years ago)
it's more the fact that C++/Rust, guarantee that stack variables get destroyed deterministically, and so on transitively
so you can always ensure that things are cleaned up when you want
GC languages tend to not want to provide those guarantees
in C++, this same function would be along the lines of
```cpp
T make_thing() {
    T a;
    write_header(a);
    return a;
}
```
no faffing about with colorful functions.
yep
that's exactly how this came up
and it's impossible to forget to use with also
and yet the behavior of __exit__ and context managers is actually pretty strongly guaranteed here. as is try/finally.
it does however force you into these odd workarounds to use it, since passing a context up the callchain is impossible without giving up guarantees
so, the two examples that typically come up where RAII is cleaned is a) transferring scopes, b) members.
b) got me thinking, how in RAII languages (C++/Rust), classes automatically handle the RAII guarantees of all their members. In GC languages with context managers or similar, I do not know a single example of a language that will handle this for you automatically.
one bit of messiness there is that in RAII languages, conceptually all types are constructed and destroyed, so it makes perfect sense that when you are destroyed, all your members are destroyed
construction/destruction is also unique per-type
It really doesn't. Write apis that don't assume they are responsible for the thing needing cleanup. Use is just
```py
with thing:
    do_something_to(thing)
```
and it all just works. There's a reason I have such strong absolutes on this in particular, and it's because there's a lot of software that thinks it needs to own the resource to work on it, despite this just not being a thing in python.
None of this applies to context managers, so it's much harder to try to do this automatically, I suppose, so languages just don't
What quicknir originally asked for is basically https://en.cppreference.com/w/cpp/experimental/scope_exit/release
Yes, that was the inspiration
(Andrei Alexandrescu called it "dismiss" in one of his famous talks)
the moment you separate the concerns of managing the resource and using the resource, the correct code is idiomatic.
I've seen implementations that call it disarm as well
that's actually pretty funny
I showed you a place in CPython's codebase that uses ExitStack.pop_all in the way that I suggested, and in that case there is no way to have the caller be responsible for initialization and finalization of the resource.
I do still think it's possible to design a GC language, that uses context managers, and solves b) out of the box
I don't write every single API that I use, and I can't write APIs that know exactly what resources they will need ahead of time. Yes, of course, best design is one where all IO and resource management happens high up in the call chain and is passed down (making it effectively a borrow), but this is not universally applicable.
GC does not mean you can't have ownership, destructors and all related semantics, yeah
it just means you get to not use it for memory management
you can have destructors in theory
in practice, eh
i mean nobody seems to want to do it. It's probably a huge guarantee to make, with big time ramifications for GC performance
hmm, yeah, IG an object going away on you because the owner died is a bit annoying
(note, when I say "destructors" I mean they must run deterministically, as opposed to "finalizers" or whatever you want to call them, a la Java and python)
you can't statically eliminate reference leaks ala rust
it just gets very messy also, for this to be useful you want guarantees even about destructor ordering
```cpp
Foo f;
bar b;
```
like if these are local variables, C++ will guarantee that b gets destroyed first
and so will equivalent Rust
etc etc
there's just currently a lack of practical examples, I think, of languages that go all out on GC but try to have deterministic "destruction".
yeah, upon further thought, I can see why
I strongly disagree with this being the only solution here, but this is involving testing a module and behavior that isn't threadsafe being done by the test, not that the standardlib itself needs it.
Swift is the closest thing, perhaps, but it's not what most people would call "real" GC
my idea was just doing rust, but memory is GCd, all other resources work the same, but I am not sure how much better that is than actual rust.
it rapidly becomes messy because all these things are mixed in together
Like, you have a list/vec, which is "just memory" but the objects in it may have "other resources"
so you still have to traverse everything basically
I strongly disagree with this being the only solution here
There may be other solutions, but the thing that you've said above is the only acceptable solution - making the caller be responsible for finalization - is not an option here.
not to say anything is impossible here, it's just non-trivial, and there's a lack of real world experience in this space
I also disagree with that here, it is possible to make the testing framework (The caller) responsible.
🙄
that's what this code is doing, right?
I would argue that the caller is being made to do the finalization, just with the unittest API, not with the contextmanager API.
you're saying this proves that the caller isn't able to be responsible, but that isn't the case here. It's just that this was written in a way that it was done in this particular way, not that a testing framework could not setup and teardown resources without this.
making the caller be responsible for finalization - is not an option here.
This is just false
It's entirely an option here
so - if you had been the reviewer on the PR that introduced this test, you would have demanded CPython switch to a different test framework that provides a different way of managing resources rather than accept this PR?
I'd have written the test differently. This is possible with the current testing framework.
ok, maybe I'm not following, then: how would you have done it?
They made a mixin that uses the exit stack, and that mixin gets used in a bunch of subclasses that are used to test behavior of classes that are being subclassed but change the behavior of those classes.
Personally, I think this defeats part of the purpose of even testing this as they are changing the behavior from what users have happen for this, but even ignoring that, it's possible to do this without the exit stack by just structuring this slightly differently.
In this particular case, I'd have not written this test this way for reasons that go beyond just the exit stack use, but that they are testing behavior in a way that doesn't match real world use.
I don't think I follow, but 🤷♂️
If you think this could be cleaner, you should submit a PR
When I said I wouldn't allow such code to be checked in, I was strictly referring to code where I'd be the one with my name signing off on it being okay. That might sound callous, but I'm not going to rewrite someone else's test without there being a provable issue caused by it even as I see a better way. If there was such a use in code I relied on, I'd be more open to considering the time justifying the changes, but restructuring a bunch of code that solely exists to test things to adhere to better practices just to make a point is not high on my todo list.
To reframe that in a slightly more optimistic light: I'd rather spend energy on discussing better design with those considering new uses that could potentially run into issues, and on fixing real issues with code, than tear up existing tests that work, but that I fundamentally disagree with their structure over the potential for an issue that isn't arising with the specific code in question. If the test breaks in the future because of this and an interaction with other changes, that would be an appropriate time to spend energy on cleaning it up.
What’s the use case where this matters?
well if you want to have automated clean-up without using context managers
C++ and Rust don't have context managers because they generally don't need them
I meant, the order of cleanup/etc
you could have types that require other types to be initialized, and have them as part of their state
and then you'd want to make sure the second type is cleaned up first, while the first type exists
it's not a very practical example per se but a simple example is a mutex, and a mutex lock guard
Ok, yah, I guess, gotcha
it's not super common that it matters, fwiw
but realistically once you're making destruction deterministic anyway, guaranteeing the order for locals doesn't really matter
```cpp
A a;
{
    B b;
}
```
there's tons of code where it deeply matters that b is destroyed before a (in C++ and Rust)
so if you are already doing that, then handling order within the same scope isn't really much different
(for non-locals it is pretty different, and C++ and Rust diverge a lot more when it comes to destruction order of members and things like that)
I generally avoided relying on such destructor ordering in my c++ days, but I had an intense distrust of destructors anyway.
Well, I guess that last example less so, but the earlier examples.
it's not a bad policy, fwiw.
Regarding cpython github issue, is 5 days with no response common? How can (I get help to) add the label topic-multiprocessing to https://github.com/python/cpython/issues/105829 ?
it's not super rare to get no responses
added the label
Thanks.
is there any good up-to-date python implementation without gil?
looks like this: https://github.com/colesbury/nogil is only 3.9, but i want 3.11
what does "good" mean to you? What trade-offs are you willing to accept to get rid of the GIL?
it should be not very buggy, it would be enough
it's OK if it's slower, and some C extensions don't work?
well, i have a pure python code that has the potential to greatly benefit from multithreading
i do not use any C-exts (outside of stdlib)
so i dont really care about single-thread performance (for all other tasks i will use regular CPython)
!pep 703 how is this pep doing?
i noticed that latest nogil-3.12 crashes randomly when i use multiprocessing.pool.ThreadPool(1) with more than 1 thread
😔
should report a bug to Sam
ok, i will minimize my code and create an issue 👍
```py
import multiprocessing.pool
import itertools

objs = [[0] for _ in range(1000)]

def func():
    for p in objs:
        p[0] += 1

for _ in map(print, itertools.count()):  # fancy way to print numbers :)
    with multiprocessing.pool.ThreadPool(2) as pool:
        pool.apply_async(func)
        pool.apply_async(func)
```
this is what i got
why is it weird? it presumably takes an int and not a float
did you try running it in a debugger to get the stacktrace for the crash? (no idea how to do that in windows, but it should help a lot in diagnosing the bug if you can do that)
no, i dont know how to do that :(
i think i can build the same interpreter in "debug" mode and see some logs in stderr
I want to implement a LRU cache with an OrderedDict. Should I move_to_end() all my items and use popitem(last=False) to pop the first item, or should I move_to_end(last=False) and then popitem() for the best performance?
ie. should I place the most recently used item first or last to achieve the best performance with OrderedDict?
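For reference, a sketch of the "most recently used goes last" variant. `LRUCache` and its method names are made up here, and nothing in this block measures which orientation is faster; the mirrored version would use `move_to_end(key, last=False)` and a plain `popitem()`.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache: the newest entry sits at the end of the OrderedDict."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict the least recently used

    def keys(self):
        return list(self._data)


cache = LRUCache(maxsize=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used
cache.put("c", 3)  # evicts "b"

print(cache.keys())
```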
benchmark it!
have you considered deque?
When I compile the following AST
```
Expression(body=BinOp(left=Constant(value=38534), op=Sub(), right=Constant(value=38533)))
```
I get the following bytecode
```
  1           0 LOAD_CONST               0 (1)
              2 RETURN_VALUE
```
even when I set different optimization levels it keeps resolving the constant
how can I force it to do a pure compilation with no optimization or inlining at all?
I don't think you can, unless you reach into the C API. The peepholer is always invoked.
what if I change one of the constants to a different type and then change the constant in the bytecode?
are there some hacky ways I can do it?
I guess that works
yes, you can construct new code like this: code = code.replace(co_consts=tuple('do something with x' for x in code.co_consts))
you can put anything in co_consts, not only int/float/str values
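A hedged sketch of that `replace` trick, applied to a function's code object instead of a compiled expression so it does not depend on how a given version folds constants. `answer` and `greet` are made-up names, and this leans on CPython implementation details that can change between versions.

```python
def greet():
    return "hello from a constant"


def answer():
    return "placeholder"


# Swap the string constant for a function object, which the compiler
# would never put into co_consts on its own.
old_code = answer.__code__
new_consts = tuple(greet if c == "placeholder" else c for c in old_code.co_consts)
answer.__code__ = old_code.replace(co_consts=new_consts)

result = answer()
print(result is greet, result())
```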
You could wrap the constants in a proxy class that just calls the dunders on the wrapped object
Can you give an example please?
that's for loading names, not constants
Why does the not operator have lower precedence than ==?
This makes the following a syntax error:
assert True == not False
In that case, you'd better use !=. Do you have a different example where it's more readable than the alternative?
I know, but this is from here: https://github.com/nim-lang/Nim/wiki/Nim-for-Python-Programmers#boolean-conditionals
== and in have the same precedence. If not bound tighter than them, then
not foo in bar
would not be the same as
foo not in bar
While the latter is preferred, the former is still occasionally used and doesn't look too bad.
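To see how the parser actually groups these, one option is to compare AST dumps. The shape helper below is just for this demo:

```python
import ast


def shape(src):
    """Dump the AST of a single expression so two parses can be compared."""
    return ast.dump(ast.parse(src, mode="eval").body)


# `not` binds looser than comparisons, so this parses as not (foo in bar) ...
same_grouping = shape("not foo in bar") == shape("not (foo in bar)")
# ... which is a different tree from the dedicated `not in` operator,
# even though both evaluate to the same truth value.
different_tree = shape("not foo in bar") != shape("foo not in bar")

try:
    compile("True == not False", "<demo>", "eval")
    parsed = True
except SyntaxError:
    parsed = False

print(same_grouping, different_tree, parsed)
```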
I'm writing a program that emits python bytecode. What qualifies an object for inclusion in co_consts?
those are immutable values found in the code by the compiler.
you understand the byte code can change a lot from version to version?
cool. what is your bytecode generated from?
It's hacky at the moment, will hopefully be less hacky in the future. We do a translation from bytecode to an SSA graph (lossy, doesn't support all constructs), back to bytecode.
We functionalize what we can, and error if we can't.
idk what SSA is. What is the big picture of what the program does? What's it for?
that sounds like an interesting project!
Static Single Assignment, an intermediate representation for compilers
to answer your original question, CPython itself puts immutable constants (e.g. ints, strings, tuples containing other constants, code objects) in co_consts
And also I think imported modules?
if you generate your own bytecode, you can put whatever you want
no
I once wrote a project that translated references to functions into constants. For example, if you use os.path.join, at runtime that does LOAD_GLOBAL os, LOAD_ATTR path, LOAD_ATTR join. Instead I put a reference to the os.path.join function in co_consts and loaded it directly. Worked fine.
import are just assignments to whatever scope they are in
import foo is the same* as foo = import_it("foo")
Ah. Yeah. makes sense.
(where import_it is something I made up)
could have just said __import__ 🙂
and it would be mostly true
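A quick illustration of the "import is just an assignment" point, using __import__ directly. This is only a sketch; the real import statement does a bit more (for example handling from-imports and relative imports):

```python
import sys

# `import math` binds the name `math` in the current scope ...
math = __import__("math")
same_module = math is sys.modules["math"]

# ... and, like `import os.path`, a dotted name binds the top-level package.
os = __import__("os.path")
top_level_name = os.__name__

print(same_module, top_level_name)
```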
I'm experienced with compilers, new to python internals.
i forget the details, and they weren't relevant here 😄
Is it bad if the stuff I put in co_consts is unmarshallable?
why would you do that?
(and what type is it?)
It's __builtins__
ok, why?
It's relatively specific to what I'm doing. I want to hoist all the global state that I can to be a function argument, or a constant if it could never be assigned. People never assign to builtins, but I would have to model them separately if I can't throw them into co_consts.
that means that you can't put it in pyc files, which in turn means you'll have to redo whatever you do at runtime every time. not sure how bad that would be for you
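A small sketch of that limitation, assuming the hoisted object is the builtins module (the function f and the "placeholder" string are made up). The code object still runs, but marshal refuses to serialize it:

```python
import builtins
import marshal


def f():
    return "placeholder"


code = f.__code__
# Replace the string constant with the builtins module object.
hoisted = code.replace(
    co_consts=tuple(builtins if c == "placeholder" else c for c in code.co_consts)
)

marshal.dumps(code)              # ordinary constants marshal fine
try:
    marshal.dumps(hoisted)
    marshal_ok = True
except ValueError:               # marshal raises this for unsupported objects
    marshal_ok = False

print(marshal_ok)
```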
We do our own caching of trace inputs, so potentially not catastrophic, but definitely not ideal I think.
People never assign to builtins,
You mean specifically for the thing you're working on, or in general? Because in general, they absolutely do. You can see code every day in #python-discussion where someone uses list as a variable name
anyone wanna critique my email for me?
maybe
wrong channel
ask in #1035199133436354600
Error tells you what's wrong
Close the process that is editing the file you are trying to open
hey folks
its been a while i used python and pip to install modules
forgot lol
i ran pip install requests
but when importing im getting requests is not accessed
wrong channel
ask in #python-discussion or #1035199133436354600
hi im a new programmer. this is what i learnt today!!! print("hello world")
!e
class Hmm:
    def __init__(self, x):
        self.x = x
    def __hash__(self):
        print(f'hashing hmm({self.x})')
        return -2
    def __eq__(self, other):
        out = self.x == other.x
        print(f'eq {self.x} vs {other.x}: {out}')
        return out
hn1 = Hmm(3)
hn2 = Hmm(4)
d = {hn1: 10, hn2: 20}
@neat delta :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | hashing hmm(3)
002 | hashing hmm(4)
003 | eq 3 vs 4: False
004 | eq 3 vs 4: False
005 | eq 3 vs 4: False
006 | eq 3 vs 4: False
007 | eq 3 vs 4: False
008 | eq 3 vs 4: False
009 | eq 3 vs 4: False
010 | eq 3 vs 4: False
011 | eq 3 vs 4: False
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/awatiyofuj.txt?noredirect
What the heck is going on here? I know there's funny stuff around __hash__ coercing a return value of -1 to -2, but why does this check equality 13 times when other return values (aside from -1, obviously) just do the expected single equality check?
even more interesting is if you use 3 items in the dict
hashing hmm(3)
hashing hmm(4)
3 vs 4 13 times
hashing hmm(5)
3 vs 5 13 times
4 vs 5
why does it compare 4 vs 5 at the end only once??
hn1 = Hmm(3)
hn2 = Hmm(4)
hn3 = Hmm(5)
d = {hn1: 0, hn2: 0, hn3: 0}
```btw
the single comparison at the end is expected, but the bizarre feature is that regardless of the size of the dict, it compares the first element - hn1 in this case - to all other elements 13 times. every other comparison is only once
oh and with a non-negative constant hash, it doesn't do all these comparisons
if you make it a constant 0, it does this
hashing hmm(3)
hashing hmm(4)
eq 3 vs 4: False
hashing hmm(5)
eq 3 vs 5: False
eq 4 vs 5: False
what you would expect
the only triggers are when __hash__ returns either -1 or -2, and python forces a hash of -1 to be -2 anyways (-1 is a common error code for C programs)
-3 is weird too
hashing hmm(3)
hashing hmm(4)
eq 3 vs 4: False
hashing hmm(5)
eq 3 vs 5: False
eq 4 vs 5: False
eq 3 vs 5: False
eq 4 vs 5: False
eq 3 vs 5: False
eq 4 vs 5: False
eq 3 vs 5: False
eq 4 vs 5: False
eq 3 vs 5: False
eq 4 vs 5: False
eq 3 vs 5: False
eq 4 vs 5: False
eq 3 vs 5: False
even weirder actually
see the mixture of 3v5 and 4v5?
-4 does the same as -2 and -1 it seems
i'm resisting the urge of summoning either godlygeek or jelle
-5 does the same as -3
it seems odd numbers do the weird alternating thing
and -1 / evens do the 13
yep like -7, -9, -11 all did the alternating
no idea 🙂
but but but
you're (one of) the chosen one!
we're doomed
it's the number of times you need to right shift -2 by five for its last few bits to change
Objects/dictobject.c line 821
perturb >>= PERTURB_SHIFT;```
Objects/dictobject.c lines 163 to 164
The first half of collision resolution is to visit table indices via this```
so it keeps probing the same element until then https://github.com/python/cpython/blob/925cb85e9e28d69be53db669527c0a1292f0fbfb/Objects/dictobject.c#L803
Objects/dictobject.c line 803
int cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);```
cool thing! thanks for mentioning it
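To make the perturb explanation concrete, here is a rough Python model of the probe sequence. It assumes a 64-bit build (hash cast to an unsigned 64-bit value), a table of 8 slots, and the recurrence from the dictobject.c lines quoted above. The function name probe_slots is made up.

```python
PERTURB_SHIFT = 5
MASK64 = (1 << 64) - 1   # assumption: 64-bit build, hash reinterpreted as size_t


def probe_slots(hash_value, table_size=8, count=14):
    """Return the first `count` slot indices probed for `hash_value`."""
    mask = table_size - 1
    perturb = hash_value & MASK64      # -2 becomes 0xFFFF...FFFE
    i = perturb & mask
    slots = [i]
    while len(slots) < count:
        perturb >>= PERTURB_SHIFT
        i = (i * 5 + perturb + 1) & mask
        slots.append(i)
    return slots


print(probe_slots(-2))      # slot 6 thirteen times, then slot 7
print(probe_slots(-3))      # alternates between slots 5 and 1
print(probe_slots(0)[:3])   # [0, 1, 6]
```

With -2 the low bits of perturb stay all ones until it has been shifted 13 times, so the same slot (and the same key) gets revisited; with -3 the index bounces between two slots, which would match the alternating 3v5 / 4v5 output.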
>>> f = set.intersection
>>> f({1},{1,2})
{1}
>>> f({1},{1,2},{1,2,3})
{1}
>>> f({1})
{1}
>>> f()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unbound method set.intersection() needs an argument
is it possible to add support for calling set.intersection without args?
typically you are supposed to call it from an existing set, not as if it were a class method / static method
besides, what would it even return? a set containing everything in the universe? an empty set?
the former is not possible, and the latter would not make sense
empty set seems reasonable
set.intersection() -> set()
frozenset.intersection() -> frozenset()
would you expect for this to always hold true?
intersection(a, b, c) = intersection(intersection(a, b), intersection(a, c))
!e ```py
from fishhook import *
@hook(set | frozenset)
def intersection(*args):
    if not args:
        return 'hooked'
    else:
        return orig(*args)
f = set.intersection
print(f({1},{1,2}))
print(f({1},{1,2},{1,2,3}))
print(f({1}))
print(f())```
@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | {1}
002 | {1}
003 | {1}
004 | hooked
@hook(set | frozenset) <- thats cool!
yea i liked the syntax
but as of rn fishhook breaks on 3.12 and i dont have time to get it working for a while
you have 3 months until 3.12 release 😄
eh we'll see if i have time in the next 3 months
the identity element for set intersection is the set containing all elements
so no
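A quick check of the identity argument, plus one possible workaround when the universe is known and finite. The intersect_all helper and its universe parameter are hypothetical:

```python
from functools import reduce

a, b = {1, 2}, {2, 3}

# An identity element e must satisfy e & s == s for every s; set() fails that.
empty_is_identity = (set() & a == a)


def intersect_all(sets, universe):
    """Hypothetical helper: intersection with an explicit, finite universe."""
    return reduce(set.intersection, sets, set(universe))


print(empty_is_identity)                   # False
print(intersect_all([a, b], range(10)))    # {2}
print(intersect_all([], range(3)))         # {0, 1, 2}
```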
Just needs a set subclass that claims to contain everything :D
pop() should return a random object stored in memory :)
pop() would indeed be a problem, you'd have to remember the object because pop() should never again return it.
Maybe it can only be a frozenuniverse
well, it's pretty fair to consider a "complement set" which has everything except the stored set
then pop just needs to, uh, magically conjure up an item and add it to the complemented set
Since pop() of a set doesn't guarantee any order, we could even just do
def pop(self):
    o = object()
    self.excluded.add(o)
    return o
fair enough
Hi
return ascending integers 
you're right though that "magically conjuring" does in fact just mean creating an object
Wonder if python 3.14 will support getting the last number of pi..
i fuck you in info go ctf root me
English please
Why does the path not show up in echo %path% but is in path in the environment variable?
wrong channel, ask in #1035199133436354600, please delete the message
btw what do yall think would have happened if python was never made? would there be a similar language to replace it, or wouldn't there be any language like it?
Python became popular in response to computation becoming cheaper and developer labor becoming more expensive. Those factors existed whether Python existed or not. If python hadn't existed, I think we still would have seen a dynamically typed language rise in popularity around that same time. But I don't think we'd have ended up in a place where one language was widely used for both scientific computing and for general purpose development.
Without the ecosystem that's grown up around it, it makes little sense that python would be the language of data science and ML.
thank you
what about lua btw (if you dont know about it then its ok)
>>> str(math.pi)[-1]
'3'
easy
Basically, Brazil had a policy of strong trade barriers from 1977 until 1992 for computer hardware and software, which pushed Tecgraf to implement basic tooling from scratch
They had two languages: SOL (simple object language) and DEL (data entry language)
Lua was made because of factors that made Tcl, Lisp, Scheme, and Python unsuitable for Tecgraf's needs:
- Tcl: unfamiliar syntax, bad support for data description, Unix-only
- Lisp, Scheme: unfriendly syntax
- Python: still in its infancy
Another reason was that SOL and DEL didn't have control flow structures, and Petrobras (which used these Tecgraf tools) felt a growing need to add such structures
What is lua even used for
Scripting engine in video games
I've written a little for World of Warcraft macro, but it's used for add-ons too
Was there ever a try to make the datetime module and object more performant in the stdlib? (Speed and memory usage)
(In addition to some weird behaviour, like not being able to nicely check if a datetime object is UTC or some other timezone, but that is a side point)
Been ambushed on two projects now where datetime objects are the bottleneck (again speed and memory usage wise). Which I found odd for such a universal thing.
oh
if brazil didn't have that policy, then there wouldn't be lua, right
I don't remember datetime perf being raised. We'd be happy to take patches if there are obvious wins
Just wanted to see if it ever came up or there already was a try.
Sadly my C is not that good, but if I find some obvious things, I will throw it in here first probably.
I am currently still in the stage of determining the actual performance of it though.
Just profiling your applications where datetime performance was your bottleneck with something like the --native mode of py-spy and sharing the profiles would be very helpful. What time zone library are you using? Are you parsing datetimes from strings?
import time
pq = []
def create_task(task):
    pq.append(task)
def event_loop():
    while pq:
        task = pq.pop(0)
        task()
def async_wait(timer: float):
    if pq:
        other_task = pq.pop(0)
        ct = time.time()
        other_task()
        elapsed = time.time() - ct
        if elapsed < timer:
            async_wait(timer - elapsed)
def my_async_function():
    print("How are you")
    async_wait(5)
    print("Function completed")
def another_async_function():
    print("This ran before the first functions end")
    time.sleep(6)
def third_async_function():
    print("This runs in the last despite the wait because of time.sleep()")
create_task(my_async_function)
create_task(another_async_function)
create_task(third_async_function)
event_loop()
wanted to know how async is implemented.
would this be a naive implementation of it? or is it far from how its usually done 
i think for awaited functions i can just make a flag to know if the caller which used async_wait is being awaited or not
based on that async_wait will either act as blocking function (since its being awaited) or will run another task until timer hits 0
but i think the way the timer is handled here is too simple: if a task that takes more than 5 seconds is run while the caller function is in async_wait, it will penalize the program by running the other task for longer than the caller's wait time
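For comparison, a very small sketch of the more usual approach: tasks are generators that hand control back to the loop by yielding, and the loop keeps a heap of wake-up times. This is not how asyncio is implemented in detail; the names sleep, run, first and second are made up for the example.

```python
import heapq
import time


def sleep(delay):
    # Yielding a number asks the loop to resume this task `delay` seconds later.
    yield delay


def run(*tasks):
    now = time.monotonic()
    # Heap entries are (wake_time, tie_breaker, generator).
    heap = [(now, i, task) for i, task in enumerate(tasks)]
    heapq.heapify(heap)
    counter = len(heap)
    while heap:
        wake, _, task = heapq.heappop(heap)
        time.sleep(max(0.0, wake - time.monotonic()))
        try:
            delay = next(task)            # run the task until its next yield
        except StopIteration:
            continue                      # task finished
        heapq.heappush(heap, (time.monotonic() + delay, counter, task))
        counter += 1


log = []


def first():
    log.append("first: start")
    yield from sleep(0.05)
    log.append("first: done")


def second():
    log.append("second: start")
    yield from sleep(0.01)
    log.append("second: done")


run(first(), second())
print(log)
```

Because a task suspends itself at the yield instead of calling the next task from inside its own stack frame, a long-running neighbour cannot stretch the caller's wait beyond one scheduling step, and there is no recursion.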
Neovim 🥹
How to find who is the maintainer for concurrent.futures and/or multiprocessing.pool?
Can I assume that the same source code, when compiled by the same python executable, has the same bytecode?
Well, due to things like -O, no
If it also has the same arguments, maybe. I have a vague memory of some part of the compiler touching runtime stuff, but I do not remember the details
set iteration order is inconsistent, I think that's the major cause of non-reproducible pyc files
does that affect the result of compile() as well?
In a single process, probably not
I suppose not because that gives you a code object, not a marshalled pyc files, and the inconsistency happens during marshalling
No module named 'Tkinter'
why is that, i am trying to import from it but it keeps giving me an error
bless your soul ❤️
#1035199133436354600 ask for help here
Thanks for telling the correct channel
please explain
A target occurring in a del statement is also considered bound for this purpose (though the actual semantics are to unbind the name).
from Python execution model doc
Bound in this section essentially means "interacting with the name, not the value". Usually, when you write a variable name in Python, it resolves to some value (or raises a NameError). The exceptions are constructs that bind names: assignments, for loops, function/class definitions, etc. In that sense del also refers to the binding itself, it doesn't care about the value (which isn't even looked at¹).
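One visible consequence of del counting as a binding: a del anywhere in a function makes that name local to the whole function. A tiny sketch:

```python
x = 1


def f():
    print(x)   # `x` is a local name here because of the `del` below
    del x


try:
    f()
    error = None
except UnboundLocalError as exc:
    error = exc

print(type(error).__name__)
```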
Thanks man, nicely explained
please explain bind names as well?
!names
A name is a piece of text that is bound to an object. They are a reference to an object. Examples are function names, class names, module names, variables, etc.
Note: Names cannot reference other names, and assignment never creates a copy.
x = 1 # x is bound to 1
y = x # y is bound to VALUE of x
x = 2 # x is bound to 2
print(x, y) # 2 1
When doing y = x, the name y is being bound to the value of x which is 1. Neither x nor y are the 'real' name. The object 1 simply has multiple names. They are the exact same object.
>>> x = 1
x ━━ 1
>>> y = x
x ━━ 1
y ━━━┛
>>> x = 2
x ━━ 2
y ━━ 1
Names are created in multiple ways
You might think that the only way to bind a name to an object is by using assignment, but that isn't the case. All of the following work exactly the same as assignment:
• import statements
• class and def
• for loop headers
• as keyword when used with except, import, and with
• formal parameters in function headers
There is also del which has the purpose of unbinding a name.
More info
• Please watch Ned Batchelder's talk on names in python for a detailed explanation with examples
• Official documentation
Why does __init_subclass__ not trigger ABC TypeErrors due to missing abstract method implementations? Is this a bug? (This is related to my current thread https://discord.com/channels/267624335836053506/1126969464706060440, where you can find an example)
Those only trigger upon instantiation not definition, this is a common misconception of ABCs.
Yes, but in this case I instantiate the subclass within the __init_subclass__
Which for some reason doesn't trigger the expected TypeError
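A minimal reproduction of what is being described. My understanding (an assumption, not something confirmed in this thread) is that ABCMeta only records the abstract methods after type.__new__ has returned, and __init_subclass__ runs inside type.__new__, so the check is not armed yet at that point. Class names here are made up.

```python
from abc import ABC, abstractmethod

created = []


class Base(ABC):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        created.append(cls())     # no TypeError here, despite the abstract method

    @abstractmethod
    def run(self): ...


class Child(Base):
    pass


try:
    Child()                       # once the class exists, the usual check applies
    late_error = False
except TypeError:
    late_error = True

print(len(created), late_error)
```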
Hi, should python -m ast wait on stdin and hang there? If it is waiting, it should process something after each line like cat and grep do, or else respond with a help message like grep
▶ grep
Usage: grep [OPTION]... PATTERNS [FILE]...
Try 'grep --help' for more information.
▶ ./python -m ast
Namespace(infile=<_io.BufferedReader name='<stdin>'>, mode='exec', no_type_comments=True, include_attributes=False, indent=3)
<_io.BufferedReader name='<stdin>'>
is this an intended behavior?
someone on Discourse responded that grep and cat and other tools work like this only, but they process the stdin after every enter (signal)
▶ grep "fike"
sadk
asd
sad
sads
fike
**fike**
as
dsa
dasd
Not every line is valid code on its own. ast needs context to parse code correctly, so it reads all of stdin and parses it into a Module AST node
then it should respond with a help message for
python3 -m ast
Why?
what is it waiting for? is it ever going to process anything?
print("hi")
Module(
   body=[
      Expr(
         value=Call(
            func=Name(id='print', ctx=Load()),
            args=[
               Constant(value='hi')],
            keywords=[]))],
   type_ignores=[])
-m ast -h clearly says that the default file is stdin. It treats stdin as a regular file, so it reads it until EOF and then parses it into a Module
(sent Ctrl+D after print("hi"))
Or Ctrl+Z on windows
ohh got it
with args.infile as infile:
    source = infile.read()
source will get it and later get processed, Thanks
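Roughly what happens after EOF, sketched with a string standing in for stdin. The indent argument to ast.dump needs Python 3.9+, and the exact dump text varies a little between versions:

```python
import ast

source = 'print("hi")\n'           # stands in for everything read from stdin
tree = ast.parse(source, mode="exec")
dumped = ast.dump(tree, indent=3)
print(dumped)
```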
!d dict
```py
class dict(**kwargs)
class dict(mapping, **kwargs)
class dict(iterable, **kwargs)
```
Return a new dictionary initialized from an optional positional argument and a possibly empty set of keyword arguments.
Dictionaries can be created by several means:
• Use a comma-separated list of `key: value` pairs within braces: `{'jack': 4098, 'sjoerd': 4127}` or `{4098: 'jack', 4127: 'sjoerd'}`
• Use a dict comprehension: `{}`, `{x: x ** 2 for x in range(10)}`
• Use the type constructor: `dict()`, `dict([('foo', 100), ('bar', 200)])`, `dict(foo=100, bar=200)`
Why is {} called a comprehension? It is a display with zero key-value pairs, not a comprehension
agree that's weird
I'd rephrase to:
• Use a comma-separated list of key: value pairs within braces: {'jack': 4098, 'sjoerd': 4127} or {4098: 'jack', 4127: 'sjoerd'} or {}
• Use a dict comprehension: {x: x ** 2 for x in range(10)}
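The parser agrees with the display/comprehension distinction, which can be checked with a small helper (node_type is just a name for this demo):

```python
import ast


def node_type(src):
    """Name of the AST node class an expression parses to."""
    return type(ast.parse(src, mode="eval").body).__name__


print(node_type("{}"))                               # Dict
print(node_type("{'jack': 4098}"))                   # Dict
print(node_type("{x: x ** 2 for x in range(10)}"))   # DictComp
```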
Why is yield from an expression, not a statement? When would it ever have a non-None value?
!e e.g.
def foo():
    value = yield from range(2)
    print("yield expression:", value)
for num in foo():
    print(num)
@quick snow :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 0
002 | 1
003 | yield expression: None
weird
!e
def foo():
    yield 1
    yield 2
    return 3
def bar():
    x = yield from foo()
    print("yield result:", x)
print(list(bar()))
@grave jolt :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | yield result: 3
002 | [1, 2]
it'll be non-None when the iterable being yielded from also returns a value (since iterables can do that, they just very often don't)
Yeah, StopIteration can contain a result. That's how coroutine results worked in pre-async/await coroutines in asyncio
!d StopIteration
exception StopIteration```
Raised by built-in function [`next()`](https://docs.python.org/3/library/functions.html#next "next") and an [iterator](https://docs.python.org/3/glossary.html#term-iterator)'s [`__next__()`](https://docs.python.org/3/library/stdtypes.html#iterator.__next__ "iterator.__next__") method to signal that there are no further items produced by the iterator.
The exception object has a single attribute `value`, which is given as an argument when constructing the exception, and defaults to [`None`](https://docs.python.org/3/library/constants.html#None "None").
When a [generator](https://docs.python.org/3/glossary.html#term-generator) or [coroutine](https://docs.python.org/3/glossary.html#term-coroutine) function returns, a new [`StopIteration`](https://docs.python.org/3/library/exceptions.html#StopIteration "StopIteration") instance is raised, and the value returned by the function is used as the `value` parameter to the constructor of the exception.
If a generator code directly or indirectly raises [`StopIteration`](https://docs.python.org/3/library/exceptions.html#StopIteration "StopIteration"), it is converted into a [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError "RuntimeError") (retaining the [`StopIteration`](https://docs.python.org/3/library/exceptions.html#StopIteration "StopIteration") as the new exception’s cause).
Changed in version 3.3: Added `value` attribute and the ability for generator functions to use it to return a value.
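The same return value can be picked up by hand, without yield from, by catching the StopIteration yourself. A minimal sketch:

```python
def foo():
    yield 1
    return 3


gen = foo()
first = next(gen)          # runs up to the yield
try:
    next(gen)              # runs to the return, which raises StopIteration
except StopIteration as stop:
    returned = stop.value  # the generator's return value rides on the exception

print(first, returned)
```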
yield from is relatively complex when used in the intended use case instead of just a shortcut for yielding everything
its how they still secretly work underneath too, right?
Well, coroutine objects are now not generators
but they do something similar, I think
yeah they're kinda generators but not really, not sure on the actual differences underneath
They still have send and stuff but are separate types
i think they are generators, but they work "orthogonally"
regular generators use one api, async functions use another api
so you can mix them without problems
PEP492 does say that "Coroutines are based on generators internally", and the __await__ dunder must return an iterable, so i think async is still using generators underneath to work - they've just gotten better at hiding all of that stuff :P
Python Enhancement Proposals (PEPs)
Vaguely, generator and a coroutine and an async generator are all a subclass of some low-level, python inaccessible-ish generator "type" (actually a C macro that expands into a list of fields.)
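A small sketch of how the generator machinery still shows through: an awaitable whose __await__ is a generator, driven by hand with send() instead of an event loop. The Ready class and the "suspended" marker are made up for illustration.

```python
class Ready:
    """Hypothetical awaitable that suspends once, then produces a value."""

    def __init__(self, value):
        self.value = value

    def __await__(self):
        yield "suspended"      # a bare yield here suspends all the way to the driver
        return self.value


async def main():
    value = await Ready(42)
    return value + 1


coro = main()
first_yield = coro.send(None)  # run until the awaitable suspends
try:
    coro.send(None)            # resume; the coroutine finishes
except StopIteration as stop:
    final = stop.value         # coroutine results also travel via StopIteration

print(first_yield, final)
```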
Hi, why do we need attribute for a function, we can do it by using decorator right or we can also use metaclass if we have that kind of use-case
is it because functions are first-class objects that why python provides the ability to add attributes to a function object
Regular attribute dot-notation is used to get and set such attributes. Note that the current implementation only supports function attributes on user-defined functions. Function attributes on built-in functions may be supported in the future
can someone share a use-case which requires Function attributes on built-in functions may be supported in the future
I think it's less "why do we need" and more "why should we forbid"
most objects allow you to add arbitrary attributes. built-in functions are the weird ones for not letting you do that.
okay so it is there just to confirm objects allow you to add arbitrary attributes
why someone will use this?
I'm using it occasionally, for example when writing a decorator: I can collect the decorated functions in an attribute of the decorator.
It's just sometimes a convenient place to store stuff.
okay
Aren't builtin functions a different object under the hood? Probably for optimization purposes I imagine
Pretty sure they do this in functools module for singledispatch
>>> type(print)
<class 'builtin_function_or_method'>
>>> type(lambda:...)
<class 'function'>
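A sketch of the decorator use case mentioned above, next to what happens with a built-in. The register decorator and its seen attribute are made up for the example:

```python
def register(func):
    """Hypothetical decorator that collects decorated functions on itself."""
    register.seen.append(func)
    return func


register.seen = []        # plain attribute on a user-defined function


@register
def hello():
    return "hi"


try:
    print.note = "nope"   # built-in functions don't accept new attributes
    builtin_ok = True
except AttributeError:
    builtin_ok = False

print(register.seen, builtin_ok)
```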
dis.Bytecode implements a kwarg adaptive which states that "If adaptive is True, dis() will display specialized bytecode that may be different from the original bytecode"
What is 'specialized bytecode' in this context? 
!pep 659

fwiw for these instructions to show up the code has to first be executed
!pep 639
!e
def f(x):
    return x + 1
import dis; dis.dis(f, adaptive=True)
@quick snow :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 1 0 RESUME 0
002 |
003 | 2 2 LOAD_FAST 0 (x)
004 | 4 LOAD_CONST 1 (1)
005 | 6 BINARY_OP 0 (+)
006 | 10 RETURN_VALUE
!e
def f(x):
    return x + 1
for _ in range(10000): f(23)
import dis; dis.dis(f, adaptive=True)
@quick snow :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 1 0 RESUME_QUICK 0
002 |
003 | 2 2 LOAD_FAST__LOAD_CONST 0 (x)
004 | 4 LOAD_CONST 1 (1)
005 | 6 BINARY_OP_ADD_INT 0 (+)
006 | 10 RETURN_VALUE
!e
code
Wow. Facebook commits 3 engineer-years to remove GIL from Python ⚡
Python is infamous for the GIL, which prevents its threads from running in parallel across multiple cores, but PEP 703 has proposed removing it, and to expedite its development and performance optimization Facebook has committed 3 engineer-years worth of effort to implementing nogil.
After this change, threads in Python will be able to run in parallel across multiple CPU cores. Hoping PEP 703 gets accepted and we get rid of GIL forever 🤞
I really admire Facebook for its share contribution to open-source; some of the biggest and most popular projects that we use on a day-to-day basis came from Facebook - rocksDB, PyTorch, React, ReactNative, zstd, audiocraft, to name a few ⚡
Proposal - PEP-703: https://lnkd.in/gsUNGyu5
Comment by Facebook where they commit: https://lnkd.in/gkGmKUQ3
Open-source projects by Facebook: opensource.fb.com/projects
https://peps.python.org/pep-0703/
Pep for making GIL optional. now that is exciting.
Hopefully it will be accepted in this or next year
We’ve had a chance to discuss this internally with the right people. Our team believes in the value that nogil will provide, and we are committed to working collaboratively to improve Python for everyone. If PEP 703 is accepted, Meta can commit to support in the form of three engineer-years (from engineers experienced working in CPython interna...
Where is python with jit atm ?
I have been meaning to get back into python
There's no real JIT in CPython at the moment, but something not entirely unrelated has landed in 3.11: The specializing adaptive interpreter of PEP 659, which replaces generic opcodes with more specialized variants once it has gone down that route often enough.
Yeah there were plans for it I remember
There was a git repo describing it (can't find it)
Has there been any development to it in the past few months ?
the specialising interpreter is sort of the very first step to a JIT, but actually JITing has afaik been deferred until the non-JIT specialisation is done.
Is there any GitHub repo describing this ?
I don't fully understand what non jit specialisation means off the top of my head
https://peps.python.org/pep-0659/ describes it quite well.
Thanks I'll take a look
Timestamps
00:00 - Introduction
24:30 - Brandt Bucher, Specializing Adaptive Interpreter
50:40 - Mark Shannon, Other Speedups
1:07:42 - Irit Katriel, Exception Improvements and Features
1:42:13 - Pablo Galindo, Better Tracebacks
1:58:46 - Pablo Galindo, tomllib
2:08:07 - Łukasz Langa, Typing Improvement and Features
2:36:02 - 3.11 is released
2:...
Here they talk about the jit as well
The timestamp isn't totally accurate
!rule 6 @plain rover This is not an ad board.
3 FT engineers sponsored by meta would be huge
Only 3ft? I hope they have appropriate chairs and desks for such people
part of their downsizing efforts includes smaller chairs and mini screens
Still normal sized desks tho? 😂
You can put a screen under a desk I guess.
mounted upside-down 
Little did we know Zuck = Willy Wonka, and wanted to add a link, since the original post didn't link to what I think is the most current thread and statement on it: https://discuss.python.org/t/pep-703-making-the-global-interpreter-lock-optional-3-12-updates/26503/20
Reposting the reply I put on the steering council pep 703 decision issue here per Guido’s suggestion just so it’s all in one place: “”"The steering council is going to take its time on this. A huge thank you for working to keep it up to date! We’re not ready to simply pronounce on 703 as it has a HUGE blast radius. Software isn’t ready for the...
Why does anyone bother about backwards compatibility in this case?
Lets make 3.13 (or whatever version when nogil happens) completely without GIL (with no option to run with gil), so something is guaranteed to break (it is unavoidable) and make 3.12 support time longer.
Everyone will be able to stay on 3.12 for several years and slowly port code to 3.13.
I think releasing two different interpreters (gil and nogil) of same version will make a lot more confusion
I think i just dont see some obvious reason to not do that
guido said on lex it was to give c extension producers time to have no gil versions of their libraries ready several years in advance
there are 3 posts with hundreds of perspectives and takes about this subject, it is absolutely more nuanced than this
because a python with GIL is important to the PSF
Hm?
who has said that?
and why would that be important long term?
why not just release python 4
Why not just abandon python and switch to c++?
coming in python 5
When python 6
I dont quite understand why 2->3 transition was so painful.
One reason i can think of: a lot of library maintainers were slow in porting (or they weren't porting at all), so for a long time there were no python3 versions of libraries, and a lot of developers weren't able to install dependencies they usually used. But in this case you can switch to similar library, right?
it'd be pretty rare to have a similar library that won't get affected by the same problems as all the other libraries
most built-in functions and methods that once returned lists (e.g. dict.keys()/.values()/.items(), map(), filter(), zip()) return views or iterators as of 3.0
map(), range(), input(), among other things, also changed behaviour other than the list-iterator problem
list.sort()/sorted() once had a cmp parameter according to 3.0 What's New
the built-in functions cmp(), xrange(), raw_input(), etc. got removed
- The repr() of a long integer doesn’t include the trailing L anymore, so code that unconditionally strips that character will chop off the last digit instead. (Use str() instead.)
0nnn for octals changed to 0onnn
among many other things
so basically it affected everything
not to mention that there was a point during which range was functionally deleted from the language, with everyone using xrange instead
this is in the 3.0 What's New but i don't think it has been removed yet
- String literals no longer support a leading u or U.
maybe they removed it but added it back
sets module too
raise Exception, *a -> raise Exception(*a)
except Exception, e: -> except Exception as e:
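A few of the changes listed above, as they look from the Python 3 side. This is only a sketch; by_length is a made-up old-style comparison function:

```python
from functools import cmp_to_key

d = {"a": 1, "b": 2}
keys = d.keys()                             # a view in Python 3, not a list
doubled = map(lambda n: n * 2, [1, 2, 3])   # a lazy iterator, not a list


# sorted() lost its cmp= parameter; an old comparison function gets wrapped.
def by_length(x, y):
    return len(x) - len(y)


ordered = sorted(["ccc", "a", "bb"], key=cmp_to_key(by_length))
octal = 0o17                                # 017 is a syntax error in Python 3

doubled_list = list(doubled)
print(type(keys).__name__, doubled_list, ordered, octal)
```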
the biggest reasons: the value proposition was incredibly low, and the cost of moving was incredibly high. For years after Python 3 was created, it wasn't better than Python 2, just different (and arguably worse, due to less library availability)
It was also very hard for a long time to write code that worked in both 2 and 3. Eventually, some features were added to Python 2.7, and other features were added to Python 3.2 and 3.3, in order to make it easier to write code that worked with both the latest version of Python 2 and Python 3. Things like being able to use u"" strings in both languages. At first, Python 3 removed the u"" string prefix, which made it very, very difficult to write code that behaved the same in both languages, and necessitated the invention of nasty hacks like https://six.readthedocs.io/#six.u
scrolling through the docs of https://six.readthedocs.io/ might give you an idea of how tough it was to support both 2 and 3 in the same code base. Everything in six exists to help people write polyglot Python 2 + Python 3 code. And if library maintainers couldn't support both in the same code base, the expectation is that they'd instead need to maintain two different libraries, one in 2 (where almost all of their users are), and a separate and incompatible one in 3
It's also really tough to convince millions of people that they've been using the wrong string type for years and they should rewrite everything.
If supporting both is painful, isnt it reasonable to support only one version?
Drop support for python2 (libraries for it already exists, they work and dont need much fixes, you should leave them as is) and then work only on python3 libraries (port them from python2 and continue your development)
If supporting both is painful, isnt it reasonable to support only one version?
Yes, totally reasonable. Lots of people did exactly that.
Drop support for python2
They didn't do that, though. Why would they? All of their users were using Python 2, all of their bug reports and feature requests were for Python 2, all of their companies were using Python 2. Why would they build an extra Python 3 version?
Python 2 and Python 3 are of course very similar, but they're very different too. Almost every existing program needed changes in order to be ported to Python 3. Asking everyone to rewrite their libraries in a new language so that their existing users could port their code to that new language is a huge request. Especially when, by and large, the users are happy with the old language.
You don't think the 3 to 4 jump, if it ever happens will be that severe?
as the core development team we really don't want another similar jump
I know on Lex Fridman, Guido said no-GIL would be a valid 3 to 4 jump, but you (personally) don't think there would also be some backwards-incompatible changes to APIs and syntax?
Some people might want the opportunity to make other backwards-incompatible changes, but the general view is that the 2-to-3 transition was poorly handled and we shouldn't repeat that mistake
The initial expectation I believe was that people would just run 2to3 once to migrate their code from Python 2 to 3. That didn't work because library authors generally wanted to keep developing on both 2 and 3 for some time, so people had to invent tools like six for the migration to become workable.
Right, yeah most large 3rd party packages are still keeping python 3.7 around and as such can't even use stuff introduced later
I'm still trying to get my head around where PEP 684/554 fits in the context of the no-gil/703 conversation... it seems that 684/554 gets us to the holy land, "simultaneous use of multiple CPU cores", without a gil-ectomy? Or, am I mis-interpreting it (no pun intended)
subinterpreters let you use multiple cores, but it's very hard to share data between subinterpreters at the moment; Python objects need to live in exactly one subinterpreter
Thx, now I get it... I was missing the point / the 554 examples & rationale now make sense. ** I'm increasingly working with pyarrow/arrow tables, but I guess I'd share them no differently than usual (across processes, altho without the multiprocessing.shared_memory option).
Doesn't that defeat the purpose of arrow though, not sharing memory and being zero copy?
How do you achieve zero copy without sharing memory?
I typed that confusingly, I mean if you cannot share the memory and therefore gain the zero copy benefit, doesn't that defeat the purpose of arrow?
Gotcha, yah, agree.
Arrow seems amazing and I'm very excited to see it become more integrated into pandas and such
does PEP mention anything regarding inline functions? for example:
def is_even(n): return n % 2 == 0
PEP 8 says there should be a line break after the :
Although true one line functions like that may be better written as lambdas. Especially if it doesn't need to be named
lambda n: n % 2 == 0
are you sure about that? I get all kind of PEP8 warnings on Pycharm, but not on this function.
I also had a look here: https://peps.python.org/pep-0008/, and I couldnt find that
true, but wouldn't it be nice if Python would automatically return the last expression of a function?
That would drastically change the semantics of existing programs, increasing memory usage and breaking all kinds of stuff, I wager
Well, it says "write docstrings for all public modules, functions, classes, and methods", which is enough to forbid that style in most cases
But you're right, it doesn't seem to explicitly forbid single line functions. Hm.
I personally hate them cuz its not very readable, I would even argue that a lambda is much more readable because of the removed boilerplate (no function name, and no return statement)
but I was just wondering if they were explicitly forbidden or not...
There are some languages that do this, like ruby, but I'm personally not a fan of it
things shouldn't depend on the GC for correctness, but plenty of things do, and beginning to return stuff from functions that previously returned None would keep references alive longer, and most likely change the behavior of existing programs
FWIW, you can share memory between subinterpreters without using "shared memory" in the sense of the IPC primitive. The subinterpreters are running in the same address space, so while they can't share Python objects, they could share in-memory buffers (assuming some suitable synchronization primitive exists to provide thread safety)
okay consider using operator + functools
yeah probably not a good idea now that I think about it
Today I encountered a question on StackOverflow that wanted to get back what concrete parameter Generic was given.
As in they have a class MyGeneric(typing.Generic[R]) and when they instantiate it like a = MyGeneric[SomeClass]() they want to later access SomeClass.
They originally came from C# and gave example how it would be in C#.
At some point I loved analysing objects in interactive sessions, but I haven't done it in a long time - but I still gave it a go.
What i ended up with (in spoiler in case anyone wants to consider this a challenge):
```py
a = MyGeneric[int]()
print(a.__orig_class__.__args__[0])
```
It's hella ugly, but I don't believe there's some more direct approach, is there?
Isnt this documented?
Both .__orig_class__ and .__args__
Turns out .__orig_class__ is not documented
typing.get_args(tp)
Get type arguments with all substitutions performed: for a typing object of the form `X[Y, Z, ...]` return `(Y, Z, ...)`.
If `X` is a union or [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal "typing.Literal") contained in another generic type, the order of `(Y, Z, ...)` may be different from the order of the original arguments `[Y, Z, ...]` due to type caching. Return `()` for unsupported objects.
Examples:
```py
assert get_args(int) == ()
assert get_args(Dict[int, str]) == (int, str)
assert get_args(Union[int, str]) == (int, str)
```
New in version 3.8.
^
So that's only the .__args__ part. But what about getting it from subclass's object? That was the case
class MyGeneric(Generic[R]) :...
a = MyGeneric[SomeClass]()
And get that SomeClass from a.
Not get R from MyGeneric (easy), not get R from a (also quite easy)
This works only for aliases themselves, not for objects constructed from aliases
Oh, i am a bit laggy, i was replying to this
Ah, for that you'll have to implement a special __class_getitem__ or a metaclass
I would just pass the type as an argument to __init__
That's why i said I inspected stuff and found how to get it from existing stuff, like I said :3
That would be quite a bit of redundancy. MyGeneric[SomeClass](SomeClass) doesn't look good
That would be inheriting from SomeClass rather than being some kind of container of SomeClass
Wait, no, you mean in making object
This is a call, yes
I thought it was a class definition line, that's why I said inheritance
class Foo(Generic[T]):
    def __init__(self, t: type[T]): ...
This also requires the caller to pass a type
Whereas with normal generics the type variable is sometimes inferred
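A minimal sketch of the "pass the type to `__init__`" idea, assuming Python 3.9+ for `type[T]`. The attribute name `item_type` is made up for illustration.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Foo(Generic[T]):
    def __init__(self, item_type: type[T]) -> None:
        # Keep the runtime class around; a type checker infers T from it.
        self.item_type = item_type


foo = Foo(int)           # a static type checker infers Foo[int] here
FooInt = Foo[int]        # aliasing still works...
also_foo = FooInt(int)   # ...but the type has to be repeated at the call
```

The last line is the redundancy being discussed: the alias already says `int`, yet the constructor needs it again.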
but then you couldnt alias like FooInt = Foo[int]
i dont know whether thats useful
oh wait you still could i think
I think it's a bad idea to call a type alias
A type alias is for use in annotations and other aliases
But then you'd need to still do a = FooInt(int) and repeat yourself for int to be visible inside with @grave jolt 's idea and without the magic (.__orig_class__) I found
The issue with __orig_class__ is that we set it wherever possible, but a lot of the time it's not possible:
>>> list[int]()
[]
>>> _.__orig_class__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute '__orig_class__'
Do we want to document something that's so inconsistently available? I'm not sure...
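To make the inconsistency concrete, here is a sketch that reads `__orig_class__` defensively. The helper `runtime_type_args` is hypothetical, and since the attribute is undocumented, none of this should be treated as a guaranteed API.

```python
from typing import Generic, TypeVar, get_args

R = TypeVar("R")


class MyGeneric(Generic[R]):
    pass


def runtime_type_args(obj):
    """Hypothetical helper: type arguments recorded on the instance, or ()."""
    # __orig_class__ is undocumented; treat it as best-effort only.
    return get_args(getattr(obj, "__orig_class__", None))


from_alias = runtime_type_args(MyGeneric[int]())  # subscripted Generic subclass
from_plain = runtime_type_args(MyGeneric())       # no subscript, nothing recorded
from_builtin = runtime_type_args(list[int]())     # built-in generics don't set it
```

As far as I can tell the attribute is only assigned after `__init__` has returned, so it also can't be relied on inside the constructor.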
Why does it even exist?
i didn't add it, idk
I guess some introspection capability is better than no introspection capability? 🤷♂️
not blaming anyone, just asking 😛
ik ik
I suppose the historical reason is that typing.Generic came first, so when we added it we added these introspection capabilities (but didn't document them). Later PEP 585 came along and added built-in generics, but we didn't want to add a bunch of random attributes to builtin classes.
@steel solstice proposed to add a slot to all Python classes everywhere so that we could always set __orig_class__: https://discuss.python.org/t/adding-an-orig-class-member-to-the-pyobject-struct/25520
But that would be quite a big change!
That's gonna be a long time coming as well
I would like to add the slot to anything using the PEP 695 classes and also add something for defining 695-ness in c
it'd be opt in for the non c based classes because you can always do __class_getitem__ = classmethod(GenericAlias) but yeah it would be more memory for dict, list etc
I think that's going to be a nonstarter
introspection 
it's important that dict/list use as little memory as possible
do you think there'd be any kind of way I could convince people this would be a good idea?
if you can implement it in a way that doesn't require adding another pointer to lots of important objects
I can think of a way to do it, it's just horrific and involves subclassing every single one of those classes
that way though it would be truly opt in
i noticed that pep693 (py3.12 release schedule) is not updated. Is it ok?
Isn't func.__globals__ useless when function uses no global or builtin names?
I was thinking about lambdas, they always have reference to global scope, so if you created lambda in one module, passed it to another module, and "unimported" first module, it will not be gc'ed because there is still reference from lambda
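A small sketch of what that looks like in practice. Rebuilding the function with `types.FunctionType` is just one way to drop the reference, and it only works here because the lambda uses no global names.

```python
import types

is_even = lambda n: n % 2 == 0

# Every function object keeps a reference to the globals of its defining module.
shares_module_globals = is_even.__globals__ is globals()

# A function whose code uses no global names can be rebuilt around a
# different globals dict, which drops the reference to the original module.
detached = types.FunctionType(is_even.__code__, {}, "is_even")
```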
I've noticed that when developing a project with a pyproject.toml file, using pip install . builds the project in the same directory. If I just want to install the requirements, this seems like overkill, but I imagine there's some rationale for doing this.
Is there some way just to install the base requirements from pyproject.toml without installing the rest of the project? (please ping on reply, thx)
usually I would strongly suggest using pip install -e ., specially if you plan to test your code, but maybe try --dry-run then copy-paste the last line of the output into pip install?
not sure if that'll avoid building though
right, I saw -e recommended as an option, but my understanding was this has some side effects?
as far as I know it pretty much just adds the package you're working on rn to your venv
gotcha
@uncut sage The usual reason I have seen is to test the package in the same environment (installed) that downstream users will have it.
that makes sense.
When you're in the same directory as your package, imports work a bit differently and you can just open("db.sqlite").
Then you publish your package and users can't import things and their database is being created in their CWD and Python can't find it because it's looking in site-packages....
So they recommend a src/ dir to encapsulate your code, to keep anything from happening that you don't expect.
But now you can't run your code because you're not in the same folder anymore.
So you install your package.
But now every time you make a change you have to install it again.
That's where --editable comes in. Instead of building your package and copying the files over, it points site-packages at your local directory (typically through a .pth file or an import hook rather than a literal symlink), so it can see when you change your files without any extra effort.
Together, a src/ directory and --editable keep you safe from most of the local editing inconsistencies, and make for a much more consistent experience between dev and prod overall.
yes! I intuited some of this from reading the docs about -e and conducting my own (admittedly limited) experiments
does anyone know a good way to do derivatives, also anyone know a good way to solve my existential crisis?
Wrong channel
#data-science-and-ml
hey i had this question im not really sure if its relevant for this channel
i got a curious question regarding windows maybe some of you would know the answer
so im working on this ingame overlay app (working on the framework for it so i can build more similar apps in the future with this)
but i noticed that when i put the game in fullscreen that no matter what i cannot make my app appear on top anymore
so i was wondering if this is some windows limitation or if this could be done by certain windows dlls?
if its not relevant here then feel free to point it out ❤️ and ill move it to the appropriate channel
I think full screen gets exclusively assigned to the primary monitor, but there are apps out there like overwolf that can assign an overlay into the game/app once it's full screen, so it's definitely possible
thats what i am tryna copy basically
overwolf is what inspired me tbh
so im working on a similar webapp framework with python backend
but i cant seem to find info on this
are deprecated classes and variables removed from the __all__ of stdlib modules?
there are two kinds of deprecation: soft and hard: https://peps.python.org/pep-0387/ it doesn't mention __all__. I would guess they are not removed, since that would break code.
(hard deprecations are removed eventually. soft deprecation just means, "you should look for an alternative".)
"The deprecated functionality may eventually be removed from the typing module. Removal will occur no sooner than Python 3.9’s end of life, scheduled for October 2025."
Yeah but why remove the stuff?
to reduce the support burden? It says "may eventually"
I guess it doesn't make sense to "actively deprecate" this stuff until 3.8 is EOL and considered "too old"
we're pretty unlikely to remove them
i assumed theyd just become aliases at some point
Unfortunately that would be a breaking change 🙃
>>> from typing import List
>>> List["int"]
typing.List[ForwardRef('int')]
>>> list["int"]
list['int']
not when i fix it to work 😉
all this really illustrates is that they have different reprs, and surely changing the repr isn't a backwards incompatible change. I'm not doubting that it's backwards incompatible, but what supported behavior changes?
they dont have just different reprs, ForwardRef isnt included in the args for list["int"]
its just a plain ol string
>>> from typing import get_args, List
>>> type(get_args(List["int"])[0])
<class 'typing.ForwardRef'>
>>> type(get_args(list["int"])[0])
<class 'str'>
what difference does that make to a type checker? Won't it just wind up resolving both "int" and typing.ForwardRef("int") to int?
No difference to a type checker
well, no difference to a static type checker
But it may make a difference to tools that do runtime introspection of type hints
wouldn't those things also be expected to resolve forward references?
(maybe I'm thinking about this wrong; I've never really done any runtime introspection of annotations)
What they're expected to do and what they actually do are two different things :p
I'm not sure about the exact way that things might break tbh, and maybe they wouldn't -- maybe I'm overegging it a bit. But I certainly don't see any real motivation to change things now so that they're "just" aliases. The maintenance burden of these objects is extremely low; the implementation is very simple in typing.py.
dw im willing to make the burden typing.Text levels by making it far far greater for people working in c :P
annotations of "int" and ForwardRef("int") seem semantically equivalent, so it seems like it'd be a weakness in some tool if it treats them differently - but I definitely buy that the risk of breaking anything might outweigh the cost of maintaining the existing stuff
some discussion of this here: https://discuss.python.org/t/concern-about-pep-585-removals/15901 see serhiy’s comment for some concrete intermediate milestones.
personally i’d vote no removal for at least the next decade, and aliasing to builtins no sooner than 2026
PEP 585 currently mentions the following: The deprecated functionality will be removed from the typing module in the first Python version released 5 years after the release of Python 3.9.0. where “deprecated functionality” refers to typing.List, typing.Dict, etc. Removal will break pretty much all typed code and all typing documentation / r...
My personal position remains that we should aim to remove them at some point. I think it's really unfortunate that we have a parallel hierarchy of builtin types in typing.py -- it's confusing to newcomers, who don't understand which they're meant to be using in new code, and contributes to a general problem where typing has an unnecessary huge/sprawling public API that's overwhelming for users. (FWIW, I also understand why they were originally introduced, and would probably have made the same decision at that point in time.)
But I'm amazed that I said in this thread (over a year ago) that we should aim for removal in 3.16. I definitely favour a much longer deprecation period now. A decade or so sounds pretty good to me; I agree with what you said in https://discuss.python.org/t/concern-about-pep-585-removals/15901/6 that the collections/collections.abc deprecations are a good precedent here.
Typing has A LOT of confusing things like this, especially generic alias stuff, it's difficult to know when you can directly instantiate or not
That too. It's pretty confusing that you can subclass both List[int] and list[int], but you can only instantiate the second one
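A quick sketch of that asymmetry, assuming Python 3.9+ so that `list[int]` is subscriptable. The class names are made up.

```python
from typing import List


class OldStyle(List[int]):   # subclassing the typing alias works
    pass


class NewStyle(list[int]):   # so does subclassing the built-in generic
    pass


instance = list[int]()       # a plain, empty list

try:
    List[int]()
except TypeError as exc:
    typing_error = exc       # the typing alias refuses to be instantiated
else:
    typing_error = None
```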
What tools were used for fuzzing the Python frontend?
Writing python bytecode transforms, and I want to make sure I can handle all valid python.
I think hypothesmith is often used for that kind of thing: https://github.com/Zac-HD/hypothesmith
please can i get a core dev to review https://github.com/python/cpython/pull/103232?
I think this addresses all the issues I have with the current message. Thanks to Eryk for the pointer as to where I should be editing.
Issue: gh-96663
Thank you very much, that was exactly what I was looking for.
Next question: What are the other lists of variables (other than locals) that constitute frame->localsplus? In what order? Is there documentation on this that I can be pointed to?
Heya, this channel is for the discussion of python's internals, or other overall ideas around the language itself, this kinda question might be better off in one of our off-topic channels
Why there are no do..while or repeat..until statements in python?
Isn't repeat...until just a while loop?
repeat..until is just a do..while loop, but with inverted condition (it loops until condition is true)
do..while loops while condition is true (=until condition is false)
do..while is not a while loop, it unconditionally executes one iteration, and only then checks condition
Ah right
you can find do..while loop in C++, and repeat..until loop in pascal
I mean you can manually do these
you can manually perform for loop using while loop too 😄
it is all just a syntactic sugar around goto
I'd think the main reason is that they don't really offer a significant enough benefit to justify reserving multiple keywords for them.
Since you can just do it with a while loop. Really though Python is a more high-level language, you usually use a for loop for iteration not a while loop. If you do need that kind of loop, often you might want to just encapsulate that in a generator instead, hide away the details.
weirdly enough, i was actually thinking of this yesterday (because someone asked about it in pygen), and I nearly asked about it here
I think it's probably "there should be one obvious way to do it" in action, since do..while is in pretty much every way equivalent to
```py
while True:
    ...  # do stuff
    if not cond:
        break
```
Almost equivalent, except
do:
    ...
while cond
else:
    ...
would behave differently
yeaaa but how often are you actually using the else clause of a while loop
Grepping roughly through all work repos I have checked out: four times :D
(not counting for-else)
i've never used while else in like...5 years of python
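As a concrete version of the `while True` / `break` emulation, here is a sketch where the body has to run at least once. The Collatz example and the name `collatz_steps` are just for illustration.

```python
def collatz_steps(n):
    """Count Collatz steps, always taking at least one step (do..while style)."""
    steps = 0
    while True:
        # Loop body: runs before the condition is ever looked at.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
        # In do..while terms the condition is `n != 1`; leave once it fails.
        if n == 1:
            break
    return steps
```

With a plain `while n != 1:` loop, `collatz_steps(1)` would return 0. The do..while shape walks 1 → 4 → 2 → 1 and returns 3.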
Include/cpython/code.h lines 155 to 156
```c
PyObject *co_localsplusnames;  /* tuple mapping offsets to names */ \
PyObject *co_localspluskinds;  /* Bytes mapping to local kinds (one byte \
```
why? 👀
it's an implementation detail. You can get at it if you try hard enough, but it's not guaranteed to exist, nor is it guaranteed to have the same structure across Python versions. Any attempt to use that is gonna mean that you need entirely different code for different versions of CPython. That can even change from one patch version of CPython to the next. Actually, maybe not - I'd need to think more about whether there's any guarantee of ABI stability for code objects within a minor version...
but in any event, there certainly isn't from one minor version to the next
for instance, Python 3.10 doesn't have co_localsplusnames and co_localspluskinds at all (though it does have co_localsplus)
It's an implementation detail, but I'm writing a cpython bytecode to bytecode translation layer, so I want to know.
Bytecode too is version specific, and I already have to deal with that.
What I want to do is static analysis to figure out the storage location being referred to by LOAD_DEREF.
And to do that, I'm going to need to know the name of the variable that I'm looking for.
For LOAD_FAST and LOAD_GLOBAL, it's easy. For LOAD_CONST, it's irrelevant.
LOAD_DEREF is dynamic in the storage location that it refers to, and it could be a freevar or cellvar, which is relevant, but the name of the variable is known at compile time, and I need to know.
if you have access to the code object, can't you get at the names of the locals from the public properties of the Python code object?
Yep.
>>> from dis import dis
>>> def f():
...     x = 1
...     return lambda: x
>>> c = f()
>>> dis(c)
0 COPY_FREE_VARS 1
3 2 RESUME 0
4 LOAD_DEREF 0 (x)
6 RETURN_VALUE
>>> c.__code__
<code object <lambda> at 0x0000026E18C095F0, file "<stdin>", line 3>
>>> c.__code__.co_cellvars
()
>>> c.__code__.co_freevars
('x',)
>>> f.__code__.co_cellvars
('x',)
>>> f.__code__.co_freevars
()
Ah, so you're saying it's just
itertools.chain(code.co_locals, code.co_cellvars, code.co_freevars)
I think it's just an offset into co_varnames
Oh
It... is...
In memory at least.
I don't know if there's a bounds check stopping me.
I'll check.
I thought co_varnames was just the locals though, and not localsplus.
actually, no - I think it's an index into co_fastlocalnames
no, that doesn't exist, so I must be misunderstanding the docs...
I'm back to thinking it's an index into
```py
c.__code__.co_varnames + c.__code__.co_cellvars + c.__code__.co_freevars
```
no... it's into c.__code__.co_cellvars + c.__code__.co_freevars
final answer 😄
hey, free vars!
where can I get mine?
just send me your bank routing numbers to cover shipping, and I'll hook you up
I'm afraid that might breach your local laws 😔
then i've been doing it wrong?
i've been using varnames + cellvars + freevars for LOAD_CLOSURE, *_FAST, *_DEREF, and LOAD_CLASSDEREF
basically any opcode using fast locals/locals plus
!e
```py
from dis import dis
def f():
    x = 1
    y = 2
    def g():
        y = 3
        return x
    return g
dis(f)
```
@rose schooner :white_check_mark: Your 3.11 eval job has completed with return code 0.
              0 MAKE_CELL                2 (x)

  2           2 RESUME                   0

  3           4 LOAD_CONST               1 (1)
              6 STORE_DEREF              2 (x)

  4           8 LOAD_CONST               2 (2)
             10 STORE_FAST               0 (y)

  5          12 LOAD_CLOSURE             2 (x)
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/I7O5RB6HHU3CGNUB7QX5NCPD3E
function  co_varnames  co_cellvars  co_freevars
f()       ('y', 'g')   ('x',)       ()
g()       ('y',)       ()           ('x',)
all localsplus-using opcodes
in f():
  ('y', 'g') + ('x',) + () = ('y', 'g', 'x')
  MAKE_CELL    2  ('y', 'g', 'x')
                             ^^^
  STORE_DEREF  2  ('y', 'g', 'x')
                             ^^^
  STORE_FAST   0  ('y', 'g', 'x')
                   ^^^
  LOAD_CLOSURE 2  ('y', 'g', 'x')
                             ^^^
  STORE_FAST   1  ('y', 'g', 'x')
                        ^^^
  LOAD_FAST    1  ('y', 'g', 'x')
                        ^^^
in g():
  ('y',) + ('x',) = ('y', 'x')
  STORE_FAST   0  ('y', 'x')
                   ^^^
  LOAD_DEREF   1  ('y', 'x')
                        ^^^
unless we're talking about a different thing here
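The lookup table used above can be rebuilt from the public code-object attributes. This is a sketch under assumptions: it matches the 3.11 output shown here, the dedup rule for cells that are also locals is my reading of CPython and is not documented, and `localsplus_names` is a made-up helper name.

```python
def localsplus_names(code):
    """Best-effort guess at the names behind the fast-locals array of `code`."""
    names = list(code.co_varnames)
    # Assumption: a cell that is also a local (e.g. a captured argument)
    # reuses its co_varnames slot instead of getting a second entry.
    names += [name for name in code.co_cellvars if name not in names]
    names += code.co_freevars
    return tuple(names)


def f():
    x = 1
    y = 2

    def g():
        y = 3
        return x

    return g


f_names = localsplus_names(f.__code__)     # names for the outer function
g_names = localsplus_names(f().__code__)   # names for the closure
```

On versions before 3.11 the `*_DEREF` opcodes index into `co_cellvars + co_freevars` instead, so a translation layer would need to branch on the interpreter version.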
Can you give me any good resource where the process of string creation is explained? Like how it chooses whether to store it as ascii object or compact unicode object and such? And whether it sacrifices some memory in order to keep random access in constant time?
yeah I think you're right
If I were you I'd start reading the CPython code. There have also been a few PEPs about the string internal representation so those might help, though they might be outdated
Do you know which file i should read?
the comments in https://github.com/python/cpython/blob/main/Include/cpython/unicodeobject.h would be a good place to start
whether it sacrifices some memory in order to keep random access in constant time?
It does - all the characters in a given string object are always stored in the same number of bits, so if an ASCII string gets 1 emoji added to it, suddenly the string takes 4 bytes per character instead of 1, just to give that constant indexing
how it chooses whether to store it as ascii object or compact unicode object and such?
My understanding is that:
- if every codepoint in the string is <128, it uses ASCII (1 byte per character)
- if every codepoint is <256 it uses latin1 (also 1 byte per character)
- if every codepoint is <65536 it uses UCS2 (2 bytes per character)
- otherwise it uses UTF-32 (4 bytes per character)
And it does this by being optimistic and assuming that all characters are ASCII until it finds a non-ASCII character, and then it starts over?
Thanks!
unicodeobject.c
I believe there is no ASCII version at all: It's Latin-1, UCS-2, or UCS-4.
(at least what storage is concerned; there's just the one PyUnicode_1BYTE_KIND)
(Seems like I'm mistaken; PyASCIIObject is still a thing for some reason)
yeah - I'm not sure why it's still a thing. It doesn't seem to have any particular advantages over Latin-1...
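One way to observe the per-character width from Python, as a sketch that assumes CPython (`sys.getsizeof` is implementation-specific). The helper `bytes_per_char` is made up. It compares two strings of different lengths so the fixed object header cancels out.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate per-character storage by comparing two repeated strings."""
    # The fixed object header cancels out in the difference.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


widths = {
    "ascii": bytes_per_char("a"),             # code point < 128
    "latin1": bytes_per_char("\xe9"),         # code point < 256
    "bmp": bytes_per_char("\u20ac"),          # code point < 65536
    "astral": bytes_per_char("\U0001F600"),   # everything else
}
```

On CPython I would expect this to report 1, 1, 2 and 4 bytes respectively, which lines up with the list above.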
Why are these strings? Tuples or sets would be more intuitive, right? There is a script that finds out whether a pattern is being used in top libraries; can I get the link to it?
# Some strings for ctype-style character classification
whitespace = ' \t\n\r\v\f'
ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
ascii_letters = ascii_lowercase + ascii_uppercase
digits = '0123456789'
hexdigits = digits + 'abcdef' + 'ABCDEF'
octdigits = '01234567'
punctuation = r"""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
printable = digits + ascii_letters + punctuation + whitespace
there is new adding 7 years ago
punctuation = r"""!"#$%&'()*+,-./:;<=>?@[]^_`{|}~"""
people must be using it
Why should it be sets? So that you can do hexdigits - octdigits or something like that? Or just for some theoretical purity?
Sets aren't indexable, so you couldn't directly do random.choice(string.ascii_lowercase) anymore, which is a relatively common thing.
got you, but it can be tuple
i didn't think about random.choice(string.ascii_lowercase) while thinking about set
Okay, it could be a tuple, but why would it be?
it's containing things that's why i thought a container suits well, but it is not going to add anything i guess
but string is also a container
I am not able to express myself, like a more intuitive container for people coming from other language
String is also a container of strings
Searching for char in string is faster than searching for string in tuple (and they are both slower than searching in set)
String takes less memory than tuple of chars
I wouldn't be surprised if this isn't even true for some of the strings in string, because they're so small.
What if you actually need string? Then you have to convert it back to string: "".join(whitespace)
You would need exactly tuple in very rare cases. Usually you need it to be iterable or indexable, so string can do everything tuple can do
For some maybe, but not for all. printable is pretty big
!remind 1d test speed of membership tests for strings, tiny strings, tuples and sets
Sorry, you can only do that in #bot-commands!
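The comparison from that reminder could be sketched like this. The needle, the iteration count and the helper name `time_membership` are arbitrary choices, and the numbers will vary by machine and Python version, so no results are claimed here.

```python
import string
import timeit

as_str = string.printable
as_tuple = tuple(string.printable)
as_set = frozenset(string.printable)


def time_membership(container, needle="~", number=20_000):
    """Seconds taken by `number` membership tests against `container`."""
    return timeit.timeit(lambda: needle in container, number=number)


timings = {
    "str": time_membership(as_str),
    "tuple": time_membership(as_tuple),
    "set": time_membership(as_set),
}
```

Swapping `string.printable` for a short constant such as `string.octdigits` would cover the "tiny strings" case.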
what are the naming conventions for enums in Python?
the class-name is PascalCase, but what about the name of each attribute?
SCREAMING_SNAKE_CASE
I'm using snake_case
so there is no standard?
even in the Python docs I see some examples SCREAMING_SNAKE_CASE and others with snake_case
google Bard certainly agrees with you
I've seen both
I'd argue for SCREAMING_SNAKE_CASE, in part since they're essentially constants
thats a good point
New in version 3.4.
Source code: Lib/enum.py...
!pep 8
I did a ctr + f in pep8, and nothing
the enum docs use screaming snake case
pep8 says nothing about it
but here there is UPPER case:
https://docs.python.org/3/library/enum.html
and here its LOWER case:
https://peps.python.org/pep-0435/
actually, it seems that from version 3.6 they consistently started using upper case
that is a pretty old pep & doesn't really explicitly mention style
true
I'll definitely use upper case
thanks guys
I guess Bard was right after all 🙂
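For anyone skimming, the convention the thread settled on looks like this. It is a sketch of the style used in the current enum docs, not something PEP 8 mandates.

```python
from enum import Enum, auto


class Color(Enum):         # class name in PascalCase / CapWords
    RED = auto()           # members in upper case, like module-level constants
    DARK_GREEN = auto()


by_name = Color["DARK_GREEN"]   # lookup by member name
```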
Hi friends! Is there a good place to follow progress on the famous ~5x CPython speedup project? https://github.com/markshannon/faster-cpython/blob/master/plan.md
Things like:
- where are we currently
- what would be in the next release
- any perf projection adjustments
- any timeline adjustments
The last thing I heard (1 year ago) is that we should add +1 to all the minor versions in this document. Would be great to understand the current state of the world.
The context is that I need to convince my management to invest into massive update of python ecosystem and one of the carrots would be updating the interpreter version to realize the $$$ savings. So having some solid understanding of where are we going to be in 1-2 years would be invaluable to me.
Did you see Mark Shannon's talk from EuroPython last week? https://youtu.be/HHvm_TYhG14?t=5724 (I have nothing else to contribute to this discussion tho)
No I haven't! Thank you for the link.
enums are constants and should use all caps
is what i've been taught, havent reflected much on it
In Python, that is.
In Rust, enums are much more powerful in that they can contain structural variants alongside unit variants, so they use CapWords letter case
Even the Rust standard library has such enums. For example, this is how the two most frequently used enums are defined:
enum Option<T> {
Some(T),
None,
}
enum Result<T, E> {
Ok(T),
Err(E),
}
Similar stuff to Rust enums can be found in languages like Haskell or OCaml
I think it's kind of an abuse of the word enum lol
It's an algebraic data type/sum type
I've only seen them called enums in Rust
Haxe also calls them enums, and it is odd there as well.
when they're really "tagged unions"
anyone hav youtube channel related to website development ? please inbox me if have....
I didn't know capitalized snake case is called Screaming Snake Case and I think that's visceral and awesome
This_Is_Capitalized_Snake
THIS_IS_SCREAMING_SNAKE
Bwoah, so internally and peppy...
Mine is tucked up and ready for bed
aww
Does anyone know if slice is a singleton instance that's callable or is slice a class that you instantiate when you call it?
the latter
Include/sliceobject.h lines 22 to 25
```c
typedef struct {
    PyObject_HEAD
    PyObject *start, *stop, *step;      /* not NULL */
} PySliceObject;
```
If it was the former, what would it return when it's being called?
Great question. I ask because of this thread https://discuss.python.org/t/alternative-to-create-slice-and-tuple-of-slices-with-slice-class-getitem/30316/3
I like this idea, and it would in fact remove duplicate effort in at least two libraries. Numpy has numpy.s_ and Pandas has pandas.IndexSlice. https://numpy.org/doc/stable/reference/generated/numpy.s_.html https://pandas.pydata.org/docs/reference/api/pandas.IndexSlice.html The code for both is very simple, although it is so simple it might b...
I think the OPs use case is genuine but given the nature of slice I never see it happening
I could see something like slice.of[1,2:2]
It could totally be done, but it's unnecessary; you can just define that special object yourself
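For reference, a minimal sketch of "defining that special object yourself", in the spirit of numpy.s_. The names `_SliceMaker` and `s_` are made up for this example and are not part of the standard library:

```python
class _SliceMaker:
    """Hypothetical helper: indexing it returns the index expression unchanged."""

    def __getitem__(self, item):
        # The interpreter has already turned `a:b:c` into a slice object
        # (and `x, a:b` into a tuple containing one), so just hand it back.
        return item


s_ = _SliceMaker()

print(s_[1:10:2])                  # slice(1, 10, 2)
print(s_[1, 2:2])                  # (1, slice(2, 2, None))
print(type(slice(1, 2)) is slice)  # True: slice is a class you instantiate
```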
Looks neat, unfortunately these kind of names are rare or nonexistent in the standard library.
Yeah, I think it's fine to just have this replicated once per library.
Yeah, the code I linked is ridiculously simple, it's not a huge burden
is there a particular reason float.inf doesn't exist as an alternative to float('inf')? I think the former would be a lot more elegant.
that explains why there isn't a float.nan (because there are several nans), but I can't think of a need for float.nan anyway
NaN propagates around nicely to indicate a failure in computation
And the fact that all ops with NaN produce NaN makes it great for missing data, because the vectorized math functions still work
yes, but I rarely find myself needing to instantiate a NaN. I just need to check if something I didn't directly create is a NaN or not.
well float('nan') exists
but... we already have exceptions for that 😦
and it actually shows you where the screwup occurs instead of just appearing
```pycon
>>> sum([1e400, -1e400])
nan
```
(inf - inf, just to show that exceptions do not cover every single case in which NaN might spawn)
Well, this sounds like a good candidate for OverflowError or whatever
true lol
In scientific computing, if you're doing the element wise division of two arrays, and one of the elements in the denominator is a zero, you don't want to have to handle an exception for that.
doesn't that indicate a problem with the data?
not necessarily, but in either case, there are ways of dealing with nans after the fact (like replacing them with numbers in some way, or omitting them from future computations)
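A small stdlib-only sketch of the points above: the existing spellings of inf and NaN, NaN propagating through arithmetic, and dropping NaNs after the fact. The sample values are made up for illustration:

```python
import math

# math.inf / math.nan are the existing attribute-style spellings.
print(float("inf") == math.inf)   # True
print(math.inf - math.inf)        # nan
print(math.nan == math.nan)       # False, so use math.isnan() to test

values = [1.0, math.nan, 3.0]     # made-up data with one missing entry
doubled = [v * 2 for v in values] # NaN propagates instead of raising
cleaned = [v for v in doubled if not math.isnan(v)]

print(doubled)                    # [2.0, nan, 6.0]
print(cleaned)                    # [2.0, 6.0]
```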
Hello @unkempt rock this server does not allow for recruitment
Is this the best channel to ask about the nitty gritty of "faster cpython" optimizations in 3.11? Please let me know if there is a better channel for this question. I was referred from #esoteric-python message
I'm curious how the "Load attribute" optimization works, and when it works.
Similar to loading global variables. The attribute’s index inside the class/object’s namespace is cached. In most cases, attribute loading will require zero namespace lookups.
https://docs.python.org/3/whatsnew/3.11.html#whatsnew311-faster-cpython
If I have a class Foo: pass and I make two instances a = Foo() ; a.bar = 1 ; a.foo = 1 ; b = Foo() ; b.foo = 1 ; b.biff = 1
Does this mean a_or_b.foo is optimized? And how? Because a and b have different dictionary layouts.
No, the only thing that's optimized is several accesses to the same attribute (of the same object) when no attribute has been created/deleted in the meantime, and only under certain conditions (no custom __getattr__, for instance).
oh interesting, it sounds like you're describing something very different.
Like literally_anything_with_foo.foo ; literally_anything_with_foo.foo will do a full, expensive dictionary lookup for the first line, but then reuse that value for the second line? But the first line will always do the lookup since it's invoked with very different kinds of objects?
clarifying: that example is wrapped in a function, def access_foo_twice(literally_anything_with_foo): and then called multiple times, passed very different objects that all have a foo attribute
When you say "of the same object" do you mean same identity, or different identities but same type (instances of same class)?
same class I believe. There is an optimization that makes instances of the same class share the keys for their instance dict (since usually it's the same)
that optimization gets disabled if you set different attributes on instances of the same class
then the LOAD_ATTR optimization basically notices if your code always accesses an attribute on the same class, and if so it saves the offset into the shared dict keys
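If you want to see whether a given attribute load got specialized, something like this rough sketch should work. It assumes CPython 3.11+ for the `adaptive=True` flag of the `dis` module; the helper name `load_attr_opnames` and the warm-up count are just choices for this example, and the opcode names you get depend on the interpreter version:

```python
import dis
import sys


class Color:
    def __init__(self, r, g, b):
        self.r, self.g, self.b = r, g, b


def red_of(obj):
    return obj.r


# Warm the function up with instances of a single class so the
# interpreter has a chance to specialize the attribute load.
for _ in range(1000):
    red_of(Color(1, 2, 3))


def load_attr_opnames(func):
    """Hypothetical helper: names of the LOAD_ATTR* instructions in func."""
    if sys.version_info >= (3, 11):
        instructions = dis.get_instructions(func, adaptive=True)
    else:
        # Older versions have no specializing interpreter and no such flag.
        instructions = dis.get_instructions(func)
    return [i.opname for i in instructions if i.opname.startswith("LOAD_ATTR")]


opnames = load_attr_opnames(red_of)
print(opnames)
```

On a specializing build you would expect a specialized variant of LOAD_ATTR to show up instead of the generic one, but treat the exact name as an implementation detail.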
Ok interesting. So if I have a hot code path that's always passed instances of Color with r,g,b attributes, then I get the optimization
But if I occasionally pass instances of subclass ColorWithAlpha that has an additional alpha attribute, then the optimization doesn't happen, because LOAD_ATTR sees a different shared dict for Color than for ColorWithAlpha
I suppose __slots__ don't help
Every slotted attribute is looked up on the class, not the instance dictionary, but the class will also be different every time: alternating between Color and ColorWithAlpha
So depending on the situation, if you really need to wring performance out of a hot code path, composition instead of inheritance might help
yes, a lot of the Python 3.11 optimizations depend on types being consistent across executions of a function
caveat that this is based on my general understanding of this area, I haven't worked on the code and I don't understand it deeply
Thanks, it's helpful. Ultimately I'm benchmarking everything I do, so that'll be the real proof.
With these dictionary-based optimizations, supposing I have a code path which always accepts Color -- never ColorWithAlpha -- it sounds like the optimized __dict__ stuff might be faster than __slots__?
With __slots__
it has to check for absence of __dict__ first
then gets the slot descriptor off the class (dict lookup will be cached, but still requires getting the descriptor)
and then accesses that memory offset on the instance
With optimized __dict__
it gets the attribute's offset from the __dict__ (which will be cached)
and then accesses that index in the values array
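The Python-visible side of that difference can be poked at like this (a sketch only; it shows where the attribute lives, not the interpreter's fast paths, and the class names are made up):

```python
class SlottedColor:
    __slots__ = ("r", "g", "b")

    def __init__(self, r, g, b):
        self.r, self.g, self.b = r, g, b


class DictColor:
    def __init__(self, r, g, b):
        self.r, self.g, self.b = r, g, b


s = SlottedColor(1, 2, 3)
d = DictColor(1, 2, 3)

# Slots: no per-instance __dict__; each name is a descriptor on the class
# that reads a fixed location inside the instance.
slot_descriptor = SlottedColor.__dict__["r"]
print(type(slot_descriptor).__name__)            # member_descriptor
print(hasattr(s, "__dict__"))                    # False
print(slot_descriptor.__get__(s, SlottedColor))  # 1

# Regular class: attributes live in the instance dictionary.
print(d.__dict__)                                # {'r': 1, 'g': 2, 'b': 3}
```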
I think there's also a specialization for slots
Python/ceval.c line 3566
```c
TARGET(LOAD_ATTR_SLOT) {
```
didn't do this rigorously but it looks like the slots one has to do fewer checks than the instance dict one (LOAD_ATTR_INSTANCE_VALUE)
Interesting, it's only de-opting if tp->tp_version_tag != type_version so that means type_version values are globally unique even across all different classes?
I think it's per class
type_version comes from the cache associated with the instruction
wouldn't it need to de-opt if I passed it thousands of Color instances and then suddenly passed it a Cat?
At the moment the value of type version tag is globally unique, until the global int overflows at least. Also IIRC, Jelle is right that slots attribute specialization loads faster than instance attributes specialization.
yes, and the type version tag also changes whenever MRO changes
So if you do a class attribute write or change a method it updates.
ok cool. And I see what you mean about it caching the slot's memory offset directly, so slotted classes optimize faster than dictionary'd classes
also just an FYI: IIRC a deopt doesn't immediately change the instruction. There's a deopt counter, so the instruction only changes after it has deopted many times. If you pass it thousands of Color instances and then a Cat, it will still be specialised for Color; only once you pass it Cat enough times does it get re-specialized for Cat
This is intentional to account for changes in code execution and also some degree of type instability.
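To make the counter idea concrete, here is a toy model. It is not CPython's actual logic: the class `ToyAttrSite`, the `miss_limit` threshold, and the "fast"/"slow" labels are all invented for illustration.

```python
class ToyAttrSite:
    """Toy model of one attribute-load site. NOT CPython's real logic or thresholds."""

    def __init__(self, miss_limit=3):
        self.specialized_for = None   # class the site is currently specialized for
        self.misses = 0               # guard failures since the last (re)specialization
        self.miss_limit = miss_limit  # hypothetical threshold, chosen for the demo

    def load(self, obj, name):
        cls = type(obj)
        if cls is self.specialized_for:
            return "fast", getattr(obj, name)
        if self.specialized_for is None:
            # First time through: specialize for whatever class we saw.
            self.specialized_for = cls
        else:
            # Guard failed: count it, and only re-specialize after enough misses.
            self.misses += 1
            if self.misses >= self.miss_limit:
                self.specialized_for, self.misses = cls, 0
        return "slow", getattr(obj, name)


class Color:
    foo = "color"


class Cat:
    foo = "cat"


site = ToyAttrSite()
objects = [Color(), Color(), Cat(), Color(), Cat(), Cat(), Cat()]
paths = [site.load(obj, "foo")[0] for obj in objects]

print(paths)  # ['slow', 'fast', 'slow', 'fast', 'slow', 'slow', 'fast']
print(site.specialized_for.__name__)  # Cat
```

A single stray Cat does not throw away the Color specialization; it takes repeated misses before the site switches over.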
Interesting. How does it choose what to opt for in the first place? I understand it detects if it's "hot".
I'm not sure if this makes sense, but what if at the exact moment it realizes it's "hot" and decides to optimize, I pass it a Cat next? Presumably it would opt for the Cat. Then I guess the deopt counter would start counting up the more Colors it receives, and eventually it deopts
To 1. This function is called in LOAD_ATTR_ADAPTIVE which decides based on runtime values https://github.com/python/cpython/blob/3.11/Python/specialize.c#L656 .
To 2. Yeah if you trick it at the boundary it would need to deopt.
Python/specialize.c line 656
```c
_Py_Specialize_LoadAttr(PyObject *owner, _Py_CODEUNIT *instr, PyObject *name)
```
You can see the "countdown" happening in LOAD_ATTR_ADAPTIVE here https://github.com/python/cpython/blob/3.11/Python/ceval.c#L3479
Python/ceval.c line 3479
```c
if (ADAPTIVE_COUNTER_IS_ZERO(cache)) {
```
I see, thanks for all the info
FWIW, Mark Shannon gave a good talk on some of this at PyCon this year
Many of you will have heard that Python 3.11 is considerably faster than 3.10.
How did we do that? How are we going to make 3.12 and following releases even faster?
In this talk, I will present a high level overview of the approach we are taking to speeding up CPython.
Starting with a simple overview of some basic principles, I will show...
https://github.com/python/cpython/issues/107406
https://github.com/python/cpython/pull/107407
my first contribution 🥳
is it ok?
fair enough. it should go through
in case you missed it: https://discuss.python.org/t/a-steering-council-notice-about-pep-703-making-the-global-interpreter-lock-optional-in-cpython/30474
Posting for the whole Steering Council, on the subject of @colesbury’s PEP 703 (Making the Global Interpreter Lock Optional in CPython). Thank you, everyone, for responding to the poll on the no-GIL proposal. It’s clear that the overall sentiment is positive, both for the general idea and for PEP 703 specifically. The Steering Council is also l...
pep 7 says:
No line should be longer than 79 characters
but i see a lot of cases where this rule is violated (even in very recent changes)
is pep 7 outdated?
!pep 7
the 79c rule itself is quite outdated
why doesn't PEP 8 call snake_case snake case? this feels cursed
Is there a set process for getting reviews on cpython PRs? I've had a PR up for over a month (~1.5months) that unfortunately hasn't received any attention
ping some guys here and hope for the best
or open a thread on discourse
I am kinda bored. Which are your favourites or more interesting PEPs to you? I will be reading them. So far I know the most common ones tho
!pep 0
- PEP 509 – Add a private version to dict
- PEP 699 – Remove private dict version field added in PEP 509
I thought the 3.11 specializing interpreter uses dict versions (for optimized global lookups, for example). Am I wrong?
Since CPython 3.11, this field has become unused by internal optimization efforts. PEP 659-specialized instructions use other methods of verifying that certain optimizations are safe.
Hmm
Consider reading the PEP about nogil (PEP 703). It is huge and very interesting
Ok thanks! 🙂
3.11 uses dict keys version, not dict version.
Yeah, that makes sense
Hello! I’m trying to understand how the interpreter and threads are related to one another in cpython. I was going through this page https://devguide.python.org/internals/interpreter/
But I couldn’t understand the following paragraph
“Interpreter” refers to the bytecode interpreter, a recursive function. “Interpreter state” refers to state shared by threads, each of which may be running its own bytecode interpreter. A single process may even host multiple interpreters, each with their own interpreter state, but sharing runtime state.
Is there any other page/resource where the idea of multiple interpreters and how it relates to thread states is discussed?
I’m fairly new to cpython internals and any help would be appreciated. Thanks!
if i have two keys 5 and 55 with hash function like mod10 using separate chaining method
i set dict[5] = 'a'
then dict[55] = 'b'
if both of them get stored at 5 index of hashmap, how does it know which element to access?
for example, dict[5] should return 'a', but at index 5 both 'a' and 'b' are chained together
hashmaps typically store items in buckets for this reason
Python's dicts don't, though; they use open addressing (probing other slots of the same table on a collision), IIRC.
The main idea of detecting hash collisions (whether using buckets or not) is to store the key alongside the value in the table. Then during item lookup you can compare your key with the key found in the spot selected by your hash, and then do appropriate actions (go to another slot, or to the next item in the bucket) if they don't match.
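A toy separate-chaining table for the 5 / 55 example, to show the "store the key next to the value" idea. This is a teaching sketch with a mod-10 hash and integer keys only (the class name `ChainedMap` is made up), not how CPython's dict is implemented:

```python
class ChainedMap:
    """Toy hash map using separate chaining and `key % n_buckets` as the hash."""

    def __init__(self, n_buckets=10):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        return self._buckets[key % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:          # same key: overwrite in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))        # new key: chain it in the bucket

    def __getitem__(self, key):
        # Several keys may share the bucket; comparing the stored key
        # is what tells them apart.
        for stored_key, value in self._bucket(key):
            if stored_key == key:
                return value
        raise KeyError(key)


table = ChainedMap()
table[5] = "a"
table[55] = "b"

print(table._bucket(5))     # [(5, 'a'), (55, 'b')]
print(table[5], table[55])  # a b
```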
Can you submit a comment on an existing PEP?
if it's still under discussion, there will be a Discussions-To header pointing to the discuss.python.org thread discussing the PEP
What if it is instead to a mail.python.org archive?
then it's likely a PEP that has already been decided on
