#internals-and-peps
is that because the object domain is going through mimalloc and the other domains are not?
I think it's because we need to be able to read the refcounts of recently-freed objects. Their memory isn't actually reused until the world is stopped and everything is in a sane state, I believe.
So it needs to be safe to dereference a PyObject pointer and fetch its refcount. For freed objects, it will be zero (instead of a segfault or some random new object out of thin air).
I think both domains go through mimalloc.
Yeah, that seems to be the case
so - both go through mimalloc, but non-objects are immediately returned to mimalloc during free and objects are only returned during stop-the-world?
There’s also another mechanism called QSBR for non-object memory that allows lists and dicts and such to be resized while being read by multiple threads. That keeps memory around longer as well.
Since a list’s items array isn’t refcounted, for instance. Two threads need to be able to read both the old and new array at the same time.
.py
print("hellow warald")
Please try not to spam
I am considering giving up ChatGPT because it is making my thought process weaker. As a programmer or engineer, a person should be able to problem-solve, and I feel like ChatGPT is taking that ability away: the ability to actually meet resistance and then overcome it by dedicating time and focus. I would rather have weaker output from my own work than let somebody else do it. Or am I old school and should embrace the change?
What kind of problems are you solving with ChatGPT?
The problem with using LLMs for code is not that they help you, it's that they don't know how to program, and they will tell you incorrect information
that is just at the moment, I believe in the long run LLM will be able to become more advanced
well, when they do, then we can talk about it 🙂
hypotheticals, I love them
as tools improve and get better, you should adapt and use them to be more efficient yourself - but at the point where LLMs are at today, they make mistakes way too often to be useful in most real medium/large scale projects
Somebody suggested I post this here: https://discord.com/channels/267624335836053506/1278773754297843762 about python's zlib not behaving exactly like the C one in certain context
I can agree that they are still not good enough, but can you see them becoming more advanced in the future?
might take a while for someone that knows about it to show up, but yeah that's fit for this channel
I asked on zlib github and they directed me to stackoverflow : D
I personally would like to learn python and have therefore cancelled my ChatGPT membership.
well, seems like my help posting about zlib got locked : ( can't add any more info. is it normal for them to close so quickly ? I wanted to ask if python's zlib is adapted from the original zlib, and if there's any modifications done
yeah it gets closed after an hour of inactivity, you can continue the discussion in this channel
Are PEPs like rules?
https://peps.python.org/pep-0001/ explains
Thanks man, I'll read it.
I noticed that when I code in pycharm
It shows problems
When I press problems
It shows some pep with their number
That's why I'm asking.
Oh, you're probably seeing "PEP8" violations in PyCharm... https://www.jetbrains.com/help/pycharm/tutorial-code-quality-assistance-tips-and-tricks.html#turn-on-pep8. That's a particular PEP that has some style recommendations:
!pep 8
Wow, didn't know pycharm has so many features.
Has a lot more than that too!
someone had an interesting question on github, are statically allocated PyTypeObjects safe to use across multiple interpreters? as in, one PyType_Ready call in the main interpreter, and then can it be used safely from any subinterpreter?
I guess PyType_Ready is being called in every subinterpreter. Actually I remember one "related" issue: https://github.com/python/cpython/issues/117482
what's the tradeoff between sub interpreters and no gil in the context of multithreading?
so tp_dict is copied across interpreters, interesting
Hi guys, how to stay updated with the python news, releases, new features roll out and best practices and tools. Is there a place i can subscribe?
there are some mailing lists ( #mailing-lists ) (at least for official things like new Python versions, not so much for best practices or library releases) & the #reddit will typically talk about relevant things in the ecosystem
you can also try following some podcasts
I recommend the core.py and Python Bytes podcasts
What's a during loop?
kinds of loops I know:
- C-style for
- python-style for (aka for each/range based for)
- while
- do while
- repeat (do while with inverted condition)
- some langs also have special construct for infinite loop
async for x in y: ...
oke but i mean 2 loops running together
if you want to lets say test two algorithms (dyna vs priority sweeping) but dont want one to run after the other
!d zip
```py
zip(*iterables, strict=False)
```
Iterate over several iterables in parallel, producing tuples with an item from each one.
Example:
```py
>>> for item in zip([1, 2, 3], ['sugar', 'spice', 'everything nice']):
...     print(item)
...
(1, 'sugar')
(2, 'spice')
(3, 'everything nice')
```
...
no, this creates a list, i mean the full running of two nested algorithms
for might have been the wrong example, we can take while
sounds like you are misusing asyncio
semantics...
!d threading
Source code: Lib/threading.py
This module constructs higher-level threading interfaces on top of the lower level _thread module.
Changed in version 3.7: This module used to be optional, it is now always available.
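For the "two loops running together" question above, a minimal sketch with `threading` might look like the following. The function `run_algorithm` and the labels `dyna` and `sweeping` are placeholders made up for illustration; they do not implement those algorithms.

```python
import threading


def run_algorithm(name, steps, log):
    # Placeholder loop standing in for one algorithm (hypothetical workload).
    for step in range(steps):
        log.append((name, step))


log = []
workers = [
    threading.Thread(target=run_algorithm, args=(name, 3, log))
    for name in ("dyna", "sweeping")
]

# Start both loops, then wait for both to finish.
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(sorted(log))
```

On a build with the GIL, the two loops take turns rather than running on separate cores at the same time, so this helps with waiting-heavy work more than with CPU-bound work.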
yuh
oh yes i know this feature
my greedy ass asking for a new name for sth that alr exists
if a thing already exists and is easy to use, there is no need for a new name for it
We don't allow advertising in this server. Please read #rules and #code-of-conduct . Your message has been removed.
in what language is "during loop" a thing? never heard of it before
hi
that’s the thing: it doesn’t
yet
Ask in #career-advice
Thanks for posting that link. I will check that channel. You're goated for that, brother
!pep 8
what would you guys suggest to collect a list of all valid words within the python language? set, list, collections etc
probably the glossary? alternatively the syntax specification
huh, good idea, let me check
import everything you can, go over all code objects, modules, classes, functions, ... and extract all kinds of names from them
maybe sprinkle a little bit of filtering
or parse the documentation and extract everything that looks like a name from it
I once did that with dunders, found a couple of interesting things
what do you mean by "valid words"? you can find a list of keywords somewhere in the docs, but "set" and "list" are not keywords, they're builtins
The following identifiers are used as reserved words, or keywords of the language, and cannot be used as ordinary identifiers. They must be spelled exactly as written here:
False await else import pass
None break except in raise
True class finally is return
and continue for lambda try
as def from nonlocal while
assert del global not with
async elif if or yield
no, that is not it...
there is a module that holds a list of keywords and soft keywords
!d keyword
Source code: Lib/keyword.py
This module allows a Python program to determine if a string is a keyword or soft keyword.
list of builtins is pretty trivial to get
list of stdlib modules is available somewhere in sys (not sure if it is complete or consistent across platforms)
sys.stdlib_module_names I believe
looks like it's all modules that are on the stdlib on any platform. some of them will not exist on your platform
Wasn’t talking about keywords necessarily, just words that are valid in Python syntax
The glossary seems to have a large list of what I wanted, but I think I’ll expand to known attributes and such
You used set as an example, but set is "valid Python syntax" in exactly the same way as spam is. From a syntactic point of view, it's just an identifier
to list builtins:
set(__import__("sys")._getframe(0).f_builtins)
to list keywords:
set(__import__("keyword").kwlist)
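Pulling together the sources mentioned so far (keywords, soft keywords, builtins, stdlib module names), a rough sketch could look like this. The helper name `python_vocabulary` is made up for this example, and the `getattr` fallbacks assume that `keyword.softkwlist` and `sys.stdlib_module_names` may be missing on older Python versions.

```python
import builtins
import keyword
import sys


def python_vocabulary():
    """Collect a few categories of names that the interpreter knows about."""
    return {
        "keywords": set(keyword.kwlist),
        # softkwlist is not present on older versions; fall back to empty.
        "soft_keywords": set(getattr(keyword, "softkwlist", [])),
        "builtins": set(dir(builtins)),
        # stdlib_module_names is not present on older versions either.
        "stdlib_modules": set(getattr(sys, "stdlib_module_names", ())),
    }


vocab = python_vocabulary()
print(len(vocab["keywords"]), len(vocab["builtins"]))
print("set" in vocab["builtins"], "set" in vocab["keywords"])
```

This also shows the distinction made in the discussion: `set` is a builtin name, not a keyword.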
fair enough, but set has special meaning to the language itself, spam would be a variable name that isn't part of the name
@jade raven can you tell us more about how you will use the list of words? That might help us find the best answer.
it'll be additional data for a transformer model in order to filter out gibberish text from LLMs when they go haywire
there are considerably less obscure ways to do that too 🙂
currently the model i'm using filters out snippets of text as gibberish because it contains words that belong to the language and i want to retrain it with more context
i don't understand: it sounds like you said if the text uses words from the Python language, then it's considered gibberish?
for builtins, i could simply use __builtins__ (which exists in pypy but doesn't react to changes as cpython afaik), right?
not sure about other implementations, but they are far behind anyway
aren't they
I'd use import builtins; set(builtins.__dict__)
for example,
```py
set(__import__("sys")._getframe(0).f_builtins)
```
or I suppose dir(builtins)
i thought about that, though a frame doesn't have to use the builtins module as its source of builtins, does it?
if it had to, why the f_builtins api?
why is that gibberish? It's literally something useful an actual human said.
but i think any mutations of builtins usually just happen to the builtins module anyway
It sounds like they're saying their model is incorrectly flagging that as gibberish
would have to check how _sitebuiltins is applied
because the model doesn't know Python enough
i see
yep!
it's trained on english, so i figure i should teach it more about python, i'm not home right now or else i'd send an example
so yeah, you could do something like extract a list of words from the Python docs, but not sure how well that would work. Maybe the bigger problem is that you're training a model on English and expecting it to recognize Python; instead, you could get a corpus of Python code to train on.
That's not really what this channel is about though, more of an ML problem
my original question fit this channel better methinks, just went offtopic a bit
you could do something like extract a list of words from the Python docs
yeah i was thinking of extracting words from the glossary rst
Your original question isn't really answerable, which is why we went off on some tangents to figure out what you meant
although __builtins__ is still not guaranteed to be (practically) equal to sys._getframe(0).f_builtins, i believe; but that's an edge case of an edge case i ran into in very rare circumstances.
shifting from the topic a bit... since the new type syntax arrived in the stdlib, do you think it's feasible to run Ruff/Black/whatever formatter suits on the entire stdlib codebase, to give it uniform style?
appending that rev to .git-blame-ignore-revs ofc, but that's a detail :P
no, lots of core developers will be very upset
🤔
(and I'm not sure what new typing syntax has to do with it)
i understand the confusion. i meant that i observed something syntactically backwards-incompatible, although i see it doesn't matter due to stdlibs being part of cpython releases.
The word set doesn't have special meaning to the language itself. The only thing special about the name set is that it's a member of builtins
and what that has to do with it... well, i associated that made-up 'backward compatibility' convention with why the old code is left untouched for styling, maybe (again, made up) to facilitate generating backports.
why would they?
because they don't like autoformatters. I am not well placed to defend that position because I don't agree with it, but it's not a battle I'm interested in having.
oh, I see.
i'm not here for battles, but interested in the reasons to not like it--where can i look for an answer then?
there are probably some threads on discuss.python.org if you search for Black
thanks
i know barry worked on blue, just guessing now that it may also be a conflict on which formatters would be chosen
but well, somehow the community agreed on the conditional expression syntax democratically... or not, i don't remember the story very well :p
dropping what i found here, in case someone else happens to be interested too https://discuss.python.org/t/clean-code-of-stdlib/3755
I think this is low priority, but do core developers have a plan to make code in standard library more PEP8-compliant? I feel ashamed when introducing Python to non-Python programmer and he/she see a lot of red error indicator in VS Code when accidentally open std lib source code.
thanks. i see the reasoning. thanks for help.
also regarding that... why? shouldn't typeshed manage all annotation-related concerns in the stdlib code?
or is it a myth and those annotations of typing.override are essentially necessary
btw, reformatting the stdlib would be opposed even if everyone liked autoformatters: they don't like edits that churn the source code.
yes, i see the time and cost that would bring
yes, my "they don't like autoformatters" is simplistic, there are a lot of good reasons 🙂
i'm slowly coming to terms with autoformatting.
i like what that problem teaches us
which problem, and what does it teach us?
Not sure what you're referring to specifically, the only @override I see in the stdlib are in the tests for the decorator and docs
Lib/typing.py lines 3454 to 3457
```py
type _Func = Callable[..., Any]

def override[F: _Func](method: F, /) -> F:
```
having new-syntax annotations
it's not very often to see type annotations in stdlib code, or am i mistaken
I guess I put that in to set a good example. It's not really necessary.
i see, thanks :)
i mean, annotations are also in other places anyway
but i was always used to annotation-free code in stdlib
I think we found at least one bug in the PEP 695 implementation thanks to using it in the stdlib
https://github.com/python/cpython/pull/104553 is where this was done
the problem of making stdlib automatically formatted to be more consistent in every place.
what it teaches us (or maybe it's only me, if nobody relates): it's very hard and probably unlikely, given the time and cost necessary at this scale.
i used to believe it wouldn't be so hard, and used to believe that it's a worthwhile change; but how the code looks doesn't matter as much as how the code works, at the end of the day.
the only thing that bothers me is the fact that the rest of the code is untyped in the module that partially introduced type annotations.
i would typically think--annotate the rest
but i can see it being problematic and unnecessary, given typeshed & co. 🤷
you are already asking in the main channel. This channel is about internals and peps.
interestingly, typeshed does actually use an autoformatter while cpython doesn’t
because it's typeshed
people don't import it directly
although hm, it is as fragile as the stdlib source code i'd think
hm, no
types might not pass, but the code must work
typeshed does have actual source code e.g. tests
so there's a key difference
who, except typeshed, imports typeshed tests in production code?
ive been guilty of using typeshed’s utility types in the past
it's fine to use typeshed's utility types
as long as they are marked as stable
in comments
typeshed has a different history and a less conservative style of development
i use _typeshed.StrPath often
out of curiosity, are there really that many cases where running a formatter breaks things?
depends on the definition of a formatter
and 'breaking things'
i assume that by breaking things you mean changing targetted code in a backwards-incompatible manner
i think it would be vanishingly rare. Black at least used to check that the ast was the same before and after.
the problem is with things like pragma comments for tooling, which get moved, and then don't silence the warnings they used to.
i'm not sure people are trying to solve it.
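To make the pragma problem concrete, here is a small sketch. It compares two layouts of the same function: the ASTs match, but the comment ends up on a different line. The `after` string is a hand-written reformatting for illustration, not real Black output, and `comment_lines` is a helper invented for this example.

```python
import ast
import io
import tokenize

before = (
    "def f(a, b,  # type: ignore\n"
    "      ):\n"
    "    return a + b\n"
)
# Hand-written "formatted" version (an assumption, not actual Black output).
after = (
    "def f(\n"
    "    a,\n"
    "    b,  # type: ignore\n"
    "):\n"
    "    return a + b\n"
)


def comment_lines(source):
    """Return (line number, comment text) for every comment in the source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tok.start[0], tok.string) for tok in tokens
            if tok.type == tokenize.COMMENT]


# ast.dump() leaves out line numbers by default, and comments are not in the AST.
same_ast = ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))
print(same_ast)
print(comment_lines(before), comment_lines(after))
```

An AST-equivalence check passes here even though the comment moved from the `def` line to the line of the second parameter, which is the kind of change a line-based pragma is sensitive to.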
does black use libcst under the hood? i think libcst was made later
oh, it doesnt
gotta check how it handles preserving cst information then
maybe something similar to how lib2to3 did that
the only reason i would be an advocate for adding an autoformatter is that it’s a PITA to disable my formatter every time i want to edit test code or stdlib code
i have my formatter disabled by default and enable it in appropriate workspaces if necessary.
beware various projects can have different guidelines--you might want to use black if black is the used formatter, and blue if blue is the way to go.
ruff is not a universal solution here--i don't think it handles all the config of various tools it absorbed some of its logic from.
do they have a reason to not try?
preserving comment position semantics seems like a nice and valuable feature in a formatter
you would have to ask them. the black issue tracker has a number of issues about it, but I don't know how many have been fixed.
i guess you faced those incompatibilities often in context of pragma: no cover comments and such
interesting. i'm wondering what the status is for ruff
black uses a forked version of lib2to3 under the hood
ha! that was a nice guess then.
it's not really tractable in general. we attempt to preserve some common pragma comments like type ignores, but I don't think you can totally solve the problem of pragmas pointing to the wrong line and preserve useful autoformatting
hmmm, i think it necessitates a good example 🤔
i can't make up one from head rn
maybe thats offtopic
if you have
```py
def f(a: foo, b: bar[baz],  # type: ignore
) -> None: some(code)
```
(with a longer signature) Black would want to put each parameter on its own line
but it doesn't know if the type ignore applies to one of the parameters, or to the def
so the only real solution would be to not format that line
and not formatting any line with comments in it is not what you'd want an autoformatter to do
since that in practice affects type exprs in this line, why not format it to still affect all type exprs in the line after formatting?
i'd say just calculate what happens in practice
but yeah, that means integrating different meanings of comments in those tools
i can as well invent my own DSL used in comments, and in an analogous example
```py
def f(a: foo, b: bar[baz],  # myowndsl: spameggs
) -> None: some(code)
```
it could target only the relevant parameter, so it is true that the only real solution is to not format that line, as it is not really known how to preserve the underlying semantics
but!
we could integrate known cases
such as type: ignore
since that in practice affects the entire function
It doesn't, it affects one line.
yes! thanks.
affecting entire function would also mean affect the entire body
this affects type checking in that line specifically, so i imagine
```py
def f(a: foo, b: bar[baz],  # type: ignore
) -> None: some(code)
```
being formatted as
```py
def f(
    a: foo,  # type: ignore
    b: bar[baz],  # type: ignore
) -> None:
    some(code)
```
does that look correct?
but hmm
after f(
no, because the ignore might be for an error that gets reported on the def
and because if you duplicate the comment, you may also break type checking (with --warn-unused-ignores)
and more generally, there is an unlimited set of possible special-meaning comments. Black can't realistically reason about how all of those should be handled.
yeah... just now i realized this ^ is an existing problem and there's no way of avoiding it differently than just by always formatting first.
or being lucky.
well, i'm taking this back 😅
wouldn't that already appear in the first scenario?
oh
it wouldn't
because ignore could apply to any of cases from that line
yup, that's too convoluted to think of a working solution
@uneven raptor now imagine this in stdlib https://github.com/python/typeshed/commit/ee487304d76c671aa353a15e34bc1102bffa2362
😅
well, as long as functionality doesn’t change, it’s not that big a deal is it?
i guess it messes up the blame a little bit
Nice pink Zero
git lets you mark commits to ignore in the blame, which is sometimes better sometimes worse.
😛
oh, TIL. why don’t they want formatters then?
because sometimes skipping commits in blame is more confusing than showing them, and they don't want to break anything, and all the other reasons they try not to change code that doesn't improve its behavior
have you ever considered a career in research
Random question is random
But yes, I’ve broadly decided on research or technical writing if this computers thing doesn’t take off
not exactly random, you seem to be great at digging up stuff along with the data to back it up
Tanks
Imagine the formatter being buggy and producing slightly wrong code... that will be a much more exciting debugging experience if you will hide the formatting commit
Is this supposed to happen? I think it isn't?
import types
import inspect
inspect.getmembers(types.CellType())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.11/inspect.py", line 595, in getmembers
    return _getmembers(object, predicate, getattr)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/inspect.py", line 573, in _getmembers
    value = getter(object, key)
            ^^^^^^^^^^^^^^^^^^^
ValueError: Cell is empty
getmembers ignores members which raise AttributeError, but CellType raises ValueError when it is empty
maybe CellType should raise AttributeError in that case
```py
class EmptyCellError(AttributeError, ValueError): ...
```
or this ^ to preserve backwards compatibility
I think maybe I just need to vendor my own getmembers() implementation?
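A vendored variant along those lines could be as small as the sketch below. The name `safe_getmembers` is made up for this example. It skips any attribute whose getter raises, which is broader than what `inspect.getmembers` does, so treat it as one possible workaround rather than a drop-in replacement.

```python
import types


def safe_getmembers(obj):
    """Return (name, value) pairs, skipping attributes whose getter raises."""
    members = []
    for name in dir(obj):
        try:
            value = getattr(obj, name)
        except Exception:
            # An empty cell raises ValueError for cell_contents; skip it.
            continue
        members.append((name, value))
    return members


empty_names = [name for name, _ in safe_getmembers(types.CellType())]
full_members = dict(safe_getmembers(types.CellType(42)))

print("cell_contents" in empty_names)
print(full_members["cell_contents"])
```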
This seems unintentional, but I can't think of what the actual expected behavior should be.
why do you need inspect.getmembers?
imo, it should not be used anywhere other than playing in REPL
inspect.getmembers_static() doesn't error, but idk whether in other cases you need descriptor behavior being triggered or not
I would prefer it not to be. The context is that I'm trying to get every object that object holds a reference to.
maybe use gc.get_referents instead?
or whatever can be used from gc, i dont remember exactly
I forget, there was some reason why I wasn't.
getmembers_static() seems like a proper solution for the use case; descriptors often don't imply hard references
but make up values on the fly
Oh, I was doing both. I was calling get_members to get the attribute names, and get_referents to get the full list.
Then matching up the list so I could label attribute names
Oh, interesting.
just beware getmembers_static is 3.11+, you can backport it if you need to support 3.8+ (or 3.9+ more realistically, since 3.8 is EOL soon)
That should be fine, the program I want to debug is 3.11+ anyway.
const tuples are optimized in 'comptime'.
any idea why const sets aren't?
i assume the mutability of sets doesn't matter here, because what i'd expect is like here
we're calling BUILD_SET in each case
ok, lists aren't optimized either
must have to be the mutability here then...?
(Python 3.12.5 (main, Aug 14 2024, 05:08:31) [Clang 18.1.8 ] on linux)
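The difference being discussed can be reproduced with `dis`. This is a small sketch; the helper `opnames` is made up for the example, and exact opcode sequences vary between CPython versions, so only the presence or absence of the `BUILD_*` opcodes is checked.

```python
import dis


def opnames(source):
    """Compile an expression and return the names of its bytecode instructions."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


tuple_ops = opnames("('a', 'b', 'c')")
list_ops = opnames("['a', 'b', 'c']")
set_ops = opnames("{'a', 'b', 'c'}")

print("BUILD_TUPLE" in tuple_ops)  # the tuple is loaded as a single constant
print("BUILD_LIST" in list_ops)    # a new list is built on every evaluation
print("BUILD_SET" in set_ops)      # likewise for the set

# Shown without a claim about the result: how tuple concatenation compiles here.
dis.dis(compile("('a',) + ('b',)", "<example>", "eval"))
```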
Yeah, it would be invalid for checkcomp() to return the same object every time
i don't think that it would ever return always the same object though
last checkcomp() that just returns {"a", "b", "c"} doesn't always return the same set
Oh, you mean that it's not the same as a plain list literal
yep, like this
i wonder why this is not the same as if we used ["a"] + ["b", "c"] for example
well, optimization is a best-effort thing 🙂 there are no guarantees
I guess + on list literals is just not taken into account here
question is: can it?
are there any reasons not to?
i doubt that i was the first one to have that idea
it can
but it's probably not worth it
you can gain negligible speedups in rare cases at the cost of complicating compiler/optimizer code
are const tuple cases less frequent than const list/set cases?
i've never seen ['a'] + ['b'] in practice
The surprising optimization i just heard about: "%s %r" % (x, y) is compiled to f"{x!s} {y!r}"
how about ("a",) + ("b",)
(which is optimized)
that'd be a very frequent optimization!
%a as well ("%s %r %a" % (x, y, z) is compiled to f"{x!s} {y!r} {z!a}")
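One way to look at this is to compare the bytecode of the two spellings. This sketch assumes CPython 3.11 or newer for the `%` rewrite; on older versions the `%` form is expected to compile to an ordinary binary operation. The helper `opnames` is made up for the example.

```python
import dis
import sys


def opnames(source):
    """Compile an expression and return the names of its bytecode instructions."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


percent_ops = opnames('"%s %r" % (x, y)')
fstring_ops = opnames('f"{x!s} {y!r}"')

# BUILD_STRING is the opcode that joins the pieces of an f-string.
uses_build_string = "BUILD_STRING" in percent_ops

print(sys.version_info[:2], uses_build_string)
print(percent_ops)
print(fstring_ops)
```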
yes, i haven't explored it, but I imagine it's pretty complete.
haha
me and my folk recently opened an issue about more common ast optimizations... and it was closed by a core developer: https://github.com/python/cpython/issues/123080 Actually, I kind of agree with Serhiy that this kind of change requires more motivation and rationale. Currently I'm writing a document (I guess this can't be called a PEP) to show the motivation and rationale.
cool! where can i read up on how common is the ("a",) + ("b",) case?
or "foo" + "bar" case
FYI, I've recently written tests for these optimizations: https://github.com/python/cpython/blob/8ef8354ef15e00d484ac2ded9442b789c24b11e0/Lib/test/test_ast/test_ast.py#L3040-L3332
Feel free to send me a PR for cases that I forgot to include (if they exist 🙂 ). Currently, I'm thinking about how to test all AST optimization paths without having to write each path manually in the tests
Yeah, that's the most interesting question in the "rationale" paragraph. I currently don't have any numbers, but I know that we need to scan the top-5,000 Python packages and somehow search for them and then decide whether to optimize them or not.
In my horrible opinion, I would like to add every possible common ast optimization, but it seems like an overhead.
it's because list is not a constant in terms of AST. AST optimizer can optimize binary operations only if both operands are constants
the same applies for unary operations
what makes a tuple constant in terms of ast, and list not constant?
both constructs have only constant children
and are parsed from displays
did you know that?
not (a is b) -> a is not b
not (a in b) -> a not in b
so there's no rationale for the existing optimizations?
List can't be a constant in terms of AST in any way.
Tuple can be a constant if all of elements are constants(literals).
for example:
(1, 2, 3) - a constant
(a, 1, 2) - not a constant
[1, 2, 3] - not a constant
[n, 1, 2] - not a constant
yes, i see, but why?
oh.. these optimizations were added about seventeen years ago..
I guess the most correct answer is - "just because"
AST optimizations need to be revised (IMO), but I'm not sure if it's worth the effort.
for example, did you know that
for i in [1, 2, 3]: pass
transforms to
for i in (1, 2, 3): pass
and
for i in {1, 2, 3}: pass
transforms to
for i in frozenset({1, 2, 3}): pass
you can check it via dis (as you did before)
or you can directly see which ast is generated using
import ast
print(ast.dump(ast.parse("for i in [1, 2, 3]: pass", optimize=1)))
(FYI, optimize parameter for ast.parse was added in 3.13)
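On versions without the `optimize` parameter, the same effect can be observed from the compiled code object. A minimal sketch follows; the variable names are made up, and the set case is printed without a claim about the result.

```python
import dis

list_code = compile("for i in [1, 2, 3]: pass", "<example>", "exec")
list_ops = [ins.opname for ins in dis.get_instructions(list_code)]

# The list literal is gone: the loop iterates over a constant tuple instead.
has_tuple_const = (1, 2, 3) in list_code.co_consts
builds_list = "BUILD_LIST" in list_ops
print(has_tuple_const, builds_list)

# Same idea for a set literal; inspect the constants to see what was stored.
set_code = compile("for i in {1, 2, 3}: pass", "<example>", "exec")
print(set_code.co_consts)
```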
for example, did you know that
for i in [1, 2, 3]: pass
transforms to
for i in (1, 2, 3): pass
and that is very odd, given what you said before, that
List can't be a constant in terms of AST in any way.
Tuple can be a constant if all of elements are constants(literals).
and i think i can find a case where that optimization breaks the code, which is super-funny
This optimization differs from others in that it searches for a list literal as an iteration target and then converts it to a tuple (same for set -> frozen set).
(custom str subclass in locals() keys, appending to list on assign)
can you show the example?
sure
if an ast optimizer can provide code that is semantically different from the original code, it should be fixed
ok, nvm :D
i thought it was somehow possible to intercept the list or iterator and immediately exhaust it, like you could have in non-inlined comprehensions (<3.12) via .0 magic variable
https://bpa.st/7J2BC
like, if you run [locals().get(".0") for _ in range(1)] on Python 3.11, it will tell you the iterator
oh, wait
it should be sys._getframe(1)
but you can't even try to repro there
because it creates new scope
there's no way to nonlocal/global in an old-style (scope creating) comprehension
what about constant comparisons
they're not folded
every constant operation (iirc) is optimized except comparisons for some reason
Folding of comparisons can be done during the ast optimization, but we can't use PyObject_RichCompare here because.. well.. I don't know why, the interpreter segfaults if we do that. Actually we can just compare the pointers, which is easy
Therefore, in and not in operations can also be folded, because that's just an iteration + ==/!=
hello everyone, why import turtle can't run in my mac?
you should be asking in #python-discussion but it's because you named your file turtle.py
how do we utilize tensorflow gpu on pycharm
i tried every possible way, but I can't find the right solution
when does the garbage collector actually run? in between frames?
iirc, it performs some logic on each allocation
basically, every X1 allocations it collects the first generation, every X2 first-generation collections it collects the second generation, and every X3 second-generation collections it collects the third generation
!d gc.set_threshold
```py
gc.set_threshold(threshold0[, threshold1[, threshold2]])
```
Set the garbage collection thresholds (the collection frequency). Setting *threshold0* to zero disables collection.
When the number of allocations minus the number of deallocations exceeds threshold0, collection starts.
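The thresholds and the current counters can be inspected at runtime. A short sketch: the values `1000, 15, 15` are arbitrary numbers picked for illustration, and the defaults differ between CPython versions, so none are shown here.

```python
import gc

thresholds = gc.get_threshold()  # a 3-tuple, one entry per threshold
counts = gc.get_count()          # the current collection counts, also a 3-tuple
print(thresholds, counts)

# Temporarily make collections less frequent, then restore the old settings.
gc.set_threshold(1000, 15, 15)
print(gc.get_threshold())
gc.set_threshold(*thresholds)
```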
no, it can run within one frame. Within the eval loop, it can run after any bytecode, in general
that sounds... inefficient. how long does it take for the GC to do a collection cycle (among other things)?
Inefficient how? it doesn't run after every bytecode instruction, but it can run after any bytecode instruction, when the heuristics denball mentioned above are satisfied. Most of the time they're not, so it doesn't run. Sometimes they are, so it does.
how long it takes to run depends on how many objects exist, and how many of them are garbage
inefficient as in, running the gc seems like an expensive operation -- i would have thought that running it less would be better for performance
yes, that's the goal
having more places where it can run doesn't necessarily impact the frequency with which it runs
if there aren't enough places where it can run, you can wind up with a situation where there's a huge amount of garbage not being collected because the GC isn't able to run
i see -- so very broadly, how much of a bottleneck is the GC in a python program?
if it only ran, as you suggested, between frames, then this would eventually run out of memory:
```py
class C:
    pass

def foo():
    while True:
        x = C()
        y = C()
        x.y = y
        y.x = x

foo()
```
theoretically, couldn't that be solved by just collecting in between frames and then in between iterations or something? (i'm not suggesting that it's a good idea, i'm just speculating)
what if there's no iteration or frames? What if the function is 100,000 lines long?
anyway, "between iterations" already lands you at "after some bytecodes"
ouch, don't do that
beyond that, it's just a question of which bytecodes you'd check at. I believe it can only trigger after certain bytecodes and not all of them. I'm not sure which
ok, interesting to know
^I would like an answer to this too! Only out of curiosity though, I've never had to care about performance
from what godly said, im assuming that the answer is not much, because the GC only runs a lot on programs that actually need a lot of runs (as in, they're allocating a lot of memory)
I think that question doesn't really admit a general answer. Different programs have different performance characteristics; profile your program to figure out whether GC is a bottleneck for yours.
I think the pyperformance suite includes a benchmark that (on purpose) spends much of its time in the GC. Basically all it does is create cycles.
also, memory is usually reclaimed via reference counting, not a "run of the gc" at all. The gc is used to collect circular structures.
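The split between reference counting and the cycle collector can be seen with weak references. This is a sketch of CPython behaviour; `make_plain` and `make_cycle` are names invented for the example, and automatic collection is switched off only so the demo is deterministic.

```python
import gc
import weakref


class C:
    pass


def make_plain():
    obj = C()
    return weakref.ref(obj)  # obj's refcount drops to zero when this returns


def make_cycle():
    x = C()
    y = C()
    x.y = y
    y.x = x
    return weakref.ref(x)  # x and y keep each other alive after this returns


gc.disable()  # keep the cycle collector from running on its own
try:
    plain_alive = make_plain()() is not None

    cycle_ref = make_cycle()
    cycle_alive_before = cycle_ref() is not None
    gc.collect()  # an explicit run of the cycle collector
    cycle_alive_after = cycle_ref() is not None
finally:
    gc.enable()

print(plain_alive, cycle_alive_before, cycle_alive_after)
```

The plain object is gone as soon as its last reference disappears, while the cycle survives until the collector runs.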
yeah, im thinking about this in the context of a program that has a lot of cycles
It's hard to answer these things generically, but to try to take a stab at a generalization, and at least bring up a relevant point if I fail:
the biggest cost I suspect in the average GC program isn't the actual cycles spent on the GC itself
it's the fact that GC more or less tends to force a lot more indirection, and memory to be laid out a lot less contiguously
in C++ if you have
struct Person {
    int id;
    int age;
};
then id and age are always right beside each other, and once you've "reached" a Person, there's never any further indirection.
@dataclass
class Person:
    person_id: int
    age: int
in the python equivalent, that's not going to be the case in general - person_id and age are not stored "in" the Person, person stores some pointers to some other location in memory where the data lives
and you'll have to chase that pointer every time you want to access fields
You can use GC strategies to try to recompact memory and so on but it's never going to be the same, and that pointer always has to be chased.
None of this is technically running refcounting/GC code, we haven't said anything about freeing memory.
but in order for GC to work, any object in your program is capable of picking up additional references which can keep it alive. So you need this additional pointer/indirection, and this lack of guarantee about contiguous memory layout.
in C++ if you have
Person p{...};
a = p.age;
ap = &p.age;
When p goes away, well, a is an unrelated copy, and ap will simply dangle. C++ knows that the whole Person is going away at the same time and nothing will ever change that, hence contiguous memory and no indirection.
p = Person(..)
a = p.age;
// no equivalent since python doesn't distinguish values and pointers the way C++ does
in the python version, the age field will have to be kept alive even if Person gets cleaned up
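A small sketch of that last point, assuming CPython's immediate refcount-based cleanup. The `nicknames` field is invented here (a list is easier to observe than a small int); the idea is just that a field's value is a separate object that can outlive the object holding it.

```python
from dataclasses import dataclass, field
import weakref


@dataclass
class Person:
    person_id: int
    nicknames: list = field(default_factory=list)


p = Person(1, ["Al"])
names = p.nicknames       # a second reference to the same list object
p_ref = weakref.ref(p)    # lets us observe when the Person itself is freed

del p                     # on CPython the Person is freed right here
person_gone = p_ref() is None

names.append("Ally")      # the list is still alive and usable
print(person_gone, names)
```

Because any field can pick up an outside reference like `names`, the Person can't simply embed its fields inline the way the C++ struct does.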
This applies to everything in python pretty much - class members, data structures. And not just python - pretty much every GC language (or at least, the "GC" types in those languages - some GC languages have value types).
all that additional indirection and lessened memory contiguity is extremely expensive even when no ownership changes are happening
compare the memory layout of a C++ vector<Person> to a python list[Person] - dramatically different
but vector<Person*> and list[Person] are pretty much identical
Still not, because when you access v[0].age (or v[0]->age in the C++), there's still an extra pointer chase in the python
compared to the C++
that's why I say "dramatically different" - the python memory layout involves two extra layers of indirection
@uneven raptor
how GC works can also vary on the implementation.
see https://doc.pypy.org/en/latest/cpython_differences.html
well yeah, reference counting itself is an implementation detail
so you can never be sure GC collects a dead object immediately
i would certainly only support cpython if i needed behavior like that
Most GC languages do not make any guarantees, because the guarantees aren't worth too much and having the option to change implementations is worth a lot more.
the big exception is swift which is explicitly designed as an only reference counted language. Which adds some manual work to the language (not too much), and in exchange gives some useful guarantees and perf.
I didn't know this, very interesting
It happens anywhere signals are handled or thread switches occur. As an implementation detail, it’s when starting/resuming a frame, after returning from calls, and at the bottom of loops.
So a 100,000 line function with no calls, yields, or loops wouldn’t collect GC or handle signals till it got to the end, I’m pretty sure.
(Assuming no other frames ran, like finalizers or properties or whatever.)
it would be tricky to write any useful code like that, I think
IIRC, it can also be triggered by memory pressure. IIRC, if PyMem_Malloc fails, it runs a GC cycle then tries again
By hand, definitely, but code generators do some weird stuff sometimes.
It used to happen every n GC allocations, but that was changed two years ago because having GC run on any allocation creates all sorts of headaches. So now it just sets a flag on the eval breaker. I think that was changed in 3.11 or 3.12.
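If you want to watch when collections actually happen in your own program, `gc.callbacks` is one way to do it. This is only an observation sketch: it forces a collection with `gc.collect()` so there is something to record, and it says nothing about which bytecodes check the eval breaker. The `on_gc` and `events` names are made up.

```python
import gc

events = []


def on_gc(phase, info):
    # phase is "start" or "stop"; info is a dict that includes "generation"
    events.append((phase, info["generation"]))


gc.callbacks.append(on_gc)
try:
    gc.collect()  # force a collection so at least one start/stop pair shows up
finally:
    gc.callbacks.remove(on_gc)

phases = [phase for phase, _ in events]
print(phases)
```

Leaving the callback installed while a real workload runs gives a rough picture of how often the collector fires.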
We had really nasty situations like having frame objects be allocated while we were trying to allocate that same frame. Nightmare.
And with the JIT we need GC to run in more predictable places, otherwise the world could be invalidated on any allocation, which is everywhere.
I remember the fun you had with that during the Santa Clara sprint
Yup.
Ah, I stand corrected then
Thanks!
Help I keep on getting User Cancelled Installation error when I’m not pressing the cancel button
#❓|how-to-get-help Open a help thread plz.
!class
in the situation
foo = [0, 1, 2]
bar = foo
Can anyone explain to me the rationale that led the devs to opt to turn bar into a pointer to foo, rather than a full-fledged copy of foo? Why isn't the standard behavior of that operation the same as if using the copy function?
copies are expensive, and shouldn't happen implicitly.
you wouldn't want to copy the list when you pass it into a function.
why not?
variables don't hold any data, they hold pointers to data
assignment always copies only one pointer, type doesn't matter
to be clear, this doesn't "turn bar into a pointer to foo". foo and bar are both pointers to the same list.
that's still wrong. every variable is a reference. It doesn't "turn bar" into something different than foo at all
nothing turns into anything
every variable is always a pointer to some object
because that's a lot of copying, which will slow everything down
I see
not my point but sure
it underlies your point. Your question was predicated on an incorrect understanding that foo and bar are somehow different. They're not, both foo and bar are references to the same list. They're on totally equal footing.
a = b always makes a refer to the thing that b refers to
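A quick sketch of what that means in practice, reusing the `foo`/`bar` names from the question (`baz` is added here for the explicit-copy case):

```python
foo = [0, 1, 2]
bar = foo              # copies the reference, not the list
bar.append(3)          # mutates the one list both names refer to

same_object = bar is foo

baz = foo.copy()       # an explicit (shallow) copy is a separate list
baz.append(4)

print(same_object, foo, baz)
```

After this, `foo` has picked up the `3` appended through `bar`, while the `4` appended to `baz` stays out of `foo`. Copying only happens when you ask for it.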
there was that great blog post with boxes, and the variables are labels you put on the boxes
when I was starting python it helped me a lot
https://www.youtube.com/watch?v=_AEJHKGk9ns&feature=youtu.be is our go-to for explaining this to people
a video is good but some people might prefer writing. Unfortunately this post was from over a decade ago so I think it's unlikely I'll find it
possibly https://nedbatchelder.com/text/names.html ? That's the article version of the talk
No, that's not it. Although it looks really nice! good stuff Ned
the one I read I don't remember much about but it had actual drawings of boxes and tags on the boxes (like luggage tags)
the funny thing is I found a few blog posts which copied this idea, I suspect from that source, but I can't find the original now
anyway I'm sure this is fine
(probably better, in fact)
the funny thing is though that this is actually how most languages work, so when people ask questions like this it's such a giveaway that they're coming from C or C++ (the main languages where it doesn't work this way)
isn't this from this year
i remember the one you mean, with realistic tags and boxes.
right!?
and when i search I see so many very new blog posts talking about boxes and labels/tags - I am pretty sure they are all inspired by this same blog post
the Ur-box post if you want
i'm a little surprised I can't find it, i guess probably it was shut down.
honestly it was so old I wouldn't be surprised if it was hosted by geocities 😂
has anyone made a PR to cpython before who could share any tips on how i should structure it, any precautions or good-to-know things?
Can you share more information about this PR? But, in general, writing a fix and running the test suite are enough
I've read that thread and I suggest you firstly open an issue: https://github.com/python/cpython/issues/new/choose
As far as I understand, that's not a bug, this looks like a feature request, so choose the "Feature or enhancement" option
After opening an issue, you can wait for feedback from the Tkinter maintainers, or you can start writing a pull request (but be prepared that it may be closed).
yeahhh i can see where this is going, ill refrain, thank you!
is there any chance of ever linking the functions in the documentation to their typeshed signatures? i find myself checking the stubs for what i'm supposed to do rather often
one idea is to publish the type signatures on the doc pages
but they get messy
it is? i see the type signatures on most docs for third party libraries nowadays, i wouldn't consider them messy
but what do the Python typeshed annotations look like?
fair, the pyi files are pretty bad. but i would hope that it wouldn't be that bad with docstrings and whatnot separating each function
i don't think this looks that bad, does it?
Unless care is going to be taken to fix the cases where the typeshed has a lie in it done to assist type checkers, this will lead to user confusion. an example of this is the actual return type of asyncio.gather (which matches the docs) vs the stub (which lies)
https://github.com/python/typeshed/issues/2652#issuecomment-1847411223
This is only one such example, but there are a lot more than this in the typeshed.
that is a very good point
There really aren't a lot of lies, though there are a lot of cases where the typeshed types are very complex and not useful as documentation.
I'm still only a few modules into re-typing the standard library as I go through little by little, and am over 500 so far in my count. I understand that many people view the scope of some of the inaccuracies not to rise to the level of a lie, but when it comes to static analysis in a gradual type system, anything that's intentionally inaccurate rather than left gradual and imprecise (imprecise does not necessitate inaccurate) until it can be typed accurately is a lie.
What are some concrete examples that you view as lies?
I'm not sure there's a good way to have this discussion using concrete examples that isn't going to be read by some as rehashing old arguments based on feedback I've received, and that's not really the intent I have here with it, the intent here is to keep the documentation useful to human readers.
There are plenty of easier arguments to make, such as the fact that many functions in the standard library can't be typed currently, so the lack of typing leads to inconsistency in the documentation on when type information is included, or there will be a lot of "noise" in the documentation for things where the type isn't useful information.
There's also the case where prose better explains the purpose behind most overloads, while including the type as a result, than the types of those overloads would explain the intended use.
If you'd like to discuss the inaccuracies vs imprecision philosophy more, I'd be willing to pick that discussion back up at another time in #type-hinting , but not tonight.
out of curiosity, why do we keep stdlib stubs on typeshed rather than cpython anyway?
They are not tied to CPython in particular. They should be the same for e.g. pypy
And the interpreter doesn't really use them
Same reasoning as to why pip is not in cpython
Ok that's moderately cursed
Well, I mean more like: it's not coupled to cpython. It may be vendored
this also applies to all of the pure-Python parts of the stdlib, which is supposed to be usable by other Python implementations like PyPy and RustPython with only minimal patches required
History, politics, readability and pragmatics
can i put my discord server
No, ads aren't allowed anywhere on this server.
<@&831776746206265384> stray bot
Hello,World
I am one month into this python journey, I have done countless quizzes and training on the site called codewars
but I face a problem: I always forget the little functions such as append and enumerate and so on
so is there a place where its all sorted in a good way
a place where I can figure out which function works with which objects and so on
just throw python cheatsheet on Google and you'll find some
that is not really on topic for this channel though, see the channel's description and pinned messages.
For that kind of generic question, just use #python-discussion or see #❓|how-to-get-help for instructions on using #1035199133436354600
srry didn't see those places,I will be more mindful next time
Is there any documentation on the marshal format python uses? It's kind of hard to understand it from the c code
I'm specifically interested in the way references work in the format
I imagine pickle works in kinda the same way, and pickle is well documented
Right my bad, if anyone would be so kind to explain how they implemented the references. Like when they decide to use references while writing mostly, reading is quite obvious how to parse that
I'll take a look thanks
<@&831776746206265384>
For format version lower than 3, recursive lists, sets and dictionaries cannot be written (see below).
they mention recursive sets but I can't figure out how to create one?
set is not hashable so you can't add the set to itself
it should be possible to create recursive frozenset, but that is a task for #esoteric-python
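For reference, a tiny sketch of why the plain-`set` case can't be built directly, next to the list case, which is trivial. The exact error wording varies between Python versions, so it isn't reproduced here.

```python
s = set()
try:
    s.add(s)                # a set is unhashable, so it can't be its own element
except TypeError as exc:
    error = str(exc)

lst = []
lst.append(lst)             # lists don't need hashable elements
list_is_recursive = lst[0] is lst

print(error, list_is_recursive)
```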
#esoteric-python message reusing my trick for recursive tuples
<@&831776746206265384> ^
is there anyone who is expert in python and python frameworks?
Do you have a question?
not only question.
Ask your question in #1035199133436354600
The experts help in #1035199133436354600
!rule ad paid
6. Do not post unapproved advertising.
9. Do not offer or ask for paid work of any kind.
@simple narwhal_saint i have 5+ years of experience in python and its frameworks. If you have a project let me know
!warn @zealous oak A helper just reminded that offering paid work is not allowed here.
:incoming_envelope: :ok_hand: applied warning to @zealous oak.
Is there any existing explanation for the decisions making the rules?
which decisions about which rules?
Does anybody here use a typing_prelude kind of thing in their codebase, or does the mere idea make people froth at the mouth?
The python typing stuff has just moved so fast and deprecated so many things in the last few years, we have so much slightly older code, that remembering what best practice is, is annoying
e.g. not using Dict, not using like 80% of what's in typing in favor of collections.abc, but some things from typing are still okay, etc etc.
it seems to me like a trivial typing_prelude.py file in our codebase would do more good than harm
from collections.abc import Sequence, MutableSequence, Callable, ...
etc. The file would basically be a couple of lines long and just consist of the dozen or so most common type aliases, and specifically exclude things that we aren't supposed to use anymore (Tuple, Dict, Optional, etc)
i deal with this by running pyupgrade when I drop a Python version.
ooh
but pyupgrade probably will not change List to Sequence automatically, will it?
no, it wouldn't know when that was OK.
It would have to be pretty clever and see if the function body exclusively uses Sequence API
yeah
that's not an upgrade sort of thing is it?
still a useful tip
That's actually changing the meaning of the type
@faint river it's not an upgrade per se, upgrade of my team's knowledge 🙂
I mean when python typing first started getting big, and even to this day
oh, you need pip install upgrade_team_py_knowledge 😄
a lot of people just say "I want to pass a list to this function. So I'll do List[str]"
it's obvious and logical and also wrong 90% of the time
link? 😛
Sorry to interrupt, I'm new to type hinting. Why is this usually wrong?
it's usually wrong because most functions should not be mutating their argument
it's also limiting in that the same function could work on other kinds of sequences.
And it's pretty valuable for callers of a function to get a promise that arguments passed will not be mutated
yeah, that as well.
The distribution should probably be like: 90% Sequence, 9% MutableSequence, 1% actual List/list
Ah, so Sequence implies it's not mutated?
I see, thank you!
that's not obvious of course, until you know that MutableSequence also exists 🙂
just by virtue of not having any methods that mutate.
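A minimal sketch of the distinction, with made-up function names. It assumes Python 3.9+ so that the `collections.abc` classes can be subscripted.

```python
from collections.abc import MutableSequence, Sequence


def total_length(words: Sequence[str]) -> int:
    """Read-only use: any sequence works (list, tuple, ...)."""
    return sum(len(w) for w in words)


def strip_in_place(words: MutableSequence[str]) -> None:
    """Mutates its argument, and the annotation says so."""
    for i, w in enumerate(words):
        words[i] = w.strip()


names = [" ada ", "bob "]
strip_in_place(names)                   # fine: a list is a MutableSequence
print(total_length(("ab", "c")))        # fine: a tuple is a Sequence
print(names)
```

A type checker should reject `words[i] = ...` inside `total_length`, since `Sequence` has no `__setitem__`, and should reject passing a tuple to `strip_in_place`.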
I've had a poor experience with tooling when adding builtins, for example with a _ translation function
unless you have a defaultdict, of course 😛
then Mapping will happily do mutations...
Sorry, I'm not sure I completely understood?
Some linting didn't like new non-standard names in builtins, with no options to configure new ones, I assume that's what you meant with the prelude?
no, definitely not looking to mess with the builtins
I just meant a small file, that was explicitly designed, so that in other files, we write
from typing_prelude import *
where typing_prelude.py itself is a file with like... 2 lines of code
just importing the most important and frequently used types (sequence, callable, etc), from the correct location
the idea being for my team, that every python file just has that line at the top, and 99% of the time you shouldn't need to import any other "general purpose" types
I think Numerlor is talking about https://docs.python.org/3/library/gettext.html#gettext.install which is about yikes/10 on my yikes scale
yeah no
that's definitely not worth it for me. way too magical.
I'm fine with including this one line at the top of all my python files - I know star imports are discouraged generally, and I realize that not every type will be used in every file.
but it still seems worth it to me personally
like, it basically encodes the correct generic type annotations, and the correct place to import them from - in code, one time.
😎 you can also fix builtins in other ways
```py
import builtins
import sys
sys.get_id = builtins.id
del builtins.id
```
instead of having everyone try to remember it, which is not easy when you are constantly confronted with code that does it differently
lol
goddam it
tbf, this is what it should have been from the beginning
exactly
can we also move sum to the math package while we're at it?
maybe even min and max
list/set/dict/len to collections; iter and next to itertools; int/str etc. to types, etc. >:)
why the *attr methods in sys
someone in #python-discussion suggested to move them to inspect
still doesn't make sense imo
sys is like for "internal" interface and inspect is for code inspection
this would break backward compatibility
inspect is for object inspection as well
then hasattr() and getattr()
but not setattr() and delattr()
yeah...
Could just have a deprecation cycle
another 15 years of pain...
I meant it as a joke, but yeah probably should keep this channel more serious - sorry, it derailed quite a bit because of that
I think that it wouldn't get the level of support necessary to deprecate them. IMO, it's okay to have sum/min/max as builtins
sum keeps getting fancy optimizations
yeah that's true
although there's both math.fsum() and sum()
but take a look at the thread about the ~bool deprecation. There are hundreds of messages here, and it's just a method of the bool type. Just imagine how long the thread would be about sum/min/max deprecations
I guess Thomas talking about this case: https://github.com/python/cpython/issues/111933
sum is not good for summing floating point numbers, math.fsum should be used instead
so its only use is to add integers together
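For comparison, a short sketch. What `sum()` gives for floats depends on the Python version (as far as I know, 3.12 and later use compensated summation for floats, which narrows the gap), so only the `fsum` result is treated as fixed here.

```python
import math

values = [0.1] * 10

exact = math.fsum(values)   # fsum tracks the low-order bits a naive loop loses
plain = sum(values)         # may or may not equal 1.0, depending on the version

print(exact, plain)
```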
I cry whenever I see sum(lists, [])
Wait how does that work?
Does sum flatten for you?
sum([a, b, c], d) == d + a + b + c, so sum([list1, list2, list3], []) == list1 + list2 + list3
[] + ...
very important detail
this thread is so funny
the problem is that it's O(n^2) where n is the number of lists (assuming they're all the same size)
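A sketch of the usual linear-time alternatives next to the quadratic one (variable names are made up):

```python
from itertools import chain

lists = [[1, 2], [3], [4, 5, 6]]

# Quadratic: every + builds a brand-new list holding everything seen so far.
flat_quadratic = sum(lists, [])

# Linear: walk each element exactly once.
flat_linear = list(chain.from_iterable(lists))

# Also linear: grow one list in place.
flat_extend = []
for sub in lists:
    flat_extend.extend(sub)

print(flat_quadratic, flat_linear, flat_extend)
```

All three produce the same list; the difference only shows up as the number of sublists grows.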
there's literally one core dev who created multiple alt accounts to respond
wat
no
set(map(lst.extend, lists)) is the fix
OP can't block members of python organization
If you are talking about Sergey Kirpichev, then he is not a member of python org at that moment (now he's a triager!)
still there's pochmann, pochmann3, and pochmann4
he's not a triager or core dev
I didn't know about that issue
well..
Because there was an invalid link to the research
No because the proposed change was rejected
issue is basically getting 2 birds with 1 stone
1 with the invalid link and the other (more discussed) sum() algorithm change
the theme of the issue was changed from <incorrect optimization> to <invalid link> so I guess it's ok to be closed as completed.
Congrats on purple Eclips4
done
We did it at the same time
your version looks more accurate for me!
Oh I didn’t know you could do that
You guys are sneaky
You should get name tags
sneaky triagers
@spark verge what’s your day job?
Just being nosy, I keep seeing you pop up but I don’t actually know what you do
Perhaps triagers will gain some role on this server
SWE
nice, what flavor?
Sherbert
General?
or the term "core team" would be changed, and it would start including triagers. This is the right thing to do, IMO
I kinda just stumbled into the middle of the conversation.
I’m in favor of shrinking the builtins, though I don’t personally think it’s a huge deal.
However, I am a huge proponent of breaking up the docs page for builtins.
That thing is huge.
So you're considering splitting the doc page for builtins?
Yeah
Not exactly sure what that would look like
But maybe like one page for strings, one page for numbers
Would give them more room to go into detail and examples for their specific types
And the builtins page is not even in the top 10 longest doc pages...
Wait really?
yeah,
admin@Admins-MacBook-Air ~/p/c/Doc (main)> find . -name "*.rst" -type f | xargs wc -l | sort -n -r | head -n 10
302342 total
5790 ./library/os.rst
5729 ./library/stdtypes.rst
4154 ./howto/logging-cookbook.rst
4018 ./library/typing.rst
3691 ./reference/datamodel.rst
3345 ./whatsnew/2.6.rst
3068 ./library/multiprocessing.rst
2966 ./library/unittest.mock.rst
2869 ./c-api/typeobj.rst
builtins page (aka functions.rst) has 2283 lines
“Admins MacBook”
You just made a sysadmin cry
do you prefer sudo Admin?
I'm too lazy to change it
But yeah, in general, I think smaller pages make for more digestible pages and easier sharing. I think it’s easy to get lost in never ending paragraphs, especially for new readers.
are you a fan of numpy docs?
Uh… I can’t picture them off the top of my head. Let me look.
I do know pandas docs. I love the effort put into examples, though I do concede the giant stack of page-per-function does also get me lost sometimes.
I hear that they use the principle "one page per method", which doesn't sound right to me
An index with all the functions/classes on the left side of the page would be a huge improvement. See for example https://doc.rust-lang.org/std/vec/struct.Vec.html
but yeah, one function/method per page might be too much
That's essentially what Pandas has
But -- just look at how much detail there is about how exactly each argument behaves
The reason I like the one-big-page structure is that you can Ctrl+F over an entire module
How many arguments does that function have.....
yes
I personally view Rust's autogenerated docs website as the gold standard for reference documentation. Good modules/classes (like Vec) also include a good "module docstring" with examples and high-level description
I know that you are a rust fan.
I haven't seen that before
There are so many little QoL things here
Not sure about Rust fan, but definitely a rustdoc fan 🙂
no thanks, too verbose.
When you need to know why some little implementation detail is causing your code to explode, the verbosity will be here for you
then consider a default and verbose flag
a large number of people using the docs i'd think are novice programmers. at the very least the default docs should be short and welcoming, not that scary monstrosity you showed there (and i'm not exactly a novice programmer)
well, this is the reference documentation. It's supposed to tell you all the details
if you are just starting with this library, you'll most likely read "getting started" (aka the tutorial) and the "user guide" (aka how-tos) first
Having everything in the module listed in an index on the left side would also make it possible to Ctrl + F, even if the index is collapsed I believe
Sometimes I want to find a particular phrase in the docs when I don't remember the function name
If the function is hairy and complicated, the reference documentation will naturally be hairy and complicated. See for example subprocess.Popen (https://docs.python.org/3/library/subprocess.html#subprocess.Popen)
Ah, I haven't considered that. Yeah, then I'm not sure how to solve that outside of a big page. Or maybe a search bar that prioritizes the module you're looking at? Not sure if that's a thing
well, if you put everything on the page, you can take advantage of the browser's built-in search 🙂
also, switching between items on a single page is faster because it doesn't require any new downloads
(not everyone is reading docs on a broadband connection)
I find the pandas reference okay, at least it makes each parameter stand out, unlike the stdlib docs where it's kinda intermingled
I mean if the rules are like PEPs, then in my understanding there should be a document explaining why the board passes the PEPs. I'm interested in that document
what rules are you talking about?
- Do not offer or ask for paid work of any kind
I mean why not create a channel or thread for people who need that?
this community's rules aren't like PEPs, and aren't decided by the PSF board or the Python Steering Council. Python Discord isn't part of the PSF, and our rules are decided by our moderators and admins. The rationale for that particular rule is that we don't want people getting scammed here, and we don't want to be in a position where we're forced to vet job offers, or where people think that the job offer is more likely to be legitimate and less likely to be a scam just because it's here. And Discord is just fundamentally a crappy platform for advertisements. Something forum-like works a lot better than something chatroom-like for advertising jobs.
in any event, that's off topic for this channel. If you have more questions about that, they'd be much more appropriate in #community-meta
the memory usage for this list -> ['',''] is not 0 right ?!
no. the memory usage for [] is not zero.
why would it be zero? It has something in it.
mb
i assumed empty string , won't take up any space. because there is no actual data , perhaps metadata ?!
every value in Python takes memory, even an empty string, even an empty list.
you can use sys.getsizeof to check the size of an object in memory. note it doesn't tell you about objects referenced from the objects (such as the elements of the list)
and yes, it's because of metadata: the object needs to know its type, and its reference count for example.
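A small sketch with `sys.getsizeof`. The exact byte counts depend on the CPython build, so the code only compares them. The "deep" total is a hand-rolled sum for illustration, not something `getsizeof` does for you.

```python
import sys

empty_str_size = sys.getsizeof("")           # non-zero: type pointer, refcount, ...
empty_list_size = sys.getsizeof([])          # non-zero for the same reason
two_empty_size = sys.getsizeof(["", ""])     # adds room for two element pointers

big = ["x" * 10_000, "y" * 10_000]
shallow = sys.getsizeof(big)                 # the list object only
deep = shallow + sum(sys.getsizeof(s) for s in big)  # list plus its strings

print(empty_str_size, empty_list_size, two_empty_size, shallow, deep)
```

Note that `shallow` stays small even though the list refers to two 10,000-character strings, which is the "doesn't follow references" caveat.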
made my first contrib to cpython, to asyncio repl, please lmk if you do or do not agree with my thinking!
just wanna make sure everything's good before a final review (ig by łukasz) lands there
https://github.com/python/cpython/issues/124594
what's your opinion on this? https://github.com/python/cpython/issues/124852
closed it
The github repl is closed
I don't know if this is the right place for this question, but do people have good resources for understanding python's internals around object creation, method calling etc? Im trying to make design decisions and I just don't know enough about python's internals to make informed decisions.
moving this to the correct Chat/Help Channel
This is good for method calling https://docs.python.org/3/howto/descriptor.html
Do you have an example design you need to make?
One easy example to explain.
I'm trying to decide whether I should make some of my objects immutable and only provide pure functions for them.
Linear Algebra vectors is one example. Should Vec3.cross(Vec3) return a new Vec3 or modify the original. Currently they are immutable, but its causing perf issues with how many get created repeatedly for small maths operations.
You might like slotted classes or NamedTuples
I'm currently using NamedTuples and they have been pretty good, but there is still some overhead
Id also just like to know more on how python actually works
i've created a tracker for my ideas for cpython contributions. lmk what you think! https://github.com/bswck/ideas-for-cpython
any core dev to comment on this? https://discuss.python.org/t/new-function-repr-args-in-pprint-or-reprlib/65193
During writing representation string builders, an often practice is to represent every argument from a known tuple-like structure and every keyword argument from a known dict-like structure using a variant of arg_repr = [] arg_repr.extend(map(repr, args)) arg_repr.extend(f"{k}={v!r}" for k, v in kwargs.items()) return f"{type(self).name}({'...
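The preview above is cut off, so the following is only a guess at the general shape of such a helper, not the function proposed in the thread. The name `format_call_repr` and its signature are hypothetical.

```python
def format_call_repr(obj, args=(), kwargs=None):
    """Hypothetical helper: build 'ClassName(arg, key=value)' for obj."""
    parts = [repr(a) for a in args]
    parts.extend(f"{k}={v!r}" for k, v in (kwargs or {}).items())
    return f"{type(obj).__name__}({', '.join(parts)})"


class Point:
    def __init__(self, x, y, *, label=None):
        self.x, self.y, self.label = x, y, label

    def __repr__(self):
        return format_call_repr(self, (self.x, self.y), {"label": self.label})


print(repr(Point(1, 2, label="a")))
```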
thanks in advance!
Re: https://peps.python.org/pep-0758/#rejected-ideas
It's currently invalid to do:
try:
    raise ValueError
except ((TypeError, ValueError), NameError):
    pass
But it's not a syntax error. Should it raise a SyntaxWarning similar to tuples not being callable?
Yeah that sounds reasonable. Do you think it's a common mistake? We could also warn if it's some other literal (e.g., except 1:, but not sure it's worth it as I don't know when people would write that
Unfortunately it's not as simple as only allowing names as you can do this:
Right, that's why I wrote "disallowing literals"
!e
try:
    raise ValueError("bad")
except ((TypeError, ValueError)[1], NameError) as e:
    err = e
print(err)
:white_check_mark: Your 3.12 eval job has completed with return code 0.
bad
You know that an int literal (or a list/dict/set display) will never evaluate to a valid exception class
except random.choice([ValueError, TypeError])
quite
i love the idea of writing except random.choice([ValueError, TypeError])
Will the pep allow this:
try:
    ...
except (ValueError, TypeError)[0], OSError:
    ...
I would expect yes, but I am not a PEP author and I haven't tried the reference implementation
The intuition should be that each member of the tuple must evaluate to an exception class
would
```py
try:
    raise ValueError
except *(ValueError, TypeError),:
    pass
```
be allowed?
I'm surprised you can't unpack into an exception clause
as in
```py
def foo():
    return *(ValueError, TypeError),
```
I don't think so. The spec section of the PEP says it's expressions; if you look that up in python.gram, it doesn't seem to allow *
I'd like to be able to do:
Excs = (trio.Cancelled, asyncio.CancelledError)
try:
    await foo()
except (*Excs):
    ...
    raise
you can just do except Excs:
Can you?
was about to say that
Weird
>>> try: int("x")
... except es: pass
...
yup you can
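Spelled out as a runnable sketch, with two stand-in exception classes replacing the trio/asyncio ones so it needs no third-party imports (all names here are made up):

```python
class CancelledA(Exception):
    pass


class CancelledB(Exception):
    pass


# A plain tuple of exception classes stored in a variable.
CANCEL_EXCS = (CancelledA, CancelledB)


def run(exc_type):
    try:
        raise exc_type("stop")
    except CANCEL_EXCS as exc:      # no unpacking needed
        return type(exc).__name__


print(run(CancelledA), run(CancelledB))
```

The `except` clause just evaluates an expression, so a name bound to a tuple works the same as a tuple written inline.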
So why aren't nested tuples allowed?
because there's a top-level tuple already
(not a core dev) I can confirm, I had written that exact code a couple of times in the past
nobody asked for it? nested tuples do work in isinstance()
i mean, nested tuples might actually make sense to support?
hm
not sure.
i imagine it could save from chaining/flattening in some cases, but those feel rare to ever happen
Yeah I think it makes some sense to support but it's probably very rare that people actually need it. Could be worth looking through CPython issues to see if anybody ever asked for it
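A sketch of the current behaviour as I understand it on CPython 3.x: `isinstance()` accepts nested tuples, while an `except` clause raises `TypeError` at match time. The `flatten` helper is hypothetical, just one way to bridge the two.

```python
nested = ((TypeError, ValueError), NameError)

works_in_isinstance = isinstance(ValueError("bad"), nested)


def flatten(excs):
    """Hypothetical helper: flatten nested tuples of exception classes."""
    flat = []
    for item in excs:
        if isinstance(item, tuple):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return tuple(flat)


try:
    try:
        raise ValueError("bad")
    except nested:                  # the inner tuple is not an exception class
        outcome = "caught"
except TypeError as exc:
    outcome = f"TypeError: {exc}"

try:
    raise ValueError("bad")
except flatten(nested):             # a flat tuple is accepted
    flattened_outcome = "caught"

print(works_in_isinstance, outcome, flattened_outcome)
```

Note the error only appears once an exception is actually being matched, which is why it isn't a syntax error.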
so i wouldn't ask for it by myself i think
i'd be very grateful if you commented that in the ticket
so that it gets more attention and maybe other ppl can see it
Do you have a link to the ticket?
During writing representation string builders, an often practice is to represent every argument from a known tuple-like structure and every keyword argument from a known dict-like structure using a variant of arg_repr = [] arg_repr.extend(map(repr, args)) arg_repr.extend(f"{k}={v!r}" for k, v in kwargs.items()) return f"{type(self).name}({'...
Oh sorry I didn't mean to hijack your convo
adon't aworry :)
have you considered format_args name?
yeah, and format_args_and_kwargs seems the most precise
args are often associated with positional-only arguments
at least that's what i often see
as in the asyncio impl
note this conflicts with the except* syntax. if you remove the last comma it's valid syntax today and creates an except*. (Sorry for the late post, I just thought of this)
fair warning: threads on the ideas section of DPO normally don't go anywhere
True, but this feels like the kind of thing that is more likely to go through. New syntax is extremely hard; a new library function with a clear use case has much more of a chance.
yup, but people's favorite thing to do is shoot down ideas with "why can't you put it on pypi"
@feral island what's your opinion on making typing._eval_type public?
(and others' too)
why would you want it? is typing.evaluate_forward_ref what you want?
if all libraries that do runtime annotation checking (such as pydantic or attrs) will migrate to it, then yes, thanks!
i think this would close some existing issues
will/are supposed to.
3.14 brings us a lot of tasty cookies.
yeah if a report comes in that there's some private functionality that a lot of runtime libraries need, I'd be open to adding more functions to typing in 3.14
what's the official date of death on python 3.8? PEP just says "October 2024". My working assumption is the same day Python 3.13 is released.
https://discuss.python.org/t/python-3-13-0rc2-3-12-6-3-11-10-3-10-15-3-9-20-and-3-8-20-are-now-available/63161/23 your assumption is correct
thanks a bunch, Jelle
I think I got everything? https://en.wikipedia.org/w/index.php?title=History_of_Python&diff=prev&oldid=1249236770
yes that looks right
thanks!!
Print ("WAKE UP CHAT")
I'm not sure if it's a bug or not, so before I file an issue, I'd like to ask about it here:
UserString from collections is an alternative sequence class to str.
it's primarily made for easier subclassing, and by all means resembles the "non-inplace", rooted-in-immutability nature of str in its implementation.
UserString objects don't collide with str objects: their hash is always the same as of their data, and they report equality to their analogous str objects:
from collections import UserString
assert UserString("foobar") == "foobar"
assert hash(UserString("foobar")) == hash("foobar")
assert type(next(it := iter({UserString("foobar"), "foobar"}))) is UserString
assert next(it, None) is None  # "foobar" discarded from the set: same hash and equal
which leads me to believe those should be interchangeable in most cases (I'm not considering things like scope dicts, obviously, as those are very implementation-specific and generally fragile to patch)
now...
from the match statement docs:
Like unpacking assignments, tuple and list patterns have exactly the same meaning and actually match arbitrary sequences. An important exception is that they don’t match iterators or strings.
and yes, for str objects that is true:
match "foobar":
    case [*letters]:
        print(letters)  # never matched
(for subclasses too, if i try match type("strsubc", (str,), {})("foobar") it works the same)
but for userstrings, well, you guessed it
match UserString("foobar"):
    case [*letters]:
        print(letters)  # matched
the question is: should we also specialcase for UserString if it happens that str and UserString are used interchangeably, since we already support arbitrary subclasses of str? (I believe this isn't a good idea; from my experience, extending special cases is most often a never ending story).
or maybe not create another special case (my preference) and document the difference?
or maybe document more precisely that str objects are specialcased, not just "strings" (assuming UserString instances are also strings... or are they?)?
I looked for any relevant issues in the issue tracker of cpython, didn't find anything. so this most likely is low priority and it's just my nerdy research that led me here.
AFAIR the User*-classes were created to allow subclassing of builtins before that was ordinarily possible. Is there any reason to use UserString and its siblings over directly subclassing the builtin these days?
The need for this class has been partially supplanted by the ability to subclass directly from str; however, this class can be easier to work with because the underlying string is accessible as an attribute.
@quick snow additionally you don't have to tweak __new__ if you want to change constructor signature i think
so that can be a small win
but yeah, I can't think of a case where I'd personally use UserString other than if I really needed that User*.data attribute
I don't have problems with overriding __new__, but it is sometimes hard to teach others that __new__ is a static method
But in that case you can literally just do
class MyString:
    def __init__(self, data):
        self.data = data
no, not really
if I wanted to use it interchangeably with str objects
anyway, I don't think it's relevant
we're not discussing whether UserString is useful or not
Okay, if you need it as an attribute and at the same time want to have it behave like a string.
I think it's somewhat relevant because I think User* is basically being kept around in order not to break existing code, but it makes no sense to adapt new features to it.
I think User* is basically being kept around in order not to break existing code
I'll agree with this if you point me to a deprecation notice.
ever-popular type aliases from typing (typing.List, typing.Dict etc.) were supercommon all over the codebases and they got their deprecation notice because since 3.9 one was able to use builtin classes to annotate the same things
so if UserString is no longer useful at all in favor of new ways to achieve the same goal, why isn't it deprecated?
^ this can be a good question, because maybe UserString should be deprecated
(I'm not trying to prove a point here, just trying to justify why I think you can't just skip over adapting existing functionality to the new one because you "think" it's no longer pragmatic to use it)
I'm pretty new to pattern matching in python specifically. That said I think that making the "special case" work for UserString would be less of a special case than how it currently works
It's extremely weird for *letters to only fail to match str at top level
If *letters doesn't match str then it should never match string, including if that's nested inside a field of a struct
Like, if a pattern P matches a type T, then given a one field class Foo, with that field of type T
Then I would expect Foo(P) to match
And the converse (if P doesn't match T, then Foo(P) shouldn't match)
I think
User* is basically being kept around in order not to break existing code, but it makes no sense to adapt new features to it.
They don't have as many use cases as they used to, but they definitely are not deprecated and still have uses for certain tasks. While the builtin classes can be subclassed, they are (deliberately) not designed for easy subclassing. For example, with dict, many methods that you would think would simply delegate to __getitem__ do not respect the __getitem__ method being overridden in a subclass:
>>> class LoggedAccessDict(dict):
...     def __getitem__(self, key):
...         print('accessing', key)
...         return super().__getitem__(key)
...
>>> x = LoggedAccessDict({1: 2})
>>> x[1]  # hooray, it works!
accessing 1
2
>>> x.get(1)  # wait, what?
2
>>> for key, value in x.items():  # ?
...     print(key, value)
...
1 2
So if you wanted to create a not-buggy version of LoggedAccessDict that printed out a message from any method that looked up a value of a key, you'd have to override a whole slew of methods in the subclass. Not so with UserDict, which is deliberately designed to be extremely ergonomic when it comes to subclassing semantics (at the cost of quite a big drop in performance over dict, since it's written in pure Python):
>>> from collections import UserDict
>>> class LoggedAccessDict2(UserDict):
...     def __getitem__(self, key):
...         print('accessing', key)
...         return super().__getitem__(key)
...
>>> y = LoggedAccessDict2({1: 2})
>>> y[1]
accessing 1
2
>>> y.get(1)
accessing 1
2
>>> for key, value in y.items():
...     print(key, value)
...
accessing 1
1 2
In general it's still kinda bug-prone to subclass a builtin container. I'd usually recommend using one of the collections.abc ABCs or one of the collections.User* classes as a base class instead, unless it's something that's performance-critical.
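As an illustration of the ABC route, here is a sketch of the same logging idea on top of collections.abc.MutableMapping. The class name and the accessed list are invented for the example; the mixin methods (get, items, and so on) are all built on the five methods defined here, so they go through __getitem__.

```python
from collections.abc import MutableMapping


class LoggedAccessMapping(MutableMapping):
    """Mapping that records every key looked up through __getitem__."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        self.accessed = []  # keys seen by __getitem__, in order

    def __getitem__(self, key):
        self.accessed.append(key)
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


m = LoggedAccessMapping({1: 2})
m[1]             # direct lookup
m.get(1)         # mixin method, delegates to __getitem__
list(m.items())  # ItemsView also looks values up via __getitem__
print(m.accessed)
```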
can you expand more on the performance critical part? Thank you 
UserDict forwards on most calls to its underlying data storage, which is a dict. But UserDict is written in pure Python so that forwarding can be kinda expensive; it adds an intermediate Python call to simple operations that you'd expect to be fast, such as looking up or setting a key in the "dictionary". If there really is just one specific behaviour that you want to change about dict, then you may be better off creating a subclass and just overriding that one specific thing rather than going for UserDict where literally every operation on the mapping becomes a slow(er) pure-Python call.
But as I said above: this is bug-prone! Think carefully before doing it! Usually collections.abc.(Mutable)Mapping or UserDict is less buggy as a base class!
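If you want to see the overhead on your own machine, a quick timeit sketch along these lines should do. The absolute numbers depend on the interpreter and hardware, so none are claimed here; time_lookups is a made-up helper.

```python
import timeit
from collections import UserDict


def time_lookups(mapping, number=100_000):
    """Time `number` repetitions of mapping[1] and return the total seconds."""
    return timeit.timeit(lambda: mapping[1], number=number)


plain = {1: 2}
wrapped = UserDict({1: 2})  # same data, but every lookup goes through Python code

plain_seconds = time_lookups(plain)
wrapped_seconds = time_lookups(wrapped)

print(f"dict:     {plain_seconds:.4f}s")
print(f"UserDict: {wrapped_seconds:.4f}s")
```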
Going back to the original question, though... I would not treat str and UserString the same in pattern matching 😆
UserString is a subclass-friendly pure-Python class that wraps a str. It happens to be in the standard library, but you could just as easily write something very similar to it in a third-party package. I don't think there's any reason to privilege stdlib Python classes over other Python classes, or otherwise treat them differently in terms of Python's core semantics. str is a fundamental builtin data type; UserString is not.
src?
that was closed though, not accepted
okayy
no that's still open
yeah but hugo, the release manager, seems open to it
Don't you think though it's more a question of making the special treatment of str more consistent?
Right now str gets special treatment, but only at top level in a pattern
what makes you think that?
```
>>> match "x":
...     case [*y]: print(y)
...
>>> match "x",:
...     case ([*y],): print(y)
...
>>> match ["x"],:
...     case ([*y],): print(y)
...
['x']
```
match-case being inconsistent with structural assignment is a bit confusing really
but i guess that's part of new features
Because *letters is not matching str at top level, but does match it when it's nested inside as a field of a class
At least, that's how I understood the origin post
nope
```
>>> from dataclasses import dataclass
>>> @dataclass
... class A:
...     a: object
...
>>> match A("x"):
...     case A(a=[*y]): print(y)
...
```
str never matches a sequence pattern
Then why is it matching nested inside UserString then, if it has no special treatment 🤔
in this example here, you're not actually matching on the inner str that the UserString wraps. You're matching on the UserString itself. A UserString is not an instance of str, but it is a sequence. Pattern-matching allows you to match on a UserString like a sequence, but the special case that only applies to instances of str does not apply to instances of UserString:
match UserString("foobar"):
    case [*letters]:
        print(letters)  # matched
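The type relationships behind that explanation can be checked directly. A small sketch (the variable names are just for the example):

```python
from collections import UserString
from collections.abc import Sequence

wrapped = UserString("foobar")

# UserString wraps a str in .data but does not inherit from str.
is_str = isinstance(wrapped, str)
# It does subclass the Sequence ABC, which is what sequence patterns look for.
is_sequence = isinstance(wrapped, Sequence)
# A plain str is also a Sequence; match excludes it by an explicit rule.
str_is_sequence = isinstance("foobar", Sequence)

print(is_str, is_sequence, str_is_sequence, type(wrapped.data).__name__)
```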
the "special case" is more of the pattern only being limited to sequences
yup
which i don't think is "special" atp?
sorry about that, I think I actually got confused between the match and case 😅
current behavior seems fully reasonable
Merged 20 hours ago
i didn't see the linked PR, oops
!d type.__static_attributes__
documentation is not very clear, imo
A tuple containing names of attributes of this class which are assigned through self.X from any function in its body.
hm
does it mean, that here bar will be in the static attrs tuple?
```py
class X:
    def foo(self):
        self = 42
        self.bar = 69
```
```py
class X:
    def foo(x):
        self = 42
        self.bar = 69
```
what about this?
```py
class X:
    def foo(x):
        self.bar = 69
```
and this?
It does appear to look literally for the name self.
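A way to poke at this yourself, as a sketch: __static_attributes__ only exists on 3.13+, so the helper below falls back to None on older versions. The class names are invented, and no output is claimed here since it depends on the interpreter version.

```python
class Plain:
    def foo(self):
        self.bar = 69  # ordinary assignment through a parameter named self


class Rebound:
    def foo(self):
        self = 42
        self.bar = 69  # still spelled `self.bar`, even though self was rebound


class OtherName:
    def foo(x):
        x.bar = 69  # first parameter is not called self


def static_attrs(cls):
    """Return cls.__static_attributes__, or None where the attribute is missing."""
    return getattr(cls, "__static_attributes__", None)


for cls in (Plain, Rebound, OtherName):
    print(cls.__name__, static_attrs(cls))
```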
Interesting. What's the use case for __static_attributes__?
I believe it's used in some internal optimizations and it was decided to also expose it publicly
3.13 now has a "free-threaded" beta for the gilectomy via a slightly different binary. this mode supports a new option via either environment variable (PYTHON_GIL=1) or cli flag (-X gil=1) to turn the GIL back on. how does free-threaded with the GIL enabled by either of those differ from regular GIL-mandatory cpython?
A lot of the free-threading changes are at build time. The free-threaded build therefore has a lot of low-level changes that support thread-safety, even when the GIL is still used.
For instance, this PR we're currently working on: https://github.com/python/cpython/pull/124993. It adds a bunch of locking logic that is used only in the free-threading build. The locks will be used in the free-threading build, regardless of whether the GIL is currently on.
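The two things being distinguished here (how the interpreter was built versus whether the GIL is on right now) can be inspected separately. A sketch, assuming sys._is_gil_enabled() is unavailable before 3.13, in which case the GIL is taken to be on; describe_gil is a made-up name.

```python
import sys
import sysconfig


def describe_gil():
    """Return (free_threaded_build, gil_enabled_now) for the running interpreter."""
    # Build-time property: set for free-threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Runtime property: only exposed on 3.13+; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled_now = check() if check is not None else True
    return free_threaded_build, gil_enabled_now


print(describe_gil())
```

On a free-threaded build started with PYTHON_GIL=1 or -X gil=1, the first value would stay true while the second flips, which is the combination the question is about.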
So a while ago I made a contribution to cpython (https://github.com/python/cpython/pull/113790)
I was looking through the release notes and realised that
a) the feature isn't listed (not so important, I get that it's small, but I guess it would have been nice for people who do read at least a decent portion of the release notes to see mention of it, since it's a new feature that was added in 3.13),
but more importantly...
b) The documentation for this feature in ctypes (https://docs.python.org/3/library/ctypes.html#ctypes.Structure._align_) doesn't indicate that it was introduced in Python 3.13. I can imagine that if someone were to use this functionality in an earlier version without realising, there may be some confusion.
I don't know if this is because of an error on my part wrt the documentation or something else, but figured I'd raise it because I guess it should at least indicate that the attribute only does something in 3.13...
The docs definitely need a .. versionadded:: 3.13. Feel free to submit a PR adding that, and feel free also to add it to the What's New for 3.13 in a new PR. Looks like the news updates were just missed during review.
Ok thanks! Would you happen to have an example PR I can base it off so that I can get it right and I don't waste anyone's time with extra back and forth of review?
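To show why the version note matters, here is a sketch of what using _align_ looks like. On 3.13+ it should force the structure's alignment; on older versions it is just an unused class attribute, so the alignment stays that of the single byte field. The Aligned class is invented for the example.

```python
import ctypes
import sys


class Aligned(ctypes.Structure):
    # Honoured on Python 3.13+; silently ignored by earlier versions.
    _align_ = 16
    _fields_ = [("value", ctypes.c_uint8)]


print(sys.version_info[:2], ctypes.alignment(Aligned), ctypes.sizeof(Aligned))
```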
I'd just look at the git log for Doc/whatsnew/3.13.rst to find some examples, and go off the existing ones. You can add to the section https://docs.python.org/3/whatsnew/3.13.html#ctypes
Feel free to request my review (JelleZijlstra) on your PR
ok great thanks! I'll get to that after work! 😄
And remember, PR to main, and we'll backport it to 3.13
Thanks! I was actually just about to create a branch and was wondering which to branch off from!
I don't think I'm able to assign anyone as a reviewer and not sure if people appreciate being tagged, but here is the PR: https://github.com/python/cpython/pull/125087
Looks like it already got merged while I was asleep. I made the 3.13 backport though: https://github.com/python/cpython/pull/125113
(cherry picked from commit 5967dd8)
Issue: Add ability to force alignment of ctypes.Structure #112433
📚 Documentation preview 📚: https://cpython-previews--125113.org.readthedocs.build/
For anyone who is using Python 3.13 on Linux, and is attempting to use the free-threaded mode: Are you building from source manually, or is there another mechanism for obtaining the free-threaded build on your particular distribution? Also, does pyenv have a convenient way to set up a free-threaded build?
you can use the deadsnakes PPA on Ubuntu, and Fedora has a python3.13-freethreading:
https://dev.to/hugovk/help-us-test-free-threaded-python-without-the-gil-1hgf
very cool, thank you!
and looks like there's a 3.13t-dev for pyenv
very excellent, thanks once more
is there a way to modify the colours in the 3.13 repl via a config file or something?
I suppose it will work from any common way of configuring the interpreter if you do something like import _colorize; _colorize.ANSIColors.BOLD_MAGENTA = _colorize.ANSIColors.RED to change prompt color to red, as it works from within a running REPL.
For traceback colors, you'd have to monkey patch ANSIColors.BOLD_RED, ANSIColors.RED and ANSIColors.MAGENTA too.
```
PS E:\co\cpython\PCbuild> .\amd64\python.exe -V
Python 3.13.0
PS E:\co\cpython\PCbuild> .\amd64\python.exe
Fatal Python error: _PyImport_InitCore: failed to initialize importlib
Python runtime state: preinitialized
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 965, in <module>
  File "<frozen importlib._bootstrap>", line 997, in BuiltinImporter
  File "<frozen importlib._bootstrap>", line 506, in _requires_builtin
  File "<frozen importlib._bootstrap>", line 45, in _wrap
AttributeError: 'str' object has no attribute '__dict__'
PS E:\co\cpython\PCbuild>
```
errors: https://basedbin.fly.dev/p/vpN6X9.txt
build log: https://basedbin.fly.dev/p/8jVu9X.txt
I am on `60403a5` branch 3.13
ping me on reply
@willow pewter Do you have any local changes to code that could cause these errors?
nope
I have even tried git reset --hard
I'll try a build as soon as I'm done updating a PR. Can you share the build command and existing python version you used?
This looks like there's some out of date data, such as old pyc files; try doing make distclean.
I used version 3.13.0 which is installed globally on path
and I just used Visual studio to build it
Release x64
I am on Windows, there's no make; I will see how to clean on Windows
(I guess make distclean is a Unix command; not sure what the equivalent on Windows is, but maybe something like git clean to remove all untracked files)
there is pcbuild/clean.bat
done cleaning
let me try to build
https://basedbin.fly.dev/p/liuR9Z.txt build log
doesn't seem to work
I changed the global python to 3.12.7
and then I started getting can't open file 'python313.lib' then I switched back to 3.13.0
and now these are the errors I get
regarding PEP 760: No More Bare Excepts, it might be worth to mention just changing the behaviour of except: from except BaseException: to except Exception: as a rejected idea? assuming it was considered
there should be a thread to suggest this in
I think this is good
hi @rose schooner
same
I think this is bad TBH. I think it's bad to use except:, but breaking backwards compatibility just because "this is bad practice" seems not worth the harm.
not a representative but fellow member
3 years
3 years to change
So what? Code should continue working without changes, if at all possible.
all it requires is ctrl f and edit few excepts
It doesn't matter how much effort it is. There is lots of code out there that isn't maintained or worked on anymore.
also I don't think anyone is running ancient code on latest python after 3 years
(I'm also allergic to patronizing, which is what this feels like)
maybe a new project popped out of it if people still use it
there's also the possibility of all the typing deprecated aliases being removed next year
you'd be surprised...
every once in a while someone trying to run a 3.6~3.8 unmaintained project show up, and unless they are somewhat experienced, odds are they'll just download the latest python version
A tool will be provided to automatically update code to replace bare except: with except BaseException:.
oh so like 2to3
sounds like it
this is like the latest addition to __future__ too
anyone
Also, Barry Warsaw raises a good point in the discussion: This usage looks fine to him (and me):
try:
    ...
except:
    rollback_thing()
    raise
i.e., bare except with unconditional bare raise at the end
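A runnable sketch of that cleanup pattern, with invented names (run_with_rollback and the callbacks): the bare except runs the rollback for anything, including KeyboardInterrupt, and the bare raise lets the original exception keep propagating.

```python
def run_with_rollback(action, rollback):
    """Run action(); on any exception, call rollback() and re-raise unchanged."""
    try:
        return action()
    except:  # deliberate bare except: nothing is swallowed, see the raise below
        rollback()
        raise


events = []


def interrupted():
    raise KeyboardInterrupt


try:
    run_with_rollback(interrupted, lambda: events.append("rolled back"))
except KeyboardInterrupt:
    events.append("propagated")

print(events)
```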
maybe we can take that as an exception?
it's addressed in the PEP
¯_(ツ)_/¯
Yeah, I disagree..
it's just very little verbosity in code
..figure out if a program can raise
or not
but yes that is a good point
What's the difference between that and
except Exception as e:
    print(e)
BaseException covers more exceptions than just Exception
it catches SystemExit, KeyboardInterrupt, etc.
such as KeyboardInterrupt
sys.exit()'s error version
exits the program with the return code as the argument (optional and defaults to 0)
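A small sketch of the hierarchy difference: except Exception lets KeyboardInterrupt and SystemExit through, while except BaseException (which is what a bare except amounts to) catches them. The classify helper is made up for the demonstration.

```python
def classify(exc_type):
    """Raise exc_type() and report which handler ends up catching it."""
    try:
        raise exc_type()
    except Exception:
        return "Exception"
    except BaseException:
        return "BaseException only"


for exc_type in (ValueError, KeyboardInterrupt, SystemExit, GeneratorExit):
    print(exc_type.__name__, "->", classify(exc_type))
```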
If u exit, will it even show the exception code?
i don't think so
!pep 760
You can catch the exit though and stop it
os._exit(0) should be pretty much unstoppable
Depends on how the program is invoked. It's used as the exit code; in a Unix shell for example you can see the exit code by typing echo $? after the command finishes.
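Both halves of that can be seen from Python itself. A sketch: catching SystemExit in-process exposes the code on the exception, and running a child interpreter shows the same number coming back as the process exit status (what echo $? would print in a shell). The exit code 3 is arbitrary.

```python
import subprocess
import sys

# In-process: sys.exit() just raises SystemExit, which can be caught.
try:
    sys.exit(3)
except SystemExit as exc:
    caught_code = exc.code

# Out-of-process: an uncaught SystemExit becomes the child's exit status.
child = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

print(caught_code, child.returncode)
```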
Was withdrawn
that was quick
Thought the same
Is this call to printf even valid? I thought %d requires an int, which is probably 4 bytes on your platform
if you know how many bits you want, you should use an appropriately sized integer (like uint64_t), not int/long long int
int is defined in the C standard as having at least 16 bits, and long long int is at least 64 bits, but the exact size may vary
actually I only know basic C and have never used it before in practical application
realized I was looking at wrong function the problem wasn't there
why did they decide to use these short and vague names
l, m, co, ...
co is for code, m is for module, d is for dictionary, l is for list
it's quite intuitive really
the lines where they're assigned also indicate which type of object that variable is
what is v?
value
if it's a generic or any object result it's either v for value or o for object
ok
there's also s for (C) string, i for (C) int, u for unicode (python string), and some other things
although not all places use short variable names
for some reason in my case (branch 3.13 60403a5409ff2c3f3b07dd2ca91a7a3e096839c7)
#define MODULES(interp) \
    (interp)->imports.modules
the MODULES is NULL
and I get
Fatal Python error: _PyImport_InitCore: failed to initialize importlib
Python runtime state: preinitialized
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 965, in <module>
  File "<frozen importlib._bootstrap>", line 997, in BuiltinImporter
  File "<frozen importlib._bootstrap>", line 506, in _requires_builtin
  File "<frozen importlib._bootstrap>", line 45, in _wrap
AttributeError: 'str' object has no attribute '__dict__'
I am on windows 11 amd64
@stable grail
this is a normal build?
yeah
what do you mean by MODULES is NULL though?
it's a macro
wait let me explain
this raises error because init_importlib returns -1
it returns -1 because PyImport_ImportFrozenModule returns -1
it returns -1 because PyImport_ImportFrozenModuleObject returns -1
PyImport_ImportFrozenModuleObject returns -1 because m is NULL
btw m = exec_code_in_module returns NULL
and if (m == NULL) { goto err_return; } jumps to err_return, where it returns -1
and this function returns NULL
oh wait v could be NULL too
I didn't check it
yeah its something else
something from PyEval_EvalCode
though
which is in the python code........
most likely
broken freeze files-?
a module with its code's marshal cached in a C table
autogenerated
ok
in this case it's importlib._bootstrap
what does line 997 on Lib/importlib/_bootstrap.py say for you?
```
0997|     @staticmethod
0998|     def exec_module(module):
0999|         """Exec a built-in module"""
1000|         _call_with_frames_removed(_imp.exec_builtin, module)
```