#internals-and-peps
The PEP itself usually won't change, as it's considered a historical document. Small changes to the functionality described in the PEP could be done through a CPython issue, bigger changes would need a new PEP.
Do you think this channel is an appropriate place to present new ideas such as those small changes?
It can be a good place to get initial feedback, but you'll eventually have to go through channels like the CPython issue tracker, discuss.python.org, or a new PEP, to get the change into Python
OK, so my idea is about the new type syntax (PEP 695). I suggest a way to define NewTypes through it. How about type MyNewType = NewType[int] as an equivalent of MyNewType = NewType("MyNewType", int)?
No, this syntax is already reserved for type aliases
I thought of that too (in case you're not aware, I'm the one who implemented PEP 695)
One problem is that currently type aliases created by the type statement are not callable
I see
this has been discussed on the discourse thread as well fwiw
OK
i think they should be callable and itd be cool if this did somehow work but idk how youd get it to work
imo they should return the evaled value but that breaks the new type example anyway
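For context on why callability matters in this idea: a `NewType` is used by calling it, which is exactly what aliases created by the `type` statement don't support. A minimal sketch of the existing spelling (`UserId` is a made-up name, used only for illustration):

```python
from typing import NewType

# UserId is a hypothetical name used only for this illustration.
UserId = NewType("UserId", int)

# Calling the NewType is how a value gets "wrapped" for the type checker.
# At runtime the call simply hands back its argument unchanged.
uid = UserId(5)

print(uid)
print(type(uid))
```

So any `type X = NewType[int]` spelling would need `X(5)` to work at runtime, not just in the type checker.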
Hold the phone here
If a dictionary is used to assign __slots__, the dictionary keys will be used as the slot names. The values of the dictionary can be used to provide per-attribute docstrings that will be recognised by inspect.getdoc() and displayed in the output of help().
So that's a way to assign docs to instance variables assigned through __slots__? Is this new?
I've never seen that before
Wdym, it works with help()
I'm flabbergasted bc it allows better documentation and I've never seen it
i more mean you cant do Foo.slot.doc
Sure but inspect.getdoc does it
idk i still prefer attribute docstrings like
```py
class Foo:
    x: int
    """Some doc"""
```
cause thats picked up by sphinx and pylance
You sure you cannot? Seems like they might just add that string as the __doc__ attribute in the slot descriptor instance
I cannot see how else they store it
I don't have a computer to test
they just check slots is a dict surely?
ill download a new version of 3.12 and double check
It be pretty stupid if they recorded the dict used to define the slots on the dict
I mean, if you have custom descriptors, you can set the doc on the instance of the descriptor and it gets special-cased by help()
>>> class Foo:
...     __slots__ = {"x": "A doc"}
...
>>>
>>> Foo.x.__doc__
>>>
Hey, I wrote that! https://github.com/python/cpython/pull/30109
It's actually existed since Python 3.8, it just wasn't documented anywhere except "What's new in Python 3.8" until I added a snippet to the datamodel in December 2021
huh, i swear i remember this being discussed with something about docnames or something
i havent been following python since 3.8 though :P
before my time as well
I can't remember how I originally found out about the feature
But I remember what motivated me to document it: there was a huge long python-ideas thread about how it would be great to standardise a way of having per-attribute docstrings, and nobody seemed to know that we already had this __slots__ feature
yeah thats what i thought the docnames thing was from
The way help() grabs the docstrings from the dictionary is pretty simple:
Lib/pydoc.py lines 140 to 143
```py
if inspect.ismemberdescriptor(obj):
    slots = getattr(cls, '__slots__', None)
    if isinstance(slots, dict) and name in slots:
        return slots[name]
```
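To see that lookup in action, here is a small sketch. `Point` and its docstrings are made up for illustration, and `slot_doc` is a hypothetical helper that just mirrors the snippet above; it is not a function that pydoc exposes.

```python
import inspect


class Point:
    # Point and its docstrings are invented for this example.
    __slots__ = {
        "x": "Horizontal position.",
        "y": "Vertical position.",
    }


def slot_doc(cls, name):
    """Hypothetical helper mirroring the lookup in the snippet above."""
    obj = getattr(cls, name)
    if inspect.ismemberdescriptor(obj):
        slots = getattr(cls, "__slots__", None)
        if isinstance(slots, dict) and name in slots:
            return slots[name]
    return None


# The slot descriptor itself carries no docstring...
print(Point.x.__doc__)
# ...but the dict stored on the class still has it.
print(slot_doc(Point, "x"))
# inspect.getdoc() performs the same kind of lookup (Python 3.8+).
print(inspect.getdoc(Point.y))
```

This is also why `Foo.x.__doc__` comes back empty in the REPL session earlier: the text lives in `Foo.__slots__`, not on the descriptor.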
Isn't each slot a descriptor?
yes
Why not store it in the descriptor instance
probably because that would require changing C code, and this feature was a fun weekend project for Raymond that didn't require any C code changes?
idk 🤷‍♂️
I mean it's a nifty feature either way. I just wonder if sphinx knows about it
could always file a feature request if it doesn't
This is true. I'm looking at their issue board and there is stuff for it. Just can't tell what their stance is
who is "their"?
if you mean the Python core developers, there's at least two of us here
Sphinx
im a bit confused
why is ceval.c so short now? currently it is <3k lines, in 3.11 it was 8k lines
where did it go? where should i look for opcode implementations?
i think now it all lives in https://github.com/python/cpython/blob/main/Python/bytecodes.c
yes, bytecodes.c is now where things mostly live. some code is generated (code in Tools/cases_generator)
i see
WHO DID !silence
What ?
Can anyone suggest a not horrible way to hook a particular library being imported? I need to install some sort of hook such that, when another particular library gets imported, a callback of mine fires, and I can register my library with the newly imported one.
In particular, this is for Memray, and my goal is to say: If greenlet has already been imported, call greenlet.settrace. Otherwise, call greenlet.settrace as soon as greenlet gets imported.
Currently what I'm doing is horrific (I'm interposing dlopen and using string matching to detect when greenlet has loaded its private extension module). I started off trying to figure out how to do this by hooking the import system, but it actually, somehow, wound up seeming even more horrific... So I'm wondering if there's any reasonable option that eluded me.
the first thing that comes to mind is overriding builtins.__import__ to something like if name == "greenlet": do_something(); real_import(). That's pretty horrific too though.
indeed, heh. Maybe slightly less horrific than what I'm doing, but we already had the dlopen hook...
IIRC, if I override builtins.__import__, my wrapper wouldn't get invoked if the user uses importlib.import_module - though I think it would get invoked if I override importlib.__import__ instead... But yeah, still not exactly elegant...
wait what, if you override __import__ the import statement gets affected?!
there's probably some way to do it with sys.meta_path?
yes
I am moving to PHP
wait until you learn about __build_class__
probably, but not easily - that's set up for something that wants to take ownership of importing a module. There's no nice way for a finder earlier in the meta path to detect that a finder later in the meta path has successfully found and loaded something, IIRC. In order for that to work, I think I'd need to actually do the loading, rather than letting the meta path search continue to the real loader...
too late I already do from that dude's talk
whose name escapes me closely
James Powell!
which is especially ugly since part of loading it is importing an extension module.
Is it not possible to create a loader which checks if greenlet is being loaded, then fails, having python delegate to the further elements in meta path
I think this does approximately what you want
```py
class MyFinder:
    def find_spec(self, name, path, target=None):
        if name == 'numpy':
            print('numpy is being imported')
        return None

import sys
sys.meta_path.insert(0, MyFinder())
import numpy
print(numpy.array([1,2,3]))
```
this works at least on this naive example
that fires before it's imported, not after. I need to call greenlet.settrace after it's been imported
ah, I see
Create new thread, sleep 10ms in it and then do the stuff
huh. That's... Also moderately horrific. If I'm reading that right, it's working by calling importlib.__import__ recursively. It injects a fake finder onto the metapath that "finds" things by calling import recursively. Hm. Maybe that's less bad than what I'm doing...
it does solve the problem I hit with sys.meta_path, at the cost that all the finders in sys.meta_path ahead of mine would get called twice...
hm. on balance, maybe that's not too bad...
it also could not work if an earlier finder can in fact load greenlet
but that is fairly unlikely
yeah, and I think I'd be willing to just chalk that case up as user error, honestly. Loading memray, then modifying sys.meta_path, then loading greenlet, then being mad that Memray didn't automatically adjust to greenlet being loaded - psh. "Don't do that."
my last idea is customizing the loader in some way, but that will require a monkeypatch
of all the options, I think monkeypatching importlib.__import__ might be the least bad... but that's really saying something. That's quite ugly...
I'm only a few weeks into a bootcamp course, so I'm barely following the technicals, but I've been working in software engineering for quite a while and the entire question has the very specific smell of "fighting the technology"
```py
class LoaderDecorator:
    def __init__(self, old):
        self.old = old
    def __getattr__(self, attr):
        return getattr(self.old, attr)
    def exec_module(self, mod):
        ret = self.old.exec_module(mod)
        print(mod.array([1,2,3]))
        return ret

class MetaPathFinderDecorator:
    def __init__(self, old):
        self.old = old
    def find_spec(self, name, path, target=None):
        ret = self.old.find_spec(name, path, target)
        if name == 'numpy' and ret:
            ret.loader = LoaderDecorator(ret.loader)
        return ret
    def invalidate_caches(self):
        return self.old.invalidate_caches()

import sys
sys.meta_path = [MetaPathFinderDecorator(mp) for mp in sys.meta_path]
import numpy
print(numpy.array([1,2,4]))
```
this does work, but is probably worse.
(ignore me getting lazy with Loader)
yes, it absolutely is. You're not wrong. Having one library detect that another library has been loaded and "do stuff" if it gets loaded is an absolutely crazy requirement. But in this case, the thing being implemented is a profiler, and profilers need to do all sorts of things that are fighting the language, heh
yikes, but hm. That's an interesting option as well. Though it has the same problem of what happens if things get added to the meta_path after hooks are installed
you could replace the meta path with a custom sequence, that's probably fine
but yeah, maybe just patching __import__ is best
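A rough sketch of the "patch `__import__`" option, for comparison with the finder-based ones. Everything here is an assumption for illustration: `call_after_import` is a made-up helper name, `colorsys` merely stands in for greenlet, and this is not how Memray does it. It counts nesting depth so the callback only runs once the outermost import has returned, since a module shows up in `sys.modules` before it has finished executing. As mentioned above, imports that go through `importlib.import_module` bypass `builtins.__import__` and would not be seen.

```python
import builtins
import sys
import threading


def call_after_import(module_name, callback):
    """Hypothetical helper: run callback(module) once module_name is imported."""
    if module_name in sys.modules:
        # Already imported: nothing to wait for.
        callback(sys.modules[module_name])
        return

    real_import = builtins.__import__
    state = threading.local()  # per-thread nesting depth of import calls
    fired = False

    def hooked_import(name, globals=None, locals=None, fromlist=(), level=0):
        nonlocal fired
        state.depth = getattr(state, "depth", 0) + 1
        try:
            result = real_import(name, globals, locals, fromlist, level)
        finally:
            state.depth -= 1
        # Only act when the outermost import in this thread has returned, so
        # the target module has finished executing rather than just started.
        if not fired and state.depth == 0 and module_name in sys.modules:
            fired = True
            callback(sys.modules[module_name])
        return result

    builtins.__import__ = hooked_import


# Usage, with a small stdlib module standing in for greenlet.
loaded = []
call_after_import("colorsys", lambda mod: loaded.append(mod.__name__))
import colorsys

print(loaded)
```

The wrapper is left installed after firing; restoring the original function could clobber a hook that some other library installed later.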
lol, dear god
my even more terrible idea is to have import memray detect that greenlet is importable. If so, edit the .pyc file for greenlet/__init__.py to do what you need after the rest of the package is imported
ok, ok, you win, that's the worst idea
That one's also more fragile, as you can set python to not write bytecode and other things to disk
!e ```py
import sys
class MyDict(dict):
    def __setitem__(self, key, value):
        print(key, value)
        super().__setitem__(key, value)
sys.modules = MyDict(sys.modules)
import fishhook
```
@pliant tusk :x: Your 3.11 eval job has completed with return code 1.
001 | fishhook <module 'fishhook' from '/snekbox/user_base/lib/python3.11/site-packages/fishhook/__init__.py'>
002 | importlib <module 'importlib' from '/usr/local/lib/python3.11/importlib/__init__.py'>
003 | importlib._bootstrap <module '_frozen_importlib' (frozen)>
004 | importlib._bootstrap_external <module '_frozen_importlib_external' (frozen)>
005 | warnings <module 'warnings' from '/usr/local/lib/python3.11/warnings.py'>
006 | warnings <module 'warnings' from '/usr/local/lib/python3.11/warnings.py'>
007 | importlib <module 'importlib' from '/usr/local/lib/python3.11/importlib/__init__.py'>
008 | importlib.metadata <module 'importlib.metadata' from '/usr/local/lib/python3.11/importlib/metadata/__init__.py'>
009 | os <module 'os' (frozen)>
010 | stat <module 'stat' (frozen)>
011 | _stat <module '_stat' (built-in)>
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/LIUXEFAWH4PEVRADPCEUAIYTCM
@raven ridge would that work? ^
*note that you would need to play around with it a bit more, I think it would break some stuff without a bit more work
probably not - a key gets added when a module starts being imported, not when it finishes, so it has the same issue as the naive sys.meta_path approach
Ah fair enough
another approach you could take is lazy import greenlet ahead of time and add your hook in the unlazyifying code. importlib.util.LazyLoader for inspiration
Any specific reason that
type Num = int
isinstance(5, Num)
Doesn't work right now?
I don't think this is intentional, will this be added later? Would that require a PEP?
tested on Python 3.12.0b4
It's most likely a bug. isinstance must be changed to make it recognize type aliases
As for isinstance and issubclass, I think that the magic method __instancecheck__ should not be magic anymore because isinstance(value, ty) is issubclass(type(value), ty) should be True
is type new syntax?
yes
yup. PEP 695
ew
i don't know how i feel about the accumulation of soft keywords like this
i was already opposed to match and case, now we have type
at least async and await don't conflict with a lot of existing code. but type is everywhere..
i liked the type parameter stuff but i didn't know about this type statement
i wish they called it typedef or something like that
deftype even
if you're talking about type as an identifier, use typ or ty
i mean literally type is already both built-in function and a type, and it shows up somewhat frequently in code with type hints
it won't affect code using type tho
right, but it's going to be confusing
i guess the reasoning was that type is already a confusing special thing, so why not make it the soft keyword
idk. i don't see why it even needed to be part of this pep
i don't mind it, i just feel like there's a lot of cowboy stuff going on lately with peps and new syntax
the fact that the decisions have so far been ok doesn't make me feel good about the process
where do I ask questions
!pep 695
Disgusting
shoulda posted in mail thread then
It's not a bug, it's intentional behavior. Type aliases may be more complex types for which isinstance doesn't make sense, and they're lazily evaluated. It would be surprising if an isinstance() call would evaluate the alias.
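A small sketch of that behaviour. To keep the file parseable on older interpreters it builds the alias with `typing.TypeAliasType` instead of the `type` statement; treat that spelling, and the variable names, as assumptions made for this example (the call form evaluates its value eagerly, unlike the statement).

```python
import sys

alias_rejected = None
unwrapped_ok = None

if sys.version_info >= (3, 12):
    from typing import TypeAliasType

    # Runtime stand-in for the statement `type Num = int`, written as a call
    # so that this file still parses on interpreters older than 3.12.
    Num = TypeAliasType("Num", int)

    try:
        isinstance(5, Num)
        alias_rejected = False
    except TypeError:
        # The alias object is not a class, so isinstance() refuses it.
        alias_rejected = True

    # __value__ returns the aliased type, which isinstance() does accept.
    unwrapped_ok = isinstance(5, Num.__value__)

print(alias_rejected, unwrapped_ok)
```

So code that wants the runtime check has to opt in explicitly with `Num.__value__`, which also makes the evaluation of the alias visible at the call site.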
Does anyone know how to debug pdb? I have a clone version of cpython and its built. I am trying to make modifications to pdb in hopes of getting it added. I however have no idea how to debug pdb though because I need to pdb to debug pdb. I'm having a horrible time trying to see the impact of my changes in my local pdb.py
when I was doing sys.settrace stuff I just did print debugging
Well that's unfortunate
I could just add some pdb aliases that auto setup breakpoint commands that do the prints I guess
what you may find helpful is looking at test_pdb and adding tests for your feature before you implement it.
right. Makes sense. Part of this too was trying to understand the current implementation and how I can modify it, but the tests should help with that too.
There is no builtin function called type
Youre plucking hairs here
thoroughly dislike type becoming a soft keyword but that's just like my opinion man
Python has had some syntactic influence from Rust, like the match statement (which is packaged into Python's lexical structure), and the type statement.
I think building support for gradual typing into the language was a mistake tbh
I wish the typing stuff went into a new language, like TypeScript
Yeah, and the more that gets added the more there seems to be issues
Specifically with generics and stuff like that
Maybe 4.0 with nogil and first class typing support
nogil isnt 4.0 is it?
That's not the plan currently
There's no official stance from what I remember GvR talking about, default nogil python would be like a gradual build up to what would be like a 4.0
This was on lex Friedman podcast
GvR isn't in charge of Python anymore
Yes
That's why there's going to be a build option so 3rd party can work on it
I think the issue is that users would assume a major version bump is going to be exactly like the Python 2->3 transition. And maybe a major version bump even encourages core devs in some ways to see backwards-incompatible changes as okay, so maybe it would lead to a repeat of the 2->3 transition. Nobody wants a repeat of the 2->3 transition.
So everybody's pretty averse to having a Python 4
it took almost a decade to fully deprecate 2.0 right?
i want a python 4 so we can break things :^)
What would you like to break?
asyncio and typing stuff
PEP 695 integrates generics into the language much better, not sure what you're talking about here
Will TypeVar and Generic be removed eventually?
no, they are used by the PEP 695 implementation
but there should no longer be a need to use them explicitly when defining generics
(you might still want to import them if you're doing introspection, or want compatibility with Python 3.11 and earlier)
some of us are still working on the long tail of Python 2 migration, thankyouverymuch
my condolences
seems bad
it is
why isn't ur security forcing it?
u work for a fortune 500
seems no brainer from that pov
it is
but "rewrite your code in a new language" is really hard to handle as a security ticket. Migrating an application from Python 2 to Python 3 is often a tremendous amount of new work and testing. If migrating a single application takes a week and a team has 100 different applications...
yeah idk
I've only updated one tool and 2to3 covered like 90% of it
rest I easily replaced with repeated test runs and vim
was it a tool that you were the original author of? Did it already have decent test coverage?
both of those make things much easier...
ah - and was it an application and not a library? That also makes things easier...
In any event: there's always going to be tension between the group of people who want to start from scratch and build something better versus the people who want the things they already wrote to keep working. But the Python 2 to 3 migration was a nightmare that damn near killed the language, and we shouldn't be wishing that on ourselves again. There's billions or trillions of lines of existing Python code that's in maintenance mode, and making major breaking changes requires auditing all of it.
which reminds me: I'd really like to get https://docs.python.org/3/library/importlib.resources.html#deprecated-functions un-deprecated. The deprecation makes it harder to write code that supports every supported version of the interpreter, and the proposed replacements are hardly elegant. It doesn't seem like a win to me to tell people that instead of writing
```py
import importlib.resources
config = importlib.resources.read_text("my.package", "config.txt")
```
they just need to do
```py
import importlib.resources
config = importlib.resources.files("my.package").joinpath("config.txt").read_text(encoding="utf-8")
```
(And you do need to pass the encoding="utf-8", since importlib.resources.read_text defaulted to UTF-8 but importlib.resources.Traversable.read_text does not)
Were there any attempts to use alternative to typing module? Type annotations can contain anything, so typing module is not mandatory.
Something like:
from better_typing import Join
def f(x: Join[SupportsAppend, SupportsAdd]) -> None:
    x.add(1)
    x.append(2)
Tbh. both versions look bad. Why not
importlib.Resource("my.package").open("config.txt").read()
or something like that
I guess they are gone in 3.13 already: https://github.com/python/cpython/commit/243fdcb40ebeb177ce723911c1f7fad8a1fdf6cb. I would probably not have removed them if I was maintaining that library
The removed functions look pretty simple
Annotations were added in 3.0 and typing was only added in 3.5, so there was some time for people to come up with alternatives
The Join looks like the missing Intersection type that I keep asking about
I doubt anything significantly better would have come up though
wait, isn't that too early? Aren't they supposed to be available for 2 minor versions after being deprecated?
from python312 import importlib
# do whatever you want using removed functions
deprecated in 3.11, removed in 3.13
I thought it was available for 2 minor versions after the one that deprecates them, not including - but you'd know better than me.
still.
That counts on open().__del__ to clean up the file, which is an anti-pattern and doesn't work nicely in non-CPython interpreters
no, the standard is to emit DeprecationWarnings in two releases, then remove it
But arguably that's often too fast
I mean using that open like the builtin, so typically as a context manager.
We've deprecated things in typing but I'm generally not keen on removing things unless they are really bad
the code sample that you gave didn't use it as a context manager
My point is: why aren't these just normal file-like objects?
that's exposed as an option, too. That's even more code, though.
I see
my complaint here is that this is a lot of boilerplate to do a reasonably common operation
```py
config = importlib.resources.read_text("my.package", "config.txt")
``` vs ```py
config = importlib.resources.files("my.package").joinpath("config.txt").read_text(encoding="utf-8")
``` vs ```py
with importlib.resources.files("my.package").joinpath("config.txt").open(encoding="utf-8") as f:
    config = f.read()
```
one of these things is a good deal nicer than the other two 🥲
i'm hoping that they re-add helper functions like read_text as the design stabilizes
personally i didn't think the old model was that bad, once you understood that a directory wasn't allowed to be a "resource"
but here we are
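Until something like that exists, one workaround is a tiny local wrapper over `files()`. This is only a sketch: `read_text` below is a hypothetical helper defined in your own code, not the deprecated stdlib function, and it needs Python 3.9+ for `importlib.resources.files`. The demo reads from the stdlib `json` package purely so the example runs anywhere.

```python
import importlib.resources


def read_text(package, resource, encoding="utf-8"):
    """Hypothetical local helper: read a text resource via the files() API."""
    return (
        importlib.resources.files(package)
        .joinpath(resource)
        .read_text(encoding=encoding)
    )


# Stand-in for read_text("my.package", "config.txt"); swap in your own names.
text = read_text("json", "__init__.py")
print(len(text))
```

Defaulting `encoding` to UTF-8 in the wrapper restores the behaviour the old function had, so call sites stay as short as before.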
!pep 722 are there any other languages that define dependencies like this?
I think user scripts does it with a comment. // @requires foobar
i don't find the argument against toml very convincing
not sure if you missed it or you mean specifically that exact syntax but quoting part of that pep:
A review of how other languages allow scripts to specify their dependencies shows that a “structured comment” like this is a commonly-used approach.
linking https://dbohdan.com/scripts-with-dependencies
It's a single file script, I think requiring a second file makes it cumbersome
rust isn't a good comparison to say "oh it's fine", I have plenty of binaries on my system written in rust without any of that living alongside it, python being interpreted, and this being to share scripts does change the concerns
!e someone in #python-discussion was asking questions about code that looked basically like this: ```py
match "1":
    case str(1):
        print("match")
    case _:
        print("no match")
```
@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.
no match
I wonder if this should be a SyntaxWarning
this is happening because of the built-in types special case in PEP 634:
For a number of built-in types (specified below), a single positional subpattern is accepted which will match the entire subject.
In this case, the subpattern is the 1, and "1" doesn't match 1, so the whole match fails.
if the subpattern is a literal pattern of a different type than the built-in type it can never match, and that seems worth a SyntaxWarning to me.
that does sound reasonable
I guess one issue is that we don't know at compile time that the types don't match
because you could have done globals()["str"] = int first
hm
I know that overwriting the builtins can be done, but I don't think things should be designed having to assume that may happen, if anything, we should prevent overwriting the builtins like that.
nah, Jelle's right. We shouldn't have a SyntaxWarning that's heuristic, it should only be for things that are provably wrong.
I'd say changing the semantics of the builtins in unexpected ways instead of just using a name that corresponds to what you actually want is something that should be considered an error as well.
Because by that logic, we can't check anything for syntax errors, someone could use ctypes to modify everything
ctypes or the C API is cheating - you can violate CPython's own invariants that way
but the same isn't true for Python code
We could just prevent shadowing or overwriting the builtins and then consider attempts to do this an error, because it likely already should be one.
that would be a major breaking change
when people do it with a variable name list, how many times is it a bug vs correct?
stdlib/imaplib.pyi: def list(self, directory: str = '""', pattern: str = "*") -> tuple[str, _AnyResponseData]: ...
stdlib/multiprocessing/managers.pyi: def list(self, __sequence: Sequence[_T]) -> ListProxy[_T]: ...
stdlib/multiprocessing/managers.pyi: def list(self) -> ListProxy[Any]: ...
stdlib/nntplib.pyi: def list(self, group_pattern: str | None = None, *, file: _File = None) -> tuple[str, _list[str]]: ...
stdlib/poplib.pyi: def list(self, which: Any | None = None) -> _LongResp: ...
stdlib/pydoc.pyi: def list(self, items: _list[str], columns: int = 4, width: int = 80) -> None: ...
stdlib/tarfile.pyi: def list(self, verbose: bool = True, *, members: _list[TarInfo] | None = None) -> None: ...
stubs/boto/boto/s3/bucket.pyi: def list(
stubs/commonmark/commonmark/render/html.pyi: def list(self, node, entering) -> None: ...
stubs/commonmark/commonmark/render/rst.pyi: def list(self, node, entering) -> None: ...
stubs/redis/redis/commands/bf/commands.pyi: def list(self, key, withcount: bool = False): ...
stubs/stripe/stripe/api_resources/abstract/listable_api_resource.pyi: def list(
stubs/stripe/stripe/api_resources/list_object.pyi: def list(
stubs/tensorflow/tensorflow/core/framework/attr_value_pb2.pyi: def list(self) -> global___AttrValue.ListValue:
^ e.g. you'd break all these libraries
And we can't do this over time by emitting a warning for a few versions and then fully disallow it?
I wonder if a runtime error would be appropriate here, if we can't do a SyntaxError. Although at runtime, I don't know that we'd realize we're matching against a literal pattern...
I guess we could, but that would be a lot of migration pain and I don't think it'd be worth it
you'd probably have to change the bytecode, which feels too expensive for what we're trying to achieve
blah.
From the perspective I'm in, I'm trying to get a lot more correctness. Internally at work, we have CI lints for some of this, but when a lot of limitations stem from "someone can change the entire world out from under us", it seems impractical to consider the world changing under us as a real thing, but you're right that it can happen currently.
Maybe it could be limited in scope to not include the scope of class definitions? In the typeshed, this actually only appears scoped to class definitions as far as I can see at a quick glance.
The confidence to look at code as correct without needing to scrutinize for things which feel like they should be detectable more automatically is missing in some places with python.
but this only covers the typeshed, the "whole world of python code" isn't available to see how breaking vs catching subtle issues this would really be.
I suspect it'd break quite a lot of real world code. I see several places in the stdlib it'd break.
Another cost of your proposal would be that we can't ever add another builtin without breaking compatibility
Lib/http/server.py line 786
```py
list = os.listdir(path)
```
Lib/idlelib/tree.py line 41
```py
list = glob.glob(os.path.join(glob.escape(icondir), "*.gif"))
```
Lib/idlelib/window.py line 24
```py
list = []
```
Lib/pickle.py line 1656
```py
list = stack[-1]
```
Lib/pstats.py line 426
```py
width, list = self.get_print_list(amount)
```
And every single one of these introduces potential for subtle issues arising from it in the future, it's inherently fragile and no better than doing _list = ...
Another cost of your proposal would be that we can't ever add another builtin without breaking compatibility
Not like there are frequently new builtins, but point taken. The addition of new keywords is already contentious for similar reasons.
Maybe instead of ever disallowing it, it could just be always warned about, I'll try and think about something less disruptive here.
I think we're somewhat unlikely to ever add another hard keyword
yeah, I got the distinct impression casematching only got approved because the new parser could do it without one
same for PEP 695's type statement
a fair amount of stuff broke due to async in particular. There were a decent number of libraries taking an async=True keyword argument, or something like that.
and keyword arguments are a particularly nasty case of the backwards incompatibility of introducing new hard keywords. That's particularly hard for libraries to adjust to
yes, I wish we would have just left it a soft keyword. PEG came only a few years too late
I have a lot I'm currently working on with intersections and type formalization, but I've been thinking about a "Better way" to catch some of these likely errors or fragile code being built into python. I think the comparison to keywords points out why hard disallowing it may end up too disruptive, but maybe in the future we could have an interpreter flag that people can run code with to spot things that we can say with high confidence is likely unintentional or fragile.
Instagram's Static Python might be an interesting point of comparison
While alternative python implementations and add on tools provide a lot when it is your own code and you can elect to use them, people generally seem more likely to fix code when the issues with it are apparent in CPython. I've had cases relating to pypy and generators where someone didn't want to merge code that added gen.close() for generators that were broken out of (I don't think this is necessary in pypy anymore, old issue)
spotting potential issues in code should be a "batteries included" feature if we can do it non-disruptively IMO
so what you really want is a linter built into CPython
just drop an os.system("ruff") into a .pth file...
Something in between a linter and -pedantic compiler errors really. Linters are usually extremely opinionated in some fashion by default, when there are many things heuristically detectable that are more often than not done in error.
hm. Sounds a bit like -X dev, though not quite, because
It should not be more verbose than the default if the code is correct; new warnings are only emitted when an issue is detected.
Consider these warnings:
>>> ()()
<stdin>:1: SyntaxWarning: 'tuple' object is not callable; perhaps you missed a comma?
>>> ''('')
<stdin>:1: SyntaxWarning: 'str' object is not callable; perhaps you missed a comma?
>>> ...[...]
<stdin>:1: SyntaxWarning: 'ellipsis' object is not subscriptable; perhaps you missed a comma?
>>> 1[1]
<stdin>:1: SyntaxWarning: 'int' object is not subscriptable; perhaps you missed a comma?
>>> [][[]]
<stdin>:1: SyntaxWarning: list indices must be integers or slices, not list; perhaps you missed a comma?
Yes, in 99+% of cases the warnings are correct, but int.__getitem__ or str.__call__ can be patched, so the code can be correct.
The reason why these warnings exist is that making ()() code runnable would require some weird tricks (fishhook / stealing orig dict from cls.__dict__ and patching it / ...). So the compiler assumes that there are no weird things happening and raises a warning
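These warnings come from the compiler, before any code runs, which can be checked by compiling a string and recording what gets emitted. A minimal sketch (`compile_warnings` is a made-up helper name):

```python
import warnings


def compile_warnings(source):
    """Compile source without running it; return SyntaxWarning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<demo>", "exec")
    return [
        str(w.message) for w in caught if issubclass(w.category, SyntaxWarning)
    ]


# A literal tuple being called is flagged at compile time...
print(compile_warnings("()()"))
# ...but the same call through a name is not, since the compiler can't know
# what the name will refer to when the code runs.
print(compile_warnings("f = ()\nf()"))
```

The second case is the relevant one for the `case str(1)` discussion: as soon as a name is involved, the compiler would be guessing.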
Now consider these:
# 1 example:
str = int
... case str(1): ... # this is correct, no syntax warning should be there
# this is fine
# worst that can happen - lack of warning
# 2 example:
... case str(1): ... # wrong because str is not assigned to, so there should be a warning
# also fine
# str is not actually assigned, so warning is expected
# 3 example:
globals()[('r' + 't' + 's')[::-1]] = int # weird
... case str(1): ... # correct because str is actually int, but compiler should not care about weird code, so there should be a warning
# not fine
# str is assigned, but compiler cannot see it, so there is a wrong warning
# this case is not common, and it is a consequence of "bad" code, so having extra warning in this case is fine
At compile time, we don't know what str will resolve to, even if the user has only written Python code and not done anything tricky to mess with CPython internals.
it's normally looked up from __builtins__, but at compile time, we don't know whether the module will be loaded into a process where someone has or hasn't modified __builtins__
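A sketch of that point: the same compiled code object can see a different `str` depending on the namespace it is executed in, with no ctypes or interpreter internals involved. The function and namespace names are made up for the example.

```python
source = "def is_text(value):\n    return isinstance(value, str)\n"
code = compile(source, "<demo>", "exec")  # compiled exactly once

# Namespace 1: `str` is not defined here, so it falls through to builtins.
normal_ns = {}
exec(code, normal_ns)

# Namespace 2: a module-level `str` shadows the builtin for this code.
patched_ns = {"str": int}
exec(code, patched_ns)

print(normal_ns["is_text"]("1"))   # True
print(patched_ns["is_text"]("1"))  # False
print(patched_ns["is_text"](1))    # True
```

Nothing in `code` itself differs between the two runs, which is why a compile-time check can't be sure what `str` means.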
removing or replacing str with something else will most likely break everything (just like patching some internals with fishhook)
hm. Perhaps...
int.__getitem__ cannot be patched as far as the language is concerned. The tricks you mention aren't part of the language and can break at any time.
>>> class S(str):
...     def __eq__(self, other):
...         return eval(self) == other
...
>>> match S("1"):
...     case str(1):
...         print("π€")
...
π€
hah, true. I wasn't thinking of subclasses at all
Removing or replacing str in some local scope will often not break anything.
Compiler keeps track of local variables, so it should know whether str is a local var or not
an imprecise metric (only start of lines, specific whitespace pattern, including string content, only list, only published code on github) but still huge
wow that's a lot
actually, allowing code that's not at zero indentation bumps it up to 100k lol
includes false positives though
Or for the full set of builtins (as per the docs) it's 424k files. Mental
none on the first page of results for me which is a good enough heuristic
i managed to filter the results using /(?m-i)(?:[^\(]\n+|\A)\s*list\s*=/ instead which doesn't show any false positives for the first few pages and has 122k matches
found this for some reason
turns out even the stdlib does it
Lib/cgi.py lines 890 to 891
```py
list = traceback.format_tb(tb, limit) + \
       traceback.format_exception_only(type, value)
```
imagine the chaos if this became a hard error
ok i got 2.8m from this
using /(?m-i)(?:[^\(,\\](?:\s*?\\?\s*?\n)+|\A)\s*(?:abs|aiter|all|any|anext|ascii|bin|bool|breakpoint|bytearray|bytes|callable|chr|classmethod|compile|complex|delattr|dict|dir|divmod|enumerate|eval|exec|filter|float|format|frozenset|getattr|globals|hasattr|hash|help|hex|id|input|int|isinstance|issubclass|iter|len|list|locals|map|max|memoryview|min|next|object|oct|open|ord|pow|print|property|range|repr|reversed|round|set|setattr|slice|sorted|staticmethod|str|sum|super|tuple|type|vars|zip|__import__)\s*=/
results for what i think are commonly unused built-ins, especially help() since that's not really useful for anything outside the REPL
id takes up 819k results
help takes up 308k results
breakpoint takes up 4k results
there's a lot of tensorflow code results on the first few pages
I mean id makes a lot of sense
I also tend to not overwrite builtins but with id I just don't care
async being a hard keyword made me sad as it forced me to name my files async_..
would it not be too hard to revert it?
Even stdlib shadows builtins occasionally
would be doable, but feels a little too late now
not a lot of people are going to migrate straight from 3.6 or whatever it was to 3.13
why would that matter? async has been a hard-keyword, which means making it soft shouldn't change anything with current-codebases?
the change from the previous soft -> hard would be breaking, right?
I think what Jelle says is, there aren't many people who will benefit from this, so perhaps it's not worth changing
that still doesn't make sense bc people that have async= or import async have been doing something else like async_ or asynch
has no one asked for existing hard keywords to be turned soft?
or is it like a "we could do it, but it's riskier to make that change bc it's a 1 way street so we aren't going to" type thing
I guess by this argument a lot of keywords could be made soft, like class
It is possible, but it's still a change to the language 🤷 and it does make parsing a bit more complicated.
Also think of the ~~children~~ people who maintain various Python tooling: they'll have to take this new peculiarity into account
that's prolly fair
also might make it difficult for other implementations of python
right, but if they were to change their parameter now from async_=True or asynch=True to async=True just because async became a soft keyword, then they'd break all of their users again
async becoming a hard keyword was painful exactly because it forced libraries to change function signatures and communicate that change to their users. Making it a soft keyword would allow libraries to voluntarily change those function signatures back, yes - but that would cause the exact same breakage as was necessitated by it becoming a hard keyword
i see that's fair
would anyone happen to know how Python detects overlong UTF-8 encodings?
what do you mean by that?
for example, we can represent the byte 0x01 as a UTF-8 sequence like:
11100000 10000000 10000001
as opposed to 0x01
when you mask out the metadata bits 1110, 10, and 10, you are left with
0000 000000 000001
which simplifies to 1
(i.e 0x01)
a more complex byte sequence, say, two bytes such as 0xD0 0x98
could be represented as follows:
11100000 10010000 10011000
as opposed to just 0xD0 0x98
seems like utf8 decoding happens in unicode_decode_utf8 in Objects/unicodeobject.c. Not familiar with this area of the code, though
(i mightve gotten the exact bit pattern wrong, but the point is, it happens)
hmm
@fossil fern wait, what did you mean by the bit length?
take the 3 byte overlong encodings of 0x01, 0x02, and 0x03:
11100000 10000000 10000001
11100000 10000000 10000010
11100000 10000000 10000011
what would be the bit length of them? (this is a genuine question, not trying to be snarky, just figuring out what to do haha)
holy fuck that file is 15k lines

just bit length of the code point
1 -> 1
10 -> 2
11 -> 2
100 -> 3
...
wait so what do you mean by bit length? the number of bits before the last 1 bit?
hmm
alright, yeah, let me give that a shot.
!d int.bit_length
```py
int.bit_length()
```
Return the number of bits necessary to represent an integer in binary, excluding the sign and leading zeros:
```py
>>> n = -37
>>> bin(n)
'-0b100101'
>>> n.bit_length()
6
```
More precisely, if `x` is nonzero, then `x.bit_length()` is the unique positive integer `k` such that `2**(k-1) <= abs(x) < 2**k`. Equivalently, when `abs(x)` is small enough to have a correctly rounded logarithm, then `k = 1 + int(log(abs(x), 2))`. If `x` is zero, then `x.bit_length()` returns `0`.
Equivalent to...
i already have a function for computing UTF-8 codepoints thankfully..
basically you know how many bits fit in each byte size
yeah

oh.
OH MY GOD.
YOU ARE A FUCKING GENIUS
how did i not think of that 🤣
lol
it might just work.. let me do some manual tests
in my head that felt like an obvious thing
overlong means it would have fit in a shorter byte length
so you can just get the codepoint and check if it would fit
there probably is some more clever way though, this is the most naive way
alright, let's try this:
actually, you can just make some "expected length" function
if codepoint <= 127:
    return 1
if codepoint <= ...:
    return 2
...
ya thats what i do for the length
no need to fiddle with bit length if you have that
0xD0 0x98 = 11010000 10011000
using 0xE0 as the overlong encoding byte,
11100000 10010000 10011000
!e ```py
print(hex(0b11100000), hex(0b10010000), hex(0b10011000))
```
@crude turret :white_check_mark: Your 3.11 eval job has completed with return code 0.
0xe0 0x90 0x98
has the codepoint 1048, in binary is 10000011000
that binary number needs two bytes to store it
i think your idea works then. 
ill write a more dedicated test case and see
this'll definitely be preferable though
and it's easy to get those cutoff points
< 2^7 → 1
< 2^11 → 2
< 2^16 → 3
< 2^21 → 4
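A minimal sketch of the "decode, then compare lengths" idea, using the cutoffs listed above. The function names are made up for illustration, and decode_lenient deliberately skips the other validity checks a real decoder needs (surrogates, the U+10FFFF upper limit); it only strips the marker bits.

```python
def expected_utf8_length(codepoint: int) -> int:
    """Shortest number of bytes UTF-8 needs for this code point."""
    if codepoint < 1 << 7:
        return 1
    if codepoint < 1 << 11:
        return 2
    if codepoint < 1 << 16:
        return 3
    if codepoint < 1 << 21:
        return 4
    raise ValueError("code point too large for UTF-8")


def decode_lenient(seq: bytes) -> int:
    """Strip the marker bits of one UTF-8 sequence and return the payload."""
    first = seq[0]
    if first < 0x80:
        length, codepoint = 1, first
    elif first >> 5 == 0b110:
        length, codepoint = 2, first & 0x1F
    elif first >> 4 == 0b1110:
        length, codepoint = 3, first & 0x0F
    elif first >> 3 == 0b11110:
        length, codepoint = 4, first & 0x07
    else:
        raise ValueError("invalid leading byte")
    if len(seq) != length:
        raise ValueError("wrong number of bytes for this leading byte")
    for byte in seq[1:]:
        if byte >> 6 != 0b10:
            raise ValueError("invalid continuation byte")
        codepoint = (codepoint << 6) | (byte & 0x3F)
    return codepoint


def is_overlong(seq: bytes) -> bool:
    """True if the code point would have fit in fewer bytes than were used."""
    return expected_utf8_length(decode_lenient(seq)) < len(seq)


print(is_overlong(bytes([0xE0, 0x80, 0x81])))  # 3-byte form of 0x01
print(is_overlong(bytes([0xD0, 0x98])))        # ordinary 2-byte sequence
print(is_overlong(bytes([0xE0, 0x90, 0x98])))  # 3-byte form of code point 1048
```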
I wonder if there is a neat way without getting the code point though
yeah
I don't see anything super neat
you can tell by looking at the first 1 or 2 bytes
but it's probably more work than just decoding
definitely lol
ill continue looking into this
thanks for the idea of checking the codepoints
basically not all these ys can be 0
(0xxxxxxx)
110yyyyx 10xxxxxx
1110yyyy 10yxxxxx 10xxxxxx
11110yyy 10yyxxxx 10xxxxxx 10xxxxxx
so it's not a nice check on the byte level
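For comparison, the byte-level version of the "not all ys can be 0" rule can be sketched like this. It assumes the sequence is otherwise well-formed (correct length, continuation bytes in 0x80..0xBF); has_overlong_prefix is a made-up name.

```python
def has_overlong_prefix(seq: bytes) -> bool:
    """Detect overlong encodings from the first one or two bytes only."""
    first = seq[0]
    if first in (0xC0, 0xC1):
        # 110yyyyx with yyyy == 0: the value fits in 7 bits.
        return True
    if first == 0xE0:
        # 1110yyyy 10yxxxxx with all y == 0: second byte is 0x80..0x9F.
        return seq[1] < 0xA0
    if first == 0xF0:
        # 11110yyy 10yyxxxx with all y == 0: second byte is 0x80..0x8F.
        return seq[1] < 0x90
    return False


print(has_overlong_prefix(bytes([0xE0, 0x80, 0x81])))  # overlong 0x01
print(has_overlong_prefix(bytes([0xE0, 0xA0, 0x80])))  # U+0800, shortest form
```

It avoids computing the code point, but needs one special case per sequence length, which matches the "more work than just decoding" remark.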
@fossil fern so the minimum bit length of an actual 3 byte encoding looks to be 12
E0 81 BF is an overly long encoding for 7F
and ends up having a bit length of 7
im still trying more but 
this is probably what you were trying to get at with the bit stuff tho i think
e0 encoding?
1110xxxx 10xxxxxx 10xxxxxx
varint?
variable sized int
oh lol
I say just use this and compare to the number of read bytes
do the simple thing, revisit if really needed
Is this the place to share a pep proposal?
I did post it in main, but it occurs to me that this is the right channel
So, this is my proposal
https://pastebin.com/Xte1nH78
The idea is to combine ABC and Protocol, with a syntax that allows the use of typehints instead of stub functions and properties
It makes abstract/protocol classes much shorter
How is it different from a normal Protocol?
Ah I see I think
What's wrong with stub functions though?
Nothing wrong per se, besides the verbosity
Protocols can be defined with nothing but type hints
the same should, in theory, be true of abstracts
and at that point
they are structurally similar enough to be combined
Does some_variable: int require the variable to be writable? Or is it the same as specifying a @property?
same as defining
@property
@abstractmethod
def some_variable(self):
    ...
in an ABC
So there's no way to define a writable variable in a StructuralProtocol?
except it doesn't need to explicitly be a property
well, its to represent a variable
it doesn't care for its writable state
It could make use of TypedDict's Required/NotRequired maybe
I mean, if you do this:
```py
class Foo(StructuralProtocol):
    bar: int
```
will some_foo.bar = 42 be legal?
What's the difference from protocols then?
What features of ABCs do you want specifically?
think about how you have @runtime_checkable protocols?
this makes Protocol and ABC the same thing
if used as a type hint
then its basically a protocol
if inherited
its an ABC
it can be used for instance/subclass checks etc
it also means default impl can be included too
if you want to lean into the ABC side
You can already have a default implementation in a protocol and inherit from it
and we already have runtime_checkable
Exactly
they are doing the same thing with minor differences
Thus, can be updated and combined
So instead of 2 competing ways of defining an interface, you propose to have 3?
You can take the simpler definitions of Protocol for abstract methods and vars
I propose combine them into 1, and later deprecate the originals
You can make this backwards compatible with aliasing
I might be a bit pessimistic, but I doubt this will be accepted, especially with the intention of deprecating ABC
how would you do a purely nominative subtype then? If I don't want a structural typecheck to pass.
What do you mean?
From what I understand, your thing is exactly the same as a runtime-checkable Protocol 
what is actually the difference?
an ABC will not match a type which does not subclass it/is not registered to it. If I want this behaviour, how do I get it without using ABC?
My point is, why have two, basically same classes, with minor differences
the main difference, as lakmatiol mentioned, is the nominal vs structural
I imagine it not too hard to add optional constraints
(I do know my idea is not perfect yet, this is why I'm posting, to see what folks think :P)
I am currently writing an implementation of my idea
My main issue is the verbosity of ABCs
and I kinda want Protocol like definition of ABCs
to borrow an example from fix, a Point2D Protocol is a subtype of a Point3D Protocol, but a Point2D ABC is not necessarily a subtype of a Point3D. (where points have x y (z))
unifying ABCs and protocols is not really desirable as a consequence
Ah, co/ntravariance issues, I see
you could propose an alternative, protocol-like syntax for ABCs
this is unrelated to variance, this is related to the definition of a subtype wrt. abc and protocol
I will think about it some more
have been working on it for a week or so now
(I have a real issue with overly verbose code kek)
I do agree that getting rid of all the stubs would be nice, but I would suggest keeping the scope as "nicer way to define ABCs".
Yeah, given what you guys have said and folks on a few other servers, I am beginning to tend to agree
OK, reworked the proposal
It focusses solely on the syntax
and drops the idea of combining protocols and abcs
Consider this class:
```py
class MyABC(ABC):
    @abstractmethod
    def do_a(self) -> int:
        return 42
    @property
    @abstractmethod
    def do_b(self) -> Callable[[], int]:
        return lambda: 42
```
Same class in your proposed syntax:
```py
class MyABC(ABC):
    do_a: Callable[[], int]
    do_b: Callable[[], int]
```
What is now the difference between do_a and do_b? I don't see any
well, the hints are wrong for the second object
where?
class MyABC(ABC):
    do_a: Callable[[], int]
    do_b: Callable[[], Callable[[], int]]
do_b is the property
then it should be defined as
do_b: int
you proposed to declare properties like this: value: str
no, because it doesnt return int, it returns lambda that returns int
Ah right, misread
I mean, it's still the same thing really
You can define this the same way in a protocol
So, my proposal is to make these line up
```py
class MyABC(ABC):
    do_a: Callable[[], int]
    do_b: Callable[[], int]
```
what does this code even mean? which field is a property and which field is a method?
and, python kinda treats them the same anyway
well, it doesn't consider "properties" directly, instead assuming they are a variable
you can still define as a property
just like you can a normal class
when you look at the class abstractly
you don't generally think of the method returning the value
you just consider it to be a variable
(because descriptors are all black boxy, and properties are descriptors)
so
a property that returns int
is still a: int
disagree
method is a part of the class, not a part of an instance
and instance variables are part of instances
no, because it is not settable
if i have a: int, i should be able to do x.a = 42
Type hints don't consider the setability
and abstract properties don't define that really either
@property
@abstractmethod
def a(self)->int:
    ...
does not enforce setter or getter
not really
just that a variable with the name a that returns an int exists
class X:
    @property
    def x(self) -> int:
        return 42
x = X()
print(x.x)
x.x = 23 # Cannot assign member "x" for type "X"
ah, but thats not abstract
thats concrete
class X(ABC):
    @property
    @abstractmethod
    def x(self):
        return "blah"
class SubX(X):
    def __init__(self, x):
        self.x = x
is valid
because that "recipe" in python just means a var called x exists
and does not enforce a property descriptor
so
x: int
just becomes shorthand for
@property
@abstractmethod
def x(self):
    ...
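As a runtime cross-check of the recipe above (a sketch; Base, InitOnly and ClassLevel are made-up names): abc only looks at class-level names when deciding whether a class is still abstract, so an assignment inside __init__ alone does not satisfy an abstract property, while a class-level attribute does. Static type checkers may judge these cases differently from the runtime.

```python
from abc import ABC, abstractmethod


class Base(ABC):
    @property
    @abstractmethod  # innermost, as the abc docs recommend
    def x(self) -> int:
        ...


class InitOnly(Base):
    # Only assigns x on the instance; the class-level x is still abstract.
    def __init__(self, x: int) -> None:
        self.x = x


class ClassLevel(Base):
    # A plain class attribute overrides the abstract property.
    x: int = 0

    def __init__(self, x: int) -> None:
        self.x = x


try:
    InitOnly(1)
    init_only_error = None
except TypeError as exc:
    init_only_error = exc

class_level_value = ClassLevel(5).x
print(type(init_only_error).__name__, class_level_value)
```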
Take a protocol class instead
class X(Protocol):
    def x(self) -> int:
        ...
a valid way to define this structure in py
class X(Protocol):
    c: Callable[[], int]
when used with @runtime_checkable
this will be valid
and will succeed
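A small sketch of that runtime behaviour, with made-up class names: a @runtime_checkable protocol that declares a method also accepts an object that only carries a callable instance attribute of the same name, because the isinstance check only looks for the member's presence.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasX(Protocol):
    def x(self) -> int:
        ...


class ViaMethod:
    def x(self) -> int:
        return 1


class ViaAttribute:
    def __init__(self) -> None:
        # A plain callable stored on the instance, not a method.
        self.x = lambda: 1


class Missing:
    pass


results = {
    "method": isinstance(ViaMethod(), HasX),
    "attribute": isinstance(ViaAttribute(), HasX),
    "missing": isinstance(Missing(), HasX),
}
print(results)
```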
My proposal is to make the same true of abstracts
and the only true difference between these syntaxes are a decorator
That's not really true.
While it's not the "obvious" use, type(SomeInstanceofX).x(something_else) is only valid with one of these. There are rare, but valid cases for this.
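To make that concrete, here is a sketch with two hypothetical classes that look the same through instance access but differ when the member is fetched from the class.

```python
class WithMethod:
    def x(self) -> int:
        return 42


class WithAttribute:
    def __init__(self) -> None:
        # Callable stored per instance; the class itself has no "x".
        self.x = lambda: 42


m, a = WithMethod(), WithAttribute()

# Through the instance, both behave the same.
same_via_instance = m.x() == a.x() == 42

# Through the class, only the real method exists.
via_class = type(m).x(m)
try:
    type(a).x(a)
    attribute_lookup_failed = False
except AttributeError:
    attribute_lookup_failed = True

print(same_via_instance, via_class, attribute_lookup_failed)
```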
I'm trying to understand memory usage for some coding patterns. A bit of code needs to be callable in hot loops and execute against some fixed values: a few ctypes pointers and integers. I have two options:
a) create an object with a method that, when called, gets the values from self. to do its work
b) create a closure or functools.partial that, when called, gets the values from captured local variables/bound params to do its work
(a) has slow attribute lookups, (b) avoids them, so I'd prefer to go with (b)
But I'm curious about memory usage. Are there any good ways to figure out the memory usage of a closure or a functools.partial? As compared to an instance of a slotted class or tuple subclass?
!e ```py
from sys import getsizeof as s
class X:
    __slots__ = ('a',)
print(s(X()))
add = lambda x: lambda y: x + y
print(s(add))
print(s(add(2)))
```
@dusk comet :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 40
002 | 152
003 | 152
40 bytes - just object with one field
152 bytes - function object with all function-related stuff
!e ```py
from sys import getsizeof as s
from functools import partial
add = lambda x, y: x+y
f = partial(add, y=2)
print(s(f))
```
@dusk comet :white_check_mark: Your 3.11 eval job has completed with return code 0.
80
80 bytes - partial object
I wonder if 80 bytes is accurate, because binding a large number of params does not increase the memory size
sys.getsizeof(functools.partial(foo, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28))
This makes me believe that getsizeof() is reporting the size of one object storing a reference to a larger tuple/array/something of params
that is a classic problem with getsizeof.
which I've hit before, hence my question
if i had to figure out the memory footprint of a functools.partial, i would dig into the source to see its structure.
partial() is implemented in C, I'll see if I can dig it up. (it's also implemented in python, but the C implementation actually gets used in practice AFAIK)
another classic pattern
Does anyone know what gets captured by closures, do they store a reference to the entire outer scope or only to the variables that their source refers to?
Like, in this example, are b and d eating up memory? Cuz closure() won't access them
def foo(a, b):
    c = 3
    d = 4
    def bar():
        a
        c
        return 0
    return bar
closure = foo(1, 2)
you can actually check this yourself, but no
>>> closure.__closure__[0].cell_contents
1
>>> closure.__closure__[1].cell_contents
3
>>> closure.__closure__[2].cell_contents
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
at least not on the python version I'm using (3.11.3)
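The same check can be done by name instead of by index, as a sketch: co_freevars lists the names the inner function captures and __closure__ holds one cell per name.

```python
def foo(a, b):
    c = 3
    d = 4

    def bar():
        a
        c
        return 0

    return bar


closure = foo(1, 2)

# Pair each captured name with the contents of its cell.
captured = dict(
    zip(closure.__code__.co_freevars,
        (cell.cell_contents for cell in closure.__closure__))
)
print(captured)  # only a and c appear; b and d are not captured
```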
No, it is not accurate. It is the size of object itself, it doesn't take into account any child objects.
For example, for function all these things aren't counted (some of them are shared): code, global namespace, func name, default args, default kwargs, closure tuple, ... (I definitely missed something)
Of all of these things only .__closure__ tuple is not shared (maybe default values are not shared too, im not sure), so you take it into account
__closure__ tuple contains several cell-objects, which are very similar to X class in my code - they contain only one extra reference
So, for functions:
152 - plain function
40 (idk how many exactly) + 8 per reference = 48 - tuple for closure
40 per reference = 40 - for cell objects
So, if a function has only one var in its closure:
152 + (40+8*1) + 40*1 = 240 bytes per function
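The arithmetic above can be measured rather than guessed. A rough sketch, with made-up helper names: it adds up the function object, its closure tuple and the cell objects, and ignores shared pieces such as the code object as well as the captured values themselves. Exact numbers vary by Python version and platform.

```python
import sys


def make_adder(x):
    def add(y):
        return x + y
    return add


def closure_footprint(func) -> int:
    """Per-function bytes: function object + closure tuple + its cells."""
    total = sys.getsizeof(func)
    cells = func.__closure__ or ()
    if cells:
        total += sys.getsizeof(cells)
        total += sum(sys.getsizeof(cell) for cell in cells)
    return total


adder = make_adder(2)
print(sys.getsizeof(adder), closure_footprint(adder))
```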
I guess partial contains these things:
- actual callable (if it is shared, it is 0 bytes per object)
- args - probably tuple of all args
- kwargs - probably a dict of all kwargs
- __dict__ - im not sure, maybe instance dict?
- list of weakrefs
- vectorcallfunc - no idea what this is, probably it is shared
So:
80 - partial itself
(I dont remember exact size of tuple, list and dict, you should test it yourself:)
40+something - args tuple (maybe it is not created if there is no positional args, so it is 0 bytes - you should check that)
80+something - kwargs dict (same)
40+something - weakrefs list (same)
= ~240bytes
Note: this is very inaccurate, it just shows how you could do it yourself
Also, all of this can vary between python versions and platforms
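For partial, a similarly rough sketch can add the containers reachable through its public attributes (args and keywords). partial_footprint is a made-up helper; it skips the wrapped callable, the bound values and any instance __dict__, and an empty args tuple is typically a shared object, so treat the result as an upper-bound estimate.

```python
import sys
from functools import partial


def add(x, y):
    return x + y


def partial_footprint(p: partial) -> int:
    """partial object + its positional tuple + its keywords dict."""
    return sys.getsizeof(p) + sys.getsizeof(p.args) + sys.getsizeof(p.keywords)


bound = partial(add, 1, y=2)
print(sys.getsizeof(bound), partial_footprint(bound))
```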
Thanks, I'm seeing that partial() gets a __dict__ that I don't need, it stores positionals in a tuple, but also gets a keywords dict that I don't need/want EDIT oh yeah, in what you linked, thanks
Sounds like lambda might be lighter, as long as I can verify each lambda created from the same statement really does share everything other than .__closure__ tuple. And as long as there isn't additional per-call overhead. Can't see why there would be
oh wait... those cell objects in __closure__ have overhead
Custom class with slots is definitely lighter.
Also, attr lookups are not that slow in 3.11, they are optimized a lot
unfortunately, they're still very slow compared to not-attr-lookups, I've been benchmarking and have researched the 3.11 optimizations
another factor I've been considering is that multiple types, subclasses, passing through a codepath don't play nice with 3.11's optimized attribute lookups, since the type changes each time
I guess I can create Xmillion of each kind, then run GC and check memory usage
That's probably the best way to measure memory usage
Can anyone think of a way that this would give inaccurate results? Would this ever count memory twice or fail to count some memory?
gc.collect()
print(sum([sys.getsizeof(o) for o in gc.get_objects()])) # ask garbage collector for array of all objects, call getsizeof() on each, sum result
Returns a list of all objects tracked by the collector
so it doesnt return all untracked objects: ints, strs, some tuples, ...
you might want to use https://github.com/bloomberg/memray
Since I want to track the per-lambda, per-partial, per-closure increase in memory usage, is that ok? Are all those untracked objects going to exist regardless, whether I have 1 closure or 2 million closures?
i think it would be good enough
but keep in mind that untracked objects are not counted
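A different technique from the gc.get_objects() sum, swapped in here as an alternative: the stdlib tracemalloc module can measure how much memory a batch of objects costs, including untracked ones. This is only a sketch; bytes_per_object and Slotted are made-up names, and the figure includes the list slot and any values each object keeps alive.

```python
import tracemalloc


class Slotted:
    __slots__ = ("a",)

    def __init__(self, a):
        self.a = a


def bytes_per_object(factory, count=10_000):
    """Average traced bytes allocated per object created by factory(i)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    objects = [factory(i) for i in range(count)]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del objects
    return (after - before) / count


per_closure = bytes_per_object(lambda i: (lambda: i))
per_slotted = bytes_per_object(Slotted)
print(round(per_closure), round(per_slotted))
```

Comparing the two numbers under the same conditions matters more than either absolute value.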
ok thanks
Can some cpython-[core]-dev please review these two PRs? They are really tiny, it probably wont take more than a couple of minutes to review them. Thank you!
https://github.com/python/cpython/pull/107407
https://github.com/python/cpython/pull/107410
Hello, as a feature, would it be possible to raise a compile-time error when close() is called inside a "with" block?
example:
with open("path/to/myFile","r") as fileObject:
    data = fileObject.read()
    fileObject.close()
fileObject.close() should be an error. It bothers me a bit when I see this type of code. It means that there is a small conflict with the "with" block
I gave an example with a file, but it could be a connection or anything else with a context manager
It's possible to do this with your own context managers, but there's no option to enable it for the built-in file objects.
You could write a wrapper that does this.
ty @quick snow for the info, the idea of the feature is to force devs to write clean code.
There might be linter checks that catch this
good idea with linter
As for the wrapper: I meant something like this:
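orig_open = open
class DirtyCodeError(Exception):
    """Raised when close() is called inside the with block."""
class open:
    def __init__(self, *args, **kwargs):
        self.fobj = orig_open(*args, **kwargs)
        self.in_context = False
    def __enter__(self):
        self.in_context = True
        self.fobj.__enter__()
        return self  # return the wrapper, so that close() below is the one that gets called
    def __exit__(self, a, b, c):
        self.in_context = False
        return self.fobj.__exit__(a, b, c)
    def close(self):
        if self.in_context:
            raise DirtyCodeError
        return self.fobj.close()
    def __getattr__(self, attr):
        # everything else is forwarded to the real file object
        return getattr(self.fobj, attr)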
fyi you can use builtins.open()
Creating a class that makes sure, that you do not do something that could be confusing and then naming it in a way that always makes it hard to see which open now is used and also shadows a builtin. That is 1 step forward 2 steps back in my opinion.
I know it is just an example, but it would be way better to name this anything else like context_only_open or so. It is still not clear but at least not shadowing.
Even in examples, because I can guarantee that close gets called by someone because of a lax example too.
@foggy sedge
i think that is fair, on first hearing
What's everyone's thoughts on PEP 623? (https://peps.python.org/pep-0623/)
Excerpt from an article:
A new Python proposal, PEP 623, plans to removes legacy strings, a necessary feature PythonMonkey depends on to efficiently pass string data back and forth. Legacy strings allow for a data buffer from anywhere; hence, it can point directly to a JSString's buffer. Without them, PythonMonkey will need to make a copy of the JSString's buffer, leading to significant time and memory inefficiencies.
Announcing PythonMonkey's alpha release - use Python code in JavaScript and vice versa with ease and virtually no performance loss!
i don't think it's copying the JS buffer
i just looked at the source of PythonMonkey and not much changes really happen
there's 0 copying involved
What do you mean? I figured it would need to change how it passes strings once the pep is implemented (but it's not implemented yet)
Right now it passes the strings reference back and forth
there's already a check for 3.12 there
and it doesn't change much
(PEP 623 is already implemented in 3.12 btw)
I thought the pep passed but wasn't implemented
deprecation started back in 3.10, and its been implemented in 3.12, which is now in the release candidate phase
It doesn't really affect python code, only c extensions
I saw that, in future version of Python, the bitwise inversion operator ~ will be removed (see : https://docs.python.org/3.13/whatsnew/3.13.html#pending-removal-in-future-versions).
Why will it be removed ?
Of course I may have misunderstood something.
it would be removed only for bools
why would it be removed only for booleans ?
I'm not against it, I just don't understand the motivation behind this decision.
its probably because ~True equalling -2 and ~False equalling -1 isn't very intuitive or useful for most people, so forcing the few use cases where you're purposefully using bitwise NOT on a bool to be explicit (e.g. ~int(True)) helps with code readability, and stops people who think it also means a logical NOT from making the mistake of using it in code
because it returns the integer implementation where a boolean would be expected
i don't think it would be valid to just change the implementation either, as that would be a silent breaking change as opposed to a visible error
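A short sketch of the difference being discussed; going through int() keeps the bitwise intent explicit, as suggested above.

```python
flag = True

logical = not flag        # logical NOT, gives a bool
bitwise = ~int(flag)      # bitwise NOT on the integer value 1
bitwise_false = ~int(False)

print(logical, bitwise, bitwise_false)
```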
What's the rationale for removing it if it breaks ability to just use strings from c backing stores easily
In pythonmonkey it breaks the ability for python to use JavaScript strings by reference
Or so I thought at least @rose schooner
I see it does make sense
the motivation is here: https://peps.python.org/pep-0623/#motivation
I believe it consumes 16 bytes per string on 64 bit systems, not 8. There's a pointer and a length, each 8 bytes
can someone help me understand this please?
>>> type(object)
<class 'type'>
>>> type(type)
<class 'type'>
>>> isinstance(type, object)
True
type is a subclass of object
then how come type(object) is type?
O_o
type is the metaclass of all classes, including object
That's because object is a class
object is like quantum physics. If you look at it too closely, you'll get a headache.
someone actually posted an image in python general that shows this pretty well
solid is subclass, dotted is instance
yes, the type here is made out of type
also missing all the dotted lines from object 
spiderweb of everything being an instance of object
that is amazing, thanks guys
if x is an instance of B and B is a subclass of A, then x is an instance of A, but a line from A to x would be redundant
so there are only lines from instances to direct classes:
Dog() <-.-.- class Dog <=== class Animal <=== class object
Dog() <-.-.- class object
Dog() <-.-.- class Animal
so everything in python is actually a type ?
every class is an instance of type
and everything is an instance of some class
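The relationships in this thread can be checked directly; a small sketch that collects them in one place (Dog and Animal are the made-up classes from the diagram above):

```python
class Animal:
    pass


class Dog(Animal):
    pass


facts = {
    "type(object) is type": type(object) is type,
    "type(type) is type": type(type) is type,
    "issubclass(type, object)": issubclass(type, object),
    "isinstance(type, object)": isinstance(type, object),
    "isinstance(object, type)": isinstance(object, type),
    "isinstance(Dog, type)": isinstance(Dog, type),
    "isinstance(Dog(), Animal)": isinstance(Dog(), Animal),
    "isinstance(Dog(), object)": isinstance(Dog(), object),
}
for text, result in facts.items():
    print(f"{text}: {result}")
```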
i was kinda getting at the "everything in python is an ~~object~~ aschtually `type`" meme
or instance of itself
```py
>>> class M(type): ...
...
>>> class X(M, metaclass=M): ...
...
>>> X.__class__ = X
>>> X is type(X) is X.__class__
True
>>> isinstance(X, X)
True
>>> issubclass(X, X)
True
```
Was reading asyncio issues on GitHub, but it didn't give me a clue about what people might be working on these days. Probably because of the lack of contributions (none) and because I don't have a clue about the current threads.
Does anyone know what I could work on in asyncio, what things could be added or what things need to be implemented?
maybe take a look through https://github.com/orgs/python/projects/29
don't think there is a lack of contributions, see e.g. the changes in 3.12: https://docs.python.org/3.12/whatsnew/3.12.html#asyncio
and previously 3.11 added TaskGroup among other things
sorry, I meant a lack of contributions from me. I have never ever contributed before.
oh I see! if you're interested in contributing then Gobot's list is good
so these are the current ones.
I wouldn't put too much weight on "Todo" vs. "In progress", often people end up not completing things
so just pick one and make a pr?
yes
Why did they decide to lump so many things under ast.Constant?
ast: The following ast features have been deprecated in documentation since Python 3.8, now cause a DeprecationWarning to be emitted at runtime when they are accessed or used, and will be removed in Python 3.14:
ast.Num
ast.Str
ast.Bytes
ast.NameConstant
ast.Ellipsis
Use ast.Constant instead. (Contributed by Serhiy Storchaka in gh-90953.)
Isn't that just more isinstance checks people using the ast would have to do then?
Which I do for validating user input on a gui program at work
match/case!
that's true
I don't know why that change was made, but maybe because for a lot of AST-processing code, these nodes are basically the same (see e.g. visit_Bytes, visit_Num etc. in https://github.com/quora/pyanalyze/blob/0e8482fb202e36252a0e34bd4acf4762a7f6e31b/pyanalyze/name_check_visitor.py#L2817)
pyanalyze/name_check_visitor.py line 2817
```py
def visit_Bytes(self, node: ast.Bytes) -> Value:
```
within the compiler they're also basically treated the same (compiled to LOAD_CONST)
That makes sense, to have them align more
in case you want to write code for both pre- and post-3.8, you can use something like this
https://github.com/decorator-factory/flake8-useless-assert/blob/master/flake8_useless_assert/patch_const.py#L4
flake8_useless_assert/patch_const.py line 4
```py
class LegacyConstantRewriter(ast.NodeTransformer):
```
I heard that in MicroPython, there are no regular int objects for memory-saving reasons.
Pointers work like this: if the lowest bit is 0, then the whole pointer is just a normal pointer to some normal object. However, if the lowest bit is 1, then the remaining bits contain an integer value. Therefore, there is no need to allocate an int object; it can be embedded into the pointer. (I'm probably wrong somewhere, but I'm talking about the idea.)
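The tagging idea as described can be modelled with plain integers. This only illustrates the scheme from the message above, not MicroPython's actual implementation; the helper names and the example address are made up, and it relies on real object pointers being aligned so that their lowest bit is 0.

```python
def tag_small_int(value: int) -> int:
    """Pack an integer into a 'pointer word': shift left, set the low bit."""
    return (value << 1) | 1


def is_small_int(word: int) -> bool:
    """Low bit 1 means the word holds an integer, not an address."""
    return word & 1 == 1


def untag_small_int(word: int) -> int:
    """Recover the integer; the arithmetic shift keeps the sign."""
    return word >> 1


aligned_address = 0x7F00_1000          # hypothetical object address, low bit 0
word = tag_small_int(-3)
print(is_small_int(aligned_address), is_small_int(word), untag_small_int(word))
```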
Why isn't this implemented in CPython? Is it bad for performance? Is it backwards incompatible? Does it save a significant amount of memory/reduce the number of allocations?
Another somewhat related question:
Which types are created/used the most number of times in an average program? int/str/dict/(frame obj)/bytes/None?
They are working on some changes to ints for 3.13, but I don't know the details. I don't think it goes as far as storing the int in the object pointer.
Re: pygen discussion just now: When an attribute lookup fails on an object, it's tried again on the class. But that second lookup doesn't work like a normal lookup because it doesn't work recursively.
!e ```py
class Foo(type):
    x = 1
class Bar(metaclass=Foo):
    y = 2
bat = Bar()
bat.z = 3
for attr in "xyz":
    for name, entity in [("bat", bat), ("Bar", Bar), ("Foo", Foo)]:
        try:
            print(f"{name}.{attr} =", getattr(entity, attr))
        except AttributeError:
            print(f"{name}.{attr}: AttributeError")
```
@quick snow :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | bat.x: AttributeError
002 | Bar.x = 1
003 | Foo.x = 1
004 | bat.y = 2
005 | Bar.y = 2
006 | Foo.y: AttributeError
007 | bat.z = 3
008 | Bar.z: AttributeError
009 | Foo.z: AttributeError
bat.x doesn't work even though Bar.x works.
fwiw this is the behaviour that i thought enums should use to avoid having to use descriptors to disallow member access from members
(its what i do to stop it at least)
Was wondering if any development going on with this: https://discuss.python.org/t/asyncio-for-files/31077
I would like to see asyncio for files, using the underlying OS asynchronous IO features (see Overlapped writing - windows or io_uring for unix. Blockquote import aiofile as a await a.print(f"example") async with a.open("myfile") as f: await f.readline()… aoifiles has an api for asynchronous io to files, but the implementation is using ...
Hi Data Gangs!!
if you want to start your journey as Data Engineers and gain some expertise in this field , you can start with this roadmap
!rule ads
(also, not the right channel in any case)
Hello! I'm curious if Cython 3.x.x will have a support of PEP 684. They only mention support of Python 3.12. And there is an exception raised if Cython is used in multiple interpreters.
no, for the reasons that they discuss in that thread
Still trying to find things to contribute to. Isn't there any list where they define what they want to implement in newer versions?
yes, pretty sure i sent you the link yesterday
Nope
actually i sent you a slightly different link, https://github.com/python/cpython/issues?q=is%3Aissue+is%3Aopen+label%3Aeasy
since i really think you should learn how to walk before running, specially with something as complex as asyncio
this is what they want help with
anything tagged with asyncio
Is there some reason for ABCMeta to not try to call instancecheck/subclasscheck from super for metaclass inheritance?
could you elaborate, also maybe try to come up with an example? So more people can understand.
They just don't try to call the corresponding methods of their super class, so cooperative inheritance doesn't work properly with it https://github.com/python/cpython/blob/917439d4d9bebdb5d2792bb5bba095b821fdf003/Lib/abc.py#L117-L123
Lib/abc.py lines 117 to 123
```py
    def __instancecheck__(cls, instance):
        """Override for isinstance(instance, cls)."""
        return _abc_instancecheck(cls, instance)

    def __subclasscheck__(cls, subclass):
        """Override for issubclass(subclass, cls)."""
        return _abc_subclasscheck(cls, subclass)
```
you don't know why?
If I knew I wouldn't be asking
It's probably mostly because not many people try to do cooperative multiple inheritance with metaclasses, so it just hasn't really come up :p
However, ABCMeta.__instancecheck__ and ABCMeta.__subclasscheck__ are both very performance-sensitive, and a lot of work has gone into speeding them up in the past. (Note that _abc_instancecheck and _abc_subclasscheck are both implemented in C.) I'd personally be pretty wary about changing these code paths.
It's just that you kinda have to use it if you have your own metaclass and want an ABC to go along with it, though I suppose it is rare enough that reordering inheritance should be easy enough and work
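A small sketch of the effect being discussed. LoggingMeta, ABCFirst and LoggingFirst are hypothetical names made up for this example; the point is only that the order of the metaclass bases decides whether the other metaclass's __instancecheck__ ever runs, since ABCMeta does not call super().

```python
from abc import ABCMeta

calls = []

class LoggingMeta(type):
    """Hypothetical metaclass that records every isinstance() check."""
    def __instancecheck__(cls, instance):
        calls.append(cls.__name__)
        return super().__instancecheck__(instance)

class ABCFirst(ABCMeta, LoggingMeta):
    pass

class LoggingFirst(LoggingMeta, ABCMeta):
    pass

class A(metaclass=ABCFirst):
    pass

class B(metaclass=LoggingFirst):
    pass

isinstance(1, A)  # ABCMeta.__instancecheck__ wins and never reaches LoggingMeta
isinstance(1, B)  # LoggingMeta runs first, then hands over to ABCMeta via super()
print(calls)
```

Putting the custom metaclass before ABCMeta in the bases is the reordering workaround mentioned above.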
would there be any backwards compatibility issues with making type(list[int]()) is list False but keeping isinstance(list[int](), list) True? ie can the called, subscripted class return a subclass of list?
I feel that can have some confusing consequences. Would isinstance(list[int](), list[int]) be true? If so, what if you have a separate subclass of list that also holds int?
no thatd still be a TypeError or whatever it is now
IMO type hints should stay further away from runtime as long as possible
Python should get rid of the global interpreter lock
It's being worked on...
Hard disagree
The stuff that pydantic and other libs do with them is amazing imo
I haven't minded the approach that's been taken so far (keep everything typing-related pretty contained and separate from the rest of the lang), though the new typing syntax to me seems like a shift towards typing being a more central part of python
"gradual typing" is when your dynamic language slowly implements type system features one by one
If someone somehow finds an application for pep 695 I'd be surprised
does this count as an application of PEP 695? https://github.com/python/cpython/blob/bb456a08a3db851e6feaefc3328f39096919ec8d/Lib/typing.py#L2686
Lib/typing.py line 2686
```py
class SupportsAbs[T](Protocol):
```
I'm saying more along the lines of a runtime application, akin to how pydantic uses type hints
pydantic already wants to use TypeAliasType
I wasn't aware, how would they use it?
the syntax is basically just sugar so if you use generics with pydantic (which i think you can already do?) thats another runtime use of PEP 695
I meant the built-in stuff
Inspecting annotations in runtime is okay
What's the benefit at that point, point of pydantic for serialization of json
(and is the original goal with pep 3107)
I'm not the best person to ask, I haven't really looked into it
Is there discourse on how they wanted to use it
One use case I imagine is generating automated documentation for swagger
Not sure where to put it so I'm giving the idea here. It occurred to me that there is actually a very simple way of efficiently compiling python code into C (at least as far as typing is concerned).
Python, using a JIT compiler, is almost exclusively used in debug mode by developers. This means that the debugger is constantly keeping track of the variables as it compiles the code.
Should a debugger produce a log of variable names and the C data types those variables adopted, a python compiler could use that log to efficiently compile python for production-ready code.
At the very least it will give you a quick and easy basis for experimentation. In practice it might be better to store such info in the "pycache" folder.
Basically, this could allow the debug mode of python to produce all the data that a compiler would need. Lots of time is wasted finding variable types, and this is a simple solution for that specific problem.
the debugger has fairly little to do with this, but this is mostly what the specializing adaptive interpreter in 3.11 does, just on a bytecode level, not a variable level. If I have an add that has been adding two ints the past 10 times, it is probably an int add the eleventh time as well.
If the python data is stored on memory, you could also infer number types by looking at the size of memory the python data takes. Developing a scoring system for indicators might be a better way to arrange the sequence of type checks. Not sure if this will actually speed things up though, depends on the scores used.
Very long memory implies strings, where arrays of many members implies numbers. These kinds of things could be used to creates scores that can be used to order type checking processes
But the cost of scoring has to be lower than the cost of checking for type and getting the first guess wrong (or using an average-based metric)
One example of a score is to use the distribution of the length of English words. The normalised frequencies for each word length add up to 1, so the frequency of seeing a word of a certain length can help determine if a sequence of characters is more likely to be letters or digits.
You can take The Pile and get the distribution of numbers per order of magnitude (frequency of 1-digit, 2-digit, ... numbers)
You can then statistically pre-determine if an array of chars is likely to be a word or a number
Frequency-based scoring has the advantage that the frequency can be measured and therefore hard-coded. You only need to determine which bin a length of characters falls into.
Not necessarily. BLOOM is a multilingual, open-source LLM. Their dataset could be used to get "universal" word-length and sentence-length frequencies (a score is a combination of metrics). It may be limited to distinguishing words from numbers though
Like if you have a list of items and need to determine the type of item, that's the ideal use case since you can score the members and get an average score
Might be useful for loops too
in this post https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-asyncawait-world/ the trio developer argues asyncio's stream read and write buffers are useless because the kernel's network stack has its own data buffers. Any thoughts on this? I am wondering if I should set the write watermark to 0 and reduce the read buffer size.
That's quite an old post
These days, the advice I've heard is that if you want performance, you should use the low level transports and protocols API instead of the high level streams API
Would there be any positive reception to a PEP proposing to revert the one from the beginning of Python 3 that removed the ability to unpack function parameters by expressing them in tuples, and explaining why each of the points it makes is wrong?
I had always thought it would be nice to have but didn't know it had been explicitly removed until recently
And it would make many things like mapping a function over an enumerate of a zip much nicer (after all, we can do that in list comprehensions)
probably best to start with a discussion on discuss.python.org, and I'd frame it primarily as a case for adding the feature to Python now, not a point-by-point rebuttal to the old PEP
besides here, would #media-processing be the place to ask about the skimage library?
at me if you know.
#media-processing or the help system (see #how-to-get-help) would be appropriate
hm interesting. I have built a socks proxy server and I was wondering about backpressure and RAM usage due to duplicated buffering in userspace and the kernel.
i added code block to documentation (pic2)
now doctests raises an error (pic1)
how can i fix that? should i add import struct line? or should i somehow ignore it?
if struct is not in the module globals
oh this is inside the struct module
then just Struct
how would doctest know that Struct should be in the global namespace?
this is confusing...
by grabbing a copy of the globals
see notes on execution environment here: https://docs.python.org/3/library/doctest.html?highlight=doctest
there are a lot of code block in one docs page
are they executed in the same context?
No, it doesn't work
Seems like i have to import struct module, despite it is already imported
Add a .. testsetup:: * section to the top of the document, like we do in the typing docs: https://github.com/python/cpython/blob/main/Doc/library/typing.rst?plain=1#L5
Doc/library/typing.rst line 5
```rst
.. testsetup:: *
```
It will be executed by doctest before every doctest snippet
Oh, thank you a lot!
No problem!
If I store a reference to an int, does the int object occupy extra memory?
E.g. I understand that a slotted class occupies 32+8*slot_count bytes. 8 bytes per reference. And if that reference points to a python object, that object also occupies additional memory.
When references point to an int, does every single referenced int occupy a full 28 bytes? (sys.getsizeof(1) returns 28)
put yet another way, what's the total memory consumption for class Foo: __slots__ = ('a'); foo = Foo() ; foo.a = 1 , ignoring memory consumed by the class declaration
I don't understand what you mean by "does the int object occupy extra memory?" - "extra" relative to what?
the int object is whatever size it is, regardless of what references to it exist. The reference itself costs you sizeof(PyObject*) bytes, so yeah - 8 bytes per reference on a 64-bit system.
Relative to the int being stored inline within the slot, sorta like would happen in a C struct, where it's not storing a pointer to an int, it stores the int's value inline
I'm just not sure if that's an optimization that python does, I know 3.11 and 3.12 introduce new optimizations
it always stores the pointer
it wouldn't make sense for it to ever store anything other than the pointer. It doesn't know how large an arbitrary object is, so it can't save enough space to include a copy of any arbitrary object in its slots array
and as you say, the integer is much bigger than the pointer, so even if it could inline the int object into the slots array, that would likely take up more memory, not less
(at least, assuming the reference count of the int is >1)
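To put numbers on the question, a quick measurement sketch. The exact sizes depend on the CPython version and platform, so the values are printed rather than assumed; Foo is the same single-slot class from the question.

```python
import sys

class Foo:
    __slots__ = ("a",)

foo = Foo()
foo.a = 1

slot_object = sys.getsizeof(foo)    # the instance itself, including its one reference slot
int_object = sys.getsizeof(foo.a)   # the int is a separate object that the slot points to
print(slot_object, int_object, slot_object + int_object)
```

The sum is an upper bound per instance: if several slots point at the same int object, that int is only stored once.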
I wouldn't be that confident. I could envision a tagged pointer scheme that stores small values (ints, strings) inline without a pointer
well to be fair, v8 does this kind of optimization
yeah, beat me to it. Google says it's "tagged pointers" and I guess you know more about it than me
yeah, fair enough. I suppose that's doable for specific immutable types. It's not doable in general, or for mutable types, though.
regardless, thanks for confirmation that it does not do that today
I think there's been some talk in the Faster CPython project about using tagged pointers, but it's not been implemented so far
wow that's kinda crazy, so even stuff like a = b + c is triggering a 28 byte heap allocation to store the result
possibly, though small integers are cached
and it's not necessarily 28 bytes - large integers can require an arbitrarily large number of bytes
true, "28 or more"
and it depends on what you mean by "heap allocation" - it'll be served by Python's allocator, not by malloc
what are the implications of small integers being cached? Like what does that mean?
and there are free lists to optimize heavily allocated types...
there is basically only one instance of 1
there is exactly one copy of the integer values -5 through 255 (IIRC, might be off by a few on one end of that range or the other)
if you try to create a new 1, the interpreter just gives you back the existing one
just like you can't create a second None
demonstration:
```py
In [5]: def diff(a, b):
   ...:     return (a - b) is (a - b)
   ...:

In [6]: def add(a, b):
   ...:     return (a + b) is (a + b)
   ...:

In [7]: diff(1001, 1000)
Out[7]: True

In [8]: add(1001, 1000)
Out[8]: False
```
because there's only one 1 but there can be multiple 2001
Ok, that's relevant in my case, I want to efficiently store some memory regions as pairs of offset and byte count. The count is low -- 0 to 8 or something like that -- but the offset is probably quite large, somewhere in a big buffer.
So storing those 2x ints is 2*8 for the references, but only 1*8 for the values
use the array module for packed integer data
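A minimal sketch of what that could look like for the offset/count pairs described above, assuming two parallel arrays are acceptable. The variable names are made up; the type codes "Q" (unsigned 64-bit) and "B" (unsigned byte) are from the array module.

```python
from array import array

# Offsets can be large, so use unsigned 64-bit items; counts fit in one byte each.
offsets = array("Q", [1_000_000, 2_500_000, 9_999_999])
counts = array("B", [3, 8, 0])

# Raw payload size: no per-element int objects and no per-element references.
payload_bytes = offsets.itemsize * len(offsets) + counts.itemsize * len(counts)
print(payload_bytes)

for offset, count in zip(offsets, counts):
    print(offset, count)
```

Reading an element still creates a regular int object on access; the saving is in what is kept alive while the data sits in the array.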
Thanks, that doesn't work for this use-case, there's other details in play.
if efficiency matters a lot to your use case, you may want to consider not using python in first place
yeah, I know. I'm contributing to an external project where introducing non-python to the codebase is a non-starter
But I can still optimize within the constraints. And it's fun
https://github.com/python/cpython/blob/ed25f097160b5cbb0c9a1f9a746d2f1bbc96515a/Objects/bytearrayobject.c#L757
I don't get why bytearray __init__ needs to do this. Why can't it just realloc and then overwrite the existing data?
i wonder if 722 and 723 are both going to get accepted
that might be kind of nice. pep 722 being kind of a shorthand for 723
!p 722
!p 723
I hope 723 doesn't get accepted. I hate the fact that toml was even picked, it's horrible, writing anything large is a horrible experience. Writing in a triple quoted string is going to be just as bad
There is one thing that I can't wrap my head around. I just learnt about thread-local-storage and contextvars . I went through an example where they were really useful, so I could make my threads see the global state(global-var).
color = ContextVar('color', default='Unknown')
# define all the colors we want to work with
colors = ['red', 'green', 'blue', 'yellow', 'orange', 'purple']
# create a thread for each color
threads = [Thread(target=task, args=(col,)) for col in colors]
# start threads
for thread in threads:
    thread.start()
# wait for threads to terminate
for thread in threads:
    thread.join()
so what I need help with is to understand why all of these things are not implemented by default.
Why threads can't see the same context: Is there any underlying reason why the latter is not defined, or is it part of some basic computer architecture principles, that I'm not aware of.
I don't think I understand the question... The whole point of context-local variables is that each execution context has its own value for that variable. They do some magic behind the scenes to make one variable refer to different values in different execution contexts
well, I thought that this is how context works by default.
and why is it being used in asynchronous programming?
It's how ContextVar works. Any non ContextVar variable will have the same value in every execution context
You can think of a ContextVar as holding a mapping from execution context to value, and automatically giving you the correct value for the context that's running
!e
import contextvars
# with context-vars
var = contextvars.ContextVar('var', default=10)
def main2():
    global var
    var.set(12)
    print("var", var.get())
ctx = contextvars.copy_context()
ctx.run(main2)
print("var", var.get())
# by default
value = 10
def main():
    global value
    value = 12
    print("value", value)
main()
print("value", value)
@cyan raven :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | var 12
002 | var 10
003 | value 12
004 | value 12
well, yes, I tried to represent it somehow.
you use context- or thread-local variables whenever there's some variable that you want to be shared between functions running in the same task or thread, and you don't want to (or can't) pass it explicitly as an argument from one function to the next
one common example is that you might store information about the request being processed by each thread of execution in an HTTP server, so that if you need to log a problem, you have access to information about the request that you failed to process no matter what part of the processing failed
that lets you, for instance, log a request id along with any message that you need to log, any time during processing that request
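A small sketch of the request-id use case, assuming one thread per request. The names request_id, log and handle are invented for the example; the only real APIs used are contextvars.ContextVar and threading.Thread.

```python
import contextvars
import threading

request_id = contextvars.ContextVar("request_id", default="-")
log_lines = []

def log(message):
    # No request id is passed in; it is looked up from the current context.
    log_lines.append(f"[{request_id.get()}] {message}")

def handle(req):
    request_id.set(req)   # only visible in this thread's context
    log("processing")

threads = [threading.Thread(target=handle, args=(f"req-{n}",)) for n in range(3)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

log("main thread")        # the main thread never set a value, so it sees the default
print(log_lines)
```

The same log function works unchanged when each request is handled by an asyncio task instead of a thread, which is the main reason to prefer ContextVar over threading.local.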
thank you
it sounds like you just dont like pyproject
personally i think toml is a good config format, and it makes sense imo instead of something like a yaml or json transliteration of the pyproject format
im not a huge fan of 722, i think instead we might want a pep that defines a "metadata block", and inside that we can do whatever
i dont like the design of 723 with the __pyproject__ string, i'll give you that
especially because toml itself supports triple-quoted strings
id rather have this:
#!/usr/bin/env python3
# -*- pyproject -*-
# [project]
# dependencies = [
# 'sqlalchemy',
# 'click',
# ]
# -*-
if __name__ == "__main__":
    ...
i too would prefer some sort of baked in syntax thats stripped at run time but I doubt it'll get support because why should metadata cost a runtime hit
well 723 proposed __pyproject__ but that's not really what i had in mind... i was thinking more like perl __DATA__
perl did it the smart way. metadata goes at the end, parsing can stop early if needed
however realistically python startup and module import can be pretty slow anyway, and if you're jamming lots of metadata into your script you probably don't care that much about startup performance
i just proposed some really outlandish stuff in the pep 723 thread, let's see what happens
I don't know where this question would go, I am putting this here because the channel has "internal" in its name.
Whenever I install a python package via pip, its going to be installed in either the purelib, scripts, or data installation paths right? Or are packages of the platlib, stdlib or platstdlib kind also installable?
I couldn't find examples for the same
I wouldn't expect it to ever use stdlib or platstdlib. Those are for the Python standard library, not for 3rd party libraries installed via pip. Some pip installed libraries will use platlib, though. Anything with an extension module would - for instance, numpy
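To see where each of those install schemes points on a given interpreter, sysconfig can list them. This is just an inspection sketch; the paths printed will differ per machine and virtual environment.

```python
import sysconfig

paths = sysconfig.get_paths()

# purelib/platlib/scripts/data are where pip-installed distributions land;
# stdlib/platstdlib belong to the standard library itself.
for key in ("purelib", "platlib", "scripts", "data", "stdlib", "platstdlib"):
    print(f"{key:>10}: {paths[key]}")
```

On many installs purelib and platlib resolve to the same site-packages directory, which is part of why the distinction is easy to miss.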
Oh yes extension modules being platlibs would indeed make sense, thanks!
Any experience contributing to CPython using pycharm? it's throwing a bunch of setuptools not found errors and it can't understand the build version. (It's saying 3.10). maybe it is because of the visual studio(which needed to be installed)?
so I just set up the system interpreter to "Pcbuild/amd64/python_d.exe"
import sys
import threading
i = 0
sys.setswitchinterval(1e-6)
f = 10_000_000
def test():
    global i
    for x in range(f):
        i += 1
threads = [threading.Thread(target=test) for t in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(i)
assert i == f * 10, i
this seems to consistently not fail on the assert on 3.11
are the specializations part of it?
i see the diffs between the output of 3.11 and 3.9 (which does fail on my machine) being PRECALL and RESUME
I have a vague memory of seeing += made atomic on ints in some changelog/proposal somewhere
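Whatever the interpreter happens to do on a given version, the language does not promise that += on a shared int is atomic, so the portable version of the test above guards the update with a lock. A reduced sketch (fewer iterations, names changed):

```python
import threading

counter = 0
lock = threading.Lock()
iterations = 10_000
n_threads = 10

def work():
    global counter
    for _ in range(iterations):
        with lock:          # makes the read-modify-write a single critical section
            counter += 1

threads = [threading.Thread(target=work) for _ in range(n_threads)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```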
tf
https://peps.python.org/pep-0583/
this is withdrawn
Python Enhancement Proposals (PEPs)
!pep 686
that's set for 3.15. is it the furthest-out version-set difference?
if i define a class like ```py
class Foo:
    def __add__(self, other): pass
```
yeah tp_as_number.nb_add I believe
that makes things painful, thanks
In what python version was the | or operator introduced for typehints?
3.10
(fwiw, it is called a union when you use it on types (and also pre-3.10 is typing.Union))
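A short comparison of the two spellings. The version check is only there so the snippet also runs on interpreters older than 3.10, where the | form on types raises TypeError.

```python
import sys
from typing import Union

old_style = Union[int, str]

if sys.version_info >= (3, 10):
    new_style = int | str                    # PEP 604 syntax
    same = new_style == old_style            # the two spellings compare equal
    accepts = isinstance("x", new_style)     # unions also work with isinstance on 3.10+
else:
    new_style = old_style
    same = True
    accepts = isinstance("x", (int, str))

print(same, accepts)
```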
I have been digging through CPython sources for a while trying to find where exactly "normal class instances" get __dict__ set, I believe it's in object.__new__ but having trouble finding it. Anyone have any pointers on what I should look at?
Objects/dictobject.c line 5409
```c
_PyObject_InitializeDict(PyObject *obj)
```
!e looks like the error message for setting f_lineno is incorrect in 3.11 ```py
import sys
# should say "f_lineno can only be set in a trace function"
# instead it says "can't jump from the 'call' trace event of a new frame"
sys._getframe().f_lineno = 0
```
@pliant tusk :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 4, in <module>
003 | sys._getframe().f_lineno = 0
004 | ^^^^^^^^^^^^^^^^^^^^^^^^
005 | ValueError: can't jump from the 'call' trace event of a new frame
ah its fixed in 3.13 sweet
When 3.12/3.13 snek?
probably as soon as 3.12 is out of prerelease
we upgraded the bot to support 3.11 while 3.11 was still a release candidate. We could do the same for 3.12. https://github.com/python-discord/bot/pull/2218
I've just suggested it in #dev-contrib message
https://github.com/python/cpython/blob/fc23f34cc9701949e6832eb32f26ea89f6622b82/Objects/bytearrayobject.c#L443C21-L443C21
ob_start only exists to do this shrinking trick?
https://github.com/python/cpython/blob/fc23f34cc9701949e6832eb32f26ea89f6622b82/Objects/bytearrayobject.c#L222
It seems like ob_start, in the one place where it gets changed to something other than ob_bytes, it always just gets set back to being equal to ob_bytes when PyByteArray_Resize is called. So I'm wondering why it even needs to exist.
looks like it, yeah - it's for efficiently removing bytes from the beginning of the bytearray when you do del some_bytearray[0:5] or some_byte_array[0:5] = ()
it avoids a copy (well, memmove) in that case
if I had to guess, that's a worthwhile optimization because bytearrays are often used for things like buffering data that's received from a socket - and for that usage pattern, you're repeatedly consuming things from the start of the array and appending things to the end of the array
yep - that's it. https://bugs.python.org/issue19087
There is no bytedeque().
And FIFO buffers are quite common when writing parsers
for network applications.
PyByteArray_Resize() is always called afterwards to ensure
that the buffer is resized when it gets below 50% usage.
This is what I was missing. PyByteArray_Resize only reallocates it after shrinking when it goes below 50%, not any time it gets called.
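The usage pattern that optimization targets looks roughly like this: append at the end, consume from the front. The byte strings are placeholders standing in for data read from a socket.

```python
buffer = bytearray()

# Data arrives in chunks and is appended at the end.
buffer += b"HEADER"
buffer += b"payload..."

# The parser consumes a fixed-size prefix from the front.
header = bytes(buffer[:6])
del buffer[:6]   # front deletion is the case the ob_start trick is meant to make cheap

print(header, buffer)
```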
Thanks for posting this link
https://github.com/python/cpython/blob/2135bcd3ca9538c6782129f9a5837d62c2036102/Objects/bytearrayobject.c#L185
When you make a new bytearray, the buffer isn't initialized at all and isn't null terminated because it leaves PyByteArray_Resize at this size.
Objects/bytearrayobject.c line 185
```c
if (requested_size == Py_SIZE(self)) {
```
!e ```py
import sys
b = bytearray()
print(sys.getsizeof(b))
b = bytearray(0)
print(sys.getsizeof(b))
b = bytearray([])
print(sys.getsizeof(b))
b = bytearray("", "ascii")
print(sys.getsizeof(b))
b.__init__(1)
print(sys.getsizeof(b))
b.__init__(0)
print(sys.getsizeof(b))
```
@sand goblet :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 56
002 | 56
003 | 56
004 | 56
005 | 58
006 | 57
Is that a mistake? It seems like it would be better if it was initialized.
if (Py_SIZE(self) != 0) {
    /* Empty previous contents (yes, do this first of all!) */
    if (PyByteArray_Resize((PyObject *)self, 0) < 0)
        return -1;
}
In bytearray___init___impl I think this part could be replaced by something like this:
/* Empty previous contents (yes, do this first of all!) */
if (Py_SIZE(self) != 0 && !_canresize(self)) {
    return -1;
}
void* sval = PyObject_Realloc(self->ob_bytes, 1);
if (sval == NULL) {
    PyErr_NoMemory();
    return -1;
}
self->ob_bytes = self->ob_start = sval;
Py_SET_SIZE(self, 0);
self->ob_alloc = 1;
self->ob_bytes[0] = '\0'; /* Trailing null byte */
Then it should always be initialized and have the null terminator
I have no idea if anyone else would ever need this feature or if this feature is even worth it. But I think optional injections should be a thing. #type-hinting message
TL;DR
Function looks like
@inject_missing_a
def foo(a: A) -> A:
    return a
But the signature of that function would look like:
(function) foo(a: A | None = None) -> A
The reason this is currently impossible is that = None part. Making a show up as A | None is possible, but that is useless on its own: a call is only valid if you explicitly pass either an A or None, and leaving the argument out is still an error.
# Current limited implementation
foo() # Not valid
foo(None) # "valid"
But ideally, that first call of just foo() would be valid.
What resources should I know about to investigate if this change is
A) possible
B) practical
C) even potentially worth doing and suggesting as a PEP
This is just a random thing and not like it is actually important. But I feel like this should be possible
if you'd want to suggest it as a pep you'd need some use cases to justify a completely new feature that seems very finicky in implementation
im sure the folk in #esoteric-python could show you how to implement this at runtime but i highly doubt something like this would ever get accepted by type checkers
If you don't care about the argument name and only have one argument this is already possible, make a generic callable protocol class _C[T, U](Protocol): def __call__(self, arg: T = ..., /) -> U: ... and def inject_missing_a[T, U](fn: Callable[[T], U]) -> _C[T | None, U]: ...
Does someone know where async generators are being implemented in CPython?
finally, there is a useful ToC in 3.13 docs, i like that
D:\>py -2.7
Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:25:05) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 123
File "<stdin>", line 1
123
^
SyntaxError: invalid syntax
>>> ^Z
D:\>py -2.7-32
Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:19:08) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 123
File "<stdin>", line 1
123
^
SyntaxError: invalid syntax
>>> ^Z
why my pythons 2.7 dont work? (at least repls dont work)
i tried reinstalling, but that didnt help
>>> import platform
>>> platform.architecture()
('64bit', 'WindowsPE')
>>> platform.platform()
'Windows-10-10.0.19045-SP0'
>>> platform.system()
'Windows'
>>> platform.win32_edition()
'Core'
>>> platform.win32_is_iot()
False
>>> platform.win32_ver()
('10', '10.0.19045', 'SP0', 'Multiprocessor Free')
>>> platform.version()
'10.0.19045'
>>> platform.machine()
'AMD64'
>>> platform.processor()
'Intel64 Family 6 Model 142 Stepping 12, GenuineIntel'
wait so those things work but "123" doesn't?
I suppose it's the universe telling you to stop using 2.7
i think windows might have something to do with it but i'm not sure
I wonder... Does IDLE work? That sort of looks to be like a terminal emulator issue
i wonder if \r\n is responsible
i dont have idle, but other things seems to work: ```py
py -2.7 -m pprint
_safe_repr: 1.70600008965
pformat: 8.71899986267
>py -2.7 -c "print 123"
123
```
yeah, probably \r\n cause some problems in repl or something like that
all of this works fine (without syntax errors): ```py
D:\>py -2.7 -c "compile('\n\r','','exec')"
D:\>py -2.7 -c "compile('\r\n','','exec')"
D:\>py -2.7 -c "compile('\r','','exec')"
D:\>py -2.7 -c "compile('\n','','exec')"
D:\>py -2.7 -c "compile('','','exec')"
```
yes, presumably your terminal is sending some weird character to the REPL or otherwise interacting poorly
i managed to install ipython, and it looks weird: ```py
D:\>py -2.7 -m IPython
[TerminalIPythonApp] ERROR | Exception while loading config file C:\Users\denba\.ipython\profile_default\ipython_config.py
Traceback (most recent call last):
  File "D:\Programs\Python\2.7\lib\site-packages\traitlets\config\application.py", line 563, in _load_config_files
    config = loader.load_config()
  File "D:\Programs\Python\2.7\lib\site-packages\traitlets\config\loader.py", line 457, in load_config
    self._read_file_as_dict()
  File "D:\Programs\Python\2.7\lib\site-packages\traitlets\config\loader.py", line 489, in _read_file_as_dict
    py3compat.execfile(conf_filename, namespace)
  File "D:\Programs\Python\2.7\lib\site-packages\ipython_genutils\py3compat.py", line 280, in execfile
    exec(compiler(scripttext, filename, 'exec'), glob, loc)
  File "C:\Users\denba\.ipython\profile_default\ipython_config.py", line 11
    c: t.Any = get_config()
     ^
SyntaxError: invalid syntax
D:\Programs\Python\2.7\lib\site-packages\win_unicode_console\__init__.py:31: RuntimeWarning: sys.stdin.encoding == 'cp866', whereas sys.stdout.encoding == 'utf-8', readline hook consumer may assume they are the same
  readline_hook.enable(use_pyreadline=use_pyreadline)
Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:25:05) [MSC v.1500 64 bit (AMD64)]
Type "copyright", "credits" or "license" for more information.
IPython 5.10.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
In [1]: 1
Out[1]: 1
In [2]: 2
Out[2]: 2
In [3]:
```
there is some random syntax error, but repl itself works fine
oh, i get it. This c: t.Any = get_config() is from my 3.11 config, and a: b = c is invalid syntax in 2.7. That makes sense
But ipython works fine...
I was wondering, if I want the Python code base to follow the Python coding standards, what would be the process? I just want to help make Python consistent.
I was just reading this pep proposal: https://peps.python.org/pep-0492/
There is a line, saying: "It is a SyntaxError to have yield or yield from expressions in an async function."
Well, this is quite strange to me, not sure how async generators are being established then.
that was added in a later PEP
this one? https://peps.python.org/pep-0525/
yes
hi
Whenever I open a PR to CPython, do I have to add a Signed-off-by line?
There is a bot enforcing that developers sign off (DCO stuff).
it is good practice to do it yes.
besides it's easy to do git commit --signoff
>>> for i in range(4):
...     if i == 2:
...         break
...     print(i)
... else:
...     print('here')
...
0
1
There is this pattern in python that really confuses me: the for else. I wonder if this is really ever used, and if so, is there an advantage to this code pattern? 'here' will only be printed if the loop finishes. That to me is the confusing part of this. It would make more sense to me if the else ran when the loop doesn't reach the end; instead it runs when it does reach the end.
it's like the else: in a try-except-else
it only happens if the program doesn't "break"
the try-finally you mean or the try except ?
it look like try finally ?
both can have an else: clause
!e ```py
try:
    raise ValueError
except:
    print("a") # happens because there is a "break" in the program (raise ValueError)
else:
    print("b")
try:
    pass
except:
    print("a")
else:
    print("b") # happens because there is no "break" in the program
```
@rose schooner :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | a
002 | b
I see. I didn't know about that one. So are these used? Is there an advantage to them? And why have these patterns?
not really sure but i've used them a couple of times
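One commonly cited advantage of try/except/else: code in the else block is not covered by the except handlers, so an exception of the same type raised by the follow-up code is not swallowed by accident. A minimal sketch; `lookup_then_process` and its arguments are hypothetical names for illustration.

```python
def lookup_then_process(table, key, process):
    try:
        value = table[key]       # only this lookup is guarded
    except KeyError:
        return "missing"
    else:
        # Runs only if the lookup succeeded. A KeyError raised inside
        # process() propagates instead of being reported as "missing".
        return process(value)


def buggy(value):
    # Hypothetical callback with its own, unrelated KeyError.
    return {}[value]


print(lookup_then_process({"a": 1}, "a", lambda v: v + 1))
print(lookup_then_process({"a": 1}, "b", lambda v: v + 1))
```

If `process(value)` were placed inside the try block instead, the KeyError from `buggy` would be caught and misreported as a missing key.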
Do you remember why you chose that pattern? I mean, I don't know a reason to code this, and I don't want to confuse myself or others, because for/else seems like the opposite of what it should be.
> ```py
> >>> try:
> ...     print('a')
> ... except:
> ...     print('b')
> ... else:
> ...     print('c')
> ... finally:
> ...     print('d')
> ...
> a
> c
> d
> ```
else: is like "if there wasn't an interruption"
since in a try-except-else the except clauses handle the interruptions
ok so you're saying the break is like the except here.
the else: clause in for/while probably did the same thing for consistency even if there isn't a handler for break
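A typical place where for/else shows up is a search loop: break when the item is found, and let the else handle the "nothing matched" case without a separate flag variable. A minimal sketch, with `first_negative` as a made-up example function:

```python
def first_negative(numbers):
    for n in numbers:
        if n < 0:
            found = n
            break            # leaving via break skips the else block
    else:
        # Runs only when the loop finished without hitting break,
        # including when 'numbers' is empty.
        found = None
    return found


print(first_negative([3, -1, -5]))
print(first_negative([1, 2]))
```

One way to read it: the else pairs with the `if ...: break` inside the loop, not with the `for` itself.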
Yeah, I think I will always avoid them, as they bend the general English meaning of else: an alternative path. It would make sense if the else were the alternative to the other block (the for).
By the way, is there a pattern for do/while? I really think that is more useful than while 1: stuff.
do
    x += 1
while (x < 5)
Hey buddy, there is a dedicated python-help channel; ask there.
i know, but no one answers me
ok, i will help you there. this channel is for other stuff
mm ok, i did not know what this channel was for
This one is specific to traits of the Python language. your error is general
just the obvious one - ```c
do {
    // stuff
} while (condition);
``` becomes ```py
while True:
    # stuff
    if not condition:
        break
```
or alternatively: ```py
done_once = False
while not done_once or condition:
    done_once = True
    # stuff
```
so that's the difference between while and do...while, I've never known until now 
a do/while loop always runs at least once, because the condition is checked at the end of each iteration instead of the start
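To make the "at least once" part concrete, here is a small sketch comparing the `while True` / `break` emulation with a plain `while`, using the `x < 5` condition from the question. The function names are invented for the example.

```python
def do_while_demo(x):
    runs = 0
    while True:
        runs += 1
        x += 1               # the loop body
        if not (x < 5):      # condition checked at the END of each iteration
            break
    return x, runs


def while_demo(x):
    runs = 0
    while x < 5:             # condition checked at the START of each iteration
        runs += 1
        x += 1
    return x, runs


# Starting from a value that already fails the condition:
print(do_while_demo(10))     # the body still runs once
print(while_demo(10))        # the body never runs
```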
was wondering what cpython uses for that?
I've never done git commit --signoff and I'm a core dev
We don't require it or use it for anything AFAIK
Which bot are you referring to? I don't think we have a bot that does this
this
yes
cpython-cla-bot
AFAIK that bot doesn't look at the signed-off-by stuff in the commit. It just looks at the email associated with the author of the commit
That project is archived and no longer used; we use the CLA-bot now instead of the Knights
could you link me the source code of that bot?
denball already did
Actually they didn't provide a direct link. Anyway we use https://github.com/ambv/cla-bot, which is a fork of https://github.com/edgedb/cla-bot
The fact that we had to fork it indicates that you might also have to fork it and make some modifications if you want to use it in your project
But I have no idea
Łukasz is the guy to ask
oh yeah, been needing to ask: it would be nice if someday we could get a flag to mark a method as a coroutine (for module methods and type object methods that basically use the built-in event loop), if and only if the code they run executes with an event loop already running.
I have such a use case for one and I do not like cython at all.
due to portability and also how said code looks.
(and the fact that AI will try to slap on CO_COROUTINE to the method's PyMethodDef entry for it if you let it and don't point out that it's not possible yet.)
Hi guys, relating to this I've been trying to set this thing up following this: https://github.com/edgedb/cla-bot/wiki/Configuration
and I'm kinda confused about what the db would be for. Like, why is an EdgeDB needed? I'm sorry if it is obvious, but I'm confused. Also, when it says "Use the provided migrations to create the structure of the database", what should the structure look like, and what would I base it on?
what is a frozen module and is it important?
i need to compile python 3.12 as an embedded scripting language for a c++ game
but it is under a completely different build system (it is called waf)
so i cant use the fancy magic code generators from make
I'm not the right person to ask about this; consider posting at https://discuss.python.org/c/async-sig/20 to get the attention of one of the async experts on the core dev team
Not too familiar with the system, but I believe some import-related modules are frozen for bootstrapping reasons; if they weren't, we'd need the import system to import itself
A few other stdlib modules are frozen as a startup optimization
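If you want to see which modules are frozen in a given interpreter, one rough way is to ask `importlib.machinery.FrozenImporter` for a spec. This is only a sketch: `is_frozen` is a helper name made up here, and the results for stdlib modules such as `os` can differ between Python versions and build options.

```python
from importlib.machinery import FrozenImporter


def is_frozen(name):
    # FrozenImporter.find_spec returns a ModuleSpec for frozen modules
    # and None for everything else.
    return FrozenImporter.find_spec(name) is not None


# '_frozen_importlib' is the frozen copy of importlib._bootstrap that the
# interpreter needs in order to bootstrap the import system.
for name in ("_frozen_importlib", "zipimport", "os", "json"):
    print(name, is_frozen(name))
```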
so how do i freeze these modules without using the provided make files?
you probably don't need to
no idea how to do it without a Makefile, though. The easiest bet is probably to have your build system invoke make
the build system is itself just a python script
then you can use subprocess
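Something along these lines, as a rough sketch: `make_command` and `run_make` are invented helper names, the `source_dir` you pass in and any targets are placeholders, and it assumes `./configure` has already been run in the CPython source tree so that a Makefile exists.

```python
import subprocess


def make_command(targets=(), jobs=1):
    # Build the argv list for make; 'targets' are whatever Makefile
    # targets you need (none means the default target).
    return ["make", f"-j{jobs}", *targets]


def run_make(source_dir, targets=(), jobs=1):
    # check=True raises CalledProcessError if make exits with an error,
    # so the calling build script can fail loudly.
    return subprocess.run(make_command(targets, jobs), cwd=source_dir, check=True)
```

The build script would then call `run_make(...)` with the path to the CPython checkout before linking against the result.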
the issue with that, is that it wont really play well with the installation process of the build system
I wish that asyncio could be imported by the c api much like the datetime module can, I just like how that sort of thing works lol.
Python proposal (inspired by Lua!):
args = {'a':1, 0:98, 1:99, 'b':2}
f(**args)
would be the same as: f(98, 99, a=1, b=2)
I've long wished to have one thing that could serve as both *args and **kwargs. Int keys are already invalid, so nothing would break.
sounds pretty based
one of these days I will need to understand what "based" means
args = {'a': 1, 0: 98, 1: 99, 'b': 2}
def f(*args, **kwargs): return args, kwargs
def g(**kwargs): return kwargs
f(**args) == ((98, 99), {'a': 1, 'b': 2})
g(**args) == {0: 98, 1: 99, 'a': 1, 'b': 2} # or error
what would happen if the function takes no positional args? will {0: 98, 1: 99} be passed into **kwargs? or will a TypeError be raised?
yeah, there are many error situations to sort out.
TypeError would be raised.
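For experimenting with the idea in today's Python, the proposed behaviour can be approximated with a helper that splits the dict before calling. This is only a sketch of one possible reading of the proposal; `split_mixed` and `call_mixed` are made-up names, and the rule that int keys must be exactly 0..n-1 is an assumption.

```python
def split_mixed(mapping):
    # Int keys (excluding bools) become positional args, str keys become keywords.
    positional = {k: v for k, v in mapping.items()
                  if isinstance(k, int) and not isinstance(k, bool)}
    keywords = {k: v for k, v in mapping.items() if isinstance(k, str)}
    if len(positional) + len(keywords) != len(mapping):
        raise TypeError("keys must be int (positional) or str (keyword)")
    # Assumption: positional keys must be exactly 0, 1, ..., n-1.
    if sorted(positional) != list(range(len(positional))):
        raise TypeError("positional keys must be 0..n-1 with no gaps")
    args = tuple(positional[i] for i in range(len(positional)))
    return args, keywords


def call_mixed(func, mapping):
    args, kwargs = split_mixed(mapping)
    return func(*args, **kwargs)


def f(*args, **kwargs):
    return args, kwargs


def g(**kwargs):
    return kwargs


mixed = {'a': 1, 0: 98, 1: 99, 'b': 2}
print(call_mixed(f, mixed))  # same as f(98, 99, a=1, b=2)
```

With this helper, `call_mixed(g, mixed)` raises TypeError, because `g` accepts no positional arguments.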
cool thing about *args, **kwargs is that you can pass them further without touching
but in your proposal "keyword-positional" arguments will become positional again, so passing this kind of kwargs is a destructive action
i like the idea, but there is a lot to think about
i don't understand what you mean by "become positional again"?