#internals-and-peps
newstyle classes are classes that are inherited from object, right?
yep
The proof of concept I provided is closer to my implementation in my code (I have made many more adjustments since). By using an object-oriented approach to threading, I keep the results in that object and when I am ready to make the first call then I assign the module to a new name using a modified .join() that either returns True or waits for the thread to finish and then returns True.
This allows me to write something like
if ImportObject.join():
    use imported_package
Basically, using an intermediary object allows you to control how and when you deploy a module that was imported in parallel.
I treat it like a wrapper for the imported module, which means name-reassignment is redundant. Perhaps I should stop calling it an intermediary object... A "package wrapper" describes it better
For a compiler implementation I would probably borrow the await keyword rather than add a join function (which could cause a name clash). The interpreter can detect that it is not awaiting a coroutine, and in this case it is probably better to use an intermediary object that gets awaited and rewrites itself once import is complete.
I'm only speculating as food for thought now, I'm not convinced await is the ideal keyword either.
Actually, if a compiler implementation uses an intermediary object, you can just do an isinstance() check and define a new type called Import. If it is type Import, you can use the Import's join function to complete the process. If it is not an Import then you know it is safe to use. This can be done inside a function called isimport() (or something like that).
Hello everyone, I am Kshitij Pandey. I am currently pursuing BS Data Science offered by IITM. I have recently started learning Python. If any of you have recently started Python, I need a study partner, please DM me.
How would you achieve this with the new record type (just wondering, I know you didn't specify this as the way to go)?
Would you prepare the function signature before you pass in the iterable, or not?
or would you just add it to the __slots__, also the corresponding __match_args__, and finally just set the attributes?
object.__setattr__(self, {name!r}, {name})
@record
def MyTuple(a, b, c):
    pass

value = MyTuple(1, 2, 3)
value.from_iterable(range(3))
# slots -> ("a", "b", "c", "some-key", "some-key", "some-key")
and then you would generate a key automatically for the corresponding values coming from the iterable.
also, slots are immutable tuples, so you can't really add new elements after the class has been initialised.
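A minimal sketch of the point about slots, using a hypothetical Point class that is not taken from records.py: once the class exists, its instances only accept the attribute names listed in __slots__.

```python
class Point:
    # Hypothetical class for illustration; not part of records.py.
    __slots__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b


point = Point(1, 2)

try:
    point.c = 3  # "c" is not listed in __slots__
except AttributeError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)
```

This is why extra values coming from an iterable cannot simply be attached under new names later on.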
it would be a class method, not an instance method
not that i know much about record
but my suggestion was for a class method
similar to:
@record
def MyTuple(a, b, c):
    @classmethod
    def from_iterable(cls, iterable):
        return cls(*iterable)
would you expect to have some kind of under-the-hood handling here, I mean what you provided is literally the solution.
well, you'd expect this method to be added by the decorator
I can take a look and maybe send a pr to the author.
also, how would you combine it with normal attributes coming from the function parameters?
combine it?
I'm trying to understand what you are proposing, would it be the same under the hood as a normal parameter in the function signature, such as a? How would you use it?
cls(*iterable) is not telling much.
Also, "slots" are being established with the decorator.
so after that, you can't extend it, and you also can't add additional parameters in the same way (if you want to keep the same structure).
records.py line 30
__slots__ = ()
records.py line 114
__slots__ = ({slots})
I'm afraid we can't handle items in an iterable range as we'd handle parameters defined in the function prototype.
why would we need to extend slots
you're just passing the unpacked arguments to the normal __init__ nothing else needs to happen
!e it could be just as simple as:
def init_source(fields):
    args = ", ".join(fields)
    body = "\n".join(f"    self.{field} = {field}" for field in fields)
    return f"def __init__(self, {args}):\n{body}"

def repr_source(fields):
    attrs = ", ".join(f"{field}={{self.{field}!r}}" for field in fields)
    return f'def __repr__(self):\n    return f"{{type(self).__name__}}({attrs})"'

class SimpleDataClassMeta(type):
    def __new__(meta, name, bases, namespace):
        fields = namespace.get("__annotations__")
        if fields:
            exec(init_source(fields), globals(), namespace)
            exec(repr_source(fields), globals(), namespace)
        return super().__new__(meta, name, bases, namespace)

class SimpleDataClass(metaclass=SimpleDataClassMeta):
    @classmethod
    def from_iterable(cls, iterable):
        return cls(*iterable)

class MyDataClass(SimpleDataClass):
    a: int
    b: int

print(MyDataClass(1, 2))
print(MyDataClass.from_iterable([1, 2]))
@deft pagoda :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | MyDataClass(a=1, b=2)
002 | MyDataClass(a=1, b=2)
adding slots to this would change nothing really
Where do I go to submit some documentation updates for Python? In particular 3.12 coming up?
you can send a PR on https://github.com/python/cpython

Make sure to target PRs to the main branch; a bot will create an automated PR to backport it to the 3.12 branch after the PR to the main branch is merged.
Do I need to tag it or format the PR title so it does that?
Only maintainers and triagers have that power. One usually comes along within a few hours and adds the appropriate labels.
I think you should check out this code and tell me how you exactly imagine it, send a patch or something. I can implement it if you want since I'm really into this whole idea(struct syntax).
instead of what?
struct?
did you mention your idea on discussion.python?
nevermind, I thought this was Protocol related
also, what do you think about the "struct idea"?
which is defined here: https://snarky.ca/proposing-a-struct-syntax/
This is what I was thinking of when I heard record. https://discuss.python.org/t/introducing-record-types-in-python/34397
Dear Python Community, I would like to propose the introduction of a new record type in Python, which would serve as an extension of the existing class. This proposal aims to simplify and enhance the way we define and work with data structures, providing a more concise and Pythonic syntax. Proposal Overview: The proposed record type, let's cal...
which links that repo
I mean, it seems like Brett's idea was more popular.
this is like a first step to a struct syntax or something similar.
wondering how you pass something to the normal init by using the decorator.
I feel like the packing operator might be the best here.
My big-time sub-interpreter ideas have very quickly been reined in after discovering that most Python modules are not what the interpreter classes as 'safe'
We're about to open a temporary discussion channel for talking about v3.12, and I want to make sure my understanding of the new generic syntax is correct.
def max[T](args: Iterable[T]) -> T:
    ...

class list[T]:
    def __getitem__(self, index: int, /) -> T:
        ...

    def append(self, element: T) -> None:
        ...
does T need not be defined anywhere else for this to work?
this syntax is what defines T
so you're correct, it doesn't need to be defined anywhere else
thanks! so having def max[T](args: Iterable[T]) -> T: ... causes T to be instantiated as a TypeVar?
correct
T <class 'typing.TypeVar'>
nice
does this eliminate the need for definitions like T = TypeVar('T') everywhere in the language?
yes, with two caveats. The obvious one is that you can only use this if you only support 3.12+. The second one is that PEP 696's TypeVar default isn't supported with this new syntax, only with typing_extensions.TypeVar. PEP 696 hasn't been decided upon yet but some type checkers support it.
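For comparison, a sketch of the pre-3.12 spelling that the new syntax replaces. The function name first_item is made up for illustration; on 3.12+ the same function could be written as def first_item[T](items: Iterable[T]) -> T with no module-level TypeVar.

```python
from typing import Iterable, TypeVar

T = TypeVar("T")  # explicit declaration, needed before 3.12


def first_item(items: Iterable[T]) -> T:
    """Return the first element of a non-empty iterable."""
    return next(iter(items))


print(first_item([3, 1, 2]))
print(type(T))
```

Code that still has to run on 3.11 or older needs this form.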
The new generics stuff is such a huge improvement to the ergonomics
Even if you currently don't have a default
Hi friends. I'd like to share one more proof of concept for a split import, which should work for some imports but may have namespace issues (if a package has a constant you will have to access it with dot notation because the import happens inside an object). I am using sentence_transformers as the example because of the long load time.
I have implemented some suggestions based on discussions here. In particular, initialising the SplitImport object automatically starts the new thread, but in order to access the imported package you have to call .join(). This then overwrites the dictionary of the SplitImport object with the import dictionary. SplitImport methods are no longer available after that.
import threading
import time

class SplitImport(threading.Thread):
    def __init__(self, name: str = None) -> None:
        self.target = __import__
        threading.Thread.__init__(self, None, self.target, name, (name,), {})
        self._return = None
        self.name = name
        self.start()
        #? New thread starts automatically

    def run(self):
        if self.target is not None:
            self._return = self._target(*self._args)

    def join(self, *args):
        '''The join method has been modified to return True once complete.
        Join has to be called in order to overwrite the Thread Object with the Import dictionary.'''
        threading.Thread.join(self, *args)
        #* Rewrite object dictionary
        _import = self._return
        self.__dict__ = dict()
        self.__dict__.update(_import.__dict__)
        # Stored on the instance, so it is called without self
        def isimported():
            return True
        self.isimported = isimported
        return True

    def isimported(self):
        return False

#* TIMING AND TESTING *#
start_time = time.perf_counter()
# Initialise split import
sentence_transformers = SplitImport("sentence_transformers")
# Check time taken to start thread
print("Import statement pass time:", time.perf_counter()-start_time)
# Check original dictionary
print(sentence_transformers.__dict__)
# Activate import
print(sentence_transformers.join())
# Check that new attributes are available
print(sentence_transformers.__dict__)
# Check total import time (including two print statements)
print("Import statement execution time:", time.perf_counter()-start_time)
isn't importing CPU bound anyway? I don't think that threading would make it much faster in the end?
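A rough sketch of the same idea using concurrent.futures instead of a Thread subclass. The helper name start_import is hypothetical, and decimal stands in for a slow third-party package. Whether this saves any wall-clock time depends on how much of the import releases the GIL (file I/O, C extension initialisation), so treat it as an experiment rather than a guaranteed speed-up.

```python
import importlib
from concurrent.futures import Future, ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=1)


def start_import(module_name: str) -> Future:
    """Begin importing module_name on a worker thread and return a Future."""
    return _executor.submit(importlib.import_module, module_name)


pending = start_import("decimal")  # stand-in for a slow package

# ... other start-up work can run here ...

decimal = pending.result()  # blocks only if the import has not finished yet
print(decimal.Decimal("1.10") + decimal.Decimal("2.20"))
_executor.shutdown()
```

The Future plays the role of the "package wrapper": nothing touches the module until result() is called.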
I just want to make sure that my understanding of __match_args__ is correct.
Basically, it lets you match attributes positionally in a class pattern of a match statement, without using keyword arguments.
class TestClass:
    __match_args__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b

match TestClass(1, 1):
    case TestClass(1, 1):
        print("here")
Also, could someone link me the source code in cpython corresponding to match_args?
Python/ceval.c line 476
if (PyObject_GetOptionalAttr(type, &_Py_ID(__match_args__), &match_args) < 0) {
hey that code has a bug
why?
Several places in _PyEval_MatchClass call PyList_Append without checking the return value (e.g., https://github.com/python/cpython/blob/fc2cb86d210555d509debaeefd370d5331cd9d93/Python/ceval.c#L509C...
I don't really understand what's the point of having a default like this: [object()]
other_attrs = frozenset(getattr(type(other), "__slots__", [object()]))
Could someone tell me?
For anyone reading this who is interested in contributing to Python, fixing this bug would be a good easy first PR
yes, I'm interested, but I have to learn more c. I'm focusing on more c than python lately.
Where does that code come from? Hard to tell without context
records.py line 38
other_attrs = frozenset(getattr(type(other), "__slots__", [object()]))
I suppose the effect there is that if other doesn't have slots, it always returns NotImplemented. Not sure why Brett chose that behavior
would [object()] return NotImplemented?
not sure how
if other doesn't have slots, then the getattr calls return [object()], so other_attrs becomes a frozenset containing object(), which will never compare equal to self_attrs
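A small sketch of that sentinel behaviour. The helper slot_names and the two classes are hypothetical; only the getattr expression is the quoted line from records.py.

```python
class Slotted:
    __slots__ = ("a", "b")


class Plain:
    pass


def slot_names(obj):
    # Same expression as the quoted line, wrapped in a helper for illustration.
    return frozenset(getattr(type(obj), "__slots__", [object()]))


same_names = slot_names(Slotted()) == slot_names(Slotted())
plain_vs_slotted = slot_names(Plain()) == slot_names(Slotted())
plain_vs_plain = slot_names(Plain()) == slot_names(Plain())

print(same_names, plain_vs_slotted, plain_vs_plain)
```

Each call builds a fresh object(), and a bare object only compares equal to itself, so a class without __slots__ can never produce a matching set.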
oh okay, thank you. a frozenset is an immutable set right?
yes
why is frozenset a good idea here?
I guess he wants a set because the order shouldn't matter. There doesn't look to be a strong reason to use a frozenset over a regular set, but it's a good idea to default to immutability
Personally I'd probably use a set here because it's a less obscure type, so might raise fewer questions to readers of the code. But the behavior should be the same
okay, fair enough.
is this just checking whether they have the same length?
https://github.com/brettcannon/record-type/blob/main/records.py#L40
I mean he's checking the attributes anyway.
records.py line 40
if self_attrs != other_attrs:
it also matters if other has extra attrs (as the comment says)
but why is it necessary to check attributes independently?
set1 = frozenset([1, 2, 3])
set2 = frozenset([1, 5, 3])
set1 == set2 # false
that's it.
you also need to check that the values are equal
isn't that what set1 == set2 does?
in the code you linked, if self_attrs == other_attrs only checks that the names of the attributes match, not their values
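A sketch of that two-step comparison. Pair is a hypothetical class written for this example; it follows the shape of the linked code but is not copied from records.py.

```python
class Pair:
    __slots__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, other):
        self_attrs = frozenset(type(self).__slots__)
        other_attrs = frozenset(getattr(type(other), "__slots__", [object()]))
        # Step 1: the attribute *names* must match.
        if self_attrs != other_attrs:
            return NotImplemented
        # Step 2: every attribute *value* must match as well.
        return all(
            getattr(self, name) == getattr(other, name) for name in self_attrs
        )


print(Pair(1, 2) == Pair(1, 2))
print(Pair(1, 2) == Pair(1, 5))
print(Pair(1, 2) == object())
```

The name check only decides whether the two objects are comparable at all; the values still have to be compared one by one.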
oh, got it
cpython#110238
I forgot how to trigger the bot to make it send a link to the PR/issue :(
Jelle, why do you say < 0 is more idiomatic than == -1?
is it because other negative values might be used for other kinds of errors in the future?
5590
(py311) jelle@m2mb-jelle cpython % git grep ' == -1\b' | wc -l
2245
though that's less of a difference than I thought, probably == -1 is fine too. Just thought I'd usually seen < 0
I've never understood the use of == -1 vs < 0
I think for some API functions it's specifically -1 that is special as they can return other negative values
But PyList_Append is documented to only return 0 or -1
i guess there is a huge amount of cases when < 0 is used when function is documented to return -1 on error
even docs are not perfect
what is the error you are pointing out?
I suppose the inconsistency between the < 0 check and the documentation saying it returns -1. But I don't really think that's a problem
I don't either. If it's 0 or -1, then <0 is fine.
python/cpython#110238
is there any official documentation/pep about this built-in benchmark?
https://github.com/brettcannon/record-type/blob/main/benchmarks/bench_dataclasses.py#L78
benchmarks/bench_dataclasses.py line 78
__benchmarks__ = [
At this point the whole thing is just Brett's idea. There is nothing "official" about any of it
No, I mean the __benchmarks__ dunder, or whatever it is.
Brett should know better than to invent dunders like that.
there is nothing special about __benchmarks__, it is just a variable name
"just a one time quick test thing"
"I should know better"
I thought it's some kind of built-in for benchmarking functions.
if it's part of some core dev personal repo
then there's nothing official about it and what it includes might not have anything to do with the python lang at all, that benchmark dunder is just a variable in this case
but wouldn't it be a good idea to have a built-in benchmark?
there is the timeit module
and there is a standard benchmarking suite in pyperformance
benchmarks vary so much based on what they are measuring, and ^^
I don't think it makes sense to maintain that as part of the CPython core
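For quick one-off measurements, the timeit module mentioned above is usually enough. A minimal sketch, with an arbitrary statement and a deliberately small run count:

```python
import timeit

# Time a tiny statement; number is kept small so the sketch runs quickly.
elapsed = timeit.timeit("sum(range(100))", number=10_000)
print(f"{elapsed:.4f} s for 10,000 runs")

# repeat() returns one total per round; the minimum is usually the least noisy.
best = min(timeit.repeat("sum(range(100))", number=10_000, repeat=3))
print(f"best of 3: {best:.4f} s")
```

Anything more serious than this is what pyperformance is for.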
fair enough.
and btw, faster-cpython is the group doing the performance work in CPython these days, and they have a benchmark, so that's official enough.
I was just confused about the __benchmarks__
yes, that's why people shouldn't invent their own.
Does anyone know what is Sfsmanager in python? Please let me know.
Looks like it might be a Salesforce library
Do you have any more context?

Can't actually find it on PyPI
"Generative AI is experimental"
True
FFS google
It says it in here, but I can't actually find it in the documents
something something private/ organization repos
If they know enough to have access to that, shouldn't they have access to know what that is?
yeah, but some people don't read the instructions they're given
Hey guys, I have a question: I have a PyObject of a custom function
printf("My custom function called\n");
Py_RETURN_NONE;
}
How do I make it callable with PyCFunction_New()?
According to the source, PyCFunction_New() takes a PyMethodDef struct and a PyObject struct.
I'm not able to construct the PyMethodDef properly.
Sounds like most code
what did you want to do that it doesn't do?
I'm looking for more information about cpython core development.
is there an upgraded guide for newer versions? https://hackmd.io/@klouielu/ByMHBMjFe?type=view
https://devguide.python.org, though it won't have the same kind of data as that linked document. For what it's worth, in general that document still looks useful, though a couple of components (notably the parser and the interpreter) are very different now
What I'm trying to do now is get a fundamental understanding.
parser, lexer all that stuff. The only article that I have found so far is Guido's explanation.
(But this one is by the BDFL. :-) Introduction There are a lot of subdirectories in the CPython repo. The devguide has an overview, which is broader than this doc, but shallow.
!pep 617
should be helpful for the parser
https://hackmd.io/@klouielu/ByMHBMjFe?type=view#GDB-intro
is this still being used for debugging purposes?
you can use whatever C debugger you want, but gdb is what I'd expect most people to use on linux at least
!rule 7 9 6
6. Do not post unapproved advertising.
7. Keep discussions relevant to the channel topic. Each channel's description tells you the topic.
9. Do not offer or ask for paid work of any kind.
when I try to build CPython on Fedora this is what I get:
-> .python/ Lib/my_python_module
ModuleNotFoundError: No module named '_socket'
not sure what might be wrong.
Any ideas?
I think there's a section in the build guide about it
You probably also can't import ssl?
These instructions cover how to get a working copy of the source code and a compiled version of the CPython interpreter (CPython is the version of Python available from https://www.python.org/). It...
this is exactly what I did.
Oh no idea then sorry
oh now it's working. I need to find a pattern so I can define when it works and when it doesn't.
Can someone help me with finding the source code of the keyword in in the cpython repository?
There is no one place like that. There will be a place in the parser that defines the grammar, code in the compiler that emits bytecode for it, code in the interpreter that executes the bytecode corresponding to it, and implementations for various types
as far as the implementations goes, iirc it just calls a in b = b.__contains__(a)?
I need to see the C code like this one
static int
list_contains(PyListObject *a, PyObject *el)
{
    PyObject *item;
    Py_ssize_t i;
    int cmp;
    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i) {
        item = PyList_GET_ITEM(a, i);
        Py_INCREF(item);
        cmp = PyObject_RichCompareBool(item, el, Py_EQ);
        Py_DECREF(item);
    }
    return cmp;
}
This is for __contains__ for list obj
Now I'm searching for the keyword in and see how it writes in C like this one
@open vortex It's being talked about in here now (sorry, carry on)
We can start in Grammar/python.gram, where there is a rule in_bitwise_or. That's the piece of grammar that turns the use of the in keyword into a piece of Abstract Syntax Tree (AST)
the parser is being generated by pegen right?
!d PySequence_Contains
int PySequence_Contains(PyObject *o, PyObject *value)
*Part of the [Stable ABI](https://docs.python.org/3/c-api/stable.html#stable).* Determine if *o* contains *value*. If an item in *o* is equal to *value*, return `1`, otherwise return `0`. On error, return `-1`. This is equivalent to the Python expression `value in o`.
Then in Python/compile.c in the compiler_addcompare function, it processes the In AST node and emits a bytecode CONTAINS_OP
then in Python/bytecodes.c, there is an implementation for CONTAINS_OP, which calls PySequence_Contains (as @dusk comet just mentioned)
That's defined in Objects/abstract.c. It first looks at the sq_contains slot. Slots are pointers in the definition of a type that point to a function. For list for example, this slot will be set to the list_contains function mentioned above
If there is no such slot, it calls _PySequence_IterSearch, which will iterate over the object and compare every object against the object being searched for
So the answer to this question is no, it doesn't just look at __contains__ (which corresponds to the sq_contains slot), because it will also iterate over the object
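The first two stages of that pipeline can be observed from Python itself. A minimal sketch, assuming CPython 3.9 or newer (where the CONTAINS_OP bytecode exists):

```python
import ast
import dis

tree = ast.parse("a in b", mode="eval")
compare = tree.body                    # an ast.Compare node
print(type(compare.ops[0]).__name__)   # the operator node produced for "in"

code = compile(tree, "<example>", "eval")
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)
```

The C side (PySequence_Contains and the sq_contains slot) is not visible this way, but the AST node and the emitted opcode are.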
Yes, the python.gram file is input for pegen, which generates a bunch of C code from it
and these c codes are going to be used in the parser.c I suppose.
that's the generated output from pegen
41k lines
oh where do you see that?
Parser/parser.c
so this is not being edited directly I suppose.
nope
I might check out pegen first, looks interesting.
@feral island @dusk comet Thank you for your help
!e
Not sure ```py
def f():
    return (x for x in (1, 2, 3))
print(4 in f())
print(f().__contains__(4))
```
@grave jolt :x: Your 3.12 eval job has completed with return code 1.
001 | False
002 | Traceback (most recent call last):
003 | File "/home/main.py", line 5, in <module>
004 | print(f().__contains__(4))
005 | ^^^^^^^^^^^^^^^^
006 | AttributeError: 'generator' object has no attribute '__contains__'
It's iterables
in probably uses __contains__ if it exists, otherwise it falls back to a one-by-one search
It definitely should not be used directly.
it would make more sense as a magic method inside a Python class.
so yeah a generator doesn't have that method defined.
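A quick sketch of that priority, using a hypothetical class that defines both hooks and records which one gets called:

```python
class Both:
    """Defines both hooks so we can see which one `in` prefers."""

    def __init__(self):
        self.calls = []

    def __contains__(self, item):
        self.calls.append("__contains__")
        return item == 1

    def __iter__(self):
        self.calls.append("__iter__")
        return iter([1, 2, 3])


both = Both()
found_one = 1 in both
found_two = 2 in both  # iteration would find 2, but __contains__ wins
print(found_one, found_two)
print(both.calls)
```

When __contains__ is defined, the iteration fallback is never consulted, even if it would give a different answer.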
!eval
class Box:
    def __iter__(self):
        print("__iter__ called")
        self.value = 0
        return self

    def __next__(self):
        print("__next__ called")
        self.value += 1
        if self.value > 5:
            raise StopIteration
        return self.value

print(1 in Box())
print(6 in Box())
@naive saddle :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | __iter__ called
002 | __next__ called
003 | True
004 | __iter__ called
005 | __next__ called
006 | __next__ called
007 | __next__ called
008 | __next__ called
009 | __next__ called
010 | __next__ called
011 | False
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/JTD6NCWEO5QHXTBGLZPUIRVERY
!eval
class FixedLength:
    def __len__(self):
        print("__len__ called")
        return 5

    def __getitem__(self, index):
        print(f"__getitem__({index}) called")
        return str(index)

print("1" in FixedLength())
print("6" in FixedLength())
@naive saddle :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | __getitem__(0) called
002 | __getitem__(1) called
003 | True
004 | __getitem__(0) called
005 | __getitem__(1) called
006 | __getitem__(2) called
007 | __getitem__(3) called
008 | __getitem__(4) called
009 | __getitem__(5) called
010 | __getitem__(6) called
011 | True
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/WGIQZGD4WPEX3QG6A27HIG3ZE4
hmm
I thought you needed both the len and getitem dunders for the old-style iteration protocol
FixedLength there is an iterable
I believe it calls __getitem__ until it throws IndexError
Lastly, the old-style iteration protocol is tried: if a class defines __getitem__(), x in y is True if and only if there is a non-negative integer index i such that x is y[i] or x == y[i], and no lower integer index raises the IndexError exception. (If any other exception is raised, it is as if in raised that exception).
https://docs.python.org/3/reference/expressions.html#membership-test-operations yup
I got it confused with this
If the __reversed__() method is not provided, the reversed() built-in will fall back to using the sequence protocol (__len__() and __getitem__()). Objects that support the sequence protocol should only provide __reversed__() if they can provide an implementation that is more efficient than the one provided by reversed().
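A sketch of the difference between the two fallbacks, using a hypothetical Countdown class that defines only __len__ and __getitem__: reversed() needs both, while in gets by on __getitem__ alone and stops at IndexError.

```python
class Countdown:
    """Hypothetical sequence-like class with only __len__ and __getitem__."""

    def __len__(self):
        return 3

    def __getitem__(self, index):
        if not 0 <= index < 3:
            raise IndexError(index)
        return ("three", "two", "one")[index]


backwards = list(reversed(Countdown()))
has_two = "two" in Countdown()
has_four = "four" in Countdown()  # ends when __getitem__(3) raises IndexError
print(backwards, has_two, has_four)
```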
This is still less confusing than Python's binary operator dunders. They are full of exceptions.
<@&831776746206265384>
@astral gazelle fuck off
!mute 1108058393815552135 1h 
:incoming_envelope: :ok_hand: applied timeout to @cinder cliff until <t:1696696230:f> (1 hour).
:x: According to my records, this user already has a timeout infraction. See infraction #91286.
Isn't the internet a lovely place? full of completely reasonable, level-headed people 
I can't see the C code, only the assembly, when I use gdb. I suppose it's being compiled automatically with the essential debug flag (-g) under the hood in the CPython source tree?
i guess you have to use debug version of cpython in order to see C-src (not sure)
what is the difference between pegen.c and the python pegen package? Isn't the Python pegen generator output the c parser(41k lines) itself?
pegen.c produces parser in C and pegen package produces parser in python
what's the point of having both?
or its like you can use both
so just include this flag: --with-pydebug
im not sure, ask someone else
is there a reason why optimizer while optimizing 1+2 calls int.__add__ and not some direct C-function (like _PyLong_Add or something)?
i guess since there is no public PyLong_Add function, it is easier to call generic PyNumber_Add and let it figure everything by itself, so it ends up calling int.__add__
yes it just calls PyNumber_Add. That makes sense, because otherwise it would have to directly call the string/int/float versions separately
this is fold_binop in Python/ast_opt.c
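The result of that folding can be checked from Python by looking at the bytecode. A small sketch; the helper name opnames is made up, and the exact opcode names differ between CPython versions, so it only looks for the BINARY_ prefix:

```python
import dis


def opnames(source):
    """Return the opcode names emitted for an expression."""
    code = compile(source, "<example>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]


folded = opnames("1 + 2")    # both operands are constants
unfolded = opnames("x + 2")  # x is only known at run time
print(folded)
print(unfolded)
```

The constant expression compiles without any binary-operation instruction, because the addition already happened at compile time.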
what do you guys think of mojo
already installed and using it here and there....
if they keep bringing out updates (which they do), this will be THE new language for a lot of things
It's a cool idea for ML/AI devs, though is still early in development, missing quite a handful of python features, and isn't the magic cure-all to python performance that some people think it is
It's also just a separate language from python, all of the performance gains are done using syntax completely different from python
Dont they also use a different compiling method?
yeah, it compiles down to native machine code, versus python's compilation to bytecode that's then interpreted, though compiling python extensions is something that's already been done for ages using C/C++/Rust/Cython
we've even got a channel dedicated to such extensions, #c-extensions (which would also probably be better suited to mojo talk than this channel, which is more for python and its intricacies specifically)
Yeah I have no idea what you're talking about
Could someone take a look at this question?
also this?
probably needs the debug version but in some cases it sorta just points to the assembly
I think the latter might be the solution.
pegen.c where?
that contains functions used during parsing
it doesn't generate anything
why does it have the same name then?
pegen.py generates parser.c
then what is pegen.c doing?
the pegen package on the other hand is a python-only mirror of the thing at Tools/peg_generator/pegen/*
defining functions used by Parser/parser.c from Parser/pegen.h
that is the header file
yes
there's actually 2 places where functions in Parser/pegen.h are defined and that's either Parser/pegen.c or Parser/action_helpers.c
so it's not being used in reality.
I mean it's the same as tokenize.py -> just a python implementation.
ahh
I might be wrong
eh well the actual parser generator is also written in python
I kinda get the idea.
I think.
the pegen package can only generate python code
but the actual parser generator generates both C (Parser/parser.c) and python code
wait, isn't it generating the c parser?
parser.c is being generated by the python pegen package
nope
.
talking about the actual parser generator there
which is also named pegen
yes, was talking about the pegen.c in CPython
huh, now I'm confused.
huh
There are two tools called pegen with equivalent-ish behavior, one in Python (the PyPI one cereal linked above) and one in C (in the CPython source tree)
I think?
didn't read all of the above conversation
Tools/peg_generator/peg_extension/ has a C file but i think the bulk of the generating comes from the python files at Tools/peg_generator/pegen/
if there are 2 pegen(s), which one is generating the parser.c?
yeah I think you're right, it's Python code generating that
@$(MKDIR_P) $(srcdir)/Parser
PYTHONPATH=$(srcdir)/Tools/peg_generator $(PYTHON_FOR_REGEN) -m pegen -q c \
$(srcdir)/Grammar/python.gram \
$(srcdir)/Grammar/Tokens \
-o $(srcdir)/Parser/parser.c.new
$(UPDATE_FILE) $(srcdir)/Parser/parser.c $(srcdir)/Parser/parser.c.new
(from the Makefile)
oh yes. I stumbled upon this in the documentation(how to regenerate the parser).
https://github.com/python/cpython/blob/main/Tools/peg_generator/pegen/__main__.py#L79
Tools/peg_generator/pegen/__main__.py line 79
argparser.add_argument("-q", "--quiet", action="store_true", help="Don't print the parsed grammar")
in that case, tokenizer.c is being used, the tokenize.py is just a Python experiment.
I've been trying to build 3.13-dev (via pyenv) for snekbox (used by @fallen slate for its eval command) and got the below error
#23 0.341 0:01:04 load avg: 1.18 [37/44] test_sqlite3
#23 0.341 Failed to import test module: test.test_sqlite3.test_dbapi
#23 0.341 Traceback (most recent call last):
#23 0.341 File "/tmp/python-build.20231009201631.27/Python-3.13-dev/Lib/unittest/loader.py", line 394, in _find_test_path
#23 0.341 module = self._get_module_from_name(name)
#23 0.341 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#23 0.341 File "/tmp/python-build.20231009201631.27/Python-3.13-dev/Lib/unittest/loader.py", line 337, in _get_module_from_name
#23 0.341 __import__(name)
#23 0.341 File "/tmp/python-build.20231009201631.27/Python-3.13-dev/Lib/test/test_sqlite3/test_dbapi.py", line 38, in <module>
#23 0.341 from _testcapi import INT_MAX, ULLONG_MAX
#23 0.341 ModuleNotFoundError: No module named '_testcapi'
#23 0.341
This suggests test modules are being disabled (and is evidenced further up in the output, where it mentioned a few test modules being disabled).
I've found https://github.com/python/cpython/pull/110530 to safely import it which will solve this issue if/when it gets merged, however I'm wondering what to do in the meantime.
I found that this ./configure arg exists https://docs.python.org/3.13/using/configure.html#cmdoption-disable-test-modules which would raise the errors above, but I can't seem to find anywhere this is being set by pyenv. Anyone have some ideas on how to resolve this?
why is it running that test at all? is it for PGO?
(if so, I don't think the test failure should block anything)
honestly, I'm not sure. I'm running https://github.com/pyenv/pyenv/blob/master/plugins/python-build/bin/python-build with 3.13-dev as an arg and nothing else, and it's exiting.
Think I'll read through this script a bit to see the why
This part of the readme suggests that optimizations are opt-in https://github.com/pyenv/pyenv/blob/master/plugins/python-build/README.md#building-for-maximum-performance
It's possibly a pyenv issue, rather than CPython
Not 100% sure but I suspect _testcapi exists only on debug builds and you're running tests in release mode
Ah yea, found this in the log file LD_LIBRARY_PATH=/tmp/python-build.20231009212348.27/Python-3.13-dev ./python -m test --pgo --timeout=
no, it is available on release builds too, iirc
!e ```py
import _testcapi
print(dir(_testcapi))
```
@dusk comet :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 1, in <module>
003 | import _testcapi
004 | ModuleNotFoundError: No module named '_testcapi'
PowerShell 7.3.7
PS C:\Users\BradleyReynolds> py
Python 3.11.5 (tags/v3.11.5:cce6ba9, Aug 24 2023, 14:38:34) [MSC v.1936 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import _testcapi
>>> print(dir(_testcapi))
['CHAR_MAX', 'CHAR_MIN', 'ContainerNoGC', # omitted for brevity
PowerShell 7.3.7
PS C:\Users\BradleyReynolds> py -3.12
Python 3.12.0 (tags/v3.12.0:0fb18b0, Oct 2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import _testcapi
>>> print(dir(_testcapi))
['ALIGNOF_MAX_ALIGN_T', 'CHAR_MAX', 'CHAR_MIN', 'ContainerNoGC', # omitted for brevity
huh
We disable test modules when building for snekbox https://github.com/python-discord/snekbox/blob/main/Dockerfile#L23
`Dockerfile` line 23
```
PYTHON_CONFIGURE_OPTS='--disable-test-modules --enable-optimizations \
```
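For anyone wanting to check their own interpreter, here is a minimal sketch. It assumes that `--disable-test-modules` shows up in the recorded configure arguments, which is only available on POSIX-style builds; the helper name `has_test_modules` is made up for illustration.

```python
import importlib.util
import sysconfig


def has_test_modules() -> bool:
    """Return True if the private _testcapi extension can be located."""
    return importlib.util.find_spec("_testcapi") is not None


# CONFIG_ARGS records the ./configure arguments on POSIX builds.
# It is None on Windows, so fall back to an empty string.
config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""

print("has _testcapi:", has_test_modules())
print("built with --disable-test-modules:", "--disable-test-modules" in config_args)
```

This doesn't explain why pyenv's PGO run imports `_testcapi`, it only tells you which kind of build you ended up with.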
@granite heath
tokenize.py and ast.py are just python implementations, more like helper features rather than something being used in the interpreter itself?
ast.py uses the same AST as the interpreter itself, though it creates wrapper objects that the interpreter doesn't use
IIRC tokenize uses the C implementation since 3.12 but haven't looked at it much
tokenize.py still has a bunch of regexes?
I think those weren't removed for compatibility reasons. But the tokenize.tokenize function invokes the C tokenizer
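A small sketch of what the two modules give you from Python code. Whether the C tokenizer is used underneath depends on the version (3.12+ according to the messages above); the source string is just an example.

```python
import ast
import io
import tokenize

source = "x = 1 + 2\n"

# tokenize works on a readline callable and yields TokenInfo tuples.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
]

# ast.parse returns Python-level node objects wrapping the interpreter's AST.
tree = ast.parse(source)

print(tokens)
print(ast.dump(tree.body[0]))
```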
ok, i didn't read closely. it looked shorter, but still had regexes. thanks.
!e why is this not allowed?
((yield) + 1 for _ in range(3))
@quick snow :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 1
002 | ((yield) + 1 for _ in range(3))
003 | ^^^^^
004 | SyntaxError: 'yield' inside generator expression
It was allowed previously (I think until 3.8 or 3.9)
What would this do? 
but the behavior was extremely weird and confusing
would that make the resulting expression accept something via send?
[None, (None, 1), None, (None, 1), None, (None, 1)]
What's unexpected there?
https://github.com/python/cpython/issues/54753 for context
This seems the same behavior that the equivalent generator function would do, unless I'm missing something.
Sure, but it's hard to imagine this being useful, and the behavior is even weirder with comprehension as opposed to genexps
I see. I wonder if this weird effect would be gone now with the comprehension inlining.
[<generator object <genexpr> at 0x7fe453036f10>]
Comprehension inlining is a performance optimization, it should not lead to behavior changes (except for various small effects on introspection)
sorry that had too many parens
$ python3.6 -c 'print([((yield), 1) for _ in range(3)])'
<generator object <listcomp> at 0x7f81b63fdf10>
so the listcomp actually produced a generator
Yeah, I see. Because it was an inner function, the yield turned the implementation-detail comprehension function into a generator rather than the function containing it. That's why I'm guessing the effect might not be present anymore if this was allowed again.
I decided to try it out by commenting out the check in symtable.c that disallows this, but it's going to be a bit harder than that
Assertion failed: (frame->owner == FRAME_OWNED_BY_GENERATOR), function _PyFrame_GetGenerator, file pycore_frame.h, line 309.
zsh: abort ./python.exe -c '(lambda: print([((yield), 1) for _ in range(3)]))()'
oh probably because it doesn't think the lambda is a generator since the yield is in the listcomp
[(None, 1), (None, 1), (None, 1)]
[None, None, None, None]
I suppose that's fairly reasonable behavior
Won't work for genexps though as PEP 709 doesn't inline those
Thanks, now I don't have to check. I didn't remember that generator expressions weren't inlined, but even if they were, I guess it'd be hard to find a good reason to do this.
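To see the current state without patching symtable.c, you can ask the compiler directly. A minimal sketch, assuming Python 3.8 or newer (where this became a hard error); `compiles` is a made-up helper.

```python
def compiles(source: str) -> bool:
    """Return True if source compiles as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# yield inside a generator expression or a comprehension is rejected...
genexp_ok = compiles("((yield) + 1 for _ in range(3))")
listcomp_ok = compiles("[((yield), 1) for _ in range(3)]")
# ...but a plain yield inside a lambda is still accepted.
lambda_ok = compiles("lambda: (yield)")

print(genexp_ok, listcomp_ok, lambda_ok)
```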
Have there been any improvements in memory usage since 3.10? Is there any benchmark/KPI to follow if I'm interested?
is there some statement in some PEP about why func(a=7, *b) works but func(**{'a': 7}, *b) doesn't? i stumbled upon it while reporting a bug to cpython and i don't know what PEP to find it in
if there's a statement about it, it should be in the docs, not a pep.
the fact that f(a=7, *b) works is weird, imo
Give the function signature and I bet I can make an example of that not working. You have to realize that by default all parameters can be passed by name or by position. I bet if a was keyword-only, meaning defined after the * in the signature, this wouldn't work. Note that keyword-only parameters don't need default values
** is arbitrary keywords and must be at the end
* is arbitrary positional args
At least that's my hypothesis
!e
def func(*args, a):
    print(args)
    print(a)
func(1,2,a="a")
func(a="a", 1,2)
@swift imp :x: Your 3.12 eval job has completed with return code 1.
001 | File "/home/main.py", line 6
002 | func(a="a", 1,2)
003 | ^
004 | SyntaxError: positional argument follows keyword argument
@swift imp :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 5, in <module>
003 | func(1,2,"a")
004 | TypeError: func() missing 1 required keyword-only argument: 'a'
Like that, if you don't pass it as keyword it breaks
!e ```
def f(b, a):
    print(b, a)
f(a="a", *("b",))
```
@feral island :white_check_mark: Your 3.12 eval job has completed with return code 0.
b a
@swift imp this is what cereal was talking about
This is the stuff why functions are so costly right?
how is async implemented under the hood? i know roughly how rust does it, by implementing a state machine at compile time. does python do something similar?
Do you mean coroutines or the asyncio built-in lib?
it's really hard to answer without any context.
if you define a function with async def it will return back a coroutine.
This feature itself is pretty useless, this is why asyncio exists.
i'm not asking about the executor
rather, how the async is implemented under the hood
how does the coroutine determine whether the task is ready to be polled or not
well yes, you need to check out asyncio for this.
a coroutine for sure doesn't determine whether a task is ready or not.
it's more like a generator behavior so that you can suspend your code while it's running.
what, how does the executor know whether the coroutine is ready to be polled or not if the coroutine does not yield a ready or pending result?
does python do some really cursed magic?
no?
so the executor e.g asyncio suspends the task to check whether if its ready to be polled?
how does it figure out if its ready
well yes.
but please checkout the asyncio lib and also generators(send etc).
its much faster for me to ask it here to someone who already knows the answer
my question is not about the executor either
for my answer, i'd have to dig through the CPython src code which is written in C, i do not speak C
My basic knowledge of it is:
- There's an event loop which is in charge of managing all the current running coroutines
- Any time something is awaited, the coroutine that awaits yields control over to the event loop, using a mechanism that's sorta like how iterables work (coroutines used to be based off of generators, fun fact!)
- The event loop then finds some coroutine which is ready to run/continue, and lets it run until completion, or until it yields control back again
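The bullet points above can be sketched as a toy round-robin loop. This is not how asyncio is implemented; `YieldControl`, `worker` and `run_all` are made-up names, and real event loops wait on timers and sockets instead of just rotating a queue.

```python
from collections import deque


class YieldControl:
    """Awaitable whose only job is to hand control back to the loop."""

    def __await__(self):
        yield  # a bare yield suspends the whole await chain


async def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")
        await YieldControl()


def run_all(coros):
    """Resume each coroutine in turn until all of them have finished."""
    ready = deque(coros)
    while ready:
        coro = ready.popleft()
        try:
            coro.send(None)  # run until the next suspension point
        except StopIteration:
            continue  # this coroutine is done, drop it
        ready.append(coro)  # still running, schedule it again


log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

There is no ready/pending return value here: a coroutine that suspends simply yields, and one that finishes raises StopIteration.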
Screencast based on a workshop originally presented at PyCon India, Chennai, October 14, 2019.
Code samples at: https://gist.github.com/dabeaz/f86ded8d61206c757c5cd4dbb5109f74
This workshop is about the low-level foundations and abstractions for asynchronous programming in Python. It's a bit unusual in that rather than starting with tradi...
how is the ready/pending result for coroutines implemented?
this is the executor
so are you trying to understand what Python has to offer and not the lib itself?
As I said before they provide a functionality in which you can suspend and resume the running function.
Python Enhancement Proposals (PEPs)
async fn foo(x: u32) {
    println!("{x}");
    let a: String = bar().await;
    println!("{a}");
    baz().await;
}

enum Poll<T> {
    Ready(T),
    Pending,
}

fn foo(x: u32) -> StateMachine {
    StateMachine::Start { x }
}

enum StateMachine {
    Start { x: u32 },
    FirstAwait { a_fut: impl Future<Output = String> },
    SecondAwait { fut: impl Future<Output = ()>, a: String },
    Finished,
}

impl Future for StateMachine {
    type Output = ();

    fn poll(&mut self) -> Poll<Self::Output> {
        loop {
            match self {
                Self::Start { x } => {
                    println!("{x}");
                    *self = Self::FirstAwait { a_fut: bar() };
                }
                Self::FirstAwait { a_fut } => match a_fut.poll() {
                    Poll::Pending => return Poll::Pending,
                    Poll::Ready(a) => {
                        println!("{a}");
                        *self = Self::SecondAwait { fut: baz(), a };
                    }
                },
                Self::SecondAwait { fut, a } => match fut.poll() {
                    Poll::Pending => return Poll::Pending,
                    Poll::Ready(()) => {
                        ptr::drop_in_place(a);
                        *self = Self::Finished;
                        return Poll::Ready(());
                    }
                },
                Self::Finished => panic!("Attempted to poll a completed future"),
            }
        }
    }
}
this is how rust roughly implements async
well this is about python
and an async runtime (the executor, e.g. Python's "asyncio" library) would call poll on this future
im wondering
how python implements this coroutine
Python Enhancement Proposals (PEPs)
this only explains the behaviour of it, not how a ready/pending result is generated?
asyncio
will poll the tasks, correct?
what implementation, returns the result for that poll
I think Python is a bit different. An event loop implementation keeps track of the real-world "things" it waits on (like timers, sockets and file descriptors), and maps them to coroutines which are waiting for something to happen
so the concept of futures does not exist in python?
if a coroutine is finished it's going to throw an exception.
asyncio does have a Future
it does exist.
tasks are futures.
so what happens when you poll a future that is not ready, under the hood.
like, in the C implementation
what do you mean by poll?
it has a _step method under the hood which does one step(till it reaches a yield-await).
@candid tinsel The await chains eventually lead to a yield future (they do yield from future, which then does yield self in its own __await__ == __iter__). A future object is an asyncio.Future, or an object duck-typing compatible with it.
A done_callback of the yielded future is registered by a Task, which is a future subclass which is used to wrap a coroutine. That done callback then calls .send(None)/.throw on the coroutine that yielded the future, once the future is done.
Generally, a coroutine will at a low level set up a reader/writer for the event loop, set up that callback to complete a future, then yield that future.
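A minimal sketch of that flow using only public asyncio APIs. A timer stands in for the reader/writer, and the `waiter` coroutine and the extra logging callback are made up for illustration.

```python
import asyncio


async def waiter(fut, log):
    log.append("waiting")
    result = await fut  # the await chain ends in the future yielding itself
    log.append(f"got {result}")


async def main():
    log = []
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    # Any number of done callbacks can be attached; the Task running this
    # coroutine attaches its own wake-up callback when the future is yielded.
    fut.add_done_callback(lambda f: log.append("callback"))
    # Stand-in for "a socket became readable": complete the future later.
    loop.call_later(0.01, fut.set_result, 42)
    await waiter(fut, log)
    return log


log = asyncio.run(main())
print(log)
```

Nothing polls the future here. Completing it schedules the callbacks, and one of those callbacks resumes the suspended coroutine.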
You can read the python implementation of asyncio, they are 95% equivalent
jesus, async is really messy in this language
Ye, asyncio is quite convoluted
fortunately, you never have to think about any of this
!e
import asyncio

class DebugCoro:
    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __await__(self):
        gen = self.wrapped.__await__()
        for fut in gen:
            print(fut)
            yield fut

async def foo():
    print("a")
    await asyncio.sleep(0.5)
    print("b")
    await asyncio.sleep(1)
    print("c")

async def main():
    await DebugCoro(foo())

asyncio.run(main())
@grave jolt :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | a
002 | <Future pending>
003 | b
004 | <Future pending>
005 | c
asyncio.sleep for example registers a timer for the event loop, that will then complete a future which it then yields (or awaits, more specifically).
and the event loop knows the task is finished when an exception is thrown like hels was referring to?
yeah
but that's not what i'm talking about
because that's params
The event loop doesn't know what a Task is, it knows it ended because it registers no new callbacks.
futures should be a language feature, why is this in an executor library
because python async implementations predate the syntax
you can use libraries other than asyncio with the async/await syntax, it is mostly just sugar over slightly weird iterables that you can throw exceptions to.
await syntax knows nothing about futures, event loop knows nothing about futures, it just has callbacks. Task and Future do know about the event loop and about async/await syntax.
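To illustrate that the syntax itself is library-agnostic, here is a sketch that drives a coroutine by hand with no event loop at all. `Request` and `ask` are made-up names; the yielded object could be anything the driver understands.

```python
class Request:
    """Arbitrary awaitable: yields itself and returns whatever is sent back."""

    def __init__(self, payload):
        self.payload = payload

    def __await__(self):
        reply = yield self  # suspend, handing this object to the driver
        return reply        # becomes the value of the await expression


async def ask():
    answer = await Request("ping")
    return answer.upper()


coro = ask()
req = coro.send(None)  # run until the first suspension; we receive the Request
try:
    coro.send("pong")  # resume with a reply; the coroutine then finishes
except StopIteration as stop:
    result = stop.value  # the coroutine's return value rides on StopIteration

print(req.payload, result)
```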
!e ```py
def func(*args, a):
    print(args, a)
func(a="a", *(1,2))
```
@rose schooner :white_check_mark: Your 3.12 eval job has completed with return code 0.
(1, 2) a
I don't buy that this isn't surprising. There's quite a few weird inconsistencies around here
(like how func(a="a", 1) doesn't work but func(a="a", *(1,)) does)
# not allowed
func(a=7, 1, 2)
func(**{'a': 7}, 1, 2)
func(**{'a': 7}, *(1, 2))
# allowed
func(a=7, *(1, 2))
We are using the signature that @feral island provided?
!e ```
def f(b, a):
    print(b, a)
f(a="a", *("b", "c"))
```
@swift imp :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 3, in <module>
003 | f(a="a", *("b", "c"))
004 | TypeError: f() got multiple values for argument 'a'
Seems like it unpacks the tuple, and since there's still an open param it looks at which keywords were passed
no matter the signature the last case shouldn't work at all
But it does matter
nope
If you define it def func(a, *b): then a is positional and named
it only errors when the correct form also errors (func(*("b", "c"), a="a"))
So the keyword being valid makes sense
No bc in that case you're calling it with expectation that a is keyword only
Anything defined or passed after * in the signature is keyword only
func() could literally be def func(b, a):
actually idk what even is the point here
Anything before is keyword or positional. That's why there was / to signify positional only
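The parameter kinds being discussed can be inspected directly. A small sketch with a made-up function `f` that uses every marker:

```python
import inspect


def f(pos_only, /, either, *args, kw_only):
    return pos_only, either, args, kw_only


# Map each parameter name to its kind as reported by inspect.
kinds = {
    name: param.kind.name
    for name, param in inspect.signature(f).parameters.items()
}
print(kinds)

ok = f(1, 2, 3, kw_only=4)

# Passing a positional-only parameter by name fails when the call runs.
try:
    f(pos_only=1, either=2, kw_only=3)
    rejected = False
except TypeError:
    rejected = True
```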
the original thing was to say that func(a=7, *(1, 2)) shouldn't be valid in syntax given that all the other alternative ways to write it are invalid in syntax
idk how that's related to function signatures
Bc the signature dictates what's valid?
no
it dictates what params the function accepts
and that's checked later in the runtime
f(a=7, 1) is a SyntaxError. It doesn't even matter what f is
so anything inconsistent with the format provided by the signature produces an error at runtime instead of an error in the parser
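The parser-versus-runtime split can be checked without defining func at all. A sketch using `ast.parse` on the four call forms listed earlier; `parses` is a made-up helper.

```python
import ast


def parses(call: str) -> bool:
    """Return True if the call expression is syntactically valid."""
    try:
        ast.parse(call)
    except SyntaxError:
        return False
    return True


results = {
    call: parses(call)
    for call in [
        "func(a=7, 1, 2)",
        "func(**{'a': 7}, 1, 2)",
        "func(**{'a': 7}, *(1, 2))",
        "func(a=7, *(1, 2))",
    ]
}
for call, ok in results.items():
    print(f"{call!r}: {'parses' if ok else 'SyntaxError'}")


# Whether the one form that parses also binds depends on the signature,
# and that is only checked when the call actually runs.
def f(b, a):
    return b, a


bound = f(a="a", *("b",))
```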
Hm
Yeah that's an inconsistency
anyways i'll have to go now
I'm not meaning anything by this
I just didn't see anything weird about passing a named/pos arg as a keyword and the tuple unpacking being valid
Sorry
I also don't have any background knowledge of your bug report
!e
def func(a,b):
    print(a,b)
func(a=1, 2)
No I'm an idiot
@rose schooner @feral island do you have a link to the bug report bc id love to follow that to understand it better
not sure there is a bug report?
I thought cereal said he found it and reported it above
Oh no he said he came across while reporting another bug
!e ```
def f(a, b):
    print(b, a)
f(a="a", *("b",))
```
@swift imp :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 3, in <module>
003 | f(a="a", *("b",))
004 | TypeError: f() got multiple values for argument 'a'
I'm interested in it but I still think it's something to do with the signature and the order in which Python prioritizes the passing of arguments, like pos_only -> tuple unpack -> keyword only -> dict unpack
Like maybe func(a=1, 2) is a syntax error bc there's just no way for the prioritization to rectify that
Super weird nonetheless
I also know none of the machinery of Python and am just talking through observation
how do i know if something is atomic in python? is there an atomic type?
I don't think it's really needed because of the GIL, right?
all operations are atomic by default
so the os can't interrupt a thread during any python operation?
I'm not entirely sure, but you can always kill python.
you just can't kill individual threads
Not true, only those marked as thread-safe in the docs. For example, I was until recently under the impression that list.append is atomic, but it's not guaranteed to be: #internals-and-peps message
Same for int's +=, depending on the Python version.
If you need atomicity across threads, you'll need locks, or explicitly thread-safe objects (e.g. queue.Queue).
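A minimal sketch of the lock approach, using a shared counter as the example (the names are made up). Without the lock, `counter += 1` is a read-modify-write that is not guaranteed to be atomic.

```python
import threading

counter = 0
lock = threading.Lock()


def add(n):
    global counter
    for _ in range(n):
        with lock:  # only one thread at a time performs the update
            counter += 1


threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

For passing data between threads, `queue.Queue` does the locking internally, so `put` and `get` need no extra lock.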
Would it be correct to have the assumption that python code is atomic, and c is not?
What do you mean by atomic, then?
In the context of "the OS can't interrupt" I'd argue that nothing in Python-land is atomic.
atomic usually refers to how many threads can access the variable at once (1)
a form of auto synchronizing
that's my understanding at least
No two Python threads run at the same time, yes, and that means that stuff like reference counts are atomic in that sense. But "accessing" a variable, when meaning more than resolving the reference is not atomic, depending on what you mean by accessing.
so the docs will explicitly state whether it's atomic/thread-safe or not? https://docs.python.org/3/tutorial/datastructures.html none of these functions state they are atomic? so they're not atomic?
Yep, they're not thread-safe. At least not guaranteed to be (some of them happen to be in CPython)
is there an example operation that is atomic and stated in its documentation?
just want to see what i should be looking for
But there's also some individual functions marked as thread-safe, one sec..
are all queue-associated functions and objects thread-safe as well?
Yes
Another example, where you can even choose how thread-safe you want it: https://docs.python.org/3/library/sqlite3.html?highlight=thread%20safe#sqlite3.threadsafety
so it's safe to assume that every python operation in a thread can be interrupted by the os unless i manually lock? except for those explicitly documented to be thread-safe
We're talking about two different things here. I'm talking about Python thread-safety, meaning where two threads are interacting with the same data, and whether that's safe.
The OS can interrupt Python pretty much at any time. Almost nothing is "atomic" in that sense in Python, but you mostly don't have to worry about it.
There's some OS-level actions that are atomic in that sense, e.g. moving a file in the same file system (os.rename).
i thought when you have a lock acquired, the thread cant be interrupted, hence it's cooperative?
is my understanding wrong?
ah nvm async is cooperative, not threads
i keep mixing them, this is a complicated topic for me
No, that's not how that works. Whether you have a lock or not doesn't influence whether the thread can be interrupted. A lock protects a resource, such that e.g. only one thread can be in a specific part of the code at the same time.
could you elaborate
only one thread can be in a specific part of the code at the same time.
if you dont mind
import threading

my_lock = threading.Lock()

def worker():
    while True:
        result = do_some_work()  # placeholder for whatever produces a result string
        with my_lock:  # only one thread at a time may append to the file
            with open("results.log", "a") as f:
                f.write(result)

threading.Thread(target=worker).start()
threading.Thread(target=worker).start()
This would be an example. If both workers had a result ready at the same time, one of them would block at the with my_lock: line until the other worker has left that context manager.
The GIL is responsible for interpreter internals not getting broken - you won't see things like objects leaking because their refcount was wrong due to parallelism, lists segfaulting due to miscounting their capacity/not using a realloced buffer etc.
The GIL removal replaces all of this with per-object locks, which is slower, but means that if two threads execute python code that doesn't share anything too regularly, it will actually run fully parallel.
this is called a "critical section". locks guarantee only one thread is in a critical section
anyone know off the top of their head if everything is good if i call loop.run_until_complete on one thread, and then once that's finished call loop.run_until_complete on some other thread for the same asyncio loop object?
I would expect any Task relying on contextvars would find the thread change also loses the contextvars, but I am not 100% sure
Most of asyncio's objects aren't threadsafe. There are threadsafe scheduling APIs, just start an event loop in the thread if you need it and pass data, not stateful objects between threads.
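A sketch of the thread-safe scheduling route: one loop lives in its own thread and other threads only hand it work. `start_loop` and `double` are made-up names; `run_coroutine_threadsafe` and `call_soon_threadsafe` are the asyncio APIs meant for this.

```python
import asyncio
import threading


def start_loop(loop):
    """Run the given event loop forever in the current thread."""
    asyncio.set_event_loop(loop)
    loop.run_forever()


async def double(x):
    await asyncio.sleep(0)
    return x * 2


loop = asyncio.new_event_loop()
thread = threading.Thread(target=start_loop, args=(loop,), daemon=True)
thread.start()

# Submit a coroutine from this thread; we get a concurrent.futures.Future.
future = asyncio.run_coroutine_threadsafe(double(21), loop)
result = future.result(timeout=5)

# Stopping also has to go through the thread-safe entry point.
loop.call_soon_threadsafe(loop.stop)
thread.join(timeout=5)
loop.close()

print(result)
```

This doesn't answer whether moving one loop between threads is safe, it just avoids the question by never moving it.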
why can lists be mutated when iterated over, but not dictionaries?
well can and can, but python throws an exception on the latter case
the internal state of the dictionary changes in ways where it is difficult to ensure that it is safe to keep iterating, where lists are linear and easier to manage
makes sense, thanks
im curious though, what exactly is difficult to ensure its safety?
internally the dict uses a sort of hash table, and that table can change drastically when key value pairs are inserted or removed
so the iterator no longer knows which slots are safe/have already been seen/exist
because items can move to before where the iterator is
so if it was allowed the iterator could skip items, double items, or try to access items that no longer exist at the index it thinks they are at
and you can't just rebuild the iterator because its impossible to know what index to continue at
dicts can be mutated during iteration
if you look closely at the error, it says something like "dict size changed", so you can do whatever you want to your dict without causing reallocations, and it will work fine
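A small sketch of the three cases side by side, with made-up data: a list mutated mid-loop, a dict that grows mid-loop, and a dict whose values change while its size stays the same.

```python
# Lists: no error, but removing while iterating shifts items under the index.
items = [1, 2, 3, 4]
seen = []
for x in items:
    seen.append(x)
    if x == 2:
        items.remove(x)

# Dicts: adding a key changes the size, which the iterator detects.
d = {"a": 1, "b": 2}
error = None
try:
    for key in d:
        d[key + "_copy"] = d[key]
except RuntimeError as exc:
    error = str(exc)

# Replacing values for existing keys keeps the size the same and is fine.
e = {"a": 1, "b": 2}
for key in e:
    e[key] += 10

print(seen, error, e)
```

Note that the list loop silently skips 3, which is arguably worse than the exception.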
oh oops
i always assumed that the dict keys couldnt change when iterating
!e ```py
d = {'a': 1, 'b':2, 'c': 3}
for k, v in d.items():
    print(k, v)
    if k == 'b':
        d.clear()
        d.update(d=4, e=5, f=6)
```
@pliant tusk :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | a 1
002 | b 2
003 | f 6
interesting
so it maintains the index, because as long as the dict is the same size the internal table remains consistent?
ah makes sense
so it had more to do with dicts being an unordered datastructure
yea, and now that they are ordered, you can perform limited mutation
perfect, thanks for the explanation
AFAIK Tuples are the only thing in Python with immutability. Pretty sure Python only uses heap for value memory?
Lots of other things are immutable
strings and ints, for one
- sorry i meant the keys could not change while iterating (which is not true, they can change)
And basically all objects are stored on the heap
Ah yes, I should say syntactically, since Python is abstracting the pointer architecture & automated memory management to the interpreter right?
I don't understand what you are saying there
Though yes, the interpreter takes care of memory management and managing pointers, that's not something you deal with in Python code
(unless you're using ctypes or various other exotic things)
As in, for dev experience Python is allowing you to reassign variables with string types etc, except for tuples right?
you can also reassign variables that hold tuples
I thought python tuples were immutable?
the tuple itself is immutable, but you can delete the reference, or modify the contents of the tuple if it contains mutable items
yeah, I think I'm conflating the python object type & the variables themselves
!e py
x = ([0], )  # x is a tuple
x[0][0] = 1  # changing the value of contained item
print(x)
@pliant tusk :white_check_mark: Your 3.12 eval job has completed with return code 0.
([1],)
Okay, I hadn't considered that use case, but that makes a lot of sense
Err, you can't mutate keys of a dict, when they're inside a dict. Whether it's during iteration or not.
Or, to put it more precisely: you shouldn't mutate keys of a dictionary in a way that would affect the result of their hash or equality operators
@candid tinsel @pliant tusk , fwiw
if you do such things you'll simply break your dict
python will try to stop you from doing this, by preventing mutable types from being keys to begin with
but it only goes so far
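A sketch of how it breaks, using a made-up `Key` class whose hash depends on a mutable attribute. The lookup results below rely on CPython's dict implementation.

```python
class Key:
    """Hashable but mutable: exactly what you should not use as a dict key."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value


k = Key(1)
d = {k: "payload"}

k.value = 2  # the hash changes while the key is stored in the dict

found_by_same_object = k in d       # looks in the slot for hash(2)
found_by_old_value = Key(1) in d    # right slot, but equality now fails
still_stored = len(d)

print(found_by_same_object, found_by_old_value, still_stored)

# Built-in mutable containers avoid this by refusing to be hashed at all.
try:
    {[1, 2]: "x"}
    list_key_rejected = False
except TypeError:
    list_key_rejected = True
```

The entry is still in the dict, it just can't be reached by lookup anymore.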
*by mutate keys I meant swap them out for other keys, not mutate the object in place. I keep those shenanigans to #esoteric-python
Okay, fair enough. Just thought I'd make sure that got across to evrsen too
I want to nitpick that mutable != unhashable
but yeah you should not be able to change the hash value of an existing object
There's unsafe_hash in dataclasses.dataclass but it's unsafe for a reason
well, I didn't say that they were the same, just that python tries to stop you from doing it with common types
in languages with some form of mutation control like C++ or Rust of course this isn't an issue, you can have any type you want as your hashmap key, including mutable types
you'll just be prevented from mutating that mutable type while it's in the hashmap
Yeah in Rust you can't get a mutable reference to a key
Rust doesn't actually have "mutable types"/"immutable types" which I found strange at first (so you can have a vector as a key!)
(I suppose you can make a type whose public interface only includes read-only operations, but then there's stuff like std::mem::swap so it's kinda interesting)
the "type" itself would still be immutable
you're just swapping handles
the difference is that the objects follow the handles, as it were, rather than handles following the objects
it's pretty similar to, say, a dataclass in python that has a str member
str is immutable, yes, but the dataclass' member can just be asked to point to a different string
same thing in Rust. the only difference is that instead of having a pointer, and changing where it points, you memcpy the bits of the object to where you want it instead
true, true
I was a decent C++ programmer before I was a decent programmer in python or any GC language
so for me, the python way was very strange
and then you have Rust which is more like C++ yet not, at the same time
in C++ move-assignment is a mutator, so you can write objects that are "truly" immutable
like, you can write types in C++ such that they cannot be swapped
i think solar flairs have something to say about mutability
python/cpython#110805 finally
oh wait it says REPL now instead of stdin?
why does it still say module tho
is it a requirement of a traceback?
The documentation says that PyGen_New is not being used directly. I was wondering what is being returned after the parser found a yield_stmt. How is a generator object created or recognised by the parser?
`Objects/genobject.c` line 1003
```c
PyGen_New(PyFrameObject *f)
```
`Parser/parser.c` line 1885
```c
{ // &'yield' yield_stmt
```
a generator object doesn't get recognized by the syntax parser
it gets recognized by the symbol table generator i think
which then passes info to the compiler
the function node handler in the compiler itself wraps the generated code in a StopIteration handler
`Python/compile.c` lines 2317 to 2322
```c
if (c->u->u_ste->ste_coroutine || c->u->u_ste->ste_generator) {
    if (wrap_in_stopiteration_handler(c) < 0) {
        compiler_exit_scope(c);
        return ERROR;
    }
}
```
then the code is created here and in that function the code flags are computed in this function (generator flag added specifically in this part)
`Python/compile.c` line 2323
```c
PyCodeObject *co = optimize_and_assemble(c, 1);
```
moving on to this other function which calls _PyCfg_OptimizedCfgToInstructionSequence moving on to Python/flowgraph.c
`Python/compile.c` lines 7573 to 7575
```c
static PyCodeObject *
optimize_and_assemble_code_unit(struct compiler_unit *u, PyObject *const_cache,
                                int code_flags, PyObject *filename)
```
`Python/compile.c` lines 7601 to 7603
```c
if (_PyCfg_OptimizedCfgToInstructionSequence(g, &u->u_metadata, code_flags,
                                             &stackdepth, &nlocalsplus,
                                             &optimized_instrs) < 0) {
```
where skipping a few calls we get to this part which adds instructions needed to create a generator when the function is called, specifically RETURN_GENERATOR
`Python/flowgraph.c` line 2476
```c
if (IS_GENERATOR(code_flags)) {
```
and RETURN_GENERATOR is important because that's where the generator is created
but as you can see here it doesn't use PyGen_New(), it uses _Py_MakeCoro()
`Python/bytecodes.c` lines 3739 to 3742
```c
inst(RETURN_GENERATOR, (--)) {
    assert(PyFunction_Check(frame->f_funcobj));
    PyFunctionObject *func = (PyFunctionObject *)frame->f_funcobj;
    PyGenObject *gen = (PyGenObject *)_Py_MakeCoro(func);
```
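The effect of that pipeline is visible from Python. A sketch that checks the code flag and lists the opcodes; the `RETURN_GENERATOR` opcode is only expected on recent versions (3.11+ as far as I know), so the code reports it rather than relying on it.

```python
import dis
import inspect


def gen():
    yield 1


def plain():
    return 1


# The symbol table marks the function as a generator, and the compiler
# turns that into the CO_GENERATOR flag on the code object.
gen_flag = bool(gen.__code__.co_flags & inspect.CO_GENERATOR)
plain_flag = bool(plain.__code__.co_flags & inspect.CO_GENERATOR)

# On versions that have it, RETURN_GENERATOR is emitted near the top.
opnames = [ins.opname for ins in dis.get_instructions(gen)]

print(gen_flag, plain_flag)
print("RETURN_GENERATOR" in opnames, opnames[:3])

# Calling the function runs none of its body, it just builds the generator.
g = gen()
print(inspect.isgenerator(g))
```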
Thank you for the explanation, what I don't get is why the functions in the genobject.c are not used.
https://github.com/python/cpython/blob/main/Python/flowgraph.c#L2483 This here just initialises a struct.
`Python/flowgraph.c` line 2483
```c
cfg_instr make_gen = {
```
Thank you this is what I was looking for.
`Objects/genobject.c` lines 919 to 920
```c
if (coro_flags == CO_GENERATOR) {
    return make_gen(&PyGen_Type, func);
```
so basically if the flag is a generator then it calls make_gen anyways.
I still don't know where all these functions defined in genobject are used.
make_gen is only one of them.
https://github.com/python/cpython/blob/fa18b0afe47615dbda15407a102b84e40cadf6a5/Python/symtable.c#L295
I think this is that part.
`Python/symtable.c` line 295
```c
ste->ste_generator ? " generator" : "",
```
no that's for debugging purposes only
if I'm not mistaken, you didn't mention where its being recognised by the syntax parser.
`Python/symtable.c` line 2181
```c
st->st_cur->ste_generator = 1;
```
`Python/symtable.c` line 2191
```c
st->st_cur->ste_generator = 1;
```
glorious hope for 505: #mailing-lists message
there's also a similar recent discourse thread https://discuss.python.org/t/introducing-a-safe-navigation-operator-in-python/35480
I've been considering the idea of proposing a new feature in Python - a Safe Navigation Operator, similar to what's available in languages like JavaScript, Ruby, and C#. Before proceeding with writing a formal PEP, I wanted to bring this up here to gather initial feedback, insights, and ascertain if such an idea has been proposed or discussed be...
i feel like a lot of these discussions revolve around individual people claiming that something is "rare" in their own anecdotal experience
probably the only real argument against 505 is that debugging long chains of ?. lookups can get difficult. but that's imo not a good enough reason to prevent them from existing
argument by the PEP writer himself (i think)
is
it provides ways for API writers to "slack off" and "be lazy"
since more Nones and implicits and stuff instead of explicitly using types
at least i think that's what he said
"being lazy" is the whole point of python
using python we can write simpler and shorter code in less time, and this pep reduces code size and complexity even more
most of my need for this PEP stems from default arguments
there's also another PEP for that (lazy defaults) but that's deferred too
there is a lot of implicit stuff already in python, so adding another implicit thing is not that bad (and it is not implicit - you are clearly indicating what you need by typing ? symbol)
this: def f(x, y:=x+1) ?
i just don't understand why there isn't an option to make this easier
def f(x=>[]): stuff like this
but yes
that's also part of the proposal
dataclasses maybe
this syntax still remains my favourite
you can make factory for default value, can't you?
actually wait idek
because i wanna support None being passed to intentionally trigger a default value being created
something like
```py
class Log:
    def __init__(self, pos=None, msg=None, *args):
        self.pos = pos if pos is not None else EMPTY_POS
        self.msg = msg if msg is not None else "logged"
        self.addmsgs = []
        for msg in args:
            self.addmsgs.append(msg if msg is not None else "additional log")
```
where i could do Log(pos, None, "some msg") to skip the main message
so i actually need PEP 505 to make it easier
```py
# uses the ?? operator proposed in PEP 505, not valid Python today
class Log:
    def __init__(self, pos=None, msg=None, *args):
        self.pos = pos ?? EMPTY_POS
        self.msg = msg ?? "logged"
        self.addmsgs = []
        for msg in args:
            self.addmsgs.append(msg ?? "additional log")
```
which ascii characters are not used in python syntax?
?$ and `
is that all?
I think that's all yes
lets combine pep about lazy defaults, pep about ?? stuff and some nonsense from me:
class Log:
    def __init__(self, pos:=$ ?? EMPTY_POS, msg:=$ ?? "logged", *args):
        self.pos = pos
        self.msg = msg
        self.addmsgs = []
        for msg in args:
            self.addmsgs.append(msg ?? "additional log")
$ is the passed value of current argument
thing after := becomes a lambda, which is called with passed value (or None, if nothing is passed) and result of this lambda is the final value of arg
that's weird
Yeah I think that's an important detail of why using None for default might actually be good
If a function accepts an optional argument, and when passed it must be not None, you will be in a world of hurt if you want to pass-through some arguments to it
On the other hand, None can be a lousy sentinel value if None should be considered a valid value with its own behavior. An example of this in the real world is handling JSON merge patch
This can be solved by either a sentinel object() default or the (also deferred) lazy defaults. Sentinels are obnoxious for typing when they aren't meant to be used by users, and lazy defaults handily solve both of these problems.
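A minimal sketch of the sentinel object() approach for the case where None is itself a meaningful value. `_MISSING` and `patch_field` are made-up names, loosely modelled on the merge patch case where null means "remove this field".

```python
_MISSING = object()  # private sentinel, distinct from every real value


def patch_field(current, new=_MISSING):
    """Return the field value after applying an optional patch.

    Omitting `new` keeps the current value; passing None explicitly
    is treated as a real instruction (here: clear the field).
    """
    if new is _MISSING:
        return current
    return new


print(patch_field("old"))          # nothing passed, value kept
print(patch_field("old", None))    # None passed on purpose
print(patch_field("old", "new"))
```

The typing complaint is visible here too: the honest annotation for `new` has to mention the sentinel somehow, even though callers are never supposed to pass it.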
I tend to agree with the detraction that this leads to people not caring about the "shape" of the objects they have, and see the perceived need of it as solvable with other methods.
Yeah, that's sort of the complaint against 505 - it special cases None in a way it really shouldn't be. It is a question whether that ship hasn't long since sailed. The stdlib does use None as a sentinel in a massive variety of places, and is not consistent in the places it doesn't do that.
guido says None is special
Yeah, I can't say I disagree
enough APIs carry their Nones around places in the style of null/nil of java/ruby that it does deserve syntax
I still like the fantasy of sentinel-free python where the absence of something means accessing it is an exception, but it is not a reflection of how real code is written.
hrmm, see I don't agree with that complaint. It's more that I'm not sure if pep 505 is individually necessary, or only feels necessary because of sentinel nones, and I'd rather know that before real effort is spent making it happen. Since lazy defaults could give us a real perspective on this, while solving other issues, I'd rather see 505 revisited after 671.
None is definitely special, the question though isn't "is None special enough to justify syntax?" (it certainly seems like it is) but "what syntax would alleviate the most issues?"
sentinel Nones also exist outside of default arguments, but perhaps not enough of them to require syntax, that's fair
even just getting a confirmation "hey, here's the cases where we still feel it" then shows the people detracting about shape of functions that there are cases that a consistent function shape didn't fix, so even in the case where it's still felt as needed, we still gain perspective while working on other improvement.
I suspect dict access (particularly with nested structures) and objects modeling json apis to be where it really feels needed if re-evaluated after 671
PEP 671 doesn't feel like it's going to go anywhere, for what it's worth.
I kind of like the idea, but it's a hard sell to have two different kinds of defaults in the language
Could someone tell me where funcobject is defined in CPython? Or where can I see how functions are being implemented?
Objects/funcobject.c
oh sorry, I should have recognised that.
Would this need to be changed if we were about to add this stuff(which was proposed a few days ago):
def my_function(var_1, var_2):
    pass
my_function(=var_1, =var_2)
(the picture above is from the grammar)
yes, that would need some grammar changes. That feature could likely be implemented fully in the parser and compiler
wait but that grammar node you cite might be for pattern matching
so that would require regenerating it with pegen (after adding some new features)?
let me see
yes, something like make regen-all will do it after changing Grammar/python.gram
my approach for implementing this would be to add a new AST node for it, add a grammar rule that generates this AST node, and write some code in compile.c and symtable.c that handles this new syntax
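For a feel of what the compiler would have to end up with: the proposed `my_function(=var_1)` is meant to behave like `my_function(var_1=var_1)`, and the AST for that existing spelling can be inspected with the stdlib `ast` module. This only shows today's keyword-argument nodes; whether a real implementation would reuse `ast.keyword` or add a new node type is an open design choice, not something settled in the proposal.

```python
import ast

# Parse the existing spelling that the proposed shorthand would stand for.
tree = ast.parse("my_function(var_1=var_1, var_2=var_2)", mode="eval")
call = tree.body  # an ast.Call node

# Each keyword argument is an ast.keyword with a name and a value expression.
pairs = [(kw.arg, kw.value.id) for kw in call.keywords]
print(pairs)
```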
so I don't have to change the pegen generator itself, just the grammar?
I would look at the devguide and/or the implementation of PEP 695 for how to do this
definitely shouldn't need to change the generator
thank you. I messaged the guy who brought this up a few days ago, I might be able to implement it.
good luck! implementing something like this is a great way to get more familiar with CPython internals
yes, this is why I thought it'd be a good learning opportunity.
though obviously this proposal would have a long way to go before it could be accepted into CPython itself
yes, but I think I can come up with a pr, just to see how it works.
i think that things being potentially missing though is pretty common in code, and exceptions are very unwieldy for doing something like replacing a None with a default value
Pretty much every language is trying to have a concise way, these days, to express something like "i optionally have a value of this type, I want an expression that is the value if present, and some default if not"
I liked 505 a lot but it's been a minute so I have to assume it's totally dead
I haven't thought very hard about this idea, but I wonder: instead of 3 new operators (? and ?[] and ?.) I wonder if we could get away with a single new unary ? that returns a proxy object wrapping its operand and preventing KeyError and IndexError and AttributeError from being raised by [] and .
I'm imagining (obviously this would be implemented in C and not Python, but for the sake of illustrating the idea): ```py
class MissingClass:
    def __getattr__(self, name):
        return Missing

    def __getitem__(self, item):
        return Missing


Missing = MissingClass()


class MissingProxy:
    def __init__(self, proxied):
        self.proxied = proxied

    def __getattr__(self, name):
        try:
            return getattr(self.proxied, name)
        except AttributeError:
            return Missing

    def __getitem__(self, item):
        try:
            return self.proxied[item]
        except LookupError:
            return Missing


def postfix_qmark_operator(operand):
    if operand is None:
        return Missing
    return MissingProxy(operand)
```
fwiw the operators were ??, ?[], and ?. (and, if you choose to consider it separately, ??=)
i think your suggestion works for ?[] and ?.
not sure how it would work for ??
You're right, I wasn't thinking about that case - largely because I don't think it comes up that often... I think it's way less compelling than ?. and ?[]
```py
foo(a ?? b)
``` doesn't seem that much better to me than ```py
x = a
if x is None:
    x = b
foo(x)
``` given how rarely I need it. I need None-aware traversal far more often than I need None-coalescing
I genuinely do think there are better ways to deal with potentially missing things (e.g. https://www.youtube.com/watch?v=9lv2lBq6x4A, Raku's uninitialized objects, etc.) than coalescing et al, but coalescing has gotten popular enough that it should probably exist in python in some form.
and of course, if a is a variable instead of an expression with a function call, you can always just do ```py
foo(a if a is not None else b)
```
I guess with my only-postfix-question-mark idea, I'd do coalescing with a method. `a?.coalesce(b)` would evaluate to a if a is not None and b otherwise, and I'd implement it using: ```py
class MissingClass:
    def coalesce(self, other):
        return other

class MissingProxy:
    def coalesce(self, other):
        return self.proxied
```
Though that would mean that `?.coalesce` behaves differently than `?.literally_anything_else`, which sucks, so that's probably a bad idea
can you summarize these "better ways"
or do I have to watch
Usually when I need coalescing, it's with more than two items and better handled with
item = next(filter(None, items), None)
objects that are never falsey mixed with None are handled fine with z = x or y. otherwise
z = x if x is not None else y
The last one can be shorter by inverting the condition and ordering, but I prefer this for intent. I don't find any of these strongly benefit from ??, maybe the last one is better, but enough to justify new syntax? idk
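Worth spelling out that these idioms aren't interchangeable: `filter(None, ...)` and `or` drop every falsey value, not just None. A small sketch of the difference (the `coalesce` helper is hypothetical, not a stdlib function):

```python
def coalesce(*items):
    """Return the first item that is not None, or None if there is none."""
    return next((item for item in items if item is not None), None)


x, y = 0, 5

by_truthiness = x or y                   # 0 is falsey, so this picks y
by_identity = x if x is not None else y  # only None is replaced, so this keeps x

first_truthy = next(filter(None, [None, 0, "", 7]), None)  # skips 0 and "" too
first_not_none = coalesce(None, 0, "", 7)                  # stops at 0
```

So `x or y` is only safe when the non-None values can never be falsey, which is the caveat in the message above.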
idk it comes up a lot for me in defaults. x ??= dict() for example would be extremely nice
is it that much nicer than ```py
if x is None:
x = dict()
basically all the languages I know of that provide ?. and ?[] in some form, also provide ?? in some form
Null object pattern is the thing I sent.
it's nicer than it, yes
What's nicer is not using None this way at all, but as others have said, that ship has sailed.
can you summarize what that means, how it's different from the approach being discussed in 505?
I mean, I agree that it's nicer. Is it enough nicer to justify two new operators (??= and ??)? 🤷 I dunno. I'd use them if they existed, but I've never really felt annoyed that they don't exist
I think realistically, if you're doing ?[] and ?. you probably do ?? and ??= too, yes. It makes sense. The biggest issue here is carving out the conceptual real estate, IMHO, having consensus on None being the missing thing, handling edge cases in how ?[] and ?. work, agreeing that there will be some operators with ? tied to None.
once you've done all that work I'm not really sure why you'd only standardize 2 operators but not ??
You essentially create a special default object instead of None, for example if I had something like
```py
class X:
    preprocessor: Preprocessor | None = None
```
you would do instead
```py
class X:
    preprocessor: Preprocessor = IdentityPreprocessor()
```
It doesn't work all the time, but I have seen many cases where people do some variant of optional instead of making things unified.
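A slightly fuller sketch of that null object idea, assuming a `Preprocessor` interface with a single `process` method (the method name and the `Lowercase` class are made up for illustration):

```python
class Preprocessor:
    def process(self, text: str) -> str:
        raise NotImplementedError


class IdentityPreprocessor(Preprocessor):
    """Null object: a real Preprocessor that does nothing."""

    def process(self, text: str) -> str:
        return text


class Lowercase(Preprocessor):
    def process(self, text: str) -> str:
        return text.lower()


class X:
    def __init__(self, preprocessor: Preprocessor = IdentityPreprocessor()):
        # Sharing one default instance is fine here because it holds no state.
        self.preprocessor = preprocessor

    def run(self, text: str) -> str:
        return self.preprocessor.process(text)  # no None check needed
```

The branch on None disappears from `run`, at the cost of having to write one do-nothing subclass per interface.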
but then the whole thing is specific to each object? The whole idea here is to keep it independent of the particular object? I definitely like this vastly less, fwiw.
It's always going to be specific to an object, the only difference is whether it is specific in a subtype of the object, or in the body of a null check.
If people want to implement specific objects like that, they still can
the whole point here is to have some kind of easy way to handle these common cases that's independent of any specific type
it's just not really solving the same problem; I don't think you can call this a serious alternative to 505
also, classmethod .default and have all parameters be non-optional is also a way that works. someone can take your defaults or provide everything.
Ah, I meant potentially missing things as in the abstract concept of a value possibly not being present, not specifically A | None. As I said, A | None is ubiquitous enough that the syntax should exist, I am just saying it is way too ubiquitous despite often leading to worse code.
(or multiple functions for differing behavior, or passing in a configuration class, flags with defaults, etc)
A lot of the ubiquity of it isn't good. Some of it is necessary, some is a heavily overloaded set of behavior in a single user facing function
AFAICU, you're saying that people overuse A | None? That may be true, but even if true, I'm not sure that's really an alternative to coalescing per se, because there are still tons of valid use cases for A | None
also I will tell you, without a doubt IME working in a language where almost everything has a default state, that generally the opposite is true
there's cases where default states make sense but they're not very common
everything with T | None has a default state. some are just defining that in the function body
even in your example above: IdentityPreprocessor seems like a good preprocessor to have, and if you want to default to "do nothing" then sure, it's a good choice. I don't really see that as a "default state" though
the thing that makes ?. and ?[] much more necessary than ?? in my mind is that they're very often chained. Working around the lack of ??= turns one line into 2, but working around the lack of ?[] and ?. tends to lead to many more. PEP 505 gives this example:
For example, await a?.b(c).d?[e] is evaluated:
If you were to do that without none-aware operators, you'd need a lot more code. Something like ```py
tmp = a
if tmp is not None:
    tmp = tmp.b
    tmp = tmp(c)
    tmp = tmp.d
    if tmp is not None:
        tmp = tmp[e]
await tmp
```
let me rephrase, it's not a "special default object"
it's just some specific useful preprocessor that you picked as the default
yes, I understand what you're saying, they save more code. If your argument is that ?? and ??= by itself couldn't motivate the proposal, the way that ?[] and ?. can, yes, maybe that's true.
but that doesn't mean that if we solve all the problems associated with ?[] and ?., that it makes sense to not toss in ?? and ??=
it doesn't mean we Should either
if ?? encourages worse code, do we want language features to do that?
as long as they don't add new objections, at least
if they did, then sure, that would be a real factor
I don't see how they do though
(existing objection already in the discuss thread to ?? is the worse code thing btw)
outside of the tiny minority that think that ?? one liners are worse than two liners with if statements
well, quantity of new operators was one of the objections people raised. Removing ?? and ??= cuts it in half. Likewise, the "how to teach" section gets cut in half if we're teaching half as much
I don't think "how to teach" gets cut in half at all; the ideas are exactly the same
I think in practical terms it would be mega strange to have these special operators for more complex none-aware operations, but be lacking the simplest (and often most useful) none aware operator at all
it would certainly make python a tremendous outlier in this space (not in a good way)
I don't think "other language has this" is a good compelling rationale.
What are the motivations of the other language having it and what have the consequences been.
I've seen way too many people with "clever" code that isn't any more performant or maintainable in other languages for the sake of a few characters.
IDK, maybe I just haven't hit the right domains, but I have pretty much never felt the need for sentinel-checking syntax except for nested error checks, which you can do with exceptions in python.
there's nothing "clever" about x ??= dict(). You're just not used to it 🤷
That's the one case I think has merit, and I think it's better served by another pep
I can see None aware traversals if your APIs return a lot of Nones
it's not the only case with merit 🤷
already mentioned as much #internals-and-peps message
Other languages usually look at these issues very carefully, they also already have implementation experience, they are, as the saying goes "prior art" - if you have near consensus on something, it's good to consider why exactly you think python is special, that it should do things differently from those languages
if you take away dealing with function defaults, that turns the proposal from "this affects every single library with a public API" to "This affects the couple people who do nested traversals with Nones, and some amount of other code".
that's a poor argument that glosses over what needs unique to that language motivated it
Err, no, it's not
it'd be simpler to just say "I don't understand the value of prior art or how it works" than to keep saying my arguments are poor
yes, it is. you assume they considered it well, and they probably did, in the context of the language it was added to
yes
so if you think those reasons don't apply to python, you should think about why python is actually different
python isn't x other language, so you need to compare the contexts and motivating cases to use prior art
other way around
you can't assume it is good, you have to show it
"i don't think that x = y ?: 5 is that much better than an if statement" is not unique to python in any way of course
because python (theoretically) doesn't return None from things to indicate that it failed. (in practice, even the stdlib does it at this point).
nothing is being assumed here; the pep still has to stand on its own merits, it's just a rather weird deviation from how this sort of thing is done, that's all
yes, very much a valid point, but it applies to the PEP as a whole, right
this fits in with what I was saying before, how the major challenges with 505 are things that apply to all the operators
if we solve those issues, then only adding part of the operators, is strange IMHO, unless as @raven ridge mentioned, there are actually issues unique to specific operators
I'm saying that once you solve the real issues here: "should None really be special" being among them, arguments about readability, or one extra operator, are fairly π worthy
I agree that ?? will lead to cleaner code, but I can see how one less line of code is not worth a whole operator.
the chaining is handled just fine with:
a = None
try:
    a = foo.bar.etc
except AttributeError:
    pass
Is the new syntax any better than this? This is already "free" in the happy path where you get what you expect.
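One behavioural difference worth noting, as a sketch: per PEP 505, `foo?.bar?.etc` only short-circuits when a value is None, so a missing attribute would still raise. The try/except version collapses both cases (and any typo in an attribute name) into None. The `get_etc` wrapper and the test objects here are made up for illustration.

```python
from types import SimpleNamespace


def get_etc(foo):
    """The try/except pattern from above, wrapped in a function."""
    a = None
    try:
        a = foo.bar.etc
    except AttributeError:
        pass
    return a


present = SimpleNamespace(bar=SimpleNamespace(etc=1))
bar_is_none = SimpleNamespace(bar=None)  # None.etc raises AttributeError
bar_missing = SimpleNamespace()          # no .bar attribute at all

results = [get_etc(present), get_etc(bar_is_none), get_etc(bar_missing)]
```

The last two cases are indistinguishable from the outside, which may or may not be what you want.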
The coalescing for mutable default would be better handled by lazy defaults, so does this actually solve a problem or is it just aesthetic?
i don't think an operator is worth much per se, it's mostly the concept behind that operator
You can't just throw in every operator that lets you skip a short pattern, there are not that many operators
sure, I agree, but it's very different to talk about one operator tossed in ad hoc that solves some specific problem, than when we're talking about a set of operators that solves things holistically
I think if ?[] and ?. were "solved", all the issues, etc, 100% accepted, and so on
then we'd already have to teach people all these ideas, at which point ?? is literally another 3 lines in some help page
the "cost" of the operator is mostly conceptual, not syntactic, and that conceptual cost is 99% amortized
Well, there's other problems beyond just teaching it. Does the operator encourage further use of None where exceptions for lack of value should be preferred?
there is a finite number of operators python can fit, and ?? could be used for something more useful in the future potentially is more the concern as I understood it.
even an objection like that is vastly diminished by standardizing things like ?[] and ?. right
because now that you've so heavily associated ? with None, using ?? for anything else will always be met with "that's surprising"
Yeah, I am mostly in favour of having the operators, but I can see why people don't like ?? specifically
i think people giving that argument are basically just giving the generic argument about reserving syntax without considering the in-context implications of the language design
i would say if you're going to argue against pep 505 it makes a lot more sense to argue against it as a whole
Though, again, I am open to seeing operator specific technical issues, but I haven't seen any yet
the technical issues seem common, and it's just the "taste" issues that seem operator specific
btw, fwiw, here's a core dev's take on the main reason 505 didn't make it:
As others noted, the main semantic sticking point was that the specific is None check was seen as too limiting, but the proposals to offer a more flexible underlying protocol based approach (e.g. https://www.python.org/dev/peps/pep-0532 ) were seen as too complicated. (There was also a syntactic sticking point, which is that ??, ?., and ?[] don't really meet anyone's definition of "executable pseudocode", which is a standard we aspire to for new Python syntax)
I.e. it's mostly focused on the more "meaty" objections that apply to all the operators
I think the inclusion of ??/??= only serves two purposes:
- Mutable defaults (better handled by PEP 671)
- encouraging worse code that returns None to callers that should expect a value, rather than raising an Exception.
So I'd heavily prefer 505 not be accepted if ??/??= is on the table.
okay, well, luckily for us those are not at all the reasons why 505 was held back
https://www.youtube.com/watch?v=0m2Cy5X6lcE&t=1520s Has more context on 505 not making it.
EuroPython 2022 - CPython Developer Panel - presented by Łukasz Langa, Pablo Galindo Salgado, Mark Shannon, Steve Dower, Irit Katriel, Batuhan Taskaya & Ken Jin
[The Auditorium on 2022-07-13]
Come meet the folks who make the Python programming language!
A panel discussion of core Python developers will take place on Wednesday at 2pm. Hear wh...
nice find
but also oy
reasons which may be applicable to python programmers that deliberately eschew type annotations
and type checkers
rather disappointing
the reasoning I pasted from that other link is, IMHO, a lot more reasonable
Type checkers don't save you from sticking Nones in random places, they just ensure you don't screw up the nonsense places you have Nones in.
^
type checkers will prevent you from putting a None in a List[Foo]
if you want to put None, you need to deliberately change the type of that list
That's not the argument, the argument is that people would confidently make a list[Foo | None] since they know they could easily handle the Nones at the other end
I don't think that's the argument, since the whole argument is about it "creeping in"
it's one thing to just put some nones in your list, it's another thing to change the type of your list
well, yeah, you put it one list, and now you need to keep one of those values in a class, so you make that | None, and then that needs to go into a function, so you make that argument into a | None, etc etc etc.
Like, if you reframe the argument in the context of a statically typed language, it would just provoke laughter, put it that way
python is obviously not a statically typed language, but a lot folks do use static type checkers these days so it greatly weakens the argument
realistically, people choose the types that make sense, and then when locally they push a None in there, the type checker will complain and they'll just check for None to make that error go away. that's what happens in reality when you put static type checking in the mix.
if you don't have static type checking, the argument is a lot more plausible
I think they meant creeping in in the design phase
you push the None in, and the error occurs at runtime, and it's a lot more obvious to fix it at the usage site
It's not just about a potential error, but about design.
you decide to use a None here and there and suddenly half your types are | None because you pass things around that way now.
Well, no, you have to change all your types by hand, or design them that way explicitly
from the beginning
Yeah, but you can do that
You can but you're not terribly likely to unless it makes sense
I have done that with I believe Maybe in haskell.
Well, did you try telling haskell folks that Maybe being ergonomic leads to people overusing it everywhere, and Maybe should be made less ergonomic so there's less unnecessary usage of it?
Most kotlin codebases will return a Type? when they have a potentially failing operation, because they know it will be the easier thing to deal with compared to an exception since the syntax for it is better.
I don't really agree that's the norm or encouraged (not that I program Kotlin professionally)
I've had the misfortune of re-writing both haskell and C# code bases where a lot of junk was passed around, type checking doesn't change that people write code based on the language features they have, and that we should be mindful that language features can encourage bad patterns.
Funny enough, there's been talks about this (well, the bigger issues with over-using monads) like every functional programming convention for the last at least 4 years.
arguments in the form of "making X easier will make people overuse X" are not compelling because a) they're totally generic, you can apply it to anything without thinking (and people often do), and b) they're not generally backed by any evidence. If someone is asserting it will shift how things are being used negatively, I'd say the burden of evidence is certainly on that person.
you need something much more concrete; specific evidence that indicates it will significantly increase overuse, etc
Prolog has cuts and they are massively overused since they are easier than writing logically pure code. That's sort of a simple example
When it comes to syntax, you can't put the genie back in the bottle easily once added. We have ample evidence of code horror due to people golfing shorter "clever" solutions with much more basic things.
I'm not familiar with prolog, but that doesn't show that prolog is better off without cuts, right?
and can show plenty with null coalescing in real world code in languages that have it
well, no, you do still sometimes need to do a cut
but maybe it shouldn't be as simple as it is
and then people would maybe actually write more pure code.
respectfully, since you consider x ??= dict() to be "too clever" I'm going to take that with a heavy grain of salt
I don't consider that too clever, I consider that to be the case with merit but better served by pep 671
haskell is much better at maintaining purity, and it is harder to do impure things in haskell.
hard for me to discuss this, as I don't know Prolog nor cuts. but I've seen this play out, many times, in many languages, and it's just rarely compelling
I made that quite clear already
it's also the case that typical users read a lot more into these "muh readability" issues and the people actually developing the language, and so on, are more concerned with whether they can actually find a good design, that is technically sound
Go generics was that in spades, for example (and still one of my all time favorites)
I'm not concerned with readability issues here, I'm concerned that, given a way to be "more concise" and "just handle None", design decisions will stop taking into account where None should optimally be handled.
This has been an observable effect of null coalescing already in other languages that have it, and even in the elimination of None from Unions in typed python code not being done until very late in a call chain because someone else handles it
I mean, that might be a valid concern if there weren't already plenty of languages that have ergonomic null handling where this doesn't happen
If you survey 1000 kotlin developers and ask them "Do you find that people are just returning nullables everywhere because they're ergonomic, to the point where you wish it was less ergonomic so they would stop"
I would argue most other languages with null coalescing got it because they had too much nulls that were obnoxious to deal with without the syntax.
Well, I'll happily take bets on the outcome of that survey
and python is in the same position now.
this goes back to what I said about the motivations of null coalescing as a language feature and the context of the language they are added to.
I mean Swift has it, that's pretty much a "from scratch" position.
Rust does not have null coalescing but it has very ergonomic ways to do the same things
haskell, as you said
and so on
it's pretty much just the norm now, regardless of whether it's being retrofit on an existing language or not
I'm not really sure why you say "for better or worse" - we could argue the merits of the different flavors of them but having something in this vein is pretty fantastic
that's the reason it's pervasive
not because all these language designers aren't too bright
I find that it[Maybe]'s problematic in Haskell when overused, that it gets overused. Same with js, same with C#. My experience matches yours of it not being an issue in Kotlin, and I think it's a difference of other features that never encouraged over-use of a nullable value in Kotlin
python, None is everywhere, gets used as basically a "Default sentinel" in the tutorials that tell people why a default value of list/dict is bad, etc
gets used as a "Bad return value"
etc
given the landscape, I want more tools that make None show up less, not tools that make it easier to ignore, which just ignores and even encourages the landscape problem because "Well, it's what everyone else is doing and was even the motivating reason for this feature"
@flat gazelle fwiw I don't think the python (w/ 505) or kotlin solutions are ideal, both since they collapse nulls, and because they do take up syntax for what I think is solvable without it.
Rust's approach I think is pretty good. It doesn't use special syntax, it's a little less concise, but good enough I think relative to how common the use cases are.
but I don't think that approach can be mapped back to python very easily
you'd need to have "real" sum types instead of Union just being a type annotation and having a specific value from one of the types in the Union there
yeah, the more I think about it, the more I realise that it's really a consequence of null checks being the simplest solution to a lot of patterns, and I think the ship of "giving people something that's not typechecks for missing values" has long since sailed.
why would you want to give people something that's not a typecheck for missing values
you already do because of how PyObject actually is and interacts with containers/values.
you can isinstance or pattern match on types too :)
because I dislike sentinels
maybe we should argue on adding Maybe to the stdlib 🤪
and then disregard 30 years of python code
like, in C++, when you have a unique_ptr<Foo>, nullptr is a sentinel value
In Rust the equivalent is Option<Box<Foo>> ; there is no sentinel value anymore
Imagine having an optional argument, a Optional[] argument, or a Maybe[] argument, surely that isn't confusing.
I see Option as sort of the natural development over a sentinel that actually makes it pretty convenient to work with. It is however married to a very different approach from the more OO style I expect of python.
Option isn't married to OO.... at all
oh, I see
sorry I misread
I'm not sure I see anything about Option that's incompatible with OO either though 🤷
The reason that Option is an upgrade over a sentinel isn't convenience; it's for type checking reasons
the point is that often sentinels don't "really" support the operations that other values of the type do, or they don't in this specific context
famously, nullptr doesn't support dereferencing
but also you can imagine if you are storing say, identification numbers of people, and you use -1 as a sentinel value to indicate "no ID"
then operations like first_id < second_id will "just work"
that is, they'll type check
to say this sort of thing has caused a lot of bugs in C, C++, Java, python, etc would be putting it mildly
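A rough Python rendering of the ID example, assuming a hypothetical in-band sentinel `NO_ID = -1` and made-up function names. The first version type checks and runs but quietly gives a meaningless answer; with `Optional[int]`, a checker such as mypy is expected to flag a bare `first_id < second_id` until the None case is handled.

```python
from typing import Optional

NO_ID = -1  # hypothetical in-band sentinel meaning "no ID"


def lower_id_sentinel(first_id: int, second_id: int) -> int:
    # Type checks and runs, but silently treats NO_ID as the smallest ID.
    return first_id if first_id < second_id else second_id


def lower_id(first_id: Optional[int], second_id: Optional[int]) -> Optional[int]:
    # The None cases have to be dealt with before the values can be compared.
    if first_id is None:
        return second_id
    if second_id is None:
        return first_id
    return min(first_id, second_id)


wrong = lower_id_sentinel(NO_ID, 42)  # the sentinel "wins" the comparison
right = lower_id(None, 42)            # the missing ID is skipped
```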
Yea, and Option solves this by giving you a dead object that you can then ask whether it is the real object or not.
It is arguably an improvement over a sentinel, but sentinels are IMO something to be avoided in the first place, and thus, so is Option in OO, I would argue. Of course, there are cases where avoiding them is worse, and there Option is actually useful.
I mean... the issue I take with this is that people aren't really in the habit of randomly adding Option into things where it's not needed
like I think most people understand that they can use specific values of objects to consolidate later branching
Like, if they have a member variable of a class, that stores a function, that maps say str into str
if they can use the identity function as a default, and thereby avoid making it an Optional[Callable[[str], str]]
then... I think most people already understand that's a good idea, and will try to do that
the whole mutable defaults situation in python is kind of proof of that: people first try to make the default dict() or whatever, and then they find out they can't for weird python specific reasons.
then they use None as the default and branch on it.
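For anyone who hasn't hit it yet, the "weird python specific reason" is that a default value is evaluated once, when the function is defined, so a mutable default is shared between calls. A minimal sketch with made-up function names:

```python
def append_shared(item, bucket=[]):
    # The default list is created once, at definition time, and then reused.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # None as the default, with the branch creating a new list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


append_shared(1)
shared = append_shared(2)  # still holds the 1 from the previous call
append_fresh(1)
fresh = append_fresh(2)    # starts from an empty list every time
```

The `if bucket is None` branch is exactly the spot where `bucket ??= []` or a PEP 671 late-bound default would go.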
i dislike adding ?. and ?[ without ?? and ??=
the purpose of ?. and ?[ is safe navigation, but if you have possible None somewhere, all attr- and item-accesses after that should become None-aware, so presence of None infects whole expression
and i strongly believe that maybe(a).b.c.d.e.f is by far more readable than a?.b?.c?.d?.e?.f
if you have ?? operator, you can replace None-able value with some non-None default value: (a ?? default).b.c.d.e.f which shows clearly that a can possibly be None, but everything else cannot (which wasn't clear in a?.b?.c?.d?.e?.f example)
i made a library for this once upon a time -- makes sentinels that have no-op operations or whatever you want so you don't have to branch
I'd be curious how you generalize that
Seems like each case is rather unique?
you can provide the methods if the defaults aren't good enough
though i thought python was adding sentinels at some point? or there was a pep for it... is that what started this conversation?
It was more about 505
Is there some hidden meaning here?? https://github.com/python/cpython/issues/110930#issuecomment-1765303127
what
open source gets trippy sometimes, huh
:incoming_envelope: :ok_hand: applied timeout to @supple flume until <t:1697540896:f> (10 minutes) (reason: duplicates spam - sent 4 duplicate messages).
The <@&831776746206265384> have been alerted for review.
compromised account methinks
I've seen some pretty unhinged arguments going on in Core Python communities. I believe threats of violence have been thrown by a few community members.
Not entirely certain; I wouldn't expect that to be particularly opaque as far as decisions go-- realistically you don't necessarily have to be a 'community member' to contribute, but in this case I believe the individual was reasonably involved in Python in general; enough to make an impassioned speech that was... unfortunately a bit inflammatory. I would presume the end-result was likely a simple ban from contributing on GitHub discussions, but I truly don't know.
Obviously it's an infinitesimally small population, but they exist everywhere.
hello
There is a token being used in the grammar called TYPE_COMMENT, I'm wondering whether it is equivalent to # type: str.
or what is TYPE_COMMENT referring to?
yes, that's what it's about
hmm, but between parameters? it is literally a possible element to pass in, I might be wrong though.
see the send_email example in https://peps.python.org/pep-0484/#suggested-syntax-for-python-2-7-and-straddling-code
the reason why it is a bit strange, is because it is in the parameters.
param ',' TYPE_COMMENT?
here the TYPE_COMMENT is being marked as optional.
'(' [params] ')', I don't think type comments can be used between the brackets.
I just linked you to the specification for where you can use type comments within brackets
ah my bad, there is a shorter "send_email", which I stumbled upon when I opened the link, so I didn't go further.
oh right
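For reference, a small sketch of per-parameter type comments being picked up by the parser, using the standard `ast` module with `type_comments=True`. The function signature is a shortened variant of the PEP 484 `send_email` example; the variable names are only for illustration.

```python
import ast

SOURCE = '''
def send_email(address,     # type: str
               sender,      # type: str
               ):
    # type: (...) -> bool
    pass
'''

# type_comments=True asks the parser to keep TYPE_COMMENT tokens
# instead of discarding them like ordinary comments.
tree = ast.parse(SOURCE, type_comments=True)
func = tree.body[0]

# Each parameter carries the comment that followed its comma.
arg_comments = [(a.arg, a.type_comment) for a in func.args.args]
print(arg_comments)
print(func.type_comment)
```

This is the `param ',' TYPE_COMMENT?` case from the grammar: the comment sits after the comma, inside the parentheses.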
I'm at the stage where I'm trying to figure out which rule in the grammar makes it possible to pass in arguments like this:
test(param=param)
I've already found out the part which "handles" positional arguments.
test(param1, param2)
param_no_default+ param_with_default* [star_etc]
This specification looks good, although the "expression" inside the rule does some interesting stuff
param_with_default+ [star_etc]
Could the expression represent a parameter (param=...(expression))?
default: '=' expression | invalid_default
expression:
| disjunction 'if' disjunction 'else' expression
| disjunction
| lambdef
not sure if I put the question in an understandable way.
That looks like you're looking at the grammar for function declarations, not function calls
oh
(in general, "parameter" is the thing in a function declaration and "argument" is the thing in a function call, though the terminology isn't always consistent)
yes, that was the mistake that I made.
args:
| ','.(starred_expression | ( assignment_expression | expression !':=') !'=')+ [',' kwargs ]
| kwargs
I think this is what I'm looking for.
assignment_expression:
| NAME ':=' ~ expression
that's the walrus
that looks more like it
so if I have something like this
my_func(arg1=smt)
arg1 -> NAME
"=" -> '='
smt -> expression
I'm not entirely sure about the "expression" here, how can it indicate an argument such as smt
hmm
a name is an expression
I think this is what is confusing me
expression:
| disjunction 'if' disjunction 'else' expression
| disjunction
| lambdef
if you follow disjunction down you'll probably eventually find something that matches a bare name
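One way to check this without reading the whole grammar is to parse a call with the standard `ast` module and look at what the keyword's value turned into. A quick sketch (variable names are just for illustration):

```python
import ast

# Parse a single expression and take the Call node.
call = ast.parse("my_func(arg1=smt)", mode="eval").body
kw = call.keywords[0]

# The part after '=' went through the `expression` rule and came out
# as a plain Name node.
print(kw.arg, type(kw.value).__name__, kw.value.id)

# The same slot accepts any other expression, e.g. a conditional.
call2 = ast.parse("my_func(arg1=a if b else c)", mode="eval").body
print(type(call2.keywords[0].value).__name__)
```

So `smt` is just the simplest thing `expression` can match; the rule is the same one that handles `a if b else c`.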
at first glance, I think the whole thing could be enabled by adding * in front of NAME. * matches zero or more occurrences.
kwarg_or_starred:
| NAME* '=' expression
oh you're looking at the f(=x) thing. I think you should add a new alternative to the rule in that case
well, yes I can add a new rule so like:
kwarg_or_starred:
| NAME '=' expression
| NAME* '=' expression
if this is what you mean.
I'm not sure how I can test this out and see whether what I'm doing is correct or not.
NAME* doesn't make sense. Wouldn't that allow f(a b c=3)?
well you would still need to use comma to separate them and also the equal sign.
NAME* means any number, right?
you mean the * sign?
yes
That's not zero or one.
*zero or more
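A loose analogy using Python's `re` module, since `*` and `?` mean the same thing there: zero or more versus zero or one. This is not how the PEG parser works internally, and the patterns below are simplified stand-ins for `NAME* '=' expression` and `NAME? '=' expression`.

```python
import re

NAME = r"[A-Za-z_]\w*\s*"

# Zero or more names before '=' (the NAME* version).
star = re.compile(rf"(?:{NAME})*=\s*\w+")
# Zero or one name before '=' (the NAME? version).
opt = re.compile(rf"(?:{NAME})?=\s*\w+")

star_accepts_many = star.fullmatch("a b c=3") is not None
opt_accepts_many = opt.fullmatch("a b c=3") is not None
opt_accepts_one = opt.fullmatch("c=3") is not None
opt_accepts_none = opt.fullmatch("=x") is not None
```

The starred pattern accepts `a b c=3`, which is the problem pointed out above; the optional one accepts only `c=3` and `=x`.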
I just realised that the grammar on the website ignores all of the "action_helpers", this makes the grammar even harder to understand
I suppose I'd need to add/refactor something, so then I can call something similar to _PyPegen_keyword_or_starred
Should I construct a new object such as KeywordOrStarred? hmmm
Is the fact that it's impossible to obtain a callable for the currently executing frame without doing a lookup in f_locals/f_globals of another frame intentional?
I would really really like it to not be impossible.
import inspect
from types import FrameType
def inner_function():
    frame: FrameType | None = inspect.currentframe()
    if not frame: return lambda: print("No current frame.")
    callable_object = frame.f_globals[frame.f_code.co_name]
    return callable_object
inner = inner_function
def inner_function():
    print("This is the wrong function!")
    return inner_function
inner()() # prints "This is the wrong function!"
I would like a way to not have this issue
It seems like what I want is the f_funcobj of the _PyInterpreterFrame. Is there a reason why this isn't public?
Just that it may be None?
Possibly the right question is why it should be public. These things are implementation details, and exposing them publicly means guaranteeing backwards compatibility and possibly constraining future changes in the interpreter
because the thing that is actually executed is the code object, the function just wraps it with appropriate namespace and some other stuff
also, you can exec(codeobj), and in this case there is no function that is currently executing
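A small sketch of that last point, using only `exec` and `sys._getframe` (a CPython implementation detail). The names `body`, `seen` and `namespace` are only for illustration.

```python
import sys

seen = []


def body():
    # Record the name of the code object running in this frame.
    seen.append(sys._getframe().f_code.co_name)


code = body.__code__
namespace = {"sys": sys, "seen": seen}

# Run the code object directly. No function object is called here,
# and the namespace it runs in does not even contain one.
exec(code, namespace)

print(seen)
print("body" in namespace)
```

The frame still reports a name, because the name lives on the code object, but there is no function that "owns" the frame.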
any ideas on how I can move from this point?
Hello everyone I am a beginner
Hi!
It should be public because I really desperately need to know, and because it's been the same for many, many versions.
I don't think frame objects have ever not stored their callable
And because we already can ask it for the name
Not every frame needs to have a callable, but if you ask for the name of one of those frames it just returns none.
I propose the same, but just give me the callable.
I learned Python as my first language, but when I joined college my professor told me that it's easy to learn and that I wouldn't get a good grip on the fundamentals that way, so I should first start with a moderate language such as Java or C/C++. Is what I have been told true?
their arguments are not entirely wrong, but their conclusion is arguable
a bunch of people do agree with that, but at the same time, it's fine to start with python to get an overall understanding of programming before diving in a more "moderate" language
But now I have learned java so is it a good time for me to switch to python?
just use whichever language is more appropriate for each project
Ok
| '=' b=expression _PyPegen_keyword_or_starred(...)
KeywordOrStarred *
_PyPegen_keyword_or_starred(Parser *p, void *element, int is_keyword)
{
    KeywordOrStarred *a = _PyArena_Malloc(p->arena, sizeof(KeywordOrStarred));
    if (!a) {
        return NULL;
    }
    a->element = element;
    a->is_keyword = is_keyword;
    return a;
}
any ideas?
should I get help from someone who wrote the parser generator?
I think what I said before still applies: add a new AST node for it, then write parser code that generates that new AST node. The PEP 695 implementation can serve as an example there
Thank you, but shouldn't I first implement this grammar change?
Grammar rules generate AST objects. You can't generate AST objects that don't exist yet
Obviously all these things need to be done in tandem for a working implementation
I wonder whether this thing will be 9k+ (lines) or not.
should definitely be a lot smaller than PEP 695 was!
you shouldn't have to add new bytecodes and scope types and builtin objects
what are you trying to do anyway? i'm curious
_PyInterpreterFrame is brand new in 3.11, isn't it?
IIRC stack frame objects are at least as old as Python 3.
It has had different names though I think.
Although it's always contained the callable.
Python 3.11+ has two entirely different types of stack frame structs, though. _PyInterpreterFrame is the new, totally private one, which isn't a Python object
https://discuss.python.org/t/syntactic-sugar-to-encourage-use-of-named-arguments/36217
Trying to implement this first just for learning purposes. Some core devs are opposing the idea.
Issue Named arguments confer many benefits by promoting explicit is better than implicit, thus increasing readability and minimising the risk of inadvertent transposition. However, the syntax can become needlessly repetitive and verbose. Consider the following call: my_function( my_first_variable=my_first_variable, my_second_variable=my_s...
I'm supporting the idea. (personally speaking)
this gives the same vibe as
```py
def __init__(self, some_name, another_name):
    self.some_name = some_name
    self.another_name = another_name
    ...
```
being turned into
```py
def __init__(self, $some_name, $another_name):
    ...
```
I don't think so. Not sure what that sign indicates.
it means some_name will be assigned to self.some_name
both are good ideas to me
The former is more about keyword arguments.
but still trying to avoid repeating names
What do you think about the idea? (what I sent before)
both are good ideas to me
although i'd much prefer f(x=, y=)
because f(=a+2) (valid assuming expression is used) isn't very clear as to what it's gonna do
which is implied to work because the right side of a keyword argument is a value
that is any expression
What would happen there? a=a+2
sure, but what about f(=a+b)
Interesting.
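For comparison, here is roughly what `f(x=, y=)` would be shorthand for, next to an emulation that works today. `same_name` is a hypothetical helper written for this sketch; it relies on `sys._getframe`, a CPython implementation detail, and is not part of the proposal.

```python
import sys


def same_name(*names):
    """Hypothetical helper: build {name: value} from the caller's variables."""
    caller = sys._getframe(1)
    scope = dict(caller.f_globals)
    scope.update(caller.f_locals)
    return {name: scope[name] for name in names}


def f(x, y):
    return x - y


def demo():
    x, y = 10, 3
    explicit = f(x=x, y=y)                # what you write today
    emulated = f(**same_name("x", "y"))   # stand-in for f(x=, y=)
    return explicit, emulated
```

The helper can only look up plain names, which mirrors the concern above: `x=` has an obvious keyword name, while something like `=a+b` does not.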
