#internals-and-peps

cyan raven
#

I think first a minimal pep would be a great idea.

#

Since there are a bunch of things to clarify.

#

I suppose the only good option is to use a single "NAME" after the equal sign.

umbral plume
#

if it's worth anything, having x= instead of =x syntax would end up mirroring the syntax of f-strings a bit more, for consistency's sake

faint river
#

f(a=*, b=*) I prefer some placeholder on RHS over simply name=

rose schooner
#

i don't think you should call this directly

#

wait hmm

#

what am i even saying

#

so based on the format and using the x= syntax it should be
```diff
 kwarg_or_starred[KeywordOrStarred*]:
     | invalid_kwarg
-    | a=NAME '=' b=expression {
-        _PyPegen_keyword_or_starred(p, CHECK(keyword_ty, _PyAST_keyword(a->v.Name.id, b, EXTRA)), 1) }
+    | a=NAME '=' b=expression? {
+        _PyPegen_keyword_or_starred(p, CHECK(keyword_ty, _PyAST_keyword(a->v.Name.id, b ? b : a, EXTRA)), 1) }
     | a=starred_expression { _PyPegen_keyword_or_starred(p, a, 0) }
```
#

that's about it too

#

i'll implement that later

cyan raven
rose schooner
#

it's easy enough

cyan raven
# rose schooner :3

Well, I talked with the guy who wants to propose the pep and asked him whether I can implement it so that I can get into CPython internals.

rose schooner
#

it's not really something that gets too involved in CPython internals

#

unless the understanding of the functions is included

cyan raven
#

Well obviously I can't stop you from implementing it but this is what I could find easy enough to give it a try and hopefully, I can improve during the process.
If it's easy for you, you probably can implement more complex/complicated things. 🤷‍♂️

rose schooner
#

good luck with the learning

cyan raven
rose schooner
#

i hope you can learn something about CPython internals if you try to implement it

#

it's fun

cyan raven
#

👍

cyan raven
#

well, I think it should be a syntactic sugar really.
f(a=a) -> f(a=)
that is it.
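
To make the sugar concrete, here is a minimal sketch, assuming the shorthand `f(a=)` should mean exactly `f(a=a)`. It uses the stdlib `ast` module to build the `keyword` node that such a shorthand would expand to. `expand_shorthand` and `build_call` are hypothetical helper names for illustration, not anything in CPython.

```python
import ast


def expand_shorthand(name: str) -> ast.keyword:
    # Hypothetical helper: the node that `name=` would desugar to, i.e. `name=name`.
    return ast.keyword(arg=name, value=ast.Name(id=name, ctx=ast.Load()))


def build_call(func_name: str, *names: str) -> ast.Expression:
    # Build the expression `func_name(n1=n1, n2=n2, ...)` as an AST.
    call = ast.Call(
        func=ast.Name(id=func_name, ctx=ast.Load()),
        args=[],
        keywords=[expand_shorthand(n) for n in names],
    )
    return ast.fix_missing_locations(ast.Expression(body=call))


def f(**kwargs):
    return kwargs


tree = build_call("f", "a", "b")
desugared = ast.unparse(tree)  # ast.unparse needs Python 3.9+
result = eval(compile(tree, "<shorthand>", "eval"), {"f": f, "a": 1, "b": 2})
print(desugared)
print(result)
```

Since the expanded form is an ordinary `keyword` node, nothing after the parser would need to know the shorthand existed.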

#

so the expression can be ignored -> NAME, = or smt.

cyan raven
#

I still don't quite understand how grammar actions are processed.
I'm talking about the function calls between the braces, such as _PyPegen_keyword_or_starred. Other than knowing that these functions are defined in action_helpers, I can't work out how they are used.

    @memoize
    def action(self) -> Optional[str]:
        # action: "{" ~ target_atoms "}"
        mark = self._mark()
        cut = False
        if (
            (literal := self.expect("{"))
            and
            (cut := True)
            and
            (target_atoms := self.target_atoms())
            and
            (literal_1 := self.expect("}"))
        ):
            return target_atoms
        self._reset(mark)
        if cut: return None
        return None

(this is how they are recognised by the pegen_generator)

cyan raven
#

a new AST node would mean extending the Python.asdl file, right?

#
Parser/Python.asdl may need changes to match the grammar. Then run make regen-ast to regenerate Include/internal/pycore_ast.h and Python/Python-ast.c.
fallen slateBOT
#

Parser/Python.asdl line 28

```
| TypeAlias(expr name, type_param* type_params, expr value)
```
dusk comet
#

is there any working implementation of pep554?
maybe on PyPI

#

!pep 554

fallen slateBOT
dusk comet
#

also:

  • is this pep going to be accepted? what is opinion of core devs?
  • is it possible to implement it in pure python using untouched CPython3.12 (with ctypes to call already existing api)? (i dont need reliable and robust implementation, i just wanna experiment with that idea)
raven ridge
cyan raven
# feral island yes

the return type tells what node it returns, right?
simple_stmts[asdl_stmt_seq*]
this node exists in pycore_ast.h hmm but I can't see it in the asdl file itself.

cyan raven
#

was asdl_typeparam_seq also created from the asdl file, or have you added it manually? I don't see any asdl changes there.

karmic ingot
#

hello

#

i came across this pdf

#

but find it hard to understand

#

Create a function called replaceString that takes in three parameters word,
search and replaceWith. The function would replace all instances of the search
parameter with the replaceWith parameter. Example:
replaceString(‘Abdulqudus’, ‘u’, ‘v’) // Should return ‘Abdvlqvdvs’
replaceString(‘javascript’, ‘a’, ‘o’) // Should return ‘jovoscript’

#

how long do i have to wait for someone to reply lol

turbid sentinel
#

Maybe it’s cus you are posting in internals and peps? Just a thought

feral island
faint river
cyan raven
boreal umbra
#

This behavior is unsurprising, but it's never occurred to me that this could happen.

In [8]: stuff = {'a': {'b': 1}}

In [9]: foo = {'c': 2, **stuff}

In [11]: foo['a']['d'] = 3

In [12]: foo
Out[12]: {'c': 2, 'a': {'b': 1, 'd': 3}}

In [13]: stuff
Out[13]: {'a': {'b': 1, 'd': 3}}
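
A minimal sketch of why this happens, assuming nothing beyond the stdlib: `**` unpacking copies only the top-level keys, so the nested dict is shared between `stuff` and the new dict. `copy.deepcopy` is one way to get an independent copy.

```python
import copy

stuff = {"a": {"b": 1}}

shallow = {"c": 2, **stuff}              # copies the top-level keys only
deep = {"c": 2, **copy.deepcopy(stuff)}  # copies the nested dict as well

shared = shallow["a"] is stuff["a"]          # same inner dict object
independent = deep["a"] is not stuff["a"]    # separate inner dict object

deep["a"]["d"] = 3     # leaves stuff untouched
shallow["a"]["e"] = 4  # also changes stuff["a"]
print(stuff)
```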
halcyon trail
boreal umbra
halcyon trail
#

well, it can cause a lot of headaches very quickly. Here it's just a harmless little interpreter example but it's not hard to imagine how this can get nasty

#

one thing is like, whenever you write classes, in principle they can't really maintain their invariants unless you do defensive copying everywhere

boreal umbra
#

Whenever I write functions that operate on json-like data, I typically start with something like this

type JsonType = dict[str, JsonType] | list[JsonType] | str | int | float | bool | None

def some_func(data: JsonType) -> JsonType:
    data = deepcopy(data)
    ...
halcyon trail
#
class Foo:
    def __init__(self, x: List[int]):
        assert 0 not in x
        self.x = x # class invariant: self.x has no 0's (x is stored by reference, not copied)

l = [1,2,3]
f = Foo(l)
l.append(0) # oops: f.x is the same list, so the invariant is now broken
#

Yeah, I mean you can sprinkle in defensive copies randomly but obviously a) lots of people don't do it, and b) it's a ton of waste that actually pretty quickly adds up

#

more modern languages tend to be designed with a lot more awareness around controlling mutation

#

even relatively simple things can make a pretty big difference

boreal umbra
#

do you have an example that isn't from Rust?

halcyon trail
#

sure

#

Kotlin for example, doesn't really have a very fancy system around controlling mutation per se

#

but one big difference is that most of the operations "by default" use read-only APIs, and also in practice those instances are often immutable

#
x = [f(e) for e in old_list] # x is mutable

val x = old_list.map { f(it) } # x is immutable
#

python has Sequence and MutableSequence but they're a lot more bolted on. People should really use Sequence and MutableSequence more but they often end up using List (or list as of whatever version) a lot more

#

and it's not the default in many places

#

Rust and C++ fwiw I wouldn't use as examples anyway; they take a very different approach but most importantly they don't default to shared references. So some of these issues don't even exist to start with.

halcyon trail
#

All you need to do is

#

type ReadJsonType = Mapping[str, ReadJsonType] | Sequence[ReadJsonType] | str | int | float | bool | None
At least - in principle

#

if you do this, then if you mutate data then the type checker will already complain for you

#

because ReadJsonType doesn't have any mutating operations at all
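
A small sketch of the idea, assuming Python 3.9+ for subscripting the `collections.abc` classes. The function name `total` and the sample data are made up for illustration. Nothing is enforced at runtime; the protection comes from a static type checker, which sees that `Mapping` and `Sequence` declare no mutating methods.

```python
from collections.abc import Mapping, MutableMapping, MutableSequence, Sequence


def total(scores: Mapping[str, Sequence[int]]) -> int:
    # Only reads from the argument; a type checker would flag e.g. scores["x"] = [].
    return sum(sum(values) for values in scores.values())


data = {"a": [1, 2], "b": (3,)}  # plain dicts, lists and tuples all qualify
result = total(data)

# The read-only ABCs simply do not declare the mutating operations:
mapping_can_set = hasattr(Mapping, "__setitem__")
mutable_mapping_can_set = hasattr(MutableMapping, "__setitem__")
sequence_can_append = hasattr(Sequence, "append")
mutable_sequence_can_append = hasattr(MutableSequence, "append")
print(result, mapping_can_set, sequence_can_append)
```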

#

But you can see even here, it's not your first instinct to use Sequence instead of List. In Kotlin, everyone uses List and MutableList, and people generally default to the former; List is read-only.

dusk comet
# raven ridge <https://github.com/ericsnowcurrently/interpreters> uses this interface, I belie...

It was pretty easy to install, but now i am facing some problems:

  • I have no idea what this means:
>>> i = interpreters.create()
Error in sitecustomize; set PYTHONVERBOSE for traceback:
ImportError: Could not find a console implementation for local python version
>>> i
Interpreter(id=1, isolated=True)

^ this happens at every subinterpreter creation (error is raised in subinterpreter, not in the main interpreter). I cant find where this error is raised. I can show you the output i got with sys.settrace(print).
PYTHONVERBOSE causes even more problems:

Python 3.12.0 (tags/v3.12.0:0fb18b0, Oct  2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import interpreters
>>> i = interpreters.create()
import _frozen_importlib # frozen
import _imp # builtin
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 1534, in _install
  File "<frozen importlib._bootstrap>", line 1523, in _setup
  File "<frozen importlib._bootstrap>", line 1489, in _builtin_from_name
  File "<frozen importlib._bootstrap>", line 942, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 496, in _verbose_message
RecursionError: maximum recursion depth exceeded while calling a Python object
### ^^^ this is in subinterpreter, vvv this is in the main one
ValueError: _PyImport_InitCore: failed to initialize importlib

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Programs\Python\3.12\Lib\site-packages\interpreters_3_12-0.0.1.1-py3.12-win-amd64.egg\interpreters.py", line 25, in create
    id = _interpreters.create(isolated=isolated)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: interpreter creation failed
>>> i
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'i' is not defined. Did you mean: 'id'?
>>>
#
  • ctypes doesnt work in subinterpreters:
>>> i.run('''
... try:
...   import ctypes
... except BaseException as exc:
...   print(repr(exc))
... ''')
ImportError('module _ctypes does not support loading in subinterpreters')
  • some weird things happen if subinterpreter raises exception:
>>> i.run('import ctypes')
RunFailedError: script raised an uncaught exception (unable to format exception type name)Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Programs\Python\3.12\Lib\site-packages\interpreters_3_12-0.0.1.1-py3.12-win-amd64.egg\interpreters.py", line 98, in run
    _interpreters.run_string(self._id, src_str, channels)
MemoryError
  • simple things seem to be working:
>>> i.run('print(1 + 2)')
3
cyan raven
feral island
cyan raven
#

it seems like the action helper is constructing the node which is returned by the rule.

cyan raven
grave jolt
#

!pypi pyrsistent

fallen slateBOT
grave jolt
#

Surprisingly, they're not that slow. Definitely better than defensive copies sprinkled everywhere

halcyon trail
#

generally speaking, functions especially should be able to just get by with arguments that are simply things like Mapping and Sequence. They don't really have much reason to care if the argument is immutable or not; they just want to read some data from their arguments, and they want to promise the caller that they're not going to mutate it.

#

for classes this does not work as well, unfortunately

grave jolt
#

I think immutability gets a bit easier if you align your program better with the "data in, data out" perspective 🙂

#

but that's not always a thing

halcyon trail
#

it's just not easy in python from a pure mechanical point of view

#

It's certainly doable but you have to work harder and swim more upstream

#

the built in data structures are used a lot as type annotations in libraries, and all the comprehensions are "built in" to the standard types

dusk comet
cyan raven
#

any ideas on what this syntactic sugar should be called?
I mean I cant use syntactic sugar as a node name, or any other object's name.

feral island
cyan raven
cyan raven
#
The shorthand argument names are automatically provided by Swift. The first argument can be referenced by $0, the second argument can be referenced by $1, and so on.
feral island
misty oxide
#

Does anybody know if CALL_FUNCTION_KW/CALL_KW are ever produced for syntax where the number of positional/keyword arguments cannot be statically analyzed?

cyan raven
#

hmm, which constructor/field represents the equal sign in the asdl file?
keyword = (identifier? arg, expr value)
I mean I can't really see: NAME, =, some expression here.

feral island
#

it doesn't need to be

cyan raven
feral island
#

correct

dim plank
#

i've been trying to modify Assign nodes of an ast of a python program to contain annotations, for that, I am replacing the Assign nodes with AnnAssign nodes (since they have annotations, currently ignoring multiple targets), the only problem is converting the modified ast back to source code, Im using astor.to_source and it probably does not support modification of Assign nodes to AnnAssign, am I using the wrong framework for conversion or is there something fundamentally wrong with my approach?

feral island
dim plank
#

I'm sorry, here is my code @feral island -

```py
def visit_Assign(self, node: AST) -> Any:
    # some filler code here
    # ..
    target = node.targets[0]
    if target.id not in self.assigned_vars:
        annAssign_node = ast.AnnAssign(
            target = target,
            annotation = annotation,
        )
        self.assigned_vars.add(target.id)
        print(f"\nNew node: {ast.dump(annAssign_node)}")
        ast.copy_location(annAssign_node, node)
        ast.fix_missing_locations(annAssign_node)
        return annAssign_node
```
#

and here is the main -

```py
def parse_and_assign_types(program):
    tree = ast.parse(program)
    pretty_print(tree)
    modified_tree = assign_types(tree)
    return modified_tree

if __name__ == "__main__":
    with open("example.py", "r") as f:
        program = f.read()
    tree = parse_and_assign_types(program)
    pretty_print(tree)
    print(to_source(tree))
```
#

in astor, the assert is happening in this function -

```py
def set_precedence(value, *nodes):
    """Set the precedence (of the parent) into the children.
    """
    if isinstance(value, AST):
        value = get_op_precedence(value)
    for node in nodes:
        # print(f"\n Setting precedence for {ast.dump(node)}")
        if isinstance(node, AST):
            node._pp = value
        elif isinstance(node, list):
            set_precedence(value, *node)
        else:
            assert node is None, node
```
#

and this happens while visiting AnnAssign which was originally ast.Assign node -

```
Visiting AnnAssign AnnAssign(target=Name(id='result', ctx=Store()), annotation='str')
```
feral island
#

Also, not sure what your use case is, but you may want to look into using libcst instead, which would allow you to transform code while preserving comments and other formatting

dim plank
#

sure, I'll have a look at libcst, thanks a lot

feral island
#

if you don't care about preserving comments or formatting working with the AST should be fine too

dim plank
feral island
#

make a correct AST, e.g. with annotation=ast.Name(id="str")
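
A hedged sketch of what a "correct AST" could look like here, assuming single-target assignments to plain names and Python 3.9+ so that the stdlib `ast.unparse` can stand in for `astor.to_source`. The class name `AnnotateAssign` and the `annotations` mapping are hypothetical. The points that matter are that `annotation` is an `ast.Name` node rather than a string, and that `value` and `simple` are filled in too.

```python
import ast


class AnnotateAssign(ast.NodeTransformer):
    """Turn the first `name = value` for each known name into `name: T = value`."""

    def __init__(self, annotations):
        self.annotations = annotations  # e.g. {"result": "str"}
        self.seen = set()

    def visit_Assign(self, node):
        target = node.targets[0]
        if (
            len(node.targets) != 1
            or not isinstance(target, ast.Name)
            or target.id in self.seen
            or target.id not in self.annotations
        ):
            return node  # leave anything unusual untouched
        self.seen.add(target.id)
        new_node = ast.AnnAssign(
            target=target,
            # The annotation must be an AST node, not the string 'str'.
            annotation=ast.Name(id=self.annotations[target.id], ctx=ast.Load()),
            value=node.value,
            simple=1,  # 1 means the target is a bare name
        )
        return ast.copy_location(new_node, node)


source = "result = 'ok'\ncount = 1\nresult = 'again'\n"
tree = AnnotateAssign({"result": "str", "count": "int"}).visit(ast.parse(source))
ast.fix_missing_locations(tree)
annotated = ast.unparse(tree)
print(annotated)
```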

dim plank
#

is this an issue with my python3 (3.10.12)?

feral island
dim plank
feral island
# dim plank u mean this? `ast.fix_missing_locations`

No I mean this
```
In [177]: arg=ast.arguments(x=3)

In [178]: arg.x
Out[178]: 3

In [179]: arg.posonlyargs

Traceback (most recent call last):
  File "/main_instance_shell/jelle/venv3.9/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3397, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-179-4c22e773849e>", line 1, in <cell line: 1>
    arg.posonlyargs
AttributeError: 'arguments' object has no attribute 'posonlyargs'

AttributeError Traceback (most recent call last)
Input In [179], in <cell line: 1>()
----> 1 arg.posonlyargs

AttributeError: 'arguments' object has no attribute 'posonlyargs'
```

dim plank
feral island
#

yeah, pass in all the arguments when you construct the object

dim plank
#

ok makes sense

feral island
#

like ast.arguments(posonlyargs=[]) and all the others

dim plank
#

sure, that was really helpful, thanks a lot

rose schooner
misty oxide
nova hill
#

❤️BTS❤️✯¸.•´¨*•✿ ✿•*¨`•.¸✯미적인. [FF00FF]❤️

nova hill
#

??

static hinge
#

!ot

fallen slateBOT
peak spoke
#

How's all the speedup/parallelization of cpython going right now? First it was just subinterpreters, then faster cpython (if I'm remembering the timeline correctly) and now there's also GIL-less. Are they all getting implemented on their own? GIL-less sounds like it'd make a good mess in what faster cpython is doing, and looking at some pyperformance stats the new 3.12 has a decent portion of tests where it's slower than 3.11

sly crystal
#

just curious, does anyone know what the time complexity of using [::-1] to invert a list in python is? Would believe it's O(n) but I have no idea how python handles arrays internally.

dusk comet
#

it is O(N) indeed
it does roughly this:

  • allocates new list of size N
  • iterates over old list and copies pointers to new list
    memory allocation is O(N) operation and iteration through entire list is also O(N)
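
A short sketch comparing the three common ways to reverse, using only built-ins. The slice builds a new list, `reversed()` returns a lazy iterator over the existing list, and `list.reverse()` works in place. All three are linear in time if the whole list is consumed; they differ in whether a second list is allocated.

```python
items = [1, 2, 3, 4]

copy_reversed = items[::-1]   # new list: O(n) time and O(n) extra memory
lazy = reversed(items)        # iterator over the same list, no copy made
first_from_end = next(lazy)   # pulls one element without building anything

in_place = list(items)        # copy made here only to keep `items` intact
in_place.reverse()            # reverses in place, no new list allocated
print(copy_reversed, first_from_end, in_place)
```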
sly crystal
#

so that is rather inefficient and I should rewrite my generator to return an ordered tuple?

dusk comet
#

idk what you mean by "ordered tuple"

#

and linear time complexity doesn't mean that your code is slow

#

if you have some performance issues - use a profiler to figure out the bottleneck, and then do optimizations

sly crystal
# dusk comet and linear time complexity doesn't mean that your code is slow
def split_digits(n:int, base=10):
    """Generator to split number into digits

    Parameters:
    --------------
    n(int):                             integer to split into digits
    base(int)=10                        base to split integer to

    Yields:
    --------------
    int:                                the digits of n, least significant first

    #! the digits are yielded in reverse order (least significant digit first) !
    """
    if n == 0:
        yield 0
    while n:
        n, d = divmod(n, base)
        yield d

Currently I'm using this code to get an integer split into a tuple of digits, however the tuple is in reverse order and I'm reversing it using [::-1].
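
One possible way to package this, as a sketch only: keep the generator as it is and wrap it in a helper that returns the digits most significant first. `digits_msb_first` is a hypothetical name. The reversal is still linear in the number of digits, which is tiny compared with the `divmod` work for any realistic integer.

```python
def split_digits(n: int, base: int = 10):
    # Same generator as above: yields digits least significant first.
    if n == 0:
        yield 0
    while n:
        n, d = divmod(n, base)
        yield d


def digits_msb_first(n: int, base: int = 10) -> tuple:
    # Collect the digits once, then reverse them once.
    return tuple(reversed(tuple(split_digits(n, base))))


print(digits_msb_first(1234))
print(digits_msb_first(255, base=16))
```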

rose schooner
sly crystal
dusk comet
dusk comet
sly crystal
dusk comet
sly crystal
sly crystal
dusk comet
#

is there a way to get a list of all dunder-names that have corresponding slot in type object?

feral island
sly crystal
#

Yet another question about the python internals: Is it faster to use enumerate or to use a for loop and then just get the current element using i as the index ?

unkempt rock
#

I've done some testing. (Edit: I originally wrote that there is almost no difference, but my testing is dumb.)

#
```py
# Script 1
iterable = [i for i in range(10_000_000)]
target = [0 for i in range(10_000_000)]

def test():
    for idx, i in enumerate(iterable):
        target[idx] = idx + i

    return target
```
```py
# Script 2
iterable = [i for i in range(10_000_000)]
target = [0 for i in range(10_000_000)]

def test():
    for i in range(len(iterable)):
        target[i] = i + iterable[i]

    return target
```
```py
# Timing script
import timeit
print("Timing with enumerate:")
print(timeit.timeit("to_profile.test()", setup="import to_profile", number=10))
print("Timing with range(len(...)):")
print(timeit.timeit("to_profile2.test()", setup="import to_profile2", number=10))
```

range(len(...)) appears to be quite a bit faster (see below)

# Output:
Timing with enumerate:
5.525693699994008
Timing with range(len(...)):
5.302737100006198
grave jolt
#

I have similar results

#

actually that's quite a bit of difference

unkempt rock
#

About 4% faster

#

Although the difference you got with your bench is larger

#

because you do less in the loop

grave jolt
#

yep

unkempt rock
#

I was a bit wary with optimizations there, but I guess Python doesn't really optimize this away

grave jolt
#

Yeah, there isn't much to optimize here while preserving behaviour

unkempt rock
#

Okay, now it's way faster
```
Timing with enumerate:
1.8064987999969162
Timing with range(len(...)):
1.1512878000066848
```

#

Doing it your way

#

That's dumb, I don't like this result 😦

#

Luckily both are fast enough to not make a huge difference most of the time

#

In 99% of programs, you could probably just replace the enumerate with this kind of range-len loop and get a performance boost

#

Is there a reason Python doesn't have this kind of optimization? I can't really think of a case that would change behaviour, if the pattern matching is right

#

Or does Python also aim to preserve behaviour of internal state

feral island
#

however, there might be an opportunity to optimize how enumerate() is executed

unkempt rock
#

What if we restricted it to those cases?

#

Hm, I guess then it could fail, depending on how the object is implemented

#

But for inbuilt types, maybe
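
A small sketch of one reason the two loops are not interchangeable in general: `enumerate()` accepts any iterable, while the `range(len(...))` form needs both `len()` and indexing. The generator `squares` is a made-up example.

```python
def squares():
    # A generator: iterable, but it supports neither len() nor indexing.
    for n in range(3):
        yield n * n


with_enumerate = [(i, value) for i, value in enumerate(squares())]

try:
    gen = squares()
    with_range_len = [(i, gen[i]) for i in range(len(gen))]
except TypeError as exc:
    with_range_len = None
    error = str(exc)

print(with_enumerate)
print(error)
```

So an automatic rewrite would at least have to prove that the object is a real `list` (or similar) and that `enumerate`, `range` and `len` have not been rebound, which is hard to know at compile time.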

fringe acorn
#

Good afternoon friends, could anyone help me validate some unit tests?

cyan raven
loud delta
#

I found a developed but un-documented feature in the PEG grammar described in pep 617. Would that warrant an update to the pep?

steel solstice
#

what is it?

loud delta
#

'&&'

#

I can send a mock update of 617 that I made

#

If needed

#

I was looking at the whole peg grammar to try to do something funny and realized I didn't know what '&&' did and that it wasn't in pep 617

cyan raven
#

That means "if it can be parsed".

loud delta
#

& is "Succeed if e can be parsed, without consuming any input." in pep 617
&& is nowhere in pep 617, and should probably be documented somewhere.

#

somewhere besides the depths of git blame

#

Want me to refind the commit that added it?

feral island
#

that's probably where it should be documented, the PEP is a historical document at this point

loud delta
feral island
loud delta
#

thx!

olive marsh
#

Hi , what is cstack?
Python is a C binary, and when we run the python binary, just like any other C binary, it has been compiled into machine instructions which run on a stack called the cstack. The cstack is managed by the OS: the CPython process asks the OS for a stack to execute its instructions on. Am I correct? Or is it a CPython-internal thing managed by CPython?

#

But, I saw in the code base something like frame owned by cstack then I got confused, is my above reasoning correct?

flat gazelle
#

As I understand the term, yeah, it's the call stack C runs on, in contrast to the python call stack.

#

it's provided by default to any process.

cyan raven
#

have you found anything?

rose schooner
#

it produces the SyntaxError: expected '<char>' message

#

e.g.
```pycon
>>> def a()
  File "<stdin>", line 1
    def a()
           ^
SyntaxError: expected ':'
```
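
A toy illustration of the difference between an ordinary match, a `&` lookahead and a `&&` forced token, as described in this thread. This is not pegen: `TinyParser` and `parse_def_header` are hypothetical names, and the token list is a simplification. The only behaviour it aims to show is that a forced token raises `SyntaxError` straight away instead of letting the parser backtrack.

```python
class TinyParser:
    """Toy token-list parser, only meant to contrast the three behaviours."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def expect(self, text):
        # Ordinary match: consume on success, return None on failure so the
        # caller can backtrack and try another alternative.
        if self.pos < len(self.tokens) and self.tokens[self.pos] == text:
            self.pos += 1
            return text
        return None

    def lookahead(self, text):
        # Like `&e`: succeed if the next token matches, without consuming it.
        return self.pos < len(self.tokens) and self.tokens[self.pos] == text

    def expect_forced(self, text):
        # Like `&&e`: no backtracking, a failed match is an immediate error.
        if self.expect(text) is None:
            raise SyntaxError(f"expected {text!r}")
        return text


def parse_def_header(tokens):
    p = TinyParser(tokens)
    if p.expect("def") and p.expect("a") and p.expect("(") and p.expect(")"):
        p.expect_forced(":")  # past this point, a missing ':' cannot be anything else
        return True
    return False


print(parse_def_header(["def", "a", "(", ")", ":"]))
try:
    parse_def_header(["def", "a", "(", ")"])
except SyntaxError as exc:
    message = str(exc)
    print(message)
```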

cyan raven
rose schooner
#

i've no clue about updating documentation though

fallen slateBOT
#

Parser/parser.c line 4518

```c
(_literal_2 = _PyPegen_expect_forced_token(p, 11, ":"))  // forced_token=':'
```
`Parser/pegen.c` lines 396 to 397
```c
Token *
_PyPegen_expect_forced_token(Parser *p, int type, const char* expected) {
```
cyan raven
cyan raven
loud delta
cyan raven
loud delta
cyan raven
# cyan raven isnt there something about it in the pegen generator as well?
    def visit_Forced(self, node: Forced) -> FunctionCall:
        call = self.generate_call(node.node)
        if call.nodetype == NodeTypes.GENERIC_TOKEN:
            val = ast.literal_eval(node.node.value)
            assert val in self.exact_tokens, f"{node.value} is not a known literal"
            type = self.exact_tokens[val]
            return FunctionCall(
                assigned_variable="_literal",
                function=f"_PyPegen_expect_forced_token",
                arguments=["p", type, f'"{val}"'],
                nodetype=NodeTypes.GENERIC_TOKEN,
                return_type="Token *",
                comment=f"forced_token='{val}'",
            )
        else:
            raise NotImplementedError(
                    f"Forced tokens don't work with {call.nodetype} tokens")
cyan raven
fallen slateBOT
#

Python/symtable.c line 1600

```c
symtable_visit_stmt(struct symtable *st, stmt_ty s)
```
feral island
cyan raven
feral island
#

it's not a statement

#

I think it gets called through some macro

cyan raven
cyan raven
#

I think I need to do the same.

feral island
#

VISIT_SEQ(st, type_param, s->v.FunctionDef.type_params);

cyan raven
#

So I need to expose it in Call_kind or should I just handle my node?

feral island
#

haven't looked exactly, but in general, the smaller change to make is probably the right one

cyan raven
#

is there any documentation on how symtable_add_def works?
I mean, I can go through the source code but, still.....

dusk comet
#

if it is a private thing, there is unlikely to be any documentation

#

the only chance is the developer guide

marsh cosmos
#

hi

cyan raven
fallen slateBOT
#

Python/symtable.c line 1357

```c
symtable_add_def_helper(struct symtable *st, PyObject *name, int flag, struct _symtable_entry *ste,
```
cyan raven
fallen slateBOT
#

Python/symtable.c lines 1315 to 1325

```c
prev = st->st_cur;
/* bpo-37757: For now, disallow *all* assignment expressions in the
 * outermost iterator expression of a comprehension, even those inside
 * a nested comprehension or a lambda expression.
 */
if (prev) {
    ste->ste_comp_iter_expr = prev->ste_comp_iter_expr;
}
/* The entry is owned by the stack. Borrow it for st_cur. */
Py_DECREF(ste);
st->st_cur = ste;
```
feral island
cyan raven
feral island
#

like how a = 3 defines the name a

cyan raven
feral island
#

it records the definition of the name in some internal data structure

feral island
#

the calling code there essentially gives up its reference to st_cur, so it has to decref its reference

cyan raven
feral island
#

it generally does, but the way reference counts are handled can be very specific to individual function calls

cyan raven
#

basically, what I have to do here is set varkeywords to 1?

feral island
#

probably all you need to do is make your AST node have a child node of type ast.Name with kind ast.Load

cyan raven
feral island
#

then visit_expr will take care of putting it in the symtable

cyan raven
#

this struct might be wrong.

struct _shorthand_keyword_arg {
    identifier arg;
    int lineno;
    int col_offset;
    int end_lineno;
    int end_col_offset;
};

here, the arg is the name, right?

feral island
#

there are multiple ways to do it

#

what you have could work too, you just need to call symtable_add_def yourself in this case with that name

#

something like symtable_add_def(st, something.arg, USE, LOCATION(e))

#

actually that's probably better than putting an extra layer of Name node inside
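
For a feel of what ends up recorded, the stdlib `symtable` module exposes the result of this pass from Python. A minimal sketch, assuming only that module: in `f(x=x)` the value `x` is recorded as a use (referenced, not assigned), which is what the `USE` flag in the suggestion above corresponds to, while `a = 3` records a definition.

```python
import symtable

table = symtable.symtable("a = 3\nf(x=x)\n", "<example>", "exec")

a = table.lookup("a")
x = table.lookup("x")

a_assigned = a.is_assigned()      # `a = 3` defines the name a
a_referenced = a.is_referenced()  # a is never read in this snippet
x_assigned = x.is_assigned()      # x is only read, as the keyword's value
x_referenced = x.is_referenced()
print(a_assigned, a_referenced, x_assigned, x_referenced)
```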

cyan raven
#

My current approach is having a new field in Call_kind

Call(func=Name(id='foo', ctx=Load()), args=[Constant(value=1), Constant(value=2)], keywords=[], shorthand_keyword_args=[shorthand_keyword_arg(...)])

Then I just visit it: VISIT_SEQ(st, shorthand_keyword_arg, e->v.Call.shorthand_keyword_args);
and then record the name, enter the block, exit from the block.

#

if im not mistaken the block such as TypeVarBoundBlock is for only debugging purposes.

 case TypeVarBoundBlock: blocktype = "TypeVarBoundBlock"; break;
#
typedef enum _block_type {
    FunctionBlock, ClassBlock, ModuleBlock,
    AnnotationBlock,
    TypeVarBoundBlock, TypeAliasBlock, TypeParamBlock
} _Py_block_ty;

I'm not sure if I should extend it or not.

feral island
#

that's for new kinds of scopes

cyan raven
feral island
#

pass in where?

cyan raven
feral island
#

why are you entering a new block

#

you shouldn't be

cyan raven
#

should I only add it to the symtable?

feral island
#

yes

#

a block is a scope

#

your thing doesn't create a new scope

cyan raven
feral island
#

the current one

slender glacier
#

umm can u guys help me when your done?

slender glacier
#

sorry

cyan raven
#
static int
symtable_visit_shorthand_keyword_arg(struct symtable *st, shorthand_keyword_arg_ty skwa)
{
    if (++st->recursion_depth > st->recursion_limit) {
        PyErr_SetString(PyExc_RecursionError,
                        "maximum recursion depth exceeded during compilation");
        VISIT_QUIT(st, 0);
    }

    if (!symtable_add_def(st, skwa->arg, USE, LOCATION(skwa)))
        VISIT_QUIT(st, 0); // return decreased recursion depth and 0
        
    VISIT_QUIT(st, 1);
}
#

this is what I have.

feral island
#

makes sense, I don't think you need the recursion check here

cyan raven
feral island
#

haven't looked too hard but I feel we only need to check that depth when adding a new block

drifting inlet
#

I hope I'm not breaking any rules by asking this, but it seems like this is the most recently active channel. Can someone please help me with my assignment lol... it's in python-help as "Harvard CS50 Assignment Help"

cyan raven
drifting inlet
#

Omg, I didn't even see that channel, I actually feel stupid

#

My apologies

cyan raven
#

wait I might be able to use this.

    return compiler_call_helper(c, loc, 0,
                                e->v.Call.args,
                                e->v.Call.keywords);
#

not sure why I have to expose a custom node if I can just access it through v.

feral island
cyan raven
feral island
#

sure

cyan raven
feral island
#

I wouldn't fret too much about whether or not to create a new function. You need code that works, and whether or not that requires a new function is something you decide as you look at the code

cyan raven
# feral island I wouldn't fret too much about whether or not to create a new function. You need...

the difference between keyword and shorthand_keyword_arg is that the latter doesn't have a value

struct _keyword {
    identifier arg;
    expr_ty value;
    int lineno;
    int col_offset;
    int end_lineno;
    int end_col_offset;
};

struct _shorthand_keyword_arg {
    identifier arg;
    int lineno;
    int col_offset;
    int end_lineno;
    int end_col_offset;
};

however, in the compiler process, I can't see that value would be used.
it's only using CALL_KW.

feral island
#

well it needs to compile the value at some point. for your case it just needs to put a LOAD_NAME there instead

cyan raven
#

because I have to evaluate the value somehow from the name.

feral island
#

to the compiler your shorthand_keyword("X") should be equivalent to keyword("X", Name(X))

#

and the Name(X) part compiles to a LOAD_NAME

#

(or LOAD_FAST, or various other options)

cyan raven
feral island
#

yes

cyan raven
#

ADDOP_I(c, loc, CALL_KW, n + nelts + nkwa);

feral island
#

I think that represents a series of kwargs at once? Might need some work to figure out how exactly that works

#

Honestly it might be easier to desugar the shorthand kwargs into normal kwargs before the compiler

#

not sure there's a good place for that though, maybe the ast optimizer

cyan raven
#

read the rules please

#

and delete your post.

feral island
cyan raven
feral island
#

call compiler_nameop

cyan raven
feral island
#

that makes sense, but you will need to call compiler_nameop to load the name

cyan raven
#

didn't you mention LOAD_NAME or smt similar?

feral island
#

it doesn't, it just emits the right instruction

cyan raven
feral island
#
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (f)
              4 PUSH_NULL
              6 LOAD_NAME                1 (x)
              8 LOAD_CONST               0 (('x',))
             10 CALL_KW                  1
             12 RETURN_VALUE
#

your job here is to make f(x=) compile to this same set of instructions
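
The target bytecode can be inspected from Python with the stdlib `dis` module. A sketch under one caveat: opcode names differ between CPython versions (the listing above shows `CALL_KW`; other versions use different call opcodes), so the checks below only rely on things that should hold across versions, namely a `LOAD_NAME` for `x` and the keyword-names tuple `('x',)` among the constants.

```python
import dis

code = compile("f(x=x)", "<example>", "eval")
instructions = list(dis.get_instructions(code))

# The keyword's value is loaded like any other name...
loads_x = any(i.opname == "LOAD_NAME" and i.argval == "x" for i in instructions)
# ...and the keyword names travel as a constant tuple.
kw_names_const = ("x",) in code.co_consts

dis.dis(code)  # prints the full listing for the running interpreter
print(loads_x, kw_names_const)
```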

cyan raven
cyan raven
feral island
#

yeah, you'd call compiler_nameop() and that would emit the LOAD_NAME

#

for normal keyword args, you'll probably see somewhere that it calls some function to visit the value of the keyword

#

in your case, you instead should call compiler_nameop() there directly with the name of the kwarg

cyan raven
feral island
feral island
#

that happens in compiler_call_simple_kw_helper

cyan raven
feral island
#

Do you feel you understand what bytecode means and how it works?

cyan raven
#

oh basically, the value is emitted by visiting the expression again.


static int
compiler_visit_keyword(struct compiler *c, keyword_ty k)
{
    VISIT(c, expr, k->value);
    return SUCCESS;
}

Which is going to end up in expr1.

#

I mean that is where it's using k->value

#

@feral island Thank you very much for helping me out, I'll continue tomorrow. 😄

dusk comet
#

isn't it simpler to make parser "emit" x=x ast node instead of emitting x= ast node and then handling it in special way further?

cyan raven
feral island
cyan raven
#

wondering what the grammar would look like if it doesn't return a specific ast node.

feral island
#

With this suggestion it would simply parse x= the same as x=x. There would be no change to the ASDL at all
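
The "parse x= the same as x=x" idea can be imitated outside CPython with a token-level rewrite. This is purely an illustration and not how the grammar change would be implemented: `desugar_shorthand` is a hypothetical helper, and it only handles the simple case where `=` is directly followed by `,` or `)` on the same line.

```python
import ast
import io
import tokenize

def desugar_shorthand(source: str) -> str:
    """Rewrite `name=` before `,` or `)` into `name=name` (illustration only)."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    out = []
    for i, tok in enumerate(tokens):
        out.append((tok.type, tok.string))
        prev = tokens[i - 1] if i > 0 else None
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if (
            tok.type == tokenize.OP and tok.string == "="
            and prev is not None and prev.type == tokenize.NAME
            and nxt is not None and nxt.type == tokenize.OP
            and nxt.string in {",", ")"}
        ):
            # Repeat the keyword's own name as its value.
            out.append((tokenize.NAME, prev.string))
    # Two-element tuples put untokenize in its position-free mode, so the
    # spacing of the result is odd but the code is equivalent.
    return tokenize.untokenize(out)

rewritten = desugar_shorthand("f(a=, b=)\n")
```

After the rewrite the ordinary parser sees a normal call, so the resulting AST is the same as for `f(a=a, b=b)`.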

cyan raven
#

it sounds way easier.

dusk comet
rose schooner
#

so not a very big change really

#

making y optional in x=y would be somewhat less easy, but it does show up in the AST

#

the needed files to change (excluding the auto-generated ones and docs) given the approach are as follows
x=x from x=:
Grammar/python.gram
y is optional in x=y:
Grammar/python.gram
Parser/Python.asdl
Python/ast.c
Python/ast_opt.c
Python/symtable.c
Python/compile.c
despite the number of files in the second approach the only "big" change is in Python/compile.c

cyan raven
cyan raven
#

What I have so far is this.
.gram

# shorthand keyword arguments
#--------------------------

shorthand_keyword_args[asdl_shorthand_keyword_arg_seq*]: t=shorthand_keyword_seq {
    CHECK_VERSION(asdl_shorthand_keyword_arg_seq *, 13, "Shorthand keyword arguments are", t)
    }

shorthand_keyword_seq[asdl_shorthand_keyword_arg_seq*]: a[asdl_shorthand_keyword_arg_seq*]=','.shorthand_keyword_arg+ [','] { a }

shorthand_keyword_arg[shorthand_keyword_arg_ty]:
    | a=NAME '=' { _PyAST_shorthand_keyword_arg(a->v.Name.id) }

.asdl

| Call(expr func, expr* args, keyword* keywords, shorthand_keyword_arg* shorthand_keyword_args)
...
    -- shorthand keyword arguments supplied to call
    shorthand_keyword_arg = (identifier? arg)

                            attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

.symtable


    case Call_kind:
        ...
        VISIT_SEQ(st, shorthand_keyword_arg, e->v.Call.shorthand_keyword_args);

...

static int
symtable_visit_shorthand_keyword_arg(struct symtable *st, shorthand_keyword_arg_ty skwa)
{

    if (!symtable_add_def(st, skwa->arg, USE, LOCATION(skwa)))
        VISIT_QUIT(st, 0); // return decreased recursion depth and 0

    VISIT_QUIT(st, 1);
}

and now I have to emit the same bytecodes as a keyword in compiler_call_helper
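
The symbol-table step above registers a use of the name. The same effect is visible for the long form through the public `symtable` module. A small sketch with hypothetical names `f` and `x`:

```python
import symtable

# In f(x=x) the keyword *name* is only a label; the keyword *value* is a
# real variable reference, so the symbol table records a use of `x`.
table = symtable.symtable("f(x=x)", "<demo>", "exec")
x_is_referenced = table.lookup("x").is_referenced()

# With a constant value there is no variable named `x` in the table at all.
names_without_use = {
    sym.get_name()
    for sym in symtable.symtable("f(x=1)", "<demo>", "exec").get_symbols()
}
```

This is why a dedicated shorthand node needs its own symtable visitor: without it, nothing would record that the keyword name is also being read as a variable.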

cyan raven
#

what do you guys think would be the best way of implementing this?

cyan raven
#

not sure if I made a good decision when I decided to extend the Call node.
now, I have to go through the grammar and pass in shorthand keyword arguments everywhere.

#

At this point, I'm not even sure what I'm doing.

cyan raven
#

well, it works.

cyan raven
# cyan raven well, it works.
  • so I created 2 branches. The first doesn't expose a new AST node at all: it only changes the grammar, calling some action helpers so that the shorthand is handled as a regular keyword argument from the start (see above).

  • The second adds a new AST node and a new grammar rule that returns it, then registers it in the symbol table. In the compiler, I pass the new node to the function that emits the bytecode for keyword arguments, so a shorthand keyword argument produces the same instructions as a regular one (the two should behave identically). I haven't finished this one yet.

#

what do you think, which one should I go with?

dusk comet
#

first one sounds way easier than second one

#

it is easier to implement, maintain and understand

#

(i guess)

cyan raven
radiant garden
cyan raven
radiant garden
#

every piece of syntactic sugar needs a method of distinguishing from other code after parsing for user purposes 😄
look around the ast module for a while and you'll find that every tiny detail is exposed there

#

Details like the variable annotation (x): int being distinct from x: int
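
The annotation example can be seen directly in the `ast` module: both forms parse to `AnnAssign`, and the `simple` field records whether the target was a bare name.

```python
import ast

plain = ast.parse("x: int").body[0]
parenthesized = ast.parse("(x): int").body[0]

# Both are AnnAssign nodes targeting Name('x'); only `simple` differs.
flags = (plain.simple, parenthesized.simple)
```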

steel solstice
#

as an aside, it doesn't expose quotes though, does it? how's that different syntactically?

radiant garden
#

yeah not quote type for some reason

#

all are Constants

#

only indirectly by span I guess?
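
Recovering the quote style "by span" could look like the following sketch, which slices the original source using the node positions:

```python
import ast

source = "a = 'hi'\nb = \"hi\"\n"
tree = ast.parse(source)
first, second = (stmt.value for stmt in tree.body)

# The Constant nodes carry the same value and no quote information...
same_value = first.value == second.value

# ...but their line/column span lets a tool slice the original text.
segments = [ast.get_source_segment(source, node) for node in (first, second)]
```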

cyan raven
#

I would have achieved the same thing if I were to implement it in the compiler. If we handle this feature as a keyword argument from the beginning, everything works just fine. Again, what I'd do in the compiler is emit the same bytecode instructions, so a "shorthand keyword argument" (this new thingy) would be basically the same thing as a regular keyword argument. Why not just do it that way in the first place?

#

So it behaves as a keyword argument during the whole process.

rose schooner
#

huh

rose schooner
radiant garden
#

for proper support yeah

cyan raven
radiant garden
#

because a new syntax feature is more than just making the compiler parse it right lol

cyan raven
rose schooner
radiant garden
#

yeah, making the second param optional

#

that's what I'm saying is appropriate

cyan raven
cyan raven
rose schooner
#

that is, it's only helpful when there's been no changes yet

rose schooner
# rose schooner that is, it's only helpful when there's been no changes yet

here's the "optional value" (y is optional in x=y) approach which reflects in the AST
```pycon
>>> from ast import dump, parse
>>> from dis import dis
>>> print(dump(parse("a(b=)", mode='eval'), indent=4))
Expression(
    body=Call(
        func=Name(id='a', ctx=Load()),
        args=[],
        keywords=[
            keyword(arg='b')]))
>>> dis("a(b=)")
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (a)
              4 PUSH_NULL
              6 LOAD_NAME                1 (b)
              8 KW_NAMES                 0 (('b',))
             10 CALL                     1
             18 RETURN_VALUE
```

#

and here's the "substituted value" (x=x from x=) approach which doesn't reflect in the AST (that is, it's the same as if x=x is written instead of x=)
```pycon
>>> from ast import dump, parse
>>> from dis import dis
>>> print(dump(parse("a(b=)", mode='eval'), indent=4))
Expression(
    body=Call(
        func=Name(id='a', ctx=Load()),
        args=[],
        keywords=[
            keyword(
                arg='b',
                value=Name(id='b', ctx=Load()))]))
>>> dis("a(b=)")
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (a)
              4 PUSH_NULL
              6 LOAD_NAME                1 (b)
              8 KW_NAMES                 0 (('b',))
             10 CALL                     1
             18 RETURN_VALUE
```

grave jolt
#

wait, wrong reply. but the same person

cyan raven
grave jolt
cyan raven
radiant garden
#

that's not really an appropriate solution for the problem

cyan raven
radiant garden
#

Good beyond a proof of concept

#

Sorry if I've been dismissive of your efforts

cyan raven
#

based on the current version, this is what the ast looks like.

def func(a, b):
    pass
    
a = 12
b = 12

func(a=, b=)
Module(body=[FunctionDef(name='func', args=arguments(posonlyargs=[], args=[arg(arg='a'), arg(arg='b')], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[Pass()], decorator_list=[], type_params=[]), Assign(targets=[Name(id='a', ctx=Store())], value=Constant(value=12)), Assign(targets=[Name(id='b', ctx=Store())], value=Constant(value=12)), Expr(value=Call(func=Name(id='func', ctx=Load()), args=[], keywords=[keyword(arg='a', value=Name(id='a', ctx=Load())), keyword(arg='b', value=Name(id='b', ctx=Load()))]))], type_ignores=[])

Name(...) looks interesting ngl.

#

a=, keyword(arg='a', value=Name(id='a', ctx=Load()))

#

(this is how it's represented in the ast)

cyan raven
rose schooner
#

if you want to i can send the github diff

cyan raven
rose schooner
cyan raven
rose schooner
rose schooner
cyan raven
rose schooner
rose schooner
cyan raven
# rose schooner yep

I thought it's represented in the ast as shorthand_keyword_arg or something similar.

rose schooner
cyan raven
#

and in the ast, you are making the expression optional rather than creating a new constructor(node).

rose schooner
#

yep

cyan raven
# rose schooner yep

then I don't understand where you are coming from.
I'm not sure what "reflects" in the ast means. Based on your understanding, this one is reflected in the ast:

 keyword(arg='b')]))

but this one is not

value=Name(id='b', ctx=Load()))]))
rose schooner
cyan raven
rose schooner
cyan raven
dusk comet
#

mypy doesn't care about the exact ast, it doesn't need that information; from mypy's perspective x=x and x= behave in exactly the same way

black, on the other hand, needs this information to correctly reformat your code (but they use their own ast implementation, i think)

pylint and flake8 need this information to issue warnings like "confusing name of shorthand argument" or something like that, which could happen only with shorthand args

cyan raven
#

@rose schooner in your diff, are the shorthand keyword args exposing the same bytecodes as a keyword argument?

#

I just realised that in the compiler you are adding stuff to compiler_subkwargs. What does that even mean?

rose schooner
cyan raven
rose schooner
#

e.g. a(*b, c=x+2, d=d, **e, f=f)
c, d, and f are the keyword arguments handled by compiler_subkwargs
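
On the AST side, the example call above looks like this. A short sketch that also picks out the keywords a shorthand form could abbreviate (value is a plain name equal to the keyword):

```python
import ast

call = ast.parse("a(*b, c=x+2, d=d, **e, f=f)", mode="eval").body

# `**e` appears in `keywords` too, with arg=None; the named entries are
# the keyword arguments the message above refers to.
keyword_names = [kw.arg for kw in call.keywords]

# Keywords whose value is just a Name with the same identifier, i.e. the
# ones a shorthand form could abbreviate.
redundant = [
    kw.arg for kw in call.keywords
    if isinstance(kw.value, ast.Name) and kw.value.id == kw.arg
]
```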

radiant garden
#

so like

#

f(x=) can have two different errors, one where x is not defined and another where f doesn't have an x keyword param

#

pyright for example gives these two

#

and it's important to make sure they have the right span information (i know pyright rolls its own thing but that is what I have at hand)

#

It would also be preferable if ast.unparse(ast.parse(src)) was a no-op (or as close to a no-op as possible) if at all doable, though that's a minor thing
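
The round trip is already lossy for formatting details today, which is what "as close to a no-op as possible" is about. A minimal sketch (`roundtrip` is a hypothetical helper name):

```python
import ast

def roundtrip(src: str) -> str:
    """Parse and unparse; formatting not stored in the AST is lost."""
    return ast.unparse(ast.parse(src))

normalized = roundtrip('f( x = "a" )')  # spacing and quote style are not kept
stable = roundtrip("f(x=x)")            # already in canonical form
```

With the "substituted value" approach, `f(x=)` would presumably come back as `f(x=x)`, since the AST would no longer know the shorthand was used.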

cyan raven
cyan raven
feral island
cyan raven
feral island
#

CALL always visits the thing without a NULL check

#

now that field is nullable, so we need CALL_OPT instead

cyan raven
feral island
#

yes

cyan raven
#

I don't have to extend ast.py nor tokenizer.py since nothing has changed related to these parts, right?

feral island
#

probably. ast.py might need some small changes for the unparser

cyan raven
#

Should I create a new test file for shorthand keywords or extend the current one test_keywordonlyarg.py? what do you reckon?

feral island
cyan raven
#

well, not sure if we need more complicated test cases 😄

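    def testShorthandKeywordArgs(self):
        def foo(a, b):
            return a + b

        a = 12
        b = 24

        # A SyntaxError would surface when this file is compiled, not at
        # call time, so a try/except here could never catch it; assert on
        # the result instead.
        self.assertEqual(foo(a=, b=), 36)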
feral island
cyan raven
feral island
#

more tests is generally better than fewer

cyan raven
#

I suppose I also need to extend CPython's language reference with this new feature?

cyan raven
#

well, I think the glossary change is enough.

feral island
#

definitely needs a change to the language reference too

cyan raven
feral island
#

at least the grammar there, for keyword_item

#

and the language reference should also describe how the shorthand syntax works

feral island
#

I don't know, that section is pretty long

dusk comet
#

why do you care about documentation?
i thought you are doing this only as a small experiment in your own fork of cpython

cyan raven
feral island
#

But if this feature is accepted, I suppose it will be fairly common, so probably good to cover in the tutorial

dusk comet
dusk comet
cyan raven
cyan raven
steel solstice
#

I wouldn't worry about the docs too much tbh

#

I think having a solid implementation is more important in the first stages

cyan raven
#

Would it be a good idea to write more detailed explanations of certain CPython topics (in the developer guide)? For example, the symbol table could be explained in more depth and serve as a guide, so people can understand how to use it when needed.

feral island
cyan raven
spark magnet
quartz pawn
#

Hey folks, do you have a good intro to the CPython implementation? A series of blog posts or something? Just diving into the repo seems rather overwhelming

dusk comet
#

there is a developer guide
you also can find pycon talks about cpython internals

quartz pawn
#

Ah nice, I found a pycon talk by Sebastiaan Zeeff, I'll start there, then work through the developers guide. Thank you!

rocky ravine
#

Are there any tools available that might show how long a thread has held on to the GIL? I've been working with async and I use the debug messages that say a task took too long; I was hoping for something similar to that but for the GIL.

cyan raven
#

do I need to add a new TypeObject to the module in the C-API?

typedef struct {
    PyObject_HEAD
} MyObject;

static PyTypeObject MyObject_Type = {
        PyVarObject_HEAD_INIT(NULL, 0)
        .tp_name = "fputs.MyObject",
        .tp_basicsize = sizeof(MyObject),
        .tp_doc = PyDoc_STR("MyObject objects"),
};

static struct PyModuleDef fputsmodule = {
        PyModuleDef_HEAD_INIT,
        "fputs",
        "Python interface for the fputs C library function2",
        -1,
        FputsMethods
};
feral island
cyan raven
#

this is not related to shorthand keyword arguments. I'm just testing out a few things on my own.

#

but now you mention it, should I add shorthand keyword arguments to the C API?
or how would that work?

feral island
raven ridge
raven ridge
signal otter
#

can someone plz help me with ai stuff, its for home work?

craggy flame
willow cliff
#

hello

wet laurel
cyan raven
#

I always wondered where these "private" libraries are coming from.
e.g _symtable
anyone knows?

fallen slateBOT
#

Modules/symtablemodule.c line 125

```c
PyInit__symtable(void)
```
steel solstice
#

generally just search for the name in ""s

cyan raven
cyan raven
#

is this part still relevant to Python's grammar?
https://docs.python.org/3/reference/introduction.html#notation
I can't find those notations in the grammar file.

feral island
dapper lily
#

This is not on-topic for this channel. Please see the #rules

dusk comet
#

oh, Jelle already said that 😄

cyan raven
cyan raven
#

is there any reason why test_ast.py doesn't contain a check for ast.unparse(...)?

dusk comet
feral island
cyan raven
feral island
cyan raven
# feral island I don't understand what you mean
    def test_function(self):
        node = ast.FunctionDef(
            name="f",
            args=ast.arguments(posonlyargs=[], args=[], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]),
            body=[ast.Pass()],
            decorator_list=[],
            returns=None,
        )
        ast.fix_missing_locations(node)
        self.assertEqual(ast.unparse(node), "def f():\n    pass")

this is not

a = ...
b = ...

f(a=, b=)

It tests the function definition, not the arguments passed in. In fact, the test module ignores the call site entirely.
I mean when you added type params, you only checked the definition def f[T]().

feral island
cyan raven
feral island
cyan raven
feral island
#

sounds good!

low pecan
#

hi guys

#

who can help me

faint river
granite apex
#

anyone wanna see code i wrote tired

#

no well too bad

#

eh nvm

faint river
granite apex
#

OOPS

#

misclicked, was looking for general

boreal umbra
#

Has there been any talk about elevating pathlib.Path, like making Path a builtin?

dusk comet
#

all builtins are lowercase, so maybe path?

halcyon trail
#

what would be the benefit out of curiosity

dusk comet
#

no reason to use str for paths anymore

halcyon trail
dusk comet
#

some people are too lazy to from pathlib import Path

boreal umbra
pliant tusk
halcyon trail
#

I already use them everywhere

#

you could have a path literal, or not have to import Path

spark magnet
halcyon trail
#

those are the two main potential improvements I could imagine, I suppose

#

I'm not sure if those are super compelling though

halcyon trail
#

I will say I've been putting off, putting from pathlib import Path in my ipython "startup" for.... months now 😛

#

I really should do it

#

80% of the time I start up ipython and start messing around with our code, I need to immediately start by importing Path and datetime

#

so I can construct the Paths and dates I need, to pass into functions and see the outputs

#

but that also kind of illustrates how this is a rabbit hole, why make Path a built-in, why not date, or datetime, or regex, etc

feral island
#

Path is technically more difficult to make a builtin as it's entirely implemented in Python, while regexes and datetimes are mostly in C already

halcyon trail
#

interesting

feral island
#

Don't think that should be a blocking concern though; if we really wanted Path to be a builtin we could reimplement it in C or find a way to allow builtins to be written in Python

boreal umbra
boreal umbra
feral island
boreal umbra
#

(I got flamed last time I suggested something, and I'm still emotionally damaged.)

halcyon trail
#

out of curiosity when you say "built-in"

#

what do you really want

#

just not having to import it?

spice pecan
#

I assume it means having Path within the builtins module, which transitively means not having to import it, yeah

#

Since it's already somewhat built-in by being part of stdlib

halcyon trail
#

It just kind of feels like a tough sell to add anything to builtins, to me

#

honestly, half the things there don't belong there

#

but will stay out of inertia

#

Namespaces are one honking great idea -- let's do more of those!

#

I've gotten mildly exasperated so many times at the IDE warnings for naming a variable id 😛

#

i can probably count on one hand how many times I've actually used the builtin id function

feral island
#

Agree there's a few builtins that really don't need to be there

#

I tried to argue against aiter and anext becoming builtins but no luck

frigid bison
#

what do those even do? I thought Python always had clear naming PeepoSad

flat gazelle
#

they call __aiter__ and __anext__.

frigid bison
#

damn I didn't even know about those dunders, I guess I will need to educate myself

halcyon trail
#

async stuff, I guess

#

async iter and async next

frigid bison
#

ah yeah makes sense

halcyon trail
#

memoryview is also... pretty sus as a builtin

#

honestly looking at the list of builtins I feel like 80% of the stuff here I barely use and I don't really see a reason for it to be builtin

#

and it's fairly arbitrary that things like abs and pow are builtin, but not say exp

feral island
#

I mean 80% is random exception classes 😛

halcyon trail
#

lol

#

touche

#

I was just looking at built-in functions

flat gazelle
#

abs and pow I can see since they call dunders

feral island
#

so does math.floor though

#

abs, pow, round might as well have been in math too
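
To make the dunder forwarding concrete, here is a sketch with a hypothetical `Reading` class that reports which hook each function used. Note that `math.floor` dispatches the same way as the builtins do:

```python
import math

class Reading:
    """Hypothetical numeric wrapper that records which hook was used."""
    def __init__(self, value):
        self.value = value
    def __abs__(self):
        return ("__abs__", abs(self.value))
    def __round__(self, ndigits=None):
        return ("__round__", round(self.value, ndigits))
    def __floor__(self):
        return ("__floor__", math.floor(self.value))

r = Reading(-2.5)
results = [abs(r), round(r), math.floor(r)]
```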

halcyon trail
#

oops

feral island
#

I am not sure i have ever seen ascii() used

flat gazelle
#

I mostly use it via !a in fstrings/format strings
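
For reference, the `!a` conversion and `ascii()` give the same result, and both differ from `repr()` only for non-ASCII text:

```python
text = "café"

as_repr = repr(text)        # keeps the non-ASCII character
as_ascii = ascii(text)      # escapes it
via_fstring = f"{text!a}"   # !a applies ascii() inside an f-string
via_format = "{!a}".format(text)
```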

halcyon trail
#

why do they call dunders

#

I'm scared to learn

feral island
flat gazelle
#

oh wait, pow probably doesn't, I may have been wrong about that one. I think it just does the clever exponentiation by squaring thing

halcyon trail
#

ah. dang the whole function forwarding to a dunder thing is awkward

flat gazelle
#

ah

feral island
#

though there are complexities around the three-argument version I think
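
For user-defined classes, the builtin `pow()` does go through `__pow__`, including the three-argument form. A sketch with a hypothetical `Tracer` class that just echoes what it receives:

```python
class Tracer:
    """Hypothetical class that reports the arguments __pow__ receives."""
    def __pow__(self, exp, mod=None):
        return ("__pow__", exp, mod)

t = Tracer()
calls = [t ** 2, pow(t, 2), pow(t, 2, 5)]
```

The third argument is passed straight through as `mod`, which is part of what makes the three-argument version awkward to implement and to type.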

halcyon trail
#

I think ultimately the only thing that builtins gives you is "you don't have to import it" - so I'm not sure if there should be any criterion for builtins other than "super duper commonly used"

#

there's no problem with something calling a dunder being in math - there's already stuff like that there

flat gazelle
#

Yeah, the builtin selection is pretty strange, upon further review

#

and yet there is no builtin to call __index__ or __float__

halcyon trail
#

A few broad categories of builtins:

  • data structure returning functions
  • math
  • itertools
  • cpython internal stuff: id, other things probably
  • reflective stuff: setattr, getattr, eval, exec
  • type hierarchy stuff (isinstance, issubclass, etc)
  • IO (print, input)
  • pretending to be keywords (classmethod, staticmethod)
#

let me see what other categories are needed to cover basically everything

#

I guess str and repr don't really fit in any of those

feral island
#

where would you put bin() and hex()?

halcyon trail
#

right, so "type conversions" I guess?

feral island
#

(another two I've basically never seen used)

halcyon trail
#

bin, hex, str, and repr

flat gazelle
#

bin and hex I believe are for repl usage

prime estuary
#

Also chr and ord, those plus print/input I'd put in the category of convenience for the repl yeah.

flat gazelle
#

I use the python repl for converting between bases all the time
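
A quick sketch of the base conversions involved, in both directions:

```python
value = 255

as_binary = bin(value)
as_hex = hex(value)
as_octal = oct(value)

# int() with an explicit base goes the other way; base 0 honours prefixes.
back_from_hex = int("ff", 16)
back_from_prefixed = int("0b11111111", 0)

# Format specs give the digits without the prefix.
bare = format(value, "b"), format(value, "x")
```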

halcyon trail
#

and then there's also hash

prime estuary
#

ascii() I think is a Py2 remnant?

feral island
# prime estuary `ascii()` I think is a Py2 remnant?

nope
```
% ~/.pyenv/versions/2.7.18/bin/python
Python 2.7.18 (default, Oct  5 2023, 10:23:23)
[GCC Apple LLVM 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> ascii("x")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'ascii' is not defined
```

flat gazelle
#

ascii definitely should not be a builtin, I agree

prime estuary
#

repr, hash, id, getattr, setattr are all kinda fundamental operations...

halcyon trail
#

why does that matter that they're fundamental

#

how many times have you called the hash builtin directly

feral island
#

my guess is people thought early in the 2/3 transition that ascii() was something you'd want to do all the time

#

but it really isn't

feral island
flat gazelle
#

I do think there are cases where people call repr when they want ascii

halcyon trail
flat gazelle
#

because they forget ascii exists

halcyon trail
#

i mean, you'd need to be implementing a custom hash, that nonetheless uses hashes of other existing python types

#

but doesn't just do the obvious approach to combining them

#

because then a dataclass would already give you that
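
The two routes being compared, sketched with hypothetical point classes: a hand-written `__hash__` that calls `hash()` on a tuple of fields, and a frozen dataclass that generates an equivalent method.

```python
from dataclasses import dataclass

class ManualPoint:
    """Hand-written equality and hashing (hypothetical example)."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        return isinstance(other, ManualPoint) and (self.x, self.y) == (other.x, other.y)
    def __hash__(self):
        # The "obvious" combination: hash a tuple of the fields.
        return hash((self.x, self.y))

@dataclass(frozen=True)
class DataPoint:
    x: int
    y: int

manual_ok = hash(ManualPoint(1, 2)) == hash(ManualPoint(1, 2))
data_ok = hash(DataPoint(1, 2)) == hash(DataPoint(1, 2))
usable_as_key = {DataPoint(1, 2): "a"}[DataPoint(1, 2)]
```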

flat gazelle
halcyon trail
#

I'm saying that in relative terms, calling the hash built-in directly is very rare.
if we go back to the type that started this convo, lets take from pathlib import Path. My guess is that 80% of the builtins, including hash, are used far less than Path

feral island
#

there are 56 def __hash__ in our internal codebase. Likely many of these should be dataclasses but they're not

halcyon trail
#

Do all of those always call hash ?

regal glen
#

Picking builtins is an impossible thing, considering so many people use Python in vastly different ways.

feral island
#

too lazy to check that, but most of them probably do

regal glen
#

But like, improvements could be made

halcyon trail
#

i don't think built-ins are "impossible". You just pick the best possible set you can. Python's current set just happens to be pretty terrible.

regal glen
halcyon trail
#

obviously any two people will not agree precisely on what the built-ins should be. But there's still things that will be a lot more popular than others.

#

I mostly get annoyed over the built-ins that I basically never use, yet occupy an extremely natural variable name. that's why I used id as an example, I think it's one of the most egregious

flat gazelle
#

Yeah, id is definitely the worst offender

halcyon trail
#

i've never used id outside of toy pieces of python demonstrating when new objects are created or not

regal glen
halcyon trail
#

idk what you mean by "develop by the most popular" - but this is literally just a list of functions you save people from one line importing, there's not really much depth here

#

popularity is quite fine

#

it's not like language design decisions, like whether async should work via A or B, and A is more popular, but is it necessarily better, etc. Nothing that deep 😛

dusk comet
dusk comet
#

!e print(open, type(open))

fallen slateBOT
#

@dusk comet :white_check_mark: Your 3.12 eval job has completed with return code 0.

<built-in function open> <class 'builtin_function_or_method'>
dusk comet
#

hmmm

feral island
#

it has this weird thing where it claims to be in the io module

#

but it's definitely written in C
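
This can be poked at from Python. A small sketch; the exact `__module__` string may differ between versions, so it is only captured here, not assumed.

```python
import builtins
import io

same_object = builtins.open is io.open
claimed_module = open.__module__                # reported home of the function
is_python_function = hasattr(open, "__code__")  # C functions have no __code__
```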

dusk comet
dusk comet
halcyon trail
#

I mean there are tons of them that are weird

#

the fact that I've been programming python at one level or another for... idk, more than a decade, and there are functions here I've never used ever in a real program 🤷‍♂️

#

id, ascii, hex, bin, hex
even if I used one of these things they would still just be one-offs in the implementation of some function

#

that reminds me that bin is also one of the worst offenders; I've wanted to name variables bin many times, if they were the path to a binary

boreal umbra
boreal umbra
feral island
#

For builtins specifically it's not too clear what the inclusion criteria are, so who knows what people would say

#

(And a technical thing: instead of python-ideas the mailing list, we now have the Ideas category on discuss.python.org. It might be a little less prone to flaming.)

feral island
boreal umbra
#

!d pow

fallen slateBOT
#
pow

```py
pow(base, exp, mod=None)
```
Return *base* to the power *exp*; if *mod* is present, return *base* to the power *exp*, modulo *mod* (computed more efficiently than `pow(base, exp) % mod`). The two-argument form `pow(base, exp)` is equivalent to using the power operator: `base**exp`.
boreal umbra
#

interesting

feral island
#

@merry bramble I believe has some traumas about three-argument pow() that he might now be typing out 🙂

merry bramble
#

what's fun is that builtins.pow and operator.pow work differently

#

builtins.pow accepts 3 arguments, operator.pow only accepts 2

#

for Reasons

feral cedar
#

it sort of makes sense though, right? operator.pow should correspond to **, but ** is only a binary operator

flat gazelle
#

don't forget math.pow

#

float-only, two arguments
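
The three variants side by side, as a quick sketch:

```python
import math
import operator

builtin_two = pow(2, 10)
builtin_three = pow(2, 10, 1000)     # modular exponentiation
operator_two = operator.pow(2, 10)   # mirrors the binary ** operator
math_two = math.pow(2, 10)           # converts to float

try:
    operator.pow(2, 10, 1000)
    operator_accepts_three = True
except TypeError:
    operator_accepts_three = False
```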

raven ridge
#

A Path literal is an interesting idea, though. That sounds much more useful to me than just moving Path from pathlib to builtins.

feral cedar
#

p""? 👀

raven ridge
#

It's not crazy to imagine a p"..." that behaves like __import__("pathlib").Path(r"...")

#

I'm not sure it's a good idea, but it seems not totally unreasonable

raven ridge
#

Probably the biggest reason for having raw literals is paths, anyway, but now that we have a better data type for paths than str... Hm.

feral island
#

I suppose you'd use raw literals for Windows paths
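
Why raw literals matter for Windows paths, in a short sketch (the path itself is made up):

```python
from pathlib import PureWindowsPath

raw = PureWindowsPath(r"C:\new\table.txt")
cooked = "C:\new\table.txt"   # \n and \t become newline and tab

raw_parts = raw.parts
has_control_chars = "\n" in cooked and "\t" in cooked
```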

peak spoke
#

p"whatever".open() would look a bit weird

raven ridge
raven ridge
#

Doesn't really seem any weirder than b"...".decode() to me

prime estuary
#

Well you could just use open(p"whatever") instead.

#

One problem with making Path a builtin is that you'd also potentially want PurePath, the two OS flavours of each of those, and also now PathBase/PurePathBase for making subclasses...

halcyon trail
raven ridge
halcyon trail
#

so if you want to startup ipython and run some things interactively

#

you need paths

#

also yeah I'd agree that Path is used vastly more than PurePath

#

that said I don't really see why Path is a stronger candidate per se to get its own literal than say, dates, or datetimes. those are super common too.

#

i think at a language level the only things to really do are either a) accept that literals will only be for a handful of the lowest level types, and use functions for other stuff, or b) have a way to define new literals.

prime estuary
#

Time for macros I guess.

raven ridge
#

There's been talk about a way to define new literals for Python for quite a while now

raven ridge
prime estuary
#

Do we actually want a literal syntax? What's our actual goal here, saving a few characters and an import, or do we want it because constant forms can be cached?

feral island
prime estuary
#

That is a good reason yeah. They're so much nicer than os.path.

halcyon trail
#

I still think it's pretty ad hoc to start picking and choosing new literals like that

#

I guess in python it's more justifiable since python has container literals. but a) while it seemed good at the time, I don't think that decision aged well, and b) the language is a lot more tied to list/dict/set I think, than it will ever be to Path.

random thistle
#

Am I right, that the only reason for pathlib is to try to hide the insanity of handling pathnames on Windows?

raven ridge
#

No. Handling paths is annoying on every platform

spark magnet
#

i'm not even sure what is more difficult on windows than on unix, other than the "current directory per disk".

feral island
#

and junctions

spark magnet
random thistle
#

Only Windows has

  • Single-letter drive names
  • “Reserved” file names. Google tried to document how these worked, but even they managed to miss a few cases.
raven ridge
#

I don't think pathlib does anything about those, does it?

#

Huh, looks like it does

#

!d pathlib.PurePath.is_reserved

fallen slateBOT
#

```py
PurePath.is_reserved()
```
With [`PureWindowsPath`](https://docs.python.org/3/library/pathlib.html#pathlib.PureWindowsPath), return `True` if the path is considered reserved under Windows, `False` otherwise. With [`PurePosixPath`](https://docs.python.org/3/library/pathlib.html#pathlib.PurePosixPath), `False` is always returned.

```py
>>> PureWindowsPath('nul').is_reserved()
True
>>> PurePosixPath('nul').is_reserved()
False
```
File system calls on reserved paths can fail mysteriously or have unintended effects.
raven ridge
#

TIL

spark magnet
#

the reserved names are definitely odd. I don't see how single-letter drive names is a problem, but i haven't done a lot of windows programming.

raven ridge
#

UNC paths are odd, too

#

On every platform, you have weirdness like that a/foo/../bar and a/bar might be different files
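
This is the reason pathlib does not collapse `..` on its own, while the purely textual `normpath` does. A small sketch:

```python
import posixpath
from pathlib import PurePosixPath

p = PurePosixPath("a/foo/../bar")

# pathlib keeps ".." because collapsing it is wrong if "foo" is a symlink.
kept_parts = p.parts

# posixpath.normpath collapses it purely textually.
collapsed = posixpath.normpath("a/foo/../bar")
```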

raven ridge
#

Ya know... pathlib doesn't really do much to help with one of the most difficult things to handle about POSIX paths: they're not necessarily textual strings at all. They're sequences of bytes, but they need not be text.
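
Python's usual workaround for non-text file names is the `surrogateescape` error handler, which, as far as I know, is what `os.fsdecode` and `os.fsencode` use on POSIX. A sketch with a made-up file name, using the codec directly so it behaves the same on every platform:

```python
raw_name = b"caf\xe9.txt"   # Latin-1 bytes, not valid UTF-8

# surrogateescape smuggles undecodable bytes through str as lone surrogates.
as_text = raw_name.decode("utf-8", "surrogateescape")
round_tripped = as_text.encode("utf-8", "surrogateescape")

try:
    as_text.encode("utf-8")   # strict encoding rejects the surrogate
    strictly_encodable = True
except UnicodeEncodeError:
    strictly_encodable = False
```

The resulting `str` round-trips back to the original bytes, but it is not valid text, so printing or strictly encoding it can still fail.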

grave jolt
#

(typing powers is a pain IIRC)

native flame
#

these behaviors are not hard-coded into the Julia parser or compiler. Instead, they are custom behaviors provided by a general mechanism that anyone can use: prefixed string literals are parsed as calls to specially-named macros.

flat gazelle
#

Nim does something similar as well

spice pecan
#

That would be nice to have, not sure about long term effects on readability though

native flame
#

but also python doesnt have macros like julia does so it might be out of place

rose schooner
#

(with guido support i think)

spice pecan
#

Sounds nice, hopefully it'll get better treatment than none-aware operators

#

deferred since 3.8, might as well be rejected it seems

spark magnet
#

genuine question: why would path"abc.foo" be better than Path("abc.foo") ?

spice pecan
#

Depends, although it's more of a QoL thing than a really necessary change. If we're assuming it's a built-in prefix, it would eliminate the need to manually import pathlib, promoting the use of paths. If using Path is as easy as just adding p in front of a literal, there'll be little to no reason to not use real Paths in place of strings in APIs that could support both.

If we're talking "custom prefixes in general, including one for Path", then it depends on how prefix imports would be handled. Best case, for paths specifically, it would only mean that it doesn't matter if you import pathlib, from pathlib import Path or import pathlib as ..., regardless of how you imported it, just p'abc/def.ghi' works, every time. Worst case, it changes effectively nothing, just removes parens

#

That's what my interpretation is, anyway

spark magnet
#

i guess i keep coming back to 1) imports are not hard, and 2) we don't have regex literals. There are lots of programs that don't deal with file paths, just like there are lots of programs that don't use regexes.

#

but i've been on the wrong side of lots of discussions that have added new features to the language, so ¯_(ツ)_/¯

spice pecan
#

Objectively speaking, it really doesn't change much if at all, especially when using an IDE that autoimports stuff for you (vast majority of use cases)

#

It would be most noticeable when running something in REPL and/or short, throwaway scripts

spark magnet
quick snow
#

I think path"foo" is bad for several reasons, but one showstopper is that currently the string prefixes are all single-letter flags. That pretty much makes custom string prefixes not doable anymore.

#

How would you combine it with f-strings? fpath"foo{bar}"? pathf"foo{bar}"? Either? pafth"foo{bar}"?
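
For reference, a small sketch of how the existing prefixes behave as combinable one-letter flags. It only uses literals that are valid today; the helper `is_valid_literal` is a hypothetical name introduced for illustration.

```python
name = "bar"

# Case and order do not matter when existing one-letter prefixes combine.
raw_bytes_1 = rb"\n"
raw_bytes_2 = Br"\n"
raw_f_1 = fr"{name}\n"
raw_f_2 = Rf"{name}\n"


def is_valid_literal(source):
    """Return True if `source` compiles as a single expression."""
    try:
        compile(source, "<prefix-demo>", "eval")
    except SyntaxError:
        return False
    return True


# Not every pair of letters is accepted: bytes and f-strings do not mix.
bf_is_valid = is_valid_literal('bf"x"')
rb_is_valid = is_valid_literal('rb"x"')
```

Because each letter is an independent flag, a multi-letter word like `path` has no obvious way to slot into this scheme, which is the objection raised above.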

dusk comet
dusk comet
#

since th = f, we should use paf"abc/def"

halcyon trail
grave jolt
#

from * import *

frigid bison
#

someone has to make that work

sour thistle
#

considering that there are some joke imports guaranteed to give you errors, I'd argue that it already works as well as it would if it were implemented

#

!e from __future__ import braces

fallen slateBOT
#

@sour thistle :x: Your 3.12 eval job has completed with return code 1.

001 |   File "/home/main.py", line 1
002 |     from __future__ import braces
003 |     ^
004 | SyntaxError: not a chance
sour thistle
#

!e from * import *

fallen slateBOT
#

@sour thistle :x: Your 3.12 eval job has completed with return code 1.

001 |   File "/home/main.py", line 1
002 |     from * import *
003 |          ^
004 | SyntaxError: invalid syntax
sour thistle
#

it already gives the right error 😉

grave jolt
#

I think a humorous error might only confuse beginners who try this

halcyon trail
#

my biggest peeve with the python import system is probably the fact that imports show up transitively

#

like, the typical thing that people write, say import foo, is the "wrong" thing 95% of the time, since technically you are saying that foo is public API of your package

#

it should be import foo as _foo

#

Even worse, if you are using, say Path as an implementation detail, you shouldn't write from pathlib import Path
but rather from pathlib import Path as _Path
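
A minimal sketch of the effect being described, assuming two throwaway modules built in memory (the names `leaky`, `tidy`, `load` and `star_names` are made up for the example):

```python
import types


def load(name, source):
    """Build a module object from source text without touching the disk."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    return mod


leaky = load("leaky", "from pathlib import Path\n")
tidy = load("tidy", "from pathlib import Path as _Path\n")


def star_names(mod):
    """Names `from mod import *` would bind when the module has no __all__."""
    return [n for n in vars(mod) if not n.startswith("_")]


leaky_exports = star_names(leaky)  # the imported class shows up as an attribute
tidy_exports = star_names(tidy)    # the underscore alias is skipped
```

The underscore alias is still reachable as `tidy._Path`; it is only hidden by convention and from star-imports.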

#

yuk

sour thistle
halcyon trail
sour thistle
#

last I checked it also removes from the autocomplete suggestions
hmmm nope? weird, my autocomplete is even listing private variables. was it different for the current project vs installed dependencies or something... eh, never mind

halcyon trail
#

I guess it's ultimately convention

#

Still I would say this is a pretty ugly way to work in most cases

#

The average file doesn't want to re-export anything

#

You shouldn't need to re-list every single publicly defined entity in __all__ to achieve that
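
For comparison, a sketch of what the explicit-list approach looks like today. The module name `demo_exports` and the function `resolve` are hypothetical; the module is registered in `sys.modules` only so that a real star-import can run against it.

```python
import sys
import types

source = '''
from pathlib import Path
import os

__all__ = ["resolve"]  # every public name has to be listed by hand


def resolve(p):
    return Path(os.fspath(p)).resolve()
'''

mod = types.ModuleType("demo_exports")
exec(source, mod.__dict__)
sys.modules["demo_exports"] = mod
try:
    namespace = {}
    exec("from demo_exports import *", namespace)
finally:
    del sys.modules["demo_exports"]

# Only the listed name comes through; Path and os stay behind.
exported = sorted(n for n in namespace if n != "__builtins__")
```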

vapid kraken
jade raven
vapid kraken
#

ye ye

random thistle
#

Python can do token-level macros, with the help of the ast module. I used this in my wrapper for Asterisk, to implement both synchronous and asynchronous variants of all the main API classes from a common code base.
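
A rough sketch of that kind of transformation, not the actual Asterisk wrapper code: it works on the AST rather than on tokens, and only handles `async def` and `await` (no `async with` or `async for`). `StripAsync`, `fetch` and `FakeClient` are names invented for the example.

```python
import ast

ASYNC_SOURCE = '''
async def fetch(client):
    data = await client.get()
    return data * 2
'''


class StripAsync(ast.NodeTransformer):
    """Derive a synchronous variant from async source."""

    def visit_AsyncFunctionDef(self, node):
        self.generic_visit(node)
        # AsyncFunctionDef and FunctionDef share the same fields.
        fields = {name: getattr(node, name) for name in node._fields}
        return ast.copy_location(ast.FunctionDef(**fields), node)

    def visit_Await(self, node):
        self.generic_visit(node)
        return node.value  # `await expr` becomes plain `expr`


tree = StripAsync().visit(ast.parse(ASYNC_SOURCE))
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, "<sync-variant>", "exec"), namespace)


class FakeClient:
    def get(self):
        return 21


result = namespace["fetch"](FakeClient())
```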

fallen slateBOT
#

failmail :ok_hand: applied timeout to @unkempt rock until <t:1699484454:f> (10 minutes) (reason: duplicates spam - sent 4 duplicate messages).

The <@&831776746206265384> have been alerted for review.

charred thicket
#

Hi everyone! I'm getting familiar with the cpython repository for the first time out of hobbyism/interest. I want to implement some syntax changes in a personal version of python (not stuff I ever expect to get integrated at a large scale).

I have a high-level sense of how the interpreter works:
parser tokenizes input code string based on the PEG grammar
then an AST is made from the tokens
then the AST is turned into bytecode
the bytecode is run in the PVM

And so to implement new grammar, I'd need to modify the first three steps. I'd need to modify the grammar file (and maybe something manual to the parser but not sure), the formation of AST nodes to allow for a new one, turning that AST into bytecode

but honestly I'm finding the repo a bit intimidating and see where I'd make some of these changes but am confused about others

Is there some reference PR I can look at where someone else implemented new syntax recently? Like maybe the := syntax or something. I can't find a clean example in the PR history but maybe I'm missing it

cyan raven
# charred thicket Hi everyone! I'm getting familiar with the cpython repository for the first time...

This is a really small one - adding shorthand keyword arguments into CPython. Feel free to take a look at it here: https://github.com/Hels15/cpython/commit/bd99a946c83f67bfc8cae3c14ef7b9fbb6f0bd6a
Also note that you shouldn't change the parser manually, and the files related to ast.
A better and more detailed one can be found here: https://github.com/python/cpython/pull/103764

GitHub

I will update this message as the status of the PR changes.
This is a complete implementation. It incorporates the changes in python/peps#3122, which were approved by the SC.
I wrote a detailed acc...

feral island
# charred thicket Hi everyone! I'm getting familiar with the cpython repository for the first time...

to add to what @cyan raven said, this page is useful: https://devguide.python.org/developer-workflow/grammar/. Depending on the change you're making, you might not need to change the bytecode

charred thicket
#

Thanks folks -- will check these out. I appreciate it

feral island
#

oh and one minor correction to your initial post: you said "parser tokenizes input code string based on the PEG grammar"

#

in fact the PEG grammar kicks in at the next step. First the tokenizer turns raw source code into a stream of tokens, then the PEG grammar is what turns the token stream into an AST
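
The same stages can be poked at from Python itself. This is only a sketch: the `tokenize` module stands in for the C tokenizer here, and the exact bytecode differs between versions.

```python
import ast
import dis
import io
import tokenize

source = "x = 1 + 2\n"

# 1. Source text -> token stream.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
token_strings = [t.string for t in tokens if t.string.strip()]

# 2. Token stream -> AST (this is where the PEG grammar applies).
tree = ast.parse(source)

# 3. AST -> code object containing bytecode.
code = compile(tree, "<demo>", "exec")
opnames = [ins.opname for ins in dis.get_instructions(code)]

# 4. Bytecode -> execution.
namespace = {}
exec(code, namespace)
```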

charred thicket
#

Ahh I appreciate it; that makes sense

#

Other than https://devguide.python.org/developer-workflow/grammar/, is there something you recommend looking at to learn how CPython works and which files are relevant? E.g. something that says something like "First, input code is tokenized using Parser/tokenizer/string_tokenizer.c. Then, the tokens are run through a PEG parser specified by Grammar/python.gram. Then, an AST is built using {x}", and so on. Basically some way to learn how to think about which files are doing what other than by reading the source. If no such thing exists, no problem, that's probably expected

dusk comet
#

there are talks about python internals on pycon and other conferences, you might find something useful there

feral island
charred thicket
#

wonderful, thank you both

charred thicket
#

I'm working on a toy change where I want to introduce the ?. syntax, where a?.b is equivalent to None if a is None else a.b

My sense is that I don't need to change the compiler because all the changes can be done at the AST level: when parsing tokens into the AST, I can parse ?. into AST nodes representing None if a is None else a.b instead of making a new type of node

So I think this means that I need to:

  • update Grammar/python.gram to introduce the syntax in the grammar
  • update Grammar/Tokens to introduce the new token
  • run make regen-token and make regen-pegen to regenerate the tokenizer (which I think is Parser/tokenizer.c? but not sure) as a function of those two updated files
  • update Parser/parser.c to handle the new case, appending AST nodes corresponding to None if a is None else a.b for the new kind SafeAttribute_kind

Then, add tests and such

I understand that this approach -- where I only change the AST and not the compiler -- may mess up things like showing where syntax errors are. I'm actually not sure why this is but I assume it's because there's ambiguity about what source code led to the bytecode that's actually encountering the error when unparsing. But I can live with that for this first project.

Does this sound right to you? Is this missing anything?

#

Hmm I'm feeling most uncertain about the last step there -- do I update Python/ast.c instead of Parser/parser.c? Python/Python-ast.c got updated through some of the make regen-* functions I think; I see a new _PyAST_SafeAttribute(....) definition in it

#

It looks like I might want Parser/parser.c ultimately, but it gets generated. So would I really need to change Tools/peg_generator/pegen/c_generator.py to make sure that parser.c emits a production rule for safe_attr that appends a set of AST nodes that represent None if a is None else a.b instead of _PyAST_SafeAttribute, which it seems to have done by default?

rose schooner
feral island
charred thicket
#

phew alright glad I asked the gurus, lol

rose schooner
feral island
#

for your change, if it was a real CPython change, I would recommend adding a new AST node instead of generating a fake one

#

however, what you suggest could work too. I don't think you should see a new _PyAST_SafeAttribute function though

rose schooner
#

so i already did this some time ago and i didn't create a new AST node although that would've probably helped

rose schooner
#

nvm

#

create a new AST node or not the amount of writing to do are sort of the same

charred thicket
#

Hm, so suppose I wasn't making a new node; what's the right place to turn the new token into multiple AST nodes instead of a new AST node that needs to be implemented in Python/compile.c?

#

If it isn't parser.c

#

My fear about adding a new node is that I think that means I have to touch Python/compile.c, which I understand is doing something like generating bytecode from the AST, and it looks intimidating af

#

Though I do understand how it could end up resulting in the same amount of writing, if this^ is accurate @rose schooner

#

Really appreciate the guidance btw

feral island
#

e.g. this is the rule for a if b else c
```
    | a=disjunction 'if' b=disjunction 'else' c=expression { _PyAST_IfExp(b, a, c, EXTRA) }
```

#

you'd have to write a call like that in your rule but hardcode some of the arguments

charred thicket
#

I see so instead of

primary[expr_ty]:
+   | a=primary '?.' b=NAME { _PyAST_SafeAttribute(a, b->v.Name.id, Load, EXTRA) }
    | a=primary '.' b=NAME { _PyAST_Attribute(a, b->v.Name.id, Load, EXTRA) }

I want to do something like (replacing just the added + line)

_PyAST_IfExp( _PyAST_Attribute(value, attr->v.Name.id, Load, EXTRA), attr, _PyAST_Constant(Py_None, NULL, EXTRA))

feral island
#

yeah something like that, I think the attr in the middle is wrong

#

it should be equivalent to a is not None, I think a Compare node of some sort
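
The shape of that desugaring can be tried out with the Python-level `ast` module, whose node constructors mirror the C `_PyAST_*` helpers (test, body, orelse for `IfExp`). This is a sketch under the assumption that the receiver is a plain name; the helper `safe_attribute` is hypothetical, and a real grammar action would also have to avoid evaluating an arbitrary `a` twice.

```python
import ast
from types import SimpleNamespace


def safe_attribute(value_name, attr):
    """Build the AST for `<value>.<attr> if <value> is not None else None`."""
    return ast.IfExp(
        test=ast.Compare(
            left=ast.Name(id=value_name, ctx=ast.Load()),
            ops=[ast.IsNot()],
            comparators=[ast.Constant(value=None)],
        ),
        body=ast.Attribute(
            value=ast.Name(id=value_name, ctx=ast.Load()),
            attr=attr,
            ctx=ast.Load(),
        ),
        orelse=ast.Constant(value=None),
    )


expr = ast.Expression(body=safe_attribute("a", "b"))
ast.fix_missing_locations(expr)
code = compile(expr, "<safe-attr>", "eval")

when_none = eval(code, {"a": None})
when_set = eval(code, {"a": SimpleNamespace(b=5)})
```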

dusk comet
#

is there a pep about intrinsic bytecode instructions?

cyan raven
#

Ah I see

dusk comet
#

they are described in docs, but i thought there is also a pep

feral island
#

PEP 659 about the specializing adaptive interpreter is sort of in this area

dusk comet
#

ok, thanks

feral island
#

I think intrinsics were initially added to free up some bytecodes used for uncommon operations, and then I really jumped on them for PEP 695 🙂

thin anvil
#

Migrated from #python-discussion, per @is.alex's recommendation:

On Python, why does the GIL's removal need to be done over several versions, aside from not causing a shock in the ecosystem? Is there "This One Thing" that needs the GIL so much that its immediate removal wreaks havoc?

dusk comet
#

there is: thousands of C-extensions on pypi

#

it is explained in the pep clearly, and also there is discourse thread

thin anvil
#

What does the GIL have to do with C-extensions? If anything, just the move from 3.10 to 3.11 has broken plenty of C-extensions by deprecating certain names

thin anvil
# dusk comet it is explained in the pep clearly, and also there is discourse thread
dusk comet
thin anvil
#

such as?

feral island
# thin anvil such as?

Many C extensions implicitly depend on the GIL for protecting access to data structures. They were designed under the assumption that the GIL was present to make concurrency simpler.

flat gazelle
#

Is there a reason for pyright and mypy to reject this, or is it a bug?

def f(x: int | None):
    match [x]:
        case [None]: print('not here')
        case [y]: print('doubled', y*2)
#

It works fine if the None/y is not in a nested pattern

dusk comet
#

it is kinda bug
typechecker doesnt know type of what variable to narrow, so it doesnt narrow anything, so in second case y is still int | None

cyan raven
flat gazelle
#

In the actual example I ran into this in, yes. I simplified here

#

I rewrote it to just check for None in the match branch

cyan raven
#

Does someone know where the lexer entry point is in CPython? Where the source is passed in (presumably as a string).

fallen slateBOT
#

Parser/pegen.c line 927

```c
_PyPegen_run_parser_from_string(const char *str, int start_rule, PyObject *filename_ob,
```
fallen slateBOT
#

Parser/tokenizer/string_tokenizer.c line 54

```c
decode_str(const char *input, int single, struct tok_state *tok, int preserve_crlf)
```
fallen slateBOT
#

Parser/tokenizer/string_tokenizer.c line 112

```c
_PyTokenizer_FromString(const char *str, int exec_input, int preserve_crlf)
```
cyan raven
dusk comet
#

i think tokenizer is lazy, it is like python generator
parser pulls tokens from it when needed
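
The pull model is easy to see with the Python-level `tokenize` module, used here as an analogy for the C tokenizer rather than as the code path the parser actually takes:

```python
import io
import tokenize

source = "a = 1\nb = 2\n"

# generate_tokens returns an iterator; nothing is produced until asked for.
token_stream = tokenize.generate_tokens(io.StringIO(source).readline)

first = next(token_stream)   # the consumer pulls one token...
second = next(token_stream)  # ...and then the next one
rest = list(token_stream)    # drain whatever is left
```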

fallen slateBOT
#

Parser/pegen.c line 756

```c
p->tok = tok;
```
dusk comet
#

im too scared to go further

cyan raven
#

I'm talking about the first time when the source is fed in and the lexer makes tokens out of it.

#

lexer.c doesn't seem to have a main loop or smt, so I suppose those are just helper functions.

#

which is a bit tricky.

dusk comet
cyan raven
fallen slateBOT
#

Parser/lexer/lexer.c line 366

```c
tok_get_normal_mode(struct tok_state *tok, tokenizer_mode* current_tok, struct token *token)
```
dusk comet
#

i guess that returns next token

cyan raven
#

interesting.

spark magnet
#

There doesn't seem to be a way to disable the specializing adaptive interpreter, is that right? I have a bizarre situation that would be explained by code objects being mutable.

spark magnet
#

the results of specialization are visible in dis output, right?

#

hmm, yes.
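
A sketch of how to look at this, assuming Python 3.11 or later for the `adaptive` flag. Which specialized instruction names appear depends on the version and on how the function was called, so none are asserted here.

```python
import dis
import sys


def add(a, b):
    return a + b


before = add.__code__.co_code
for _ in range(1000):  # warm the function up so it has a chance to specialize
    add(1, 2)
after = add.__code__.co_code

baseline = [ins.opname for ins in dis.get_instructions(add)]
if sys.version_info >= (3, 11):
    # adaptive=True asks dis to show the specialized (quickened) instructions.
    adaptive = [ins.opname for ins in dis.get_instructions(add, adaptive=True)]
else:
    adaptive = baseline
```

As far as I can tell, `co_code` reports the unspecialized form, so comparing `before` and `after` will not reveal the in-place rewriting; the `adaptive` listing is where it shows up.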

raven ridge
quick snow
raven ridge
#

I'm shocked that's hashing byte codes at all... I'd have expected function hashing to only be based on identity

spark magnet
spark magnet
raven ridge
#

And sys.monitoring only exists in 3.12+

quick snow
spark magnet
charred thicket
#

Hi guys -- thanks for your help earlier. I'm working on a project to implement the ?. syntax, such that a?.b is equivalent to None if a is None else a.b. This is just a personal project for me to understand python better; don't worry, I don't have any grand hopes of this getting incorporated into the language

When I run a simple test of it I see a failure:

Traceback (most recent call last):
  File ".../repos/cpython/Lib/test/test_safeattr.py", line 12, in test_safe_attr
    a?.b
AttributeError: 'NoneType' object has no attribute 'b'

Does anyone understand why this is happening? I would expect that my code in compiler_visit_expr1 would cause this to be evaluated as None. Here is the relevant piece of code: https://github.com/nishu-builder/cpython/pull/2/files#diff-ebc983d9f91e5bcf73500e377ac65e85863c4f77fd5b6b6caf4fcdf7c0f0b057R6285

  1. Visit e->v.SafeAttribute.value, which is the a in a?.b I believe. This puts the value of a at the top of the stack
  2. Make another copy of it and put that at the top of the stack. So now our stack looks like [a, a]
  3. Load NONE on to the stack. Stack: [a, a, None]
  4. ADDOP_I(c, LOC(e), IS_OP, 1);, which I believe pops the last two items on the stack, checks whether one is the other, and puts that result on the stack. Stack: [a, a is None]
  5. Pops the last value off the stack, and if it's true, jumps. Stack: [a]
  6. a) if a is None: end

This leaves None on the top of the stack, and nowhere do we try to access a.b as far as I can tell, so I'm not sure where the AttributeError is coming from

#

I think I got @rose schooner and @feral island 's thoughts earlier (for a version of this that potentially only changed the grammar), but I'm taking their advice and trying to actually make a dedicated AST node and its own bytecode generation

charred thicket
charred thicket
#

Oh never mind I got it! I just had my comparator flipped for the a is None check. All good
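
For anyone hitting the same thing: the argument to `IS_OP` selects the sense of the comparison, which can be checked from Python (3.9 or later, where `IS_OP` exists). The helper name `is_op_arg` is made up for this sketch.

```python
import dis


def is_op_arg(expression):
    """Return the oparg of the first IS_OP in the compiled expression."""
    code = compile(expression, "<is-op-demo>", "eval")
    for ins in dis.get_instructions(code):
        if ins.opname == "IS_OP":
            return ins.arg
    return None


plain = is_op_arg("a is None")         # 0 means `is`
inverted = is_op_arg("a is not None")  # 1 means `is not`
```

So `ADDOP_I(c, LOC(e), IS_OP, 1)` emits an `is not` comparison, consistent with the flipped check described above.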

rose schooner
fallen slateBOT
#

:incoming_envelope: :ok_hand: applied timeout to @steep wagon until <t:1699812176:f> (10 minutes) (reason: links spam - sent 26 links).

The <@&831776746206265384> have been alerted for review.

cyan raven
raven ridge
#

that's the implementation of the gc module

cyan raven
raven ridge
#

oh, I'm wrong - that's the implementation of the gc module, and the implementation of the garbage collector itself - they're both in that file

fallen slateBOT
#

Modules/gcmodule.c line 1198

```c
gc_collect_main(PyThreadState *tstate, int generation,
```
cyan raven
raven ridge
#

yeah, that's what I meant by my correction - it does contain the implementation of the garbage collector itself

dusk comet
#

is it possible to patch something in gc module and use other gc algorithm?

#

rc is baked into python pretty deeply, but gc works on top of that, so i think it might be possible to change algo

feral island
#

nogil will change how gc works

charred thicket
# rose schooner that's a more well documented change than what i did

Now I'm giving a go at the same idea but for subscripts (e.g. a?[b] is equivalent to None if a is None else a[b]). https://github.com/nishu-builder/cpython/pull/3

Though I'm realizing that this totally messes with parentheses balancing! This would mostly be fine, I think, if ?[ was one character, since lexer.c seems to make the assumption that each token is one character

Anyone have a sense of how to address this?

dusk comet
#

<@&831776746206265384>

#

they are crossposting

tepid bloom
#

!pban 783527861238759435 seems like you're just here to spam advertisement

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @novel lynx permanently.

rose schooner
#

also a?[b] *= 4 checking for a would be a nice feature albeit it'll take some more thinking to implement

cyan raven
#

what is the algorithm that CPython is using (for gc)? "mark-and-sweep"?

cyan raven
fallen slateBOT
#

Modules/gcmodule.c line 1198

```c
gc_collect_main(PyThreadState *tstate, int generation,
```
cyan raven
raven ridge
cyan raven
raven ridge
#

No

dusk comet
# cyan raven rc?

rc = refcounting
it is independent from gc, current gc relies on the fact that rc is happening

halcyon trail
#

i know python uses this distinction but in general, RC wouldn't be considered "independent" from GC; it's just part of the overall GC strategy

dusk comet
#

i think about it in a different way: GC is a nice addition to RC-based memory deallocation strategy

#

in python 99% of all deallocations happen because rc dropped to 0
and only tiny amount of objects participate in refcycles
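
A small CPython-specific sketch of the two mechanisms side by side (behaviour on other implementations, or on free-threaded builds, may differ). The `Node` class is just a placeholder.

```python
import gc
import weakref


class Node:
    pass


gc.disable()  # switch off the cycle collector; refcounting keeps working
try:
    lone = Node()
    lone_ref = weakref.ref(lone)
    del lone  # refcount hits zero, object is freed right away
    freed_by_refcount = lone_ref() is None

    left, right = Node(), Node()
    left.other, right.other = right, left  # reference cycle
    cycle_ref = weakref.ref(left)
    del left, right  # refcounts stay at 1 because of the cycle
    survives_without_gc = cycle_ref() is not None

    gc.collect()  # an explicit collection still runs while disabled
    freed_by_cycle_collector = cycle_ref() is None
finally:
    gc.enable()
```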

cyan raven
#

wondering what a bespoke garbage collection is then.

#

I might just go through the source and see what's happening, can't really go further

dusk comet
raven ridge
#

It's not a tracing GC, because a tracing GC assumes the existence of some roots that it can trace out from, and CPython's GC doesn't

dusk comet
#

__main__ is the root that references everything else
(and there are other roots: sys, builtins, cached ints)

dusk comet
#

then how does it know which part of graph should be deleted and which should not?
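
As I understand the design notes, the cycle collector answers this without roots by subtracting references that come from inside the tracked set: whatever still has a positive count must be referenced from outside, and everything reachable from those objects is kept. A toy model of that idea, with made-up object names and hand-written refcounts (the real implementation in gcmodule.c is considerably more involved):

```python
def find_unreachable(refcounts, edges):
    """Toy version of the 'subtract internal references' approach.

    refcounts: total reference count per tracked object
    edges: which tracked objects each object refers to
    """
    gc_refs = dict(refcounts)
    # Subtract every reference that originates inside the tracked set.
    for targets in edges.values():
        for target in targets:
            if target in gc_refs:
                gc_refs[target] -= 1

    # Objects with references left over are referenced from outside.
    reachable = set()
    stack = [obj for obj, count in gc_refs.items() if count > 0]
    while stack:
        obj = stack.pop()
        if obj in reachable:
            continue
        reachable.add(obj)
        stack.extend(t for t in edges.get(obj, ()) if t in gc_refs)

    return set(gc_refs) - reachable


# a <-> b is an orphaned cycle; c <-> d is a cycle that a variable still holds.
refcounts = {"a": 1, "b": 1, "c": 2, "d": 1}
edges = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]}
garbage = find_unreachable(refcounts, edges)
```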

raven ridge
halcyon trail
#

garbage collection is a much broader thing than "the cycle detector for a reference counted language"

raven ridge
#

CPython tends to say that it's got two different GCs, a naive reference counting one and an optional cycle collecting one

halcyon trail
#

that's a pretty weird take too, I think; if someone said a language had two different GC's I would expect that they both work correctly on their own and you choose which to use

#

if you only use one of python's GC's, then it... doesn't really have correctly working GC in the sense that almost anyone would expect

#

they're practically speaking two components of python's GC

raven ridge
#

!zen 13

fallen slateBOT
#
The Zen of Python (line 13):

Although that way may not be obvious at first unless you're Dutch.

halcyon trail
#

😂

#

Guilty; I'm not dutch 😛

raven ridge
#

😄

halcyon trail
#

it is interesting to imagine disabling python's cycle collector and re-imagining it as a swift like, ARC language

#

encourage the use of finalizers!

#

no more context managers

#

You would use the cycle collector instead as a diagnostic tool to tell you when you'd accidentally created cycles

#

this seems like a potentially fun wacky project and I'm confident that nobody is actually trying to do this in prod (or at least I hope not)

merry bramble
# fallen slate

My takeaway is that we should direct all our further questions on this to @feral island

neat delta
#

he is the former prime minister after all

tawny aurora
#

hello! all folks here are python professionals?

charred thicket
# rose schooner specially handle `?[` in the C tokenizer files or just parse `?` and `[` as sepa...

Sweet, did that

I originally had no grand delusions of getting my code introduced into Python, but now that I've written the changes I can't help but wonder if there's a shot

I posted this https://discuss.python.org/t/new-syntax-for-safe-attribute-and-safe-subscript-access/38643/2 and someone responded noting that this PEP exists, which has some real overlap: https://discuss.python.org/t/new-syntax-for-safe-attribute-and-safe-subscript-access/38643/2

I'd love to work on that PEP, but I'm not sure (and can't find if I scan the github PRs/etc) if this is already done or claimed. How can I find that out and claim it if it isn't done?

#

I see, it looks like PEP 505 has been deferred

Well @rose schooner or @feral island (or others; pinging you two since you both provided me help over the last few days) if either of you has interest in being the PEP Delegate for these changes, I'd really appreciate it. No pressure at all!

#

And is it appropriate for me to submit PRs for a Deferred PEP (if I have the understanding that it may not get merged)?

raven ridge
#

Personally, speaking as not-a-core-dev, I think it would be better to submit a PR for it than not. Sure, it'll probably get closed, but it might provide a starting point for someone else to pick this up if the PEP ever gets un-deferred

dusk comet
#

you also can mark PR as draft and explicitly state that it is an implementation for existing PEP

charred thicket
#

Sweet thanks

rose schooner
quick snow
#

But I don't think that's applicable here either, unless you want to create a new PEP.

feral island
fallen slateBOT
#

:incoming_envelope: :ok_hand: applied timeout to @unborn pelican until <t:1699976366:f> (10 minutes) (reason: emoji spam - sent 129 emojis).

The <@&831776746206265384> have been alerted for review.

boreal umbra
#

For relative imports, does anyone have a go-to resource that explains how the leading-dot notation works, and how to run code involving relative imports with -m?

raven ridge
# boreal umbra For relative imports, does anyone have a go-to resource that explains how the le...

the first part is easy. if the module name between from and import starts with a leading dot, then:

  • raise an error if __package__ isn't set ("attempted relative import with no known parent package")
  • split __package__ apart on dots
  • count how many leading dots there are in the import
  • raise an error if there aren't at least that many components in the split __package__ ("attempted relative import beyond top-level package")
  • remove one fewer than that many components from the end of the split __package__
  • rejoin whatever's left with dots, and prepend it to whatever came after the import's leading dots with a dot between them
#

so if __package__ is foo.bar.baz then:
```
from . import bang # from foo.bar.baz import bang
from .a.b import bang # from foo.bar.baz.a.b import bang
from .. import bang # from foo.bar import bang
from ...something import bang # from foo.something import bang
from .... import bang # error ("attempted relative import beyond top-level package")
```

raven ridge
# boreal umbra For relative imports, does anyone have a go-to resource that explains how the le...

and for part 2... If you run python -m foo.bar.baz, __package__ will be set to something different depending on whether foo.bar.baz is a package or a non-package module. If it's a package, then the __main__.py will be run and __package__ will be set to "foo.bar.baz". If it's a non-package module, then baz.py will be run and __package__ will be set to "foo.bar" -- either way, relative imports will be resolved as explained above, based on the value of __package__
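
The resolution steps listed earlier can be written out directly and checked against `importlib.util.resolve_name`, which does the same job. `resolve_relative` is a name made up for this sketch.

```python
import importlib.util


def resolve_relative(name, package):
    """Turn a dotted relative module name into an absolute one."""
    if not name.startswith("."):
        return name
    if not package:
        raise ImportError("attempted relative import with no known parent package")

    level = len(name) - len(name.lstrip("."))  # number of leading dots
    parts = package.split(".")
    if level > len(parts):
        raise ImportError("attempted relative import beyond top-level package")

    # Drop one fewer component than there are dots, then append the rest.
    base = parts[: len(parts) - (level - 1)]
    rest = name[level:]
    return ".".join(base + ([rest] if rest else []))


examples = [".", ".a.b", "..", "...something"]
resolved = {n: resolve_relative(n, "foo.bar.baz") for n in examples}
expected = {n: importlib.util.resolve_name(n, "foo.bar.baz") for n in examples}
```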

neat delta
#

!rule 7 ad
no cross-channel spamming either

fallen slateBOT
#

6. Do not post unapproved advertising.

7. Keep discussions relevant to the channel topic. Each channel's description tells you the topic.

silver vale
#

I just found a limitation with inspect that I wasn't aware of...

You can't get the sourcecode (using inspect) from a function defined within a shell session or an exec call... This will fail:

code = """
import inspect

def my_fun(n):
    return n ** 2

inspect.getsource(my_fun)
"""
exec(code)

The same if you define the function in a shell session.

It kinda makes sense, because the findsource tries to read the lines from the source file[0], which doesn't exist.

But i'm wondering if this limitation could be somehow overcome? 🤔

[0] https://github.com/python/cpython/blob/0ee2d77331f2362fcaab20cc678530b18e467e3c/Lib/inspect.py#L1130
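
One workaround I'm aware of is to seed `linecache` with the source under a made-up filename before compiling, since `inspect` falls back to `linecache` for filenames that do not exist on disk. A sketch, with `exec_with_source` and the `<exec-demo-1>` filename as invented names; it relies on the layout of `linecache.cache` entries, which is an implementation detail.

```python
import inspect
import linecache


def exec_with_source(source, filename, namespace):
    """exec() source while keeping it retrievable by inspect.getsource()."""
    lines = source.splitlines(keepends=True)
    # Entry layout: (size, mtime, lines, fullname); mtime=None keeps
    # linecache.checkcache() from discarding the entry.
    linecache.cache[filename] = (len(source), None, lines, filename)
    exec(compile(source, filename, "exec"), namespace)


code = """
def my_fun(n):
    return n ** 2
"""

namespace = {"__name__": "exec_demo"}
exec_with_source(code, "<exec-demo-1>", namespace)
recovered = inspect.getsource(namespace["my_fun"])
```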

rose schooner
misty oxide
#

I think I just found a bug. Can anyone confirm that this is in fact unintentional?

class ConstructsNone(BaseException):
  @classmethod
  def __new__(*args, **kwargs): return None

raise Exception("Printing this exception raises an exception :3") from ConstructsNone

In Python/ceval.c, in do_raise(), when you raise an object, cpython checks if it's an exception type, and if it is, constructs it by calling it with no arguments. Then it checks to make sure that what was constructed is in fact an exception. Then it does the same thing for the exception's cause. If it's a type, it constructs the cause by calling it with no arguments. But, for the cause, it actually doesn't check to make sure that the result of the call is in fact an exception, it just stores the result without checking.

Then, when the interpreter goes to print the cause, it expects it to be an exception. This leads to yet another exception being raised, telling you that the cause is the wrong type.
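
For contrast, this is the ordinary path the description refers to, where the cause is given as a class and gets instantiated for you:

```python
try:
    # KeyError is a class here; the interpreter calls it with no arguments.
    raise ValueError("outer") from KeyError
except ValueError as exc:
    cause = exc.__cause__

cause_is_instance = isinstance(cause, KeyError)
```

The report above is about what happens when that implicit call returns something that is not an exception.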

grave jolt
#

!e

class ConstructsNone(BaseException):
  @classmethod
  def __new__(*args, **kwargs): return None

raise Exception("Printing this exception raises an exception :3") from ConstructsNone
fallen slateBOT
#

@grave jolt :x: Your 3.12 eval job has completed with return code 1.

001 | TypeError: print_exception(): Exception expected for value, NoneType found
002 | 
003 | The above exception was the direct cause of the following exception:
004 | 
005 | Traceback (most recent call last):
006 |   File "/home/main.py", line 5, in <module>
007 |     raise Exception("Printing this exception raises an exception :3") from ConstructsNone
008 | Exception: Printing this exception raises an exception :3
misty oxide
misty oxide
#

:3

misty oxide
#

Fancy seeing you here ❤️

halcyon trail
#

I felt the same way 😛 That plus the bug being genuinely cool made me want to reach out

misty oxide
#

Added the test, and made sure it worked with both a patched and an unpatched version.

unkempt rock
#

Hi

cyan raven
feral island
cyan raven
feral island
grave jolt
#

based refcounting

cyan raven
feral island
#

i.e., a lot

cyan raven
cyan raven