#internals-and-peps


pliant tusk
#

I should probably register an atexit to revert classes before teardown

warm breach
pliant tusk
warm breach
#

maybe I'll switch to finalizing on the type instead

#

all types are weakref-able (I think?)

pliant tusk
#

why do you want to keep track?

warm breach
pliant tusk
#

finalize isnt passed the class tho

warm breach
#

thought about atexit but I think the type might not exist by then?

pliant tusk
#

atexit and finalize both get called at the same place for static types it seems

#

this was hit by weakref.finalize(int, hit_bp);exit()

warm breach
pliant tusk
#
```py
>>> weakref.finalize(A(), lambda *A:print(A))
()
<finalize object at 0x105f6a8c0; dead>
>>> 
```
wdym?
warm breach
pliant tusk
#

finalize doesnt pass any arguments as far as i can tell

warm breach
#

hm? You can "store" args for it to pass kind of like partial

pliant tusk
#

oh i didnt know that

#

cool

warm breach
#

you can't store the object itself though, since that will make it never be GC'd

pliant tusk
#

seemed to work here

#

or actually weakref.finalize callbacks that haven't been triggered by the object being destroyed are just called at exit

#

so cyclical ones will be called at interpreter teardown
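
A minimal sketch of the behaviour discussed above, assuming CPython: `weakref.finalize` stores extra arguments much like `functools.partial`, and the callback never receives the object itself. The names `A`, `on_finalize`, and `calls` are made up for illustration.

```python
import gc
import weakref


class A:
    pass


calls = []


def on_finalize(*args, **kwargs):
    # Receives only the arguments stored at registration time,
    # never the object being finalized.
    calls.append((args, kwargs))


obj = A()
fin = weakref.finalize(obj, on_finalize, "stored", key="value")
alive_before = fin.alive   # the finalizer is still pending here
runs_at_exit = fin.atexit  # True by default: pending finalizers run at exit

del obj
gc.collect()  # not strictly needed on CPython, where refcounting frees obj

print(calls)
print(alive_before, fin.alive, runs_at_exit)
```

Passing `obj` itself as one of the stored arguments would keep it alive, so the callback would only fire at interpreter exit.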

deep nova
#

Can anyone point me to lexers for python source code?

#

I've seen a few, but they're few and far between

deep nova
#

Wonderful!

warm breach
#

is it possible to subclass ctypes.Structure without defining _fields_

crisp flume
#

Hi

warm breach
#

I just want to provide some common mixin methods using a base Structure class

pliant tusk
fallen slateBOT
#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | 1000000084
002 | <__main__.B object at 0x7f4c952f4710>
warm breach
#

I thought it said _fields_ must be defined before the class is subclassed

pliant tusk
#

¯\_(ツ)_/¯

warm breach
#

!e

from ctypes import *

class Struct(Structure):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._fields_ = [(k, v) for k, v in cls.__annotations__.items()]

class Foo(Struct):
    ob_refcnt: c_ssize_t
    ob_type: py_object
fallen slateBOT
#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 8, in <module>
003 |   File "<string>", line 6, in __init_subclass__
004 | TypeError: ctypes state is not initialized
warm breach
#

__init_subclass__ can't assign to _fields_ apparently though, weird

pliant tusk
#

weird

pliant tusk
warm breach
#

when does init_subclass happen anyways?

#

I thought it was after the class was defined, as if you had a decorator

pliant tusk
# warm breach I thought it was after the class was defined, as if you had a decorator

!e ```py
from ctypes import *

class Cmeta(type(Structure)):
    def __init__(self, name, bases, mapping, **kwargs):
        super().__init__(name, bases, mapping, **kwargs)
        # build _fields_ from the class annotations
        self._fields_ = list(mapping.get('__annotations__', {}).items())

class Struct(Structure, metaclass=Cmeta):
    # methods here
    pass

class PyObject(Struct):
    ob_refcount: c_ssize_t
    ob_type: py_object

print(sizeof(PyObject), PyObject._fields_)```

fallen slateBOT
#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

16 [('ob_refcount', <class 'ctypes.c_long'>), ('ob_type', <class 'ctypes.py_object'>)]
pliant tusk
#

I got it to work

warm breach
#

o.O

pliant tusk
#

Gotta love metaclasses

warm breach
#

so when does Cmeta.__init__ get called here

pliant tusk
#

When subclasses are initialized

warm breach
#

huh...

#

but it's later than __init_subclass__?

pliant tusk
#

Yeah

#

I guess
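
The ordering question can be checked without ctypes. This is a sketch with made-up names (`Meta`, `Base`, `Child`, `order`): `__init_subclass__` runs inside `type.__new__`, so it fires before the metaclass `__new__` returns and before the metaclass `__init__` runs. That is probably why the ctypes state was not ready in `__init_subclass__` on 3.11, though that part is an assumption about ctypes internals.

```python
order = []


class Meta(type):
    def __new__(mcls, name, bases, namespace, **kwargs):
        order.append(f"Meta.__new__ enter {name}")
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        order.append(f"Meta.__new__ exit {name}")
        return cls

    def __init__(cls, name, bases, namespace, **kwargs):
        order.append(f"Meta.__init__ {name}")
        super().__init__(name, bases, namespace, **kwargs)


class Base(metaclass=Meta):
    def __init_subclass__(cls, **kwargs):
        order.append(f"__init_subclass__ {cls.__name__}")
        super().__init_subclass__(**kwargs)


class Child(Base):
    pass


child_events = [event for event in order if event.endswith("Child")]
print("\n".join(child_events))
```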

fallen slateBOT
#

Modules/_ctypes/stgdict.c line 427

```c
if (!stgdict) {```
warm breach
#

still don't understand how this is null in __init_subclass__ though

#

seems like a bug

pliant tusk
#

idk, ctypes has a lot of hackery in its use of STGdict

#

the Cmeta trick should work for einspect tho @warm breach

warm breach
pliant tusk
#

(you can technically do class PyObject(Structure, metaclass=Cmeta) instead of having the interim class)

warm breach
#

having to do the decorator plus ctypes.Structure plus mixins

@struct
class Foo(Structure, AsRef, Display)

was quite annoying

pliant tusk
#

yea I can imagine

pliant tusk
#

you should also test what happens if _fields_ is already set, cause I did not

unreal hornet
#

can i use chat gpt too help me

boreal umbra
warm breach
#

got NULL comparisons working as well

from einspect import view, NULL

n = view(int).tp_as_number.contents

print(n.nb_add == NULL)
# False
print(n.nb_matrix_multiply == NULL)
# True
warm breach
#

== NULL just checks if a pointer is null, not that the pointer address has to be the same
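
A rough sketch of how an `== NULL` comparison like this could work. `NullType` here is a hypothetical stand-in, not einspect's actual implementation. It relies on ctypes pointer instances being falsy when they hold a null address.

```python
from ctypes import POINTER, _Pointer, c_int, pointer


class NullType:
    """Hypothetical sentinel that compares equal to any null ctypes pointer."""

    def __eq__(self, other):
        if isinstance(other, _Pointer):
            # A ctypes pointer is falsy when its address is null.
            return not bool(other)
        return other is self

    def __repr__(self):
        return "<NULL>"


NULL = NullType()

null_a = POINTER(c_int)()   # null pointer
null_b = POINTER(c_int)()   # another, distinct null pointer object
live = pointer(c_int(5))    # points at a real c_int

print(null_a == NULL, live == NULL)
print(null_a is null_b)
```

Each null pointer is a separate Python object, so an `is` check needs the attribute access itself to return one shared sentinel.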

pliant tusk
#

Then `is` would work

warm breach
#

I guess I could override __getattr__ to detect returned LP_PyObject pointers and replace them

pliant tusk
#

You can modify the STGdict

#

I have example code

#

One sec

warm breach
#

but the types like ctypes Arrays of PyObject pointers I wouldn't be able to change iirc

#

like arr here comes from cast(<ob_item_0_ptr>, POINTER(PyObject) * 2)

from einspect import view

t = (1, 2)
arr = view(t).item
rose schooner
warm breach
#

since it's just a ctypes.Array I'm not sure how I'd override what it returns

pliant tusk
#

!e ```py
from ctypes import *

base_size = sizeof(c_ssize_t)

def getclsdict(cls):
    d = cls.__dict__  # hold reference due to cls.__dict__ being a getter in static classes
    if isinstance(d, dict):
        return d
    return py_object.from_address(id(d) + 2 * base_size).value

# creates modded handlenull type to shim null values

Null = type('Null',(),{'__repr__':lambda self:f'<NULL>'})()

GETFUNC = PYFUNCTYPE(py_object, c_void_p, c_ssize_t)

class StgDictObject(Structure):
    _fields_ = [
        ('-', c_ubyte*(
            dict.__sizeof__({}) +
            sizeof(c_ssize_t * 7) +
            sizeof(c_ushort * 2)
        )),
        ('getfunc', GETFUNC)
    ]

def get_stg_dict(cls):
    return StgDictObject.from_address(id(getclsdict(cls)))

p_stg = get_stg_dict(py_object)
orig_getfunc = p_stg.getfunc

@GETFUNC
def getfunc(ptr, size):
    if c_void_p.from_address(ptr).value:
        return orig_getfunc(ptr, size)
    return Null

p_stg.getfunc = getfunc

print(py_object().value is Null)```

fallen slateBOT
#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

True
warm breach
#

huh

pliant tusk
#

@warm breach thats an example of changing the py_object getfunc

#

(you can also make your own subclasses of _SimpleCData and inject get and set funcs)

warm breach
pliant tusk
#

the same thing should work with any subclass of _SimpleCData (with some edits)

warm breach
pliant tusk
#

I am checking rn

warm breach
#

!e

from ctypes import *
from einspect.structs import PyObject

x = POINTER(PyObject)
print(type(x))
print(type(x).__mro__)
fallen slateBOT
#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | <class '_ctypes.PyCPointerType'>
002 | (<class '_ctypes.PyCPointerType'>, <class 'type'>, <class 'object'>)
warm breach
#

hm

gray galleon
#

what does PUSH_NULL mean

#

it pushes None?

warm breach
#

part of the method caching in 3.11 iirc

gray galleon
#

does NULL point to a python object

warm breach
#

NULL will be interpreted as a NULL PyObject pointer if it is cast to one

#

but they probably check it for NULL and do something with it

gray galleon
#

so NULL is a python object lemon_thinking

#

can't wait to see how NULL behave

warm breach
gray galleon
#
print(NULL)
#

can i do this using ctypes or something

warm breach
fallen slateBOT
#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<NULL>, <NULL>, <NULL>)
warm breach
#

python builtin collections are able to show reprs of NULL PyObject pointers somehow

#

but if you try to access those indices it will segfault

gray galleon
#

strange that it can print NULL

#

!e

from einspect import NULL
print(NULL)
print(id(NULL))
fallen slateBOT
#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | <NULL ptr[PyObject] at 0x7fc6b2a861d0>
002 | 140491379206784
rose schooner
rose schooner
warm breach
warm breach
rose schooner
warm breach
#

!e

from ctypes import addressof
from einspect import NULL

print(NULL)
print(hex(addressof(NULL)))
fallen slateBOT
#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | <NULL ptr[PyObject] at 0x7f34fee562f0>
002 | 0x7f34fee562f0
gray galleon
#

!e

from einspect import NULL
from einspect.structs import *

t = PyTupleObject(
    ob_refcnt=1,
    ob_type=PyTypeObject(tuple).as_ref(),
    ob_size=1,
    ob_item=[NULL]
).into_object()

print(t)
fallen slateBOT
#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<NULL>,)
gray galleon
#

!e

from einspect import NULL
from einspect.structs import *

t = PyTupleObject(
    ob_refcnt=1,
    ob_type=PyTypeObject(tuple).as_ref(),
    ob_size=1,
    ob_item=[NULL]
).into_object()

print(t)
print(t[0])
fallen slateBOT
#

@gray galleon :x: Your 3.11 eval job has completed with return code 139 (SIGSEGV).

(<NULL>,)
gray galleon
#

printing null itself cause sigsegv

#

printing the tuple doesn't

warm breach
#

well before you print it, t[0] tries to convert the pointer into a python object for you

#

the repr happens in C so they can handle NULLs

gray galleon
#

!e

from einspect import NULL
from einspect.structs import *

t = PyTupleObject(
    ob_refcnt=1,
    ob_type=PyTypeObject(tuple).as_ref(),
    ob_size=1,
    ob_item=[NULL]
).into_object()

null = t[0]

print(dir(null))
fallen slateBOT
#

@gray galleon :warning: Your 3.11 eval job has completed with return code 139 (SIGSEGV).

[No output]
warm breach
#

!e python is doing this essentially (but without the safety check, hence segfault)

from einspect import NULL

print(NULL.contents.into_object())
fallen slateBOT
#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 3, in <module>
003 | ValueError: NULL pointer access
gray galleon
#

thx
now i know how to create a tuple that breaks when indexed

#

also its impressive that python can be so unsafe

raven ridge
#

at the point where you're pulling in a C FFI, you're not really writing Python anymore.

warm breach
gray galleon
#

just rewrite python in rust then

warm breach
#

"safe" calls in rust aren't free speed-wise

#

also making python in rust without unsafe calls would probably be close to impossible

pliant tusk
gray galleon
feral island
#

ctypes?

gray galleon
pliant tusk
#

!e ```py
import gc

class magic:
    def __length_hint__(self):
        return 1

    def __iter__(self):
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:0 in obj
                except SystemError:
                    yield obj
                    break

weird = tuple(magic())
print(weird[0] is weird, weird)```

fallen slateBOT
#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

True ((...),)
feral island
#

sure. I guess another option is directly creating a code object

pliant tusk
#

@gray galleon that abuses the gc to grab a tuple as it is being created. You can change length_hint to larger values and it will leak NULLs

gray galleon
warm breach
#

and now consider transient usages of those calls

pliant tusk
feral island
pliant tusk
#

Hash it

rose schooner
gray galleon
gray galleon
#

!e ```py
import gc

class magic:
    def __length_hint__(self):
        return 10

    def __iter__(self):
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:0 in obj
                except SystemError:
                    yield obj
                    break

weird = tuple(magic())
print(weird)```

fallen slateBOT
#

@gray galleon :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 15, in <module>
003 | SystemError: Objects/tupleobject.c:927: bad argument to internal function
gray galleon
#

lmao

rose schooner
fallen slateBOT
#

Objects/tupleobject.c lines 923 to 929

```c
if (v == NULL || !Py_IS_TYPE(v, &PyTuple_Type) ||
    (Py_SIZE(v) != 0 && Py_REFCNT(v) != 1)) {
    *pv = 0;
    Py_XDECREF(v);
    PyErr_BadInternalCall();
    return -1;
}```
pliant tusk
#

!e ```py
import gc

class magic:
    def __length_hint__(self):
        return 1

    def __iter__(self):
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:0 in obj
                except SystemError:
                    yield obj
                    break

weird = tuple(magic())
hash(weird)```

fallen slateBOT
#

@pliant tusk :warning: Your 3.11 eval job has completed with return code 139 (SIGSEGV).

[No output]
rose schooner
warm breach
#

regression?

pliant tusk
warm breach
#

or at least python collections don't prepare for that

warm breach
#
from ctypes import *
from einspect.structs import PyObject

PyCPointerType = type(POINTER(c_void_p))


class LP_PyObject(PyCPointerType):
    _type_ = PyObject
pliant tusk
warm breach
#

like can I customize how LP_PyObject gets converted when it's a Structure member

pliant tusk
#

I think you need to use memory patching

pliant tusk
# rose schooner wdym?

!e ```py
import gc

class magic:
    def __length_hint__(self):
        return 1

    def __iter__(self):
        global weird
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:0 in obj
                except SystemError:
                    weird = obj
                    return
                    yield

try:tuple(magic())
except:pass
print(weird)```

fallen slateBOT
#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<NULL>,)
rose schooner
#

ok

raven ridge
#

so uh - has someone reported that bug? The tuple probably shouldn't be getting tracked by the GC (and thus discoverable through gc.get_objects()) until after it's in a valid state

gray galleon
#

gc.get_objects() get all living instances in python?

#

!e ```py
import gc
print(len(gc.get_objects()))```

fallen slateBOT
#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

4026
gray galleon
#

big

pliant tusk
warm breach
gray galleon
#

!e ```py
from gc import get_objects
print(len(get_objects()))```

fallen slateBOT
#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

4027
gray galleon
#

how does the from..import version have more instances than import version

raven ridge
gray galleon
pliant tusk
warm breach
warm breach
#

you imported get_objects, that's another gc tracked reference

stone sandal
#

helo

#

Wait why am I the chair expert

gray galleon
stone sandal
#

Yeah I figured

#

Maybe I can help tho

#

What's up

gray galleon
raven ridge
#

no, it's still in sys.modules

gray galleon
#

real

#

!e ```py
import gc

class Foo: pass

print(len(gc.get_objects()))```

fallen slateBOT
#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

4035
gray galleon
#

1 class = 9 more instances

#

smh

#

i assume its from the class attributes?

dusk comet
#

There are many objects that classes are made up of

  • __name__, __qualname__, __module__ str
  • __bases__, __orig_bases__, __mro__ tuple
  • __subclasses__() list
  • __annotations__, __dict__ dict
  • __flags__, __dictoffset__, __itemsize__, ... int
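
A small check of which of these pieces actually show up in `gc.get_objects()`, assuming CPython. Only GC-tracked objects are listed there; atomic values such as `str` and `int` are not tracked, so they would not account for the extra count. `Foo` is just an example class.

```python
import gc


class Foo:
    pass


# The class and its container attributes are tracked by the collector...
class_tracked = gc.is_tracked(Foo)
mro_tracked = gc.is_tracked(Foo.__mro__)

# ...while atomic attributes such as the name string and flag int are not.
name_tracked = gc.is_tracked(Foo.__name__)
flags_tracked = gc.is_tracked(Foo.__flags__)

print(class_tracked, mro_tracked, name_tracked, flags_tracked)
```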
foggy lodge
warm breach
#

trying to make it a Structure field will have an error that it's not a ctypes type

foggy lodge
# warm breach trying to make it a Structure field will have an error that it's not a ctypes ty...

Yes, you are correct. Subclassing a ctypes data type does not automatically make the subclass a recognized ctypes type. To use your subclass as a field in a ctypes Structure, you need to register the subclass as a ctypes data type using the ctypes.POINTER function.

Here's an example:

from ctypes import *

class LP_PyObject(c_void_p):
    pass

LP_PyObject_p = POINTER(LP_PyObject)

class MyStructure(Structure):
    _fields_ = [("obj", LP_PyObject_p)]

warm breach
#

I mean, the whole point was having a custom POINTER type that I can override from_param on null values

feral island
pliant tusk
warm breach
#

I think just overriding my structure __getattr__ is probably the least cursed way to do it though, then I could have this work

from einspect import view, NULL

n = view(int).tp_as_number.contents

print(n.nb_add is NULL)
# False
print(n.nb_matrix_multiply is NULL)
# True
pliant tusk
#

that would work

warm breach
#

not sure about Array types made with LP_PyObject * n though

#

is subclassing ctypes.Array a thing

#

or is it one of those dynamic type types

pliant tusk
#

Im taking a look rn to see if you can modify Array type unwrapping

pliant tusk
#

it looks like Arrays use their proto get/set funcs @warm breach

#

Pointers are a bit weirder tho

warm breach
#

seems length has to be known at define time though pithink

pliant tusk
#

you can just make a class factory

warm breach
#

I suppose it's not too different from what I do now with dynamically ptr[PyObject] * 3

pliant tusk
#

yea

warm breach
#

I have no idea how to even make a class of a pointer type though

pliant tusk
#

use _ctypes._Pointer and set the _type_

warm breach
#

wait no

#

that's python to ctypes

quiet crane
#

noooo I deleted my nicely crafted message 😦

warm breach
#

!e

from ctypes import *
from ctypes import _Pointer
from einspect.structs import PyObject


class LP_PyObject(_Pointer):
    _type_ = PyObject

    @classmethod
    def from_buffer(cls, buffer):
        print("in from_buffer")
        return super().from_buffer(buffer)
    
    @classmethod
    def from_param(cls, param):
        print("in from_param")
        return super().from_param(param)

    @classmethod
    def from_address(cls, address):
        print("in from_address")
        return super().from_address(address)


class MyObject(Structure):
    _fields_ = [
        ("ob_refcnt", c_ssize_t),
        ("ob_type", LP_PyObject),
    ]

x = MyObject.from_address(id(5))
print(x.ob_type)
fallen slateBOT
#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

<__main__.LP_PyObject object at 0x7f62a1c124e0>
warm breach
#

seems like it doesn't call any of those

pliant tusk
#

afaik

warm breach
#

eh honestly might not do this

#

would also break usages of assignments of pointers after getting them from structs

fallen slateBOT
#

src/einspect/structs/py_object.py lines 64 to 67

if obj_ptr:
    obj_ptr.contents.DecRef()
# Set new
obj_ptr.contents = PyObject.try_from(value).with_ref()```
warm breach
#

== NULL is a thousand times easier since the class just does its own __eq__ and compares whatever it wants

warm breach
#

crazy idea, __matmul__ alias for Structure.from_address? 🥴

from einspect.structs import PyFloatObject

obj = PyFloatObject @ id(1.5)

print(obj.ob_fval)
>> 1.5
pliant tusk
fallen slateBOT
#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

py_object(<class 'int'>)
deep nova
#

Hey peeps! Quick question about match statement mechanics

#

I'm told that this (following) desugars to an if-else ladder with approximately O(n) lookup time

#
match some_character:
  
    case 'a':
        ...

    case 'b':
        ...

    case 'c':
        ...
#

But what about this...?

#
match some_character:
  
    case 'a' | 'b' | 'c':
        ...
#

I could see a smart optimization step seeing this and converting the characters into some kind of set

frigid bison
#
```
  4           0 LOAD_FAST                0 (some_character)

  5           2 DUP_TOP
              4 LOAD_CONST               1 ('a')
              6 COMPARE_OP               2 (==)
              8 POP_JUMP_IF_FALSE        8 (to 16)
             10 POP_TOP

  6          12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

  8     >>   16 DUP_TOP
             18 LOAD_CONST               2 ('b')
             20 COMPARE_OP               2 (==)
             22 POP_JUMP_IF_FALSE       15 (to 30)
             24 POP_TOP

  9          26 LOAD_CONST               0 (None)
             28 RETURN_VALUE

 11     >>   30 LOAD_CONST               3 ('c')
             32 COMPARE_OP               2 (==)
             34 POP_JUMP_IF_FALSE       20 (to 40)

 12          36 LOAD_CONST               0 (None)
             38 RETURN_VALUE

 11     >>   40 LOAD_CONST               0 (None)
             42 RETURN_VALUE```
#
```
 15           0 LOAD_FAST                0 (some_character)

 16           2 DUP_TOP
              4 LOAD_CONST               1 ('a')
              6 COMPARE_OP               2 (==)
              8 POP_JUMP_IF_FALSE        8 (to 16)
             10 POP_TOP

 17          12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

 16     >>   16 DUP_TOP
             18 LOAD_CONST               2 ('b')
             20 COMPARE_OP               2 (==)
             22 POP_JUMP_IF_FALSE       15 (to 30)
             24 POP_TOP

 17          26 LOAD_CONST               0 (None)
             28 RETURN_VALUE

 16     >>   30 DUP_TOP
             32 LOAD_CONST               3 ('c')
             34 COMPARE_OP               2 (==)
             36 POP_JUMP_IF_FALSE       22 (to 44)
             38 POP_TOP

 17          40 LOAD_CONST               0 (None)
             42 RETURN_VALUE

 16     >>   44 POP_TOP
             46 LOAD_CONST               0 (None)
             48 RETURN_VALUE```
#

this is the bytecode, respectively

#

you can see it's identical

deep nova
#

Awesome!

#

Thanks

#

How do you get this bytecode? Pass the source code as a string to dis?

pliant tusk
fallen slateBOT
#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 |   0           0 RESUME                   0
002 | 
003 |   2           2 LOAD_NAME                0 (some_character)
004 | 
005 |   3           4 COPY                     1
006 |               6 LOAD_CONST               0 ('a')
007 |               8 COMPARE_OP               2 (==)
008 |              14 POP_JUMP_FORWARD_IF_FALSE     1 (to 18)
009 |              16 JUMP_FORWARD            17 (to 52)
010 |         >>   18 COPY                     1
011 |              20 LOAD_CONST               1 ('b')
... (truncated - too many lines)

Full output: https://paste.pythondiscord.com/ohosimuyav.txt?noredirect

deep nova
#

Sick

#

Thanks

radiant garden
#

Just be careful, make sure it's faster:

#

!timeit ```py
x = 'q'
x == 'a' or x == 'b' or x == 'c' or x == 'd'```

fallen slateBOT
#

@radiant garden :white_check_mark: Your 3.11 timeit job has completed with return code 0.

2000000 loops, best of 5: 160 nsec per loop
radiant garden
#

!timeit ```py
x = 'q'
x in {'a', 'b', 'c', 'd'}```

fallen slateBOT
#

@radiant garden :white_check_mark: Your 3.11 timeit job has completed with return code 0.

5000000 loops, best of 5: 46.4 nsec per loop
radiant garden
#

Good to know it is in fact faster!

grave jolt
#

yup that's a handy optimization

#

...which breaks if you introduce a module-level constant

#

😦

pliant tusk
#

because then all usages of the macro would expand to a set literal which would then turn into a single frozenset at compile time

grave jolt
#

💀

#

eh, not sure it deserves expanding the language with such a complex feature

#

if you really want faster lookups, create the frozenset once explicitly
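
A sketch of what that looks like, assuming CPython's compiler behaviour: a set literal of constants on the right of `in` is stored as a single `frozenset` constant, while a set built from global names is rebuilt on every call. The function names and `CHARS` are made up for illustration.

```python
def literal_lookup(x):
    # All elements are constants, so CPython folds this into a frozenset.
    return x in {"a", "b", "c", "d"}


A, B, C, D = "a", "b", "c", "d"


def rebuilt_lookup(x):
    # Elements are global names, so the set is built again on each call.
    return x in {A, B, C, D}


CHARS = frozenset({A, B, C, D})  # created once, explicitly


def named_lookup(x):
    return x in CHARS


def has_frozenset_const(func):
    return any(isinstance(c, frozenset) for c in func.__code__.co_consts)


print(has_frozenset_const(literal_lookup))
print(has_frozenset_const(rebuilt_lookup))
print(has_frozenset_const(named_lookup))
```

All three return the same answers. Only the first gets the compile-time constant, and the third pays a global lookup instead of rebuilding the set.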

warm breach
#

!e

def char_match(some_char):
    match some_char:
        case "a" | "b" | "c":
            return 1
        case "d" | "e" | "f":
            return 2
        case _:
            return None
        
print(char_match("a"))
print(char_match([1, 2]))
fallen slateBOT
#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | 1
002 | None
warm breach
#

!e

def set_match(some_char):
    if some_char in {"a", "b", "c"}:
        return 1
    elif some_char in {"d", "e", "f"}:
        return 2
    else:
        return None

print(set_match("a"))
print(set_match([1, 2]))
fallen slateBOT
#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | 1
002 | Traceback (most recent call last):
003 |   File "<string>", line 10, in <module>
004 |   File "<string>", line 2, in set_match
005 | TypeError: unhashable type: 'list'
warm breach
#

and custom types can == a str but not necessarily be hashable

pliant tusk
feral island
#

!pep 638

fallen slateBOT
#
**PEP 638 - Syntactic Macros**
Status

Draft

Created

24-Sep-2020

Type

Standards Track

grave jolt
#

yeah there was

#

eeeeh

#

why do people want more features in Python

deep nova
#

Hey internals people

grave jolt
#

that's called "surgeons"

deep nova
#

Should the literal 0b123 be lexed as an invalid-integer token, or as two tokens (0b1 and 23)

pliant tusk
#

i would assume an invalid integer token
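
A quick way to check what CPython itself does with both spellings. `syntax_error_for` is a made-up helper, and the exact messages vary by version, so none are hard-coded here.

```python
def syntax_error_for(source):
    """Return the SyntaxError message for an expression, or None if it compiles."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        return exc.msg
    return None


fused = syntax_error_for("0b123")    # one malformed literal
split = syntax_error_for("0b1 23")   # two adjacent number tokens
valid = syntax_error_for("0b1")      # well-formed binary literal

print(fused)
print(split)
print(valid)
```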

feral island
#
```
    0b123
       ^
SyntaxError: invalid digit '2' in binary literal
``` that's what Python does
#

as opposed to
```
Input In [47]
0b1 23
^
SyntaxError: invalid syntax
```

native flame
#

the mailing list has been talking about macros recently

#

i dont know how i feel about it honestly

#

for all its benefits its still a massive change

warm breach
#

when everyone is using custom parsed macros it's harder to look at python code and be able to tell what's going on

#

which I feel like is kind of where rust macros are at

#

a lot of things that really shouldn't be macros are macros in rust libraries because why not and everyone wants to do something "magical" for themselves

#

since python doesn't really have a compile time (that at least it can provide to macros) it will be limited to literals, which I kind of question how useful it would be

native flame
#

wdym by limited to literals pithink

warm breach
#

does the compiler compile the ast at bytecode time? or is it fully runtime

feral island
warm breach
feral island
#

"it" being a macro?

#

I have no idea, there are no macros right now

warm breach
#

it seems using it will be fairly complex

feral island
#

I imagine you'd implement them as code that runs at compile time and outputs something like an AST

#

but there are other options

warm breach
#

how would the macro creator resolve clashing local / global names or something

feral island
#

PEP 638 probably discusses this, I haven't read it recently

warm breach
#

sounds like it might have good implications for rewrite libraries though or FFI

#

numba.njit is pretty much only interested in AST so a macro would be perfect there

raven ridge
#

Python programmers tend to be pretty judicious about only using a more magical feature when the alternatives would provide a much worse user experience.

warm breach
warm breach
#

it's just a minor performance overhead (which considering LLVM and everything else it's almost negligible)

raven ridge
#

pytest could potentially use macros for its assert rewriting, instead of the nastiness that it does today. There was a proposal to implement match/case using syntactic macros, rather than adding it to the language - or even to trial it with macros as a library to decide on the desired syntax and semantics, and then upgrade it to a first class language feature.

warm breach
#

fair yeah

#

not sure how tooling support will go though

#

even rust IDEs have a hard time giving completions in macros

raven ridge
#

people keep asking for macros in order to build DSLs in Python - from that PoV, it makes sense that completion would be tough, since you're effectively in the domain of a new language when you're using a DSL

warm breach
#

would type checkers and linters even be able to parse python without running run-time code

raven ridge
#

That would depend on the implementation, I suppose. If the implementation of the macros is generating Python code dynamically while creating the AST, they'd need to match that

#

That is, the static analysis tools would need to run compile time code, I guess

#

But static analysis tools already can't understand all sorts of stuff you can do in Python.

warm breach
#

I think I can see a fairly comprehensive implementation of this but it sounds more like a python 4 level feature

raven ridge
warm breach
raven ridge
#

๐Ÿคทโ€โ™€๏ธ

#

I guess it would be a nice to have, but I don't think it's a requirement

warm breach
#

if we're losing IDE syntax checks and type checks for macros I'm not sure how good of a trade that is over strings / decorators and type hints (or whatever we use instead of a macro currently)

raven ridge
#

dataclasses are already not statically analyzable for type checkers, for instance - they all needed to add special support recognizing and special casing them

warm breach
#

that's just a type thing though, this changes ast

raven ridge
#

pytest changes the AST.

warm breach
#

hm, how?

raven ridge
#

It rewrites assert statements

#

In order to include information about why an assertion fails

warm breach
#

yeah but that doesn't concern what the user writes right

#

you don't need a special ast support to have IDE completions for writing pytest tests

#

(which, not saying all macros must be invalid python ast, but just that it seems they can be now)

#

intellij / pycharm currently supports injected language ast natively

#

but it seems pylance / vscode has decided not to

raven ridge
warm breach
raven ridge
#

Just because macros could modify the AST doesn't imply that all uses of macros would be totally opaque to static analysis, is all I'm saying. Pytest runs tests with a modified AST, and pytest tests are understood by static analysis tools. In the pytest case, static analysis tools work fine because it rewrites assert to do something nearly totally compatible with what it would ordinarily do

lone sun
#

My biggest concern with macros is how easy they make it to obfuscate code. I have actually seen C code where someone did #define BEGIN {, #define END }, and #define LOOP for. (Actually I think LOOP might have been a little fancier, but I don't remember how.) The result was completely illegible to anyone but the original author; several years later he admitted this had not been a good idea.

#

I don't want to tell people that all macros are evil, because they're not. But I feel like they're different from other advanced language features because they're so easy to use. There's a steep barrier before most people even know what a metaclass is, but there's very little to prevent you from littering your code with awful macros.

#

(Though I have to admit that I'm a little tempted to see if there's a sneaky way to convert ! into a factorial operator.)

raven ridge
#

It wouldn't be the first time that an advanced feature was made easy to use and got overused, I guess. That's basically the situation with namespace packages today

warm breach
#

if all you're doing is things you don't need a macro for, why use one in the first place?

lone sun
#

Python is Turing complete, so you never need a macro.

raven ridge
#

In the pytest case, it manages because it's the runner. Instead of importing your code, it compiles, rewrites, and executes the rewritten code. That technique isn't broadly applicable, it only really works for frameworks. And it's a lot of work.

warm breach
#

like np.einsum

raven ridge
#

Oh, I'm only half right there. It does import your code, but only after installing its own import hook. I'm right that that only works because it's importing you and not the other way around, though.
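
A toy sketch of the general technique, not pytest's actual implementation: parse the source, rewrite `assert left == right` into code that reports both values, then compile and run the modified tree. `EqAssertRewriter` and `TEMPLATE` are made up for illustration, and the import hook part is left out.

```python
import ast

# Replacement statements; the two None placeholders get swapped for
# the real operands of the assert.
TEMPLATE = """
_lhs = None
_rhs = None
if not (_lhs == _rhs):
    raise AssertionError(repr(_lhs) + " != " + repr(_rhs))
"""


class EqAssertRewriter(ast.NodeTransformer):
    def visit_Assert(self, node):
        test = node.test
        simple_eq = (
            isinstance(test, ast.Compare)
            and len(test.ops) == 1
            and isinstance(test.ops[0], ast.Eq)
        )
        if not simple_eq:
            return node  # leave every other assert untouched
        stmts = ast.parse(TEMPLATE).body
        stmts[0].value = test.left
        stmts[1].value = test.comparators[0]
        return stmts  # one statement replaced by three


source = "def check(x):\n    assert x + 1 == 3\n"
tree = EqAssertRewriter().visit(ast.parse(source))
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<rewritten>", "exec"), namespace)

namespace["check"](2)  # passes silently
try:
    namespace["check"](5)
    message = None
except AssertionError as exc:
    message = str(exc)

print(message)
```

The user-facing source is still ordinary Python, which is why editors and type checkers can analyse it without knowing about the rewrite.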

warm breach
#

what would using pytest with macros look like anyways?

raven ridge
raven ridge
warm breach
#

I'll agree it'll make the library simpler, but I don't see how the end product is better

warm breach
raven ridge
#

Why would it need to affect the end user experience at all? I don't think that follows

warm breach
#

and if types and attributes are still valid

lone sun
# warm breach I think I'm more referring to whether it makes a difference in the public api ex...

Lots of things make a difference in the experience. This is really a question of aesthetics, not functionality. There is nothing you can do in Python that you can't also do in C, Rust, Lisp, various assembly dialects, BASIC, etc. Part of why I like Python is because I like its aesthetics. Judicious use of macros can make certain things clearer and easier. But I expect that if they're easy to use, there will be codebases where they're used in preference to function calls (with some flimsy justification like "it avoids the overhead of setting up a stack frame"), and those will be horrible to work on.

warm breach
raven ridge
warm breach
#

it does the same thing as normal python assert

raven ridge
#

No, it doesn't

warm breach
#

I'm talking about the statement after assert

#

it's a valid normal python statement

#

where names need to exist and normal rules need to be obeyed

raven ridge
#

Sure, but it doesn't have to be

#

It is because that's what pytest defined it to be

warm breach
#

once your IDE sees the macro all bets are off about what is valid inside

raven ridge
#

that's already the case for assert in pytest

#

you're saying that the expression fed as an argument to assert is still using the names visible in the function scope, and so the IDE can blindly use its existing inference machinery without needing to know that assert has been replaced and isn't the normal assert statement anymore.

That's true, but that's only because pytest implemented it that way. There's nothing that would stop pytest from injecting a name into the scope that that expression is evaluated in, for instance.

#

and if pytest was implemented with macros, it would still make the same guarantees about what names are visible in the expression that's fed as an argument to its asserting macro. Because that's the contract that it wants to provide to its users.

warm breach
#

or they can indicate they have custom ast and the IDE can skip parsing name and other checks for that part?

raven ridge
#

perhaps, but that's not what I'm getting at. My point is that it's certainly not the case that static analysis tools would need to throw their hands up and give up whenever they encounter any macro, just as it's certainly not the case that static analysis tools need to give up whenever there's any AST rewriting happening

#

it depends entirely on what the macro/AST rewriting does

#

if it injects new variables, or changes the flow of control, or something, then sure, they might get confused. If it just expands to a bunch of valid Python statements, they probably won't.
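
A minimal sketch of the "expands to valid Python" case, using only the standard `ast` module. `AssertRewriter` and `run_rewritten` are illustrative names and this is not how pytest actually implements its rewriting; it only shows that the expression inside the rewritten `assert` stays an ordinary Python expression evaluated in the ordinary scope.

```python
import ast


class AssertRewriter(ast.NodeTransformer):
    """Attach the source text of the tested expression to bare asserts."""

    def visit_Assert(self, node):
        if node.msg is None:
            # The test expression itself is left untouched.
            node.msg = ast.Constant(
                value=f"assertion failed: {ast.unparse(node.test)}"
            )
        return node


def run_rewritten(source, namespace=None):
    """Parse, rewrite, compile and execute `source` (hypothetical helper)."""
    tree = AssertRewriter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)  # new nodes need line/column info
    namespace = {} if namespace is None else namespace
    exec(compile(tree, "<rewritten>", "exec"), namespace)
    return namespace


message = None
try:
    run_rewritten("x = 1\nassert x + 1 == 3")
except AssertionError as exc:
    message = str(exc)
```

Because the rewriter only fills in the message, a tool that analyses the original `assert` sees the same names and control flow as the code that actually runs.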

warm breach
#

we should probably aim for actually working inferencing and ast parsing that something like rust at least tries to do in macros

raven ridge
#

we can already do everything without macros - you can literally rewrite files at import time

#

if you're looking only for things that require macros to do, you won't find any.

warm breach
#

like say... np einsum

warm breach
raven ridge
#

no, you just need to call ideas.examples.fractions_ast.add_hook() before importing your code.

plain condor
#

hello people, i have just started out in python and i installed pycharm, but the text isnt colourful, why is it, and how can i fix it?

warm breach
plain condor
#

okay thank you for guiding!

plain condor
#

oh yes, it worked flawlessly, thank you @warm breach

raven ridge
warm breach
raven ridge
#

🤷‍♂️ you can do that with an import hook by installing it and then re-importing yourself, I assume

raven ridge
#

(removing your entry from sys.modules in the middle)

warm breach
#

it will fail at compile time before imports run

raven ridge
warm breach
raven ridge
#

you can do anything with it.

#

that's you defining a function that gets called with the bytes of your .py file, and that returns a str that will be compiled

warm breach
#

ah it hooks the string source?

raven ridge
#

yep. That lets you make any textual substitutions you'd like on the contents of the file

warm breach
#

honestly why does python still support custom encodings 🥴

raven ridge
#

I'd be willing to bet there are people using this trick for real DSLs.

#

and even without all of these tricks, it's always been possible to read a file written in some DSL language of your choice, transpile it to valid Python code, and then exec that Python code, all from a Python session.
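
A toy sketch of that transpile-then-exec approach. The "set NAME EXPR" language and the `transpile` helper are made up for illustration; a real DSL would need a proper parser.

```python
def transpile(dsl_source: str) -> str:
    """Translate a toy 'set NAME EXPR' language into Python assignments."""
    python_lines = []
    for lineno, raw in enumerate(dsl_source.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        parts = line.split(maxsplit=2)
        if len(parts) != 3 or parts[0] != "set":
            raise SyntaxError(f"line {lineno}: expected 'set NAME EXPR'")
        _, name, expr = parts
        python_lines.append(f"{name} = {expr}")
    return "\n".join(python_lines)


dsl = """
set x 2
set y x * 3
"""

namespace = {}
# The generated text is ordinary Python, so compile() and exec() handle it.
exec(compile(transpile(dsl), "<dsl>", "exec"), namespace)
```

Everything here happens inside a running Python session, with no interpreter support beyond `compile` and `exec`.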

#

macros aren't giving you any new capability in that sense - they're just making it easier to use, and making it integrate more nicely with the rest of the language

warm breach
#

eh...

#

I agree it would be nicer but

#

macros aren't giving you any new capability in that sense
not sure if this can really be said though

#

it's kind of like saying making tuples mutable doesn't give us any new capability, we could mutate them via ctypes all along

raven ridge
#

I disagree - ctypes is jumping through an FFI and breaking out of the Python language. All of those things I've been linking are things that can be done in the Python language

warm breach
#

C is also an implementation detail of CPython

raven ridge
warm breach
#

In any case just because something is possible through some hack doesn't mean it deserves a place in the formal language

#

if macros are added they should be due to their own merits

#

which it seems like it may be already

raven ridge
#

or indeed, by generating Python code and exec'ing it

warm breach
#

the pep also describes a runtime specification for the parser and a static value / type structure for static inferencing?

#

as well as restrictions on side effects a macro can have

#

it seems it will be a fairly complex reference implementation if we get one

raven ridge
warm breach
raven ridge
#

that's one possible advantage. I don't think it's the whole point - it's hard to imagine any implementation of macros that would be more arcane and difficult to set up and use than hacking in custom import hooks to rewrite ASTs

warm breach
spring musk
#

Hey

#

umm

warm breach
#

also I assume these could be offered support by IDEs in some way

#

though performance wise repeatedly rerunning macro preprocessors isn't too ideal

#

also I guess python would finally have non-syntax compile time errors? 👀

#

or, runtime errors in the preprocessor?

#

does that count as compile time or runtime pithink

#

can preprocessors use macros themselves

warm breach
#

why is PyObject_DelAttr not stable ABI?

#

though PyObject_SetAttr and PyObject_HasAttr already are

fallen slateBOT
#

Include/abstract.h line 101

```c
#define  PyObject_DelAttr(O, A) PyObject_SetAttr((O), (A), NULL)
```
feral island
#

it's not in the stable ABI because it's not in any ABI

warm breach
#

ah it's a macro

#

hm okay yeah that's simple enough

raven ridge
#

there's other macros that have been added to the stable ABI as functions, though

warm breach
#

I'm just gonna pretend it exists 🥴

@bind_api(pythonapi["PyObject_SetAttr"])
def SetAttr(self, name: str, value: object) -> int:
    """Set attribute `name` of the PyObject. Returns -1 on failure."""

def DelAttr(self, name: str) -> int:
    """Delete attribute `name` of the PyObject. Returns -1 on failure."""
    return self.SetAttr(name, ctypes.py_object())
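
A standalone sketch of the same idea without the `bind_api` wrapper, assuming CPython. It relies on an uninitialised `ctypes.py_object()` holding a NULL pointer, which is what the macro passes as the value. The names `set_attr` and `del_attr` are illustrative only.

```python
import ctypes

set_attr = ctypes.pythonapi.PyObject_SetAttr
set_attr.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.py_object]
set_attr.restype = ctypes.c_int


def del_attr(obj, name):
    """Delete `name` from `obj` by passing NULL as the new value."""
    # py_object() with no argument wraps a NULL PyObject pointer.
    return set_attr(obj, name, ctypes.py_object())


class Foo:
    pass


foo = Foo()
foo.x = 1
status = del_attr(foo, "x")  # 0 on success; failures surface as exceptions
```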
raven ridge
#

I've only needed to limit myself to the stable ABI once, and I found the experience pretty painful. There's so many convenience things that are missing from the stable ABI, forcing you to reimplement stuff yourself

warm breach
#

both PyList_GetSlice and PyList_SetSlice only work on start:end without steps

#

and start end need to be computed as real indices (not negative)
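
On the Python side, `slice.indices()` does that normalisation. A small sketch; `simple_slice_bounds` is a hypothetical helper that rejects stepped slices, mirroring the start:end-only limitation mentioned above.

```python
def simple_slice_bounds(key: slice, length: int) -> tuple[int, int]:
    """Return non-negative (start, stop) for a contiguous slice."""
    start, stop, step = key.indices(length)  # clamps and resolves negatives
    if step != 1:
        raise ValueError("only contiguous slices map onto start:end")
    return start, stop


items = list(range(10))
start, stop = simple_slice_bounds(slice(-3, None), len(items))
tail = items[start:stop]
```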

raven ridge
#

you can't do steps even from Python, right?

#

!e
```py
x = list(range(10))
x[::2] = [0, 0, 0, 0, 0]
print(x)
```

fallen slateBOT
#

@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.

[0, 1, 0, 3, 0, 5, 0, 7, 0, 9]
raven ridge
#

TIL! I had no idea that worked.

warm breach
#

ls[::-1] is popular for reversing

raven ridge
#

I'm guessing that that's just handled manually somewhere in the implementation of list, then

warm breach
#

!e

ls = [1, 2, 3, 4, 5, 6]
ls[0:6:2] = ["a", "b", "c"]

print(ls)
fallen slateBOT
#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

['a', 2, 'b', 4, 'c', 6]
warm breach
#

assignment steps work as well
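
A pure-Python sketch of what stepped assignment amounts to, for any mutable sequence. `assign_extended_slice` is an illustrative helper, not the CPython implementation; like the built-in list behaviour for extended slices, it insists that the number of values matches the number of selected positions.

```python
def assign_extended_slice(target, key, values):
    """Assign `values` onto the positions selected by a stepped slice."""
    positions = range(*key.indices(len(target)))
    values = list(values)
    if len(values) != len(positions):
        raise ValueError(
            f"attempt to assign sequence of size {len(values)} "
            f"to extended slice of size {len(positions)}"
        )
    for index, value in zip(positions, values):
        target[index] = value


ls = [1, 2, 3, 4, 5, 6]
assign_extended_slice(ls, slice(0, 6, 2), ["a", "b", "c"])
```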

raven ridge
warm breach
#

!e had to implement that myself for tuple set slice 😔

from einspect import view

t = (1, 2, 3, 4, 5, 6)

view(t)[0:6:2] = ("a", "b", "c")

print(t)
fallen slateBOT
#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

('a', 2, 'b', 4, 'c', 6)
raven ridge
fallen slateBOT
#

Objects/object.c lines 1003 to 1012

```c
PyObject_SetAttr(PyObject *v, PyObject *name, PyObject *value)
{
    PyTypeObject *tp = Py_TYPE(v);
    int err;

    if (!PyUnicode_Check(name)) {
        PyErr_Format(PyExc_TypeError,
                     "attribute name must be string, not '%.200s'",
                     Py_TYPE(name)->tp_name);
        return -1;
```
warm breach
#

is there a way to get errors like this that return -1 instead of NULL

#

ctypes.pythonapi automatically raises NULL returns with errors

feral island
#

different C functions have different conventions for errors. NULL is the most common one but not all functions return a pointer

raven ridge
#

some don't return any error sentinel, forcing you to check yourself after every call

warm breach
#

how do I get the error there though

#

do I have to do some PyErr call

raven ridge
#

if PyObject_SetAttr returns -1, that means that PyErr_Occurred() is true, and you can fetch the exception that occurred with PyErr_Fetch

warm breach
#

oh huh

#

!e

from ctypes import *

SetAttr = pythonapi.PyObject_SetAttr
SetAttr.argtypes = [py_object, py_object, py_object]
SetAttr.restype = c_int

class Foo:
    pass

SetAttr(Foo, [], 123)
fallen slateBOT
#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 10, in <module>
003 | TypeError: attribute name must be string, not 'list'
warm breach
#

it's automatic as well? pithink

#

I guess -1 is also special cased?

#

can no pythonapi function return -1 as a real value then

raven ridge
#

I'm sure that's not the case. Without checking the implementation, I'd bet that pythonapi always calls PyErr_Occurred(), and always propagates an exception if so
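
For what it's worth, the `ctypes` documentation for `PyDLL` (the class `pythonapi` is an instance of) says the Python error flag is checked after each call and an exception is raised if it is set, so the return value itself is not special-cased. A small sketch, assuming CPython, using `PyLong_AsLong`, which returns -1 both for the value -1 and on failure:

```python
import ctypes

as_long = ctypes.pythonapi.PyLong_AsLong
as_long.argtypes = [ctypes.py_object]
as_long.restype = ctypes.c_long

# A genuine -1 result: no exception is pending, so ctypes just returns it.
plain_result = as_long(-1)

# A failing call: the C function also returns -1, but with an exception set,
# and that exception is what reaches Python code.
try:
    as_long("not an int")
except TypeError as exc:
    error = exc
else:
    error = None
```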

warm breach
#

I guess that's why PyDict_GetItem segfaults when it returns NULL with restype py_object

#

since it doesn't set an exception

raven ridge
#

I don't think ctypes.pythonapi needs to check whether an exception occurred - it just blithely ignores it, and then the eval loop that made the call into ctypes notices that the exception indicator is set and propagates the exception

warm breach
#

@pliant tusk made slot deletes work now

from einspect import view

del view(int)["__pow__"]

try:
    print(2 ** 65)
except TypeError as e:
    print(e)

view(int).restore("__pow__")

print(2 ** 65)
unsupported operand type(s) for ** or pow(): 'int' and 'int'
36893488147419103232
pliant tusk
#

nice

#

@feral island do you know if it is possible to download the exact disk image that the eval command uses? I want to run that binary in a debugger

white nexus
#

!gh pythondiscord/snekbox

feral island
#

I know nothing about the internals of the bot

white nexus
pliant tusk
#

you may know the other side of the bug tho, do you know if there are any conditions where Py_CLEAR will decref a pointer but not set it to NULL? Because that's what I think might be happening

white nexus
warm breach
#

on windows there is no segfault

#

but on windows with PYTHONDEVMODE=1 will segfault with Windows fatal exception: access violation

pliant tusk
#

that is even weirder

#

on my macbook it gives SystemError (which is what it should be doing given the C code that runs)

warm breach
#

PYTHONMALLOC=debug will make it segfault on windows

pliant tusk
#

weird

#

guess I need to boot into windows to debug then

warm breach
#

3.12.0a4 windows:

    print(corrupt.__reduce__())
          ^^^^^^^^^^^^^^^^^^^^
SystemError: NULL object passed to Py_BuildValue
pliant tusk
warm breach
#

wait no

#

3.12.0a4 ubuntu, prints with no segfault:

(<built-in function iter>, (<function  at 0x7f7cf79e9f80>, 0))
#

wtf

pliant tusk
#

yea this is a weird one

pliant tusk
raven ridge
feral island
pliant tusk
pliant tusk
feral island
pliant tusk
#

yea this shouldnt be a threading issue, I wonder if it is the compiler optimizing something out incorrectly on specific platforms

raven ridge
#

the compiler is (almost) never wrong

pliant tusk
#

or some flags that are passed cause this

raven ridge
#

if it's happening on more than 1 platform, with different compilers, the odds of it being a bug in two different compilers is basically 0

raven ridge
warm breach
# pliant tusk thats the only reason I can think for the bug only happening on specific platfor...

so summary

  • 3.11 , windows
  • 3.12.0a4, windows
(<built-in function iter>, (<function  at 0x000001B8D6A8C720>, 0))
  • 3.11, windows, PYTHONMALLOC=debug
  • 3.12.0a4, windows, PYTHONMALLOC=debug
Windows fatal exception: access violation
> exit code -1073741819 (0xC0000005)
  • 3.11, ubuntu
(<built-in function iter>, (<function  at 0x7fb772c3c4a0>, 0))
> terminated by signal SIGSEGV (Address boundary error)
  • 3.12.0a4, ubuntu
(<built-in function iter>, (<function  at 0x7f3480d71f80>, 0))
  • 3.12.0a4, ubuntu, PYTHONMALLOC=debug
Fatal Python error: Segmentation fault
pliant tusk
raven ridge
#

could be a use-after-free, if PYTHONMALLOC=debug is changing the behavior

#

that'd be my first educated guess, before looking at the code at all...

#

try it with valgrind or asan, perhaps... (and PYTHONMALLOC=malloc)

warm breach
pliant tusk
#

the resulting bug would be a use after free, since Py_CLEAR is supposed to clear iter->seq_callable

warm breach
#

otherwise I can't get the SystemError in release binaries

pliant tusk
#

and the rest of the code there sets up the Py_CLEAR to happen after iter->seq_callable is checked

#

but it should be NULLed out by Py_CLEAR and raise a SystemError Exception

pliant tusk
pliant tusk
#

macos 3.10.10 and 3.11.1

#

both give me the correct result of SystemError

warm breach
#

hm, my 3.11 on ubuntu was built with GCC via pyenv

pliant tusk
#

ill test on my windows machine tonight

#

but this is a weird bug

feral island
#

hm so the SystemError would happen if only one of it_callable and it_sentinel is NULLed out?

pliant tusk
#

Yea, the SystemError is triggered inside Py_BuildValue

raven ridge
#

what pointer is it that Py_CLEAR ought to be clearing and you think it isn't?

pliant tusk
#

it->it_callable

fallen slateBOT
#

Objects/iterobject.c line 208

```c
calliter_iternext(calliterobject *it)
```
pliant tusk
#

From callable_iterator

feral island
#

specifically I think the branch on line 226 should clear both it_callable and it_sentinel

pliant tusk
#

It should and the fact that it is segfaulting means that it is at least decreasing the refcount

raven ridge
#

well, the crash is happening in tuplerepr, because the first element of the tuple is an already freed object

pliant tusk
#

Yea that's where the use after free actually gets hit

feral island
pliant tusk
#

But I'm pretty sure the root cause is Py_CLEAR

pliant tusk
#

Because the func_name is cleared it displays like that

pliant tusk
feral island
pliant tusk
#

The system error is triggered in PyEval_GetBuiltin?

feral island
#

well at least, it checks for both fields being non-NULL, so the SystemError can't be from only one of them being NULL

warm breach
fallen slateBOT
#

Objects/iterobject.c lines 243 to 244

```c
return Py_BuildValue("N(OO)", _PyEval_GetBuiltin(&_Py_ID(iter)),
                     it->it_callable, it->it_sentinel);
```
raven ridge
#

the call to _PyEval_GetBuiltin to find the iter builtin is calling Cstr.__eq__, which exhausts the iterator, causing the Py_CLEAR in calliter_iternext to be executed, setting it->it_callable and it->it_sentinel to NULL. But the order of evaluation of arguments in a function call isn't specified, and modifying an argument by evaluating another argument is a bug

#

this is basically the same bug as printf("%d %d\n", i++, i++); just in a trickier package.

#

if it->it_callable and it->it_sentinel are evaluated before _PyEval_GetBuiltin(&_Py_ID(iter)) then it passes pointers to freed objects to Py_BuildValue, and if they're evaluated after then it passes null pointers.

#

that's unspecified behavior, not undefined, actually - but same difference.
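
The C-level evaluation-order problem has no direct Python equivalent (Python evaluates arguments left to right), but the mechanism that lets a builtins lookup run user code is easy to show. A sketch with illustrative names: a key whose hash collides with `"iter"` gets its `__eq__` called during an ordinary dict lookup.

```python
calls = []


class Sneaky:
    """Hashes like the string 'iter' and records every comparison."""

    def __hash__(self):
        return hash("iter")

    def __eq__(self, other):
        calls.append(other)  # arbitrary code could run here
        return other == "iter"


table = {Sneaky(): "value"}

# Looking up the plain string finds the Sneaky key: same hash, then __eq__.
result = table["iter"]
```

In the crashing case, the equivalent of `__eq__` exhausted the iterator while `__reduce__` was still in the middle of building its result.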

warm breach
#

can't it just do a direct call to the object

raven ridge
#

it'd do something entirely different if it didn't look up __builtins__.iter, right?

warm breach
#

hm, does __reduce__ need this behavior? (fetching iter from __builtins__)

feral island
raven ridge
#

can you take it from here? 🙂

feral island
#

sure, I'll file a bug and PR tonight

#

or if anyone else here wants to make a CPython contribution, feel free to do that and ping me for a review

raven ridge
#

the fix should just be hoisting the _PyEval_GetBuiltin(&_Py_ID(iter)) out of the if, and then adding a big comment explaining why that's necessary

feral island
#

yep and a unit test

warm breach
pliant tusk
warm breach
#

wonders of C 🥴

astral lion
#

Hello, I'm stuck on a problem and I can't seem to figure out a solution. I have a dictionary whose key item is a class object. When I change that key item class object's .name attribute, the key no longer works in the original dictionary. Is there anyway around this?

raven ridge
#

This is hitting case (2) there.

raven ridge
astral lion
raven ridge
#

you'd have to show your code for me to be any more specific than that

astral lion
#

but the memory addresses look intact

pliant tusk
raven ridge
# astral lion but the memory addresses look intact

dicts in Python are implemented using a data structure called a hash map. The idea behind a hash map is that, when you look for a key, you only need to look at other objects with the same hash code, and you can totally ignore every key in the hash table except for ones that have the same hash code. When you change the .name attribute, that changes the object's hash code: https://github.com/c4deszes/ldfparser/blob/06e9cd02f5fbf120de112c92df22a588279ffa55/ldfparser/signal.py#L44-L45
Which is why the dict stops being able to find that object.

fallen slateBOT
#

ldfparser/signal.py lines 44 to 45

```py
def __hash__(self) -> int:
    return hash((self.name))
```
astral lion
#

I know the problem now, but how do I fix it?

#

just get the hash and update the dictionary and remove the old instance?
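
One way to do that safely, sketched with a stand-in `Signal` class (not the real ldfparser one) and a hypothetical `rename_key` helper: remove the entry while the old hash is still valid, change the name, then insert it again.

```python
class Signal:
    """Stand-in for a class whose hash depends on a mutable attribute."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, Signal) and self.name == other.name


def rename_key(table, key, new_name):
    """Rename `key` without leaving a stale entry behind in `table`."""
    value = table.pop(key)  # must happen before the hash changes
    key.name = new_name
    table[key] = value      # re-inserted under the new hash


sig = Signal("speed")
table = {sig: 1}
rename_key(table, sig, "velocity")
```

The alternative is to avoid the problem entirely by keying the dict on something that never changes, such as an immutable identifier.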

warm breach
#

haven't seen that outside reduce

pliant tusk
#

Knowing this, there's actually a lot of places where you can produce undefined behavior with this pattern. I'll start sifting through the ones that I know about and see if they're actually triggerable.

pliant tusk
warm breach
pliant tusk
#

No because ob_type is not iter it's callable_iter which cannot be constructed from python code

deep nova
#

Hey internals people. You're the ones to talk to when it comes to nuanced problems

warm breach
pliant tusk
pliant tusk
#

yea. listiter_reduce_general has it in two places

#

tupleiter_reduce too

fallen slateBOT
#

Objects/tupleobject.c lines 1049 to 1056

```c
tupleiter_reduce(_PyTupleIterObject *it, PyObject *Py_UNUSED(ignored))
{
    if (it->it_seq)
        return Py_BuildValue("N(O)n", _PyEval_GetBuiltin(&_Py_ID(iter)),
                             it->it_seq, it->it_index);
    else
        return Py_BuildValue("N(())", _PyEval_GetBuiltin(&_Py_ID(iter)));
}
```
warm breach
pliant tusk
#

Makes me wonder how often similar patterns exist

warm breach
#

how come compiling with debug mode makes the systemerror work?

#

does it change the arg eval order?

raven ridge
#

Nasal demons

pliant tusk
#

because anything like function(nested_call(), object->member) would have the bug if nested_call can be manipulated into calling python code that can manipulate object->member

warm breach
#

seems like vs knows something is off as well 🥴

raven ridge
pliant tusk
warm breach
pliant tusk
#

although this didnt add any as it just ends up at PyDict_GetItemWithError

raven ridge
# warm breach seems like a lot of calls can though

It's a very persistent source of issues in the interpreter that calls into Python code can invalidate assumptions or state of C code higher up on the stack. There's a ton of ugly stuff in the interpreter just guarding against cases where this can occur

pliant tusk
#

yea most of the bugs I have reported are caused by this exact issue

warm breach
#

why is __builtins__.__dict__ mutable anyways 😔

pliant tusk
#

the repl uses that to add help and license

pliant tusk
raven ridge
#

to see if it needs to re-adjust assumptions
If it's able to re-adjust its assumptions, then it's also able to just not make those assumptions at all

warm breach
#

that might be slow for very hot paths as well

#

the overhead of the check would probably overcome the performance benefits in the branched "assumption" path

pliant tusk
#

ah yea fair enough

warm breach
#
static PyObject *
listiter_reduce_general(void *_it, int forward)
{
    PyObject *list;
+   PyObject *builtin_iter;
+   PyObject *builtin_reversed;
    
    /* the objects are not the same, index is of different types! */
    if (forward) {
        _PyListIterObject *it = (_PyListIterObject *)_it;
        if (it->it_seq) {
+           builtin_iter = _PyEval_GetBuiltin(&_Py_ID(iter));
            return Py_BuildValue("N(O)n", builtin_iter,
                                 it->it_seq, it->it_index);
        }
    } else {
        listreviterobject *it = (listreviterobject *)_it;
        if (it->it_seq) {
+           builtin_reversed = _PyEval_GetBuiltin(&_Py_ID(reversed));
            return Py_BuildValue("N(O)n", builtin_reversed,
                                 it->it_seq, it->it_index);
        }
    }
    /* empty iterator, create an empty list */
    list = PyList_New(0);
    if (list == NULL)
        return NULL;
    return Py_BuildValue("N(N)", _PyEval_GetBuiltin(&_Py_ID(iter)), list);
}
#

these look kind of weird tbh

pliant tusk
#

could probably get away with just PyObject *builtin; and using that variable in both places

warm breach
#

is there really no frozen constant pointer to builtins?

#

__builtins__.__dict__ holds the only reference to them?

pliant tusk
#

__builtins__ can change depending on what frame you are evaluating

#

so you can't use a constant frozen pointer if there is some utility lib that changes some

raven ridge
#

because (presumably) calling _PyEval_GetBuiltin can unset it_seq

pliant tusk
#

ah yea that would fix the SystemError exception from passing NULL to Py_BuildValue

warm breach
#

if called before it would go to the empty iterator

raven ridge
#

ok, but every SystemError is a programming bug

warm breach
#

I guess the empty case is more correct?

raven ridge
#

yeah.

#

every SystemError is raised because code inside the interpreter or an extension module has a bug.

#

if you're able to provoke the interpreter to set a SystemError from one of its own calls, that's proof there's a bug in the interpreter 🙂

gray galleon
pliant tusk
#

!e
```py
help
```

fallen slateBOT
#

@pliant tusk :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 1, in <module>
003 | NameError: name 'help' is not defined
gray galleon
#

!e
```py
from sitebuiltins import *

print(help)
```

fallen slateBOT
#

@gray galleon :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 1, in <module>
003 | ModuleNotFoundError: No module named 'sitebuiltins'
gray galleon
#

!e
```py
from _sitebuiltins import *

print(help)
```

fallen slateBOT
#

@gray galleon :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 3, in <module>
003 | NameError: name 'help' is not defined
warm breach
#

could I just copy this into every iter_reduce

    /* _PyEval_GetBuiltin can invoke arbitrary `__eq__` code
     * calls must be *before* access of _it pointers
     * since C/C++ parameter eval order is unspecified.
     * see issue #101765 */
#

or is that too much repetition

rose schooner
pliant tusk
rose schooner
#

!e
```py
from _sitebuiltins import _Helper
print(help := _Helper())
```

fallen slateBOT
#

@rose schooner :white_check_mark: Your 3.11 eval job has completed with return code 0.

Type help() for interactive help, or help(object) for help about object.
gray galleon
#

!e
```py
from _sitebuiltins import _Helper

help = _Helper()

print(help)
```

fallen slateBOT
#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

Type help() for interactive help, or help(object) for help about object.
gray galleon
#

help is not instantiated in _sitebuiltins smh

#

given that it is almost always a singleton

lone sun
# deep nova I've got a question happening over here, if anyone want's to weigh in https://di...

That thread is closed, so I'll weigh in here.

The structure seems very odd to me. Normally, I think of tokenizers as extracting a single token at a time. Yours seems to follow a pattern where you look for as many three-long tokens as you can, and if you can't find any, then you look for as many two-long tokens as you can, and if you still can't find any, then you look for one-long tokens. Maybe your functions return at most a single token? I can't tell. Regardless, it seems very odd: Ultimately, for each possible token, you want to know whether it appears at the present location (True) or not (False). That boolean structure does not appear very clearly in your code.

A more common pattern for a hand-crafted lexer, I think, is to attempt to match your first token possibility; if it matches, yield it; if not, attempt to match your next token possibility; and so on. This turns tokenization into a simple loop: You just iterate over regular expressions, one for each token, testing each one for a match; when you get a match, you return the matched characters and advance your input pointer by the length of the match.

For an automatically generated lexer, you use the same idea, but you combine all the regular expressions into a single DFA. This is faster (when implemented correctly). (You could replicate this effect in pure Python by taking all the regular expressions you're interested in and combining them with | branches. To find out which token you matched, you examine the match object.)
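
A sketch of the combined-regex variant, along the lines of the tokenizer recipe in the `re` documentation. The token names and patterns below are placeholders for a tiny expression language, not a Python lexer.

```python
import re

TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"[ \t]+"),
]

# One alternation with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Yield (kind, lexeme) pairs, one match per loop iteration."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        pos = match.end()  # advance by the length of the match
        if match.lastgroup != "SKIP":
            # lastgroup tells us which alternative matched.
            yield match.lastgroup, match.group()


tokens = list(tokenize("x = 40 + 2"))
```

Alternatives are tried left to right, so longer or more specific patterns should be listed before shorter ones that share a prefix.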

If your lexer is a more general grammar (like we've discussed here for handling indentation and f-strings) then you'll need a more complicated strategy. But in this case, the right strategy really depends on the complexity of the grammar you're parsing.

gray galleon
#

how would you make a DFA in python tho
given that it has no goto

lone sun
#

You rely on re to do a good job of that.

#

You could build the state machine yourself (you just assign every state a number and have an outer loop which looks at the number and determines the possible transitions), but it's going to be terribly slow compared to re.

gray galleon
#

skywalker's intention is not to use regex i think

lone sun
#

Well, it's totally possible. You just end up with a big table.
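
A small sketch of such a table-driven machine, with no `goto` and no `re`. The states, character classes and token names are invented for illustration; a real lexer would have many more of each.

```python
def char_class(ch):
    """Collapse characters into the few classes the table cares about."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "letter"
    return "other"


# (state, character class) -> next state; missing entries mean "stop".
TRANSITIONS = {
    ("start", "digit"): "int",
    ("int", "digit"): "int",
    ("start", "letter"): "name",
    ("name", "letter"): "name",
    ("name", "digit"): "name",
}
ACCEPTING = {"int": "NUMBER", "name": "NAME"}


def longest_match(text, pos=0):
    """Run the DFA from `pos`; return (kind, lexeme) or None."""
    state, index, last = "start", pos, None
    while index < len(text):
        next_state = TRANSITIONS.get((state, char_class(text[index])))
        if next_state is None:
            break
        state, index = next_state, index + 1
        if state in ACCEPTING:
            last = (ACCEPTING[state], text[pos:index])  # remember longest
    return last
```

For example, `longest_match("abc123+")` stops at the `+` and reports the identifier seen so far.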

deep nova
raven ridge
gray galleon
deep nova
#

I need to yield from each of the "sub tokenizers" because one or two of the methods for creating tokens yield multiple in a single pass (namely, newlines trigger the creation of a newline token as well as indents or dedents). Thus, I've got to either yield multiple in one pass or else cache any tokens beyond the first, and address them in the next loop

#

The thread you saw was me asking about a more graceful way of short-circuiting the outer loop upon collecting tokens from one of the inner tokenizers

deep nova
gray galleon
#

using yield from?

deep nova
#

A few weeks ago. It turned out well, and I'm going to revisit it for the lulz when this hand rolled lexer is done

#

Nooooooo, totally different thing

gray galleon
gray galleon
deep nova
#

Lemme just find the code

#

It isn't perfect, but it was at least mostly operational. What you end up with is a big directed graph of states, connected by transitions. Each tick of an outer loop you'd query the next character, and then move from the current state to the next accordingly

#

Long term, though, you'd convert the data structure to an if-else FSM in raw source code

gray galleon
#

so you just simulate a dfa bruh

deep nova
#

_>

#

<_<

#

That depends on your definition. As far as I'm concerned, a DFA is an abstract specification which might be implemented in any number of ways

#

Question

#

About python's handling of escaped newlines

#
def escaped_newline(self) -> str:

    if self.observe(0) != '\r' and self.observe(0) != '\n':
        raise SyntaxError(f"backslashes must be immediately followed by newlines")

    if self.observe(0) == '\r' and self.observe(1) == '\n':
        return self.advance(2)

    if self.observe(0) == '\r':
        return self.advance(1)
        
    if self.observe(0) == '\n':
        return self.advance(1)
#

Does that about cover it? The backslash is already consumed by the time the function is entered. Thus, check to make sure that some kind of newline occurs after it, consume that newline, and otherwise raise an error?

#

Leading whitespace on the next line will just be ignored the same as any other whitespace, and there are no indentation considerations which need to be made?
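
One way to check this against the reference behaviour is to run the standard library's `tokenize` module on a continued line and look at which token types come out. A quick sketch:

```python
import io
import tokenize

# A backslash continuation followed by a heavily indented second line.
source = "x = 1 + \\\n        2\n"

token_names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
```

If the continuation affected indentation, an `INDENT` token would show up in `token_names`; the list can be printed to compare against a hand-rolled lexer's output.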

raven ridge
#

what does observe(0) do?

pliant tusk
#

@warm breach here is all of the places I have found so far that trigger undefined behavior due to the same issue as callable_iter

```py
class A:
    def __len__(self):
        return 0
    def __getitem__(self, name):
        raise StopIteration

types = [
    ([A],),
    ([list], range(64)),
    ([bytes], 64),
    ([bytearray], 64),
    ([tuple], range(64)),
    ([lambda:(lambda:0), 0],)
]

def do(item):
    (callable, *flag), *initializer = item
    corrupt = iter(callable(*initializer), *flag)
    class Cstr:
        def __hash__(self):
            return hash('iter')
        def __eq__(self, other):
            [*corrupt]
            return other == 'iter'

    builtins = __builtins__.__dict__ if hasattr(__builtins__, '__dict__') else __builtins__
    oiter = builtins['iter']
    del builtins['iter']
    builtins[Cstr()] = oiter
    try:
        print(callable, corrupt.__reduce__())
    except Exception as e:
        print(callable, e)

for typ in types:
    do(typ)
```
deep nova
raven ridge
#

do you intend to support files with legacy Mac line endings?

deep nova
#

I honestly have no clue how line endings work

#

XD I'm just covering my bases

raven ridge
#

nothing has used \r as a line ending for over 20 years

deep nova
#

HA

#

But \r\n still exists?

vast saffron
raven ridge
#

yeah, Windows uses \r\n, and everything else that still exists uses \n

deep nova
#

Gotta love that consistency

raven ridge
#

Mac OS 9 and earlier used to use \r

deep nova
#

That aside

vast saffron
deep nova
#

With respect to handling escaped newlines

#

I just need to make sure the backslash is followed by a newline and consume it, right? And otherwise throw an error?

raven ridge
#

seems reasonable - I can't think of anything else that can come after a \ outside of a string literal

lone sun
# deep nova The thread you saw was me asking about a more graceful way of short-circuiting t...

I think this is still amenable to the kind of structure I described, as long as you make some minor adjustments. Think of each potential token as a pair. One entry of the pair is a regular expression that matches when you find the token. The other entry is the sequence that you emit. The point of this division is that it lets you separate the yes/no question of whether you have a match from the question of what to do if you did match. The structure is still a loop over regular expressions (or still a DFA). When you match, you look up the sequence to emit for that token and yield from. Something like:

for tok_re, new_toks in token_data:
    if tok_re.match(input_str):
        ...  # Update internal state
        yield from new_toks
        break
else:
    raise SyntaxError
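
#

A runnable version of that loop might look like the following. This is only a sketch: the token table, the token kinds, and the `tokenize` name are assumptions for illustration. Each entry pairs a compiled pattern with the kinds to emit, and an entry may emit nothing at all (whitespace here):

```python
import re

# Hypothetical token table: (pattern, kinds of tokens to emit on a match).
# Order matters: "=>" has to be tried before the single-character operators.
TOKEN_DATA = [
    (re.compile(r"[ \t]+"), ()),
    (re.compile(r"\d+"), ("NUMBER",)),
    (re.compile(r"[A-Za-z_]\w*"), ("NAME",)),
    (re.compile(r"=>"), ("ARROW",)),
    (re.compile(r"[-+*/=()]"), ("OP",)),
]


def tokenize(text):
    pos = 0
    while pos < len(text):
        for tok_re, new_toks in TOKEN_DATA:
            m = tok_re.match(text, pos)
            if m:
                pos = m.end()  # update internal state
                yield from ((kind, m.group()) for kind in new_toks)
                break
        else:
            # No pattern matched at this position.
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
```

Keeping the "did it match" question in the table and the "what to emit" answer next to it means new tokens are added by editing data rather than control flow.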
feral island
#

@warm breach I feel the test can go into Lib/test/test_iter.py

fallen slateBOT
#

Objects/methodobject.c lines 176 to 184

static PyObject *
meth_reduce(PyCFunctionObject *m, PyObject *Py_UNUSED(ignored))
{
    if (m->m_self == NULL || PyModule_Check(m->m_self))
        return PyUnicode_FromString(m->m_ml->ml_name);

    return Py_BuildValue("N(Os)", _PyEval_GetBuiltin(&_Py_ID(getattr)),
                         m->m_self, m->m_ml->ml_name);
}```
warm breach
#

it looks like builtin functions / PyCFunctionObjects might also be affected

#

but I couldn't reproduce with your example structure

#

oh I guess we'd have to mutate m_self or ml_name in eq

feral island
#

I wonder if you could do something even more evil where Py_BuildValue allocates a new tuple -> GC is triggered -> that mutates the object

warm breach
warm breach
#

but the call order is still UB

feral island
warm breach
#

will try to fit it into test_iter somewhere

feral island
#

but I feel like it's cleaner to just always do the PyEval_GetBuiltin call separately so it's more clear the behavior is safe

#

also I think the UB is a bit of a red herring here. It doesn't really matter what order the args are evaluated, what matters is that the PyEval_GetBuiltin call has a side effect that invalidates the earlier if statement

warm breach
#
    (A(),),
    (list(range(64)),),
    (bytes(64),),
    (bytearray(64),),
    (tuple(range(64)),),
    ((lambda: 0), 0),
#

all of these ones are for sure affected with segfaults on 3.11

feral island
#

good work

warm breach
feral island
#

right, but it's a bug either way

warm breach
#

but moving before the if, fixes the systemerror also

#

so there was kind of 2 levels of bug here I guess

raven ridge
#

Really just one bug with two possible effects depending on argument evaluation order, I'd say

#

The bug being that PyEval_GetBuiltin is modifying the object in a way that violates the invariants of the running __reduce__ call.
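
#

For anyone following along, the mechanism is that a dict lookup can run arbitrary Python code. A small sketch in pure Python, using the same trick as the `CustomStr` key in the test (`FakeIter`, `calls` and `namespace` are names made up here):

```python
calls = []


class FakeIter:
    """Hashes like the string "iter" but is not a str."""

    def __hash__(self):
        return hash("iter")

    def __eq__(self, other):
        calls.append(other)  # arbitrary code runs here, in the middle of the lookup
        return other == "iter"


namespace = {FakeIter(): "stand-in for the builtin"}

# The hashes collide and the keys are not identical, so the dict has to
# call FakeIter.__eq__ to decide whether this is the right entry.
value = namespace["iter"]
```

In the real bug the `__eq__` body exhausts the iterator whose `__reduce__` is currently running, which is how a check made before the lookup ends up stale.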

warm breach
#

is there some way to modify __builtins__.__dict__ for the test but not affect the other tests

feral island
#

you could also restore the old builtins after the test

raven ridge
#

Monkeypatch it in a context manager's __enter__, restore it in __exit__

#

Or a subprocess, but that's way slower and far more overhead...

#

Which adds up, especially if you're adding a bunch of similar tests

warm breach
#

if it fails it might segfault and crash, not sure how much __exit__ will help

feral island
#

we'll count segfaults as test failures 🙂

raven ridge
#

It won't, but that's not what you're restoring it for. You're restoring it for the case where the test succeeds, because the fixes are applied, and you need to put things back into a sane state for the next test to run
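
#

Roughly what I mean, as a sketch: `patched_builtin` is a hypothetical helper, and it only swaps a builtin by name, which is simpler than the key-replacement trick the actual test needs. I believe `test.support` already has `swap_attr`/`swap_item` helpers in the same spirit.

```python
import builtins
from contextlib import contextmanager


@contextmanager
def patched_builtin(name, replacement):
    """Temporarily replace builtins.<name>, restoring the original on exit."""
    original = getattr(builtins, name)
    setattr(builtins, name, replacement)
    try:
        yield original
    finally:
        # Runs whether the body passed or raised, so later tests see a sane state.
        setattr(builtins, name, original)
```

Usage would be `with patched_builtin("len", fake_len): ...`, and every test that needs the same patch can share the helper.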

warm breach
#
def test_reduce_mutating_builtins_iter(self):
    # Backup of original iter
    builtins = __builtins__.__dict__ if hasattr(__builtins__, "__dict__") else __builtins__
    orig_iter = builtins["iter"]

    def run(item):
        (fn, *flag), *initializer = item
        corrupt = iter(fn(*initializer), *flag)

        class CustomStr:
            def __hash__(self):
                return hash("iter")
            def __eq__(self, other):
                list(corrupt)
                return other == "iter"

        _iter = builtins["iter"]
        del builtins["iter"]
        builtins[CustomStr()] = _iter

        return corrupt.__reduce__()

    types = [
        ([EmptyIterClass],),
        ([bytes], 8),
        ([bytearray], 8),
        ([tuple], range(8)),
        ([lambda: (lambda: 0), 0],)
    ]

    self.assertEqual(run(([str], "xyz")), (orig_iter, ("xyz",), 0))
    self.assertEqual(run(([list], range(8))), (orig_iter, ([],)))
    for case in types:
        self.assertEqual(run(case), (orig_iter, ((),)))

    # Restore original iter
    del builtins["iter"]
    builtins["iter"] = orig_iter
warm breach
#

might be simpler than having a context manager

raven ridge
#

Sure. It's effectively the same thing, every context manager can be rewritten as a try/finally. But splitting it out into a context manager might let you reduce duplication and copy/pasting between tests

warm breach
#

!e

x = list[int]

it = iter(x)
print(repr(next(it)))
print(it.__reduce__())
fallen slateBOT
#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | *list[int]
002 | Traceback (most recent call last):
003 |   File "<string>", line 5, in <module>
004 | SystemError: NULL object passed to Py_BuildValue
warm breach
#

SystemError path is pretty simple to reproduce with this example even

#

hm

#

what should be done about genericaliasobject here? Moving the _PyEval_GetBuiltin will still result in SystemError

static PyObject *
ga_iter_reduce(PyObject *self, PyObject *Py_UNUSED(ignored))
{
    gaiterobject *gi = (gaiterobject *)self;
    return Py_BuildValue("N(O)", _PyEval_GetBuiltin(&_Py_ID(iter)), gi->obj);
}
#

gi->obj is NULL when the iterator is exhausted

#

I'm guessing we need a

if (gi->obj)
    return Py_BuildValue("N(O)", iter, gi->obj);
else
    return Py_BuildValue("N(())", iter);
#

kind of surprised that wasn't there before

deep nova
#

Tomorrow is the day

#

That I wrap my mind around whatever the hell python does to handle leading tabs and spaces

deep nova
#

Okay, so, walk me through this

#

I'm not sure I'll be able to sleep until I've given this a bit of effort

#

I know Python gets grumpy about mixed tabs and spaces, but I know it can at the very least handle tabs followed by spaces

#

And I know it does some kind of normalization, converting every tab to exactly eight spaces

#

Beyond that, how does it all work?

gray galleon
deep nova
#

In the interest of simplicity, I'll just say yes

gray galleon
#

you're in for a ride

#

i think you can read the docs

deep nova
#

I've read the docs, the lexical analysis document at least. Many times. And I think I understand most of it

#

But I do my best learning by way of Q&A

#

"Tab characters count as one, then round up to the nearest multiple of eight."

#

Wut?

gray galleon
#

as an example
3 spaces + 1 tab = 4 characters
those characters are rounded to 8 (because of the tab)
so the final amount of indentation is 8 spaces

#

it will do that for every tab character it encounters

#

thats how i interpret it

deep nova
#
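def round_to(x):
    # Round *up* to the next multiple of 8; round() would round to nearest.
    return 8 * -(-x // 8)

indentation = 0

while char := self.next_char():

    if char == ' ':
        indentation += 1

    elif char == '\t':
        indentation += 1
        indentation = round_to(indentation)

    else:
        break  # first non-whitespace character ends the indentation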
#

Like this?

gray galleon
#

yeah

rose schooner
#

those have to always be consistent with the top of 2 stacks, one for the "space length" and the other for the number of spaces/tabs
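
#

A sketch of the two-stack idea, loosely modeled on my understanding of CPython's tokenizer rather than copied from it. The width is measured twice, once with tabs advancing to the next multiple of 8 and once with tabs counting as 1, and both measurements have to agree with their stack. `measure` and `indent_events` are made-up names, and blank lines, comments and brackets are ignored here:

```python
def measure(line, tabsize):
    """Width of leading whitespace; a tab advances to the next multiple of tabsize."""
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            col = (col // tabsize + 1) * tabsize
        else:
            break
    return col


def indent_events(lines):
    """Yield 'INDENT', 'DEDENT' and 'LINE' events for a list of non-blank lines."""
    stack, alt_stack = [0], [0]  # widths with tabsize 8 and with tabsize 1
    for line in lines:
        col, alt = measure(line, 8), measure(line, 1)
        if col > stack[-1]:
            if alt <= alt_stack[-1]:
                raise TabError("inconsistent use of tabs and spaces")
            stack.append(col)
            alt_stack.append(alt)
            yield "INDENT"
        else:
            while col < stack[-1]:
                stack.pop()
                alt_stack.pop()
                yield "DEDENT"
            if col != stack[-1]:
                raise IndentationError("unindent does not match any outer level")
            if alt != alt_stack[-1]:
                raise TabError("inconsistent use of tabs and spaces")
        yield "LINE"
    while len(stack) > 1:  # close any blocks still open at end of input
        stack.pop()
        alt_stack.pop()
        yield "DEDENT"
```

The second stack is what catches a tab on one line lining up with eight spaces on the next: the tabsize-8 widths match but the tabsize-1 widths do not.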

deep nova
#

Hmmmm

#

I might have to give this problem a little more thought, and employ another method. Apparently Python's handling of whitespace is one of the reasons it can't support multiline lambdas

deep nova
#

And that's an absolute must for me (though I intend to go with the much more attractive => syntax)

#

Supposedly, yeah. Something to do with switching of context between whitespace sensitivity and whitespace agnosticism

#

Which sounds like a job for the lazy lexer I've already got planned to handle fstrings, now that I think of it

flat gazelle
#

look at nim for a language that manages to have both

deep nova
rose schooner
deep nova
#

I had a feeling. I've always had the impression that the hatred for multiline lambdas has always been more about dogma than anything. Especially now in the era of async (and hence, callbacks as arguments), the extra flexibility is important

rose schooner
flat gazelle
#

a core part of python is the strict separation between statements and expressions, which multiline lambdas very fundamentally cannot be

rose schooner
flat gazelle
#

though honestly, that whole idea is pretty simply incorrect, separating the two leads to a worse language

deep nova
flat gazelle
#

ye

flat gazelle
deep nova
#

I had never considered that. That said, it seems perfectly palatable in that you've got an expression with an isolated environment in it.

#

You guys are too smart for your own good. I'll be back tomorrow to soak up some more knowledge

#

XD You know what happens when you leave smart people alone too long? Programming languages.

#

And no good can come from such things

rose schooner
radiant garden
#

other langs can do fine without that decision, and easily so if they're "expression-oriented" (i.e. everything is an expression)

deep nova
#

That's far too big of a concept for me to grapple with as of right now

#

What I'll say is that I'm glad I took my time with this lexer. Everyone has been "gently encouraging" me to just slap something together and move on to "the more important things"

#

Which isn't exactly an unwise position. But taking the pains to really consider everything has let me take a much longer, deeper view of what I want. I'd have hated to have written a poor lexer, moved on to the parser, and then halfway through realized I need to scrap it all and start over because I hadn't considered multiline lambdas early on

gray galleon
rose schooner
#

a def is the "only one obvious way to do it"

#

despite having a name which seems to be a constant problem for a lot of programmers

gray galleon
#

if “only one obvious way” is allowed, might as well remove lambda altogether

gray galleon
#

and you agree with that unironically

rose schooner
#

lambda is pretty convenient for use cases that evaluate and return only one expression
the majority of needed anonymous function uses in python satisfy that requirement

#

a multiline lambda is pretty much too niche and its costs outweigh the benefits and frequency of use

peak spoke
#

could allow def btn.on_click(): ...

gray galleon
rose schooner
rose schooner
rose schooner
gray galleon
peak spoke
#

I'd imagine frameworks would expose an attribute for the callbacks if it was a possibility, but yes it is somewhat limited as it doesn't solve the lambdas in args

gray galleon
flat gazelle
#

the convention in python for this kinda stuff is decorators

rose schooner
#

just use _ as a name

flat gazelle
#

yeah, that's easiest
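
#

Something like this, as a sketch. `Button`, `event_handler` and `fire` are invented for illustration and do not come from any particular framework:

```python
class Button:
    """Hypothetical widget, only here to illustrate the pattern."""

    def __init__(self):
        self._handlers = {}

    def event_handler(self, event):
        def register(func):
            self._handlers.setdefault(event, []).append(func)
            return func
        return register

    def fire(self, event):
        return [handler() for handler in self._handlers.get(event, [])]


btn = Button()
clicks = []


@btn.event_handler("click")
def _():
    # A multi-statement body, which a lambda could not hold.
    clicks.append("clicked")
    return len(clicks)
```

The function still has a name, but `_` signals that nobody is expected to call it directly; the decorator is what keeps a reference to it.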

peak spoke
#

not being able to do assignment in lambdas has been the most annoying thing for me with event callbacks

rose schooner
peak spoke
#

Yes, doing setattr isn't exactly nice

gray galleon
flat gazelle
#

ye

swift imp
#

So what exactly are the implications of PEP 649 being accepted?

#

I see it won't break pydantic and the like but for the future, do we think more libraries that support type hints based features are going to pop up?

gray galleon
# gray galleon like```py @btn.event_handler("click") def _(): # ... ```

all of this conversation gave me an idea
how about ruby inspired blocks```py
btn.event_handler("click") do:
    ...
```
should support all those use cases
most importantly it is cleaner-looking than the current “define a named function then use it as an arg” approach
and does not introduce named functions
gray galleon
radiant garden
# swift imp So what exactly are the implications of PEP 649 being accepted?

Less maintenance overhead and a cleaner implementation of annotations overall? Not all that much from a consumer perspective. Annotation-handling code using (good practice) typing.get_type_hints() will be unaffected, and code using eval just needs to skip that call. External tools that parse python won't be affected much, as pep 563 support already means forward references in annotations.
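
#

For reference, the `typing.get_type_hints()` route looks roughly like this. A small sketch with a made-up `Node` class, run at module level so the name can be resolved from the function's globals:

```python
import typing


class Node:
    # "Node" is not bound yet while the class body runs, so the
    # annotation is written as a string (a forward reference).
    def link(self, other: "Node") -> "Node":
        self.next = other
        return other


raw = Node.link.__annotations__              # the annotations as written: strings
resolved = typing.get_type_hints(Node.link)  # strings evaluated to real objects
```

Code that goes through `get_type_hints()` gets the resolved classes without caring how the annotations were stored.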

halcyon trail
#

I'd say lambdas being so limited is annoying on a fairly regular basis. Certainly, anytime you want to do something slightly more complicated in a list or dict comprehension, I'd much rather have nice lambdas.
I don't really get the whole python thing of "multi line lambdas are niche".
In other languages that have both good lambdas and nested functions, you still see multiline lambdas used a lot.

#

there's nothing massively different about python that makes that not the case. Just folks looking at a python downside and justifying it, rather than simply admitting that it's a downside.

warm breach
gray galleon
halcyon trail
#

a valid reason not to have multi-line lambdas

#

but not a valid reason not to admit that it's a downside

proven bramble
#

Is there any plan for an alternative forced statically typed mode for 3.12 or 3.13 (where I believe a JIT will be added)?

proven bramble
#

And will the annotations be ever used in the jit compiler ?

proven bramble
gray galleon
proven bramble
gray galleon
#

python annotation system is pretty inconvenient tbh

proven bramble
#

I wish they overhauled it a bit
So that we would be able to add type qualifiers and type modifiers
But it doesn't make sense if they never use it in jit

gray galleon
#

it is also kind of a hack

proven bramble
gray galleon
proven bramble
#

you mean the types need to be defined before we use it ? (you cant annotate a method of a class with the class itself, it needs to be a string or requires the import statement)

#

?

proven bramble
warm breach
#

quick question @feral island , are we allowed to use functools.partial in test_iter?

warm breach
#

not sure, thought it was supposed to not depend on anything or

#

hm...

NameError: name 'reversed' is not defined
Warning -- Unraisable exception
Exception ignored in: <module 'threading' from '/home/ionite/repos/C/cpython/Lib/threading.py'>
Traceback (most recent call last):
  File "/home/ionite/repos/C/cpython/Lib/threading.py", line 1571, in _shutdown
    for atexit_call in reversed(_threading_atexits):
                       ^^^^^^^^
NameError: name 'reversed' is not defined
feral island
#

did you not put it properly back in builtins?

warm breach
#

apparently format exception happens before finally and fails there due to the patches

#

wait no it doesn't, I just had a failed del in finally, will fix

warm breach
deep nova
#

Hey smart people

feral island
#

or iterator rather

deep nova
#

It makes the obvious claim that lexing and parsing as Python (and I assume many other languages) is not context free. A lexer shouldn't (puritanically speaking) store any internal state except its current position in the input stream. In practice, who cares. Keeping a stack representing indentation/parentheses is a value add, without any real drawbacks™

#

But he does raise the point that "we're, technically, using the wrong tools for the job". Alternatively, it might be that we're doing the job inside out in some way. This begs the question: what's the alternative?

warm breach
#

so essentially

>>> it = iter(reversed([]))
>>>
>>> try:
...    next(it)
... except StopIteration:
...    pass
...
>>> it.__reduce__()
(<built-in function iter>, ([],))
raven ridge
#

Hm. That would unpickle to an object of a different type. That seems not great...

#

Might be worth filing a bug report for that, too...

feral island
#

I feel like the exact iterator type is an implementation detail. You pickle an empty iterator, you get an empty iterator back

raven ridge
#

You don't think it's reasonable to ```py
assert type(it) == type(pickle.loads(pickle.dumps(it)))
```

feral island
#

That's reasonable, but until the type discrepancy causes a real-world issue I'm not sure it's worth changing. In some cases it may not be practical to create an empty iterator of the same type.
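
#

A quick way to see both sides of this, as a sketch (the variable names are arbitrary, and the type comparison is implementation dependent):

```python
import pickle

it = iter(reversed([1, 2, 3]))
list(it)  # exhaust the iterator

clone = pickle.loads(pickle.dumps(it))

same_type = type(clone) is type(it)  # reportedly False on CPython 3.11
leftover = list(clone)               # empty either way
```

The round trip preserves the behaviour (nothing left to yield) even where it does not preserve the exact type.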

raven ridge
#

True, though in this case it is

#

I'm not sure it's worth fixing, but it's probably worth reporting so at least there's a record of it in Google and some documentation about why it wasn't worth fixing

#

Assuming it hasn't been reported already 🙂

warm breach
#

just that something even more dangerous happens when builtins dict access mutates reversed

raven ridge
warm breach
#

is there something you can do with reversed list iter and not iter

warm breach
#

it now also returns a plain iter on __reduce__ of an exhausted one

raven ridge
#

Isn't that the expected behavior?

#

!e ```py
print(iter(tuple[()]).__reduce__())
```

fallen slateBOT
#

@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<built-in function iter>, (tuple[()],))
raven ridge
#

Looks like it always returns a plain iterator, exhausted or not

warm breach
#

ah right

#

I guess the only changed one is reversed then?

raven ridge
#

That doesn't seem changed, either.

#

!e ```py
it = iter(reversed(""))
list(it)
print(it.__reduce__())
```

fallen slateBOT
#

@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<class 'reversed'>, ((),))
raven ridge
#

Hm.

#

Oh, weird

#

!e ```py
it = iter(reversed([]))
list(it)
print(it.__reduce__())
```

fallen slateBOT
#

@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<built-in function iter>, ([],))
warm breach
#

yeah it's only for lists

feral island
#

reversed() on a list gives a different type than general reversed() I believe
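
#

Easy to check, at least on CPython. A small sketch; the exact type name for the list case is an implementation detail:

```python
generic = reversed("abc")      # str has no __reversed__, so the generic fallback is used
special = reversed([1, 2, 3])  # list.__reversed__ supplies its own iterator

generic_is_reversed = type(generic) is reversed
special_is_reversed = type(special) is reversed
special_name = type(special).__name__  # 'list_reverseiterator' on CPython
```

That would explain why the two cases go through different `__reduce__` implementations.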

raven ridge
#

Strange. That seems very odd

raven ridge
warm breach
#
static PyObject *
reversed_reduce(reversedobject *ro, PyObject *Py_UNUSED(ignored))
{
    if (ro->seq)
        return Py_BuildValue("O(O)n", Py_TYPE(ro), ro->seq, ro->index);
    else
        return Py_BuildValue("O(())", Py_TYPE(ro));
}
#

seems reversed reduce calls Py_TYPE instead of builtins dict access

feral island
#

we care about the reduce for iter(reversed) though, right?

raven ridge
warm breach
#

for anything besides list

grave jolt
raven ridge
#

It doesn't matter that it's already kind of weird, we can fix the bug without worrying about its other weirdness

feral island
#

incidentally why is reversed() in enumobject.c of all places

grave jolt
#

It's pretty clear from modern usage that sometimes people prefer lambdas over def'd functions.

#

You could go all the way in on the one-obvious-way philosophy and remove lambda. But that would make a lot of existing code extremely verbose, with function names that don't add any meaning

#

It's okay to offer options 🙂 and it's true that there is such a thing as too many options.

deep nova
#

Good points all around

#

At the end of the day, though

#

Multiline lambdas are nice. They're useful, they're pretty, and people want them

#

I want them, lots of other people want them

#

Purity be damned, that's what I'm going to give them

deep nova
grave jolt
#

also, %-formatting is used in logging 🙂 though the utility is questionable

feral island
#

I think #4 was meant to be string.Template?

grave jolt
#

replace my 4 with 5 then 🙂

deep nova
#

Five XD

grave jolt
#

and then add number 6 standing for all the templating engines...

grave jolt
deep nova
#

Hehe, I'm not sure I'd count straight up concatenation as a method of formatting

#

But yeah, sometimes it's the only tool for the job

feral island
#

%-formatting is useful for creating binary strings. I also used it recently for generating TypeScript code so I wouldn't have to keep writing {{
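
#

Two small illustrations of what I mean; the header and the TypeScript snippet are invented examples:

```python
# bytes has no .format() and no f-string form, but it does support %.
header = b"Content-Length: %d\r\n" % 42

# Generating brace-heavy code: % leaves { and } alone, while str.format
# and f-strings need every literal brace doubled.
name = "Point"
with_percent = "interface %s { x: number; y: number }" % name
with_fstring = f"interface {name} {{ x: number; y: number }}"
```

Both string versions produce the same text; the `%` one is just easier to read when the template is mostly braces.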

grave jolt
#

These are the alternatives, it seems ```py
def __repr__(self) -> str:
    return "({0})".format(", ".join(map(repr, self)))

def __repr__(self) -> str:
    return "(%s)" % (", ".join(map(repr, self)),)

def __repr__(self) -> str:
    return f"({', '.join(map(repr, self))})" # yuck

def __repr__(self) -> str:
    amogus = ", ".join(map(repr, self))
    return f"({amogus})"
```

grave jolt
grave jolt
grave jolt
halcyon trail
#

it's too bad that logging uses %, at least by default

#

ideally, what you really want is to pass logging functions lambdas that return a string, rather than actual strings. then the logging framework decides whether to evaluate them.

#

Then you could just use f-strings for logging and still be perfectly efficient
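
#

You can approximate that today with a tiny wrapper. A sketch only: `Lazy`, `expensive` and the logger name are made up, and it relies on the logging module formatting `%s` arguments only when a record is actually emitted:

```python
import io
import logging


class Lazy:
    """Defers building a log message until the record is formatted."""

    def __init__(self, func):
        self.func = func

    def __str__(self):
        return self.func()


calls = []


def expensive():
    calls.append(1)  # track how often the message was really built
    return "expensive value"


stream = io.StringIO()
logger = logging.getLogger("lazy_demo")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False
logger.setLevel(logging.INFO)

logger.debug("%s", Lazy(lambda: f"state: {expensive()}"))  # filtered out: lambda never runs
logger.info("%s", Lazy(lambda: f"state: {expensive()}"))   # emitted: lambda runs once
```

The f-string lives inside the lambda, so it costs nothing unless the level is enabled; the price is one small object and one closure per call.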

#

lambdas again 😉

grave jolt
#

also lambda is such an awkward keyword

feral island
#

assuming that the cost of creating a lambda doesn't exceed the cost of creating an f-string

grave jolt