pliant tusk Feb 7, 2023, 4:05 PM

#

I should probably register an atexit to revert classes before teardown

warm breach Feb 7, 2023, 4:06 PM

#

pliant tusk I should probably register an `atexit` to revert classes before teardown

how do you keep track of what classes and hooks are made?

pliant tusk Feb 7, 2023, 4:33 PM

#

warm breach how do you keep track of what classes and hooks are made?

i don't keep track as of rn

warm breach Feb 7, 2023, 4:35 PM

#

pliant tusk i don't keep track as of rn

I've been trying to weakref.finalize hooked methods but that doesn't seem to work well with non-weakrefables like property

#

maybe I'll switch to finalizing on the type instead

#

all types are weakref-able (I think?)
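
A quick way to check the weakref-ability question above is to try creating a weak reference and catch the `TypeError`. This is only a sketch; `is_weakrefable` and `Plain` are made-up names for illustration.

```python
import weakref


def is_weakrefable(obj) -> bool:
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


class Plain:
    def method(self):
        return 1


print(is_weakrefable(property()))    # property instances
print(is_weakrefable(Plain.method))  # plain Python functions
print(is_weakrefable(int))           # a static builtin type
print(is_weakrefable(Plain))         # a heap type
```

On CPython, type objects and plain functions accept weak references while `property` instances do not, which is why finalizing on the type is the more uniform option.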

pliant tusk Feb 7, 2023, 4:35 PM

#

why do you want to keep track?

warm breach Feb 7, 2023, 4:46 PM

#

pliant tusk why do you want to keep track?

restoring the original attribute on finalize I guess pithink

pliant tusk Feb 7, 2023, 4:47 PM

#

finalize isnt passed the class tho

warm breach Feb 7, 2023, 4:47 PM

#

thought about atexit but I think the type might not exist by then?

pliant tusk Feb 7, 2023, 4:47 PM

#

atexit and finalize both get called at the same place for static types it seems

#

this was hit by weakref.finalize(int, hit_bp);exit()

warm breach Feb 7, 2023, 4:48 PM

#

pliant tusk finalize isnt passed the class tho

I attach it to the method and it gets the type as *args

pliant tusk Feb 7, 2023, 4:49 PM

#

```py
>>> weakref.finalize(A(), lambda *A:print(A))
()
<finalize object at 0x105f6a8c0; dead>
>>> 
```
wdym?

warm breach Feb 7, 2023, 4:50 PM

#

pliant tusk ```py >>> weakref.finalize(A(), lambda *A:print(A)) () <finalize object at 0x105...

like here the weakref.finalize is on the __hash__ object, and it's passed a stored strong ref to int

@impl(int, detach=True)
def __hash__(self):
    print("in hash", self)
    return orig(int).__hash__(self)

pliant tusk Feb 7, 2023, 4:51 PM

#

finalize doesnt pass any arguments as far as i can tell

warm breach Feb 7, 2023, 4:51 PM

#

hm? You can "store" args for it to pass kind of like partial

#

https://github.com/ionite34/einspect/blob/main/src/einspect/views/view_type.py#LL105

pliant tusk Feb 7, 2023, 4:52 PM

#

oh i didnt know that

#

cool

warm breach Feb 7, 2023, 4:53 PM

#

you can't store the object itself though, since that will make it never be GC'd

pliant tusk Feb 7, 2023, 4:53 PM

#

seemed to work here

#

or actually weakref.finalize handlers that are not called by destructors are just called at exit

#

so cyclical ones will be called at interpreter teardown
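
The two points above (stored arguments, and what happens when the stored argument is the tracked object itself) can be sketched with plain `weakref.finalize`. `Hooked` and `restore` are hypothetical stand-ins, not einspect code.

```python
import gc
import weakref


class Hooked:
    """Stand-in for an object whose lifetime we want to observe."""


restored = []


def restore(owner_name, attr):
    # Called with the arguments stored at registration time.
    restored.append((owner_name, attr))


# Extra arguments are stored (strongly) and passed to the callback,
# much like functools.partial.
obj = Hooked()
fin = weakref.finalize(obj, restore, "int", "__hash__")
del obj
gc.collect()  # not required under CPython refcounting, but harmless
print(restored, fin.alive)

# Storing the tracked object itself keeps it alive, so the finalizer
# stays pending; with the default atexit=True it would only run at exit.
pinned = Hooked()
stuck = weakref.finalize(pinned, restore, pinned, "__hash__")
del pinned
gc.collect()
still_pending = stuck.alive
print(still_pending)
stuck.detach()  # unregister so this demo leaves nothing for exit
```

The second half is the "never GC'd" case: the finalizer registry holds the strong reference, so the callback only fires at interpreter exit unless it is detached or called by hand.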

deep nova Feb 7, 2023, 7:33 PM

#

Can anyone point me to lexers for python source code?

#

I've seen a few, but they're few and far between

grave jolt Feb 7, 2023, 7:58 PM

#

deep nova Can anyone point me to lexers for python source code?

https://github.com/gvanrossum/ctok what about this one?

deep nova Feb 7, 2023, 8:00 PM

#

Wonderful!

warm breach Feb 7, 2023, 10:59 PM

#

is it possible to subclass ctypes.Structure without defining _fields_

crisp flume Feb 7, 2023, 10:59 PM

#

Hi

warm breach Feb 7, 2023, 10:59 PM

#

I just want to provide some common mixin methods using a base Structure class

pliant tusk Feb 7, 2023, 11:38 PM

#

warm breach is it possible to subclass `ctypes.Structure` without defining `_fields_`

!e i think you can ```py
from ctypes import *

class A(Structure):
    def method(self):
        print(self)

class B(A):
    _fields_ = [('ob_refcount', c_ssize_t)]

b = B.from_address(id(1))
print(b.ob_refcount)
b.method()
```

fallen slateBOT Feb 7, 2023, 11:38 PM

#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | 1000000084
002 | <__main__.B object at 0x7f4c952f4710>

warm breach Feb 7, 2023, 11:41 PM

#

pliant tusk !e i think you can ```py from ctypes import * class A(Structure): def method(...

huh... pithink

#

I thought it said _fields_ must be defined before the class is subclassed

pliant tusk Feb 7, 2023, 11:50 PM

#

¯_(ツ)_/¯

warm breach Feb 7, 2023, 11:51 PM

#

!e

from ctypes import *

class Struct(Structure):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._fields_ = [(k, v) for k, v in cls.__annotations__.items()]

class Foo(Struct):
    ob_refcnt: c_ssize_t
    ob_type: py_object

fallen slateBOT Feb 7, 2023, 11:51 PM

#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 8, in <module>
003 |   File "<string>", line 6, in __init_subclass__
004 | TypeError: ctypes state is not initialized

warm breach Feb 7, 2023, 11:51 PM

#

__init_subclass__ can't assign to _fields_ apparently though, weird

pliant tusk Feb 7, 2023, 11:52 PM

#

weird

pliant tusk Feb 7, 2023, 11:56 PM

#

warm breach `__init_subclass__` can't assign to `_fields_` apparently though, weird

it is because the STGdict is not initialized

warm breach Feb 7, 2023, 11:58 PM

#

pliant tusk it is because the STGdict is not initialized

is it possible to delay init_subclass to happen after that

#

when does init_subclass happen anyways?

#

I thought it was after the class was defined, as if you had a decorator

pliant tusk Feb 8, 2023, 12:42 AM

#

warm breach I thought it was after the class was defined, as if you had a decorator

!e ```py
from ctypes import *

class Cmeta(type(Structure)):
    def __init__(self, name, bases, mapping, **kwargs):
        super().__init__(name, bases, mapping, **kwargs)
        self._fields_ = list(mapping.get('__annotations__', {}).items())

class Struct(Structure, metaclass=Cmeta):
    # methods here
    pass

class PyObject(Struct):
    ob_refcount: c_ssize_t
    ob_type: py_object

print(sizeof(PyObject), PyObject._fields_)
```

fallen slateBOT Feb 8, 2023, 12:42 AM

#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

16 [('ob_refcount', <class 'ctypes.c_long'>), ('ob_type', <class 'ctypes.py_object'>)]

pliant tusk Feb 8, 2023, 12:42 AM

#

I got it to work

warm breach Feb 8, 2023, 12:42 AM

#

o.O

pliant tusk Feb 8, 2023, 12:43 AM

#

Gotta love metaclasses

warm breach Feb 8, 2023, 12:43 AM

#

so when does Cmeta.__init__ get called here

pliant tusk Feb 8, 2023, 12:44 AM

#

When subclasses are initialized

warm breach Feb 8, 2023, 12:45 AM

#

huh...

#

but it's later than __init_subclass__?

pliant tusk Feb 8, 2023, 12:45 AM

#

Yeah

#

I guess

warm breach Feb 8, 2023, 12:47 AM

#

https://github.com/python/cpython/blob/main/Modules/_ctypes/stgdict.c#L427

fallen slateBOT Feb 8, 2023, 12:47 AM

#

Modules/_ctypes/stgdict.c line 427

```c
if (!stgdict) {
```

warm breach Feb 8, 2023, 12:47 AM

#

still don't understand how this is null in __init_subclass__ though

#

seems like a bug
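
For reference, in plain Python `__init_subclass__` is invoked from inside `type.__new__`, so it runs before a metaclass `__init__`. That ordering would fit the error above if ctypes only attaches its per-type state after `type.__new__` returns, though that part is an assumption about ctypes internals. `Meta`, `Base`, and `Child` below are made-up names.

```python
order = []


class Meta(type):
    def __new__(mcls, name, bases, namespace, **kwargs):
        order.append("Meta.__new__ enter")
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        order.append("Meta.__new__ exit")
        return cls

    def __init__(cls, name, bases, namespace, **kwargs):
        order.append("Meta.__init__")
        super().__init__(name, bases, namespace, **kwargs)


class Base(metaclass=Meta):
    def __init_subclass__(cls, **kwargs):
        order.append("__init_subclass__")
        super().__init_subclass__(**kwargs)


order.clear()  # only record what happens while Child is created


class Child(Base):
    pass


print(order)
```

So unlike a class decorator, `__init_subclass__` sees the class while the metaclass is still in the middle of building it.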

pliant tusk Feb 8, 2023, 12:50 AM

#

idk, ctypes has a lot of hackery in its use of STGdict

#

the Cmeta trick should work for einspect tho @warm breach

warm breach Feb 8, 2023, 12:51 AM

#

pliant tusk the Cmeta trick should work for einspect tho @warm breach

yeah thanks for that, I'll probably switch over

pliant tusk Feb 8, 2023, 12:51 AM

#

(you can technically do class PyObject(Structure, metaclass=Cmeta) instead of having the interim class)

warm breach Feb 8, 2023, 12:52 AM

#

having to do the decorator plus ctypes.Structure plus mixins

@struct
class Foo(Structure, AsRef, Display)

was quite annoying

pliant tusk Feb 8, 2023, 12:52 AM

#

yea I can imagine

pliant tusk Feb 8, 2023, 12:55 AM

#

warm breach having to do the decorator plus `ctypes.Structure` plus mixins ```py @struct cla...

you should be able to define multiple utility classes ex: class AsRef(Struct) to inherit from and they should compose correctly.

#

you should also test what happens if _fields_ is already set, cause I did not

warm breach Feb 8, 2023, 1:25 AM

#

pliant tusk you should be able to define multiple utility classes ex: `class AsRef(Struct)` ...

🎉

unreal hornet Feb 8, 2023, 1:55 AM

#

can i use chat gpt to help me

boreal umbra Feb 8, 2023, 1:56 AM

#

unreal hornet can i use chat gpt to help me

if you don't care if the answers are correct, yes.

warm breach Feb 8, 2023, 2:16 AM

#

warm breach having to do the decorator plus `ctypes.Structure` plus mixins ```py @struct cla...

so much easier to override __setattr__ for every struct now, finally have my NULL singleton working everywhere

from einspect import view, NULL

v = view(int)
v.tp_as_number[0].nb_power = NULL

print(3 ** 85)
>> TypeError: unsupported operand type(s) for ** or pow(): 'int' and 'int'

#

got NULL comparisons working as well

from einspect import view, NULL

n = view(int).tp_as_number.contents

print(n.nb_add == NULL)
# False
print(n.nb_matrix_multiply == NULL)
# True

rose schooner Feb 8, 2023, 2:32 AM

#

warm breach got NULL comparisons working as well ```py from einspect import view, NULL n = ...

does is work or no?

warm breach Feb 8, 2023, 2:33 AM

#

rose schooner does `is` work or no?

well no but you wouldn't do that in C either right

#

== NULL just checks if a pointer is null, not that the pointer address has to be the same
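
A minimal sketch of how an `== NULL` check like this could be built with plain ctypes: a sentinel whose `__eq__` treats any null pointer as equal. `NullType` here is hypothetical and much simpler than einspect's actual `NULL`.

```python
import ctypes


class NullType:
    """Hypothetical sentinel that compares equal to any null ctypes pointer."""

    def __eq__(self, other):
        if other is self:
            return True
        if isinstance(other, ctypes._Pointer):
            return not other  # a null pointer instance is falsy
        if isinstance(other, ctypes.c_void_p):
            return other.value is None
        return NotImplemented

    __hash__ = object.__hash__

    def __repr__(self):
        return "<NULL>"


NULL = NullType()

null_ptr = ctypes.POINTER(ctypes.c_int)()
live_ptr = ctypes.pointer(ctypes.c_int(5))

print(null_ptr == NULL)  # falls back to NullType.__eq__ via reflection
print(live_ptr == NULL)
print(null_ptr is NULL)  # identity is a different question entirely
```

Because ctypes pointers do not define their own comparison, `ptr == NULL` ends up in the sentinel's `__eq__`, so the sentinel decides what "null" means without any pointer having to be the same object.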

pliant tusk Feb 8, 2023, 2:34 AM

#

warm breach well no but you wouldn't do that in C either right

You could just make it return your NULL singleton if the pointer is null

#

Then is would work

warm breach Feb 8, 2023, 2:36 AM

#

pliant tusk You could just make it return your NULL singleton if the pointer is null

how would I do that with Structures though

#

I guess I could override __getattribute__ to detect returned LP_PyObject pointers and replace them

pliant tusk Feb 8, 2023, 2:37 AM

#

You can modify the STGdict

#

I have example code

#

One sec

warm breach Feb 8, 2023, 2:37 AM

#

but the types like ctypes Arrays of PyObject pointers I wouldn't be able to change iirc

#

like arr here comes from cast(<ob_item_0_ptr>, POINTER(PyObject) * 2)

from einspect import view

t = (1, 2)
arr = view(t).item

rose schooner Feb 8, 2023, 2:39 AM

#

warm breach `== NULL` just checks if a pointer is null, not that the pointer address has to ...

idk it just feels like == None for me

warm breach Feb 8, 2023, 2:39 AM

#

since it's just a ctypes.Array I'm not sure how I'd override what it returns

pliant tusk Feb 8, 2023, 2:41 AM

#

!e ```py
from ctypes import *

base_size = sizeof(c_ssize_t)

def getclsdict(cls):
    d = cls.__dict__ # hold reference due to cls.__dict__ being a getter in static classes
    if isinstance(d, dict):
        return d
    return py_object.from_address(id(d) + 2 * base_size).value

# creates modded handlenull type to shim null values

Null = type('Null',(),{'__repr__':lambda self:f'<NULL>'})()

GETFUNC = PYFUNCTYPE(py_object, c_void_p, c_ssize_t)

class StgDictObject(Structure):
    _fields_ = [
        ('-', c_ubyte*(
            dict.__sizeof__({}) +
            sizeof(c_ssize_t * 7) +
            sizeof(c_ushort * 2)
        )),
        ('getfunc', GETFUNC)
    ]

def get_stg_dict(cls):
    return StgDictObject.from_address(id(getclsdict(cls)))

p_stg = get_stg_dict(py_object)
orig_getfunc = p_stg.getfunc

@GETFUNC
def getfunc(ptr, size):
    if c_void_p.from_address(ptr).value:
        return orig_getfunc(ptr, size)
    return Null

p_stg.getfunc = getfunc

print(py_object().value is Null)
```

fallen slateBOT Feb 8, 2023, 2:41 AM

#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

True

warm breach Feb 8, 2023, 2:41 AM

#

huh

pliant tusk Feb 8, 2023, 2:42 AM

#

@warm breach thats an example of changing the py_object getfunc

#

(you can also make your own subclasses of _SimpleCData and inject get and set funcs)

warm breach Feb 8, 2023, 2:42 AM

#

pliant tusk @warm breach thats an example of changing the `py_object` getfunc

but like... that wouldn't change a POINTER(PyObject) field?

pliant tusk Feb 8, 2023, 2:44 AM

#

the same thing should work with any subclass of _SimpleCData (with some edits)

warm breach Feb 8, 2023, 2:45 AM

#

pliant tusk the same thing should work with any subclass of `_SimpleCData` (with some edits)

can I override what the POINTER type does in general?

pliant tusk Feb 8, 2023, 2:46 AM

#

I am checking rn

warm breach Feb 8, 2023, 2:47 AM

#

!e

from ctypes import *
from einspect.structs import PyObject

x = POINTER(PyObject)
print(type(x))
print(type(x).__mro__)

fallen slateBOT Feb 8, 2023, 2:47 AM

#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | <class '_ctypes.PyCPointerType'>
002 | (<class '_ctypes.PyCPointerType'>, <class 'type'>, <class 'object'>)

warm breach Feb 8, 2023, 2:47 AM

#

hm

gray galleon Feb 8, 2023, 2:50 AM

#

what does PUSH_NULL mean

#

it pushes None?

warm breach Feb 8, 2023, 2:52 AM

#

gray galleon what does `PUSH_NULL` mean

https://docs.python.org/3/library/dis.html#opcode-PUSH_NULL pushes a C NULL to the stack

#

part of the method caching in 3.11 iirc

gray galleon Feb 8, 2023, 2:54 AM

#

warm breach https://docs.python.org/3/library/dis.html#opcode-PUSH_NULL pushes a C `NULL` to...

something which is not a python object but is in the stack hmm

#

does NULL point to a python object

warm breach Feb 8, 2023, 2:54 AM

#

gray galleon does `NULL` point to a python object

well they're all PyObject pointers

#

NULL will be interpreted as a NULL PyObject pointer if it is cast to one

#

but they probably check it for NULL and do something with it

gray galleon Feb 8, 2023, 2:57 AM

#

so NULL is a python object lemon_thinking

#

can't wait to see how NULL behave

warm breach Feb 8, 2023, 2:58 AM

#

gray galleon so `NULL` is a python object lemon_thinking

a *PyObject can be a null pointer, yeah

gray galleon Feb 8, 2023, 2:58 AM

#

print(NULL)

#

can i do this using ctypes or something

warm breach Feb 8, 2023, 2:59 AM

#

gray galleon can i do this using ctypes or something

!e

from einspect import NULL
from einspect.structs import *

t = PyTupleObject(
    ob_refcnt=1,
    ob_type=PyTypeObject(tuple).as_ref(),
    ob_size=3,
    ob_item=[NULL] * 3
).into_object()

print(t)

fallen slateBOT Feb 8, 2023, 3:00 AM

#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<NULL>, <NULL>, <NULL>)

warm breach Feb 8, 2023, 3:00 AM

#

python builtin collections are able to show reprs of NULL PyObject pointers somehow

#

but if you try to access those indices it will segfault

gray galleon Feb 8, 2023, 3:00 AM

#

strange that it can print NULL

#

!e

from einspect import NULL
print(NULL)
print(id(NULL))

fallen slateBOT Feb 8, 2023, 3:01 AM

#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | <NULL ptr[PyObject] at 0x7fc6b2a861d0>
002 | 140491379206784

rose schooner Feb 8, 2023, 3:01 AM

#

gray galleon strange that it can print `NULL`

it has handling for it for some reason

rose schooner Feb 8, 2023, 3:02 AM

#

fallen slate @gray galleon :white_check_mark: Your 3.11 eval job has completed with r...

why don't the ids match ```pycon
>>> 0x7fc6b2a861d0
140491377631696
```

warm breach Feb 8, 2023, 3:02 AM

#

gray galleon !e ```py from einspect import NULL print(NULL) print(id(NULL)) ```

that's just a singleton instance of a null POINTER(PyObject) I have, not really a null object

gray galleon Feb 8, 2023, 3:02 AM

#

fallen slate @gray galleon :white_check_mark: Your 3.11 eval job has completed with r...

NULL pointer is not 0 smh

warm breach Feb 8, 2023, 3:02 AM

#

rose schooner why don't the ids match ```pycon >>> 0x7fc6b2a861d0 140491377631696 ```

the repr is showing ctypes.addressof
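
The difference between the two numbers can be reproduced with any ctypes object: `id()` is the address of the Python wrapper object, while `ctypes.addressof()` is the address of the C buffer it wraps. A small sketch with a plain null pointer instead of einspect's `NULL`:

```python
import ctypes

ptr = ctypes.POINTER(ctypes.c_int)()  # a null pointer instance

print(hex(id(ptr)))                # address of the Python wrapper object
print(hex(ctypes.addressof(ptr)))  # address of the C buffer holding the pointer
print(bool(ptr))                   # falsy because the stored pointer value is null
```

Neither address is 0: the null-ness lives in the value stored in the buffer, not in where the wrapper or the buffer sits in memory.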

rose schooner Feb 8, 2023, 3:02 AM

#

warm breach the repr is showing `ctypes.addressof`

ok

warm breach Feb 8, 2023, 3:03 AM

#

!e

from ctypes import addressof
from einspect import NULL

print(NULL)
print(hex(addressof(NULL)))

fallen slateBOT Feb 8, 2023, 3:03 AM

#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | <NULL ptr[PyObject] at 0x7f34fee562f0>
002 | 0x7f34fee562f0

gray galleon Feb 8, 2023, 3:04 AM

#

!e

from einspect import NULL
from einspect.structs import *

t = PyTupleObject(
    ob_refcnt=1,
    ob_type=PyTypeObject(tuple).as_ref(),
    ob_size=1,
    ob_item=[NULL]
).into_object()

print(t)

fallen slateBOT Feb 8, 2023, 3:04 AM

#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<NULL>,)

gray galleon Feb 8, 2023, 3:04 AM

#

!e

from einspect import NULL
from einspect.structs import *

t = PyTupleObject(
    ob_refcnt=1,
    ob_type=PyTypeObject(tuple).as_ref(),
    ob_size=1,
    ob_item=[NULL]
).into_object()

print(t)
print(t[0])

fallen slateBOT Feb 8, 2023, 3:04 AM

#

@gray galleon :x: Your 3.11 eval job has completed with return code 139 (SIGSEGV).

(<NULL>,)

gray galleon Feb 8, 2023, 3:04 AM

#

printing null itself causes sigsegv

#

printing the tuple doesn't

warm breach Feb 8, 2023, 3:05 AM

#

well before you print it, t[0] tries to convert the pointer into a python object for you

#

the repr happens in C so they can handle NULLs

gray galleon Feb 8, 2023, 3:06 AM

#

!e

from einspect import NULL
from einspect.structs import *

t = PyTupleObject(
    ob_refcnt=1,
    ob_type=PyTypeObject(tuple).as_ref(),
    ob_size=1,
    ob_item=[NULL]
).into_object()

null = t[0]

print(dir(null))

fallen slateBOT Feb 8, 2023, 3:06 AM

#

@gray galleon :warning: Your 3.11 eval job has completed with return code 139 (SIGSEGV).

[No output]

warm breach Feb 8, 2023, 3:06 AM

#

!e python is doing this essentially (but without the safety check, hence segfault)

from einspect import NULL

print(NULL.contents.into_object())

fallen slateBOT Feb 8, 2023, 3:06 AM

#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 3, in <module>
003 | ValueError: NULL pointer access

gray galleon Feb 8, 2023, 3:07 AM

#

gray galleon !e ```py from einspect import NULL from einspect.structs import * t = PyTupleOb...

yeah
the t[0] is the culprit

#

thx
now i know how to create a tuple that breaks when indexed

#

also its impressive that python can be so unsafe

raven ridge Feb 8, 2023, 3:11 AM

#

at the point where you're pulling in a C FFI, you're not really writing Python anymore.

warm breach Feb 8, 2023, 3:12 AM

#

gray galleon also its impressive that python can be so unsafe

well, python you write in CPython should be safe, the C which CPython is written in is naturally not safe

gray galleon Feb 8, 2023, 3:13 AM

#

~~just rewrite python in rust then~~

warm breach Feb 8, 2023, 3:14 AM

#

gray galleon ~~just rewrite python in rust then~~

it'd be pretty much the same thing but slower

#

"safe" calls in rust aren't free speed-wise

#

also making python in rust without unsafe calls would probably be close to impossible

gray galleon Feb 8, 2023, 3:15 AM

#

warm breach also making python in rust without unsafe calls would probably be close to impos...

how

pliant tusk Feb 8, 2023, 3:15 AM

#

gray galleon thx now i know how to create a tuple that breaks when indexed

You can even make it without using the CFFI

gray galleon Feb 8, 2023, 3:16 AM

#

pliant tusk You can even make it without using the CFFI

how

feral island Feb 8, 2023, 3:16 AM

#

ctypes?

gray galleon Feb 8, 2023, 3:16 AM

#

feral island ctypes?

that is an ffi

pliant tusk Feb 8, 2023, 3:16 AM

#

!e ```py
import gc

class magic:
    def __length_hint__(self):
        return 1

    def __iter__(self):
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:0 in obj
                except SystemError:
                    yield obj
                    break

weird = tuple(magic())
print(weird[0] is weird, weird)
```

fallen slateBOT Feb 8, 2023, 3:16 AM

#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

True ((...),)

feral island Feb 8, 2023, 3:16 AM

#

sure. I guess another option is directly creating a code object

pliant tusk Feb 8, 2023, 3:17 AM

#

@gray galleon that abuses the gc to grab a tuple as it is being created. You can change `__length_hint__` to larger values and it will leak NULLs

gray galleon Feb 8, 2023, 3:17 AM

#

pliant tusk !e ```py import gc class magic: def __length_hint__(self): return 1...

does it break python?
looks impressive tho

warm breach Feb 8, 2023, 3:17 AM

#

gray galleon how

https://github.com/RustPython/RustPython

#

and now consider transient usages of those calls

pliant tusk Feb 8, 2023, 3:18 AM

#

gray galleon does it break python? looks impressive tho

It makes a tuple that contains itself

feral island Feb 8, 2023, 3:18 AM

#

pliant tusk !e ```py import gc class magic: def __length_hint__(self): return 1...

hm that seems like a fixable bug. I guess it should call __length_hint__ before creating the tuple

pliant tusk Feb 8, 2023, 3:18 AM

#

Hash it

rose schooner Feb 8, 2023, 3:18 AM

#

pliant tusk @gray galleon that abuses the gc to grab a tuple as it is being created....

```py
>>> weird = tuple(magic())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: D:\_w\1\s\Objects\tupleobject.c:927: bad argument to internal function
```
um

gray galleon Feb 8, 2023, 3:18 AM

#

pliant tusk It makes a tuple that contains itself

ik
that doesn't cause python to crash when indexed

pliant tusk Feb 8, 2023, 3:19 AM

#

gray galleon ik that doesn't cause python to crash when indexed

Hash it

gray galleon Feb 8, 2023, 3:19 AM

#

!e ```py
import gc

class magic:
    def __length_hint__(self):
        return 10

    def __iter__(self):
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:0 in obj
                except SystemError:
                    yield obj
                    break

weird = tuple(magic())
print(weird)
```

fallen slateBOT Feb 8, 2023, 3:19 AM

#

@gray galleon :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 15, in <module>
003 | SystemError: Objects/tupleobject.c:927: bad argument to internal function

gray galleon Feb 8, 2023, 3:19 AM

#

lmao

rose schooner Feb 8, 2023, 3:19 AM

#

it fails one of these checks https://github.com/python/cpython/blob/3.11/Objects/tupleobject.c#L923-L929

fallen slateBOT Feb 8, 2023, 3:19 AM

#

Objects/tupleobject.c lines 923 to 929

```c
if (v == NULL || !Py_IS_TYPE(v, &PyTuple_Type) ||
    (Py_SIZE(v) != 0 && Py_REFCNT(v) != 1)) {
    *pv = 0;
    Py_XDECREF(v);
    PyErr_BadInternalCall();
    return -1;
}
```

pliant tusk Feb 8, 2023, 3:20 AM

#

!e ```py
import gc

class magic:
    def __length_hint__(self):
        return 1

    def __iter__(self):
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:0 in obj
                except SystemError:
                    yield obj
                    break

weird = tuple(magic())
hash(weird)
```

fallen slateBOT Feb 8, 2023, 3:20 AM

#

@pliant tusk :warning: Your 3.11 eval job has completed with return code 139 (SIGSEGV).

[No output]

rose schooner Feb 8, 2023, 3:20 AM

#

fallen slate `Objects/tupleobject.c` lines 923 to 929 ```c if (v == NULL || !Py_IS_TYPE(v, &P...

or passes, actually

warm breach Feb 8, 2023, 3:20 AM

#

rose schooner it fails *one* of these checks https://github.com/python/cpython/blob/3.11/Objec...

worked on my 3.11.1 though

rose schooner Feb 8, 2023, 3:20 AM

#

warm breach worked on my 3.11.1 though

not on my 3.11.0

warm breach Feb 8, 2023, 3:20 AM

#

Thonk

#

regression?

gray galleon Feb 8, 2023, 3:20 AM

#

fallen slate @pliant tusk :warning: Your 3.11 eval job has completed with return cod...

o

#

hash bug?

pliant tusk Feb 8, 2023, 3:21 AM

#

rose schooner it fails *one* of these checks https://github.com/python/cpython/blob/3.11/Objec...

It fails those when you change length hint. Just leak the tuple by setting it as a global inside iter

warm breach Feb 8, 2023, 3:21 AM

#

gray galleon hash bug?

you can't really hash self-referencing objects

#

or at least python collections don't prepare for that

rose schooner Feb 8, 2023, 3:24 AM

#

pliant tusk It fails those when you change length hint. Just leak the tuple by setting it as...

wdym?

warm breach Feb 8, 2023, 3:25 AM

#

pliant tusk (you can also make your own subclasses of `_SimpleCData` and inject get and set ...

is there a non memory-patch way of making custom get set funcs?

#

from ctypes import *
from einspect.structs import PyObject

PyCPointerType = type(POINTER(c_void_p))


class LP_PyObject(PyCPointerType):
    _type_ = PyObject

pliant tusk Feb 8, 2023, 3:26 AM

#

warm breach is there a non memory-patch way of making custom get set funcs?

not as far as I know

warm breach Feb 8, 2023, 3:26 AM

#

like can I customize how LP_PyObject gets converted when it's a Structure member

pliant tusk Feb 8, 2023, 3:28 AM

#

I think you need to use memory patching

pliant tusk Feb 8, 2023, 3:30 AM

#

rose schooner wdym?

!e ```py
import gc

class magic:
    def __length_hint__(self):
        return 1

    def __iter__(self):
        global weird
        for obj in gc.get_objects():
            if isinstance(obj, tuple):
                try:0 in obj
                except SystemError:
                    weird = obj
                    return
                    yield

try:tuple(magic())
except:pass
print(weird)
```

fallen slateBOT Feb 8, 2023, 3:30 AM

#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<NULL>,)

rose schooner Feb 8, 2023, 3:30 AM

#

ok

warm breach Feb 8, 2023, 3:31 AM

#

pliant tusk !e ```py import gc class magic: def __length_hint__(self): return 1...

wtf 🥴

raven ridge Feb 8, 2023, 3:32 AM

#

so uh - has someone reported that bug? The tuple probably shouldn't be getting tracked by the GC (and thus discoverable through gc.get_objects()) until after it's in a valid state

gray galleon Feb 8, 2023, 3:33 AM

#

gc.get_objects() get all living instances in python?

#

!e ```py
import gc
print(len(gc.get_objects()))
```

fallen slateBOT Feb 8, 2023, 3:33 AM

#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

gray galleon Feb 8, 2023, 3:33 AM

#

big

pliant tusk Feb 8, 2023, 3:33 AM

#

raven ridge so uh - has someone reported that bug? The tuple probably shouldn't be getting t...

It's been known for a while afaik

warm breach Feb 8, 2023, 3:34 AM

#

raven ridge so uh - has someone reported that bug? The tuple probably shouldn't be getting t...

don't all tuples start GC-tracked before they're released?

gray galleon Feb 8, 2023, 3:34 AM

#

!e ```py
from gc import get_objects
print(len(get_objects()))
```

fallen slateBOT Feb 8, 2023, 3:34 AM

#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

gray galleon Feb 8, 2023, 3:35 AM

#

how does the from..import version have more instances than import version

raven ridge Feb 8, 2023, 3:35 AM

#

warm breach don't all tuples start GC-tracked before they're released?

probably - that seems like a bug, though

gray galleon Feb 8, 2023, 3:36 AM

#

gray galleon how does the `from..import` version have more instances than `import` version

isn't the gc module unreachable?

pliant tusk Feb 8, 2023, 3:37 AM

#

raven ridge probably - that seems like a bug, though

I first posted code with that bug in 2021

warm breach Feb 8, 2023, 3:37 AM

#

gray galleon isn't the `gc` module unreachable?

wdym

gray galleon Feb 8, 2023, 3:38 AM

#

gray galleon how does the `from..import` version have more instances than `import` version

this

#

@warm breach

warm breach Feb 8, 2023, 3:38 AM

#

you imported get_objects, that's another gc tracked reference

stone sandal Feb 8, 2023, 3:38 AM

#

helo

#

Wait why am I the chair expert

gray galleon Feb 8, 2023, 3:39 AM

#

stone sandal helo

sorry i pinged the wrong person

stone sandal Feb 8, 2023, 3:39 AM

#

Yeah I figured

#

Maybe I can help tho

#

What's up

gray galleon Feb 8, 2023, 3:39 AM

#

warm breach you imported `get_objects`, that's another gc tracked reference

but should gc be unreachable after get_objects was imported

raven ridge Feb 8, 2023, 3:40 AM

#

no, it's still in sys.modules
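
This is easy to confirm: `from gc import get_objects` still imports the whole module and registers it in `sys.modules`; it just doesn't bind the name `gc` locally. A short sketch:

```python
import sys
from gc import get_objects

# The module object is cached regardless of which import form was used.
print("gc" in sys.modules)

# The builtin function also keeps a reference to its module.
print(get_objects.__self__ is sys.modules["gc"])
```

So the module stays reachable through both `sys.modules` and the function itself.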

gray galleon Feb 8, 2023, 3:40 AM

#

real

#

!e ```py
import gc

class Foo: pass

print(len(gc.get_objects()))
```

fallen slateBOT Feb 8, 2023, 3:41 AM

#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

gray galleon Feb 8, 2023, 3:42 AM

#

1 class = 9 more instances

#

smh

#

i assume its from the class attributes?

dusk comet Feb 8, 2023, 5:40 AM

#

There are many objects that classes are made up of

__name__, __qualname__, __module__ str
__bases__, __orig_bases__, __mro__ tuple
__subclasses__() list
__annotations__, __dict__ dict
__flags__, __dictoffset__, __itemsize__, ... int
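
To see which GC-tracked objects a class definition actually adds, one option is to diff `gc.get_objects()` before and after. This is a rough sketch; the exact count and the kinds listed vary between Python versions, and untracked objects such as `str` and `int` never show up.

```python
import gc

gc.collect()
gc.disable()  # avoid a collection changing the picture mid-measurement
try:
    snapshot = gc.get_objects()  # kept alive so ids cannot be reused
    known = {id(obj) for obj in snapshot}
    known.update((id(snapshot), id(known)))  # ignore our own bookkeeping

    class Foo:
        pass

    new = [obj for obj in gc.get_objects() if id(obj) not in known]
finally:
    gc.enable()

kinds = sorted({type(obj).__name__ for obj in new})
print(len(new), kinds)
```

The class object itself is in the list, along with the tracked containers and descriptors created for it.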

foggy lodge Feb 8, 2023, 2:14 PM

#

warm breach ```py from ctypes import * from einspect.structs import PyObject PyCPointerType...

Yes, it is possible to make custom get and set functions without using memory patching. One way to do this is to subclass the desired data type, such as PyCPointerType in your example, and define custom get and set methods as class methods or properties. Then, you can use instances of the subclass to perform operations with custom behavior.

warm breach Feb 8, 2023, 2:16 PM

#

foggy lodge Yes, it is possible to make custom get and set functions without using memory pa...

I don't think ctypes even recognizes the subclass as a ctypes type though

#

trying to make it a Structure field will have an error that it's not a ctypes type

foggy lodge Feb 8, 2023, 2:19 PM

#

warm breach trying to make it a Structure field will have an error that it's not a ctypes ty...

Yes, you are correct. Subclassing a ctypes data type does not automatically make the subclass a recognized ctypes type. To use your subclass as a field in a ctypes Structure, you need to register the subclass as a ctypes data type using the ctypes.POINTER function.

Here's an example:

from ctypes import *

class LP_PyObject(c_void_p):
    pass

LP_PyObject_p = POINTER(LP_PyObject)

class MyStructure(Structure):
    _fields_ = [("obj", LP_PyObject_p)]

warm breach Feb 8, 2023, 2:22 PM

#

I mean, the whole point was having a custom POINTER type that I can override from_param on null values

feral island Feb 8, 2023, 2:24 PM

#

foggy lodge Yes, you are correct. Subclassing a ctypes data type does not automatically make...

you sound an awful lot like ChatGPT

pliant tusk Feb 8, 2023, 2:33 PM

#

foggy lodge Yes, it is possible to make custom get and set functions without using memory pa...

I believe @warm breach was referring to the automatic unwrapping that basic ctypes types have

warm breach Feb 8, 2023, 2:33 PM

#

I think just overriding my structure __getattribute__ is probably the least cursed way to do it though, then I could have this work

from einspect import view, NULL

n = view(int).tp_as_number.contents

print(n.nb_add is NULL)
# False
print(n.nb_matrix_multiply is NULL)
# True

pliant tusk Feb 8, 2023, 2:34 PM

#

that would work
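
A rough sketch of the attribute-override idea with plain ctypes. It uses `__getattribute__`, since `__getattr__` only runs when normal lookup fails and Structure fields are found by normal lookup. `NullType`, `NullAware`, and `Node` are hypothetical names, not einspect's.

```python
import ctypes


class NullType:
    def __repr__(self):
        return "<NULL>"


NULL = NullType()  # hypothetical singleton standing in for einspect's NULL


class NullAware(ctypes.Structure):
    """Hypothetical base class that maps null pointer fields to NULL."""

    def __getattribute__(self, name):
        value = super().__getattribute__(name)
        if isinstance(value, ctypes._Pointer) and not value:
            return NULL
        return value


class Node(NullAware):
    _fields_ = [
        ("value", ctypes.c_int),
        ("next", ctypes.POINTER(ctypes.c_int)),
    ]


node = Node(value=1)
print(node.next is NULL)

target = ctypes.c_int(7)
node.next = ctypes.pointer(target)
print(node.next is NULL, node.next.contents.value)
```

The trade-off is that code reading a null field gets the sentinel rather than a pointer object, so anything that expects to assign through the returned pointer has to handle that case.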

warm breach Feb 8, 2023, 2:36 PM

#

not sure about Array types made with LP_PyObject * n though

#

is subclassing ctypes.Array a thing

#

or is it one of those dynamic type types

pliant tusk Feb 8, 2023, 2:37 PM

#

Im taking a look rn to see if you can modify Array type unwrapping

pliant tusk Feb 8, 2023, 3:20 PM

#

it looks like Arrays use their proto get/set funcs @warm breach

#

Pointers are a bit weirder tho

warm breach Feb 8, 2023, 3:24 PM

#

pliant tusk it looks like `Array`s use their proto get/set funcs @warm breach

yeah looks like I can just

class MyArray(ctypes.Array):
    _length_ = 3
    _type_ = ptr[PyObject]

#

seems length has to be known at define time though pithink

pliant tusk Feb 8, 2023, 3:25 PM

#

you can just make a class factory
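
A class factory for this could look roughly like the following: build the `ctypes.Array` subclass on demand and cache it per (type, length). `array_of` and `ArrayMixin` are made-up names, and `c_int` stands in for `ptr[PyObject]`.

```python
import ctypes
from functools import lru_cache


class ArrayMixin:
    """Hypothetical shared helpers for every generated array class."""

    def to_list(self):
        return list(self)


@lru_cache(maxsize=None)
def array_of(item_type, length):
    """Create (and cache) a ctypes.Array subclass for the given type and length."""
    name = f"{item_type.__name__}_Array_{length}"
    array_meta = type(ctypes.Array)  # the ctypes metaclass for array types
    namespace = {"_type_": item_type, "_length_": length}
    return array_meta(name, (ctypes.Array, ArrayMixin), namespace)


IntArray3 = array_of(ctypes.c_int, 3)
arr = IntArray3(1, 2, 3)
print(arr.to_list(), len(arr))
print(array_of(ctypes.c_int, 3) is IntArray3)  # served from the cache
```

Compared with `item_type * n`, the only gain is that the generated classes share whatever methods the mixin provides.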

warm breach Feb 8, 2023, 3:25 PM

#

I suppose it's not too different from what I do now with dynamically ptr[PyObject] * 3

pliant tusk Feb 8, 2023, 3:25 PM

#

yea

warm breach Feb 8, 2023, 3:25 PM

#

I have no idea how to even make a class of a pointer type though

pliant tusk Feb 8, 2023, 3:26 PM

#

use _ctypes._Pointer and set the _type_

warm breach Feb 8, 2023, 3:29 PM

#

pliant tusk use `_ctypes._Pointer` and set the `_type_`

can I just define from_param or something for that

#

wait no

#

that's python to ctypes

quiet crane Feb 8, 2023, 3:32 PM

#

noooo I deleted my nicely crafted message 😦

warm breach Feb 8, 2023, 3:33 PM

#

!e

from ctypes import *
from ctypes import _Pointer
from einspect.structs import PyObject


class LP_PyObject(_Pointer):
    _type_ = PyObject

    @classmethod
    def from_buffer(cls, buffer):
        print("in from_buffer")
        return super().from_buffer(buffer)
    
    @classmethod
    def from_param(cls, param):
        print("in from_param")
        return super().from_param(param)

    @classmethod
    def from_address(cls, address):
        print("in from_address")
        return super().from_address(address)


class MyObject(Structure):
    _fields_ = [
        ("ob_refcnt", c_ssize_t),
        ("ob_type", LP_PyObject),
    ]

x = MyObject.from_address(id(5))
print(x.ob_type)

fallen slateBOT Feb 8, 2023, 3:33 PM

#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

<__main__.LP_PyObject object at 0x7f62a1c124e0>

warm breach Feb 8, 2023, 3:34 PM

#

seems like it doesn't call any of those

pliant tusk Feb 8, 2023, 3:34 PM

#

warm breach seems like it doesn't call any of those

you would need to wrap the internal C functions to get them called

#

afaik

warm breach Feb 8, 2023, 3:35 PM

#

eh honestly might not do this

#

would also break usages of assignments of pointers after getting them from structs

#

like https://github.com/ionite34/einspect/blob/main/src/einspect/structs/py_object.py#L64-L67

fallen slateBOT Feb 8, 2023, 3:36 PM

#

src/einspect/structs/py_object.py lines 64 to 67

if obj_ptr:
    obj_ptr.contents.DecRef()
# Set new
obj_ptr.contents = PyObject.try_from(value).with_ref()

warm breach Feb 8, 2023, 3:38 PM

#

== NULL is a thousand times easier since the class just does its own __eq__ and compares whatever it wants
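
A rough sketch of the `== NULL` approach, assuming a sentinel whose `__eq__` treats any falsy ctypes pointer as equal. `_NullType` is a hypothetical name used for illustration; einspect's actual `NULL` may be implemented differently.

```python
import ctypes


class _NullType:
    """Hypothetical sentinel that compares equal to null ctypes pointers."""

    def __eq__(self, other):
        if other is self:
            return True
        if isinstance(other, (ctypes._Pointer, ctypes.c_void_p)):
            # ctypes pointers are falsy when they hold a null address.
            return not other
        return NotImplemented

    # Defining __eq__ removes the default hash, so restore it explicitly.
    __hash__ = object.__hash__

    def __bool__(self):
        return False

    def __repr__(self):
        return "NULL"


NULL = _NullType()

null_ptr = ctypes.POINTER(ctypes.c_int)()        # null pointer
live_ptr = ctypes.pointer(ctypes.c_int(5))       # points at a real c_int

print(null_ptr == NULL, live_ptr == NULL)
```

The pointer objects stay ordinary ctypes pointers, so assigning to `.contents` after reading them from a struct keeps working.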

warm breach Feb 8, 2023, 5:18 PM

#

crazy idea, __matmul__ alias for Structure.from_address? 🥴

from einspect.structs import PyFloatObject

obj = PyFloatObject @ id(1.5)

print(obj.ob_fval)
>> 1.5

pliant tusk Feb 8, 2023, 5:25 PM

#

warm breach crazy idea, `__matmul__` alias for `Structure.from_address`? 🥴 ```py from eins...

!e

from ctypes import *
from fishhook import hook

@hook(type(Structure))
@hook(type(c_void_p))
def __matmul__(cls, addr):
    return cls.from_address(addr)

print(py_object @ (id(1) + 8))

fallen slateBOT Feb 8, 2023, 5:25 PM

#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

py_object(<class 'int'>)
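
For comparison, a sketch of the same `@` alias without hooking built-in types, assuming you control the structure base class. `AddressableMeta`, `Struct`, and `Pair` are hypothetical names for illustration.

```python
import ctypes


class AddressableMeta(type(ctypes.Structure)):
    """Metaclass adding `cls @ address` as an alias for cls.from_address."""

    def __matmul__(cls, address):
        return cls.from_address(address)


class Struct(ctypes.Structure, metaclass=AddressableMeta):
    """Base class; subclasses inherit the metaclass and the @ alias."""


class Pair(Struct):
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_int)]


# Keep `buf` alive for as long as `pair` is used: from_address does not
# own the memory it points at.
buf = (ctypes.c_int * 2)(7, 9)
pair = Pair @ ctypes.addressof(buf)
print(pair.a, pair.b)
```

This only affects classes derived from `Struct`, so plain `ctypes.Structure` subclasses elsewhere are untouched.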

deep nova Feb 8, 2023, 5:26 PM

#

Hey peeps! Quick question about match statement mechanics

#

I'm told that this (following) desugars to an if-else ladder with approximately O(n) lookup time

#

match some_character:
  
    case 'a':
        ...

    case 'b':
        ...

    case 'c':
        ...

#

But what about this...?

#

match some_character:
  
    case 'a' | 'b' | 'c':
        ...

#

I could see a smart optimization step seeing this and converting the characters into some kind of set

frigid bison Feb 8, 2023, 5:39 PM

#

  4           0 LOAD_FAST                0 (some_character)

  5           2 DUP_TOP
              4 LOAD_CONST               1 ('a')
              6 COMPARE_OP               2 (==)
              8 POP_JUMP_IF_FALSE        8 (to 16)
             10 POP_TOP

  6          12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

  8     >>   16 DUP_TOP
             18 LOAD_CONST               2 ('b')
             20 COMPARE_OP               2 (==)
             22 POP_JUMP_IF_FALSE       15 (to 30)
             24 POP_TOP

  9          26 LOAD_CONST               0 (None)
             28 RETURN_VALUE

 11     >>   30 LOAD_CONST               3 ('c')
             32 COMPARE_OP               2 (==)
             34 POP_JUMP_IF_FALSE       20 (to 40)

 12          36 LOAD_CONST               0 (None)
             38 RETURN_VALUE

 11     >>   40 LOAD_CONST               0 (None)
             42 RETURN_VALUE

#

 15           0 LOAD_FAST                0 (some_character)

 16           2 DUP_TOP
              4 LOAD_CONST               1 ('a')
              6 COMPARE_OP               2 (==)
              8 POP_JUMP_IF_FALSE        8 (to 16)
             10 POP_TOP

 17          12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

 16     >>   16 DUP_TOP
             18 LOAD_CONST               2 ('b')
             20 COMPARE_OP               2 (==)
             22 POP_JUMP_IF_FALSE       15 (to 30)
             24 POP_TOP

 17          26 LOAD_CONST               0 (None)
             28 RETURN_VALUE

 16     >>   30 DUP_TOP
             32 LOAD_CONST               3 ('c')
             34 COMPARE_OP               2 (==)
             36 POP_JUMP_IF_FALSE       22 (to 44)
             38 POP_TOP

 17          40 LOAD_CONST               0 (None)
             42 RETURN_VALUE

 16     >>   44 POP_TOP
             46 LOAD_CONST               0 (None)
             48 RETURN_VALUE

#

this is the bytecode, respectively

#

you can see it's identical

deep nova Feb 8, 2023, 6:15 PM

#

Awesome!

#

Thanks

#

How do you get this bytecode? Pass the source code as a string to dis?

pliant tusk Feb 8, 2023, 6:19 PM

#

deep nova How do you get this bytecode? Pass the source code as a string to `dis`?

!e

import dis
dis.dis('''
match some_character:
    case 'a' | 'b' | 'c':
        ...
''')

fallen slateBOT Feb 8, 2023, 6:19 PM

#

@pliant tusk :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 |   0           0 RESUME                   0
002 | 
003 |   2           2 LOAD_NAME                0 (some_character)
004 | 
005 |   3           4 COPY                     1
006 |               6 LOAD_CONST               0 ('a')
007 |               8 COMPARE_OP               2 (==)
008 |              14 POP_JUMP_FORWARD_IF_FALSE     1 (to 18)
009 |              16 JUMP_FORWARD            17 (to 52)
010 |         >>   18 COPY                     1
011 |              20 LOAD_CONST               1 ('b')
... (truncated - too many lines)

Full output: https://paste.pythondiscord.com/ohosimuyav.txt?noredirect

deep nova Feb 8, 2023, 6:19 PM

#

Sick

#

Thanks

radiant garden Feb 8, 2023, 7:30 PM

#

Just be careful, make sure it's faster:

#

!timeit

x = 'q'
x == 'a' or x == 'b' or x == 'c' or x == 'd'

fallen slateBOT Feb 8, 2023, 7:31 PM

#

@radiant garden :white_check_mark: Your 3.11 timeit job has completed with return code 0.

2000000 loops, best of 5: 160 nsec per loop

radiant garden Feb 8, 2023, 7:32 PM

#

!timeit

x = 'q'
x in {'a', 'b', 'c', 'd'}

fallen slateBOT Feb 8, 2023, 7:32 PM

#

@radiant garden :white_check_mark: Your 3.11 timeit job has completed with return code 0.

5000000 loops, best of 5: 46.4 nsec per loop

radiant garden Feb 8, 2023, 7:32 PM

#

Good to know it is in fact faster!

grave jolt Feb 8, 2023, 7:32 PM

#

yup that's a handy optimization

#

...which breaks if you introduce a module-level constant

#

😦

pliant tusk Feb 8, 2023, 7:56 PM

#

grave jolt ...which breaks if you introduce a module-level constant

this would be a good use case for macros

#

because then all usages of the macro would expand to a set literal which would then turn into a single frozenset at compile time

grave jolt Feb 8, 2023, 8:13 PM

#

💀

#

eh, not sure it deserves expanding the language with such complex feature

#

if you really want faster lookups, create the frozenset once explicitly
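
A small sketch of the difference being discussed, assuming CPython's behaviour of storing a literal-only set used with `in` as a `frozenset` constant. The names `LETTERS`, `A`, `B`, and `has_frozenset_const` are made up for illustration.

```python
A, B = "a", "b"
LETTERS = frozenset("abcd")  # built once, at import time


def in_literal(x):
    # All elements are literals, so the compiler can store one frozenset.
    return x in {"a", "b", "c", "d"}


def in_names(x):
    # Elements are names, so a new set is built on every call.
    return x in {A, B}


def in_named_frozenset(x):
    # Explicit alternative: look up the prebuilt frozenset.
    return x in LETTERS


def has_frozenset_const(func):
    """Report whether the function's constants include a frozenset."""
    return any(isinstance(c, frozenset) for c in func.__code__.co_consts)


for func in (in_literal, in_names, in_named_frozenset):
    print(func.__name__, has_frozenset_const(func))
```

The explicit `frozenset` costs a global lookup per call but avoids rebuilding the set when the members are not literals.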

warm breach Feb 8, 2023, 8:14 PM

#

radiant garden !timeit ```py x = 'q' x == 'a' or x == 'b' or x == 'c' or x == 'd' ```

these are not functionally equivalent though

x == 'a' or x == 'b' or x == 'c' or x == 'd'
x in {'a', 'b', 'c', 'd'}

#

!e

def char_match(some_char):
    match some_char:
        case "a" | "b" | "c":
            return 1
        case "d" | "e" | "f":
            return 2
        case _:
            return None
        
print(char_match("a"))
print(char_match([1, 2]))

fallen slateBOT Feb 8, 2023, 8:16 PM

#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | 1
002 | None

warm breach Feb 8, 2023, 8:17 PM

#

!e

def set_match(some_char):
    if some_char in {"a", "b", "c"}:
        return 1
    elif some_char in {"d", "e", "f"}:
        return 2
    else:
        return None

print(set_match("a"))
print(set_match([1, 2]))

fallen slateBOT Feb 8, 2023, 8:17 PM

#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | 1
002 | Traceback (most recent call last):
003 |   File "<string>", line 10, in <module>
004 |   File "<string>", line 2, in set_match
005 | TypeError: unhashable type: 'list'

warm breach Feb 8, 2023, 8:18 PM

#

and custom types can == a str but not necessarily be hashable
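
A short sketch of that last point, using a hypothetical `Label` class: defining `__eq__` without `__hash__` makes instances unhashable, so the `==` chain works while the set lookup raises.

```python
class Label:
    """Compares equal to a matching str, but defines no __hash__."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return self.text == other


def chain_match(x):
    return x == "a" or x == "b" or x == "c"


def set_match(x):
    return x in {"a", "b", "c"}


label = Label("b")
print(chain_match(label))

try:
    set_match(label)
except TypeError as exc:
    print("set lookup failed:", exc)
```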

pliant tusk Feb 8, 2023, 8:20 PM

#

grave jolt eh, not sure it deserves expanding the language with such complex feature

was there not already a PEP for macros?

feral island Feb 8, 2023, 8:30 PM

#

!pep 638

fallen slateBOT Feb 8, 2023, 8:30 PM

#

**PEP 638 - Syntactic Macros**

Link

Status

Draft

Created

24-Sep-2020

Type

Standards Track

grave jolt Feb 8, 2023, 8:40 PM

#

yeah there was

#

eeeeh

#

why do people want more features in Python

deep nova Feb 8, 2023, 8:41 PM

#

Hey internals people

grave jolt Feb 8, 2023, 8:41 PM

#

that's called "surgeons"

deep nova Feb 8, 2023, 8:42 PM

#

Should the literal 0b123 be lexed as an invalid-integer token, or as two tokens (0b1 and 23)

pliant tusk Feb 8, 2023, 8:51 PM

#

i would assume an invalid integer token

feral island Feb 8, 2023, 9:02 PM

#

    0b123
       ^
SyntaxError: invalid digit '2' in binary literal

that's what Python does

#

as opposed to

Input In [47]
0b1 23
^
SyntaxError: invalid syntax
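
The two cases can be reproduced with `compile`. This sketch only reports whichever message the running interpreter produces, since the exact wording differs between Python versions; `syntax_error_message` is a hypothetical helper.

```python
def syntax_error_message(source):
    """Return the SyntaxError message for an expression, or None if it compiles."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError as exc:
        return exc.msg
    return None


for text in ("0b1", "0b123", "0b1 23"):
    print(repr(text), "->", syntax_error_message(text))
```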

native flame Feb 9, 2023, 3:02 AM

#

the mailing list has been talking about macros recently

#

i dont know how i feel about it honestly

#

for all its benefits its still a massive change

warm breach Feb 9, 2023, 4:06 AM

#

native flame i dont know how i feel about it honestly

it gives a huge amount of possibilities but I feel the danger is it will be used in a lot of places as well

#

when everyone is using custom parsed macros it's harder to look at python code and be able to tell what's going on

#

which I feel like is kind of where rust macros are at

#

a lot of things that really shouldn't be macros are macros in rust libraries because why not and everyone wants to do something "magical" for themselves

#

since python doesn't really have a compile time (that at least it can provide to macros) it will be limited to literals, which I kind of question how useful it would be

native flame Feb 9, 2023, 4:18 AM

#

wdym by limited to literals pithink

warm breach Feb 9, 2023, 4:22 AM

#

native flame wdym by limited to literals pithink

maybe I'm misunderstanding how this works

#

does the compiler compile the ast at bytecode time? or is it fully runtime

feral island Feb 9, 2023, 4:24 AM

#

warm breach does the compiler compile the ast at bytecode time? or is it fully runtime

the source gets parsed into an AST and then compiled to bytecode by the compiler
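
Those two stages can be run by hand with the standard library, which is a handy way to see where an AST-level rewrite would slot in. A minimal sketch:

```python
import ast

source = "result = 6 * 7"

# Stage 1: source text -> AST.
tree = ast.parse(source)
print(ast.dump(tree.body[0].value))

# Stage 2: AST -> code object (bytecode).
code = compile(tree, "<demo>", "exec")

# Only now does anything run.
namespace = {}
exec(code, namespace)
print(namespace["result"])
```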

warm breach Feb 9, 2023, 4:27 AM

#

feral island the source gets parsed into an AST and then compiled to bytecode by the compiler

so it's just an AST object that is inlined in the bytecode?

feral island Feb 9, 2023, 4:28 AM

#

"it" being a macro?

#

I have no idea, there are no macros right now

warm breach Feb 9, 2023, 4:28 AM

#

it seems using it will be fairly complex

feral island Feb 9, 2023, 4:29 AM

#

I imagine you'd implement them as code that runs at compile time and outputs something like an AST

#

but there are other options

warm breach Feb 9, 2023, 4:29 AM

#

how would the macro creator resolve clashing local / global names or something

feral island Feb 9, 2023, 4:29 AM

#

PEP 638 probably discusses this, I haven't read it recently

warm breach Feb 9, 2023, 4:30 AM

#

sounds like it might have good implications for rewrite libraries though or FFI

#

numba.njit is pretty much only interested in AST so a macro would be perfect there

raven ridge Feb 9, 2023, 5:46 AM

#

warm breach when everyone is using custom parsed macros it's harder to look at python code a...

It seems unreasonable to assume that people will start using it "just because". People don't tend to use metaclasses Just Because, or import hooks, or .pth files, or __init_subclass__, or any of the other tons of customization points that the language provides for advanced use cases

#

Python programmers tend to be pretty judicious about only using a more magical feature when the alternatives would provide a much worse user experience.

warm breach Feb 9, 2023, 5:49 AM

#

raven ridge Python programmers tend to be pretty judicious about only using a more magical f...

though under that standard what would really qualify as needing a macro?

warm breach Feb 9, 2023, 5:50 AM

#

warm breach `numba.njit` is pretty much only interested in AST so a macro would be perfect t...

for example this is cool but currently numba works just fine with decorators

#

it's just a minor performance overhead (which considering LLVM and everything else it's almost negligible)

raven ridge Feb 9, 2023, 5:52 AM

#

pytest could potentially use macros for its assert rewriting, instead of the nastiness that it does today. There was a proposal to implement match/case using syntactic macros, rather than adding it to the language - or even to trial it with macros as a library to decide on the desired syntax and semantics, and then upgrade it to a first class language feature.

warm breach Feb 9, 2023, 5:53 AM

#

fair yeah

#

not sure how tooling support will go though

#

even rust IDEs have a hard time giving completions in macros

raven ridge Feb 9, 2023, 5:54 AM

#

people keep asking for macros in order to build DSLs in Python - from that PoV, it makes sense that completion would be tough, since you're effectively in the domain of a new language when you're using a DSL

warm breach Feb 9, 2023, 5:55 AM

#

would type checkers and linters even be able to parse python without running run-time code

raven ridge Feb 9, 2023, 5:59 AM

#

That would depend on the implementation, I suppose. If the implementation of the macros is generating Python code dynamically while creating the AST, they'd need to match that

#

That is, the static analysis tools would need to run compile time code, I guess

#

But static analysis tools already can't understand all sorts of stuff you can do in Python.

warm breach Feb 9, 2023, 6:10 AM

#

raven ridge But static analysis tools already can't understand all sorts of stuff you can do...

but they can check syntax without runtime code right?

#

I think I can see a fairly comprehensive implementation of this but it sounds more like a python 4 level feature

raven ridge Feb 9, 2023, 6:23 AM

#

warm breach but they can check syntax without runtime code right?

Not necessarily, no. They can't handle https://pypi.org/project/cstyle/ for instance

PyPI

cstyle

Use c-style braces instead of indentation.

warm breach Feb 9, 2023, 6:24 AM

#

raven ridge Not necessarily, no. They can't handle https://pypi.org/project/cstyle/ for inst...

but ideally we do want macros to be statically analyzable?

raven ridge Feb 9, 2023, 6:25 AM

#

🤷‍♀️

#

I guess it would be a nice to have, but I don't think it's a requirement

warm breach Feb 9, 2023, 6:26 AM

#

if we're losing IDE syntax checks and type checks for macros I'm not sure how good of a trade that is over strings / decorators and type hints (or whatever we use instead of a macro currently)

raven ridge Feb 9, 2023, 6:26 AM

#

dataclasses are already not statically analyzable for type checkers, for instance - they all needed to add special support recognizing and special casing them

warm breach Feb 9, 2023, 6:27 AM

#

that's just a type thing though, this changes ast

raven ridge Feb 9, 2023, 6:28 AM

#

pytest changes the AST.

warm breach Feb 9, 2023, 6:29 AM

#

hm, how?

raven ridge Feb 9, 2023, 6:29 AM

#

It rewrites assert statements

#

In order to include information about why an assertion fails
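
A toy illustration of the idea, not pytest's actual implementation: an `ast.NodeTransformer` that attaches a message with both operand values to simple `assert left <op> right` statements. `AssertRewriter` is a hypothetical name, and the sketch assumes Python 3.9+ for `ast.unparse`.

```python
import ast


class AssertRewriter(ast.NodeTransformer):
    """Add an explanatory message to bare two-operand comparison asserts."""

    def visit_Assert(self, node):
        test = node.test
        simple = (
            node.msg is None
            and isinstance(test, ast.Compare)
            and len(test.ops) == 1
        )
        if not simple:
            return node
        left = ast.unparse(test.left)
        right = ast.unparse(test.comparators[0])
        # The message expression re-evaluates both operands, but only
        # when the assertion has already failed.
        message = (
            f"{ast.unparse(test)!r} + ' failed: left=%r right=%r' "
            f"% (({left}), ({right}))"
        )
        node.msg = ast.parse(message, mode="eval").body
        return node


source = "def check(x):\n    assert x + 1 == 3\n"
tree = AssertRewriter().visit(ast.parse(source))
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<rewritten>", "exec"), namespace)

try:
    namespace["check"](5)
except AssertionError as exc:
    failure = str(exc)
    print(failure)
```

The user still writes an ordinary `assert`, which is why editors and type checkers need no special support for it.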

warm breach Feb 9, 2023, 6:30 AM

#

yeah but that doesn't concern what the user writes right

#

you don't need a special ast support to have IDE completions for writing pytest tests

#

(which, not saying all macros must be invalid python ast, but just that it seems they can be now)

#

intellij / pycharm currently supports injected language ast natively

#

but it seems pylance / vscode has decided not to

raven ridge Feb 9, 2023, 6:33 AM

#

warm breach yeah but that doesn't concern what the user writes right

Sure, and it wouldn't if pytest could use a macro to do its assertion rewriting, either.

warm breach Feb 9, 2023, 6:34 AM

#

raven ridge Sure, and it wouldn't if pytest could use a macro to do its assertion rewriting,...

that's assuming we just treat the macro as in-line python ast and have all the normal rules of statements and types

raven ridge Feb 9, 2023, 6:35 AM

#

Just because macros could modify the AST doesn't imply that all uses of macros would be totally opaque to static analysis, is all I'm saying. Pytest runs tests with a modified AST, and pytest tests are understood by static analysis tools. In the pytest case, static analysis tools work fine because it rewrites assert to do something nearly totally compatible with what it would ordinarily do

lone sun Feb 9, 2023, 6:38 AM

#

My biggest concern with macros is how easy they make it to obfuscate code. I have actually seen C code where someone did #define BEGIN {, #define END }, and #define LOOP for. (Actually I think LOOP might have been a little fancier, but I don't remember how.) The result was completely illegible to anyone but the original author; several years later he admitted this had not been a good idea.

#

I don't want to tell people that all macros are evil, because they're not. But I feel like they're different from other advanced language features because they're so easy to use. There's a steep barrier before most people even know what a metaclass is, but there's very little to prevent you from littering your code with awful macros.

#

(Though I have to admit that I'm a little tempted to see if there's a sneaky way to convert ! into a factorial operator.)

raven ridge Feb 9, 2023, 6:42 AM

#

It wouldn't be the first time that an advanced feature was made easy to use and got overused, I guess. That's basically the situation with namespace packages today

warm breach Feb 9, 2023, 6:43 AM

#

raven ridge Just because macros could modify the AST doesn't imply that all uses of macros w...

isn't this an example of how pytest does not need a macro?

#

if all you're doing is things you don't need a macro for, why use one in the first place?

lone sun Feb 9, 2023, 6:43 AM

#

Python is Turing complete, so you never need a macro.

raven ridge Feb 9, 2023, 6:45 AM

#

In the pytest case, it manages because it's the runner. Instead of importing your code, it compiles, rewrites, and executes the rewritten code. That technique isn't broadly applicable, it only really works for frameworks. And it's a lot of work.

warm breach Feb 9, 2023, 6:47 AM

#

lone sun Python is Turing complete, so you never _need_ a macro.

I think I'm more referring to whether it makes a difference in the public api experience

#

like np.einsum

raven ridge Feb 9, 2023, 6:48 AM

#

Oh, I'm only half right there. It does import your code, but only after installing its own import hook. I'm right that that only works because it's importing you and not the other way around, though.

warm breach Feb 9, 2023, 6:48 AM

#

how would using pytest with macros look like anyways?

raven ridge Feb 9, 2023, 6:49 AM

#

warm breach I think I'm more referring to whether it makes a difference in the public api ex...

Why not also consider how it affects the implementation of the library? https://www.pythoninsight.com/2018/02/assertion-rewriting-in-pytest-part-4-the-implementation/ describes how pytest does what it does, and it's incredibly complex

Python Insight

tim

Assertion rewriting in Pytest part 4: The implementation

raven ridge Feb 9, 2023, 6:50 AM

#

warm breach how would using pytest with macros look like anyways?

Who knows, they don't exist so we'd just be guessing at hypothetical syntax

warm breach Feb 9, 2023, 6:52 AM

#

raven ridge Why not also consider how it affects the implementation of the library? https://...

I'm assuming with macros we'd have pytest without type inference or attribute suggestions, so I'm not seeing how that's better

#

I'll agree it'll make the library simpler, but I don't see how the end product is better

warm breach Feb 9, 2023, 6:52 AM

#

warm breach I'm assuming with macros we'd have pytest without type inference or attribute su...

which is purely an assumption, but how would an IDE know what the macro would do to your statement

raven ridge Feb 9, 2023, 6:52 AM

#

Why would it need to affect the end user experience at all? I don't think that follows

warm breach Feb 9, 2023, 6:52 AM

#

and if types and attributes are still valid

lone sun Feb 9, 2023, 6:52 AM

#

warm breach I think I'm more referring to whether it makes a difference in the public api ex...

Lots of things make a difference in the experience. This is really a question of aesthetics, not functionality. There is nothing you can do in Python that you can't also do in C, Rust, Lisp, various assembly dialects, BASIC, etc. Part of why I like Python is because I like its aesthetics. Judicious use of macros can make certain things clearer and easier. But I expect that if they're easy to use, there will be codebases where they're used in preference to function calls (with some flimsy justification like "it avoids the overhead of setting up a stack frame"), and those will be horrible to work on.

warm breach Feb 9, 2023, 6:53 AM

#

raven ridge Why would it need to affect the end user experience at all? I don't think that f...

having things like IDE autocomplete work, flake8, mypy, etc.

raven ridge Feb 9, 2023, 6:53 AM

#

warm breach which is purely an assumption, but how would an IDE know what the macro would do...

It already doesn't know what assert will do, and it works fine anyway

warm breach Feb 9, 2023, 6:54 AM

#

raven ridge It already doesn't know what `assert` will do, and it works fine anyway

huh?

#

it does the same thing as normal python assert

raven ridge Feb 9, 2023, 6:54 AM

#

No, it doesn't

warm breach Feb 9, 2023, 6:54 AM

#

I'm talking about the statement after assert

#

it's a valid normal python statement

#

where names need to exist and normal rules need to be obeyed

raven ridge Feb 9, 2023, 6:55 AM

#

Sure, but it doesn't have to be

#

It is because that's what pytest defined it to be

warm breach Feb 9, 2023, 6:56 AM

#

raven ridge Sure, but it doesn't have to be

but knowing whether or not it can be is not statically inferable (again assuming, but it seems to be the case from the pep)

#

once your IDE sees the macro all bets are off about what is valid inside

raven ridge Feb 9, 2023, 6:58 AM

#

that's already the case for assert in pytest

#

you're saying that the expression fed as an argument to assert is still using the names visible in the function scope, and so the IDE can blindly use its existing inference machinery without needing to know that assert has been replaced and isn't the normal assert statement anymore.

That's true, but that's only because pytest implemented it that way. There's nothing that would stop pytest from injecting a name into the scope that that expression is evaluated in, for instance.

#

and if pytest was implemented with macros, it would still make the same guarantees about what names are visible in the expression that's fed as an argument to its asserting macro. Because that's the contract that it wants to provide to its users.

warm breach Feb 9, 2023, 7:02 AM

#

raven ridge and if `pytest` was implemented with macros, it would _still_ make the same guar...

so macros could indicate "this is a normal python statement" somewhere I guess?

#

or they can indicate they have custom ast and the IDE can skip parsing name and other checks for that part?

raven ridge Feb 9, 2023, 7:03 AM

#

perhaps, but that's not what I'm getting at. My point is that it's certainly not the case that static analysis tools would need to throw their hands up and give up whenever they encounter any macro, just as it's certainly not the case that static analysis tools need to give up whenever there's any AST rewriting happening

#

it depends entirely on what the macro/AST rewriting does

#

if it injects new variables, or changes the flow of control, or something, then sure, they might get confused. If it just expands to a bunch of valid Python statements, they probably won't.

warm breach Feb 9, 2023, 7:04 AM

#

raven ridge if it injects new variables, or changes the flow of control, or something, then ...

so we'll have macros but tooling won't actually work if they do anything outside of what we could already do without macros?

#

we should probably aim for actually working inferencing and ast parsing that something like rust at least tries to do in macros

raven ridge Feb 9, 2023, 7:05 AM

#

we can already do everything without macros - you can literally rewrite files at import time

#

if you're looking only for things that require macros to do, you won't find any.

warm breach Feb 9, 2023, 7:06 AM

#

raven ridge we can already do everything without macros - you can literally rewrite files at...

you can't have a library that you just import and be able to write your own ast-valid statements right?

#

like say... np einsum

raven ridge Feb 9, 2023, 7:08 AM

#

warm breach you can't have a library that you just import and be able to write your own ast-...

you can, yes. https://aroberge.github.io/ideas/docs/html/motivation.html

#

for example, https://aroberge.github.io/ideas/docs/html/fractional_math_ast.html

warm breach Feb 9, 2023, 7:10 AM

#

raven ridge you can, yes. <https://aroberge.github.io/ideas/docs/html/motivation.html>

don't you need to run python with a special argument for this to work?

raven ridge Feb 9, 2023, 7:10 AM

#

no, you just need to call ideas.examples.fractions_ast.add_hook() before importing your code.

plain condor Feb 9, 2023, 7:11 AM

#

hello people, i have just started out in python and i installed pycharm, but the text isnt colourful, why is it, and how can i fix it?

raven ridge Feb 9, 2023, 7:11 AM

#

plain condor hello people, i have just started out in python and i installed pycharm, but the...

try asking in #editors-ides

warm breach Feb 9, 2023, 7:11 AM

#

plain condor hello people, i have just started out in python and i installed pycharm, but the...

your file needs an extension .py

plain condor Feb 9, 2023, 7:11 AM

#

okay thank you for guiding!

plain condor Feb 9, 2023, 7:11 AM

#

warm breach your file needs an extension `.py`

is that all?

#

oh yes, it worked flawlessly, thank you @warm breach

raven ridge Feb 9, 2023, 7:14 AM

#

raven ridge no, you just need to call `ideas.examples.fractions_ast.add_hook()` before impor...

this works through an import hook, rewriting the AST when new modules are imported. It's the same trick that pytest plays, basically, except pytest rewrites assert and this rewrites /
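
A rough sketch of that kind of import hook, written from scratch rather than taken from the `ideas` package or pytest. Every name here (`DivToFraction`, `FractionLoader`, `add_hook`, `_hook_Fraction`, `fraction_demo`) is hypothetical, and the hook is deliberately limited to one temporary directory.

```python
import ast
import importlib
import sys
import tempfile
from fractions import Fraction
from importlib.machinery import FileFinder, SourceFileLoader
from pathlib import Path


class DivToFraction(ast.NodeTransformer):
    """Rewrite `a / b` into `_hook_Fraction(a, b)`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Div):
            call = ast.Call(
                func=ast.Name(id="_hook_Fraction", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


class FractionLoader(SourceFileLoader):
    def source_to_code(self, data, path, *, _optimize=-1):
        # Parse, rewrite, then compile the modified AST.
        tree = DivToFraction().visit(ast.parse(data, filename=path))
        ast.fix_missing_locations(tree)
        return compile(tree, path, "exec", dont_inherit=True, optimize=_optimize)

    def exec_module(self, module):
        # Make the helper name visible to the rewritten code.
        module._hook_Fraction = Fraction
        super().exec_module(module)


def add_hook(directory):
    """Use FractionLoader for .py files in `directory` only."""
    make_finder = FileFinder.path_hook((FractionLoader, [".py"]))

    def hook(path):
        if path != directory:
            raise ImportError("not handled by this sketch")
        return make_finder(path)

    sys.path_hooks.insert(0, hook)
    sys.path_importer_cache.pop(directory, None)


directory = tempfile.mkdtemp()
Path(directory, "fraction_demo.py").write_text(
    "half = 1 / 2\nthird = 1 / 3\ntotal = half + third\n"
)
add_hook(directory)          # must happen before the import below
sys.path.insert(0, directory)
demo = importlib.import_module("fraction_demo")
print(demo.total)
```

Note that the rewritten code is what ends up in the cached `.pyc`, and that the hook has to be installed by whoever does the importing, which is the limitation raised in the conversation.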

warm breach Feb 9, 2023, 7:15 AM

#

raven ridge no, you just need to call `ideas.examples.fractions_ast.add_hook()` before impor...

but with macros you can do the same in the current code you're writing right?

raven ridge Feb 9, 2023, 7:16 AM

#

🤷‍♂️ you can do that with an import hook by installing it and then re-importing yourself, I assume

warm breach Feb 9, 2023, 7:16 AM

#

raven ridge 🤷‍♂️ you can do that with an import hook by installing it and then re-importing...

not if your code contains syntax errors

raven ridge Feb 9, 2023, 7:16 AM

#

(removing your entry from sys.modules in the middle)

warm breach Feb 9, 2023, 7:17 AM

#

it will fail at compile time before imports run

raven ridge Feb 9, 2023, 7:17 AM

#

true, for that you'd have to use something like https://aroberge.github.io/ideas/docs/html/lambda.html

warm breach Feb 9, 2023, 7:20 AM

#

raven ridge true, for that you'd have to use something like https://aroberge.github.io/ideas...

hm... curious, can you do anything with this or is it limited to replacing valid python identifiers

raven ridge Feb 9, 2023, 7:20 AM

#

you can do anything with it.

#

that's you defining a function that gets called with the bytes of your .py file, and that returns a str that will be compiled

warm breach Feb 9, 2023, 7:21 AM

#

ah it hooks the string source?

raven ridge Feb 9, 2023, 7:21 AM

#

yep. That lets you make any textual substitutions you'd like on the contents of the file
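
A minimal sketch of the codec mechanism on its own, assuming a made-up codec name `lambdademo` that replaces `λ` with `lambda`. Here the bytes are decoded explicitly; the trick described above additionally relies on a `# coding:` line and on the codec being registered before the file is read, which this sketch does not attempt.

```python
import codecs


def decode(data, errors="strict"):
    """Decode UTF-8 bytes, then apply a textual substitution."""
    text, consumed = codecs.utf_8_decode(bytes(data), errors, True)
    return text.replace("λ", "lambda"), consumed


def search(name):
    # Codec search functions receive a normalized, lower-cased name.
    if name == "lambdademo":
        return codecs.CodecInfo(
            name="lambdademo",
            encode=codecs.utf_8_encode,
            decode=decode,
        )
    return None


codecs.register(search)

raw = "square = λ x: x * x\n".encode("utf-8")
source = raw.decode("lambdademo")   # bytes in, rewritten str out
print(source, end="")

namespace = {}
exec(compile(source, "<decoded>", "exec"), namespace)
print(namespace["square"](4))
```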

warm breach Feb 9, 2023, 7:21 AM

#

honestly why does python still support custom encodings 🥴

raven ridge Feb 9, 2023, 7:22 AM

#

I'd be willing to bet there are people using this trick for real DSLs.

#

and even without all of these tricks, it's always been possible to read a file written in some DSL language of your choice, transpile it to valid Python code, and then exec that Python code, all from a Python session.

#

macros aren't giving you any new capability in that sense - they're just making it easier to use, and making it integrate more nicely with the rest of the language

warm breach Feb 9, 2023, 7:24 AM

#

eh...

#

I agree it would be nicer but

#

"macros aren't giving you any new capability in that sense"
not sure if this can really be said though

#

it's kind of like saying making tuples mutable doesn't give us any new capability, we could mutate them via ctypes all along

raven ridge Feb 9, 2023, 7:25 AM

#

I disagree - ctypes is jumping through an FFI and breaking out of the Python language. All of those things I've been linking are things that can be done in the Python language

warm breach Feb 9, 2023, 7:26 AM

#

raven ridge I disagree - ctypes is jumping through an FFI and breaking out of the Python lan...

a lot of what you describe is already only supported in CPython

#

C is also an implementation detail of CPython

raven ridge Feb 9, 2023, 7:26 AM

#

warm breach a lot of what you describe is already only supported in CPython

I think other implementations support coding comments, but I'm certain other implementations support import hooks

warm breach Feb 9, 2023, 7:27 AM

#

In any case just because something is possible through some hack doesn't mean it deserves a place in the formal language

#

if macros are added they should be due to their own merits

#

which it seems like it may be already

raven ridge Feb 9, 2023, 7:28 AM

#

warm breach if macros are added they should be due to their own merits

sure, obviously. But those merits can't be "must enable you to do things you couldn't do without macros", because it's already possible to do anything at all without macros, whether by rewriting bytecode or ASTs or the text of the imported module on the fly.

#

or indeed, by generating Python code and exec'ing it

warm breach Feb 9, 2023, 7:29 AM

#

raven ridge sure, obviously. But those merits can't be "must enable you to do things you cou...

well, implementation-wise it'll be vastly different from the current import hooks or custom encodings

#

the pep also describes a runtime specification for the parser and a static value / type structure for static inferencing?

#

as well as restrictions on side effects a macro can have

#

it seems it will be a fairly complex reference implementation if we get one

raven ridge Feb 9, 2023, 7:34 AM

#

warm breach if all you're doing is things you don't need a macro for, why use one in the fir...

I'm just arguing that this isn't a good argument, because you could use it to apply to absolutely everything. Given sufficient setup (.pth files, usercustomize.py, sitecustomize.py, PYTHONSTARTUP, etc) you can already modify any Python file to do something drastically different than what a static analysis tool thinks it will do

warm breach Feb 9, 2023, 7:38 AM

#

raven ridge I'm just arguing that this isn't a good argument, bec...

Isn't the whole point of macros for better static analysis though? over source rewrites with .pth and encodings

raven ridge Feb 9, 2023, 7:39 AM

#

that's one possible advantage. I don't think it's the whole point - it's hard to imagine any implementation of macros that would be more arcane and difficult to set up and use than hacking in custom import hooks to rewrite ASTs

warm breach Feb 9, 2023, 7:40 AM

#

raven ridge that's one possible advantage. I don't think it's the whole point - it's hard to...

but the difference is the macros would be a standard that we would adhere to, and tools could possibly support

spring musk Feb 9, 2023, 7:41 AM

#

Hey

#

umm

warm breach Feb 9, 2023, 7:43 AM

#

https://peps.python.org/pep-0638/#compile-time-checked-data-structures

PEP 638 – Syntactic Macros | peps.python.org

Python Enhancement Proposals (PEPs)

#

also I assume these could be offered support by IDEs in some way

#

though performance wise repeatedly rerunning macro preprocessors isn't too ideal

#

also I guess python would finally have non-syntax compile time errors? 👀

#

or, runtime errors in the preprocessor?

#

does that count as compile time or runtime pithink

#

can preprocessors use macros themselves

warm breach Feb 9, 2023, 8:00 PM

#

https://docs.python.org/3/c-api/object.html#c.PyObject_DelAttr

#

why is PyObject_DelAttr not stable ABI?

#

though PyObject_SetAttr and PyObject_HasAttr already are

feral island Feb 9, 2023, 8:03 PM

#

warm breach though `PyObject_SetAttr` and `PyObject_HasAttr` already are

isn't SetAttr with the last argument set to NULL equivalent to DelAttr?

#

https://github.com/python/cpython/blob/e60892f9db1316dbabf7a652d7648e4f968b745d/Include/abstract.h#L101

fallen slateBOT Feb 9, 2023, 8:03 PM

#

Include/abstract.h line 101

#define  PyObject_DelAttr(O, A) PyObject_SetAttr((O), (A), NULL)

feral island Feb 9, 2023, 8:03 PM

#

it's not in the stable ABI because it's not in any ABI

warm breach Feb 9, 2023, 8:03 PM

#

ah it's a macro

#

hm okay yeah that's simple enough

raven ridge Feb 9, 2023, 8:07 PM

#

there's other macros that have been added to the stable ABI as functions, though

warm breach Feb 9, 2023, 8:10 PM

#

I'm just gonna pretend it exists 🥴

@bind_api(pythonapi["PyObject_SetAttr"])
def SetAttr(self, name: str, value: object) -> int:
    """Set attribute `name` of the PyObject. Returns -1 on failure."""

def DelAttr(self, name: str) -> int:
    """Delete attribute `name` of the PyObject. Returns -1 on failure."""
    return self.SetAttr(name, ctypes.py_object())
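
For comparison, a self-contained sketch of the same "delete by setting to NULL" idea with plain `ctypes`, without the `bind_api` helper. It assumes CPython; the names `set_attr`, `del_attr` and `Foo` are illustrative only. Declaring the value parameter as `c_void_p` is one way to get a NULL pointer through, since `None` is converted to NULL for that type.

```python
import ctypes

# Assumption: CPython, where ctypes.pythonapi exposes the C API.
set_attr = ctypes.pythonapi.PyObject_SetAttr
# The value is declared as c_void_p so that passing None sends a NULL pointer.
set_attr.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.c_void_p]
set_attr.restype = ctypes.c_int

def del_attr(obj, name):
    """Delete `name` from `obj` by calling PyObject_SetAttr(obj, name, NULL)."""
    return set_attr(obj, name, None)

class Foo:
    pass

foo = Foo()
foo.x = 1
status = del_attr(foo, "x")
print(status, hasattr(foo, "x"))  # 0 False
```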

raven ridge Feb 9, 2023, 8:13 PM

#

I've only needed to limit myself to the stable ABI once, and I found the experience pretty painful. There's so many convenience things that are missing from the stable ABI, forcing you to reimplement stuff yourself

warm breach Feb 9, 2023, 8:16 PM

#

raven ridge I've only needed to limit myself to the stable ABI once, and I found the experie...

I still don't know how to set a list slice from c api

#

https://docs.python.org/3/c-api/list.html#c.PyList_SetSlice

#

both PyList_GetSlice and PyList_SetSlice only work on start:end without steps

#

and start end need to be computed as real indices (not negative)

raven ridge Feb 9, 2023, 8:17 PM

#

you can't do steps even from Python, right?

#

!e

x = list(range(10))
x[::2] = [0, 0, 0, 0, 0]
print(x)

fallen slateBOT Feb 9, 2023, 8:18 PM

#

@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.

[0, 1, 0, 3, 0, 5, 0, 7, 0, 9]

raven ridge Feb 9, 2023, 8:18 PM

#

TIL! I had no idea that worked.

warm breach Feb 9, 2023, 8:18 PM

#

ls[::-1] is popular for reversing

raven ridge Feb 9, 2023, 8:19 PM

#

I'm guessing that that's just handled manually somewhere in the implementation of list, then

warm breach Feb 9, 2023, 8:20 PM

#

!e

ls = [1, 2, 3, 4, 5, 6]
ls[0:6:2] = ["a", "b", "c"]

print(ls)

fallen slateBOT Feb 9, 2023, 8:20 PM

#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

['a', 2, 'b', 4, 'c', 6]

warm breach Feb 9, 2023, 8:20 PM

#

assignment steps work as well

raven ridge Feb 9, 2023, 8:21 PM

#

yeah, that's special cased. https://github.com/python/cpython/blob/main/Objects/listobject.c#L2953-L3085
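
A short sketch of how the two kinds of slice assignment differ at the Python level. The variable names are arbitrary; the behavior shown is standard list semantics.

```python
numbers = list(range(10))

# A plain slice may change the list's length.
numbers[0:2] = ["a", "b", "c"]
print(len(numbers))  # 11

# An extended slice (step != 1) must be given exactly as many items as it selects.
letters = ["a", "b", "c", "d", "e", "f"]
letters[::2] = [1, 2, 3]
print(letters)  # [1, 'b', 2, 'd', 3, 'f']

try:
    letters[::2] = [1, 2]  # two items for three slots
except ValueError as exc:
    error = str(exc)
    print(error)
```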

warm breach Feb 9, 2023, 8:21 PM

#

!e had to implement that myself for tuple set slice 😔

from einspect import view

t = (1, 2, 3, 4, 5, 6)

view(t)[0:6:2] = ("a", "b", "c")

print(t)

fallen slateBOT Feb 9, 2023, 8:21 PM

#

@warm breach :white_check_mark: Your 3.11 eval job has completed with return code 0.

('a', 2, 'b', 4, 'c', 6)

raven ridge Feb 9, 2023, 8:22 PM

#

warm breach and start end need to be computed as real indices (not negative)

that's easy enough to deal with, though - you add the length of the list to any negative index to get its positive counterpart
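
A small sketch of that normalization. `normalize_index` is a made-up helper; `slice.indices()` is the built-in way to get the same result for a whole slice, and it also clamps out-of-range values.

```python
def normalize_index(index, length):
    """Map a negative index to its non-negative counterpart."""
    return index + length if index < 0 else index

items = list(range(10))
print(normalize_index(-3, len(items)))  # 7

# slice.indices() does the same for start/stop/step and also clamps to the length.
start, stop, step = slice(-4, -1).indices(len(items))
print(start, stop, step)  # 6 9 1
```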

warm breach Feb 9, 2023, 8:25 PM

#

https://github.com/python/cpython/blob/7b20a0f55a16b3e2d274cc478e4d04bd8a836a9f/Objects/object.c#L1003-L1012

fallen slateBOT Feb 9, 2023, 8:25 PM

#

Objects/object.c lines 1003 to 1012

PyObject_SetAttr(PyObject *v, PyObject *name, PyObject *value)
{
    PyTypeObject *tp = Py_TYPE(v);
    int err;

    if (!PyUnicode_Check(name)) {
        PyErr_Format(PyExc_TypeError,
                     "attribute name must be string, not '%.200s'",
                     Py_TYPE(name)->tp_name);
        return -1;

warm breach Feb 9, 2023, 8:26 PM

#

is there a way to get errors like this that return -1 instead of NULL

#

ctypes.pythonapi automatically raises NULL returns with errors

feral island Feb 9, 2023, 8:26 PM

#

different C functions have different conventions for errors. NULL is the most common one but not all functions return a pointer

raven ridge Feb 9, 2023, 8:27 PM

#

some don't return any error sentinel, forcing you to check yourself after every call

warm breach Feb 9, 2023, 8:27 PM

#

how do I get the error there though

#

do I have to do some PyErr call

raven ridge Feb 9, 2023, 8:29 PM

#

if PyObject_SetAttr returns -1, that means that PyErr_Occurred() is true, and you can fetch the exception that occurred with PyErr_Fetch

warm breach Feb 9, 2023, 8:30 PM

#

oh huh

#

!e

from ctypes import *

SetAttr = pythonapi.PyObject_SetAttr
SetAttr.argtypes = [py_object, py_object, py_object]
SetAttr.restype = c_int

class Foo:
    pass

SetAttr(Foo, [], 123)

fallen slateBOT Feb 9, 2023, 8:30 PM

#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 10, in <module>
003 | TypeError: attribute name must be string, not 'list'

warm breach Feb 9, 2023, 8:30 PM

#

it's automatic as well? pithink

#

I guess -1 is also special cased?

#

can no pythonapi function return -1 as a real value then

raven ridge Feb 9, 2023, 8:31 PM

#

I'm sure that's not the case. Without checking the implementation, I'd bet that pythonapi always calls PyErr_Occurred(), and always propagates an exception if so

warm breach Feb 9, 2023, 8:34 PM

#

raven ridge I'm sure that's not the case. Without checking the implementation, I'd bet that ...

ah yeah probably

#

I guess that's why PyDict_GetItem segfaults when it returns NULL with restype py_object

#

since it doesn't set an exception

raven ridge Feb 9, 2023, 8:46 PM

#

raven ridge I'm sure that's not the case. Without checking the implementation, I'd bet that ...

actually, thinking more about it, I'm betting that ctypes doesn't do anything special to handle an exception having been raised. You're calling ctypes via the normal eval loop, and the normal eval loop has machinery in it to propagate exceptions

#

I don't think ctypes.pythonapi needs to check whether an exception occurred - it just blithely ignores it, and then the eval loop that made the call into ctypes notices that the exception indicator is set and propagates the exception
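
A hedged sketch of the observable behavior, assuming CPython: a `-1` return is only turned into an exception when the error indicator is actually set. It does not settle which layer notices the pending exception. `PyLong_AsLong` is used because `-1` is both its error sentinel and a legitimate result.

```python
import ctypes

# Assumption: CPython. PyLong_AsLong returns -1 on error, but -1 is also a valid result.
as_long = ctypes.pythonapi.PyLong_AsLong
as_long.argtypes = [ctypes.py_object]
as_long.restype = ctypes.c_long

plain = as_long(-1)  # no exception is set, so -1 comes back as an ordinary value
print(plain)

try:
    as_long("not an int")  # sets TypeError and returns -1
except TypeError as exc:
    raised = exc
    print(type(raised).__name__)
```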

warm breach Feb 9, 2023, 9:11 PM

#

@pliant tusk made slot deletes work now

from einspect import view

del view(int)["__pow__"]

try:
    print(2 ** 65)
except TypeError as e:
    print(e)

view(int).restore("__pow__")

print(2 ** 65)

unsupported operand type(s) for ** or pow(): 'int' and 'int'
36893488147419103232

pliant tusk Feb 9, 2023, 9:13 PM

#

nice

#

@warm breach I just noticed a super weird bug #esoteric-python message

#

@feral island do you know if it is possible to download the exact disk image that the eval command uses? I want to run that binary in a debugger

white nexus Feb 9, 2023, 9:15 PM

#

!gh pythondiscord/snekbox

feral island Feb 9, 2023, 9:15 PM

#

I know nothing about the internals of the bot

white nexus Feb 9, 2023, 9:15 PM

#

https://github.com/python-discord/snekbox

GitHub

GitHub - python-discord/snekbox: Easy, safe evaluation of arbitrary...

Easy, safe evaluation of arbitrary Python code. Contribute to python-discord/snekbox development by creating an account on GitHub.

pliant tusk Feb 9, 2023, 9:16 PM

#

feral island I know nothing about the internals of the bot

ah whoops, misread your roles

#

you may know the other side of the bug tho, do you know if there are any conditions where Py_CLEAR will decref a pointer but not set it to NULL? Because that's what I think might be happening

white nexus Feb 9, 2023, 9:16 PM

#

https://github.com/python-discord/snekbox/blob/main/Dockerfile

pliant tusk Feb 9, 2023, 9:16 PM

#

white nexus https://github.com/python-discord/snekbox/blob/main/Dockerfile

thanks, saw it

warm breach Feb 9, 2023, 9:17 PM

#

pliant tusk @warm breach I just noticed a super weird bug https://discord.com/chann...

I reproduced it on ubuntu 3.11

#

on windows there is no segfault

#

but on windows with PYTHONDEVMODE=1 it will segfault with Windows fatal exception: access violation

pliant tusk Feb 9, 2023, 9:18 PM

#

that is even weirder

#

on my macbook it gives SystemError (which is what it should be doing given the C code that runs)

warm breach Feb 9, 2023, 9:18 PM

#

PYTHONMALLOC=debug will make it segfault on windows

pliant tusk Feb 9, 2023, 9:19 PM

#

weird

#

guess I need to boot into windows to debug then

warm breach Feb 9, 2023, 9:19 PM

#

pliant tusk weird

oh in 3.12 there is no segfault

#

3.12.0a4 windows:

    print(corrupt.__reduce__())
          ^^^^^^^^^^^^^^^^^^^^
SystemError: NULL object passed to Py_BuildValue

pliant tusk Feb 9, 2023, 9:20 PM

#

warm breach 3.12.0a4 windows: ```py print(corrupt.__reduce__()) ^^^^^^^^^^^^^^...

thats what it should be doing in 3.11 as well

warm breach Feb 9, 2023, 9:20 PM

#

wait no

#

3.12.0a4 ubuntu, prints with no segfault:

(<built-in function iter>, (<function  at 0x7f7cf79e9f80>, 0))

#

wtf

pliant tusk Feb 9, 2023, 9:21 PM

#

yea this is a weird one

pliant tusk Feb 9, 2023, 9:22 PM

#

warm breach 3.12.0a4 ubuntu, prints with no segfault: ```py (<built-in function iter>, (<fun...

most recent branch from github?

raven ridge Feb 9, 2023, 9:23 PM

#

warm breach 3.12.0a4 windows: ```py print(corrupt.__reduce__()) ^^^^^^^^^^^^^^...

it's an error to make a call to a Python C API while the exception indicator is already set, with few exceptions

feral island Feb 9, 2023, 9:23 PM

#

warm breach 3.12.0a4 windows: ```py print(corrupt.__reduce__()) ^^^^^^^^^^^^^^...

I get this too on 3.11 ubuntu

pliant tusk Feb 9, 2023, 9:24 PM

#

feral island I get this too on 3.11 ubuntu

thats what you should get, but it seems that on some platforms the Py_CLEAR isnt properly clearing the pointer

pliant tusk Feb 9, 2023, 9:25 PM

#

warm breach 3.12.0a4 ubuntu, prints with no segfault: ```py (<built-in function iter>, (<fun...

what compiler did you build with?

feral island Feb 9, 2023, 9:26 PM

#

pliant tusk thats what you should get, but it seems that on some platforms the `Py_CLEAR` is...

makes sense. I don't have too much knowledge here. I looked at the definition of Py_CLEAR and it does seem like it should always set its arg to NULL (unless perhaps there's a threading race condition, but that seems unlikely here)

pliant tusk Feb 9, 2023, 9:26 PM

#

yea this shouldnt be a threading issue, I wonder if it is the compiler optimizing something out incorrectly on specific platforms

raven ridge Feb 9, 2023, 9:27 PM

#

the compiler is (almost) never wrong

pliant tusk Feb 9, 2023, 9:27 PM

#

raven ridge the compiler is (almost) never wrong

thats the only reason I can think for the bug only happening on specific platforms

#

or some flags that are passed cause this

raven ridge Feb 9, 2023, 9:28 PM

#

if it's happening on more than 1 platform, with different compilers, the odds of it being a bug in two different compilers is basically 0

raven ridge Feb 9, 2023, 9:28 PM

#

pliant tusk thats the only reason I can think for the bug only happening on specific platfor...

the much more likely explanation is that there's undefined behavior in CPython

warm breach Feb 9, 2023, 9:28 PM

#

pliant tusk thats the only reason I can think for the bug only happening on specific platfor...

so summary

3.11, windows
3.12.0a4, windows

(<built-in function iter>, (<function  at 0x000001B8D6A8C720>, 0))

3.11, windows, PYTHONMALLOC=debug
3.12.0a4, windows, PYTHONMALLOC=debug

Windows fatal exception: access violation
> exit code -1073741819 (0xC0000005)

3.11, ubuntu

(<built-in function iter>, (<function  at 0x7fb772c3c4a0>, 0))
> terminated by signal SIGSEGV (Address boundary error)

3.12.0a4, ubuntu

(<built-in function iter>, (<function  at 0x7f3480d71f80>, 0))

3.12.0a4, ubuntu, PYTHONMALLOC=debug

Fatal Python error: Segmentation fault

pliant tusk Feb 9, 2023, 9:29 PM

#

raven ridge the much more likely explanation is that there's undefined behavior in CPython

oh yea fair enough, I just don't see anything in the code it hits that looks like it would be undefined behavior

raven ridge Feb 9, 2023, 9:29 PM

#

could be a use-after-free, if PYTHONMALLOC=debug is changing the behavior

#

that'd be my first educated guess, before looking at the code at all...

#

try it with valgrind or asan, perhaps... (and PYTHONMALLOC=malloc)

warm breach Feb 9, 2023, 9:30 PM

#

warm breach 3.12.0a4 windows: ```py print(corrupt.__reduce__()) ^^^^^^^^^^^^^^...

also this was me accidentally using python 3.12.0a4 I built with debug mode

pliant tusk Feb 9, 2023, 9:30 PM

#

the resulting bug would be a use after free, since Py_CLEAR is supposed to clear iter->seq_callable

warm breach Feb 9, 2023, 9:30 PM

#

otherwise I can't get the SystemError in release binaries

pliant tusk Feb 9, 2023, 9:30 PM

#

and the rest of the code there sets up the Py_CLEAR to happen after iter->seq_callable is checked

#

but it should be NULLed out by Py_CLEAR and raise a SystemError Exception

pliant tusk Feb 9, 2023, 9:32 PM

#

warm breach otherwise I can't get the SystemError in release binaries

weird, I only get SystemError on my machine(s) so far

warm breach Feb 9, 2023, 9:33 PM

#

pliant tusk weird, I only get SystemError on my machine(s) so far

on 3.11?

pliant tusk Feb 9, 2023, 9:33 PM

#

macos 3.10.10 and 3.11.1

#

both give me the correct result of SystemError

warm breach Feb 9, 2023, 9:34 PM

#

hm, my 3.11 on ubuntu was built with GCC via pyenv

#

windows 3.11 is using binary from python.org

pliant tusk Feb 9, 2023, 9:34 PM

#

ill test on my windows machine tonight

#

but this is a weird bug

feral island Feb 9, 2023, 9:37 PM

#

hm so the SystemError would happen if only one of it_callable and it_sentinel is NULLed out?

pliant tusk Feb 9, 2023, 9:39 PM

#

Yea, the SystemError is triggered inside Py_BuildValue

raven ridge Feb 9, 2023, 9:40 PM

#

what pointer is it that Py_CLEAR ought to be clearing and you think it isn't?

pliant tusk Feb 9, 2023, 9:40 PM

#

it->it_callable

feral island Feb 9, 2023, 9:40 PM

#

https://github.com/python/cpython/blob/f1f3af7b8245e61a2e0abef03b2c6c5902ed7df8/Objects/iterobject.c#L208

fallen slateBOT Feb 9, 2023, 9:40 PM

#

Objects/iterobject.c line 208

calliter_iternext(calliterobject *it)

pliant tusk Feb 9, 2023, 9:41 PM

#

From callable_iterator

feral island Feb 9, 2023, 9:41 PM

#

specifically I think the branch on line 226 should clear both it_callable and it_sentinel

pliant tusk Feb 9, 2023, 9:42 PM

#

It should and the fact that it is segfaulting means that it is at least decreasing the refcount

raven ridge Feb 9, 2023, 9:43 PM

#

well, the crash is happening in tuplerepr, because the first element of the tuple is an already freed object

pliant tusk Feb 9, 2023, 9:43 PM

#

Yea that's where the use after free actually gets hit

feral island Feb 9, 2023, 9:43 PM

#

pliant tusk It should and the fact that it is segfaulting means that it is at least decreasi...

I wouldn't assume that, I think there's memory corruption somewhere and anything could be happening

pliant tusk Feb 9, 2023, 9:43 PM

#

But I'm pretty sure the root cause is Py_CLEAR

pliant tusk Feb 9, 2023, 9:44 PM

#

feral island I wouldn't assume that, I think there's memory corruption somewhere and anything...

I was assuming that because I have printed functions directly after freeing them and that's what it typically looks like

#

Because the func_name is cleared it displays like that

pliant tusk Feb 9, 2023, 9:45 PM

#

feral island I wouldn't assume that, I think there's memory corruption somewhere and anything...

But we could guarantee with a custom callable and a __del__

feral island Feb 9, 2023, 9:47 PM

#

feral island hm so the SystemError would happen if only one of it_callable and it_sentinel is...

this isn't true, it's PyEval_GetBuiltin that is NULL

pliant tusk Feb 9, 2023, 9:47 PM

#

The system error is triggered in PyEval_GetBuiltin?

feral island Feb 9, 2023, 9:49 PM

#

well at least, it checks for both fields being non-NULL, so the SystemError can't be from only one of them being NULL

warm breach Feb 9, 2023, 9:49 PM

#

pliant tusk The system error is triggered in PyEval_GetBuiltin?

the one I get on 3.12 debug build triggers here https://github.com/python/cpython/blob/main/Python/modsupport.c#L472-L489 in do_mkvalue

raven ridge Feb 9, 2023, 9:49 PM

#

yeah, it's undefined behavior

#

https://github.com/python/cpython/blob/f1f3af7b8245e61a2e0abef03b2c6c5902ed7df8/Objects/iterobject.c#L243-L244

fallen slateBOT Feb 9, 2023, 9:50 PM

#

Objects/iterobject.c lines 243 to 244

return Py_BuildValue("N(OO)", _PyEval_GetBuiltin(&_Py_ID(iter)),
                     it->it_callable, it->it_sentinel);

raven ridge Feb 9, 2023, 9:52 PM

#

the call to _PyEval_GetBuiltin to find the iter builtin is calling Cstr.__eq__, which exhausts the iterator, causing the Py_CLEAR in calliter_iternext to be executed, setting it->it_callable and it->it_sentinel to NULL. But the order of evaluation of arguments in a function call isn't specified, and modifying an argument by evaluating another argument is a bug

#

this is basically the same bug as printf("%d %d\n", i++, i++); just in a trickier package.

#

if it->it_callable and it->it_sentinel are evaluated before _PyEval_GetBuiltin(&_Py_ID(iter)) then it passes pointers to freed objects to Py_BuildValue, and if they're evaluated after then it passes null pointers.

#

that's unspecified behavior, not undefined, actually - but same difference.
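
A toy Python model of the two evaluation orders described above. This is not CPython code: `FakeCallIter` and `get_builtin_with_side_effect` are invented stand-ins, and in real C the "stale" values would be pointers to freed objects rather than live ones.

```python
class FakeCallIter:
    """Toy stand-in for CPython's calliterobject (not the real struct)."""

    def __init__(self, it_callable, it_sentinel):
        self.it_callable = it_callable
        self.it_sentinel = it_sentinel

def get_builtin_with_side_effect(it):
    """Models a builtin lookup that runs user code which exhausts the iterator."""
    it.it_callable = None  # plays the role of Py_CLEAR setting the field to NULL
    it.it_sentinel = None
    return "iter"

def reduce_fields_first(it):
    # Order 1: the fields are read before the lookup runs, so stale values are used.
    fields = (it.it_callable, it.it_sentinel)
    return (get_builtin_with_side_effect(it), fields)

def reduce_lookup_first(it):
    # Order 2: the lookup runs first, so the fields are already cleared when read.
    builtin = get_builtin_with_side_effect(it)
    return (builtin, (it.it_callable, it.it_sentinel))

stale = reduce_fields_first(FakeCallIter(len, 0))
cleared = reduce_lookup_first(FakeCallIter(len, 0))
print(stale)
print(cleared)
```

Order 1 corresponds to the segfault (stale pointers handed to `Py_BuildValue`), order 2 to the `SystemError` (NULL handed to `Py_BuildValue`).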

warm breach Feb 9, 2023, 9:56 PM

#

fallen slate `Objects/iterobject.c` lines 243 to 244 ```c return Py_BuildValue("N(OO)", _PyEv...

does this really need to use _PyEval_GetBuiltin(&_Py_ID(iter))?

#

can't it just do a direct call to the object

raven ridge Feb 9, 2023, 9:57 PM

#

it'd do something entirely different if it didn't look up __builtins__.iter, right?

warm breach Feb 9, 2023, 9:59 PM

#

hm, does __reduce__ need this behavior? (fetching iter from __builtins__)

feral island Feb 9, 2023, 10:00 PM

#

raven ridge the call to `_PyEval_GetBuiltin` to find the `iter` builtin is calling `Cstr.__e...

nice work, thanks for finding this

raven ridge Feb 9, 2023, 10:00 PM

#

can you take it from here? 🙂

feral island Feb 9, 2023, 10:01 PM

#

sure, I'll file a bug and PR tonight

#

or if anyone else here wants to make a CPython contribution, feel free to do that and ping me for a review

raven ridge Feb 9, 2023, 10:03 PM

#

the fix should just be hoisting the _PyEval_GetBuiltin(&_Py_ID(iter)) out of the if, and then adding a big comment explaining why that's necessary

feral island Feb 9, 2023, 10:04 PM

#

yep and a unit test

warm breach Feb 9, 2023, 10:04 PM

#

raven ridge the call to `_PyEval_GetBuiltin` to find the `iter` builtin is calling `Cstr.__e...

I made python/cpython#101765 if you want to post that there

neon troutBOT Feb 9, 2023, 10:04 PM

#

GitHub

Issue (open): [cpython] #101765 iter.__reduce__ can segfault if accessing __builtins__.__dict__['iter'] exhausts the iter object

pliant tusk Feb 9, 2023, 10:06 PM

#

raven ridge the call to `_PyEval_GetBuiltin` to find the `iter` builtin is calling `Cstr.__e...

Ah I didn't know that the order of arguments is unspecified

warm breach Feb 9, 2023, 10:07 PM

#

wonders of C 🥴

astral lion Feb 9, 2023, 10:07 PM

#

Hello, I'm stuck on a problem and I can't seem to figure out a solution. I have a dictionary whose key item is a class object. When I change that key item class object's .name attribute, the key no longer works in the original dictionary. Is there any way around this?

raven ridge Feb 9, 2023, 10:10 PM

#

pliant tusk Ah I didn't know that the order of arguments is unspecified

I had it right the first time, it's undefined behavior and not unspecified. Check the "undefined behavior" section at the bottom of https://en.cppreference.com/w/c/language/eval_order

#

This is hitting case (2) there.

raven ridge Feb 9, 2023, 10:11 PM

#

astral lion Hello, I'm stuck on a problem and I can't seem to figure out a solution. I have...

don't include the name in the hash. It's up to you to define a __hash__ that doesn't use any mutable fields.

astral lion Feb 9, 2023, 10:12 PM

#

raven ridge don't include the name in the hash. It's up to you to define a `__hash__` that d...

ehm, sorry can you elaborate a bit? I'm a little new to Python still 😦

raven ridge Feb 9, 2023, 10:12 PM

#

you'd have to show your code for me to be any more specific than that

astral lion Feb 9, 2023, 10:15 PM

#

raven ridge you'd have to show your code for me to be any more specific than that

Hopefully this makes sense. key_signal is an object that's used as a key to ldf._signal_representations dictionary. It works when I do not modify the key_signal object members, but when I update the .name member of key_signal, the key no longer works in the ldf._signal_representations dictionary

#

but the memory addresses look intact

pliant tusk Feb 9, 2023, 10:18 PM

#

warm breach I made python/cpython#101765 if you want to post that there

There's a lot of places where that pattern happens and I think they all would need to be fixed

raven ridge Feb 9, 2023, 10:19 PM

#

astral lion but the memory addresses look intact

dicts in Python are implemented using a data structure called a hash map. The idea behind a hash map is that, when you look for a key, you only need to look at other objects with the same hash code, and you can totally ignore every key in the hash table except for ones that have the same hash code. When you change the .name attribute, that changes the object's hash code: https://github.com/c4deszes/ldfparser/blob/06e9cd02f5fbf120de112c92df22a588279ffa55/ldfparser/signal.py#L44-L45
Which is why the dict stops being able to find that object.
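
A minimal sketch of the effect, assuming a class shaped roughly like the one linked. `Signal` here is a toy with a name-based `__hash__` and `__eq__`, not the real ldfparser class, and the dict contents are invented.

```python
class Signal:
    """Toy class whose hash depends on a mutable field (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Signal) and self.name == other.name

    def __hash__(self):
        return hash(self.name)

signal = Signal("speed")
table = {signal: "encoding"}

signal.name = "velocity"  # the dict still files the entry under hash("speed")
found_new = table.get(Signal("velocity"))
found_old = table.get(Signal("speed"))
print(found_new, found_old)  # None None

# One workaround: take the entry out before renaming, then put it back.
signal.name = "speed"
value = table.pop(signal)
signal.name = "velocity"
table[signal] = value
print(table[Signal("velocity")])  # encoding
```

The other option is the one suggested earlier: base `__hash__` only on fields that never change.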

fallen slateBOT Feb 9, 2023, 10:19 PM

#

ldfparser/signal.py lines 44 to 45

def __hash__(self) -> int:
    return hash((self.name))

astral lion Feb 9, 2023, 10:20 PM

#

raven ridge dicts in Python are implemented using a data structure called a hash map. The id...

ohh ok

#

I know the problem now, but how do I fix it?

#

just get the hash and update the dictionary and remove the old instance?

warm breach Feb 9, 2023, 10:21 PM

#

pliant tusk There's a lot of places where that pattern happens and I think they all would ne...

why does reduce need to get itself from builtins dict anyways?

#

haven't seen that outside reduce

pliant tusk Feb 9, 2023, 10:21 PM

#

Knowing this, there's actually a lot of places where you can produce undefined behavior with this pattern. I'll start sifting through the ones that I know about and see if they're actually triggerable.

pliant tusk Feb 9, 2023, 10:22 PM

#

warm breach why does reduce need to get itself from builtins dict anyways?

It needs to get it so that it can return the information needed to reproduce the object

warm breach Feb 9, 2023, 10:28 PM

#

pliant tusk It needs to get it so that it can return the information needed to reproduce the...

calliter_reduce(calliterobject *it, PyObject *Py_UNUSED(ignored)) couldn't this just get ob_type of it here

pliant tusk Feb 9, 2023, 10:29 PM

#

No, because ob_type is not iter; it's callable_iterator, which cannot be constructed directly from Python code

grave jolt Feb 9, 2023, 10:46 PM

#

fallen slate `Objects/iterobject.c` lines 243 to 244 ```c return Py_BuildValue("N(OO)", _PyEv...

https://tenor.com/view/nooo-no-nope-no-way-screaming-gif-15477187

#

@raven ridge

#

I was genuinely confused for a moment 😄

deep nova Feb 9, 2023, 11:23 PM

#

Hey internals people. You're the ones to talk to when it comes to nuanced problems

#

I've got a question happening over here, if anyone wants to weigh in https://discord.com/channels/267624335836053506/1073382454028664902

warm breach Feb 9, 2023, 11:39 PM

#

feral island yep and a unit test

do we have iter tests? or would it go in pickletester

pliant tusk Feb 9, 2023, 11:40 PM

#

warm breach do we have iter tests? or would it go in `pickletester`

there are a lot of places that this pattern exists, not just callable_iter. most of the default iterable types have it

warm breach Feb 9, 2023, 11:40 PM

#

pliant tusk there are a lot of places that this pattern exists, not just callable_iter. most...

like the builtins dict call?

pliant tusk Feb 9, 2023, 11:42 PM

#

yea. listiter_reduce_general has it in two places

#

tupleiter_reduce too

#

https://github.com/python/cpython/blob/5b946d371979a772120e6ee7d37f9b735769d433/Objects/tupleobject.c#L1049-L1056

fallen slateBOT Feb 9, 2023, 11:43 PM

#

Objects/tupleobject.c lines 1049 to 1056

tupleiter_reduce(_PyTupleIterObject *it, PyObject *Py_UNUSED(ignored))
{
    if (it->it_seq)
        return Py_BuildValue("N(O)n", _PyEval_GetBuiltin(&_Py_ID(iter)),
                             it->it_seq, it->it_index);
    else
        return Py_BuildValue("N(())", _PyEval_GetBuiltin(&_Py_ID(iter)));
}

warm breach Feb 10, 2023, 12:00 AM

#

pliant tusk https://github.com/python/cpython/blob/5b946d371979a772120e6ee7d37f9b735769d433/...

hm, yeah all of these appear affected

pliant tusk Feb 10, 2023, 12:01 AM

#

Makes me wonder how often similar patterns exist

warm breach Feb 10, 2023, 12:01 AM

#

how come compiling with debug mode makes the systemerror work?

#

does it change the arg eval order?

raven ridge Feb 10, 2023, 12:02 AM

#

Nasal demons

pliant tusk Feb 10, 2023, 12:02 AM

#

because anything like function(nested_call(), object->member) would have the bug if nested_call can be manipulated into calling python code that can manipulate object->member

warm breach Feb 10, 2023, 12:03 AM

#

seems like vs knows something is off as well 🥴

raven ridge Feb 10, 2023, 12:03 AM

#

pliant tusk because anything like `function(nested_call(), object->member)` would have the b...

Only if it can manipulate that pointer, not if it manipulates the pointed to object

pliant tusk Feb 10, 2023, 12:04 AM

#

raven ridge Only if it can manipulate that pointer, not if it manipulates the pointed to obj...

ah yea you're right, unless the pointed-to object is checked for some condition that you can then bypass

warm breach Feb 10, 2023, 12:04 AM

#

raven ridge Only if it can manipulate that pointer, not if it manipulates the pointed to obj...

seems like a lot of calls can though

pliant tusk Feb 10, 2023, 12:05 AM

#

warm breach seems like a lot of calls can though

tbh I found this while working on my list of different C functions that can eventually call into python code

#

although this didn't add any as it just ends up at PyDict_GetItemWithError

raven ridge Feb 10, 2023, 12:09 AM

#

warm breach seems like a lot of calls can though

It's a very persistent source of issues in the interpreter that calls into Python code can invalidate assumptions or state of C code higher up on the stack. There's a ton of ugly stuff in the interpreter just guarding against cases where this can occur

pliant tusk Feb 10, 2023, 12:10 AM

#

yea most of the bugs I have reported are caused by this exact issue

warm breach Feb 10, 2023, 12:11 AM

#

why is __builtins__.__dict__ mutable anyways 😔

pliant tusk Feb 10, 2023, 12:12 AM

#

the repl uses that to add help and license

pliant tusk Feb 10, 2023, 12:14 AM

#

raven ridge It's a very persistent source of issues in the interpreter that calls into Pytho...

I wonder if you could add some sort of global flag that can be used as part of a sort of PROTECT macro. Basically at the start the flag would be set to 0, then if bytecode is executed the flag would be set to 1. then C code can check that flag to see if it needs to re-adjust assumptions

raven ridge Feb 10, 2023, 12:16 AM

#

> to see if it needs to re-adjust assumptions

If it's able to re-adjust its assumptions, then it's also able to just not make those assumptions at all

warm breach Feb 10, 2023, 12:16 AM

#

that might be slow for very hot paths as well

#

the overhead of the check would probably outweigh the performance benefits in the branched "assumption" path

pliant tusk Feb 10, 2023, 12:17 AM

#

ah yea fair enough

warm breach Feb 10, 2023, 12:24 AM

#

static PyObject *
listiter_reduce_general(void *_it, int forward)
{
    PyObject *list;
+   PyObject *builtin_iter;
+   PyObject *builtin_reversed;
    
    /* the objects are not the same, index is of different types! */
    if (forward) {
        _PyListIterObject *it = (_PyListIterObject *)_it;
        if (it->it_seq) {
+           builtin_iter = _PyEval_GetBuiltin(&_Py_ID(iter));
            return Py_BuildValue("N(O)n", builtin_iter,
                                 it->it_seq, it->it_index);
        }
    } else {
        listreviterobject *it = (listreviterobject *)_it;
        if (it->it_seq) {
+           builtin_reversed = _PyEval_GetBuiltin(&_Py_ID(reversed));
            return Py_BuildValue("N(O)n", builtin_reversed,
                                 it->it_seq, it->it_index);
        }
    }
    /* empty iterator, create an empty list */
    list = PyList_New(0);
    if (list == NULL)
        return NULL;
    return Py_BuildValue("N(N)", _PyEval_GetBuiltin(&_Py_ID(iter)), list);
}

#

these look kind of weird tbh

pliant tusk Feb 10, 2023, 12:26 AM

#

could probably get away with just PyObject *builtin; and using that variable in both places

warm breach Feb 10, 2023, 12:27 AM

#

is there really no frozen constant pointer to builtins?

#

__builtins__.__dict__ holds the only reference to them?

pliant tusk Feb 10, 2023, 12:28 AM

#

__builtins__ can change depending on what frame you are evaluating

#

so you can't use a constant frozen pointer if there is some utility lib that changes some

raven ridge Feb 10, 2023, 12:30 AM

#

warm breach ```diff static PyObject * listiter_reduce_general(void *_it, int forward) { ...

This doesn't look correct to me. You need to call _PyEval_GetBuiltin before you check if (it->it_seq)

#

because (presumably) calling _PyEval_GetBuiltin can unset it_seq
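
For illustration only, a pure-Python sketch (not the CPython internals) of why a name lookup in a dict can run arbitrary code: a key that hashes like `'iter'` but defines its own `__eq__` gets that `__eq__` called during the lookup. `EvilKey`, `calls` and `namespace` are made-up names.

```python
calls = []

class EvilKey:
    """Hypothetical key that collides with the string 'iter'."""
    def __hash__(self):
        return hash("iter")

    def __eq__(self, other):
        calls.append(other)      # arbitrary code runs here, in the middle of the lookup
        return other == "iter"

namespace = {EvilKey(): "fake iter"}
value = namespace["iter"]        # the lookup itself triggers EvilKey.__eq__
```

In the C case, whatever runs inside that `__eq__` could exhaust the iterator, which is presumably how `it_seq` ends up cleared after the check.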

pliant tusk Feb 10, 2023, 12:31 AM

#

ah yea that would fix the SystemError exception from passing NULL to Py_BuildValue

warm breach Feb 10, 2023, 12:31 AM

#

raven ridge This doesn't look correct to me. You need to call `_PyEval_GetBuiltin` before yo...

currently this hits SystemError since it->it_seq would be NULL

#

if called before it would go to the empty iterator

raven ridge Feb 10, 2023, 12:31 AM

#

ok, but every SystemError is a programming bug

warm breach Feb 10, 2023, 12:32 AM

#

I guess the empty case is more correct?

raven ridge Feb 10, 2023, 12:32 AM

#

yeah.

#

every SystemError is raised because code inside the interpreter or an extension module has a bug.

#

if you're able to provoke the interpreter to set a SystemError from one of its own calls, that's proof there's a bug in the interpreter 🙂

gray galleon Feb 10, 2023, 12:34 AM

#

pliant tusk the `repl` uses that to add `help` and `license`

when you can't import help and license

pliant tusk Feb 10, 2023, 12:34 AM

#

!e ```py
help

fallen slateBOT Feb 10, 2023, 12:34 AM

#

@pliant tusk :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 1, in <module>
003 | NameError: name 'help' is not defined

gray galleon Feb 10, 2023, 12:35 AM

#

!e```
from sitebuiltins import *

print(help)

fallen slateBOT Feb 10, 2023, 12:35 AM

#

@gray galleon :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 1, in <module>
003 | ModuleNotFoundError: No module named 'sitebuiltins'

gray galleon Feb 10, 2023, 12:35 AM

#

!e```
from _sitebuiltins import *

print(help)

fallen slateBOT Feb 10, 2023, 12:35 AM

#

@gray galleon :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 3, in <module>
003 | NameError: name 'help' is not defined

warm breach Feb 10, 2023, 12:49 AM

#

could I just copy this into every iter_reduce

    /* _PyEval_GetBuiltin can invoke arbitrary `__eq__` code
     * calls must be *before* access of _it pointers
     * since C/C++ parameter eval order is unspecified.
     * see issue #101765 */

#

or is that too much repetition

rose schooner Feb 10, 2023, 1:07 AM

#

gray galleon !e``` from _sitebuiltins import * print(help) ```

you also have to do help = _Helper()

pliant tusk Feb 10, 2023, 1:07 AM

#

warm breach could I just copy this into every `iter_reduce` ``` /* _PyEval_GetBuiltin ca...

Wouldn't hurt. I don't think you need to specify __eq__

rose schooner Feb 10, 2023, 1:08 AM

#

rose schooner you also have to do `help = _Helper()`

this won't work with a star import though

#

!e ```py
from _sitebuiltins import _Helper
print(help := _Helper())

fallen slateBOT Feb 10, 2023, 1:08 AM

#

@rose schooner :white_check_mark: Your 3.11 eval job has completed with return code 0.

Type help() for interactive help, or help(object) for help about object.

gray galleon Feb 10, 2023, 1:09 AM

#

!e```
from _sitebuiltins import _Helper

help = _Helper()

print(help)

fallen slateBOT Feb 10, 2023, 1:09 AM

#

@gray galleon :white_check_mark: Your 3.11 eval job has completed with return code 0.

Type help() for interactive help, or help(object) for help about object.

gray galleon Feb 10, 2023, 1:10 AM

#

help is not instantiated in _sitebuiltins smh

#

given that it is almost always a singleton

lone sun Feb 10, 2023, 2:00 AM

#

deep nova I've got a question happening over here, if anyone wants to weigh in https://di...

That thread is closed, so I'll weigh in here.

The structure seems very odd to me. Normally, I think of tokenizers as extracting a single token at a time. Yours seems to follow a pattern where you look for as many three-long tokens as you can, and if you can't find any, then you look for as many two-long tokens as you can, and if you still can't find any, then you look for one-long tokens. Maybe your functions return at most a single token? I can't tell. Regardless, it seems very odd: Ultimately, for each possible token, you want to know whether it appears at the present location (True) or not (False). That boolean structure does not appear very clearly in your code.

A more common pattern for a hand-crafted lexer, I think, is to attempt to match your first token possibility; if it matches, yield it; if not, attempt to match your next token possibility; and so on. This turns tokenization into a simple loop: You just iterate over regular expressions, one for each token, testing each one for a match; when you get a match, you return the matched characters and advance your input pointer by the length of the match.

For an automatically generated lexer, you use the same idea, but you combine all the regular expressions into a single DFA. This is faster (when implemented correctly). (You could replicate this effect in pure Python by taking all the regular expressions you're interested in and combining them with | branches. To find out which token you matched, you examine the match object.)

If your lexer is a more general grammar (like we've discussed here for handling indentation and f-strings) then you'll need a more complicated strategy. But in this case, the right strategy really depends on the complexity of the grammar you're parsing.
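
A minimal sketch of the combined-regex idea, assuming a tiny made-up token set (`TOKEN_SPEC` and the token names are illustrative, not from anyone's actual lexer). Each token gets a named group, the groups are joined with `|`, and `Match.lastgroup` tells you which alternative matched:

```python
import re

# Hypothetical token set; order matters because `|` takes the first alternative that matches.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("OP",     r">>=|>>|>|=|\+"),
    ("SKIP",   r"[ \t]+"),
]
MASTER_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    pos = 0
    while pos < len(text):
        m = MASTER_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":          # drop whitespace, keep everything else
            yield m.lastgroup, m.group()
        pos = m.end()                      # advance the input pointer past the match
```

Something like `list(tokenize("x >>= 42"))` should then give name, operator and number tokens in order.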

gray galleon Feb 10, 2023, 2:12 AM

#

how would you make a DFA in python tho
given that it has no goto

lone sun Feb 10, 2023, 2:15 AM

#

You rely on re to do a good job of that.

#

You could build the state machine yourself—you just assign every state a number and have an outer loop which looks at the number and determines the possible transitions—but it's going to be terribly slow compared to re.

gray galleon Feb 10, 2023, 2:18 AM

#

skywalker's intention is not to use regex i think

lone sun Feb 10, 2023, 2:18 AM

#

Well, it's totally possible. You just end up with a big table.

deep nova Feb 10, 2023, 2:21 AM

#

lone sun That thread is closed, so I'll weigh in here. The structure seems very odd to m...

I can understand how you might have gotten that impression, but in actuality I'm doing what you're describing. Obviously, you need to match tokens whose prefixes are longest first. This way, the token >>= will match before the token >>, which will match before >
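
Roughly, as a sketch with a made-up operator table (`OPERATORS` and `match_operator` are illustrative names, not the real lexer's):

```python
# Hypothetical operator table for illustration
OPERATORS = [">>=", ">>", ">=", ">", "="]
# Sort longest first so ">>=" is tried before ">>" and ">"
BY_LENGTH = sorted(OPERATORS, key=len, reverse=True)

def match_operator(text, pos=0):
    """Return the longest operator starting at pos, or None."""
    for op in BY_LENGTH:
        if text.startswith(op, pos):
            return op
    return None
```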

raven ridge Feb 10, 2023, 2:22 AM

#

lone sun You could build the state machine yourself—you just assign every state a number ...

that'd be a dict mapping state number to state handler function, and each state handler would return the number of the next state to transition to. Or something like that.
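
A rough sketch of that shape, assuming a toy machine that accepts optionally signed integers (the state numbers and handler names are invented for the example):

```python
START, SIGN, DIGITS, REJECT = 0, 1, 2, 3   # hypothetical state numbers
DIGIT_CHARS = "0123456789"

def on_start(ch):
    if ch in "+-":
        return SIGN
    return DIGITS if ch in DIGIT_CHARS else REJECT

def on_sign(ch):
    return DIGITS if ch in DIGIT_CHARS else REJECT

def on_digits(ch):
    return DIGITS if ch in DIGIT_CHARS else REJECT

def on_reject(ch):
    return REJECT                           # dead state: stays rejected

HANDLERS = {START: on_start, SIGN: on_sign, DIGITS: on_digits, REJECT: on_reject}
ACCEPTING = {DIGITS}

def accepts(text):
    state = START
    for ch in text:
        state = HANDLERS[state](ch)         # each handler returns the next state number
    return state in ACCEPTING
```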

gray galleon Feb 10, 2023, 2:23 AM

#

lone sun Well, it's totally possible. You just end up with a big table.

what i had in mind is trampolining lol
which is a technique to optimize tail calls
then you can write each state as a function
and a transition as a tail call
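
something like this sketch, assuming a toy two-state machine that accepts strings with an even number of "a" (`trampoline`, `even` and `odd` are made-up names):

```python
def trampoline(step):
    # Keep bouncing until a state hands back a non-callable final result,
    # so the call stack stays flat no matter how long the input is.
    while callable(step):
        step = step()
    return step

# Hypothetical two-state machine: accept strings with an even number of "a".
def even(text, i=0):
    if i == len(text):
        return True                      # final result: accept
    nxt = odd if text[i] == "a" else even
    return lambda: nxt(text, i + 1)      # the "tail call", returned as a thunk

def odd(text, i=0):
    if i == len(text):
        return False                     # final result: reject
    nxt = even if text[i] == "a" else odd
    return lambda: nxt(text, i + 1)
```

`trampoline(even(some_string))` then runs the machine without hitting the recursion limit, at the cost of one closure per transition.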

deep nova Feb 10, 2023, 2:23 AM

#

I need to yield from each of the "sub tokenizers" because one or two of the methods for creating tokens yield multiple in a single pass (namely, newlines trigger the creation of a newline token as well as indents or dedents). Thus, I've got to either yield multiple in one pass or else cache any tokens beyond the first, and address them in the next loop
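
A simplified sketch of what such a sub-tokenizer might look like, assuming the indentation width of the next line is already known (`newline_tokens`, `indent_stack` and the token tuples are hypothetical, not the actual code):

```python
def newline_tokens(indent_stack, new_width):
    """Yield NEWLINE plus any INDENT/DEDENT tokens for the next line's width."""
    yield ("NEWLINE", "\n")
    if new_width > indent_stack[-1]:
        indent_stack.append(new_width)
        yield ("INDENT", new_width)
    while new_width < indent_stack[-1]:
        indent_stack.pop()               # one DEDENT per closed block
        yield ("DEDENT", new_width)
    if new_width != indent_stack[-1]:
        raise IndentationError("unindent does not match any outer indentation level")
```

The outer loop can then `yield from newline_tokens(stack, width)` and move on, which is the multiple-tokens-per-pass case.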

#

The thread you saw was me asking about a more graceful way of short-circuiting the outer loop upon collecting tokens from one of the inner tokenizers

deep nova Feb 10, 2023, 2:24 AM

#

gray galleon how would you make a DFA in python tho given that it has no `goto`

I actually built a DFA generator

gray galleon Feb 10, 2023, 2:25 AM

#

using yield from?

deep nova Feb 10, 2023, 2:25 AM

#

A few weeks ago. It turned out well, and I'm going to revisit it for the lulz when this hand rolled lexer is done

#

Nooooooo, totally different thing

gray galleon Feb 10, 2023, 2:25 AM

#

gray galleon using `yield from`?

that will create a lot of frames

gray galleon Feb 10, 2023, 2:26 AM

#

deep nova Nooooooo, totally different thing

so how

deep nova Feb 10, 2023, 2:27 AM

#

Lemme just find the code

#

https://hastebin.com/share/apitokuled.py
https://hastebin.com/share/nuzapunoni.py

#

It isn't perfect, but it was at least mostly operational. What you end up with is a big directed graph of states, connected by transitions. Each tick of an outer loop you'd query the next character, and then move from the current state to the next accordingly

#

Long term, though, you'd convert the data structure to an if-else FSM in raw source code
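
Not the hastebin code, just a minimal sketch of the table-driven version of that idea, assuming a toy graph that recognizes names and numbers (`TRANSITIONS`, `classify` and `longest_token` are illustrative names):

```python
# Hypothetical transition table: (state, character class) -> next state
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("ident", "letter"): "ident",
    ("ident", "digit"):  "ident",
    ("start", "digit"):  "number",
    ("number", "digit"): "number",
}
ACCEPTING = {"ident": "NAME", "number": "NUMBER"}

def classify(ch):
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

def longest_token(text, pos=0):
    """Walk the graph one character per tick; return (kind, lexeme) or None."""
    state, end = "start", pos
    while end < len(text):
        nxt = TRANSITIONS.get((state, classify(text[end])))
        if nxt is None:
            break                      # no edge for this character: stop
        state, end = nxt, end + 1
    kind = ACCEPTING.get(state)
    return (kind, text[pos:end]) if kind else None
```

This only works this simply because every non-start state here is accepting; a fuller version would remember the last accepting position and back up to it.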

gray galleon Feb 10, 2023, 2:35 AM

#

so you just simulate a dfa bruh

deep nova Feb 10, 2023, 2:36 AM

#

>_>

#

<_<

#

That depends on your definition. As far as I'm concerned, a DFA is an abstract specification which might be implemented in any number of ways

#

Question

#

About python's handling of escaped newlines

#

def escaped_newline(self) -> str:

    if self.observe(0) != '\r' and self.observe(0) != '\n':
        raise SyntaxError(f"backslashes must be immediately followed by newlines")

    if self.observe(0) == '\r' and self.observe(1) == '\n':
        return self.advance(2)

    if self.observe(0) == '\r':
        return self.advance(1)
        
    if self.observe(0) == '\n':
        return self.advance(1)

#

Does that about cover it? The backslash is already consumed by the time the function is entered. Thus, check to make sure that some kind of newline occurs after it, consume that newline, and otherwise raise an error?

#

Leading whitespace on the next line will just be ignored the same as any other whitespace, and there are no indentation considerations which need to be made?

raven ridge Feb 10, 2023, 2:51 AM

#

what does observe(0) do?

pliant tusk Feb 10, 2023, 2:52 AM

#

@warm breach here is all of the places I have found so far that trigger undefined behavior due to the same issue as callable_iter

class A:
    def __len__(self):
        return 0
    def __getitem__(self, name):
        raise StopIteration

types = [
    ([A],),
    ([list], range(64)),
    ([bytes], 64),
    ([bytearray], 64),
    ([tuple], range(64)),
    ([lambda:(lambda:0), 0],)
]

def do(item):
    (callable, *flag), *initializer = item
    corrupt = iter(callable(*initializer), *flag)
    class Cstr:
        def __hash__(self):
            return hash('iter')
        def __eq__(self, other):
            [*corrupt]
            return other == 'iter'

    builtins = __builtins__.__dict__ if hasattr(__builtins__, '__dict__') else __builtins__
    oiter = builtins['iter']
    del builtins['iter']
    builtins[Cstr()] = oiter
    try:
        print(callable, corrupt.__reduce__())
    except Exception as e:
        print(callable, e)

for typ in types:
    do(typ)

warm breach Feb 10, 2023, 2:53 AM

#

pliant tusk @warm breach here is all of the places I have found so far that trigger...

👍

#

I did this as well

deep nova Feb 10, 2023, 2:53 AM

#

raven ridge what does `observe(0)` do?

Look at the next character

raven ridge Feb 10, 2023, 2:55 AM

#

do you intend to support files with legacy Mac line endings?

deep nova Feb 10, 2023, 2:55 AM

#

I honestly have no clue how line endings work

#

XD I'm just covering my bases

raven ridge Feb 10, 2023, 2:56 AM

#

nothing has used \r as a line ending for over 20 years

deep nova Feb 10, 2023, 2:56 AM

#

HA

#

But \r\n still exists?

vast saffron Feb 10, 2023, 2:56 AM

#

raven ridge nothing has used `\r` as a line ending for over 20 years

but my backwards compatibility! /s

raven ridge Feb 10, 2023, 2:56 AM

#

yeah, Windows uses \r\n, and everything else that still exists uses \n

deep nova Feb 10, 2023, 2:56 AM

#

Gotta love that consistency

raven ridge Feb 10, 2023, 2:56 AM

#

Mac OS 9 and earlier used to use \r

deep nova Feb 10, 2023, 2:57 AM

#

That aside

vast saffron Feb 10, 2023, 2:57 AM

#

https://imgs.xkcd.com/comics/standards.png

deep nova Feb 10, 2023, 2:57 AM

#

With respect to handling escaped newlines

#

I just need to make sure the backslash is followed by a newline and consume it, right? And otherwise throw an error?

raven ridge Feb 10, 2023, 2:59 AM

#

seems reasonable - I can't think of anything else that can come after a \ outside of a string literal

lone sun Feb 10, 2023, 3:53 AM

#

deep nova The thread you saw was me asking about a more graceful way of short-circuiting t...

I think this is still amenable to the kind of structure I described, as long as you make some minor adjustments. Think of each potential token as a pair. One entry of the pair is a regular expression that matches when you find the token. The other entry is sequence that you emit. The point of this division is that it lets you separate the yes/no question of whether you have a match from the question of what to do if you did match. The structure is still a loop over regular expressions (or still a DFA). When you match, you look up the sequence to emit for that token and yield from. Something like:

for tok_re, new_toks in token_data:
    if tok_re.match(input_str):
        ...  # Update internal state
        yield from new_toks
        break
else:
    raise SyntaxError

feral island Feb 10, 2023, 4:54 AM

#

@warm breach I feel the test can go into Lib/test/test_iter.py

warm breach Feb 10, 2023, 5:43 AM

#

pliant tusk @warm breach here is all of the places I have found so far that trigger...

https://github.com/python/cpython/blob/main/Objects/methodobject.c#L176-L184

fallen slateBOT Feb 10, 2023, 5:43 AM

#

Objects/methodobject.c lines 176 to 184

static PyObject *
meth_reduce(PyCFunctionObject *m, PyObject *Py_UNUSED(ignored))
{
    if (m->m_self == NULL || PyModule_Check(m->m_self))
        return PyUnicode_FromString(m->m_ml->ml_name);

    return Py_BuildValue("N(Os)", _PyEval_GetBuiltin(&_Py_ID(getattr)),
                         m->m_self, m->m_ml->ml_name);
}

warm breach Feb 10, 2023, 5:44 AM

#

it looks like builtin functions / PyCFunctionObjects might also be affected

#

but I couldn't reproduce with your example structure

#

oh I guess we'd have to mutate m_self or ml_name in eq

feral island Feb 10, 2023, 5:46 AM

#

I wonder if you could do something even more evil where Py_BuildValue allocates a new tuple -> GC is triggered -> that mutates the object

warm breach Feb 10, 2023, 5:46 AM

#

feral island @warm breach I feel the test can go into `Lib/test/test_iter.py`

should the scope of this be all _reduce methods with pointer access alongside _PyEval_GetBuiltin or only things that could be affected by python code?

warm breach Feb 10, 2023, 5:46 AM

#

fallen slate `Objects/methodobject.c` lines 176 to 184 ```c static PyObject * meth_reduce(PyC...

technically in this one m_self and ml_name shouldn't be mutable

#

but the call order is still UB

feral island Feb 10, 2023, 5:47 AM

#

warm breach should the scope of this be all `_reduce` methods with pointer access alongside ...

hm now that you found the same issue in a bunch of other tests I think it makes more sense to group the tests

warm breach Feb 10, 2023, 5:48 AM

#

I currently am just running https://paste.pythondiscord.com/zakeseguca

#

will try to fit it into test_iter somewhere

feral island Feb 10, 2023, 5:49 AM

#

warm breach but the call order is still UB

I don't think it's UB if there's no way the _PyEval_GetBuiltin call can have a side effect on the other arguments

#

but I feel like it's cleaner to just always do the _PyEval_GetBuiltin call separately so it's more clear the behavior is safe

#

also I think the UB is a bit of a red herring here. It doesn't really matter what order the args are evaluated, what matters is that the _PyEval_GetBuiltin call has a side effect that invalidates the earlier if statement

warm breach Feb 10, 2023, 5:50 AM

#

    (A(),),
    (list(range(64)),),
    (bytes(64),),
    (bytearray(64),),
    (tuple(range(64)),),
    ((lambda: 0), 0),

#

all of these ones are for sure affected with segfaults on 3.11

feral island Feb 10, 2023, 5:50 AM

#

good work

warm breach Feb 10, 2023, 5:51 AM

#

feral island also I think the UB is a bit of a red herring here. It doesn't really matter wha...

well, the unbound thing is what causes the segfault sometimes as opposed to SystemError

feral island Feb 10, 2023, 5:51 AM

#

right, but it's a bug either way

warm breach Feb 10, 2023, 5:51 AM

#

but moving before the if, fixes the systemerror also

#

so there was kind of 2 levels of bug here I guess

raven ridge Feb 10, 2023, 6:05 AM

#

Really just one bug with two possible effects depending on argument evaluation order, I'd say

#

The bug being that _PyEval_GetBuiltin is modifying the object in a way that violates the invariants of the running __reduce__ call.

warm breach Feb 10, 2023, 6:08 AM

#

is there some way to modify __builtins__.__dict__ for the test but not affect the other tests

feral island Feb 10, 2023, 6:09 AM

#

warm breach is there some way to modify `__builtins__.__dict__` for the test but not affect ...

maybe we should just run your test in a subprocess

#

you could also restore the old builtins after the test

raven ridge Feb 10, 2023, 6:10 AM

#

Monkeypatch it in a context manager's __enter__, restore it in __exit__

#

Or a subprocess, but that's way slower and far more overhead...

#

Which adds up, especially if you're adding a bunch of similar tests

warm breach Feb 10, 2023, 6:11 AM

#

raven ridge Monkeypatch it in a context manager's `__enter__`, restore it in `__exit__`

I mean...

#

if it fails it might segfault and crash, not sure how much __exit__ will help

feral island Feb 10, 2023, 6:13 AM

#

we'll count segfaults as test failures 🙂

raven ridge Feb 10, 2023, 6:13 AM

#

It won't, but that's not what you're restoring it for. You're restoring it for the case where the test succeeds, because the fixes are applied, and you need to put things back into a sane state for the next test to run

warm breach Feb 10, 2023, 6:28 AM

#

def test_reduce_mutating_builtins_iter(self):
    # Backup of original iter
    builtins = __builtins__.__dict__ if hasattr(__builtins__, "__dict__") else __builtins__
    orig_iter = builtins["iter"]

    def run(item):
        (fn, *flag), *initializer = item
        corrupt = iter(fn(*initializer), *flag)

        class CustomStr:
            def __hash__(self):
                return hash("iter")
            def __eq__(self, other):
                list(corrupt)
                return other == "iter"

        _iter = builtins["iter"]
        del builtins["iter"]
        builtins[CustomStr()] = _iter

        return corrupt.__reduce__()

    types = [
        ([EmptyIterClass],),
        ([bytes], 8),
        ([bytearray], 8),
        ([tuple], range(8)),
        ([lambda: (lambda: 0), 0],)
    ]

    self.assertEqual(run(([str], "xyz")), (orig_iter, ("xyz",), 0))
    self.assertEqual(run(([list], range(8))), (orig_iter, ([],)))
    for case in types:
        self.assertEqual(run(case), (orig_iter, ((),)))

    # Restore original iter
    del builtins["iter"]
    builtins["iter"] = orig_iter

warm breach Feb 10, 2023, 6:36 AM

#

raven ridge Monkeypatch it in a context manager's `__enter__`, restore it in `__exit__`

I think I'll do this?

try:
    self.assertEqual(run(([str], "xyz")), (orig_iter, ("xyz",), 0))
    self.assertEqual(run(([list], range(8))), (orig_iter, ([],)))
    for case in types:
        self.assertEqual(run(case), (orig_iter, ((),)))
finally:
    # Restore original iter
    del builtins["iter"]
    builtins["iter"] = orig_iter

#

might be simpler than having a context manager

raven ridge Feb 10, 2023, 6:38 AM

#

Sure. It's effectively the same thing, every context manager can be rewritten as a try/finally. But splitting it out into a context manager might let you reduce duplication and copy/pasting between tests
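
A minimal sketch of what that shared helper could look like, assuming a plain swap of one builtin is enough (`patched_builtin` is a hypothetical name, and this does not reproduce the custom-key trick the test uses):

```python
import builtins
from contextlib import contextmanager

@contextmanager
def patched_builtin(name, replacement):
    """Temporarily replace builtins.<name>; always restore the original."""
    original = getattr(builtins, name)
    setattr(builtins, name, replacement)
    try:
        yield original
    finally:
        setattr(builtins, name, original)   # runs even if the body raises
```

Each test would then wrap its body in `with patched_builtin("iter", ...) as orig_iter:` instead of repeating the try/finally.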

warm breach Feb 10, 2023, 6:54 AM

#

!e

x = list[int]

it = iter(x)
print(repr(next(it)))
print(it.__reduce__())

fallen slateBOT Feb 10, 2023, 6:54 AM

#

@warm breach :x: Your 3.11 eval job has completed with return code 1.

001 | *list[int]
002 | Traceback (most recent call last):
003 |   File "<string>", line 5, in <module>
004 | SystemError: NULL object passed to Py_BuildValue

warm breach Feb 10, 2023, 6:54 AM

#

SystemError path is pretty simple to reproduce with this example even

#

hm

#

what should be done about genericaliasobject here? Moving the _PyEval_GetBuiltin will still result in SystemError

static PyObject *
ga_iter_reduce(PyObject *self, PyObject *Py_UNUSED(ignored))
{
    gaiterobject *gi = (gaiterobject *)self;
    return Py_BuildValue("N(O)", _PyEval_GetBuiltin(&_Py_ID(iter)), gi->obj);
}

#

gi->obj is NULL when the iterator is exhausted

#

I'm guessing we need a

if (gi->obj)
    return Py_BuildValue("N(O)", iter, gi->obj);
else
    return Py_BuildValue("N(())", iter);

#

kind of surprised that wasn't there before

deep nova Feb 10, 2023, 7:07 AM

#

Tomorrow is the day

#

That I wrap my mind around whatever the hell python does to handle leading tabs and spaces

deep nova Feb 10, 2023, 8:49 AM

#

Okay, so, walk me through this

#

I'm not sure I'll be able to sleep until I've given this a bit of effort

#

I know python gets grumpy about mixed tabs and spaces, but I know it can at the very least handle tabs followed by spaces

#

And I know it does some kind of normalization, converting every tab to exactly eight spaces

#

Beyond that, hows it all work?

gray galleon Feb 10, 2023, 8:52 AM

#

deep nova That I wrap my mind around whatever the hell python does to handle leading tabs ...

are you trying to parse python?

deep nova Feb 10, 2023, 8:52 AM

#

In the interest of simplicity, I'll just say yes

gray galleon Feb 10, 2023, 8:54 AM

#

you're in for a ride

#

i think you can read the docs

deep nova Feb 10, 2023, 8:55 AM

#

I've read the docs — the lexical analysis document at least. Many times. And I think I understand most of it

#

But I do my best learning by way of Q&A

#

"Tab characters count as one, then round up to the nearest multiple of eight."

#

Wut?

gray galleon Feb 10, 2023, 9:00 AM

#

as an example
3 spaces + 1 tab = 4 characters
those characters are rounded to 8 (because of the tab)
so the final amount of indentation is 8 spaces

#

it will do that for every tab character it encounters

#

thats how i interpret it

deep nova Feb 10, 2023, 9:03 AM

#

def round_to(x):
    # round *up* to the next multiple of eight; round() picks the nearest one
    return 8 * -(-x // 8)

indentation = 0

while char := self.next_char():

    if char == ' ':
        indentation += 1
    
    if char == '\t':
        indentation += 1
        indentation = round_to(indentation)

#

Like this?

gray galleon Feb 10, 2023, 9:06 AM

#

yeah

rose schooner Feb 10, 2023, 9:37 AM

#

deep nova That I wrap my mind around whatever the hell python does to handle leading tabs ...

it keeps 2 numbers: one for total "space length" of the indent and one for keeping track of the number of spaces/tabs used in the indent

#

those have to always be consistent with the top of 2 stacks, one for the "space length" and the other for the number of spaces/tabs
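
As a rough sketch of how I understand it, assuming the second number counts a tab as a single column (that part is my assumption, and `measure` / `check_line` are made-up names, not the tokenizer's):

```python
def measure(prefix):
    """Return (col, altcol) for a run of leading spaces/tabs."""
    col = altcol = 0
    for ch in prefix:
        if ch == " ":
            col += 1
            altcol += 1
        elif ch == "\t":
            col = (col // 8 + 1) * 8   # tab jumps to the next multiple of 8
            altcol += 1                # assumption: the alternate width counts a tab as 1
    return col, altcol

def check_line(stack, altstack, prefix):
    """Update both stacks for one line; return the new nesting depth."""
    col, altcol = measure(prefix)
    if col > stack[-1]:
        if altcol <= altstack[-1]:
            raise TabError("inconsistent use of tabs and spaces in indentation")
        stack.append(col)
        altstack.append(altcol)
        return len(stack) - 1
    while col < stack[-1]:
        stack.pop()
        altstack.pop()
    if col != stack[-1]:
        raise IndentationError("unindent does not match any outer indentation level")
    if altcol != altstack[-1]:
        raise TabError("inconsistent use of tabs and spaces in indentation")
    return len(stack) - 1
```

The point of the second stack is the ambiguous case: eight spaces and one tab both measure 8 on the first number, but differ on the second, so the mix gets rejected.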

deep nova Feb 10, 2023, 9:47 AM

#

Hmmmm

#

I might have to give this problem a little more thought, and employ another method. Apparently Python's handling of whitespace is one of the reasons it can't support multiline lambdas

rose schooner Feb 10, 2023, 9:48 AM

#

deep nova I might have to give this problem a little more thought, and employ another meth...

really?

deep nova Feb 10, 2023, 9:48 AM

#

And that's an absolute must for me (though I intend to go with the much more attractive => syntax)

#

Supposedly, yeah. Something to do with switching of context between whitespace sensitivity and whitespace agnosticism

#

Which sounds like a job for the lazy lexer I've already got planned to handle fstrings, now that I think of it

flat gazelle Feb 10, 2023, 9:50 AM

#

look at nim for a language that manages to have both

deep nova Feb 10, 2023, 9:51 AM

#

flat gazelle look at nim for a language that manages to have both

Siiiiiiick. I'll do that tomorrow

rose schooner Feb 10, 2023, 9:51 AM

#

deep nova I might have to give this problem a little more thought, and employ another meth...

it's totally possible to do multiline lambdas in Python since they're expressions
and expressions ignore indents

deep nova Feb 10, 2023, 9:54 AM

#

I had a feeling. I've always had the impression that the hatred for multiline lambdas has always been more about dogma than anything. Especially now in the era of async (and hence, callbacks as arguments), the extra flexibility is important

rose schooner Feb 10, 2023, 9:54 AM

#

rose schooner it's totally possible to do multiline lambdas in Python since they're expression...

unless indents determine what part is in the lambda and what part isn't

flat gazelle Feb 10, 2023, 9:56 AM

#

a core part of python is the strict separation between statements and expressions, which multiline lambdas very fundamentally cannot be

rose schooner Feb 10, 2023, 9:56 AM

#

deep nova I had a feeling. I've always had the impression that the hatred for multiline la...

lambdas were meant to be for convenience and so were allowed only one expression in their body
anything else should use a def

flat gazelle Feb 10, 2023, 9:56 AM

#

though honestly, that whole idea is pretty simply incorrect, separating the two leads to a worse language

deep nova Feb 10, 2023, 9:56 AM

#

flat gazelle a core part of python is the strict separation between statements and expression...

On account of the fact that it's an expression that contains statements?

flat gazelle Feb 10, 2023, 9:56 AM

#

ye

flat gazelle Feb 10, 2023, 9:57 AM

#

flat gazelle though honestly, that whole idea is pretty simply incorrect, separating the two ...

as evidenced by about every modern language except go.

deep nova Feb 10, 2023, 9:57 AM

#

I had never considered that. That said, it seems perfectly palatable in that you've got an expression with an isolated environment in it.

#

You guys are too smart for your own good. I'll be back tomorrow to soak up some more knowledge

#

XD You know what happens when you leave smart people alone too long? Programming languages.

#

And no good can come from such things

rose schooner Feb 10, 2023, 10:02 AM

#

deep nova XD You know what happens when you leave smart people alone too long? Programming...

i'm trying to make a programming language but i made it too powerful, which meant it required too much work, which caused me to lose motivation because i'm a lazy person
i'm still waiting for the motivation to spike again

radiant garden Feb 10, 2023, 10:02 AM

#

deep nova I might have to give this problem a little more thought, and employ another meth...

It's a design decision. It was decided that an expression can't contain a "block" structure.

#

other langs can do fine without that decision, and easily so if they're "expression-oriented" (i.e. everything is an expression)

deep nova Feb 10, 2023, 10:04 AM

#

That's far too big of a concept for me to grapple with as of right now

#

What I'll say is that I'm glad I took my time with this lexer. Everyone has been "gently encouraging" me to just slap something together and move on to "the more important things"

#

Which isn't exactly an unwise position. But taking the pains to really consider everything has let me take a much longer, deeper view of what I want. I'd have hated to have written a poor lexer, moved on to the parser, and then halfway through realized I needed to scrap it all and start over because I hadn't considered multiline lambdas early on

gray galleon Feb 10, 2023, 12:25 PM

#

radiant garden It's a design decision. It was decided that an expression can't contain a "block...

which is a shame
```py
# won’t work :sob:
btn.on_click = lambda:
    print('hello')
    print('hi')

# eww, yet another named function
def btn_on_click():
    print('hello')
    print('hi')

btn.on_click = btn_on_click
```

rose schooner Feb 10, 2023, 12:27 PM

#

gray galleon which is a shame```py # won’t work :sob: btn.on_click = lambda: print('hello')...

if python multiline lambdas are a thing then it's the wrong thing for any job

#

a def is the "only one obvious way to do it"

#

despite having a name which seems to be a constant problem for a lot of programmers

gray galleon Feb 10, 2023, 12:31 PM

#

rose schooner a `def` is the "only one obvious way to do it"

if that way is suboptimal in these use cases, better have another way

#

if “only one obvious way” is allowed, might as well remove lambda altogether

rose schooner Feb 10, 2023, 12:32 PM

#

gray galleon if “only one obvious way” is allowed, might as well remove `lambda` altogether

that's what guido wants

gray galleon Feb 10, 2023, 12:36 PM

#

and you agree with that unironically

rose schooner Feb 10, 2023, 12:36 PM

#

lambda is pretty convenient for use cases that evaluate and return only one expression
the majority of needed anonymous function uses in python satisfy that requirement

#

a multiline lambda is pretty much too niche and its costs outweigh the benefits and frequency of use

peak spoke Feb 10, 2023, 12:38 PM

#

could allow def btn.on_click(): ...

gray galleon Feb 10, 2023, 12:39 PM

#

rose schooner a multiline lambda is pretty much too niche and its costs outweigh the benefits ...

that is until you write event driven code
and that is a big enough use case

rose schooner Feb 10, 2023, 12:40 PM

#

rose schooner a multiline lambda is pretty much too niche and its costs outweigh the benefits ...

some of the costs include:

special handling for statements in an expression
new grammar
more handling in the compiler
the compiler isn't so easily readable and by extension maintainable (i've experienced it myself)

rose schooner Feb 10, 2023, 12:40 PM

#

peak spoke could allow `def btn.on_click(): ...`

^ this is better

rose schooner Feb 10, 2023, 12:40 PM

#

rose schooner some of the costs include: - special handling for statements in an expression - ...

the tokenizer is even worse and that's where all the indent/dedent handling happens
it's already very hard to explain and adding special cases for statements in expressions will just make it harder to maintain

gray galleon Feb 10, 2023, 12:42 PM

#

peak spoke could allow `def btn.on_click(): ...`

sure
won’t work in the general case
```py
btn.add_event_handler('click', btn_on_click)
```

peak spoke Feb 10, 2023, 12:43 PM

#

I'd imagine frameworks would expose an attribute for the callbacks if it was a possibility, but yes it is somewhat limited as it doesn't solve the lambdas in args

gray galleon Feb 10, 2023, 12:44 PM

#

gray galleon sure won’t work in the general case```py btn.add_event_handler('click', btn_on_c...

ig at that point you can just demand the gui framework to try something else more pythonic?

flat gazelle Feb 10, 2023, 12:44 PM

#

the convention in python for this kinda stuff are decorators

rose schooner Feb 10, 2023, 12:44 PM

#

gray galleon which is a shame```py # won’t work :sob: btn.on_click = lambda: print('hello')...

if you really wanna use a lambda here in this specific case you'll just do
```py
btn.on_click = lambda: (print('hello'), print('hi'))
```

"But how about other cases like this, that, etc?"
that's where you just use `def`
it's easier to just use `def` instead of arguing for something that doesn't make sense cost-wise

#

just use _ as a name

flat gazelle Feb 10, 2023, 12:45 PM

#

yeah, that's easiest

peak spoke Feb 10, 2023, 12:45 PM

#

not being able to do assignment in lambdas has been the most annoying thing for me with event callbacks

rose schooner Feb 10, 2023, 12:46 PM

#

peak spoke not being able to do assignment in lambdas has been the most annoying thing for ...

for every python statement there is an expression equivalent
although you probably won't like the equivalent

peak spoke Feb 10, 2023, 12:47 PM

#

Yes, doing setattr isn't exactly nice
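
A minimal sketch of the expression equivalents mentioned above, assuming a hypothetical `state` object standing in for whatever a callback would mutate. `setattr` replaces the assignment statement, and a tuple sequences several calls inside one lambda:

```python
from types import SimpleNamespace

# Hypothetical state object; a real GUI callback would touch widget state instead.
state = SimpleNamespace(clicks=0)

# Assignment is a statement, so a lambda has to spell it as a setattr() call.
on_click = lambda: setattr(state, "clicks", state.clicks + 1)

# A tuple evaluates its elements left to right, which sequences two calls.
on_double_click = lambda: (on_click(), on_click())

on_click()
on_double_click()
print(state.clicks)
```

The same callbacks written with `def` could use a plain `state.clicks += 1`, which is the trade-off being discussed.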

gray galleon Feb 10, 2023, 12:47 PM

#

flat gazelle the convention in python for this kinda stuff are decorators

like
```py
@btn.event_handler("click")
def _():
    ...
```

flat gazelle Feb 10, 2023, 12:49 PM

#

ye
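
A rough sketch of how a framework could support the decorator convention. `Button`, `add_event_handler`, `event_handler` and `fire` are hypothetical names for illustration only, not the API of any real GUI library:

```python
from collections import defaultdict


class Button:
    """Hypothetical widget that stores callbacks per event name."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def add_event_handler(self, event, callback):
        # The "pass a named function as an argument" style.
        self._handlers[event].append(callback)

    def event_handler(self, event):
        # Decorator factory: registers the function and returns it unchanged.
        def register(func):
            self.add_event_handler(event, func)
            return func
        return register

    def fire(self, event):
        # Call every handler registered for this event, in registration order.
        return [callback() for callback in self._handlers[event]]


btn = Button()


@btn.event_handler("click")
def _():
    return "first handler"


@btn.event_handler("click")
def _():  # rebinding _ is harmless, the button keeps its own reference
    return "second handler"


results = btn.fire("click")
print(results)
```

Using `_` as the name works because the decorator stores the function object at definition time, so the name is never needed again.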

swift imp Feb 10, 2023, 1:00 PM

#

So what exactly are the implications of PEP 649 being accepted?

#

I see it won't break pydantic and the like but for the future, do we think more libraries that support type hints based features are going to pop up?

gray galleon Feb 10, 2023, 1:05 PM

#

gray galleon like```py @btn.event_handler("click") def _(): # ... ```

all of this conversation gave me an idea
how about ruby inspired blocks
```py
btn.event_handler("click") do:
    ...
```

should support all those use cases
most importantly it is cleaner-looking than the current “define a named function then use it as an arg” approach
and does not introduce named functions

gray galleon Feb 10, 2023, 1:06 PM

#

swift imp I see it won't break pydantic and the like but for the future, do we think more ...

people like typing
so demand for them won’t go down

radiant garden Feb 10, 2023, 1:14 PM

#

swift imp So what exactly are the implications of PEP 649 being accepted?

Less maintenance overhead and a cleaner implementation of annotations overall? Not all that much from a consumer perspective. Annotation-handling code using (good practice) typing.get_type_hints() will be unaffected, and code using eval just needs to skip that call. External tools that parse python won't be affected much, as pep 563 support already means forward references in annotations.
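
For context, a small sketch of the `typing.get_type_hints()` practice mentioned above. `Node` is a made-up class; the point is only that the raw annotation of a forward reference is a string, while `get_type_hints()` resolves it to the class:

```python
import typing


class Node:
    """Made-up class that refers to itself in its own annotations."""

    def link(self, other: "Node") -> "Node":
        self.next = other
        return other


# The quoted forward references are stored as plain strings.
raw = Node.link.__annotations__

# get_type_hints() evaluates them against the function's globals.
resolved = typing.get_type_hints(Node.link)

print(raw)
print(resolved)
```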

halcyon trail Feb 10, 2023, 2:38 PM

#

I'd say lambdas being so limited is annoying on a fairly regular basis. Certainly, anytime you want to do something slightly more complicated in a list or dict comprehension, I'd much rather have nice lambdas.
I don't really get the whole python thing of "multi line lambdas are niche".
In other languages that have both good lambdas and nested functions, you still see multiline lambdas used a lot.

#

there's nothing massively different about python that makes that not the case. Just folks looking at a python downside and justifying it, rather than simply admitting that it's a downside.

warm breach Feb 10, 2023, 2:43 PM

#

radiant garden Less maintenance overhead and a cleaner implementation of annotations overall? N...

I wouldn't say unaffected, some code currently writes to the __annotations__ string dict expecting get_type_hints to be affected, and the proposed new future co_annotations import will mean existing code importing future annotations will stop evaluating forward references and likely break when annotations is deprecated (which, we have never deprecated a future import)

gray galleon Feb 10, 2023, 3:00 PM

#

halcyon trail there's nothing massively different about python that makes that not the case. J...

cough cough more complicated parsing

halcyon trail Feb 10, 2023, 3:12 PM

#

gray galleon *cough* *cough* more complicated parsing

sure, that's a downside for implementers though, not users 🙂

#

a valid reason not to have multi-line lambdas

#

but not a valid reason not to admit that it's a downside

proven bramble Feb 10, 2023, 3:29 PM

#

Is there any plan for an alternative forced statically typed mode for 3.12 or 3.13 (where i believe jit will be added ?)

gray galleon Feb 10, 2023, 3:30 PM

#

proven bramble Is there any plan for a alternative forced statically typed mode for 3.12 or 3.1...

no i think

proven bramble Feb 10, 2023, 3:30 PM

#

And will the annotations be ever used in the jit compiler ?

proven bramble Feb 10, 2023, 3:30 PM

#

gray galleon no i think

Ahh okok

gray galleon Feb 10, 2023, 3:30 PM

#

gray galleon no i think

the only way is to use external typechecking tools

proven bramble Feb 10, 2023, 3:31 PM

#

gray galleon the only way is to use external typechecking tools

Yeah I am aware of this
I am just wondering if python will ever use the annotations while jitting code

gray galleon Feb 10, 2023, 3:32 PM

#

python annotation system is pretty inconvenient tbh

proven bramble Feb 10, 2023, 3:33 PM

#

gray galleon python annotation system is pretty inconvenient tbh

Why ?

#

I wish they overhauled it a bit
So that we would be able to add type qualifiers and type modifiers
But it doesn't make sense if they never use it in jit

gray galleon Feb 10, 2023, 3:35 PM

#

proven bramble Why ?

problem with forward references

#

it is also kind of a hack

proven bramble Feb 10, 2023, 3:38 PM

#

gray galleon problem with forward references

can this be fixed with the same syntax ?

gray galleon Feb 10, 2023, 3:39 PM

#

proven bramble can this be fixed with the same syntax ?

yes
pep 649

proven bramble Feb 10, 2023, 3:39 PM

#

you mean the types need to be defined before we use it ? (you cant annotate a method of a class with the class itself, it needs to be a string or requires the import statement)

#

?

proven bramble Feb 10, 2023, 3:39 PM

#

gray galleon yes pep 649

ahh okok I was right then ^

warm breach Feb 10, 2023, 5:53 PM

#

quick question @feral island , are we allowed to use functools.partial in test_iter?

feral island Feb 10, 2023, 5:53 PM

#

warm breach quick question @feral island , are we allowed to use `functools.partial`...

sure, why not?

warm breach Feb 10, 2023, 5:54 PM

#

not sure pithink thought it was supposed to not depend on anything or

#

hm...

NameError: name 'reversed' is not defined
Warning -- Unraisable exception
Exception ignored in: <module 'threading' from '/home/ionite/repos/C/cpython/Lib/threading.py'>
Traceback (most recent call last):
  File "/home/ionite/repos/C/cpython/Lib/threading.py", line 1571, in _shutdown
    for atexit_call in reversed(_threading_atexits):
                       ^^^^^^^^
NameError: name 'reversed' is not defined

feral island Feb 10, 2023, 6:04 PM

#

did you not put it properly back in builtins?

warm breach Feb 10, 2023, 6:05 PM

#

feral island did you not put it properly back in builtins?

ah it was a separate test failure

#

apparently format exception happens before finally and fails there due to the patches

#

wait no it doesn't, I just had a failed del in finally, will fix

warm breach Feb 10, 2023, 6:21 PM

#

feral island sure, why not?

also currently empty reversed iter reduce results in iter instead of reversed, is that intentional?

deep nova Feb 10, 2023, 6:21 PM

#

Hey smart people

#

Question about lexing, parsing, and context sensitivity. I just read this little article here: http://trevorjim.com/python-is-not-context-free/

feral island Feb 10, 2023, 6:23 PM

#

warm breach also currently empty reversed iter reduce results in `iter` instead of `reversed...

I think it's fine since either way it's an empty iterable

#

or iterator rather

deep nova Feb 10, 2023, 6:26 PM

#

It makes the obvious claim that lexing and parsing as Python (and I assume many other languages) is not context free. A lexer shouldn't (puritanically speaking) store any internal state except its current position in the input stream. In practice, who cares. Keeping a stack representing indentation/parentheses is a value add, without any real drawbacks™️

#

But he does raise the point that "we're, technically, using the wrong tools for the job". Alternatively, it might be that we're doing the job inside out in some way. This begs the question: what's the alternative?
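
To make the indentation/parentheses state visible, here is a small sketch using the standard `tokenize` module. The source string is an arbitrary example; the INDENT and DEDENT tokens come from the tokenizer's indentation stack, and the line break inside the parentheses shows up as NL rather than NEWLINE:

```python
import io
import tokenize

# Arbitrary sample: one indented block containing a parenthesized line break.
SOURCE = (
    "if x:\n"
    "    y = (1,\n"
    "            2)\n"
    "z = 3\n"
)

tokens = list(tokenize.generate_tokens(io.StringIO(SOURCE).readline))

# Keep only the layout tokens that depend on the tokenizer's internal state.
layout_kinds = (tokenize.INDENT, tokenize.DEDENT, tokenize.NEWLINE, tokenize.NL)
layout = [tokenize.tok_name[tok.type] for tok in tokens if tok.type in layout_kinds]

print(layout)
```

The extra indentation before `2)` produces no INDENT token because the tokenizer knows it is inside an open parenthesis.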

warm breach Feb 10, 2023, 6:27 PM

#

so essentially

>>> it = iter(reversed([]))
>>>
>>> try:
...    next(it)
... except StopIteration:
...    pass
...
>>> it.__reduce__()
(<built-in function iter>, ([],))
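
A sketch of what that `__reduce__` result means for a pickle round trip. The session above suggests the exhausted iterator comes back as a plain list iterator, but the exact type may depend on the CPython version, so the script only prints the type names:

```python
import pickle

# Exhaust a reversed iterator over an empty list.
it = reversed([])
for _ in it:
    pass

restored = pickle.loads(pickle.dumps(it))
original_type = type(it).__name__
restored_type = type(restored).__name__
print(original_type, "->", restored_type)

# A partially consumed iterator keeps its position across the round trip.
fresh = reversed([1, 2, 3])
next(fresh)  # consumes 3
copy = pickle.loads(pickle.dumps(fresh))
remaining = list(copy)
print(remaining)
```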

raven ridge Feb 10, 2023, 7:57 PM

#

Hm. That would unpickle to an object of a different type. That seems not great...

#

Might be worth filing a bug report for that, too...

feral island Feb 10, 2023, 7:58 PM

#

I feel like the exact iterator type is an implementation detail. You pickle an empty iterator, you get an empty iterator back

raven ridge Feb 10, 2023, 7:59 PM

#

You don't think it's reasonable to
```py
assert type(it) == type(pickle.loads(pickle.dumps(it)))
```

feral island Feb 10, 2023, 8:00 PM

#

That's reasonable, but until the type discrepancy causes a real-world issue I'm not sure it's worth changing. In some cases it may not be practical to create an empty iterator of the same type.

raven ridge Feb 10, 2023, 8:01 PM

#

True, though in this case it is

#

I'm not sure it's worth fixing, but it's probably worth reporting so at least there's a record of it in Google and some documentation about why it wasn't worth fixing

#

Assuming it hasn't been reported already 🙂

warm breach Feb 10, 2023, 8:07 PM

#

raven ridge Might be worth filing a bug report for that, too...

not that it's changed by this PR, it's still like that now

#

just that something even more dangerous happens when builtins dict access mutates reversed

raven ridge Feb 10, 2023, 8:08 PM

#

warm breach not that it's changed by this PR, it's still like that now

Right, that's why it warrants a totally separate bug report

warm breach Feb 10, 2023, 8:09 PM

#

is there something you can do with reversed list iter and not iter

warm breach Feb 10, 2023, 9:49 PM

#

raven ridge Right, that's why it warrants a totally separate bug report

actually there is technically one behavior change for generic alias iter

#

it now also returns a plain iter on __reduce__ of an exhausted one

raven ridge Feb 10, 2023, 9:58 PM

#

Isn't that the expected behavior?

#

!e ```py
print(iter(tuple[()]).__reduce__())
```

fallen slateBOT Feb 10, 2023, 10:01 PM

#

@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<built-in function iter>, (tuple[()],))

raven ridge Feb 10, 2023, 10:01 PM

#

Looks like it always returns a plain iterator, exhausted or not

warm breach Feb 10, 2023, 10:03 PM

#

ah right

#

I guess the only changed one is reversed then?

raven ridge Feb 10, 2023, 10:05 PM

#

That doesn't seem changed, either.

#

!e ```py
it = iter(reversed(""))
list(it)
print(it.__reduce__())
```

fallen slateBOT Feb 10, 2023, 10:06 PM

#

@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<class 'reversed'>, ((),))

raven ridge Feb 10, 2023, 10:06 PM

#

Hm.

#

Oh, weird

#

!e ```py
it = iter(reversed([]))
list(it)
print(it.__reduce__())
```

fallen slateBOT Feb 10, 2023, 10:08 PM

#

@raven ridge :white_check_mark: Your 3.11 eval job has completed with return code 0.

(<built-in function iter>, ([],))

warm breach Feb 10, 2023, 10:08 PM

#

yeah it's only for lists

feral island Feb 10, 2023, 10:08 PM

#

reversed() on a list gives a different type than general reversed() I believe
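
A quick check of that, as a sketch: `list` defines `__reversed__`, so `reversed()` hands back the list's own iterator type, while sequences without it (tuples and strings here) get a generic `reversed` object:

```python
over_list = reversed([1, 2, 3])
over_tuple = reversed((1, 2, 3))
over_str = reversed("abc")

# list provides __reversed__, so the result is its dedicated iterator type.
print(type(over_list).__name__)

# tuple and str fall back to the generic reversed type.
print(type(over_tuple) is reversed, type(over_str) is reversed)

# Both kinds are iterators, so iter() returns the same object.
print(iter(over_list) is over_list, iter(over_tuple) is over_tuple)
```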

raven ridge Feb 10, 2023, 10:08 PM

#

Strange. That seems very odd

raven ridge Feb 10, 2023, 10:08 PM

#

feral island reversed() on a list gives a different type than general reversed() I believe

Ah. Ok, that makes sense

raven ridge Feb 10, 2023, 10:09 PM

#

warm breach I guess the only changed one is `reversed` then?

Well, then - this is the existing behavior for pickling an exhausted reverse list iterator, right?

warm breach Feb 10, 2023, 10:09 PM

#

static PyObject *
reversed_reduce(reversedobject *ro, PyObject *Py_UNUSED(ignored))
{
    if (ro->seq)
        return Py_BuildValue("O(O)n", Py_TYPE(ro), ro->seq, ro->index);
    else
        return Py_BuildValue("O(())", Py_TYPE(ro));
}

#

seems reversed reduce calls Py_TYPE instead of builtins dict access

feral island Feb 10, 2023, 10:10 PM

#

we care about the reduce for iter(reversed) though, right?

raven ridge Feb 10, 2023, 10:11 PM

#

warm breach seems reversed reduce calls `Py_TYPE` instead of builtins dict access

Regardless, calling that at a different point (before the Py_BuildValue call) doesn't result in a behavior change

warm breach Feb 10, 2023, 10:11 PM

#

feral island we care about the reduce for iter(reversed) though, right?

iter(reversed(...)) returns itself it seems

#

for anything besides list

grave jolt Feb 10, 2023, 10:11 PM

#

rose schooner a `def` is the "only one obvious way to do it"

I think the one-obvious-way doesn't really work in practice...

raven ridge Feb 10, 2023, 10:11 PM

#

It doesn't matter that it's already kind of weird, we can fix the bug without worrying about its other weirdness

feral island Feb 10, 2023, 10:12 PM

#

incidentally why is reversed() in enumobject.c of all places

grave jolt Feb 10, 2023, 10:13 PM

#

grave jolt I think the one-obvious-way doesn't really work in practice...

Because

1. Languages and language features evolve over time. What's obvious today is not obvious tomorrow. That's why we have 4 ways of formatting strings, not 1.
2. That kind of assumes the language creator has thought about all possible use cases of all the users.

#

It's pretty clear from modern usage that sometimes people prefer lambdas over def'd functions.

#

You could go all the way in on the one-obvious-way philosophy and remove lambda. But that would make a lot of existing code extremely verbose, with function names that don't add any meaning

#

It's okay to offer options 🙂 and it's true that there is such a thing as too many options.

deep nova Feb 10, 2023, 10:19 PM

#

Good points all around

#

At the end of the day, though

#

Multiline lambdas are nice. They're useful, they're pretty, and people want them

#

I want them, lots of other people want them

#

Purity be damned, that's what I'm going to give them

deep nova Feb 10, 2023, 10:21 PM

#

grave jolt Because 1. Languages and language features evolve over time. What's obvious toda...

As an aside, we don't have four ways of formatting strings. We have two ways — fstrings and .format(). We also have two atrophied, wholly intolerable vestigial means of formatting that don't bear thinking about

grave jolt Feb 10, 2023, 10:22 PM

#

deep nova As an aside, we don't have four ways of formatting strings. We have two ways — f...

string concatenation will be forever in my heart

#

also, %-formatting is used in logging 🙂 though the utility is questionable

feral island Feb 10, 2023, 10:24 PM

#

I think #4 was meant to be string.Template?

grave jolt Feb 10, 2023, 10:24 PM

#

replace my 4 with 5 then 🙂

deep nova Feb 10, 2023, 10:24 PM

#

Five XD

grave jolt Feb 10, 2023, 10:24 PM

#

and then add number 6 standing for all the templating engines...

grave jolt Feb 10, 2023, 10:25 PM

#

deep nova As an aside, we don't have four ways of formatting strings. We have two ways — f...

I do use string concatenation occasionally. Suppose you want to implement the repr for a tuple. I would do it like this:

```py
def __repr__(self) -> str:
    return "(" + ", ".join(map(repr, self)) + ")"
```
(let's just ignore recursive repr handling...)

deep nova Feb 10, 2023, 10:25 PM

#

Hehe, I'm not sure I'd count straight up concatenation as a method of formatting

#

But yeah, sometimes it's the only tool for the job

feral island Feb 10, 2023, 10:26 PM

#

%-formatting is useful for creating binary strings. I also used it recently for generating TypeScript code so I wouldn't have to keep writing {{
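
A short sketch of both uses. The TypeScript snippet and the `len`/`answer` names are made up for illustration:

```python
# bytes supports %-formatting but has no .format() method or f-string form.
header = b"%s:%d" % (b"len", 42)

# With %, literal braces in generated code need no escaping.
ts_percent = "export function %s() { return %d; }" % ("answer", 42)

# With str.format, every literal brace has to be doubled.
ts_format = "export function {}() {{ return {}; }}".format("answer", 42)

print(header)
print(ts_percent)
```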

grave jolt Feb 10, 2023, 10:27 PM

#

These are the alternatives, it seems
```py
def __repr__(self) -> str:
    return "({0})".format(", ".join(map(repr, self)))

def __repr__(self) -> str:
    # the right-hand side must be a one-element tuple; a list would be formatted as a list
    return "(%s)" % (", ".join(map(repr, self)),)

def __repr__(self) -> str:
    return f"({', '.join(map(repr, self))})"  # yuck

def __repr__(self) -> str:
    amogus = ", ".join(map(repr, self))
    return f"({amogus})"
```

grave jolt Feb 10, 2023, 10:27 PM

#

feral island %-formatting is useful for creating binary strings. I also used it recently for ...

Yeah I've definitely seen it used with code generation when you have {}s

grave jolt Feb 10, 2023, 10:29 PM

#

grave jolt I do use string concatenation occasionally. Suppose you want to implement the `r...

Oh, there's also this
```py
def __repr__(self) -> str:
    return "".join(["(", ", ".join(map(repr, self)), ")"])
```

grave jolt Feb 10, 2023, 10:29 PM

#

grave jolt and then add number 6 standing for all the templating engines...

Seems like the counter is 7 now

halcyon trail Feb 10, 2023, 10:30 PM

#

it's too bad that logging uses %, at least by default

#

ideally, what you really want is to pass logging functions lambdas that return a string, rather than actual strings. then the logging framework decides whether to evaluate them.

#

Then you could just use f-strings for logging and still be perfectly efficient
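
One way to approximate this with today's `logging`, as a sketch. `Lazy` and `ListHandler` are hypothetical helpers, not part of the standard library; the trick relies on `logging` only applying `%s` to its arguments when a record is actually formatted:

```python
import logging


class Lazy:
    """Hypothetical wrapper: calls the function only when str() is applied."""

    def __init__(self, func):
        self.func = func

    def __str__(self):
        return str(self.func())


class ListHandler(logging.Handler):
    """Hypothetical handler that collects formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


evaluations = []


def expensive():
    evaluations.append("called")
    return "payload"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.WARNING)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# Below the logger's level: the record is never built, expensive() never runs.
logger.debug("value: %s", Lazy(lambda: f"{expensive()}"))

# At or above the level: the handler formats the record, which calls expensive().
logger.warning("value: %s", Lazy(lambda: f"{expensive()}"))

print(handler.messages, len(evaluations))
```

Each call still pays for creating the lambda and the wrapper, which is the cost raised in the reply that follows.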

#

lambdas again 😉

grave jolt Feb 10, 2023, 10:31 PM

#

also lambda is such an awkward keyword

feral island Feb 10, 2023, 10:31 PM

#

assuming that the cost of creating a lambda doesn't exceed the cost of creating an f-string

grave jolt Feb 10, 2023, 10:31 PM

#

grave jolt also `lambda` is such an awkward keyword

especially for a language that tries to emphasize readability, in theory

#internals-and-peps
