#internals-and-peps

1 messages · Page 15 of 1

rose schooner
#

wdym by "one less line event in code that does nothing"?

quick snow
rose schooner
#

3.10+ it prints a line event thrice

quick snow
#

Oh, right. In my original code it was 2 vs 1.

feral island
#

@pliant tusk I think I fixed my releasebuffer problems now. The issue was that it became impossible to actually call the C bf_releasebuffer slot on classes that override it with a Python __release_buffer__. Now when we call the slot from C code, we invoke both the Python method and any C slot in a base class, to ensure the C slot is called correctly and only when required.

spark magnet
quick snow
pliant tusk
feral island
#

thanks!

naive saddle
fallen slateBOT
#

Lib/typing.py line 3585

```py
No runtime warning is issued. The decorator sets the ``__deprecated__``
```
feral island
spark magnet
quick snow
spark magnet
coarse chasm
#

Damn i love decorators, but holy sh*t they're way too difficult sometimes.
Only took me about 5 hours to get a decorator working that could wrap around a function with parameters and had 2 parameters of its own.
Basically i wanted a decorator that had its own parameters, could wrap around a function with parameters so it would still be able to use those parameters on call.
So i came up with this:
https://pastebin.com/XiiauQMK
This is probably a really bad way to implement it, as i haven't got that much experience with decorators yet.
I would have loved it if i could have worked out a way to access the object the decorator is wrapping that isn't also the first parameter.
Again, i assume there is already such a thing, and i just didn't find it. But just in case there isn't: might i suggest adding a property to decorators constructed from a class instead of a function? Just an exposed property that holds the object you've wrapped, so that the decorator's parameters are still left open.
I'm sure it is technically possible to find ways other than a special property to get the same results as i wanted, but i genuinely couldn't find them.

sour thistle
#

the way you did it is fairly standard as far as I can tell?

#

you can take a look at how dataclasses.dataclass is implemented, and/or some of discord.py's decorators
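On the wish for a property that holds the wrapped object: `functools.wraps` already stores the original function on the wrapper as `__wrapped__`. A minimal sketch, assuming a function-based decorator; `tagged`, `prefix` and `suffix` are made-up names for illustration and are not from the pastebin:

```py
from functools import wraps

def tagged(prefix, suffix):  # hypothetical decorator with two parameters of its own
    def actual_decorator(func):
        @wraps(func)  # copies metadata and sets wrapper.__wrapped__ = func
        def wrapper(*args, **kwargs):
            return f"{prefix}{func(*args, **kwargs)}{suffix}"
        return wrapper
    return actual_decorator

@tagged("<", ">")
def greet(name):
    return f"hello {name}"

decorated = greet("world")                # goes through the wrapper
undecorated = greet.__wrapped__("world")  # reaches the original function
```

The decorator's own parameters stay free, and the wrapped object is still reachable from outside without being passed around explicitly.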

teal flint
#

Does anyone know of proposals that try to address the problem of dependencies with conflicting subdependencies? Like you depend on module A and module B but they both depend on different versions of module C? A naive solution could be to automatically rewrite module A and B to have a private module C. I imagine solutions like that fall apart quickly?

dusk comet
# coarse chasm Damn i love decorators, but holy sh*t they're way too difficult sometimes. Only ...

you can do something like:
```py
from functools import wraps

def decorator(func=None, *, boolean_1=False, boolean_2=False):
    def actual_decorator(func):
        @wraps(func)
        def wrapper(*func_args, **func_kwargs):
            print(boolean_1, boolean_2)
            result = func(*func_args, **func_kwargs)
            return result
        return wrapper

    # Called as @decorator(...): no function yet, so return the real decorator.
    if func is None:
        return actual_decorator
    # Called as bare @decorator: the function was passed in directly.
    return actual_decorator(func)
```
```py
@decorator
# or
@decorator()
# or
@decorator(boolean_1=True)
# or
@decorator(boolean_1=True, boolean_2=True)

# not like this:
@decorator(True)
```
lethal nest
raven ridge
#

I think doing this for pure Python modules is actually much trickier than doing it for C modules, because those Python modules do return objects that came from the library dependency. If you've got two libraries that want to pull in two different versions of pandas, let's say, and both of those libraries return dataframes to the user, then the user can get access to dataframes created by two different versions of pandas, and won't be able to use isinstance(obj, pandas.DataFrame) on them both (that'll be false for at least one of them, if not both), and won't necessarily be able to call the same methods on both (the newer version of the library might have added or removed methods), etc.

#

vendoring shared library dependencies for C extension modules the way auditwheel repair does is basically safe because you can't take an object created in a non-Python C library and return it to the Python caller of your library. With Python libraries, you can, and that'll cause you a ton of trouble.

teal flint
#

in the situation of Blender, the plugins can only interact with Blender apis and can't pass in outside module types so I think this works out in this situation

raven ridge
#

well, that might be a reason why you won't find anyone else who's done this, and will need to build your own home-grown solution, at least.

#

if you have enough control over the environment where these plugins are imported, you might be able to play a trick like manipulating sys.path and sys.modules before importing a plugin, so that it has its own private set of "what modules have been imported" as well as its own private set of "what directories do I look for modules that haven't yet been imported in"

teal flint
#

that one's tricky because modules can dynamically load dependencies at unknown times, right?

raven ridge
#

yes - that's not super common, but it is possible. But perhaps you could also mess with sys.path and sys.modules before calling from your core into any function defined by a module?
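A rough sketch of the `sys.path` / `sys.modules` trick described above, assuming single-threaded use and that the host wraps every call into a plugin. `plugin_environment` and `make_dep` are hypothetical helpers written for this illustration, not an existing API:

```py
import contextlib
import importlib
import sys
import tempfile
from pathlib import Path

@contextlib.contextmanager
def plugin_environment(extra_path, private_modules):
    """Temporarily give a plugin its own import path and module cache."""
    saved_path = list(sys.path)
    saved_modules = dict(sys.modules)
    sys.path.insert(0, str(extra_path))
    sys.modules.update(private_modules)  # re-install what this plugin imported earlier
    try:
        yield
    finally:
        # Move anything the plugin imported or replaced into its private cache,
        # then restore the shared state.
        for name in list(sys.modules):
            if saved_modules.get(name) is not sys.modules[name]:
                private_modules[name] = sys.modules.pop(name)
        sys.modules.update(saved_modules)
        sys.path[:] = saved_path

def make_dep(directory, version):
    # Stand-in for two conflicting versions of the same dependency.
    Path(directory, "dep.py").write_text(f"VERSION = {version!r}\n")

with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
    make_dep(dir_a, "1.0")
    make_dep(dir_b, "2.0")
    modules_a, modules_b = {}, {}
    with plugin_environment(dir_a, modules_a):
        version_a = importlib.import_module("dep").VERSION
    with plugin_environment(dir_b, modules_b):
        version_b = importlib.import_module("dep").VERSION
```

Each plugin sees its own `dep`, and nothing named `dep` is left in the shared `sys.modules` afterwards. Objects that escape a plugin still carry the isinstance problems mentioned earlier in the conversation.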

teal flint
#

ah right

raven ridge
#

I'd be remiss if I didn't point out: the simple and obvious solution here is subprocesses. Run each plugin in its own dedicated Python process, where it has its own sys.path and sys.modules. That comes with the cost of needing to serialize data that needs to be shuffled back and forth to the "controller" process, but gives you isolation basically for free.
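A minimal sketch of the subprocess approach, assuming JSON is good enough for the data being shuffled back and forth. `PLUGIN_SOURCE` and `call_plugin` are made up for the example; a real plugin would be a module installed in its own environment:

```py
import json
import subprocess
import sys

# Stand-in for a plugin entry point that reads a request and writes a response.
PLUGIN_SOURCE = """
import json, sys
request = json.load(sys.stdin)
json.dump({"total": sum(request["values"])}, sys.stdout)
"""

def call_plugin(python_executable, request):
    """Run the plugin in its own process and exchange JSON over stdin/stdout."""
    completed = subprocess.run(
        [python_executable, "-c", PLUGIN_SOURCE],
        input=json.dumps(request),
        capture_output=True,
        text=True,
        check=True,  # raise if the plugin process exits with an error
    )
    return json.loads(completed.stdout)

response = call_plugin(sys.executable, {"values": [1, 2, 3]})
```

Passing a different interpreter path as `python_executable` is where a per-plugin environment would plug in.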

teal flint
#

on IRC someone suggested subinterpreters in the same vein

raven ridge
#

yeah, that works as well, with basically the same caveats - you need to serialize data to get it between interpreters, or to whatever is controlling and coordinating all of the interpreters

lethal nest
raven ridge
#

you'd just have different venvs

#

instead of one site-packages directory with two versions of numpy installed under different names, you'd have two site-packages directories that each have their own numpy

lethal nest
#

for each subprocess?

raven ridge
#

yeah, that's what the sys.path manipulation I was talking about above would be doing.

lethal nest
#

oh right yeah that makes sense

teal flint
lethal nest
raven ridge
#

you could make it work by just spawning a new copy of the same interpreter and then messing with sys.path before importing the module, too.

#

but just using different venvs with different interpreters is the easy mode solution. Heck, that could potentially even let different plugins require different versions of Python.

coarse chasm
teal flint
dusk comet
coarse chasm
#

Again, i knew about this, but i'm just wayy too finicky about this kind of stuff. sorry

#

it's all good though, i found a way that works for me eventually

lethal nest
teal flint
#

thanks!

raven ridge
lethal nest
dusk comet
#

Is it possible to send object from one interpreter to another?
For example, one interpreter creates some object (with no references to it), returns it to C level, and then it is passed to other interpreter.

raven ridge
#

no

lethal nest
#

the C level is isolated by interpreter (not subinterpreter though) right?

raven ridge
#

I don't think I understand what you mean by that

lethal nest
#

couldn't you pass objects between subinterpreters if they have the same parent interpreter?

raven ridge
#

you can't pass objects between subinterpreters at all

#

each interpreter has its own set of objects, you can't move an object from one interpreter to another

lethal nest
#

the way I interpreted the original question was you would DECREF in one interpreter, convert to a C object, then send that object to another interpreter

#

like using PyInterpreterState_GetDict

dusk comet
#

I think almost every nontrivial object is tracked somewhere inside the garbage collector, and you cannot extract it from there.
Also, the object is stored inside an arena, which can cause problems if another interpreter tries to deallocate that arena

lone sun
#

I think if you want to send objects between subinterpreters, you should treat it as a form of interprocess communication. It's more difficult and expensive than just giving one subinterpreter a pointer to the other's object. But that's by design; it keeps the subinterpreters isolated from each other so that they don't need to share a mutex.

teal flint
#

in my use case (Blender with multiple add-ons), I don't think anything has to move in between the subinterpreters. Just between the main process and individual subinterpreters

#

I guess the same point still applies

raven ridge
#

yep, it does

pliant tusk
#

@feral island I just put a comment on the pull request for the fix from last night but i found an issue where the buffer passed into release_buffer is not properly tracked

#

so you can use it to access the memory of the exported buffer even after that memory is freed

#
```py
>>> class B(bytearray):
...     def __release_buffer__(self, buffer):
...         B.leak = self.clear() or bytearray()
...         B.backing = buffer
... 
>>> b = B(bytearray.__basicsize__)
>>> m = memoryview(b)
>>> m.release()
>>> B.leak
bytearray(b'')
>>> B.backing
<memory at 0x109d1f700>
>>> B.backing.cast('P').tolist()
[1, 4454897440, 0, 0, 0, 0, 0]
>>> B.backing.cast('P')[2] = -1
>>> len(B.leak)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: <built-in function len> returned NULL without setting an exception
>>> 
```
#

probably need to track down where this view is being exported and ensure that its ownership is tracked properly from the beginning

feral island
#

thanks, will take a look

fallen slateBOT
#

Objects/typeobject.c line 9068

```c
Py_DECREF(mv);
```
pliant tusk
feral island
#

right, I may need to swap the order of calling the Python and C release slots

pliant tusk
#

I think this would still be an issue due to the memoryview not being owned properly.

#

Inside of release_buffer you could reexport that buffer with memoryview.cast or similar and then even an explicit release wouldn't be enough

feral island
pliant tusk
feral island
#

I don't think so. We'd have to "resurrect" the buffer, which I don't think the buffer protocol generally allows

feral island
#

at various places in C code

pliant tusk
#
  • i meant the buffer that is passed into release_buffer
feral island
#

restricted mode wasn't that hard to implement actually either

pliant tusk
#

trying to figure out why that view doesnt have proper ownership

feral island
#

it's created from PyMemoryView_FromBuffer

#

all we have is a Py_buffer*

pliant tusk
feral island
#

maybe alternatively we could do like PyMemoryView_FromObject(buffer->obj), but that feels like it's opening an even bigger can of worms

#

because then we're requesting a new buffer just to release the buffer

feral island
pliant tusk
#

yea thats where the buffer being passed into release is made right?

feral island
#

if you have a Python __buffer__, yes

#

hm maybe I need to test that branch more

pliant tusk
#

i am trying to find the line of code that is resulting in an un-owned memoryview being passed into __release_buffer__ in this code: #internals-and-peps message

feral island
#

look at releasebuffer_call_python

fallen slateBOT
#

Objects/typeobject.c line 9119

```c
mv = PyMemoryView_FromBuffer(buffer);
```
pliant tusk
#

it almost feels like to do this elegantly you would need to redesign the buffer protocol

feral island
#

not something I'm signing up for 🙂

pliant tusk
#

fair enough, i'm just wondering if it would be more worthwhile moving forwards to do that before implementing this PEP

feral island
#

I think the "restricted memoryview" solution isn't too bad. It's not extremely invasive, and we can always relax the rules later if someone has a use case and a way to do it safely

#

I'm not sure what use cases people would have for overriding __release_buffer__ in Python anyway

#

my motivation is just to make buffers representable in the type system

pliant tusk
#

i have the beginnings of an idea on how to implement this pep with less of the memory risks we are hitting now.
implement a new type ownedmemoryview that has an extra property owner which is the python exporter, and obj would point to the object from the memoryview returned by __buffer__. It would implement a bf_getbufferproc that creates Py_Buffer* that has its obj member set to the ownedmemoryview instance. then in slot_bf_releasebuffer you can check if the type of PyBuffer->obj is ownedmemoryview and call the respective __release_buffer__ with the values stored on the ownedmemoryview instance. then, regardless afterwards you can call the bf_releasebuffer of the obj properly.

feral island
#

isn't that basically what buffer_wrapper does?

#

the trouble is that sometimes things don't go through the __buffer__ codepath, so we can't make a wrapper

#

e.g. if you only override __release_buffer__

pliant tusk
#

but you should still have PyBuffer->obj to make a proper owned memoryview right?

feral island
#

I don't understand what you mean

pliant tusk
feral island
#

the current implementation has that too. but I think to make a "proper" memoryview you'd need to re-call the bf_getbuffer slot

pliant tusk
#

might need to do that to ensure safety in all of the edgecases

#

because for example, even with a restricted memoryview you could still clear the underlying buffer because the exporter (bytearray) does not know it exists

#

( I assume, I do not know how you implemented the restricted memoryview )

feral island
#

this call happens before we call bf_releasebuffer on the underlying object, so it knows it still has a reference active

pliant tusk
#

ah ok

#

ill pull the new commit and look at it

pliant tusk
feral island
sour thistle
fallen slateBOT
#

Lib/socketserver.py lines 450 to 452

```py
def __init__(self, server_address, RequestHandlerClass, bind_and_activate=True):
    """Constructor.  May be extended, do not override."""
    BaseServer.__init__(self, server_address, RequestHandlerClass)
```
raven ridge
#

there was a point in time before super() existed, where explicitly calling your base class's __init__ was the thing to do
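For comparison, a small sketch of the two spellings side by side. The class names are invented for the example and only loosely mirror the `socketserver` constructor quoted above:

```py
class Base:
    def __init__(self, address):
        self.address = address

class ExplicitCall(Base):
    def __init__(self, address, handler):
        Base.__init__(self, address)  # name the base class explicitly
        self.handler = handler

class SuperCall(Base):
    def __init__(self, address, handler):
        super().__init__(address)  # zero-argument super(), Python 3 only
        self.handler = handler

explicit = ExplicitCall(("localhost", 0), "handler")
via_super = SuperCall(("localhost", 0), "handler")
```

With single inheritance both end up calling the same `Base.__init__`; the difference only shows with cooperative multiple inheritance, where `super()` follows the MRO.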

rare timber
#

Call supers constructor at least I wasn't paying attention

willow pewter
#

when I run code as a module (with -m), why is the type of __builtins__ dict, and when run as a file its type is module?
```
(py311_venv) E:\co\tpy>py -m tpy
<class 'dict'>
...
(py311_venv) E:\co\tpy>py tpy/tpy.py
<class 'module'>
...
(py311_venv) E:\co\tpy>
```
structure:
```
E:\co\
    ...
    tpy\
        ...
        tpy\
            __init__.py
            tpy.py
            __main__.py
        ...
```
`__init__.py`:
```py
from .tpy import main

__all__ = ['main']
__version__ = '1.8'
```
`__main__.py`:
```py
from . import tpy
tpy.main()
```

#

it becomes annoying when doing some esoteric stuff (code doesn't run everywhere), and I am making a REPL and I want __builtins__ to be a module

quick snow
#

Probably just an implementation detail. You're not supposed to touch __builtins__ anyways.

willow pewter
spark magnet
willow pewter
spark magnet
willow pewter
#

and for running this code

spark magnet
#

... and there is madness

willow pewter
willow pewter
#

and that's the only reason I want __builtins__ as module

#

is there anything I can do?

spark magnet
#

but i recommend not driving yourself crazy like this

quick snow
#

import builtins as __builtins__?

willow pewter
quick snow
#

I think that should work, yes.

willow pewter
#

let me try

#

seems to work

#

here's the namespace now, is it ok?

#
```py
namespace: dict = {'__builtins__': __builtins__, '__name__': '__main__', '__doc__': 'Automatically created module for TPython interactive environment', '__package__': None, '__loader__': None, '__spec__': None, '__annotations__': {}}
namespace_local: dict = {'__builtins__': __builtins__, '__name__': '__main__', '__doc__': 'Automatically created module for TPython interactive environment', '__package__': None, '__loader__': None, '__spec__': None, '__annotations__': {}}
```
#

goofy ahh executor function
```py
try:
    eval_return = eval(code, namespace, namespace_local)
    if eval_return != None:
        print(repr(eval_return))
    err = False
except:
    run = True
if run:
    try:
        exec(code, namespace, namespace_local)
        err = False
    except Exception:
        exc()
        err = True
```

quick snow
#

If by "ok" you mean "behaves like a normal REPL", then no: globals and locals should be the same dict.
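A small sketch of why the two dicts matter. With separate globals and locals, `exec` runs the code roughly as if it were a class body, so top-level definitions land in the locals dict while functions look names up in the globals dict. `SNIPPET` is just an illustrative input:

```py
SNIPPET = """
def a():
    return 1

def b():
    return a()

result = b()
"""

# One dict for both globals and locals: behaves like module-level code.
shared = {}
exec(SNIPPET, shared, shared)

# Separate dicts: `a` is stored in the locals dict, but `b` looks it up in globals.
separate_globals, separate_locals = {}, {}
try:
    exec(SNIPPET, separate_globals, separate_locals)
    outcome = "ok"
except NameError as error:
    outcome = f"NameError: {error}"
```

In the shared case `result` is computed normally; in the separate case the call to `b()` fails because `a` is not a global.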

willow pewter
quick snow
#
```py
In [1]: globals() is locals()
Out[1]: True
```
willow pewter
#

ok

#

done

#
```py
try:
    eval_return = eval(code, namespace)
    if eval_return != None:
        print(repr(eval_return))
    err = False
except:
    run = True
if run:
    try:
        exec(code, namespace)
        err = False
    except Exception:
        exc()
        err = True
```
```py
namespace: dict = {'__builtins__': __builtins__, '__name__': '__main__', '__doc__': 'Automatically created module for TPython interactive environment', '__package__': None, '__loader__': None, '__spec__': None, '__annotations__': {}}
```
#
```
[1]-> globals() is locals()
True
[2]->
```
quick snow
#

You still need to do exec(code, namespace, namespace), otherwise it inherits the globals of your REPL's code, right?

willow pewter
#

though I never understand how locals() works

#
```py
>>> def a():
...     b=1
...     return locals()
...
>>> def b():
...     return locals()
...
>>> a()
{'b': 1}
>>> b()
{}
>>> locals()
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'a': <function a at 0x00000215109004A0>, 'b': <function b at 0x0000021510E572E0>}
>>> 
```
willow pewter
#
```
[2]-> def a():
[:]->   b=1
[:]->   return locals()
[:]->
[:]->
[3]-> def b():
[:]->   return locals()
[:]->
[:]->
[4]-> a()
{'b': 1}
[5]-> b()
{}
[6]-> locals()
{'__builtins__': <module 'builtins' (built-in)>, '__name__': '__main__', '__doc__': 'Automatically created module for TPython interactive environment', '__package__': None, '__loader__': None, '__spec__': None, '__annotations__': {}, 'a': <function a at 0x0000020F10D26B60>, 'b': <function b at 0x0000020F10D26E80>}
[7]->
```
#

the behavior is the same but ¯\_(ツ)_/¯

quick snow
#

Right, but what's in globals() afterwards?

#

Ah, I guess it doesn't happen if you have __builtins__ set already

willow pewter
#
```
[7]-> globals()
{'__builtins__': <module 'builtins' (built-in)>, '__name__': '__main__', '__doc__': 'Automatically created module for TPython interactive environment', '__package__': None, '__loader__': None, '__spec__': None, '__annotations__': {}, 'a': <function a at 0x0000020F10D26B60>, 'b': <function b at 0x0000020F10D26E80>}
[8]->
>>> globals()
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'a': <function a at 0x00000215109004A0>, 'b': <function b at 0x0000021510E572E0>}
>>> 
```
feral island
#

exec() and eval() set the __builtins__ key in your globals dict if it's not already there I think
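This is easy to check. A minimal sketch; whether the inserted value is the `builtins` module or its dict is an implementation detail, so only the presence of the key is checked here:

```py
import builtins

namespace = {}
exec("x = 1", namespace)
inserted = "__builtins__" in namespace  # exec added the key because it was missing

preset = {"__builtins__": builtins}  # key supplied up front, as a module object
exec("x = 1", preset)
kept_module = preset["__builtins__"] is builtins  # exec left the supplied value alone
```

So a REPL that wants `__builtins__` to be the module can put it in the namespace before the first `exec` or `eval` call.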

willow pewter
#

and that's a problem

quick snow
willow pewter
#

and everything seems to work fine

#
```
----------------------------------------------------------------------
AttributeError in Cell 9             Traceback (most recent call last)

AttributeError: module 'builtins' has no attribute '__getitem__'
[10]->
```
:pithink:
#

that's unexpected

feral island
#

what code is that running?

willow pewter
#

also import builtins as __builtins__

willow pewter
#

ohh you were asking what code it executed

#

i just realized

willow pewter
willow pewter
#

nvm

deft ruin
#

Hi, I have a question. I'm running this code, which behaves weirdly. It crashes and the crash is impossible to catch:

```py
from itertools import chain

foo = []
for _ in range(10**6):
    foo = chain([], foo)

try:
    next(foo)
    print("No Error")
except Exception as e:
    print("Got Error: ", e)

# Process crashes, nothing is printed!
```

Presumably because the error occurs inside the underlying C call. My question is, is this intended design (ie. known about and not going to be changed), or a bug? And if it is a bug, is it reported somewhere already?

flat gazelle
#

I believe these are considered bugs. chain doesn't check for recursion error, which is probably what's happening here.
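Not a fix for the crash itself, but a sketch of one way to sidestep it, assuming you control how the iterables get combined: collect the pieces in a list and flatten once with `chain.from_iterable`, so there is a single `chain` object instead of a million nested ones. The `"payload"` item is only there so the result is observable:

```py
from itertools import chain

parts = []
for _ in range(10**6):
    parts.append(())  # collect the pieces instead of wrapping chain in chain
parts.append(["payload"])

flat = chain.from_iterable(parts)  # one chain object, no nesting
try:
    first = next(flat)
except StopIteration:
    first = None
```

Here `next()` only loops over the list of parts and never recurses through nested iterators.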

deft ruin
#

ok, thanks

feral island
clever sapphire
naive saddle
lofty agate
#

<@&831776746206265384> cross posting scam

crisp thistle
lofty agate
#

one team

dusk comet
#

I noticed weird wording in typing module docs (in TypedDict section):

Deprecated since version 3.11, will be removed in version 3.13: The keyword-argument syntax is deprecated in 3.11 and will be removed in 3.13. It may also be unsupported by static type checkers.
https://docs.python.org/3/library/typing.html#typing.TypedDict

dusk comet
#

"Deprecated since version 3.11, will be removed in version 3.13" - this is repeated twice for some reason

#

Deprecated since version 3.11, will be removed in version 3.13: The keyword-argument syntax is deprecated and may also be unsupported by static type checkers.
this looks better for me

rose schooner
rose schooner
#

python/cpython#101441 wow

neon troutBOT
static hinge
#

Oh, that pep was accepted

feral island
#

And implemented

pliant tusk
fallen slateBOT
#

Objects/bytearrayobject.c lines 555 to 558

```c
res = bytearray_setslice_linear(self, lo, hi, bytes, needed);
if (vbytes.len != -1)
    PyBuffer_Release(&vbytes);
return res;
```
feral island
pliant tusk
#
```py
>>> class A:
...     def __buffer__(self, flags):
...         return memoryview(bytes(8))
...     def __release_buffer__(self, view):
...         pass # do not need to do anything here, just needs to exist
... 
>>> b = bytearray(8)
>>> m = memoryview(b) # now b.extend will raise an exception due to exports
>>> b.extend(A())
Exception ignored in: <__main__.A object at 0x10c19c530>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: <function A.__release_buffer__ at 0x10c306160> returned a result with an exception set
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: <method 'extend' of 'bytearray' objects> returned NULL without setting an exception
>>> 
```
here's a simple replication example
pliant tusk
#

you already have it setup to ignore errors in __release_buffer__ so that would not cause any new issues anywhere else

feral island
#

yes

pliant tusk
#

would have submitted an issue and a pull request but i'm busy with end of my semester, figured pinging you would suffice

feral island
#

thanks! I'll make a PR later today

#

on a debug build it actually aborts, nice
```
>>> b.extend(A())
Assertion failed: (!PyErr_Occurred()), function _PyType_Lookup, file typeobject.c, line 4707.
zsh: abort ./python.exe
```

quick snow
#

!pep 703 was submitted to the SC :)

fallen slateBOT
#
**PEP 703 - Making the Global Interpreter Lock Optional in CPython**
Status

Draft

Python-Version

3.13

Created

09-Jan-2023

Type

Standards Track

dusk comet
#

I dont get the point. It says that ~10 python threads is a bottleneck. Why? If they are calling C-side, GIL is released, so at the same moment several C-threads (threads that run C code) and at most one Python-thread can run. So if code on python side is very fast (or time on C-side is very big), there is no performance issue

quick snow
#

Yes, but if they're not calling C-code, or the C-code doesn't release the GIL*, then there is.

spark magnet
grave jolt
#

ah yes
Today I Learned

dusk comet
# spark magnet it sounds like your question is, "Why is the GIL a problem?" or have I misunde...

No, i know why gil is the problem. Im just trying to understand beginning of this pep.

There is also this phrase at the end:

the overhead of acquiring and releasing the GIL typically prevents this from scaling efficiently beyond a few threads.
I don't think it is true. I believe it is possible to acquire/release the gil at least thousands of times a second. If the ratio "(time with released gil)/(gil acquire + gil release time)" is high then there will be no problem

spark magnet
spark magnet
dusk comet
#

GIL prevents several python threads from executing simultaneously
If almost all work is happening with released GIL, then there is no performance issue.

spark magnet
raven ridge
# dusk comet GIL prevents several python threads from executing simultaneously If almost all ...

It sounds like you're saying "the GIL isn't a problem when there's low contention for the GIL", which is true, of course, but most multithreaded programs have fairly high GIL contention.

Consider a web server with a JSON API, for instance. That's primarily an IO bound program, but every request that comes in needs to be JSON parsed. That's something that needs the GIL held, so no matter how many threads you've got, a second thread can't start parsing the request to execute until the last request has been parsed. Likewise, the GIL needs to be held while turning the response from a Python object into a JSON string, so one thread may need to wait for another to finish serializing a response before it can serialize its own.

raven ridge
#

The GIL needs to be held:

  • When executing Python bytecode through the Python VM
  • When creating Python objects
  • When calling methods or accessing attributes of Python objects
  • When saving or dropping a reference to a Python object
  • When calling almost any function from the CPython C API

Now, that's not everything. There are lots of things you can do that don't require the GIL to be held. But that is a lot of things, and multithreaded Python code generally needs to do some (if not all) of those things in every thread

raven ridge
#

also, a lot of things that could be done with the GIL released are done with it held instead. There's two major reasons for that:

  • Releasing and re-acquiring the GIL is relatively expensive. It can lead to unnecessary context switches and cache thrashing under contention. The less expensive the operation to be done is, the less valuable it is to release the GIL before doing it and reacquire it after. For instance, hashlib only releases the GIL if the string it's asked to hash is at least 2048 bytes, and keeps it held otherwise. It could unconditionally release it, but for short strings that's more likely to hurt performance than to help it.
  • Even when an operation is slow enough to justify releasing the GIL, libraries might hold the GIL anyway because releasing and reacquiring the GIL introduces extra complexity. It's more lines of code that obscures the important stuff the library is doing, and it requires more error handling (you need to ensure that the GIL gets picked back up even if an exception is thrown, for instance). And of course, it's unreasonable to expect extension authors to even know which things are slow enough to justify releasing the GIL and which aren't. That's fundamentally something based on heuristics...
raven ridge
coral pasture
#

!pep 684 has been accepted, targeting 3.12

fallen slateBOT
#
**PEP 684 - A Per-Interpreter GIL**
Status

Accepted

Python-Version

3.12

Created

08-Mar-2022

Type

Standards Track

rose schooner
#

python/cpython#104210

neon troutBOT
dusk comet
lethal nest
# dusk comet That makes sense, thank you 👍

Another thing that I don't think has been mentioned so far is that Python does not have its own thread scheduler, it's all deferred to the OS. So that means that if you have 10 threads, if one thread is holding the GIL, you would potentially have to check whether the current thread holds the GIL 9 times in the absolute worst case (if the OS scheduler happened to schedule all of the non-GIL-holding threads before ever going back to the one that holds it)

#

a short example:
```python
from threading import Thread
import sys
from timeit import timeit

def thread_func(num_iter):
    x = 0
    for i in range(num_iter):
        x += 1

def test_case():
    num_iter = 10000
    num_threads = 10
    threads = []
    for i in range(num_threads):
        threads.append(Thread(target=thread_func, args=(num_iter,)))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

sys.setswitchinterval(0.005)
print(timeit(test_case, number=10))
sys.setswitchinterval(0.0000001)
print(timeit(test_case, number=10))
```
#
```
0.06188675000157673
0.2476623330003349
```
this is the output I get
#

setswitchinterval changes how long python waits before asking the thread with the GIL to drop it, so it simulates a lot of GIL contention if you put it super low

frigid bison
flat gazelle
# frigid bison what things can you do that don't require it to be held?

A couple of things come to mind
waiting for

  • a child process to die
  • an IO operation to finish
  • a thread to stop
  • OS synchronization primitives

notably, you can also perform a non-python CPU bound operation (e.g. using some number crunching C library).
Also theoretically waiting for user input, though that generally falls into one of the previous categories already
frigid bison
#

but all those things are only triggered by events that do require it to be held

flat gazelle
#

what do you mean?

frigid bison
#

like an IO operation is triggered by the Python bytecode or a Cpython api function, and those do require the GIL to be held

flat gazelle
#

yup, the general flow is

lock gil
compute arguments for IO operation in python and convert them into C variables
unlock gil
recv(whatever)
lock gil
wrap the result in python objects
unlock gil
fallen slateBOT
#

Modules/_multiprocessing/multiprocessing.c lines 110 to 112

```c
Py_BEGIN_ALLOW_THREADS
nread = recv((SOCKET) handle, PyBytes_AS_STRING(buf), size, 0);
Py_END_ALLOW_THREADS
```
raven ridge
# frigid bison like an IO operation is triggered by the Python bytecode or a Cpython api functi...

Yes, the normal way of working with the GIL is that it's held "by default" when executing Python code, and dropped temporarily while performing an operation that doesn't require it.

The exception would be threads that are spawned by a C library instead of a Python library. Those will start off in a C entry point without the GIL held, and for those threads the pattern will be to temporarily acquire the GIL to do things that need it (like calling a Python callback provided by the user, or saving a reference to a Python object), and then drop it when you're done doing those things

unkempt rock
quick trellis
#

!e

```py
try:
    [][0]
except Exception as e:
    t = e.__traceback__
    print(t.tb_lasti)
    print(t.tb_frame.f_lasti)
```

edit this actually works fine, see below

fallen slateBOT
#

@quick trellis :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | 8
002 | 96
quick trellis
#

guys am i stupid or what shouldn't they be equal

#

btw what i actually want to find is the last instruction that occurred in this g() call

```py
try:
    g()
except Exception as e:
    tb = e.__traceback__
    # ... tb.tb_lasti is something else
```
trim merlin
#

tb_lasti should be correct. check if tb_lineno is what you expect

quick trellis
#

it is

#

just tb_lasti baffles me

amber marsh
#

Gaiz need help with data extraction from website

quick trellis
#

not here

amber marsh
#

Anyone knows how to do that

quick trellis
#

ok i see now that the issue persists only across function calls, consider the following

#

!e

```py
def g():
    [][1]

try:
    g()
except Exception as e:
    t = e.__traceback__
    print(list(t.tb_frame.f_code.co_positions())[t.tb_lasti])
```
fallen slateBOT
#

@quick trellis :white_check_mark: Your 3.11 eval job has completed with return code 0.

(7, 7, 8, 23)
quick trellis
#

guess ill make an issue

dusk comet
# quick trellis guys am i stupid or what shouldn't they be equal

no, they shouldn't
t.tb_lasti is captured at the moment when exception occurs, so it contains some opcode position from this: [][0]
t.tb_frame is the current frame, so t.tb_frame.f_lasti contains position of opcode that is currently executing: print(t.tb_frame.f_lasti)

[][0] and print(t.tb_frame.f_lasti) are obviously different places, so you get different numbers
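A minimal sketch of that distinction, wrapped in a hypothetical function `demo` (not from the chat) so the frame is self-contained. The exact offsets depend on the Python version, so only the fact that they differ is relied on here.

```python
def demo():
    try:
        [][0]
    except IndexError as e:
        tb = e.__traceback__
        # Recorded once, when the exception was raised in this frame.
        raised_at = tb.tb_lasti
        # Live value: the frame is still running, so this is the offset of
        # the instruction executing right now, inside the except block.
        running_at = tb.tb_frame.f_lasti
        return raised_at, running_at


raised_at, running_at = demo()
print(raised_at, running_at)
```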

quick trellis
#

yea i got that eventually

#

my actual problem was different as you can see

rose schooner
#

python/cpython#103764 PEP 695 is here

neon troutBOT
rose schooner
#

this all happened before 3.12 beta 1

rich cradle
#

yay!

quick snow
#

Inlining comprehensions was un-accepted for 3.12 :(

rose schooner
#

oh it's a release blocker

#

damn

quick snow
#

Because of this, basically:

foo = 23
class Example:
    foo = 42
    bar = [foo for _ in range(1)]

Example.bar is [23] in 3.11 and below, and would have been [42] now.

rose schooner
#

ok so the PR is still a draft

quick snow
#

I think it was merged

rose schooner
#

and it doesn't really remove everything

rose schooner
rose schooner
#

that's the PR that reverts

#

there's also another PR by jelle which is more simple

quick snow
#

Ah, I was talking about the previous one that added this. I see, so there's still hope for the rest?

rose schooner
#

python/cpython#104528

grave jolt
#

why is deque implemented as a linked list of arrays and not a ring buffer?

native flame
grave jolt
#

ah i c

grave jolt
#

||btw||

jade raven
feral island
#

<@&831776746206265384> I thought the NFT scammers would be done by now

grave jolt
feral island
quick snow
#

A pleonasm!

halcyon trail
#

has anyone ever suggested explicit is_empty functions for str and the built in containers

#

I'm pretty tired of truthiness, and len(s) == 0 feels very clumsy

#

has this been attempted? Would it just get shut down immediately?

umbral plume
halcyon trail
#

I just prefer to be explicit. Ironically, I just recently wrote a program and had a bunch of such checks, and was lazy, and used truthiness

#

and almost instantly got burned, because I assumed that Path("") was falsy

#

explicit is better than implicit, right 🙂

umbral plume
#

whether or not pathlib paths implemented __bool__ wasn't even something I considered up to this point, so I probably would've made the same mistake, i'll give you that 🤷
but IMO, string truthiness is something that most intermediate python devs are aware of or at least have heard of, so the solution of doing if str(some_path): would be explicit enough
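A small sketch of the `Path("")` case discussed here. One detail worth checking before relying on `if str(some_path):` is that pathlib normalises the empty path to `"."`, so the string form is non-empty as well. The variable names are illustrative only.

```python
from pathlib import Path

empty_path = Path("")

# Path defines neither __bool__ nor __len__, so every instance is truthy.
path_is_truthy = bool(empty_path)

# The empty path is normalised to the current directory.
as_text = str(empty_path)
text_is_truthy = bool(as_text)

print(path_is_truthy, repr(as_text), text_is_truthy)
```

An explicit comparison such as `some_path == Path("")` or a check on the original input string avoids relying on truthiness at all.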

dusk comet
#

I dont think pathlib.Path objects are considered containers

#

It feels wrong to me

halcyon trail
#

yeah, I'm not advocating for path to implement truthiness

#

but if member functions were used or at least could be used more consistently then there wouldn't need to be a question about it, or opportunity to make a mistake

grave jolt
#

Yeah I would definitely prefer explicit stuff. I don't really see the point of the implicit coercion

quick snow
#

I don't understand this. Presumably a .is_empty() method should then exist on all containers? How would that differ in behaviour from bool(the_thing), if you want to be explicit?

grave jolt
#

Because not everything will have this method

#

Whereas erroring on __bool__ is a bit strange, I have only seen numpy arrays do it

quick snow
#

So it's like arguing that object.__bool__ should raise an error.

peak spoke
#

fun fact, object.__bool__ doesn't exist

halcyon trail
#

bool(the_thing) is terrible because it basically always works

#

and if there's no implementation it returns

#

true

#

if bool(x) adds no value at all over if x

#

that's not the sense of "explicit" that's valuable

#

if numpy arrays decided to deliberately error on bool(x) that actually tells you a lot 😛

grave jolt
#

Yeah, the problem is that the "emptiness check" (__bool__) is implemented for stuff which doesn't have the concept of empty/non-empty

faint river
#

Bool will use len if no bool dunder is present
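The lookup order mentioned in the last few messages can be sketched as follows: `__bool__` is tried first, then `__len__`, and an object with neither is simply truthy. The class names `Plain`, `Sized` and `Both` are made up for illustration.

```python
class Plain:
    """Defines neither __bool__ nor __len__."""


class Sized:
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n


class Both:
    """__bool__ wins over __len__ when both are defined."""

    def __bool__(self):
        return True

    def __len__(self):
        return 0


object_has_bool = hasattr(object, "__bool__")  # no default __bool__ on object
plain_truth = bool(Plain())                    # falls through to "truthy"
empty_sized_truth = bool(Sized(0))             # uses __len__
full_sized_truth = bool(Sized(2))
both_truth = bool(Both())                      # __bool__ takes precedence

print(object_has_bool, plain_truth, empty_sized_truth, full_sized_truth, both_truth)
```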

grave jolt
#

<@&831776746206265384> I am on mobile 💀, please do something^

halcyon trail
faint river
#

A string is a "container" is it not?

halcyon trail
#

Like, I still think a named function would be better, regardless, but __bool__ erroring by default would already be a huge improvement

#

(but obviously, that's not going to be changed)

halcyon trail
#

either way, my main point wasn't to draw a line between strings and containers

grave jolt
halcyon trail
#

yeah, that's yet another issue with it

#

I mean, idk, I feel like truthiness just has, so many gotchas and downsides, and there's literally no benefit but saves a few characters 🤷‍♂️

#

obviously it cannot be eliminated in python but at least facilitating truthy-avoiding coding styles seems reasonable

grave jolt
halcyon trail
#

yeah. I tend to dislike that trick in python.

#

it's a lisp classic

#

(or first-option second-option third-option) etc

quick snow
quick snow
#

That's a feature, not an issue.

halcyon trail
#

it's literally saving you a few characters

#

"stuff that returns something or none"

#
x = returns_something_or_none()
if x is not None:
    ...
#

and when someone writes if x: you may have to figure out whether they knew that "containers" could be returned from the function and intended to skip empty containers

#

or if they assumed that the function wouldn't return a container

dusk comet
#

or they assumed that returned containers are always nonempty
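To make the ambiguity concrete, here is a sketch with a hypothetical function `find_tags` that can return `None`, an empty list, or a non-empty list. The truthiness version folds the first two cases together; the explicit version states which cases the author considered.

```python
def find_tags(text):
    """Hypothetical helper: None means no tag section, [] means an empty one."""
    if "tags:" not in text:
        return None
    _, _, rest = text.partition("tags:")
    return rest.split()


def describe_truthy(text):
    tags = find_tags(text)
    if tags:  # None and [] both land in the same branch
        return f"{len(tags)} tag(s)"
    return "nothing"


def describe_explicit(text):
    tags = find_tags(text)
    if tags is None:
        return "no tag section"
    if len(tags) == 0:
        return "empty tag section"
    return f"{len(tags)} tag(s)"


for sample in ("hello", "tags:", "tags: a b"):
    print(repr(sample), describe_truthy(sample), "|", describe_explicit(sample))
```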

dusk comet
#

is argumentless super() using something like frame->locals[0] (this is a C pseudocode) to get first argument passed to a method?

fallen slateBOT
#

Objects/typeobject.c line 10340

static int
native flame
fallen slateBOT
#

Python/compile.c line 4751

static int
feral island
native flame
#

oh damn

feral island
#

you can see that code essentially translating super() into super(__class__, self). It then uses the opcode LOAD_SUPER_ATTR which will load the method object directly if possible (bypassing creation of the super object) and otherwise (e.g., if super has been shadowed) falls back to regular execution

fallen slateBOT
#

Python/bytecodes.c line 1597

inst(LOAD_SUPER_ATTR, (unused/1, global_super, class, self -- res2 if (oparg & 1), res)) {
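The translation described above can be observed from Python itself. In this sketch (class names `Base` and `Child` are made up), the zero-argument form and the spelled-out `super(__class__, self)` form behave the same, and a method that uses `super()` carries `__class__` as a free variable supplied by the compiler.

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def greet(self):
        # Zero-argument form: the compiler supplies __class__ and the first argument.
        return "child+" + super().greet()

    def greet_explicit(self):
        # Spelled-out equivalent using the implicit __class__ cell.
        return "child+" + super(__class__, self).greet()

    def no_super(self):
        return "child"


c = Child()
implicit_result = c.greet()
explicit_result = c.greet_explicit()
freevars_with_super = Child.greet.__code__.co_freevars
freevars_without_super = Child.no_super.__code__.co_freevars

print(implicit_result, explicit_result, freevars_with_super, freevars_without_super)
```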
quick snow
halcyon trail
#

It's not clear to the reader of the code which of those were actually considered

#

"someone" means, another person on your team, or even you, 6 months ago

#

Like it's clear from your response you're thinking mostly about how code writes, rather than mostly how code reads

quick snow
#

No, I often see code written by e.g. coworkers that do if x is not None: and I have to think "is it really important to check for None here, or does this person just not like duck-typing?"

Whether reading or writing the code I often don't need to consider this (see previous message). Does re.search return False, None, or some falsy object when it doesn't match, and is it the same for re.findall? No idea, don't care.

halcyon trail
#

Err, no, you don't have to think that

#

They checked for None

#

that's the point of the code, to check for None

quick snow
#

But maybe it shouldn't check for None..

halcyon trail
#

if x, you don't know if their intent was to check against None, against empty, or against both

#

....

#

this is just disingenuous

#

yes, any code could just be completely wrong

#

the issue is that if you're using if x wherever it's applicable, you're going to be using it in several overlapping situations

quick snow
#

If this were to change, writing the equivalent of if x: would become much more complicated. If you want to check for one exact kind of truthiness now, you already can, with e.g. == "", which is readable and very clear.

halcyon trail
#

I'm not suggesting anything change that would affect if x, the truthiness ship has sailed in python (though it's not likely to sail anywhere else)

#

just inquiring about adding a member function

#

you might notice that basically all languages have .is_empty() or equivalent functions for strings and lists

#

so, it's not exactly a minority opinion that .is_empty() is more readable than == ""

#

(or len(s) == 0)

quick snow
#

(I must also note that I'm not arguing that I share my opinions with the majority of Python users)

umbral plume
#

Whether there's an .is_empty sorta method seems to depend on how the language actually treats strings - if they're mutable, there's pretty much always one, whereas if they are immutable then it kinda just varies

quick snow
#

I wouldn't have a problem with str.is_empty, it just seems rather pointless. If you think s == "" isn't clear, I don't know what to say. (Is it beautiful? No.)

halcyon trail
#

if JS and PHP are doing the same thing then it's probably safe to do the opposite

grave jolt
#

In JS, only booleans, numbers, strings, null and undefined can be falsey

halcyon trail
#

s.is_empty() is also better than s == "" because, s == "" will run fine no matter what the type of s is

#

if I think something is a string, then I want to call a string method, that's the whole point

#

that at least gives me a good chance of an error, if my expectations are violated

#

On top of all that there's the static typing dimension to this

#

if s will basically never error and it will basically never trigger a mypy error

grave jolt
#

well, depends on what you mean by it

grave jolt
umbral plume
grave jolt
halcyon trail
#

like if there's literally no benefit except saving a few characters, i think reader clarity easily trumps that.
if there's some benefit other than golfing I'd like to know what it is

grave jolt
#

your keyboard lasts longer

#

(:

halcyon trail
#

hey now, that's not trivial

#

my cost of ownership for an advantage kinesis, well, I'm afraid to calculate

#

responding to your joke probably cost me a quarter

grave jolt
#

unfortunately you can't send me an invoice

halcyon trail
#

😢

quick snow
halcyon trail
#

so in this case you explicitly know that items is nullable, and a container, so why not be explicit about that

grave jolt
#

how is that better than if items is None?

halcyon trail
#

yeah, also that

#

if you just have a for loop you don't need to branch specially

quick snow
halcyon trail
#
def safe_frobnicate(x, items=None):
    if items:
        pre_pre_frobble()
        for item in items:
            pre_frobble(item)
    x.frobnicate_for_real()
#

🤷‍♂️

#

but, again, in terms of reading this, it's a lot clearer to do

umbral plume
halcyon trail
#

if items is not None and not items.is_empty()

quick snow
halcyon trail
umbral plume
#

like, i've used the trick of doing string_that_may_be_empty or default_string quite a lot, which also has the added benefit of not evaluating that string twice like a ternary operator would

halcyon trail
#

Compared to a protocol that is automatically implemented for every object

#

whether or not it makes any sense

#

like, truthiness is bad enough, but like, python has the worst possible version of it. automatic opt in truthiness.

grave jolt
#

it is truthy! but also empty!

halcyon trail
#

😂

#

great example

#

the relationship between iteration and truthiness isn't even consistent out of standard python types

#

At least if you did is_empty and iterators (as opposed to collections) did not implement that

#

you would get an immediate error

#

instead of the program quietly running through an unexpected codepath

grave jolt
#

I suppose that's one kind of confusion that type annotations can solve, you can require items to be None or a Collection[Stuff]

halcyon trail
#

Yeah, but if items is annotated as iterable, this still doesn't actually solve the problem

grave jolt
#

Yep, that was the problem

umbral plume
#

i guess i'm just yet to ever fall into the trap of relying on the truthiness of something that hasn't actually implemented __bool__ itself, so its never been an issue to me - though i can easily see myself probably doing it at some point

halcyon trail
#
def safe_frobnicate(x, items: Optional[Iterable[Whatever]]=None):
    if items:
        pre_pre_frobble()
        for item in items:
            pre_frobble(item)
    x.frobnicate_for_real()
#

containers are still iterables

grave jolt
#

Yeah that's what I had at work, I think

halcyon trail
#

so even with code that type checks perfectly

#

this code behaves in very different ways for non-container iterables

#

and iterables
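A runnable version of that example, with hypothetical stand-ins for `pre_pre_frobble`, `pre_frobble` and the frobnicated object that just record what was called. It shows the two code paths for an empty list versus an empty iterator, both of which satisfy the `Optional[Iterable[...]]` annotation.

```python
from typing import Iterable, Optional

calls = []


def pre_pre_frobble():
    calls.append("pre_pre")


def pre_frobble(item):
    calls.append(f"pre:{item}")


class Thing:
    def frobnicate_for_real(self):
        calls.append("real")


def safe_frobnicate(x, items: Optional[Iterable[str]] = None):
    if items:  # an iterator is truthy even when it will yield nothing
        pre_pre_frobble()
        for item in items:
            pre_frobble(item)
    x.frobnicate_for_real()


def run(items):
    """Call safe_frobnicate and return the recorded call sequence."""
    calls.clear()
    safe_frobnicate(Thing(), items)
    return list(calls)


with_empty_list = run([])
with_empty_iterator = run(iter([]))
with_items = run(["a"])

print(with_empty_list, with_empty_iterator, with_items)
```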

umbral plume
#

truthiness is, after all, a remnant of python's past for the most part

halcyon trail
#

this is exactly why it's wildly bad to have truthiness implemented by default

umbral plume
#

back in the super early versions before the bool type was a thing, boolean logic was done using the integers 0 and 1 (which is why bool is a subclass of int for compatibility reasons)

halcyon trail
#

like, C++ has truthiness but in practical terms it's never caused me 1% of the grief of python's truthiness

grave jolt
halcyon trail
grave jolt
#

oh yeah I didn't finish reading

halcyon trail
#

but then why do style guides still push truthiness so hard, so often

#

if people just said "yeah, this is historical, please do explicit is None", and added is_empty() checks

#

I'd be perfectly happy

grave jolt
#

though I think he had a different rationale in mind

halcyon trail
#

maybe it's starting to shift then, which is good

#

interesting

grave jolt
#

PEP 8 still recommends using the implicit stuff though

halcyon trail
#

gross

#

if len(users) is a pretty decent recommendation for containers. I probably still prefer to just do len(users) > 0 but at least len(users) is clearly an integer

#

and this enforces users to support len, which is really the most important part

#

(enforces it/communicates it to the reader)
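A sketch of that idea as a free function, since the built-in types have no `is_empty` method. The helper `is_empty` is hypothetical; it simply delegates to `len`, so anything without a length (an iterator, `None`) fails loudly instead of taking an unexpected branch.

```python
from collections.abc import Sized


def is_empty(container: Sized) -> bool:
    """Hypothetical helper: True if the container has no elements."""
    return len(container) == 0


def raises_type_error(value) -> bool:
    """Report whether is_empty rejects the value with TypeError."""
    try:
        is_empty(value)
    except TypeError:
        return True
    return False


print(is_empty([]), is_empty("abc"), raises_type_error(iter([])), raises_type_error(None))
```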

umbral plume
#

I can only really circle around to just like, the fact that its never been an issue for me when reading/writing code, because its generally only ever used (and implemented) by the built-in sequences and numbers

halcyon trail
#

"a feature so intuitive that it works well as long as it's only touched by a small handful of built-in types"

grave jolt
#

well, it's implemented by everything

umbral plume
#

and i'm not gonna argue too much in favour of object.__bool__, if that was removed i wouldn't be too upset

halcyon trail
#

unfortunately nothing can be removed for backwards compat etc

feral island
#

the classic gotcha is if self.is_done instead of if self.is_done()
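A minimal illustration of that gotcha, using a made-up `Job` class: the bound method object is always truthy, so forgetting the parentheses silently takes the "done" branch.

```python
class Job:
    def __init__(self):
        self.remaining = 3

    def is_done(self):
        return self.remaining == 0


job = Job()

forgot_call = bool(job.is_done)    # truthiness of the bound method itself
with_call = bool(job.is_done())    # truthiness of the method's result

print(forgot_call, with_call)
```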

halcyon trail
#

😦

grave jolt
#

was about to write that @feral island

halcyon trail
#

mypy could have a no-truthiness mode

#

might not even be that hard to implement, tbh

feral island
#

I think for functions among others

halcyon trail
#

what's it called?

#

ah

grave jolt
#

welllllll, you can't really remove it fully

halcyon trail
#

why not?

grave jolt
halcyon trail
#

Ah this is fantastic:

Warn when the type of an expression in a boolean context does not implement __bool__ or __len__. Unless one of these is implemented by a subtype, the expression will always be considered true, and there may be a bug in the condition.

#

thanks jelle

#

when you say "remove it", obviously at the language level it's a huge breaking change

#

so you can't even regardless of that filter example

grave jolt
#

yeah, this should catch a lot of stuff

halcyon trail
#

mypy should be able to catch that filter example, in principle

#

not sure if it actually does

#

like, mypy has all the information it needs

grave jolt
#

because items can be falsey

halcyon trail
#

yep

#

truthiness is the devil overall

#

it's just different degrees

#

everywhere implemented truthiness is definitely the worst of it

#

the best is to just not have it in your language because the benefits are so close to 0, if not negative

#

just one less thing to deal with

#

but that is obviously too late

grave jolt
#

conclusion: use <insert_language> btw

halcyon trail
#

definitely

grave jolt
#

Lua is different from both Python and JS... In Lua, everything has a truthiness, but only false and nil are falsey

#

so 0 and "" are truthy

umbral plume
#

that seems to sorta link in with lua's disregard of 0 or emptiness being special, like how there's 1-indexing and the inability to get errors from accessing undefined table entries

grave jolt
#

in JS you also don't get an exception when accessing an array/map out of bounds

umbral plume
#

i wouldn't be upset if obj.__bool__ emitted a warning of some kind, like how numpy/pandas objects do

halcyon trail
#

having multiple languages that do truthiness all with slightly different takes is another thing that also makes this painful

#

or even multiple libraries within python

#

somebody told me about a lib that had an operator bool but deprecated it

rose schooner
dusk comet
dusk comet
fallen slateBOT
#
The Zen of Python (line 1):

Explicit is better than implicit.

quick snow
halcyon trail
#

But if it's an iterator that has no sensible truthiness operator, you don't get a type error 😛

#

Just behavior worthy of a JavaScript "wat" talk

quick snow
halcyon trail
#

For the if check

#

For a regular container, if it's empty, an if check fails

#

For an iterator, if it's empty, an if check succeeds

feral island
halcyon trail
#

Our mypy is so behind, I really want to find time to upgrade it and "tighten up" some of the checks but it's not easy

unkempt rock
#

With PEP 695, diffs like these will become common for a while to reduce boilerplate for typing

-from typing import TypeVar
 from collections.abc import Iterator
-
-K = TypeVar('K')
-V = TypeVar('V')
 
-def pick(
+def pick[K, V](
     dict_: dict[K, V],
     keys: list[K],
 ) -> Iterator[tuple[K, V]]:
     for k, v in dict_.items():
         if k in keys:
             yield (k, v)
swift imp
#

Guys think this funky? I want to use __class_getitem__ to act as a factory for creating a type alias with attached metadata. Then do something with newly generated type alias at runtime. No idea how mypy will react, cannot test.

from typing import TypedDict, Annotated, TypeVar
import pandas as pd

Column = str
Dtype = TypeVar("Dtype")

class Table:

    def __class_getitem__(cls, col_info: dict[Column, Dtype]):
        return Annotated[pd.DataFrame, col_info]

UserData = Table[{"name": str, "age": int}]
dusk comet
#

Mypy will yell at you because this class is not generic

unkempt rock
keen berry
#

@unkempt rock

#

dude how to make this project

swift imp
feral island
halcyon trail
#

thanks for sharing

bold sparrow
#

Hey! So, I've been thinking about this for quite a while and Ive asked some experienced people but nobody quite seems to know either.
I know about name mangling and the concept behind it, but I dont think thats related here (?)

Basically, if you have a look at the builtins.pyi, you will often times see function parameters defined the following:

def pow(__x: int, __y: int, __z: int) -> int: ...
def getattr(__o: object, __name: str, __default: None) -> Any | None: ...

Now these are inconsistent too: sometimes there are no underscores for the same parameters in different overloads, and sometimes only one parameter has them.

Really curious what the purpose of this is? I know it probably isnt of much importance, I just cant stop questioning it

sour thistle
#

you forgot to include the actual question (which was the title in the help post)

What do the double underscores indicate in the builtins stubs?

bold sparrow
#

oww my bad 😂

feral island
sour thistle
#

shouldn't it use / instead?

feral island
#

we can't use the native syntax for that (/) for technical reasons

bold sparrow
#

ahh! Yea we thought it would use / if it was for positional only purposes, but I guess that was it afterall then

#

thanks for clearing that up!

feral island
#

After Python 3.7 goes EOL (pretty soon), I think we can switch to the / syntax since PEP 570 was in 3.8

bold sparrow
#

Okay finally thats off my mind haha. Thank you

merry bramble
#

The consensus is that we should wait for 6 months after Python 3.7 goes EOL before we start using syntax that only works in Python 3.8 (https://github.com/python/typeshed/issues/10113). So we'll probably start using / for positional-only parameters in early January 2024 over at typeshed.

#

Some functions are designed to take their arguments only positionally, and expect their callers never to use the argument's name to provide that argument by keyword. All arguments with names beginning with __ are assumed to be positional-only, except if their names also end with __
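For comparison, this is what the native PEP 570 syntax looks like at runtime on Python 3.8 or later. The function `power` is a made-up example, not a typeshed stub; parameters before the `/` cannot be passed by keyword.

```python
def power(base, exponent, /):
    """Both parameters are positional-only because they precede the slash."""
    return base ** exponent


positional_result = power(2, 5)

try:
    power(base=2, exponent=5)
except TypeError as exc:
    keyword_error = str(exc)
else:
    keyword_error = None

print(positional_result)
print(keyword_error)
```

The double-underscore convention in stubs expresses the same constraint to type checkers without requiring the 3.8 parser.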

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied timeout to @worldly schooner until <t:1684961099:f> (10 minutes) (reason: role mentions spam - sent 4 role mentions).

The <@&831776746206265384> have been alerted for review.

raven ridge
#

I guess I'm not seeing who benefits from being able to install new stubs that support being parsed by 3.7 at this point, and what would be lost by just saying "sorry, no more updates for you" to people who still need to be able to parse the stubs with a version that doesn't have positional only arguments

feral island
#

well, not for the stdlib at least

#

we probably could be more aggressive in dropping support, but mypy still supports 3.7

raven ridge
raven ridge
feral island
#

So if we drop 3.7 support, mypy won't be able to update typeshed while still supporting 3.7

#

Because they literally won't be able to parse it on 3.7

raven ridge
feral island
#

(though this is only the stdlib: third-party packages in typeshed are distributed separately)

raven ridge
#

I see. I guess your constraint is that, once you start merging PRs to add positional only args, mypy can't update its vendored typeshed without dropping 3.7 support, and you don't want to force mypy to make that choice indirectly because of something typeshed has done

feral island
#

Yes. Personally I'm also not that eager to drop support as soon as possible

#

The double underscores for pos-only args work fine

#

Though admittedly they confuse new contributors on a regular basis

raven ridge
#

I dunno. "If you need to run mypy using Python 3.7, the last release of mypy that came before Python 3.7's EOL date is the latest version you can use" doesn't sound like a bad policy to me

feral island
#

Sure. But 3.7's EOL hasn't arrived yet

#

I expect mypy and typeshed will drop support soon after that

raven ridge
raven ridge
#

That is, mypy would be giving up running under a 3.7 interpreter, but not necessarily giving up analyzing 3.7 code

feral island
raven ridge
#

Fair enough. 🙂

#

I typically think of people running linters as being more sophisticated users from an SDLC point of view, and more likely to keep up to date on interpreter updates. I'd have guessed that the portion of mypy's user base running the tool using a 3.7 interpreter would be minuscule. With the possible exception of projects that are no longer being actively maintained but still have CI running, but for those it wouldn't matter if they stopped getting updates to the mypy tool.

feral island
lone sun
#

They may not have a choice about their environment, though.

raven ridge
#

That's... Entirely fair.

feral island
#

I think at work we barely got off 3.6 in time

lone sun
#

I once had to maintain code that needed to run simultaneously on 2.6 and 3.6. It was Not Fun.

raven ridge
#

I've still got some (C API) code that needs to run on 2.7 and 3.6 through 3.11... I haven't quite figured out how to split that so I can freeze the 2.7 part while everything else moves on.

#

It shares one header across all versions, as it's deployed today. 😔

agile breach
#

Which database should I learn

#

Which is preferable for interacting with Python

#

If I learn python and want to learn web developement is django enough?

faint river
agile breach
#

Oh

#

I'm new here

merry bramble
# raven ridge Sure, but it's in ~4 weeks, not the 6 months Alex mentioned

I would be OK with a more aggressive schedule personally, but I'm also OK waiting. I hate the __ syntax for positional-only arguments: it's ugly, and I found it really alienating when I first started reading through typeshed's stubs, prior to contributing. But it also doesn't really cause any massive problems right now; it basically works fine. We've put up with it for many years now, so waiting another six months doesn't make much of a difference 😄

vocal pasture
#

💀

vocal pasture
rose schooner
dusk comet
#

i've coded in npp also for about a year, it was good enough
it wasnt python, i used some very unpopular language, so there is no huge difference between editors (because they all have no idea what this language is and what to do with it)

grave jolt
#

I used to program in gedit early in my journey

#

In like, 2018 or 2019

sweet kayak
#

2018

quick snow
#

You should ask in a help channel, see #❓｜how-to-get-help . But yes, you should random.shuffle the list and then go through it from start to end.

frank spoke
#

but then why do style guides still push truthiness so hard, so often

grave jolt
#

Maybe it's because PEP8 suggests it 🙂

#

Or maybe python programmers like code golfing and a fog of mystery

ancient jackal
#

!e

print(False in [False] == True)
fallen slateBOT
#

@ancient jackal :white_check_mark: Your 3.11 eval job has completed with return code 0.

False
grave jolt
#

!e
unrelated but

from math import nan

print(nan == nan)
print(nan in [nan])
print([nan] == [nan])
fallen slateBOT
#

@grave jolt :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | False
002 | True
003 | True
sour thistle
#

(False in [False]) and ([False] == True)?
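That expansion is how Python treats any chain of comparison operators, and `in` counts as one of them. A short sketch comparing the chained form with the expanded and the explicitly grouped forms:

```python
# Chained: `in` and `==` are both comparison operators, so this is a chain.
chained = False in [False] == True

# What the chain means: each adjacent pair is compared, joined with `and`.
expanded = (False in [False]) and ([False] == True)

# What the author of the one-liner probably intended.
grouped = (False in [False]) == True

print(chained, expanded, grouped)
```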

urban sandal
grave jolt
#

but nan not being equal to itself is honestly the core issue...

#

I know that's how the floating point standard works, but it's cursed

urban sandal
#

not really something I'm going to agree with you on. nan needing to be handled separately is a good thing that prevents other errors, and I'm eyeing that list comparison as breaking that. If you have nan in a dataset, something is wrong with the data and it shouldn't be used to compare with.

grave jolt
urban sandal
#

nan in a dataset is an error of some sort. pretty much everything you do that implicitly involves the nan without explicitly correcting for it (probably by exclusion, but this could impact statistics) would be a problem

grave jolt
#

IIRC the reason NaN was defined to never equal anything is that the actual bits of different NaNs could be different

urban sandal
#

it's because NaN isn't a number, it's a signaling tool

#

and yeah, you can use the multiple different nans as part of that signaling

halcyon trail
grave jolt
#

Yeah that is true. And negative zero is special-cased anyway

halcyon trail
#

but doing that, doesn't really give you much, then you have to think about ordering of NaN's.... it's just not going anywhere good

#

i will also say that in real data analysis, most times, if you're comparing two columns in a dataframe, you do want them to compare equal when they have nans at matching spots

molten onyx
# urban sandal The first case being False is obviously correct, the second one feels like it's ...

I guess it all comes down to the fact that identity is a concept only present in Python land and the language explicitly specifies that identity is compared before value in both list equality and membership (nan in [nan] is also true) so you'd have to "lose" somewhere for the list check to fail, changing nan is nan to be false would probably break a very important invariant and changing the list semantics is probably too big of a change so it's just kinda what we have

halcyon trail
#

but that should be properly handled by the library, the [nan] == [nan] thing is still wrong

halcyon trail
molten onyx
#

what does

halcyon trail
#

that identity is compared before value in list equality

#

that seems totally incorrect to me

#

lots of languages have a notion of identity equality and I don't know of any where list equality does that

molten onyx
#

it would be silly for a list to compare unequal to itself I think

grave jolt
#

why does list do that btw?

halcyon trail
#

It's not though?

#

it's only silly if its always silly for things to compare unequal to themselves

#

nan compares unequal to itself and relatively few people think this is silly

grave jolt
#

||I do think it is silly lemon_pleased ||

halcyon trail
#

list1 == list2 should mean exactly len(list1) == len(list2) and all(x == y for x, y in zip(list1, list2))

flat gazelle
#

yeah, I would argue [nan] == [nan] being True is incorrect.

raven ridge
# grave jolt why does list do that btw?

as an optimization, I suppose. And possibly for consistency with dict, which needs this special case because otherwise you could use nan as a dict key and then never remove it or access its value
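The identity shortcut discussed here is easiest to see with two distinct NaN objects. In this sketch, `strict_eq` is a hypothetical helper implementing the purely element-wise definition proposed earlier in the conversation; the built-in list, membership and dict operations check `is` before `==`.

```python
x = float("nan")
y = float("nan")  # a second, distinct NaN object


def strict_eq(a, b):
    """Element-wise equality with no identity shortcut."""
    return len(a) == len(b) and all(p == q for p, q in zip(a, b))


same_object_lists = [x] == [x]       # identity shortcut applies
different_object_lists = [x] == [y]  # falls back to ==, which is False for NaN
strict_same_object = strict_eq([x], [x])

member_same = x in [x]
member_other = y in [x]

d = {x: "value"}
lookup_same = d[x]       # found because the key is the very same object
lookup_other = y in d    # a different NaN never matches

print(same_object_lists, different_object_lists, strict_same_object)
print(member_same, member_other, lookup_same, lookup_other)
```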

molten onyx
#

making an IEEE implementation overtake your language design seems like a silly rabbit hole to fall down

halcyon trail
#

yeah, probably an optimization

#

it's bad language design though 🤷‍♂️

#

Like:

identity is a concept only present in Python land
This is not correct, identity is present in basically all mainstream programming languages

flat gazelle
#

but IG it doesn't really matter in practice, so whatever.

halcyon trail
#

and this is a very unusual way to handle things

flat gazelle
#

yeah, a is b implying a == b is... wrong

grave jolt
molten onyx
#

I think two lists should compare identical if they contain the same objects even if you override those objects not to compare equal to themselves because they are the same list, I think breaking this invariant would be more catastrophic than the current behaviour

flat gazelle
#

it's just not how python works

halcyon trail
#

how would it be "catastrophic"

#

and the phrase "compare identical" is very confusing here

#

i assume you meant compare equal

flat gazelle
#

two lists are equal if their elements are equal and in the same order.

molten onyx
#

yes

halcyon trail
#

I don't see anything "catastrophic" still

grave jolt
#

NaN and its related semantics seems really weird to me to be honest. It's like... trying to hack error handling into a number format

merry trellis
#

i have someones ip address what can i do

grave jolt
#

write it on a piece of paper

flat gazelle
halcyon trail
#

oh my god, I just discovered Kotlin has the same behavior 😂

#
fun main() {
    val x = Double.NaN
    println(x == x)
    println(listOf(x) == listOf(x))
}

prints false true

flat gazelle
#

raku simply compares NaN equal to NaN, which is IMO a fair enough decision

raven ridge
halcyon trail
#

I would say that straying from IEEE is a pretty bad decision

flat gazelle
#

surprisingly, PHP is the one who does this correctly and compares [NAN] != [NAN]

halcyon trail
#

Well, I'm certain C++ and Rust do this correctly

#

without even looking

grave jolt
#

Haskell does this, but that's because it doesn't really have a notion of identity

halcyon trail
#

I'm shocked by Kotlin though

#

If Kotlin does it it's probably because Java does

lone sun
flat gazelle
#

well, languages which don't have identity don't have this dilemma at all

halcyon trail
#

C++ and Rust have identity

lone sun
#

Not in the same way, though.

halcyon trail
#

that's not a meaningful thing to say, or at least, you mean something by it that's not clear

grave jolt
halcyon trail
#

well, then you better define what you mean by identity

raven ridge
#

&a == &b is identity in C, no?

molten onyx
#

even in C++ if you compare pointers you'll get a true value because a pointer to NaN is still equal to itself

halcyon trail
#

for many years for example, it was common when writing copy assignment to do exactly a check for identity

lone sun
grave jolt
#

Yeah I guess some values in Rust have identity, basically everything on the heap and such

#

ye^

halcyon trail
#

everything "has" identity. The identity of things is not really meaningful in most cases

grave jolt
#

js has this but for a completely different reason 😛

halcyon trail
#

that's true in python too

flat gazelle
molten onyx
#

in python identity is a fundamental language-level feature and is not tucked away the same way, is pretty well defined as everything is boxed, and everything that follows is pretty logical

halcyon trail
#

it's just being used as a performance shortcut here, most likely

#

The & operator is a fundamental language level feature in C++ too 🤷‍♂️

flat gazelle
#

but it doesn't work on everything

molten onyx
#

yeah but it's not really the same

halcyon trail
#

it's not "the same" but these very vague "well but C++ doesn't have identity" just isn't accurate

#

and doesn't really clarify the issues

flat gazelle
#

to be more specific, not every single type in C++/rust has a valid operation that returns a value that is unique to that object for its entire lifetime.

raven ridge
halcyon trail
lone sun
halcyon trail
flat gazelle
#

not every object has an address as per the C++ abstract machine afaik

#

you can of course give it an address through various means, but objects without an address can exist

halcyon trail
#

prvalues surely don't, but prvalues aren't really values at all, they're basically just expressions that the compiler is delaying the instantiation of

#

"objects" in fact do

#

values are not objects

#

An object, in C++, has

size (can be determined with sizeof);
alignment requirement (can be determined with alignof);
storage duration (automatic, static, dynamic, thread-local);
lifetime (bounded by storage duration or temporary);
type;
value (which may be indeterminate, e.g. for default-initialized non-class types);
optionally, a name.

#

After implicitly creating objects within a specified region of storage, some operations produce a pointer to a suitable created object. The suitable created object has the same address as the region of storage. Likewise, the behavior is undefined if only if no such pointer value can give the program defined behavior, and it is unspecified which pointer value is produced if there are multiple values giving the program defined behavior.

flat gazelle
#

ah, interesting, didn't know this distinction was made, my bad

halcyon trail
#

it's fine, I didn't really mean to drag us into the C++ weeds

#

my point was just that it's not so simple to say "well python has identity and X doesn't"

grave jolt
#

what are we discussing at this point

#

ah

halcyon trail
#

sorry sorry my bad

grave jolt
#

no worries, you were just clarifying something

#

I have a short attention span

halcyon trail
#

all good. at any rate, I think python's behavior here is insane but at least it has company

lone sun
#

Well, originally the discussion was about NaNs.

halcyon trail
#

container == should recursively defer to ==

#

I would never have thought that was controversial

#

using identity as an optimization is always okay, but not when it can change semantics

lone sun
#

I think it's important to keep in mind that the IEEE specification was designed by and for numerical analysis. It's great if you're doing scientific computation and need to get things exactly correct.

halcyon trail
#

I mean at this point it's also how the vast majority of computers work at the native level

flat gazelle
#

ye, the assumption a is b implies a == b is used to optimise container comparisons in python/kotlin/likely java, which doesn't happen in C++/rust/other languages where nan is a 4/8byte string with no extra runtime information, PHP is noteworthy because despite having identity, it compares [NAN] == [NAN] as false

#

that was what I was trying to say
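
A minimal sketch of the identity shortcut being described, assuming CPython (other implementations may differ). `NeverEqual` is a hypothetical class invented for illustration; the point is that list `==` and `in` treat an element as matching when it is the same object, before `__eq__` gets a say.

```python
class NeverEqual:
    """Hypothetical class whose instances never compare equal, even to themselves."""

    def __eq__(self, other):
        return False


nan = float("nan")
obj = NeverEqual()

# Element-level comparison follows IEEE 754 / the custom __eq__ ...
nan_equals_itself = nan == nan
obj_equals_itself = obj == obj

# ... but container comparison checks identity first.
same_nan_lists = [nan] == [nan]                      # same object in both lists
fresh_nan_lists = [float("nan")] == [float("nan")]   # two distinct float objects
same_obj_lists = [obj] == [obj]
membership = obj in [obj]

print(nan_equals_itself, obj_equals_itself)
print(same_nan_lists, fresh_nan_lists, same_obj_lists, membership)
```

So the result of comparing two lists of NaNs depends on whether the lists share the same float object, not on the values alone.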

halcyon trail
#

Yeah, I mean the term which I think helps here is not "have identity" but rather "have reference semantics"

#

python, kotlin, and java have reference semantics throughout, so when you "copy" nan you don't really copy it, you just have pointers to the same nan

#

C++ and Rust have value semantics

#

so that optimization is mostly worthless to begin with

flat gazelle
#

java primitive arrays surprisingly also compare equal here, which is just... odd.

molten onyx
#

just to add fuel to the fire [nan] == [nan] is also sometimes false in python if you have two different float instances

>>> l1 = [float('nan')]
>>> l2 = [float('nan')]
>>> l1 == l2
False
flat gazelle
#

ah yeah, that makes sense

grave jolt
#

in Rust there's actually an explicit thing, floats implement PartialEq but not Eq

flat gazelle
#

yeah, the real takeaway here is that equality of IEEE floats is questionably useful at best

halcyon trail
#

well, their equality is questionably useful for additional reasons to this 😛

#

but yeah, python's behavior here is pretty terrible

#

like, it's not even consistent

#

if float('nan') were at least a singleton, it would at least be consistent

#

I'm kind of surprised it's not

grave jolt
#

hmmm not sure I'm convinced

halcyon trail
#

about which

lone sun
grave jolt
#

math operations can return nans, should they also all be converted to the same singleton? (and therefore the same bit pattern)

lone sun
#

Most people think of NaN as a constant. Actually, there are different types of NaNs.

flat gazelle
#

actually, I wonder how pypy and graalpy do this - they both do simplify lists of floats into direct unboxed arrays

lone sun
#

The existence of multiple NaNs allows you to track the source of a NaN in a computation. Plus, you can give the NaNs different behavior (signaling versus non-signaling; basically, whether or not you raise an exception).
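
A rough sketch of what "different NaNs" looks like from Python, assuming a platform with IEEE 754 doubles where `struct` round-trips the bits unchanged (typical for CPython on x86-64, but not guaranteed everywhere). The helper names `bits` and `from_bits` are made up for this example.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the 64-bit pattern of a Python float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(pattern: int) -> float:
    """Build a Python float from a 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", pattern))[0]


# Two quiet NaNs: all exponent bits set, top mantissa bit set, different payloads.
nan_a = from_bits(0x7FF8_0000_0000_0000)
nan_b = from_bits(0x7FF8_0000_0000_002A)  # payload 42, chosen arbitrarily

print(math.isnan(nan_a), math.isnan(nan_b))
print(hex(bits(nan_a)), hex(bits(nan_b)))
```

Both values print as `nan` and both satisfy `math.isnan`, yet they carry different bit patterns.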

flat gazelle
#
>>>> l1 = [float('nan')]
>>>> l2 = [float('nan')]
>>>> l1 == l2
True
pypy is different (which does make sense, since pypy lies about identity preservation in containers a lot)
halcyon trail
#

nan being a singleton in python or not has nothing to do with IEEE

#

and this has nothing to do with the different kinds of nans either

lone sun
#

NaN should not be a singleton, ever.

halcyon trail
#

float('nan') could easily have returned a reference to the same object each time

#

that doesn't make any sense

lone sun
#

Why not?

halcyon trail
#

because float('nan') is a deterministic function, it's obviously returning one specific NaN bit pattern

#

there's no reason why it can't be a singleton in python

flat gazelle
#

pypy has it effectively be a singleton

lone sun
#

Since NaN is not a singleton in IEEE, doing anything else at the Python level is an invitation to disaster.

halcyon trail
#

you're misunderstanding

#

IEEE for starters does not specify things like "singleton" at all; that's a software engineering term and IEEE is a spec

#

second, yes, everyone in this conversation understands that many bit patterns are associated with NaN

lone sun
#

You're asking why float('nan') is float('nan') returns False, right?

halcyon trail
#

obviously, NaN's of different bit patterns cannot be represented by a singleton

#

but there's no reason why a NaN of the same bit pattern cannot be represented by a singleton

#

I'm not really asking "why" because it's just an optimization choice

#

there's nothing preventing it, contrary to what you're saying

flat gazelle
#

to be fair, a lot of things with the same bit pattern are not optimised to singletons, nan is not special in that regard.

lone sun
#

Oh, you're asking why there isn't a singleton Python object for each NaN with a particular bit pattern. Not why there isn't a singleton Python NaN object.

#

Well, that would be fine.

halcyon trail
#

I was asking why float('nan') specifically didn't use a singleton for its return

#

if float('nan') were at least a singleton, it would at least be consistent

lone sun
#

It could always return math.nan, for example.

#

I agree that there's nothing wrong with that behavior.

halcyon trail
#

I would have thought it would be worthwhile to make it a singleton since it's kind of the "default" NaN you'd use if it didn't happen directly from a computation

#

but I guess for some reason it's not

merry bramble
sour thistle
#

you might want to DM @summer lichen to request the appropriate roles by the way

halcyon trail
#

btw, re the nan stuff above and the java/kotlin

#

apparently double/double comparison works properly but when you compare it "as an object" (which will occur in any generic code, like list ==)

#

it has these special cases

#

apparently, to make hashing work correctly...

raven ridge
#

that's the same reason dict special-cases nan in Python - if it didn't work that way,

nan = float("nan")
d = {}
d[nan] = 1
d[nan] = 2
print(d)
del d[nan]

would print

{nan: 1, nan: 2}

and then raise a KeyError

dusk comet
#

i dont think it is special cased
containers like dict and list are first doing identity check and only if it fails they do ==
so hash(nan) == hash(nan) and nan is nan, so nan's work the same way as all other objects

#
>>> class X: __eq__ = lambda*_: False; __hash__ = lambda x: id(x)
...
>>> hash(X())
2_657_535_760_144
>>> x = X()
>>> d = {}
>>> d[x] = 1
>>> d[x] = 2
>>> d
{<__main__.X object at 0x0000026AC17E6710>: 2}
>>> d[x]
2
halcyon trail
#

so dict is also assuming that identity implies equality

#

joy

#

at least there it's more reasonable somehow, as it's a real performance increase. And it feels less dirty because you didn't explicitly ask for ==

#

Indeed, Java's hacks and python's hacks for NaN are actually totally different.

In [1]: nan = float('nan')

In [2]: nan2 = float('nan')

In [3]: nan is nan2
Out[3]: False

In [4]: d = {nan: 5}

In [5]: d[nan2]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[5], line 1
----> 1 d[nan2]

KeyError: nan

In [6]:
#

java is using structural equality, rather than doing this dubious thing of falling back on identity. But it wants nan to be well behaved, when treated as an object, so it forces nan to be equal to itself

#

Java's solution for NaN at least yields predictable behavior.
python is just using identity as a shortcut over structural equality in many places, for performance reasons. This doesn't consistently fix any issue with nan, rather the opposite, it makes the behavior inconsistent

raven ridge
dusk comet
#
>>> x = float('nan')
>>> y = float('nan')
>>> x is y
False
``` 🤔
why there is no global `nan` object in interpreter? every `float('nan')` could just return already created object
#

or float('nan') has to return new object every time?

raven ridge
#

I can't see a reason why it couldn't always return the same object. I think it's just like tuples or ints or strs: the interpreter is free to reuse equivalent immutable objects rather than creating a new instance

urban sandal
#

The best solution for this would probably break too many things. math.isnan exists, so deciding that python's nans don't need to compare unequal and that people should be explicitly handling nans anyhow could have been a pragmatic decision... years ago, but not now.

inland fjord
#

I am installing Python on my another machine. I would like to know the best path/directory (the one that is generally use which does not have any issues related to admin privileges and some libraries not working) for it. Where should I install it?

dusk comet
#

bytearrays have two "sizes":

  1. len(obj) - size visible from python world
  2. size of internal buffer

there are two functions:

  • PyByteArray_Size(PyObject *bytearray) - Return the size of bytearray after checking for a NULL pointer.
  • PyByteArray_Resize(PyObject *bytearray, Py_ssize_t len) - Resize the internal buffer of bytearray to len.

PyByteArray_Resize explicitly says that it modifies size of internal buffer
PyByteArray_Size returns "the size of bytearray", but i dont understand what size: is it a len(obj) or size of internal buffer?

i looked into source code, but it confused me even more
i want to:

  • get len(bytearray) efficiently (avoiding calling .__len__() if possible)
  • set to len(bytearray) after i resized and modified internal buffer
    can you help me please?
feral island
dusk comet
#

is there a fast way to set to len()?

feral island
#

to set the allocated size to len()?

#

PyByteArray_Resize should do that, but resizing presumably means copying the internal buffer, so that's slower than not resizing

dusk comet
#

i think PyByteArray_Resize should not copy internal buffer if requested size is close enough to actual size of internal buffer
i am calling PyByteArray_Resize to ensure that some amount of bytes can fit into buffer, i dont want to change len() at this moment
in some cases i want to change len() (without copying anything, new len() definitely can fit into internal buffer)
i am confused, i will think about what i want and ask questions later

feral island
#

by the way seems like you can get the allocated size (ob_alloc in the C struct) from the __alloc__ method in Python

>>> ba = bytearray(4)
>>> ba.__alloc__()
5
>>> len(ba)
4
>>> ba.pop()
0
>>> ba.__alloc__()
5
>>> len(ba)
3

dusk comet
#

obj->ob_alloc - size of internal buffer (obj.__alloc__())
obj->ob_size - len(obj)
right?

feral island
dusk comet
#

i knew about it, i found it from looking at bytearray.__dict__

>>> help(bytearray.__alloc__)
Help on method_descriptor:

__alloc__(...)
    B.__alloc__() -> int

    Return the number of bytes actually allocated.
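
To illustrate the two sizes, here is a small sketch that records `len()` next to `__alloc__()` while a bytearray grows. `__alloc__` is a CPython detail, and the exact capacities it reports are an implementation choice, so no specific numbers are assumed here.

```python
ba = bytearray()
history = []

for i in range(20):
    ba.append(i)
    # (visible length, size of the internal buffer) after each append
    history.append((len(ba), ba.__alloc__()))

# The buffer grows in steps, so there are far fewer distinct capacities than lengths.
distinct_capacities = sorted({alloc for _, alloc in history})

print(history[-1])
print(distinct_capacities)
```

The visible length changes on every append, while the allocated size only changes when the buffer has to be regrown.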
#

what is float.__getformat__?
i cannot call it, it either raises TypeError: __getformat__() argument must be str, not float if i pass float to it, or ValueError: __getformat__() argument 1 must be 'double' or 'float' if i pass str to it

prime estuary
#

That one exists for testing purposes I believe, which is why it's undocumented.
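
For what it's worth, the error messages quoted above suggest the argument has to be the string `'double'` or `'float'`. A short sketch, assuming CPython, where this undocumented classmethod reports which C floating-point format the build detected:

```python
double_format = float.__getformat__("double")
single_format = float.__getformat__("float")
print(double_format, single_format)

# Any other string is rejected with ValueError.
try:
    float.__getformat__("long double")
    rejected = False
except ValueError:
    rejected = True

print(rejected)
```

On common platforms both calls report an IEEE format with the machine's endianness.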

naive saddle
#

!rule 6 7 -- This is offtopic and inappropriately here. Sorry

fallen slateBOT
#

6. Do not post unapproved advertising.

7. Keep discussions relevant to the channel topic. Each channel's description tells you the topic.

sharp plover
#

!cban 895141929971486741 Spam

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @teal torrent permanently.

dusk comet
#
>>> help(property)
class property(object)
...
 |  deleter(...)
 |      Descriptor to obtain a copy of the property with a different deleter.
 |
 |  getter(...)
 |      Descriptor to obtain a copy of the property with a different getter.
 |
 |  setter(...)
 |      Descriptor to obtain a copy of the property with a different setter.
...

why are the docstrings saying that these methods are descriptors? they are just regular methods
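
Whatever the wording of the docstrings, the "copy of the property" part can be checked directly. A minimal sketch (the `Point` class and the `get_x`/`set_x` functions are invented for the example): `setter` leaves the original property untouched and returns a new one.

```python
def get_x(self):
    return self._x


def set_x(self, value):
    self._x = value


read_only = property(get_x)
read_write = read_only.setter(set_x)  # returns a new property object


class Point:
    def __init__(self):
        self._x = 0

    x = read_write


p = Point()
p.x = 5

print(read_write is read_only)           # a different object
print(read_only.fset, read_write.fset)   # original still has no setter
print(p.x)
```

This is why the usual `@x.setter` pattern has to reuse the same name: the decorated result replaces the earlier property in the class body.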

spark magnet
spark magnet
raven ridge
#

!cban 1099566257948327946 Advertising and probable scam

fallen slateBOT
#

:incoming_envelope: :ok_hand: applied ban to @rich zenith permanently.

twilit canopy
#

Yo

#

uhm

#

wanna see my script

narrow kettle
terse stream
#

can anyone tell me where I can find links to classes teaching html through pycharm?

narrow kettle
dusk comet
#

Long time Pythoneer Tim Peters succinctly channels the BDFL’s guiding principles for Python’s design into 20 aphorisms, only 19 of which have been written down.
where is the 20th aphorism?

#

it is not offtopic, because it is about pep20!

raven ridge
#

You'd have to ask Tim, I suppose

umbral plume
dapper lily
#

yet

jade raven
#

let's-hear-it-for-lambda-in-curly-assignment-stmts-ly y'rs - tim

ripe tinsel
#

Not sure which board to post this on, so here goes:

There has been a lot of hype with the recent AI showpieces that have come out, but I believe the true future of AI is likely to be single-task applications using small, focused models that fit on small devices. One such application can have a significant impact on the performance of Python.

There are two kinds of environment in which Python operates: the "development environment" is served well by the JIT compiler, but the "production environment" is not. Python is incredibly slow compared to other languages, but with the advent of AI a python app could be deployed in binary.

Theoretically, AI could be used to translate every package in PyPI into machine language (or you can just compile it) and include the binary files in the packages. You can then run a python program in the command line using a prompt like "python -P app.py" to let the JIT compiler load the binary for packages, rather than compiling packages every time they are imported.

Alternatively, you could provide two separate compilers: one for development and casual use (JIT) and another which compiles the whole program, including the import tree, into binary for production. Having the binaries in the package files will be more efficient though.

quick snow
#

Python doesn't compile packages every time they are imported (by default). In general, this would not work like you think, and there are already third-party packages that more-or-less do something like this (translate Python code into machine-code), but they have limitations.

#

The reason Python isn't compiled to machine code isn't because people aren't smart enough to write a compiler and we need an AI for it

ripe tinsel
#

It can't be hard to train a transformer with input that is python code and the output is the output from the compiler after it has compiled everything (but it could be expensive). If it is small enough to run quickly on a CPU then it changes the design prospects for future compilers

quick snow
#

It can and is.

grave jolt
#

Machine learning tools are inherently "black box", you can't peek inside and be relatively sure that it will do the right thing

#

And you would expect a compiler to be relatively correct.

umbral plume
#

In CPython (or any hand-written compiler/interpreter for that matter), if there's a bug which is causing otherwise valid code to be rejected or compile incorrectly, then you can (eventually) trace down exactly where that bug is originating from in the human-readable source code, and patch it
If your source-to-executable model has a similar bug, then good luck reading the weightings to figure out why its happening, let alone how to fix it

radiant garden
#

A compiler is one of the worst tasks for AI

#

You need your output to be semantically correct

#

and you better be able to justify to people that it is correct (as opposed to how a black box behaves) or else what you have is a hazard 😆

#

And like @quick snow said, python files are not by default compiled each time an import is run. (The __pycache__ is built on first load)
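
A small sketch of the bytecode cache mentioned here, using the standard `py_compile` module to do by hand what the import system normally does on first import. The file name `app.py` and its contents are placeholders.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "app.py")
    with open(source, "w") as fh:
        fh.write("ANSWER = 42\n")

    # Compile to bytecode; the return value is the path of the cached .pyc file.
    compiled = py_compile.compile(source)
    compiled_exists = os.path.exists(compiled)

    print(os.path.relpath(compiled, tmp))
    print(compiled_exists)
```

Later imports can load the cached bytecode instead of recompiling the source, as long as the source file has not changed.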

#

(not to mention that machine code is highly platform-specific, so in practice you'd need to both train the model for each target)

dusk comet
#

i think it can generate some internal representation (which is (i hope) relatively simple compared to machine code)
then normal compiler can optimize it and compile to machine code

radiant garden
#

If the model loses track of its context at any point (which for large enough input is inevitable) you will have a miscompilation

#

anyways at that point you have a python parser

#

these techniques might have novel use in optimization passes, provided that there is some proof of correctness

#

but it's a tough sell since "executing the output of a language model" is a known security risk

grave jolt
#

Yeah I don't see how "AI" is relevant to this problem

rose pagoda
#

seems like PEP 703 - Making the Global Interpreter Lock Optional in CPython will not get accepted or make further progress
from Sam Gross author of the PEP

Hi Gregory and the Steering Council,

Thanks for reviewing the PEP. The PEP was posted five months ago, and it has been 20 months since an end-to-end working implementation (that works with a large number of extensions) was discussed on python-dev. I appreciate everyone who has taken the time to review the PEP and offer comments and suggestions.

You wrote that the Steering Council's decision does not mean "no," but the steering council has not set a bar for acceptance, stated what evidence is actually needed, nor said when a final decision will be made. Given the expressed demand for PEP 703, it makes sense to me for the steering committee to develop a timeline for identifying the factors it may need to consider and for determining the steps that would be required for the change to happen smoothly.

Without these timelines and milestones in place, I would like to explain that the effect of the Steering Council's answer is a "no" in practice. I have been funded to work on this for the past few years with the milestone of submitting the PEP along with a comprehensive implementation to convince the Python community. Without specific concerns or a clear bar for acceptance, I (and my funding organization) will have to treat the current decision-in-limbo as a “no” and will be unable to pursue the PEP further.

https://github.com/python/steering-council/issues/188#issuecomment-1581534250

GitHub

Please consider PEP 703 -- Making the Global Interpreter Lock Optional in CPython https://peps.python.org/pep-0703/ The PEP has been discussed in threads listed in its Post-History header The PEP w...

astral gazelle
tranquil quarry
# rose pagoda seem like PEP 703 - Making the Global Interpreter Lock Optional in CPython will ...

Honestly, it's way too early to tell if this message spells the end of PEP 703.

For reference, here's what Sam Gross was responding to: https://github.com/python/steering-council/issues/188#issuecomment-1575106739

Given all the buzz generated by Sam Gross's nogil work, it's pretty clear that there is interest in offering GIL-free code execution for CPython, both from the developer side and community side.

The current steering council seems to have been discussing PEP 703 a lot (it's quite a long and substantial PEP) over the past months (see e.g. https://github.com/python/steering-council/blob/main/updates/2023-03-steering-council-update.md or the Discourse thread or mailing list).

Sam Gross's initial work at https://github.com/colesbury/nogil has been ongoing since late 2021, but only in January did he make a larger set of changes to the PEP and start a new repository at https://github.com/colesbury/nogil-3.12 for updating nogil to incorporate newer changes (e.g. immortal objects). Seems like other core devs want more time for discussion, and from what I can tell it seems like the steering council wants the same. Then again one could say that there has been enough time for discussion around nogil (there were language summits covering nogil in this and last year's PyConUS event, in addition to all the Discourse discussions around it, media references to nogil on e.g. YouTube, etc.).

Full-time work on nogil for such a long period of time must be pretty arduous (especially alone), so perhaps Sam is waiting for some kind of guarantee that PEP 703 will be accepted down the line, presumably soon.

jade raven
#

sounds like a big part of that message is the funding.

raven ridge
#

Not necessarily even that. It sounds like it's mostly just a lack of clarity. It sounds like funding wouldn't be an issue if the SC committed to move forward with the idea.

rich cradle
#

i mean it is reasonable that an organization would want to stop funding or he would want to stop working on it if it's going to go to waste

#

because that's a lot of time, work, and money

urban sandal
#

Without specific concerns or a clear bar for acceptance
It was pretty clear that this was the fundamental hangup. There's a "not no", but no path to a "Yes" given either.

vast saffron
#

To be honest this feels like both sides of the decision would have quite a lot of consequences, but also that there was perceived competition with the per interpreter gil.

This not only feels like almost each person on the council didn't want to be the one that possibly decided the wrong way and be linked to the consequences, but also, from the thread on discourse, that quite a few didn't actually really study the pep.

If this is the case (which I do not say it is, just that the optic of it is there) then that is actually worse for the structure as a whole.

Everyone can make easy or clear decisions, the council is needed for the hard ones, the dilemmas. By not making a decision this became the worst outcome and really gives off a bad vibe.

A No would be ok, a yes too, a missing decision can call the purpose of the body into question.

There is an even worse way of looking at it, but this would imply actual malice, which I don't see nearly enough clear evidence for to explicitly state it and therefore accuse people (clearly: I don't think there was any malice, but it is something that will infect the discussion if the whole situation is not addressed by the council)

#

Sorry for the long text and to somewhat also state my bias:

I am someone who was hoping this would be accepted and actually prioritized.

A yes would have made me happy.

A no would have made me sad and maybe frustrated.

A nothing is making me irrationally annoyed and angry.

raven ridge
#

Historically, one of the biggest issues with attempts to remove the GIL is that people would go off and build some huge change set in total isolation without any feedback from the people who actually maintain the interpreter from day to day and year to year, and then when they get something that more or less works, they try to convince other people to take ownership of this huge thing that only they really understand. I'm not saying that's what's happening here, but clearly there's costs to moving forward with this PEP, and clearly there's value in taking the time to fully understand the approach, figure out the pros and cons, and carefully decide whether it's a good idea.

#

15,000 lines of changes to CPython code, plus an additional 15,000 lines of vendored third party code, is no small maintenance cost.

#

Though that's quite a lot smaller than some previous GIL removal attempts.

novel barn
#

I need some use cases and use case diagram . I need a system test design . I need object, class, and component diagram . I need possible risks and a risk analysis. I need a persistent data management explanation: the description of data schemes, the selection of a database, and the description of the encapsulation of the database. I need those for my project. Is there any anyone could help me ? If threre is anyone could help me please contact me on dm?

radiant garden
#

and you need to get tucked in to bed with a bedtime story

spark magnet
#

harsh

steel solstice
#

are there any peps that cite future features as reasons to implement things?

urban sandal
#

in the form "x will serve as a basis for y" where "y" is its own pep, I think so, but I can't remember any from memory.

umbral plume
#

or the one which introduced __matmul__ (a @ b) specifically said "we're not gonna use this in the stdlib, it's for other libraries to make use of"
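
As a quick illustration of how a library hooks into `@`, here is a toy sketch. The `Vec` class is made up for this example and simply treats `@` as a dot product; real array libraries define their own semantics.

```python
class Vec:
    """Toy vector type (hypothetical) that uses @ for the dot product."""

    def __init__(self, *components):
        self.components = components

    def __matmul__(self, other):
        if not isinstance(other, Vec) or len(other.components) != len(self.components):
            return NotImplemented
        return sum(a * b for a, b in zip(self.components, other.components))


dot = Vec(1, 2, 3) @ Vec(4, 5, 6)
print(dot)  # 1*4 + 2*5 + 3*6 = 32
```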

steel solstice
#

Oh subinterpreters yeah

red wharf
#

Has there ever been discussion of implementing C-style ++ and -- operators as a shorthand for += 1 and -=1?

grave jolt
spark magnet
urban sandal
#

+= already isn't guaranteed to be in place, and ++/-- wouldn't be either. what would the perceived benefit be?
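
A short sketch of the "not guaranteed to be in place" point: whether `+=` mutates or rebinds depends on the type of the left-hand side.

```python
# list defines __iadd__, so += mutates the existing object.
items = [1, 2]
alias = items
items += [3]
list_same_object = items is alias

# str is immutable and has no __iadd__, so += builds a new object and rebinds the name.
text = "Hello"
original = text
text += "!"
str_same_object = text is original

print(list_same_object, alias)
print(str_same_object, original)
```

A hypothetical `++` would inherit the same split, so `x++` could not promise in-place behaviour either.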

spark magnet
#

s = "Hello"; s++ # ????

unkempt rock
#

And there's evaluation order as well

inland acorn
#

and also how many chars do you save? one

urban sandal
#

More to my point, if the motivation is that another language has it, it's probably important to make sure the person asking knows that the current solution and the proposed one both behave (or, for the proposal, would have to behave) differently from that language anyhow.

unkempt rock
#

It was mostly used in C-like for loops, but they are vulnerable to off-by-one bugs.
That was in Swift as well, but was removed in 3.0.

red wharf
#

I see, that makes sense

dusk comet
grave jolt
#

like advancing an iterator

spark magnet
meager arch
spark magnet
#

in C, yes 🙂

grave jolt
spark magnet
grave jolt
#

very handy

raven ridge
# meager arch s++ is “ello”

in C++, you can't know what s++ is without knowing what type s is. You've assumed it's const char*, but it could be any type that defines an operator=(const char*)

swift imp
spark magnet
swift imp
swift imp
faint river
#

seems you want ++ to be equivalent to *= 2 for strings

#

really doesn't make sense

swift imp
#

Yeah it doesn't

lone sun
#

I'm coming late to this discussion, but I just want to say: plus1

hybrid relic
#

y'all know how to regenerate the configure script in the Interpreter's source code?

alpine rose
hybrid relic
#

ah, thanks

dusk comet
#

Is cpython using CMake?

feral island
#

it's configure built with autoconf

dusk comet
#

👍

#

ty

lilac sparrow
#

why are enums implemented as inheritance and dataclasses with a decorator? what's the reasoning?

dusk comet
#

all enum classes have some common methods, so it is more reasonable to use inheritance
dataclasses generate new methods, so there is no way to inherit from something, because methods are generated at runtime (__init__ can have arbitrary number of arguments, __eq__ can check equality for arbitrary number of fields)

spice pecan
# lilac sparrow my are enums implemented as inheritance and dataclasses with a decorator? what's...

Dataclasses are more of an implementation detail than an actual parent, they just generate boilerplate for you, and as such they are designed to be specifically opt-in - if you inherit class Child from a dataclass-decorated class Parent, Child will not be a dataclass itself unless you decorate it explicitly.

Enums, on the other hand, are generally way easier to spot than dataclasses, very rarely inherited from, and also have a fundamentally different use case compared to regular classes, so the effects on inheritance will come up less and be more predictable
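
A sketch of the opt-in behaviour for dataclasses (class names are placeholders). One caveat worth labelling: an undecorated subclass still inherits the parent's generated methods and field table, so what it loses is processing of its own new annotations.

```python
from dataclasses import dataclass, fields


@dataclass
class Parent:
    x: int


class PlainChild(Parent):   # not decorated: 'y' is just a class attribute
    y: int = 0


@dataclass
class DataChild(Parent):    # decorated: 'y' becomes a real field
    y: int = 0


plain_fields = [f.name for f in fields(PlainChild)]
data_fields = [f.name for f in fields(DataChild)]

try:
    PlainChild(1, 2)        # inherited __init__ only knows about x
    plain_accepts_y = True
except TypeError:
    plain_accepts_y = False

print(plain_fields, data_fields, plain_accepts_y)
print(DataChild(1, 2))
```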

vast saffron
#

Is there any footgun regarding inheriting from an enum? I use a base enum that only changes the way _missing_ is handled to a way I personally like more. There has been no problems with having almost all my enums inherit from that base enum, BUT should I be on the look out for trouble?
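
For context, a minimal sketch of the pattern being asked about. The case-insensitive lookup is only an assumed example of a custom `_missing_`; the names are hypothetical. It also shows the one restriction that is certain: an enum that already has members cannot be subclassed.

```python
from enum import Enum


class CaseInsensitiveEnum(Enum):
    """Member-less base enum that customises failed value lookups."""

    @classmethod
    def _missing_(cls, value):
        if isinstance(value, str):
            for member in cls:
                if isinstance(member.value, str) and member.value.lower() == value.lower():
                    return member
        return None  # falls through to the usual ValueError


class Color(CaseInsensitiveEnum):
    RED = "red"
    GREEN = "green"


looked_up = Color("RED")

# Enums with members cannot be extended.
try:
    class MoreColors(Color):
        BLUE = "blue"
    extendable = True
except TypeError:
    extendable = False

print(looked_up, extendable)
```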

dusk comet
#

Why there is no float.is_nan and float.is_inf?
I think it is natural to put these functions inside float class instead of math module.
(There is one downside: it would require int.is_nan and complex.is_nan to exist)

#

Also float('inf') to construct special value seems ugly to me. Why there is no float.nan and float.inf classvars?

grave jolt
#

in fact there can be many different nans iirc

dusk comet
#

I thought cpython creates only one "static" nan object and converts any c-level nans to that canonical nan object

#

There are only two infinities, right?
I think they also can be allocated only once

quick snow
#

!e

import struct
nan1 = float("nan")
nan2, = struct.unpack("f", b"\x12\x00\xc0\x7f")
print(nan1, nan2, nan1 is nan2)
print(struct.pack("f",nan1), struct.pack("f", nan2))
fallen slateBOT
#

@quick snow :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | nan nan False
002 | b'\x00\x00\xc0\x7f' b'\x12\x00\xc0\x7f'
dusk comet
#

That feels wrong, but ok

spark magnet
#

nan doesn't mean infinity. It means, "I don't know what this is, but I know it's not a number".

dusk comet
#

Yes, i know. But nans and infs are similar because they are special-cased in float('inf')/float('nan') and math.isinf/math.isnan

#

And they have no literals

spark magnet
dusk comet
#

I think it would be better to have only one "canonical" nan object, so you can do checks like x is float.nan instead of math.isnan(x). And in this case we would be able to rely on the fact that float('nan') (or float('inf')) always returns the same object
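
A small sketch of why an identity check is not a reliable NaN test today, assuming CPython: a NaN produced by arithmetic is a fresh object, so only value-based checks catch it.

```python
import math

computed = float("inf") - float("inf")   # arithmetic that yields NaN

identity_check = computed is math.nan    # compares objects, not values
isnan_check = math.isnan(computed)       # value-based check
self_inequality = computed != computed   # NaN is the only float unequal to itself

print(identity_check, isnan_check, self_inequality)
```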

spark magnet
#

the floating-point standard says there can be different NaNs
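A small sketch of why an identity check is a shaky NaN test while `math.isnan` (or `x != x`) is not. The second NaN is built from a hand-picked bit pattern with a non-zero payload; that pattern and the helper name `is_nan` are illustrative assumptions. Whether the payload survives a pack/unpack round trip can depend on the platform, so the hex output is printed rather than asserted.

```python
import math
import struct

nan_a = float("nan")
# Hypothetical bit pattern: quiet-NaN exponent bits plus a payload of 0x12.
(nan_b,) = struct.unpack("<d", b"\x12\x00\x00\x00\x00\x00\xf8\x7f")


def is_nan(x):
    """NaN is the only float that compares unequal to itself."""
    return x != x


print(math.isnan(nan_a), math.isnan(nan_b))  # both are NaN
print(is_nan(nan_a), is_nan(nan_b))          # same answer without importing math
print(nan_a is nan_b)                        # identity says nothing about NaN-ness
print(struct.pack("<d", nan_a).hex(), struct.pack("<d", nan_b).hex())
```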

dusk comet
#

!e print(sum(range(10), start=float('nan')))

fallen slateBOT
#

@dusk comet :white_check_mark: Your 3.11 eval job has completed with return code 0.

nan
quick snow
#

There's no need for float.__add__ (or the C equivalent) to return a new nan object here

fallen slateBOT
#

Objects/floatobject.c lines 594 to 602

```c
static PyObject *
float_add(PyObject *v, PyObject *w)
{
    double a,b;
    CONVERT_TO_DOUBLE(v, a);
    CONVERT_TO_DOUBLE(w, b);
    a = a + b;
    return PyFloat_FromDouble(a);
}
```
dusk comet
#

I think it creates a new object every time

quick snow
#

It might, but it wouldn't need to.

#

I see a potential benefit for one pre-allocated nan, like for small ints (although of course it'd need a benchmark). But that doesn't require enforcing singletonness everywhere
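A rough sketch of the comparison being made, under the assumption that it runs on CPython. Small-int caching is an implementation detail, not a language guarantee, so the identity results are printed rather than relied upon; only the value-level facts hold everywhere.

```python
import math

# Two small ints built at runtime; CPython may hand back one cached object.
small_a = int("100")
small_b = int("100")

# A NaN flowing through arithmetic; the result is still NaN, but nothing
# promises it is the same object as the NaN that went in.
nan = float("nan")
total = sum(range(10), start=nan)

print("small ints share an object:", small_a is small_b)
print("NaN result is the input object:", total is nan)
print("NaN result is still NaN:", math.isnan(total))
```

Code that needs to detect NaN should therefore stay with `math.isnan`, whatever the interpreter chooses to cache.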

rose schooner
#

it does

quick snow
rose schooner
spark magnet
#

if your code is too slow because it's creating too many NaNs, you have other things to worry about....

molten onyx
#

adding a branch just for the NaN (and inf?) special casings wouldn't really make sense

#

every floating point add would incur a huge performance penalty for what

#

saving an allocation on a select few cases

urban sandal
dusk comet
#

I didn't know about that, thank you.

quiet crane
quick snow
#

(Linux 5.19.0-41, Ubuntu 22.04, a Lenovo T14 (x86_64, 32 GB RAM, 16 processors), Python 3.12.0a3)

#

I also tested it on 3.11 and 3.10, same behaviour on both.

quiet crane
quick snow
fallen slateBOT
#

:incoming_envelope: :ok_hand: applied timeout to @unkempt rock until <t:1686926429:f> (10 minutes) (reason: duplicates spam - sent 4 duplicate messages).

The <@&831776746206265384> have been alerted for review.

lunar trail
#

Does anyone here happen to know how to raise an error in cpython? I found PyErr_SetString(PyExc_ValueError, "the text of the value error"); but that doesn't seem to actually raise it.

dusk comet
feral island
#

(or sometimes something else, depending on where you are)

lunar trail
#

That worked, thank you 🎉
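For context, `PyErr_SetString` only sets the interpreter's error indicator. The exception propagates once the C function reports failure to its caller, usually by returning `NULL` from a function that returns `PyObject *` (or `-1` from many int-returning ones). The sketch below shows the "indicator first, propagation later" behaviour from Python via `ctypes.pythonapi`, which checks the indicator after each call; that check is only a stand-in for a C function returning `NULL`. The wrapper name `raise_via_c_api` is hypothetical.

```python
import ctypes

# Declare the real C API function so ctypes passes the arguments correctly.
set_string = ctypes.pythonapi.PyErr_SetString
set_string.argtypes = [ctypes.py_object, ctypes.c_char_p]
set_string.restype = None


def raise_via_c_api(message: bytes):
    # This only *sets* the error indicator. ctypes.pythonapi then notices the
    # indicator after the call and raises, much as the interpreter does when
    # a C function sets an error and returns NULL.
    set_string(ValueError, message)


try:
    raise_via_c_api(b"the text of the value error")
except ValueError as exc:
    caught = str(exc)
else:
    caught = None

print(caught)
```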

grave jolt
#

and I thought datetimes were only complicated enough

#

24:00?!

feral cedar
#

is that a leap second

quick snow
#

Nope, that would be 23:59:60

lunar trail
# grave jolt 24:00?!

It's only in the fromisoformat calls (not constructors), and it converts to 00:00 so .hour is still 0..23
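A hedged sketch of the distinction drawn here: the constructor rejects `hour=24`, while `fromisoformat` may accept `24:00` and normalise it. Which Python versions accept the string form is an assumption not stated in the conversation, so the parse is wrapped in `try`/`except` and the result is printed rather than predicted.

```python
from datetime import datetime

# The constructor validates each field: hour must be in 0..23.
try:
    datetime(2023, 6, 16, 24, 0)
    constructor_accepts_24 = True
except ValueError:
    constructor_accepts_24 = False

# Parsing the ISO string may or may not work, depending on the interpreter.
try:
    parsed = datetime.fromisoformat("2023-06-16T24:00:00")
except ValueError:
    parsed = None  # this interpreter does not accept hour 24 in ISO strings

print("constructor accepts hour=24:", constructor_accepts_24)
print("fromisoformat result:", parsed)
```

Either way, any `datetime` object that comes out still has `.hour` in `0..23`.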

rich cradle
#

hi, could someone perhaps help me comprehend this portion of the grammar?

```
simple_stmts:
    | simple_stmt !';' NEWLINE  # Not needed, there for speedup
    | ';'.simple_stmt+ [';'] NEWLINE
```
(<https://docs.python.org/3/reference/grammar.html>)

what i understand is:
- either a single simple stmt, followed by a newline
- any number of simple stmts, separated by semis, with an optional trailing, followed by a newline
effectively, that ends up being equivalent to the latter (which i presume is what the comment means?)

does that sound right?
#

also, i don't understand why the !';' is in the one at the top. is that part of the performance optimization?

urban sandal
#

The top line there prevents the parser from looking around too much in the case of a simple statement followed by a new line. (which is much more common due to commonly accepted style conventions)

If you're more familiar with regex than formal grammar, it's a similar trick people use with alternating patterns to skip a version of the pattern that requires backtracking if the more common case can be captured without it.

rich cradle
#

got it, thanks, i see why that's used. so, then, would it be right to look at that as the following, if the optimizations are ignored?

simple_stmts: simple_stmt (';' simple_stmt)* ';'? NEWLINE
urban sandal
#

At a glance, that looks as if you can consider it equivalent outside of nudging the parser to better performance, but I'm hesitant to give an affirmative "yes" without viewing everywhere simple_stmts is used as well in case there's any additional subtlety here.

#

I'm mostly sure it's fine to consider that equivalent.

#

(I'm not sure whether there's some non-obvious reason the second line is constructed the way it is rather than the way you wrote it)

rich cradle
#

that's good enough for me haha. thanks!

rich cradle
#

no idea if it's a performance thing or what but that seems to be a fairly common pattern

urban sandal
#

probably, but I'd still want to either test it or inspect how some of the other rules are defined (I don't remember them offhand), in case this is somehow important, before giving an affirmative yes. I am at least sure that conceptually you can fold the two lines of simple_stmts together outside of performance concerns. As for the other part, it's entirely possible it's just a stylistic choice here.

rich cradle
#

makes sense. thanks again!
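One way to sanity-check the folded reading of `simple_stmts` is to feed a few lines to the real parser through `ast.parse`. This only probes observable behaviour (which inputs are accepted), not how the PEG rule is evaluated internally; the helper name `parses` is hypothetical.

```python
import ast


def parses(source):
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


cases = [
    "a = 1\n",          # single statement, no semicolon (the fast-path alternative)
    "a = 1;\n",         # single statement with a trailing semicolon
    "a = 1; b = 2\n",   # semicolon-separated statements
    "a = 1; b = 2;\n",  # separated, plus the optional trailing semicolon
    ";\n",              # a separator with no statement at all
    "a = 1;; b = 2\n",  # an empty statement between separators
]

for source in cases:
    print(repr(source), parses(source))

two_statements = ast.parse("a = 1; b = 2;\n").body
print(len(two_statements))
```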

rich cradle
#

hi, could i get a little sanity check on my understanding of how one might implement semantic indentation while lexing python? i've summarized an implementation of the rules as i understand them and was wondering if anyone would be willing to look it over. the summary is here:

The lexer handles them by keeping a stack of indentations, each a pair of the tab count and the space count at a given indentation level. What the whitespace at the start of a line represents is determined by the following rules:

  • If either the space or the tab count increases, and the other increases or remains the same, it is an indent, and a new level is pushed onto the indentation stack.
  • If either the space or the tab count decreases, and the other decreases or remains the same, it is a dedent. All indentations greater than the current one are popped off the stack, and if the new indentation level does not match any of the ones currently on the stack, an error is emitted.
  • If the space and tab counts change in different directions (that is, either the space count increases and the tab count decreases, or the space count decreases and the tab count increases), an error is emitted, as per the following clause in the reference:

Indentation is rejected as inconsistent if a source file mixes tabs and spaces in a way that makes the meaning dependent on the worth of a tab in spaces; a TabError is raised in that case.

  • If the space and tab counts both remain the same, then no indents or dedents are emitted.

There are a few exceptions to the indentation rules. If a line is empty, only whitespace, only a comment, or a mixture thereof, then it does not contribute to the addition or removal of a level on the indentation stack.
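The rules above can be sketched roughly as follows. This follows the summary as written (a stack of `(tabs, spaces)` pairs) and is not a description of CPython's actual tokenizer; the names `compare` and `indentation_events` are hypothetical, and continuation lines and open brackets are deliberately left out.

```python
def compare(a, b):
    """Order two (tabs, spaces) levels: 1 deeper, 0 equal, -1 shallower, None ambiguous."""
    d_tabs, d_spaces = a[0] - b[0], a[1] - b[1]
    if d_tabs == 0 and d_spaces == 0:
        return 0
    if d_tabs >= 0 and d_spaces >= 0:
        return 1
    if d_tabs <= 0 and d_spaces <= 0:
        return -1
    return None  # the counts moved in opposite directions


def indentation_events(lines):
    stack = [(0, 0)]  # the bottom level is never popped
    events = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank and comment-only lines do not affect the stack
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        level = (leading.count("\t"), leading.count(" "))
        order = compare(level, stack[-1])
        if order is None:
            raise TabError("inconsistent use of tabs and spaces in indentation")
        if order > 0:
            stack.append(level)
            events.append("INDENT")
        elif order < 0:
            # Pop every level that is deeper than the new one.
            while compare(stack[-1], level) == 1:
                stack.pop()
                events.append("DEDENT")
            if stack[-1] != level:
                raise IndentationError("unindent does not match any outer level")
    events.extend("DEDENT" for _ in stack[1:])  # close whatever is still open
    return events


print(indentation_events(["if x:", "    y = 1", "", "    # note", "z = 2"]))
```

Treating a level as a pair keeps the comparison independent of how many spaces a tab is worth, which is the point of the clause quoted from the reference.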
rose schooner
#

although there are more special cases

#
  • if a form feed (\014/\f) is encountered, the indents reset as it is considered a newline(?)
  • if a backslash (\) is encountered in the middle of indenting, the indentation remains the same as the indentation before the first backslash in a group of continued lines
rich cradle
#

i just realized i forgot to copy in the backslash-at-eol and open-delimiter-pair rules too

rich cradle
#

but thanks for looking over it! that's very helpful

rose schooner
rich cradle
#

oh right that's just the backslash-at-eol rule
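The standard `tokenize` module gives a quick way to observe these special cases. The sketch below checks that a backslash continuation and an open bracket both suppress `INDENT` tokens on the continued line, and that blank or comment-only lines do not open or close a level. The helper name `token_names` is hypothetical.

```python
import io
import token
import tokenize


def token_names(source):
    """Return the token type names the tokenizer produces for the source."""
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# The continued line is indented, but it is part of the same logical line.
backslash = "x = 1 + \\\n        2\ny = 3\n"
# Inside brackets, newlines and leading whitespace are not significant.
brackets = "x = (1 +\n        2)\ny = 3\n"
# A blank line and a more deeply indented comment inside a block.
blank_and_comment = "if x:\n    a = 1\n\n        # deeper comment\n    b = 2\n"

for label, source in [("backslash", backslash), ("brackets", brackets),
                      ("blank/comment", blank_and_comment)]:
    names = token_names(source)
    print(label, names.count("INDENT"), names.count("DEDENT"))
```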