#internals-and-peps | Python | Page 110

raven ridge May 21, 2021, 1:37 AM

#

think of things like static analyzers for C that try to detect buffer overflows - they're in a statically typed language, performing difficult sorts of static analysis, in ways that are essentially orthogonal to the type system

#

You're drawing a correlation between static analysis and static typing that I don't see - beyond that obviously a statically typed language makes static analysis of types trivial

acoustic crater May 21, 2021, 1:38 AM

#

I think you can statically figure out every expression in python as long as the inputs aren't doing certain things but it'd be very complex

halcyon trail May 21, 2021, 1:38 AM

#

because they are bolted on after the fact, and they are heuristic

spark magnet May 21, 2021, 1:39 AM

#

static languages don't have metaclasses. That doesn't imply that metaclasses are the hardest thing to statically analyze.

raven ridge May 21, 2021, 1:39 AM

#

acoustic crater I think you can statically figure out every expression in python as long as the ...

You most certainly cannot.

with open("some.fifo") as fifo:
    eval(fifo.readline())

halcyon trail May 21, 2021, 1:39 AM

#

in a mathematical sense, no

raven ridge May 21, 2021, 1:40 AM

#

since a FIFO is read-once, you can't even peek in the file to see what that would do.

halcyon trail May 21, 2021, 1:40 AM

#

the question was asking about static analysis in the context of optimization; to optimize based on static analysis you have to be 100% sure it's correct. If you have 100% certainty of something based on static analysis, then effectively at that point, it could be made part of the type system, since that's what types are.

spark magnet May 21, 2021, 1:41 AM

#

@halcyon trail the 100% is why i mentioned getattr. I think you said getattr was fine as long as the data met a certain criterion. You can't assume that criterion.

halcyon trail May 21, 2021, 1:41 AM

#

If you have a static analyzer for C, sure, it's bolted on, so you still have a separate type system of C itself, and the conclusions of the static analyzing bounds checker, or what not. Effectively, if the latter had 100% certainty, it would be like having a second type system.

#

@spark magnet I mean I'm talking about python features, as they are typically used in day to day usage.

gleaming rover May 21, 2021, 1:42 AM

#

If you have 100% certainty of something based on static analysis, then effectively at that point, it could be made part of the type system, since that's what types are.

#

feel like this is not valid

#

hm but I'm not sure I need to think about it

spark magnet May 21, 2021, 1:42 AM

#

halcyon trail @spark magnet I mean I'm talking about python features, as they are typ...

i would say getattr usually doesn't use literal string names.

halcyon trail May 21, 2021, 1:42 AM

#

A lot of usages of getattr could in practice be done, no problem, with a compile time reflection system

#

for me the most common usage of getattr is reflecting over things usually

#

all of the members of a dataclass, for example
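
A minimal sketch of the reflection use of getattr described above. The `Point` class and the `as_pairs` helper are made up for illustration; the point is only that the attribute names come from the dataclass definition rather than from runtime input, which is the kind of case a compile-time reflection system could plausibly handle.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int


def as_pairs(obj):
    # The names passed to getattr come from the dataclass definition,
    # not from anything read at runtime.
    return [(f.name, getattr(obj, f.name)) for f in fields(obj)]


pairs = as_pairs(Point(1, 2))
print(pairs)
```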

spark magnet May 21, 2021, 1:44 AM

#

that's one use, sure.

#

you said i wasn't allowed to choose a hard case. I think you aren't allowed to choose an easy case 🙂

raven ridge May 21, 2021, 1:44 AM

#

@gleaming rover other things that are tough to statically analyze: import hooks, .pth files, # coding: comments

halcyon trail May 21, 2021, 1:44 AM

#

Well, there's a distribution of hard and easy, is the point

#

with metaclasses it's all hard

#

with decorators, very often hard, not always

gleaming rover May 21, 2021, 1:45 AM

#

raven ridge @gleaming rover other things that are tough to statically analyze: import...

# coding: is hell

#

but really cool

halcyon trail May 21, 2021, 1:45 AM

#

a lot of decorators are ok because they only change implementation details of say a callable

#

in that case the situation vis-a-vis types is simple

raven ridge May 21, 2021, 1:45 AM

#

halcyon trail with metaclasses it's all hard

Is it?

class MyMetaclass(type):
    pass

class A(metaclass=MyMetaclass):
    pass

halcyon trail May 21, 2021, 1:45 AM

#

but as soon as the decorator starts changing the API, it's a mess
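
A rough illustration of the two kinds of decorator being contrasted here. Both decorators are invented for this sketch: `counted` only changes implementation details and leaves the callable's interface alone, while `as_text` changes what callers get back, so the decorated function's type no longer matches what its body suggests.

```python
import functools


def counted(func):
    # Implementation detail only: same arguments in, same value out.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


def as_text(func):
    # Changes the API: whatever func returned, callers now get a str.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return str(func(*args, **kwargs))
    return wrapper


@counted
def add(a, b):
    return a + b


@as_text
def add_text(a, b):
    return a + b


print(add(1, 2), add.calls, repr(add_text(1, 2)))
```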

#

.... I mean in real world use cases of metaclasses

gleaming rover May 21, 2021, 1:46 AM

#

hm

raven ridge May 21, 2021, 1:46 AM

#

so you're picking hard cases.

halcyon trail May 21, 2021, 1:46 AM

#

No, I'm talking about what's typical in real world code

gleaming rover May 21, 2021, 1:46 AM

#

does it matter though

#

as long as there's a possibility of a hard case

#

you need to be prepared to deal with that

raven ridge May 21, 2021, 1:46 AM

#

Right - which means that it isn't metaclasses themselves that make static analysis difficult, it's things that are done by the metaclass.

gleaming rover May 21, 2021, 1:46 AM

#

anyway I feel like

halcyon trail May 21, 2021, 1:46 AM

#

well, I dunno, it depends I guess what angle you are looking at it from

gleaming rover May 21, 2021, 1:47 AM

#

raven ridge Right - which means that it isn't metaclasses _themselves_ that make static anal...

yes, this

#

that's what I was thinking

raven ridge May 21, 2021, 1:47 AM

#

What are things that metaclasses can do that make static analysis harder?

gleaming rover May 21, 2021, 1:47 AM

#

metaclasses are only hard because

#

you can do a LOT of things in Python

halcyon trail May 21, 2021, 1:47 AM

#

when you asked "most dynamic", I thought about the features that on average do the most dynamic things

gleaming rover May 21, 2021, 1:47 AM

#

but if you could do only what you would normally be able to do in a language with a stronger type system and less reflection

#

I don't think metaclasses would really be problematic

halcyon trail May 21, 2021, 1:48 AM

#

metaclasses, again, involve executing completely arbitrary code, just to understand what the type looks like

raven ridge May 21, 2021, 1:48 AM

#

did I mention that you can change the class of an instance in Python?

halcyon trail May 21, 2021, 1:48 AM

#

so they would be quite problematic

gleaming rover May 21, 2021, 1:48 AM

#

raven ridge did I mention that you can change the class of an instance in Python?

yes

#

but I'm saying

halcyon trail May 21, 2021, 1:49 AM

#

I guess it really all depends on how you want to compare. getattr means you'll need to know the value of a string at compile time. In theory, that can also involve executing arbitrary code.

#

So metaclasses and getattr are equally dynamic, in the black and white sense

raven ridge May 21, 2021, 1:49 AM

#

!e ```py
class A:
    pass

class B:
    pass

a = A()
print(isinstance(a, A), isinstance(a, B))
a.__class__ = B
print(isinstance(a, A), isinstance(a, B))
```

fallen slateBOT May 21, 2021, 1:49 AM

#

@raven ridge :white_check_mark: Your eval job has completed with return code 0.

001 | True False
002 | False True

halcyon trail May 21, 2021, 1:49 AM

#

In practice computing strings at compile time can often be done to a great extent, if they don't literally depend on information given at runtime

#

with the metaclass though you are still running arbitrary code. at least, that is how I see it.

raven ridge May 21, 2021, 1:50 AM

#

halcyon trail So metaclasses and getattr are equally dynamic, in the black and white sense

not really - metaclasses are only equally dynamic as getattr if they do dynamic things like setattr. Otherwise, they're less dynamic, as in my trivial example of a do-nothing metaclass.

halcyon trail May 21, 2021, 1:50 AM

#

they're not less dynamic, because like I said, you need to execute the code in the metaclass in order to understand the type, and the type is something you want to understand statically

raven ridge May 21, 2021, 1:51 AM

#

why do you need to execute code in the metaclass in order to understand the type, instead of statically analyzing it?

#

The answer needs to be that the metaclass does something that's resistant to static analysis, and that's the thing that we're looking for - right?
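
One possible answer, sketched with a made-up metaclass: `AutoGetters` invents method names from data in the class body, so the attributes of `User` only become known once the metaclass code has run. This is an illustration of the general idea, not a claim about what any particular library does.

```python
class AutoGetters(type):  # hypothetical metaclass
    def __new__(mcls, name, bases, namespace):
        # Invent one get_<field>() method per entry in FIELDS. The method
        # names exist only after this loop has run.
        for field in namespace.get("FIELDS", ()):
            def getter(self, _field=field):
                return getattr(self, _field)
            namespace["get_" + field] = getter
        return super().__new__(mcls, name, bases, namespace)


class User(metaclass=AutoGetters):
    FIELDS = ("name", "email")

    def __init__(self, name, email):
        self.name = name
        self.email = email


user = User("ada", "ada@example.com")
print(user.get_name(), user.get_email())
```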

halcyon trail May 21, 2021, 1:52 AM

#

If you want to do that then you can also just bury the answer all the way at the bottom

#

the most dynamic feature of python is that everything is a dictionary, and you can modify those dictionaries freely

#

it's not really getattr, it's what getattr does, etc

#

keep pushing it down

raven ridge May 21, 2021, 1:52 AM

#

yes, I agree with that.

#

except getattr is the lowest level for __slots__ classes, which don't have a __dict__

#

but I definitely agree that direct manipulation of __dict__ also makes static analysis very tough.
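
A small sketch of the two cases mentioned above, using throwaway classes: an ordinary instance whose attributes can be created by editing `__dict__` directly, and a `__slots__` class that has no `__dict__`, where getattr and setattr are the lowest level available.

```python
class Plain:
    pass


class Slotted:
    __slots__ = ("x",)


p = Plain()
p.__dict__["anything"] = 1   # attribute created by editing the dict
print(p.anything)

s = Slotted()
setattr(s, "x", 2)           # no __dict__ here, so go through setattr/getattr
print(getattr(s, "x"), hasattr(s, "__dict__"))
```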

grave jolt May 21, 2021, 1:56 AM

#

isn't it also true that things which are difficult for static analysis are also kinda difficult for humans to understand/reason about?

#

(in general)

#

like, I remember programming in C for a little bit

gleaming rover May 21, 2021, 1:57 AM

#

grave jolt isn't it also true that things which are difficult for static analysis are also ...

monad transformers 🥴 vs human emotions 👼

grave jolt May 21, 2021, 1:57 AM

#

don't say the M-word plz lemon_pensive

raven ridge May 21, 2021, 1:58 AM

#

hm. There's a correlation, definitely, but I don't think that's a rule.

try:
    from unittest import mock
except ImportError:
    import mock

my_mock = mock.MagicMock()

is reasonably dynamic, but not difficult for humans to reason about.

grave jolt May 21, 2021, 1:58 AM

#

maybe

#

well, that's because there is an external explanation to a human

raven ridge May 21, 2021, 1:59 AM

#

there's two different modules named mock with basically the same interface, which can be used interchangeably.

grave jolt May 21, 2021, 1:59 AM

#

or I'm not sure what you're talking about

#

ah

raven ridge May 21, 2021, 1:59 AM

#

mock is a backport of unittest.mock to older interpreters - and it's still actively maintained, so you can actually import it in newer interpreters to get newer versions of mock than the stdlib shipped with.

#

it's more like unittest.mock is a periodic fork of mock, heh

grave jolt May 21, 2021, 2:00 AM

#

yeah, not sure how static analysis tools will cope with that

#

I think I'm damaged by static typing.

#

I'm somehow unable to enjoy dynamic typing

raven ridge May 21, 2021, 2:03 AM

#

I just built a library for work that, based on a config file or an environment variable, does essentially

if use_new_stuff:
    from .new_submodule import Thing1, Thing2
else:
    from .old_submodule import Thing1, Thing2
__all__ = ["Thing1", "Thing2"]

#

mypy is not happy.

#

whereas people have no trouble with it, because I've guaranteed that those things are interface compatible, even if they're not strictly speaking the same type.

acoustic crater May 21, 2021, 2:05 AM

#

halcyon trail .... I mean in real world use cases of metaclasses

are there any?

#

at this point

grave jolt May 21, 2021, 2:06 AM

#

acoustic crater are there any?

!d enum.Enum

fallen slateBOT May 21, 2021, 2:06 AM

#

enum.Enum


```py
class enum.Enum
```
Base class for creating enumerated constants. See section [Functional API](https://docs.python.org/3/library/enum.html#functional-api) for an alternate construction syntax.

grave jolt May 21, 2021, 2:06 AM

#

or, for example, pydantic.BaseModel

#

or the stuff in SQLAlchemy
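
For the enum case, a quick way to see the metaclass at work. `Color` is a throwaway example; `enum.EnumMeta` is the stdlib metaclass (newer versions also expose it as `enum.EnumType`).

```python
import enum


class Color(enum.Enum):
    RED = 1
    GREEN = 2


# The metaclass is what makes the class itself iterable and sized.
print(type(Color))
print(list(Color), len(Color))
```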

acoustic crater May 21, 2021, 2:06 AM

#

ah nice examples

#

so, as I should have guessed, metaprogramming

raven ridge May 21, 2021, 2:07 AM

#

yeah - metaclasses are always for metaprogramming, really

grave jolt May 21, 2021, 2:07 AM

#

Yeah, you wouldn't write your own metaclass if you're making a web API or something. Unless you really care about job security.

rich cradle May 21, 2021, 2:07 AM

#

@fallen slate uses a metaclass for getting info from config.yml iirc

raven ridge May 21, 2021, 2:07 AM

#

the only reason to use a metaclass is that you want to do something that can't be done easily with just regular types.

grave jolt May 21, 2021, 2:08 AM

#

rich cradle @fallen slate uses a metaclass for getting info from config.yml iirc

yeah 🤔

acoustic crater May 21, 2021, 2:08 AM

#

the closest I've gotten to a real use case for metaclasses is using __init_subclass__

raven ridge May 21, 2021, 2:08 AM

#

and there's fewer of those cases now than ever, thanks to things like __init.... yeah.
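
A minimal sketch of the sort of thing that used to need a metaclass and can now be done with `__init_subclass__`. The `Plugin` registry is a made-up example.

```python
class Plugin:
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once for every subclass, with no metaclass involved.
        Plugin.registry[cls.__name__] = cls


class CsvExport(Plugin):
    pass


class JsonExport(Plugin):
    pass


print(sorted(Plugin.registry))
```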

grave jolt May 21, 2021, 2:08 AM

#

and __class_getitem__

#

well, metaclasses aren't the only solution to those things, other languages have come up with their own ways of creating schemas or enums

acoustic crater May 21, 2021, 2:10 AM

#

my favorite use case of metaclasses is when ppl say everything is an object in Java and C# and I can say "no, classes themselves aren't instantiated objects in those languages"
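
On the Python side of that comparison, a short sketch showing that a class is itself an instance of `type`, and that a class can be built by calling `type` at runtime. The class names are placeholders.

```python
class A:
    pass


# A class is an object, and specifically an instance of type.
print(isinstance(A, object), isinstance(A, type))

# Classes can also be created by calling type directly.
B = type("B", (A,), {"greet": lambda self: "hi"})
print(B().greet(), issubclass(B, A))
```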

grave jolt May 21, 2021, 2:10 AM

#

well, in Java and C# not everything is an object anyway

#

they have primitives

acoustic crater May 21, 2021, 2:11 AM

#

ah I thought in C# types pretended to be classes

#

not very familiar with it though

valid rose May 21, 2021, 2:11 AM

#

anyone know what gc.get_referents does?

grave jolt May 21, 2021, 2:11 AM

#

!d gc.get_referents

fallen slateBOT May 21, 2021, 2:11 AM

#

gc.get_referents


```py
gc.get_referents(*objs)
```
Return a list of objects directly referred to by any of the arguments. The referents returned are those objects visited by the arguments’ C-level [`tp_traverse`](https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_traverse "PyTypeObject.tp_traverse") methods (if any), and may not be all objects actually directly reachable. [`tp_traverse`](https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_traverse "PyTypeObject.tp_traverse") methods are supported only by objects that support garbage collection, and are only required to visit objects that may be involved in a cycle. So, for example, if an integer is directly reachable from an argument, that integer object may or may not appear in the result list.

Raises an [auditing event](https://docs.python.org/3/library/sys.html#auditing) `gc.get_referents` with argument `objs`.

grave jolt May 21, 2021, 2:12 AM

#

basically, gets children of an object

valid rose May 21, 2021, 2:12 AM

#

i still don't understand referred to

acoustic crater May 21, 2021, 2:12 AM

#

but yeah metaclasses let me badger ppl who claim everything is an object in their language which is pretty good on its own

grave jolt May 21, 2021, 2:13 AM

#

valid rose i still don't understand `referred to`

If an object a refers to object b, object b cannot be garbage collected before a is garbage collected

valid rose May 21, 2021, 2:13 AM

#

grave jolt If an object `a` refers to object `b`, object `b` cannot be garbage collected be...

ok

#

now for some crazy stuff

grave jolt May 21, 2021, 2:13 AM

#

valid rose ok

Basically, a has b as its attribute, you could say.
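
A tiny sketch of "a refers to b", assuming CPython's `gc` module. The `Node` class is a placeholder; a list is used as the container because a list is known to report every item it holds to the collector.

```python
import gc


class Node:  # hypothetical example class
    pass


b = Node()
a = [b]  # the list `a` refers to `b`

referents = gc.get_referents(a)
print(any(r is b for r in referents))
```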

valid rose May 21, 2021, 2:14 AM

#

!e ```py
import gc
gc.get_referents(int.__dict__)[0]['uwu'] = lambda s: print('uwu')

(5).uwu()
```

fallen slateBOT May 21, 2021, 2:14 AM

#

@valid rose :white_check_mark: Your eval job has completed with return code 0.

uwu

valid rose May 21, 2021, 2:14 AM

#

what is happening here

#

my mind is bending

acoustic crater May 21, 2021, 2:14 AM

#

#esoteric-python lol

grave jolt May 21, 2021, 2:14 AM

#

!e

import gc
print(gc.get_referents(int.__dict__))

fallen slateBOT May 21, 2021, 2:14 AM

#

@grave jolt :white_check_mark: Your eval job has completed with return code 0.

[{'__repr__': <slot wrapper '__repr__' of 'int' objects>, '__hash__': <slot wrapper '__hash__' of 'int' objects>, '__getattribute__': <slot wrapper '__getattribute__' of 'int' objects>, '__lt__': <slot wrapper '__lt__' of 'int' objects>, '__le__': <slot wrapper '__le__' of 'int' objects>, '__eq__': <slot wrapper '__eq__' of 'int' objects>, '__ne__': <slot wrapper '__ne__' of 'int' objects>, '__gt__': <slot wrapper '__gt__' of 'int' objects>, '__ge__': <slot wrapper '__ge__' of 'int' objects>, '__add__': <slot wrapper '__add__' of 'int' objects>, '__radd__': <slot wrapper '__radd__' of 'int' objects>, '__sub__': <slot wrapper '__sub__' of 'int' objects>, '__rsub__': <slot wrapper '__rsub__' of 'int' objects>, '__mul__': <slot wrapper '__mul__' of 'int' objects>, '__rmul__': <slot wrapper '__rmul__' of 'int' objects>, '__mod__': <slot wrapper '__mod__' of 'int' objects>, '__rmod__': <slot wrapper '__rmod__' of 'int' objects>, '__divmod__': <slot wrapper '__divmod__' of 'int' objects>, '_
... (truncated - too long)

Full output: https://paste.pythondiscord.com/qotozigiyo.txt?noredirect

valid rose May 21, 2021, 2:15 AM

#

hmm, so what is a slot wrapper

grave jolt May 21, 2021, 2:15 AM

#

ah, i c

acoustic crater May 21, 2021, 2:15 AM

#

a slot wrapper wraps a function defined in C

valid rose May 21, 2021, 2:15 AM

#

so by modifying this dict, we can add new methods heh?

acoustic crater May 21, 2021, 2:15 AM

#

not exactly

valid rose May 21, 2021, 2:16 AM

#

acoustic crater not exactly

then why did the uwu example work?

grave jolt May 21, 2021, 2:16 AM

#

valid rose what is happening here

int.__dict__ is a mappingproxy. It's a read-only mapping, and it refers to a dictionary (i.e. has it as an attribute)

acoustic crater May 21, 2021, 2:16 AM

#

yeah

grave jolt May 21, 2021, 2:16 AM

#

gc.get_referents(int.__dict__) has that dict as the first element
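
The same mechanism can be tried without touching a builtin type. This sketch assumes CPython, where a mappingproxy reports the mapping it wraps to the garbage collector; `backing` and `proxy` are names chosen for the example.

```python
import gc
import types

backing = {"x": 1}
proxy = types.MappingProxyType(backing)  # read-only view, like int.__dict__

try:
    proxy["y"] = 2
except TypeError as exc:
    print("proxy rejects writes:", exc)

# The proxy refers to the dict it wraps, so the garbage collector can
# hand that dict back, and the dict itself is mutable.
inner = gc.get_referents(proxy)[0]
inner["y"] = 2
print(proxy["y"], inner is backing)
```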

valid rose May 21, 2021, 2:17 AM

#

grave jolt `gc.get_referents(int.__dict__)` has that dict as the first element

but this time its mutable?

grave jolt May 21, 2021, 2:17 AM

#

yes, that dict is mutable

valid rose May 21, 2021, 2:17 AM

#

!e ```py
import gc
del gc.get_referents(int.__dict__)[0]['__repr__']

print(repr(5))
```

acoustic crater May 21, 2021, 2:18 AM

#

I am surprised that works actually without ctypes.pythonapi.PyType_Modified

#

that is pretty interesting

fallen slateBOT May 21, 2021, 2:18 AM

#

@valid rose :white_check_mark: Your eval job has completed with return code 0.

valid rose May 21, 2021, 2:18 AM

#

why doesnt it delete

acoustic crater May 21, 2021, 2:18 AM

#

you can't mess with dunders by editing the dict alone

valid rose May 21, 2021, 2:18 AM

#

interesting

acoustic crater May 21, 2021, 2:18 AM

#

because they are defined in slots of the struct in C

valid rose May 21, 2021, 2:19 AM

#

so, if i define __slots__ in my own classes? can people mess with my dunders

acoustic crater May 21, 2021, 2:19 AM

#

no that's different

valid rose May 21, 2021, 2:19 AM

#

only for c dunders eh?

acoustic crater May 21, 2021, 2:19 AM

#

ya

#

fishhook module does it

#

https://pypi.org/project/fishhook/

PyPI

fishhook

Allows for runtime hooking of static class functions

valid rose May 21, 2021, 2:19 AM

#

can ctypes access libc?

pliant tusk May 21, 2021, 2:20 AM

#

yes

valid rose May 21, 2021, 2:20 AM

#

wait a sec, can i then use malloc and free in python?

pliant tusk May 21, 2021, 2:20 AM

#

technically, yes
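
A hedged sketch of what "technically, yes" can look like, assuming a POSIX system (Linux or macOS) where `find_library("c")` locates the C library. The explicit `restype` and `argtypes` declarations matter: without them ctypes assumes `int`, which can truncate pointers on 64-bit platforms.

```python
import ctypes
import ctypes.util

# find_library may return None on some platforms; on POSIX, CDLL(None) then
# falls back to the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the real signatures instead of relying on the int default.
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]

ptr = libc.malloc(16)
if not ptr:
    raise MemoryError("malloc failed")
try:
    ctypes.memmove(ptr, b"hello\x00", 6)  # copy bytes into the raw buffer
    text = ctypes.string_at(ptr)           # read back up to the NUL byte
finally:
    libc.free(ptr)                         # nothing frees this for us

print(text)
```

Nothing here is checked by the interpreter: a wrong size or a double free would corrupt memory or crash the process rather than raise an exception.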

valid rose May 21, 2021, 2:21 AM

#

hmm...

pliant tusk May 21, 2021, 2:22 AM

#

```py
>>> from ctypes.util import find_library
>>> from ctypes import CDLL
>>> libc = CDLL(find_library('libc'))
>>> ret = libc.printf(b'%i\n', 1)
1
```

raven ridge May 21, 2021, 2:23 AM

#

can, yes! Should, no.

#

ctypes is like writing an extension module, but worse and harder to maintain.

pliant tusk May 21, 2021, 2:24 AM

#

ctypes should really only be used to integrate with code that doesnt have a python interface

raven ridge May 21, 2021, 2:24 AM

#

everything ctypes can do, Cython can do better.

pliant tusk May 21, 2021, 2:24 AM

#

ctypes doesnt require compiling tho

raven ridge May 21, 2021, 2:25 AM

#

yes, and in exchange it gives you no compile time type safety. And it's not necessarily portable across machines.

pliant tusk May 21, 2021, 2:25 AM

#

i still prefer Cython, but for quick prototyping with a c lib i tend to use ctypes

raven ridge May 21, 2021, 2:25 AM

#

CFFI has a mode where it doesn't require pre-compilation, too

#

I'd use that before ctypes, too

pliant tusk May 21, 2021, 2:26 AM

#

ctypes is stdlib

raven ridge May 21, 2021, 2:27 AM

#

yes, but so are all sorts of other terrible modules you should never use in real world code

pliant tusk May 21, 2021, 2:27 AM

#

fair enough

valid rose May 21, 2021, 2:27 AM

#

pliant tusk ```py >>> from ctypes.util import find_library >>> from ctypes import CDLL >>> l...

woah

raven ridge May 21, 2021, 2:27 AM

#

urllib.request comes to mind first, but there are plenty of other things in the stdlib that are substantially harder to use, less safe, and more error prone than the equivalent third party lib

static bluff May 21, 2021, 2:31 AM

#

So the other day one of you guys mentioned that having a reference to a class inside its own definition could "lead to the seeping out of the uninitialized class"

#

I guess I can sorta understand that- but what I want to know is why that would be untenable

raven ridge May 21, 2021, 2:32 AM

#

setting that aside, metaclasses make it impossible, so it's kind of a moot point

static bluff May 21, 2021, 2:32 AM

#

Metaclasses, as they currently function

raven ridge May 21, 2021, 2:32 AM

#

at least if we're talking about Python, and not a hypothetical Python-like language without metaclasses

#

right - metaclasses that take a namespace and turn it into a class.

#

with metaclasses as they exist today, you need to fill in a namespace first, before the class ever exists, because the namespace gets passed to the thing that makes the class.
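
The ordering described above can be observed with a made-up tracing metaclass: the namespace is prepared, the class body runs and fills it, and only then is the metaclass asked to build the class from that namespace.

```python
events = []


class Tracing(type):  # hypothetical metaclass
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        events.append("prepare namespace")
        return {}

    def __new__(mcls, name, bases, namespace, **kwargs):
        own = sorted(k for k in namespace if not k.startswith("__"))
        events.append(f"create class from {own}")
        return super().__new__(mcls, name, bases, namespace)


class Traced(metaclass=Tracing):
    events.append("run class body")
    x = 1

    def hello(self):
        return "hello"


print(events)
```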

static bluff May 21, 2021, 2:33 AM

#

Right

#

I'm not trying to suggest that referencing a class in its own definition is a good idea, by the way. I'm just curious

raven ridge May 21, 2021, 2:34 AM

#

setting that aside, in a hypothetical language without Python-like metaclasses, if the class existed before it was fully populated, you'd need to define semantics for what would happen if someone interacted with it in that intermediate state - what happens if you construct it? destroy it? call methods on it? Perform isinstance checks?

#

what happens if an exception occurs part way through defining the class, and so this class that you started to define never actually gets defined, despite something having obtained a reference to it?

acoustic crater May 21, 2021, 2:35 AM

#

maybe just make the definition recurse lol

raven ridge May 21, 2021, 2:35 AM

#

or maybe you just punt and say that it's all undefined behavior to touch it before it's finished being built - but in that case, what's the point of exposing it?

static bluff May 21, 2021, 2:36 AM

#

I think "obtained a reference to it" is an important thing for me to take note of. In theory, if the class instantiation fails you just return an undefined or else throw an error that may or may not get caught, yada

#

But if you store a reference to the uninstantiated class somewhere outside itself and then it fails, what then, right?

acoustic crater May 21, 2021, 2:38 AM

#

before a metaclass is fully instantiated and you refer to the instance... treat the reference as a new class constructor and run it with any new args n kwargs? It's dumb but that is what springs to mind

#

and the control flow is what you'd expect from recursion

#

it makes little sense but it makes the most sense as far as I can tell

raven ridge May 21, 2021, 2:41 AM

#

static bluff But if you store a reference to the uninstantiated class somewhere outside itsel...

note that this case isn't fundamentally different from what happens if you access the class in the middle of defining it. Like what this pseudocode would do:

class C:
    __class__().say_hello()

    def say_hello(self):
        ...

#

even if the type existed, in that case __init__ hadn't yet been defined (nor even say_hello), and yet you're creating an instance of that type and calling a method on it.
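
For comparison, this is what real Python does today with the nearest runnable equivalent of that pseudocode. The class is named `Greeter` here just to keep the example self-contained; its name is not bound until the class body has finished, so the reference fails.

```python
try:
    class Greeter:
        Greeter().say_hello()  # the name is not bound while the body runs

        def say_hello(self):
            print("hello")
except NameError as exc:
    error = exc
    print("NameError:", exc)
```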

acoustic crater May 21, 2021, 2:44 AM

#

yeah there's no way to instantiate the metaclass instance if it itself isn't instantiated

static bluff May 21, 2021, 2:44 AM

#

Well, I'm sure there's some way to do it

acoustic crater May 21, 2021, 2:45 AM

#

recursive definition

static bluff May 21, 2021, 2:45 AM

#

But it's probably not a good idea

acoustic crater May 21, 2021, 2:45 AM

#

is all I can think of

#

and yeah it's not haha

static bluff May 21, 2021, 2:45 AM

#

I think the reason I had originally brought it up was because I was discussing how one might implement privacy in a language which allows new methods to be attached to a class after the fact

raven ridge May 21, 2021, 2:46 AM

#

static bluff Well, I'm sure there's *some* way to do it

there most certainly isn't. The metaclass can return an entirely different class depending on the values in the namespace. You can't have the class before the namespace is passed to the metaclass, because the metaclass can do something different depending on what's in the namespace it gets.
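
A deliberately silly sketch of that point, with an invented metaclass: what the `class` statement ends up binding depends on the contents of the namespace, so no class object can exist before the body has run.

```python
class Chooser(type):  # hypothetical metaclass
    def __new__(mcls, name, bases, namespace):
        # The result depends on what the class body put in the namespace.
        if namespace.get("AS_TUPLE"):
            return tuple  # hand back a completely unrelated, existing class
        return super().__new__(mcls, name, bases, namespace)


class Normal(metaclass=Chooser):
    pass


class Surprise(metaclass=Chooser):
    AS_TUPLE = True


print(Normal, Surprise)
```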

static bluff May 21, 2021, 2:46 AM

#

You'd need to mark all of the methods which were defined directly inside the class as 'native', and define the method's 'owner' object as being the class

raven ridge May 21, 2021, 2:47 AM

#

what makes those methods more important than others?

static bluff May 21, 2021, 2:47 AM

#

You'd want the methods originally defined within the class to have access to the class' (and its instances') private attributes, but prevent any user defined methods from being able to access them

acoustic crater May 21, 2021, 2:47 AM

#

if you can make class attributes private you can just have a different sort of attribute that's private

#

the question is just how to make them private

raven ridge May 21, 2021, 2:48 AM

#

static bluff You'd want the methods originally defined within the class to have access to the...

why is that what you'd want?

#

I'm not being entirely facetious - that would make class decorators far less useful, for instance, because they wouldn't be able to monkeypatch in new methods.
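
A small sketch of the class-decorator pattern being referred to. `add_describe` and `Point` are made up; the attached method reads underscore-prefixed attributes, which is exactly what a rule of "only methods written in the class body may touch private state" would forbid.

```python
def add_describe(cls):
    # Attach a method that was not written inside the class body.
    def describe(self):
        return f"{type(self).__name__}({vars(self)})"
    cls.describe = describe
    return cls


@add_describe
class Point:
    def __init__(self, x, y):
        self._x = x  # "private" by convention only
        self._y = y


print(Point(1, 2).describe())
```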

static bluff May 21, 2021, 2:48 AM

#

Well, I might be wrong here but, if it was as simple as just attaching a method to an object and, tada, you have access to the private attributes- well- what's the point?

raven ridge May 21, 2021, 2:48 AM

#

likewise with @unittest.mock.patch for unit testing, etc.

raven ridge May 21, 2021, 2:49 AM

#

static bluff Well, I might be wrong here but, if it was as simple as just attaching a method ...

there isn't one, that's why Python doesn't have access modifiers 😄

acoustic crater May 21, 2021, 2:49 AM

#

how would access to itself change that?

#

where do you want those private attributes to be accessed?

#

during construction only?

static bluff May 21, 2021, 2:49 AM

#

During construction, and within the scope of any methods defined directly within the class' definition space

acoustic crater May 21, 2021, 2:50 AM

#

so you want a whole private namespace

#

for classes

static bluff May 21, 2021, 2:50 AM

#

Yes

#

In theory

acoustic crater May 21, 2021, 2:50 AM

#

but you also apparently want a liminal namespace for accessing "public" stuff

raven ridge May 21, 2021, 2:50 AM

#

which also means getting rid of getattr and setattr

static bluff May 21, 2021, 2:50 AM

#

liminal?

acoustic crater May 21, 2021, 2:50 AM

#

it could be like javascript Symbols just inaccessible

#

liminal means in between

#

Symbols without special methods/functions for accessing them

raven ridge May 21, 2021, 2:51 AM

#

because the contract getattr(obj, name) and setattr(obj, name, val) have no way of knowing if the caller of the function is allowed to call the function. They don't even know who the caller of the function is.
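
To make that concrete in today's Python: getattr receives an object and a string, nothing more, so the existing privacy conventions can be bypassed by anyone who knows the name. The `Account` class is a made-up example.

```python
class Account:  # hypothetical example class
    def __init__(self):
        self._pin = "1234"     # single underscore: convention only
        self.__balance = 100   # double underscore: stored as _Account__balance


acct = Account()
# getattr gets just an object and a string; nothing says who is asking.
print(getattr(acct, "_pin"))
print(getattr(acct, "_Account__balance"))
print(hasattr(acct, "__balance"))  # the string form is not mangled
```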

acoustic crater May 21, 2021, 2:51 AM

#

but you'd also need the in between components

#

the public interface

#

but yeah as godlygeek is gettin at, the hard part is actually building that private namespace and public interface to access it

static bluff May 21, 2021, 2:52 AM

#

raven ridge which also means getting rid of `getattr` and `setattr`

thing.public #public attribute
thing:private #private attribute
thing::dunder #same privacy level as private, but a separate namespace, to avoid naming conflicts

Keep in mind- I'm not advocating OR disadvocating this syntax, it's a work in progress

acoustic crater May 21, 2021, 2:52 AM

#

you need between public and private too

#

can't be altered but are accessible

#

liminal

static bluff May 21, 2021, 2:53 AM

#

acoustic crater can't be altered but are accessible

Just make it accessible by a getter

raven ridge May 21, 2021, 2:53 AM

#

static bluff ```py thing.public #public attribute thing:private #private attribute thing::dun...

so you wouldn't have getattr() and setattr() functions, right? Seems like you're agreeing with me.

acoustic crater May 21, 2021, 2:54 AM

#

the public getter has the private attribute exposed to it though

static bluff May 21, 2021, 2:55 AM

#

raven ridge because the contract `getattr(obj, name)` and `setattr(obj, name, val)` have no ...

Well this brings my original concept of the problem full circle. Everything needs to have an 'owner' attribute (or even an array of them, in the case of nested classes???)

getattr would check to which owner (function/module/class) the namespace in which the attribute is being requested belongs. If the owner is the class or a descendant of it, all good; otherwise, fail

acoustic crater May 21, 2021, 2:55 AM

#

so u can just define a setter if the public stuff has access to the private stuff

static bluff May 21, 2021, 2:56 AM

#

acoustic crater so u can just define a setter if the public stuff has access to the private stuf...

It'd be kinda pointless, but in theory yeah

raven ridge May 21, 2021, 2:56 AM

#

static bluff Well this brings my original concept of the problem full circle. Everything ne...

it would need to know the owner of obj.name as well as the provenance of the caller

static bluff May 21, 2021, 2:56 AM

#

provenance?

raven ridge May 21, 2021, 2:56 AM

#

the identity and history, I guess

static bluff May 21, 2021, 2:56 AM

#

Ahh, well, yeah more or less

raven ridge May 21, 2021, 2:57 AM

#

origins

static bluff May 21, 2021, 2:57 AM

#

Which doesn't entirely disagree with Python's objective nature in my opinion

#

Obviously it would take proper planning and a thorough understanding of the problem, but it certainly seems doable to me

raven ridge May 21, 2021, 2:57 AM

#

have you done much unit testing? Both in Python, and in a language with access modifiers?

static bluff May 21, 2021, 2:57 AM

#

I can't say I have

raven ridge May 21, 2021, 2:58 AM

#

it's the best reason why access modifiers are a terrible idea

#

they don't do anything useful, they just get in the programmer's way.

static bluff May 21, 2021, 2:58 AM

#

psssssssst

#

Psssssssssssssssssst godly

halcyon trail May 21, 2021, 2:58 AM

#

Disagree

static bluff May 21, 2021, 2:58 AM

#

You're ruining my fun

halcyon trail May 21, 2021, 2:59 AM

#

Especially when languages have an internal access modifier or similar

raven ridge May 21, 2021, 2:59 AM

#

they make all sorts of reasonable things that programmers want to do - like "test what happens if I make a database call through my class while the database handle is in an error state" - much more difficult.
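
A sketch of the kind of test being described, using `unittest.mock` from the stdlib. `Repo`, its `_handle` attribute, and the SQL string are all invented for the example; the test simply reaches into the internal attribute and forces the error state.

```python
from unittest import mock


class Repo:  # hypothetical class under test
    def __init__(self, handle):
        self._handle = handle  # internal by convention

    def count_users(self):
        try:
            return self._handle.execute("SELECT COUNT(*) FROM users")
        except ConnectionError:
            return None


def test_count_users_when_handle_is_broken():
    repo = Repo(handle=mock.Mock())
    # Reach into the "private" attribute and force the error state.
    repo._handle.execute.side_effect = ConnectionError("handle is broken")
    assert repo.count_users() is None


test_count_users_when_handle_is_broken()
```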

halcyon trail May 21, 2021, 2:59 AM

#

So you can have something that is private to the outside but available to tests

static bluff May 21, 2021, 2:59 AM

#

halcyon trail So you can have something that is private to the outside but available to tests

Like giving the tests an all access pass

raven ridge May 21, 2021, 2:59 AM

#

and they don't offer any benefits in exchange, because practically speaking, in every language with access modifiers, untrusted code is running in the same address space and can just choose to ignore the access modifiers.

halcyon trail May 21, 2021, 3:00 AM

#

That's really not true

static bluff May 21, 2021, 3:00 AM

#

yeah that doesn't sound right to me

raven ridge May 21, 2021, 3:00 AM

#

what's a counterexample?

halcyon trail May 21, 2021, 3:00 AM

#

"just ignore" via complicated tricks with reflection usually

raven ridge May 21, 2021, 3:00 AM

#

right.

static bluff May 21, 2021, 3:00 AM

#

And anything can be hacked

halcyon trail May 21, 2021, 3:00 AM

#

Yeah that's not the point

raven ridge May 21, 2021, 3:00 AM

#

that's not "hacking"

halcyon trail May 21, 2021, 3:00 AM

#

Protect against Murphy, not Machiavelli

raven ridge May 21, 2021, 3:01 AM

#

your code is running in the same process as the stuff that someone is trying to protect from your code. There's no trust barrier between the two things.

halcyon trail May 21, 2021, 3:01 AM

#

This just doesn't have any relationship to the software engineering realities of access control

static bluff May 21, 2021, 3:01 AM

#

halcyon trail Protect against Murphy, not Machiavelli

I think I want a t-shirt that says this

halcyon trail May 21, 2021, 3:01 AM

#

Nobody claims it's a security measure

raven ridge May 21, 2021, 3:02 AM

#

so we can agree on one thing that it's useless for, security.

#

what's a thing that it's not useless for? 😄

halcyon trail May 21, 2021, 3:02 AM

#

Literally nobody ever claimed otherwise

#

Access control

raven ridge May 21, 2021, 3:02 AM

#

you would be surprised how often people claim otherwise.

raven ridge May 21, 2021, 3:02 AM

#

halcyon trail Access control

that's just the name of the feature.

#

what's it good for?

#

why is "access control" a good thing? It doesn't aid security, but it does aid... ?

static bluff May 21, 2021, 3:03 AM

#

It dissuades people from screwing with the internals of the program

halcyon trail May 21, 2021, 3:03 AM

#

It aids prevention of people mucking around with internals of your code, encapsulation

static bluff May 21, 2021, 3:03 AM

#

Which, if you're designing for beginners for example, is a good thing

raven ridge May 21, 2021, 3:04 AM

#

hm, why?

#

beginners have plenty of things that they're told "just don't do that", or that they don't understand.

static bluff May 21, 2021, 3:04 AM

#

I built a project that essentially replicates a javascript runtime within python. Pythonic control over the elements in a web page. Elements had a 'style' attribute

raven ridge May 21, 2021, 3:05 AM

#

halcyon trail It aids prevention of people mucking around with internals of your code, encapsu...

why is preventing people from mucking with the internals of your code desirable? It stops them from fixing bugs with monkeypatches, or from white box testing, or from printf debugging. What does it buy in exchange?

static bluff May 21, 2021, 3:06 AM

#

Now, you can use an underscore to denote privacy, but people are going to screw with it anyway. It's one thing for someone to try to change it and get an error more or less right away. It's another if they change something deep in the internals of the language and then some day, maybe weeks or months down the line, start getting an error whose traceback may have nothing to do with the modification they made

grave jolt May 21, 2021, 3:06 AM

#

well, languages with private fields/methods usually have a way of circumventing that 🙂

raven ridge May 21, 2021, 3:06 AM

#

I've never seen one that doesn't.

#

and any with any sort of C FFI immediately has a way of circumventing it.

static bluff May 21, 2021, 3:06 AM

#

Protecting, really protecting, the internals of a project makes it more resilient

acoustic crater May 21, 2021, 3:07 AM

#

hiring coders that aren't idiots is much better

#

or just name private attributes DO_NOT_USE_OR_YOURE_FIRED lol

grave jolt May 21, 2021, 3:07 AM

#

raven ridge I've never seen one that doesn't.

const counter = () => {
    let x = 0;

    const increment = () => { x++; };
    const getValue = ()  => x;
    return { increment, getValue };
};

const ctr = counter()

Here you can't change the x variable from outside (apart from using increment)

static bluff May 21, 2021, 3:08 AM

#

For me, it's not about absolute refusal of access, it's about keeping the internals away from anyone who doesn't have the skills required to work with them

static bluff May 21, 2021, 3:08 AM

#

acoustic crater hiring coders that aren't idiots is much better

Yes, of course. What idiots high school students, passionate young people, and beginners of all types are. Such morons they are

raven ridge May 21, 2021, 3:08 AM

#

grave jolt ```js const counter = () => { let x = 0; const increment = () => { x++;...

that's not access modifiers, to be fair, just a closure with a language with insufficient introspection, heh

acoustic crater May 21, 2021, 3:08 AM

#

why does a project involving novices need to be resilient?

grave jolt May 21, 2021, 3:08 AM

#

well, in Python you can change x 🙂
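For comparison, a rough Python translation of the JavaScript counter, sketched under the assumption of CPython 3.7 or later (where `cell_contents` is writable). The closure variable can be reached and overwritten from outside through `__closure__`.

```python
def counter():
    x = 0

    def increment():
        nonlocal x
        x += 1

    def get_value():
        return x

    return increment, get_value


increment, get_value = counter()
increment()

# Both inner functions share a single cell object for x.
cell = get_value.__closure__[0]
cell.cell_contents = 100  # overwrite x from outside the closure

print(get_value())
```

This is introspection rather than a supported API for everyday code, which is roughly the point being made: the language does not try to stop you.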

raven ridge May 21, 2021, 3:09 AM

#

you're not just stopping idiots from touching your internals, you're also stopping people who know exactly what they're doing from touching your internals.

#

which isn't necessarily a good tradeoff.

static bluff May 21, 2021, 3:09 AM

#

acoustic crater why does a project involving novices need to be resilient?

All things being equal, why not?

raven ridge May 21, 2021, 3:09 AM

#

I have fixed bugs in production libraries through monkeypatches. It was the right call.

halcyon trail May 21, 2021, 3:09 AM

#

You're making it significantly harder to touch internals by accident, and very explicit when you do touch internals

acoustic crater May 21, 2021, 3:09 AM

#

it'll already be impossible to refactor or read, why would just hiding internals from them prevent them from making other mistakes that make the project untenable?

halcyon trail May 21, 2021, 3:10 AM

#

So people can see it in code review and ensure it's truly necessary

static bluff May 21, 2021, 3:10 AM

#

You've got two otherwise equivalent languages, one with privacy, one without. If you've decided privacy is something you want, go for it. If you've decided otherwise, go with the latter

halcyon trail May 21, 2021, 3:10 AM

#

Also, monkey patching and access control are two different things

acoustic crater May 21, 2021, 3:10 AM

#

except we don't have them we just have speculation about a theoretical language and can't even decide how to implement privacy

raven ridge May 21, 2021, 3:11 AM

#

I'm convinced that "privacy" is a way to give security through obscurity (or perhaps correctness through obscurity?), and therefore isn't valuable.

halcyon trail May 21, 2021, 3:11 AM

#

Seems like this is more of an issue of dynamic vs static

#

No

static bluff May 21, 2021, 3:11 AM

#

I think maybe this comes down to schools of thought 😛 I'm going to exit the debate with a smile on my face, having learned a thing or two not least of which- access modifiers, an impassioned issue

halcyon trail May 21, 2021, 3:11 AM

#

Nobody who knows anything about access control argues that it's related to security

raven ridge May 21, 2021, 3:11 AM

#

that's absolutely not true.

halcyon trail May 21, 2021, 3:11 AM

#

Frankly anybody who brings it up is just showing their own misunderstanding

raven ridge May 21, 2021, 3:12 AM

#

I can point you to recent threads on python-ideas with people arguing that it's related to security.

halcyon trail May 21, 2021, 3:12 AM

#

Then they don't understand it

raven ridge May 21, 2021, 3:12 AM

#

I agree.

halcyon trail May 21, 2021, 3:12 AM

#

I'm sorry but it's that simple

grave jolt May 21, 2021, 3:12 AM

#

I have never seen anyone argue it's for security tbh

halcyon trail May 21, 2021, 3:12 AM

#

Well that's what I said

acoustic crater May 21, 2021, 3:12 AM

#

if people can accidentally mess with the guts of something, despite widely accepted prefixed underscore syntax, and would, why are they part of the project?

raven ridge May 21, 2021, 3:13 AM

#

grave jolt I have never seen anyone argue it's for security tbh

https://mail.python.org/archives/list/python-ideas@python.org/thread/DD2L56GCOCWEUBBZBDKKKMPPVWB7PRFB/ was a recent example

#

it comes up surprisingly often.

halcyon trail May 21, 2021, 3:13 AM

#

The thing is that the underscore syntax + static enforcement would basically be access control

acoustic crater May 21, 2021, 3:13 AM

#

yeah true I just thought that as I typed that

grave jolt May 21, 2021, 3:13 AM

#

In my understanding it's a way to separate the public API of a thing from its implementation details, and to enforce it at the language level (with an escape hatch, as always)

halcyon trail May 21, 2021, 3:13 AM

#

Just a primitive form and by convention

acoustic crater May 21, 2021, 3:13 AM

#

pycharm yells at you for accessing underscore prefixed stuff, just enforce that and you're gold

#

no need to edit the language
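As a sketch of what "just enforce that" could mean without touching the language, here is a tiny checker built on the standard `ast` module. It is not how PyCharm implements its inspection; `find_private_access` and the sample source are hypothetical, and the rule (flag `obj._name` unless `obj` is `self` or `cls`) is a deliberately crude assumption.

```python
import ast


def find_private_access(source):
    """Return sorted (line, attribute) pairs for `obj._name` outside self/cls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Attribute):
            continue
        name = node.attr
        is_dunder = name.startswith("__") and name.endswith("__")
        is_private = name.startswith("_") and not is_dunder
        on_self = isinstance(node.value, ast.Name) and node.value.id in ("self", "cls")
        if is_private and not on_self:
            hits.append((node.lineno, name))
    return sorted(hits)


SAMPLE = '''
class Widget:
    def __init__(self):
        self._state = 0

w = Widget()
w._state = 5
print(w.__class__)
'''

print(find_private_access(SAMPLE))  # [(7, '_state')]
```

A real linter would also need to handle module-level private names, friend modules, and tests, which is where most of the design work would go.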

halcyon trail May 21, 2021, 3:14 AM

#

The fact is that most new statically typed languages being created today, if they aren't say very purely functional, continue to include access control

static bluff May 21, 2021, 3:14 AM

#

acoustic crater no need to edit the language

NO IMAGINATION ALERT

#

XD

acoustic crater May 21, 2021, 3:14 AM

#

lol u have a lexer not a rewrite of python internals

raven ridge May 21, 2021, 3:14 AM

#

grave jolt In my understanding it's a way to separate the public API of a thing from its im...

if there's an escape hatch, what good does enforcing that at the language level do? How is it better than Python's gentleman's agreement about underscore?

halcyon trail May 21, 2021, 3:14 AM

#

Why is static enforcement useful?

static bluff May 21, 2021, 3:14 AM

#

What I mean to say is, I (and others) build languages for fun. 'No need to' implies tedium and unpleasantness

grave jolt May 21, 2021, 3:15 AM

#

raven ridge if there's an escape hatch, what good does enforcing that at the language level ...

I don't know, I'm not a language designer!

#

Oh, another thing that comes to mind is preventing name collisions.

static bluff May 21, 2021, 3:15 AM

#

grave jolt Oh, another thing that comes to mind is preventing name collisions.

YESSSSSSSSS

grave jolt May 21, 2021, 3:15 AM

#

Which is mostly solved with __ in Python, but I haven't seen many people use it

static bluff May 21, 2021, 3:15 AM

#

Not the end of the world, but a nice bonus

grave jolt May 21, 2021, 3:16 AM

#

grave jolt Which is mostly solved with `__` in Python, but I haven't seen many people use i...

it also breaks when two classes in the inheritance chain have the same name lemon_pleased

#

which is, granted, something I have never seen
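A short sketch of that corner case. Name mangling only uses the class name, so two classes in one inheritance chain that share a name end up writing to the same mangled attribute. `make_subclass` is a hypothetical helper that exists only to create a second class called `Base`.

```python
class Base:
    def __init__(self):
        self.__token = "original"  # stored as _Base__token

    def token(self):
        return self.__token


def make_subclass(parent):
    class Base(parent):  # deliberately reuses the name "Base"
        def __init__(self):
            super().__init__()
            self.__token = "clobbered"  # also mangled to _Base__token

    return Base


obj = make_subclass(Base)()
print(vars(obj))    # {'_Base__token': 'clobbered'}
print(obj.token())  # clobbered
```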

acoustic crater May 21, 2021, 3:16 AM

#

a way to access mangled attributes set in a subclass without it being ugly and hacky might be nice

#

tho that is kinda not what mangling is for

grave jolt May 21, 2021, 3:17 AM

#

a way to access mangled attributes set in a subclass
❓w ❓h❓y❓

acoustic crater May 21, 2021, 3:17 AM

#

...the only reason I've done it is because my current CS prof thinks mangled means private and makes us use them

#

cuz he hates python

grave jolt May 21, 2021, 3:17 AM

#

Why would you ever need to access a private variable of a subclass?

raven ridge May 21, 2021, 3:18 AM

#

grave jolt Which is mostly solved with `__` in Python, but I haven't seen many people use i...

The cases where you need to worry about name collisions are when

You own a class
You support other people subclassing your class
You want to extend your class by adding a private attribute after it's been subclassed

Name mangling should be used more than it is, but that's still not super common.
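The three conditions above can be sketched as follows. `Plugin` and `MyPlugin` are hypothetical names: imagine `Plugin` is a library class that gained a private cache in a later release, while `MyPlugin` was written earlier and happens to use the same attribute name.

```python
class Plugin:
    """Hypothetical library class; the cache was added in a later release."""

    def __init__(self):
        self.__cache = {}  # mangled to _Plugin__cache

    def lookup(self, key):
        return self.__cache.setdefault(key, key.upper())


class MyPlugin(Plugin):
    """Hypothetical user subclass written before the cache existed."""

    def __init__(self):
        super().__init__()
        self.__cache = "unrelated user data"  # mangled to _MyPlugin__cache


plugin = MyPlugin()
print(plugin.lookup("a"))    # A
print(sorted(vars(plugin)))  # ['_MyPlugin__cache', '_Plugin__cache']
```

Had both classes used a single-underscore `_cache`, the subclass assignment would have replaced the library's dict with a string and `lookup` would fail.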

acoustic crater May 21, 2021, 3:18 AM

#

for inherited methods that access that variable >_>

#

it makes no sense I know

grave jolt May 21, 2021, 3:18 AM

#

In languages with access modifiers there are 'protected' variables

acoustic crater May 21, 2021, 3:18 AM

#

idk what the people new to python did

grave jolt May 21, 2021, 3:18 AM

#

You can't access a parent class's private variables/methods in those languages AFAIK

raven ridge May 21, 2021, 3:18 AM

#

acoustic crater for inherited methods that access that variable >_>

if the subclass sets the attributes, the parent class shouldn't be touching them.

#

because the parent class should have no knowledge of its subclasses, generally, for Liskov reasons.

acoustic crater May 21, 2021, 3:20 AM

#

thinking back, maaaybe he meant for people to make more setters and getters

#

without explicitly saying so

raven ridge May 21, 2021, 3:20 AM

#

getters and setters are generally considered an anti-pattern in Python.

acoustic crater May 21, 2021, 3:20 AM

#

the tests for the assignments had methods like "setThis" and "getThis"

#

yeah it's gross

#

I tried using property and .setter but the tests required those naming conventions

#

and he calls instance methods class methods

#

>_>

static bluff May 21, 2021, 3:22 AM

#

I really don't see the point in ever setting both a getter and a setter, unless you're doing some sort of logic with the value being passed to setter of course

#

That might be my ignorance talking though

acoustic crater May 21, 2021, 3:22 AM

#

yeah and you can just overload the attribute name in python

raven ridge May 21, 2021, 3:22 AM

#

yeah. @property makes getters and setters unnecessary in Python, because you can evolve obj.attr = 42 to call a method in the future without callers needing to change their code.
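A minimal sketch of that evolution, with hypothetical `AccountV1` and `AccountV2` classes standing in for two releases of the same class. The caller code is identical for both.

```python
class AccountV1:
    """First release: balance is a plain attribute."""

    def __init__(self, balance):
        self.balance = balance


class AccountV2:
    """Later release: same attribute name, now validated through a property."""

    def __init__(self, balance):
        self.balance = balance  # goes through the setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value


def caller(account):
    account.balance = 42  # unchanged caller code for both versions
    return account.balance


print(caller(AccountV1(0)), caller(AccountV2(0)))  # 42 42
```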

acoustic crater May 21, 2021, 3:22 AM

#

so there's no need to set them for everything initially

#

exactly

#

I'm just getting credits

#

ppl ask if they should take this class and I say "no, it's not Python"

raven ridge May 21, 2021, 3:23 AM

#

the reason you need setters in Java is because there's no way to start with obj.attr = ... and add validation to it later without needing all of the callers to change

acoustic crater May 21, 2021, 3:24 AM

#

yeah different languages have different standard practices for a reason

raven ridge May 21, 2021, 3:24 AM

#

yeah. Teaching Python as though it's Java is, unfortunately, very common, though.

#

it's a truer OOP language than Java! Everything is an object!

acoustic crater May 21, 2021, 3:27 AM

#

srsly

#

allows for functional programming but the functions are objects

static bluff May 21, 2021, 3:28 AM

#

Quick question. Is it too early to start using the 3.10 beta?

acoustic crater May 21, 2021, 3:28 AM

#

I am using it on a project and it's fine

static bluff May 21, 2021, 3:28 AM

#

Pattern matching working okay?

acoustic crater May 21, 2021, 3:28 AM

#

yeah

#

except my IDE hates it

#

in the project itself I've only used type union syntax but that works great

#

pattern matching will be perfect for one part though

static bluff May 21, 2021, 3:31 AM

#

yeah, I get the impression everyone, certainly me, is really excited for it

acoustic crater May 21, 2021, 3:33 AM

#

it makes me wish python had an option for better recursion handling

raven ridge May 21, 2021, 3:37 AM

#

static bluff yeah, I get the impression everyone, certainly me, is really excited for it

I'm not, personally. I'm not convinced that the complexity it adds to the language is worth the convenience.

#

it's a whole weird DSL that looks like Python without behaving like Python, and is going to be annoying to teach...

#

and the presence or absence of a . in determining whether something is a load or a store makes me sad, still.

#

I'm sure I'll wind up using it, but I'm not excited for it.

acoustic crater May 21, 2021, 3:39 AM

#

the dottedness to determine a constant is weird indeed

static bluff May 21, 2021, 3:39 AM

#

It's the steady march of change, either way

acoustic crater May 21, 2021, 3:39 AM

#

but you can use guards lol

#

I like the guard/pattern match combination

#

idk if it's unique

#

does another language have that?

raven ridge May 21, 2021, 3:40 AM

#

I wish that instead of guards there was a "go to next case" statement, to break out of one that matched and continue matching on the next one. Though that would have the tradeoff of removing the option to evaluate cases in parallel.

static bluff May 21, 2021, 3:41 AM

#

So, when you compile something into a code object, you can do so within a module's namespace or simply within its own little virtual space. What's a term to describe whatever thing, module or otherwise, in which the code is being compiled?

raven ridge May 21, 2021, 3:41 AM

#

namespace
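To make the term concrete, a small sketch using only the standard library: the same code object is executed once in a throwaway dict and once in a module's `__dict__`. The module name `demo_module` and the source string are made up for illustration.

```python
import types

source = "answer = 6 * 7"
code = compile(source, "<example>", "exec")

# 1. Run the code object in a standalone dict namespace.
scratch = {}
exec(code, scratch)

# 2. Run the same code object in a module's namespace.
module = types.ModuleType("demo_module")  # hypothetical module name
exec(code, module.__dict__)

print(scratch["answer"], module.answer)  # 42 42
```

In both cases the namespace is just the mapping passed to `exec` as globals; a module is one convenient owner of such a mapping.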

static bluff May 21, 2021, 3:41 AM

#

Rockin, thanks

acoustic crater May 21, 2021, 3:41 AM

#

yeah I'm glad they don't have fallthrough

#

that was always a weird aspect of switch case to me, especially because it seems seldom used

#

maybe if match case was a module though and they introduced module-level soft keywords 🤔

raven ridge May 21, 2021, 3:43 AM

#

I'm betting all keywords going forward will be soft

acoustic crater May 21, 2021, 3:43 AM

#

cuz the new mini language being global is weird

#

yeah they should be

#

someone in another server linked a tweet where someone was complaining about for (var of of of){} in js I'm like wtf do you want it to do the only nasty part of that is your IDE not understanding it

#

(all the ofs were keyword colored)

sacred yew May 21, 2021, 3:44 AM

#

thats typescript im pretty sure 😛

acoustic crater May 21, 2021, 3:44 AM

#

oh I mean var

sacred yew May 21, 2021, 3:44 AM

#

i mean vanilla js is still pretty bad

raven ridge May 21, 2021, 3:45 AM

#

acoustic crater yeah I'm glad they don't have fallthrough

well, it would make an imperative alternative to guard clauses. Instead of

case int(x) if 0 <= x <= 10:
    do_stuff()
case _:
    other_stuff()

you could do:

case int(x):
    if x < 0 or x > 10:
        try next
    do_stuff()
case _:
    other_stuff()

acoustic crater May 21, 2021, 3:45 AM

#

js is extremely janky haha

sacred yew May 21, 2021, 3:46 AM

#

aren't guards what other FP langs do?

acoustic crater May 21, 2021, 3:46 AM

#

afaik not directly in conjunction with pattern matching

#

but yes

halcyon trail May 21, 2021, 3:46 AM

#

I like pattern matching a lot for statically typed languages

#

For python, the benefits just aren't as big

acoustic crater May 21, 2021, 3:47 AM

#

maybe they will add less overhead to recursive function calls >_>

halcyon trail May 21, 2021, 3:47 AM

#

So I'm less sure on whether I like it

acoustic crater May 21, 2021, 3:47 AM

#

somehow

halcyon trail May 21, 2021, 3:47 AM

#

Also python is starting to feel really kitchen sinky these days

raven ridge May 21, 2021, 3:47 AM

#

I'm sure I'll find places to use it, but I don't find it... exciting. I have a begrudging acceptance of it. 🙂

acoustic crater May 21, 2021, 3:48 AM

#

it's definitely not even something I think should be taught

unkempt rock May 21, 2021, 3:48 AM

#

hi

#

Is here the pro daddy section ?

sacred yew May 21, 2021, 3:48 AM

#

?

halcyon trail May 21, 2021, 3:48 AM

#

?

unkempt rock May 21, 2021, 3:49 AM

#

?

raven ridge May 21, 2021, 3:49 AM

#

acoustic crater it's definitely not even something I think should be taught

isn't that the worst of both worlds? A feature adding complexity to the language that most people don't know about or use?

halcyon trail May 21, 2021, 3:49 AM

#

Not necessarily

#

Depends on the feature

#

Some features are more for library authors

#

Like metaclasses

#

But in the case of pattern matching

#

I agree

raven ridge May 21, 2021, 3:50 AM

#

yeah, that's a fair point.

halcyon trail May 21, 2021, 3:50 AM

#

If it's so hard that most people use it, something is wrong

#

*don't use it

acoustic crater May 21, 2021, 3:51 AM

#

is haskell wrong?

#

haha

halcyon trail May 21, 2021, 3:51 AM

#

Yes :-)

raven ridge May 21, 2021, 3:51 AM

#

it gives me serious regex vibes. It's its own special DSL jammed into the language, and it sort of looks like the language, but it doesn't behave like the rest of the language.

acoustic crater May 21, 2021, 3:51 AM

#

it's only hard because of a barrier to entry

raven ridge May 21, 2021, 3:51 AM

#

int(x) after case does something entirely different from what int(x) does everywhere else.

acoustic crater May 21, 2021, 3:51 AM

#

it's not as convoluted or powerless as regex tho

#

lol you can even define real functions in it so it's your dict with multiline lambdas

#

...and weird rules haha

raven ridge May 21, 2021, 3:54 AM

#

acoustic crater it's not as convoluted or powerless as regex tho

Well... it's like regexes except that it matches arbitrarily nested objects of arbitrary types, instead of just textual strings. It's still pretty complex - at least, it's not intuitive, and it reuses syntax that means something entirely different in the rest of the language to mean a different thing

#

everywhere else in the language, int(x) takes an existing x and converts it to an int. Inside a case statement, int(x) takes an existing int and stores it to x

#

that's at least... weird.

acoustic crater May 21, 2021, 3:55 AM

#

yeah

#

it seems like a lot of ppl were typing isinstance too much tho haha

halcyon trail May 21, 2021, 3:56 AM

#

i still haven't come to terms with how weird walrus operator looks in comprehensions tbh 🤣

#

I'm behind

raven ridge May 21, 2021, 3:56 AM

#

and you can't know how it will behave on an arbitrary type without knowing if that type defines __match_args__

acoustic crater May 21, 2021, 3:56 AM

#

tbf that's all damn dunders

halcyon trail May 21, 2021, 3:57 AM

#

pattern matching also gets kinda weird in python because python has two orthogonal type systems, the dynamic one and the static one

#

If you have Union[Foo, Bar] is a pattern match that checks for Foo and Bar exhaustive?

acoustic crater May 21, 2021, 3:57 AM

#

does None even have defined behavior for __lt__? Who knows, it's got it though

raven ridge May 21, 2021, 3:57 AM

#

Ooh, and int(x) and MyClass(x) do something entirely different in pattern matching

#

int(x) matches an int and stores it in x.
MyClass(x) matches a MyClass whose first match arg is x

acoustic crater May 21, 2021, 3:58 AM

#

hmm what happens if MyClass is an int subclass?

raven ridge May 21, 2021, 3:58 AM

#

you don't get the special int behavior.

acoustic crater May 21, 2021, 3:58 AM

#

I guess int(x) would work

raven ridge May 21, 2021, 3:59 AM

#

int(x) would work, but ClassName(x) would not

#

you instead would need to do ClassName() as x

acoustic crater May 21, 2021, 3:59 AM

#

that sorta makes sense tho as far as the magic of builtins is concerned

#

only builtins really are apparently themselves not just defined to be as such

#

unless you implement a non built in in C or whatever

raven ridge May 21, 2021, 4:01 AM

#

"built in" has two different meanings. It's used to describe both things in the builtins module (possibly only things that are in it by default, possibly including things that you add to it dynamically), as well as things that are defined in C extension modules

acoustic crater May 21, 2021, 4:01 AM

#

so what does deque(x) do?

#

tho deque is an especially hacky builtin extension type object imo

raven ridge May 21, 2021, 4:02 AM

#

As mentioned above, for the following built-in types the handling of positional subpatterns is different: bool, bytearray, bytes, dict, float, frozenset, int, list, set, str, and tuple.

#

those are the only ones that are handled specially.

acoustic crater May 21, 2021, 4:02 AM

#

is it according to a new slotted dunder?

raven ridge May 21, 2021, 4:02 AM

#

no.

acoustic crater May 21, 2021, 4:03 AM

#

ah that is pretty weird

#

luckily can still be changed though

raven ridge May 21, 2021, 4:03 AM

#

no... that can't ever be changed

#

it would be a backwards incompatible change to the language.

acoustic crater May 21, 2021, 4:03 AM

#

ah true

raven ridge May 21, 2021, 4:04 AM

#

deque(x) has a defined meaning in 3.10 - they can't change what it means in a later version.

acoustic crater May 21, 2021, 4:04 AM

#

well, they can change the others to be slot dunders

#

without a change in functionality

raven ridge May 21, 2021, 4:04 AM

#

well, sure - but why?

acoustic crater May 21, 2021, 4:04 AM

#

idk

#

deque is weird in general tho

static bluff May 21, 2021, 4:05 AM

#

halcyon trail If it's so hard that most people use it, something is wrong

Like regex?

#

>_>

acoustic crater May 21, 2021, 4:05 AM

#

behaves like it's defined in python

#

except for being a linked list

raven ridge May 21, 2021, 4:05 AM

#

that set of 11 types will forever need to be handled specially

#

whether that set is hardcoded in the parser or given a special dunder that nothing else uses seems like an implementation detail.

acoustic crater May 21, 2021, 4:05 AM

#

aren't they already?

#

can't mess with their methods without trickery

#

they just pretend to be classes

raven ridge May 21, 2021, 4:06 AM

#

the same is true of complex, and that's not included in the list

#

it's an arbitrary subset of the builtin types

#

and this is normative - future versions of PyPy, for instance, will need to special case this same set of 11 types.

acoustic crater May 21, 2021, 4:07 AM

#

at least it's just 11 then haha

raven ridge May 21, 2021, 4:09 AM

#

it is, but... well, hm. I can't help but feel that section of the PEP didn't get enough discussion

#

is complex really so much less special than frozenset or bytearray? All 3 are pretty special case types...

acoustic crater May 21, 2021, 4:11 AM

#

yeah lack of complex is weird

raven ridge May 21, 2021, 4:11 AM

#

well, unfortunately, it can never be added hyperlemon

acoustic crater May 21, 2021, 4:12 AM

#

long live the complexpy fork

#

skywalker u have a new calling

static bluff May 21, 2021, 4:13 AM

#

I have module A containing class A and module B containing class B. Class A uses an instance of class B as an attribute, but class B requires knowledge of class A for type annotation. What's the solution?

acoustic crater May 21, 2021, 4:13 AM

#

don't

#

lol

static bluff May 21, 2021, 4:13 AM

#

XD

raven ridge May 21, 2021, 4:14 AM

#

from __future__ import annotations

static bluff May 21, 2021, 4:14 AM

#

Oh, and whats this about my new calling?

raven ridge May 21, 2021, 4:14 AM

#

(but also, don't)
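A sketch of how that suggestion is commonly combined with `typing.TYPE_CHECKING` when the two classes really do live in separate modules. The file and module names (`module_b.py`, `module_a`) are hypothetical; the block shows only module B's side.

```python
# module_b.py (hypothetical file name)
from __future__ import annotations  # annotations are stored as strings, not evaluated

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers, never executed at runtime,
    # so module_a can import module_b without a circular import.
    from module_a import A  # hypothetical module


class B:
    def __init__(self, owner: A) -> None:
        self.owner = owner

    def get_owner(self) -> A:
        return self.owner


print(B.get_owner.__annotations__)  # {'return': 'A'}
```

This makes the annotation problem go away, but it does not change the underlying design question of whether A and B should know about each other.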

acoustic crater May 21, 2021, 4:15 AM

#

make complex behave like other builtins in pattern matching in a cpython fork

raven ridge May 21, 2021, 4:16 AM

#

if you have a circular dependency in your types, it sounds pretty fishy - that seems to indicate something is factored wrong, more likely than not.

acoustic crater May 21, 2021, 4:17 AM

#

at least they're not passing self to the instantiation of the composed class any more

#

...I hope

raven ridge May 21, 2021, 4:18 AM

#

if an A has a B as an attribute, but a B has a method that returns an A, that's... suspicious. It may not always be bad, but it's a thing that's more likely to be bad than good, I think.

static bluff May 21, 2021, 4:18 AM

#

Wise words as always my dudes

acoustic crater May 21, 2021, 4:18 AM

#

good luck!

#

maybe you wanna pass a strategy for dealing with class A to class B... but rly shouldn't the concerned methods just be in class A?

raven ridge May 21, 2021, 4:21 AM

#

it can happen in cases where there really is a circular dependency - like a tree with multiple types of nodes, where any type could contain another type, perhaps...

#

it's not always gonna be wrong, but it warrants a closer look.

static bluff May 21, 2021, 4:21 AM

#

For the record, I just moved them both in to the same module

#

Spacing things out into multiple modules is, in my mind, important. But I've been known to take it too far

acoustic crater May 21, 2021, 4:23 AM

#

yeah drawing that line can be tough

static bluff May 21, 2021, 4:30 AM

#

My issue is that my lexer is already over 700 lines of code, comments included

#

And its working fine for everything except for strings, which are a rabbit hole

#

Fstrings need their own lexer and their own regular expression, which together will eat up a few hundred lines

acoustic crater May 21, 2021, 4:32 AM

#

then u got ur b strings and r strings

static bluff May 21, 2021, 4:32 AM

#

I'd really like to keep everything in one module, but I think it might be time to break things apart

#

Oh!

#

And, I want to implement arrow notation, those will require in the very least a good deal of coding within the normal lexer to handle, if not their own lexer and expression

#

https://ibb.co/4868nqL

#

Too much?

#

Abstract - Base class
Anonymous - Arrow functions
Interpolated - Fstrings
Namespace - Module level (or just plain namespace in the event of a 'compile()' without being provided a module to compile in)
Stringified - normal string

grave jolt May 21, 2021, 4:36 AM

#

sacred yew i mean vanilla js is still pretty bad

My favourite new trivia piece about JS:

Object.defineProperty(
  String.prototype,
  'onions',
  {
    set: () => { console.log("I don't like onions") }
  }
);
  
x = "foo";
x.onions = 2000;
console.log({'x.onions': x.onions});

When getting/setting properties or calling methods, primitives spawn a temporary boxed object

#

acoustic crater May 21, 2021, 4:37 AM

#

@static bluff https://medium.com/hackernoon/modifying-the-python-language-in-7-minutes-b94b0a99ce14 this might help

static bluff May 21, 2021, 4:37 AM

#

acoustic crater @static bluff https://medium.com/hackernoon/modifying-the-python-langua...

😛 With respect spoony, that article is like bringing a toothpick to a knife fight (read it before)

acoustic crater May 21, 2021, 4:38 AM

#

but have you whittled a toothpick yet?

#

it seems like you have a half built knife machine haha

static bluff May 21, 2021, 4:38 AM

#

I'm actually very pleased with my progress

acoustic crater May 21, 2021, 4:38 AM

#

yeah you seem to be chuggin along

static bluff May 21, 2021, 4:38 AM

#

😄

acoustic crater May 21, 2021, 4:39 AM

#

just might be useful to have a new operator actually implemented and tested

static bluff May 21, 2021, 4:39 AM

#

I mean- people have told me before that I need to walk before I can run, and I really really respect that position (and the people telling me that)

#

But I'm a trial by fire type. Its always been how I learn best

raven ridge May 21, 2021, 4:40 AM

#

static bluff Oh!

You should break things apart at boundaries between logical components, not arbitrarily based on size. If your lexer is a single logical component, there's nothing wrong with keeping it in a ten thousand line file

static bluff May 21, 2021, 4:41 AM

#

In principle I agree

raven ridge May 21, 2021, 4:41 AM

#

Replied to the wrong message. Stupid mobile.

static bluff May 21, 2021, 4:42 AM

#

But I personally don't think it's tenable to expect anyone to read through a document more than 2000 lines of code long. There's just too much to keep track of

#

Fine for me sure because I wrote it- and maybe you too. But most of the people here in the advanced channel have the wits to be able to handle that. The most likely reason someone would be reading through the source would be to figure out how it works- very possibly starting from square one with no context to fall back on

#

I guess, I dunno. I'm trying to bring my coding style to within more standardized limits- keeping things united in their own modules for example

#

But my instincts are yelling at me to space things out at this point, and I'm not sure how thin or thick is 'normal'

raven ridge May 21, 2021, 4:51 AM

#

if it's 2000 lines of code, it's 2000 lines of code. Reading through 2000 lines of code in a single file isn't necessarily worse than reading 2000 lines of code spread across 5 files.

#

if the boundaries are bad or arbitrary, reading the same 2000 lines of code can be much more difficult when it's spread across 5 files than all in one.

#

I'm not saying you shouldn't break things up into submodules, just that the criteria for where to split things should be based on the boundaries of logical components, and not on the size of the file

grave jolt May 21, 2021, 4:52 AM

#

at least it's not spread across 5 npm packages

static bluff May 21, 2021, 4:53 AM

#

raven ridge I'm not saying you shouldn't break things up into submodules, just that the crit...

Really good advice

raven ridge May 21, 2021, 4:54 AM

#

figuring out what should be a component and what shouldn't and where to draw the lines between them takes a lot of practice - you just need to read a lot of code, and see both good and bad divisions, before you can get any good at it.

#

as a professional programmer for many years, figuring out how to split things up so that the divisions make sense to other people, and so that things aren't too tightly coupled and don't have too many responsibilities, is still one of the parts of the job that takes the most effort for me.

#

but I have seen files of code that were ~40k lines where it wouldn't make any sense to divide them up - everything in them was closely related, they represented a single, large, logical component.

#

Any possible division would have been arbitrary, and wouldn't have aided comprehension.

static bluff May 21, 2021, 4:58 AM

#

Fair ^^

#

Well, I'm making the judgement call, at least for now, that having multiple nearly identical (and visually confusing) components with different uses so close to each other is only going to cause confusion. And, it's what my instincts are telling me 😛

#

What do you do Godly?

raven ridge May 21, 2021, 5:03 AM

#

I work on the Python Infrastructure team at Bloomberg - a big news and financial analytics company. My team is responsible for the health of the Python ecosystem at Bloomberg, from maintaining patched interpreters and keeping up with CVEs to providing Python bindings for first party C++ libraries that the company already had (our backend for a long time was Fortran, then C++, and now it's switching slowly towards Python)

#

slowly in the sense that there's a lot of new Python code being written, but there's a huge codebase of existing C++ code, and some Fortran, still in active use

static bluff May 21, 2021, 5:05 AM

#

O.O

#

That's amazing

#

I'm not normally one to gush but christ, it's an honor

#

Thanks for keeping my job warm for me by the way 😉 I'll see you in ten years

raven ridge May 21, 2021, 5:07 AM

#

eh, I'm just a developer. I'm a damn good developer, but there's a bunch of impressive people on my team. We've got a CPython core dev, and one of the maintainers of pip...

static bluff May 21, 2021, 5:08 AM

#

Still

#

Any advice for a whippersnapper with big dreams and quick fingers?

acoustic crater May 21, 2021, 5:10 AM

#

ur sayin my terminal will have 3.10 soon?!

#

I just got space invaders to run on the darn thing

static bluff May 21, 2021, 5:10 AM

#

XD

#

I saw someone tried to install doom on a home pregnancy test with a digital readout

raven ridge May 21, 2021, 5:15 AM

#

static bluff Any advice for a whippersnapper with big dreams and quick fingers?

We're already well off topic for this channel. Ping me in #career-advice if you want.

static bluff May 21, 2021, 6:16 AM

#

Would one of you fine folks be willing to take a look at my code? (I'm asking here, because I feel like I'm learning more talking to you guys than I've ever learned anywhere else)

lapis mist May 21, 2021, 7:02 AM

#

Hey Guys!!! I am just trying web development in Python using flask. I just created everything, but when I try to register and login it says "CSRF tokens are missing". What should I do here?

grave jolt May 21, 2021, 7:17 AM

#

static bluff Would one of you fine folks be willing to take a look at my code? (I'm asking he...

you can share it here 🙂

static bluff May 21, 2021, 8:04 AM

#

!paste

fallen slateBOT May 21, 2021, 8:04 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pydis.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

static bluff May 21, 2021, 8:04 AM

#

https://paste.pythondiscord.com/xopehuhuya.rb

#

I've been trying to conform to norms a bit better. There's nothing really impressive happening in this module, but I wanted to know if you guys noticed any bad habits (aside from the semicolons :P)

static bluff May 21, 2021, 8:38 AM

#

https://paste.pythondiscord.com/uqibamatug.py
Updated comments

prime estuary May 21, 2021, 9:09 AM

#

static bluff https://paste.pythondiscord.com/uqibamatug.py Updated comments

Well, in Python nesting your classes like that does absolutely nothing, other than making you have to access it with a dotted name. It's probably easier to just put them at the top level above your main class. In Caret, you could use operator.add instead of the lambdas you're creating there - or really just directly add the three attributes, the iteration is a bit overkill. Why is the caret iterable in the first place anyway? Instead of unpacking to tuples, give it an as_tuple method, probably returning a NamedTuple.
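
A minimal sketch of that suggestion, assuming a Caret with three integer attributes. The pasted code is not reproduced in this log, so the field names `index`, `line` and `column` and the `CaretTuple` type are placeholders for illustration only.

```python
from typing import NamedTuple


class CaretTuple(NamedTuple):
    # Placeholder field names; the real Caret attributes are not shown in the log.
    index: int
    line: int
    column: int


class Caret:
    """Hypothetical position tracker with three integer attributes."""

    def __init__(self, index: int = 0, line: int = 0, column: int = 0) -> None:
        self.index = index
        self.line = line
        self.column = column

    def __add__(self, other: "Caret") -> "Caret":
        # Add the attributes directly: no lambdas and no iteration needed.
        if not isinstance(other, Caret):
            return NotImplemented
        return Caret(
            self.index + other.index,
            self.line + other.line,
            self.column + other.column,
        )

    def as_tuple(self) -> CaretTuple:
        # Explicit conversion instead of making Caret itself iterable.
        return CaretTuple(self.index, self.line, self.column)


moved = Caret(10, 2, 4) + Caret(5, 1, 0)
index, line, column = moved.as_tuple()
print(moved.as_tuple())
```

Callers that want unpacking still get it through `as_tuple()`, and the named fields keep the order of the three values self-documenting.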

static bluff May 21, 2021, 9:10 AM

#

I had originally intended to have not just addition, but all four basic operations in caret- the lambdas are just a relic from that

#

As for the iteration overkill, yeah, you're probably right

naive apex May 21, 2021, 1:30 PM

#

hey all, I've got a python-adjacent question that I could use some help on. I've got a client-server system written in python that sends data over a socket by pickling it on one side and de-pickling it on the other. we use it for sending messages up to ~500kb in size. I wanted to rewrite the server portion of it in Rust as there were some computations that could greatly benefit from rust, but now I'm stuck on server performance - since I can't pickle the Rust side of things, I've been sending it over as a JSON byte stream, which tanks the performance: the time is dominated by converting the vector arrays on the Rust side into a JSON string, and loading on the python end. fwiw, it's largely tuples of floats that get sent over.

I'm curious if anyone has any suggestions on something to look into that can get me back to the python-python pickling / unpickling performance. I imagine something like protobuf might get me closer, but I'm really not sure. I've tried to use some of the faster JSON parsers (ujson, orjson, hyperjson), but it's still ~100x slower than the pickle-pickle side of things. This makes me think JSON is not the correct answer.

TL;DR: efficient way to send large amounts of data from a Rust app to a python client?

flat gazelle May 21, 2021, 1:32 PM

#

something like messagepack or protobuf is likely to work

halcyon trail May 21, 2021, 1:48 PM

#

or even bson

radiant fulcrum May 21, 2021, 1:51 PM

#

generally JSON is fast enough

#

like serde (serde_json is orjson for python) can process hundreds of MB/s in terms of data throughput

#

also

#

I wouldnt recommend using it but https://docs.rs/serde-pickle/0.6.2/serde_pickle/ does exist for rust 😅

#

so technically, yes you can pickle rust

#

and rust can unpickle python stuff

#

within reason

warm wadi May 21, 2021, 1:53 PM

#

I wonder if you could think of having unix sockets open for comm and just stream the data from one to another, also stream the response back

#

sorry, my bad. I lost context you have already covered

radiant fulcrum May 21, 2021, 1:55 PM

#

socket wise yeah, unix sockets are gonna be the fastest method of transport though

halcyon trail May 21, 2021, 1:55 PM

#

but he's saying it's not fast enough?

radiant fulcrum May 21, 2021, 1:55 PM

#

:pithink: I don't see how that can really happen though for 500kb messages

halcyon trail May 21, 2021, 1:56 PM

#

there isn't much reason to use json as an interchange format between two programs you control, unless you actually have some reason for wanting human readability

radiant fulcrum May 21, 2021, 1:56 PM

#

Well I mean you get the ease of protocol compatibility

halcyon trail May 21, 2021, 1:56 PM

#

you get the exact same thing with bson, or msgpack

radiant fulcrum May 21, 2021, 1:56 PM

#

writing or using a different serializing setup can make life harder between programming languages because of support

#

JSON is already hyper optimized for this stuff. Like I don't see how the format is the bottleneck here

halcyon trail May 21, 2021, 1:57 PM

#

err what

#

how is json hyper optimized for this stuff

visual shadow May 21, 2021, 1:59 PM

#

Everyone uses json for communication over the web and it's very common to use json for simple data between two programs. It's got its advantages.

radiant fulcrum May 21, 2021, 1:59 PM

#

Because it's been used for communications over sockets for decades, pretty much every language has a setup for handling the format and often most serializers and deserializers are massively optimized for handling it because of how much it's used

visual shadow May 21, 2021, 1:59 PM

#

Primarily you're guaranteed everyone supports it

halcyon trail May 21, 2021, 1:59 PM

#

that doesn't mean it's "hyper optimized"

#

the data format is the data format

radiant fulcrum May 21, 2021, 1:59 PM

#

Yes but the implementations that follow it are what make it hyper optimized

halcyon trail May 21, 2021, 1:59 PM

#

the format itself isn't designed to be optimal in any performance sense, and there's only so much you can do to optimize performance of reading it

#

they try, yes, but it's still slow compared to a binary format

#

anyway, this is purely speculative? The guy said he benchmarked and the costs of sending via json are a big factor. We don't know what his timescale is.

#

Moving from json to bson or msgpack is very easy

radiant fulcrum May 21, 2021, 2:00 PM

#

But it should be fast enough for what they're dealing with, I find it hard to believe that the format is the bottleneck considering that serde can process 400MB+ a sec per core

halcyon trail May 21, 2021, 2:01 PM

#

You don't know what they're dealing with though....

visual shadow May 21, 2021, 2:01 PM

#

This feels like premature optimization to me to be honest. I mean, if you're sending 1gb worth of data don't use jsons I guess. But for small sizes, why should json not be used

halcyon trail May 21, 2021, 2:01 PM

#

oy

visual shadow May 21, 2021, 2:02 PM

#

For what it's worth, in the grand scheme of things this decision won't really make or break your code regardless

flat gazelle May 21, 2021, 2:03 PM

#

in this case, they state that JSON serialization is too slow in rust. So either pick a faster JSON library or stop using JSON

halcyon trail May 21, 2021, 2:03 PM

#

Why not just actually give the person some useful advice, instead of telling him that he doesn't know what he's talking about, and that he's prematurely optimizing?

#

I swear, programmers

radiant fulcrum May 21, 2021, 2:03 PM

#

Because it is useful advice to tell them when it's premature

#

because you end up making sacrifices in the name of speed you dont need

halcyon trail May 21, 2021, 2:03 PM

#

You're making some pretty incredible assumptions here, that are totally unwarranted

#

Everything in the post indicates that he did reasonable due diligence, I have no idea why you two are still assuming that he's just wrong for wanting to move to a faster data format

radiant fulcrum May 21, 2021, 2:04 PM

#

recommending something like BSON has more negatives than benefits because sure, it's faster to serialize and deserialize but IO is still by far the slowest thing in the equation and BSON takes up more space than JSON generally

warm wadi May 21, 2021, 2:05 PM

#

hey, how about a memory mapped file dumped from rust and read in python using c extension? just match the structures and padding and it should fit, should work faster by eliminating the intermediate, no?

#

doesn't have to be a file as such. I just mentioned file for god knows what reason

#

it could be a stream

flat gazelle May 21, 2021, 2:05 PM

#

sending a file over a stream is non trivial

#

since a stream doesn't have a length

#

but the data likely does

halcyon trail May 21, 2021, 2:05 PM

#

except that he indicated that he does need the speed....

#

you literally know nothing about his domain

#

his timescale

flat gazelle May 21, 2021, 2:06 PM

#

the time is dominated by converting the vector arrays on the Rust side
seems like json is indeed the issue here

warm wadi May 21, 2021, 2:06 PM

#

then the person should actually chime in and contribute some more to the discussion @naive apex

radiant fulcrum May 21, 2021, 2:06 PM

#

true

#

But in the world of performance, IO is the first and often biggest bottleneck

#

the bigger your encoding format the slower your IO generally and the lower the performance

#

In rust's case recommending another format isn't going to change much due to the serde backbone

visual shadow May 21, 2021, 2:07 PM

#

halcyon trail Everything in the post indicates that he did reasonable due diligence, I have no...

Reading up, for what it's worth I'm simply taking part in the conversation from where I joined in, my comments were not intended for the op

halcyon trail May 21, 2021, 2:08 PM

#

You're saying these things like they're profound.... Trying other formats is definitely worthwhile, we have no idea what his bottleneck is

warm wadi May 21, 2021, 2:08 PM

#

it's just tuples of floats. a stream of fixed size byte buffers followed by a known delimiter isn't that hard to put together. especially when you control both the systems
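
A rough sketch of the "just send the floats as bytes" idea, Python side only. It swaps the delimiter for a length prefix, because any delimiter byte sequence can also occur inside binary float data, and a length prefix also answers the "a stream doesn't have a length" objection. The wire layout (a little-endian uint32 count followed by little-endian float64 values) and the function names are assumptions made for illustration; the Rust side would have to write the same layout, for example with `f64::to_le_bytes`.

```python
import io
import struct

HEADER = struct.Struct("<I")  # 4-byte little-endian count of floats that follow


def encode_floats(values):
    """Pack floats as: uint32 count, then that many little-endian float64s."""
    return HEADER.pack(len(values)) + struct.pack(f"<{len(values)}d", *values)


def read_exact(stream, size):
    """Read exactly `size` bytes, since a stream may return short reads."""
    chunks = []
    remaining = size
    while remaining:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed in the middle of a message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def decode_floats(stream):
    """Read one length-prefixed message from a binary stream."""
    (count,) = HEADER.unpack(read_exact(stream, HEADER.size))
    payload = read_exact(stream, 8 * count)
    return struct.unpack(f"<{count}d", payload)


# BytesIO stands in for the socket; socket.makefile("rb") gives a similar object.
wire = io.BytesIO(encode_floats((1.0, 2.5, -3.25)) + encode_floats((0.1,)))
first = decode_floats(wire)
second = decode_floats(wire)
print(first, second)
```

Decoding is a single `struct.unpack` call per message, so no text parsing of floats happens on either end.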

flat gazelle May 21, 2021, 2:08 PM

#

yes, generally, JSON should be fine

#

but in this case, it isn't, as, well, the person measured it

halcyon trail May 21, 2021, 2:08 PM

#

I only mentioned bson because it's exceptionally easy to try, if you're already using json

radiant fulcrum May 21, 2021, 2:09 PM

#

Yes but BSON is made for storage rather than transfer

halcyon trail May 21, 2021, 2:09 PM

#

certainly, protobuf, or Cap'n Proto, etc, are better solutions, they're just a little more work to set up

radiant fulcrum May 21, 2021, 2:09 PM

#

It serializes to be a much bigger size than JSON because of metadata

flat gazelle May 21, 2021, 2:09 PM

#

I do find it odd, since I was sending 4G packed datasets with JSON over http realtime

halcyon trail May 21, 2021, 2:09 PM

#

it's not much bigger. It depends what you are storing.

#

If you are storing large arrays of floats, for example, it can be smaller

flat gazelle May 21, 2021, 2:09 PM

#

but well, I don't know what the exact specifics here are

radiant fulcrum May 21, 2021, 2:11 PM

#

halcyon trail If you are storing large arrays of floats, for example, it can be smaller

Generally no, but sure we can put a pin in that for now 🙂

halcyon trail May 21, 2021, 2:11 PM

#

lets

#

amusingly, I just remembered that we moved some data coefficients for models from json to bson, I actually have an email from a coworker with sizes of some of these files:
Json: 375M
BSON: 134M
MSGPACK: 91M

radiant fulcrum May 21, 2021, 2:20 PM

#

:pithink: Are you sure the JSON wasn't pretty formatted

halcyon trail May 21, 2021, 2:21 PM

#

I don't think so. but even so, it wouldn't explain a 3x in size. It's not very nested.

radiant fulcrum May 21, 2021, 2:21 PM

#

bearing in mind that with BSON you add a considerable amount of metadata per field

#

every field has the type and key and the data itself

you have the metadata for each document which is another 2 bytes per doc for the size + the delimiters

#

So technically speaking i dont think it's ever possible to make BSON be smaller than JSON without compression

halcyon trail May 21, 2021, 2:23 PM

#

....

#

you're literally looking at a contradictory data point, first of all.

#

second of all, your comment makes me genuinely think that you don't realize that a float is smaller in binary than in text

radiant fulcrum May 21, 2021, 2:26 PM

#

I'm aware of that, though you have to have a considerably sized integer or float to make it smaller than the text representation accounting for metadata

halcyon trail May 21, 2021, 2:27 PM

#

it doesn't need to be "considerably sized" in the case of a float, it's just storing the full precision

#

anyway, facts are facts

radiant fulcrum May 21, 2021, 2:28 PM

#

bloblul

undone hare May 21, 2021, 2:36 PM

#

Floats can be really huge

grave jolt May 21, 2021, 2:37 PM

#

well,

#

JSON doesn't store floats, does it?

#

as in, it doesn't specify the precision or anything like that

radiant fulcrum May 21, 2021, 2:37 PM

#

they're just stored as their text representation, doesnt have any concept of floats no

#

so 3.1 is just stored as 3 bytes

#

not the full 16 8* bytes

grave jolt May 21, 2021, 2:38 PM

#

16?

#

why 16?

radiant fulcrum May 21, 2021, 2:38 PM

#

well, if it's a f64 / double

grave jolt May 21, 2021, 2:38 PM

#

you mean 8 bytes then?

radiant fulcrum May 21, 2021, 2:39 PM

#

fuck yes

#

FacePalm

grave jolt May 21, 2021, 2:39 PM

#

my point was that if you encode it as a double, you might lose information

undone hare May 21, 2021, 2:39 PM

#

I do this way too often

grave jolt May 21, 2021, 2:39 PM

#

grave jolt my point was that if you encode it as a `double`, you might lose information

(e.g. 3.1 cannot be represented as a finite binary fraction)
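
The two sides of this exchange can be checked with the standard library alone. The sketch below compares the text and binary sizes of 3.1 and shows that the stored double is only the nearest binary fraction to 3.1.

```python
import struct
from decimal import Decimal

text = repr(3.1)                 # shortest text that round-trips: '3.1'
binary = struct.pack("<d", 3.1)  # IEEE 754 double: always 8 bytes

# The double nearest to 3.1 is not exactly 3.1.
exact = Decimal(3.1)
is_exact = exact == Decimal("3.1")

# Round-tripping through Python's repr() does not change the stored double.
round_trips = float(text) == 3.1

print(len(text), len(binary), exact, is_exact, round_trips)
```

So a short literal like 3.1 is indeed smaller as text, while a float that needs all of its digits (as most computed values do) takes roughly 17 to 19 characters as text against a fixed 8 bytes in binary.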

halcyon trail May 21, 2021, 2:39 PM

#

well, usually you compute it as a double to start with

#

it depends where your numbers are coming from

#

i shouldn't say usually I suppose

#

but anyhow you can see a trivial example where json is larger than bson in two minutes:

d = {"hello": [random.random() for i in range(1000000)]}

#

for me this results in a 20 meg json file, and a 16 meg bson file

#

messagepack is only 8.6 megs though
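
A standard-library-only sketch of why those sizes come out that way. It does not use bson or msgpack; a raw array of doubles stands in as the lower bound for any binary format. For reference, MessagePack stores a float64 as one tag byte plus 8 payload bytes, and BSON stores each array element as a type byte, the element's index as a NUL-terminated decimal string, and 8 payload bytes, which is consistent with the roughly 9 and 16 bytes per value implied by the figures above.

```python
import json
import random
from array import array

random.seed(0)  # arbitrary seed so the run is repeatable
values = [random.random() for _ in range(100_000)]

as_json = json.dumps({"hello": values}).encode("utf-8")
as_doubles = array("d", values).tobytes()  # 8 bytes per value, no framing

json_bytes_per_value = len(as_json) / len(values)
raw_bytes_per_value = len(as_doubles) / len(values)
print(f"JSON: {json_bytes_per_value:.1f} bytes/value, raw: {raw_bytes_per_value:.1f}")
```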

undone hare May 21, 2021, 2:43 PM

#

If you share that much data through json, you are probably doing something wrong

#

This is really inefficient due to the ascii serialisation

halcyon trail May 21, 2021, 2:43 PM

#

ideally yes, sometimes you don't have control though.

grave jolt May 21, 2021, 2:43 PM

#

laughs in a 3-gigabyte SQL query

radiant fulcrum May 21, 2021, 2:44 PM

#

also you can wang that through a compression algo and life is much nicer, although you can do that with any binary format really

halcyon trail May 21, 2021, 2:44 PM

#

but if you do have control over both ends, and you care about perf, then yeah json is just a pretty bad choice

radiant fulcrum May 21, 2021, 2:44 PM

#

again sorta depends really

#

IO is still your slowest thing

#

sure, if you have a big array of floats like that BSON will be smaller (although others will be even smaller), which will mean less data to transfer, but if it's a bunch of strings etc. there's a good chance JSON will be smaller

#

I mean I just stuck that 20 meg JSON file it produced into gzip and got some 8.8MB output

halcyon trail May 21, 2021, 2:46 PM

#

gzipping is expensive

#

you're taking "IO is all that matters" as an article of faith at this point

radiant fulcrum May 21, 2021, 2:46 PM

#

yeah it is a pretty expensive compression

halcyon trail May 21, 2021, 2:46 PM

#

unzipping is expensive, parsing strings into floats is actually also quite expensive

radiant fulcrum May 21, 2021, 2:46 PM

#

actually that was Zlib not gzip sorry

halcyon trail May 21, 2021, 2:47 PM

#

At any rate, all these approaches in the end are much slower than approaches with schemas

radiant fulcrum May 21, 2021, 2:47 PM

#

yes

halcyon trail May 21, 2021, 2:47 PM

#

If i were doing something like this I'd definitely be using something more like protobuf from day 1

#

it also, most likely, saves you having to write some kind of reasonable dataclass/struct to hold the data on either side. when you are sending data between multiple languages protobuf-like approaches are hard to beat

radiant fulcrum May 21, 2021, 2:49 PM

#

20269613 Bytes in
9578680 Bytes Out from zlib
0.20867420000000003 s
9578692 Bytes Out from gzip
0.22591870000000014 s
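
The 12-byte gap between the two outputs above is what you would expect if both wrap the same DEFLATE stream: a gzip container adds at least 18 bytes of header and trailer, a zlib container adds 6. A small sketch that reproduces the comparison on a smaller, made-up payload (the payload and compression level are assumptions, not the data used above):

```python
import gzip
import json
import random
import zlib

random.seed(0)  # arbitrary seed so the run is repeatable
payload = json.dumps([random.random() for _ in range(50_000)]).encode("utf-8")

LEVEL = 6  # same compression level for both so only the container differs
as_zlib = zlib.compress(payload, LEVEL)
as_gzip = gzip.compress(payload, compresslevel=LEVEL)

print(len(payload), len(as_zlib), len(as_gzip), len(as_gzip) - len(as_zlib))
```

Note that `gzip.compress` defaults to level 9 while `zlib.compress` defaults to the library's standard level, so timings are only comparable when the level is pinned as it is here.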

halcyon trail May 21, 2021, 2:55 PM

#

i've actually now just out of curiosity been trying to create something that will be smaller in json than messagepack, and have not been successful

#

messagepack must be fairly clever

#

In [46]: def get_random_string(length):
    ...:     # choose from all lowercase letters
    ...:     letters = string.ascii_lowercase
    ...:     return ''.join(random.choice(letters) for i in range(length))
    ...:

In [47]: d = {get_random_string(10): get_random_string(10) for i in range(1000)}

#

this still creates a 28K json file and 22K msgpack file

radiant fulcrum May 21, 2021, 2:56 PM

#

http://indiegamr.com/cut-your-data-exchange-traffic-by-up-to-50-with-one-line-of-code-msgpack-vs-json/ has a good tear down on the difference it produces

grave jolt May 21, 2021, 2:57 PM

#

halcyon trail i've actually now just out of curiosity been trying to create somethign that wil...

{} ? 🙂

halcyon trail May 21, 2021, 2:57 PM

#

@grave jolt touche 🙂 didn't try it

static bluff May 21, 2021, 3:07 PM

#

What do you guys think about a built in 'regex' object, designed for building complex regular expressions pattern by pattern, and pretty printing them (plus some other functionality I guess)

grave jolt May 21, 2021, 3:09 PM

#

Isn't that just a parser combinator library?

static bluff May 21, 2021, 3:09 PM

#

I mean, someone who is half decent with regular expressions would probably have an easier time just writing it, but having a sort of 'toolkit' where you can go command by command, providing only a minimal amount of actual pattern, might be helpful for some

#

Oh probably. I don't really know what that is, but it sounds about right

flat gazelle May 21, 2021, 3:17 PM

#

It would be nice to have some options for parsing in the stdlib, though I would prefer something that can also support irregular expressions

grave jolt May 21, 2021, 3:18 PM

#

well, there's lark

flat gazelle May 21, 2021, 3:20 PM

#

Lark often feels like overkill. Sometimes, you just need to parse sexprs without needing 50+ lines

grave jolt May 21, 2021, 3:20 PM

#

well, yes

paper echo May 21, 2021, 3:22 PM

#

there's also CSON and i think some other binary json-like thing

static bluff May 21, 2021, 3:22 PM

#

has no idea what you guys are talking about 0.0

paper echo May 21, 2021, 3:22 PM

#

or cjson?

#

idk

#

i know neovim settled on msgpack as their message format

warm wadi May 21, 2021, 3:54 PM

#

that person should really add more context to the problem he’s solving with rust. I have more questions lol

#

Like, why not just use a c extension to process it within Python and forget the whole json business all together

paper echo May 21, 2021, 3:56 PM

#

it sounds like their application already has a client-server architecture @warm wadi

#

however there might be a more specific format that makes sense for their usage

#

e.g. if it's a 500 kb array, maybe they should use the numpy data format

warm wadi May 21, 2021, 3:58 PM

#

paper echo it sounds like their application already has a client-server architecture <@!788...

They have it Python server Python client and now to do compute heavy stuff they are adding rust to it

paper echo May 21, 2021, 3:58 PM

#

who's to say that they're on the same machine, or even the same local network?

#

maybe there are other good reasons why they need or want client-server?

halcyon trail May 21, 2021, 3:59 PM

#

but like salt said, it's already client-server, we should assume it's client-server for a reason

warm wadi May 21, 2021, 4:00 PM

#

But that’s not the problem or point at all. Read that post again. They are happy with pickle performance of Python on both sides. Now only on server side they want to use rust to improve computation performance

#

So it’d become Python client and rust server

#

All I’m curious about is why can’t they use a c extension on Python server to improve computation performance. Then they don’t have to fight with encoding decoding stuff

halcyon trail May 21, 2021, 4:02 PM

#

i read the post....

warm wadi May 21, 2021, 4:04 PM

#

Then what part have I understood wrong?

halcyon trail May 21, 2021, 4:08 PM

#

a single python program is also simpler than a python client and python server

#

but that's not what they have. so there's probably a reason for that, right?

paper echo May 21, 2021, 4:13 PM

#

i understand

#

@halcyon trail they are saying, instead of rewriting the server in rust, they could use a c extension to do the computation within the python server

#

or i suppose rust with cffi (?)

#

or maybe hpy https://hpyproject.org/ because apparently theres a lot of overhead with cffi https://blog.ian.stapletoncordas.co/2018/01/making-python-faster-with-rust-and-cffi-or-not.html

halcyon trail May 21, 2021, 4:14 PM

#

okay, I misunderstood dave's post then, not the original post

paper echo May 21, 2021, 4:14 PM

#

i did too

halcyon trail May 21, 2021, 4:14 PM

#

To me personally, that actually seems worse, but maybe it's a matter of taste

paper echo May 21, 2021, 4:15 PM

#

it definitely does fix the "serialization is now really slow" problem

warm wadi May 21, 2021, 4:15 PM

#

(You can address me as he/him, thanks 🙂 )

halcyon trail May 21, 2021, 4:15 PM

#

Oh, I just said "dave" there because it was getting confusing

#

not because of pronoun unsure-ity

paper echo May 21, 2021, 4:16 PM

#

that said, pickle is potentially very dangerous anyway - what if the client and server have different python versions? or any of the million other things that can go wrong with unpickling
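
On the version-mismatch point, one common mitigation is to pin the pickle protocol explicitly instead of relying on each interpreter's default. A minimal sketch, assuming protocol 4 (available since Python 3.4) is the highest one both ends support:

```python
import pickle

payload = (1.0, 2.5, -3.25)
PROTOCOL = 4  # assumption: the highest protocol both client and server understand

data = pickle.dumps(payload, protocol=PROTOCOL)

# Protocol 2+ pickles start with the PROTO opcode (0x80) followed by the version byte.
declared = data[1]
restored = pickle.loads(data)
print(declared, restored)
```

Pinning the protocol does nothing for the larger risk: unpickling data from an untrusted peer can execute arbitrary code, so pickle is only reasonable when both ends are trusted.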

halcyon trail May 21, 2021, 4:16 PM

#

people tend to overuse pronouns a lot in technical discussions, one of the first things I remember my boss drilling into me. The number of times that misunderstanding of "it" has cost 10 minutes...

#

Yeah, I mean, the "serialization is very slow" problem can be fixed in many ways, there's nothing that special about pickle

#

I actually think protobuf is a pretty nice and obvious solution here

grave jolt May 21, 2021, 4:17 PM

#

halcyon trail people tend to overuse pronouns a lot in technical discussions, one of the first...

In languages with grammatical gender, there's 2 to 3 times less confusion 🙂

#

I'm only half joking

halcyon trail May 21, 2021, 4:18 PM

#

it's a 2 for 1 value really, because protobuf gives you a) a serialization approach, b) an automatic translation/representation of data in both Rust and python at the same time

#

Writing extensions can still be quite a bit of work, and people tend to bring in libraries for that anyway if it's non-trivial, e.g. pybind11

#

but I guess it just depends

warm wadi May 21, 2021, 4:20 PM

#

sometimes if code is really out of std library then simply changing to pypy gives ample performance boost. But again, hundreds of other things to care about for long term

paper echo May 21, 2021, 4:21 PM

#

@naive apex what kind of data is this? a big array of numbers? some kind of deeply nested dicts and lists?

grave jolt May 21, 2021, 8:02 PM

#

!rule 9 @azure siren We don't allow requests or offers of paid work here.

fallen slateBOT May 21, 2021, 8:02 PM

#

Rules

9. Do not offer or ask for paid work of any kind.

azure siren May 21, 2021, 8:04 PM

#

Sorry

static bluff May 21, 2021, 9:41 PM

#

Maaaaaan

#

You guys are the best. I feel like I'm just drinking in knowledge talking with you all

sullen wolf May 22, 2021, 12:29 AM

#

So many smart people in here, definitely will come here to ask questions in the future. Thanks for having me.

modern bough May 22, 2021, 1:56 AM

#

Noob question, but can anyone point me towards some resources for learning Python VM bytecode?

prime estuary May 22, 2021, 2:38 AM

#

modern bough Noob question, but can anyone point me towards some resources for learning Pytho...

Well, the dis module docs https://docs.python.org/3/library/dis.html#python-bytecode-instructions has a list of all the current bytecodes and their functionality, and the opcode module has a bunch of lists with the actual indexes. One key thing about the behaviour is that CPython uses a stack to hold all in use data - load instructions push to the top of the stack, then operators pop their inputs, and push the result.

You may also want to consult ceval.c, which implements all the bytecodes and the core eval loop.
https://github.com/python/cpython/blob/main/Python/ceval.c

modern bough May 22, 2021, 2:43 AM

#

prime estuary Well, the `dis` module docs <https://docs.python.org/3/library/dis.html#python-b...

I was reading the dis output, but without those docs it made no sense lol. Thanks for the response. I'll check over that in the morning.

prime estuary May 22, 2021, 2:46 AM

#

Also for reference, the columns in dis are in order the line number, bytecode index, opcode name, opcode parameter (normally 0-255, with EXTENDED_ARG up to 4 bytes), then the decoded value of the parameter if useful (var name, constant value, etc).

#

The code object has a bunch of tuples the opcodes index into, like the constants array, the names array for global names looked up, etc.

#

Lines with a >> at the start are detected as the destination of a jump instruction.
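
A short way to see those pieces for yourself. The function below is made up for illustration; the exact opcodes printed vary between CPython versions, so no output is shown here.

```python
import dis

THRESHOLD_MESSAGE = "too long"


def check(items):
    if len(items) > 1000:
        return THRESHOLD_MESSAGE
    return "fine"


code = check.__code__

# The tuples that opcode arguments index into.
print("constants:", code.co_consts)
print("global names:", code.co_names)
print("local names:", code.co_varnames)

# One row per instruction: offset, opcode name, raw argument, decoded argument.
for instr in dis.get_instructions(check):
    print(instr.offset, instr.opname, instr.arg, instr.argrepr)

# dis.dis(check) prints the same information in the column layout described above.
```

Comparing `instr.arg` with `instr.argrepr` row by row shows how a small integer argument becomes a name or constant once it is looked up in the matching tuple.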

static bluff May 22, 2021, 3:36 AM

#

I don't think I'm ever going to understand this parser O.o

pliant tusk May 22, 2021, 3:42 AM

#

prime estuary Also for reference, the columns in dis are in order the line number, bytecode in...

A bit more on EXTENDED_ARG: because internally the opcode argument is stored in a signed 4 byte integer (-2,147,483,648 to 2,147,483,647), it is possible to set the opcode argument to a negative value using repeated EXTENDED_ARG opcodes in manually crafted bytecode

#

afaik, negative opcode arguments do not occur in generated bytecode
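
The ordinary, non-negative use of EXTENDED_ARG is easy to trigger from generated code: give a code object more than 256 constants, so that some constant indexes no longer fit in the one-byte argument. A sketch of that (the generated source is arbitrary, and the exact counts depend on the CPython version):

```python
import dis

# 300 distinct constants, so some constant indexes do not fit in one byte.
source = "\n".join(f"x = {i}.5" for i in range(300))
module_code = compile(source, "<generated>", "exec")

extended = [
    instr for instr in dis.get_instructions(module_code)
    if instr.opname == "EXTENDED_ARG"
]
print(len(module_code.co_consts), "constants,", len(extended), "EXTENDED_ARG prefixes")
```

This only demonstrates the prefix appearing in compiler output; it does not construct the negative arguments mentioned above, which need hand-crafted bytecode.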

modern bough May 22, 2021, 3:56 AM

#

prime estuary Also for reference, the columns in dis are in order the line number, bytecode in...

So dis doesn’t output the code in the same order it receives it?

prime estuary May 22, 2021, 5:15 AM

#

It does output it in order, but it just displays the offsets and line counts so you can keep track.

flat gazelle May 22, 2021, 1:28 PM

#

!pban @wintry herald spam

fallen slateBOT May 22, 2021, 1:28 PM

#

failmail :ok_hand: applied purge ban to @wintry herald permanently.

velvet cradle May 22, 2021, 3:00 PM

#

I don't know how to add graphics yet, anyways I'm trying to make a game that is almost completely reliant on achievements, achievements are how you beat the game, and like I said I don't know graphics yet so it is going to be a text based game. Any ideas for the game and the name for the game?

static bluff May 22, 2021, 3:11 PM

#

velvet cradle I dont know how to add graphics yet, anyways im trying to make a game that is al...

This would be a question for #game-development, though all of us would probably say that if you aren't willing to learn even basic graphics, you aren't invested enough to pull off what you're wanting to do. Game development is hard. It takes a lot of time and a lot of work. If you aren't prepared to do that work, it means your heart isn't in it and you should find something that you do like to do

#

Not to be mean or anything, but it's the truth

real ruin May 22, 2021, 4:47 PM

#

I'd expect the description to be accessed by msg.embeds.description

rich cradle May 22, 2021, 4:47 PM

#

If you need node.js help, you should ask in off-topic

fallen slateBOT May 22, 2021, 4:47 PM

#

Off-topic channels

There are three off-topic channels:
• #ot0-psvm’s-eternal-disapproval
• #ot1-perplexing-regexing
• #ot2-never-nester’s-nightmare

Their names change randomly every 24 hours, but you can always find them under the OFF-TOPIC/GENERAL category in the channel list.

Please read our off-topic etiquette before participating in conversations.

static bluff May 22, 2021, 4:50 PM

#

I'm starting to do the reading on PEGs and Python's new parser. I wanted to check my understanding as it stands, and ask a couple of questions of you guys

#

So, a parsing expression grammar (PEG) is a set of rules written in a creole not unlike regular expressions, and is used to define the various patterns that constitute valid syntax. Unlike regular expressions, however, no 'standard' universal-across-languages procedure exists to apply the expression against text. Additionally, a parsing expression assumes whatever system is applying the rules is capable of recursion and other mechanics not available in regular expressions

#

To apply the expression, the expression is fed to a 'parser generator' (possibly alongside a 'metagrammar': an additional set of rules/specifications which tell the parser generator how to interpret the expression) which generates actual code capable of applying the patterns to text (or a stream of tokens)

#

Unlike Python's original pgen parser, a 'left-recursive pushdown parser with 1-token lookahead' (I have very little understanding of what that means), a PEG-enabled parser is capable of both infinite lookahead and infinite lookbehind (left-recursion???). Additionally, it is a 'recursive-descent' parser: one which checks a given alternative all the way to completion or failure, consuming input in the event of success and moving on to the next alternative in the event of failure without consuming input

#

The addition of an 'action' notation with a PEG enables an abstract syntax tree, as opposed to a concrete syntax tree, to be built directly within the parser. Use of 'memoization' (caching) and a few other tricks to save memory keep the parser running at linear speed

#

Did I get anything wrong? Have I displayed any poor or partial comprehension of anything important?

#

*Addendum: left recursion and infinite look ahead/behind allows for significantly more readable and sensical patterns, with fewer 'hacks' and reliance on post-processing
*Addendum: actions specified within the grammar act not unlike callbacks, and are used to actually generate the nodes constituting the ast (???)

paper echo May 22, 2021, 5:14 PM

#

I believe a PEG is a grammar for expressing certain kinds of languages, and there are time-efficient parsing algorithms for parsing languages expressed as PEGs

#

Afaik the linear time algorithm you described is called "packrat parsing"

#

I guess because it memoizes a lot of stuff
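
A toy sketch of the idea, for anyone following along. The two-rule grammar and all the names here are made up for illustration, and this is not how pegen itself is written. Each rule is a function of the input position, ordered choice tries alternatives in turn, and memoizing on (rule, position) is the "packrat" part.

```python
from functools import lru_cache

def make_parser(text):
    """Build memoized rule functions for a toy grammar:
        expr <- term ('+' term)*
        term <- NUMBER / '(' expr ')'
    Each rule takes a position and returns (ast, next_pos), or None on failure.
    """

    @lru_cache(maxsize=None)   # packrat: at most one evaluation per (rule, position)
    def expr(pos):
        result = term(pos)
        if result is None:
            return None
        node, pos = result
        while pos < len(text) and text[pos] == "+":
            rhs = term(pos + 1)
            if rhs is None:
                break                      # leave the '+' unconsumed
            right, pos = rhs
            node = ("add", node, right)    # "action": build the AST node directly
        return node, pos

    @lru_cache(maxsize=None)
    def term(pos):
        # First alternative: NUMBER
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        if end > pos:
            return ("num", int(text[pos:end])), end
        # Ordered choice: the second alternative is tried only if the first failed.
        if pos < len(text) and text[pos] == "(":
            inner = expr(pos + 1)
            if inner is not None:
                node, after = inner
                if after < len(text) and text[after] == ")":
                    return node, after + 1
        return None                        # failure consumes no input

    return expr

parse = make_parser("1+(2+3)")
print(parse(0))
# → (('add', ('num', 1), ('add', ('num', 2), ('num', 3))), 7)
```

Left recursion needs extra machinery on top of plain memoization, which this sketch doesn't attempt.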

static bluff May 22, 2021, 5:23 PM

#

That's my understanding also

#

I guess the most important question—the possibly months-of-work-saving question is

#

Assuming no gigantic changes to the nature of the system, could I apply python's native parser generator to a modified version of Python's PEG and have the resulting parser work?

true ridge May 22, 2021, 5:26 PM

#

static bluff Assuming no gigantic changes to the nature of the system, could I apply python's...

What do you mean by this?

static bluff May 22, 2021, 5:26 PM

#

Well, I'm building a language whose syntax is based off of Python's. A few minor tweaks but the real differences come from implementation, not syntax

true ridge May 22, 2021, 5:27 PM

#

it is possible. Currently CPython's parser generator

#

outputs in 2 different languages, c and python

static bluff May 22, 2021, 5:27 PM

#

Could I take Python's PEG, modify it as needed, and feed the modified expression to the same parser generator python uses to create a working parser

#

🤤

true ridge May 22, 2021, 5:28 PM

#

and the C parser is pretty specific to CPython since it uses a lot of internal functions

#

but if you were to use the Python generator and port the grammar, then it would work

#

In fact, there is already a work in progress PR to do so

static bluff May 22, 2021, 5:28 PM

#

Being able to proceed rewriting only the grammar, and not the generator, would be a godsend

true ridge May 22, 2021, 5:29 PM

#

and i believe author of that PR is also working on a language that is based on python (a python superset tbmp)

prime estuary May 22, 2021, 5:29 PM

#

Or use another parser generator, there's quite a number around.

true ridge May 22, 2021, 5:29 PM

#

https://github.com/we-like-parsers/pegen/pull/11

GitHub

data: wip on making Python parser for python grammar by MatthieuDar...

#

Pegen is pretty cool tbh, especially with all the actions and its custom expansion forms like ','.something+ etc

static bluff May 22, 2021, 5:30 PM

#

So it sounds to me as though my way forward is to treat the parser generator itself as a black box for now and instead focus on having a thorough understanding of the language used to write the expression.

#

Write the expression, feed it to the generator, and I guess see what happens?

true ridge May 22, 2021, 5:32 PM

#

if you are planning to make small changes on the grammar, you can even use the parser as is.

static bluff May 22, 2021, 5:33 PM

#

As it stands now, the only difference is a few additional operators as well as multiline lambdas (through arrow notation)

#

The latter might be a bit tricky. Blocks within blocks

true ridge May 22, 2021, 5:34 PM

#

for cases like this, i go with tokens. For example if you'd like to add something like $name, then you can simply alter the tokenizer to recognize ($) and then manually edit the token stream to replace $<something> with the form of __name_<something> and after the parser creates the AST, go over all the identifiers and replace the custom forms with their own nodes

true ridge May 22, 2021, 5:34 PM

#

static bluff The latter might be a bit tricky. Blocks within blocks

for the case of arrow functions, you could simply replace them as normal functions with a weird name, e.g __anon_uuid(<sig>): and handle it as my previous example
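
A rough sketch of the $name trick. It swaps in a regex rewrite of the source text in place of editing the token stream, which is simpler but cruder, and the `__name_` prefix and the function name are just placeholders.

```python
import ast
import re

PREFIX = "__name_"   # placeholder marker, following the message above

def parse_with_dollar_names(source):
    # Stand-in for editing the token stream: a regex rewrite of $ident.
    # Caveat: unlike a real tokenizer pass, this also rewrites inside
    # string literals and comments.
    rewritten = re.sub(r"\$([A-Za-z_]\w*)", PREFIX + r"\1", source)
    tree = ast.parse(rewritten)

    # Post-process the AST: find the placeholder identifiers. A real
    # implementation would swap in its own node types at this point.
    dollar_names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id.startswith(PREFIX):
            dollar_names.append(node.id[len(PREFIX):])
    return tree, dollar_names

tree, found = parse_with_dollar_names("total = $price * qty")
print(found)   # → ['price']
```

The upside of this route is that the stock parser is used untouched. The cost is that error messages and positions refer to the rewritten source.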

static bluff May 22, 2021, 5:35 PM

#

You're not wrong with regards to your approach, but it would be a missed learning opportunity for me to go that route

#

What I'd much rather do is focus on learning how the grammar works first and feed it through a working generator, and once I know the grammar is sound, backtrack and build my own generator

#

Having both as question marks would make things way too ambiguous

true ridge May 22, 2021, 5:37 PM

#

if your main purpose is learning, then I guess the proper way would be not caring too much about thoroughness (like how much esoteric stuff you could parse) but rather find a version of the old python grammar (perhaps something from 3.8 or earlier) and try to write a parser for it (or even a parser generator, if you don't like hand written stuff)

true ridge May 22, 2021, 5:38 PM

#

static bluff What I'd much rather do is focus on learning how the grammar works first and fee...

once you get the theory, you could even fork the old 'pgen' to add backtracking. 🙂

static bluff May 22, 2021, 5:38 PM

#

Ahh, in my reading I've seen that the core devs feel its time to put pgen out to pasture

#

Too old

true ridge May 22, 2021, 5:39 PM

#

well pgen is gone (there is still a fork of it living under lib2to3) but it is deprecated

#

and will be gone soon

#

though I'd say an LL(1) parser is much more fundamental and simpler than the other variants out there

static bluff May 22, 2021, 5:39 PM

#

I'm glad to hear I'm not completely misunderstanding the problem. I think I might be on the right track

#

I'm glad I took the time to write this all out. I was going to take a shower. But the process of building the parser seems much less like magic now that I've had a chance to put it all out in words

#

😄

true ridge May 22, 2021, 5:42 PM

#

it is indeed really fun to work on. If you are interested in going even deeper, I'd really recommend 'Parsing Techniques: A Practical Guide' for other different methodologies

static bluff May 22, 2021, 5:53 PM

#

Oh thank you!

boreal umbra May 22, 2021, 9:55 PM

#

It seems that if you're working on a bunch of folders of interrelated Python files that aren't part of a library (you haven't installed it with pip install -e), the best way to avoid import errors is to use python -m from the root folder of the project. Am I right in thinking this?

raven ridge May 22, 2021, 10:02 PM

#

it depends on the particular structure, but I think yes, python -m is more likely to work than anything else

prime estuary May 22, 2021, 10:27 PM

#

Yes, since then sys.path keeps your working directory.
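
A small self-contained demo of the difference, assuming a made-up layout with two sibling packages (`demo_app` and `demo_lib`) in a temporary project root. It runs the same file both ways from the root.

```python
import os
import subprocess
import sys
import tempfile

def run_demo():
    with tempfile.TemporaryDirectory() as root:
        # Made-up layout: two sibling packages under the project root.
        for pkg in ("demo_app", "demo_lib"):
            os.makedirs(os.path.join(root, pkg))
            open(os.path.join(root, pkg, "__init__.py"), "w").close()
        with open(os.path.join(root, "demo_lib", "helper.py"), "w") as f:
            f.write("VALUE = 42\n")
        with open(os.path.join(root, "demo_app", "main.py"), "w") as f:
            f.write("from demo_lib import helper\nprint(helper.VALUE)\n")

        # -m puts the working directory (the project root) on sys.path.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_app.main"],
            cwd=root, capture_output=True, text=True)
        # Running the file directly puts demo_app/ on sys.path instead,
        # so the sibling package demo_lib cannot be found.
        as_script = subprocess.run(
            [sys.executable, os.path.join("demo_app", "main.py")],
            cwd=root, capture_output=True, text=True)
        return as_module, as_script

as_module, as_script = run_demo()
print(as_module.returncode, as_module.stdout.strip())
print(as_script.returncode, "ModuleNotFoundError" in as_script.stderr)
```

Environment settings that change sys.path (PYTHONPATH, safe-path mode) could alter the result, so treat this as the default behaviour only.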

silk pawn May 22, 2021, 10:35 PM

#

so i'm not exactly clear how python bytecode is turned into instructions for the computer, but basically i was wondering if, if python has, for example, two binary add opcodes in a row, if it used SIMD to execute it, since virtually every CPU supports that nowadays

#

i've been playing with simd in cython, and it's really cool, but i can't figure out how to check if python uses them

#

and if it doesn't, why not? guido had said that speeding up python is now a major goal

spark magnet May 22, 2021, 10:36 PM

#

@silk pawn bytecode isn't turned into instructions, they are interpreted by a giant C switch statement.

silk pawn May 22, 2021, 10:36 PM

#

oh

#

can you point me to the big switch table on github

spark magnet May 22, 2021, 10:37 PM

#

@silk pawn and the add opcode has to figure out what "add" means for the object at the top of the stack.

silk pawn May 22, 2021, 10:37 PM

#

let's say it determines that the object is a primitive int

spark magnet May 22, 2021, 10:38 PM

#

@silk pawn https://github.com/python/cpython/blob/main/Python/ceval.c#L1813

fallen slateBOT May 22, 2021, 10:38 PM

#

Python/ceval.c line 1813

switch (opcode) {

spark magnet May 22, 2021, 10:38 PM

#

silk pawn let's say it determines that the object is a primitive int

i'm not sure what you mean by primitive. All values are objects, including ints

#

@silk pawn binary add: https://github.com/python/cpython/blob/main/Python/ceval.c#L2033-L2057

silk pawn May 22, 2021, 10:39 PM

#

sorry wrong terminology, i meant like if it determines that it's like a basic add for a C int

#

god i can't phrase this

spark magnet May 22, 2021, 10:39 PM

#

silk pawn god i can't phrase this

A python int is an object with a type and a refcount.

silk pawn May 22, 2021, 10:42 PM

#

spark magnet A python int is an object with a type and a refcount.

so if you do

x = 1
y = 2
z = 3
a = x + y
b = y + z

will python just call pynumber_add twice or is there a special thing to add two sets of pyobjects that can be determined to represent an integer

spark magnet May 22, 2021, 10:43 PM

#

it will have two BINARY_ADD bytecodes, and will call PyNumber_Add twice
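
This can be checked with dis. A quick sketch: the opcode is named BINARY_ADD on the CPython versions discussed here, while newer releases use a generic BINARY_OP, so the filter accepts either name.

```python
import dis

source = "x = 1\ny = 2\nz = 3\na = x + y\nb = y + z\n"
code = compile(source, "<demo>", "exec")

# One add instruction per `+`; nothing fuses the two additions together.
adds = [ins for ins in dis.get_instructions(code)
        if ins.opname in ("BINARY_ADD", "BINARY_OP")]
print(len(adds))   # → 2
```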

silk pawn May 22, 2021, 10:43 PM

#

because in c, i believe you can do (some syntax omitted)

int x = 1
int y = 2
int z = 3
int a = x + y
int b = y + z

and if you use some simd stuff then it does the adding in one instruction

spark magnet May 22, 2021, 10:43 PM

#

this is CPython we're talking about. Other implementations like PyPy could be smarter

spark magnet May 22, 2021, 10:44 PM

#

silk pawn because in c, i believe you can do (some syntax omitted) ```c int x = 1 int y = ...

Python is very different than C

silk pawn May 22, 2021, 10:44 PM

#

yes i understand that much, but i'd think python could try to emulate this behavior to make it faster

#

what is the barrier to python doing this

pliant tusk May 22, 2021, 10:45 PM

#

i think the reason it doesnt is because python has very few guarantees about the type of an object

spark magnet May 22, 2021, 10:45 PM

#

silk pawn yes i understand that much, but i'd think python could try to emulate this behav...

There's more going on in those last two statements than simply adding numbers. There's object allocation. And the ints could be multi-precision in the first place.

pliant tusk May 22, 2021, 10:46 PM

#

the statement a = x + y results in LOAD_NAME 'x' LOAD_NAME 'y' BINARY_ADD STORE_NAME 'a'

#

BINARY_ADD is generic, so it will work for any object that defines __add__
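
To make that concrete, a minimal sketch with an invented `Money` class: the same `a + b` in one function handles ints, strings, lists, and a user-defined type, so the interpreter can't assume a machine-level integer add.

```python
class Money:
    """Invented user-defined type, just to show + is dispatched at runtime."""
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)

def add(a, b):
    return a + b     # one add instruction, whatever the operands turn out to be

print(add(1, 2))                           # → 3
print(add("ab", "cd"))                     # → abcd
print(add([1], [2]))                       # → [1, 2]
print(add(Money(150), Money(250)).cents)   # → 400
```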

silk pawn May 22, 2021, 10:47 PM

#

spark magnet There's more going on in those last two statements than simply adding numbers. T...

ok yeah i forgot about object allocation, but could python detect if the int is multi precision and then fall back to the current way

grave jolt May 22, 2021, 10:47 PM

#

the problem always boils down to the fact that Python doesn't know what type stuff is at compile time, I guess

spark magnet May 22, 2021, 10:47 PM

#

@silk pawn did you see the comment in the BINARY_ADD switch case?

silk pawn May 22, 2021, 10:48 PM

#

spark magnet @silk pawn did you see the comment in the BINARY_ADD switch case?

yes i'm just using this as an example

#

i saw the comment from victor about not micro optimizing

prime estuary May 22, 2021, 10:48 PM

#

It's impossible for Python to know, unless everything's a constant.

#

And they're all local variables.

gleaming rover May 22, 2021, 10:48 PM

#

yeah

#

if you need to add fast…you have numpy

#

and if you need to add very fast…numpy + numba

silk pawn May 22, 2021, 10:50 PM

#

ok cool

#

thanks for answering, nedbat, gm, fix, chilaxan, teamspen

raven ridge May 22, 2021, 10:54 PM

#

silk pawn ok yeah i forgot about object allocation, but could python detect if the int is ...

all ints are multi-precision.

grave jolt May 22, 2021, 10:54 PM

#

Interesting issue about subtyping and type narrowing
https://github.com/microsoft/pyright/issues/1899

from typing import Union, TypedDict

class Foo(TypedDict):
    x: int

class Foo2(Foo):
    y: int

class Bar(TypedDict):
    y: str

def f(foobar: Union[Foo, Bar]):
    if 'y' in foobar:
        print(foobar['y'].lower())

x: Foo2 = {'x': 1, 'y': 2}

f(x)  # fails at runtime: AttributeError: 'int' object has no attribute 'lower'

raven ridge May 22, 2021, 10:54 PM

#

Python doesn't have a special type of int that wraps a native integer.

#

every int in CPython is represented as an array of base 2**30 digits.

halcyon trail May 23, 2021, 12:00 AM

#

I think int in cpython is arbitrary number of digits

spark magnet May 23, 2021, 12:01 AM

#

halcyon trail I think int in cpython is arbitrary number of digits

yes, arbitrary number of 30-bit digits
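
A sketch that mimics the layout in pure Python. It doesn't read the real memory; `to_digits` and `from_digits` are made-up helpers, and the digit width comes from `sys.int_info`.

```python
import sys

def to_digits(n, bits=sys.int_info.bits_per_digit):
    """Split a non-negative int into base-2**bits digits, least significant first."""
    base = 1 << bits
    digits = []
    while True:
        n, digit = divmod(n, base)
        digits.append(digit)
        if n == 0:
            return digits

def from_digits(digits, bits=sys.int_info.bits_per_digit):
    """Rebuild the int from its little-endian digit list."""
    return sum(d << (bits * i) for i, d in enumerate(digits))

n = 2**100 + 12345
digits = to_digits(n)
print(sys.int_info.bits_per_digit, digits)
print(from_digits(digits) == n)   # → True
```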

halcyon trail May 23, 2021, 12:01 AM

#

I guess that is a way to look at it

spark magnet May 23, 2021, 12:03 AM

#

it's the way the implementation thinks of it.

halcyon trail May 23, 2021, 12:03 AM

#

I would have still thought in terms of an array of binary digits that expands in multiples of 30

#

I wonder why 30

#

I guess it uses the remaining 2 bits for something, not immediately obvious what

spark magnet May 23, 2021, 12:05 AM

#

not sure, it might be so that overflows of digit/digit ops don't become inconvenient.

halcyon trail May 23, 2021, 12:06 AM

#

That makes sense

#

It can just do 32 bit operations, and if the most significant bit gets set, set it back to zero and it knows to set the least significant bit in the next one
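
Roughly, in Python terms. This is a sketch of schoolbook addition over little-endian lists of 30-bit digits, not the actual C code, and `add_digit_arrays` is an invented name.

```python
SHIFT = 30
MASK = (1 << SHIFT) - 1

def add_digit_arrays(a, b):
    """Add two little-endian lists of 30-bit digits, schoolbook style."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        total = carry
        total += a[i] if i < len(a) else 0
        total += b[i] if i < len(b) else 0
        result.append(total & MASK)   # keep the low 30 bits as this digit
        carry = total >> SHIFT        # 0 or 1, pushed into the next digit
    if carry:
        result.append(carry)
    return result

print(add_digit_arrays([MASK], [1]))   # → [0, 1]
```

The per-digit sum never exceeds 2 * (2**30 - 1) + 1, so it fits comfortably in 32 bits before the carry is split off.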

spark magnet May 23, 2021, 12:07 AM

#

the code has a constant, PYLONG_BITS_IN_DIGIT, which is either 15 or 30

halcyon trail May 23, 2021, 12:08 AM

#

I wonder why go this route when a huge fraction of machines running python are 64 bit

spark magnet May 23, 2021, 12:08 AM

#

perhaps because 2**30 * 2**30 will fit into a 64-bit int

#

From a comment in the code: Type 'digit' should be able to hold 2*PyLong_BASE-1, and type 'twodigits' should be an unsigned integer type able to hold all integers up to PyLong_BASE*PyLong_BASE-1.

halcyon trail May 23, 2021, 12:09 AM

#

Interesting stuff

spark magnet May 23, 2021, 12:09 AM

#

and twodigits is uint64_t
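
The two bounds from that comment, checked as plain arithmetic. This assumes digit is a 32-bit unsigned type and twodigits a 64-bit one.

```python
SHIFT = 30
BASE = 1 << SHIFT

# 'digit' must hold 2*BASE - 1 (two digits added together, plus a carry).
print(2 * BASE - 1 < 2**32)        # → True
# 'twodigits' must hold BASE*BASE - 1 (covers any product of two digits).
print(BASE * BASE - 1 < 2**64)     # → True

# With full 32-bit digits the first bound would already fail:
print(2 * (1 << 32) - 1 < 2**32)   # → False
```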

halcyon trail May 23, 2021, 12:09 AM

#

I wonder how it compares with arbitrary width integer implementations in C++

spark magnet May 23, 2021, 12:10 AM

#

https://github.com/python/cpython/blob/main/Include/longintrepr.h#L11-L42

halcyon trail May 23, 2021, 12:10 AM

#

I suppose you could actually compare them directly, using the C code and not going via python

cold dew May 23, 2021, 12:35 AM

#

Hi there I'm new to the community and I need help with python for an assignment

spark magnet May 23, 2021, 12:36 AM

#

cold dew Hi there I'm new to the community and I need help with python for an assignment

#python-discussion might be a better place

eternal furnace May 23, 2021, 1:58 AM

#

cold dew Hi there I'm new to the community and I need help with python for an assignment

#❓｜how-to-get-help

versed fable May 23, 2021, 2:01 AM

#

got pinged....

spark magnet May 23, 2021, 2:03 AM

#

versed fable got pinged....

there was a raid. it's over.

static bluff May 23, 2021, 2:05 AM

#

Another one, eh?

halcyon trail May 23, 2021, 2:21 AM

#

what is a "raid"

#

not a big discord user

#

some kind of attempt to do the discord equivalent of DDOS

raven ridge May 23, 2021, 2:26 AM

#

yep. spam messages to disrupt conversation.

versed fable May 23, 2021, 2:30 AM

#

spark magnet there was a raid. it's over.

oof that's so sad

static bluff May 23, 2021, 3:48 AM

#

Thoughts on building a decorator to dynamically hint a method? Would I rebuild the method by a call to types.FunctionType and provide the new annotations as an argument?

grave jolt May 23, 2021, 4:05 AM

#

static bluff Thoughts on building a decorator to dynamically hint a method? Would I rebuild t...

why would you dynamically hint a method? 👀

static bluff May 23, 2021, 4:05 AM

#

I'm actually just getting to the point of asking myself that very question

grave jolt May 23, 2021, 4:05 AM

#

🙂

static bluff May 23, 2021, 4:06 AM

#

Just, I find this rather unpleasant

    def generateTokens(self, whitespace:str=None, comment:str=None, number:str=None,
                       string:str=None, keyword:str=None, operator:str=None, identifier:str=None,
                       **matchgroups:ExpressionToplevel.Matchgroups):

grave jolt May 23, 2021, 4:08 AM

#

Well, the whole point of type hints is that they provide documentation, and that tools like mypy and pyright understand them. If you generate the typehints dynamically, you lose all of the benefits

#

What does that function do?

static bluff May 23, 2021, 4:09 AM

#

I'd much prefer (for my own language)

@annotate(matchgroups=ExpressionToplevel.Matchgroups)
@annotate(whitespace=str, comment=str, number=str, string=str)
@annotate(keyword=str, operator=str, identifier=str )
def generateTokens(self, whitespace, comment, number, string, keyword, operator, identifier, **matchgroups):

#

Sorry that took so long to type. But it's only just dawning on me in a formal sense that the language doesn't care about hints; introspection tools do, and they do so by searching the source itself
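
For what it's worth, a decorator like that doesn't need types.FunctionType, since `__annotations__` is a plain writable dict. A minimal sketch, where `annotate` is a hypothetical helper and not something from the standard library:

```python
def annotate(**hints):
    """Hypothetical decorator: merge keyword arguments into __annotations__."""
    def decorator(func):
        func.__annotations__.update(hints)
        return func
    return decorator

@annotate(whitespace=str, comment=str)
@annotate(number=str)
def generate_tokens(whitespace=None, comment=None, number=None):
    return whitespace or comment or number

print(generate_tokens.__annotations__)
```

The hints are visible at runtime (e.g. to `inspect.signature`), but static checkers such as mypy or pyright read the source and won't see them.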

native flame May 23, 2021, 4:11 AM

#

static bluff Just, I find *this* rather unpleasant ```py def generateTokens(self, whitesp...

that would look a lot nicer just by splitting it over lines:

def generateTokens(
  self, 
  whitespace: str = None, 
  comment: str = None, 
  number: str = None,
  string: str = None, 
  keyword: str = None, 
  operator: str = None, 
  identifier: str = None,
  **matchgroups: ExpressionToplevel.Matchgroups
):

static bluff May 23, 2021, 4:12 AM

#

Signatures have always been a bit of a sore spot for me. In a perfect world, I'd prefer

@annotate(matchgroups=ExpressionToplevel.Matchgroups)
@annotate(whitespace=str, comment=str, number=str, string=str)
@annotate(keyword=str, operator=str, identifier=str )
@default(whitespace=None, comment=None, number=None, string=None)
@default(keyword=None, operator=None, identifier=None)
def generateTokens(self, whitespace, comment, number, string, keyword, operator, identifier, **matchgroups):

static bluff May 23, 2021, 4:12 AM

#

native flame that would look a lot nicer just by splitting it over lines: ```py def generateT...

That does indeed look better, but, my code is generally rather long- long lines I mean

grave jolt May 23, 2021, 4:13 AM

#

what does this function do? why does it have so many parameters?

static bluff May 23, 2021, 4:13 AM

#

And hinting like that throws off the feng-shui for me (all of this is completely topical, and unimportant by the way)

#

Well as the name suggests, it generates tokens. Each one of those arguments is a capture group as returned by match.groups()

#

So it's either a string, assuming the group matched, or None

#

Jeeze, my spelling is bad today

grave jolt May 23, 2021, 4:15 AM

#

Is it possible that more than one of the arguments is not None?

static bluff May 23, 2021, 4:15 AM

#

Generally most of them are

#

Sorry, no, I misread

#

Generally speaking, only one will be not-none

grave jolt May 23, 2021, 4:16 AM

#

So you have a regex like (?P<foo>...)|(?P<bar>...)|..., right?

static bluff May 23, 2021, 4:16 AM

#

More or less

grave jolt May 23, 2021, 4:16 AM

#

So if you don't have a bug, it's impossible to have more than one matched group?

static bluff May 23, 2021, 4:18 AM

#

In this specific case, yes. 'generateTokens' only takes capture groups corresponding to the main lexical categories- only one will ever match. Other similar methods can have multiple matches though, for example

    def generateBaseX(self, number:str=None, integer:str=None, floatpoint:str=None,
                      exponent:str=None, complex:str=None,
                      **matchgroups:ExpressionToplevel.Matchgroups):
        """Generate a number token of integer, floatpoint, exponent, or complex.
        NOTE: Complex takes precedence over exponent, which takes precedence over other formats.
        NOTE: Exponents are floating point numbers by definition."""

        if complex:
            return self.token('NUMBER', 'COMPLEX', self.source.advance, number);

        if exponent:
            return self.token('NUMBER', 'EXPONENT', self.source.advance, number);

        if floatpoint:
            return self.token('NUMBER', 'FLOATPOINT', self.source.advance, number);

        if integer:
            return self.token('NUMBER', 'INTEGER', self.source.advance, number);

grave jolt May 23, 2021, 4:19 AM

#

Why not make a single object for the match result?

#

and then access its fields

static bluff May 23, 2021, 4:20 AM

#

Fix you beautiful son of a bitch

grave jolt May 23, 2021, 4:20 AM

#

👀

static bluff May 23, 2021, 4:20 AM

#

+2

grave jolt May 23, 2021, 4:20 AM

#

Or just accept the match as the argument.

static bluff May 23, 2021, 4:20 AM

#

???

grave jolt May 23, 2021, 4:21 AM

#

the re.Match object
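
A sketch of what that could look like. The number pattern and the group names are invented for the example, with alternatives ordered so that complex wins over exponent, then float, then integer.

```python
import re

# Invented number pattern; alternatives are ordered by precedence.
NUMBER = re.compile(
    r"(?P<COMPLEX>\d+(?:\.\d+)?[jJ])"
    r"|(?P<EXPONENT>\d+(?:\.\d+)?[eE][+-]?\d+)"
    r"|(?P<FLOATPOINT>\d+\.\d+)"
    r"|(?P<INTEGER>\d+)"
)

def generate_number(match):
    """Take the re.Match itself instead of one parameter per capture group."""
    # lastgroup names the alternative that actually matched.
    return ("NUMBER", match.lastgroup, match.group())

for text in ("42", "3.14", "1e10", "2j"):
    print(generate_number(NUMBER.match(text)))
```

Because the precedence lives in the order of the alternatives, the chain of `if` checks disappears. Wrapping the match in your own class later is still possible if custom printing is needed.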

static bluff May 23, 2021, 4:23 AM

#

I could, though I generally prefer having a little more control than that. I might want to switch up how the object gets printed or else add some other functionality

static bluff May 23, 2021, 4:53 AM

#

Is python's parser process a single step? I've seen it separated in some contexts into 'syntax analysis' and 'semantic analysis', but those might be so closely interwoven that they could be executed in a single step

deft pagoda May 23, 2021, 5:06 AM

#

I remember a quick tokenizer made during a beazley talk:

#

!e

import re
from collections import namedtuple

tokens = [
    r'(?P<NUMBER>\d+)',
    r'(?P<PLUS>\+)',
    r'(?P<MINUS>-)',
    r'(?P<TIMES>\*)',
    r'(?P<DIVIDE>/)',
    r'(?P<WS>\s+)',
]

PARSER = re.compile('|'.join(tokens))
Token = namedtuple('Token', 'type value')

def tokenize(text):
    scan = PARSER.scanner(text)
    for match in iter(scan.match, None):
        if match.lastgroup != 'WS':
            yield Token(match.lastgroup, match.group())

print(*tokenize('2 + 3*4 - 5'))

fallen slateBOT May 23, 2021, 5:06 AM

#

@deft pagoda :white_check_mark: Your eval job has completed with return code 0.

Token(type='NUMBER', value='2') Token(type='PLUS', value='+') Token(type='NUMBER', value='3') Token(type='TIMES', value='*') Token(type='NUMBER', value='4') Token(type='MINUS', value='-') Token(type='NUMBER', value='5')

static bluff May 23, 2021, 7:13 AM

#

Can I get a hand my dudes? I've asked in the general and also in a channel, no bytes (get it?)

#

self.source = source if isinstance(source, str) else source.read().decode();
self.sourcelines = io.StringIO(self.source).readlines();

#

I'm taking in input as either a string or a file-like io object. I want to get both the source and its constituent lines in unicode form

#

This approach is better than my original, but still seems off

sacred yew May 23, 2021, 7:15 AM

#

uh is there a reason why you cant just do self.sourcelines = self.source.split("\n")

static bluff May 23, 2021, 7:15 AM

#

I believe that different operating systems use different newline separators, no? Assuming that's true, I need to split the input using the same procedure a file-like object would

grave jolt May 23, 2021, 7:16 AM

#

when reading from file, Python turns newlines to \n

static bluff May 23, 2021, 7:16 AM

#

Oh, well that certainly helps

native flame May 23, 2021, 7:16 AM

#

str.splitlines also exists

grave jolt May 23, 2021, 7:16 AM

#

we shall not speak of the ;

static bluff May 23, 2021, 7:17 AM

#

SPLIT-lines. I KNEW there was a method for that

#

I thought it was readlines

native flame May 23, 2021, 7:17 AM

#

readlines exists too

#

but its not a string method
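
Putting the suggestions together, a minimal sketch. `load_source` is a made-up helper and UTF-8 is an assumed default encoding.

```python
import io

def load_source(source, encoding="utf-8"):
    """Return (text, lines) from a str or a file-like object (text or binary)."""
    if isinstance(source, str):
        text = source
    else:
        data = source.read()
        # Binary streams return bytes; text streams already return str.
        text = data.decode(encoding) if isinstance(data, bytes) else data
    # splitlines() handles \n, \r\n and \r, and drops the line endings.
    return text, text.splitlines()

print(load_source("a = 1\r\nb = 2\n")[1])    # → ['a = 1', 'b = 2']
print(load_source(io.BytesIO(b"x\ny"))[1])   # → ['x', 'y']
```

One caveat: str.splitlines also breaks on a few less common separators (form feed, for example), so line numbers can differ from what a file object's readlines would give on unusual input.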

static bluff May 23, 2021, 7:17 AM

#

Now, should I actually care about decoding the input? For lexing purposes, does it matter?

grave jolt May 23, 2021, 7:18 AM

#

why are you even splitting the source into lines?

#

hm, I guess it can be useful to store the line number for each token

static bluff May 23, 2021, 7:19 AM

#

In case of a syntax error I need to supply the full line of text on which the error resides. Keeping an array of the lines and referencing them by the lexer's line index seems to me the most logical route

#

About the decoding?

pallid trout May 23, 2021, 7:22 AM

#

Uh who pinged?

static bluff May 23, 2021, 7:23 AM

#

I mean, yes, I'll want the string in unicode form so it can be matched against the regex, no?

unkempt rock May 23, 2021, 8:01 AM

#

who pinged me?

grave jolt May 23, 2021, 8:02 AM

#

unkempt rock who pinged me?

we had a raid.

true ridge May 23, 2021, 8:18 AM

#

static bluff Is python's parser process a single step? I've seen it separated in some context...

nope, the parser is pretty unaware of any semantics. For example, return 42 is a legal statement on its own, but when the semantics are applied in the later stages (compiler / symbol table) it becomes a SyntaxError.
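
A quick way to see the two stages separately: ast.parse stops after parsing, while compile runs the later stages too.

```python
import ast

source = "return 42"

tree = ast.parse(source)               # the parser accepts it
print(type(tree.body[0]).__name__)     # → Return

try:
    compile(source, "<demo>", "exec")  # a later stage rejects it
    rejected = False
except SyntaxError as exc:
    rejected = True
    print("SyntaxError:", exc.msg)
```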

static bluff May 23, 2021, 8:19 AM

#

Is this normally achieved through if statements, or is there some sort of grammar applied?