#internals-and-peps
Python/compile.c lines 5309 to 5310
if (c->u->u_ste->ste_type != FunctionBlock)
    return compiler_error(c, "'yield' outside function");
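For reference, the check quoted above can be observed from Python itself. This is a minimal sketch that only relies on the built-in `compile()`; the filename `<example>` is an arbitrary placeholder.

```python
# A module-level `yield` is rejected at compile time, before any code runs.
try:
    compile("yield 1", "<example>", "exec")
except SyntaxError as exc:
    message = exc.msg
else:
    message = None

print(message)
```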
python level tokenizer does tokenize them
I'm just commenting the 'roundtrip' flag on my lexer, which for me indicates to the lexer that it must create tokens for everything it might otherwise omit, because the tokens will be used to reconstruct the original document in as close to original condition as possible
I already have a whitespace flag for this, which has me asking what other than whitespace the standard lexer doesn't tokenize
the C level tokenizer also ignores individual new lines (NL token on the python level)
That is, newlines between continued physical lines as opposed to logical lines, such as newlines encapsulated within some sort of parenthetical
I'm already creating newline tokens for those with a subcategory of 'continued'- but that's just a minor detail
I guess it'd be easier on the parser if I gave them entirely different names, eh?
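To illustrate the NL versus NEWLINE distinction mentioned above, here is a small sketch using the standard `tokenize` module. The sample source string is made up for the example.

```python
import io
import tokenize

# One logical line continued inside parentheses, a blank line, then another statement.
source = "x = (1,\n     2)\n\ny = 3\n"

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# NL marks a non-logical line break (inside brackets or on a blank line);
# NEWLINE ends a logical line.
newline_kinds = [
    tokenize.tok_name[tok.type]
    for tok in tokens
    if tok.type in (tokenize.NL, tokenize.NEWLINE)
]
print(newline_kinds)
```

The Python-level tokenizer also emits COMMENT tokens, which is another category a round-tripping lexer needs to keep.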
You know, if I'm going to properly implement runtime type checking, I'm going to need to come up with a whole schema for parsing types. I'll have to refactor and extend the typing system that is already in place
Which will be fun, but Christ, it's gonna be intense
It's not just checking types. It's checking if something is an instance of type x, I'll need to be able to indicate within the hints whether something is hinted as an instance or must be of that exact type. Checking literals. Non-standard hints like "is this a positive integer" or 'is this an empty string'
The more I think about it, the more I think some sort of a grammar expression might be appropriate (and super fun to implement)
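A rough sketch of what such checkable hints could look like. The names `Instance`, `Exact`, `Where`, `PositiveInt` and `EmptyStr` are hypothetical and invented for this illustration; they are not part of `typing` or of any existing library.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Instance:
    """Accepts any instance of `kind`, subclasses included."""
    kind: type

    def check(self, value: Any) -> bool:
        return isinstance(value, self.kind)


@dataclass(frozen=True)
class Exact:
    """Accepts only values whose type is exactly `kind`."""
    kind: type

    def check(self, value: Any) -> bool:
        return type(value) is self.kind


@dataclass(frozen=True)
class Where:
    """Refines another hint with an arbitrary predicate."""
    base: Any
    predicate: Callable[[Any], bool]

    def check(self, value: Any) -> bool:
        return self.base.check(value) and self.predicate(value)


PositiveInt = Where(Exact(int), lambda n: n > 0)
EmptyStr = Where(Instance(str), lambda s: s == "")
```

With these, `Instance(int).check(True)` passes because `bool` subclasses `int`, while `Exact(int).check(True)` does not.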
hi everyone ... please offer a best book for learning python.thanks
how about, automate the boring stuff with python
Also this isn't really a good channel for that
about machine learning and python. if you know any channel please join me thx
Well
Thats another 12 hours I'm never getting back XD
A fantastic, fantastic night of coding
Haven't had a stretch like that in a while. Feels good :3
dependent typing?
welcome to the rabbit hole
It only gets deeper
Inner-Voice: Hopefully someone can answer this
So I wanna know how to sense if I've clicked somewhere here (Just not a specific spot because that would be too precise.) and then I'll print a message from a 'Bot'.
If anyone here how to do this, ping me inside any available help channel.
Does anyone know how to do this advanced mouse detection?
Hello?
hi
This is not the appropriate place to ask questions. #internals-and-peps is used for discussion on the use cases, implementation, and future of Python. Not questions. @mystic forum
If you want help, open your own help channel. #❓｜how-to-get-help
?
what you said
is basically
dependent typing
if I understood you correctly
i need hel-p
e.g. even integers
Question about left recursive vs recursive descent parsing
One is where you start with every possible avenue of advancement open and avenues close off as input is consumed- like the number of possible moves in a chess game getting smaller as moves are made
The other is where every avenue is attempted one by one until one succeeds (which sounds slower, but because any irrelevant avenues will fail immediately, probably isn't)
Correct?
You should ask this in a help channel, see #❓｜how-to-get-help
I need to access a list inside a nested dictionary. How do i do it?
Example:
{Abc:
{def:[1,2,3,4]},{gh:[4,6,7,8]}, ghi:{jkl:[1,9,7,5]}}
And i want to access the lists. Any ideas??
If you need help, you should claim your own help channel, see #❓｜how-to-get-help. This channel is used for discussion on the use cases, implementation, and future of Python. Not questions.
Dependent type systems are basically type systems that allow for more information at the type-level e.g. sized vectors or extensible records
They're convenient not only in terms of runtime/type safety but in terms for formal verification methods as well
That definition is a bit abstract but I think I get the gist
Basically a type system following some sort of a schema, such that types can be compared and contrasted?
Think value-level semantics at the type-level
I'm curious what an implementation through mypy extensions would look like but here's my take on it:
from __future__ import annotations  # keeps the illustrative annotations unevaluated

from typing import Generic, TypeVar

K = TypeVar("K")
L = TypeVar("L")
T = TypeVar("T")

class Vector(Generic[K, T]):
    """Provides a dependently-typed interface for Python lists."""

    def __new__(cls) -> Vector[0, T]:
        return super().__new__(cls)

    def __init__(self):
        self._vector: list[T] = []

    def append(self: Vector[K, T], item: T) -> Vector[K + 1, T]:
        v = Vector()
        v._vector = self._vector[:]
        v._vector.append(item)
        return v

    def extend(self: Vector[K, T], other: Vector[L, T]) -> Vector[K + L, T]:
        v = Vector()
        # the argument is named 'other' so the new vector does not shadow it
        v._vector = self._vector + other._vector
        return v

x: Vector[0, int] = Vector()
x_: Vector[1, int] = x.append(1)
x__: Vector[2, int] = x_.extend(x_)
Oh, the same idea popped up on the Literal type PEP
https://www.python.org/dev/peps/pep-0586/#true-dependent-types-integer-generics
Literal is actually a form of a dependent type, e.g. some type Literal[T] depends on the runtime value of T
I'm not sure if it really counts as dependent types since there's no real checking or verification for it, once it becomes a runtime value
at least, the code above looks comparable in power to non-type template parameters in C++, which I've been told are not really dependent types as well
but I'd be curious to hear the details
Dependent types aren't about runtime validation. They are about statically limited values that are however dynamic. For example append returns a vector of size one greater than the vector it was called on. Afaik C++ templates cannot do this, since they at some point must resolve to a value at compile time.
I didn't say runtime validation
it helps you prove things statically, afaik, even if it's a dynamic value. The basic idea is a value that gets lifted into a type, but then the idea is that the type is somehow helpful, i.e. it's helping prove things in the context of your type system (which is static)
And yes, in C++, the value must be resolved at compile time, as you say.
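For comparison, this is roughly what `Literal` from PEP 586 offers today. The alias `Mode` and the function `open_flags` are made up for this sketch; the point is that the value lives in the type for a static checker, while nothing enforces it at runtime.

```python
from typing import Literal, get_args

Mode = Literal["r", "w"]


def open_flags(mode: Mode) -> int:
    # A static checker such as mypy would reject open_flags("x");
    # at runtime the annotation is not enforced.
    return 0 if mode == "r" else 1


# The permitted values can still be recovered from the type at runtime.
allowed = get_args(Mode)
print(allowed)
```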
Given a toolkit of python, cython, numpy, and anything else you might be able to call pure-python or pure-python-adjacent, what is the absolutely most performant framework for a peg style parser?
Theres probably a c library for packrat parsing with pegs
A library that generates a parser from a grammar is often called a "parser generator"
Yes for certain
To the second part
I just don't think I have it in me to learn C this year- I'd rather stick with Python
And also finally learn whats what about performance anyway
Perhaps Rust 
I mean honestly your two goals are mutually contradictory
you can't learn "what's what" about performance while doing only python. Or at least, you can maybe learn about some very very specific notions of performance
"how to not block the CPU while doing I/O" - seems to be mostly what I hear about discussed in terms of perf in the pure python world
When I say "only python", I mean, just writing pure python, so not writing extensions in other languages, not writing or working on JITs, etc.
If you do those other things then of course you can learn a lot about performance
Hopefully this is the right place since its a slightly odd question
I'm trying to pass a method of an imported object **kwargs, like so:
def paginator(api_call=None, **kwargs):
    return api_call(**kwargs)
Problem is passing api_call = client.method doesn't work
It says api_call received multiple arguments
Anyone have any ideas
Here's how to format Python code on Discord:
```py
print('Hello world!')
```
These are backticks, not quotes. Check this out if you can't find the backtick key.
They're query filters that change over the course of the method
Sorry about the format, on mobile
Add a single set around **kwargs.
As in, in the code? Or a set of backticks around kwargs for formatting
def paginator(api_call=None, **kwargs):
    return api_call(**kwargs)
For formatting.
For your actual problem though, it'll depend on the parameter names set in your method, that error means you supplied one twice - perhaps both positionally or as the implicit self arg, and by keyword name.
It's something positional, it doesn't seem to recognise that client.request is a single parameter and should return client.request(**kwargs)
p sure it is
this is a channel for discussions of the language though, not a help channel. you can try #❓｜how-to-get-help
there's ways to pass member functions, in most languages I would definitely prefer to just create a lambda that wraps the member function, that makes it pretty simple to understand where self is coming from
Not sure if most people would prefer that in python or not
could define a local function
In Python that's what accessing a method does already, so it's redundant.
Aight sorry
what do you mean?
Well, I understand that python is as slow as it comes, I do, but i also know that there are some ways solving problems that are faster than others. Even if they aren't important on the day to day and even considered micro optimizations, I'd like to get a handle on 'what's what'
in many (most?) cases you're passing a member function, you want it closed over the object, i.e. self is coming from you, not the API, so you have to bind that argument
obj.method constructs a method object, that when called prepends self to the parameters.
So it's a closure effectively.
Right, but it's not redundant if your local function/lambda would capture whatever you plan to use as self
I might also be totally missing the mark on this, but I know that cython can in fact offer major speedups for some jobs when used right, and that numpy is built for fast arrays
basically, yes
lambda *args: some_obj.meth(*args) ≡ some_obj.meth
waiting patiently for someone to say function equality is undecidable
ah I see what you mean
TIL, coming from C++ I guess I never thought of trying some_obj.meth, just did SomeClass.meth
though I do still prefer the explicit-ness of a lambda (in languages where lambdas are nice)
It's the same feature property uses, __get__(). On classes the function is returned.
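A small sketch of the point about bound methods. `Client` and its `request` method are hypothetical stand-ins for the imported object in the question; only the `paginator` shape comes from the conversation.

```python
class Client:
    def __init__(self, name):
        self.name = name

    def request(self, **filters):
        return (self.name, filters)


def paginator(api_call, **kwargs):
    return api_call(**kwargs)


client = Client("demo")

# Attribute access on an instance goes through the function's __get__ and
# yields a bound method that already carries `self`.
bound = client.request
via_descriptor = Client.request.__get__(client, Client)

# So the bound method can be passed around like any other callable.
result = paginator(client.request, page=1)
print(result)
```

Passing `Client.request` (looked up on the class) instead would hand over the plain function, and `self` would then have to be supplied by the caller.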
@static bluff I guess this is obviously subjective, but IMHO, when you learn to write faster code for Java, python, etc, aside from the portion where you learn about performance for the domain, or algorithms
you are not learning that much about performance "in general". You're learning what that platform/VM is good at optimizing, how to play nicely with it.
When you learn about high performance code in a language that compiles to native, gives you control over memory layout, well, that's just how machines work, and that's just what kinds of things make programs fast (e.g. cache friendly data structures)
learning that say a list comprehension is slower/faster in python than a for loop, or whether to use in to check membership in a dict vs allowing an exception to pop up, etc
YOU'RE RIGHT I'm missing self and am dumb and need more coffee thanks

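def functions_equal(f1, f2, input_space):
    # Only terminates (and only proves anything) for a finite input space.
    for element in input_space:
        if f1(element) != f2(element):
            return False
    return True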
!rule 9
I think it's interesting how all three of these solutions don't sit right, especially with newcomers
from functools import reduce
xs = range(100)
reduce(lambda x, y: x + y, xs)
# or
from operator import add
reduce(add, xs)
# or
reduce(int.__add__, xs)
iirc Raymond Hettinger tweeted about how most folks are reluctant to access dunder names because of how they have that leading double underscore, which typically signals DO NOT TOUCH
Well, for one thing (99*100)/2 is a lot better, or you could do sum(range(100)). It's probably more the unfamiliarity with reduce(), since it's not as easy to think about in your head.
#Python psychology quirk: The dunder naming convention makes people feel icky when calling them directly. This natural aversion steers people away from using them as bound methods:
map(parent.__getitem__, children)
fetch = data_stream.__next__
And it is somewhat true that you want to avoid using them if there's a function alternative - int.__add__ won't work if it's other.__radd__ that has the implementation for instance.
fetch = data_stream.__next__
You could replace this by a partial(next, data_stream)
This also raises the question of whether using function aliases like that is a good idea
I am fine with using dunders, but it does feel a bit hacky
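The two points above, sketched with standard-library calls only: calling a dunder directly skips the reflected fallback that the operator performs, and `partial(next, ...)` is a dunder-free alias.

```python
import operator
from functools import partial

# int.__add__ does not know about floats, so it signals "not my job" ...
direct = int.__add__(1, 2.0)
# ... while the operator goes on to try float.__radd__.
through_operator = operator.add(1, 2.0)

data_stream = iter([10, 20, 30])
fetch_dunder = data_stream.__next__         # bound dunder as a callable
fetch_partial = partial(next, data_stream)  # same effect without touching the dunder

first = fetch_dunder()
second = fetch_partial()
print(direct, through_operator, first, second)
```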
ngl ```hs
foldr (+)
```
Accessing dunders directly feels weird. But then question, is that an issue, or is that how it should feel?
tt
It shouldn't feel icky in "normal" contexts especially if you have stuff like __annotations__ or __qualname__, I'd say
For some reason I find attributes more okay than functions
I guess it is because there is not other way to access them
I think the mental block with accessing methods directly also comes from learning how self is literal sorcery
sup]
!e
print('int:',int.__eq__(5, 5.0))
print('float:', float.__eq__(5.0, 5))
@prime estuary :white_check_mark: Your eval job has completed with return code 0.
001 | int: NotImplemented
002 | float: True
I don't think the language spec is going to say which of these needs to return True, just at least one does.
Interesting. But you can write the equality check either way right
5 == 5.0 and 5.0 == 5
There is no __req__ so how does that work
Does it just always check for an equality member function on both
yes, if a dunder returns NotImplemented, it tries it the other way around
well it tries left.__eq__(right) and then (if NotImplemented) right.__eq__(left)
I'd need a second to think about why
Actually sorry no
I misread
You're right
Eq and ne are their own reflection
There are no swapped-argument versions of these methods (to be used when the left argument does not support the operation but the right argument does); rather, __lt__() and __gt__() are each other's reflection, __le__() and __ge__() are each other's reflection, and __eq__() and __ne__() are their own reflection.
That makes sense
!e
class A:
    def __init__(s, v):
        s.v = v
    def __repr__(s):
        return repr(s.v)
    def __eq__(s, o):
        print('eq', s, o)
        return NotImplemented
    def __ne__(s, o):
        print('ne', s, o)
        return NotImplemented
print(A(1) == A(2), A(1) != A(2))
@flat gazelle :white_check_mark: Your eval job has completed with return code 0.
001 | eq 1 2
002 | eq 2 1
003 | ne 1 2
004 | ne 2 1
005 | False True
it doesn't seem to try the other one if the dunder is present
but if the dunder is not present, it will call the other one
!e
class A:
    def __init__(s, v):
        s.v = v
    def __repr__(s):
        return repr(s.v)
    def __eq__(s, o):
        print('eq', s, o)
        return NotImplemented
print(A(1) == A(2), A(1) != A(2))
@flat gazelle :white_check_mark: Your eval job has completed with return code 0.
001 | eq 1 2
002 | eq 2 1
003 | eq 1 2
004 | eq 2 1
005 | False True
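The fallback order can also be shown with two unrelated classes. `Left` and `Right` are throwaway names for this sketch: the left operand's `__eq__` is tried first, then the right operand's, and only if both return NotImplemented does `==` fall back to identity.

```python
calls = []


class Left:
    def __eq__(self, other):
        calls.append("Left.__eq__")
        return NotImplemented


class Right:
    def __eq__(self, other):
        calls.append("Right.__eq__")
        return True


# Left declines, so Python asks Right (eq is its own reflection).
result = Left() == Right()

# When every __eq__ declines, the comparison falls back to identity.
same = Left()
identity_fallback = same == same
distinct_objects = Left() == Left()
print(calls[:2], result, identity_fallback, distinct_objects)
```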
hey yow
i just stumbled over https://en.wikipedia.org/wiki/Literate_programming again and wondered whether maybe doctest could be used to realize that in python?
Literate programming is a programming paradigm introduced by Donald Knuth in which a computer program is given an explanation of its logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which compilable source code can be generated. The approach is used in scientific computing and i...
maybe useful in teaching contexts
Heyyy is there a software engineer in here?? I would like to ask some questions based on that career
#career-advice would be a better place.
doctests are a step in that direction; notebooks like Jupyter are even moreso (though the typical application is not exactly Knuth's vision)
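As a small illustration of the doctest idea, here is a docstring whose prose and example double as a test. The function `mean` is invented for the sketch, and the runner is driven by hand so the example does not depend on which module it is pasted into.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence.

    The explanation and the executable example live side by side:

    >>> mean([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)


# Parse the docstring into a DocTest and run it.
parser = doctest.DocTestParser()
test = parser.get_doctest(mean.__doc__, {"mean": mean}, "mean", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

In a normal module, `python -m doctest file.py` or `doctest.testmod()` does the same thing with less ceremony.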
https://www.geeks3d.com/20181220/how-to-enable-intel-opencl-support-on-windows-when-amd-radeon-graphics-driver-is-installed/ would this make windows work with opencl and amd gpu?
How to Enable Intel OpenCL Support on Windows when AMD Radeon Graphics Driver is Installed
class Main:
    def __init__(self, file_name, book_name=None, author_name=None, read_status=None):
        self.file_name = file_name
        self.book_name = book_name
        self.author_name = author_name
        self.read_status = read_status
    def load_books(self):
        try:
            with open(self.file_name, 'r+') as file:
                data = json.load(file)
                return data
        except Exception as e:
            print(e)
    def search_book(self, book_name):
        data = Main.load_books(self.file_name)
        for book in data:
            if book["Book Name"] == book_name:
                return True
            else:
                return False
Is this the right way to do it? I defined the `load_books` method in the Main class and then call it inside the same class, from a different method called `search_book`
You might be looking for self.load_books, or to make it a classmethod
It also doesnt take any arguments, but you give it one
ohh thanks
def search_book(self, book_name):
    if not getattr(self, "data", None):
        # avoid loading data again and again.
        # do it only if self.data is missing or empty
        self.data = self.load_books() or []
    found_book = False
    for book in self.data:
        if book["Book Name"] == book_name:
            found_book = True
            break
    return found_book
here, fixed couple of things for you. added some caching so that file is not to be read for every search call, it will be read only for the first call on that instance. and fixed search logic too
Thank you so much. This is my first project. I have already written all the code and it's all working as expected, but now I want to simplify things and organize it.. this will help a lot.. I am still unsure when to use decorators
it's okay to not use all features of the language in the first project! I'm sure you will get there soon
I think it was a long time before I used decorators beyond @classmethod and @staticmethod... those are good to dip your toes in as you build out a class.
you will also get introduced to those when you use some frameworks like flask or django. One can continue using them without knowing them what they are and after a week or so one would just "get" it. It's not that hard
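For anyone wanting to dip their toes in, a sketch of the two built-in decorators mentioned above. `Library`, `from_json` and `normalize` are hypothetical names; only the "Book Name" key comes from the code earlier in the conversation.

```python
import json


class Library:
    def __init__(self, books):
        self.books = books

    @classmethod
    def from_json(cls, text):
        # Alternative constructor: receives the class itself, not an instance.
        return cls(json.loads(text))

    @staticmethod
    def normalize(title):
        # Needs neither the instance nor the class; it just lives in the namespace.
        return title.strip().lower()

    def has_book(self, title):
        wanted = self.normalize(title)
        return any(self.normalize(b["Book Name"]) == wanted for b in self.books)


library = Library.from_json('[{"Book Name": "Dune"}]')
print(library.has_book(" dune "))
```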
Hey
I am trying to create a discord bot for a project
Is there a way to receive an image via discord and save it in mongodb atlas as chunks using gridfs?
hey
i want help in making a tic-tac-toe game with a single player, in which the computer always wins or the game is drawn
I think I'm starting to get this whole grammar-generator-parser thing
Hello @iron solar and @frozen flame, can you please see #❓｜how-to-get-help for Python help?
SourceExpression = EXPRESSION(
    TOKEN(
        CATEGORY('WHITESPACE',
            CONTEXT('', 'whitespace', success=['makeindent']),
        ),
        CATEGORY('TERMINATOR',
            CONTEXT('CONTINUED', 'escape newline'),
            CONTEXT('UNCONTINUED', 'newline', success=['flagindent']),
            transit=['descend'],
        ),
        CATEGORY('NUMBER',
            CONTEXT('BINARY', 'binary'),
            CONTEXT('OCTAL', 'octal'),
            CONTEXT('HEXA', 'hexa'),
            CONTEXT('COMPLEX', 'real exponent? complex'),
            CONTEXT('EXPONENT', 'real exponent'),
            CONTEXT('FLOATPOINT', 'floatpoint'),
            CONTEXT('INTEGER', 'integer'),
        ),
        CATEGORY('STRING',
            CONTEXT('TRIPLEQUOTED', 'prefix? triple tripledata (<<2)'),
            CONTEXT('SINGLEQUOTED', 'prefix? single tripledata (<<2)'),
            CONTEXT('UNTERMINATED', 'prefix? ( triple | single )'),
            success=['unescape', 'lexfstring'],
            failure=['unterminated'],
        ),
        CATEGORY('OPERATOR', CONTEXT('', 'operator')),
        CATEGORY('KEYWORD', CONTEXT('', 'keyword')),
        CATEGORY('IDENTIFIER', CONTEXT('', 'identifier')),
        CATEGORY('EOF', CONTEXT('', 'eof'), success=['terminate']),
        success=['create'],
        failure=['syntaxerror'],
        transit=['advance'],
    ),
)
I was toying around with the idea of building a grammar out of actions instead of the other way around, and it occurred to me that this is probably not unlike what gets spit out by the parser generator???
And as I was reading Guido's original outline of his PEG machine I finally got to see some of the functionality hand written- in the code if a > b: print(a+b), the parser will check all its rules, eventually coming to the if-statement rule (something like) 'if_statement: KEYWORD['if'] NAME OPERATOR['>'] NAME OPERATOR[':']' (I'm assuming that because a parser is looking at tokens instead of source, when searching for a literal it will check the actual text value of the tokens it has to work with, hence the bracket notation)
When it encounters something like 'NAME', it calls a method attached to itself called 'NAME' which itself may contain two or three alternatives each with their own methods and possibly even references to itself
The issue, therefore, is programmatically building an object which chains the right 'callforwards' in the right order for every non-terminal, each node consuming input (or not) as needed and ultimately failing or succeeding, resulting in either input being consumed or no input being consumed and the object moving forward
And so, you need a hand built lexer and parser to take in your language's specification to create something not unlike an AST, which the generator then uses to spit out a parser, which itself produces an ast from the source code it was designed to consume
That's... not completely wrong, right?
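The chaining described above can be sketched with a few hand-written combinators. This is a toy under stated assumptions, not the generated CPython parser and not Guido's pegen: tokens are plain strings, every parser takes `(tokens, pos)` and returns `(new_pos, value)` on success or `None` on failure without consuming anything, and all names (`literal`, `sequence`, `choice`, `name`, `if_header`) are invented for the illustration.

```python
KEYWORDS = {"if"}


def literal(text):
    """Match one token whose text is exactly `text`."""
    def parse(tokens, pos):
        if pos < len(tokens) and tokens[pos] == text:
            return pos + 1, text
        return None
    return parse


def name(tokens, pos):
    """Match an identifier token that is not a keyword."""
    if pos < len(tokens) and tokens[pos].isidentifier() and tokens[pos] not in KEYWORDS:
        return pos + 1, tokens[pos]
    return None


def sequence(*parsers):
    """All parsers must succeed in order; otherwise nothing is consumed."""
    def parse(tokens, pos):
        values = []
        for parser in parsers:
            result = parser(tokens, pos)
            if result is None:
                return None
            pos, value = result
            values.append(value)
        return pos, values
    return parse


def choice(*parsers):
    """PEG ordered choice: the first alternative that succeeds wins."""
    def parse(tokens, pos):
        for parser in parsers:
            result = parser(tokens, pos)
            if result is not None:
                return result
        return None
    return parse


comparison = sequence(name, choice(literal(">"), literal("<")), name)
if_header = sequence(literal("if"), comparison, literal(":"))

parsed = if_header(["if", "a", ">", "b", ":"], 0)
rejected = if_header(["if", "a", "b", ":"], 0)
print(parsed, rejected)
```

A parser generator would emit functions shaped like `comparison` and `if_header` from the grammar text instead of having them written by hand, and a packrat parser would additionally memoize results per `(rule, pos)`.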
Does anyone know if Python's PEG Parser tokenizes and parses in a single step?
Ahh, you know what? I'm going to need separate machines to do both anyway
cpython is to python like _ is to _
what's a good analogy to fill in the blanks
i thought photoshop to image editors but the concept of image editor is way looser than the concept python
also an engine to a red beetle. The concept of red beetle is very strict, but you can have two otherwise identical beetles with different engines.
The best I can think of is that is like cheese is to mac and cheese
You can have mac and cheese without cheese. Soy cheese, rice cheese, even tofu
But the original and defining mac and cheese is made from cheese
I think its better to say though that cpython is the only python, and that other implementations are extremely similar but separate languages. Even Pypy isn't technically python
It's close- very very close, but when you replicate cpython in another language, even in ones that share cpython's nature like c (for cython) or python itself, you have to make judgement calls in terms of bending the language you're using and the data you have access to into a shape as close to its natural equivalent as possible.
@lament sinewlike Jabra Talk 25 to bluetooth handsfree headsets. There is a pretty precise way in which a handsfree bt headset should work, and Jabra Talk 25 is one such device=implementation.
I was thinking V8 & JavaScript, but maybe it's not supposed to be such a direct analogy?
Node.js may be a more correct comparison, not sure
python is not defined by how cpython behaves, but by the docs. It isn't as precise as e.g. the C standard. A language without a spec (correct me if this changed) is rust, rustc is how rust should behave.
I doubt very much python has more undefined behavior than C.
i don't like this because different headsets can look different, while python code should look the same across implementations
I mean, there is specific code for each implementation that will only work there, but yeah, I can see where you are coming from
no, the tokenizer is completely isolated from the parser itself.
Rockin, thanks Ident
Is there a way I can prevent a class from being subclassed?
__init_subclass__ perhaps?
raising an exception there is probably the best option, yeah
there is a final decorator in typing that can be used for this
if you are using mypy as part of your CI, then it will flag an error statically if it's subclassed
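Both suggestions can be combined. In this sketch `Sealed` and `Child` are placeholder names: `typing.final` is what a static checker such as mypy looks at, and raising in `__init_subclass__` is the runtime guard.

```python
from typing import final


@final  # flagged statically by type checkers if someone subclasses it
class Sealed:
    def __init_subclass__(cls, **kwargs):
        # Runs when a subclass is being created, so raising here blocks it.
        raise TypeError(f"{cls.__name__} may not subclass Sealed")


try:
    class Child(Sealed):
        pass
except TypeError as exc:
    error = str(exc)
else:
    error = None

print(error)
```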
Whoa, what's the use case for that?
it's a pretty common thing to want
to the point that most statically typed OO languages today support a way of preventing someone from inheriting from your class
This arguably goes against Python's philosophy. If you think subclassing your class is a bad idea, you say that, and another developer does it anyway, that's on them.
Hi, how do I fix this error message:
nbconvert failed: Inkscape executable not found
This is a discussion channel. See #❓｜how-to-get-help
Is there a more pythonic way to get the original function out of a coroutine object?
def get_func_from_coro(coro):
return next(i for i in coro.cr_frame.f_globals.values() if getattr(i, "__code__", None) is coro.cr_code)
This in particular is lacking in cases where the original function isn't available at the global level
you probably know more about this than I do, but is there even an "original function" to obtain from a coroutine?
In almost all cases
async def foo(): pass
foo is a function
foo() is a coroutine
I want to get from foo() to foo^^
on the other hand, is it possible to create a coroutine object without calling an async function?
It looks like it's possible but you'd not want to
!e ```python
import asyncio
coro = asyncio.sleep(3.0)
print(coro.__name__)
```
@paper echo :white_check_mark: Your eval job has completed with return code 0.
001 | sleep
002 | sys:1: RuntimeWarning: coroutine 'sleep' was never awaited
my knowledge of async is pretty limited. I've only used it in connection with the Python bot.
glimmer of hope?
maybe there's something in c.cr_code
!e ```python
import asyncio
coro = asyncio.sleep(3.0)
print(coro.cr_code.co_name)
```
@paper echo :white_check_mark: Your eval job has completed with return code 0.
001 | sleep
002 | sys:1: RuntimeWarning: coroutine 'sleep' was never awaited
I had a look in the inspect docs and there didn't seem to be anything; cr_code can be created from way more than just functions
Lambdas and modules for example
ah, interesting
then it seems like there isn't any way to do it, unless the originating function adds extra info
You can't add attributes to coroutines as they're c objects
right
I think looking for it in the globals through the name would be a bit cleaner and about as reliable as your original approach (can go through qualname if it's under a class); it doesn't look like the coro holds anything to directly reference the function that created it, as technically there could be none
What about c.cr_frame.f_code
Or in general, looking at the frame to see where it was defined.
for some reason __qualname__ isn't actually qualified
Maybe it only qualifies it if its within a class, but not for modules
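The qualname-based lookup suggested above might look like the following. It is a best-effort sketch: it gives up on functions defined inside other functions (their qualname contains `<locals>`) and on finished coroutines, and `foo` and `Service` are just sample definitions.

```python
def get_func_from_coro(coro):
    """Best-effort lookup of the async function that produced `coro`."""
    frame = coro.cr_frame
    if frame is None:  # the coroutine has already finished
        return None
    parts = coro.__qualname__.split(".")
    if "<locals>" in parts:  # nested definition: not reachable by name
        return None
    obj = frame.f_globals.get(parts[0])
    for attr in parts[1:]:
        obj = getattr(obj, attr, None)
    # Guard against a rebound name by comparing code objects.
    return obj if getattr(obj, "__code__", None) is coro.cr_code else None


async def foo():
    pass


class Service:
    async def fetch(self):
        pass


coro = foo()
method_coro = Service().fetch()
found = get_func_from_coro(coro)
found_method = get_func_from_coro(method_coro)

# Close the unstarted coroutines so no "never awaited" warning is emitted.
coro.close()
method_coro.close()
```

Note that `__qualname__` is qualified relative to the module (class and enclosing function names), which is why the module name itself never appears in it.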
a minor quibble but still kinda annoying
subprocess.SubprocessError doesn't at the base class level claim to have cmd, stdout, stderr attributes
even though it only has two children, and they both have those attributes
and it seems like given the nature of the exception, you should have those things
And indeed, mypy complains if I try to write the code that way
it took
me so long to make a new folder then put the file in it, then let the user put another file in it, after they can delete certain files from the folder
and let them
create more folders with files
with
GUI
oh huh that is annoying lol
was gonna say it's fine because it's a base class no one uses it, but then remembered how you catch exceptions :/
Well, it's correct. The library does raise some SubprocessError instances that don't have those attributes
Sounds to me like mypy caught a real bug.
Sometimes the library raises the base class type instead of one of the two derived classes, that is.
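A sketch of handling this defensively. The helper `describe_failure` is made up for the example; the exceptions are constructed by hand so nothing is actually spawned.

```python
import subprocess


def describe_failure(exc):
    """Summarize a SubprocessError without assuming subclass-only attributes."""
    if isinstance(exc, (subprocess.CalledProcessError, subprocess.TimeoutExpired)):
        # Only the two subclasses are documented to carry cmd/stdout/stderr.
        return f"command {exc.cmd!r} failed"
    return f"subprocess error: {exc}"


called = subprocess.CalledProcessError(3, ["tool", "--flag"])
timed_out = subprocess.TimeoutExpired(["tool"], timeout=1.5)
bare = subprocess.SubprocessError("something else went wrong")

summaries = [describe_failure(e) for e in (called, timed_out, bare)]
print(summaries)
```

Narrowing with `isinstance` is also what lets mypy accept the attribute access.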
when you terminate a program with ctrl+c (or really through failing to handle an exception), does the runtime cleanup all live objects and call their "destructors"?
or does it just assume the OS will free any resources acquired by the objects?
Functions thus registered are automatically executed upon normal interpreter termination.
like does not handling an exception count as normal termination?
Note: The functions registered via this module are not called when the program is killed by a signal not handled by Python, when a Python fatal internal error is detected, or when os._exit() is called.
I wanna say yes...
but that's just referencing atexit module, it doesn't talk about the runtime itself
oh wait ig i can test this by using __del__
@red solar you can't count on __del__ being called. what clean-up do you need to happen?
shared memory
doodspav@doodspav-desktop:~$ python3 t.py
Traceback (most recent call last):
File "t.py", line 7, in <module>
raise Exception("aaaaaa")
Exception: aaaaaa
del
doodspav@doodspav-desktop:~$ cat t.py
class A:
    def __del__(self):
        print("del")

a = A()

raise Exception("aaaaaa")
this worked
afaik cpython tries to clean up everything but makes no guarantees
ig that's the best i can hope for
@raven ridge ah, ok, I missed that
It would be nice I suppose if it provided a subsequence base class/interface that collects the common attributes
@red solar if you have a resource you want free, use a context manager
maybe they wanted to but couldn't come up with a name for it
it should work even via C-c
i can't
why not?
the shared memory is part of an object and they share a lifetime, it's not part of a context
like the object owns the shared memory
but i want the object to be passed around like an int would be
You can pass the object around, but to actually use it, you'll need to pop it open like a context manager
oh, hmm
I realize you're coming from C++, but C++ has its own idioms (like RAII) and python has its way
i was on Python before C++ lol
hah
this was my first discord server
it's just what you said was the most C++ thing that a person can say
"I want to pass it around like an int"
Thank you Stepanov
C++ may have changed me
Why not atexit?
next you'll be telling people in the python server they should make their types regular
i could do atexit as well, but is it worth it? if it provides nothing additional
ultimately you can use various hacks like del, atexit, etc, and see if it works "well enough"
lol i'm not that deep
but the idiomatic way in python to make sure resources are freed are definitely context managers
@red solar can you use with-statements to manage those resources?
and if you have an object that owns another object that's a context manager, and it's not an implementation detail, then "infecting" the context manager upward is the way to do it
if you need it to occur then the difference would be that you can be sure that atexit will do the clean up as long as python is capable of doing it, unlike the destructor which may not be called at all (and other implementations may just ignore the destructor at shutdown)
the same shared memory buffer is used for the lifetime of the object (until __del__ is called, or __init__ is called again) - so all methods use the same region, meaning multiple with-statements wouldn't work :/
ohhh - ok i'll do atexit
guess i'll have to register for every object separately tho since it can't take parameters
@red solar what is using the object? The object itself could be a context manager
it's AtomicInt (i decided i wanted atomics in python)
with my_object_using_shared_memory:
    blah blah
would feel weird having a context manager with int
with AtomicInt(width=4) as a:
    a.add(10)

I think the python interpreter already guarantees many things are atomic
not between processes
But at any rate it's a very weirdly low level approach for python
i figured now that we have a shared_memory module i could try it
@red solar then I would consider having a SharedMemoryArena or something in which AtomicInt works.
Yeah, for an interprocess that's what you want
Look at boost interprocess for example
ok but i'd still need some way of freeing it
and that can only work if destructors on all objects are called
otherwise there'll be a memory leak
@red solar
with myArena:
    ...
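A sketch of the context-manager shape being suggested, assuming Python 3.8+ for `multiprocessing.shared_memory`. `SharedCounter` is a hypothetical class, and its `add` is deliberately NOT atomic; the example only shows who owns the segment and when it is released.

```python
import struct
from multiprocessing import shared_memory


class SharedCounter:
    """Hypothetical owner of one 8-byte shared memory segment."""

    def __init__(self):
        self._shm = None

    def __enter__(self):
        self._shm = shared_memory.SharedMemory(create=True, size=8)
        struct.pack_into("q", self._shm.buf, 0, 0)  # start at zero
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit and on exceptions, including KeyboardInterrupt.
        self._shm.close()
        self._shm.unlink()  # the OS would not reclaim the segment on its own
        self._shm = None
        return False

    def add(self, amount):
        # Plain read-modify-write, not an atomic operation.
        (value,) = struct.unpack_from("q", self._shm.buf, 0)
        struct.pack_into("q", self._shm.buf, 0, value + amount)
        return value + amount


with SharedCounter() as counter:
    total = counter.add(10)

print(total)
```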
oh
@red solar forget about destructors. Don't write __del__ methods.
shm = SharedMemory(name="...")
a = AtomicInt(shm.buf[:8])
i mean i can already construct them like this
why not with SharedMemory(..) as shm:
because i was too lazy to change what i was writing halfway through
but i wanted the shared memory to be internal so that the object can be in charge of allocating it (and naming the shared memory object), so that i can make it possible to pass it as an argument to another process
also having its lifetime be constrained to a context is too restricted for my liking
at any rate i don't see context managers providing any guarantees that atexit doesn't, so atexit seems fine?
I mean context managers aren't globals?
@red solar atexit should work, it just means you have to register the callbacks, and be careful if they are called after your object is already cleaned up
yay, atexit it is then
By using atexit you're saying that the only way this is going to be used is at global scope
what? how?
@red solar we don't know the full structure of your program
atexit registers functions with no arguments I assume, similar to C
Your object needs to be available for the function to see and still be alive, it can't be cleaned up before program end, etc
It doesn't exactly force your atomic int to be a global but it forces some of the same characteristics
atexit takes func, *args, **kwargs: https://docs.python.org/3/library/atexit.html#atexit.register
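A small sketch of what that looks like, assuming a made-up `release` hook and block name (neither is from the AtomicInt code). It only shows that `atexit.register` forwards arguments and that `atexit.unregister` can drop the hook again once the object has been cleaned up some other way:

```python
import atexit

released = []

def release(name, *, verbose=False):
    """Hypothetical cleanup hook; a real one would unlink the named block."""
    if verbose:
        print(f"releasing {name}")
    released.append(name)

# register() stores the extra positional and keyword arguments and passes
# them to the callback at interpreter exit. It returns the function itself.
registered = atexit.register(release, "block-a", verbose=False)

# unregister() removes every registration of that function, which is one way
# to avoid running the hook for something that was already cleaned up.
atexit.unregister(release)
```

Because the hook receives its arguments at registration time, each object can register its own call without needing a global to look itself up.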
I don't think I've ever had this much fun
Atexit is where destructors of globals in C++ are run
i can maintain a container of them, and have atexit del all the objects in the container
so only a single global
lol you been watching this whole time?
@red solar but you will also have objects with shorter lifetimes, right?
Yeah and if the function that creates the container gets called repeatedly as part of a larger program
You'll have memory leaks
yes, in which case their destructor will remove them from the container
No no, my own thing. Why, whats going on?
Effectively
ah ok - object lifetime stuff
There's no destructor in this language
you really shouldn't write __del__ methods. seriously.
Ohhhh fun
technically it's __dealloc__ since i'm writing it in Cython ( @raven ridge didn't approve of me doing it in C)
hmm ig i should look at Cython's guarantees then
If you create a local variable and register its cleanup in atexit, like I said, it's basically a memory leak
why is it basically a memoryleak?
Because it doesn't get freed until program exit
Which is when the OS will free it anyway
the OS won't free shared memory
If you call the function repeatedly in a loop you'll keep allocating memory which will never be reclaimed
ah ok i see tho
Really, even if every reference to the shared memory is dead?
That seems surprising
Anyhow though you understand my meaning
lol if the OS does reference counting for shared memory then my issues are solved
but yeah
Err, well the opposite means that every time someone writes a faulty program with shared memory, memory just leaks till reboot
Hard to believe
that's honestly what i was expecting to happen (and what i think happens)
As a resource for sharing data across processes, shared memory blocks may outlive the original process that created them. When one process no longer needs access to a shared memory block that might still be needed by other processes, the close() method should be called. When a shared memory block is no longer needed by any process, the unlink() method should be called to ensure proper cleanup.
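A minimal sketch of that close()/unlink() split using `multiprocessing.shared_memory`. Both handles live in one process here only to keep the example short; in practice the second handle would be opened in another process that was given the name:

```python
from multiprocessing import shared_memory

# The creating side allocates the block; the OS keeps it alive by name.
owner = shared_memory.SharedMemory(create=True, size=64)
try:
    owner.buf[:4] = (1234).to_bytes(4, "little")

    # Anything that knows the name can attach to the same block.
    peer = shared_memory.SharedMemory(name=owner.name)
    try:
        value = int.from_bytes(bytes(peer.buf[:4]), "little")
    finally:
        peer.close()      # this handle is done with the block
finally:
    owner.close()         # the creator's handle is done too
    owner.unlink()        # nobody needs the block any more, so destroy it
```

The try/finally pairs are doing the job a context manager would do: close() runs once per handle, unlink() runs once overall.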
that's what python says it needs 🤷
Maybe you're right
But anyhow, context managers are definitely the way to go, and so are arenas
Having every int create its own shared memory isn't great
oh no i'm not saying every int creates its own shared memory
they can share it - but i still need every int to cleanup itself
i think i'm gonna see if i can get the atexit list to work tho
@red solar what is the cleanup that the int needs to do, if it doesn't create the shared memory?
well... it might create shared memory - or it might use an existing block that other ints are using
shared memory on linux uses mmap, which does a minimum of 4096 byte allocations (i think), so seeing as every object only needs up to 64 aligned bytes, giving every object its own region would be wasteful
but that's separate issue
@red solar i'm just following your lead: you said the int didn't create the shared memory, so I'm trying to understand what it needs to do.
oh - yeah i'm sorry - ok for this __del__ and atexit stuff you can just assume it destroys it
but lemme try atexit, see if it can be done correctly, and then i'll come back
I would recommend not making the int too complicated: have it always use shared memory it is given.
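A rough sketch of "always use the memory it is given", with a hypothetical `SharedInt` class. It is deliberately NOT atomic (a real AtomicInt would need CPU atomics from C or Cython); the point is only that the int neither allocates nor frees anything, so whoever owns the buffer owns the cleanup:

```python
import struct

class SharedInt:
    """A fixed-width integer stored in a caller-supplied writable buffer.

    Hypothetical sketch: reads and writes here are not atomic.
    """
    _FORMATS = {1: "<b", 2: "<h", 4: "<i", 8: "<q"}

    def __init__(self, buf, offset=0, width=4):
        self._buf = buf
        self._offset = offset
        self._fmt = self._FORMATS[width]

    def load(self):
        return struct.unpack_from(self._fmt, self._buf, self._offset)[0]

    def store(self, value):
        struct.pack_into(self._fmt, self._buf, self._offset, value)

    def add(self, delta):
        # Plain read-modify-write; this is where an atomic op would go.
        self.store(self.load() + delta)
        return self.load()

# Any writable buffer works: a bytearray here, SharedMemory.buf in practice.
arena = bytearray(16)
a = SharedInt(arena, offset=0)
b = SharedInt(arena, offset=8, width=8)
a.add(10)
b.store(-3)
```

Several ints can share one region this way, which also sidesteps the page-sized minimum allocation per object.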
i'm not sure what that reaction means
it's a zoomer version of 😦
it means that what you said makes me sad but you're probably right
a = AtomicInt(size=4)
...
p = Process(target=f, args=(a,))
means that i won't be able to do this and have it work correctly
@red solar is it important for that to work? What are these atomicints for anyway? Some background could help.
i mean it's not that important, but it would be nice 😦
there's not much context, just a while back i wrote some shmem stuff in C++, and then decided i wanted to interact with it in Python, and thought it would be useful to have an atomics library in python for that
what did you use the sharedmemory for? Why atomic ints?
in C++? i think i was trying to do a shared memory cache for faster access than my database (and for some reason didn't want to use redis, tho can't remember why now)
but this was a while back, i'm not doing this for that anymore
now i'm just doing it cuz i saw it as a project and wanted to finish it
so it's not important, but if multiprocessing.Lock can do it, i wanna do it too
you could take a look at how multiprocessing does it
The source is on your computer, you can read it.
hmm ok gimme a minute
it's not a quick read... 😦
static PyObject *
_multiprocessing_SemLock__rebuild_impl(PyTypeObject *type, SEM_HANDLE handle,
                                       int kind, int maxvalue,
                                       const char *name)
/*[clinic end generated code: output=2aaee14f063f3bd9 input=f7040492ac6d9962]*/
{
    char *name_copy = NULL;
    if (name != NULL) {
        name_copy = PyMem_Malloc(strlen(name) + 1);
        if (name_copy == NULL)
            return PyErr_NoMemory();
        strcpy(name_copy, name);
    }
#ifndef MS_WINDOWS
    if (name != NULL) {
        handle = sem_open(name, 0);
        if (handle == SEM_FAILED) {
            PyMem_Free(name_copy);
            return PyErr_SetFromErrno(PyExc_OSError);
        }
    }
#endif
    return newsemlockobject(type, handle, kind, maxvalue, name_copy);
}
i wanna say that this is what gets used in the new process?
ok, that isn't shared memory, it's a semaphore
yeah, i'm not that convinced it's what's actually happening... i'll keep looking
i can't find it :/ even debugging in pycharm told me nothing 😦
The OS itself holds a reference to the shared memory, and you can later look it up by name, even after the process that originally allocated it has exited. You need to tell the kernel when nothing else will ever try to open that memory.
If you press ctrl-c once, it probably will get cleaned up, though it's not guaranteed. If you press it twice in a row, it probably won't get cleaned up.
i think we redirected that to #career-advice a while ago
Even atexit can be defeated by two ctrl-c back to back.
kinda sad that idk how process gets its arguments (probably pickled from a file or shmem), but either way the fact is that if AtomicInt doesn't own the memory, it can't know how to reference it in the new process
can context managers be defeated by this?
Cython just makes it the object's tp_del - it doesn't give you any better guarantees that it will ever be called than Python does for __del__ methods.
yep, sure.
ok so i have no real guarantees, but context managers and atexit are the best ways available?
a ctrl-c in the context manager's __exit__ can cause a KeyboardInterrupt exception to be raised before the __exit__ has cleaned up whatever resource it's meant to manage.
Ctrl-c once should be guaranteed still
ok yeah, i meant that a determined user can always cause a leak (without just sigkill)
Unless you deliberately do something silly with the caught exception
Users can disable KeyboardInterrupt exceptions and just have the ctrl-c immediately terminate the program, instead
ah interesting
or, for another counter-example, if the object is being used in a daemon thread, then the KeyboardInterrupt raised in the main thread won't ever free it.
at any rate though, context managers are just "the way" these things are done in python.
Files, sockets, database connections, ssh connections, locks, threads/process pools...
all these things are handled by context managers in most (all?) good libraries
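For reference, the protocol behind all of those is just two methods. A minimal sketch, with a hypothetical `Arena` class standing in for whatever owns the resource:

```python
class Arena:
    """Hypothetical resource owner; acquire and release are placeholders."""

    def __init__(self):
        self.open = False

    def __enter__(self):
        self.open = True          # acquire the resource
        return self

    def __exit__(self, exc_type, exc, tb):
        self.open = False         # release it, whether or not the body raised
        return False              # do not swallow the exception

arena = Arena()
try:
    with arena:
        raise RuntimeError("body failed")
except RuntimeError:
    pass
```

After the `with` block `arena.open` is False again even though the body raised, which is the guarantee being discussed (barring the Ctrl-C and daemon-thread cases above).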
Deciding between atexit and context managers in python is pretty similar to deciding between atexit and RAII in C++. I.e. it should be the latter 99.9999% of the time
C++ makes it an easier choice
well, yes and no - many libraries still have __del__ on important objects to try to give better chances of it firing
Yes, they have del as a fallback
but they still want you to use the context manager, and everyone suggests that you should
RAII is more powerful overall, than context managers, I'd agree, but context managers still work fine overall, you have to give them a chance
I haven't really had any issues context-managing almost anything, tbh
I'm sure there are designs where it would be difficult, don't get me wrong
they usually come up in the context of writing libraries
because you don't want to "infect" the context manager the whole way up to __main__
Why's that?
well, backwards compatibility is one reason
That's fair I guess. In most cases though, bubbling up the context manager is fine. Usually when it's not fine it's because of some global state that the library needs... and that might be one of the few cases where atexit makes sense, I'm not sure
I haven't really encountered that case
I guess a good-ish example is when you do logging.basicConfig(filename=....)
imagine, there's a library that makes HTTP requests. It makes one connection per request, so there's no resources to manage that outlive the request. Later on, you've profiled and discovered that connection pooling would speed this thing up a lot. You want to add the pool, but now it's a connection that outlives the request, and now someone needs to own it. But you've already got a contract that says this thing can be called as just yourlib.send_request(...) ...
Right, but that goes back to my point about global state
But yeah, it is a fair example
yeah, it totally is global state.
And probably a rare case where atexit makes sense
in C++ you don't need atexit even then, because RAII with globals works fine (if you know what you're doing)
but the "infect upwards" option doesn't work well when you need to backwards compatibly add global state.
yeah, generally.
I mostly agree with you - context managers are almost always a better option than __del__
r u the guy who gave that talk? (on globals)
yeah
And yeah, this is true - depending on the type of shmem, at least. It's true for system v shared memory.
(I'm still reading through the scrollback on this long conversation, heh)
Python can't raise a KeyboardInterrupt in the middle of a C function can it? (unless I call a Python function maybe)
which talk?
Like all the signal handler should do is modify sigatomic var
yeah, that's what it does.
@spark magnet i gave a talk at cppcon about globals
not sure if C++ is your cup of tea
if it is you may find it interesting, though I have learned that some parts could have been improved on
Interesting, thanks for clarifying
cool
the signal handler sets a flag that says sigint was received, and the Python VM mainloop checks for that flag in between every bytecode instruction (or every one on the main thread, maybe it is?) and raises KeyboardInterrupt if the flag was set.
ok nice
yeah, mb
really, I skipped a step there. The interpreter sets a flag that a signal was raised in the C signal handler, and in the mainloop it checks that flag and calls the appropriate Python signal handler
and the Python signal handler for SIGINT raises KeyboardInterrupt.
the default Python signal handler for SIGINT raises KeyboardInterrupt. The user could install a different one.
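A sketch of installing a different SIGINT handler that still defers to whatever was there before. `install_chained_handler` is a made-up helper name, and `signal.signal` can only be called from the main thread:

```python
import signal

def install_chained_handler(signum, on_signal):
    """Install a handler that runs on_signal, then defers to the old handler."""
    previous = signal.getsignal(signum)

    def handler(received, frame):
        on_signal(received)
        # SIG_DFL, SIG_IGN and None are not callable, so only chain to
        # handlers that were installed from Python.
        if callable(previous):
            previous(received, frame)

    signal.signal(signum, handler)
    return previous

seen = []
original = install_chained_handler(signal.SIGINT, seen.append)
# ... while this is installed, Ctrl-C records the signal number and then
# falls through to the earlier handler (by default the one that raises
# KeyboardInterrupt) ...
signal.signal(signal.SIGINT, original)   # put the earlier handler back
```

The Python-level handler still only runs when the interpreter gets back to its main loop, as described above.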
I was sitting in on my girlfriend's part time coding class and got so annoyed that the lecturer was teaching them python with everything in mixedCase
lol get her pycharm and the linter will teach her
pycharm now turns a regular string into an f-string, if you hit { and you choose to auto complete a variable
fucking bonkers
that's awesome
I don't have to keep going back and adding f anymore
Set the handler for signal signalnum to the function handler. handler can be a callable Python object taking two arguments (see below), or one of the special values signal.SIG_IGN or signal.SIG_DFL. The previous signal handler will be returned (see the description of getsignal() above).
do you think most people who have their own signal handler for SIGINT make it call the old one (i.e. keep a reference to it and call it from their own one?)
but will it delete fs if you remove brackets
that's even better
this is my most common vim sequence these days: F"if
i was doing that as well (in pycharm) until today
surround could add f-string support in principle
surround?
it's a vim plugin that lets you change surrounding delimiters without having to leave where you are
it's so ubiquitous that even most decent vim emulators (like PYcharm's Ideavim) support it
ohh - unfortunately idk vim :/
so for example, to change from double to single quotes, you'd do something like cs"'
I think
and it "just works"
i tried surround.vim once, and it didn't suit me somehow
surround.vim is part of my regular workflow, i couldn't imagine not using it
I say "somehow" because when I read the readme, it sounds great, and useful. But I don't miss it, and when I tried it, it just didn't click for me.
of course, use what works and don't use what doesn't work!
it's not bad, but not my #1 either
I'd say my top vim feature is probably simply text objects
the fact that it just works from anywhere inside the text object can save you sooo many keystrokes
text objects are the bomb. I've customized some for my own use
pycharm now supports the argument text object
which is one of my all time top
cia for days
agreed, its an underrated vim feature (and one of the big things holding me back from being comfortable in emacs, even with doom/evil/whatever)
looks like you are rocking whatever topic this is. I was going to ask if I'm right in thinking that lisp is higher-level than Python.
well, evil definitely has text objects (and far far more)
So I'm not quite sure what you mean there
evil is pretty insane, it's about a 99.9% perfect vim emulator
i get disappointed every time i see something like delete-forward-char
what do you mean?
meh. what does higher-level mean?
Is there a preferred way to handle iteration over descriptor instances for the fields of a class?
my vim brain thinks it should be delete+<motion>, not a single operation
the act of writing lisp code requires you to make even fewer considerations about the reality of the machine upon which it runs.
in that case, no
I mean that's just a name of a particular function, somewhere under the hood, vim has all these functions as well
that's a good point, but that isn't exposed to the user at all in vim
learn me 
evil is pretty awesome, I was a spacemacs user for a while, but the emacs IDE-like experience is still meh. better than vim.
But pycharm has really stepped up their vim emulation game in the last 2 years or so
It's definitely only like a 90% or 95% emulator, but if you don't use a lot of exotic stuff it's fine
interesting, i remember i tried their vim mode years ago and it was not great
the neovim ide experience is coming up pretty quick, im happy with it
you want to get all the attributes of a class that are descriptors?
I think so.
Using descriptors seems like a decent way to handle validation for now even if I'm not sure everything else I'm doing is the best way to do it.
But getting this working is more important than making it perfect, at least for now.
I tried looking at Django's source to understand how their ORM abstraction works but it looks like there's a whole extra layer between the Field baseclass and the actual descriptors
I think what I'm trying to do is something like their Fields concept, but not for db ORM abstractions
I expect not. If they wanted the normal behavior, they'd have an except KeyboardInterrupt: block. The normal reason to override it would be to not translate it into an exception.
if there's a project you'd recommend that does something similar but in a less complicated manner, I'd appreciate it.
@snow kettle i think you can iterate over class attributes and check for descriptor-ness with https://docs.python.org/3/library/inspect.html#inspect.isdatadescriptor
https://attrs.org attrs avoids using descriptors entirely, i think
the attr.s decorator looks for attr.ib instances, removes them, and uses them to construct an __init__ method
(at least, that's how i think it works)
I'll take a look at attrs, if not for this project, for others
ty for pointing out the inspect module
dataclass is similar to attrs
but a lot more stuff. dataclass has gotten a few more really nice things in the last couple of releases but there's still stuff missing
thanks for the correction
can you subclass dataclass field descriptors?
Every method is a descriptor, for instance. Are you sure you want to do that?
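To make the distinction concrete, here is a small sketch. `Positive` and `is_descriptor` are illustrative names, and checking for `__get__` on the type is one reasonable definition of "descriptor", not an official inspect API:

```python
import inspect

class Positive:
    """A data descriptor: it defines both __get__ and __set__."""

    def __set_name__(self, owner, name):
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        return self if obj is None else getattr(obj, self.name)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError("must be positive")
        setattr(obj, self.name, value)

class Thing:
    size = Positive()
    label = "plain attribute"

    def f(self):
        return 1

def is_descriptor(obj):
    # Anything whose type defines __get__ takes part in the descriptor protocol.
    return hasattr(type(obj), "__get__")

# vars() returns the raw class dict, so nothing gets bound on the way out.
public = {n: v for n, v in vars(Thing).items() if not n.startswith("__")}
all_descriptors = sorted(n for n, v in public.items() if is_descriptor(v))
data_descriptors = sorted(n for n, v in public.items() if inspect.isdatadescriptor(v))
```

The plain function `f` lands in `all_descriptors` but not in `data_descriptors`, so for validation-style fields the data-descriptor check is probably the one to iterate with.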
🤔
What's your actual goal here, can I ask?
!e ```python
from inspect import isdatadescriptor

class Thing:
    def f(self):
        return 1

print(isdatadescriptor(Thing.f))
print(isdatadescriptor(Thing().f))
```
@paper echo :white_check_mark: Your eval job has completed with return code 0.
001 | False
002 | False
I'm not at all.
The overall goal is to practice async by making a utility module that uses async subprocess to wrap some local native system utilities.
yeah, they're not data descriptors, but they are descriptors. https://docs.python.org/3/library/inspect.html#inspect.ismethoddescriptor
and the original ask was for all descriptors.
interesting
My current goal is to figure out a tolerable way to translate a set of parameters into cli flags
not the other way around?
Either way though, I'd probably use a dataclass with metadata in each field to accomplish something like this
can you give a toy concrete example? then i could show what I'm thinking
actually, ismethoddescriptor seems to be special cased to not return true if isfunction is true, but a function is actually a descriptor nonetheless
different CLI zip utilities for example
@cli
class MyApp:
    input_file: Path = arg(conversion=Path)
    output_file: Path = arg(conversion=Path)
    version: bool = opt(default=False)
    verbose: bool = opt(default=False)
something like this?
no I mean a toy example with actual desired input and output
Sounds like you're describing Click - or... hm, what's that other one...
I think I'm describing the opposite of click
yeah that's why I was surprised
not quite. I think I might still need to think about this more and read because I'm not sure I understand the problem overall anymore, or at least not how to express it.
This is a good point
I'll try playing around with inspect a bit
so really what you want, is to translate a python class, into a string, that can be parsed as command line arguments
My suspicion is that I'm having trouble articulating the problem because I didn't understand that what I'm actually trying to build is a toy version of a large and complicated set of tasks done by devops tools like ansible or whatever
Only without the remote execution part.
@dataclass
class FooArgs:
    x: int = positional_arg()
    bar: float = optional_arg(default=2.0)
    baz: str = optional_arg()

as_command_line(FooArgs(1))  # "1 --bar 2.0"
as_command_line(FooArgs(4, 2.5))  # "4 --bar 2.5"
as_command_line(FooArgs(5, 3.6, "hello"))  # "5 --bar 3.6 --baz hello"
I think the specific feature I'm trying to make is a class that formats a data class or similar source of data to the flag combination for the local execution environment (BSD utility, GNU utility, etc)
yeah, so pretty much what I wrote, but with as_command_line either taking some additional argument for the CLI style
or inferring it somehow
close?
sort of I think
so you can do something like that pretty easily, positional_arg and optional_arg are simply functions that will call field inside
and stuff something in metadata which can be retrieved later iterating over the fields
what do you mean by metadata?
the metadata fields of the function, or is this something in a class or other location in the data model?
it's something you can access in the fields of a dataclass
So, a very simple example. Let's just start by saying we want to be able to store whether it's a positional argument or an optional argument.
I see there's a metadata argument for field
from dataclasses import field, fields
from enum import Enum, auto

class ArgumentType(Enum):
    POSITIONAL = auto()
    OPTIONAL = auto()

def positional_arg(default=None):
    return field(default=default, metadata={"argument_type": ArgumentType.POSITIONAL})

def optional_arg(default=None):
    return field(default=default, metadata={"argument_type": ArgumentType.OPTIONAL})

def as_command_line(dc):
    parts = []
    for f in fields(dc):
        argument_type = f.metadata["argument_type"]
        value = getattr(dc, f.name)
        if value is None:
            continue  # unset arguments are left out of the command line
        if argument_type == ArgumentType.POSITIONAL:
            parts.append(str(value))
        else:
            parts.append(f"--{f.name} {value}")
    return " ".join(parts)
this is the gist of it
ah ok I see
we define our own functions, which basically become our little mini DSL almost, for describing the fields of the dataclass, whether they are positional, or optional, etc
whether they are required or not, or maybe you want to infer that from the type annotation (Optional)
etc
any data you want
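As one illustration of "any data you want", the metadata could carry a flag spelling per CLI style. This is only a sketch: the `flag` helper, the `"gnu"`/`"bsd"` keys and the `ListArgs` class are invented for the example, and the spellings are illustrative `ls` options:

```python
from dataclasses import dataclass, field, fields

def flag(gnu, bsd, default=False):
    """Hypothetical helper: a boolean option with one spelling per CLI style."""
    return field(default=default, metadata={"flags": {"gnu": gnu, "bsd": bsd}})

@dataclass
class ListArgs:
    show_all: bool = flag(gnu="--all", bsd="-a")
    long_format: bool = flag(gnu="--format=long", bsd="-l")

def as_argv(args, style):
    """Return the argv tokens for every flag that is switched on."""
    return [
        f.metadata["flags"][style]
        for f in fields(args)
        if getattr(args, f.name)
    ]

gnu_argv = as_argv(ListArgs(show_all=True), "gnu")
bsd_argv = as_argv(ListArgs(show_all=True, long_format=True), "bsd")
```

Returning a list of tokens instead of one string means the result can be handed to a subprocess call without shell quoting.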
np
I'll read the dataclass doc more
You may just want to start with attrs
I think I'll want to extend this more, but it's a good start
it does the same thing as dataclass, just more features, its where dataclass came from
I started with dataclass and kind of regret it
lol
Yeah, dataclass only recently (in 3.10) added the kw_only argument for example, for the decorator
without that, certain things e.g. inheritance of dataclasses is really painful
attrs has had this for ages
all kinds of things like that
I'm playing around with dataclasses for now and it's going smoothly for now. I think I might look at attrs later if I run into insurmountable issues as it's a big library, ty for the help.
How does this work?
>>> def a():
        try:
            return 1
        except:
            pass
        finally:
            return 2
>>> a()
2
>>> def a():
        try:
            return 1
        except:
            pass
        finally:
            pass
>>> a()
1
>>>
What do you mean when you say "how"?
how is the order of program executed here?
i expected it to leave the function when i returned 1 in the try
because finally gets executed after try and except right?
finally blocks run after the try block (and any matching except block) finishes
And because the finally part runs after the try part but still in the context of the function, it's able to return from the function - and if it does, that overrides any value that the try block wanted to return
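The same behaviour as three small functions (names made up for the example). The second one shows a related consequence: a `return` inside `finally` also discards an exception that was in flight:

```python
def returns_from_finally():
    try:
        return 1          # the value 1 is computed and held...
    finally:
        return 2          # ...then replaced, because finally runs before leaving

def swallows_exception():
    try:
        raise ValueError("lost")
    finally:
        return "still returns"   # the pending ValueError is dropped

def finally_without_return():
    try:
        return 1
    finally:
        pass              # cleanup only, so the pending return of 1 goes ahead
```

This is why `finally` is normally kept to cleanup and the return lives in the `try` body.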
@snow kettle that's what I said too, the thing is by the time I hit some of these issues, i was using dataclass in many places and it's a large enough pain to change that I haven't bothered
Under normal circumstances I would say the solution is simply to wrap the functionality and limit the API, at a single point, so you can easily switch over
the problem is, very sadly, that you cannot wrap @dataclass or @attrs, or rather if you do then mypy, your IDE, etc things that depend on static analysis no longer work properly. These things are so complex that they are special cased into these tools, and the special casing cannot see through even a trivial wrapper
to note on this: you can modify large parts of how dataclasses works by overwriting the default __init__ or __post_init__
i have made a entire script to auto-resolve nested dataclasses from json using this.
I serialize/deserialize dataclasses from/to json automatically as well, but not sure why you'd want to do it by overriding init/post init
on a related note, I went through a couple of old attrs issues that I was interested in, and attrs have done some very nice things. Namely, they've created a new decorator, attr.define, with much-improved defaults compared to attr.s, for example, auto_attribs=True
which way do you do it? for me its that way since the raw json is written to it first, and then resolved recursively iteratively going deeper down the structure
so on post_init is where the actual resolving happens
I mean if you are overwriting init/post init, it sounds like you have to write code by hand
yes?
I just have a free function that does it, and it works for all dataclasses
zero extra code per dataclass
they inherit from some base class, and then you write the init function manually, to have it call some function in the base?
i have some utility on top, like a from_dict classmethod that ignores extra/missing fields
in the base
but then the init of the derived is still the one that's called?
no?
do you pass init=False?
no?
can you write out what a dataclass looks like for you, in order to be able to do this
this is what a each dataclass looks like
@dataclass
class Class(Base):
    date: datetime
    other_class: OtherDataclass = None
    list_of_classes: List[OtherDataclass] = None
it will resolve the OtherDataclass and List[OtherDataclass]
Why are you default non-optional fields to None?
because they are optional?
the class i copied from only has optional fields
there
what do you mean "the class you copied from"
that's still not compatible
i dont see a reason to typehint the optional fields to Optional[] because the same is achieved with a default value, which i need anyways
i dont use mypy
i have in fact never used a static type checker for python, because IMO that defeats the purpose of python
uh ok lol
why use mypy when i could just use a typed language to begin with?
there are many many many obvious reasons but I'm sure other people have told you them before
Obviously those reasons may not be applicable to you but they are applicable for many people
please do tell me a reason to use python with static typing, instead of a statically typed language
Do you use any kind of checker for the hints?
they are used to resolve the dataclasses from a json in my code
In general code, ignoring the use case of the above code now. Or do you not typehint "normal" code
I don't think I understand your base init thing still:
In [1]: class Base:
   ...:     def __init__(self):
   ...:         print("hello")
   ...:
In [2]: from dataclasses import dataclass
In [3]: @dataclass
   ...: class Foo(Base):
   ...:     x: int
   ...:
In [4]: Foo(5)
Out[4]: Foo(x=5)
The init of base does not seem to get called
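That matches how dataclasses work: the generated `__init__` assigns the fields and then calls `__post_init__` if it exists, and never calls the parent's `__init__`. A sketch of the usual workaround, with placeholder class names:

```python
from dataclasses import dataclass

class Base:
    def __init__(self):
        self.base_ready = True

@dataclass
class Foo(Base):
    x: int

    def __post_init__(self):
        # The generated __init__ only assigns fields and then calls
        # __post_init__, so the parent initializer is invoked by hand.
        super().__init__()

foo = Foo(5)
```

A `__post_init__` defined on the base class, by contrast, is inherited and called automatically, which is the hook the Base approach relies on.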
At any rate though, I don't really see a reason to make something intrusive when it can be done just as easily non-intrusively
There isn't really a reason to force it to be "opt-in", people generally don't convert their dataclasses to/from json by accident
its not opt-in?
You have to inherit from Base?
yes?
yes, because a lot of your questions make no sense to me
A class that doesn't inherit from Base won't be able to do the to/from json conversion
that makes it opt-in
I don't see much purpose in having mypy or another checker being forced (as a part of the linting process for example) or going out of your way to do something to satisfy it, but running it during local development can be very useful
that makes me not having to write the same __post_init__ 100 times, nothing else
It's still opt-in?
let me clarify: i did not design this as a library for other projects to use, only for this specific project
Sure, that's fine. Maybe you prefer opt-in for some reason, but that's still what it is, it's just the definition
Anyhow, I don't have to write the same post init (or anything), and I also don't have to inherit from a base class
I can just call a function dataclass_to_json(some_dataclass) and dataclass_from_json(DataClassType, json_data) on any dataclass, and it just works
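The actual function isn't shown in the chat, so the following is only a guess at the shape of such a free function: it walks the type hints and handles nested dataclasses and `List[...]` fields, nothing more. The `User` and `UsersResponse` classes are placeholders:

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import List, get_args, get_origin, get_type_hints

def dataclass_from_json(tp, data):
    """Recursively build `tp` from plain JSON data (hypothetical sketch)."""
    if is_dataclass(tp):
        hints = get_type_hints(tp)
        kwargs = {
            f.name: dataclass_from_json(hints[f.name], data[f.name])
            for f in fields(tp)
            if f.name in data          # extra keys in the JSON are ignored
        }
        return tp(**kwargs)
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return [dataclass_from_json(item_type, item) for item in data]
    return data                        # scalars pass through unchanged

@dataclass
class User:
    id: int
    name: str

@dataclass
class UsersResponse:
    users: List[User]

response = dataclass_from_json(
    UsersResponse, {"users": [{"id": 1, "name": "alice"}], "ignored": True}
)
```

The dataclasses themselves stay ordinary: no base class, no custom `__post_init__`, and the normal constructor still takes already-typed values.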
wanna know what my __post_init__ contains?
class Base:
    def __post_init__(self):
        dicts_to_dataclasses(self)
so i could do that too, just makes more sense in the context of the project
is Base supposed to have a dataclass annotation?
nope
ah, I see
doesnt need one
Yeah, I understand now. Still, don't really see the benefit of doing it that way, but different strokes
also, sure if you wanna define opt-in that way go ahead, i was more thinking for each individual dataclass, where inheriting from Base makes the conversion not opt-in
as in, you would have to define a custom post_init to bypass the conversion
so more of a opt-out on the individual class level
You have to inherit from Base. That's how you opt-in.
If you don't inherit from Base, then it doesn't work. That's the default
compared to a free function that works on any dataclass. That's not opt-in, because the dataclass doesn't have to do anything differently at all for it to work
wait, so if I wanted to create a Class from a dict in your example, I would do c = Class(my_dict) ?
well, c = Class(**my_dict) as dataclasses dont take a dictionary as arguments
__post_init__ gets self after the actual init has already run, so before post_init is called, in my system, the dataclass would contain the raw json
this is interesting
Okay, so you have to ** it, I understand.
i would still argue inheriting from a base class is less opt-in than having to explicitly call the function used to create a dataclass, as you seem to do
you have opt-in when using the class, i have opt-in when defining the class
Maybe you're looking for something else
i dont have opt-in when using the class, as at that point the inheriting has already happened
So, with my approach, you can still initialize the python class normally, and you have to be explicit about from json conversion
yep
With yours, there's just one function you can call to create the class
so if you already have the arguments as proper types and you don't want that logic to run, you don't really have much choice
yep
Another issue is that you never have the option to ignore unused data in the json
as in my project, that will never happen
i do, as i actually never do Class(**json) but have my own classmethod from_dict which ignores unused data
By having a separate function, I guess
@classmethod
def from_dict(cls, a_dict):
    class_fields = {
        field.name: field.type for field in dataclasses.fields(cls)
    }  # get all fields
    existing_data = {
        to_snake_case(name): val  # always use snake case name
        for name, val in a_dict.items()
        if to_snake_case(name)
        in class_fields.keys()  # check against fields, to discard additional fields
    }
    unknown_fields = set(existing_data.keys()) - set(class_fields.keys())
    if len(unknown_fields) > 0:
        print(f"Unexpected fields: {''.join(unknown_fields)}")
    if len(existing_data) > 0:
        try:
            return cls(**existing_data)
Anyhow, it seems pretty convoluted overall, to be honest
Initializing with the wrong type, and then transforming afterwards into the correct type
the entire point is that this works recursively, i can just pass a entire giant nested json structure to the outermost dataclass and it will resolve recursively what it knows about (using the typehints)
its being recursive has nothing to do with the aspects of the approach we've discussed here, a free function can be recursive as well
yes, but if i make it a inherent property of the class, just having to instantiate it is enough to make it recursive
I'm not really sure what you're trying to say. But at any rate, one of the things that the author of attrs (and by proxy, kind of the main driver of dataclasses) strongly suggests is leaving the auto-generated init as-is, and having special construction with logic via classmethods and such
This whole approach also of course doesn't work out, type wise, but I guess we already discussed that. But, I guess it's another disadvantage, and I still don't really see any benefit to outweigh any of these downsides.
Obviously, your mileage has varied
I feel like most of the time I could get by with a statically typed Python
and that would in fact be preferable
@gleaming rover yeah, mostly, the types just get awkward in certain cases.
lets say i get a json like this
{"users":[{"id": 1, "name": "quicknir"}, {"id": 2, "name":"laundmo"}], "messages": [{"user": 1, "message": "hey"}]}
i can define the following dataclasses
@dataclass
class Message(Base):
    user: int
    message: str

@dataclass
class User(Base):
    id: int
    name: str

@dataclass
class MessagesResponse(Base):
    users: List[User]
    messages: List[Message]
and then use it like this MessagesResponse.from_dict(the_json)
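For reference, a minimal runnable sketch of that pattern. It assumes a hypothetical `Base` mixin and `_convert` helper (neither is the code from the messages above) and leaves the auto-generated `__init__` untouched; all conversion happens in the classmethod before the instance is built.

```python
import dataclasses
from dataclasses import dataclass
from typing import Any, Dict, List, get_args, get_origin, get_type_hints


class Base:
    """Hypothetical mixin: build a dataclass from a dict, ignoring unknown keys."""

    @classmethod
    def from_dict(cls, data: Dict[str, Any]):
        hints = get_type_hints(cls)
        kwargs = {}
        for field in dataclasses.fields(cls):
            if field.name not in data:
                continue  # absent keys are left to the dataclass defaults
            kwargs[field.name] = _convert(hints[field.name], data[field.name])
        return cls(**kwargs)


def _convert(hint: Any, value: Any) -> Any:
    """Recursively turn nested dicts/lists into the types named by the hint."""
    if isinstance(hint, type) and issubclass(hint, Base) and isinstance(value, dict):
        return hint.from_dict(value)
    if get_origin(hint) is list:
        (item_hint,) = get_args(hint)
        return [_convert(item_hint, item) for item in value]
    return value  # plain values (int, str, ...) pass through unchanged


@dataclass
class Message(Base):
    user: int
    message: str


@dataclass
class User(Base):
    id: int
    name: str


@dataclass
class MessagesResponse(Base):
    users: List[User]
    messages: List[Message]


the_json = {
    "users": [{"id": 1, "name": "quicknir"}, {"id": 2, "name": "laundmo"}],
    "messages": [{"user": 1, "message": "hey"}],
    "page": 1,  # not a field of MessagesResponse, so it is ignored
}
response = MessagesResponse.from_dict(the_json)
print(response)
```

Because every field already has its final type when `__init__` runs, a type checker sees nothing unusual on the instances themselves.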
Yeah, I get it. My function works the same way
for example
leaving the auto-generated init as-is
but i do exactly that. __post_init__ is SPECIFICALLY offered to the user as a way to do post-processing on the dataclass instance. in a way, im using it for its intended purpose
You're not using post init for its intended purpose, no
in that example, they are initializing uninitialized fields
initializing fields to the wrong type in init and then rewriting it to the correct type in post_init isn't an intended usage
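A small sketch of the "initializing uninitialized fields" pattern being described, using a hypothetical `Rectangle` class: the derived field is declared with `field(init=False)`, so it is never passed to `__init__` and only ever holds a value of its annotated type.

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:
    width: float
    height: float
    # Excluded from the generated __init__; filled in by __post_init__ below.
    area: float = field(init=False)

    def __post_init__(self) -> None:
        self.area = self.width * self.height


rect = Rectangle(2.0, 3.0)
print(rect.area)
```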
the intended usage patterns play nicely with types, dataclasses are designed rather heavily around type annotations
thats the thing i dont get, i have the beautifully dynamic language and you want to force me to stick to static types?
this seems like an emotional topic for you
well, kinda?
it just feels as if you're arguing as if python was statically typed
which it isnt
Nobody's forcing you to do anything. But dataclasses are designed to be very compatible with type annotations.
so stop treating it as if it is
and i use the type annotations extensively. as converters.
@gleaming rover Well, I guess I can only say "it gets awkward wrt mypy", specifically.
Since I use mypy as my type checker
but they are annotations, hints, not a static thing
Possibly with another typechecker it would be different
yeah I'm saying if it were statically typed
i feel like this whole argument only exists because you treat python as a statically typed language and i dont
Well, if it were statically typed with mypy as the typechecker, or another typechecker, or some theoretical typechecker that works better?
@finite sparrow no, not really. Even aside from the types agreeing, I don't see any benefit to your approach, which I've discussed already.
But if you can make the types work out for free, that's definitely another benefit
just avoids false positives in your IDE and mypy
who cares if the types are wrong for 0.0001s until the post_init is called and they are corrected?
because the IDE and mypy will complain about it, which creates what are called "false positives", which makes it harder to find real problems
my IDE is not complaining about it, because im passing a dictionary into a classmethod that takes a dictionary
@gleaming rover there are some things that python's static type system does, intrinsically, that are pretty sketchy by the standards of a statically typed language, that have given me grief before. In particular, flattening unions.
In that example, yes, but if you call your init function with **, you are losing type safety (though I guess you may or may not get a complaint)
what do you mean by flattening unions
python has this weird thing where it flattens unions out
python's static type system
im sorry? since when is this a language feature?
can you elaborate
@finite sparrow you know what I mean, I feel like you're just looking to continue picking a static vs dynamic typing argument, not interested tbh
@gleaming rover So, say you have something like this: x: Optional[Union[int, float]]
isn't that a consequence of how Optional is defined
Optional is really just Union[T, None] at a fairly fundamental level
that part is fine
so you really have Union[Union[int, float], None]
in which language does that not happen?
Why is flattening unions weird?
but now, python, when you try to look at the annotation, will have transformed that into Union[int, float, None]
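The flattening can be observed at runtime with the `typing` module (a quick check on Python 3.8+, where `typing.get_args` is available; the variable names are just for illustration):

```python
from typing import Optional, Union, get_args

nested = Optional[Union[int, float]]  # i.e. Union[Union[int, float], None]
flat = Union[int, float, None]

# The nested form is normalised when it is created, so both names refer to
# equal annotations and the inner Union is no longer visible.
print(nested == flat)
print(get_args(nested))
```

Anything that inspects the annotation afterwards only sees the three flat members.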
@halcyon trail let me rephrase that: could you please be specific so that i know whether you meant "pythons typehints" or "mypy static type checking"
You get either an int, or a float, or a None
that's not the behavior in any statically typed language that I'm familiar with
It's unsound because it breaks composition of types
if you're in a statically typed language with generics, suppose you have some procedure foo that you want to be able to do for various types
yes, in a language with tagged unions Either[Either[A, B], C] would not be equivalent to A | B | C
I get that part
you have a specialization/trait/whatever for foo and Optional, so that if you know how to do foo to T, you know how to do it to Optional[T], for all T
That means that if you have Optional[Union[int, float]], you need to define how foo works for Union[int, float], and now it should also work for Optional[Union[int, float]]
because of the flattening though this doesn't really work out nicely at all, when you try to use the annotations
which language
are you thinking of
well, it definitely doesn't work that way in languages where the unions are tagged
but even in languages where they aren't, like C++
it does not work that way
C++ has union types?
it has std::variant, yes
Not familiar with scala. But either way, it's pretty bad behavior.
I'm not completely sure what you mean by that in this context. But, regardless, as I showed, it breaks attempts to compose behaviors of functions on unions that should work
I read up on why python does this and I'm pretty sure the reasoning was "there isn't much difference and performance"
not because they thought it was more correct or anything
Optional is not like Maybe or Option in other languages
^
Like I said, that's not the root cause of the behavior here
I'm not sure what the right term for this is
You can also nest variants in C++, and it will still behave correctly
for all types A, B, C, (A | B) | C == (A | C) | (B | C) == A | B | C
I need a type theorist in here
it's basically breaking generic code. Imagine you have a generic function that operates on variant<T, int>, or in python, Union[T, int]. This code should work for all T (provided that T meets whatever constraints the function requires, if you are working in such a language)
Now suddenly, T itself being a Union is a special case
there's no real justification for that
it isn't though. If I have a function like
```
def fun(thing: Union[T, int]) -> Union[T, int]:
    if isinstance(thing, int):
        return thing + 1
    return thing
```
it will still work for all T, including union T
Union in the very strict type theory sense is just set union, and set unions are indeed associative and commutative
is there any Scapy support room?
#web-development or #data-science-and-ml are your best bets
is what I was thinking
thanks
now, it turns out that tagged unions are more useful in practice, due to being able to soundly check which part of the union is active, but python already has tags in the form of types at runtime, so it isn't as important, and there isn't a nice way to make a tagged union nor is it common in python code, so having a type hint for them would be a bit pointless, don't you think?
it works in mypy, but in the annotations, they are collapsed
So, I guess perhaps this point is moot until "statically typed python" actually had a statically typed dispatch mechanism
In my case, since python does not have such a mechanism, I was using the annotations to dispatch things
and collapsing is very misleading
yeah, you should make a custom union for that
or union could just behave properly
union working like set unions is the proper behaviour
It's about like how null behaves, in languages that have null, as opposed to optional
yes
e.g. Kotlin and Swift
it's convenient but it breaks generic code
it's not better or more correct in a type system sense
there are some types which are impossible to dispatch on as well, for example Generator, Callable, Coroutine, Iterable
okay. At any rate, all you're basically saying is "this behaves like the mathematical concept with a similar name", which is fine
The question is, how does it fit in with a programming language that wants to achieve things like control the behavior of functions based on types, via generics
and the answer is, pretty poorly
That's why, say, Map<String, Int?>, is a headache in Kotlin (and the comparable class in Swift)
If you compare to say, languages like Rust, C++, etc, where you have Optional/optional, etc, which does not auto collapse
that goes back to this.
Kotlin's nulls are really useful, but if you have generic code, and the type you are generic on isn't constrained as being non-null, you basically can't use them (I've encountered this)
Right. My point though is that it's just not good, type system wise. It may be more performant (for python) or be really nice and convenient in non-generic code (like Kotlin), but collapsing these types is just a headache in generic code
it seems to me to just be a question of choosing the right abstraction
there is no (simple) way to express the Either-equivalent in Python
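One way an Either-like type could be approximated, as a rough sketch only: the `Left`, `Right`, `Either` and `describe` names are hypothetical, and nothing like this ships in `typing`. Wrapping each side in its own class gives the union an explicit tag, so nesting is preserved instead of being flattened.

```python
from dataclasses import dataclass
from typing import Any, Generic, TypeVar, Union, get_args

L = TypeVar("L")
R = TypeVar("R")


@dataclass(frozen=True)
class Left(Generic[L]):
    value: L


@dataclass(frozen=True)
class Right(Generic[R]):
    value: R


# Generic alias: Either[A, B] expands to Union[Left[A], Right[B]].
Either = Union[Left[L], Right[R]]


def describe(x: Any) -> str:
    """Show the chain of tags wrapped around a value."""
    if isinstance(x, Left):
        return f"Left({describe(x.value)})"
    if isinstance(x, Right):
        return f"Right({describe(x.value)})"
    return repr(x)


# The wrappers keep the inner union from being merged into the outer one:
# the outer annotation still has exactly two members.
nested = Either[Either[int, str], float]
print(len(get_args(nested)))
print(describe(Left(Right("hi"))))
```

The cost is that every value has to be wrapped and unwrapped explicitly, which is the "not simple" part.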
I know this problem
yeah, because you are using what is effectively an untagged union in a case where you need a tagged union. You also can't dispatch well with a C++ union, because well, there is no way to find the type (in python, you can sometimes find the type)
```
#include <iostream>

union num {
    int a;
    float b;
};

void square(int num) {
    std::cout << "int" << num;
}

void square(float num) {
    std::cout << "float" << num;
}

int main() {
    num one{.a=34};
    num two{.b=34};
    square(one); // error: a bare union carries no tag to pick an overload with
    square(two); // error, for the same reason
}
```
I agree that for generic code, you need tagged unions, but Union is not that. There is a reason you will not find untagged unions in Haskell. But keep in mind type hints exist to describe existing python code, and unions in the Union way are very common in real python code
C++ union doesn't really have anything to do with this
it is the C++ equivalent to union
no you can't
I am not familiar with C++, but looking at the documentation does suggest to me it's a tagged union
so, neither are exactly equivalent
```
Union[Callable[[int], str], Callable[[int], int]]
```
is entirely opaque
just like a C++ union
unions themselves also are not generic, they always have names
or rather, they can be anonymous, but they still will be different from one another
in python Union[int, float] is the same as every other Union[int, float]
the fields have names
etc
it is definitely closer than variant
no, it's most definitely not
sorry
it's hard to even say what the equivalent of Union[int, float] is as a C union because a) you have to give the fields names, b) it will be a distinct union, different from all other unions, even with the same fields
conversely, variant is basically a type constructor like Union
they both take a variable number of types, and create a new type from that
again, variant is tagged. It keeps track of what part of the union is active
I understand that, and it's not one to one
To simplify this, let's imagine you actually had an untagged_variant class in C++
template <class ... Ts> class untagged_variant { ... }
so, this is basically the same as variant, except that it doesn't store the tag
Hopefully you agree?
yeah
Everything I said before is still true
untagged_variant<A, untagged_variant<B, C>> and untagged_variant<A, B, C> are still distinct types
but they have the same set of possible values
Yes, but they aren't the same type
now say you have a function template <class T> void foo(const untagged_variant<T, int>&)
if I have untagged_variant<untagged_variant<A, B>, C> x, I better be able to do foo(x)
but that's because it's not a union type
if you write code that follows that logic with python's type annotations, it will not work
okay, then maybe union types are just a bad abstraction 🤷
you cannot express that idea in that way
I don't really care what things are called
I care about the consequences for being able to write sane generic code
everything in its place
this will typecheck with mypy
I understand it will typecheck with mypy
that's why I wrote, a while ago:
So, I guess perhaps this point is moot until "statically typed python" actually had a statically typed dispatch mechanism
In my case, since python does not have such a mechanism, I was using the annotations to dispatch things
and collapsing is very misleading
Hmm I'm not sure that mypy meaningfully accepts that, actually
keep in mind that type hints originally exist to describe existing python code, not really to allow new ways to write python. Annotations in general do allow that, but using type hints for dispatching is impossible in enough cases that I would argue for creating a custom set of annotations for anything but the trivial cases.
from typing import List, Union, TypeVar, cast
T = TypeVar("T")
def foo(thing: Union[T, int]) -> T:
    return cast(T, 0)
x: Union[Union[float, str], int]
y: Union[float, str] = foo(x)
So, when I type check this with mypy, here's the error
scratch.py:11: error: Incompatible types in assignment (expression has type "object", variable has type "Union[float, str]")
So, it just deduced T to object
Maybe I did something silly with my example. But honestly this is actually the behavior I expected, mypy uses the annotations at the end of the day, and if the annotations are collapsed, well, this isn't surprising
hm, are you aware of what cython does with annotations?
yes, but I never actually tested what it does with uncheckable type hints, my guess is just treat it as Any.
Hello, does anyone here have experience with alembic migrations?
FAILED: Can't proceed with --autogenerate option; environment script /home/xx/xxx/xxx/xxxx/xxx/xxx/alembic/env.py does not provide a MetaData object or sequence of objects to the context.
I get this exception if I try to run: alembic revision --autogenerate -m "xxxx"
How would one set up a concurrent.futures.ProcessPoolExecutor so that each worker has its own logging file?
I imagine that this has something to do with multiprocessing contexts, but I have not used them
or maybe not contexts, actually, the name made it seem related but the docs don't make it seem to be
actually, maybe with initializer + init args
going to give this a go
hmm, but from the description the exact same init args will get passed to the initializer each time
I just need each worker to get a distinct integer, 0 to N-1
they need to be sequential?
they don't have to be, just seems easiest
I'm thinking now if I simply have a multiprocessing queue, push the integers 0 to N-1 into it, and then I have the init functions pop from the queue, that might do it
what i was gonna suggest
(alternatively just do it based on pid if you don't need consecutive)
pid is pretty easy I admit, I guess I'm going to just try and do it to see, also sometimes the determinism might be useful
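For comparison, the pid-based variant might look roughly like this. It is a sketch, not the approach that was eventually used: `log_filename` is a hypothetical helper, the log files go into a fresh temporary directory, and the numbering is neither consecutive nor deterministic.

```python
import logging
import os
import tempfile
from concurrent.futures import ProcessPoolExecutor


def log_filename(prefix: str, pid: int) -> str:
    """Hypothetical helper: one log file name per worker process id."""
    return f"{prefix}_{pid}.txt"


def initializer(prefix: str) -> None:
    # Runs once in each worker; os.getpid() is already unique per process,
    # so no queue is needed to hand out identifiers.
    logging.basicConfig(
        level=logging.INFO,
        filename=log_filename(prefix, os.getpid()),
        format="%(asctime)s %(levelname)s %(message)s",
    )


def work(x: int) -> int:
    logging.getLogger(__name__).info("processing %s", x)
    return os.getpid()


def main() -> None:
    prefix = os.path.join(tempfile.mkdtemp(), "log")
    with ProcessPoolExecutor(3, initializer=initializer, initargs=(prefix,)) as executor:
        pids = set(executor.map(work, range(20)))
    print(f"{len(pids)} worker(s) wrote log files under {os.path.dirname(prefix)}")


if __name__ == "__main__":
    main()
```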
but damn this is all so much harder than it needs to be
are you asking the pool for a specific number of processes?
Yes, I would in such a setup
Alright, I got a POC working
from concurrent.futures import ProcessPoolExecutor
import logging
import multiprocessing

i = None

def initializer(queue):
    global i
    i = queue.get()

def work(x):
    return i

def main():
    num_processes = 5
    q = multiprocessing.Manager().Queue()
    for i in range(num_processes):
        q.put(i)
    with ProcessPoolExecutor(num_processes, initializer=initializer, initargs=(q,)) as executor:
        print(list(executor.map(work, range(20))))

if __name__ == "__main__":
    main()
This can all be wrapped up nicely now I suppose, I can have my own LoggingProcessPool that handles all these details and accepts a logfile prefix
ah ok - if you weren't controlling the number of processes i would question why not just have a single log file since splitting it into arbitrary pieces should make no difference - and i still somewhat hold that view but now if you know that you're going to have exactly 5 log files, that's more understandable
a single log file, would you just have a process level lock then?
if the overhead is insignificant, yes
fair enough, that could work as well, I guess the idea would be the same
pass the lock instead of the queue in initargs
well more like pass a concurrent logger
and then initializer would simply add a logging handler that grabbed the lock before recording
Well, it amounts to the same thing I guess
pass a concurrent locker, or pass the lock that you need to create one
concurrent logger*
not a logger, btw
Well, I guess you could do a logger. I was just picturing a handler
oh - i thought this was just about logging 😦
a handler for the logger
loggers themselves don't write to any files, they are just the things that you call .log on
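A rough sketch of the handler idea, assuming a hypothetical `LockedFileHandler` and a `multiprocessing.Lock` handed to every worker (for example through `initargs`). The logger stays an ordinary logger; only the handler knows about the file and the lock.

```python
import logging
import multiprocessing


class LockedFileHandler(logging.FileHandler):
    """Hypothetical handler: holds a shared lock while writing each record."""

    def __init__(self, filename, shared_lock, **kwargs):
        super().__init__(filename, **kwargs)
        # Kept under its own name: logging.Handler already uses self.lock
        # for its per-handler thread lock.
        self._shared_lock = shared_lock

    def emit(self, record):
        with self._shared_lock:
            super().emit(record)


def initializer(filename, shared_lock):
    """Would run once per worker: attach the locked handler to the root logger."""
    root = logging.getLogger()
    root.setLevel(logging.INFO)
    root.addHandler(LockedFileHandler(filename, shared_lock))
```

With `ProcessPoolExecutor(..., initializer=initializer, initargs=(path, lock))`, every worker would append to the same file and take the lock around each write.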
Well, C++'s standard library doesn't provide one, but there are tons of logging libraries
super useful, bordering on mandatory I'd say for many domains, tbh
I mean otherwise something goes wrong in prod and you have very little to work with to understand the internal state of the program leading up to it
hmm ok (ik there's external libs but effort)
well, idk, what were you writing in C++
bots
i remember being interested in sentry.io a while back but they didn't have any C++ stuff at the time (or it was some poorly documented unofficial bindings)
wow they support cmake now
that seems a bit different
talking something more like this: https://github.com/gabime/spdlog
working example of the overall logging thing for the interested:
from concurrent.futures import ProcessPoolExecutor
import logging
import multiprocessing

logger = logging.getLogger(__name__)

def initializer(logging_prefix, queue):
    i = queue.get()
    logging.basicConfig(level=logging.INFO, filename=f"{logging_prefix}_{i}.txt",
                        format='%(module)s %(asctime)s %(levelname)s %(message)s')

def work(x):
    logger.error("damn")

def main():
    num_processes = 5
    q = multiprocessing.Manager().Queue()
    for i in range(num_processes):
        q.put(i)
    logging_prefix = "log"
    with ProcessPoolExecutor(num_processes, initializer=initializer, initargs=(logging_prefix, q)) as executor:
        list(executor.map(work, range(20)))

if __name__ == "__main__":
    main()
Very fast, header-only/compiled
had me at header-only
hah
anyway we don't use this where I work, it's probably still too slow for us since it does the formatting in the main thread
but it gives you an idea
I've written a grammar-like language for automatically assembling a lexer
I'd love to get you guys' thoughts on it, but I know I spam this channel all the time. You mind if I post it up?
i mean it's fairly quiet, so go for it
Hey @static bluff!
Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:
• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)
β’ If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:
