#internals-and-peps
@unkempt rock pip doesn't remember the history of your installations -- it only knows which packages are installed. So the best solution would be to just look at the imports and reconstruct the required packages from that.
If you have installed packages A, B, C, where A depends on B and B depends on C, there are 4 possibilities:
- You only use A directly -- if it stops depending on B, you don't need B.
- You use A and B directly, but you don't use C. So if B stops depending on C, C doesn't need to be installed.
- You use A and C directly, but you don't use B. So if A stops depending on B, B can be dropped, but C should be kept in either case.
- You use A, B, C directly -- they all need to be installed.
So here pip can't decide whether requirements.txt should contain `A`, `A, B`, `A, C` or `A, B, C`.
looks like pipreqs does try to reconstruct the reqs.txt from imports, but I've never used it
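A rough sketch of the "look at the imports" idea, using only the standard library. The helper name `top_level_imports` is made up for illustration, and this is not how pipreqs itself works internally (I haven't checked). Keep in mind that an import name isn't always the name of the package on PyPI, so the result is only a starting point.

```python
import ast

def top_level_imports(source):
    """Return the top-level module names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import (from . import x), i.e. local code
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names

example = "import requests\nfrom numpy.linalg import norm\nfrom . import helpers\n"
print(sorted(top_level_imports(example)))  # ['numpy', 'requests']
```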
I'm trying to run a PowerShell script which should open a file selector natively. If I run PS>powershell.exe ./myfile.ps1, I get a window, but if I do it with subprocess I don't
it's rather printing those strings. I have asked on #kivy or IRCNode but people are avoiding windows discussions
It may be easier to just open the file selector natively thru python
nevermind win32 api is garbage
Subprocess may not be started in userland
Hey. I am thinking about extending dataclass with some optional param in field which determines that class field is returned by asdict function or not (similar to init=False). Should I create PR in GitHub or how can I do it properly?
pantry
This sounds like something that should go on the Python-Ideas mailing list first.
It is what I am asking for, thanks!
If you post there, you should expect a tough gauntlet of critique, btw.
Some people might say, "sounds like you want a schema and serialization tool, use one of those," for example.
Maybe... However, I can remove some fields from eq or repr, so why should there be a problem with excluding some fields from asdict?
Haha, okay. Thank you for response
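For what it's worth, something close to this can already be approximated without changing dataclasses, by stashing a flag in `field(metadata=...)`. A minimal sketch, where the `"asdict"` metadata key and the `asdict_filtered` helper are invented for illustration (they are not part of the stdlib), and the helper is shallow, unlike `dataclasses.asdict`, which recurses:

```python
from dataclasses import dataclass, field, fields

def asdict_filtered(obj):
    """Shallow dict of a dataclass instance, skipping fields marked asdict=False."""
    return {
        f.name: getattr(obj, f.name)
        for f in fields(obj)
        if f.metadata.get("asdict", True)
    }

@dataclass
class User:
    name: str
    # hypothetical opt-out flag stored in the field's metadata mapping
    password: str = field(metadata={"asdict": False})

print(asdict_filtered(User("ann", "hunter2")))  # {'name': 'ann'}
```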
pls i can't even come up w ideas
Am I getting my Python history right? I believe I read that map, filter, reduce, and lambdas were all implemented by a Lisp user who wanted them, and Guido or whatever powers that were at the time just accepted it since the person had already implemented it. If that's all true, did the key keyword argument already exist for comparison functions? were people using the callables in operator at that time?
@strange heath you don't have to reply, "i have no idea, i'm dumb" to the discussion that happens here
ok
The reason I ask is that reduce is the only one of those three functions that I don't have any reservations about liking, and when you consider that there are alternatives to creating sorting keys, I can start to see why some people wish lambdas weren't in the language.
i didn't know that some people wished lambdas weren't in the language.
maybe I'm reading too much in to those who say they're "not pythonic"
You mean the current form of lambdas or lambda functions in general
key to sort is much newer than map
Cause i dont particularly like the current form of lambdas
I wasn't aware that lambdas had undergone any non-implementation changes
list.sort(key=...) was added in 2.4
another thing: why are map and filter built in but not reduce? I would put all three in functools
if they ever do python 4, these are the kinds of cleanup I want to see
and map was builtin in 2.0, at least - that's the earliest online docs I can find. https://docs.python.org/2.0/lib/built-in-funcs.html#l2h-175
(I also want to see upper camel cased class names used consistently in the builtins/stdlib and the removal of lower camel case where it exists)
there's too much Python 2 code out there to even be thinking about Python 4
we don't care about those people
python 2? who's she
I'm still writing Python 2 + 3 compatible code, because I write libraries, and there are groups in my company that are still trying to get off Python 2. We're hoping for the end of the year...
should int be Int?
Good question. Not sure but of those 3, reduce is the only one I think is any use ironically. It's annoying to have to import it.
I would prefer the opposite tbh
map is one of my most used builtins, it would be annoying if it was moved to functools
imo map(int, lst) looks better than (int(x) for x in lst)
Reduce is really hard to emulate with a comprehension; filter and map aren't
is it even possible?
how would that even work, without just creating a list comp with side effects
Probably not, which is why I think it should be builtin but filter and map shouldn't
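To make the comparison concrete, here is a small sketch. The closest thing to a reduce without `functools.reduce` that I know of is `itertools.accumulate`, which yields every intermediate result so you keep only the last one. The variable names are just for illustration.

```python
from functools import reduce
from itertools import accumulate
from operator import mul

numbers = [1, 2, 3, 4]

# The straightforward way: fold the list into a single value.
product = reduce(mul, numbers, 1)

# accumulate yields each running product; the final item is the reduction.
*_, last = accumulate(numbers, mul, initial=1)

print(product, last)  # 24 24
```

map and filter, by contrast, translate directly: `(f(x) for x in xs)` and `(x for x in xs if pred(x))`.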
it wouldn't ruin my day if int became Int, though I suppose one could argue that classes that represent primitive-like types can be exempt. I'm not sure where one would draw the line (str?).
if you're going to move it, filter should be moved to itertools instead of functools
bytes? bytearray?
what is it that you're doing with map?
I'd be fine with map and filter going to itertools.
I'd still just use a comprehension of some kind
Curried?
the transformation of a function taking multiple arguments to one that successively takes single arguments and returns a partially applied function
so you can do seq_to_int = map(int) and then do seq_to_int(["2","3","4"])
you can do that using functools.partial
One could have that backwards compatibly by making the iterable argument optional, yes?
sure, but - why special case that just for map? We already have partial for currying.
I guess. I've never used partial
you can
but it's inherently less FP-friendly
not just map
there are a few reasons for this, but one of them is nicely pipeable functions (ReactiveX uses this pattern a lot)
!e
```py
from functools import partial
seq_to_int = partial(map, int)
print(list(seq_to_int("234")))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
[2, 3, 4]
example (not Py):
this.gameService.load().pipe(
  filter(game => game.gameTime > 60),
  map(game => game.homeScore),
  reduce((acc, next) => acc + next, 0)
)
so you can read top down
as opposed to right to left (because functions would be applied inside out)
you can do that with Python, arguably more readably - though it forces you to name the intermediate steps.
games = self.game_service.load()
long_games = (g for g in games if g.time > 60)
home_scores = (g.home_score for g in long_games)
return sum(home_scores)
"encourages" you to name the intermediate steps.
whenever someone asks, "here's a long line, what's the right way to format it," i always answer, "use more, shorter, statements"
of course you can do that with Python
but, again, less FP-friendly
also
it's not really clear from that snippet, but the result can be passed elsewhere and further processed
so it's not really a generator thing
and use of anonymous functions like that is quite a common FP idiom
I would say that naming every step can add noise
as an aside: similar Scala code, but with method chaining:
games
  .filter(_.gameTime > 60)
  .map(_.homeScore)
  .reduce(_ + _)
return sum(g.home_score for g in self.game_service.load() if g.time > 60)
I certainly hope no one reaches for reduce in order to sum things.
I hope not, either
but it was the simplest illustration I could find
that was still meaningful 🥴
Wow, I really dislike that, heh
the Python version with named generators is so much easier for me to read than that. I'm sure part of that is familiarity, but, still.
when I had group from groupby I did sum(1 for _ in group) (because groupby group doesn't have len)...
thankfully i could later change that to actual elem.get("qty", 1) because we introduced a field... wait, no... it was a string (because a custom field in a tool was always a string) and could be empty... ugh, it was int(elem.get("qty") or 1) or 1 or sth like that ("0" string would be True but int would return 0)
it's extremely common in FP
although Scala has especially terse lambdas, to be fair
the thing is that with RxJS (at least) you can't name your steps
the abstraction doesn't work that way
well, you could name the individual functions, but that would be both misleading and increase nonlocality of reference (not sure if the right term)
but anyway, one really nice thing about writing code like that is that it's quite easy to scan what operation is being performed at each step
because the operators are aligned
I think the Python version using named generators is pretty readable, albeit more verbose.
it is
I would say the benefit of having names is that the intended meaning of each step is clear
so it depends on whether you want to emphasise the intended result, or what processing is actually being done
but anyway, different paradigms
https://github.com/python/cpython/search?q=alloca does cpython have a dependency on alloca or is this just some ctypes stuff?
what do you mean when you say "dependency on"?
Looks like it's used in ctypes, and in attempting to detect whether a stack overflow is about to occur on Windows.
hmm
ig would it be possible to build cpython with a compiler that doesn't have alloca?
hm. I don't think so - it looks like ctypes unconditionally uses it.
though, now that I think of it, the interpreter itself may build, and just not provide the ctypes module.
if that's an option for modules, that's pretty cool
modules are separate shared libraries, and if one module fails to build it doesn't stop others from being built.
Blogged about recent improvements in py2many today:
from time import time

start = time()
if str(message.author.id) in bonk:
    pass
    #print('found!')
end = time()
print(end - start)
start = time()
if str(message.author.id) not in bonk:
    pass
    #print(message.author.id)
    # print(type(message.author.id))
end = time()
print(end - start)
Output
5.817413330078125e-05
2.9802322387695312e-05
How are if not statements faster
They're not. That looks like sampling error to me. I doubt you can accurately measure something that takes less than 60 microseconds without running hundreds or thousands of trials
Consider using the timeit module for better testing.
!e
```py
import timeit
print(timeit.timeit("1000 in lst", "lst = list(range(1000))", number=10000))
print(timeit.timeit("1000 not in lst", "lst = list(range(1000))", number=10000))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | 0.3987090536393225
002 | 0.3824900588952005
!e
```py
import timeit
print(timeit.timeit("1000 in lst", "lst = list(range(1000))", number=10000))
print(timeit.timeit("1000 not in lst", "lst = list(range(1000))", number=10000))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | 0.3420707928016782
002 | 0.26386456098407507
!e
```py
import timeit
print(timeit.timeit("1000 in lst", "lst = list(range(1000))", number=10000))
print(timeit.timeit("1000 not in lst", "lst = list(range(1000))", number=10000))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | 0.31232740404084325
002 | 0.2853736230172217
!e
```py
import timeit
print(timeit.timeit("1000 in lst", "lst = list(range(1000))", number=10000))
print(timeit.timeit("1000 not in lst", "lst = list(range(1000))", number=10000))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | 0.37548045301809907
002 | 0.44427160965278745
Clearly just random variation even trying lots of times, but you can see that they're very close. They're doing exactly the same amount of work.
If you were to create your own list using a bloom filter, you could probably get good performance for the not case.
If you're gonna use hashing, you may as well use a set, and then both cases are O(1)
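A quick way to see the list-versus-set difference for yourself, reusing the same timeit pattern as above. The exact numbers depend on the machine, so none are shown here; the set lookups should come out far cheaper than scanning a 1000-element list, for both `in` and `not in`.

```python
import timeit

setup = "lst = list(range(1000)); st = set(lst)"
statements = ("1000 in lst", "1000 not in lst", "1000 in st", "1000 not in st")

# Time each membership test the same number of times.
timings = {stmt: timeit.timeit(stmt, setup, number=10_000) for stmt in statements}

for stmt, seconds in timings.items():
    print(f"{stmt:16} {seconds:.4f}s")
```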
I'm not there yet 0(1)
how can i make my script to run an url, save the image from there and post it to twitter every day?
What I really hate about list comprehensions, there's no way to do that pattern
for i in iter:
    if func(i):
        return i
raise ValueError()
Any is similar, but something like first(iter, key=func) or first(i in iter if func(i)) would be neat.
next(i for i in iter if func(i))?
Hm, that's good, i don't use next() and iter() often. StopIteration instead of ValueError, but you can always catch and re-raise.
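The catch-and-re-raise version could look roughly like this. The name `first` is just the hypothetical helper floated above, not an existing builtin.

```python
def first(iterable, predicate):
    """Return the first item for which predicate(item) is true, else raise ValueError."""
    try:
        return next(item for item in iterable if predicate(item))
    except StopIteration:
        # Translate the generator's exhaustion into the error the loop version raised.
        raise ValueError("no matching item") from None

print(first([1, 4, 9, 16], lambda n: n > 5))  # 9
```

If a fallback value is fine instead of an exception, `next(generator, default)` avoids the try/except entirely.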
Another, rarer, pattern is
lst = []
for i in iter:
    if not func(i):
        break
    lst.append(i)
return lst
Something like len([val for val in [3, 2, 1, 0, -1, 0, 1, 2, 3] while val > 0]), for example.
EDIT - disregard that, forgot about itertools.takewhile.
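For reference, the takewhile version of that example would be something like:

```python
from itertools import takewhile

values = [3, 2, 1, 0, -1, 0, 1, 2, 3]

# Keep items only until the predicate first fails, then stop for good.
leading_positive = list(takewhile(lambda v: v > 0, values))

print(leading_positive)       # [3, 2, 1]
print(len(leading_positive))  # 3
```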
I am reading Hindi text from an Excel sheet using the openpyxl module, and the output is not in Hindi characters
Somebody please tell me what to do
@torpid solar this is strictly a discussion channel. Take a look at #❓｜how-to-get-help
Okay sorry
I personally think this can go here because it talks about under the hood stuff
```
smth = get smth from api
if smth exist:
    assign = assign
    Do_smth_(assign)
else:
    assign = assign
    Do_smth_(assign)

assign = assign
if smth exist:
    assign = assign
Do_smth_(assign)

if smth exist:
    assign = assign
else:
    assign = assign
Do_smth_(assign)
```
What do you guys think is better
third ig
@nova iris
@feral cedar
NOW FIGHT which is better
I made it so it's understandable
third one
lol yayyy
i mean, psvm argues that second one is better since there's less duplicate code
but i think third is good
bc third is more concise, cleaner, and more understandable @sleek marlin
no ternary option
note that when i said that, the code in question was about assigning booleans, and op had said he didn't want to do the simplest option
yeah ternary would be pretty compact do_smth(assign if smth_exist else assign)
I do smth else too 
true true
wait the teacher is telling us to do work
cy'all later
this chat died so hard
This channel is about the quality of the messages, not the quantity
Hello @abstract finch, please see #❓｜how-to-get-help
ok
keys = open("wanttowrite.txt", "w")
Yeah, I don't think this is really a "chat" as you will, but more a place to have nice discussions over python.
we shall put (Code Runner bot) to run the code in the server
!e
We already do have a bot. You can use it in #bot-commands or in a help channel.
print("hi")
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
hi
Also, this channel is for discussions about the Python language itself. If you want to discuss this community, check out #community-meta
Ok so does anyone know how I would make an OCR that can monitor my health in a video game like warzone?
is this: def func(thing, thing2: int) faster than this?
def func(thing, thing2):
    thing2 = int(thing2)
they do 2 different things
def func(thing: int) is a type hint that thing should be int
```py
def func(thing):
    thing = int(thing)
```
converts `thing` to an `int`
but they both convert to int?
no the first one does not convert anything
The annotation does absolutely nothing except being saved to __annotations__
discord.py does some kind of magic with them that converts them based on the annotations
OH
@unkempt rock these are annotations. Originally, they were designed for storing arbitrary data, like def cookie_func(x: 'number of cookies', y: 'cookies eaten per minute') -> 'number of cookies needed': You could use them for whatever you want. But now it's recommended that you only use them for storing typing data. And one could create a Python runtime where that type information gives you performance optimizations, but no one has made that.
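A small demonstration of the difference, with made-up function names. The annotated version hands the argument back untouched; only the explicit `int(...)` call converts anything.

```python
def annotated(thing, thing2: int):
    # The annotation is only recorded; nothing is checked or converted.
    return thing2

def converting(thing, thing2):
    return int(thing2)

print(annotated.__annotations__)     # {'thing2': <class 'int'>}
print(repr(annotated(None, "5")))    # '5'  (still a str)
print(repr(converting(None, "5")))   # 5
```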
ok
Where is a good place i can start to learn python?
This channel is strictly for discussion, though we have a resources page on our website: https://pythondiscord.com/pages/resources
There are other channels that are appropriate for questions.
```py
S = 10 > 5 and "hi" or "bye"
print(S)
```
how is this hi
this is a channel for discussion of the language, so I have replied to you in #python-discussion (please don't ask your question in multiple channels)
Here's an idea I haven't thought about very thoroughly:
def some_func(arg): ...  # might raise ValueError or KeyError, otherwise returns Tuple[int, int]
match some_func(23):
    case ValueError: ...
    case KeyError: ...
    case (x, 0): ...
etc.
How bad is it?
@boreal umbra have you seen the pattern matching pep that was accepted?
I'm not familiar with all the specifics, no
I'm fuzzy on many of the details myself.
That's already syntactically valid, and does something different than what you're proposing. So, your time to get that proposal implemented is limited to the next 2 months, until the 3.10 feature freeze.
Boo
As things stand with the PEPs that have been accepted so far,
```py
match some_func(23):
    case ValueError: ...
```
would match unconditionally, and would result in the object returned by some_func(23) being assigned to ValueError.
considering how busy the pattern matching mini-language already is, I really dislike the idea of making it able to match on both the return value and the exceptions that might be thrown.
well, obviously, the solution is to implement a Result type
you could already do that. If you made the function return an exception instead of raise an exception, you could match on that with the syntax that's already accepted.
you could also just make your own Result type and then use pattern matching to unpack it
actually - there may be something to that. We already have a special purpose Result type - https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future. Maybe pattern matching is enough of an incentive to add a more general purpose Result type.
this reminds me of Try
!e
You are not allowed to use that command here. Please use the #bot-commands channel instead.
Why does Python use len(list) instead of list.len or list.len()? I would expect lists to have a len attribute. Perhaps having that attribute would mean that mutating lists would become more expensive because you would also have to change this attribute?
the dunder list.__len__() works, which is what len(list) also calls under the hood I assume
That doesn't help me understand why it is the way it is.
Why Python follows the Principle of Least Astonishment even when some people are surprised.
or perhaps more authoritatively
what is HCI in this?
human computer interface AKA readability
icic
does anyone have worked on reading CSV file using python and pandas, and insert it into snowflake DB?
The length of the list is being tracked as a field on the struct internally, as that's necessary to keep track of the current size of the dynamic array anyway. Whenever you read the length of a list, you'll just get that value.
The choice made here is that Python comes with built-in tools and mechanisms that you can provide support for in the types you implement.
Want support for len (and, if combined with some other dunders, support for other things like iteration), you implement a __len__ for your type.
In a way, this guarantees uniformity across all types: Is it .len .length .size?
No, it's always len(object), which is supported under the hood by the dunder method __len__, whether it's a str, list, or any object that's Sized, whether its type is built-in or user-defined.
Is seek starting seeking from start in text file or there where bookmark is
I think this is the big reason. The fact that len exists makes it so that there's only one obvious way to get the length of a collection, and everything that's collection-like has to make itself work with len, because users expect it. If instead it was a method of the class, there would be no len builtin to work with, and no common base class across all collections, so nothing to enforce that the same method name would be chosen by every collection-like class.
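To illustrate the hook with a user-defined type (the `Playlist` class is made up for the example): implementing `__len__` is all it takes for `len()` to work. As a side effect, truth testing also falls back to `__len__` when a class doesn't define `__bool__`.

```python
class Playlist:
    def __init__(self, *tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)

p = Playlist("a", "b", "c")
print(len(p))            # 3
print(bool(p))           # True
print(bool(Playlist()))  # False: no __bool__, so an empty (len 0) object is falsy
```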
The second argument to seek lets you control this. It's either relative to the start of the file, the end of the file, or the current position in the file.
It also fits in nicely with all the other hooks you get to hook into Python's data model. There are a lot of "double underscore" (dunder) methods that you'll probably rarely, if ever, call yourself.
That's true, but it's not obvious why "length" ought to be part of the data model.
I can't think of any operations where the interpreter itself will call len on your object implicitly, like it does for most other dunders.
Even the legacy iteration protocol uses __getitem__ and IndexError, not __len__, IIRC
(yep, I double checked that)
Maybe not, but does that make it special enough to be a special case that breaks the pattern?
We do have the relationship between str -> __str__, repr --> __repr__, and so on
But the interpreter calls those automatically, that's my point. f"{x}" calls x.__str__, f"{x!r}" calls x.__repr__, {x} calls x.__hash__, etc. The thing that makes those methods part of the data model, instead of just regular methods, is that the language itself uses them.
I don't think that's the case for __len__. At least, I can't think of any time the interpreter itself ever calls __len__ on a user defined type.
Fwiw, it calls len in the legacy iteration protocol
In [1]: class Foo:
   ...:     __getitem__ = None
   ...:
In [4]: reversed(Foo())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-22e4a800c3a9> in <module>
----> 1 reversed(Foo())
TypeError: object of type 'Foo' has no len()
Ah, nice. Only for reversed, not regular forward iteration? That would make sense.
!e
```py
class C:
    def __getitem__(self, i):
        return i

it = iter(C())
print(next(it))
print(next(it))
print(next(it))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | 0
002 | 1
003 | 2
Yeah, forward iteration doesn't need it. Makes sense that reversed would, though.
But the interpreter calls those automatically, that's my point.
I understand that, but is that a case special enough to break the uniform pattern of being able to provide support for builtin_function(object) by implementing a dunder method?
I'd say that general, uniform pattern is more important than strictly going by whether or not it's called internally somewhere (ignoring reversed)
If I do seek(113) to start writing from 113th line in my text file, text is in 114th line, why this
well without any context as to what seek is im assuming indexing is starting at 0 which is why its line 114 starting counting at 1
how to fix it
@unkempt rock line 1 for you is line 0 internally
to access line 113 use 112 or whatever
i'm guessing the context is file like things? in that case seek works with bytes, not line numbers
file.seek(5) will go to the 6th byte in the file
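A short sketch of both points, byte offsets and the second argument to seek. It uses an in-memory `io.BytesIO` as a stand-in for a file opened in binary mode, because text-mode files restrict which relative seeks are allowed.

```python
import io

f = io.BytesIO(b"first\nsecond\nthird\n")

f.seek(5)                  # absolute offset: skip 5 bytes, next read starts at the 6th byte
print(f.read(1))           # b'\n'

f.seek(-6, io.SEEK_END)    # relative to the end of the file
print(f.read())            # b'third\n'

f.seek(0)
f.readline()               # consume b'first\n'
f.seek(2, io.SEEK_CUR)     # relative to the current position
print(f.readline())        # b'cond\n'
```

So seek never counts lines. To start at a given line you would read (or otherwise locate) the preceding lines first and use the resulting byte position.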
That's not the pattern at all. The pattern is that there's a dunder for every method that's part of the data model, and the data model is defined by the things the interpreter needs to call. Not all built-in functions correspond to a dunder method, but every method the virtual machine ever decides to call automatically does. And the only non-dunder method I'm aware of that the interpreter ever calls on its own is keys.
That's why we don't have __all__, __any__, __ascii__, __hex__, __oct__, ...
I don't agree with that statement, I think
I think there is a pattern that for objects implementing a certain interface, you have a universal way of accessing them
len(sized_object) fits that
abs(object) --> __abs__
ascii is a special function that adds some functionality on top of repr (which calls __repr__)
any does not need such an implementation because it relies on another interface an object could have (the object is iterable)
So does all
The reason I've always read is the uniform access principle to give you a single way of doing something that you can implement support for by using a special method: len originally worked with just a few builtin types, but __len__ was introduced to give you the tools to write Pythonic types of your own that could emulate that builtin behavior that worked for builtin types. From that perspective, it has less to do with a pattern of "what the interpreter needs to call" and more with the design question of how to allow the user to define types that behave similarly to built-in objects with a similar, uniform interface.
I think your definition is the special case. Clearly __init__ doesn't exist to fulfill some uniform interface, and clearly it does exist because it's part of the data model, because type calls it automatically.
(or maybe it's object that does, 🤷)
len and abs are the weird ones here. Most dunders are for things the interpreter VM needs to call.
I think it's not necessarily a special case
Dunder methods allow you to hook into the data model and basic operations within Python to write objects that act just like builtin objects do
Allowing you to support len(object) with a dunder method still falls within that categorization
__abs__, __dir__, __divmod__, __format__, __len__ - I'm able to find some dunders that exist only to support a built-in function rather than to support the interpreter. But there aren't many of them.
The grand majority of dunders are called by the interpreter itself
They allow you to hook into the data model and the workings of Python to influence how your objects behave
Yes, we agree about that. We're disagreeing about what "the data model" is.
Whether it's with builtins or with protocols like the descriptor protocol, it allows you to write objects that behave just like builtin objects would
I'm not sure about that; I wouldn't necessarily call all of that the "data model"
Not all dunder methods are described in the data model chapter of the documentation
I might have missed it in the long discussion here about len and special methods, but one factor in the design is that strings weren't objects in Python 1, so they couldn't have a .len() method. len(xyz) allowed lists and strings to be treated more uniformly.
That's really interesting
seems like bool being a subclass of int
all those little things keep backward compatibility, but IMO build up into a little bit of a mess
like you can do type(...) and .__class__
and all the aforementioned dunders
is there anywhere to read about how strings in python 1 worked?
For Python's 30th birthday last week, Skip Montanaro revived Python 0.9.1: https://github.com/smontanaro/python-0.9.1, you could check there
this just came up on my company's slack, can someone explain what the hell is going on here?
>>> None < None
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'NoneType' and 'NoneType'
>>> (None,) < (None,)
False
>>> (None,) < (True,)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'NoneType' and 'bool'
>>> [None] < [None]
False
>>> [None,0] < [None,1]
True
Why does it seem like None is only an unorderable type in certain cases?
Isnt the length of the tuple checked before the elements are compared?
Huh i guess not
It'll only start comparing after skipping equal items
It might be using "is" first
hi
hm, if so that would certainly explain it
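A few lines that show the behaviour. Sequence comparison first looks for a pair of elements that differ, treating items that are the same object as equal without consulting `==`, and only that first differing pair is ever compared with `<`. NaN makes the identity shortcut visible, since it is not equal to itself.

```python
nan = float("nan")

print(nan == nan)             # False
print([nan] == [nan])         # True: same object, so the element counts as equal
print((None,) < (None,))      # False: no differing element, so None < None is never evaluated
print([None, 0] < [None, 1])  # True: only 0 < 1 is evaluated

try:
    (None,) < (True,)         # None and True differ, so None < True is attempted
except TypeError as exc:
    print(exc)
```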
ah, beautiful, thank you
i was just getting that URL!
hahah
A UNIX style manual page and extensive documentation (in LaTeX format) are provided.
from tiny acorns, mighty oaks grow
I actually like LaTeX more than reStructuredText
but maybe it has issues with global references and such
also can't easily be rendered to HTML
Why are f-strings much faster than % and .format?
are they? I have another question: why do developers find micro-benchmarks so interesting?
b/c speed is important
I really doubt string formatting speed is going to be the limiting factor
Sounds like a simple benchmark you can write yourself tbh
I'm driving through Manhattan. I'm going to use a Lamborghini so I can get it done faster.
yes
it won't work
f-strings are faster, but in a trivial way, i.e. it probably doesn't have a large impact on your performance
.format involves a method call, f-strings don't. that might be why.
ok thanks for the info
if you find yourself in a situation where string formatting is a real issue, it may be time to use ropes instead
.format() also needs to parse the string that you pass in (find all the {} and inspect the contents inside of them), whereas f-strings are parsed at compile time.
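If you do want to write that simple benchmark, a sketch along these lines would do. The format strings and variable names are arbitrary, and the timings vary by machine and Python version, so no numbers are quoted here.

```python
import timeit

setup = "name = 'world'; n = 42"
candidates = {
    "f-string": "f'hello {name}, {n}'",
    "str.format": "'hello {}, {}'.format(name, n)",
    "% operator": "'hello %s, %s' % (name, n)",
}

# Each statement builds the same string; only the formatting mechanism differs.
for label, stmt in candidates.items():
    seconds = timeit.timeit(stmt, setup, number=100_000)
    print(f"{label:12} {seconds:.4f}s")
```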
I guess this is the biggest hit
okie
```
WARNING: some scripts are executable and have a first line saying
#! /ufs/guido/bin/sgi/python
This is unlikely to give good results anywhere else except in my
office. Edit the first line before installing such scripts; to try
them out, you can just say "python file.py". (The .py suffix is not
necessary in this case, but makes it possible to debug the modules
interactively by importing them.)
```
haha this is funny
I need help with some simple code
@normal monolith #❓｜how-to-get-help
Hi there
It's quite simple really I just need to
Do this I don't know what I'm doing wrong I'll show you
This isn't the channel for this #❓｜how-to-get-help
Oh
How do we feel about functions that return a variable amount of values?
Good practice/not good practice? Classes or dicts better?
Python doesn't have a good way to handle that result, but if it is the best way, sure
@pseudo cradle definitely don't return differently shaped tuples at different times
@pseudo cradle whats the usecase?
@spark magnet I've seen a few functions that do so based on certain kwargs. You don't think it's a good idea?
no, it sounds bad. can you show an example?
Sure one sec
I can't find it immediately on hand, but there's a scientific function that generates a list of values and they have a kwarg "return_times" that additionally returns a list of the times those values were recorded at
So the return is either list or list, list
i would rather always return the same kind of result.
if it's expensive to keep the times, then let a kwarg mean "real times or zeros"
Or just replace it with None or something like that, much more discoverable when it's called improperly
Yeah there's a financial and computational cost to performing the full function
Or I'd just always calculate it
Yeah, this is a good idea and probably what I'll implement
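A sketch of the "always return the same shape, fill in None" approach. `sample_signal` and its fake data are placeholders, not the scientific function mentioned above.

```python
from typing import List, Optional, Tuple

def sample_signal(
    n: int, return_times: bool = False
) -> Tuple[List[float], Optional[List[float]]]:
    """Hypothetical stand-in: always returns (values, times_or_None)."""
    values = [i * i / 10 for i in range(n)]
    # Only do the (pretend) expensive time bookkeeping when asked for it.
    times = [i * 0.5 for i in range(n)] if return_times else None
    return values, times

values, times = sample_signal(3)
print(values, times)  # [0.0, 0.1, 0.4] None

values, times = sample_signal(3, return_times=True)
print(times)          # [0.0, 0.5, 1.0]
```

Callers can always unpack two values, and forgetting the flag shows up as a `None` rather than as a differently shaped result.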
Does your function handle multiple responsibilities for it to return different kinds of values at different times?
the train_test_split utility from sklearn takes *arrays and returns a list of length 2 * len(arrays) which is one of the rare cases where it makes sense, in my opinion:
>>> from sklearn.model_selection import train_test_split as tts
>>>
>>> data = [1, 2, 3]
>>> tts(data)
[[1, 3], [2]]
>>>
>>> data2 = [4, 5, 6]
>>> tts(data, data2)
[[1, 2], [3], [4, 5], [6]]
>>>
>>> data3 = [7, 8, 9]
>>> tts(data, data2, data3)
[[3, 1], [2], [6, 4], [5], [9, 7], [8]]
splitting variable amounts of vectors here cannot be done with separate function calls because they need to be split the same way (the vectors must be same length, and the function decides the indices that go into the lhs and rhs vector and splits all arrays the same way)
although that's actually not a good example at all since it always returns a single list lol, just the length changes
But it's still always returning a list of something. It's not like once it will return a value, then the list, and some other time it will be a tuple - the return type stays the same and is clearly defined
yes I realise now that it's not particularly relevant sorry
I would say it is fine in this case, since that kwarg will generally be static
for example, df.boxplot has a return_type kwarg, which makes it return either pyplot axes, a dict, or both.
For the topic... I once had to modify my complex code to move some more info outside of some functions (client wanted the info I dropped early to be present in notification later). So I changed it so that I would return two values (aka a tuple, if return wouldn't be unpacked). But it was only sometimes that the other value would be present. I added None to the other return case in the function so that I'll always have this pair tuple (and I unpacked it instantly), instead of having to check for the return type etc.
So generally I work with fairly complex computationally expensive algorithms. Due to the cost, I only want to compute things that are absolutely required. So 9 times out of 10, the function only needs to do X, but 1 time out of 10, it might need to do X + Y. One problem is that the function may be called in a lot of different contexts, maybe in a data research pipeline, maybe in an app, etc.
But dealing with how the returns are handled has been a design problem I've been struggling with the last few days
One suggested by my boss has been to simply have a dict, which I think is fine for internal workings within an application, but kind of annoying to deal with as a module somebody is using (which is another usecase)
why not make 2 functions, one which computes both and one which computes just one thing
I could, but I'm oversimplifying, it could be X + y, or X + Z or X + Y + Z....
I see the logic, but it might get out of hand pretty fast
I do think various alternative functions are a solid option, but you are right it could get complex fast
if you think that passing flags to the function for what to include and exclude would be nicer, well, do that
Yeah, it would be nicer in this context. The only problem is how to handle the returns. My initial attempt (foolishly) just returned tuples of varying sizes, but that is error prone as one might imagine
is it? You will get an error if you unpack into a wrongly sized tuple
Well, I got yelled at for doing it by people who probably know better on this server, haha
well, you definitely shouldn't return them based on varying input values, but I don't think there is much trouble when you do that based on static flags.
Yeah, I agree with not returning on varying inputs, but hmm. Maybe I won't discount it based off of static flags yet.
It's a tricky problem, there's so many things to consider.
At least, that's my perspective when discussing it with other people
no idea about the context, but would classes make sense?
if you want to calculate certain pieces of the equation/calculations and make them available on an as-needed basis, you could always do that. heck, could even design a "lazy computation" for the different pieces as properties on your class for example. just thinking out loud
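A minimal sketch of the lazy-property idea, assuming Python 3.8+ for `functools.cached_property`. The class name `Analysis` and the bodies of `x` and `y` are hypothetical stand-ins for the real, expensive computations:

```python
from functools import cached_property


class Analysis:
    """Hypothetical result object: each piece is computed only on first access."""

    def __init__(self, data):
        self.data = list(data)
        self.calls = []  # records which computations actually ran

    @cached_property
    def x(self):
        self.calls.append("x")
        return sum(self.data)  # stand-in for the always-needed result

    @cached_property
    def y(self):
        self.calls.append("y")
        return max(self.data) - min(self.data)  # stand-in for the rarely needed result


result = Analysis([3, 1, 4])
total = result.x        # computes x
total_again = result.x  # served from the cache, x is not recomputed
```

Callers that never touch `result.y` never pay for it, and the return type stays the same in every context.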
I think it would be fantastic. My only concern/caveat would be if it was possible to make an instance of this class json serializable
Also classes are scary
jk, I need to use them more
Named tuple would still be a tuple but object-like as well. So you can access it either by index or by doing o.name
I've used namedtuples before and I like them. Dataclasses, too.
I've only used named tuples for the cases like i described above - when I need to return several things but started to get confused about the order. XD
I wish i had discovered them earlier, tbh
Because then my code would be clearer
Have you messed around with dataclasses? A colleague made a strong sell to me about dataclasses over namedtuples and I was convinced
They're similar enough
But dataclasses aren't iterable
I'm not doing a big enough stuff to really get into it. Mainly integration scripts between systems. Or generating reports from one system. Mostly standalone stuff, basically
It's always so interesting to hear about what people do. The programming field is more diverse than I gave it credit for
I could live (and I did) without the named tuples. I mainly deal with stuff with list/dict comprehensions etc
Another option would be to use dicts where the presence or absence of a key depends on whether or not the flag requesting that key was provided.
Yeah, that's the approach my boss originally suggested
Nicer than a variable length tuple because callers don't need to know what order the fields will be returned in.
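A small sketch of the dict approach, assuming static keyword flags. The function name `analyze` and the keys `total`, `mean` and `range` are made up for illustration:

```python
def analyze(values, *, include_mean=False, include_range=False):
    """Always compute the total; add optional keys only when their flag is set."""
    result = {"total": sum(values)}
    if include_mean:
        result["mean"] = sum(values) / len(values)
    if include_range:
        result["range"] = max(values) - min(values)
    return result


basic = analyze([1, 2, 3])
full = analyze([1, 2, 3], include_mean=True, include_range=True)
```

The return type is always a dict, callers look results up by name rather than by position, and the result is JSON serializable as long as the values are.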
nod
I just haven't really seen any public open source modules use that approach, so was wondering if there was some inherent bad reason to use it
I think classes are generally the better option because of the ability to lazily compute things, but it's a bit annoying to make arbitrary classes json serializable, so that might be enough to sway me to the dict approach
Yeah, that's an annoying concern I have. Most of my returns will also have to be json serializable in one way
Do jsons just use the str dunder?
does json serialization, rather?
Nope.
By default, it will just give an error if you try to serialize a user defined class. dump and dumps take a default= kwarg that you can set to a function that takes an object and turns it into something JSON serializable.
object_pairs_hook can reverse that transformation on the decoding side, though that's annoying as well
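A hedged sketch of both directions using only the standard library. `Result` and `to_jsonable` are hypothetical names, and the decoding side uses `object_hook` (the simpler sibling of `object_pairs_hook`) with a deliberately naive "does it have exactly these keys" check:

```python
import json
from dataclasses import asdict, dataclass, is_dataclass


@dataclass
class Result:  # hypothetical return type
    total: int
    mean: float


def to_jsonable(obj):
    """Called by json.dumps only for objects it cannot serialize itself."""
    if is_dataclass(obj) and not isinstance(obj, type):
        return asdict(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


encoded = json.dumps({"result": Result(6, 2.0)}, default=to_jsonable)

# Decoding side: object_hook receives every decoded JSON object as a dict.
decoded = json.loads(
    encoded,
    object_hook=lambda d: Result(**d) if set(d) == {"total", "mean"} else d,
)
```

The encoding half is straightforward; the decoding half is where it gets annoying, because the JSON text carries no type information and the hook has to guess.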
!e ```py
import sys
s = ""
for _ in range(5):
    print(sys.getsizeof(s))
    s += "à"```
You are not allowed to use that command here. Please use the #bot-commands channel instead.
Does anyone here know why this makes the memory instantly go up by 24 bytes once the character is added?
I’m wondering what it’s doing which makes it need to do that
It prints: 001 | 49 002 | 74 003 | 75 004 | 76 005 | 77
And it only does it when a non-ascii character is added
Otherwise, if you’re just adding “a” for example, it only goes up by 1 each time, so 49, 50, 51, 52, 53
it's due to how non-ascii characters are handled internally
What’s it using the extra memory for?
Yeah
It’s confusing because it’s a one character string
The extra space is to store the UTF-8 representation of the string, mostly. If a string contains only ASCII characters, the string as an array of 8 bit Unicode code point numbers and the string as an array of UTF-8 bytes are the same, so it can use one array for both purposes
If a string contains non-ASCII characters that optimization no longer applies
So it needs another array for a 2nd representation?
Yep.
With ASCII, the two different representations are byte for byte identical, so they can be shared.
Outside of ASCII that's not true and they must be separate.
What gg said
You can store any ascii character in a single byte
In fact, in C it was often a handy way of defining a single byte integer of value 0-255, lol
That's one of the things I had to unlearn when going to Python
So both arrays are using the same space in memory?
For ASCII, yep.
Because i see that it only ever increases by one byte per character
For that example
!e One chunk of memory can be interpreted as either an array of 8 bit integers representing Unicode codepoint numbers, or as an array of 8 bit integers representing UTF-8 strings, because for values from 0 through 127, it holds that:
for x in range(128):
    assert chr(x) == bytearray([x]).decode("utf-8")
@raven ridge :warning: Your eval job has completed with return code 0.
[No output]
!e whereas for 128:
x = 128
assert chr(x) == bytearray([x]).decode("utf-8")
@raven ridge :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 2, in <module>
003 | UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
The only integers with this property are the numbers from 0 through 127, inclusive - which is the ASCII compatible subset of UTF-8
That is, for a Unicode codepoint below 128, and only for those 128 codepoints, the UTF-8 representation of that codepoint is a single byte containing the codepoint number.
@raven ridge this doesn't explain why after the string has jumped in size (supposedly to hold both a utf8 and a fixed-width form of the string), adding one character only adds one byte to the size.
The link I pasted at first does. The extra 24 bytes store a pointer to the UTF-8 representation, the size of the UTF-8 representation, and the number of codepoints in the string
Each of those 3 things is 8 bytes.
After that original jump, adding one character just adds a single integer to an array of 8 bit integers. Because the character is in the Latin-1 range of Unicode (codepoints below 256).
If you add a character with a codepoint of 256 or higher, you'd see another jump.
At that point it needs to switch from an array of 8 bit integers to an array of 16 bit integers, and adding à after that will add 2 bytes each time instead of 1.
It does look like sys.getsizeof() doesn't include the size of the cached UTF-8 representation, though...
If you add 🤯 then it increases by 3 extra
```py
import sys
s = ""
print(sys.getsizeof(s)) # 49
s += "à"
print(sys.getsizeof(s)) # 74
s += "🤯"
print(sys.getsizeof(s)) # 84
s += "a"
print(sys.getsizeof(s)) # 88```
Even though it seems like it would be 81
!e That's because:
print(ord("🤯"))
print(2**16)
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | 129327
002 | 65536
The codepoint for 🤯 is higher than the biggest unsigned 16 bit integer, so Python needs to switch to using 4 bytes per character instead of 1 or 2 when you add it.
When you add à it jumps from 1 byte per codepoint to 2, and when you add 🤯 it jumps from 2 bytes per codepoint to 4.
But what are the 3 extra bytes
Since it goes from 74 to 84 even though 74 - 1 + 4 + 4 = 81
Ah. Because "🤯" is stored not as a single codepoint but as a surrogate pair, I think, making it 72 - 2 + 4 + 4 + 4
though it's weird for it to be using both surrogate pairs and 4-byte integers for the codepoints, so I could be wrong about that. I'd expect it to use one or the other, not both...
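One way to check the arithmetic without relying on absolute sizes (which differ between CPython versions and platforms) is to measure the growth per appended character. This sketch assumes CPython. CPython's compact strings also reserve one extra slot for a trailing null terminator, which would account for the jump from 74 to 84 reported above: a 72-byte header plus (2 + 1) slots of 4 bytes each, with no surrogate pair involved. The helper name `bytes_per_char` is made up for the example:

```python
import sys


def bytes_per_char(ch, n=10):
    """Growth of sys.getsizeof when one more copy of `ch` is appended."""
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


# "a", U+00E0 (a with grave), U+0100 (first codepoint past Latin-1), U+1F92F (emoji)
widths = {ch: bytes_per_char(ch) for ch in ("a", "\u00e0", "\u0100", "\U0001f92f")}
print(widths)

# The emoji is a single codepoint in a Python str, not a surrogate pair.
emoji_length = len("\U0001f92f")
```

On CPython the four widths come out as 1, 1, 2 and 4 bytes per character, matching the 1-byte, 2-byte and 4-byte array kinds described in the discussion.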
non-ascii chars in utf8 are at least two bytes, including 128-255
yes - like I said, I'm pretty sure the size of the UTF-8 representation isn't being counted at all, just the size of the pointer to it
codepoints in 128-255 take 2 bytes to represent in UTF-8, but can be represented as a 1 byte uint8_t holding the codepoint number, instead.
So both pointers do use two different spaces in memory, it’s just not paying attention to the space for the other array?
Actually can you have two arrays which are using the same memory space in C?
Or is that not possible?
@sand goblet for the utf8 string, it's not the same memory
yep. The other array is created lazily, I believe - it may not even exist, but there's still a pointer that may or may not point to it.
You can interpret one array of bytes in multiple different ways. You could interpret an array of 10 4-byte integers as an array of 40 1-byte integers, instead.
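The same idea can be sketched from Python with the `struct` module, using an explicit little-endian format so the sizes do not depend on the platform:

```python
import struct

# Ten 4-byte little-endian integers packed back to back: 40 bytes in total.
raw = struct.pack("<10i", *range(10))

as_ints = struct.unpack("<10i", raw)  # the buffer read as 10 four-byte integers
as_bytes = list(raw)                  # the same buffer read as 40 one-byte integers
```

Nothing in `raw` records which interpretation is "right"; the reader decides how wide each element is.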
Yeah I guess doing that wouldn’t make sense
Since you don’t need the array to tell the program how to interpret its data
It just needs to know the size of each piece of data
in the case where CPython is doing this optimization - for all-ASCII strings - it works because "an array of characters as a UTF-8 encoded string of ASCII compatible characters" and "an array of single-byte integers representing the Unicode codepoints of a string of ASCII compatible characters" are represented exactly the same in memory, because a character is a one-byte integer type in C.
!e ```py
abc = "abc"
array_of_1byte_ints = b''.join(ord(c).to_bytes(1, "little") for c in abc)
array_of_bytes_in_utf8 = abc.encode("utf-8")
print(array_of_1byte_ints == array_of_bytes_in_utf8)```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
True
the representation of "abc" as 3 single-byte integers back to back in memory is b"\x61\x62\x63". The representation of "abc" in UTF-8 is b"\x61\x62\x63". These are the same, so CPython can cheat and use one array, knowing that if it made two arrays they'd just hold exactly the same data.
they're two conceptually different things, but for strings that consist of only codepoints between 0 and 127, they will always be equal.
Why can’t it use the same 1 byte array for à if its ord value is under 2^8?
because utf8 for codepoints 128-255 take two bytes each
!e ```py
print(len("à".encode("utf-8")))```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
2
that's how utf8 works. If it allowed 256 8-bit chars, there'd be no bytes available for encoding codepoints beyond 255
the most significant bit of a byte is set if the byte is a part of a multi-byte representation of a codepoint, and unset if it is a single byte codepoint. That leaves 256 / 2 == 128 single-byte codepoints (the ASCII codepoints), and all others must be multibyte.
So the last bit tells it if it needs to keep reading?
first bit, but yes. Not just if it needs to keep reading, but how far to read.
https://en.wikipedia.org/wiki/UTF-8#Encoding has a table that shows it well.
It tells it either if it’s 8 bits or 16 bits
@sand goblet some codepoints need up to 4 bytes.
(the utf8 encoding scheme can go up to 6 bytes, but Unicode only needs 4)
the first byte tells you whether the codepoint is represented by a single byte (if it starts with "0" as the most significant bit), or two bytes (if it starts with "110") or three (if it starts with "1110") or four (starts with "11110")
Oh
and all of the bits marked with "x" in the table I linked are the actual bits that make up the codepoint number.
So that’s how it can use 7 bits, because if it’s a character under 128 then it only needs to read the first bit to determine that?
right. and that leaves 7 bits to store a value, and 2**7 is 128, so it can store values between 0 and 127.
and those 128 characters happen to be exactly the ASCII characters.
not by coincidence; UTF-8 was designed to be backwards compatible with ASCII.
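The leading-bit rule from the table can be sketched as a small function. The name `utf8_sequence_length` is made up for this example, and it only inspects the first byte rather than validating the whole sequence:

```python
def utf8_sequence_length(first_byte):
    """Return how many bytes a UTF-8 sequence has, judging by its first byte."""
    if first_byte >> 7 == 0b0:        # 0xxxxxxx: single byte (ASCII)
        return 1
    if first_byte >> 5 == 0b110:      # 110xxxxx: two bytes
        return 2
    if first_byte >> 4 == 0b1110:     # 1110xxxx: three bytes
        return 3
    if first_byte >> 3 == 0b11110:    # 11110xxx: four bytes
        return 4
    raise ValueError("continuation byte (10xxxxxx) or invalid start byte")


# "a", U+00E0, U+20AC (euro sign), U+1F92F (emoji)
for ch in ("a", "\u00e0", "\u20ac", "\U0001f92f"):
    encoded = ch.encode("utf-8")
    bits = " ".join(f"{byte:08b}" for byte in encoded)
    print(f"U+{ord(ch):04X}: {bits}")
    assert utf8_sequence_length(encoded[0]) == len(encoded)
```

Printing the bits makes the prefixes visible: every byte after the first in a multi-byte sequence starts with `10`.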
But I forget why if you have the string "à", sys.getsizeof says its size only goes up by 1 per character added, instead of two
It’s because the array it’s measuring just uses 8 bits?
I think sys.getsizeof() is ignoring the (lazily created) UTF-8 representation, and is only counting the array-of-codepoints that the string owns. And that array is an array of 1-byte, or 2-byte, or 4-byte unsigned integers.
and doesn’t do the leading bit thing?
so because all of the codepoints in "ààà" fit inside a 1-byte unsigned integer, it can use just 3 bytes to represent that, and just stores b"\xe0\xe0\xe0"
each time you append another "à" to the string, it adds an extra \xE0 byte to its array of single-byte integers storing codepoint values
And then if the ord value for a character is past 2**8, it moves everything in a 2 byte array
yep.
And then into a 4 byte array if it’s over 2**16
!e And that \xE0 I mentioned is because:
print(ord("à").to_bytes(1, "little"))
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
b'\xe0'
Alright, I think all of it makes sense now. Thanks for explaining it, guys
Hi I would like to know how to detect hand motion, hand movement gesture and convert it into some commands, I found leapmotion but need to buy the hardware, can Opencv detect hand motion? Like swipe up to scroll up the pages of my monitor. Anywhere I can get a mentor?
This discussion is about python implementation itself, this would be probably more fitting for a help channel
GodlyGeek, are you the same godlygeek with the github vim/nvim repos?
Why does it not need to store the number of codepoints until it makes the utf-8 array?
I'm godlygeek on GitHub too, yes
It looks like it's got two different counts of codepoints, one that counts surrogate pairs as 1 and one that counts surrogate pairs as 2, if I'm reading the code right.
And for the ASCII case it can just use one length field for both things because it knows there cannot be any surrogate pairs
Alright, that makes sense. Thanks
@hasty portal this is not a help channel #how-to-get-help
alright thanks
can someone help me with OCR?
I have a sample but I cant get it to get the right characters even though the text is literally a screenshot of a text file
hey guys, can anyone with python knowledge take a look at this issue ?
https://github.com/psf/requests/issues/5384
The responsibility for this strange behavior was blamed on:
https://github.com/python/cpython/blob/master/Lib/http/client.py#L903
But it seems like someone should look into this because it is causing headaches for anyone that needs to run custom web servers against proxy clients
and if anyone needs more information than has been provided in the original issue, I am always here
@unkempt rock by "separate transmission" you mean in two separate send() calls?
yea
they sometimes
and only sometimes cause the connect to be split into two network frames
meaning you have to read twice on the other side to get the full message
normal web servers already take this into account because they also serve normal traffic that needs it
but for CONNECT it's not really needed and you can just do one write
i didn't think any correctly written network service would care how many frames were used?
but i'm no expert
no, normally one doesn't care
but when building custom applications that do really high throughput
it matters a lot
and I found this particular code to cause A LOT of performance degradation because of this multiple write implementation
if you change it so that the request is formed and then sent as a whole, you could save a lot of implementation complexity on the other side
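A rough sketch of "form the request, then send it as a whole". This is not how `http.client` is written; `build_connect_request` is a hypothetical helper, and `socket.socketpair()` merely stands in for a real proxy connection. Note that a single `sendall()` call hands the kernel one buffer but does not by itself guarantee a single packet on the wire:

```python
import socket


def build_connect_request(host, port):
    """Assemble the whole CONNECT request, headers included, as one bytes object."""
    lines = [
        f"CONNECT {host}:{port} HTTP/1.1",
        f"Host: {host}:{port}",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_connect_request("example.com", 443)

# socketpair() stands in for a real proxy connection in this sketch.
client, proxy = socket.socketpair()
try:
    client.sendall(request)  # one write for the whole request
    received = proxy.recv(4096)
finally:
    client.close()
    proxy.close()
```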
and the annoying part about this is that it only starts to break once you reach high network traffic
so it was super hard to replicate
and find for that matter
sounds like you should open a bug at bugs.python.org, or make a pull request.
that's the thing, I don't use python :S
I just joined in here because this issue was dead for almost a year now on that other repo
and I figured I should maybe try and come into this discord and poke some python pros
what involved you in this at all if you don't use python?
the way to involve the python pros is to write an issue at bugs.python.org
I was building a global proxy network with a ton of users and once our traffic spiked this started breaking it
so here I am
:S
could you perhaps assist me if I run into trouble while making a ticket? because I have 0 knowledge of python or how things work around these parts :S
i'm not sure what help you'll need, it's a bug tracker
though tbh, if the bug is, "things go slower", they might not consider that a bug.
well, an enhancement then ?
i'm not really concerned with what it's called as much as how it'll improve things (hopefully)
I'd change the title to mention the module involved, not the line of code.
better ?
great
okay, i'll leave it as is .. thank you for the help ❤️
reduce(lambda v,f: f(v), list_of_callables, init_val)
Is one of the best ways of applying a list of callables iteratively.
Change my mind.
i would put that into a function if i were going to use it, and then once it's in a function, i'd write it as a for-loop.
It's just a very cryptic way of writing function composition, right?
compose(foo, bar, baz)(fizz_buzz)
it's equivalent to this
for idx, fnc in enumerate(list_of_callables):
    if idx == 0:
        intermediate = fnc(init_val)
    else:
        intermediate = fnc(intermediate)
right, so function composition
it's equivalent to:
def apply_all(fns, val):
    for fn in fns:
        val = fn(val)
    return val
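To see that the `reduce` one-liner and the loop agree, here is a small runnable comparison. The `compose` helper is hypothetical and applies its arguments left to right, which is an assumption about the intended order:

```python
from functools import reduce


def apply_all(fns, val):
    """Feed val through each callable in order, left to right."""
    for fn in fns:
        val = fn(val)
    return val


def compose(*fns):
    """Hypothetical helper: compose(f, g, h)(x) == h(g(f(x)))."""
    return lambda val: apply_all(fns, val)


steps = [str.strip, str.upper, lambda s: s + "!"]
via_reduce = reduce(lambda v, f: f(v), steps, "  hi ")
via_loop = apply_all(steps, "  hi ")
via_compose = compose(*steps)("  hi ")
```

All three produce the same string; the choice between them is about readability, not behavior.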
(not sure how enumerate got mixed into it)
bc im a dumbo sometimes
can someone help me with OCR
OCR being?
i guess i have to try it with coverage.py soon
God i want it
get it
PATTERN MATCHING COMING BABY
you can clone cpython, that's what I did
I imagine that it's a pretty complicated thing to implement in linters/typecheckers/other inspection tools
i hope it's no big deal
But I swear, it would be a mess to learn.
DOG = 'dog'
CAT = 'cat'
def speak(animal):
    match animal:
        case DOG:
            print("woof")
        case CAT:
            print("meow")
speak("cat")
print(DOG)
I wonder what happens.
yup, I'm not very fond of all that business...
Well, no going back so we'll see how it looks in practice
Haskell/Purescript get away from that because all variables start with a lowercase letter, and all type constructors with an uppercase one
would case +DOG: work?
probably
import myscript
myscript.dog
sees unary +
javascript flashbacks
maaaybe .DOG
@severe lichen i'm kind of glad that the capitalization is ad-hoc: it lets us get away from knowing what type things are.
class? function? why does it matter?
It makes for very annoying name clashes
like lists = [[], []]
for list in lists:
    print("Now your list constructor is reassigned")
I'm fairly sure backwards comp plays a role in builtins being single word all lower but only constants and exception classes using cap words helps distinguish a bit
yes, the built-in names would be nice to use for our own purposes.
though, "list" is a bad variable name anyway
but then
removeprefix/removesuffix??
I guess it's to be consistent with the other methods...
Also, module name clashes.
like json = response.json()
json = {"reassigned_module": "Yes"}
Well, it's either four-letter word for a local variable or local_variable_used_for_very_specific_purpose_by_LordDeathXXX_check_my_patreon
Yeah that's most probably for compat with the other string methods that are ancient
Pretty sure straight from C. Just like returning -1 when substring is not found in string
@silver canyon You should probably ask in #discord-bots.
Overall naming could've used a much bigger overhaul in the 2->3 transition along with the other big breaking changes, don't think we'll get anything anytime soon because of that
oh ok
there are so many channels
You can introduce 3 new keywords in Py4 and backward compatibility be damned. What are they? Also, you get to make a new operator.
what are you talking about?
the devil is making a deal with the PSF through Oouja
I'm talking about what would you do if you had creative freedom and no concern for legacy.
well, if you were allowed to introduce new semantics...
No braces, though. It's a personal request from Guido.
I see, you both have a deal going already...
Yeah, that was for not making Perl the main language for ML
Well, I would like to have a 'canonical' type checker, sort of like TypeScript. With no weird bugs everywhere, no weirdness (like how mypy treats callable attributes' annotations), and support for slightly more advanced things like generic (polymorphic) values (not just functions), good type narrowing, maybe manipulation of records.
Another issue (which is also the case with TypeScript, I suppose, but it's not a frequent use-case) is that some typing constructs like dataclasses or ORM models aren't expressible in the type system itself -- you just hope that the typechecker has a plugin system, and that someone has implemented a plugin for your favourite metaclass-infested library
Btw, about type narrowing. I remember C# pattern matching having cool stuff like
void speak(Animal animal) {
    match (animal):
        animal as Cat:
            animal.meow()
        animal as Dog:
            animal.woof()
}
I don't remember the exact syntax, but you get the idea - typecasting and matching in one line
Okay, looked up, I forgot everything from C# pattern matching. -_-
Back to Python, then!
Also, with typing.NamedTuple (what a weird namespace to place it), i feel record manipulation is in a pretty good place.
The only gripe I have is that NamedTuple and json dicts don't have a common interface. book["author"] would fail for a NamedTuple.
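Two stdlib-only ways to bridge that gap, sketched with a hypothetical `Book` record and a made-up `get_field` helper: convert with `_asdict()`, or read through a small accessor that accepts both shapes.

```python
from typing import NamedTuple


class Book(NamedTuple):  # hypothetical record type
    title: str
    author: str


book = Book(title="Dune", author="Frank Herbert")
as_dict = book._asdict()  # dict view of the same fields


def get_field(record, name):
    """Read a field from either a mapping or a namedtuple-like object."""
    if isinstance(record, dict):
        return record[name]
    return getattr(record, name)
```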
It's not that weird - the original is collections.namedtuple
then why not collections.NamedTuple?
It serves to add typehints to a namedtuple, makes sense to place it in typing. Same thing for TypedDict
or collections.typedtuple
But there's no untyped variant of dataclass, and the typed variant is in dataclasses...
and NamedTuple doesn't just add types, it also allows adding methods
without an extra inheritance step
It wasn't like that originally.
3.6 added more features to it, but moving the namespace at that point would be a breaking change.
It's certainly an outlier, especially without knowledge of historical context. Even so, it still isn't that weird.
With the types originally being just typed versions of collections, it made sense to put them in a separate namespace. Otherwise it would cause confusion. Maybe it could have been collections.typed.NamedTuple but I'm not going to argue that.
i collect all important data from my high school days
now its time to arrange them
i hv some yt videos which is no longer available on youtube I need a program which will take the title from my local file and put it on yt and verify if its deleted or still available
i hv some audiobook which is now not available on audible (all arranged)
i hv some ebook
i hv 4TB+ Local Documentary Collection
i hv enormous amount of data
but manually checking all 700000+ Files is too much
what i want is
a program that will take names from my local files and put it into google and check what file it is
is it a series or Documentary
then move that files in specific location as well as custom location
like
government & military (local folder)
i want any documentary which is about politics or military to be put inside same folder
plz help me
otherwise manually it will take 6+ month
i hv already wasted 7 month
in return i will give access to all the files i hv
its 17TB + in total
one 6TB hdd hv 3.7TB documentary
Directory:-
https://www.mediafire.com/file/xxxldw461bt6r0j/6TB_HDD_Video.html/file
@knotty plank wrong channel
@knotty plank in a help channel
maybe #data-science-and-ml tbh
can someone help me with pytesseract
hey all, let me know if I should ask this somewhere else. I'm trying to do this:
from typing import TypedDict, Final
account_schema: Final = {"email": str}
AccountBase = TypedDict("AccountBase", account_schema)
class Account(AccountBase):
    account_id: str
AccountPatch = TypedDict("AccountPatch", account_schema, total=False)
Idea is that I have two types used in communication in my database. Account has all account data fields + ID itself present (returned by my SELECT *s). AccountPatch has all account data fields optional, with no account_id present (because ID can't be updated)
since account_schema is Final, why can't it be treated as a Literal dict? I get this error when running my type checker (mypy):
account.py:4: error: TypedDict() expects a dictionary literal as the second argument
account.py:9: error: TypedDict() expects a dictionary literal as the second argument
actually Account and account_id are irrelevant to the example, should have focused on why AccountBase and AccountPatch specifically don't accept Final dicts
A literal and the specifier final are different.
I stumbled on https://www.python.org/dev/peps/pep-0586/#interactions-with-final which says:
The Final qualifier serves as a shorthand for declaring that a variable is effectively Literal.
is there something here that makes that not apply?
do you think that the dict being mutable could have anything to do with it? cause theoretically I could do account_schema["b"] = "c" and Final doesn't stop the type checker from allowing that
whereas the pep really only gave an example for int, which is immutable
Hm, that could be a good intuition, yeah that'd make sense.
Final just means you can't reassign it, doesn't mean you can't change its members.
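One possible workaround, sketched under the assumption that repeating the field list is acceptable: use the class-based TypedDict syntax, where the fields are always spelled out literally. This needs Python 3.8+ for `typing.TypedDict`, and it does not remove the duplication the shared `account_schema` dict was meant to avoid:

```python
from typing import TypedDict


class AccountBase(TypedDict):
    email: str


class Account(AccountBase):
    account_id: str


class AccountPatch(TypedDict, total=False):
    email: str  # repeated by hand: every key here is optional


row: Account = {"email": "a@example.com", "account_id": "42"}
patch: AccountPatch = {}  # valid, since no key is required
```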
Yea I'll just create a mypy issue for it, may be bug but might also be intended
Error coming can anyone help?
What error?
Ah, also, this isn't the channel for that. This is a channel for discussing python syntax and other stuff in Python itself
any admin/mod here
Ping them? There seem to be a few on
Don't ping mods unless you need something moderated
Can someone enlighten me on how this dispatching works? https://stackoverflow.com/a/24064102/9063378 I don't understand how __get__ knows what method to return based on the type of the arguments.
coverage.py with pep636 pattern matching: https://imgur.com/a/7e6gScO (no changes to coverage.py yet)
Nice! If you use pair = (0, 12) does it correctly flag line 8 as missing?
That's pretty good for out of the box tracing support
Are you generally familiar with how descriptors work and when __get__ gets called?
Yes but I'm not sure how it's dispatching based on the type in the method argument annotation
well, it's not - it's relying on singledispatch to do that
it's just making it so that the singledispatch.register call is being performed on a bound method, rather than on a function, if I'm understanding correctly
I think it's that closure in __get__ where it happens
I'm trying to make my own version of multiple dispatch for work.
right, the closure in __get__ is looking at the first of the *args that were passed to a bound method, which already excludes self, if I understand correctly.
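Since Python 3.8 the standard library also ships `functools.singledispatchmethod`, which dispatches on the type of the first argument after `self`. A minimal sketch with a hypothetical `Formatter` class:

```python
from functools import singledispatchmethod


class Formatter:
    @singledispatchmethod
    def render(self, value):
        return f"object: {value!r}"  # fallback for unregistered types

    @render.register
    def _(self, value: int):
        return f"int: {value}"

    @render.register
    def _(self, value: str):
        return f"str: {value}"


fmt = Formatter()
rendered = [fmt.render(1), fmt.render("a"), fmt.render(2.5)]
```

It only looks at one argument's type, so dispatching on a literal key plus the type of its value would still need custom logic on top.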
yeah, i just debugged it locally. I just want to fully understand this. I wish I could get pattern matching at work bc it would literally solve what I want to do. I'm trying to dispatch on literal value of key and the type of the value for that key. So the args to my methods are the key values and their type annotation is the type of the value.
we made multiple dispatch in here before https://github.com/salt-die/Snippets/blob/master/overload.py
Not sure if this has been posted before, but there are exploits in the C used in cpython, see this:
"floating points as untrusted input"
that sounds exceedingly rare
"sprintf is used unsafely" LOL
yeah, sounds exceedingly rare. But it has already been patched, too.
Is it possible to make anti ddos with Python?
Does this seem like a bad idea?
class IndexableMeta(type):
    def __getitem__(cls, key):
        return cls.registry[key]
Depends on what the registry does I guess. You could also use __class_getitem__
Doesn't the data model say not to do that with __class_getitem__?
It's discouraged, but I can't really find a reason for that and using it instead of specifying a metaclass looks much clearer to me
Yeah, I mean I guess the metaclass is a bit much. At the end of the day I am doing a subclass registry and I just think being able to index on the parent class (i.e. the class holding the registry) is much clearer. However I can see how others don't think that.
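A sketch of how the metaclass could combine with `__init_subclass__` to build such a subclass registry. The names `Processor`, `UserProcessor`, `OrderProcessor` and the `key` attribute are hypothetical:

```python
class IndexableMeta(type):
    def __getitem__(cls, key):
        return cls.registry[key]


class Processor(metaclass=IndexableMeta):
    registry = {}  # shared by every subclass: effectively global state
    key = None     # hypothetical attribute that subclasses override

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.key is not None:
            Processor.registry[cls.key] = cls


class UserProcessor(Processor):
    key = "user"


class OrderProcessor(Processor):
    key = "order"


found = Processor["user"]
```

Because `registry` lives on the parent class, there is exactly one per process, which is worth keeping in mind when tests need different registry contents.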
@swift imp it's a global variable, so it's bad
What is a global variable?
I don't understand what you are saying
A global variable is something you can access from any place in the program. i.e. not a local variable.
.>
in this case, cls.registry is global: there's only one of them for your process.
I mean, what variable in this snippet is global?
there is one cls.registry per class that uses that metaclass
Which is the point
Right
I would just use self for that argument, rather than cls
depending on how it's used, you have all the downsides of a global variable
What's the difference between
```py
class Scoring:
    score = 0
    def add(n):
        Scoring.score += n
``` and ```py
score = 0
def add(n):
    global score
    score += n
```?
Right, I see
I guess you could make some sort of 'cascading constants', like CSS variables. But I don't know anywhere it's useful besides styling stuff
yeah so I was going to do __setitem__ too
yeah, then it's just a global constant @flat gazelle
what is the actual goal here?
I actually like how CSS variables work
I have a json schema set up with the jsonschema module and I want to define one registry class per top level object in the json, whose subclasses provide a processor method that will be dispatched to depending on the attributes of the json object.
Basically the registry keys are the subschemas, stringified, and the values are the subclasses providing the appropriate processor.
I would just add a classmethod on the parent class
classmethods instead of subclassing?
@swift imp if these are truly constants, then it could be ok. consider if you want to test with different sets of values in that registry.
overall, this isn't that bad of a solution though
Thanks everyone, this is why I love this channel because I'm considered the expert at work and I'm definitely not lolol
I should probably ask more often here about the stuff I make in my personal projects
because sometimes it comes out as total garbage that doesn't make sense
are you and i working on the same projects!?
Wormhole bug in git?
Type hint question: if I want to annotate attributes of a class, is it better to do that in __init__ (partially by using type inference), or to do that explicitly in the class declaration?
Or maybe I should specify the public attributes in the class declarator, and private in __init__ if needed?..
are they class or instance attributes? If they're defined in __init__ they're instance and you can annotate them in that, like the body of __init__.
instance attributes
I think the convention is in the declarator, but if the linter understands it, I would say it's fine either way
Pretty sure the linter will understand bc pylint yells if you define an instance attribute outside of __init__.
typecheckers understand both
```py
class Point:
    x: int
    y: int
    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y
```
and
```py
class Point:
    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y
```
but sometimes I need to explicitly annotate some attribute anyway:
```py
class Point:
    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y
        self.quack: Dict[str, int] = {}
```
that third example should be fine...
the issue with the first one is that I have to write out everything twice, which makes the annotations even more bloated
I wouldn't write the first one. Declaring it as a class attribute seems pointless no?
It actually lives on the instance.
It's not declaring it as a class attribute -- it's still an instance attribute annotation.
To annotate a class attribute, you'd use ClassVar
```py
class Point:
    quack: ClassVar[Dict[str, int]] = {}
```
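A small sketch of the distinction, assuming a `Point` class like the ones above (the `registry` name is made up for illustration). A bare annotation in the class body records a type but creates no class attribute, while a `ClassVar` with a value is a real class attribute:

```python
from typing import ClassVar, Dict, get_type_hints


class Point:
    x: int                                   # instance attribute annotation, no value assigned
    y: int
    registry: ClassVar[Dict[str, int]] = {}  # real class attribute

    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y


# A bare annotation creates no class attribute; it only records the type.
print(hasattr(Point, "x"))         # False
print(hasattr(Point, "registry"))  # True
print(sorted(get_type_hints(Point)))
```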
Oh see, I thought that was only for dataclass. That's nice to know.
So I guess you having to re-annotate in the __init__ signature is necessary then because they're required arguments.
Yeah it's rough, seems awesome but then vscode is colored in red from pyright and it's just a nuisance.
is logical expression < faster than == ? In my head == should be faster?
I think that's a loaded question and depends on the type and what dunders the type has implemented and how they're implemented.
or vice versa
mypying a project written using pyright is a trainwreck as well
although you can use mypy there
Depends on the type. Can you elaborate?
And why do you care whether < is faster than ==? The difference should be insignificant anyway
besides, they're not equivalent, so replacing one with another can introduce a bug
lol yes, that I know. Just curious about ordering conditionals in quicksort
if I do sort left (lesser) first, then I wonder if == to equal, or sort right (greater) is faster, since the third is just the else case
It's a theoretical argumentation more than practical
sorting a list of 10 digits, even the algorithm is insignificant.
Guys is there anyone who is familiar with linear algebra
basically there is a simple question that I wanted to ask
it's simple as I guess but I don't get how to solve it
@supple geyser
dm
!warn 743128475715502259 Stop advertising your bot here. Please read our #rules and #code-of-conduct.
:incoming_envelope: :ok_hand: applied warning to @digital venture.
I wonder if `def foo(a, b=1, c): ...` should be allowed, and interpreted the same as ```py
def foo(a, b=1, *, c):
    ...
```
currently you get "SyntaxError: non-default argument follows default argument" - but it's been that way since before we had keyword-only arguments. Now that we do, it seems like it might be reasonable to implicitly treat the first non-default argument after the first default argument as the start of keyword-only arguments.
Though, honestly, I'm not sure if that would be more or less surprising for people.
I feel like an error is fine here.
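For reference, a quick sketch of the current behaviour. The helper `compiles` is made up for this example; it only reports whether a piece of source is accepted by the compiler:

```python
def compiles(source):
    """Return True if the source compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles("def foo(a, b=1, c): ..."))     # rejected
print(compiles("def foo(a, b=1, *, c): ..."))  # accepted: c is keyword-only


def foo(a, b=1, *, c):
    return (a, b, c)


# c has no default but must be passed by name.
print(foo(1, c=3))
```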
Is it a bad pattern to have an ABC class with concrete subclasses that only have classmethods (i.e. no __init__ and no instances)?
Say I have this base class PackageBase, with abstract class methods like setup(), check_updates(), get_version(), etc. Its subclasses implement those class methods for specific packages (fetch stuff, unzip, w/e). This way they basically exist as singletons, because it doesn't make sense (for my project) to have multiple instances of those packages.
well, there are cases where singletons make sense, but why not have duck typed modules instead? Or even decorated functions.
I'd use modules for that as well.
All args are positional-or-keyword by default. Keyword-only args don't need default values. A lot of people use positional-or-keyword args and never pure keyword-only ones, and that's probably why it's surprising.
@flat gazelle @raven ridge but with classes I get mixins, methods inheritance (even though they're just class methods), and stuff like that. Like I can have PackageBase <- VersionedPackage <- SomeActualPackage.
!run
```py
from inspect import signature
def foo(a, b=1, *, c):
    ...
for sig in signature(foo).parameters.values():
    print(sig.kind)
```
Does anyone know how to make an account checker for like combo lists on python?
So I made this:
https://github.com/sneakykiwi/grpc-gofiber-fastapi-example
you could get all of those things with modules, too, by just explicitly importing common/shared things into the modules where you need them.
Its just an example of how it works because I havent found anything like it on the internet, I was looking for examples on how to implement intercommunication between microservices and I thought screw it imma do it myself
hmm, classes do seem better suited for that
though I feel like packages could/should be instances
does anyone know how to make an account checker in python
but then you get trouble with overriding methods. You can have decorator setters like property, but it isn't all that great
but IG you need a decorator on every method anyway since it is a classmethod in your case
I don't plan on having setters though. Properties would be nice, and of course never instantiating the class I can't use them that way (they're just descriptors in a class ofc).
Basically these concrete classes just act as definitions, with some pieces of code to specifically fetch and install the package, get the remote version etc.
(By package I mean generic resources/files, nothing to do with python packages)
Say then I have a class like SomeTool. And say that tool depends on the package (resources) being installed. In my mind it makes more sense doing a SomePackage.is_setup(), rather than SomePackage().is_setup()
And, I might be wrong, having a duck typed module sound like boilerplate / duplicating code, or explicitly referencing other modules (ie no inheritance, no mro)
wrong channel i'm sorry
well, my idea was
```py
package = VersionedPackage("mypy", "1.0.0")  # plus other args
@package.install
def install(env):
    ...
``` instead of singletons
This way they sort of act like singletons in the module they're defined in
it is a similar pattern to how flask handles its apps
Looking at it I like the decorator pattern
That's true
I like it, I think I might end up doing that. Thank you. 
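A rough sketch of how that decorator pattern could be wired up. `VersionedPackage` is not a real library class here; its constructor, the `install` decorator, and `run_install` are all assumptions made for illustration:

```python
class VersionedPackage:
    """Hypothetical package definition; one module-level instance per package."""

    def __init__(self, name, version):
        self.name = name
        self.version = version
        self._installer = None

    def install(self, func):
        """Decorator that registers func as this package's install step."""
        self._installer = func
        return func

    def run_install(self, env):
        if self._installer is None:
            raise RuntimeError(f"no installer registered for {self.name}")
        return self._installer(env)


package = VersionedPackage("mypy", "1.0.0")


@package.install
def install(env):
    return f"installed into {env}"


print(package.run_install("venv"))
```

The instance lives at module level, so importing the module gives every caller the same object, which is what makes it behave like a singleton without a class-only design.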
The only doubt I have now is how to reference them as dependencies in the SomeTool class. Should I keep a mapping as a registry of all the known packages, with names as strings?
I kinda liked the idea of just importing the package class(es) and saying: ```python
class SomeTool(ToolBase):
    dependencies: Container[PackageBase] = (PackageA, PackageB, ...)
```
import the package instances
But then this would look like this:
dependencies = (PackageA.package, PackageB.package, ...)
Or I'd have to do from PackageA import package as SomeName
Maybe the real fix is to not use so many global variables
or you could have dependencies be the whole modules and metaprogram the package out of there
but I do think just importing the packages is best
I think I might do something like that.
Thank you
Time to whip out that typing.Protocol I guess
:/
Consider this:
```py
>>> sorted([1, 9, 2, 8, 3, 6], key=lambda x: x % 2 == 0)
[1, 9, 3, 2, 8, 6]
```
Is it specified behavior that if you sort something according to a key that renders non-equivalent values equivalent in terms of their sort order, their original order is preserved within each "equivalence class"?
So that's what is meant by "stable sorting"?
yes
I'm learning 
it means you can sort by a secondary key, and then a primary key, and it comes out the way you want.
and the way that you do that is having a key function that returns a tuple? or is there a way to specify levels of keys that's more "built in" than that?
oh I see what you mean
Tuples are the convention. Proof is that attrgetter and itemgetter take multiple args
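A short sketch of both approaches on some made-up `(name, age)` rows: a single pass with a tuple key via `itemgetter`, and two stable passes (secondary key first, then primary key):

```python
from operator import itemgetter

rows = [("bob", 30), ("alice", 25), ("bob", 25), ("alice", 30)]

# One pass: the key is a tuple (name, age), compared element by element.
one_pass = sorted(rows, key=itemgetter(0, 1))

# Two passes: sort by the secondary key first, then by the primary key.
# Stability keeps the age order intact among rows with equal names.
two_pass = sorted(sorted(rows, key=itemgetter(1)), key=itemgetter(0))

print(one_pass)
print(one_pass == two_pass)
```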
@boreal umbra if you can do sorting on two values at once with a single key function, great. Sometimes it's not possible.
like when?
for example, if you have two string columns, and want to be ascending on one and descending on the other.
if they were numbers, the key function could just negate the descending number, but you can't negate a string.
for that example specifically, you could create a function that turns strings into a tuple of their ordinal value, but that's a bit :/
a bit?
a bit <whatever adjective :/ represents>
in this case, :/ represents, "a crazy idea"
yeah, it would work though
the above question came up in #help-carrot, you don't need to "invert" the string itself, just the comparison, right? I came up with this: ```py
from functools import total_ordering

@total_ordering
class Inverter:
    def __init__(self, obj):
        self.obj = obj
    def __lt__(self, other):
        return self.obj > other.obj
    def __eq__(self, other):
        return self.obj == other.obj

# usage example, ascends on i[0], descends on i[1]
items = []
for x in range(3):
    for y in range(3):
        items.append((str(x), str(y)))
print(sorted(items, key=lambda i: (i[0], Inverter(i[1]))))
```
As nedbat said, there's a much simpler solution
of course you can sort twice
!e Yeah.
```py
items = []
for x in range(3):
    for y in range(3):
        items.append((str(x), str(y)))
items.sort(key=lambda x: x[1], reverse=True)
print(sorted(items, key=lambda x: x[0]))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
[('0', '2'), ('0', '1'), ('0', '0'), ('1', '2'), ('1', '1'), ('1', '0'), ('2', '2'), ('2', '1'), ('2', '0')]
but they did say sometimes it's not possible, with the ascending/descending strings as an example of that
which I didn't think was correct
yeah. It's possible - your Inverter does it - it's just much trickier.
The fact that sorting is stable means that sorting twice does the trick pretty easily.
FWIW, your Inverter class works pretty similarly to how functools.cmp_to_key works.
holding onto an object, and using < on that object as part of the key comparison.
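For comparison, the same ascending/descending sort written with `functools.cmp_to_key`. The `compare` function is just one way to express the ordering, written for this example:

```python
from functools import cmp_to_key


def compare(a, b):
    """Ascending on the first string, descending on the second."""
    if a[0] != b[0]:
        return -1 if a[0] < b[0] else 1
    if a[1] != b[1]:
        return 1 if a[1] < b[1] else -1
    return 0


items = [(str(x), str(y)) for x in range(3) for y in range(3)]
result = sorted(items, key=cmp_to_key(compare))
print(result)
```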
Are class decorators exactly the same as function decorators, in the sense that they're just nice syntax for thing = decorator(thing) as a statement right after its definition?
yep
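A tiny sketch showing the equivalence, using a made-up `tag` decorator that works on both classes and functions:

```python
def tag(cls_or_func):
    """Attach a marker attribute and return the same object."""
    cls_or_func.tagged = True
    return cls_or_func


@tag
class Decorated:
    pass


class Manual:
    pass


Manual = tag(Manual)  # what the @ syntax does for you


@tag
def func():
    pass


print(Decorated.tagged, Manual.tagged, func.tagged)
```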
i guess I should mention, the Inverter thing was just an exercise of possibility for me. sorting multiple times should be much faster
Can someone explain what this change/optimization means?
is it something for subinterpreter stuff?..
is that for 3.10?
yes
_PyEval_EvalCode takes 16, yes sixteen, parameters!
ok, that's pretty worrying
so it's a refactoring of some god function?
BDFL:
Of course, the thing I'd really want is a way to state that all references to builtins are meant to have the exact semantics of those builtins, so the compiler can translate e.g. len(x) into a new opcode that just calls PyObject_Size(). (I can dream, can't I?)
so it's sort of a move to something akin to my esotericnaildecorator that turns global lookups into constant lookups, but much better?
Sounds like so far it's just turning two dictionary lookups into one. Instead of looking up __builtins__ in globals() and then looking up print in __builtins__, it's now able to look up __builtins__ as a slot on the function object itself, saving the need for a dictionary lookup against globals()
Basically, caching a reference to the __builtins__ dict so it doesn't need to be looked up in as dynamic a fashion
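A rough Python-level illustration of the two lookups described above. This only mimics what the interpreter does internally and is not the actual C implementation; the `measure` function is made up for the example:

```python
import builtins


def measure(x):
    return len(x)  # `len` is not a local or a global, so it resolves through builtins


# Lookup 1: find `__builtins__` in the function's globals.
entry = measure.__globals__["__builtins__"]
# It may be the builtins module or its dict, depending on how the code was run.
namespace = entry if isinstance(entry, dict) else vars(entry)

# Lookup 2: find the name inside that namespace.
print(namespace["len"] is builtins.len)

# Python 3.10+ exposes the cached reference on the function object;
# getattr keeps this runnable on older versions.
print(getattr(measure, "__builtins__", None) is not None)
```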
ah
What do you guys think is the best language for code design or code philosophy? I've been using OOP forever now but I might try out FP
I'd say it depends on the language specifically. Doing FP in Python, for example, is miserable, and in most languages that aren't optimized for it the way, say, Haskell is, it's going to be quite slow.
Use what makes sense for the problem you're trying to solve.
Hello, I'm having a hard time understanding async/await (asyncio). If I await a wss message that comes every 1 second and the function that processes that message takes 2 seconds, does that mean that I will only get every second message?
If yes, how should I fix that if the message flow isn't constant (not every second), and I just want to prevent my code from missing messages? Using threads and a FIFO? Or is there a cleaner option?
I also code in nodejs but async/await should work the same way as asyncio...
No, you won't miss messages. It'll process the next message as soon as the event loop isn't blocked. When the event loop is blocked, it doesn't discard tasks. It keeps waiting instead.
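A small simulation of that situation, with shortened delays. An `asyncio.Queue` stands in for whatever buffering the websocket library does (that part is an assumption), and `producer` and `consumer` are made-up names:

```python
import asyncio


async def producer(queue, count):
    """Stand-in for a websocket: emits one message per tick."""
    for i in range(count):
        await queue.put(i)
        await asyncio.sleep(0.01)
    await queue.put(None)  # sentinel: no more messages


async def consumer(queue):
    """Handles each message more slowly than messages arrive."""
    handled = []
    while True:
        msg = await queue.get()
        if msg is None:
            return handled
        await asyncio.sleep(0.02)  # simulated slow processing
        handled.append(msg)


async def main():
    queue = asyncio.Queue()
    _, handled = await asyncio.gather(producer(queue, 5), consumer(queue))
    return handled


handled = asyncio.run(main())
print(handled)
```

Every message is handled in order even though the consumer is slower than the producer; the backlog just grows in the queue.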
you can hardly find 'best' anything, but I think there's a lot we can learn from the functional philosophy and apply it all over the place, including Python
trying out a functional language is always a good idea in my opinion
Does wss_message() in:
```py
async def msg():
    while True:
        await wss_message()

loop.create_task(msg())
```
block the event loop?
Can't say without seeing the function's code. I can't help with this anymore tonight. You should move this to #async-and-concurrency
I just tried again and it worked as you said. Thank you!
hello :)
do you mean use one?
you just use it
I don't have to do anything?
github? gitlab?
a bit more simple than that lol
you just have to have a document in your repo with the license in it
oh, cool
github has a nice thing explaining it, let me find it
okay, thanks
alright, thanks!
It's called "the MIT license" because it's the license under which MIT released its open source code back in the day.
Anyone else could choose to release their own code under the same license.
anyone else excited for match case being added in 3.10?
I'd been following it but hadn't seen it in the "what's new" doc until today
it's nice that it accepts "dotted" constants as literals, I think the capturing behavior will confuse a lot of people who think it's switch
It was only added now to the changelog because it was accepted a few weeks ago and was first included in the alpha that was released less than 24 hours ago
I'm excited. I'm preparing a talk about it for work.
I've got a project I wrote a few months ago that I wanted to refactor to refine my OOP skills and it'll be perfect for that (it's a blackjack engine)
right now the logic I wrote to test various possible rulesets against various possible hands is messy and could be cleaned up a lot with pattern matching I think
Hi Everyone, I made this menu system for my application and it has a lot repetitive code in it. I feel like I could structure this better. Does anyone have time to take a look at it and give me some feedback?
hey guys
what does __init__ do
because ive seen a lot of videos
which put it in def function classes
but i just dont know what it means
stuff like
stuff____stuff
stuff
stuff
:incoming_envelope: :ok_hand: applied mute to @topaz charm until 2021-03-02 19:22 (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).
um
well, the `__init__` method is used to assign instance variables when you create an instance of a class
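A minimal example, using a made-up `Player` class:

```python
class Player:
    def __init__(self, name, score=0):
        # Runs automatically when Player(...) is called;
        # self is the new instance being set up.
        self.name = name
        self.score = score


p = Player("ada")
print(p.name, p.score)
```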
how are they still muted?
edit: thank you silent mod or bot
!code for future reference btw
Here's how to format Python code on Discord:
```py
print('Hello world!')
```
These are backticks, not quotes. Check this out if you can't find the backtick key.
I've seen you advising that a few times. Why exactly? I know it's a different paradigm
I just think it's great to get a taste of something different, it can grow your perspective and allow you to come up with better solutions even in languages that aren't purely functional
or just reading about how a certain problem would be solved in something like haskell
in terms of what we can gain out of it, I think all of the philosophical pillars - less (or no) state, pure functions, immutability - lead to code that is both easier to reason about and easier to test, even in an OOP-first language like Python
I also think that learning at least a bit of purely/mainly functional language is a good idea. It makes you... switch your thinking a bit. When you learn it, you have to use functional stuff, you have to understand it. There are no loops, you have to make use of that other stuff.
And later you can apply it to other languages with functional elements
Like... you know how much stuff you can do with zip? I learned it by working with a person who loves functional programming. We coded in Scala (or rather: he coded, I joined because he didn't have time to do it all by himself, and I mainly wrote things similar to what was already written), big data stuff - then it makes a difference if you have a million loops or chain generators and such.
It also teaches lessons that apply to other domains. Immutable state plays a crucial part in distributed systems design, for instance.
recursion is beautiful
recursion with a trampoline though...
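For anyone who hasn't seen one, a bare-bones trampoline sketch. The recursive function returns a thunk instead of calling itself, and a loop keeps invoking thunks, so the call stack stays flat. `trampoline` and `count_down` are made-up names for this example:

```python
def trampoline(func, *args):
    """Keep calling returned thunks until a non-callable result appears."""
    result = func(*args)
    while callable(result):
        result = result()
    return result


def count_down(n, acc=0):
    if n == 0:
        return acc
    # Return a thunk instead of recursing, so the stack never grows.
    return lambda: count_down(n - 1, acc + n)


# Far deeper than the default recursion limit would allow for plain recursion.
print(trampoline(count_down, 100_000))
```

This simple version assumes the final result is never itself callable.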
🥴
```py
# WOOO PATTERN MATCHING
if compat.PY310 and type(obj) is dict:
    match obj:
        case {tags.TUPLE: _}:
            restore = self._restore_tuple
        case {tags.SET: _}:
            restore = self._restore_set
        case {tags.OBJECT: _}:
            restore = self._restore_object
        case {tags.FUNCTION: _}:
            restore = self._restore_function
        case _:
            def restore(x):
                return x
```
ok so this works in 3.10.0a6, but `match` throws a syntax error in 3.8.6 (and probably 3.9.2 also but i haven't tested that), are there plans for a syntax backport in future releases of 3.6+ so that libraries can start integrating this matching code without breaking anyone not at 3.10
or does it go in a different file actually
i doubt putting it in a different file will help actually
No, backporting things like that is not really a thing afaik. Not sure if the old parser can even make sense of it
huh ok
so people needing to maintain libraries that don't require cutting edge python versions will have to wait a few years to include this?
or is there a workaround for older versions
yes
as with any feature
If they want to use new features they have to use new python, the versioning wouldn't make much sense if every feature like that got backported officially
oh hm, so this is the error in 3.8.6:
```
  File "test.py", line 2, in <module>
    import mylibrary
  File "/root/.pyenv/versions/3.8.6/lib/python3.8/site-packages/mylibrary/__init__.py", line 60, in <module>
    from .unpickler import decode
  File "/root/.pyenv/versions/3.8.6/lib/python3.8/site-packages/mylibrary/unpickler.py", line 211
    match obj:
          ^
SyntaxError: invalid syntax
```
could we put from .unpickler import decode in a try/except syntax error block and import a different version of the file if the user isn't using 3.10
lemme test that ig
You can check the version before importing
yup yup
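A sketch of the version-check approach. The `match` code has to be compiled separately from the code that does the check, because a SyntaxError applies to the whole file. Here a string passed to `exec` stands in for a separate 3.10-only module, and `kind_of` and `kind_of_fallback` are hypothetical names:

```python
import sys

# Source that uses `match`; it must be compiled separately (in practice, a
# separate module), because a SyntaxError is raised for the whole file.
MATCH_SOURCE = '''
def kind_of(obj):
    match obj:
        case {"kind": kind}:
            return kind
        case _:
            return None
'''


def kind_of_fallback(obj):
    """Same behaviour without pattern matching, for Python < 3.10."""
    if isinstance(obj, dict) and "kind" in obj:
        return obj["kind"]
    return None


if sys.version_info >= (3, 10):
    namespace = {}
    exec(MATCH_SOURCE, namespace)  # stands in for importing the 3.10-only module
    kind_of = namespace["kind_of"]
else:
    kind_of = kind_of_fallback

print(kind_of({"kind": "tuple"}), kind_of([1, 2]))
```

Either way there are now two implementations to keep in sync, which is the maintenance cost raised in the discussion.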
But if you're maintaining one path then that sounds easier than pushing for pattern matching and having to maintain both
then that sounds easier
which approach are you referring to by "that", try/except syntax error or checking the version first
@silk pawn Why even make the pattern-matching implementation when you have a non-patma one?
i'm under the impression that matching could be faster than the current approach
depending on how you use it of course, but still
Just leaving it as is, because adding the conditional imports because of pattern matching means you now have 2 different code paths to maintain. Either continuing to not use pattern matching until the majority of your user base adopts it (or versions below 3.10 reach EOL) or just forcing everyone to adopt the new python version sounds much easier
I didn't look at it that much, but I don't think you'll get speed increases that significant
Just a thought: maybe people would adopt new Python versions faster if libraries dropped support for earlier Python versions faster?
That's a good thought
I am always a bit dumbfounded when maintainers insist on supporting EOL versions like 3.5 and cutting features because of it
I guess there's the downside that people will just sit on their old version of Python and not receive important security updates...
yeah that's a fair point for medium/large code bases, but maintenance of the code base isn't really a problem for me at the current size though. is the difficulty of maintenance what you're referring to by code paths?
yeah i just tested it and the way i did it is actually slower than without pattern matching so i'll probably wait until the feature is more mature
Like, it seems very extreme that Python 2 was only dropped in 2020.
what if they don't use libraries? (very rare, but hypothetical)
sure, you can write your own HTTP server
and your own database adapter
but people have deadlines and budgets
yea
You still have python libs that may not be available in earlier version, but any app with an appreciable size will use 3rd party packages as that's one of the points of using python in the first place
wait so https://www.python.org/dev/peps/pep-0635/#backwards-compatibility says that through the use of the new PEG parser and "soft keywords", matching is fully backwards-compatible, so why the error on previous versions of python?
it seems like a contradictory statement to make though because the older versions of python don't even have the peg parser iirc, so how would they use it to be backwards-compatible?
"backwards-compatible" means that if a program works on a previous version, it will work on a later version
ohhhhhhh
so if you have a variable named match, it will still work
because it's never ambiguous whether it's a match statement or a variable name anyway
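A quick check of that point. This snippet uses `match` as a plain variable name, which is valid both before and after 3.10:

```python
import keyword
import re

# `match` is still an ordinary name: this assignment is valid before and after 3.10.
match = re.match(r"\d+", "42abc")
print(match.group())

# Hard keywords such as `if` cannot be used this way; `match` is not one of them.
print(keyword.iskeyword("if"), keyword.iskeyword("match"))
```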
got it, i thought backwards-compatible meant that if you introduce a feature, it won't cause errors in versions before the feature was introduced
That's not backwards-compatibility. If the language ignores something it doesn't understand, that's just horrible language design
nobody likes debugging HTML and CSS
fair point
That's forwards compatibility.
Like, adding a new optional argument to an existing function, for instance
no, that's still backwards compatibility
any old program will still work on the new version of the library
an example of forwards compatibility would be ignoring all extra arguments.
er, you're right - that was a bad example
or forwarding them blindly along, perhaps. Like something in a multiple inheritance hierarchy that forwards **kwargs to super() after taking out only what it needs from them, so that it plays nicely with other classes added to the MRO later.
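A sketch of that cooperative pattern with made-up classes. Each `__init__` takes the keyword arguments it needs and forwards the rest to `super()`, so the classes can be combined in one MRO without knowing about each other:

```python
class Base:
    def __init__(self, **kwargs):
        # End of the chain: anything left over is a genuine mistake.
        if kwargs:
            raise TypeError(f"unexpected arguments: {sorted(kwargs)}")


class Named(Base):
    def __init__(self, *, name, **kwargs):
        super().__init__(**kwargs)  # forward what this class does not use
        self.name = name


class Colored(Base):
    def __init__(self, *, color, **kwargs):
        super().__init__(**kwargs)
        self.color = color


class Widget(Named, Colored):
    pass


w = Widget(name="button", color="red")
print(w.name, w.color)
```

In `Widget`, the `super()` call inside `Named.__init__` reaches `Colored.__init__`, not `Base.__init__`, which is why `color` has to be passed along rather than rejected.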
I love keyword only arguments for that reason 
structuring your entire framework around mixins 🤮
although
maybe it's not that bad
Doesn't blindly passing **kwargs to a function that doesn't take **kwargs just drop any that it doesn't take?
Or does it cause an error?
you need to do that if you're a library author designing something that might be inherited from, since you don't know whether or not the end user will multiply inherit.
it causes an error.
Boo
unless the **kwargs dict is empty, anyway.
But you can swallow extra kwargs by having star star kwargs as a parameter and ignoring it, yes? (On mobile)
yep.
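The three cases from this exchange in one runnable sketch (`strict` and `lenient` are made-up functions):

```python
def strict(a):
    return a


def lenient(a, **kwargs):
    return a  # extra keyword arguments are accepted and ignored


extra = {"b": 2}

# Unpacking keys the function does not accept raises TypeError.
try:
    strict(1, **extra)
except TypeError as exc:
    print("strict:", type(exc).__name__)

# Unpacking an empty dict passes nothing, so it is fine.
print("strict with empty dict:", strict(1, **{}))
print("lenient:", lenient(1, **extra))
```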
I'm gonna do that to all my functions
I'm signing off soon but what are mixins
Classes that add extra things to a class using multiple inheritance
classes designed to be dropped into an inheritance hierarchy to do something extra - usually one specific feature.
I like those
Not that they're necessarily good
I just like that they exist if you want them.
Lol, it's not a good idea to let people pass arbitrary kwargs to something that isn't gonna utilize them
That was the joke

