#internals-and-peps
It's pretty well tailor-made for overloading, eh?
(You hear that 'eh' there? My Canadian is showing)
yeah actually, I think that's one of the use cases highlighted in the pep
some lisps really have TCO but clojure cannot, I think for JVM reasons, so they have a special thing for it
It appears as though it can match types pretty well, but I'm guessing it breaks down when things get extra complex
No nested lists or callables I'm guessing
match/case can match nested lists iirc
?
you can attach guards to case clauses, I meant
I'm not 100% sure what you're meaning by that but it sounds cool
!pep 636
Something like, I can specify that a case is true if a 'name' parameter is both a string and has a length greater than zero?
match command.split():
    case ["go", direction] if direction in current_room.exits:
        current_room = current_room.neighbor(direction)
    case ["go", _]:
        print("Sorry, you can't go that way")
is reading about 3.10 right now
[New Type Union Operator] is also accepted as the second argument to isinstance()
isinstance(1, int | str)
YESSSSSSSSSSSSSSSSS
not much better than isinstance(x, (int, str))
But I guess it is needed in some contexts
*fixed
It's just a lot sexier is all
the syntax is a lot more pleasing IMO
I'm not 100% about the symbol- but as an idea? Hellllll yes
using a tuple for isinstance felt weird af
I wish you could do isinstance with most or all typehints.
you can do it with all
isinstance(x, Any)
🥴
type(int | str) --> <class 'types.Union'>

I was getting into Rust yesterday
smh
and I realised how much I missed pattern matching
isinstance(x, object) would work
I know
i wonder what the reasoning was behind disabling Any
Literal values are compared with the == operator except for the constants True, False and None which are compared with the is operator
I was reading the pattern matching tutorial, and I saw this. my question is, why make the distinction, isn't it exactly the same?
It's probably faster
and it isn't exactly the same
because those are singletons I guess?
!e
class GuessWhat:
    def __eq__(self, _):
        return True

print(GuessWhat() == False)
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
True
isn't there one for is also?
no
is checks whether two objects are literally the same object
what's the need for overloading it?
to do funny things I guess?
in C++ you can overload && and ||, in that case they stop being lazy
IIRC
ye
they lose short-circuit behaviour
if you do that
1 is True is false, 1 == True is true.
oh right, duh
So, I wanted to discuss runtime type validation again- but I'm happy to move to a specific channel if there's anyone in here who really just doesn't want to think about that
It occurs to me that because my language is going to run in Python, at least to start, I could have each function configure a big ol' pattern match heuristic under the hood. My second question is that if I also have method overloading, I should emit an error that says 'invalid arguments' for a method with only one implementation, or just use a 'no matching overload' error. Thirdly, what about being interpreted is good anyway?
From an ivory tower perspective, "no matching overload" would be confusing to someone who hasn't learned about overloading yet
Whereas an unambiguous "could not perform operation 'someFunction()'; required positional argument 1 ('somearg') must be of type str" is very clear
Something like that
and when saying "no overloads found", an explanation of why each overload failed would be nice
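One way the two suggestions above could be combined, as a toy sketch: a dispatcher that reports a plain argument error when there is a single implementation, and a per-overload explanation otherwise. `Overloaded`, `register`, and the message wording are all hypothetical.

```python
class Overloaded:
    """Toy dispatcher: calls the first implementation whose parameter types match."""

    def __init__(self, name):
        self.name = name
        self.impls = []  # list of (param_types, function)

    def register(self, *param_types):
        def decorator(func):
            self.impls.append((param_types, func))
            return self
        return decorator

    @staticmethod
    def _mismatch(param_types, args):
        """Return a reason string if args do not fit param_types, else None."""
        if len(args) != len(param_types):
            return f"expected {len(param_types)} argument(s), got {len(args)}"
        for position, (arg, expected) in enumerate(zip(args, param_types), start=1):
            if not isinstance(arg, expected):
                return (f"positional argument {position} must be of type "
                        f"{expected.__name__}, got {type(arg).__name__}")
        return None

    def __call__(self, *args):
        reasons = []
        for param_types, func in self.impls:
            reason = self._mismatch(param_types, args)
            if reason is None:
                return func(*args)
            reasons.append(reason)
        if len(reasons) == 1:
            # Single implementation: do not mention overloading at all.
            raise TypeError(f"could not call {self.name}(): {reasons[0]}")
        details = "\n".join(f"  overload {i}: {r}" for i, r in enumerate(reasons, 1))
        raise TypeError(f"no matching overload for {self.name}():\n{details}")


shout = Overloaded("shout")

@shout.register(str)
def _(text):
    return text.upper()

describe = Overloaded("describe")

@describe.register(int)
def _(number):
    return f"int {number}"

@describe.register(str, str)
def _(first, second):
    return f"strings {first} and {second}"
```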
Excellent
you're right, but also i realized that hy has a special macro for when you want to use recursion that does perform TCO https://docs.hylang.org/en/alpha/api.html#module-hy.contrib.loop
(this is all in the alpha version of 1.0 so it will maybe change before 1.0 is actually released)
ah... loop is implemented internally as a trampoline
so it's dog slow
it manages to be even slower in pypy, while the naive tail recursive equivalent is a hair faster
whereas the imperative version in pypy is an order of magnitude faster than its cpython equivalent
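For anyone who has not seen a trampoline: a minimal Python sketch of the general technique. This is not Hy's actual `loop` implementation; `trampoline` and `count_down` are illustrative names.

```python
def trampoline(func, *args):
    """Keep calling returned thunks until a non-callable result appears."""
    result = func(*args)
    while callable(result):
        result = result()
    return result


def count_down(n, acc=0):
    """Sum 1..n in tail-recursive style, returning a thunk instead of recursing."""
    if n == 0:
        return acc
    return lambda: count_down(n - 1, acc + n)


# Far deeper than the default recursion limit, but the stack stays flat.
print(trampoline(count_down, 100_000))
```

Every step allocates a closure and goes through an extra call, which is one plausible reason this style is slow compared with a plain loop.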
this makes me really excited for cinder and apparently the rebirth of pyston
what is a "social media algorithm"?
"recommendation algorithm", "recommender system", and "recommender engine" might be relevant keywords as well
nobody starts from scratch
also @native current this is more of a #data-science-and-ml question
i thought i was in that channel, otherwise i'd have directed you to look there sooner
i will say that deriving things from first principles is a great learning exercise, even though you will probably never need to do it "in real life"
this channel is for discussion about the python language itself, as per the channel topic
i had no context for your question so i wouldnt know what you were originally asking
but in any case it wouldn't be on topic here. either #algos-and-data-structs or #data-science-and-ml
Anyone working on anything fascinating today?
i hope so
are locks always pickle-able?
i always see locks getting passed as parameters to processes, but never understood how the lock is recreated in the other process
I think they're shared in the memory
Yes, this is what I get:
import multiprocessing as mp

def foo(lock):
    print("lock in child process:", lock, hex(id(lock)))

if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    lock = mp.Lock()
    print(" lock in main process:", lock, hex(id(lock)))
    process = mp.Process(target=foo, args=(lock,))
    process.start()
    process.join()
lock in main process: <Lock(owner=None)> 0x7fbd3e273fd0
lock in child process: <Lock(owner=None)> 0x7fbd3e273fd0
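On the original pickling question, a small single-process sketch. Note that identical `id()` values do not prove sharing by themselves: with the fork start method the child starts as a copy of the parent's address space, so addresses match anyway. `try_pickle` is a helper invented for this sketch.

```python
import multiprocessing as mp
import pickle
import threading


def try_pickle(obj):
    """Return 'ok' or the name of the exception raised by pickle.dumps."""
    try:
        pickle.dumps(obj)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


# multiprocessing locks refuse ordinary pickling; they are only meant to be
# handed to a child while that child process is being created.
mp_result = try_pickle(mp.Lock())

# threading locks cannot be pickled at all.
thread_result = try_pickle(threading.Lock())

print("multiprocessing.Lock:", mp_result)
print("threading.Lock:", thread_result)
```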
hmm, shared memory module doesn't publicly provide a way to set the base address, nor does it seem to let you manually put locks in it? (at least ShareableList doesn't)
is this solely for the implementation, or is doing this available to the user as well?
not that noob if you figured out that much
is there any upcoming pep for simplified "for loop vectorization?"
or is the comprehension like (for i,j,k ...etc) actually the same thing?
What do you mean by for loop vectorization?
i doubt Python will get built-in syntax for parallelization of for loops. There are modules that you can use for that.
or, maybe I don't understand what you are asking for.
That was my interpretation of it
multiprocessing.Pool.map™️
I took it to mean, forming a for loop such that the compiler chooses vectorized instruction sets, which is what makes the C/C++ libraries so fast, well that and generally way less overhead
Here's some deep level stuff, well it's not deep if you took a computer architecture course
and that would work great if Python was compiled to machine code rather than an interpreted bytecode :P
with the added bonus of a global lock
Basically if you do not vectorize it such that the compiler chooses the vector instructions you are repeating set of instructions load register 1 from memory, load register 2 from memory, add register 1 and register 2 contents into register 3, then store register 3 into memory each iteration of the for loop. Whereas if you do vectorize it, then you are doing those same set of instructions 1 time, but all at once
Vectorisation is a challenging thing to detect and optimise at the C level, and not really possible at all at Python level, where these aren't registers but pointers to individual objects all over. It's far better to simply use something like NumPy that offers smart array objects that do so under the hood.
How vectorizable a for loop is depends on what that for loop is doing (addition, subtraction, or something else), the data type it's operating on (int, float), and how many iterations it performs. Like that picture shows a 128-bit lane, so if you have a for loop performing addition between int64s, and there are 64 iterations, then the vectorization would yield 32 iterations
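The arithmetic in that message can be written down as a back-of-the-envelope sketch. It ignores alignment, remainder handling, and everything else a real compiler has to deal with; `vector_iterations` is an illustrative name.

```python
import math


def vector_iterations(element_count, element_bits, lane_bits=128):
    """Idealised number of loop iterations once each step handles a full lane."""
    elements_per_op = max(1, lane_bits // element_bits)
    return math.ceil(element_count / elements_per_op)


print(vector_iterations(64, 64))   # 64 int64 additions in a 128-bit lane
print(vector_iterations(64, 32))   # the same loop with 32-bit elements
```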
off the top of my head, I could be wrong but that's the general idea
its very processor specific
It's a cool idea
but generally left to 3rd party extensions and packages to do in most languages
Yeah, python is way too far from the metal to be able to utilize this stuff
also not every loop can be vectorized/parallelized
far as I remember from MPI course you have to be quite careful with it
Even compiled languages generally dont vector like that because you can get left with all sorts of safety issues as well as un-predictable performance
often with this stuff explicit is better than implicit
Numba + Numpy is a good example, or Rayon with Rust
yes with MPI or OpenMP you have to declare parts to be threaded/parallelized explicitly
Yeah I'd say, this stuff comes into play when you are writing software for a known/dedicated hardware platform, in a low level language.
Get the COBOL out boys and girls
naaah, OpenMP/MPI afaik was built for C and Fortran no need to go as far back as cobol
i think in the Python world, "vectorizable" just means "concurrently", not using cpu vector operations.
Anybody know how to code your own cryptocurrency
@swift imp you were exactly on point with my question. However, of course we can do this in python, it just needs quite a workaround, and that was the pep I was hoping for, since this is far more efficient but a really nasty thing to write.
But maybe, it's a numpy thing.
PyPy's JIT has some options for vectorization, but I'm not sure what they are
numpy already vectorizes most operations with the MKL backend IIRC.
the tradeoff w/ numpy is that you might end up introducing several additional passes over the data by chaining vectorized numpy commands together
usually this is still faster than trying to do it in a loop, but it can be a problem
Sure, I tried that. Problem is that you still have to write it in "standard" format first and then pass it to the vectorize function. Im looking for a better syntax, like if you do it in C it does the job but the code looks like a mess.
Some of the conversation above seems to mix up vectorization with parallelization which makes it more confusing for anyone reading it
I've never heard of "loop vectorization" refer to parallelizing a loop before. Loop vectorization usually means using CPU instructions that let you handle multiple loop elements per pass
It's also closely related to loop unrolling, which means doing multiple units of work in each iteration pass
Probably any numpy backend is going to be doing at least some vectorization. A lot of loops will be automatically vectorized in C++/C/Fortran (and many more will be unrolled).
MKL is also parallelized to use multiple threads; I'm sure it will do that sort of thing intelligently though depending on how many cores you have, how large the matrix is, etc
Hi does anyone have a method of compressing positive integers fast, lossless, and as small as possible?
Or if there is a way to compress a dictionary that has integer as keys and integer as definitions
Note that I prefer that it compress it losslessly as much as possible over the compression speed
What properties does your compressed dictionary need to have? What operations does it need to support? You could represent this as a single list of int objects where the even elements are keys and the odds are values - that's smaller than the dict, but can't efficiently support the same operations.
I want to send this information from server to client, and I want to minimize bandwidth usage. The keys will be player ids, integers like 1214019, and the values will be integer tuples
the compressed data has to be able to be read/uncompressed by the client
You can't beat sending them in binary.
Will compressing them in something like base64 like aH3J be worse than having a binary value like 10100001001001000000100100?
Characters are represented by bytes, too.
so binary will basically be the best and most compact way to send the data
okay thank you
Anything you send over the network is a stream of bytes.
But the answer is only simple if you know in advance the range of your integers. Say, if they're never larger than 1 million, it would be wasteful to send them as 64bit ints
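A quick size comparison for the player id mentioned above, as a sketch using only the standard library. The little-endian byte order is an arbitrary assumption for the example.

```python
import base64
import struct

player_id = 1214019

as_text = str(player_id).encode("ascii")   # one byte per decimal digit
as_uint32 = struct.pack("<I", player_id)   # fixed 4 bytes, enough below 2**32
as_uint64 = struct.pack("<Q", player_id)   # fixed 8 bytes, mostly zeros here
as_base64 = base64.b64encode(as_uint32)    # text-safe, but larger than the raw bytes

for label, data in [("text", as_text), ("uint32", as_uint32),
                    ("uint64", as_uint64), ("base64", as_base64)]:
    print(f"{label:>6}: {len(data)} bytes")

# The client reverses the packing with the same format string.
decoded = struct.unpack("<I", as_uint32)[0]
```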
A simple rule of thumb: an encoding is length-optimal, if every possible sequence of bytes (up to length rules) is a possible encoded message you could be sending.
Now, conventional logic would say that hex would be faster because it's more values in only two digits, but because each of those characters requires 8 bits to express, it's actually much larger than data in streams of only one bit?
Just wanted to make sure I understand
Right on.
In practice what you often have is that most values are pretty small, so you have some kind of format that is fixed width up to some small value
and from then it's variable length
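One common variable-length scheme of this kind is a base-128 "varint" (the style used by Protocol Buffers): 7 payload bits per byte, with the high bit marking "more bytes follow". A minimal sketch for non-negative integers; the function names are illustrative.

```python
def encode_varint(n):
    """Encode a non-negative int, least significant 7-bit group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, offset=0):
    """Return (value, next_offset) for the varint starting at offset."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


for value in (0, 127, 128, 300, 1214019):
    print(value, encode_varint(value).hex())
```

Small values cost one byte, and the example player id fits in three instead of a fixed four or eight.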
Practical encodings always have some degree of redundancy, and space-efficiency isn't usually the #1 factor in choosing an encoding.
but this is all heuristic/empirical, there's no true answer, and in practice for most applications just picking the smallest fixed width integer is fine
and unless your integers can be bigger than 64 bits
Well not really. There is a theory behind this, it's called Shannon's information theory. It allows you to make totally precise statements about encoding efficiency. It gets a little complicated when you take into account that different messages differ wrt. the likelihood they will occur in an exchange, and you ask for maximum average efficiency.
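To make the information-theory point concrete, a sketch that estimates Shannon entropy from observed symbol frequencies. It assumes the symbols are independent and identically distributed, which real data may not be; `entropy_bits_per_symbol` is an illustrative name.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(symbols):
    """Empirical Shannon entropy H = -sum(p * log2(p)) of a sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(entropy_bits_per_symbol("abab"))       # two equally likely symbols
print(entropy_bits_per_symbol("abcd"))       # four equally likely symbols
print(entropy_bits_per_symbol("aaaaaaab"))   # heavily skewed distribution
```

Under that assumption, no lossless code can average fewer bits per symbol than this value, which is why skewed data compresses well and uniform data does not.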
The optimal encoding still depends on your data
in that sense, it is heuristic/empirical
You don't usually have a perfect estimate of the distribution of your data
may not know whether it's IID
etc
In that sense, sure.
Hi all,
Is anyone using Python in DevOps (automation?) I need to know what it can automate, like CI/CD pipelines, etc.
I'm a newbie to python actually, but working in devOps...
@last forge #tools-and-devops looks like a good channel
Ah... Thanks for the direction... appreciated
Have a read of this: https://gafferongames.com/post/serialization_strategies/
(Author uses C++ but still really interesting imho)
I'm starting to see an interesting pattern in the regex I'm using in my lexer
Without intending to, it seems to have configured itself into a series of top level categories corresponding to main lexical categories, each one comprised of one or more individual patterns corresponding to a token (or a component of one)
One helpful person on here suggested that, because the expression is so nasty, I develop some sort of intermediate representation that is easier to understand
So with that in mind, I just find it interesting to see this pattern emerging
Guys how to merge 2 python projects together in pycharm?
This is not a help channel, please see #python-discussion
For your question perhaps #editors-ides is also appropriate.
how accurate is pytesseract?
been some 10 years, but back in the day it was good enough to break some captchas.
still is?
maybe, maybe not
Awfully quiet day in here today
is it worth making a new ocr engine from scratch or is pytesseract good enough?
Depends on your usage. Generally speaking you want to stick with the established tools
'Perfect' recognition is probably a while away but if your system is such that you can reasonably expect your user to tolerate a 'input could not be read - please try again' error, you'd probably be fine
yes, but like, in which cases will the inputs be unreadable by pytesseract, I only need for the normal fonts, and does size of the text matter?
Hello
Has any of you heard about that new library which is able to get the meaning of blurry text from an image?
Unfortunately I forgot that library's name
Python is the most powerful language must say tho
Depends on the context, most powerful for running high end guis?, No, most powerful for data analysis? Yes
No, never heard of a library like that. I know you can sharpen and filter blur from images using pgmagick or other image editing libraries, and detect whether an image is blurred using opencv, but any library that claims to completely understand blurred text should raise questions about accuracy and bugs
for i in range(5):
    print(i)
print(i)
i is still valid after the for loop, is this a side effect or can i rely on it?
(i hope the latter)
there's this:
>>> for i in []:
...     pass
...
>>> print(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'i' is not defined
this is because loops don't create their own scopes. you can rely on it, although personally i've never found a use for it
i'm just adding explanation to this. It means i never got declared in the scope because of empty list.
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> for i in []:
...     pass
...
>>> print(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'i' is not defined
>>> for i in [2]:
...     pass
...
>>> print(i)
2
>>> def foo():
...     for i in []:
...         print(i)
...     print(i)
...
>>> foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in foo
UnboundLocalError: local variable 'i' referenced before assignment
>>> def bar():
...     for i in [3]:
...         print(i)
...     print(i)
...
>>>
>>> bar()
3
3
>>>
for each item in the thing that is being iterated (the.. iteratee?), the item is assigned to the name i and the loop's body executes
if there are no items, the loop exits before any assignment can happen (there is nothing to assign)
so the name doesn't exist
or rather, will not be created, it could already exist from before
>>> i = 0
>>> for i in (): ...
...
>>> i
0
idk it kinda looks like it exists there
because it existed beforehand
yea that's what I'm saying, the loop didn't assign anything to i, but I assigned a value to it before the loop, so that value remains bound to the name
I'm saying this because you can make use of it in case you want to rely on i being defined after the loop, but cannot guarantee that the iterable won't be empty
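A small sketch of that last point: pre-binding the loop variable gives a well-defined value even when the iterable is empty. `last_index` is an illustrative name.

```python
def last_index(items):
    """Return the index of the final item, or None if items is empty."""
    index = None  # bound before the loop, so it exists even with no iterations
    for index, _ in enumerate(items):
        pass
    return index


print(last_index("abc"))
print(last_index([]))
```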
your enlightenment era
I know you guys gave me a bit of a hard time regarding my syntax the other day. I'm finding it much easier, and, for some reason, also much more attractive to stick with norms in PyCharm than I ever did using Sublime Text
It's always a bit humbling to say 'you were right', but you guys were right
If you're into vim bindings/emulation, PyCharm's vim emulation has hugely improved over the last few years
I stole the approach to leader based bindings from the editor I used to use (spacemacs) and that works very nicely. Much easier to keep track of compared to so many random shortcuts
i hate the layout of IntelliJ IDEs, hence my poison of choice is 
out of curiosity, how much did you customize your sublime?
(feel free to ping me with a response in #editors-ides , that would be the better channel)
Would it be safe to say that python lexes the internals of every string it finds at some point?
I haven't gotten to the parser yet
why would it need to lex every string?
depends what you mean by "lexes the internals", I guess
it needs to figure out where the string ends
it will need to scan through the string, remember the starting delimiter, and look for an unescaped ending delimiter
well, it also needs to interpret unicode escapes, as well as the internals of an f-string
well, format is not called at compile time
Fair point
err, "compile time" ?
Are you saying that f-strings are turned into literals in the .pyc?
or rather, lexed prior to the .pyc?
And it needs to know whenever it increments the line number, which may occur inside of even a normal string. The normal tokenizer module tracks line position by reading the code line by line
It doesn't parse the guts of a string even if .format is called
what do you mean by "guts"
in order to do the formatting operation, it has to parse it....
Which leads to some big ugly logic for keeping track of when it enters and exits a string and stitches them together at exit
the lexer doesn't seem to handle escapes
right, but at runtime
'Internals' and 'guts' mean whatever is between the quotation marks
okay, well .format most definitely parses the string that's being formatted
!e
When the code is turned into bytecode.
def f():
    return "This { is { not { a { valid { format { literal".format(42)
@grave jolt :warning: Your eval job has completed with return code 0.
[No output]
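The eval above produced no output because `f` was only defined, never called. A sketch showing that the malformed format string compiles without complaint and only fails when `.format` actually runs:

```python
def f():
    return "This { is { not { a { valid { format { literal".format(42)


# Defining (and compiling) f succeeded; the template is parsed only at call time.
format_error = None
try:
    f()
except ValueError as exc:
    format_error = exc
    print("raised at run time:", exc)
```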
maybe I just misunderstood you
no I think I understand what you're saying
the f-string isn't fully evaluated during the transformation into bytecode, you're saying though that some of the basic lexing is already done during the bytecode transformation
which isn't the case for .format
that seems reasonable
the parser handles escape sequences, the lexer just takes the string as is
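This can be checked with the standard `tokenize` module (a pure-Python tokenizer, assumed here to behave like the real lexer for plain string literals): the STRING token carries the raw source text, escapes untouched, and its start/end positions show the line tracking across a multi-line string.

```python
import ast
import io
import tokenize


def string_tokens(source):
    """Return all STRING tokens found in the given source text."""
    reader = io.StringIO(source).readline
    return [tok for tok in tokenize.generate_tokens(reader)
            if tok.type == tokenize.STRING]


# The source text contains a backslash followed by 't', not a tab character.
escaped = string_tokens('x = "a\\tb"\n')[0]
print(repr(escaped.string))                    # raw token text, quotes included
print(repr(ast.literal_eval(escaped.string)))  # escapes resolved in a later step

# A triple-quoted string spanning two source lines is still a single token.
multiline = string_tokens('s = """one\ntwo"""\n')[0]
print(multiline.start, multiline.end)
```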
I'm actually not sure how Python handles f-strings
(except \')
So lemme reframe my question: for every string I need to be able to recognize escaped characters and also increment the line index of the lexer if a newline is encountered. I feel that lexing the string's internals, albeit a hit in performance, is a clear and unambiguous way of doing that. What are everyone's thoughts?
ah, ok
they cannot be evaluated at "compile time" obviously, they can be lexed
even with f-strings?
Those I know are lexed in their own step, and then parsed as normal
!e
def f():
    return f"This isn't valid{"
@halcyon trail :x: Your eval job has completed with return code 1.
001 |   File "<string>", line 2
002 |     return f"This isn't valid{"
003 |            ^
004 | SyntaxError: f-string: expecting '}'
Yeah, f-strings are turned into an efficient format at compile time
as in, before the code is run
well, not that efficient, but yes, more efficient
I know that the parser doesn't actually create strings from fstrings, they get spit out as 'joinedstr' nodes, with string literals and replacement fields inside them. I'm guessing they are assembled on the fly when the bytecode is run
Right
it about does
f'the name is {v!r}!'
''.join(['the name is ', repr(v),'!'])
except the values are just on the stack with a hardcoded BUILD_STRING, not in a list
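Both claims can be inspected from Python itself with `ast` and `dis`. A sketch; exact opcode listings vary between CPython versions, so it only looks for `BUILD_STRING`.

```python
import ast
import dis

tree = ast.parse("f'the name is {v!r}!'", mode="eval")
joined = tree.body

# The parser yields a JoinedStr node whose children are literal pieces
# and replacement fields, not a finished string.
part_types = [type(part).__name__ for part in joined.values]
print(type(joined).__name__, part_types)

# !r is stored on the replacement field as the code point of 'r'.
conversion = joined.values[1].conversion

code = compile(tree, "<fstring>", "eval")
opnames = [ins.opname for ins in dis.get_instructions(code)]
print("BUILD_STRING" in opnames)

# The pieces are only assembled when the bytecode runs.
print(eval(code, {"v": "Ada"}))
```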
they can't be evaluated during parse time of course
If python's internal string representation were a rope then this would be super efficient
the list of string thing I mean
maybe it is, I don't know
it can't be a rope, since the values returned by __str__ are not a rope
i didn't follow that
_> So I haven't heard either way regarding my ultimate question, but no one has given me cause not to go this route
So, off I go I guess
AFAICS the implementation of str is an implementation detail
there is some minor complexity to escape sequences, doing \N{ CUCUMBER }
yes, but existing code depends on it being the random accessible string it is
so it would be a breaking change
contiguous != random access
but yes, CPython does implement it as a contiguous string, I googled
"You may not like pickles, but they are after all the only thing you can do with a cucumber"
- Unknown
it is random accessible, it isn't utf8. It autoscales the characters to fit the largest codepoint
So, inside a normal string I may find:
a) escaped characters of length 1
b) escaped octals and hexadecimals with length greater than 1 (not really sure what these are but I'll look it up)
c) uncontinued newline literals (valid in multiline strings)
d) continued newline literals (for single-quoted strings)
When lexing the string and preparing it for the parser, I can modify it to unescape escaped characters, increment the lexer's line index upon encountering a newline literal, and throw an error in the event of an uncontinued newline within a single-quoted string
And a similar but likely much more complicated process can be applied to fstrings also
Does all that jibe with you pillars of intellect?
I don't think you can have escape sequences which result in more than one character
at most, you get a 32bit unicode codepoint
other than that, I don't really see an issue with the lexer handling escape sequences, fstrings do not seem impossible either
Not 100% sure what these octal and hex escapes are about, but I'm sure its not too nasty
that isn't the complete list
Oh no, probably not
Frickin W3 schools
and octal and hex and Uxxxxxxxx are all single characters
depending on how you choose to handle incorrect escapes, you may have things like \e be 2 characters (that being \ and e)
that is how python does it right now
Could I see an example of an octal escape?
@flat gazelle :white_check_mark: Your eval job has completed with return code 0.
`
And regex would recognize that as a single character
ah, you meant like that
\\. would pick it up?
no
the regex for an escape sequence is slightly more complex. Keep in mind you have to handle \N{ SMALL LATIN LETTER A }
wrong name, but you get the point
I don't really, sorry
I don't want to monopolize your time though. I'm happy to check it out on my own
!e
print('\N{LATIN SMALL LETTER A}')
@flat gazelle :white_check_mark: Your eval job has completed with return code 0.
a
turns out the grammar for these is pretty strict, no superfluous whitespace allowed
it is still a regular expression, since they cannot nest. So you can make it into a regex, just a pretty complex one
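A sketch of what such a regex could look like for the body of a non-raw `str` literal. It is a simplification, not CPython's implementation: it covers `\N{...}`, `\U`, `\u`, `\x`, octal, the single-character escapes, and backslash-newline, and leaves unknown escapes such as `\e` as two characters. `ESCAPE`, `SIMPLE`, and `unescape` are names invented here.

```python
import re
import unicodedata

ESCAPE = re.compile(
    r"""\\(?:
        N\{(?P<name>[^{}]+)\}        # named character
      | U(?P<u8>[0-9a-fA-F]{8})      # 8 hex digits
      | u(?P<u4>[0-9a-fA-F]{4})      # 4 hex digits
      | x(?P<x2>[0-9a-fA-F]{2})      # 2 hex digits
      | (?P<oct>[0-7]{1,3})          # 1 to 3 octal digits
      | (?P<simple>[\\'"abfnrtv])    # single-character escapes
      | (?P<newline>\n)              # backslash-newline continuation
    )""",
    re.VERBOSE,
)

SIMPLE = {"\\": "\\", "'": "'", '"': '"', "a": "\a", "b": "\b",
          "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def unescape(body):
    """Resolve escape sequences in the text between a string's quotes."""
    def replace(match):
        if match["name"]:
            return unicodedata.lookup(match["name"])
        for group in ("u8", "u4", "x2"):
            if match[group]:
                return chr(int(match[group], 16))
        if match["oct"]:
            return chr(int(match["oct"], 8))
        if match["simple"]:
            return SIMPLE[match["simple"]]
        return ""  # backslash-newline joins the two physical lines

    return ESCAPE.sub(replace, body)


print(unescape(r"\N{LATIN SMALL LETTER A}"))
print(unescape(r"\140"))   # octal 140 is the backtick
print(unescape(r"\e"))     # unknown escape is left alone
```

Every escape, however long in source form, resolves to exactly one character (or to nothing, for a line continuation).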
I'm just not sure why you'd really want to use regex for this, to be honest
the regex will be shorter perhaps, but the non-regex code will be much easier to read and write (and you'll be much more confident as to the performance characteristics)
with lexers, it can be easier to use regex, since modifying a regex is easier than modifying nested loops with complex relations
really depends how the code is structured
the best option IMO is a lexer generator of sorts, and IG regex is the simplest form of that
IG?
As for using a lexer generator, I'll eventually transition to using more sophisticated tools and a more performant language (C, probably) as the project gets closer to completion
Right now though, I've challenged myself to build a lexer for python source code with my own hands, because if I can do that, I'll be able to say with confidence that I have a working knowledge of how they work
I'm almost done too. The master regex works just fine. Indentation and line continuance works fine. The only thing left really is the strings
I'm wondering about the use of a python module purely to store some parameters, as I get warnings for uppercase naming for constants, but I'm not sure if I want to enforce that or not... But, it makes me feel that this is probably a poor pattern
It's fine imo
How do you guys think the bot accounts on Instagram are made? Because you have to verify your phone number after 24 hours and they are really expensive
if it's just simple data, why not store it in some json or something like that
Just use a linter that doesn't complain about things like that. I'm guessing you're using pylint, just use flake8 instead
Sorry if this isn't advanced enough for advanced, but it seems like general is reaaaally general...
I have a nebulous language design question... why is with expression as variable: not just with variable = expression:?
I used to always forget that with takes the assignment and expression opposite to normal creation of variables
Compare C# where you write using (var variable = expression) { ... }
!pep 343
It seems unnecessarily different. The with word already means "call __enter__"
Okay
its explained here
cheers
Okay, got it
Although I think it's a design choice that could have gone either way
It's to maintain separation between the context manager and the resource itself
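A sketch of that separation: the name after `as` is bound to whatever `__enter__` returns, which need not be the object produced by the expression. That is one reason a `with variable = expression:` spelling would read misleadingly. `Connection` and `Cursor` are made-up classes, not any real database API.

```python
class Cursor:
    """The resource the body of the with block actually works with."""

    def __init__(self, connection):
        self.connection = connection


class Connection:
    """The context manager; it manages setup and teardown."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return Cursor(self)  # not self

    def __exit__(self, exc_type, exc, traceback):
        self.closed = True
        return False  # do not swallow exceptions


manager = Connection()
with manager as resource:
    inside = (type(resource).__name__, manager.closed)

print(inside, manager.closed)
```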
the Python programming language. Feel free to ask any other questions you have about the community in #community-meta
@gleaming ember it can't really have gone "either way" in the context of a GC language
oh, I see
sorry, I didn't see the second with
Out of curiosity does C# support multiple assignments with a using block like that?
Kotlin's use is more similar to C#'s using (well, kind of) and one of the things I miss from python is that python is much nicer for opening multiple resources
C# supports everything. Even things that shouldn't be supported
exit is called Dispose and I think file streams are IDisposable and so implement this method (which would call close)
with open('file1') as f1, open(f1.read()) as f2:
    print(f2.read())
this works.
Yeah, with is a bit more functionally rich compared to using, but they're mostly equivalent in day-to-day use
what do you mean by things that shouldn't be supported?
it seems to me from the example above that they seem pretty equivalent as they can both open multiple resources nicely, but maybe I'm missing something
In Kotlin (not C#, which is what I was referring to above with "equivalent"), they tried to avoid needing to build in a language keyword/functionality, and instead their use is not magic at all, but rather it's just a function that accepts a lambda. It's a really nice approach in principle in that you avoid making the language more complex, but it seems hard to get equally nice syntax.
The flipside is, you could argue, the lambda approach makes it easier to design APIs where it's actually impossible to forget to close things. In python nothing really stops you from doing f1 = open('file1')
well, you have a GC, so it isn't a massive issue regardless
Not sure I follow that (and probably wouldn't agree)
a leaked file gets closed instantly in CPython, and some time later in other implementations
or maybe you just mean there aren't as many resources around in the first place, once memory isn't considered
the same holds for C#, java etc. thanks to finalizers
that's pretty much the exact opposite of all discussion and advice I've seen around finalizers
there's also situations which aren't as simple as "reclaim the resource", where you don't care if it gets reclaimed a bit later
it isn't a good thing to do, but leaking file handles isn't a potentially crippling issue unlike in C/C++/Rust (not that it is very easy in C++ or Rust either).
For python, I do prefer being able to have resources without needing such a scope, since you can't always fit it into a single block (though generators can help). For example, a database connection needs to stay open while the app is running, but close and clean up at close. Doing this with a with is not impossible, but it ends up cleaner if you just keep it as app state and close it in the onclose thing of the app. I am not sure how you would create a lambda that exits when your entire application does, but it doesn't seem impossible, especially in kotlin.
In python the answer to such things is to factor out your resources to a sufficiently high level
It's pretty common for me to create thread or process pools in a top level, or near-top-level function
everyone else just takes it as an argument and isn't responsible for its life time
but in that near-top-level function, it's inside a with block
If you have an app that always needs a database connection, then it seems to me that the logical thing is to have the context manager in your main/top level function
so you mean
with opendb() as db:
    app.start(db)
?
sure
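A runnable sketch of that shape, using an in-memory SQLite database as a stand-in for the app's connection. `run_app` and `main` are hypothetical names. `contextlib.closing` is used because a `sqlite3` connection used directly as a context manager manages transactions and does not close itself.

```python
import contextlib
import sqlite3


def run_app(db):
    """Stand-in for the application: it uses the connection but does not own it."""
    db.execute("CREATE TABLE visits (n INTEGER)")
    db.execute("INSERT INTO visits VALUES (1)")
    return db.execute("SELECT COUNT(*) FROM visits").fetchone()[0]


def main():
    # The top-level function owns the resource's lifetime.
    with contextlib.closing(sqlite3.connect(":memory:")) as db:
        result = run_app(db)

    # After the block the connection is closed, even if run_app had raised.
    try:
        db.execute("SELECT 1")
    except sqlite3.ProgrammingError:
        closed = True
    else:
        closed = False
    return result, closed


print(main())
```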
I was surprised when I found out that Kotlin has about 80 keywords... then I was surprised when I learned C# has about 100
that seems high or low to you?
keywords doesn't count operators?
lisps usually have like 10-20 and half are technically implementable in terms of the other half
brainfuck has 0 or 8 keywords, depending on how you look
but aren't because that would be crazy
I think a lot of it is probably because Kotlin supports many things first class that python handles in weird ways with type annotations
or decorators
things that can be more dynamic, because of the nature of python
a lot of those keywords (maybe around 10) are just for built in encapsulation support, for example, which python barely has
access modifiers, inheritance permission
yeah, I guess
like descriptors
I'm reading the kotlin docs right now, and it has so many features
Another example maybe is lateinit, in python there is no compiler, if you want to do this you just do it and maybe you need to suppress mypy on one line
Eh, idk I don't really feel like it has significantly more features than python, but I guess it depends what you call a feature
natively supported feature
no metaclasses, no decorators, no comprehensions
hm, right
those are big features too
a lot of the "features" kotlin has, that python doesn't, IMHO, are small things relatively that python emulates with above "meta features"
data classes are a great example
Okay, so data is a contextual keyword in kotlin
maybe it's just that it has many ways of doing the same thing, it seems
I still take data class in kotlin over @dataclass any day
I like fun interface just because of how it sounds
I don't really think that's true. I think, largely because it's statically typed, and you have more checking of things statically, it has chosen to:
- provide a lot of simple (relatively) features like data classes "built in" instead of using decorators (properties is another good example, abstract methods/interfaces vs @abstractmethod/ABC)
- Because it's truly static, you need a lot more opt-in and opt-out hatches from the compiler, because typing isn't optional
- Kotlin is trying to check a lot more things than python, access control, tight control of inheritance
most of the "extra" keywords I saw fall into one of those categories, rather than providing two ways of doing the same thing
I guess 1-3 are all pretty related.
And yeah I guess a "fourth" category which is relatively small, but still, is things that help it with Java interop, obviously there's no analogous issue for python
but the fact that you can create things that "look" like keywords in kotlin, is very cool
infix functions
if you came from another language you would think using is a keyword but it's not
the fact that you can omit parens when passing lambdas, + the extra capabilities of inline functions let you do some pretty neat things
is there any "theoretical" reason why python/mypy couldn't support generic unions, or is it just practical in that nobody has come up with a good proposal yet
Eg something like FooOrBar = Foo[A] | Bar[A] but where FooOrBar itself can be parameterized
Unrelated but inspired by the above, with deferred evaluation of annotations, could unbound names in annotations be inferred to be TypeVars?
I don't think so. You can sort of have a generic union, can't you, by simply nesting it inside a class
pyright can do that
I mean really it seems like the missing feature here has nothing to do with unions
So as always, the answer is "because this is not specified"
it's generic type aliases that you need
generic type aliases do exist
Good point
Is this just a pyright feature then?
Pretty sure it's not in any pep or official docs
S = TypeVar('S')
Response = Union[Iterable[S], int]
this is on the typing documentation
Huh
isn't this what you want?
I missed that apparently
My example from above works in mypy:
https://mypy-play.net/?mypy=latest&python=3.9&gist=86dea81b6092d585daf8b75eb50a191c
(press Run)
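For reference, a small sketch of how the generic alias from the typing docs gets parameterized. The `describe` function is a made-up example, not something from the docs:

```python
from typing import Iterable, TypeVar, Union

S = TypeVar("S")

# A generic type alias: the union itself stays parameterized by S.
Response = Union[Iterable[S], int]


def describe(response: Response[str]) -> str:
    # Response[str] is shorthand for Union[Iterable[str], int].
    if isinstance(response, int):
        return f"status {response}"
    return ", ".join(response)


print(describe(404))
print(describe(["a", "b"]))
```

So the `FooOrBar[A]` idea above is expressible as long as the alias is built from a `TypeVar`.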
wait until you see recursive type aliases in pyright
@grave jolt can you define something like a cons list that way?
data List a = Nil | Cons a (List a)
like this i guess
it would be pretty amazing to be actually able to write out the type of json properly in python
yeah i have tried and failed many times
I can't think of a single time otherwise I've wanted recursive types
yeah it's not possible currently
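For what it's worth, a sketch of what a recursive JSON alias could look like. It runs fine at runtime because the self-reference is a string; whether a type checker accepts it is an assumption that depends on the checker and version (per the chat, pyright does and mypy did not at the time). `depth` is just a made-up helper to exercise it:

```python
from typing import Dict, List, Union

# Recursive alias via string forward references. Checker support for this
# is assumed here, not guaranteed.
JSON = Union[Dict[str, "JSON"], List["JSON"], str, int, float, bool, None]


def depth(value: JSON) -> int:
    """Return the nesting depth of a JSON-like value (scalars have depth 0)."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


print(depth({"a": [1, 2, {"b": None}]}))
```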
is there something like numbers.Number that also includes float?
numbers.Rational I think
!e
import numbers
print(isinstance(3.14, numbers.Rational))
@paper echo :white_check_mark: Your eval job has completed with return code 0.
False
alas
iirc thats not valid either because floats arent really real numbers
!e
import numbers
print(isinstance(3.14, numbers.Real))
@paper echo :white_check_mark: Your eval job has completed with return code 0.
True
oh hey it works
Rational requires .numerator and .denominator and provides a default conversion to float
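A quick sketch of where the built-in types sit in the `numbers` tower, just extending the two eval jobs above with a few more checks:

```python
import numbers
from fractions import Fraction

checks = {
    "int is Rational": isinstance(3, numbers.Rational),
    "Fraction is Rational": isinstance(Fraction(1, 3), numbers.Rational),
    "float is Rational": isinstance(3.14, numbers.Rational),
    "float is Real": isinstance(3.14, numbers.Real),
    "complex is Real": isinstance(1j, numbers.Real),
}
for label, result in checks.items():
    print(label, result)
```

So `numbers.Real` is the one that covers `int`, `Fraction`, and `float` together.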
yes, pyright can do that
@grave jolt interesting, I wonder if and when mypy will get that
I think theres an open feature request for it
there is, doesn't seem that it will be very soon
yeah, mypy complains "possible cyclic definition"
can someone give me a candy
Hodl and buy the dip @grave jolt
?..
that certainly doesn't belong here.
!ot
Off-topic channels
There are three off-topic channels:
• #ot0-psvm's-eternal-disapproval
• #ot1-perplexing-regexing
• #ot2-never-nester's-nightmare
Their names change randomly every 24 hours, but you can always find them under the OFF-TOPIC/GENERAL category in the channel list.
Please read our off-topic etiquette before participating in conversations.
Thank you master yoda
Learning Python on DataCamp
When defining __init__(self) in a subclass, does it override the superclass __init__?
Yes.
I've been trying to understand super() but everything I've read relates to an init that also takes arguments instead of just using it to set instance variables
How do I use the subclass init to add extra instance variables without also overriding the superclass init
My predecessor's solution was just to set all these as class variables in the superclass but I'm trying to get us off that
Overriding init isn't the problem. You still have access to the parents init thanks to super
you call super().__init__() to call the superclass's init
So, write your init in the subclass. Inside that init, use super to call the parent's init. And do any extra assignment as you see fit.
Ohhhh right. Duh. Thanks you guys
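A tiny sketch of that, with no arguments involved. `Sensor` and `CalibratedSensor` are made-up class names:

```python
class Sensor:
    def __init__(self):
        self.readings = []


class CalibratedSensor(Sensor):
    def __init__(self):
        super().__init__()  # runs Sensor.__init__, so self.readings exists
        self.offset = 0.5   # extra instance variable added by the subclass


sensor = CalibratedSensor()
print(sensor.readings, sensor.offset)
```

The subclass `__init__` does override the parent's, but calling `super().__init__()` inside it means both run.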
Anybody got any good resources to learn computer vision?
Hm, perhaps try #data-science-and-ml for that question. For what it's worth I personally just used opencv tutorials
Anyone have opinions on which is more pythonic?
Say I have some instance variable I need to create via api call. It only is needed sometimes when the class is invoked but not always
When it's needed, which should I do?
Style 1: always if check for the variable. If truthy, continue, if falsy, api call
Style 2: try: reference variable
Except: api call, then reference variable
IMO I would initialize it to None in the constructor, and then just check if self.attr is None
since it sounds like you're going to be catching an AttributeError?
Yeah, it's preferable to do that, so that all the instances have consistent attributes. Additionally, conditionally creating the attribute may give you a slight performance penalty - CPython has an optimisation which assumes that all instances are going to have the same keys, which means adding it like that might invalidate that and cause the dict to need to be separate. Not 100% sure though.
Hmm okay thanks guys
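A sketch of the "initialize to None, check `is None`" approach. `Report` and `_fetch_summary` are made-up names, and the method body is a stand-in for the real API call:

```python
class Report:
    def __init__(self):
        self._summary = None  # every instance has the attribute from the start
        self.api_calls = 0

    def _fetch_summary(self):
        # Stand-in for the real (hypothetical) API call.
        self.api_calls += 1
        return {"status": "ok"}

    @property
    def summary(self):
        # Only hit the API the first time the value is actually needed.
        if self._summary is None:
            self._summary = self._fetch_summary()
        return self._summary


report = Report()
first = report.summary
second = report.summary
print(report.api_calls)
```

Checking `is None` rather than truthiness means a legitimately empty result does not trigger a second call, as long as the API never returns `None` itself.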
Do any of you fine folks know where I can find a detailed spec on indentation consistency?
The inconsistent indentation error is a weird one for me because it pops up every once in a while- drives me nuts- but I can't ever figure out how to cause one specifically if I want to test it out
Also, question
Why, if grammars are more powerful than regexes, would you bother lexing in the first place?
Alright so, I have a library where I'm wrapping a dictionary inside of some class, for example:
from typing import Any, Dict, Optional
class Entity:
    def __init__(self, data: Dict[str, Any]) -> None:
        self.DATA = data
    @property
    def name(self) -> Optional[str]:
        return self.DATA.get("name")
is there any better way to wrap it? perhaps make __getattr__ redirect to the DATA dictionary?
class Entity(dict):
    def __init__(self, *positional, **encyclopedic):
        # positional values are keyed by their index
        dict.__init__(self, {**dict(enumerate(positional)), **encyclopedic})
That one is just me being fancy, but a simpler version
class Entity(dict):
    def __init__(self, data: Dict[str, Any]) -> None:
        dict.__init__(self, data)
Why do you need to encapsulate a whole dictionary? Why not just pass the data the class needs?
the data might not be fully consistent
Not sure how that's a problem.
By consistent, do you mean the data may or may not be there when you look for it?
What's the problem with
class Entity:
    def __init__(self, name: Optional[str]):
        self.name = name
...
data = fetch_data()
new_entity = Entity(name=data.get("name"))
def __getitem__(self, subject):
    return self.DATA.get(subject)
Get item, get attr for dot notation
in case you want that
Oh, well, if you need to individually classify the types of different items in the data
You're solving the wrong problem, encapsulating the whole dictionary is bad design.
I mean, I'm no expert, but I'd imagine thats a much more complicated issue
What is it you're trying to do?
self.__dict__ = data
One main bad design part is that this means your class is tied to this input dict data structure. If that needs to change, your code will suddenly break. As you mentioned some keys might not be present, which means that users will have to catch AttributeErrors which feels very awkward.
Even if it's more code, it's probably better to unpack the values individually. That way you're also checking the properties are correct.
You don't need to generate code, you could make your own descriptor class, that reads from a dict or something.
Oh, this, this is a good idea
Type checking will even work with that.
You'd still need to assign a type to each attribute name though
yeah
Properties might be easiest
Sure, but you should do that anyway, kinda bad API design to make you look elsewhere for what something contains...
fair enough
thinking of some sort of REST API, what would be a better way to interface structures to the user?
You could expedite it using a property-like descriptor to save a bit of code, but the fundamental problem remains the same- if you have some internal data that you need to get and set, return defaults for if not present, and type check, there is no simple way to automate that process
In my estimation, even trying to would be a waste of time for anything less than a few hundred items
Well, for one probably not directly copy-pasting the API to the client, since there's likely details that are annoying to set, HTTP particulars, etc.
In other words determine what would be good for the Python user, and have it convert over.
class accessor:
    def __init__(self, name, typing, default=None):
        self.name = name
        self.typing = typing
        self.default = default
    def __get__(self, instance, prototype):  # prototype means class, but a little less ugly than cls or klass
        if instance is None: return self
        # the data lives on the Entity instance, not on the descriptor
        return instance.data.get(self.name, self.default)
    def __set__(self, instance, value):
        if not SOMETYPECHECKINGFUNCTION(self, self.typing, value):
            raise TypeError('bad value')
        instance.data[self.name] = value
class Entity:
    def __init__(self, data):
        self.data = data
    someStringAttrOne = accessor('someStringAttrOne', str)
    someStringAttrTwo = accessor('someStringAttrTwo', str)
    someIntegerAttrTwo = accessor('someIntegerAttrTwo', int)
I might be totally misunderstanding what you're trying to do, but thats one way you could access, type check, and provide defaults for attributes using a standard descriptor
Note as of Py 3.6, __set_name__ is called passing in the name of the attribute, so you don't need the redundant parameter.
Whaaaaa?
During class creation, Python checks all class members, and if they're descriptors it calls that.
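A sketch of the same descriptor idea with `__set_name__` filling in the name, so it isn't repeated at each use. `Accessor`, `Entity`, and the `isinstance` check are illustrative choices here, not the code from above:

```python
class Accessor:
    """Descriptor that reads and writes a key in the owning instance's data dict."""

    def __init__(self, expected_type, default=None):
        self.expected_type = expected_type
        self.default = default

    def __set_name__(self, owner, name):
        # Called once per attribute while the owning class is being created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return instance.data.get(self.name, self.default)

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(f"{self.name} must be {self.expected_type.__name__}")
        instance.data[self.name] = value


class Entity:
    name = Accessor(str)
    age = Accessor(int, default=0)

    def __init__(self, data):
        self.data = data


entity = Entity({"name": "bob"})
entity.age = 3
print(entity.name, entity.age, entity.data)
```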
Something like this is a sane way of doing it.
class Entity:
    def __init__(self, name: Optional[str] = None):
        self.name = name
    @classmethod
    def from_json(cls, data):
        return cls(name=data.get("name"))
...
class Client:
    ...
    def get_entity(self) -> Entity:
        res = requests.get(...)
        return Entity.from_json(res.json())
fair enough, I guess
Oh python
serializing back though
you can make a custom json encoder.
Indeed, check the docs for the encoder/decoder classes, there's hooks.
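A minimal sketch of the encoder hook, assuming a made-up `Entity` with a `to_json` method (neither is from a real library):

```python
import json


class Entity:
    def __init__(self, name=None):
        self.name = name

    def to_json(self):
        return {"name": self.name}


class EntityEncoder(json.JSONEncoder):
    def default(self, o):
        # default() is only called for objects json cannot serialize itself.
        if isinstance(o, Entity):
            return o.to_json()
        return super().default(o)  # raises TypeError for unknown types


encoded = json.dumps({"entity": Entity("bob")}, cls=EntityEncoder)
print(encoded)
```

Going the other way, `json.loads` accepts an `object_hook` callable, which could call something like `Entity.from_json`.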
Ohhhhhhh I did that once actually
You can hit me up tomorrow and I'll see if I can find it
I'll send it on over
grammars are generally quite a bit slower, but if you look at the JSON grammar on json.org, it doesn't have a lexing step. So it isn't mandatory by any means. something like common lisp also doesn't have a sound lexer you can make.
is there a consistent way to import errors from libraries? Like if a library throws a custom error and I want to try/except with that error, sometimes I'm unsure how i should actually import that error to do so.
You might want to subclass all your stuff from a base class that has a to_json method or something.
have you looked at types.SimpleNamespace?
but honestly, this is a usecase for libraries like pydantic and marshmallow
Exceptions are classes like anything else, so they can be anywhere. Often there can be a specific module holding them, but otherwise no unfortunately.
So it's a matter of optional preprocessing for the sake of simplifying the parsing process (in some cases), and maybe a bit of optimization
yup
Rockin
Well Lak, Imma ask you a big question
If I want to build an actually halfway decent pure python implementation of pythons new PEG parser, where do I start?
Bearing in mind I'm starting from square one (doing my proper research though, already)
I think Guido wrote one?
there is a talk by guido himself on PEG parsers https://www.youtube.com/watch?v=QppWTvh7_sI, which may be a solid enough start. I unfortunately know very little about parsing as a whole, the few tasks I did being either highly specialised to the point of not benefiting from a formal grammar or using lark/antlr for automatic generation.
Fair enough. I've seen PEG for fun and profit but I've got a lot more context now. I'll give it another watch
There's the PEP itself: https://www.python.org/dev/peps/pep-0617/
And the accompanying articles: https://medium.com/@gvanrossum_83706/peg-parsing-series-de5d41b2ed60
I always find the python core devs so fuckin funny
Raymond Hettinger XD cracks me up every time
That too
Don't you just import the whole library and catch their internal exception code
Like except botocore.exceptions.ClientError as error
that's caused by mixing tabs and spaces. just use all spaces and you'll be good
Alternatively if you just want to catch one type of error a lot and don't want all that visual clutter you can do:
from botocore.exceptions import ClientError
...
except ClientError as e:
Obviously in that case you also have to be sure that the Exception name doesn't clash with some already existing error class name
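The same two styles, sketched with a stdlib exception standing in for the botocore one so it runs anywhere. `parse_or_none` is a made-up helper:

```python
import json
from json import JSONDecodeError  # the very same class as json.JSONDecodeError


def parse_or_none(text):
    try:
        return json.loads(text)
    except JSONDecodeError:  # could equally be written json.JSONDecodeError
        return None


print(parse_or_none("[1, 2]"))
print(parse_or_none("{"))
```

Either spelling catches the same class; the import form just trades a longer `except` line for one more name in the module namespace.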
Anyone have reasonably detailed experience using mypy and pyright on the same codebase?
If so, I'd be curious to know what kind of differences you saw in the issues caught, false positives, false negatives, etc, overall practicality for use in CI
and which you ended up sticking with (or neither, or both)
I haven't done both at the same time, but it sounds like a nightmare
Well I know that much. My issue is that I need to be able to detect the mixing of tabs and spaces at the start of new lines and throw the appropriate error. The issue is that Python is actually fine with a mix of tabs and spaces, especially with regard to indentation within parentheses and also continued lines
python type checking is full of good intentions, but you know what they say about hell
It has something to do with lines, possibly of the same length, mixing tabs and spaces within the same block, and even then under some circumstances it lets it slide
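For anyone wanting to trigger it on purpose, here is one way that should work. The language reference says indentation is rejected when a file mixes tabs and spaces in a way that makes the meaning depend on how many spaces a tab is worth. A block whose first line uses one tab and whose second uses eight spaces is such a case:

```python
# Line 2 indents with a single tab, line 3 with eight spaces. Whether they
# are at the same level depends on the tab width, so Python refuses to guess.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")
except TabError as exc:
    outcome = f"TabError: {exc.msg}"
else:
    outcome = "compiled"
print(outcome)
```

`TabError` is a subclass of `IndentationError`, so catching the latter covers both.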
Theres some nuance. I've decided to just say screw it though and disallow all spaces in indentation (I'm a tabs guy, myself) except on lines within parentheses, continued lines, and lines containing only comments
good choice, I can barely wait for the hacker news thread about your language
Actually? O.O I thought you guys thought I was an idiot for even trying to do this
I'm a tabs guy, myself
why
That means a lot though pocket. You have my permission to live another day
i'm talking about the flame war the tabs thing is inevitably gonna start
Ahhhhh
Just stating this- I'm Switzerland in this debate. I just use tabs because its one key instead of four, but I'm never one to kink shame
essentially, indentation is on a stack. As long as an unindent ends at some level on the stack, it is fine. and if an indentation at the next line matches up with the stack, it is once again fine. But if indentations differ, it is a problem
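A rough sketch of that stack idea, heavily simplified: it measures indentation as the number of leading whitespace characters, skips blank lines, and ignores brackets and continuation lines. `indent_tokens` is a made-up name, not the real tokenizer:

```python
def indent_tokens(lines):
    """Yield INDENT/DEDENT/LINE tokens using a stack of indentation widths."""
    stack = [0]
    for line in lines:
        if not line.strip():
            continue  # blank lines never change the indentation level
        width = len(line) - len(line.lstrip(" \t"))
        if width > stack[-1]:
            stack.append(width)
            yield "INDENT"
        else:
            while width < stack[-1]:
                stack.pop()
                yield "DEDENT"
            if width != stack[-1]:
                # The unindent did not land on any level still on the stack.
                raise IndentationError("unindent does not match any outer level")
        yield ("LINE", line.strip())
    while len(stack) > 1:
        stack.pop()
        yield "DEDENT"


tokens = list(indent_tokens(["a:", "    b:", "        c", "    d", "e"]))
print(tokens)
```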
most IDEs automatically make pressing tab 4 spaces
with regards to detecting the indentation type, you should check the old python parser. It's actually very simple to understand.
i'm pretty sure it does something very straightforward like setting an is_tabs bool variable when it first encounters an INDENT
so for example
"""
a:
\t 1
\t 2
\t \t3
\t c:
\t \t\t3
"""
is fine
I think that might just be an illusion (I'll confirm in a bit when I'm done doing battle with this one issue). I've recently switched to using pycharm instead of sublime and as you say it converts tabs to 4 spaces. But I then copied and pasted the code at one point back in to sublime and the tabs were still tabs. I think that the IDE just interprets the tab as four spaces, and does a bit of legwork for you in terms of auto switching from one to the other if you delete one of the spaces
sublime probably automatically changes the indentation
pycharm definitely places spaces, though perhaps not by default
Hmmm
I really don't want to have to code the semantics for mismatched indentation XD I guess that's the bed I made for myself though
pycharm does spaces by default
tabs are something you have to change from the defaults
although pycharm wont automatically change tabs to spaces it will give you the option to convert them
yea I'm almost certain this is default
I forgot exact syntax but vim has a feature that it will use the settings for indentation given at the end of the file. So if in case you really needed specific file to follow specific rules then this was pretty cool. Just reminded of this while reading your comments
Also, Ken Thompson's explanation on why Go prefers Tab for formatting changed my perspective. He said it's a single character but if one requires indentation of size 4 then they can just set the editor so, but if other wants to read the same with size 8 then also it works for them without changing code!
Not to mention its explicit and has fewer parts; both good things in programming
In case of Python 4 spaces definitely as pep8 says so. I think it is spaces more so because then one can exactly align code with just one extra space
Still doable after the first token on the line. Establishing the boundaries of the code block takes precedence to alignment but once a line has determined that, all bets are off
New pyflakes rules disagree!
how do you deal with line length rules when using tabs?
i.e. if a style guide dictates 80 columns max
Well the python lexer and parser do not
Generally too much indentation is a sign of code smell. So you can use it as a clue for having too much nesting. Having said that, I find 80 too harsh, I often break that limit in my own codes
Never thought of this @sacred tinsel always used spaces with Python but this is making me think haha
another thing to note about python indentation is that indentation is only important at the start of a statement, so
"""
(
\t\t\t1
, 4
\t\t \t\t )
"""
is valid python
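That one is easy to check by evaluating the string directly. Inside the parentheses the tokenizer does not care about leading whitespace at all:

```python
# The same text as above: wild mixes of tabs and spaces inside the brackets.
source = "(\n\t\t\t1\n, 4\n\t\t \t\t )\n"

value = eval(source)
print(value)
```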
Python upped their recommendation for line length from whatever it was before (low 70s I think) to about 85- a happy medium between two warring schools of thought, one the original and one who advocates for 100
I prefer 100 myself
I usually set a column at 80 and a hardwrap at 100
Yeah! guard clauses ftw! + I often just ignore the code in code review if it's indented way too much. I just comment refactor this
I would just go with tab = 1 character
i've always felt strange about tabs for indentation specifically because of the "advantage" that you can configure them to translate to whatever width. i keep thinking why is that a benefit. I dislike font ligatures for the same reason
The idea is 1. Refactor to use less indentation 2. Refactor to make a complex line simpler, such that it fits. 3. Refactor to make things like lists or args fall on one line each, using multi-line brackets. And 4. Ignore
Which one to use depends very much on the context.
Well the limit on terminal was set to 80 columns because of hardware size. If you have tab size 4 then 76characters should be a hard line but if tab size is 8 then 72 characters only. Because the constraint is there to allow visibility of code on single line and not actually number of characters
@static bluff definitely not an illusion
yeah I know but you often have a linter check it
editors automatically creating X spaces when you hit tab is very very common behavior
it's been around for ages
Yeah I understand why tabs were preferred but with modern tooling spaces just seem more convenient as you also get things like consistent hanging indents if you don't always use a multiple of the base indentation level for them
Don't be a slave to your own linters I guess. The actual limit is relevant though, because a line 160 chars long is probably really bad, while a line at 87 may not be.
I was asking more from the perspective of how do you define max line length if everyone's indentation is different width
I'm aware that you can refactor long lines
you just have to assign an arbitrary "width" to tab for enforcement purposes
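A small sketch of what "assign a width to tab for enforcement" could look like in a check. `visual_width` and `too_long` are made-up helpers, and the 80/8 defaults are only example values:

```python
def visual_width(line, tab_width):
    # expandtabs advances each tab to the next multiple of tab_width.
    return len(line.expandtabs(tab_width))


def too_long(line, limit=80, tab_width=8):
    return visual_width(line, tab_width) > limit


line = "\t\t" + "x" * 70
print(visual_width(line, 4), visual_width(line, 8))
print(too_long(line, tab_width=4), too_long(line, tab_width=8))
```

The same line passes with a tab width of 4 and fails with 8, which is why the enforcement width has to be fixed in the shared config rather than taken from each person's editor.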
You can use empirical standards from existing code bases as reference. Last I remember the number was 90ish
but that is a good point
it basically means that your editor cannot draw a consistent vertical line that matches the enforcement of column width, at that point
yeah
if your tabs are set differently from the "enforcement width" of a tab
Isn't it implied that if you change your tab width then you also change the limit in linter?
Oh!
Never thought of that
In the context of tabs vs spaces... I should have scrolled up..
No, it's not implied because the linter is run as part of CI
and it's not acceptable for CI to pass for one person and fail for another, it has to be consistent
well the thing is that everyone has to use the same linter config, but everyone's code looks different, so there has to be a disconnect somewhere
right
Oh good lord. This is brutal
lol
Use spaces in actual code. Let your ide convert your tabs to spaces. Follow pep recommendation.
Right on point
I like to keep actual code under 40 spaces and that leaves more than enough room for indents but that's just cuz I like how it looks
Can I just say, that significant whitespace, I honestly thought was a decent idea 20 years ago, when good auto formatters were not really a thing.
But in 2021, it's painfully obvious that python is on the wrong side of that decision
Breaking the line width pep seems like much less of a concern for someone who's breaking the tab vs space convention.
Every language now just has perfect auto formatters anyway... clang-format, gofmt, etc, so the whole "indentation doesn't match intent" issue has been eliminated
Heard of black?
I have, but it doesn't change the fact that python has significant whitespace
https://youtu.be/wf-BqAjZb8M relevant talk
"Speaker: Raymond Hettinger
Distillation of knowledge gained from a decade of Python consulting, Python training, code reviews, and serving as a core developer. Learn to avoid some of the hazards of the PEP 8 style guide and learn what really matters for creating beautiful intelligible code.
Slides can be found at: https://speakerdeck.com/...
the python fo significant whitespace was to solve the problem of program behavior not matching indentation
*the point of
My boi!
I like that there's no noise from syntax to begin/end block, and with python's feature set I don't see a real need for them
python came up with a solution that seemed reasonable at the time, and now I think it's pretty clear, it was the worse solution
when python was created things like clang-format didn't really exist (not at comparable levels of quality, use, etc)
so it seemed a decent choice
But this also creates a problem!
smart auto formatters are definitely a thing everywhere, but I'd say the majority of developers outside of popular OSS still do things their way (which is a mess unreadable to anyone else in a lot of cases)
err, citation needed
everyone I know in private industry, writing C++, clang-format is part of their CI
it's just a given everywhere now
and in go it's literally part of the language
I don't really see that as a big problem as it's evident for anyone beyond a beginner
if you use clang can you write c without curly braces or semis?
No
meh then
err, lol?
I think python is very clean and readable
semi colons, sure, I agree languages shouldn't need it
I have seen people make that exact mistake with braces as well
Could be different environments but my experience doesn't exactly match up here
even worse since they indent it the way they want it to, but then the braces don't match the indentation
I think braces are pretty great and it's probably not accidental that braces just continue to be the top way to delimit blocks, in more and more languages
You say that... And then there's python.
Yes, really the one mainstream exception, and I think I've covered part of the reason for that already.
haskell uses indents
I guess we'll wait and see if there's a flood of new languages that use significant whitespace, at any point in the future
That seems like a bad metric
It's a bad metric to see if any language designers think it's a good idea, and copy it?
How about the language share as opposed to giving weight to a billion languages that will come up, then die off.
I definitely prefer braces to things like END, but within python that has no complex expressions where you'd want to delimit blocks in a more terse way and possibly inline I don't see the need for them.
While reading through code bases of languages that use them, it is easy enough to visually filter out when just understanding the code, but it is a bit in the way
The problem is that languages become popular for a whole bunch of different reasons, so to look at python and say "python is popular, this proves that braces are bad" is silly.
It's also silly to look at new languages that are designed by totally random, perhaps crazy people, and conclude that those ideas are good either.
But if you look at relatively popular new languages, that are designed by fairly serious teams of people
I don't think anyone said they are inherently bad
Braces are very popular, and significant whitespace is not
To me, that's a very good indicator
After all language design is a matter of taste
I don't think one is more popular than the other
There is no metric on that
Well, yes and no, I guess. Some ideas are more successful at different points in time.
You can just say that everything's a matter of taste but that doesn't really leave anything to discuss, and obviously, someone somewhere is going to need to discuss it, because languages continue to evolve, and be created
It is easier for the tooling so I'd say that plays a role in deciding whether to use it in a new language too, but is not exactly relevant to its readability
I can also say new languages choosing braces can also be largely explained by precedent, people prefer whatever they've worked with. And there've been more languages with braces before python came along.
@peak spoke yes, I agree that's a big factor. I don't think the absence of braces is less readable, to be clear.
I think there's minimal difference in readability in typical code examples that are easy to write in both
It would be tricky to say that either metric was truly objective about what people wanted, as opposed to convenience, happenstance, or less work
choosing braces makes sense in that there are a lot more people who really hate indentation over people who really hate braces.
Sure, I agree that precedent is also a factor, but then, python is one of the most popular languages in the world, a huge fraction of people know at least some python, so you could argue there's "sufficient" precedent
With braces there is another divide too! Whether you put { on the same line or next
I think those divides pretty much get solved by languages having official style guides from day 1, and tools that enforce those styles. Not really a big issue.
between Visual Basic, Bash, Shell, excel etc. and python, I would argue that the usage of braces and not braces is pretty close.
python seems to be one of the best as far as a single accepted style canon
really, excel? okay
excel is probably the most used programming language out there, even if few people write massive programs in it.
also the most used functional programming language
well "usage" also includes how much actual code there is in it
Not really in my experience. Since Python requires indentation by design we at least get something. Large number of professional programmers still indent random for C-family languages.
"number of people who write at least one line in it per day" isn't a metric of usage that was agreed upon here
When we say excel programming.. Is that vba scripts or the formulas in cells, or something else.
@warm wadi not sure about randomly, like I said, most reputable places are using clang-format in their CI these days
both
Gotcha
And C/C++ are very old, so obviously lots of people developed their own styles long before auto formatters existed
In that case, I wish your words were less true, but yes excel programming is everywhere
I have seen code from Indian service companies. The horror!
The trend now (enforced to varying degrees of severity) is to have more or less official style guides from day 1, with auto formatters. Sometimes the auto formatters do have options if you want to deviate from the official style guide, sometimes they don't
You see this in Go, Kotlin, Rust (again, to varying degrees of enforcement)
This is so true. In one project one senior developer had optimised their style so well for cscope and ctags. I became fan of his work!
but none of those are more popular than python and they have somewhat different uses
hah, yeah. More broadly I think that's another big difference, languages being designed with tooling in mind, auto formatters too but also many other things.
I will also say, another factor in the brace vs significant whitespace, I've heard this cited as part of the reason why python's lambdas are so limited. That, there just isn't a good way to delimit the end of the lambda.
However, have you seen formatting from black? They spam ( ) so much, every import comes on new line and all. Hate it
Yeah, I have quite mixed feelings about python's auto formatting options.
black, makes some code look pretty weird. In particular, it makes multiple resources in with look unbearable, although this is supposed to be fixed in I think python 3.10
yapf is more configurable and nicer, but there have been issues where it doesn't seem to consider it a bug if it's not idempotent
which I would have thought would be a no compromise requirement for an auto formatter
I think typecheckers could integrate checking for one or a few of the existing multiline lambda possibilities that exist in python
if ppl want them that bad
Oh my god...
It worked
My lexer, still missing a few key components, worked
O.O
nice
I had it lex itself, no issues!
is it open source?
I can show you if you want, but I'm a bit hesitant
Last time I showed my code I was a bit overwhelmed by some vitriol regarding my coding style. I want to note though since switching to Pycharm and having it yell at me every time I break a coding convention, I don't think that will be a problem
nah, it's okay. I will wait until you release it
btw, long back I started reading this and reached nowhere, if in case you haven't seen: https://craftinginterpreters.com/introduction.html
This, this looks like a good read right here
+2
I was going through the code in the collections module, specifically of the OrderedDict and I found this line of code:
def __init__(self, other=(), /, **kwds):
I'm quite curious what is the `/` doing? I've never seen that used in the function parameters
it's for positional only arguments
basically, in this example, it would prevent calls of the form Foo(other=5) from working
usually, the main motivation for positional-only parameters is for functions exposed directly from C
could you elaborate on that a bit more? I still don't quite understand
You can pass arguments in two ways, either positionally (by just putting them inside the parens), or as a keyword-argument (name=value). By default, all arguments can be passed in both ways; arguments specified before / cannot be passed by name, and arguments passed after * cannot be passed without a name
oh, interesting, why would that ever be useful though?
well, which one of them?
/
keyword only is pretty useful because you can force the call site to be descriptive
ah
well, like I said, I think it's mostly useful for functions directly exposed from C
keywd only I know, I have used that in the past
def example(a, /, b, *, c): pass
example(1, 2, c=3) # valid
example(1, b=2, c=3) # valid
example(a=1, b=2, c=3) # invalid, a is positional-only
example(1, 2, 3) # invalid, c is keyword-only
because, those functions will ignore keywords anyway
functools.partial is a good example on why you'd need positional-only
def render_template(template_file, other=3, arguments=5, **kwargs):
    ...
render_template("user.template", username="bob") # ok
render_template("function_documentation.template", arguments=("x", "y")) # problem: binds to the `arguments` parameter instead of going into **kwargs
def render_template(template_file, other=3, arguments=5, /, **kwargs):
    ...
render_template("user.template", username="bob") # ok
render_template("function_documentation.template", arguments=("x", "y")) # ok
reading the pep is probably the best source
yeah, I'll give it a read, thanks!
np
fwiw I've almost never used it
not to say it doesn't have some good uses, it's just probably pretty rare
I have used * pretty often though
All you really need to justify * is to say "there's no way this call will be readable if you don't make it clear which argument this is" which actually happens all the time
Another neat consequence of * is that you can actually have keyword arguments that don't have defaults
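To make that last point concrete: a keyword-only parameter with no default is simply required by name. A minimal sketch (the `connect` function and its parameters are made up for the example):

```python
def connect(host, *, timeout):
    """`timeout` is keyword-only and has no default, so callers must name it."""
    return f"{host} (timeout={timeout}s)"


print(connect("example.org", timeout=5))

try:
    connect("example.org", 5)  # passing timeout positionally is rejected
except TypeError as exc:
    print("positional call rejected:", exc)

try:
    connect("example.org")  # leaving timeout out is rejected as well
except TypeError as exc:
    print("missing keyword rejected:", exc)
```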
the best argument for why / must exist is that, if it didn't, you wouldn't be able to have functions that work like dict.update(). You can do:
d = {"self": 1}
d.update(self=10)
print(d) # {"self": 10}
If dict.update() were implemented in Python instead of C, it'd look something like:
def update(self, /, **kwargs):
for key, val in kwargs.items():
self[key] = val
And if you didn't have the /, then it wouldn't be possible to pass self as a **kwarg (because self=10 would be taken as a value for the self parameter, not as a **kwarg)
!e ```py
class D:
    def update(self, **kwargs):
        ...

D().update(self=10)
```
@raven ridge :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 5, in <module>
003 | TypeError: update() got multiple values for argument 'self'
!e ```py
class D:
    def update(*args, **kwargs):
        self, = args
        for key, val in kwargs.items():
            setattr(self, key, val)

d = D()
d.update(self=1, foo=42)
print(d.__dict__)
```
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
{'self': 1, 'foo': 42}
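For comparison, here is roughly the same toy class written with `/` instead of unpacking `*args` by hand. This is only a sketch and assumes Python 3.8 or later, where the `/` syntax is available:

```python
class D:
    def update(self, /, **kwargs):
        # `self` is positional-only, so the name "self" is free to show up
        # as an ordinary key inside **kwargs.
        for key, val in kwargs.items():
            setattr(self, key, val)


d = D()
d.update(self=1, foo=42)
print(d.__dict__)
```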
If anyone knows the way to fix this I would be grateful:
https://stackoverflow.com/questions/67626065/async-updating-a-dictionary-on-a-database
i'll try my best to examine
Suppose PEP 563 is implemented (postponed to Py 3.11, yay?), what would it look like? I'm having some issue understanding it
you essentially don't get a full object, just a string in __annotations__ and you have to parse it yourself
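A small sketch of what that looks like in practice, using the opt-in `from __future__ import annotations` that PEP 563 provides. The `greet` function is made up, and the module is built from a string only because the future import has to be the first statement of its module. Note that `typing.get_type_hints` can usually do the evaluating for you:

```python
import typing

# Hypothetical module source; in real code this would just be a .py file
# with the future import at the top.
source = '''
from __future__ import annotations

def greet(name: str, times: int = 1) -> str:
    return ", ".join([f"hello {name}"] * times)
'''

namespace = {}
exec(source, namespace)
greet = namespace["greet"]

# With postponed evaluation, the raw annotations are plain strings...
print(greet.__annotations__)

# ...and get_type_hints() evaluates them back into real objects.
print(typing.get_type_hints(greet))
```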
@raven ridge I don't know about that being the strongest argument, at least, not in that context
passing keyword arguments directly to update is awfully hacky, IMHO
Well I've changed network and computer and it's working fine now, I don't know why
OK, yes, yes, it's possible to work around it even in pure Python
I'd argue that the built-in types are probably the best examples for what "Pythonic interfaces" look like. In addition to dict.update(), note that the dict constructor also allows arbitrary keyword arguments, and uses them as name/value pairs.
!e ```py
d = dict(name="John Doe", age=42)
print(d)
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
{'name': 'John Doe', 'age': 42}
It's also just nice for a glue language to be able to express positional arg only semantics that can interface with other language codes.
If I'm not mistaken, that really was the main driver. Before people went overboard with custom approaches to express that args should be positional-only, this PEP essentially allowed python to standardize it for everyone.
I'm mostly just thinking numpy scipy and tensor flow but I'm sure there's more examples out there if we want to find some.
@raven ridge i don't really understand what you mean there
At any rate, turning a literal identifier like name in name= into a string like "name" is basically pretty magical, in static languages of course this doesn't happen without reflection (and often even not then)
I have no problem using magic when it lets me do something cool, but for doing something incredibly simple I don't really see any reason to do it, just to save a few characters
definitely if I saw dict(name=....) etc in code review I'd ask for it to be rewritten to {'name': 'John Doe', ... and IME the latter is far more common in python code
Well, I don't agree, but 🤷
Maybe you find the partial example more convincing, then: there needs to be some name for the function being wrapped, and it cannot be passed as a keyword because arbitrary positional and keyword arguments are forwarded to the wrapped function
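A stripped-down sketch of that argument, not the real implementation: `simple_partial` and `describe` are invented names, and the real `functools.partial` does more (it is a class with `func`, `args` and `keywords` attributes, for one).

```python
import functools


def simple_partial(func, /, *args, **kwargs):
    # `func` is positional-only, so a keyword argument that happens to be
    # called "func" is forwarded to the wrapped function instead of clashing.
    def wrapper(*more_args, **more_kwargs):
        return func(*args, *more_args, **{**kwargs, **more_kwargs})

    return wrapper


def describe(*, func, value):
    return f"{func}({value})"


bound = simple_partial(describe, func="sqrt")
print(bound(value=9))

# The standard library version accepts the same call.
print(functools.partial(describe, func="sqrt")(value=9))
```

Without the `/`, the call `simple_partial(describe, func="sqrt")` would fail with "got multiple values for argument 'func'".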
I have seen dict(a=b) used even in pycon talks. I do agree it is quite rare compared to the more common {} form
I do prefer it if possible since it is easier to type
I mix back and forth pretty freely depending on what I think will be more readable for any particular function
especially if there's 20 different things, I think the dict kwargs approach tends to be more readable - there's less syntactic mess to wade through
the whole rationale is described in the PEP that added it
Makes even more sense with update if you're just updating some set values, no reason to construct a dict there
!e
print(list('does', 'this', 'work?', ('no?', ...)))
@acoustic crater :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 1, in <module>
003 | TypeError: list expected at most 1 argument, got 4
seems kinda weird that list constructor only takes one arg in that case
just use a literal
well, there are 0 cases where that would be better than a literal
yeah, something like
```py
a.update(new1=v, new2=v2, new3=v3)
```is much nicer than
```py
a = {**a, 'new1': v, 'new2': v2, 'new3': v3}
a.update({'new1': v, 'new2': v2, 'new3': v3})
a |= {'new1': v, 'new2': v2, 'new3': v3}
```
it is nice not having to string literal the keys for sure
it just seems like if dict calls take kwargs for entries list calls should take args
tho it is admittedly not rly useful
Not necessarily. Because.. Yeah
and idiomatically, for an api that takes user_token as part of a dict, it makes more sense to have it unquoted. Since that is part of the API.
since you aren't mapping strings to more strings
you are mapping some specific keys to corresponding values
there is an argument to be made there for using types.SimpleNamespace
the partial example is better, yes. I still think the simple reality of python functions that map directly to C calls, even if it's less sexy and more of an implementation detail, is the main motivation here
If not for that I'm pretty sure that this could have been skipped and anybody who defined a function with self as an actual keyword argument would just be a Bad Person and it would be left at that
well, then you couldn't write your own simplenamespace without also taking *args and raising an error manually
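Something like this is what a home-grown version could look like with `/`. It is only a sketch; the class name `Namespace` is made up and it leaves out most of what `types.SimpleNamespace` offers:

```python
class Namespace:
    def __init__(self, /, **kwargs):
        # Positional-only `self` means no keyword name is off limits,
        # and stray positional arguments are rejected automatically.
        self.__dict__.update(kwargs)

    def __repr__(self):
        items = ", ".join(f"{k}={v!r}" for k, v in self.__dict__.items())
        return f"Namespace({items})"


ns = Namespace(self="me", name="thing")
print(ns)

try:
    Namespace("oops")  # no manual *args check needed to reject this
except TypeError as exc:
    print("rejected:", exc)
```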
that was also a large part of the rationale, having consistent notation for builtins that did not take kwargs
I don't think it's a massive deal, but it is a convenient thing to have for sure. Also helps make certain APIs where passing by kwargs would be illegible nicer
the fact is that if you read the PEP, issues with performance, C API's, much simpler ideas like simply enforcing the order, etc
are the reasons put first and foremost
true
self is mentioned briefly as a special case, partial mentioned not at all
re the quoting, I think it's much truer to the intent of the code to quote it, since that's what it is, a string. To me, definitely doesn't seem worth it to save a couple of characters.
it's also pretty fragile, as soon as you have anything dynamic in any key, you now have to change the whole thing to use a dict
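A quick illustration of that fragility, with made-up keys and values: the keyword form only works while every key is a fixed, valid identifier.

```python
suffix = "v2"

# Fine as long as every key is a hardcoded identifier.
fixed = dict(user_token="abc", name="thing")

# A computed key, or one that isn't a valid identifier ("content-type"),
# forces the literal form.
mixed = {"user_token": "abc", f"name_{suffix}": "thing", "content-type": "json"}

# Mixing both styles works, but is arguably worse than either one alone.
hybrid = dict(user_token="abc", **{f"name_{suffix}": "thing"})

print(fixed, mixed, hybrid, sep="\n")
```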
it would be interesting to try to take a poll of experienced python devs somehow, to try to establish which the community considers more idiomatic
the idea I have with it is that when I have a web API that takes an object of a shape like
```py
dict(user_token='...', name='thing', new_value='etc')
```then the keys are a different logical type than the values, similar to the types of variable names and attributes. So it makes sense to have them use the same quoting as those
not really. Ultimately though if you really want to express the "shape", it's far better to write a dataclass to express the shape.
dataclasses are better, yeah. But sometimes overkill if you only ever create the object in one place.
usually in those cases I still just use dict literals, as you're basically sending json at that point and I want to be clear about the fact that that's what I'm doing
but not all json objects are of the same type
Well, again, you're not really capturing that here anyway though?
It seems like one or the other, you either have a dynamic json/dict, or you have a dataclass which truly captures the structure
I guess, what I can say for this approach is that it works out nicely that both in JSON and in keyword arguments, the "keys" can only be strings
there is a middle ground which avoids the boilerplate of dataclass (and overall the pain that is trying to use dataclasses in JSON) while still making a difference between the outer object, the patches object, and the patches themselves. (some JSON APIs would also use an array)
{
    "patches": {
        "1231231231231": {"value": "99"},
        "1231231231232": {"value": "98"},
        "1231231231233": {"value": "97"}
    }
}
I'm not sure I follow, how is this a middle ground? Isn't this just a dictionary literal?
you can also make hardcoded string keys constants to set them apart
```py
def oauth_request(...):
    return dict(
        client_id=...,
        client_secret=...,
        redirect_uri='whatever.com/etc',
        ...
    )

def oauth_request(...):
    return {
        'client_id': ...,
        'client_secret': ...,
        'redirect_uri': 'whatever.com/etc',
        ...
    }
```I find the former nicer, but it's very minor either way. And doing this with a dataclass is practically indefensible IMO
I dunno, it really depends how it's being used, how far along your code you're passing it
If you create an OauthRequest dataclass, it might actually be less code, if most of the function is just copying arguments into the dict
You also get a type out of it, so now you can annotate where you are passing around the request, instead of just a Dict[str, Any] annotation which is easily fooled
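A rough sketch of the dataclass variant, just to show the shape of the idea. `OauthRequest`, `send` and the field values are made up for illustration and only mirror the keys from the dict example:

```python
from dataclasses import asdict, dataclass


@dataclass
class OauthRequest:
    client_id: str
    client_secret: str
    redirect_uri: str = "whatever.com/etc"


def send(request: OauthRequest) -> dict:
    """Stand-in for whatever would serialize and send the request."""
    # The annotation says exactly what is expected, unlike Dict[str, Any].
    return asdict(request)


payload = send(OauthRequest(client_id="my-id", client_secret="my-secret"))
print(payload)
```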
i strongly prefer dict literal notation over dict()
the only time i think it's ok is if you're constructing a dict meant to be used as **kwargs
because there's some visual analogy to a function call there
and +1 for using dataclasses/attrs instead of dicts
dicts are imo for 1) serialization (e.g. as an intermediary to json), 2) ad-hoc stuff, and 3) external data with an unknown schema
if you know the structure of the data, use a class
if you don't, use a dict
well, also 4) an actual mapping of one kind of data to another, i.e. as a data structure
feels like that one got lost
Lol good point
okay so something I've been thinking about
what are the features of Python that are the most "dynamic"?
in the sense of being most resistant to static optimisation
not sure if I am using the right words
is one thing I was thinking of
of course, eval/exec
as opposed to a statically sized vtable you need a hash table for arbitrary attributes
I think? is that correct
getattr doesn't make that, but setattr could.
but it would be necessary
for the existence of getattr
otherwise you couldn't associate arbitrary attributes with objects?
I don't think reflection is up there for being the absolute most dynamic
hmm, not sure. doesn't getattr work with __slots__
probably metaclasses and decorators
you can statically analyze what decorators do.
with decorators, you can use them for all kinds of things, some of the simpler ones you can approach more statically, but in general they can get pretty crazy
it's very difficult
there's a reason why dataclass is basically magic'ed into mypy
ctypes trickery can be even more dynamic than eval and exec haha
ok, but possible. getattr can't tell what attribute will be returned
in that case the attributes are statically known right
but there are languages that have static reflection
so it is possible, if the input string is known at compile time
Can't you do some pretty deep runtime introspection
I think they could magic in eval and exec too if there was a desire to, at least within a restricted set of behaviors, maybe always giving exec a specific, concise namespace
Is it possible to modify the bytecode of a function in place?
pretty sure that's a yes
not legally, right
you can only replace the code object
you'd assign a new byte string to the function
yeah you can replace __code__
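A tiny sketch of the "legal" route in CPython: the code object's bytecode is read-only, but the function's `__code__` attribute can be reassigned. The two functions are made up for the demo.

```python
def answer():
    return 41


def corrected():
    return 42


# Attributes of a code object cannot be assigned to directly.
try:
    answer.__code__.co_code = b""
    read_only = False
except (AttributeError, TypeError):
    read_only = True
print("co_code is read-only:", read_only)

# Swapping in a whole new code object is allowed, though.
answer.__code__ = corrected.__code__
print(answer())
```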
in the general case, the string isn't known at compile time, which is what makes it impossible.
I mean, you still have runtime reflection, which many statically typed languages have
you can modify co_code in place with ctypes but thats not trivial
if you phrase the question in terms of something that's impossible, then indeed, it's impossible
But most things you do with reflection, can be done in static languages, or even statically completely in some languages
there's nothing like metaclasses in almost any (any?) static language
yeah that's what I meant by "not legally"
we are talking about Python, right?
what do you mean "like metaclasses"
sure, but they can be analyzed statically.
no, they can't
because metaclass is the type of a type
and you can't have it be types all the way down
why is that? You know the metaclass, you can look at its code.
metaclasses work because in python, types themselves are values
huh
it doesn't matter if you can look at its code, it's just not an idea that's compatible with strict separation of types and values
which is the whole point of statically typed languages
static analysis just means looking at code without executing it afaik
in python there is no actual separation between types and values
I don't see how that's important.
exactly: what can you understand about the code without running it.
tho then "runtime static analysis" is a bit oxymoronic except in the case of strings as they're passed to exec
disagree...
!e Hm?
class A:
    __slots__ = "foo",

a = A()
a.foo = 42
print(getattr(a, "foo"))
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
42
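To spell out the difference being discussed, here is a sketch with two invented classes. A regular instance keeps arbitrary attributes in a per-instance dict (the hash table), while a `__slots__` class has a fixed set of attributes that `getattr` and `setattr` still work on by name:

```python
class Dynamic:
    pass


class Slotted:
    __slots__ = ("foo",)


d = Dynamic()
setattr(d, "anything", 1)  # stored in the per-instance dict
print(d.__dict__)

s = Slotted()
setattr(s, "foo", 42)  # fine: "foo" is a declared slot
print(getattr(s, "foo"))
print(hasattr(s, "__dict__"))  # slotted instances have no per-instance dict

try:
    setattr(s, "bar", 1)  # not a declared slot
    rejected = False
except AttributeError:
    rejected = True
print("undeclared attribute rejected:", rejected)
```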
"which features can be statically analyzed"
"this feature is based on a concept that's fundamentally incompatible with static typing"
dependent typing...
you're claiming a thing that we don't agree with.
the point of static typing is literally more compile-time analysis of types
that's it
exactly.
in a statically typed language, types can be assigned to every single expression without running the program
that's the definition of a statically typed language
yes, how do metaclasses prevent that?
if you guys have your own definition of statically typed languages, then yeah, a discussion will be difficult
we don't have our own definition.
this is true
but that doesn't imply that types and values are entirely separate
which is literally not the case
metaclasses don't prevent figuring out the type of a name.
you need to execute the code in the metaclasses to understand the type
that's code execution
why do you have to?
You can change the class of an object dynamically at runtime, by assigning to instance.__class__
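A minimal sketch of that, with two invented classes. It works here because both are plain classes with compatible layouts; Python refuses the assignment when the layouts differ.

```python
class Cat:
    def speak(self):
        return "meow"


class Dog:
    def speak(self):
        return "woof"


pet = Cat()
before = pet.speak()

pet.__class__ = Dog  # same object, different class from here on
after = pet.speak()

print(before, after, isinstance(pet, Dog))
```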
the code can do arbitrary things, add attributes, remove attributes
and it commonly does
that's a huge part of the point of metaclasses and decorators
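A toy example of the kind of thing meant here: a metaclass that removes one attribute and adds another while the class is being created. `Rewriter`, `Widget`, `secret` and `greet` are all invented names.

```python
class Rewriter(type):
    def __new__(mcls, name, bases, namespace):
        # Drop an attribute the class body defined...
        namespace.pop("secret", None)
        # ...and add one that appears nowhere in the class body.
        namespace["greet"] = lambda self: f"hi from {name}"
        return super().__new__(mcls, name, bases, namespace)


class Widget(metaclass=Rewriter):
    secret = 1


print(hasattr(Widget, "secret"))
print(Widget().greet())
```

Reading only the body of `Widget` tells you nothing about `greet`; a tool has to either understand or run `Rewriter.__new__` to know the final set of attributes.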
do you mean setattr?
lol ok
python is more or less deterministic it's just convoluted to statically analyze if you include every feature
there are some pretty simple empirical facts here, that setattr and getattr equivalents exist in static languages, in some cases even static equivalents, and the features that I'm naming don't
but they're talkin about static analysis not python being a statically typed lang
seems like a pretty clear correlation yeah?
the question is which features are the most resistant to static analysis, those are also going to be the features that are hardest to include in a statically typed language?
i think you're being condescending, tbh
And then, statistically, the ones that are the least frequent
"we're talking about python right"
but mypy and IDEs and so on do static analysis of Python code
I think you were condescending a while ago, not sure there's value in pointing that out
i didn't mean to be condescending
well, neither did i
🥴
I'm frequently wrong haha I'm here to learn, I thought RPython was R the other day >_>
wait until you hear that IronPython is not made out of Iron
😮
I don't see how this follows at all - What makes you think that the features that are most resistant to static analysis are the features that are hardest to include in a statically typed language?
dammit I've been taking supplements of it and everything
i don't think metaclasses are inherently hard to statically analyze.
because in a statically typed language, everything has to be analyzed statically
not true either
the type system can have holes
you have to assign a type to every expression
yes, you have to
but it's not necessarily (I don't know the appropriate word for this...sound?)
sure, so you can figure out the type of every expression in a statically typed language, we all agree on that. What does that have to do with how easy or difficult static analysis is?