#internals-and-peps
When I run the code without lru_cache I get this result, which is understandable:
with multiporcessing
time took 0.4375
witout multiporcessing
time took 8.8125
But when i run using lru_cache this is the result:
Test1
with multiporcessing
time took 0.34375
witout multiporcessing
time took 0.3125
Test2
with multiporcessing
time took 3.234375
witout multiporcessing
time took 3.046875
Here we can clearly see that the run without multiprocessing is almost equal to, or a little faster than, the multiprocessing run. What's the reason for this? I understand that creating processes is overhead, but the work list is huge (10 million items), so I guess the chunk size is not too small. Or am I doing this the wrong way?
Code explanation:
oddlist() takes a number and returns the sum of all odd numbers in that range
oddcounts is a tuple containing 10 million random numbers
Code:
import os
from random import randint
from functools import reduce
from operator import add
from multiprocessing import Pool
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def oddlist(num):
    return reduce(add, (i for i in range(num) if i & 1))


if __name__ == '__main__':
    oddcounts = tuple(randint(10, 50) for i in range(10000000))
    print('with multiporcessing')
    s = time.process_time()
    with Pool(12) as p:
        mp = p.map(oddlist, oddcounts)
    e = time.process_time()
    print(f'time took {e-s}')
    print('witout multiporcessing')
    s = time.process_time()
    z = tuple(oddlist(i) for i in oddcounts)
    e = time.process_time()
    print(f'time took {e-s}')
Each process has its own cache, so when you use multiprocessing, your caching is 1/12th as effective as it would otherwise be. There are only 41 possible input values to oddlist (randint(10, 50) includes both endpoints). In the multiprocessing case, each process computes all 41, then uses the cache. Without multiprocessing, all 41 are computed only once. So, in addition to the overhead of starting the processes, each process does more work than it would need to if caching were working as intended. Also, you're paying a cost to pass the work to each process and to pass the results back.
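To make the caching argument concrete, here is a minimal single-process sketch. The function name `oddsum` and the sample size of 100,000 are assumptions chosen to keep it quick; it is not the original benchmark. It reads `cache_info()` to show that the function body runs only once per distinct input. In a `Pool`, each worker process would repeat those misses in its own copy of the cache.

```python
from functools import lru_cache
from random import randint, seed


@lru_cache(maxsize=None)
def oddsum(num):
    # Sum of the odd numbers below num.
    return sum(i for i in range(num) if i & 1)


seed(0)  # fixed seed so the run is repeatable
inputs = [randint(10, 50) for _ in range(100_000)]
results = [oddsum(n) for n in inputs]

info = oddsum.cache_info()
distinct = len(set(inputs))
print(f"distinct inputs: {distinct}, misses: {info.misses}, hits: {info.hits}")
```

Every call beyond the first one per distinct value is a cache hit, which is why the cached single-process run has so little left to parallelize.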
Does anyone know a nice blogpost about proper project structure of CLI program you want to distribute in 2020? Unfortunately, some advice I found may or may not be outdated, and some advice probably differs between libraries, CLI tools and backend projects.
Does anyone know of some sort of COW implementation of threading.local() that can operate similar to how fork() operates? Eg.
```py
local = local_fork_implementation()

def thread():
    print(local.foo)  # outputs 'foo'
    local.foo = 'bar'  # COW here
    print(local.foo)  # outputs 'bar'

local.foo = 'foo'
Thread(target=thread).start()
time.sleep(10)
print(local.foo)  # outputs 'foo'
```
What do you mean by "copy on write"? There's no copying at all involved in anything you showed there, other than the copying of a pointer.
Ah, wait, I see: you want to inherit the value from the thread that started the new thread as the initial value in the new thread.
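One way to get this "inherit the starting thread's values" behaviour from the standard library is `contextvars` rather than `threading.local()`. This is only a sketch, assuming it is acceptable to start threads through a small helper; `start_inheriting_thread` is a made-up name, not an existing API.

```python
import contextvars
import threading

foo = contextvars.ContextVar("foo", default=None)


def start_inheriting_thread(target):
    # Snapshot the caller's context; the new thread runs inside that copy.
    ctx = contextvars.copy_context()
    thread = threading.Thread(target=ctx.run, args=(target,))
    thread.start()
    return thread


seen = []


def worker():
    seen.append(foo.get())  # value inherited from the starting thread
    foo.set("bar")          # only changes the copied context
    seen.append(foo.get())


foo.set("foo")
start_inheriting_thread(worker).join()
seen.append(foo.get())      # the starting thread still sees its own value
print(seen)
```

The copy is a snapshot of the mapping, so nothing is duplicated beyond references to the stored values.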
Question about string handling. Are strings beyond a certain length not automatically hashed (because it would be prohibitively slow to do so)?
What do you mean by "automatically hashed"?
If I create a string in Python, isn't it stored in such a way that if I try to create another string with the same value, it just ends up being a reference to the first string?
That's interning, not hashing
OK, I got my terminology wrong!
Better to say: is there a max length for interned strings?
@raven ridge Yes that's right - I probably shouldn't have specified COW, as that's not really relevant for the desired behavior (other than saving memory).
As far as I know, any string can be interned with sys.intern. CPython only caches strings under a length of 4096
ok, that's what I was thinking of
Only names are interned by default, not all strings
And interning a string isn't slower if it's longer...
Well, it strikes me that if you read in something and then try to intern it, you have to compare it against what's already there, and that requires at least one scan of the string -- unless that's done at the same time the string is built, in which case my worries are essentially moot
Hm, fair point. Interning a string is free, but checking if a new string can be replaced by an already interned copy is not.
I have a scenario where I'm reading in texts that are essentially used as templates, so I was wondering if that process was triggering an interning check. Most are not more than 4K anyway, and I imagine it doesn't take that long to scan such a string. It also occurs to me that the time involved to do this is not in a critical path anyway.
(The real slowness in this app is in waiting for the database. I'll focus on that)
Things that you use as variable names, attribute names, module names, method names, etc are the only strings that are interned by default.
That I did know. I was more concerned about strings constructed at runtime.
In [1]: a = 'bergergerg'
In [2]: a is 'bergergerg'
<>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
<ipython-input-2-8ef80b687b63>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
a is 'bergergerg'
Out[2]: True
Doesn't this mean all strings get interned?
It doesn't seem to intern them if they have anything other than a-z0-9_
Hm. Perhaps string literals are a special case that I'm not finding in the docs
In [1]: a = 'bergergerg!'
In [2]: a is 'bergergerg!'
Out[2]: False
I think what Querty is seeing is a case of each cell being handled separately. I've seen this in the REPL
>>> a="Hello there!";b="Hello"+" there!";id(a);id(b);
2373001015088
2373001015088
>>> a="Hello there!"
>>> b="Hello"+" there!"
>>> id(a);id(b);
2373001447472
2373001448240
That wouldn't explain this one: ^
In that case clearly the string literal was interned, across multiple different cells, which I was not expecting.
I believe all strings known at compile time will be interned, but it won't intern strings that aren't valid names etc.
All length 1 strings are interned, an empty string is interned and any string containing ascii characters, digits and underscores which has a length smaller than 4096 is interned
latin1 chars and empty strings should be cached like the ints from the -5 256 range are
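For strings built at run time, the only behaviour that can be relied on is explicit interning. A small sketch (the sample text is arbitrary): equality always holds, identity is only guaranteed after `sys.intern`.

```python
import sys

# Built at run time, so the compiler cannot fold or intern them for us.
parts = ["Hello", " ", "there!"]
a = "".join(parts)
b = "".join(parts)

print(a == b)  # True: same contents
# Whether `a is b` holds here is a CPython implementation detail,
# so this sketch deliberately does not rely on it.

ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True: both names now refer to the single interned object
```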
Guys. I had an interview at JP Morgan like 2 years ago but this shit still bugs me.
This guy kept drilling me on python3's treatment of duplicate hashmap keys. So I want to know, is python3 doing anything more than the normal string interning to optimize dictionaries where keys are duplicated or close to identical? For example when you have a list of dicts which represent a db row?
Are you talking about how python handles collisions under the hood?
Doesn't look like it @pseudo cradle I checked some material on that
he was trying to imply that python has some optimization where keys are the same in different maps, eg there shouldn't be a hash collision
@coarse nebula if you have a number of dicts all with the same keys, there's an optimization to store the keys once
Sounds like implementation details. Interview questions on implementation details seem like a generally senseless endeavour anyway.
I agree
Afaik, any string used as a key to a dictionary is interned automatically by python
@pliant tusk it's not the strings themselves, it's something about the key structure for the dict. I should find the info...
Doesn't seem to be the case
a = {"a"+chr(ord("a")): None}
print(id(list(a)[0]))
print(id("a"+chr(ord("a"))))
###
2561530657392
2561530657520
"If the keys in a dictionary are interned" seems to imply it's not a default
but in the end it appears to apply for classes
it's not clear though if that means by program design or not
interesting
A classic dictionary is a table of key/value pairs, so each entry in the hash table is two pointers. With key-sharing, the key-pointer vector is stored once for all the dicts with the same keys
do they have to share the same exact set of keys?
I don't know
Hello folks. I have a somewhat convoluted recursive function that looks like:
def investigate(clues, quests, data):
    return quests if not clues else investigate(clues[1:], [(sd[0], quest[1] + (sd[1],)) for quest in quests for sd in data.get((quest[0], clues[0]), ())], data)
I am actually running said function on a somewhat long number of "clues" and performance is poor. Would reducing the number of arguments help with stack frame overhead?
What can I do to reduce the function call overhead as much as possible?
@unkempt rock Not much you can do about it afaik, but investigating heh said function, it being tail recursive makes it a candidate to be reduced to a while loop
@brave badger Yes, it sure is. Part of the assignment is to keep it "recursive" though. I was looking into this:
But the thread's conclusions aren't clear.
Perhaps what I could do is avoid slicing clues[1:] and progressively decrease an index to avoid multiple clues copies being generated at each call.
Seems fair actually
(Thanks.)
The obvious answer, if you're willing to change the arguments, is to not do any copying and instead pass indices along.
def investigate(clues, quests, data, *, _cluenum=0):
    return quests if not _cluenum < len(clues) else investigate(clues, [(sd[0], quest[1] + (sd[1],)) for quest in quests for sd in data.get((quest[0], clues[_cluenum]), ())], data, _cluenum=_cluenum+1)
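For comparison, the while/for-loop reduction mentioned earlier might look like the sketch below. `investigate_loop` and the sample data are made up for illustration; the shape of `data` (a dict from `(city, clue)` to a list of `(next_city, word)` pairs) is inferred from the function above.

```python
def investigate_loop(clues, quests, data):
    # Same expansion step as the recursive version, applied once per clue.
    for clue in clues:
        quests = [
            (sd[0], quest[1] + (sd[1],))
            for quest in quests
            for sd in data.get((quest[0], clue), ())
        ]
    return quests


# Made-up sample data: (city, clue) -> [(next_city, word), ...]
sample_data = {
    ("rome", "a"): [("paris", "x")],
    ("paris", "b"): [("oslo", "y"), ("rome", "z")],
}
found = investigate_loop(["a", "b"], [("rome", ())], sample_data)
print(found)
```

No slices and no extra stack frames are created, so the only remaining cost is building the new `quests` list at each step.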
Has anyone ever encountered someone who thought Python's lack of private variables is a language weakness?
A lot of people complain about the type system of course, and I understand where they're coming from.
lack of private makes the language simpler, but it creates the need for conforming to a convention or a project-specific set of conventions instead of relying on the compiler
This is something I don't know: if you subclass something in Java, are you able to know what the private variables are and manipulate them internally? Or are they always hidden?
I don't know Java, but from what I understand, private is only for the class (like __these members in Python), and protected is for the stuff within the same package plus subclasses, including subclasses in other packages.
@boreal umbra
I guess the one thing I'd change with Python regarding state consistency is clean up how @property is used
class Thing:
    @property
    class a:
        def __get__(...): ...
        def __set__(...): ...

    def __init__(self, a): self.a = a
@grave jolt not entirely sure how the ...s would look but the idea is that the name of the inner "private" variable could be __property_a or something and have that be handled automatically.
So dynamic properties, but private?
I don't see the need for it. As I see it, dynamic properties are needed when you have a dumb property and you need to change its getting/setting to do something else, but not change the interface
but if it's private, you can always change the interface, i.e. turn it into a method or two
It's a good thing that @property makes wrapping method around getting and setting backwards-compatible
but in my opinion what's more important is that the concept of accessing and changing individual values, semantically, is the same throughout the language
Wait, why are you using a class with a property?
that's proposed syntax to cut down on the boilerplate.
it's not a real thing that I know of
class Thing:
    @property
    def a(self):
        return some_computation...

    @a.setter
    def a(self, new_value):  # the setter must reuse the name `a`, otherwise `a` stays read-only
        return some_computation...
I don't see how this is too much boilerplate...
my main objection is that you have to decide on a name for the nonpublic attribute (which will probably just be _a), but then it's your job to keep the name of the nonpublic attribute and the property methods in sync
and I don't like that. That's something I already dislike about getters and setters.
Well, if the property requires a getter and a setter, then the actual property is different from some attribute stored in the object
So the names shouldn't necessarily match
can you give a more real example?
class Thing:
    def __init__(self, a):
        self._a = a

    # property decorators
if I decide I want to rename a, I also have to rename _a
just to make it obvious what has happened.
Yeah I dislike that too
Oh, if you want to use property just to make an attribute read-only, then yes, the names will match.
Or maybe if you just want to enforce some invariant
That's why I like descriptors: with __set_name__, I can make a generic ass setter/getter that automatically does that setattr(instance, f"_{self.attr_name}", value)
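A minimal sketch of that kind of generic descriptor, so the `_name` bookkeeping lives in one place. `Positive` and `Rectangle` are hypothetical names, and the validation rule is only there to give the setter something to do.

```python
class Positive:
    """Data descriptor that stores its value under '_<attribute name>'."""

    def __set_name__(self, owner, name):
        # Called once per class attribute when the owner class is created.
        self.attr_name = name
        self.private_name = f"_{name}"

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.attr_name} must be positive")
        setattr(instance, self.private_name, value)


class Rectangle:
    width = Positive()
    height = Positive()

    def __init__(self, width, height):
        self.width = width
        self.height = height


r = Rectangle(2, 3)
print(r.width, r.height, vars(r))
```

Renaming `width` in the class body is then enough; the private name follows automatically.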
In Java it's even worse--I was making an Android app for a class last year, and one of the libraries we had to use mandated that if you have a private variable named foo, and the methods getFoo and setFoo didn't have those exact names, your code wouldn't run.
Isn't that why lombok exists?
lombok?
isn't that best practices anyways
best practices for Java?
I suppose
Java's ugly.
@Getter @Setter private int age = 10; -- So this means that do-nothing getters and setters are automatically created, and you can implement them manually later if you need them to have non-trivial functionality?
I like that in the sense that it's making up for what was already a poor language design choice.
More recently, I got the impression that getters/setters should be used sparingly because they break encapsulation. E.g. instead of
```py
class Car:
    def is_running(self) -> bool:
        ...  # some implementation

    def set_running(self, new_running: bool) -> None:
        ...  # some implementation
```
do
```py
class Car:
    def start(self):
        ...  # some implementation

    def stop(self):
        ...  # some implementation
```
Alright, this is definitely heading to be more on-topic in #software-architecture
We're still talking about the specification of Python though.
or do i have to buy more ram/do this in C++
@bleak lantern That question isn't on-topic for this channel. Try opening a help session; #❓｜how-to-get-help
ok @boreal umbra
Lack of private variables is both a strength and a weakness. It's literally a weakness in the sense that other languages have a feature that Python doesn't have - the ability to prevent users of a class from modifying the data that the class is encapsulating. But it's also a strength, in that it allows for white box testing through monkeypatching, for instance
it has tradeoffs, like pretty much any other language design choice.
Doesn't this kind of testing create brittle tests? Since private stuff is implementation details, and instead of testing the behaviour of an object, you're testing some details
So you can't have a safety net of tests when you're refactoring those details
well, yes, but the alternative is sometimes test setups that are much, much more complex
Well, if using a class via its intended interface is so complex that monkeypatching is the only reasonable way to test it, isn't it too complex?
For example, if I want to know that I raise the right type of exception if I read back a value from my database that doesn't make sense, I can white box test that by injecting a fake database connection and having it return that garbage data, or I can set up an integration test with garbage data in a fake database and make the test use the public connection string parameter to connect to that database instead, or I can design a new abstraction by which the database to connect to is dependency-injectable, allowing me to pass in a database-connector class that creates database connections that return garbage data to allow me to unit test this
you might say that the DB shouldn't have garbage data in it - that'd be a fair point. Maybe say we're talking about testing what exception is raised when a query to the database times out, instead. Then your choices are 1) monkeypatch a fake DB connection that times out immediately, 2) set up a "real" database - something that accepts connections on a particular port, allows you to send SQL queries to it, and then just hangs, and test against that, or 3) allow dependency-injecting a fake database that times out immediately when attempting an SQL statement
the monkeypatch approach and the dependency-injection approach remain unit tests. But the dependency-injection approach forces you to explicitly define an interface, and make it part of your public interface, exposing flexibility to your users that you don't need and will now have to maintain. The real fake DB option is an integration test, rather than a unit test, making it orders of magnitude slower, and extremely complex to set up.
the best alternative to monkeypatching is dependency injection, but in that case you're making something that could have been private be public instead, and now when you want to, say, switch from a postgresql database to a redis cache, you can't, because you exposed the idea that you use a SQL database to your users and allowed them to inject their own.
(in other words, it buys you looser coupling with your tests, in exchange for tighter coupling with your non-test clients)
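A rough sketch of the monkeypatching option for the timeout case described above. `ReportService`, `ReportError` and `_run_query` are invented for illustration; the only real API used is `unittest.mock.patch.object`.

```python
from unittest import mock


class ReportError(Exception):
    """Raised to callers when the report cannot be produced."""


class ReportService:
    def _run_query(self, sql):
        # Stand-in for a real database call; private by convention.
        raise NotImplementedError("would talk to a real database")

    def row_count(self):
        try:
            return len(self._run_query("SELECT id FROM reports"))
        except TimeoutError as exc:
            raise ReportError("database timed out") from exc


def check_timeout_is_translated():
    service = ReportService()
    # Replace the private method for the duration of the check only.
    with mock.patch.object(service, "_run_query", side_effect=TimeoutError):
        try:
            service.row_count()
        except ReportError as exc:
            return str(exc)
    return None


print(check_timeout_is_translated())
```

The public interface of `ReportService` never mentions a database, which is the coupling trade-off being argued here: the test knows about `_run_query`, the clients do not.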
@raven ridge But couldn't you achieve looser coupling with all the clients by dependency-injecting not an SQL connection, but a general store that can fetch certain data, maybe even a function?
e.g. instead of
```py
class CookieRenderer:
    def __init__(self, sql_database_connection, google_cookies_handle):
        ...

    def render(self, cookie_ids) -> List[str]:
        cookie_rows = self.sql_database_connection.fetchall("SELECT (name, format) FROM cookies WHERE id in ?", cookie_ids)
        cookies = map(Cookie.from_sql_row, cookie_rows)
        new_cookies = map(Cookie.add_chocolate, cookies)  # some logic
        google_responses = self.google_cookies_handle.batch([
            GoogleAPI.cookie.render(
                cookie_id=cookie.name,
                cookie_data=cookie.data_for_google_api(),
            )
            for cookie in new_cookies
        ])
        return [response["renderedString"] for response in google_responses]
```
do
```py
class CookieRenderer:
    def __init__(self, cookie_store, cookie_image_producer):
        ...

    def render(self, cookie_ids) -> List[str]:
        cookies = self.cookie_store.get_cookies_batch(cookie_ids)
        new_cookies = map(Cookie.add_chocolate, cookies)  # some logic
        return self.cookie_image_producer.render_cookies_batch(new_cookies)
```
sure. but then you need to test your SQL implementation of the data-fetcher to ensure that it handles a timeout in executing its statement in the manner that you expect.
Yes, of course, you'll need to test the dirty stuff somewhere. But now the clients of CookieRenderer don't rely on it using an SQL database
right, but you just moved the problem around. We still have the original problem: there's still some class that wants to connect to a database, and wants to handle a failure in executing a SELECT statement in a particular way, and you still don't have a way to test it.
All you've done is change which class you can't test.
In this example, I guess, the best option would be to "rescue" all the pure logic
def transform_cookies_for_rendering(cookies: List[Cookie]) -> List[str]:
    return [cookie.add_chocolate().remove_pineapples(threshold=42) for cookie in cookies]
You can test this logic without any mocks, but then you can do whatever you want to test the impure code that accesses the database, queries this function and calls the API
sure - and it seems pretty reasonable to do so.
the single most valuable thing that you can do with unit tests, IMHO, is test your error handling paths. Manual testing can get you pretty far to making sure you didn't break any of your happy paths, but ensuring that you correctly handle the case when your database is down, or your backend connection times out, or your config file is missing, etc, etc - these are where unit tests really shine. They can help you simulate failures that can be really hard to set up when things are talking to real databases or real filesystems.
alright, thanks for the discussion
I guess another option would be to make a "fully configurable" class with all the things parametrized, test it with fakes, and then make a facade class that would put the real stuff in it
yeah, I guess that would address my concerns. Though, then we're back to the lack of privates - what if your users start using the "fully configurable" class, heh
and, still - it's much more work, and leaves you in a position of needing to support that a bunch of code that you otherwise wouldn't need.
prefix its name with an underscore 
Hyrum's law. 🤷
yeah, well, it's all tradeoffs between nice things and the amount of code that isn't essential...
the lack of privates annoys me pretty often, actually. I write libraries in a corporate environment; if people depend on my private interfaces I can't just break their code and cause an outage, I'm forced to work with them to deprecate their use of the private interface and move to something supported. And moreover, I write a lot of code that binds to C++ libraries. The Python interface is designed to be fully memory safe, but if you mess with my privates you can violate invariants of the C++ libraries and cause crashes or security bugs.
I'm not saying it's a bad tradeoff, but of the design tradeoffs Python makes, it's the one that annoys me most often.
Side note: I recently fell victim to it, kind of. Google Drive allows you to make a file available to anyone with a link to it. But instead of that being a boolean like anyoneWithLinkCanView, it's just a permission in the list of permissions, and to make it recognizable by the web UI, instead of the permission's ID being a hexadecimal number (e.g. a67acbae78b7cf87ecf7c0cbf39c), it's literally the string "anyoneWithLink".
I recently built a kind of CRM/CMS on top of their existing Google Drive, and one of the nice features is that stray anyone-with-link permissions are removed, because they're undesirable (they are an analogue of garbage data in a DB). So I rely on a dirty implementation detail of the Google API (it claims that the ID is an opaque value, but ya know). But it's not an essential feature, so in case Google changes something, no big deal.
so in case Google changes something, no big deal
The person who initially implements the thing often believes that, but 10 years and 4 maintainers later, no one remembers the trick anymore...
Thanks for the answer, hadn't noticed it. Why do you prefer this:
def investigate(clues, quests, data, *, _cluenum=0):
    return quests if not _cluenum < len(clues) else investigate(clues, [(sd[0], quest[1] + (sd[1],)) for quest in quests for sd in data.get((quest[0], clues[_cluenum]), ())], data, _cluenum=_cluenum+1)
To this:
def investigate(clues, quests, data, clue_index=0):
    return quests if clue_index == len(clues) else investigate(clues, [(sd[0], quest[1] + (sd[1],)) for quest in quests for sd in data.get((quest[0], clues[clue_index]), ())], data, clue_index + 1)
Basically, why the * args?
And the "not <".
Same thing, the only difference is whether the last parameter is or is not part of the public interface. Starting it with a _ indicates it's private, and making it keyword only makes sure that the user types the name and can see it's private
Alright. Convention based private parameter then.
not < vs ==, 🤷, no good reason. Either seems fine.
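A tiny illustration of what the bare `*` buys: the bookkeeping parameter can only be passed by name. `countdown` is a made-up function, not part of the assignment.

```python
def countdown(n, *, _depth=0):
    # _depth is bookkeeping for the recursion, not part of the public interface.
    return _depth if n == 0 else countdown(n - 1, _depth=_depth + 1)


print(countdown(3))

try:
    countdown(3, 5)  # positional use of the private parameter is rejected
except TypeError as exc:
    print("TypeError:", exc)
```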
Do you think I could store clues and data as function attributes? They're unchanged throughout the execution. So I don't need to pass them along. Am I right in thinking that the reduced args unpacking might speed up things in many recursive calls?
(Thanks @raven ridge)
Nah, looking up arguments is faster than looking up attributes
nods
In your original implementation, though, I'm betting all your time was being spent making copy after copy after copy of the clues list, thanks to the slice in each recursive call.
I'm not sure why you're worried about the overhead so much. If your assignment requires you to make it a recursive function, just make a recursive function. It doesn't require you to make the fastest possible recursive function. Rewriting it with a loop will yield a better performance gain anyway.
I bet just avoiding the slicing, and the copying it causes, fixes your whole performance issue
Remember that slices don't share any data with the original list that they're sliced from.
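A quick sketch of both points, using an arbitrary list of 1,000 integers in place of the real clues: a slice is an independent copy, and passing `clues[1:]` down a recursion copies roughly n²/2 elements in total.

```python
clues = list(range(1_000))

tail = clues[1:]
tail[0] = -1
print(clues[1])  # the original list is untouched, so the slice was a copy

# Count how many elements a "pass clues[1:] down" recursion copies in total.
copied = 0
remaining = clues
while remaining:
    remaining = remaining[1:]
    copied += len(remaining)
print(copied)  # n * (n - 1) / 2 elements for n clues
```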
Oddly enough, the recursive function using the index is actually a bit slower (according to kernprof and python -u -m timeit).
That doesn't make sense, it should be much faster...
That's a very relevant comment! I meet the assignment's requirements but there's a class wide leaderboard, with fastest execution times and lowest cyclomatic complexity ... I am first, but I like those numbers going down!
Perhaps it's the additional len evaluation, the index increase and extra argument ... I am surprised as well. I will look into it.
may I suggest that the cyclomatic complexity metric they're using is broken?
@grave jolt Absolutely, it is. We're using radon and it can be "tricked" in so many ways.
All of that should be much, much cheaper than the list slicing.
All of that is O(1), and slicing is O(n)
@unkempt rock Can you give a data sample?
I am off topic, but all "cyclomatic complexity" is achieving is pushing me to use "map" instead of "for" loops or list comprehensions, "filter" instead of "if" blocks, even "*" between bool values to avoid "and" ...
Well, this is the entire script:
def ex1(file, city, clues):
    return {(" ".join(quest[1]), quest[0]) for quest in investigate(clues.split(), ((city, ()),), populate_data(file))}


def populate_data(file):
    data = {}; trans = {58: 97, 65: 65, 66: 65, 67: 65, 68: 65, 69: 65, 70: 65, 71: 65, 72: 65, 73: 65, 74: 65, 75: 65, 76: 65, 77: 65, 78: 65, 79: 65, 80: 65, 81: 65, 82: 65, 83: 65, 84: 65, 85: 65, 86: 65, 87: 65, 88: 65, 89: 65, 90: 65, 97: 97, 98: 97, 99: 97, 100: 97, 101: 97, 102: 97, 103: 97, 104: 97, 105: 97, 106: 97, 107: 97, 108: 97, 109: 97, 110: 97, 111: 97, 112: 97, 113: 97, 114: 97, 115: 97, 116: 97, 117: 97, 118: 97, 119: 97, 120: 97, 121: 97, 122: 97, 193: 65, 195: 65, 201: 65, 205: 65, 214: 65, 220: 65, 242: 97, 256: 65, 258: 65, 278: 65, 298: 65, 352: 65, 362: 65}
    with open(file, encoding="utf-8", newline="\n") as handler:
        for quest in (q for line in map(str.split, filter(lambda l: not l.startswith("#"), handler)) for q in line):
            contents = quest.translate(trans)
            target = contents.find('Aa', clue := contents.find('aA', city := contents.find('Aa') + 1) + 1) + 1
            data.setdefault((quest[:city], quest[city:clue]), []).append((quest[clue: target], quest[target:]))
    return data


def investigate(clues, quests, data):
    return quests if not clues else investigate(clues[1:], [(sd[0], quest[1] + (sd[1],)) for quest in quests for sd in data.get((quest[0], clues[0]), ())], data)
And this is a data sample:
Maybe we should move to a help channel
That's silly. So it doesn't do any further introspection and assumes 1 call is better than a loop?
It probably can't since they're built in
It also seems to like a ternary if/else better than ones on separate lines, which makes no sense to me
You're right, sorry. Posted in #help-grapes
Cyclomatic complexity is about as meaningful as "lines of code" is as a complexity metric...
I completely agree.
I mean ... if CC is >10 there's obviously something wrong and the code might benefit from refactoring. But if it's 1, 4, 5, 6 nothing changes really.
CC is how many individual paths a portion of code could take, yes?
That is a pretty vague thing to measure, which may be why it thinks a function call with a lambda is "simple"
A single expression can have many ways through it. Imagine a deeply nested if-expression.
@unkempt rock if you don't mind me asking, how are you determining the CC of the code?
They said that their university's checking system uses radon https://pypi.org/project/radon/
thx
Indeed we use Radon (which builds on McCabe's work, I think) but we also limit ourselves to "builtins". So we can't import modules with few exceptions. It's a good didactic exercise to keep CC low ... until one starts to resort to repeated map, filter, any, all, ternary operators through indices evaluation, multiplication of booleans to avoid and, etc.
So I've got this going on, is it because of a typehinting limitation on properties or am I just going about it wrong?
Why do you have the union wrapped in brackets?
isn't it Dict[key, [value]]?
No, I don't think so
The only one that comes to mind is Callable[arg1, arg2, [returntype]]
I thought it's Callable[[arg1, arg2], returntype]?
ah that's what I was thinking of
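For reference, a short sketch of the two spellings being discussed. The alias names `Predicate` and `Row` and the function `keep` are made up; only the `typing` constructs themselves are real.

```python
from typing import Callable, Dict, Union, get_args

# Callable: argument types go in a list, the return type comes last.
Predicate = Callable[[int, str], bool]

# Dict takes exactly two parameters, key type and value type, no extra brackets.
Row = Dict[str, Union[str, int]]


def keep(count: int, label: str) -> bool:
    return count > 0 and bool(label)


check: Predicate = keep
row: Row = {"name": "spam", "count": 3}
print(check(3, "spam"))
print(get_args(Row))
```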
still no bueno tho Expression of type "dict[str, str | int | dict[str, Unknown]]" cannot be assigned to return type "Dict[str, str | int | Hidden]"
I should just go back to not using typehints, life was so much simpler
also to avoid this turning into a help channel topic, do py properties work the same way they would in static languages?
Pretty much
The underlying mechanism is a bit more dynamic than in statically typed languages, but functionally they're the same
Getting/setting/deleting a property is transformed into a function call, and that function call alters underlying values
when would you need to delete a property?
In my experience, del is used very rarely to say the least, but it's nice that the option to account for it is there
I can't come up with an example where I'd use it, but since you can do del object.attribute, you can override that behavior with a deleter
!e
```py
class Sample:
    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        print("Set x to", value)
        self._x = value

    @x.deleter
    def x(self):
        print("Removed x")
        del self._x

obj = Sample()
obj.x = 10
print(obj.x)
del obj.x
```
@spice pecan :white_check_mark: Your eval job has completed with return code 0.
001 | Set x to 10
002 | 10
003 | Removed x
could you not just do del obj.x without the method?
obj.x is transformed into calling a method on the property descriptor, and if it's not defined, property should raise an error stating that this property doesn't support deletion
!e
```py
class Sample:
    @property
    def x(self):
        pass
del Sample().x
```
@spice pecan :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 5, in <module>
003 | AttributeError: can't delete attribute
oh ok so you'd only use it if x is defined as a property in the first place
and need to overwrite it/etc
Or it just fails to find the attribute, since it's not present on the instance, lol
If x wasn't a property, it would just delete it, yeah
what about getter?
There is a .getter attribute on the property object, yeah, but there's generally no need to use it explicitly as @property uses the function passed to it as a getter already
right that's what I assumed, is there a use case in other languages?
A use case for a getter?
ya
Well, it's very rare for a property to have a setter, but no getter
While the other way around is really common
Getters are probably the primary use case for properties in general
so you'd have to define the getter + the property separately or is property just a generic wrapper specifically for py
It really depends on the language syntax
Python doesn't have properties as a syntactic construct, instead it implements the same functionality via a more powerful descriptor protocol
An example of a language where properties are supported on syntactic level is C#, which has tons of ways to define them. The general syntax for a property that needs to manage the underlying value looks like this:
```cs
public int PropertyName
{
    get
    {
        return _somePrivateVar;
    }
    set
    {
        if (value > 10)
            throw new ArgumentOutOfRangeException();
        _somePrivateVar = value; // `value` is a special keyword, like `this`
    }
}
```
Python instead has the descriptor protocol - if you try to look up an attribute that doesn't exist on an instance, but exists on its class, and the value of that attribute happens to implement the protocol, its __get__, __set__ or __delete__ method is called instead
property is a class that implements that protocol and provides a way to implement what is essentially the same as properties in another language
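A bare-bones sketch of the protocol itself, without `property`. `Logged` is a made-up descriptor that only records which hook was triggered, to show how the three attribute operations on an instance are routed to the object stored on the class.

```python
class Logged:
    """Data descriptor that records which protocol method was triggered."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner=None):
        self.calls.append("__get__")
        return 42

    def __set__(self, instance, value):
        self.calls.append("__set__")

    def __delete__(self, instance):
        self.calls.append("__delete__")


class Sample:
    x = Logged()  # lives on the class, not on instances


obj = Sample()
obj.x        # attribute read   -> Logged.__get__
obj.x = 10   # attribute write  -> Logged.__set__
del obj.x    # attribute delete -> Logged.__delete__

# Go through the class __dict__ so that reading the log does not itself call __get__.
print(Sample.__dict__["x"].calls)
```

Nothing is ever stored on `obj` here; the descriptor decides what reading, writing and deleting mean.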
There are some other hooks, such as __set_name__, which allow for some impressive things
__set_name__ allows you to create attributes right?
pretty much, yeah
neat, thanks that cleared all this up a lot
There's a good How-to article on descriptors in the docs
ah I was just looking at the realpython article
What do you mean by "create attributes"? The __set_name__ method will be called when a class attribute is assigned to an instance of the class implementing it. This means that the instance can become aware of the name/attribute assigned to it.
!e
class MySetNameClass:
    def __set_name__(self, owner, name):
        print(f"The attribute {name} of class {owner} was assigned to me!")
class MyClass:
    foo = MySetNameClass()
    bar = foo # Runs again, because we again assign an attribute to it
@wide shuttle :white_check_mark: Your eval job has completed with return code 0.
001 | The attribute foo of class <class '__main__.MyClass'> was assigned to me!
002 | The attribute bar of class <class '__main__.MyClass'> was assigned to me!
oh I misinterpreted the article, so does that allow you to access foo from MySetNameClass?
I guess you could use the name with getattr on the owner class, but that's not really the purpose of this method. It's used most often in classes that implement parts of the descriptor protocol (__set__, __get__, and __delete__), as it can sometimes be useful to be aware of which attribute was assigned to the descriptor.
For instance, if you have a data descriptor with __get__ and __set__, you may want to store the information in the __dict__ of each instance using the name of the attribute, as that keeps the data with the instance and, if you're consistent in using this method, a name that's unique for that managed attribute.
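A minimal sketch of that pattern, assuming a made-up `Positive` descriptor and `Rectangle` class: `__set_name__` records the attribute name, and `__get__`/`__set__` keep the value in the instance `__dict__` under that same name.

```python
class Positive:
    """Data descriptor that stores its value in the instance __dict__."""

    def __set_name__(self, owner, name):
        # Called once per attribute when the owning class is created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: hand back the descriptor.
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        instance.__dict__[self.name] = value


class Rectangle:
    width = Positive()
    height = Positive()

    def __init__(self, width, height):
        self.width = width
        self.height = height


r = Rectangle(2, 3)
print(r.__dict__)  # {'width': 2, 'height': 3}
```

Because `Positive` defines `__set__`, it takes priority over the instance `__dict__`, so reusing the attribute name as the storage key does not bypass the validation.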
ok so __set_name__ allows you to set a class to an attribute of another class, and __get__/__set__ then allow you to handle what happens when that attribute is called/modified?
sorry I'm trying to dumb this down a bit to understand it
allows you to set a class to an attribute of another class
It merely gives you the information which class and which class attribute was assigned to the instance. You can always assign a class attribute to something, but typically that instance has no way of "knowing" what you assigned to it:
class MyClass():
    # This list instance won't "know" that we assigned
    # MyClass.attribute to it.
    attribute = [1, 2, 3]
However, if the type/class of the instance defines such a __set_name__ method, it will be called after the assignment was made. This allows the object to "know" which class attribute was assigned to it. (So, it has nothing to do with allowing an assignment; rather, it's to do with sharing information: Which attribute of which class was assigned to the instance?)
The __get__ method allows you to define what happens when the attribute assigned to it is accessed, either directly on the class or indirectly on the instance. The __set__ method allows you to influence what happens when something is assigned to that attribute on an instance (assigning on the class itself simply replaces the descriptor). There's also __delete__, which allows you to hook into the process of using del on the attribute.
There's a bit more to it (the presence or absence of __set__ is important), but that's the gist of it.
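A small sketch of why the presence of `__set__` matters, using made-up `NonData`, `Data` and `Demo` classes: a descriptor with only `__get__` is shadowed by an entry in the instance `__dict__`, while one that also defines `__set__` wins over it.

```python
class NonData:
    def __get__(self, instance, owner=None):
        return "from descriptor"


class Data(NonData):
    def __set__(self, instance, value):
        raise AttributeError("read-only")


class Demo:
    plain = NonData()
    guarded = Data()


d = Demo()
# Write to the instance __dict__ directly to sidestep __set__.
d.__dict__["plain"] = "from instance"
d.__dict__["guarded"] = "from instance"

print(d.plain)    # from instance
print(d.guarded)  # from descriptor
```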
ah ok thanks that makes more sense, I think I'll have to play around with it to fully understand it but that clears it up a bit
What the heck is __get_name__?
a typo
Oh
We're missing __del_name__
Sorry but what is del_name ? Methods ?
A joke: __set_name__ exists, but @wide shuttle wrote __get_name__ as a typo. __del_name__ is an extension of that
is placing a semi colon at the end of a line a bad practise?
Well yeah, Python code is supposed to be clean. In Python, ; doesn't terminate a statement; it separates simple statements that share a line.
print(2);
is still a single statement: the grammar allows a trailing ; but it does nothing.
but it doesn't make a difference in JavaScript either (well, mostly), and there it's considered good practice
In JavaScript it does make a difference in some cases. In Python, it doesn't
JavaScript has a mechanism for inserting the semicolons automatically (ASI), and different people have different opinions on whether to use them at all
it does look ugly i guess
yeah asi is dumb so it's better to be consistent with when you actually need them
Plus it better fits the c-like syntax
it is recommended to write ; in JS, as leaving it out relies on the interpreter working out that your statement is finished, which doesn't always work
in python, you could basically write stuff like this:
print('Hello'); print('World')
as far as I know. But you shouldn't
In [1]: class Test:
   ...:     def test(self):
   ...:         print(locals())
   ...:

In [2]: Test().test()
{'self': <__main__.Test object at 0x7fd8d5544d30>}

In [3]: class Test:
   ...:     def test(self):
   ...:         print(locals())
   ...:         super
   ...:

In [4]: Test().test()
{'self': <__main__.Test object at 0x7fd8d5544460>, '__class__': <class '__main__.Test'>}
Does type actually just check every function if super in func.__code__.co_names and closure in a __class__?
super() with no arguments is special cased by the parser, and transformed into super(__class__, method_args[0]) - see the last paragraph of https://www.python.org/dev/peps/pep-3135/#specification
OOP code needs to call it pretty much constantly. You need it whenever a subclass wants to add extra processing on top of what the base class does for a particular method.
I expect that super() is just about the most heavily used builtin function.
top 3, at least.
i've only just heard of "design by contract" today (https://en.wikipedia.org/wiki/Design_by_contract), and was curious whether this was commonly implemented? Someone mentioned it whilst I was trying to think of an approach to ensuring some data integrity check stuff.
i'm not sure if its just a fancy way of saying TDD tbh
No, TDD is about developing the tests together with the code
"Design by contract" means explicitly stating the invariants and contracts that must be true in certain places
although this is more fit for #software-architecture @magic python, which is coincidentally on topic right now
Are there any big projects that overload how super works. Seems like there are cleaner ways to make that sugar work that don't rely on polluting the closure scope, unless you want to actually support overloading it.
@distant seal what do you mean by "overload how super works"? How would you like it to behave differently?
I'm not sure, I guess in theory you might overload super to monkeypatch in an instance's inheritance, or some crazy dynamic inheritance since I don't think you can change the __mro__ of an instance? It seems like if super isn't going to be a reserved keyword, and __class__ is going to be in the closure, that python is just setting you up to be able to do some really crazy stuff.
@distant seal there's lots of crazy stuff you could do, but that can get really messy and unmanagable
Oh for sure, I know there are plenty of other ways that Python lets you go crazy, but it's just interesting how super is implemented, and I've never really considered how you could go crazy with it until now.
You never know what the next class after you in the MRO is going to be, so you can't really get very crazy with super() - it would just break if the next class in the MRO isn't playing by your rules
You could conditionally make super work normally, if that's what you wanted. I'd have to think about what a use case would be where it would be, of course, still wrong to do, but might at least make some sense.
actually it seems to me that if __class__ shows up as a name in a class method it'll resolve to the class, like
```py
class Foo:
    def bar(self):
        return __class__
print(Foo().bar())
```
returns <class '__main__.Foo'>
__class__ is an implicit closure reference created by the compiler if any methods in a class body refer to either __class__ or super.
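One way to see that closure reference is to look at the free variables of each method's code object. A small sketch with three made-up classes; `co_freevars` is the standard code-object attribute:

```python
class Plain:
    def method(self):
        return None


class UsesSuper:
    def method(self):
        return super().__repr__()


class UsesClass:
    def method(self):
        return __class__


# Only methods that mention super or __class__ get the implicit cell.
print(Plain.method.__code__.co_freevars)      # ()
print(UsesSuper.method.__code__.co_freevars)  # ('__class__',)
print(UsesClass().method() is UsesClass)      # True
```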
how to not send traffic through vpn in python with socket with udp protocol?
That's not on topic here. This channel is for meta-discussions about Python itself. See #networks or open a help channel
bye
why do lists not have rindex but strings do
oh
Anyone knowing llvmlite module well?
why arent yall like millionaires. cause you guys be going crazy with that coding stuff
lol
what makes you think we arent?
... ok im not
Hey guys, can we keep this channel on more meta Python topics please? We have off-topic just two categories under this channel for general talks 
that's my bad
No worries
When do you think the Anaconda distribution would switch its "stable" release to 3.9? Obviously, "when the libs in it will support 3.9", but how much would it lag behind the release schedule?
What's the lag between language new releases and mainstream adoption?
generally a year or so
Ouch. It has some nice syntactic sugar. Not having to import generic container types for typing is a small benefit on its own, but one that applies to something like 90% of projects.
So, could someone explain to me what Shunting Yard Algorithm is?
I suggest you ask in #algos-and-data-structs
Hello everyone, I have a quick question about for loop ternary operation. I understand how to use a basic if else statement in a for loop but for some reason I am unable to chain these together like I would a regular ternary if else operation. Is this chaining possible to do in a one line for loop, and if so could someone please link me to the proper syntax or give an example? Thanks!
Alright, so I was able to figure this out. I needed to chain my statements before my for loop statement
What do you mean by "one line for loop"? Are you talking about a list comprehension or a generator expression or a for statement?
[str(x) if x%5==0 else str(x) if x%7==0 else 0*'a' for x in range(100) ]
That's called a list comprehension.
Alright, thank you
Also, follow up question, is there anything I can do that would add nothing to my list
In the example above, is there anything I could do using the same format that would not add anything to the list if a number is not divisible by 5 or 7?
Or is the approach impossible with the format
[str(x) for x in range(100) if x % 5 == 0 or x % 7 == 0]
Let's say that I wanted to add 'a' to the list for every number divisible by 5, and 'b' to the list for every number divisible by 7, and add nothing if the number isn't divisible by either 5 or 7. Would that be possible within a single line for loop?
If it's divisible by both, would you add both a and b?
Again, this is a list comprehension, not a single line for loop.
Yes, this is possible.
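A sketch of one way to do it, assuming that a number divisible by both 5 and 7 should contribute both letters (the question above leaves that open). The range is shortened to 1..35 to keep the result small.

```python
letters = [
    letter
    for x in range(1, 36)
    # The inner loop yields zero, one or two letters per number.
    for letter, divisor in (("a", 5), ("b", 7))
    if x % divisor == 0
]
print(letters)
```

The second `for` clause is what lets a single input produce no output at all, which a conditional expression in the result position cannot do.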
Hello guys, I have a little question about implementing sensor fusion in Python. Can you help me?
This isn't a help channel, read the description
Hi, anyone have any advanced bot guides/material they can refer me to?
What are your opinions on type annotating not only parameters and return values, but also variables? On the one hand I'd like to just annotate everything, on the other hand that doesn't work in some places, like with unpacking for iterables. Are there any plans to support annotating all variables on the left-hand side when unpacking? Or is there some other way around that (without forgoing the unpacking ofc!) Not being able to annotate in all situations really dampens the benefits of static type checking
a: int
b: str
a, b = (1, "2")
you can do it but it's a pain
That really kills the beauty of the automatic packing and unpacking behind the scenes :\
imo (with next to no experience), typehinting is a nice way to check function/method args+return types and to make sure they line up, but I don't see the point of typehinting vars unless you're hinting for a compiler or smthing along those lines
I've done it in a couple places but it's more or less so pylance stops screaming at me
This is mainly to help out nuitka, yes
I also don't have pylance set to strict checking for the same reasons
@empty kite From the perspective of using type annotations for type checking, there's no real gain from annotating local variables and such. They could help with attributes of course, but if you're using them for documentation, docstrings don't really hurt
In most cases, local variables are either assigned Literals, which type checkers can easily infer, or parameters which are already typed
mypy can usually figure it out except for empty containers (it can't tell what the keys and values of a dict will be when you just do {} or the like)
For dicts there's typing.TypedDict, assuming there are no dynamic attributes
sure, but I mean you usually don't need to declare the types of your local variables, because mypy usually knows what type it is because it knows what type the thing you're assigning to it is.
{} and [] and set() are pretty much the only things you regularly encounter where mypy doesn't know what type they are.
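So the empty-container case is where a variable annotation tends to earn its keep. A minimal sketch with made-up variable names, using the `typing` generics so it also runs on versions before 3.9:

```python
from typing import Dict, List, Set

# Nothing on the right-hand side says what these will hold,
# so the annotation carries that information for the checker.
scores: Dict[str, int] = {}
names: List[str] = []
seen: Set[int] = set()

scores["alice"] = 3
names.append("alice")
seen.add(3)
```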
My use case might be a bit esoteric: unpacking structs with formatting strings created at runtime. They will always be ints in Python, but could be any type of integer in the struct
Hey all, so a pretty specific question. I am reading through the Big Blue Book rn and a lot of the conversation is around OOP, but python has some advanced features like Decorators. How have y'all thought about decorators with regard to domain modeling? Like where are they most useful?
By Big Blue Book, do you mean Eric Evans's book?
Ofc to narrow it down, my thoughts are they don't live in the domain or application layer as they help with building up functionality, not defining it. But beyond that, how do they fit nicely into the some of the common patterns? I can see them acting like factories in a way, for aggregates which are centralized some core service functionality.
Yes!
This is definitely a question for #software-architecture
Ah perfect thanks!
@bold quail did you post that on reddit? We're not really the right platform for that.
hello!
I just wanted to share a new call graph generator for Python that I've just publicly released
heeeeeeeeeyy... so I install TensorFlow and it broke numpy for me
raise RuntimeError(msg.format(__file__)) from None
RuntimeError: The current Numpy installation ('C:\\Users\\MyName\\AppData\\Local\\Programs\\Python\\Python37\\lib\\site-packages\\numpy\\__init__.py') fails to pass a sanity check due to a bug in the windows runtime. See this issue for more information: https://tinyurl.com/y3dm3h86
and the link leads to not something that looks even remotely useful
Ideas?
reverted to numpy version 1.19.3, hopefully it's gonna be fine
This ^
Windows 10 can't run the new version of Numpy for some reason
It's a known issue, I just dug into it the other day. Uninstall the new version and then specify version 1.19.3 when you install again to revert back to the version that works. @cedar flare
Yeah, known issue, I've had to revert to 1.19.3 on all our projects at work, it's a big rip
I can't believe I want/am saying this, coming from c++, but it would be so convenient for Python to try conversions instead of raising a TypeError
Like, if we're trying to create a date_time object by passing month/day/year in directly, it raises a type error if you pass in strings and not ints. It would be nice if Python, when faced with a type error, tried to recast as the correct type
I disagree here. It should raise an error, because otherwise you might be passing strings without knowing it, because the function that generates them is failing, and conversion will hide this. Handling it by conversion isn't usually a good idea, especially with custom classes, which could have custom-defined methods like __str__ that would be used for the conversion. That might result in a whole different kind of error because the value coming from it isn't correct
Perl and JavaScript both implicitly convert types like that, and in both cases I think it's just about the most annoying thing about the language.
in the JS case it's the entire reason for the === and !== operators, for "is it really equal?"
I just wish exceptions would be used for actual exceptions
and I also wish x/0 = 0
that would probably fix and remove sooooo many bugs and exceptions in the wild world, lol
@supple light thats not how math works tho
x/0 is undefined in math
because it would break the properties of division
x/0 is sometimes the result you want, but a lot of the time it isn't. Better to error than have incorrect results
any programmer who accidentally does x/0 is looking for 0 as result
I really don't care about "undefined in math" cause this is real world, lol
and from practical standpoint, x/0 should be 0
Pony for example does do that, though it also has erroring division. But when dealing with rendering math, you do not want random points of discontinuity like that. There are more cases where x/0=0 leads to wrong math than when it leads to correct math
It would lead to fewer exceptions, but more bugs
I can't imagine a scenario where x / 0 = 0 from a "practical" standpoint
same, and i like pony
Only time it's come up to me was when counting proportions of <X> in a sequence
And the original sequence had length 0
Well, why should the answer there be 0% instead of 100%? Every element in the original sequence has <X>
Would you say it's an exceptional situation that users should need to decide how to handle? ๐
yeah, which is why I think it should error
yeah, if the question is what success rate is it, then you can compute 1 - failures/attempts and get 100% success rate from an empty sequence, which is the mathematically correct result. But there are cases where it is wrong too, and in case of ambiguity, avoid the temptation to guess comes to mind here
!zen ambiguity
The Zen of Python (line 11):
In the face of ambiguity, refuse the temptation to guess.
close enough
And of course you can compute a failure rate the same way, and it would also be 100%, which is clearly nonsense in the context of whatever domain you may be working in.
Schrodinger's subsequence
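A sketch of the "let the caller decide" approach for the proportion example. The `proportion` helper is hypothetical: it refuses to guess for an empty input and leaves the 0%-or-100% decision to whoever calls it.

```python
def proportion(items, predicate):
    """Fraction of items matching predicate; the empty case is an error."""
    items = list(items)
    if not items:
        raise ValueError("proportion of an empty sequence is undefined")
    return sum(1 for item in items if predicate(item)) / len(items)


def is_even(n):
    return n % 2 == 0


print(proportion([1, 2, 3, 4], is_even))  # 0.5

try:
    proportion([], is_even)
except ValueError as exc:
    # The caller picks whatever makes sense in their domain.
    print(f"no data: {exc}")
```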
Not a fan of that, as that's how you get php and JavaScript, which then just becomes a lot of bullshit to deal with. More often than not, you'll be fighting that conversion rather than the actual problem
what's the purpose of auditing events?
You can see that as a log of events, it is useful for troubleshooting
but then again, php is a clusterfuck
so it's like a log for functions that are implementation defined?
or rely on OS specific calls?
Are you talking about the CPython audit log, or audit logs in general?
https://www.destroyallsoftware.com/talks/wat lovely talk on PHP/javascript/ruby's bullshit
Well, as you can see they are events for a lot of different things
for the most part they are critical pieces of the system, like OS interactions, creating code objects, imports, cpython internals...
are they meant for troubleshooting on the user end or implementation end?
It is for the CPython maintainers, you shouldn't need this as an end user
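For anyone who wants to see what an audit event looks like, a minimal sketch using `sys.addaudithook` and `sys.audit` (Python 3.8+). The event name `example.custom_event` is made up for illustration; real events such as `open` or `import` are raised by the interpreter itself.

```python
import sys

seen_events = []


def hook(event, args):
    # Every audit event in the process passes through here, so
    # filter down to the hypothetical custom event.
    if event == "example.custom_event":
        seen_events.append((event, args))


sys.addaudithook(hook)  # note: a hook cannot be removed once added
sys.audit("example.custom_event", "payload", 42)
print(seen_events)  # [('example.custom_event', ('payload', 42))]
```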
what is the difference between a python bot and python automation script?
Generally, a bot is something that interacts with a service that other people use non-programmatically. A script could also just handle moving files around, writing word documents or responding to http requests
script is basically an ai?
if I write a python script to attend online meetings in zoom will that be called a bot?
in this the script is acting as a agent for me so I guess it will be called bot right?
kinda
yes i am not sure too
uh this is getting off topic?
That would be a bot indeed, though consider whether zoom tos allows such automation.
it does
I am not sure whether this is a fit for #internals-and-peps, but I've thought about avoiding repeated initialisation of local variables ...
Let's say we have a function of the sort:
def is_woody(word):
    return word in {"gone", "vacuum", "prodding", "sausage", "bound", "vole", "recidivist", "caribou"}
That set is being instantiated at every call, which is wasteful.
In C, C#, etc. I would use static local function variables, const, etc.
The set literal is interned actually
>>> from dis import dis
>>> def is_woody(word):
...     return word in {"gone", "vacuum", "prodding", "sausage", "bound", "vole", "recidivist", "caribou"}
...
>>> dis(is_woody)
  2           0 LOAD_FAST                0 (word)
              2 LOAD_CONST               1 (frozenset({'vole', 'prodding', 'caribou', 'bound', 'vacuum', 'sausage', 'recidivist', 'gone'}))
              4 CONTAINS_OP              0
              6 RETURN_VALUE
Though if it's more complicated, you can indeed create a class and hold such kinds of data in a class-level attribute
class Sample:
    shared = [] # essentially static
    def __init__(self, item):
        self.shared.append(item)
a = Sample(10)
b = Sample(20)
print(a.shared)
print(a.shared is b.shared)
Yes. Although then I would need to create singletons to handle methods and objects stored in class attributes ...
I could also pass the set as an argument to the function, from an other scope. Might even set a function attribute but, again, would have to either recur to obscure syntax or do it from an outer scope.
You can use classmethods to avoid instantiating it, or better yet, separate that into a different module and use module-level constants
I thought that the most elegant way was to use default parameters which are (surprisingly) only evaluated once.
Default parameters are good for caching and things alike, yeah
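A short sketch of the default-parameter trick, with a shortened word set and hypothetical `_woody` and `_seen` parameter names. Defaults are evaluated once, when the `def` statement runs, which is what makes them work as a per-function static and also what makes mutable defaults risky.

```python
def is_woody(word, _woody=frozenset({"gone", "vacuum", "sausage", "caribou"})):
    # _woody was built once, when the def statement ran.
    return word in _woody


def remember(item, _seen=[]):
    # The same list object is reused on every call: handy for a cache,
    # a classic trap if you expected a fresh list each time.
    _seen.append(item)
    return _seen


print(is_woody("sausage"))  # True
print(is_woody("tin"))      # False
print(remember("x") is remember("y"))  # True
```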
But what you posted actually changes the picture!
CPython implements a bunch of optimizations for these things
Immutable literals and literals that aren't assigned to a variable are interned, strings that look like identifiers are interned, integers from -5 through 256 are pre-instantiated, etc etc etc
I'd assume lambdas are created at runtime
It won't change mutables into immutables in many cases even when unassigned (not sure if there's anything other than the in) as it could change the behaviour
There definitely are nuances I missed, yeah
!e
you can also intern stuff yourself!
import dis
import builtins
def nail(**names):
    def decorator(fn):
        code = fn.__code__.co_code
        consts = fn.__code__.co_consts
        positions = {}
        for instr in [*dis.get_instructions(fn)][::-1]:
            if instr.opname == "LOAD_GLOBAL" and instr.argval in names:
                if instr.argval not in positions:
                    positions[instr.argval] = len(consts)
                    consts += (names[instr.argval],)
                pos = positions[instr.argval]
                code = code[:instr.offset] + bytes([dis.opmap["LOAD_CONST"], pos]) + code[instr.offset+2:]
        fn.__code__ = fn.__code__.replace(co_code=code, co_consts=consts)
        return fn
    return decorator
@nail(NUMBERS = [1, 2, 3, 4])
def f():
    return NUMBERS + NUMBERS
dis.dis(f)
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
001 | 22 0 LOAD_CONST 1 ([1, 2, 3, 4])
002 | 2 LOAD_CONST 1 ([1, 2, 3, 4])
003 | 4 BINARY_ADD
004 | 6 RETURN_VALUE
You could also use this
nail_builtins = nail(**builtins.__dict__)
and turn all builtins references to constant lookups
For strings you can use sys.intern, which can provide optimizations in terms of both memory and comparison times
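A minimal sketch of `sys.intern` on strings built at runtime. Whether the two un-interned strings happen to be the same object is an implementation detail, so only the interned comparison is relied on here.

```python
import sys

parts = ["internals", "and", "peps"]
a = "-".join(parts)  # built at runtime
b = "-".join(parts)
print(a == b)        # True

# sys.intern returns one canonical object per distinct string value,
# so equal interned strings can be compared by identity.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)  # True
```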
This is great stuff, thanks all. The message I take home is that I should rely on dis more often. For more complex objects I would probably keep using default parameters, although they can be a dangerous device.
is it possible to get the name of the current scope? So - if i was inside a function the name would be the name of the function, i'm not sure if this is a thing.
def f():
    ...
    x = get_name()
here x would be the name of the function f
def f():
    def g():
        x = get_name()
here it would be the name of g, etc
!e
import inspect
def f():
    x = inspect.currentframe().f_code.co_name
    return x
print(f())
def f():
    def g():
        x = inspect.currentframe().f_code.co_name
        return x
    return g()
print(f())
@unkempt rock :white_check_mark: Your eval job has completed with return code 0.
001 | f
002 | g
Seems pretty hacky
Seems pretty hacky
yeah... i think this is true, i was curious tho
Doesn't seem that hacky tbh
That's pretty close to the reflection solutions you'd have in other languages
Welcome to the esoteric normal
Today I have implemented something cool, it's a script that loads a module of a Python extension taking the C symbol from the current process memory instead of a .dyn/.so or similar. The library which implements the extension is a Python interpreter by itself, so it's embedding and extending Python at the same time in the same dll:
https://github.com/metacall/core/blob/develop/source/ports/py_port/metacall/module_win32.py
The init() call at the end of the script jumps here: https://github.com/metacall/core/blob/ee626fb31967a86b376b26ac2714e858d70d94eb/source/loaders/py_loader/source/py_loader_port.c#L486
Which returns a PyObject* that gets transformed into a usable Python module from Python side:
https://github.com/metacall/core/blob/ee626fb31967a86b376b26ac2714e858d70d94eb/source/ports/py_port/metacall/api.py#L36
The final result is something like this example, using Ramda (JS) from Python in a transparent way:
https://github.com/metacall/ramda-python-example/blob/main/index.py
That sounds very interesting
That's an appropriate question for #software-architecture
But it's off-topic here, this channel is about the Python language and data model itself
Nice wintery name change
Hey guys what is the best way to learn programming?
Hey @unkempt rock check out !resources in #bot-commands
@unkempt rock This channel isn't really for discussion for how to learn programming, that's more suited for #python-discussion or one of the off-topics
i was making a license key script for my GUI application for like monthly/yearly subscription and thought to release this script separately so anyone can use it easily. suggestions are really appreciated. https://gist.github.com/MayankFawkes/89fde71dcff1538fd4cf7c0c16c92efa
I was trying to improve a work application and I tried manually calling the Python GC at a predetermined interval. This is a long running application that spawns many remote functions (the core of the app utilizes the Ray distributed library). It sped up performance by 2.5x. I was not expecting that.
I know in Python you don't (or, if the program is written decently, shouldn't) need to call the GC because it keeps track of objects and deletes them automatically (that's an oversimplification, I know). I've never had to call the GC manually.
that sounds very surprising - my only guess about what could possibly cause that is that either you have very little memory available and without the more frequent GC cycles you'd be using a swap file, or you have some objects that are no longer reachable but that are still doing a bunch of unnecessary work, and so garbage collecting them sooner makes you do less work overall
I think it is number 3. I have plenty of memory and no swap space. I have never been in the position to even think about the gc.
in that case, adding gc calls should slow down your program, it is weird
I rewrote the implementation to not use that underlying lib and got the same performance as the old solution with GC. When I tried the new solution with GC it was a few milliseconds slower, as expected, since the new solution follows idiomatic Python that should automatically reclaim memory correctly. There is something in that lib that is acting up.
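For context, the cyclic collector only matters for objects caught in reference cycles; everything else is freed by reference counting as soon as the last reference goes away. A small sketch with a made-up `Node` class showing an object that stays alive until `gc.collect()` is called by hand:

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.me = self  # reference cycle: refcounting alone cannot free this


gc.disable()            # stop automatic cycle collection for the demo
node = Node()
probe = weakref.ref(node)
del node
print(probe() is None)  # False: the cycle keeps the object alive
gc.collect()            # run a full collection by hand
print(probe() is None)  # True
gc.enable()
```

If a library creates many such cycles that hold expensive resources, collecting them sooner could plausibly change overall performance, though that is only a guess about the case described above.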
Just wanted to share a cool typing trick I found when building a tokenizer. Haven't checked whether it works with mypy, but it works fine with Pyright/Pylance:
https://github.com/gurkult/gurklang-pyimpl-interpreter/blob/master/gurklang/parser.py
tokenizer = build_tokenizer(
    (
        ("LPAR", r"\("),
        ("RPAR", r"\)"),
        ("LBR", r"\{"),
        ("RBR", r"\}"),
        ("INT", r"[-+]?(?:0|[1-9]\d*)"),
        ("STR_D", r'"(?:\\.|[^"])+"'),
        ("STR_S", r"'(?:\\.|[^'])+'"),
        ("ATOM", r"\:(?!\d)[^\"'(){}#: \n\t]+"),
        ("NAME", r"(?!\d)[^\"'(){}#: \n\t]+"),
    ),
    ignored_tokens=(
        ("COMMENT", r"\#.*($|\n)"),
        ("WHITESPACE", r"\s+"),
    ),
)
Token = tokenizer.token_type
A little bit of black magic, and this is what the inferred types for tokenizer and Token are:
tokenizer: Tokenizer[Literal['LPAR', 'RPAR', 'LBR', 'RBR', 'INT', 'STR_D', 'STR_S', 'ATOM', 'NAME']]
Token: type[parse_utils.Token[Literal['LPAR', 'RPAR', 'LBR', 'RBR', 'INT', 'STR_D', 'STR_S', 'ATOM', 'NAME']]]
That's pretty impressive considering it's python
I still want my statically typed python wrapper language
Coconut is one option but I'm not sure how usable it is for serious work
Like TypeScript?
Hm, well, one issue with TypeScript compared to other statically typed languages like Haskell or Rust, type information isn't available at runtime. So you can't really deduce any types from stuff unless inferred information is actually passed to the runtime
Yeah exactly like typescript
And no i would not expect runtime inference
afaik types aren't available at runtime in C either its all just blobs of bits and bytes
Part of the problem is the typing semantics are questionable for things like overriding methods
iirc mypy chokes when you assign a function to an instance or class attribute
Cython is an optionally statically typed language that is a superset of Python - though it transpiles to C rather than to Python. But it can be imported into a Python program just like a Python module could.
it is suitable for production work, though, and can outperform Python by literally a factor of 100 pretty easily on certain types of problems.
You are not allowed to use that command here. Please use the #bot-commands channel instead.
Is there a good use case for _xxsubinterpreters?
It's a private module with no backwards compatibility guarantees, intended for use exposing subinterpreters to the test suite so that they can be better tested
so, there's no end-user use case for it - the use case for it is "As a CPython developer working on subinterpreters, I need a way to test my changes"
Anaconda is cpython right? I'm so confused by all these distinctions
what do you mean?
Anaconda is CPython bundled with a package manager (conda) and a bunch of preinstalled scientific packages (assuming you have full Anaconda and not Miniconda)
So in Python, we have the situation where you can iterate over anything that has an __iter__ method that returns something with a __next__ method, regardless of what any of the types involved here are. Would we call that polymorphism? Because they tried to sell us on the concept of polymorphism in the introductory CS classes that I took and I didn't understand what was so remarkable about it that it needed a name.
that returns something with a __next__ method
Technically it needs to return something with both an __iter__ and a __next__ method, FWIW
Yeah, that's polymorphic behavior - from the point of view of the for statement, it will work on any iterable type without needing to know what specific type it's working on.
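A minimal sketch of the protocol with a made-up `Countdown` class: the for statement (and anything else that iterates) only cares that `__iter__` returns an object with `__next__`, not what type it is.

```python
class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it has both methods.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


# The same loop works on a list, a string and the custom class.
for source in ([1, 2], "ab", Countdown(3)):
    print([item for item in source])
```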
In Python, polymorphism is really normal, though. Pretty much everything in Python is polymorphic. If you make a
```py
def times_10(a):
    return a * 10
```
There are lots of languages where you would need to write something different in order for that to be able to handle multiple different types, though. For C++, you'd need to do something like: `template<typename T> T times_10(T a) { return a * 10; }` to say that a is a parameter of some generic type T supporting a * operator that accepts an int and returns an instance of its own type.
I suspect that polymorphism just doesn't feel remarkable to you because you're used to a language where you get it for free automatically, and can't opt out of the polymorphic behavior even if you wanted to. Pretty much everything in Python works on many different types.
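To make that concrete, a sketch of `times_10` applied to several unrelated types. The `Money` class is made up for illustration; the only thing it shares with `int`, `str` and `list` is that it supports `*` with an int.

```python
class Money:
    def __init__(self, cents):
        self.cents = cents

    def __mul__(self, factor):
        return Money(self.cents * factor)

    def __repr__(self):
        return f"Money({self.cents})"


def times_10(a):
    return a * 10


print(times_10(4))         # 40
print(times_10("ab"))      # the string repeated ten times
print(times_10([0]))       # a list of ten zeros
print(times_10(Money(5)))  # Money(50)
```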
I didn't want to dwell too long on the specifics of the iterator protocol before getting to my main point--the intended audience knows how it works. (Because I was going to dwell on how __next__ needs to have a return value, as opposed to always raising an error :P)
Couldn't the idea that multiple types support an infix operation with * be construed as polymorphic, generics or not? Because Java was the language they used to illustrate polymorphism when they taught it to me.
Is there a language that has a notion of data types but very much does not support polymorphism? C maybe?
it's also important to distinguish between the types of polymorphism when we discuss it, I think
I'm listening.
but anyway, you could say that the lack of polymorphism is the default, I think?
C is a good example.
Couldn't the idea that multiple types support an infix operation with * be construed as polymorphic, generics or not?
but yes, I would say that this is true
it seems to me that polymorphism is basically the ability of a function to perform an abstract operation on values of multiple types
and in particular, the implementation of that operation may vary by the type received
e.g. + is addition for numbers and concatenation for strings
that is ad hoc polymorphism
In my limited foray into C, which was... limited, I don't think I ever got a feel for how it handles typing. My understanding is that it's more weakly typed than java?
(although the distinctions kind of blur with Python)
indeed it is
C is the poster child for "statically and weakly typed"
so if you pass a struct that isn't the right type of struct
is that a compilation error or a vague segfault?
(this is the commonly accepted view...I'm not really sure TBH)
if I ever write a compiled Python thing, it will probably be in Rust. So I doubt I'll ever hold you to that.
I don't know Rust but my friend won't shut up about it 🤷
because "static" and "dynamic" in the context of typing have accepted meanings
on the other hand..."strong" and "weak", not so much.
I have argued before that Python has elements of weak typing
but anyway yeah as said above the way Python works means that basically everything is implicitly polymorphic
on the other hand...
isn't python weakly typed in the sense that the end user sees everything as being duck typed, but internally it's doing lots of type checking to decide what your instructions mean?
that would be dynamically typed
which basically means type checking is done at runtime
right. which you said has an accepted meaning. would you say that what I described could be construed as strongly or weakly typed?
neither, really
they're orthogonal
weak typing would be more like...
okay, in Python, for example, you can do this:
[val for val in iterable if val % 2]
which gives you the odd numbers in iterable, right
now, theoretically, if operates on booleans, so there is an implicit conversion there
on the other hand, in a more strongly typed language (say, Scala), you can't do that
you need to explicitly say val % 2 == 1 (or the equivalent)
it gives you... all the objects generated by iterable for which val.__mod__(2) returned a truthy value.
yup
"truthy"
that's the crux.
other languages have no concept of "truthy"
there are no implicit conversions to booleans
this is one view of weak typing: how many implicit conversions are made for you
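a quick sketch of that view, using a plain list as a stand-in for iterable (the names below are made up for illustration): the filter works with or without the explicit comparison, because if converts the int to a bool for you, yet Python still refuses the str/int mix that JS quietly accepts.

```python
values = [0, 1, 2, 3, 4, 5]

# Implicit: `if` asks for the truthiness of the int that val % 2 produces.
odds_implicit = [val for val in values if val % 2]

# Explicit: the comparison already yields a real bool.
odds_explicit = [val for val in values if val % 2 == 1]

# Python does not convert between str and int implicitly, though.
try:
    "1" - 2
except TypeError as exc:
    error = type(exc).__name__

print(odds_implicit, odds_explicit, error)
```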
another example, in JS:
> "1" - 2 + [3]
'-13'
weak typing.
implicit conversion to strings
that wouldn't even compile in statically typed languages (barring certain special constructs), and in Python (dynamically typed) you'd get a runtime TypeError
on the other hand, JS is dynamically and weakly typed
so you don't get a compile-time error
and at runtime the language just tries to massage what you give it into something that works
and, to be fair
I would have accepted 4, but -13 is not that bad
on the other hand (well known)...
> 'b' + 'a' + + 'a' + 'a'
'baNaNa'
I assume this is b + a + (+a) + a and unary +a has some unexpected effect?
and then the nan gets converted back to a string?
yup
wtf
weAktYpiNg
speaking of nan
maybe you know about this
why don't comparisons to np.nan have (what I think of as) intuitive behavior?
go on
in my mind it should be a singleton whereby is np.nan is a thing
!e import numpy as np; print(np.nan is np.nan)
@boreal umbra :white_check_mark: Your eval job has completed with return code 0.
True
!e import numpy as np; print(np.nan == np.nan, np.nan != np.nan)
@boreal umbra :white_check_mark: Your eval job has completed with return code 0.
False True
😮
nan is generally not equal to itself
why
(it's in the FP standard)
well
long story short
it basically means "not defined"
and there are many sources of that.
however I'm not the expert in the history of this
there are probably deeper reasons
Couldn't the idea that multiple types support an infix operation with * be construed as polymorphic, generics or not?
Well, multiple types supporting the operator isn't the part that makes it polymorphic - what makes it polymorphic is that user code can call that operator without knowing exactly which type it's calling it on. In C, both int and float support *, but (up until very recently) the language gave you no nice way to make a function that takes int-or-float and applies * to whatever you get. In that respect, C lacks polymorphism.
polymorphism, in a nutshell, is the ability to act upon an object without knowing what type it has, and have it do the right thing for the type anyway.
it's probably helpful to contrast this with a language that doesn't have polymorphism automatically. In C++, if you've got a base class called Base and a subclass called Sub, and they both implement a method called hello but do something different:
```cpp
#include <iostream>

class Base {
public:
    void hello() { std::cout << "Hi from base\n"; }
};

class Sub : public Base {
public:
    void hello() { std::cout << "Hi from sub\n"; }
};
```
And you've got a function that takes a reference to a base class instance and calls the hello method on it:
```cpp
void say_hello(Base & obj) { obj.hello(); }
```
then it's legal to call that with either an instance of Base or an instance of Sub (because Sub is a subclass of Base, and so a reference to a Sub is also a reference to a Base) - but if you pass it an instance of Sub, it'll still print "Hi from base", because by default method dispatch in C++ is not polymorphic, and calling the hello method of a Base reference will always call Base::hello. You need to explicitly opt into polymorphic behavior by declaring virtual void hello instead of void hello in the base class, and only if you do that will passing a Sub object to say_hello print "Hi from sub"
the reason it does this is that polymorphic lookups require a step, at runtime, to figure out what type of object you've gotten and call the appropriate method for it. This can't be done at compile time; the function doesn't know what type of object it's actually been passed until runtime. So there is some runtime overhead for detecting that you've got an instance of the subclass, rather than the base class, and then calling the subclass's method, instead of the base class's method. C++ lets you choose, on a per method basis, whether every call to the method should or shouldn't have that overhead.
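for contrast, here's roughly the same shape in Python, as a sketch (the method returns the string instead of printing, just to keep it easy to check). The lookup always happens at runtime on the actual object, so it behaves as if every method were virtual:

```python
class Base:
    def hello(self):
        return "Hi from base"


class Sub(Base):
    def hello(self):
        return "Hi from sub"


def say_hello(obj):
    # The method is looked up on the runtime type of obj on every call.
    return obj.hello()


print(say_hello(Base()))
print(say_hello(Sub()))
```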
Also, re: float("nan"), note that null in SQL behaves the same, and is unequal to everything including itself
anybody know about kubernetes, ansible and stuff
@boreal umbra ^ I don't remember if I pinged you on the above or not, so if I didn't, ping
thanks for the ping. I started reading but got sucked into mod stuff.
no worries - just couldn't remember and didn't want it to get buried
I'll need to read it tomorrow. I'm just about exhausted currently.
that's just dynamic polymorphism though right
e.g. static parametric polymorphism with templates can be done at compile time
yeah, that's true.
and it's not exactly on-topic, but modern C has a way to do generics at compile time as well - https://en.cppreference.com/w/c/language/generic
I vaguely alluded to it above.
nan is only a problem if you would like to have it as a key in a dict.
hey, is the key nan inside the dict? since it's never equal to itself you can add it, just not retrieve it back.
and that's just silly, so I'm glad python does not allow it
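worth checking in a REPL, though: as far as I can tell CPython does accept nan as a dict key, and because dict lookup tries identity before ==, the very same nan object can be retrieved again. A different nan object can't. A small sketch:

```python
import math

nan = float("nan")
d = {nan: "stored"}

# Same object: found, because dict lookup checks `is` before `==`.
same_object = d[nan]

# A separately created nan is a different object and never compares equal.
other_nan = float("nan")
found_other = other_nan in d

print(same_object, found_other, math.isnan(other_nan))
```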
What changed with C recently?
Ohh generics
I read a bit up top, and then skipped everything until eivls msg so missed that
Does python hash every nan to the same value regardless of its binary representation?
Seems to always hash it to 0
Huh ig that's a good thing
How did u test different representations? Right after I asked the question I was trying to figure out how to get different NaN types in python
i'm inspired:
In [23]: a = No()
In [24]: d = {}
In [25]: d[a] = 1
In [26]: d[a]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-26-550fc156815a> in <module>
----> 1 d[a]
KeyError: <__main__.No object at 0x0000020F41E80C10>
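The definition of No isn't shown above, so this is only one guess at a class that behaves that way: a hash that changes on every call. The lookup then probes a different slot than the one the key was stored in, so even the identical object is not found.

```python
import itertools


class No:
    """Hypothetical stand-in: its hash is different on every call."""

    _hashes = itertools.count()

    def __hash__(self):
        return next(No._hashes)


a = No()
d = {}
d[a] = 1  # stored under the first hash value

try:
    d[a]  # looked up under a different hash value
    outcome = "found"
except KeyError:
    outcome = "KeyError"

print(outcome)
```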
@red solar
https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Python/pyhash.c#L86 and the place where it's used
Ahhh nice thnx
@raven ridge never thought of cython in those terms before. I suppose it can be used for general purpose programming, and it does usually give a noticeable speedup on basic tasks. I think of it more as an interface to the cpython api i guess
It certainly can be that, but it can be a general purpose statically typed Python as well - at least, depending on what you want the static typing for. If you want it for performance or code readability, though, Cython does the trick. Though of course, Cython does hurt maintainability... It's a pain to need a C debugger instead of pdb, etc...
Tradeoffs 🤷
https://tenthousandmeters.com/blog/python-behind-the-scenes-6-how-python-object-system-works/
Anyone know of any other good python/cython internals books/blogs? These kinda remind me of the 'Ruby under a microscope' book which I enjoyed greatly way back when.
As we know from the previous parts of this series, the execution of a Python program consists of two major steps: 1. The CPython compiler...
@wicked holly https://realpython.com/products/cpython-internals-book/
Thanks, exactly what I was looking for, even if I'm not using 3.9 for anything 'real' of yet I'm sure it's close enough for 3.6-7.
definitely
!e python print("gamer")
You are not allowed to use that command here. Please use the #bot-commands channel instead.
):
what is your opinion on local generators, something like
```py
def fun():
    def data_gen():
        while (val := obj.read()) != sentinel:
            yield val

    for data in data_gen():
        ...  # do things
```
in this specific case, you would use the 2-arg form of iter, but there are cases where that is insufficient
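for reference, a small sketch of that 2-arg form: iter(callable, sentinel) keeps calling the callable until it returns the sentinel. The BytesIO stream and the chunk size are just stand-ins for obj and its read method.

```python
import io

stream = io.BytesIO(b"abcdef")

# Calls stream.read(2) repeatedly and stops once it returns b"".
chunks = list(iter(lambda: stream.read(2), b""))

print(chunks)
```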
so how is this possible here: self._product = Product1() and then product = self._product. Explanation please :(
```py
from __future__ import annotations  # lets "-> Product1" name a class defined further down

from abc import ABC, abstractproperty
from typing import Any


class Builder(ABC):
    @abstractproperty
    def product(self) -> None:
        pass


class ConcreteBuilder1(Builder):
    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        self._product = Product1()

    @property
    def product(self) -> Product1:
        product = self._product
        self.reset()
        return product


class Product1():
    def __init__(self) -> None:
        self.parts = []

    def add(self, part: Any) -> None:
        self.parts.append(part)

    def list_parts(self) -> None:
        print(f"Product parts: {', '.join(self.parts)}", end="")
```
That doesn't seem clearer or more maintainable than just
```py
def fun():
    while (data := obj.read()) != sentinel:
        ...  # do things
```
I think it'd be fine if you're passing that generator down somewhere, or returning it up, but it feels weird and unnecessary to define and use it in the same function.
the case I had in mind had a more complex generator, I will probably ask with real code tmr.
I'm not sure I understand what you're asking. Why wouldn't it be possible?
Whenever a ConcreteBuilder1 is created, or whenever the product attribute is accessed, it creates and caches a new Product1 instance. The product attribute returns the original Product1 instance that existed before the new one is made.
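A stripped-down sketch of just that mechanism (type hints and the abstract base dropped for brevity): the local name product keeps the old object alive while reset() rebinds self._product to a fresh one.

```python
class Product1:
    def __init__(self):
        self.parts = []


class ConcreteBuilder1:
    def __init__(self):
        self.reset()

    def reset(self):
        self._product = Product1()

    @property
    def product(self):
        product = self._product  # keep a reference to the current object
        self.reset()             # rebind self._product to a brand-new object
        return product           # the old object is returned to the caller


builder = ConcreteBuilder1()
builder._product.parts.append("PartA")

first = builder.product   # the object that received PartA
second = builder.product  # the fresh object created by the first access

print(first.parts, second.parts, first is second)
```

Assignment never copies anything here; it only attaches another name to the same object.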
@raven ridge thank you, please keep going in more detail. Please describe what you see in this code, I took a screenshot cos I rly want to understand this
What specifically do you not understand?
@flat gazelle I agree with godly on that example not really being worth it, but i've definitely had times in c++ where a small lambda significantly helped out in similar(ish) situations, and i'm sure a similar need exists in python
```py
from typing import BinaryIO, Callable, Iterable


def fn_a(it: Iterable) -> None:
    for i in it:
        print(i)


# fp.read(1) returns bytes, so the predicate receives bytes, not str
def fn_b(fp: BinaryIO, is_sentinel: Callable[[bytes], bool]) -> None:
    # create generator
    def to_it(fp_, is_sentinel_):
        while (byte := fp_.read(1)) and not is_sentinel_(byte):
            yield byte

    # call previous function with generator
    fn_a(to_it(fp, is_sentinel))


if __name__ == "__main__":
    with open("main.py", "rb") as f:
        fn_b(f, lambda _: False)
```
Guys I'm looking for somebody to help me code a new multifunctional bot
@flat gazelle kind of a contrived example, but this is more what i'd consider worth it
hmm not sure this is the right channel for that jett (but then also not sure which would be)
hi jet ! pls do share your discoveries as you go
something like this:
In [15]: pretend_locals = {}
...: b = pretend_locals
...: pretend_locals["b"] = b
...: print(b)
{'b': {...}}
In [16]: pretend_locals = {}
...: b = pretend_locals
...: print(b)
{}
!e I think this makes it clearer:
```py
def a():
    b = locals()
    print(b is locals())
    print(b)

a()
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | True
002 | {'b': {...}}
both calls to locals() in the function return the same dictionary, and each time locals() is called that dictionary is updated with the current locals. So if you only call locals() once and assign the result to b, then at the point locals() is evaluated, the name b doesn't exist yet, so it's not in the dictionary. If you call locals() a second time, it updates the function's locals dictionary with the current locals, and then returns another reference to that dictionary.
A Python frame holds an optional dictionary storing its local variables. locals() creates the per-frame locals dictionary if it doesn't already exist, then updates it with the current local variables, then returns it.
The next call to locals() sees that the per-frame locals dictionary already exists and doesn't need to be created, then it updates it with the current local variables, then returns it.
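A sketch that sidesteps the shared-dictionary question by copying each result right away, so it only shows the "reflects the locals at the time of the call" part. Whether two calls return the same dict object depends on the interpreter version; as far as I know, newer CPython (3.13, via PEP 667) hands out independent snapshots instead.

```python
def demo():
    before = dict(locals())  # copied before any local name is bound
    b = 1
    after = dict(locals())   # copied after `before` and `b` are bound
    return before, after


before, after = demo()
print(before)
print(after)
```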
oh i see thanks
Anyone wanna have a discussion about CPUs and GPUs and their possible future, knowing that CPUs and GPUs are almost at the end of what we can produce because of the lack of discovered technology??
@cyan ice how did you decide we are almost at the end of what we can produce?
Well I mean I am studying technology.
As you can tell, the new GPUs currently out barely differ in power, they just have an advantage in VRAM. Because VRAM is the only thing that can be upgraded. I mean, if they wanna make GPUs more powerful then they need to find new technology that allows them to keep the size small and still have power.
CPUs are barely increasing in threads and in GHz. Every new one they release barely has any extra power.
I say give it 5 years and they will be slowly running out of ways to produce GPUs and CPUs. They will need to find new technology or upgrade motherboard technology to support a new technology.
Maybe they should try and use fiber optics or maybe micro electrolytes to increase performance.
Maybe... But I guess that would increase size.
I mean quantum computers are going to be the next thing. However, a quantum computer is basically the size of a wall. They are huge, we've basically gone back to the first ever computers in terms of size.
So until they figure out how to make quantum computers smaller,
we are going to run out of upgrades eventually
@cyan ice 1) I know nothing about these chips; 2) everyone who has predicted that chips can't get more powerful has been wrong for the last 50 years; 3) this is off-topic for this channel
Yeah true lol.
That's what I mean lol, I think quantum computer is going to be the next huge leap. The question is when.
Also move to off topic lmao
Who knows, new technology could also mean new coding languages to run the new computers
Who knows 🤷‍♂️
try asking in #software-architecture or #web-development
thanks Steler
I'd been using colorama before for colored terminal output, but there were other libraries recommended. What were they and what was the rationale?
@uncut sage "rich" looks amazing
As I've been using numpy, pandas, and type annotations more extensively, it's annoying that I can't lint for properties of arrays/dataframes that are necessary for a given function to work. But then again, I'm not sure if it would actually be feasible to lint for things like shape or column names. Thoughts?
@boreal umbra PyContracts had some things for doing that
but that's not static analysis, it's runtime.
Ideally I'd want it to work seamlessly with whatever linting tools I find myself using. Which is pretty much always just the PyCharm one, or flake8 if I'm working on one of our projects.
The reason I'm suddenly interested in this: I deleted some classes that I wrote because everything they did could be done by dataframes, but the classes I wrote arguably made it more obvious what properties those objects had.
what if we had something like pd.DataFrame[shape=(2,3), columns=['a', 'b', 'c']]? I remember there was some discussion about whether or not the data model can handle that syntax for __getitem__--not sure about __class_getitem__
hi guys is anyone studying electrical engineering here? I just can't figure out what the «y» means next to the value on the voltmeter
What about index names though and values? How do you handle that? I also doubt it will ever happen bc there's no way to annotate arguments to check for say a specific key name in a dictionary right?
Capital Y usually denotes admittance in circuit analysis. <> denotes the time-average operator. You should look up the manual of your multimeter
you could just have more kwargs for all those different properties, I guess, but it would definitely get messy. You'd probably need to define an alias elsewhere in the code.
I think if you're in a situation where you need for a dictionary to have specific keys to be usable by a given function, you've probably designed your program wrong.
Isn't that literally what TypedDict is for?
Yeah I discovered that today
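A minimal sketch of what that looks like. Movie and describe are made-up names; the point is that a static checker can verify the keys, while at runtime the value is still an ordinary dict.

```python
from typing import TypedDict


class Movie(TypedDict):
    title: str
    year: int


def describe(movie: Movie) -> str:
    # A type checker can flag a misspelled or missing key here.
    return f"{movie['title']} ({movie['year']})"


movie: Movie = {"title": "Brazil", "year": 1985}
print(describe(movie), type(movie).__name__)
```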
Here's a question. When you guys write docstring example sections, do you have a method of, say, writing the examples in a notebook and then converting them to a docstring? I've got to write a fair bit of them and it's super annoying to do all the copy and pasting
Don't have to write examples that much but usually do them in an ipython repl or manually for smaller examples
Yeah I do the repl right now, it's just really slow
Why isn't there a possibility to use dunder methods for modules directly?
for example defining __call__ function in a module, which would make this possible
```py
import my_module
my_module("hi")
```
@radiant scroll you can for some. __getattr__ works in modules now
oh, interesting, is implementing other dunders, such as that __call__ planned?
no idea
I wonder, with that __getattr__ does it also need to take self as parameter, because it seems kind of odd to take self with module?
it does not
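A sketch of the module-level __getattr__ hook (PEP 562, Python 3.7+). To keep it in one file, the module is built in memory with types.ModuleType and exec; in real code the function would simply sit at the top level of my_module.py. The name answer is made up.

```python
import types

source = '''
def __getattr__(name):
    # Called only when normal attribute lookup on the module fails.
    # Note: the only parameter is the attribute name, there is no self.
    if name == "answer":
        return 42
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

my_module = types.ModuleType("my_module")
exec(source, my_module.__dict__)

print(my_module.answer)
```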
It's possible to put something that is not a module into sys.modules. If you do:
```py
import sys


class MyFakeModule:
    def __call__(self, greeting):
        print(greeting)


sys.modules[__name__] = MyFakeModule()
```
Then you would be able to import and call it.
It'll confuse the heck out of anyone who needs to maintain your code, but it's possible.
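A single-file sketch of the same trick, registering the object under a made-up name (fake_greeter) instead of replacing the current module. The import statement finds the entry in sys.modules and binds whatever object is stored there.

```python
import sys


class CallableModule:
    """Hypothetical stand-in for a module that can be called."""

    def __call__(self, greeting):
        return f"module says: {greeting}"


# Register the instance under a name that has no real module behind it.
sys.modules["fake_greeter"] = CallableModule()

import fake_greeter  # resolved from sys.modules, no file lookup needed

result = fake_greeter("hi")
print(result)
```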
oh that's very interesting
I'm now discovering that __getitem__ and other methods that read from square brackets can't accept keyword arguments. And now my day is ruined.
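A possible stopgap, sketched under the assumption that abusing slices is acceptable: square brackets hand __getitem__ or __class_getitem__ a single key object (a tuple when commas are used), and a slice can carry a name:value pair. Spec is a made-up helper, not pandas API.

```python
class Spec:
    """Hypothetical helper that collects name:value pairs from a subscript."""

    def __class_getitem__(cls, key):
        # A single item arrives as-is, several items arrive as one tuple.
        if not isinstance(key, tuple):
            key = (key,)
        # Each item is a slice: "name":value becomes slice("name", value, None).
        return {item.start: item.stop for item in key}


spec = Spec["shape":(2, 3), "columns":["a", "b", "c"]]
print(spec)
```

It reads a bit oddly, which is arguably the best argument for real keyword support.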
I believe there's a proposal for this
I thought I saw something about it on Python ideas but it looks like it's not in 3.9
But the real question is, is your disappointment immeasurable as well?
It is indeed.
This will come as no surprise to gm, but I was about to start prototyping a type annotation system that I want pandas to have
!pep 637
I need it
you kids and your types
I'm a Python native, so you can't say I want this functionality because another language planted in me the assumption that you need that.
No, I say it's because your generation is ruining Python 
serious on-topic musing: I wonder how much of the push to add static typing to Python is due to people getting mocked for using "a scripting language" for "real work".
A: pfft, you use python? that's a scripting language
B: nah, it's statically typed, so it must be real
exactly like that.
the language got by just fine without it for 25 years, then got increasingly popular, and then it became a huge new feature that everyone agreed must be added, for... some... reason. Maybe that reason was that it didn't fit well in some of the new niches it was being adopted into and the new feature actually helps, or maybe the reason was that some of the new people who were adopting it didn't like that the feature didn't exist, for no other reason than some preconception of what a language should be.
I don't really understand the whole "scripting language" pejorative. A friend once said "I like Python as a scripting language. Like for reading a CSV or building a Discord bot" (emphasis mine)
The latter is by no means a "script". So they didn't actually have an idea of what they meant by "scripting language" other than that they hate python.
yeah - again, from a preconception of what a language must be.
"if it isn't compiled to machine code, it isn't a real language", or whatever.
assembly isn't real
it's awfully puzzling to me the amount of vitriol that the walrus operator got, when most people can completely pretend it doesn't exist and never have to see it. Meanwhile, type annotations were a huge change to the language that has infested nearly every Python codebase in the wild to some degree or another, that was added to the language years before there was a production-ready tool for using them, and there was much less uproar over them.
yeah, it was as if it even being in the language like "tainted" the language such that it couldn't be used anymore
I guess the assumption was that the walrus operator was a slippery slope towards destroying Python's reputation for readability, whereas type annotations could have been construed as a necessary step to legitimize and/or future-proof the language.
with the benefit of a few years on each of them, if you asked me which of the two did more to destroy the language's readability, I know which I'd pick.
could it be... neither?
no.
so what do you hate about type annotations?
I think they're noisy and unnecessary, the tooling for using them (at least mypy) is still terrible, they duplicate information that already appeared in documentation, they fall apart at representing all sorts of common Python idioms...
someone at work the other day was asking why mypy was complaining when he tried to pass an io.BytesIO as the file= to print, and... as far as I can tell, the answer is just that type annotations are bad and don't do their job well.
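for what it's worth, a quick check suggests the runtime agrees with the checker on this particular one: print writes str, and BytesIO.write only takes bytes-like objects. A sketch, with TextIOWrapper as one way to get text into a bytes buffer:

```python
import io

# StringIO accepts str, so print works as usual.
text_buffer = io.StringIO()
print("hello", file=text_buffer)

# BytesIO.write rejects str, so print raises at runtime.
binary_buffer = io.BytesIO()
try:
    print("hello", file=binary_buffer)
    outcome = "ok"
except TypeError:
    outcome = "TypeError"

# Wrapping the bytes buffer in a text layer encodes on the way in.
raw = io.BytesIO()
writer = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
print("hello", file=writer)
writer.flush()
data = raw.getvalue()

print(outcome, data)
```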
i still cant get past the visual clutter of it all
my brain just shuts down when i see type annotations. what's wrong with good ol docstrings ๐
and there's ongoing discussion on using them for newer and more terrible things - python-ideas has someone suggest using const as a type annotation about once a week, and const is not a type!
does that add anything that a SCREAMING_SNAKE_CASE variable name doesn't communicate?
I agree that the tooling could be better and the system is lacking in some aspects but with them being mostly only present in definitions I find it easy to filter out when the information is unnecessary, docstrings that mix in the type information into text are harder to read for me
sometimes people propose that it means that a name cannot be rebound, and other times people propose that it means that an object cannot be modified through that name. So, who knows, and we'll see.
yeah, I'm not on board with adding enforced constants. As the developer, you have the option to not write code that you don't want to write. Same reason I don't want private variables.
I find that wild - the most important part of the entire function signature is "what parameters do you take", and now that's frequently spread over 3 or 4 lines where 80% of the line is noise. At least with docstrings it was in the docstring, all contained in one place and easy to skip over when reading.
and it's not just mypy - all sorts of code is broken for it. I have a PR right now that I'm trying very hard to talk the Sphinx author into, because Sphinx tries to interpret type hints when generating documentation, and it does it buggily, still, years after type annotations were introduced.
we added them to the language saying "once we have these someone can build tools that use them", and it's been years, and all the tools (that I've used, at least) are still bad.
Most of my definitions where I have more params with non trivial typehints end up with each param having its own line, then it's just a matter of stopping reading on the colon
A nice way to stop reading on the colon is to not have the colon or the stuff afterwards!
why, if you did that, you could even make it more readable by putting all of the parameters on one line, so that you can read from left to right like it's English!
doesn't everyone have 5k monitors nowadays
then you can just read it on one line
which might actually be ideal since you don't move your eyes as much as like, multiple lines
Lest anyone think I'm a type bigot: I've met those who put a lot of faith in type safety to help them write code. But no amount of pre-runtime tooling will ever absolve the developer of the responsibility to understand the problem they're solving and to solve it.
pfft, that's the job of the problem type checker
im stuck in the perpetual cycle of "Cant be bothered to change the rules of my pep8 hint" then "ah crap, crossed 79 chars, need to fix"
just set it to 120 and be done with it
Well you do get diminishing returns the longer it gets on one line
that's step 3! "ah, i should change it to 120". and then step 4 "ah i forgot. meh"
new feature, autoscroll: as it detects that you're done reading, it just moves so you don't have to move your eyes at all!
I admit that documenting types in a structured way is useful for developers across module boundaries, and it's nice for IDEs to be able to use those hints. If people stopped there and treated it only as human-readable annotations rather than machine-readable static types, I wouldn't mind it nearly as much. (And I wonder how many of those who were in favor of approving type hints anticipated them being used that way, and were bamboozled into the current status quo)
I'm actually starting to get to the opposite side of it, where my project is becoming too complex and typing it is becoming next to impossible without adding more and more to make pylance happy
are you defining aliases in advance?
yep that was one of the first things I did when I started typing it
I've come to the (partial) realisation that when you don't have/need strict i/o, a static typechecker isn't all that useful
that being said, typing args+return types is
Even if your 250-character line fits on the screen:
- if it's long, it's probably very complex, which is hard to understand
- some people will read your code on smaller screens or in a split view (in a diff view, for example)
[chr(ord('fmcd[luX`lxn'[i])^i) for i in xrange(12)]
what does this python code do? anyone?
solve it one bracket at a time, it's just printing a message.
if you go from inside out, you'll see that it's indexing a string, taking its ordinal, doing a manipulation on the integer, and converting it back to a character.
it's basically applying XOR encoding
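a Python 3 sketch of the same idea (xrange is Python 2 only). xor_with_index is a made-up name; since XOR-ing twice with the same value gives back the original, the one function both encodes and decodes.

```python
def xor_with_index(text):
    """XOR each character's code point with its position in the string."""
    return "".join(chr(ord(ch) ^ i) for i, ch in enumerate(text))


encoded = "fmcd[luX`lxn"
decoded = xor_with_index(encoded)

# Applying the same transformation again restores the input.
round_trip = xor_with_index(decoded)
print(round_trip == encoded)
```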
@unkempt rock @stone prism this is strictly a discussion channel. Take a look at #❓｜how-to-get-help
what to do?
I wrote a post on typed Python: https://www.balajeerc.info/The-Joy-of-Typed-Python/ Feedback welcome
Balajee Ramachandran
If I am to start working on a new project today, I would hesitate to attempt it in a language that does not have compile-time type checking. However, I do have to deal with Python at work (though we are slowly phasing it out). Also, I have been working off and on, in my spare time, on a Python project that has over the past 3+ years gotten fairl...
(please let me know if this is offtopic here)
I think it's a great article, well-written, correct, and interesting to read
one minor thing I might add
- The Python interpreter simply ignores the type annotations.
+ The Python interpreter simply ignores the type annotations (except for use with reflection).
You could also mention other type checkers, like Pyright (which actually addresses some shortcomings of mypy, like recursive type aliases and type aliases in error messages, which you mentioned) and Pyre
@grave jolt thanks, I did not know about the fact the interpreter uses it with reflection. I'll look this up a bit more.
also, yes, I was considering checking out pyright myself
@vast plank what @grave jolt meant is that the annotations are stored on the functions, so you can introspect them if you want.
Yes, it doesn't really "do" anything with it, it just stores that information. Some libraries inspect type annotations for some behaviour.
For example, FastAPI uses them to figure out how to validate and parse HTTP requests
!e
```py
def f(x: int, y: str) -> bool:
    ...

print(f.__annotations__)
```
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
{'x': <class 'int'>, 'y': <class 'str'>, 'return': <class 'bool'>}
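To make "some libraries inspect type annotations" a bit more concrete, here's a toy sketch of the general idea. check_types is a made-up decorator, not how FastAPI or any real library does it, and it only handles plain classes as annotations.

```python
import functools
import inspect
from typing import get_type_hints


def check_types(func):
    """Hypothetical decorator: validate arguments against simple annotations."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; generics are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name} should be {expected.__name__}")
        return func(*args, **kwargs)

    return wrapper


@check_types
def repeat(text: str, times: int) -> str:
    return text * times


print(repeat("ab", 2))
```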
@grave jolt i understand now. thanks for clarifying.
it's not very important information, though, so you haven't missed anything in the article
@grave jolt also thanks for having taken the time to read the post and provide feedback. much appreciated. please let me know if there is anything else I might have missed.
Oh, maybe also mention Protocols?
yes, I have read the mypy documentation about them. however, I hadn't used them myself. I will add a section on it.
what does None do when you are trying to index a numpy array? like arr[None]? like it turns array (4,) into (1,4) (when doing .shape) but like... wat?
@cedar flare None is np.newaxis
out of curiosity, would anyone know if gym.render() is giving me an image?
That statement means nothing to us out of context.
Hey, is it:
def func(*strings: str): ...
or
def func(*strings: Sequence[str]): ...
It would be the former
you can do *strings: Tuple[str, ...] as well, and nothing else
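A small sketch for the first form. Per PEP 484, the annotation on *strings describes each individual argument, and inside the function the name is a tuple of them; so annotating it as Tuple[str, ...] would, as I read the PEP, tell a checker that every argument is itself a tuple of strings.

```python
def func(*strings: str) -> tuple:
    # Inside the function, strings is a tuple holding all positional arguments.
    return strings


collected = func("a", "b")
print(collected, func.__annotations__["strings"])
```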
Showerthought time. Consider this. New syntax for ranges.
First, [1:n+]/[1:n^] that means the same as [1:n+1].
Make it work with strings, so ['a':'z'^] means string.ascii_lowercase[:]
Make range work with slicing as well as method syntax. for i in range[2:10:3] instead of range(2, 10, 3).
Why? Because stuff like pandas 'loc' where it has different indexing from default just because you want inclusive indexing for column names.
!eval slice('n^')
@paper echo :warning: Your eval job has completed with return code 0.
[No output]
You can do it without new syntax
By parsing the stuff you pass to :
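One way that could look without touching the grammar, as a rough sketch: irange is a made-up class whose subscript takes a slice and treats the stop as inclusive. It assumes a positive step and, for characters, an explicit start.

```python
class irange:
    """Hypothetical helper: irange[a:b] includes b, unlike range(a, b)."""

    def __class_getitem__(cls, key):
        if not isinstance(key, slice) or key.stop is None:
            raise TypeError("use irange[start:stop] or irange[start:stop:step]")
        step = 1 if key.step is None else key.step
        if isinstance(key.stop, str):
            # Single characters: walk the code points, end included.
            codes = range(ord(key.start), ord(key.stop) + 1, step)
            return [chr(code) for code in codes]
        start = 0 if key.start is None else key.start
        return list(range(start, key.stop + 1, step))


print(irange[1:5])
print(irange[2:10:3])
print("".join(irange["a":"e"]))
```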
I don't see the value in adding [] to range but i see lots of value in adding things like regex to indexers
I don't know if I agree with adding new functionality to basic language constructs
In my opinion python is getting dangerously close to becoming something like perl, with too many clever short hand ways to do the same thing baked into the language syntax
We are in a good place right now but I'm afraid we might go over the edge with pattern matching
But something like Slicer(some_dict).keys.re[r'^foo.*-1'] could be a nice 3rd party library
And pandas would definitely benefit from more ways to operate on indexes without having to actually use the .index accessor
Especially multi indexes which are a verbose pain in the ass
Which is unfortunate because they are such a powerful tool
Also, considering previous discussion on typing. I'm looking forwards to what would be possible with new Annotated typehint.
I agree, that current typing system tries to do three things at once.
First, it's actual runtime types (and protocols).
Second, it's restrictions on values which are replacement for if value not in values: raise ValueError or asserts. For example, checking if n >= 0 in a function where negative values are meaningless. Or hell, division by zero. For now we have Literal.
And third, it's meta-information like Final. Private values are also good candidate - you can access them for debugging, but the IDE would warn you (though it duplicates _val syntax).
I think all three are useful, but maybe they should be expressed in different ways
Maybe something like n: int : Literal[1, 2, 3] : Final?
We have Final[Literal[1,2,3]]?
int: Literal is redundant at best or inconsistent at worst
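To make the "restrictions on values" idea concrete, here is a minimal sketch of how `Annotated` (Python 3.9+) could carry a value constraint next to the runtime type. `AtLeast` and `validate` are hypothetical helpers written for this example, not part of `typing`:

```python
from dataclasses import dataclass
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class AtLeast:
    """Hypothetical marker: the annotated value must be >= minimum."""
    minimum: int


def validate(func, **kwargs):
    """Check keyword arguments against AtLeast markers on func's annotations."""
    hints = get_type_hints(func, include_extras=True)
    for name, value in kwargs.items():
        # Annotated[...] exposes its extra arguments as __metadata__.
        for marker in getattr(hints.get(name), "__metadata__", ()):
            if isinstance(marker, AtLeast) and value < marker.minimum:
                raise ValueError(f"{name}={value!r} must be >= {marker.minimum}")


def factorial(n: Annotated[int, AtLeast(0)]) -> int:
    validate(factorial, n=n)
    return 1 if n < 2 else n * factorial(n - 1)


print(factorial(5))  # 120
```

Static checkers still see plain `int` here. The marker is only meaningful to code that chooses to read it, which is roughly the separation of concerns being discussed.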
@paper echo Interesting point about regex. Would you use re.compile and use compiled regex as a key?
Sure why not
Now one thing i WOULD like from python is syntax for regex literals
That'd be sweet
Maybe silly though
I just couldn't think up of a better example. The point is "flat is better than nested" and we could use several : instead of nested types.
I see what you mean. But the nesting is really just syntax here, not reflecting a nested data structure
Maybe a tuple then
Even better - ability to make your own literals.
from re import regex_literal
regex"*.txt"
Thatd be cool
Reader macros for python!
Or first class infix functions like haskell
That would probably confuse new users but also would obviate a lot of verbosity from the operator library
Fwiw you can do all this with Hy
There are already macros PEPs underway, but I have zero idea if it would be cool and useful thing or a way to make DSLs that make python unreadable.
I think it would make the language unreadable, this is one of the main reasons people don't like Ruby. Too much metaprogramming abuse.
Can you provide an example how python infixes could look? Not familiar with Haskell syntax.
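Haskell lets you write a named two-argument function between its arguments by wrapping it in backticks. Python has no such syntax, but a well-known trick emulates it by overloading `|`. This is a sketch of that trick, not a proposal, and the `Infix` name is just a convention:

```python
class Infix:
    """Wrap a two-argument function so it can be written as  a |f| b."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Handles  left | self : remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # Handles  partial | right : supply the right operand and call.
        return self.func(right)


@Infix
def add(a, b):
    return a + b


print(2 |add| 3)   # 5
```

It only works when the left operand's own `__or__` gives up and returns `NotImplemented` for an `Infix` argument, which is true for ints, strings and most builtins.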
Hi I am new and I need a little help can anyone help me ?
@dim kindle You probably want to use a help channel or general, for quick questions, this is a discussion channel
ok thanks
which channel can you let please
@dim kindle consult #how-to-get-help
Hey, that's a cool way to omit parentheses and make your own sort of operators
What is the PEP for macros?
Haskell is very cool
The basic syntax is really elegant
Lots of other complexity in the language but the basic parsing and evaluation model is really nice
Thx
I've heard a lot of complaints about the fact that it uses several function syntaxes at once
What, haskell? Or the pep
@unkempt rock see #how-to-get-help
Haskell. As I said, I'm not a haskellist, but I've seen plenty of flame wars about "should you use do notation"
Well, maybe I'm wrong.
df[df['x'] == 1]
That made me think. Is it possible to pass the whole expression?
So you could do df[x > 0] instead of df[df[x] > 0]
Thats actually a suggested pep iirc
What pep? This would be a special case for the parser, yes?
@severe lichen you could do that if x was a custom object that returned context from its comparisons
How would the object know what dataframe it's supposed to refer to?
If you have to set that in advance, that kind of kills the elegance.
You could step up the frame, but it doesn't have to
The data frame would receive the object and could act based on it (pulling out values and what comparison was done)
(Tbh tho I'm now curious if you could implement something like that on the builtin list type with monkey patching)
@pliant tusk I think I see where you're going with this. But remember that objects typically don't know what the names of the variables that point to them are. So I still can't think of a way to do it without any earlier setup
It would be tricky, but doable
x,y and z could be special objects by themselves (and maybe a way to create additional axis)
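Here is a small sketch of the "custom object that returns context from its comparisons" idea, using a toy frame instead of pandas so it stays self-contained. `Column`, `Condition` and `Frame` are invented names, and note that it still needs the `x = Column("x")` setup step that was objected to above:

```python
import operator


class Condition:
    """A recorded comparison such as  x > 0  that has not been evaluated yet."""

    def __init__(self, column, op, value):
        self.column, self.op, self.value = column, op, value


class Column:
    """Placeholder for a column name; comparisons return Condition objects."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, other):
        return Condition(self.name, operator.gt, other)

    def __eq__(self, other):
        return Condition(self.name, operator.eq, other)


class Frame:
    """Toy stand-in for a dataframe: a list of row dicts."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, cond):
        # The frame decides how to interpret the recorded comparison.
        return [r for r in self.rows if cond.op(r[cond.column], cond.value)]


x = Column("x")
df = Frame([{"x": -1}, {"x": 0}, {"x": 1}, {"x": 2}])
print(df[x > 0])    # [{'x': 1}, {'x': 2}]
print(df[x == 1])   # [{'x': 1}]
```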
!e
Well, that's a pretty big breaking change. Any expression can be an index. You could imagine doing something weird like
x = 5
print(("a", "b")[x > 0])
x = 0
print(("a", "b")[x > 0])
opts = {True: "yes", False: "no", None: "idk"}
print(opts[x > 0])
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
001 | b
002 | a
003 | no
Pep 637 offers this example with pandas
```py
df[df[x] == 1]
```
to
```py
df[x == 1]
```
And how the pep would change the first notation
The PEG parser should allow regex literals of the sort that Perl/AWK/JS/Ruby have, using x = /a.*b/ or the like. I'm pretty sure that would be unambiguous with the existing meaning of the / and // operators, since they're both binary and can't appear in a place where a name is expected, and a regex literal could only appear where a name is expected... I think.
And avoiding the need for re.compile() and raw strings might be a good motivation for such a PEP. Though, it is another step towards syntax soup
The PEG parser should allow regex literals
I think it is more of a tokenizer matter than a parser one. The only thing the parser has to do is wrap the token in an AST node. Though I don't think anyone would support such a feature.
There are multiple problems when it comes to compiling those, for example whether you keep the regex in co_consts or build the regex constants at runtime. If the first choice is taken, you need to create regex objects at compile time, which is quite problematic (the whole infrastructure is built for built-in types, not stuff from extension modules). If you take the second route, it will be slow, since you have to reconstruct the regex object every time (imagine doing re.<func>(<regex>, <text>) every time instead of GLOBAL_RE = re.compile(<regex>) once and searching on that, which is quite fast compared to the first one).
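The two styles being compared look roughly like this. It's only a sketch with made-up function names, and note that `re` keeps an internal cache of compiled patterns, so the inline form pays for a cache lookup per call, not a full recompile:

```python
import re

# Built once at import time and reused on every call.
WORD_RE = re.compile(r"[a-z]+")


def count_words_precompiled(lines):
    return sum(len(WORD_RE.findall(line)) for line in lines)


def count_words_inline(lines):
    # Passes the pattern string each time; re has to look the compiled
    # pattern up (or rebuild it) on every call.
    return sum(len(re.findall(r"[a-z]+", line)) for line in lines)


lines = ["flat is better", "than nested"]
print(count_words_precompiled(lines), count_words_inline(lines))   # 5 5
```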
also, how would you compile a regex literal that uses the 3rd party regex module?
or some other library
you can't, because they have different syntax
fair enough. I was focused on whether it would be grammatically unambiguous, and I think it would - but I buy that there are other problems with the idea beyond that.
You can support custom regex literals by assigning to a special parameter that is localized to the current module
Not sure if the runtime currently supports "module local" top-level variables
import re
import regex
assert type(/^a/) is type(re.compile(r'^a'))
sys.regex_constructor = regex.compile
assert type(/^a/) is type(regex.compile(r'^a'))
That might actually be interesting in general
Eg assigning something to sys.stdout could/should be limited to the module in which the assigning code is defined
@paper echo why would you want sys.stdout assignments to only affect the current module?
You'd need to alter the tokenizer for that, right? Or do you propose to just parse everything between / and /, respecting \/ being an escape, and then pass that to a specific factory?
regex actually has syntax that's not supported by re, so it's a superset of re
Yeah @grave jolt
@spark magnet because I've had it up to here with badly behaved modules developed by naive developers that modify global things, logging especially
In this case you can use regex.compile-based literals in your library without forcing anyone who imports your library to do the same
I also would put a vote in for changing the behavior of getLogger() to be equivalent to getLogger(__name__) for similar reasons (and adjust the behavior of logging.info, logging.basicConfig etc accordingly) but that is a somewhat significant breaking change
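For anyone who hasn't run into this: a bare `getLogger()` hands back the process-wide root logger, while a named logger is a separate object that merely has the root as an ancestor. A quick sketch, where `"mylib"` is a placeholder for whatever `__name__` would be in a library module:

```python
import logging

root = logging.getLogger()              # no name: the shared root logger
lib_log = logging.getLogger("mylib")    # stand-in for getLogger(__name__)

print(root.name)                  # root
print(lib_log.parent is root)     # True

# Configuring the named logger leaves the root logger's level untouched.
before = root.level
lib_log.setLevel(logging.DEBUG)
print(root.level == before)       # True
```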
@paper echo the logging module is far out-of-style, for sure.
My advisor says that she won't embrace Python unless it gives regular expressions the first-class treatment they get in Perl, but I don't see what's so bad about having all regex functionality namespaced together in a module that's always available. And then, if you want to use a regex engine that isn't the standard Python one, I don't want it to get fundamentally different treatment.
That's a perfectly defensible argument for why there shouldn't be regular expression literals. But of course, you could just tell your advisor that having regular expression literals is a hallmark of a scripting language, and that you only like to use real languages.
@boreal umbra I dare you
I could never say that to her. She is queen af.
who?
My advisor.
sounds like you have a great relationship, which makes the jab all the more enjoyable
Her other issue with Python is that it doesn't have sigils, so she has to remember if something is a list or a dict. But only having that for two types isn't that great
And the variables should be named such that if you understand what problem is being solved, the types are inferable.
class Something(metaclass=HI)
when we do metaclass=hi the __new__ method of the metaclass is called right?
my usual name for any dict that I want people to understand is x_by_y. address_by_name, or count_by_letter, or whatever.
__prepare__, __new__, __init__ are called in that order, i think (the metaclass's own __call__ only runs later, when you instantiate the class)
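One way to settle the order is to log each hook. In this sketch (`Meta` and the `calls` list are just names for the demo), `__prepare__` runs before the class body, `__new__` and `__init__` run after it, and the metaclass's own `__call__` only fires when the finished class is instantiated:

```python
calls = []


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        calls.append("__prepare__")
        return super().__prepare__(name, bases, **kwargs)

    def __new__(mcls, name, bases, namespace, **kwargs):
        calls.append("__new__")
        return super().__new__(mcls, name, bases, namespace, **kwargs)

    def __init__(cls, name, bases, namespace, **kwargs):
        calls.append("__init__")
        super().__init__(name, bases, namespace, **kwargs)

    def __call__(cls, *args, **kwargs):
        calls.append("__call__")
        return super().__call__(*args, **kwargs)


class Something(metaclass=Meta):
    calls.append("class body")


print(calls)   # ['__prepare__', 'class body', '__new__', '__init__']
Something()
print(calls[-1])   # __call__
```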
Hey can I ask a question about python-nmap module here?
You should ask in #tools-and-devops
ok
If anyone else is wondering how they could possibly lex regex literals, here is JS's method: https://www-archive.mozilla.org/js/language/js20-2000-07/rationale/syntax#regular-expressions (which I believe is a direct sign of why this is a bad idea for Python)
Why would you want regex builtin to base python? Regex is a dsl and it fits quite nicely being a module.
Would you really want to have to escape special characters all the time to make regex work?
I like that
!rule 5
@unkempt rock This is strictly a discussion channel. In general you could ask about this in #web-development (or #discord-bots if it's a discord bot), but it sounds like this violates Spotify's terms of service. Per rule five, we can't help with that.
This is a discussion channel; try opening a help session. See #how-to-get-help.
@fierce geode it could be that I was misreading the intent of your question; if it relates to gaining a deeper understanding of Python itself, I suppose that is on topic. If you want to talk in general about use cases for a niche library, try asking in the topical channel that most closely relates to that library or in #software-architecture.
What i want is a language that is stricter than bash/zsh but has things like input/output redirection (file handle literals), regex literals, and a gradual static type system
Macros would be nice too
Maybe tcl has some of what i want
@karmic creek this belongs in #data-science-and-ml
what's the reason why CPython doesn't automatically convert large tuples/lists to a set for use when people try the in operator on such a large tuple/list? it'd make it much faster, right?
because it cant guarantee a) the operations used, b) that all items will be unique, and c) that there wont be collisions
the runtime and language really shouldnt try make those decisions for the programmer, thats up for them to decide
what do you mean by a and c, and for b, if it's only checking if an item exists at least once in a sequence, shouldn't eliminating duplicates make no difference?
MM ig for points a and b if its just for the in lookup operation
point c is still valid as two items can be different and still possibly collide (unlikely but possible)
yeah i meant just for the in operation
what does a collision entail, and why don't we see those when we use set(), but we would with this method
oh
what error does it give?
hmm?
like what's the name of the error, so i can search it
hard to find a good explanation of it online but https://www.includehelp.com/data-structure-tutorial/collisions-in-hashing-and-collision-resolution-techniques.aspx does an alright job
its just generally known as hashtable collisions
ah ok, thanks!
CPython has internal algorithms to deal with hash collisions
You won't ever see a collision cause problems in dict keys; keys that collide but aren't equal are still stored separately
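To see that collisions are handled and not raised as errors, you can force one. This is a deliberately bad `__hash__` (the `Collider` class is made up for the demo), yet the dict still keeps the two keys apart because it falls back to `__eq__`:

```python
class Collider:
    """Every instance hashes to the same value, forcing collisions."""

    def __init__(self, label):
        self.label = label

    def __hash__(self):
        return 42

    def __eq__(self, other):
        return isinstance(other, Collider) and self.label == other.label


a, b = Collider("a"), Collider("b")
table = {a: "first", b: "second"}

print(hash(a) == hash(b))          # True
print(len(table))                  # 2
print(table[Collider("b")])        # second
```

The cost of colliding is speed, not correctness: lookups among keys that share a hash degrade toward a linear scan.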
as expected with most things python
No
in python not then
AFAIK it's part of the language spec
huh ok
(to the extent that python even has a spec)
attempting to test something and the repl decides to maybe not
we hit no error -> memory error -> c types go no
I guess you can't allocate an infinitely large list
sad
There is only a similar optimization, at compile time. If you are using in to check whether a sequence of constants contains something, Python will fold that sequence into the faster equivalent (list=>tuple, set=>frozenset).
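You can peek at this folding by compiling a small expression and looking at the code object's constants. This is a CPython implementation detail, so treat it as a sketch of what to look for. Both checks should print True on interpreter versions where the folding applies:

```python
list_check = compile("name in ['foo', 'bar', 'baz']", "<demo>", "eval")
set_check = compile("name in {'foo', 'bar', 'baz'}", "<demo>", "eval")

# On CPython, the folded containers show up as code-object constants.
print(("foo", "bar", "baz") in list_check.co_consts)
print(frozenset({"foo", "bar", "baz"}) in set_check.co_consts)
```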
Tbh i prefer that to OOM locking up your computer
an issue that can be pretty easily demonstrated with that conversion idea is the time it takes and the holy fucking memory usage
run
```py
x = list(range(1, 9999999999))
print("done")
y = set(x)
print("done")
```
and notice the time
also the memory
@silk pawn you can put things in tuples that aren't hashable
ohh another point
And if you just need to do one lookup it's better to just O(n) scan a tuple than to O(n) hash every element of the tuple to build a set and then do an O(1) lookup in the set
Im abusing asymptotic notation anyway
then again, what things arnt hashable in python ik there are some but i cant remember
It's faster to just look something up in a tuple than to build a set and look something up in it, if you aren't going to keep the set around
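If you want to measure it yourself, something like the following should do. The sizes are arbitrary and the needle is placed last, which is the worst case for the linear scan. No timings are quoted here since they depend on the machine:

```python
import timeit

data = tuple(range(100_000))
needle = 99_999   # worst case for the linear scan


def scan_once():
    return needle in data


def build_set_then_lookup():
    # Pays O(n) to hash every element before the O(1) lookup.
    return needle in set(data)


assert scan_once() == build_set_then_lookup()
print("tuple scan:       ", timeit.timeit(scan_once, number=20))
print("build set + check:", timeit.timeit(build_set_then_lookup, number=20))
```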
most things should be
Anything mutable is not hashable by default
Lists, dicts, sets, and instances of user defined classes that define __eq__ without __hash__
that makes sense yeah
The latter need to specifically be given a __hash__ method and even then it's assumed that it is not mutated after it is hashed
Otherwise you violate the semantics of dicts sets etc
semantics were made to be broken :P
Hash equality should correspond to object equality
The purpose of an optimization is to have the same behavior, with better performance. When doing membership tests on a set (in/not in), the set expects the target to be a hashable object, but tuples can take anything. So if you write your check like if name in {'foo', 'bar', 'baz'}: ... then we can safely convert it to a frozenset, because the TypeError will be raised regardless if the name object is not hashable. This is not true for tuple=>*set conversions.
Really only fundamental atomic things like strings, ints, floats, None, functions, and bools are hashable, and tuples thereof
Here is an example
>>> class X:
...     def __eq__(self, other): return False
...
>>> X() in {1,2}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'X'
>>> X() in (1, 2)
False
icic
Also, this optimization both speeds up the contains check and object creation. Since [constant] tuples and frozensets are immutable objects, they can be safely stored in co_consts so you won't create a new object every time you perform a lookup.
it would still be O(n) to verify that everything in the list/tuple is hashable and to put it in a set.
From my understanding, the implementation for list.__contains__ is basically:
for elem in list_:
    if item == elem:
        return True
return False
So it's at most n comparisons. Whereas converting it to a set guarantees at least n operations and may fail.
got it
and classes!
basically, an object is hashable if
a) it's compared by identity, not by value (e.g. function)
b) it's compared by value and is immutable
or c) it implements __hash__ however it wants to
class Pathological:
    def __hash__(self):
        return 17
well, yeah, I meant intended cases
randomized hash function, seed determined by fair dice roll
actually, the only requirement (by contract, anyway), is that a == b => hash(a) == hash(b), so if your objects are never equal (except to themselves), their hash can be whatever you want (but hash(a) == hash(a))
something I find a bit inelegant about python OOP: I'd rather be able to establish what attributes uniquely identify an object all at once. And I guess dataclasses solves that to an extent. I know salt rock lamp likes a library called attrs, but I've never looked into it.
right, but I want something like __identity__ that returns a tuple of what attributes uniquely identify an object, and then __eq__ and __hash__ can both use it implicitly.
I also don't like how the builtin sorting functions depend on __lt__. I might want that to be thrown in as well.
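Something close to that can be hand-rolled with a mixin. This is only a sketch: `_identity` is a made-up hook name (not a real dunder), and `functools.total_ordering` fills in the remaining comparison methods from `__lt__` and `__eq__`:

```python
from functools import total_ordering


@total_ordering
class IdentityMixin:
    """Derive __eq__, __hash__ and ordering from one _identity() tuple."""

    def _identity(self):
        raise NotImplementedError

    def __eq__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self._identity() == other._identity()

    def __lt__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self._identity() < other._identity()

    def __hash__(self):
        return hash(self._identity())


class Person(IdentityMixin):
    def __init__(self, name, age, nickname=None):
        self.name, self.age, self.nickname = name, age, nickname

    def _identity(self):
        # nickname is deliberately left out of the identity.
        return (self.name, self.age)


a = Person("Ada", 36, nickname="countess")
b = Person("Ada", 36)
print(a == b, hash(a) == hash(b))                      # True True
print(sorted([Person("Bob", 20), a])[0].name)          # Ada
```

`@dataclass(frozen=True, order=True)` gets you much the same result from the standard library, with fields opted out via `field(compare=False)`.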
it has lots of options, e.g. you can turn off init or repr
right