#type-hinting
heres what i have
```py
async def prefix_getter(bot: commands.Object, message: commands.Message) -> str:
    async with aiosqlite.connect("database/guilds.db") as connection:
        async with connection.cursor() as cursor:
            await cursor.execute(
                """
                SELECT * FROM prefixes
                WHERE guild_id = ?
                """,
                (message.guild_id,),
            )
            data = await cursor.fetchone()
            if data:
                return data[1]
            else:
                return "w!"
```
that's right
oh thanks
just like int means an integer, not the class int literally
annotations is what you use after the :
"type hint" is an annotation that's used to specify a type
originally annotations were created for various purposes, not just for type hinting
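A minimal sketch of that point, assuming nothing beyond the standard library: Python stores whatever expression follows the `:` without interpreting it, so an annotation does not have to be a type at all. The function `scale` and its annotation values are made up for illustration.

```python
# Annotations are arbitrary expressions; Python only stores them.
def scale(value: "any number", factor: 2.5 = 1) -> "the scaled value":
    return value * factor


# The interpreter never checks or uses these values itself.
# Tools (type checkers, frameworks) decide what they mean.
print(scale.__annotations__)
print(scale(2, 3))
```

A type hint is the special case where the stored object is meant to be read as a type by a type checker.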
examples ?
discord.py used annotations to mark the converter to use on the command argument
oh fair
!pep 3107
there's more about them in this pep ^
.bm
what does mark the converter exactly mean ?
thanks !
You can write something like
```py
class ISO8601DatetimeConverter(Converter):
    ...

@command
async def mute(user: discord.User, until: ISO8601DatetimeConverter()):
    ...
```
would it check if until is an instance of that converter ?
also is there really a class called converter
?
No, it would use the logic inside of the converter to parse the argument (a string) from a Discord message
you can look at examples in our bots https://github.com/python-discord/bot/blob/main/bot/exts/fun/off_topic_names.py#L113
bot/exts/fun/off_topic_names.py line 113
```py
async def delete_command(self, ctx: Context, *, name: OffTopicName) -> None:
```
discord.ext.commands.errors.ExtensionFailed: Extension 'cogs.prefix' raised an error: AttributeError: module 'discord.ext.commands' has no attribute 'Object' oh lmao
bot/converters.py line 383
```py
class OffTopicName(Converter):
```
whats Converter exactly ?
bot/converters.py line 13
```py
from discord.ext.commands import BadArgument, Bot, Context, Converter, IDConverter, MemberConverter, UserConverter
```
It's just a normal class that discord.py defines 🤷
discord/ext/commands/converter.py line 101
```py
class Converter(Protocol[T_co]):
```
it doesn't do a lot
If you want to see how discord parses commands, I don't know where it does that
I'd guess in process_commands(...)
!d discord.ext.commands.Bot.process_commands
```py
await process_commands(message)
```
This function is a [*coroutine*](https://docs.python.org/3/library/asyncio-task.html#coroutine).
This function processes the commands that have been registered to the bot and other groups. Without this coroutine, none of the commands will be triggered.
By default, this coroutine is called inside the [`on_message()`](https://discordpy.readthedocs.io/en/master/api.html#discord.on_message "discord.on_message") event. If you choose to override the [`on_message()`](https://discordpy.readthedocs.io/en/master/api.html#discord.on_message "discord.on_message") event, then you should invoke this coroutine as well.
This is built using other low level tools, and is equivalent to a call to [`get_context()`](https://discordpy.readthedocs.io/en/master/ext/commands/api.html#discord.ext.commands.Bot.get_context "discord.ext.commands.Bot.get_context") followed by a call to [`invoke()`](https://discordpy.readthedocs.io/en/master/ext/commands/api.html#discord.ext.commands.Bot.invoke "discord.ext.commands.Bot.invoke").
This also checks if the message’s author is a bot and doesn’t call [`get_context()`](https://discordpy.readthedocs.io/en/master/ext/commands/api.html#discord.ext.commands.Bot.get_context "discord.ext.commands.Bot.get_context") or [`invoke()`](https://discordpy.readthedocs.io/en/master/ext/commands/api.html#discord.ext.commands.Bot.invoke "discord.ext.commands.Bot.invoke") if so.
This function is a coroutine.
how to make me cry in one sentence, volume 1
It's in Command._parse_args or something
What's wrong with that?
a function cannot be a coroutine
a function defined with async def is a "coroutine function"
this causes confusion, especially among people new to async
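The distinction can be checked with the standard `inspect` module. This is a small sketch; `fetch_answer` is a made-up name.

```python
import asyncio
import inspect


async def fetch_answer() -> int:
    return 42


# `fetch_answer` itself is a coroutine *function*.
print(inspect.iscoroutinefunction(fetch_answer))

# Calling it runs nothing yet; it only creates a coroutine *object*.
coro = fetch_answer()
print(inspect.iscoroutine(coro))

# The body executes only when the coroutine is awaited or run.
result = asyncio.run(coro)
print(result)
```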
Ah I see
anyone aware if there is a nicer way to spell out the type when it may be a Pathlike or a path str?
make a type alias?
```py
Path = Union[PathLike, str]
```
or PathT if Path conflicts with pathlib.Path
We have aliases for that in the _typeshed module but that's stub-only
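A runnable sketch of such an alias, assuming you only need it in your own code. The name `StrPath` mirrors the stub-only alias in `_typeshed`, and `file_name` is a made-up helper.

```python
import os
from pathlib import Path
from typing import Union

# Quoted because os.PathLike is not subscriptable at runtime before 3.9.
StrPath = Union[str, "os.PathLike[str]"]


def file_name(path: StrPath) -> str:
    # os.fspath() accepts both str and PathLike and returns the str form.
    return os.path.basename(os.fspath(path))


print(file_name("data/report.txt"))
print(file_name(Path("data") / "report.txt"))
```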
how do i allow a name reassignment inside an import error handler?
what do you mean?
i have an ugly hack for dealing with messes users had in setuptools_scm
```py
try:
    from packaging.version import Version, InvalidVersion

    assert hasattr(
        Version, "release"
    ), "broken installation ensure packaging>=20 is available"
except ImportError:
    from pkg_resources._vendor.packaging.version import (
        Version as SetuptoolsVersion,
        InvalidVersion,
    )

    try:
        SetuptoolsVersion.release
        Version = SetuptoolsVersion
    except AttributeError:

        class Version(SetuptoolsVersion):  # type: ignore
            ...
```
is there any way to actually make exception handling part of typing in a sense (aka ensure people handle known exceptions and/or pass them on)
No
No
I don't think anyone who has experience in Java wants this
If you want to make errors explicit, you can use error values (kind of like in Haskell/Rust)
One trivial example is None used to indicate a missing value
I think https://github.com/dry-python/returns has something like this, but it seems pretty foreign to "idiomatic" Python tbh
The type system is already complicated. I think exceptions would only make it worse
hmm, i guess i'll have to rearrange the internals a bit, going from an Exception that communicates from the inside to an outer shell that creates the exception for setuptools/other tools, so that i can make that part of the handling optional
There are libraries that implement Result and Some in python which I prefer to this
I don't think you can make it not painful in Python...
Maybe if ? Is added to syntax
and if you do it in a lib it'll just be weird compared to everything else
well, you can emulate it with yield/await
or well, with exceptions
like
```py
@catch
def foo(x):
    y = unwrap(bar(x, x + 1))
    ...
```
instead of NoReturn a FlakyReturn[T, EXC_T] 😩
you can return a Union[T, YourException]
or make some Result[T, E] class with two variants, Ok[T] and Err[E]
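A minimal sketch of that `Result[T, E]` idea using plain dataclasses. It is not taken from any particular library; `parse_port` and `describe` are made-up examples.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


# A generic alias with two variants, as described above.
Result = Union[Ok[T], Err[E]]


def parse_port(raw: str) -> Result[int, str]:
    if not raw.isdigit():
        return Err(f"not a number: {raw!r}")
    port = int(raw)
    if not 0 < port < 65536:
        return Err(f"out of range: {port}")
    return Ok(port)


def describe(raw: str) -> str:
    result = parse_port(raw)
    # isinstance() narrows the union, so the checker knows which variant it is.
    if isinstance(result, Err):
        return f"error: {result.error}"
    return f"port {result.value}"
```

The caller cannot reach `.value` without first ruling out `Err`, which is the "explicit errors" property, at the cost of an `isinstance` check at every call site.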
Black has something like this
It's just very annoying IMHO to work with such classes without syntactic sugar or good lambdas
i switched all the inner apis to optionals, and create the exceptions on the outside, some stuff got nicer ^^
None isn't always a great error value, because it doesn't have any metadata attached to it
but if that fits your problem, I think it's an improvement
it fits, and with types all in place it's fine to go that way (the exception on the inside helped to avoid missing a guard)
basically my instincts for some stuff are honed for types aint helping me, and now we got mypy ^^
in particular overloads make things so much better sometimes
I used some really primitive error value in here: https://github.com/decorator-factory/pyright-playground/tree/master/backend/backend.
It's actually not as bad as I thought.
```py
async def download_code_handler(request: Request) -> Response:
    raw_source = dict(request.query_params)
    source = parse_code_source(raw_source)
    if isinstance(source, SimpleError):
        error = "Invalid request: {0}".format(source.message)
        return Response(error, status_code=HTTP_BAD_REQUEST)

    code = await download_code(source)
    if isinstance(code, SimpleError):
        error = "Could not fetch code: {0}".format(code.message)
        return Response(error, status_code=HTTP_NOT_FOUND)

    return Response(code)
```
whats the best way to declare a functions input arguments the same as the signature of a type ?
there's no syntax for that
is there a suggested option to do it?
I'm not aware of anything other than copy-pasting
that seems doable with a mypy plugin
is mypy able to transform a Type T into a Callable[SPEC, T]
the idea would be that there is a
```py
def transfer_input_args(template: Callable[Params, T]):
    def decorate(func: Callable[..., T2]) -> Callable[Params, T2]:
        return cast(Callable[Params, T2], func)

    return decorate
```
?
that sounds like ParamSpec, which mypy does support
if you basically want to reuse the type information from a function signature, you can kind of do this by declaring a dataclass
it does make it a bit more awkward to call, but it lets you reuse the same "type signature"/type information, in multiple places, which can be very useful in some cases
e.g. if you want to extract all the arguments that a certain function needs in order to be called from somewhere, like a json, or command line arguments.
Then just having a function take the dataclass and parsing out the dataclass is a very simple and clean way to do it
the class already exists, but the backward compatibility api is still there
Yeah, then it is what it is. But if it's a backwards compatibility API then at least you aren't going to be changing it
so the duplication is less important
sometimes it is much preferable to keep the duplication (maybe note it) than to introduce some common abstraction
The "DRY" solution can be way more complex than appropriate. And in the end it might be the wrong abstraction, which will have to change with some new requirements/bug fixes
It can be good to have duplication, especially in multiple modules - then the info is locally available, you don't have to look elsewhere for it. Write tests probably to ensure they don't become mismatched.
idk, if DRY is very complex, or you specifically don't plan to keep them in sync like here because it's a backwards compat API that's not evolving, that's fine
Duplication to avoid importing from another module though, I have trouble making sense of that
by that logic we can also just copy paste everywhere instead of importing functions 🤷
Sometimes making the code apparently DRY can lead to coupling. You thought that modules A and B needs the same function/class, but then it turns out they needed something slightly different.
well, I can't think of any easy examples
the thing is that if/when that day comes, you can always transition to copy/paste
I think people are reluctant to do that
that's one factor I find that advocates of "copy paste first, abstract after N repetitions" tend to ignore.
It's much easier to see that there's a common abstraction, and split it out.
If you're looking at a block of code that's been copy pasted because the author thought "oh, this doesn't meet such and such requirement for being factored out" - you'll very likely never even know there's an identical block in the codebase
Find References is a pretty good tool it turns out
my original point was that extracting out the common idea can result in much more complex code
for example, if you have an async client and a non-async client
yes, that part I agree with. If your DRY mechanism for the particular problem is poor.
that doesn't mean the duplication is good of course, it's just less bad than the code complexity of avoiding the duplication
true
Believe me I've done some non-dry things. I have a lot of config information that gets passed from python to C++, and I have the same classes/structs declared in C++, and in python
well obviously, you need to create a new language with one interpreter in C++ and one in Python, and define these configs in that language
I just didn't find any way to get rid of the repetition that I liked enough
clearly
The funny thing was that it was actually the type annotations in large part that made it this way.
I found ways that I liked to abstract this out, e.g. using boost python
but then if I wanted static type checking in python, I'd have to write out the stubs file
and since these are just simple dataclasses with no behavior....
The ability to make things DRY also depends on the language. For example, making a sync and an async client would not be that challenging in Haskell because it has higher-kinded types. Some stuff with types is much harder in Python than it is with TypeScript (like extracting common parameters from a bunch of functions).
yeah, it massively depends on the language
Sometimes with de-duplicating code you can lose some features like type-safety or speed. Consider C -- it has nothing remotely resembling generics, so it has dynamically typed functions like qsort and qsort_r (which is another hack because it doesn't have closures/function objects)
i can't think of any kind of DRY that isn't pretty easily avoided in most lisps.... macros are like the thermonuclear weapons of the DRY arsenal
well, that's where C++ generics are a weirdly good fit, they are largely equivalent to duplicating code by hand, but without the worst downsides
duplication of your initial headache at 1am
that too
the cost is high.
I'm still not convinced about macros as a language feature but I don't really have enough experience in this area to sufficiently back up my opinion
I have worked with some projects that are clearly overDRYed...
or rather, in the wrong way
with metaclasses, decorator stacks etc.
yeah, I don't love reaching for such complicated options in python
that said I don't find I need to do duplication all too often either. Just pretty basic functions and classes let me reuse almost everything I need.
occasional use of reflection is helpful too
My general complaint about "overdried" code is that when reading it, I often have to do reduction, i.e. expanding the abstraction with the stuff substituted into it
Sometimes the abstraction is good: you read it once or twice and you understand it -- you don't really need to do the expansion anymore
I think that's what TeamSpen meant by locality
What I meant is having to repeat the type hints when overriding a method in a subclass, for instance.
I think that in practice, people rarely inline the abstractions (at least in my experience). It feels a bit dirty
You have to repeat yourself, but it means you don't have to consult that other file to figure out what these arguments are.
that's also why distributed systems often duplicate data
the data duplication is pretty much always an implementation detail for performance reasons
not in any way comparable to what's being discussed here
(performance/backup/technical reasons)
sometimes it impacts consistency
Like, instead of talking to a single source of truth, you have some local data store that's not always in sync
yeah, it's a different story
You say people rarely inline the abstraction, but at least they clearly have the option to, they can find all the usages in an automated way and make a decision. People can't always make perfect decisions about abstractions, that's life.
But if you copy and paste some code, there just isn't any way for the next guy to find the repetition in a reliable manner. You don't even have the possibility of making an informed decision, unless you get lucky ad hoc.
I guess you can mark it with some standard comment, like # DUP: computing the Foo. But that's prone to breakage, like any comment
I guess a more reliable way would be to monitor fresh abstractions so that they don't rot. Something like: "after a month/after 100 commits, see if it still makes sense"
it's because most of the languages that have inheritance and such also have overloading
or at least, it's just not possible to avoid that repetition in most of the languages in question
python doesn't have overloading of course, but its static type system does in some sense, and it was clearly inspired by languages like C++, Java, etc
I don't think that repetition of type signatures in overridden functions can really be considered a deliberate choice in that sense
i think you basically do that whenever you need to touch the abstraction, yeah? I do agree btw, people are often reticent to break up an abstraction completely, or do it the wrong way.
Often you see foo get more and more and more arguments as it handles different use cases from different sites
Whereas a better way to go might be to have foo do a little less, and then just do the "different" behavior inline. But sometimes that's hard because the different behavior is in the middle, but then you could break it into two functions, but then sometimes you could be passing a lot of state back from one function into the other.... etc. There are no easy answers.
yeah, it's pretty complicated...
today I was looking at mypy's code, and it's pretty scary
For example there's a mypy.nodes.Var class which handles local variables, global variables, class members, some part of a function argument and probably something else.
This is its __init__
https://paste.pythondiscord.com/ugepagevoh.py
Lots of flags for various special cases. Not very clear how they're related and when they change
I wish there was some static analysis tool to show which parts of an abstractions are used in which places
what do you mean by "which parts of an abstraction"
it's called grep
is it possible to type hint decimal places?
Like you want a Decimal with N digits after the decimal point?
yes
type-checking won't really do this for you since it'd require actually running the code to check if your number has that amount of decimal places
you could however use a NewType
```py
from typing import NewType, cast

TwoDecimalPlacesT = NewType("TwoDecimalPlacesT", float)

def get(x: float) -> TwoDecimalPlacesT:
    # Check if `x` has exactly n decimal places
    return cast(TwoDecimalPlacesT, x)

def my_func(x: TwoDecimalPlacesT):
    ...
```
probably not worth it though
well, the get function should include the check that the passed number has the required number of digits
if it doesn't, it should raise an exception, and if it does, it can return the float cast to the new type
so like in myfunc() i do get(x)?
after that, you'll need to pass only numbers that went through this get function into functions that take this custom type
confused...
if my_func is taking an argument of that custom type, you first need to call that get function:
```py
my_number = get(2.54)
my_func(my_number)
```
yeah, I mean, you most likely shouldn't use this
just perform the check at runtime
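A sketch of what the runtime check could look like. It swaps `float` for `Decimal` as an assumption, because a binary float does not keep track of how many decimal digits were written; `parse_two_places` and `charge` are made-up names.

```python
from decimal import Decimal
from typing import NewType

TwoDecimalPlaces = NewType("TwoDecimalPlaces", Decimal)


def parse_two_places(text: str) -> TwoDecimalPlaces:
    value = Decimal(text)
    # The exponent of a Decimal records the written precision: "2.54" -> -2.
    if value.as_tuple().exponent != -2:
        raise ValueError(f"expected exactly two decimal places, got {text!r}")
    return TwoDecimalPlaces(value)


def charge(amount: TwoDecimalPlaces) -> str:
    return f"charging {amount}"


print(charge(parse_two_places("2.54")))
```

The type checker only verifies that `charge` receives something produced by `parse_two_places`; the digit count itself is still enforced at runtime.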
lmao k thx
why do you even need this @undone carbon ?
you could use a custom type and a TypeGuard, but it'd be difficult to do any math with it
yeah typeguard would work too
is 2.00000 5 digits or 0? impossible to say without looking at the source code (which is a baaaad idea)
just wondering... tho if i couldnt it's fine
!d decimal
Source code: Lib/decimal.py
The decimal module provides support for fast correctly-rounded decimal floating point arithmetic. It offers several advantages over the float datatype:
you still can't specify how many digits does your decimal have at type-checking though
no, that's true
however it does support fixed-point arithmetic
which might or might not be what they are actually looking for
yeah, I suppose depending on what they're doing, this could be useful
then again, even rounding may be enough
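For the "fixed-point or just rounding" route, a short sketch with the standard `decimal` module. `CENT` and `to_cents` are made-up names, and the rounding mode is an assumption about what the caller wants.

```python
from decimal import ROUND_HALF_UP, Decimal

CENT = Decimal("0.01")


def to_cents(value: str) -> Decimal:
    # quantize() rounds to the exponent of CENT, i.e. two decimal places.
    return Decimal(value).quantize(CENT, rounding=ROUND_HALF_UP)


print(to_cents("2.005"))
print(to_cents("2"))
```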
which is why I was curious ^
lol
hmm, anyone aware of a type declaration for "json/toml"-ish data (aka dict, list, and some basic types, recursively)
Pyright supports recursive type aliases like this:
```py
JSON = Union[None, int, float, str, list["JSON"], dict[str, "JSON"]]
```
But mypy doesn't.
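A sketch of how such an alias is consumed, assuming Python 3.9+ for the subscripted built-ins. The alias runs fine at runtime whatever the type checker makes of it; `count_strings` is a made-up example.

```python
from typing import Union

JSON = Union[None, int, float, str, list["JSON"], dict[str, "JSON"]]


def count_strings(value: JSON) -> int:
    # Every variant of the union has to be handled (or deliberately ignored).
    if isinstance(value, str):
        return 1
    if isinstance(value, list):
        return sum(count_strings(item) for item in value)
    if isinstance(value, dict):
        return sum(count_strings(item) for item in value.values())
    return 0  # None, int, float


print(count_strings({"a": ["x", 1, {"b": "y"}], "c": None}))
```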
not sure if that's possible with a mypy plugin
And this isn't great as there is no way to have an unsafe Union
So you'd have to handle every case for all of these individually
Well, if you receive some untrusted data, you have to parse it somehow
it would make the perfect input type for something that goes from json-ish data to parsed object
typeddict is the best you get for "native" support
I've heard a lot of people say you should skip straight to pydantic or something dedicated to your domain
I think Ronny meant a type for an arbitrary JSON value
tbh I've been hoping typeddict gets expanded at some point to include specifying types for "all other" keys and some other stuff. Right now generic typeddict-like typing is RFC on typing sig
yeah I was really disappointed
the new typing features seem a bit... almost good?
If you want mypy support w/type recursion I think your best bet is placing an Any at the recursion point still until that bug gets fixed
I think the only time I reacted with "wow this is cool" instead of "it's better in typescript" was with this
or using an alternative graph structure without recursion, I guess
I guess you could create special fake sequence and mapping types that have JSONs as their elements
but that's not always a fun (or possible) option
LinkedList = Optional[tuple[T, "LinkedList[T]"]]
this is pretty fun tbh
This is instructive to implement in something like rust
Just because it forces you to wrestle with the ownership system
(linked list-like types)
I'm hoping the len() type narrowing pattern for tuple/literals lands in mypy soon then I can work on convincing pyright to add support for it
Yeah this
well, it already exists in pyright
Huh, I guess he snuck it in in the last month or two without me noticing

Oh it was one week ago
mypy has an open PR for this
I like pyright all the bugs I've reported always get fixed
like Callable being treated weirdly as an attribute?
They get downright bizarre one sec let me find one
I am playing around with mypy plugins, and I am exposed directly to the madness that's inside of mypy
How does a bug like this even happen, it's so specific
that's pretty weird...
also, pyright just released a new version, idk how he does it so fast
probably my fault as I implemented walrus in mypy
the codebase is pretty impenetrable
It's no problem; static type checking is enormously complicated; the bug is just amusingly mind bogglingly convoluted in its conditions and took me quite a while to wrap my head around when I hit it in the wild :D
I managed to report something like 4 or 5 different pyright bugs, a mypy bug, and a rust language server bug when doing advent of code which was fun
https://github.com/python/mypy/issues/11807 a bit weird but a typing bug nonetheless
The pyright ones are all fixed now; I was finding weird edge cases while code golfing python
It is mind-boggling how complex type checkers are
Haskell's type system (without extensions) is unironically more simple
I wanted to try making a primitive type checker, but then I started imagining how it would work and just gave up
why do you say that type checkers generally are more complex, than type systems?
Not sure I understand
I do agree that python's static type system is surprisingly complex/expressive. I'd probably prefer for it to be simpler but less... janky, for lack of a better word
oh I meant Python type checkers specifically
in combination with Python's type system
ah ok sorry
Does anyone know if it's possible to implement a mypy plugin that supports this decorator?
```py
@sequence_m
def gather(*args) -> Awaitable:
    ...

a: Awaitable[int]
b: Awaitable[bool]
c: Awaitable[str]

gather(a, b, c) " -> Awaitable[tuple[int, bool, str]]"
```
using a string instead of a comment because I literally can't read with contrast this low
fwiw typeshed does this kind of thing with higher priority overloads to cover the precise argument cases for small arg counts (<8, say)
yeah I've seen this ladder of overloads
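For reference, a shortened sketch of such an overload ladder. It is not the typeshed definition; `gather_all` is a made-up wrapper and only the one- and two-argument rungs are written out.

```python
import asyncio
from typing import Any, Awaitable, Tuple, TypeVar, overload

T1 = TypeVar("T1")
T2 = TypeVar("T2")


@overload
async def gather_all(a: Awaitable[T1], /) -> Tuple[T1]: ...
@overload
async def gather_all(a: Awaitable[T1], b: Awaitable[T2], /) -> Tuple[T1, T2]: ...
@overload
async def gather_all(*aws: Awaitable[Any]) -> Tuple[Any, ...]: ...


async def gather_all(*aws: Awaitable[Any]) -> Tuple[Any, ...]:
    # One implementation serves every rung; the overloads only refine the types.
    return tuple(await asyncio.gather(*aws))


async def number() -> int:
    return 1


async def word() -> str:
    return "x"


async def main() -> Tuple[int, str]:
    return await gather_all(number(), word())


print(asyncio.run(main()))
```

Each extra arity needs another rung, which is exactly the repetition a plugin or a variadic type feature would remove.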
You might see if this fits your use case
that's what I'm trying to avoid here
it's not for anything in particular, just playing with mypy
I actually convinced pyright to change its behavior there only a week or two ago
Now if you call gather(*mylist) it will prefer an overload with *args in it instead of the first possibly compatible overload (which could be the one argument version!)
mypy already used this heuristic iirc
ah, I think I experienced that issue but forgot/was too lazy to report
I saw a Java library that defined a function using a ladder like this but they defined it up to like 1024 arguments
hmm, this is starting to look like type algebra a bit
basically dynamic result type computation based on the input args
that's kind of what typevars do as well
typevars dont do for loops tho ^^
true
you can actually type asyncio.gather and such precisely in TypeScript, so I'm wondering if I can do this as a plugin
how is the gather typed in typescript?
Paramspec has started slowly making its way into typeshed
ParamSpec still doesn't cover this tho
do you need TypeVarTuple + Map?
Has anyone seriously proposed typing Map?
I think it's been on some slides, yes. It may be what the tensor typing people want to do next after PEP 646
It would help clean up some typeshed too I imagine
```ts
function gather<Args extends Promise<any>[]>(...args: Args): Promise<{[K in keyof Args]: Awaited<Args[K]>}> {
    return Promise.all(args)
}
```
I was experimenting with some stuff, and you can hack together something like type-level map, filter and reduce in typescript.
https://github.com/decorator-factory/ts-generic-rep/blob/master/examples.ts#L20-L21
But it doesn't scale well, and you can't really express arbitrary types (like Promise) with it
examples.ts lines 20 to 21
```ts
export type parsedBools = TypeLevel.map<ParseBool, ["yes", "no", "no", "yes", "yes", "yes", "not sure", "no"]>
//-> [true, false, false, true, true, true, never, false]
```
these kinds of things show up in a lot of languages, including ones that are claiming type system sophistication as one of their primary features
Rust does this for n-sized tuples, using macros
right
Rust, Scala, I think even in Haskell in the standard library (I think you can avoid it using extensions but not standard haskell)
Haskell Rule number 42: if you have a problem, it is solved with a GHC extension
it is always a big shocker for me how many languages lack variadics.
C++ has both variadics and template template parameters (higher kinded types) and I've needed the former much, much more often
i think so
A higher-kinded type is a type that abstracts over some type that, in turn, abstracts over another type. It's a way to generically abstract over entities that take type constructors.
but it checks everything after substitution, right? something something sfinae
well, C++ templates are just unconstrained, in general
but afaics that's orthogonal to having higher kinded types or not
I meant that it's probably easier to implement HKTs in this way
ah ok
Yeah, I don't claim to know much about HKT's. But I do think variadics come up pretty frequently, without going out of your way to use anything fancy.
non type template parameters (non-const generics) are also a pretty rare feature that's very useful in systems programming
I'm pretty lost as to where to ask for help with mypy plugins...
issues seems like a place for bugs, not for questions from a noob
and the typing gitter just ignores me
typing GitHub discussions maybe
hmm
but that's for general typing, not mypy specifically
I was wondering if I could "unify" two types together.
Like, I have an Awaitable[_] and a Coroutine[Any, Any, int]. And I want to end up with Awaitable[int]
Ooooh I think it's mypy.solve.solve_constraints
nope, it doesn't work
is there a mypy mailing list?
maybe the typing-sig mailing list?
what about fax?
no, there isn't
mypy.join.join_types might work for you
I have many classes which use something like:
```py
T = TypeVar("T")

class Foo:
    @classmethod
    def from_xyz(cls: Type[T], ...) -> T:
        ...

    @staticmethod
    def some_internal_foo_method():
        ...
```
Problem is that this causes issues because the typevar isn't bound to the Foo class, and so doing cls.some_internal_foo_method gives me an error Cannot access member some_internal_foo_method for type*.
I know that I could just bind the typevar to the Foo class to solve this, but the problem with that is that I'd need to make a typevar for each class, and in my case, that would mean making over 7 type-vars that do the same thing just because I have 7 classes (for now, that could grow) each with an alternative constructor.
I've considered just type-hinting it as (cls: "Foo", ...) -> "Foo" which would do the trick, but it wouldn't be type-consistent on inheritance, since calling that method on a child class would now make the type checker think that it returns "Foo", even though it's actually just an alternative constructor and actually does return the instance of that given class.
Any ideas on how I could avoid the mess of having >7 different type-vars, each bound to its own class, without losing typing information for child classes?
Use Self from PEP 673. Currently only pyright supports that though
oh, that sounds interesting, let me check out the pep
Until then, lots of TypeVars is your best bet
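The per-class TypeVar version looks roughly like this. It is a sketch with made-up names (`TFoo`, `from_text`, `parse`); the bound is what lets the checker resolve `cls.parse`.

```python
from typing import Type, TypeVar

# One TypeVar per class hierarchy, bound to the base class.
TFoo = TypeVar("TFoo", bound="Foo")


class Foo:
    def __init__(self, value: int) -> None:
        self.value = value

    @classmethod
    def from_text(cls: Type[TFoo], text: str) -> TFoo:
        # Because of the bound, `cls` is known to have Foo's members.
        return cls(cls.parse(text))

    @staticmethod
    def parse(text: str) -> int:
        return int(text.strip())


class Bar(Foo):
    pass


bar = Bar.from_text(" 7 ")  # inferred as Bar, not Foo
print(type(bar).__name__, bar.value)
```

With `Self` from `typing_extensions`, the `TFoo` declaration and the `cls: Type[TFoo]` annotation go away and the return type becomes `Self`.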
what version was it added on?
It's in typing-extensions
The PEP is still pending, but hopefully it will be in typing in 3.11
oh, I see
btw, how exactly does a PEP get approved? Is there some vote on how many people liked it, or how does it work?
!pep 0
!pep 1
Currently, the Steering Council decides after a discussion period. PEP 673 was just submitted to them today, so they'll discuss it at some point and post a decision.
@twilit badge i've been lazy and just defined _Self = TypeVar('_Self') and used that for everything
it doesn't properly set the bound=, but that's okay imo
because it will always (i think?) be the exact type of the current instance
i.e. it will never be a superclass or subclass
yeah, but at least with pyright, it doesn't realize that, and so it reports an error that I've mentioned above
once I assign a type-hint to cls, it does check for compatibility with the class type, but after that, it basically throws away the class's type and uses the type from that type-hint. Which is an unbound typevar, so according to it, the type-checkers won't recognize the argument variable as it's type, but rather as a type of any object.
If just using a simple typevar was possible without errors, I wouldn't be asking and there probably wouldn't be basically any reason to introduce that PEP, since TypeVars would fully handle it on their own.
I mean creating each typevar is a one liner
It's the same as the "right" approach you'd see in a static language, just that python makes you declare the type variables out of line
Self is interesting though it feels a bit sad if this is being added essentially just to compensate for how awkward typevars are
i am saying to reuse one typevar in all the class definitions
Hm, how would I use it? the upper bound of Awaitable[Any] and Coroutine[Any, Any, int] is Awaitable[Any]
mypy.maptype.map_instance_to_supertype seems to work
you might want to use a true bottom type instead of Any (i.e. join Awaitable[<nothing>] to Coroutine[Any, Any, int]). that's UninhabitedType in mypy. glad you found something that works for your use case though!
```py
class SequenceProxy(Generic[T_co], collections.abc.Sequence[T_co]):
```
how would I make something like this work on 3.8 (where Sequence can't be subscripted)
would subclassing typing.Sequence have the same runtime behaviour?
actually it should have the same behaviour, nevermind
Is it better practice to annotate a return type using the specific collection, or using a generic one?
when you say "generic one", I guess you mean the super-interface?
like, Dict vs Mapping/MutableMapping ?
Yes
opinions differ on that. The more common stance in python is to accept the more general type (Mapping) and return the more specific type (Dict)
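That stance in a small sketch: accept the general `Mapping`, return the concrete `Dict`. The function `invert` is a made-up example.

```python
from types import MappingProxyType
from typing import Dict, Mapping


def invert(table: Mapping[str, int]) -> Dict[int, str]:
    # Only read access is needed, so any Mapping is accepted.
    return {value: key for key, value in table.items()}


# A read-only mapping is fine as input...
read_only = MappingProxyType({"a": 1, "b": 2})
inverted = invert(read_only)

# ...and the caller gets a real dict it is allowed to mutate.
inverted[3] = "c"
print(inverted)
```

Returning `Mapping[int, str]` instead would keep the implementation free to change the concrete type later, at the cost of the caller losing `inverted[3] = "c"`.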
but in most other languages I've used, I don't think they would agree with that
That's what I've been doing. However, I have realised that it ties my interface to that decision. Thus, I lose the freedom to change it to a different type later without it technically becoming a breaking change.
Though in practice, I don't think I will need to change that often.
yes, that's basically the main reason why in most languages, it would be recommended to return Mapping
It's always a trade-off but there's typically just very little, or zero benefit, to users, in knowing that what they're getting is specifically the built in standard dictionary
there's not huge benefit to you as an implementer to returning Mapping but there is some
Well, for example, if I return an Iterable instead of a list, then the caller can't use something like += anymore, right?
Yes, but Iterable isn't really the first thing you'd associate with using instead of a List
that's a pretty different thing
in the context of this conversation it would be more like List vs Sequence
or MutableSequence
Right, sorry. But the same thing applies to sequence I think.
At least when it's on the left hand side of the +=
Oh no, it does actually define __iadd__
yeah i was about to lookup whether it actually does it
I would expect that it would because it's just going to be the same thing as extend
but in the general case, you're right, it is possible that List will have things that are not on e.g. MutableSequence.
So how to decide between e.g. Sequence and Iterable? The latter would give me more flexibility such that I could turn my function into a generator later if I wanted,
but typically those things are relatively few and far between, and users always have the option to just construct their own list if they need to
Well, Sequence vs List is about these relatively small trade-offs, but the basic concept of the function is the same
with Sequence vs Iterable, I think it's a much more dramatic design decision
you're basically reserving the right for your function to be lazy
hey guys i am studing Genetic Algorithems would it be possible for one of you guys to help me with one question
Discuss the different solutions to address the failure of simple crossover strategies(to solve the disadvantages) for the travelling salesman problem.
In particular:
why they are necessary
how they are applied
how they preserve the parental traits
what other possible methods are available
if you have a legitimate use case for thinking it might help for your function to be lazy later, then it's fine to stick with Iterable. But IMHO that should be relatively rare.
You're in the wrong channel. Try opening a help channel or #algos-and-data-structs
In most cases it's either going to be lazy from day one (e.g. functions that take and return iterators, a la itertools), or you are just performing some pretty typical munging of data that you're going to want to do eagerly and there's no real reason why you wouldn't do it eagerly
I see. I have to think about the context in which it is and may be used I suppose.
yeah as always you have to decide the trade-offs and such
but I almost never have functions that return Iterable to be honest
Not even generators?
Well, generators, yes
when i said functions above I just meant regular functions
if you write a generator then you have to annotate with something like Iterable/Generator
you don't have any choice, annotating with Sequence is wrong
I guess part of the reason is that in python writing generators just isn't very onerous to begin with.
So if I think it's potentially useful for something to happen lazily, I'll basically just make it a generator from the beginning.
If I think it's not potentially useful then I'll just make it a regular function and commit to Sequence, Mapping, etc
There's very little benefit, in python specifically, to annotating your return type as Iterable but then implementing it as a regular function that returns a list.
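Roughly what that split looks like in practice (a sketch, function names are made up):

```python
from typing import Iterator, Sequence


def squares_lazy(n: int) -> Iterator[int]:
    # A generator: values are produced on demand and can be consumed once.
    for i in range(n):
        yield i * i


def squares_eager(n: int) -> Sequence[int]:
    # A regular function: the work is done up front, so commit to Sequence.
    return [i * i for i in range(n)]


lazy = squares_lazy(4)
first_pass = list(lazy)
second_pass = list(lazy)  # the generator is already exhausted here
eager = squares_eager(4)
```

The eager result supports len() and indexing, which is exactly what an Iterable annotation would hide from callers.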
Thank you, that was helpful
If I know what type a function is going to return, why wouldn't I use that type? Even if I could use a Sequence when I know my function is returning a list, I'd always annotate the return type as a list since Sequence is just more restrictive. There's no real reason to use some supertype (less specific type) to annotate a more specific one, if you know the more specific one. Unless you have a good reason to go with something less specific (i.e. the return type of that function may change over time, but it will always be a Sequence, even though it may not always be a list), but that's a pretty rare situation.
I'm leaning towards recommending that in Python too now. Returning Sequence instead of list and Mapping instead of dict gives you more flexibility, avoids issues with variance, and also makes it so the type checker won't let callers mutate the returned list, which is usually risky.
I'm running into trouble with using ffi types in my type annotations. During runtime Python complains that "Parameters to generic types must be types." But I don't know what object to actually use to represent the type. I'm not using a type checker like mypy so I guess I don't really care how correct it is. I just use type annotations for documentation purposes.
Technically the correct "type" would be ffi.typeof(...) which returns a ctype instance.
yeah the typing runtime can be a bit overeager sometimes. One solution can be to use string annotations or from __future__ import annotations so the annotations don't get evaluated
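A sketch of the string-annotation variant. The name ffi_char_ptr is hypothetical and deliberately undefined, to show that nothing evaluates it:

```python
def send(buffers: "List[ffi_char_ptr]") -> None:
    # "ffi_char_ptr" is a made-up name; because the annotation is a string,
    # Python stores it as-is and never tries to evaluate it at definition time.
    ...


# The raw string is all that is kept on the function object.
stored = send.__annotations__["buffers"]
```

This only holds as long as nothing calls typing.get_type_hints() on the function, since that would try to evaluate the string.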
It's not uncommon to see people mark iterators as iterable for this reason
I see this example in the typing module docs
Annotated[int, ValueRange(3, 10), ctype("char")]
but what is ctype here?
why is it usually risky?
what if the called functions caches the returned list, and returns it again to another caller
Whatever you want it to be. The point of Annotated is that static checkers will ignore whatever you put in there
I agree that would be risky. But hopefully that's not the "usually" case, that's a lot of caching
Depends on what codebase you're working with
I guess that isn't helpful for me trying to annotate a param that is a ctype then
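For what it's worth, a sketch of how Annotated metadata can be attached and read back on 3.9+. CType here is a stand-in class invented for the example, not something from cffi or the typing docs:

```python
from typing import Annotated, get_type_hints


class CType:
    """Hypothetical marker carrying a C type name as documentation."""

    def __init__(self, name: str) -> None:
        self.name = name


def read_byte(value: Annotated[int, CType("char")]) -> int:
    return value


# Static checkers see plain int; the extra metadata is only visible at runtime.
plain_hints = get_type_hints(read_byte)
full_hints = get_type_hints(read_byte, include_extras=True)
metadata = full_hints["value"].__metadata__
```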
@summer berry re our convo before btw I was going to use the example of Kotlin. In Kotlin, you use List and MutableList 99% of the time, for both function arguments, and returns. You almost never use the specific implementation, which is usually ArrayList, for anything
In Kotlin you don't even usually name ArrayList to construct a list, that part might be taking it a bit further than some languages. But outside of the construction, in terms of API, receiving and returning, the thinking in Kotlin I find more typical than that of python.
Is that somehow encouraged by the language or is it just how the community converged on doing this?
its definitely encouraged by the language insofar as that's how the whole standard library is written
like, if you want a list you write listOf(1,2,3) and the standard library promises to return you a List<Int>
it doesn't promise which implementation you're getting
if you do (1..3).map { it * 2 }, which is the equivalent of python's [x * 2 for x in range(1, 4)]
you get a List<Int> too because that's what map returns, a List
@oblique urchin "The precise behavior of overloads is not specified and varies across type checkers." This drove me nuts when trying to learn it; when convincing Eric to add a heuristic to the overload evaluator in pyright he thankfully documented their algorithm at least
I think type evaluation macros could be a spicy proposal; the readability angle is easy to buy, but I do wish you gave a few more examples of things that weren't previously possible or that no longer need hacky workarounds
Not treating sequence as str comes up repeatedly, and you mentioned generic overlaps
hot take: the best thing would be to vastly discourage the use of overload
The example in the official documentation is a great example of when not to use overloading
I need to take a type hint detox session
and make a complete medium-sized project without any types at all
Thanks! Yeah maybe I should have given more examples, there's just a lot that you can do with this proposal and I didn't want to overwhelm people
Sequence-but-not-str is also possible, I think I mentioned that in the pyanalyze docs but not my email
im glad this type of issue is getting fixed but this seems like a very big addition to the type system
It's definitely much bigger than many recent PEPs, it's pretty straightforward to implement though
i still like NotType tbf and that would be a pretty big thing
@terse sky i had a thought about that Sequence vs List thing, which probably applies more to python than to other languages like kotlin. if you return a Mapping, but you know that your caller might want to mutate the thing later by adding keys, do you just expect the caller to call dict() on it?
pretty much
I mean, that issue is still technically orthogonal to "concrete type vs interface"
You can return MutableMapping if you want to allow mutation
I'm not sure it really applies more to python than kotlin. In both languages the main data structures are ones that are most efficiently updated via mutation.
And both kotlin and python provide at least some reasonable syntax for doing things in a non mutating way
like in python you should be able to do x = returns_a_mapping(); x |= y or x = x | y
actually, x |= y will mutate the set/dict
this is not kotlin
if you have a lot of merges, x = x | y is O(N^2) complexity, while x |= y is O(N) (assuming x supports __ior__)
ah right I forgot that the type cannot actually influence what happens
but anyhow you understand my point
a function returning Mapping instead of MutableMapping will never force non-linear complexity on a user because the worst case scenario is to construct a dict of your own
in most cases, you're probably just going to use the return without needing to mutate anything. very occasionally you'll have to copy.
In exchange you're going to get a lot of help from mypy in tracking down accidental mutation. Well worth it.
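A sketch of the "occasionally you'll have to copy" case; get_config and _CONFIG are invented names:

```python
from typing import Dict, Mapping

_CONFIG: Dict[str, int] = {"retries": 3}


def get_config() -> Mapping[str, int]:
    # Returned as Mapping, so a type checker flags get_config()["x"] = 1.
    return _CONFIG


# A caller that really needs to mutate pays for one explicit copy.
config = dict(get_config())
config["timeout"] = 30
```

The copy also means the shared _CONFIG is untouched, which is the accidental-mutation protection being discussed.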
Also, if you're returning a Mapping, remember not to use defaultdict! I mean don't use it in 99.99% of cases anyway, but especially not there
Why not? Because accessing a non-existent key makes the key get created?
ive never used defaultdict
yep
defaultdict violates LSP
its fine though if youve given it as the concrete return though right?
I find it very useful though, I use it all the time
Really? What do you use it for that wouldn't work just as well, and be less bug prone, than with dict.setdefault?
dict.setdefault is slow
which concrete return? Dict?
no defaultdict[T, TT]
sure, if you say defaultdict specifically then it's fine, from an LSP perspective
then you're just back to the usual bug prone-ness of it
Yeah, I know that setdefault is slower, if I ever actually run into a case where that performance matters, I'd use defaultdict
What are some examples of these bugs?
usually what happens is that people populate the defaultdict, then later want to access values, so they just use the usual operator[]
and tend to not think about the fact that they're silently mutating the container if they make a mistake
I suppose, but conceptually with a defaultdict the default value should be equivalent to the key not existing anyway
maybe, the overwhelming majority of the times I've seen people want a functionality similar to defaultdict, that's not actually how it's used
usually people use defaultdict, IME, because it's convenient to "build up" some kind of dict. Once you have that dict, there's no reason for the weird defaulting behavior any more.
I used a context manager on mine that temporarily disabled the default item creation, iirc it helped me catch a bug once
could also do it the other way around
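One way such a context manager could look. This is a guess at the idea, not the actual implementation mentioned above; it relies on defaultdict raising KeyError when default_factory is None:

```python
from collections import defaultdict
from contextlib import contextmanager


@contextmanager
def frozen_defaults(dd):
    # Temporarily turn the defaultdict into a plain dict for lookups.
    factory = dd.default_factory
    dd.default_factory = None
    try:
        yield dd
    finally:
        dd.default_factory = factory


groups = defaultdict(list)
groups["a"].append(1)

with frozen_defaults(groups):
    try:
        groups["typo"]
        raised = False
    except KeyError:
        raised = True  # the missing key is reported instead of created

after = groups["b"]  # default creation is back outside the block
```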
Also, Jelle, you just mentioned that you typically suggested annotating with Mapping when returning from a function, which I agree with.
annotating defaultdict with mapping is quite a bad idea. so if you were populating a dict with the intent of returning it, then you'd need to copy everything out anyway.
defaultdict implementing Mapping is also bad even if it's not directly annotated that way at the return point. mypy will not complain about passing defaultdict to functions that want Mapping, and you'll believe incorrectly that such functions aren't mutating their argument.
None of this is catastrophic but it's pretty not-great, and in most cases setdefault doesn't have any significant downside so why not just use it
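The setdefault version of a typical "build up a dict" loop, as a sketch (group_by is an illustrative helper name):

```python
def group_by(items, key):
    groups = {}
    for item in items:
        # Creates the list only on the first occurrence of each key.
        groups.setdefault(key(item), []).append(item)
    return groups


by_initial = group_by(["apple", "avocado", "banana"], lambda word: word[0])

try:
    by_initial["z"]
    missing_raises = False
except KeyError:
    missing_raises = True  # a plain dict does not silently grow on lookup
```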
I see what you mean, I just haven't ever seen this cause issues in real code so far, so I'm not too worried about it
If there weren't a good alternative then I probably would just use it, but there is, so
good point, and this is actually what i meant to ask about
wait, why can't a defaultdict be a MutableMapping and therefore be a Mapping?
or are you saying that defaultdict should not be valid when returning Mapping, but should be valid for MutableMapping?
defaultdict violates LSP for Mapping
the point of Mapping is that it's a non mutating interface to a dict-like container
part of the Mapping API is operator[]
defaultdict provides operator[], but it mutates the container
so, it satisfies Mapping syntactically, but not semantically, very dangerous
yeah i think i misunderstood what you were saying at first. it's fine to have defaultdict subclass MutableMapping
it's not fine because mutablemapping is a child to Mapping
that's because indexing and assignment are entirely different operations
you are taking issue with the core collections hierarchy
which is fine, but a totally different argument
no, I'm taking issue with defaultdict
but defaultdict doesn't mutate itself when you look something up
MutableMapping as a child to Mapping is fine, and that's what I suggest
it does though, operator[] is supposed to be an access method, not a mutating method
!e ```python
from collections import defaultdict
dd = defaultdict(lambda: None)
_ = dd['a']
_ = dd['b']
print(dict(dd))
```
@fierce ridge :white_check_mark: Your eval job has completed with return code 0.
{'a': None, 'b': None}
wait
lol
yes
seems like a leaking implementation detail
No, that's the whole point of it
!d collections.defaultdict
class collections.defaultdict(default_factory=None, /[, ...])
Return a new dictionary-like object. [`defaultdict`](https://docs.python.org/3/library/collections.html#collections.defaultdict "collections.defaultdict") is a subclass of the built-in [`dict`](https://docs.python.org/3/library/stdtypes.html#dict "dict") class. It overrides one method and adds one writable instance variable. The remaining functionality is the same as for the [`dict`](https://docs.python.org/3/library/stdtypes.html#dict "dict") class and is not documented here.
The first argument provides the initial value for the [`default_factory`](https://docs.python.org/3/library/collections.html#collections.defaultdict.default_factory "collections.defaultdict.default_factory") attribute; it defaults to `None`. All remaining arguments are treated the same as if they were passed to the [`dict`](https://docs.python.org/3/library/stdtypes.html#dict "dict") constructor, including keyword arguments.
[`defaultdict`](https://docs.python.org/3/library/collections.html#collections.defaultdict "collections.defaultdict") objects support the following method in addition to the standard [`dict`](https://docs.python.org/3/library/stdtypes.html#dict "dict") operations:
that's... not the semantics i expected
people use it for things like
d = defaultdict(list)
for x in something:
    d[get_key(x)].append(x)
this for example is a group_by using get_key as the key function
sure, but i never considered that it works by just inserting the value
well, if it didn't insert the value then the above example would be broken of course
Amusingly the only two languages I know with something this broken are python and C++
and they're the two languages I mostly program in
(thats collections.Counter)
but yeah, I really dislike defaultdict's behavior
hm, i am still willing to say that's a subclass of Mapping because it implements __getitem__. but i see your point
I mean, it's a subclass of Mapping because it says it is
semantically it violates what Mapping.__getitem__ should do
the only question is "should it be", and the answer is no because of LSP
what's LSP?
at least, unless someone is prepared to argue that a core point of Mapping isn't to provide a non-mutating interface. That's a pretty convoluted argument though.
Liskov Substitution Principle
hm, but when you do use Sequence or Mapping type, there's nothing preventing these to also be mutable
every mutable mapping does fulfill the Mapping's requirement
Right, but the types should prevent you from actually mutating it
their point is that __getitem__ is not expected to mutate anything, so it's a conceptual violation rather than an actual type error
ah, I see
it basically says, the point of inheritance is that if you substitute a derived type for the base type, that the behavior should fulfill the same "fundamental" requirements
tldr in python, almost everything is mutable and almost nothing is private. "immutability" is a fig leaf
it's obviously up to the human beings what the fundamental or key concepts of the type are.
But it hopefully isn't controversial that Mapping is supposed to be a non-mutating interface.
fwiw i think this is a concession to practicality over purity
I don't really think you can enforce non-mutability like that with a type though
I mean I don't think it's a deliberate concession
defaultdict inherited from dict, as a matter of probably mostly implementation convenience, since before Mapping et al existed
i'm sure someone brought this up in a mailing list at some point
I think the problem with LSP in practice is that the expected properties of a class are rarely explicitly defined.
so basically there was no way to make dict implement Mapping without also making defaultdict implement mapping
I don't know if it's a problem with LSP per se; it's perhaps a problem with getting people to agree on the precise nature of LSP violations.
But I think while there are cases that reasonable people could disagree about, I don't think this is one of them.
I'm not saying it's a bad rule, I'm saying that we should come up with these properties and write them down
isn't Mapping a protocol though? doesn't it just check for __getitem__?
Well, we did, right? we wrote a second class called MutableMapping
that was my argument @twilit badge : it still implements __getitem__, so in some kind of very strict sense it is valid. but the point is that it does something semantically different from what you usually expect or want
here's what the docs say:
class collections.abc.Mapping
class collections.abc.MutableMapping
ABCs for read-only and mutable mappings.
read-only
"in some kind of very strict sense" - in the syntactic sense.
Protocols/interfaces don't just define the names of methods an object needs to have. They also have some expected properties. For example, the items returned by the keys() method must produce objects that can be put into the __getitem__.
right
It's not a Protocol (in the typing sense). Generally only the very simple ABCs have been turned into Protocols
there's additional invariants beyond just the functions existing
that's fine, but protocols can't enforce anything about the actual implementation, the inner workings of __getitem__ will always be able to be overridden to do something unexpected, I'm not sure there's any better way to handle this
(that doesn't affect @terse sky 's argument about substitutability though)
yeah, the point isn't that the language needs to handle this better at a fundamental level or magically detect bad classes
the point is just that it's bad that defaultdict implements Mapping
yeah. nobody is arguing that protocols should be enforced. that's impossible in general (even in a statically typed language)
right
ah, I thought Mapping was a protocol, not an ABC, so I thought it basically had to be a mapping just because it was implementing __getitem__
well, either way, protocols can have expected invariants as well
interfaces, protocols, ABC, constraints, traits, type classes....
there's millions of names for these things that have different implications and different usages in different languages, but the ultimate idea is a set of methods that are usually expected to follow some additional invariants
yeah, but there's nothing you can do about it matching a protocol like that, but if it's just about having Mapping somewhere in the MRO, then yeah, that could've been different about it ig
it's not about having "Mapping somewhere in the MRO"
or rather, it is technically I suppose, but you're making it sound more obscure than it is
the problem is that you can pass a defaultdict to any function that wants a Mapping, mypy will not complain, and the results can be very surprising
like, you can break your program
Yeah, if you then query values() or keys() it will show a different answer than before the lookup.
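A small sketch of that surprise. lookup_or_zero is a made-up function that looks read-only from its Mapping annotation:

```python
from collections import defaultdict
from typing import Mapping


def lookup_or_zero(counts: Mapping[str, int], word: str) -> int:
    # Looks like a pure read: Mapping exposes no mutating methods.
    try:
        return counts[word]
    except KeyError:
        return 0


plain = {"a": 1}
dd = defaultdict(int, {"a": 1})

plain_result = lookup_or_zero(plain, "b")
dd_result = lookup_or_zero(dd, "b")  # same return value, but "b" now exists

plain_keys = list(plain)
dd_keys = list(dd)
```

Both calls return 0, yet only the defaultdict has changed, and a type checker accepts both calls without complaint.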
is there a way to enforce non-mutability at the type level? even in idris and haskell, at some point you have to hand off control to a core that can do "whatever it wants", including mutating data
what if you are running concurrent code, you are passing the same defaultdict to multiple places. You are taking by Mapping and expecting it's safe because all these threads are doing read-only operations
in the case of idris it's a mix of C and Scheme
oops!
oh I haven't even thought about threading
yes, it's a mess
that's even worse than this
normally if you pass by Mapping and you listen to everything mypy says, you're relatively safe
It also breaks other expected properties, like `if k in mapping: print(mapping[k])` being the same as ```py
try:
    v = mapping[k]
except KeyError:
    pass
else:
    print(v)
```
so if (in a perfect world) we wrote down all these laws for Mapping before defaultdict was implemented, we would see that it is indeed not a sound mapping
I was just saying that a solution to this would basically be to make defaultdict not have a Mapping in the MRO, I get why it's an issue after your explanation. I was just saying that I thought Mapping was just a protocol enforcing __getitem__ which if that was the case, there would basically be absolutely nothing you could do about it
right, defaultdict shouldn't have Mapping as a base, as currently implemented
realistically, I think if people saw this issue, then they'd be like "shit, we really want defaultdict to implement Mapping and MutableMapping", and then they would rethink the API
wouldn't that just be a dict then though? basically turning that special __getitem__ implementation into something like dict.setdefault
at which point, why even have defaultdict
and then they'd realize that defaultdict.keys() should return the universal set, and defaultdict.values() should return a set of the current values plus the set of all objects that will be instantiated by the default factory in the future
oh, and defaultdict.__contains__() should also be true
yes
and you also need to put a lock on it
wait, there's something like a universal set in python?
yeah, I mean, honestly, even arguing that setdefault is slow, all you actually need is a version of setdefault that takes a lambda instead of a value
yeah
d = dict()
for x in stuff:
    d.set_default_callable(get_key(x), list).append(x)
at least, I can't immediately see why this would still be significantly slower than defaultdict
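dict has no such method today, so here is a sketch of the same idea as a free function. The name setdefault_with is hypothetical:

```python
def setdefault_with(mapping, key, factory):
    # Like dict.setdefault, but the default is only built when needed.
    try:
        return mapping[key]
    except KeyError:
        value = mapping[key] = factory()
        return value


calls = []


def make_list():
    calls.append("called")
    return []


d = {}
setdefault_with(d, "a", make_list).append(1)
setdefault_with(d, "a", make_list).append(2)  # factory is not called again
```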
Don't know about significant, but calling a named method is generally slower than a dunder
because it does a hash table lookup, and __getitem__ goes straight to the slot
fun fact: having awesome lambdas is really good. In Kotlin { x } is a zero argument lambda returning x. So, Kotlin doesn't even bother to have set_default and set_default_callable.
The Kotlin equivalent is called getOrPut and it just always takes a lambda; d.getOrPut(getKey(x)) { listOf() } , in this example listOf doesn't get evaluated unless the key is missing
Rust also has functions like these. In Rust the lambda would be || x
Okay, sure, that's a very tiny difference though
about 3x ```
In [49]: d = {}
In [50]: %timeit d.setdefault("x", 0)
105 ns ± 13.2 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [51]: dd = collections.defaultdict(int)
In [52]: %timeit dd["x"]
36.5 ns ± 0.802 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
```
and thats on 0 which is as free as things get to "make" in python
but in absolute terms, it's still a tiny difference.
If you want to apply this reasoning then you should be applying it equally everywhere.
How often are you forcing people to use dunder methods, or program dunder APIs, even when they're less readable than named functions?
My guess is probably pretty much never
Agree, this is likely premature optimization
I admit I am shocked by that 3x though
it's on 3.6, wonder if they made it better since then
python is hilarious because my thinking, being mostly a C++ dev, is "well, hashing is obviously much more costly than function call overhead"
python: but what if function call overhead were hashing
the hashing is in addition to function call overhead
on 3.10 it's only 2x ```
In [2]: d = {}
In [3]: %timeit d.setdefault("x", 0)
69 ns ± 3.05 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: dd = collections.defaultdict(int)
In [5]: %timeit dd["x"]
34.4 ns ± 0.502 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
```
pretty great
it's crazy to me that just the call to .setdefault itself is a hash lookup that's just as expensive as the actual hash you care about
fwiw hash(0) == 0
i thought that builtins like dict and so on, calls on their methods though don't get hashed
I bet this helped: https://docs.python.org/3/library/dis.html#opcode-CALL_METHOD
I thought that they actually get mapped directly to the instruction
so that could theoretically be optimized
no, only for dunders
and that's why you can't just "glue" new methods to the built in types
i wonder if they can't do that due to breaking
what if you have a slots class
it still hashes?
you still need to go from the name of the attribute to the slot
yes, but all the attributes are known at the start
the compiler doesn't know that d is a dict when it compiles d.setdefault
I think strings cache their own hash in CPython, actually
so maybe the attribute strings' hashes are precomputed?
that helps but you still need the hash table lookup
true
which can include collision resolution and such
I'm sure there's a lot of caches involved but it's still overhead
in principle you could build a perfect hash table
for things like slots tables
even for built-ins, you could have a perfect hash table, with an extra check upon lookup, in case someone tried to "glue" a method
but I doubt this is done
attribute strings will be interned so they can be compared by identity
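A tiny sketch of what interning means, assuming CPython behaviour:

```python
import sys

# Build the method name at runtime so it is not a compile-time constant.
built_at_runtime = "".join(["set", "default"])

# sys.intern returns the single canonical copy of an equal string.
interned = sys.intern(built_at_runtime)
same_object = interned is sys.intern("setdefault")
```

With both strings being the same object, a dict lookup can usually confirm a match with an identity check instead of comparing characters.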
interested as to if you guys have any features you'd really see in a PEP
that's not already in a PEP I guess?
yeah
hello, how is the type of class decided? i have probably some misunderstanding of concepts... i am trying to annotate my code that uses external python package that does NOT have type annotations (e.g. aaa), and i have error that i cannot do class myClass(aaa.unAnnotatedClass): subclass as aaa.unAnnotatedClass class has Any type.
i understand that methods etc of the aaa.unAnnotatedClass are missing typing, but isnt the class name itself sufficient to identify type of parameter of that class?
even with possibly adding the "stub" for external class, how does one annotate the type of class itself (which i assumed is the class itself)?
Hey guys, quick question. Is it ok to use NoReturn in union? docs says that it indicates that function NEVER returns anything, but i'm confused if it comes to union.
Simple example:
def foo(number: int) -> Union[int, NoReturn]:
    if number > 100:
        raise ValueError("Number too high")
    return number
looks wrong imo, but is it wrong?
not completely sure, i am newbie in hinting myself, but i assume NoReturn is not used when throwing exceptions, only for infinite loops and such... so you dont need NoReturn there...?
resp. to rephrase, imho your specific implementation of that function does not have NoReturn case ever, only int
use case for NoReturn is also:
def foo() -> NoReturn:
    raise Exception
according to https://www.youtube.com/watch?v=-zH0qqDtd4w
today we talk about a common stumbling block in python typing -- the "NoReturn" annotation!
(but i havent checked into annotating the exception throwing, so i may be off)
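As a sketch of how this is usually split: NoReturn goes on a helper that can only raise, while a function that sometimes returns is annotated with its normal return type. The names fail and check_number are made up:

```python
from typing import NoReturn


def fail(message: str) -> NoReturn:
    # Never returns normally: every path raises.
    raise ValueError(message)


def check_number(number: int) -> int:
    # Raising is not part of the return type, so plain int is enough.
    if number > 100:
        fail("Number too high")
    return number


ok = check_number(5)
try:
    check_number(101)
    raised = False
except ValueError:
    raised = True
```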
ah, ok, scratch my theories then, sorry. i assumed all that due to fact that my code throwing exceptions did not complain with any warning/error even in strict mypy output
personally I wouldn't use NoReturn, but it comes out on code review and no one is sure about it
that's good idea, I will look how mypy reacts for this union with noreturn, thanks
btw. I'm relatively new to typing in python, I read your question but I can't help, sorry
and thanks for your help!
Type[TheClassMeaningInstance]
not much of help, but no problem, thanks too
thank you! i will check documentation on the items...
This is because all classes are instances of type (that when called return TheClassMeaningInstance).
erm, trying to google for explicit keyword "TheClassMeaningInstance" gives nothing, i assume you mean to use specific class id/name?
something like: ```py
# stubs/aaa.pyi
UnAnnotatedClass = Type[aaa.UnAnnotatedClass]
``` ?
Haha yeah I get that "TheClassMeaningInstance" doesn't get you anything.
The reason I named it that way is because when you do a: X you mean that a is an instance of X (X being the class that means an instance of it as an annotation). So when you do Type[X] you mean an instance of type that when called returns an instance of X.
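A short sketch of the difference between annotating with X and with Type[X]; Animal, Dog and make are illustrative names:

```python
from typing import Type


class Animal:
    pass


class Dog(Animal):
    pass


def describe(pet: Animal) -> str:
    # "pet: Animal" means an instance of Animal (or a subclass).
    return type(pet).__name__


def make(cls: Type[Animal]) -> Animal:
    # "cls: Type[Animal]" means the class object itself, which is callable.
    return cls()


pet = make(Dog)
name = describe(pet)
```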
my code is defining subclasses of class Action (lets rename it so it's shorter) that comes from non-annotated module aaa
i was assuming one can one-line define annotation for class Action before defining full class stub, but this may not be true probably
This would work yeah, but you could just do UnAnnotatedClass = aaa.UnAnnotatedClass y'know?
is there some implied reason why this does not work automatically? why i cannot use the class from non-typed module in my typed code? (assuming i dont use any nested methods etc.)
this has to be inside stub .pyi file right?
I don't understand, does it fail?
i mean error is reported when i dont have any stub, which i find strange, as it's class Action which by itself is type in my understanding
Or rather, is the issue that you get no feedback because it becomes Any?
Well your type checker doesn't know that what you're importing is a type without the stub, and even after you subclass from it your type checker doesn't know what Action inherits from its unknown parent.
oh, so something like class MySubAction(aaa.Action) might be "clarified" by class MySubAction(aaa.Action: aaa.Action)?
annotations for functions are rather clear & easy, but the class subclassing is either not described that great, or i am missing some core understanding
No, not at all. I am confused about what you're trying to do.
i want to "clear" the error from mypy that says i cannot subclass aaa.Action inside my annotated code
aaa.Action is unannotated class of external lib
That second code line is not valid Python syntax, but I take it you're trying to do some form of "X as Y" syntax that exists in other languages?
and i want to have class MyAction(aaa.Action) in my code that uses external aaa
typing.cast(typ, val)
Cast a value to a type.
This returns the value unchanged. To the type checker this signals that the return value has the designated type, but at runtime we intentionally don't check anything (we want this to be as fast as possible).
my assumption was, that mypy sees that aaa.Action is some "class", so it is clear that my class is subtyping other class of exact type "aaa.Action", but this seems not to be the case
class MyAction(cast(type, aaa.Action)):
    ...
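To make the "at runtime we don't check anything" part concrete, a minimal sketch:

```python
from typing import cast

raw = "not an int"

# cast only informs the type checker; the object is handed back untouched.
value = cast(int, raw)
is_same_object = value is raw
```

So a wrong cast is never caught at runtime, which is why it is better treated as a last resort than as a replacement for stubs.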
MyPy doesn't work like that (I think that's called gradual typing, there's other type checkers that do that), because then it could make a wrong assumption from you making a wrong assumption and won't correct you when you're doing something incorrectly.
oh, i need to study-up on theory
is there a go-to pattern for such use cases? ability to create my subclasses of external untyped class? (is it the "cast() that you mention above)
i guess it's rather frequent use-case for library APIs in general
Pyre appears to do gradual typing
https://pyre-check.org/docs/types-in-python/
Python's type system was specified in PEP 484. If you are new to Python's type system and want to learn the basics, we highly recommend you take a look at mypy's cheatsheet as well as their type system reference. The following discussion focuses on Pyre's approach to "gradual typing" and how you can get from an untyped codebase to a fully typed ...
I would probably do something more like this:
```python
import typing

from untyped import aclass  # the untyped third-party class


class AClassProtocol(typing.Protocol):
    def some_method(self) -> bool:
        ...


AClass: typing.Type[AClassProtocol] = aclass  # Could probably use cast() alternatively


# This can now be type checked afaik.
class MyThing(AClass):
    ...
```
Protocols i have not read on yet, thanks a lot for clarifying, going to grok the new details...
Overall though, I'd recommend stubs - these are all workarounds I've presented.
oops, i thought the above code is example for the stub code itself (up to, excluding class MyThing)
The last codeblock I sent is a way to basically have stubs in your code itself
right, it's just about the placement of these lines in file tree structure properly
In the stub file you just basically copy the entire module and everything it exports, like Action and write it in the way I wrote the Protocol (although without subclassing Protocol, you should essentially write classes and functions with no bodies except for ... or pass so that it's valid Python code).
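A rough sketch of what such a stub could look like. The real aaa API isn't shown anywhere in this chat, so `name`, `run` and `make_action` are invented members purely for illustration; the file would normally be saved as `aaa.pyi` somewhere the type checker can find it.

```python
# aaa.pyi (hypothetical stub: every member below is an assumption)
from typing import Optional


class Action:
    name: str

    def __init__(self, name: str, timeout: Optional[float] = ...) -> None: ...
    def run(self) -> bool: ...


def make_action(name: str) -> Action: ...
```

Only signatures are written; every body is just `...`, which keeps the file valid Python.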
yes, that one is clear, i was missing the Protocol part understanding
A protocol is like an abstract class except you don't need to subclass it, which makes it ideal for Python's duck typing system.
perfect, i have content of stub, ill just have to split it correctly into stub sub-modules etc., my case has nested untyped modules (so that mypy sees it), but that one is py-101
Also afaik you don't have to explicitly inherit from the protocol
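A small sketch of that "no explicit inheritance" point. `SupportsSomeMethod`, `Duck` and `call_it` are made-up names; `runtime_checkable` is only added so the match can also be shown with isinstance().

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsSomeMethod(Protocol):
    def some_method(self) -> bool: ...


class Duck:
    # Note: Duck does not inherit from SupportsSomeMethod.
    def some_method(self) -> bool:
        return True


def call_it(thing: SupportsSomeMethod) -> bool:
    # A type checker accepts Duck here because its shape matches the protocol.
    return thing.some_method()


result = call_it(Duck())
is_match = isinstance(Duck(), SupportsSomeMethod)
```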
lot of joy trying as a beginner to add annotations to code using not so greatly written lib (missing reexports etc.)
Yep, types are great!
https://mystb.in/TiffanyMorningNasty.py
how would I change the return type of the CacheableMeta.cache property to match what User.__cache__ is (in this example)?
You can't
Well not properly, this is just something type checkers special case
I mean you could just make it a class property and then it would be able to access the type of Self
ah okay
```py
class Cacheable(...):
    if TYPE_CHECKING:
        @classmethod
        @property
        def cache(cls: type[CacheableT]) -> Cache[CacheableT]:
            ...
```
seems to work
type[Self] works fyi
Well that's a bug in pyright
ah okay
this is probably a fun consequence of Pyright being implemented in TypeScript: https://github.com/microsoft/pyright/issues/2860
lmaoo
When dealing with large amount of literals that would return different types, is it better to make a stub file to house all the overloads? Or just keep the overloads inside of the file where the implementation is
opinions differ, someway up there's a big conversation on when to use stub files
my first instinct would be that you probably shouldn't write an API that requires that many overloads
but if you really need to, keep the types close to the code so you only have one file to update if you change it
That is a good point, I'll keep that in mind
most editors allow you to create regions that'll allow you to get the overloads out of the way when you're not actively working with them
I'm using neovim, so I don't think that's possible for me. That is unless a plugin already exists for that
I would expect one to exist if it's not something builtin
Do you know what I should search for exactly then?
I haven't really sought after something like this before so I'm unsure
not completely sure
Not sure what exactly they're called, but they're comments where # region starts the region (with an optional name after that) and an # endregion closes it
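For illustration, this is roughly how the markers would wrap a group of overloads. `parse` is a made-up function, and whether an editor actually folds on these comments depends on the editor and its configuration.

```python
from typing import Literal, Union, overload


# region overloads for parse()
@overload
def parse(kind: Literal["int"], text: str) -> int: ...
@overload
def parse(kind: Literal["float"], text: str) -> float: ...
# endregion


def parse(kind: str, text: str) -> Union[int, float]:
    if kind == "int":
        return int(text)
    if kind == "float":
        return float(text)
    raise ValueError(f"unknown kind: {kind!r}")
```

The markers are plain comments, so they have no effect on the code itself.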
you found my attempts to break literal math
the one with fifty primes is the best one, but the string example is more realistic..it's fixed now in pyright anyway (it bails to int/str when it starts to explode: https://github.com/microsoft/pyright/commit/98afe7110ea50cacb26b9ed8ea01f9e23660594c )
I actually have a couple more ideas for ways some bugs might have snuck in, but I haven't tried them yet
I'd probably echo, don't use overloads for this unless you have a specific reason
For something like open, there's a lot of history and it reflects an existing API and so on
but honestly would probably be better for users to have separate functions, open_readable, open_writeable, open_readwriteable
Basically just use named functions over overloads and literals basically, it's easier for both implementation and users
The system I'm working with will set Futures corresponding to an event; the literal itself is the name corresponding to the event, and thus would return a specific type from the Future after set_result is called. My main reason for having this is type safety, but I'm also willing to scratch the idea
I think the only real use I personally had for it was here https://github.com/Numerlor/Auto_Neutron/blob/master/auto_neutron/settings/toml_settings.py#L58, and even there it was a bit of a pain because a normal superset of the overload on the implementation with explicit arguments is not valid, but *args **kwargs is
could've been separate functions too I guess
What do you mean by "set futures"? That sounds like modifying an existing object?
You can't change the type of an object that already exists, so for example a set_result call doesn't in principle have any freedom with respect to the type
Would probably help to see some toy code though
E.g.
```py
futures = {}

def foo(...) -> None:
    futures[...] = loop.create_future()

async def bar(...) -> None:
    await asyncio.wait_for(futures[...], timeout=None)
```
In the dispatch system itself it would just call futures[...].set_result(result)
So you're going to pass a literal to create_future that will determine the return type?
What type do you envision futures having?
No, the literal only tells what event to store the Future under. Future.set_result will be called with an instance of a class that depends on the event being dispatched by the system. I'm simply trying to relate the literal of the event to the type if that makes any sense
E.g.
```py
await bar("foo") -> One: ...
await bar("bar") -> Two: ...
```
This is why I thought of the overload
So far after looking for a few alternatives it seems the best choice is the overload after all though
Yeah but I'm trying to understand the broader context here
You're going to take the future and put it in this dict
What's the type of the dict?
The dict would be dict[str, asyncio.Future] as I said earlier
I'm not actually trying to retrieve the future
Yep
I'm just trying to wait for the Future to be set via set_result
Then that would in return give me the object I passed in set_result from wait_for
Alright
What's the reason for needing to be able to pass in these strings
Instead of just passing in the type you want?
The dispatch system works via the event NAME being dispatched, the objects comes from the event itself
But then the name will be something dynamic
Maybe perhaps I can make it a TypeVar and correlate the object to the event
that would require a redesign though
If you take in the event name which is dynamic and pass it in to bar mypy won't know what the return type is
It'll just be confused
Which is why I asked if I should do overloads in a separate file or not
I understand the problem is non trivial and no workaround I see works so far
It's not non trivial it's just impossible
You can't ask a static type system to draw inferences on runtime values
Unless I just overload the function with the Literal and have the return type
Overloading with literal only works if you make calls with literal values
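To make that concrete, here is a rough sketch of the Literal-overload approach being discussed. `Created`, `Deleted`, the event names and the module-level `futures` dict are stand-ins, not the real dispatch system.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, Literal, overload


@dataclass
class Created:
    id: int


@dataclass
class Deleted:
    id: int


futures: Dict[str, "asyncio.Future[Any]"] = {}


@overload
async def wait_for(event: Literal["create_obj"]) -> Created: ...
@overload
async def wait_for(event: Literal["delete_obj"]) -> Deleted: ...


async def wait_for(event: str) -> Any:
    # One shared implementation; the overloads exist only for type checkers.
    future = asyncio.get_running_loop().create_future()
    futures[event] = future
    return await future


async def main() -> Created:
    waiter = asyncio.create_task(wait_for("create_obj"))  # inferred as Created
    await asyncio.sleep(0)  # let wait_for register its future
    futures["create_obj"].set_result(Created(id=1))  # the "dispatch" side
    return await waiter


created = asyncio.run(main())
```

A call like `wait_for(name)` with `name: str` matches neither overload, so a checker reports it instead of inferring a return type. That is the limitation mentioned above.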
You said you make calls with the name of the event?
Yes... the literal
the NAME of the event is predefined
You won't ever have code like bar(event_name)?
Then a much simpler solution to all this is to do bar(EventType)
But EventType != returned instance type
Well, the type of the returned instance then
Then the name of the event wouldn't be the name of the event... And instead be the type
"I'm simply trying to relate the literal of the event to the type if that makes any sense"
It's just substituting one literal for another at that point
Either way, I simply cannot just do the instance type as the event name either way even if I decided to do a redesign of the system
Two events could dispatch the same object but on different basis
E.g.
```py
bar("delete_obj") -> obj: ...
bar("create_obj") -> obj: ...
```
You can see the problem that has with what you suggested
Sure
Sorry, why not just differently named functions then?
bar_delete_obj, bar_create_obj
For each event I would then need to add their respective function, basically putting a bunch of stuff into the class's namespace when I don't want to. With overloads it won't do that and will only ensure type safety, which is the whole reason why I'm asking my original question
How is adding their respective function worse than having one function implementation with big if/else, + overloads?
basically putting a bunch of stuff into the classes namespace when I don't want it to
So it comes down to "polluting the namespace"?
You can have a namespace called bar and make the events functions in that namespace
bar.create_event
bar.delete_event
No namespace pollution, much simpler implementation, even get much better ide support this way
Hmm this is viable, but this would also end up making me do more work pretty much. When I write the documentation I will need to document every method of bar when I could've just documented the single wait_for
I'm not exactly sure what your documentation requirements are
But I mean ultimately in both cases you can either document all the implementations in one place, or document each of them individually
The details of how exactly they're organized don't change that
I will consider doing the namespace with all the methods per event. But before I consider can you tell me why it's not a good/suggested idea to go for the overloads route?
I haven't used overloads much so that is something I'd like to know in the future
It's just already preferable to use named functions than to pass hard coded values to functions
Like, imagine instead of my_list.append(x), etc, you decided to write my_list.do_it("append")
And the only method of your list was do_it
That would just be silly
That does not apply in my case though
Function arguments should be things you can pass around, forward from other places. If you have to choose from a hard coded list of values to an argument then why not just choose from a hard coded list of function names?
I mean whether it's literally the only method isn't the main point
It would either be, more methods, or having this single method accepting literals
Yes...
The single method wouldn't call any other method either
It simply waits for a response from the websocket
I'm not sure what that has to do with it
I'm saying why exactly should I just make more methods when the latter is much cleaner
was the autocomplete problem also mentioned? by having the user choose from a list of string literals, they get no autocomplete suggestions from their IDE
It's not cleaner though?
It's the same thing, minus the quotes
And minus the need to have an enormous if/else or something like that in the implementation
I don't see how it wouldn't be cleaner though
Can you explain concretely how it is?
Let me write an example
Like I've given some concrete things
You save quotes, you get ide auto completion
I've also tried to make the point that you could use this technique literally anywhere
Yet, 99.999999 percent of the time we don't
I'm not seeing anything concrete from you other than that you like it better tbh
```py
class Foo:
    async def wait_for_delete(self) -> obj:
        return await asyncio.wait_for(futures["delete_event"], timeout=None)

    async def wait_for_create(self) -> obj:
        return await asyncio.wait_for(futures["create_event"], timeout=None)

    async def wait_for_update(self) -> obj:
        return await asyncio.wait_for(futures["update_event"], timeout=None)

    async def wait_for_another_event(self) -> obj:
        return await asyncio.wait_for(futures["another_event"], timeout=None)

foo = Foo()
await foo.wait_for_delete()
await foo.wait_for_create()
await foo.wait_for_update()
await foo.wait_for_another_event()

# vs

class Foo:
    async def wait_for(self, event: str) -> obj:
        return await asyncio.wait_for(futures[event], timeout=None)

foo = Foo()
await foo.wait_for("delete_event")
await foo.wait_for("create_event")
await foo.wait_for("update_event")
await foo.wait_for("another_event")
```
Can you see what I'm saying
So what part do you feel is messy?
Adding more methods than needed and the naming itself
I guess the thing that makes the second look nicer is because you're passing a string into this dict
So it trivializes the implementation in the second case
But I would not implement it that way either, that dict can't store the proper types of the futures
Your different wait_for s are supposed to return different types but you'll need to depend on manual casts to get that correct
The typical way to store things of heterogeneous types with fixed names is a dataclass, not a dict.
And if it were done that way then the second implementation would need to have a big branch
getattr on the dataclass 🥴
futures["foo"] = asyncio.Future[object](loop=loop)
I just realized I could do this
But this wouldn't solve my current issue either
Yeah I mean your wait for calls are supposed to be returning different types
But the implementation is such that the type information is identical for them
So users are going to get type information but you won't benefit from that inside your implementation really
I don't care for the benefits internally, I want the type error in the first place for users using typecheckers
I'm not sure how much else you have going on here but it seems like maybe the way to go is to have a generic class of some kind
like even if you had a dataclass of Futures, the name of each field is the event name and the type of the future is the return event
You can see that already gets you most of the way there
At that point why not have a member function on each field instead of writing these functions over and over. Does that make sense?
Oh I just thought of an idea that might work
How about an enum per event, e.g.
```py
class Event(enum.Enum):
    CREATE = "CREATE"  # Name of the event, and somehow attach a class to CREATE
```
I can do this through the metaclass
I could just do this but instead of "CREATE" uhh
Perhaps a class with a class property of the type of object?
I'm not too sure
```py
@dataclass
class Foo:
    delete: MyFuture[TypeOne]
    create: MyFuture[TypeTwo]

foo = Foo()
await foo.delete.wait()
```
I was going to mention enums before, I mean they're not really much better but at least they save you some issues with mistyping strings
But I think something like what I wrote with the dataclass can work well
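A runnable sketch of that dataclass idea. The chat only names `MyFuture`, so its body here is a guess, and `TypeOne`, `TypeTwo` and `Events` are placeholder classes.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class MyFuture(Generic[T]):
    """Hypothetical typed wrapper around asyncio.Future (an assumption)."""

    def __init__(self) -> None:
        self._future: "Optional[asyncio.Future[T]]" = None

    def _get(self) -> "asyncio.Future[T]":
        # Created lazily so the future belongs to the loop that is running.
        if self._future is None:
            self._future = asyncio.get_running_loop().create_future()
        return self._future

    def set(self, value: T) -> None:
        self._get().set_result(value)

    async def wait(self) -> T:
        return await self._get()


@dataclass
class TypeOne:
    id: int


@dataclass
class TypeTwo:
    name: str


@dataclass
class Events:
    delete: MyFuture[TypeOne] = field(default_factory=MyFuture)
    create: MyFuture[TypeTwo] = field(default_factory=MyFuture)


async def main() -> TypeOne:
    events = Events()
    events.delete.set(TypeOne(id=7))   # dispatcher side
    return await events.delete.wait()  # a checker infers TypeOne here


deleted = asyncio.run(main())
```

Each field carries its own result type, so no overloads or casts are needed, and `events.` gets attribute completion.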
lst = [1,4] , [5,6] , [6,7] <How to sum each of arrays to n>
example 1+2+3+4
and 5+6
6+7 please help where to find answer of this code
You can get a help channel for this #❓｜how-to-get-help
@terse sky not sure if you are aware, but you can subclass both str and Enum and then your enum members will be actual str objects
Which is pretty cool and very useful for a lot of applications
```py
class MagicWord(str, enum.Enum):
    PLEASE = enum.auto()
    THANKYOU = enum.auto()
```
That's interesting. But I feel like I'd rather be explicit about when I want a string
Oh auto doesn't work anyway i just checked
I could've sworn it worked for string
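For what it's worth, auto() can be made to produce the member name by overriding `_generate_next_value_` on a base class, which is the pattern the enum docs describe. `NamedStrEnum` is a made-up name for this sketch. On Python 3.11+ there is also `enum.StrEnum`, where auto() yields the lowercased member name.

```python
import enum


class NamedStrEnum(str, enum.Enum):
    # Must be defined before any members so auto() picks it up.
    @staticmethod
    def _generate_next_value_(name, start, count, last_values):
        return name


class MagicWord(NamedStrEnum):
    PLEASE = enum.auto()
    THANKYOU = enum.auto()


print(repr(MagicWord.PLEASE.value))
```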
!e
```py
import enum

class MagicWord(str, enum.Enum):
    PLEASE = 'PLEASE'
    THANKYOU = 'THANKYOU'

print(repr(MagicWord.PLEASE.lower()))
```
@fierce ridge :white_check_mark: Your eval job has completed with return code 0.
'please'
This is really useful when interacting with other interfaces that rely on magic strings
so you get both type safety and zero-ish overhead
and of course it works with integers and "flags"/bitfields too
with IntEnum and i think Flag / IntFlag
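A quick sketch of the bitfield case using `enum.IntFlag`. The `Permission` class and its values are invented for the example.

```python
import enum


class Permission(enum.IntFlag):
    READ = 1
    WRITE = 2
    EXECUTE = 4


# Members combine with bitwise operators and still behave like ints.
rw = Permission.READ | Permission.WRITE

print(Permission.READ in rw)
print(int(rw))
```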
people complain about enums in python being different from c but i think maybe the python version is better
I don't think comparing them is fair
But yeah Enums are fun
Especially cause you can do some cool stuff for your own constant values and Literal
well, mypy chose to use integers instead of enums
apparently because there's overhead
I mean have you seen the code?
?
It's actual spaghetti
ah
yes it is
I didn't look very deeply, but there are objects with lots of optional/boolean fields
Yeah it's very customisable and handles like every case for something
So it's pretty slow
It's why discord.py has its own implementation
of what?
Enum
ah
oh, you meant the enum source code?
for some reason I thought you were talking about mypy
Interesting, I didn't realize this. But I never checked
Are you talking about slow to create the class? Or slow to actually work with members/instances
But now you're repeating every value
And you're also losing type safety, if I'm understanding correctly
I can pass in an enum anywhere a string is asked for
Yes, subclassing things when they don't need to be is also less type safe
What?
If you have a class that had a conversion to string, it's safer to not inherit from string
Why?
It's equivalent to implicit conversion
In this example with the enum
It's like how C enums implicitly convert to integer
This was broadly regarded as generally undesirable and fixed in C++ enum classes
yeah but i don't see an issue with that; the benefit is that you can write foo(val: MagicWord) and then operate on val as if it were a string
The issue is that you might not have intended that
I personally don't see any benefit in that
And it's very easy to just call name or value explicitly if you want a string
Unless I specifically need to decouple the string which I personally often don't
I usually just use name
And then that also saves me writing every name twice like in your example
if an enum is intrinsically associated with some kind of string, I would just use the value, yes
that string isn't necessarily a valid Python attribute though
Right
In some cases if there's a specific string that's enforced externally then you probably want to just repeat and use value
But if I just want to safely dump enums into JSON, for example, I'll use name
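A small sketch of the "dump by name" approach with a plain Enum (no str mixin). `Color` and the payload key are placeholders.

```python
import enum
import json


class Color(enum.Enum):
    RED = "#ff0000"
    GREEN = "#00ff00"


# Serialise explicitly via .name, then look the member up by name again.
payload = json.dumps({"favourite": Color.RED.name})
restored = Color[json.loads(payload)["favourite"]]

print(payload)
```

The conversion to a string happens only where `.name` is written out, which is the explicitness being argued for.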
I think someone fixed enum creation being quadratic in number of items in two different cases but there's still a third reason it's quadratic
@soft matrix not sure I can help you with the mypy implementation, but it turned out to be pretty easy to add Self to pyanalyze (https://github.com/quora/pyanalyze/pull/423). I simply desugar Self into a TypeVar and then use normal TypeVar substitution on attribute access to turn it into the right type.
That seems to bring a lot of changes in things that seem inconsistent
Idk if you follow the pyright repo's issues but someone brought up Self not working in ClassVar because it was getting desugared into a TypeVar
I also tried doing desugaring but couldn't make it work but this is the first big thing I've done in mypy
This is different than the desugaring approach that was suggested in the PEP itself, to be clear
Well what you're describing is what I tried in mypy
I think I have a link to it originally somewhere in the original pep document on Google docs
It was really not good
works fine here, https://github.com/quora/pyanalyze/pull/426/files
I thought TypeVars didn't work in ClassVars
pyanalyze is maybe a bit too lenient here. But also note that I implemented Self resolution in the attribute resolver. It really doesn't have to be a TypeVar at all, it's just that when we resolve an attribute on an object, we run a step that turns Self into the current class
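To illustrate what "desugaring Self into a TypeVar" refers to, this is the pre-PEP 673 spelling written by hand. `Shape` and `Circle` are made up, and this is not how pyanalyze or mypy implement it internally.

```python
from typing import TypeVar

TShape = TypeVar("TShape", bound="Shape")


class Shape:
    def __init__(self) -> None:
        self.scale = 1.0

    # With PEP 673 this would read: def set_scale(self, scale: float) -> Self
    def set_scale(self: TShape, scale: float) -> TShape:
        self.scale = scale
        return self


class Circle(Shape):
    radius = 1.0


# A checker infers Circle here, not Shape, because TShape binds to the receiver.
circle = Circle().set_scale(2.0)
```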
@oblique urchin hmmm, tricky to turn this into a false negative or positive
(it's a bug anyway)
I feel like this is just an instance of the general issue where people complain that if they have x: Union[int, str] and then do x + x, they get an error
Eric has rejected reports like that because it's not really feasible to support
yeah, although it could be easy depending on how the literal evaluation is coded
I'll add to additional context
@oblique urchin if this is not fixed and eric implements ** the current mitigation strategy for my OOM will not work
it checks size of LHS and RHS unions during entry to the codepath
well, it depends how ** is coded I guess
it would probably be okay (if it's done any reasonable way)
I guess if he went ahead and added real bignums you'd have tried to compute 100000000 * 10000000 and OOMed pyright
well with some more zeros
trying to OOM is actually the first thing I thought of when I saw the issue proposal, so I had a day head start of ideas
the issue with strings is at least relatively realistic for real code (I can come up with plausible ways real code would have things that explode to at least ~100k literals) which would be unacceptable for pyright's use in vscode, I guess
do you know offhand if pyre supports this?
well I managed to freeze pyright
(again)
yes I managed to escape the mitigation completely
this is getting increasingly pathological so I'm not sure how much Eric cares (except for that people can take down a CI, I guess)
I think I annoyed him to the highest degree
he doesn't even reply to me
like, if I reply to an issue he closed or to a discussion
🧐 interesting
I commented here, I think it's a bit weird still https://github.com/microsoft/pyright/commit/98afe7110ea50cacb26b9ed8ea01f9e23660594c
the code is safe to run (it won't immediately OOM, it will just spin up your CPU and stall)
anyway there's still an entirely different class of literal math bugs I haven't started trying to find yet (code flow that leads to loops or variable modification in unexpected ways)
I have my 8th (!) job interview with a company Wednesday
good luck
I'm not sure I understand the type perspective comment, although I don't care too much about the underlying issue
I assumed it was more like, the set of practical limitations in the design approach for pyright to avoid programming an actual SMT solver for type checking precludes handling cases like x+x when x: str | int as opposed to there being some actual philosophy/type theory behind it
(I don't know if any of you know what is being referred to?)
i think literal math should work that way:```py
x: Literal[0, 1] = ...
y: Literal[0, 1] = ...
reveal_type(x + y) # Literal[0, 1, 2]
x: Literal[0, 1] = ...
y: Literal[0, 1] = x
reveal_type(x + y) # Literal[0, 2]
```
@oblique urchin hmmm, so in my head I imagined mypy and pyright were implementing a small subset of a constraint solver, but I guess what's really happening is everything is working strictly with type algebra with a lot of special cases added on to deal with spots where bidirectionality is needed?
and that's why stuff like x+x when x: int | str can't be handled unless it's through some hack
And that's actually what Eric is referring to by 'type perspective'; i.e. it's (literally) blind to the stuff that isn't the type except where special cased otherwise?
surely I can come up with stuff where bidirectionality doesn't work though
I want to see
Isn't it correct though? They could be different types and fail?
```py
x: Union[int, str]
y: Union[int, str]
x + y  # error
x + x  # false positive error

def f(x: T, y: T):
    x + y  # no error
    x + x  # also no error
```
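A runnable version of that contrast, assuming the `T` above is meant as a TypeVar constrained to int and str (the snippet doesn't say). `add_same` and `add_union` are made-up names.

```python
from typing import TypeVar, Union

T = TypeVar("T", int, str)  # constrained: T is exactly int or exactly str


def add_same(x: T, y: T) -> T:
    # Accepted by checkers: both operands share the same constraint.
    return x + y


def add_union(x: Union[int, str], y: Union[int, str]) -> Union[int, str]:
    # Flagged by checkers: x could be an int while y is a str.
    return x + y


print(add_same(1, 2))
print(add_same("a", "b"))
```

With the union version the checker tracks only the declared type of each operand, not that both operands are the same variable, which is why `x + x` is flagged too.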
I was going to try and fix an issue myself
then I gave up
Do you know if all the type checkers are just type algebra with special cases or has anyone taken the approach of a real constraint solver?
if by "real" you mean something like Z3, those are used in type checkers for type systems with "refinement types"
(i don't know much about those algorithms, i just know they're used)
I mean, I assume that's the limitation Eric is getting at by type perspective
i missed the context for the conversation
and yes if you want to use things like TypeGuard for int > 0 then you need true refinement types with a constraint solver
would be pretty damn cool to have in a language like python
pyright can't handle x+x when x: int | str
i still didnt get it
