#internals-and-peps
C extensions need to be recompiled for PyPy in order to work. Depending on your build system, it might work out of the box or will be slightly harder.
PyPy has support for the CPython C API; however, there are constructs that are not compatible.
i imagine, yes - to some extent
Well, I was wondering specifically about bytecode.
Because, for example, you could compile Python3.7 features like annotations with CPython compiler to bytecode and then run it with PyPy.
i thought it had a straight up different mechanism
like a whole different set of layers for execution
Is "jank" flagged as a bad word in pypi or something? I am writing a game engine called Jank (because its so buggy lol) but it wont let me upload to pypi
probably already a package
oh
that's interesting
pretty sure you cannot, deleting anything from pypi breaks everything that depends on it which is not determinable
^
The repo doesnt even exist anymore too
for example, microsoft has been squatting the minecraft package for years
nobody else can claim it anyway according to PEP 541, though
I guess they can just break the rules :(
im gonna contact the current owner of the jank library
hopefully we can sort something out
if I dont get a response within a reasonable amount of time ill probably do that
or you could go to the pypi repo and make an issue
and pypi overlords will remove the package (because it breaks PEP 541)
ok
what would it classify as?
oh great, the email address doesnt exist either
guess ill have to file an issue on the repo then
I know I'm late to the party on this, but I still find it weird that Pypy was willing to backport f-strings into their 3.5 line
Not sure if that's still relevant, but it just seemed strange
They support f strings?
are there any specific improvements/features for python 3.9 yall are excited about/more interested in?
Null aware operators Misunderstood
@full jay the Pep for null aware stuff is deferred
Right right
null aware were deferred, no potential date
but the showcase implementation of match-case that is allegedly merge-ready was marked as 3.10a1 or something along those lines
@full jay maybe it was an easy backport, and they were stuck on 3.5 for a long time so maybe they wanted to just show a sign of life
im happy now that theyre on 3.7
Well not to mention f-strings are hella convenient
yeah
Seems like something worth backporting
dataclasses was the last major python thing that I liked - so that seems like a very useful thing for pypy to have. Although I'm guessing the pip backports worked with pypy
after all, the whole point of pypy is that its written in rpython and rpython is supposed to be good for parsing and compiling
It's very interesting to me to see so many type aware things coming into the language, especially after such a long insistence that things like type hinting would have no impact on how the code works or runs, at least in the context of the standard library implementations
@slim island i just use attrs anyway, more featureful than dataclasses
what type aware things? @full jay
the none-aware ops are in the same category as the walrus operator, helps clean up a lot of messy python code
and im still more or less opposed to pattern matching
I might be using the wrong choice of words, I'm thinking about the pattern matching
I'm not opposed to the idea of us having it, but I kind of bristle that they actively discourage using it like switch in the PEP
yeah :/ seems like the best use for it
"Here's this swiss army knife, but you can only use the toothpick and the corkscrew"
The reference implementation is marked as 3.10alpha0
Well, we slowly come to a realization that types are generally pretty cool...
@grave jolt Sure and I'm all for that, 100%. But it's been such a... I don't know, subtle change when they've been rather vehement about it in the past.
I think I'd just want to know what their official stance is on it at the moment, I guess
It's still marked as a draft I think
I want a better understanding of where it's being guided
Thought it was just a draft
Status: draft
It is
does standards track not mean what i think it means?
No, I don't think so
I'm not sure
Still worth mulling the concept over
Consider some of the heavy hitters that are on the author team
typing in stdlib is really nice, I’ll switch to 3.9 without even having really used 3.8
it is standards track for 3.10
I'd be shocked if this doesn't go through
hasn't typing always been in stdlib?
but the class pattern makes a giant mess of everything
For a long time, yeah
where is this all discussed? python-dev mailing list?
Well, I meant not using the typing module
@glass robin still need it for Optional etc
harmonizing typing and collections.abc has no downsides imo
That's the main thing to use from it
Python 3.42 will have full dependent typing
Har har
oh yeah, the stupid underscore wildcard
I'm okay with that
yuck
That's common
Two were already rejected, but those were more for switch instead of the pattern matching, but didn't like it that much, looked more like some neat syntax to me instead of big readability improvements
context dependent variable name?
Specifically the use of _ as the catchall option
thats a smelly smell if ive ever smelled one
Feel like I've seen it in at least two other places
its used in nonstandard libs
And in fairness, _ has always been a catch all kind of thing
Don't need this part of the unpacking? Underscore it
unless you put a bunch of semantics around it like _ = is a black hole assignment that just dels the RHS and you can never actually access _
which might be nice actually
ye, it's cool if it's consistent across the language
make _ a special name everywhere or nowhere imo
that could be a pretty cool trick for some niche cases where you are passing around large objects
but where you only need a small fraction of the data at the end
the "black hole" variable
So you're proposing making it defined in the language rather that an implementation detail
Okay, I can see that
Because yeah isn't it currently just a CPython quirk?
what exactly?
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
42
for _ in range(5):
_ isn't even a cpython quirk. _ is just a variable name
It is but I think the interpreter itself uses it as a buffer
the use of _ is purely by convention
Or at least I thought it was.... maybe it's a REPL thing
yeah the repl uses it for "last result"
I could swear there was something weird specifically about _
Okay that's what it was then
My bad
still a good point
the interpreter already uses it so you can't make it an inaccessible name
without breaking how the repl works
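A small sketch of the point above: outside the REPL, `_` behaves like any other variable name, so whatever was last assigned to it stays accessible. The names `last_value` and `ignored` are just for illustration.

```python
# `_` is an ordinary identifier: the loop binds it like any other name.
for _ in range(5):
    pass
last_value = _  # still readable after the loop

# The conventional "don't care" slot in unpacking still stores the value.
first, _, third = ("a", "b", "c")
ignored = _

print(last_value, ignored)
```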
Project to be claimed jank: https://pypi.org/project/jank Your PyPI username DoAltPlusF4: https://pypi.org/user/DoAltplusF4 Reasons for the request The project has no description, no licence, and n...
i would rather use a bare * as a catchall/wildcard
* is already used in several places
That'd certainly make more sense in the Python context
and syntactically case *, a: is unambiguous
and its a very familiar convention from shell globs
I think the _ as the catchall for switch casing or pattern matching might be more of a functional language thing? Entirely possible (probable) I'm wrong, though
it would be it a bit strange as it would have an entirely different meaning from what it does in argument lists, but better than _ probably
Well there's always just adding default
__any__
does ? have any syntactical meaning in python at all?
nope
It's free real estate
That's surprising
actually, wouldn't Point(object(), object()) work?
typing.Any 🙂
It would and be more explicit
also an option, but I am not sure how the pep matches against those
It was proposed to allow *rest in a class pattern, giving a variable to be bound to all positional arguments at once (similar to its use in unpacking assignments). It would provide some symmetry with sequence patterns. But it might be confused with a feature to provide the values for all positional arguments at once. And there seems to be no practical need for it, so it was scrapped. (It could easily be added at a later stage if a need arises.)
hm
so much for the haskell fans
I'm also not fond of the idea of having to import something from typing just to get my default case
agreed
i dont understand their objection to case head, *rest:
can someone interpret that paragraph for me
I think it literally just boils down to "It's not very Pythonic and we don't think it should be there"
I'm not seeing any super compelling arguments against
the pep actually discusses using * as a wildcard, but not convincingly
Can you delete a pypi project that you own? Im thinking I might push to pypi with a different name then delete and push with "jank" once the naming conflict is cleared up
someone might think that you could do sth like
l = [3, 4]
match (1, 3, 4):
    case 1, *l:
        ...  # matches
Okay, that makes sense
if you think of it as the LHS of assignment it makes sense
but yeah i would love the help channel questions about why l was overwritten
i suppose case head, *, *rest: is potentially confusing as well
Yeah a little
@grave jolt :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 4, in <module>
003 | NameError: name 'e' is not defined
but honestly, head-tail split makes pretty much no sense with the semantics python has for unpacking
@flat gazelle and it will result in O(n^2) algorithms
because it's gonna copy the whole array
and if you end up making a linked list
Yeah I don't think there's a happy medium.
yeah its not actually useful most of the time when writing python code except when trying to learn recursion
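To make the copying concern concrete, here is a minimal sketch using plain unpacking (no new syntax needed). `recursive_sum` is a made-up helper; each `head, *rest` split builds a fresh list, so the recursion copies roughly n + (n-1) + ... elements in total.

```python
def recursive_sum(items):
    """Sum a list by repeatedly splitting it into head and tail."""
    if not items:
        return 0
    # `rest` is a brand-new list with len(items) - 1 elements.
    head, *rest = items
    return head + recursive_sum(rest)


data = [1, 2, 3, 4]
head, *rest = data
total = recursive_sum(data)
print(total, rest is data)
```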
and if you end up making a linked list
...someone's gonna murder me
And even then there's alternatives
match llist:
    case Cons(x, xs):
        ...
    case Nil:
        ...
this is essentially what purescript does (though operator aliases for data constructors are wip afaik)
sealed classes are pretty cool
_ for pattern matching would be the most reasonable option, as it’s the convention in most languages
Unlike *
it would make _ special exactly in this one case
which is a bit strange
java had a deprecation warning for it
Again I think it's more of a functional language standard
Yes
| Error:
| as of release 9, '_' is a keyword, and may not be used as an identifier
| var _ = 5
Although the only language I can think of off the top of my head is ReasonML
So also likely in Ocaml as well
But pattern matching is a fl thing
True
#sorrynotsorry
but you cannot do that in python, as _ is already used in the repl and across tons of codebases as an ignore variable
Ocaml use _ yes
god trying to find resources for that was the most atrocious thing ever
@flat gazelle I disagree. Consider that in the examples the patterns matched would be doing crap tons of function calls and the like
So it'll already have a special like... bubble around it anyway
I don't think it'd conflict
haskell, elm, java, kotlin, purescript and I think the JS proposal all use _
I can see what you mean, it would be like saying {} in fstrings conflict with set literals
ELM! That's right, that's the other place I saw it. Exactly
We already have special sub contexts that do stuff like this
It just needs a trigger to tell it to think in that context, in this case match
{} is also an ambiguous representation of a dict (it could be an empty set)
Which again is dictated by context
Clojure did it correctly with #{1, 2} being a set and... uh... an awkward vector (array) being a dict [:hello 1, :world 2]
Okay, when they're not side by side that's certainly more pleasant
:atoms are a part of the language anyway, so why introduce new syntax 🤷
Also, commas are whitespace, so I can do
(def users
  [:typing "fix error"
   :hello "world"])
(def users
  [:typing,,,,, "fix error"
   :hello ,"world",,,,,,,,])
I mean in fairness, even the language name sounds like you were drunk when trying to tell someone what it's called
Klojure would've been worse
Well, it's like replacing pi/pe/pa/po with py in python packages.
I was thinking more about the jur part
Like someone trying to say closure but had a few too many
actually, FormDSL is just a point-free bad version of clojure 
anyway, we're getting kinda off topic
It would be nice to have different syntax for a set. Every time I see a set, I go: "ok, it's a dict... wait... it's a set"
I think I'm already used to it, and I like the parity of both a dict and a set being hash maps
True, a dict can be implemented as a set of pairs Pair(K, V) which hash and compare like their keys
or a set can be implemented as a dict of keys mapping into a singleton type
I actually did it like that when I implemented them in C (for educational purposes)... <offtopic prevention sequence>
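A rough sketch of the second direction (a set built on a dict whose values are all one sentinel). `DictBackedSet` and `_PRESENT` are made-up names, and this is only a toy, not how CPython implements `set`.

```python
class DictBackedSet:
    """Toy set: every key maps to the same sentinel object."""

    _PRESENT = object()

    def __init__(self, items=()):
        self._data = {}
        for item in items:
            self.add(item)

    def add(self, item):
        # Re-adding an existing key simply overwrites the sentinel.
        self._data[item] = self._PRESENT

    def __contains__(self, item):
        return item in self._data

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)


s = DictBackedSet([1, 2, 2, 3])
print(len(s), 2 in s, 5 in s)
```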
Maybe {{ }} could be used for a dict? (since you can't put a dict or a set in a set anyway) (good luck putting a dict in an f-string with additional {} around it tho)
Except that you can nest dictionaries, and I could see getting bracket fatigue SUPER quick
Hmm, true
this convo was more about how other languages deal with the {} ambiguity between sets and dicts
Oh, okay I see
What's the issue with that though?
A key:value pair seems pretty clearly distinct from a set
{} could be both an empty set and an empty dict
in python, it is an empty dict, but the ambiguity is still annoying
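For anyone skimming, the current behaviour in a few lines (variable names are just for illustration):

```python
empty_braces = {}       # always a dict
empty_set = set()       # the only spelling of an empty set
non_empty_set = {1, 2}  # braces with bare elements give a set

print(type(empty_braces).__name__, type(empty_set).__name__, type(non_empty_set).__name__)
```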
Sure but only for a moment.
It's one of the language quirks that are the least troublesome
true
i definitely agree with this statement, any possible solution seems to introduce more headaches than it solves, you could add {,} or whatever, but that just introduces another quirk, {} or set and be done with it imo
I think sigils for literals are a generally good thing.
Like, in Python you can do r"blahblahblah" or f"blah{wah}" or b"\x00ffdffdfdf". But what if you could define a custom prefix (sigil)?
Like
# split_sigil.py
import sigils
class SplitSigil(__sigil__[str, "w"]):
    @staticmethod
    def __convert__(super_sigil, string, context):
        return [sigils.convert(super_sigil, word, context) for word in string.split()]
# main.py
from split_sigil import SplitSigil
x = "hey now"
a = w"hello world abc"  # a == ["hello", "world", "abc"]
b = fw"hello {x} abc"  # b == ["hello", "hey now", "abc"]
c = wf"hello {x} abc"  # c == ["hello", "hey", "now", "abc"]
what if there is a name conflict
the problem with that is that you'd need to read split_sigil.py before you could compile main.py, or any prefix would be compiled
hm
yes, that's true.
Maybe plain f, r, b could be compiled as is now, but any other prefixes could be stored together with a string.
and well, if a sigil may return an arbitrary object, how should that work with multiple sigils
Yeah, sigils don't perfectly compose... but, as you can see, some composition is possible ^
Maybe multiple sigils isn't a good idea, though
apart from combining f, r and b
Huh, apparently, bytestrings aren't composable with f. Well, that makes sense. It'd be weird to implicitly do .encode("utf-8") or something
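A quick way to check which prefix combinations the parser accepts, assuming a CPython 3.6+ interpreter. `is_valid_literal` is a throwaway helper written for this sketch.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as a single expression."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# Try each prefix in front of the same string body.
results = {
    prefix: is_valid_literal(prefix + '"x"')
    for prefix in ("f", "rb", "rf", "fb", "bf")
}
print(results)
```

The raw variants combine with either `f` or `b`, while `f` and `b` together are rejected at parse time.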
I should look into how elixir does this thing. However, elixir has macros, so that integrates very well.
A sigil for SQL queries that would escape stuff would certainly be nice, though.
ye, that would be quite nice
Or maybe an HTML template sigil.
why not just use a function?
good point 🙂
maybe SQL sigils aren't that great, by the way...
it might encourage intertwining business logic with raw SQL queries
one problem, I think, is stringly typed interfaces like SQL
and regex
If sigils are restricted to transforming a string into another literal or an f-string at compile-time somehow, they could work together with a type checker.
For example, if a sigil treats a string as a regular expression, the result could be a representation like this:
!e
import re
print(re.sre_parse.parse(r"[a-zA-Z0-9]+"))
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
[(MAX_REPEAT, (1, MAXREPEAT, [(IN, [(RANGE, (97, 122)), (RANGE, (65, 90)), (RANGE, (48, 57))])]))]
a static analyzer could look into a string that is passed into re.compile, though
But, for example, you could create an SQL sigil which is parametrized (at compile time) with your database schema. Then it would be possible to determine at compile time the types of parameters that you can put in there.
All that would probably require significantly changing how Python is compiled for quite a meh feature
and if you desperately want that, you probably should use idris or something
haskell can do that with postgres just fine. You just write your SQL and at compile time it checks it against your schema
well, doing that in python would probably require turning python into haskell, at least partially
and if it's possible for type checker to perform potentially Turing complete computations at compile time, it is logically possible to, for example, perform computations in type annotations, and that's another rabbit hole...
it also means you can have an infinite loop in your type checker
there are checkers for this in dependently typed langs like agda, coq, idris
class MySigil:
    x = 1
    while True:
        if collatz_conjecture_is_false(x):
            break
        x += 1
it'd be quite challenging for mypy to know if it's an infinite loop 
from the little I've seen, in agda & coq it's impossible to even construct an infinite loop, right?
@slim island thanks, I wont.
Except this is Python, and using a syntax error to bail with a friendly upgrade message is "advanced."
if you read the channel topic, you see that your message doesn't fit at all
I just did. And I disagree.
Thanks. I will.
If you'd rather engage on useless rule pedantry rather than meaningful discussion about handling a 2->3 transition that has languished for 12 years... you go for it
speaking of, any cases in the wild where python 2 is still used?
@viral hawk Hey, firstly you didn't start any discussion just posting an error message, second the error message isn't even about the transition from 2->3 and thirdly it does not fit this channel
@brazen jacinth i'm sure there are many
lemme rephrase, a large scale one?
probably
sigh
debian probably still has py2 as the system python
afaik some windows c++ build tools still carry py2, also npm afaik
there are still large companies running XP so its almost certain
Large scale or proprietary usages are the ones that will stay long
certain Audio stuff with npm uses py 2 still iirc
I think JPM has some big projects that still aren't python 3
@viral hawk do you have a better idea about how to do better on python 2?
inkscape also has py2 in it
@viral hawk if you want to use f-strings in your code, there's no way to avoid the syntaxerror. I'd say they did a good thing making sure the error made some sense.
cant you just get the python version and eval a old bit of py2 or older code and then exit
if there are f-strings further down in the file, there will be a syntax error before you get to run the eval.
oh yeah
not much you can really do about the parser scanning it
If it's a module, the best you can do is just specify the supported versions
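One common workaround, sketched here under assumptions: keep the entry-point file free of newer syntax, check the version there, and only then import the module that uses f-strings. `MINIMUM_VERSION`, `version_error` and `real_program` are hypothetical names.

```python
import sys

MINIMUM_VERSION = (3, 6)  # assumed minimum; f-strings arrived in 3.6


def version_error(version_info, minimum=MINIMUM_VERSION):
    """Return an error message if the interpreter is too old, else None."""
    if tuple(version_info[:2]) < minimum:
        # %-formatting on purpose: this file must parse on old interpreters.
        return "This program needs Python %d.%d or newer." % minimum
    return None


message = version_error(sys.version_info)
if message is not None:
    sys.exit(message)

# Only now import the code that uses newer syntax (hypothetical module):
# import real_program
```

For packages, declaring the supported versions with setuptools' `python_requires` lets pip refuse the install before any code is parsed.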
interestingly where java shines
you can run java1 code on java14 without much problems
afaik, even a typo was written in a stdlib function, and it remained there just to retain backwards compatibility
not quite the case
right, this is the other way around, brainfart
Not sure how that's a Java thing, Python along with many languages are designed with backwards compatibility in mind
The only breaking changes introduced were Python 2 -> 3 and 3.6 -> 3.7 where async finally became a keyword
ehh, strings -> bytestrings are different oh i see now, misread the 2->3, but java remains backwards compatible across major versions as well, of course with overhead, but still, a different story i'd say
a few stdlib module functions also changed
i mean, mostly yea, it retains it
but when it doesn't, it's a pain
a lot of java code relies on pretty complex behaviours like classloading which did have major changes in many versions and require quite a bit of effort to port. PHP is very good with backwards compat, but even there some features were too broken to keep
CancelledError, dict ordering are some more minor things that are backwards incompatible
I prefer to think of 2 and 3 as different languages in that sense
Relying on dict ordering prior to 3.6 was a mess on your part in the first place
oh nevermind, java 8 is basically java 1.8, still makes semver sense
nevermind then
I guess C&C++ are good at keeping backwards compatibility?
Java SDK upgrades can be nightmare
Hello
Does anyone know how to stop a program in a specific place ?
There is a big function and I want to end the program when the function ends
@feral heath This is not the right channel for getting help, please use #❓|how-to-get-help. This channel is for discussing the language itself.
Ok 🙂
@spiral willow Agreed
@grave jolt I can't think of much that has been removed from C/C++, just a new thing that's added and recommended. Like, bool vs BOOL. bool is a true boolean which is recommended vs BOOL which can be any int value.
Wait, so which of the two got added?
Interesting
inline kinda changed?
why int and not, say, char?
<Typically, CPUs are fastest at operating on integers of their native word size (with some caveats about 64-bit systems).>
Does sendall work on UDP sockets? (Im using the standard socket library)
ok
any ideas how to make a bot to detect stocks
What specifically about stocks are you wanting to detect? price increase / decrease, new IPOs, etc?
Sounds like a question for #python-discussion. This channel is specifically to discuss the Python programming language itself, from a higher-level and more abstract perspective.
@wild vale I would not use python
@wicked breach why not?
speed. But if it is not a problem, then python is ok. Obviously depends on what you want to do
speed shouldn't be an issue, since you're mostly worrying about I/O
Well, you could write a bot interface and the stock watching program in different languages.
You can worry about IO and python can still be too slow 🤷♀️ maybe you care about the latency of the rest of the program
Has anyone ever toyed with making a metaclass where you implement the dunders to work on the class and not instances of the class?
Ive never really used metaclasses before but im finding it quite interesting
class Unit_Constant(type):
    _constant = 5
    _units = "kg"
    @property
    def constant(cls):
        return cls._constant
    def __add__(cls, other):
        return cls.constant + other
    def __str__(cls):
        return f"{cls.constant} {cls._units}"
    def __repr__(cls):
        return str(cls)
class A(metaclass=Unit_Constant):
    pass
I dont know why someone would ever do that but the idea of operating on the class itself and not the object, seems interesting. I was wondering if anyone knew practical use cases?
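A trimmed-down sketch of how that behaves in practice. `UnitConstant` mirrors the metaclass above and `Metre` is an invented second class. Operators on the class object are looked up on the metaclass, and instances do not inherit them.

```python
class UnitConstant(type):
    """Metaclass: dunders defined here apply to classes, not their instances."""

    _constant = 5
    _units = "kg"

    def __add__(cls, other):
        return cls._constant + other

    def __str__(cls):
        return f"{cls._constant} {cls._units}"


class A(metaclass=UnitConstant):
    pass  # falls back to the metaclass defaults


class Metre(metaclass=UnitConstant):
    _constant = 2  # class-level override wins over the metaclass default
    _units = "m"


total = A + 3
label = str(Metre)
print(total, label)
```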
I never thought to do that
Maybe something like, record types that you can combine for pretty annotations
Classes are singletons, right?
So that could be a use case I guess, operations between singleton, probably for a DSL syntax
@undone hare classes are instances of metaclasses, so not really...?
In software engineering, the singleton pattern is a software design pattern that restricts the instantiation of a class to one "single" instance. This is useful when exactly one object is needed to coordinate actions across the system.
Yeah no, you're right
But at the same time, there's always one class object
sure, only one class object exists, but afaik the singleton pattern refers to the existence of only one instance in the global namespace (iirc), not the class itself
well, yes, but that's like saying in [[i] for i in range(5)], each inner list is a singleton
it has different attributes, but it's still an instance of list
similarly, each instance of a metaclass has different attributes (the class's methods and attributes), but that doesn't make them singletons.
hello guys, this is my code but does not work. Why?
import kivy
from kivy.app import App
from kivy.uix.floatlayout import FloatLayout
class clsApp(App):
    def build(self):
        return FloatLayout()
if __name__ == "__main__":
    clsApp().run()
@mint ruin wrong channel, ask in #user-interfaces or a help channel
ok sorry
anyone good at http server? need help
How to type-hint nested iterables of int? Something like (this of course doesn't work):
from typing import Union, Iterable
mInt = Union[int, Iterable[mInt]]
probably with TypeVar
You can use a string forward ref
Cannot resolve name "mInt" (possible cyclic definition) mypy does not like that
I think it worked with pycharm when I tried, but a human ought to understand it if tools can't
https://github.com/python/mypy/issues/731
It's a very, very long-demanded feature.
Discussion on the use cases, implementation and future of the Python programming language including PEPs, advanced language concepts, new releases, the standard library, and the overall design of the language.
okay thy Charlie I will delete that and put in on discord.py
@hot inlet neither mypy nor pyright can handle that, unfortunately. However, recursive classes are possible.
class MInt:
    attr: Union[int, Sequence["MInt"]]
    def __init__(self, attr: Union[int, Sequence["MInt"]]):
        self.attr = attr
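A self-contained version of that recursive-class workaround, with a small traversal to show it is usable. `flatten` and `nested` are illustrative additions.

```python
from typing import Iterator, Sequence, Union


class MInt:
    """Wrapper whose payload is either an int or a sequence of MInt."""

    def __init__(self, attr: Union[int, Sequence["MInt"]]) -> None:
        self.attr = attr


def flatten(value: MInt) -> Iterator[int]:
    """Yield every int in the nested structure, depth first."""
    if isinstance(value.attr, int):
        yield value.attr
    else:
        for child in value.attr:
            yield from flatten(child)


nested = MInt([MInt(1), MInt([MInt(2), MInt(3)])])
print(list(flatten(nested)))
```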
@grave jolt a more practical use case: annotating self
huh?
Maybe something like, record types that you can combine for pretty annotations
@paper echo This is pretty much what I was thinking. At my job we have a module of just constants. You unfortunately cannot doc string them so I am thinking maybe just converting them to this, plus some class methods to change units on the fly.
@swift imp Sphinx supports docstrings on constants and class attributes like this
LETTER_A = 'A'
""" The letter 'A' """
class PointOnEarth:
    lon: float
    """ Longitude """
    lat: float
    """ Latitude """
    ele: float
    """ Elevation """
but they dont show up in help right?
ah, maybe not
lets try something
yeah they wont
you can't assign to str.__doc__
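A small check of why `help()` cannot pick those up: a plain `str` constant has no writable `__doc__` of its own and just exposes the one from `str`. The variable names are only for this sketch.

```python
LETTER_A = "A"

try:
    LETTER_A.__doc__ = "The letter 'A'"
    assignment_worked = True
except AttributeError:
    # str instances have no per-object __doc__ slot to write to.
    assignment_worked = False

# The constant only exposes the docstring of its type.
inherited_doc = LETTER_A.__doc__ == str.__doc__
print(assignment_worked, inherited_doc)
```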
but yes i've also done things like this
from dataclasses import dataclass
from typing import TypedDict, Union
BusinessIDType = Union[str, int]
class BusinessDict(TypedDict):
    id: BusinessIDType
    name: str
    cash_on_hand: float
@dataclass
class Business:
    id: BusinessIDType
    name: str
    cash_on_hand: float
    @classmethod
    def from_dict(cls, data):
        return cls(**data)
BusinessType = Union[Business, BusinessDict]
just for prettier looks
idk if theres a better name than BusinessType
Wow, I didn't know about that. Really saves a lot of boiler plate
the idea here is that your functions can accept both literal dicts and instances of the dataclass
i think its kind of a bad pattern tbh
but for "less advanced" users it can be a lot less scary to use
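A minimal sketch of a function written against that union, assuming Python 3.8+ for `typing.TypedDict`. `cash_on_hand` is a made-up helper, and the classes repeat the ones above so the block runs on its own.

```python
from dataclasses import dataclass
from typing import TypedDict, Union

BusinessIDType = Union[str, int]


class BusinessDict(TypedDict):
    id: BusinessIDType
    name: str
    cash_on_hand: float


@dataclass
class Business:
    id: BusinessIDType
    name: str
    cash_on_hand: float

    @classmethod
    def from_dict(cls, data: BusinessDict) -> "Business":
        return cls(**data)


BusinessType = Union[Business, BusinessDict]


def cash_on_hand(business: BusinessType) -> float:
    """Accept either the dataclass or a plain dict with the same keys."""
    if isinstance(business, Business):
        return business.cash_on_hand
    return business["cash_on_hand"]


record: BusinessDict = {"id": 1, "name": "Acme", "cash_on_hand": 12.5}
print(cash_on_hand(record), cash_on_hand(Business.from_dict(record)))
```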
What we do in #internals-and-peps ??
@unkempt rock it is pretty simple, people have advanced discussions in here, try to avoid this channel if you dont have anything to discuss or dont have any topic to talk about related to python programming enc...
I will say the same thing I did before: documentation generation tools like autosphinx have a shortcoming where they will use the definition of a type alias instead of the given name
this can make your documentation confusing to read when it's produced
hello, i create a discord bot and i am wondering which module to use for image processing? example
@grim tusk thats a better question for a help channel, see #❓|how-to-get-help . please read the channel topic
it was pasted literally in the post above yours
Oups sorry
the issue with python typehints is that i can never keep up with what to use
it's never intuitive 👀
its honestly no where near as bad as some
like even strict typed langs can lead to awful syntax stuff for type hinting
{"a": int, "b": float} > TypedDict("Something", {"a": int, "b": float})
Well, C++ has quirky template syntax; and cannot infer everything because of its conversion (coercion?) rules
Like that is wayyy simpler than what Rust's typing becomes when they're bigger apps
Overall Py's isnt too bad because atleast it becomes mostly simplified and doesnt turn into the above mess as the linter lints every single function the var has ever been through
shoulda used haskell 
honestly, why do we not have type hints like
a: [(int, int)]
yeah i agree that should be possible
I don't know what to do with callables, though...
typing.Callable™️
ew
typing.Callable[foo, bar, [foobar, fee, foo]] amazing 🌈
Actually Callable[[foo, bar], (foobar, fee, foo)]...
And if you have a frequent callable type, you can alias it.
But yes, it's pretty cumbersome.
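The aliasing idea in a few lines: the argument types go in a list and the return type comes second. `Predicate` and `keep` are names invented for this sketch.

```python
from typing import Callable, List

# Alias for "a function taking one int and returning a bool".
Predicate = Callable[[int], bool]


def keep(items: List[int], predicate: Predicate) -> List[int]:
    """Return the items for which the predicate is true."""
    return [item for item in items if predicate(item)]


def is_even(n: int) -> bool:
    return n % 2 == 0


evens = keep([1, 2, 3, 4], is_even)
print(evens)
```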
I think I said it about 200 times now, but I started to love type hints more when I switched from mypy to pyright. It can infer many things that mypy can't
i use typehints for documentation, simply makes my code cleaner
if i'm returning a crazy type that can't be properly modeled then i'm probably building a bad interface
@grave jolt pyright doesnt need to be compiled like the msoft python lang server right? its just an npm package?
Yes, it's just an npm package
if you have VSCode, it can be just installed as an extension via a button click
Or it can run as a command line tool.
yeah id use it for the latter
Some people on reddit were complaining that it's written in TypeScript and not Python, I don't get why that's particularly bad
i actually like that tools can be written cross language
but one annoyance is that python devs now need a nodejs installation to use it
regarding type hints, how would you annotate something like "a pandas or spark dataframe with a fixed schema"? meaning, it supports __getitem__ but only with specific keys. so something with the same idea as a TypedDict, but for pandas.DataFrame objects instead
@hollow anvil asked this in #data-science-and-ml and i didn't have a good answer, but i'd like to know
is it possible to write your own generic to do this?
or just use a protocol
hmmm
Well, I guess you could make a phantom subclass which is also generic
itd also be nice to write custom generics anyway to detect things like pd.Series with a specific data type
It's a trick I used to make str to be typed SQL queries
so pd.TypedSeries['float'] for a pd.Series with dtype='float'
oh?
how does that work
class Query(str, Generic[Q, U]):
    pass
LOREM_IPSUM = Query[Tuple[str, str], Tuple[int]](
    "SELECT dolor FROM table WHERE sit=? AND amet=?"
)
and then you can write strongly-typed SQL-accepting functions
that means what? the parameters are 2 strings and the result must be 1 int column?
Yep.
clever.
but type annotations can't infer that, right? you'd have to manually specify
so e.g. mypy can't know how to infer anything about Query "instances"
but if you manually specify it should work, right?
Well, it can't parse SQL and take into account the DB schema
exactly
and even if it could, you'd have to hack that logic into the type checker and not the type annotation itself
right?
If you specify it correctly and create well-annotated (generic) function, it should beat you up when you're doing something wrong.
def fetch_one(sql: Query[Q, U], arg: Q) -> Optional[U]:
    ...
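Putting the pieces together as a runnable sketch against an in-memory SQLite database. The `users` table, `COUNT_BY_NAME` and the extra `conn` parameter are assumptions for the demo. Nothing here checks the SQL itself; the annotations only let a type checker compare the arguments and result with what the query was declared to take and return.

```python
import sqlite3
from typing import Generic, Optional, Tuple, TypeVar

Q = TypeVar("Q")  # type of the query parameters
U = TypeVar("U")  # type of one result row


class Query(str, Generic[Q, U]):
    """A plain str at runtime; Q and U exist only for the type checker."""


def fetch_one(conn: sqlite3.Connection, sql: Query[Q, U], args: Q) -> Optional[U]:
    """Run the query and return its first row, or None."""
    return conn.execute(sql, args).fetchone()


COUNT_BY_NAME = Query[Tuple[str], Tuple[int]](
    "SELECT COUNT(*) FROM users WHERE name = ?"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('ada')")
row = fetch_one(conn, COUNT_BY_NAME, ("ada",))
print(row)
```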
i see
So maybe you could do a similar trick with a dataframe.
yeah let me start with series
and build from there
from typing import Generic, Type, TypeVar, Union
import pandas as pd
T = TypeVar('T', bound=Union[str, Type])
class Series(pd.Series, Generic[T]):
    pass
y: Series[float] = pd.Series([1.0, 2.0, 3.0])
like this?
Yes, but float is not a subclass of Type
...its not?
T is bound to Type means that T is a metaclass
ah right
well
it's really a finite list of things that can be there
i guess you can enumerate it as a union of literals
right
I don't know pandas, so that's on you
sadly there's no str dtype 😦
I don't know how well mypy will work with that, I haven't tested, tho
hold on
oh ok nvm
this is for bound=
idk though
it has to be literally float
for example
pd.Series([1.0, 2.0, 3.0]) the dtype is float
so it's parameterized by float or int or something
think more abstractly about a typed array or typed list
the parameter is a type
im not thinking clearly about this right now
The parameter is always a type
right
anyway TypeVar lets you do this TypeVar('T', float, np.float32, np.float64, Literal['float'], <etc>)
so that's fine
that part isnt important anyway
from typing import Generic, TypeVar, Union
import pandas as pd
T = TypeVar('T')
class Series(pd.Series, Generic[T]):
    pass
y: Series[float] = pd.Series([1.0, 2.0, 3.0])
ignore the binding for now
yeah, you don't need to bind it
Well, uhhhhhhh
the problem is that the methods on Series aren't typed
so you might put some annotations, I think?
Or you could make dumb wrappers
what do you mean? i was thinking you could make a function like f(data: Series[float]) -> Series[float]
yeah i know pandas doesnt have type stubs
right
interestingly this type stub library already has generics at least for numpy arrays https://github.com/predictive-analytics-lab/data-science-types
But if the methods are typed as well, at least pyright can infer the return type of that function (in most cases)
don't know about mypy
just insert that after every of my messages 🙂
so this library just wrote the type stubs to be generic
hmmm
this is actually an impressive effort
well, you could use that then
i think i will
the original question was about spark dataframes
but its the same technique
ohh
well my question was about pandas 🙂
but this was prompted by a spark question in #data-science-and-ml
good luck on your (or not your?) typing adventure then 👍
@hollow anvil see above ^ if you're feeling ambitious and want to port that logic over to pyspark
thanks
this was helpful
@paper echo there's a str type in pandas now https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#dtypes , unless i've missed your point (#internals-and-peps message) here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.StringDtype.html#pandas.StringDtype
That's new
is the string dtype nullable? that is the real question
Thats fine as long as it doesnt flip to O dtype if it has nulls
what's O, object?
does anyone use importlib.resources? I bumped into it today and was a bit confused about when I would use that instead of just importing something from a data dir in a project.
when does something stop living in ./data and start being a resource somewhere 🤔 or vice versa
https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html#text-types ill have to look in here re: nullable
yes O is object
and yes i use importlib resources
the purpose is when you need to package data with your library
its not for data science projects
for large amounts of data you should still have some kind of "data downloader" functionality like spacy or nltk for their models
but for small things its really handy
e.g. a jsonschema file
@magic python
if your package is like this:
my_package/
    __init__.py
    data.py
    inputs.jsonschema
then data.py might contain:
import json
from importlib import resources
import jsonschema
# open_text returns a file object; read_text returns a str and can't be used in a with block
with resources.open_text('my_package', 'inputs.jsonschema') as fp:
    _input_schema = json.load(fp)
def validate_input(data):
    jsonschema.validate(data, schema=_input_schema)
so you can just from my_package.data import validate_input and be on your way
you'd also have to add inputs.jsonschema to your MANIFEST.in file if you're using setuptools for packaging
@paper echo hmm - do you ever use it within datascience projects or not?
i use it for libraries supporting data science projects
for example let's say i wrote a library to clean addresses
are you aware of any examples I can google?
that includes a bunch of data files
e.g. a list of all currently active zip codes
or an historical dataset of zip codes
right so lookup tables and stuff
yeah exactly
why not have them in ./data tho
because that's part of the library
it's not part of your project
if someone else wants to use your address cleaning library, they need to get the data files
ok - i get that - so you wouldn't use them in ./tests and stuff then
as long as the data files aren't obnoxiously huge, it's much easier on the user if you just include them inside the package alongside the python code
a data/ directory for a project is kind of a different beast
is there a rule of thumb on what large is in the context of a git repo?
like 5M ? 🤔
I would consider that large
yeah it depends on your users
if im writing an internal library for my company i'll tend to be a bit looser with packaging big data files
anything >1M actually I'd think is pretty noticeable - looser meaning what, >5M?
here i might have an example
yeah idk
enough that the download from a package index doesnt take forever
or that your git host doesnt complain at you
i want rules of thumb
eh just try it
if you hate how long the install takes / how much space it takes on disk
change what you do
the alternative for "library" code is to look in a fixed or configurable directory
which is what nltk does
or you can do what spacy does and download the data as a package but not distribute it through pypi
which is actually a very interesting option
yeah i was wondering about that kinda approach for some stuff - like pytorch / sklearn and stuff have for their data i guess
TIL pytest -p no:randomly --looponfail
you can for example do this
try:
    with resources.path('my_package._models', 'a-big-model.json') as model_path:
        model = load_model(model_path)
except IDontKnowWhatExceptionTypeGoesHereIfTheFileIsMissing:
    raise ImportError('You need to download the models first, with my_package.download')
here's a use case actually
writing stuff that's not thrown away immediately
https://github.com/gwerbin/scientific-colourmaps/blob/master/python/scientific_colourmaps/__init__.py
my library could and probably should package the color map files internally
instead of expecting the user to download them
oh dear that says my real name and employer
What's the difference between my own library and making it a package?
probably should package the color map files internally
as in - store them within the repo
store them within the python package itself
not just in the source code repo
@pseudo cradle there are 2 meanings to the word "package":
- a software package such as you might install with a package manager
- a python module that contains other modules (submodules), corresponding to a directory with an __init__.py file
I read "if your package is like this:" and was wondering what the difference was between making a package and just having a library within your project.
Oh
its really really unfortunate that the 2nd definition uses the word "package" instead of something else
but so it is
sorry i'm being thick - i don't see what the diff between having them in the python package itself and the repo is
i mean - what's the functional implication of this, what changes, :S
this is the source repo:
scientific-colourmaps/
    src/
        scientific_colourmaps.py
    tests/
        test_scientific_colourmaps.py
    data/
        colours.txt
    setup.py
    README.md
yea - so you'd have them in src i guess, but - it seems the functionality is the same no?
this would be keeping the data inside the python package:
scientific-colourmaps/
    src/
        scientific_colourmaps/
            __init__.py
            colours.txt
    tests/
        test_scientific_colourmaps.py
    MANIFEST.in
    setup.py
    README.md
not at all
interesting
the latter can be easily packaged into an sdist or bdist and uploaded to pypi
the former can't, at least without extra work
so it just enables people to use pip?
yes, and for the data to always be located alongside the python source files
interesting
so you never have to fiddle with finding the current file path, you never have to ask people to download data separately, etc
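A rough, self-contained sketch of that point: a data file that lives inside a package can be read by package name, with no path arithmetic. The package `demo_colourmaps` and its `colours.txt` are hypothetical and are built in a temporary directory purely so the example runs on its own; `resources.files` needs Python 3.9 or newer.

```python
import importlib
import sys
import tempfile
from importlib import resources
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway package with one data file next to __init__.py.
    pkg_dir = Path(tmp) / "demo_colourmaps"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "colours.txt").write_text("red\ngreen\nblue\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Look the file up by package name rather than by filesystem path.
        text = resources.files("demo_colourmaps").joinpath("colours.txt").read_text()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_colourmaps", None)

colours = text.split()
```

In a real project the package would be installed with pip instead of being put on sys.path by hand.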
glad you changed that, embedded was confusing 😅
its absolutely the preferred solution when the data files "arent too big"
again, the definition of "too big" being up to personal preference
its dinner time but @ me if you have other questions. significantly underrated feature of python imo
one of the great use cases imo is what i demonstrated above with jsonschema
i'm going to declare 5M the rule of thumb for files that are too big
seems reasonable 😄
😎
Even a bit conservative
I've made a statement on line, i'm sure I'll be corrected if wrong
cool, cheers @paper echo 👍
numpy is the best python module
Will I be corrected if I'm wrong? 😄
stdlib itertools is pretty awesome, but I'm a huge fan of collections
collections isn't used enough by me
Isn't it so useful though? It's like, all the data structures I ever wanted from other languages again when lists and dicts won't cut it.
I don't find myself using collections that much
used to use it for namedtuples a lot but now I usually use the typed ones from typing
ah defaultdict is the one I use a lot
also imported OrderedDict today for a 3.6 project
pandas is the main thing i use by a mile
Data scientist?
i work with data, i'm always unsure about that role title
I do algorithms, so numpy is my main one, and scipy.
Gotcha, that's fair
I need to get more comfortable with pandas
I don't think I've ever used anything from typing
i've managed to avoid multi-index far too much
@pseudo cradle today I learned that there is a fairly complete type stub library for numpy and pandas
I honestly think that asyncio rocks, but it could do with some UX overhaul (it is getting that with recent releases tho)
I love async Python
If only it could make use of as much processing power as other langs that don't have a GIL
Pypy doesnt have a GIL right?
afaik pypy-stm also only supports Python 2.7
👍
Installed it, but it needs to step it up, haha. Lots of functions/operators it doesn't recognize
@tawny shoal we have TSM?
STM yes but only for python 2.7
If STM concurrency manages to appear in Python I still don't see how it could be utilized especially this late
Would the python implementation be thread safe?
@pseudo cradle they said in the readme its incomplete and PRs accepted
Nod
Fortunately most pandas and numpy functions dont have that complicated signatures
And they already define types for "array flavored things accepted by the ndarray and Series constructors"
@red solar Based on my limited knowledge of STM from Haskell it has to be relatively thread-safe by nature
Another advantage of STM is the fact that it's easy to compose operations and treat them as a single one, though I'm not quite sure what the API for that would look like in Python
If I make my own google sites im allowed to use it for web scraping right?
Hi everyone, I am Titin and I have been struggling with Arduino and Pi communication. I have an ultrasonic sensor on the Arduino and a webcam on the Pi, both used as counters. I send the sensor's counter data to the Pi and process it there, then I need to send it back to the Arduino because the display (an LED matrix module) is on the Arduino. Can anyone help me?
b = True
import time
a = "print('\n>>>')"
g = open('module4.py','w')
j = open('module4.py','r')
while b == True:
    g.write(a)
    ll = j.read()
    if ll == 20:
        break
print('Made by "Hoodies" and written by pyton with Microsoft Visual Studio 2019')
not a help channel
oo sry
Can i import a file that isn't at the same folder as my main file?
like im doing import random...
Yes, if it's in a sub directory of your root directory.
and if its in other directory?
it is possible
Well, you still can, but it's ill-advised.
You'd have to add its directory to sys.path manually.
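A small sketch of the sys.path approach, assuming a hypothetical module `helper_mod.py` that sits in some unrelated directory. The temporary directory stands in for that "other directory" so the example runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as other_dir:
    # Pretend this file lives outside the project tree.
    Path(other_dir, "helper_mod.py").write_text("ANSWER = 42\n")

    sys.path.append(other_dir)      # make the directory importable
    importlib.invalidate_caches()   # the file was created after startup
    try:
        import helper_mod
    finally:
        sys.path.remove(other_dir)  # undo the global side effect

answer = helper_mod.ANSWER
```

Because sys.path is process-wide state, every later import is affected while the entry is present, which is one reason this is usually discouraged.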
thanks
import importlib.util
spec = importlib.util.spec_from_file_location("module.name", "/path/to/file.py")
foo = importlib.util.module_from_spec(spec)
spec.loader.exec_module(foo)
foo.MyClass()
🙏
I think extending os.path might make more sense
lulwut
If anyone is interested in contributing to PyInstaller, one of the core devs is on r/Python looking for people with a good grasp of Python https://www.reddit.com/r/Python/comments/i1lid8/pyinstaller_developers_wanted/
I've had a look before because development seemed to be stuck on 3.8 for a while but the project in general seemed like a more complex issue to tackle
Hello, when I use pyinstaller to compile python code into an executable, the executable doesn't work and seems not to pack in the imported modules. I am using debian 10 buster and python3.
Are there any cases where using % formatting with strings would still be preferable to f-strings/.format? If so, why, what are the differences that lead to it being a more viable option?
@wind lion try create an venv (in that dir) and install the library that u use + pyinstaller then use that
The most viable option imo is using f-strings, because you immediately see what will be inserted where, and inserting or removing an interpolation will not break the whole thing
% is useful if you want to be less strict about input data types
'{:d}'.format(3.1) is an error from what i recall, but '%d' % 3.1 isn't
In general this is somewhat of a smell anyway so I don't think it's much of an advantage
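The difference in strictness can be checked directly. This short sketch only uses built-in string formatting; the variable names are illustrative.

```python
# %-formatting truncates the float to an integer for %d
percent_result = "%d" % 3.1

# str.format refuses the 'd' format code for a float
try:
    format_result = "{:d}".format(3.1)
except ValueError as exc:
    format_result = None
    format_error = str(exc)
```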
Fair enough
I was mostly wondering because I've seen it get recommended for SQL queries and such
Somebody recommended formatting SQL queries like QUERY % (a, b, c)?
Something along those lines, yeah
% is lazy
Which is why its recommended for stuff like logging as its the quickest formatting method iirc
f-strings are the slowest from my knowledge
For logging it's lazy because you don't apply the % operator yourself, you just pass the template, no?
I don't see how string % input could be a lazy operation
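To illustrate the distinction being made here: the laziness comes from the logging call receiving the template and the arguments separately, not from the % operator. A minimal sketch, where `Expensive` is a hypothetical class that counts how often it gets converted to a string:

```python
import logging


class Expensive:
    """Counts how many times it is formatted."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.WARNING)  # DEBUG messages are disabled

obj = Expensive()

# Template and argument passed separately: nothing is formatted,
# because the DEBUG level is not enabled.
logger.debug("value: %s", obj)
calls_after_lazy = obj.calls

# Applying % yourself formats the string before logging even sees it.
logger.debug("value: %s" % obj)
calls_after_eager = obj.calls
```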
In SQL you should never use plain formatting like f-strings, .format or % because it can lead to incorrect formatting or SQL injection.
Every SQL adapter has more or less the same interface for passing parameters
https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries
i mean doing QUERY % (a, b, c) will do the same thing
You don't format SQL queries yourself, but there are C style place holders
Psycopg2 is a shit show of sql formatting
Like sqlite's ?, Other libs use for example %s
execute(QUERY % (a, b, c)) is NOT the same as execute(QUERY, (a, b, c))
i thought you were talking about doing "%" over format or f strings because it stops sql injection
Does % perform input sanitation?
the library functions do the sanitation
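A small sketch of why the two calls differ, using sqlite3 (which uses `?` placeholders) as a stand-in for any DB-API driver. The `users` table and the malicious input are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

malicious = "nobody' OR '1'='1"

# String formatting splices the input into the SQL text itself,
# so the quote characters change the meaning of the query.
unsafe_sql = "SELECT name FROM users WHERE name = '%s'" % malicious
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Passing parameters separately lets the driver treat the input as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()
```

The formatted query matches every row, while the parameterized one matches none.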
I think it's misleading that the query placeholder syntax is the same as the % formatting syntax, and it leads to people using %
yes
MySQL is one thing with its %s
but its Psycopg2 doing it which is another
not to mention holy fuck how it recommends doing some weird String formatting for tables and columns and all this other junk
😔 We could really do with a sync version of asyncpg
@radiant fulcrum people were asking for a sync version in 2017
So, to clarify, is there a reason to use % (as in string % input) in your own code or is it just that libraries use the same C-style formatting syntax?
speed
I don't think % is way faster than f-strings
The main problem is that it is the only Python implementation of the binary PSQL protocol. Great use case for sans-io
not to a notable degree
I remember Raymond Hettinger (I think?) tweeting about f-strings being the fastest way to format
still doesnt work
!e
import dis
def f():
    return f"hello {x} + {y} = {z}"
def g():
    return "hello %s + %s = %s" % (x, y, z)
print("f:")
dis.dis(f)
print("g:")
dis.dis(g)
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
001 | f:
002 | 4 0 LOAD_CONST 1 ('hello ')
003 | 2 LOAD_GLOBAL 0 (x)
004 | 4 FORMAT_VALUE 0
005 | 6 LOAD_CONST 2 (' + ')
006 | 8 LOAD_GLOBAL 1 (y)
007 | 10 FORMAT_VALUE 0
008 | 12 LOAD_CONST 3 (' = ')
009 | 14 LOAD_GLOBAL 2 (z)
010 | 16 FORMAT_VALUE 0
011 | 18 BUILD_STRING 6
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/cecexewixa
i dont get it
@wind lion try create an venv (in that dir) and install the library that u use + pyinstaller then use that
@silk spade ok, i will try
@split escarp This is not a help channel. Check out #❓|how-to-get-help or #discord-bots
I got 443 ns for f-strings and 415 ns for %-strings
so if that's your bottleneck...
Apparently f-strings are fastest on short strings, but get progressively slower as the string gets longer
I guess % formatting runs entirely in C, whereas fstrings compile to bytecode
yes, that's probably the issue
def f():
    return f"hello {x} + {y} = {z} {x} {y} {z} Lorem ipsum dolor sit amet blah blah blah {x} {y} {z}"
def g():
    return "hello %(x)s + %(y)s = %(z)s %(x)s %(y)s %(z)s Lorem ipsum dolor sit amet blah blah blah %(x)s %(y)s %(z)s" % {"x": x, "y": y, "z": z}
Here I got 1.03 us for f and 1.16 us for g.
Well, yes
and it turns out in some cases an f-string can be faster (still marginally)
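For anyone who wants to repeat the measurement, a sketch along these lines should do. The values of x, y and z and the iteration count are arbitrary assumptions, and the absolute numbers will vary by machine and Python version.

```python
import timeit

x, y, z = 1, 2, 3


def f():
    return f"hello {x} + {y} = {z}"


def g():
    return "hello %s + %s = %s" % (x, y, z)


N = 100_000
f_time = timeit.timeit(f, number=N)
g_time = timeit.timeit(g, number=N)

# Report the average cost per call in nanoseconds.
print(f"f-string: {f_time / N * 1e9:.0f} ns per call")
print(f"%-format: {g_time / N * 1e9:.0f} ns per call")
```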
why is it giving me an error?
@rapid relic This is not a help channel. Check out #❓|how-to-get-help
You wrote selfself instead of self
So self is undefined
@grave jolt oh thanks thats the problem
I don't know what course you're attending, but f-strings are way better for formatting (see my snippet above ^)
i just started learning Python so
.format would be better in this case tho
true
agree
Is there any python code where, if a bool was replaced with an int with the same value, the semantics would change?
Other than isinstance(obj, bool), I don't think so
(And is checks with booleans, I guess)
Ok sweet
& and | return bools for bools instead of ints, but I believe other than that all their operations are identical in behaviour
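A quick sketch of where bool and int agree and where they differ. Besides the cases named above, the string representation also differs, which matters as soon as a value is printed or formatted.

```python
one = 1

same_value = (True == one)                  # bool compares equal to int
same_hash = (hash(True) == hash(one))       # so they collide as dict keys
arithmetic = True + True                    # behaves like 1 + 1

and_of_bools = type(True & False).__name__  # stays a bool
and_of_ints = type(1 & 0).__name__          # stays an int

identity = (one is True)                    # different objects
type_check = isinstance(one, bool)          # int is not a bool
text_forms = (str(True), str(one))          # different text output
```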
Hello, I was wondering if there was a way to generate code. Like to generate boilerplate code
What's used to do that? Would it be python or some other tool
Like... for another language? Or for python? (Both are possible)
If you have a lot of boilerplate code, you might be better off if you find a better abstraction...
unless it's C or something
An IDL like thrift or protobuf would be an example imo, and they use a compiled language for it, but nothing stopping you from using python
(Unless I misunderstood what you meant by boilerplate)
Yes for python
I'm actually using django for a web app and want to find a way to automate the creation of models
Lol something like that
Lots of the models are very similar and so I wanted to bulk create the models.py files
I mean, you could just create classes at runtime with type() or use a common superclass
wouldnt inheritance or <insert field copyer thing from django's orm that ive forgot>
do the job
Probably the superclass
It doesn't need to happen at runtime. I can copy the generated files where they need to be. I just want a quick way to generate the models with certain fields.
I don't fully understand what you mean by inheriting from django @radiant fulcrum
lak's suggestion of superclass would work better
but you could just have a base model and each new model or whatever just inherits that
Ok that actually helps. I understand that concept.
Then I'd just need to change the fields somehow
just override them
When you inherit from a class, the subclass gets all of the base class's attributes and methods
Got it
Thanks
But how would I get the actual models written in models.py format?
Because I would then be copying and pasting the generated models into models.py
i mean if you're inheriting you wouldnt have to generate them
they would work on runtime
from django.db import models
class SomeBaseModel(models.Model):
    some_field = SomeFieldType()
class MyDupeModel(SomeBaseModel):
    some_extra_field = SomeFieldType()
MyDupeModel in this case would inherit the fields in SomeBaseModel as well as having the some_extra_field
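The two suggestions (a common superclass and `type()`) can be combined. The sketch below uses plain Python classes as stand-ins: `BaseRecord` and `make_record_class` are hypothetical names, and real Django models have extra requirements from their metaclass that this does not cover.

```python
class BaseRecord:
    """Shared fields and behaviour live on the common superclass."""

    created_by = "system"

    def describe(self):
        return f"{type(self).__name__}({self.created_by})"


def make_record_class(name, **extra_fields):
    # type(name, bases, namespace) builds a class at runtime,
    # equivalent to writing a class statement by hand.
    return type(name, (BaseRecord,), dict(extra_fields))


Invoice = make_record_class("Invoice", amount=0)
Receipt = make_record_class("Receipt", amount=0, paid=False)

description = Invoice().describe()
```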
I see. Is there a way to do it without relying on runtime created models? I just want regular files with models that I can copy and use anywhere I want
not really
I do understand inheritance that's how I create most of my models
if its got lots of duplicate fields and type the more 'pythonic' way would be inheriting a base class removing the need for duplicate code
rather than having a massive file filled with duplicate or similar models
I figured something like cookiecutter-django uses the method I'm talking about. Generating models
Ok I get it. Thanks
Is anyone keeping track of the idea to implement getitem for dict views?
I don't see a problem with it per se, other than that I think algorithms that rely on dict ordering should be discouraged
other than that I think algorithms that rely on dict ordering should be discouraged
curious to hear why tbh
as of 3.7 (3.6 for cpython) dicts preserve ordering
Dicts are primarily for key based lookup and if I need the key value pairs in a certain order, it's usually for displaying the output in a sorted way. So not really part of the data structure's specification
well, it is in the specification now
as of 3.6 the dict implementation changed and one side effect (I understand that it wasn't the primary purpose) was that it made dicts ordered, I'm not agreeing with them adding that to the python "standard", but this is where we are now
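A short sketch of what insertion ordering means in practice, and of the fact that dict views currently cannot be indexed. `nth_key` is a hypothetical helper showing the usual workaround, not a proposed API.

```python
from itertools import islice

d = {"b": 2, "a": 1}
d["c"] = 3
order_after_insert = list(d)    # insertion order, not sorted order

del d["a"]
d["a"] = 10                     # re-inserting moves the key to the end
order_after_reinsert = list(d)

# Views are iterable but not subscriptable.
try:
    d.keys()[0]
    views_indexable = True
except TypeError:
    views_indexable = False


def nth_key(mapping, n):
    """Walk the keys view up to position n (linear time)."""
    return next(islice(mapping.keys(), n, None))


second_key = nth_key(d, 1)
```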
a nice gift guido left us in that mailing list was that the dict implementation simply cannot change ever again to something that isn't ordered
Right
I don't think it was worth that.
But I don't know enough about cpython internals to have a say
Yeah actually i don't know why I asked you that I 100% agree with you
My bad i'm tired
So am I lol
Anyway I've never thought of a use case where knowing the insertion order for iteration but not caring about the order in any other context was exactly what I needed
For a hash table
if videos:
Yep
I'm in the wrong channel
yes
my bad guys
Since this is advanced python, I'm going to make it advanced.
For classes that implement __len__
__bool__ is defined as len not being zero.
Is anyone keeping track of the idea to implement getitem for dict views?
@boreal umbra is this the ordered dict proposal on python-ideas?
@red solar yes
oh I see
Is it worth looking at? I see it, but I was more interested in the shared memory one
I don't see
Yeah wait I don’t see the bool thing either
I don't know, I only asked about it here because I don't want to read the whole thread.
Join the club 😂
If I remember correctly, if you implement len for a class (I'm on my phone so I'm not doing back ticks)
Yeah, it's documented with the data model iirc
Then the boolean value for an instance of that class defaults to the len not being zero
Unless you override bool
Called to implement truth value testing and the built-in operation bool(); should return False or True. When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.
Oh as in __bool__() returns __len__() != 0, I misunderstood the first time
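The documented fallback can be seen with a class that defines only `__len__`. `Playlist` and `AlwaysTruePlaylist` are made-up names for the demonstration.

```python
class Playlist:
    """Defines __len__ but not __bool__."""

    def __init__(self, *tracks):
        self.tracks = list(tracks)

    def __len__(self):
        return len(self.tracks)


class AlwaysTruePlaylist(Playlist):
    """Overrides the fallback, so even an empty playlist is truthy."""

    def __bool__(self):
        return True


empty_is_truthy = bool(Playlist())
full_is_truthy = bool(Playlist("a", "b"))
override_is_truthy = bool(AlwaysTruePlaylist())
```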
There are times where I find that kind of implicit truthiness kinda vague
Idk in this case it makes sense to me
It does make sense
In my experience, most people straight up forget about it
False means no bits are set, true is anything else - a len of 0 is equivalent to no bits being set
And implement it themselves
Unless you don't know it's a container type
What else implements len?
We should scan git{hub,lab} for def __bool__(self): return bool(len(self))
strings are not exactly containers
Wonder if it improves performance by some negligible amount
It probably does
Are they not? I mean they’re immutable but apart from that
well they can't "contain" anything other than characters (which are not even actually a thing in python)
Because implementing dunders lets more of the magic happen in C
Or something
I remember reading that calling len is more efficient than calling dunder len directly
Ok I can definitely see that
Still 50/50 on the bool thing tho
Eh in any statically typed language they’d be containers 🤷♀️ python’s just fancy lol
Hmm, maybe i’ve been spoiled by std::string in C++
I'm not using some formal definition of container
my issues with calling strings containers is that they're not polymorphic and generally immutable
Yeah
tho the immutable part can be a non-issue based on the language too i guess
I would consider std::vector to be a container template, and std::vector<char> to be an instance of that container template (so a container) - and std::string is basically that with a nicer interface
So to me anything with elements is a container 🤷♀️
welcome to the flame war 👍
I think it's reasonable to say that a Python string is a specialization of an immutable array (ImmutableArray[char]). But there's no char type exposed.
Well, defining "container" to be polymorphic is entirely reasonable as well. But saying "str is a container" could also mean "str is a specialized container". But here we run into the problem that we can't retrieve an element of this container.
That makes me think, I generally find it rare to see people (including myself) using the proper python terminology from collections.abc for the various data structures
other than iterable and iterator (misused every other time)
99% of the time, it does not matter. I do sometimes say sequence, however. But mapping I have only ever used in context of **
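For reference, this is how the collections.abc vocabulary classifies the types under discussion. It does not settle the "is str a container" argument, it only shows what the standard library's own definitions say.

```python
from collections.abc import Container, Iterable, Iterator, Mapping, Sequence

s = "abc"

str_is_container = isinstance(s, Container)   # supports the `in` operator
str_is_sequence = isinstance(s, Sequence)     # indexable, has a length
str_is_iterable = isinstance(s, Iterable)
str_is_iterator = isinstance(s, Iterator)     # iterable, but not an iterator
iter_is_iterator = isinstance(iter(s), Iterator)

dict_is_mapping = isinstance({}, Mapping)
dict_is_sequence = isinstance({}, Sequence)

# There is no separate char type: an element of a str is again a str.
element_type = type(s[0])
nested_element = s[0][0][0]
```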
But here we run into the problem that we can’t retrieve an element of this container.
Recursive container 😎
I don't think it was worth that.
@boreal umbra that's a misunderstanding of the change. They implemented an optimization in 3.6 to make dicts more compact and faster. A side effect of that change was that they now preserved iteration order. Adding that requirement to the language specification in 3.7 was a reflection of the fact that people were definitely going to begin depending on that new side effect, and that in practice it wouldn't be easy to change in the future, and people would begin to write code that wouldn't work in micropython or pypy or the like unless they implemented it too
yeah but now they brought themselves another potential issue, people will write code that depends on the ordering without using ordereddict, and if someone comes up with another implementation later on, it will not be backward compatible unless it is also order-preserving
*forward compatible
Yep. Such a change would likely be reserved for 4.0, since it would be breaking.
It's also a decision that was taken WAAYYYYYYYYYYYYY too quickly imo
Like, GVR just went like "oh cool, let's add it to python3.7"
There was code relying on old dict order
deep inside a mailing list
Not a negligible amount either
It was probably wise to not have some code rely on what's undefined as per the language spec
I saw a talk where this was mentioned, let me check.
easy, just add UnorderedDict 😄
@teal yacht the core devs decided it on the python-dev mailing list, just like they decide every other issue. There was a year between CPython dict becoming order preserving and it being added to the language spec.
@flat gazelle in old versions of Python 2, hashes were deterministic across runs, so there was a stable but arbitrary order. That was changed later to fix a security vulnerability, so the order would no longer be stable. Later still dict began to preserve insertion order as a side effect of an optimization.
So, 3 distinct generations here - you're right that there was a point when it was stable but arbitrary.
Wasn't that change for string keys only? For ints and other datatypes, this vulnerability shouldn't have been so major
Some ints hash to themselves etc. I believe it's only for strings like lakmatiok said.
Could the dict memory usage be smaller if they didn't have to preserve the order?
@flat gazelle yes, you're right, only strings and bytes are salted. Others remained deterministic and stable even after the introduction of hash randomization.
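A sketch of how this can be observed in CPython. Integer hashes are computed without a salt, while string hashes depend on the PYTHONHASHSEED environment variable; `str_hash_with_seed` is a hypothetical helper that starts a fresh interpreter with a given seed.

```python
import os
import subprocess
import sys


def str_hash_with_seed(text, seed):
    """Return hash(text) as computed by a new interpreter with a fixed seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Small ints hash to themselves, whatever the seed is.
int_hash = hash(12345)

# The same seed gives the same string hash across processes.
seed1_first = str_hash_with_seed("spam", 1)
seed1_second = str_hash_with_seed("spam", 1)

# A different seed will, in practice, give a different value.
seed2 = str_hash_with_seed("spam", 2)
```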
They almost preserved order except for deletions as a side effect, but having the reorder was cheap and convenient, so that ended up being part of the impl
@peak spoke in theory, maybe. In practice, the version that preserves insertion order is smaller and faster than the old version, and no one could propose a specific way that it could be made smaller or faster if it didn't preserve insertion order.
It seems like it ought to cost more to preserve the order, but it comes for free as a side effect of other optimizations

