#internals-and-peps
I used assert a total of 1 time.
And when I did, the message I wanted to display was making the line very long, so I put brackets and broke my assert into two lines..... Sigh.
If only you could assert that assertions are enabled, they'd be useful.
Suffice to say, that wasn't a very good day.
lmao
you can, just without asserts
if they're not enabled, can you create your own assert function and conditionally import it?
it's a keyword
oh, so it's still there, it just does nothing
but otherwise, yes
ok ig that makes more sense than getting a syntax error because it's not defined, not sure what i was thinking
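A minimal sketch of the "you can, just without asserts" point: the built-in constant `__debug__` is False when Python runs with `-O`, the same switch that strips assert statements. The function names here are made up for illustration.

```python
def assertions_enabled() -> bool:
    # __debug__ is a built-in constant: True normally, False under `python -O`,
    # which is the same switch that strips assert statements.
    return __debug__


def require_assertions() -> None:
    # Hypothetical helper: fail with a real exception, since an assert here
    # would be stripped in exactly the case we care about.
    if not assertions_enabled():
        raise RuntimeError("assertions are disabled (running with -O?)")
```

Calling `require_assertions()` once at startup is one way to refuse to run in a configuration where asserts are silently skipped.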
Since you mentioned C and C++ earlier, note that the same logic applies: assert can be turned off at compile time, so you should only use it if you know exactly how the code is going to be compiled.
In C and C++, it's turned off by defining the preprocessor macro NDEBUG.
I wouldn't say in C++ it's as much of an issue, since you could likely be guarding something that could crash your program anyway
but yeah i'm just inexperienced with the python assert, C++ i get
Crashes are security vulnerabilities, assertions aren't
only tangentially related
but it's p cool that you can encode the size of an array/other collection in its type
in a sufficiently powerful type system
not sure if that'll be possible in Python though
it is awesome (apparently it's possible? https://stackoverflow.com/questions/43292197/can-python-implement-dependent-types)
i'm not sure how well supported doing that actually is though (i'm not convinced something like mypy would always catch that)
is this an argument against using assertions to guard what could be security vulnerabilities, or just an argument against assertions in C++ in general?
p sure it wouldn't
mypy doesn't handle all the weird stuff like higher-kinded types
The issue with dependent typing in Python is that everything is mutable and, given a type A, you can create a type B that is its subtype, so you can't really infer much and need invariant data structures. You could make a subset of Python that can be dependently typed in a useful way, though at that point just use Idris/Agda/Lean.
Any chance someone has made a jetbrains quality IDE for any of those?
well, we're in off topic territory with this, but - imagine you've got C++ code that does:
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

int getArrayElement(std::vector<int> &vals, size_t index)
{
    assert(index < vals.size());
    return vals[index];
}
```
If that assertion is disabled, then it will try to read arbitrary memory off the heap. What's in that memory? Could be anything. Could be your server's SSL private key, for instance.
In pretty much every case where you hit undefined behavior in C or C++, a crash is the best case scenario. Almost every crash is actually a security vulnerability in disguise.
You can use static analysis to prove whether your code is correct and whether you should provide an assertion or not.
that "SSL private key" example isn't hypothetical - that's essentially Heartbleed.
There are things like Checked-C and Pragma-C for that.
Hmm... ok i feel this is just an argument against assertions in general then; your assertion being disabled leaves your code vulnerable, and so should never be disabled, so it probably shouldn't be an assertion in that case
right.
Rely on static analysis instead to inform you whether or not your code is correct, and if it's not use your assertion.
i don't think static analysis is complete? isn't this essentially the halting problem?
I mostly treat assertions as documentation and help for library users. They document exactly what form the library demands its input data to be in, in a way that breaks if not kept up to date, unlike a docstring
assertions are fine as a form of documentation, as long as you understand and appreciate that they may not cause the control flow to change even if the invariant is violated.
Yeah, a failing assertion should always be a bug
If the tool can't determine if your code is correct then use assertions, the halting problem isn't really relevant.
people really want to use assertions as a shorthand for
```py
if not some_precondition:
    raise ValueError("My precondition isn't satisfied")
```
Assertions are not for that (response to I'm stuff) in static type systems. They are there to tell the type system about invariants that it doesn't know about. But using them loses the correctness guarantee, since, well, they are a no-op in most cases
I'm not sure how assertions interact with the type system, I thought they were separate?
The checkers you listed only check for basic types of correctness. For other types I agree with lak that you shouldn't use assertions due to correctness guarantees
lakmatiol is referring to something akin to https://en.wikipedia.org/wiki/Design_by_contract
Ah i see. Hopefully we get that come C++23
there are paradigms that use assertions to declare facts that the type system can't validate. Like "Stack.pop() takes a non-empty Stack"
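To make the "Stack.pop() takes a non-empty Stack" idea concrete, here is a rough sketch. The class and the `checked_pop` name are illustrative assumptions, not something from the discussion: `pop` states its precondition with an assert (which may be stripped), while `checked_pop` widens the contract with a check that always runs.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        # Precondition: the stack is non-empty. The assert documents it,
        # but it is stripped under `python -O`.
        assert self._items, "pop() requires a non-empty Stack"
        return self._items.pop()

    def checked_pop(self) -> T:
        # Wider contract: this check always runs and callers may rely on it.
        if not self._items:
            raise ValueError("cannot pop from an empty Stack")
        return self._items.pop()
```

With `pop`, an empty stack is the caller's bug; with `checked_pop`, the ValueError is documented behaviour the caller is allowed to catch.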
@fierce zephyr @jolly kiln this is strictly a discussion channel. See #how-to-get-help if you have a question or seek out the appropriate topical channel
where is the challenge problem channel?
We do not have one.
You can post that in an off-topic channel if there's one that doesn't have an ongoing conversation.
ok
@jolly kiln note however that the file has to be in a format that appears directly in the chat. It can't be a file that someone has to download to view.
please also do not post the same comment in more than one channel. you can't monopolize all three off-topic channels with your topic.
so.. im running a python script as a service on ubuntu 18.. this script connects to a database.. does anyone have a preferred way to store the password.. right now in the .service file i define a few env variables and pass these into the script at run time.. i tried messing with import keyring.. but it gives me all types of import errors
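A minimal sketch of the environment-variable approach already described in the question. The variable name `DB_PASSWORD` and the function name are assumptions for illustration; use whatever the .service file actually defines.

```python
import os


def get_db_password(var_name: str = "DB_PASSWORD") -> str:
    # DB_PASSWORD is a hypothetical variable name; use whatever the
    # .service file actually defines.
    password = os.environ.get(var_name)
    if password is None:
        # Fail early with a clear message instead of connecting with None.
        raise RuntimeError(f"environment variable {var_name} is not set")
    return password
```

Failing at startup when the variable is missing tends to be easier to debug than a later authentication error from the database driver.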
Guys is there any library for ransomware in python?
!legal
You are not allowed to use that command here. Please use the #bot-commands channel instead.
(anyway, I guess such a thing is illegal)
And before such things even come to python, other, more basic things need to be fixed, like very little support for type narrowing, or lack of polymorphic values.
!warn 796627077968166932 please do not ask about illegal things here. Ransomware, even the greyhat edition that could be used in CTF and similar exercises, violates our rule 5 and should not be discussed here
:incoming_envelope: :ok_hand: applied warning to @undone ravine.
Hey, anyone have any idea on how to load user code as module?
I guess what I want to do is load user code and call a predefined function name as the entrypoint to the code. I want to allow user to import libraries so I don't need sandboxing.
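One way to do this with the standard library is `importlib.util`. The sketch below is only an illustration: the entrypoint name `main`, the module name `user_code`, and the helper names are assumptions, and nothing here sandboxes the loaded code.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_user_module(path: Path, name: str = "user_code") -> ModuleType:
    # Build a module spec from the file path, then execute the file's code
    # inside a fresh module object. No sandboxing happens here.
    spec = importlib.util.spec_from_file_location(name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def run_entrypoint(path: Path, entrypoint: str = "main"):
    # "main" is an assumed entrypoint name, not something from the chat.
    module = load_user_module(path)
    return getattr(module, entrypoint)()


# Usage: write a tiny user script to a temp file and call its entrypoint.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "plugin.py"
    script.write_text("import math\n\ndef main():\n    return math.floor(2.5)\n")
    result = run_entrypoint(script)
```

The user script can import libraries as usual because it runs in the same interpreter with the same privileges as the host program.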
Running non-sandboxed user code calls for trouble
Nahh, it's not for crucial things
Or just crash the server by starting a process fork or something similar
so why do you need to run user code in the first place?
virtualization is the world-wide standard for running unknown user code. Anything else will likely not be secure, simply because you can't determine what the code will do unless you wrote it or looked through it. Sandboxing in a read-only filesystem is simply your best bet
You're asking why we let people evaluate Python code with the bot? I can answer that in #community-meta
uhhh i was referring to pip3 install brain's messages...
Type narrowing within the type hint system? Or just in general? (If it's the latter, not sure how u mean)
I'm not blue
I'm blue
In the type checkers. E.g. in Pyright this works:
```py
from typing import ClassVar, Literal, Union

class Foo:
    a: int
    tag: ClassVar[Literal["foo"]] = "foo"

class Bar:
    a: int
    tag: ClassVar[Literal["bar"]] = "bar"

class Baz:
    a: str
    tag: ClassVar[Literal["baz"]] = "baz"

def f(x: Union[Foo, Bar, Baz]):
    if x.tag == "foo" or x.tag == "bar":
        x  # x inferred as Foo|Bar
        x.a  # int
    else:
        x.a  # str
```
but this doesn't:
```py
def f(x: Union[Foo, Bar, Baz]):
    if x.tag in ("foo", "bar"):
        x  # x inferred as Foo|Bar|Baz
        x.a  # int|str
    else:
        x.a  # int|str
```
Eric Traut proposed custom type guards https://www.python.org/dev/peps/pep-0647/ like in TypeScript, which would make it a bit better, but still
im pretty novice to type checking, do you really have to wrap Literal like that?
That's so fucking verbose
I mean, Rust isn't exactly known for not being verbose
that's not rust?
Java i think is probably the most known for being verbose
Im not saying python is verbose, I am saying that the type checking syntax is verbose. For a class attribute, having to type this for a literal:
tag: ClassVar[Literal["bar"]] = "bar"
is ridiculous, especially because, well, it's probably going to be a Union of literals
it is mad
its why i only do very basic typing
i tried using mypy and doing strictly typing for about a week and then ditched it
imo it just makes the code base look insanely messy and hard to read
It kind of does make it messy. I've noticed that. Like I made some dataclasses, went very strict with the type annotations, and used dataclasses.field to explicitly set what fields get used in what methods, and after I blackened my code it was ridiculous looking.
A single field was taking up like 10 lines
What ive ended up doing is just sticking to the non-strictly types but typing most of the external and internal api enough to where it helps readability rather than go so over the top
i think the only typing i use is Optional[T]
Yeah, and to be completely honest, it didnt even really save me on documentation bc I ended up retyping the same thing in my docstrings
I forgot what channel I was in
Is the switch expression being planned in any future python releases? Sry if this is not the channel to ask it
if no, any reason for that?
ping me ^
No.
its being weirdly merged with the pattern matching
similar to Rust's algebraic pattern matching as its called, which can provide the function of switch and case
however imo the python impl of this is... Odd
I believe the reason is there's some complications with the syntax if it were to be added, same with do-while.
enjoy reading through that
Because it's a class variable, not an instance variable
I don't remember whether it makes a difference with normal classes, but with dataclasses it does
Yeah, I have lots of complaints about type hints myself...
And I somehow got immune to the fact that Callable[[Callable[[int], int]], str] looks horrible
(which is just (int -> int) -> str)
And if you start moving step by step into the type rabbit hole, you realize that some ideas slightly less trivial than square(x: int) -> int are just impossible or impractical to express.
I mean, you can simplify that a bit,
>>> IntRoutine = Callable[[int], int]
>>> Callable[[IntRoutine], str]
typing.Callable[[typing.Callable[[int], int]], str]
Yeah the rabbit hole of typing, and the fact that a lot of major libraries (i.e. Pandas and Numpy) dont fully support it yet makes me want to steer clear
They seemed awesome at first, especially how the annotations are used functionally in dataclasses i.e. a ClassVar vs InitVar but like it quickly gets out of control
I think the prevalence of TypeScript has shown how useful annotations and type checking are for building anything more than a very small application.
For example, a lot of games use Lua for a modding language and it's so frustrating having a small type error in your code, crashing the game and having to relaunch when a type checker could just verify that for you.
Yeah, but compare
IntRoutine = Callable[[int], int]
Callable[[IntRoutine], str]
to
(int -> int) -> str
I agree that gradual typing is a good idea, it's just frustrating that some things are not representable in Python's implementation of it.
Yeah, I can certainly see that.
This is actually a better construction IMO
Fn = Callable[[A], B]
Fn[Fn[int, int], str]
with a generic type alias
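Spelled out with the TypeVars it needs, the generic alias looks roughly like this. `describe` and `run` are hypothetical helpers added only to show the alias in use.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# Generic alias: Fn[X, Y] expands to Callable[[X], Y].
Fn = Callable[[A], B]
IntRoutine = Fn[int, int]


def describe(routine: IntRoutine) -> str:
    return f"routine(2) = {routine(2)}"


def run(consumer: Fn[IntRoutine, str]) -> str:
    # consumer has the type written as (int -> int) -> str in the chat.
    return consumer(lambda n: n + 1)
```

The alias only covers single-argument callables; multi-argument signatures would need a separate alias or plain Callable.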
Fn is bad though bc classes and instances can implement __call__ which then like yeah...
For example, you can do this:
```py
def identity(x: T) -> T:
    return x
```
But there's no type annotation you can give here that would make identity2 the same type as the above definition:
identity2: ??? = identity
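One workaround that checkers generally accept is a callback Protocol whose `__call__` is itself generic. This is a sketch under that assumption, not verified against every checker; `IdentityFn` is a made-up name. It covers this specific case but does not make values generic in general.

```python
from typing import Protocol, TypeVar

T = TypeVar("T")


class IdentityFn(Protocol):
    # The type variable is scoped to __call__, so the protocol describes
    # one callable that works for every T, not a callable fixed to one T.
    def __call__(self, x: T) -> T: ...


def identity(x: T) -> T:
    return x


identity2: IdentityFn = identity
```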
Sure, but it's clear what the semantics of "f is a function" are in Python. In most cases it means that f is any callable. I can't really imagine a non-hacky context in which it's important to make a distinction between a function and an arbitrary callable
I'm just saying that's why they used the ridiculously long Callable, its the most encompassing
Also, you can do
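```py
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")

@dataclass
class Foo(Generic[T]):
    function: Callable[[T], T]

def f(x: int) -> int:
    return x + 5

foo: Foo[int] = Foo(f)
```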
...but you can't do:
```py
def identity(x: T) -> T:
    return x

bar = Foo(identity)
```
bar just... doesn't get assigned a coherent type. And this is considered an error:
bar: Foo[T] = Foo(identity)
...as is this:
```py
def swap(x: tuple[T, T]) -> tuple[T, T]:
    return (x[1], x[0])

baz: Foo[tuple[T, T]] = Foo(swap)
```
Which is very sad, because it means I can't really typehint a very useful construction called "lens"
I just think a lot of people might try to over-use type hinting, like when the return type is annotated, you don't need to write the type when you assign the result of that function.
C++ has this problem too, and that's why auto is useful.
```cpp
#include <functional>
#include <iostream>

int main() {
    std::function<int(std::function<int(int)>, int)> apply = [](std::function<int(int)> a, int b) -> int {
        return a(b);
    };
    std::function<int(int)> add_one = [](int a) -> int {
        return a + 1;
    };
    std::cout << apply(add_one, 1) << std::endl;
    return 0;
}
```
And frankly speaking, it doesn't really make sense to me that only functions can be parametrically polymorphic (i.e "generic").
In statically typed languages like C++ it's a sensible constraint, because identity<int> and identity<string> will actually be different functions at runtime, so a value of type Foo<int> would have to be different from a value of type Foo<string> at runtime.
However, in Python, identity for int and str is the same object, and a Foo[int] and a Foo[str] would be the same object.
Yeah, type checkers actually do type inference. So e.g. requiring every function to have a return type annotation is not a good decision IMO
But then you have to go to the function definition to see what the type is.
no, you can just hover over it
Unless your IDE just fills that in for you anyway
I've seen some people even annotate things like bot: discord.Bot = discord.Bot(...)
well, that's just overkill
Yeah, I think there's a happy middle ground with what works for you, and no need to overkill.
Pylance is somehow able to solve the million-dollar problem of figuring out the return type of this function
I was talking about something else
Like int result = myfunc(); vs auto result = myfunc();
@sage mortar let's... not put so many reactions
bruh
Indeed modern tools make this a non-problem, but I do have some doubt over whether it's okay to write code that is only readable using certain tools.
!mute @sage mortar 1d Don't create an off-topic mess, please.
:incoming_envelope: :ok_hand: applied mute to @sage mortar until 2021-01-21 18:32 (23 hours and 59 minutes).
I suppose these days it's reasonable to have that expectation
well... if you're not using a type checker, then why bother with type hints?
Readability
I never use a type checker
I'd recommend using one just to make sure your annotations are correct.
It's too much of a headache to get complex things correct
But then you can just write a comment, because English is much richer and more powerful than Python's type annotation system
Annotations are more concise
Before I used a type checker, I would just do stuff like
def do_stuff(f: "int -> int"): ...
because... Callable[[int], int] is not more concise
oh you meant like your own symbols
Sure, fair
But you said English
That's not english
I thought you meant "argument f is a callable that takes an int and returns an int"
ahh
Yeah, you could use your own conventions and stuff.
or just explain what the argument is, if it's pretty complex
Well I had a bad experience cause the first time I used mypy I needed to create a JSON type, and what do you know, they don't support self-recursive types.
And I just became too frustrated, asked myself what the point even is
Just write json: Any and people will know what you mean, though your type checker won't
Pyright/pylance actually does support recursive aliases.
In fact,
Stack = Optional[Tuple["Value", "Stack"]]
Is a very nice way to define a linked list https://github.com/gurkult/py-gurklang/blob/master/gurklang/types.py#L47
Yeah, it's nice that it allows for forward declaration now.
It wasn't a forward declaration issue
It was an issue specific to mypy, which is still not resolved to this day
Oh, I misunderstood.
yeah, it's about a type alias referencing itself
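For illustration, a self-referencing JSON alias of the kind being discussed. It runs fine at runtime because the self-reference is a string forward reference; whether a given checker accepts it is a separate question (per the discussion, Pyright did and mypy did not at the time). `depth` is a hypothetical helper that just walks the structure.

```python
from typing import Dict, List, Union

# The alias refers to itself through the string forward reference "JSON".
JSON = Union[None, bool, int, float, str, List["JSON"], Dict[str, "JSON"]]


def depth(value: JSON) -> int:
    # Nesting depth: scalars are 0, each list/dict layer adds 1.
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    return 0
```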
Also, mypy is completely wrong about a Callable defined as an instance attribute.
It thinks it needs a first self argument or something
Is there a way to annotate a type as a re-usable iterable?
In the strictest sense possible
Sequence?
sequence also implies indexing and length
You can have an unlimited iterable that's reusable?
Yeah, I want something that shows the type only needs __iter__, but that it can always return a new iterator
Reusability to me implies finite
Bc if it's not finite then you never actually finish using to reuse it
No you misunderstood. The sequence can be finite. It's just that if I request to iterate over the sequence again, it will let me do so
I was talking more to lakmatiol
The problem I face with Iterable is that something like a generator can be classified as one, though a generator is consumed upon its first use.
Right so I suggested sequence
But Sequence's interface has a lot of extra stuff which I don't want to require
Hmm, well I can't imagine an iterable that's reusable if it's not a sequence
consider
```py
class Stream:
    def __init__(self, start, step=lambda v: (v, True)):
        self.start = start
        self.step = step

    def __iter__(self):
        yield self.start
        e, c = self.step(self.start)
        while c:
            yield e
            e, c = self.step(e)
```
potentially finite, reusable, undefined length, no nice indexing
Of the iterables what do you have, generators, sequences, maps
Aight that's over my head
I yield
I would assume there is nothing builtin, since there is no way to enforce it by checkers, but it would be useful indeed.
Now that I think about it more, I don't think even Sequence strictly requires the iterable to be re-usable.
it indeed doesn't, though it would be quite the odd sequence to not do that.
I'd probably just alias Iterable, it doesn't stop checkers but should be clearer to the user
Yes, it's a bit contrived
I think your case is super nice Mark and don't believe there is anything to support it
I don't see how there could be anyway.
All the types are based on certain interfaces
There's no alternative to __iter__ that guarantees re-use
I don't know why the one guy is downvoted but I agree
That's regarding how to make a re-usable iterator, not how to annotate one
I agree with Numerlor here, ReusableIterable = Iterable is probably the most descriptive you can get
I believe it's more to do with the lack of an interface for such thing at a fundamental level. And that's because it's likely technically impossible.
But if they pass a generator the type checker won't catch it
checking for this is the halting problem
It'd be nice if you could do like Iterable\Generator bc Generator is a specific Iterable
Yeah exactly
Aren't map types reusable?
@gleaming rover sorry to reach way back but I didn't see these replies until now. Precondition not being met, means that the program itself is wrong. this is different than a routine raising an error that can be handled by another error.
Why wouldn't I be able to iterate a dictionary twice
I meant map as the builtin map
Iterable\Iterator would catch a few, but there are still cases where an Iterable is not an Iterator yet its iterator will exhaust it, even if they're contrived
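A rough sketch of the two ideas above: a documentation-only `ReusableIterable` alias, plus a runtime heuristic in the spirit of Iterable\Iterator. The helper names are assumptions, and as noted the check only catches plain iterators (generators, `map` objects and the like), not every one-shot iterable.

```python
from typing import Iterable, TypeVar

T = TypeVar("T")

# Documentation-only alias: checkers treat it exactly like Iterable.
ReusableIterable = Iterable


def is_one_shot(obj: Iterable[T]) -> bool:
    # Iterators (generators included) return themselves from iter(),
    # so a second pass over them yields nothing.
    return iter(obj) is obj


def total_twice(values: ReusableIterable[int]) -> int:
    # Needs two passes, so reject objects that are their own iterator.
    if is_one_shot(values):
        raise TypeError("expected a re-iterable object, got an iterator")
    return sum(values) + sum(values)
```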
@gleaming rover if you throw an exception in response to some kind of bad input, as a general thing, that's very different. You are saying "if such and such invalid input is passed, I promise to throw this kind of exception".
A precondition is the opposite, it is the function saying "well, as the user, if you call this function, you promise me that such and such is true. I may assert that such and such is true as a courtesy, but it is your responsibility"
python does generally tend to put more emphasis on wider contracts, compared to C++, because performance isn't generally as critical, and it doesn't typically bother actually eliding asserts in "release" mode which would be a given in C++ and many other languages
A classic example of a precondition though, would be your array being sorted before being passed to a binary search algorithm. Even in python, checking this condition routinely and throwing a nice exception is absurd because the precondition takes longer to check, than the actual algorithm.
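The binary search case can be sketched like this, using the standard `bisect` module. `is_sorted` and `contains` are illustrative names; the point is that the sortedness check is linear while the search itself is logarithmic.

```python
from bisect import bisect_left
from typing import Sequence


def is_sorted(values: Sequence[int]) -> bool:
    # O(n): visits every adjacent pair.
    return all(a <= b for a, b in zip(values, values[1:]))


def contains(values: Sequence[int], target: int) -> bool:
    # Precondition: `values` is sorted ascending. The assert costs O(n),
    # more than the O(log n) search, and vanishes under `python -O`.
    assert is_sorted(values), "contains() requires sorted input"
    i = bisect_left(values, target)
    return i < len(values) and values[i] == target
```

Keeping the check as an assert means it helps during development without being a promise to callers.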
At any rate I wouldn't say "never catch assertions", you may have situations where you want to catch everything, if only to perform some kind of routine cleanup and shooting an email with a crash log
but I do think that AssertionError shouldn't derive from Exception really; it's not "just" another ordinary error, and under ordinary circumstances even if you want to handle all errors a certain way, you may or may not want to handle assertions the same way
do you have a source for that
because that is not what I understand "precondition" to mean
In computer programming, a precondition is a condition or predicate that must always be true just prior to the execution of some section of code or before an operation in a formal specification.
in relation to a block of code, it just means "something that must be true before that block starts executing", which, in the specific context of pure functions, reduces to a statement about the inputs
uh, yes?
is that from Wiki?
yes
For example: the factorial is only defined for integers greater than or equal to zero. So a program that calculates the factorial of an input number would have preconditions that the number be an integer and that it be greater than or equal to zero.
Correct
which is the same thing as division by 0.
that is different from a function that throws an OutOfBound error
it depends on how things are defined
every function has a contract
you can check preconditions with asserts or similar operations, but you're not required to.
correct
so...
so that's the whole point, a precondition violation is not the same thing as an error
honestly I don't even remember what I said
an error in a function, any function, can always in principle be part of a "correct" program
an assertion violation always means the program itself is wrong
that's the distinction
e.g. if you're using a library, if you get a ValueError, you screwed up. If you get an AssertionError, the library author screwed up.
neither of those are necessarily true
Isn't it advised to never use assert in any kind of production code? There's no guarantee it will even trigger when necessary
If you get a ValueError, it's entirely possible nobody screwed up, you catch the exception, and then do the correct thing
if you get the AssertionError, somebody screwed up, but who screwed up isn't immediately clear
assert can be used for documentation and to convince type checkers of facts they aren't aware of. It indeed cannot be relied upon to run
Right, I agree with that
this all came up actually because mypy was not smart enough to see that a point in code was unreachable, so I told the person to write assert False, and somebody else said to throw an exception
and we discussed a bit, and I looked at the exception hierarchy, and was very very surprised to see that AssertionError inherits from Exception
Why would mypy check for that, seems more of a pylint check
because, mypy needs to be sure, statically, that all types agree
if the user doesn't have a return statement at the end of a function, for example, that is actually unreachable, then mypy will follow python rules and say "if you reach this point, you implicitly return None"
which will (usually) not agree with the annotated return type
so you need to tell mypy "this is unreachable"
I want to indicate that but I don't want to widen the contract and "admit" to users that I'm going to check their inputs for nonsense values, I'm not interested in making that promise to users
pylint checks for when a piece of code is dead because it's unreachable. But mypy might check that to see that a path that returns the wrong type is never returned. e.g.
```py
def is_even(x: int) -> bool:
    if x % 2 == 0:
        return True
    if x % 2 == 1:
        return False
    assert False
```
Technically speaking, the assert False is not unreachable -- you can write a subclass of int for which those conditions aren't exhaustive. But you can assume that no one is that silly.
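An alternative sketch for marking a point as unreachable: a small helper annotated with `NoReturn`, which checkers such as mypy treat as ending the control flow. The helper name `unreachable` is hypothetical; unlike `assert False`, the explicit raise still fires under `python -O`.

```python
from typing import NoReturn


def unreachable(message: str = "unreachable code reached") -> NoReturn:
    # Hypothetical helper. An explicit raise (unlike `assert False`)
    # still fires under `python -O`.
    raise AssertionError(message)


def is_even(x: int) -> bool:
    if x % 2 == 0:
        return True
    if x % 2 == 1:
        return False
    unreachable()
```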
the problem (IMHO) is that python really takes the teeth out of this distinction by having AssertionError as a child of Exception
Python uses exceptions for things that aren't really errors, but when it does, they don't derive from Exception
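The hierarchy facts can be checked directly. The second half is a sketch of one way to handle "all ordinary errors" while still letting assertion failures propagate; `run_guarded` is a made-up name.

```python
# Facts about the built-in exception hierarchy, checked at runtime.
print(issubclass(AssertionError, Exception))     # True
print(issubclass(KeyboardInterrupt, Exception))  # False
print(issubclass(SystemExit, Exception))         # False
print(issubclass(GeneratorExit, Exception))      # False
print(issubclass(StopIteration, Exception))      # True


def run_guarded(func):
    # Handle ordinary errors, but re-raise assertion failures, since those
    # signal a bug rather than a recoverable condition.
    try:
        return func()
    except AssertionError:
        raise
    except Exception as exc:
        return f"handled: {exc}"
```

Note that StopIteration does derive from Exception, so the "non-errors don't derive from Exception" rule has exceptions of its own.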
it's probably the case that python already takes the teeth out of this distinction because it's not really as common in python to have an optimized build that elides assertions to begin with, like it is especially in C++, but also i believe in Java
Can you assume that though?
Yes
quicknir?
I'm usually pretty quiet because I talk more in the python slack
but every once in a while when I want to cast a wider net I ask here too
...and sometimes it's impossible to check whether a piece of code is (un)reachable, because mypy would need to solve the halting problem
```py
from typing import NoReturn

def is_collatz_conjecture_true_for(n: int) -> bool:
    ...

def foo() -> NoReturn:
    i = 1
    while True:
        if not is_collatz_conjecture_true_for(i):
            break
        i += 1
    return 42
```
This explains why I got such cppslack vibes from your para above
"Wider contracts", "release mode"...
I do kinda wish there was a place like cppslack for python but nothing I've found is really equivalent
In cppslack you ask a question you tend to just get a flood of absurdly knowledgeable people answering you, citing papers, links, talks, etc, in python the bona fide experts seem harder to find, or maybe only hang out on the mailing lists
Tbf those people definitely exist here - it's just you're less likely to see a beginner in the cppslack forum
Whereas python is a much more beginner friendly language, and discord a more beginner friendly platform
that's probably true, I have asked harder python questions here a few times and usually I don't really get a satisfactory answer
I mean I consider myself very far from an expert in python, I know much less python than C++ actually, but in the python slack I'm one of the main people answering questions, in the C++ slack there's so many people far more knowledgeable than me
but like you say, just different distributions
I didn't even know there was a python slack until now, not sure it'd be better than here tho
I'm not sure it is or not, I just started there and talked to some people that I'm now friendly with
so it's more just habit
Anyway my guess is, overall, for a variety of reasons python people don't really care as much about this assertion issue, which is fine, different language communities emphasize different points
probably the exception structure was carried over from 2 without much concern
it's not really something that fits the language IMO
because its dynamically typed?
also because of the "practicality beats purity" idea, and its view of exceptions, and its reflective capabilities, I guess...?
Also, if they really feel that way, why not remove the keyword? I mean, a language keyword is a really big deal. They didn't even demote assert to a function, which is what they did to print
I'm not sure about the practicality and reflection aspects
I guess that assert wasn't demoted to a function because then it wouldn't be possible to cut it out in -O mode
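The stripping can be observed without restarting the interpreter, because the built-in `compile()` takes an `optimize` argument that mirrors the `-O` levels. A small sketch (the helper name is made up):

```python
source = "assert False, 'this assertion fails'"

# optimize=0 keeps asserts; optimize=1 mirrors `python -O` and drops them.
with_asserts = compile(source, "<demo>", "exec", optimize=0)
without_asserts = compile(source, "<demo>", "exec", optimize=1)


def raises_assertion(code) -> bool:
    # Run the compiled code in a fresh namespace and report whether
    # the assert statement actually fired.
    try:
        exec(code, {})
    except AssertionError:
        return True
    return False
```

Because assert is a statement, the compiler can drop it outright; a call to a function named `assert_` could not be removed this way, since the name might be rebound at runtime.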
^ my thought process
can you replace builtins in Java
but maybe it would be more unpleasant to do in python, e.g. maybe you'd need to have a runtime check
There would still be a function call, which would add overhead. And due to Python's semantics, you can't cut out a function just based on its name.
when you called a function, to ensure that it wasn't an assert
^
sure, there are performance costs
Don't think you could really do it in the same way without some kind of weird parser special case
Yeah but eventually you cannot fool the interpreter, the interpreter could see where the function call is going prior to evaluating arguments
so then
you need to check for that too?
like whether the user has rebound assert? and act accordingly
the interpreter isn't required to look at strings to make its decisions, in fact you'd never do anything that way
seems too complicated to be worth it
(there is, of course, a hack to do something like that <#internals-and-peps message>, where you can, for example, convert a global lookup to a constant lookup)
yeah, probably
exec and eval break most static checks you can do
At any rate, I'm not too convinced that having reflection, or practicality > purity is really the reason, but who knows, I think the idea of preconditions is only really starting to gain more traction in more languages, more recently
preconditions have been an idea in CS since forever but I mean in practical software engineering
wouldn't proofs be a more robust idea than preconditions? but I guess proofs don't easily fit in with imperative languages
Java btw has the same problem as python, i.e. assertion errors are just another catchable exception
I mean proofs are great, if you can construct a proof for a sufficiently complex case
along with nonlocal, assert is probably the only keyword I've never used directly so I agree there, but if it stays as a feature I don't think it should be moved from being a keyword
that's really a whole other kettle of fish
assertions are quite great but I guess at the end of the day, the proper use of assertions tend to be tied with performance, and library/implementer flexibility
and python doesn't care about these things as much
if you think about types, sometimes a better type system or a more appropriate type could serve as a proof; e.g. instead of a precondition that a list is non-empty, you could have a NonEmptyList of a, which is just a tuple (a, List of a)
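In Python's type hints, the (a, List of a) encoding could be sketched as follows. The alias and helper names are illustrative only.

```python
from typing import List, Tuple, TypeVar

T = TypeVar("T")

# A non-empty list as (first element, remaining elements).
NonEmptyList = Tuple[T, List[T]]


def head(xs: NonEmptyList[T]) -> T:
    # No precondition to check: the type always carries a first element.
    return xs[0]


def to_list(xs: NonEmptyList[T]) -> List[T]:
    first, rest = xs
    return [first, *rest]
```

The non-emptiness is carried by the shape of the value, so `head` needs neither an assert nor an exception path.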
Yeah, but to prove things in cases like that you are already getting into dependent types
which is a super advanced feature
in the sense that almost anything can change, so the attitude towards propositions that definitely hold is more lax?
like a linked list?
That's really a whole other conversation
weren't we talking about dependent types yesterday
well, some kind of list that can have 0 or more elements. Doesn't really matter.
at present the industry consensus is that there's a sweet spot with regards to type systems, or at least, that's my interpretation, overall
if your type system is massively complicated then yes you can have these proofs but you need to spend forever learning the type system to construct it
and it may not be a net gain compared to unit testing + assertions + contracts etc
Oh yeah I'm AppleFeen btw (got bored, found an alt)
And, well, at the end of the day you might not be able to prove something
books like Code Complete tend to suggest that there's a lot of things that help with code quality and that you get diminishing returns with any one of them
so you want to have all of them, and get the low hanging fruit from all of them, without going crazy
so you want to have a reasonable static type system, but not an insane one, have some amount of unit testing, some amount of regression testing, some amount of contracts, etc
each with different benefits
python becoming a sort of gradually typed language almost, is kinda proof that a lot of people even in the python community agree with this in some form
In other words either drop python or do more testing ;)
or both
How come one can iterate over types like set or dict keys? These should basically act like hashtables (from my understanding), how does it know the order for this iteration?
The order is unspecified
Actually sorry
No longer true
Dict is insertion order
Though I don't remember the details
Yup, guaranteed since Python 3.7 (it was an implementation detail of CPython 3.6 before that)
I actually seem to recall that set is still technically unspecified though
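A quick way to see the difference, as a small sketch (the variable names are just for illustration). Dict iteration follows insertion order, while a set only promises membership and equality:

```python
d = {}
d["b"] = 1
d["a"] = 2
d["c"] = 3
print(list(d))  # keys come back in insertion order

# Updating an existing key keeps its position...
d["b"] = 10
order_after_update = list(d)

# ...but deleting and re-inserting moves it to the end.
del d["b"]
d["b"] = 1
order_after_reinsert = list(d)

# Sets make no ordering promise; only membership and equality are reliable.
s = {"b", "a", "c"}
print(order_after_update, order_after_reinsert, s == {"a", "b", "c"})
```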
In principle though, dictionary data structures don't really have an order.
Sets are ordered by the hash values, monotonically increasing right?
Or at least a predictable one.
Yeah, in most cases they don't unless they offer it as an additional feature/guarantee
Well it has to be implemented somehow unless it's abstract
I doubt it's by hash value
they don't but how is it able to keep track of it then? because when you iterate over dict keys, it comes in the order you inserted them in, does it actually index them?
By bucket index maybe you meant
Internally it has multiple data structures
Probably a hash table + linked list
Sets are sorted everytime you add something to it and I thought the sort is by hash value.
One way would be to keep items in list and using hash based tree for lookups
What do you mean by sorted?
@brittle dagger a list is pretty bad because you need removal from the middle
Not that bad if it's fixed size pointer
And prepending
It's very fast
You make a set {2,1} next time you access it, it will be sorted {1,2}
Removal from the middle of a list is O(N)
It's turbo fast
isn't that against what dicts should be, why would linked list be there too, it should only keep the hash values from my understanding
It's O(N) so it will be slow for big dicts
On maps no where to caches and memory buffers
@radiant scroll I'm not really sure
And for really big ones one could use linked list of chunks
Ultimately the python people thought it was worth the cost though right
@limpid marten yeah but for tracking insertion order, specifically
but then why do we even have dicts like these, if they're actually just lists + dicts, it doesn't seem to make sense, I don't want that big complexity and to keep track of lists, if I'm using dicts datastructure, there's something called ordereddict which I can use if I want that (in standard lib), but why should anything similar be done with normal dict, that seems very weird
explains some of the reasoning
it's not a linked list, but instead, an extra layer of indirection
it's a hash table of indices, into a list
apparently they wanted to do that anyway because of sharing things between dicts with similar values or something like that
but anyhow that link will have more information than this convo has had up until now
but, anyhow as an end user you don't really need to worry about any of this
if they are actually doing this, why don't they also provide intuitive way of accessing it by that index? i.e. dict[5]
it seems like I said, the insertion order thing was basically free given the implementation they chose
so they decided to make the guarantee
because [] syntax is already used, for starters?
Also, providing ordered iteration is not the same thing as providing efficient random access
e.g. a linked list
I'm not sure if that's part of it or not though
you can probably figure it out from blog post + C code
so is there a true hash table datatype in the standard library which doesn't do this?
@radiant scroll Python after all is a specification, and CPython is the implementation, different implementations could create their dictionary differently so long as it matches the specification.
So accessing a dictionary by index, while might be possible in CPython, may not be useful behavior to enforce in the specification.
Is there a way to guarantee insertion order that wouldn't also provide the facilities to select by index?
Well, that may be a bad question
Cause I don't know if CPython's implementation even has the facilities for that
Hmm maybe there is an old implementation available if you don't want to spare extra memory
Just not sure if it's worth it. And if it does maybe that part (of your code) shouldn't be in python
but won't storing indices like these make the dict O(n) then, for deletion? That just seems unacceptable considering it should be resembling a hash table
The indices are replaced with a dummy value on deletion iirc
Stop with this O notation, it doesn't map onto memory model operations
but won't it have to move all of the others anyway?
Slowest slug is memory access literally a piece of raw linear memory is faster to rewrite than jumping on tree at random
Artemis is saying, if they're using a list for insertion order, then there is an O(n) complexity on dictionary deletion for insertion order.
It's implemented in c not python it doesn't go one by one..
If you're traversing a linked list, you do go one by one.
oh so it simply ignores those gaps there and skips those?
Anyway, I'm not sure how they implement insertion order though, I will check but you bring up a good point Artemis, but surely it's solved.
What gaps? It reallocates or, given extra space, moves a chunk of memory holding pointers or indexes
With crappy single channel it's 12GiB/s
Which goes to 3 giga items per second if it's 4 bytes
Handling tree rotation for having tree balanced might be slower than that
even if you had a normal hashtable, you would still replace deleted items with sentinels
python isn't really any different
if you're using linear probing that is
big O still matters
if you have a million elements, deletion from the middle of a linked list is going to be faster than a contiguous array. easily.
most collections aren't that big, yes, but obviously the biggest collections are often where these things are the most important
there's a reason why e.g. salted hashes are such a big thing
Not always the case, it depends on the size of items. I'm not saying it is irrelevant, I'm saying that we're using a simplistic model of a machine where every operation costs the same and no optimization happens.
@radiant scroll i am not sure what the details are, but I can pretty much guarantee that python's dict is still going to be, average case, O(1) for insertion/removal/access. having it otherwise would be dumb and the python devs are definitely not that.
@brittle dagger no, it doesn't
you can run this benchmark in C++ with items of one byte
Wait you're right
even for items of one byte, you're still going to have a crossover point long before a collection of size one million
I've seen these comparisons many times, in many contexts, about cache awareness vs big O
linear vs binary search
But yeah one allocation Vs moves
A linear factor is really big, O(N) vs O(1), it's going to start to be dominant very fast, almost always < 1000, even before size 100 in many cases
even for example linear vs binary search, linear search even very carefully coded can't hold out much past N = 1000
so you really just don't want to ever use arrays when you have deletion from the middle
unless you have fixed guarantees on N, or you plan to use tombstones
but tombstones have serious issues as well, e.g. when do you reclaim the tombstone
But at any rate I think the actual python implementation is more sophisticated than what we were discussing, not surprisingly
yeah it's pretty interesting, I remember hearing that insertion order was free with the implementation they wanted for other reasons, but never understood why, and it always seemed surprising
i wonder if there's a more detailed blog post somewhere, i read it (it's very short) and it still leaves some huge questions
everything seems fine when you are only inserting, but once you start deleting I have no idea how this will work
Maybe there is some discussion about that, it seems like quite a big change in a way
--remove-duplicate-keys
remove all duplicate keys in objects
what's a key in an object?
this is from autoflake
this is general, like dictionaries and everything? I'm not too sure what it means here
@njzk2 When an item is removed, the corresponding index is replaced by DKIX_DUMMY with a value of -2 and the entry in the entry array replaced by NULL, when inserting is performed the new values are appended to the entries array, Haven't been able to discern yet, but pretty sure when the indices fills up beyond the 2/3 threshold resizing is performed. This can lead to shrinking instead of growing if many DUMMY entries exist. - Dimitris Fasarakis Hilliard Oct 11 '16 at 20:03
so it seems like basically, what happens is that tombstones are inserted into the array when things are removed
and then, when the dict next gets resized, the tombstones are purged
which makes sense, you're not really wasting any memory per se
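To make the mechanics concrete, here is a toy model of that layout. It is only a sketch under assumptions, not CPython's actual code: the `CompactDict` class, the linear probing, and the exact resize policy are simplifications chosen for illustration. It keeps a sparse table of indices into a dense, append-only entries list, marks deletions with tombstones, and purges them on resize:

```python
FREE, DUMMY = -1, -2  # index-table markers (CPython names the second one DKIX_DUMMY)


class CompactDict:
    """Toy model only: a sparse table of indices into a dense entries list."""

    def __init__(self):
        self.indices = [FREE] * 8  # hash table: slot -> position in entries
        self.entries = []          # (key, value) pairs in insertion order
        self.live = 0              # entries that are not tombstones

    def _find(self, key):
        """Return the index-table slot holding key, or None if absent."""
        slot = hash(key) % len(self.indices)
        while self.indices[slot] != FREE:
            ix = self.indices[slot]
            if ix != DUMMY and self.entries[ix][0] == key:
                return slot
            slot = (slot + 1) % len(self.indices)  # linear probing (a simplification)
        return None

    def __getitem__(self, key):
        slot = self._find(key)
        if slot is None:
            raise KeyError(key)
        return self.entries[self.indices[slot]][1]

    def __setitem__(self, key, value):
        slot = self._find(key)
        if slot is not None:  # overwriting keeps the original position
            self.entries[self.indices[slot]] = (key, value)
            return
        if len(self.entries) >= len(self.indices) * 2 // 3:
            self._resize()  # entries list is full, tombstones included
        slot = hash(key) % len(self.indices)
        while self.indices[slot] >= 0:  # stop at the first FREE or DUMMY slot
            slot = (slot + 1) % len(self.indices)
        self.indices[slot] = len(self.entries)
        self.entries.append((key, value))  # always appended, never recycled
        self.live += 1

    def __delitem__(self, key):
        slot = self._find(key)
        if slot is None:
            raise KeyError(key)
        self.entries[self.indices[slot]] = None  # tombstone in the entries list
        self.indices[slot] = DUMMY
        self.live -= 1

    def __iter__(self):
        return (entry[0] for entry in self.entries if entry is not None)

    def _resize(self):
        alive = [entry for entry in self.entries if entry is not None]
        size = 8
        while size * 2 // 3 <= len(alive):  # size from live entries only
            size *= 2
        self.indices, self.entries, self.live = [FREE] * size, [], 0
        for key, value in alive:
            self[key] = value


d = CompactDict()
for i in range(20):
    d[i] = i * i
grown = len(d.indices)
for i in range(18):
    del d[i]
d["a"] = 1  # appended after the tombstones
d["b"] = 2  # entries list is full: resize, tombstones purged
print(grown, len(d.indices), list(d))
```

In this toy the table grows to 32 slots, and the "resize" triggered by the last insert actually shrinks it back to 8, because only the live entries are counted when picking the new size.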
however, it does lead to strange situations
if you take a python dict and repeatedly insert and delete, insert and delete, you are guaranteed to (quite quickly) trigger resizing
because the tombstones never get recycled
this is very different to most hash tables, most hash tables, if you are constantly inserting and deleting, if the size doesn't go up over some threshold, and the hashing is doing a good job, will not resize at all (or often)
python dicts will have this weird property where they can actually "resize", but not end up growing, but even shrinking
as an example, you have a python dict of size 1000, the backing array happens to be size 1200. You now delete 900 items, and insert 201. Your dict only has 301 items, and a capacity of 1200. But, the deletions never really get reclaimed, so the backing array needs to be able to hold 1201 items, which it can't, so it triggers a resize. but during the resize, it will see that you have all these tombstones, so even though the backing array is "full", the resize could actually result in a new backing array of < 1200
e.g. if it aims for about 50% occupancy after the resize, then with 301 items it could shrink to around 600
pretty interesting
@brittle dagger @radiant scroll
good explanations in the comments of the main answer
Anyone know of any prior art to trying to get portable virtual environments? Right now I think that (beyond stuff like native extension installation) there's like hard-wired absolute paths
It's pretty standard for managed dynamic arrays to grow at around 1.2 changes in size.
No, it's not
Usually it's more 1.5 - 2
At any rate though this is a hash table, it's a bit of a different situation
Most open hash tables will try to keep between .75 and .25 occupancy, or thereabouts
At any rate though that's not really the point in this example, I don't know what the exact factors are
The exact max factor for dict is 2/3
If an insert would push a dict's hash table above two thirds occupancy, it is resized.
What's the minimum it aims for when resizing?
Looks like the smallest power of two that's greater than the current number of used keys, if I'm reading https://github.com/python/cpython/blob/master/Objects/dictobject.c#L415 correctly
gotcha
at any rate it's quite fascinating, I've never seen a hash table that doesn't recycle tombstones before, and can end up shrinking when it resizes
some very interesting tradeoffs
This is strictly a discussion channel. Try asking in #async-and-concurrency or #unit-testing.
I was playing with pyglet and noticed something that irked me a bit.
batch = pyglet.graphics.Batch()
pyglet.shapes.Rectangle(..., batch=batch)
batch.draw()
vs.
batch = pyglet.graphics.Batch()
rectangle = pyglet.shapes.Rectangle(..., batch=batch)
batch.draw()
Clearly the batch has a reference to the rectangle yet when I don't save the reference the draw fails.
I'm aware pyglet extensively relies on ctypes, but I can't really figure out how this isn't being reference counted.
Rectangle has a reference to batch, not the other way around. The constructor could create a reference cycle, but it probably doesn't
Yeah, I'm reading a bit more in how it's implemented, just wasn't what I thought certainly.
You'd think batch would have a reference to the rectangle since it actually is the thing doing the drawing.
guys i need help w/ this psuedo code
a = 2
loop while a < 10
print a + " "
a = a + 2
what is a lol?
Hey @neat ivy check out #how-to-get-help
A lol, is a laugh that happens out loud
#bot-commands
oooh pep 632 got accepted, what was it about?
Do note that this PEP is in "draft" status.
Hmm, that's a good point
I am guessing that it still needs to be approved by the steering council?
I don't know what the word "draft" is intended to mean in the context of PEPs, but my impression is that it means "this isn't even the final version that the steering council will decide on".
is there a way to extend the functionality of mypy?
other than cloning it, changing the code, and installing your version?
```python
import json
f = open("t.json", "r")
print((f.read()))
json.load(f)
```
why does this give jsondecode error
but
```python
import json
f = open("t.json", "r")
json.load(f)
```
this works fine
@narrow pasture f.read() will move the cursor of the file to the end, leaving nothing more to read. try
```python
import json
f = open("t.json", "r")
print((f.read()))
f.seek(0)  # rewind to the start so json.load has something to read
json.load(f)
```
is there a way to spawn off many async jobs and gather them easily?
I have like 50 db connections I'm querying and need to gather all the results into a different db
!d asyncio.gather
```
awaitable asyncio.gather(*aws, loop=None, return_exceptions=False)
```
Run [awaitable objects](#asyncio-awaitables) in the *aws* sequence *concurrently*.
If any awaitable in *aws* is a coroutine, it is automatically scheduled as a Task.
If all awaitables are completed successfully, the result is an aggregate list of returned values. The order of result values corresponds to the order of awaitables in *aws*.
If *return\_exceptions* is `False` (default), the first raised exception is immediately propagated to the task that awaits on `gather()`. Other awaitables in the *aws* sequence **won't be cancelled** and will continue to run.
If *return\_exceptions* is `True`, exceptions are treated the same as successful results, and aggregated in the result list.
If `gather()` is *cancelled*, all submitted awaitables (that have not completed yet) are also *cancelled*.... [read more](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather)
so what's the use there, run the jobs and gather them into a list, then pass that to gather?
and it collects their returns?
just leave off the await keyword so I get coroutine objects?
something like
```python
results_list = await asyncio.gather(*(do_db_stuff(db) for db in dbs))
```
is how I would do it.
and since these DBs return lists, flatten them into one list - got it
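Put together, that could look roughly like the sketch below. `do_db_stuff` here is a stand-in that just sleeps and returns fake rows, and the database names are placeholders, since the real query code isn't shown:

```python
import asyncio
from itertools import chain


async def do_db_stuff(db):
    # Stand-in for a real query; each "database" returns a list of rows.
    await asyncio.sleep(0.01)
    return [f"{db}-row{i}" for i in range(2)]


async def main():
    dbs = ["db1", "db2", "db3"]  # placeholder connection names
    # Calling do_db_stuff(db) without await only creates coroutine objects;
    # gather schedules them concurrently and waits for all of them.
    per_db = await asyncio.gather(*(do_db_stuff(db) for db in dbs))
    # Results arrive in the same order as the inputs, one list per database.
    return list(chain.from_iterable(per_db))


rows = asyncio.run(main())
print(rows)
```

With 50 connections it may be worth passing `return_exceptions=True` so one failing database doesn't hide the results of the others.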
I've been staring at this website for an hour trying to find a way to start scraping it without using selenium and just make api calls. But, there is a weird POST call when you go to the website that authorizes you. Without that authorization, a request is getting a 403 response back. I've been trying to figure out how to make that post call and I found that there is a hidden input field with an rss-token, which is one of the headers, but there is also a uid, which I cannot find anywhere on the page. I'm afraid there is some JS that generates that
anyone have any suggestions?
!rule 5
5. Do not provide or request help on projects that may break laws, breach terms of services, be considered malicious or inappropriate. Do not help with ongoing exams. Do not provide or request solutions for graded assignments, although general guidance is okay.
also wrong channel
hmm
let me check their robots.txt
is there a scraping channel?
nothing is disallowed in their robots.txt, so I'm fine
You should check their privacy policy and terms of service.
Concerns about the legality of your question aside, this is strictly a discussion channel, so if you've validated that this is something we can help with, you can ask in a relevant topical channel or in a help session. See #how-to-get-help
@calm hedge what does !close means ?
oh
just for kind information
yea ok
I'm not entirely sure why I'm getting garbage data here.
```python
import ctypes

libc = ctypes.cdll.msvcrt

class PyObject(ctypes.Structure):
    _fields_ = [
        ('ob_refcnt', ctypes.c_ssize_t),
        ('ob_type', ctypes.c_void_p)
    ]

class PyVarObject(ctypes.Structure):
    _fields_ = [
        ('ob_base', PyObject),
        ('ob_size', ctypes.c_ssize_t)
    ]

PPyObject = ctypes.POINTER(PyObject)
PPyVarObject = ctypes.POINTER(PyVarObject)

@ctypes.CFUNCTYPE(ctypes.c_int, PPyObject, PPyObject)
def inspect_fn(a, b):
    a_ = ctypes.cast(a, PPyVarObject)[0]
    b_ = ctypes.cast(b, PPyVarObject)[0]
    print(a_.ob_base.ob_type, b_.ob_base.ob_type)
    # None 2916228200256
    return 1

def inspect(a, b):
    obj_array = (ctypes.py_object * 2)(a, b)
    libc.qsort(obj_array, 2, ctypes.sizeof(ctypes.py_object), inspect_fn)
    print(obj_array[0])

a = [1,2,3]
b = [3,4,5]
inspect(a, b)
```
I wanted to check out the object ref count, and the pointer to the object type this way, but I'm not entirely sure why I'm getting garbage data.
Is there a better way to go about inspecting py objects? I can't use the ctypes.py_object type as it seems a bit special, and can't be dereferenced or even cast to.
I'm not so proud of my abuse of the qsort function also.
wait why do you need to qsort
a_ = PyVarObject.from_address(id(a)) should work?
In [11]: a_ = PyVarObject.from_address(id(a))
In [12]: a_.ob_base.ob_type == id(list)
Out[12]: True
In [13]: a_.ob_base.ob_type
Out[13]: 9463680
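As a self-contained version of that suggestion (a sketch that assumes a standard, non-free-threaded CPython build, where `id()` is the object's address and the object header starts with the refcount followed by the type pointer):

```python
import ctypes


class PyObject(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
    ]


class PyVarObject(ctypes.Structure):
    _fields_ = [
        ("ob_base", PyObject),
        ("ob_size", ctypes.c_ssize_t),
    ]


a = [1, 2, 3]
# Overlay the struct directly on the object's memory; no qsort callback needed.
header = PyVarObject.from_address(id(a))
print(header.ob_base.ob_type == id(list))  # the type pointer is the address of `list`
print(header.ob_size)                      # number of items in the list
print(header.ob_base.ob_refcnt)            # current reference count
```

The qsort version printed garbage because the comparison callback is handed pointers to the array slots, not the objects those slots point to.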
@limpid marten
worked
I'm a bit confused why this isn't swapping them though
a = [1, 2, 3]
b = [3, 4, 5]
a_ = ctypes.cast(id(a), PPyVarObject)
b_ = ctypes.cast(id(b), PPyVarObject)
temp = a_[0].ob_base.ob_type
a_[0].ob_base.ob_type = b_[0].ob_base.ob_type
b_[0].ob_base.ob_type = temp
print(a, b)
# [1, 2, 3] [3, 4, 5]
Actually, it appears they have the same ob_type.
I'm not entirely sure why.
Where is the actual list stored?
PyListObject iirc
i have a bunch of ctypes structs here https://github.com/SuperStormer/pyutils/tree/master/utils/hazmat/structs
Anyone know what the use case of dict() is instead of just a good ol' dict literal {}?
Seeing as dict() has more overhead underneath the hood
I don't think it has a use case per se, it's just possible because it's the standard that calling a class returns a new instance of it.
ahh i can get behind that lol ty
though you may already know that passing Iterable[Tuple[x, y]] for some hashable type x to dict works.
Did not know that - never ever use dict but thanks for that tidbit lol. Literally just discovered dict and list and all that jazz today and was curious why they existed at all.
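A few things the `dict` constructor can do that a literal can't express as directly, as a small illustration (variable names are arbitrary):

```python
from collections import defaultdict

pairs = [("a", 1), ("b", 2)]
from_pairs = dict(pairs)                  # any iterable of (key, value) pairs
from_zip = dict(zip("abc", range(3)))     # pair up two iterables
from_kwargs = dict(key="value", other=2)  # keys must be valid identifiers
copy_plus = dict(from_pairs, c=3)         # copy a mapping and add keys

# Being a callable class also means dict can be passed around as a factory.
nested = defaultdict(dict)
nested["x"]["y"] = 1

print(from_pairs, from_zip, from_kwargs, copy_plus, dict(nested))
```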
!e list can be used to construct a list from some other iterable.
thing = {'a': 1, 'b': 2, 'c': 3}
print(list(thing.items()))
@boreal umbra :white_check_mark: Your eval job has completed with return code 0.
[('a', 1), ('b', 2), ('c', 3)]
!e list will just blindly catch everything that an iterable gives it, so you need to be careful that you're not using an infinite iterator.
from itertools import count
print(list(count()))
@boreal umbra :warning: Your eval job timed out or ran out of memory.
[No output]
@limpid marten if you want to swap 2 lists like that you could use memmove (don't do it with objects that have a different size tho cause stuff could break)
note that it ran out of memory ^
why is that better than a, b = b, a?
Is there a way to abuse the fact that assert can be disabled to create a sort of debug_print that gets disabled alongside assert?
(or ig logging kinda does this, but without that)
```python
def asserts_on():
    try:
        assert False
        return False
    except AssertionError:
        return True
```
then just print conditionally
Just put it under if __debug__
oooh that's cool
didn't know
(also completely forgot about this question lol sry, found pizza)
```python
assert not print("Hello, world")
```
?
(ignoring __debug__)
If debug is on, print(...) gets executed and returns None, which needs to be negated so the assert passes. If debug is off, the assert is stripped and print(...) doesn't get executed.
actually it's the -O flag (or the PYTHONOPTIMIZE environment variable) which disables asserts
Not so useful here as __debug__ exists, but not print can be nice for list comps etc. to get the elements for a quick debug
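A minimal sketch of the `__debug__` route (the `debug_print` name is just the one from the question, not an existing function). `__debug__` is True in a normal run and False under `python -O`, which is the same switch that strips asserts:

```python
import contextlib
import io


def debug_print(*args, **kwargs):
    """Print only when Python runs without -O (i.e. asserts are enabled)."""
    if __debug__:
        print(*args, **kwargs)


# Capture stdout to show what was actually printed.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    debug_print("value:", 42)

# "value: 42\n" in a normal run, "" under `python -O`.
captured = buffer.getvalue()
print(repr(captured))
```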
Well, maybe it makes sense if someone does
```python
def binary_search(key, array):
    assert is_sorted(array)
    ...
```
but I don't think anyone does that
i'm gonna start doing it just to annoy people
Yeah, I'm aware of __debug__, that was just a solution that technically satisfies the original question
the optimise flag is more hassle than its worth
low key if you want togglable print statements
I always enable it when distributing apps, the resulting size gets noticeably smaller and I can then use __debug__ to set up different logging etc.
may i introduce you to the logging module
you mean, theLoggingModule
I hear this is the place all the smart people come to
logging still needs to know what levels to show
Ig but at least logging doesn't print everything by default, it's opt in rather than opt out
I guess logging could have something like
logging.assertTrue(lambda: 2 + 2 == 4, "at this point, I don't know what to even say")
logging.assertTrue(lambda: 2 + 2 == 4, "at this point, I don't know what to even say", level=DEBUG)
# same as before ^
logging.assertTrue(lambda: 2 + 2 == 4, "at this point, I don't know what to even say", level=WARN)
well logging.INFO and logging.DEBUG levels are disabled by default
warning and critical idk because not tested it
For example something like
```
if __debug__:
    set up stream handler
    set level to debug
else:
    set level to info
    set up a GUI handler
```
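In real code that outline might look something like this. It's a sketch only: the `"myapp"` logger name is made up, and `NullHandler` is just a placeholder where an application-specific GUI handler would go:

```python
import logging
import sys

logger = logging.getLogger("myapp")  # hypothetical application logger name

if __debug__:
    # Normal run: verbose output on stderr.
    handler = logging.StreamHandler(sys.stderr)
    logger.setLevel(logging.DEBUG)
else:
    # Run with -O: quieter, and a different handler.
    # A GUI handler would go here; NullHandler is only a placeholder.
    handler = logging.NullHandler()
    logger.setLevel(logging.INFO)

handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

logger.debug("only emitted when __debug__ is true")
logger.info("emitted in both configurations")
```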
What did you mean with this? Apart from people using asserts incorrectly or libs that use docstrings at runtime I can't think of a disadvantage
Or just the "hassle" of toggling it on
Just the hassle of remembering to add it
hi
Well, usually the parts where the flag is on are usually automated
yeah, it is really annoying
I really hate using APIs with camel case convention (both in stdlib [logging, unittest etc], and 3rd party)
a fellow zealot
I'd guess there is a push against any changes to the module's api, but a more pythonic interface in another module would be nice (something like pathlib and os.path)
Then get a cold shower dude
camel case in Python is literally disgusting and every time I have to work with logging or pyspark I feel a bit of my soul chipping away and disappearing into the void
Syntax is sometimes preferable dict(key=value, other=stuff). It's nice to not have to write the keys as strings.
Using the dict constructor is often more succinct.
all modules should be in CamelCase along with stdlib classes like List and Set.
Because lowercase module names are often mixed with lowercase variable names.
Unfortunately, that's unlikely to ever happen.
However, i like lowercase python functions.
Modules can't easily be made camel case thanks to case sensitivity differences between file systems.
Huh, didn't even think of that.
Hello
!e
...reposting my snippet from august
```python
import re

def ensnake(attrmapping):
    class _SnakeCaseProxy:
        def __getattr__(self, attr: str):
            try:
                return getattr(attrmapping, attr)
            except AttributeError:
                snake_case_attr = attr
                camel_case_attr = re.sub(r"_(.)", lambda m: m[1].upper(), snake_case_attr)
                return getattr(attrmapping, camel_case_attr)
    _SnakeCaseProxy.__name__ = _SnakeCaseProxy.__qualname__ =\
        f"_SnakeCaseProxy<{getattr(attrmapping, '__name__', repr(attrmapping))}>"
    return _SnakeCaseProxy()

###

import logging
snake_logging = ensnake(logging)
print(snake_logging.get_logger())
```
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
<RootLogger root (WARNING)>
Is that safe to use? Especially if I did logging = ensnake(logging). I don't think it would be unsafe, but just making sure
cooool
Hello. I have a non-Python-related programming question, but can't find a proper channel to ask in. Is there one, or can I ask here? The question is about a file in a memory dump
Hey @rancid flame if the question isn't python related then it would best fit into off topic; if it is, then you can try out #how-to-get-help. Whatever it may be, this channel is meant for discussing the language itself.
Oh, offtopic channels have funny names, that's probably why I didn't see them haha
Why wouldn't it be safe?
Is there any performance difference between subclassing tuple and subclassing NamedTuple?
Can anyone tell me what I should do here, or what this statement means -
{The links in the
FAQ section (questions in drop down) on this page are hard-coded (static). Make them dynamic to be configured from the location object.}
You should open a help channel for this
Not after the definition runs, namedtuple dynamically creates a tuple subclass
or dataclasses, if that's what you need
you can make dataclasses frozen, and there's a bunch of other cool features
guys how do I make this more efficient so that I can find the sum of all prime number till 2 million
```python
num_l=[]
y = int(input("enter till what number do you want the sum"))
for count in range(0,y):
    num_l.append(count)
total = 0
for counter in range(0,y):
    num = num_l[counter]
    if num > 1:
        for i in range(2,num):
            if (num % i) == 0:
                break
        else:
            total = total + num
print(total)
```
@vagrant sparrow that would maybe a question for #algos-and-data-structs
sieve of eratosthenes baby
ok ill ask there
would be wayy more efficient for 2 million
whats that?
also what do ull discuss here
wdym? you don't need any libraries
it's just a bunch of loops and if-elses
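For reference, a sketch of what the sieve could look like for this problem (the `sum_primes_below` name is made up here). Instead of trial-dividing every number, it crosses out the multiples of each prime once, using only the standard library:

```python
def sum_primes_below(limit):
    """Sum all primes strictly below limit using the sieve of Eratosthenes."""
    if limit < 3:
        return 0
    is_prime = bytearray([1]) * limit  # is_prime[n] == 1 means "not crossed out yet"
    is_prime[0] = is_prime[1] = 0
    n = 2
    while n * n < limit:
        if is_prime[n]:
            # Cross out n*n, n*n + n, ...; smaller multiples of n were
            # already crossed out by smaller primes.
            is_prime[n * n::n] = bytes(len(range(n * n, limit, n)))
        n += 1
    return sum(i for i, flag in enumerate(is_prime) if flag)


print(sum_primes_below(10))  # 2 + 3 + 5 + 7
print(sum_primes_below(2_000_000))
```

This finishes in well under a second for 2 million, whereas the nested-loop version does on the order of n divisions per candidate.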
So I didnt get what Wikipedia is saying lol
hey folks, what would you prefer from an API? Iterable v. Iterator v. Generator v. Sequence?
my use-case here, my function takes a list of lists and converts it into a list of dicts, but intellij keeps complaining to me how map doesn't like generator and other such types
to make matters worse, the package databases lies about its annotations š°
hol' up
sqlalchemy is the thing lying to me
The usual advice is to type things as Generator only if you do Generator specific stuff with them, like call send() or throw(). Otherwise, just say it's an Iterable.
what's the community stance on Sequence v. Iterable?
I know they reference the Interfaces the type implements, but I'm being lazy atm
I generally don't care that much, If I need a Sequence from Iterable I will just call list. Unless there is some more optimized impl of Sequence (lazy resolution) for your specific type, I would use whichever is easier, which is generally Iterable
source of my Qs as I'm writing a public interface here
If you expect the user to want to index it, then pay the cost of returning a sequence so they can do that easily. If you expect most users will only ever iterate over it, then type it as Iterable to reserve as much freedom of implementation as you can.
gotcha, that makes sense. wouldn't you want to be concrete about your return type, tho?
and return a List[str] or Tuple[str, ...]
Why?
I don't really see the gain in being generic about your return value
You, as the owner of a public API, would like to be as vague as possible about what it will return, to give you more freedom to make changes to it over time. If you document "this returns an iterable" and you want to change it from a regular function returning a sequence to being a generator function, you can do that and be reasonably sure no one will break if they're obeying the documented contract
hm, that makes sense
If you say "this returns a list" then you can't make that change.
so it's mostly about exposing as little as possible about your internal logic
and keeping the function opaque
allowing you to make more broad changes if needed
Yep. In practice, people will tend to rely on implementation details anyway. But, at least by documenting the intended contract, you have a leg to stand on if you need to change it.
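A small illustration of that trade-off, using hypothetical function names based on the list-of-lists to list-of-dicts use case above. Both versions satisfy the same documented `Iterable` contract, so swapping one for the other doesn't break callers who only iterate:

```python
from typing import Dict, Iterable, List


def rows_to_dicts(columns: List[str], rows: Iterable[List[object]]) -> Iterable[Dict[str, object]]:
    """Documented contract: returns an iterable of dicts, nothing more."""
    # Today this happens to build a list...
    return [dict(zip(columns, row)) for row in rows]


def rows_to_dicts_lazy(columns: List[str], rows: Iterable[List[object]]) -> Iterable[Dict[str, object]]:
    """Same contract, but implemented as a generator."""
    # ...and it could later become lazy without changing the signature.
    for row in rows:
        yield dict(zip(columns, row))


columns = ["id", "name"]
rows = [[1, "ann"], [2, "bob"]]

# A caller that needs indexing pays for the list themselves.
records = list(rows_to_dicts(columns, rows))
print(records[0], list(rows_to_dicts_lazy(columns, rows)) == records)
```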
found what I'm looking for
I don't know why Python decided to invent new terminology to describe what is an "Interface" in basically every other language
in calling them Protocols
and no one seems to use the terminology anyways which makes finding this info impossible
rust be like: trait
traits aren't specifically the same as Interfaces and protocols though
Protocol is only an Interface in a language like go and pony. Java interfaces are more abstract base classes. Traits are very different, being more haskell typeclasses which do not really have a python equivalent.
PipEnv - a fresh approach that is going to become the "official" Python way of locking dependencies some day.
from https://pip-compile-multi.readthedocs.io/en/latest/why.html
is this true? Or is it their opinion
I like Poetry significantly more. Pipenv just feels salvaged after sitting idle for years
right, they're phrasing this as though it's a given though
I've never heard of "pip-compile-multi"
enables you to have different req's for different purposes
pip should be abandoned and they should just bring poetry in-house imo
I don't trust pypa to develop a competent package manager
i've considered poetry, and it put me off bc it seemed to try and do too much
perhaps i need to give it more of a chance, idk
I just use venv with pip-tools
and manages environments and stuff, idk, seemed that it was doing too much to me
look at npm/yarn, they also manage environments
i don't use them so probably won't, but maybe it's a done thing
i mean - i just use python, have installed from npm but that's all
my point being, I wouldn't take a random package's documentation as some kind of indication about the future of the language's package management.
no i agree, i raised bc it seemed far too strongly worded
Pipenv is half-baked, picked up for name recognition (imo) and has inferior resolution compared to Poetry
I like my environment to be repeatable
i thought this was the whole point
what do you use lak? Or do you not care
pipenv is older than poetry, and poetry was unusable for quite some time. So it is no surprise.
i don't know enough to know whether this is some kinda stupid vim vs emacs things, i have found python env management a mess tho
not sure why intellij made it on that chart
yeah seems a bit out of place
Up until maybe 4 months ago, it was literally impossible to build poetry from source because of cyclic dependencies. I trust the pypa maintainers to get the hard cases right much more than the poetry maintainers.
Also... Check out the npm maintainers on https://github.com/npm/npm/issues/19883 and see if you still think these are role models we should emulate
I think very clearly the npm maintainers are not mature and careful enough for the critical role they play in a major software ecosystem.
anything is better than raw pip.
you also cherry-picked a two year old issue. I don't see any npm member participation in that thread
I believe anyone who has commits merged in the tree get "Contributor" tags
I'm not a fan of npm, either, yarn is what won that war imo, but that's a bad example of what you're trying to prove.
A two year old bug where a package manager did an absolutely insane thing and destroyed systems in the process.
It's not like this was an "oh, anyone could make that mistake" bug, it's a "wtf were they even trying to do" bug.
Literally every part of what they were doing was crazy. Ignoring the user your installer is running as and deciding a different user should own things? Crazy. Recursively changing the ownership of directories you don't own and didn't create? Crazy. The negligence of allowing their "change the owner of every file below this directory" function to run with a directory of / is the least wrong piece of the whole thing.
I don't think they coined the term (though I don't know who did) nor are they the only language to use it. Protocol is a term Python has been using for a very long time (dating back to Python 1.x from the mid 90s).
it feels very 90s so that makes sense
Isn't "protocol" the term Smalltalk used?
it was used in a slightly different sense, but yes
I don't know, but if that's the case, it would make sense.
Isn't Smalltalk one of the earlier OOP languages?
Yep.
a Protocol is described in a Class and it documents what Messages a given Object can respond to.
I think Interface was terminology introduced for this concept by Java, so Python can hardly be faulted for using different terms than a language that came later would go on to use.
Protocols are not Interfaces
Abstract Base Classes are Interfaces (if you take the java definition of interface)
Hm. Aren't Python's ABCs more like Java's ABCs, and Python's protocols more like Java's interfaces?
I dunno. Maybe Java doesn't have anything particularly like Python's protocols, since they're fundamentally about duck typing rather than declared types.
I guess you could make the argument that Python's ABCs are analogous to both Java's ABCs and Java's interfaces, and that Java just doesn't have protocols...
yeah, that is more what I would say
Supposedly they are not like Java's interfaces, and that is precisely one of the reasons why they chose to name them protocols
Along with the fact that "protocol" is already a familiar term in Python
protocols are different from Java interfaces in several aspects: protocols don't require explicit declaration of implementation (they are mainly oriented on duck-typing), protocols can have default implementations of members and store state.
yeah, that is more like a java abstract class, but a java abstract class doesn't implicitly have instances just because it happens to have methods named the same
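A minimal sketch of that difference, putting `typing.Protocol` (Python 3.8+) next to an `abc.ABC`. The names `SupportsClose`, `Closer` and `Resource` are made up for illustration; the point is only that the protocol matches on shape while the ABC needs a declared relationship.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    """Structural: anything with a close() method matches."""

    def close(self) -> None: ...


class Closer(ABC):
    """Nominal: only explicit subclasses (or registered classes) match."""

    @abstractmethod
    def close(self) -> None: ...


class Resource:
    """Never mentions SupportsClose or Closer."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


resource = Resource()
matches_protocol = isinstance(resource, SupportsClose)  # duck-typed match
matches_abc = isinstance(resource, Closer)              # no declared relationship
```

Note that `runtime_checkable` only checks that the method exists, not its signature; a static checker such as mypy is what compares signatures.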
TypeScript, Go, Pony and to an extent Elm and Purescript do have fully statically typed Protocols
@raven ridge thanks for the info, guess I'll just wait and see if something becomes a standard.
I trust the pypa maintainers to get the hard cases right
i'm wondering what exactly you mean here (what are the "hard cases" ? ), and whether you have an opinion on what will/should become the standard with this sort of thing
I'm more talking about package management than environment management. I have no particularly strong opinions on pipenv or poetry, beyond that it's incredible for something that wants to be a package manager to itself have circular dependencies. That's a big no-no in package management land.
also ocaml, maybe?
ocaml supports row polymorphism, which is the better structural subtyping (the fancy name for python protocols)
Didn't know ocaml did that, though I can see why it has it. That's the same thing elm and purescript do, though from what I know about ocaml, it uses it more.
Anyone know what importlib code import a.b translates to?
and I mean that where b is a package, meaning import a; a.b would raise
(e.g. urllib.parse)
from "big corp" perspective, Poetry is seeing rapid adoption. repeatable builds and easy, familiar package management are king for resiliency. My only complaint about poetry right now is slow first-time resolution since there are a lot of packages that do not follow standards (look at the pywin32 package for a perfect representation of this where they have one number they increment every release, currently on v"224")
nvm, they decided to jump to 300 for no reason:
https://pypi.org/project/pywin32/#history
@desert peak fair - maybe it'll be a standard at some point.
https://github.com/mhammond/pywin32/releases/tag/b300
release numbers as a celebration... I partially blame Mark Hammond for the win32 API being so god-awful
atm i use pip-tools, which seems to work alright, i'm never too sure what i'm missing, seems to be env management (i use venv) and distributing packages (which i don't really do)
i feel this might be a bit out of my scope, but im working on writing my own interpreted programming language, and i have to ask
how on earth can python lists have multiple types of elements when CPython is implemented in C, which is statically typed?
like what black magic does it do to achieve this
C isn't really statically typed
They are vectors of pointers to PyObjects.
All Python objects are ultimately one class, and Python lists are just vectors of pointers to such objects.

also that^
what would be a better way to describe C den
also i see ty reptile

oh lol
alright yeah that'd make more sense then.
tyvm!
C is more fluid than some other statically typed systems like Haskell or Rust

I'm not sure exactly how I'd describe it
alright, ty anyway tho
in what way is C not statically typed
importlib.import_module("a.b") ?
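For the `import a.b` question, a rough sketch using `urllib.parse` as the example package. As far as I know the statement imports the submodule but binds only the top-level name, which is what `__import__` returns, while `importlib.import_module` hands back the submodule itself.

```python
import importlib
import sys

# `import urllib.parse` imports the submodule but binds only the name `urllib`.
import urllib.parse

# import_module returns the submodule object itself.
parse_module = importlib.import_module("urllib.parse")

# __import__ with a dotted name returns the top-level package instead.
top_level = __import__("urllib.parse")

same_submodule = parse_module is urllib.parse
top_level_is_package = top_level is urllib
package_is_cached = sys.modules["urllib"] is urllib
```

So `import a.b` behaves roughly like `a = __import__("a.b")`, and `b = importlib.import_module("a.b")` is closer to `from a import b`.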
How statically or strongly typed C is isn't really relevant here, honestly. The answer is that every Python object shares a common layout, which is used to implement the is-a relationship that objects in object oriented programming languages have. If you have a list object or a float object or an int object, you can interpret it as a generic object, because a list is-an object, and a float is-an object, and an int is-an object, etc.
yeah so it uses the PyObject thing to store all the pointers to each value in the array
are the things in **ob_item a bunch of void pointers then?
no - they're PyObject's
No, they're not void pointers.
All object types are extensions of this type. This is a type which contains the information Python needs to treat a pointer to an object as an object. In a normal "release" build, it contains only the object's reference count and a pointer to the corresponding type object.
ah alright that makes sense
so each PyObject has the number of reference the object has to it, and a pointer to the type object?
starts with that, yeah.
hate to ask but whats a type object?
it can have extra, instance-specific data at the end.
oh 
and this is the other key point:
every pointer to a Python object can be cast to a PyObject*.
A type object contains information on allocating and deallocating a type.
ahh alright den
oh awesome, so this is how the type object is implemented
right - but every object contains two things, a pointer to its type, plus its instance data
yup
so, every Python object contains a reference count and a pointer to a type object. After those two things, it can stick whatever it wants - all its instance data goes after those two things.
because every Python object is guaranteed to start with those two things, any Python object can be cast to a PyObject* and passed around as a PyObject*.
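A small sketch of that shared header, read from Python with `ctypes`. It assumes a standard CPython build (not a debug or free-threaded one) where `id()` is the object's address; `PyObjectHeader` and `type_pointer` are illustrative names, not part of any API.

```python
import ctypes


class PyObjectHeader(ctypes.Structure):
    # Mirrors the two leading fields described above. Assumes a standard
    # (non-debug, non-free-threaded) CPython build.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
    ]


def type_pointer(obj):
    """Read ob_type from the start of obj's memory (CPython: id() is the address)."""
    return PyObjectHeader.from_address(id(obj)).ob_type


# Very different objects, all readable through the same header layout.
mixed = [1.5, "text", [1, 2], None]
headers_match = [type_pointer(item) == id(type(item)) for item in mixed]
```

Every entry in `headers_match` should be true on such a build, which is exactly why a list can hold all of these as plain `PyObject*` pointers.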
And the type objects contain pointers to functions that know how to read or modify that instance data.
check out https://github.com/python/cpython/blob/master/Include/floatobject.h#L15 for an example
this starts with PyObject_HEAD - the reference count and a pointer to the type object - followed by the instance data, a double.
and the type object for a float looks like this: https://github.com/python/cpython/blob/master/Objects/floatobject.c#L1928
which contains pointers to functions like, for instance, float_hash - and those functions do things like: https://github.com/python/cpython/blob/master/Objects/floatobject.c#L557 - they take the PyObject* that they're passed and interpret it as a PyFloatObject*, which lets them access the instance data fields like the double field ob_fval
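The same idea can be poked at from Python: a hedged sketch that reads a float's `ob_fval` directly, assuming the double sits immediately after the bare object header, as in the `floatobject.h` struct linked above. Only reading is done here, nothing is modified.

```python
import ctypes

value = 3.25

# object.__basicsize__ is the size of the bare PyObject header on this build,
# so the float's double is assumed to start right after it.
payload_offset = object.__basicsize__

# Reinterpret the bytes after the header as a C double (read-only peek).
stored = ctypes.c_double.from_address(id(value) + payload_offset).value
```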
You can do probably bad things like make things that are immutable, mutable.
import ctypes
class PyObject(ctypes.Structure):
    _fields_ = [
        ('ob_refcnt', ctypes.c_ssize_t),
        ('ob_type', ctypes.c_void_p)
    ]
class PyVarObject(ctypes.Structure):
    _fields_ = [
        ('ob_base', PyObject),
        ('ob_size', ctypes.c_ssize_t)
    ]
class PyLongObject(ctypes.Structure):
    _fields_ = [
        ('ob_base', PyVarObject),
        ('digit', ctypes.c_uint)
    ]
a = 5
id_ = id(a)
a_ = ctypes.cast(id(a), ctypes.POINTER(PyLongObject))
a_[0].digit += 1
assert(id_ == id(a))
sure, of course you can, because everything is mutable in memory.
immutability is enforced by the Python language; if you bypass the Python language and poke around with the C API or manipulate memory directly, you can mutate whatever you want.
i feel i still do need 2 look at more of how the PyObject thing works to answer any more of my questions since i don't even know how to phrase them
its just theres so much stuff there
Why not just a_=PyLongObject.from_address(id(a))?
I've tried to look into how languages enforce immutability, but I didn't find anything. Do you mind explaining how Python enforces it if possible? Or even any resources to learn about it
Hm, I think It can be as simple as "not exposing" mutable methods and "not exposing" memory addresses.
Yeah. At the C level, this function exists - https://docs.python.org/3/c-api/tuple.html#c.PyTuple_SetItem - it modifies a tuple. But there's no Python binding for that function, so there's no way to access it from Python code.
type punning with ptrs or union
A list has https://github.com/python/cpython/blob/master/Objects/listobject.c#L3028 which points to https://github.com/python/cpython/blob/master/Objects/listobject.c#L2866 deleting or modifying an item in the sequence. A tuple has https://github.com/python/cpython/blob/master/Objects/tupleobject.c#L894 - a null pointer for that slot - because a tuple doesn't support item assignment or deletion from Python.
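Seen from the Python side, that missing slot looks like this. A minimal sketch; the variable names are just for illustration.

```python
items_list = [1, 2, 3]
items_tuple = (1, 2, 3)

# The list type exposes item assignment; the tuple type has no such method.
list_has_setitem = hasattr(list, "__setitem__")
tuple_has_setitem = hasattr(tuple, "__setitem__")

items_list[0] = 99  # goes through the list's item-assignment slot

try:
    items_tuple[0] = 99  # no slot to call, so the interpreter raises
except TypeError as exc:
    tuple_error = exc
else:
    tuple_error = None
```

Nothing checks a "frozen" flag at runtime here; the operation is simply not wired up for tuples.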
That's an argument for C not being strongly typed, not for it not being statically typed.
Woah, thanks
type punning with ptrs or union
@sacred yew that's weak typing
Arguably.
on the strong/weak axis, rather
well... if your code lives in void*, are you really getting any benefits of static typing?
I think it's rather 'typed' vs 'untyped'.
e.g. assembly is untyped, but you wouldn't call it dynamically typed
I would call it unityped actually, there's one binary type, and you can do any kind of operations on it
Ignoring special registers such as floating points
I think the type in asm would be 'word' not 'binary'
you can make binary ops on words, but binary is not a first class citizen like word is
That's true
is there any pep discussing the non-isolation of private datatypes?
O.o immutability of what of value ?
If you have any kind of depth like even array that guarantee goes down the sink
Well... if there's only one type, there are no types
Like, if there's only one country on a planet, there's no countries on that planet.
So is C++ weakly typed too?
C and C++ typing works differently
C revolves around registers and often casts to int
C++ tends to stick to the type more
Idk, c++ casts a lot of stuff to int too, it's incredibly annoying
Whatever which isn't full register
Ah, I didn't know about std::byte, I do appreciate C++'s new enum class.
Say char, short or whatever
Mhm, so it would get cast to int for the operation
It's not so simple. GCC has internal mathlib which can statically check stuff or can even remove stuff
Which can be funny as adding 0.0f to another float actually can hit normalisation :)
Point is that typing and language rules and undefined or not specified behaviours exists for changes.
There are two meanings of types
One is to control bit and bytes
Another is to enforce logical traits
It's not the same
So you're saying in C it only does bits and bytes, whereas C++ can enforce logical traits?
(Or at least C++ does it to a larger extent)
Yes by controlling types in abstract you can limit or extend behaviour
I'd be scratching my head to find you example now though
Ok that makes sense
I think python calls bits and bytes protocols as abstraction
Of how exactly it interops
I'm still learning python though ;)
As for overflows it's even funnier, as native and Java/C# arithmetic have different FPU settings, where one raises hardware exceptions (traps) and the other is typically handled by the OS, resulting in signals or other stuff.
Can anybody help me implementing priority queues???
also about yesterday's question i asked, i realized from a stackoverflow post that all pointers have the same size (more or less, it varies on machine), so yesterday i was also curious how PyObjects were stored in an array when they were pointing to different types of data
so that question is solved afaik since the pointers to that data are all the same size.
but i fail to see the rationale behind using PyObjects, mostly? is it because you can group more information about the data? like it's type? because outside of that i dont see how it's different from an array full of void pointers
sorry if im not gettin' this fully
It's not really about the pointers having the same size, it's about you being able to dereference the pointer and have access to the ref_cnt, and the type.
You couldn't do that with void pointers unless again, all your structs began with the same first two fields.
ah yeah ur right
It's called type-punning, which I think was mentioned above.
But it's actually pretty cool, and a nice way to get some OO principles into C.
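Here is a rough model of that trick using `ctypes` structs instead of real C, so it stays in Python. `Header`, `IntBox`, `FloatBox` and the tag constants are all invented for this sketch; CPython itself uses a type pointer rather than an integer tag.

```python
import ctypes

TAG_INT, TAG_FLOAT = 0, 1


class Header(ctypes.Structure):
    # Shared leading field, playing the role of PyObject's type pointer.
    _fields_ = [("tag", ctypes.c_int)]


class IntBox(ctypes.Structure):
    _fields_ = [("head", Header), ("value", ctypes.c_long)]


class FloatBox(ctypes.Structure):
    _fields_ = [("head", Header), ("value", ctypes.c_double)]


def unbox(ptr):
    """Inspect the common header, then reinterpret the pointer as the full struct."""
    concrete = IntBox if ptr.contents.tag == TAG_INT else FloatBox
    return ctypes.cast(ptr, ctypes.POINTER(concrete)).contents.value


# Keep the boxes referenced so their memory stays alive.
boxes = [IntBox(Header(TAG_INT), 7), FloatBox(Header(TAG_FLOAT), 2.5)]

# One homogeneous array of Header pointers holding two different struct types.
HeaderPtr = ctypes.POINTER(Header)
array = (HeaderPtr * len(boxes))(
    *(ctypes.cast(ctypes.pointer(box), HeaderPtr) for box in boxes)
)

values = [unbox(array[i]) for i in range(len(boxes))]
```

With bare `void*` entries there would be nothing to read before deciding how to interpret each element; the common header is what makes the dispatch possible.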
i saw it in the chatlog, i was doin some research on making multi-type vectors in C and it mentioned something about unions and type-punning
anyway ty!!
@crude turret It's really awesome how much thought you put into this.
You're a great programmer.
hm?
oh ty!! lol
im tryin' 2 make my own interpreter and i was basically baffled at how python could pull off multitype lists when cpython is implemented in a language like C
but that doesnt rlly matter, ty tho lul
Yeah, making an interpreter or a language in general is really interesting.
I think I've tried making one like 10 times, and stop after some amount of steps, but I always learn something.
well as long as u do in the end it's all that matters
An object has a 'reference count' that is increased or decreased when a
pointer to the object is copied or deleted; when the reference count
reaches zero there are no references to the object left and it can be
removed from the heap.
oh this is how it tells which objects should be garbage collected
via the ref_cnt field
The cycle collecting garbage collector isn't a tracing garbage collector. It's implemented entirely in terms of the reference count.
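Both mechanisms can be watched from Python. A small sketch using `sys.getrefcount`, `weakref` and `gc.collect`; `Node` is a throwaway class, and the absolute counts depend on the interpreter version, so only the differences are compared.

```python
import gc
import sys
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""


obj = Node()
before = sys.getrefcount(obj)
holder = [obj]                      # one more pointer to obj
after = sys.getrefcount(obj)
holder.clear()                      # pointer dropped again
restored = sys.getrefcount(obj)

# Two objects that point at each other never reach a count of zero on their own.
left, right = Node(), Node()
left.other, right.other = right, left
probe = weakref.ref(left)           # lets us observe when `left` is freed
del left, right
gc.collect()                        # the cycle collector reclaims them
cycle_freed = probe() is None
```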
Oh interesting
can anybody please tell me where is the wrong for path "observation_deposit_file="./sendai_thick.csv""
What do you expect to happen with ./sendai_thick.csv
Why not just use sendai_thick.csv?
Your path resolver most likely doesn't follow POSIX shell command resolution.
Wouldn't pathlib resolve that to current working directory anyway?
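A quick check of how `pathlib` treats the two spellings, using the file name from the question (the file does not need to exist for this):

```python
from pathlib import Path, PurePosixPath

with_dot = PurePosixPath("./sendai_thick.csv")
without_dot = PurePosixPath("sendai_thick.csv")

same_path = with_dot == without_dot          # pathlib drops the leading "./"
as_text = str(with_dot)

# Both spellings resolve against the current working directory.
resolved = Path("./sendai_thick.csv").resolve()
expected = Path.cwd().resolve() / "sendai_thick.csv"
```

So if the file is not found, the likelier culprit is the working directory the script runs from rather than the `./` prefix.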
https://twitter.com/gvanrossum/status/1354305179244392453 It's discussion time people
What does this print?
x = 0
y = 0
def f():
    x = 1
    y = 1
    class C:
        print(x, y) # What does this print?
        x = 2
f()
From https://t.co/8iZEYr23rx (@kevmod) Still in Python 3.9!
Some odd behaviour, trying to figure out what's going on
Someone in the tweet posted the byte-code disassembly.
@nkitpati @kevmod If u look at disassembly
For 1 1:
8 LOAD_NAME 3 (print)
10 LOAD_CLASSDEREF 0 (X)
12 LOAD_CLASSDEREF 1 (Y)
14 CALL_FUNCTION 2
16 POP_TOP
For 0 1:
8 LOAD_NAME 3 (print)
10 LOAD_NAME 4 (X)
12 LOAD_CLASSDEREF 0 (Y)
14 CALL_FUNCTION 2
16 POP_TOP
Yeah I was looking at that before
Yeah, it's because of LOAD_NAME, but I'm trying to figure out why it has to be that way
It probably has to do something with the necessity of creating a class scope
I know that you can't access the class scoped names within a list comprehension's scope either
Will fail:
class C:
    a = 2
    b = [a * i for i in range(3)]
Certainly not behavior I would have guessed though.
but this will work:
class C:
    a = 3
    b = [2 * i for i in range(a)]
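Both cases side by side in runnable form. As I understand it, only the outermost iterable of a comprehension is evaluated in the class body; the rest runs in a nested scope that skips the class namespace. The class names `Fails` and `Works` are just labels for this sketch.

```python
try:
    class Fails:
        a = 2
        # `a` is used inside the comprehension body, which cannot see class names.
        b = [a * i for i in range(3)]
except NameError as exc:
    error = exc
else:
    error = None


class Works:
    a = 3
    # `a` is used only in the outermost iterable, evaluated in the class body.
    b = [2 * i for i in range(a)]
```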
The way Python deals with scopes was always strange to me.
I generally quite like the scoping: Your name is local to your function scope if you assign to it unless you say otherwise.
The scoping here, specifically with a class definition, does something odd
If that inner class would have been a function, you'd just get an unbound local name referenced before assignment error
That's a bit like I would have expected here, but class scopes are different, apparently
!e
This is what it will do if the class were a function (that we actually run):
x = 0
y = 0
def f():
    x = 1
    y = 1
    def g():
        print(x, y) # Error!
        x = 2
    g()
f()
@wide shuttle :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 16, in <module>
003 | File "<string>", line 13, in f
004 | File "<string>", line 10, in g
005 | UnboundLocalError: local variable 'x' referenced before assignment
So, it hinges on classes using LOAD_NAME, but I haven't found the rationale for that yet
If you don't reassign x, you'll get 1 1 which is really interesting
That's because x is not a name you assign to in the local scope, meaning that you can read from the outer, nonlocal scope
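The two variants as a script, capturing the values instead of printing them. This sketch assumes it runs at module level so that `x` and `y` are real globals; `with_assignment` and `without_assignment` are illustrative names.

```python
x = 0
y = 0


def with_assignment():
    x = 1
    y = 1

    class C:
        # x is assigned later in this class body, so it is looked up in the
        # class namespace and then the globals, skipping the enclosing function.
        seen = (x, y)
        x = 2

    return C.seen


def without_assignment():
    x = 1
    y = 1

    class C:
        # Nothing is assigned here, so both names come from the enclosing function.
        seen = (x, y)

    return C.seen


first = with_assignment()
second = without_assignment()
```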
ok.. whats the thing about guidos tweet
We're discussing that now
i have not seen it. let me catch up
LOAD_NAME(namei): Pushes the value associated with co_names[namei] onto the stack.
What is co_names though
The mechanics are that when you assign to a name within a class body, Python uses LOAD_NAME instead of LOAD_FAST to load that name, and LOAD_NAME bypasses the nonlocal environment to look in the global environment
LOAD_FAST(var_num): Pushes a reference to the local co_varnames[var_num] onto the stack.
Hmmm
co_names is the "tuple of names of local variables" while co_varnames is the "tuple of names of arguments and local variables"
interesting
this is where nonlocal comes to save the day
the strange thing is that it only works for x
>>> x = 0
... y = 0
...
...
... def f():
...     x = 1
...     y = 1
...
...     def g():
...         nonlocal x
...         print(x, y) # No Error!
...         x = 2
...
...     g()
...
...
... f()
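The same snippet reshaped so the effect of `nonlocal` is visible in return values. A sketch only; the tuple returned from `f` is my addition to make the result checkable.

```python
def f():
    x = 1
    y = 1

    def g():
        nonlocal x          # x now refers to f's variable, so reading it first is fine
        seen = (x, y)
        x = 2               # rebinds f's x rather than creating a local
        return seen

    seen = g()
    return seen, x


result = f()
```

Here `result` holds the values `g` saw plus the rebound `x` from `f`.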
The documentation doesn't say much about the difference between the two
the documentation for co_varnames is incorrect
Oh
@stable grail This was just a demonstration of that effect, the weird thing is with the inner class definition
Yeeep
Or was it names
!e
x = 0
y = 0
def f():
    x = 1
    y = 1
    class C:
        print(x, y) # What does this print?
        x = 2
f()
@wide shuttle :white_check_mark: Your eval job has completed with return code 0.
0 1
co_names should be "tuple of names other than arguments and function locals"
It contains any non local names that have to be resolved, so attrs, globals etc.
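Easy to confirm on a code object. A tiny sketch, where `sample` is a made-up function:

```python
def sample(arg):
    local = arg + 1
    return len(str(local))


code = sample.__code__

# Arguments and function locals live in co_varnames.
local_names = code.co_varnames

# Names resolved elsewhere (globals, builtins, attributes) live in co_names.
other_names = code.co_names
```

`local_names` holds `arg` and `local`, while `len` and `str` show up in `other_names`.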
yeah, and with nonlocal in the mix, that means that a class behaves similar?
So it does be pulling from the globals
