#internals-and-peps
actually, what kind of text is it?
logs?
is the python bot created in python or node?
Its pipe separated values, theyre financial data so cant really show any
Names, tickers, prices, etc
!source
Python
uhh
alright thanks
time series
csv
if performance is of any concern, i'd use hdf5 instead
or numpy.load
(which basically uses pickle with bells)
Its not a huge concern, i just dont wanna sit here for 10min waiting on file writes
I guess i should profile
do you use pandas for processing?
No, everything's done with pure python
Im refactoring old code
Boss not too fond of 3rd party deps
not even numpy?
No, pandas (and numpy) are only used in reading the values, at this point
honestly, under 1MB writes will probably be realtime no matter how you do it
Okies, i'll write it down and get to it after more important bugs are dealt with then
writing to a file is really fast
ehhhh..
try writing a single line to a file that has millions of chars
```py
In [20]: %%timeit
    ...: with open('sadf.txt', 'w') as f:
    ...:     for _ in range(100_000):
    ...:         f.write(" " * 100)
    ...:
289 ms ± 40.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
this is pretty much realtime
and other scenarios where "writing to a file" is everything but fast
Whats the issue, long lines or multiple small writes
long lines
to my knowledge, things get slow once you start changing existing content, appends are fast
I would be surprised if this was noticeable if it happens once per human action
appends may actually be slower depending on the disk type as you'd have to re-allocate a new block and potentially swap some stuff around in case of a full SSD for example
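A quick way to settle the "many small writes vs. one long line" question is to time both. This is only a minimal sketch: `time_writes` is a made-up helper, the file is a throwaway temp file, and the numbers depend entirely on the machine, disk and filesystem.

```python
import os
import tempfile
import time


def time_writes(chunks):
    """Write all chunks to a fresh temp file and return the elapsed seconds."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        start = time.perf_counter()
        with open(path, "w") as f:
            for chunk in chunks:
                f.write(chunk)
        return time.perf_counter() - start
    finally:
        os.remove(path)


# Same 10 MB payload, once as 100k short writes and once as a single long line.
many_small = time_writes([" " * 100] * 100_000)
one_long = time_writes([" " * 10_000_000])
print(f"100k small writes: {many_small:.3f}s, one long write: {one_long:.3f}s")
```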
when it comes to performance, i found that numpy.load is hard to beat
not sure exactly what it does, but it's damn fast and works pretty well
Why do we need to use pkg_resources to load data in __init__.py or any other module inside the package? What happens internally that prevents us from using os.path.join?
├── b.json
└── __init__.py
I tried to load the JSON file in __init__.py by using os.open("a.json") but it does not work. Then I found on Stack Overflow that I have to use pkg_resources to load static data inside the package.
how do i learn to use python ?
Ooh, I also need to write a bunch of lines to a file, of the history of a physics simulation. Did you reach any conclusion about how to do it?
It's a Hangul filler, which is a Unicode character that looks blank
@paper echo muaha.. i invented a new pattern :p
i'm calling it the "buffet pattern"
case = (bool(version), bool(hashes), bool(spec), bool(auto_install))
result = {
    (False, False, False, False): lambda **kwargs: ImportError(Message.cant_import(name)),
    (False, False, False, True): _pebkac_no_version_no_hash,
    (False, False, True, False): _import_public_no_install,
    (False, True, False, False): lambda **kwargs: ImportError(Message.cant_import(name)),
    (True, False, False, False): lambda **kwargs: ImportError(Message.cant_import(name)),
    (False, False, True, True): lambda **kwargs: _auto_install(_import_public_no_install, **kwargs),
    (False, True, True, False): _import_public_no_install,
    (True, True, False, False): lambda **kwargs: ImportError(Message.cant_import(name)),
    (True, False, False, True): _pebkac_version_no_hash,
    (True, False, True, False): lambda **kwargs: _ensure_version(_import_public_no_install, **kwargs),
    (False, True, False, True): lambda **kwargs: RuntimeWarning(Message.cant_import_no_version(name)),
    (False, True, True, True): lambda **kwargs: _auto_install(_import_public_no_install, **kwargs),
    (True, False, True, True): lambda **kwargs: _pebkac_version_no_hash(_ensure_version(_import_public_no_install, **kwargs), **kwargs),
    (True, True, False, True): _auto_install,
    (True, True, True, False): lambda **kwargs: _ensure_version(_import_public_no_install, **kwargs),
    (True, True, True, True): lambda **kwargs: _auto_install(_ensure_version(_import_public_no_install, **kwargs), **kwargs),
}[case](**locals())
# fmt: on
if isinstance(result, ModuleType):
    return _ensure_proxy(result)
else:
    return _fail_or_default(result, default)
because all cases are equal and can take whatever they like :p
Do you guys know if there's a way to statically type "cases"? For example, if A is 0 then I know that B is of type str. If A is 1 then B is of type int
There are no overloads for attributes afaik
the big benefit of having the whole logic in a table instead of a big if-elif tree is that you can see at a glance which cases are handled, no chance to overlook any potential weird combination
You flattened the if/else tree?
yup!
Are you sure lol, like yeah I can tell exactly what combination does what at a glance. But that's very magic
I have done that before
how did you handle combinations/chains of functions from there?
I think I just used lambda: with closures
yeah, that works for small functions
the functions i'm dealing with are slightly bigger ^^
damn auto-install :p
the **kwargs look magical from this perspective, but it's actually nice coming from the signature
def _ensure_version(func, *, name, target_version, **kwargs) -> ModuleType | Exception:
each function can just pick whatever they need from the **locals() that everyone gets offered
that's why i call it the buffet pattern ^^
at that point just use a class with attributes
that was the first approach, this is actually a dramatic simplification
I definitely don't understand the lambda/kwargs thing
I like the dict because it makes it easier to see if you covered all cases
(Although property based testing like Hypothesis can/should also be used to cover them all)
the idea is
[lambda some_thing, **kw: code, lambda other_thing, **kw: code][choice](some_thing=1, other_thing=2)
every function takes what it needs
nods
it's essentially creating closures without nesting the closure into the scope it's closing over
which is an idea for sure, not sure how much I like it.
well, you need it if you prefer module-level functions instead of putting everything inside a big method
tried both approaches, the module-global approach is slightly more readable imo
depends on how much code you actually have, i guess
if things fit inside lambdas or generators, the table alone might suffice
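For anyone who didn't follow the lambda/kwargs part, here is a stripped-down sketch of the table idea with two flags instead of four. All names (`use`, `describe`, `check_version`) are invented for illustration and are not the ones from the snippet above; the point is only that every handler is offered all locals and declares the ones it wants.

```python
def describe(*, name, **kwargs):
    # Picks only `name` from the buffet and ignores everything else.
    return f"importing {name}"


def check_version(*, name, version, **kwargs):
    # Picks `name` and `version`.
    return f"importing {name}=={version}"


def use(name, version=None, hashes=None):
    case = (bool(version), bool(hashes))
    table = {
        (False, False): describe,
        (False, True): lambda **kw: ValueError("hashes need a version"),
        (True, False): check_version,
        (True, True): lambda *, hashes, **kw: f"{check_version(**kw)} ({len(hashes)} hashes)",
    }
    # Offer every local name; each handler takes what its signature declares.
    return table[case](**locals())


print(use("numpy"))
print(use("numpy", version="1.2", hashes=["a", "b"]))
```

Every combination of flags is visible as one row, and the module-level handlers stay testable on their own.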
I've got a question to the room. is there any advantage to using decimal module over fractions for rational number math?
it's different
both are capable of doing rational number math. im sure they can be compared
for what it's worth, i can't think of one
Well do you want to represent big decimals, or fractions?
decimal is needed to do lawful (literally lawful, as that is how the math is specified in law) financial math afaik
that's interesting, never knew that
i think primarily for the sake of discussion, the only concern is accuracy in the calculations itself, and either fraction or decimal as the end result is acceptable
I think it simply boils down to what it represents.
Does it represent 1/3 or 0.3333333333333
oh actually that is a good shout.
ultimately fractions would be very slightly better in case you run into calculations that give ... uh.. im blanking out on the term.. infinite..decimals?
decimals can represent any value within their precision, but you still have the same issues as floats for repeating decimals
which does make it seem stacked against decimal. so does decimal have anything else that makes it advantageous?
from the top of my head it misses some things like infs or nans, and while any number you can handle with decimal should be able to be represented as a fraction I don't think that makes sense in most cases
There should be a standard that the module implements which should provide more detail on what the individual features of the type are and maybe what they address
could you crash a computer by trying to represent pi as a fraction?
So I'm planning to connect computers together through a wireless network and let them train an NN model together. Do you think there is any way to do so?
from the description decimal offers things like significant places, controlling rounding, the precision etc. that accommodate financial and similar computing like lakmatiol mentioned which I don't think would make sense with fractions
I guess in the same way as if you went on calculating all the digits with any algo and type, any number you can actually get the decimal value of be represented with a fraction
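To make the difference concrete, a small sketch using only the standard library. The precision of 10 digits and the price value are arbitrary choices for the demo, not anything from the discussion.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext
from fractions import Fraction

# Fraction keeps 1/3 exact, so multiplying back gives exactly 1.
third_f = Fraction(1, 3)
print(third_f * 3 == 1)

# Decimal rounds to the context precision, so the repeating digits get cut off.
with localcontext() as ctx:
    ctx.prec = 10
    third_d = Decimal(1) / Decimal(3)
print(third_d)
print(third_d * 3 == 1)

# What Decimal offers in exchange: explicit control over places and rounding mode.
price = Decimal("2.675")
rounded = price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)
```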
i guess this is too advanced for a help channel (which i just learned the hard way) - how can i get all keyword args from inside the called function?
as in
def foo(*, a, b, **kwargs):
    print(<all keyword args, not just kwargs>)
You could inspect the signature and then fetch the names that are kw only, don't think you can differentiate it in the function if it is a positional or keyword though (without horrible hacks)
the accepted answer on SO mentions a function that has been deprecated since, then i tried looking at the inspect docs and found it quite confusing to figure out how it's supposed to fit together
other answers on SO say "you can't, just rewrite your function" :p
i think i should make a note in my dev diary when i figured this out..
!e
import inspect

def get_keyword_param_names(func):
    for name, param in inspect.signature(func).parameters.items():
        if (
            param.kind is inspect.Parameter.KEYWORD_ONLY
            or param.kind is inspect.Parameter.VAR_KEYWORD
        ):
            yield name

def foo(*, a, b, **kwargs):
    for name in get_keyword_param_names(foo):
        print(f"{name}={locals()[name]}")

foo(a=1, b=2, c=3, d=4)
@peak spoke :white_check_mark: Your eval job has completed with return code 0.
001 | a=1
002 | b=2
003 | kwargs={'c': 3, 'd': 4}
Not yet, first i gotta figure out what all these profiler numbers mean lmao
If you get there before me give me a @
definitely more convoluted than i anticipated
but it makes sense
there may be something easier as that's just what came to mind, but I don't think you can differentiate them purely from within the function body as that just receives the names
heh
You will probably figure it out first, I haven't even begun writing my simulation yet
def get_keyword_params(func, locs):
    return {name: locs[name] for name, param in inspect.signature(func).parameters.items()
            if (
                param.kind is inspect.Parameter.KEYWORD_ONLY
                or param.kind is inspect.Parameter.VAR_KEYWORD
            )}

def foo(*, a, b, **kwargs):
    return get_keyword_params(foo, locals())

>>> foo(a=2, b=3)
{'a': 2, 'b': 3, 'kwargs': {}}
i like it
is it possible to feed a function as an argument for another function in python ?
eg:
def foo(fun):
    fun
    print("ran a function")
foo(fun())
sure
does it have to have parentheses so it would recognize it as a function ?
no, just pass the function itself and assign the result of the call
def foo(func, args):
    result = func(args)
    print("called func with result", result)
    return result
it's basically the recipe for decorators
and functools
and pipes
and my buffet pattern
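Since decorators came up: a minimal sketch of how "pass a function, call it inside" turns into one. `announce`, `add` and `apply` are made-up names for the demo.

```python
import functools


def announce(func):
    """Take a function, return a wrapped function that reports each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print(f"called {func.__name__} with result {result!r}")
        return result
    return wrapper


@announce  # same as: add = announce(add)
def add(a, b):
    return a + b


def apply(func, *args):
    # `func` is passed without parentheses; the call happens here.
    return func(*args)


add(1, 2)
print(apply(max, 1, 5))
```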
How hard is it to contribute to CPython
the codebase is kind of big though so I don't know if I would even find something
The process itself doesn't look too difficult
I've contributed some documentation improvements. The process isn't bad at all.
guys
i made a notepad and calculator using tkinter
now i want to compile them like in one interface
how do i do that?
just curious, do you happen to have any resource that mentions this by any chance? (and perfectly okay if you dont) Also, just to clarify, when you say this is for lawful math, im assuming the distinction is in using decimal/float (things with dots) for calculations (ie mathematical decimals), not fractions. but floats would be fine too lawfully despite the precision errors. is that interpretation correct?
This is IMHO a bit of a half truth, the statement about other languages not having an optional function call operator
Kotlin uses extension functions a lot so it's less needed, but even if you did want to call a free function in a null aware way, scope functions bridge that gap
x?.let { foo(it) } is a null aware call of foo on x
So it's just not very needed
That said I didn't really understand how traversing class hierarchies was really related... Probably I need more context
You have this flipped. There's almost never a reason to use Fraction. The advantage of Fraction is that it represents rational numbers exactly, at the expense of numerators and denominators that can grow without bound, so even simple arithmetic can become arbitrarily slow and memory-hungry. Using Decimal instead, you can approximate irrational numbers like pi as well as rational numbers, by rounding them off to a chosen number of significant digits. In real life (physics, engineering, biology, chemistry, etc), that's what's usually desired.
I see. (and just for context, i dont have a specific stake in this, primarily just trying to get some clarity on why to use one or the other)
@anyone, is the question vague or I should ask in stack overflow?
Because using os.path.join() and then open assumes that the files are in the same directory on disk, but Python packages aren't always directories on disk. A package can be imported from a zip file instead, for instance, in which case that approach wouldn't work.
In fairness, in most real life what you want is a smart algorithm and dumb numbers
Not a dumb algorithm and smart numbers
(usually of course)
It doesn't answer the question I have asked. But thanks I got some insights.
Ah, you tried to use open("a.json") instead of open(os.path.join(os.path.dirname(__file__), "a.json"))? The former would look for the file in the program's current directory, the latter would look in the directory that contains the __init__.py
but also, I wouldn't use pkg_resources these days. https://docs.python.org/3/library/pkgutil.html#pkgutil.get_data is part of the standard library, which is much nicer than depending on a third party library for this.
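A minimal sketch of the `pkgutil.get_data` route. The package `demo_pkg` and its `a.json` are created in a temp directory just so the example runs on its own; in a real project the package already exists and only the last three lines matter.

```python
import json
import pkgutil
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical name: demo_pkg).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "a.json").write_text(json.dumps({"answer": 42}))
sys.path.insert(0, str(root))

# get_data asks the package's loader for the resource and returns bytes,
# so it does not care whether the package lives in a directory or a zip file.
raw = pkgutil.get_data("demo_pkg", "a.json")
config = json.loads(raw)
print(config)
```

On Python 3.9+ `importlib.resources.files` is another standard library option for the same job.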
What's the intuition on why python programs don't have constant runtime ?
constant runtime?
os things happening, gc runs
I'm using the wrong words here, I mean every time you run a program the time to run it changes
And I'd like to explain why
for the same input?
yes
that'll be the case for any language within small amounts
is it a hypothetical question or for a real program?
general / hypothetical
well, the most obvious (yet non-obvious) reason could be involving randomness
I'd like to explain it with software running in background, memory states etc but I'm not sure how "rigorous" this is
external factors play a big role in any practical application
especially swapping comes to mind
I've got very little knowledge about python's garbage collection etc
also other applications using/hogging the cpu
for instance, if you plot the execution time of any simple function operating on a number of items, you'll notice that the more items, the greater the "roughness" of the plot, upwards
that effect could be explained to a big degree by the likeliness that the cpu is used by another process before completion of the function
so, the longer the program takes (ideally), the greater the chance that random processes ask for OS resources that are outside of your programs' control
many programs involve RNG for initialization etc, so the actual input might actually be different without the user realizing
just thinking of training of neural networks
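The run-to-run spread is easy to see for yourself. A rough sketch, with an arbitrary `work` function and 20 repetitions; the actual numbers will differ on every machine and every run, which is the whole point.

```python
import statistics
import time


def work(n=200_000):
    # Deterministic CPU-bound work: same input, same result every time.
    return sum(i * i for i in range(n))


timings = []
for _ in range(20):
    start = time.perf_counter()
    work()
    timings.append(time.perf_counter() - start)

print(f"min   = {min(timings):.4f}s")
print(f"max   = {max(timings):.4f}s")
print(f"stdev = {statistics.stdev(timings):.4f}s")
```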
GC runs are deterministic, I believe, so they shouldn't be playing any role in the performance differences observed when running the same program with the same input.
One big answer is that your computer has way more programs running on it than it has cores to run programs on, and so it switches back and forth between programs very quickly to maintain the illusion that they're all running at the same time. There's a cost paid every time it switches, and the amount it switches depends on what other programs on the machine are doing.
source? (i don't doubt you i just wanna read more)
Another big answer is that, when your program needs to allocate memory, it's competing with all of the other processes on the machine that also want to allocate memory. Sometimes when your program allocates 10 blocks of memory it gets 10 blocks that are all right next to each other, sometimes it gets 10 blocks that are far apart from each other. The CPU's L1 and L2 caches work much better if the memory is in one spot than if it's in 10.
I can't swear that there's no part of it that's non-deterministic - I'm not an expert. But, my understanding is that objects that are tracked by the GC are added to a linked list as they're created, which puts them in a deterministic order, and that the GC is run based on deterministic statistics about things that have already happened so far in the program's run - number of allocations and deallocations, number of bytecode instructions that have been run, stuff like that.
and it's run in the main thread as part of the interpreter's eval loop (in between bytecode instructions), not in a separate thread, so there's no non-deterministic context switching involved. It's at least mostly deterministic.
i'd like to point out that in previous python versions dict keys were randomly shuffled on purpose, so even in a simple program runtimes could theoretically differ wildly, depending on context and inputs
just saying that my point about random initialization and algorithms is more common than you might realize
Was that just so that people didn't think they were deterministic and started relying on the behaviour?
yeah, basically trying to educate people
good thing they gave up and just went for ordered dicts ^^
that wasn't trying to educate people, it was trying to prevent a security hole. If a user knew the hash algorithm in use and selected inputs for a program that would be stored into a dict, the user could provoke worst-case dictionary performance by choosing inputs known to have hash collisions
and this is still true - the hashes of str and bytes objects are still seeded randomly to prevent that attack.
that shouldn't make much of a performance difference, though. On average you'd expect the number of collisions to be small, and as long as the number of collisions is small the hash table still has O(1) performance.
specifically, the hash randomization is to prevent http://ocert.org/advisories/ocert-2011-003.html according to the note at the bottom of https://docs.python.org/3/reference/datamodel.html#object.__hash__
I do agree with the general point that there's hidden non-determinism in storing things in dicts. I would just expect it to be a much smaller factor in performance differences than things like physical memory locality and context switching.
to your point, in fact, the string hash seed is a hidden source of non-determinism in the GC algorithm. The GC walks objects in a deterministic order, but the order it walks a dict in is based on the internal order of the hash table, which is based on the hashes of the inserted objects, which changes based on the hash seed.
So, I'm wrong, that's definitely a major source of non-determinism in the GC.
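The seeding is easy to observe by starting fresh interpreters with different `PYTHONHASHSEED` values. A small sketch; `hashes_with_seed` is a made-up helper and the seeds 1 and 2 are arbitrary.

```python
import os
import subprocess
import sys


def hashes_with_seed(seed):
    """Return (hash('spam'), hash(12345)) as computed by a fresh interpreter."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, "-c", "print(hash('spam'), hash(12345))"],
        env=env, capture_output=True, text=True, check=True,
    ).stdout.split()
    return int(out[0]), int(out[1])


str_a, int_a = hashes_with_seed(1)
str_b, int_b = hashes_with_seed(2)
print("str hashes differ:", str_a != str_b)
print("int hashes equal: ", int_a == int_b)
```

Without the environment variable each process picks a random seed, so the str hash changes from run to run while the int hash stays put.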
@charred pilot
oh cool, the rust docs have a section on how they choose a hash function to prevent that kind of attack
makes sense, although i seem to remember that i heard in a pycon talk about dicts that educating people also played a role in the decision, but maybe my brain just filled in that gap..
there was a (rejected) proposal to do that when dicts began to preserve insertion order. One argument ran that dict iterators should emit elements in insertion order in order to be maximally efficient, and because it's a useful property to have. Another argument ran that, if we did that in CPython, then users would come to expect that behavior from all Python implementations, and so it would be better if CPython's dict iterators returned elements in a random order even though it knows the insertion order
there's this https://www.youtube.com/watch?v=p33CVV29OG8, which is very informative
Abstract
Python's dictionaries are stunningly good. Over the years, many great ideas have combined together to produce the modern implementation in Python 3.6. This fun talk is given by Raymond Hettinger, the Python core developer responsible for the set implementation and who designed the compact-and-ordered dict implemented in CPython for Pyth...
in the end, the first argument won, and we handled the objection from the second argument by mandating that all Python implementations must behave the way CPython does and preserve insertion order.
leaves me wondering if that opens up python dicts to that kind of attack?
no - attackers can control the iteration order now, but they still can't cause hash collisions
and it was the collisions that were the vulnerability, since a hash table with many collisions has drastically different performance characteristics from what a hash table is supposed to have.
yeah, i understand
hmm.. you said that the hashes of str and bytes are seeded randomly, but what about dicts that use ints or id() as keys?
there's no randomization for those. The assumption is that programs that allow the user to enter strings that are stored in hash tables are much more common than programs that allow the user to enter ints that are.
in fact, the fact that hashes for ints are predictable is seen as a good thing, according to comments in the CPython source code...
God I've seen like all of his talks on dictionaries
coding by assumption.. not sure if that's a good idea ^^
"This isn't necessarily bad!"
https://github.com/python/cpython/blob/main/Objects/dictobject.c#L139-L151 - and check out the rest of this comment as well.
Objects/dictobject.c lines 139 to 151
```c
Major subtleties ahead:  Most hash schemes depend on having a "good" hash
function, in the sense of simulating randomness.  Python doesn't:  its most
important hash functions (for ints) are very regular in common
cases:

  >>>[hash(i) for i in range(4)]
  [0, 1, 2, 3]

This isn't necessarily bad!  To the contrary, in a table of size 2**i, taking
the low-order i bits as the initial table index is extremely fast, and there
are no collisions at all for dicts indexed by a contiguous range of ints. So
this gives better-than-random behavior in common cases, and that's very
desirable.
```
Also makes it easier to write a new numeric type
There's always tradeoffs. With predictable hashes for ints, we gain performance but potentially lose some security. But also, note that in order to exploit this, the user would need to enter very different ints. The first int that hashes to the same value as 1 should be somewhere around 2**61 on a 64-bit build, I think (ints are hashed modulo 2**61 - 1).
Very, very good point.
how often do you need to write a new numeric type? o.O
When two number types compare equal - like 1 == Fraction(1, 1) - they're required to have the same hash. The one constraint that Python puts on hash functions is that equal objects must have equal hashes.
which means the author of Fraction needs to know how ints are hashed.
numpy has around a dozen, the standard library has... at least half a dozen?
float, complex, int, Fraction, Decimal
i'd say float doesn't count in there
although, not sure how exactly floats are hashed tbh
1.0 And 1 hash the same
bool 
As does the ctypes 1 and Fraction
True and 1 are required to hash the same as well, since they compare equal.
!e ```py
print(1 == True)
print(hash(1))
print(hash(True))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | True
002 | 1
003 | 1
There is a section in the docs somewhere, "Hashing of numeric types", which describes how to hash things
>>> hash(0.1)
230584300921369408
>>> hash(0.01)
23058430092136940
>>> hash(1.1)
230584300921369601
>>> hash(1.2)
461168601842738689
there's some pattern
!e ```py
print(hash(2.0))
print(hash(2))
```
@raven ridge :white_check_mark: Your eval job has completed with return code 0.
001 | 2
002 | 2
the ones that are able to be represented as ints have the same hash as the corresponding ints.
yeah, figured as much
the float hash function has some serious trickery to make that work.
ello
not just to make it work, but to figure it out quickly.
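The "trickery" is documented: a rational number m/n is hashed as m times the modular inverse of n, modulo the prime in `sys.hash_info.modulus`. Below is a pure-Python sketch that follows the recipe in the "Hashing of numeric types" docs section; `hash_fraction` here is a local reimplementation for illustration, not something to import, and the real float hash is written in C.

```python
import sys
from fractions import Fraction


def hash_fraction(m, n):
    """Hash of the rational number m / n (n > 0), per the documented rules."""
    P = sys.hash_info.modulus
    # Remove common factors of P.
    while m % P == n % P == 0:
        m, n = m // P, n // P
    if n % P == 0:
        hash_value = sys.hash_info.inf
    else:
        # pow(n, P - 2, P) is the inverse of n modulo the prime P.
        hash_value = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        hash_value = -hash_value
    if hash_value == -1:
        hash_value = -2  # -1 is reserved as an error marker in CPython
    return hash_value


# A float is an exact ratio of two ints, so the same rule applies to it.
print(hash(0.1) == hash_fraction(*(0.1).as_integer_ratio()))
print(hash(Fraction(3, 7)) == hash_fraction(3, 7))
print(hash(2.0) == hash_fraction(2, 1) == hash(2))
```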
anyone here wanna do me a solid?
this is a discussion channel, not a help channel. Specifically, the topic is the Python language itself, and various implementations of the language.
Not sure if this is advanced topic. I am wondering why we need a module like Pydantic for runtime, when we already have Typing, and MyPy? I am curious how others out there use it. Do we use Typing&MyPy or Pydantic&MyPy?
All of the above
Typing provides some primitives for writing type hints
Pydantic is a convenient way to derive runtime checks from static type hints
if you use mypyc it raises TypeError at runtime, i hope and expect this supersedes Pydantic
You need runtime validation to ensure that the types are valid
Otherwise when you consume data from the outside world, you can never prove that your data has the types you expect
@wet void
Oh so it would be beneficial to use all of it in my projects?
yeah, salt lamp is right. whenever you interact with something that isn't in your source code, like some json response, or a user written config, you have no way to verify the type statically
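To illustrate the "edge of the application" point without any third-party dependency, here is a stdlib-only sketch of what such a validation step does. This is not Pydantic's API; `Quote` and `parse_quote` are invented for the example, and it only handles flat fields annotated with plain classes.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Quote:
    ticker: str
    price: float


def parse_quote(raw):
    """Turn untrusted JSON text into a Quote, or raise."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object")
    values = {}
    for field in fields(Quote):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        value = data[field.name]
        # field.type is a real class here because the annotations are plain classes.
        if not isinstance(value, field.type):
            raise TypeError(f"{field.name} should be {field.type.__name__}")
        values[field.name] = value
    return Quote(**values)


good = parse_quote('{"ticker": "ABC", "price": 1.5}')
print(good)
```

A static checker can confirm that code using `good.price` treats it as a float, but only the runtime check can confirm that the JSON actually contained one.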
what are the banned characters in module names?
Module names have to be identifiers.
https://docs.python.org/3/reference/simple_stmts.html#the-import-statement
import_stmt     ::= "import" module ["as" identifier] ("," module ["as" identifier])*
                    | "from" relative_module "import" identifier ["as" identifier]
                    ("," identifier ["as" identifier])*
                    | "from" relative_module "import" "(" identifier ["as" identifier]
                    ("," identifier ["as" identifier])* [","] ")"
                    | "from" relative_module "import" "*"
module          ::= (identifier ".")* identifier
relative_module ::= "."* module | "."+
@white nexus
It's not a question of what's banned, but a question of the huge range of characters that is allowed as a python identifier
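A quick way to check a candidate name before renaming directories, sketched with `str.isidentifier` and the `keyword` module. `is_importable_path` is a made-up helper; it only checks what the `import` statement's grammar accepts, not whether the module exists.

```python
import keyword


def is_importable_path(dotted):
    """True if every dotted component is an identifier and not a keyword."""
    return all(
        part.isidentifier() and not keyword.iskeyword(part)
        for part in dotted.split(".")
    )


for candidate in (
    "modmail.plugins.@gateway_logger.gateway_logger",
    "modmail.plugins._gateway_logger.gateway_logger",
    "modmail.plugins.class",
):
    print(candidate, is_importable_path(candidate))
```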
aha
I guess... what's a character that I could use for a local symbol?
I was using @, but get a syntax error when trying to import that path
>>> import modmail.plugins.@gateway_logger.gateway_logger
  File "<stdin>", line 1
    import modmail.plugins.@gateway_logger.gateway_logger
                           ^
SyntaxError: invalid syntax
already in use for a different purpose 
what are some other characters that work?
I mean, Ig I could use _
@paper echo
like a symbol that means something is local instead of global or something
_
just wanted to drop an update: I did away with the directory prefixes all together
So does Cython. Still, if you annotate something as Dict then it's a dictionary. PyDantic can validate this dictionary, I don't think MyPyC does that?
I was under the impression that the purpose of pydantic is to serialize external data (like JSON) into your own objects, with all the proper validation
because once you get into Python objects, type validation might become tricky and expensive
yeah, it's just a way to describe a schema and ensure it is handled correctly. It doesn't exactly do runtime type checking.
remember, you can't figure out the type of list elements in any case.
unless the programmer tells you
or a compiler in a statically typed language
https://www.youtube.com/watch?v=qCGofLIzX6g <- wow.. just wow.. o.O
Can i ask about runtimewarnings here
Or should i open a help channel/ask in topical chat
Not if it's from an external source returned by some parser
If it's highly technical or about python internals this is a good place
Wdym? I meant, like, if you see a [Integer] in haskell, the compiler will guarantee that every item in the list is an integer
The prior art in this case is the pre-type-hints library Marshmallow as well as Jsonschema. Pydantic is the first time someone has put together runtime deserialization/validation with the attrs/dataclass pattern, in a single library with a tidy/friendly api; also building fastapi on top of it helped popularize it a lot
If you're parsing json you can't know what type it should serialize to. The parser internally at some point has to reject json that isn't [Integer], if your application calls for [Integer]
well, yes
the context of the discussion was that checking your own internal types at runtime is tricky and sometimes impossible
That's what pydantic is best used for, validation of unknown data at the edges of your application
Internal?
Oh
Yeah for infinite streams and such
Although you can design around that / this is why both are good
that's a very insightful talk, thanks for sharing
yw, yeah, it's brilliant
makes me even more wonder how GvR intends to improve python's performance now
transpile into JavaScript!
hehe
Make pypy more popular and force these handful of cursed libraries to adapt
i like that idea
by cursed you mean the most popular ones, right
yeah, it'd be good to make pypy faster with fast stuff
I don't remember exactly what the issues were but I remember it seemed like stuff most libraries didn't need
Maybe pypy needs the deep stack introspection capabilities
Fwiw common lisp implementations are typically much faster than python and also offer very very deep runtime introspection. I always wondered how that worked
tbh i kind of rely on inspect for quite a number of things these days, no idea how else i'd achieve certain goals without
it's very convenient to say the least
but that a + b is a lie is shocking
i never dived that far into dis
Can't you do introspection statically?
you could, if you removed some flexibility
my guess is the REPL has been a large factor in the early design decisions
if you throw out some of the flexibility required for the REPL, it might make things easier to reason about
not sure how much jupyter relies on this flexibility
if jupyter wasn't affected, we could propose to kill the python shell for the good of everyone ^^
how do statically typed languages do it?
statically typed?
I think some statically typed languages have reflection built into the compiler and runtime
statically typed != static introspection
So either you can leverage the compiler statically, or not all of the type information is erased at runtime so you can introspect on it at runtime
I would imagine that inspect is and ought to be blessed as part of the core runtime
I haven't watched that talk in a while, but I remember that he didn't give specific examples of popular libraries abusing these leaky internals and I felt disappointed, because I felt like I came away without an understanding of the solution space, only the problem
there are a couple of examples, yes
ye
but what dynamic aspects
are needed for the REPL specifically?
well, i guess the first thing that comes to mind is you need a stack
Well again common lisp implementations have deep stack introspection without slowness
Actually i wonder about implementing python in common lisp, if it'd be slow
https://www.python.org/dev/peps/pep-0219/ is an interesting idea
i'm just trying to google for stackless python REPL, but i can't find anything
i wonder how EVE online does it
Well stackless is an entirely different approach to the internals
But what I'm suggesting is literally implementing at least a subset of the core language on top of common lisp hash tables
I think the difference with R for example is that R uses f-exprs, so expressions are never even evaluated until they are used
Like the entire language runtime is in a continuous state of macro expansion
So understandably that is slow and pretty much impossible to optimize
i wonder if it's possible to go all in on the AST and just use pointers to leaves instead of the stack
I admittedly have no idea how it works in lisp
it feels like it should be possible
maybe that's how the jit in pypy works?
at least PyPy seems to be stackless, so
does PyPy have a REPL?
seems like it has
Common Lisp definitely has a call stack
And yes Pypy's repl I think is equivalent to cpython
then my point about the REPL dictating some design decisions most probably is wrong
how can PyPy have debugging capabilities without the stack?
What does the repl require that makes python slow?
Again, i am still interested why common lisp has stack inspection and even arbitrary control flow to arbitrary points in the stack, but sbcl can emit reasonably efficient machine code
I've heard something about how lisp having a clearly defined "top level form" helps
Something about having these small and well-defined compilation units
the question earlier was whether you could implement inspect based on static analysis, but if you have a repl, there's little you can analyse in advance
Sure, but I think we agreed that even statically compiled and statically typed languages can still offer runtime introspection capabilities
Also you do have stuff like a C++ repl but that's extra weird
Idk how the haskell repl works but I know that entire language doesn't have a call stack
I'll maybe try to ask on a forum somewhere
It seems the answer is pretty much what I said
The ability to compile very small top level compilation units
And some of the comments make a good point that Cython is actually quite fast
This makes me want to benchmark mypyc and nuitka
when do we get a pony implementation of python
I would be really curious about something like a web framework throughput test, or some kind of numerical loopy stuff
Forget Rust and Go!
kill all locks
Lisp is the lost server programming language
The lost gospel, consumed in religious fervor during the AI craze of the 90s
Its true teachings cast aside by its zealous followers
Server and numerical*
there is something nice and simple about lisp
Lets say i have a number of python classes all to be used at different times depending on the structure of data
Why would I choose to use importlib to dynamically load only the one i need instead of importing all of them and using, lets say a mapping, to decide which one i need
Is there maybe some hidden benefit?
The classes are all different parsers to be used depending on the text file
the entire module is exec'ed no matter what
so i'd honestly just put them all in a dict or whatever and import the dict
nope
it's called optional dependencies
you can check if your platform supports certain features and then import depending on that
or in a CLI if a certain flag was given, you might not need to load a big module
it's not uncommon
sometimes the code doesn't actually require certain packages to be installed but can deal with different backends (audio encoding, plotting comes to mind)
there are some reasons to do conditional and/or lazy import in libraries with breadth of (optional) features, applications not as much
i'd say that depends on the application ^^
GHC can compile haskell in 'compiled' and 'interpreted' modes
like, if you have an audio-encoding app that can deal with different encoder-backends, you could say you try to import based on the setting
plotting apps often have different backends
plugins might be loaded depending on settings after the app has been started
imo you should try to make the backends "enable-able" at runtime
(if possible)
i don't like the idea of doing things upon import
although maybe certain things don't make sense to import if certain dependencies aren't met, e.g. a shared library
most optional features i see are implemented with try except, not what op described re: importlib
importlib is for not knowing the name of the import until runtime
and/or potentially boilerplate reduction
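A minimal sketch of the two patterns contrasted above: a try/except guard for an optional dependency, and `importlib.import_module` for a module whose name is only known at runtime. The package name `some_fast_json_lib` and the helper `load_backend` are hypothetical and only serve as illustration.

```python
import importlib

# Pattern 1: optional dependency guarded by try/except.
# "some_fast_json_lib" is a hypothetical third-party package name.
try:
    import some_fast_json_lib as json_backend
except ImportError:
    import json as json_backend  # fall back to the standard library


def load_backend(name):
    """Import a module whose name is only known at runtime."""
    return importlib.import_module(name)


# Pattern 2: the module name comes from configuration or a CLI flag.
backend = load_backend("json")
encoded = backend.dumps({"ticker": "ABC"})
```

When the set of parsers and their names is fixed up front, as in the question, neither pattern is needed and a plain mapping is enough.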
In this case, i have a list of known filetypes and the parsers they require and i also know the names of each parser, i dont see how anything needs to be dynamic here besides "dispatch" of parser via mapping
The parsers arent even that big, there are 5-6 of them and 5-600 lines each
sounds like the work of a bored engineer
i am BEGINning to understand
i'd keep them in a dict then
parsers = {
    '.awk': AwkParser,
    '.py': PythonParser,
    '.cpp': CppParser,
    '.hxx': CppParser,
    '.hpp': CppParser,
    '.h': CParser,
    '.c': CParser,
}
prefixed by sth like from .parsers.awk import AwkParser etc.
i have a little problem with packaging.version.. it doesn't allow Version(Version("1")), so i'd like to make a subclass that fixes this issue by overwriting __new__ to check if the argument is a Version and return it directly
from packaging.version import Version as pkv
class Version(pkv):
    def __new__(cls, *args, **kwargs):
        if isinstance(args[0], Version):
            return args[0]
        else:
            return ???
any ideas?
super new?
tried that, but..
here:
class Version(pkv):
    def __new__(cls, *args, **kwargs):
        if isinstance(args[0], Version):
            return args[0]
        else:
            return super(cls, Version).__new__(Version)
v = Version(Version("1"))
Traceback (most recent call last):
  File "F:\Dropbox (Privat)\code\justuse\tests\.tests\.test2.py", line 11, in <module>
    v = Version(Version("1"))
  File "G:\Python396\lib\site-packages\packaging\version.py", line 264, in __init__
    match = self._regex.search(version)
TypeError: expected string or bytes-like object
Thats the plan for now and maybe later i can generalise the parser class
Right, your version returns an instance of the class so init is ran which is not what you want
so i need to overwrite init as well to catch it
yeah that seems like the easiest option, not sure how you'd avoid it otherwise from a subclass
Is overwriting init not enough? (Disclaimer, I'm not familiar with Version)
if they don't define new on their own then yeah just skipping the init in your overwritten method should be viable
Ie don't mess with new
if arg is version return
Let me get on my laptop in a few and poke around , out currently. I suppose your question is around how to make the instance stick considering init doesn't do returns
yeah
ah right that'd create an uninitialized obj
ty
What's funny is, this could honestly just be a function if you wanted couldn't it
i've meddled with __new__ a while ago for singletons, but this is a different kind of weird
if i didn't want to overload other methods of Version, yeah ^^
yeah, you can generalize with a function that looks up a parser by name (or enum), and uses a dict internally
tries not to get distracted by the mention of singletons....
it's not singletons, it's not!
i swear nedbat has a trigger set on "singletons"
funny thing is, i just dropped in here for the first time in a few days, and there it was.
your timing is impeccable
i guess i felt a disturbance in the Force...
well, maybe you can help with my question, nedbat?
I've hardly ever written __new__
from packaging.version import Version as pkv
def Version(*args, **kwargs):
    if isinstance(args[0], pkv):
        return args[0]
    return pkv(*args, **kwargs)
v = Version(Version("1"))
Why didn't you use argument-less super()?
so, honestly this seems cleanest to me to start with. are there any issues with this approach?
overriding both init and new should work if the subclass is wanted
that's method-less o.O
oh, it's a function..?
what..
yep, part of the "keep it simple" approach i suppose. exhibit A.
but i need to subclass Version for other reasons :p
well, you could still freely subclass it and stuff, as pkv. but okay..let's see.
damn, that uppercase Version confused me after the def
i do think you cant do it with just init then, there might be some hacks for it but now i can't think of a workaround.
Why doesn't this work?
class Version(pkv):
    def __new__(cls, *args, **kwargs):
        if isinstance(args[0], Version):
            return args[0]
        else:
            return super().__new__(*args, **kwargs)
Traceback (most recent call last):
  File "F:\Dropbox (Privat)\code\justuse\tests\.tests\.test2.py", line 12, in <module>
    v = Version(Version("1"))
  File "F:\Dropbox (Privat)\code\justuse\tests\.tests\.test2.py", line 8, in __new__
    return super().__new__(*args, **kwargs)
TypeError: object.__new__(X): X is not a type object (str)
huh i get a different error message
python396 here
you'd only pass cls to the new if they don't override it
oh nvm, i can repro now
from packaging.version import Version as pkv
class Version(pkv):
    def __new__(cls, *args, **kwargs):
        if isinstance(args[0], Version):
            args[0]._pre_instantiated = True
            return args[0]
        else:
            # argument-less super() also keeps working for subclasses of Version
            instance = super().__new__(cls)
            instance._pre_instantiated = False
            return instance
    def __init__(self, *args, **kwargs):
        if self._pre_instantiated:
            return
        super().__init__(*args, **kwargs)
v = Version(Version("1"))
Here's one way to do it
All im doing is setting a flag and using it.
uhh, let's see!
you can check the arg in the init to avoid the attr
i didn't understand, could you elaborate?
yes!
from packaging.version import Version as pkv
class Version(pkv):
    def __new__(cls, *args, **kwargs):
        if isinstance(args[0], Version):
            return args[0]
        else:
            # argument-less super() also keeps working for subclasses of Version
            instance = super().__new__(cls)
            return instance
    def __init__(self, *args, **kwargs):
        if isinstance(args[0], Version):
            return
        super().__init__(*args, **kwargs)
v = Version(Version("1"))
that's excellent
oh yep
so i suppose i dont understand the black magic that init and new do. to me, it's surprising that init is getting args[0] as an instance of Version.
the init and new will receive the same args, and init is called if new returns an instance of the class
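A small sketch of that call protocol, using a hypothetical `Tracked` class that records which hook ran and with which arguments. It is meant only to show that `__init__` runs again, with the same arguments, when `__new__` hands back an existing instance of the class.

```python
calls = []


class Tracked:
    def __new__(cls, *args, **kwargs):
        calls.append(("new", args))
        # Returning an existing instance of cls still triggers __init__.
        if args and isinstance(args[0], cls):
            return args[0]
        return super().__new__(cls)

    def __init__(self, *args, **kwargs):
        calls.append(("init", args))


first = Tracked("1")
second = Tracked(first)  # __new__ returns `first`; __init__ runs again on it
```

This is why the `__init__` in the subclass above has to check its argument too: skipping work in `__new__` alone does not stop the second initialisation.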
the python OO system is actually less magical than in a lot of other languages i think
less reliance on magic scoping rules, except in the body of a class statement
There's also __class__ being implicitly in scope inside methods
@paper echo tried a simple (and naive with only names) prepared format, and even with this it's on par (string with many replacement fields) or a lot better (long strings with few replacement fields) than .format. Wonder if a string method that'd do something like this but for the full format spec would be useful considering the templates are usually the only use of .format now
import re
BRACE_RE = re.compile("({.+?})")
cdef class PreparedFormat:
    cdef list _string_parts
    cdef list _replacement_fields
    def __init__(self, string):
        self._string_parts = []
        self._replacement_fields = []
        # split keeps the "{name}" groups, so literals and fields alternate
        for group in BRACE_RE.split(string):
            if group.startswith("{"):
                self._replacement_fields.append(group[1:-1])
            else:
                self._string_parts.append(group)
    def format(self, dict map):
        result_arr = []
        for i in range(len(self._replacement_fields)):
            result_arr.append(self._string_parts[i])
            result_arr.append(map[self._replacement_fields[i]])
        result_arr.append(self._string_parts[-1])
        return "".join(result_arr)
cdef?
wanted to look into how cpython does it at first and separate the parts of the process but it was a bit too complicated for a quick test
ah right I cythonized it as that got a nice speedup for free
curious
@spark magnet speaking of singletons and the talk amogorkon shared earlier, an example of how singletons harm: each module name guarantees a corresponding singleton (stored in sys.modules), which means that you can't install different versions of packages side by side (like with npm). Or different packages with the same module name.
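To illustrate the point about one module object per name, here is a short check using the standard library `json` module (chosen arbitrarily): every import of the same name resolves to the single object cached in `sys.modules`.

```python
import importlib
import sys

import json as first_import

second_import = importlib.import_module("json")

# Both names point at the single module object cached in sys.modules.
same_object = first_import is second_import is sys.modules["json"]
```

Because the cache is keyed by module name alone, two distributions that ship a module with the same name cannot both be loaded under that name in one interpreter.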
Like re.compile? I think it's a good idea. Maybe as a 3rd party package because cpython is apparently leery of taking on new functionality due to maintenance burden
@verbal escarp how do you handle this in justuse? Do you bypass sys.modules entirely, or give them unique-ified names?
the str.format implementation is complicated? If you link to the GH source i can look another time
Not sure if complicated was the right word but it uses the PyUnicodeWriter api which I didn't really want to look into https://github.com/python/cpython/blob/d3cc68900dc99966007112f884779895daefc7db/Objects/unicodeobject.c#L14195-L14222
Python/formatter_unicode.c line 1428
_PyUnicode_FormatAdvancedWriter(_PyUnicodeWriter *writer,```
It looks like the actual string formatting functionality isn't part of the C API
we don't cache the modules unless requested explicitly (as_import), but we do keep track of namedtuple("ModInUse", "name mod path spec frame") where imports happened to have a detailed history of where stuff is coming from
it's a bit in flux atm, we might use the db later for this purpose, the internal dependency graph still is more of an idea than reality
How do you handle distribution packages that contain more than one module?
(I am banning the notion of an "import package" from my vocabulary, everything is a module now)
Maybe you should have justuse(..., module=None), where if it's None you try to auto detect which module to load, otherwise it would be a string naming a specific module
Maybe "auto detect" means "get the module with the best-matching name by Levenshtein distance"
Can you explain why? I don't understand why you'd want to do that?
Are we talking about having the same behaviour as like str(str(0))?
urgs, let me pass that question to @lusty scroll, he's more into package handling than me. i'd guess that's related to our protobuf case?
we had a number of test cases that handled version as a str and the original Version only supports Version(str), so those tests broke when i unified everything to Version as it was passing Version(Version(str))
unfortunately its not possible to always do the right thing as far as i know, there's nothing that tells what the "canonical module" is for a package. for example in example-pypi-package, I believe it's 'examplepy.hello1' which is mentioned nowhere in the package contents
since we already subclass Version for other reasons, i thought it'd be an easy fix to allow this behaviour, but it turned out to be harder than anticipated
indeed there isn't. that's why i suggested levenshtein distance. although in 99% of cases there will be exactly 1 top-level module
best you can do afaik is make an educated guess, looking at the top_level.txt for the package and hope it's the __init__.py in that directory or else there's a module with the same name as the package, in the case of protobuf it's google.protobuf, etc. that is what we do for now
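A rough sketch of that kind of educated guess, not the actual justuse code. The helper `top_level_names` is hypothetical; it prefers a `top_level.txt` file when one is present (setuptools writes it, but the wheel format does not require it, as far as I know) and otherwise falls back to the first path component of each archive entry. The in-memory wheel is fabricated purely so the example runs on its own.

```python
import io
import zipfile


def top_level_names(wheel_file):
    """Guess importable top-level names from a wheel's zip entries."""
    with zipfile.ZipFile(wheel_file) as zf:
        names = zf.namelist()
        # Preferred source: the optional top_level.txt metadata file.
        for name in names:
            if name.endswith(".dist-info/top_level.txt"):
                return sorted(zf.read(name).decode().split())
        # Fallback: first path component of every non-metadata entry.
        found = set()
        for name in names:
            head = name.split("/", 1)[0]
            if head.endswith((".dist-info", ".data")):
                continue
            found.add(head[:-3] if head.endswith(".py") else head)
        return sorted(found)


# Build a tiny fake wheel in memory so the sketch is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("examplepy/__init__.py", "")
    zf.writestr("examplepy/hello1.py", "")
    zf.writestr("example_pypi_package-0.1.0.dist-info/METADATA", "")
buffer.seek(0)
guessed = top_level_names(buffer)
```

The fallback ignores compiled extension modules and namespace packages, so it would still need the explicit `module=` escape hatch discussed here.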
yeah namespace packages are an issue too
for examplepy you have to do use("example-pypi-package.examplepy") thanks to that
i see
i'd definitely suggest having a separate parameter for the module name
otherwise you have ambiguity in the parsing of the package name
# option 1
example1 = justuse(('example-pypi-package', 'examplepy'))
# option 2
example2 = justuse('example-pypi-package', module='examplepy')
i think there are a lot of issues with the "magic ." system you just described, and might contribute to greater confusion rather than less
but that's only my opinion and you are the ones actually developing the thing
yeah, i'm a fan of being explicit
yeah, definitely, explicit is better than implicit, and so forth
i always was under the impression that the pointed names were explicit enough
that said, it's kind of a can of worms because now you have to support figuring out the "default" module name from sdists, eggs, wheels, and who knows what else
maybe if there's more than one top-level module it just errors unless you provide module=
then you can add the "smart" logic in later
again, most packages won't have a problem like this, as long as your code can cleanly distinguish between namespace packages and real modules
well, crazy idea, but we could break the convention there, even
we're not restricted by pointed names
the amount of hoops to even figure out where the code is, is fairly crazy
does the wheel format provide a list of python modules?
if it helps make things clearer
i would hope that wheels and eggs at least are the easy cases
i'm sure sdists are harder
and hopefully pep 517 helps?
how does pip do this for legacy packages?
wheels and eggs are definitely preferred, otherwise it's up to pip to make something that's usable
sadly no, just the entries in the zip
oh yeah, i've seen that function - crazy
that's kinda silly
we should go to the circus by the amount of hoops we're jumping through
what's the point of a special binary package format if it doesn't help you actually use packages
there's a file you can put in the egg to tell python it's safe to import which I assume means you've not buried it somewhere in a subdirectory
interesting
i shudder to think the amount of incoherent docs and pip source code you must have had to wade through to figure all this out
nodnod
Yes

but is it actually in the generated wheel file?
Should be
ah, maybe we can simplify our code then
How would you otherwise install it?

it might not say which files are actually python source files
oh this isn't even the spec, this is some deferred thing
where's the damn wheel spec
is there a spec
is there a way to hire a specialist on python packaging? :p jk
aha this looks like the latest accepted spec https://www.python.org/dev/peps/pep-0427/
haha
in the worst case "look at the source"
"parse the filename"
i silently swear at pypi each time i see that one
should mention, just since we've been working on it, the filename changed to allow putting multiple platform tags separated by a dot
if you make justuse depend on setuptools maybe you can just use setuptools.find_namespace_packages on the extracted files
or it looks like there's stuff in here to do that https://docs.python.org/3/library/pkgutil.html
we should definitely look at possibilities to simplify our code, yeah
it's just so damn hard to sift through all docs, reference implementations etc
when i saw how nice zipimport works, i thought everything else would be a piece of cake
but that was only the shallow end of the pool
yeah, if only... maybe we need a new binary package format
one to rule them all!
what four hundred people? pip alone has 552 contributors
I don't think it's possible to do anything without pip executing setup.py (for an sdist) because you can arbitrarily map different packages to separate folder trees
or worse, arbitrarily executing a build backend specified as per pep 517
my biggest fear are packages that need to be compiled on the target machine
but maybe that's just how it is? you download build deps, run the build backend, and hope the build backend is sane
ooh that's a new vector for injecting malicious code - introducing a "typo" in the pyproject.toml build backend specification that runs a malicious build backend that installs a keylogger or crypto miner. have to remember to audit the build_requires in pyproject
it should work in theory, but I haven't tried it
i mean that's literally what pep 517 expects right?
oh god..
also we pass "--no-use-pep517" so we're safe
hm
another pep bites the dust
why wouldn't you always use pep 517, assume setuptools if no pyproject.toml file is there, and let the build backend do the work in your temporary venv?
basically exactly what pip does
afaict the main magic of justuse is installing every package into an isolated site-packages to facilitate a npm-like local dependency tree, right?
true
the worst that could happen is you get a folder with some trash in it, vs executing some build backend that does sudo rm -rf, I guess
true, but i guess that's an open risk with pep 517
i wonder if that's been discussed anywhere
maybe pypi needs to whitelist build backends for security...
could a python program escalate even?
it could prompt you and say "please"
subprocess.run(['/bin/sh', '-li', '-c', 'sudo rm -rf /home'])
why bother with a VM, venv is already "virtual"
i don't trust it to be virtual enough :p
it could slow things down considerably too
~/Desktop$ python foo.py
[sudo] Passwort für amo:
well.. hm.
that's a problem
i mean it's just as bad as a keylogger in a package dep
it's just another list of deps to audit
no explanation, only prompt - what could possibly go wrong
if you're installing some code from the internet, running setup.py, I guess it doesn't get much more dangerous than that at least
it could ask at any point in the future to escalate for that extra bit of functionality
just that users would obediently type the password
I wonder if we can prevent such prompting in our subprocess that runs pip
stdin="/dev/null" ?
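For reference, `subprocess.run` does not take a path string for `stdin`; the standard library provides `subprocess.DEVNULL` for this. A minimal sketch, using a throwaway child process instead of pip, showing that the child sees an empty stdin instead of waiting for input:

```python
import subprocess
import sys

# The child reads stdin; with DEVNULL it sees end-of-file immediately
# instead of waiting for the user to type something.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"],
    stdin=subprocess.DEVNULL,
    capture_output=True,
    text=True,
    check=True,
)
child_saw = result.stdout.strip()
```

One caveat, stated as an assumption and not tested here: sudo normally reads the password from the controlling terminal, not from stdin, so this alone may not suppress a sudo prompt.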
is there any legitimate scenario for a python module to ask for escalation?
there's always some magic like systemd prompting the password etc. so it's probably best to do nothing
also i wonder if we should just ignore it and instead query security panels etc. to check if a package was flagged
i'd propose that security checking is out of scope for justuse, because it's also out of scope for pip and setuptools
API? for what
my advice is take a backup
best would be the hash for the artifact, second best package name and version
oh you can get the hashes from pypi
if the API says "security issues found" we could link the user and warn
yeah, we have the hashes, that's not a problem
i don't know that there's any security auditing infrastructure on pypi for things like "the deps changed substantially since last release"
or "the version number was not bumped but the hashes changed on X date"
maybe snyk has something
(does pypi even have the ability to inspect deps in the uploaded package?)
not that i know of
Keep your Python dependencies secure, up-to-date and compliant.
btw how big was the venv created for examplepy, 12MB?
that is insane
for a module that is 1 KB
zipimport ftw :p
btw, we really should consider an automatic cleanup strategy
by now i see 3 numpy versions, 2 protobuf and 2 sqlalchemy versions alone just thanks to our testcases
we keep track of date_of_last_use for each package, maybe we can automatically purge on each startup, maybe in the background as a subprocess?
in the tests, we could
why only in the tests?
at least if there are multiple packages we could be more aggressive
different versions
i suggest prompting the user to purge, using a justuse command
like from justuse import purge_cache; purge_cache(older_than='30d')
maybe eventually you could have a cli to manage things like this?
hmm.. config?
although at that point you really are reinventing pip
but maybe reinventing it better
:]
well, we could have auto-cleanup as the default and have an option in the config "i-like-it-dirty"
or "keep-messy"
does list.remove() remove the element by the address an object is stored at? or does it hash the object and try to filter it that way?
I think it's the first element for which item == elem or item is elem
Where item is the one you are trying to remove.
yes but im curious how it works with class objects, how does it evaluate:
class obj == class obj
does it look at the memory the class object is stored in
or does it hash it
the default eq compares by identity
wait actually, nvm im stupid, i forgot python passes everything by reference so my object in my list wont be different from the object im evaluating to
No it calls __eq__() of the two things. In Python object()s __eq__() is defined as checking identity (x is y or id(x) == id(y)).
So yes, and no.
If you only have object()s in your list then yes it will check for identity.
Otherwise, if you have another type such as str __eq__() works differently
Either way list.remove() is O(n) because it needs to go through each element.
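A short sketch of the behaviour described above, with two hypothetical classes: one that inherits the default identity-based `__eq__`, and one that compares by value. No hashing is involved in either case.

```python
class Plain:
    """No __eq__ defined, so == falls back to identity."""


class ByValue:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, ByValue) and self.value == other.value


a, b = Plain(), Plain()
plain_items = [a, b]
plain_items.remove(b)  # only `b` itself matches, so `a` stays

x, y = ByValue(1), ByValue(1)
value_items = [x, y]
value_items.remove(y)  # `x == y` is True, so the first match (x) is removed
```

Note that `ByValue` is not even hashable here (defining `__eq__` without `__hash__` disables hashing), yet `list.remove` works fine.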
Are you looking for a set?
ahh ookay ya thats what i wanted to know
thank you for explaining
no, i was curious, because if it was comparing hashes my evaluation would fail
Ah, because sets are hash(maps?).
They hash the object and make in checks O(1)
ahh ya i was actually trying to create such sort of hashmap lol
but it really looks like hashmaps are just glorified dictionaries
a dictionary is a hashmap
It's actually the other way around
aren't those synonyms?
a dictionary has more functions than the abstract datatype hashmap
what hashmap are you referring to?
not any specific implementation, just the idea of a hashmap
I have a question I hope I can ask it here
I'm always looking at medium-big projects on places like github and get really impressed by how they organized the projects.
I want to and am willing to learn and make projects that way. I'm currently working on a project I just came up with, but it's going to be massive. I'm using git to manage it but I also need to learn so much more about managing a project, to the point of having specific explanation files, versions, license and many other things like formal ways of writing (such as PEP stuff) and special comments, as well as designing and planning, I mean where do I even start...? I'm looking to learn these, perhaps any books/docs I could read? Tutorials I could check? Courses?
I hope you guys understand what I'm trying to ask for, I don't think I can explain any better
really, I thought a hashmap is an implementation of a dictionary
suppose you had a hashmap with just one bucket
Afaik a "dictionary" isn't really a data structure, more of a design pattern or interface, a specific kind of key-value mapping. The cpython implementation of a dictionary is a hashmap internally, and a dictionary is expected to have hashmap like semantics
maybe even a look-up table would be considered a dictionary
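To make the "hash map as a data structure" idea concrete, here is a deliberately simple separate-chaining sketch. `ToyHashMap` is a hypothetical teaching class and is not how CPython implements `dict`. With `bucket_count=1` it degenerates into a linear scan, which is the one-bucket case raised above.

```python
class ToyHashMap:
    """A deliberately simple separate-chaining hash map."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash only selects a bucket; equality decides the match.
        return self._buckets[hash(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def __getitem__(self, key):
        for existing, value in self._bucket(key):
            if existing == key:
                return value
        raise KeyError(key)


one_bucket = ToyHashMap(bucket_count=1)  # every key lands in the same bucket
one_bucket["a"] = 1
one_bucket["b"] = 2
one_bucket["a"] = 3  # overwrites the earlier value for "a"
```

It still behaves like a mapping with one bucket, only slower, which is one way to see that "dictionary" names the interface and "hash map" names the implementation strategy.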
what's your "massive project" about? do you intend to make money off of it?
no intention for money, contribution to my knowledge and to the community.
the project is a REPL-based interaction with many tools built-in that could perhaps assist people in different ways
well, then you're overthinking the whole thing, i'd say
make a new repo on github and it'll guide you through the process - setting a license, readme
I did actually
then enter your project in https://snyk.io/advisor/ and it'll give you more hints what you could improve
alright so about the formal way of writing the code though, and actually describing functions properly, the structure of code
formal way? what do you expect?
I mean I just look at projects and get amazed by how they knew how to sort it that way
all the files system, how it's beautifully coded
you'll get there when you've got some functionality to show off
obviously that's experience as well, but I'm looking to gain that experience throughout the project. This won't be perfect at first
there are a lot of choices about what tools to use for providing all the docs, lint, tests, automation, and so on, look at what other people are using and choose whatever you think is best.
some of your answers will be more popular than others but there are no wrong answers as long as it works.
the less work it takes to get your software running the better, the less work it takes to run the tests the better.
structure the code however you want, apply docstrings liberally.
use type annotations if you're cool and smart
don't overthink it
oh yeah I'm using type annotations
truth is most people who use software never look at the source code
and don't waste your time on documentation before you actually have stuff to document
obviously, but imagine this.
I code the project suddenly a busy year later I want to work on it again.
Will I handle it properly? That's the question
that's a reasonable thought
a hash map is a data structure that you can implement. a dict is a Python data type that uses hash maps, but does other things, too.
priorities for github project
- the README describes exactly how to get it running
- it works when i follow those instructions
I don't expect myself to be working on it throughout the entire year, I expect many large breaks as well
haha that's a nice way of putting it
unit tests help with this, however i don't know what advice to give besides do the best you can this time, and do better next time. you will learn what works for you
my advice: always code and comment your code with some stranger in mind (you in a year) who would read the code for the first time
that's your target audience
you don't need to comment and document everything (i mean, the you in a year probably isn't an idiot, right)
yeah obviously
I believe doing things like
# It works so don't ask why
version = '\n\n'
is totally fine right :P
but you do need to comment stuff that you had to put in serious thought in to get right
right
so I'll keep that, next is about the tools
tools that help providing the docs, lint, tests, automation and so on
along with commenting, commit messages are important for the same reason, so you can describe the issue it solves and why that approach was chosen
yeah I've still got work to do on that part, not my best asset
any recommendations for where to look up, perhaps some very known tools to start off with?
a good workflow is also to use feature-branches named by the according issue number-name-of-issue
and then do PRs to the main branch instead of pushing directly
I'll be honest I have no idea what you just said
I don't have that much experience I'm just 16
pff. excuses
Why the issue number? I usually only describe what it addresses and then leave the PR to link issue etc.
what's PR?
i stumbled over that pattern when i was wondering about the naming and did some research, it's also the naming pattern gitlive extension in vscode uses for this purpose
pull request
oh alright
I shouldn't feel too bad about having a bad-looking project as for the first time
from the perspective of the main branch, you basically ask the owner (in this case you yourself) to pull the changes in
I always seek perfection too much which is bad, as it leads to me never finishing something
haha then since I'm the owner of myself I'll pretend I didn't see that and proceed with my life
on the contrary, the first thing you should do is produce prototypes
yeah
I mean for starters perhaps I could just make an interaction terminal that can accept the addition of tools and implement them
later on add tools
I'll start with the main wrapping
producing minimal examples as proof-of-concept for certain features, well in mind that you'll throw it away afterwards
then things will be way easier
will help a lot to get started
right that's awesome
where can I find the PEP or docs about commenting properly in functions so that they have a readable description
such as for if I hover with my mouse over the function in the editor I can actually see
PEPs about commenting? that's entirely up to you
alright
!pep 257
oh, description
thanks pal
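PEP 257 covers docstring conventions. A minimal sketch of the shape it describes, using a hypothetical `add_tool` function for the REPL project: a one-line summary first, then a blank line, then details. Editors generally pick up the docstring for hover tooltips.

```python
def add_tool(registry, name, func):
    """Register a tool under a name and return the registry.

    The first line is a one-sentence summary. Further paragraphs can
    describe arguments, return values, and anything surprising.
    """
    registry[name] = func
    return registry


# The docstring is available at runtime, which is what tooling reads.
summary = add_tool.__doc__.splitlines()[0]
```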
so about the automation as mentioned here
have you got any tools in mind for starters?
black
black?
yep
Does someone have a code a discord bot code coded with python that make the user choose and change the prefix with
a command like "+setprefix" i searched all the day but i didnt find my phone is going low in battery if someone got that pls send me the code i will be thankful
I'll check that, got anything else in mind?
pytest
is there a document with a list or something?
mypy
you could make a flowchart for future generations
everybody would have a different list but probably some overlap in this room
so under what category should I search? Code formatting tools?
tbh i don't know any code formatters other than black
well, at first you need to find an editor/IDE that fits your needs
what he means is use vscode
haha yeah I'm using VSCode don't worry about that
even used it for Assembly honestly
I made super mario bros in assembly
and there you used "i'm only 16" as an excuse
from scratch
well what can I say
¯\_(ツ)_/¯
some things I'm more experienced with and some less
it was my finals project, a small percentage from my graduation
haha honestly I remember when I first saw DosBox
felt like it was going to pain, oh boy I was right
I was working with 8086
alright well I think I might be able to take it from here
thanks a lot guys
yw
is there vsCode addon for this?
vscode python extension has builtin support for it
also yapf and autopep8
Oh. How would I enable it? Vscode doesn't seem to catch what our pipeline is catching
I have that extension, but it doesn't flag anything pep/black
change python.formatting.provider in settings @wet void
Ah thank you!
np


!mute 627516853576663041
:incoming_envelope: :ok_hand: applied mute to @graceful peak until <t:1631084847:f> (59 minutes and 59 seconds).
version = getattr(mod, "__version__", None)
if version: return Version(version)
another case where i wished it was possible to short-circuit on None somehow
module?.name is None if module is None
True, in this case you'd still need getattr
(let [version (getattr mod "__version__" None)]
  (when version
    (Version version)))
Approximate Hy equivalent
Hey everyone, which part of the Python interpreter knows which dunder methods are "fallbacks" for missing dunder methods? Example, if dunder __str__ is missing, Python interpreter knows to check for dunder __repr__. If dunder __int__ is missing, the Python interpreter knows to check for dunder __index__ and if missing, checks for dunder __trunc__
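The two fallbacks named in the question can be observed directly with a pair of hypothetical classes. The `__trunc__` fallback is left out on purpose: delegation from `int()` to `__trunc__` was deprecated in Python 3.11, so it should not be relied on.

```python
class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"


class OnlyIndex:
    def __index__(self):
        return 7


as_text = str(OnlyRepr())  # no __str__, so the default __str__ uses __repr__
as_int = int(OnlyIndex())  # no __int__, so int() uses __index__ (Python 3.8+)
```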
what i posted,
version = getattr(mod, "__version__", None)
if version: return Version(version)
it's annoying to repeat version 3 times alone, so i wished there was a way to do something like ?return Version(?getattr(mod, "__version__"))
which i imagine would skip the return if the thing is None and skip calling Version if getattr returns None
something like that
already repeating myself three times outside of code, damn me
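Python has no None-aware operator, so one way to avoid naming the value three times is a tiny helper. This is only a sketch: `apply_if_not_none` and `parse_version` are hypothetical names, `parse_version` stands in for the `Version` class from the snippet, and unlike the original it returns None instead of skipping the return.

```python
from types import SimpleNamespace


def apply_if_not_none(func, value):
    """Hypothetical helper: call func(value) unless value is None."""
    return None if value is None else func(value)


def parse_version(text):
    # Stand-in for Version: turn "1.2.3" into (1, 2, 3).
    return tuple(int(part) for part in text.split("."))


def get_version(mod):
    # The attribute lookup result is named once, inside the helper call.
    return apply_if_not_none(parse_version, getattr(mod, "__version__", None))


with_version = SimpleNamespace(__version__="1.2.3")
without_version = SimpleNamespace()

print(get_version(with_version))
print(get_version(without_version))
```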
did you check the bytecode and the AST?
No I haven't, I don't think I would be able to understand the bytecode but maybe the AST can give me some help. Python is my first language so all of this is new to me
just try from dis import dis and apply it to different functions and methods, it might give you some unexpected answers, but you won't know if you don't try ^^
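A minimal sketch of what that looks like, assuming CPython (opcode names differ between versions). The function `to_int` is made up for the example. The bytecode only shows a global lookup of `int` followed by a call; it says nothing about which dunder gets consulted inside that call.

```python
import dis


def to_int(value):
    return int(value)


# Human-readable disassembly, printed to stdout.
dis.dis(to_int)

# The same information as a list of opcode names.
opnames = [instruction.opname for instruction in dis.get_instructions(to_int)]
print(opnames)
```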
the bytecode wouldn't really show that, I think apart from finding documentation snippets all over the place for them the best place to look would be to look at the typeobj and obj method implementations
didn't know about dis i'll check it out.
@peak spoke , doesn't it have to be all somewhere so that the interpreter knows where to check to see what is the fallback method?
would be so helpful if those dunders with fallback dunder methods had a fallback attribute -.-
Yes, it'll be in the C source and should be in the object and typeobject method implementations or you'll have to trace what function is called and look at how it's implemented
usually you don't need to care about fallbacks
I just want to learn to get a bigger picture view
For example https://github.com/python/cpython/blob/e1becf46b4e3ba6d7d32ebf4bbd3e0804766a423/Objects/object.c#L1389-L1415 is used to check for truthiness
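From the Python side, that truthiness check can be observed like this. The class names are made up for the example: `__bool__` is used when present, `__len__` is the fallback, and an object with neither is truthy.

```python
class Sized:
    """Defines only __len__, so truthiness falls back to the length."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n


class Explicit:
    """Defines both; __bool__ wins over __len__."""

    def __bool__(self):
        return False

    def __len__(self):
        return 10


class Plain:
    """Defines neither, so instances are always truthy."""


results = {
    "empty Sized": bool(Sized(0)),
    "non-empty Sized": bool(Sized(3)),
    "Explicit": bool(Explicit()),
    "Plain": bool(Plain()),
}
print(results)
```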
This makes sense because the interpreter is coded using C, so the source code is in C...
"you'll have to trace what function is called and look at how it's implemented" for this part, say I want to know if dunder __int__ has a fall back, how could I trace back using the function int as a first step or even the dunder __int__
I wouldn't spend too much time on this, you'll learn the important ones as you need them, and the others can be noticed if they are used by something
Knowing that int can default to truncating on types like floats is nice, but it might as well have its own __int__ that truncates and there wouldn't be a difference
int is special
hmm, let me read that again
thank you!
it should give you a better understanding of some aspects
yeah, it will be helpful, gonna watch, and thank you for help on the question.
@peak spoke thank you for the good ideas of where to look
The dunder itself doesn't really have a fallback (like comparisons through NotImplemented) so you'd have to look at how int does the conversion - https://github.com/python/cpython/blob/main/Objects/longobject.c#L5046 then for example it goes through the PyNumber_Long path which is in https://github.com/python/cpython/blob/main/Objects/abstract.c#L1529
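A small sketch of how this looks from Python, with made-up class names. The fallback lives in the caller (`int()` or `str()`), not in the dunder itself: `int()` uses `__int__` when it exists and otherwise tries `__index__` (since Python 3.8), and the default `object.__str__` delegates to `repr()`.

```python
class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"


class OnlyIndex:
    def __index__(self):
        return 7


class Both:
    def __int__(self):
        return 1

    def __index__(self):
        return 2


print(str(OnlyRepr()))   # no __str__ defined, so repr() is used
print(int(OnlyIndex()))  # no __int__ defined, so __index__ is used
print(int(Both()))       # __int__ takes priority over __index__
```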
Gonna be a tough read, wish me luck! thanks again!
So many channels it's hard to know where to ask
I have this short function:
from typing import Union

def ftime(seconds: Union[int, float]) -> str:
    if seconds < 60:
        return f'{seconds}s'
    elif seconds % 60:
        return f'{seconds // 60}m {seconds % 60}s'
    else:
        return f'{seconds // 60}m'
What it does is format a number of seconds as something like 3m / 3m 43s / 43s
for general problems
this channel
for discussions about the language itself
and the topical channels if your question specifically relates to one of them
Oh, I see, thanks!
again again
Does anyone have good references to making interpreters?
I got a basic thing going with python,
But I can't figure out how to make something that works like JavaScript with { }
http://craftinginterpreters.com is the way to go
Thanks
no redirection to https smh
crafting interpreters :D
which makes more sense?
1
```py
class Thing:
    thong = '5'

    @staticmethod
    def heya():
        return Thing.thong * 4
```
2
```py
class Thing:
    thong = '5'

    @classmethod
    def heya(cls):
        return cls.thong * 4
```
hey
should subclasses be considered?
that's probably when to do one or the other
thanks that was fast
will probably do the first one and final it
hmm, is there a runtime way to @open turret something?
lol sorry, person
final does nothing at runtime, the only utility it provides is for static checkers
for the class attr I'd personally use cls and a classmethod because of the aforementioned lower maintenance cost, a decorator is already there anyway
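To make the subclass point concrete, here is a sketch that puts both variants on one class. The method names `heya_static` and `heya_class` and the subclass `SubThing` are made up for the comparison.

```python
class Thing:
    thong = "5"

    @staticmethod
    def heya_static():
        # Hard-codes the class name, so subclass overrides are ignored.
        return Thing.thong * 4

    @classmethod
    def heya_class(cls):
        # Uses whichever class the method was called on.
        return cls.thong * 4


class SubThing(Thing):
    thong = "7"


print(SubThing.heya_static())
print(SubThing.heya_class())
```

The static version also has to be edited if `Thing` is ever renamed, which is the maintenance cost mentioned above.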
if you want to prevent subclasses at runtime for some reason you could raise an exception in __init_subclass__, but documenting that should be enough
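A sketch of both behaviours, with made-up class names: `typing.final` is only a hint for static checkers, so subclassing still works at runtime, while raising in `__init_subclass__` actually blocks it.

```python
from typing import final


@final
class OnlyMarkedFinal:
    """Static checkers complain about subclasses; the runtime does not."""


class StillAllowed(OnlyMarkedFinal):  # works at runtime
    pass


class Thing:
    thong = "5"

    def __init_subclass__(cls, **kwargs):
        # Runs whenever a subclass is created; raising aborts the creation.
        raise TypeError(f"Thing is final, cannot create subclass {cls.__name__}")


def try_subclass():
    """Return the error message if subclassing Thing fails, else None."""
    try:
        class Sub(Thing):
            pass
    except TypeError as exc:
        return str(exc)
    return None


print(try_subclass())
```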
2
Better expression of intent
yeah?
It's clearly not "static", it uses the class
But how is it meant to be used and why would you want to make it static in the first place?
Can someone solve this problem?
def print_all(w1, w2, w3):
    print(w1, w2, w3)
func(print_all)(3, 5, 8) -> 8, 5, 3
in this case, its error and success methods for my bot so all error messages and successes are consistent looking
is this homework? also this channel isn't a help channel, check out #how-to-get-help or ask in #python-discussion
so the class has some prefilled responses on it, and the methods use those prefilled stuff
If these methods depend on class attributes then I'd make them all class methods
Making some static and some class would be weird
danke
looking up an attribute doesn't use locals at all, it's an attribute on the object.
looking up the object from the variable name will use the locals
Are you sure about that?
Hello
I mean, looking up the object from the variable name uses the locals, but does it use the f_locals dictionary?
hi, sorry about that, lol
Np
did not mean to ping you lol
It's okay
locals are tricky, and might not really use f_locals.
I was under the impression that locals are indexed into an array, as opposed to looked up by key into a dictionary
right, it's not straightforward.
it depends on the instruction used
In Python 3.9, I believe the locals are normally looked up in this array - https://github.com/python/cpython/blob/v3.9.7/Include/cpython/frameobject.h#L46
Include/cpython/frameobject.h line 46
```c
PyObject *f_localsplus[1]; /* locals+stack, dynamically sized */
```
in the main branch, that's gone, so I have no idea how it works there
Notice I said "the locals", not f_locals
Yeah, I caught that and added a clarification
Where should I go to learn more about the deep depths of how Python works, instead of just the basics? (Like, I'm aware that source code is first translated into bytecode by the CPython compiler (I think?) and then interpreted by the PVM.)
but how does it really work behind the scenes
LOAD_NAME uses the f_locals dict directly, not sure if that's only used if it can't detect that it's in a function local scope
like whatever frames are and what not
I think LOAD_FAST uses f_localsplus, then
though I don't know when LOAD_NAME would be used instead of LOAD_FAST ¯\_(ツ)_/¯
(internally)
getattr retrieves an attribute from an object. It has nothing to do with locals/globals/etc
Global scope and I guess if you generated some code in a function? I'm not really sure when it can't use the more optimized loads
yeah, I was thinking exec or eval. Global scope makes sense, too, since in that case it has to use the globals dictionary, since the names of globals aren't known at compile time like the names of local variables are.
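This can be checked with dis. A sketch, assuming CPython; exact opcode names vary by version (newer releases have variants such as LOAD_FAST_BORROW), so the check below only looks at the prefix. `add_one` and the compiled source string are made up for the example.

```python
import dis


def add_one(x):
    # x is a function local, known at compile time.
    return x + 1


# Module-level code: names are looked up by key at runtime.
module_code = compile("x = 1\ny = x + 1", "<example>", "exec")

function_ops = [ins.opname for ins in dis.get_instructions(add_one)]
module_ops = [ins.opname for ins in dis.get_instructions(module_code)]

print("function:", function_ops)
print("module:  ", module_ops)
```

The function body reads `x` with a LOAD_FAST-style instruction, while the module-level code uses LOAD_NAME and STORE_NAME.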
Thank you!
Haven't read it, but I've heard it's good.
do you know C?
no
I do see that there is an intro to C portion
for the book
I've been able to keep up with fluent python so far (chapter 12), so do I have enough of an understanding to understand the book, or should I learn some more?
I haven't read it, like I said - but I'd expect it won't be easy if you don't already know C...
You could check your local library, too. They often have technical books.
Hm, I'll check that out
If the book mostly focuses on reading the source for the internal behaviours, you should be relatively fine; at least I can get around the CPython source decently when searching for something, even with my not so great C understanding
I might just look at the codebase for CPython and try and piece things together
not sure if that's the best way to learn though
that sounds like the most difficult possible path
if you decide to try that, there's some resources at https://devguide.python.org/ about how some of the subsystems work
There is probably more than one CPython internals tutorial so I think I should find a good one
I found one youtube playlist but it's written in python 2.7...
well, what's your goal? Do you need up to date knowledge about how it works today, or is knowing how it worked 10 years ago good enough for what you're trying to learn?
I'm more or less interested in the general internals, so it should be fine
That CPython internals book looks pretty good though
https://github.com/zpoint/CPython-Internals
Hm, this one is for python 3.8.x
$32 used on Amazon, FWIW.
Ah, thank you
I'm gonna search through O'reilly's library and see if I can sift through there for any cpython internals books
otherwise i may just buy the realpython one
If you have a list you keep building up and resetting is it better to .clear() it or just reset it mylist = [] 
I'd assume .clear is more memory efficient but idk if it will matter much
- i was wrong, .clear() will deallocate the backing array
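A quick sketch of the difference, with made-up variable names. The size comparison relies on `sys.getsizeof`, which is a CPython implementation detail; the aliasing behaviour is part of the language.

```python
import sys

# .clear() empties the existing list object, so every reference sees it.
items = list(range(1000))
alias = items
items.clear()
cleared_size = sys.getsizeof(items)
empty_size = sys.getsizeof([])

# Rebinding creates a new list; other references keep the old contents.
rebound = list(range(1000))
other = rebound
rebound = []

print(alias, len(other))
print(cleared_size, empty_size)
```

So the choice mostly depends on whether anything else holds a reference to the list.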
Does somebody maybe have a hint on this issue:
I just want to sort the tracks of a music album by their disc
wrong channel
Hi, is there an API for compiling LaTeX?
What are you trying to compile it into? LaTeX's normal compilers are command line only to begin with, you can just call them with subprocess.
I would like to compile LaTeX in discord chat.
matplotlib and ipython have the ability to render latex math expressions
also depends on what exactly you're doing with it
(also this really isn't on topic)
We learned the hard way here that you should apply strict runtime and memory limits to your render
So please be careful :D
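A rough sketch of calling a compiler through subprocess with limits applied. `run_limited` is a hypothetical helper, not what any bot actually uses. The timeout is portable; the memory cap uses the Unix-only `resource` module and is skipped on Windows.

```python
import subprocess
import sys


def run_limited(cmd, timeout_s, max_mem_bytes=None):
    """Run cmd with a wall-clock timeout and an optional address-space cap."""
    limit_memory = None
    if max_mem_bytes is not None and sys.platform != "win32":
        import resource  # Unix-only module

        def limit_memory():
            # Runs in the child process just before the command starts.
            resource.setrlimit(resource.RLIMIT_AS, (max_mem_bytes, max_mem_bytes))

    # subprocess.run kills the child and raises TimeoutExpired on timeout.
    return subprocess.run(
        cmd,
        capture_output=True,
        timeout=timeout_s,
        preexec_fn=limit_memory,
    )


# Hypothetical LaTeX call (doc.tex is a placeholder file name):
# run_limited(["pdflatex", "-interaction=nonstopmode", "-halt-on-error", "doc.tex"],
#             timeout_s=10, max_mem_bytes=512 * 1024 * 1024)

result = run_limited([sys.executable, "-c", "print('ok')"], timeout_s=30)
print(result.returncode, result.stdout)
```

For untrusted input in a chat, these limits are only a baseline; running the compiler in a sandbox or container is a common extra step.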
hello
hi
import inspect

def all_kwargs(func, other_locals):
    d = {
        name: other_locals[name]
        for name, param in inspect.signature(func).parameters.items()
        if (
            param.kind is inspect.Parameter.KEYWORD_ONLY
            or param.kind is inspect.Parameter.VAR_KEYWORD
        )
    }
    d.update(d["kwargs"])
    del d["kwargs"]
    return d

def foo(*, a, b, **kwargs):
    bar(**all_kwargs(foo, locals()))

def bar(**kwargs):
    print(kwargs)

foo(a=2, b=3, c=4)
works like a charm, but i'm wondering if there's a way to simplify the function call so that it becomes **all_kwargs()
is it possible to get the locals of the outer function?
locals would be easy (and hacky) if you get the outer frame, not sure on the function obj though
I guess you could fetch it from the globals of the frame with the name stored on the code object, I'd rather pass the args than mess with frames though
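A sketch of the frame-based variant, assuming CPython. Instead of looking the function object up in the frame's globals, it reads the parameter layout straight from the caller's code object, which avoids needing the function at all. It is still hacky, and passing the arguments explicitly stays the safer choice.

```python
import inspect


def all_kwargs():
    """Return the caller's keyword-only arguments merged with its **kwargs."""
    frame = inspect.currentframe().f_back  # frame of the function that called us
    try:
        code = frame.f_code
        caller_locals = frame.f_locals
        # co_varnames lists positional params first, then keyword-only params,
        # then the *args name (if any), then the **kwargs name (if any).
        start = code.co_argcount
        stop = start + code.co_kwonlyargcount
        result = {name: caller_locals[name] for name in code.co_varnames[start:stop]}
        if code.co_flags & inspect.CO_VARARGS:
            stop += 1  # skip the *args name
        if code.co_flags & inspect.CO_VARKEYWORDS:
            result.update(caller_locals[code.co_varnames[stop]])
        return result
    finally:
        del frame  # avoid keeping the caller's frame alive


def bar(**kwargs):
    return kwargs


def foo(*, a, b, **kwargs):
    return bar(**all_kwargs())


def baz(x, *args, k, **extra):
    return all_kwargs()


print(foo(a=2, b=3, c=4))
```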
because it's cpython specific or other reasons?
that and it's fragile
fragile how?
for example you wrap something and then the number of frames you need to go back to reach the relevant one changes
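A small sketch of that failure mode, assuming CPython (`sys._getframe` is an implementation detail). All names here are made up. The helper always looks exactly one frame up, so putting a wrapper between the caller and the helper changes what it finds.

```python
import sys


def caller_name():
    """Return the name of the function one frame up."""
    return sys._getframe(1).f_code.co_name


def direct():
    return caller_name()


def logged(func):
    # A typical pass-through decorator; it adds one extra frame.
    def wrapper():
        return func()
    return wrapper


wrapped_helper = logged(caller_name)


def indirect():
    return wrapped_helper()


print(direct())    # the helper sees its real caller
print(indirect())  # the helper sees the wrapper instead
```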
decorators hmm
i've encountered issues with decorators that wouldn't work if applied to @staticmethod etc
i know what you mean
wait
but in this case, it never could get the wrong frame
the function is called from inside the function that it's supposed to handle
there's never a decorator between, right?
Yeah, but you could decorate your kwargs function, or introduce a frame in an another way (e.g. using it in a comprehension)
how can a comprehension get in between a function and a call inside that function?
If you called all_kwargs in a comprehension, that comprehension has its own frame
but all_kwargs only makes sense for functions and methods :p