#internals-and-peps
I'd kill to be able to parse jsons instead of binaries, lol
is it a custom format?
it seems orjson is the fastest of the json parsers overall
followed by ujson
tho hyperjson had the most consistent time taken overall
Jay, if you're asking me, I'm creating parsers to a few different types of binary files that aren't necessarily distinguished via extension
So I have multiple files ending in .dat for example that are binary files with different file structures
It's "fun"
No it's not, it's the least fun part of my project
I don't even have the file structure for one of the files
Yeah....
ive done similar things but for network protocols
Thankfully my employer knows it's a clusterfuck and isn't holding it against me that I'm struggling with it
find the person who wrote it and whack them with a hammer
That's pretty high on my to-do list tbh
Webdev is a huge black box for me. From what I hear it doesn't take too much to grasp the basics, but I'm not that motivated to learn it
@dim plank This is not a help channel, read the channel description. You can ask in #async-and-concurrency
ohhh sorry
(or just in a help channel if #async-and-concurrency is occupied)
Has anyone used PySpark on Databricks(AWS) in Production? If so, what does your CI/CD pipeline look like for different environments and are there any community tools which can help in CI/CD of PySpark job scripts.
YES finally got it working, does it scare you?
is that adding 1 to counter when __str__ or __repr__ is called?
@hollow sky i've used it but i have no idea what a proper CI/CD pipeline is like. also this is probably better suited for #tools-and-devops
Thanks, will do. I am new here, so it might take a bit to figure out what goes where.
well then
is that adding 1 to counter when __str__ or __repr__ is called?
@north root it's a __str__
What ide should i use on i3-5010U 4gb ram?
#tools-and-devops , but vsc might be light enough
vim
Ide not text editor :P
I have i5-3230M (which, as I have peeked at some benchmarks, is pretty similar in benchmarks) and 4GB of RAM. Actual performance depends on many factors such as graphics card, RAM speed, disk speed etc.
VSCode with Python+Pyright+some other garbage like coloured parentheses runs just fine. PyCharm can be a tiny bit slow with large projects.
"IDE" and "text editor" is not a dichotomy and are pretty vague terms.
you can use jedi with vim, vsc with python also uses jedi
@dire wing Also, this is not the right channel for this, read the channel description. Ask in #tools-and-devops
if i had to create a custom hash table (using my own hash function), any resources on what may be a good and simple one to use in an interview for an unknown data set?
"general purpose hash function" is the term to Google
thanks @raven ridge
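For the interview scenario above, FNV-1a is one commonly cited general-purpose hash that is short enough to write from memory. This is a minimal sketch, assuming string keys and separate chaining; the `ChainedHashTable` class and its method names are made up for illustration and it never resizes.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the value within 32 bits
    return h


class ChainedHashTable:
    """Hypothetical toy hash table using separate chaining (no resizing)."""

    def __init__(self, n_buckets: int = 64):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key: str) -> list:
        index = fnv1a_32(key.encode("utf-8")) % len(self._buckets)
        return self._buckets[index]

    def put(self, key: str, value) -> None:
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key: str, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


table = ChainedHashTable()
table.put("spam", 1)
table.put("eggs", 2)
table.put("spam", 3)  # replaces the first "spam" entry
```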
But what about the wide world of hashing, how thrilling it is to explore polynomial hashes!
no
Fine, I'll admit, I learned all kinds of complicated hashes but I didn't really have fun doing it
It was more just annoying and I love having that all handled in the background in Python
!ban 571722029049577472 joined just to post a penis copypasta
:incoming_envelope: :ok_hand: applied ban to @balmy mantle permanently.
thank you
anyways
is list comprehension parallelized in python?
because it should be, I mean you have independent elements that can be dealt with separately
it is not parallelized
As in its processing being parallelized for performance?
yes
it should be. See the simple example I posted right after
parallelization can easily make that example faster
it's easy to make an example that shouldn't be parallelized, and Python can't tell the difference.
better to have the programmer say when it should be parallelized
can list comprehensions be parallelized?
A comprehension can contain any expression, so it can't be handled in C outside of the GIL, and the overhead of doing it in separate processes is far greater
@subtle whale there isn't a simple way to do it. You can use libraries for that.
Python in general doesn't run Python code on more than one thread at once.
I seem to be hitting a brick wall in terms of speed when it comes to Python.
it can't easily be parallelized because the stuff done for each element in the comprehension can be arbitrary Python code, and arbitrary Python code can't be run without the GIL being held, which means it's necessarily serial
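Since the interpreter will not parallelize a comprehension on its own, the programmer has to ask for it. A rough sketch of what that can look like with the standard library; `parallel_map` is a hypothetical helper name, and it only pays off when each item takes long enough to outweigh the cost of shipping data between processes.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def parallel_map(func, items, workers=2):
    """Hypothetical helper: apply func to every item in separate processes.

    func and the items must be picklable, because they are sent to the
    worker processes and the results are sent back.
    """
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(func, items))


if __name__ == "__main__":
    numbers = [1, 4, 9, 16]
    serial = [math.sqrt(n) for n in numbers]     # ordinary comprehension, one core
    parallel = parallel_map(math.sqrt, numbers)  # same values, explicit workers
    print(serial == parallel)
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes re-import the main module.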
So I am looking for ways to make my code faster
well, it is code I am writing for my master thesis. Long story short: I look at images and predict trajectories of dynamic obstacles. I have to draw a lot and check pixel values too to make sure the predicted trajectory is on the road
can pypy be used with opencv and numpy?
yes to numpy for sure
We can't really give any advice without seeing your code really, but there are caveats to numpy, are you sure you're not falling into these ?
caveats?
using loops, using np.concat/np.append like you would use list.extend/list.append, making bad slices which copy, etc
I will link my code here and ask for advice again. Thank you and also thank you for mentioning PyPy, it seems very interesting and I will give it a try
If your code is heavily based on numeric computations, which it appears it is, numba can give you a fair boost of speed for almost free
You may need to adapt your code tho
numba doesn't work with opencv
no but it can be used to accelerate pure python and numpy functions
then you can wrap your opencv code around these
hmm true, would you recommend cython or numba?
I heard almost everyone favor cython over numba
numba is definitely easier, but it's really only for numeric computing
which seems fine here
I'd say try numba and if it doesn't improve much or forces you to modify your code too much (i doubt it but who knows), move to cython
multiprocessing really sounds like it should help here
if you've got a big data structure that you want to share across many different workers, building that data structure in one process, forking off many workers to do the processing on a shared, copy-on-write version of that data structure, and then sending only the results back to the parent process over a queue should give you much better performance than any parallelism you can add in the Python code, assuming that the processing takes some minimal amount of time to justify the cost of the fork
numpy operations are not particularly fit for multiprocessing, they already use simd, same with opencv, I'm really unsure this would bring any speed up
maybe a dumb question, but have you tried profiling your code to figure out where it's spending its time, @subtle whale?
i wouldn't be surprised if it was consistently worse tbh
@teal yacht you're saying you think numpy+opencv+multiprocessing would be worse than numpy+opencv in a single process?
that is, ofc again, assuming most of the time is spent doing the numerical computations and not IO or some other non-numpy related ops
yes
interesting. I don't know numpy or opencv well enough to comment - I'd expect that could possibly be true in some special cases, but wouldn't have expected that to be the general case - but that's outside my wheelhouse.
many numpy operations are multi-threaded by default, assuming proper use of them, and not just any kind of multi-threading: they use libraries like BLAS, implement SIMD operations etc. that basically draw the best out of our CPUs for "pure" math operations
I'd expect the cost of building the pool, spawning processes and aggregating the results to be slower, but tbh i've never tried np + multiprocessing
semi-related: has anyone had good experience with cupy ? I never tried it but always thought it was interesting, wanted to hear if someone has had success moving from numpy to cupy and had great results
on Unix spawning processes is pretty cheap, but you're right that if it means that something that would otherwise use SIMD won't, it'd definitely still be a net loss
either way, seems like a really good idea to start with profiling, I'd think - no sense optimizing anything but the place where it's spending the most time, so it's worth checking your understanding of what that is.
numpy-like library that uses cuda
Sounds relatively new
it's "new" as in a few years
Also, do you think I should go the C/C++ route if I am not getting the required speedup
Is it worth it in fact?
naive C/C++ can be slower than python with optimized C/C++ libraries
just a disclaimer
Try Cython before going all the way to C
I did try cython
I will try it again after reworking my code a bit
I used to write in C/C++ before I switched to python a year ago
before making any big decisions, profile, figure out where the time is spent, and then think about how to speed that particular thing up.
Cython is awesome, but it's not a silver bullet - applying it to the wrong spot, or not using it properly, will only hurt you.
you can probably get the whole thing down to 1 or 2 functions you wrote that are taking up 90% of your time, and once you have that, it'll be a lot easier to ask for help getting those things faster.
and once we have specifics to work with it should be really easy to prove whether multiprocessing is the best way to speed it up, or using numpy differently so that it can make better use of SIMD, or what.
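The "profile first" advice can be tried with nothing but the standard library. A small sketch, assuming a made-up workload (`pipeline` and `slow_sum_of_squares` are placeholders, not code from the thesis project):

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive stand-in for a hot function."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def pipeline():
    """Hypothetical workload: one expensive call and one cheap one."""
    expensive = slow_sum_of_squares(200_000)
    cheap = sum(range(100))
    return expensive + cheap


profiler = cProfile.Profile()
profiler.enable()
result = pipeline()
profiler.disable()

# Sort by cumulative time and show the ten most expensive entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

For a whole script, `python -m cProfile -s cumulative script.py` gives the same kind of table without touching the code.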
if you're writing bad code, switching to a faster language won't change anything
What would you do if you want to enumerate over multiple things? So zip and enumerate
The only way I can think of is to have itertools.count as the first iterable
enumerate(zip()) ?
Does that unpack the tuple you get from zip?
!e
my_list = ['a', 'b', 'c']
for a, b, c in enumerate(zip(my_list, my_list)):
    print(a, b, c)
@boreal umbra :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 3, in <module>
003 | ValueError: not enough values to unpack (expected 3, got 2)
!e
my_list = ['a', 'b', 'c']
for a, b, c in enumerate(*zip(my_list, my_list)):
    print(a, b, c)
@boreal umbra :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 3, in <module>
003 | TypeError: enumerate() takes at most 2 arguments (3 given)
!e
for a, (b, c) in enumerate(zip(my_list, my_list)):
    print(a, b, c)
You are not allowed to use that command here. Please use the #bot-commands channel instead.
I don't think that that would work but we can see
it works, just not my !e
!e
my_list = ['a', 'b', 'c']
for a, (b, c) in enumerate(zip(my_list, my_list)):
    print(a, b, c)
@boreal umbra :white_check_mark: Your eval job has completed with return code 0.
001 | 0 a a
002 | 1 b b
003 | 2 c c
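For comparison, the `itertools.count` idea mentioned earlier also works and avoids the nested unpacking. A short sketch with made-up sample lists:

```python
import itertools

letters = ["a", "b", "c"]
numbers = [10, 20, 30]

# enumerate(zip(...)) yields (index, (x, y)), so the pair needs its own parentheses.
nested = [(i, x, y) for i, (x, y) in enumerate(zip(letters, numbers))]

# zip(itertools.count(), ...) yields flat (index, x, y) tuples directly.
flat = list(zip(itertools.count(), letters, numbers))
```

Both spellings stop at the shortest input, because `zip` does.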
I just found out about memoryview. What a cool function. I know about views with numpy arrays, but I didn't know you could do that with Python buffers.
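A small illustration of the view behaviour, assuming a plain `bytearray` as the underlying buffer:

```python
data = bytearray(b"hello world")
view = memoryview(data)

tail = view[6:]        # slicing a memoryview does not copy the bytes
tail[0] = ord("W")     # the write goes through to the underlying bytearray

copy = bytes(tail)     # bytes(...) makes an independent copy
data[0] = ord("H")     # changes to the bytearray are visible through the view
```

After this runs, `data` holds `Hello World`, while `copy` is a separate `bytes` object that later writes will not affect.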
This module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by using a type code, which is a single character. The following type codes are defined:
array is quite a low-level module, yeah
they're not meant for the same stuff tho, array is for simple memory efficient typed arrays, useful for compact storage, numpy is a whole beast that includes arrays and operations optimized for numerical operations
Do the extra potential operations affect the memory usage of numpy arrays?
no, i'd guess it's about the same order of magnitude
bad wording, i mean it should be very close
but you're not adding a dependency to your code with one
That's true. I'll probably never use it. Numpy is probably going to be a project requirement for anything I work on.
I agree, in the end you're right, I'm curious about the time complexity of append, i don't know the internals of array so i'm unsure if the arrays are fixed or dynamically sized
that could be a legitimate use-case against numpy for a lighter alternative to lists
I want to say they're dynamically sized, but I'm not 100% on that
Oh, I mean for numpy
I'd be curious about how these are implemented, too
I'm leaning towards dynamically sized, though
Python's array.array is damn close to std::array, you should appreciate with your C++ referencing username
it's a dynamic array, unlike std::array which is fixed size, but its elements are typical C level types, rather than Python objects.
So it doubles on every resize, correct?
Yeah, it's neat that it has C level types
It doesn't have to double, it has to increase by a constant multiple
I'm not sure if it's 2x, but 1.5x is also amortized constant time.
It just has to be a constant multiple to get amortized constant time
Jinx!
So is that still an array of pointers, or is it a true array?
for some reason in python it's not a simple geometric growth
(neither for arrays or lists)
in cpython*
_new_size = (newsize >> 4) + (Py_SIZE(self) < 8 ? 3 : 7) + newsize;
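Read literally, the quoted line over-allocates by about one sixteenth of the requested size plus a small constant, which still keeps appends amortized constant time. A Python transliteration for experimenting, offered only as an illustration of that one line (`grown_capacity` is a made-up name, and the surrounding C code is not reproduced here):

```python
def grown_capacity(current_size: int, newsize: int) -> int:
    """Python transliteration of the quoted C expression (illustrative only)."""
    return (newsize >> 4) + (3 if current_size < 8 else 7) + newsize


# Simulate appending one element at a time and count the reallocations.
capacity = 0
reallocations = 0
for size in range(100_000):
    if size + 1 > capacity:
        capacity = grown_capacity(size, size + 1)
        reallocations += 1

print(reallocations)  # far fewer than 100_000, since capacity grows geometrically
```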
I mean, it's going to be faster either way because it doesn't have to do type checking
@pseudo cradle a true array.
I didn't wind up using arrays that often because I kept blowing up my stack with large computations
Ooh, nice
That's hot
without numpy, there would be none of the current scientific/numerical ecosystem we have now
Yeah, everything is built off of numpy
Python's array.array is damn close to std::array, you should appreciate with your C++ referencing username
@raven ridge so...std::vector?
wups I quoted the wrong part
yeah, that would have probably been a better analogy
it's tough - list is also close to std::vector, excepting that it's a std::vector<PyObject*>
whereas array.array is more like std::vector<int>
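The pointer-versus-value difference shows up directly in memory use. A rough sketch; the exact byte counts depend on the interpreter build, so only the comparison is meaningful:

```python
import sys
from array import array

n = 1000
as_list = [float(i) for i in range(n)]
as_array = array("d", as_list)   # 'd' is the type code for a C double

# The list stores pointers; every float object is allocated separately.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)
# The array stores the raw doubles inline in a single buffer.
array_bytes = sys.getsizeof(as_array)

as_array.append(3.5)             # still growable, like a list
print(list_bytes, array_bytes)
```

Appending something that is not a number, such as a string, raises `TypeError`, which is the typing constraint the analogy with `std::vector<int>` is pointing at.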
yeah it's a little difficult to draw analogies because of the difference in typing
between Python and C++
yeah.
does anyone have a favorite feature of numpy?
Oooh
That's a tough one
One of my favorite features is being able to compute functions across any axis I desire or the matrix as a whole
I guess np.where for me.
einsum
einsum is like regex for vector/matrix math
Im having headache
One thing that is inconsistent in Python and kind of breaks its duck typing model is the relation of classes, callables and functions.
What's the best way to input images into a machine learning model
@tribal skiff This is not the right channel, read the channel description. You can ask in a help channel ( #how-to-get-help ) or #data-science-and-ml
!e
class SomeCallable:
    def __call__(self, *args):
        print("some callable was called", *args)

def f(*args):
    print("f was called", *args)

class MyClass:
    def g(*args):
        print("g was called", *args)
    my_f = f
    some_callable = SomeCallable()

x = MyClass()
x.g()
x.my_f()
x.some_callable()
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
001 | g was called <__main__.MyClass object at 0x7fb1843780a0>
002 | f was called <__main__.MyClass object at 0x7fb1843780a0>
003 | some callable was called
So functions are treated specially and are made into bound methods.
Has something to do with the __get__ of function objects I believe
yeah, it's about the way that functions implement the descriptor protocol.
and technically SomeCallable could do it as well.
From the Descriptor How-To Guide in the documentation
class Function:
    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        if obj is None:
            return self
        return types.MethodType(self, obj)
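To illustrate the remark that a callable class could opt in as well, here is a sketch that gives a callable the same `__get__` as the simulated function object above. `BindableCallable` is a made-up name for this example.

```python
import types


class BindableCallable:
    """Callable object that binds to instances the way plain functions do."""

    def __call__(self, *args):
        return args

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self                      # accessed on the class: stay unbound
        return types.MethodType(self, obj)   # accessed on an instance: bind it


class MyClass:
    some_callable = BindableCallable()


x = MyClass()
received = x.some_callable(1)   # the instance now arrives as the first argument
```

Without `__get__`, `x.some_callable(1)` would receive only `1`, which matches the earlier eval output where nothing was printed after "some callable was called".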
One could make the argument, though, that every callable ought to implement something like that by default.
that is, that object.__get__ ought to exist and check if __call__ exists and if so do that.
I can't really think of a consistent way to define class methods.
I guess that methods are such a common use case that the protocol-based nature is a bit broken
Well, one way is to make methodness explicit and make a @method decorator, but that will make most methods have this decorator.
So making self an explicit parameter comes with some drawbacks, it seems.
Some time ago I made this https://gist.github.com/decorator-factory/9cede41abf48cbef24d6482b713daed9
!e
def inherit_from(parent):
    def decorator(child):
        def func(*args, **kwargs):
            prepared_object = parent(*args, **kwargs)
            child(prepared_object, *args, **kwargs)
            return prepared_object
        return func
    return decorator

def base(*_, **__):
    return {}

@inherit_from(base)
def animal(this, age, *_, **__):
    this["age"] = age

@inherit_from(animal)
def wolf(this, age, howl_length, *_, **__):
    def howl(): print("u" * howl_length)
    this["howl"] = howl

x = base()
print("base() =", x)  # {}
cat = animal(17)
print("animal(17) =", cat)  # {'age': 17}
dog = wolf(17, 6)
print("wolf(17, 6) =", dog)  # {'age': 17, 'howl': <function...>}
dog["howl"]()  # uuuuuu
@grave jolt :white_check_mark: Your eval job has completed with return code 0.
001 | base() = {}
002 | animal(17) = {'age': 17}
003 | wolf(17, 6) = {'age': 17, 'howl': <function wolf.<locals>.howl at 0x7fc1e078dee0>}
004 | uuuuuu
Basically, it uses object-closure duality or something
Like one of the million idioms for creating a class in JS
Actually... A class is a dependency injection manager for a collection of functions (methods)
in PyQt5 i have window1 -> opens window2 -> opens window3. if window3 has a problem i want to close window3 and 2 and return to window1. how do i return throw/catch something between QmainWindows?
@unkempt rock This is not the right channel, see channel description. Check out #how-to-get-help or #user-interfaces .
ok sorry
So, I finally overhauled my Python code into this: https://github.com/sarimmehdi/master_thesis/blob/master/real_time.py
And now I am trying to convert it to cython. I am reading online that lists shouldn't be used in cython and we should use arrays instead. Is that true? Because I use lists (and do a lot of list comprehension) and not using lists would mean doing a complete do-over of my code. Also, what about dictionaries?
so does anyone know how would you go about making named pipes but on Windows?
cuz apparently os.mkfifo is Availability: Unix.
named pipes in windows is a reallllyy big pain in the ass
Hi
@proven sphinx that would be a great topic for #tools-and-devops
It looks like community would be better for me than professional but eh
Last time I researched pro was for web dev and full stack while science and scripting could be handled by community
Okay I lied there are some scientific tools available that I use a bit but could probably make better use of
I have an issue on macos (on linux it works fine) where I monkey patch a library that uses the multiprocessing module and on linux, the patched function is called as intended, but on macos, the original function is called in the child processes
anybody run into this, know an approach to fix it?
that sounds complicated! Why do you need to monkeypatch?
What does it mean to monkey patch something in this context?
i took it to mean, modify the code in a third-party library by changing it at run time
I don't think this is really on topic for this channel either.
What does it mean to...
import functools
original_func = module.func
@functools.wraps(original_func)
def mypatch_func():
    ...  # shenanigans go here
    return original_func()
module.func = mypatch_func  # later lookups of module.func now hit the wrapper
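A runnable version of the same pattern, using `json.dumps` purely as a stand-in for the third-party function. One possible explanation for the macOS behaviour (an assumption, the chat never confirms it): since Python 3.8 the default multiprocessing start method on macOS is spawn, so child processes import the library afresh and never see a patch applied in the parent, while fork on Linux copies the already patched state.

```python
import functools
import json

calls = []
original_dumps = json.dumps


@functools.wraps(original_dumps)
def patched_dumps(obj, **kwargs):
    calls.append(obj)                  # the "shenanigans": record every call
    return original_dumps(obj, **kwargs)


json.dumps = patched_dumps             # patch: later lookups of json.dumps hit the wrapper
try:
    encoded = json.dumps({"a": 1})
finally:
    json.dumps = original_dumps        # always restore the original
```

If that assumption holds, applying the patch inside the worker (for example in a pool initializer) rather than only in the parent would be one thing to try.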
ok, sorry for off topic
Maybe it is and I'm not seeing the connection but I would probably use a help channel.
ok, I didn't grasp the concept, I thought those suffixes were project names
Take a look at #how-to-get-help
What are use cases for using array library? I have never come across a situation where I have used one. A list usually can do what I need.
same here
I have the vague sense it's for interoperating with other languages, like maybe C or FORTRAN
but that's just a guess
Basically, when you care about memory usage and operation time
You usually don't, so lists are plenty
But when you start getting high data throughput or you have some sort of realtime constraints or memory constraints then it's something you should pay more attention to.
@pseudo cradle thanks
np
I vaguely remember Arcade apparently uses them instead of numpy arrays in some place or other because they're more efficient
I'm curious of some python libraries that do rely on them though - everything I can think of just uses numpy for that stuff
the niche where you use array.array seems very small
I agree personally, but maybe it's for when you're storing data but aren't planning on doing any sort of computation with it
it's a nice mix between lists and numpy arrays
you get the compact size with the benefits of dynamic arrays
but realistically, it's really useful in very few cases
just when you need a compact variant of list
Maybe in embedded, but at that point I'd just use a different language
Micropython is pretty sweet.
I may explore it some day. We'll see.
I really want to get into programming microcontrollers and more hardware-heavy projects at some point as well. It seems super fun.
theres so many things i want to try/play with, i need more time in my life
I've worked with Ada, C, and ASM for microcontrollers
It's pretty fun but it's a major paradigm shift if you're not used to it
Most coders don't ever really have to think about things like firmware or memory
I did a few simple hardware projects at uni, but that's it.
I've only worked with microcontrollers for hobby or academic use, but may begin some industry work in a few months with it
so suppose if i decide to write a interpreter for a language in python. what do i gain from it ?its pretty vague question but im curious what i can learn from it ?
you'd learn what it's like to write a language i suppose
a deeper understanding of how interpreted languages operate
hmmm i c i c but anyother benefits ??
Learning on how parsing work, how lexical analysis work, how stack machines work...
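As a taste of those pieces, here is a deliberately tiny sketch: a lexer plus a stack machine for postfix arithmetic. The function names are invented for this example and a real interpreter would add a parser in between.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def tokenize(source: str) -> list:
    """Lexical analysis: split the source into number and operator tokens."""
    tokens = []
    for word in source.split():
        tokens.append(word if word in OPS else float(word))
    return tokens


def run(tokens: list) -> float:
    """Stack machine: push numbers, pop two operands for each operator."""
    stack = []
    for token in tokens:
        if isinstance(token, float):
            stack.append(token)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
    if len(stack) != 1:
        raise ValueError("malformed program")
    return stack[0]


result = run(tokenize("3 4 + 2 *"))   # (3 + 4) * 2
```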
ohhh i c i c ty bud
Taking this back from general: because of the GIL, I wonder what the practical differences are between asyncio and threading, apart from the fact that in asyncio the context switching is done by the Python event loop while in threading it is done by the OS
i might throw some quick question regarding GIL too. so when u run aysncio, isnt it running in only one thread coz of GIL?
Asyncio is intended to be run sequentially, it is just a way to be able to execute another piece of code while waiting for an IO operation to complete
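A minimal sketch of that idea, using `asyncio.sleep` as a stand-in for real I/O (`fake_io` is a placeholder name). Everything runs on one thread, yet the three waits overlap:

```python
import asyncio
import time


async def fake_io(name: str, delay: float) -> str:
    """Stand-in for a network or disk wait; awaiting yields to the event loop."""
    await asyncio.sleep(delay)
    return name


async def main() -> list:
    # The three waits overlap on a single thread.
    return await asyncio.gather(
        fake_io("a", 0.2), fake_io("b", 0.2), fake_io("c", 0.2)
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, round(elapsed, 2))   # roughly 0.2 s in total rather than 0.6 s
```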
i get the concurrency part. but is there any way u can bypass GIL and achieve multi threading in python?
The only way is to launch a new interpreter, this is what multiprocessing does
or to create a thread and run entirely non-python code in it
Doesn't have to be entirely non Python code in that thread. The thread can do a mix of Python and non-Python stuff, and drop the GIL while doing the non-Python stuff. That's how numpy works, for example
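The same effect can be seen with plain threads whenever the work releases the GIL. This sketch uses `time.sleep` as the stand-in, since it drops the GIL while waiting; `blocking_call` is a made-up name.

```python
import threading
import time


def blocking_call(results: list, index: int) -> None:
    """time.sleep releases the GIL, like many I/O and C-extension calls."""
    time.sleep(0.2)
    results[index] = index


results = [None] * 4
threads = [
    threading.Thread(target=blocking_call, args=(results, i)) for i in range(4)
]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print(round(elapsed, 2))   # close to 0.2 s, not 0.8 s
```

Swap the sleep for a pure-Python loop and the threads would take turns holding the GIL instead of overlapping.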
hmmmm but how do u connect the process then ?? for the non python code in the other thread?
is there already tool build for it ??
You write code in another language in such a way as to allow it to be imported as a Python module. Cython is a tool that simplifies that.
oh sweet cpython here i come thnx heaps golygeek
Cython, not CPython. Entirely different things.
cython here i come.thnx again. i swear i learned so much stuffs about programming after joining this python group lol
I want to learn python imaging library but i can't find any good sources to do it
#data-science-and-ml is a better place for that question I think
is this considered out of date now : https://stackoverflow.com/questions/9001509/how-can-i-sort-a-dictionary-by-key ?
dict(sorted(d.items())) is enough now I think, yeah
Tuples are compared by first element first, then by second and so on.
In a sense, yes. Starting with Python 3.7 dicts are guaranteed to retain insertion order (CPython 3.6 already did so as an implementation detail). So if you create the dictionary with the order you want, it will keep it.
So that sorts by keys, then by values.
note that OrderedDicts have nicer properties and APIs still
if you really want to enforce/rely on insertion order
yeah, though it's important to note (at least it was notable to me) that OrderedDicts are for situations where the order is actually important as much as the contents - for example, comparison of OrderedDicts takes order into account, unlike dicts.
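Both points, sorting by key and order-sensitive comparison, fit in a few lines. A small sketch with made-up data:

```python
from collections import OrderedDict

d = {"b": 2, "c": 3, "a": 1}
by_key = dict(sorted(d.items()))   # insertion order now follows the sorted keys

# Plain dicts ignore order when compared; OrderedDicts do not.
same_as_dict = {"a": 1, "b": 2} == {"b": 2, "a": 1}
same_as_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)

print(list(by_key), same_as_dict, same_as_ordered)
```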
^
i think if you're using sorted on a dict then creating a new dict
OrderedDict is much better
if only to semantically indicate in the code that "this is a dict where order matters"
because conceptually a dict is an unordered mapping
hrm - is that not assumed now though?
i think its still semantically valuable
are you sure you're not just used to thinking about this not being the case from python < 3.6?
afaik this is standard now?
dicts preserve insertion order
yeah
but 9 times out of 10 you dont care
personally i think for the cases where you do care
its much clearer to use ordereddict
i wish
There are definitely some niceties on OrderedDict if you are performing lots of reordering operations. If you are just constructing the mapping once, a regular dict is probably all you need.
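The reordering niceties are mainly `move_to_end` and `popitem(last=False)`, which plain dicts do not offer. A short sketch with placeholder keys:

```python
from collections import OrderedDict

recent = OrderedDict.fromkeys(["a", "b", "c"])

recent.move_to_end("a")                      # order is now b, c, a
recent.move_to_end("c", last=False)          # order is now c, b, a
oldest_key, _ = recent.popitem(last=False)   # removes and returns the front item
```

This is the kind of bookkeeping a small LRU cache needs.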
it'd be nice if pandas could consider empty dicts as NA values.
OrderedDicts are something that I've managed to avoid - probably in part bc of the new standard
I vaguely recollect being angry at OrderedDicts when doing something with serialization. It might have been way back using tastypie or drf.
@magic python we just had a discussion here last week about how empty containers should not be considered falsy
now you want them to be null?
did we? I don't recall, i use falsy things a fair bit - i just changed the default value so it's not an empty dict now
you might not have been reading at the time
im 50/50 on it, i dont hate it nor do i think its an amazing wow feature
The strongly typed enthusiasts will eat you alive.
i think in general it's reasonable that bool([]) is False, and that if <expr> should implicitly call bool() on the result of the expr
re: pandas null-ish definition. @magic python usually i dont want to conflate "empty dict" with "null"
and when i do, i either use one or the other consistently
e.g. either i replace all empty dicts with None or i replace all None/nan with empty dicts
i only mix empty dicts with null values if i want to distinguish between "null" and "has data but it's empty"
also how often do you really need a dict inside a series
(j/k i use it all the time)
I was initialising a structure then filling it - but where there are no values i was left with empty dicts :/
i was enjoying TDD last week, this week i realise i need to change everything
now i hate it
that's where data cleaning comes in, transform them yourself into NaN?
@magic python one important thing is that you don't need to write a direct test for each entity. Suppose you have a function f and a bunch of tests on f. You can add some additional functionality to f (like an optional parameter to pass as strategy, as in sorted), then you can make another test that tests this new functionality. Then you might think that f is doing too many things and decide to refactor it -- split it into two steps, g and h. Don't write tests for g and h. I.e. test the public interface, not the implementation details, because the tests for f will ensure that g and h work
if that makes sense
I am already washing my Monad m => Fork m f
@unkempt rock This is not the right place to ask. If you want to get help, you can ask in #python-discussion or grab a help channel ( #how-to-get-help )
gotcha, sry!
I meant that it is not true that every function, class etc. needs its own dedicated test.
Because if that's true, there's very little point in testing. Tests help you refactor code and add new features without breaking old ones. And if every change also requires changing the tests, you don't have that guarantee
so if i have f( g( h( x) ) ) i'd just test f
If f(x) = g(h(x)) then you can test f, yes. Because if g or h have a defect, that defect should probably break f as well.
Except one case: when two defects cancel each other out. But that's pretty rare.
well, it wouldn't necessarily be that f(x) == g(h(x)), but i think i get the point
well, unless you're in Haskell or something, it's more like f = @aaaaa(g(x + 42); h([a, b, c]); monkeypatch(print with django.model.view.controller))
i meant that just composition might be rare, unlike in FP
alright, that's a weird way of saying that
i think a lot of my functions are composed actually - which was annoying bc i'd written tests for them all
i guess i can delete them and just leave the top one
Yeah, if you have a function that's like
def foo(x):
    return bar(baz(fizz(buzz, x)))
where foo is some kind of a public function and bar and baz are helpers, you only need to test foo.
right - i get the point, I didn't get the point last week
though writing tests for everything did make some things kinda handy, maybe not worth it tho
And if bar and baz aren't fully covered, then:
- you didn't cover some cases
- the covered code is inner code that fails only when some invariant fails (if that's important to test, it might make sense to)
- some code is dead
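A sketch of what "test the public interface" can look like in practice. The functions here are invented for the example: `normalize` plays the role of `foo`, and the underscore-prefixed helpers play `bar` and `baz`.

```python
import unittest


def _strip(text):        # private helper, exercised only through normalize()
    return text.strip()


def _collapse(text):     # private helper, exercised only through normalize()
    return " ".join(text.split())


def normalize(text):
    """Public function: the only one the tests call directly."""
    return _collapse(_strip(text)).lower()


class NormalizeTests(unittest.TestCase):
    def test_trims_and_lowercases(self):
        self.assertEqual(normalize("  Hello  "), "hello")

    def test_collapses_inner_whitespace(self):
        self.assertEqual(normalize("a   b\tc"), "a b c")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

If `_strip` and `_collapse` are later merged or rewritten, these tests keep passing as long as `normalize` behaves the same, which is the guarantee being described.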
but you might have a different case, so don't listen to me
tests sort of violate DRY (Don't Repeat Yourself) because when you change something, you also have to update the tests. So too many tests can get annoying, but just the right amount is quite useful
yolo
LMMUA
https://kentcdodds.com/blog/aha-programming/ not 100% relevant, but also talks about DRY not being applicable in all cases
can I get a job with python without a degree
and just projects which I have made
to show
short answer: yes, but wrong channel
I think that's the long answer as well :)
iT dEpEnDs
that's the real long answer. Alright, drifting to offtopic here
what if I dont get the job
we should move to #career-advice before mods get angry
Well, it's not for the mods' pleasure. Keeping stuff on-topic is just better.
You know, Single Responsibility Principle
I want to do n independent function calls in parallel on m CPUs, n > m if it matters. I've done it before with joblib but multiprocessing is stdlib
is there any advantage to joblib?
Joblib can use one of several backends, including multiprocessing.
So I'd expect joblib to just provide simpler ways to do it than manually working with multiprocessing/threading.
looks like joblib is already a dependency of one of the requirements
of what I'm doing
might as well use it, I guess
It appears that I can't use joblib.Parallel because each call relies on a shared object that isn't serializable.
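import multiprocessing as mp

pool = mp.Pool(num_cores)
for ann in dataset:
    # apply() would block on each call; apply_async() lets the workers run in parallel
    pool.apply_async(_psudofy_file, args=(ann, output_dir))
pool.close()
pool.join()  # wait for all queued calls to finish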
Hopefully this does what I think it does
well, multiprocessing also would not work in this case
it also has to move the shared objects into other processes

I have to leave and we're edging off-topic-for-this-channel territory so I'll bring it up somewhere else later if needed
@flat gazelle Shared memory Tho that would probably cause so so so many issues
Is there actually a way to share an arbitrary python object as read-only between processes
Something pretty big and complicated and not easily serializable to json or something
pretty much the only things you cannot pickle are things that talk to the OS, which cannot really be read only
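A quick way to check that claim for a given object is simply to try it. A sketch with a placeholder `model` dictionary standing in for the shared object:

```python
import pickle
import threading

model = {"weights": [0.1, 0.2, 0.3], "labels": ("cat", "dog")}
restored = pickle.loads(pickle.dumps(model))   # plain data round-trips fine

try:
    pickle.dumps(threading.Lock())             # an OS-level resource
    lock_picklable = True
except TypeError:
    lock_picklable = False

print(restored == model, lock_picklable)
```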
Its pickleable
Just huge
(This is hypothetical but potentially useful in a few applications i can think of)
I've been impressed with Ray. It uses an in-memory object store to pass data around the different actors.
Isnt that specifically for dataframes though
The main use case in my head is machine learning though
Ray? No. I think you are thinking about Dask.
I'm still confused by your use case.
perhaps some python wrapper for open mp, if you're in need for distributed processing
Make predictions in parallel from a big model without duplicating the model in memory
If its not really possible thats fine im just curious
            Shared      Unshared
Mutable     AAA!!!!!    ok
Immutable   ok          very nice!!!!
some shared memory segment makes sense for your use
@grave jolt rust did it a bit better than just this I guess
well if we are talking about memory here
Anyways I was thinking what if Python's compiler was not one-pass so it could support forward references
@grave jolt yes im looking for shared and immutable
"I have 3 gb of float32's and metadata, which i want to share between 16 processes, read-only"
wild
you could pack those into a shared memory I think
but I never actually looked at how to use that
Yeah, multiprocessing shared arrays are interesting here
Numpy itself also supports memmapping
Not sure if you can memmap something read-only from 2 different processes
I think you can, eg joblib i think tries to do this
But probably using c extension stuff
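On the read-only memmap question, here is a minimal sketch using only the standard library `mmap` module. It assumes the data can be dumped to a file as raw float32 values; `write_floats`, `read_floats` and the file name are made-up names for illustration. Every process that maps the file with `ACCESS_READ` gets a read-only view, and the operating system can generally back those views with the same physical pages instead of one copy per process.

```python
import mmap
import os
import tempfile
from array import array


def write_floats(path, values):
    """Dump the values to disk as raw float32 in native byte order."""
    with open(path, "wb") as handle:
        array("f", values).tofile(handle)


def read_floats(path):
    """Map the file read-only and return its contents as a list of floats."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Both memoryviews must be released before the mapping is closed.
            with memoryview(mapped) as raw, raw.cast("f") as floats:
                return list(floats)


path = os.path.join(tempfile.mkdtemp(), "weights.bin")
write_floats(path, [1.0, 2.5, -3.0])
values = read_floats(path)
```

Copying into a list is only there to keep the sketch short. A real worker would keep the mapping open and read from the view directly, which is roughly what numpy's memmap support does for arrays.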
for logging do you think using modulus substitution is faster than f strings?
f-strings must be faster
I assume that f strings always get evaluated even if the statement doesn't get logged.
The time for interpolation on any reasonable string will be negligible
```
nekit $ python -m timeit -s "import logging; log = logging.getLogger(); test = 42" "log.info('%s', test)"
500000 loops, best of 5: 666 nsec per loop
nekit $ python -m timeit -s "import logging; log = logging.getLogger(); test = 42" "log.info(f'{test}')"
500000 loops, best of 5: 757 nsec per loop
```
interesting
but I suppose if it does get logged, f-string interpolation will be a faster way
```
nekit $ python -m timeit -s "test = 42" "f'{test}'"
2000000 loops, best of 5: 163 nsec per loop
nekit $ python -m timeit -s "test = 42" "'%s' % test"
1000000 loops, best of 5: 249 nsec per loop
```
well I mean it is, obviously
logging statements do modulus substitution their own way, rather than using str.__mod__
oh interesting, let's test that
They use % internally i think
I think i looked at the source once
Although dont quote me on that
logging uses % internally?
You could say that. It has support for lazily interpolating those strings, which means that the string interpolation only happens when that logging message is actually relevant for something (usually: the logging level is low enough for it to actually be logged).
yeah it seems to
```
def getMessage(self):
    msg = str(self.msg)
    if self.args:
        msg = msg % self.args
    return msg
```
yeah also there is a bunch of styles but they seem to be using underlying formatting like .format() or %
Yep that ^
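The lazy interpolation described above can be observed directly. This is a small sketch, not a benchmark: `Expensive` is a made-up class that counts how often it gets converted to a string, and the logger name `lazy_demo` is arbitrary.

```python
import io
import logging


class Expensive:
    """Counts how many times it is turned into a string."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive"


stream = io.StringIO()
log = logging.getLogger("lazy_demo")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))

lazy = Expensive()
log.debug("%s", lazy)    # below INFO, so the message is never formatted

eager = Expensive()
log.debug(f"{eager}")    # the f-string is built before debug() is even called

shown = Expensive()
log.info("%s", shown)    # actually emitted, so it is formatted once
```

With %-style arguments the cost of `__str__` is only paid for records that are actually handled, which is the case the timeit runs above do not cover.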
Python's stdlib is not fully pythonic imo :p
You thought that I was totally crazy when I told you about the ,= operator.
This file https://github.com/lark-parser/lark/blob/master/lark/load_grammar.py uses it 10 times
and I had nothing to do with it!
LOL
I'm still laughing
There are 10 times it uses it as ,= and one time it uses it as , =
there's also... this
if literal.type == 'STRING':
    s = s.replace('\\\\', '\\')
return { 'STRING': PatternStr,
         'REGEXP': PatternRE }[literal.type](s, flags)
nah, an if is too simple
wow
lol
I guess it's technically more compact and easier to expand if you get new literal types, but uhh
but why though
as a class attribute
I guess you can emulate immutable dicts as frozensets of tuples.
please no
CPython will store them as constants!!
oh, wait, no
because it can't store the classes
actually I was looking for a nice bytecode optimizer
or like just generally useful bytecode manipulators
```
test_value = 42
@constant("test_value")
def test() -> int:
    return test_value
```
and LOAD_GLOBAL will be replaced with LOAD_CONST here, as one of my thoughts
Is it possible to speed up Pandas by using Cython for specific things?
I can't remember the book because it was available through work, but it basically went over Pandas and talked about its deficits in certain areas. It spoke about how in some cases using .apply() is actually really freaking slow and advised using custom Cython implementations.
I was just wondering if anyone has had experience with that.
I guess it's more for #c-extensions
@swift imp I would try numba first
@paper echo @swift imp Numba is definitely a lot simpler to use and I would also recommend trying it first, but you are likely not going to get quite as much performance as you could gain from optimal usage of Cython.
Whether Cython is worth the added complexity depends on your specific use case and performance requirements.
Yep thats my experience as well. Im not a good enough C programmer to take full advantage of cython so i use numba for most "mathematical" things that need speed
What's the difference between numpy and numba?
One of the large computers owned by my uni only has up to python3.6 and I think the administrator is always afraid to install newer versions
Whereas cython has helped me a lot in other cases where i needed to wrap a C++ library or reduce overhead
is there any reason why he should be afraid to install 3.8 now that it's been out a while?
Sometimes i get a 50% speedup literally by just cythonizing a function unchanged
nice
@pseudo cradle numpy is a vector/matrix/linalg library. Numba is a JIT compiler for python based on LLVM that supports numpy
Cython is a Python-like language that compiles to C but with the specific purpose of writing C extensions for CPython
I've heard Numba being thrown around before but I don't think I've ever used it. I'm never not using numpy, though
godlygeek was trying to sell me on the virtues of Cython when I was looking into ctypes
What was your use case?
I had an obscure binary file and the parser was in complicated C code. I needed to integrate it with a Python program
What's the point of cython? Is it just speed boost or more?
It serves as a superset of the Python language that allows you to write C-style code that can be used mainly for building C extensions
C is about as close to the metal as you can get unless you want to do assembly or machine language. Cython lets you use it in python
Or at least, python-like syntax. It has some added constructs and more requirements (particularly if you want to get the most out of it), but it's much easier than programming in C and avoids many of the pitfalls.
Basically a lot of the stuff that makes python great comes at the cost of a lot of overhead. C on the other hand has little overhead but it's a lot more verbose and requires many layers of abstraction to do anything super useful
Yeah, you don't actually write C
I just realized that what I'm saying sounds like that
Does cython have dynamic typing still?
I wonder if I can use cython and still claim my project is pure python
You can use dynamic types in Cython, but using static types results in a significantly better performance improvement. Also, Cython IS an implementation of the language Python. When you say "pure python", you're actually referring to CPython, the reference implementation of the language.
So, a project that uses Cython instead of (or in addition to) CPython can still be considered 100% Python
Posted this in the wrong channel:
len([e for e in gold_ents if e.tag == tag]) # option a
[e.tag == tag for e in gold_ents].count(True) # option b
Which do people prefer?
intent clearer on the first one imo
i mean on the second one you know right away you're counting the Trues
Seems like it's a matter of preference, I like the second one
I switched to the second one because I feel like it's more explicit
Hah, looks like it's fairly evenly split
I feel like a builtin that tells you how many times a condition is true for an iterable should be a thing
Yes, that is sum
hello
it's one of the more common problems that doesn't seem to have a "one obvious way to do it"
That seems useful
yeah but if you're using sum you're using the fact that True is 1 and I try to avoid that
both of the original 2 eagerly create temporary lists wasting memory
count_true = sum; count_true(...)
And yeah i mostly just hate creating lists that get thrown away
tbh I'd almost be tempted to put count_true = sum in my code
Well to be really correct you'd have to cast everything to bool before summing
abusing the fact that bools are ints feels wrong but it is concise and efficient lol
def count_true(items: Iterable[Any]) -> int:
    return sum(map(bool, items))
is treating bools as ints really idiomatic? it's useful here but I've never seen it in the wild, at least intentionally
Actually yes i would say it's strongly idiomatic
To the point where "sum of bools == # true" and "mean of bools == % true" are synonyms in my mind
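For counting without building a temporary list and without leaning on True being 1, one option is to sum a generator of ones. A minimal sketch; `count_where` and the sample tags are made-up names for illustration:

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def count_where(items: Iterable[T], predicate: Callable[[T], bool]) -> int:
    """Count the items for which predicate(item) is truthy."""
    # The generator yields a plain 1 per match, so no list is materialized
    # and no bool is ever added up as an int.
    return sum(1 for item in items if predicate(item))


tags = ["PER", "ORG", "PER", "LOC"]
per_count = count_where(tags, lambda tag: tag == "PER")
```

It works on any iterable, including one-shot iterators, because the input is only walked once.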
in the domain of numpy maybe but I can pretty confidently say it's pretty obscure to most 'application' python devs
!e print(isinstance(False, int))
oof
@paper echo :white_check_mark: Your eval job has completed with return code 0.
True
wdym ints and bools should be different, where's the issue
In [2]: {True: "bruh"}[1]
Out[2]: 'bruh'
I guess it's weird but in a world of weak typing you can justify it as casting and not a subclass thing
/s
yeah I know it's the case, it's just not really well known or utilized in my experience, and has always felt more leaky or vestigial than intentional
yeah not implying it's accidental lol, just not super loudly mentioned
falsey empty containers are not actually integers like bools
that would be a bit more confusing
well yeah, they're containers
yeah that's part of why I have the view of it that I do
and how like a big point of tuples has really always been to save refcount mutations
I use sum, but I do find it a little weird
@pseudo cradle why would you use b?
i mean you can always change the step
honestly I don't know anywhere the latter would be used (not just Python) unless the "counter" was changed by an amount depending on what happened each iteration
the only reason to use B is if you want the loop to end before i >= x, and you can use i
or if you have a random step ig, that's weird though
there was a problem that i had once, where you had to simulate a frog and it jumped random intervals each time
I have some usecases for B, but my current use case does not fit. I'm just... tired.
But I would use option b if I was traversing data with multiple pointers, but this is not that case
does anyone know a good way to get good at python i only really know discord.py
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
wait, how are you not at least decent if you really know discord.py
I asked for opinions on a draft blog post in #unit-testing , but it's a very low-traffic channel...
How to kill a child process of a child process without affecting the virtual parent
is there any sort of pattern that's like a zip where one element is a generator?
I guess I could just write a for loop
@magic python You can give zip a generator.
any iterable should work
oh, yes, i must have done something wrong before
j = itertools.cycle("ab")
[x * i for x, i in zip(j, range(9))]
is it bad to re-use typevars?
from typing import Mapping, TypeVar
_T = TypeVar('_T')
def f1(x: _T) -> _T:
    return x
def f2_unrelated_to_f1(x: Mapping[str, _T]) -> Mapping[str, _T]:
    return {k: v for k, v in x.items() if k.startswith('Q')}
is this ok in general?
I dont think reusing typevars is necessarily that bad? but im not really sure
I think it's fine
I don't think of it as tied to a certain object or function
Imagine if you had a more complex typevar, like one that put restrictions on the types. Then I'd consider it to be just a generic thing with defined restrictions that can be used by anything that needs those restrictions, in the same way multiple unrelated things could reuse type aliases
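A sketch of that idea, assuming a TypeVar bounded by `Hashable`. The names `HashableT`, `dedupe` and `most_common` are hypothetical; the point is only that two unrelated functions share one restricted TypeVar the way they might share a type alias.

```python
from typing import Hashable, List, Sequence, TypeVar

# One restricted TypeVar, reused by unrelated functions.
HashableT = TypeVar("HashableT", bound=Hashable)


def dedupe(items: Sequence[HashableT]) -> List[HashableT]:
    """Drop repeated items while keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def most_common(items: Sequence[HashableT]) -> HashableT:
    """Return the item that occurs most often (ties are broken arbitrarily)."""
    return max(set(items), key=items.count)
```

Each function binds `HashableT` independently per call, so sharing the name does not couple the two signatures.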
that makes sense
Granted, if you have multiple classes defined nearby which don't have a relationship, then maaaybe having separate typevars for them would be more readable?
I don't really think so though
I'm not a huge fan of one letter type var names anyway, so in that situation I'd create separate ones to be more descriptive, even if they are identical otherwise
The admin for my uni's big computers apparently said that he will never install a python version past 3.6. My advisor said she's going to change his mind but I'm not sure what his hesitation is.
what is his reasoning?
Security, I guess.
i mean, in the real world - uni or business - the newest version isn't always the best option
I thought he just hadn't gotten around to installing 3.7
Though after a while won't older versions become less secure?
they still receive security patches
(as long as the minor version itself is supported)
But not indefinitely, right?
already? damn
they grow up so fast 😢
oh, basically, 1.5 years of full support, then 3.5 years for security patches
security doesn't make sense for reasoning though 🤷‍♂️
Fwiw 3.6 is still "usable"
F strings are nice i guess but damn 3.7 has been out for a while
3.6 has fstrings. For me the things that I missed from 3.8 to 3.6 were only dataclasses and the walrus
3.5 reaches EOL this september https://devguide.python.org/#status-of-python-branches
oh right
wait what is 3.6 missing then?
importlib.resources i guess but that's been backported
can't name variables await lol
postponed type annotation evaluation
i guess dataclasses are nice but you have attrs anyway
i havent yet found a use case for ContextVars
for the things that can't get backported reasonably postponed annotations and a lot of async things
true asyncio was pretty rough in 3.6
oh wow module-level __getattr__ was new in 3.7 i didnt realize
oh and the whole C locale change
that alone might be reason enough to not upgrade in a production system tbh
C locale change?
if only python handled encoding like rust from the start
https://www.youtube.com/watch?v=qCGofLIzX6g
a good talk, with a bit of insight on encoding as well, if this is the right channel to post it in
back in 2012, it took me about a week to print something in Turkish to console because of the ASCII/utf8 stuff
i'm so grateful for python3
love to hear it
it turned out my Windows login name had a turkish character (non-ASCII) in it, so none of the path stuff was working
haha, know that too well
the beauty is, you can't change it unless you go headfirst into the registry
that's one of the moments you just say, fuck windows
gutwrenchingly annoying moment
its more like a fuck python2 moment for me
a lot of stuff doesn't work if you have a non-ASCII character in your profile name on windows
Docker, a lot of path stuff breaks, some things rely on it internally like Android Studio, preventing fixes by hand
well, i was more or less a regular user before that so i didnt feel the UTF-8 pain until learning python
but yeah, i see your point
...and now, in 2020 we broke the unicode stuff again with all those emojis
right now i'm trying to find a way to show emojis on wikipedia but i keep getting zero width joiner errors
i hate emojis...
emojis don't do anything special, do they? The same modifier mechanism is used for actual text as well afaik
i encountered more issues on emojis than regular text
i think some emojis use something like two unicode characters or something
ye, for example flags are sequences of regional indicators
im actually trying to figure it out right now
you have skin tone modifiers as well
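To make the multi-code-point point concrete, here is a small sketch that spells out three sequences with escapes so nothing depends on how the text is rendered. The variable names and the `describe` helper are just for illustration.

```python
import unicodedata

# Each of these renders as a single glyph on most platforms.
astronaut = "\U0001F469\u200D\U0001F680"   # woman + zero width joiner + rocket
flag = "\U0001F1F9\U0001F1F7"               # regional indicators T and R
thumbs_up = "\U0001F44D\U0001F3FD"          # thumbs up + medium skin tone modifier


def describe(text):
    """Return the Unicode name of every code point in text."""
    return [unicodedata.name(char, "UNKNOWN") for char in text]


lengths = {name: len(value) for name, value in
           [("astronaut", astronaut), ("flag", flag), ("thumbs_up", thumbs_up)]}
```

Slicing or truncating such a string by code point can split a sequence in the middle, which is one plausible source of stray zero width joiner errors.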
π©βππ¨βπ 240 individuals from over 19 countries
π 250 miles above Earth
π° 17,500 miles per hour
Join us this fall for our #SpaceStation20th anniversary as we celebrate the dream we call the @Space_Station: https://t.co/M79yOYokLt https://t.co/09tEk4kvXd
1927
9459
like the emojis in this tweet
oh damm, twitter straight up replaces emojis with images
when I get the text from Twitter API and manipulate it, Wikipedia templates give an error
https://tr.wikipedia.org/w/index.php?title=KullanΔ±cΔ±:Khutuck/deneme_tahtasΔ±&oldid=23006294
same Tweet on wikipedia
yet as an alt, a pure emoji remains
can someone give example of proper commenting? HAHA
Not how-to-comment
but general one-liners that get to the point
you shouldnt strictly need one line comments if you actually name your shit sensibly and type accordingly :P
imo, comments should only be used to explain obscure behaviour when really you can't make your code cleaner without hitting another issue
`# we need to do this because of that` isn't a bad comment imo; what the code itself is doing should be self-explanatory for the most part
you also only have that comment for that one line
sensible var names are worth 1000% more than comments
my rule of thumb is to document code with 3 questions, 1 of which is optional
what ? -> variable names and docstrings
how ? -> the code itself
why ? -> this is where comments may be used when that question is worth being answered
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
nvm
you meant 16, but still
i agree with you, but sometimes it's not applicable
i know what you mean
I'm wondering what the guy(s) that created this were thinking about being able to explain the implementation or not :p
```c
float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;
    x2 = number * 0.5F;
    y = number;
    i = * ( long * ) &y; // evil floating point bit level hacking
    i = 0x5f3759df - ( i >> 1 ); // what the fuck?
    y = * ( float * ) &i;
    y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
    // y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
    return y;
}
```
there are a lot of really really really smart people who wrote a lot of library code
ah yes, the Doom sqrt thingie
quake*, but yea
its like pokemon being made in 16mb and now 15yos cry if they dont have 16gb of ram
math-heavy code is an exception among types of code when it comes to comments
@teal yacht there are entire papers dedicated to that specific function
@pearl river its actually from Quake
I know, that doesn't mean it fits the "easy to explain", quite the opposite actually
But it's not like they had better options
oh there is no good way to comment code like that
because it is probably the result of months of research
You can write a blog post and drop a link
in instances like this, a link to an article explaining it doesn't sound bad tbh 🤷‍♂️
Afaik, no one actually knows how they found that
Everything is just speculation that was done decades later when the code was released
https://betterexplained.com/articles/understanding-quakes-fast-inverse-square-root/ this is a pretty good article that explains the math
The fastest way to the solution: guess it.
it does essentially boil down to that
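For anyone who wants to poke at it from Python, here is a rough port of the same trick using `struct` to reinterpret the float's bits. This is a sketch for experimentation, not a faster `1 / math.sqrt(x)`; it assumes a positive input, and the `q_rsqrt` name is just a lowercase echo of the C function.

```python
import struct


def q_rsqrt(number):
    """Approximate 1 / sqrt(number) for a positive float32-range number."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", number))[0]
    # The magic constant minus half the bits gives a first guess.
    bits = (0x5F3759DF - (bits >> 1)) & 0xFFFFFFFF
    guess = struct.unpack("<f", struct.pack("<I", bits))[0]
    # One Newton-Raphson step refines the guess.
    return guess * (1.5 - 0.5 * number * guess * guess)


approx_half = q_rsqrt(4.0)
approx_one = q_rsqrt(1.0)
```

The shift and subtract produce the "guess" the article talks about, and the last line is the single Newton iteration kept in the C version.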
How would someone write an algo aimed at trading stocks/crypto? Really interesting imo.
Like if BTC goes up by 5% then sell
!zen 11
In the face of ambiguity, refuse the temptation to guess.
```
if btc_rates[-1] * 100 / btc_rates[-2] > 105:
    sell()
```
Interesting
(I'm kidding, don't do that)
lmao
#bot-commands
have you ever tried to recreate the "wikipedia text version differences" view for comparing 2 long strings/files?
Sounds more suited for #python-discussion or a dedicated help channel
I'm working on exactly this on wikipedia, but I need to preview the changes I'll make and need to compare the original vs modified strings
@final whale should I repost there?
I remember seeing it a while back but cant find it
how do we type hint something but prevent a recursive import issue?
using string literals?
no there was another way if i remember
that or from __future__ import annotations, which postpones the evaluation
oh that might be it
That's what you'd do for things like
class Thing:
    def named_ctor(args) -> Thing: ...
Not sure about how effective it'll be with circular imports
from typing import TYPE_CHECKING might be it as well
you generally want to combine both
the idea of TYPE_CHECKING is that you can put imports that would otherwise be circular in an if TYPE_CHECKING: block
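A minimal sketch of that pattern. `decimal.Decimal` stands in for whatever import would otherwise be circular, and `double` is a made-up function. The annotations are written as string literals here; with `from __future__ import annotations` at the top of the module the quotes could be dropped.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only type checkers follow this import. At runtime TYPE_CHECKING is
    # False, so a module that imports us back would not cause a cycle here.
    from decimal import Decimal  # stand-in for the would-be circular import


def double(amount: "Decimal") -> "Decimal":
    """Annotated with a name that is never imported at runtime."""
    return amount * 2


annotation = double.__annotations__["amount"]
```

The annotation stays an unevaluated string at runtime, which is why the missing import does not raise a NameError.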
Anyways I was thinking what if Python's compiler was not one-pass so it could support forward references
It could help with circular imports and declaring methods of a class that return that class
well, not sure about the circular import part
Well, multi passes solve a lot of things and allow more optimizations, but that's considerably slower
How to print only those elements of a list whose length is 5 in Python ?
This isn't really the channel for this sort of thing, try #python-discussion or #❓｜how-to-get-help
for element in list:
    if len(element) == 5:
        print(element)
is there anyway to get the source code of a repl.run link?
Hello, this channel is for discussing the Python programming language itself, from a higher-level perspective. Think of advanced language concepts, the implementation, the future of the language, Python Enhancement Proposals (PEPs), and so on.
Does anyone know where i could find a pdf of the book "Advance Computer Programming in Python" by Karkim Pichara and Christian Pieringer???
Hello, this channel is for discussing the Python programming language itself, from a higher-level perspective. Think of advanced language concepts, the implementation, the future of the language, Python Enhancement Proposals (PEPs), and so on.
@wide shuttle
How can I target MS Word or Notepad with Python? So that I can write on them using Speech Recognition library
@flint crystal This channel is for in-depth discussion of the Python language. If you need help, please follow the guide in #❓｜how-to-get-help
ok I have my channel open i am gonna wait for replies
y r there so many raids on this server?
I don't think that has anything to do with this channel, but all large servers have it.
Why does pkgutil.walk_packages need to import packages to get the paths? https://github.com/python/cpython/blob/master/Lib/pkgutil.py#L90-L105
I guess I really don't know what __path__ is and why it's being used over the file's path.
Namespace packages.
A package can be spread across multiple directories on disk, but only if it's a namespace package.
And if a package has an __init__.py, it may or may not be a namespace package, there's no way to know without seeing what the file does
!pep 420
^ that describes the situation before implicit namespace packages were a thing, which goes into __path__ a bit.
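To see why `__path__` and a single file path are not the same thing, this sketch builds an implicit namespace package whose two portions live in separate temporary directories. The package name `demo_namespace_pkg` and the directory names are invented for the demo.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
for portion, module in (("site_a", "alpha"), ("site_b", "beta")):
    # Neither portion has an __init__.py, so this is a PEP 420 namespace package.
    package_dir = os.path.join(root, portion, "demo_namespace_pkg")
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, module + ".py"), "w") as handle:
        handle.write("NAME = %r\n" % module)
    sys.path.append(os.path.join(root, portion))

importlib.invalidate_caches()
package = importlib.import_module("demo_namespace_pkg")
alpha = importlib.import_module("demo_namespace_pkg.alpha")
beta = importlib.import_module("demo_namespace_pkg.beta")

# One package, two directories on disk.
locations = list(package.__path__)
```

A package with an `__init__.py` can also rewrite its own `__path__` when it runs, which is the part that cannot be known without importing it.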
Thanks
implicit namespace packages are pretty nice, i use them at work
from bigcorp_datasci.some_really_specific_package import ThingClient
it lets us release and maintain lots of little small packages all under the bigcorp_datasci namespace without having to really think about the operational aspects of making that work
This isn't the channel for that. See #❓｜how-to-get-help
does anybody use metaclass meaningfully in python? i'm looking for valid use cases aside from theory and sample usage
Question, what are the chances of programmers teaming up with developers?
As in pro type apps, and websites.
there are very specific use cases for metaclasses
English please
Aka I want to create prototype apps and websites while learning SQL. But now I'm wondering if I could partner with developers to create apps in my local area
@sand musk hm not really sure if you would consider this meaningful but
I use metaclasses to bind class attributes
so like an attribute of an instance of a class will vary automatically with the value of an attribute of another instance of another class
I've also used them for some dependency injection stuff, but that was more for fun
i've used metaclasses like this: https://github.com/Den4200/arcade_ui/blob/master/arcade_ui/widgets.py#L7
i know discord.py uses metaclasses for cogs: https://github.com/Rapptz/discord.py/blob/master/discord/ext/commands/cog.py#L36
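One commonly cited use is automatic registration of subclasses. The sketch below is a generic illustration, not how either linked project does it; `PluginMeta`, `Plugin`, `Csv` and `Json` are hypothetical names.

```python
class PluginMeta(type):
    """Metaclass that records every concrete subclass by lowercase name."""

    registry = {}

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if bases:  # skip the base class itself, which has no parents listed
            mcls.registry[name.lower()] = cls
        return cls


class Plugin(metaclass=PluginMeta):
    pass


class Csv(Plugin):
    pass


class Json(Plugin):
    pass


available = sorted(PluginMeta.registry)
```

Since Python 3.6, `__init_subclass__` covers this particular case without a metaclass, so a metaclass tends to be worth it only when the class object itself needs custom behaviour.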
What is dependency injection?
@unkempt rock this is not the place to ask that. please see the channel topic. that could probably go in #career-advice or an off-topic channel
@boreal umbra classes ask for things they need instead of making them themselves
BASICALLY
Is that the point of *args, **kwargs or is that something else
not really...? it's more like a pattern
Ah ok
okay an example...say you have a recommendation engine.
for performance, you want to also have a cache
now, your reco engine class could create its own cache...or it could take one in its init method
creating its own dependencies vs having them injected in
(provided)
this has benefits
for example if you want to test what happens if the cache randomly drops values you could pass in a mocked object with that behaviour
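The recommendation engine example can be sketched like this. Everything here is hypothetical: `Recommender` takes any dict-like cache in its init, and `ForgetfulCache` is a test double that drops every write.

```python
class Recommender:
    """Takes its cache as an argument instead of creating one itself."""

    def __init__(self, cache):
        self.cache = cache
        self.computed = 0  # how many times the slow path actually ran

    def recommend(self, user_id):
        hit = self.cache.get(user_id)
        if hit is not None:
            return hit
        self.computed += 1
        result = ["item-%d" % (user_id + offset) for offset in (1, 2, 3)]
        self.cache[user_id] = result
        return result


class ForgetfulCache(dict):
    """Test double: a cache that silently drops every write."""

    def __setitem__(self, key, value):
        pass


normal = Recommender({})
normal.recommend(1)
normal.recommend(1)

flaky = Recommender(ForgetfulCache())
flaky.recommend(1)
flaky.recommend(1)
```

With a plain dict the second call is a cache hit; with the forgetful cache both calls recompute, and the engine still returns correct results.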
im trying to work with some like 7 year old code
written in python 2
whats the best way to do this in pycharm without fucking up my main python install
on ubuntu btw
@limpid ridge I think that question is better suited for #tools-and-devops
Has PEP 8 always mandated that there be two blank lines between functions? I thought it was two lines between module imports and class definitions, then two between class definitions and function definitions... but only one between function definitions otherwise.
It's one line between methods and two lines between each class/function definition, has been that way for quite a while if not from the start
music_path = Path(MUSIC_PATH)
artists = [
    {
        'path': d,
        'artist': d.stem,
        'albums': [
            {
                'path': a,
                'album': a.stem,
                'songs': [s for s in a.iterdir() if s.name.endswith('mp3')],
                'cover': a / 'cover.jpg'
            } for a in d.iterdir() if a.is_dir()
        ]
    } for d in music_path.iterdir() if d.is_dir()]
list comps ❤️
I had it redone with black
artists = [
    {
        "path": str(directory),
        "artist": directory.stem,
        "albums": [
            {
                "path": str(album),
                "album": album.stem,
                "songs": [
                    str(song) for song in album.iterdir() if song.name.endswith("mp3")
                ],
                "cover": str(album / "cover.jpg"),
            }
            for album in directory.iterdir()
            if album.is_dir()
        ],
    }
    for directory in music_path.iterdir()
    if directory.is_dir()
]
Hrmmm... I'm not sure how I feel about those line breaks between the for and the ifs. It's not even consistently applied. Notice it leaves str(song) for song in album.iterdir() if song.name.endswith("mp3") as one line but thinks much shorter statements should be on their own lines...
Overall I guess it does look better.
The reason is probably that the longer list comprehension is split out over multiple lines and that's when it opts to split all lines
The album.iterdir() list comprehension is only broken in the sense that the opening and closing bracket are on separate lines
Are recursive functions really that expensive in CPython that it's necessary to put a recursion limit?
I'm guessing recursion only gets expensive if your arguments for said function gets more and more complex
Not really familiar with the implementation of the call stack but I'm rather curious as to why I could have an object that nests another instance of the same class and be able to delegate calls just fine e.g.
class Nested:
    def __init__(self, n ,v):
        self.n = n
        self.v = v
    def execute(self):
        print(self.v)
        try:
            self.n.execute()
        except:
            pass
xs = [Nested(None, i) for i in range(5000)]
while len(xs) > 1:
    a = xs.pop()
    b = xs.pop()
    b.n = a
    xs.append(b)
xs[0].execute()
having some recursion limit is just a good solution to avoid infinite recursion, and the recursion limit can be changed at runtime
I guess that's true
```
>>> sys.setrecursionlimit(2**31-1)
```
okay I have two questions: why does that limit have to be a signed integer, and why is it even that large if it will segfault at some point kek
Do you guys use import turtle?
I have installed PyPy 3 using snap on Ubuntu 18.04
But I am unable to call it from the terminal
can someone explain to me how to do that?
You probably need to add the right path inside the snap to your PATH variable
Tbh installing command line tools using snaps sounds like a bad idea
You'll end up with a crazily long PATH, and I'm not even sure if the snaps are always mounted in the same directory
Using your package manager is preferred
I guess a more appropriate question would be this: Has anyone here used PyPy with PyCharm?
I did. It was as simple as selecting it to be the project's interpreter.
I have installed PyPy using snap and now I am trying to create a new virtualenv
but I get this weird error which makes no sense
can you show me how to you did it?
on pycharm
@subtle whale This channel is for in-depth discussion of the Python language, not for help. Please check out #❓｜how-to-get-help instead.
I thought those channels were for code discussion
it's okay to ask questions like these there?
You can ask for help in help channels, yes.
ok thank you
@brave badger Compared to other languages, the recursion limit on Python is pretty conservative. Each recursive call does get stored on the stack so it does take additional memory, but I believe the assumption is that if you have recursion 1000 calls deep that something is wrong. Thankfully in cases where it matters, we can increase the default recursion limit. As someone mentioned though, there is a point where you're just going to segfault.
Fair answer. Thanks.
np
So it seems to me that the new changes to pip might really encourage the use of venvs even more so.
Also, I'm not entirely sure how this would play out? How would this behavior differ with the pip upgrade?
"If you pip install x and then pip install y, it's possible that the version of y you get will be different than it would be if you had run pip install x y in a single command."
@cloud crypt I would guess because it is signed integer in C code, and adding the overhead of python integer addition into every call is probably a bad thing. Why it needs the sign idk.
can you even set a limit on the stack frame depth in other statically typed languages at runtime?
Does an open source project know when someone has copied their work and violated their license by selling the software or source?
or heck, even dynamic ones
@exotic mortar - also note that sys.setrecursionlimit really is a call stack limit, so if you recurse with A calls B which calls A, you will only recurse recursionlimit/2 times. And even if A calls A, if A has a decorator (like @lru_cache), then each call to A is actually 2 calls on the stack.
With some pyparsing parsers (especially those doing arithmetic parsing with many levels of operator precedence), I have to bump the default recursion limit to 3000 or 4000. Setting it to 2**31 is pretty much turning it off, leaving you open to a Python out-of-memory crash instead of a recoverable Python Recursion exception.
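The "a decorator doubles the frames" point can be measured. This sketch uses a pure-Python pass-through decorator rather than `lru_cache`, because C-implemented wrappers are counted differently across CPython versions; `passthrough`, `plain`, `wrapped` and `deepest` are made-up names.

```python
import functools


def passthrough(func):
    """A do-nothing decorator that still adds one Python frame per call."""
    @functools.wraps(func)
    def wrapper(depth, seen):
        return func(depth, seen)
    return wrapper


def plain(depth, seen):
    seen[0] = depth
    plain(depth + 1, seen)


@passthrough
def wrapped(depth, seen):
    seen[0] = depth
    wrapped(depth + 1, seen)


def deepest(func):
    """Recurse until RecursionError and report the deepest level reached."""
    seen = [0]
    try:
        func(0, seen)
    except RecursionError:
        pass
    return seen[0]


plain_depth = deepest(plain)
wrapped_depth = deepest(wrapped)
```

On a default interpreter the wrapped version typically gets about half as deep, since every logical call costs two frames against the same limit.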
@orchid karma i think the dependency resolution algorithm might find a different result for the latter case compared to the former. i honestly wonder why that is, but i know dependency resolution can be a hard problem algorithmically (beyond just topological sort).
Hi
How can I import variable from one function to another function without setting it as a global
Try #python-discussion or #❓｜how-to-get-help , this isn't really the focus of this channel
Ok sry
no worries
however, debating the merit of module-global imports definitely is
boxed#9332 made a good argument that from X import x should not put x into the top-level namespace of the current module, at least not by default
in fact i think that question is on topic here
because the solutions are neither trivial nor idiomatic and touch on a lot of python language design details
What namespace would x be behind if not the global one? (assuming imports at top)
At that point I'd just import X
@peak spoke
`a.py`
```python
val1 = 123
```
`b.py`
```python
from a import val1
val2 = 456
```
the argument is that you should not be able to write from b import val1
or at least, that you should not be able to write it without explicitly marking val1 for export from b
the idea being that these non-segregated imports clutter up module namespaces
one option would be to have __all__ apply to all imports and not just import *
hmm
so if you omit __all__ the behavior works as it does now, but if you include __all__ you create a whitelist for all importing, not just for import *
maybe @half wolf has other ideas since he's the one that brought it to my attention in the first place
Yea, what salt said. The problem is that there is just one global namespace that gets clobbered with imports, variables, functions and classes (which are really just variables but anyway). All top level imports become the public API of modules. This is not what anyone means in 99.9% of cases.
But changing this now would require a long deprecation period and would introduce performance regressions too (although pretty small).
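For reference, a small sketch of how things behave today: `__all__` only filters `import *`, while a named import can still reach anything in the module's namespace. The module names `demo_a` and `demo_b` and the helper `make_module` are invented here so the example runs without extra files; they stand in for `a.py` and `b.py`.

```python
import sys
import types

def make_module(name, source):
    """Build an in-memory module (a stand-in for a real .py file)."""
    module = types.ModuleType(name)
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module

make_module("demo_a", "val1 = 123")
make_module(
    "demo_b",
    "from demo_a import val1\n"
    "val2 = 456\n"
    "__all__ = ['val2']\n",
)

# A named import ignores __all__: val1 leaks out of demo_b.
from demo_b import val1

# A star import respects __all__: only val2 comes through.
star_namespace = {}
exec("from demo_b import *", star_namespace)
print("val1" in star_namespace, "val2" in star_namespace)
```

Under the proposal discussed above, the named import of `val1` from `demo_b` would be rejected as well, because `val1` is not listed in `__all__`.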
import random
import time
from tkinter import Tk, Button, DISABLED

def show_symbol(x, y):
    global first
    global previousx, previousy
    buttons[x, y]['text'] = button_symbols[x, y]
    buttons[x, y].update_idletasks()
    if first:
        previousx = x
        previousy = y
        first = False
    elif previousx != x or previousy != y:
        if buttons[previousx, previousy]['text'] != buttons[x, y]['text']:
            time.sleep(0.5)
            buttons[previousx, previousy]['text'] = ' '
            buttons[x, y]['text'] = ' '
        else:
            buttons[previousx, previousy]['command'] = DISABLED
            buttons[x, y]['command'] = DISABLED
        first = True

win = Tk()
win.title('Matchmaker')
win.resizable(width=False, height=False)
first = True
previousx = 0
previousy = 0
buttons = {}
button_symbols = {}
symbols = [u'\u2702', u'\u2705', u'\u2708', u'\u2709', u'\u270A', u'\u270B',
           u'\u270C', u'\u270F', u'\u2712', u'\u2714', u'\u2716', u'\u2728',
           u'\u2702', u'\u2705', u'\u2708', u'\u2709', u'\u270A', u'\u270B',
           u'\u270C', u'\u270F', u'\u2712', u'\u2714', u'\u2716', u'\u2728']
random.shuffle(symbols)
for x in range(6):
    for y in range(4):
        button = Button(command=lambda x=x, y=y: show_symbol(x, y), width=10, height=8)
        button.grid(column=x, row=y)
        buttons[x, y] = button
        button_symbols[x, y] = symbols.pop()
win.mainloop()
this is a really fun game i made
try it
@keen thicket that's great, but this isn't the best place to share your projects
ok
im actually not sure where we should share projects nowadays. you can ask in #community-meta
in cpython, strings are usually interned, right? e.g.
def f(x):
    s = 'abc'
    return s, x
f(1)
f(2)
f(3)
doesn't create a new string each time?
!e ```python
def f():
    s = 'abc'
    print(id(s))
f()
f()
f()
```
@paper echo :white_check_mark: Your eval job has completed with return code 0.
001 | 140459125360752
002 | 140459125360752
003 | 140459125360752
!e ```python
from dis import dis
def f():
    s = 'abc'
print(dis(f))
```
@paper echo :white_check_mark: Your eval job has completed with return code 0.
001 | 3 0 LOAD_CONST 1 ('abc')
002 | 2 STORE_FAST 0 (s)
003 | 4 LOAD_CONST 0 (None)
004 | 6 RETURN_VALUE
005 | None
!e ```python
from dis import dis
def f():
    s = 'abc'
    return s
t = 'def'
def g():
    return t
print(dis(f))
print(dis(g))
```
@paper echo :white_check_mark: Your eval job has completed with return code 0.
001 | 3 0 LOAD_CONST 1 ('abc')
002 | 2 STORE_FAST 0 (s)
003 |
004 | 4 4 LOAD_FAST 0 (s)
005 | 6 RETURN_VALUE
006 | None
007 | 7 0 LOAD_GLOBAL 0 (t)
008 | 2 RETURN_VALUE
009 | None
!e
if(item.type == 426){
int health = target.life; //current hp
int healthCap = (int)(target.lifeMax * 0.75f); //75% of max hp
if(health > healthCap){
damage *= 2;
}
You are not allowed to use that command here. Please use the #bot-commands channel instead.
dude did you think you were gonna be able to run Java with a python bot
I think the conclusion that was arrived at is that strings with no spaces get interned
IIRC strings that qualify as identifiers get interned every time
!e ```python
def f():
    s = 'this string definitely is not a valid identifier'
    print(id(s))
f()
f()
f()
```
@paper echo :white_check_mark: Your eval job has completed with return code 0.
001 | 140106400839808
002 | 140106400839808
003 | 140106400839808
I assume it's tied to the fact it's a literal
yeah
that's what im asking about though, string literals. i didnt clarify that above
makes sense
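A short sketch of why the literal case gives the same `id` every time, assuming CPython: the literal is stored once in the function's code object, and `LOAD_CONST` (seen in the `dis` output above) hands back that one object on every call. This is separate from interning, which `sys.intern` does explicitly. The variable names below are only for illustration.

```python
import sys

def f():
    s = 'this string definitely is not a valid identifier'
    return s

# The literal lives in the code object's constants tuple ...
print(f.__code__.co_consts)
# ... so every call returns the very same object.
print(f() is f())

# Strings built at run time are equal but need explicit interning
# before identity can be relied on.
parts = ['not an', 'identifier']
a = ' '.join(parts)
b = ' '.join(parts)
print(a == b)
print(sys.intern(a) is sys.intern(b))
```

Both identity checks are CPython implementation details; code should compare strings with `==` rather than `is`.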
does anybody have a peak-finding algorithm written in Python? (binary search)
like, the max value?
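One possible reading of the question, sketched under the assumption that a "peak" means any element that is not smaller than its neighbours (a local maximum, not necessarily the global one). The function name `find_peak` is made up here.

```python
def find_peak(values):
    """Return the index of one local peak using binary search.

    Runs in O(log n): at each step, move towards the side that is
    going uphill, because a peak must exist in that direction.
    """
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] < values[mid + 1]:
            lo = mid + 1  # right neighbour is higher, a peak lies to the right
        else:
            hi = mid      # mid is at least as high, a peak lies at mid or left
    return lo

print(find_peak([1, 3, 5, 4, 2]))
```

If the global maximum of an unsorted list is what is wanted, binary search does not apply and `max(values)` is the straightforward answer.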
Dang, AoT compilers/interpreters are pretty awesome
I think MyPyc https://github.com/python/mypy/tree/master/mypyc is an aot python interpreter
it's a python to C (for cpython extensions, not any kind of C) compiler, analogous to Cython
interesting, i wonder how it differs from nuitka or cython
I guess it relies heavily on strict typing to produce any sort of performant code
im surprised its only 4x faster
cython typically is a 2x speedup when ive tried it in 1:1 translation with python
ive seen 4x from pypy too which doesnt require typing
Most type annotations are enforced at runtime (raising TypeError on mismatch)
Classes are compiled into extension classes without `__dict__` (much, but not quite, like if they used `__slots__`)
Monkey patching doesn't work
Instance attributes won't fall back to class attributes if undefined
ahh, strict-mode python
Instance attributes won't fall back to class attributes if undefined
i always forget to annotate my class attributes as such
out of all the proposals to take away the GIL that bdfl has denied for making single-threaded applications slower, which proposal has the smallest slowdown for single-threaded applications, and what percent is that slowdown?
i was just wondering, because i read somewhere that guido won't allow taking away the GIL because it'll slow down single-threaded applications
if you can answer, please ping me with it
I'm 14 and don't want to work at a place like McDonald's or something, do you have any idea if any tech companies hire at this age?
#career-advice might have an answer for you.
Okok
are there any metrics to gauge the popularity of a pep?
no way a tech company would hire a 14 year old
guys i am a second year in college and looking for an internship..can you guys give me some advice because so far i have been rejected by 2 companies idk where i am going wrong
like im having trouble in getting an interview
I ain't going to college and living in debt for the rest of my life college is a scam
Just invest
And learn stuff about it
@celest spade
It's ez
lmao dude i already do stocks, i just want to land an internship, and I am blessed to have parents who pay for college
Dude
You should start a huskiness
Business
I'm gonna start one in 1 year
But yeah college is a waste of time
i want to start a business but i need to think of a proper idea lmao
You will realize college is a scam one day
Do clothing
Remember
SupplyxDemand
= $$$$
We got 75k in the bank
I feel bad for you
lmao why
Thinking college will get you somewhere
lmao im enjoying college and i am from an asian family so like they pay for my stuff and they want me to graduate
Most jobs don't even need college
Well I might dip high school
I already know what I'm doing in life
yeah who needs education
survivorship bias is very strong
this isn't the channel for this, probably #career-advice
there are people who study because they find it interesting, you know
also, yeah, off topic.
@unkempt rock Hop on back in here in 4 years and let us know how your decision turned out. Legitimately interested in finding out!
This is some real advanced discussion
I use metaclasses to bind class attributes
@gleaming rover thanks, but would you mind sharing a real example? I'm really trying to grasp when it is necessary and beneficial to use them in Python
@sand musk not at my work computer right now so
but honestly I think on most cases you can get by without
i've seen it used for singleton implementation
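A minimal sketch of the singleton use mentioned above. The names `Singleton` and `Config` are hypothetical, and this is only one of several ways to get the effect; a module-level instance is often simpler.

```python
class Singleton(type):
    """Metaclass that makes each class using it hand out one shared instance."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # type.__call__ is what normally runs __new__ and __init__;
        # intercepting it lets us reuse the first instance instead.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Config(metaclass=Singleton):
    def __init__(self, name="default"):
        self.name = name

first = Config("a")
second = Config("b")  # arguments are ignored, the existing instance is returned
print(first is second, second.name)
```

The metaclass is needed here only because the behaviour being changed is what happens when the class itself is called, which is the kind of case where a plain decorator or base class does not reach.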
Can anyone suggest a good source for probabilistic programming in the machine learning domain?
Hello
What do people think of the matrix suggestion from the mailing list? It seems to me that it wouldn't be overly useful. Possibly just ending up the same as array.array
has anyone here configured an Apache XAMPP server to host a Flask app?
this isn't the right channel for that kind of question really. Try #web-development or #tools-and-devops
Thanks
@slim island idk there seems to be a couple issues with it if you read the thread
If anything they might as well integrate numpy in
The main argument for it seems to be that it's a way to use matrices in Python with a friendlier interface for Python programmers than numpy - and I like that, and think it would be good to exist - but it really doesn't take you very far in terms of use cases, and very quickly you'll have to switch to numpy anyway. I think the argument that it makes python more friendly to high school students/teachers is a very good one though
I don't like that it essentially tries to fragment the matrix world in python
There's already an ugly stepchild of this fragmentation called the array module in python, which 99% of the time lies unused and ignored vs numpy array
If there's a link for this proposal somewhere though I'd greatly appreciate it
Just to see if my first impressions have any merit or not
In the #mailing-lists channel is a link
showerthought: why is it import this and not import self?
Honestly, daydreaming about this matrix thing, and I'd just love it if it had slightly more natural indexing than numpy. ~~Being able to do my_matrix[x:x+10][y:y+3] to get a 10x3 slice would be much more intuitive than the numpy way to me at least~~ Probably not the indexing I just did, but something better than numpy's current way which I always misunderstand
Hm, there's merit to that, agreed. I personally perhaps have just gotten used to the numpy way of doing it by now, but it was not an easy transition at first.
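A small sketch of why the chained form from the struck-out idea does not give a 10x3 block with ordinary sequence semantics, using plain nested lists so it runs without numpy. The names `grid`, `chained`, and `block` are only for illustration.

```python
# A 30x20 "matrix" as a list of rows; each cell encodes (row, column).
grid = [[row * 100 + col for col in range(20)] for row in range(30)]
x, y = 2, 5

# Chained slicing slices the rows twice: first rows 2..11,
# then rows 5..7 of that result. Columns are never touched.
chained = grid[x:x + 10][y:y + 3]
print(len(chained), len(chained[0]))

# Slicing rows, then columns inside each row, gives the 10x3 block.
block = [row[y:y + 3] for row in grid[x:x + 10]]
print(len(block), len(block[0]))
```

numpy expresses the second form in one step with a tuple index, `my_matrix[x:x+10, y:y+3]`, which is the part that differs from how the rest of Python indexes.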
I understand the angle that the mailing list post comes from, primarily trying to ease the burden of simple matrix operations for students
The only question is, does that merit creating a stdlib implementation for matrices or not, and I'm not sure there. I'd still lean against
my understanding is that numpy is more intuitive to people coming from R/matlab/other-mathsy backgrounds. And coming from a Python softwarey background, I find it very frustrating to work with - stuff like getting used to new ways of indexing, not using loops anywhere, having to learn a million methods
Haha, can't disagree. I essentially picked numpy without any background (wouldn't call myself even proficient in python in that sense) and it definitely was different
The trouble is, not using loops and recommendations around that side of things are essentially a necessary evil for performance. If we are not using 2d lists and having a dedicated datatype for matrices, I would presume that part won't change
For the indexing aspect though, it's very different from how rest of Python does it. That's always been probably the most jarring thing about it.
yeah, Julia is pretty much the only language I know where loops can still be fast on matrices and stuff
Technically indexing in python is way faster than resolving a name, since you just use a memory offset instead of going through a hash map


