#algos-and-data-structs
1 messages · Page 48 of 1
it's not t_{i-1} 
it's t_i - 1
the loop header runs once more than the body yes
aren’t both same?
i wrote it more of TeX way
ahh mb
yes it’s t_{i} -1
{} groups, yeah
@haughty mountain similarly reasoning goes for why # line 1 is executed n times?
2 to n there exactly n-1 numbers
that's not how loops are implemented
not that I would usually care about this tiny difference
same exact thing as the while
^
I didn’t catch it
That’s not ?
think of for loops as just while loops in disguise
you still have the failing check before exiting the loop
so really when i goes from 0 to n we still check the edge case when i=n to terminate from the loop
what do you mean by that’s not how it’s implemented?
I didn’t really catch what you mean by this.
I meant that for loops work much like while loops, it will have a failing check before it exits
so it runs one time more than the body
when do we analysis of algorithms we assume each primitive operations take constant time right?
not that it takes equal amount of time
not sure if i am in the right place, but np.random.seed() what does it matter which number i put in the function
yes
theoretically, you pick some "model of computation" where certain operations take O(1) time
sometimes you even define more than one kind of primitive operation, so that you can calculate the complexity in terms of both and be able to make tradeoffs (e.g. computation vs memory access).
If you put the same seeds into it, you'll get the same numbers back. E.g.
np.random.seed(42)
f = np.random.random()
g = np.random.random()
h = np.random.random()
```The values of `f, g, h` will be the same every time you run this
If you don't specify a seed, then np randomizes it each time you run iirc
Also, newer np recommends using [Generators](https://numpy.org/doc/stable/reference/random/generator.html) instead of these
so it doesnt matter if i put 1,32,42 or 5100?
I am trying compute a way to heads and tails "program"
i don't see why you need to manually seed it at all, if so
i just dont know why the seeds matters if it even does
can i just write np.random.seed()
For practical purposes, I don't think so... anything you put in there will give you a sequence that's relatively random (unless the rng algo is bad)
I'm pretty sure you can just not have that line and numpy will randomise it on every run
if you don't see what seeding are for, probably you don't need it at all.
usually people use it in examples, to make sure that despite using randomly generated data the results are exactly reproducable.
what do i need to do to get it runnng
print("Result of the coin toss:", toss_of_coins())?
oh nice it worked
anyways thanks all
In case this is still relevant, how I would solve it in O(n) time:
||Initialize a list of 0s of length n. Let's call it b.
Loop over the input sequence a: for every number in a, in order, we get the i-th person and person in front of them x. So we go to the x-th element in b and change it to i. A nice way of getting both i and x in your loop is with the enumerate function, but you don't have to.
When you encounter -1 in the sequence, save the appropriate i in a separate variable.
Once you're done with the loop, you have the order: Start with the first person in line (you saved their position in the separate variable). Go to the appropriate position in b. Now b tells you the next person in line. Keep moving according to what b tells you until you reach 0, and you're done.
I essentially inverted the sequence: instead of a list that tells me who's in front of me, I made a list from it that tells me who's behind me, which is what I needed.||
def coin_toss():
answer = r.randint(0,2)
return 'Head' if answer == 0 else 'Tails'
print('resultat of coin toss',coin_toss())
does anyone know how i make this a loop, so i can print more than just one value of this
Right now, it only prints one value at a time, what if i would like it to do i 1000 times
and write it as a list of [HHH,TT,HHHHH,T] etc
import numpy.random as nr
def coin_toss(n: int) -> list[str]:
random_list: list = []
for _ in range(n):
answer = nr.randint(0, 2)
word = 'Head' if answer == 0 else 'Tails'
random_list.append(word)
return random_list
if __name__ == '__main__':
n = 1000
result = coin_toss(n)
print(f"Result: {result}")
is this a bot replying?
No, I'm a real human 😆
oh ok just seems a bit robotic with no explanation
but thank you
can i ask question to it?
(n: int) -> list[str]: what is this?
Hi, I have a question about ADT in Python. List and dict are both adt (first one implemented by ArrayList and the second one by HashMap). But what about tuple and set, what are they and how are they implemented? Python doesn’t have algebraic data types, so python’s tuple can’t be that one.
type annotations, they doesn’t change anything, its just a hinting system to support IDE (so it can suggest you a proper methods for string or list)
List and dict are not ADTs (abstract data types). Those are very concrete implementations of list and dictionary ADT in python. ADTs are "abstract" for a reason. You use them in your reasoning, not in your programs.
That said, tuple is not even a data structure. It's simply multiple values bundled together. Think that it is implemented the same way list is, just without allowing modification. Set is implemented using hash table, the same way dict is.
Algebraic data types (also ADTs 😩) are types produced with addition, multiplication and exponentiation operators on atoms or other basic types, and are unrelated to ADTs.
Hi, I'm looking for advice for a Segment Tree question in the latest codeforces competition (https://codeforces.com/contest/1927/problem/D). My code is that one:
if __name__ == "__main__":
n_tests: int = get_int()
for _ in range(n_tests):
n: int = get_int()
a: list[int] = line_integers()
q: int = get_int()
l: list[int] = []
r: list[int] = []
for _ in range(q):
l_, r_ = line_integers()
l.append(l_-1)
r.append(r_-1)
segment_tree_max = SegmentTreeMax(a)
segment_tree_min = SegmentTreeMin(a)
for i in range(q):
max_, max_i = segment_tree_max.get_max(l[i], r[i])
min_, min_i = segment_tree_min.get_min(l[i], r[i])
if max_ == min_:
print("-1 -1")
else:
print(min_i + 1, max_i + 1)
print("")
The implementations of SegmentTreeMax and SegmentTreeMin (https://pastebin.com/GsP1Ga4Z for the SegmentTreeMin I use) are the "standard recursive ones". And it lasts more than 5 seconds on codeforces servers.
However the same code in c++ (https://pastebin.com/EYEd66ii) only runs within 300 ms.
The only difference between the two codes is that I affected the size of the array representing the segment tree using 4* the size of the initial array in c++ and I was "more precise" in the python code by looking the smallest power of 2 bigger or equal than the size of the array and initialized the size of the array representing the segment tree using 2*power-1. I don't think this would take more than 4seconds.
I think I might be something notably wrong for there to be such a difference. I tried PyPy and I had the same Time problem. Did I make something particularly different that may explain that difference?
You can find my actual submissions here (https://codeforces.com/contest/1927/submission/245199915 PyPy) and here (https://codeforces.com/contest/1927/submission/245200532 C++)
honestly it possible that it's just python being slow...
if c++ solution runs 300ms it's very very possible python is just >20 times slower
Usually when that's the case, pypy is okay ish
but i'm still a newbie in CP so idk
so maybe there's just no way to solve the problem with Python?
I not expert on pypy, but sometimes pypy is just unable to optimize code well and even ends up slower than original python. But I will need to investigate more here to be sure about that
You can try rewriting the segment tree implementation to a bottom-up one. Those are usually significantly faster.
Yeah that's the problem
the only approach I know is that one
If I remember correctly pajenegod said he only used an iterative one
yeah, and that's basically why
mmh
I'll look into it
Also, I'm having two segment trees, one to get the max, one to get the min
basically the exercise asks if there are two different indices in a given range
so I use two segment trees
could I use "only" one?
It's not too difficult to understand btw, I would say if you only query it's actually way easier than top-down
I feel like I could (by storing tuples of size 4 (max, max_idx, min, min_idx) but I'm not sure it would speed up anything
(I also only use bottom up even though I usually write in rust and performance is not an issue)
It may, or it may not. It depends a lot. If it was a compiled language I would confidently say that it's unlikely the two trees are going to be any slower than just one, but in python it may as well depend on some other factors. I still bet the one tree instead of two will be faster.
Because I start at the top when I do my queries? and then I explore the binary tree by looking at the children?
you use top-down yeah
redefining a helper function on every call is probably detrimental to performance
now that I think about... of course it will be faster 🤔
oh yeah I should make a static method
outside
@outer rain you know what, bottom-up actually makes sense
i almost like it better
well i only did the construction part right now but it's I guess quite intuitive
I was talking about lists and dicts in general, they are common abstract data types implented in python by ArrayList and HashMap data structure. I know tuple is not a data structure, I’m asking what it is in general in terms in Python. I guess python’s tuple is implemented by ArrayList DS also, but bow should I call it as a concept? Tuple abstract data type? If so, it should gabe defines common and abstract operations and I couldn’t find any
yeah to me bottom up is also nicer. And it's even shorter, lol
I actually got confused with the indices
😬
class NumArray:
def __init__(self, nums: List[int]):
# Let's build the tree
self.nums = nums
# find smallest power of 2 >= len(nums)
n = len(nums)
power = 1
while power < n:
power *= 2
self.size = 2 * power - 1
self.segment_tree: list[int] = [0 for _ in range(2 * power - 1)]
# Populate the leaves of the segment tree
for i in range(self.n):
self.segment_tree[self.size // 2 + 1 + i] = nums[i]
# Build the tree by calculating parents' sum
for i in range(self.size // 2, -1, -1):
self.segment_tree[i] = self.segment_tree[2 * i] + self.segment_tree[2 * i + 1]
does that look correct for the implementation?
of the building part
Just call it a tuple, whatever. Noone says ArrayList either, that's just a java name for vector. Tuple is not an abstract data type. Not everything has to be. Like, you wouldn't call a float-float pair a "2D Point abstract data type". That's just a pair.
looks fine, probably...
I would use 1-indexing btw, those -1 and +1 everywhere are a bit confusing...
!e
import numpy as np
class NumArray:
def __init__(self, nums):
# Let's build the tree
self.nums = nums
# find smallest power of 2 >= len(nums)
n = len(nums)
power = 1
while power < n:
power *= 2
self.size = 2 * power - 1
self.segment_tree = [0 for _ in range(2 * power - 1)]
# Populate the leaves of the segment tree
for i in range(n):
self.segment_tree[self.size // 2 + i] = nums[i]
# Build the tree by calculating parents' sum
for i in range(self.size // 2 - 1, -1, -1):
self.segment_tree[i] = self.segment_tree[2 * i + 1] + self.segment_tree[2 * i + 2]
nums = (np.random.random((7,)) * 100).astype(int).tolist()
a = NumArray(nums)
print(nums)
print(a.segment_tree)
@outer rain :x: Your 3.12 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 23, in <module>
003 | a = NumArray(nums)
004 | ^^^^^^^^^^^^^^
005 | File "/home/main.py", line 20, in __init__
006 | self.segment_tree[i] = self.segment_tree[2 * i] + self.segment_tree[2 * i + 1]
007 | ~~~~~~~~~~~~~~~~~^^^^^^^^^^^
008 | IndexError: list index out of range
yeah
doesn't work
I think the padding is not put at the right place
somehow
So apparently you need to do the whole power thing
for a bottom-up approach
yeah
you just make an array of size 2n, with the n last elements the leaves
and the n first elements the internal node
the first element being not used if you use 1-indexing
n - 1
i don't get it, how can you need like twice as less space
in the top down it's like 4n
or 2*m-1 with m the smallest power of 2 greater than n
!e
import numpy as np
class NumArray:
def __init__(self, nums):
# Let's build the tree
self.nums = nums
# find smallest power of 2 >= len(nums)
n = len(nums)
power = 1
while power < n:
power *= 2
self.size = 2 * power - 1
self.segment_tree = [0 for _ in range(2 * power - 1)]
# Populate the leaves of the segment tree
for i in range(n):
self.segment_tree[self.size // 2 + i] = nums[i]
# Build the tree by calculating parents' sum
for i in range(self.size // 2 - 1, -1, -1):
self.segment_tree[i] = self.segment_tree[2 * i + 1] + self.segment_tree[2 * i + 2]
nums = (np.random.random((7,)) * 7 + 1).astype(int).tolist()
a = NumArray(nums)
print(nums)
print(a.segment_tree)
@outer rain :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | [4, 2, 7, 5, 3, 7, 4]
002 | [32, 18, 14, 6, 12, 10, 4, 4, 2, 7, 5, 3, 7, 4, 0]
you can get away with less space (exactly 2n - 1) in top down, but in bottom up you need to round up to the next power of 2, so you can end up with 4n space
yes
unless you build bottom-up, but query like in top-down
ok fair i'm gonna stick to the basics first
!e
import numpy as np
class NumArray:
def __init__(self, nums):
# Let's build the tree
self.nums = nums
# find smallest power of 2 >= len(nums)
n = len(nums)
power = 1
while power < n:
power *= 2
self.size = 2 * power - 1
self.segment_tree = [0 for _ in range(2 * power - 1)]
# Populate the leaves of the segment tree
for i in range(n):
self.segment_tree[self.size // 2 + i] = nums[i]
# Build the tree by calculating parents' sum
for i in range(self.size // 2 - 1, -1, -1):
self.segment_tree[i] = self.segment_tree[2 * i + 1] + self.segment_tree[2 * i + 2]
nums = [-1, 3, 4, 0, 2, 1]
a = NumArray(nums)
print(nums)
print(a.segment_tree)
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | [-1, 3, 4, 0, 2, 1]
002 | [9, 6, 3, 2, 4, 3, 0, -1, 3, 4, 0, 2, 1, 0, 0]
no, I left it your way
okok
I like 1-index, but 0 is also an option ofc
yeah yeah fair
!e
import numpy as np
class NumArray:
def __init__(self, nums):
# Let's build the tree
self.nums = nums
# find smallest power of 2 >= len(nums)
n = len(nums)
power = 1
while power < n:
power *= 2
self.size = 2 * power - 1
self.segment_tree = [0 for _ in range(2 * power - 1)]
# Populate the leaves of the segment tree
for i in range(n):
self.segment_tree[self.size // 2 + i] = nums[i]
# Build the tree by calculating parents' sum
for i in range(self.size // 2 - 1, -1, -1):
self.segment_tree[i] = self.segment_tree[2 * i + 1] + self.segment_tree[2 * i + 2]
nums = [-1, 3, 4, 0, 2, 1]
a = NumArray(nums)
print(nums)
print(a.segment_tree)
print(list(range(15)))
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | [-1, 3, 4, 0, 2, 1]
002 | [9, 6, 3, 2, 4, 3, 0, -1, 3, 4, 0, 2, 1, 0, 0]
003 | [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
But it’s not just a pair of numbers, it has its own methods
ArrayList comes from Java, but its common name for dynamic array DS
@stray onyxThe key thing to understand is that python not java - the python list is not an ADT, list is a concrete implementation of the MutableSequence ADT. Tuple would be a Sequence, a set a MutableSet, and a dict a MutableMapping.
@outer rain I FINALLY managed to do it
and on a random leetcode for sum query bottom up is actually almost twice as fast
so that might help my initial codeforces problem
Alternatively known as the iterative segment tree, or the zkw segment tree (named after the guy who supposedly made it popular), maybe looking those terms up will give you more results
damn
every day there's a new concept
brain is gonna grow too big
I guess I also need to learn how to update in log(n) for both approaches even though that's not helpful for my initial question
Also it doesn't look like you need to update, so here a sparse table can be used instead
Btw python recursion in general is just abysmally slow (for CP purposes anyway), so you'll have to try hard avoiding it (e.g. you've just experienced it thru segtrees)
Aren’t queues and stacks even worse?
I’ve heard that for dfs the append method was that long that recursion was better
Since we use lists for stacks
Although we don’t need a stack for segtrees so yeah for sure
Uh... I don't think so? Where did you get that?
If you mean implementation difficulty, then sure
I think it was for a longest path algorithm dfs
But I had to keep, in the stack, different versions of the set of visited nodes
And with the recursion I could only have one hashmap
So it was for another reason
That recursion had an advantage
if it is slow - you have skill issue
function call in python takes approximately the same amount of time as one or two other operations (like list[i] or list.append(x)), so there is no reason to implement calls stack by hand, because it will be slower than a native python call
Even recursion? At least in my experience python solutions that use recursion often are much slower than if you just did it iteratively
I remember reading about python not doing tail call elimination, so the recursion is comparatively slower
yes, it is just a function call, without unnecessary features
So still slower?
I mean the context is competitive programming, and recursive dp's or data structures like segtrees still time out in a lot of cases I feel
again, if you do it correctly, it might be faster than iterative approach where you implemented call stack by hand
hello! Could someone help me in coding? It's just basic stuff
dont ask to ask
Interesting... I guess I just haven't poked around enough to see this happen
Just ask what your question is, then someone scrolling by can directly answer instead of asking you what the actual question is
okay, this is basically the question:
Develop a program that reads a four-digit integer from the user and displays the sum of
the digits in the number.
Hint: Use a string method to retrieve the individual digits of the input integer.
[ INT]: 4-digit integer : 3141
[ OUT]: Sum: 9
[ IN]: 4-digit integer : 9999
[ OUT]: Sum: 36
Well the idea is, since you don't actually care about the value of the integer input, but what each digit is, don't read it as an int, keep it as a str
@narrow mica i implemented DFS using iterative and recursive approaches: ```py
import random
(to get reproducible results)
random.seed(0)
type of nodes
T = int
number of nodes
n = 500
make random graph
G = {i: random.choices(range(n), k=n // 2) for i in range(n)}
where we start doing stuff
root = 0
approximation of number of iterations, doesnt have to be precise
graph_size = sum(map(len, G.values()))
def dfs_iterative(G: dict[T, list[T]], node: T) -> None:
visited = {node}
stack = [node]
while stack:
node = stack.pop()
# do stuff:
# print(node, end=' ')
for node in G[node]:
if node not in visited:
visited.add(node)
stack.append(node)
def _dfs_recursive(G: dict[T, list[T]], node: T, _visited: set[T]) -> None:
_visited.add(node)
# do stuff:
# print(node, end=' ')
for neighbour in G[node]:
if neighbour not in _visited:
_dfs_recursive(G, neighbour, _visited)
def dfs_recursive(G: dict[T, list[T]], node: T) -> None:
_dfs_recursive(G, node, {node})
import timeit
N = 1_000
warm the bytecode before measurements:
timeit.timeit('dfs(G, root)', number=100, globals=dict(dfs=dfs_iterative, G=G, root=root))
t1 = timeit.timeit('dfs(G, root)', number=N, globals=dict(dfs=dfs_iterative, G=G, root=root)) / N / graph_size
timeit.timeit('dfs(G, root)', number=100, globals=dict(dfs=dfs_recursive, G=G, root=root))
t2 = timeit.timeit('dfs(G, root)', number=N, globals=dict(dfs=dfs_recursive, G=G, root=root)) / N / graph_size
print(f'iterative: {t1 * 109:.4} ns per node')
print(f'recursive: {t2 * 109:.4} ns per node')
results on my machine on CPython3.12: ||py
iterative: 33.16 ns per node
recursive: 33.38 ns per node
iterative: 33.09 ns per node
recursive: 33.42 ns per node
so there is literally no difference||
What about bigger n?
And also a "bamboo" (edges u->u+1)
Bamboo on n=15_000:
iterative: 380.6 ns per node
recursive: 1.052e+03 ns per node
well the segment tree does end up being twice as fast
iteratively
but that may be due to the bottom-up being better than top-down
although i'm not sure why mathematically bottom-up would take less paths in the binary tree
so only explanation here is that recursion is worse because it creates a stack and in the bottom-up approach you don't even need a stack
cant reproduce: ```py
iterative: 177.1 ns per node
recursive: 216.3 ns per node
my bamboo setup: ```py
import sys
T = int
n = 15_000
sys.setrecursionlimit(n + 10)
G = {i: [i + 1] for i in range(n)}
G[n] = []
root = 0
graph_size = n
``` didnt touch any other code
How did you fix the stackoverflow?
I get such results with this code:
def main():
# (to get reproducible results)
random.seed(0)
# type of nodes
T = int
n = 15_000
G = {i: [i + 1] for i in range(n)}
G[n] = []
root = 0
graph_size = n
.....
sys.setrecursionlimit(15_000 + 10)
import threading
threading.stack_size(2**26)
threading.Thread(target=main).start()
first answer on SO that i found to increase stack size
Giving
Process finished with exit code -1073741571 (0xC00000FD)
for me
buy more ram lol idk, probably your pythons issue
you are still doing it in your code: sys.setrecursionlimit(15_000 + 10)
yes
what line crashes the process?
That's the whole output
Though it looks like something with pycharm
Through terminal works fine
And produces output like yours
Funny
iterative: 425.6 ns per node
recursive: 541.3 ns per node
with threading: ```py
iterative: 191.2 ns per node
recursive: 233.6 ns per node
still cant reproduce
ig pycharm attached some sort of stupid debugger which is the reason your code runs so slowly
Actually it was not pycharm, but the python version
# py -3.10 .\main.py
iterative: 392.4 ns per node
recursive: 1.009e+03 ns per node
# py -3.11 .\main.py
iterative: 406.1 ns per node
recursive: 559.9 ns per node
And on 3.11 it works fine without threading, but on 3.10 exits without output at all (with code -1073741571)
are you on linux?
vitness uses windows based on the command, and default stack size on windows is a lot lower than on linux
windows10
64 bits
cpython3.12
aktually for CP everybody uses PyPy if possible, so all these results doesnt matter
pypy simply crashes, i cant fix it
I guess this https://doc.pypy.org/en/latest/stackless.html#recursion-depth-limit
Though i can't get it to work still
for pypy3.10, bamboo with n=5000: ```py
iterative: 51.09 ns per node
recursive: 203.5 ns per node
for random graph with n=500: ```py
iterative: 7.985 ns per node
recursive: 9.55 ns per node
ah, i understand
the reason why i am getting weird results for random graph is that it is too dense, so recursive approach is kinda bambooing, walking through whole graph pretty early
interesting ```py
n = 5000
make binary tree
G = {i: [2 * i + 1, 2 * i + 2] for i in range(n)}
G = {i: [c for c in c if c in G] for i, c in G.items()}
results: ```py
# cpython3.12 n=5000 tree
iterative: 204.9 ns per node
recursive: 179.1 ns per node
``` in this care recursive is faster by 10%
but on pypy iterative is still faster: ```py
# pypy3.10 n=5000 tree
iterative: 73.25 ns per node
recursive: 130.9 ns per node
graph: bamboo (50000 nodes) node_type: <class 'int'>
number of iterations: 100 (100 warmup iterations)
system: Windows-10-10.0.19045-SP0 AMD64 Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
python: CPython 3.12.0
total time for benchmark: 4.566 s
dfs_iterative: 202.28 ns per node
dfs_recursive: 238.39 ns per node
##################################################
graph: bamboo (500 nodes) node_type: <class 'int'>
number of iterations: 10000 (200 warmup iterations)
system: Windows-10-10.0.19045-SP0 AMD64 Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
python: CPython 3.12.0
total time for benchmark: 1.782 s
dfs_iterative: 161.49 ns per node
dfs_recursive: 186.81 ns per node
##################################################
graph: bamboo (500 nodes) node_type: <class 'int'>
number of iterations: 10000 (200 warmup iterations)
system: Windows-10-10.0.19045-SP0 AMD64 Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
python: PyPy 3.10.12
total time for benchmark: 0.7306 s
dfs_iterative: 35.495 ns per node
dfs_recursive: 104.37 ns per node
##################################################
graph: tree (50000 nodes) node_type: <class 'int'>
number of iterations: 100 (100 warmup iterations)
system: Windows-10-10.0.19045-SP0 AMD64 Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
python: CPython 3.12.0
total time for benchmark: 3.778 s
dfs_iterative: 196.05 ns per node
dfs_recursive: 180.4 ns per node
##################################################
graph: tree (50000 nodes) node_type: <class 'int'>
number of iterations: 100 (100 warmup iterations)
system: Windows-10-10.0.19045-SP0 AMD64 Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
python: PyPy 3.10.12
total time for benchmark: 3.277 s
dfs_iterative: 122.23 ns per node
dfs_recursive: 196.63 ns per node
n = 50000
niter = 100
T = int
root = 0
# uncomment necessary section:
g_type = 'tree'
G = {i: [2 * i + 1, 2 * i + 2] for i in range(n)}
G = {i: [c for c in c if c in G] for i, c in G.items()}
# import sys
# sys.setrecursionlimit(n + 10)
# g_type = 'bamboo'
# G = {i: [i + 1] for i in range(n)}
# G[n] = []
# import random
# random.seed(0)
# g_type = 'random_mess'
# G = {i: random.choices(range(n), k=n // 2) for i in range(n)}
def dfs_iterative(G: dict[T, list[T]], node: T) -> None:
visited = {node}
stack = [node]
while stack:
node = stack.pop()
# do stuff...
for node in G[node]:
if node not in visited:
visited.add(node)
stack.append(node)
def _dfs_recursive(G: dict[T, list[T]], node: T, _visited: set[T]) -> None:
_visited.add(node)
# do stuff...
for neighbour in G[node]:
if neighbour not in _visited:
_dfs_recursive(G, neighbour, _visited)
def dfs_recursive(G: dict[T, list[T]], node: T) -> None:
_dfs_recursive(G, node, {node})
import timeit
import platform
import time
niter_warmup = min(200, niter)
_t1 = time.perf_counter()
res = []
for f in (
dfs_iterative,
dfs_recursive,
):
# give it a chance to optimize/jit the code:
timeit.timeit('dfs(G, root)', number=niter_warmup, globals=dict(dfs=f, G=G, root=root))
t = timeit.timeit('dfs(G, root)', number=niter, globals=dict(dfs=f, G=G, root=root)) / niter / len(G)
res.append((f, t))
_t2 = time.perf_counter()
print(f'graph: {g_type} ({n} nodes) node_type: {T}')
print(f'number of iterations: {niter} ({niter_warmup} warmup iterations)')
print(f'system: {platform.platform()} {platform.machine()} {platform.processor()}')
print(f'python: {platform.python_implementation()} {platform.python_version()} ')
print(f'total time for benchmark: {_t2 - _t1:.4} s')
print()
for f, t in res:
print(f'{f.__name__}: {t * 10**9:.5} ns per node')
``` barely fits into discord message li
i'll just leave it here
As I learned recently, python 3.11/12 optimized recursion by having an explicit call stack for python code calling python code, instead of the previous using C call stack for everything
It's supposed to be +50-60% on recursive microbenchmarks
no, it is other way around
previously python frames were always allocated on the heap as python objects
now frame info is located at the C stack, and frame objects are created only when requested
hi everyone, i'd like help with a graph theory problem.
problem description: https://dmoj.ca/problem/ccc17s4
editorial: https://dmoj.ca/problem/ccc17s4/editorial
reference solution: https://github.com/sriharivishnu/My-CCC-Solutions/blob/master/2017/ccc2017s4.cpp
i learned disjoint-set structures and kruskal's algorithm for this problem, and applied it textbook to obtain the first 11 points. for a few days i've been trying to understand the full solution, to no avail. imho the editorial is a little poorly written with ambiguous wording all around, and i can't crack the reference solution either.
can y'all help me out and explain to me how considering D was efficiently implemented? and why does this algorithm guarantee the final minimum spanning tree (or rather, the number of days to acheive this MST)? thank you very much!
oh and, if the extra info helps, here's my naive solution https://paste.pythondiscord.com/V5PQ
Hi. I think it belongs here, please correct me if I should post it in a different channel
I'm having trouble with calculating distance from accelerometer values I'm reading constantly - yes I know it sucks, and drifts over time, but I'm getting no good values at any point in time. The sensor itself is not the problem though, I'm having a trouble in calculating something and I just don't know what yet
t=0.01
while True:
a1 = acc_data
v1 = v0+a1*t
meanv = (v1+v0)/2
pos1 = pos0 + meanv * t
print(f"{pos1}")
time.sleep(t)
v0 = v1
pos0 = pos1
all values except constantly changing acc_data and t (time) start as 0
pos0!=poz0
forgive my typo, its pos0, renamed from poz to make more sense in english, as in position
I'd suggest starting with writing a test case.. something you can use to verify that your algorithm is correct.
Also, you seem to be assuming that acceleration is constant during t?
Yeah, pretty much. I understand the bigger value of time is, the more drift it will give me over time but I think checking something 100 times a second is enough
you probably would want to use kalman filter to get rid of noise
You sleep for t seconds, but do you take into account the additional time spent reading the sensor and performing the calculations?
Not really, should it be that significant? But that is a good tip, thanks
I don't know, it might be. Just a suggestion.
pos = 0.0
vel = 0.0
last_t = time.time()
while True:
t = time.time()
dt = t - last_t
acc = get_acc()
vel += acc * dt
pos += vel * dt
print(pos)
time.sleep(0.01)
last_t = t
Try it out and see if it makes a difference.
Dead reckoning is always going to be fairly inaccurate.
can someone explain how merge sort worst case takes more comparisons than merge sort best case?
Wouldn't they all take the same amount regardless?
time difference between loops is more or less like this:
0.015595
0.015461
0.015695
0.016380
0.016874
0.016377
Pretty okay for a loop that registers values of 3-axis gyro and accelerometer, subtracting its offset and constantly measuring current angle on a raspberry pi 4
still an error is an error, basically 50% of time drifting
The merge fucntion can do anywhere from n/2 to n - 1 comparisons. Consider merge([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]) vs merge([1, 3, 5, 7, 9], [2, 4, 6, 8, 10]). In first case it can short circuit after 5 pushes, because it exhausted first array, but in second it won't happen.
But for the final result, wouldn't it have to compare every pair?
Like it wouldn't know to stop comparing if the lists are sorted already, right?
Hello. If I understand correctly, you have a tuple of tuples of objects, and you would like to group those objects by the value of some attribute?
Is the attribute orderable? E.g. a string, number, etc.
Oh ok. In that case I would flatten the tuple of tuples, sort the objects by the value of the attribute, then group on the value of the attribute.
Let me demonstrate. One second.
!e ```py
from operator import itemgetter
from itertools import groupby
data = (
({'value': 1}, {'value': 2}, {'value': 3}),
({'value': 1}, {'value': 3}, {'value': 6}),
({'value': 2}, {'value': 3}, {'value': 5})
)
flattened = [
obj
for tup in data
for obj in tup
]
sorted_by_value = sorted(flattened, key=itemgetter('value'))
grouped = [
tuple(group)
for key, group in groupby(
sorted_by_value,
key=itemgetter('value'),
)
]
for group in grouped:
print(group)
@keen hearth :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | ({'value': 1}, {'value': 1})
002 | ({'value': 2}, {'value': 2})
003 | ({'value': 3}, {'value': 3}, {'value': 3})
004 | ({'value': 5},)
005 | ({'value': 6},)
You'd use operator.attrgetter rather than operator.itemgetter if they're objects rather than dicts.
No worries!
Hello i have a problem i want to take a specified line for a .txt file and another one later how can i do that i'm new and i search but didn't found
find
Hello. I suggest making a post in #1035199133436354600 as this is more of a general python question than an algorithms question.
Make sure to include details, and ideally a small example, of what you would like to achieve.
I already did it
hi guys quick question
how would you represent tree structures in Python?
I know of course that it depends of the concrete application
and that you might for example code a class Node with all attributes relevant to your problem
but I'm looking for a general framework to base my modelling on
for eg: would you use a queue to store the nodes ?
that type of questions
well, for a general tree something like
@dataclass
class Node:
# data it stores, etc
children: list[Node]
Ok, the class object for the nodes seem to be widely used, then how to store each node and conduct the exploration ?
oh, could you elaborate on that:
children: list[Node]
It's a list of nodes - the children of this node. Not sure exactly what you're asking - if you mean the : list[Node], that's just a typehint for this instance variable.
My second question is how to explore the tree structure ?
Base framework, store items in a queue and extract each item ?
Not sure what you mean by "explore". You can visit all nodes by recursively looking at a node's children - depending on how exactly you do it you will get the nodes is various orders (e.g. you can do it depthfirst or breadth-first by storing the currently-being-handled ones in a queue)
simplest way, say:
def visit(node: Node):
# do something to it
for child in node.children:
visit(child)
yeah that is what i'm looking for, a visit framework depending on the way I explore the tree
that depth-first snippet will give you... I believe that's called pre-order traversal.
Whereas if you wanted a BFS, you'd do it like:
def visit_bfs(root: Node):
cur_level = [root]
while cur_level:
next_level = []
for node in cur_level:
# do something to it
next_level.extend(node.children)
cur_level = next_level
yep, thank you for the insights, I'll dig into it, if you have any interesting website to recommend me I'd happily take it
who know how to decompile a pyc file
:incoming_envelope: :ok_hand: applied timeout to @karmic sierra until <t:1707504593:f> (10 minutes) (reason: duplicates spam - sent 4 duplicate messages).
The <@&831776746206265384> have been alerted for review.
@halcyon plank I am starting backend developer carrier
where to start and anyone provide resources to learn
Better question for #career-advice
I have this 'map' (see img).
x = wall or obstacle
'# = starting location (1,1)
. = empty tile
@ = coin
I am trying to build an agent that collects (as efficiently as possible) all coins. The agent only has limited vision (all adjacent tiles). The agent can only move left, right, up and down. Does anyone have a good suggestion for an algo or approach to solving this problem in python?
as efficiently as possible
limited information
now define "efficiently"
Hey 👋 Efficient as in least number of moves made by the agent?
This is a partially observable environment, so your agent needs some kind of model of the environment. It could keep track of where it has visited and where the walls/coins it has seen are. Then, because it's going to have to explore the whole environment anyway, it could simply pick the nearest unexplored location to visit next, until there is nowhere left to explore (and pick up coins as soon as it sees them for the first time).
okay guys, how do i filter TOKENS from trees?
i need to use only the tokens to my transformator works
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
as you can see here i'm having problem to parse only the tokens and not the trees
[Tree('content2', [Tree('args', [Token('EXPRESSIONS', '1d20')])])]
As you can see here my roll part of transformer gets the tree conent and another tree
but i want to filter it
to only gather the token
transforming this into this
[Token('EXPRESSION', '1d20')]
https://paste.pythondiscord.com/SMIQ
pastebin for my sorting algorithm if anyone wants to check it out
Hi for those who did the last codeforces contest (https://codeforces.com/contest/1928/problem/C)
My solution is the following:
https://pastebin.com/DMks9b2s
And is almost equivalent to the editorial. However to filter out the candidates, I did a lot more work (they only discarded the candidates for which k > x) and it turned out that their filtering is enough. I don't understand mathematically why though. When we add the divisors of n-x we don't know if they're all in the "ascending part" right? Similarly when we add one divisor we add its complementary and we don't know if it will be a good solution?
Can anyone here answer a big o question?
I have to use big theta to represent F(n) = N^4 - n log n + n^3
There is constants but it’s besides the point
Wouldn’t the lower bound be n log n
And the upper bound n^4
upper bound seems correct
not sure on the lower bound though, because idk how the subtracting works with the n log n
what are you doing that requires n^4 though?
The upper and lower bound are both n^4. Think about what happens as n gets very big: the terms n log n and n^3 both become insignificant compared with n^4.
a big O notation graph i made in desmos
Isn't log x^2 just 2 log x
Or did you mean (log x)^2
oh oops messed up the brackets lmao
so leo
guys i got some questions about the speed of the queue and lists anyone can help me?
Hey 👋 Go ahead and post your question. You'll be more likely to get responses that way.
https://www.spoj.com/problems/QUEUEEZ/en/ for this problem
import queue
q = queue.Queue()
input1 = int(input())
for i in range(input1):
a = map(int, input().split())
value1 = next(a)
if value1 == 1:
b = next(a)
q.put(b)
elif value1 == 2:
if not q.empty():
q.get()
else:
if not q.empty():
print(q.queue[0])
else:print("Empty!")
i used this code but it always tells me time limit exceed
from collections import deque
parse queries
num_queries = int(input())
queries = [list(map(int, input().split())) for _ in range(num_queries)]
queue = deque()
output = []
for query in queries:
query_type = query[0]
if query_type == 1:
queue.append(query[1])
elif query_type == 2:
if queue:
queue.popleft()
elif query_type == 3:
output.append(queue[0] if queue else 'Empty!')
print('\n'.join(map(str, output)))
this one as well gives time limit exceed so i wounder if i can fix that
can you have a look at that :D
I wonder if it can be solved with python with such limits
something like this should be better.. but it doesn't pass too
import sys
input = sys.stdin.readline
print = sys.stdout.write
n = int(input())
data = [0] * (n + 1)
sz = 0
pos = 0
for _ in range(n):
query = tuple(map(int, input().split()))
query_type = query[0]
if query_type == 1:
data[sz] = query[1]
sz += 1
elif query_type == 2:
if pos < sz:
pos += 1
elif query_type == 3:
print(data[pos] if pos < sz else 'Empty!')
print("\n")
This isn't a practical programming problem you might encounter on a job, despite the problem statement describing something your "boss" is telling you to do. This is a puzzle and the solution should focus on doing the least amount of work to solve it, considering T can be huge and you have a time constraint.
I didn't try solving it, but my first instinct is that you don't actually need to create the list of all numbers in the queue. If you have x amount of dequeues, then you at most need the first x elements
Additionally, nothing matters after the last print query
gonna give it a try
i feel like its not ment for python
but there are ppl that did it with python
i wounder what did they do
i can not run the code lol
yeah i know just problem solving stuff might encounter in an interview
sucks to see how slow python is :/
import __pypy__
import io
import os
input = io.BytesIO(os.read(0, os.fstat(0).st_size)).readline
out = __pypy__.builders.StringBuilder()
n = int(input())
data = [0] * (n + 1)
sz = 0
pos = 0
for _ in range(n):
query = tuple(map(int, input().split()))
query_type = query[0]
if query_type == 1:
data[sz] = query[1]
sz += 1
elif query_type == 2:
if pos < sz:
pos += 1
elif query_type == 3:
out.append(str(data[pos]) if pos < sz else 'Empty!')
out.append("\n")
os.write(1, out.build().encode())
Passes with pypy2
question about the complexity of this function that finds the height of a binary tree:
def how_high(node):
if node is None:
return -1
if node.right is None and node.left is None:
return 0
left_height = 1 + _how_high(node.left)
right_height = 1 + _how_high(node.right)
if right_height > left_height:
return right_height
elif left_height > right_height:
return left_height
else:
return right_height
def _how_high(node):
if node is None:
return 0
print(node.val)
if node.right is None and node.left is None:
return 0
elif node.right or node.left:
return 1 + _how_high(node.right) + _how_high(node.left)
would this be in O(n) time? would the space also be in O(n)? or would it be better time and space complexity to do this solution I saw in the website I'm using that gives me these problems:
def how_high(node):
if node is None:
return -1
left_height = how_high(node.left)
right_height = how_high(node.right)
return 1 + max(left_height, right_height)
The first way I wrote it out (using the _how_high helper function) was good for breaking the problem down in my brain
is pypy 2 faster than normal python?
depends, but usually yes
how to install the py?
I guess __pypy__ comes just together with pypy itself
We provide pre-compiled binaries for many platforms and OSes. There are also
pre-compiled binaries available on conda-forge. We have found conda-forge
a convenient and cooperative community for distri
https://paste.pythondiscord.com/T5HQ
updated sorting algorithm
where do I ask questions about python imports?
#python-discussion , #❓|how-to-get-help (or #esoteric-python if other didn't help)
So, I have a certain problem that I'm trying to tackle and I am looking for some suggestions. Basically I'm writing a tool (well, more like a script for now) for most efficient solution of an aspect in certain game. The gist of it is that there are sequences of symbols that activate certain abilities and you can "layer" them, so as long as there is a sequence that contains sequences for abilities, all of them will activate, so for example if there is a sequence OAANN and AANNOO, you can "layer" them into OAANNOO and both will activate. On top of that you can also merge 2 sequences at certain point, so for example if we have NONAOA and AANOA you can have both of them merging into one on the second last symbol (so there are sequences NONA and AAN both connecting to the sequence OA) and they will both be active. For now I have a brute force kind of script that goes through all of permutations of orders of "layering" on those sequences on each other and generating the shortest sequence with them in, so, you can probably tell that the amount of time to solve it gets extremely long, extremely quick, not to mention the lack of merging them (which if I decided to implement in similar fashion would probably make the script running for literal hours).
Also, sorry for the wall of text
longest common string has its own wiki https://en.wikipedia.org/wiki/Longest_common_substring
In computer science, a longest common substring of two or more strings is a longest string that is a substring of all of them. There may be more than one longest common substring. Applications include data deduplication and plagiarism detection.
i don't fully understand t he second part merging
So what would be the ultimate solution? Given a list of abilities [s1, s2, s3, ...], is it, for example, the shortest single string S that contains all the abilities s1, s2, s3, ...?
Well, it would be either the shortest string that contain all the abilities or several of them that do merge into one later on (depending on which one would use less symbols in total)
Does any one know the name of this color scheme or a similar one for a blank black background
it would be something like this:
NONA ↘
OA
AAN ↗
Let's first just assume that the second way of merging doesn't exist (so no NONAOA & AANOA -> [NONA | AAN]OA)
Even in this simpler scenario, what you have is the shortest common superstring problem, which is NP-hard according to the wiki
In computer science, the shortest common supersequence of two sequences X and Y is the shortest sequence which has X and Y as subsequences. This is a problem closely related to the longest common subsequence problem. Given two sequences X = < x1,...,xm > and Y = < y1,...,yn >, a sequence U = < u1,...,uk > is a common supersequence of X and Y if ...
So the running time is gonna suck no matter what, unless you use a heuristic
Yea, the algorithm you posted does look like it might be only a "little" bit faster than what I did
Insertion
Inserting at the end of the array
Since we can always access the last index of the array, inserting an element at the end of an array is O(1) time. Below is the code demonstrating the concept.
# Insert n into arr at the next open position.
# Length is the number of 'real' values in arr, and capacity
# is the size (aka memory allocated for the fixed size array).
def insertEnd(arr, n, length, capacity):
if length < capacity:
arr[length] = n
length is the number of elements inside the array whereas capacity refers to the maximum number of elements the array can hold.
def insertEnd(arr, n, length, capacity):
if length < capacity:
arr[length] = n
return arr
print(insertEnd([1,2,3], 99, 3, 4))
Traceback (most recent call last):
File "/home/runner/Static-Arrays/main.py", line 40, in <module>
print(insertEnd([1,2,3], 99, 3, 4))
File "/home/runner/Static-Arrays/main.py", line 37, in insertEnd
arr[length] = n
IndexError: list assignment index out of range
i don't quite understand
you are assignment to index 3 but [1, 2, 3] has length 3 (only up to index 2)
huh
sorry i don't get it
I have tried to create a function which I could use a lot it mentioned to make answer=true if answer="y" and in other cases. Should I use for it self or how I should do sth like this I dont have a clue how to do this
You know what I want to do bcs I dont know did I explain it in the proper way
Guys what do you think
!code
what exactly were you trying to do? adding a new element to the end of the list? or assign a new value to the last element of it? Also, what is that example snippet for? It doesn't look like it's for standard python lists
add a new element to the end
trying to use static lists
instead of dynamic ones
umm... you want to create and then expand a static list? am i getting that right?
yep!
but there isn't a static list
right
let's start with a fact that python lists are dynamic, so there's your problem number 1
If you need to use static arrays you can probably just use NumPy library
well, if it's for a course than we're gonna need some more context
Is your task to implement a static array class?
no
then what is it?
it’s to imagine i’m working with a static list and add an element at the end
without using append ig
hmm... if it's just pretending, then I guess you could simulate it by creating a list filled with None values and then work with that
dynamic array is static if you dont change the size :-)
but if you don’t change the size. how do you insert the new element?
it doesn’t make sense
replace empty stuff with useful stuff

does anyone know of a smart method to solve this problem
i mean, instead of showing me all the values, i'd like it to just show me: K:512, P:512
okay so, in a hashed dict how do i get the only hash i'm searching for
for example
dic = {
"detox": {
"name": "Natan",
"number": 123
},
"toxin": {
"name": "Raid"
"number": 456
}
}
here i want to select detox for example
by grabbing it from hash
hii i cant figure out computing this part.
so i got distributions like this and have to calculate error.
i have the upper values fixed so im free to hardcode it, but lower intervals are generated by an algorithm.
i need to add p1/len when its 0, and (1-p1)/len when its 1 in the lower p2 thingy. i have p2 values in terms of exclusive sorted intervals as list of 2 sized tuples.
is there any way to do this?
resultater = ''
p_time = 0
k_time = 0
for i in range(1024):
i_resultat = np.random.random()
if i_resultat < 0.50:
resultater += 'P'
p_time += 1
if i_resultat > 0.50:
resultater += 'K'
k_time += 1
print("P: " + str(p_time)+ "\n" +"K: " + str(k_time))
dic = {
"detox": {
"name": "Natan",
"number": 123
},
"toxin": {
"name": "Raid",
"number": 456
}
}
print(dic["detox"])
you had a missing comma at Raid which may be causing issues for you but here's how you print the dictionary for detox
I'm not sure what exactly you're trying to do, you can do something like this to learn about amortized O(1) time complexity with arrays
arr = [1,2,3]
for i in range(99):
arr.append(3 + i)
print(arr)
If you just want to change the value of the last index, you can do this:
def insertEnd(arr, n, length, capacity):
if length < capacity:
arr[length - 1] = n
length += 1
return arr
print(insertEnd([1,2,3], 99, 3, 4))
I did length - 1 because the index was out of range, index starts from 0 so [1,2,3] has an index of 2 so you couldn't reach it with the length 3
okay so, how do i rename a hash name?
if this is python rather than pseudocode, length += 1 is useless here
this is all that's written on the course i'm looking at
ik what i did wrong
uhmm guys
is there already a built in macros function in python?
cus when i typed macros it colored in green
can you show an example?
wait now i can see why
it's because i named my macros file by macros.py
so it probably detecting it as an imported thing
https://paste.pythondiscord.com/PR4Q
if anyone has any ideas or can help make this code faster im open to feedback
Have you profiled it? If not, perhaps https://github.com/pyutils/line_profiler
Apparently the quickest time complexity for a sort is O(n logn) best case and O(n^2) worst best.
I thought about it and came up with this working sorting algorithm
def linear_sort(data: list[int]) -> list[int]:
container = {}
largest = float("-inf")
smallest = float("inf")
for number in data:
if number not in container:
container[number] = 0
container[number] += 1
largest = max(largest, number)
smallest = min(smallest, number)
sorted_list = []
for number in range(smallest, largest + 1):
if number in container:
sorted_list.extend([number] * container[number])
return sorted_list
To me it looks like the time complexity is one loop + one loop which is O(2n) which is simplified to O(n)? Either I'm a genius or I've calculated the complexity wrong (probably the latter)
ps the space complexity would probably be a bit high
(i also just made up the name linear sort because it felt right)
the quickest time complexity for a sort is
O(n log n)best case
The best time complexity for a comparison sort isO(n log n)
Your sorting algorithm is fast, but it comes at a cost. For example, what happens if you try to sort alist[float]using your algorithm?
Nah, you've successfully invented counting sort which is O(n). It's just that it only works when the list has a finite number of kinds of objects.
Ahhh thats the kicker
this naive impl also just dies for [1, 10**100]
Also, calling your current algo O(n) is slightly misleading. It should be O(n + M), where M is the difference between largest and smallest
Cool thanks guys
@toxic hare was cooking up some optimized counting sort
Can I see?
There's also radix sort which is pretty cool and related to counting sort
In computer science, radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior step, until all digits have been...
#algos-and-data-structs message seems this is the latest version?
what are the chances I send that sort when the algo right above it is more optimised
4
Hey I am trying to learn how to use breadth-first search so I can compete in programming contests.
I think its the algorithm I need to learn because it can help me find all possible paths, find shortest path, determine if goal is accessible, which can all be possible problems in a programming contest.
Here's an example of what I've learned about it: With a 3x3 grid, start from the top-right and end at the bottom-left.
grid = [
[0, 0, 0],
[0, 0, 0],
[0, 0, 0]
]
# coordinates are represented by row num, and column for example, grid[0][2] would get you the start position.
start = [0, 2] # start in the top-right corner
goal = [2, 0] # end in the bottom-left corner
# since it is a 3x3 grid (X representing the row, Y representing column)
X = 3
Y = 3
queue = [start]
while len(queue) > 0:
coord = queue.pop(0) # take first element of queue
if coord[0] == goal[0] and coord[1] == goal[1]
print("Reached end")
break
# add all possible moves to the queue (depending where the coord is, for example, at [0, 2] you cannot move to the next column, as you're already at the last column)
if coord[1] != Y-1:
queue.append([coord[0], coord[1] + 1])
if coord[1] != 0:
queue.append([coord[0], coord[1] - 1])
if coord[0] != X-1:
queue.append([coord[0] + 1, coord[1]])
if coord[0] != 0:
queue.append([coord[0]-1, coord[1]])
But I have a few questions:
- Is my implementation correct?
- How can I edit this code to find all possible paths?
- How can I edit this code to find the shortest path?
- It looks correct, in that it'd consider all paths indeed.
- You should remember which cells you've visited and not revisit them, otherwise this visit cells multiple times (since when handling any cell, you add at least 2 cells to the queue). For example, it considers the cell [0,0] once at the start, twice again at path length 2, and many more times.
- Consider using an actual queue, like
collections.deque, instead of using a list as one - it's slow (O(n)) to pop an element from the start of a list, whereas for a deque it's O(1).
To get shortest paths, consider storing for every cell where you visited it from (as a dict). That way, when you visit the final cell, you can "retrace your steps" to generate the path - take the final cell, its predecessor, the predecessor of that predecessor, and so on until you end up back and the start - and that's your path in reversed order.
Getting all paths via BFS... for that, I believe the only way would be to store entire paths in the queue, rather than current cells. So you take a path ending up in a certain cell from the queue, and extend it by the neighbour cells, resulting in several paths of bigger-by-1 length, that you all push into the queue.
getting all paths with BFS sounds really annoying ngl
yeah, it's pretty unusual
there's infinity possible paths (when you can visit cells multiple times), and BFS visits them all in order of length, so most of these paths are useless
your approach seems reasonable. instead of single nodes, store paths
ah okay this all makes much more sense. thanks
is there a name for this sort of algorithm that takes an area and tries to split it up into rectangles/squares to minimize the amount of 'tiles'? i imagine there's several
Not precisely, I guess
is there a way that i can make a custom traceback?
like, i need to know how to force a custom error
cus i need to check if the macro is protected or public if they are a server macro
to check if i can execute them
otherwise force a traeback error with a custom traceback
I really recommend checking out this article: https://www.redblobgames.com/pathfinding/a-star/introduction.html
I have actually looked into that a while back for my own project, this paper has a pretty simple O(n^2.5) algorithm (n is the number of polygon vertices here), as well as the more efficient O(n^3/2 log n), but I have not looked into that at all. n^2 sqrt(n) was good enough for me.
https://arxiv.org/pdf/0908.3916.pdf
"Partition into rectangles" seems to be a good enough name
kk
also i implemented bitwise operations into floats again, this time it's much more readable imo
import operator
def bitop_on_float(op,a,b,precision=3):
p = 10**precision
a_int,a_float = divmod(a,1)
b_int,b_float = divmod(b,1)
a_int,b_int = int(a_int),int(b_int)
a_float,b_float = int(a_float*p),int(b_float*p)
return op(a_int,b_int) + op(a_float,b_float)/p
its actually very useful to me to have this since i can use it for fractal drawing
This is mildly related to this channel but I'd like to know what people's classes for this are structured like. I'm a freshman, just barely learned a bit of how to code in python last semester and am new to data structures and algos. But I have a class rn that just really makes me mad. Basically, its an hour forty class where the professor tells us ab a basic data structure concept, for example we did linked lists this week and last week. Then they makes us work individually for like 90% of the class. None of our work is graded, they only makes us submit 1 point reflections. Technically zero hw and we grade ourselves at the end.
Is this normal?
Some of this stuff seems really interesting and I feel a guest doctor who had zero experience teaching explained big O better than my actual professor did
It's legitimately giving me a headache. Some people try doing stuff during class, others just play video games and im somewhere in between. We don't even have a decent book to read from
This class, we play a kahoot for like 10 minutes, then the professor talks ab stuff to keep our retention, then goes on to tell us to go and ai generate a bunch of stuff for a "to do list" data structure for the rest of the class
that is not great. i would at least expect homework problems where you're meant to implement data structures and solve problems with them
The closest thing we get is a check list of stuff to watch
Or we take home a jupyter notebook we aren't graded on and that we don't get feedback on
I reallt want to complete learning ab linked lists for My sanity but at the same time I don't want to be doing this in a way that is against the point.
Like, ik it should be my fault for not practicing Jupyter notebooks as much at home, but I have other classes and I can only dedicate so much time for self study. If I wanted to spend all this time self studying instead of practicing then I wouldn't be in that class
maybe ask around to see if other profs are the same
Pretty sure they're the only data structures professor
transfer schools 😔
Dude I wish. Tuition is crazy
The one option I wanted to transfer to would be significantly cheaper but i feel like it'd be a bad idea to transfer since 1. Would be far away from all of my family 2. Would be a significantly bigger uni
Plus housing would probably be hell
Is there a summative assessment at the end of the course?
There's a portfolio thing?
I think
I'm probably gonna try going to an office hours next week
Just to vent my frustrations with the class. Of course I don't want to be brutally graded and struggling but I feel like I'm not really gaining anything by sitting in a class and trying to figure out how to write a specific structure when we don't get how solutions should be looking like
hehe yeee
also yeah i did a memory and a time profiler
the slowest portions of the code are
outliers0,arr = list1[key]-mini,list1[~key]
-
bt[y] = np.bincount(outliers[y])
-
arr = np.repeat(np.arange(np.size(bt[y]))+miy,arr)
https://github.com/Pigizoid/pigicount
i made a github for the code
i’m sorry man that doesn’t sound good to me
does numpy have a way of computing a function with x,y (or more, i.e x,y,z...) coordinates in mind?
i like making fragment shaders but the way i'm currently doing it is very much not optimal
essentially i'd like to be able to use a numpy 2d array and run a function on it for every coordinate on it, and then get the results from it in something like a list or an array
here's an example i just made of approximately how i do it now
it works fine for single images but when i extend it to rendering gifs, it can get pretty prohibitively slow
import math
from PIL import Image
#== config ==#
size = (512,512)
scale = 1
output_filename = "shader_output.png"
#============#
width,height = size
im = Image.new("RGB",size)
image_data = []
def color_from_trig(h,offset=(math.pi/4)):
n = h * math.pi
r = int(255 * abs(math.sin(n-offset)))
g = int(255 * abs(math.sin(n)))
b = int(255 * abs(math.sin(n+offset)))
return r,g,b
def shader_func(x,y):
n = math.sin(x * y)
return color_from_trig(n)
for y in range(height):
if y % (height//100) == 0:
print(f"{y/height:.0%} Complete... ",end="\r")
for x in range(width):
x_scaled = (x/width) * scale
y_scaled = (y/width) * scale
color = shader_func(x_scaled,y_scaled)
image_data.append(color)
im.putdata(image_data)
im.save(output_filename)
i just iterate through each coordinate, run the function, and put that into a list which is put onto the image and saved
im porting my sorting algorithm to pip so yall can enjoy the sorting speed go brrrr
Sure? A lot of np functions take an optional axis argument, and basically every function can operate on ndarrays
>>> import numpy as np
>>> coords = np.array([[1, 2], [3, 4], [5, 6]])
>>> p = np.prod(coords, axis=1) # <--
>>> p
array([ 2, 12, 30]) # <-- [ 1*2, 3*4, 5*6]
>>> n = np.sin(p)
>>> n
array([ 0.90929743, -0.53657292, -0.98803162])
>>> h = n * np.pi
>>> h
array([ 2.85664212, -1.68569354, -3.10399289])
>>>
how would i run an arbitrary function on it? like say one that takes two values as input
You don't, you strictly use np functions to achieve what you want
While you technically can (through np.vectorize), it's just a convenience wrapper around a python loop, so you won't see a speed up if you do that
kk
best sorting algorithm is radiation, let the bit flips do the work for you 💪
what is radiation?
i just now merge sort
ah and quick sort
In physics, radiation is the emission or transmission of energy in the form of waves or particles through space or a material medium. This includes:
electromagnetic radiation consists of photons, such as radio waves, microwaves, infrared, visible light, ultraviolet, x-rays, and gamma radiation (γ)
particle radiation consists of particles of non...
i was making a joke about how radiation can flip bits in computer memory
yeah
it's infamous for doing stuff like changing a mario 64 speedrun by letting mario jump a bit higher
but i was just doing a joke
theoretically though, just by random cosmic rays and such, you could have data sort itself, though the medium would probably break down ages before that :p
ah yes i saw it
ok haha
you can do that to any homogenous 2D array using apply alogn axis or any numbe rof methods
but if a list contains arrays that arent the same size then you cant do that due to the way the math works
nawhh not miracle sort XD
hey do you know why your magic function doesn't work on this recursive dfs function (https://codeforces.com/contest/1929/submission/246588499) ?
I get:
Traceback (most recent call last):
File "C:\Users\lhott\Desktop\mdf\main_codeforces.py", line 74, in <module>
dfs(0, -1)
File "C:\Users\lhott\Desktop\mdf\main_codeforces.py", line 11, in wrappedfunc
to = f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "C:\Users\lhott\Desktop\mdf\main_codeforces.py", line 62, in dfs
dfs(child, node)
File "C:\Users\lhott\Desktop\mdf\main_codeforces.py", line 11, in wrappedfunc
to = f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "C:\Users\lhott\Desktop\mdf\main_codeforces.py", line 62, in dfs
dfs(child, node)
File "C:\Users\lhott\Desktop\mdf\main_codeforces.py", line 17, in wrappedfunc
stack.pop()
IndexError: pop from empty list
(and runtime error on codeforces https://codeforces.com/contest/1929/submission/246642572).
You are using the bootstrap incorrectly https://codeforces.com/contest/1929/submission/246642754
Btw I'm really confused by your code
mul1 = mul1 + 1 * 1 ** len(graph[node]) # add when root is dangerous and all children are not
It also looks like mul1 can become a really big integer
this is just mul1 += 1 but complicated
Yes, and then the next line is mul1 -= 1
it's just +1
but i made it hard so that i'd understand my reasoning
and then the next is -1
it's useless but it made sense to me when i was writing down my formulas
for my tree dp
sorry
https://codeforces.com/contest/1929/submission/246643427 After some modifications (making sure to mod after every calculation, and using sys.stdin.readline instead of input), it gets AC
oh ok you should yield None when the dfs doesn't return anything
yes, and you also need to yield the recursive call
oh ok my bad
i though the modulus was long so i was only doing it at the end
but i assume doing operation son big number is even worse?
thanks btw!
For speed, it is super important to not use big integers.
You always should avoid going >= 2^63
Wdym by "was long"?
was an operation that takes a long time
but it seems that going >= 2^63 ends up being way worse
It is true that mod runs much slower than + or *. But in Python you'll barely notice that.
I usually do an if check to mod, I wonder how that compares to modding every time
# do sth to num
if num >= MOD:
num %= MOD
might be better to just do -= instead of %= if so
right
%= probably works better if the threshold is like multiple times more than MOD
In terms of running time? it seems negligible to me, I'd just mod every time
Yeah modding every time 100% works
the only problem was the recursion depth and handling very large integers
I wouldn't necessarily assume that the if-check version is faster overall. You'd really need to time the two approaches on realistic input.
In compiled languages on modern x86, if-check is faster, about 1.5 times (about x5 if you count just cycles on a simulated CPU in perfect conditions, but it's very complicated and can't be implemented: 1.5 is the result in the practical application). I measured it some time ago. In python, if is unlikely to be any faster because it's just more interpreter commanđs.
In Python I would not bother with if statment modding
In fact, I would would guess that using if statements is significantly slower
why it didn't executed variable function first before the use_var function?
https://pastes.dev/BWF9zzpKum
like, it's obvious that it's out of range, but don't it was suppose to store first with variable function
capture it with use_var later?
Lark what the actual fuck?
I guess there is some ratio where if you mod X% of the time things become equivalent
(of course, there is speculative execution making things more complicated...)
yo
i have to make a project using data structures
any ideas??
give me some unique problem that can be solved using data structures
I mean just go on sites like leetcode, codeforces, etc.?
but the thing is, everyone from my class will go and search there.
so id like to see if there are any obscure or lesser known stuff from other sources
Not sure what you mean... sites like these thrive with problems that are meant to be solved using obscure DSA
iseeee
For codeforces, you can find people explaining stuff you've probably never even heard of
E.g. Mobius inversion, Mo's Algorithm, Convex hull tricks
And a lot of these you'll probably never use outside of these problems, or when you do you copy someone else's implementation
hehe boi, thats great
thankyou! ill check em out
Those probably aren't great if you're just getting into DSA though, too specific and too confusing
yeah, i figured.
but there will be newbie friendly stuff also right?
Sure, but those 3 definitely aren't newcomer friendly
tell me how hard this problem is for a beginner:
this is some railroad yard problem which asks to use lisp lists and arrange the cars to the engine in a given map of rail yard
hahah ok
the cars are labled as characters and have to be arranged accordingly, and there are some legal and illegal moves that are possible as usual
Starting out, it's probably better to follow some course/guide on what you should learn first
your right.
i have done the usual linked list, stack, queue, DLL,CDLL, priority queue etc
would like to look for a problem thats just a few points more difficult
There's this roadmap that's nice I think
A better way to prepare for coding interviews.
a few problems in there are locked behind premium but most aren't
alright thanks
will look into it
sir, this is more of a collection of problems related to ds. which is not exactly what i am looking for.(is great for my self development for later tho)
my project requires implementation of ds in a useful manner, like a working system of sorts
i think Nqueens is the only viable option from that website which fits my need
but its a very well known problem
ik i didnt properly explain my needs earlier
Ah, so you specifically need to make a project, and the requirement is to use ds in some way?
What about a huffman encoder?
Like a program that attempts to compress the input using huffman coding
Maybe even reason about why this algorithm (and all lossless compression techniques) will fail to compress some inputs
under what topic is it?
interestingg ill put it on my list
You can use it to losslessly compress data, and you'll use dsa for the process
when i went to check for huffmanencoder i found lru cache
i meant on the roadmap, but yea i got it
compression, coding theory, information theory
❤️
jump game also looks nice but might not be interesting enough for my teacher lol
Looks like Huffman coding is a pretty complex topic by itself
How much time will I need to learn that first and then implement it for the encoding?
tomscott has a video on it, nice
Greetings guys
Hope all are well. Anyone here who is familiar with iTensor library in Julia?
Kindly let me know.
okay so, anyone here knows Lark library?
i'm having a problem with it where use_var is being used first before variable from my Transformer
"!echo [Sucess {&0}\nDamage: {!roll 1d12+5,2}] !if $!roll 1d20+2 >12 !else !echo [Failure]"
this is my input
0
[]
Traceback (most recent call last):
File "C:\Users\User\Documents\GitHub\RP-Utilities\.venv\Lib\site-packages\lark\visitors.py", line 124, in _call_userfunc
return f(children)
^^^^^^^^^^^
File "c:\Users\User\Documents\GitHub\RP-Utilities\rp_utilities\macros.py", line 560, in use_var
use_var = variables[var_pos]
~~~~~~~~~^^^^^^^^^
IndexError: list index out of range
not the right channel for this, why not try in #python-discussion
!rule 4
4. Use English to the best of your ability. Be polite if someone speaks English imperfectly.
Any good links/sources for learning to use Kruskals algorithim in python?
does anyone know a list of leetcode topic/questions that covers almost all problem solving patterns ?like we have 2 pointer approach and sliding window etc i only know these two yet and idk how many are there so i'm thinking to learn about more patterns to solve
GUYS
what?
!mute 1044763608493600850 this is absolutely not appropriate, do not troll here. It'll be a ban if you continue to do so once your mute is up.
:incoming_envelope: :ok_hand: applied timeout to @warm crane until <t:1708324861:f> (1 hour).
Can Anyone tell me how to be a shark in algortihms
Can anyone tell me how to be a shark in algorithms?
Some good sites where i can test myself or understand in easy. I feel like my teacher is bad at formulate things so i try to Think about homestudy for that class.
i have a class implementation for linked lists
there is the node class which is a data class that has a data value and a pointer and only 2 methods modified
@dataclass
class SinglyNode():
data: Any
point: 'SinglyNode' = None
def __post_init__(self) -> None:
if isinstance(self.data, type(self)):
raise ValueError("Data Value Cannot Contain A SinglyNode")
def __repr__(self) -> str:
if self.point is None: return f"{self.data}"
return f"{self.data} >> {self.point}"
and well if i have like 1 million of these binded toggether. It will cost me like 48 mega bytes
which is too much
especially given the fact that the data value is litterally a integer(for this testing)
does anyone know what's the logic behind it
im thinking instead of referencing the instance of the pointer
what if i reference something like their memory address(or ig their ID)
and then look that up on runtime
problem tho
for i in gc.get_objects():
if point == id(i): return i
this is a loop, and it will be completed in O(n)
gc.get_objects has 10146 objects tho
what if i have a more complicated program?
well the time to retrieve will enlarge
(was busy at the moment, srry for late response but pls ping me if u wanna share a way to reduce the size of linked list implementation)
this is just bubble sort, no?
an int in python is already 24 bytes at least, so storing a million python ints won't get better than ~24MB
so no solution sadly?
Depends on how desperate you are
since im implementing a data structrue for a library im working on(again data structure related, mostly proof of concept and showcase). I am desperate to find that performance boost
Having two parallel numpy arrays with fixed size ints could get sizes down, but they will be more annoying to work with
What are you actually trying to accomplish?
(linked lists are rarely the best data structure)
i see
making a linked list that has data(which the data can be in any form. For example list, int, string, another object... etc)
why does it need to be a linked list? 
bc well i wanna implement common data structures on my library
so other people can use
(not really in production, since i already said its for showcase and proof of concept. But still be somewhat easier to use then having to implement your own)
100% there are libraries that already do something similar
but idc that much
regardless of how you store things you need as least store all the values, so for python int that's already 28 bytes per element
lists are close to optimal here, they have some constant overhead, but then it's just whatever the total sum of sizes is
for linked lists it's worse since you need to store info about the links
In practice the usefulness of linked lists is...rare. Using it to implement a queue might be sensible, or for some niche uses where you want to make use of O(1) operations in the middle of the list
(and I suspect you wouldn't end up exposing an api for that last thing)
idk what there is to solve
if you need to store the values you need to store the values
that'll be the lower bound for the size of any container
The collatz conjecture would be classified as a linear complexity, correct?
huh?
The collatz conjecture
I know what it is
but wdym by linear complexity? we don't know if things even terminate
Good point
that's the whole conjecture
I have following code
class AuthUser(BaseModel):
id: int = Field(alias='ID')
first_name: str = Field(alias='FIRST_NAME')
last_name: str = Field(alias='LAST_NAME')
user_email: str = Field(alias='EMAIL')
class Config:
orm_mode = True
class User(Base):
__tablename__ = "user"
ID: Mapped[int] = mapped_column(primary_key=True, nullable=False, index=True)
USERNAME: Mapped[str] = mapped_column(nullable=False)
FIRST_NAME: Mapped[str] = mapped_column(nullable=False)
LAST_NAME: Mapped[str] = mapped_column(nullable=False)
EMAIL: Mapped[str] = mapped_column(nullable=False)
PASSWORD_HASH: Mapped[str] = mapped_column(nullable=False)
address: Mapped[List["Address"]] = relationship(back_populates="user", lazy=True)
user_from_db: Type['User'] = User.get_user_by_email(db=db, email=user_email)
return AuthUser().from_orm(user_from_db)
When I do the return statement, i get following message : ID
Field required [type=missing, input_value={}, input_type=dict]
(Also for the other attributes from the BaseModel), but I get all needed data from my db call?
Does anyone know whats wrong ?
Assuming it's true, what can we say about the complexity of the sequence length with respect to the starting point? 😄
iirc we can say a lot about most starting numbers
but overall I have no clue what implication just knowing the conjecture is true would be
I would ask about argparse in Python bcs I cant understand this
for what it is used
and generally how to learn all this stuff with this
cause I dont think I should to learn it by heart
if anyone would like to explain for me this topic I will be pleased
Maybe ask in #python-discussion , argparse is just a general Python topic
next is, how do i get the hash values without knowing the name of the hash?
like, how do i acess a hash without knowing it's name?
what do you mean, "name"
like
{
"name" : {
"key1": "item1",
"key2": "item2"
}
}
in this case i don't know that the hash is "name"
so i want to acess it even not having it's thing
oh you mean ruby's hashes
well if you have a dict you can just loop over the values
have you tried just looping over the values of the outer dict
i got a better way to solve the problem
placing the .values() that i forgot in database
i don't really understand the problem, but 👍
Hello my friends,
Is prefixing python file with the directory name considered a bad practice?
Thank you
Why'd you do this?
Ask my friend 💀 he told me it was common
I mean, consider you have this
my_app.py
ROSU/
| - ROSU_menu
```In your `my_app.py`, the import would look something like
```py
# my_app.py
import ROSU.ROSU_menu
from ROSU import ROSU_menu
```Which contains redundant info, because... well the `menu` I import from `ROSU` is obviously the ROSU menu
Whereas if you didn't prefix it with the dirname it'd be nicer:
```py
# my_app.py
import ROSU.menu
from ROSU import menu
its like I know data strucutres and alogrithms but when I do leetcode I am able to solve but not with the most efficient solution
The good thing is: in real world, ‘most efficient’ is rarely the goal. Acceptable is. 🙂
but them indian coding interviewers will not accept this
and why the f doesnt leetcode have autocomplete
You can code in another ide and copy to leetcode. Also; if you want to get better, do some reading/studying in top of practicing. Sometimes hard to break the next skill ceiling simply by grinding.
Well it does... it's locked behind premium though, so just code in a separate IDE
Just practice more problems, and eventually you'll start to recognize patterns more easily, e.g. "I see that I need to do X-task, which can be done using Y-datastruct/algo"
Though it's in cpp, you can take a look at this. Each page has links to practice problems at the bottom
ALWAYS code in a separate IDE/text editor
it is likely to be faster, closer to the environment you are going to work most of the time, you are free to test your code however you want, and generally have a lot more control over it
it's just a significatly better experience
i have an algorithm in mind for searching efficiently linked lists
and on first its simple. If you wanna find based a node based on index
you iterate over until you get there. But when you get that node, you save it as a sort of "checkpoint"
before running again the code, the algorithm will ask 2 question(on itself)
- Is there any prerecorded position that equals to the position im asking for?
- What is the closest prerecorded position that is not ahead of the position im looking for?
now the question is. Is this a efficient algorithm?
or can it be more refined
ig
btw i ran a test on a linked list with 5 million elements and got these results
Without Checkpoints
< 3000000 >
0.3016791343688965
< 4500000 >
0.4416670799255371
With Checkpoints
< 3000000 >
0.29016876220703125
< 4500000 >
0.14542198181152344
jesus. Average time with non-checkpoints vs checkpoints on 100 iterations(with 2 distinct random values each iteration). With 2 lists of same length and content
0.2500529036521912 // Non-Checkpoint Average Time List
0.0026582655906677247 // Checkpoint Average Time List
yeah a skip list is basically this idea but it precomputes the "checkpoints", and they're evenly spread
Yes. This is basically a skip list. But if you really want to access stuff by index a lot you probably just want a list
tbh I wouldn't bother trying to optimize this for a linked list
any change you make to the list in between queries risks shifting all of your checkpoints, rendering them incorrect
I think if you as a user decide to use a linked list you should expect the O(n) operations for accessing elements
tbh the more you use it, the more it "improves" (if the positions are diverse)
but on the other hand you're using more and more storage
and eventually you have a linked list and a dict/whatever
and keeping them in sync under modification gets more expensive
elif event.key==ord('F11'):
TypeError: ord() expected a character, but string of length 3 found
anyone could help me with this
any ideas
that's not how ord works
also wrong channel
ok thanks I will try to make it work
that feeling when you get an indent expected error
hello, im stuck in this exercise and there is 10 minutes left
thats the result i get
nah nvm times up
!rule 8
8. Do not help with ongoing exams. When helping with homework, help people learn how to do the assignment without doing it for them.
That's a neat little programming language btw, no idea how it works
i just asked for help, not the answer, but yeah i agree with u
true
for that reason i have 2 sort of "mitigations" to this problem
the first is adding a bandwidth. When the bandwidth is 0, the checkpoint algorithm is disabled. If its -1 it can go to whatever number you like
but when on a positive number. It will have maximum that number of checkpoints
the other one is basically resetting the checkpoint list when a change on the linked list happens
so alternating inserts/querying will ruin any performance gains
kinda
the algorithm is optional
if u traverse too much on the same list, then enable it
I guess I'm not a big fan of just stacking stuff on top of the data structure as a library
just provide the data structure, let the user stack stuff on top
unless maybe it has to make use of the internals to work
tbh i could implement a "version" system or smth
to keep some checkpoints intact
but that sounds complicated
basically, if the lookup performance starts mattering (i.e. a lot of lookups) using a linked list was probably a mistake
true
Why not just switch to some data structure where all operations take guaranteed log(n) time?
tbh managing multiple data structures is too much
it matters when u do a lot of lookups
but i kinda wanna make it still a little more viable
so yeh
the non-checkpoint version would take ~33 minutes to complete all 10k iterations
whereas the checkpoint version takes about 2-4 seconds to complete all 10k iterations
Just pick a good one and stick with it. There are for example data structures that share the same interface as lists, but where operations take log(n) time
Then you could do about 1e6 inserts/lookups per second
mhm
also 1 question
does a bidirectional hashmap sound useful? This means both key and values are unique and you can look up a key by value and vice versa
:incoming_envelope: :ok_hand: applied timeout to @regal spoke until <t:1708536063:f> (10 minutes) (reason: newlines spam - sent 102 newlines).
The <@&831776746206265384> have been alerted for review.
!unmute 475657117509091329
:incoming_envelope: :ok_hand: pardoned infraction timeout for @regal spoke.
Sorry about that. Use the pastebin to avoid tripping the spam filter: https://paste.pythondiscord.com
ok thanks
@glossy lintel Try comparing what you have to this https://paste.pythondiscord.com/L3ZQ
It is a list like data structure that support fast inserts that I made a while ago
My guess is that it is a lot more efficient than what you currently have
the linkedlist supports more data types and other data structures
so its not really numbers only
none of the less. I might include some of these toggle on/off optimisations for specific scenarios
The thing I posted support the same things as a normal list in Python.
It supports far more things than just numbers
mhm
tbh the thing im trying to do is midigate some of the loses you would get from a linked list
just don't use a linked list. solves pretty much all your problems
Anyone here able to help with hadoop mapreduce in Python?
Need to create a KNN hadoop mapreduce program from scratch (no scikitlearn). Given two datasets that live in hdfs in input folder.
Both datasets contain the same structure: label feature1 feature2 feature3 … feature 300. Each record has 300 features and one label.
Need to somehow list predictions and accuracy in output. Must have a mapper.py and reducer.py
Need help with all of it, but specifically how to use stdin to have both files when the hadoop command execution would look like this:
hadoop jar /usr/local/hadoop-x.x.x/share/hadoop/tools/lib/hadoop-streaming-x.x.x.jar -mapper “python3 /home/name/mapper.py” -reducer “python3 /home/name/reducer.py” -input /user/name/input -output /user/name/output
._.
im making a datastructure library
I say just provide a plain linked list to the user then 
or maybe a kind of skip list
sure, but that's separate from a linked list
Where can I find details about time & space complexity of Python standard functionalities?
Is there a reference for that?
For example, in C++, I use this website:
https://cplusplus.com/
https://cplusplus.com/reference/algorithm/sort/
Is there something similar for Python?
hey - curious if someone could help set me straight on whether my current approach for dynamically creating models that may have relationships from is bad form/dumb/okay.
I'm working with an API that returns a schema similar to JSON Schema but not exact, so I'm working on a custom encoder/decoder to create the python objects. Each record in the response contains a schema which may have a relationship with another not yet processed record in the response. I thought it'd be simpler to store each model and relationship in a class variable for tracking and lookup in case the object and/or relationship already exists. But, I'm not sure how pythonic this approach is and if it'd be better to process the API response and construct the schema first, sort the instances outside the class and then discover relationships.
Simplified version of what I've done: https://paste.pythondiscord.com/EGZQ
what's the difference between O(n) and O(k)?
It's not stated explicitly, but reading between the lines, for the list table I think:
nalways refers to the length of the listkrefers to some other number, such as the length of the slice, the length of the iterable you're extending the list with, or the magnitude of the number you're multiplying the list by
👌 tyvm
It's actually stated there btw
Generally, 'n' is the number of elements currently in the container. 'k' is either the value of a parameter or the number of elements in the parameter.
can step count and operation count analysis of an algorithm yield different results?
wdym by different? are you counting exact number of steps/operations? if so, then yes, but it doesn't matter because exact number doesn't make sense
like on a code with a few for loops and some comparisons i got a different time complexity on step count (slightly lower) than i did on operation count as for why i need the exact number is just that its a descriptive question i had, i am aware overall impact between different analysis on final execution is minimal
i can share the code here if u would be interested
as i said, exact number doesnt make sense, only biggest term matters
thats true but we are marked based on the solution we arrive to otherwise i could always write the complexity generally just judging the code ;-;
i have a solution to the problem actually
and this might solve another problem
but it requires changing the algorithm
and finding the right balance
so instead of recording a checkpoint on the searched spot. What if instead i calculate predefined checkpoints that are stradegically placed, like for example if i have a 1 million node linked list, maybe the checkpoints would be spread by 250k
for insertion instead of immediately changing the checkpoints, what if i lazily evaluate them
meaning after the linked list change. On the first first to ever be done on the new linked list. Thats when the refining happens
would this be a better stradegy to boost the algorithm's performance?
since the more the checkpoints are placed thoughtfully, the less memory it uses(although would be just a tiny bit slower. But that would be some milliseconds)
Oh, thanks. I somehow completely missed that 😄
it's not always obvious just looking at code
yeah true
e.g. what's the complexity of this?
def f(n):
for i in range(1, n):
for j in range(i, n, i):
# some O(1) work
actually, starting at i doesn't matter
maybe simpler
def f(n):
for i in range(1, n):
for j in range(0, n, i):
# some O(1) work
outer loop is n - 1, inner loop goes from n to n / 2 then n / 3 till 1 ig?
so what's the overall complexity?
need some maths and a paper
spoiler: it's ||Θ(n log n)||
i see
harmonic series
n + n/2 + n/3 + ... is Θ(n log n)
1 + 1/2 + 1/3 + ... is Θ(log n)
this comes up when analysing the sieve of eratosthenes
though there it's an even more intricate because primes, and you end up with n log log n
number theorists drowning 😔
hmm
so it's not always as easy as one loop = O(n)
another simpler example
i = 0
while i**2 < n:
i += 1
i mean i did say generally 😔 or atleast the ones in my course those are pretty popular algos.
and the question is
is it worth implementing instead of leaving it there?
an example i have is this one
blue circles mean that these nodes are always known no matter the length of the list and anything
red circles mean checkpointed nodes
is there also a formula that i could use to calculate the most efficient interval of checkpoints
not related, but "generally" is one of my favorite words to hate 🥴
generally, you hate the word
meaning 2 and 4 contradict each other
usually and always
without reference to specific details, even
turns out i tried batching searches toggether into 1. And it kind of worked, so there was no need for the checkpoint algorithm
like you take the nodes u want to search. You put them in a set and then in 1 second you got your answer
whereas the checkpoint algorithm took 5 seconds
so im gonna adopt it more
kosaraju
hey guys, quick question. what happens when it hits the return node_map[node]
like it returns and gets out of the dfs function?
im trying to figure out how the dfs function is working here and the 2 returns are confusing me
improved 👍
This guy kinda 🐐ed or nah
yo
anyone know how classes and shit works?
ya boi be struggling
need to expand my hacker knowledge know what i mean
class = "token"
print("hacked account" + token)
hi guys , i tried a lot to understand what the level order traversal of a binary tree but i can't understand it and i can use in , pre , post order traversal but level order can't , so is there any one can help me
A level order traversal means visiting all the nodes at depth zero, then all the nodes at depth 1, then all the nodes at depth 2, etc. where 'depth' means the number of edges between that node and the root node.
It's the same kind of order you would visit the nodes in a 'breadth-first search'.
You could implement it using a FIFO queue: ```py
from collections import deque
def level_order(root_node):
if root_node is None:
return
queue = deque([root_node])
while queue:
node = queue.popleft()
yield node
for child in node.left, node.right:
if child is not None:
queue.append(child)
The next node to be yielded is always the one that was added to the queue at the earliest time.
ok bro thanks so much for that
This is how I ususally do it: ```py
def level_order(root):
if root is None:
return
level = [root]
while level:
yield from level
level = [
child
for node in level
for child in (node.left, node.right)
if child is not None
]
Essentially just going level-by-level.
muchas gracias
pre/in/post order refers to when you traverse the current node M.
E.g. (L R refers to the left/right subtree respectively
preorder:M L Rinorder:L M Rpostorder:L R M
I'm trying to resolve this steiner with a cost constraint problem
We are given a Floyd-Warshall shortest path algo and Krukal's, utilizing these two algos with a graph of 1000 2D points with the cost of and edge equaling the distance between the two points, we could approximate a steiner tree that covers all the required terminal points. Now, we add a root node P and a total cost constraint, how do I approximate a steiner tree that must inlcude P while staying under the constraint?
actually batching mainly depends on the iterations to be the most optimal(how many searches have to be done) whereas checkpoints i think depends on the size of the linked list & frequent uses
Hi everyone! I am building a simple project in Python to practice my skills (I am beginner), but I am having a bit of an issue. I want to make a simple online bookstore, but I don't know how to add books with the name, price, etc.
I’d suggest opening a help thread, sharing your problem and what (if any) code you have so far. #❓|how-to-get-help
Thank you so much
The easiest way to think about it is using a queue. Start at the root. Queue all the children. Go to next in queue. It’s just a layer by layer view of the tree
thanks so much
Hello guyys
does anyone here know the bowyer watson algorithm and could help me implementing it, i nearly got it, but i get overlapping edges and unused points sometimes
these are some outputs i get, here my algorithm:
def calc_bowyer_w(pts_arr, w, h):
super_triangle = create_super_triangle(w,h)
triangulation = [super_triangle]
for point in pts_arr:
bad_triangles = []
for triangle in triangulation:
if triangle.check_if_point_is_inside(point):
bad_triangles.append(triangle)
polygon = set()
for triangle in bad_triangles:
for edge in triangle.edges:
if all(edge not in other_triangle.edges for other_triangle in bad_triangles if other_triangle is not triangle):
polygon.add(edge)
for triangle in bad_triangles:
if triangle in triangulation:
triangulation.remove(triangle)
for edge in polygon:
pts_lst = list(edge.points)
new_triangle = Triangle(pts_lst[0], pts_lst[1], point)
triangulation.append(new_triangle)
for triangle in list(triangulation):
if triangle.contains_vertex(super_triangle):
triangulation.remove(triangle)
if super_triangle in triangulation:
triangulation.remove(super_triangle)
return triangulation```
classes:py class Point: def __init__(self,x , y): self.x = x self.y = y class Edge: def __init__(self, point1, point2): self.points = {point1, point2} def __eq__(self, other): return self.points == other.points def __hash__(self): return hash(tuple(sorted(self.points, key=lambda point: (point.x, point.y)))) class Triangle: def __init__(self, p1,p2,p3): self.points = [p1,p2,p3] self.center, self.squared_r = self.calc_circumcircle() self.edges = {Edge(p1, p2), Edge(p2, p3), Edge(p3, p1)} self.neighbors = set() def get_neighbors(self, triangles): for triangle in triangles: if len(self.is_shared_edge(triangle)) == 2 and triangle is not self: self.neighbors.add(triangle) def calc_circumcircle(self): p1 = self.points[0] p2 = self.points[1] p3 = self.points[2] epsilon = 1e-10 denominator = 2 * (p1.x * (p2.y - p3.y) - p1.y * (p2.x - p3.x) + p2.x * p3.y - p3.x * p2.y) + epsilon x = ((p1.x**2 + p1.y**2) * (p2.y - p3.y) + (p2.x**2 + p2.y**2) * (p3.y - p1.y) + (p3.x**2 + p3.y**2) * (p1.y - p2.y)) / denominator y = ((p1.x**2 + p1.y**2) * (p3.x - p2.x) + (p2.x**2 + p2.y**2) * (p1.x - p3.x) + (p3.x**2 + p3.y**2) * (p2.x - p1.x)) / denominator squared_r = (x - p1.x) ** 2 + (y - p1.y) ** 2 return Point(x,y), squared_r def check_if_point_is_inside(self, p): d = (self.center.x - p.x)**2 + (self.center.y - p.y)**2 tolerance = 1e-10 radius_squared = self.squared_r + tolerance return d <= radius_squared def is_shared_edge(self, triangle): common_pts = set(self.points) & set(triangle.points) return common_pts def contains_vertex(self, triangle): for point in triangle.points: for self_point in self.points: if self_point == point: return True return False
what is wrong with my "batch insertion"
def batch_insert(self, entries: dict[int, SinglyNode]) -> None:
index = -1
positions = list(entries.keys())
minPos = min(positions)
maxPos = max(positions)
if minPos < -1: raise ValueError("Position Pair Argument Can't Be Less Than -1")
elif maxPos >= len(self): raise ValueError("Position Pair Argument Can't Exceed More Than The Linked List Length")
self.__nodes_length += len(positions)
if -1 in positions or self.__nodes_length - 1 in positions:
node = entries.get(-1) or entries.get(self.__nodes_length - 1)
self.ending_node.point = node
if 0 in positions:
node = entries[0]
prevStartNode = self.starting_node
self.starting_node = node
node.point = prevStartNode
target = self.starting_node.point
while target:
index += 1
if index == self.__nodes_length: break
elif index + 1 in positions:
node = entries[index + 1]
node.point = target.point
target.point = node
target = target.point
for 1 element, its all fine but when more, thats when there is a problem
i have a list of example 6 nodes
("0th Node", "1st Node", "2nd Node", "3rd Node", "4th Node", "5th Node")
and i wanna insert first a node on the second index a.k.a "2nd Node"
it works fine
but if i wanna insert a other node, since i'm modifying the linked list thus the length
the third index will be next to the second index
but i don't wanna copy the linked list or do expensive operations on it
the desired list should be for the batches
{
2: "1"
3: "4"
}
("0th Node", "1st Node", "2nd Node", "1", "3rd Node", "4", "4th Node", "5th Node")
Given the Codecademy code provided:
My first question is why did the linked list print out the output in this reversed order from what I was expecting.
Output-only Terminal
90 5675 70 5
https://discuss.codecademy.com/t/data-structures-printing-out-values/810796
Codecademy is the easiest way to learn how to code. It's interactive, fun, and you can do it with your friends.
Given the Codecademy code provided: Exercise Link: https://www.codecademy.com/courses/learn-data-structures-and-algorithms-with-python/lessons/learn-linked-lists-python/exercises/linked-lists-python-list-ii My first question is why did the linked list print out the output in this reversed order from what I was expecting. output Output-only...
in principle, can any piece of python code be divided into atomic operations?
what do you mean by atomic?
atomic wrt. concurrency: there is no way for the operation to fail if it occurs, i.e., it cannot be half-finished
in that sense, probably yes, but not in a useful way most of the time.
In CPython, bytecode instructions are atomic because of the global interpreter lock: https://docs.python.org/3/faq/library.html#what-kinds-of-global-value-mutation-are-thread-safe
Which means, as the FAQ points out, many basic operations like changing an element in a list are atomic.
But to what extent you can rely on this in practice, I'm not sure.
Thank you so much
computational complexity is introduced in a data structures and algorithms class right?
or what cs class is it introduced in?
yes, generally
Can anybody help me figure out why I am getting an unsimplified sum for my "scaled_sum" in this code?:
def add_scale_2(L, M, L_scale, M_scale);
N = min(len(L), len(M))
L_scaled = L * L_scale
M_scaled = M * M_scale
scaled_sum = [L_scaled + M_scaled]
return scaled_sum
this is ipython btw
Definitely
U simply cannot go about dsa without complexity
thanks to both!
something is wrong with my delete implementation on a linked list and idk what exactly is
def delete(self, position: int = -1) -> None:
"""Deletes a node on a given position. If the position is
either at the beggining or end, it will perform in O(1) otherwise if in between
nodes, it will perform in a average case O(n)
"""
index = 0
if position < -1: raise ValueError("Position Argument Can't Be Less Than -1")
elif position >= len(self): raise ValueError("Position Argument Can't Exceed More Than The Linked List Length")
self.__nodes_length -= 1
if position == -1 or position == self.__nodes_length + 1:
self.pre_end_node.point = None
return
elif position == self.__startingIndex:
self.starting_node = self.starting_node.point
self.__startingIndex += 1
return
target = self.starting_node.point
while target:
index += 1
if index >= self.__nodes_length - 1: return
elif index == position - 1:
target.point = target.point.point
break
target = target.point
the above is the normal version
and this is the batched version
def batch_delete(self, position: set[int]) -> None:
"""Deletes node(s) given position(s). If a position is either
at the beggining or end, it will perform in O(1) otherwise for the rest of the
positions that are in between the start or end, it will perform in a
average case O(n) time. This is the batch version of delete
"""
index = 0
minPos = min(position)
expectedSequence = set(range(minPos, minPos + len(position)))
if max(position) < -1: raise ValueError("Position Argument Can't Be Less Than -1")
elif min(position) >= len(self): raise ValueError("Position Argument Can't Exceed More Than The Linked List Length")
self.__nodes_length -= len(position)
diff = position.symmetric_difference(expectedSequence)
if diff == set():
for i in range(len(position)): self.starting_node = self.starting_node.point
return
target = self.starting_node
while target:
if index in position:
target.point = target.point.point
index += 1
continue
index += 1
target = target.point
If you need the __init__ method, here u go
def __init__(self, *args) -> None:
self.__startingIndex = 0
if len(args) == 0:
self.starting_node: SinglyNode = SinglyNode(None)
self.__current_node: SinglyNode = self.starting_node
self.__nodes_length: int = 1
self.ending_node: SinglyNode = self.starting_node
self.pre_end_node: SinglyNode = self.starting_node
return
target = SinglyNode(args[0])
self.starting_node: SinglyNode = target
self.__current_node: SinglyNode = self.starting_node
self.__nodes_length: int = len(args)
if len(args) == 1:
self.ending_node = self.starting_node
self.pre_end_node = self.starting_node
for index, i in enumerate(args[1:]):
node = SinglyNode(i)
target.point = node
if index == len(args) - 2:
self.pre_end_node: SinglyNode = target
target = target.point
self.ending_node: SinglyNode = target
for SinglyNode it consists of the data value and a point value
pls ping me if u have found the cause and how to fix it
i think delete is the one being wrong
Hello, I am trying to build an application to manage JustEat orders. I would like to match orders based on some given criteria, such as distance or time to reach it from the bar. Like, two orders with silimilar scheduled time and position to the map could be made with one only trip.
I managed to get the orders data (location and time).
Now I have to build the algorithm to match the orders. Before I start doing that, does anyone know any alhorithm that I could or should use?
You're talking about a filesystem / disk partition or volume? if so, this is certainly the wrong place. Ask in one of the off-topic channels, plz
Thank you, will do!
Traders ? 📊📊
Can anyone suggest an algorithm to solve a sudoku
Howdy, I'm working with queries and using sqlfluff to parse items in select clauses and from clauses. SQLfluff returns a nested json/list structure and I'm having some issues manipulating it. I'm just looking for suggestions on what I could use to manipulate/search nested trees. So far DFS looks good and I'm able return the parts of the json that I want. That said, whenever I try to modify the json using recursion I can't seem to get it to work? Any suggestions for algorithms for working with json?
Take a look at this: https://github.com/norvig/pytudes/blob/main/ipynb/Sudoku.ipynb
Wrong kind of algorithm. You probably want #data-science-and-ml
Oh, that's a wonderful repo. He even has the AOC's up there.
Thank you
Anyone have any advices about learning QGis
I am just starting and it feels so hard to learn
i don't quite get what is a bit array
i mean i get that its basically a binary sequence
but tf is this?
def set_bit(self, index):
if index < 0 or index >= self.size:
raise ValueError("Index out of bounds")
chunk_index, bit_offset = divmod(index, 32)
self.bits[chunk_index] |= 1 << bit_offset
mhm, also one question
instead of a array why not represent the bit array as a string
it consumes way less memory(i've seen about 36KB difference in a 50k bit arrays)
so there must be a drawback?
guys, how do i remove Tree ws in my Lark list?
result = [['```\n1#3d6(5)(3)(5)[13]<[13]>\n2#3d6(1)(5)(4)[10]<[10]>\n3#3d6(3)(2)(5)[10]<[10]>\n4#3d6(4)(2)(2)[8]<[8]>\n5#3d6(6)(3)(4)[13]<[13]>\n6#3d6(1)(5)(2)[8]<[8]>\n\n```\n🎲 **__Total__** = 62'], Tree(Token('RULE', 'ws'), [Token('WS', ' ')]), [['```\n5d10(6)(9)(3)(4)(6)[28]* 10[10]\n```\n🎲 **__Total__** = 280', 280]]]
like, i want to remove WS from my thing
oh wait, i have an idea
mhm
this is a silly thing if you're working with python ints
if the underlying thing is something like a numpy array of uint32 then you will actually be memory efficient
storing 8 bits per byte
mhm
should be ~8x more memory efficient than a string
tbh this is a bit array, i mean like we are storing bits
so why uint32 when we can do further down
you gave an example with 32 bits, no?
yes
but like
this is a array of bits
[0, 1, 0, 1, 1, 1] example
or for the string representation it would be
"010111"
(srry if i kinda suck at data structures & algorithms)
the point is that you can pack bits tightly into integers
using larger int types like 32 or 64 isn't necessary but could make some operations more efficient
a string like the one you showed would use something like 1 byte to represent 1 bit
if you had instead used an uint8 (which is 1 byte) and done this kind of shifting from before, you can store 8 bits in the same space
so in essence, representing it with string is not necessary
e.g. this could be fit into a single byte with room to spare, 1+2+4+16 = 23 = 0b010111
representing it as a regular string is even wasteful
can i get some example code?
of what exactly? 
for the uint8 thingy
you have the relevant code for 32 bits already, switching to 8 is trivial
the one thing need to make things memory efficient is to let the underlying storage be whatever fixed size int you choose, be it 8 or 32 bits
!e here is a simple bitset
import numpy as np
class Bitset:
def __init__(self, n):
self.n = n
self.data = np.zeros((n+7)//8, dtype=np.uint8)
def set(self, i):
self.data[i//8] |= 1 << (i%8)
def get(self, i):
return (self.data[i//8] >> (i%8)) & 1
def __repr__(self):
return ''.join(str(self.get(i)) for i in range(self.n))
bs = Bitset(13)
bs.set(0)
bs.set(1)
bs.set(2)
bs.set(4)
bs.set(11)
print(bs)
@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.
1110100000010
!e or equivalently to store in uint32
import numpy as np
class Bitset:
def __init__(self, n):
self.n = n
self.data = np.zeros((n+31)//32, dtype=np.uint32)
def set(self, i):
self.data[i//32] |= 1 << (i%32)
def get(self, i):
return (self.data[i//32] >> (i%32)) & 1
def __repr__(self):
return ''.join(str(self.get(i)) for i in range(self.n))
bs = Bitset(13)
bs.set(0)
bs.set(1)
bs.set(2)
bs.set(4)
bs.set(11)
print(bs)
@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.
1110100000010
am i limited to only 8, 32... etc or can i do uint9 or uint11... etc?
also when i put n = 100, for the data. When i do sys.getsizeof i see that the numpy array consumes more memory
than a normal list
!e
import numpy, sys
l1 = numpy.zeros(131//32, dtype=numpy.uint32)
l2 = [0] * (131 // 32)
print(f"Numpy Array: {sys.getsizeof(l1)}")
print(f"Normal List: {sys.getsizeof(l2)}")
@glossy lintel :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | Numpy Array: 128
002 | Normal List: 88