#algos-and-data-structs
1 messages ยท Page 52 of 1
i agree with your analysis though ("should only need n operations here to fill the remaining nodes")
!e
class segment:
def __init__(self, data, f):
self.f = f
self.n = n = len(data)
self.data = [-10**9] * n + data
for i in range(2 * n)[::-1]:
self.data[i//2] = f(self.data[i//2], self.data[i])
def query(self, l, r):
l += self.n
r += self.n
ans = -10**9
while l < r:
if l&1:
ans = self.f(ans, self.data[l])
l += 1
if r&1:
r -= 1
ans = self.f(ans, self.data[r])
l//=2
r//=2
return ans
segment_tree = segment([-1, 2, 1, 4, -2, -3, 2], max)
print(segment_tree.query(0, 6))
print(segment_tree.query(0, 2))
print(segment_tree.query(0, 4))
print(segment_tree.query(6, 7))
print(segment_tree.query(5, 7))
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | 4
002 | 2
003 | 4
004 | 2
005 | 2
there is so little information about O(2n)-space segment trees on the internet, it's awful
self.data[i//2] = f(self.data[i//2], self.data[i]) just looks really wrong tbh
like why and how can you have self.data[i//2] affect itself during build
especially if you init it self.data = [0] * n + data
try an array with all negative numbers and see how that works out
I sould init to -10**9 to be fair
then it wouldn't work if f is min
for sure that implementation is only if the initial array is positive
so some tweeks are to be done if negative numbers
i*2+2 seems to be a problem when i = len(nums)-1
which makes sense i guess
that's the thing it should be thought of as a set of perfect binary trees rather than a big segment tree I think and I cannot wrap my head around how to visualise this unless someone shows me (like in the blog) the set of perfect binary trees - surely there's an easy method to visualise it from the seg_tree array of length 2n
@regal spoke hey do you reckon you could enlighten us quickly? sorry for pinging
so I've taken the c++ code of the blog and translated it into python (https://pastebin.com/7rc2eiDF)
And I get this (1-indexed) array [-1000000000, 4, 4, 2, 2, 4, 2, -1, 2, 1, 4, -2, -3, 2]
which this times work
in purple is the initial array
however I have no idea where people see this seg tree as a set of perfect binary trees and the blog author's image makes no sense to me still
dunno, I always thought of it as just 1 big binary tree
this is some wizardry stuff
i can only find two sources of codes
both sources work
and i understand nothing
there's probably more, and it's probably buried deep in some archaic chinese blogpost
yeh i mean great ๐
ok i get it
if you do the standard zkw steps
it works
it's magic
but it works
you just pretend it's a complete perfect binary tree
like when you use the closest greater power of 2
and you do zkw iterations upwards
NO IDEA why it ends up being a set of perfect binary trees and why people even talk about that though
I've discussed this with people in the past. It is correct that the iterative non-power of 2 segment tree consists of multiple binary trees
If n is a power of two, then index 0 in the 2n array is unused
In the case where n is not a power of 2, then more indices become unused
It is always true that left nodes have even indices and right nodes have odd indices
Also, the parent of a node is at index//2
Most of the confusion comes from people trying to interpret the unused indices as part of some binary tree, while they should really just be ignored (similar to how index 0 is ignored when n is a power of 2)
There are simply unused gaps in the array
in the non-power of 2 segment tree, there aren't any gaps in the array right?
the segment tree of 0 5 0 2 -4 6 is [-1000000000, 6, 6, 5, 2, 6, 0, 5, 0, 2, -4, 6] which can be represented as
There are gaps or unused indices (or whatever you want to call them)
and from what I understood, it worked perfectly
the only thing i don't see are the set of binary trees
and the gaps
using this code
class SegmentTree:
def __init__(self, n):
self.n = n
self.t = [-10**9] * (2 * n)
def build(self):
for i in range(self.n - 1, 0, -1):
self.t[i] = max(self.t[i << 1], self.t[i << 1 | 1])
def modify(self, p, value):
p += self.n
self.t[p] = value
while p > 1:
p >>= 1
self.t[p] = max(self.t[p << 1], self.t[p << 1 | 1])
def query(self, l, r):
"""
l included
r included
"""
res = -10**9
l += self.n
r += self.n
while l <= r:
if l & 1: # l is a right child
res = max(self.t[l], res)
l += 1
if not r & 1: # r is a left child
res = max(self.t[r], res)
r -= 1
l //= 2
r //= 2
return res
Let's say n is odd, then the first guy (at index n) is alone in his binary tree
Anything "above" him is unused
So index n//2, n//2//2, ... are all unused
The query function will never look up an unused index
for the array -1 2 1 4 -2 -3 2 I (odd this time) have this segment tree [-1000000000, 4, 4, 2, 2, 4, 2, -1, 2, 1, 4, -2, -3, 2] which I draw like this.
You're visualising it in a way that I don't understand and i think that's what the blog is trying to explain as well but I'm not getting it
Gimme me a minute
yeah of course
Here is the code to generate the indices of the roots
# Compute the indices of the roots
def find_roots(self):
l = self.n
r = 2 * self.n - 1
odd_roots = []
even_roots = []
while l <= r:
if l & 1: # l is a right child
odd_roots.append(l)
l += 1
if not r & 1: # r is a left child
even_roots.append(r)
r -= 1
l //= 2
r //= 2
return odd_roots + even_roots[::-1]
For n=7 it becomes
!e
# Compute the indices of the roots of a segment tree
def find_roots(n):
l = n
r = 2 * n - 1
odd_roots = []
even_roots = []
while l <= r:
if l & 1: # l is a right child
odd_roots.append(l)
l += 1
if not r & 1: # r is a left child
even_roots.append(r)
r -= 1
l //= 2
r //= 2
return odd_roots + even_roots[::-1]
print(find_roots(7))
@regal spoke :white_check_mark: Your 3.12 eval job has completed with return code 0.
[7, 2, 6]
There are 3 roots, so there are 3 binary trees
hmm
Here they are
The values is in the boxex, and the indices are underneath
As you can see, the roots are at indices 7, 2 and 6
Indices 0,1,3 are all unused
the rationale being when you're at index 6 for example maybe it's left that's currently a right child so it "jumps" to another binary tree?
I think of it more like you've reached the root (the end) of that binary tree
But ye
In build and modify here, you could technically make it ignore the unused indices
But practically it is easier to just let build and modify update all indices (even the unused ones). The updates to the unused indices wont matter
so I've followed what's happening
it's a bit messy
but basically it just magically works
all the time
even though initially l and r are "not on the same floor"
they get adjusted at the end
no matter the example
that's why I was wondering what's even the point of considernig the segment tree as a set of perfect binary trees
if representing like this just works perfectly
and it never uses unused indices
since literally my segment tree array doesn't have any apart from the first index
but that's just because i 0-indexed everything
Think of it like there are multiple perfect binary trees. Furthermore, the query function is able to jump to the next tree when it reaches a root.
And you should completely ignore unused indices. They are never useful for anything. I think they should be left out from the picture
The are multiple ways you can use to go from the root of one tree to the next tree. The simplest would probably be to xor the root index by 1. Query is a bit more fancy and essentially does index//2 ^ 1
i see
question: what do you mean by unused indices since literally there aren't any unused values in my segment tree array?
you mean the indices there would've been if i had done it in size O(4n)?
with the closest power of 2
For n = 7, indices 0,1,3 are unused
You could put whatever crap you want into them
It doesn't matter
Query will never try to access those indices
here? 1 and 3 are never used??
ah
ahhhh
Again, it is very similar to how index=0 works in the power of 2 case
It is there
But never really used for anything
but never ever do we end up with l=1 and r=1 for example?
Yes
query is always working with indices that correspond to nodes in the binary trees
and will never try to access any (unused) indices outside of the trees
It is surprising that query works even when n is not a power of 2. But it is not "magic"
Some other functionality of segment trees can break if n is not a power of 2. For example, suppose you want to find the index corresponding to the largest value in the initial array
Normally (if n is a power of 2) you would start at index 1 and "walk down" the segment tree (in the direction of the largest value).
If n is not a power of 2, then you first have to figure out which of the binary trees contain the largest value (by iterating over all the roots)
Once you've figured out which root to use, then you can just walk down the tree like usual
So not everything magically works when n is not a power of 2, but query does work
why not just store max, idx_of_max in each node then?
I guess we end up doubling the space
this is so much clearer now btw thanks!
I assume not padding (to closest greater power of 2) allows for slightly improved performance right?
A little bit, yes. You save memory by not padding. But I mostly don't bother with rounding up to next power of 2 because that makes the code slightly longer
You could do that, but it is a hassle to do it that way.
Walking from the root can be used for other things than finding the largest value. You could for example use that method to find the first index with value >= x
And for that example storing (max, idx_of_max) wouldn't cut it.
i guess initializing an array potentially twice as big matters when the input sizes are around 10^5?
I've only had it matter in extreme cases where the memory limit was intentionally tight
This happened on a problem with a 2D segment tree (made by putting a 1D segment tree in each node of a segment tree).
Then you have to pay for the padding twice. That quickly becomes expensive
Btw I think of it in general like any interval [L, R] corresponds to some set of perfectly balanced binary trees.
The roots of these trees are the indices that the query-function accesses
# Compute the indices of the "roots" of interval [l, r]
def find_roots(n, l, r):
# l,r inclusive
l += n
r += n
odd_roots = []
even_roots = []
while l <= r:
if l & 1: # l is a right child
odd_roots.append(l)
l += 1
if not r & 1: # r is a left child
even_roots.append(r)
r -= 1
l //= 2
r //= 2
return odd_roots + even_roots[::-1]
These will always be subtrees of the binary trees making up the entire segment tree
One functionality you can add to your segment tree is to find the maximum in some interval [l, r] using find_roots
The nice thing about having this is that it works both if n is a power of 2 and if n is not a power of 2
is segment tree kinda like quadtree, but in 1d?
Yes. But a 2D segment tree (putting a segment tree in the nodes of a segment tree) is not the same thing as a quad tree.
I don't think I've ever seen quad trees be used in competitive programming
The issue with quad trees is that a 2D interval (a rectangle) can consist of n log n number of quad tree squares. So using a quad tree to solve 2D range queries can be really slow.
also, is it really faster to do bit shifts?
i see loads of people doing & 1 instead of % 2
and << 1 instead of *=2
but i tried to timeit &1 and %2 and didn't see any difference
and bit shifting is always like a hassle for my small brain
like why would any one use (i << 1) | 1 instead of 2*i+1
!e
import time
i = 10
start_time = time.time()
for _ in range(500000):
a = (i << 1) | 1
print(time.time()-start_time)
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
0.043749332427978516
!e
import time
i = 10
start_time = time.time()
for _ in range(500000):
a = 2 * i + 1
print(time.time()-start_time)
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
0.033731698989868164
I don't think there is a difference. My guess is that they are equivalent. Maybe you can see a difference by comparing x//y (where y=2) and x>>y (where y=1).
!e
import time
i = 2
start = time.time()
for _ in range(1000000):
b = i // 2
print(time.time() - start)
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
0.06850481033325195
!e
import time
i = 2
start = time.time()
for _ in range(1000000):
b = i >> 1
print(time.time() - start)
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
0.06655597686767578
I mean I guess
!e
import time
i = 2000
start = time.time()
for _ in range(1000000):
b = i // 2
print(time.time() - start)
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
0.0827324390411377
!e
import time
i = 2000
start = time.time()
for _ in range(1000000):
b = i >> 1
print(time.time() - start)
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
0.07913565635681152
yeah might be slightly faster for //2
but really isn't for any other bit manipulations as far as i can tell
and I'm sorry but who in their right mind thinks i << 1 | 1 is clearer than 2i+1
there is a difference in languages like C
in C there is no power operator, so you cant just 2**i + 1, you have to call some function from standard library. But (1 << i) | 1 is shorter, faster, does not require library, ...
and it is not that hard if you are familiar with bitwise operations, there are much more cursed ways to do stuff with them
i'm just multiplying by 2
to raising to the pwoer of i
for that i guess i agree if you need to call some pow function
but for 2*i+1
it's a bit weird
multiplying by does not always give you a power of 2
although i guess i'm starting to get used to bitwise operations
!e
import time
i = 10
start_time = time.time()
for _ in range(500000):
a = 2 ** i + 1
print(time.time()-start_time)
@lean walrus :white_check_mark: Your 3.12 eval job has completed with return code 0.
0.06015491485595703
yess yes of course i wasn't really trying to get a power of 2 in the first place
2**i is ~2 times slower than other operations
ah, then ok
!e
import time
i = 10
start_time = time.time()
for _ in range(500000):
a = (1<<i) + 1
print(time.time()-start_time)
please keep in mind that 1<<i + 1 is not the same as (1<<i) + 1
yeh i just realised
!e
import time
i = 10
start_time = time.time()
for _ in range(500000):
a = (1<<i) + 1
print(time.time()-start_time)
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
0.05784440040588379
in my opinion it was a bad design choice to use that operator precedence, but anyway
i agree
i got used to it since 1<<i-1 is quite common
in sets represented as bitmasks
damn i finished rewriting my O(2n) segment tree with 0-indexing
always hard
but i gain 1 bit worth of space
Ok, even with the optimised 2n space segment tree, I cannot find a way to pass (it gets TLE) this codeforces problem: https://codeforces.com/gym/518039/problem/A (that's the whole reason i tried so hard to understand this optimised segment tree in the first place)
Obviously it works if I translate everything in C++ but is there anyway to make a python/pypy submission pass?
Here is my code that currently works but gets TLE on test 9: https://pastebin.com/2nRjPVpZ
Any idea on how to further optimise it?
I have an interesting problem/question.
Background info
There is a game which (oversimplifying) consists of planets and spaceships.
On each planet there is a teleporter that can instantly teleport any amount of ships to another planet.
Once you teleport ships, two teleporters (source and destination) go into inactive state for some constant amount of time (for example, 10 minutes.). You cant use teleporter if it is in inactive state.
Example 1
There are 4 planets 1-4, there are ships on each of them. The goal is to collect all ships on planet 1.
Possible way to do that:
time src dest
0 4 -> 3
0 2 -> 1
# now all teleporters are inactive, you have to wait for 10 minutes
10 3 -> 1
# now all ships are transferred, task is done
# also, at the end teleporters 1 and 3 are inactive, and they will activate at time 20
20 now all teleporters are active, you are free to use them again
Example 2
There are 4 planets 1-4, there are ships on planet 1. The goal is divide ships between planets 2-4.
Possible way to do that:
time src dest
0 1 -> 2
# 1 and 2 are inactive
10 1 -> 3
10 2 -> 4
# at the end all teleporters are inactive, and they will activate at time 20
20 now all teleporters are active, you are free to use them again
Question 1
There are N planets with ships. What is the fastest way to collect all of them on one planet?
What is the exact formula of total time?
||it should be about O(log(N))||
Question 2
There are N planets, ships are on one of them. What is the fastest way to spread ships across all planets?
What is the exact formula of total time?
||it should be about O(log(N)) as well||
Your link to the problem doesn't work for me since I'm not registered for the gym
So I dont know which problem
wtf
on the front page
it says
We are happy to invite you to participate in National University of Singapore (NUS) CS3233 Final Team Contest 2024 Mirror on Monday, April 15, 2024, at 09:15 UTC. CS3233 is a course offered by NUS to prepare students in competitive problem solving.
And Iclicked on the link
ohhhh yes
it is an invitation
sorry i didn't even know that was a thing
ok that worked
So they are basically asking you for the 2 smallest values and the two largest values in (l,r)
RMQ would do this in O(1) time per query
what does RMQ stand for?
range minimum query
Is it the "sparse table" of this article?
I think of it like RMQ is built ontop of a sparse table. But I think people sometimes use the words interchangably
So that's what you think you'd do?
You can make an RMQ based on other data structures than a sparse table. For example it is pretty easy to get RMQ down to O(n log log n) build time and O(1) per query
no
The idea is that if you have precomputed the min of all intervals of power of 2 length
then you can easily solve minimum range queries in O(1) time
Since any interval can be written as the union of two intervals with a power of 2 length
two only?
that's not what they say
ahh
what about interval [0, 13] for example I'm confused
Ah you do something like
[0, 7] and [6, 13]?
[0, 13) = [0, 8) union [13 - 8, 13)
okok I see
The power of 2 you use is interval length rounded down to closest power of 2
makes sense yes
so I'd use 4 sparse tables?
one for max1
one for min1
one for max2
one for min2?
Maybe that is faster
But the RMQ I linked support custom functions
So just define your own function that can join 4 values, and you are set
If this doesn't pass then you could try optimizing the RMQ to take O(n log log n) time to precompute
It is not that much extra work to get that going
I'm trying RMQ first
I need to change self._data = _data = [list(data)] this too right?
my nodes were defined as self.t[self.n + i - 1] = (nums[i], -10**9, nums[i], 10**9)
my leaves
otherwise I can't merge
since I'll have ints and tuples
So data should be a list of 4-tuples
Ye you shouldnt have to modify any of the code inside of RangeQuery
hm
-10**9 and 10**9 looks dangerous
They become big int if you mult 3 of them
Its probably better to just define inf as 10**4+1
!e
class RangeQuery:
def __init__(self, data, func):
self.func = func
self._data = _data = [list(data)]
i, n = 1, len(_data[0])
while 2 * i <= n:
prev = _data[-1]
_data.append([func(prev[j], prev[j + i]) for j in range(n - 2 * i + 1)])
i <<= 1
def query(self, start, stop):
"""func of data[start, stop)"""
depth = (stop - start).bit_length() - 1
return self.func(self._data[depth][start], self._data[depth][stop - (1 << depth)])
def __getitem__(self, idx):
return self._data[0][idx]
def merge(node1: tuple[int, int, int, int], node2: tuple[int, int, int, int]):
if node1[0] >= node2[0]:
max1 = node1[0]
max2 = max(node2[0], node1[1], node2[1])
else:
max1 = node2[0]
max2 = max(node1[0], node1[1], node2[1])
if node1[2] <= node2[2]:
min1 = node1[2]
min2 = min(node2[2], node1[3], node2[3])
else:
min1 = node2[2]
min2 = min(node1[2], node1[3], node2[3])
return max1, max2, min1, min2
rq_mind: RangeQuery = RangeQuery([(1, -10**9, 1, 10**9), (2, -10**9, 2, 10**9), (2, -10**9, 2, 10**9), (2, -10**9, 2, 10**9), (10, -10**9, 10, 10**9)], func=merge)
print(rq_mind.query(0, 3))
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
(2, 2, 1, 2)
You can simplify it to
if node1[0] >= node2[0]:
max1 = node1[0]
max2 = max(node2[0], node1[1])
else:
max1 = node2[0]
max2 = max(node1[0], node2[1])
if node1[2] <= node2[2]:
min1 = node1[2]
min2 = min(node2[2], node1[3])
else:
min1 = node2[2]
min2 = min(node1[2], node2[3])
ah yes
!e
class RangeQuery:
def __init__(self, data, func):
self.func = func
self._data = _data = [list(data)]
i, n = 1, len(_data[0])
while 2 * i <= n:
prev = _data[-1]
_data.append([func(prev[j], prev[j + i]) for j in range(n - 2 * i + 1)])
i <<= 1
def query(self, start, stop):
"""func of data[start, stop)"""
depth = (stop - start).bit_length() - 1
return self.func(self._data[depth][start], self._data[depth][stop - (1 << depth)])
def __getitem__(self, idx):
return self._data[0][idx]
def merge(node1: tuple[int, int, int, int], node2: tuple[int, int, int, int]):
if node1[0] >= node2[0]:
max1 = node1[0]
max2 = max(node2[0], node1[1])
else:
max1 = node2[0]
max2 = max(node1[0], node2[1])
if node1[2] <= node2[2]:
min1 = node1[2]
min2 = min(node2[2], node1[3])
else:
min1 = node2[2]
min2 = min(node1[2], node2[3])
return max1, max2, min1, min2
A = [1,2,2,20,10]
inf = 10**9
rq_mind = RangeQuery([(a,-inf,a,inf) for a in A], func=merge)
print(rq_mind.query(0, 3))
print(rq_mind.query(0, 4))
@stiff quartz :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | (2, 2, 1, 2)
002 | (20, 20, 1, 1)
it's a bit weird it says 1 1
Please make it more readable by doing
A = [1,2,2,20,10]
inf = 10**9
rq_mind = RangeQuery([(a,-inf,a,inf) for a in A], func=merge)
Oh now I see the issue. When the intervals intersect, the rmq might return duplicates
ehm
ah
yes
damn
maybe on rmq for max1, one for max2 etc... ?
but i feel that's going to not work in our favour for the TLE problem
and also no just initializing data will be weird
for max2 & min2
If you know the index for max1, then you could in O(1) time find max2 by splitting the interval in two
Then do the same for min
This would require 2 rmq's
where you store pairs (value, index)
looking on the right and on the left?
of that index?
yes
that's smart
so i don't need this merge function either
damn
I probably need the nloglogn
which test case do you TLE on?
9
import sys
from typing import Iterator # noqa
user_input = sys.stdin.readline
def invr() -> Iterator[int]:
return map(int, user_input().split())
def line_integers() -> list[int]:
return list(map(int, user_input().split()))
class RangeQuery:
def __init__(self, data, func):
self.func = func
self._data = _data = [data]
i, n = 1, len(_data[0])
while 2 * i <= n:
prev = _data[-1]
_data.append([func(prev[j], prev[j + i]) for j in range(n - 2 * i + 1)])
i <<= 1
def query(self, start, stop):
"""func of data[start, stop)"""
if start >= stop:
return (10**4+1, -1) if self.func == min else (-10**4-1, -1)
depth = (stop - start).bit_length() - 1
return self.func(self._data[depth][start], self._data[depth][stop - (1 << depth)])
def __getitem__(self, idx):
return self._data[0][idx]
def solve():
n: int
q: int
n, q = invr()
a: list[int] = line_integers()
seg_tree_max: RangeQuery = RangeQuery([(a, i) for i, a in enumerate(a)], func=max)
seg_tree_min: RangeQuery = RangeQuery([(a, i) for i, a in enumerate(a)], func=min)
for _ in range(q):
l, r = invr()
l -= 1
r -= 1
max1, max1_idx = seg_tree_max.query(l+1, r)
max2 = max(seg_tree_max.query(l+1, max1_idx)[0], seg_tree_max.query(max1_idx+1, r)[0])
min1, min1_idx = seg_tree_min.query(l+1, r)
min2 = min(seg_tree_min.query(l+1, min1_idx)[0], seg_tree_min.query(min1_idx+1, r)[0])
lr = a[l] * a[r]
if lr > 0:
best = max(
max1 * max2,
min1 * min2
)
else:
best = min(
min1 * max1,
min1 * min2,
max1 * max2
)
print(
lr * best
)
if __name__ == "__main__":
solve()
to reproduce
What do you do if an interval is empty?
Looks like you might have a bug
Oh you hardcoded stuff into the RMQ ๐
I have heard a lot on youtube that it is important to learn about data structures and algorithms, but what does that actually mean and why is it so important to learn? For example I have heard a lot about binary trees and seen some code for it - but why is it so good to know that and when and what do you even use it for? (sorry for such a confusing question, but I am a bit confused too....)
at least it's a simple rule in that bitwise ops all have lower precedence
I got it to pass
Damn
Turns out that partly PyPy is being stupid
You always do though
?
Did you do RMQ?
yes
If you add
for _ in range(1): pass
into query. Then this code gets TLE on TC11
This weird performance bug is a known PyPy bug that we have reported in the past
The developers of PyPy was the ones who showed me the empty for loop trick
It stops PyPy JIT from attemting to optimize the function
The main optimization I did is that instead of storing (A[i], i) in the RMQs, I store (A[i] << 32) | i.
That way it just have to work with integers
Which together with adding the empty for loop to query made it run in 1.3s
Why would it not work with integers in my case?
You have a tuple (a pair)
I just have a single integer
Ahhhhh
val = seg_tree_max.query(l+1, r)
max1 = val >> 32
max1_idx = val ^ (max1 << 32)
max2 = max(seg_tree_max.query(l+1, max1_idx)>>32, seg_tree_max.query(max1_idx+1, r)>>32)
val = seg_tree_min.query(l+1, r)
min1 = val >> 32
min1_idx = val ^ (min1 << 32)
min2 = min(seg_tree_min.query(l+1, min1_idx)>>32, seg_tree_min.query(min1_idx+1, r)>>32)
My trick makes the code look a bit ugly
But it speeds it up too
seg_tree_max: RangeQuery = RangeQuery(data, func=max)
seg_tree_min: RangeQuery = RangeQuery(data, func=min)
Yes yes sorry

That was my bad
No worries
How the hell is one int better than a pair of ints
If the int is like super big
Surely it takes twice as many bits
Or maybe itโs just tuples having overhead?
It is not a big int. So it is just as big as any other (small) int
All (small) ints in Python are basically 64 bit
Oh yes 2^32 is actually big enough
To encompass the 10^4
Ok thatโs fair thatโs fair
2 ^ 32 * 10^4 is not even close to 2^64
Well that was quite the effort
I would not use this trick had it become a big int
We are still not close to highly optimized Python code
Removing the class would probably speed up things siginificantly. Also we could switch to n log log n RMQ
Would not be surprised if it is possible to cut down the time by like a factor of 5
Did you need to use this AND the <<32?
yes
Without the empty for loop it TLEs on tc9
Btw I think your if statement in queue might have been the thing that triggered the PyPy bug
PyPy JIT is stupid
Because I modified the not-to-modify-class
Karma got me
Speaking about PyPy bugs. Try running these in PyPy
def f():
for i in range(2000):
for j in range(1040):
i == 0
j % 6 == 0
f()
def f():
for i in range(2000):
for j in range(1041):
i == 0
j % 6 == 0
f()
The ||lower|| takes about ||40|| times longer to run
Isn't it wonderful
When PyPys JIT sees a pattern (here it is i==0 being true in the first outer loop iteration), then it won't ever forget it. Even if it is false for the next 1999 iterations
PyPy arbitrarily has 1041 hardcoded into its JIT, making it remember a pattern after seeing it for >= 1041 iterations
This type of performance bug can easily trigger when you are for looping over something like a grid or a matrix, since usually there is something special about the first row. If the matrix has size >= 1041 then PyPy will belive that pattern holds for all of the rows
Fun fun
The trick to get away from PyPy JIT perfomance bugs like this is to add an extra empty for loop
def f():
for i in range(2000):
for j in range(1041):
for _ in range(1): pass
i == 0
j % 6 == 0
f()
This fix still runs slow, but not 40 times slower
So for i between 1 and 1999 it thinks i==0 returns true but actually is false like some sort of cache but that fails?
Yes
Related to branch prediction fails?
It is similar
Fair fair
Except pypy somehow doesnโt ยซย forgetย ยป when wrong too many a times?
What actually causes the performance regression? Some kind of really slow cache lookup?
interesting fact: you can observe branch misprediction from python
is there branch prediction in python?
i thought it was only pypy
Just thought about it again, RMQ has a O(nlogn) preprocessing time and O(1) query time whilst segment trees have O(n) preprocessing time and O(logn) query time. So are we not just moving the logn from query to preprocessing? Is there really a difference?
how many queries are you doing? do you care about updates? etc
no updates
both n and q are between 1 and 5*10^5
it would obviously be good to do RMQ if q was like 10^6 and n 10^4 I assume
yeah, it depends
segment trees does give loads more flexibility
so a nice tool to have
yeah for sure
so I feel like RMQ
doesn't really have any application that seg tree doesn't cover? EXCEPT when q >> n
maybe the gymnastic pajenegod ended up doing to make my submission pass could've been done with a segment tree
@regal spoke I had a GREAT idea
I managed to make it pass without the << 32
instead of storing (index, value) in the sparse table
I just store the index
reg_tree_max: RangeQuery = RangeQuery(list(range(len(a))), func=lambda x, y: x if a[x] > a[y] else y)
reg_tree_min: RangeQuery = RangeQuery(list(range(len(a))), func=lambda x, y: x if a[x] < a[y] else y)
I guess it's still like a lot slower than yours though rip (a bit less memory though obviously)
Two things.
- The n log n work in the RMQ is very cache friendly + fast running code (no if statements or anything else advanced going on). This makes RMQ have a better constant than segment tree even if their time complexity is theoretically the same when n = q.
- RMQ can easily be sped up to O(n log log n). It is even possible to reach O(n) in the special case when f=min or f=max.
Thanks!
for t(h,w)=hw+2h+2w
can I drop 2h and 2w as lower oder term
so hw+2h+2w in Big Theta(hw)
yes
(hw + 2h + 2w)/hw = 1 + 2/w + 2/h โ 1 as w and h get large
1 is a constant
so hw+2h+2w is in ฮ(hw)
yup that adds up
ig the question is how do you come up with the higher order term
it is kinda confusing when you got two variables
if you ahve hw + h^1000
is hw still the dominant term?
Hi, regarding Depth first search (DFS) and Breadth first search (BFS), do the search allow visiting the same nodes multiple times or no ?
no
In the case of DFS it depends on your definition of visiting. You can visit the same node multiple times "on your way back".
Sometimes you hear people talking about pretraversal and posttraversal.
Pretraversal refers to the nodes ordered according to the first time you visited them in the DFS
Posttraversal refers to the nodes ordered according to the last time you visited them (i.e. "on your way back") in the DFS
For BFS, if Sibiu is the goal, when searching from Arad to Sibiu, will it stop immediately?
or will it also search for the adjacent node with same parents (Timisoara and Zerind) and stop ?
Sibiu could be found either as the 2nd node, 3rd node or 4th node depending on the order you explore Arad's neighbours in
If sibiu happens to be first out its other two neighbours then you could exit early
If you are lazy, you could also just explore the entire graph with a bfs starting at Arad and not do any kind of early exit at. Then once you are done you check if/when you saw Siblu
it could. one of the drawbacks with DFS is that you might explore the wrong branch
i have a task asking for 'number of states explored' output . thats why im asking these question
I think they are interested in the case where you early exit once you've found your goal
u mean once sibiu is immediately found, then it stop ?
Yes. Once sibiu is found, then stop immediately
otherwise, comparing the search algorithms is a bit pointless. they'll all eventually search all the nodes
BFS is nice in the sense that if your start and goal are close in distance, then you are guaranteed to quickly reach the goal
DFS on the other hand can walk off somewhere in completely the wrong way, and take ages to reach the goal
What does this page say ? I think its saying that if goal is reached, dont stop. It will still visit the nodes with same parents then stop.
the goal here is Bucharest, right? it hasn't been found yet
this example ends at this page
is this dfs path correct ?
That could be a DFS path. The only rule for DFS is to not turn around until everything infront has been explored
why go to zerind ? why not go to timisoara or sibiu ?
which edge it explores first is arbitrary
there is no "the DFS", there are loads of possible DFSs
the only rule is that you go deep before backtracking
what do you mean deep ?
deep as in longer paths
keep extending the current path until you can't expand anymore, when there are no other options you're allowed to take a step back
there are multiple valid dfs paths, and what you have is one of them
i thought there is only 1 solution path ?
there may exist multiple valid dfs paths in 1 graph
here that's the case, and you have 1 of those valid dfs paths shown in red
I didn't knew python did this?
anyone can pls tell me why this happens would be very helpful, basically I don't know why it's creating a global memo variable
Default arguments are evaluated exactly once, you'll need to
def howsum(a, b, memo=None):
if memo is None: memo = {}
i'd like some guidance on LC 933 number of recent calls
`You have a RecentCounter class which counts the number of recent requests within a certain time frame.
Implement the RecentCounter class:
RecentCounter() Initializes the counter with zero recent requests.
int ping(int t) Adds a new request at time t, where t represents some time in milliseconds, and returns the number of requests that has happened in the past 3000 milliseconds (including the new request). Specifically, return the number of requests that have happened in the inclusive range [t - 3000, t].
It is guaranteed that every call to ping uses a strictly larger value of t than the previous call.`
i'd like to use a deque for the requests
i've imported deque from collections, and i'm going to use it in my class RecentCounter. however, there are two methods as part of the RecentCounter class (init and ping), and i'm not sure how to make them work together
from collections import deque
class RecentCounter:
def __init__(self):
self.queue = deque()
def ping(self, t: int) -> int:
while self.queue and self.queue[0] < t-3000:
self.queue.popleft()
self.queue.append(t)
return len(self.queue)
here is the answer
i need to find a way to get guidance from chatgpt without it just writing the damn answer for me
class Solution:
def maxDepth(self, root: Optional[TreeNode]) -> int:
if root is None:
return 0 # Base case: if there's no node, the depth is zero.
# Recursively find the depth of the left and right subtree.
left_depth = self.maxDepth(root.left)
right_depth = self.maxDepth(root.right)
# The depth of the tree rooted at this node is 1 + the maximum of the depths of the subtrees.
return 1 + max(left_depth, right_depth)
can someone help me understand how this recursive formula keeps track of the depth
its seemingly not kept anywhere
wdym kept?
like i don't understand how it can yield an integer describing the depth when that's not kept as a variable anywhere and its not keeping track of the recursive calls (i know it must be in some manner im just not seeing how)
i understand 872 leaf-similar trees better even thought its more involved
well, left_depth and right_depth are local variables. While the second function call is being made, left_depth is kept as a local of this particular function frame.
Perhaps consider this example:
def tree_node_count(tree):
if not tree:
return 0
return tree_node_count(tree.left) + tree_node_count(tree.right) + 1
Here, there's no local variables except tree at all. While the second term in that return statement is being evaluated, the already-evaluated first one is just held on the stack.
here it makes more sense
speaking of this, how can i write a succinct method to return the values of a binary tree in a recursive call given a node. i know it is basically like print the left child, print the value of the root, print the right child, but wrapping that up in a reursive way gets me confused
or like rather if left child exists, current_root = node.left
you would go down until a base case
e.g. where node.left = None, at which point you would return the current node value then apply the recursive call to right subtree
did y'all know Meta does not ask any recursive questions in their soft. eng. interviews?
Recursion Error ๐
ommmmggg
don't understand this at all
solves LC 700 -- search in a binary search tree
# Definition for a binary tree node.
# class TreeNode:
# def __init__(self, val=0, left=None, right=None):
# self.val = val
# self.left = left
# self.right = right
class Solution:
def searchBST(self, root: Optional[TreeNode], val: int) -> Optional[TreeNode]:
# Base case: If the root is None or the root's value is the target value
if root is None or root.val == val:
return root
# If the value to be found is less than the current node's value, search in the left subtree
if val < root.val:
return self.searchBST(root.left, val)
# If the value to be found is greater than the current node's value, search in the right subtree
else:
return self.searchBST(root.right, val)
so root is a list structure of values
is val as in self.val some special class definition that gets returned if you return a class instance?
how is this returning values when they are never explicity called
not sure what you mean. val is an argument of the function, there's also a .val field on any TreeNode instance.
it's not a list
what's going on w the Optional statements
i understand the arrow in the def searchBST line (->) is intended to mean that python is expecting this function to yield a TreeNode class on execution. is the reason for this so that it'll error if it instead yields something else?
how can you say return root and not return root.val or similar
the reason for this so that it'll error if it instead yields something else?
it won't, it's just for you and your IDE to read.
how can you say return root and not return root.val or similar
not sure what you mean. the return value should be a node, not a node's value
what's going on w the Optional statements
Optional[Something]is an alias forSomething | None.
it's supposed to find a node in the binary tree with the target value.
let me get the description
`You are given the root of a binary search tree (BST) and an integer val.
Find the node in the BST that the node's value equals val and return the subtree rooted with that node. If such a node does not exist, return null.`
so when we say return root, why does it return a number val and not any of the other methods
Why do you think that return root returns a number? It returns root, which is a node (unless it's None).
because when this runs i get a list of numbers
if it did not return a number, it could never print the subtree
it would just say node, node, node, node
well, somewhere under the hood the leetcode judge parses numbers into a tree, passes the tree to your function, and checks the result.
bruh
who designed it like that. that's insane
no wonder i don't understand
v dumb
is there some other place to learn where there aren't hidden methods going on
on codewars you get to see the entire code of (the non-hidden part of) the tests.
ty
i'll try running locally and operationalizing by passing the TreeNode classes as calls and figure it out
i knew there was something weird about this
it's for people to easily put their own testcases in
like users? or leetcode people?
users
In which manner?
Anyone has used Kattis and could tell me why I have a RunTime Error here?
Code: https://pastebin.com/DvJ1XnEz
Literally the same codde in cpp passes and it probably isn't a Time Limit Execution or Memory Exceeded since their FAQ says: Run Time Error means your program crashed during execution with our secret test input. More precisely it means that it terminated with a non-zero exit code, or with an uncaught exception. Never had any problems before with Kattis but not seeing the error message makes it hard to know what's happening
can differences end up empty?
if so the max will fail
hmm good question
if differences:
print(max(differences.keys(), key=lambda x: differences[x]))
else:
print(0)
yeah that works
but then error on test 3 what the hell
my c++ code is honestly literally the same
I'm going to investiguate though that's already very helpful
There is a default keyword argument to max/min for this
oh know i think i got it
ah smart
--
maps are ordered in cpp
damn
i always think map is a hashmap but it's not
unordered_map is one
so I needed to order my dictionary at the end
all good now thanks @haughty mountain !
map is, unordered_map isn't ๐
pixelated img
Anyone practicing Dsalgo in cpp? I want someone to join
i'm going to paste this here again as i would like to revisit:
# Definition for a binary tree node.
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
class Solution:
def searchBST(self, root: Optional[TreeNode], val: int) -> Optional[TreeNode]:
# Base case: If the root is None or the root's value is the target value
if root is None or root.val == val:
return root
# If the value to be found is less than the current node's value, search in the left subtree
if val < root.val:
return self.searchBST(root.left, val)
# If the value to be found is greater than the current node's value, search in the right subtree
else:
return self.searchBST(root.right, val)
we have two classes, TreeNode and Solution
why does Solution not have an __init__ call?
does it need one?
i'm not sure that it doesn't? i just noticed that one class had one and one did not. what's up with these optional statements again? (in the definition for searchBST line)
why is it even a class in this case is a better question
I'm more questioning leetcode's choices
if you need initialization you use __init__
the Optional just means None is also an ok value
so root can become none in subsequent calls when we call say for example self.searchBST(root.left, val) on a node with no left child
in which case there is a base case
so i've removed the class structure from the searchBST function
and wondering how i can build out a tree for testing. i've declared a few nodes with values like so:
node5 = TreeNode(5)
node3 = TreeNode(3)
node7 = TreeNode(7)
node11 = TreeNode(11)
node15 = TreeNode(15)
obviously it'll have to have the proper structure
do i just throw them in a list? how do i chain them together
maybe its a dictionary
head = TreeNode(17,left=TreeNode(12,5,14),right=TreeNode(25,20,30))
something like this perhaps
ok my local python doesn't like Optional
also i don't know what to do with that self term in the function definition when i am later calling it
oh its no longer a class. ah
i'm getting ```py
Exception has occurred: AttributeError
'int' object has no attribute 'val'
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def searchBST(root: TreeNode, val):
# Base case: If the root is None or the root's value is the target value
if root is None or root.val == val:
return root.val
# If the value to be found is less than the current node's value, search in the left subtree
if val < root.val:
return searchBST(root.left, val)
# If the value to be found is greater than the current node's value, search in the right subtree
else:
return searchBST(root.right, val)
head = TreeNode(val=17,left=TreeNode(12,5,14),right=TreeNode(25,20,30))
searchBST(head,5)
lefttree = TreeNode(val=5,left = TreeNode(val=3), right = TreeNode(val=6))
righttree = TreeNode(val=15,left = TreeNode(val=13),right=TreeNode(val=17))
mid_and_root = TreeNode(val=10,left=lefttree,right=righttree)
ok this is how i am declaring my tree, still getting same error
wait no now its working
however, it is only printing the value of the target, and not returning the entire tree with the root at the target
which was the original goal of the exercise
building out a much larger test tree would also be cool
this also errors when the target value isn't in the tree ๐ตโ๐ซ
from typing import Optional
I mean, that's what you are returning
i got rid of optional
any rough idea on how i could build a huge test tree to look at time
i'll need to make a tree builder function
but i can't just yield some random number as the search tree structure would be perturbed
hmm
here's a primitive thing that's working
def treebuilder(i):
node = TreeNode(val = i,left=TreeNode(val = i/10),right=TreeNode(val=i*10))
return node
but chaining it together..
i guess this is a really involved thing just to prove to myself that the time complexiy is O(log_2(treedepth))
still curious how quickly my cpu can handle a giant BST
wait going about this all wrong. i should generate a big list of numbers and then assign them to a tree using an algo similar to the one i already have
does going and grabbing the median to act as the root make sense?
if i have a randomly generated list of integers
alright i've got the tree being built, now i'd like to implement the search and return subtree with the root = search value
wait no its working now
dear lord
its extremely fast
takes wayyy longer to generate the list of ints for building the tree
time to build tree: 2.677919864654541
time to search the tree: 0.0004994869232177734
for tree with 100,001 nodes
time to generate all integers for node values: 0.3655548095703125
time to build tree: 2.6878769397735596
time to search the tree: 0.0005011558532714844
so building the tree is the limiting step, but its incredible it can be done so quickly with so many comparisons to make
lets try one million and one nodes
treesize (number of nodes): 1000001
time to generate all integers for node values: 3.378013849258423
time to build tree: 26.969225883483887
time to search the tree: 0.0004980564117431641
is this a good place to ask about constructing map reduce like algorithms for particular relationship patterns
Is this a good way to make aliases for dataclasses?
from typing import TypeVar
from dataclasses import dataclass, field
T = TypeVar("T")
class Descriptor:
field_name: str
def __init__(self, field_name: str):
self.field_name = field_name
def __set__(self, instance, value: T):
setattr(instance, self.field_name, value)
def __get__(self, instance, owner) -> T:
return getattr(instance, self.field_name)
@dataclass
class Point2D:
x: float = None
y: float = None
lat: float = field(default=5)
lng: float = field(default=5)
Point2D.x = Point2D.lat = Descriptor("_x")
Point2D.y = Point2D.lng = Descriptor("_y")
point = Point2D(lat=1)
print(point) # Point2D(x=1, y=5, lat=1, lng=5)
https://discord.com/channels/267624335836053506/1231203483244433459 any a c++/python dev who can provide some help please?
u don't need that else statement
and it's gonna be greater or equal remember that
true, some people find them more readable
Last time I did this was like a year ago
class TreeNode:
def __init__(self, value: int, left = None, right = None) -> None:
self.value = value
self.left = left
self.right = right
def size(self) -> int:
if self.left is not None and self.right is not None:
return 1 + self.left.size() + self.right.size()
if self.left is not None:
return 1 + self.left.size()
if self.right is not None:
return 1 + self.right.size()
return 1
def height(self) -> int:
if self.left is not None and self.right is not None:
return 1 + max(self.left.height(), self.right.height())
if self.left is not None:
return 1 + self.left.height()
if self.right is not None:
return 1 + self.right.height()
return 1
def preorderTraversal(self, l: list = []) -> list:
l.append(self.value)
if self.left is not None:
self.left.preorderTraversal(l)
if self.right is not None:
self.right.preorderTraversal(l)
return l
class BinaryTree:
def __init__(self, root: TreeNode | None) -> None:
self.root = root
def size(self) -> int:
if self.root is not None:
return self.root.size()
return 0
def height(self) -> int:
if self.root is not None:
return self.root.height()
return 0
def preorderTraversal(self) -> list:
if self.root is not None:
return self.root.preorderTraversal()
return []
This is not a BST tho
BSTs are easier
Inorder and postorder traversals methods are exactly the same as the preorder traversal method except you change the position of the l.append(self.value)
i have a list of values and i would like to know the position of minimum value for each range of constant length
what structure should i use? that reminds me of prefix-sum-like problem, but i dont know how to apply that for minimums
so keeping track of the minimum while sliding a window?
yes
I think you can do it keeping a monotonic sequence
i managed to throw together messy implementation using heap from heapq and set of visited values
from collections import deque
def sliding_min(seq, n):
d = deque()
for i, x in enumerate(seq):
# Add new element, prune any elements it "dominates".
while d and seq[d[-1]] > x:
d.pop()
d.append(i)
# Prune any elements that are too old.
while i >= n and d[0] <= i - n:
d.popleft()
if i >= n - 1:
yield seq[d[0]]
def test(seq, n, expected):
res = list(sliding_min(seq, n))
if res != expected:
print(f"FAILURE: expected {expected} but got {res}")
test([1, 2, 3, 4, 5, 6, 7], 3, [1, 2, 3, 4, 5])
test([7, 6, 5, 4, 3, 2, 1], 3, [5, 4, 3, 2, 1])
test([1, 4, 2, 5, 3, 4, 8, 5], 3, [1, 2, 2, 3, 3, 4])
test([2, 2, 2, 1, 1, 1, 2, 2, 2], 3, [2, 1, 1, 1, 1, 1, 2])
Something like that
also I meant to !e that...
!e
from collections import deque
def sliding_min(seq, n):
d = deque()
for i, x in enumerate(seq):
# Add new element, prune any elements it "dominates".
while d and seq[d[-1]] > x:
d.pop()
d.append(i)
# Prune any elements that are too old.
while i >= n and d[0] <= i - n:
d.popleft()
if i >= n - 1:
yield seq[d[0]]
def test(seq, n, expected):
res = list(sliding_min(seq, n))
if res != expected:
print(f"FAILURE: expected {expected} but got {res}")
else:
print('OK')
test([1, 2, 3, 4, 5, 6, 7], 3, [1, 2, 3, 4, 5])
test([7, 6, 5, 4, 3, 2, 1], 3, [5, 4, 3, 2, 1])
test([1, 4, 2, 5, 3, 4, 8, 5], 3, [1, 2, 2, 3, 3, 4])
test([2, 2, 2, 1, 1, 1, 2, 2, 2], 3, [2, 1, 1, 1, 1, 1, 2])
@haughty mountain :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | OK
002 | OK
003 | OK
004 | OK
interesting, i will look into that in a moment
I'm sure you can do it without keeping indices, but it was just convenient in this case
i.e. you should be able to implement this by just some class that accepts an add and remove where you pass the values that enter and exit the window
I'm also not sure if that second while actually needs to be a while, might be that an if is enough
im bad at competitive programming :(
i implemented a binsearch, but wasn't sure that it was correct
so i supplemented it with linear search afterwards: ```py
find first X such that check(X) is true (X=0..n-1):
check(X) is quite expensive, you cant run it for all inputs
L = 0
R = n
while L < R:
M = (L + R) // 2
if check(M):
R = M
else:
L = M + 1
for i in range(n):
if abs(i - R) < 5: # if we are close to the answer, start doing expensive checks
if check(i):
print(...)
break
i was getting WA, so i increased 5 to 1000 and it passed :)
you're looking for some transition true -> false, I much prefer keeping l and r on different sides of that transition
so if you check the midpoint
true -> set l = midpoint
false -> set r = midpoint
makes things much easier to reason about to me
i guess that makes sense
but i am not able to write binsearch without making a couple of off-by-one mistakes :)
code i write for competitive programming events is even worse than code i write for #esoteric-python
that's the thing, the thing I mentioned eliminates the ones where they matter
while r - l > 1:
mid = (r + l)//2
if check(mid):
l = mid
else:
r = mid
and the 1 that pops up makes a ton of sense (loop until l and r are adjacent)
do you like importing deque?
The "easy" and general solution is to store (value, index) as a tuple instead of just the value.
Then by finding the smallest tuple in the range, you get both the minimum value and its index
The standard way to solve min range queries is to either use a segment tree https://github.com/cheran-senthil/PyRival/blob/master/pyrival/data_structures/SegmentTree.py
or RMQ https://github.com/cheran-senthil/PyRival/blob/master/pyrival/data_structures/RangeQuery.py
Both of them can handle intervals of arbitrary length
I prefer RMQ since it runs in O(1) time per query
The way I prefer to implment binary search is like this, with l and r both inclusive
assert check(l) # Assumes check(l) is True
while l < r:
mid = (l + r + 1)//2
if check(mid):
l = mid
else:
r = mid - 1
# Now check(l), check(l + 1), ..., check(r) are all True
Your check-function is flipped compared to what I'm used to seeing. For example it is flipped compared to @haughty mountain's binary search.
This is how I would impement binary search with a "flipped" check-function
assert check(r) # Assumes check(r) is True
while l < r:
mid = (l + r)//2
if check(mid):
r = mid
else:
l = mid + 1
# Now check(l), check(l + 1), ..., check(r) are all True
I think most of the confusion in binary search comes from some ambiguity when it's first taught.
For an array (or implicit array - function of integer or float), call index a number corresponds to the element, and position a number which corresponds to the number of elements on the prefix, or that space "between" indices or values. For instance, for array [1, 3, 5, 5, 6, 8], if I, for clarity, continue it in both directions to infinity, keeping the sorted-ness,
# Index: -2 -1 0 1 2 3 4 5 6 7 ...
# v v v v v v v v v v
... -oo -oo 1 3 5 5 6 8 +oo +oo ...
# ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
# Pos: -2 -1 0 1 2 3 4 5 6 7 8 ...
So in a sense adjacent indices clamp a unique position, and adjacent positions clamp unique index.
What most binary searches do, is find the position where before it, for each index, value of some function on element on that index is True, and after it it is False (or vice versa). So consider function f = lambda x: x <= 5. There is a unique position in the array where this function "breaks":
f(a[i]): tru tru tru tru tru tru fls fls fls fls ...
... -oo -oo 1 3 5 5 | 6 8 +oo +oo ...
# ^--- here f(a[i]) "breaks" from true to false
And what binary search does, is it iteratively clamps this position between two indices. We keep the invariant that the answer position is between indices l and r, where f(a[l]) is true, and f(a[r]) is false. If we start with l = -1 (-1! not zero because you don't know if f(a[0]) is true yet! You sometimes do, but not always, and this can cause bugs. And we are not going to access this element either way, so it's fine) and r = n (so 6), we get
# l r
... -oo -oo 1 3 5 5 | 6 8 +oo +oo ...
# < . . . . . . . > - where we know answer can be
Now we take the index in the middle, so either (l + r) // 2, or (l + r + 1) // 2 - doesn't matter. Let's take the first one, so (-1 + 6) // 2 = 2, and call it m:
# l m r
... -oo -oo 1 3 5 5 | 6 8 +oo +oo ...
# < . . . . . . . > - where we know answer can be
We calculate f(a[m]) = f(5), which is true, so we know that the position of the answer is after index m=2, so we update l = m:
# l r
... -oo -oo 1 3 5 5 | 6 8 +oo +oo ...
# < . . . . > - where we know answer can be
Now m = 6, f(a[m]) = f(6) is false, so we move r = m:
# l r
... -oo -oo 1 3 5 5 | 6 8 +oo +oo ...
# < . . > - where we know answer can be
And one last time, m = 5, f(a[m]) = f(5) = True, so
# l r
... -oo -oo 1 3 5 5 | 6 8 +oo +oo ...
# < . > - where we know answer can be
And now l and r are adjacent (r - l == 1), and therefore the answer position is between them. So now we know the last index where f is true, and first index where f is false. That's what binary search does. The code is therefore
l, r = -1, n
while r - l > 1:
m = (l + r) // 2
if f(a[m]):
l = m
else
r = m
What's good about reasoning about binary search like this is that there is everything just make sense: there are no non-obvious +1 or -1, or the question of where to round the middle, or anything else. And it's easy to come up with new binary searches: with flipped checks, with additional information (such as f(a[0]) = 0 - which usually holds). Or even when you are searching for index (and therefore clamp it between positions) if you simply need to check if the element in the array and get its index.
This should be in #1051603408597024828, isn't it?.. ๐ค
I wouldn't bother with that channel tbh ๐
welp, I did it anyway https://discord.com/channels/267624335836053506/1231574710513434645
it will probably into my blog anyway ยฏ_(ใ)_/ยฏ
Hello, this leetcode problem (https://leetcode.com/problems/distinct-echo-substrings/description/) makes me doubtful of my implementation of string hashing. Brute force is O(n^3):
def distinctEchoSubstrings(self, text: str) -> int:
ans: set[str] = set()
for i in range(len(text)):
for j in range(i + 2, len(text) + 1, 2):
if text[i:i + (j - i) // 2] == text[i + (j - i) // 2:j]:
ans.add(text[i:j])
return len(ans)
And using string hashing should be O(n^2) instead
class Solution:
def get_hash(self, left: int, right: int) -> int:
"""
hash value of text[left:right]
"""
hash_right: int = self.hash_prefixes[right]
hash_left: int = self.hash_prefixes[left]
return (hash_right - hash_left * self.powers[right - left] + 1_000_000_007) % 1_000_000_007
def distinctEchoSubstrings(self, text: str) -> int:
self.hash_prefixes: list[int] = [0] * (len(text) + 1)
"""
hash_prefixes[i]: text[:i] hash value, i excluded
"""
p: int = 31
self.powers: list[int] = [1]
for i in range(1, len(text) + 1):
self.powers.append(self.powers[-1] * p % 1_000_000_007)
for i in range(len(text)):
self.hash_prefixes[i + 1] = (self.hash_prefixes[i]*p + (ord(text[i]) - ord("a") + 1)) % 1_000_000_007
ans: set[str] = set()
for i in range(len(text)):
for j in range(i + 2, len(text) + 1, 2):
mid: int = i + (j - i) // 2
hash_first_half: int = self.get_hash(i, mid)
hash_second_half: int = self.get_hash(mid, j)
if hash_first_half == hash_second_half:
if text[i:mid] == text[mid:j]: # make sure it's not just a collision
ans.add(
text[i:mid]
)
return len(ans)
It's longer on leetcode (twice as long, 2s instead of 1s)
I followed the implementation from https://usaco.guide/CPH.pdf#page=255 for String Hashing
On really big inputs (n = 15,000) I could finally see string hashing being faster
But I think I must've done something wrong and I have like a big constant term somewhere?
If you have a string like "aaaaaaaaaaa...aaaaaaaaa" then this is still O(n^3) but slower
If talking about competitive programming, i'd just don't do direct checks at all and check only hashes (but maybe more than 1)
And using string hashing should be O(n^2) instead
Still looks n^3
About hashing strings (with rolling hash), here are a couple of things to think about
- Randomize the base. Pretty much any string hash is hackable if the base is fixed.
- Use a large mod (preferably prime), 2^61-1 is a good candidate, while 10^9+7 is a bad candidate. Remember that the probability that a hash comparision fails is 1/mod. If you are doing many comparisions (by for example checking if some hash is in a set), then you can make O(n^2) comparisions in O(n) time. To make the probablity that any of these fail, you need n^2/mod << 1. TLDR: pick prime mod such that mod >> n^2.
- This code
(hash_right - hash_left * self.powers[right - left] + 1_000_000_007) % 1_000_000_007looks like buggy C++ code to me and/or you don't know what you are doing. Instead, in Python simply do
(hash_right - hash_left * self.powers[right - left]) % 1_000_000_007
Had you wanted to write C++ compatible code, then you should have done
((hash_right - hash_left * self.powers[right - left]) % 1_000_000_007 + 1_000_000_007) % 1_000_000_007
But you don't need the exatra add mod in Python since unlike C++, in Python the result of modding is always positive (if the mod is positive)
What if hash_right - hash_left*self.powers[right-left] is negative?
-5%2
Out[2]: 1
-4%2
Out[3]: 0
oh it works
does it automatically add the modulus until it's positive before doing the operation?
x % 1_000_000_007 is always positive in Python, but in C++ it could be negative (if x is negative)
n^3 because of the check in case there is a collision?
I feel like if we don't check that there is no collision
even it's super unlikely
we have an incorrect algorithm right?
weirdly enough with "aaaaaaaaa...aaaaaaaaa" the string hashing gets faster than the naive as n gets bigger. I don't really understand why
random_only_a_string = "a" * 10000 # answer should be 5000
takes 93 seconds with string hashing (and collision check) which is in that case THE SAME as the naive but slower because of so many random things computed
and takes 94 seconds with brute force
In competitive programming, being correct <=> AC
You just need to make sure that the probability of your algorithm failing is tiny
What kind of python design pattern should I follow? I have a Problem instance and a bunch of Solution instances for the problem. Solution instances has various different methods that depend on Problem instance which is unique. My current solution is to have an attribute Solution.problem, but this creates multiple copies of Problem which seems counter intuitive to me.
Only other idea I have is to have functional programming like design, where I make all the methods of Solution global and have something like plot(solution, problem) instead of solution.plot()
Why would that create multiple copies of Problem?
AC = all cool?
all accepted?
ACcepted?
The thing I can see that for hashes you have
ans.add(text[i:mid])
and for brute
ans.add(text[i:j]) (2x longer string)
probably that's the reason
I guess so, feels wrong but if the probability of failing is like 1/((2^61-1) +/- a random "salt") it's probably worth it
smart, thanks
but at the end of the day, even without the sanity check in case two strings have the same hash (and therefore we'd be in O(n^2)), the O(n^2) solution is 700ms slower on that leetcode problem
kinda feels bad
Your solution is suboptimal. You should try to only use hashes. For example ans should store hashes.
Thats how you get a fast solution
I tried to do a set of ints
and storing the hash
I only gain 100 ms or so
actually yeah should've pointed it out
and this is twice as fast
btw did you copy this code from somewhere?
I only ever see people do this in c++ i + (j - i) // 2
No? I coded it all after reading the usaco thing
There is really no reason to do it in Python
What would you do?
But that mod code was just buggy
You added mod to a number that was possibly very negative
adding 10^9+7 to it wouldnt have made it positive
Both get_hash were already taken mod so it couldnโt have been inferior to the mod
Ah no
Because of the
Times p
To the power of something
yes
Btw I would not be surprised if there exists a O(n) sol
(i + j) // 2
C++ people are afraid of overflow when adding i and j
Thats why they do i + (j - i) // 2
In Python that isnt an issue (and it kinda isn't an issue in C++ in the first place).
fair enough I was honestly just thinking beginning + half of the length
it can be an issue in C++, risking UB is not great
I am pretty sure the issue is integer overflow leading to a negative result, not UB
!e import numpy as np; print((np.int8(100) + np.int8(100)) // 2)
@outer rain :white_check_mark: Your 3.12 eval job has completed with return code 0.
001 | /home/main.py:1: RuntimeWarning: overflow encountered in scalar add
002 | import numpy as np; print((np.int8(100) + np.int8(100)) // 2)
003 | -28
usually compilers are not smart enough to cause issues due to overflow UB in arithmetic alone
signed integer overflow is UB though
Yes, but it's not like "jump to an invalid instruction" UB, it's "compiler assumes it never happens for optimizations" UB. The issue here is not really UB, it's that you are getting a competely wrong result. Saying it's UB is like saying "don't drive a car when you are drunk - you can get fined". Like, duh, you can, but that's not the reason why you shouldn't drive drunk - it's because you can get harmed or harm others.
Any amount of UB is bad.
Hmm thinking about it, maybe the issue is not overflow, but simply C++ integer division rounding towards 0
Integer division changing rounding direction for negative numbers is a pain
just use std::midpoint :^)
I am confused ๐ค . What does this have to do with anything?..
I agree and disagree simountiniously
Anyone here experienced with codewars?
but yeah, the seemingly simple operation of finding a midpoint has a bunch of fun edge cases
In binary search, the rounding direction of the mid point matters
weirdly appropriate for UB
I am optimizing myself out of this conversation
well, sometimes
not in all implementations
and if you got a negative value you are screwed anyway
OH
wait
I think I understand what you are saying now
oof
Basically all implementations I know of would get stuck in an infinite loop if you mess up its rounding direction
Well, no. If compiler assumes l and r are both positive, there is no need to fix the rounding after bitshift (which is what division optimizes into), and you get the correct positive result. If compiler does not assume that, you get the arithmetic bitshift instead of the logical one anyway, so you get negative number anyway ๐ค
I see you haven't read my epic binary search article I posted yesterday, you should go and check it out ๐
(no, actually, don't bother)
The issue is if l and r are negative to begin with
well it's only possible if l = -1, r = 0 and then binary search terminates instantly
(or whatever indexing you use)
Don't assume the binary search is over indices of a list
or do you mean just binary search in general?
Yes
yeah, ok, fine
then it's ok if you are using semi-intervals
then it doesn't matter which way you round
if both l and r are included, then it becomes and issue
Semi intervals?
That's not at all the issue
L=-10 and R=-3
That's problematic
No matter if you are using open or closed intervals
If using semi-intervals, then M = -6, then go to either [-10, -6) or [-6 -3), then either [-10, -8) or [-8, -6) or [-6, -4) or [-4, -3). It's all good. Same with [-10, -2) (if that's what you meant): [-10, -6) & [-6, -2), [-10, -8) & [-8, -6) & [-6, -4) & [-4, -2),
If using segments, then yeah, you get screwed. If you have [-2, -1] then you get [-2, -1] again.
Hey it's me again
I solved https://codeforces.com/contest/271/problem/D with String Hashing (thanks for the advice given yesterday btw it helped :D, and at least this problem doesn't let the brute force O(n^3) solution pass like the leetcode one) but since the editorial mentioned Tries I also implemented the Trie solution but it gets TLE (https://pastebin.com/jr19x17v). It works when converted in cpp but I'm trying to make it work with PyPy (mainly for the fun of it and to learn a few tricks) but I'm running out of ideas. Anyone has any suggestion?
From what I can tell, even your C++ solution is really slow
I would consider comming up with a better trie implementation
I tried using an array (size 26) instead of a dictionary for the children thinking there'd be less overhead but I might've done it wrong because it didn't help much
I also tried Python instead of PyPy in case it'd work but it obviously didn't either
Btw in c++ you really should use
cin.sync_with_stdio(false);
cin.tie(nullptr);
Your input reading is super slow without it
(despite what many copy-paste around, cout.tie(nullptr) does nothing)
@regal spoke has opinions on this ๐
I get why its happening and everything
Kinda just wanted an excuse to post it here because I'm proud
is there any kind soul / pyspark wizard who could help me with this issue?
hi
i have a question
elif len(bracket_track) != 0:
if s[i] == bracket_track[-1]:
bracket_track.pop()
is_valid = True
else:
return False
else:
return False
is there any way to write this piece of code in better way?
elif len(bracket_track) != 0 and s[i] == bracket_track[-1]:
bracket_track.pop()
is_valid = True
else:
return False
when i write it this way , i get error list index out of range , shouldnt it check the first condition then go to second?
sorry , if i asked a silly question , i am not very experienced in programming
i might be wrong but are these the batches we use in training models and set epochs?
that looks valid. perhaps it's s[i] that's out of range?
@vocal gorge yes
then you should check if i >= len(s)
in which case s[i] does not exist and is not a valid thing to compare bracket_track[-1] with
i am sorry , bracket_track[-1] is out of range , sorry i didnt notice
the list is empty so it goes out of range
well, that shouldn't be possible given the snippet you provided. s[i] == bracket_track[-1] will only be evaluated if len(bracket_track) != 0 is true.
yeah thats why i am confused it should check 1st condition then go to second after "and" right? but it throws me error list index out of range
anyway ill check it again
post the full exception and traceback
yes i am dumb , it works , it was a leetcode question - "Valid Parentheses"
thanks for help xD
Wonder what is the big difference between 1.4//1 and math.floor(1.4). Is it something worth benchmarking?
1.4//1 returns float but math.floor(1.4) returns int
math.floor(x) should be about equal in performance to int(x//1)
I'm trying to graph a 3-dimensional function, and the method I'm thinking of using is to create a dataframe with a bunch of points and then graph them somehow. Is there a better method than this, and if not what kind of data structure should I use?
okay guys so, this place is the correct place for pyston?
this library sandbox code
i've made an update into my code into this:
from pyston import PystonClient, File
import asyncio
file_content = """
import bot
bot.run_bot()
"""
file = File(file_content, filename="file.py")
async def main():
sandbox = PystonClient()
output = await sandbox.execute("python", [file])
print(output)
asyncio.get_event_loop().run_until_complete(main())
but all i did was following a cake recipe
how do i whitelist what it can output for my discord?
like, i don't want people accessing my computer
or VPS
and neither acessing my .env
i don't know how whilelisting
and blacklisting works here
That'd work, sure. The normal way in matplotlib is to have 3 arrays - X, Y, and f(X,Y). The X,Y arrays you could obtain via, say, np.meshgrid.
what kind of datastructure should I use?
i need help with a task, anybody?
Hi, I'm trying to make an alogirthm that adds all numbers on a list to see if any of their sums is equal to "goal"
for number in range(len(list)-1):
for i in range(1,len(list)-1):
sum = list[number] + list[number+i]
if sum == goal:
print("Ja want: ",list[number],"+",list[number+i],"=",sum)
return
elif len(list) == 0:
print("Geen oplossing")
return
list.remove(list[number])
find_sum(list,goal)
But it says that my index is out of range in line 6, and I don't really see why?
for clarity 'lines 6' is the sum thing
number + i can be way out of range for indexing the list
(also this task can be done in a much more efficient way, but that's a later concern)
yeah I'm pretty new to python, trying to get a bit better at recursive algorithms
I'm trying to do something along the lines of, take the first number on the list and keep adding next ones, if the sum =/= goal, remove the first item and iterate again until the list is empty
the sum part should've been sum += ....
for number in range(len(list)-1):
used_number = []
used_number.append(list[number])
sum = 0
for i in range(number+1,len(list)-1):
sum += list[number] + list[number+i]
used_number.append(list[number+i])
if sum == goal:
print("Ja want: ")
for i in used_number:
print(i," + ")
if i == list[-1]:
print(i," = ",sum)
return
list.remove(list[number])
if len(list) < 2:
print("Geen oplossing")
return
find_sum(list,goal)
I modified it to this
If an object is removed from the list, and then it iterates from number+1 to len(list)-1, it should never be out of range right?
should I remove the first for loop?
I think I just fully messed it up lol
wait, what is the actual problem to be solved? a pair of numbers summing to goal? any subset summing to goal? 
any subset
The subset sum problem (SSP) is a decision problem in computer science. In its most general formulation, there is a multiset
S
{\displaystyle S}
of integers and a target-sum
T
{\displaystyle T}
, and the question is to decide whether any subset of the integers...
for which you really want the dp solution
def pal_num(s):
if len(s) <= 1:
return len(s)
if s[0] == s[-1]:
return 2 + pal_num(s[1:-1])
else:
return max(pal_num(s[1:]), pal_num(s[:-1]))
what is the time complexity of this thing
I think it should be recursively defined as T(n) = 2T(n - 1) where T(1) = O(1)
is that correct?
slicing makes a copy
its O(2^n) right?
please only ping me if u actually know what u are talking about
your recursive definition is missing the time spent in the function. as purplys mentioned, it's O(n) here, since it takes a slice of most of the list
is there a recursive way to solve the Subset sum problem?
I have this
def find_sum(list,goal):
for i in range(1,2**len(list)):
binary = bin(i)
binary = binary[2:]
used_number = []
binary_with_leading_zeros = binary.zfill(len(list))
for number_idx in range(len(list)):
if binary_with_leading_zeros[number_idx] == '1':
used_number.append(list[number_idx])
if sum(used_number) == goal:
print(used_number)
just run two calls for each index - if we take current value and if we don't
def f(i, want):
...
f(i + 1, want)
f(i + 1, want - A[i])
Just tried solving it with Python https://codeforces.com/contest/271/submission/258134476
Got AC first trie
But TL is pretty tight
Damn thank you so much Iโll check it
Good enough for me, as long as I learn something Iโm down
1968/2000 :D
older pypy3 is 1560/2000
damn boy
Was it the class that was inducing overhead or the dictionaries? Because I had tried without dictionaries (but still with the class, cf below) and it wasn't passing so I'm assuming classes include somewhat of a significant overhead?
class TrieNode:
def __init__(self):
self.children: list[bool | TrieNode] = [False] * 26
self.valid = False
Well for 1998ms you don't need a really big overhead xd
But yes, classes are taking some time
yeah fair
shame we can't run it 10x just to have averages but I guess their servers would be overwhelmed if that was possible
Btw submitting your solution with older pypy also gives 1560 ms instead of 1930
Hey, how do I make this as fast as possible?
``X = np.sum(Y, axis=1)` ?
I find that with Cupy, and Cuda, this is only getting about 60x per second. And the math I want to do is a lot more complicated, but just needs this as step one
yup it's dutch
what does A represent? does this essentially subtract the number from our 'want' ?
A refers to the variable you called list
A is a better name for it thanlist, since list is already taken by the class list
!e
A = [1, 2, 3]
B = list(A) # Make a copy of A
print(B)
@regal spoke :white_check_mark: Your 3.12 eval job has completed with return code 0.
[1, 2, 3]
ah true
and this essentially just subtracts numbers from our 'want'
and I assume when 'want' == 0 it stops and prints the numbers we subtracted?
or atleast that's how it should be implemented
You could make it do something like that, ye
Just so you know, this will run pretty slow (will take about 1s for a list of 20 items), since it runs in O(2^n)
yeah
I already felt the slowness of my A* + TSP algorithm
which are both O(n!) I think?
took about 2-3minutes for a list of 10 destinations
Btw there is a somewhat simple trick to make an algorithm for the subset sum problem that runs in O(2^(n/2))
oh, cool, what is it?
Split the list into two halves. Compute all possible sums you can make out of the first half, and all possible sums you can make out the 2nd half
ohh devide and conquer
Its not exactly divide and conquer
meet in the middle!
ah ok, but I recognised a pattern where splitting lists into smaller ones usually makes the algorithm more efficient
don't know if it's a rule of thumb tho
It is called divide and conquer if you repeatedly do the split. For example merge sort
For the algorithm I was talking about, the split only happens once
why is it perse faster then the normal one?
you get two lists, 2^(n/2) + 2^(n/2) right?
Ye so computing all possible sums of the two halves takes 2^(n/2) + 2^(n/2) time
Then you need to check if there is one sum from the first half and one sum from the 2nd half than sums up to the target
ah alright, is making 4 list a possibility too then?
Not really
or is it too much work/it's too small of a benefit
Think about how to find sum1 and sum2 such that sum1 + sum2 = target
it's essentially the same time complexity, which we then add up for all n/2 inside those lists
The easiest way is to use a hashtable to check if target - sum2 is a possible sum of the first half, for all possible values of sum2
ohh so we simply go over one list
yes
and see if the result is in the other alright
That is the trick that makes things "fast"
With this a list of length 40 should be doable in 1s
it is
is 4 lists a thing?
so if we'd split it in 3, 4 ..., we'd get sum1 + sum2 + sum3 + sum4 = target
which would cause problems considering we'd have to iterate over every one of them causing our time complexity to go up
as in, we'd have to compare target - sum1 with every subset of [sum2, sum3, sum4]
Think about it like this:
Your original strategy is checking every binary number fomr 0 to 2^N.
When you do this, how many times are you adding num0 and num1?
I'm very confused.
You'll add num0+num1 exactly (2^N)/2^2 times
The thing I was talking about is definitely only valid for 2 lists
oh yeah bcs other iterations
Only reason I could see someone go for a split in 4 is to try to save memory
so the trick pajenegod is mentioning uses this knowledge 1 time to add each 2 digit sum, and use that as a baseline.
But in theory, you could do that for sums of 3 or 4 numbers. You'd take more memory, but you'd save on additions
2^n-2 times, bcs you can add num1 to nothing and can't add up num2 twice to itself correct?
or simply add num2 to nothing ig
yea. I think about it like "there are 4 ways to represent a binary digit of length 2"
00, 01, 10, 11
It's only added together when it's 11
so if you ignore all the other numbers, exactly 1/4th of the time, you'll add those two numbers
correct, you used binary to represent every possible combination, and test them individually
def find_sum(list,goal):
for i in range(1,2**len(list)):
binary = bin(i)
binary = binary[2:]
used_number = []
binary_with_leading_zeros = binary.zfill(len(list))
for number_idx in range(len(list)):
if binary_with_leading_zeros[number_idx] == '1':
used_number.append(list[number_idx])
if sum(used_number) == goal:
print(used_number)
yeah, but I don't rly get ur method
but using this knowledge, we know that some parts of the addition are actually repeated, so we can use that knowledge to save time
The method is called Dynamic Programming, and it's very popular in interviews
ohhh I think I get what ur saying..., so there are 2^n ways to reperesent a n-number digit, which meant that the actual number only is fully expressed as a non-sum in 1/2^n cases
wait no..., I think I still don't get it
More like...
let's try 4 numbers: Say you wanted to add the subset sum for (3, 7, 11, 19). We represent all possible combinations using binary... right:
0000 , 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111
How many times does the number start with "11" (i.e. "we will add 3+7") followed by any other numbers (or none at all(
it you look, it will be 4 times. This is because N=4. So there are 2^4=16 possibilities. 3+7 will happen exactly 1/4th of the time. So 16/4 = 4.
Anyways, here is the algorithm I was talking about
def find_sum(A, target):
if sum(A) == target:
return A
if target == 0:
return []
A1 = A[:len(A)//2] # First half
A2 = A[len(A)//2:] # 2nd half
# Compute all possible sums of first half
A1sums = [0]
for a in A1:
A1sums += [s + a for s in A1sums]
# Compute all possible sums of 2nd half
A2sums = [0]
for a in A2:
A2sums += [s + a for s in A2sums]
# Check if target - sum2 = sum1, for sum1 in A1sums, and sum2 in A2sums
A1sums = set(A1sums) # Make into set for fast lookup
for sum2 in A2sums:
if target - sum2 in A1sums:
# Target is possible to reach!
# Recursively find which numbers to use from A1 and A2
# This runs very fast since A1 and A2 are tiny in comparision to A
return find_sum(A1, target - sum2) + find_sum(A2, sum2)
# Not possible
return None
ohh yeah, I remember that from combinatorics
yeahh..., now I remember it, 2^n/n^p where p is the amount of repeated elements
Yeah, the method is taking advantage of this
so it's basically working backwards, and comparing, I'll try to understand the appearantly existing 1/4 method
I don't understand the 1/4 thing at all
the only thing I don't get is how we can take advantage of it
do we merge sum1 and sum2 making the list shorter?
so we don't have to iterate over them again
and thus removing needed binary checks
or more, unneeded
so this should shorten our timecomplexity to O(2^(n/2^p) where p is the amount of splits
there is a small problem with this, basically they say that if the last element is bigger then our "wanted" elementent, remove it
So, these methods change the complexity to put it in terms of the Sum you are trying to reach
but there are negative numbers allowed (I think)
I can promise that you wont get any better time complexity than O(2^(n/2)) without doing something super fancy
Ah fair
One improvement I know is that it is possible to modify the algorithm to not use any hashtables and still run in O(2^(n/2)). This is done by keeping A1sums and A2sums sorted while constructing them (similar to the merge step usually done in merge sort).
so splitting the original list in two, and sorting both lists
No there is a smarter trick
You can build A1sums and A2sums already sorted
No need to pay any extra log cost from sorting them
ohh so just sort beforehand
no
No sorting
Simply switch A1sums += [s + a for s in A1sums] into A1sums = merge(A1sums, [s + a for s in A1sums]), where merge is the typical merge function used in merge sort
merge takes in two sorted lists, and returns a sorted list (the combination of the two)
what does the 'a' stand for?
hold on, Idk if I'm interpreting this correctly
A1sums is a list containing all sums with the first index
or atleast for the 1st iteration
A1 is the first half of A
yeah of first index of the first half
A1sums are all possible sums of subsets of A1
wait... you can make all subsets using a nested for loop?
Yes
jezus I'm dumb
yeah I see it now...
one more question
does A1sums reset? after each iteration?
for a
A1sums grows by a factor of 2 at each iteration
A1sums += [s + a for s in A1sums]
The += is called extend
ohh so += list expands the list?
yes
append only adds one more element to a list
what's the difference between expand and append?
To add multiple elements at the same time, you need to do += or .extend
A1sums.extend([s + a for s in A1sums])
oh ok so extend is a iterative function
!e
A = [1,2,3]
A += [4,5,6]
print(A)
@regal spoke :white_check_mark: Your 3.12 eval job has completed with return code 0.
[1, 2, 3, 4, 5, 6]
append only adds on one more thing
!e
A = [1,2,3]
A.append(4)
A.append(5)
A.append(6)
print(A)
@regal spoke :white_check_mark: Your 3.12 eval job has completed with return code 0.
[1, 2, 3, 4, 5, 6]
so extend essentially allows you to make for loops inside of it
instead of first making a for loop and appending each element
If you want to use append, you could do
A1sums = [0]
for a in A1:
for new_sum in [s + a for s in A1sums]:
A1sums.append(new_sum)
Its the same thing
so extend doesn't rly differ much from this approach besides saving space