#algos-and-data-structs
1 messages · Page 68 of 1
cells cant be negative
bacuse itll break cycles
and yea obviously mod 256 is most useful, thats why its set by default
Does anyone have a roadmap to learn this topic
Substructural type systems are a family of type systems analogous to substructural logics where one or more of the structural rules are absent or only allowed under controlled circumstances. Such systems can constrain access to system resources such as files, locks, and memory by keeping track of changes of state and prohibiting invalid states.:...
why? no
negative numbers can be compared to 0 as well
that's cool, I wonder if that has its use in the real world :)
because it works bad, 0.1% of people use unlimited interpreters and 0.0001% will actually try to make code which works with both negative and positive numbers
[-] for "reset to zero" will be quite unhappy for negative values 🙂
do [+] 🤷♂️
how tf i would make + resetting cell value to zero if theres no limits💀💀
bytemode 1 is making overflow useful and thats why everyone uses it
but then I need to know I have a negative value
granted, I think the code I shared for PE6 doesn't rely on wraparound regardless
fwiw, I've had to deal with worse bf impls
I had to deal with one which threw exceptions if you went out of range, which means answering question like "is x or y smaller" gets very annoying
actually, the check I'm thinking about is just "is x and y equal?"
which you can do like
[->-<]
```and checking if the second cell is 0 with any sane interpreter
but for the piece of crap interpreter they had enabled erroring if you went out of range, which means you get an error if cell 1 is > cell 2
sounds like insult tbh
I was more saying that I've had bf implementations/setups where doing anything was so much harder than it should be because of stuff like throwing if you go out of range
wrapping or allowing negative values are both setups that are actually useable
i am shit in python dsa and i am from india i need to improve on dsa to crack interview and do cp
help me on which course helps me??
i am soo stupid that i see the soultion and copypaste it and put explanation in chatgpt and read it
pls help me
i have 4000 rs to spend money on a course or yearly subscription...
help me on this
ill add it later
if ur saying these usable
Maybe learn all data structures needed for interviews and then try to solve problems on LeetCode, if you don't know solution then you can check solution that other people posted on LeetCode. Also you can use LLM for some questions regarding problem. Then after some time you can do same problem again. I think that it would be useful to follow roadmap like NeetCode 150 or NeetCode 250. There are also some pinned messages about resources for learning DSA. Hope it helps

thanks man
will look into it
hey im really new to python and i wanna write a recoil script does anybody know how to and can help me please?
well, you can use pyautogui for simulating mouse recoil (i.e., pulling the mouse down every few milliseconds after a click)
it's my opinion
Where do you want to use the recoil script
recoil script usually involves adjusting mouse movement** to counteract recoil in shooting games, often using libraries like PyAutoGUI in Python. They’ll need to check if it’s allowed in their game. If they need more details, they can ask!
How to even form algo without for loops? What the heck my teacher don’t allow for loops - https://chatgpt.com/canvas/shared/68405d34d0048191a176d821041b6e42
what about while loops?
what is this supposed to do?
nvm i figured it out
Depending on the algorithm you want to write, you can use a loop or not....
Unfortunately, we have to learn the algorithm before programming... this order is usually not followed
Definitely. that is quite right order.
It is similar that before we use our hand but think first in our head.
For turn of action. 🙂
Exactly 👌
Are you also a algorithm specialist?
I've worked with some AI algorithms and filters like Kalman, and also played around with stuff like CNNs, SVMs, and reinforcement learning. Still got a lot to learn, though! Do you work with algorithms too?
I'm learning quantum programming basics so I don't fall behind in the future
Nice!
Yeah these days I am also trying to implement some algo tasks.
Messing with algos is always a mix of fun and brain pain 🥴Hope it's goin' smooth for ya
🙂 Haha. Right definitely!
where can i learn dsa for python?
If you want a solid learning path for DSA... check out Roadmap.sh.... it guides you step by step. After that, GeeksforGeeks is great for diving deeper and practicing...
GeeksforGeeks is great
🤢
YouTube is better (I learned it myself from YouTube)
Why 😂
the quality of their stuff is frequently bad
So, where did you learn from?
Random places. Wiki articles, papers, lectures, ...
which videos? pls
ty
Code Basics is a good channel—assuming the Indian accent doesn’t bother you 🙂
Any thought on designing an experiment that looks like real-world scenario where branch prediction consideration in the code design impacts performance? I** managed to see the impact of branch prediction on simple examples but it required sorting the data beforehands on the one with few mispredictions but then sorting is way more expensive that the gains I have by having very few mispredictions** so my curiosity remains unsatisfied: it doesn't feel like a real-world scenario where optimising for branch prediction makes my code a tiny bit faster.
I have this idea in python but I'm not sure the "branchless" version has no jmp statement (and it doesn't even work because it's slower):
import time
def count_positive_branchy(data):
count = 0
for x in data:
if x > 0:
count += 1
return count
def count_positive_branchless(data):
count = 0
for x in data:
count += x > 0
return count
n = 10_000_000
alternating_data = [1 if i % 2 == 0 else -1 for i in range(n)]
start_branchy = time.time()
count_branchy = count_positive_branchy(alternating_data)
end_branchy = time.time()
branchy_time = end_branchy - start_branchy
start_branchless = time.time()
count_branchless = count_positive_branchless(alternating_data)
end_branchless = time.time()
branchless_time = end_branchless - start_branchless
print(branchy_time, branchless_time, count_branchy, count_branchless)
ehh, the python implementation itself has a bunch of branches, so you cant really make branchless code
the need to do +0 and reassign the value even though it doesnt change it is what makes it worse, i guess
Yeah yeah of course but I guess I managed to have ten times less branches on a dummy example and it was around 20% faster
So now im looking for somewhat of a realistic example that does the same
I’m not against doing stuff in C++ to satisfy my curiosity it’s just that the optimiser often times just removes the branches in a clever manner
:incoming_envelope: :ok_hand: applied timeout to @ripe hawk until <t:1749632911:f> (10 minutes) (reason: duplicates spam - sent 4 duplicate messages).
The <@&831776746206265384> have been alerted for review.
.
Problem:
given an array A, and 2 numbers x y, one can do the following operations any number of times:
- Type I: choose a subarray of length 3 and subtract
xfrom all of its elements - Type II: subtract
yfrom an element
find the minimum number of operations to make every element <= 0.
anyone got ideas? I'm thinking maybe some sort of dp, but the transitions I'm coming up with are pretty wrong
Explain merge sorting algorithm
sorting hard, merging sorted things easy
split your thing recursively in half until sorted, merge sorted things recursively
maybe sorting not so hard after all
Is it given that 3x > y?
OK I guess you can DP it even without that.
So the function will be something like:
def dp(idx, c1, c2, c3):
# idx: index currently being processed
# c1: amount left in idx
# c2: ... idx+1
# c3: ... idx+2
- While processing idx, we only ever consider applying Type I to [idx:idx+3], and only ever consider applying Type II to idx itself.
- Key considerations:
- If y <= x, then Type II will never (ever) be applied.
- Elif c2 == c3 == 0, then Type I will never be applied.
I'm not really sure if that will get to the solution
more like dancing around it
no, but if 3x <= y it's pretty trivial and uninteresting anyway so focusing on 3x > y is fine
Why don't you think it will get to the solution?
can you tell me what's the state transition you're thinking of?
or maybe, what will your dp(idx, c1, c2, c3) return
like the Key considerations are true, but those are more like special cases where it won't help in solving the general case
I'll give you more details after I'm done w work
def dp(idx, c1, c2, c3):
if idx == len(lst) - 2: return 0
if cacheHasSolution: return cachedSolution # or use @cache
if c1 <= 0: return dp(idx + 1, c2, c3, lst[idx + 3])
t1 = float('inf') if c2 <= 0 and c3 <= 0 else 1 + dp(idx, c1 - x, c2 - x, c3 - x)
t2 = float('inf') if y <= x else 1 + dp(idx, c1 - y, c2, c3)
return min(t1, t2) # memoize manually if not using @cache
# read lst from input
lst += [0, 0] # tack on 2 zeroes for convenience
ans = dp(0, *lst[:3])
Something like this.
I might have made some blunders. Or not.
I'm not sure if that's correct, but it'll be very slow
e.g. consider x, y = 1, 2 with elements up in the 10**8
That's correct. Is that how your test cases are?
I'm going to be the problem setter for a contest, and this is one of my ideas
I'm trying to find whether there's a plausible solution
so for now there are no test cases
Ooh, should I delete that then
nah
they're not in this server
at least I'm 99% sure
I am 99.9% certain it will work
(or can be made to work)
Although you're right about the slowness if elements are large
I can try to think of an alternative approach if you require speed.
well again, problems start to form if x, y are way smaller than the elements
I'm thinking of the "usual" requirements of n <= 2000 and x, y, a[i] <= 1e9 if it can be done in n^2
or like n <= 1e5 if n log n is possible
Are you even able to set a test case which is provably correct
for small cases, yeah, manually + calculate by hand
Mathematically, it seems like it would be a difficult problem involving number theory and perhaps combinatorics.
I mean how else do problem setters come up with correct test cases, if not make the correct solution first, then genning inputs and throwing it into the solution to get the answers
You could also code something which is provably correct then use that to generate the test cases.
this is what I meant
Ah ok
which is why I'm asking here cause my brain is failing to come up with a solution
Well why not see if what I gave you works for a small test case first
Then at least you have a fallback
Meanwhile I'll think of a different approach, since this is somewhat interesting.
I think it's right
modified a bit:
from functools import cache
@cache
def dp(idx, c1, c2, c3):
if idx >= l: return 0
if c1 <= 0: return dp(idx + 1, c2, c3, lst[idx + 3])
t1 = 1 + dp(idx, c1 - x, c2 - x, c3 - x)
t2 = 1 + dp(idx, c1 - y, c2, c3)
return min(t1, t2) # memoize manually if not using @cache
# ... input lst
l = len(lst)
lst += [0, 0, 0]
ans = dp(0, *lst[:3])
inputs and outputs which I think are right
x, y = 2, 3
lst = [3, 3, 3]
out: 2
---
x, y = 4, 5
lst = [5, 8, 8]
out: 2
Next step: write code to recover how many type I and II moves were used and where.
Then maybe create a medium sized test case.
Also I believe you mean to store the list's length before extending it.
@regal spoke maybe you have some idea of what kind of approach would fit this problem?
let me see
even if correct, won't this scale terribly for large array values?
don't think so, cause I pad 3 extra 0's
it's not a set problem yet
it's not set yet
I'm going to be a problem setter and this is an idea I had
so depending on the solution it could be bigger / smaller
Yeah but you should be terminating before those extra zeroes.
Else you might IndexError again
yeah which is why l = len(...) is before the padding
I see it being after the padding.
oh oops
fixed
Very terribly.
Hence any better solution has to involve some number theory.
E.g. Consideration of the relative sizes of 3x and y
If x > y, then Type 1 would always give a greater impact, there would be no reason to use Type 2.
If 3x > y && x < y, then Type 1 would be used in most cases, except edge cases like A = [3, 0, 0], x = 2, y = 3
If 3x < y, then Type 2 would be used in most cases, except edge cases like A = [2, 2, 2], x = 2, y = 8
Is this reasoning correct?
so one potentially relevant observation is that order of operations does not matter
e.g. for index 0 you will use a x moves and b y moves, such that a*x + b*y >= A[0]
what you are currently doing actually tries all orderings if I read your thing correctly
No it goes L to R. It doesn't try all orderings.
all orderings of operations
good catch
tho the current code has @cache, and since as you said ordering doesn't matter, that should mean like if I -> II was calculated, II -> I will pass the same parameters into the function and it'll just return the cached result
Ahh
it would be funny if this ends up being an actually quite hard problem
Right should make it that a type II can no longer try a type I at the same index.
Not surprised if it's NP hard.
I guess for simplicity one could consider the even smaller case of subarrays of length 2
I can write a fix if needed
I don't know if that is obviously easier
I mean it sounds simple, I thought it would be simple, but...
if the solution involves some galaxy brain algorithms I might have to drop it because the contestants won't be ready for it at all
The first solution I would try on a problem like that would be O(max(A)^2 * n)
from this observation I think this would be right as well:
from functools import cache
from math import ceil
@cache
def dp(idx, c1, c2, c3):
if idx >= l: return 0
ans = float('inf')
for a in range(ceil(c1 / x)):
b = ceil((c1 - x * a) / y)
ans = min(ans, a + b + dp(idx+1, c2 - x*a, c3 - x*a, lst[idx+3]))
return ans
though looking at this, I'm not sure how many overlapping states there really would be for a dp
DP still good. Future states are still reused.
How about reducing input size? I think it's a great problem really. Just... Don't have elements going to the speed of light?
that'd be another way yeah
could always just dip it to like A[i] <= 100 or something and call it a day
You could make x large instead
keep A[i] / x small, fair
You could also consider making the interval be 2 instead of 3
actually I'm not sure how I'd state that in the requirements without it being very obvious
what solution do you have in mind if it were 2?
ig then the final problem would be:
given an array A, and 2 numbers x y, one can do the following operations any number of times:
- Type I: choose a subarray of length 2 and subtract
xfrom all of its elements - Type II: subtract
yfrom an element
find the minimum number of operations to make every element <= 0.
and something liken <= 1e5, && x, y, A[i] <= 100
and the solution is basically the same as this
looks good?
Length 3 subarray is good imo.
My gripe with this problem is that just the straight forward dp is the intended solution
It would be a lot nice if it was somehow modified to require an insight
I guess it depends who the competitors are.
would you get something interesting if you gave the input some specific property?
e.g. say the array was guaranteed to be sorted
Oh that's interesting
But DP is precisely precomputing stuff 😆
no
But yes I get what you mean
DP is making use of optimal substructure
How about "pick any 3 elements and reduce them by x"
I think that ends up being trivial
Lets say the array is [..., 56, 67, ..., 56, 67, ...]
My idea is that you can somehow reuse a calculation in order to be able to skip the 2nd time a pair appears
It's not the same which pair you apply the op to.
@narrow mica who's the target audience / competitors?
subarray is consecutive sequence?
yes
In most definitions, that's how it tends to be.
wow
that's confusing
A subarray or substring will always be contiguous, but a subsequence need not be contiguous
I'm not really sure how to really give a number rating
but they definitely should know non-trivial dp, some useful structures like segment trees probably
a given data structure describes a specific way to store, read, and modify data, with clearly defined operations for all 3 (or some of the 3).
the wiki defines it as a sort of analogue of Algebraic structure but for data
yea, that is about right.
the simplest non-trivial example I can think of as a Group or perhaps even a Vector Space
So, um these structures have some operations wrt which some axioms are put forward
and there are many examples of the said structure which satisfies these properties
Now, is it the case that with Data Structures you have a pre-defined framework of properties wrt to the operations (store, read and modify data) which if satisfied qualifies the object to be called a Data Structure?
A data structure would be something like a list, a heap, a binary tree. Something is the list data structure if it stores, reads and modifies data like lists should.
The other related terms are abstract data types, which only specify the read and modify operations, and do not care about storage, such as stack, queue.
But terms are not used consistently unlike in math, so you will see list used as an abstract data types, or a stack used as a data structure, etc.
is there anywhere I can look which gives an unambiguous definition, gives some examples and what have you
If bricks are data, then a brick house is a data structure, and an abstract data type is "an object which you can enter and leave and provides shelter."
The ADT description could be a cardboard box too.
I am not sure if I follow that analogy
hmm, I'm not aware of anything. I know that I was taught subtly incompatible definitions compared to some other people here in college, so it is probably a typical case where you have to check what definition a given author is using if you need an exact definition.
alright
Definition of data structure,
possibly with links to more information and implementations.
Definition of abstract data type,
possibly with links to more information and implementations.
Guys explain merge sort
First, you need to learn to merge two sorted lists. Do you know how to do that?
Yes, with python function
Show me
I am on phone
Which python function are you using to merge the lists?
You can merge two lists in Python using the + operator, . extend() method, or by using list comprehension.
Then the result might not be sorted
Why
What if the lists are 1, 3, 5 and 2, 4, 6
hello i have a dp question i need some help with is anyone up
Hello guys I have a questino for topological sorting of a graph can we have multiple representations for one graph?
go on
yes, a topological sorting is not necessarily unique
alright thank you
if your graph looks like
1 -> 2 -> 3
-> 4
then there's, uhh, something like 4 possible topological sorts: 1 must always be first, but you can shuffle the two chains branching off from 1 into any order, so 1 2 3 4, 1 2 4 3, 1 4 2 3 are all valid.
Calculating the number of possible topo sorts is an interesting exercise in combinatorics.
Hello
How can I implement a singleton in Python?
The idea is to have one importable random source (for example, an instance of numpy Generator) for the rest of the code to use. It needs to be able to reseed (recreate the instance) without the need to reimport it.
The only idea I have is to implement a proxy object that will take care of seeding an instance, while forwarding all other calls to it. So you will be importing this proxy object.
Are there any better ways of doing that?
Do what the random module does. It has a Random class, and you can have multiple instances of if you want. The random module also has a module-level Random instance which backs the module-level functions like seed, random, randint etc.
If you need multiple sources like that, then you definitely don't want a singleton or a global, and you want some kind of proxy
The typical way is to override __new__ to create an instance (if it doesn't exist), or to return that instance if it already exists.
This way, you have no more than one instance regardless how many times you call the constructor.
What do you mean about proxy in case of multiple sources?
What I meant was that instead of manually creating module level function references to numpy random methods, to create a proxy that will just forward all calls to the underlying instance.
Something like this? ```py
class RandomProxy:
def init(self):
self._random = Random(_random_seed())
def random(self):
return self._random.random()
def randint(self, a, b):
return self._random.randint(a, b)
def seed(self, value):
self._random = Random(value)
This will cause problems with reseeding numpy generators - it creates a new instance.
I rather meant overriding the __getattribute__ method
How many methods do you need to proxy?
Preferably everything what numpy random provides
Though will this approach cause problems with typing?
yes, that's going to disable all static analysis essentially
That is unfortunate, yea
What's your use case though? Why do you need to re-seed the generator while someone else is getting values from it?
As the initiation of it might (do not ask why...) be called after the objects using random init themselves.
Besides static analysis, automatically proxying all methods can be a bit of a pandora's box. For example, what happens with
https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.spawn.html
Should the children get wrapped in the proxy as well? How will that work? (or I guess what happens when it's re-seeded)
What about this? ```py
_generator = numpy.random.default_rng()
def seed_generator(seed):
global _generator
_generator = numpy.random.default_rng(seed)
def get_generator():
return _generator
This will cause problems with reseeding. WOn't it?
As you will have an old reference
Oh wait
If you want to get a random value, always call get_generator(). But yes, it's not possible to enforce. It is much simpler though
Yeah...
Do you have the code somewhere?
At the moment I like the explicit forwarding more due to typing. Guess will have to spend some time to list everything that is needed tehre
One benefit of explicit forwarding is that you get to examine what you're actually forwarding, so you'll find the spawn function and it will probably cause some doubt
(it's probaby a good idea to exclude that in particular)
It will be excluded for sure, as I do not want multiple different instances
Unfo, I cannot share that code.
So you would suggest to just make module references to methods (like in Random), right?
Guess I'll be making a proxy object that encapsulates a generator instance and either proxies some specific methods or all of them. Can even make module level functions as shortcuts to that proxy (as it will always be the same object).
At least you will not have to call get_generator all the time (as I am sure will get forgotten and just saved)
Morning all, can anybody recommend me the best libraries when working matrices / image manipulation?
what kind of image manipulation?
I think numpy
Dlib
OpenCv
hello
oh completely forgot opencv. thank you
hi all
hi
hey man
someone told me about introduction to algorithms as good book for data structs and algs
can someone tell if it is really? like i know intermediate level python
built some projects
i understand very basic of like O(n) and big o notation and all
https://soi.ch/wiki/smaller-to-larger/
any reason why for DSU on tree they use sets/maps instead of hashsets/hashmaps? Python doesn’t have those sets/maps but I don’t get why we can’t use regular hashsets/hashmaps and get the complexity down to O(nlogn) granted there isn’t a poisonous input causing loads of hashes
from the complexity description, it sounds to me like you can. perhaps they just don't care about the extra log(n) factor.
making readable code
def __str__(self):
items_s = ", ".join(map(str, reversed(self._items)))
return f"|{items_s}|{self._max_size}"
I see three possible reasons.
- The vanilla hash tables in both python and c++ are known to be hackable. So it can be nice to avoid them.
- The person wants to solve a problem where they need a sorted data structure.
- The person doesn't know hashtables/don't care.
Hackable?
People think operations like insert and look up are O(1) on a hashtable. But that is not the case with the built in ones
There are many ways to "hack" them
If you mean that you can pick specific strings to generate hash collisions, Python and many other languages randomize the hashing algorithm to combat that (see e.g. PYTHONHASHSEED)
(given control over the input, you can easily generate input that performs terribly)
not for integers
It is so much easier than that if you use integers
yeah, for integers it can be problematic
!e
{i * (2**61-1) for i in range(10**5)}
:warning: Your 3.13 eval job timed out or ran out of memory.
[No output]
!e
{i * (2**62-1) for i in range(10**5)}
:warning: Your 3.13 eval job has completed with return code 0.
[No output]
The first one takes forever to run, the second one runs fast
The reason is hash(i * (2**61 - 1)) is always 0
This is just one way to "hack" a python hashtable
a bit off-topic but #define int int64_t is kinda evil
(maybe that's even compliant with the standard?)
I've seen worse
Have you ever seen signed main?
oh god
People do that so they can do #define int long long without it messing up main
it's so hacky and ugly
doesn't auto main work?
I've never tried that
seems like no
$ cat a.cpp
auto main() {
return 0;
}
$ compile a.cpp
clang++ -O3 -march=native -std=c++23 a.cpp -o a
a.cpp:1:1: error: 'main' must return 'int'
1 | auto main() {
| ^~~~
| int
1 error generated.
auto return type is pretty weird to use anyway
Hi Everyone 😊
why python supports lgbt 🥀
!warn 568360253108912137 Do not use homophobic slurs again.
:incoming_envelope: :ok_hand: applied warning to @cursive wing.
Any idea how to AC this problem without using Euler's Tour? https://codeforces.com/contest/570/problem/D
I've been trying with DSU on tree for a while and even with many micro-optimisations, I can't find a way to make it pass Test Case 30:
~~https://codeforces.com/contest/570/submission/325595303~~
This is I believe my most advanced one. I use list instead of dict when possible and instead of a dict[str, int] for the number of letters, I use a single integer as a bitmask and XOR operations since we only need the parity (that gets rid of the MLE, but not of the TLE, credits where it's due, the editorial mentions it).
And since there are strings in the input I can't use the "magic get_all_integers at once" trick I see pajenegod use a lot on CSES to speed up taking the input
EDIT1: Submission where I take all the inputs at once with sys.stdin.buffer.read():
~~https://codeforces.com/contest/570/submission/325597419~~
It's faster on Test 21 so I think it helps, but still not enough for Test 30
EDIT2: Putting Yes by default in answers and only modifying it to No when I want to add a No makes me get to Test 31 but still stuck 😦 at this point the code is hardly readable lol
~~https://codeforces.com/contest/570/submission/325598080~~
EDIT3: using .bit_count() > 1 instead of doing looping and counting the number of times where 1 << i & isn't 0 made me move forward to Test 33.
~~https://codeforces.com/contest/570/submission/325598764~~
EDIT4: somehow using pyrival's bootstrap function allowed me to get to Test 35 but with a MLE this time
https://codeforces.com/contest/570/submission/325599343
I tried doing solving it https://codeforces.com/contest/570/submission/325603789
Kind of strict constraints
I doubt my solution can be improved by much
Does this work because of
The parent of vertex i is vertex pi, the parent index is always less than the index of the vertex (i.e., pi < i).
Yes
If I understand correctly, the way you make DSU on tree work iteratively is by iterating over nodes and you use that property?
I see.
I really thought that was useless
But it gives a nice "topological order" in which you can process the nodes
If the tree was given in another format I would do essentially the same solution
But change range(n) to BFS, where BFS is a list of the nodes in a BFS order
Bfs once, yes
I much prefer iterative to recursion in Python
what is data[i][j] exactly in your code? data[i] is a list of the letters at each height?
isn't this O(n^2)?
ah yeah no I'm stupid that's the whole point of small to large merging
you're reusing the lists
data[i] is a list of length height of node i
data[i][~j] is the xor of the bitmasks on depth j from node i
I will never understand why integers are represented by the two's complement
like if 6 is 0000....0110 why is -6 not 1000...0110
well one of the more obvious advantages is you don't have -0
ah yes, good point
I'm trying to reproduce this my own way and I can't this is driving me crazy
Is it just because I'm uing list[dict] instead of list[list]?
states: list[dict[int, int]] = [{depth[i]: (1 << (ord(letters[i]) - 97))} for i in range(n)]
for i in range(n-1, -1, -1):
# Answer all queries for node i
for (d, idx) in node_to_queries[i]:
if d in states[i]:
if states[i][d].bit_count() > 1:
answer_string[idx] = "No"
if i:
# Merge state of node i with its parent
parent = par[i - 1] - 1
if len(states[i]) > len(states[parent]):
# we want states[i] to be smol
states[parent], states[i] = states[i], states[parent]
for d, counter in states[i].items():
states[parent][d] = states[parent].get(d, 0) ^ counter
using dictionaries sounds a lot slower than lists
It’s a MLE though but yeah I’m trying to use lists instead
ok it's not a complexity thing
if i put the memory limit to 512MB it passes
it probably is just that dictionary are memory hungrier than lists
Thank you
I'm actually for the first time convinced by using ~i instead of -(i+1) which feels so awkard when you index from the end
I thought that SortedList (https://github.com/cheran-senthil/PyRival/blob/master/pyrival/data_structures/SortedList.py or the one from the Python library sortedcontaines) was the equivalent of std::set but if I'm not mistaken std::set doesn't allow indexing. Is there an equivalent of the different Python SortedLists in other languages or would one need to implement it manually as well to have relatively cheap (O(n^(1/3)) or O(log(n)) indexing? I thought an AVL tree would (theoretically) do the job (indexing in log(n) granted you augment it with subtree sizes but that doesn't cost anything in complexity, search in log(n), add in log(n)), surprised to see no one implements it?
That is called "k-th order statistics"
It is kind of built into c++ https://codeforces.com/blog/entry/11080
It is part of a gnu extension of the standard library. So it is basically built in, but it is not technically part of the c++ standard library.
I think it is mostly just the name
I think it means they are configurable
In my experience people usually dont bother to configure things other than custom hash functions if you are using their hash tables
Its built in hash tables without a custom hash are increadibly easy to hack. They are by default so easy to hack that I don't think they are designed to be used without a custom hash.
how do people even remember how to custom hash honestly
I will never remember the magic numbers they use
Wouldn’t anything with some randomized seed likely do the job?
if your hash function is bad then it'll still get hacked even with a randomized seed
I've been away from this for a bit too long so I can't remember examples, pajenegod probably can
I thought if you did a linear transformation for your hash (a*x+b mod p) with a b and p randomized and x your key (if we’re talking about a dict[int, Any]) then it’d be unhackable
As long as p is not like 15 obviously
And a not 0
Also can you define your own hash in a Python dictionary? Or is the solution just to convert your integers to strings which have randomized hashes
When I do something like xor hashing, I like to use
memo = {}
def my_hash(x):
key = str(x)
if key not in memo:
memo[key] = random.randrange(10**18)
return memo[key]

Isn’t using a dict with string keys enough?
And probably less overhead than that?
Ah non ok i cant read
Would this be hackable?
I don’t see how you could come up with a sequence of integers x with the same hash without knowing in advance a b and p
You have to wrap things in a class, same as with sorting using a custom comparetor
I'm not sure if that is "safe"
Using non-prime p is probably a really bad idea
If a b and p happens to be even, then the first bit in the hash is always 0
Fair enough
So it won't be fully utilizing all available bits
but (i) even then, it'd be hard to hack your solution if it wasn't close to TLEing, you'd just have twice as many collisions and (ii) with p any random big prime should do the trick right?
what i'm trying to get at is I'm not sure you have to memorize any particular prime
i could be wrong though
for me remembering big primes isn't usually a problem cause a good few have simple forms like 1e9+7 or 2^61-1
the problem is the other magic numbers hash functions use, for example this is splitmix64:
uint64_t z = (x += 0x9e3779b97f4a7c15);
z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
return z ^ (z >> 31);
Yeah I’m not sure that’s necessary compared to just a basic modded linear transformation
At least I don’t see it
It's not. It's easily breakable by passing in "successive" values and observing the output.
If it were secure, there wouldn't be value in all those other hashes like SHA
Frankly unless the numbers are really poorly chosen (like pajenagod's example), it doesn't really matter much. djb2, for example, uses some funky starting values. But in truth any LCG with sensible starting values will have at least 97.5% of the performance of djb2, and sometimes outperform it. It depends heavily too on the inputs being hashed.
I was only talking in the realm of competitive programming hacks where each test cases would yield different random numbers
Not a scenario where you can try loads of different test cases to determine weakness in the system because after being randomized once the values stay fixed
Oh mb I misunderstood.
Hmm but in that case the random hash just has to work for the given test cases?
And any people can create after reading your code
In a file, I have a string that represents a filepath. It's enclosed in double-quotes and I need to replace it to another string. I know the path to a folder, but I don't know the end of this string. How can I find it and replace all the contents inside the quotes?
Just solved one of the problem the blog's author mentioned, works great thanks
It works so well that the equivalent PyPy solution with SortedList doesn't pass
I'd be curious if you can find a trick to pass without changing the approach https://acm.timus.ru/problem.aspx?space=1&num=1090
does timus even have pypy?
PyPy 3.10 x64
It's my first time going there, maybe that's new idk - there's just so many platforms
Pretty sure with merge-sort-inversion-count that'd be faster since there's likely less overhead
guys where could i practice straighforward graph problems specially bfs and dfs
or tell me how did you master it
umm hey
Im new to the coding space
Which resource would you guys recomment to learn specific concepts and solving problems?
Leetcode is fine & user-friendly
(it's also very meh)
hey you can use visuAlgo and CodeForces
also I recommend this article for basic learning
https://web.stanford.edu/class/cs97si/06-basic-graph-algorithms.pdf
and this one as a practical example
https://arxiv.org/abs/2409.11687
Knowledge graphs have been shown to play a significant role in current knowledge mining fields, including life sciences, bioinformatics, computational social sciences, and social network analysis. The problem of link prediction bears many applications and has been extensively studied. However, most methods are restricted to dimension reduction, ...
why do you have beef with every popular site out there🙂 😂
i mean if you wanna learn dfs i don't see what's wrong with it
honestly I have beef with very few sites
g4g leetcode w3schools are the three off the top of my head
this is kinda weird on its own too 😳
for someone starting out it’s a good place
but once you get a bit advanced they’re not that useful
at that level it’s better to use Git
i believe if i had started with these sites my learning process would’ve been faster
how are they related with git
What's wrong with g4g
the quality of their stuff tends to be questionable at times
I've seen a handful of cases where their implementations were just incorrect, which caused a fun time for some people trying to their impl in a contest
I suppose I just happened to read those which aren't questionable
Is it possible to get a list of explicitly implemented methods in a class? I'd like to have a proxy class that forwards all calls (via the getattribute) except for the implemented ones.
Definitely. Access the class' __dict__ attribute to get the dictionary with everything that is defined, then look for functions. You'll also probably want to iterate through the __mro__, to be able to check superclasses too. Although, you could just override __getattr__ instead, which is called only for attributes that don't exist normally.
Somehow I didn't think checking __dict__ on class - was instead checking the object one. I'll be leaving the inheritance for now.
__getattr__ is only used when the attribute name is not found. I was thinking to forward basically everything to the target object especially the __class__ thingy to mimic isinstance. And those are implicitly implemented, including some other stuff like equality.
However, I ended up with the getattr approach while overriding the class - seems easier for now. Will just forward special stuff when needed.
As a side question that I didn't know - does isinstance check the __class__ attribute only when the direct check fails? I mean that checking object of class A against type A will not check the __class__ while doing it against type B will.
Normally yes, but if you make it a metaclass, it will call __instancecheck__ it seems. However if it's an exact match, it'll always succeed. Here's the implementation: https://github.com/python/cpython/blob/c419af9e277bea7dd78f4defefc752fe93b0b8ec/Objects/abstract.c#L2632 (It starts in PyObject_IsInstance() which calls this).
Objects/abstract.c line 2632
object_recursive_isinstance(PyThreadState *tstate, PyObject *inst, PyObject *cls)```
I see. Didn't know that!
is list.append() O(n) at the backend like while implementing in C or O(n) explain please
no, it's O(1) amortized
the idea is once you don't have enough space, you move everything elsewhere that does have enough space
say you have n element and want to .append, instead of moving somewhere with n+1 space, you move it to a place with 2n space so subsequent .appends don't need moving
you analyze the time complexity over a series of operations rather than 1 operation
e.g. the worst case complexity for what I describe above is still O(n) for when you have to relocate, however all the other times it's only O(1)
blah blah blah apply some amortized analysis techniques, and you'll find that the overall time complexity is O(1) despite the really bad O(n) sometimes
@narrow mica
you're just doing .append() len(l2) times
No, addition takes len(L1) + Len(L2) time since it creates a copy
But += (extend) is same as Len(L2) appends
there is this question - Sort array by parity
which approach do you prefer
- In place sorting
- constructing new list
the questions asks to take all even numbers at the beginning and odd at the last
ah right, whoops 🥴
Does += really call extend?
I think it does L1+L2 and then reassigns
yes
but it is not fully equivalent
A = [1, 2]
B = [3, 4]
def f():
A += B
f()
this doenst work ^
But
A = [1, 2]
B = [3, 4]
def f():
A.extend(B)
f()
this works
Your example was what made me think there was reassignment like A = A+B
But it looks optimised
:white_check_mark: Your 3.13 eval job has completed with return code 0.
2.3563370406627655 2.2979408372193575
It is common to see people do
all available inplace ops in mutable types are always mutating
A+=1,
in codegolf instead of append
which is to say A += B for lists is always basically extending B to A (as a non-copying mutation)
!e
import timeit
L1 = list(range(10**5))
L2 = list(range(1))
setup_code = """
L1 = list(range(10**5))
L2 = list(range(1))
"""
plus_equal_code = """
L1 += L2
"""
extend_code = """
L1.extend(L2)
"""
plus_equal_time = timeit.timeit(plus_equal_code, setup=setup_code, number=7000000)
extend_time = timeit.timeit(extend_code, setup=setup_code, number=7000000)
print(plus_equal_time, extend_time)
+= and extend are basically exactly the same
a += b and a = a + b are not the same
:white_check_mark: Your 3.13 eval job has completed with return code 0.
0.46920054964721203 0.5193300461396575
Always assumed they were yeah
Im convinced now (not that they are)
Today I learned a new thing I guess
!e there is a much easier way to check this
a = [1, 2]
b = [3, 4]
a_id_before = id(a)
c = a + b
a += b
print(a_id_before == id(a))
print(a_id_before != id(c))
:white_check_mark: Your 3.13 eval job has completed with return code 0.
001 | True
002 | True
my least favourite python feature 😔
is is a better way to check, because if the list did disappear, a new list could get the same ID
Python 3.13.1 (main, Jan 9 2025, 17:09:58) [GCC 14.2.1 20240912 (Red Hat 14.2.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> id([])
139722956197056
>>> id([])
139722956197056
>>> id([])
139722956197056
can I even do an is check?
between what things?
I guess I could store a reference to a
!e
a = [1, 2]
b = [3, 4]
old_a = a
c = a + b
a += b
print(a is old_a)
print(c is not old_a)
:white_check_mark: Your 3.13 eval job has completed with return code 0.
001 | True
002 | True
yeah, that
!e
a = [1, 2]
b = [3, 4]
old_a = a
c = a + b
a += b
print(a is old_a)
print(c is not old_a)
a = a + b
print(a is not old_a)
:white_check_mark: Your 3.13 eval job has completed with return code 0.
001 | True
002 | True
003 | True
Kinda cool
What is 3.13t?
Is that like the beta version of the new update of 3.13?
free-threaded mode of python
https://neetcode.io/solutions/longest-repeating-character-replacement
Why does this work?
while (r - l + 1) - maxf > k:
count[s[l]] -= 1
l += 1
Why does this help when you MIGHT DECREMENT THE MOST FREQUENT CHARACTER?
A better way to prepare for coding interviews.
YOU AREN'T TRACKING MAXF SO HOW DOES THIS NOT BREAK THE SOLUTION??
!e
a = [1,2,3,1,2,3,4,5,4,3]
result = []
for k in a:
if k in a and k not in result:
result.append(k)
print(result)
:white_check_mark: Your 3.14 pre-release eval job has completed with return code 0.
[1, 2, 3, 4, 5]
Think about it this way: every iteration of the for loop in the sliding window solution is checking whether it's possible for a string ending at the r'th position to be the "longest repeating character after replacement" found so far. The while loop is removing characters from the beginning that cannot be part of the best solution that ends with that new rightmost character s[r]. Imagine you're allowed to do 1 replacement on a string like AABCCC. After 3 iterations of the for loop, your window holds AAB, and that's fine because you can replace the B with an A. After the 4th iteration of the loop, your window is expanded to hold AABC, which can't be turned into only one repeating character using only one replacement, so that while loop chops the first character off, shrinking the window to hold just ABC. And then the while loop fires a second time, seeing that you can't make ABC into one repeating character using only one replacement either, so it shrinks the window again leaving just BC in it, because the "longest repeating character after 1 replacement" that you can make looking backwards from the first C is only 2 characters long
In other words, as the for r in range(len(s)): loop is advancing through the string, at each step, you're finding the length of the longest substring which contains only one distinct character after at most k replacements and which ends with the character at index r.
You do that N times, finding the best length that can be formed ending at each possible position, and the best of those bests is the overall best
So with 1 replacement, on the string AABCCC, the best possible solution after the r'th iteration is:
0. A
1. AA
2. AAB (replacing the B with an A)
3. BC (replacing the B with a C, or vice versa)
4. BCC (replacing the B with a C)
5. BCCC (replacing the B with a C)
The while loop is handling that drop from it being possible to construct a 3 character string ending with the B at r=2 to only being possible to construct a 2 character string ending with the first C at r=3
i have this question
Intersection of multiple array and ** Intersection of two arrays**
Like does set(num1) is faster than manually adding and also while conversion and & operator
like poeple are doing it manually ( you could to show your interviewer) i think interal operations are somewhat faster
It is almost certainly faster to use sets
and can interviews argu that built ins are not allowed
Sure, the interviewer can place whatever conditions on you they want. They'd usually only tell you not to use some built-in or some part of the standard library if they were asking you how to go about building that thing yourself, though
Like "imagine dict didn't exist: how would you create your own hash table based mapping?"
man i have idea that list.append is O(1) amortised and remove is O(n)
what are add and discard in sets time complexity
O(1) amortized
They're based on hash tables, and hash tables have amortized O(1) insertions and removals, and O(1) lookups
okay so like answer would be i would create a structure that would take key and then store that key in array. that array would have hash value calculated in which position that key would be placed, in case of collision i would look for next present value
something like that and i would code that
Sure. The point is that the only plausible reason they'd tell you not to use a dict is that they want to see that you understand enough about how a dict works that you can explain how to implement it "from scratch"
Same goes for any other restriction they might place on you. The only good reason to say "don't use X" is because X abstracts away all the complexity, and they want to check that you understand the complexity that's being abstracted away
ah thanks man
i have like basic to intermediate python knowledge and basic data structs and algo
i was reading fluent python book
do you suggest i should shift to learn algos and data structs first or complete this one
I'd recommend taking a university course on data structures and algorithms. Until you're working towards a degree, you're better served by just building projects than by focusing specifically on data structures and algorithms
i have no academic knowledge
do i learn online any course because i think basics and intermediate level should be cleared before projects what you say
Practicing data structures and algorithms is usually one of the things that people do while preparing to apply for jobs. I wouldn't recommend applying to jobs without first getting a degree.
It's possible to get jobs without a degree, but it's much, much harder, at least in my country. The default path to employment is getting hired by a company after completing a degree
DON'T RELY ON COLLEGE TO GET GOOD AT ALGORITHMS. College can't teach you this. You can't get good at this unless you grind leetcode/neetcode. You can only learn it through self-learning.
but how are you supposed to come up with all that within 30 minutes in an interview, that's insane
At least minimum window substring wasn't too hard.
Most people learn it from their university classes, and then just need to brush up a little bit with something like leetcode when they're applying to jobs
I don't think most people would come up with this algorithm in an interview setting, at least not without help from the interviewer.
What about the O(1) space solution to the buying and selling stock question?
I'd expect a good candidate to be able to come up with solution 2 on that page, but 3 requires some insights that the interviewer would probably have to provide
Don't know what you're referring to
You have an array with positive numbers in it, and you need to find the maximum value where you subtract a number from another number at a higher index.
I couldn't figure out that you can update left once you find a smaller value.
Not a problem I recognize, so I can't really weigh in on how well I'd expect a candidate to do on it
But remember that interviewers are just using algorithms questions to gauge your coding skill. They're a means to an end, a conversation starter. You don't need to find the optimal solution while the interview sits silently by. They're a conversation starter
We're veering back away from #algos-and-data-structs topics into #career-advice topics, though
The ideal thing to do when you're asked an algorithms question in an interview is to start thinking out loud, describing the first approach that comes to your mind. Most likely the interviewer is gonna drop some hints to help you refine that to a better solution
if it is the problem I think it is the insight is just that if you look at selling at time t you need to know the minimum for times <t
i.e. conceptually you are interested in
max(a[i] - min(a[:i]))
computing this directly as stated would be inefficient, but the minimum could be computed smarter as just "the minimum so far"
because
min(a[:i]) = min(a[i], min(a[:i-1]))
also 
I see how it works, but I couldn't come up with it.
I tried making another array which records the maximums starting from each index, but that's O(n) space.
?? Of course college teaches you DSA, it usually focuses more on the theoretical side of it but I’ve had around 80-100 hours of lectures and exercise sessions on DSA back then
And that’s usually enough for job interviews you only need to « grind » if you want to do competitive programming
Mine had one class on data structures and one class on algorithms, and it didn't teach enough. I still can't do most leetcode puzzles.
How hard is it to come up with the O(n) solution to Sliding Window Maximum? I see on neetcode that it says to aim for O(n*log(n)) or better. Should I try to find the O(n) solution, or is that unreasonable and I should just look at the solution?
you don't need to be able to do all of them to pass interviews
at least in france/uk knowing how to solve mediums is plenty enough
I know, I can't solve them.
Even some of the easy ones are too hard for me. I couldn't solve the buying and selling stock problem optimally. Yet I got an A in both classes.
🤷♂️
Always try to find the answer on your own first would be my advice but eh whatever works for you
I looked up the deque solution, but it took me a minute to figure out why it wasn't O(n*k)
This is not a question you'd ask to test someone ingenuity
It is simply a question about something relatively standard
So it is about knowledge, and not ingenuity
What do you mean?
I haven't read the entire discussion
My point is just that O(n) solution to sliding window maximum is not really something you come up with on the spot
Instead, it is a well known technique. Commonly refered to by something like "Monotonic stack"
What about the dynamic programming solution?
I think DP problems can make good interview questions. You can usually solve DP problems on the spot without having to know specific techiniques.
It tests your ability to reason mathematically
But what about specifically for sliding window maximum, where you split the array into k-sized sections instead of using a monotonic queue.
I'm concerned about it because there's no way I would have thought of that.
Ye there are solutions like that. There are also things like sparse tables that (if optimized) can with O(n) precomputation find the maximum of any interval in O(1) time.
https://neetcode.io/problems/sliding-window-maximum?list=neetcode150
How does the fourth solution even work? I've been staring at it for a few hours and I can't figure out how leftMax and rightMax work.
I get that they show the maximums in a bunch of slices of k-sized partitions, but how does that let you find all the sliding window maximums? I can't connect the dots.
I think I figured it out after drawing some stuff in mspaint. They basically let you check the maximums of a left side and a right side of the sliding window at any position.
I really hope interviewers don't expect you to come up with solutions like that one, because it's fucking impossible, I can barely even understand the solution after seeing it.
https://neetcode.io/problems/daily-temperatures?list=neetcode150
WHY CAN'T I SOLVE IT. I saw that there's a solution with a stack and a solution with dynamic programming and I'm still completely lost. I don't get what the fuck people mean when they say all you need is to know the data structures and algorithms. I still have no clue what to do.
WHY did I ever try to get into this field, I hate programming this, I hate algorithm puzzled, looking at this insane bullshit makes me want to blow my brains out.
HOW do you people look at puzzles like this and just know what to do??
this is also a monotonic stack problem
try to break the problem down into pieces. what do you need to do, and what is your plan to do it? can you come up with a brute force solution?
by having done many similar ish problems before
Yeah, starting from each index, look ahead until the day is found.
can you write the code for it?
Yeah
It's not the code writing, it's thinking of the algorithm. I can't figure out the algorithms.
what code do you currently have?
I didn't write anything. I can't figure out what the algorithm is supposed to be.
for the brute force solution?
Why would I write that?
If you waste time on writing that then you'll just be wasting interview time.
ok i guess
They make you speedrun it in a limited time
You are learning right now, not in an interview I assume.
Okay, but still, why would I waste time writing a bad solution?
practice
Because of how the human brain works. It makes you learn to correctly interpret the problem, to give any solution makes your brain not lock up during the interview, confidence. The more simple solution cues your brain into internalizing the structure of the problem, the variables and their relationships.
Instead of thinking that you need to solve it all in your head and then take action, take action immediately.
Humans need to mess around with things to learn, and having to one-shot the problem makes that process happen less.
Imagine you want to learn to multiply numbers, you could try to do it all in your head despite having never done it before, possible, but really inefficient. Doing so makes the number of problems (samples) you go through much less than if you did not try to be perfect and instead just start doing.
In other words:
But I already know how the brute force solution works
But did you write it and run it? You may think you know, but actually running it will give you feedback on any gaps in your understanding of the problem.
I've looked at it and it was written basically how I would have written it.
i also asked you this because the optimized solution is just small-ish variation on the brute force solution
this is a common trap when solving problems. you think you understand, but the only way to truly test it is to try it yourself
(This is basically all of programming, otherwise there would not be all these bugs all the time, and security flaws)
The important thing of writing something down is that it extends your working memory, humans can't keep track of many things at once and tunnel vision on things, tricking themselves into thinking they have the whole picture.
Writing/paper was a huge invention because of this (not just sharing information with others).
(It's an augmentation, technology)
I just don't get it.
"There's a monotonic stack solution and a dynamic programming solution, and the optimized solution isn't much different than the brute force solution". I have this info and I'm still completely stuck.
I need to get IQ tested.
HOW WOULD A MONOTONIC STACK HELP YOU, YOU NEED TO KNOW THE INDEX POSITIONS, NOT JUST THAT A HIGH VALUE WAS SEEN SOMEWHERE.
I just don't fucking get it.
[1000, 900, 800 700, 600, 500, 400, 300, 200, 100, 101, 201, 301, 401, 501, 601, 701, 801, 901, 1001]
HOW would a monotonic stack help with this, you would still need to LOOK THROUGH THE FILLED IN VALUES MAKING IT O(n^2)
At 501 you need to figure out where 500 is, at 601 where 600 is. It's O(n^2). Knowing that somewhere back there there's a lower value doesn't help you.
so you need the indexes, right? what if you cached or stored them somehow?
I guess you could do that, but I don't think the dynamic programming solution does that since it's O(1) and doesn't store stuff, so I don't see why the stack solution would.
just ask the question
notably, given an index in the string, how would i tell what line and column it's on
what's the best way to do that
again, you don't need to jump to the most complex solution first
So you store the index along with the value??
try it
it worked.
If you don't figure it out then you get rejected from the job.
unless it's something that's wildly unreasonable AND not something that they'll say "you should have seen the algorithm before".
you're not in an interview. you're practicing. also, that's not true
I've never heard of a rope before, but could you store the count of \n characters in addition to the total number of characters?
but you would still need to know how many characters there's been since the last \n character.
this is what I was thinking of with underscores instead of newlines.
And if you delete the newline in node N, it would change to "0, X", and node A would change to "3, 7".
You might not need to store the distance, maybe you could just keep track of that when doing operations on it based on when the line count increases.
So in the naive solution you basically don't know how to answer the question of how many days for an element and to fix this you use future vision and look ahead (inner loop), then rewind time (inner loop done) and answer. But another way is to flip this around temporally, and instead use past vision, where the idea is that you don't know how to answer yet, so you save it for later until you can (memory). Then after answering you can remove that from your memory (if you no longer need it).
Dynamic programming is then saying, "I will use memory, but in the perfect order so it all nicely builds on itself in the perfect way."
I think you can definitely at least count how many lines there are.
Note that this memory approach is how you learn things. You experience things in arbitrary order, maybe don't know what to do about it and so you remember it for later, then after solving some other problem come back to it.
one could try keeping just the count in the subtree
and do two traversals, one for "what's the count C up to index i?" and one for "what's the last index with count C-1?"
couldnt you just store the number of newlines in the node and then the index of the last newline relative to the node's length?
where do these numbers come from
Apologize if this is the incorrect channel, struggling to formulate this into a problem I can Google, why I am asking here.
I want to create a parser that will taken a string and decompose it into a series of operations to perform.
Essentially I have a string of the form <prefix>_<metric>_<suffix> which <prefix> being optional. Both <prefix> and <suffix> are operations to perform. <prefix> refers to some sort of selection operation, and <suffix> is an aggregation operation. So let's say I have a string initial_detection_time_mean. The parser would breakout initial, detection_time, and mean. What would happen is the resultant would be taking the first detection time from each sample and averaging it.
I want to make something robust, that isn't a series of conditional statements.
Is there a topic that I could look to for how to efficiently do this? Am I making my own kind of domain specific language? Do I have write my own grammar and parser?
EDIT:
To expand a bit more the result of initial_detection_time_mean would produce an expression that I would eval on a dataframe such as series.groupby(grp_vars).first().mean() where series (the data representing the metric) and group_vars are injected dynamically. Or maybe I go a step further and auto generate functions, I'm not sure. Still thinking about it. However all possible values for <prefix> <metric> and <suffix> would be known apriori
You are recreating SQL. You can simply split by underscores. However note that if instead of "initial" or "mean" you had one or both of those be something with underscores in it you can't tell where each part begins and ends.
I understand that. I can't introduce technologies. This is limited to pandas & dask.
You are introducing a DSL.
Yeah but its in house
If you want to introduce an in-house thing then I would make basically DuckDB from scratch if you already are allowed the time to do all this and introduce a new thing.
Adding this DSL on top of Pandas does not really do anything? I'm not seeing how this does work for you. It's basically different syntax for the same thing.
(It only adds more complexity)
I understand where you are coming from, and I can appreciate you trying to guide in a different direction
With that said, is there any yes/no on these
Am I making my own kind of domain specific language? Do I have write my own grammar and parser?
Yes, yes.
Although writing that is not too complicated.
Can you just define keywords in a grammar? Specifically for the prefix/suffix bit
Yes.
I'll also say, I'm not 100% sold on this. I'm really just trying to explore some solutions to reduce the verbosity of a code base im inheriting.
Which is built around pandas/dask, but uses them in non conventional ways. Lots of dask.delayed calls instead of using dask.DataFrame. Its tought to get into specifics, as I'm still learning it
Ok, well the way you deal with that is by removing things / replacing them with something else. Adding more stuff on top to cover it up is a common trap that results in a giant wobbly Jenga tower that nobody can work with anymore.
But its a large code, which many BAs at my job relay upon, and have to actively contribute to, so i can't just go introducing massive paradigm shifts
Each layer you add to some software is some amount of imperfect, and these add up.
Right, we are trying to remove some stuff. So the problem is, we have a lot of boiler plate computations, huge repetitions with small name changes (the prefixes and suffixes) and defining them at import time is causing massive slow downs (and to make matters worse, the number of definitions scales as a product of computations and input files). We're looking for way a to still perform the computations but in a more dynamic way.
You must convince them that Pandas or whatever they are using is not scaling well, and that they should look for alternatives to replace it with (ideally with better performance/stability/etc too while you are at it).
Yeah so Python Pandas is good for exploration of data, but it's not designed for scaling of many data manipulations in a non-exploration/non-notebook kind of setup.
Oh I know that, I believe original author (who has left the company), believed he could scale things better with Dask and our massive computing cluster
it just hasnt panned out like that
Regardless, i need to make an attempt and possible pro/cons. Make a story and explain why this is probably not the best long term solution. Justify a full rewrite
So I'm with you.
If Python itself is causing issues with import times you may need to consider that Python may not be the right tool for the job either (you can usually still have it as a top level glue/code director role).
I really don't blame python too much. I blame the way we're doing this
Yeah, why I wrote right tool for the job. Python is not exactly meant to have tons of this at import time.
IDK if you can change that and keep Python or not without more details.
We do monte carlo trials, and have a list of a few 100 metrics we calculate. The way this thing works, we create this massive dask task graph out of dask.delayed function calls, one function call for each metric multiplied by number of monte carlos. Its honestly ridiculous
we spend a huge amount of time just setting up the graph
Yeah, there just comes a time when you outgrow the tools being used / or the setup of those is just wrong.
And stacking more on top to improve developer experience results in digging a whole future devs can't climb out of.
It's like sweeping under the rug instead of actually cleaning.
(Solving the core issue(s))
(It's why most software is slow, bloated, and buggy now)
(Removing code is good)
(With this DSL you are heading towards https://en.wikipedia.org/wiki/Inner-platform_effect )
The inner-platform effect is the tendency of software architects to create a system so customizable as to become a replica, and often a poor replica, of the software development platform they are using. This is generally inefficient and such systems are often considered to be examples of an anti-pattern.
On the other hand, such functions are often created to present a simpler (and often more portable) abstraction layer on top of lower level services that either have an awkward interface, are too complex, non-portable or insufficiently portable, or simply a poor match for higher level application code.
Feel like that fits the bill big time
Since software is all about automation we often fall into this trap because once we feel some pain point instead of fixing the core issue we instead tend to automate it away. But it's like imagine in some strange world we had a rule that everyone needs to get a wood board broken over their heads to enter any building (like one of the breakable ones that hurts a bit but not too much). Then the person making these starts to feel the pain of having to constantly make these by hand. To solve this they make a board factory (automating the problem away). But the real solution is to just stop hitting everyone on the head with a board (removal/replacement of a thing, not adding more on top).
Instead I recommend not automating it away and feeling that pain point, as it's an indicator that something is wrong, and it's important to be able to still sense that, rather than push it off for a later time when it hurts even more.
Automation amplifies a thing. This includes things that should not be that way to begin with.
(LLMs spewing out code is the extreme form of this, the end game of this kind of cover up of bad design (code that should not need to be generated))
They're the "_" count in the left subtree and the distance from the last "_" in that subtree.
But it might be easier to do this instead of storing the distance.
it's okay, i just stored a list[int] of the relative indexes of newlines inside each node
storing indices will make updates terrible
i removed like 1.2 million characters and release did it fine in some 10 ms or whatever
i spoke about those concerns with the people in the rust server and they said it wasnt really worth worrying about cloning a Vec<u16>
either way, i can always optimise later down the line
if you actually use a rope for the stuff rope is good for you would care about update costs 
It would probably be simplest to just store the line count, then you can find line number and line position similarly to how you find an index positions.
it really shouldnt be an issue
we'll see though
it's more that if you don't need the ropeyness of rope, just use a string
no
efficiency
i think?
i already made it anyway 
to be sure i asked in both rust servers so we'll see what i get bac
what kind of sizes? what kind of operations?
Why are the variable names on the neetcode solutions so dogshit? It really pisses me off, I can barely understand what I'm looking at.
def __init__(self, N, A):
self.n = N
self.INF = int(1e9)
self.A = A
while (self.n & (self.n - 1)) != 0:
self.A.append(self.INF)
self.n += 1
self.tree = [0] * (2 * self.n)
self.build()
WHY DID YOU NAME THEM N AND A?? GIVE THEM DESCRIPTIVE NAMES.
inserting and deleting which is basically all you do in a text editor
n for number, a for array
but yes those don't look very good..
why did they have to make INF an attribute?
from my chess days, i know that n & n - 1 is a way to get the LSB
i dont know why they're adding to it
yeah i cant read this shit fuck
https://neetcode.io/problems/largest-rectangle-in-histogram?list=neetcode150
It's from the segment tree solution.
so for each update you rewrite possibly all of that array? 
unless you update the whole of the string, you wont update the whole of the array
idk, I don't mind n and A
it's basically "increment until power of 2"
isn't there a better way to solve this..?
but inserting a line shifts everything around
maintaining the count of newlines would make for easy update logic and pretty simple query logic
well you can always just get the MSB and << 1 it
it shifts only the lower half
in fact, i'll bench inserting like 100 newlines into it
half is still linear work
➜ cargo run
Compiling terminal-editor v0.1.0 (C:\Dev\Projects\Rust\terminal-editor)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.62s
Running `target\debug\terminal-editor.exe`
Inserting 1000 lines took: 109.3424ms
➜ cargo run -r
Compiling terminal-editor v0.1.0 (C:\Dev\Projects\Rust\terminal-editor)
Finished `release` profile [optimized] target(s) in 0.97s
Running `target\release\terminal-editor.exe`
Inserting 1000 lines took: 6.0153ms
seems like it works fine
if your input is small mostly anything will do fine
it's 4.5 megabytes
how many lines?
lemme check
31k * 1000 is pretty small
I just looked up the O(1) space solution.
I swear I've come up with "find the answers starting from the end" types of solutions before. It's really frustrating that I couldn't come up with that for this one.
It's concerning, it makes me feel like I'm not absorbing what I need to.
Instead of just memorizing starting from the end, it's more about the going in the right order/direction in general, but you know that direction from the data dependencies. Ask yourself what you need to know to answer what result[i] is. You know from the brute force solution that you need stuff to the right (search ahead to the right). So you can imagine arrows pointing from the element to the right to the current element (dependencies). So now you want to go in the order the flows best with the direction of the dependencies, which is right to left (starting at the end).
Note that the stack solution is still in the wrong order, and so it needs to store extra things for a while to basically fix this wrong order.
The brute force solution just looks ahead and rewinds to fix the order (so more work instead of memory).
Which comes back to the directed acyclic graph (DAG) and its relation to dynamic programming (modeling dependencies) and in general being a great tool to think about how to solve problems.
This is also why sorting can often be a good first step as it puts things in order (sometimes, if it does not outweigh the rest of the work on its own).
I'm actually looking at it again, and I can't really tell why it's guaranteed O(n). It seems like there could be some kind of input which would lead to O(n^2) time.
[10,1,2,3,4,5,6,7,8,9,10,11]
If you have this then j will need to go through the whole array.
How do they know there's no input where a lot of the inner loops will have to iterate over some fraction of the array causing O(n^2) time?
I think that's probably why I didn't think of it, it seems more like an optimization for the O(n^2) method rather than a guarantee of O(n).
How many times does it go through the whole array?
When will it go through the whole array?
How many iterations on average per outer loop iteration?
I don't know.
[10,1,2,3,4,5,6,7,8,9,10,11,10,1,2,3,4,5,6,7,8,9,10,11,10,1,2,3,4,5,6,7,8,9,10,11]
but if you have some pattern like this then it will need to go through a fraction of the array a lot, every time it gets to a 10 or 11. But in this case it will have to go through n / 3 of the array 6 times 3 times, so maybe it's one of those situations where it might need to do c operations n / c times, resulting in O(n) overall. Is that what's going on?
I can't think about things like this, it's way too hard.
Is it really necessary to remember all the approaches (brute force, better, optimal) to solve a data structures and algorithms problem?
Or only the optimal approach will do
Let's say it's an interview and the interviewer asked to explain the solution of a problem, then will I go straight with the optimal code or do I have to start with brute force approach first and then optimal
I've heard people recommend that you should just go right into trying to come up with the best possible solution in an interview, but the users here might disagree with that.
The programming job market is oversaturated and you're competing with geniuses, so I don't see why you wouldn't just go straight to the best possible solution if you can.
What I've heard is that if you fuck around talking about suboptimal solutions, you're just wasting time. The interviewers often put you under a time limit to solve all their time complexity riddles, so you have to try to get through them as fast as possible.
Does it go through a third of the array 6 times?
yes
What happens when you reach the first 10 after the 1?
it goes through 1/3rd of the array.
And now when you get to the 11 right after that 10?
Putting aside the rest of the array, how many iterations has the inner loop done in total over all outer loop iterations?
it could need to go through n / c of the array c times, right? which is O(n).
with that type of pattern.
Ah, but what do you need for it to be N^2?
.
neetcode says it can't be O(n^2)
But why? When is something n^2?
Lets say you are on the first 10, how many iterations does the inner loop do at that index?
So if you count all the iterations done for the inner loop over the whole array, how many?
If you add them all up.
I know that specific example is O(n)
and that pattern in general is O(n)
because it's O((n / c ) * c) which is O(n).
honestly it's not about rote memorization, it's about different properties of the problem lending itself to different approaches
a lot of the time a better solution is just brute force plus some insight about the problem
I know brute force might give insight about a problem but I am already trying to remember so many algos and stuff and upon that u are saying I also have to understand and memorize three approaches: brute, better, optimal for every program i practice
understand more, memorize less
Bruh
you need to memorize some stuff
you memorize general patterns
At this point I don't even know how to explain
not super specific quirks for super specific problems
Anyone here with a trading algo with track performance?
Like can someone really remember all the approaches for every program ?
I know u meant understanding, but still
They expect you to just come up with the solution by yourself, not memorize it. But for certain algorithms, you need to have others memorized.
Like Kadane's algorithm.
You probably won't just think that up on the fly.
Unless you're really smart.
Don't try to memorize anything. Just grind leetcode and hope that you pick up what you need along the way.
of course you need to have some building blocks
If they ask you questions where you could only really answer it by having already memorized it then you might want to ask yourself if you really want to be working for that company.
You're right, but apparently that's what they do.
But expecting to not have to know any DS&A and that any such questions are unfair is not a good sign either.
for a lot of problems it's all about being able to analyze a problem and seeing what building blocks that can be used
When did you guys get employed?
A company that knows what they are doing in the hiring will ask pretty reasonable questions, and may not even expect you to get all of them as long as they can see that you can reason through things and more importantly are a good fit for the company (personality).
just being good enough to GET an interview is a nightmare now.
It means that they don't really want to hire anyhow in that case.
They probably already know who they want.
2019
There is no guarantee that there will always be developer jobs for those kind of companies you are trying to get into.
Or in general (who knows what the future holds).
Economy is not static with forever jobs.
you basically need to memorize all the algorithms that have names
or not ALL of them, but the common ones, and then hope that the interviewer doesn't give you a puzzle which uses one of the obscure ones which you haven't memorized.
And then on top of that, you need to hope that you're smart enough to quickly recognize what algorithms you need to use for any new problem.
You should probably grind leetcode BEFORE going to college and sinking a bunch of time into pursuing a career. Don't make the same mistake I did.
I got a degree, but I'm seriously questioning if I have the ability to solve algorithm puzzles as proficiently as companies demand. So I may have basically wasted years of my life for nothing. Don't put yourself in that position.
And you ALSO apparently need to do a ton of personal projects which happen to use the things that the companies are looking for.
i mean, back then dp wasn't as prevalent, but nowadays you could reasonably expect someone familiar with dp to get the linear solution
It's not reasonable, it's clever and there's no way you'll just randomly come up with it as an average midwit guy.
I mean it's not reasonable to expect a normal person to come up with it.
These jumps in the inner loop can be seen as forming a path in a graph, so from the 11 you start at 10, then jump to the right most 11. After one of these edges has been used, will it ever be used again? This might help.
They are not looking to hire average.
It's "reasonable" for an employer to expect one of their applicants to come up with it since the market is oversaturated, so now every job opening has the opportunity to grab someone who's a standard deviation or two above the cognitive norm.
I know
At least not for some high pay.
i said it's reasonable for someone familiar with dp to come up with it, which is not the average person
But other companies are, don't need to end up in the super top tier prestige place. You can gain experience there and keep practicing.
Anyone can be "familiar with DP", that doesn't mean anything.
You could be "familiar with DP" and still fail to come up with Kadane's algorithm.
Unlikely, if you solve 2 DP problems a day for like a year (to not come up with it).
If you do that then you've probably already seen Kadane's algorithm.
Assuming you did not.
You need to be smart enough to spot the clever galaxy brain trick, you need to be able to look at it and then figure out that you can reset the running sum.
even assuming you have, even for similar problems
to attempt a dp solve, you'd look at how much info you need to compute the answer for an array given the answer for the array without the last element
It's not just a matter of having seen DP solutions before.
otherwise you'll probably end up coming up with an O(n) space solution.
A core part of learning DP is knowing how far back you need stuff from, and what you can then throw away, and hopefully that requires constant space.
not really? if you start with an O(n) dp solution, it's pretty straightforward afterwards to go "ok, how much of the dp table do I actually need to store at any given time"
Try Fibonacci sequence DP to see what I mean.
Start with O(N) space.
DP solutions themselves are decently formulaic. Even if you are not sure why it's O(N) like in the temperature problem, you can assume that the DP solution is probably decent and just try it, then try to see later why it's O(N).
It's like a leap of faith in DP working out, similar to the recursive leap of faith concept.
Then why is the O(1) solution named after someone?
if it's so simple to get from O(n) to O(1)
because that was years ago, and dp wasn't as prevalent back then
it's not as trivial as you're saying it is.
It was apparently prevalent enough for them to have came up with O(n) solutions.
it's not trivial if dp isn't a core part of dsa like it is nowadays
The O(n) space solutions use DP too
This is like asking why Pythagoras's theorem is named. It's trivial these days to see it given all the context we have now.
I'm not discrediting kadane for coming up with it in their time, it is fairly impressive
it's just that nowadays, we have a much higher base level of knowledge
You are expected to have that context.
Inventing pythagorean theorem isn't trivial
Correct, and if it were back then, then asking someone to invent it on the spot would be absurd, but that was then.
Now if you can't you are considered to not know the basics of math.
you're saying you could come up with pythagorean theorem if you had never seen it before??
If you have everything but it in Euclid's Elements you can do it, it follows naturally.
It all builds on top of itself, same thing for DS&A.
Okay
Making the leap much smaller.
it's not about whether you can come up with it or not
i probably couldn't come up with calculus myself, but that doesn't make it unreasonable to ask calculus questions in a test, because it's expected that you have spent time studying calculus and are not going into it blind
I'm sure it feels great to have an exceptionally high IQ and to be able to just come up with things like that. That's really cool for you.
🗿
But most people can't do that.
similarly, interviews can expect that you have done a fair amount of study on dp beforehand
If you had the original context, yes, it's absurdly clever. Although even Pythagoras had some context, there was much math still happening at the time and he surrounded himself with them.
(Including the study of geometry, was a big thing)
honestly, even if you've seen dynamic programming before, it's still hard to come up with new solutions which use it. it's hard for the midwit-tier mind to even think like that.
They should give people more free IQ tests in school.
and that's why people study and practice so much, so they can think like that
No they're just smarter.
don't attribute to effort what can be explained by inherent ability.
if this is the mindset that you choose to stick with, then idk what to tell you
I think that is over estimating IQ difference. People with high IQ that you hear of are constantly practicing, much more than the average person.
Sure in the like 160s range, yeah, but developers are not that.
did you not go through the whole growth vs fixed mindset thing in elementary school
I'm not saying anything or "sticking with a mindset", it just irritates me when people say "bro I put so much effort, I practiced so much", when really they're just reaping the benefits of lucky genetics.
Social seems to be a larger factor. Einstein would have made basically zero progress without all the people he constantly talked to and surrounded himself with for example.
Einstein was a genius, he was incredibly genetically lucky.
Yes, but it's not enough.
so what?
Over estimating how much that matters.
most of the time this is not true. inherent ability is only a small part of it, which can be overcomed with effort
His personality, being very social and kind, was his biggest boon in combination with his skill (creativity specifically).
but anyway this is probably straying off topic
THAT'S COPE. That's so toxic, that type of mindset encourages people to throw away years of their lives.
If you have an average IQ then trying to get into high IQ fields is a huge mistake.
There needs to be more gatekeeping.
Average IQ is highly functional, it's average, not below average. Average intelligence humans can do A LOT (with practice).
Anyways, if you're really great at time complexity algorithms and things like that, be thankful for being mentally gifted.
let's move to #ot2-never-nester’s-nightmare . this channel is for dsa
IQ is a social construct
This gotta be rage bait
class Solution:
def generateParenthesis(self, n: int) -> List[str]:
output_list = []
# the first pattern is always "()" * n, so just skip ahead to that point
ch_stack = ["(", ")"] * n
opens = 0
n *= 2
while ch_stack:
if len(ch_stack) == n:
if opens == 0:
output_list.append("".join(ch_stack))
# this should always be ")"
del ch_stack[-1]
opens += 1
while ch_stack and ch_stack[-1] == "(":
del ch_stack[-1]
opens -= 1
if ch_stack:
ch_stack[-1] = "("
opens += 2
elif opens > 0:
ch_stack.append(")")
opens -= 1
else:
ch_stack.append("(")
opens += 1
return output_list
This is my solution to the Generate Parentheses problem. Is it normal that I can't work this out in my head? I had a basic idea that it would use backtracking, but I had to write out the process step by step and refer back to what I wrote while writing the code.
It's totally normal and actually great that you were able to solve it. But when you solve a problem try to look for other solution that are better than yours or optimal. You have idea that it could be solved with backtracking then try to solve it with backtracking, if you can't even after trying then look up the solution and try to solve it the next day.
https://neetcode.io/problems/car-fleet?list=neetcode150
Do they give you the cars in unsorted order just to troll you?? That kind of pisses me off. It makes it so the first thing you need to do is sort the data, and then you're instantly thinking "OH NO, do I need to do that? Is this a suboptimal solution? Is there a way to do this in O(n) time instead of O(n*log(n))?". Why the fuck did they not just give it to you in sorted order??
not sure why you wouldn't go with recursion for this
!e
def f(s, n, depth):
if n == 0:
yield s
return
if depth:
yield from f(s + ')', n - 1, depth - 1)
if n-1 >= depth+1:
yield from f(s + '(', n - 1, depth + 1)
print(list(f('', 6, 0)))
print(list(f('', 8, 0)))
:white_check_mark: Your 3.13 eval job has completed with return code 0.
001 | ['()()()', '()(())', '(())()', '(()())', '((()))']
002 | ['()()()()', '()()(())', '()(())()', '()(()())', '()((()))', '(())()()', '(())(())', '(()())()', '(()()())', '(()(()))', '((()))()', '((())())', '((()()))', '(((())))']
one could avoid string ops and do list append/pop instead, but meh
same algorithm
I dunno, I guess because I just suck at coding.
Every time I come up with a solution, the answer on the site is way shorter than mine.
so study it, see what things they did that you didn't, learn some techniques
what does "if n-1 >= depth+1:" check? I can't understand it.
it's just bailing if adding a ( would make it impossible to terminate with a correct bracket sequence
i.e. after adding a (, is it possible to get back to 0 depth even with all )
!e one could move the check up top
def f(s, n, depth):
if n < depth:
return
if n == 0:
yield s
return
if depth:
yield from f(s + ')', n - 1, depth - 1)
yield from f(s + '(', n - 1, depth + 1)
print(list(f('', 6, 0)))
print(list(f('', 8, 0)))
:white_check_mark: Your 3.13 eval job has completed with return code 0.
001 | ['()()()', '()(())', '(())()', '(()())', '((()))']
002 | ['()()()()', '()()(())', '()(())()', '()(()())', '()((()))', '(())()()', '(())(())', '(()())()', '(()()())', '(()(()))', '((()))()', '((())())', '((()()))', '(((())))']
Yes that is normal, after you do something many times on paper you can start doing it in your head. After you have done it many times slowly and consciously in you head, you can start doing it subconsciously. It's important to play around with example cases that you make up yourself, typically the simple cases (and most simple ones being the base case). https://www.youtube.com/watch?v=ngCos392W4w
In this video, we take a look at one of the more challenging computer science concepts: Recursion. We introduce 5 simple steps to help you solve challenging recursive problems and show you 3 specific examples, each progressively more difficult than the last.
Support: https://www.patreon.com/reducible
This video wouldn't be possible without the...
This is the kind of thing you add to your list of things to check / tools. You think "but what if it was sorted already instead?"
And this generalizes to the general problem solving tool of asking yourself the more simple version of the same problem. So basically if you get stuck and it seems like a convoluted mess, try asking yourself the more simple version of the same problem that is easier first, and solve that. Hopefully that will then give you insight into solving the more complicated verson. This is how mathematicians/scientists break down complex things that seems completely impossible to understand at first glance.
In the recursion case, this translates to asking yourself the solution to the same problem but with less inputs (until it's so simple you understand it and it forms the base case), then finding/hoping there is a relationship between those and more complex cases (that you also messed around with).
That makes sense, but it just seems trollish to let the person trying to solve the problem think that there might be an O(n) solution.
But at least neetcode lets you know what time and space complexity to aim for.
In a real world scenario you don't know if there is an O(n) solution or not, it's possible that you will waste your time looking for one. That's just the way it goes, you don't know what is worth looking further into and when to give up and try something else. But you can always at least try something worse (the naive solution) and from that you can often get a better guess if there may be something better, since you understand the problem better after having created any solution at all.
but do interviewers troll you like that?
If you give a naive solution they may prompt you for a better one. Something like "that is great, but can you do better?"
If the applicant can then show how given the knowledge they now have of the problem how they might do better (give some list of potential things that might work), then it's a key indicator of problem solving ability.
Simply even given any list at all of things to try out is a good indicator (not just being frozen and giving up).
They might ask you to formally prove you can't do better than a certain complexity I guess
though that's a bit theoretical
I think one argument is that you by necessity need to know the car directly ahead of every car
which requires sorting
(if it could be done better than sorting then you would be able to sort better using that approach as well, which is a contradiction)
https://neetcode.io/problems/largest-rectangle-in-histogram?list=neetcode150
If I was asked how to do this in O(n) time then I would be frozen and give up.
How the fuck is a stack supposed to help you with this?
At least this one is marked as hard difficulty.
It’s a monotonic stack I think
Why would a monotonic stack let you solve it in O(n) time?
It doesn't help you quickly find minimums over any array slice.
You would still need to fill it for each slice.
Neetcode’s video is pretty good if I remember correctly
But you shouldn’t be doing hards before solving all the easys and mediums if you’re following neetcode’s thing
I don't want to look at them, I want to figure out an O(n) solution myself.
I am, I did the previous stack problems.
No all the easys and mediums of all the categories
Otherwise I think the difficulty ramps up too fast
But do what works for you
I guess
That’s just my two cents
I see
I think it would be a monotonic stack of (height, start_index)
when you ||encounter a lower value than the top value||, you ||handle the rectangle of size height * (cur_index - start_index)||
!e i.e. this
||
def solve(heights):
stack = [(0, 0)]
max_area = 0
# Natural boundary condition.
heights.append(0)
for i, h in enumerate(heights):
start = i
if h > stack[-1][0]:
stack.append((h, i))
else:
while stack and h <= stack[-1][0]:
height, start = stack.pop()
max_area = max(max_area, height * (i - start))
# The new hight stretches back to the last popped index.
stack.append((h, start))
return max_area
print(solve([7,1,7,2,2,4]))
print(solve([1,3,7]))
```||
:white_check_mark: Your 3.13 eval job has completed with return code 0.
001 | 8
002 | 7
hye, can anybody send the solution for this problem?
modify the graph a bit and run dijkstra with some tweaks on it
e.g. the usual dijkstra:
...
if dist_j <= dist_i + cost_ij:
dist_j = dist_i + cost_ij
```normally `cost_ij` is just `weight_ij`, but you now also have to consider that the lights must be the same color, so you can do like `cost_ij = weight_ij + wait_ij` where `wait_ij` is how long you have to wait until the lights match up
and there's 2 states a junction can be in - blue or purple
so while normally a usual graph's nodes are uniquely identified by their indices, maybe here you instead want to use (index, light_color) to identify a node instead
I finally came up with something.
class Solution:
def largestRectangleArea(self, heights: List[int]) -> int:
answer = 0
iter_heights = iter(heights)
# (height, distance from previous smaller bar)
stack = [(next(iter_heights, 0), 1)]
for h in iter_heights:
if stack[-1][0] < h:
stack.append((h, 1))
elif stack[-1][0] == h:
stack[-1] = (stack[-1][0], stack[-1][1] + 1)
else:
dist_behind_current = 1
while stack and stack[-1][0] > h:
height, width = stack.pop()
tmp = width
width += dist_behind_current - 1
dist_behind_current += tmp
area = height * width
if answer < area:
answer = area
if stack and stack[-1][0] == h:
stack[-1] = (stack[-1][0], stack[-1][1] + dist_behind_current)
else:
stack.append((h, dist_behind_current))
dist_behind_current = 0
while stack:
height, width = stack.pop()
tmp = width
width += dist_behind_current
dist_behind_current += tmp
area = height * width
if answer < area:
answer = area
return answer
why are my solutions always so big, they're always 2-3 times as big as the site's solutions. Even though I'm doing the same thing they're doing.
Actually it looks like they stored index positions rather than distances from lower heights.
@fervent saddle my solution is here
(I didn't try to submit it, but I think it should be correct)
I wouldn’t worry about that too much
fwiw, I think your elif is not needed at all
the else case does the exact same thing in that case
and if you look at my solution I do two things which simplify the logic a lot
I add a 0 height entry to the stack, which means not having to think about the case of an empty stack
I also add a trailing zero to the input, which removes the need for the handling you do after the loop
this is a kinda common pattern, being able to add some natural terminator which makes special cases go away
in string algorithms this could be adding a character that is outside your range of valid characters (e.g. tack a $ at the end of a string)
I know, but I thought it might make it a little bit faster.
it doesn't need to check the while condition
I looked at the space optimized solution yesterday where the stack only stores indexes from heights. Now I'm going to try to write that from memory. Is that a good learning / memorization strategy?
probably
if you have grasped the idea of the solution you should be able to implement it
MINE:
class Solution:
def search(self, nums: List[int], target: int) -> int:
low = 0
high = len(nums) - 1
while low <= high:
mid = low + ((high - 1) // 2)
if nums[mid] > target:
high = mid - 1
elif nums[mid] < target:
low = mid + 1
else:
return mid
return -1
THEIRS:
class Solution:
def search(self, nums: List[int], target: int) -> int:
l, r = 0, len(nums) - 1
while l <= r:
# (l + r) // 2 can lead to overflow
m = l + ((r - l) // 2)
if nums[m] > target:
r = m - 1
elif nums[m] < target:
l = m + 1
else:
return m
return -1
Can someone explain how it's possible that mine causes time limit exceeded and theirs doesn't when it's THE EXACT SAME FUCKING THING??
HOW IS IT POSSIBLE?
I hate programming so much. WHY did I ever try to get into this??
WHERE
check your calculation of mid
I hate programming
protip: try to add some prints to sanity check stuff
WHY can I not even figure out how to write bbinary search??
I hate programming so much
I hate off-by-1 mistakes
off-by-1 mistakes are the bane of my existence
this wasn't off by one
This is normal, it's why print debugging, visualizers, debuggers, assertions and loggers are used all over the place.
I know it's not but I was having that problem earlier
I much prefer inclusive l and exclusive r
Shouldn't it not matter unless you're using signed ints?
and the array might be really really big?
it is true for languages with fixed sized ints
but bs for python
IMO binary search is one of those algorithms where it's totally necessary to think about invariants. Decide beforehand what variant you're going to maintain, e.g. "the target is always in arr[l:r]" and that'll inform your choice of update function, etc
I much prefer the invariant of f(l) true and f(r) false (or the reverse)
that way l and r always straddle the transition
How large is your input array? I think you may have other issues before that happening (in C).
this also translates better to more general binary searches I think, e.g. with floats
binary searching isn't limited to arrays
Yeah, but it is in this case.
if you use a 32 bit type you might still hit this
Although outside of arrays, this error would still be rare as in those cases you would be well aware that you are dealing with massive numbers.
Yeah, but this is why you use 64 bit ints by default for everything typically.
They could also use 8 bit, but who does that?
it doesn't matter if it overflows, the result will still be the same. it only matters if you're trying to avoid undefined behavior. Right?
nevermind
that's wrong
that's stupid, what am i talking about
If you get an incorrect search, and that now causes the nukes to be sent, it's a problem. There are more issues than undefined behavior.
that's obviously not the case when you're doing DIVISION.
this kind of thing is my go to way to write binary search
def bs(l, r, cond):
while r - l > 1:
mid = (l + r)//2 # or whatever other thing
if cond(mid):
l = mid
else:
r = mid
return l, r
l = 0, r = len(arr) in this case
it wouldn't matter if all you were doing was addition and subtraction AND you expected the final result to be in range.
I find this invariant so much nicer
This is the version I usually do too.
HOW do you even figure out that this works just by looking at it?
I can't think like this
wdym?
I mean that my brain stops working once I need to think about borders
the only invariant that is kept is that f(l) is true and f(r) is false