#algos-and-data-structs
1 messages · Page 13 of 1
Hmm that stack overflow link or creating your own view seems like the best options.
Are you particular about using a global list specifically?
One idea is maybe changing your global list into a doubly linked list. That way it has order.
Create a function to return subsets (splices) of the linked list and modify them, then link them back into the global list using the prev and next pointer
Here's a simple demo with singly linked list (let's say you wanted to modify and append to slice root[1:3]):
class LL:
def __init__(self, val=None, next=None):
self.val = val
self.next = next
root = LL(140.000)
root.next = LL(1)
root.next.next = LL("Channel #1")
root.next.next.next = LL(True)
start = root
for _ in range(1):
start = start.next
start.val = 2
end = root
for _ in range(2):
end = end.next
end.val = "Channel #2"
temp = end.next
end.next = LL("ChannelType.Sampler")
end.next.next = temp
temp = root
while temp:
print(temp.val)
temp = temp.next
# should be:
# 140.0
# 2
# Channel #2
# ChannelType.Sampler
# True
def findMinCutCost(m, n):
cost = n * len(m)
for i in range(len(m)):
cost = min(findMinCutCostImpl(m, n, i), cost)
return cost
def findMinCutCostImpl(m, n, i):
if len(m) == 1: return n
cl = 0; cr = 0
if i > 0:
cl = 10**10
ml = m[0:i]
nl = m[i]
for j in range(len(ml)):
cl = min(findMinCutCostImpl(ml, nl, j), cl)
if i < len(m) - 1:
cr = 10**10
mr = m[i + 1:len(m)]
nr = n - m[i]
for j in range(len(mr)):
mr[j] = mr[j] - m[i]
for j in range(len(mr)):
cr = min(findMinCutCostImpl(mr, nr, j), cr)
return n + cl + cr
m=[2,8,10,15]
n=30
print(findMinCutCost(m,n))
here's what i understand so far:
cl = cost left
cr = cost right
m = array of cuts to be made, ints
n = string length
ml = segment, left to middle
mr = segment, middle to right
nl = ?
nr = ?
is it possible to have a dyn prog approach without the classic dp table?
Hey @fiery cosmos!
You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.
see here:
what is meant by dag in this context
eg:
"every dynamic program has an underlying dag structure"
oh
directed acyclic graph
did i share the google AI beating strassen's algo here?
Is lambda considered a form of list comprehension?
solving by hand you should at least be able to tell which one is correct, right?
no
Thankyou.
this is Levenshtein distance, no?
that particular instance may be, but they seem to be arguing in the book chapter that every dynamic program has an underlying directed acyclic graph structure
I mean yes
otherwise the problem couldn't even be solved
having a DAG structure means (in this case) there exist some order you can solve things in so you only depend on things already computed (i.e. a topological ordering)
things not being a DAG means having a cycle in the graph
e.g. A(0) depends on A(1) depends on A(2) depends on A(0)
you can't start anywhere in that example digraph
right right ok
how can i turn this into a change-making program
rather, how can i make a dictionary where each key is on of a series of coin denominations from an input array
and each value of each key initialized to zero
oh i got it. you do:
mydict = {}
for i in denom_arr:
mydict[i] = 0
ask in #python-discussion it's not relevant to the topic of this channel
oh sorry
is there a return to loop header with current i value function?
no problem, a lot of people randomly end up here, probably because it's the first topical help channel 🤷♀️
what would this even mean?
so i want to repeat the loop for the same i value while some condition is not yet met
for i in reversed(denom_array):
while cum_val + denom_array[i] < inp_val:
so you want a while loop
well i need an interator to change the denomination of coin im adding
✋
so what's wrong with this?
it tries one coin denom one time and moves on. i think i can figure it out. the for loop needs nested within the while loop
but this has to be an n*k time complexity algo, so idk that this will work
class EventList:
def __init__(self, lst_: list[AnyEvent] | None = None, p: EventList | None = None):
self.local = [(i, e) for i, e in enumerate(lst_ or [])]
self.parent = p or self
@overload
def __getitem__(self, i: type[EventEnum]) -> EventList:
...
@overload
def __getitem__(self, i: EventEnum) -> EventList:
...
def __getitem__(self, i: type[EventEnum] | EventEnum):
if isinstance(i, EventEnum):
return EventList([e for _, e in self.local if e.id == i], self.parent)
return EventList([e for _, e in self.local if e.id in i], self.parent)
def __iter__(self):
yield from self.local
def __len__(self):
return len(self.local)
def insert(self, i: int, event: AnyEvent):
p_idx = self.local[i][0] # Index of event in parent
self.local[i:] = [(p_idx, event)] + [(i + 1, e) for i, e in self.local[i:]]
if self.parent is not self: # Prevent recursion loop
self.parent.insert(p_idx, event) # Insert in parent's local
Decided to use a list as the base collection.
I think this should do it @haughty mountain@hybrid epoch@austere sparrow@loud trail
Some smart recursion at end, should resolve any kind of invalidation issues ig
This will allow sublist of sublist of sublist to update original-most list as well, provided the g is passed correctly
that's an interesting way to do it
pseudocode makes no sense, why would you run 0 to n where n is the amt of change to be made
This statement is technically not correct in general for all of dynamic programming (it's more general than that), but for the kinds of dynamic programming problems you will encounter it probably is.
how can i modify this code to return also the number of coins of each denomination:
INF = 100000
def min(x, y):
if x < y:
return x
return y
#k is number of denominations of the coin or length of d
def coin_change(d, n, k):
M = [0]*(n+1)
for j in range(1, n+1):
minimum = INF
for i in range(1, k+1):
if(j >= d[i]):
minimum = min(minimum, 1+M[j-d[i]])
M[j] = minimum
return M[n]
if __name__ == '__main__':
# array starting from 1, element at index 0 is fake
d = [0, 1, 5, 10, 25]
print(coin_change(d, 73, 4)) #to make 5. Number of denominations = 3
right now it simply prints the number of coins. i would like the number of each coin as well
i had a feeling, yeah. seemed too vague or generalizing
Well, maybe if unrolled through time.
how would you resolve a loop?
In successive approximations it's just updating the values. One could say that it's just generating a new value and that would still make it loop-less.
So some time subscripts to the values.
this sounds less like dp and more like some system trying to find equilibrium over time
Dynamic programming has that kind of stuff.
sound more like markov stuff
Dynamic programming also has that.
(Q-learning)
There is the DP programming method, and then there is DP as in mathematical optimization.
Both are DP.
The non-programming method part of it was actually kind of the main focus of the research back then.
idk if I would call stuff like Q-learning and policy iteration dp, but this might be dp being a wildly terrible name for things again 
Some DP uses fixed point iteration, maybe that makes it more clear.
it's very different from classical dp
Specifically solving Bellman's equation.
That is classical DP.
Wikipedia gets it right actually: "Dynamic programming is both a mathematical optimization method and a computer programming method. "
you'd be surprised how hard it is to find decent pseudocode for the coin change problem with a O(n*k) time complexity and using dp
A lot of it comes from Control Theory, and reinforcement learning (Q-learning) comes from Optimal Control Theory.
dp seems pretty obvious when it comes to not re-solving the same problems over and over, im waiting for the insightful or non-obvious part to come up
So not only is DP a bad name on purpose, it now refers to two different things that are technically the same, but feel very different in practice.
actually, i was already exposed to that in trying to understand how an algorithm works
naming is hard, apparently
wdym doesn't make sense?
n is the amt of change to be made
sure, isn't the question what's the cheapest way of getting there?
what's the fewest coins to get from 0 to n
so what's the thing that doesn't make sense?
it starts wIth C(0) = 0 and then for p = 1, ..., n considers the cheapest path it could have come from
that's it for computing the number of coins
i think im reading it wrong. the p loop will not run n times
n-1 times
oh
i think it's saying like while p is not n but p could jump + 1*denom
or it will run n times?
idk why the number of loops is that important?
i just don't see how you'd have to run a loop 79 times to make change for 79 cents
it's more important what the loop does
it should be much simpler
at this cost you might as well just count pennies out until you get there
what if you had one coin with denomination 1?
only coins w denom 1 yeah
the point is finding the best solution, not just any solution
any solution could probably be done a bit simpler, e.g. 2 coin denominations is a simple diophantine equation
so sidebar, this is confusing to me bc they don't explicitly state what the C and S arrays are for, and furthermore there is another pseudocode which is called to compute the sltn, and which only uses S.. let me try to implement and see
do you get the basic idea or not?
no, i get like the memoization and dp table stuff but i wasn't able to see why they are typically 2d arrays. i need to read the pseudocode again
also idk what's the point of S, you shouldn't need it
hi, is anyone know how to insert into list objects all keys?how it is used in js? {...objects, newKey: value}
these should be 1d arrays
they can be any dimension
most dp tables i see are 2d and are referenced like arr[i][j]
depends on how many variables you have in your dp formulation
I've seen 4 or 5 used at some point
4 dimensional arrays? oh that's just a quadruply-nested array?
yes, you can have 4 variables in the dp formulation
all combinations might need to store a dp value
you've seen 2d arrays because you've probably only seen dp formulations with 2 variables
sounds.. space complex
depends on how big the variables can be
well 4^4 is 256 so, small amt of inputs, lots of individual table entries
does one need quad nested loops to read/write with that many vars?
well yeah, you can have small sized stuff though, like a dimension could be possible decimal digits, i.e. size 10
and then you can have larger stuff in other dimensions
kinda yeah
its a logical extension given what i've seen required for 2d arrays
but yeah, you don't really need the S array they have
well i'm glad bc i could not see the reasoning
maybe it makes reconstruction easier, but not much
so reconstructing without S is pretty easy
start at n, the number of coins needed was c, where could you have come from?
if you took an optimal path
wdym.. like from the c-1 state?
n-c_i?
n - c_i are the states you could have come from yes
and only ones with C(n-c_i) = c-1 are optimal
pick any of the optimal ones
(and bad choice of name c by me)
and then repeat
their S array I think is there to know which coin produced the minimum
so you know where to jump along an optimal path
but you can reconstruct that info easily while backtracking as well in this case
well i would like the series of coins which produced the mins, eg each coin denom and its count
so p looks to be a running summation of the values of coins we've picked, we're running it up to n
then we have for i beginning at 1 and running through k, where k is the number of denominations, eg 4 for the set d = [1,5,10,25], similarly one could simply use len(d), the denomination array
line 5 makes sense
how should i be thinking about the C array
or rather, what should i initialize it to, wrt to length and values?
oh it can simply be len(d) and zeroes
i keep getting local variable referenced b4 assign
if you pastebin the code, people can help
def change(denom_arr,change):
C = [0*i for i in range(len(denom_arr))]
S = [0*i for i in range(len(denom_arr))]
for p in range(0,change):
min = 10**10
for i in range(0,len(denom_arr)):
if denom_arr[i] <= p:
if 0 + C[p-denom_arr[i]] < min:
min = 1 + C[p-d[i]]
coin = i
C[p] = min
S[p] = coin
return C, S
d = [1,5,10,25]
change(d,73)
it's clearly not passing the second if check, so coin is never assigned, leading to an error on the S[p] = coin line
it sounds like you understand the problem
pseudocode is here:
i do but why would they write it to throw an error
i mean, i understand the error but not how to correct
i actually should probably just try to implement the greedy approach
C(n) is the smallest amount of coins to make change n
the p loop shouldn't be 0 indexed
implemented recommended change
so my overall question is, do you know what the dp formulation is for this problem?
i.e. what is the formula for C(n)?
if you know that the impl should make sense
not really. i know its based on the minimum expression of 1 + C[p - d_i] over all indices i checking d_i in the denomination array
but i think implementing the greedy algo would be more realistic
especially for a HW assignment. i just don't know if that is O(n*k) time
greedy is not correct
hmm
think about what the minimum means, when C[p] is the smallest number coins needed to get to change p
its equal to 1 when if d[i] = p
in general, where can I arrive to p from?
so this is where i don't like set notation. i would say that given a set C of coins which cumulatively equal P (of varying denominations), you could look at the p=C-1 set. but sets i cannot represent say, 5 coins of denom 5, i simply have a single 5
hang on
ok, to be precise, where can I arrive to p from by adding a single coin
C[p-1]
nope
C[p-d[i]]
I was after just p-d[i], but yes
ok, how much does it cost me to go to p via p-d[i]?
so total change p minus some coin denomination with index i
d[i]
are you asking if one were to remove that entire denomination from consideration (computing the subproblem where we havent yet considered denomination i just yet)?
no
this is about making sense of the dp formulation
so we could have come from p from p-d[i] (for some i)
what's the least amount of coins to get to p-d[i]?
(it's a dumb question with a dumb answer)
idk. here is my logic train:
d[i] represents a denomination of a coin. say 25 cents. then for example say we have one dollar. so i have 100-25 = 75 cents. then it'd be 3*d[i], but i don't know how to abstract it
think about adding one coin at a time, that's the whole point of this formulation
it's ||C[p-d[i]]||
i think the indices are tripping me up. also it's not clear to me if d[i] represents a single coin of denomination i or that entire denomination
it represents the denomination, but we think of adding exactly one coin
ok so the next obvious question would be, how to we solve the subproblems (construct table C)
so we know the lowest #coins to get (from 0) to p-d[i] is C[p-d[i]]
how many coins do we need to go to C[p] from there?
C[d[i]]
technically correct
1
C[p - d[i]] + 1 total coins to get to p
we have a few options (different i) but we are only interested in the option with the lowest #coins
hence min(C[p - d[i]]) + 1
(or the 1 inside the min as in the version you showed earlier, same thing)
point is, stand at p, look at all options you could have gotten there by adding one coin: p - d[i]
the cheapest way to get to p is the cheapest way to get to one of those options, plus the 1 coin extra
adding one coin p-d[i]?
maybe that's clearer
right ok
p - d[i] refer to the "options"
right right the cheapest way is adding a single coin of denomination i, otherwise, if it weren't there would be no way to get to p
or if it requires multiply coins of a given denom, its not the cheapest way
actually wait
this is false
we only look one coin back
and there may be multiple coins of the same denom in the solution but we are considering each case one at a time and would get there by first adding each one at a time
by thinking about adding one coin at a time we can relate the fewest amount of coins to get to p to the fewest amount of coins to get to some earlier points
for which we have already calculated the fewest coins needed
because we calculate them in order in the loop
p=1 up to p=n
its really just lines 6 and 7 that get me confused. im totally with you on the logic. if the first denom, d[i] we try is less than p, the total change, then we set min equal to 1 + the minimum number of coins on the way to p, minus d[i], given as min = 1 + C[p-d[i]]
i think C[p-d[i]] is really getting me
how so?
the line 5 if is just there so we don't try to access C[-1], C[-2] and so on
the for loop as a whole is just computing the min
defaulting to infinity if there are no valid choices
i think p running up to n means we need an array of length n = p = change to be made. then in line 7 we are setting min = 1 plus the cost array C[p-d[i]], meaning we are subtracting the denomination amount and looking d[i] back in the cost array C, at that index?
consider one iteration of the loop
we are standing at p and looking at where we could have come from
you realize the for loop is just a min expression, right?
if the d[i] is a valid one (the if at 5) and it produces a smaller value than the current minimum (the if at 6) then overwrite the current minimum
that's how you compute a minimum with a loop, plus we have a constraint in this case (to not look at the invalid C[-1] and beyone)
does it matter the order of the denominations in the denomination array
no
so C[some_index] is the minimum number of coins of that denomination?
right right
but if p runs through n, you'd never have so many different coin denominations
e.g. if d = [5, 2, 1] then C[11] is 3 because 5+5+1
ohhh
you loop over the values to reach
not the denominations
in the outer loop
oohh
not quite but it will click tonight or tomorrow. you've gotten me most of the way there and i appreciate your time as well as the socratic teaching method
most of my breakthroughs come when im not thinking about it
also, i have come a long way since asking how to implement "how do i interpret for p = 1 to n from pseudocode." so i am happy about that
it's important to realize what the idea is
then the mechanics of computing it makes more sense
of course
and realizing "oh, this is just computing a minimum" is useful
pattern matching
so there's no recursive calls here but certainly a lookup table
well yeah, that's kinda the reason for using the table approach
you can eliminate the recursion
if you're careful about computing things in a correct order
(DAGs and topological orderings, as mentioned)
in this case the ordering is easy, compute the answer for smaller values first
since we depend on those
writing the recursion and memoizing answers has the nice property of fixing the ordering for you
but tends to be slower because recursion
so their S array is where they are storing the actual coins as well?
they store the coin that gave the minimum
what about combinations of coins or coins of different denoms
but as said, that information isn't strictly needed
wdym
i don't understand what S[p] would look like / how the number of each coin is represented
if p - d[i] was the thing that created the minimum, store S[p] = i
but surely you'd need multiple coins to make certain p values
like 9
is 5+1+1+1+1
so it actually looks up for p = 6 what the minimum coin was that gave rise to p = 5?
Look at the recursion tree (if it were recursive) and work backwards.
hypothetical situation:
S[9] = 5, ok some optimal path used a 5 here
9-5 = 4
S[4] = 1, ok some optimal path used 1 here
4-1 = 3
S[3] ...
for your 5+1+1+1+1 example
oof
ok, note that I stored the coin values in S
not the indices
but you get the point
S points to where an optimal path came from
it's a trail of bread crumbs
follow it
that must be this:
makes sense
alright i need to get some dinner, tysm
i'll be around
yes
https://www.codechef.com/problems/LECANDY can someone help me with this, i will do the coding part just need explanation
CodeChef
this is trivial, no? to boil the statement down: you have c things, you need to give A_k of them to each elephant, do you have enough?
thanks got the answer
Suppose that you're in a while True loop and you have a single variable holding a string s. All you can do is:
- see if
smatches a regex (PCRE), and combine these checks with and/or/not - prepend a fixed string to the beginning of
s - use
breakandcontinue
You can't introduce additional variables or functions.
Would that be Turing-complete?
It "feels" like it's Turing-complete. But I can't figure it out
is it possible that a dictionary is treated as a list depending on its content or something?
i have a dictionary i'm trying to add an element to but it's giving me this error
i've checked and im not redefining it as a list somehow anywhere
is playerdata a list?
yeah
it looks like it's an empty list.
python3.11 has better error messages that would show you which part of the line was failing
have a look at https://www.youtube.com/watch?v=8_npHZbe3qM
This lecture is an addendum to my video about Anagraphs, which you should watch first (or only). I do two proofs by reduction: One that the "generalized kerning" problem is undecidable, and one that the "generalized anagraphing" problem is decidable. The first proof uses turing machines, and the second a fragment of linear logic. I try to explai...
feels kinda similar, building a turing machine based on string replacements
(also that channel in general is great)
though I guess you're not allowing substitutions 
so, one silly idea would be to use the regex just for divisibility checks (or modulo, rather)
then you can define a state machine with N states (consider the number of chars in the string mod N, that decides the action you do) and you can append strings to move between the states
so state machine without memory is for sure possible
not sure how far that gets you
only allowing prepending a fixed string makes things a bit awkward
I think wlog you can think of this as a number in base 1
regex and append will allow some arithmetic and checks
e.g. regex is great at divisibility checks and modulo
but as I said idk how far this will take you
you might be able to pass some data between states, e.g. trigger state X if the number is 0 mod N, if the number is 0 mod 2N then the data passed is 0 if the number is N mod 2N it's a 1
maybe this state machine can go somewhere, but I'm iffy about the requirements to be a turing machine
I think the infinite tape requirement is the thing that might totally break this
can i get like this for understanding permetuation?
or can anyone say me how permetuation works
A permutation is an arrangement of a set of elements where the order of the elements matter. This is opposed to a combination where the order of the elements doesn't matter:
321 and 132 are two permutations of 123
321, 132 are both a SINGLE combination (actually a 3-combination) of 123
Can someone help me figure out the difference between the greedy choice property vs optimal substructure?
The greedy choice means that a globally optimal solution can be obtained from choosing locally optimal solutions.
Optimal substructure means that a globally optimal solution to a problem can be constructed from optimally solving its subproblems.
Both of these sound similar to me. What's the difference?
greedy generally involves ignoring a bunch of possibilities because they can't be optimal
for optimal substructure you actually consider the options (in some sense)
they are fairly similar concepts though
greedy choice means making the best looking choice at the time (during the given subproblem) e.g. hill climbing example
sometimes greedy can give you the optimal outcome, other times it does not
depending on the subproblems
greedy gives an optimal sltn to the fractional knapsack problem; it does not give optimal solution to the 0/1 knapsack
for which you need dynamic programming
how can i resolve these if checks to make sense and not throw an error
what would even throw an error?
one of them aren't passing, so min is never getting declared
oh wait
thats wrong
C is probably the wrong length
initializing a list of all zeroes?
0
oh i isee
and as for the lengths of C and S
i fixed them up
what is the largest and smallest index of C that you need?
p
er, the greatest val of p when it is equal to change
C = [0 for i in range(change)]
change being equal to the value of change you want to make
that doesn't go up to change
it goes up to change - 1
and you want it to go up to change
C = [0 for i in range(change)+1]
C = [0 for i in range(change+1)]
and how high does it currently go?
uhh the for loop only allows it to go up to and not including change, so change-1
right
so that needs to be fixed
with the range of p fixed that should be everything I think
Soo we just had power surges including my comp turning back on and off again like 3x and internet going out, hadn’t saved, back to drawing board. It doesn’t matter this doesn’t have to run I just want to understand it
And I guess I like getting code running which is something new
second programming project just released, haven't looked at it. i know its on hashing
do you remember us talking about argmin? what you are computing in the inner loop is basically an argmin
(you compute both the min and the argmin)
(min and coin)
alugien puede ayudarme con mis deberes
right so the min would be an integer or data point and the argmin would be a mathematical expression
or function call
min is the minimum value
argmin is the argument that gives the minimum value
right makes sense
in what way is the inner loop an argmin
it computes both the min (min) and argmin (coin)
right, the coin arg gives rise to the min value
yep
ok!
it's a common pattern worth recognizing, that for loop
there is an extra if in this one that you usually don't have if you're just calculating the min/argmin
that extra if is to achieve the restriction i: d_i <= p
I guess dumb question, but do you know how to read that expression?
the one you linked back to above or the i: expression you just wrote
well the thing I wrote now, but it's a part of the bigger expression
i such that coin denomination under consideration d[i] is less than or equal to running total, p
ok cool, just wanted to check you knew how to read it out
so yeah, it's a constraint on the valid i
the "such that" directly translates to the extra if
hmm very curious
i really love the logic in all of this
or like, practicing reasoning
i like practicing reasoning bc its mentally stimulating / rewarding and of course challenging as well
as someone who loves music / art, playing music, all that creative stuff, using this part of my brain is quite fascinating
and its certainly a skill you can develop, even if at the outset you're a bit slow on the uptake
you could break it apart as well
valid_i = [i for i in range(len(d)) if d[i] <= p]
min = 10**10
coin = None
for i in valid_i:
if 1 + C[p - d[i]] < min:
min = 1 + C[p - d[i]]
coin = i
(I would maybe pick a name other than min since that is an existing function)
right right
this is the more common form of the loop you would see
if the current value is smaller than the current min, replace the current min
(and update the argmin)
hopefully if should be obvious this computes the minimum
(and the argmin)
the syntax and what its trying to do are straightforward. i wouldn't say i can have the global scope or hierarchy of the program right there at my mental grasp, but that'll come
as an example without the argmin
things = [5, 4, 1, 2, 3, 1, 5]
mini = 10**10
for thing in things:
if thing < mini:
mini = thing
hopefully you can make sense of the fact that this code indeed computes the minimum value
of course
you could then add the argmin tracking on top of that
you could easily modify this to compute a maximum
you could allow something like
things = [5, 4, 1, 2, 3, 1, 5]
mini = 10**10
for thing in things:
if f(thing) < mini:
mini = thing
which would correspond to using python's min with a key function
(i.e. this would compute an argmin)
(the element in things that minimizes function f)
so then argmin = f(thing)
argmin = mini
min = f(mini)
so...my naming was kinda bad
this is the form the algorithms has (ignoring the constraint/extra if)
mini = 10**10
argmin = None
for thing in things:
if f(thing) < mini:
mini = f(thing)
argmin = thing
so this logic is what is underpinning something like argmin
both min and argmin, really
red is neural network prediction
#data-science-and-ml is probably a better fit for this
argmin = thing helped with comprehension. seeing it as an argument to the function
Ah. So it's like you're pruning your problem space for a solution candidate, right? Then immediately choose that candidate because of the greedy choice property
kinda yeah
Does this mean if a problem is solvable by dp, then it isn't solvable by the other?
no
Yes
some problems will be solvable by both, some only one or the other
Ah okay gotcha
there are some excellent problems where you go through and see that by tweaking one thing, it destroys the optimal substructure property
dp is the more general thing
likewise, you see where greedy can be applied and where it fails
check the problems and solutions for chapters 15 and 16
greedy is great and simple if you can prove that the greedy choice you do is actually optimal
don't forget the greedy choice can appear optimal in a subproblem but give rise to a wrong global sltn
or, a non-optimal soltn
e.g. hill climbing
Problem 16.1-1 seems good actually. I don't have the solutions manual. That'll be a good buy
(and that is why we prove or at least try to motivate the correctness of the greedy solution)
That makes sense. Showing that a greedy choice is optimal or not shouldn't take too much time to do
Very insightful guys. Thanks 🙂
still blows my mind that greedy is good for fractional but not 0/1 knapsack. i would have thought it'd be the other way round, since fractional seems more complex (more likely to require dp approach)
finding a candidate for the greedy choice might not be trivial
e.g. in interval scheduling you have a bunch of intervals and you want to pick as many non-overlapping intervals as possible
a naive idea would be to just pick the interval that starts the first
is that an optimal greedy choice?
i know the ans to this 😜
Does not seem so. Because it could overlap, correct?
well, I meant pick the first starting interval that don't overlap any of the intervals I already picked
so overlapping isn't an issue
but no, it's not an optimal choice, you can easily find counterexamples, e.g.
|------------|
|--|
|---|
picking the interval that starts first is not optimal here, clearly
Ah, because the finish time of one interval could occur after the start time of another
any other idea for what could be an optimal greedy choice?
Hmm. We would have to show that the interval with the earliest endtime (just looking at the diagram) will sort of 'contain' or 'cover' all the other intervals. Suppose that the earliest endtime is x. Then any available timeslot for any other interval will occur after x. if we choose another interval with endtime y, then the same is true for anything that occurs after y. Then, we have to show that x <= y.... for all x and y. That must mean, that everything that occurs after y must be greater than everything that occurs after x, therefore there is no way anything to the left of y will be more than what's to the left of x
Let me know if that makes sense
there is a common and neat argument for greedy algos that's called an exchange argument
assume that there exists some optimal move that is not the greedy one
we can replace that move with the greedy one, and we are in an as good or better position (the end point is the same or earlier than the other optimal one)
(actually, the first choice being optimal isn't actually important so far, what I said is actually true for any choice)
(we can replace the choice with a greedy one and the situation isn't worse than before, it might even be better)
so we can transform a sequence of choices into a greedy sequence of choices and be as good or better
this is true for any sequence of choices, even an optimal sequence
tl;dr: I can make the greedy choice and it doesn't make the solution worse than what I had previously
you may see it referred to as the 'cut and paste' argument
to me it feels like circular logic, but if it suffices for a mathematical proof, awesome. more based in logic and reason than math anyhow
i did a similar proof by contradiction for my last collaborative project, we'll see how the profs judge it (how convinced they are)
maybe I'm actually confusing the name, there is also "greedy stays ahead" which might actually be what I'm describing here
That makes a lot of sense. I think I'm finally seeing the difference. Using exchange argument, you ignore making choices that will make your overall solution worse. With memoization, you cache a solution regardless of if it'll make the problem worse. You just care about the optimal path in the memoization table
an actual exchange argument tends to boil down to showing that sorting by something is optimal, by showing that swapping two elements not in sorted order is as good or better
but the arguments are similar in spirit
i keep coming across hamming distance
hamming distance is a very boring string distance
two strings of the same length, the distance is how many positions are different
stuff like levenshtein distance is a lot more interesting
and useful in practice
guess how long the knapsack problem has been around for
idk 60s?
since at least 1897, per wiki
also you might find this interesting: (will be involved in my future work):
here we go with hidden markov models again
yeah, I've realized string algorithms are very important in stuff like bioinformatics
Can almost make a cyclic tag system. Maybe it's possible.
ah, Post stuff
See you in #data-science-and-ml 😉
A tag system is a deterministic computational model published by Emil Leon Post in 1943 as a simple form of a Post canonical system. A tag system may also be viewed as an abstract machine, called a Post tag machine (not to be confused with Post–Turing machines)—briefly, a finite-state machine whose only tape is a FIFO queue of unbounded length, ...
Hard to say, PCRE has a bunch of stuff in it that might work.
With a simple counter (current iteration of loop / index) variable it would def. be a cyclic tag system.
!e
Well, I was able to make it work with adding two numbers. Suppose that the input is of the form aaaaaaa+aaa, and you need to return aaaaaaaaaa=aaaaaaa+aaa.
import re
s = "aaaaaaa+aaa"
while True:
if not re.match(".*=", s):
s = "=" + s
elif not re.match(r"(a+)(a+)=\1\+\2$", s):
s = "a" + s
else:
break
print(s)
@austere sparrow :white_check_mark: Your 3.11 eval job has completed with return code 0.
aaaaaaaaaa=aaaaaaa+aaa
Ok, I was looking for something like that.
wait, you're allowed to add different fixed strings?
That makes it more possible.
Yeah, I meant that you can't "compute" the prefix you're adding
like you can't do s = s + s
that's not how I read the problem 🥲
oh sorry
I was going to ask about the "fixed string" part too, but I just guessed you meant that.
I learned about Turing machines but forgot everything
do I ask questions regarding linked lists in here?
Yeah
Guys, I have a peculiar question and asking it in #help didn't yield any results, I'm assuming because it's too specific.
I'm practicing with Pandas dataframes right now. And need a bit of help understanding something.
So, I have a dataframe:
size color length id
25 28 33 5 1
35 33 1 1
34 2 1
28 37 33 3 1
38 33 4 1
And I have to export it into a nested dictionary, like this:
{'size': {25: {'color': {28: {'length': {33: {'id': 5}}},
35: {'length': {33: {'id': 1},
24: {'id': 2}}}}}},
'size': {28: {'color': {37: {'length': {33: {'id': 3}}},
38: {'length': {33: {'id': 4}}}}}}}
But I can't figure out how export into a dictionary works and which modifying methods I should apply to get the required result.
The reason I can't use regular dicts and avoid using pandas completely is that there's might be another set of columns aside from "id"
have you thought about it
Because every possible shifted string exists as a substring in goal + goal
Hey people, I opened a ticket in #help-lemon. I use a dataframe but although saying all columns should be printed, it breaks the dataframe into two
Could you show me how you got to this structure? Would be appreicated
Hey @fiery cosmos!
It looks like you tried to attach file type(s) that we do not allow (.zip). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.
Feel free to ask in #community-meta if you think this is a mistake.
hi
please tell me any website to learn algos from basics
🙂
and also say some website to learn maths for programming
Well, what are your goals? If it's to ace the coding interview, competitive programming or similar, then I'd recommend https://leetcode.com/, https://www.hackerrank.com/, or https://codeforces.com/ or similar. For academia, I'll recommend visiting any of the pinned links we have in this channel.
Which is faster computing sin/cos or computing a sqrt of a float
done by the math module?
see for yourself, that's what timeit is for
if you care about the low level instruction https://www.agner.org/optimize/instruction_tables.pdf might be interesting
it suggests sqrt is significantly faster
(but if you're doing stuff though python the cost of these operations probably doesn't that much compared to the overhead of python)
I don't think sin/cos CPU instructions are actually used, though, so this wouldn't apply
(as in, for them software implementations are faster than hardware ones)
not sure about sqrt

my source for that is https://stackoverflow.com/a/2285277
There is a tradeoff between performance and error.
Both hardware and software implementations are computing approximations. So it depends how approximate you are ok with.
As for the bit hacks used in many of the software implementations. I recommend reading Hacker's Delight.
oh yeah, now I remember reading sin source code at some point
it's awfully complicated
but yeah I can imagine the hardware instruction hasn't aged well
What do you mean by that? Software sin is guaranteed to be accurate to half an ulp.
I could implement sin as ```
def sin(x):
return 0
so unless the hardware one is even more accurate, that can't be an excuse for it being so slow
An example of higher error for way more speed is the fast inverse square root.
Depending on the hardware it could also just be straight up slower.
at least do the engineering thing and do sin(x) = x
accurate for all the angles that matter
Classic.
It turns out that a bunch of the hardware implementations were actually both inaccurate and slow in the past (just bad).
So GNU takes it into its own hands.
Which is kind of the correct answer anyhow.
One wants their sin to be predictable across hardware.
(Intel's fsin was bugged, it did not match the rounding of other instructions)
(But now they can't change it because code relies on it...)
(Hardware bugs are common)
Good morning im french, my name is Sami, im 16years old and i have a homework on arithmetic suits.
My teacher gave me this
And he gave also value : vn+1=2vn+n
V0 = -6
I tried this but this make me an error
My friend found this program on internet and this is working:
print("v({}) = {}".format(0,v))
for n in range (1,10):
v=2*v + v
print("v({}) = {}".format(n,v))```
But i dont want to cheat and i want to do like my teacher
well your friend does not know how to simplify v=2*v + v because it's the same thing as v=3*v
do you mean to indent the for loop code and the print expression?
Now i did an indent, but when I execute there is nothing happening
because you have not called liste
Can somebody explain the confusion matrix to me please
liste is called by doing liste(n)
and n is how much iterations you will want to do
i understand like if i want 10 iterations i have to do : liste(10)
Is this the right channel for that?
yes
Thank you so mutch
i have a result now
But i want this
i think i have to put liste.append(vn) on my program
Also everithing i put on the n in the liste(n) gave me for the result -18
Is it possible to share a array between threads and function that arent in the same file?
You can do inter-thread communication / IPC with queues
!d queue
Source code: Lib/queue.py
The queue module implements multi-producer, multi-consumer queues. It is especially useful in threaded programming when information must be exchanged safely between multiple threads. The Queue class in this module implements all the required locking semantics.
The module implements three types of queue, which differ only in the order in which the entries are retrieved. In a FIFO queue, the first tasks added are the first retrieved. In a LIFO queue, the most recently added entry is the first retrieved (operating like a stack). With a priority queue, the entries are kept sorted (using the heapq module) and the lowest valued entry is retrieved first.
is it possible todo it without a queue? Just with a array?
bc i still need the data thats inside the array/queue
like are there a way with shared memory pool etc? in python
If I use bfs and create a queue where the Nodes are stored but removed after does that count as O(n) Space Complexity?
Does anyone know how to use BFS to determine whether a given undirected graph has a cycle that contains the source? Source is the starting point.
all the websites I found online states if parent[u] != v there is a cycle
but this doesn't necessarily mean the cycle contains the source
it just means as you do bfs down the graph, u will encounter a cycle
i want the cycle to include the source (ie the starting point of my BFS)
If you have any cycle, you can extend it to include the source by adding a path to the source and back to the cycle
EDIT: oh right, cycle already requires no repeating nodes does it. hmm
why not do the same technique that you can do to find a topological ordering on a DAG?
i.e. remove leaves until you can't remove any more leaves
everything that's left is part of some cycle
i feel like it would work to, like... tag each node with which of the degree(source) subtrees starting from the source it belongs to. If you find that a node that is already part of one subtree is connected to another subtree's node, then you have found a cycle such that it passes through source
actually, if we start a search at the source and at some point get back to the source then there is some cycle containing the source, no?
yeah for example in this case if sourcce=3 then the function should returns false. Although clearly graph is a cycle, source is not on the path
how can you ever get back to the source in BFS?
err, right
it would work in a dfs I guess
why do you want to do it with a BFS?
it's a requirement
it's not like i come up with the question LOL
i'm not that nerdy
my prof is evil
i guess
this still seems like a working simple solution to me
could you elaborate? 
So, on the first steps of the bfs, we always handle the distance-1 nodes - the neighbours of source. These get their own indices, 1 to degree(source) inclusive, say. After that, every time we handle a node cur, we tag its neighbours with the same index as cur.
If when processing cur, you find a neighbouring node neib that's already tagged but haven't been processed yet, you have found some cycle. But if tag(neib) == tag(cur), then this cycle doesn't include source and should be ignored, whereas if tag(neib) != tag(cur), it does.
ah, I see what you mean
in terms of coloring (which seems to be common for graphs)
color all the neighbors of the source differently
and expand those colors
if you ever have two colors meet there is a cycle containing the source
yeah, I also was thinking in terms of coloring
hmm
the one case that spells disaster for most approaches is an example where there is a cycle within one color, e.g. in the image that @thorny bluff posted
but this approach deals with that
Thanks for the responses guys. Though I'm not sure if this will work because the prof says I have to modify this psedocode.
this approach will work with any search where you can store some extra info
but i'll ask him and see what he says
What if you just traverse the previous array in a loop and return true if you find source?
if u == s then give v it a unique number (we called them colors, but you already have color there so you would have to add something new) otherwise just give v the same number as u was given
if you end up trying to overwrite something existing, if the number stored is not the same as you try to write, there is a cycle
If I use bfs and create a queue where the Nodes are stored but removed after does that count as O(n) Space Complexity?
space complexity tends to care about the peak memory usage
why not
Wdym by that, shared memory is a lower level solution. What type of data will be shared between threads in your app?
Hey everyone, I am trying to parse a text file to store all the words inside a list after getting rid of punctuations and unwanted numerals. But I am not getting satisfying output. Can someone help me with this problem. Below is the data inside text file and the code with its output.
INGREDIENTS:
Cereal Extract 519% * (Barley, &
Wheat), Sugar, Cocoa Solids,
Caramel (150c), Liquid Glucose,
Protein Isolate, Maltodextrin, Milk
Solids, Vitamins, Emulsifiers (322,
471), Raising Agent (500(ii)),
Minerals, lodised Salt. So |
Code -
with open("Ingrediants.txt") as f:
text = f.read()
words = text.split(",")
table = str.maketrans("", "", string.punctuation)
stripped_words = [w.translate(table) for w in words]
for word in stripped_words:
if word[:11].lower() == "ingredients":
improved_str.append(word[11:].replace('\n', '').replace('\x0c', ''))
else:
improved_str.append(word.replace('\n', '').replace('\x0c', ''))
print(list(set(improved_str)))
Output list - [' lodised Salt So ', ' Wheat', ' Vitamins', 'Cereal Extract 519 Barley', 'Minerals', ' MilkSolids', ' Sugar', ' Cocoa Solids', ' Raising Agent 500ii', 'Caramel 150c', ' Maltodextrin', '471', ' Liquid Glucose', ' Emulsifiers 322', 'Protein Isolate']
Any help would be very much appreciated!
Lots of problems based on tree traversal like DFS/BFS, which use a stack/queue.
Though commonly done with recursion, so a stack may not explicitly be used.
I have an easier time traversing with dfs
I once implemented level order tree traversal with dfs and once I tried it with a queue it was way easier
i’m studying for midterm I want a 100 raw grade so ive been doing online midterms from different unis
I feel pretty confident dealing with trees and recursions but unsure of what type of linked lists,stacks, or queue problems he’ll give
my weakness would probably be analyzing time complexity constraints
There is a huge amount of problems to pick from, so of course you cannot and shouldn't try to study them all. Instead, you should study the building blocks e.g. common algorithm classifications (greedy, dynamic programming, recursive, divide and conquer, etc) and the data structures.
The actual challenge will be looking at a problem and identifying what the right tools for the solution are as well as how to apply them to the problem. But many problems tend to follow common patterns, so the way I studied for that was to look at a decent variety of problems to build up the "vocabulary" for solving them.
thanks appreciate it
greedy and dp are beyond scope of class but ive practiced them for technical interviews. gonna start building up my problem solving skills
Just a array
then use a queue
but when I iterate it it loses all the data
yeah but you can do
while not q.empty():
item = q.get()
return item
is possible to store python objects in a sqllite or not?
and then reconstruct it?
the queues you're talking about are for things like assigning tasks for workers
which is the case you actually need the synchronization they provide
if you have a lot of shared and synchronized data between threads then you are probably doing something wrong
I just have to share a array between the threads. It should be a array of objects. The data should be stored in memory and not on the disk
read only?
read and write
well that's the problem
synchronizing reads and writes is not a good idea if you can avoid it
do threads depends on the writes done by other threads?
no they dont
so why do you need to write back to the same shared array?
modifying the same shared data is asking for problems
1 thread is putting data in there and the other is reading the data
if so why do you need an array?
thread 1 can put elements in a queue for the thread 2 to do work on
if thread 2 need an array it could build that locally in the thread
no sharing needed
ah sry i said something wrong:
Thread 1: nothing
Thread 2: writing data
Thread 3: reading and writing
Thread 4: reading and writing
what does thread 3 and 4 write?
reading and writing new objects
its should be a multi client socket server
3 is for handling new connection
4 is for "talking" to those
so 3 should accept the connection and delegate the task of talking to it to 4
yes
that can easily be implemented by having 3 communicate with 4 with python's synchronized queues
just send the relevant information from 3 to 4 by pushing an object to the queue from 3
aha
4 can then pick the task up and do work
so if theres a new connection, 3 is putting it into a queue, and 4 is getting the next item (client) and add it to a array in their own class?
what's the array for?
for the connected client
so you can do multi client connections
=> broadcast
ok I know how to implement this now. Thanks for you help 🙂
as long as you can avoid having a bunch of synchronized data between threads you should be fine
I think something like a thread pool for talking threads and a bunch of tasks would be a sensible design for this kind of thing
(or delve into python async)
@haughty mountain here is an implementation of a ListView using dict as its base.
@pallid coral you might wanna take a look too
https://paste.pythondiscord.com/lofoqupifu
it turns out that instead of prioritising the append / remove part and using a list as a base. using dict simplifies lookup and doesn't complicate the append / remove part a lot.
yes but they aren't just lists, they are
list[tuple[int, AnyEvent]]where theintcontains the index of theAnyEventmember in "master" list
are you saying the ordering is basically just random in that the indices relevant can be weird stuff like event with index [2, 5, 7, 123, 1024] are the ones relevant for this thing?
or are the indices something sensible like ranges of elements?
yea they can be random can never tell (can't make assumptions)
Hello, can someone help me with queue and deque
Hi, anyone knows anything about the WaterJug problem being solved using Hill Climbing or A star algorithm?
@haughty mountain This covers all the functionality I will ever need, any suggestions? https://paste.pythondiscord.com/atakujukas
most online courses are only concerned with upper bound of algorithms
my class wants us to dive into lower, upper, and asymptotically tight bound
O = "big oh" = upper bound
Θ = "theta" = upper and lower bound (big oh and big omega)
Ω = "omega" = lower bound
any of these can be asymptotically tight or non-tight. eg, O(n^2) on a linear algorithm is technically correct, though O(n) is tighter. Ω(n) is technically a lower bound on a quadratic runtime algo, though it is not asymptotically tight
Hi guys, I need some help with this DSA question. I've done the coding part when I submit it is failing the test case for some reason. This is the question: "Find the largest rectangular area possible in a given histogram where the largest rectangle can be made of a number of contiguous bars. Assume that all bars have width of 1 unit.
Input Format
In the function an integer vector representing height of the bars is passed.
Output Format
Return an integer representing the maximum possible area.
Sample Input
{6, 2, 5, 4, 5, 1, 6}
Sample Output
12
" One of the test cases is {2,1,5,6,2,3} which has the highest area = 10 . I don't get where 10 comes from?
hey all,
im interested in making every bucket in a hash table a list of length 2, where the first element will be the key to be inserted, and the second element will be an integer as a pointer to where another key was stored due to collision resolution, will this ruin the runtime benefits of using a hashtable?
Ok, so the original task looks like this:
Incoming:
[
{'id': 1, 'size': 25, 'color': 35, 'length': 33},
{'id': 2, 'size': 28, 'color': 35, 'length': 33},
{'id': 3, 'size': 28, 'color': 37, 'length': 33},
{'id': 4, 'size': 28, 'color': 38, 'length': 33},
{'id': 5, 'size': 25, 'color': 28, 'length': 33}
]
Outgoing:
{'size': {25: {'color': {28: {'length': {33: {'id': 5}}},
35: {'length': {33: {'id': 1},
24: {'id': 2}}}}}},
'size': {28: {'color': {37: {'length': {33: {'id': 3}}},
38: {'length': {33: {'id': 4}}}}}}}
The number of "columns" might differ; There might be more keywords aside from size, color, length...
I tried approaching it head-on and with the use of pandas but couldn't solve it and gave up. Is it supposed to be dealt with via a recursive loop?
Hi, i have this question where i gotta make an algorithm similar to (Kahn’s algorithm for Topological Sorting) but starting from vertices with 0 outgoing edges instead of incoming ones
Issue is that once i know which one has 0 incoming and remove it, how do i mention to remove the edge which is directed to it directly
Just need to make pseudo code for that
Basically this
did anyone see my ? regarding time complexity of my hashing scheme
whats the diff between o(n) and theta(n)
theta means an asymptotically tight bound from both above and below
O(n) only means asymptotically bound from above
just for clarification
if a nested for loop breaks at a certain condition
would time complexity be theta(n^2)?
and if it never breaks its O(n^2)
i think this reasoning is correct but i dont fully grasp theta(n)
like this java(lol) code's run time is theta(n^3) because to my understanding if it doesn't break in the loop you use big O, but if it does break, you use big theta
that's not the difference
do you know the mathematical definition of the notation?
nah
when do i use big theta or big o
when you need to express the definition of big theta or big o
that's kind of it. it's just notation that expresses a long sentence in like 1 character
the thing i posted above the teacher said theta(n^3) is a better answer than o(n^3)
because it's a tighter bound
because Theta implies O, but not the other way around
so both upper and lower bound is a constant in front of n^3
and btw, you want to use caps, since little theta and little o notations both exist
theta implies big omega and big o?
yes
so using big theta implies that there are two constants and one grows faster/slower than f(n) after passing n naught?
for example c1 * f(n) < f(n) < c2* f(n)
when analyzing algorithms how do i know to use big theta, big o, or big omega
when you need to express the definition of big theta or big o or big omega
i see some problems can only be expressed in big o
like one where the loop breaks
the answer said big o(n) instead of big theta (n)
example checking if all numbes in array are positive
if you see negative you break
and key said runtime is o(n)
the key described a summation algorithm of an array as big theta(n)
last question sorry but youve kind of confused me. when should you use big o notation instead of big theta
according to hw seems like a case exists where u cannot express in big theta
i'm pretty sure that one is big Omega as well, so it would probably be big Theta
when you can only claim what big O claims, and not what big Theta claims
algorithms aren't big o or big theta, you would analyze their best case or average or worst cases
well, i can give a trivial example, the function f(n) = n is O(n^2), but not Theta(n^2)
why is that
because O is basically <=
nothing grows slower than n
Ω is basically >=
is part of the reasoning of why theta is wrong?
that's not true. log, sqrt, cbrt, ...
no i mean constant
Θ is both O and Ω, i.e. basically =
when i said constant i was thinking that no constant times n will be less than n so big omega cant exist? idk
im sorry im so dumb guys but i appreciate the help
O and Ω are upper and lower bounds respectively
Θ is when the bounds can be made the same
much like a <= b and a >= b implies a = b with inequalities
O and the friends are just ways of writing >=, <= and = for asymptotic behavior of functions
in my class any function that has loops that breaks is referred to with big o and if it doesnt break then its referred to as big theta
does this align with ur knowledge or na
take something silly like
def smallest_divisor(n):
for i in range(2, n):
if n%i == 0:
return i
return n
```what's O and Ω?
err, I meant to make i start at 2
always only running once would be O(1)
although since it's python it's log n? because ints can be any size
in any case it's O(n) and Ω(1)
so, we have loads of values for which the function runs in constant time
e.g. anything divisible by 2
so the lower bound can't be higher than a constant
when can something only be expressed in big omega
would it be if a loop runs once and breaks no matter what
similarly primes means linear time, so upper bound can't be lower than linear
always
O and Ω always exist
Θ might not
in the example I gave it doesn't exist
because O and Ω can't be made to match
so theta exists if big o and big omega are equivalent
yes
oh that makes sense
i'd answer stuff correctly on hw but i wouldnt know the reasoning behind it
that's why I'm saying they are like >=, <= and =
it's like the squeeze theorem if you're familiar with that
like this one i answered O(n) because it breaks but the real reasoning is that the lower bound is Ω(1) and upper bound is O(n)
I guess it's worth noting that this applies for the tightest possible O and Ω
e.g. I could describe something with always linear runtime as O(n²) and Ω(1)
they would be terrible bounds, but still bounds
for something that's really linear it's a terrible bound
i understand why n^2 is a bad bound
like, it's like saying 1 <= n or n <= n²
true but probably not that useful
the tight upper and lower bounds are O(n) and Ω(n) for a function that's always linear runtime
yea that makes sense
and those are what matters for Θ
in a linear function that doesnt break/return/etc the best case is omega(n) and worst case is O(n) is that ✅
saying it is theta(n) provides the tightest bound for the function
thats probably why teacher prefers big theta if there is a tight upper and lower bound
theta is the more precise thing yeah
like O(n²) is saying the runtime is not worse than quadratic, but who knows if that bound is tight?
the real answer could be linear, who knows?
Θ(n²) is for sure a tight bound, it's exactly quadratic, not better not worse
should i change how i approach these problems where you give complexity of function?
my approach was if it interrupted i'd give big o but if it didnt interrupt then big theta
i didnt get logic behind it at that time
but it didnt give me an incorrect answer
if i get a free response on time complexity then its helpful to know the distinctions between big omega, big theta, and big o
anyways im grateful for help gonna study linked lists now
conceptually this topic was my biggest weakness because i didnt grasp how it worked its weird to get a problem right without knowing why you got it right
yeah, there are plenty of examples where that's not true, e.g. quicksort which has a bad worst case without having a break or a nearly return
thats true
O(n²) and Ω(n log n)
no shortcuts then 😢
and I could a loop with a break that is still linear
dumb example
for i in range(n):
if i >= n//2:
break
omega(n/2) still omega(n)
yes, constant factors don't matter
@glass river This does less work (than full n steps), but big O, omega, theta, are concerned with the growth rate / "curvy-ness". https://www.desmos.com/calculator/clymw6pjdc
Doing half does not change the curvy-ness.
(It just changes the slope of the line, still a line)
how about determining time complexity of recursive function
linear - o(n) tree (m^n) where m is # of calls made in function
like fib is 2^n (?)
Plot N vs number of steps for the given N.
Then given that plot you might have some idea of what it could be, try to see why.
Then prove it.
(With enough practice you can just look at the code and guess)
you have to solve the recursion. there are various methods
can someone help me with my python code?
it runs 25% of the test but i cant seem to understand what is wrong with it
its due tonight in 2 hours lol pls, just an explanation or hint
why is this n log n
actually it makes sense when you look at higher values of i
if a function triples itself as it approaches n , is it still log(n)?
The runtime of the function (here, the number of times print() is run) is
n/1 + n/2 + n/3 + ... + n/n
= n * (1/1 + 1/2 + 1/3 + ... + 1/n)
Note that (1/1 + 1/2 + 1/3 + ... + 1/n) is the nth harmonic number, which can be approximated by ln(n) + const as n becomes large
Therefore the limit of the runtime as n becomes large is n * (ln(n) + const) = n * (log(n) / log(e) + const). At this point it should be easy to convert into big-O notation
Not sure what you mean by "triples itself"
Basically, the first step in proving the runtime complexity of a function is to express the runtime in terms of n (however you define it)
Those intuitions based on the structure of loop nesting are derived from this first step
Say you've got a row in a database of some description.
PK col 1, col 2 .... col 10,000. Col names are 2KB Varchar.
1KB varchar 1B Boolean
What's the best algorithm to turn the 10,000 boolean fields into a hashmap/set of a fairly small size? Some sort of data structure/2 way hashing algorithm?
Currently the row is fairly large since DB Table overhead. I want that row to be as small as possible, whilst still allowing instantaneous determination of whether Col 1 to Col 10,000 is True or False.
Best I got.
If you turn the 10,000 columns into an index hashed array, and then use a hashing algorithm on the 10,000 2KB varchar names. Then you have an array that's roughly 10KB in length, and a hashing algorithm.
0 1
1KB Varchar 10KB array of 10,000 places. Hashing the 2KB col name gives you the array place.
How can you make a row that's smaller than 11KB? Whilst still supporting fast access.
maybe I'm missing something, but can't you just shove things into a bitset? 10000/8 bytes = 1250 bytes
idk what's a good bitset in python though, you could build your own with the array module I guess
yea, a bit array
that's the best thing I think.
you can also take multiple bit arrays, then compress a bit matrix.
Using matrix multiplication. But I don't think you want to do that.
It's not really a Python thing.
More like a how computers work thing.
No matter the programming language, you'd use a library implementation.
Or you can use
!pip construct
is it possible to insert object classes into a queue?
pretty sure the python queues can take whatever
much like other python containers
<objectName object at object at 0x000002883D3618E0> when returning an filtered object in a array. How can I fix this so ITS the OBJECT
that looks like it is the object
it's just the default repr from object
I've been struggling with some leetcode problems, would you mind if I send them in here to discuss them or is that wrong?
Someone keep me honest, but as long as it's Python, related to DSA, and you're not spamming, I think you should be ok
send it
what does this do
head = head and head.next or head
head is linked list head
Assigns head.next to head if head.next is not None. Otherwise it assigns head to head (if head or head.next is None)
hmm I tried what you said. In this graph,
source is in the cycle, however tag(neib)=tag(cur) in this graph?
why so? you start by tagging left neighbour of source as 1 and right as 2. Then left bottom node gets tagged 1. Then right bottom node gets tagged 2, sees the node to the left of it, and there we go - neighbouring node with different tag.
yeah it is dsa and you can implement it in python so I think we're good
srry i just saw this. Why isn't bottom left node tag as 1 not 2? i thought it 's two because the distance from source is 1+1=2.
Are you saying the left subtree from source are all 1s? and right subtree from the source are all 2s?
With that being said, what happens if we have 3 neighbours connecting to the source? In this case all will have 3 different values for each neighbor of source. This means as long there is a cycle, source will guarantee to be in the cycle for the case of 3 neighbors from a source
So essentially, the gist of this algorithm states that as long source has more than 1 neighbor, if there is a cycle, then guarantee source is in the cycle?
actually I get it. even this example it works. because all the left of S is labeled as 1, the cycle in the left would be comparing 1 with 1 and the cycle in the right will be comparing 2 with 2.
I believe so, yes. Not like what I outlined about is a formal proof, of course.
oh sorry there was a misunderstanding. I didn't mean to help because I can't but I needed to create a multiindex dataframe and wanted to ask how you did that haha
I am working on a quiz application, where I have to show the next question on the bases of next answer of the previous question.
for example the first question is
what do you like?
option 1 : Cat
option 2: Dog
if answer is cat then the next question will be
what type of cat do you like?
options with type cat.
same for dog selection
how I can make a system like that?
a tree
Do I need to use two loops for swapping
no
Yo! I'm interested in finding an algorithm that... I think is kind of like binary search but is more generally applicable to multiple dimensions.
Scenario... You're given access to a slow API that has a few features:
- Given a URL, return a list of years there is at least one archive for that URL.
- Given a URL and a year, return a list of days there is at least one archive for that URL.
- Given a URL and a day, return a list of times there as at least one archive for that URL.
- Given a URL and a time, return the title of the page for that URL at the given time.
Your goal is to find all titles that ever existed for the page, with the assumption that the title will never revert to a previous title.
The algorithm seems like it's straightforward but trying to write out a rough implementation yields a lot of pitfalls. This is not the correct algorithm, but it's kind of close to what I want:
- Get the list of years.
- Get the list of days for the first year and the last year.
- Get the list of times for the first day of the first year and the list of times for the last day of the last year.
- Get the title for the first time of the first day of the first year and the last time of the last day of the last year.
5a. If they're the same, you're done. Return a list with just that title.
5b. If they're different, yield that title, then repeat from step 2 but replace the last year with the middlemost year? - The above is repeated somehow until you locate the first change. It should take log(n) calls to do this.
- Repeat the algorithm from step 2 with the location of the first change as the new first year.
- Some similar recursive logic needs to be done on the days and times as well in case changes happen that quickly.
- You're done when you converge on the last change.
What would be nice is maybe an algorithm that can simply take a tree like this and traverse it in an intelligent way to find the changes in child nodes while skipping as much else as possible without having to code in all of that specific logic for years/days/times/etc.
Effectively, here is some dummy code I'd like to improve:
import hashlib
from time import sleep, time
YEAR_API_DELAY = 0.1
DAY_API_DELAY = 0.1
TIME_API_DELAY = 0.1
VALUE_API_DELAY = 0.1
def get_years():
sleep(YEAR_API_DELAY)
return [1, 2, 3, 4, 5, 6]
def get_days(year):
sleep(DAY_API_DELAY)
return [(year, 1), (year, 2), (year, 3)]
def get_times(day):
sleep(TIME_API_DELAY)
return [(day[0], day[1], 1), (day[0], day[1], 2)]
def get_value(t):
sleep(VALUE_API_DELAY)
return hashlib.sha256(str(((t[0] * (2 ** 4)) + (t[1] * (2 ** 2)) + t[2]) // 3).encode()).hexdigest()
def get_changes():
# TODO: Optimize me!
values = []
for year in get_years():
for day in get_days(year):
for t in get_times(day):
value = get_value(t)
if values and values[-1] == value:
continue
values.append(value)
return values
def verify_result(result):
assert hashlib.sha256(
str(result).encode()).hexdigest() == "a17ff1f64d587ee4eb7811b2c0e874573a088a2bfdf634de9e2df0ddb3c6d896"
def main():
t1 = time()
result = get_changes()
t2 = time()
verify_result(result)
print("Correct result produced in", t2 - t1, "seconds.")
if __name__ == '__main__':
main()
Currently, the get_changes() function runs in 6.669126 seconds
When traversed in order the values never return to a previous value, but they're also not sorted (they ARE sorted, but then they're hashed to destroy the sort property, to simulate real world data that occasionally changes but not necessarily in a way that's alphabetically increasing, like a website title or a commit hash)
Can you do some kind of binary search to minimize the calls to the get_years/get_days/get_times/get_value functions? Feel free to cache those functions
I don't get what you're trying to do, is the code for the api calls meant to be stuff we don't know about?
So, my original use case (although I'm interested in general solutions to problems like this)
was that I was trying to find a way to efficiently discover youtube title history by checking the Wayback Archive. The API is modeled like the above, and has rate limits.
I'd like to make as few calls as possible and find any change to the title of videos. I'm OK assuming that the titles will never return to a previous title once changed.
so totally black box but you're expecting things to be constant most of the time?
yes, you can check the first date and then binary search for the last date that has that value
Yeah -- I think the nested nature of it is making it hard for me to devise a proper binary search
Enter Youtube ID or URL: https://www.youtube.com/watch?v=HeQX2HjkcNo&t=8s
2021.5.22.14.39.25 You Can't Prove Everything That's True
2021.5.22.14.54.02 You Can't Prove Everything That's True
2021.5.22.15.25.19 You Can't Prove Everything That's True
2021.5.22.16.03.49 You Can't Prove Everything That's True
2021.5.22.16.33.04 You Can't Prove Everything That's True
2021.5.22.17.40.06 You Can't Prove Everything That's True
2021.5.22.18.20.16 You Can't Prove Everything That's True
2021.5.22.19.26.06 You Can't Prove Everything That's True
2021.5.22.22.12.13 You Can't Prove Everything That's True
2021.5.23.24.94.3 There's a Hole at the Bottom of Math
2021.5.23.43.60.6 There's a Hole at the Bottom of Math
2021.5.23.75.30.8 There's a Hole at the Bottom of Math
2021.5.23.10.23.18 There's a Hole at the Bottom of Math
2021.5.23.11.38.50 There's a Hole at the Bottom of Math
2021.5.23.16.45.20 There's a Hole at the Bottom of Math
2021.5.24.71.84.4 There's a Hole at the Bottom of Math
2021.5.24.90.15.5 There's a Hole at the Bottom of Math
2021.5.24.95.14.8 There's a Hole at the Bottom of Math
2021.5.24.12.46.29 There's a Hole at the Bottom of Math
2021.5.24.15.41.49 There's a Hole at the Bottom of Math
2021.5.24.16.43.18 There's a Hole at the Bottom of Math
2021.5.24.17.29.44 There's a Hole at the Bottom of Math
2021.5.24.23.47.36 There's a Hole at the Bottom of Math
2021.5.25.38.22. There's a Hole at the Bottom of Math
2021.5.25.12.74.2 There's a Hole at the Bottom of Math
2021.5.25.51.50.7 There's a Hole at the Bottom of Math
2021.5.25.94.21.3 There's a Hole at the Bottom of Math
2021.5.25.10.03.21 There's a Hole at the Bottom of Math
2021.5.25.12.34.12 There's a Hole at the Bottom of Math
2021.5.25.16.32.40 There's a Hole at the Bottom of Math
2021.5.25.17.08.39 There's a Hole at the Bottom of Math
2021.5.25.19.30.45 There's a Hole at the Bottom of Math
2021.5.25.20.04.31 There's a Hole at the Bottom of Math
2021.5.26.51.63.4 There's a Hole at the Bottom of Math
2021.5.26.61.61.2 There's a Hole at the Bottom of Math
2021.5.26.82.74.2 There's a Hole at the Bottom of Math
2021.5.26.95.72.8 There's a Hole at the Bottom of Math
2021.5.26.12.16.55 There's a Hole at the Bottom of Math
2021.5.26.12.48.51 There's a Hole at the Bottom of Math
2021.5.26.13.50.07 There's a Hole at the Bottom of Math
2021.5.26.14.01.34 There's a Hole at the Bottom of Math
2021.5.26.20.30.33 There's a Hole at the Bottom of Math
2021.5.26.21.47.43 There's a Hole at the Bottom of Math
2021.5.26.23.53.12 There's a Hole at the Bottom of Math
2021.5.27.62.32.7 There's a Hole at the Bottom of Math
2021.5.27.13.54.35 There's a Hole at the Bottom of Math
2021.5.27.22.21.00 There's a Hole at the Bottom of Math
2021.5.28.11.27.07 This is Math's Fatal Flaw
2021.5.28.12.54.35 This is Math's Fatal Flaw
Here's an example
Really I just want to ignore the duplicates, but you can see how many wasted API calls are being made
there's also the issue of backtracking -- a binary search could easily find one change, but I want to find them all
flatten things first
But discovering the nodes takes API calls too -- if the video has been out for 10 years, I wanna check just the first year and the last year first right?
I feel like this is kind of a tree problem..
is it cheap enough to do all the calls to get the dates without the values?
so you have to do some search guessing where the midpoint would be?
Right... I mean, at the top level it's easy, try the middle year. But it gets messy fast.
it's not easy even then, unless you assume samples are evenly spread
In my case the examples are probably frontloaded with changes then stop but I feel like this can be solved in log(n) API calls regardless
where n is the total number of archived points
if you're fine binary searching on the actual time then that might work, but it makes a bunch of assumptions
no chance of log(n), it will also scale with the number of changes
Oh that's true...
