#algos-and-data-structs
1 messages · Page 2 of 1
where did your expressions come from
2^d is the number of nodes at a specific level of height d?
oh no that can't be right looking at your picture
oh it could be the max nodes at said height
2*2*2 = 8 nodes max at height 3
ok so i've sort of just derived where your top expression comes from, what about the 2 on the bottom. i notice one is also equal to the max nodes of a given height
from here 
my fraction is roughly how much is in the left subtree
I guess the exact thing is
(left subtree size)/(full tree size)
(2^d - 1)/(2^d - 1 + 2^(d-1) - 1 + 1)
(2^d - 1)/(2^d + 2^(d-1) - 1)
for simplicity let x = 2^(d-1)
(2x - 1)/(3x - 1)
```which is < 2/3
(when x >= 1)
they claim the total cost to build a max heap is:
i'm wondering how to read this, is it that expression in the ceiling * O(h)?
Guys I got coding assessment from Amazon, how can I become better at data structure and algorithms
it's the sum of the whole expression for values of h from 0 to floor(lg n)
I'm assuming it's sum(number of elements at level h * O(h))
i.e. the cost of inserting elements one at a time, each time the cost is O(current tree height)
actually, maybe my interpretation is wrong, idk 
if i send the entire expression, you'll be able to tell what it means
maybe i'll do that tomorrow i don't want to type it all out and i haven't gotten much work done today
do you have a copy of CLRS?
3rd ed
I have a pdf
ok that was from pg 159, the time required by max heapify
section 6.4
if you care to look. if not no worries
it is prefaced by
The time required by MAX-HEAPIFY whencalled on anode ofheight h is O(h) and so wecan express the total cost of BUILD-MAX-HEAP as being bounded from above by
so yes, it is close to what I said
hmm.. can you help me conceptually with what it means to sum terms * big O of something
for all levels (heights) in the tree, all nodes at that height costs O(h)
from here
you know the definition of O(f(n)), basically the thing is bounded by a constant times f(n)
you can see that they don't care to much in their derivation though, all we care about is O if the whole expression
so that constant doesn't matter, the outer O will just absorb that
the n is independent of h, so it can be factored out of the sum
ok, before that, we can replace ceil(n/2^(h+1)) with n/2^h
it will not be more than a factor 2 larger, and constant factors will be absorbed by the O
one sec
then as I said you can factor the n out, since it's no longer in the ceil
and you basically have their thing on the right
I wouldn't look at individual terms like that, it doesn't make much sense
evaluating their thing on the right makes more sense
but like if O was some other function, to evaluate it you'd go through each h=0,h=1,...floor(lgn), where for each h you evaluate the ceiling and multiply that term times O(h), to get each term
what theynare saying is just that the work is proportiinal to the height
O is not a function
hmm
i just think while i'm considering this i may as well know what precisely its describing so in the future i can describe said thing with this precision
I mean, you could roughly replace O(h) with c*h and it might make more sense
but we want the O of the whole expression anyway
which will absorb that constant
i can't see what is meant mathematically without considering O(h) as a function or like, some term
otherwise what would you multiply the expression in the ceiling with
this means the expression in the ceiling * O(h) correct?
expression in ceiling * O(h) + new expression in ceiling * O(h+1) ... +
is this incorrect
this is the kind of stuff that can feel sloppy when using O notation, but if you go through the details it checks out
or maybe expression in ceiling * O(f(h)) + h+1 in ceiling*O(f(h+1))???
am i reading it correctly? i'll reread what you wrote above
i don't know if its sloppy i just don't know what its saying
formally you would write things in terms of the definition of O
O(h) basically is a function c*h
for some constant c
(and the usual details of the definition)
does the sum make more sense if you replace O(h) with c*h?
something like f(x) being upper bounded (O of) means that there is some other anonymous function, g(x), for which g(x) is always greater for all n > n0
still, this question
idk i still haven't gotten the answer if the expression is interpreted as multiply by the term on the right (the ch term)
yes, it's just a multiplication
ok yeah then i think it would be fairly easy to evaluate. to look at an algorithm and then spit this out, not so much
fwiw we are kinda living in the big O world here, where these constant factors like c don't really matter
i understand i just want to know what is meant by the notation when looking at something like that
the left sum is just (number of nodes at height)*(work required for nodes at that height)
like i want to understand what the person who wrote it meant exactly, as that is the intention of the notation, to convey a precise meaning
summed over all the valid heights
what they meant is what I wrote just above

ok
the work required for nodes at height h is O(h)
and then the simplifications from there is in this big O world where constants (and lower order terms don't matter)
they simplify to the sum on the right, which is something they know how to evaluate
they mention some result they have calculated before, but the sum of h/2^h from 0 to ∞ has a pretty cute proof
1/2 + 2/4 + 3/8 + 4/16 + ... =
1/2 + 1/4 + 1/8 + 1/16 + ...
+ 1/4 + 1/8 + 1/16 + ...
+ 1/8 + 1/16 + ...
+ 1/16 + ...
```sum each row and you have
1 + 1/2 + 1/4 + 1/8 + ... = 2
it's the geometric sum over 1/2^n all the way down 🐢
it's turtles all the way down huh
believe it or not i've read some of the brief history of time 😜
sorry *a brief history of time
is there a significant difference in a Queue object and just...a list where i append and pop(0)
yes, the Queue will be substantially faster
appending adds to the end, pop(0) would pop from the head of the list no?
right
right
ty
seems like in this quicksort implementation they actually want one of their array indices equal to negative 1, it doesn't look like it'd throw an error as they never call array[i] anywhere until its equal to at least 0, does that sound right
there even is a picture where i is before the first element of the array
sorry they want a variable that'll be used to look at indices into the array the begin at -1*
that sounds reasonable
i'm debugging on paper so i can't just run and see
thanks
debugging is wrong word. i'm running it on paper / in my mind
well it worked in theory. it sorted my array into everything lower than the pivot element and higher and then moved the pivot into the middle
i think i lucked out on picking an array where that could work cleanly, in practice i'll need the recursion
this is a great exercise if you want to really get a feel for how the thing works
hello
Hey I'm having a little bit of trouble, I'm writing an algorithm to return the intersections of two lists of intervals as such
def intervalIntersection(self, firstList: List[List[int]], secondList: List[List[int]]) -> List[List[int]]:
if not firstList or not secondList:
return None
res = []
for i in range(len(firstList)):
if secondList[i][1] < firstList[i][0]:
res.append(secondList[i])
continue
elif secondList[i][0] > firstList[i][1]:
res.append(firstList[i])
continue
else:
try:
if firstList[i][1] == secondList[i + 1][0]:
res.append([firstList[i][1],firstList[i][1]])
elif secondList[i][1] == firstList[i + 1][0]:
res.append([secondList[i][1], secondList[i][1]])
except IndexError:
pass
res.append([max(firstList[i][0], secondList[i][0]), min(firstList[i][1], secondList[i][1])])
return res
Expected:
Input:
[[0,2],[5,10],[13,23],[24,25]]
[[1,5],[8,12],[15,24],[25,26]]
Output:
[[1,2],[5,5],[8,10],[15,23],[24,24],[25,25]]
However from the previous input I am getting this output:
[[5,5],[1,2],[8,10],[24,24],[15,23],[25,25]]
For reference, this is leetcode 986 interval list intersections. Any suggestions?
the expected output is just in sorted order, no?
guys having difficulty in understanding leet code questions ,i do know the basics of python but not able to solve easy questions as well ,any suggestions guy
are you familiar with any data structures or algorithms?
i am getting introduced to it so what i thought was ,if a watch a topic like arrays i feel if we implement it right away will get more knowledge towards it ,this is what i thought correctly me if i am wrong
please guide me whats the correct approach towards dsa and leetcode
as well
that is generally true
if you could give me any advice on how do i improve dsa and leetcode problems it would be very helpful for me
don't just bash your head against a problem for hours. after some time, maybe 30mins to an hour, try looking at hints/the solution and try to understand it
yes sir i have done that as well ,but the problem i am getting over there is that i do understand correctly as soon as i see the solution but again if i see a new problem its the same story again i dont understand what exactly they want from me from the question
do you have a specific problem in mind?
no sir i dont know if i could articulate properly like i will give you an example sir i was solving an boot camp question related to converting Celsius to Fahrenheit it is pretty straight forward answer i am able to get the answer is different intelij platforms and all but when there is a test cases
i fail sir
i fail to recognize what excatly do they want
from me which method do they want
the method you use doesn't really matter, all that matters is the right answer
it might just be that you don't have enough experience interpreting problem statements
yes sir that might be the problem will solve everyday and be trying my best thank you for your time sir
Hi, is there a way to create an array of matrices efficiently in numpy?
Suppose I have a 1D array a.
From each element, I want to create a matrix, say M_i=[[a[i], a[i]+1], [2, 3]].
Can I vectorise this operation? That is, can I create all matrices M_i with a single command?
Hey guys i need some help
array 1 : ['itachi', 'naruto', 'kakashi',]
array 2: [['requestrr-itachi', 'false'], ['mylar3-itachi', 'false'], ['pyload-itachi', 'false'], ['pyload-naruto', 'true'], ['pydio-kakashi', 'true'], ['mariadb-kakashi', 'true']]
result : [[itachi,['requestrr-itachi', 'false'], ['mylar3-itachi', 'false'], ['pyload-itachi', 'false']], [naruto,['pyload-naruto', 'true']], [kakashi,['pydio-kakashi', 'true'], ['mariadb-kakashi', 'true']]]
using array 1 and array 2 i want result array can someone help me with it ? thank you 🙂
What's the point of adding a C here. If it's gonna surpass for a certain n. Then it will anyways. Why we include a constant unnecessarily?
Okay looks like, it's to include the same degree of polynomials too
right
hi all,
can you help me to understand this proof. i tried by plugging in cn^2 for all n in the expression but that doesn't work out
i sort of see what's going on
the upper bound is sensible
T(n) = T(n-1) + Θ(n)
<= c2 (n-1)² + c2 n (use hypothesis)
= c2 (n² - 2n + 1) + c2 n
= c2 (n² - n + 1)
<= c2 n² (for n >= 1)
the lower bound calculation seems wrong
T(n) = T(n-1) + Θ(n)
>= c1 (n-1)² + c1 n (use hypothesis)
= c1 (n² - 2n + 1) + c1 n
= c1 (n² - n + 1)
```which is not `>= c1 n²`
may i ask what you did here:
= c2 (n² - n + 1)```
factor out the common c2, then -2n + n = -n
so, middle step c2(n² - 2n + 1 + n) if that helps
thank you. so i can just pull out the c2 from the c2(n) term and add it to the c2(n^2...) term without changing the coefficient
it's just
a b + a c = a (b + c)
the lower bound just looks wrong
when i get done algos i'm going to have to repay you somehow
i didn't understand their statement that the pivot element of quicksort is never again selected in any subsequent call to quicksort
seems like it easily could be if it happens to fall at the end of one of the subarrays
oh nvm, maybe the q partitioning changes that
I don't see any reason for using that hypothesis...
🤷♂️ these are probably student solutions. was found online
here:
what is meant by this line:
"counting sort assumes that each of n input elements is an integer in the range 0 to k, for some integer k. When k = O(n), the sort runs in theta(n) time."
T(n) = T(n-1) + th(n)...[keep expanding] = th( 1 + ... + n) = th( n(n+1)/2 ) = th( n^2 + n ) [coefficients can be ignored] = th( n^2) [polynomial growth is higher than linear growth]
no, since you recursively partition elements to the left of the pivot, and to the right of the pivot, you never touch the pivot again
hm ok
where did you get theta(1+2+...+n), from from the idea of asymptotics and ever increasing input size n?
th(1) + th(2) + ... + th(n) = ... [combined them]
still don't understand what they mean by an integer being O(n)
think about what O() means
an integer n in the set O means that there is some anonymous function f(n) such that n can be upper bounded by c(f(n)) for all n greater than n naught
idk what you mean by anonymous function here
O( ) is the set of all functions that upper bound n for all n greater than n naught
it just means it's bounded by n
oh, they are just saying that if the biggest number isn't asymptotically greater than n, then the overall complexity of counting sort is O(n)
the complexity is really O(n + k)
but if k is not greater than n, then we can just say O(n)
if the biggest number in the input array isn't asymptotically greater than the length of the array?
idk what it means, maybe its not important
but thinking of an integer k as having O(n) is making my brain hurt
it's the same as any other function being O(n)
but its not a function
i mean i guess the set of all integers can be upper bounded by a linear function
it is a function
an integer k is a function?
it's a 0 degree polynomial
like as in k^0?
kx^0
oh right bc k^0 = 1
not k
hmm.. interesting to think of integers in that way
i guess it just comes naturally to some
so in that case when might some integer k not be O(n)?
are all loops mathematically equivalent to summations?
like all loops of the form for i in range (1,x)
no, only if you're summing inside
but typically we are interested in getting the runtime by summing constant operations that happen inside loops, no?
oh, for the runtime you mean. then yes, you're summing the operations inside over the number of iterations
deepends on what commands are used inside the loop...
right right
k could be proportional to something else, like sqrt(n) or n^2 or something
but in their statement they say that each of the n input elements is an integer in the range 0...k, for some integer k
so we know its going to be linear / k is just some integer at the end of a range beginning at 0
well maybe not that its going to be linear
well actually yeah bc regardless of the input elements, an array grows linearly with each additional element
so it doesnt matter even if its in the form of like a=[1,10,100,1000]
well, that's not linear
the length of an array grows linearly with each additional element
sure, but the elements in it might not, which is what we actually care about
so counting sort runs in O(n+k)? which is equivalent to the size of the input array plus the largest integer in the array?
at first i struggled to see why that may be, but then, it must be dependent on the greatest integer bc if you have k of say, 100, then you need an array of length 100 to get the equivalent index.
makes sense i guess
exactly
hmmm ok so then radix sort becomes theta(d(n + k)), where d is the number of digits, and you use a stable sort such as counting sort with runtime of (n+k)
what is meant by a 64 bit version of windows?
i ask bc they are now discussing n b-bit integers
A 64 bit version of Windows is a version of the Windows operating system that is designed to run on computers with 64 bit processors.
as in, the cpu?
nvm i see it
64-bit operating system, x64-based processor
how can bucketsort be faster than insertionsort if it requires multiple calls to insertionsort?
i could see it being possible if each call to insertionsort was runtime complexity of omega(n)
it's faster in most cases, because if the input data is decently spread out there will be very few elements per bucket
tl;dr a sum of small squares is less than a big square
there is also the option to continue sorting by doing bucket sort again on the bucket contents, which ends up being something like radix sort
It looks like probability data if I’m not mistaken
Like the reason why an input array might be all of that decimal type and size?
tim sort is best overall
Hmm haven’t covered that one yet
woah. quicksort is stupid fast
sorting 10k random numbers in 0.053s
which random command do i use for a random element of an array
random.choice(array)
having trouble understand the pseudocode, they call for i = RANDOM(p,r), where an array has indices array[p,r]. i don't know if they want to pick randomly between p and r or a random index of the array
pg 179 CLRS
that means get a number in that range
its just one of the elements of the array, i believe. i tried
i = random.choice(array)
code wasn't having it
no, it's looking for a random index
then which argument would i use for it
wdym?
i can't pass random.choice the indices, it takes zero arguments
their pseudocode is using a random index, but your code chooses a random element. i'm sure you can figure out how to make that work
they're only using the index to access the list anyways
well, it takes 1, a sequence
alright i have office hrs now, i'll be back
anyone here familiar with DFS?
if anyone is familiar with DFS/BFS please send me a PM. thanks guys
You should post here what your actual question is regarding DFS/BFS.
Hey I need sanity check from someone else. For work not learning/fun - program has tons of graph theory going on, its basically calculating routes for estimating. Anyway using a Point class with x and y instead of just a tuple with floats for a vertex. But I have thought about making a edge class 5 times and can't decide if its stupid or a good idea.
edit: nm, its going to be useful for sure.
dontasktoask
there appears to be a mistake here with an extra c in the second line
between the two expressions
hmm ok
so they make the expression slightly larger in keeping with the <= inequality
yeah, they make it larger but gets rid of the annoying ceil
so, so far in this book they've been simply ignoring a lot of the floors and ceilings, when must you deal with them instead?
"under reasonable assumptions, the average time to search for an element in a hash table is O(1)."
🤯 🤯 🤯
that's why they're so useful
There is a trade-off, which is that it often requires more memory than for example an array with the numbers tightly packed in consequent memory
not asymptotically more, though. and the search speedup is huge
when i run python and tell it to generate an array of 10k numbers, they are stored in one large continuous spot in memory. is this on/in the ram or the cache of the cpu
pretty much, yeah. a lot of things you can deal with in memory, but stuff like dbs are on disk
whats the upper limit on what my RAM could handle in terms of an array of like 1m, 10m, 100m numbers
depends how much you have
right but then there is the linearity aspect as well
in other words, a linked list has the advantage that its stored in discontinuous space in memory(ram)
if i'm understanding correctly. but i don't understand what that means exactly
Good afternoon
the elements are not necessarily stored contiguously, i.e. right next to each other
this isn't actually correct, there are pointers stored contiguously, the integers are not
ok. sounds like there are subtleties not caught in my reading. thanks
it's more a python thing
in many other languages the objects would be stored contiguously
in C, an array is literally just a block of memory
so inferring from what y'all are saying, i'm envisaging that RAM is addressed or indexed in some way
wrt cache, the cpu will load stuff in to/out of cache as needed, you don't have much say as a programmer
yeah, it's broken down into things called "pages" stored in a page table, and that's the extent of my knowledge
stuff is in ram, if you read from some location in ram the data at that location (and stuff nearby) is pulled into the cpu cache
so one giant array in C would be like pages in range(1,n)
you don't really have control over what the OS gives you
ok ok. i guess i'm getting off topic. the cool thing about my ram is that is has RGB. also off topic. back to hash tables
haha!
CPU: AMD Ryzen 5 5600X 6-Core Processor 3.70 GHz
i also have a pretty sweet GPU if i ever learn to do ML stuff
does python have an inbuilt linked list data structure?
kinda, collections.deque
would that be equivalent to single or doubly linked lists
doubly
so might one create a hash function/table and resolve collisions via deque?
one might. although, others might say that's cheating
ok. setting aside the question of cheating wrt programming, i've seen linked lists implemented in python as classes. why might one do this rather than, say, linking pythonic lists themselves? or is that what they are doing in essence
wdym linking pythonic lists
python lists aren't linked lists, though
right i'm saying why not build a linked list by using lists themselves? is it bc its inefficient? or it would break when you go and delete one?
like you could have each node be a list of length 3 with list[0] = previous node, list[1] = data, list[2] = next node
that could probably work
there must be some reason that's not the conventional approach
having names for them is easier, self.val, self.prev, self.next is better than data[0], data[1], data[2]
hmm right. i wonder how the time complexity operations would change
it's the same, list indexing is O(1), same as looking up a variable name
i'm not following. i know constant time operations are O(1), like eg. declaring a variable or something. O(n) operations are like, when have to search every item in an array, as an example. wdym list indexing
doing a_list[an_index]
oh ok
iiuc, all you've changed is instead of using 3 separate variables, you store them together in a list, right?
idk what i mean exactly, just when thinking of building a linked list, the first logical step is to take an existing data structure and find the way to link it. all theoretical at this point
i was just wondering why people don't take that approach, and as you've pointed out it could be as simple as having better naming conventions
anyone know the time complexity of my code in pseudocode ? ```python
function g(n)
i = 1
x = 0
while i < n do
i *= 4
j = 1
while j < n do
x += j
j *= 2
k = 1
while k < n do
x += j
k += 8
return x
i think its O(n^2logn) but not sure
isnt it n(log n)^2 
how?
i loop is log n, j loop is log n, k loop is n
oh wait do you mean O(n^2log^2n) right
or is that just the logn?
n * ((logn)^2)
i never seen the squared after logn usually its log^2n
sure same thing
oh ok thanks
you can totally store nodes in a list, and keep track of the index of the next node
there is a fun structure where you have a prev array, a next array and a data array to implement a doubly linked list
so prev[i] is the node before i
next[i] is the node after
and data[i] is the value
What about the data associated with the node? That is why structs / classes are used. They have heterogeneous data / types within them.
You can def. do a struct with data, and then a list of 2 pointers / references (next, prev).
In Python you could do a list, because lists in Python can have different types for each element, but those are really annoying to deal with.
(You could also just not use classes and use lists for everything, it won't be fun)
If you have a class for the node, it shows up as a Node type. But if you use a list, it shows up as a list, regardless if it's a list representing a node or not.
(You want the node to be its own type)
(Python is more loose with typing, but I don't think most devs want to be that loose with typing (in a large project) (also entering LISP territory (everything a list)))
(Unless you really like LISP)
what would be the time complexity of this algorithm?
ArrayStack() basically being a stack data structure means push and pop operations are both O(1)
what do you think?
doesn't it depend on how many time the while loop runs? also most of cost will be due to find operations i assume
it doesn't really matter how many times the while loop runs
find runs in O(n) in the worst case since it needs to search through the entire raw text ?
yes
i am not sure, wouldn't you keep doing find("<") and find(">") until you exhaust all possibilities in terms of tags? So it is kinda dependent on the number of tags. Ngl i am a bit lost, could you help
you call find multiple times, but you adjust the start location for each call. this means subsequent calls don't search previously searched parts of the string, right?
so no matter how many times you call find, you only look at each character once
ahhh i just realised so we will not revisit characters due the adjustment of the position to start from for the find() method, that means overall it runs in O(n) (as we look at each character once)?
technically you do revisit chars with tag = raw[j + 1: k], but this is also linear. so you visit chars at most twice
ah ok thank you for the help
Can someone please explain the colors in red black tree I don’t get it, it says black is null but I see it at the root?
I'm doing
Y=train['target']
X=train.drop(columns=['target'])```
when len(train) is 200.000, but after this len(Y) is somehow 200.001. What could be the reason?
Hey anyone knows some good resources or YouTube channel to learn data structures and algorithms? I can't find any which must be good
how does this function have O(sqrt(n)) Time complexity?
if are t iterations
then 1 + 2 + 3 + ... + t = n
or t(t+1)/2 = n
or t^2 + t = 2n
or t = O(sqrt(n))
If anyone has a solution to this task in python that passes please send it
my code passes all tests except one but I really don't know what to improve
in the loop s goes
s = 1
s = 1 + 2 = 3
s = 1 + 2 + 3 = 6
...
```in general the kth iteration is
s = 1 + 2 + ... + (k-1) + k = k*(k+1)/2
```so how many iterations are there until s > n and the loop terminates?
i.e. what's the smallest k such that
k*(k+1)/2 > n
you could solve the quadratic explicitly, but the solution is basically a square root of n
thanks
cpython is too slow
pypy works fine
Hey all, page 393 I’m reading about the longest common subsequence of two DNA sequences. Under step 2: a recursive solution, I’m not understanding when they say if x_m = y_n if they are meaning the bases are the same or the sequences are of the same length.
Pg 393 CLRS 3rd edition section 15.4
I also do not understand their notation for the longest common subsequence of two sequences Xi and Yj to be c[i,j], because as they pointed out, the LCS may be anywhere along the sequence and omit any bases from either, thus you cannot capture it in an array type fashion
c[i,j] is the length of the longest common subsequence if you consider a prefix of both strings (of length i and j respectively)
it's not saying anything about what the subsequence is, just what its length is
Ok thank you for that any idea about the if x_m = y_n statement?
Looks like in theorem 15.1 they must be meaning if the elements are identical
Not the length
I think that's just if the characters at those indices match
Correct
and capital X, Y are the prefixes I think
But, when they speak of Z_k-1, then they’re talking about a subsequence
I don’t understand the logic following the and in statement 1
Z, where?
I have page numbers
ah yeah, Z is an LCS there
I don’t understand the logic following the ‘and’ in statement 1
I understand statement 2
If the final elements are not equivalent, then the LCS must be of the subsequence X_m-1 and Y
first statement:
if x_m == y_n; then the last element is part of the LCS, and the LCS without the last element is the LCS of the sequences without the last element
second:
if x_m != y_n then the last char of X not matching the last char of Z means the LCS must be that of X without the last element and Y
third is the same but with X and Y swapped
basically, if the chars mismatch and the last char of X doesn't match the last char of the LCS, then you can ignore the last char of X and the LCS is the same
Yeah I think I’m getting it now, thank you
What’s the purpose of dummy keys in optimal binary search trees
if the value you're looking for isn't in the tree you end up in a dummy node
Is that just to make a returning NIL feature
I think it just makes code easier
if dummy node then element doesn't exist
I think in practice that dummy might be None in python or null/null pointer in other langs
can you expand on this? I'm a bit confused why I would need ternary
I've been thinking of something like "divide by 2" every time it continues improving, otherwise multiply by 2
I just didn't know how to prevent them from bouncing back and forth if I'm going towards the minimum
for example f(10) = 10, check f(5) = 15, check f(2.5) = 20, check f(1.25) = 19
that means the answer is between 1.25 and 2.5 so these are my new ranges
maybe whenever I get to such a situation I would start rising slower
like instead of multiplying by 2 I would multiply by 1.75
so I am probably going to do that
if score better than prev run => continue with operation
if score worse than prev run => do opposite operation but not as strong
2hard4me
You are given two non-negative integers A and B. Let the function Sum(A) be the sum of digits of the number A. You can perform any number of moves (possibly zeros).
In one move you can increase the value of A by 1.
Find the minimum number of moves you have to perform to make sum(A) <= B.
Constraints:
1 <= A <= 10^12
1 <= B <= 108
Sample Test Cases:
- A = 1, B = 1 Output = 0
- A = 9, B = 2 Output = 1
- A = 555, B = 10 Output = 45
*inputs are given in string.
How to solve this problem in python?
Here is the code I have tried:
def countmoves(A,B): int_A, int_B = int(A), int(B) count = 0 while True: digits = [int(str(int_A)[i]) for i in range(len(str(int_A)))] s_m = sum(digits) if s_m <= int_B: return count else: int_A += 1 count += 1
But my code is getting Timed Out. How can I optimize it?
do bigger steps
suppose the number is 783910
look at the very end 783910
you can't decrease this, so look at 2 digits 783910
you can drop this to 0, make a 90 step and you get 784000, now start over
Yes, thanks!! I'll think over it.
it doesn't feel optimally optimal, but that should be fast enough anyway, and it's easy to understand
I did get your point, but not able to code it.
well it should be easier to do it in one step then
assume you zero out the suffix and it increases the rest by one
suppose you need to decrease the sum by 12
783910 we assume this will decrease it by −1
783910 that's 0, −1+1
783910 10 − 1 = 9
783910 13 − 1 = 12, that should do it
you just add this up, subtract 1 and see if it will be enough
at that point you checked 4 digits, so you subtract 10**4 − 3910
Hello there, if I need to uniquely identify a dictionary (for example by computing a hash) would using something like
hash(tuple(sorted(dict_.items())))
be a good solution?
why not str
class Solution {
public:
int maximumGroups(vector<int>& grades) {
return -1+ ((1+ pow(1+4*(2*grades.size()),0.5))/2);
}
};```
one liner medium
that ignores the actual grades
does that really hold?
oh
kinda does, but not enough i think
hm
Are you asking me?
yes, hash(str(dict_)) is just simpler
It wouldn't be sorted then
oh true
In my case order shouldn't matter
Dicts preserve insertion order, i think that holds for the str repr but i should check
hash(tuple(str(x) for x in sorted(dict_.items())))
>>> x = {1: 1, 2: 2}
>>> y = {2: 2, 1: 1}
>>> str(x)
'{1: 1, 2: 2}'
>>> str(y)
'{2: 2, 1: 1}'
Ehm
This is just longer
yes, but your values can be unhashable this way
That's not required, I assume all values are hashable here
ok then yeah the original should work
yeah i see
why did you sort A?
ah, that shouldn't be needed, some logic from an earlier wrong attempt
hmm, weird I had almost the same code as you and it didn't pass the time, also How are you so good at doing these tasks? I mean do you do competitive programming proffesionally or you are just good at programming in general?
yeah, now it passed, thanks
I did competitive programming for fun for a few years, and I'm very used to problem solving stuff in general (loads of math and physics and whatnot).
I'm not amazing at it compared to some others, but I'm good enough at it :P
Thanks, I hope i become as good as you are
Can anyone help in suggesting some good course to learn data structures and algorithms..it will be really helpful..
check channel pins
i just grind leet code
and use neetcode on youtube to explain solutions
Yeah, hackerrank, leetcode, codewars are all great options to practice with that imo
hi all,
in reading CLRS, page 288, I'm not seeing where the expression (c + d)n + c comes from
its the proof that inorder tree walk will take O(n) time
d is some constant that represents the time taken for the body of the script independent of the recursive calls
c is i suppose the usual constant we'd use when trying to upper bound something in the form cn when providing an O(n) proof
if I have an html code with a list ( <ol><li>... ) Whats the best way to convert it into a json?
beautifulsoup, perhaps
thanks
And what about learning the basic concepts like various algorithms...from where i should get those lectures..
check pins, there are free MIT lectures on youtube
Introduction to algorithms 3rd edition is pretty much the gold standard, its sold over 1m copies
by cormen, leiserson, rivest, stine. colloquially CLRS
very technical for beginner i'm finding
well, with algoritims the best way to learn them is to just practice them
i didn't begin learning algos by learning how they work, i just did questions and solved coding problems
once you know the basics of any programming languages (functions, conditionals, loops, data types, lists ... etc), your ready to learn algos and datastructures
why cant for the first one return [50, 50,51,51]?
so try and make as little chucks as possible whilst making it between min and maxx?
Hmm, what comes to my mind is... Take n=math.ceil(number/max). Then you need to divide number into n equal-ish chunks, which can be done like
mod = number % n
results = []
for i in range(n):
results.append(number//n + (1 if i<mod else 0))
This produces splits like in your example.
Now, does that always ensure that MIN is also guaranteed... well, only if number//ceil(number/max) >= min, which, hmm, I'm not sure about.
for your MIN and MAX, it seems to work fine
def split_to_chunks(number, min, max):
n = (number+max-2)//(max-1) # that's like ceil(number/(max-1))
mod = number % n
results = []
for i in range(n):
x = number//n + (1 if i < mod else 0)
assert min <= x < max, x
results.append(x)
return results
MIN = 10
MAX = 100
[split_to_chunks(i, MIN, MAX) for i in range(MIN,1000)] # no errors
should i practice implementing binary search trees, red-black trees before learning about B-trees?
the implementation of trees in python is straightforward using classes for tree and for node, although i haven't done it myself yet
i'm also not seeing the similarity between the pseudocode and implementations of trees in python, as CLRS has algorithms for every operation, and in the node class you just have attributes to do things like delete, insert, etc
it can for sure fail if min and max are too close so no solution exists (maybe having a check for that rather than an assert would be a good idea)
but if there exist a solution this should give the one with the smallest number of chunks
would y'all recommend creating my own binary tree type from classes from scratch so that I can implement their algorithms for searching, inorder tree walk, insertion, etc? i feel like this is a waste of time when I could just use binarytree module however, the book does build up to more advanced structures such as red black and 2-3-4 trees which idk if i can implement with python's binarytree mod
a binary tree type is pretty trivial though
a tree node is just
class Node:
left_child: Node
right_child: Node
value: ValueType
and for a general tree, replace left/right child with a list of children
this is what I have so far:
class node:
def __init__(self, key):
self.left = None
self.right = None
self.key = key
i need to learn how to build it into a tree and use CLRS algorithms on it. i mean i don't have any assignment or anything like that but the chapters seems to be building like first we learned trees, then bsts, red-black, now i'm on b trees
@haughty mountain could you walk me through it sometime if you have the time
brb
sure
class node:
def __init__(self, key):
self.key = key
self.left = None
self.right = None
class tree:
def __init__(self):
self.root = None
def node_addition(self, node, key):
if node == None:
self.root = node(key)
if key < node.key:
if node.left == None:
node.left = node(key)
else:
self.node_addition(node.left, key)
else:
if node.right == None:
node.right = node(key)
else:
self.node_addition(node.right, key)
def inorderPrint(self, node):
while node is not None:
self.inorderPrint(node.left)
print(node.key)
self.inorderPrint(node.right)
test_tree = tree()
test_tree.node_addition(test_tree.root, 10)
print(test_tree)
hey all i'm getting an error:
'NoneType' object is not callable```
lines 13 and 33
this is another reason why you name classes in PascalCase, fwiw
your node class is overwritten by the node variable
like pascalCase
PascalCase
ok ty
<__main__.Tree object at 0x00000274255AEAD0>
T(n) = 8T(n/2) + n^3/logn why is this not solvable by master theorem??
what is a master theorem?
In the analysis of algorithms, the master theorem for divide-and-conquer recurrences provides an asymptotic analysis (using Big O notation) for recurrence relations of types that occur in the analysis of many divide and conquer algorithms. The approach was first presented by Jon Bentley, Dorothea Haken, and James B. Saxe in 1980, where it was de...
because the second term is non-polynomial
there are variants that can handle it, but the typical one is only for when it's polynomial
perhaps it isn't in the correct format
what specifically is non-polynomial about that? for learning
the log n
I guess you could ignore the division by log n and you would get some upper bound, but a worse upper bound than necessary
the wiki article includes this case, the expansion of case 2: 2a, 2b and 2c
in the example above you're in case 2d
T(n) = 8T(n/2) + n^3/logn
which ends up being Θ(n^3 log log n)
when building a binary tree would you give the root attribute to your Node class or Tree class
Tree
i'm really struggling with how to write a tree such that the CLRS algorithms will be compatible
there are plenty of tree implementations online but i don't think they're compatible with the CLRS algorithms which have their own algo for every function
I'm learning about data structures and memory location , is there any meaning behind 200 towards 244 for memory locations or are they just random numbers used to showcase memory location ? (I thought I could attach an image) here is a link to the image -> https://ibb.co/Qjgy06n
the start seems to be arbitrary, but it does show they are all next to each other
that referring to my question ?
what's wrong with this code:
class Node(self, key):
def __init__(self, key):
self.left = None
self.right = None
self.key = key
yeah
your class header shouldn't have parentheses, that's for inheritance
so 200 means nothing when it comes to memory location ? they just uses it as an example . You could start at 0 or 400 ?
class Node:
def __init__(self, key):
self.left = None
self.right = None
self.val = key
if __name__ == '__main__':
# Create root
root = Node(1)
whats going on w this syntax
specifically the __name__ line
yeah, the picture doesn't really say anything, so it's probably just arbitrary
!if-name-main
if __name__ == '__main__'
This is a statement that is only true if the module (your source code) it appears in is being run directly, as opposed to being imported into another module. When you run your module, the __name__ special variable is automatically set to the string '__main__'. Conversely, when you import that same module into a different one, and run that, __name__ is instead set to the filename of your module minus the .py extension.
Example
# foo.py
print('spam')
if __name__ == '__main__':
print('eggs')
If you run the above module foo.py directly, both 'spam'and 'eggs' will be printed. Now consider this next example:
# bar.py
import foo
If you run this module named bar.py, it will execute the code in foo.py. First it will print 'spam', and then the if statement will fail, because __name__ will now be the string 'foo'.
Why would I do this?
• Your module is a library, but also has a special case where it can be run directly
• Your module is a library and you want to safeguard it against people running it directly (like what pip does)
• Your module is the main program, but has unit tests and the testing framework works by importing your module, and you want to avoid having your main code run during the test
thanks
has anyone had success implementing binary trees that work with the CLRS algos?
they span many chapters in there
when you say if something: it just evaluates to True if the thing exists correct
why is root not defined in the following code:
class Node:
def __init__(self, key):
self.ley = key
self.left = None
self.right = None
class BinaryTree:
def __init__(self, root):
self.root = Node(root)
def inorderTreeWalk(self, x):
if x == None:
inorderTreeWalk(x.left)
print(x.key)
inorderTreeWalk(x.right)
tree = BinaryTree(1)
tree.root.left = Node(2)
tree.root.right = Node(3)
tree.root.left.left = Node(4)
tree.root.left.right = Node(5)
print(tree.inorderTreeWalk(root))
you never made a variable named root in that scope
can i give inordertreewalk the scope of root or how might one correct this
idk i'm sort of fudging it to make it work with the algos in the book but not entirely sure of what im doing
hmm i changed my call to print(tree.inorderTreeWalk(tree.root)) which runs but returns None
yeah, because your function doesn't return anything
why do they write the pseudocode that way
what way
this is the pseudocode:
INORDER-TREE-WALK(x)
if x =/= NIL
INORDER-TREE-WALK(x.left)
print x.key
INORDER-TREE-WALK(x.right)
why wouldn't they write it that way, all it's supposed to do is print
whats the name for a function within a class, again?
method
i think its not working bc this wasn't written to be a method, but rather an entire separate algo/function to call on a binary tree. i'm trying to write it within the binary tree class as a method
yes, the functions are procedural, not oop
ok ok. got it going
now to figure out how to write a BST instead of a binary tree
like i'm not seeing how to confer the proper properties to it. i guess i have to write it into the tree class.. like eg if node_to_add < node: node_to_add = node.leftchild
or something similar
this could easily be written standalone or in a node class
def add_node(root, value):
if value <= root.value:
if root.left is not None:
# There is a left subtree, add value there
add_node(root.left, value)
else:
# No left subtree, create one
root.left = Node(value)
else:
# Same logic but for right
i cannot use methods such as this, as CLRS has their own TREE-INSERT algorithm
and if i'm asked to implement such algos then having a node addition method in the tree will break things
i'm working my way to that but first i need to get the tree search running which currently is not
this is my code that currently does not work:
class Node:
def __init__(self, key):
self.key = key
self.left = None
self.right = None
class BinaryTree:
def __init__(self, root):
self.root = Node(root)
tree = BinaryTree(1)
tree.root.left = Node(2)
tree.root.right = Node(3)
tree.root.left.left = Node(4)
tree.root.left.right = Node(5)
def TreeSearch(x,k):
if x is None or k == x.key:
return x
if k < x.key:
return TreeSearch(x.left,k)
else:
return TreeSearch(x.right,k)
print(TreeSearch(tree.root,2))
my inorder tree walk is working, so that's good
my thing is basically their thing written recursively
oh you said it could be written standalone i see
it's pretty much the same code for standalone or if part of the node class
hmm
only the function call changes
that's good to know, i was getting flustered not knowing how to implement a tree that could utilize all their pseudocode
e.g.
class Node:
...
def add_node(root, value):
if value <= root.value:
if root.left is not None:
# There is a left subtree, add value there
root.left.add_node(value)
else:
# No left subtree, create one
root.left = Node(value)
else:
# Same logic but for right
ayyy its working
well not the tree search, but the node addition as a standalone class:
class Node:
def __init__(self, key):
self.key = key
self.left = None
self.right = None
class BinaryTree:
def __init__(self, root):
self.root = Node(root)
tree = BinaryTree(7)
def tree_insert(root, key):
if key <= root.key:
if root.left is not None:
tree_inser(root.left, key)
else:
root.left = Node(key)
else:
if root.right is not None:
tree_insert(root.right, key)
else:
root.right = Node(key)
tree_insert(tree.root, 9)
tree_insert(tree.root, 4)
tree_insert(tree.root, 17)
def InorderTreeWalk(x):
if x == None:
return 'empty tree'
else:
InorderTreeWalk(x.left)
print(x.key)
InorderTreeWalk(x.right)
InorderTreeWalk(tree.root)
standalone function*
this tree search isn't working:
def TreeSearch(x,k):
if x == None or k == x.key:
return x
if k < x.key:
return TreeSearch(x.left,k)
else:
return TreeSearch(x.right,k)
print(TreeSearch(tree.root,5))
returns (or prints?) this:
<__main__.Node object at 0x0000025289624880>
you could implement a __str__ or __repr__ in TreeNode for a custom string representation
looks like their treesearch is only identifying if a given int occurs in the tree, not where it is or anything else
Is this a normal behavior ?
they have:
if x == NIL, i have if x == None which i'm thinking are not equivalent statements
they're pretty much the same
can y'all help w my randomized quicksort? it's running but no longer sorting although its based on my quicksort which is working
yes
a new function randomized partition swaps the element occurring at the randomly chosen location with the element at the end of the array
no but out of curiosity what does that one do? graph traversal?
post the code?
import time
import random
arrayA = []
for x in range(10000):
arrayA.append(random.randint(1,100000))
def randomizedQuicksort(array, p, r):
if p < r:
q = randomizedPartition(array, p, r)
randomizedQuicksort(array, p, q-1)
randomizedQuicksort(array, q+1, r)
def randomizedPartition(array,p,r):
i = random.randint(0,len(array)-1)
array[r], array[i] = array[i], array[r]
return partition(array,p,r)
def partition(array, p, r):
x = array[r]
i = p-1
for j in range(p,r):
if array[j] <= x:
i += 1
array[i], array[j] = array[j], array[i]
array[i+1], array[r] = array[r], array[i+1]
return i+1
start=time.time()
randomizedQuicksort(arrayA, 0, len(arrayA)-1)
print(arrayA)
print(time.time()-start)
if i were a better coder i know it'd be obvious by looking at it what's going wrong, but alas i am not
it's kinda weird to do do the randomized partition by swapping an element to the end and then re-using the other partition
though it shouldn't be wrong
it would be more sensible to pass the index of the pivot to partition
just saying it's a weird code design
thats what it says in the book unless i've interpreted the pseudocode incorrectly
uh sort of. more broadly im trying to understand how adding an element of random behavior to a known algorithm changes its runtime
it should be faster, otherwise people wouldn't bother
depends on the input
wait this is wrong
i = random.randint(0,len(array)-1)
it can pick stuff outside the current range
fwiw, picking the last/first element is a very bad choice for quicksort
it's Θ(n²) on already sorted data
what's the specific question about Bellman-Ford?
@haughty mountain really don't know what i would do without your help
hmm its only returning the first so many elements of an array after my print/return statements, is this a function of python itself or my IDE
if i bump it down to n=1000 it'll print all of them
but i need to be working in the 50k - 100k range
python has a 1000 deep recursion limit by default
is that what's known as the "stack size"?
kinda, though this is more of an arbitrary limit python sets
hmm i wonder what i'll have to do to get my scripts working in the higher range
if you pick the last element as pivot and run on sorted data you have O(n) recursion depth
you're probably encountering something like that
run on sorted data?
or what are you referring to here? 
like when i test my scripts i'll print out the randomly generated array and then the array after i call randomizedquicksort
and the terminal only shows the first so many elements of the arrays
i don't get a recursion error or anything, it just doesn't print the entire arrays when they are like, n=10k
it should be a bit similar to Dijkstra, Dijkstra is essentially Bellman-Ford but it only does the relaxation greedily
while Bellman-Ford is just "run relaxation over all vertices n_vertices - 1 times"
if it's code you can just share the code here
you can also put images here
that does not look how I expect Bellman-Ford to look 
what is the unvisited thing even doing?
@haughty mountain i'll wait until you're done helping hinder and then i'd love for you to take a look at why my tree search algorithm isn't working, upon running i'm getting
<__main__.Node object at 0x0000023F312B4880>
stealing part of the wiki pseudocode, this is the main thing happening in bellman-ford
repeat |V|−1 times:
for each edge (u, v) with weight w in edges do
if distance[u] + w < distance[v] then
distance[v] := distance[u] + w
do the relaxation over all edges |V|-1 times
and then the distance array is correct (other than maybe negative cycles)
floyd-warshall is way overkill here though?
and floyd-warshall does not like negative cycles
in bellman-ford you can detect negative cycles
the last code you posted?
you're getting a node back yeah, because you're returning a node 
what did you expect to get?
yeah, that's what I'm guessing as well
getting a Node or None is a sensible api as well though
ok i fixed it up with py return x.key
but now it errors when the search term isn't in the tree
def TreeSearch(x,k):
if x == None or k == x.key:
return x.key
if k < x.key:
return TreeSearch(x.left,k)
else:
return TreeSearch(x.right,k)
you would need to handle that case and return something sensible, e.g. None
here's the code
ok i think it works like this:
def TreeSearch(x,k):
if x == None:
return "does not exist"
if k == x.key:
return x.key
if k < x.key:
return TreeSearch(x.left,k)
else:
return TreeSearch(x.right,k)
err, if x == None why are you accessing x.key?
right
sry its sloppy CLRS pseudocode
theirs literally says if x == NIL or k == x.key:
return x
it doesn't work that way
either it returns that weird statement above or you can tell it to return x.key, and if x = None it gets fussy
this way works better, the if x==None statement i added
@west vigil why are you deleting your stuff? 
but its only returning a search term itself if it exists, not like a node location or anything
it's not a weird statement
it's one of your nodes
ok should i go back to that version and try to modify it to return the nodes value?
it's the default repr of a class, if you want it to print something more useful you can implement __repr__
returing the node from the function seems sensible enough as an api, no?
so.. yes? haha
yes, you can then check if the returned value is not None, and if so you can get the value of it
ok, i'll add a __repr__ method to my Node class
def __repr__(self):
return key
like this?
why is a priority queue such a pain in python? I just want to put objects in it and change the sort order
am I missing something?
import heapq
ugh aren't there any classes?
it uses a normal list and you can use heapq.heappush and heapq.heappop
k how do I change the sort order?
the basic idea of BF isn't that hard, you start only by knowing that the distance to your start vertex is 0, and assume everything else is ∞ (or just some large value)
then you perform this relaxation step on all edges, everytime you run the relaxation more vertices end up with the right distance, after |V|-1 steps everything has the right distance (unless you have negative cycles, which can be discovered with some extra work if those are interesting)
ok i got it working with None, but not with search term
making your own class that wraps heapq is easy though
like if the search term isn't in the tree i return None
what part of it?
if it is, i get:
__repr__ returned non-string (type int)
can you re-post that code?
yes, __repr__ should return a string representation of the object
idk if its working or just returning my search term to me
here's my repr function:
def __repr__(self):
return str(self.key)
it's returning the node it found, or None if the value didn't exist 
def TreeSearch(x,k):
if x == None or k == x.key:
return repr(x)
if k == x.key:
return x.key
if k < x.key:
return TreeSearch(x.left,k)
else:
return TreeSearch(x.right,k)
print(TreeSearch(tree.root,9))
this is succesfully returning the int if its in the tree, and None if it isn't
so let's ignore the parent part for now at least, what's the issue with using direct access arrays? the first loop is the |V|-1 relaxations, you can see that?
i guess it couldn't possibly work that way if it were simply printing the search term
ah, yeah they assume you have vertices numbered 0 to |V|-1
so your adjacency list could be written
myGraph = [
[1, 2, 3],
[2, 4, 5],
[3],
[2],
[3, 2],
[4],
]
more like the weight is some function of the two vertex indices
you could provide such a function that does lookups in a dict or similar
or you could rewrite their code to work with your type of graph representation
when I formulate graphs I tend to write down the edges, e.g. a list of (u, v, w) where u and v are indices and w is a weight, and then I can easily write a few lines of code to generate the graph representation
@west vigil e.g.
edges = [... list of (u, v, w)]
n = number of vertices
adj = [[] for _ in range(n)]
weight = {}
for u, v, w in edges:
adj[u].append(v)
adj[v].append(u) # if undirected graph
weight[u, v] = w
weight[v, u] = w # if undirected graph
def w(u, v):
return weight[u, v]
that would be a valid function w to pass to their function
you can totally rewrite their code to work with your graph representation as well, it wouldn't be that huge changes
I didn't really study computer science, I took a few courses, but most stuff I learned outside school doing programming contests and similar
I didn't really learn Bellman-Ford in particular either, I had to read into it a while back when someone was asking about it in another server when they were working on a problem :P
the idea is similar to stuff like Floyd-Warshall though, in that you do a bunch of relaxations
so not that hard to pick up
all down to practice 😄
programming contest stuff is nice in that you actually get some hands on experience implementing the stuff
just memorizing specific solutions is not a good path to learn things
it's much more important to understand the ideas that goes into the solution
ideas and problem solving techniques is what tends to generalize to new unseen problems
if you know what you need to do but have troubles expressing it in code, then solving some problems would probably be useful, yeah
I think in a lot of cs courses there is too little focus on actually implementing stuff. I think I could grab a bunch of students that took the same algo classes as me and they would struggle a ton to implement stuff
I mean, the theory is also needed
I've tutored someone that is doing a master in AI, and they had to look up how to print a variable in python :/
People get taught theory a lot, but practice a bit too little
but if you don't do things in practice you will struggle to have a deep understanding of things
solving programming problems is all about breaking a task apart in sensible pieces, finding the right tools (algos, data structures, ...) to solve the pieces, and then putting all the pieces together into an implementation
(tl;dr: theory is important, but should be complemented with something more practical)
not now, it's like 1am here :P
wat
so binary search on a sorted input array is lg(n)?
lg is usually used to mean base-10 logarithm. This isn't wrong, since O(log(n)) = O(lg(n)) = O(log2(n)) anyway, but it's a bit weird to see.
but yes
i think in CLRS they use lg to mean log_2
if i'm not mistaken. but i realize not everyone is thinking in CLRS terms
ah, alright
so whats the precise mathematical notations for say, a binary search algorithm with a single while loop
huh, i thought lg was base 2, and log was base 10
natural log is ln
well, every iteration you halve the search space, so it'll be exactly-ish log2(n) rounded up or something like that.
in physics you just write log, same in a lot of maths :p
yeah in chem we use log_10
it also things log is natural lol
real mathematicians only use ln(n)/ln(10) 🥴
hey all i am practicing a mathematical induction proof, i have the recurrence:
is T(n) = nlgn
now i've just shown myself its true for when k=2
but i'm used to proving its first true for n=1 and then the inductive step is showing it must also be true for n+1
so i guess that the k is throwing me, will my inductive step be showing its true for k+1 instead?
here is the whole thing, i don't see where the 2(2^k lg(2^k)) + 2^k+1 expression comes from
Hello can someone help me with a travelling salesman problem?
Its a long piece of code but its under introduction to Artificial Intelligence
def survivorSelection(population, eliteSize):
# TODO 5 (10 marks) - Replace the dummy survival selection function below with
# either Fitness Based Selection or Merge, Sort & Truncate.
# Marking scheme:
# 7 to 10 marks: Correct implementation.
# 5 to <7 marks: Minor errors.
# >0 to <5 marks: Major errors.
# 0 marks: No answer is given.
elites = []
# Replacement starts here #this is dummy/ Use the merge, sort, truncate and take the elite size list from the top
def merge_sort (survivorSelection):
if len(survivorSelection)>1:
left_arr =
for i in range(eliteSize):
elites.append(population[i])
# Replacement ends here
return elites
I have to replace the piece of code BETWEEN"# Replacement starts here" AND "# Replacement ends here"
the previous expression but with T(n) = n lg n substituted in
Hey guys, can anyone help with this Genetic Algorithm problem?
I sent it in picture form to be able to see the notes and full question
Im just not sure what values I need to put in and how to write the python code for Merge,sort and truncate
thank you
hey all,
i don't understand when they say "we add k + 1 to both sides of the equation P(k)". i already see a k+1 in the LHS
sorry, i should have phrased my ? different. its quite apparent that that's what they've done but i don't understand the why
i guess the why is that, that entire T(something) expression evaluates to 2^k when you plug into it your k+1 term
1st eq is what they start with, 2nd is what they want to show, 3rd part is working from 1st eq
so i can see that they have derived an expression for the P(k+1) summation, namely, (k+1)(k+2)/2
it's just "guess the solution and see if it works in the recurrence"
but then that seems to disappear and they go back to working with the original term for the summation k(k+1)/2
could you please just try to walk me through it, everyone in the math server just is condescending
the derivation is the third part
oh i see something i hadn't before. they show the two equations are both equal to the same expression
the assumption is that 1 + ... + k = k (k+1)/2
this also predicts that 1 + ... + k + (k+1) = (k+1) (k+w)/2
let's see if this is true, i.e. take
1 + ... + k = k (k+1)/2
and add k+1 to both sides and see if the right hand side ends up correct, if so the formula seems to work
yes, the assumed formula predicts what the sum of k+1 terms is, afterwards they check if that matches with what we get from going from the sum of k terms and adding the k+1 term
ah i got it! and i even used the old multiply by 2/2 to get there
🎊
boy what a relief. ok. next problem
"conjecture a formula for the sum of the first n positive odd* integers. then prove your conjecture using mathematical induction."
hmm
try to write the first few sums and see if you can guess a pattern
i'm on it
looks like n^2 to me
P(1) = 1, P(2) = 4, P(3) = 9, P(4) = 16, P(5) = 25
i'm going to let P(n) be the proposition that the sum of the first n positive odd integers is equal to n^2 and try to prove
so in this case my inductive proof will involve n+2 to keep the next integer odd?
you could do that, or assume you have 1 + 3 + (2*k-1)
then you have the k first odd integers
with the way you phrased it the number of terms is kinda awkward
i think im using n where i should be using k
e.g. 1 + 3 + ... + n has (n+1)/2 terms
the letter used doesn't matter so much
right
phrasing odd numbers as 2*k - 1 is quite useful in general
so my k+1 term would be 2*k+1
i must have gone wrong somewhere
so i set up that 1+3+...+2k-1 = k^2
so i get that my k+1 expression should be equal to k^2 + 2k+1
ok i got it
i showed that the (k+1) term gets the form k^2 + 2k + 1, which is equal to my expression for the first 1+3+...+k + k+1 term, which was in the form (k+1)(k+1) .. (from the k^2 assumption)
k^2 + 2k + 1 = (k+1)(k+1) .. QED
huh, that's exactly what the book did
thanks for the 2k-1 tip
so you like to show things first for k-1 term and then for k is it?
I often prefer that, yeah
the kth term tends to be a neater target to try to reach
e.g.
(k-1)² + 2k - 1 =
(k² - 2k + 1) + 2k - 1 =
k²
ok i'm working on:
use mathematical induction to show 1 + 2 + 2^2 + ... + 2^n = 2^(n+1) - 1
so the first thing i did was set the expression to end instead at 2^n-1, making the entire summation equal to 1 + 2 + 2^2 + ... + 2^(n-1) = (2^n)-1
which i think is a legitimate manipulation
just shifted the entire thing by one exponentiation
you can pick your starting point to be one earlier, it's not really a manipulation of anything though
ok so so far i have that 1+2+2^2+...2^(n-1)+2^n should be equal to (2^(2n))-1
ok let me back up then
fwiw 2*2^n = 2^(n+1)
I'm guessing that's what you rewrote incorrectly?
no what happened was i incorrectly tried to plug in a 2^n term into the 2^n-1 expression
wait, why? 
misunderstanding
i think i am at the entire expression on the left should be equal to 2^n - 1 + 2^n
2^n + 2^n ..?
you have 2 of the 2^n terms (or equivalently you can factor 2^n out), in any case you have 2*2^n - 1
which is just 2^(n+1) - 1
fwiw the statement itself is pretty obvious when you think about it, you have a binary numbers with all one bits, if you add 1 to it you're left with a single one bit, one higher than the previous highest bit
0b111 + 1 = 0b1000
right bc 2*(2^n) = 2^1 + 2^n = 2^(n+1)
that middle step is wrong
and the last one
you're missing () in the last one?
() are important for the expression to be correct
the middle step is wrong though
you have + where it should be *
no i know. i meant to write that 2^1 * 2*m == 2^(1+m)
but yes, x^a * x^b = x^(a+b)
but at any rate i'm not getting the right answer, now i have 2^(n+1) - 1 for one expression and (2^n) - 1 for the other
how so?
no idea what this means
idk i was using the previous problem as a model and trying to follow it but i must have gone awry
your guess is that in general
1 + 2 + ... + 2^n = 2^(n+1) - 1
```right?
err, wrong reply, but whatever
thats how it was given to me, i changed it to
1+2+2^n+...+2^(n-1) = (2^n) - 1
you chose to prove this by starting at a term one before and adding a term
1 + 2 + ... + 2^(n-1) = 2^n - 1
```add the next term
```py
1 + 2 + ... + 2^(n-1) + 2^n = 2^n - 1 + 2^n
= 2^(n+1) - 1
then i add the next term to both sides:
1 + 2 + 2^n + ... + 2^(n-1) + 2^n = (2^n) - 1 + (2^n)
so RHS equals 2^(n+1) - 1
which is what we wanted to prove
ahh ok. i didn't go all the way back to the beginning, only went back to my eq RHS
gotcha
ok this is beginning to make sense. i think i need to practice a few every day so that i'm not relearning every time i come back to it
the recursive ones get really tricky for me
so the process of the induction is this: let's call G(n) your guess for the sum, let g(n) be the nth term
pick any k (in this case I picked k = n-1) and show that this equality holds
G(k) + g(k+1) = G(k+1)
if you can show that you have showed that your guess is consistent with just adding the next term (they give the same result)
so if you can just prove that some starting point is true e.g. that G(0) has the right value, then by the property from earlier G(1) is also true, and in turn G(2) is true, and so on
yeah i don't get the logic of the starting point zero when the expression says 1 + 2 + ... + n, bc they aren't claiming in their proposition that 0 is included anywhere
or in their equation or however you would like to name it
not in their sequence nor could n be some higher number zero
but hey if you're allowed to do that, i like it bc its easier
my next one is:
use mathematical induction to prove the inequality n < 2^n for all positive integers n
base case of 1 checks out
ahh this one is very curious
they make use of the fact that 1 <= 2^k, and create a new expression which we know to be greater than (2^k) + 1
hmm i seem to have gotten snagged on proving the summation for the for n even positive integers. my proposition is that 2+4+6+...+n = n(n+1). so the base case checks for n=1, that is, the first even positive integer 2 is equal to 1*(1+1)
so next i set it to the form 2 + 4 + 6 + ... + 2k-2 = k(k+1)
right so far?
oooh
something just clicked with k(k+1)+2(k+1)
unless i am mistaken, this can be rewritten as (k+2)(k+1) by asking the question, how any (k+1) terms are there, and the answer is k+2 of them
is this correct
i don't really see how getting the form (k+2)(k+1) proves the summation of the first n even positive integers being equal to k*(k+1)
i guess its bc the (k+2) is the next term in the series, and thus it has the form of the original hypothesis where k+1 is the same and the next term is in the form k+2 bc its the next even integer
the starting point is arbitrary, though if you pick a base case at say b, then you have only proved the statement for b, b+1, ...
so you want to pick the earliest sensible thing
0 is sensible as the sum of none of the integers
yep, factor out (k+1) and you have (k+1)(k + 2)
oh yeah, this is a very loose bound, especially for larger n, saying 1<=2^k seems kinda silly, but it gets to the expression they want
in my notation from earlier you started with G(n) and proved that G(n) + g(n+1) = G(n+1), right?
G(n+1) is indeed (k+1)(k+2)
though do note that deriving the sum of even numbers can be done a lot easier, you're already familiar with the sum of 1 + 2 + 3 + ... + n = n(n + 1)/2, right?
the sum of even numbers is just double that :P
why would it be double if you are only adding half of the numbers in the sequence
you're omitting the odd ones
btw, I think this thing is wrong
2+4+6+...+n = n(n+1)
```I think it's supposed to be
```py
2+4+6+...+2n = n(n+1)
sum of the first n even numbers
lms what they have in the back of the book
you're correct it should be 2n
i've moved on to reading about greedy algorithms but perhaps i should practice another proof while you're around
yeah i feel like i should go back and practice the one from CLRS with this fresh practice
when evaluating this, i first look and say to myself what is lg(2^k), which must equal k, correct
the way to eval that is ask what must 2 be raised to to equal 2^k
not bc of the two inside but bc of the base, eg lg_2
yes that's correct according to wolfram
ok so then i have 2((2^k)(k)) + otherterm
so that must be (3^k)*2
but in the sltn they get k(2^(k+1))..
what gives
ah ok
2^1 * 2^k = 2^(k+1)
And now you still have a k there too.
wait back up
parentheses, remember
the last thing is grouped as:
2^k + 1 = (2^k) + 1
Without the ability to actually superscript and subscript we need those extra parens.
huh?
wait so what is (2^k) * k?
Just as is.
2 * 2^k * k = 2^(k+1) * k
2((2^k)(k)) You can change the parens here (associativity rule).
yeah you can't really combine those
maybe you're thinking of k^2 * k = k^3?
yes
but in my eq it now looks like 2((2^k) * k)
everything is just multiplied, you can drop all those parentheses you have there
oh so you're saying its all one term so i can multiply the (2^k) by 2 (which is 2^1)
yes
ok ok
within a term you can play around with products pretty freely
now i see where they get k2^(k+1)
a*b = b*a and whatnot
hmm so how do they go from
And 5 * 4 * 3 = (5 * 4) * 3 = 5 * (4 * 3), etc
2^(k+1) = 1 * 2^(k+1)
yeah i'm seeing what clicked before, where we have:
k two to the k plus ones
+
1 two to the k plus one
(k+1) two to the k plus ones
You can imagine the more simplified version of it by treating the 2^(k+1) as one thing.
Then it's just like kx + x = (k + 1)x
k apples + 1 apple :P
(You can even do the substitution in your work, temporary variable usage (let x = 2^(k+1)))
hmm ok. i'll get there
(But you don't have to)
yeah, substituting stuff can be useful to keep expressions simpler
especially when expressions grow larger
looks like it finished in this manner:
You can make up variables with "let ...". To help make it more simple for you.
so indeed you have your x lg x where x = 2^(k+1)
Yeah it's pattern matching in that case.
You know that n = ...
If you know how n and k are related you can show the steps to get to n.
prefix let or postfix where, both good options in mathematical text 😄
i've been trying to adopt algmyr's n-1 hypothesis, then show that it must be true for n, but i mix myself up sometimes
it's very arbitrary, I do whatever makes the target I want to reach the easiest
i'll get there. this is good practice
the n-1 case confuses me when like in this example, n = 2^k
Which route to choose from can be one of the hardest parts of math and also result in the most elegant or horrendous proofs.
although i guess it'd just be 2^(k-1)
yep. then you could show that if its assumed to be true for 2^(k-1), it must also be true for 2^k
i'm gonna practice a few of these per day before i get into the coding side
there is always the option to start from both ends and end up with the same messy but equal expression, I've done that a few times where some step wasn't obvious to me
Yeah connecting from both ends is really good. It's why having a good guess to what the end looks like is important, and comes from playing around with various simple cases.
alright, what do we think here ladies + gents, this is in line 5 of a recursive algo on activity selection. the function is to return the greatest number of activities which can be done in a given space on a given day
my input arrays are one with start times, and one with finish times
here is the code so far:
s = [1,3,0,5,3,5,6,8,8,2,12]
f = [4,5,6,7,9,9,10,11,12,14,16]
def recursiveactivityselector(start_array,finish_array,k,n):
m = k + 1
while m<=n and s[m] < f[n]:
m += 1
if m<=n:
return
the equation i posted above must go in on the line with a single return statement now
so activity 0 begins at 1 and ends at 4, act 1 begins at 3 and ends at 5, and so on
so the return statement is to return a set with a single activity in it, plus the union to the set found with the recursive call
my discrete math book says that the union of two sets is the set that contains those elements that are in either A or in B, or in both.
i have a more general question regarding runtime complexity
i've generated a cover file for my randomized quicksort, and added up all the computations it did, which is equal to 42,743 lines executed on an input of a single array of 1000 numbers
so i would expect that this number could be upper bounded by the runtime complexity O(nlgn)
however, nlgn of 1000 is 9,965
therefore the O doesn't hold. therefore i must be misunderstanding something about the nature of runtime complexity vs. lines of code executed
you're forgetting the constant factor part
ah ok. so there is some low constant factor like 5 that can upper bound it
something like that yeah
hmm
what if i multiplied the runtime complexity by the amount of time it actually took to run
lms
no theres no formula nor association there so far as i can tell
which i guess is the point of O in the first place
to remove the machine specifics
right, the time complexity isn't going to be evident off 1 measurement
i'm just playing around with how can i take what i've learned and apply it to a formal analysis of say, one algorithm vs. the same with the random aspect, or say two sorting algorithms compared. is there a call that will sum the lines of the cover file and return it so that i don't have to do it manually? i guess i could write a script to do that?
looks like all the integers i want are preceded by the same character, :, that'll be helpful
writing a script to sum the stuff is very simple yeah, though you run the risk of very misrepresentative data
e.g. these three sections have the same complexity
1: A = []
1001: for i in range(1000):
1000: A.append(i)
1002: A = [i for i in range(1000)]
1: A = list(range(1000))
hmm i'd like to pick your brain about that when i'm around
basic question:
category, company_name, company_url, job_cost, linkedin_url, phone_number, logo_url, site_url = process_results_page(article, session)
write_to_mysql(category, company_name, company_url, job_cost, linkedin_url, phone_number, logo_url, site_url, cursor, conn)
What is the best way to shorten this list of variables? My favorite way is using a dictionary of course, but with MySQL it seems not very practical, since I need to put them in a tuple for the MySQL query...
have you considered a named tuple maybe?
Yes, I'm investigating it now. I'm just not sure exactly how to use it in a way that doesn't create too long lines. Will I be using the asterisk unpacker?
sql = "\
INSERT INTO houzzscrap.houzz \
(category, company_name, company_url, linkedin_url, phone_number, \
job_cost, logo_url, site_url) \
\
VALUES (%s, %s, %s, %s, %s, %s, %s, %s)"
val = (category, company_name, company_url, linkedin_url, phone_number, job_cost, logo_url, site_url)```
(the inside of the mysql function)
you can unpack, yeah