#algos-and-data-structs
1 messages ยท Page 1 of 1 (latest)
so answer is what? O(1) or O(n+m) depending on input, and infinite worst case?
wait...the worst case is not n+m
consider where the frog randomly isolates part of the board so that the infection can't get there
then wouldn't stuff proceed until the frog randomly clears that part of the board?
actually, the frog logic is pretty weird... it updates the frog but does no recursion 
feels like the frogs do very little as the algo is currently written
yea prob i got it wrong, i need to check with the answers when i got them
this function was called from another function
but dont think it does anything
maybe the input will have effect
you know alog
is there a free source for learning algorithms and big o notation
for a beginner ?
yes, but what do you mean "for a beginner". do you mean a beginner to programming? or to algorithms
both
i would probably hold off on learning algorithms for now. you really won't be able to understand anything without any programming knowledge
hmm actually i finished cs50
i am not a complete beginner
i see. maybe try mit opencourseware?
ok thanks for the advice
is there a specific video series ?
MIT 6.006 Introduction to Algorithms, Spring 2020
Instructor: Jason Ku
View the complete course: https://ocw.mit.edu/6-006S20
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63EdVPNLG3ToM6LaEUuStEY
The goal of this introductions to algorithms class is to teach you to solve computation problems and communication that your solu...
6.001 i think is introductory python, 6.006 is introductory algorithms
i'll check the videos thanks for helping me
does this sort of notation work in python?
no, it's range notation from math
ok
but [l, r) is the kind of ranges python works with
with range, slicing and whatnot
meaning including l and not including r
right
i'm working on what squiggle was trying to tell me yesterday
seems to be a puzzle i cannot crack!
Did you try to find what p q and r would be for an array of length 4? You correctly decided that n1 and n2 should both be 2.
i think for an array of length 4 it works, splitting 2 and 2 for n1 and n2
p = 0, q = 1, r = 3
So given that n1 and n2 are suppose to be equal to 2. How can you combine your p and q to get n1, and your q and r to get n2?
so r is actually = 4 if i'm designating it as len(array) when array length = 4
i could always do len(array)-1 instead, but let me keep things the way they are for now
Yes. Just keep it as r=4 for now.
Which would make q = ?
q=(p+r)//2
oh right 2
Yeah, so now the goal is to get an n1 and n2 that are both 2 from those 3 values. And what are p, q and r? Well they are basically the start, mid, and endpoints of the list.
So what we are trying to do is get n1 = "the distance between the start and mid" and n2 = "the distance between the mid and end".
Which gives us a half-half split of the array.
well you cannot have the mid in both
Yes, but for now ignore that small detail.
The general idea is to get those two distances (which may need to be slightly adjusted).
So the question is now simplified to the question of "how do I get the distance between two points on a number line?"
for an array of length 4 it partitions incorrectly as p through q would be indices 0-2 and r would be 3
Right now we have p = 0, q = 2, and r = 4.
Three points on the number line, and we want the distance between p and q, and q and r.
+2
right
So how would I get the distance between p and q?
(Which is what n1 is suppose to be)
Or really any two points?
pointa-pointb
What is point a and what is point b in this case?
point(b)=q and point(a)=p
So the expression n1 = ? is?
that +1 shouldn't be there
Since q is point b and p is point a, your expression is n1 = p - q.
right
sry it should be point(b)-point(a)
like in their code q-p
r-q
Correct.
And you can check they both give you n1 = 2 and n2 = 2 for an array of length 4.
Which is the correct lengths for the splits.
So our formulas work for the specific case of an array of length 4.
Try another case that is similar, say an array of length 6.
What do you get for p, q, r, n1, and n2?
Check n1 again.
n1 + n2 must be the length of the array because they are splitting it up.
(Which is r)
.
.
that's what's written in the pseudocode
if we ignore the pseudocode, it wouldn't work for an array of length 4, that is, it wouldn't split the array properly
What would n1 and n2 be for an array of length 4?
n1 = q-p+1
n2 = r-q
Ignoring the pseudocode, please burn it.
well that's what's telling me how to compute q..
I meant for n1 and n2.
alright lms
We decided that n1 and n2 that they have does not work in Python.
So we ignore it and come up with how to compute n1 and n2.
Rather than trying to fix it by adjusting what they have (from scratch).
i mean it works for an array of size 4 and gives n1=2 and n2=2 but to go between indices 2 and 4 in the array doesn't work as the last index is 3
in this case, the array is unevenly split, bc array 0-2 comprises 3 elements
We are not dealing with indices yet. This is just the temporary array lengths.
Which are correct with an array of length 4.
Both arrays should be length 2.
ok so for array of length 5 n1=2 and n2=3
ignoring their way of computing and using n1=q-p and n2=r-q
wait
then r=6
q=3
then they both equal 3, which is too long
i really don't see how this is the source of the error, i've tried every iteration of +1, -1, no 1, len(array), len(array)-1, etc etc etc
q-p-1, q-p+1, r-q-1, r-q, r-q+1, etc
Notice how with this version, there is no +1 nor -1.
It's more simple, and I think you will see how it has a trickle down effect.
Which case?
n1 = q-p and n2 = r-q we just showed to improperly calculate the length of the subarrays
For which case? Which array length?
when len(array) = 5
You got what for n1 and n2?
3 and 3
Show your calculations.
how are you calculating q?
r=6 when len(array)=5
mergesort(a,0,len(a))
ok n1=2 and n2=3 for array length 5
So try a few other cases to convince yourself.
See if the formulas hold in general.
Note that with an odd length, you can't get a perfect split, one must be 1 longer than the other.
is that not why they wrote q-p+1??
That is if you are starting at 1 for p.
Note that if p = 1 instead of 0.
Then q-p+1 is the same as ours q-p.
This is one of the reasons why 1 indexing is annoying.
you know i've tried running it without the +1 correct
There are more bugs to come. It's the combination of it all.
When you change one thing like this, you need to change the rest.
But this version of q-p and r-q is the most simple, and Python likes it more as you will see.
It's the zero index version.
ok so for an array of len(7), it works, n1=3,n2=4
Try another even length case.
q=0 when len(a)=2
(p+r) // 2 = ? (p = 0, r = 2)
1
1 and 1
using r as the length of the array for which no index exists is really screwing w me
i'm looking at example arrays like n=5 = a=[a,b,c,d,e]
You can think of it like this. We go from p = 0 to q = 1, but not including q, and then we go from q = 1 to r = 2, not including r.
[p, q), [q, r)
If you go up to the length of the array, but not including it, you go up to the last index.
Since the last index is length - 1.
(in zero index start)
ok
Which is why with odd length you got n2 > n1.
Like with length 5.
[0, 2), [2, 5)
[0, 2) -> [0, 1], [2, 5) -> [2, 3, 4]
(up to, but not including the last)
(But including the first)
Why is this nice for Python? Because remember that range(n) is [0, n). range goes up to, but not including n, and starts at 0.
(Why? Because the last valid index is n - 1, not n (where n is the length of an array))
why did they go with zero indexing and not indices beginning at 1
i don't see how that's helpful, to think of the "zeroth element of some series"
It makes more sense is lower level programming languages. But it became convention from those. There are languages that don't, such as Lua, which starts at 1.
In general it makes more sense with how computers work.
Though it's arguably more nice in general. As you can see, you don't need the +1 for the n1 or n2 this way.
ok so now where do i go
The key here is that when converting pseudocode, it's often better to just know what they are trying to do in general (split the array by computing two lengths n1 and n2) and do that however you can best do it in the language.
Rather than trying to copy directly and then fix it.
i still think list slicing would have worked better. maybe it gets complicated when the recursion comes about
right. and i'm definitely spending too much time on this, it's not an assignment just in a way i wanted to go through and try to implement as much as possible from the book
So what is the next general steps they are doing in this pseudocode? Well they are making two copies of the the splits and adding an inf of the end of each.
So the question is, how do we copy part of an array?
Given the start and end, in the form of [start, end).
array = [a,b,c,d]
arr1 = array[0:1]
arr2 = array[2:3]
```?
Let's start with just the basic for range loop.
for i in range(n1):
left[i] = array[p+i]
lms what this is actually doing
so for i=0, the left array with index 0, it sets equal to the zeroth element of array
bc p=0 + i=0 = 0, array[0] is zeroth element or first element, so makes sense
so for element with index 1, (the second element) it goes and copies the element with index 1 from array into left
so its working
but you said all the way up to and not including the n1th element?
n1-1
With an array of length 4, what is n1?
2
So n1 - 1 = ?
1
So left is length 2 right?
yeah, zero and 1st elements
i=0, i=1
correct
What is inside the loop though?
a = [a,b,c,d]
when j = 0:
right = [c]
when j = 1
right = [c,d]
Do we have the index of c?
(The first index of the loop?)
a= [a, b, c, d]
What is p, q, r?
p=0, r=4, q=2
And what is a[q]?
c
What is a[q+0]?
c
What is a[q+i]?
c
So what should the for loop contents be?
We know it must involve right[i] = ... (or j, you can use i again too)
Try it out on length 4.
and length 5.
Both loops.
*Remember that we have [p, q), [q, r). So p is the start of the first sub-array (left), and q is the start of the other sub-array (right).
(The right starts right after the left ends)
p, q, and r are computed such that we have [p, q), and [q, r). "[" means included, ")" means excluded (up to but not including).
Since Python loops start at 0 and end up to, but not including, they match perfectly.
Which is why there is no +1 or -1 nonsense.
[p, q) = "Start on p, and go up to q, but not including q"
Since p and q are n1 apart in distance, adding i to p will start at p and end on 1 before q.
(because i is looping up to but not including n1)
well we're doing ourselves a disservice by making r = the final index length plus 1, when the other index (the beginning) is matching
It actually makes it more nice. Consider if we have [p, q] instead of [p, q).
We would have to mess around with +1s.
Let me put it this way.
Try looping over an array with a for loop. Some new array.
Using a range loop.
no i know what you're getting at, its better because then we can think of it as going up to and including the final element
Yeah because our indices for an array go [0, n).
Not [1, n].
In Python.
So it makes sense that our sub-arrays mirror this.
wait wait wait
oh ok
i thought you were saying the final index isn't included in the set S = {indices}
Yea it's included [.
It does not make sense to include n, because that is of course out of bounds in Python.
You can write it as [0, n) or [0, n - 1], same thing.
(Not having to deal with +1, -1 is more nice)
right
Which is why Python range goes up to. Because otherwise your code would be littered with -1 in every range loop.
(len(a) - 1)
(thought was put behind these language design decisions)
and thats a result of the zero indexing right
Yes.
ok
and also the idea that end - start should give the number of items in the range
And the choice of [,) over [,]
ok ok
(again without having to deal with +1, -1 stuff, which is error prone, "off by one errors" cause a lot of bugs in software worldwide (the design decisions run much deeper))
boy when we finally get to the bottom of this its gonna feel trimphant
even if i'm burning my precious time i should be running the maths ๐
i'm def getting better and reading and analyzing code if nothing else
If you can get past this mergesort in Python you should have no problem writing actual code.
From pseudocode.
Unless it's absurd.
When you know how Python wants you to do things you can copy the general things they are doing while ignoring their specific indexing and such.
well some things get more complicated, for example, a hash table and resolving collisions by chaining via doubly linked lists. thankfully i already know what all that means and i just have to practice implementing linked list classes ๐
Yeah, but you should be fine with that, this indexing stuff is the annoying stuff to translate.
Because it's error prone if you do it too literally (exactly? not sure which word).
honestly the maths are the hard part. inductive proofs and solving recurrences
literal interpretation is the correct way to describe what you would not want to do, yes
Yup, but you probably have to write some code right? Tends be a chunk of such tests.
yes that's going to be a part of it for sure
coding projects
which is why i'm learning how to time my algorithm runs
Yeah that will help a lot. A few things to take away from what we just did. We split the problem into sub problems (first, how do we compute p, q, r, then n1 and n2, then how to do the copies), we played around with simple cases (length 4, 6, 2, 5, 7) and then tried to find a general solution from that experimentation (general formulas for n1 and n2), and finally we did not try follow the pseudocode exactly / solve the problem head-on, instead we take a step back and try to see what it's trying to do in general at each part. These tips can apply to all of your programming and even more so to your math.
(Especially the part of playing around with simple cases (small length, less dimensions, less constraints, etc) (asking the more simple version of the same problem), then trying to generalize from those (this is a huge part of the essence of math))
Hi guys,
I can use the following (first) query to extract data from an API:
data['longName']
But, when I make a dictionary like this:
dictionary = {
"longName": ["data['longName']", 44, "lname", "LN"],
"shortName": ["data['shortName']", 45, "sname", "SN"]
...
}
and construct an 'algorithmic query' from the dictionary:
list(dictionary.values())[0][0]
because list(dictionary.values())[0][0] == "data['longName']" # True
But unfortunately I cannot use the algorithmic query to get data from the API because the latter is a string.
Is there a way to convert my 'algorithmic query' string to 'act like the first query'?
piggybacking off @fiery cosmos's question as I'm also working through CLRS mergeSort section, is this also a valid implementation?
from math import floor
def merge(arr1, arr2):
res = []
i = 0
j = 0
while i < len(arr1) and j < len(arr2):
if arr1[i] > arr2[j]:
res.append(arr2[j])
j += 1
else:
res.append(arr1[i])
i += 1
while i < len(arr1):
res.append(arr1[i])
i += 1
while j < len(arr2):
res.append(arr2[j])
j += 1
return res
def mergeSort(arr):
if len(arr) <= 1:
return arr
mid = floor(len(arr) / 2)
left = mergeSort(arr[0:mid])
right = mergeSort(arr[mid:])
return merge(left, right)
did you test it?
yup, works but I want to know whether it'd be accepted in a prod setting
ok i'll leave that for the experts but thank you for posting as it'll be helpful for me to compare with the pseudocode / my code
def SumEven(n):
if n == 0:
return 0
if n % 2 == 0:
return n + SumEven(n - 1)
else:
return SumEven(n - 1)
print(SumEven(9))
how does one draw a recursive call tree for this?
Hi there, Im new to this server, so I dont know if its channel for such qs, anyway -> what are python libs for data structures, algos dev?
Or writing algos in python = writing in CPython xd? Let me know as I dont know whats the approach in this lang
if it's even why would subtract 1 instead of 2 because you can be sure in subtracting 2 the next number is even
??? ```py
3 - 2 = 1
oh you mean in the n % 2 == 0 branch
they can just do (a:=a>>1) * (a + 1) though
i wrote this program according to this question
still conforms to the question I think
what, my question was how do i draw the recursive call tree with the given program?
fwiw the +1 nonsense has more to do with not using [l, r) ranges, you can make this sensible in 1 indexing as well, but they didn't
sorry yeah I don't have a clue about what a "recursive call tree" is, never heard of it
it would just be a boring line, the recursive thing you've written is basically a for loop
it's basically just a single formula
oh i see it's ok then
isn't a recursion call tree something like this?
I guess, I only drew the call arrows
should be easy enough for you to fill the rest in
sure i'll try first
yeah i'm unable to visualise the program on a recursion call tree :/
it's basically the same as the factorial one
as written it just has this alternating odd even behavior
nvm solved
hi , I'm completely new to python , but I'm an expert with c# - Unity
I don't work with terminal that much , I'm getting my path with each termenail run
how can I get rid of it ?
I just wanna see my code without the highlighted line
I delete the blue path , but the PS is still there ....
That's just the command to execute the code
thx I fix it , and I think I post my question in the wrong channel.
#editors-ides is probably the closest thing
I am facing difficulty while learning DSA in python
Can anyone give me a roadmap or some suggestions
Which topics of python we need to known well before learning DSA in python?
?@!!
do you know any python
anyone around today? working on my max subarray implementation. the max crossing subarray part of the algo appears to be working
there is no maxHeap in python right ? only minHeap and u change positive to negative number to do maxHeap
yeah
Yes
Is it possible to solve this task in this way
That we keep two heaps and two dictonaries for elements lower and higher than the median and when we need to find the element smaller or higher than the median we pop the element from the maxheap or minheap, but only if it is in the mindictionary or maxdictionary if not we pop the last element until we find an element that is in one of these dictionaries. And every time "the window moves right" we pop the leftmost item from dictionary and add the right (in the mindictionary and maxdicionary depeding if it is bigger or smaller than the median). I am very bad at explaining if you need further clarification please say so
a two heap solution should work, yeah
with some additional book-keeping since deleting arbitrary elements from a heap is hard/expensive
Can someone help me understand this
I thought it would be 5รvertice_cost+edgesรedge_cost
each thing on the left is a pointer, every block in the adjacency list is an index, a weight, and a pointer
left: n_vertices * pointer
right: n_edges * (index + weight + pointer)
this is only the case since the lists are linked lists, you can get rid of the pointer cost on the right by using some dynamic sized array
Yo peeps, where can I get started with algo and data structures stuff
check the pins
Is there a thing similiar to multiset(c++) in python, because I need it often
there is no bbst in python stdlib, sadly
What about multidict?
whatโs that
what about it
i received this question from Google phone.
book(10,20) -> True
book(20,30) -> True
book(5,15) -> False, overlap so we ignore
book(5,10) -> True
Basically make a booking reservation system. You have to code everything. And the 4 functions above are 4 separate functions.
brute force - I come up with a simple brute force book() checkAvailability()
binary search - also I think if you bookSmart() then the array is sorted, and checkAvailability() is pretty standard log(n) binarySearch
Hello guys, i have question is it very challenging to get job without cs degree, i am currently planning to learn python upto intermediate level and than learn DSA, i will appreciate if you show me some guidance and assistance.
yes, it's challenging
<@&831776746206265384>
the issue is keeping the thing sorted
easy in C++/Java with std::set/TreeSet, but python does not have such a structure
my friend said to use Interval Tree/ Segment Tree
idk too much about that, thought i should share
it's overkill for this application
Prims use (tree, unseen, fringe) word as representation.
Kruskal use forest(list) type of representation with priority_queue
What else is the difference?
If not, i stand against kruskal and prim to have algorithm named after them.
there are problems which can be solved using kruskal and not prim and vice versa
Prim's can be made to have a slightly better complexity using weirder priority queues, and Prim's requires the use of a priority queue where you can do key updates.
Kruskal's doesn't require a priority queue, you can just sort the list of edges, then you use a disjoint set union data structure to keep track of connectivity.
All in all the performance is similar, I prefer Kruskal's because it's so simple to implement
You mean prim can be made better when using priority queue with comparators when data isnt normal?
prim can be made to run in O(|E| + |V| log |V|)
Kruskal requires ordering all edges, so O(|E| log |E|) or O(|E| log |V|)
I guess Kruskal can be improved using bucket sort or similar in cases with limited edge weights
i thought, both are for Minimum Spanning Tree
prim's considers all the neighboring edges only (get the the smallest)
and kruskal's considers all edges globally (get the smallest)
which is why kruskal's is slightly slower right ?
hi all i need help in converting the pseudocode into this mathematical representation
Here you have to logically understand that first for loop will run n-1 times starting from i=1 to n-1. Then notice that, for every loop starting from i, 2nd loop will traverse through n-i elements. Thus the total work done/time taken summation over n-i, from i=1 to n-i.
can you help me to understand how to expand the summation? i understand why:
why does the i summation become (n^2-n)/2?
that is not true
wait nvm
it's an n in the sum
so yeah, just (n-1) copies of n
we have derived this fact before, remember?
n (n-1) / 2
I think we did the sum up to n which is (n+1) n / 2
but same formula just shifted over by 1
no unfortunately as soon as i don't look at this stuff for enough consecutive hours i immediately forget everything
i had to focus on the coding aspect
well, you can do induction to prove the result
hello everyone, im trying to solve thiss leetcode problem
this is my attempted solution, but i dont know why the results is not incrementing on each iteration loop
# Definition for a binary tree node.
# class TreeNode:
# def __init__(self, val=0, left=None, right=None):
# self.val = val
# self.left = left
# self.right = right
class Solution:
def dfs(self, node, sum, results, targetSum):
# check if node is null
# check if node is greater than targetSum
# update total sum
# check if sum equals target - if true update results counter
if node is None: return
if node.val < targetSum:
sum += node.val
if sum == targetSum:
results += 1
self.dfs(node.left, sum, results, targetSum)
self.dfs(node.right, sum, results, targetSum)
return results
def pathSum(self, root: Optional[TreeNode], targetSum: int) -> int:
sum = 0
results = 0
return self.dfs(root, sum, results, targetSum)
also how would i return results once the recursive function exits?
thanks
it's a fact that you really should remember, it pops up a bunch
it's also easy to derive
easy to derive a summation with different unknown variables in it???
S = 1 + 2 + ... + (n-1) + n
2S = 1 + 2 + ... + (n-1) + n
= n + (n-1) + ... + 2 + 1
= (n+1)*n
S = (n+1)*n/2
yes, my thing above is a well known derivation
I did the sum up to n, but this is the summation with i
just expanded
to get the sum to n-1 you can just exchange n -> n-1 in the formula
or you can easily do the same derivation I did
2S has n-1 pairs of terms, all summing to n
so S = (n-1)n/2
can somebody tell me what im doing wrong with my code. im trying to sovle this leetcode question. https://leetcode.com/problems/path-sum-iii/
# Definition for a binary tree node.
# class TreeNode:
# def __init__(self, val=0, left=None, right=None):
# self.val = val
# self.left = left
# self.right = right
class Solution:
# recursive function
def dfs(self, node, sum, count, targetSum):
# check if node is null - exit function
# check if node is greater than targetSum
# update total sum
# check if sum equals target - if true update count counter
if node is None: return
sum += node.val
if sum > targetSum:
sum = 0
if sum == targetSum:
count += 1
print(True)
sum = 0
self.dfs(node.left, sum, count, targetSum)
self.dfs(node.right, sum, count, targetSum)
return count
def pathSum(self, root: Optional[TreeNode], targetSum: int) -> int:
if root is None: return 0
sum = 0
count = 0
return self.dfs(root, sum, count, targetSum)
I've never heard of it
I know about Tarjan's algorithm for SCC
which seems way more common
i've never heard about tarjan's, but have of kosaraju lol
haha, wasnt expecting that
respect for ex goog, face restored
i thought it was comman
if you want to do it by induction, assume it's true for n - 1 and show that implies it's true for n:
Sum(n - 1) = (n - 1) * n / 2 =>
Sum(n) = Sum(n - 1) + n = (n - 1) * n / 2 + n =
n**2 / 2 - n / 2 + n = n ** 2 / 2 + n / 2 = (n + 1) * n / 2
which is what we wanted to show
establish a base and you're done
fwiw expanding isn't really making your life easier, compare
Sum(n) = Sum(n - 1) + n
= (n - 1) * n / 2 + n
= n**2 / 2 - n / 2 + n
= n ** 2 / 2 + n / 2
= (n + 1) * n / 2
```with
Sum(n) = Sum(n - 1) + n
= (n - 1) * n / 2 + n
= ((n - 1) * n + 2n) / 2
= (n + 1) * n / 2
what
just different ways of simplifying, expanding the product isn't the most straightforward way
why not
maybe it's me just preferring to avoid expanding it because factorizing in the more general case is hard, idk
in this case the factorization ends up trivial
i don't really see going from ((n - 1) * n + 2n) / 2 to (n + 1) * n / 2 without expanding, collecting like terms, and factoring out the n
how many n do you have?
(n-1) + 2 of them
it was just a note on your thing, nothing too relevant to them
I guess we already started with a common factor n that could be factored out immediately 
def Sequence(test):
if (test == 0):
return
else:
print(test, end=" ")
Sequence(test - 1)
print(test, end=" ")
return
# Driver Code
Sequence(3)
how do i write a recursion function for this:
Sequence(x)
Sequence(1)
-> 1 0 1
Sequence(2)
-> 2 1 0 1 2
Sequence(3)
-> 3 2 1 0 1 2 3
doesn't that basically work, other than not printing the middle number?
hi
ye i realised i just need to print a value 0 under the if statement
if I were to write this kind of thing I would write a generator function, but the structure is the same
def f(n):
if n == 0:
yield 0
return
yield n
yield from f(n - 1)
yield n
and then you can just do whatever you would do with an iterator, e.g.
print(*f(4))
ahh i see
How can I covert the below tuple into the array. My script has converted the array to tuple by adding brackets at the start and end.
(array[[0, 0, 0],
[1, 1, 1], dtype=int32])
the tuples are inmutable !
I don't understand your question
tuple = (array[[0, 0, 0],[1, 1, 1], dtype=int32])
this just gives me a syntax error
this is not the full thing is it?
it is according to solutions i found online. i'm also trying to understand this one:
can you give links?
yeah do you want the .pdf or the site
either
i guess its the same thing
that of course will not have the original problem, which i wrote in
4.3-2
and
4.3-6
so i got discouraged today because i thought let me figure out what log_2(n/2) is according to wolfram alpha, and it just changed the base which makes it more complicated
it should really be like, 2 to the what power gives you n/2, which should be a simple expression?
2^-1 gives 1/2?
so i guess it'd be 2^-n?
yeah thats correct
but anyway, these CLRS problems seem impossible rn. that's why i want to keep working them
log_2(n/2) = log_2(n) - 1 if that helps
i don't know whether it'll help or not ๐ญ
i don't think that's correct, 2^(-n) is just 1 / 2^n, which obviously isn't n / 2
it feels very silly to enter a super specific T(n) in the solution, if you were actually trying to show it's O(log(n)) you would reasonably assume that T(n) = c log(n)
and check if it works
T(n) = T(floor(n/2)) + 1
= c log(floor(n/2)) + 1
<= c log(n/2) + 1
= c log(n) - c + 1
<= c log(n) (if c >= 1)
totally guessing
are you referencing this problem?
picking T(n) = 3 log n - 1 feels super weird
yes
yeah my mistake, though doesn't CLRS mention that such rounding doesn't matter?
for asymptotics
yes but you have to take it into account when evaluating specific expressions
bc in these solutions they'll make substitutions given that like floor(n/2) <= 3n/4 or something similar
multiplying/dividing in a log can be split into different logs, log(a/b) = log a - log b
technically, he should have used like c1 or something, it's not the same c
i just got access to the voice chat idk if y'all ever do that but i'd love to have someone walk me through these problems via voice
shouldn't it be log(n) - log(c) then
log(n/2) = log(n)-log(2) = log(n) - 1
it's the same c
how?
oh I thought you were in base 10
c log(n/2) =
c (log(n) - log(2)) =
c (log(n) - 1) =
c log(n) - c
oh yeah, base 2
yeah, I'm used to lg ๐ silly me
they did write that, in the problem statement
the solution didn't use the same syntax
not that the log base matters too much in the grand scheme of things
right, yeah
the problem statement was:
show that the solution of T(n) = T(ceiling(n/2)) + 1 is O(lg(n))
it's true that it's c log n - c log 2 in general
@haughty mountain your solutions make much more sense than these which i have found online
i daresay i was able to follow what you did to solve this one
let me try this one:
without looking at their answers
that one seems trivial with the last result
basically assume the answer is in a form like T(n) <= c n lg n or maybe with a + d as well if it helps make things cleaner
and then see if there exist constants that makes it work
(and n>n0 stuff for the inequality to hold)
is this right so far:
2(c(floor(n/2)+17)lg(floor(n/2)+17))+n
looks about right, sure
ok great. continuing on
dang. i am so close to getting it in the form i want. i have a cn and a lg(n) but they're being added together and a +n in there. the lg(n) also has a 2 coefficient. i wonder where i went wrong
should i take this to the math server is this bugging y'all?
nah, it's fine
alright here's where i'm working from
and trying to get to O(nlg(n))
so when i pull apart that lg into lg(n)-lg(2), does it become +lg(n)-lg(2) or stay as a factor
i'll try keeping it as a factor i think that's where i goofed
its really difficult for me to see what needs to be distributed where
you should end up with something like
c n lg(n/2 + 17) + O(n)
i want to get it into the form of cnlg(n)
why did you wind up with a big O in there
dang i was feeling good about this one too
because the n and a messy term containing lg(n) are O(n)
I don't need such low terms, so I just stuff them into O(n) so I don't have to write them
2 (c(floor(n/2) + 17) lg(floor(n/2) + 17)) + n
<= 2 c(n/2 + 17) lg(n/2 + 17) + n
<= 2 c n/2 lg(n/2 + 17) + 2 c 17 lg(n/2 + 17)) + n
<= c n lg(n/2 + 17) + O(n)
is cnlg(n) + O(n) = O(nlg(n))?
yes
ok i recall reading something in CLRS about how to eval those i believe it was chapt 3
thanks for this im going to work through it rn
O(n) is smaller than O(n log n)
of course
so I can ignore single terms smaller than O(n log n)
what's remaining here is just to show that there exists an n0 such that for n>n0 such that
c n lg(n/2 + 17) + O(n) <= c n lg(n) + O(n)
I have a feeling someone following CLRS religiously would call me sloppy, and to do things properly you need to keep all those annoying terms around and play with constants to make things work precisely
In reality you can totally argue the way I'm doing here, though you need to be a bit careful about what you can and can't ignore. You can shoot yourself in the foot if you're being too sloppy
(e.g. a sum of k O(n) terms is not just O(n) if k depends on n)
ok
looking at that expression and playing around w numbers in my head i can totally see why the expression on left would be lesser, as your lg_2 of say 67 when n=100 will always be a smaller number than your lg_2 of 100 on the right
i think there is an extra 17 in here
i see you went to distribute the 2c to both terms
but then for a second term you get:
2 c 17 lg(n/2 + 17))
I wanted to isolate the part we actually care about
isn't there a way to get it almost entirely into the form we want by pulling apart lg(n/2) into lg(n)-lg(2) = lg(n) - 1, then have cn*lg(n) - 16 + O(n)
that whole first term of course being the exact form of the answer
this is my work for the recurrence 4.3, T(n) = theta(1) if n=1, and when n>1, T(ceiling(n/2))+T(floor(n/2))+theta(n)
can you all follow what i did?
you have an extra factor of c out of nowhere
between the 5th and 4th line from the bottom
also, do write out the relations between the lines, if it's = to the previous thing, or <=, or whatever
oh youre right i shouldn't have distributed that
so given that i'm at
2cnlg(n)-1+theta(n) how do i justify that is = theta(nlg(n))
the solutions they have for problem 2-1 are quite intuitive if you have the knowledge of the mergesort recursion tree depth and merging time on your mind
you also have a 2 that should have cancelled
yes you're correct, for the same reason i thought i should have distributed the c
so the answer is simply that cnlgn - 1 + theta(n) can be upper and lower bounded by two constants c1 and c2 and another function g(n) such that c1(g(nlgn) <= cnlgn-1+theta(n) <= c2g(nlgn) for all n greater than n0? am i saying this correctly?
in that vein, you didn't distribute a factor c when separating the lg(2)
ok
so you would end up with
c n lg n - c + O(n)
```or even
c n lg n + O(n)
if you want to be pedantic you would keep all the terms I put into the O(n), then you can try to find values for those constants
(not that that's particularly interesting)
the O(n) term on the right has some n0_right on its own, but whatever n0_left you have for the left term, you can just pick the larger of the two n0 = max(n0_left, n0_right) and everything works
the exact n0 isn't really important, what's important is that such an n0 exists
right but is my statement correct
does anynone have leetcode premium that could do a homie a solid and share solutions with me
not I
๐ญ
I just want to look at the solutions and see their video walkthroughs
basically, other than the expression you wrote having some mistakes with a missing c and whatnot
ok. at least i am getting somewhere. in the solution to 2-1 of CLRS, they use the reverse log(a/b) rule, that is, they have log(n)-log(k) and combine to show that the depth of a tree is log(n/k)
its incredible inhabiting this world in a way that i would never think
its like a whole different universe
I guess what you really have is
T(n) <= c n lg n + O(n)
```we can expand the definition of O(n)
T(n) <= c n lg n + C n (for n >= n0)
I honestly don't care too much about finding some concrete constants, I just care whether they exist. CLRS seems to care about the constants for some reason
yes they even calculate them in a number of examples
but to their credit they do make the point that the constants specifics do not matter and that what matters is simply that there are any such constants
how can i fix this list index out of range error:
def bubblesort(array):
for i in range(1,len(array)):
for j in range(len(array),1,-1):
if array[j] < array[j-1]:
array[j], array[j-1] = array[j-1], array[j]
return array
a=[3,6,5,7,8,1,0,2,4,1,7,4,8,14,52,63,78,90,1,22,34,44,52,62,11,23]
bubblesort(a)
the j range starts at len(array) which is oob
for your own sanity, use reversed(range(...))
ahh
range(n-1, -1, -1) is so much harder to read and understand than reversed(range(0, n))
because in the former you end up having a range that is (l, r] which is...weird
looks like python wants : for reversed* syntax
its not sorting lol
def bubblesort(array):
for i in range(1,len(array)):
for j in reversed(range(1,len(array)-1)):
if array[j] < array[j-1]:
array[j], array[j-1] = array[j-1], array[j]
return array
a=[3,6,5,7,8,1,0,2,4,1,7,4,8,14,52,63,78,90,1,22,34,44,52,62,11,23]
print(bubblesort(a))
nvm it is. my return statement was nested too far into the program
so bubblesort can sort 1000 randomly generated numbers btwn 1 and 10k in under 0.2s
kind of cool for an inefficient algorithm
but indeed, it's quite slow
it gets very slow when i try an array of length 10k
i need to get back to working with these summations, that seems to be a weak point for me
i finally learned what monotonically increasing means
im guessing that bubblesort has n^2 runtime?
ah, i forgot about the array sorting algorithms cheatsheet
well, don't look at the cheatsheet. try and figure it out from your code
yes, it's theta and O of n^2 in average and worst case, and omega(n) in best
right i just meant like i can look and confirm my hypothesis without bothering you all
it also robs you of practice you probably need
i suspected n^2 due to the nested loops, but as someone has pointed out, not every algorithm with nested loop structures have n^2 runtime, for example most recursive algos
just count the number of iterations
i wonder if i'll have to describe its activity precisely like using the summations and whatnot ๐ค
n-1 iterations of the outer loop, each of those has a loop with n-1 iterations
(n-1)*(n-1) = O(nยฒ)
right right
you can actually make the inner loop be reversed(range(i, len(array))
which makes it a tad faster
interesting
how would the time complexity analysis change?
lms
well if you're saying it changes the complexity, that implies that the inner loop would inherit or make use of the same i from the outer loop
it's why it's called bubble sort, in the first iteration the min value has "bubbled" to the bottom
don't bubbles rise
well, yes
often the algo is written so that the max element bubbles to the end/top
I didn't say it changes the complexity, it changes the analysis a bit
i'm having trouble reading the inner loop when i=1
bc its supposed to start at len(array) or len(array)-1
it starts at len(array)-1 still
it ends at i though, rather than 1
There is still n-1 iterations of the outer loop, but the lengths of the inner loop changes
the first inner loop is n-1 iterations
the second one is n-2
...
n-1 + n-2 + ... + 2 + 1
hmm its hard for me to picture that
how so?
range(1, n) is n-1 long
range(2, n) is n-2 long
right?
range(i, n) just has a lower bound that changes, it has length n-i
are you talking about the inner or outer loop
you're saying this is how they increment when the outer loop is steadily increasing, the inner loop thus becomes smaller and smaller?
i'm gonna plug into python tutor and see what its doing
yes, the inner loop becomes shorter and shorter
it stops earlier and earlier
seems like it doesn't matter what value i has after it goes into the inner loop, what matters is that the inner loop is done once for each i in the outer range statement
i immediately takes on the new value of the last index
i made it sort a very simple input of 12 numbers and it takes 324 lines to do so
well i'm seeing how nested loops work, so this is good
btw, my prof said the timing function im using is no good, but that i need to write into the code how many times a certain line is accessed or how many times functions are used.. any easy way to do this? should i learn the debugging function of vscode?
you can imagine unnesting the loop
for i in range(1,len(array)):
for j in reversed(range(i, len(array)-1)): ...
for j in reversed(range(1, len(array)-1)): ...
for j in reversed(range(2, len(array)-1)): ...
for j in reversed(range(3, len(array)-1)): ...
...
for j in reversed(range(len(array)-1, len(array)-1)): ...
all in all using i as the lower bound should roughly cut the number of total iterations in half
thats interesting that its taking essentially the same amt of time
longer even. i guess my prof wasnt lying when she said the timing function is no good as it has too much to do with background processes
how are you timing it?
start=time.time()
print(bubblesort(a))
print(time.time()-start)
its certainly not if im controlling the computing environment. len(a) is 1000 random integers between 1 and 10k
you're running it on your computer, right?
along with other apps/stuff?
yes, let me close other apps and see how it changes
is this running via my cpu or RAM
you could try this for a smaller n
python -m trace --trace your_file.py
..both?
maybe with --count for a larger one, it produces a "cover" file
yeah the extra time added after the change was just me opening firefox, its essentially taking the same time, 0.18s
0.1840
You have to run algorithms multiple times and average.
hm ok
use timeit, even better, use ipython's %timeit
0.1930
timeit does the multiple runs averaged.
it says i cannot use CPU or clock time, so, are these suggestions adhering to this rule?
ah, can you send your code. i think i know the problem
what problem?
why it's taking the same amount of time
this is from the programming assignment guidelines
oh sure 1 sec
import random
import time
def bubblesort(array):
for i in range(1,len(array)):
for i in reversed(range(1,len(array))):
if array[i] < array[i-1]:
array[i], array[i-1] = array[i-1], array[i]
return array
a=[]
for x in range(1000):
x = random.randint(1,10000)
a.append(x)
print("the list before sorting is: "+'\n'*2 +str(a)+'\n')
start=time.time()
print(bubblesort(a))
print(time.time()-start)
how do i see current working directory in python
nvm ill google
you're using i in both loops?
Are you trying to count number of iterations or actual time taken?
that was @haughty mountain recommended change
no it wasn't, he wanted you to change the bounds of the range, not the loop variable
o
.
this
ahhh i see the distinction
1s
btw this code has a bug, the inner range should be len(array), without the -1
import random
import time
def measure(n, sort_fun):
A = list(range(n))
random.seed(42)
random.shuffle(A)
start = time.perf_counter()
sort_fun(A)
end = time.perf_counter()
print(f'{sort_fun.__name__}: {end - start}s')
assert A == sorted(A)
def bubblesort1(array):
for i in range(1,len(array)):
for j in reversed(range(1, len(array))):
if array[j] < array[j-1]:
array[j], array[j-1] = array[j-1], array[j]
return array
def bubblesort2(array):
for i in range(1,len(array)):
for j in reversed(range(i, len(array))):
if array[j] < array[j-1]:
array[j], array[j-1] = array[j-1], array[j]
return array
n = 2000
measure(n, bubblesort1)
measure(n, bubblesort2)
(also, the return array is pretty worthless)
bubblesort1: 0.7679922000097577s
bubblesort2: 0.49232970000593923s
bubblesort1: 0.7743299000139814s
bubblesort2: 0.5124772999843117s
i can definitely hear my cpu fans whirring up when i run it haha
do note that your L is sorted after the first call
that little guy is working hard
๐ i did not see this
alright y'all this has been fun, gotta go eat and whatnot, i'll be back tomorrow or later. thank you everyone
i see why you set the seed now
seed + generate a new list every time, yes :P
but that would take way longer ๐
remember i need to find a way to print each line of code of mine in the way that python tutor does it
at least you're not benchmarking timsort which would be O(n) after the first call
@fiery cosmos
oh yeah thats why i was asking about the python working directory
im not good with knowing what is pointing where. i did just add python path to my desktop
oh wait
can i just go and type there right in the terminal tab of vscode?
yeah, probably
oh wow
~ $ cat a.py
n = 4
for i in range(n):
for j in range(i, n):
pass
~ $ python -m trace --trace a.py
--- modulename: a, funcname: <module>
a.py(1): n = 4
a.py(3): for i in range(n):
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(3): for i in range(n):
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(3): for i in range(n):
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(3): for i in range(n):
a.py(4): for j in range(i, n):
a.py(5): pass
a.py(4): for j in range(i, n):
a.py(3): for i in range(n):
Cannot run file 'bubblesort.py' because: [Errno 2] No such file or directory: 'bubblesort.py'
๐ญ
In [29]: %timeit bubblesort1((random.seed(69420), random.choices(range(10000), k=10000))[1])
11.6 s ยฑ 681 ms per loop (mean ยฑ std. dev. of 7 runs, 1 loop each)
In [30]: %timeit bubblesort2((random.seed(69420), random.choices(range(10000), k=10000))[1])
7.4 s ยฑ 237 ms per loop (mean ยฑ std. dev. of 7 runs, 1 loop each)
``` tada
~ $ cat a.py
n = 1000
for i in range(n):
for j in range(i, n):
pass
~ $ python -m trace --count a.py
~ $ cat a.cover
1: n = 1000
1001: for i in range(n):
501500: for j in range(i, n):
500500: pass
shoot it looks like i have one more folder to get into in the terminal window, i know cd is the command on linux for change directory, what is it in windows powershell?
cd
eh its not going, probably something to do with these weird .json file im working in
ill try it outside of vscode
I didn't know about the trace module, pretty nifty
rrrr i never know where python is pointing at
can i navigate to where i want to be first then use the python command
lol, I just live in a linux terminal all the time, no big surprises with directories and whatnot
generally that's a good idea, yeah
you could also use the absolute path to your python file
whatever is more convenient
ok so i am in the correct folder, i call python, in python i'm using '-m trace --trace bubblesort.py'
-m : The term '-m' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the
spelling of the name, or if a path was included, verify that the path is correct and try again.
i need to start working at or near the root to make things simplistic
you need the full command, python -m ...
right, the -m ... are arguments to the python program
not something you run in the repl
-m tells python that you want to run a python module (built-in or installed), the module being run is trace and then you give trace the optional argument of --trace (the action to be done) and then the file to do the trace on so: python -m trace --trace bubblesort.py
ah ok. yeah i'm vaguely familiar with flags. got it running but it just printed out the same thing over and over looks like
did you run it for a large list or something?
i incremented the array down from 1k to 100
let me try it for array of length 10
is this showing me which line was called
/run
You probably want --count instead of --trace, --trace is nice for small input size to see how the algorithm works.
Trace gives you the steps (imagine going through it line by line by hand, it does that with --trace).
yeah, count is more interesting for the larger cases
trace might help you get a sense of what the program does
the latter you'll learn to do in your head pretty quickly
just open it in your favorite text editor 
okok
oooooooooooooh
this is how they were counting lines
perfect
this is excellent you all thank you so so much
Or from the console: https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/type
now i can focus on the mathematics 
tysm all
np
I have a function f to test a parameter t, and I want to find the highest f(t) score
t can have any value in a certain range, for example between 1-10 it can get 0.00001, 5.02102, 10, etc
f(t) is like an upside down parabola - which means that if f(t1) > f(t2) and t1 > t2 then I don't need to check any number under t2 anymore, and vice versa
how would I find the ideal t?
I was thinking about doing something like a binary search
it sounds like what you want is gradient descent
of course if it's actually a parabola you should just solve for it
f(t) is like an upside down parabola - which means that if f(t1) > f(t2) and t1 > t2 then I don't need to check any number under t2 anymore, and vice versa
I was thinking about doing something like a binary search
It's indeed an existing algorithm - it's called ternary search.
Binary search gets you the root of a function, given an interval on which there's exactly one root.
Ternary search gets you the local minimum/maximum of a function, given an interval on which there's only one such.
Why do I get a runtime error with this code, does anyone have a solution that passes?
INF = 10**18+7
x, n = list(map(int, input().split()))
lst = list(map(int, input().split()))
lst.append(0)
table = [[INF]*(n+1) for i in range(x+1)]
for i in range(x+1):
table[i][0] = 0
for i in range(1, x+1):
for j in range(1, n+1):
table[i][j] = table[i-1][j]
if j-lst[i-1]>=0:
table[i][j] = min(table[i][j], table[i][j-lst[i-1]]+1)
print(table[-1][-1] if table[-1][-1]!=INF else -1)
this task: https://cses.fi/problemset/task/1634/
hi all i'd like to revisit this for a sec, i see why
becomes n(n-1)
and that they've simply split the expression into summation(n) - summation(i) terms
but why then does the summation i term become n^2-n/2. why is n substituted in?
.latex $\sum_{i=1}^{n-1} i = \frac{(n-1) n}{2}$
it can be proven formally by induction, or there's an more intuitive proof by noticing that you can pair up the terms:
1 + 2 + 3 + 4 + ... + n-1
=
(1 + n-1) + (2 + n-2) + (3 + n-3) + ...
=
(n) + (n) + (n) + ... (total of (n-1)/2 terms)
= n * (n-1/2)
(This intuitive proof is not quite full because it only works for odd n, so that the number of terms is even and you can pair them all up - but you can examine the even-n case in which there's a single unpaired element in the middle, and discover the formula works then as well)
yes i've seen this before, i think what i'm failing to grasp is how the summation with the n and the summation with the i terms differ
The summation with n is literally just n + n + ... + n, total of n-1 terms, hence n*(n-1).
Whereas the other one is 1 + 2 + ... + n-1.
ok, thank you
how do you know when you've hit n-1
when evaluating the i term
i guess you have to know n
so it's i + (i+1) + (i+2) + ... + until i = (n-1)
is that it?
i starts from 1 and goes up to n-1 inclusive.
right
n must always be known then, like the size of an input?
or length of an array.. etc
I mean, this is algebra, n is just some variable.
you must know n though to know when you've hit n-1?
Not sure what you mean
I'd say you add the i terms as i goes from 1 to n-1, but alright, sure
so then you must know when i = n-1 to know what the last term added is?
or do you simply write i + i+1 + i+2 ... + (n-1)?
As you can see above, you don't need to use any concrete value of n to obtain an expression for the sum that works for any n.
furthermore, is the index of summation shown below the same as the starting i on the right or can they be different? is the index the starting value or the step increment
Below the sum symbol is the starting value for the summation variable, above the last value, to the right is the expression which can and usually does depend on the summation variable.
It's like a sum(i for i in range(1, n)).
so the expression on the right i is not the same as the i below the summation
Not sure what you mean by that. The expression on the right can be anything, usually some function that depends on i. In this case, it's just i itself.
it could have been, say, 3 i^2
but it starts with the value below.. the index?
or does it increment by that amount
i starts with the value below the index, yes. The step size is always 1 (there's no common way to specify a different step size, at least I haven't seen one. Sometimes you see something like i <triple dot> 2 (i is divisible by 2)).
.latex $\sum_{i=1}^{n} 3i^2 = 3 + 12 + 27 + \dots + 3n^2$
maybe this^ makes it a bit more clear
so if the index below were 5, you'd just start at 5 but still increment your expression by 5+1 + 5+2 +..
sure
the other day user andrewn mentioned that there is no such data structure as a max heap in python, what is meant by that
can one not program any data structure they like?
or did he mean perhaps its not a part of the standard libraries
The latter.
which means you'd have to use the min heap in heapq and just, well, negate the elements
if your keys aren't numeric? oh well
although you could make a cmp function and use functools.cmp_to_key or something?
wrapper class that inverts comparison
that's sad ๐
rust has a helper builtin for that, funnily enough: https://doc.rust-lang.org/std/cmp/struct.Reverse.html
A helper struct for reverse ordering.
strange there's not something like that in python's functools or whatever
new feature? ๐
I mean, python ditched the cmp function that existed in e.g. the sort methods and only has the key function
it's a bit sad, since some stuff is easier to express as a comparison function
"Heaps are binary trees for which every parent node has a value less than or equal to any of its children."
CLRS would like a word with you
they're describing a min heap
oh i know
so CLRS says implement heaps as max heaps and use max heaps to implement priority queues. given that heaps in python are min heaps, how would the min heap as a priority queue differ from a max heap priority queue
well, think about it
i mean other than the obvious that the lowest key event is simulated first
that's the only difference
sweet
do people in practice use min heaps to create and use priority queues in python?
no, there's a pqueue built in
heapq
and queue.PriorityQueue
yeah, it will be slower than heapq for single threaded
so just, use the builtin heapq when you need to implement a priority queue?
basically, yeah
you can wrap it in a class if you want a nicer interface
especially if you do a max heap
why would one create a max heap when there is an inbuilt minheap in python
also i recently came across generators too, need to look into those
I implemented very simple wrappers for someone's problem a while back
because I want to get the maximum rather than the minimum?
and I'm not talking implementing your own thing, just wrap heapq in a nice class
if you want a max heap you can do the trick of negating the key inside the class
oh i was thinking about using them as priority queues, forgot their actual function ๐ฅ
thanks for sharing that code that will be super helpful for me to play around with to understand how it works
I mean, what does the priority queue prioritize?
max or min are both things you might want
i recently also found the original paper for self organizing (AVL) binary trees if anyone is interested
straight from 1962
ngl, I've never bothered to learn the details of bbsts
___ binary search trees
balanced
ty
I know what they do of course, but not the details of avl trees, red-black trees, btrees, ...
red-black trees are really cool. highly recommend
yeah i have to learn red black trees for algos
so it seems like a heap is just a mostly filled out bst except it stores the min or max at the root
whereas a sorted BST would have, the median?
no, the nodes have no real ordering in a heap, except that the root is the max. the nodes only satisfy the heap property
the heap property being that the node is greater than every element in both subtrees
yeah
if there is no real ordering, how is it helpful in implementing a queue
because you can get the max element quickly
hmm ok
the only ordering is that (in a max heap) all nodes are the max in their subtrees
so regardless which node you pick in the heap, that property is true
you can construct a BST where this isn't true quite easily, try it.
and then it's just down to maintaining that property during operations
right that's what gives rise to the heapify operation
no i mean like, in theory what is it supposed to be / what do we want it to be
heapify is the initial building of the heap
we want it to be the median
fixing the heap after operations is more efficient
you just call heapify again on the node that may violate the heap property
ok
heapify is an operation on a whole array the way I learned it
and expensive, O(n)
in CLRS they use it recursively to "fix up" the heap and restore it to every node having the heap property
it should only be log n, i think
alright dumb syntax questions here:
O(1) is constant time, O(n) = linear, O(logn) = logarithmic with respect to n, O(n^2) is quadratic
?
right
!d heapq
Source code: Lib/heapq.py
This module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm.
Heaps are binary trees for which every parent node has a value less than or equal to any of its children. This implementation uses arrays for which heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] for all k, counting elements from zero. For the sake of comparison, non-existing elements are considered to be infinite. The interesting property of a heap is that its smallest element is always the root, heap[0].
ok not so helpful
how would you describe (nlogn) runtime? (in english) and why is n called linear, simply that the time taken can be plotted and varies linearly with the input, i guess
log linear
!d heapq.heapify
heapq.heapify(x)```
Transform list *x* into a heap, in-place, in linear time.
this is what I think of when I hear heapify
yes that's what they use initially to build the heap. they also use it recursively to restore nodes that may violate the heap property
wrt runtime:
which you can easily eval w the master's theorem
i don't know why the children's subtrees would each have size at most 2n/3, maybe if i sketch it out i'll see
the python source calls the rebalancing "sift-up" https://github.com/python/cpython/blob/3.10/Lib/heapq.py#L135-L143
Lib/heapq.py lines 135 to 143
def heappop(heap):
"""Pop the smallest item off the heap, maintaining the heap invariant."""
lastelt = heap.pop() # raises appropriate IndexError if heap is empty
if heap:
returnitem = heap[0]
heap[0] = lastelt
_siftup(heap, 0)
return returnitem
return lastelt```
hm looks like siftup is a function, what does the preceding _ do or mean
as in _siftup( )
just a function that is not meant for the user to touch
that reminds me i've been meaning to ask
but I think it's just differing terminology wrt what "heapify" means
how to people actually package their python into an .exe for others to run on their computer who may not have python
painfully, from the issues I've seen people have
I live in a nice linux world where I can just use your code, python is readily available :P
i mean the whole point of this is to be able to write software right?
i guess not, it could just be more maintaining databases and whatnot
this is when people start to talk about a stack and python being in the backend i guess
idk none of that is important for me rn i'm just curious about how executable files come to be
putting python in exes is a pretty hacky thing to do
you end up bundling a python runtime in the exe
so then most exes are written in?
executable files are a lot more natural for compiled languages
like java?
like, here is the program code and whatnot, please execute it
more C/C++ and the like
java still requires a runtime
much like python
you mean a runtime environment
right
python is used for a lot of stuff
nowadays when java is used for apps, you get a tiny java runtime that you send with your app's code, much like you would do with python
ooooohhhhhhh
you just explained times in my childhood when i'd be sitting installing a software off of a CD and the little coffee mug would come up
huh? is the time of "please run this java installer" over?
i hope so
that runtime is pretty chunky
better than making people that have no idea what to do install java ยฏ_(ใ)_/ยฏ
@haughty mountain maybe when all this is over i'll be able to understand your thesis
that reminds me i need to get my randomized quicksort up and running
having a hard time focusing today
It would not surprise me if it's just so the employees stop downloading viruses.
consider if you are in the root and the last level is half full, the left subtree has something like 2^d-1 nodes, the right has something like 2^(d-1) - 1 nodes
is d tree height
basically, I might be off by one
ok
*If you "compile" a Python program to an executable with something like Pyinstaller it's including Python with it to run the scripts. You can put multiple things in an executable including script source code, bytecode, images, etc (some have unpacking code that "installs" the files into separate files / locations).
(Executable file formats are more than just the machine code in plain form)
how does the 2n/3 come in then