#algos-and-data-structs
1 messages · Page 7 of 1
wait i think i realised
~n/2 leaves, 1st leaf has depth 1, 2nd leaf has depth 2 ect. => 1 + 2 + 3 +...+ (n/2 -1) (this would be sum of each of their depth) ?
ah ok got it now thanks for the help
a lot of stuff about max heaps and heapsort in my course.. i hope they aren't going to make us reinvent the wheel..
heapsort is trivial if you have a heap
(I never bothered to learn to implement a heap)
I know the ideas that goes into it and what proeries it has though, which probably are the more important parts
what i mean is that i hope they don't actually expect us to manually implement a heap using all their specialized operations when we could just use heapq
they have a different algorithm for every operation
the good thing is that i checked the documentation and their idea of children nodes has essentially the same indexing
the indexing of binary trees stored in a list is quite common
with 1 indexing the left child is 2*n, the right child is 2*n+1
above: python documentation
below, content from course
the complexity analysis is going to get wonky though when they have specifics for every little operation and i'm not seeing that bc its on the "back end"
does anyone know if list_a[::-1] is less performant than list_a.reverse()?
they don't do the same thing, list.reverse modifies the list, [::-1] creates a new list
When should I use a set instead of a list?
if you care about not having duplicates and/or want fast contains checks
How would I write an efficient (directional) graph implementation? Currently I have a list of nodes and those each have their own list of incoming nodes, but that has very poor runtime performance. Ideally this can all be represented as a matrix or a single array of some sort, but I have no clue how to do so, especially when the graph can change frequenty, with only an upper bound on vertices, though an upper bound on edges should be somewhat realistic to come up with
So I guess if what you want is to actually reverse said list, you should use list.reverse and not list[::-1] right? Cause last option would create a new list and the assign instead of just making use of the list itself?
im getting confused about reconciling the costs between the algorithms in the book and the ones i implement in python, given that i'm going to represent things differently and/or use inbuilt algorithms such as heapq
for me it's also a pain, but tbh there's no other option rather than checking what python does and understanding it to make a better approximation. That's my take at least
yeah i mean one can look up the time complexity of whatever they are doing using the built in functions, however i wonder if that isn't making thing more complicated rather than trying to implement what is in the book and therefore having the complexity analysis explained in a distinct way
drawing up my own time complexity from scratch 😵💫
are there any good libraries for printing your data structure as it exists like a BST print or something
you could always implement a traversal function that does that. would be a good exercise too
Hello - would someone be able to help me?
Basically what I want to do is I want to pull App URLs into an array from this website: https://github.com/bhagyas/app-urls/blob/master/README.adoc
A long list of App URLs for iOS, macOS and Android - app-urls/README.adoc at master · bhagyas/app-urls
Would anyone be able to help me do that? I need the table named APP URLs into an array but just the links - no additional text.
I know a little about apt-get but not enough to do this. Thanks!
i have an inorder traversal working, i meant like a visual output of some sort such as a .jpg. know people play around with console outputs but that seems like a lot of time for little or no gain
hey all, how do i evaluate this to get O(n^2)
Perhaps this will help:
https://www.wolframalpha.com/input?i=sum+from+k%3D0+to+n%2C+k
so you just memorize that any summation from 0 through n is equal to n*(n+1)/2 and assume n^2 time when you see it?
yeah, if it helps, when summing stuff, let's say from 1 to 10, instead of doing 1+2...+10, you could just group together like:
1+10 + 2+9 + 3+8 ....
so you end up having 11 + 11 + 11 +...
now.. how many times is 11 repeated? , in this case 5 times.
so, you could also express this like 10 * (10 + 1) / 2
or, if you think of 10 like n... n*(n+1)/2
that's what works for me instead of memorizing it
yeah no im pretty familiar with summations at this point i just wanted to know how to solve it in that specific syntax
Complexity isn't 'syntax'
that specific mathematical notation
But that summation is an arithmetic progression, you can find out more about them
oh ive been studying them for quite some time, i just wanted to know how to look at that and yield the answer, given that there is no other n in the expression to multiply by
but shouldn't it be:
O(n*(n+1)/2) -> O(n^2/2 + n/2) -> O(n^2)
maybe i'm getting your question wrong
not sure what you mean
the other n you asked for comes from the summation, which we know is equal to O(n*(n+1)/2)
If you expand said expression, you'll end up with O(n^2).
But I'm not sure you're asking for this or what
i would think it'd be equal to k^2 looking at that, if anything
Read up on arithmetic progression.
k is only the variable you use to substitute all the way until you get to the last num, from 0 to n-1.
If you expand the sum there is no k. It's just the current index.
.latex $$\sum_{i=1}^{N}{i} = 1+2+3+4+5+...+N$$
There is no i in the expansion, you can imagine an i being below each of the values 1, 2, 3, ...
so then one can essentially just plug in n(n-1)/2 to solve the k*O(1) term
You can algebraically manipulate the expression to match that summation pattern exactly and then replace that sum with N(N-1)/2.
Does not yet exactly match it, it has more going on than just k in the sum.
so like (c_1(1) + (k*c_2(1)) type thing
where c_1 is c sub 1
im just expanding those O terms
im not gonna get it unless i see how its manipulated. thanks for help
you're essentially doing a O(1) operation n times, thus O(n) there
the problem im facing now is i have a python implementation of a linked list and may have to implement one for my course but i cannot use this code exactly
so im trying to just read through it and get the essence, so i can write my own
right but the summation above doesn't say that, it is the summation of an expression involving multiple different asymptotic notations and a changing integer k
i can see how its n(n+1)/2, but not how to algebraically manipulate it to get that form
.latex $$\sum_{i=1}^{N}{1 + 1} = ?$$
2 + 3 + 4 + ... + N
this is just a very convoluted way to say 2*N
(1+1) + (1+1) + (1+1) + ... + (1+1) = 2 + 2 + 2 + ... + 2
There is no N at the end.
Every value is the same.
The index changes, but not the value, and the values are what is added up.
right yeah..
this is so sad we've been over this countless times and it doesnt click
anyhow
"Values added, index changes and values can be based on index (if the index is used in the sum)"
right so in your example if i=3 then it'd just at (1+1) + (1+1) + ... + (1+1) n-3 times
n times, not n-3 times.
Wait I thought you wrote n = 3.
i = 3 is one specific part of the total sum.
i say n-3 times total to take into account that the index began at 3 and not 1, where if it were instead 1 it'd be n times or n-1 times
depending on indexing
If n = 3, then (1+1) + (1+1) + (1+1) = 2 + 2 + 2 = 6 = 2(3) = 2n.
and if i were 1
If i started at 1 or is currently 1?
i = 1, (1+1)
i = 2, + (1+1)
i = 3 + (1+1)
sum and terminate
Yes.
Summations are basically for loops that add each part together.
(Until we start dealing with infinity)
right in your example i wasn't looking closely and i thought you had written i+1 which is why i responded 2 + 3 + 4 + ... + n times
.
Note that with the 1+1 case, we could split the sum into two sums.
this one would be what i wrote above and terminate when index = n
Would the last value be n though?
Double check to make sure your first and last values make sense always.
When i = n what is the value?
2
How did you get 2?
er.. 1?
1+i = ? when i = n.
Yes, 1+n.
That is the final value in the summation.
Because the final index is i = n.
So this is not correct. Because the final value you wrote here is n.
i was saying add 2 + 3 + 4 + ... and to add these terms n times, or in other words n+1 times if you're beginning at 1 and ending at n? idk
The n at the end of a a + b + c + ... + n is not saying how many times we add. It's just another value.
The number of times we add is the number of +'s.
And the number of +'s will be n-1. The number of terms is the number of +'s + 1 or n.
the way i have read it explained is you add n-i times
in other words, you begin adding, and add until the index of summation i is equal to n
so a range equal to n - i
This was correct.
The number of lines would be n.
i = 1 to n
yeah the index and increment things always throws me
And it would look like (1+1) + (1+1) + (1+1) + .... + (1+1). The final value is the same as every other value in this case.
There are n terms.
n of the (1+1)'s.
so like what would the summation look like for summation(i) where i=0 and n=3
0+1+2+3 right
i read this as "for all i between 0 and 3 [0,3], add i"
wait'
brackets wrong
there we go
Yes.
seems straightforward enough. but the expression in the summation might be i+i, so in that case and with the same indices and n termination, we'd get (0+0)+(1+1)+(2+2)+(3+3) = 12
.latex I have occasionally written stuff like
$$\sum_{i \in [0, n]} i$$
yeah thats much more readable imo
the typical indec notation is super standard though
well back to the task at hand, i'll repaste the thing i was asking about
you can convert it in your head if it makes things easier
Yes.
I would factor out the O(1)
so you're left with a sum of 1+k
and this might be helpful
.latex
$$
\sum_{k=0}^{n-1} 1 + k = \sum_{k=1}^n k
$$
just shifting the k over by one
if you expand the sum you'll see they are the same
what does this notation mean. i know its list slicing
but like what is meant by A[:i]
is it just 0 through i? and they don't write it bc its implied? i could just test in a program and see
Omitting the first index a[:n] starts the slice at the beginning of the list.
yep got it
im reading over the recitation notes for the MIT course and it seems like they keep their substitution proofs much simpler
does a triply nested loop become n^3 or n^4?
furthermore what if you had two separate doubly nested loops but they were within the same indentation level (non-nested)
it depends on what you do in the loop
i wonder if people know there are python versions of every algorithm available in the pinned course
every algorithm in CLRS i mean
rosetta code might have them
im really glad i didn't have them when trying to implement my own stuff
oooooh
Visualization and "audibilization" of 15 Sorting Algorithms in 6 Minutes.
Sorts random shuffles of integers, with both speed and the number of items adapted to each algorithm's complexity.
The algorithms are: selection sort, insertion sort, quick sort, merge sort, heap sort, radix sort (LSD), radix sort (MSD), std::sort (intro sort), std::stable...
Oh yeah I’ve seen that one
Hello, I'm writing a maze solver and I was wondering which graph representation would be the most versatile. Right now I only know BFS/DFS, but I plan on implementing other algorithms like Dijikstra, etc.
I was planning on using an adjacency list as it would be easiest to do.
Any advice?
hi guys. I have made a videogame with python called "turtle racer". Pls could you tell me where can I upload it?
does linked list require iteration for searching?
What do you mean by this?
const search = (f, ll) => {
// search for a linked list node where the boolean function f returns true for f(p)
const p = ll; // pointer
return p
? f(p) // does p exist? if so is f(p) true?
? true // if yes, return true
: search(f, p.next) // if not, search the next node
: false; // if p doesn't exist, return false as we've exhausted the list
}
you can recursively search like this (not sure about python snytax)
ohhh ok
however you incur the auxilary complexity of the call stack
so an iterative search is preferable
i.e.
pointer = ll;
while pointer:
if f(pointer):
return True
pointer = pointer.next
return False
something like that
class Node:
def __init__(self, dataval=None):
self.dataval = dataval
self.nextval = None
Hello folks, currently doing 8. String to Integer (atoi) on leetcode
def myAtoi(self, s: str) -> int:
s = s.lstrip()
if not s: return 0
i = 0
sign = -1 if s[0] == '-' else 1
if s[i] == '+':
i += 1
elif s[i] == '-':
i += 1
parsed = 0
while i < len(s):
cur = s[i]
if not cur.isdigit():
break
else:
parsed = parsed * 10 + (ord(s[i]) - ord('0'))
i += 1
parsed *= sign
if parsed > 2**31 - 1:
return 2**31 - 1
elif parsed < -2**31:
return -2**31
else:
return parsed```
I can't wrap my head around this part for some reason
```py
if s[i] == '+':
i += 1
elif s[i] == '-':
i += 1```
anyone trying google kickstart ?
i'm reading that as increment i by +1 regardless of whether its negative or positive
question: can hashmap collisions, resolved by chaining, be resolved by a standard python list or must you use a linked list implementation
someone explain me this
its for x in y. typically would look like:
for i in range(0,10):
foo```
could be for x in list
anything
hence the for x in y
⬆️
doesn't matter
ok that's what i had gathered from my reading. ty
well they said one can use a dynamic array, so, pythonic list
they're going out of their way to implement a randomly chosen hash function from a predefined set of functions, idk why they're not simply using python's inbuilt hash
that's cheating
aka, if the point is to make a hash function, it's not really fun or educational to just use a prebuilt one
true
it is kind of neat. they create a HashTableSet class and create inbuilt methods, one of which is defining a prime as ((2^31) - 1), then select a random integer from 1 to that prime method (minus 1 again)
so they are literally creating the whole h in the set H of possible hash functions thing
kind of cool
whats the assert method all about
i read the py documentation and didn't really get a good handle
same with yield
assert condition is like if condition: raise AssertionError()
hm simple enough. and i just read yield: it's like a return except it pauses the function in that state, so that the next time it's called (the function), it'll continue on from there until the next yield or return
yes, except you don't actually call it
calling resets it to start
(well, makes a separate sequence, the old one is still available)
hm maybe i need to read more ok
looks fine
though you should use inp instead of A inside the function, str("foo") is redundant and just "foo" is fine
ok. i was sharing a small victory here. enhancements would be implementing returning the duplicates themselves and/or their locations in the array
the duplicates problem can be solved in O(nlogn) time if you first sort the array with mergesort and loop through each pair checking for duplicate values
and solved in O(n) time if you use hashing 🤯
read about generator functions
they're often better than iterators (which require a class that implements the iterator method)
do i need to get this sympy library to get access to the isprime method?
You can also use generators to "pause" functions if you want to debug them or show their progression graphically
i.e. say you want to show what a graph traversal algorithm is doing in a GUI, make the graph traversal algorithm a generator that outputs the current step it's on and you can indicate that on the GUI
yes, that'd be really helpful for showing whats going on in the traversal
ok so i just got a big list of primary numbers, and i want to randomly select from that as part of my hashing algorithm, where in the hash function should I use it
eg h(k) = (randomlypickedprime mod k)? would that be good?
wait a min
lol nvm im dumb
i guess that was in the context of randomly selecting a hash function to use
alright can someone tell me if i am thinking about this correctly:
let's say k=10, if h(k) hashes to 0, go to the hash table and store 10 at hashtable[0]
Play with small cases of n and find relationship between smaller and larger n-cases.
If the larger cases can be built from the small ones by adding stuff on, etc, then it might be dp.
i did, but the only dp solution i was able to think was creating some repetitive solutions for which i had to check if they are unique
what's the proper syntax for
if list not empty
you could define your own
true
simple one
def isprime(n):
if n < 2: return False
d = 2
while d**2 <= n:
if n%d == 0:
return False
d += 1
return True
decent enough performance
there are fun ones that leverage more number theory, but probably not worth digging into if you just want a simple isprime
(forgot d+=1, fixed)
the prime things it turns out i didn't need (or will when i want to generate a random hashing function) but the first time i wrote it i made it generate a different key for the same input each time you run it 🤦♂️
looking at this i think i need to make my hash function separate from some input iterating function
what is this function supposed to do? from the name is sounds like a hash function, but it's not
well that's what im trying to build to
but i need to stumble through it on my own to understand 😉
keep in mind my test input is simply an array of integers
also this is weird as a hash function 
(2**31 - elem) % 7
unless, that doesn't make sense, then i'll go to the string conversion approach
"With modular hashing, the hash function is simply h(k) = k mod m for some m (usually, the number of buckets)"
hm looks to be so
i'll try that
it would be very weird if m is not the number of buckets
if m < number of buckets you're not using all buckets
how can i determine the number of buckets i need given an input array
in other words, how do i compute the hashtable size
depends, how much do you want to avoid collisions?
pls refrain from writing code
the hash table is created irrespective of possible inputs, though. you don't get to know the input before choosing that (in the general case)
i want to avoid them by adding the element to a list if its empty, appending to list if its nonempty
im visualizing my hashtable as a pythonic list running vertically, so like hashtable[0] is bucket 1, hashtable[1] is bucket 2 etc
that's dealing with collisions, but you have a cost associated with the collisions, the deeper the chain gets the slower your operations get
and in each bucket there is a list
Draw the recursion tree and study it. Here blue is case 1, green is case 2, and red is case 3.
your complexity scales with the number of collisions in a bucket, so you make enough buckets so that elements per bucket is expected to be low
See how the higher n cases are built out of the lower n cases. Case n = 3 is case 2 with parens around it, plus a combination of case 2 and 1. So previous case wrapped, and all previous cases as combinations.
(for the number of elements you're going to store)
perhaps i should begin with a hashmap (list)of length 10, and initialize it to every index location storing an empty list
right
then once i have that working i can tinker with the load factor and resizing the hashmap/table
the length you should pick will depend on the load factor you're aiming for
empty list is false, non-empty is true
ty
hmm
i think if hashmap[hashloc] == True will not do what i am trying to do
lets see
i've noticed that people use _list if they want to use a reserved keyword, is this pretty common in engineering applications
or do you just stay away from reserved keywords altogether
I would name it something better
right
it won't
yeah i had a feeling. im like essentially asking it if that index exists
doesn't it have to be if you mod it by table size
this is my test input:
listtobehashed = []
for x in range(100):
listtobehashed.append(random.choice(range(0,10000)))
you can do if hashmap[hashloc] to check if the list at the index is non-empty
if thing kinda expands to if bool(thing)
and bool(a_list) is true if non-empty
i guess its doesn't matter though? like im going to use append whether its empty or not?
right, it doesn't matter
ok ok
what is it where you put a function inside a function? a method?
no thats for classes
sry i have tried googling
i have a function i want to put inside another but i have the feeling that's not an efficient way or best practice
hmm this may actually be working
yeah this would have been better as a class. i can always rewrite
if i do like return x,y does that return both variables or a tuple of the two like (x,y)
second one
it returns a tuple yeah
how might one return separate variables
you can't, you return a tuple and if you want you unpack the tuple on the other side
ah frick ok
!e
def f():
return 4, 2
x, y = f()
print(x, y)
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
4 2
just for purpose of another function having access to something that was passed to the first function
e.g.
def f():
x = 42
return x
```returns the value that x contains
maybe global is in order?
.
what do you need access to in particular?
you can do a class, you can do a closure (nested function) depending on your needs
the var buckets which was passed as an argument to my hashmapmaker function, i want to use it again in another function
specifically my hashlookup function
yeah like i said, im realizing this would have been better as a class
sounds like this should be a hashmap class that contains a member buckets, yeah
you can also pass buckets as an argument all the time, but that's not exactly nice to use
dang. but it's working so i wanted to see if i could write another function and work with it "as is"
i'll just do the tuple return and learn to unpack, good thing to know anyhow
actually i don't like that bc then when trying to return just the hashmap i'll also get the # of buckets for no reason
converting it to a class shouldn't be much work at all
yeah this is starting to break pretty quick
this is much more complex to a noob programmer, lol
probably a good idea to learn it 
100%
basically classes have functions that look like
def f(self, other_args...)
```where self is the instance of the class
you have a special method __init__ that lets you initialize things when you create an instance
what do you need to set up to create a new hash map?
and what would you need as input to do so?
basically just the list of all inputs i want to hash and the number of buckets i want to make
at least, i wrote my hashtable function allowing for that as input
brb
to begin with, why not add elements one at a time?
e.g. you could try to implement
class HashMap:
def __init__(self, n_buckets):
self.buckets = [[] for _ in range(n_buckets)]
def insert(self, key, value):
...
def get(key):
...
Classes (with methods) are basically for globals that are not global. Classes are a type of modularization and putting variables in them is like module-scope globals (not Python's terminology of module, because that is its own thing in Python).
(and you could later switch them over to even use indexing with [])
in some sense, global is typically not what you want
you tend to have some natural scope for something
if many related methods needs to modify the same data, that should probably be a class yeah
in this case you have the underlying storage buckets which everything needs access to
maybe in the future you need more things, maybe a custom hash function the user can supply, maybe something else
a class can keep that state that is common to the methods
Getting the nice property of globals that when you add them you suddenly have access to them in all your functions, but instead of having it actually be global and pollute the global scope, you have it be class-scope.
(And you can make multiple instances of that set of variables by creating multiple instances of the class)
Actual globals are rare (when used well). If you have them in some project it's probably for some global instance that exists throughout the lifetime of the entire application. But even that global instance will probably be an instance of a class.
i see
ok i'll try this. i have to leave now for family commitment but as always tysm
How do i use this to code?
I did realize it was dp in first look
Could i get some help understanding this question?
what part?
i dont even understand the question itself i think
what parts do you understand?
so i pick a number
and 3 to the power of some n is how many numbers i should have
and i should prove that that the number is divisible by 3 to the power of some n
right
it's just an example
why are they doing the decimal expansion
where did they get the 0s and 1s
the iductive step makes no sense to me
they found a number that produces 3^(n+1) when multiplied by 3^n
why
did you read the last paragraph?
they simply broke down the 3^(n+1) case into the 3^n case and another number
but how did they do that
math?
but how like decimal expansion?
i dont understand the proof they wrote
whered they get the 0s and 1s
it's just the simple observation that going from n to n+1 will multiply the number of digits you have by 3. 3^(n+1) is 3 times greater than 3^n
If you have the recurrence relation figured already then the code kind of writes itself in dp problems.
dp[i] = "(" + dp[j] + ")" + dp[i-j-1]
but i still dont get it what is the decimal expansion
i get that you break n+1 into three parts and each part is n
and n is divisible 3
but they say the other two add up somehow?
and are divisible?
the other factor is divisible because the digits sum to 3
i'm not sure what you mean by that
isnt that something 3^n * 3^n
oh ok
a number with 3^(n+1) digits = a number with 3^n digits * some_number
ok
yeah i get it
but how do they know that that number adds up to three
i mean divisble by three
it has 3 ones, the rest of the digits are 0s
so they use one for the example?
where did this come from 
hello i have a problem and i dont know where to ask for help
@haughty mountain bro thats what im asking
if you write it out, it's kinda intuitive that it would work
for any choice of a?
n=2
3^n digits: 777777777
3^(n+1) digits: 777777777777777777777777777
that number: 1000000001000000001
I was reading multiplication
yeah, that confused me too
let's call 3^n as A
you have the 3^(n+1) long sequence AAA
which can be written A + A*10^(3^n) + A*10^(2 3^n)
or factoring out A A (1 + 10^(3^n) + 10^(2 3^n))
we knew 3^n | A
the other factor has exactly 3 ones
cool
what part of it?
the A + A*10^(3^n) + A*10^(2 3^n) part is just a consequence of how we write numbers
so its some kind of rule i dont know
I wouldn't call it a rule
123000 = 123*10^3
i.e. 10^n shifts stuff to the left
right
so I could rewrite 123123 as 123*10^3 + 123
as an example
what I did above is exactly that rewrite
we have the same pattern of 3^n digits repeated as AAA
i.e A + A*10^(3^n) + A*10^(2 3^n)
one copy shifted over 3^n steps
the other 2*3^n steps
isnt that backwards kinda
backwards in what sense?
like it doesnt make a difference but its easier to understand if the biggest term went first
wait nvm
I guess, you can just reverse terms if you want
dude but how did you figure that out
idk, maybe I've seen something similar before? playing with the base 10 representation of stuff tend to boil down to this kind of shifts
how long did it take you see patterns like this you solve like every problem i give
lol, I've always liked math puzzles growing up and in early school, and I studies a bunch of it at uni as well 😛
I've been pretty trained in problem solving
dude if i went to mit i would fail everything
wow thanks alot for the explanation i get it now
@haughty mountain did you go to high end college
idk about high end, nationally it's one of the best engineering universities, internationally probably not amazing, some ranking puts it as 125th place
are you done?
since about 3 years, yeah
wow nice
def negamax(self, grid: np.array, player: int, depth: int, alpha: float, beta: float, started: bool) -> tuple[int, int] | int:
if depth == 0 or self.check_win(grid=grid):
return -depth
moves = self.get_all_moves(grid=grid) # , player=player)
if not moves:
return 0
value = -inf
for move in moves:
new_grid = np.copy(grid)
self.make_move(grid=new_grid, move=move, player=player)
nmv = -self.negamax(
grid=new_grid,
player=self.next_turn(player=player),
depth=depth-1,
alpha=-beta,
beta=-alpha,
started=False,
)
if nmv > value:
value = nmv
best_move = move
alpha = max(alpha, value)
if alpha >= beta:
break
return best_move if started else value``` this is my negamax algorithm for my connect 4 ai, its for my discord bot so its current starting depth is 6,
but now (i think this is why) when it sees a certain loss, it basically 'gives up' because it assumes the opponent plays perfectly,
but of course humans do not play perfectly at all, so how could i improve this to still choose moves that prevent loss at a low depth (i think that will solve it, i can be very wrong though please correct me if so)
hello folks just recently solved 131. Palindrome Partitioning on leetcode and this is my code
def partition(self, s: str) -> List[List[str]]:
def dfs(s, path):
if not s:
res.append(path)
for i in range(1, len(s)+1):
if s[:i] == s[i-1::-1]:
dfs(s[i:], path + [s[:i]])
res = []
dfs(s, [])
return res```
whats the time and space complexity? Would it be 2^n for both?
How can I speed up these complex for loops(2 filters and 1 sorter) made in pygame this is why there is red.y.
allMoves=[]
bullets=set()
bulletsx=[]
wmove=[]
redy=[]
closesty=[]
# The part with filling input lists
for i in range(5,500-5):
allMoves.append(i)
for i in range(red.y,red.y+red.width):
redy.append(i)
for i in range(len(redy)):
wmove.append([])
#The part with filtering
for bulletex in yellow_bullets:
if bulletex.x<red.x+red.width:
bulletsx.append(bulletex)
for bulletexey in bulletsx:
bullets.update(range(bulletexey.y,bulletexey.y+bulletexey.height))
# The part with sorting
if not len(bullets)==0:
try:
for i in range(len(allMoves)):
for x in range(len(redy)):
try:
idn=0
for y in range(len(redy)):
if not allMoves[i+y] in bullets:
idn+=1
if idn==len(redy):
wmove[x].append(allMoves[i+x])
except:
pass
for i in range(len(redy)):
adf = lambda le : abs(le-redy[i])
ce = min(wmove[i], key=adf)
closesty.append(ce)
except:
print('Error.')
Currently it needs 0.2 seconds to execute full function but I need it to 0.015 seconds
Please anyone
?
The time complexity of this code on the internet is o(n). Looking at the code though, I see two for loops.
letters = "abcdefghijklmnopqrstuvwxyz"
for letter in letters:
if s.count(letter) != t.count(letter):
The first for loop goes through letters and the the second is with count.
s.count is basically a for loop since we go through the full string to count each occurrence of a letter.
Can anyone tell me why this is o(n)? Thanks.
Then shouldn't the complexity rather be 0(s+t)
I do think I'm confused about how o(n) works. Any good articles I can read?
yeah that's right, O(len(s) + len(t))
i assumed s and t are of the same length n
They would actually be the same length. The question is for valid anagram
wow life is fair
it's O(len(letters) * n), but len(letters) is a constant
though in many cases I would include the size of the alphabet in the complexity
e.g. if your alphabet had been all unicode characters sweeping that under the rug as a constant is...misleading
My confusion comes from the double for loop.
There's a main for loop and then a hidden for loop within count.
Shouldn't that be quadratic?
the number of iterations of the outer loop stays constant, it doesnt increase when len(s) or len(t) increase
two nested loops doesn't simply imply quadratic
a single loop doesn't simply imply linear
e.g. O(sqrt(n))
i = 0
while i**2 < n:
...
i += 1
So instead of judging by the for loop or while loop, judge by what you are iterating over?
That makes sense then
e.g. O(n)
for i in range(3):
for j in range(n):
...
basically just count how many operations you do
I think I understand it now. Let me share one example and make my deduction and then you can tell me if I'm right.
res = []
if(root == None):
return res
queue = deque()
queue.append(root)
while(queue):
no = len(queue)
curr = []
for _ in range(no):
tmp = queue.popleft()
curr.append(tmp.val)
if(tmp.left):
queue.append(tmp.left)
if(tmp.right):
queue.append(tmp.right)
res.append(curr)
return res
This should be o(n)
But from what I see above the first while loop would always be one O(1)
and the second for loop would go through the number of items in the queue O(n)
this is a bfs, right?
Yes
how many times will each node be in the queue?
and much work does each node require to process?
fwiw, it's important to pick what n is in a sensible way, in this case the number of nodes in the tree seems sensible
in a binary tree your second loop wouldn't be O(n), but still O(1)
you would process like 2 nodes at the second level
Once. On the first run the root node would be in queue. Then we pop it out and add it's right and left if node available.
So let's say for the first level of a tree, at most 2n and then multiply that as we go deeper.
each node is processed once, we are doing O(1) work for each
so in total O(n) where n is the number of nodes
class Location:
root = "Unset"
def __init__(self, name, description, parents, children):
self.name, self.self.description = name,description
if len(self.parents) == 0:
root = self
self.parents, self.children = parents, children
kitchen = Location("The Kitchen", "This is where you cook meals",[],[dining_room])
dining_room = Location("The Dining Room", "This is where you eat meals", [kitchen],[] )```
surely there is a better way to do this
alternative way: if you have a complete binary tree the size of the queue you loop over in each iteration goes like
1, 2, 4, 8, ..., 2^d
```where d is the depth of the tree
d is roughly log_2(n)
this sum is O(n) where n is the number of nodes
you're trying to build a graph?
Yes
How would i do that in python? or do i just store the parents in an array
i now realise i dont need to store the children as well as the parents so this solves my problem
but im sure a graph would be better to use
one approach I like is to add the node data to a list
[<kitchen>, <dining_room>]
```which means they have a numbering, kitchen is 0, dining_room is 1
then you can use typical structures like adjacency lists
e.g.
adj = [[1], [0]]
means room 0 has connections to rooms adj[0] which is [1] and room 1 had connections to rooms adj[1] which is [0]
doesn't that mean each room can only have two adjacent rooms?
no
e.g.
adj = [[1,2,3], [0], [0], [0]]
0 is connected to 1, 2 and 3
and all of 1, 2 and 3 are connected to 0
adjacency lists are probably the most useful data structure for working with graphs
especially when traversing the graph
since you can easily find the neighbors of a node
of course you could also have directed edges, e.g.
[[1], []]
0 can go to 1, but 1 is a dead end
you can't even go back to 0
kk
Could i get some help understanding this strong induction proof
I kinda get it but i kinda dont
what.. what the heck are decorators
Go to this server discord.gg/math
You'll get all the help you need
im in there too i dont like that server
think i got my hashmap as class working
trying to get creative with print statements and add search and delete functions
if someone could help me with strong induction that would be great
they are dumber than you think, they are just wrapping stuff
these are basically equivalent
@deco
def f(...): ...
def tmp(...): ...
f = deco(tmp)
what is their function/purpose
you can do a bunch of things with them, a typical usage might be to log some timing info about the function call
!e
import time
def timeit(f):
def wrapper(*args, **kwargs):
start = time.perf_counter()
f(*args, **kwargs)
end = time.perf_counter()
print(f'elapsed: {end-start}s')
return wrapper
@timeit
def f(t):
time.sleep(t)
f(0.5)
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
elapsed: 0.5001342161558568s
but you can probably imagine you could do a lot of interesting stuff wrapping functions/classes like this
they can be pretty useful
as another example in a project I worked on I made a decorator to rate limit groups of api calls, i.e. calling the functions basically schedules it for execution rather than calling them immediately
hmm ok
hey so i am testing and debugging my hashmap class
when i run it it rnadomly fills the map with 100 random choice integers between 0-5000
i think its working so i want to test my getkey functionality, but when i run the script it makes a new random hashmap each time
i tried going into cmd and running it from command line side but it may be that the way i've written the program it might have the same issue
set the seed, so you always get the same numbers
will that work like this: random.seed(range(0,5000))
hm may have to do it this way:
random.seed(10) print(random.random())
yes, the second
hm it was getting fussy with floats. i tried append(int(random.random()) but got zeros. anyhow, now im testing by simply keying in a bunch of values and it seems to be hashing very unevenly, eg right now i have a bunch of empty buckets and some with like 10 items
random.random() always gives a number in [0, 1), so you'll only get 0 when yo truncate
you need a better hash function then
yeah i'll get there. rn i'm trying to return a value given its key with my .getkey method, basically i compute the key and go and look in that list in my hashtable (another list) and if it's there I want to return its location within the list of lists
i think my syntax is wrong
nvm i got it
hmm i'm having trouble returning the index at which my value appears in my list of lists (hashmap)
how so
oh i got it
why would that be useful for a return?
don't you just care about the value? that other info could be a debug print
it's my getvalue method
basically like a hashmap search
returns x does not occur in map anywhere if it doesn't
you're right in that its not really useful but i am just sort of like building this thing out to see how it works
really kind of in disbelief i wrote this over the past 2 days and even moreso i converted it from the way i originally wrote it (a function)
i need to implement a delete method now
hm something interesting going on. i'm successfully searching my hashtable and returning 'xxxx' not in map when an integer doesn't occur, i'm also successfully returning that an element does exist and where (separate method call separate line) but if i then go and delete that element (again, separate method separate on separate line) and then try to search for it again, i instead get None rather than my expected 'xxxx' not in map
if i do multiple self.value = value within different methods in a class will they remain distinct or will it break things
it would be the same attribute
For traversing through a tree, when do you use bfs or dfs?
what's the reason you're traversing?
No reason. Just asking if you're given a general question how do you determine what to use
One reason I know to use bfs is when you're told to populate an array
i don't really get that case, but i'd probably use bfs because it's alphabetically before dfs
So both can be used for any tree question?
one might be better, depends on the question
What if you have different methods you want to use the variable name value in and use it for both of them
if you use the same name, it's the same attribute. if you don't want that, pick a different name
maybe it can be a local variable?
do different methods within the same instance of a given class all have scope for every other method attribute?
wdym method attribute
what you were saying is the same attribute
if it's an attribute on self, all instance methods can access it
oh ok good to know
hm my getvalue method clearly isn't working for some values
its returning that it doesn't exist in the hashmap when i know it does
what's weird is that its working for some value lookups
i think we were getting our wires crossed. i just tried replacing a variable with a self.var. is it possible you were referring to arguments to the __init__ method?
i'm not sure what you mean
nvm i figured the error. i'm not actually looping over my list at the hashtable[0] eg the first list in bucket 0
for some reason it's only checking the first element in each list i think
import random
class hashmap:
def __init__(self, numbuckets):
self.hashtable = [[] for i in range(numbuckets)]
self.numbuckets = numbuckets
def insert(self, value):
key = value % self.numbuckets
self.hashtable[key].append(value)
return self.hashtable
def getvalue(self, value):
key = value % self.numbuckets
listvalues = self.hashtable[key]
for val in listvalues:
if value == val:
statement = str(value) + ' occurs at location ' + str(self.hashtable[key].index(value)) + ' in bucket ' + str(key)
return statement
else:
return str(value)+ ' not in map'
# def deletevalue(self, value):
# key = value % self.numbuckets
# listvalues = self.hashtable[key]
# for val in listvalues:
# if val == value:
# listvalues.remove(val)
def printhashtable(self):
return hashtable
h = hashmap(10)
h.insert(7)
h.insert(70)
h.insert(700)
h.insert(7000)
h.insert(7123)
h.insert(452)
h.insert(142)
h.insert(800)
h.insert(1200)
h.insert(75400)
h.insert(5400)
h.insert(75000)
h.insert(780000)
h.insert(1000000)
h.insert(10**7)
h.insert(10**8)
h.insert(10**9)
h.insert(10**10)
h.insert(10**11)
h.insert(10**1)
h.insert(10**2)
h.insert(10**3)
h.insert(10**4)
h.insert(1928)
h.insert(8291)
h.insert(1289)
h.insert(1767)
print(h.hashtable)
print(h.getvalue(7122))
print(h.getvalue(8291))
print(h.getvalue(10))
well yeah, look at your if else
the not found return should be after the loop
as written if the first item doesn't match it returns not found
🤦♂️ alright thanks
i wonder if my None issue from before is still persisting
lms
no seems to be working now, including the delete value method
sweet!
so the way it's written now, i'm arbitrarily picking the number of buckets, is there a better approach
you cannot make an iterable return statement right? like a return inside a loop can only return once?
hmm
so basically i am interested in printing my hashtable (a list of lists) but i want to print list at hashtable[0], newline, list at hashtable[1], new line.. etc
keep getting stuff like this:
<bound method hashmap.printhashtable of <__main__.hashmap object at 0x000001AB25526FB0>>
ty
im pretty proud of how this is working now, i made some neat print statements and even added an assertion to make sure the number of buckets is a valid integer (non alphabetical, non float)
#create a pythonic Class entitled hashmap. this will be my implementation of a hashtable / hashing function
class hashmap:
def __init__(self, numbuckets):
#check that the number of buckets declared by the user is valid
assert isinstance(numbuckets, int), "buckets must be integer number"
self.hashtable = [[] for i in range(numbuckets)]
self.numbuckets = numbuckets
def insert(self, value):
#compute key for given value based on the input integer value and number of buckets:
key = value % self.numbuckets
self.hashtable[key].append(value)
return self.hashtable
def getvalue(self, value):
key = value % self.numbuckets
listvalues = self.hashtable[key]
for val in listvalues:
if value == val:
statement = str(value) + ' occurs at location ' + str(self.hashtable[key].index(value)) + ' in bucket ' + str(key)
return statement
return str(value) + ' does not occur in the map.'
def deletevalue(self, value):
key = value % self.numbuckets
listvalues = self.hashtable[key]
for val in listvalues:
if val == value:
listvalues.remove(val)
print(str(value) + ' has been deleted from the map.')
def printhashtable(self):
for i in range(self.numbuckets):
print('bucket[' + str(i) +']:')
print(self.hashtable[i])
print('\n')
if you want the expected number of elements in a chain be c, then
hash_table_size * c = n_elements
- added some comments
i.e.
hash_table_size = n_elements/c
it's a very simple choice for the size, but it's probably good enough
in a real hash map you would keep track of the this constant c as you go and rehash the table when it gets too full
what do you mean real hashmap 😦
I think this c is what people call the load factor
in the book its called the load factor, alpha
I mean, in a practical implementation in a library :P
α
In Python we use snake_case, so if you have a multi-word variable try word1_word2_....
listvalues -> list_values
what about listValues
That is camelCase.
ohh
cURSEDiNVERTEDcAMELcASE
(Yes I know, very funny, snake_case in "Python")
yeah i think in an engineering scenario too one would use a much more sophisticated hashing function (key generator)
don't forget spongecase
this iS SPonGeCase
you're basically doing what python does for ints already (which is pretty bad)
My personal favorite style is snake_case for variables and functions, and Pascal_Snake_Case for types.
its bad that im doing it or its bad what python does
i like using capitalization for classes like class Hashtable
even though i totally didn't do that in the code above
(or x is a very large int)
Python has a style guide.
!pep8
PEP 8 is the official style guide for Python. It includes comprehensive guidelines for code formatting, variable naming, and making your code easy to read. Professional Python developers are usually required to follow the guidelines, and will often use code-linters like flake8 to verify that the code they're writing complies with the style guide.
More information:
• PEP 8 document
• Our PEP 8 song! :notes:
yeah i steered clear of python's hash i don't recall why
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
-2
so like, this style 'hashing function' if i can even call it that, would work fine for records where each record is like a dictionary and i'm hashing based on some record number, obviously i'd need to change it for strings, but then why would someone being storing a whole giant boatload of strings anyhow
unless maybe you were hashing based on unique record names
String as a key is common.
like h(myfullname) and output key int as a record number
and that's when you get into ord() and chr()
You can view both ints and strings as just a string of bits that needs to be hashed if used as a key.
It's just ofc fast if your code knows it's an int or string or whatever.
there are loads of bad choices of hash functions
(hash of an int being the same int is one of them)
not absolutely terrible, but not good
Like when you add two numbers you don't manually do the addition bit by bit (with a loop), your computer knows it's an int and can use the instruction that just does it in hardware for you. But it only has instructions for specific sizes.
a worse hash function for strings would be
sum(map(ord, my_string))
yeah i'd love to know more about bits and word size and have a better idea of what the machine code is doing. i recently looked into how cpus / semiconductors are made, fascinating
like even how they get the silicon is cool
You can imagine a hash function that works with any number of bits, but it would be super slow for something like an int, because you could just not do all that looping manually and just work with ints in the normal way.
# Add 4 spaces (an extra level of indentation) to distinguish arguments from the rest.
def long_function_name(
var_one, var_two, var_three,
var_four):
print(var_one)
i like this for isolating the arguments, good choice
It's like how old computers did not have multiplication instructions, they just added in a loop, which was super slow.
that must be ridiculously way back
and bits (binary digits) are the most rudimentary representation of any information in memory, as they are simply on/off, 1/0, etc
I think even many mechanical computers did multiplications
there has been base 3 computers as well
Balanced ternary is a ternary numeral system (i.e. base 3 with three digits) that uses a balanced signed-digit representation of the integers in which the digits have the values −1, 0, and 1. This stands in contrast to the standard (unbalanced) ternary system, in which digits have values 0, 1 and 2.
The balanced ternary system can represent all...
my earliest memories of a computer is like, playing snake on an old CRT monitor, no idea what the hardware was
Not /that/ far back, 6502 for example.
Home PCs did not have multiplication, division, mod, floats for some time and were pretty wide spread at that point.
Back then, people did some pretty crazy stuff to multiply fast.
Yeah.
One fun trick on the 6502 was using a squares lookup table.
And then doing xy = (x^2 + y^2 - (x-y)^2)/2.
so binary search is typically* O(log(n)) yeah? on a sorted input array?
if comparison is O(1), yeah
if you have something long strings in your sorted list then it's kinda misleading to call it O(log n)
O(log n) * cost_of_comparison
oof i need to look back at that, i can't remember if you pick some element in the middle to compare to and then check the left half if its lower, right if higher?
i can look it up
you pick the midpoint and see if the answer you're looking for is bigger or smaller
and then call it again and pick the new midpoint and do a new comparison?
recursively
right, you recurse on the side the answer must lie in
👍
(if it exists)
yeah the recitation notes for the MIT course have been a big help. they really distill everything and boil it all down rather than reading from 1300 pg CLRS
do you guys do any consulting? i feel like y'all could make bank
i know this is off topic sry
It's kind of like playing that game were one person tries to find a hidden object and the other keeps saying warmer or colder over and over.
thats a good analogy
lol, I earn enough already from working, I'm happier having some free time to do whatever
even if some of that is spent helping people
did y'all have any notes on my code? i'll go back and look again i feel like @haughty mountain you did
oh yeah this
it's pretty easy to make int input that makes python dict and set have O(n) lookups
because the hash function is so predictable
Crafting such inputs is actually used as an attack by hackers to wreck a system (lag it out). So try not having a bad hash function.
you /sort of/ just read my mind, i was gonna ask about hashing functions and cryptography
python adds a random element to string hashes, but not to int hashes
for whatever reason
so int hashes are super predictable
and collisions are easy to construct
Yeah, and I would have my own hashtable implementations or one from a lib I can easily understand on web servers for this reason.
there are very good hash algos and hash tables around
Yeah, there is a list somewhere IDK if I have the link on me right now.
Says pros and cons.
What NOT to use them for, etc.
for C++ the flat_hash_map in abseil is quite good
it's cool i can find it. it typically has to do with some prime number and staying away from base_2 / base_10 integers, and modulo
iirc rust decided to port that as well for their hash table
An open-source collection of core C++ library code
so C++ and Java and other languages are much faster than python?
for many reasons
Fibonacci > modulo (can add it on to whatever pretty much), it's for speed mostly, but it can actually improve the results too.
i know python is typically more human readable
what was it that we used before to unpack tuples? was it reduce?
ooh B-tree ordered containers
thats a whole chapter in my book
well B-trees themselves, not containers
what kind of applications y'all do.. like IT or security, database management? software engineering applications to create .exes???
What is the max number of executions in python for one second
@haughty mountain :white_check_mark: Your 3.11 timeit job has completed with return code 0.
1 loop, best of 5: 596 msec per loop
close enough
its neat that is this very close to n^2 runtime, as we expect
i mean, close to n^2 lines executed, let's say
although its curious that when i change the inner loop to for j in range(0,n) it becomes more like 220 lines
in fact the change is what i think we expect to be O(n^2), where this we would expect to be quite a bit faster but not sub n^2? does that sound right
the loop with i should make it like half the operations
lms
it's like 222 vs. 132. so yes near half
but again i thought that the inner loop being (0,n) should be roughly n^2 operations on given n,
so why is it so much higher
n^2 vs (n+1)n/2
its more like 2.2(n^2), which is O(n^2) sure
oh the n(n+1)/2 is only when your inner loop's range begins at your outer loops for i variable?
right
because n + (n-1) + (n-2) + ,,, + 2 + 1
eg any plain old inner loop
right right. it's just helpful to think of it in the two loops heirarchy
so its doing roughly 4x the number of computations we'd think, if we're counting lines executed as computations
but i think we said before that the two are not synonymous. which is obvious when you can have multiple computations per line
or a line that simply declares a variable
4x?
yeah so like, 10*11 / 2 = 55
but its doing like 222 lines executed
i guess bc the print statement?
wait sry
it's doing 132 for that
so roughly double
but again theres a print in there
you're doing a really small n though
ok let me raise it
and yeah double isn't too weird since the loop itself will be visited every iteration to do the termination check
at least it should be something like that
uhh
i just did it and it's exactly n(n+1)/2
for n=100
and this code:
n=100
out = 0
for i in range(n):
for j in range (i,n):
out+=1
print(out)```
incredible
i mean you can say 'well duh cause math' but like, i've never seen it actually align like perfectly as expected
let's try n=1000
.latex
$$\sum_{i=0}^{n-1} \sum_{j=i}^{n-1} 1$$
n=1000 should be 500500
yeah again it agrees exactly and is 10x the other output
yep
when the computation works perfectly:
yay for actually doing math correctly in my head
i have moments. one time on a popular TV show in the states it was the 'tournament of champions' nobody even tried to answer (a math question) and i got it right
but most of the time i'm not that sharp
it even works for small n, after i changed the print statement
super neato
what is the number of executions in this code (for a in range(0, 40) for b in range(a+1, 40) for c in range(b+1, 40)... for e in range(d+1, 40)) so five nested loops total
!e
print(sum(1 for a in range(0, 40) for b in range(a+1, 40) for c in range(b+1, 40) for d in range(c+1, 40) for e in range(d+1, 40)))
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
658008
@dark aurora 
or more boring: 40 choose 5
!e
import math
print(math.comb(40, 5))
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
658008
dont know if i should post it here, but do i understand Dependency Injection correctly ?
class Computer has class RedKeyboard and class RedMonitor -> what if we want to use BlueKeyboard
then -> we create interface Keyboard so that RedKeyboard and BlueKeyboard can implement them
do you happen to know why that is when i calculated the time comolexity it was ~n5/32 but the correct one is ~n5/128
I mean, the actual exact answer is neither of those
and the complexity doesn't care about constant factors
ok yeah wrong term not time compexity vut number of executions, it is not exact because it is n*(n-1) not n potention 2 but for easier writing i wrote it like that
it's exactly
n*(n-1)*(n-2)*(n-3)*(n-4)/(1*2*3*4*5)
(i.e. n choose 5)
I guess if you drop all lower order terms it's like n^5/120
yeah that, do you know why is it divided by 1 x 2 x 3 x 4 x 5
because it's n choose 5 
the number of ways to pick 5 elements out of n without caring about order
.latex $$\frac{n!}{(n-5)! 5!} = \frac{n (n-1) (n-2) (n-3) (n-4)}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5}$$
.latex $$\frac{n!}{(n-5)! 5!} = \frac{n (n-1) (n-2) (n-3) (n-4)}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5}$$
yeah, i get it, still feels weird that a five nested loops has
so litlle executions
i would ask in #software-architecture
ty
Hey there, how can I come up with "my_function" in the code below? ```py
from ipaddress import ip_address
x = [{'begin': ip_address('10.0.0.1'), 'stop': ip_address('10.0.0.10'), 'p_begin': 0, 'p_end': 30}]
y = [{'begin': ip_address('10.0.0.1'), 'stop': ip_address('10.0.0.10'), 'p_begin': 0, 'p_end': 10}, {'begin': ip_address('10.0.0.1'), 'stop': ip_address('10.0.0.10'), 'p_begin': 10, 'p_end': 30}]
z = [{'begin': ip_address('10.0.0.1'), 'stop': ip_address('10.0.0.2'), 'p_begin': 10, 'p_end': 20}, {'begin': ip_address('10.0.0.2'), 'stop': ip_address('10.0.0.10'), 'p_begin': 20, 'p_end': 30}]
my_function(x, y) # True, two dicts in 1 can be "condensed" into one dict in x, all ports and ip range match
my_function(x, z) # False, x allows something like 10.0.0.1:25, but z does not allow that, port range is 10 - 20
not asking for the full solution, I want to know what algorithm I can use here
just looking for pointers on how to approach this
my question is if were to pop the tail
that would be O(1) not O(n) correct?
adding the node (3) is O(1)
but removing is O(n)
adding the node with value 3 might be O(1) or O(n) depending on the list implementation
if you hold a reference to both the first and last node of the list, then you can add an element at the end in O(1)
if you hold a reference to only the first element of the list, then it takes you O(n) to add an element at the end
either way, once you've added the node with value 3, it's O(n) to remove it
because in order to remove it, you need to update the node before it by setting its next to None
and it takes you O(n) to find the element before the last element.
ok I;m sorry but bear with me lol
oh does it?
I thought we could directly
access that
how would you do that?
by setting some variable = self.tail
oh
I see
now it makes more sense
hmm but in a doubly linked list
it would be O(1)
yes, in a doubly linked list, adding or removing an element at either end is O(1)
assuming you keep a reference to both the head and the tail, anyway, which you would.
I see
It kinda throws me off
when i think of the number of operations sometimes
I think of o(1) as constant number of operations but yeah
that's exactly what o(1) is
the number of operations required to perform the modification doesn't vary with the size of the list
the o(n) n part is about the fact that you have to traverse the entire linked list to get to the last element
it's not about the actual removing part
ah, right.
yeah so I don't know if thats the correct way to think about it or not
but if we have to loop through it
then it's o(n)?>
so the longer your linked list, the longer time it will take
such it scales with n (size of linked list)
yeah makes sense
finding the node that needs node.next = None takes O(n) time, updating that node to set node.next = None takes O(1) time
Hi so I am solving LeetCode 680. Valid Palindrome II: https://leetcode.com/problems/valid-palindrome-ii/
here is my solution:
class Solution:
def validPalindrome(self, s: str) -> bool:
def verify(s, first, last):
while first < last:
if s[first] != s[last]: return False
first += 1
last -= 1
return True
start = 0
end = len(s) - 1
while start < end:
if s[start] != s[end]:
return verify(s, start + 1, end) or verify(s, start, end-1)
start += 1
end -= 1
return True
this is much slower than
class Solution:
def validPalindrome(self, s: str) -> bool:
def check_palindrome(s, i, j) -> bool:
while i < j:
if s[i] != s[j]:
return False
i += 1
j -= 1
return True
left, right = 0, len(s)-1
while left < right:
# print(f"{s[left]=}, {s[right]=}")
if s[left] == s[right]:
left += 1
right -= 1
else:
return check_palindrome(s, left + 1, right) or check_palindrome(s, left, right - 1)
return True
I am not sure why since they're effectively doing the same thing. Aren't both of this solutions O(n) time and O(1) space?
Hi guys - this is a weird one, but I've basically got a computational problem which I've solved, but I'm looking to speed up and work out if there's better data structures for my use case. Basically it involves around 220,000 list "computations", which then have their own sub-computations. DM me to get in touch and I can send the ZIP w the challenge. If this is the wrong place to post this, let me know!
!rule 9
where can i ask for telegram help?
if this list is a palindrome ```py
[1, 2, 2, 1]
then is this list counted as a palindrome ? ```py
[1, 2, 3, 2, 1]
yes
whomever was using o(something) and O(something) yesterday interchangeably should realize that those are not the same concepts
how would the time complexity of various operations on a hashtable using pythonic list to resolve collisions vs. a doubly linked list data structure implemented as a class to resolve collisions differ
my gut says it wouldn't matter so much until the lists become large
!rule 9
append on list is amortized O(1), traversal and removal by value is O(n)
append on linked list is O(1), traversal and removal by value is O(n)
for lower level languages linked list has some useful properties, like if you append pointers to the previous values are unchanged
for list/vector appending can cause a resize which invalidates pointers
digits in base 13
looks like hash tables are indicated for fast adjacency lists?
im thinking i should also rewrite my representation of a graph as node class and traversal classes, rather than what i have now which is parallel arrays for each attribute of each node and breadth first search as a function
thoughts?
hash table if you're not working with indices
conceptually a list of length n is a mapping from [0, n) to values
do people ever sort as they are storing values in their hash table chain? like so that way it's already sorted when you go to do other operations?
In the chain? Like, in the specific bucket? Not sure why you'd need that - after all, it only has more than one element if there's collisions
like lets say i wanted to first store all my records in a hash table and only after everything was stored then go back and look for duplicates
if the buckets were already sorted, then checking each pair would be trivial over something like first sorting the bucket with insertionsort
eg check bucket 0 for dups in ABCDD would be check (A,B),(B,C),(C,D),(D,D). i guess if the bucket contents are small enough n then n^2 running time would be tolerable anyway (the insertionsort approach)
this actually leads me to a different question, can you have some sorting algorithm running on lets say, 5 parallel arrays, and each time you add an item to one of the arrays, it is first added then sorted, before another element gets added to that array or another array?
i guess the question would be regarding like active algorithms during runtime or at least during building out a data structure
hm i guess you could but it could become wildly inefficient
When vertices are uniquely labeled from 0
to |V | − 1, it is common to store the top-level Set Adj within a direct access array of length |V |,
where array slot i points to the adjacency list of the vertex labeled i. Otherwise, if the vertices
are not labeled in this way, it is also common to use a hash table to map each u ∈ V to Adj(u).
alg was correct
or use a hash set to check for dupes
for sure
there are nice containers for keeping data sorted e.g. BBSTs and B-trees
if that's what you're after
that was the sort of prescribed answer to solve the dups problem in O(1) or linear time but it got me thinking about programs comprised of multiple algorithms
yeah ok. i just glossed over the binary trees section to focus more on graphs and graph traversals, i guess it feels like my representation is lacking. also how would one rehash their hash table or rebalance it
do y'all think its worth rewriting my graph representation from parallel arrays for every attribute of a given node to one based on Node class? my BFS is working now but i think storing weighted directed edges would become cumbersome
Whatever you prefer. You can just have the graph be a list of Nodes: ```py
class Node:
def init(self, neighbors):
self.neighbors = neighbors
# Attributes here.
graph = [Node([1]), Node([0])]
Where neighbors is a list of indices pointing into the list that contains all the Nodes.
Same thing as parallel arrays, but array of struct (AoS, array of objects) instead of struct of array (SoA, parallel arrays).
I would personally separate the data and the graph info
vertex indices, edges and weights go into the graph representation
and data can be looked up using the index as needed
Yeah you can also do: ```py
graph = adj_list_here
attribs = [NodeData(...), NodeData(...), ...]
Just two parallel arrays, the second is all of the data in one, instead of multiple arrays.
that's what I meant yeah, it's a sensible structure
would you initialize neighbors as an empty list or just key in all the neighbors in a list format when you make a node
If you are just making an algorithm quickly and don't really care about naming the attributes and making a class for them you can just use a tuple or list for each element of attribs.
i'm trying to build a Vertex class but sort of struggling to generalize things and the concept of self.attribute from yesterday got me a bit confused
The neighbors are given on creation into the __init__.
so you just key in a list including the brackets and all
i'm doing like y = Node(5, [0,3,2,4])
On its own it's not part of the graph.
how would you handle the vertex name/value here
oh you're saying given an array, at index zero, Node([neighbors]) occurs
i see
If your graph is an adj list done with a dict, you can do graph[5] = Node([0, 3, 2, 4]).
what's the point in wrapping the neighbors list in a Node class?
i'm trying to move from the many lists approach to a Class approach, but yes i'll still need an adjacency list, which could be indexed in the natural manner
i want to have every attribute be a part of the graph or node classes
Attributes can be included as well.
for easy modular representation and adding any attributes like color, weights, etc
graph[5] = Node([0, 3, 2, 4], "foo", 55.4, another_graph)
weight isn't a node property
true
Weight is separate. Unless in my example by weight you mean like weight of person (each vertex a student or something).
I mean edge weight
maybe what i didn't like about my existing representation is that the outputs were messy and i should just go fix that instead
If you get too many parallel arrays for attributes you can combine them.
like, I imagine this kind of setup, Graph could use an adjacency list or whatever as the underlying data structure
vertex_data = [data0, data1, data2]
g = Graph()
g.add_edge(0, 1)
g.add_edge(0, 2)
for u in g.neighbors(0):
print(vertex_data[u])
This method (parallel arrays, attributes combined or not), keeps the graph structure separate from the attributes.
This means you have nice things like only passing around the structure if you only care about that. Or only the data, either way.
alright i'll just fix my print statements to make it more palatable
you could also bake in magic like mapping stuff to indices into the Graph class
such that the outside user wouldn't even need to know about that indices are used internally
ok let's say we are sticking with a graph as 2 lists, vertices and neighbors which is a list of lists. then how would y'all then represent weights
i guess you could iterate over the vertices and neighbors lists and store the edge tuples and their weight in a dictionary or something? doesn't that get convoluted and messy?
vertices = [0,1,2,3,4,5,6,7] neighbors = [[1,4],[0,5],[3,5,6],[2,6,7],[0],[1,2,6],[2,3,5,7],[3,6]]
these really don't need to be two different lists do they
either put the weights in the adj list, i.e. store (neighbor, weight) tuples in the list
or have a separate dict with weights
if you use indices 0...n-1 then the vertices list is superfluous yeah
it's just range(n) if you need it
range(len(neighbors))
right
kind of amazing that you could store a graph with vertices 0-n in a neighbors list of lists comprised of tuples of neighbors and their edge weights