#algos-and-data-structs
1 messages · Page 149 of 1
Have you looked at my code? I've done those in separate steps and Im asking for help in how they'd be arranged in a single list comp
Yeah i've looked at it, i'm telling you which processes can be combined in a single list comprehension
Do you want the code for it or do you want to find it yourself?
@manic karma
I am trying to find the answer myself. Thank you
I would have liked some explanation about how I could get these coordinates in tuples with each x and y as integers. splitting the strings is what I already understand, I already have a very hamfisted way of working the coordinates into the form that I want them in. Reaching out to find a more pythonic way of approaching this problem doesn't seem to have gotten me a positive response so I'm just going to figure it out on my own.
Well it's kinda hard to help you get to the answer without spoon feeding it directly
The problem is not to do with understanding, it has to do with making code shorter and using syntactic sugar
And saying you haven't gotten a "positive response" when someone is actively trying to help you on your way might not attract that many "positive" helpers as you would think. @manic karma
Yeah, no thanks on the help then.
managed to use tuple(map()) to get a simpler result:
coordinates = input.read().split('\n')
# coordinates = [i.split(' -> ') for i in coordinates]
coordinates = [coord_set.split(' -> ') for coord_set in coordinates]
for coord in coordinates:
x1, x2 = tuple(map(int, coord[0].split(',')))
y1, y2 = tuple(map(int, coord[1].split(',')))
So I am learning about disjoint sets. Just to check - if Find returns same ID, then that implies that sets were merged, right?
You can remove the tuple() parts 
Thank you!!
nothing, as long as you return the same value after the while loop
break just exits the loop, and afterwards there is a return result statement
So it will basically return result either way
@spare parcel
i dont wanna be arrogant, some one very pro at leetcode wrote this
why didnt he write the one liner then..
whats the point of running another comparison with min
nm i went through the logic again, wrong answer with return nums[left]
imo the code isn't particularly good, using inclusive-inclusive range for the binary adds a bunch of +-1 stuff that's just noisy and error prone. also seeing "pointer" used to mean index is a bit jarring to read
yes, you need the min, because the initial result = nums[0] is dealing with the non-rotated case
Note that the uniqueness of elements is actually important, without it you can't do things quicker than O(n). In any case here is a nicer implementatiom
def find_min(nums):
l = 0
r = len(nums)
while r - l > 1:
mid = (r + l)//2
if nums[l] < nums[mid]:
l = mid
else:
r = mid
# If not rotated, r == len(nums)
return nums[r%len(nums)]
The binary search looks for point where where nums[l] > nums[r] and it naturally handles the sorted case, no special cases
Hey! I was wondering if anyone has any recommendations on books on data structures and algorithms suited for intermediate to expert level of expertise in programming? I know I could find this on !resources but I want to hear from the community itself.
I'm having troubles with agglomoerative clustering with ward. When my algorithm tries to creatae a connectivity matrix,this happens:
File "/usr/lib/python2.7/dist-packages/scipy/sparse/compressed.py", line 964, in toarray
return self.tocoo(copy=False).toarray(order=order, out=out)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/coo.py", line 252, in toarray
B = self._process_toarray_args(order, out)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/base.py", line 1039, in _process_toarray_args
return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError```
I have serached online and from what I can understand the matrix gets to big and the memory runs out? Is there a solution?
if I iterate through a list of dictionaries and append the resulting dictionaries to another list, will it keep order from when it was appended?
list.append always adds things to the end of the list
Is that what you meant?
Yes. Hm, for some reason my list of dictionaries getting out of order.
Whatever is doing it, it isn’t list.append
Is anyone good at using numpy and has experience with 4-D arrays?
Hey everyone!!! My name is Jamie, and I am a junior programmer. I was wondering if I could talk to someone about a problem I am trying to solve using hashmaps & algorithms? I am fairly new to python and would love some help!
I'm attempting to sort this list of dictionaries by block_date but receiving an empty response. Any suggestions?
Here's what I've tried,
list_of_dict.sort(key=itemgetter('block_date'), reverse=True)
mapped_data.sort(key = lambda x:x['block_date'])
empty response? @rigid turret
yes, nothing returned.
What should be returned, you sort the list of dicts in place
you can use sorted if you want a copy of it to be made and returned
@lament totem thank you for assisting on this.
im hiring a data structures experts for a very short project (maybe 10 hours) Please ping or PM me if interested
why is list comprehension faster then generator expression here?
>>> timeit.timeit('all(a == 2 for a in x)', 'x=[2] * 4000', number=10000)
6.00...
>>> timeit.timeit('all([a == 2 for a in x])', 'x=[2] * 4000', number=10000)
4.00...
because there is a cost to yielding and lazily constructing the sequence
because execution jumps between your program and the generator for every generated value
in contrast generating the list the whole thing is created at once
more memory used with the list, but also typically faster, it's a trade-off
the list is already constructed there and all() just has to get the iterator and iterate over it
meanwhile the generator is lazily evaluated and still needs to calculate the value every iteration
i see thx
Hi not sure if this is a right channel so sorry if its not.
Ive got this homework and Im really strugling with it
Find shortest path from start to finish (or out of the maze)
50a)we have a square maze and inside of it is a white lady, find the path from u to v, which minimizes number of steps through walls
50b) same maze, but we are going through it and for breaking down one wall it takes k time.
Write both with good complexity and then write time complexity
a)My thoughts are that i will use BFS for finding the shortest path and then if the finish is not blocked this will be the right path and if its blocked i will just go through the walls that are in front of finish .But i don't really know if its good and im confused by the assignment.
b) i would again use BFS and basically the same thing as with (a) .But i dont really know whats the difference between these two assignments
So if anyone could help me, i would be really glad

minimizing steps through walls sounds weird @austere shell
Does the rest not matter, just the minimum number of walls?
And how big is the maze?
well thats the problem this is the only thing i have
thats random it can be any size
it should work for any size of maze so basically any of these
Well the idea I had in mind was just using breadth first search to cover all spots not occluded by walls
After there is no position left to visit directly, from every position you can go through a single wall
Then try visit all possible nodes again until no more left
allow go through walls once again
etc.
he added that in a) youre counting only the walls
doing it like this will basically give you the minimum number of walls broken down
just like he said
by your way ?
yeah, because after all positions are visited, you are allowed to go through a single wall from every visited position
and you can just count the amount of times you allow this
And each node in the maze should ofc. store the path it has taken
I'm probably not explaining it very well :/
well its more like my fault that im not really good at this ,sorry 😅
that would be nice 
That's enough paint for 2day
tell me if you don't understand a step @austere shell
okay i do understand this but in which time complexity is this going to work ? 😅
well you only visit each position once
so it will be proportional to the size of the board in cells basically
The space complexity could be quadratic at worst with this method
so o(N) ? if i understand correctly
yes for time
but each cell stores the path to that cell
You could maybe also store from which way is travelled to that cell in every cell
that would make it linear as well I think
instead of storing the entire path
oh so if i would write it somehow, it would be something like, were given maze and after we use BFS on it we will break every position around that path and after we do that we will remove the data of visited cells, we will again break the walls and so it repeatedly until were in the finish position, and for every cell we store the direction of the shortest path right ?
yeah, it would be repeated iterations of:
loop:
1. BFS all reachable spaces from visited locations
2. Break all walls around currently reached spaces
3. Remove currently reached spaces (Don't need them anymore)
4. Set broken walls as visited
oh thanks, and i know im asking little too much but dont you know how to change it for the B) question ? 😅 
It would be regular BFS, but each iteration you also add walls to the neighbouring spots to visit, but with a "timer" that counts down every iteration
initialized with k
and only when the timer is 0, you can travel to that spot
oh okay 😅 thx ill try to rewrite it somehow thx again
❤️
class UnionFind:
def __init__(self, n: int, count: int):
"""O(n) space"""
self.parent = [i for i in range(n)]
self.size = [1] * n
self.count = count
def find(self, A: int) -> int:
"""O(inverse Ackermann function) time"""
while self.parent[A] != A:
A, self.parent[A] = self.parent[A], self.parent[self.parent[A]]
return A
def union(self, A: int, B: int) -> bool:
"""O(inverse Ackermann function) time
True if a merge happened, False otherwise"""
root_A = self.find(A)
root_B = self.find(B)
if root_A == root_B:
return False
if self.size[root_A] < self.size[root_B]:
root_A, root_B = root_B, root_A
self.parent[root_B] = root_A
self.size[root_A] += self.size[root_B]
self.count -= 1
return True
am I correct in thinking unionfind implemented with union by size and path compression in the find is inverse ackermann time both find and union?
Is O(m * n) considered quadratic complexity?
It's apparently called bilinear
uhh
never heard of that
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
ans = defaultdict(list)
for s in strs:
count = [0] * 26
for c in s:
count[ord(c) - ord("a")] += 1
ans[tuple(count)].append(s)
return ans.values()
So this is bilinear?
if n is the amount of strings and m the average string length ig
quadratic is O(n^2)
It can be simplified a lot with Counters
It’s O(n) with the number of characters
hi
is it correct practice to work out standard deviation of an average column. e.g. i have 4 columns for for different people data and 5th column is average column of those 4. can i still work out standard deviation of column 5 - does it represent SD of all patients
Not really sure what you mean
you have multiple values, and you take the average of those values, which gives you 1 average value right?
@lament totem ill show a basic screenshot now
row 8 is standard deviation of each column
what i want to know is it statistically allowed to work out SD of average column - does this represent an average SD of the 3 persons
Is each row a different type of variable
like does the order of the numbers from top to bottom matter? @solemn zodiac
5, 4, 1, 7, 5 f.e.
the actually dataset is much bigger. the numbers basically would represent coordinate points (y) to make a graph.
the order prob does matter a bit
what is the data
because the graph plots like many semicircles over and over again
Can you explain it to me
x
sure will do now
because it matters whether taking the average per row makes sense
so for each person i have a graph which shows velocity of blood for that speciifc one patient. i have extracted the values of each part of the graph so get thousands and the values basically take every single little point of the line
so if you do a graph of those values again you will get same graph
So the graph was basically three lines showing values over time or something
or however many people you had
so if i just select one person data values in that one column i get a graph like this
but it goes on many more times
just sleected a portion of values
But that's a single line
yes and a different graph for next person
Oh right srr haha
but similar sort of line but will vary
And a value represent a measurement at a certain time point?
at time points 1, 12, 23 ... ?
as of now the x axis is row number on my spreadsheet. but overall x axis label is time. but each person might have different number of row numbers per second lets say
actually im not sure about the last part i said
let me think a sex
sec
Well if the time points don't match, it does not really make sense to take the average of points on the same row
give me 10s let me think a sec about that
okay no actually each row does represent same time like a millisecond
so it will be same for all people
As long as they "match" in some way for each person it makes sense to average them
And you could maybe also take the standard deviation at each time point
Don't think taking the standard deviation of the averages makes a whole lot of sense, I'd rather take the standard deviation over all points
It really depends on what you want to show/find out
yes the time will match but the point in the graph might not which might be issue - hard to explain - basically if one person has small peak (see graph) then they have less number of rows for that peak in graph. so at row 10 i might be averaging peak of graph for one patient and bottom of peak for another which i think wont work
Hmm yeah
I'm really not very confident in giving much advice without fully understanding the dataset, but i'm sure it's not easy to explain
Just try to understand what the meaning of taking the standard deviation of the average per row is
And if that would be somehow useful in explaining something or showing something
If you still feel unsure, it might also be best to move the conversation to #data-science-and-ml
basically i wanted to get a measure of standard deviation for all values. so i could work out SD of all values for each person which i think would be accurate as it tells you spread of values for one person. but i woiuld have to do this spearately for all people to get each person with own SD. or i could average each column to create average coolumn to represent all patients - the mean should be right (lets assume for now) but if i work out SD of that average column not sure if its accurate to say SD of all these people is that
Yes, SD per person could be useful, and SD for all people (so SD given all datapoints) maybe too
do you think its fine though to do SD of the average column. lets say SD of that column came out 4. would i be able to say SD for all these people data points is 4.
masta
ur solution is 2x faster too
can anyone suggest where i have to start dsa to learn from scratch
maybe https://www.geeksforgeeks.org/data-structures/ and when ready the MIT 6.006 vids in the pins are god
Can anyone help me find a real-life use for a binary search tree? I've seen examples of implementation but very little on use cases. Thank you
balances binary search trees are used a bunch
you can keep an ordered collection of items with cheap O(log n) cost for insertion and deletion
just binary search trees in general is of less practical usage I think
the balanced ones give you guarantees about tree depth, which is what gives really nice properties
Since they’re in sorted order, so you could do something like have a big collection of strings, and then start reading them starting from the ones which are alphabetically past a lookup string
And hash tables don’t do that
And if you had a sorted list along with a hash table, insertion would be O(n)
in the worst cases binary search trees have O(n) insertion and deletion as well though
balanced binary search trees are what makes things really nice and useful
hi, how can implement this in python?
I tried this but seems like incorrect
def find_or_create(_id: str, _hash_table: dict) -> dict:
if _hash_table.get(_id) is not None: node = _hash_table[_id]
else: node = {"id": _id}
return node
roots = []
hash_table = {}
for parent_id, child_id in source:
child_node = find_or_create(child_id, hash_table)
parent_node = find_or_create(parent_id, hash_table)
parent_node["children"] = dict(parent_node["children"], **child_node)
child_node["parent"] = parent_node
hash_table[child_id] = child_node
hash_table[parent_node] = parent_node
if parent_node in roots:
roots = roots.remove(child_node)
I think you want this instead of the odd dict thing
parent_node["children"].append(child_node)
and make it a list, of course
I'm not sure what you mean exactly
unrelated but I would recommend using a class instead of a dict for the node, e.g.
class Node:
def __init__(self, id):
self.id = id
self.children = []
self.parent = None
```should be a lot cleaner
can you pls provide some samples
ahh, I misanderstood
do you have 1 min to implement this in your way?
I updated the Node class, I would recommend using that
instead of having a dict to represent the node
then the relevant code becomes
parent_node.children.append(child_node)
child_node.parent = parent_node
that + update find_or_create to use the class
(you can make things work with a dict for the node, but it's just annoying code)
I will try to do that
but yeah, I think the main underlying issue is that this operation is really a list append (or adding to a set)
rather than the dict weirdness you have on the corresponding line
Does anyone know array broadcasting from numpy?
You meant like this?
class Node:
def __init__(self, _id):
self.id = _id
self.children = []
self.parent = None
def find_or_create(_id: str, _hash_table: dict) -> Node:
if _hash_table.get(_id) is not None:
node = _hash_table[_id]
else:
node = Node(_id)
return node
roots = []
hash_table = {}
for parent_id, child_id in source:
child_node = find_or_create(child_id, hash_table)
parent_node = find_or_create(parent_id, hash_table)
parent_node.children = parent_node.children.append(child_node)
child_node.parent = parent_node
hash_table[child_id] = child_node
hash_table[parent_node] = parent_node
if parent_node in roots:
roots = roots.remove(child_node)
kinda, the append line shouldn't have the assignment
@fiery cosmos Could you help me with numpy?
I have to finish this first
ok afterwards could you help me please
at least I will try
I am doing that for this:
source = [
(None, 'a'),
(None, 'b'),
(None, 'c'),
('a', 'a1'),
('a', 'a2'),
('a2', 'a21'),
('a2', 'a22'),
('b', 'b1'),
('b1', 'b11'),
('b11', 'b111'),
('b', 'b2'),
('c', 'c1'),
]
expected = {
'a': {'a1': {}, 'a2': {'a21': {}, 'a22': {}}},
'b': {'b1': {'b11': {'b111': {}}}, 'b2': {}},
'c': {'c1': {}},
}
# Write a function that builds a tree from a list
# id pairs (parent id, child id),
# where None is the id of the root node.
# Work example:
but seems like I need different solution right?
btw, write your problem anyways... maybe someone else has time
I am not sure if these are correct
are there any clustering algorithms you guys know of that make use of adjacency list? NOT adjacency matrix, but list
can someone explain what exactly prefix codes mean in Huffman's Coding ?
@fiery cosmos
!e I see what the dicts are for now, I made an implementation, but it's quite different from the one you posted
source = [
(None, 'a'),
(None, 'b'),
(None, 'c'),
('a', 'a1'),
('a', 'a2'),
('a2', 'a21'),
('a2', 'a22'),
('b', 'b1'),
('b1', 'b11'),
('b11', 'b111'),
('b', 'b2'),
('c', 'c1'),
]
expected = {
'a': {'a1': {}, 'a2': {'a21': {}, 'a22': {}}},
'b': {'b1': {'b11': {'b111': {}}}, 'b2': {}},
'c': {'c1': {}},
}
##
# Init unconnected trees (single node)
table = {}
for parent, child in source:
if parent: table[parent] = {}
if child: table[child] = {}
roots = set(table.keys())
# Connect trees
for parent, child in source:
if parent is None:
continue
parent_node = table[parent]
parent_node[child] = table[child]
roots.remove(child)
forest = {root: table[root] for root in roots}
print(forest)
assert forest == expected
@haughty mountain :white_check_mark: Your eval job has completed with return code 0.
{'c': {'c1': {}}, 'b': {'b1': {'b11': {'b111': {}}}, 'b2': {}}, 'a': {'a1': {}, 'a2': {'a21': {}, 'a22': {}}}}
cant be implemented in O(n)?
@fiery cosmos Do you think you could quickly help me with my question?
But how? there are two loops
O(n) + O(n) = O(n)
it's not like there are nested loops
it's not a single pass over source, but it's still O(n)
oh I get it, thank you so much 🙏
@haughty mountain Could you please help me with my question?
the docs are pretty clear about how it works: https://numpy.org/doc/stable/user/basics.broadcasting.html
really appreciate any suggestion as to how to tackle this. I recognize this as a dynamic programming problem, it's similar to unbounded knapsack but i can't figure out what's the state transition formula. Thanks in advance 🙂
i tried to form a 2d dp table, with the max box size as row and the index of box_i as column
In order from small to large, compute the best you can achieve with nesting. When looking at box size n, you have the boxes 1 to n-1 to pick from, the value with nesting is already known, so you only need to pick the small boxes that goes into the size n box.
That subproblem is your typical knapsack problem
thanks a lot!
You mean multiline strings?
´´´text´´´
Here's how to format Python code on Discord:
```py
print('Hello world!')
```
These are backticks, not quotes. Check this out if you can't find the backtick key.
yea code blocks
but maybe best to ask in #python-discussion unless it involves algorithms and data structures
im tring to find the exact character in my keyboard but i cant find it
usually next to 1, same key as ~
yea it also have a question about perceptrons and reinforced learning
Then maybe #data-science-and-ml
it returns this ||||°°°
You can also copy this message to make a code block (edit the code accordingly)
im using a spanish vertion of keyboard but ill try whit english version
it works in any discord server isnt?
yeah
ok ill make test in a empty discord server tanks
What is this algorythm named?
5 groups, ABCDE, set relationship AB AC AD AE, then BC,BD,BE, etc
hey guys can you help me with this dfs algo
is there any good source to start learning Data Structures and algorithms
anyone?
is this just a double nested loop?
geeksforgeeks is pretty good in my opinionhttps://www.geeksforgeeks.org/fundamentals-of-algorithms/
A Computer Science portal for geeks. It contains well written, well thought and well explained computer science and programming articles, quizzes and practice/competitive programming/company interview Questions.
I'm personally not a big fan of g4g, I've seen a bunch of cases where their info is either not great or just plain incorrect
but fwiw this is in the context of looking up some particular algo to remind myself of it
i am trying to print all the possible subsets for a given array using recursion
def all_subsets(arr, idx, current_combination, ans):
ans.append(current_combination)
for i in range(idx, len(arr)):
if idx != i and arr[i] == arr[i - 1]:
continue
current_combination.append(arr[i])
all_subsets(arr, i + 1, current_combination, ans)
current_combination.pop()
a = [1, 2, 3]
res = []
a = sorted(a)
all_subsets(a, 0, [], res)
print(res)
But the values are not getting append in the ans at top of the function. After printing res, it just prints an empty array of arrays
Why this is happening and how to solve it?
.
thank you 😄
you're appending the list, but that does not make a copy
so in that list are all references to the same list
which ends up empty in the end
you could either convert is to a tuple, or make a copy of the list when you append
anyone kno any algos to detect if a word or a variation of a word is in a string?
e.g. looking for the word "toast" in these 3 strings "toast is yummy" "t oast is yummy" "t0ast is yummy" would all return true
No, dynamic.
You an only group the groups by two.
And groups can be created dynamic. Exept the main bits like swat, police etc...
You can see you have law enforcement team, and you have the main organizations, and inside them you have teams, that will be spawned by me on my wish or auto generated, so that there should be a dynamic part of the team. However, everything in law enforcement, every group should be in a relationship, and you can only group them by two at a time sadly. So need to figure out how I can make a relationship with these groups.
function createGroupRelation(group1, group2)
local groups = [
["lawEnforcement"] = {
swat = {
swat_01 = {}, -- auto generated numbers
swat_02 = {}
},
police =
military
cia =
},
["criminals"] = {}
]
end
I suppose as far as I can see it it would be up to three depth
but maybe I'm overlooking something and could be 4
some of them will be only one depth
but then would be good to be able to also group them with other stuff if needed
law enforcement is robust, but a 'creepty' team would be just one object/array but then might group them with criminals right.
But the API lets me group only two groups at a time.
set_relationship_between_groups(int relationType, Hash group1, Hash group2)
and a group can contain only 8 characters
So it feels like I need some algorithm for this
:incoming_envelope: :ok_hand: applied mute to @tawny moon until <t:1648751065:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).
anyone know why the book would declare a variable within a for loop statement in a pseudo code of insertion sort?
maybe its allowed in other languages
also their list indexing begins with 1 shudder
Our goal in this video is to design a way of keeping the height of our binary max heap shallow
Word "shallow" confuses me - what does it mean to keep height of binary max heap shallow?
small
like short branches
why it's a case that complete binary tree is shallower than some other binary tree?
Shallow means that the branches are short
a complete binary tree is completely filled except the bottom (so all nodes have 2 children, except for bottom)
That is as shallow as you can make it with a given number of nodes
I don't understand this question
which of the options uses most computational power
how could I add code in these 3 functions in order for children nodes to have reference to their parent node
anyone use binance api ?
Hello! Is this an appropriate place to ask about line detection?
I'm looking for an algorithm that detects lines from an image, but the start/endpoints of the lines are known - just the edges and the ways those start/endpoints are connected isn't
All I can find on the internet is the Hough Line Detection algorithm
I have a pathfinding algorithm to find the path with 0's and avoid 1's
https://paste.pythondiscord.com/jesowageni
But for some reason I get an unexpected output py [['|' 1 1 1 1 1 1 1] [1 '|' 1 1 1 1 1 1] [0 '|' 1 1 1 1 1 1] [0 1 0 '|' 1 1 0 1] [1 1 1 0 '|' 1 1 1] [0 1 1 1 0 '|' 1 1] [1 1 1 1 1 0 '|' 1] [1 1 1 1 1 1 0 '|']] in the 3rd row 4th item it is making it as a path while it shouldn't be. the correct one should be the 3th item of the 3rd row
I cannot find out why is it only for the 3rd row
the algorithm is just wrong, look at a case like
[
[0, 1, 1, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0],
]
there is a perfectly ok path, but it will jump to the column on the far end when possible
if the algo checks the previous row and the next row it should work as intended right?
your code basically sees the 0 that is reachable from above and says that's ok
despite the top 0 being unreachable
(looking at the column on the right with alternating zeroes and ones)
I suspect you want to look into graph traversal algos like dfs
thanks I'll look into it
have you heard of https://github.com/labuladong/fucking-algorithm/tree/english ? What's your opinion on it?
haven't heard of it
skimmed through some articles and they seem ok
though I see some cases where technical terms are used wrongly, probably because of translation
Hello guys, does anyone have experience with line detection algorithms?
Hmmm you can have a node class that gets created in the init method. That node class can have a field that maintains a reference to the parent.
Your tree class can be used to assign those references.
So Im able to get through most algorithms but my thing is that since it's my first time doing virtually all of them in years, I often have to use the debugger to squash my bugs. THat made me realize, how come this isnt typically a thing in coding interviews? Is it just time constraints? I use the debugger for probably 50%+ of the algorithms I write
Never used a debugger, probably should use them
But most of the times the bugs can be found by looking at the error and some debug prints
Like f.e. when you get a shape mismatch with a matrix multiplication you print the shape of both matrices, and then go back to see if all shapes are what you expect
ah yes that is sorta similar - I just examine them in memory rather than print but yes I have also used such print statements.
I guess the better question would be is it pretty typical to have bugs in your algorithms at first? I feel like if e.g. I wrote BSTs every day, after a couple weeks I could probably write a bug-free basic BST, but there are so many data structs and algorithms to know that at any given time, each one I am "rusty" on if that makes sense
hmhm, yeah that makes sense
obviously looking up algorithms you've known at some point is expected
Having known them is still useful for re-learning though
And it might also help to look at the code and see if it is actually correct than just running it multiple times, looking at error, fix, cycle
Yeah this above point of yours is something I wish I was better at. Sometimes when I have a bug and I am staring at the code, my brain gets foggy until I either hop into the debugger or basically do a manual debug on pen and paper (where I plug numbers into the vars and manually step thru the process)
I was just wondering if this was something that was "frowned upon" or relatively common when someone is writing a particular algorithm for the first time
Is there a more elegant pythonic way of unpacking a dict into key value pairs?
foo = {'red': '4'}
k, v = list(foo.items())[0]
#Output
#k = red
#v = '4'
k, v = foo.popitem()
though it changes your dict.
also: next(iter(d.items()))
changes meaning empties
yes
next(iter(d.items())) ... I'll explore this option a bit
hey im trying to make an algorithm that recives the capacity of a HDD and then print the number of micro discs necesary to make a security copy of the main disc, when I try to run the program it does nothing can someone help pls
memoria_disco_duro = int(input("ingrese la cantidad de memoria de su disco en GB : "))
capacidad_micro_disco = 1.44
micro_discos_necesarios = 1
memoria_total_microdiscos = capacidad_micro_disco * micro_discos_necesarios
while memoria_disco_duro > memoria_total_microdiscos:
micro_discos_necesarios += 1
else:
print("la cantidad necesaria de micro discos es :", micro_discos_necesarios)
``` this is the code
:incoming_envelope: :ok_hand: applied mute to @quaint flame until <t:1648870177:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).
That while loop will never terminate. You're incrementing micro_discos_necesarios, which changes neither memoria_disco_duro nor memoria_total_microdiscos, so the comparison will forever be the same.
Since it's pretty topical: the qualification round for Google Code Jam is ongoing!
it's a fun contest
people in this channel might be interested
int hack_count(int n){
int x[n],y[n];
x[0]=y[0]=1;
for (int i=1; i<n; i++){
x[i]=x[i-1]+y[i-1];
y[i]=x[i-1];
}
return (1<<n)-x[n-1]-y[n-1];
}
what can be the time complexity
of it
O(n)
What is #algos-and-data-structs
discussion about algorithms and data structures
well, an example of a data structure is a Python list
Okay
With algorithms people normally mean code that is used to solve specific task
And data structure?
a python list is similar to a vector or linked list iirc
like sorting a list, traversing a binary tree, finding out if a number is prime etc.
And yeah like postgredis says, a data struct can be something like a list, dictionairy, binary tree, graph etc.
a format/way for storing data
Storing the data from algorithms?
we can use algorithms to traverse and sort data structures
Ok so where i can start
learn big o notation
With all these stuff
O notation ok
some things fundamental to DSA are big O notation, logarithms, and a lot of math
Ok
Which should i learm first
big o notation
https://www.youtube.com/watch?v=oz9cEqFynHU&t=851s this video is a cool introduction to DSA
Best resources to learn Data Structures and Algorithms:
Jomaclass: https://tren.jomaclass.com
MIT Lectures: https://www.youtube.com/watch?v=HtSuA80QTyo&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
► Patreon: https://www.patreon.com/trenblack
► Instagram: https://www.instagram.com/trenblack/
► Twitter: https://twitter.com/tren_black
► Merch: trenb...
A short introduction to Big-O notation.
Data Structures Source Code:
https://github.com/williamfiset/algorithms
Ok
The overview I wish I had when I was starting!
👨💻 Join Freemote, the Freelance Developer Bootcamp
https://freemote.com/?el=youtube
🍿 Learn the "Zero to Freelance Developer" Strategy (free)
https://freemote.com/strategy/?el=youtube
📸 Social media
https://instagram.com/aaronjack
#datastructures #algorithms #basics
👍
quick question to anyone who knows anything about lstm, ml, tensorflow. is there usually obvious reasons as to why lstm models over fit?
for i in range(1, 2):
model.fit(x_train, y_train, batch_size=5, epochs=500)
time.sleep(0.001)
Epoch 1/500
237/237 [==============================] - 12s 52ms/step - loss: 3.8080e-04
Epoch 100/500
237/237 [==============================] - 11s 44ms/step - loss: 1.3932e-04
Epoch 200/500
237/237 [==============================] - 11s 45ms/step - loss: 1.1043e-04
Epoch 300/500
237/237 [==============================] - 11s 46ms/step - loss: 7.9581e-05
Epoch 400/500
237/237 [==============================] - 11s 47ms/step - loss: 6.8104e-05
Epoch 500/500
237/237 [==============================] - 11s 47ms/step - loss: 5.6542e-05
around 175-250 its perfect and then around 275 it starts becoming stupid
if I apply scaled data after test data will it help my model?
:incoming_envelope: :ok_hand: applied mute to @stiff tangle until <t:1648914228:f> (9 minutes and 59 seconds) (reason: newlines rule: sent 120 newlines in 10s).
!unmute 901553514520645632
:incoming_envelope: :ok_hand: pardoned infraction mute for @stiff tangle.
Sorry about that
Use this for large amounts of code: https://paste.pythondiscord.com/
Thank you man
:incoming_envelope: :ok_hand: applied mute to @stiff tangle until <t:1648914356:f> (9 minutes and 59 seconds) (reason: newlines rule: sent 118 newlines in 10s).
!unmute 901553514520645632
:incoming_envelope: :ok_hand: pardoned infraction mute for @stiff tangle.
Ah sorry. Go to that website, paste the code, click save, then copy and paste the url.
Btw, #data-science-and-ml is a better fit for your question.
thank you ill go there
Hello, can I ask if this line has any sense in Python:
d_nodes[node_id] = globals()[name](node_id, properties)
Especially the last tuple?
its not a tuple, its calling globals()[name] with those arguments
!e eg
def f(a, b): return a + b
def g(a, b): return a - b
d = {"f": f, "g": g}
print(d["f"](3, 4))
@jolly mortar :white_check_mark: Your eval job has completed with return code 0.
7
Oh, thanks 🙂
lmao that's hilarious 😂
:incoming_envelope: :ok_hand: applied mute to @dark aurora until <t:1648935428:f> (9 minutes and 59 seconds) (reason: newlines rule: sent 115 newlines in 10s).
Solve
Does anyone have any tips for dealing with recursion wrt Binary Search Trees? These really trip me up because I have a hard time wrapping my head around recursive functions that have both multiple base cases and multiple reduction/recursive steps
Like, its pretty ridiculous, I can sit here and write many loops recursive all day no problem, like fibonacci, doubling integers in a list, etc... But BSTs throw me off since there are two separate "branches" at each node, and its not as simple as saying "when at zero or empty array, we're done"
After a while, I begin to just get totally confused and start switching the base cases reduction, and recursive steps around into a buggy nightmare lol
I agree, go join Google Code Jam and solve that problem yourself
Often you don't have to really think too deep. A lot of the reasoning tends to boil down to "let's assume my function does the right thing when I call it recursively, what does the code do?", plus a few simple base cases
It's the same kind of reasoning that you do when doing inductive proofs, where you assume your statement is true for n and then prove it's true for n+1.
That plus the base cases is the inductive proof. You can do the exact same kind of reasoning with recursion
You get the base cases right, and then you can just focus on solving the problem assuming your recursive call does the right thing
how can I make it work
Most of the time I find the base case can boil down to "is this node null (None) or not null". If not null you can recurse on the children which is fine since they will hit the base case if they themselves are null.
Also grinding some problems may help https://leetcode.com/tag/tree/ 
~~if only we had some nice operator to do an action depending on whether something is null... 😩 ~~
I'm salty about 505. It's like four years old, but "deferred" it is.
Ah yes this is definitely a sore spot for me - the iterative part of my brain has been checking if the children are null rather than the “this node” when I should have been checking this node. It may be possible but it’s certainly made the code harder to reason about that way. My brain is thinking “check length-1 before you do a buffer overflow” type of thinking and forgot as you said you don’t need to check next if your base case covers it. Similar to not having to check in the body of a loop if the conditional statement covers it.
\n
Maybe ask in a help channel #❓|how-to-get-help or #python-discussion since that's not an algorithms question 
^ was about to say that
thanks!
lol looking over some stats at worldlifeexpectancy.com and had a chuckle
leading cause of death is coronary heart disease, not suprisingly, low birthweight is nowhere near the top
like, sorry, your country is fat, but good news, your country is fat
how to reverse part of list in place without writing a loop?
nums[1:].reverse()
Yeah but none of that matters just Covid. Because companies profit off of the stuff that makes people fat.
suppose i can mark each node in a tree as red or black, is there a way to mark an entire subtree as red in constant time. Is there a storage format or something that can achieve this?
I have a list of lists,
e.g. x = [["a", "b", "c", "d", "e"], ["f", "g"]]
how can I create a function, say serial_no, that will return the serial no of the element given from x. Like if i want serial no of "f" it should be 6, for "c" it should be 3
nums[1:] = reversed(nums[1:])
I know how to do it in logarithmic time, not constant time though
yeh please advice
range query trees/segment trees
thanks a lot
actually, that deals with leaves and not all nodes in the tree 
so maybe it's not what you want
there are ways to make it work anyway, but it's a tad more work
i'll look in other directions then, it's probably not what the task intended
if you make a list in preorder dfs order then subtrees are ranges, so you can use algorithms that work on ranges
it's a neat trick, but probably not what's intended
what time constraints do you have?
how expensive can setting a color be and how expensive can a lookup be?
ah
it's a dynamic problem, i've solved most of it. So mark the root red if it has positive value, mark the children red if their sum with the root are positive
The thing is that, if i were to abandon a child subtree because i'm abandoning the root, then i need to mark that subtree all black
idk why it matters that the tree is rooted for this problem
I guess having it rooted helps for stuff you need to do later
I suspect this is just some DP on the tree
i'm using a post order traversal to solve the subproblems first, then the larger problems. Bottom up approach.
u mean 'rooted at vertex 1'? they didn't mention how the tree is stored in memory so i don't think it is something important
fwiw this problem is essentially largest sum subarray but generalized to trees
oh ok 
like, if you have a line graph
o--o--o--o--o--o--o--o
```then that is the maximum sum subarray problem
do it might be useful to look how that version is solved
and see if it can be generalized
The most well known algo for solving that problem is called Kadane's algorithm
https://cp-algorithms.com/others/maximum_average_segment.html
https://en.wikipedia.org/wiki/Maximum_subarray_problem#Kadane's_algorithm
oh what i thought seem to be completely wrong
At least I'm pretty sure that a line graph will end up being max subarray
unless I've read the problem wrong
so, something weird like this could be a solution to the tree version
you want to find a connected component in the tree with maximum sum
or the maximum value, at least
oh god, I know a technique that would work
but it's a bit of a mess
if you only want to solve it for components containing the root then the problem isn't that hard
you ask all your subtrees what the maximum sum component of them are, where the root is forced to be red (I suspect totally black tree, sum=0 is also ok)
from that info you can easily compute the answer for the root
so we can solve it for the root in O(n)
thanks for ur time
there is this technique called rerooting of a tree
which allows you do transform an existing dp for the root as if the root was some other node
it feels too advanced though, like using a sledgehammer on your problem
i think that is what the question meant by "rooted at vertex 1"
yes, but the question doesn't say that the root must be red
if we consider this dp, where all subtree roots knows the answer if the subtree root is part of the max sum component, how much work is it to move the root around?
like, say blue was the root, and now we want to move it to where the neighbor the arrow is pointing to
all these values are unaffected by the re-rooting, their subtrees are still exactly the same
so we would only need to recompute the dp values of those two nodes
i.e. recompute the value of the old root, which is now the root of this subtree
we already know all the dp values of the children, so that's pretty cheap
so there are two cases, one where the root is forced to be red. second case is where we swap the root with an imaginary 0 value node
ahhh i think my low clock-speed brain is malfunctioning
the case I want to avoid in the dp is where the red component is somewhere deep in the tree
because if so I can't connect is to my root
which means it's an invalid coloring
actually, recomputing the dp-values using the old subtree values might actually be too expensive
I guess it's not O(n) anymore
you want something that only looks at the values in the two nodes, because then it's for sure O(n) in total to reroot the root around the whole tree
wait can we use the 'naive' method, then store all the dp for each node in another array, and then find the maximum value afterwards?
there are n nodes, n subproblems, still O(n)
actually, maybe the thing I'm saying does work
how does the value of dp[old_root] and dp[new_root] change?
when I move the root
dp[old_root] loses the contribution from the old subtree, so subtract dp[new_root]
dp[new_root] gains the contribution of the new subtree at the old root, which is just adding the dp[old_root] we just computed
so this will actually work
Do you get the high level idea? @fathom delta
sadly not yet haha 🙂
Do you get that we can answer the question pretty easily with a dp if we start at the root?
in O(n)
that yes
Did you get the part where I wanted to move the root to a neighbor?
it takes some time for me to understand
so, this part
in the beginning the root had contribution from all the green nodes
sry why are we swapping the root in the first place?
I'm kinda running out of good colors... The old root is now a subtree with contribution only from these green nodes
So, we solved the problem for when the root was part of the component
but that's not necessarily true
yes
But what if we did it for all possible roots?
Naively just repeating the DP for all nodes is O(n^2)
Is it a good idea to learn ds and algo parallel to python or any other language?
I am learning python and java and have a beginner level knowledge of them
Or its better to learn it later when oop concepts are in strong grip?
Implementing some algos and data structures is a good way to get used to a language
I can recommend it
Solving small problems can be fun, and will get you coding practice
@fathom delta Do you get what I'm saying here?
Ok thanks
is the yellow node previously a green node? it swapped with the blue node
the yellow is the old root, I just wanted to give it some color
ok so now the yellow swaps with blue
So, the big picture is, if we did the dp over all possible roots, and take the maximum, we get the answer
wait is this same as that?
But computing the dp for all roots individually is very expensive
I'm not exactly sure what you mean there. But the idea is indeed to store the initial dp, and then make small changes to it
Because it turns out that if you just move the root to a neighbor, not that much changes
In fact, moving the root is O(1)
i get what u mean
So...we can just shift the root around, say in dfs order
and move the root around the whole tree
It will be ~2n moves
This is kinda fun. I've never actually implemented this kind of thing in practice. I've just heard about it and my friend has talked a bunch about it
So I guess the remaining complication is understanding how the update to move the root works. Which is what I tried to outline before
In the two images I showed before I tried to show how the situation for the root changes, it loses the contribution of one of the green nodes, but we know that contribution, it's just the dp value of the node that was just lost
in this way, the initial dp value will only increase or stay the same. ?
No, the dp value of the old root goes down
because it effectively loses the contribution from a subtree
For completeness, here is the corresponding situation for the new root
it gains a new contribution
the value for the right subtree
which is the updated dp value of the old node
at this stage, yellow is the root of the subtree containing the 2 greens
correct
and now it swapped with the blue, and the previously yellow is now the green to the right
yes
you can actually think of the full update in two steps, the first step that does dp[old_root] -= dp[new_root] pretty much breaks the tree into two parts
old_root and new_root are roots of two separate trees
and then you reconnect the trees but with new_root as the root
i.e. the two blue nodes here are roots of their respective trees, which I marked in red
so you basically disconnect and then reconnect the tree (wrt the dp values)
keep a record of the largest sum of subtree by moving the root along the entire graph,, that is the max
yes, the same process as doing the max over all dp values in the naive O(n^2) way, but we are clever and transform the dp instead of recomputing it over and over
we make use of the fact that most of the dp doesn't change when shifting the root around
at any point in time, there are two subtrees in the graph whose value sums we need to look at and compare with the record max
well, you really only need to look at the dp value of the current root
when updating the recorded max
in my image you would have started out recording the dp value of the original root, and then the dp value of the new root, and so on
oh man i totally forgot to use the maximum subarray sum logic
thanks for taking so much of your time to help me!
I actually didn't keep the max subarray logic in mind much, at least not Kadane's algo
I mainly realized that the problem is easy if we force the root to be part of the component
and then we can work from there
so the problem boils down to implementing the first dp on the root correctly, and then do the rerooting
I suspect you might also need to consider an all black tree, but that's a trivial edge case
it's just sum = 0
np, it was actually good to have to solve this, I've been meaning to do something with rerooting at some point, and this is as good an excuse as any 😛
is vertex of tree element of tree that has children?
a vertex is an element that can have edges that connect to other vertices
Is there an algorithm to generate all possible permutations of pairs without duplicates given an array of length n and all unique items
basically any node in this pic
vertex, plural vertices
edge, plural edges
I think vertex in particular has a bunch of names
so basically if there is an element in tree that can be connected to other elements in tree that can be connected to some other elements, then these elements are vertices?
vertex, node, maybe even point
I know people say vertexes
ah so vertex is basically any element of tree?
vertexes are the things in the tree, the connections between vertexes are edges
Yeah I understand then
vertexes
thanks I hate it
there's actually a good reason, vertex -> vertexes may be easier for non-fluent english speakers
goose(se) instead of geese

I wanted to say gooset, but pronouncing that like goost is like impossible when reading the word
'goo' 'set' or 'goose' 't'
I think if one wants to be proper it would be vertex and edge in undirected graphs and nodes and arcs in directed graphs
though I know people are very inconsistent about it
especially vertex vs node
I've seen both "directed edge" and "arc" in the directed case
Hello, I'm not a coder but I want to learn, but I have this idea and I just want to ask if it is possible. The question: Is it possible to predict the number generator generating 1-20?
depends on the number generator
if it's a weak one like linear feedback shift register it's easy, if it's something better it's harder, if it's something actually good it can be pretty much impossible
(assuming you can observe a bunch of elements before starting to predict)
I don't anything about machine learning and my motivation to learn to do so is about this predicting number generator, thanks for your response. Is there a algorithm or model or code that I can search online that I can use to test my idea??
I'm thinking like I will just change the data maybe in the array where i put the numbers
And run the code and see if it can predict
e.g. for mersenne twister (which is kinda common when you don't need cryptographically strong randomness) https://github.com/kmyk/mersenne-twister-predictor
I don't think machine learning will help
I think everything will be math based approaches
So my job is to find a model that fits my problem
Duh, as python itself uses an algorithm to generate random numbers
You can predict any outcome given the factors
If python generate random numbers, can python predict itself what number will be next?
Yes, you need to recover the internal state of the RNG
it probably would be possible with machine learning, but there's no reason to when there are effective algorithms that don't use machine learning
Can you tell me more about this?
since the algorithm python uses is deterministic, given enough data you should be able to predict what numbers will come next
Yes
your model would need to accurately match 2^19937 -1 sequences to the correct initial state, for every initial state
good luck
it's possible
also if it wasn't initialized with sufficient entropy, it'd be much easier
much like me factoring a zillion bit semi-prime is possible by guessing factors
ML is useful for stuff where we might not even know the internal mechanisms for what we are dealing with, or where things are just way too complicated to make a decent model by hand (e.g. image related tasks for the latter)
in this case the mechanism is simple, but has immensely complex external behavior
which is a terrible case for ML
and a very good case for humans
I don't understand anything, but am I in the right track to learn ml?
I just want to predict some random numbers
IMO this is a terrible application for ML
Those numbers are generated from the game
What should I learn?
You would need to know info about how these numbers are generated to do much at all
using numpy, how do i make a repeating list that increments the numbers?
or using matplotlib, how do i make a plot of [2,1] that repeats and those numbers also increments by 1?
output should be [2,1,3,2,4,3,5,4,6,5,7,6,8,7]
Hi, what happens when i have Graph and his minimum spanning tree and the weight changes on one edge, does anything changes for minimum spanning tree ?
change can be decreasing or increasing of weight, and this edge doesn't have to be in minimum spanning tree. I would need to know what happens for all cases if you could tell me thx .
how many times does it have to do this? it looks like your example does it 7 times.
doesn't really matter but let's say 10
let me see
another question, how do i run a while loop while also keeping the plot open?
the while loop also asks for input so it will freeze from that
@marble crown this seems like a weird thing to want to do. also I don't think you can continuously add information to a plot. nor can you repeatedly add information to a numpy array (their shapes are immutable).
i want to change information in a plot
that would involve making a new plot
isn't there any way to keep the plot async?
I don't think so, but I could be wrong.
though it sounds like what you mean is "mutable" rather than "async".
I'm pretty sure you can make dynamically updating plots in matplotlib
you like enable interactive mode and then show the plot, then you can update the plot data
that would be a better question for #databases than #algos-and-data-structs. You should probably have two separate tables, one called player_name that stores (player_id, name) and doesn't allow duplicate IDs, and one called player_inventory that stores (player_id, item) and does allow duplicate IDs.
I want to traverse a tree (/dendrogram) to get a list of all paths. The code below does what I want but I feel like there must be better solutions ... appreciate any ideas 
test = {"L": {"L": 1, "R": {"L": 3, "R": 9}}, 'R': {'L': 1, 'R': 0}}
stack = [[test, []]]
path = [[]]
roots = []
while stack:
node = stack.pop()
path = node[1]
node = node[0]
if isinstance(node, dict):
stack.append([node['L'], path + ['L']])
stack.append([node['R'], path + ['R']])
else:
roots.append(path)
roots
in property 2 of DFS, if v is the descendant of u, why does v have to finish before u finishes
can someone remind me why this series becomes 2N?
it's from https://leetcode.com/problems/k-closest-points-to-origin/solution/
it's a convergent geometric sequence, so we can use N / (1 - r) where r is the common ratio, r = .5, so N / .5 so 2N
S = 1 + 1/2 + 1/4 + 1/8 + 1/16 + ...
2*S = 2 + 1 + 1/2 + 1/4 + 1/8 + ...
|_______________________|
|
S
2*S = 2 + S
S = 2
Hey @fathom delta!
You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.
For an infinite sum of G.P. you can use a/(1-r)
I am aware of the geometric series generalizations. I thought I would just write it out for this specific case, rather than give a formula to plug and chug.
Hello, im no programmer and still beginner and trying to learn python, what are your thoughts on this: there's a game that im playing that spits out random numbers, how are you going to approach this problem to predict what numbers will come out next? This idea is my motivation to learn i dont know, machine learning stuff neural networks
you could also do the general case as easily
S = 1 + r + r^2 + r^3 + r^4 + ...
r*S = r + r^2 + r^3 + r^4 + ...
S - r*S = 1
S = 1/(1 - r)
didn't u ask that question yesterday
and got the answer "this is a terrible idea for ML"
The reason why I thought this topic because quants uses machine learning to predict patterns in stock market, so I thought this idea is "small project" for machine learning since it will predict
it's not like the stock market is totally random
PRNG's aren't either, but stuff like cryptographically secure RNGs are designed to be very hard to predict
if I gave you actually random data your model would just be trying to fit to noise, which is totally worthless
a decent PRNG probably wouldn't be much different
Yeah maybe my idea is just too ambitious
Is it hard to create cryptographically secure RNGs?
I don't know enough to say, there are a bunch of ones you could look at though, shouldn't be hard to find some if you search for it
The reason why I asked that question to assume whether the game will use that kind of security just for a game. Btw thanks for your responses to my question really appreciate it
a game wouldn't use that kind of security, some of the possibly cryptographically secure rngs I know can only generate ~2mb/s of data(which is fast but slow compared to pseudo ones, but it might be fast on some other hardware), while non-cryptographicalls are much faster (500mb/s). Do you by any chance know which algo the game is using or even which language the game is coded in?
Based on what Ive search, it's c++
release year?
2012
hmm, it could be a good guess that they are using mt19937 algo to generate the rngs, you could try brute forcing the seed with mt19937 to predict the numbers
I wish I could, what steps should I take to do that? Learn c++?
learn basic c++ syntax, and try brute-forcing, if you are not doing to gain some advantage in online pvp or other stuff.
I need help on traveling salesman problem
they might be still using rand() but its depreceated since c++11, so if somehow it doesn't works just remember that they have been still using legacy code 😄
Thank you very much for the idea @tidal shard I want to learn machine learning when I have free time but I need small project that I should aim so I'm motivated to learn and this predicting random thingy is what first comes to my mind. Have a great day
Kaggle has got some small projects related to machine learning like this one https://www.kaggle.com/competitions/titanic
brute forcing won't get you much, there are well known attacks on mersenne twister though
I linked one repo earlier
with a long enough sequence (624 terms) you can do it e.g. https://github.com/kmyk/mersenne-twister-predictor
I was thunking about brute forcing the seed
will that won't work?
the number of seeds is pretty large no?
its large but like in 1hr of time it should brute force it, i think
2^32 seed values
and you will need to generate a bunch of values to verify the sequence
the attacks are better
yea
I mean, it's not insane amounts of work, but it will take a good while
especially considering we have better methods already
yea, didn't know about the method thats why recommended brute forcing
though I'm guessing if the same random engine is re-used for stuff we can't see that will become fun
like, skipping every few elements
it could get pretty dirty if thats done
hey guys! i started a new youtube channel to keep me active in my leetcode grinding journey. i make videos solving leetcode questions and hope to start making videos about system design and software architecture as well! if any of ths resonates with you then you can check out my videos https://youtu.be/IBGxUakhSVA 🙂
Hello!
This is a quick explanation and implementation of leetcode 11 Container With Most Water in Python.
0:00 Read question
1:45 Go through examples
2:17 Discuss solution
4:18 Code solution
Has anyone ever been writing an algorithm and basically finished it, discovered bugs, then tried to debug/fix it, and you end up like fixing each bug but start to lose the overall big picture? Like Im in a situation where I think I misunderstood the original ask/problem a bit, coded up something that fit my understanding, then had to change the code to fit the actual ask. In this process, Im almost feeling lost in my own code and like I'm just putting out fires in hopes that the thing will eventually satisfy the tests.
The point of asking that is that I'm trying to decide at this point if I shouldn't just scrap the whole thing and retry from scratch next week or some other time (work on other algos in mean time), OR if it's valuable to just use this as a way to improve/exercise my debugging skills until I get this thing under control.
Yes and both can be valuable 
If the code gets so crazy that you're certain it could be solved a simpler way then maybe step back and rethink
but debugging (in head or in ide) is really useful to help understand your algo or the ideal one
Or don't be afraid to look at other people's solutions imo. They can sometimes reveal a lot (though also make you complacent if you think you understand, I've been here many times
)
Fair. Yeah I think I may be able to pull this one off myself without looking, but you're right. I was recently working on BST stuff... And long story short I was lost until I finally saw someone's solution that "clicked" with me and now I totally get it.
I was trying to be too cool and sit there for hrs multiple days forcing myself to figure it out, and I do think there was some value there too, but looking at other solves isnt always "cheating", it can actually be educational, I agree.
Now if on every challenge you just copy/paste everyone else's work, yeah thats pretty lame and it wont get you anywhere. This is also why IMO GitHub Copilot isnt going to be as crazy and scary as some folks surmise. You still have to understand whats going on and be able to fine-tune code.
Exactly, unless it's literally for a test it's not cheating. No one was born understanding this stuff and there's whole college level classes dedicated to it. But if you do look at other code you really gotta make sure you understand it.
hey, i have a really basic question here, I am trying to run a code that has a user input three numbers and then say "Your sum is ___" but for some reason its bringing up an error at the end, does anyone know whats wrong?
name = input("Hello, please enter your name: ")
number_info_one = "Hello, " + name + ". Please enter one number: "
first_number = int(input(number_info_one))
number_info_two = "Please enter another number: "
second_number = int(input(number_info_two))
number_info_three = "Please enter one final number: "
third_number = int(input(number_info_three))
#this is how to add?
total_val = first_number + second_number + third_number
final_text = int(total_val) + "is your total value"
final_text = int(total_val) + "is your total value"
TypeError: unsupported operand type(s) for +: 'int' and 'str'
@woven prism youre trying to add an integer value to a string there technically.
Maybe ask in #python-discussion or grab a help channel #❓|how-to-get-help since that's not about algorithms. But your problem is you can't add int and string as machine says
try final_text = str(total_val) + "is your total value" for the last line
(and/or look into fstrings)
okay, thank you ill do that
hey
does anyone know
if there's like a centralized repository pdf of problems
like there's "Looking for a Challenge" from the POI
but those problems are way too hard
so basically is there an easier version of that book
Use None instead of null
not a pdf but maybe https://github.com/labuladong/fucking-algorithm/tree/english
Yap None worked
https://mystb.in/RoutineHeyAssessments.python
start (2, 0)
end (1, 10)
[['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_'],
['_', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '[]', '_']]
Traceback (most recent call last):
File "C:\Users\Marsel\Desktop\app\dungeon_exploration_thingy\main.py", line 152, in <module>
game = Game()
File "C:\Users\Marsel\Desktop\app\dungeon_exploration_thingy\main.py", line 17, in __init__
self.generate_path()
File "C:\Users\Marsel\Desktop\app\dungeon_exploration_thingy\main.py", line 130, in generate_path
assert self.check_stuck()
File "C:\Users\Marsel\Desktop\app\dungeon_exploration_thingy\main.py", line 35, in check_stuck
self.map[x][y+1] == '_'
IndexError: list index out of range
not really sure what im doing wrong
im trying to create a path between two points and the path should have like dead ends like its not just a single line
Question - I’ve also been working on binary search trees. Why are you trying to use a list here? Just curious. Well I guess it’s due to the challenge you’re working on. But I’m just wondering why anyone would do this.
That's leetcode's way of representing trees concisely, where you kinda read them off top to bottom, left to right
Cleaner than writing a big hierarchy in one line I suppose
Ah interesting. It would be fun to write some algos to convert/parse from list to tree actualy
Yeah, was just thinking that too 😄 it's gotta be a LC challenge already 
Storing k-trees in an array is very fast in low level programming languages. You can also do it in Python, but i'm not sure if there is a benefit, or how large.
For a binary tree, you can access the left child of the kth element with 2k+1 and the right child with 2k+2.
Contiguous memory is generally faster.
>>> def inorder(tree, i):
... if i >= len(tree):
... return
... if tree[i] is None:
... return
... inorder(tree, 2 * i + 1)
... print(tree[i])
... inorder(tree, 2 * i + 2)
...
>>> tree = [3, 9, 20, None, None, 15, 7]
>>> inorder(tree, 0)
9
3
15
20
7
>>>
(the list is the tree, it's about how you access the data (access pattern))
That’s cool thanks @opal oriole .
This is the technique used for heaps usually: https://en.wikipedia.org/wiki/Binary_heap#Heap_implementation
A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues.: 162–163 The binary heap was introduced by J. W. J. Williams in 1964, as a data structure for heapsort.A binary heap is defined as a binary tree with two additional constraints:
Shape property: a binary hea...
I assume what the python heapq module does.
Yeah, it's a very old way of doing it, dates back to 1964. So the original way.
But the contiguous memory benefit matters way more now than it did back then (AFAIK).
Well I think data locality will never cease to matter
Always faster to read a block of memory than dereference a bunch of pointers
Yeah, it's just back then the actual operations run on the data took long enough that the execution time ratio was not so skewed as it is now.
Dereferencing the pointers is not exactly the problem. It's that those things that the pointers point to are not next to each other. If you put them together though, then it's fine.
Very true
And that it has not visited them before / recently (cached)*
(And/or it can't predict where you will access next (with a bunch of things randomly placed on the heap it has no chance, but even for complex but uniform (access patterns) structures it can because CPUs have crazy predictors now (entire neural networks in hardware)))
Computers are really complicated now.
Heh, that's before you get into NUMA
Yeah it's complicated but maybe simple heuristics are helpful, like perhaps it helps at all scales to keep the data small, contiguous and close to the CPU where it's going to be processed
Generally yeah, and unfortunately, that means avoiding heap allocation, which currently most programs spam all over the place. However, the alternatives (such as memory arenas) not only make it faster, but make management of memory easier and safer.
(C++ is getting there with custom allocator support, but still kind of meh)
I suppose if you are using something like PyPy instead of CPython the contiguous memory benefit will probably show up again (be noticeable, but IDK, maybe someone here wants to time it in CPython to see if there is any gain).
Hey, as a layman, I know that SMT solvers are NP-Complete, and that they are used to determine if curtain mathematical formulas are satisfiable or not, this sound (at least at the surface) like computer algebra system (CAS), eg SymPy, my question is if there is any connection between the two? I went looking if perhaps CAS systems are just the front end for algorithms that rely on SMT under the hood but it seems that SymPy does not use SMT? If so does that make it any faster at the cost of some functionality?
binary trees in particular, there are nicer and more general ways of representing trees compactly, e.g. just have a list P where P[i] is the parent of the ith node
if you can assume the tree is rooted at node 0, then you only need to specify n-1 parents
and then you can easily have an n long list with node values as well
hi is there someone that knows how to work with Flask? I have some questions about it.
greetings, was looking to learn DSA , i know basic python , can anyone help me with some guidance on this please
hello, I have a question HOW do you print the path to a node using the Dijkstra shortest path algorithm? I have the code to share aswell
I imagine thats becauase you can do direct access via index? Rather than traversing... IIRC this is also where contiguous memory/array stand out via linked list. But linked list insertion is more efficient.
we have cpu caches now, which makes data locality even more important
You can either take a formal class or just go watch some online content and learn them, and also go to something like leetcode to practice exercises. I would post others but I don't really want to "advertise" for them here per se.
@still night you can DM me if you want some other resources
okay thanks a lot
the channel pins also have resources
I got a question about HISTORY
It seems (and this is just speculation really) that many of these algorithms we're talking about here and in general when we say "data structures and algorithms", were actually invented close to when computing was brand new. So What I'm wondering is approximately how many of them were created theoretically or by mathematicians on paper, vs created after computers were applied and by engineers building systems who just figured them out?
A lot of CS is doing stuff theoretically
Like, for example, it seems to me that much of computing was basically figured out by mathematicians and similar scholars in the early days mostly on paper or theoretically, and then it was applied. And it seems that those days is when computer science really made the most advancements. Now, decades seem to go by and comparatively we have much less real core CS innovation. Again, speculation.
Yeah, so I'm wondering, if I want to say start trying to come up with some new algorithms... Is it in my best interest to sit around and play with code and benchmarking things... Or relearn/learn some discrete math and use a notepad and a calculator
its just crazy because for example the idea of quicksort seems extremely simple and almost intuitive... But someone had to "invent" it
Im wondering if it was sorta a happy accident, or purposeful, etc...
You kinda need to know algorithms relevant for a field to try to come up with improvements
I think stuff like merge sort was due to von Neumann in the very early days of computing
back then you didn't exactly have easy to use computers
But yes, a lot of CS is applied mathematics and problem solving. Doing things theoretically can be a lot easier in some ways, you can work a lot more abstractly, rather than if you just write code
True... Though even writing code is probably far easier than other fields like medical or physics and hardware 😛
far easier to experiment with that is 😛
it's for sure easier to verify that what you were thinking was correct
Yeah, during the pandemic I realized that man it would sure be nice if medical could just write a few unit tests on these solutions lol
the room for speculation there is definitely higher because testing is much more complex.
@haughty mountain you also brought up a good point in that I think that most "general purpose" algorithms and data structures are well-discovered... However, its the niche ones where real innovation can still happen.
As you try to use a general purpose one for your domain-specific problems, you eventually want/need to customize and/or alter it to fit there.
CPUs are really fast at doing whatever operations on data now. The problem is getting that data to place where the operations are being done. And it turns out that CPUs are so fast now that that is the main bottleneck for speed. So to help with this problem your CPU won't load one element from main memory at a time, it fetches a whole bunch expecting you to soon use those values. When the memory is contiguous it correctly fetches/pre-fetches the current element and the next few in the array.
Ah that’s right. I learned about that a while back, it’s cool to hear about it in this context.
In addition, if you access those again, it caches them locally on the CPU. So re-accessing them again is faster.
eg SIMD right
SIMD is like doing 4 float multiplications at once. It's something else.
So you want to optimize for the cache lines
(wide operations)
(separate from threading)
Yes, you can to optimize for cache friendly-ness. But first make sure you don't have some absurdly bad big O.
However, being cache-friendly can make a orders of magnitude difference, so it's a large constant speedup.
(And many arrays / things being operated on in a typical program are not massive, but rather a bunch of smaller things, so it really helps with that (they can all add up to make your program generally feel sluggish, but not any one in particular will stand out in the profiler))
(the stuff that does stand out in the profiler is probably where you end up with a leet-code style puzzle to solve to make it really good, and is worth the time because it's probably central to the program (many programs revolve around one or a few key data structures / algorithms and the rest is stuff like GUI, getting the data to work on, and such (the grind work of programming)))
(an exception to this is video games actually, since they kind of have a little bit of everything (all in one program) / can have anything in them (game engines are hard to make (good)))
right. It's like "ok now we need to work smarter not harder to speed this up"
I'm a security specialist, so this stuff is not where my focus typically is, but I do find it interesting and one reason why I've enjoyed working on algorithm challenges in my spare time.
There is doing less work (big O, compiled language (no extra work for the CPU to run an interpreter, etc), not adding a bunch of work in general for it to do when it does not need to) and doing that same work, but faster (cache, SIMD, compiled language again, etc).
And I can honestly say that modern, especially web-based stuff, tends to be horribly slow in my opinion
Like, I remember my 2003 computer being more responsive sometimes than some of these web apps, because I did so much more natively then and there were less bloated libs.
Like one practical example, is here on Discord when I hit @ I near-instantly get a menu. When I do the same in Google Chat on firefox, I literally can beat the app and it fails to actually tag someone
thats just sad
Well luckily, the first part of doing less work that involves not throwing a bunch of extra code at the problem makes it both faster, and less code involved (including the stuff you did not write, so any libs, etc) so it's generally less buggy and more secure / easier to understand.
Well sort of, so there is a curve to it.
Less code is better, but then after a certain point to get even faster it needs to get more complex again (the code).
But most programs are not near that point.
They are on the too much code side of more work than needed.
totally. We deal with this all the time in security. I do see some paralells between security and performance optimization
and performance optimization directly can relate to algorithm time/space complexities
small world. 😛
The fastest, most secure program in existence is the program that does nothing, it just exits. Anything that strays away from that is worse.
(But ofc we want it to do useful work, so your lower bound is whatever that work is)
Anyhow we can move this to #cybersecurity
Yeah.. I've even used this algorithmic thinking and seen patterns in other areas of work beside code. Like, I recently noticed that a lot of our process/procedure docs at work have a ton of unecessary complexity... Which in turn causes them to of course take more space up, but more importantly, be misunderstood or misintrepeted by team members, which then causes other issues... Good algorithms from what I've seen do the least worst in the most concise but understandable way. This has made me realize that I could apply the same thinking to clean up the docs.
and yes, willing to hop to #cybersecurity for that
<code> 64
/
32 96
/
16 48
</code>
You are doing a left rotate on 64 with 96 as the pivot.
<code> 64 96
/ \ /
32 96 --> 64 128
/ \ /
16 48 32 112
</code>
Does anyone know any clustering algorithms that use adjacency lists?
Good evening to everyone. Does anyone know about 1D Bin packing well? I have implemented the best fit and now I want to do worst fit. If anyone knows, please send me a pm. Thank you very much.
def diff_quad(data):
r_exp = -1
res = 0
for i in data:
r_exp = (r_exp[i] - data[i])**3
res[i] = r_exp
return res
import numpy as npy
print(diff_quad(npy.array([15, 11, 19, 5, 47])))
good day. this is my code and i need to debug it. its giving me a "TypeError: 'int' object is not subscriptable" for line 5. please help
You've defined res as an interger, 0, when it seems like from context you want to define it as an empty list.
Write a Python program that reads the N size of a list, N <= 20, then read the integers in the list.
As output the program should print the smallest and largest elements in the list, repectivamente.
could someone give me a hint of how to start this program?
For memory usage, would I be better off using pandas or readlines to read a 370k line CSV file and pull 284 lines off of it each day?
Looks like pandas has a thing to read certain lines from a file. Probably go with that https://stackoverflow.com/a/52152924/10241514 (don't .readlines() the whole file if it can be avoided)
Thanks a lot
is there any way to find a minimum spanning tree of a graph that connects all vertices?
can anyone help me
my friend gave me a challenge to get this data and try making it work and i dont know, he said get all of this data and do something with it. he left out the python code but showed me the output i dont know what to do. pls help
import pandas as p
def calculate_prevalence_rate_by_location(input_path, output_path):
df = p.read_csv(input_path)
if __name__ == '__main__':
input_path = "input.csv"
output_path = "output.csv"
calculate_prevalence_rate_by_location(input_path, output_path)
the py code ^
location_code,disease,prevalence,population
01,diabetes,1277,18063
02,diabetes,1344,11123
04,diabetes,603,11545
12,diabetes,1265,19140
06,diabetes,1943,11811
08,diabetes,584,9037
01,pneumonia,1482,18063
01,cancer,344,18063
04,cancer,351,11545
01,heart disease,1217,18063
02,heart disease,542,11123
04,heart disease,474,11545
12,cancer,1440,19140
02,cancer,1324,11123
05,heart disease,947,26111
06,heart disease,1979,11811
12,heart disease,1006,19140
06,pneumonia,1260,11811
08,pneumonia,1655,9037
the input.csv ^
location_code,cancer,diabetes,heart disease,pneumonia
01,1.9,7.07,6.74,8.2
02,11.9,12.08,4.87,0.0
04,3.04,5.22,4.11,0.0
05,0.0,0.0,3.63,0.0
06,0.0,16.45,16.76,10.67
08,0.0,6.46,0.0,18.31
12,7.52,6.61,5.26,0.0
the output.csv he gave me ^
and he has gave me some things to do with this data
Please complete the following Python function that:
(a) calculates the percentage of people who have each disease in each location, and
(b) writes the result to a specified CSV file
hi, could someone help me with a simple question that I'm stuck with asyntotic notation?
sure
but when i enter the list in python jupyter notebook
it dosent work
Did you replace the nulls with None? And also it won't directly be a tree you can pass to functions that expects a tree node object
they have code that can translate the list style tree to a tree node or vice versa
it's in a class, so you'd need to instantiate that first
s = Solution()
s.max_depth(someroot)
It will you just have to give it a TreeNode like the function expects
e.g. ```py
class TreeNode:
def init(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
root = TreeNode(1, None, TreeNode(2))``` for the [1,null,2] example
Hello friends. I'm making a program that takes the average of evens and odds. but when I make entries of negative odd numbers (-9,-7,-5,-3,-1) it gives the error of division by zero. Can anyone help?
par is 0 because there is no even number
probably do ```py
par = impar = 1
some_par = some_impar = 0
thanks
oh you should do this instead ```py
par = impar = n == 0
some_par = some_impar = 0
I am currently learning Django. I am building a system that gets input from the user as image and throws back the segmented images using UNet. I have the python files all caught up. But I am having difficulty doing the Django part.
How do I get an image as input from the user and run the UNet model behind it to display the segmented images?
Thanks in advance
the output of the incorrect.
ok wait changed
OK. thanks.
Try asking about this in #web-development
anyone know what to do?
for anything deeper than 2 level, i will have to continue adding eh? like [1, null, treenode(2, null, treenode(1, 2, 3))]
hi, guys whats the leetcode question given two lists, list_1 = [a, b, c, d, e] list_2 = [x, y, z, k] return all combination of length 4 containing elements from both lists
Hey I'm getting an error with my code when I try and run it through pgz. CMD is giving me an error on a line of code saying "list index is out of range". I'm new to coding and would appreciate any help I can get. (If you could DM me I can send you a sc of the code).
arr = [1,2,3,4]
print(sum(x > 0 for x in arr))
I am confuse how sum will return 4, does it add up all the True ?
yes, False == 0 and True == 1
sum(x for x in arr if x > 0)
if this is what you want
Man they need a feature like ‘[x for x in just finish writing my whole program for me]’ heh. In all seriousness though sometimes I start jamming more in there than I should
that is a list comprehension. EX:
[x for x in range(10)]. There are also dict and set comprehension. And generator expressions
Yeah it really depends on exactly what you’re after. There are certainly a lot of books. Some are more mathy/formal but explain them more thoroughly and the history behind them. Others are made to teach you as much as you need to know to just pass interview questions, etc..
@fiery cosmos maybe check out https://nostarch.com/algorithmic-thinking which has more practical hands-on examples, there is https://www.crackingthecodinginterview.com/ which is quite comprehensive actually, but I wouldnt say it goes super deep into each algorithm/data structure.
The first one uses Python which is a bonus. I think CCI uses Java for most of its examples
theres also a free one IIRC ill see if I can find it
Couldnt find it. Theres also this one: https://www.amazon.com/Algorithms-Adventurous-Bradford-Tuckfield/dp/1718500688/ref=asc_df_1718500688/?tag=hyprod-20&linkCode=df0&hvadid=459709175715&hvpos=&hvnetw=g&hvrand=11587691960020743844&hvpone=&hvptwo=&hvqmt=&hvdev=c&hvdvcmdl=&hvlocint=&hvlocphy=9031294&hvtargid=pla-918195320310&psc=1 , relatively newer it was published in 2021. Its more hands-on approach too.
so I am trying to edit a method in python and I cant seem to get it to work
from types import MethodType
class Person:
def __init__(self, age):
self.age = age
def multiply(self):
self.age = self.age * 1.1
return self.age
p1 = Person(36)
def tot(self):
return int(self.multiply)
p1.multiply = MethodType(tot, p1)
p1.multiply()```
TypeError: int() argument must be a string, a bytes-like object or a number, not 'method'
that function isn’t even part of the cosss
Class*
but regardless you are trying to pass a function to int() as it said
!d int
class int([x])``````py
class int(x, base=10)```
Return an integer object constructed from a number or string *x*, or return `0` if no arguments are given. If *x* defines `__int__()`, `int(x)` returns `x.__int__()`. If *x* defines `__index__()`, it returns `x.__index__()`. If *x* defines `__trunc__()`, it returns `x.__trunc__()`. For floating point numbers, this truncates towards zero.
!e
class Foo:
def __init__(self):
pass
def f(self):
return "Foo"
def func(self):
self.f()
print(func())
@rustic salmon :x: Your eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 11, in <module>
003 | TypeError: func() missing 1 required positional argument: 'self'
I’m surprised your code even runs… maybe I’m missing something
im trying to monkey patch it
I have a 3rd party library Im working with
found a workout though
while i know there are some EDI libararies out there already, anybody working on/interesting in collarborating with a NYSDOH EDI lib? AFAIK, there is nothing for NYSDOH specific EDI files. I've already built out a few libs for my company but would love to work in a group to open source more of them.
I know this may not be the most suitable chat to post this on, but it still relates. So i am trying to prepare for competitive programming competitions and I am wondering should I focus on one thing/theme/algorithm only until I master it sufficiantly or should I do everything slowly. ( I have a year until the competition). (I would rather do one theme until mastery because i hate doing things unfinished)
Follow a textbook.
Don't do the rubbish thing where you follow some online course or do random practice questions.
Buy a well regarded, appropriate algorithms textbook. Follow the textbook left to right. Do every second exercise.
Thanks a lot. I am going to do that
so, my programming teacher practically told us that laaksonen's book is god's gift to humanity so I am going to follow that one
Laaksonen is also involved in making the CSES problem set https://cses.fi/problemset/
I can recommend practising on those problems
does anyone have a good resource on implementing hash sets/maps from scratch? ddg and google results are full of w3schools and related blogspam
oh the wikipedia article looks good https://en.wikipedia.org/wiki/Hash_table. still taking suggestion though
https://www.geeksforgeeks.org/hash-map-in-python/
have u seen this?
no, that's helpful though
is CLR(S) a good resource for this? i think i have a pdf copy of that book somewhere
I remember watching a pretty nice video of someone implementing a hash-map in C 🤔
This was it: https://www.youtube.com/watch?v=2Ti5yvumFTU
Patreon ➤ https://www.patreon.com/jacobsorber
Courses ➤ https://jacobsorber.thinkific.com
Website ➤ https://www.jacobsorber.com
Merch ➤ https://merchonate.com/collections/jacob-sorber
Understanding and implementing a Hash Table (in C). //. What is a hash table, and how do I implement one? Hash tables are a fundamental data structure that all...
Also, I'm sure I watched a talk by Raymond Hettinger talking about the design decisions that went into Python's hash table implementation 🤔
I think it's called "modern python dictionaries" or something
cool, thank you @keen hearth
"Speaker: Raymond Hettinger
Python's dictionaries are stunningly good. Over the years,
many great ideas have combined together to produce the
modern implementation in Python 3.6.
This fun talk uses pictures and little bits of pure python
code to explain all of the key ideas and how they evolved
over time.
Includes newer features such as key-...
On the topic of hash tables, here are talks from the guy who worked on Google's hash tables that got open sourced a good while back:
https://www.youtube.com/watch?v=ncHmEUmJZf4
https://www.youtube.com/watch?v=JZE3_0qvrMg
http://CppCon.org
—
Presentation Slides, PDFs, Source Code and other presenter materials are available at: https://github.com/CppCon/CppCon2017
—
Hash tables consume a large volume of both compute resources and memory across Google's production system. The design for hash tables in C++ traces its origins to the SGI STL implementation from 20 yea...
http://CppCon.org
—
Discussion & Comments: https://www.reddit.com/r/cpp/
—
Presentation Slides, PDFs, Source Code and other presenter materials are available at: https://github.com/CppCon/CppCon2019
—
Two years ago we announced a new hashtable we were pushing out across Google. One year ago, we open sourced it as part of Abseil. This talk is a r...
This is also the table that was ported and used in languages like Rust
1 is not a valid variable name
was reading a solution on leetcode and it said
The DFS strategy can be further distinguished as preorder DFS, inorder DFS and postorder DFS, depending on the relative order of visit among the node itself and its child nodes.
I know what inorder preorder and postorder are, but is preorder DFS different from preorder BFS? it just seems weird thing to say unless I dont understand
@latent oak #discord-bots or #❓|how-to-get-help
dictionary = {
"Key1" : "1",
"Key2" : {
"a" : "2",
"b" : "3",
"c" : {
"d" : "3",
"e" : {
"" : "1"
}
}
}
}
The problem is to flatten a dictionary, make all nested keys top level keys.
But you can flatten a dictionary like this example in constant space can't you?
Because when you want to flatten say, Key2: "a": 2,
- You first add "Key2a": 2 to the original dictionary.
- Then delete Key2: "a": 2 from the dictionary. Then that memory is deallocated.
Have I got this wrong? All the answers I see online are people doing O(n) space solutions with big ole recursive call stacks.
Oh
the actual answer is the maximum space complexity will be the length of the most nested key of the dictionary
right, makes sense, tree space complexity you need to think about the size of your longest DFS.
I am expecting a huge amount of regexes (right now I am thinking something like 100 to 250 is where average uses will be), is there any way to optimize this?
Because worst-case scenario right now - that I get no matches - is quite bad since each letter is scanned at least n times (n is the amount of regexes) and I am sure that regex may scan it more than once.
Hmm, yeah I guess that as long as the regex is quick enough it'll be fine
I knew a bout cses and even solved some problems there though now i am wondering should I use "128 algorithms in python guide to competitive programming" or andri laaksonen's book. I think laaksonen's book is better but I don't code in c++
I don't know much about either book, I've at least looked at parts of Laaksonen's book and it seemed good. Hopefully most stuff should be valuable regardless of language
And getting used to C++ isn't a bad idea either
It's very useful in the cases where python is just too slow
yeah, i don't think it matters that much honestly, but if I start covering something I want choose carefully, and it is pretty hard for me to switch to c++ because I am already very good at python syntax and know almost all of the usefull mechanics, classes, tricks etc...
I skimmed through Laaksonen's book and yes, it uses C++ for all examples of code. Though I think most algorithm descriptions are more general
Hi! I have an algorithm that I need to find and I was wondering if this is something that has a proper name that I can do research into.
I have a target sequence of length n ["N", "E", "S", "W"]
This sequence is valid in any shift but NOT in any permutation. In this case, it'll be cardinal directions (North, East, South, West) and it can go clockwise or counterclockwise.
So ["E", "N", "S", "W"] is not okay because they just switched N and E but ["W", "N", "E", "S"] is okay because it's a shift to the right.
I also have an input sequence that will be a combination of those n terms and that sequence will be greater than n. Examples:
1. ["S", "S", "W", "S", "N", "E"]
2. ["N", "N", "E", "S", "W", "E"]
The goal is to determine, in any given input sequence how similar my input is to the shift (ultimately, the goal is to match the shift as well but I'd imagine simplifying this to have a score for just 1 input array and then just applying it to all shifts and outputing the highest score).
So in the first sample, it's closest to SWNE, with an extra "S" between the "W" and "N". The extra "S" would reduce the score, but it would be "removeable"
The second sample has an incorrect "E" at the end, otherwise it's most similar to "NESW"
My initial thought was removing duplicate letters (so first sequence just becomes "SWSNE") and then doing a Levenshtein distance between each shift of NESW. I think this would give okay first results, but I'm not sure if there is a more proper algorithm or if this kind of thing has a more proper name. Does this sound familiar to anyone? Thank you!
It feels like this thing is underspecified, what does similarity mean?
You gave deduping+levenshtein distance as an example, but is part of the problem to find out what is a good measure of similarity?
Yeah the similarity metric isn't super clear. Ultimately, the goal is to see if the sequence is closer to going NESW, ESWN, SWNE, or WNES and it's got a bunch of inputs
The scoring is just my way of trying to address the problem, there might be a more proper way to go about it
what's the big picture?
NESW kinda makes it look you're doing something geometrical
so the string problem is just one way of trying to model it
Yeah, I think that's the closest representation of the proble
so what's the actual output you want from a solution as a whole?
The actual output should be which direction is it starting and which way is it going. So in a spin, you can't jump from S to N, you gotta go through E or W first. So if I get readings of:
1. ["S", "S", "W", "S", "N", "E"] then the output should be "SWNE".
Fixing the "S" being wrong will be addressed later, but I think just finding the direction seems like the tricky first step
We also know it's one direction, so you can't go from counter clockwise to clockwise and vise versa
My issue with the Levenstein distance is that it doesn't quite measure adjacent similarity. Is there another metric that takes that into account?
Would it be useful to convert the direction to integers mod 4? E.g. let N = 0, W = 1, S = 2, E = 3
["S", "S", "W", "S", "N", "E"]
[2, 2, 1, 2, 0, 3]
```then you could compute the closest distances (though completely opposite directions will always be iffy
[0, -1, 1, +-2, -1]
in this case it's overall more negative, which matches the clock-wise (NESW) direction
This thing reminds me of https://en.wikipedia.org/wiki/Winding_number in complex analysis
In mathematics, the winding number or winding index of a closed curve in the plane around a given point is an integer representing the total number of times that curve travels counterclockwise around the point, i.e., the curve's number of turns. The winding number depends on the orientation of the curve, and it is negative if the curve travels ...
Hmm I like that a lot. That's a lot more specific than Levenstein distance and you retain the ordinality. And awesome, I'll check it out. Thank you!
I think an issue is that it not quite cyclical. If you do E to N using this, it becomes very negative (if using directional). Is there a mathy or algorithmic way to represent the cycle without having to iterate over different starting positions?
If I just add the missing element to the differences E -> S (3 -> 2) I have something that is independent of starting point
[0, -1, 1, +-2, -1, -1]
So if we take take 3 to 0 (end to start), it becomes -3 right? But it shouldn't since it loops around. Or is there a logic I'm missing
Are you talking about the direction or the starting point here?
And for the distances, pick the value that has a smaller absolute value mod 4, so instead of -3 you would pick 1
if you go the other direction (right to left in the list) then all signs should just flip
Mmm gotcha and then you combine distance with directional to get a proper value. That makes sense. Lemme see how that goes, tyty
but as said, a difference of +-2 is weird
probably best to ignore that value
i.e. treat it as a zero
I think the 2 is still important though. That should still be a significant loss compared to a distance of 1, which means they're adjacent
hi, I have a more "computer science" question rather then a python question,
lets say (hypothetically) I ask this question in an exam. (a guided radix sort question)
- make a function that returns the nth digit of a number
- make a function that takes an array of numbers and a digit, and sorts the numbers in the array into 10 arrays such that each number is in the array corresponding to the nth digit.
- make a function that gets an array of numbers/ints and sorts it
if I ask about the efficiency of the last function would you say it is N*K where N is the array's length and K is the longest number's length or N because ints have a maximum of 14 digits and as 14 is a constant K can be discarded?
(I plan to accept both but me and the head teacher disagree on the more natural way to write it)
I mean, how would you interpret the 2?
say the last difference was 1 (CCW) and you get a 2
I thought 2 would be obtained for N to S or E to W
Which means its missing a coordinate (from lack of reading) or there is a wrong coordinate (from incorrect reading)
all these give 2: NS SN EW WE
I would note that in, say, Python, ints are unbounded in length, so K can be arbitrarily high. So if it's meant to be truly language-agnostic, it can't be N unless you explicitly note in the problem the integers are of specific width.
oh right, I forgot that in python ints are also objects
frame the question as K bit numbers and there is no ambiguity
Right, so the 2 would be a greater indicator of an incorrect or missing reading
As opposed to only directional
or the last value was wrong
good idea ty
Right, but if the values prior were 1s and 0s, it means it was most likely flowing in that direction. If there is a sudden 2, it means that most likely the value 2 was incorrect. If the previous value was a 0 and then 2, its possible that the 0 should've been the intermediate point. Either way, it isolates the transition which has an error reading.
It kinda comes down to what kind of errors you expect in the data
if there are few errors then I think just throwing away the 2s are fine
say the proper direction is 1 (CCW) and then you hit an error, worst case you get one erroneous -1
but if there are few errors the 1s should win out anyway