#algos-and-data-structs
1 messages · Page 5 of 1
what's the relation between k and n?
ugh sorry
T(2^n) = T(2^(n-1)) + n
are you leaving the n exponent on the left alone and assuming that it is equal to c*2^(n-1)^2 or similar? that type of thing?
im just trying to see it in a different form bc as is now, i'll never know if you are setting the LHS = T(n-1), since the LHS and RHS are different versions of the same thing
and we're typically working with n-1 then prove for all n
so, if we were to attack this thing in the same manner we would take some guess at what the answer is and check if it works or not
though I think having the exponentials will be more confusing
we could look at some simple recurrence
T(n) = T(n-1) + 1
(assume T(0) = 0)
what's a reasonable guess for what T(n) is?
ok cool
let's make our guess T(k) <= ck
(I use k here to not confuse it with n later, but it's just a function parameter)
(fixed expression)
ok, the recurrence is
T(n) = T(n-1) + 1
we can make use of our guess
with k = n-1
T(n) <= c(k-1)+k
yep
ck-c+k
welll we are asking, is ck less than or equal to ck minus constant c plus variable k
no wait
so its not exactly intuitive
oh i was still working with my form ok.
so we have T(n) <= c(k-1)+1
= ck-c+1
so now we are asking, is ck <= ck-c+1?
we are not doing that, we are deriving a new expression for T(n)
ok so rn T(n) = ck-c+1?
T(n) = ck-c+1 <= ck
err, T(n) <= ck-c+1, but yeah
I should really read properly before saying yes...
well you said its the new expression for so i assumed =
ok so we ahve T(n) <=ck-c+1 <= ck
new bound would have been a better wording probably 
ok
well, we don't have the <= ck yet
we want to show that it's <= ck
what you wrote is our end goal
so...is this true? ck - c + 1 <= ck
ok ok but what are we asking ourselves at this step, eg "is ck-c+1 less than ck"
ok uh
wait, why are we mixing n and k
we're at
T(n)
= T(n-1) + 1
<= c(n-1) + 1
= cn - c + 1
yeah, we should have n in this expression
my mistake for not noticing and pointing it out earlier
the k is just there in the guess as a parameter
you just used k for n its no bd
but yeah, if c>=1 the statement is true
for all n?
we don't actually need a bound for n in this case to make it work, so yeah technically true for all n
just tried c=2 and n=5 and its good
we can let our n0 be whatever
in some cases we do need to have some higher n0 to make things work
soo have we shown that T(n)=T(n-1)+1 is O(n)?
basically yeah
ok so how were we able to avoid the whole n and n+1 thing, there was no induction?
is substitution always synonymous with induction?
we didn't avoid it? and there was an induction step 
i mean we didn't do any assume its true for n and show its true for n+1 type stuff
we proved stuff about T(n) given stuff about T(n-1)
we just guessed and substituted and checked
wat
if we were doing n and n+1 expressions, we'd have ended up with like RHS of T(n-1) and T((n+1-1))
or left of T(n) and T(n+1)
we never did that here. we simply had T(n) is equal to T(n-1) (some other level of the recursion) + a constant
T(n+1)
= T(n) + 1
<= cn + 1
= c(n+1) - c + 1
```which is `<= c(n+1)` if `c>=1`
same argument but n -> n+1
then the n-1 form would look like T(n-1) = (n-1)+2-(n-1)
sure, it's just a regular function
soo you're essentially allowing the n-1 form to be handled by the recurrence? this really isn't making sense
T(n) = T(n-1)+n, the n-1 form would be T(n-1) = T(n-1)-1)+(n-1)
we didn't go through any of that
oh you wouldn't plug it in for the +n
bc that's not part of the function
not the recursive part anyway
but you see what im saying
why is the n-1 form of the recurrence relevant? the point of the substitution is to not have to dive into the recurrence part
you need n and n+1 or n-1 and n to have distinct assumption vs proof parts
distinct hypothesis vs. showing
which are separate from base case
base case is like T(0), assumption is like T(n-1) or T(n), proof is T(n) or T(n+1)
we do have T(n) and T(n-1) though
both are in T(n) = T(n-1) + 1
our assumption comes from our guess at the formula, which is that T(n-1) <= c(n-1)
(and the base case is basically that T(constant) = constant, not too interesting)
Hi - I need to display data flowing through a network over time. the network is defined as nodes - similar to a graph chart. the data is flowing over time so a time axis is required. Is there a chart type that is used to display such data? see attached how I imagine it should look. If there is nothing like that - is there a python plot lib that can present such a data set? thanks!
hmm i guess im not seeing how you can show both T(n-1) and T(n) forms or both T(n) and T(n+1) forms in the same formula
it's the same in regular induction though
e.g. S(n) = 1 + 2 + ... + n
and we prove stuff about S(n) = S(n-1) + n
given some expression for S(n-1)
e.g. our guess is that S(k) = k(k+1)/2
S(n) = S(n-1) + n
```use guess, with k = n-1
= (n-1)n/2 + n
= (n+1)n/2
the principles are the same
You could make an animation using matplotlib
Or even add a slider and make an interactive plot where you can adjust the time using the slider
Thanks! actually in the diagram I sent the X is the time - so I can see the entire time in one image, which I prefer over animation. I am guessing there might be such a chart or library because I'm probably not the first to try and visualize it.
Using matplot lib I had a hard time resolving how to fork out and merge using the library, and how I should structure the dataset for this.
Worst comes to worst I'll invent a dataset from scratch and plot on HTML canvas - but I am sure there is a better way
Matplotlib can plot circles and lines too, so if you have f.e. a list with all the nodes, their location (x-axis: letter, y-axis: time) and what nodes are connected, you could simply plot that
I see, ok that is already very good to know. thanks!
hi all, i am trying to find the area of regions. I have the coordinates of all the zipcodes but not sure where to start with boundary detection & calculating areas
for example, i have coordinates of all 400+ zipcodes of seattle, but how do i actually start to calculate the area? are there certain libraries i should use?
how can I define error, like for an error bar, if I don't really know what the error is? for instance, in google sheets, you can tick a box called "error bars" and it'll just generate them for you automatically
hello everyone, i have a business idea and I'm looking for someone who knows web developing, java script and wanna develop and work with me together. Send me a pm if you're interested. Best Oleg
I have a graph of connected subgraphs. Need to find shortest path between two points that may be on separate subcomponents. In that case I need to find edge to connect the subgraphs in a way to complete the shortest path. Any edge added cannot cross another edge, and finding shortest edge to connect the components wont work since they can be closer in an area that shouldn't be used in the path.
What I can think of is use find connected component algo then write something to interconnect them in every legal way (not crossing other edges etc) then I can use that in dijkstras etc. But before I spent a weak writing algo to interconnect them in every way that doesn't cross an edge which seems like it would run slow is there some algo for this, or better idea to do what I want?
Edit: thinking how to write that. For each subgraph, for each point(A) inside - for each subgraph not current subgraph, find nearest point(B) to point(A) that doesn't cross another edge - Add that edge to a list, None may exist between two sub-graphs. This way at best I am only adding one edge for every point - should be much lower then that so resulting graph is not overloaded with edges. Not sure how fast checking if edges are crossing allot of times is going to be.
Edit2: Thinking about it more I can see how only the best for each point could lead to non-ideal path. Luckily for these graphs I think all the nearest will always be ideal.
So I'm trying to implement a Sequential model to Detect the likelihood of someone getting liver cirrhosis given some readings and I keep running into this error while trying to train the model:
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'dict'> input: {'N_Days': <tf.Tensor 'IteratorGetNext:11' shape=(None,) dtype=int64>, 'Status': <tf.Tensor 'IteratorGetNext:18' shape=(None,) dtype=string>, 'Drug': <tf.Tensor 'IteratorGetNext:8' shape=(None,) dtype=string>, 'Age': <tf.Tensor 'IteratorGetNext:0' shape=(None,) dtype=int64>,
I just wanna make sure I'm doing it right or make it work for starters.
is it good to use log base as the 2 ??
Did it slightly different than I said. Extended all the edges and found intersections and than used those to find shortest dist to points in connected component.
school has started doing algorithm competitions, I am somewhat new to algorithms. Its also enjoyable for me to learn. What do you guys recommend is the fastest way to learn multiple algorithms quick and get the hang of it. Doing leetcode is the obvious solution but any other recommendations?
I'd suggest doing extra online competitions on your own, for example the codeforces ones (they have several divisions, so it should be possible even for a beginner to solve some problems in the lower division) and then, importantly, looking at the postmortem (or however they call it) of each competition you do. There, they explain the ways to solve each problem.
I want to start learning algos and data structures for cp but have been told to learn discrete mathematics beforehand. Is discrete math necessary?
arguably dsa is discrete math
oh ok great, I didn’t want to go through a discrete math yt playlist
Ill just dive in
not really, discrete math might show up, but it's not like it's a prerequisite to know it
much like you don't need number theory for most dsa work, but you'll want to know it for stuff involving primes and similar
I mean, technically it's all just reasoning about formal discrete structures
There is a lot of stuff in my discrete math book that’ll be helpful and they recommend it as a prereq at my uni but it’s just that recommended not required
data structures was required, and i took a data structures in python cert on EdX*, which im realizing now must have been wildly different than an actual uni data structures course
😬
should i read the "techniques of counting" chapter or can i skip this
yeah that sounds about right. it's fairly useful
aka basic combinatorics, extremely useful
Do I use the help chat for help or this area
Looking for help in this chat but will move to help-pineapple for a solution
the book says one may represent a graph as an adjacency matrix or as adjacency structure of linked lists, which do y'all use?
I usually use a dictionary mapping a vertex's count to the list of all vertices it has edges to
Sometimes adjacency matrix, but only if I use it as a matrix (e.g. to square it, which results in a matrix that is related to the graph of all vertices with a path of length 2 between them)
i am familiar with a pythonic dictionary, so might you mean something that looks like this: py {"V1":"V2,V4","V2":"V3,V4,V5"}
something like that? wdym the vertex's count?
any reason you don't use the vertex name itself? do you maintain a different array with the vertices? i understand that any graph G is comprised of two sets, one of V(vertices or nodes) and E(edges)
oh i see in this setup the set V of vertices is V = {0,1,2,etc}
Hmm, I suppose it doesn't matter if you use strings to specify vertices, sure
(though one optimization which is done sometimes for performance is to replace the dict with just an array of possibly-empty lists, in which case you won't have the option of using string names)
yeah, I tend to use an adjacency list
conn = [[] for _ in range(n_vertices)]
for a, b in edges:
conn[a].append(b)
conn[b].append(a) # if undirected
adjacency matrix is rarely good unless you really need it as a matrix
it's just a waste of space
and iterating over neighbors is slower in matrix form
hmm i'm trying to see how to implement the adjacency list structure for graphs in schaums discrete math outline, looks like i'll first have to make a linked list
they do a vertex file and an edge file
Does not need to use a linked list.
it says they are in the particular book/example im on
Depends on the matrix.
this may? be a waste of time for me to try to implement especially if its going to be a different data structure or algorithm in CLRS.. i just really want to learn DFS and BFS
ConfusedReptile's dict method is an implementation of adjacency lists that is really nice because your vertices can be any hashable object.
so when they talk about a vertex file and edge file.. i'd instead just use two dictionaries?
If you want to loop over all edges you can.
If you have {0: [1, 3], 1: [2, 3, 4]}, then you know you have the edges 0-1, 0-3, 1-2, 1-3, 1-4.
yeah i'm able to visualize. just dont know if i should try to implement this version in this book or some CLRS version
The linked list version?
idk what version is in CLRS. the book i'm in now yes they claim G=V,E, you should use one linked list to represent {V} and one for {E}
An adjacency list? Or just as that?
Vertex File:
the vertex file will contain a list of vertices of the graph G usually maintained by an array or linked list. each record of the vertex file will have the form: (vertex, nextvertex, pointer)
then the edge file contains the adjacency lists in the form (edge, adjacency, next)
Oh, yeah that is just a list of all vertices and and adjacency list for edges.
They can be both in one.
should i start w the dict structure or class structure
can you make a class structure and specify of which type it must be?
Which implementation of an adjacency list do you want? Linked list, where each node points to a linked list? List in which element is a linked list? List in which each element is a list? Dict in which each value is a linked list? Dict in which each value is a list?
IMO the last one is best.
You can make a class that wraps any one of these / has one of these inside it and you interact with it via its methods.
From the class user's POV it could be whatever inside, the class creator could swap out the implementation.
ok how can i use a dict that has different number of elements for each value? when a node has more than one adjacent node, i put them in as tuples
i can use .get() to return the adjacent nodes given a key node
but given the data the values will have different numbers of elements
ohh i just noticed your each value is a list comment. i'll revise
hmm
trying to generate the list of all edges given each key
aha! i figured out how to return all keys. getting somewhere
the dict approach is nice, and in the case you have vertices 0, ..., n-1 you can use a list instead of a dict
(and you can always map your vertices to integers 0, ..., n-1 if you feel like it)
The dict approach happens to be the one Guido uses. Makes sense since it's easy to do in Python.
how would i return each edge in the form of (V1,V2) given the following format:
V = {"A":["B","D"],"B":["A","C","D"],"C":["B"],"D":["A","B"],"E":[]}```
so that's an adjacency list
for u, neighs in V.items():
for v in neighs:
# (u, v) is an edge
in practice in algos you often have a node and you need to know its neighbors, in which case
for neigh in V[node]:
# do something with neigh...
ok sweet! looks like I am returning some edges twice as (A,B),(B,A)
If it's a directed graph it's not a duplicate.
right, if you care about unique edges of an undirected graph add an if u < v:
in an adjacency list you basically model an undirected edge between A and B as two directed edges A->B and B->A
^To see why, look at an adjacency matrix, notice the mirror.
though to be fair, I think there rarely is a reason to convert an adjacency list to a list of edges
don't you need edges to do DFS and BFS?
usually you have the edges already and create an adjecency list from it
no
you just need 
given a node you can find its neighbors easily
Moving from one node to another via its neighbor list.
yeah
oh i see
More common than viewing the entire thing all at once. Like all edges.
so, do i need a node class for this implementation of the dictionary based approach for adjacency lists
or can i just treat every key as a node and every value as a node
and if so how do i give them certain attributes
Every key is a node and every value is that node's neighbors.
right
e.g. here is a bfs traversal
# let conn be an adj list
Q = deque([start])
visited = set()
while Q:
cur = Q.popleft()
if cur in visited: continue
visited.add(cur)
for neigh in conn[cur]:
Q.append(neigh)
Or "vertex" instead of node if you prefer.
as I typically write it
must i import deque
or a dfs
# let conn be an adj list
Q = [start]
visited = set()
while Q:
cur = Q.pop()
if cur in visited: continue
visited.add(cur)
for neigh in conn[cur]:
Q.append(neigh)
```pretty much the same code
from collections import deque
(untested code, but I've written these so many times I would be surprised if I made some big error)
is start the first key in the adjacency list?
and you make make it into a "Dijkstra-order" if you use a proriority queue as the container, the overall code structure is the same
i.e. dfs, bfs, and Dijkstra are basically the same thing, but using a stack, queue, and priority queue
what does your cur stand for
You can start at any node you want.
whatever starting node you want
current node
Current in the POV of hopping from one node to another through its neighbor list.
Well, you are pushing the hopping off for later. Um, how to put this?
don't these search methods require weighted edges?
I just think of it as I process nodes in some order defined by the data structure, I pop the next one to be processed and that's my "current node"
you can easily add weighted edges
If you have weights you may use that to direct the search. These are unguided.
Weights can be used for cost, or as positives (think following bread crumbs).
it's a fair point that the Dijksta-order is silly for an unweighted graph
for a weighted graph I would either store weights in the adjacency list (so store a (neighbor, weight) tuple rather than just neighbor)
or store the weights in a dict, keyed by node pairs
i.e. W[u, v] is the weight of the edge from u to v
either approach is fine
looks like they implement a BFS to begin with using a stack
and 3 different states for each node: ready, waiting, and processed
When you program actual projects / problems you will find which implementations of graphs and such you prefer / are best, so though there many seem to be many options you will probably have a few goto-s.
this book literally does not say the purpose of a DFS or BFS
i'm assuming its to find certain paths...
dfs and bfs is at its core just about traversing a graph
the order you traverse nodes in have different properties
e.g. in an unweighed graph the bfs explores nodes in distance order
Yeah it can be.
so this implementation of DFS has me initializing nodes to given states, how might i do that given the adjacency list i have so far, must i make a node class?
You can have a node class or just a separate list.
maybe i should rewrite my adjacency list using nodes
wdym?
like i need to give each node a 'state' and i'm not familiar enough with dictionaries to just modify the one i have
i want to make a Node class that has like value, neighbors, status
you can have another dict with the state for each node
how could i write a node class to take multiple neighbors? here's what I have:
class Node:
def __init__(self)
def status(self):
self.status = status
def value(self):
self.value = value
def neigh(self):
self.neigh = neigh
(or list in my preferred approach with vertices 0, ..., n-1)
[node1, node2, node3, node4, ...]
I would just do
state = {v: default_state for v in conn}
or if list based approach
state = [default_state]*n
class Node:
def __init__(self, value, neighbors, status):
self.value = value
self.neighbors = neighbors
self.status = status
imo I rather have multiple lists/whatever than cram everything into one
it's easy to add/remove features you need/don't need
You can also do: ```py
values = [value1, value2, value3, ...]
neighbors = [[...], [...], [...], ...]
states = [start_state, start_state, start_state, ...]

This is known as array of structs vs struct of arrays.
AoS vs SoA.
Or single array vs parallel arrays.
yeah, it has a bunch of impact on data locality
not that you care much about that in python
Note that you can put all those parallel arrays into one class called Graph if you want. You can always make classes to bunch things together.
It's best to do so if you find yourself passing some items together into functions all the time.
Like if you find yourself passing around values, neighbors, states together all the time, make them a class.
that's a good point, you can always wrap stuff in a nice class with nice methods to interface with it, the underlying implementation can be whatever, the user shouldn't need to care
ok ok so i've reformulated my adjacency array as these 3 lists:
values = ['A','B','C','D','E']
neighbors = [['B','D'],['A','C','D'],['B'],['A','B'],None]
states = [1,1,1,1,1]
(I tend not to in programming contest style stuff because I often might need to change the internals)
the neighbors thing should be a dict in your case
Yea that is why you don't do it until you find yourself actually passing it around together a whole bunch and that pattern seems to hold up.
and I would recommend generating those based on a list of edges
As something is repeated a bunch (without change), you can solidify it in code and make it more nice to use.
i think its fine so long as node 0's neighbors are in neighbors[0]
but you are using strings as node labels in the neighbors list
how do I know what nodes are neighbors of say 'C'?
right, either use indices in which case you can use such a list, or use a dict
Basically your nodes are indices, their data is stored in those parallel lists.
like this?:
neighbors = {0:['B','D'],1:['A','C','D'],2:['B'],3:['A','B'],4:[]]```
or do you mean the keys equal to the node values
you want to map index->index, or label->label
e.g. edges [(0,1), (0, 2)] would produce
[[1, 2], [0], [0]]
```or
```py
{0: [1, 2], 1: [0], 2: [0]}
the latter variant could be used for non-int labels as well
e.g.
{'A': ['B', 'C'], 'B': ['A'], , 'C': ['A']}
now my states is indexed differently than my neighbors though
values = ['A','B','C','D','E']
neighbors = {'A':['B','D'],'B':['A','C','D'],'C':['B'],'D':['A','B'],'E':[]}
states = [1,1,1,1,1]
either all dicts based on labels or all lists based on indices
ok ok
I prefer just indices, if you want the nodes to have labels you can have a label array that is parallel to the rest, or as you name it, values.
i think i prefer indices as well, but there was something y'all didn't like about that
If you want to lookup a node by label, you can have a dict that maps from label to index.
values = ['A','B','C','D','E']
neighbors = {'A':['B','D'],'B':['A','C','D'],'C':['B'],'D':['A','B'],'E':[]}
states = {'A':1,'B':1,'C':1,'D':1,'E':1}
assuming I mapped ABCDE to integers 01234, I would write your example
n = 5
edges = [(0, 1), (0, 3), (1, 2), (1, 3)]
neighbors = [[] for _ in range(n)]
for a, b in edges:
neighbors[a].append(b)
neighbors[b].append(a)
states = [1]*n
no, it was that you mixed indices and labels
you use one or the other
even if I'm given labels as strings or whatever I tend to map it to indices because lists are nicer to work with imo
values = [value1, value2, value3, ...]
neighbors = [[...], [...], [...], ...]
states = [start_state, start_state, start_state, ...]
``` Is what I would do, and just leave it at that.
that plus some logic to map values to indices (and back, if needed)
If you want fast lookup of a node by label you add an indexing method via a dict: ```py
primary_key = {"A": 0, "B": 1, ...}
(If you can see it, you know that this is what a database does, the values, neighbors, and states are columns in a table)
lets table this for now y'all, and pick up tomorrow
if i sit for any longer im going to die
and you can generate this mapping easily
i thought i follow this example and it didnt sit well with you guys
That example is good, final answer. The rest is just extra if you want to look up nodes by value and not by index (in better than O(n)) / want more complicated stuff.
something like
indices = {}
def index(u):
return indices.setdefault(u, len(indices))
edges_with_indices = [(index(a), index(b)) for a, b in edges]
no, neighbors here maps indices to labels, that's what we had issues with
Perhaps an example neighbors: [[1, 4, 3], [2, 3], [0], [], [3, 4], ...]
yeah im on it
and as said, I really recommend just writing a list of edges and generating everything else
it's easier than editing many different lists all over
Yea this is the storage form, but it may not be the best human-editable form (crafting by hand the 3 lists).
as an example, this
you can also add some mapping step as I mentioned 
and after than you can just work with indices
this def must be inside a function or a node class?
no, it can just be a local function, it's just so I don't have to repeat the setdefault logic twice
ok i'm going to have to study this
i've printed out indices and edges_with_indices so its definitely working
oh i think you mean vertices
!e
edges = [('A', 'B'), ('A', 'D'), ('B', 'C'), ('B', 'D')]
indices = {}
def index(u):
return indices.setdefault(u, len(indices))
edges_with_indices = [(index(a), index(b)) for a, b in edges]
print('mapping', indices)
print('mapped edges', edges)
n = len(indices)
neighbors = [[] for _ in range(n)]
for a, b in edges_with_indices:
neighbors[a].append(b)
neighbors[b].append(a)
print('neighbors')
for i, neighs in enumerate(neighbors):
print(i, neighs)
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | mapping {'A': 0, 'B': 1, 'D': 2, 'C': 3}
002 | mapped edges [('A', 'B'), ('A', 'D'), ('B', 'C'), ('B', 'D')]
003 | neighbors
004 | 0 [1, 2]
005 | 1 [0, 3, 2]
006 | 2 [0, 1]
007 | 3 [1]
a full example
where?
.
right ok
hm i'd want to map in alphabetical order but we're mapping D to 2 and C to 3
does it matter?
if you want to you can list the vertices and generate the mapping from that
ok
!e
vertex_labels = ['A', 'B', 'C', 'D', 'E']
indices = {label: i for i, label in enumerate(vertex_labels)}
print(indices)
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
{'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4}
so essentially what y'all are saying is forget their implementation in the discrete math book and use multiple lists to represent the vertices, neighbors, states
I personally like the multiple list approach better yeah, though there are many ways to model graphs
all approaches should end up functionally mostly equivalent
The book's way is really bad. If they use linked lists where each node contains only a single number and not some big chunk of memory you can be sure it's bad.
linked list is fine in theory, in practice they are kinda annoying
Linked lists can be nice for implementation purposes in a few niche cases, but in Python you already have lists and such, so it's much easier to just use them.
it's simple and gets the point across
a lot of the time an algo mentions linked list I just replace it with list in python or vector in C++
unless there is a good reason to use linked lists
which sometimes is the case
Old books use linked lists because they used to make more sense when memory speed was not the bottleneck and implementing dynamic arrays was a pain in assembly.
linked lists are easy to define
a dynamic array is more annoying to define in a mathy way
omg i think this book was 1976
LISP and LISP ideas were also big in CS back then, so linked lists were everywhere.
(hardware had specific assembly instructions for LISP (car, cdr))
ultimately those details don't matter so much, understand what they are trying to achieve and translate it to appropriate structures in your preferred language
Liked lists -> lists in the case of Python.
unless there are strong reasons for a particular linked list
e.g. implementing a queue
(you can do a cyclic queue with a list, but you get the point)
with a linked list popping from both ends is O(1)
with a list popping from the front is O(n)
so understand the intent, translate as appropriate
granted, the "translate as appropriate" is not obvious until you get used to things
(An example of a linked list being useful is if each node is big, especially if each node is a page of memory, then you can page in and out nodes, and you are basically linking memory pages to each other, this comes up when you want a dynamic memory arena for region based memory management and/or want chunking for something like a space-partitioned simulation)
lol yes, not at all
doesn't help that im trying to learn discrete math, data structures, and relearn algebra all at once
lol, way too involved at this point 😛
i think y'all need to dumb down a lot of the stuff you talk about for me, or at least recall that i'm very new to programming
Yeah, but I guess it also shows that these days it's niche.
what about for collision resolution in a hashmap, would you use linked lists?
that's a case where it might make sense
i think thats the recommended strategy in CLRS
because deletion is cheap when you get to the element
i need to understand graph implementations first though 😂
Hard to say, probably not. Because how many collisions are you going to have?
O(n) deleting is basically nothing if the list is small.
well clearly this is split opinion
chaining does incur an extra cost for the pointers
in other langs there are other advantages to using a linked list there, e.g. if I have a pointer to an element in a hash map and I insert a value, is that pointer still valid? (assuming no rehash happens)
If the linked nodes are on the system heap, de-allocating them (a single one) can be much slower than an O(n) deletion in a small list (a small shift).
with a linked list the pointer is valid
with a dynamic array it might not be because resizing
Yeah, but generally good code does not have pointers hanging around for later like that, unless you really need the extra bit of speed.
but I think this is probably way to detailed stuff than what's helpful
It's probably premature optimization.
something something C++ iterators
Handles.
You use a handle to get a temporary pointer from a system to the resource.
You don't store the pointer for later because it may become invalid.
It's temporary.
Iterators (some) do this, but more specifically for a data structure, not a whole system.
Similar idea.
my tl;dr: A fine mental model for a hashmap is an array of some kind of list for collision resolution. There are other ways but this mental model gets the important points about hash tables across
that's probably better tbh, i just think of an array that magically makes collisions go away into a different slot by open addressing
iterators on std::vector notably do invalidate iterators on resize
right, the overall ideas stay the same, hash to find a slot, resolve collisions somehow
yeah, it's just with chaining you know where the elements go on collision lol
Yeah, that is up to the class implementer, from user POV it's just magic.
You want to know what not to do too.
Hopefully the class encapsulates it enough in that way, but this can't always be done perfectly.
And also the user should know the general properties such as big O of ops.
Go for it. Interrupt us.
yeah, the list for collisions mental model accomplishes that and is easy to teach, flat hash maps are a lot more intricate
can we upload photo here?
This is a graph, So we can think like edges can carry this amount of number, our aim to maximize to total weight carried by edges to C
max flow?
for example from a to c answer is 12
yes max flow
is it simple problem?
i was thinking like last 2 hours wth drwaings but cant find anything
it's a well known problem, the algos for solving them gets pretty involved though
i need to find exact result
not greedy aproach and my test data is very big
what is "very big"
there is a long list of possible algos on the wiki page https://en.wikipedia.org/wiki/Maximum_flow_problem
In optimization theory, maximum flow problems involve finding a feasible flow through a flow network that obtains the maximum possible flow rate.
The maximum flow problem can be seen as a special case of more complex network flow problems, such as the circulation problem. The maximum value of an s-t flow (i.e., flow from source s to sink t) is e...
thank you dude
ill check all of them
i like this problem very much
if the max flow is not large ford-fulkerson is a fine one
Dinic's algo and push relabel are other common ones, with different trade-offs performance-wise
so if complexity is v'square
though maybe you just want to try to use some existing library? networkx module should have some max flow algos implemented
ill check complexiteis
yea i can try
but i like the problem so i wanna undertsand solution
right, you'll have to check what algo seems like a good fit based on number of vertices, number of edges, and in the case of ford-fulkerson even the size of the answer
I would say flow algorithms get pretty intricate, they aren't the easiest thing to wrap your head around, though it's possible for sure
and the easiest way to represent a graph is with separate lists
having separate lists for different things is what I prefer at least
probably because I've done a lot of problems where I need to tack on extra data per node to compute something, so just adding a new list is simple to do
the indices in our solution was a dictionary though
in my thing?
that's just the mapping used to convert labels to indices
ok
"indices" is more like "labels_to_indices_map".
the actual graph representation is just lists

(I love long variable and function names, I use sentences)
(in my own code I might call it just mapping)
(or even M, because why not)
it's also useful to make an inverse mapping from indices to labels if you want to print node labels to the user
When programming for small programs I use single letters, but for actual projects long names, because nobody including yourself will know what was meant (you will, but there is some probability you won't and in a large project that adds up).
(After you come back to the code in a couple months or years)
(Although I don't use Python for projects that need to be that big and last that long)
well yeah, I don't write stuff that unreadable if I expect others to read it
though I try to keep stuff as short as I can while still being clear
having huge variable names gets very noisy very quickly
what is this setdefault method all about
Yeah, but at some point I found that what was clear to me was not to others and so I just error on the side of too long now. IDK, it's hard to say.
Often names can be shortened by context though. Like if it's in a class with some name, that name gives you additional information to guess what is meant.
(And documentation, if there is any)
basically these behave the same
return my_dict.setdefault(key, value)
if key not in my_dict:
my_dict[key] = value
return my_dict[key]
if key is in the dict, return the value
if it's not, insert the given value and return it
the second approach does 2-3 lookups into the dict, the setdefault might get by with less
(and the second is longer and more logic to write)
Too in depth but... there are some cases where you still want the former. If value is computed and expensive to compute setdefault always computes it, the version with if doesn't
Some languages has versions that handle this kind of laziness, e.g. you give a function that computes the value rather than the computed value. Afaik python doesn't have this
return my_dict[key] if key in my_dict else my_dict.setdefault(key, value)
``` what about doing this
1 expression
but many lookups into my_dict
if only there was a 1 lookup and 1 expression version
and doing that kinda defeats the purpose of setdefault
many other langs has something morally equivalent to
my_dict.lazysetdefault(key, lambda: expensive_computation(args))
hii can i optimize this soln.
for t in range(T):
x = int(input())
z=[]
for i in range(1,x):
for j in range(1,x):
if 2*i + 2*j + i*j ==x:
z.append("YES")
if len(z) ==0:
print("NO")
else:
print("YES")```
yeah, for starters you don't need to iterate up to x
i*j constraints the possible i and j a lot
this fact should help as well
2i + 2j + ij = (2 + i)(2 + j) - 4
so you're checking if
(2 + i)(2+j) == x - 4
```the answer will be related to the divisors of `x - 4` which you can find in O(sqrt(x)) time
and you only care if such a (i, j) pair exists or not (the list append is quite silly)
so the thing boils down to checking if x-4 has any split into factors p*q such that p>=3 and q>=3, or equivalently does there exist a divisor >= 3 and is x-4 >= 9
hi @haughty mountain , what is this code you wrote yesterday doing:
for a, b in edges:
neighbors[a].append(b)
neighbors[b].append(a)
also, do i not require a Node class if I am going to do the 3 lists with indices approach?
hmm i can already see list indexing issues cropping up:
ok i'm trying to implement the BFS from CLRS, i've gotten pretty far except i'm hitting a snag on this line 15 here:
its running but i think not like how some of my indices point at a list
like in neighbors
hard for me to tell from CLRS (pg.594) but i think they want the neighbors to be in the form of tuples so like if V1 was neighbors with V3 and V4, they'd want (V1,V3),(V1,V4) in an adjacency list, is that how y'all read it?
i don't think my neighbors list of lists will work with the CLRS algo the way its written
hmm i'm trying to write this iterator to generate the edges out of the vertices and adjacency list
might zip be helpful?
neigh over here is a list, like [2,5]
So you can't do neighbors[[2,5]]
how might i do like for each element in vertices, add to edges vertex+neighbors[0][0], vertex+neighbors[0][1]
given that the lists are different length
there is an edge between a and b, if the graph is undirected this means b is a neighbor of a (first line) and a is a neighbor of b (second line)
@haughty mountain sorry to ping. are you around? i'm trying to write this in the style of the CLRS algo. i think it's looking at edges in the form (V1,V2) given by like (u,v). for that reason i'd like to rewrite the edges generator based on the vertices and neighbors lists above and end up with all the edges as tuples. i know you wrote one y'day but i want to do this from scratch
so i guess i am asking given:
neighbors = [[2,5],[1,3,4,5],[2,4],[2,3,5],[1,2,4],[3,4]]```
how do i generate the edges
yes
oh well you don't need to consult CLRS for that
not as lists tho
tuples i guess or however they are inspecting them in CLRS pg 595
so just
(1,2),(1,5),(2,1),...
but do you want repeats?
i.e. if you have (1,2), do you want (2,1)
My suggestion is to use a set of tuples
hm maybe not idk. the others were saying just handle the data in parallel arrays
unless your edges are directed
no i'm learning undirected first
It depends on what operations you're going to regularly be doing
ok if it's undirected then there's no reason to store (1,2) and (2,1) simultaneously
so we use a set
my_set = set()
for index, edge in enumerate(neighbors):
if not my_set.has((edge, index + 1)):
my_set.add((index+1, edge))
that probably won't run but you get the idea
if you want to do 1 based indexing
if you want to do 0 based indexing then just remove the + 1
I believe python has tuple comparison... if not then use a dictionary
But again the format you put a graph in is really reliant on what you want to do with it
Like if you want to check whether an edge exists between nodes x and y, then an adjacency matrix is ideal (but it's less space efficient)
if you want to get all edges that a certain node has, then an adjacency list or set is ideal
what does enumerate do
enumerate takes a list
so say
my_list = ["a","b","c"]
enumerate(my_list)
is going to be
(0,"a"),(1,"b"),(2,"c")
so for every element in the list it takes it and indexes it
ya
neighbors = [[2,5],[1,3,4,5],[2,4],[2,3,5],[1,2,4],[3,4]]
this is already an adjacency list
it gives
you didn't
i see so adjacency list 0:(0,[2,5])
1: (1,3,4,5)
etc
yeah
but it's implicit right
index 0 contains all nodes that have an edge with 0
and because arrays have O(1) access you can get all edges of say, node 5, in O(1) time
i don't get the for index, edge bit
edges_of_node_5 = adj_list[5]
well again
what format do you want to put this in
a list of all edges as tuples?
oh ok yeah I might've made a mistake my apologies
whatever i put it in its gotta be indexed in the same way as the vertices and neighbors lists i think
well if you don't know no one can help you
true
also for just transforming an adjacency list into another graph representation
that's usually not going to be something you need to consult CLRS for you just need to think about it right
sasy I have
it really depends what you want to do with the graph
actually i don't think they are inspecting edges as tuples, looking at this again
which representation you are going to use
and also keep in mind converting representations is a bit expensive too
there's an auxilary space complexity of O(N^2) for building an adjacency matrix for instance
An edge set is like
neighbors = [[2,5],[1,3,4,5],[2,4],[2,3,5],[1,2,4],[3,4]]
^ that's an adjacency list
so an edge set looks like
set((1,2),(1,5),(2,3),(2,4),(2,5))
so like you need to figure out what operations you're going to be doing frequently before you decide how to model the graph
in leetcode they'll generally give you an adjacency list I think?
sometimes it's not worth converting as you see there
i mean, they talk about edges in the set E with (u,v) so that must be an edge beginning at node u and ending at node v
but they don't explicitly say how to represent E in memory
"there is no one best way to store and access vertex and edge attributes"
probably an edge set then
but yeah there is no best way it depends what you're given initially, what you're doing, and how many times
right, yeah so i was trying to generate the edge set given the vertex list and the neighbors list
lms if i can get the code snippet you shared working
yeah so if you want to do that
just create a set, traverse the adjacency list, use enumerate, and put the pair in IF the reverse pair is not already in the set
wait i think that'll mess up the indexing
yeah it depends on if you're doing 1 based indexing too
it seemed from what you sent me
they consider the 0th index of the adjacency list as node "1"'s edges
so you have to offset index from enumerate by 1
maybe this really would be easier with Class sigh
not really lol
only looking at their pseudocode i mean
It's kinda rare to need classes at all for leetcode problems
this aint for leetcode
do you have CLRS
nah I did sedgewick
ok i kind of want to send their pseudocode and my code and you tell me where im going wrong
sure
I would shift all those integers down by one to match python indexing
^
ok ok
you could do like map(x, x => map(x, y => y - 1))
I'm bad at python map I might've written that wrong
or just mutate
so your example would be
vertices = [0,1,2,3,4,5]
neighbors = [[1,4],[0,2,3,4],[1,3],[1,2,4],[0,1,3],[2,3]]
do you know how to interpret neighbors?
no i think thats whats derailing me. also i need to better understand deque to interpret their pseudobode
its weird that they talk about edges in the form (u,v) but then don't actually generate that list anywhere
neighbors[1] are all nodes adjacent to 1
[0, 2, 3, 4]
so edges (1, 0), (1, 2), (1, 3) and (1, 4) exist
(those are the edges containing 1)
is the dict case clearer to you? e.g. {'a': ['b', 'c']}
i can understand in either sense
can/can't?
can
so you know how to interpret the neighbors list? you said you didn't just before 
let me look over my code and the pseudocode again
its just confusing bc they talk about edges in the form (u,v) but they don't generate that list anywhere nor actually use it in the pseudocode
So neighbors gives you an adjacency list that implicitly lists all edges
[[2,3], [1], [1]] means that node 1 has edges with nodes 2 and 3, and node 2 has an edge with node 1, and node 3 has an edge with node 1
The first thing I'd say to do is to convert this to 0 based indexing
we did
Like algmyr says the representation you want it in all depends on what you're doing, you already have access to all (u,v) pairs without forming an edge set or edge list
neighbors = [[1,4],[0,2,3,4],[1,3],[1,2,4],[0,1,3],[2,3]]```
yeah, even when problems give me 1 indexed things I immediately convert
so, based on neighbors can you tell me what edges contain vertex 2, and how can you see that?
yeah it'd be any edge beginning or ending in 2 and all the edges found in neighbors[2]
this isn't 1 based indexing anymore, yeah
hmm
neighbors[2] contains 1,3 but neighbors[5] contains 2, so 5 must be added to neighbors[2] to make it accurate
5 didn't exist before and idk where this list even came from anymore do i'm gonna do away with vertex 5
when they say enqueue in the pseudocode, i think they mean deque.append()
and when they say dequeue, i think they mean deque.pop()
why are they enqueuing an empty set?
when they first call enqueue, they pass it an empty set Q and the first node to inspect
probably because they're getting ready to traverse it, but I have no idea why they'd queue the set itself
usually to traverse you create a set of visited nodes and a queue of nodes you can visit (you have seen edges to them in the visited nodes)
you then traverse until the queue goes empty
yeah im going to keep track of that attributes with colors
things start out as white
and you don't put nodes that have been in the visited set in the queue
so like, your nodes are
{color, [edge, ...]} ?
no its all parallel lists like this:
vertices = [0,1,2,3,4]
neighbors = [[1,4],[0,2,3,4],[1,3],[1,2,4],[0,1,3]]
predecessors = []
distances = []
colors = []
then how are you coloring them
they need to be able to have a color associated
or you need to use a separate visited set
one or the other
def BFS(vertices_list,neighbors_list,predecessors_list,distances_list,colors_list,sourcenode):
for vertex in vertices_list:
colors.append('white')
distances.append(math.inf)
predecessors.append(None)
oh ok
colors[sourcenode] = 'grey'
distances[sourcenode] = 0
predecessors[sourcenode] = None
line 12 is what I am most struggling with
they say for each v in the set adjacencylist[u]
there's more to this than you've said
I'm not sure what distances is for
or predecessors
i sent you all my code and their pseudocode
wait, what you gave here isn't an undirected graph, is it?
it is
let's assume I decremented everything by 1 correctly
neighbors = [[1,4],[0,2,3,4],[1,3],[1,2,4],[0,1,3],[2,3]]
5 is connected to 2, but I don't see 2 being connected to 5
2 is in neighbors[5]
5 is not in neighbors[2]
this broken symmetry means a directed graph
no like i said i got rid of 5
bc it didn't exist in the first place
i just had added some more data to make it more complex but obviously thats not how it works. currently looks like this:
vertices = [0,1,2,3,4]
neighbors = [[1,4],[0,2,3,4],[1,3],[1,2,4],[0,1,3]]
ok, cool
so neighbors[2] tells you [1, 3] are neighbors, so 2->1 and 2->3 exist (directed edges, because that's kinda how adjacency lists and adjacency matrices work)
since you knew the graph is undirected you know edges 2--1 and 2--3 exist (for completeness you could also verify that 2 is also in neighbors[1] and neighbors[3])
looking back a bit
for neigh in neighbors[x]
at least I think that's what you intended...
no i did away with that, its been determined that i don't need an edge list to do this
yeah, though generating an edge list isn't that hard
for node, neighs in enumerate(neighbors):
for neigh in neighs:
# directed edge node->neigh exists
ahhh
vertices = [0,1,2,3,4]
neighbors = [[1,4],[0,2,3,4],[1,3],[1,2,4],[0,1,3]]
edges = []
for node, neighs in enumerate(neighbors):
for neigh in neighs:
edges.append((node,neigh))
print(edges)
you will get "duplicate" edges in the undirected case
i see that
though you could just add an if statement to fix that
can you tell me more about this line:
for node, neighs in enumerate(neighbors):
i know what enumerate does now, but so far as i can read this we are saying for x,y in list[0-5]..
id love to understand this whole block better
for node, neighs in enumerate(neighbors): for neigh in neighs: edges.append((node,neigh))
if node < neigh:
neighbors[v] are the nodes adjacent to v, right?
yes
enumerate(neighbors) will yield (0, neighbors[0]), (1, neighbors[1]), (2, neighbors[2]), ...
we unpack each pair into node, neighs
where neighs are the nodes adjacent to node node
we then just iterate over the neighbors in neighs
in the end, for every node, we iterate over all its adjacent nodes
so enumerate is often used to unpack iterables, is it?
ah i see a bug
i think the colors list will get messed up having the same indexing as the list of lists
ok i think its working? hard to tell ask i don't know what the output should look like
enumerate is when you want both the index and the value.
for v in lst for just value, for i in range(len(lst)) just index, for i, v in enumerate(lst) both index and value.
What is happening with enumerate is that it is generating (index, value) tuples and we are unpacking them into i, v.
>>> x, y = (10, 20)
>>> x
10
>>> y
20
yeah i get it
range is generating indices (integers, could be non-index stuff, whatever the integers are suppose to represent)
so having your node names like a b c d etc must be super annoying
Not really because you can just make a dict that maps from name to index.
hmm i kind of want to recreate this undirected graph from clrs and run my bfs on it
We want to work with indices, if we have non-indices, get to indices.
alg wrote me a alphanumeric to int index mapper yesterday but it went woosh right over my head
ill take a look again
glad i got the edge generator working w algs help
vertex_labels = ['r','s','t','u','v','w','x','y']
indices = {label: i for i, label in enumerate(vertex_labels)}
print(indices)
hmm
Generating a dict which goes from label to index.
I would rename indices to label_to_index_map.
ok so i have what i think is working bfs, but rn i want to repurpose the graph in the book to be my new test graph
so they have nodes r-y
so i'll want vertices 0-7
rn i'm manually curating the adjacency map
"neighbors"
vertices = [0,1,2,3,4,5,6,7] neighbors = [[1,4],[0,5],[3,5,6],[2,6,7],[0],[1,2,6],[2,3,5,7],[3,6]]
think i did that right. i should have made the map first lol
What's the problem for that graph shortest path?
Just learning BFS for now
Can solutions to dp problems be a bit more complex?
my solution to one problem is dp[k][i x j] = max(dp[k-1][i] x lst[k][j], dp[k][i x j]
usually it is just dp[i][j] = ...
need help with recursion
anyone free?
I have a rudimentary understanding but you can ask me anyway maybe I can help you
just found a cool resource:
you can write latex as you naturally would in python using a jupyter notebook
hey how can i test my DFS to see if it is doing what it should be
You could use 2 dicts for both child lookup and parent lookup. It would take space. Yes.
Or yeah since you're managing indexes and using list, you could have a reverse list?
You can also use sympy directly: ```py
from sympy import init_session
init_session(quiet=True)
Type "help", "copyright", "credits" or "license" for more information.
(SymPyConsole)
Integral(sqrt(1/x), x)
⌠
⎮ ___
⎮ ╱ 1
⎮ ╱ ─ dx
⎮ ╲╱ x
⌡
Have a start node, and a target node. Then have it print a path from start to target or that there is none.
I am doing a problem where I am to return an array of the three largest numbers in the array without sorting
def findThreeLargestNumbers(array):
# Write your code here.
seen = set()
three_largest = [0]*3
insert_at = 2
for iteration in range(3):
max = [float('-inf'), -1]
for i, num in enumerate(array):
if max[0] < num and i not in seen:
max[0] = num
max[1] = i
seen.add(max[1])
three_largest[insert_at] = max[0]
insert_at -= 1
return three_largest
'''
Runtime: Big O(3*n) => O(n) where n is the length of the array
Space: O(3) + O(3) + other O(1) operations => O(1)
'''
Here is my approach and Big O Analysis. Just wondering if I used O(1) space I think so since both structures I am declaring will have three elements inside so when constants are dropped isn't it O(1)?
need an example of positive assertion ?=
yo
I think you overcomplicated the program, this should be a bit more optimal (if I understand what you're trying to do correctly)
def findThreeLargestNumbers(array):
array = list(set(array)) # remove duplicates
three_largest = [0]*3
for iteration in range(3):
three_largest[iteration] = max(array) # get current largest number and put it inside three_largest
array.remove(three_largest[iteration]) # remove that largest number from the array
return three_largest
In terms of Big O notation, this is effectively O(3) since we're just doing a loop 3 times, rather than your case where you use a nested loop of 3*N
doesn't need list()
Does set() have the remove method?
Then yeah no need for the conversion there
But otherwise I think my function is moderately more optimal
that is not O(3) because max takes O(n) time
isn't O(<literal>) just O(1)
At least there's no wasted actions on checking if a number was already seen or having to manually save the mas values
well, since you call max three times, it is actually looping through the list three times
what if max() took another argument that says how many of the max numbers it should return
def findThreeLargestNumbers(array):
array = set(array) # remove duplicates
three_largest = []
for iteration in range(3):
three_largest.append(max(array)) # get current largest number and put it inside three_largest
array.remove(three_largest[iteration]) # remove that largest number from the array
return three_largest
heapq.nlargest exists
Hold on does your program allow duplicates to exist? Because I think they can for that problem
Like [1, 2, 3,3,3] could be a valid input I think
So long as they’re in different indices
Also this particular question requires the three largest in sorted order. I suppose using a sorting function would be 3log(3) => O(1) so that going backwards step was unneeded.
Using a heap would sort it though wouldn’t it? Since you’d need the last three heappops to get the three largest
It takes O(n) to find each max and we’re doing this 3 times. O(n*3) => O(n) no?
Lib/heapq.py line 397
# Algorithm notes for nlargest() and nsmallest()```
print visited nodes as you go and check against your graph that what it does is an dfs/bfs/whatever else you are testing
Hi all - I have a point cloud of points where each point has a colour.
Is there a pre-existing algorithm for identifying duplicate points in the point cloud (points that are close together in both space and colour)?
I have a function that can tell me if two points are duplicates
and I have another function that can merge two points together
But I'm looking for a way of implementing this such that after merging, no further duplicates can be found
idk if such an algo is well defined
depending on merge order and how exactly your merge works you might get different results
yeah, that's the problem i'm running into
I suppose the results will change depending on if you're merging C into A+B or A into B+C
and if there's no inherent order to the point cloud...
I had one idea of: if |A-B| < r, pick point (A+B)/2 and merge all matching points within distance r into it
and I suppose remove A, B and any other matching points from any further processing?
yeah
I suppose that doesn't solve the merge order either
the way I'm doing it at the moment is with NetworkX
it ensures that a merged point isn't merged again at least
true
I don't think you can get rid of the merge order dependence
the way I'm doing it at the moment is:
- Take the list of all points
- Get a list of all pairs of points
- For each pair, work out if the points are close enough together and similar enough colour (within tolerances)
- Make an NX graph, add all the points as nodes
- Add edges to the graph between points that match
- Extract all connected components
- For each connected component, merge all its nodes together
- For all the isolated components (points with no matches), just keep them
Instead there are wasted actions for removing the values from the list. In terms of clean code the solution is also pretty bad since it modifies the input array. And it still requires at least 3n operations, so no it isn't any faster.
This is the ideal solution. For finding the m largest numbers this should take O(n log m) time which is even better than the O(nm) time of the naive approach (for obvious reasons this is irrelevant for m = 3 though). Space is O(m) aux. for the heap which is also optimal.
you need to put all elements in the heap, it's n space still
for m largest numbers in an n element array, it's m log n time, n space
also it's possible to do this in one pass
def f(arr):
a = b = c = float("-inf")
for x in arr:
if x > a:
a, b, c = x, a, b
elif x > b:
b, c = x, b
elif x > c:
c = x
return a, b, c
```something like this i think
it claims
keeping the k most extreme values in a heap.
ah
actually heapq is inplace anyways
would probably be faster to heapify everything
Yes but Python implements a minHeap so finding the three largest would mean essentially to heapSort the array and return a list of the last three heappops?
not saying this is what it actually does but you can mimic a maxheap with a minheap by just adding a negative sign on the elements
(assuming ints)
I know about this trick and used it as my first approach to solving this problem, but the problem is that it was then considering the absolute smallest number as one of the largest numbers or something. But regardless wouldn't doing it this way be considered a sorting being applied? The problem says to specifically not sort.
from heapq import heappush, heappop, heapify
def findThreeLargestNumbers(array):
# Write your code here.
for i, num in enumerate(array):
array[i] = abs(num)*-1
heapify(array)
three_largest = []
print(array)
for i in range(3):
elem = heappop(array)*-1
three_largest.append(elem)
three_largest.reverse()
return three_largest
# 6/8 testcases pass
failed tests:
input: {
"array": [141, 1, 17, -7, -17, -27, 18, 541, 8, 7, 7]
}
expected: [18, 141, 541]
actual: [27, 141, 541]
{
"array": [-1, -2, -3, -7, -17, -27, -18, -541, -8, -7, 7]
}
expected: [-2, -1, 7]
actual: [18, 27, 541]
abs(num) * -1 doesn't work, that would do exactly what you said, "absolute smallest number as one of the largest "
honestly i'm not sure why they wouldn't let you sort, it's probably the fastest way
I understand, one way that just came to my mind would be to make a new array of the input size and keep a record of whether the number was negative or non-negative and then based on that have some conditionals that are checking the heappops
you don't need to do that though, just multiply all the numbers by -1
nothing fancy, you're just inverting the order
My solution is faster, sorting would be O(nlogn) mine is O(n) probably to introduce you to a new spin on a problem because my first instinct was to sort using a heap since I was practicing the use of a heap in interviews.
so num * -1 instead of abs(num) * -1?
you are a genius, it worked thanks 🙂
faster asymptotically, yes, but in practice you'd have to get a lot of elements to beat it. sorting is done in C, and it's a very optimized algorithm
heapq i think is C too i think though 🤔 might be interesting to benchmark
isn't asymptotic growth all that matters for Big O in terms of interviews?
i wouldn't know, but i would think knowing more than just big O is a bonus
plus if it's for an interview, you can totally make the argument that
sorted(arr)[-3:]
``` is much cleaner and easier to maintain
I'm not too good with slice notations need to reference docs every time I use them, but I think they're looking for general competency in algorithm design and analysis
sure, other things matter as well, though
hey, i wanna populate a liat with a default length
cuz destructuring a, b, *rest = argv is giving me arrayoutofbounds error
ye tried doing something with that
but how do i merge it with argv, setting argv as default
so to destructure at least 2 values, i'd need sth like
merge(argv,[None]*2)
oh not too sure then
in itertools there's a flatten right? so with that i was thinking
flatten(zip(argv,[None]*2))
but zip truncates to the highest length... :/
what are you trying to do exactly?
this
and you want to solve it how exactly? pad with None?
for example, but the how is the subject of my question
would this work?
a, b = (argv + [None]*2)[:2]
couldnt u just iterate thru rest and argv manually?
and I guess
rest = argv[2:]
pattern matching might be cool here
it's a bit sad there isn't a get on list like there is with dict
I find myself re-implementing it a bunch
just argv+[None] is enough actually... 😂
operator.add(argv, [] if len(argv)>1 else [None])
what's with operator.add there, can't you use +?
cant use a ternary with +
why not?
i mean im just back to python after i learned it 5 years ago
i thought i cant
i'd just need brackets then?
argv+([] if len(argv)>1 else [None])
ok it worked too
looks kinda silly though
final solution:
argv+([] if len(argv)>1 else [None]*(2-len(argv)))
pretty ugly
screams js-accustomed author
why is this preferable to the pretty simple
(argv + [None]*2)[:2]
it's not exactly equivalent, but pretty much
maybe this is the nicer solution
def get(v, i, default=None):
try:
return v[i]
except IndexError:
return default
a = get(argv, 0)
b = get(argv, 1)
Hello chat, I’m new to this discord. I have a course of algorithms and data structures this semester, is it hard?
Hi all, I hope this is the right place for this.
I'm a novice programmer with python and I am having a problem (which I think is related to python memory allocation), I'm building a desktop software in python, for this I'm using QtDesigner for the GUI and I'm writing the program logic, but I ran into a problem that I don't know how or why it happens, but I think I've located it.
The software is an inventory manager, and I want to modify a record, then I am building the methods that take the data from the table (QTableWidget) a method (menuedit) validates what type of element is, calls the respective method in charge of controlling the edit window of that type of element, and passes it as argument the data that it took from the table (by means of the variable 'item'), until there all good, at this moment I am making the method to edit the elements of type '0', that are "parts", then it loads the window of edition of parts, the method (editpart) takes the data that are in the variable 'item' and it shows them in the screen and until there everything works well, the idea is that the user can click on the "save" button that is in the editpart window and then the part is updated in the database and in the table (QTableWidget) for this the method (updatepart) is called and is given as parameter the variable 'item' that contains the data taken from the table (QTableWidget) this method is responsible for making the update in the database and in the table making the necessary validations, and this works fine in the first time, but in the following times a problem occurs when the method (updatepart) receives as argument to the variable 'item' and it receives the same information as in the first time that a modification was made and this causes errors in the program and then strangely re-executes the lines of code now with the correct variable 'item'
and honestly I do not understand how this is possible or how to solve it, I attach the code fragment I'm talking about and a screenshot of some screen prints to better understand the problem.
def menuedit(self): #This is the function that is called when the edit button is clicked
item = [dato.text() for dato in self.table_part.selectedItems()]
if item != []:
tipe = item[8]
if tipe == "0":
print(item, "menu edit")
self.editpart(item) #if the type of the element is 0, it is a part
elif tipe == "1":
self.editassambly(item)
else:
self.warn_part.setStyleSheet("color: green;")
self.warn_part.setText("No item selected")
def editpart(self, item): #this function control the edit part window
self.stackedWidget.setCurrentWidget(self.part_page)
#Here are a few lines of code that should not affect the problem
self.btn_exitpart.clicked.connect(lambda: self.exiteditpart(item))
print(item[:], "edit part", id(item), "id")
self.btn_save.clicked.connect(lambda : self.updatepart(item[:])) #I think here is where the problem is
def updatepart(self, item):#This is called when the save button is clicked
print(item, "item", id(item), "id") #the "item" has the same value as the first time it was called
partnumber = item[0]
print(partnumber, "part number")
if self.in_nump != item[0] or self.input_namep.text() != item[1] or self.input_desc.toPlainText() != item[2] or self.input_hand.text() != str(item[3]) or self.input_orderp.text() != str(item[4]) or self.input_ven.text() != item[5] or self.input_price.text() != item[6] or self.input_time.text() != item[7]:
error_flag = 0
self.warn_part.setText("")
parts = []
pnn = self.in_nump.text()
print(pnn, "new part number\n")
#Here are a few lines of code that should not affect the problem
If someone maybe knows what is happening or is willing to help me I am willing to show the complete code and talk about how it works, I am not looking for you to write the code for me or a complete solution, I just want someone to help me understand what is happening.
Maybe you're supposed to unconnect the old function from clicked before connecting a new one, and the way you're doing it you end up with both new and old connections?
I don't understand very well what is the purpose of disconnecting a button but, as far as I understand it is to assign the same button to another function, right? But I am not using the same button for two different things, they are always independent buttons, the error is occurring at the moment that the variable 'item' is passed as argument to the updatepart method.
Your error is half way in updatepart in the if no? It prints part number ac003 then prints new part number ac002 so its somewhere in that func not in editpart? Where in_nump.text() coming from because thats where ac002 is coming from?
hello i need some help on my code
def arithmetic_arranger(list, truth = False):
uppernum = []
bottomnum = []
strip = []
answer = []
if len(list) > 5:
return 'Error: Too many problems.'
for part in list:
string = part.split(' ')
num1 = string[0]
num2 = string[2]
op = string[1]
while True:
if len(num1) <= len(num2):
width = len(num1) + 2
elif len(num2) <= len(num2):
width = len(num2) + 2
if op == '+':
num3 = int(num1) + int(num2)
elif op == '-':
num3 = int(num1) - int(num2)
spaces1 = width - len(num1)
spaces2 = width - len(num2)
spaces3 = width - len(num3)
uppernum.append(' '*spaces1 + num1)
bottomnum.append(op + ' '*spaces2 + num1)
strip.append('-'*width)
answer.append(' '*spaces3)
print (uppernum)
print (bottomnum)
print (strip)
print (answer)
arithmetic_arranger(['32 - 698', '1 - 3801', '45 + 43', '123 + 49', '988 + 40'], True)
it somehow traceback and keep saying 'local variable 'num3' referenced before assignment'
where did it do it wrong?
First of all stop using list, its a keyword
Built in or whatever, name your argument my_list or something
okay will do
But for your question, can op be other then "+" or "-" maybe that is the problem? maybe add else after elif to have default operator
nope, it must be either '+' or '-'
thanks for the answer tho
its for arithmetic formatter test ptoject from free code camp
Hrmm based on your input I wouldn't think so but that would be where i'd look, add assert op == "+" or op == "-" and see?
before your if op
Yes
im curious
def arithmetic_arranger(mylist, truth = False):
uppernum = []
bottomnum = []
strip = []
answer = []
if len(mylist) > 5:
return 'Error: Too many problems.'
for part in mylist:
string = part.split(' ')
num1 = string[0]
num2 = string[2]
op = string[1]
if op == '+':
num3 = int(num1) + int(num2)
elif op == '-':
num3 = int(num1) - int(num2)
while True:
if len(num1) <= len(num2):
width = len(num1) + 2
elif len(num2) <= len(num2):
width = len(num2) + 2
spaces1 = width - len(num1)
spaces2 = width - len(num2)
spaces3 = width - len(str(num3))
uppernum.append(' '*spaces1 + num1)
bottomnum.append(op + ' '*spaces2 + num1)
strip.append('-'*width)
answer.append(' '*len(spaces3))
print (uppernum)
print (bottomnum)
print (strip)
print (answer)
arithmetic_arranger(['32 - 698', '1 - 3801', '45 + 43', '123 + 49', '988 + 40'], True)
it somehow work
I was wondering if the loop was the problem, but i dont see num1 or num2 changing in loop so not sure why it would be issue
np
I would say its fun but that's me, depends what kind of person you are. The Tim Roughgarden stanford videos/coursera algorithms specialization courses is where I got into programming, i could barely program in python at the time and had to look everything up and or find examples but learned a ton that way. Need time/focus though.
The new part number is ok that it is ac002 in this example, the problem is the printout that I highlighted in red in the image, ac003 belongs to the first time a record was modified and should not be printed there, but in that printout highlighted in red should be the information of ac002 in the example, then the error is right in the middle between the editpart method and updatepart, just at the moment when updatepart receives the variable 'item' as argument.
how can I figure out in case of dynamic programming, that whether I must apply top-down approach or bottom-up approach?
hello is anyone here acquainted with substitution method? this comes in recursive thingy
T(n) = 4T(n/2) + n
This is an equation which i have. im trying to understand how i can apply substitution method in this. I'm feeling kinda lost so im hoping that anyone here will be able to guide me in solving this
I don't have time do delve deeply, but figure out what you think T(n) is (check some specific n, also try to evaluate a few layers of the recurrence)
and then substitute that in to see if it checks out
the thing is, my prof did some mathematical induction process. she guessed that t(n) = o(n^3) and then assumed that t(n) <= c x n^3. but im just confused that how did she made that guess
that's just the definition of O(n^3) though
O(n³) isn't even a good bound on this
idk if ive typed anything wrong here as i wasnt present for that lecture
wait, o(n^3)? huh
I'm pretty sure this is O(n²)
idk it feels like i missed a lot so rn im having a hard time understanding it
i asked her how she did that and she said that she just assumed it
and theres no logic behind it
if you expand a few iterations you get
T(n) = 4T(n/2) + n
T(n) = 4(4T(n/4) + n/2) + n
T(n) = 16 T(n/4) + 2n + n
T(n) = 16 (4T(n/8) + n/4) + 2n + n
T(n) = 64 T(n/8) + 4n + 2n + n
yes i watched a couple of videos and they are doing something like this
your thing will go roughly log2(n) layers deep
oh
so you have roughly n + 2n + 4n + ... + 2^log2(n) n
this is < 2 2^log2(n) n = 2 n²
so c n² is a reasonable guess
can i ask a stupid question? how did you reached to log2(n)?
how many times can I divide n by two until I reach 1?
oh
from the T(n/2)
ah umm idk 😭
im sorry idk
but nw!! i will look at whatever you've typed so far and will try to understand it
so sorry for the trouble! and thank you once again 🙏
oh nvm i understood how log came
basically there will be some base case like T(1) = constant
how deep do we need to go to get there
yes rn we have T(1) = 1
so now you know what kind of guess you need
oh damn so is my prof's guess a valid guess?
O(n³)?
yes
it's technically correct, but it's not a good bound
oh
O is an upper bound
yes
yea
also
can i assume O(n logn)?
because idk why this kinda looks like merge sort's eqn
merge sort has 2 T(n/2)
not 4 T(n/2)
oh so assuming O(n log n) is a bad upper bound in this case?
I mean, you can try substituting that in
as compared to O(n^2)
but it won't work
oh
so i need to first find what value to guess, then do the assumption and then substitution right??
and then you check if things work out or not
if we assume T(n) <= c n² we want to see if that satisfies the recurrence
T(n)
= 4 T(n/2) + n
<= 4 c (n/2)² + n
= c n² + n
this almost works
oh
but we have an annoying n we would want to get rid of
but it will become insignificant when we add large values right?
and i think our prof told to calculate the upper bounds only
so i think the final value would be n^2
it will, but if we want to be super strict we would need a slightly different guess
something that can eliminate the + n
our prof just got rid of the constants and n and it is written here that it becomes insignificant in larger values and we're finding out O
imo that's fine
I don't think T(n) = 4T(n/2)+n is enough to solve the problem