#algos-and-data-structs
alright i think i have some pseudocode i can follow for an input set, now i wonder how to construct the set returner
i'll try to write some pseudo
Btw, try not redefining built-ins like sum. Use a different name.
its pseudocode, won't matter. but i got you
!e
```py
print(sum([1, 2, 3]))
sum = 0
print(sum([1, 2, 3]))
```
@opal oriole :x: Your 3.11 eval job has completed with return code 1.
001 | 6
002 | Traceback (most recent call last):
003 | File "<string>", line 3, in <module>
004 | TypeError: 'int' object is not callable
Or you might run into this.
Python lets you do this, but I don't recommend it, it's like redefining things in C with macros.
#define private public
can i use the same for j in range loop near the end to construct the sets?
other abuse of the preprocessor that I kinda helped with https://codeforces.com/blog/entry/77480
The var in the for is local to the for, so it's not being reused.
So one j in a loop is not the same j in another, unless they are nested.
that's not exactly true for python
Yeah I think you can get away with it outside right?
python scoping is weird
It's just not done normally, because that would be strange.
yes, but you can totally do it
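A minimal sketch of the scoping point, assuming plain Python 3 (the function names are made up for illustration): a `for` variable is just an ordinary name in the enclosing function, so it stays bound after the loop and a later loop simply rebinds it.

```python
def last_loop_value(n):
    for j in range(n):
        pass
    # j is still bound here, as long as the loop body ran at least once
    return j


def reuse_in_second_loop():
    seen = []
    for j in range(2):
        seen.append(j)
    for j in range(3):  # same name j, rebound from scratch by this loop
        seen.append(j)
    return seen, j
```

Reusing `j` in a second, non-nested loop is therefore safe; the surprise only shows up if you read `j` after a loop and expect it to be undefined.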
Like var i = 0 in a for in javascript?
Yeah, kind of one of the whole selling points of structured programming.
you could maybe use it to find the value you need to backtrack
by modifying it a bit
maybe i start:
```py
def set_getr(dp):
    for j in range(dp):
```
?
the existing loop is a min while you want an argmin
so you can re-create the partition from the sum
You can think of indices as holding the structure, while the values are well, the values.
And sometimes you want the structure, not the values.
Or both.
```py
def set_getr(dp, s):
    for j in range(s):
        y = argmin(dp[j][s])
```
dp tables in particular tend to hold a lot of information in the indexing 😛
how far off am i
And argmin, max, etc, deals with indices (while regular min is for values).
very?
figured
considering I don't know what you're trying to do there
you know what we mean by argmin, right?
the argument which gave rise to the minimum
right
the min diff in the existing loop?
Think of a list / table as a function which maps index to value, so what would an argmin give (what is the argument?)?
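A small sketch of min versus argmin on a plain list, treating the list as a function from index to value. The helper name `argmin` is hypothetical here (Python has no built-in with that name):

```python
def argmin(values):
    # Return the index whose value is smallest (the first one on ties).
    return min(range(len(values)), key=lambda i: values[i])


row = [7, 3, 9, 3]
smallest_value = min(row)      # the value
smallest_index = argmin(row)   # the argument (index) that produced it
```

`min` answers "how small does it get", while `argmin` answers "where", which is the piece needed for backtracking through a table.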
should i be trying to augment the existing for j loop or writing a new function
gtg, gl
oh wait
nvm
they only look at indices <= total_sum/2
but that's actually fine
you can run it in python if you so desire
since it's mirrored
if one partition has sum total_sum/2 - x the other has sum total_sum/2 + x
so you actually only need to check one direction
why would I run the code? I'm trying to check that the logic is right
running examples can make things look correct unless you actually happen to try some edge case
right right
ok yeah the pseudocode i have now won't work at all. i'm not doing +=1 in the way we did for the arrays above anywhere
and apparently they don't either anywhere either..
they just write True or False throughout the table
we weren't doing += 1
no just assigning 1 as T
so im assuming their T/F is our 1/0, so everything is the same but i have the first for loop make the whole first column 1 instead of T
i got rid of the second loop, the for j in range where they set things equal to False
will that break things?
not if you default to False
im defaulting to 0, as thats the way the table is instantiated
the only difference is that i set all the first column to be 1 as in your example using their existing loop
i'll be able to prove it works when i walk through it with examples but i haven't figured out how to return the subsets yet
actually, why do they do that?
you don't need to set dp[...][0] to trues
you only need to set dp[0][0] to true
```py
# Initialize top row, except dp[0][0],
# as false. With 0 elements, no other
# sum except 0 is possible
for j in range(1, su + 1):
    dp[0][j] = False
```
im guessing you are saying a non-logical loop, rather than a dumb impl
it should go from zero
for j in range(0, su + 1):
that way you don't need to special case the whole first column like they do
i.e. default everything to 0, dp[0][0] = True and you're good to go
the rule we apply works fine for zero, it's not a special case like they make it look like
im using dp[0][0] = 1, its ok?
that means i don't need the loop initializing the first column to 1's..
as long as you fix the loop that's the one index you need to set
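A rough sketch of the whole idea discussed so far, under some assumptions: the inputs are non-negative integers, everything defaults to False, only `dp[0][0]` is set, the `j` loop starts at 0, and the subsets are rebuilt by walking back through the table. The names (`min_partition`, `chosen`, `rest`) are made up and this is not the tutorial's code.

```python
def min_partition(nums):
    total = sum(nums)
    n = len(nums)
    # dp[i][j] is True when some subset of the first i numbers sums to j.
    dp = [[False] * (total + 1) for _ in range(n + 1)]
    dp[0][0] = True  # the only cell that needs special treatment
    for i in range(1, n + 1):
        x = nums[i - 1]
        for j in range(0, total + 1):  # starting at 0, so column 0 is not a special case
            dp[i][j] = dp[i - 1][j] or (j >= x and dp[i - 1][j - x])

    # The two halves are mirrored around total / 2, so only sums <= total // 2 are checked.
    best = max(j for j in range(total // 2 + 1) if dp[n][j])

    # Backtrack: the indices remember which numbers were needed to reach `best`.
    chosen, rest = [], []
    j = best
    for i in range(n, 0, -1):
        x = nums[i - 1]
        if dp[i - 1][j]:
            rest.append(x)      # sum j was reachable without x
        else:
            chosen.append(x)    # x must be part of the subset
            j -= x
    return total - 2 * best, chosen, rest
```

For `[1, 6, 11, 5]` this gives a difference of 1, with one side summing to 11 and the other to 12.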
so i've put it into pythontutor to see what's happening but theirs has 1 indexing in the j loop
i just dont know what the original uncompressed string is, since the source code only has the serialized form
thanks this is helpful. good to know I was sort of in the right direction because I was planning on making classes for the pokemon and poke data. im still going to try and more or less do it on my own and check for references because I want to try figuring it out myself. kind of like the 'look', 'cover', 'write', 'check' process.
you could at least grab my class that wraps their data and converts some of it to less awful formats
like a byte string rather than an array of strings representing bytes
the overall js code is quite bad
i didnt even know bytes was a data type
i still dont understand 1. how he compressed it and 2. how do i get the original unserialized data? And for the sake of understanding, how did he serialise it to begin with (I know he uses a lookup table, but from where)
Are there any tutorials that would help me understand this encoding decoding? because this stuff is kind of new to me and making my head hurt hahah. encoding decoding file formats sounds like it could be a really useful skill for me as a 3d artist
hello. would anyone know how to get from a pandas dataframe with 2 columns (x,y) as coordinates, and turn those into some kind of random walk?
the goal is to maximize profit. maximum of 24 hours to do the heist. we have to calculate travel time too. and we need to get back to the choppah before we run out of time. idk where to start with this.
i was thinking using networkx to generate a graph. i thought x,y coords could be seen as nodes. it's just not working. been googling for a few hours
I have tree given in parent array representation. How can I inorder traverse it?
x,y coords would determine distances between nodes which would be weights on edges in a graph
hi guys i have 2 txt files, textfile1 is 500k lines and textfile2 is 3m lines i want to compare textfile1 with textfile2 and copy the entry from textfile2 into a new textfile called output. What would be the fastest way/algo to do this ?
wdym "compare"
Yeah, I thought of doing this manually. But there's 10000 nodes. They're all connected too. I can't wrap my head around this problem
How do I go from the pandas dataframe to a code that checks the most lucrative path from node to node. But all that from a pandas dataframe 😬
I was able to use math.dist to check the distance between 2 nodes
lol, for 100000 nodes you're absolutely screwed trying to find an optimal solution
I expect this kind of problem to be NP-hard
Yup agreed
Doesn't need to be perfect, but I'd like to give it a try at least lol
We have until Thursday.. I'm so stressed lmao
this feels like the kind of problem I would see in google hashcode
i.e. a large NP-hard problem that people try to optimize
also, it's 10k and not 100k, right?
at least that's your table size
(10k)^2 edges is borderline manageable
With the full time job that leaves me 3-4 hours a day to actually do some work on the problem. Not optimal
That's like 100m ? Oof lol
maybe too much for python
but in a less memory wasting language we're talking less than a GB of memory
One of the rule is that the code has to run in less than 3 minutes on an average laptop
😬
3 mins is p lenient, even for python
depends, I had a task where your typical reading from file took close to a minute 😛
like a GB of input data iirc
(then solving the problem took only a few seconds, because decent algos)
so a dumb attempt at a solution would be to just greedily pick the "best" node
where best is up to you
a reasonable heuristic would be something like profit divided by time spent
with a restriction that you only allow to pick nodes where you can make it back in time
if this was google hashcode that would be my first thing to try
O(n²) ish
but for n=10^4 I would consider using something other than python
or at least using pypy
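A hedged sketch of that greedy idea. Everything about the data layout is an assumption: each bank is a hypothetical `(x, y, money, rob_time)` tuple, travel time is straight-line distance divided by a made-up `speed`, and the helicopter waits at `base`. It is a heuristic, not an optimal solver.

```python
import math


def greedy_heist(banks, time_limit=24.0, speed=1.0, base=(0.0, 0.0)):
    """Return (route, loot): visited bank indices in order and the money taken."""
    pos, time_left = base, time_limit
    remaining = set(range(len(banks)))
    route, loot = [], 0.0
    while remaining:
        best = None  # (score, index, cost) of the best reachable bank so far
        for i in remaining:
            x, y, money, rob_time = banks[i]
            cost = math.dist(pos, (x, y)) / speed + rob_time
            trip_home = math.dist((x, y), base) / speed
            if cost + trip_home > time_left:
                continue  # robbing this one would make us miss the helicopter
            score = money / cost if cost > 0 else math.inf  # profit per unit time
            if best is None or score > best[0]:
                best = (score, i, cost)
        if best is None:
            break  # nothing left that still allows the return trip
        _, i, cost = best
        route.append(i)
        loot += banks[i][2]
        time_left -= cost
        pos = banks[i][:2]
        remaining.remove(i)
    return route, loot
```

Each step scans all remaining banks, so the whole thing is O(n²) in the worst case, which is the cost mentioned above.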
That's what I thought selecting neighbors with highest ratio of money/time
And always moving towards 0,0 as we do the random walk
The problem is that I googled for 6 hours yesterday, trying to find snippets of code to get me started. Can't find anything that uses a pandas dataframe to create relations between the nodes I'd create with networkx. I managed to visualize a simple scatter plot with matplotlib tho. But it was useless
I would just write the graph structure myself, but that's just me
granted, since you want all connections you could use numpy
!e
```py
import numpy as np
x = np.array([[1, 2, 7, 4]])
y = np.array([[-3, 2, 1, 4]])
print(np.sqrt((x - x.T)**2 + (y - y.T)**2))
```
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | [[0. 5.09901951 7.21110255 7.61577311]
002 | [5.09901951 0. 5.09901951 2.82842712]
003 | [7.21110255 5.09901951 0. 4.24264069]
004 | [7.61577311 2.82842712 4.24264069 0. ]]
that also helps with space issues, since numpy is actually pretty good about memory usage since it doesn't use python collections
so what I computed there is the distance between all pairs
which is essentially an adjacency matrix
Interesting, I'd have to iterate over the rows in the dataframe and plug in the coords in numpy
would you? I expect pandas would play well with numpy
Yeah, pandas is built on top of numpy, now that I think of it
docs says .to_numpy on a dataframe
I'm practicing hackerrank and having some trouble finishing this solution
do any of you have any ideas for optimization for the algorithm?
of course brute force is easy
and I don't think simply adding a list so we don't have to recheck certain strings
compute the True/False answer for each element, do a prefix sum, query sum in range in O(1) per query
do you mind typing a solution, I do not understand what you mean, sorry
or some pseudocode
the first part should be obvious, just compute the answer for each element
yup I got that
```py
def prefix_sums(A):
    n = len(A)
    P = [0] * (n + 1)
    for k in range(1, n + 1):  # range, not xrange, on Python 3
        P[k] = P[k - 1] + A[k - 1]
    return P
```
so something like this
and then what do you mean by the last part
query sum in range in O(1)
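A short sketch of what "query sum in range in O(1)" can look like once the per-element True/False answers exist. The names `flags` and `count_in_range` are made up for illustration:

```python
def prefix_sums(flags):
    # prefix[k] = number of True values among flags[0:k]
    prefix = [0]
    for flag in flags:
        prefix.append(prefix[-1] + int(flag))
    return prefix


def count_in_range(prefix, lo, hi):
    # Number of True values in flags[lo..hi], both ends inclusive.
    return prefix[hi + 1] - prefix[lo]


flags = [True, False, True, True, False]
prefix = prefix_sums(flags)
```

Building `prefix` is O(n) once; after that every range query is a single subtraction instead of a loop over the range.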
You think this would compute relatively fast for 10000 coords?
So like df[x_coords].to_numpy and same for y_coords. Store that in variables and plug it into the hypotenuse formula? I know math.dist does that. Numpy is probably faster tho
With that adj matrix graph for networkx would work
Just need to access IDs
And make a list of IDs as output somehow
calm down with the gifs..
Should ban imo. Not the place for this crap.
I wouldn't even put this in networkx, I would expect the size to grow a bunch if you do
granted idk what their internal format is
you have an adjacency matrix, so you have a graph description already
Good point
and the indices correspond to rows in your data
I guess I'm lacking confidence without the training wheels 😂
!ban 502883843825598475 not the place for those gifs
:incoming_envelope: :ok_hand: applied ban to @lethal stirrup permanently.
and as for performance, it probably isn't going to be great, though maybe you can vectorize most operations with numpy
rows of your matrix gives you the distance
you will also have vectors with things like the money, time, ...
which you can combine with the distance to compute a cost
which should be easily vectorizable with numpy
i.e. not python slowness
(basically do as little as you can in pure python)
hey guys, can someone explain how in this topological sort algo, how it's traversing back to the nodes which had their vertices appended? My graph is in this order {'A': ['C'], 'C': ['E'], 'E': ['H', 'F'], 'B': ['C', 'D'], 'D': ['F'], 'F': ['G']}). After it indexes G from F, it goes back to F, then back to E, then back to C, etc. Where is the code that's doing that? I understand all other bits of the algo, except that part
that's what a prefix sum allows you to do
```py
from collections import defaultdict  # dict for graph

"""
1. If a vertex depends on CurrentVertex -> Go to that vertex and then come back to current vertex
2. Push current vertex to stack
"""


class Graph:
    def __init__(self, numVertices):
        self.graph = defaultdict(list)

    def addEdge(self, vertex, edge):
        # adding vertices and their edges
        self.graph[vertex].append(edge)  # adding edge to vertex: A: C, B:C
        # print("added edges", self.graph)  # adding edges and their vertices to graph
        print("starting graph", self.graph)

    def topologicalSortUtil(self, currentVertex, visited, stack):
        visited.append(currentVertex)  # add all unvisited vertices to visited initially from first topological sort call.
        # Add A, C, E H, F, G
        # after all the vertices have been added from the first function - edge elements gets appended
        print("visited elements", visited)  # should be
        # finding edges of these vertices
        for i in self.graph[currentVertex]:  # looping through edges
            print("when currentVertex is indexed", i)
            if i not in visited:
                print("not in visited", i)  # A, C, E, H, F, G
                self.topologicalSortUtil(i, visited, stack)
        print("item about to enter stack", currentVertex)
        stack.insert(0, currentVertex)
        print("Stack", stack)

    def topologicalSort(self):
        visited = []
        stack = []
        for k in list(self.graph):
            if k not in visited:
                print("not yet visited ", k)
                self.topologicalSortUtil(k, visited, stack)
        # print("finished stack", stack)


graph = Graph(8)
graph.addEdge("A", "C")
graph.addEdge("C", "E")
graph.addEdge("E", "H")
graph.addEdge("E", "F")
graph.addEdge("B", "C")
graph.addEdge("B", "D")
graph.addEdge("D", "F")
graph.addEdge("F", "G")
graph.topologicalSort()
```
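On the "where is the code that goes back" question: as far as I can tell there is no explicit code for it. Each recursive call returns to its caller once its own `for` loop runs out, and the caller then carries on with its loop; the call stack does the walking back. A compact sketch of the same depth-first idea, with names of my own rather than the posted class:

```python
def topological_order(graph):
    visited = set()
    finished = []  # vertices in the order their recursive call completes

    def visit(vertex):
        visited.add(vertex)
        for neighbour in graph.get(vertex, []):
            if neighbour not in visited:
                visit(neighbour)  # when this returns, we are "back" at vertex
        finished.append(vertex)   # only reached after all descendants are done

    for vertex in graph:
        if vertex not in visited:
            visit(vertex)
    return finished[::-1]  # reverse of finish order is a topological order


graph = {'A': ['C'], 'C': ['E'], 'E': ['H', 'F'], 'B': ['C', 'D'], 'D': ['F'], 'F': ['G']}
order = topological_order(graph)
```

With the graph from the question, G finishes, then control returns into the call for F, which finishes, then E, then C, then A, which is the "going back" seen in the prints.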
dang i definitely just missed something funny.. what sort of gifs were they spamming? also, why are mathematicians preoccupied with things like this:
Is this 3d representation of something 11d? I'd see why they are if so
let me get back to where i found it
its a stereographic projection of a dodecaplex
Also known as a three-manifold
http://www.gang.umass.edu/~kusner/other/3mfd.html
oh. is this the genesis of your username
factorial time is greater than polynomial time yeah?
n! > n^3?
Yup
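A quick numeric check of where the factorial overtakes the cubic, just as an illustration (the variable names are arbitrary):

```python
import math

# First n at which n! exceeds n**3; from there on the factorial pulls away fast.
crossover = next(n for n in range(1, 50) if math.factorial(n) > n**3)
table = [(n, math.factorial(n), n**3) for n in range(1, 9)]
```

The same happens against any fixed power of n, only with a later crossover point.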
I think of R^n as n-dimensional Euclidean space
So if you're talking about a real number, then you're referring to an element in R (or R^1), which is a 1-dimensional Euclidean space
Makes sense
Just got back home. Will be working on the workshop for school. Hopefully I get somewhere today
output for this is flat. not quite sure why.. hmm .reshape?
df10x = df10.x_coordinate.to_numpy()
df10y = df10.y_coordinate.to_numpy()
df10adj = np.sqrt((df10x - df10x.T)**2 + (df10y - df10y.T)**2)
np.asmatrix would do the work
df10x = np.asmatrix(...)
so, the pandas series gotta be turned asmatrix
hmm, i get a lot of nan
matrix([[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., 27.29493721,
nan, nan],
[ nan, nan, nan, ..., nan,
nan, nan],
...,
[ nan, 27.29493721, nan, ..., nan,
nan, 33.6148582 ],
[ nan, nan, nan, ..., nan,
nan, nan],
[ nan, nan, nan, ..., 33.6148582 ,
nan, nan]])
df10x = df10.x_coordinate.to_numpy()
df10y = df10.y_coordinate.to_numpy()
df10x = np.asmatrix(df10x)
df10y = np.asmatrix(df10y)
df10adj = np.sqrt((df10x - df10x.T)**2 + (df10y - df10y.T)**2)
df10adj
oh wait, I see what's going on
i also reduced the df to 50 elements, just for the prototyping part
**2 for matrix is a matrix multiplication
maybe there is a better way to turn the array 2d...
df10x = df10.x_coordinate.to_numpy()
df10y = df10.y_coordinate.to_numpy()
nx = len(df10x)
ny = len(df10y)
df10x = np.reshape(df10x, (nx,1))
df10y = np.reshape(df10y, (ny,1))
df10adj = np.sqrt((df10x - df10x.T)**2 + (df10y - df10y.T)**2)
df10adj
output
array([[ 0. , 4.06879027, 4.98027366, ..., 8.46070511,
7.68653398, 6.14089225],
[ 4.06879027, 0. , 5.82129779, ..., 11.52977429,
8.24517308, 2.33589784],
[ 4.98027366, 5.82129779, 0. , ..., 6.60271112,
2.73392075, 6.21347304],
...,
[ 8.46070511, 11.52977429, 6.60271112, ..., 0. ,
6.55712635, 12.58602939],
[ 7.68653398, 8.24517308, 2.73392075, ..., 6.55712635,
0. , 8.08790142],
[ 6.14089225, 2.33589784, 6.21347304, ..., 12.58602939,
8.08790142, 0. ]])
does that sound right?
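One way to sanity check a result like this, sketched with the small coordinates from the earlier eval rather than the real dataframe: `.T` on a 1-D array does nothing, which is why the first attempt came out flat, and adding an axis with `None` gives the same pairwise matrix as the reshape.

```python
import numpy as np

x = np.array([1.0, 2.0, 7.0, 4.0])
y = np.array([-3.0, 2.0, 1.0, 4.0])

flat = x - x.T  # 1-D minus itself: shape (4,), all zeros

# x[:, None] is a column, x[None, :] is a row; broadcasting builds all pairs.
dx = x[:, None] - x[None, :]
dy = y[:, None] - y[None, :]
dist = np.sqrt(dx**2 + dy**2)
```

A distance matrix should be square, symmetric, and zero on the diagonal, so those three properties make a cheap test.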
actually, there is also meshgrid which might be cleaner
x_mat, y_mat = np.meshgrid(x_vec, y_vec)
and then compute things with the x and y matrices
you're right
way cleaner
df10x = df10.x_coordinate.to_numpy()
df10y = df10.y_coordinate.to_numpy()
x_mat, y_mat = np.meshgrid(df10x, df10y)
df10adj = np.sqrt((x_mat - x_mat.T)**2 + (y_mat - y_mat.T)**2)
df10adj
Is it possible to generate nodes in a loop doubly linked lists
actually, would that even do the right thing...
I think not
you probably want the x - x.T
meshgrid doesn't quite do the same thing
both output looks the same, i think
they shouldn't 
the equivalent operation would be something like
x1,x2 = np.meshgrid(x, x)
x_dist = x1 - x2
y1,y2 = np.meshgrid(y, y)
y_dist = y1 - y2
np.sqrt(x_dist**2 + y_dist**2)
which is...not great
why do you even get a vector from pandas?
oh, is the thing you're converting not a dataframe?
but a column, or something?
I'd say this is probably clean enough
yeah df[x_coordinate] and the y equivalent, they are Series in pandas i think
apparently
```py
series.reset_index().to_numpy()
```
would work
reset_index turns it into a dataframe
so you should get a 2d array
oh wow look at the mess i made
oh interesting, let me check that one out
looks like the shape i get is 500,2
but
ValueError Traceback (most recent call last)
Cell In [56], line 10
2 df10y = df10.y_coordinate.reset_index().to_numpy()
4 # nx = len(df10x)
5 # ny = len(df10y)
6
7 # df10x = np.reshape(df10x, (nx, 1))
8 # df10y = np.reshape(df10y, (ny, 1))
---> 10 df10adj = np.sqrt((df10x - df10x.T)**2 + (df10y - df10y.T)**2)
11 df10adj
ValueError: operands could not be broadcast together with shapes (500,2) (2,500)
df10x = df10.x_coordinate.reset_index().to_numpy()
df10y = df10.y_coordinate.reset_index().to_numpy()
df10adj = np.sqrt((df10x - df10x.T)**2 + (df10y - df10y.T)**2)
df10adj
oh the index is thrown in
ok i cheated
df10x = df10.x_coordinate.reset_index().to_numpy()[:,1:]
df10y = df10.y_coordinate.reset_index().to_numpy()[:,1:]
added a slice at the end lol
im not sure how this works but i got node IDs too
so i somehow managed to make dijkstra work on my adjacency matrix. can i use it to compute the "optimal" path to steal from the banks
not really?
it's not a shortest path problem
did you try just implementing the greedy choices?
it's probably the best you can (easily) do
it feels like a constrained version of the longest path problem
and the longest path problem is NP-hard
Is this the right place to ask about trees? I'm unsure if I can ask it in #help channels because it is not really Python-exclusive.
Please delete if this is not the right place, but how do I know where to stop "tree-ing" the 1's in a Fibonacci tree? I am watching a tree introduction and this is how they make a Fibonacci tree with root 3
3
/
1 2
/\ /
0 1 1 1
/
0 1
How do I know when to stop? If I can divide the 1's into [0,1], then why is a Fibonacci tree not an infinite sequence of 0's and 1's
i edited the tree
so if i follow your 4th condition, then the correct tree is ...
-----3
/
1 2
/\ /
0 1 1 1
/\ /
0 1 0 1
in his code, if i am understanding it correctly, when the node key is one (or zero) and it is a root then it is a tree (a leaf) also
is it multithreaded or linear algorithms that are N/A to python?
bump
it is min by problem definition, but yes, that. i need to first understand what a vertex cover is. i am reading the wikipedia now
oh i get it. every edge has at least one endpoint node in the vertex cover:
vertex covers and min vertex covers
top and bottom
right
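A small sketch of that definition, assuming an undirected graph given as a list of edge pairs. Both function names are made up, and the minimum search is plain brute force, which is only sensible for tiny graphs:

```python
from itertools import combinations


def is_vertex_cover(edges, cover):
    # Every edge must have at least one endpoint inside the cover.
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)


def min_vertex_cover(edges):
    vertices = sorted({v for edge in edges for v in edge})
    for size in range(len(vertices) + 1):  # try the smallest sizes first
        for subset in combinations(vertices, size):
            if is_vertex_cover(edges, subset):
                return set(subset)


path = [("a", "b"), ("b", "c"), ("c", "d")]
smallest = min_vertex_cover(path)
```

Checking a proposed cover is cheap; finding a minimum one this way tries exponentially many subsets.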
What's cnf and what's p?
CNF = conjunctive normal form. the P is polynomial time reducible
hardness is shown by reductions from problems already known to be NP-hard
Oh thanks 🙂
the original NP-complete problem being circuit satisfiability
Try to understand why it's greater
i may have written it in the wrong direction
the typical name is 3SAT
Nah I get what you want to say, no worry
3CNF is the canonical form
3sat?
who talked about circuit sat?
I think it was me, by accident 😦
i was saying circuit sat is the original NP-complete problem from which others are derived
oh, I was complaining about you calling the problem 3CNF
Because it's 45 cnf
3-SAT is the usual name
In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can...
Nobody was against you dude xS
another name for 3SAT. got it
the SAT part is the actual problem
is this thing satisfiable
the 3-CNF is a restriction on the allowed input
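To make the "SAT is the problem, 3-CNF is the input format" split concrete, here is a brute-force sketch that walks the whole truth table. The encoding is an assumption of mine: a clause is a tuple of nonzero integers, where `k` means variable k and `-k` its negation.

```python
from itertools import product


def satisfying_assignment(clauses, n_vars):
    # Try every row of the truth table: 2**n_vars assignments.
    for bits in product([False, True], repeat=n_vars):
        if all(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return bits
    return None  # no row satisfies every clause


# (x1 or x2 or x3) and (not x1 or not x2 or x3) and (not x3 or x1 or x2)
formula = [(1, 2, 3), (-1, -2, 3), (-3, 1, 2)]
found = satisfying_assignment(formula, 3)
```

The checker is the easy part; the exponential loop over assignments is what nobody knows how to avoid in general.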
Tried to make a truth table?
That makes it quite clear I think
If I did not miss anything
(and that's one way of deriving it, you could really start with any problem)
the NP complete problems are fun in that if you can solve one, you can solve all
how about the problem of figuring out what's wrong with my mother-in-law, definitely NP complete
And it's the first way to understand in logic, as you have the boolean rules
And de Morgan of course
de Morgan was a slouch
Why! 😮 just know his laws for logic
i'm totally joking. as was i for the mother in law comment. i dont have a mother in law
c'mon this was pretty good
now i know how @haughty mountain felt when nobody appreciated the pi hexadecimal reconstruction 😦
OK, I think I miss some concept to understand this joke :/
which joke
😢
is your mother-in-law 3-SAT? ||because she's near impossible to satisfy||
lmao there we go
the lowest of humor
the nichest
it's a hard thing to cover
ahhhhh hahahaha
I know it's been a couple days but, want to know how uncompressed Pokemon data was serialized with a look up table.
would this be an appropriate channel to ask for an explanation for why one solution might be better than another?
probably
idk what their serialization process is, but the deserialization logic is simple enough
in your example was the raw_pkmn_data you imported made into a class? because I tried without classes and got an error at line parsed = dict(map(parse_name, raw_pkmn_data.pkmns.split('|'))) saying that raw_pkm_data doesnt have any pkmns attribute which makes sense because the raw data was just two strings in the original
i know why this doesnt work, but how would that pkmdata class work in yours, since you're accessing raw_pkmn_data attributes
oh, the pkmns, eggs, and types are from the js file
javacalc.html lines 533 to 534
```js
types = ['Normal','Fighting','Flying','Poison','Ground','Rock','Bug','Ghost','Steel','???','Fire','Water','Grass','Electric','Psychic','Ice','Dragon','Dark'];
eggs = ['???','Monster','Water1','Bug','Flying','Ground','Fairy','Plant','Humanshape','Water3','Mineral','Indeterminate','Water2','Ditto','Dragon','No Eggs'];
```
LegendaryPKMN.net’s Pokémon Individual Value & Stat Calculator. - ivcalc/javacalc.html at master · LegendaryPKMN/ivcalc
and of course the other lines below that
thanks, for pointing that out. though I don't think that answers my original question. like all that is necessary data but it doesn't address my original problem. how does raw_pkm_data have a pkms attribute if there all various strings/ lists (not a custom datatype)
it is just a string
like, raw_pkmn_data.py is just
their data
(ignore the list[...] typehint from my editor)
the actual code I have just puts their data in a nicer form
I just put all their raw data in a separate module
Oh I see, all good. Thanks for explaining!
I'm so used to .attribute being a class / data type thing I forgot you can just do that with global variables in python
Assume you have a set of jobs, each taking from time x to time y.
You can only do one at once and each must be completed their entire time.
How would you maximize the amount of time taken? (Not all jobs have to be completed, just maximize the total time)
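One possible approach, sketched under my own assumptions: jobs are `(start, end)` pairs, a job may start exactly when another ends, and the goal is the largest total busy time. This is the classic weighted interval scheduling dynamic program with each job's weight set to its duration; `max_busy_time` is a made-up name.

```python
from bisect import bisect_right


def max_busy_time(jobs):
    jobs = sorted(jobs, key=lambda job: job[1])  # order by finishing time
    ends = [end for _, end in jobs]
    # best[k] = best total time using only the first k jobs in that order
    best = [0] * (len(jobs) + 1)
    for k, (start, end) in enumerate(jobs, start=1):
        # How many of the earlier jobs finish no later than this one starts.
        compatible = bisect_right(ends, start, 0, k - 1)
        skip_it = best[k - 1]
        take_it = best[compatible] + (end - start)
        best[k] = max(skip_it, take_it)
    return best[-1]
```

Sorting plus one binary search per job keeps this at O(n log n).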
log(log n) ∈ o(log n) is true, right
(that's little-o, not big)
more informally, the fraction tends to zero
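A numeric illustration of "the fraction tends to zero" (not a proof, just a look at a few values; natural logs are assumed, though any base gives the same limit):

```python
import math

# log(log n) / log n for n = 10, 10**10, 10**100, 10**1000
ratios = [math.log(math.log(10**k)) / math.log(10**k) for k in (1, 10, 100, 1000)]
```

The ratio keeps shrinking, just slowly: substituting m = log n turns it into log(m) / m, which goes to zero as m grows.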
Because for j in range(0): does not even enter the code block indented below it
So it just executes print() and continues to the next i
Ok thank you
sorry to bother you again. ive tried studying / playing around with your code but I don't understand it (specifically, the parts where you're extracting the serialized data). the things that are throwing me off (they go hand in hand): 1. your use of the bytes datatype is wholly unfamiliar to me, so when you're manipulating those bytes variables i dont know what you're doing. 2. i don't know how you deserialized it. normally I could probably figure it out by myself but because it revolves around an unfamiliar datatype (bytes) i dont really know what to do
I admit to being a beginner. my intention in doing this was just to write my own simplified version. id have the data in regular data (lists of strings, or int arrays etc.) and use that to form the basis of everything. If I knew it involved everything else i wouldnt have done it because its too advanced for my level.
even if it's bloated as hell id like to start with the base stats for every pokemon as a normal list of an array of integers in the raw data and work from there. How would you just get all the base stats in national dex order ? Or is it mangled in such a way that makes that difficult
the bytes datatype isn't advanced, it's just a sequence of values in range 0-255
you could use a list instead
Their pk is based on a string of hexadecimal values separated by commas, which they split to get a list of strings. Every time they read something from pk they need to do a conversion from hex to int. I say just do the conversion once, since at the end of the day what's needed is the byte values.
I use bytes.fromhex but you could also do [int(value, 16) for value in pk]
the pokemon stats (and some other stuff) is represented as 12 characters in the pkmn string, so raw_data grabs the relevant 12 bytes
basically it's just 12 characters for each pokemon in dex order in pkmn
these characters are mapped to indices using mn, and the actual byte values are in pk
(it's a dumb encoding, idk why they do it)
in any case, my raw_data function deals with this nonsense and just gives you 12 values
12 bytes
the first 6 are just the base stats
the next 2 are primary/secondary type
the next 2 are egg groups
the last 2 encode the ev yield
the format is dumb, but believe me when I say my code to deal with it is a lot better than the js code that does this...
Oh I believe you hahaha. It's night for me and I'm mentally tired so I'll try tomorrow morning but this clears it up. Thanks!
as an example, the second set of 12 characters (index 1) is '2ρρ**21()B )'
it's the data for bulbasaur
lets look at the first char '2' which should correspond to hp
hi, how is going?
we see what index it has in mn
```py
#               1111111111
#     01234567890123456789
mn = ' !"#$&()*+,-./01234567...'
```
index 16
why 12 chars, not 16?
we look up the value at index 16 in pk
or 12 is just random?
which is 2D or in decimal 45
which is indeed bulbasaur's hp stat
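The lookup just walked through, as a small sketch. Big caveat: `mn` here is only the prefix quoted above, and `pk` is a placeholder table where just index 16 (`2D`) comes from this discussion; the other entries are filler, not the site's data. `decode_char` is a made-up helper name.

```python
# Prefix of the lookup string quoted above; the real one is longer.
mn = ' !"#$&()*+,-./012'
# Placeholder hex table: only the entry at index 16 is real, the zeros are filler.
pk = ["00"] * 16 + ["2D"]

# Convert the hex strings once, either to bytes or to a list of ints.
pk_bytes = bytes.fromhex("".join(pk))
pk_ints = [int(value, 16) for value in pk]


def decode_char(ch):
    # character -> index in mn -> byte value at that index in pk
    return pk_bytes[mn.index(ch)]


hp = decode_char("2")
```

Indexing a `bytes` object gives a plain int in range 0-255, so after the one-time conversion it behaves much like the list version.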
what is in pk?
you can look at the data here if you're interested
it's a bad format, but that's what they are using
12 is enough for the data they are storing

Hi all, I saw on Reddit that I’ll be better working with data sets if I’m comfortable with eigenvectors and eigenvalues.. what are those and how will that allow me to comprehend a given dataset better
remind me what is a doubly nested loop i can use to do an operation on all pairs of a list?
i think its something like:
for i in range(n):
for i+1 in range(n)
seems kinda random..idk how reasonable that advice is. but an eigenvector of a matrix is a vector that when multiplied by that matrix is only scaled by a constant factor
that constant factor is the eigenvalue for that eigenvector
you're probably thinking of
```py
for i in range(n):
    for j in range(i + 1, n):
        ...
```
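For comparison, a sketch showing that the nested index loop and `itertools.combinations` visit the same pairs (the `items` list is just sample data):

```python
from itertools import combinations

items = ["a", "b", "c"]

# Index version: j always starts one past i, so each pair appears once.
pairs_loop = []
n = len(items)
for i in range(n):
    for j in range(i + 1, n):
        pairs_loop.append((items[i], items[j]))

# Standard-library version of the same thing.
pairs_iter = list(combinations(items, 2))
```

The index form is handy when the positions themselves are needed, for example to fill a pairwise table.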
idk if it helps but for context the data would be in bioinformatics so like gene expression data
probably, yeah
i don't know anything about that ¯_(ツ)_/¯
but what i am reading is a list of lists where the first value of each list is the name and the second is the pairwise comparison data. so i'll modify
fair enough
gene expression data is really straightforward, you have a bunch of genes and their lvl of expression is the number of mRNA molecules that were counted. so you'll have:
geneA 1200
geneB 100
geneC 0
geneD 17
geneE 51
etc
conceptually eigenvalues and eigenvectors are easy
say you have some matrix M
then vectors v and constants λ such that
M v = λ v
are the eigenvectors and eigenvalues of the matrix
i.e., what are the vectors for which the linear transformation M is effectively just multiplying by a constant
😵💫
sounds like i'll need to do some reading to understand
struggling with my recent algo. cannot paste here i'll dm
basically in which directions does the data expand/contract
if you have an eigenvalue > 1 in some direction, vectors pointing in that direction will grow larger
< 1 and they would become smaller
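A tiny worked instance of M v = λ v in plain Python, with a 2x2 matrix I picked so the arithmetic is exact; `mat_vec` is a hypothetical helper, not a library call.

```python
def mat_vec(M, v):
    # Ordinary matrix-vector product, one dot product per row.
    return [sum(m * x for m, x in zip(row, v)) for row in M]


M = [[1.25, 0.75],
     [0.75, 1.25]]

stretched = mat_vec(M, [1.0, 1.0])    # eigenvector with eigenvalue 2
shrunk = mat_vec(M, [1.0, -1.0])      # eigenvector with eigenvalue 0.5
other = mat_vec(M, [1.0, 0.0])        # not an eigenvector: direction changes
```

Along (1, 1) the matrix doubles things, along (1, -1) it halves them, and any other vector is a mix of those two behaviours.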
what is the linear transformation M
oh
i'm thinking of a matrix to store data, not as a functional structure
perhaps inaccurately so
above: i can just add a bunch of append statements yeah
I would totally use yield there
and build a string
the function name is now totally misleading, but still
```py
def print_lcs(b, string_a, i, j):
    if i == 0 or j == 0:
        return
    if b[i-1][j-1] == 3:
        yield from print_lcs(b, string_a, i-1, j-1)
        yield string_a[i-1]
    elif b[i-1][j-1] == 2:
        yield from print_lcs(b, string_a, i-1, j)
    else:
        yield from print_lcs(b, string_a, i, j-1)
```
right i see what u mean. ok ill try
then something like ''.join(print_lcs(...))
so it's working great, just every char gets its own newline, which i do not want
data in out is a list of lists where element zero is string name and element [0][1] is string to compare
better to build a string and then printing it, doing some hacky prints in functions is generally a bad idea
i agree
but uhh.. do i need to use yield?
i can just make a string and add a bunch of append()s?
ok so i'll add those yield statements and then have each line wrapped in an append
?
wdym?
or with yield i don't need append
exactly
it returns the values you yield one by one
how do i yield them into a string
!e
```py
def f():
    yield "a"
    yield "b"
    yield "c"
print(f())
print(list(f()))
print("".join(f()))
```
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | <generator object f at 0x7f531971c880>
002 | ['a', 'b', 'c']
003 | abc
generator functions are great for building sequences like this
yield is what makes a generator function
oh oh you use it to confer generator functionality to any function you're writing i got it
yielding also makes the function lazy
!e
```py
def squares():
    i = 0
    while True:
        yield i**2
        i += 1
for sq in squares():
    if sq > 90:
        break
    print(sq)
```
@haughty mountain :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 0
002 | 1
003 | 4
004 | 9
005 | 16
006 | 25
007 | 36
008 | 49
009 | 64
010 | 81
oh wow it'd just keep on going
e.g. list(squares()) would be a bad idea
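Two standard-library ways to take a finite piece of an endless generator, sketched with the same `squares` function:

```python
from itertools import islice, takewhile


def squares():
    i = 0
    while True:
        yield i**2
        i += 1


first_five = list(islice(squares(), 5))                      # stop after 5 items
up_to_90 = list(takewhile(lambda sq: sq <= 90, squares()))   # stop at first sq > 90
```

Both stop pulling values as soon as their condition is met, so the infinite loop inside `squares` never gets the chance to run away.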
lol
i think the next thing i want to learn is using more than a single processor of my cpu while computing, or even use the GPU
parallelism in python? probably not
oof
using the GPU is (somewhat) straightforward. lots of modules for that
thats cool. i guess i dont really have any applications for that quite yet. although the strassen's matrix multiplier was really slow. could have been better there
i sure wonder what language you could do it in 🦀
||C ||
Try thinking of numbers not as "how much" but as actions, e.g. +3 -> add three or move three to the right, *3 -> scale up / stretch. Then consider M again as a more fancy number, you can multiply it with other stuff. So like with normal numbers, it does a transformation / action.
(data <-> code (data is code and code is data))
The easiest way in Python is via Numba.
ok thanks @opal oriole
you can't do any fancier stuff though
I remember in an HPC class we had a thread delegating work to worker threads and another thread writing to file as results came back
oh, that's actually efficient?
i always wonder if such designs are a good idea whenever I'm writing multithreaded anything
The threading Numba provides is not for things like IO, only numeric stuff, and it does so naively. But the performance gain to coding effort ratio is great.
actually I think there wasn't even active logic to delegate work, we set up a list of tasks protected by a mutex, and spun up a bunch of worker threads to take on tasks
and then the main thread became the writer thread
Optimal multithreading is kind of crazy on modern machines, it's rarely done.
It's called master-slave paradigm iirc, but I think that name is not allowed anymore 😛
not a fancy job-stealing queue? :p
but yes, as long as you can make threads work mostly independently you can have great speedups
wrong channel
granted, our single threaded code beat our professor's multithreaded target time 😛
I recommend always trying to squeeze more out of single core first, because modern cores are so fast (for less complexity).
but our stuff also scaled well with threads
we were actually bottlenecked by fprintf
Michael Isard, Derek Murray, and I recently sent in a HotOS submission (it’s not blind, so no harm talking about it, we think). The subject is hinted at from...
*Which funnily enough Python does faster due to buffering stuff.
i love how i have absolutely no idea what y'all are talking about
so we switched to mmap and manually printing integers
and then we were finally kinda limited by hardware speeds 😛
i think im going to have to learn AWS at some point. that's industry standard if the company doesn't have their own comp. cluster
there are others
it'd be cooler to build out a cluster and do all in-house computation 😛
thats what they did at an academic lab i was a part of. but when i was in industry they used AWS
let me just map-reduce a few petabytes of data with these 10k machines
The timing on that project is a bit rough as getting chips for such clusters shot up in price in the last couple of years. Raspberry PI for $300...
yeah i'd be interested which hardware was used to create the cluster
i just got a raspberry pi for a project for super cheap.. it was like $20 USD
(I saw discussion at work about some compute cluster actually running into issues with using 64 bit integers for addressable storage)
translate pls
Yeah that is happening more and more now. "Big Data" / everyone doing data science.
32 bit numbers can address ~4GB of data, 64 bits could address...a lot more
how much is it again?
4EiB?
16 I think.
correct
You have: 2**64 bytes
You want: EiB
* 16
/ 0.0625
yes, someone was questioning why utilities dealing with byte sizes were using 128 bit integers
and got the response that 64 bits actually started causing issues
👀
for reference, 16 EiB is 16777216 TiB
TiB?
Tebibytes, just to be pedantic about it being the base 2 version
wth are tibibytes
18446744 TB if you prefer that
so computers like base 2
our prefixes like kilo, mega, ... don't
like binary?
e.g. 1 kB = 1000 bytes
which is not that nice for computing
so it's common to use 1 KiB = 1024 bytes
kibibytes
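For anyone who wants to check the numbers, the arithmetic is just powers of 2 versus powers of 10. A quick sketch (the variable names are only for illustration):

```python
KIB, TIB, EIB = 2 ** 10, 2 ** 40, 2 ** 60   # binary prefixes
KB, TB = 10 ** 3, 10 ** 12                  # SI (decimal) prefixes

addressable_32 = 2 ** 32   # bytes reachable with a 32-bit address
addressable_64 = 2 ** 64   # bytes reachable with a 64-bit address

gib_32 = addressable_32 // 2 ** 30   # 4 GiB
eib_64 = addressable_64 // EIB       # 16 EiB
tib_64 = addressable_64 // TIB       # 16777216 TiB
tb_64 = addressable_64 // TB         # 18446744 TB (rounded down)
```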
ohh got it
kb used to mean 1024, but they changed it because it confused consumers when they saw the numbers on the boxes.
yeah...
well yeah
1kb in my world is 1k base pairs
anyone know how to create a histogram from an image?
harddrive manufacturers love the base 10 version
because they can claim higher numbers
And so depending on who you ask, they will tell you that kb is still 1024, if they are being stubborn or depending on context (e.g. a kernel's code).
i have all SSDs in my PC 🙂
or you are the pedantic nerd in the room who uses the base 2 prefixes
i agree it should be changed, you cannot claim kilo is anything other than 1000 of something
(Like with math, just define it beforehand, then continue)
kilogram, kilodalton, kilometer
kilogram being the SI unit is fun
i usually work in terms of like micromolar or nM (millionths or billionths of a mol / L)
μM = micromolar
although that world will be long forgotten if i continue the computational route
sry, not on topic
Unless you start doing simulations or other tools for that.
In which DS&A come up way more than other kinds of programming.
i doubt i'll go into chemical informatics but i suppose it's possible. i've had 4 semesters of chemistry and i was really good at organic which a lot of people flounder at
let me see if I can find some old stuff from my HPC course
the threading task I was talking about was about generating Newton fractals
we learned that writing the complex math by hand was 2x faster :^)
and terrible to write and look at
looking at the current site for the course (which has changed over time), running our code single threaded is about at the limit for 10 threads
woah
actually, maybe I can find the original constraints we had
how was Gothenburg
I liked it
i was asking about why mathematicians are so preoccupied with shapes like above the other day
ah, found the old thing http://www.math.chalmers.se/Math/Grundutb/CTH/tma881/1617/assignments.html#optimization
I'm re-running our old code on my slightly more modern hardware
for fun
-rw-r--r-- 1 algmyr algmyr 7.2G Nov 24 22:39 /tmp/newton_attractors_x7.ppm
-rw-r--r-- 1 algmyr algmyr 8.5G Nov 24 22:39 /tmp/newton_convergence_x7.ppm
the file sizes for the 50k lines are a bit chunky
it generates image files for the fractals yeah
50k x 50k pixels
which is kinda ridiculous
woah
the 1 thread version is almost finished
The Newton fractal is a boundary set in the complex plane which is characterized by Newton's method applied to a fixed polynomial p(Z) ∈ ℂ[Z] or transcendental function. It is the Julia set of the meromorphic function z ↦ z − p(z)/p′(z) which is given by Newton's method.
lol, these times are great
            1000     50000
1 thread    0.130s   307.7s
10 threads  0.021s   40.91s
Who knew root-finding could be so complicated?
Next part: https://youtu.be/LqbZpur38nw
Special thanks to the following supporters: https://3b1b.co/lessons/newtons-fractal#thanks
An equally valuable form of support is to simply share the videos.
Interactive for this video:
https://www.3blue1brown.com/lessons/newtons-fractal
...
our aim for all tasks was basically to beat the target time with one thread
and I think we succeeded basically all the time
Here is the short report we wrote, we used some cute math tricks
http://algmyr.se/upload/newton.pdf
(it's not a long report)
mathematical cunningness and laziness
lmao
can you really put stuff like that in your academic work
this is just a hand-in, not like an article 😛
(and this was probably written at some point in the middle of the night)
what did we misspell?
ah
a race condition is when two (or more) threads are working on the same data at the same time
z should be z_n in this formula
lol
lemme just critique the spelling of your years-old work real quick 😛
what is mutex
I think the convergence trick was quite neat
cheaper, and requires no knowledge about the exact answer
my spelling comment was more to confirm "yes, i can tell it was written in the middle of the night as you have just stated" 😛
hate ∆ as a variable name tbh, I keep reading "laplacian, wtf??"
mutex stands for mutual exclusion
basically, only one thing can hold the mutex at once
think of it as only being allowed to do things when you hold a specific object
you pick it up, do your work, and put it down
if the thing is already picked up, sucks to be you, you have to wait
sometimes during parties people will have a speaking stick (some random object) and only the person holding it is allowed to speak
sounds like that
a bit
I think we avoided most use of mutexes in this code
it's basically only used when picking up new tasks
so multiple threads don't pick up the same task
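The original was C, but the same pattern can be sketched in Python with `threading.Lock`. This is only an illustration of "hold the mutex just long enough to grab a task"; `run_workers` and its parameters are made-up names, not anything from the course code:

```python
import threading


def run_workers(tasks, work, n_threads=4):
    tasks = list(tasks)
    results = [None] * len(tasks)
    lock = threading.Lock()   # the mutex guarding the shared task counter
    next_index = 0

    def worker():
        nonlocal next_index
        while True:
            with lock:                 # pick up the "speaking stick"
                if next_index >= len(tasks):
                    return             # nothing left to do
                i = next_index
                next_index += 1
            # the actual work happens outside the lock, so threads don't wait on each other
            results[i] = work(tasks[i])

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each thread writes to its own slot of `results`, so only the task pickup needs the lock. In CPython this won't speed up CPU-bound work because of the GIL; it just shows the locking pattern.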
we probably should have used δ instead, but annoying people with Δ is fun too
this is good life advice for programmers
when does a linear approximation to the function around a value x equal zero
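That question is the whole Newton step: the tangent line at z hits zero at z − p(z)/p′(z). A minimal sketch for p(z) = z^degree − 1, assuming a simple "step got small" stopping rule (this is not the convergence trick from the report, and `newton_root` is a made-up name):

```python
def newton_root(z, degree=3, tol=1e-9, max_iter=100):
    """Iterate Newton's method on p(z) = z**degree - 1 starting from z."""
    for n in range(max_iter):
        if z == 0:
            return None, n            # derivative is zero, the step is undefined
        # where the linear approximation around z crosses zero: z - p(z)/p'(z)
        step = (z ** degree - 1) / (degree * z ** (degree - 1))
        z -= step
        if abs(step) < tol:
            return z, n + 1           # converged: (root, number of iterations)
    return None, max_iter             # did not converge within max_iter
```

Colouring each starting point by which root it ends at (and by the iteration count) is what produces the attractor and convergence images.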
yeah i suffer from trying to develop the plan within an IDE.. need to work out on paper first
although it somehow worked today to get some code running
sometimes i can see what i'm trying to do better when all the variables are at hand
great stuff, thanks
rare usage of the because symbol
I think i'd use impliedby for that 😛
Write a C program sum that computes naively and outputs the sum of the first billion integers. The makefile should contain
already too busy screaming, brb
nope.jpg
the last task was very dumb
solve a Dijkstra problem with distributed computing
interesting tasks though, I wonder which of them make sense in rust
which is very much BS since Dijkstra really doesn't benefit from it
something something scalability but at what cost
I think the intended thing was to find the min in a distributed way
but...we could just use a priority queue...
one of the best academia things i ever solved was determining structures of chemicals from proton NMR data. a classmate of mine and I were working on some homework and perfectly drew the structure which was a cyclic structure with a bridge on the top
wish i could find that problem
its buried in an organic chem book somewhere
we did actually make use of the distributed thing though, just because
You can solve the problem quicker if you assume that the shortest path is below some limit, it means you can also throw away a lot of the edges. So start different computers with different assumptions about the shortest path
and whichever computer gives an answer first we take
and kill the other computers
for another one of the tasks we tried to implement fft on a GPU instead of using the GPU for computing convolutions
sadly we had precision issues that we couldn't resolve in time
we had a working impl of fft on a gpu though
i'm watching this 3blue1brown vid and i have no idea what he's on about
fft?
oh something fourier transform?
my cs prof always talks about how one of his former students scanned his wife's sinuses and he could see all the layers in real time with an MRI or something and that back in the day it would take a week to process
fun application of... that's not even a fourier transform, just a single-frequency component of it... is synchronous detection. If you want to detect a signal with known frequency, you can do that very well even when you have a lot (orders of magnitude more than the signal) of noise.
iirc you can extract a single frequency quicker than by doing a full fft as well
yeah, sure, fft is n log n, you can do one in n
we're getting to the point where ultrasounds are cheap enough now to have one in the house and look at things when you're having pain, and yet, interpreting the data probably requires a medical degree
you basically just compute
average = signal.mean()
amplitude = np.hypot(np.mean(signal * np.sin(freq*time)), np.mean(signal * np.cos(freq*time)))
Gilbert Strang, author of the classic textbook Linear Algebra and Its Applications, once referred to the fast Fourier transform, or FFT, as “the most important numerical algorithm in our lifetime.”
wth is a discrete fourier transform
and this can get you insane sensitivity to differences in frequency with long enough "exposure" (number of points) - basically all other frequencies get filtered away.
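Here's a pure-Python version of the same computation so it runs without NumPy. Two things to flag: `freq` is an angular frequency (rad/s) as in the lines above, and I've added a factor of 2, because the mean of A·sin(ωt+φ)·sin(ωt) over whole periods is (A/2)·cos φ, so the bare hypot gives half the amplitude. The test signal and noise level are invented for the demo:

```python
import math
import random


def single_frequency_amplitude(signal, times, freq):
    # Correlate with sin and cos at the known frequency and combine the two.
    n = len(signal)
    s = sum(x * math.sin(freq * t) for x, t in zip(signal, times)) / n
    c = sum(x * math.cos(freq * t) for x, t in zip(signal, times)) / n
    return 2 * math.hypot(s, c)


# Demo: a 0.5-amplitude sine at 50 Hz buried in Gaussian noise with sigma = 2.
rng = random.Random(0)
freq = 2 * math.pi * 50                   # angular frequency in rad/s
times = [k * 0.001 for k in range(100_000)]
signal = [0.5 * math.sin(freq * t + 0.3) + rng.gauss(0, 2) for t in times]
estimate = single_frequency_amplitude(signal, times, freq)
```

With 100k samples the noise contribution averages down to roughly the 0.01 level, so the estimate lands close to 0.5 even though the noise is several times larger than the signal.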
its pretty interesting how in most modern tech, multiple different fields are converging
a discrete version of the usual fourier transform :^)
well, in digital computing you don't get continuous signals - you measure the signal once every 5 microseconds or whatever and get a stream of values like that.
the fourier transform but discrete
so you need to adapt the fourier transform math to work with sums rather than integrals. That's the DFT.
but as a more serious answer, one view of fourier transforms is that you can go from a time representation of a signal (e.g. a waveform of sound) to an equivalent frequency representation of a signal
an interesting view of DFT is in the context of polynomials, where the regular representation is the coefficients of the polynomial, and the transformed version is a bunch of function values
i'm reading the discrete fourier transform wiki and my head is spinning
this also leads to fun consequences, multiplying polynomials by only knowing the coefficients is expensive
but if I knew function values it would be trivial
so you can use fft to transform into function values, multiply, and then transform back
avoiding the usual expensive O(n^2) multiplication
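A small sketch of that trick with a textbook recursive radix-2 FFT. It assumes integer coefficients (the result is rounded at the end) and pads to a power of two; `fft` and `multiply` are just illustrative names, and a real implementation would use an iterative FFT or a library:

```python
import cmath


def fft(a, invert=False):
    # Recursive radix-2 FFT; len(a) must be a power of two.
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def multiply(p, q):
    # p, q: coefficient lists, lowest degree first.
    result_len = len(p) + len(q) - 1
    size = 1
    while size < result_len:
        size *= 2
    fp = fft(list(p) + [0] * (size - len(p)))      # coefficients -> function values
    fq = fft(list(q) + [0] * (size - len(q)))
    pointwise = [x * y for x, y in zip(fp, fq)]    # multiplying values is trivial
    back = fft(pointwise, invert=True)             # function values -> coefficients
    return [round((v / size).real) for v in back[:result_len]]
```

For example, (1 + 2x + 3x²)(4 + 5x) = 4 + 13x + 22x² + 15x³, which is what `multiply([1, 2, 3], [4, 5])` works out to.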
in the context of the task we were supposed to do on the GPU, it was basically do a heat transfer simulation, which boiled down to doing a convolution against a specific kernel over and over
being more math savvy we immediately saw that we could do this much faster if we used fourier transform
heat transfer -> boiled down
i wonder if you can literally just... ask some distributed computing library like opencl here to implement a convolution as a fourier transform
because convolution is just regular pointwise multiplication in the transformed world, which can be computed quickly even for a huge number of iterations
since it's a very normal thing to do
I really doubt it would be faster than doing it on a cpu
ah, that's fair
we wanted to do it on a gpu because we could
...and because the assignment asks for it 😛
for one interpretation of asks
the intended solution was for sure just to do the convolutions
I was going to say that scipy's convolve uses fft, but looking at the code... looks like only the signal one does
and the ndimage one doesn't
at least, can't easily see it
I can recommend looking at Kahan summation, it's a nice technique
ah, the thing math.fsum does presumably
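For reference, Kahan summation itself is only a few lines. As far as I know `math.fsum` is a related but different technique (it keeps several exact partial sums rather than one compensation term), so this is a sketch of Kahan specifically, not of what `fsum` does internally:

```python
import math


def kahan_sum(values):
    total = 0.0
    compensation = 0.0                    # running estimate of the lost low-order bits
    for x in values:
        y = x - compensation              # re-inject what was lost last time
        t = total + y                     # low-order bits of y may be lost here
        compensation = (t - total) - y    # recover (the negative of) what was lost
        total = t
    return total
```

It costs a handful of extra floating point operations per element and keeps the rounding error roughly independent of the number of terms.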
and yeah, consumer(gaming) GPUs are usually so slow at doubles it's not worth it
like, it depends on how they are split between 32-bit and 64-bit processing units AFAIK, and GPUs for graphics generally lean hard 32-bit (ones made for compute generally have a more even split)
I think our main problems were that this was basically the first time we wrote an fft
so we made some dumb mistakes
ah, fair
we had so much fun doing stupid stuff in this course
a thousand-line FFT in C and OpenCL is an interesting definition of fun 😛
it's the friends you made along the way 🥺
We were also super pedantic in one task about getting exact solutions
the cell distance one on the website
iirc the professor's solution wasn't even exact
Just a slight brag about beating the professor's reference solution while also guaranteeing perfect accuracy
Turns out using a float sqrt and correcting the small error afterwards is a viable thing to do for the input we had. So we could process twice the values at a time in our SIMD
I also suspect we have some UB in the code
needless to say this was probably one of my favorite courses since I ended up getting the opportunity of doing dumb algorithms, math and low level optimizations with a friend of mine 😄
Hello everyone!
I came across this server as I like python (not favorite lang) but have gotten exposure to
and
during data mining and also like writing python code for some alg problems!
For me it's the other way around. I don't like the use of capital delta for the Laplacian.
Anyone got good recommendations to learn data structures and algorithms of python??
Check the pins
What are the potential use cases of using a column as an index in a Pandas dataframe while also keeping the original column, e.g.
df = pd.DataFrame(data)
df = df.set_index("Name", drop = False)
Is this useful in some obscure edge cases or is my imagination too limited?
sometimes you have two unique columns, like maybe user ids and user names, and want to switch which one is the index without dropping the other, perhaps
DS courses are typically language agnostic
can someone help me implement a flag to include an optional output
ah shit nvm. i need to be able to handle input that is both one and multiple lines
I showed you this some time ago I'm pretty sure
!main
if __name__ == '__main__'
This is a statement that is only true if the module (your source code) it appears in is being run directly, as opposed to being imported into another module. When you run your module, the __name__ special variable is automatically set to the string '__main__'. Conversely, when you import that same module into a different one, and run that, __name__ is instead set to the filename of your module minus the .py extension.
Example
# foo.py
print('spam')
if __name__ == '__main__':
    print('eggs')
If you run the above module foo.py directly, both 'spam' and 'eggs' will be printed. Now consider this next example:
# bar.py
import foo
If you run this module named bar.py, it will execute the code in foo.py. First it will print 'spam', and then the if statement will fail, because __name__ will now be the string 'foo'.
Why would I do this?
• Your module is a library, but also has a special case where it can be run directly
• Your module is a library and you want to safeguard it against people running it directly (like what pip does)
• Your module is the main program, but has unit tests and the testing framework works by importing your module, and you want to avoid having your main code run during the test
"This is a statement that is only true if the module (your source code) it appears in is being run directly"
ok that seems to always be the case for my progs
sooo my program was working beautifully but i think it falls apart when one of my input sequences spans multiple lines
bc of the way i am reading the input
here is how i am reading input:
which works great for my example input, where each string is on its own line. however, i wanted to test my program with much longer sequences (actual gene sequences from different species) and it breaks
so im wondering how i can write it to handle both single-lined strings and strings which span many lines
🤔
maybe like `if char == '\n': pass`
or some logic like that
what's the input format?
text file that looks like this
or in my test case, each string declared spanned many lines
that's what I was asking, an example of that?
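The actual input format never got posted, so this is a guess: assuming something FASTA-like where each record starts with a `>name` header line and the sequence may be wrapped over any number of lines, a reader could look roughly like this (`read_records` is a made-up name):

```python
def read_records(lines):
    """Return [[name, sequence], ...] from FASTA-style lines (ASSUMED format)."""
    records = []
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue                       # ignore blank lines
        if line.startswith(">"):           # assumption: '>' marks a new record
            if name is not None:
                records.append([name, "".join(chunks)])
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)            # sequence continues on this line
    if name is not None:
        records.append([name, "".join(chunks)])
    return records
```

Since an open file iterates line by line, `read_records(open(path))` handles one-line and many-line sequences the same way.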
yo guys
does anyone know how to do this?
it seems pretty straightforward. just sum up the transfers in and out and the fee
can you give me code for checking the condition on calculating the fee?
what have you tried so far?
comparing datetime objs and calculating that way but it's not dynamic enough
and I don't think that's how you're supposed to do it in actual interviews
wdym "it's not dynamic enough"
yeah, that looks wrong
it's not 10/mo, but 10 total since there's 2 months
ah
it's asking "at the end of the year 2020", not at the end of the data
so that's 12 * 5, which accounts for the 60
so it auto applies a fee of 60$/year unless transactions are made 3x within that month
do you recommend placing things in a dict by dict, if not, how should I compare dt objects
i don't see how you'd use a dict for this, unless you mean using the months as keys?
yeah
why not a list
how would I compare dt objects
i just wouldn't use dt objects. i would just get the month out of the string directly
and you'd only have dates, not datetimes
thanks, I get it since you explained it quite well. my question is out of curiosity what would be a good way to serialize it/compress it?
Hello
Has anyone heard of sloot digital coding system? So my friend randomly saw this video on YouTube youtu.be/KOvoD1upTxM and he sought after making the algorithm himself. From what I read jan sloot's algorithm was mathematically not possible but somehow my friend has come up with a very basic code that as per him applies the same theory. Now we want to convert it for large scale picture and documents for compression. But again as I mentioned before lot of ppl say that it's mathematically impossible. Here is his code:
We wanted to work on it so that we can offer a service that provides compression by a factor of 10
Now do note that the video he looked up was a Google developers video that was posted on the 1st of April so we are in the blue whether the thing is possible or not but since this code has been working he has been very adamant that the theory works
seems like a joke to me, though it's technically possible
it's amazingly hard to try to compress anything of non-trivial length
it's kind of the best kind of programming joke, it's technically correct but wildly wrong in all other ways
So it works with small amount of data but say with a 4k image it goes brr
Can you explain in a bit more detail please
so let's do a binary stream for simplicity, you want to have a seed that ends up generating the N bits you want
you would expect to have to try something like 2^N seeds to find one that works
and it's very clear in the video that the larger examples are generated by inverting the process
as in, just generate a bunch of random values
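To see the 2^N problem concretely, here is a toy brute-force seed search. It uses Python's `random.Random` as a stand-in generator, which is my assumption and not whatever the posted code does:

```python
import random


def find_seed(target_bits, n_bits, max_tries=1_000_000):
    # Try seeds 0, 1, 2, ... until the generator's first n_bits match the target.
    for seed in range(max_tries):
        if random.Random(seed).getrandbits(n_bits) == target_bits:
            return seed
    return None
```

For 8 bits this takes on the order of 2^8 tries, and every extra bit doubles that. On top of that, the seed you find typically needs about as many bits to write down as the data it "compresses", so nothing is gained.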
It’s not
wrong inequality, let me fix
Let’s say we have 10 - 1 >= 10 - 5 and 1 is log2 and 5 is 1/2logn that’s true
try n=1

you just need to say that it's valid for n greater than some value
Yeah let me find it
~~you could also just do
n/2 log n/2 <= n log n
and be done (you still need a lower bound, but whatever)~~
errr
n can’t be 0
the top inequality is what your step assumes
I meant 1
my bad
n=1 gives that inequality
log 1 = 0
n=4 is the cutoff yes
not that it matters much what the exact cutoff is
it just needs to be some finite number
Like a single number
finite number as in some constant < infinity
anyone knows how to write this in a pythonic way? xd
floors = {
"Zone 1": [1,2,3,4,5,6,7,8,9,10],
"Zone 2": [11,12,13,14,15,16,17,18,19,20],
"Zone 3": [21,22,23,24,25,26,27,28,29,30],
"Zone 4": [31,32,33,34,35,36,37,38,39,40],
"Zone 5": [41,42,43,44,45,46,47,48,49,50],
"Zone 6": [51,52,53,54,55,56,57,58,59,60],
"Zone 7": [61,62,63,64,65,66,67,68,69,70],
"Zone 8": [71,72,73,74,75,76,77,78,79,80],
"Zone 9": [81,82,83,84,85,86,87,88,89,90],
"Zone 10": [91,92,93,94,95,96,97,98,99,100]
}
dunno if i'd call it pythonic, but here
!e
floors = {f'Zone {z}': [*range(10*(z-1) + 1, 10*z + 1)] for z in range(1, 11)}
print(floors)
@covert thorn :white_check_mark: Your 3.11 eval job has completed with return code 0.
{'Zone 1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'Zone 2': [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], 'Zone 3': [21, 22, 23, 24, 25, 26, 27, 28, 29, 30], 'Zone 4': [31, 32, 33, 34, 35, 36, 37, 38, 39, 40], 'Zone 5': [41, 42, 43, 44, 45, 46, 47, 48, 49, 50], 'Zone 6': [51, 52, 53, 54, 55, 56, 57, 58, 59, 60], 'Zone 7': [61, 62, 63, 64, 65, 66, 67, 68, 69, 70], 'Zone 8': [71, 72, 73, 74, 75, 76, 77, 78, 79, 80], 'Zone 9': [81, 82, 83, 84, 85, 86, 87, 88, 89, 90], 'Zone 10': [91, 92, 93, 94, 95, 96, 97, 98, 99, 100]}
its ok, its perfect
thats a smart way of going around it
thank you lol
I'm curious, is there a simpler (or more pythonic) way to do this?
created = []
for attrs in data: # data type: list[list[str]]
    if len(attrs) >= 3:
        work = Work(*attrs) # initialize object
        if work.save(): # returns bool
            created.append(work)
Will this do the same trick?
created = [work for attrs in data if len(attrs) >= 3 and (work := Work(*attrs)).save()]
yes, but I like the first more tbh
Because of the controversy with the walrus operator or because it only works on 3.8+?
Just because it's harder to read, actually.
although I'm personally tempted to write it like something like
creates = SI(data).filter(len(_)>=3).map(Work(*_)).filter(_.save()).tolist()
which uses https://github.com/kachayev/fn.py, so your mileage may vary
Ooooo thanks!
If you want other people to understand your code, I advise you to choose this version. (Often, the other person is a future version of yourself)
can anyone help me w algo coursework??
Hello
Is there a way to do such a thing in python, call a function by its name stored in a variable
x = "bool"
l = x(0)
print(l) # Should print False
yes, but why
!e this, also off-topic
x = "bool"
l = locals()[x](0)
print(l)
@stray fractal :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 2, in <module>
003 | KeyError: 'bool'
nvm
I'm testing a function where I want to edit some app parameters, and I want to pass the new value and a type to help me convert (cast) in case I need to
You'd use vars, not locals
Nevermind
thank u
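For the record, the `KeyError` above is because `bool` is a builtin, so it doesn't live in the module's own namespace. Two ways that do work, sketched with made-up helper names: look it up on the `builtins` module, or (usually safer for casting user-supplied values) keep an explicit mapping of allowed types:

```python
import builtins


def call_builtin(name, *args):
    # Look the function up by name on the builtins module.
    func = getattr(builtins, name)
    return func(*args)


# Explicit whitelist: only these names can ever be called.
CASTS = {"bool": bool, "int": int, "float": float, "str": str}


def cast(value, type_name):
    return CASTS[type_name](value)
```

The mapping version fails with a clear `KeyError` on anything unexpected, instead of happily calling whatever builtin the string happens to name.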
how to cut equal pieces in cake using python
can someone look at this: https://discord.com/channels/267624335836053506/1035199133436354600
check if tuple with range is inside list
how can i just convert this timestamp to say yy:mm:day
it is too specific
i get a bad line plot
what is the most concise way to write if character is not equivalent to one of the following several characters
are you sure the timestamps are sorted correctly ?
im struggling with my __repr__ method for a custom error class
idk why it's returning as a tuple
i would like a string
hello!
👋
i want the compiler keep asking use to input until the user is done
user*
when the user press enter whith no input
thats when he is done
while loop?
don't know how it's going to help
i know how it works but couldn't phrase it with my condition
stick around someone will be able to help you better
welp. ok
i think input is missing some braces above
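A while loop does cover it: keep reading, stop when the line comes back empty. A minimal sketch, with the reader passed in as a parameter (`read_line` is my addition so it can be tried without typing; by default it is the normal `input`):

```python
def read_until_blank(read_line=input):
    entries = []
    while True:
        line = read_line()
        if line == "":        # user pressed enter with no input: done
            break
        entries.append(line)
    return entries
```

Called as `read_until_blank()` it prompts on the terminal and returns everything typed before the first empty line.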
anyone can help with my custom error __repr__? idk why its printing as a tuple like this:
('my error message', 'error_character')
sry its a custom exception
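The class never got posted, so this is a guess at the cause: when an exception is created with two arguments, `Exception` stores both in `args`, and `str()` of it shows the whole tuple. Passing one formatted message to `super().__init__` avoids that. `InvalidCharacterError` is a made-up name:

```python
class InvalidCharacterError(Exception):
    def __init__(self, message, character):
        # Hand the base class ONE string so str(err) is a plain message.
        super().__init__(f"{message}: {character!r}")
        self.character = character     # keep the raw value around for later use


plain = str(Exception("my error message", "error_character"))
fixed = str(InvalidCharacterError("my error message", "X"))
```

Here `plain` is the tuple-looking string from the question, while `fixed` reads as a normal message.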
Hi. Is there a person who might know how to access a 2D list using a tuple?
like using (0, 0) to access 5 in [[5, 3], [6, 7]]
you don't use a tuple, you use both indices:
y = [[5,3], [6,7]]
if you want 5:
print(y[0][0])
can you show the class definition?
nevermind, i figured it out
ahh nice
I know the normal mode but i need to access it using a tuple
why must you use a tuple
if you absolutely have to do that, you probably need to convert it to the proper access syntax somehow
You can do something weird with a reduce function and calling it on the tuple
Other than that, I don't think python supports tuples for array indexing
I don't think it will, the unpacking won't work as intended
my input to program is like this
[(0, 0), (3, 3)]
and these are the items in my matrices so i should access to y[0][0] or y[3][3]
something like that
what universe posted will work then
now this will work but i want to modify them too
modify what
y[0][0] for example if it is 0 i want to change it to 19
just add some if statements to the code above
if a == 0: a = 19
oh you mean the value at y[0][0]
yup
y[a][b] = 19
not like that
okay i think i figured out a way thanks for your help
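In case it's useful to anyone else: plain lists don't accept a tuple index, but unpacking the tuple (or the reduce trick mentioned above) gets you there, and it works for assignment too. `get_at` and `set_at` are illustrative names:

```python
from functools import reduce
from operator import getitem


def get_at(nested, index):
    # get_at(y, (0, 0)) is y[0][0]; works for any nesting depth
    return reduce(getitem, index, nested)


def set_at(nested, index, value):
    # walk down to the innermost list, then assign into it
    *parents, last = index
    reduce(getitem, parents, nested)[last] = value
```

For the plain 2D case, `r, c = pos` followed by `y[r][c] = 19` does the same thing with less machinery.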
hmm right now i am catching an error and aborting the program, it'd be nice to instead just remove the sequence with the error and run the others as usual
Can Python's set hashing have hash collisions the same way a dictionary can?
yes
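You can force the situation to see that sets cope with it: give every object the same hash and the set still tells them apart using `__eq__`. `Colliding` is a toy class for the demo:

```python
class Colliding:
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1                      # every instance collides on purpose

    def __eq__(self, other):
        return isinstance(other, Colliding) and self.value == other.value


items = {Colliding(1), Colliding(2), Colliding(1)}
distinct = len(items)
```

Collisions only cost speed (lookups degrade towards a linear scan); correctness comes from the equality check.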
is it possible to remove list elements while iterating over a list and would this not ruin the notion of a list index
e.g.,
if i am running a loop from i to len(list), but then remove an element from the list during the loop
iterate in reverse order
!e
xs = ["a", "b", "c", "d", "e"]
for i in range(5):
    print(i, xs[i])
    xs.pop(i)
@jolly mortar :x: Your 3.11 eval job has completed with return code 1.
001 | 0 a
002 | 1 c
003 | 2 e
004 | Traceback (most recent call last):
005 | File "<string>", line 3, in <module>
006 | IndexError: list index out of range
!e
xs = ["a", "b", "c", "d", "e"]
for i in range(4, -1, -1):
    print(i, xs[i])
    xs.pop(i)
@jolly mortar :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 4 e
002 | 3 d
003 | 2 c
004 | 1 b
005 | 0 a
arguably a simpler way is to just make a new list for the elements you want to keep instead of mutating the current one
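A quick sketch of the new-list approach. `keep_valid` and `is_valid` are placeholder names; the predicate is whatever check you'd otherwise use to decide on a removal:

```python
def keep_valid(items, is_valid):
    # Build a new list instead of popping from the one being iterated.
    return [item for item in items if is_valid(item)]


xs = ["a", "b", "c", "d", "e"]
kept = keep_valid(xs, lambda x: x not in {"b", "d"})

# If other code holds a reference to the same list object, replace its
# contents in place with slice assignment instead of rebinding the name.
xs[:] = kept
```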
range is stop-exclusive, so the last thing it produces is 0
yeah its kind of tough bc there is error handling and i already have a list from the text file input that i'd like to just remove stuff from when I catch an error
an error being that one of the chars in the string is not in the language under consideration
can you not continue or pass after an exception is raised?
in a loop
!e
xs = ["a", "b", "c", "d", "e"]
for i in reversed(range(5)):
    print(i, xs[i])
    xs.pop(i)
@opal oriole :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | 4 e
002 | 3 d
003 | 2 c
004 | 1 b
005 | 0 a
agree
If you catch it.
Oof
Do you need to remove it or just skip that iteration?
If just skip then a continue in the except.
If remove, it's easier to do what hsop said and generate a new list.
It's effectively the same as the exception skip version, but if not skipped, it adds to the new list.
So it's a copy that sometimes skips on exception.
This specific task is actually not so simple (depending on language and more).
If by language you mean spoken language.
No the language is just a series of characters
So you have a set of values and are checking if it's in there?
I do the other way around. ‘if char not in ‘stringwitheverychar’’
raise exception. I also want to remove that string (which is the second item in a list) from consideration from the master 2D list
So you have a list of strings and if the string contains an invalid character remove it from the list?
Not exactly. Let me get on my PC 1s
hey guys, sorry if this is the wrong section to post in. but i’m wondering the most efficient way to search and extract from 46 files in a folder / directory. basically i need to extract the files only with specific peak absorbances
do the ones with peak absorbances have a commonality to their file name?
nope, the files are just named based on their dates and sample specification
what is the file type
and the specific peak absorbance is, a range?
While you could raise an exception for this, it may be easier to just return a boolean (is valid) and then either process it or skip it.
🤔
Whether to use an exception or not depends on the code. One of the main things about exceptions is that they can be passed up (propagated).
so, i need to extract files with peak absorbances at 451, 434, 320, 271. this is to identify the files which contain peaks for a specific molecule so i can plot the spectra
"raise an exception so i can print to file which invalid character was detected" - You could do all the normal processing in the try and in the except do that printing.
Or if you want to do the printing later, add those exceptions to a list.
@rigid gyro this sounds easily amenable to automation. you'll just write a python program to iterate through each line in each file in your directory looking for the peak_abs or whatever string in a cell, and if the next cell is one of those values, store the file in a list of files to return
it'll look roughly like this:
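matches = []  # `return` is a reserved word, so the list needs another name
# filedirectory: an iterable of file paths, still a placeholder here
for file in filedirectory:
    with open(file) as f:
        for line in f:
            # assumes a line shaped like: peak_abs,<value>
            key, _, value = line.strip().partition(",")
            if key == "peak_abs" and 220 <= float(value) <= 694:
                matches.append(file)
                break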
you'll need to find the specifics of opening a .csv file however
okay! though the peak absorbance isn’t stated in the files, basically i have only wavelength, absorbance in visible region and concentration. so, i guess i will have to use scipy.signal.find_peaks somewhere too
absorbance between 220-694 actually
if it's a csv, you'll have to decide the row you want to parse on
row or rows
you need to be able to read a specific row and decide to return that file or not based on the integer there
that's not quite right above but its the general principle
okay thank you @fiery cosmos ! that’s helpful to know
So there are different kinds of "errors". There is the error where if it happens you have to cancel the whole operation and/or rewind. And then there is the "error" where you process as much as you can and report the failed ones.
And so I would have the data_read give back two results, the list of things correctly read, and a list of failures.
yeah i'd like to do the second one, this is a self-imposed error, a custom exception
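A sketch of that second kind, with a custom exception and a reader that hands back both lists. The alphabet, the `[name, sequence]` record shape, and all the names here are assumptions for illustration:

```python
ALPHABET = set("ACGT")   # assumed language; swap in the real character set


class InvalidCharacterError(ValueError):
    def __init__(self, character):
        super().__init__(f"invalid character {character!r}")
        self.character = character


def validate(sequence):
    for ch in sequence:
        if ch not in ALPHABET:
            raise InvalidCharacterError(ch)


def split_valid(records):
    """records: [[name, sequence], ...] -> (valid records, [(name, error), ...])"""
    valid, failures = [], []
    for name, sequence in records:
        try:
            validate(sequence)
        except InvalidCharacterError as err:
            failures.append((name, err))    # remember it, keep going
        else:
            valid.append([name, sequence])
    return valid, failures
```

The caller can then run the normal processing on `valid` and write `failures` to the output file afterwards, without aborting the whole run.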