#data-science-and-ml | Python | Page 28

serene scaffold Nov 6, 2022, 6:32 PM

#

Sure. and lo and behold, the difference is negligible.

In [21]: %%timeit
    ...: i = 0
    ...: while i < 1_000_000:
    ...:     i += 1
    ...:
28.9 ms ± 315 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [22]: %%timeit
    ...: i = 0
    ...: for _ in range(1_000_000):
    ...:     i += 1
    ...:
26.4 ms ± 224 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

strong sedge Nov 6, 2022, 6:32 PM

#

yeah, for loops are faster, but not that much faster

serene scaffold Nov 6, 2022, 6:32 PM

#

strong sedge yeah, for loops are faster, but not that much faster

Please stop saying this.

strong sedge Nov 6, 2022, 6:33 PM

#

serene scaffold Please stop saying this.

its faster tho ?
but fine ill stop

serene scaffold Nov 6, 2022, 6:33 PM

#

strong sedge its faster tho ? but fine ill stop

Switching between for and while loops is not an optimization technique. Saying so is doing our users a disservice. So thank you.

idle urchin Nov 6, 2022, 6:35 PM

#

serene scaffold Switching between for and while loops is not an optimization technique. Saying s...

do u have any recomendations for what I asked above

fading crane Nov 6, 2022, 6:35 PM

#

Are there any other communities like this one

#

For AI?

#

I literally could not find any others that had activity

strong sedge Nov 6, 2022, 6:35 PM

#

serene scaffold Switching between for and while loops is not an optimization technique. Saying s...

yeah, I am just telling what I have noticed while trying to do something in pygame
but yeah, numpy or numba would be faster, and its best to just find some function in pandas that would effectively do what he wants

serene scaffold Nov 6, 2022, 6:37 PM

#

So for each row, you're trying to figure out how many prior rows satisfy some condition, or what?

idle urchin Nov 6, 2022, 6:37 PM

#

yeah

#

and based on that increase the row_index

serene scaffold Nov 6, 2022, 6:38 PM

#

idle urchin yeah

pandas doesn't effectively support operations like that, but you could speed it up by caching the result for prior rows.

#

that way you don't have to start looping from the beginning each time.
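
A minimal sketch of the caching idea above, assuming the goal is to count, for each row, how many earlier rows satisfy some condition. The `rows` list and the `is_match` predicate are hypothetical stand-ins for the real column and condition, which were not shared.

```python
def count_prior_matches(values, is_match):
    """For each position, count the earlier items that satisfy is_match."""
    counts = []
    running = 0  # cached count for everything seen so far
    for value in values:
        counts.append(running)   # matches strictly before this row
        if is_match(value):
            running += 1         # update the cache once per row
    return counts


# Hypothetical data: count prior rows whose value is greater than 10.
rows = [5, 12, 7, 30, 11]
print(count_prior_matches(rows, lambda v: v > 10))  # [0, 0, 1, 1, 2]
```

Each row is visited once, so the work grows linearly with the number of rows instead of restarting from the top every time. For a plain boolean condition on a pandas column, a cumulative sum shifted by one row would express the same running count.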

idle urchin Nov 6, 2022, 6:39 PM

#

serene scaffold pandas doesn't effectively support operations like that, but you could speed it ...

what do you mean proior rows

opal garden Nov 6, 2022, 6:39 PM

#

Hi there, I’d like to start a career as a data analyst. For now, I know the basics of pandas and SQL, and of course core Python too. Which libraries should I learn next?

serene scaffold Nov 6, 2022, 6:40 PM

#

opal garden Hi there, I’d like to start a career as a data analyst. For now, I know the basics of p...

do you have professional experience or a degree?

idle urchin Nov 6, 2022, 6:40 PM

#

strong sedge yeah, I am just telling what I have noticed while trying to do something in pyga...

the thing about numba is the first time it takes forever to run

#

is there a way to fix that

opal garden Nov 6, 2022, 6:40 PM

#

serene scaffold do you have professional experience or a degree?

Nope, nothing. I’m a student at university that’s all

serene scaffold Nov 6, 2022, 6:40 PM

#

idle urchin what do you mean proior rows

isn't the idea that for row 13, you need to count stuff from rows 0 to 12?

serene scaffold Nov 6, 2022, 6:40 PM

#

opal garden Nope, nothing. I’m a student at university that’s all

what degree are you pursuing?

opal garden Nov 6, 2022, 6:41 PM

#

Im at 2nd year, this time we had only python córę

#

Core*

serene scaffold Nov 6, 2022, 6:41 PM

#

opal garden Im at 2nd year, this time we had only python córę

please say the name of the degree that you get at the end.

wheat snow Nov 6, 2022, 6:42 PM

#

what is the exact subject that your studfying=

#

?

uncut loom Nov 6, 2022, 6:42 PM

#

Hey

#

Im trying to make an Ai

strong sedge Nov 6, 2022, 6:42 PM

#

idle urchin the thing about numba is the first time it takes forever to run

it shouldnt take too much extra time
cause from my understanding, numba compiles the function on first call
compilation for such a small function shouldnt be too much

opal garden Nov 6, 2022, 6:42 PM

#

serene scaffold please say the name of the degree that you get at the end.

programming engineer after all

strong sedge Nov 6, 2022, 6:42 PM

#

can you explain what your trying to do better ?

uncut loom Nov 6, 2022, 6:42 PM

#

Am I right here

serene scaffold Nov 6, 2022, 6:42 PM

#

opal garden programming engineer after all

try using scipy

opal garden Nov 6, 2022, 6:43 PM

#

serene scaffold try using scipy

Okay Thanks 😊

uncut loom Nov 6, 2022, 6:43 PM

#

Does anyone know how to create new objects while the code is running

#

(Python)

idle urchin Nov 6, 2022, 6:44 PM

#

strong sedge it shouldnt take too much extra time cause from my understanding, numba compiles...

but it runs faster then the pure python code the first time. is there another way that makes it faster even when I run it the first time

strong sedge Nov 6, 2022, 6:45 PM

#

idle urchin but it runs faster then the pure python code the first time. is there another wa...

maybe you can cache the compilation

serene scaffold Nov 6, 2022, 6:45 PM

#

@idle urchin is there a CSV that animals comes from? please drag the file into this chat.

strong sedge Nov 6, 2022, 6:45 PM

#

yeah, share the file and also, explain what exactly your trying to do

#

that would help

strong sedge Nov 6, 2022, 6:47 PM

#

serene scaffold Sure. and lo and behold, the difference is negligible. ```py In [21]: %%timeit ...

sorry for being petty 😅 but

Number of iterations: {100}
4.923000233247876e-06
5.7369998103240505e-06
Number of iterations: {10000}
0.0003440270002101897
0.0006337819995678728
Number of iterations: {500_000}
0.021400072999313124
0.03858755500004918

code

import time


def for_loop(iters):
    i = 0
    for _ in range(iters):
        i += _


def while_loop(iters):
    i = 0
    _ = 0
    while _ < iters:
        i += _
        _ += 1
        

def timer(f, iters = 1000000):
    start = time.perf_counter()
    f(iters)
    return time.perf_counter() - start

print("Number of iterations: {100}")
print(timer(for_loop,   100))
print(timer(while_loop, 100))

print("Number of iterations: {10000}")
print(timer(for_loop,   10000))
print(timer(while_loop, 10000))

print("Number of iterations: {500_000}")
print(timer(for_loop,   500_000))
print(timer(while_loop, 500_000))

atleast on my machine, its about 2x faster

#

I consider 2x significant lol
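
A single `time.perf_counter()` measurement per function is noisy. Here is a hedged sketch of the same comparison using the standard library's `timeit.repeat` and taking the best of several runs. The helper name `best_time` and the iteration counts are illustrative, and the numbers will differ by machine and Python version. Note that this `while` loop performs two additions per iteration while the `for` loop performs one, so the two loops are not doing identical work.

```python
import timeit


def for_loop(iters):
    total = 0
    for n in range(iters):
        total += n
    return total


def while_loop(iters):
    total = 0
    n = 0
    while n < iters:
        total += n
        n += 1
    return total


def best_time(func, iters, repeat=3, number=5):
    """Best per-call time in seconds over several repeats."""
    timings = timeit.repeat(lambda: func(iters), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    for iters in (100, 10_000, 100_000):
        print(f"Number of iterations: {iters}")
        print("  for:  ", best_time(for_loop, iters))
        print("  while:", best_time(while_loop, iters))
```

Taking the minimum reduces the influence of background activity on the machine; it does not change the point made earlier that the loop body usually dominates the cost.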

#

Number of iterations: {100}
4.971000635123346e-06
5.819999387313146e-06
Number of iterations: {10000}
0.00035766699966188753
0.0006018149997544242
Number of iterations: {500_000}
0.028559688000314054
0.04812260699964099
Number of iterations: {1_000_000}
0.05709153600037098
0.08422518800034595

serene scaffold Nov 6, 2022, 6:52 PM

#

strong sedge sorry for being petty 😅 but ``` Number of iterations: {100} 4.923000233247876e...

do you even know how for and while loops are implemented?

idle urchin Nov 6, 2022, 6:53 PM

#

serene scaffold @idle urchin is there a CSV that animals comes from? please drag the fi...

there is no file I'm just asking in general

strong sedge Nov 6, 2022, 6:55 PM

#

serene scaffold do you even know how for and while loops are implemented?

yoo its chill, I am not trynna one up you lmao, you obviously know more than me 😅

#

ummm, yeah

slate gate Nov 6, 2022, 6:57 PM

#

is there a python library for the English dictionary where i can get a random word, or a random word of a specific type (ex: random noun)

serene scaffold Nov 6, 2022, 6:58 PM

#

strong sedge yoo its chill, I am not trynna one up you lmao, you obviously know more than me ...

a for loop is essentially a while True loop that passes the iterator to next repeatedly until it raises StopIteration, at which time it breaks. So the overhead of each kind of loop depends on what the while condition is, or (for for loops) what work the iterator has to do to produce new values.

The execution of the code inside the loop, meanwhile, is completely unaffected by what kind of loop it is.
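
The description above can be sketched in plain Python. This is an equivalence in behaviour only: CPython carries out these steps inside the interpreter (the `FOR_ITER` bytecode) rather than with a Python-level `try`/`except`. The helper name `manual_for` is made up for illustration, and the sketch ignores `else` clauses on loops.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)        # the for statement calls iter() once
    while True:
        try:
            item = next(iterator)    # ask the iterator for the next value
        except StopIteration:
            break                    # iterator exhausted: leave the loop
        body(item)


collected = []
manual_for(range(3), collected.append)
print(collected)  # [0, 1, 2]
```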

strong sedge Nov 6, 2022, 6:58 PM

#

serene scaffold a for loop is essentially a while True loop that passes the iterator to `next` r...

yes, I am aware of this

#

but its implemented in C

#

not python, thats why for is faster

#

lemme try using range with while

#

maybe thatll be faster

#

yeaah, 100%

serene scaffold Nov 6, 2022, 7:00 PM

#

You can continue this in #internals-and-peps if you want @strong sedge

strong sedge Nov 6, 2022, 7:01 PM

#

for expensive computations, the computation would be the bottleneck

serene scaffold Nov 6, 2022, 7:02 PM

#

No more discussion about for vs while in this channel; go to #internals-and-peps instead.

wheat snow Nov 6, 2022, 7:23 PM

#

guys i need some help, i wanna add a complete column to a tkinter combobox, how would i do that?

#

its a long column btw

#

0        2022-06-17
1        2022-06-09
6        2022-06-08
8        2022-06-05
10       2022-06-02
            ...
23022    2018-07-16
23030    2018-07-15
23047    2018-07-13
23059    2018-07-12
23073    2018-07-11
this is my output for the general column

#

now i wanna add the column into a combobox

#

but smh it always takes the index with it

atomic palm Nov 6, 2022, 7:31 PM

#

is a p, q value in the range of 50-60 correct for ARIMA?

wheat snow Nov 6, 2022, 7:32 PM

#

i tried

cb_start_time['values']=   df_date['Dates'].reset_index(drop=True)  
but this doesnt seem to work

gaunt anvil Nov 6, 2022, 8:02 PM

#

does anyone know what this means?
using https://github.com/NVIDIA/tacotron2

serene scaffold Nov 6, 2022, 8:06 PM

#

gaunt anvil does anyone know what this means? using <https://github.com/NVIDIA/tacotron2>

Looks like your loss starts oscillating, instead of steadily approaching zero. You might need to make the learning rate smaller

gaunt anvil Nov 6, 2022, 8:09 PM

#

serene scaffold Looks like your loss starts oscillating, instead of steadily approaching zero. Y...

so like this val? how much smaller should it even be then ?~?

serene scaffold Nov 6, 2022, 8:10 PM

#

gaunt anvil so like this val? how much smaller should it even be then ?~?

Yeah. Idk, try changing it to 1e-4. Though that will make it take longer to train.

#

Also, that's scientific notation. Though you might already know that.

gaunt anvil Nov 6, 2022, 8:10 PM

#

yh

#

10^-3

#

alr ty i'll try it

#

how would i know if the learning_rate is small enough

#

training loss converges to 0?

serene scaffold Nov 6, 2022, 8:12 PM

#

gaunt anvil how would i know if the learning_rate is small enough

The learning rate determines by how much you adjust the weights at each update step (after every batch, not just once per epoch). And if your loss is oscillating (which means "going up and down repeatedly"), that means you keep overcorrecting, and then overcorrecting the overcorrection, and then overcorrecting again.

#

Because you can't make an adjustment small enough to get to the optimum
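
A toy illustration of that overcorrection, assuming plain gradient descent on the one-dimensional function f(x) = x**2 (gradient 2*x) rather than the actual Tacotron 2 training loop. The function name `descend` and the three learning rates are chosen only for the example.

```python
def descend(learning_rate, start=1.0, steps=6):
    """Gradient descent on f(x) = x**2, whose gradient is 2*x."""
    x = start
    path = [x]
    for _ in range(steps):
        x -= learning_rate * 2 * x   # step against the gradient
        path.append(x)
    return path


for lr in (0.1, 1.0, 1.1):
    losses = [round(x * x, 4) for x in descend(lr)]
    print(f"lr={lr}: {losses}")
```

With a rate of 0.1 the loss shrinks at every step. With 1.0 every update jumps to the mirror-image point on the other side of the minimum, so the loss never improves. With 1.1 each jump lands further away than the last and the loss grows.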

gaunt anvil Nov 6, 2022, 8:12 PM

#

ah i see, i didn't know that

#

thanks!

#

i assume as we continue to train then, we should expect to see training loss converging to 0?

serene scaffold Nov 6, 2022, 8:15 PM

#

gaunt anvil i assume as we continue to train then, we should expect to see training loss con...

It should approach zero over time, yes. Though it won't necessarily get there. If it's not approaching zero, it might mean that your learning rate is too large, or that your model doesn't have enough parameters to learn what you want it to learn, or something.

gaunt anvil Nov 6, 2022, 8:16 PM

#

hmm

#

thank you :>

tame ocean Nov 6, 2022, 9:15 PM

#

Yo guys I need help with making my first tensorflow project

#

I don't understand ml and ai much

#

If any of you could briefly explain the basic concepts of it

serene scaffold Nov 6, 2022, 9:50 PM

#

tame ocean If any of you could briefly explain the basic concepts of it

there really isn't any way to explain it briefly. This is something people dedicate a lot of time to studying.

hasty mountain Nov 6, 2022, 11:58 PM

#

Hey guys, I want to make a neural network in Pytorch which has multiple outputs, and one of those outputs is conditioned by another.
However, I want the backpropagation for output A to be independent from the backpropagation for output B.
How should I do this? Simply using .detach()?

Code example:


import torch


class Subject(torch.nn.Module):

  def __init__(self):

    super(Subject,self).__init__()

    self.linear1 = torch.nn.Linear(1, 50)
    self.linear2 = torch.nn.Linear(100, 1)

  def forward(self, input):

    outA = self.linear1(input)

    input = torch.cat((outA, outA), 1) # This generates a conflict in backprop
    
    input = torch.cat((outA.detach(), outA.detach()), 1) # Maybe this is what I want?
    
    outB = self.linear2(input)

    return outA, outB

#

lol...it seems this is also making the output from the first iteration in the batch condition the output from the rest of the batch.
Conditioning a bit too much.

serene scaffold Nov 7, 2022, 12:11 AM

#

I wonder if that would just make it not backprop all the way

hasty mountain Nov 7, 2022, 12:14 AM

#

Well, the idea would be pass the outA into one loss, and outB into another loss, and apply backward on each loss.
I suppose lossA would backprop through linear1 and lossB through linear2?

#

At least, that's the idea...

hasty mountain Nov 7, 2022, 12:31 AM

#

Tested and confirmed: exactly what I want brainmon

serene scaffold Nov 7, 2022, 12:33 AM

#

YAY

hasty mountain Nov 7, 2022, 12:36 AM

#

Now I just have to wait and confirm if my learning rate is too high or if the model is simply still a bit confused with the data...as one of the output losses went from trillions to millions and is now going to the moon...but it's only the 7th epoch.

#

Also, is it normal to use a LogSoftmax + NLLLoss and get a loss value of...like...200.000?

idle urchin Nov 7, 2022, 12:59 AM

#

are there any ways faster than "np.where" to search whether a column in a dataframe contains a certain value and return that row index

hasty mountain Nov 7, 2022, 1:04 AM

#

idle urchin are there any ways faster than "np.where" to search whether a column...

Try checking the DataFrame.loc and DataFrame.iloc args in their docs

idle urchin Nov 7, 2022, 1:04 AM

#

hasty mountain Try checking the `DataFrame.loc` and `DataFrame.iloc` args in their docs

I did but those are slower than np.where

#

like I even tried it

hasty mountain Nov 7, 2022, 1:04 AM

#

Oh...

idle urchin Nov 7, 2022, 1:06 AM

#

hasty mountain Oh...

any other ideas

hasty mountain Nov 7, 2022, 1:06 AM

#

Nah, I don't use pandas that much since I've started studying neural networks

#

I just use it once in a while for visualizing data, separating X and y...

gaunt anvil Nov 7, 2022, 2:44 AM

#

does anyone know how I can train off a pretrained with https://github.com/jik876/hifi-gan

rugged comet Nov 7, 2022, 2:49 AM

#

If anyone has the time, would someone please review my code and give me some guidance on where to go next? The accuracy is barely better than guessing randomly.
The important parts are build_mtg_model, build_preprocessing_model, and main. The file attached is main.py. Most of the other functions are commented-out because they were previous tries.
https://paste.pythondiscord.com/supecupuni
If you want to take a look at mtg.py, which loads the data, just let me know.

woeful hedge Nov 7, 2022, 3:47 AM

#

Can someone give me some info and guidance on where to begin with making a local closed system NLP system that I can train with a neural network to read books from my Google play library or from my local hard drive.

gaunt anvil Nov 7, 2022, 3:59 AM

#

how do i convert the model.inference to output a mel spectrogram so i can run it thru smth other than waveglow

gaunt anvil Nov 7, 2022, 4:52 AM

#

what does this bit of code do? i.e. what's reduced_loss for

fervent hatch Nov 7, 2022, 7:36 AM

#

Hello i just want to ask what does the cross validation score do?

drifting imp Nov 7, 2022, 9:25 AM

#

Hi, I need a pre-trained model or library which can recognise hate speech in the input text. Any suggestions? Thanks <3.14

fossil ivy Nov 7, 2022, 9:42 AM

#

good morning everyone

#

How can I get rid of the grey-ish background of the plot?

mighty patio Nov 7, 2022, 10:31 AM

#

fossil ivy How can I get rid of the grey-ish background of the plot?

what package are you using to make the plot?

fossil ivy Nov 7, 2022, 10:31 AM

#

mighty patio what package are you using to make the plot?

I fixed it by now, kinda

#

import seaborn as sns
import matplotlib.pyplot as plt

def create_box_whisker(x):

    #sns.set(rc={'figure.figsize': (18, 8), 'axes.facecolor': 'white', 'figure.facecolor': 'white'})
    fig, ax = plt.subplots(figsize=(22, 8))
    #sns.set(rc={'axes.facecolor': 'white', 'figure.facecolor': 'white'})

    sns.barplot(x="Paper", y="Value", data=x, color="grey")
    plt.xlabel("")
    plt.ylabel("Dismantling duration per MW [h]", fontsize=22)
    plt.xticks(fontsize=22, rotation=0)
    plt.yticks(fontsize=22)

    plt.show()

#

This is the code, apparently the sns.set makes some default changes to the plot

#

so I did plt.subplots to define the figsize instead, making the grey-ish background disappear

mighty patio Nov 7, 2022, 10:37 AM

#

fossil ivy ```py import seaborn as sns import matplotlib.pyplot as plt def create_box_whis...

This is why I prefer pure matplotlib instead of seaborn.
Anyways, I suggest you set a higher dpi and lower figsize instead of adjusting the fontsize manually
Try the following

def create_box_whisker(x):
    fig, ax = plt.subplots(figsize=(11, 4), dpi = 200)
    sns.barplot(x="Paper", y="Value", data=x, color="grey", ax = ax)
    ax.set_xlabel("")
    ax.set_ylabel("Dismantling duration per MW [h]", fontsize=22)
    fig.show()

fossil ivy Nov 7, 2022, 10:38 AM

#

looks even better

#

thanks

fossil ivy Nov 7, 2022, 11:17 AM

#

@mighty patio may I ask you another question?
Im using pyplot to plot the results of my simulation model. It looks like this.
How hard would it be to have an average axhline for each plot in there? I've tried making it work but, again, couldn't find something for multiple lines

#

(also just added plt.tight_layout()) to have the x label shown properly

mighty patio Nov 7, 2022, 11:34 AM

#

I have never used axhline before, but I assume you mean something like this?

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(1,13,200)
y1 = np.cos(2*np.pi*x/12)+1
y2 = np.cos(2*np.pi*x/12+1)**3+1.7

fig, ax = plt.subplots(1,1, figsize = (6,4), dpi = 200)
ax.plot(x, y1, label = "A", color = [1,0.5,0])
ax.plot(x, y2, label = "B", color = [0,0.5,1])
ax.axhline(np.average(y1),ls = "--", color = [1,0.5,0])
ax.axhline(np.average(y2),ls = "--", color = [0,0.5,1])

ax.set_xticks(np.arange(12)+1)
ax.set_xticklabels(["Jan\n2022","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov", "Dec"])
ax.legend()
ax.set_xlabel("Starting date")
ax.set_ylabel("Project cost [M€]")
ax.set_xlim(1,13)
fig.tight_layout()
fig.savefig("temp.png")

fossil ivy Nov 7, 2022, 11:35 AM

#

yes exactly

#

seems like I will have to make changes then

#

because I used

def create_timeseries(x, s, f):
    final_df = pd.concat(x)

    averages = final_df.groupby(["Vessel", "Start Date"]).mean()
    final_df.set_index(["Vessel", "Start Date"])
    averages.to_excel("SIMRESULTS_COSTPERMW_AVG.xlsx")

    with pd.option_context("display.max_rows", None,
                           "display.max_columns", 9,
                           "display.precision", 3,
                           "display.expand_frame_repr", False):
        print(final_df)
        print(averages)

    plt.figure(1, figsize=(50, 5))
    averages.unstack(level=0)["Duration"].plot()
    plt.xlabel("Starting Date")
    plt.ylabel("Project Duration")
    plt.title(f"Decommissioning duration per starting date, Number of Turbines: {f.n}")

drifting imp Nov 7, 2022, 11:49 AM

#

Hi, can I fix these red lines? Thanks!

fossil ivy Nov 7, 2022, 11:50 AM

#

im not sure, have you tried using from keras.preprocessing.text import Tokenizer?

#

Like you import keras, so maybe the reference should also be keras only not tensorflow.keras

drifting imp Nov 7, 2022, 11:51 AM

#

fossil ivy im not sure, have you tried using `from keras.preprocessing.text import Tokenize...

fossil ivy Nov 7, 2022, 11:51 AM

#

from keras.preprocessing

#

not tensorflow.preprocessing

drifting imp Nov 7, 2022, 11:52 AM

#

fossil ivy not tensorflow.preprocessing

fossil ivy Nov 7, 2022, 11:55 AM

#

so does it work?

drifting imp Nov 7, 2022, 11:58 AM

#

fossil ivy so does it work?

no

keen notch Nov 7, 2022, 12:21 PM

#

hey I have a python error was wondering if someone can help spot the problem

#

# function first, then main script for plotting
# YOUR CODE HERE
def fourier(values, order):
    n = order
    xDivideT = values
    solution = []
    for i in range(len(xDivideT)):
        summation = 0
        for  m in range(1, n+1):
            j = (2*m)-1
            k = np.sin(j*2*np.pi*xDivideT[i])
            sum = (1/j)* k
        result.append(4* np.pi * sum)
    return result

##plotting 3 curves
values = np.linspace(0,1,200)

fourierOne = fourier(values, 3)
fourierTwo = fourier(values, 11)
fourierThree = fourier(values, 40)

plt.plot(values,fourierOne, color = "blue" )
plt.plot(values,fourierTwo, color = "green" )
plt.plot(values,fourierThree, color = "red" )

plt.title("Fourier series approximation")
plt.show()

#

wooden sail Nov 7, 2022, 12:32 PM

#

keen notch ```py # function first, then main script for plotting # YOUR CODE HERE def fouri...

you probably meant to write solution instead of result

keen notch Nov 7, 2022, 12:35 PM

#

wooden sail you probably meant to write solution instead of result

omg yes!!

#

thank you🙈

wooden sail Nov 7, 2022, 12:39 PM

#

similarly with summation and sum, btw

#

and if it's called sum(mation), you probably meant += as well?

drifting imp Nov 7, 2022, 1:00 PM

#

fossil ivy so does it work?

It was all good. Just a PyCharm bug :///

fossil ivy Nov 7, 2022, 1:01 PM

#

drifting imp It was all good. Just a PyCharm bug :///

oh... but seems like its fixed now

drifting imp Nov 7, 2022, 1:03 PM

#

fossil ivy oh... but seems like its fixed now

Yes, I dived into the source code and thats why PyCharm doesn't recognise those classes and functions

drifting imp Nov 7, 2022, 1:32 PM

#

Hi again, I don't get why I'm not able to fit this model. Let me share the Google Colab file link: https://colab.research.google.com/drive/1Hl9cwWBr1_gxzRT4UpHFIkwlfvfdYRdL?usp=sharing


keen notch Nov 7, 2022, 2:05 PM

#

wooden sail and if it's called sum(mation), you probably meant += as well?

sorry what do you mean

#

import numpy as np
import matplotlib.pyplot as plt
# function first, then main script for plotting
# YOUR CODE HERE
def fourier(values, order):
    n = order
    xDivideT = values
    solution = []
    for i in range(len(xDivideT)):
        summation = 0
        for  m in range(1, n+1):
            j = (2*m)-1
            k = np.sin(j*2*np.pi*xDivideT[i])
            summation = (1/j)* k
        solution.append(4* np.pi * summation)
    return solution

##plotting 3 curves
values = np.linspace(0,1,200)

fourierOne = fourier(values, 3)
fourierTwo = fourier(values, 11)
fourierThree = fourier(values, 40)

plt.plot(values,fourierOne, color = "blue" )
plt.plot(values,fourierTwo, color = "green" )
plt.plot(values,fourierThree, color = "red" )

plt.title("Fourier series approximation")
##plt.legend()
plt.show()

#

the graph doesn't seem right

#

wooden sail Nov 7, 2022, 2:12 PM

#

in the for loop here

for  m in range(1, n+1):
            j = (2*m)-1
            k = np.sin(j*2*np.pi*xDivideT[i])
            summation = (1/j)* k

i'm pretty sure you meant summation += something, otherwise there is no point in iterating

#

either that, or the append goes in the loop

#

what are you trying to do?

keen notch Nov 7, 2022, 2:13 PM

#

I'm trying to write this instead of just one line I tried breaking it down

keen notch Nov 7, 2022, 2:14 PM

#

wooden sail in the for loop here ```py for m in range(1, n+1): j = (2*m)-1 ...

oh this looks right!

#

it makes sense

wooden sail Nov 7, 2022, 2:15 PM

#

right, so there's a summation 🙂 you need to += something

keen notch Nov 7, 2022, 2:19 PM

#

i see i see thank you!!:)

#

looks good I think!!

#

import numpy as np
import matplotlib.pyplot as plt
# function first, then main script for plotting
# YOUR CODE HERE
def fourier(values, order):
    n = order
    xDivideT = values
    solution = []
    for i in range(len(xDivideT)):
        summation = 0
        for  m in range(1, n+1):
            j = (2*m)-1
            k = np.sin(j*2*np.pi*xDivideT[i])
            summation += (1/j)* k
        solution.append(4* np.pi * summation)
    return solution

##plotting 3 curves
values = np.linspace(0,1,200)

fourierOne = fourier(values, 3)
fourierTwo = fourier(values, 11)
fourierThree = fourier(values, 40)

plt.plot(values,fourierOne, color = "blue" )
plt.plot(values,fourierTwo, color = "green" )
plt.plot(values,fourierThree, color = "red" )

plt.title("Fourier series approximation")
##plt.legend()
plt.show()

#

#

hmm but still getting an assertion error

#

wooden sail Nov 7, 2022, 2:21 PM

#

i have no idea about that, idk what the assert is trying to check

#

those are arbitrary constants, so presumably it'll only work for a specific set of signals
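
For reference, the textbook Fourier series of a unit square wave with period T is (4/pi) times the sum over odd harmonics of sin(2*pi*(2m-1)*x/T) / (2m-1), so its constant is 4/pi rather than 4*pi. Whether the assignment's assert expects that constant is an assumption, since the check itself was not shown. A minimal standard-library sketch, with the function name `square_wave_partial_sum` invented for the example:

```python
import math


def square_wave_partial_sum(x_over_t, order):
    """Partial Fourier sum of a unit square wave using `order` odd harmonics."""
    total = 0.0
    for m in range(1, order + 1):
        j = 2 * m - 1                              # odd harmonic number
        total += math.sin(2 * math.pi * j * x_over_t) / j
    return 4 / math.pi * total                     # textbook constant is 4/pi


for order in (3, 11, 40):
    print(order, square_wave_partial_sum(0.25, order))
```

At a quarter of the period the partial sums move toward 1 as the order grows, which is a quick sanity check on the scaling constant.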

compact star Nov 7, 2022, 2:26 PM

#

Does anyone know how I could do forward propagation from scratch in python for a convolutional layer? If possible could u show me how to vectorise it and then apply that whole process using numba

keen notch Nov 7, 2022, 2:27 PM

#

wooden sail i have no idea about that, idk what the assert is trying to check

hmm i'll keep debugging

wooden sail Nov 7, 2022, 2:27 PM

#

forward propagation is simply the application of the model. now, regarding convolutions, there's a LOT of freedom

keen notch Nov 7, 2022, 2:28 PM

#

I have another error for another code

wooden sail Nov 7, 2022, 2:28 PM

#

you can apply convolutions by constructing (multi level) toeplitz matrices, by using built-in convolution functions, or by doing it in the frequency domain
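
A small 1-D sketch of those three routes, assuming NumPy is available. The helper `toeplitz_matrix` and the sample `signal` and `kernel` values are made up for illustration. Note that convolution layers in deep-learning frameworks usually compute cross-correlation (no kernel flip), so a layer written from scratch may need the kernel reversed relative to this example.

```python
import numpy as np


def toeplitz_matrix(kernel, signal_length):
    """Matrix T such that T @ signal equals the full convolution with kernel."""
    rows = signal_length + len(kernel) - 1
    matrix = np.zeros((rows, signal_length))
    for col in range(signal_length):
        matrix[col:col + len(kernel), col] = kernel   # shifted copy per column
    return matrix


signal = np.array([1.0, 2.0, 3.0, 4.0])
kernel = np.array([0.5, -1.0, 0.25])
n_out = len(signal) + len(kernel) - 1

direct = np.convolve(signal, kernel)                         # built-in function
via_matrix = toeplitz_matrix(kernel, len(signal)) @ signal   # Toeplitz matrix
via_fft = np.fft.irfft(                                      # frequency domain
    np.fft.rfft(signal, n_out) * np.fft.rfft(kernel, n_out), n_out
)

print(np.allclose(direct, via_matrix), np.allclose(direct, via_fft))
```

Zero-padding both inputs to the full output length before the FFT is what turns the circular convolution into the ordinary linear one.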

keen notch Nov 7, 2022, 2:28 PM

#

'''
One of the simplest change detect methods is the
online exponential filter, dating back to early radar applications.
Change detection means the comparison of each incoming value to the previous
value, see the detail and formula below.  If that numerical comparison of
the current value with the previous value exceeds a fixed threshold value then
an alarm is raised (or the location is stored as in this exercise). This
process can be implemented on a computer as a simple digital filter

The filter takes one data item after the other (online). The filter is
implemented in the function 'expofilter(prval, data, alpha).
The factor alpha is a gain factor or 'forgetfulness' factor,
quantifying how much influence on the filter previous data values should
have with values in the interval 0<=alpha<=1. Small alpha lead to hardly
any smoothing and the filter will react on any change in the signal very
sensitively while large alpha should show a clear change but react
little on noisy input.
'''
# YOUR CODE HERE

import numpy as np
import matplotlib.pyplot as plt

def expofilter(prval, data, alpha): 
    return alpha*prval + (1-alpha)*data # YR: no error in this line

def changeDetect(data, alpha, threshold): 
    previousvalue = data
    response = []
    change = []
    alarms = []
    for counter, val in enumerate(data):
        value = expofilter(previousvalue, val, alpha)
        print(value, previousvalue, threshold)
        if abs(value-previousvalue)>threshold:
            change.append(counter)
        response.append(abs(value-previousvalue))
    return np.array(response), np.array(change)

# Use case and testing; 
# YR: No error below this line, style changes as appropriate are possible.
tseries = np.random.randint(-4,4,100)
tseries[50] += 20
tseries[51] += 20
tseries[52] += 20
alarmlevel = 1
gainfactor = 0.85
resp, alarms = changeDetect(tseries, gainfactor, alarmlevel)


# plotting
plt.plot(resp)
plt.xlabel('time')
plt.ylabel('filter response')
plt.show()

#

wooden sail Nov 7, 2022, 2:30 PM

#

so, it looks like you expect previousvalue and value to be scalars, but they aren't. in changeDetect, what is data supposed to be? presumably a numpy array

#

when you do previousvalue = data, you make previousvalue a numpy array too

#

then the operation if abs(previousvalue...) > something also returns a numpy array of booleans

#

if [numpy array of booleans] is ambiguous

#

but more importantly, i'm pretty sure you meant previousvalue to be a scalar in the first place 😛 rethink how you store the previous value
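
One way to keep the previous value as a scalar, sketched under two assumptions that the exercise text does not confirm: the filter is seeded with the first sample, and the filtered output is carried forward as the next "previous" value. The name `change_detect` and the test series are illustrative.

```python
def expofilter(prval, data, alpha):
    return alpha * prval + (1 - alpha) * data


def change_detect(data, alpha, threshold):
    previous = float(data[0])          # assumption: seed with the first sample
    response, changes = [], []
    for index, sample in enumerate(data):
        value = expofilter(previous, sample, alpha)
        delta = abs(value - previous)  # scalar comparison, no array ambiguity
        if delta > threshold:
            changes.append(index)
        response.append(delta)
        previous = value               # carry the filtered value forward
    return response, changes


# Flat series with a three-sample step of height 20 starting at index 50.
series = [0.0] * 50 + [20.0] * 3 + [0.0] * 47
resp, alarms = change_detect(series, alpha=0.85, threshold=1)
print(alarms)
```

Because `previous` and `value` are plain floats here, the `if` test evaluates a single boolean rather than an array of them.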

keen notch Nov 7, 2022, 2:31 PM

#

wooden sail so, it looks like you expect previousvalue and value to be scalars, but they are...

it is supposed to be a numpy array

keen notch Nov 7, 2022, 2:32 PM

#

wooden sail but more importantly, i'm pretty sure you meant previousvalue to be a scalar in ...

ohhh ok

wooden sail Nov 7, 2022, 2:33 PM

#

looks like you're doing a bsc or msc in something that involves signal processing, control, electrical engineering, or something of the sort :x fun times

keen notch Nov 7, 2022, 2:33 PM

#

wooden sail looks like you're doing a bsc or msc in something that involves signal processin...

haha yes I'm a computer systems engineering student🙈

wooden sail Nov 7, 2022, 2:34 PM

#

heh. have fun!

keen notch Nov 7, 2022, 2:34 PM

#

wooden sail heh. have fun!

i'll tryyy??

worn dome Nov 7, 2022, 2:51 PM

#

Hello Everyone.

Does having too many Python installations create an issue while installing tensorflow?
i.e. I have
1. Anaconda (Python 3.8.8 64-bit)
2. Miniconda3 (Python 3.9.12 64-bit)
3. Python 3.10.8 (64-bit)

Will it create any problem for installing tensorflow and object detection?
Can anyone help me with the installation?

wooden sail Nov 7, 2022, 2:57 PM

#

there shouldn't be a problem, just make sure you specify which interpreter you're installing it for
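
A short sketch of how to check which interpreter is actually running, so that the install targets the same one. Running pip as a module of a specific interpreter (`<interpreter> -m pip install ...`) installs into that interpreter's environment. The package name is taken from the question; whether a given TensorFlow release supports each of the listed Python versions is not checked here.

```python
import sys

# The interpreter running this script; packages installed with
# `<this path> -m pip install ...` land in this interpreter's environment.
print(sys.executable)
print(sys.version_info[:3])

# Command one could run to install into exactly this interpreter
# (built for display only; nothing is executed here).
install_command = [sys.executable, "-m", "pip", "install", "tensorflow"]
print(" ".join(install_command))
```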

mild sorrel Nov 7, 2022, 3:22 PM

#

hi i'm currently trying to recreate Michael Reeves' sentiment analysis thing for fun and wondering how i would do it since i know nothing about this topic

hasty mountain Nov 7, 2022, 3:57 PM

#

mild sorrel hi i'm currently trying to recreate Michael Reeves' sentiment analysis thing for...

If you know ML, things can get quite easy.
You just have to think: how can you notice when someone, in a social media, is expressing certain sentiment?
There are words that can be associated with certain sentiments, or some ways to construct phrases

#

It's easier to start with isolated words, that is, marking a sentence with "good words"("nice", "good", "welldone", "well-made") with positive emotions, and making the counterpart for negative emotions
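
A minimal sketch of that word-counting approach. The `POSITIVE` and `NEGATIVE` sets are tiny hypothetical lexicons for illustration, and the scoring rule (positive count minus negative count) is one simple choice among many. It ignores negation such as "not good".

```python
import re

# Hypothetical word lists for illustration; a real lexicon would be much larger.
POSITIVE = {"nice", "good", "welldone", "well-made", "great"}
NEGATIVE = {"bad", "awful", "terrible", "broken"}


def sentiment_score(text):
    """Positive-word count minus negative-word count."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


print(sentiment_score("Nice video, really well-made!"))   # 2
print(sentiment_score("Awful audio and a broken link."))  # -2
```

A score above zero reads as positive, below zero as negative, and zero as neutral or unknown.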

obsidian belfry Nov 7, 2022, 4:00 PM

#

hey, does anyone of you know a way to display the km scaling on choropleth maps? (with plotly express)

#

simple fossil Nov 7, 2022, 4:09 PM

#

Hello. Is there a way to make the following function faster

def calculate_distance_sklearn(target_representation: list, representations: pd.DataFrame):
    numpy_representations = np.array(representations["representation"].tolist())
    numpy_target_rep = np.array([target_representation])
    repr = cosine_similarity(numpy_representations, numpy_target_rep)
    representations["distance"] = repr.flatten()
    return representations

#

Currently, the function takes 5 seconds. The most expensive operation is .tolist() call. Any ideas on how to make it faster?

strong sedge Nov 7, 2022, 4:11 PM

#

simple fossil Currently, the function takes 5 seconds. The most expensive operation is `.tolis...

Does it work if you remove tolist ?

simple fossil Nov 7, 2022, 4:12 PM

#

No, it fails with an error.

#

Traceback (most recent call last):
  File "vectorize.py", line 118, in <module>
    calculate_distance_sklearn_quick(target_rep, representations)
  File "D:\AI\website\api\utils\general.py", line 10, in wrap_func
    result = original_function(*args, **kwargs)
  File "vectorize.py", line 76, in calculate_distance_sklearn_quick
    repr = cosine_similarity(numpy_representations, numpy_target_rep)
  File "C:\Users\Martin\.conda\envs\tf\lib\site-packages\sklearn\metrics\pairwise.py", line 1377, in cosine_similarity
    X, Y = check_pairwise_arrays(X, Y)
  File "C:\Users\Martin\.conda\envs\tf\lib\site-packages\sklearn\metrics\pairwise.py", line 155, in check_pairwise_arrays
    X = check_array(
  File "C:\Users\Martin\.conda\envs\tf\lib\site-packages\sklearn\utils\validation.py", line 856, in check_array
    array = np.asarray(array, order=order, dtype=dtype)
ValueError: setting an array element with a sequence.

tidal bough Nov 7, 2022, 4:27 PM

#

simple fossil Hello. Is there a way to make the following function faster ```py def calculate_...

Try numpy_representations = representations["representation"].values. A column of a dataframe is a Series, and a Series under the hood is basically a numpy array, so you shouldn't need to convert to a list and back to turn it into an array.

simple fossil Nov 7, 2022, 4:30 PM

#

tidal bough Try `numpy_representations = representations["representation"].values`. A column...

I'm having the same error. ValueError: setting an array element with a sequence.

tidal bough Nov 7, 2022, 4:31 PM

#

What's the dtype of that column (representations["representation"].dtype)?

simple fossil Nov 7, 2022, 4:32 PM

#

Object

#

Here is an image to clarify what I'm trying to do.

tidal bough Nov 7, 2022, 4:46 PM

#

Oh, each row being a python list is annoying indeed. tolist still shouldn't be necessary, but converting this column of lists into a 2d array will take some time either way.

#

@simple fossil I tested a few approaches, and the fastest I'm getting is this very straightforward one:

def f3(df):
    col = df.lists.values
    n,m = len(col), len(col[0])
    arr = np.empty((n,m),dtype=type(col[0][0]))
    for i,row in enumerate(col):
        arr[i,:] = row
    return arr
%timeit np.array(df.lists.tolist())
%timeit np.vstack(df.lists.apply(np.array).values)
%timeit f3(df)

1.32 ms ± 170 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
2.29 ms ± 439 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.15 ms ± 243 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

#

this is on this test dataframe:

import numpy as np
import pandas as pd
df = pd.DataFrame.from_dict({"lists":[[j+0.5 for j in range(i,i+100)] for i in range(100)]})

#

the reason this is fast is because it allocates an array of the final size right away, whereas solutions like vstack have to allocate arrays for each row first

simple fossil Nov 7, 2022, 4:51 PM

#

Thank you. That did speed up. Now it takes 3.8 seconds.

#

Would there be a better way to do it that would be faster, like storing those values as a different type?
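One possible answer, sketched under the assumption that the representations change rarely: convert the column of lists to a 2D float array once, keep that array around (or save it to disk), and reuse it for every query instead of converting on each call. This uses plain NumPy rather than sklearn's `cosine_similarity`; the name `cosine_to_target` and the random test data are hypothetical.

```python
import numpy as np
import pandas as pd

# Stand-in data: 5 rows, each holding a Python list of 4 floats.
rng = np.random.default_rng(0)
representations = pd.DataFrame(
    {"representation": [rng.random(4).tolist() for _ in range(5)]}
)

# One-time conversion: pay the list -> array cost once, not on every query.
matrix = np.array(representations["representation"].tolist(), dtype=np.float64)
# Pre-normalise the rows so each later query is a single matrix-vector product.
unit_rows = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)


def cosine_to_target(unit_rows, target):
    """Cosine similarity between every pre-normalised row and one target vector."""
    target = np.asarray(target, dtype=np.float64)
    return unit_rows @ (target / np.linalg.norm(target))


sims = cosine_to_target(unit_rows, matrix[0])
representations["distance"] = sims  # same column name as in the original function
print(representations["distance"])
```

If the data has to live on disk, storing `matrix` with `np.save` would keep it as a numeric array instead of an object column of lists.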

wooden sail Nov 7, 2022, 5:05 PM

#

ah, reptile, now that you're around i wanna ask you a question. you happen to know a good estimator for the fundamental frequency of a signal consisting of a sinusoid and its harmonics?

drowsy cairn Nov 7, 2022, 5:18 PM

#

Hey guys, I am pursuing an MS in DS. I have to choose a dataset for my course project, related to unsupervised learning (clustering). Can any of you help me pick a dataset?

serene scaffold Nov 7, 2022, 5:20 PM

#

drowsy cairn Hey guys, I am pursuing an MS in DS. I have to choose a dataset for my course proje...

There are a lot of toy datasets on kaggle. Your instructor didn't give you any input on where to find data?

drowsy cairn Nov 7, 2022, 5:21 PM

#

serene scaffold There are a lot of toy datasets on kaggle. Your instructor didn't give you any i...

he is fine with anything, i didn't want to just choose anything, a bit more challenging 😊

edit: it should also help me for my resume.

serene scaffold Nov 7, 2022, 5:24 PM

#

I wouldn't make it more challenging than your instructor intended tbh. a university course should introduce new concepts in a deliberate order. Unless you have prior data science experience, I'd focus on the specific goals of the assignment.

drowsy cairn Nov 7, 2022, 5:25 PM

#

serene scaffold I wouldn't make it more challenging than your instructor intended tbh. a univers...

I have worked on these topics before, I am somewhat familiar with the topics. Also like I said, I should be able to showcase it somewhere.

young granite Nov 7, 2022, 5:41 PM

#

if i got a df like this:

     ID comp 1  amount 1 comp 2  amount 2 comp 3  amount 3 comp 4  amount 4
0  772D    D45       0.5    U45       0.3    T45       0.2    NaN       NaN
1  223P    D54       0.5    U54       0.5    NaN       NaN    NaN       NaN
2  212E    D45       0.6    U55       0.1    I23       0.2    Z23       0.1
and i want to transpose it in such a way that the values in "comp x" become columns. what would be a good approach, maybe a dict?

serene scaffold Nov 7, 2022, 5:47 PM

#

young granite if i got a df like this: ```py ID comp 1 amount 1 comp 2 amount 2 comp 3 ...

Thanks for giving your dataframe as text. You might want to do .pivot_table

young granite Nov 7, 2022, 5:48 PM

#

serene scaffold Thanks for giving your dataframe as text. You might want to do `.pivot_table`

i tried that already, maybe i did it wrong, but it didn't work out how i wanted

serene scaffold Nov 7, 2022, 5:48 PM

#

actually, hmm

#

so basically, you want columns that are (ID, comp, amount)

young granite Nov 7, 2022, 5:48 PM

#

no

#

i want ID, comp values as cols

#

so i tried to get all unique values from each comp and then attach the right amount values 🗿
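A sketch of one way to get there, assuming the goal is one row per ID and one column per distinct comp value, with the matching amount as the cell value: stack the (comp k, amount k) pairs into long format first, then pivot. The frame below is rebuilt from the table posted above; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": ["772D", "223P", "212E"],
    "comp 1": ["D45", "D54", "D45"],
    "amount 1": [0.5, 0.5, 0.6],
    "comp 2": ["U45", "U54", "U55"],
    "amount 2": [0.3, 0.5, 0.1],
    "comp 3": ["T45", np.nan, "I23"],
    "amount 3": [0.2, np.nan, 0.2],
    "comp 4": [np.nan, np.nan, "Z23"],
    "amount 4": [np.nan, np.nan, 0.1],
})

# Stack each (comp k, amount k) pair under shared column names: long format.
pairs = []
for k in range(1, 5):
    pair = df[["ID", f"comp {k}", f"amount {k}"]].rename(
        columns={f"comp {k}": "comp", f"amount {k}": "amount"}
    )
    pairs.append(pair)
long_df = pd.concat(pairs, ignore_index=True).dropna(subset=["comp"])

# One row per ID, one column per distinct comp value; missing pairs become NaN.
wide = long_df.pivot(index="ID", columns="comp", values="amount")
print(wide)
```

`pivot` raises if the same comp appears twice for one ID; in that case `pivot_table` with an explicit `aggfunc` would be the thing to reach for.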

serene scaffold Nov 7, 2022, 5:51 PM

#

I'm in a meeting, so I can't focus in, unfortunately

young granite Nov 7, 2022, 5:51 PM

#

its alright maybe later u find some time or someone else

#

thanks nevertheless

pure plover Nov 7, 2022, 5:59 PM

#

Trying to subtract a portion of one dataframe from another (background subtraction, timeseries). I want to subtract only the reading values, not the time column. The head of the data is: {'Time [s]': [0.0, 60.015, 120.03, 180.048, 240.048], 'A1': [328, 394, 452, 515, 577], 'A2': [299, 360, 416, 472, 524], 'A3': [685, 826, 952, 1118, 1209], 'A4': [631, 768, 898, 1034, 1154], 'A5': [1420, 1689, 1956, 2236, 2460], 'A6': [1475, 1797, 2093, 2391, 2601], 'A7': [2231, 2569, 2935, 3262, 3588], 'A8': [2426, 2799, 3185, 3579, 3924]}

#

The head of the dataframe I would like to subtract from that data is: {'Time [s]': [0.0, 60.015, 120.03, 180.048, 240.048], 'A1': [84, 80, 79, 82, 79], 'A2': [167, 162, 154, 154, 155], 'A3': [330, 283, 280, 279, 281], 'A4': [256, 246, 248, 246, 246], 'A5': [545, 543, 557, 548, 545], 'A6': [563, 566, 552, 565, 576], 'A7': [1075, 1025, 1025, 1027, 1033], 'A8': [969, 974, 997, 996, 980]}

#

The purpose is background subtraction
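A minimal sketch of that subtraction, assuming both frames share the same row order and time stamps: select every column except the time column and subtract only those. Only the first two reading columns from the posted data are reproduced here to keep it short; `signal`, `background` and `corrected` are hypothetical names.

```python
import pandas as pd

time = [0.0, 60.015, 120.03, 180.048, 240.048]
signal = pd.DataFrame({
    "Time [s]": time,
    "A1": [328, 394, 452, 515, 577],
    "A2": [299, 360, 416, 472, 524],
})
background = pd.DataFrame({
    "Time [s]": time,
    "A1": [84, 80, 79, 82, 79],
    "A2": [167, 162, 154, 154, 155],
})

# Every column except the time axis.
value_cols = signal.columns.drop("Time [s]")

# Subtract readings only; the time column is carried over unchanged.
corrected = signal.copy()
corrected[value_cols] = signal[value_cols] - background[value_cols]
print(corrected)
```

pandas aligns on index and column labels, so this only makes sense if row i of both frames refers to the same time point.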

tidal bough Nov 7, 2022, 6:13 PM

#

wooden sail ah, reptile, now that you're around i wanna ask you a question. you happen to kn...

Other than the obvious like taking the lowest peak in the fourier transform of the signal, not really.
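For reference, the "lowest peak in the Fourier transform" approach could look roughly like the sketch below. It simplifies peak picking to "the lowest-frequency bin above a relative threshold", which is an assumption that works on clean, leakage-free test signals and would need a proper peak finder on real data. The function name and the threshold value are hypothetical.

```python
import numpy as np


def estimate_fundamental(x, fs, rel_threshold=0.2):
    """Return the lowest frequency whose spectral magnitude exceeds a threshold."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[0] = 0.0  # ignore the DC component
    # Bins that are "strong enough" relative to the largest component.
    candidates = np.flatnonzero(spectrum >= rel_threshold * spectrum.max())
    return freqs[candidates[0]]


# Test signal: 50 Hz fundamental plus two weaker harmonics, 1 s at 1 kHz.
fs = 1000.0
t = np.arange(1000) / fs
x = (np.sin(2 * np.pi * 50 * t)
     + 0.5 * np.sin(2 * np.pi * 100 * t)
     + 0.3 * np.sin(2 * np.pi * 150 * t))

f0 = estimate_fundamental(x, fs)
print(f0)
```

The frequency resolution here is limited to the FFT bin spacing (`fs / len(x)`), which is the limitation that methods like ESPRIT are meant to get around.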

torn monolith Nov 7, 2022, 6:14 PM

#

def list_of_case(self):
    lisx=[]
    for gulty in self.bomb:
        lisx.append(gulty['category'])
    return lisx

#

bro i made a class object and i always get an error from the for loop

#

can you help me ?

serene scaffold Nov 7, 2022, 6:15 PM

#

@torn monolith this isn't a data science question--see #❓｜how-to-get-help

#

And remember to never say that you "get an error" without copying and pasting the whole error message into the chat. No one knows what the error is except you.

wooden sail Nov 7, 2022, 6:19 PM

#

tidal bough Other than the obvious like taking the lowest peak in the fourier transform of t...

aight. i'm about to ESPRIT a signal, but that computes the spectral components separately (albeit with so-called "super resolution"). i wanted something that exploits the relationship between the harmonics, but it's kinda hard to beat how simple this algorithm is given the amazing performance

compact star Nov 7, 2022, 6:37 PM

#

I am trying to implement forward and backward propagation for a convolutional layer:

I currently have this as my class for that layer and I was wondering how I could create a function that uses guvectorize as it said in the page linked that it can be used for that.

Any help would be appreciated

class ConvolutionalLayer(Layer):
    def __init__(self, input_shape, stride, kernel_size, number_of_filters):
        self.input_depth, self.input_height, self.input_width = input_shape
        self.number_of_filters = number_of_filters
        self.input_shape = input_shape
        self.stride = stride

        self.output_shape = (number_of_filters, (self.input_height - kernel_size) // self.stride + 1, (self.input_width - kernel_size) // self.stride + 1)
        self.kernel_shape = (number_of_filters, kernel_size, kernel_size)
        self.kernels = np.random.randn(*self.kernel_shape)
        self.biases = np.random.randn(*self.output_shape)

https://numba.pydata.org/numba-doc/latest/user/vectorize.html#

desert oar Nov 7, 2022, 7:43 PM

#

compact star I am trying to implement forward and backward propagation for a convolutional la...

you wouldn't necessarily want to vectorize the init method

#

doesn't make sense to put that on the gpu even if you could do it sensibly

#

nor does vectorizing over input shapes make a lot of sense

#

if you wrote your own forward and backward pass implementations using numpy arrays and python loops, you could guvectorize those functions

#

then call those vectorized functions from the methods on the class

#

numba does support jit compiling entire classes but i don't think it supports vectorizing methods on those classes

compact star Nov 7, 2022, 7:48 PM

#

ah, right thanks for that, I have now written a forward pass like this

def forward_propogation(self, a):
    self.input = a
    self.output = np.copy(self.biases)

    for channel in self.input:
        for j, kernel in enumerate(self.kernels):
            # Accumulate into output map j (one map per kernel), not per input channel
            self.output[j] += convolve2d(channel, kernel, self.stride, self.output_shape[1:])

#

with the convolve2d function looking like this

def convolve2d(image, kernel, stride, output_shape):
    # flipud, fliplr and zeros are assumed to be imported from numpy
    # Flip the kernel so this is a true convolution rather than a cross-correlation
    kernel = flipud(fliplr(kernel))
    kernel_size = kernel.shape[0]

    output = zeros(output_shape)
    # Step the window by `stride`, stopping before the kernel leaves the image
    for x in range(0, image.shape[0] - kernel_size + 1, stride):
        for y in range(0, image.shape[1] - kernel_size + 1, stride):
            # Index the output by window number, not by pixel offset
            output[x // stride, y // stride] = (kernel * image[x: x + kernel_size, y: y + kernel_size]).sum()

    return output

#

How would I use guvectorize for this?

desert oar Nov 7, 2022, 7:53 PM

#

compact star How would I use `guvectorize` for this?

read the docs 🙂 https://numba.readthedocs.io/en/stable/user/vectorize.html#the-guvectorize-decorator

#

this is a good candidate for numba because it uses only python primitives and numba primitives

#

(i would start with @vectorize to test and debug on cpu first)

#

that said, i think this is already vectorized

#

unless you are trying to vectorize this over multiple images

#

i'm not sure how to specify what to "vectorize over" specifically

#

oh i see, there's the "layout" spec (n),()->(n)

#

you should be able to just guvectorize this as-written. if you want to vectorize over "stacks" of images, you need a for loop over those as well, and you need to modify your layout spec accordingly

fluid spindle Nov 7, 2022, 8:05 PM

#

Hey, can I remove a feature from a Pandas framework without affecting the source dataset?

gaunt anvil Nov 7, 2022, 8:07 PM

#

using tacotron2, how do I convert model.inference to output a mel spectrogram so I can run it thru a different vocoder?

fluid spindle Nov 7, 2022, 8:21 PM

#

Sadly only matplotlib and seaborn are allowed, also I am unfamiliar to PyTorch yet

compact star Nov 7, 2022, 8:26 PM

#

desert oar read the docs 🙂 https://numba.readthedocs.io/en/stable/user/vectorize.html#the-...

for the layout spec, could i do (n, n, n) to mean (1, 2, 3), for example? as in, does it treat the n's as different numbers?

desert oar Nov 7, 2022, 8:27 PM

#

fluid spindle Hey, can I remove a feature from a Pandas framework without affecting the source...

columns_to_remove = [ ... ]

data2 = data1.drop(columns=columns_to_remove)

#

!d pandas.DataFrame.drop

arctic wedgeBOT Nov 7, 2022, 8:27 PM

#

pandas.DataFrame.drop


DataFrame.drop(labels=None, *, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
Drop specified labels from rows or columns.

Remove rows or columns by specifying label names and corresponding axis, or by specifying directly index or column names. When using a multi-index, labels on different levels can be removed by specifying the level. See the user guide <advanced.shown_levels> for more information about the now unused levels.

desert oar Nov 7, 2022, 8:27 PM

#

compact star for the layout spec if could i do (n, n, n) to mean (1, 2, 3) for example, as in...

i'm not sure, you might need 3 different letters
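For what it's worth, the layout string follows NumPy's generalized-ufunc signature notation, where a repeated letter means "the same length", so three different lengths would need three letters such as `(a,b,c)`. The sketch below shows that behaviour with `np.vectorize(..., signature=...)`, which accepts the same notation; it is a stand-in chosen so the example runs without numba, not a `guvectorize` example, and the function names are hypothetical.

```python
import numpy as np


def dot(a, b):
    return np.sum(a * b)


def outer_sum(a, b):
    return a[:, None] + b[None, :]


# Same letter twice: both inputs must have the same core length.
same_letter = np.vectorize(dot, signature="(n),(n)->()")
matched = same_letter(np.ones(3), np.ones(3))

try:
    same_letter(np.ones(3), np.ones(4))
    mismatch_rejected = False
except ValueError:
    # Lengths 3 and 4 cannot both be "n".
    mismatch_rejected = True

# Different letters: the two lengths are allowed to differ.
different_letters = np.vectorize(outer_sum, signature="(n),(m)->(n,m)")
grid = different_letters(np.ones(3), np.ones(4))

print(float(matched), mismatch_rejected, grid.shape)
```

So a spec like `(n,n,n)` would describe a cube with equal sides rather than a `(1, 2, 3)`-shaped array.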

fluid spindle Nov 7, 2022, 8:38 PM

#

desert oar ```python columns_to_remove = [ ... ] data2 = data1.drop(columns=columns_to_rem...

I have been trying that, and found out that passing the inplace parameter is what causes the error on reassignment

#

Thanks and sorry for being stupid

desert oar Nov 7, 2022, 8:38 PM

#

fluid spindle I have been trying that, found out if you enter inplace parameter it causes the ...

well yeah, inplace=True makes the method return None and modifies the data in-place instead
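A small sketch of the difference, on a made-up three-column frame (the column names are hypothetical):

```python
import pandas as pd

data1 = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Default (inplace=False): returns a new frame and leaves the source untouched.
data2 = data1.drop(columns=["b"])

# inplace=True: mutates the frame it is called on and returns None,
# so assigning the result to a variable leaves that variable holding None.
scratch = data1.copy()
result = scratch.drop(columns=["b"], inplace=True)

print(list(data1.columns), list(data2.columns), result, list(scratch.columns))
```

To keep the source dataset intact, the reassignment form without `inplace` is the one to use.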

knotty crystal Nov 7, 2022, 10:03 PM

#

I have some questions on Reinforcement Learning, is there someone available to assist ?

gaunt anvil Nov 7, 2022, 10:33 PM

#

Hi does anyone know what this error means? Using https://github.com/rishikksh20/HiFi-GAN on commit 7c049f9

Traceback (most recent call last):
  File "/home/user/HiFi-GAN/utils/train.py", line 87, in train
    step)
  File "/home/user/HiFi-GAN/utils/validation.py", line 25, in validate
    sc_loss, mag_loss = stft_loss(fake_audio[:, :, :audio.size(2)].squeeze(1), audio.squeeze(1))
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/HiFi-GAN/utils/stft_loss.py", line 130, in forward
    sc_l, mag_l = f(x, y)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/HiFi-GAN/utils/stft_loss.py", line 89, in forward
    x_mag = stft(x, self.fft_size, self.shift_size, self.win_length, self.window)
  File "/home/user/HiFi-GAN/utils/stft_loss.py", line 23, in stft
    x_stft = torch.stft(x, fft_size, hop_size, win_length, window)
  File "/home/user/.local/lib/python3.6/site-packages/torch/functional.py", line 573, in stft
    normalized, onesided, return_complex)
RuntimeError: stft input and window must be on the same device but got self on cuda:0 and window on cpu

#

how would i fix this error?

dusty valve Nov 7, 2022, 10:45 PM

#

got this while training keras sequential model 2022-11-07 17:43:04.428412: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 504022000 exceeds 10% of free system memory.

#

model = Sequential((
    layers.Embedding(len(char2idx.items()), 256),
    layers.LSTM(128, activation='relu'),
    layers.Dense(len(char2idx.items()), activation='softmax')
))

#

never got that before

#

nvm fixed

#

browser was taking up 6 gigs

upbeat dagger Nov 7, 2022, 10:49 PM

#

gaunt anvil Hi does anyone know what this error means? Using <https://github.com/rishikksh20...

I'm not sure how to fix it exactly, but the error is basically saying that the input tensor (self) is on the GPU and the window is on the CPU. Moving the window onto the GPU would be my first thought.

#

Question for you all. Zip codes are categorical data stored as numbers right?

If I didn't have a column name or data dictionary to tell me, how might I test whether I'm looking at categorical data rather than numeric data?

turbid wolf Nov 7, 2022, 11:01 PM

#

I would use https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.nunique.html for that specific column and then judge from the results to see whether I'm dealing with numeric or categorical data. Data dictionary would be ideal though
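A rough sketch of that check, using `Series.nunique` on single columns. The helper name `cardinality_report` and the sample values are made up, and the ratio is only a heuristic: an ID-like column is also "all unique" without being a measurement.

```python
import pandas as pd


def cardinality_report(series):
    """Count distinct values and relate that count to the column length."""
    n_unique = series.nunique()
    return {"n_unique": n_unique, "unique_ratio": n_unique / len(series)}


# Hypothetical columns: repeated codes vs. continuous measurements.
zip_codes = pd.Series([10001, 10001, 94105, 60601, 94105, 10001])
temperatures = pd.Series([20.1, 21.3, 19.8, 22.0, 20.7, 18.9])

print(cardinality_report(zip_codes))
print(cardinality_report(temperatures))
```

A low ratio of distinct values to rows hints at codes or categories; another sanity check is whether arithmetic on the values (a mean zip code, say) would mean anything.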

hasty mountain Nov 7, 2022, 11:03 PM

#

gaunt anvil Hi does anyone know what this error means? Using <https://github.com/rishikksh20...

RuntimeError: stft input and window must be on the same device but got self on cuda:0 and window on cpu

Move the window tensor you're using to the GPU with window_value = window_value.cuda()

steady basalt Nov 7, 2022, 11:38 PM

#

man im so damn rusty py_guido

#

time to get back into the swing

#

thinking kaggle

soft badge Nov 8, 2022, 12:05 AM

#

What is the best introductory course on machine learning?

#

Or a site to learn from?

steady basalt Nov 8, 2022, 12:26 AM

#

id just google about read some pages people wrote

#

coursera do paid courses

novel python Nov 8, 2022, 12:27 AM

#

I'm getting "AttributeError: 'Series' object has no attribute 'columns'", but every item in my list is a dataframe with the attribute columns. When I access them individually it works, but when I try accessing it in a for loop it gives me this error.

steady basalt Nov 8, 2022, 12:27 AM

#

i probably wont pay for another ML course in my life

steady basalt Nov 8, 2022, 12:27 AM

#

novel python I'm getting "AttributeError: 'Series' object has no attribute 'columns'", but ev...

huh , interesting

#

each item is a DF object?

#

and for x in list, x.columns doenst work?

novel python Nov 8, 2022, 12:28 AM

#

it doesn't, and I can't really tell why since I can access them outside the for loop

steady basalt Nov 8, 2022, 12:28 AM

#

show loop and list

#

and how u called the objects

novel python Nov 8, 2022, 12:29 AM

#

List is:

jan_feb_items = [
    jan_feb_average_0,
    jan_feb_average_1,
    jan_feb_average_2,
    jan_feb_average_3,
    jan_feb_average_4,
    jan_feb_average_5,
    jan_feb_average_6,
    jan_feb_average_7,
    jan_feb_average_8,
    jan_feb_average_9,
    jan_feb_average_10
]

And I'm trying to access a particular column for every dataframe in that list with:

for items in jan_feb_items:
    print(f"{items.columns[8]} amount of predictions within 0.25GB: ", len(items[abs(items[items.columns[8]]) <= 0.25]), len(items[abs(items[items.columns[8]]) <= 0.25])/len(items))

#

if I do "jan_feb_average_0.columns[8]" it works

steady basalt Nov 8, 2022, 12:31 AM

#

and how doesnt it work?

#

without the [8]?

novel python Nov 8, 2022, 12:32 AM

#

novel python I'm getting "AttributeError: 'Series' object has no attribute 'columns'", but ev...

it returns this error

#

inside the for loop

steady basalt Nov 8, 2022, 12:32 AM

#

try just printing part?

novel python Nov 8, 2022, 12:33 AM

#

it works completely fine

#

I could create a print statement for each object in that list, I just thought it'd be completely inneficient

steady basalt Nov 8, 2022, 12:33 AM

#

yuh i think its cause ur finding the len of something within the object

novel python Nov 8, 2022, 12:33 AM

#

since the for loop should work fine

steady basalt Nov 8, 2022, 12:33 AM

#

ur doing items items

#

so i think its trying to get series

#

len(items[abs(items[items.

#

must be that right

#

ok

novel python Nov 8, 2022, 12:34 AM

#

hmm let me check

steady basalt Nov 8, 2022, 12:34 AM

#

ur calling items p

#

items [ .. ... .. <<< thats a series

#

cause its party of items

#

rmemeber when u call a df column/series u do that too

serene scaffold Nov 8, 2022, 12:35 AM

#

novel python List is: ```jan_feb_items = [ jan_feb_average_0, ...

if you have a list of dataframes, you're almost certainly approaching the problem wrong

steady basalt Nov 8, 2022, 12:35 AM

#

this is a bit true but the actual issue here is ur doing 'items[' which is inherently a series once u do that

#

if memory serves correctly

#

anyway dict is usally better

#

but ive in hte past for some reason also done a list, cant recall why

novel python Nov 8, 2022, 12:36 AM

#

serene scaffold if you have a list of dataframes, you're almost certainly approaching the proble...

I wanted to print a statement that's present in each of the dataframes, so I thought it'd be better if I created a list instead of accessing them individually with 10 different print statements

serene scaffold Nov 8, 2022, 12:36 AM

#

if you do print(jan_feb_average_0.head().to_dict('list')), show the text (no screenshots), and explain what you're trying to do without any code, I will try to help.

steady basalt Nov 8, 2022, 12:37 AM

#

makes sense, altho cud be better to dictionary th em

#

the error tho is because of ur syntax, ur trying to find the columns of a series, where you did items[ you basically said series

#

ud need to do items.columns only

novel python Nov 8, 2022, 12:39 AM

#

serene scaffold if you do `print(jan_feb_average_0.head().to_dict('list'))`, show the text (no s...

{'LINE__R.CARRIER_DEVICE_CATEGORY_FORMULA__C': ['Tablet', 'Tablet', 'Tablet', 'Tablet', 'Aircard'], 'January': [0.314, 0.544, 0.0, 0.0, 0.0], 'February': [0.045, 0.685, 0.0, 0.0, 0.0], 'March': [0.0, 0.389, 0.0, 0.0, 0.0], 'Variance': [0.018090250000000002, 0.004970250000000001, 0.0, 0.0, 0.0], 'Standard Deviation': [0.1345, 0.07050000000000001, 0.0, 0.0, 0.0], 'Average': [0.1795, 0.6145, 0.0, 0.0, 0.0], 'Last Month Model Prediction': [0.045, 0.685, 0.0, 0.0, 0.0], 'Last Month Model Difference': [0.045, 0.29600000000000004, 0.0, 0.0, 0.0]}

All I'm trying to do is access the 8th column of each of these dataframes and check how many values of that column are below a certain threshold. The other columns are completely useless for that analysis. So jan_feb_average_1, 2, 3, 4, etc. are the same shape, just with different values based on their average

novel python Nov 8, 2022, 12:39 AM

#

steady basalt ud need to do items.columns only

i'll check that

steady basalt Nov 8, 2022, 12:39 AM

#

if u know the column name, just specify items[column name]

#

and then theres some syntax ive long forgotten to check values below threshold but its easily googlable

#

for x in the list, get values under threshold of x

#

u wudnt need to make series like you did

serene scaffold Nov 8, 2022, 12:42 AM

#

novel python {'LINE__R.CARRIER_DEVICE_CATEGORY_FORMULA__C': ['Tablet', 'Tablet', 'Tablet', 'T...

All I'm trying to do is access the 8th column of each of these dataframes and check how many values of that column are below a certain threshold.
Assuming the threshold is 0.25, the solution is just

In [6]: df.iloc[:, 8].le(.25).sum()
Out[6]: 4

steady basalt Nov 8, 2022, 12:42 AM

#

well, you would but then u wudnt use columns

serene scaffold Nov 8, 2022, 12:42 AM

#

le means "less than or equal"

steady basalt Nov 8, 2022, 12:42 AM

#

id actually specify a series as you did

steady basalt Nov 8, 2022, 12:43 AM

#

serene scaffold > All I'm trying to do is access the 8th column of each of these dataframes and ...

basically this in the for loop instead of df ud put x

#

that ought to work and print them all

serene scaffold Nov 8, 2022, 12:44 AM

#

steady basalt basically this in the for loop instead of df ud put x

I didn't look at their code that closely except to identify that .25 has some significance, since we've already established that that code doesn't work, and they told me what they're trying to do.

steady basalt Nov 8, 2022, 12:44 AM

#

yes your soluition is extremely pandonic

#

i didnt even know about le()

serene scaffold Nov 8, 2022, 12:45 AM

#

there's eq, ne, lt, gt, le, and ge

#

so they all match the dunder methods
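A quick illustration on a made-up Series, showing that the method form and the operator form give the same boolean mask:

```python
import pandas as pd

s = pd.Series([0.1, 0.25, 0.4])

# Method form and operator form produce identical boolean Series.
le_mask = s.le(0.25)   # same as s <= 0.25
gt_mask = s.gt(0.25)   # same as s > 0.25

print(le_mask.tolist(), gt_mask.tolist())
print(le_mask.equals(s <= 0.25))
```

The method form is mostly handy for chaining, as in `.le(.25).sum()`.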

steady basalt Nov 8, 2022, 12:45 AM

#

niced

serene scaffold Nov 8, 2022, 12:46 AM

#

@novel python it looks to me like your problem is that you were trying to get "the 8th column" when your columns are indexed by strings. in which case you need .iloc to do position-based indexing.

#

I guess that also means the solution might be df.iloc[:, 7].le(.25).sum(), depending on whether you consider the leftmost column to be 1st or 0th.
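Put together for several frames, a sketch could look like this. The first frame is the five-row sample posted above; the second frame and the dict keys are hypothetical, and `.abs()` is included because the original loop compared absolute values.

```python
import pandas as pd

# First five rows as posted in the thread.
jan_feb_average_0 = pd.DataFrame({
    "LINE__R.CARRIER_DEVICE_CATEGORY_FORMULA__C": ["Tablet", "Tablet", "Tablet", "Tablet", "Aircard"],
    "January": [0.314, 0.544, 0.0, 0.0, 0.0],
    "February": [0.045, 0.685, 0.0, 0.0, 0.0],
    "March": [0.0, 0.389, 0.0, 0.0, 0.0],
    "Variance": [0.018090250000000002, 0.004970250000000001, 0.0, 0.0, 0.0],
    "Standard Deviation": [0.1345, 0.07050000000000001, 0.0, 0.0, 0.0],
    "Average": [0.1795, 0.6145, 0.0, 0.0, 0.0],
    "Last Month Model Prediction": [0.045, 0.685, 0.0, 0.0, 0.0],
    "Last Month Model Difference": [0.045, 0.29600000000000004, 0.0, 0.0, 0.0],
})
# Hypothetical second frame: same shape, different values in the last column.
jan_feb_average_1 = jan_feb_average_0.assign(
    **{"Last Month Model Difference": [0.5, 0.1, 0.3, 0.0, 0.9]}
)

# A dict keeps a readable name attached to each frame.
frames = {
    "jan_feb_average_0": jan_feb_average_0,
    "jan_feb_average_1": jan_feb_average_1,
}

counts = {}
for name, frame in frames.items():
    column = frame.iloc[:, 8]        # position-based: 0-based index 8 is the 9th column
    within = column.abs().le(0.25)   # boolean mask of values within the threshold
    counts[name] = int(within.sum())
    print(f"{name}: {column.name}: {counts[name]} of {len(frame)} within 0.25 ({within.mean():.0%})")
```

If the column name is known and stable, selecting it by name instead of by position would be less fragile.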

steady basalt Nov 8, 2022, 12:47 AM

#

God I hope I’ll be allowed to install Linux at my new company I’ve had it up to here with windows cmd

serene scaffold Nov 8, 2022, 12:47 AM

#

steady basalt God I hope I’ll be allowed to install Linux at my new company I’ve had it up to ...

use cmder

steady basalt Nov 8, 2022, 12:47 AM

#

Willing to learn Linux properly to dodge windows at this point

serene scaffold Nov 8, 2022, 12:47 AM

#

https://cmder.app/

Cmder | Console Emulator

cmder is software package that provides great console experience even on Windows

steady basalt Nov 8, 2022, 12:48 AM

#

It’s not just for the command line interface itts the os in general

#

Felt kinda nice using a Linux cluster

#

Simple and good

#

Makes environments easier

novel python Nov 8, 2022, 12:49 AM

#

serene scaffold @novel python it looks to me like your problem is that you were trying t...

worked like a charm. Thank you very much!

serene scaffold Nov 8, 2022, 12:49 AM

#

companies usually won't let you install linux on a company issued computer, unless they have a specific linux image that they maintain with all the security stuff they want.

steady basalt Nov 8, 2022, 12:49 AM

#

Cringe

serene scaffold Nov 8, 2022, 12:49 AM

#

novel python worked like a charm. Thank you very much!

no problem!

steady basalt Nov 8, 2022, 12:49 AM

#

I have hell awaiting then

serene scaffold Nov 8, 2022, 12:50 AM

#

steady basalt I have hell awaiting then

my work laptop is Windows, but with cmder and the C++ build tools, it's not that bad.

steady basalt Nov 8, 2022, 12:50 AM

#

Wonder if I can sneak work on my personal mac and copy paste the end project back

#

Prob not allowed

serene scaffold Nov 8, 2022, 12:50 AM

#

have fun getting fired

steady basalt Nov 8, 2022, 12:50 AM

#

True

serene scaffold Nov 8, 2022, 12:50 AM

#

you can't pick between windows and mac?

steady basalt Nov 8, 2022, 12:50 AM

#

No I beleive they all use Lenovo

#

Like most companies

#

Gona have to wait and see

serene scaffold Nov 8, 2022, 12:51 AM

#

my company lets you pick between a macbook pro and some Dell model

steady basalt Nov 8, 2022, 12:51 AM

#

Damn that’s amazing

serene scaffold Nov 8, 2022, 12:51 AM

#

but no matter which you pick, it comes with a specific image of that operating system with tons of security crap on it

steady basalt Nov 8, 2022, 12:51 AM

#

U know it’s quite funny how they’ve titled me DS consultant but I’ll be working on coding projects for them like some data pipelines and possible ml. normal?

#

That sounds like two roles in one job!

serene scaffold Nov 8, 2022, 12:52 AM

#

job titles in the data science sphere are kind of arbitrary

steady basalt Nov 8, 2022, 12:52 AM

#

serene scaffold but no matter which you pick, it comes with a specific image of that operating s...

Mcafee pop ups lol

serene scaffold Nov 8, 2022, 12:52 AM

#

we don't use mcafee, no

steady basalt Nov 8, 2022, 12:53 AM

#

My current company didn’t even try to counter when I handed notice

#

Was strange, guess they don’t care

#

I don’t got much experience w these things

serene scaffold Nov 8, 2022, 12:53 AM

#

I've heard it's usually a bad idea to accept a counter offer anyway

steady basalt Nov 8, 2022, 12:53 AM

#

Why

serene scaffold Nov 8, 2022, 12:54 AM

#

because you've already revealed that you wanted to leave and were about to, which is a negative signal.

steady basalt Nov 8, 2022, 12:54 AM

#

True, and they have acted weird about it too

#

I actively warned them I was interviewing and I was still not transparent enough apparently

#

I gave them the courtesy of notice instead of insta raise and changing to remote working so I wasn’t expecting to get any stick

#

Not to mention I hadn’t signed shit, so it shud go both ways and yet the manager still had to double check if I deserve holidays

desert oar Nov 8, 2022, 12:57 AM

#

steady basalt U know it’s quite funny how they’ve titled me DS consultant but I’ll be working ...

completely. data scientist is a more desired job title, but most companies need data engineers more than they need data scientists. moreover, actually good data engineers are in extreme demand, hard to find, and expensive. hence you have a lot of companies trying to pass off data engineering jobs as data science jobs, and a lot of candidates wanting to get into data science trying to hack it as under-qualified data engineers.

steady basalt Nov 8, 2022, 12:58 AM

#

desert oar completely. data scientist is a more desired job title, but most companies need ...

Wow. You know I actually pitched my novice ass DE work in the interview and they liked it a lot.. they got a pretty cheap deal tho imo

#

I’d honestly not mind following and asking later to move over to DE, as I don’t beleive DS is inherently better

#

Nor should it be desired more

desert oar Nov 8, 2022, 12:59 AM

#

you'll learn a lot either way. and being a good data engineer nowadays is always going to be an asset as a data science candidate

steady basalt Nov 8, 2022, 1:00 AM

#

Yeah, but the fact that I’m also “consultant” adds extra workload ??

desert oar Nov 8, 2022, 1:00 AM

#

btw regarding counteroffers, it works both ways: a company isn't going to want to throw extra money at someone they know is dissatisfied and going to leave anyway

steady basalt Nov 8, 2022, 1:00 AM

#

Was curious why it’s in the title if I’ll be doing technical work - makes it look to other employers that I did only meetings and presentations

desert oar Nov 8, 2022, 1:01 AM

#

steady basalt Yeah, but the fact that I’m also “consultant” adds extra workload ??

idk

steady basalt Nov 8, 2022, 1:01 AM

#

desert oar btw regarding counteroffers, it works both ways: a company isn't going to want t...

Me and my colleague got contracts from that company that was below our current rate

#

Was kinda a bit rude tbh given we do more than agreed to intusll

desert oar Nov 8, 2022, 1:01 AM

#

steady basalt Was curious why it’s in the title if I’ll be doing technical work - makes it loo...

i don't think that's true. just say that you ended up being more of a data engineer than a data scientist, but you learned a lot

steady basalt Nov 8, 2022, 1:02 AM

#

Initially*

steady basalt Nov 8, 2022, 1:02 AM

#

desert oar i don't think that's true. just say that you ended up being more of a data engin...

Yeah, not the DS but the “consultant” is something I associate with purely soft skills

desert oar Nov 8, 2022, 1:02 AM

#

steady basalt God I hope I’ll be allowed to install Linux at my new company I’ve had it up to ...

WSL at least?

desert oar Nov 8, 2022, 1:02 AM

#

steady basalt Yeah, not the DS but the “consultant” is something I associate with purely soft ...

nah, that's like if you're a consultant at a "consulting agency"

#

there are plenty of actually technical consultants

#

it more describes the nature of the employment relationship than anything

steady basalt Nov 8, 2022, 1:03 AM

#

It does confuse me…

#

I haven’t used wsl

#

I could try to adapt and run things on cloud more

#

I also made it my mission to learn some etl tools

#

There’s a book pdf teaching a few

#

They’re a bit annoying sometimes tho

desert oar Nov 8, 2022, 1:05 AM

#

steady basalt I could try to adapt and run things on cloud more

docker is a much easier option

#

it's pretty important to know docker nowadays anyway

#

you don't even need a dockerfile, you can just do `docker run -it ubuntu:latest /bin/bash` and you're in

steady basalt Nov 8, 2022, 1:06 AM

#

Spark, Kafka, airflow

#

Etc , all things I shud really know at this point

steady basalt Nov 8, 2022, 1:06 AM

#

desert oar it's pretty important to know docker nowadays anyway

Good shout

#

My current role hasn’t rly required too much, and half of my queries have been like 5 left joins cause it’s a pretty late stage database

#

Left join x left join y left join z lol

#

One query

#

Bad practise?

#

Largely for fetching info for colleagues

desert oar Nov 8, 2022, 1:12 AM

#

steady basalt Spark, Kafka, airflow

spark is a huge can of worms, i honestly wouldn't worry about it. it takes a lot of infrastructure and tuning to run a cluster of JVM applications. you can start learning pyspark in a "local" cluster running on your own machine, but frankly it's becoming less relevant as purpose-built data warehouse tools catch up in their functionality, and most companies realize that they really do not need a computing cluster for their daily work.
kafka, kind of a specific tool. not necessary unless you are working with really high-volume data, and in that case you'll likely be able to collaborate with software engineers.
airflow is great to know. learn it, spend time practicing building pipelines with it

however, i think your highest priorities should be:

sql. get really good. if you think doing joins is good, you are a sql baby. learn about all different kinds of joins. convince yourself that an inner join is just a filtered cross join, which is a synonym for a cartesian product. learn about window functions and lateral joins, and understand when and why you want to use them. set up your own postgres server, populate it with more data than you're comfortable with (a couple tables of a few hundred million rows). learn about indexes and query plans. understand relational algebra.
excel. yes, fucking excel (or google sheets). learn about array functions, vlookup/hlookup/xlookup/index-match, and named ranges. try to make some sense of its internal data type system, especially how dates and percentages are stored. understand the difference between "3" and 3 in a cell.
try to make some kind of dashboardy thing somehow. there are a million tools for this, some more "industrial-strength" than others. data viz is never not important.
docker (as above)
airflow (as above)
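
a minimal sketch of the "inner join is just a filtered cross join" point, using python's built-in sqlite3 so nothing needs installing. the `users` and `orders` tables and their rows are made up purely for illustration:

```python
import sqlite3

# In-memory database with two tiny, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, name TEXT);
    CREATE TABLE orders (user_id INTEGER, item TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO orders VALUES (1, 'pen'), (1, 'ink'), (3, 'cup');
""")

# The usual inner join.
inner = conn.execute("""
    SELECT u.name, o.item
    FROM users u INNER JOIN orders o ON u.id = o.user_id
    ORDER BY u.name, o.item
""").fetchall()

# The same result as a cartesian product followed by a filter.
filtered_cross = conn.execute("""
    SELECT u.name, o.item
    FROM users u CROSS JOIN orders o
    WHERE u.id = o.user_id
    ORDER BY u.name, o.item
""").fetchall()

# The unfiltered cross join pairs every user with every order.
cross_size = conn.execute(
    "SELECT COUNT(*) FROM users CROSS JOIN orders"
).fetchone()[0]

print(inner == filtered_cross, cross_size)
```

the cross join has 3 x 3 = 9 rows, and the join condition just keeps the 3 that match.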

#

oh also: dbt. dbt is a great tool. learn it before or after airflow, but not before docker. it integrates pretty well with airflow, so they will be a good pair.

serene scaffold Nov 8, 2022, 1:14 AM

#

desert oar you don't even need a dockerfile, you can just do `docker run -it ubuntu:latest ...

how does this work in terms of containers? does it use the same container every time you do this command? how does it know?

desert oar Nov 8, 2022, 1:15 AM

#

finally, spend some time messing with aws s3 or google cloud storage. just practice uploading and downloading data with python and the command line. you can set up an ec2 or gcp instance as well and try to get comfortable poking around in a command line linux environment (figure out how to make your own ssh keys and connect using ssh; change the ssh daemon port and disable password auth to make sure you did it right). this won't take very long unless you're a total raw noob, in which case you should learn anyway. this you can do a little over time.

desert oar Nov 8, 2022, 1:16 AM

#

serene scaffold how does this work in terms of containers? does it use the same container every ...

i think there's a way to reuse the container. you might have to get the container id from `docker ps -a` and use `docker start` instead of `docker run`. you could probably set a label on the container and rerun it that way.

steady basalt Nov 8, 2022, 1:16 AM

#

Good advice on airflow and less so spark, I’ve already done s3 based dashboarding with Boto3

#

What can I do with airflow to make life easier

#

As for spark everyone keeps telling me to learn it but I get frustrated with it

#

Sql is a no-brainer to improve at, yes

#

I used ec2 to host the dashboard refreshes

desert oar Nov 8, 2022, 1:17 AM

#

steady basalt What can I do with airflow to make life easier

like, what is airflow useful for? it's great when you have task C that depends on task B that depends on task A. so start there: pick a sequence of 3 data processing tasks and run a small DAG on your computer

desert oar Nov 8, 2022, 1:18 AM

#

steady basalt I used ec2 to host the dashboard refreshes

nice, so you're not a total noob

steady basalt Nov 8, 2022, 1:18 AM

#

I was until I taught myself in the office last second

#

But I used boto3 to send the data w sql, no need for airflow for simple dash

#

Cause I used cron

desert oar Nov 8, 2022, 1:19 AM

#

steady basalt As for spark everyone keeps telling me to learn it but I get frustrated with it

feel free to ask for help here. the key thing to remember about spark is that operations on rdds and dataframes are not executed right away. it builds up a chain or sequence of computations, and executes them all at once when you collect the data or perform an aggregating operation. it's a lot like programmatically building up a big sql query, and then running it at the end.

also don't even bother with scala spark. pyspark is good enough for most things.
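
to make the "nothing runs until you collect" idea concrete, here is a toy sketch in plain python. this is not pyspark and not how spark is implemented: `LazyFrame` and its methods are hypothetical names, and it only mimics the behaviour of recording steps first and executing them later:

```python
class LazyFrame:
    """Toy stand-in for a Spark DataFrame: records steps, runs nothing yet."""

    def __init__(self, rows, steps=()):
        self._rows = rows
        self._steps = tuple(steps)

    def filter(self, predicate):
        # Only remember the step; no data is touched here.
        return LazyFrame(self._rows, self._steps + (("filter", predicate),))

    def map(self, func):
        return LazyFrame(self._rows, self._steps + (("map", func),))

    def collect(self):
        # The whole recorded chain is executed only now.
        out = list(self._rows)
        for kind, fn in self._steps:
            if kind == "filter":
                out = [row for row in out if fn(row)]
            else:
                out = [fn(row) for row in out]
        return out


calls = []  # records when the mapped function actually runs


def double(x):
    calls.append(x)
    return x * 2


plan = LazyFrame(range(6)).filter(lambda x: x % 2 == 0).map(double)
calls_before_collect = list(calls)  # still empty: nothing has executed
result = plan.collect()             # the chain runs here

print(calls_before_collect, result)
```

building `plan` is like writing the sql query, and `collect()` is like finally running it.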

desert oar Nov 8, 2022, 1:19 AM

#

steady basalt Cause I used cron

practice by replacing cron with airflow

#

and if you have multiple steps in the pipeline, practice by breaking it up into separate airflow tasks

steady basalt Nov 8, 2022, 1:21 AM

#

So run airflow on Linux?

#

I’m sure it’s straightforward but I didn’t try yet

#

As for pyspark yes I don’t know scala, but even the syntax annoys me

desert oar Nov 8, 2022, 1:25 AM

#

steady basalt So run airflow on Linux?

it's a python app, runs anywhere

steady basalt Nov 8, 2022, 1:25 AM

#

desert oar practice by replacing cron with airflow

This would require re engineering things actually - as in running python scripts and heroku api

desert oar Nov 8, 2022, 1:25 AM

#

steady basalt As for pyspark yes I don’t know scala, but even the syntax annoys me

the syntax is literally python. pyspark is a python library.

desert oar Nov 8, 2022, 1:25 AM

#

steady basalt This would require re engineering things actually - as in running python scripts...

i mean do it on your computer as an exercise

#

or do something similar

steady basalt Nov 8, 2022, 1:26 AM

#

desert oar the syntax is literally python. pyspark is a python library.

Yeah but making dataframes requires type setting in tutorials

desert oar Nov 8, 2022, 1:26 AM

#

steady basalt Yeah but making dataframes requires type setting in tutorials

i don't know what you mean by this

steady basalt Nov 8, 2022, 1:26 AM

#

I had to import integer and string

desert oar Nov 8, 2022, 1:27 AM

#

steady basalt So run airflow on Linux?

airflow homework:

task 1: run script that writes data to a directory

task 2: load data in directory into db table

task 3: apply some data transformation that creates a different table

task 4: do some data quality checks

DAG:

T1 - T2 - T3
        \ T4
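
a rough sketch of that dependency structure using only the python standard library (`graphlib`, python 3.9+). this is not airflow code, just the ordering idea a scheduler works from. it assumes the diagram means T4 branches off T2, and the task functions are hypothetical placeholders:

```python
from graphlib import TopologicalSorter

log = []  # records the order in which tasks actually ran


# Placeholder task bodies; real ones would write files, load tables, etc.
def write_raw_files():
    log.append("T1")


def load_into_table():
    log.append("T2")


def build_transformed_table():
    log.append("T3")


def run_quality_checks():
    log.append("T4")


tasks = {
    "T1": write_raw_files,
    "T2": load_into_table,
    "T3": build_transformed_table,
    "T4": run_quality_checks,
}

# Each task maps to the set of tasks it depends on (assumed reading of the DAG).
depends_on = {
    "T1": set(),
    "T2": {"T1"},
    "T3": {"T2"},
    "T4": {"T2"},
}

# A topological order never runs a task before its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
for name in order:
    tasks[name]()

print(order)
```

T3 and T4 have no dependency on each other, so either may come first after T2.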

steady basalt Nov 8, 2022, 1:27 AM

#

I will try this tomorrow

desert oar Nov 8, 2022, 1:27 AM

#

steady basalt I had to import integer and string

right, because the whole thing is an interface to the spark engine. it's not using native python objects. it is a python library that interacts with an external system

steady basalt Nov 8, 2022, 1:28 AM

#

By that script writing data, did you mean creating a new csv

desert oar Nov 8, 2022, 1:28 AM

#

steady basalt By that script writing data, did you mean creating a new csv

whatever you want. maybe have one script that writes json and another that turns json into csv or parquet, then load that into the db

steady basalt Nov 8, 2022, 1:28 AM

#

I have plenty of data available

#

Ok

desert oar Nov 8, 2022, 1:29 AM

#

oh here's another good skill to practice: simulating fake data

steady basalt Nov 8, 2022, 1:29 AM

#

I have Athena as a source for downloading data

desert oar Nov 8, 2022, 1:29 AM

#

its ridiculous how many things we have to know and be good at

steady basalt Nov 8, 2022, 1:29 AM

#

True words

#

I’m still reading maths in my spare hours

desert oar Nov 8, 2022, 1:30 AM

#

i've been learning new things every week for like 10 years and i still feel like a novice at a lot of things

#

you're doing the right things

steady basalt Nov 8, 2022, 1:30 AM

#

I’m on a 1000 page precalc book cause catch-up innit

#

Who knew sets in python functions are in maths

desert oar Nov 8, 2022, 1:31 AM

#

heh, that's where the name "set" comes from!

#

you want to catch up quickly? watch the 3blue1brown calculus and linear algebra series

steady basalt Nov 8, 2022, 1:31 AM

#

I noticed today in one book a potential mistake in the example ?

#

Oh , no, calculus after precalc no linalg as I’d be busy for a year

#

Anyway

#

Say you have

#

Man I’ll just find it

#

Ok so it’s math syntax here, ) and ] being exclusive and inclusive of the set

iron basalt Nov 8, 2022, 1:33 AM

#

*Unlike learning frameworks / random software tools, math is deep knowledge, it will not be outdated.

steady basalt Nov 8, 2022, 1:33 AM

#

iron basalt Nov 8, 2022, 1:33 AM

#

Software tools come and go as they fall in and out of fashion.

steady basalt Nov 8, 2022, 1:33 AM

#

I know it’s basic shit but how is 3 included in the intersect?

#

I thought 3) meant it won’t be

#

A n B

#

Therefore 3 isn’t in A?

#

Oh my bad they wrote with ()

#

Bad memory, read that at like 2pm

desert oar Nov 8, 2022, 1:39 AM

#

steady basalt

this is good stuff. authors in calculus etc. will take it for granted that you know what these things are

#

fortunately you'll find that it all becomes very natural. it's a language that you will gain fluency with, much like programming.

iron basalt Nov 8, 2022, 1:40 AM

#

desert oar fortunately you'll find that it all becomes very natural. it's a language that y...

And eventually you can make up your own, like making your own programming languages.

desert oar Nov 8, 2022, 1:40 AM

#

yep. or at least get better at piecing ideas together to produce nontrivial outputs

#

rather than just relying on what other people have done already

iron basalt Nov 8, 2022, 1:40 AM

#

And also start to see how many are the same thing but with different paint (abstract algebra).

#

(Also like with programming languages)

desert oar Nov 8, 2022, 1:41 AM

#

iron basalt And also start to see how many are the same thing but with different paint (abst...

i didnt really like my algebra course much, so you can imagine my dismay when we ended up using group theory in my topology course

#

i never ever did well with combinatorial stuff, and we were doing algebra with permutations early in the course, and that kinda threw me off for the whole semester

#

linear algebra is also like that

steady basalt Nov 8, 2022, 1:42 AM

#

I somewhat enjoy what I’m learning now and I know Linalg,probability and more pure stats should be important im not planning on it any time soon as I think precalc and calc are more enjoyable

desert oar Nov 8, 2022, 1:42 AM

#

it's wild to see how many things can be reduced to linear algebra problems (including calculus problems)

desert oar Nov 8, 2022, 1:42 AM

#

steady basalt I somewhat enjoy what I’m learning now and I know Linalg,probability and more pu...

precalc and calc are essential for doing probability anyway

steady basalt Nov 8, 2022, 1:42 AM

#

I think it will take me at least a year

iron basalt Nov 8, 2022, 1:43 AM

#

I don't think students are really prepared for abstract algebra when they get there, and it hits them like a train. Why? https://www.amazon.com/Mathematicians-Lament-School-Fascinating-Imaginative/dp/1934137170

A Mathematician's Lament: How School Cheats Us Out of Our Most Fascinating and Imaginative Art Form

steady basalt Nov 8, 2022, 1:43 AM

#

Because I want to demolish the calc preface test in my book

#

And that requires some good precalc skills plus algebra

#

Even down to simple rules

#

It’s been years

#

I realised I was getting ahead of myself so I stepped back and went back to basics

desert oar Nov 8, 2022, 1:44 AM

#

iron basalt I don't think students are really prepared for abstract algebra when they get th...

i felt pretty prepared for it by real analysis and linear algebra. i thought the concepts were intuitive. but i just wasn't very good at it. i did better in linear algebra, i could kinda sorta visualize things like the rank nullity theorem even in more abstract vector spaces

iron basalt Nov 8, 2022, 1:44 AM

#

Some are fine with it, but those are probably self-taught (obsessed), and have probably even studied it before reaching that class.

desert oar Nov 8, 2022, 1:44 AM

#

and i did great in topology. again, stuff i could visualize

steady basalt Nov 8, 2022, 1:45 AM

#

What u guys doing in DS, u could be researchers ?

desert oar Nov 8, 2022, 1:45 AM

#

the problem i had w/ algebra is that it's the first time you've seen some really general abstract shit, and i don't think it was all too well motivated

steady basalt Nov 8, 2022, 1:45 AM

#

Topology sounds very scary

desert oar Nov 8, 2022, 1:45 AM

#

steady basalt What u guys doing in DS, u could be researchers ?

data scientist in industry

iron basalt Nov 8, 2022, 1:45 AM

#

Linear algebra is usually fine still, specifically because it has so many directly understandable applications.

desert oar Nov 8, 2022, 1:46 AM

#

i have used abstract algebra concepts exactly 0 times in data science, but plenty of times when learning functional programming. who knew?

steady basalt Nov 8, 2022, 1:46 AM

#

Functional programming, seriously?

#

All u do is split up a script

iron basalt Nov 8, 2022, 1:47 AM

#

"A monad is just a monoid in the category of endofunctors"

desert oar Nov 8, 2022, 1:47 AM

#

steady basalt Functional programming, seriously?

yeah, a lot of functional programming is very "mathematical", and haskell in particular tends to use a lot of clever math-derived abstractions like monoids

desert oar Nov 8, 2022, 1:47 AM

#

steady basalt All u do is split up a script

that's not what functional programming is

steady basalt Nov 8, 2022, 1:47 AM

#

I’m not bothered to move on from python any time soon anyways

desert oar Nov 8, 2022, 1:47 AM

#

it's not "breaking your code up into functions"

#

yeah, honestly don't even bother. it's something you'll come across eventually

#

focus on the stuff you actually need to focus on

steady basalt Nov 8, 2022, 1:48 AM

#

I was briefly interested in c and js

iron basalt Nov 8, 2022, 1:49 AM

#

iron basalt *Unlike learning frameworks / random software tools, math is deep knowledge, it ...

The point here is that things like airflow, spark, etc, will eventually be replaced by the new shiny thing (usually), but math will remain.

#

BUT, you still need to know some of them to do anything.

steady basalt Nov 8, 2022, 1:49 AM

#

Man I’m so shocked my new role didn’t have technical round w coding or theory, they just looked at last work

desert oar Nov 8, 2022, 1:49 AM

#

i would argue that sql is becoming pretty much timeless, and that the underlying concepts will never go away even if the query interfaces do

iron basalt Nov 8, 2022, 1:49 AM

#

Relational databases yes, SQL, IDK, I hope not.

desert oar Nov 8, 2022, 1:49 AM

#

it's held on for 50 years already

iron basalt Nov 8, 2022, 1:50 AM

#

Yea and so has C and I hope it goes away too.

#

I don't think we will be programming in C in 100 years.

steady basalt Nov 8, 2022, 1:50 AM

#

Isn’t C so legacy it can never go?

iron basalt Nov 8, 2022, 1:50 AM

#

If we are then I will be VERY sad.

serene scaffold Nov 8, 2022, 1:51 AM

#

steady basalt Isn’t C so legacy it can never go?

it's ubiquitous for writing kernel code, if nothing else. but then it's also the implementation language for python.

iron basalt Nov 8, 2022, 1:51 AM

#

steady basalt Isn’t C so legacy it can never go?

It can go away, but it will take a shake up.

#

Then there is Zig, which is designed to slowly delete C.

steady basalt Nov 8, 2022, 1:51 AM

#

I wonder what happens once we get quantum computers in users hands

#

I heard they’re very fast

#

I know nothing about it though

iron basalt Nov 8, 2022, 1:52 AM

#

Quantum computers give speedups to specific problems, but also we just don't have enough qubits.

#

And probably never will.

warm verge Nov 8, 2022, 1:52 AM

#

Lotta speculation

iron basalt Nov 8, 2022, 1:52 AM

#

But countries like to flex their quantum computers because the word "quantum".

steady basalt Nov 8, 2022, 1:53 AM

#

What do you mean by not enough qubits

serene scaffold Nov 8, 2022, 1:53 AM

#

steady basalt I wonder what happens once we get quantum computers in users hands

I don't think that's going to happen. their proposed use cases are speculative at best. We're probably headed towards a "quantum winter" if no one delivers on the hype.

iron basalt Nov 8, 2022, 1:54 AM

#

What we can expect is something like neuromorphic processors, which are way more energy efficient, but the tradeoff is giving up the von Neumann architecture.

steady basalt Nov 8, 2022, 1:54 AM

#

Whilst we speculate on future technology, do we think AR visors will become mainstream once you get a irl HUD with really sleek implementation and interface

desert oar Nov 8, 2022, 1:54 AM

#

i suspect that militaries and intelligence orgs will continue to invest in quantum stuff for crypto breaking

steady basalt Nov 8, 2022, 1:54 AM

#

And will people buy apple or Meta

serene scaffold Nov 8, 2022, 1:54 AM

#

steady basalt And will people buy apple or Meta

if AR takes off, I doubt it will be because of Meta.

iron basalt Nov 8, 2022, 1:54 AM

#

If you look at how much energy we need to use compared to the brain, to do less, there is huge room for improvement.

steady basalt Nov 8, 2022, 1:54 AM

#

Sucks because I bag hold @serene scaffold

mild dirge Nov 8, 2022, 1:55 AM

#

I buy apples all the time

serene scaffold Nov 8, 2022, 1:55 AM

#

bag hold?

steady basalt Nov 8, 2022, 1:55 AM

#

Their stock is dead

desert oar Nov 8, 2022, 1:55 AM

#

mild dirge I buy apples all the time

i made an apple pie a couple weeks ago

iron basalt Nov 8, 2022, 1:55 AM

#

And unlike quantum we don't need to guess, we have brains, they exist and work.

#

Proof of concept biologically.

serene scaffold Nov 8, 2022, 1:55 AM

#

but do we even have free will?

steady basalt Nov 8, 2022, 1:55 AM

#

Hell why can’t we just be immortal with synthetic organs, what stops death by age

#

No heart stopping means no dead

iron basalt Nov 8, 2022, 1:56 AM

#

serene scaffold but do we even have free will?

Maybe, if we can define it.

serene scaffold Nov 8, 2022, 1:56 AM

#

you have organs that need to work other than your heart.

steady basalt Nov 8, 2022, 1:56 AM

#

serene scaffold you have organs that need to work other than your heart.

Synthetic liver and kidneys

#

And lungs and pancreas

#

Just the brain really...

iron basalt Nov 8, 2022, 1:56 AM

#

Can't ask if we have X, before X is clearly defined.

steady basalt Nov 8, 2022, 1:57 AM

#

So when we have essentially all organs but the brain synthetic, can we live 1000 years

#

Or some sort of drug that stops aging and no need for surgery

serene scaffold Nov 8, 2022, 1:58 AM

#

iron basalt Can't ask if we have X, before X is clearly defined.

that reminds me of the time that someone told me that Mormonism's truth claims were ridiculous as compared to protestant Christianity, and I told them that you can't say that without first assuming that protestant Christianity is the baseline for religious normality.

iron basalt Nov 8, 2022, 1:58 AM

#

serene scaffold that reminds me of the time that someone told me that Mormonism's truth claims w...

Starting with the conclusion, which is what happens when things are not defined.

warm verge Nov 8, 2022, 1:58 AM

#

iron basalt Can't ask if we have X, before X is clearly defined.

Is it fair to assume something has to be defined for it to exist

steady basalt Nov 8, 2022, 1:58 AM

#

Mmm philosophy, how you can tell it’s 2am

desert oar Nov 8, 2022, 1:58 AM

#

serene scaffold that reminds me of the time that someone told me that Mormonism's truth claims w...

idk if that's true. A can be big compared to B, even if B is itself big.

#

differences vs. absolute positions

steady basalt Nov 8, 2022, 1:59 AM

#

Now, how do you know you’re not an artificial neural network

iron basalt Nov 8, 2022, 1:59 AM

#

Yes, because otherwise we could be arguing about whether two different things exist or not (your definition vs mine).

warm verge Nov 8, 2022, 1:59 AM

#

iron basalt Yes, because otherwise we could be arguing about whether two different things ex...

Kinda reductionist no?

serene scaffold Nov 8, 2022, 2:00 AM

#

desert oar idk if that's true. A can be big compared to B, even if B is itself big.

You still need a non-arbitrary basis for what unsubstantiated beliefs are normal, and then a way to measure distance from that.

steady basalt Nov 8, 2022, 2:01 AM

#

I think it’s a universal catastrophe that humans die

iron basalt Nov 8, 2022, 2:01 AM

#

warm verge Kinda reductionist no?

Yes, it's trying to be pragmatic about it. No agreed upon definition is an endless debate for philosophers that will never reach any conclusion. Which is why there will never be a conclusion about "free will" and everyone loves to "debate" about it.

steady basalt Nov 8, 2022, 2:01 AM

#

We’re so cool

warm verge Nov 8, 2022, 2:01 AM

#

iron basalt Yes, it's trying to be pragmatic about it. No agreed upon definition is an endle...

Then it follows that no one can "debate" about anything because no one will agree completely on any definition

iron basalt Nov 8, 2022, 2:02 AM

#

Don't need complete, just a majority, even just a good chunk.

steady basalt Nov 8, 2022, 2:02 AM

#

Am I a Boltzmann brain?

#

First answer is truth

warm verge Nov 8, 2022, 2:03 AM

#

iron basalt Don't need complete, just a majority, even just a good chunk.

Who defines it as a majority

iron basalt Nov 8, 2022, 2:03 AM

#

And between those chunks, they debate forever in circles, or just "agree to disagree".

iron basalt Nov 8, 2022, 2:03 AM

#

warm verge Who defines it as a majority

Social dynamics? (not a who, just physics?)

warm verge Nov 8, 2022, 2:04 AM

#

Fairly reductionist tbqh

serene scaffold Nov 8, 2022, 2:04 AM

#

btw, I'm allowing this discussion because I trust that all participants will make this channel available for data science discussion if someone has a question.

steady basalt Nov 8, 2022, 2:04 AM

#

Could a ANN act as a brain in a vat

warm verge Nov 8, 2022, 2:04 AM

#

No

#

Yes

#

Maybe

steady basalt Nov 8, 2022, 2:05 AM

#

I think yes

#

shipit

iron basalt Nov 8, 2022, 2:05 AM

#

steady basalt Could a ANN act as a brain in a vat

Yes.

steady basalt Nov 8, 2022, 2:06 AM

#

(Non physical or printed, actually in binary)

iron basalt Nov 8, 2022, 2:06 AM

#

Yes.

steady basalt Nov 8, 2022, 2:06 AM

#

I concur, wonder if I am one

#

Unlikely

warm verge Nov 8, 2022, 2:07 AM

#

What is the function of it

steady basalt Nov 8, 2022, 2:07 AM

#

wdym?

warm verge Nov 8, 2022, 2:08 AM

#

What behaviour is the NN replicating of a "brain"

steady basalt Nov 8, 2022, 2:08 AM

#

i suppose the exact same as a real nn

#

stimulated and given data?

warm verge Nov 8, 2022, 2:09 AM

#

For what output

steady basalt Nov 8, 2022, 2:09 AM

#

why output? did we not learn visually/auditory or whatever

#

ah, you mean prediction?

#

yeah not like that

iron basalt Nov 8, 2022, 2:55 AM

#

steady basalt Now, how do you know you’re not an artificial neural network

Fun thing about this is the idea of "artificial". Humans are considered by many to be "intelligent", and something is "artificial" if it's made by humans. Humans make other humans, so are humans artificial? If so, that would make us AI.

desert oar Nov 8, 2022, 2:58 AM

#

serene scaffold You still need a non-arbitrary basis for what unsubstantiated beliefs are normal...

maybe i misread your comment, but i didn't see anything about having a normal baseline, just that one thing is bigger than another thing

#

interval vs. ratio scale

austere swift Nov 8, 2022, 3:02 AM

#

iron basalt Fun thing about this is the idea of "artificial". Humans are considered by many ...

well "artificial" also implies that the thing is not made naturally (i.e humans can grow crops, but those aren't artificial since they occur naturally)

#

the subject of that question doesn't really make sense in the first place anyways, it's pretty well understood that we don't consider humans as artificial intelligence so I don't see why the question is relevant.

iron basalt Nov 8, 2022, 3:06 AM

#

austere swift the subject of that question doesn't really make sense in the first place anyway...

Yeah it's kind of related to the idea of defining things. And that the definitions vary not only by group, but also by context.

#

*And interpretation. Going only by the written definition of "artificial", one could read it that way. But the written definition is not how "artificial" is commonly understood (which depends on group and context), and the common understanding can't really be written down as one singular, context-free sentence (though one could try to enumerate as many cases as possible).

rugged comet Nov 8, 2022, 3:18 AM

#

Does it make sense to use L2 regularization in non-linear models?

desert oar Nov 8, 2022, 3:28 AM

#

rugged comet Does it make sense to use L2 regularization in non-linear models?

yes, i think it's actually more common than L1 in e.g. deep learning

#

dropout is also practically a form of regularization in NNs

rugged comet Nov 8, 2022, 3:30 AM

#

I'm combining L2 reg and dropout in my MtG color-prediction model.

desert oar Nov 8, 2022, 3:30 AM

#

afaik thats very common

rugged comet Nov 8, 2022, 3:30 AM

#

Alright. Sounds good. Thanks.

desert oar Nov 8, 2022, 3:30 AM

#

L1 vs L2 matters a lot in linear models where you maybe want to do explicit feature selection

#

and in other contexts as well like where the model has no exact solution

#

they also have different interpretations in a bayesian statistics framework

#

there's also "elastic net" regularization which is basically a weighted mix of L1 and L2

#

the weighting parameter (between 0 and 1) essentially becomes a tunable hyperparameter, in addition to the regularization strength itself
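
as a small sketch of that weighted mix: the function below is one common way to write the penalty, not any particular library's exact formula (some libraries put an extra factor of 1/2 on the L2 term). the names `strength` and `l1_ratio` are just labels chosen here:

```python
def elastic_net_penalty(weights, strength, l1_ratio):
    """Weighted mix of the L1 and squared-L2 penalties on `weights`.

    l1_ratio = 1.0 gives a pure L1 penalty, l1_ratio = 0.0 a pure L2 penalty.
    """
    l1 = sum(abs(w) for w in weights)   # sum of absolute values
    l2 = sum(w * w for w in weights)    # sum of squares
    return strength * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)


weights = [1.0, -2.0]
pure_l1 = elastic_net_penalty(weights, strength=1.0, l1_ratio=1.0)
pure_l2 = elastic_net_penalty(weights, strength=1.0, l1_ratio=0.0)
half_mix = elastic_net_penalty(weights, strength=1.0, l1_ratio=0.5)

print(pure_l1, pure_l2, half_mix)
```

so there are two knobs to tune: how strong the penalty is, and how it is split between L1 and L2.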

rugged comet Nov 8, 2022, 3:32 AM

#

Have you ever used keras_tuner for tuning hyperparameters? I forget if you're a tensorflow/keras person or not.

desert oar Nov 8, 2022, 3:33 AM

#

i have not, im not great with either NN framework but im better with pytorch. i knew a bit of tensorflow in the 1.x era and forgot all of it. i only know keras from reading docs & its broad similarity w/ pytorch in some aspects

#

i've done hyperparameter tuning with a handful of "black box" optimizers though, so maybe i can offer general advice

#

https://keras.io/keras_tuner/ this looks pretty much like how they all work

Keras documentation: KerasTuner

#

sad, they don't have halving search

#

it's in scikit-learn, you could probably write your own keras tuner class for it if you wanted

#

i've had really excellent results with it, better model performance and substantially faster than bayesian optimization in a couple of cases
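
for the curious, a rough sketch of the successive-halving idea behind halving search: evaluate every candidate on a small budget, keep the best fraction, give the survivors a bigger budget, repeat. this is a toy, not scikit-learn's implementation, and `toy_score` is a hypothetical stand-in for "train with this budget and return a validation score":

```python
def toy_score(config, budget):
    # Hypothetical objective: the best config is 0.3, and a small
    # training budget lowers every score by the same amount.
    return -(config - 0.3) ** 2 - 1.0 / budget


def successive_halving(candidates, evaluate, min_budget=1, factor=2):
    survivors = list(candidates)
    budget = min_budget
    history = []  # (budget, number of candidates evaluated) per round
    while len(survivors) > 1:
        history.append((budget, len(survivors)))
        # Rank the current candidates at the current budget, best first.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        # Keep the top 1/factor of them and give those more budget.
        survivors = ranked[: max(1, len(ranked) // factor)]
        budget *= factor
    return survivors[0], history


candidates = [i / 10 for i in range(8)]  # 0.0, 0.1, ..., 0.7
best, history = successive_halving(candidates, toy_score)

print(best, history)
```

most of the cheap early evaluations are spent weeding out bad candidates, which is where the speedup over evaluating everything at full budget comes from.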

rugged comet Nov 8, 2022, 3:37 AM

#

What is Bayesian optimization?

desert oar Nov 8, 2022, 3:37 AM

#

it's a category of techniques for "black box" optimization, meaning trying to find the maximum of a completely unknown function for which you can only test individual points

rugged comet Nov 8, 2022, 3:38 AM

#

Would the individual points be the hyperparameters?

desert oar Nov 8, 2022, 3:38 AM

#

the individual points would be locations in the hyperparameter space + the model performance at those locations

#

so if you're searching over l1 and l2 regularization parameters, you are trying to optimize a completely unknown function of 2 inputs

#

the "bayesian" part is that you start with a prior distribution over some space of possible functions, and iteratively update your prior until you have a better and better estimate of the function

#

the most common technique for bayesian optimization is to use something called a "gaussian process"

#

https://en.wikipedia.org/wiki/Bayesian_optimization

Bayesian optimization

Bayesian optimization is a sequential design strategy for global optimization of black-box functions that does not assume any functional forms. It is usually employed to optimize expensive-to-evaluate functions.

#

there is also the Hyperband algorithm, which is based on the "multi-armed bandit" model

#

both Bayesian GP and Hyperband are listed in the keras docs here

rugged comet Nov 8, 2022, 3:41 AM

#

Can you talk a little bit about hyperparameter space or as keras calls it I believe, search space?

desert oar Nov 8, 2022, 3:41 AM

#

yeah, it's the search space for hyperparameters 🙂

#

have you ever done any "formal" math, maybe with sets and proofs?

rugged comet Nov 8, 2022, 3:42 AM

#

I have not.

desert oar Nov 8, 2022, 3:43 AM

#

have you heard of a "set" in math?

#

abstractly, it's just a collection of things.

(technically if you define it that way, you end up with an interesting logical paradox. look up russell's paradox if you're curious about that one.)

#

but if you think of such a collection as the complete collection of something, it's natural to interpret a set as its own little world. a "space" if you will.

#

so for example we might talk about the space of real numbers between 0 and 1

#

it's a self-contained universe, with some known rules and properties attached to it

#

there are many different specific kinds of spaces (e.g. vector spaces which are important in machine learning), but the concept is what i'm trying to illustrate here. the idea that a "space" is "a specific collection of things" and "some mathematical operations or properties" that define the space.

#

let's say you're fitting your neural network, and you're optimizing over the L2 regularization parameter and the dropout probability. you might say that the L2 parameters λ exist in one space, and the dropout probabilities p exist in another space. and if you take all possible pairs of L2 parameters and dropout probabilities, i.e. pairs of the form (λ, p), then you have defined a new space, consisting of all such pairs.

#

usually you wouldn't spend too much time studying the idea of spaces in general, but you'd start working with different kinds of spaces in math courses and eventually build up a general concept of "a space". and that's what we mean when we talk about "feature space" or "hyperparameter search space".

#

(although "feature space" is usually also specifically a vector space)

rugged comet Nov 8, 2022, 3:52 AM

#

It sounds like the search space in this context would be all the combinations of the possible values for the hyperparameters. The search space would be the collection of those combinations over which we search.

#

We can define the search space to limit the number of combinations.

#

How does one decide which search algorithm to use?

desert oar Nov 8, 2022, 3:58 AM

#

rugged comet It sounds like the search space in this context would be all the combinations of...

the search space is all possible combinations of hyperparameter values, which is usually infinite

desert oar Nov 8, 2022, 3:59 AM

#

rugged comet How does one decide which search algorithm to use?

pick one and hope it works 😉 for learning, start by understanding grid search and random search because they're simple and available in every framework
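
As a rough illustration of the two simple strategies, here is a sketch that assumes a made-up objective, `validation_loss`, in place of actually training a model. The function names, ranges and grids are all hypothetical:

```python
import random
from itertools import product

def validation_loss(l2, dropout):
    """Stand-in objective (hypothetical); a real one would train and score a model."""
    return (l2 - 0.01) ** 2 + (dropout - 0.3) ** 2

def grid_search(l2_grid, dropout_grid):
    """Evaluate every combination on a fixed grid and keep the best one."""
    return min(product(l2_grid, dropout_grid), key=lambda hp: validation_loss(*hp))

def random_search(n_trials, seed=0):
    """Sample combinations at random from continuous ranges and keep the best one."""
    rng = random.Random(seed)
    trials = [(10 ** rng.uniform(-4, -1), rng.uniform(0.0, 0.6)) for _ in range(n_trials)]
    return min(trials, key=lambda hp: validation_loss(*hp))

best_grid = grid_search([1e-4, 1e-3, 1e-2, 1e-1], [0.0, 0.25, 0.5, 0.75])
best_random = random_search(n_trials=16)
print("grid:", best_grid)
print("random:", best_random)
```

Grid search only ever tries the values you listed, while random search can land anywhere in the ranges, which is one reason both are worth understanding before moving on to fancier methods.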

gaunt anvil Nov 8, 2022, 4:23 AM

#

hasty mountain `RuntimeError: stft input and window must be on the same device but got self on ...

how would I do that? from my understanding line23 is causing the problems

#

also window is a tensor object .-.

void bone Nov 8, 2022, 5:02 AM

#

Hello, does anyone have an idea on how to use adaboost for multi class classification? Do I split the dataset into two and categorize each group as 1 and 0?
Also I'm new to machine learning and I'm kinda struggling with things like how to prepare datasets for machine learning models, and applying models for multi class and multilabel classification. I would appreciate any helpful links. Thanks

gaunt anvil Nov 8, 2022, 5:03 AM

#

gaunt anvil how would I do that? from my understanding line23 is causing the problems

ok i just did a window.to(0) and it seemed to have worked?? not sure

ruby depot Nov 8, 2022, 5:07 AM

#

ax1.plot(goog_data_signal.loc[goog_data_signal.positions==1.0].index, 
    goog_data_signal.price[goog_data_signal.positions==1.0],"^",markersize=5,color="M")

Can someone explain to me why i' have to write the code two times? if i already set the condition before why i have to do it again

rugged comet Nov 8, 2022, 5:09 AM

#

ruby depot ```py ax1.plot(goog_data_signal.loc[goog_data_signal.positions==1.0].index, ...

You never assigned goog_data_signal.positions == 1.0 to a variable so you have to type it twice. Other than that, I'm not seeing any other code that you typed twice.

ruby depot Nov 8, 2022, 5:12 AM

#

ax1.plot(goog_data_signal.price[goog_data_signal.positions==1.0],"^",markersize=5,color="M")

I was asking why it can't be like this. Why do I have to look up the index where positions is equal to 1 if I could do it directly with the code above? Obviously there is something that I'm missing here

hazy hare Nov 8, 2022, 5:23 AM

#

THE train had many loss 😦

rugged comet Nov 8, 2022, 5:36 AM

#

Sorry to hear that.

desert oar Nov 8, 2022, 6:18 AM

#

void bone Hello, does anyone have an idea on how to use adaboost for multi class classifi...

adaboost can handle multiple classes without any special encoding. are you using scikit-learn? it should work as long as your classes are integers. if your classes are text labels (e.g. "blue", "red", "green") then use LabelEncoder to process them

lapis sequoia Nov 8, 2022, 6:50 AM

#

rugged comet How does one decide which search algorithm to use?

people usually start with bfs and dfs, taking smaller problems where the number of states in the universe is small. once they've done that, they move on to beam search, best first, A*, and start using appropriate heuristics in these.

#

There's more than this but thats usually path I have followed and have seen people following.

#

Gives you knowledge and you get on with things step by step, and ofc there's no rule of thumb to follow it at all.

young granite Nov 8, 2022, 6:51 AM

#

if i got a df like this:

```py
     ID comp 1  amount 1 comp 2  amount 2 comp 3  amount 3 comp 4  amount 4
0  772D    D45       0.5    U45       0.3    T45       0.2    NaN       NaN
1  223P    D54       0.5    U54       0.5    NaN       NaN    NaN       NaN
2  212E    D45       0.6    U55       0.1    I23       0.2    Z23       0.1
```
and i want to transpose it in such a way that the unique values in "comp x" are cols and the amounts are the values of the cols

#

i tried it row-wise with .iloc and with for/if loops, as well as with .melt and .stack

rugged comet Nov 8, 2022, 6:53 AM

#

lapis sequoia people usually start with bfs and dfs taking smaller problems were number of sta...

I meant in the context of searching hyperparameter spaces. The options I have are RandomSearch, BayesianOptimization, and Hyperband.

lapis sequoia Nov 8, 2022, 6:53 AM

#

rugged comet I meant in the context of searching hyperparameter spaces. The options I have ar...

Oh right, I thought simple state space search kinda problems.

rugged comet Nov 8, 2022, 6:54 AM

#

Yeah not like searching a list or graph. It's somewhat different I think. Though some of the rules probably still apply.

lapis sequoia Nov 8, 2022, 7:05 AM

#

young granite if i got a df like this: ```py ID comp 1 amount 1 comp 2 amount 2 comp 3 ...

So I assume col D45 should have like 0.5, U45 0.3, T45 0.2, U54 0 etc for first row?

#

Or perhaps give an example. a small one.

young granite Nov 8, 2022, 7:08 AM

#

ok let me just do it by hand real quick

#

```py
    ID  D45  D54  U45  U54  U55  T45  I23  Z23
0  772D  0.5  0.0  0.3  0.0  0.0  0.2  0.0  0.0
1  223P  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
2  212E  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
```
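
One possible way to get from the first table to this layout, sketched under the assumption that there are exactly four `comp N` / `amount N` column pairs and that a missing component should become 0.0 (the intermediate names `pieces`, `long` and `wide` are just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": ["772D", "223P", "212E"],
    "comp 1": ["D45", "D54", "D45"], "amount 1": [0.5, 0.5, 0.6],
    "comp 2": ["U45", "U54", "U55"], "amount 2": [0.3, 0.5, 0.1],
    "comp 3": ["T45", np.nan, "I23"], "amount 3": [0.2, np.nan, 0.2],
    "comp 4": [np.nan, np.nan, "Z23"], "amount 4": [np.nan, np.nan, 0.1],
})

# Stack the four (comp, amount) column pairs on top of each other
# so that every row is one (ID, comp, amount) triple.
pieces = [
    df[["ID", f"comp {k}", f"amount {k}"]].rename(
        columns={f"comp {k}": "comp", f"amount {k}": "amount"}
    )
    for k in range(1, 5)
]
long = pd.concat(pieces, ignore_index=True).dropna(subset=["comp"])

# Spread the component names out into columns, one row per ID.
wide = (
    long.pivot(index="ID", columns="comp", values="amount")
    .fillna(0.0)
    .reindex(df["ID"])   # restore the original row order
)
print(wide)
```

The pivot step assumes no ID lists the same component twice; if that can happen, the duplicates would have to be summed first.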

#

@lapis sequoia

gray dove Nov 8, 2022, 8:14 AM

#

Has anyone here messed with conditional GANS for image generation before?

#

I am currently ankle-deep in it trying to work with an implementation I found that supposedly checks the boxes when it comes to best practices, but it just seems to skirt the edge of being useful and getting mode collapse

versed gulch Nov 8, 2022, 9:16 AM

#

does anyone know how to put titles/labels on the rows of the images

#

i.e. i want to put ground-truth and filtered on the first and second row of these images

lapis sequoia Nov 8, 2022, 9:57 AM

#

Hello guys, I am working on building a recommendation system for an online store. I got a list of of users, and a list of items. Was thinking to rank each item for each user based on the frequency a user has bought that Item.

I am following this tutorial for my implementation: https://realpython.com/build-recommendation-engine-collaborative-filtering/#when-can-collaborative-filtering-be-used

My question is: How will my model learn from newly generated data ?(after the train)

The steps I have in mind are:

Organise the data.
Train the model.
Make an api request to get recommendations for a specific user.
??? (How do I update the model with the new data if the recommendation was successful or not)

karmic ore Nov 8, 2022, 10:06 AM

#

lapis sequoia Hello guys, I am working on building a recommendation system for an online store...

it doesn't. all of the weights are statically set after training, unless you want to continue training from a specific checkpoint, and you would need the same hardware that you needed for training the model, so at that point it's kind of redundant. it's really hard to make continuously learning models

fossil ivy Nov 8, 2022, 10:28 AM

#

Is there something like doing too many iterations of a simulation?

#

Currently I take 50 runs and take the averages for visualization, its quite a smooth graph, I am wondering if it is too smooth

mighty patio Nov 8, 2022, 10:59 AM

#

versed gulch does anyone know how to put titiles/ labels on the row of the images

TBQH I would consider putting your labels inside figure, as you have a lot of empty space there to work with.
To answer your original question for your example I would need to see your code, i.e. how you make this plot

lapis sequoia Nov 8, 2022, 11:00 AM

#

karmic ore it doesnt all of the weights are satically set after training unless if u want t...

Got it, do you have any recommendations for some books/tutorials that I could use to understand more about recommendation systems ?

karmic ore Nov 8, 2022, 11:03 AM

#

lapis sequoia Got it, do you have any recommendations for some books/tutorials that I could us...

uh, it's complicated. machine learning is a set of tools you pick up from statistics and other topics, and it's not solely about recommendation systems. there are papers you can read and try building on, but there isn't really a set-in-stone methodology, because it depends on your inputs and what you consider a "good recommendation"

quaint plover Nov 8, 2022, 11:06 AM

#

Someone up for a code optimization challenge?
I've written a functioning piece of code that takes 50 minutes to run and I can't find ways to optimize it right now -- I have limited Python knowledge, so I might be doing something stupid

mighty patio Nov 8, 2022, 11:06 AM

#

@quaint plover you can just post your code, you don't have to ask to ask

quaint plover Nov 8, 2022, 11:07 AM

#

Don't want to spam this channel, come to help-pear

karmic ore Nov 8, 2022, 11:12 AM

#

bro does anyone know how to remove a console log from jupyter notebook

#

single line

#

like remove a new line console log

#

https://github.com/jupyterlab/jupyterlab/issues/1104

GitHub

"\r" acts as "\n" in notebook's output · Issue #1104 · jupyterlab/j...

Seems like no matter what kernel I use, if I enter 'print("a\rb")' I see 'a' and 'b' as separate lines in the cell's output...

#

i came across this and it makes me wanna shoot my self because they closed the issue without actually fucking fixing it

twin sleet Nov 8, 2022, 11:43 AM

#

Yo, does anyone have some experience with intelligent chatbots?
I'm trying to make wake word detection for mine

#

Need some guidance

young granite Nov 8, 2022, 11:48 AM

#

if i use scipy to interpolate a dataset i get an error:
ValueError: A value in x_new is above the interpolation range.
How do i adjust the interpolation range
i simply want to generate more datapoints in the original range
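
A small sketch of one way to stay inside the range, using NumPy's `np.interp` as a stand-in for the scipy interpolator (the data values are made up). The idea is to build the new x grid from the data's own minimum and maximum, since this error often shows up when a fixed-step grid overshoots the last data point:

```python
import numpy as np

# Hypothetical coarse data; x must be increasing for np.interp.
x = np.array([0.0, 1.0, 2.5, 4.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

# np.linspace includes both endpoints, so no new point can fall
# outside [x.min(), x.max()].
x_new = np.linspace(x.min(), x.max(), num=13)
y_new = np.interp(x_new, x, y)

print(x_new[0], x_new[-1])
print(y_new)
```

The same `x_new` can be passed to a scipy `interp1d` object; that class also has `bounds_error` and `fill_value` options if out-of-range points really are wanted.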

karmic ore Nov 8, 2022, 12:02 PM

#

https://stackoverflow.com/questions/45429831/valueerror-a-value-in-x-new-is-above-the-interpolation-range-what-other-re

Stack Overflow

`ValueError: A value in x_new is above the interpolation range.` - ...

I receive this error in scipy interp1d function. Normally, this error would be generated if the x was not monotonically increasing.

import scipy.interpolate as spi
def refine(coarsex,coarsey,step)...

ember quail Nov 8, 2022, 12:26 PM

#

hello people

#

im getting this error IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

#

when i predict thru my model

#

```py
output = model(game1.cuda())
---> _, predictions = torch.max(output, 1)
print(predictions)
```

#

its something related to the loss function according to the internet

#

loss = loss_fn(outputs, labels ) #label shape: torch.Size([10]), output shape: torch.Size([10, 5])

#

my shapes are right but still

versed gulch Nov 8, 2022, 1:29 PM

#

mighty patio TBQH I would consider putting your labels inside figure, as you have a lot of em...

widths = [0, 5, 10, 15, 20]
# replacing a border zone full of zeros to the edges of the skeleton image
fig, ax = plt.subplots(2, 5, figsize = (25, 25))
for width in widths:
  g_bz, p_bz = g_skel.copy(), p_skel.copy()
  
  # REPLACE 512 WITH THE SHAPE [0]
  # ground truth
  g_bz[:width, :], g_bz[512-width:, :] = 0, 0 # rows
  g_bz[:, :width], g_bz[:, 512-width:] = 0, 0 # cols  
  # filtered
  p_bz[:width, :], p_bz[512-width:, :] = 0, 0 # rows
  p_bz[:, :width], p_bz[:, 512-width:] = 0, 0 # cols
 
  
  ax[0, widths.index(width)].imshow(g_bz, cmap = "gray")
  ax[0, widths.index(width)].set_title(f"border zone = {width}", fontsize = 20)
  ax[0, widths.index(width)].axis("off")
  
  ax[1, widths.index(width)].imshow(p_bz, cmap = "gray")
  ax[1, widths.index(width)].axis("off")
fig.subplots_adjust(hspace=-0.75)
fig.subplots_adjust(wspace=0.05)

i edited the plot regarding the spacing

mighty patio Nov 8, 2022, 1:40 PM

#

import matplotlib.pyplot as plt
import numpy as np
im = np.zeros((256,256))
widths = [0, 5, 10, 15, 20]
# replacing a border zone full of zeros to the edges of the skeleton image
fig, ax = plt.subplots(2, 5, figsize = (9.5, 4), dpi = 200)
for width in widths:
    ax[0, widths.index(width)].imshow(im, cmap = "gray")
    ax[0, widths.index(width)].set_title(f"border zone = {width}", fontsize = 12)

    ax[1, widths.index(width)].imshow(im, cmap = "gray")
for x in ax.ravel():
    x.set_xticks([])
    x.set_yticks([])
ax[0,0].set_ylabel("Ground truth", fontsize = 12)
ax[1,0].set_ylabel("Filtered", fontsize = 12)
fig.tight_layout()
fig.savefig("plot.png")

#

Here I simply used the ylabel to label the rows.
I also reduced the figsize and increased the DPI instead to get more pixels
(matplotlib plots tend to look better if the figsize is not too big, as a large figsize makes lines very thin)

versed gulch Nov 8, 2022, 2:00 PM

#

mighty patio Here I simply used the xlabel to label the rows. I also reduced the figsize, as ...

thanks

spiral glacier Nov 8, 2022, 2:02 PM

#

hi, how can i transform a dataframe like in the plotly examples with indexed=True parameter?

#

when i use .set_index("date"), i get close to the expected result, but still missing how to name the column "company"

serene scaffold Nov 8, 2022, 2:05 PM

#

spiral glacier when i use .set_index("date"), i get close to the expected result, but still mis...

as another statement do df.columns.name = 'company'

spiral glacier Nov 8, 2022, 2:08 PM

#

thank you

serene scaffold Nov 8, 2022, 2:09 PM

#

spiral glacier thank you

no problem! keep in mind that you're not renaming a column--you're naming the whole column index. in other words, you're saying what kind of thing each column is.
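
A tiny illustration of the difference, using a made-up two-company table (the column names and values are assumptions for the example):

```python
import pandas as pd

# Hypothetical wide-format table, one column per company.
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-03", "2022-01-04"]),
    "GOOG": [1.00, 1.02],
    "AAPL": [1.00, 0.99],
})

df = df.set_index("date")      # names the row index "date"
df.columns.name = "company"    # names the column index "company"

print(df.index.name)      # date
print(df.columns.name)    # company
print(list(df.columns))   # ['GOOG', 'AAPL']
```

The individual column labels are untouched; only the index that holds them gets a name.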

void bone Nov 8, 2022, 2:11 PM

#

desert oar adaboost can handle multiple classes without any special encoding. are you using...

No, I was following a tutorial on how to implement it from scratch using python, and the method used works for just two classes. I am trying to implement it for multi-class classification. Here is the link to the code https://paste.pythondiscord.com/ipagujalab

serene scaffold Nov 8, 2022, 2:11 PM

#

void bone No, I was following a tutorial on how to implement it from scratch using python...

!paste

arctic wedgeBOT Nov 8, 2022, 2:11 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

serene scaffold Nov 8, 2022, 2:11 PM

#

You would probably never get a code review from images

void bone Nov 8, 2022, 2:13 PM

#

serene scaffold You would probably never get a code review from images

Thanks alot

copper mica Nov 8, 2022, 2:16 PM

#

!resources

arctic wedgeBOT Nov 8, 2022, 2:16 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

spiral glacier Nov 8, 2022, 2:16 PM

#

serene scaffold no problem! keep in mind that you're not renaming a column--you're naming the wh...

so, the row index is called 'date' and the column index is called 'company'? i still get confused when confronted with dataframes with more levels (?)

serene scaffold Nov 8, 2022, 2:18 PM

#

spiral glacier so, the row index is called 'date' and the column index is called 'company'? i s...

yes. but there's still only one level of indexing for both the rows and columns

spiral glacier Nov 8, 2022, 2:21 PM

#

can you point me to place where i can read more about that, so that i can make more sense of this for different use cases?

upbeat dagger Nov 8, 2022, 2:27 PM

#

I need help troubleshooting this code.

import numpy as np
import pandas as pd

series = pd.Series({0: '0001', 1: '0002', 2: '0003', 3: '0003', 4: '0004'})
print(f'series dtype is {series.dtype}')

series_with_converted_dtypes = series.convert_dtypes()
print(f'series dtype is {series_with_converted_dtypes.dtype}')

print(np.issubdtype(series.dtype, object))
print(np.issubdtype(series_with_converted_dtypes.dtype, str))

### OUTPUT
#
# series dtype is object
# series dtype is string
# True
# ---------------------------------------------------------------------------
# TypeError                                 Traceback (most recent call last)
# Cell In [94], line 11
#       8 print(f'series dtype is {series_with_converted_dtypes.dtype}')
#      10 print(np.issubdtype(series.dtype, object))
# ---> 11 print(np.issubdtype(series_with_converted_dtypes.dtype, str))

# File c:\Python\3.10.8\lib\site-packages\numpy\core\numerictypes.py:416, in issubdtype(arg1, arg2)
#     358 r"""
#     359 Returns True if first argument is a typecode lower/equal in type hierarchy.
#     360 
#    (...)
#     413 
#     414 """
#     415 if not issubclass_(arg1, generic):
# --> 416     arg1 = dtype(arg1).type
#     417 if not issubclass_(arg2, generic):
#     418     arg2 = dtype(arg2).type

# TypeError: Cannot interpret 'string[python]' as a data type

#

There's a pandas series with numeric information encoded as text. I'm trying to come up with some logic to identify that the data is text encoded so that I can say "if series is string do some logic".

However, it comes in as object dtype, which I'm not sure is enough for me to say "yep, this is a column of strings". So I convert it using .convert_dtypes().. and that gives it the dtype string. However, when I try to test if the converted dtype is a string dtype, i get this error saying that the dtype can't be interpreted as a dtype.
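
The error most likely comes from `np.issubdtype` only understanding NumPy dtypes, while `string` is a pandas extension dtype. A sketch of two checks that work with extension dtypes (the variable names are just for illustration):

```python
import pandas as pd
from pandas.api.types import is_string_dtype

series = pd.Series({0: "0001", 1: "0002", 2: "0003", 3: "0003", 4: "0004"})
converted = series.convert_dtypes()

# Option 1: compare against the pandas extension dtype directly.
is_text = isinstance(converted.dtype, pd.StringDtype)

# Option 2: use the pandas helper, which understands extension dtypes.
is_text_too = is_string_dtype(converted)

# A numeric column converts to a nullable integer dtype and is not flagged.
numbers = pd.Series([1, 2, 3]).convert_dtypes()
print(is_text, is_text_too, is_string_dtype(numbers))
```

Checking the converted series rather than the original `object` one avoids guessing what an `object` column actually holds.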

steady basalt Nov 8, 2022, 2:33 PM

#

@serene scaffold working on my first language work w spacy, wondering how youd write in pandas 'replace integer inside the string sentence with string characters' whereby if the number was three digits, youd replace 500 with 'three digits 'inside the string

#

derived from a single string such as 'james had 20 apples' > 'james had two digits apples'

#

as later on my classifier will take into account not exact values but how many digits they were

#

although not all strings contain them, im hoping to encode that later though

#

after this I will do 'return [token.lemma_ for token in doc if not token.is_stop and not token.is_punct and not token.is_digit]'

#

then I will expand the strings out by splitting them one word per column, and likely end with many having 'None' in the final few cols, but that shud be ok
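
A possible sketch of the digit replacement with a plain regular expression, before any spaCy processing. `describe_digits` and `COUNT_WORDS` are made-up names, and numbers longer than five digits fall back to the numeral:

```python
import re

# Spelled-out counts for short numbers; longer ones fall back to the numeral.
COUNT_WORDS = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five"}

def describe_digits(text):
    """Replace each run of digits with '<count> digit(s)'."""
    def replace(match):
        n = len(match.group())
        word = COUNT_WORDS.get(n, str(n))
        return f"{word} digit" if n == 1 else f"{word} digits"
    return re.sub(r"\d+", replace, text)

print(describe_digits("james had 20 apples"))    # james had two digits apples
print(describe_digits("she paid 500 dollars"))   # she paid three digits dollars
```

On a dataframe column this could be applied with something like `df["question"].map(describe_digits)`.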

paper wharf Nov 8, 2022, 2:43 PM

#

Hello friends, I turned my python project files into exe and when I run the exe, I started getting this error, what is the solution?

serene scaffold Nov 8, 2022, 2:58 PM

#

steady basalt @serene scaffold working on my first language work w spacy, wondering how y...

you don't want to have spaCy objects in a dataframe. because you'd only be able to manipulate them with .apply anyway, which defeats the purpose of pandas.

steady basalt Nov 8, 2022, 2:58 PM

#

only temporary

#

well the manipulations are fine in general

#

I have managed to have tokenized lists in the column

#

instead of a string

#

else id get ''str' object has no attribute 'is_stop''

soft badge Nov 8, 2022, 3:00 PM

#

Guys, I need to sum the values of equal entries in a column of my dataframe, but when I use groupby('column').sum() it sums every column. I need to sum only 1 column. Can someone help me?

steady basalt Nov 8, 2022, 3:00 PM

#

for example 'df["tokens"] = df["question"].apply(lambda x: [t.text for t in nlp.tokenizer(x)])'

#

ahhh now i have a list, not a doc

upbeat dagger Nov 8, 2022, 3:04 PM

#

soft badge Guys i am need sum the values equals of the columns in my dataframe, but when i ...

Would df['column'].sum() work for you?

steady basalt Nov 8, 2022, 3:07 PM

#

ayyy got it now

#

now i just need to figure out my first question, replacing numbers in strings depending on the number size with words

soft badge Nov 8, 2022, 3:08 PM

#

so i am using this, but in my case i have 4 columns. i want to sum 1 column, and the others i don't want to sum, because they are values that can't be summed, but when i use groupby('columns').sum()
it sums every column, understand?

grand quarry Nov 8, 2022, 3:41 PM

#

soft badge so i am using this, but my case , i have 4 columns, i want sum 1 column, and oth...

Try df['column'].sum()

compact star Nov 8, 2022, 3:56 PM

#

How can I implement backwards propagation in a convolutional layer?

soft badge Nov 8, 2022, 4:00 PM

#

grand quarry Try df['column'].sum()

i just want aggregate the equals dont sum, understand?

grand quarry Nov 8, 2022, 4:03 PM

#

soft badge i just want aggregate the equals dont sum, understand?

Do you want to sum specific rows?

soft badge Nov 8, 2022, 4:07 PM

#

yes, line equals

#

i will send a print

#

i want aggregate the columns, the rows that have same value
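
If the goal is to collapse rows that share a key while summing only one column, something along these lines might work. The column names `product`, `quantity` and `unit_price` are invented for the example:

```python
import pandas as pd

# Hypothetical data: 'product' is the grouping key, 'quantity' should be
# summed, and 'unit_price' should not be summed.
df = pd.DataFrame({
    "product": ["A", "B", "A", "B"],
    "quantity": [1, 2, 3, 4],
    "unit_price": [9.5, 4.0, 9.5, 4.0],
})

# Select the one column before calling sum().
totals = df.groupby("product")["quantity"].sum()

# Or aggregate each column differently in one pass.
summary = df.groupby("product").agg(
    quantity=("quantity", "sum"),
    unit_price=("unit_price", "first"),
)
print(totals)
print(summary)
```

Using `"first"` for the non-summed column assumes its value is the same within each group.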

hasty mountain Nov 8, 2022, 4:21 PM

#

gaunt anvil ok i just did a `window.to(0)` and it seemed to have worked?? not sure

Well...you basically passed it to cuda:0, as it seems...

#

But that's how you do it... simply assign window = window.cuda()

quaint plover Nov 8, 2022, 4:24 PM

#

I've been trying to optimize a piece of code for a data science project. The following function runs many hundreds of thousands of times (once per path) and I'm trying to find elements in the loop that could be a source of waste. If you're interested in a little challenge, ping me and we can take it to a help channel!

# Initialise stack with first link
foo = list()
foo.append(path[0])

# Iterate over every step of user's path
for element in path[1:]:
    if element != '<':
        # If not return character, going forward, store information
        ## Source node is the current top of stack
        source_node = foo[-1]

        ## Count one impression for all pairs with source_node as source (source_node;*)
        df_links.loc[pos_map[source_node],'impressions'] += 1
        ## Add next link in list to top of stack
        foo.append(element)

        ## New top of stack is target
        target_node = foo[-1]

        # Create key for pair identification
        search_value = source_node + ';' + target_node

        ## Add one click-through for the pair (source_node;target_node)
        df_links.loc[df_links['linkPair']==search_value,'hits'] += 1
    else:
        # If return character is read, pop top of stack and don't store any info
        foo.pop()
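
One guess at where the time goes: every step writes into `df_links` with `.loc`, and the `df_links['linkPair']==search_value` comparison scans the whole column each time. A sketch that does the same stack walk but tallies into plain counters instead (`tally_path`, `impressions` and `hits` are made-up names, and the dataframe is left out entirely):

```python
from collections import Counter

def tally_path(path, impressions, hits):
    """Walk one path with the same stack logic, counting in dict-like counters."""
    stack = [path[0]]
    for element in path[1:]:
        if element != "<":
            source = stack[-1]
            impressions[source] += 1          # one impression for the source node
            hits[(source, element)] += 1      # one click-through for the pair
            stack.append(element)
        else:
            stack.pop()                       # return character: go back one step

impressions, hits = Counter(), Counter()
for path in [["A", "B", "<", "C"], ["A", "B", "D"]]:   # hypothetical paths
    tally_path(path, impressions, hits)

print(impressions)
print(hits)
```

The totals could then be written into `df_links` once at the end, rather than once per click.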

wheat snow Nov 8, 2022, 6:41 PM

#

heyo

#

is there a way to make the following shorter?

#

    Jahr_all=df_vd.groupby(df_vd["Start Time"].dt.date)["Duration"].sum()
    Jahr_all.index = pd.to_datetime(Jahr_all.index)
    Jahr_all=(Jahr_all.dt.total_seconds()/60/60)
    Jahr_all= Jahr_all.groupby([Jahr_all.index.year]).sum()
    print(Jahr_all)
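
A shorter variant might group by year directly, assuming `Duration` is a timedelta column (which the `.dt.total_seconds()` call suggests). The small `df_vd` below is a made-up stand-in:

```python
import pandas as pd

# Hypothetical stand-in for df_vd with the same two columns.
df_vd = pd.DataFrame({
    "Start Time": pd.to_datetime(
        ["2021-03-01 10:00", "2021-07-15 20:30", "2022-01-02 08:00"]
    ),
    "Duration": pd.to_timedelta(["1h", "2h", "30min"]),
})

# Group by year in one step, then convert the summed timedeltas to hours.
Jahr_all = (
    df_vd.groupby(df_vd["Start Time"].dt.year)["Duration"]
    .sum()
    .dt.total_seconds()
    / 3600
)
print(Jahr_all)
```

Summing per day first and then per year gives the same totals as summing per year directly, so the intermediate grouping can be dropped.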

gaunt anvil Nov 8, 2022, 6:52 PM

#

hasty mountain But that's how you do it... simply assign `window = window.cuda()`

window.cuda() isn't a function idt

#

it errored out when I tried to do .cuda

hasty mountain Nov 8, 2022, 7:10 PM

#

gaunt anvil it errored out when I tried to do .cuda

Strange...every pytorch tensor should have a .cuda() method

#

It's the way you pass that tensor into the cuda device

#

That, or simply passing the argument device="cuda" when creating the tensor

turbid bay Nov 8, 2022, 7:22 PM

#

Anyone know of any datasets for text topic classification.

Essentially assigning a block of text a topic like: sport, news, education, etc.

compact star Nov 8, 2022, 7:25 PM

#

does anyone have a good resource on implementing back prop for a convolutional layer?

steady basalt Nov 8, 2022, 8:06 PM

#

compact star does anyone have a good resource on implementing back prop for a convolutional l...

From scratch?

compact star Nov 8, 2022, 8:14 PM

#

steady basalt From scratch?

probably but if u know something else with a library please share

steady basalt Nov 8, 2022, 8:22 PM

#

Deep learning libraries handle it for u

#

What library do you use?

uncut loom Nov 8, 2022, 8:31 PM

#

Which data type can store objects

compact star Nov 8, 2022, 8:31 PM

#

steady basalt Deep learning libraries handle it for u

I am not using a library but if there is a way I can "import" a back prop function and add parameters I could do it

#

I am writing stuff from scratch as it is a school project but I could use a library for one function

serene scaffold Nov 8, 2022, 8:42 PM

#

uncut loom Which data type can store objects

this seems like a general python question (see #❓｜how-to-get-help). lists and dicts are two examples of types that can store instances of other types (or the same type).

desert oar Nov 8, 2022, 9:08 PM

#

uncut loom Which data type can store objects

and if this is a pandas question about storing arbitrary python objects, the answer is 'O' in both numpy and pandas https://stackoverflow.com/q/37561991/2954547

Stack Overflow

What is dtype('O'), in pandas?

I have a dataframe in pandas and I'm trying to figure out what the types of its values are. I am unsure what the type is of column 'Test'. However, when I run myFrame['Test'].dtype, I get;

dtype('...

hasty mountain Nov 8, 2022, 9:45 PM

#

compact star I am writing stuff from scratch as it is a school project but I could use a libr...

I think backprop for conv2ds is simply a matter of applying a multiplication to your weights matrix, isn't it?

#

I mean...the derivative of a conv2d is another conv2d...right?

#

At least that's what I found when I tried to implement it
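
To make that concrete, here is a from-scratch sketch for the simplest case only: one 2-D input, one 2-D kernel, stride 1, no padding, no bias. The function names are made up. In this setting the kernel gradient is a cross-correlation of the input with the upstream gradient, and the input gradient is a "full" convolution of the upstream gradient with the kernel, which the loops below accumulate patch by patch:

```python
import numpy as np

def conv2d_forward(x, w):
    """Valid cross-correlation of a 2-D input x with a 2-D kernel w."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return y

def conv2d_backward(x, w, dy):
    """Gradients of the loss w.r.t. w and x, given dy = dLoss/dy."""
    kh, kw = w.shape
    dw = np.zeros_like(w)
    dx = np.zeros_like(x)
    for i in range(dy.shape[0]):
        for j in range(dy.shape[1]):
            # Each output pixel saw one input patch through the whole kernel.
            dw += dy[i, j] * x[i:i + kh, j:j + kw]
            dx[i:i + kh, j:j + kw] += dy[i, j] * w
    return dw, dx

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 5))
w = rng.normal(size=(3, 3))
dy = rng.normal(size=(3, 3))     # pretend upstream gradient
dw, dx = conv2d_backward(x, w, dy)

# The kernel gradient equals the cross-correlation of x with dy.
print(np.allclose(dw, conv2d_forward(x, dy)))
```

Multiple channels, batches, stride and padding add bookkeeping on top of this, but the two accumulation lines stay the core of it.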

compact star Nov 8, 2022, 11:26 PM

#

hasty mountain I think backprop for conv2ds is simply a matter of applying a multiplication to ...

Do you know of any resources showing how I could do this myself?

fringe anvil Nov 8, 2022, 11:26 PM

#

so i made a random walk function that uses an adjacency list (from a dictionary) .. where would i start if i had to modify my code so that it can use an adjacency matrix instead?

import random

def random_walk(graph, nodeid, steps):
    da_walk = [nodeid]
    # keep stepping to a random neighbour until the walk has `steps` nodes
    while len(da_walk) < steps:
        nodeid = random.choice(list(graph[nodeid]))
        da_walk.append(nodeid)
    return da_walk
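
For the adjacency-matrix version, one starting point is that row `nodeid` of the matrix plays the role of `graph[nodeid]`: the neighbours are the column indices with a nonzero entry. A sketch assuming nodes are numbered 0..n-1 and the matrix is a list of lists (`random_walk_matrix` is a made-up name):

```python
import random

def random_walk_matrix(matrix, start, steps, rng=random):
    """Random walk of `steps` nodes on a graph given as an adjacency matrix.

    matrix[i][j] is truthy when there is an edge from node i to node j.
    """
    walk = [start]
    node = start
    while len(walk) < steps:
        # Collect the column indices that have an edge from the current node.
        neighbors = [j for j, connected in enumerate(matrix[node]) if connected]
        if not neighbors:
            break  # dead end: no outgoing edges
        node = rng.choice(neighbors)
        walk.append(node)
    return walk

# Triangle graph 0-1-2 as a symmetric adjacency matrix.
triangle = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
walk = random_walk_matrix(triangle, start=0, steps=5)
print(walk)
```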

serene scaffold Nov 8, 2022, 11:27 PM

#

fringe anvil so i made a random walk function that uses an adjacency list (from a dictionary)...

wrong channel; try #algos-and-data-structs

fringe anvil Nov 8, 2022, 11:27 PM

#

serene scaffold wrong channel; try #algos-and-data-structs

ahh, my bad.

hasty mountain Nov 8, 2022, 11:28 PM

#

compact star Do know of any resources showing how I could do this myself ?

Numpy iteration through numpy array

#

https://github.com/Martyn0324/NumpyNetwork/blob/main/Convolutional Ungabunga.py

#

Maybe my code can inspire you

haughty pewter Nov 9, 2022, 1:01 AM

#

In a regression tree, is there a way to set the root node (at the top) to a specific column?

#

For example, what if I wanted to set it to company_revenue

haughty pewter Nov 9, 2022, 1:28 AM

#

Unless it just isn't possible

#

without only using 2 variables

mint palm Nov 9, 2022, 2:37 AM

#

Apart from explicit .to(device) calls
What else can fill CUDA memory??
I am getting an out of memory error, but all the .to(device) calls are in a file that gets called using os.system().
I mean should they be freed continuously after they are done running?

#

Also i get that error after i have called os.system numerous times in a loop. I mean doing one step isn't showing the error, but doing multiple steps is, I THINK, building up and filling memory.
What could be filling it apart from explicit .to(device) calls

regal ingot Nov 9, 2022, 2:57 AM

#

need help with naive bayes classifier

#

im stuck on what my prior probability should be

#

since the probability im using checks if a things value is over a certain number

#

so like out of a range 500 check if value is over > 250

#

so would the probability be .5
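
Not necessarily: the prior in naive Bayes is usually estimated from how often each class actually occurs in the training data, so it is 0.5 only if about half of the observed values are over the threshold. A small sketch with made-up values and class names (`class_priors` is a hypothetical helper):

```python
from collections import Counter

def class_priors(labels):
    """Estimate P(class) as the fraction of training examples in each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}

# Hypothetical training values in the range 0-500, labelled by the
# "over 250" rule described above.
values = [12, 40, 95, 130, 180, 260, 310, 490]
labels = ["high" if v > 250 else "low" for v in values]

print(class_priors(labels))   # {'low': 0.625, 'high': 0.375}
```

Here the range is 0-500 and the threshold sits in the middle, yet the priors are not 0.5 because the data are not spread evenly.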

weak cliff Nov 9, 2022, 3:49 AM