mild dirge Jan 24, 2023, 2:35 PM

#

It is pretrained on imagenet dataset, and the last layer seems to be replaced by 5 output nodes

#

But they might have done it differently

vast lintel Jan 24, 2023, 2:36 PM

#

How do you tell that the last layer got replaced?

#

Is it include_top = false?

queen cradle Jan 24, 2023, 2:36 PM

#

Also, @muted crypt I continue to think that machine learning is a poor fit for this kind of data. I think spline smoothing would be more likely to give you a good fit. However I have serious doubts about using this to improve control of the drone.

mild dirge Jan 24, 2023, 2:36 PM

#

vast lintel How do you tell that the last layer got replaced?

I haven't read the docs for keras on that

#

But that seems to indicate it removes maybe even all dense layers

#

Seems only last single layer from docs

#

If you specify classes then include_top needs to be True according to the docs though

#

And weights can not be specified it seems

#

https://keras.io/api/applications/resnet/

Keras documentation: ResNet and ResNetV2

vast lintel Jan 24, 2023, 2:38 PM

#

Oh documentation is online u mean?

#

Not specified? Is that bad?

mild dirge Jan 24, 2023, 2:39 PM

#

If you don't care about how it actually works underneath, you don't have to worry about it, but just follow the docs

vast lintel Jan 24, 2023, 2:39 PM

#

I will try to understand x.x I would like to make something rather complex anyway

mild dirge Jan 24, 2023, 2:39 PM

#

If you really want to learn how this transfer learning works, I would pick up maybe pytorch, that way you can more easily manually program it

vast lintel Jan 24, 2023, 2:40 PM

#

But is transfer learning what I should be ideally going after?

mild dirge Jan 24, 2023, 2:40 PM

#

I'm sure they have resnet implementation too

#

Well, if you want to use a pre-trained model yeah

vast lintel Jan 24, 2023, 2:41 PM

#

I probably would want to retrain a model in my case if I understand this correctly?

mild dirge Jan 24, 2023, 2:41 PM

#

Yeah sorta, though because you have a different amount of classes, the architecture changes too

#

So it is not just simply retraining

muted crypt Jan 24, 2023, 2:44 PM

#

queen cradle Also, @muted crypt I continue to think that machine learning is a poor ...

Well it's not necessarily to improve the control but rather to know for sure what its position will be at a given time, so other drones will avoid crossing the "real" path and not just the planned one

#

And I feel like they say machine learning (among other stuff) to develop a model that will work for future data

vast lintel Jan 24, 2023, 3:16 PM

#

mild dirge Yeah sorta, though because you have a different amount of classes, the architect...

Will changing the architecture be something I can figure out by looking at the documentation you linked earlier?

#

It seems to be saying that for Resnet 50, you can only specify number of classes, if include_top is True, while you can only specify input_shape if include_top is false. So if I am understanding this correctly, I cannot specify both right?

mild dirge Jan 24, 2023, 3:29 PM

#

If that is what it says then that seems so

#

You can just reshape your images to fit the input dimensions

#

Make sure to use the same pre-processing steps that were used on the data used to train the model
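
A minimal sketch of the approach discussed above, assuming TensorFlow 2.x with its bundled Keras: load ResNet50 without its top layer, freeze it, and attach a new 5-class output. The names `build_transfer_model` and `prepare_batch`, the 224x224 input size, and the average-pooling choice are illustrative assumptions, not taken from the linked example.

```python
import tensorflow as tf


def build_transfer_model(num_classes=5, weights="imagenet"):
    """ResNet50 backbone without its ImageNet head, plus a new classifier."""
    base = tf.keras.applications.ResNet50(
        include_top=False,          # drop the original 1000-class layer
        weights=weights,            # "imagenet" for pretrained, None for random init
        input_shape=(224, 224, 3),  # may be given because include_top is False
        pooling="avg",              # global average pool: one vector per image
    )
    base.trainable = False          # freeze the pretrained layers at first

    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(base.output)
    return tf.keras.Model(inputs=base.input, outputs=outputs)


def prepare_batch(images):
    """Resize to the backbone's input size and apply ResNet50's own preprocessing."""
    images = tf.image.resize(images, (224, 224))
    return tf.keras.applications.resnet50.preprocess_input(images)
```

Calling `build_transfer_model(5)` downloads the ImageNet weights on first use; the labels for training are passed separately from the images (for example as an integer array to `model.fit`).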

vast lintel Jan 24, 2023, 4:00 PM

#

I am not so confident in the reshaping. I swear I was on some resource that showed me a good enough example but lost it. Do you happen to have the link to one example I could reference? I am going to be looking at some version of the one listed here https://ai.stackexchange.com/questions/36762/retraining-resnet-50-for-iris-flower-classification

But of course, I need to actually have a reshaping section of code

Additionally, I don't quite remember this exactly but my images must also be appended with the category that they would fall into for the training dataset at least right?

Artificial Intelligence Stack Exchange

Retraining ResNet-50 for iris flower classification

I am trying to retrain ResNet-50 for iris flower classification in tensorflow (TensorFlow version: 2.3.0) using the following code
import tensorflow as tf
import cv2, random
from sklearn.model_sele...

pulsar ether Jan 24, 2023, 4:01 PM

#

Hi Everyone! I'm new to python. I just have a quick question. I am primarily looking for a full featured open source math platform like SAGEMATH, but holy hell, I've been trying 3 different ways on three different operating systems to install it and am in hell. Is there something newer? It seems it's made for older python and that's why I'm having trouble?

wooden sail Jan 24, 2023, 4:26 PM

#

you could use a docker image of sagemath

#

but also most linux flavors should allow you to get it with the package manager, what problem are you having?

proper meteor Jan 24, 2023, 4:46 PM

#

should I learn plotly after completing tensorflow?

serene scaffold Jan 24, 2023, 4:48 PM

#

proper meteor should I learn plotly after completing tensorflow?

you won't really make meaningful progress if you just "learn libraries" as an end unto itself.

proper meteor Jan 24, 2023, 4:49 PM

#

serene scaffold you won't really make meaningful progress if you just "learn libraries" as an en...

I am learning libraries such as numpy, pandas, matplotlib and scikit-learn

serene scaffold Jan 24, 2023, 4:49 PM

#

proper meteor I am learning libraries such as numpy, pandas, matplotlib and scikit-learn

none of those libraries are intended to facilitate learning about AI in general

proper meteor Jan 24, 2023, 4:49 PM

#

serene scaffold none of those libraries are intended to facilitate learning about AI in general

machine learning?

serene scaffold Jan 24, 2023, 4:50 PM

#

proper meteor machine learning?

not machine learning, either.

#

it's not like web development where you can "learn django" and have a website at the end.

proper meteor Jan 24, 2023, 4:50 PM

#

serene scaffold it's not like web development where you can "learn django" and have a website at...

ik ik i need a lot of data

serene scaffold Jan 24, 2023, 4:51 PM

#

I would work your way through a machine learning book, and implement things as you go, if you want to learn about ML.

serene scaffold Jan 24, 2023, 4:51 PM

#

proper meteor ik ik i need a lot of data

you need a lot of what kind of data? for what?

proper meteor Jan 24, 2023, 4:51 PM

#

serene scaffold you need a lot of what kind of data? for what?

training data and testing data?

serene scaffold Jan 24, 2023, 4:52 PM

#

proper meteor training data and testing data?

I'm not sure what that has to do with my point about django, but sure.

proper meteor Jan 24, 2023, 4:52 PM

#

serene scaffold I'm not sure what that has to do with my point about django, but sure.

I know what point you trying to make

serene scaffold Jan 24, 2023, 4:53 PM

#

since you're trying to "learn tensorflow", I would encourage you to reframe it this way: first focus on learning about feed-forward neural networks, regardless of what library you use to implement one. and then learn about either convolutional neural networks, or recurrent neural networks.

proper meteor Jan 24, 2023, 4:55 PM

#

serene scaffold since you're trying to "learn tensorflow", I would encourage you to reframe it t...

thank you for your kind suggestion mate appreciate it :)

mint palm Jan 24, 2023, 5:02 PM

#

softmax is used as a norm in transformers' attention module, but can i use some relu in place of it? i am getting higher accuracy in my task.

serene scaffold Jan 24, 2023, 5:04 PM

#

mint palm softmax is used as a norm in transformers' attention module , but Can i use som...

you can probably use [leaky] relu. softmax is used both as an activation function, and to cause a distribution to sum to 1.

#

I'm equally excited for Edd to tell me I'm right or to correct me.

wooden sail Jan 24, 2023, 5:05 PM

#

depending on what the interpretation of the params is, some activation funcs might not make sense. esp. at the output layer

#

a leaky relu does make sense there indeed, but i'd almost expect a softmax to train faster/be more nicely behaved

mint palm Jan 24, 2023, 5:06 PM

#

One more thing, i have incorporated Cosine based similarity and replaced simple dot product(similar, i know but just telling), along with changing to relu

wooden sail Jan 24, 2023, 5:06 PM

#

or to retain better interpretability for you

#

regarding this latter one, using leaky relu makes it so that you can no longer interpret the result as a mean/convex combination
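
To illustrate the interpretability point, here is a small NumPy sketch comparing softmax and plain ReLU as the weighting step in attention. The scores and values are random, hypothetical data, and the helper names are chosen here for illustration only.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax; each row becomes non-negative and sums to 1."""
    shifted = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def relu(scores):
    return np.maximum(scores, 0.0)


rng = np.random.default_rng(0)
scores = rng.normal(size=(3, 4))   # hypothetical attention scores: 3 queries, 4 keys
values = rng.normal(size=(4, 2))   # hypothetical value vectors

softmax_weights = softmax(scores)
relu_weights = relu(scores)

print(softmax_weights.sum(axis=-1))  # each row sums to 1: a weighted mean of values
print(relu_weights.sum(axis=-1))     # arbitrary row sums: no longer a mean

softmax_out = softmax_weights @ values
relu_out = relu_weights @ values
```

With softmax every output row is a convex combination of the value vectors; with ReLU the weights are unnormalized, so the output scale depends on the raw scores.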

lapis sequoia Jan 24, 2023, 5:56 PM

#

could someone roleplay as a student competitor for me? i'll be taking on the role of a consultant (grad student) for this upcoming undergraduate data science competition in april and i'm feeling some imposter syndrome – what if i blank out? here are several situations in which students can ask for consultation:
* Help students decide on a research question of interest and prepare the data for analysis
* Help students prepare the data for analysis and analyze the data
* Help students analyze the data

vast lintel Jan 24, 2023, 6:12 PM

#

Is it possible to train a neural network like a transformer to receive user input and generate an RMD file based off of that? Or at least the corresponding code for it?

agile cobalt Jan 24, 2023, 6:34 PM

#

vast lintel Is it possible to train a neural network like a transformer to receive user inpu...

in other words, Text to Speech?

#

or you're thinking something more generic like Stable Diffusion but for audio instead of images

#

if the former: it's literally all over the place and should be reasonably easy to find resources about and/or open source models
if the latter: not that I am aware of, though I haven't really ever looked into it
edit; some exist it seems, https://www.harmonai.org/ | https://openai.com/blog/jukebox/ for example

shell crest Jan 24, 2023, 7:10 PM

#

lapis sequoia could someone roleplay as a student competitor for me? i'll be taking on the rol...

I won't really roleplay for you but I don't think it's uncommon to feel some imposter syndrome.
Just try your best.

mild dirge Jan 24, 2023, 7:16 PM

#

lapis sequoia could someone roleplay as a student competitor for me? i'll be taking on the rol...

Try to do a task similar to what competitors have to do to practice if you want

lapis sequoia Jan 24, 2023, 7:25 PM

#

Hi guys

arctic wedgeBOT Jan 24, 2023, 8:30 PM

#

Hey @karmic flicker!

It looks like you tried to attach file type(s) that we do not allow (.pdf). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

karmic flicker Jan 24, 2023, 8:31 PM

#

fine if you're going to make my life difficult

Attached is an image of code from a paper I am following; however, in the movie_generator it calls, it says that ffmpeg has no attribute named input.

#

https://www.frontiersin.org/articles/10.3389/fspas.2022.1009450/full#supplementary-material

Frontiers | AuroraX, PyAuroraX, and aurora-asi-lib: A user-friendly...

Within the context of the Heliophysics System Observatory, optical images of the aurora are emerging as an important resource for exploring multi-scale geospace processes. This capability has never been more critical as we are on the cusp of a new era of geospace research, by which we mean studying the overall system as a system of systems. Hist...

#

Here is the raw code

#

https://aurora-asi-lib.readthedocs.io/en/latest/function_api.html?highlight=animate_fisheye_generator#asilib.plot.animate_fisheye.animate_fisheye_generator

#

Im very confused

#

stackoverflow says to use ffmpeg-python but this is in the backend of a library

dusk musk Jan 24, 2023, 9:03 PM

#

I'm using scipy.integrate.solve_ivp(~) rn to solve a diffeq, but it keeps erroring presumably because it's using values outside t_span
https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.solve_ivp.html#scipy.integrate.solve_ivp

#

Here's my diffeq in TeX, and here's my code:

```python
def aOfTForUniverse(OM, OL):
    age = ageOfTheUniverse(1, OM, OL)
    def diffEq(t, a):
        # Substitute for "present" cosmological parameters.
        nonlocal OM, OL
        E2 = OM/a + OL*a**2 + (1 - OM - OL)
        adot = H0*np.sqrt(E2)
        return adot
    np.vectorize(diffEq)
    # Return integral
    solution = spi.solve_ivp(diffEq, (age, 1), [H0], dense_output=True, t_eval=np.linspace(age, 1, 1000))
    print(solution)
    return solution.sol
```

#

```
  message: Required step size is less than spacing between numbers.
  success: False
   status: -1

{script}:25: RuntimeWarning: invalid value encountered in sqrt
  adot = H0*np.sqrt(E2)
```

#

oh and age is always correct, it's always positive.

#

and almost always even greater than 1.

tidal bough Jan 24, 2023, 9:06 PM

#

Hmm, are ΩM and ΩL positive?

dusk musk Jan 24, 2023, 9:06 PM

#

yep

#

it keeps giving negative values though.
which is ridiculous, and I don't see an argument to change that in solve_ivp(~)

tidal bough Jan 24, 2023, 9:08 PM

#

huh, your function call shouldn't work I think. You seem to be passing two values for t_eval - (age, 1) as the second positional argument and also np.linspace(age, 1, 1000) as a keyword one

dusk musk Jan 24, 2023, 9:09 PM

#

t_span and t_eval are different. I read something about the t_eval on stackexchange and decided to try it (didn't work).

tidal bough Jan 24, 2023, 9:09 PM

#

yeah, nevermind, I can't read.

dusk musk Jan 24, 2023, 9:09 PM

#

lol mood.

tidal bough Jan 24, 2023, 9:10 PM

#

dusk musk and almost always even greater than `1`.

that doesn't seem to be right considering your code - your t_span is (age, 1) and your t_eval goes from age to 1 too.

dusk musk Jan 24, 2023, 9:10 PM

#

I'll print it just to confirm

#

```
Age: 12.543762329373754
[-0.09861852] [nan]
[-0.2318735] [0.04902007]
[-0.61363283] [0.06146909]
[-0.55139964] [0.06069726]
[-0.04049321] [nan]
[-0.01119848] [nan]
[-0.00313937] [nan]
...
[-1.08734706e-07] [nan]
[-4.86968442e-08] [nan]
```

#

first column is a

#

second column is adot (the value of the diffeq)

tidal bough Jan 24, 2023, 9:12 PM

#

So age is above 1 after all? That means you're passing backwards t_span and t_eval (they go from 12.54 to 1) to solve_ivp; no idea if it supports that

dusk musk Jan 24, 2023, 9:13 PM

#

i think it does, because it does validly give the right initial conditions

tidal bough Jan 24, 2023, 9:14 PM

#

huh, it really does seem to run time backwards

dusk musk Jan 24, 2023, 9:14 PM

#

I'mma check my math again (for like the 5th time 😭)

tidal bough Jan 24, 2023, 9:14 PM

#

that seems to cause a to drop to zero

dusk musk Jan 24, 2023, 9:15 PM

#

yea, this should be integrable to a=0 too.

tidal bough Jan 24, 2023, 9:15 PM

#

when a is low, adot is very high, so it probably ends in finite time, yeah

#

Do you want to find when a=0?

dusk musk Jan 24, 2023, 9:16 PM

#

i'm wanting a as a function of t

#

i need to integrate over it with respect to t so this seemed like the most streamlined way of doing it

tidal bough Jan 24, 2023, 9:17 PM

#

Hmm, aren't you getting it? It complains about not being able to integrate further, which makes sense, but it returns you a(t) until it becomes zero, doesn't it?

dusk musk Jan 24, 2023, 9:17 PM

#

mm, no. success: False

#

oh wait the .sol

tidal bough Jan 24, 2023, 9:18 PM

#

sure, but it returns t and y

#

(and if you don't want it to complain, you could introduce a termination condition)

#

oh, does it not return .sol?

dusk musk Jan 24, 2023, 9:19 PM

#

no it does, I just realized. it doesn't seem to be returning the right function for what I want though.

#

but im not sure if that's because of my math, the program, or scipy.

dusk musk Jan 24, 2023, 9:19 PM

#

dusk musk no it does, I just realized. it doesn't seem to be returning the right function...

there's no way a difference of ~.5 would result in a dropping to almost 0 here (physically speaking)
so something is off, but that might just be my math.

#

its strange though because I'm comparing it side by side with a function i did earlier that's almost identical (just a bit flip flopped) and that had almost no issues

prime hearth Jan 24, 2023, 9:27 PM

#

hello, my SupportVectorMachine has 0.87 accuracy for training data but for testing with cross validation the accuracy with new data comes to be around 0.73. Is this okay?

dusk musk Jan 24, 2023, 9:57 PM

#

@tidal bough lol. I was being stupid

#

The initial condition i set?
idk why but i basically told it
a(t_0) = H0
when in reality it should've been
a(t_0) = 1
I got confused because the slope is H0 at t_0
a'(t_0) = H0
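
A minimal sketch of the corrected setup, assuming SciPy is available: the initial condition is a = 1 at the present time, time is integrated backwards, and a terminal event (the "termination condition" mentioned earlier) stops the solver cleanly before a reaches 0. The values of `H0`, `OM`, `OL`, the choice of t = 0 for "now", and the 1e-3 threshold are assumptions for illustration, not the poster's numbers.

```python
import numpy as np
from scipy.integrate import solve_ivp

H0 = 0.07            # assumed Hubble constant in 1/Gyr (about 68 km/s/Mpc)
OM, OL = 0.3, 0.7    # assumed density parameters


def adot(t, a):
    a_safe = np.maximum(a, 1e-12)  # guard so trial steps past a=0 do not give NaN
    E2 = OM / a_safe + OL * a_safe**2 + (1 - OM - OL)
    return H0 * np.sqrt(E2)


def hit_small_a(t, a):
    return a[0] - 1e-3   # zero when the scale factor has shrunk to 1e-3


hit_small_a.terminal = True  # stop integrating at this event instead of failing

# Time runs backwards from "now" (t = 0); the initial condition is a = 1, not H0.
solution = solve_ivp(
    adot, (0.0, -20.0), [1.0],
    events=hit_small_a, dense_output=True, rtol=1e-8, atol=1e-10,
)

age = -solution.t_events[0][0]   # lookback time at which a reaches 1e-3
a_of_t = solution.sol            # callable interpolant, valid on [-age, 0]
print(solution.status, age)
```

A `status` of 1 means the solver stopped at the event rather than giving up with a step-size error, and `a_of_t(t)` can then be integrated over t as needed.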

hasty mountain Jan 24, 2023, 10:17 PM

#

Hey guys, something a bit more...technical about Deep Learning.

Can I say that the VGG model and its architecture is the most fundamental one in Deep Learning (which means that it's usually the easiest one to learn when studying image classification)?

queen cradle Jan 25, 2023, 2:05 AM

#

pulsar ether Hi Everyone! I'm new to python. I just have a quick question. I am primarily loo...

Sage is made for modern Python. It's really excellent, but it has an unusually large number of dependencies. Many of the dependencies are hard to install themselves, so installing Sage from scratch is extremely difficult. I suggest using a package manager or a Docker image. There is also https://sagecell.sagemath.org/, which is fine for some things.

#

(Sage was stuck on Python 2 for a long time because of its dependencies, but that's thankfully been over for several years now. You may still find mention of that on the Internet.)

vast lintel Jan 25, 2023, 2:13 AM

#

agile cobalt in other words, Text to Speech?

Not quite talking about text to speech, I was thinking more along the lines of input text from a user to generate more text, but the text generated in response to this, rather than being a coherent conversation, would instead be in the form of say markdown

agile cobalt Jan 25, 2023, 2:25 AM

#

vast lintel Not quite talking about text to speech, I was thinking more along the lines of i...

if I had to guess, you would probably want to separate that into two independent parts, one that is just responsible for the text/conversation/markdown synthesis and one that is responsible for turning the previous output into an RMD file or whichever format you want

#

that really isn't my forte though

rancid sorrel Jan 25, 2023, 4:00 AM

#

anyone got any good resources on reinforcement learning? and or stomatic analysis

split drift Jan 25, 2023, 9:47 AM

#

Does someone know how to "subtract" one text column from the other in pandas? (without using apply):
Input:
     A  B
0  ABC  A
1  ABC  B

desired output:
0    BC
1    AC
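
One possible way to do this without `apply`, sketched under the assumption that "subtract" means removing every occurrence of the B string from the A string: pandas string methods take a single pattern for the whole column, so a per-row pattern is handled here with a plain list comprehension over the two columns.

```python
import pandas as pd

df = pd.DataFrame({"A": ["ABC", "ABC"], "B": ["A", "B"]})

# Remove each row's B substring from its A string, row by row.
df["result"] = [a.replace(b, "") for a, b in zip(df["A"], df["B"])]
print(df["result"].tolist())  # ['BC', 'AC']
```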

supple saddle Jan 25, 2023, 9:49 AM

#

is arr (10,) the same as (1,10)?

mild dirge Jan 25, 2023, 9:50 AM

#

No there's an extra bracket there

#

(10,) is 1d, (10, 1) is 2d

#

Oh you said (1, 10)

#

Which is also 2d, but a matrix with a single row

wooden sail Jan 25, 2023, 9:52 AM

#

(10,), (10,1) and (1,10) are all different and behave differently when you do math on them

supple saddle Jan 25, 2023, 9:53 AM

#

(10,1) and (1,10) i can visualize the difference, 1st is rows second is columns, i cannot understand (10,)

wooden sail Jan 25, 2023, 9:53 AM

#

that's a made up thing numpy and pandas use

#

it can behave both as (10,1) and as (1,10) depending on the scenario, sometimes giving you unexpected results when it should actually error out 😛
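
A short NumPy sketch of the three shapes and how the 1-D one behaves under broadcasting; the variable names are illustrative.

```python
import numpy as np

flat = np.arange(10)          # shape (10,): one axis, neither row nor column
row = flat.reshape(1, 10)     # shape (1, 10): 1 row, 10 columns
col = flat.reshape(10, 1)     # shape (10, 1): 10 rows, 1 column

print(flat.shape, row.shape, col.shape)

# Broadcasting treats the 1-D array like a row here, so this stays (1, 10) ...
print((flat + row).shape)
# ... but adding it to a column silently expands to a full (10, 10) grid.
print((flat + col).shape)

# Transposing a 1-D array does nothing, unlike a 2-D row or column.
print(flat.T.shape, row.T.shape)
```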

supple saddle Jan 25, 2023, 9:54 AM

#

ok so it is like a magic matrix ;d

wooden sail Jan 25, 2023, 9:54 AM

#

yeah

supple saddle Jan 25, 2023, 9:55 AM

#

thanks, i always get confused (3,5) to correlate (rows,columns)

copper lake Jan 25, 2023, 10:08 AM

#

Hi all, i'd like to iterate the json2lab function through all the .json files in the folder, creating a .lab file for each file with the same name. This is the code

#

import json

def process_parts(data, beatz = None):
    if 'parts' in data.keys():
        for part in data['parts']:
            process_parts(part, beatz)
    else:
        if beatz is not None:
            beatz.extend(data['beats'])

def json2lab(infile, outfile):
    with open(infile, 'r') as data_file:
        data = json.load(data_file)
        duration = float(data['duration'])
        all_beats = []
        process_parts(data, all_beats)
        with open(outfile, 'w') as content_file:
            for s in all_beats:
                content_file.write(str(s) + '\n')

#

any suggestion?
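
One way to do this is with `pathlib`: glob the folder for `.json` files and swap the suffix for the output name. The sketch below includes a compact copy of the two functions so it runs on its own; `convert_folder` is a hypothetical helper name, and the folder path is whatever you pass in.

```python
import json
from pathlib import Path


def process_parts(data, beats):
    """Collect 'beats' from nested 'parts' recursively."""
    if "parts" in data:
        for part in data["parts"]:
            process_parts(part, beats)
    else:
        beats.extend(data["beats"])


def json2lab(infile, outfile):
    with open(infile, "r") as data_file:
        data = json.load(data_file)
    all_beats = []
    process_parts(data, all_beats)
    with open(outfile, "w") as content_file:
        for beat in all_beats:
            content_file.write(str(beat) + "\n")


def convert_folder(folder):
    """Write a same-named .lab file next to every .json file in `folder`."""
    written = []
    for json_path in sorted(Path(folder).glob("*.json")):
        lab_path = json_path.with_suffix(".lab")   # song.json -> song.lab
        json2lab(json_path, lab_path)
        written.append(lab_path)
    return written
```

Usage would be something like `convert_folder("path/to/folder")`, which returns the list of `.lab` paths it wrote.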

solid ridge Jan 25, 2023, 5:46 PM

#

Hey folks, what's the right way to apply numpy.hypot to each pair of elements from a list of coordinates?
In regular python I'd do

{(i,j):math.dist(x,y) for (i,x),(j,y) in itertools.combinations(enumerate(coords), 2)}

or something like that.

misty flint Jan 25, 2023, 6:42 PM

#

https://seattledataguy.substack.com/p/how-does-league-of-legends-deploy if anyone is interested

How Does League Of Legends Deploy Machine Learning Models Into The ...

Looking at how LOL deploys machine learning models with Ian Schweer - ML/MLOps Engineer @ Riot Games

lapis sequoia Jan 25, 2023, 6:48 PM

#

how do you drop dataframe rows by index range
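
A small sketch of two common ways to do this with `DataFrame.drop`, depending on whether the range refers to row positions or to index labels; the example frame is made up.

```python
import pandas as pd

df = pd.DataFrame({"x": range(10)})

# By position: drop the rows at positions 2, 3 and 4 whatever their labels are.
by_position = df.drop(df.index[2:5])

# By label: drop rows whose index labels are 2 through 4 (this index is 0..9).
by_label = df.drop(index=list(range(2, 5)))

print(by_position.index.tolist())
```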

wooden sail Jan 25, 2023, 6:57 PM

#

solid ridge Hey folks, what's the right way to apply `numpy.hypot` to each pair of elements ...

the most straightforward way is to do the math on the broadcasted sum

#

np.sqrt( vals[:, np.newaxis]**2 + vals[np.newaxis, :]**2 )

#

it does have the disadvantage of computing the off diagonals twice though

#

scipy probably has a more clever wrapper for this

#

ah, i remember

#

the alternative is to use meshgrid to create the cartesian product. you can then call hypot on that

#

that also has repeated elements though
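
For (x, y) coordinates rather than scalar values, the same broadcasting idea can be applied to the differences between points. The sketch below uses made-up points and keeps each unordered pair once via `np.triu_indices`; SciPy's `scipy.spatial.distance.pdist` is an alternative that computes each pair only once.

```python
import numpy as np

coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])  # hypothetical (x, y) points

# Differences between every pair of points, shape (n, n, 2).
diff = coords[:, np.newaxis, :] - coords[np.newaxis, :, :]
dist_matrix = np.hypot(diff[..., 0], diff[..., 1])       # full symmetric (n, n) matrix

# Keep each unordered pair once, like itertools.combinations(..., 2).
i, j = np.triu_indices(len(coords), k=1)
pair_dist = {(int(a), int(b)): float(d) for a, b, d in zip(i, j, dist_matrix[i, j])}
print(pair_dist)  # {(0, 1): 5.0, (0, 2): 10.0, (1, 2): 5.0}
```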

arctic wedgeBOT Jan 25, 2023, 7:43 PM

#

Hey @pastel blade!

It looks like you tried to attach file type(s) that we do not allow (.xlsx). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

worn stratus Jan 25, 2023, 8:13 PM

#

They explicitly ask you not to scrape, which makes it against the rules of this server

pastel blade Jan 25, 2023, 8:18 PM

#

Ok

fallen crown Jan 25, 2023, 8:30 PM

#

Hi,

#

I have a question related to the speed of execution of my code, here is my code

#

```python
self.game.draw_line()['north_distance'],
self.game.draw_line()['south_distance'],
self.game.draw_line()['est_distance'],
self.game.draw_line()['west_distance'],
self.game.draw_line()['north_est_distance'],
self.game.draw_line()['north_west_distance'],
self.game.draw_line()['south_est_distance'],
self.game.draw_line()['south_west_distance'],

self.game.food_detection()['north'],
self.game.food_detection()['north_est'],
self.game.food_detection()['est'],
self.game.food_detection()['south_est'],
self.game.food_detection()['south'],
self.game.food_detection()['south_west'],
self.game.food_detection()['west'],
self.game.food_detection()['north_west'],
```

#

here I call the same function several times, would calling it once then storing its value in a list be faster

#

?

charred light Jan 25, 2023, 8:38 PM

#

fallen crown ```python self.game.draw_line()['north_distance'], self.game...

I don't think there's a significant difference.
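
Whether it matters depends on how expensive `draw_line()` is, since every `self.game.draw_line()[...]` expression runs the whole function again. A sketch of the call-once version, using a hypothetical stand-in `Game` class (only four of the keys are shown):

```python
class Game:
    """Hypothetical stand-in for the poster's game object; counts calls."""

    def __init__(self):
        self.draw_line_calls = 0

    def draw_line(self):
        self.draw_line_calls += 1
        return {"north_distance": 1.0, "south_distance": 2.0,
                "est_distance": 3.0, "west_distance": 4.0}


DISTANCE_KEYS = ["north_distance", "south_distance", "est_distance", "west_distance"]

game = Game()
distances = game.draw_line()                      # one call per loop iteration
inputs = [distances[key] for key in DISTANCE_KEYS]

print(inputs, game.draw_line_calls)
```

Timing both versions inside the real training loop is the only way to know whether the difference is noticeable.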

fallen crown Jan 25, 2023, 8:43 PM

#

charred light I don't think there's a significant difference.

even if it's in a while loop which is True by default ?

charred light Jan 25, 2023, 8:51 PM

#

fallen crown even if it's in a while loop which is True by default ?

No, you really shouldn't be using while true loops. I would argue it's bad practice as it opens you to infinite loops. There are specific cases for them (e.g. Waiting for user input), but not this case.

#

That doesn't have to do with speed, more so of general coding.

charred light Jan 25, 2023, 8:53 PM

#

fallen crown here I call the same function several times, would calling it once then storing ...

Also, if you want to learn more about code execution speed, you can look into the concept called Big O notation, e.g. O(n). #algos-and-data-structs will help out with that.

fallen crown Jan 25, 2023, 9:12 PM

#

charred light No, you really shouldn't be using `while true loops`. I would argue it's bad pra...

Yess i know but it's for the training of my ai model, I have to use a while loop True by default to run each genome of my population

keen notch Jan 25, 2023, 10:53 PM

#

hey so i'm trying to write code to this question however i'm not getting the correct plot

#

so feel i'm understanding the physics wrong

arctic wedgeBOT Jan 25, 2023, 10:53 PM

#

Hey @keen notch!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

charred light Jan 25, 2023, 10:55 PM

#

!code

keen notch Jan 25, 2023, 10:55 PM

#

https://paste.pythondiscord.com/izovegifut

#

code is too big

#

can i send a .ipynb file

#

as it contains the question alongside the code

charred light Jan 25, 2023, 10:56 PM

#

I don't think the server allows .ipynb or .py files. Security reasons.

arctic wedgeBOT Jan 25, 2023, 10:57 PM

#

Hey @keen notch!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

charred light Jan 25, 2023, 10:57 PM

#

If you're working with jupyter notebooks, you could export -> .py (File -> Download as -> .py) and then copy and paste that into pastebin.

keen notch Jan 25, 2023, 10:57 PM

#

https://paste.pythondiscord.com/vuhayituxa

#

this is the code to the question^ in other paste bin

#

so essentially my plot is 3d with multiple scatter points inside the plot but i think what i need is a 2d plot that is representing an avalanche of electrons in all directions something like the paint😂 where this many electrons scattered at the bottom🙈

#

let me know if this doesn't make sense!

charred light Jan 25, 2023, 11:03 PM

#

It's been a long long while since I've taken physics lmao. I have a sense of what the course covers, took a comp assisted math+physics course without knowing half the physics.

So, are you just looking for a 2d plot?

keen notch Jan 25, 2023, 11:04 PM

#

I'm thinking maybe my range should not be 1 by 1

keen notch Jan 25, 2023, 11:04 PM

#

charred light It's been a long long while since I've taken physics lmao. I have a sense of wha...

essentially yes!

#

like the one in my bad paint diagram

#

it's an avalanche

#

yes this is computational physics

charred light Jan 25, 2023, 11:07 PM

#

I'm not sure what the exact ask is, looks like you already tried scatter plots?
ax.scatter(locations[:,0], locations[:,1], locations[:,2], c=time, cmap='viridis') as an example.

keen notch Jan 25, 2023, 11:09 PM

#

soooo apparently

#

The middle orange is the radius of the cut off
And all the electrons are basically just in one plane

It's supposed to avalanche and have that shape as it gets towards the cut off radius
Yeah ignore orange top

charred light Jan 25, 2023, 11:09 PM

#

Do you also have a sample call of run function?

keen notch Jan 25, 2023, 11:09 PM

#

yes let me show you the test cell

#

keen notch Jan 25, 2023, 11:11 PM

#

charred light I'm not sure what the exact ask is, looks like you already tried scatter plots? ...

yes that gives this

charred light Jan 25, 2023, 11:11 PM

#

k, I regret running that locally

keen notch Jan 25, 2023, 11:13 PM

#

loll maybe not best

charred light Jan 25, 2023, 11:13 PM

#

lmao, I guess I'm about to see 1000 of these plots.

keen notch Jan 25, 2023, 11:13 PM

#

keen notch soooo apparently

So basically the correct plot. The middle orange is the radius of the cut off
And all the electrons are basically just in one plane

keen notch Jan 25, 2023, 11:14 PM

#

charred light lmao, I guess I'm about to see 1000 of these plots.

ahhhh i got that first yes haha

#

definitely wrong

#

the for loop loops 1000 times soo🙈

charred light Jan 25, 2023, 11:15 PM

#

Yea, that's more of calling a new plot every time instead of plotting to the same plot.

More so, among the blue dots in the 3d map. So the paint earlier is cutting a slice and looking it from the side?
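
A sketch of plotting into one figure instead of creating a new one per iteration, assuming matplotlib: the figure and axes are created once before the loop, and each pass only adds points. The random `locations` array stands in for the real interaction locations, and plotting x against z is just one possible 2-D side view.

```python
import matplotlib
matplotlib.use("Agg")            # non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

fig, axes = plt.subplots()       # create the figure ONCE, outside the loop
for step in range(5):
    # Hypothetical stand-in for one batch of interaction locations (x, y, z).
    locations = rng.normal(size=(20, 3))
    # Plot x against z on the same axes: a 2-D side view of the 3-D points.
    axes.scatter(locations[:, 0], locations[:, 2], s=5)

axes.set_xlabel("x")
axes.set_ylabel("z")
```

In a notebook the `matplotlib.use("Agg")` line can be dropped and a single `plt.show()` placed after the loop.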

keen notch Jan 25, 2023, 11:16 PM

#

charred light Yea, that's more of calling a new plot every time instead of plotting to the sam...

exactly

#

The middle orange is the radius of the cut off
And all the electrons are basically just in one plane

It's supposed to avalanche and have that shape as it gets towards the cut off radius

charred light Jan 25, 2023, 11:23 PM

#

keen notch The middle orange is the radius of the cut off And all the electrons are basical...

What's the original code for the run function?

keen notch Jan 25, 2023, 11:25 PM

#

like without any implementation

#

this test cell was given to us

#

charred light Jan 25, 2023, 11:26 PM

#

keen notch like without any implementation

Yea, also probably something along the lines of: https://stackoverflow.com/questions/31605494/matplotlib-project-3d-surface-on-2d-plot

keen notch Jan 25, 2023, 11:26 PM

#

charred light Yea, also probably something along the lines of: <https://stackoverflow.com/ques...

hmm yes

charred light Jan 25, 2023, 11:27 PM

#

keen notch this test cell was given to us

I mean in this function, how much is what you wrote vs what was originally there.

keen notch Jan 25, 2023, 11:27 PM

#

none of it was there

#

i wrote all of it

charred light Jan 25, 2023, 11:28 PM

#

Oh I see, so everything after #your code here

keen notch Jan 25, 2023, 11:28 PM

#

yes exactly

charred light Jan 25, 2023, 11:31 PM

#

Are there parameters for where the entire thing will run relatively fast? For testing purposes, I know it's more of a simulation and meant to take time.

keen notch Jan 25, 2023, 11:31 PM

#

charred light Are there parameters for where the entire thing will run relatively fast? For te...

ahhh could take quite a few minutes so would leave in the background

#

we can speed up python though

solar yew Jan 25, 2023, 11:34 PM

#

Hey guys, if anyone has experience using largish LDA I'd really appreciate if you could take a look at this and perhaps point me in the right direction!
https://discord.com/channels/267624335836053506/1067948536337014936

#

I'm not entirely sure how I can cut down on the repetition within topics

#

The dense clustering just doesnt offer much insight

#

while i'd love to play it off that its the data, I have seen others using very similar data get more promising results (here for example - https://highdemandskills.com/topic-trends-fomc/ ). I've read that LDA is extremely sensitive to inputs, and can see some stopwords still getting through however I have doubts that this is the cause of the overlap

keen notch Jan 25, 2023, 11:36 PM

#

charred light Are there parameters for where the entire thing will run relatively fast? For te...

otherwise I don't think so sadly:(

rugged falcon Jan 25, 2023, 11:37 PM

#

what is good library for quick NN keras,tensorflow or pytorch?

solar yew Jan 25, 2023, 11:38 PM

#

rugged falcon what is good library for quick NN keras,tensorflow or pytorch?

likely keras for quick prototyping

hasty mountain Jan 25, 2023, 11:42 PM

#

Absolutely keras

#

Keras requires like... 6 lines of code using Sequential and voilá

charred light Jan 25, 2023, 11:42 PM

#

keen notch otherwise I don't think so sadly:(

Well, one thing is you're using ax for both storing a value and as a plot.
ax, ay, az = electric_field(charge, voltage, anode_radius, cathode_radius)
ax = fig.add_subplot(111, projection='3d')

hasty mountain Jan 25, 2023, 11:42 PM

#

The only bad side is the brain damage you suffer while trying to figure out how to organize your input dimension with Batch_size

charred light Jan 25, 2023, 11:42 PM

#

I also think something's wrong with the while true loop lol

#

Generates infinite graphs lmao

rugged falcon Jan 25, 2023, 11:44 PM

#

hasty mountain The only bad side is the brain damage you suffer while trying to figure out how ...

what do you mean by that

hasty mountain Jan 25, 2023, 11:44 PM

#

charred light I also think something's wrong with the while true loop lol

That's why you only use while True if you have a condition to break such loop
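A minimal sketch of that pattern, using a made-up stepping loop rather than the actual simulation code (the names `run_until_done`, `max_steps`, and the 0.5 step size are all hypothetical). The loop has a real exit condition plus a safety cap, so it cannot spin forever even if the real condition takes a while to reach:

```python
def run_until_done(max_steps=1000):
    """Step a toy 'particle' until it reaches a boundary or the step cap."""
    steps = 0
    position = 0.0
    while True:
        position += 0.5          # stand-in for one simulation step
        steps += 1
        if position >= 10.0:     # the actual stop condition
            break
        if steps >= max_steps:   # safety cap so the loop always terminates
            break
    return steps, position


steps_taken, final_position = run_until_done()
capped_steps, capped_position = run_until_done(max_steps=5)
```

Any plotting would then go after the loop rather than inside it, so only one figure is produced per run.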

keen notch Jan 25, 2023, 11:44 PM

#

charred light Generates infinite graphs lmao

i think something is it shouldn't be nested

charred light Jan 25, 2023, 11:44 PM

#

There is one, but I think it takes a while to reach.

keen notch Jan 25, 2023, 11:44 PM

#

charred light Well, one thing is you're using `ax` for both storing a value and as a plot. `a...

ahhh😅 i see

rugged falcon Jan 25, 2023, 11:45 PM

#

keen notch i think something is it shouldn't be nested

how long is your zip

hasty mountain Jan 25, 2023, 11:45 PM

#

rugged falcon what do you mean by that

keras has 2 arguments for your input size.
It has "input_dimensions" and "batch_size". If you use one, your input data has to be organized in batches. If you use another, it'll do it for you.
If you use the wrong argument, you'll get an error saying that it requires an input with an extra dimension. If you add an extra dimension, it'll throw another error saying it requires another extra dimension, and so on...

#

This is exactly why I don't use keras anymore not even for prototypes

keen notch Jan 25, 2023, 11:46 PM

#

rugged falcon how long is your zip

so it goes through interaction_locations, transport_times

rugged falcon Jan 25, 2023, 11:46 PM

#

hasty mountain keras has 2 arguments for your input size. It has "input_dimensions" and "batch_...

hmm ok. guess ill encounter that then

rugged falcon Jan 25, 2023, 11:46 PM

#

keen notch so it goes through interaction_locations, transport_times

lol yes. but how long is the zip object

charred light Jan 25, 2023, 11:47 PM

#

very long

keen notch Jan 25, 2023, 11:47 PM

#

like how long i'm not actually suree

charred light Jan 25, 2023, 11:47 PM

#

lol

keen notch Jan 25, 2023, 11:47 PM

#

ohh

#

wow

charred light Jan 25, 2023, 11:49 PM

#

rugged falcon lol yes. but how long is the zip object

1,000 is the length b/c # of charges is 1000
But interaction_locations also is a list of lists.

charred light Jan 25, 2023, 11:50 PM

#

keen notch wow

So, scatter plot wise. I'm guessing interaction_locations is X,Y,Z. So you can just 2d scatter by pulling X,Y something along those lines.

keen notch Jan 25, 2023, 11:51 PM

#

charred light So, scatter plot wise. I'm guessing interaction_locations is X,Y,Z. So you can j...

so when i scatter just x y

#

it doesn't work

#

i tried that

charred light Jan 25, 2023, 11:56 PM

#

keen notch so when i scatter just x y

Got something along lines of this. Using values in interaction_locations

keen notch Jan 25, 2023, 11:57 PM

#

charred light Got something along lines of this. Using values in interaction_locations

sooo

#

i got this initially

#

and that is wrong

#

it should look like this

#

charred light Jan 25, 2023, 11:57 PM

#

Like a sombrero, yea.

keen notch Jan 25, 2023, 11:57 PM

#

exactlyy haha

keen notch Jan 25, 2023, 11:58 PM

#

charred light Got something along lines of this. Using values in interaction_locations

but yeah me week ago got this and couldn't move past this haha

charred light Jan 26, 2023, 12:00 AM

#

keen notch but yeah me week ago got this and couldn't move past this haha

I don't see you calling cross_section at all. Given that it's provided code, you should be using that function.

keen notch Jan 26, 2023, 12:01 AM

#

ahhh i missed it!

charred light Jan 26, 2023, 12:01 AM

#

keen notch so feel i'm understanding the physics wrong

I think this is the case, but not much I can help there.

keen notch Jan 26, 2023, 12:01 AM

#

if i add this could it fix the issue

#

do u think

charred light Jan 26, 2023, 12:05 AM

#

keen notch if i add this could it fix the issue

I think so, yes. You would want to leverage the cross_section code as it already provides inelasticCS + elasticCS calculations nicely bundled into a flag signifying inelastic scattering and the calculation.
I saw you included something similar below elastic_cross_section, etc. but it may be the case that there's some calculation error there

#

Specifically, the provided function should cover what's written in choose_scattering_event, elastic_cross_section and inelastic_cross_section

keen notch Jan 26, 2023, 12:10 AM

#

charred light Specifically, the provided function should cover what's written in `choose_scatt...

hmmm

#

let me see

charred light Jan 26, 2023, 12:13 AM

#

When I took a class like this, I didn't really know python or the math. Double suffering lmao.

#

Apparently we even covered PCA

keen notch Jan 26, 2023, 12:14 AM

#

charred light When I took a class like this, I didn't really know python or the math. Double s...

it's awful haha

#

like i feel the calculations are fine

#

so unsure

charred light Jan 26, 2023, 12:17 AM

#

If this is a class, try asking your professor or classmates would be a good bet.

keen notch Jan 26, 2023, 12:18 AM

#

it is more a project and it's weird cause the physics students know the physics but not the cs and vice versa haha

#

but we can't get it to work both ways lol

#

struggles haha

#

chatgpt hehe

#

butt yk i can't get it to work cause it's always busy so have not tried it

charred light Jan 26, 2023, 12:22 AM

#

I mean, if you could phrase the original question, maybe chat gpt can help. I doubt it in this case.

#

At least without multiple user input to guide it, which at that point will take the same amount of time to learn the material anyways.

keen notch Jan 26, 2023, 12:24 AM

#

lmao true

#

but i can't even access the website

#

what do u get when u put it in

#

any luck

#

ahh

#

so wait i also not sure about this and how to fix

#

but

#

in the question run takes in 2 arg but i take in more than that?

#

Before plotting in run do i need to pass x y z
Into the func
And it’ll return new x new y new z
Store them and plot those

#

maybe

gloomy anvil Jan 26, 2023, 12:56 AM

#

I have a multivariate timeseries dataset that has stationary and non-stationary data. I want to fit a VAR, so I make all the data stationary and fit it. Unfortunately I found out that some of the timeseries (not all!) are cointegrated. So VECM would be appropriate for these timeseries, but not for the timeseries that are not cointegrated. Is there a "traditional" econometric forecast model that I could use for my data that handles stationary, non-stationary, cointegrated and non-cointegrated data?

keen notch Jan 26, 2023, 1:05 AM

#

ahh not sure

charred light Jan 26, 2023, 1:05 AM

#

I don't use chatGPT, so I don't even have an account. Can't really help you there lol.

keen notch Jan 26, 2023, 1:05 AM

#

charred light I don't use chatGPT, so I don't even have an account. Can't really help you ther...

it's okay lol not really using it

#

so error?

#

changed this

#

is it because elastic_scattering is (vx, vy, vz) not x_coords e.t.c

charred light Jan 26, 2023, 1:12 AM

#

You should add print statements and add a break at the end of your while loop to check what values get inputted where.

#

Either theta or vx is no longer an int.

keen notch Jan 26, 2023, 1:15 AM

#

charred light Either theta or vx is no longer an int.

will try

keen notch Jan 26, 2023, 1:31 AM

#

how can i fix this error

#

#

charred light Jan 26, 2023, 1:34 AM

#

keen notch

You declare it as an empty list. Nothing gets stored in it between when it's declared, and called.

keen notch Jan 26, 2023, 1:35 AM

#

charred light You declare it as an empty list. Nothing gets stored in it between when it's dec...

new _vx

#

has to go above vx,vy...

#

wait no

#

after we append

#

sorry what do you mean

#

i might be just really tired or being dumb

charred light Jan 26, 2023, 1:44 AM

#

keen notch sorry what do you mean

Ok, so before the for loop for i in range(num_charges)
vx_coords = []
vx_coords is declared as an empty list.

Then we enter into the for loop:
charge = some randomized tuple of 3 numbers (a,b,c)

Next line, you're calling vx, vy, vz = vx_coords[i], vy_coords[i], vz_coords[i]
Simplified, it means you're calling vx = vx_coords[i] where i is currently 0.
vx_coords[0] doesn't exist because it is empty and you haven't added anything into it
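A small sketch of the failure and one possible fix, assuming the velocities are generated inside the loop (the random values here are only a stand-in for whatever the real code computes). Indexing an empty list raises `IndexError`; appending first makes index `i` valid:

```python
import random

# Indexing an empty list fails immediately.
empty = []
try:
    empty[0]
except IndexError as exc:
    message = str(exc)

num_charges = 5
vx_coords, vy_coords, vz_coords = [], [], []

for i in range(num_charges):
    # Hypothetical stand-in for the real per-charge velocity calculation.
    vx, vy, vz = (random.uniform(-1.0, 1.0) for _ in range(3))

    # Store the values first...
    vx_coords.append(vx)
    vy_coords.append(vy)
    vz_coords.append(vz)

    # ...and only then read them back by index.
    assert vx_coords[i] == vx
```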

keen notch Jan 26, 2023, 1:44 AM

#

that's why i thought i should append first

#

but that throws another error

#

#

i feel this is wrong

#

but will go try sleep now it's nearly 2

uneven horizon Jan 26, 2023, 1:51 AM

#

How can I merge two series such that the index is the column of the result?
Using pandas, I have two series like:

```json
name: "David"
age: 19
```
```json
name: "Angel"
age: 18
```
I have some code working to merge them, but it leaves name and age as rows and gives the series name (with suffix) as the column
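One way this could be done, as a sketch: concatenate the two series side by side and transpose, so the series index (`name`, `age`) becomes the columns. The series names `first` and `second` are made up for the example:

```python
import pandas as pd

s1 = pd.Series({"name": "David", "age": 19}, name="first")
s2 = pd.Series({"name": "Angel", "age": 18}, name="second")

# axis=1 puts each series in its own column; .T flips that so each
# series becomes a row and the original index becomes the columns.
people = pd.concat([s1, s2], axis=1).T

# After the transpose the columns hold generic Python objects,
# so convert numeric columns explicitly if needed.
people["age"] = people["age"].astype(int)
```

The row labels are the series names; `people.reset_index(drop=True)` would replace them with 0, 1 if they are not wanted.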

charred light Jan 26, 2023, 1:51 AM

#

keen notch i feel this is wrong

Yea, I think you meant to append to vx_coords. Probably go to bed first. doesn't help to continue to work while tired.

keen notch Jan 26, 2023, 1:54 AM

#

ahh

#

yes

#

yay

#

more errors hehe

charred light Jan 26, 2023, 1:59 AM

#

keen notch more errors hehe

Also, isn't this supposed to be in the second for loop? the enumerate charge one.

keen notch Jan 26, 2023, 2:00 AM

#

this within the second for loop

#

#

hmm why?

charred light Jan 26, 2023, 2:04 AM

#

keen notch hmm why?

You originally call elastic_scattering within the 2nd loop. So why are you moving elastic_scattering up where you are defining charges?

keen notch Jan 26, 2023, 2:08 AM

#

ahh

#

true i think i did it cause now i have undeclared parameter new_x

#

#

hmm

#

charred light Jan 26, 2023, 2:18 AM

#

keen notch truei think i did it cause now i have undeclared parameter new_x

I think you changed your original code too much. If that's 2 am, might be time to call it quits for today.

keen notch Jan 26, 2023, 2:25 AM

#

charred light I think you changed your original code too much. If that's 2 am, might be time t...

okay i fixed an error will finish the rest tomorrow

#

thanks for your help:)

lapis sequoia Jan 26, 2023, 3:05 AM

#

Hi, how do you ensure that a filter code is applied to a dataset using pandas?

#

anyone here an expert in using pandas cause I have a couple of questions?

charred light Jan 26, 2023, 3:21 AM

#

lapis sequoia Hi, how do you ensure that a filter code is applied to a dataset using pandas?

You can do spot checks if you want. Otherwise, as long as you don't have errors it'll apply to the entire dataset.

lapis sequoia Jan 26, 2023, 3:22 AM

#

charred light You can do spot checks if you want. Otherwise, as long as you don't have errors ...

but when I download the dataset after it doesnt show the filtered version though?

charred light Jan 26, 2023, 3:22 AM

#

lapis sequoia but when I download the dataset after it doesnt show the filtered version though...

Show relevant code.

#

!pastebin

arctic wedgeBOT Jan 26, 2023, 3:22 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

charred light Jan 26, 2023, 3:23 AM

#

lapis sequoia but when I download the dataset after it doesnt show the filtered version though...

Make sure you're calling the right dataframe, and you're saving the filtered df back to a variable.

lapis sequoia Jan 26, 2023, 3:24 AM

#

charred light Show relevant code.

I pasted it if you could take a look at it

lapis sequoia Jan 26, 2023, 3:25 AM

#

charred light Make sure your calling the right dataframe, and you're saving the filtered df ba...

could you show an example

charred light Jan 26, 2023, 3:25 AM

#

lapis sequoia I pasted it if you could take a look at it

after you paste it, ctrl + s and send that link.

lapis sequoia Jan 26, 2023, 3:25 AM

#

charred light after you paste it, ctrl + s and send that link.

https://paste.pythondiscord.com/cofolojoma

#

this is an error I got

charred light Jan 26, 2023, 3:27 AM

#

lapis sequoia https://paste.pythondiscord.com/cofolojoma

try print f1, that should be your df

#

instead of df1[f1], just f1

#

f1 = (df1['Central del.block'] != 'X') | (df1['Central deletion flag'] != 'X') | (df1['Central posting block'] != 'X') | (df1['Central purchasing block'] != 'X')
df1[f1]

or

f1 = df1[(df1['Central del.block'] != 'X') | (df1['Central deletion flag'] != 'X') | (df1['Central posting block'] != 'X') | (df1['Central purchasing block'] != 'X')]
f1

Otherwise, currently df1[f1] is the same as:

df1[ df1[(df1['Central del.block'] != 'X') | (df1['Central deletion flag'] != 'X') | (df1['Central posting block'] != 'X') | (df1['Central purchasing block'] != 'X')] ]

lapis sequoia Jan 26, 2023, 3:30 AM

#

let me try it and see what I get

lapis sequoia Jan 26, 2023, 3:32 AM

#

charred light ``` f1 = (df1['Central del.block'] != 'X') | (df1['Central deletion flag'] != 'X...

this did not work

charred light Jan 26, 2023, 3:34 AM

#

lapis sequoia this did not work

What's the error?
Similar to this:

https://stackoverflow.com/questions/38802675/create-bool-mask-from-filter-results-in-pandas

lapis sequoia Jan 26, 2023, 3:35 AM

#

charred light What's the error? Similar to this: <https://stackoverflow.com/questions/388026...

ValueError: setting an array element with a sequence

charred light Jan 26, 2023, 3:43 AM

#

lapis sequoia ValueError: setting an array element with a sequence

What's the full error? (Full traceback)

#

Above should be working.

#

lapis sequoia Jan 26, 2023, 3:53 AM

#

charred light What's the full error? (Full traceback)

https://paste.pythondiscord.com/hujerebusi

charred light Jan 26, 2023, 3:54 AM

#

lapis sequoia https://paste.pythondiscord.com/hujerebusi

f2 = df1[** [ (df1['Central del.block'] != 'X') | (df1['Central deletion flag'] != 'X') ] **| (df1['Central posting block'] != 'X') | (df1['Central purchasing block'] != 'X')]
you have an extra set of []

lapis sequoia Jan 26, 2023, 3:54 AM

#

i tried your method and I got the same error Boolean array expected for the condition, not object

#

where

charred light Jan 26, 2023, 3:55 AM

#

lapis sequoia where

I bolded it + spaced

lapis sequoia Jan 26, 2023, 3:56 AM

#

I think it worked im just not sure since it is a large dataset

lapis sequoia Jan 26, 2023, 3:58 AM

#

charred light I bolded it + spaced

print(len(df1))
f2 = (df1['Central del.block'] != 'X') | (df1['Central deletion flag'] != 'X') | (df1['Central posting block'] != 'X') | (df1['Central purchasing block'] != 'X')
df1[f2]

#

this is what I did

charred light Jan 26, 2023, 3:58 AM

#

lapis sequoia print(len(df1)) f2 = (df1['Central del.block'] != 'X') | (df1['Central deletion ...

Both methods would have worked. See images above.

lapis sequoia Jan 26, 2023, 3:59 AM

#

so if I want to extract the dataset, it would have the changes right? or do I need to define the df1[f2] when extracting?

#

LFA1_LFB1 = pd.merge(df2, df1[f2], left_on='Vendor', right_on='Vendor', how='inner', sort=False)

charred light Jan 26, 2023, 4:00 AM

#

lapis sequoia so if I want to extract the dataset, it would have the changes right? or do I ne...

print(len(df1))
f2 = (df1['Central del.block'] != 'X') | (df1['Central deletion flag'] != 'X') | (df1['Central posting block'] != 'X') | (df1['Central purchasing block'] != 'X')
df_new = df1[f2]
df_new.to_csv("path/to/file.csv", index = False)

Yes, you need to save the df to a variable
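One thing worth checking on a tiny frame, since it is hard to eyeball on a large dataset: combining `!= 'X'` tests with `|` keeps a row if any one of the columns is not 'X', while `&` keeps only rows where none of them is 'X'. Which one is intended depends on the requirement, and the chat above does not settle it. A sketch with made-up data and only two of the four columns:

```python
import pandas as pd

df1 = pd.DataFrame({
    "Vendor": [1, 2, 3, 4],
    "Central del.block": ["X", "", "", "X"],
    "Central deletion flag": ["", "", "X", "X"],
})
cols = ["Central del.block", "Central deletion flag"]

# Keeps a row when at least one column is not 'X'.
any_not_x = (df1[cols[0]] != "X") | (df1[cols[1]] != "X")

# Keeps a row only when no column is 'X'.
none_is_x = (df1[cols[0]] != "X") & (df1[cols[1]] != "X")

# Same as none_is_x, written over a list of columns.
none_is_x_alt = ~df1[cols].eq("X").any(axis=1)

# The filtered result has to be assigned to be reused or exported.
kept_or = df1[any_not_x]
kept_and = df1[none_is_x]
```

Comparing `len(df1)` with `len(kept_or)` and `len(kept_and)` is a quick spot check that the filter did something.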

lapis sequoia Jan 26, 2023, 4:01 AM

#

ohh okay let me try that then

charred light Jan 26, 2023, 4:01 AM

#

lapis sequoia LFA1_LFB1 = pd.merge(df2, df1[f2], left_on='Vendor', right_on='Vendor', how='inn...

This would work too.

lapis sequoia Jan 26, 2023, 4:02 AM

#

charred light This would work too.

okay thank you very much Skyglow

charred light Jan 26, 2023, 4:04 AM

#

Np

lapis sequoia Jan 26, 2023, 4:04 AM

#

charred light Np

what timezone are you in?

charred light Jan 26, 2023, 4:43 AM

#

Antarctica

queen cradle Jan 26, 2023, 4:44 AM

#

solid ridge Hey folks, what's the right way to apply `numpy.hypot` to each pair of elements ...

Suppose your data is in x. Then you want:

idx = np.triu_indices(x.shape[0], 1)
y0, y1 = np.broadcast_arrays(x[None, :, ...], x[:, None, ...])
y_diff = y0[idx] - y1[idx]
dists = np.hypot(y_diff[..., 0], y_diff[..., 1])
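A worked check of the snippet above on three made-up 2-D points, assuming `x` has shape `(n, 2)`. The result is one distance per unordered pair, in the order `(0, 1), (0, 2), (1, 2)`:

```python
import numpy as np

x = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Index pairs (i, j) with i < j, i.e. the strict upper triangle.
idx = np.triu_indices(x.shape[0], 1)

# y0[i, j] is x[j] and y1[i, j] is x[i]; both are broadcast views.
y0, y1 = np.broadcast_arrays(x[None, :, ...], x[:, None, ...])

# Only the upper-triangle pairs are materialised here.
y_diff = y0[idx] - y1[idx]
dists = np.hypot(y_diff[..., 0], y_diff[..., 1])
```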

sweet crypt Jan 26, 2023, 4:44 AM

#

In Jax, how do we define that certain process to use gpu and other process to use just CPU?

wooden sail Jan 26, 2023, 4:48 AM

#

sweet crypt In Jax, how do we define that certain process to use gpu and other process to us...

here are some solutions https://stackoverflow.com/questions/74537026/execute-function-specifically-on-cpu-in-jax people suggest to specify the device with the jit function or to make a with environment (look at the bottom answer)

Stack Overflow

Execute function specifically on CPU in Jax

I have a function that will basically instantiate a huge array and do other things. I am running my code on TPUs so basically my memory is limited.
How can I execute my function specifically on the...

sweet crypt Jan 26, 2023, 5:00 AM

#

wooden sail here are some solutions https://stackoverflow.com/questions/74537026/execute-fun...

Interesting, I cannot use the jit function because I have passed objects to the function as well

wooden sail Jan 26, 2023, 5:01 AM

#

aha, then it'll have to be context managers

#

but that kinda defeats the purpose, no? you lost autograd and jit

sweet crypt Jan 26, 2023, 5:02 AM

#

# create processes for playing games
processes = []
for p_idx in range(num_parallel_play_worker):
    p = Process(target=play_worker, args=(config, self_play_iteration, iteration_stats, active_play_processes, states_q, inference_q[p_idx], replay_buffer, game_count, p_idx))
    processes.append(p)

# create inference process
inference_process = Process(target=inference_worker, args=(config, self_play_iteration, states_q, inference_q, active_play_processes))

# create training process
train_process = Process(target=train_worker, args=(config, replay_buffer, self_play_iteration, game_count, iteration_stats, active_play_processes))

for p in processes:  # start play processes
    p.start()
inference_process.start()  # start inference process
train_process.start()  # start training process

#

so this is my main.py

#

I want play_worker to use cpu, while inference and training use GPU

sweet crypt Jan 26, 2023, 5:03 AM

#

wooden sail but that kinda defeats the purpose, no? you lost autograd and jit

umm I can still use them inside the program

wooden sail Jan 26, 2023, 5:04 AM

#

fair enough

sweet crypt Jan 26, 2023, 5:06 AM

#

umm if it makes sense I printed out default_backend inside my inference.py file, which inference_worker calls into, and it says it's GPU. So does train.py, which is used by train_worker

#

So I am assuming they are using GPU by default?

wooden sail Jan 26, 2023, 5:06 AM

#

yeah. i believe unless you specify otherwise, jax looks for tpus, gpus, then cpus in that order

#

you can change this globally or per function

sweet crypt Jan 26, 2023, 5:07 AM

#

ahh I see

#

But I guess one more question would be

#

why is the inference process being called multiple times?

crisp sonnet Jan 26, 2023, 5:08 AM

#

Any django full stack developers there here looking for job switch

sweet crypt Jan 26, 2023, 5:08 AM

#

like I printed out default backend, but I get 'jax' printed out for as many processes as play worker

#

Is it something to do with how multiprocessing processes are started?

wooden sail Jan 26, 2023, 5:08 AM

#

crisp sonnet Any django full stack developers there here looking for job switch

!rule 6 9

sweet crypt Jan 26, 2023, 5:08 AM

#

I used start_method as spawn

arctic wedgeBOT Jan 26, 2023, 5:08 AM

#

Rules

6. Do not post unapproved advertising.

9. Do not offer or ask for paid work of any kind.

crisp sonnet Jan 26, 2023, 5:09 AM

#

Ok

sweet crypt Jan 26, 2023, 5:09 AM

#

Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu
Inference is using:  gpu

wooden sail Jan 26, 2023, 5:10 AM

#

where did you put that print tho

sweet crypt Jan 26, 2023, 5:10 AM

#

I dont understand why this would be the case tho, as we can see we started just one process for inference_worker

sweet crypt Jan 26, 2023, 5:10 AM

#

wooden sail where did you put that print tho

oh so the inference_worker is as such

#

def inference_worker(config, self_play_iteration, states_q, inference_q, active_play_processes):
    states = []
    index = []

    current_self_play_iteration = 0
    nn_infer = nn_inference.NN_inference(config, mod=self_play_iteration.value) # create inference instance with model 0
    while active_play_processes.value != 0:
        # if the self play iteration has changed, then update the inference model
        if self_play_iteration.value != current_self_play_iteration: 
            current_self_play_iteration = self_play_iteration.value
            nn_infer = nn_inference.NN_inference(config, mod=self_play_iteration.value)

        while len(states) < active_play_processes.value:
            try:
                idx, state = states_q.get(block=True, timeout=0.1)
                states.append(state)
                index.append(idx)
            except:
                break
         
        # inference
        if len(states) > 0:
            y, policy = nn_infer.inference(states)

            for i in range(len(y)):
                inference_q[index[i]].put((y[i][0], policy[i]))

            # clear the states, index for new batch
            states = []
            index = []

    print("Inference worker exiting since play worker is finished.")
    return

#

import numpy as np
import jax
import haiku as hk
import jax.numpy as jnp
import json
import os
import chex
import sys
from functools import partial

sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir)))
from helpers import utils, log_helper
from wrappers import pywrapper as pw
from wrappers import ccwrapper as cw
import random as rnd
from models import AZeroModel

#os.environ["CUDA_VISIBLE_DEVICES"]="1"
print("Inference is using: ", jax.default_backend())

# get logger
logger = log_helper.get_logger("self_play_parallel")

class NN_inference():
   ....

#

This is inference.py, which has the NN_inference class, which is used by inference_worker

wooden sail Jan 26, 2023, 5:15 AM

#

hmm yeah i'm not sure i see why that prints several times

#

i would expect it to print once for every time you import this file into another

sweet crypt Jan 26, 2023, 5:16 AM

#

If it gives you more idea, I am using linux, and set_start_method('spawn')

#

hmm let me check if any other file imports this file

#

Ya I have not used it anywhere else. Maybe I can try to set it inside if main function

#

Anyways I was wondering can i set it to use cpu by default?

wooden sail Jan 26, 2023, 5:25 AM

#

for the whole thing, yes

#

jax.config.update('jax_platform_name', 'cpu')

sweet crypt Jan 26, 2023, 5:26 AM

#

oh for like just say for example, inference.py file

#

or specifically

wooden sail Jan 26, 2023, 5:27 AM

#

yeah, you put this at the top of a file whose functions you want running on cpu

sweet crypt Jan 26, 2023, 5:27 AM

#

class NN_inference():
    # initialize model 
    def _forward_fn(self, x):
        mod = AZeroModel.AlphaZeroModel(num_filters=256, training=True)
        return mod(x)

    def __init__(self, config, mod): 
        self._name = "nn_inference"
        self.trained_model_params_path_parallel = config["file"]["trained_model_params_path"]
        self.model = hk.transform_with_state(self._forward_fn)
        self.rng_key = jax.random.PRNGKey(42)
        self.dummy_x = jax.random.uniform(self.rng_key, (1, cw.BOARD_SIZE, cw.BOARD_SIZE, 2))
        self.params, self.st = self.model.init(rng=self.rng_key, x=self.dummy_x)
        self.mod = mod
        self._load_trained_model(mod)

    def _load_trained_model(self, mod):
        pass
    
    @partial(jax.jit, static_argnums=(0,1))
    def _forward_pass (self, model, params, st, x, rng_key):
        forward, st = model.apply(params=params, state=st, x=x, rng=rng_key)
        y, policy = forward
        return y, policy

  
    def inference(self, states):
        batch_state = jnp.array(states)

        # batch size
        batch_size = len(states)

        # convert vector to board
        p1 = pw.vec_to_board(batch_state, 1, batch_size)
        p2 = pw.vec_to_board(batch_state, 2, batch_size)

        # stack the two boards together
        batch_state = jnp.array(jnp.stack((p1, p2), axis=3)).astype(jnp.float32)

        # get policy and value
        y, policy = self._forward_pass(self.model, self.params, self.st, batch_state, self.rng_key)

        return y, policy

wooden sail Jan 26, 2023, 5:28 AM

#

or at least my impression is that this should work, go ahead and try it out and see 😛 otherwise you'll have to use context managers

sweet crypt Jan 26, 2023, 5:28 AM

#

Like for example since I want the inference to be done on GPU, maybe I can pass some decorator to the inference function itself so that it uses GPU

wooden sail Jan 26, 2023, 5:29 AM

#

yeah, with context managers as in the link i sent at the beginning

#

but make sure all the quantities you pass to that function are created in the correct device as well

#

moving stuff from cpu to gpu and back is super slow

sweet crypt Jan 26, 2023, 5:31 AM

#

wooden sail but make sure all the quantities you pass to that function are created in the co...

Hmm, in the example that you sent I see that cpu device is created as cpu_device = jax.devices('cpu')[0]

#

that means its using just one core namely 0

#

but I am running multiple processes using multiple CPU cores, I was wondering how would i deal with it

wooden sail Jan 26, 2023, 5:33 AM

#

how you would deal with what?

sweet crypt Jan 26, 2023, 5:39 AM

#

wooden sail how you would deal with what?

oh like when we use context

#

with jax.default_device(jax.devices("cpu")[0]):
  pass

#

this will be using the cpu with id 0

#

but since we want to use multiple processes, we want that 0 to be dynamic

#

like for second process we want it to be 1 and so on

wooden sail Jan 26, 2023, 5:43 AM

#

ah, that's what you mean. you can try pmap for that https://jax.readthedocs.io/en/latest/multi_process.html#running-multi-process-computations

#

though in their example they do it on gpus, this should be doable with cpus

#

you should also be able to get the count of cpu cores you have and create processes in a loop, each with a different core
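On the `[0]` question, a hedged note: as far as I understand it, `jax.devices("cpu")` lists JAX devices, not cores, and by default there is usually a single CPU device for the whole host, so `[0]` picks that device rather than pinning work to core 0. If that holds, each play process could use the same `[0]` and leave core scheduling to the OS. A minimal sketch of the context-manager approach, where `play_step` is a hypothetical stand-in for the CPU-side work:

```python
import jax
import jax.numpy as jnp

# First (and typically only) CPU device exposed by JAX on this host.
cpu = jax.devices("cpu")[0]


def play_step(state):
    # Hypothetical placeholder for the work a play worker would do.
    return jnp.tanh(state).sum()


# Arrays created and computations run inside this block default to the CPU
# device, even when a GPU is available.
with jax.default_device(cpu):
    state = jnp.ones((4,))
    value = play_step(state)

result = float(value)
```

Creating the inputs inside the same `with` block follows the earlier advice about keeping data on the device where it is used.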

karmic parcel Jan 26, 2023, 5:56 AM

#

I love machine learning

sweet crypt Jan 26, 2023, 5:56 AM

#

wooden sail ah, that's what you mean. you can try pmap for that https://jax.readthedocs.io/e...

I see, thank you. I will work on it for a bit. Thank you so much 🙂

mint palm Jan 26, 2023, 9:37 AM

#

predicting perfect GPU memory isn't possible, and CUDA implementation isn't full leak proof too, but then in high stakes environment, how do ML engineer prevent loss of hundred of thousands of $, that are needed for training for just one day for model like chatGPT.

keen notch Jan 26, 2023, 10:20 AM

#

so noticed i do use elastic_scattering call in my current code

arctic wedgeBOT Jan 26, 2023, 10:20 AM

#

Hey @keen notch!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

keen notch Jan 26, 2023, 10:20 AM

#

https://paste.pythondiscord.com/logabitova

#

but i need to remove this line but then i have things undeclared new_x, new_y, new_z = elastic_Scattering(x,y,z)

#

#

i have called it here already

mild dirge Jan 26, 2023, 11:00 AM

#

Why do you need to remove it if you still need to use it?

gloomy anvil Jan 26, 2023, 11:06 AM

#

Hello! I have a multivariate timeseries dataset that has stationary and non-stationary data. I want to fit a VAR, so I make all the data stationary and fit it. Unfortunately I found out that some of the timeseries (not all!) are cointegrated. So VECM would be appropriate for these timeseries, but not for the timeseries that are not cointegrated. Is there a "traditional" econometric forecast model that I could use for my data that handles stationary, non-stationary, cointegrated and non-cointegrated data?

keen notch Jan 26, 2023, 11:46 AM

#

mild dirge Why do you need to remove it if you still need to use it?

ahh so i did this but i want to see it as a 2d plot

#

mild dirge Jan 26, 2023, 11:47 AM

#

So can you not replace the function that plots it in 2d (with maybe color instead of z-axis)

keen notch Jan 26, 2023, 11:48 AM

#

I want an additional 2d plot to show it’s avalanche plot cause I can’t see it in 3D

keen notch Jan 26, 2023, 11:49 AM

#

mild dirge So can you not replace the function that plots it in 2d (with maybe color instea...

ohh so this

#

I'm not entirely sure what you mean

arctic wedgeBOT Jan 26, 2023, 11:56 AM

#

Hey @keen notch!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

keen notch Jan 26, 2023, 11:56 AM

#

https://paste.pythondiscord.com/iwupaxapic

mild dirge Jan 26, 2023, 11:59 AM

#

keen notch I'm not entirely sure what you mean

I'm not entirely sure what you mean either 😅
You have some code that you remove a function from, but you still use that function. But now you want a 2d plot instead of a 3d plot. That's all the context I understand right now.

keen notch Jan 26, 2023, 12:11 PM

#

not sure either i'll have a think haha

gloomy anvil Jan 26, 2023, 12:45 PM

#

Hello! I have a multivariate timeseries dataset that has stationary and non-stationary data. I want to fit a VAR, so I make all the data stationary and fit it. Unfortunately I found out that some of the timeseries (not all!) are cointegrated. So VECM would be appropriate for these timeseries, but not for the timeseries that are not cointegrated. Is there a "traditional" econometric forecast model that I could use for my data that handles stationary, non-stationary, cointegrated and non-cointegrated data?

lapis sequoia Jan 26, 2023, 2:46 PM

#

Hi, how do you write this using pandas

#

LFA1 is filtered for [LFA1_Central del.block], [LFA1_Central deletion flag], [LFA1_Central posting block], [LFA1_Central purchasing block] not equal to “X”

serene scaffold Jan 26, 2023, 3:28 PM

#

@lapis sequoia so all of those columns need to equal X, or what?

lapis sequoia Jan 26, 2023, 3:40 PM

#

serene scaffold @lapis sequoia so all of those columns need to equal X, or what?

yes basically

tranquil jasper Jan 26, 2023, 4:09 PM

#

Is julia a good choice to be used in data engineering and analysis?

feral hedge Jan 26, 2023, 4:10 PM

#

and sorry for the long message! needed some context for my two questions at bottom + i am new to ML, and only as a hobby 😂

lapis sequoia Jan 26, 2023, 4:15 PM

#

serene scaffold <@456226577798135808> so all of those columns need to equal X, or what?

No not equal to X but I keep getting an error when I do it

charred light Jan 26, 2023, 4:38 PM

#

tranquil jasper Is julia a good choice to be used in data engineering and analysis?

Python's probably better for both. In addition to being more popular, Python/R is also better for analysis. Julia has better performance and might be better for data engineering, but depends on the company. And many companies just use Python.

solid ridge Jan 26, 2023, 5:00 PM

#

queen cradle Suppose your data is in `x`. Then you want: ```python idx = np.triu_indices(x.sh...

That's very interesting, the solution that another person gave me was

xs, ys = (*np.array(coords).T,)
x1, x2 = np.meshgrid(xs, xs)
x_diff = x2-x1
y1, y2 = np.meshgrid(ys, ys)
y_diff = y2-y1
d_mat = np.hypot(x_diff, y_diff)

Which does work, but of course creates a number of intermediate variables that would probably take a lot more memory than your solution. I am glad to know about np.triu_indices, I wasn't able to find any triu stuff on my own.

Is there a way to tell numpy that I actually want the matrix to be symmetric, and only have to define one half, upper or lower? I feel like the answer might be "no" but it's worth asking.

lapis sequoia Jan 26, 2023, 5:09 PM

#

how do you perform a left anti join using pandas?

serene scaffold Jan 26, 2023, 5:26 PM

#

lapis sequoia how do you perform a left anti join using pandas?

looks like there isn't a "good" way to do it, but here's a blog post that says: https://www.statology.org/pandas-anti-join/

Statology

How to Perform an Anti-Join in Pandas - Statology

This tutorial explains how to perform an anti-join in pandas, including an example.

#

anti_left = (
    df1.merge(df2, how='outer', indicator=True)
    .query("_merge == 'left_only'")
    .drop(columns='_merge')
)

this would probably work.
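
A minimal sketch of that anti-join on toy data, assuming a single shared key column. The frame contents and the column names `key`, `left_val` and `right_val` are made up for illustration. Since only the `left_only` rows are kept, `how="left"` gives the same result as `how="outer"` here.

```python
import pandas as pd

# Hypothetical example frames; names and values are assumptions.
df1 = pd.DataFrame({"key": [1, 2, 3, 4], "left_val": ["a", "b", "c", "d"]})
df2 = pd.DataFrame({"key": [3, 4, 5], "right_val": ["x", "y", "z"]})

# Left anti-join: rows of df1 whose key has no match in df2.
# Only the key column of df2 is merged, deduplicated so rows are not multiplied.
anti_left = (
    df1.merge(df2[["key"]].drop_duplicates(), on="key", how="left", indicator=True)
    .query("_merge == 'left_only'")
    .drop(columns="_merge")
    .reset_index(drop=True)
)

# Single-key shortcut with a boolean mask; should select the same rows.
anti_left_mask = df1[~df1["key"].isin(df2["key"])].reset_index(drop=True)
```

The `isin` form is shorter for one key; the `merge` form extends to several key columns by passing a list to `on`.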

lapis sequoia Jan 26, 2023, 6:09 PM

#

How do you assign a unique id using pandas to a large dataset?

#

anyone know if aws glue get-tags is slow as a dog? or is my company's environment messed up it takes like 15 seconds to get one tag

#

maybe more like 20-30sec

#

How do you assign a unique id using pandas to a large dataset?

charred light Jan 26, 2023, 6:35 PM

#

lapis sequoia How do you assign a unique id using pandas to a large dataset?

You can do UUID https://stackoverflow.com/questions/71163007/how-to-make-a-unique-id-in-pandas-from-names
or based on existing columns https://stackoverflow.com/questions/72216519/pandas-create-a-new-unique-identifier-column-based-on-values-from-two-other-col

lapis sequoia Jan 26, 2023, 6:36 PM

#

charred light You can do UUID <https://stackoverflow.com/questions/71163007/how-to-make-a-uniq...

What if I have many columns though?

#

does this work

charred light Jan 26, 2023, 6:37 PM

#

lapis sequoia What if I have many columns though?

You can choose to pick only some of them.

lapis sequoia Jan 26, 2023, 6:37 PM

#

import pandas as pd

# Create a new dataframe
df = pd.DataFrame(data)

# Assign a unique id to each row
df['id'] = df.index.map(lambda x: 'row_' + str(x))

lapis sequoia Jan 26, 2023, 6:38 PM

#

charred light You can choose to pick only some of them.

would the code below work as well?

charred light Jan 26, 2023, 6:39 PM

#

lapis sequoia would the code below as well?

Run it and see?

lapis sequoia Jan 26, 2023, 6:40 PM

#

it showed me the dataset with the uid

lapis sequoia Jan 26, 2023, 6:41 PM

#

charred light Run it and see?

I just dont know how to make sure if it is unique and did not repeat rows

charred light Jan 26, 2023, 6:42 PM

#

lapis sequoia I just dont know how to make sure if it is unique and did not repeat rows

Use .nunique() on the id column and see if that's the same as your df length, len(df).

#

If it's smaller, then you have repeated rows.
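
A small sketch of that check, assuming the id lives in a column called `id` as in the snippet above. The toy data and the `name` and `city` columns are invented. It also separates two questions that are easy to mix up: whether the ids are unique, and whether the underlying rows repeat.

```python
import pandas as pd

# Hypothetical data standing in for the real dataset.
df = pd.DataFrame({"name": ["ann", "bob", "ann"], "city": ["x", "y", "x"]})

# Index-based id; unique as long as the index itself is unique.
df["id"] = df.index.map(lambda x: "row_" + str(x))

# Id uniqueness: number of distinct ids equals number of rows.
ids_are_unique = df["id"].nunique() == len(df)  # same idea as df["id"].is_unique

# Repeated rows are a separate question: compare the data columns, not the id.
n_repeated_rows = int(df.duplicated(subset=["name", "city"]).sum())
```

In this toy frame the ids are unique even though one data row is repeated, so a passing id check on its own says nothing about duplicate records.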

lapis sequoia Jan 26, 2023, 6:44 PM

#

charred light If it's smaller, then you have repeated rows.

it is the same length

#

is that fine?

charred light Jan 26, 2023, 6:44 PM

#

Yes, that means each row is unique.

lapis sequoia Jan 26, 2023, 6:44 PM

#

ohh okay perfect thanks

lapis sequoia Jan 26, 2023, 6:47 PM

#

charred light Yes, that means each row is unique.

do you also know how to make sure there aren't any matching uids between two datasets?

charred light Jan 26, 2023, 6:49 PM

#

lapis sequoia do you also know how make sure there arent any matching uids between two dataset...

You can use sets and compare:
set(df1['uuid']) & set(df2['uuid'])
If the len of this is 0, there are no matches.
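
Spelled out on toy data, assuming both frames have a `uuid` column (the values are invented):

```python
import pandas as pd

# Hypothetical frames with one id in common.
df1 = pd.DataFrame({"uuid": ["a1", "b2", "c3"]})
df2 = pd.DataFrame({"uuid": ["c3", "d4"]})

# Ids present in both frames.
shared = set(df1["uuid"]) & set(df2["uuid"])
no_overlap = len(shared) == 0

# pandas-only variant that also returns the offending rows of df1.
overlap_rows = df1[df1["uuid"].isin(df2["uuid"])]
```

The set version only says which ids collide; the `isin` version is handier when the colliding rows need to be inspected.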

lapis sequoia Jan 26, 2023, 6:53 PM

#

charred light You can use sets and compare: `set(df1['uuid']) & set(df2['uuid'])` if len of th...

your amazing thanks!!

#

Would you like to connect?

#

lmaoo what does that mean

#

Just incase I have more questions

#

How do you create an intermediary table using pandas?

charred light Jan 26, 2023, 7:18 PM

#

lapis sequoia How do you create an intermediary table using pandas?

You can just save it to a variable and delete it manually after. There aren't really intermediary tables in pandas, that's more of a SQL thing.

lapis sequoia Jan 26, 2023, 7:26 PM

#

charred light You can just save it to a variable and delete it manually after. There isn't rea...

Also, how do you perform a union on two datasets and have conditions on matching certain columns?

charred light Jan 26, 2023, 7:28 PM

#

lapis sequoia Also, how do you perform an Union on two datasets and have conditions on matchin...

Use SQL.
Pandas: Merge first, then filter.

lapis sequoia Jan 26, 2023, 7:29 PM

#

I would rather use pandas, so how would I know if I should choose merge outer or inner?

charred light Jan 26, 2023, 7:31 PM

#

lapis sequoia I would rather use pandas, so how would Ik if i should choose merge outer or inn...

Depends on what you're joining. Left join tends to be best, gives you the least headaches.

lapis sequoia Jan 26, 2023, 7:32 PM

#

charred light Depends on what your joining? Left join tends to be best, gives you least heada...

Perform a union on LFA1_LFB1_int and EA_int with the following column matches called UNION

#

• LFA1_LFB1_int -> EA_int o [Test3_RecordID] -> [Test3_RecordID2] o [LFA1_Street] -> [EA_Street and House Number (emp)] o [Data Source] -> [Data Source]

#

this is what I need to do and I am stuck

charred light Jan 26, 2023, 7:33 PM

#

lapis sequoia Perform a union on LFA1_LFB1_int and EA_int with the following column matches ca...

Are LFA1_LFB1_int and EA_int these dataframes?

lapis sequoia Jan 26, 2023, 7:33 PM

#

yes

#

but I need to use pandas

errant bison Jan 26, 2023, 7:34 PM

#

hello. So i am trying to make an app which allows a user to capture an image of his room through camera. And then click on the wall to paint it and it detects the wall and color it. So for this which ai algorithm or opencv modules can i use?

charred light Jan 26, 2023, 7:34 PM

#

lapis sequoia but I need to use pandas

This will help https://realpython.com/pandas-merge-join-and-concat/

#

Merge specifically

lapis sequoia Jan 26, 2023, 7:36 PM

#

charred light Merge specifically

wouldnt concat help though?

charred light Jan 26, 2023, 7:37 PM

#

lapis sequoia wouldnt concat help though?

Not if you're trying to use a join on specific ID/Keys.
https://towardsdatascience.com/3-key-differences-between-merge-and-concat-functions-of-pandas-ab2bab224b59

lapis sequoia Jan 26, 2023, 7:38 PM

#

charred light Not if your trying to use a join on specific ID/Keys. <https://towardsdatascien...

Now I know how to merge but what I am confused about is adding the matching conditions

charred light Jan 26, 2023, 7:40 PM

#

lapis sequoia Now I know how to merge but what I am confused is adding the matching conditions

See the other link, it provides examples.

lapis sequoia Jan 26, 2023, 7:46 PM

#

charred light See the other link, it provides examples.

I keep getting errors, it's because I have to add multiple conditions

#

UNION = pd.merge(LFA1_LFB1_int, EA_int, left_on=(['Test3_RecordID'] & ['LFA1_Street'] & ['Data Source']), right_on=(['Test3_RecordID2'] & ['EA_Street and House Number (Emp)'] & ['Data Source']), how='inner')

#

this is what I did^

charred light Jan 26, 2023, 7:48 PM

#

lapis sequoia I keep getting errors its because I have to add multiple conditions

Do the merge first. THEN add filters for conditions.
Also, left_on/right_on should be a list:
left_on = ['col1', 'col2', 'col3'], right_on = ['col1', 'colA', 'col3']
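
A sketch of the list form with the column names from the question. The row values are invented, and the `(Emp)` spelling follows the attempted merge above rather than the task description. Whether the task needs a merge at all is an open assumption: if "union" means stacking rows as in SQL, renaming the columns and using `pd.concat` is the closer analogue, shown at the end.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames.
LFA1_LFB1_int = pd.DataFrame({
    "Test3_RecordID": [1, 2, 3],
    "LFA1_Street": ["1 Main St", "2 Oak Ave", "3 Pine Rd"],
    "Data Source": ["SAP", "SAP", "SAP"],
})
EA_int = pd.DataFrame({
    "Test3_RecordID2": [1, 2, 9],
    "EA_Street and House Number (Emp)": ["1 Main St", "2 Elm Ave", "9 Lake Dr"],
    "Data Source": ["SAP", "SAP", "SAP"],
})

left_keys = ["Test3_RecordID", "LFA1_Street", "Data Source"]
right_keys = ["Test3_RecordID2", "EA_Street and House Number (Emp)", "Data Source"]

# Merge first: plain lists of column names, paired up by position.
merged = pd.merge(LFA1_LFB1_int, EA_int, left_on=left_keys, right_on=right_keys, how="left")

# Then filter: keep only the rows that found a partner on all three keys.
matched = merged[merged["Test3_RecordID2"].notna()]

# SQL-style UNION instead: align the column names, then stack the rows.
stacked = pd.concat(
    [LFA1_LFB1_int, EA_int.rename(columns=dict(zip(right_keys, left_keys)))],
    ignore_index=True,
)
```

With `how="left"` every row of the left frame survives and unmatched rows get `NaN` in the right-hand columns, which is what makes the follow-up filter possible.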

lapis sequoia Jan 26, 2023, 7:52 PM

#

charred light Do the merge first. THEN add filters for conditions. Also, left_on/right_on sho...

Oh okay perfect, and you recommend inner, left, right, or outer join usually?

charred light Jan 26, 2023, 7:53 PM

#

lapis sequoia Oh okay perfect, and you recommend inner, left, right, or outer join usually?

Left join 99% of the time. Inner join the remaining 1%. if you need the others, restructure the problem.

lapis sequoia Jan 26, 2023, 7:57 PM

#

charred light Left join 99% of the time. Inner join the remaining 1%. if you need the others, ...

Also, A fuzzy match is performed to determine matching street instances, [LFA1_Street] -> [EA_Street and House Number (emp)], with a 90% threshold to produce a table with 2 columns, [Test3_RecordID] with their corresponding matching [Test3_RecordID2] called FM

#

how can this be done?

charred light Jan 26, 2023, 8:01 PM

#

lapis sequoia how can this be done?

Fuzzy matching requires a different package. Pandas doesn't do that natively https://www.geeksforgeeks.org/how-to-do-fuzzy-matching-on-pandas-dataframe-column-using-python/

GeeksforGeeks

How to do Fuzzy Matching on Pandas Dataframe Column Using Python? -...

A Computer Science portal for geeks. It contains well written, well thought and well explained computer science and programming articles, quizzes and practice/competitive programming/company interview Questions.
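
For the 90% threshold, here is a rough sketch that swaps in `difflib` from the standard library as an assumption, to avoid an extra dependency. A dedicated fuzzy-matching package would use its own scorer, and scores are not identical across scorers. The street values are invented, and `how="cross"` assumes pandas 1.2 or newer.

```python
from difflib import SequenceMatcher

import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
left = pd.DataFrame({
    "Test3_RecordID": [1, 2],
    "LFA1_Street": ["12 Main Street", "7 Oak Avenue"],
})
right = pd.DataFrame({
    "Test3_RecordID2": [10, 20],
    "EA_Street and House Number (emp)": ["12 Main Streat", "99 Lake Drive"],
})


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Score every left/right pair (quadratic, so only sensible for modest sizes).
pairs = left.merge(right, how="cross")
pairs["score"] = [
    similarity(a, b)
    for a, b in zip(pairs["LFA1_Street"], pairs["EA_Street and House Number (emp)"])
]

# Keep pairs at or above the 90% threshold, as a two-column table.
FM = pairs.loc[pairs["score"] >= 0.90, ["Test3_RecordID", "Test3_RecordID2"]].reset_index(drop=True)
```

The threshold is just a comparison on the score column, so it can be tuned after looking at a few borderline pairs.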

grave pewter Jan 26, 2023, 8:16 PM

#

Now another one

serene scaffold Jan 26, 2023, 8:18 PM

#

grave pewter Now another one

that was over a year ago bruh

#

https://tenor.com/view/flag-bandera-ikurrina-ikurriña-basque-gif-25272244

Tenor

grave pewter Jan 26, 2023, 8:19 PM

#

Yes

#

Gora eta

serene scaffold Jan 26, 2023, 8:19 PM

#

the basque flag is basically Christmas but British

grave pewter Jan 26, 2023, 8:19 PM

#

No, the british is a basque flag but gay

serene scaffold Jan 26, 2023, 8:20 PM

#

no this is that

#

anyway do you wanna talk about data science

grave pewter Jan 26, 2023, 8:21 PM

#

No pls

serene scaffold Jan 26, 2023, 8:21 PM

#

okay Sadge

lapis sequoia Jan 26, 2023, 8:51 PM

#

charred light Fuzzy matching requires a different package. Pandas doesn't do that natively htt...

but how do you perform the 90% threshold

tranquil jasper Jan 26, 2023, 8:55 PM

#

should i use dask or modin?
or something else?

gilded kestrel Jan 26, 2023, 9:14 PM

#

does anyone have a link for a 'good' transformer fine-tuning on text classification with tf?

silent stump Jan 26, 2023, 9:54 PM

#

hi guys, does anyone know if they have changed requests.get(link) in beautiful soup? keeps saying name requests is not defined?

serene scaffold Jan 26, 2023, 9:55 PM

#

silent stump hi guys, does anyone know if they have changed requests.get(link) in beautiful s...

you have to install requests, and then import requests. but that's not a data science question.

silent stump Jan 26, 2023, 9:56 PM

#

serene scaffold you have to install requests, and then `import requests`. but that's not a data ...

bruh cant believe i missed that, many thanks. sorry know for next one

dusty valve Jan 26, 2023, 10:20 PM

#

i made a dataset of 11k grayscale images of peoples faces, 256x256 res and used sklearn.cluster.SpectralClustering(10, assign_labels='cluster_qr', n_jobs=50, affinity='nearest_neighbors') to rate how good all of the people look on a scale of 1-10 in groups, but the results seem too vague. I put my own face as a test image.

#

Kmeans looks a little bit more organized however

dusty valve Jan 26, 2023, 10:41 PM

#

but it seems wrong as well

#

0 is similar to 5 and 9?

#

novel python Jan 26, 2023, 11:28 PM

#

dusty valve but it seems wrong as well

what do you mean by "how good they look?"

#

the label seems too vague tbh, how are they being defined?

dusty valve Jan 26, 2023, 11:34 PM

#

novel python what do you mean by "how good they look?"

how attractive each person is

dusty valve Jan 26, 2023, 11:35 PM

#

novel python the label seems too vague tbh, how are they being defined?

rn just encoded image

#

im gonna extract features from it

#

Transforming... images loaded (100.200%), shape (501, 65536, 3), 0 images skipped (0.00000%)  
Traceback (most recent call last):
  File "C:\Users\user\folder\Desktop\python\r-u-a-10\classify.py", line 57, in <module>
    # del l
TypeError: PCA.fit_transform() missing 1 required positional argument: 'X'

#

and i got an error on a commented line

#

smh

lapis sequoia Jan 26, 2023, 11:42 PM

#

A fuzzy match is performed to determine matching street instances, [LFA1_Street] -> [EA_Street and House Number (emp)], with a 90% threshold to produce a table with 2 columns, [Test3_RecordID] with their corresponding matching [Test3_RecordID2] called FM
how can this be done using pandas?

pliant sundial Jan 27, 2023, 12:02 AM

#

hey

#

i have a question

#

how does someone get started with machine learning/artificial intelligence?

serene scaffold Jan 27, 2023, 12:36 AM

#

pliant sundial how does someone get started with machine learning/artifical intelligence?

depends on what your goals are. but AI/ML is different from other domains of programming in that it requires a lot of specialized knowledge that is entirely separate from programming, and is driven a lot more by experimentation.

if you want to actually understand what you're doing, and do things with AI/ML that other people haven't done before, then you need to know about probability, statistics, linear algebra, and multivariate calculus. if you just want to mess around with some AI/ML libraries to satisfy some short-term curiosity (and that's fine if that's what your goal is), then you can sort of ignore it and potentially hack something together.

#

if you've taken at least one calculus course, you might start following along with one of the books on our resources page. but if you don't have at least that level of exposure to formal math, you will probably find AI/ML overwhelming.

lapis sequoia Jan 27, 2023, 12:40 AM

#

A fuzzy match is performed to determine matching street instances, [LFA1_Street] -> [EA_Street and House Number (emp)], with a 90% threshold to produce a table with 2 columns, [Test3_RecordID] with their corresponding matching [Test3_RecordID2] called FM
how can this be done using pandas?

serene scaffold Jan 27, 2023, 12:42 AM

#

@lapis sequoia you've been asking a lot of pandas questions today. which is fine, but you should probably just do a pandas tutorial, at this point

#

https://www.kaggle.com/learn/pandas

Learn Pandas Tutorials

Solve short hands-on challenges to perfect your data manipulation skills.

lapis sequoia Jan 27, 2023, 12:43 AM

#

serene scaffold <@456226577798135808> you've been asking a lot of pandas questions today. which ...

I am struggling cause there are specific conditions that are confusing to compute

serene scaffold Jan 27, 2023, 12:45 AM

#

Okay, well, good luck.

hasty mountain Jan 27, 2023, 12:48 AM

#

So... Unsupervised Learning with Neural Networks is more focused on generating pseudo-labels and data augmentation, as far as I've seen...
But I wonder...does it make sense if I try to create a Diffusion Model based on unsupervised learning? Each pixel in my generated image could be interpreted as a pseudo-label... pithink

#

There's something about neural networks trying to decrease the data entropy with pseudo-labels...while the idea of diffusion is increasing entropy, just to decrease it when sampling... I guess...

queen cradle Jan 27, 2023, 1:44 AM

#

solid ridge That's very interesting, the solution that another person gave me was ```py xs,...

Unfortunately that's a hard no. NumPy is not like BLAS and LAPACK. It does not and will never have direct support for packed symmetric matrices because they're incompatible with NumPy indexing. That said, you can store a packed symmetric or triangular matrix in a one-dimensional array and use scipy.linalg.blas and scipy.linalg.lapack to get direct access to BLAS and LAPACK subroutines that work on packed matrices.
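
As a rough NumPy-only illustration of the "store one half in a 1-D array" idea (not the BLAS/LAPACK packed routines themselves), the sketch below computes each pairwise distance once and expands to a full matrix only on demand. The `coords` values are made up.

```python
import numpy as np

# Hypothetical 2-D points.
coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
n = len(coords)

# Index pairs of the strict upper triangle (diagonal excluded).
i, j = np.triu_indices(n, k=1)

# One distance per unordered pair, stored in a 1-D array of length n*(n-1)/2.
packed = np.hypot(*(coords[i] - coords[j]).T)

# Expand to a full symmetric matrix only when one is actually needed.
d_mat = np.zeros((n, n))
d_mat[i, j] = packed
d_mat[j, i] = packed
```

SciPy's `scipy.spatial.distance.pdist` returns this kind of condensed vector directly, and `squareform` converts between the condensed and square forms.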

iron basalt Jan 27, 2023, 1:45 AM

#

hasty mountain So... Unsupervised Learning with Neural Networks is more focused on generating p...

Unsupervised learning is a type of algorithm that learns patterns from untagged data. The hope is that through mimicry, which is an important mode of learning in people, the machine is forced to build a concise representation of its world and then generate imaginative content from it.

#

https://en.wikipedia.org/wiki/Latent_and_observable_variables

Latent and observable variables

In statistics, latent variables (from Latin: present participle of lateo, “lie hidden”) are variables that can only be inferred indirectly through a mathematical model from other observable variables that can be directly observed or measured. Such latent variable models are used in many disciplines, including political science, demography, engin...

#

https://en.wikipedia.org/wiki/Latent_variable_model

Latent variable model

A latent variable model is a statistical model that relates a set of observable variables (also called manifest variables or indicators) to a set of latent variables.
It is assumed that the responses on the indicators or manifest variables are the result of an individual's position on the latent variable(s), and that the manifest variables have ...

solid ridge Jan 27, 2023, 1:45 AM

#

queen cradle Unfortunately that's a hard no. NumPy is not like BLAS and LAPACK. It does not a...

Alright that's a great answer

queen cradle Jan 27, 2023, 1:47 AM

#

Really the only problem is that you have to remember what all the BLAS and LAPACK subroutines are called.

iron basalt Jan 27, 2023, 1:47 AM

#

hasty mountain There's something about neural networks trying to decrease the data entropy with...

Diffusion is a type of latent variable model.

#

It learns the latent structure by modelling how data points diffuse through the latent space.

hasty mountain Jan 27, 2023, 1:48 AM

#

iron basalt Diffusion is a type of latent variable model.

Uh...so... pithink

#

Doesn't make sense at all?

feral hedge Jan 27, 2023, 2:38 AM

#

(pytorch) I'm having an issue where if I don't set model_checkpoint to None or del it after loading the state dict from it to the model, i immediately run out of gpu memory. if i do this though, when I save the model it does not save the original from what I've heard and seen. right now every X number of batches i save the model state dict to a separate file, end the training loop and start it again, first by loading the model i just saved, and then continuing the next batches. due to the checkpoint getting deleted (i think) the size of my state dict file on disk is always the same, even after 20,000 images processed. sometimes after a couple hundred thousand images it will increase by a few hundred bytes. it is not compressed. how can I properly save and load my model so that it doesn't run out of gpu memory, and continues to update when I save it after the next X number of batches? I have also tried a batch size of 25% of my normal, and it didn't make a difference.

pliant sundial Jan 27, 2023, 2:58 AM

#

serene scaffold depends on what your goals are. but AI/ML is different from other domains of pro...

so i would have to learn a bunch of math before even starting?

#

what else is python used for in the real world?

#

i heard its only really popular for data scientists

serene scaffold Jan 27, 2023, 3:00 AM

#

pliant sundial so i would have to learn a bunch of math before even starting?

it's not math that you'd learn "before even starting". all of AI/ML is math, in one sense or another. The math never stops.

serene scaffold Jan 27, 2023, 3:01 AM

#

pliant sundial what else is python used for in the real world?

back-end web development, and systems automation.

pliant sundial Jan 27, 2023, 3:06 AM

#

serene scaffold back-end web development, and systems automation.

yeah but there's other languages that are better for back-end

serene scaffold Jan 27, 2023, 3:06 AM

#

pliant sundial yeah but there's other languages that are better for back-end

"better" in what way?

pliant sundial Jan 27, 2023, 3:14 AM

#

not necessarily better but there's more jobs for someone who knows javascript

serene scaffold Jan 27, 2023, 3:16 AM

#

pliant sundial not necessarily better but there's more jobs for someone who knows javascript

well, all of front-end is in javascript. and then some places also do back-end in javascript. I don't do web development, so I don't know what considerations go into choosing a back end language.

#

though the existence of django seems to drive the choice of Python

pliant sundial Jan 27, 2023, 3:16 AM

#

do you do machine learning?

serene scaffold Jan 27, 2023, 3:16 AM

#

ya

pliant sundial Jan 27, 2023, 3:49 AM

#

what would you recommend to someone that wants to get started but doesn't know any math

#

i feel like if i only learn the math involved i'll get bored really fast

#

also if you dont mind answering what are some cool projects you've worked on?

rancid thorn Jan 27, 2023, 5:03 AM

#

i have an intents.json for all the inputs you can give to my ai and its responses back (or ability to call functions using "method": ["function_name"] rather than "responses": ...)

for example:

{
  "name": {
    "intents": ["what is your name", "whats your name"],
    "responses": ["My name is blah blah.", "Blah blah."]
  }
}

how could I adjust it to work with simple math problems? the intents part is basically just training data so i have to give it examples of math problems such as "what is five plus four"

how many or what kind of intents do i need to add for it to work with every number?

this uses nlp

rancid thorn Jan 27, 2023, 5:10 AM

#

rancid thorn i have an `intents.json` for all the inputs you can give to my ai and it's respo...

also do i need to change the code that processes and trains the model to make this more efficient and easy?

tranquil jasper Jan 27, 2023, 5:57 AM

#

should i use dask or modin?
or something else?

swift sleet Jan 27, 2023, 6:24 AM

#

any book recommendations for beginners to learn data structures and algorithms in python? Some of the highly reviewed ones are super technical

dusty cloud Jan 27, 2023, 6:33 AM

#

How generally are the digital whiteboards data modelled in backends?

tranquil jasper Jan 27, 2023, 6:40 AM

#

swift sleet any book recommendations for beginners to learn data structures and algorithms i...

!resources

arctic wedgeBOT Jan 27, 2023, 6:40 AM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

odd meteor Jan 27, 2023, 7:57 AM

#

pliant sundial i heard its only really popular for data scientists

Python is used in these fields. The amount of python code used varies though.

Web Development
Software Development
Backend
Automation
Software Testing & Unit Testing
Ethical Hacking
Cybersecurity
Data Analytics
Data Science
AI / ML / Deep Learning
Game Development
Encryption / Cryptography
Image Processing
Robotics
Data Engineering
Hardware interfacing and control

odd meteor Jan 27, 2023, 8:11 AM

#

pliant sundial what would you recommend to someone that wants to get started but doesn't know a...

I still believe that there’s no such thing as not a math person.

Many cultural factors, misconceptions, stereotypes, and obstacles turn people off math. You don't need to be good at math first before starting your ML journey, although it helps. You can always learn the needed math as you progress in the field.

That's my 2 cents ✌️

devout oak Jan 27, 2023, 8:21 AM

#

guys, so i have this idea: in FT and FFT we are basically boiling down a function into different sines and cosines, so what if i teach a neural net how sine and cosine work? if it can figure out the variations in the sines and cosines by doing a Fourier transform, we would have the function, right?

wooden sail Jan 27, 2023, 8:24 AM

#

what do you mean by "the variations in sines and cosines"?

odd meteor Jan 27, 2023, 8:25 AM

#

rancid thorn i have an `intents.json` for all the inputs you can give to my ai and it's respo...

You just need to give it as much data as you can. Since this is NLU related, you might wanna check out RASA and Dialogflow for the model training part.

I've only used RASA and I know it allows for Training Data Importers option to import data of different formats and from different sources which enables the user to train the model and add custom actions and intents to their chatbot.

lapis sequoia Jan 27, 2023, 10:06 AM

#

woven coral Jan 27, 2023, 11:50 AM

#

https://www.kaggle.com/code/sadikaljarif/street-view-housing-number-digits-recognition

Street View Housing Number Digits Recognition

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

fallen crown Jan 27, 2023, 12:56 PM

#

Hi, I would like to have your opinion on my code, for example the improvements you would have made

#

def train_ai(self, genome, config):
        self.game.food.random_pos(self.game.draw_grid())
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        i = 100
        while True:
            direction = self.game.snake.direction
            head = self.game.snake.positions[0]
            food = self.game.food.position
            screen_height = self.game.screen.get_height()
            screen_width = self.game.screen.get_width()
            walls = self.game.draw_line()
            food_positions = self.game.food_detection('distance')
            # Input definitions
            inputs = [
                walls['north_distance'],
                walls['south_distance'],
                walls['est_distance'],
                walls['west_distance'],
                walls['north_est_distance'],
                walls['north_west_distance'],
                walls['south_est_distance'],
                walls['south_west_distance'],

                food_positions['north'],
                food_positions['north_est'],
                food_positions['est'],
                food_positions['south_est'],
                food_positions['south'],
                food_positions['south_west'],
                food_positions['west'],
                food_positions['north_west'],

            ]

#

            # If the head is moving up
            if direction == UP:
                # [up, down, left, right]
                inputs += [1, 0, 0, 0]
                if head[1] < 0:
                    self.game.score = 0
                    self.set_fitness(genome)
                    break

            elif direction == RIGHT:
                inputs += [0, 0, 0, 1]
                if head[0] >= screen_width:
                    self.game.score = 0
                    self.set_fitness(genome)
                    break

            elif direction == DOWN:
                inputs += [0, 1, 0, 0]
                if head[1] >= screen_height:
                    self.game.score = 0
                    self.set_fitness(genome)
                    break

            elif direction == LEFT:
                inputs += [0, 0, 1, 0]
                if head[0] < 0:
                    self.game.score = 0
                    self.set_fitness(genome)
                    break
            
            outputs = net.activate(inputs)
            index = outputs.index(max(outputs))

            if index == 0:
                self.game.snake.turn('right')
                i-=0.5
            elif index == 1:
                self.game.snake.turn('left')
                i-=0.5
            
            pygame.font.init()
            font = pygame.font.Font(None, 20)
            texte = font.render(f"{i}", True, (255, 255, 255))
            self.game.screen.blit(texte, (50, 20))
            self.game.loop(i)
            i-=1
            
            if head == food:
                self.game.score +=1
                self.set_fitness(genome)
                i += 100
                self.game.food.random_pos(self.game.draw_grid())
            
            if i <= 0 :
                self.set_fitness(genome)
                break

pliant sundial Jan 27, 2023, 2:02 PM

#

odd meteor I still believe that there’s no such thing as not a math person. Many cultural ...

okay thank you

feral hedge Jan 27, 2023, 2:06 PM

#

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    generator = Generator().to(device)
    discriminator = Discriminator().to(device)

    gen_checkpoint = torch.load(gen_path)
    dis_checkpoint = torch.load(dis_path)
    print(f"1 | mem alloc: {torch.cuda.memory_allocated()}, mem cache: {torch.cuda.memory_cached()}")

    generator.load_state_dict(gen_checkpoint["model_state_dict"]) # {file_number}_{small_file_number}
    generator.train()
    print(f"2 | mem alloc: {torch.cuda.memory_allocated()}, mem cache: {torch.cuda.memory_cached()}")

    discriminator.load_state_dict(dis_checkpoint["model_state_dict"])
    discriminator.train()

    gen_optimizer = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))
    dis_optimizer = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))
    
    gen_optimizer.load_state_dict(gen_checkpoint["optimizer_state_dict"])
    dis_optimizer.load_state_dict(dis_checkpoint["optimizer_state_dict"])
    print(f"3 | mem alloc: {torch.cuda.memory_allocated()}, mem cache: {torch.cuda.memory_cached()}")

    torch.cuda.empty_cache()
    print(f"4 | mem alloc: {torch.cuda.memory_allocated()}, mem cache: {torch.cuda.memory_cached()}")

output:

1 | mem alloc: 1632132096, mem cache: 1642070016
2 | mem alloc: 1632132096, mem cache: 1642070016
3 | mem alloc: 1632132096, mem cache: 1642070016
4 | mem alloc: 1632132096, mem cache: 1642070016

Error: torch.cuda.OutOfMemoryError: CUDA out of memory in my training loop
Why does nothing here affect allocated or cached memory? Why can I run my training loop over 10000 batches at once without running out of GPU memory, but the second I run 1000 batches, save the model, and reload it from file, it instantly runs out of GPU memory? Any ideas?

#

However, when I replace torch.cuda.empty_cache() with

    gen_checkpoint = None
    dis_checkpoint = None
    # or
    del gen_checkpoint
    del dis_checkpoint

The code will NOT crash the gpu memory, and will begin training 10000 batches. The problem is, when I re-save the updated model after 1000 more batches, the state file size does not change at all, and I've read that the original checkpoint must still be around if it's going to save properly (I don't see why, though).

#

and the output of deleting the gen and dis checkpoints:

4 | mem alloc: 408724992, mem cache: 1642070016

snow osprey Jan 27, 2023, 2:31 PM

#

I need some help in ML model, anyone?

rancid thorn Jan 27, 2023, 3:37 PM

#

odd meteor You just need to give it as much data as you can. Since this is NLU related, you...

the model uses nltk and keras i already trained the model and everything

#

but thanks, i wish i could rewrite the code

#

there are so many ways to do this and i just followed a tutorial without really understanding all the code so i feel like i have to like rewrite it

#

there could be a library that automates the whole thing for me

#

that’s why i don’t like following tutorials

austere swift Jan 27, 2023, 4:09 PM

#

feral hedge However, when I replace `torch.cude.empty_cache()` with ```py gen_gen_checkp...

the empty cache function doesn't clear out all gpu memory

#

it clears any memory that's allocated as cache but not being used, so that it may be used by other processes

#

so any tensors that are still in gpu memory will stay there

#

but when you set those vars to None, it does remove them from gpu memory

feral hedge Jan 27, 2023, 4:18 PM

#

austere swift but when you set those vars to None, it does remove them from gpu memory

thanks. when i do set them to None, and save the updated model after some more batches, it does not update the model, or only saves its current update. there are no changes to the file size at all after 100000 images using = None

#

I don't get why I can train 100000 images without saving the model to disk, and as soon as i save even after 1000 images and restart and load it from disk, it crashes GPU memory.

mild dirge Jan 27, 2023, 4:34 PM

#

Is it a gpu memory error?

#

Maybe loading to gpu is controlled by cpu, so bottleneck could be cpu ram? (might be talking completely out of ass here though)

serene scaffold Jan 27, 2023, 4:36 PM

#

snow osprey I need some help in ML model, anyone?

be sure to always ask your actual question, or people will just ignore you

feral hedge Jan 27, 2023, 4:47 PM

#

mild dirge Is it a gpu memory error?

Hey, yes it is. torch.cuda.OutOfMemoryError: CUDA out of memory. and also indicates the GPU device and capacity

mild dirge Jan 27, 2023, 4:47 PM

#

And you get the error even after completely closing the program, and then loading from disk?

#

Maybe it's because you instantiate the model (which initializes with some weights) and then load all the trained weights, and then overwrite them?

#

So that's double the memory needed
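
One way to keep the loaded checkpoint from sitting on the GPU next to the freshly instantiated model is sketched below. This is only a sketch under assumptions: a tiny `nn.Linear` stands in for the generator, an in-memory buffer stands in for the checkpoint file, and the key name `model_state_dict` is made up (only `optimizer_state_dict` appears in the posted code). Whether this was the actual cause of the crash is not confirmed in the chat.

```python
import io

import torch
from torch import nn

# Tiny stand-in for the generator; the real architecture is not shown in the chat.
model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0002, betas=(0.5, 0.999))

# Save model and optimizer state under separate keys.
# "model_state_dict" is an assumed key name; the buffer stands in for a file on disk.
buffer = io.BytesIO()
torch.save(
    {
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    },
    buffer,
)
buffer.seek(0)

# map_location="cpu" keeps the loaded tensors in host RAM instead of GPU memory.
checkpoint = torch.load(buffer, map_location="cpu")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
restored = nn.Linear(4, 2).to(device)
# load_state_dict copies the saved values into the parameters that already exist.
restored.load_state_dict(checkpoint["model_state_dict"])

restored_optimizer = torch.optim.Adam(restored.parameters(), lr=0.0002, betas=(0.5, 0.999))
restored_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])

# Once the values are copied, the checkpoint dict itself is no longer needed.
del checkpoint
```

Dropping the last reference to the checkpoint is what lets its memory be reclaimed, which matches the observation that `del` helped where `torch.cuda.empty_cache()` alone did not.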

feral hedge Jan 27, 2023, 4:49 PM

#

mild dirge And you get the error even after completely closing the program, and then loadin...

Yes, I close the program just in case there is data left from the last iteration, then loading from disk after starting the program crashes it before the first training loop.

#

hmm

#

so I must instantiate the model to be able to load the state dict right? maybe there are parameters for this

mild dirge Jan 27, 2023, 4:50 PM

#

Yeah, not sure how that is handled

#

Maybe they do do that memory efficiently. Because you simply just use a load method right

#

Is it possible to do manually?

feral hedge Jan 27, 2023, 4:52 PM

#

i am looking at this https://pytorch.org/tutorials/beginner/basics/saveloadrun_tutorial.html one moment

#

looks like "To load model weights, you need to create an instance of the same model first, and then load the parameters using load_state_dict() method."

torch.save(model, 'model.pth')
We can then load the model like this:
model = torch.load('model.pth')

#

I think that's what you mean

#

i will try the second solution which is saving the whole model and loading it entirely without instantiating first.

mild dirge Jan 27, 2023, 4:54 PM

#

Yeah, if that works then that will likely have been the problem

#

Just train it 1 iteration and then save

#

Or not at all, shouldn't matter

feral hedge Jan 27, 2023, 5:04 PM

#

mild dirge Yeah, if that works then that will likely have been the problem

testing it out on a batch. will update in about 5 minutes

mild dirge Jan 27, 2023, 5:04 PM

#

Cool, curious to hear if that was actually it

trail river Jan 27, 2023, 5:05 PM

#

i've got an absurd amount of issues with my code for autokeras. would someone be able to dm and help me? the code is too long to send here.

feral hedge Jan 27, 2023, 5:25 PM

#

mild dirge Cool, curious to hear if that was actually it

so, it works.

I noticed something while doing this related to the file size. Previously I was saving the generator state dict and the generator_optimizer state dict to the same file as separate keys, then the discriminator and discriminator_optimizer state dicts the same way to another file. I didn't bother trying to do this when saving the whole model, so I saved them to 4 separate files instead of 2 files with 2 keys each. The file size of the original generator + optimizer file is nearly identical to the file size of just the generator optimizer in the current version, not including the couple hundred MB from the generator model. So it's like previously it was only saving the generator's optimizer and not the model, but looking at the code I'm not sure if it's due to setting the old loaded state dict to None, or something else, because I had a similar issue in an earlier version.

#

but yea, it works when saving each model and optimizer to a separate file, (not just the state dict) and loading them directly without instantiating.

#

thanks @mild dirge

mild dirge Jan 27, 2023, 5:27 PM

#

Awesome, good to hear!

trail river Jan 27, 2023, 5:37 PM

#

mild dirge Awesome, good to hear!

are you a machine learning genius i need help uwu

mild dirge Jan 27, 2023, 5:38 PM

#

Most certainly not, just an AI student. But if you want help from anyone here, you need to paste the code and explain the problem

#

!paste

arctic wedgeBOT Jan 27, 2023, 5:38 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

trail river Jan 27, 2023, 5:38 PM

#

i have a thread in the help channel

#

and posted my code there

#

i wonder if it'll bonk me for linking to my thread but its in the same server so here goes /shrug

https://discord.com/channels/267624335836053506/1068579390767767562

#

PagBounce yayyy

mild dirge Jan 27, 2023, 5:40 PM

#

I don't have too much experience with time series data or pandas tbh. Most of the data i've used for projects was already clean, and simply converted to a numpy arr, or generated myself.

#

So I doubt I can be of much help.

tranquil jasper Jan 27, 2023, 5:41 PM

#

as a beginner
what packages and tools i need to learn for data analysis

trail river Jan 27, 2023, 5:41 PM

#

tranquil jasper as a beginner what packages and tools i need to learn for data analysis

numpy and pandas i think are the big ones

#

idk i started coding like last week

tranquil jasper Jan 27, 2023, 5:42 PM

#

i know pandas
kinda

trail river Jan 27, 2023, 5:42 PM

#

it just depends on your use-case

mild dirge Jan 27, 2023, 5:42 PM

#

a plotting library as well

trail river Jan 27, 2023, 5:42 PM

#

what would an example of that be

tranquil jasper Jan 27, 2023, 5:42 PM

#

i know a little matplotlib

trail river Jan 27, 2023, 5:42 PM

#

oh ive seen that one

#

i kinda hated using it but its bc my data is jank

#

@mild dirge have you met anyone here who you think could potentially help me with my issue

#

ive tried every resource i know including chatGPT which also had no clue how to help me

#

it's just been stuck telling me the same 2 incorrect things over and over

mild dirge Jan 27, 2023, 5:45 PM

#

Yeah, well thats chatgpt for ya

#

You'll just have to wait or look* up any errors you get if no-one responds

trail river Jan 27, 2023, 5:46 PM

#

"loop up"?

#

oh look

mild dirge Jan 27, 2023, 5:46 PM

#

It wouldn't be nice of me to just ping some people so they feel forced to help. If someone is able and willing to help, they will

trail river Jan 27, 2023, 5:47 PM

#

mild dirge It wouldn't be nice of me to just ping some people so they feel forced to help. ...

i understand, thank you
im just being impatient like usual Okayge

#

also @mild dirge sorry last time ima ping you

have you ever seen an issue where the package was throwing the error instead of your code?

#

auto_model.py says it's referencing the variable 'graph' before its been assigned although it was self assigned

mild dirge Jan 27, 2023, 5:51 PM

#

Ehh, code from packages can throw errors sure

#

Seems like a weird error though

trail river Jan 27, 2023, 5:51 PM

#

the error doesnt exist though (from everywhere i looked)

mild dirge Jan 27, 2023, 5:51 PM

#

But if you supply wrong arguments, the code could give an error f.e.

trail river Jan 27, 2023, 5:51 PM

#

yea i submitted it to their git but its odd

#

yeah thats what i was thinking

#

but i already inputted the required arguments, so im assuming the format of my inputs (df data) are incorrect

#

rather than the arguments themselves

#

idk if im using the proper terminology but that's what i've got

shell crest Jan 27, 2023, 5:59 PM

#

trail river i wonder if it'll bonk me for linking to my thread but its in the same server so...

code wise it seems pretty ok except for the
df["PREVIOUS_WINNER_1"] = df["WINNING NUMBERS"].shift(1)
part

I don't have enough experience with timeseries splits/timeseries even to comment meaningfully.

tranquil jasper Jan 27, 2023, 6:09 PM

#

as a beginner
what packages and tools i need to learn for data analysis

trail river Jan 27, 2023, 6:13 PM

#

shell crest code wise it seems pretty ok except for the `df["PREVIOUS_WINNER_1"] = df["WINN...

well the goal was to test/train the current winning number, shift it, then use the previous number as a comparison for the current

that’s to see if there’s a pattern with previous drawings, along with the other metrics (time)
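
For reference, a small sketch of what the `shift(1)` line does, using made-up numbers rather than the real drawing data: each row receives the value from the row above it, and the first row has no predecessor.

```python
import pandas as pd

# Made-up values; only the column names come from the snippet under discussion.
df = pd.DataFrame({"WINNING NUMBERS": [11, 42, 7, 23]})

# shift(1) moves every value down one row, so each row sees the previous drawing.
df["PREVIOUS_WINNER_1"] = df["WINNING NUMBERS"].shift(1)

# The first row has no previous drawing, so it becomes NaN (and the column becomes float).
first_is_missing = pd.isna(df.loc[0, "PREVIOUS_WINNER_1"])
```

Rows with a missing previous value usually need to be dropped or filled before they are passed to a model.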

serene scaffold Jan 27, 2023, 6:32 PM

#

I'm in a grad school lecture. And for the previous lecture, someone wrote "wonderful product" and "terrible really bad" next to a matrix. I guess they were talking about sentiment analysis

hard birch Jan 27, 2023, 6:52 PM

#

What does '@' do in pandas

#

example

#

A.T @ A - np.eye(F)

#

here A is some matrix

serene scaffold Jan 27, 2023, 6:56 PM

#

hard birch What does '@' do in pandas

Nothing. But in numpy, it does matrix multiplication.
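
A quick sketch of the expression from the question with NumPy arrays. The matrix `A` and the size `F` here are made-up values, chosen so that the result is easy to check by hand.

```python
import numpy as np

F = 2  # assumed size of the square matrix
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])  # a 90-degree rotation, so A.T @ A is the identity

# "@" is the matrix-multiplication operator (same as np.matmul for 2-D arrays).
residual = A.T @ A - np.eye(F)
```

For this particular `A`, `residual` is the zero matrix, which is the usual way to check that a matrix is orthogonal.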

hard birch Jan 27, 2023, 6:56 PM

#

ah okok

#

that makes sense

#

cheers

feral hedge Jan 27, 2023, 7:23 PM

#

Alright, now I'm confused. Even after using torch.load() on the entire models and optimizers, not just the state_dict, it does not run out of GPU memory but every time I save the model to a file, the file size does NOT increase since the last 1000 batches. Sometimes it varies more or less by a few hundred bytes. The model is being trained because at the start of the first batch, its a random rgb image, and by the end of that batch, and the start and end of every batch after, the images are more "real" and not random. Would you expect the file size of the models and optimizers to increase after training them on thousands of batches of images? I don't get it because I save every 300 batches and the first batch image 1 is rgb random, first batch last image is more "real". and afterwards at the end of each 300 batches the model is saved and then loaded again. after its saved and loaded, first image produced by it is more "real" than random, so it appears to have at least stored and read some information. From what I understand, the file sizes should be increasing though?

tldr - after training 10k or 100k images in a text-to-image GAN, would you expect the model and optimizer file size to increase since the last time you saved it?

lapis sequoia Jan 27, 2023, 7:28 PM

#

can i convert dataframe to an array excluding headers

serene scaffold Jan 27, 2023, 7:32 PM

#

lapis sequoia can i convert dataframe to an array excluding headers

Use to numpy

lapis sequoia Jan 27, 2023, 7:39 PM

#

serene scaffold Use to numpy

that will give me a numpy array

#

anyways i found my solution which was .values

#

oh wait thats a numpy array too

#

i did .astype
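
For the DataFrame-to-array question above, a minimal sketch with a made-up two-column frame: `to_numpy()` returns only the cell values, so the headers are already excluded, and `tolist()` turns the result into plain Python lists if a NumPy array is not wanted.

```python
import pandas as pd

# Made-up frame for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

arr = df.to_numpy()   # 2-D NumPy array of the cell values, without column labels
rows = arr.tolist()   # nested Python lists, one inner list per row
```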

mild dirge Jan 27, 2023, 7:52 PM

#

feral hedge Alright, now I'm confused. Even after using torch.load() on the entire models an...

Same amount of weights, same amount of data I'd think

#

Maybe it uses some compression for storing the data on disk, but I'd doubt trained weights are more efficient to store

#

Training/changing the parameters wouldn't change the memory size
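
A small sketch of that point, assuming a toy `nn.Linear` model and plain SGD rather than the GAN from the question: a training step changes the values of the weights, but not how many numbers are stored.

```python
import torch
from torch import nn


def state_bytes(module: nn.Module) -> int:
    """Total bytes of tensor data held in the module's state_dict."""
    return sum(t.numel() * t.element_size() for t in module.state_dict().values())


model = nn.Linear(8, 3)  # toy model: 8*3 weights + 3 biases, stored as float32
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

before_bytes = state_bytes(model)
before_weight = model.weight.detach().clone()

# One training step on random data changes the weight values...
loss = model(torch.randn(16, 8)).pow(2).mean()
loss.backward()
optimizer.step()

after_bytes = state_bytes(model)
weights_changed = not torch.equal(before_weight, model.weight.detach())
# ...but the parameter count and dtype are unchanged, so the stored size is too.
```

Optimizers that keep per-parameter buffers (Adam, for example) create those buffers on the first step, so an optimizer file can grow once after the first update and then stay flat.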

hasty mountain Jan 27, 2023, 7:56 PM

#

iron basalt Diffusion is a type of latent variable model.

Hm... It's being a bit interesting...at least in the pseudo-label generation...I guess...it's at least assigning different pseudo-labels to different images, and appears to have some logic to it.
The only downside is that...since I'm using convolutions with many channels on my own computer, it's taking quite some time to finish a single epoch.

EDIT: Whoops, I checked the wrong model. My Unsupervised Learning model is actually having the same performance as my normal diffusion model... which is: outputting random noise, no images

#

(Oh yeah, I'm making it generate pseudo-labels just so I can visualize how it's going)

ocean swallow Jan 27, 2023, 7:57 PM

#

Hey yo sup? I am trying to find an investing strategy that I can train RL agent(s) on

#

has anyone done one for the stock market?

tawny spire Jan 27, 2023, 8:05 PM

#

what shape should the dependent variable have in a logistic regression model? can't find the answer anywhere

nocturne eagle Jan 27, 2023, 8:15 PM

#

should it be round?

tawny spire Jan 27, 2023, 8:20 PM

#

i mean the data you pass into train_test_split

#

say i want to build a logistic regression model, i have a dataset of 2 different types of photo, how should i structure the y parameter i pass into sklearn.model_selection.train_test_split?

ocean swallow Jan 27, 2023, 8:25 PM

#

don't matter the shape. it splits from the .shape[0]

tawny spire Jan 27, 2023, 8:26 PM

#

what should i pass to y?

#

it should be a 1d nparray, but of what length and holding what values?

ocean swallow Jan 27, 2023, 8:28 PM

#

what is log res

#

do you mean logistic regression?

tawny spire Jan 27, 2023, 8:28 PM

#

yeah

ocean swallow Jan 27, 2023, 8:29 PM

#

tawny spire it should be a 1d nparray, but of what length and holding what values?

if you have N photos RGB

#

your data is shaped (N, height, width, 3)

#

if you pass this to train_test_split they will be separated into (x, height, width, 3) and (y, height, width, 3)

#

where x+y = N

tawny spire Jan 27, 2023, 8:31 PM

#

hmm i thought the dimensions were multiplied

ocean swallow Jan 27, 2023, 8:32 PM

#

no it doesn't flatten

tawny spire Jan 27, 2023, 8:32 PM

#

hmm

ocean swallow Jan 27, 2023, 8:32 PM

#

doesn't do anything else

tawny spire Jan 27, 2023, 8:33 PM

#

so ```x_train_n_samples = x_train.shape[1]
x_train_matrix_w = x_train.shape[2]
x_train_matrix_h = x_train.shape[3]
x_train_rgb = x_train.shape[4]

x_train_dims = (x_train_n_samples, x_train_matrix_w * x_train_matrix_h * x_train_rgb)``` is wrong

ocean swallow Jan 27, 2023, 8:33 PM

#

depending on your use case you will possibly want to keep the shape anyway

ocean swallow Jan 27, 2023, 8:33 PM

#

tawny spire so ```x_train_n_samples = x_train.shape[1] x_train_matrix_w = x_train.shape[2] x...

what model are you using

#

sklearn LogRes?

tawny spire Jan 27, 2023, 8:33 PM

#

sk learn logistic regression

#

i have two folders of different image, with shapes ((318, 512, 512, 3), (318, 512, 512, 3))

#

do i just feed them to the model?

#

as the x var

ocean swallow Jan 27, 2023, 8:34 PM

#

above should work then

tawny spire Jan 27, 2023, 8:34 PM

#

what about the y var?

ocean swallow Jan 27, 2023, 8:35 PM

#

what is the y var exactly?

#

other images?

tawny spire Jan 27, 2023, 8:35 PM

#

labels

#

i dunno if they should be images or labels

#

sorry i'm a bit new to this, i thought since it was a classifier it would be a label but now i'm not so sure

ocean swallow Jan 27, 2023, 8:35 PM

#

if they are labels you should probably one hot encode them

#

so say if there is 10 labels for all possible labels

#

your y shape should be (10,)

tawny spire Jan 27, 2023, 8:36 PM

#

i see

#

so it should just be a list [0,1] i pass as the y var

ocean swallow Jan 27, 2023, 8:36 PM

#

yes. if they are labels.

tawny spire Jan 27, 2023, 8:36 PM

#

roger 😄 lemme try

ocean swallow Jan 27, 2023, 8:36 PM

#

np

#

that is classification problem

#

so it heavily depends on your situation

#

https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html

scikit-learn

sklearn.neural_network.MLPClassifier

Examples using sklearn.neural_network.MLPClassifier: Classifier comparison Classifier comparison Compare Stochastic learning strategies for MLPClassifier Compare Stochastic learning strategies for ...

tawny spire Jan 27, 2023, 8:44 PM

#

should y be a list of 0's and 1's or something?

serene scaffold Jan 27, 2023, 9:39 PM

#

lapis sequoia oh wait thats a numpy array too

What array do you want that isn't a numpy array?

ocean swallow Jan 27, 2023, 9:40 PM

#

tawny spire should y be a list of 0's and 1's or something?

too vague of a question. what are you trying to accomplish exactly can you tell me?

tawny spire Jan 27, 2023, 9:44 PM

#

@ocean swallow ok so i have a dataset, comprised of two folders containing pictures of normal roads and pictures of roads with potholes [the files are named 1, 2, 3 etc].

i have imported the images, turned them into grayscale, and added them to a list, then turned the whole thing into a nparray

#

so my x variable is currently [[numpy arrays of images of normal roads], [numpy arrays of images containing potholes]]

#

is this the right way to structure the data for x?

#

#

or am i supposed to pass normal roads to x and pothole roads to y?

serene scaffold Jan 27, 2023, 9:50 PM

#

@tawny spire you're not allowed to be Steele

tawny spire Jan 27, 2023, 9:51 PM

#

u

#

as far as i understand it y should be a list of labels, but how it's structured i don't know

#

gonna call it a night, if anyone knows what i'm talking about don't hesitate to @ me

mild dirge Jan 27, 2023, 11:02 PM

#

Logistic regression is used for classification problems

mild dirge Jan 27, 2023, 11:03 PM

#

tawny spire gonna call it a night, if anyone knows what i'm talking about don't hesitate to ...

And logistic regression, if you have a single binary classification task, would probably take a 1d array (or maybe 2d with 1 column) of 0s and 1s

#

So a 0 or a 1 for each example

#

And if you look at the docs, it turns out that it indeed takes an array of shape (n_samples,) for the labels

#

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

scikit-learn

sklearn.linear_model.LogisticRegression

Examples using sklearn.linear_model.LogisticRegression: Release Highlights for scikit-learn 1.1 Release Highlights for scikit-learn 1.1 Release Highlights for scikit-learn 1.0 Release Highlights fo...

arctic wedgeBOT Jan 27, 2023, 11:13 PM

#

Hey @lapis sequoia!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

arctic wedgeBOT Jan 27, 2023, 11:32 PM

#

Hey @lapis sequoia!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

lapis sequoia Jan 27, 2023, 11:32 PM

#

hey I am trying to take values from index of line of text file and assign row of dataframe with values in columns, how can I go about this differently

#

every method I try takes far too long

#

`def emshr_parser(file_name = None):
    df = pd.DataFrame()
    rep = 0
    if file_name == None: #if file missing will call from website
        emshr_load()

    with open('emshr_lite.txt', 'r') as file:
        for line in file:
            if rep == 0:
                columns = line.split()

                for column in columns:
                    df[column] = ''
            else:
                df['NCDC'].loc[rep] = line[0:7]
                df['BEG_DT'].loc[rep] = line[9:16]
                df['END_DT'].loc[rep] = line[18:25]
                df['COOP'].loc[rep] = line[27:32]
                df['WBAN'].loc[rep] = line[34:38]
                df['ICAO'].loc[rep] = line[40:43]
                df['FAA'].loc[rep] = line[45:49]
                df['NWSLI'].loc[rep] = line[51:55]
                df['WMO'].loc[rep] = line[57:61]
                df['TRANS'].loc[rep] = line[63:72]
                df['GHCND'].loc[rep] = line[74:84]
                df['STATION_NAME'].loc[rep] = line[86:185]
                df['CC'].loc[rep] = line[187:188]
                df['CTRY_NAME'].loc[rep] = line[190:224]
                df['ST'].loc[rep] = line[226:227]
                df['COUNTY'].loc[rep] = line[229:263]
                df['CD'].loc[rep] = line[265:266]
                df['UTC'].loc[rep] = line[268:270]
                df['LAT_DEC'].loc[rep] = line[272:280]
                df['LON_DEC'].loc[rep] = line[282:291]
                df['LOC_PREC'].loc[rep] = line[293:302]
                df['LAT_DMS'].loc[rep] = line[304:316]
                df['LON_DMS'].loc[rep] = line[318:331]
                df['EL_GR_FT'].loc[rep] = line[333:340]
                df['EL_GR_M'].loc[rep] = line[342:349]
                df['EL_AP_FT'].loc[rep] = line[351:358]
                df['EL_AP_M'].loc[rep] = line[360:367]
                df['TYPE'].loc[rep] = line[369:465]
                df['RELOCATION'].loc[rep] = line[470:499]
                df['GHCNMLT'].loc[rep] = line[501:511]
                df['IGRA'].loc[rep] = line[513:523]
                df['HPD'].loc[rep] = line[525:535]

            rep += 1

    return df`
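
Assigning one cell at a time is slow because every assignment goes through pandas indexing. A possible alternative is to let `pandas.read_fwf` cut each line at fixed column positions in one pass. The sketch below is hedged: the three-field sample text is invented, and only the first three slice positions from the parser above are reused as `colspecs`; the real file would need the full list of positions and names.

```python
import io

import pandas as pd

# Invented three-field excerpt; the real file has many more fields per line.
sample = io.StringIO(
    "NCDC     BEG_DT   END_DT\n"
    "0000001  1950010  1960123\n"
    "0000002  1970050  9999123\n"
)

# Half-open [start, end) positions, matching line[0:7], line[9:16], line[18:25].
colspecs = [(0, 7), (9, 16), (18, 25)]
names = ["NCDC", "BEG_DT", "END_DT"]

df = pd.read_fwf(
    sample,
    colspecs=colspecs,
    names=names,
    header=None,
    skiprows=1,   # skip the header line, since names are given explicitly
    dtype=str,    # keep leading zeros in the identifier fields
)
```

Passing a file path instead of the `StringIO` object works the same way.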

#

here is the text file I am parsing

drifting osprey Jan 27, 2023, 11:36 PM

#

i searched quite a bit for session based recommendation but all that i could find was basically what is session based recommendation, where it is implemented, etc.
Nowhere i could find any basic implementation of session based recommendation, like how do we implement it, what all packages come into use, example code or stuff like that
it would be great if someone could provide with how can it be implemented

rancid thorn Jan 27, 2023, 11:46 PM

#

I have this project for a virtual assistant such as Siri or Alexa which uses natural language processing.

I got the code for training the model and everything from a tutorial but I'm not sure if this is the best/most efficient way to do it, I'm sure there are libraries out there such as PyTorch that automates this for me as I've found lots of NLP/Machine learning libraries and I don't know which one is best for this

This is the code for the model:
https://paste.gg/p/anonymous/2b5dc51a892c4afcac346baf5525ae14

And this is an example of the intents JSON file:

{
  "name": {
    "intents": ["what is your name", "whats your name"],
    "responses": ["My name is blah blah.", "Blah blah."]
  }
}

Does anyone know how I can optimize/make the code better or if there are any alternative libraries that can do this for me? The code could also be just fine, I just want to make sure it's not doing a bunch of unnecessary stuff when I could just automate the whole thing using one line of code utilizing another library.

Thanks!

prime hearth Jan 28, 2023, 1:18 AM

#

hi, im trying forward feature selection using SequentialFeatureSelector (from mlxtend.feature_selection) for my SVM (support vector machine), which uses 100 features, but the process takes so long: it's already been an hour and it has only completed 60 features. Is there any way to improve speed?

#

this is my output for example:

Features: 77/101[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  24 out of  24 | elapsed:  1.5min finished
Features: 78/101[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  23 out of  23 | elapsed:  1.5min finished
Features: 79/101[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
not sure if i can add multiple concurrent workers to improve speed?
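
The `n_jobs=1` in the log suggests the selector is running on a single worker. As far as I know, the mlxtend class takes an `n_jobs` argument for this. The sketch below swaps in scikit-learn's own `SequentialFeatureSelector` instead (a different class with a similar purpose), on a small synthetic dataset, just to show where such a setting goes. In the scikit-learn version, `n_jobs` parallelizes the cross-validation folds.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.svm import SVC

# Small synthetic dataset standing in for the real 100-feature data.
X, y = make_classification(n_samples=60, n_features=6, n_informative=3, random_state=0)

selector = SequentialFeatureSelector(
    SVC(kernel="linear"),
    n_features_to_select=2,
    direction="forward",
    cv=3,
    n_jobs=2,  # run the cross-validation folds on two workers
)
selector.fit(X, y)

selected = selector.get_support()  # boolean mask over the 6 input features
```

Fewer cross-validation folds or a cheaper kernel also cut the cost, since forward selection refits the model for every candidate feature at every step.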

ocean swallow Jan 28, 2023, 2:29 AM

#

tawny spire <@250736327593689088> ok so i have a dataset, comprised of two folders containin...

shape your data so that it is shaped (2 * 318, 512 * 512), so each row of X is a flattened array of pixels (putting all images in X is correct)
y = np.zeros([2 * 318], dtype=np.float64)
y[318:] = 1.0
Check if you greyscaled your images correctly (maybe you meant binarized?). That's way too many zeroes for greyscale.
Anyway you should probably normalize your images.
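
Put together, the layout described above could look like the sketch below. It assumes random 8x8 arrays in place of the 318 greyscale 512x512 images per folder, with the two classes made artificially easy to tell apart. The point is only the shapes: one flattened row per image in X, one 0/1 label per image in y.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_class, height, width = 20, 8, 8  # small stand-ins for 318 images of 512x512

# Fake greyscale images already scaled to [0, 1]: "normal" is darker, "pothole" brighter.
normal = rng.random((n_per_class, height, width)) * 0.4
potholes = rng.random((n_per_class, height, width)) * 0.4 + 0.6

images = np.concatenate([normal, potholes])  # shape (40, 8, 8)
X = images.reshape(len(images), -1)          # shape (40, 64): one flattened row per image
y = np.zeros(len(images), dtype=np.int64)    # label 0 = normal road
y[n_per_class:] = 1                          # label 1 = pothole

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

Both folders go into X; y only records which folder each row came from.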

ocean swallow Jan 28, 2023, 2:34 AM

#

mild dirge And logistic regression, if you have a single binary classification task, would ...

yeah logres is like that but also you don't one hot encode even if it is a multiclass problem.

restive iris Jan 28, 2023, 6:08 AM

#

shouldn't this be returning a high correlation coefficient?

# Is there a correlation between the number of bedrooms and the price of the property
print(df.groupby('Bedrooms').median().sort_values('Price', ascending=False))
# There is a correlation between the number of bedrooms and the price of the property, as the more bedrooms
# the more expensive the property is.
# use statistics to prove this
# remove places which have 10 bedrooms
temp_df = df[df['Bedrooms'] != 10]
# get correlation of bedrooms and price
temp_df['Bedrooms'].corr(temp_df['Price'])

              Price        Date
Bedrooms                       
8.0       1275000.0  20205915.5
9.0       1140000.0  20180822.0
6.0        792000.0  20180212.0
5.0        675000.0  20180326.0
7.0        650000.0  20180530.0
4.0        495000.0  20170720.0
10.0       395000.0  20161009.0
3.0        265000.0  20170418.0
2.0        197000.0  20170523.0
1.0        190000.0  20170411.0

#

for some reason the correlation coefficient is:
0.08499127277742283

#

as in I understand there are other affecting factors such as date, location and property type however in general bedrooms should be much stronger than just 0.08

queen cradle Jan 28, 2023, 7:17 AM

#

restive iris shouldn't this be returning a high correlation coefficient? ```python # Is there...

First of all, what's up with 10 bedrooms? It looks like you have a data problem. Second, I believe Pandas defaults to the Pearson correlation coefficient. This is only appropriate if you believe that the relationship between bedrooms and price is approximately linear. You have to believe that an extra bedroom is worth the same amount, regardless of how many bedrooms the house has: Going from one to two bedrooms should add the same amount to the price as going from nine to ten bedrooms. What's displayed suggests that's not true, so the Pearson correlation coefficient will give you junk. (It may also be affected by the 10 bedroom data.) You should try Kendall's tau. It's intended for situations where there is a monotonic relationship (like "more bedrooms <=> higher price") but where that relationship is not linear.
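
A small sketch of the difference, using synthetic numbers rather than the property data: prices that always rise with bedrooms but not in a straight line. It assumes SciPy is installed, which pandas relies on for `method="kendall"`.

```python
import pandas as pd

# Synthetic data: strictly increasing but strongly non-linear prices.
bedrooms = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])
price = pd.Series([1e5 * 1.6 ** b for b in bedrooms])

pearson = bedrooms.corr(price)                    # default method="pearson", measures linearity
kendall = bedrooms.corr(price, method="kendall")  # rank-based, measures monotonic agreement
```

Here Kendall's tau is 1 because the ordering is perfectly consistent, while the Pearson coefficient falls below 1 because the relationship is curved.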

restive iris Jan 28, 2023, 12:08 PM

#

interestingly I did this and the correlation coefficient changed

#

#

leaving the ten one there

#

However I found a fix for the ten

desert bear Jan 28, 2023, 12:57 PM

#

Hi Data Scientists! I have a question regarding building a regression model.
My task is to build a regression model that most accurately predicts the y column given the 5 features. Second task is to find some dependencies (correlations?) that this model detected in the data.
My questions are: what are your current steps in dealing with such a problem? What algorithms do you use?

#

My idea would be to:

1. First inspect the data: plot distribution plots, find if the data is skewed etc., check whether there are any missing values
2. Preprocess the data: normalize or standardize all columns, fill missing values
3. Choose the model (the hardest part)
Should I go with the regression model from the statsmodels python library?

prime hearth Jan 28, 2023, 2:04 PM

#

Yeah, I follow similar steps, but in the 2nd step I also check for any correlation that might affect the model, and after filling in NaNs I check whether the distribution has changed. In the 3rd step I test multiple models and see which gives better accuracy. Then the next step is improving the model accuracy through feature selection, feature engineering (removing outliers from the dataset), cross-validation to check for overfitting, handling imbalanced datasets, and hyperparameter optimization

lyric geyser Jan 28, 2023, 2:41 PM

#

Hi guys, is here anyone who has experience in machine learning? Supervised learning or reinforcement learning for an online game where you just move mouse and click sometimes when needed. You are flying in a rocket through a tube and you have to avoid obstacles and click sometimes to boost through a wood, object detection for every little obstacle I think would be too big of a process, don't know what kind of approach would I take other than machine learning. I need help
If you want I can pay for the guidance.
Now I have a code that gets training data from a video while playing temple run and registering wasd keys, then other file that trains the model and then another file that plays the game, but I would need to reconfigure this for this other game that uses mouse click and mouse movement.

desert bear Jan 28, 2023, 2:46 PM

#

prime hearth Yeah i do similar steps look, but 2nd step I also check for any correlation that...

what other models would you choose? The problem is that I have dataset with total number of 350

queen cradle Jan 28, 2023, 3:33 PM

#

desert bear My idea would be to: 1. first inspect the data: - plot distribution plots. - fin...

You first say you want to build a "regression model", but then you say you want to build a "classification model." Those aren't the same thing. It looks to me like you want regression, not classification.

Regarding the choice of library: If you want to do statistics, R > statsmodels > sklearn. While statsmodels is a fine package, it's not as full-featured as the many thousands of packages available on CRAN. sklearn is fine for what it does, but what it does is machine learning, not statistics; but you may want to do machine learning, in which case it's fine.

supple onyx Jan 28, 2023, 3:40 PM

#

Hi

desert bear Jan 28, 2023, 3:54 PM

#

queen cradle You first say you want to build a "regression model", but then you say you want ...

Sorry, I meant the regression model

desert bear Jan 28, 2023, 4:35 PM

#

I am confused on how to start. I need to create a presentation showing that my model produces accurate predictions of the regression and show the dependencies in variables that the model learned.

tawny spire Jan 28, 2023, 4:53 PM

#

ocean swallow shape your data so that it is shaped (2 * 318, 512 * 512 ) so each indice of X i...

thanks for the heads up mate, i'll wait until the week when i can get help with it from a lecturer 🙂

agile cobalt Jan 28, 2023, 5:01 PM

#

@scarlet siren your channel closed while I was looking it up x-x
sounded a bit like https://www.tutorialspoint.com/numpy/numpy_inv.htm ?

TutorialsPoint

numpy.linalg.inv()

We use numpy.linalg.inv() function to calculate the inverse of a matrix. The inverse of a matrix is such that if it is multiplied by the original matrix, it results in identity matrix.

#

if so, you should've mentioned that it was in fact about inverse matrixes (and perhaps looked up numpy inverse matrix on your own)

lapis sequoia Jan 28, 2023, 5:55 PM

#

i created this today a valorant bot recognition

#

windows 💀

#

nexus

lapis sequoia Jan 28, 2023, 6:16 PM

#

for item in data_module.train_dataloader():
    print(item)
    break

#

why is this running forever

scarlet siren Jan 28, 2023, 6:34 PM

#

agile cobalt if so, you should've mentioned that it was in fact about inverse matrixes (and p...

No, the R and I matrixes are of different dimensions

#

R is n * 1
I is n * n

#

The formula is I * lambda = R
And I wanna find lambda

agile cobalt Jan 28, 2023, 6:40 PM

#

no clue then

scarlet siren Jan 28, 2023, 6:44 PM

#

Ty for the attempt!

#

But if anyone can help me with this it would be appreciated

#

I * lambda = R
I = n * n
lambda and R are n * n

#

I need to find lambda while knowing I and R

#

All are ndarrays

tidal bough Jan 28, 2023, 7:00 PM

#

scarlet siren The formula is I * lambda = R And I wanna find lambda

isn't that just numpy.linalg.solve?

wooden sail Jan 28, 2023, 7:02 PM

#

solve (or pinv/lstsq depending on the rank) should do the trick indeed
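
A minimal sketch of the suggestion above, assuming the matrix called I is a square n x n ndarray and R is n x 1. The names `I_mat`, `R_vec`, `lam` and the numbers are made up for illustration. Note that `*` on ndarrays is element-wise, so the matrix product in "I * lambda = R" is written with `@` in NumPy.

```python
import numpy as np

# Hypothetical example data: I_mat is n x n, R_vec is n x 1.
I_mat = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
R_vec = np.array([[3.0],
                  [5.0]])

# Square, full-rank case: solve I_mat @ lam = R_vec directly.
lam = np.linalg.solve(I_mat, R_vec)

# Rank-deficient or non-square case: fall back to a least-squares solution.
lam_lstsq, residuals, rank, singular_values = np.linalg.lstsq(I_mat, R_vec, rcond=None)

# Sanity check: plugging lam back in should reproduce R_vec.
print(np.allclose(I_mat @ lam, R_vec))
```

If R were n x n instead of n x 1, the same `np.linalg.solve` call would return an n x n lambda, solving one column at a time.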

scarlet siren Jan 28, 2023, 7:13 PM

#

tidal bough isn't that just numpy.linalg.solve?

I had no idea that could be doable

woven coral Jan 28, 2023, 7:51 PM

#

https://www.kaggle.com/code/sadikaljarif/predicting-beer-consumption-using-machine-learning

Predicting beer consumption using Machine Learning

Explore and run machine learning code with Kaggle Notebooks | Using data from Beer Consumption - Sao Paulo

lavish crypt Jan 28, 2023, 8:19 PM

#

Hello! I am working on a problem where we will do real time object detection using YOLO v5s. The algorithm will use Raspberry Pi Camera Module v2 as the camera and NVIDIA Jetson Xavier NX as the computer. For this reason, there is a need for an algorithm with a minimum processing load and high FPS value and accuracy.

We are looking for an algorithm for object tracking. In addition to solutions such as DeepSORT, there are OpenCV modules such as GOTURN. Can someone with experience or knowledge inform about object tracking solutions? What are the differences between them and what should be used in this system? Thank you.

tawny spire Jan 28, 2023, 8:32 PM

#

do you feed target column [y] to features [x] when training a linear regression model?

#

i guess the question is, when you feed x to a linear regression model, is it x = data, y = data['target']

or x = data.drop(columns='target'), y = data['target']?

tawny spire Jan 28, 2023, 8:52 PM

#

thinking about it i'm coming round to the idea that it should be the latter >_>
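
A small sketch of the second option, where the target column is dropped from the features. The column names and values are invented, and the fit uses plain NumPy least squares as a stand-in; scikit-learn's `LinearRegression().fit(X, y)` would take the same `X` and `y`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; 'target' is the column we want to predict.
data = pd.DataFrame({
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [0.0, 1.0, 0.0, 1.0],
    "target":    [3.0, 6.0, 7.0, 10.0],
})

X = data.drop(columns="target")  # features only, so the answer does not leak into X
y = data["target"]

# Ordinary least squares with an intercept column, using plain NumPy.
A = np.column_stack([np.ones(len(X)), X.to_numpy()])
coef, *_ = np.linalg.lstsq(A, y.to_numpy(), rcond=None)
print(coef)  # intercept, then one weight per feature
```

Leaving the target inside X would let the model simply copy it, which gives a perfect fit on training data and nothing useful at prediction time.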

lapis sequoia Jan 28, 2023, 9:11 PM

#

hello, i scraped this data set from a website , but i cant figure out how to group all the symptoms of a disease in one data record and get rid of NaN values

tawny spire Jan 28, 2023, 9:29 PM

#

the data needs the same dimensions, or you need to restructure it

hasty mountain Jan 28, 2023, 9:56 PM

#

lapis sequoia hello, i scraped this data set from a website , but i cant figure out how to gro...

Try putting all the symptoms of a disease in the same cell
data = {'hypertensive_disease': {'Symptom': ['pain_chest', 'shortness_of_breath', ...]}}
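
One way this could look in pandas, as a sketch only. It assumes the scraped table has the disease name on the first row of each group and NaN on the rows below it; the column names `Disease` and `Symptom` and the rows are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical rows shaped like a scraped table where the disease name
# only appears on the first row of each group.
df = pd.DataFrame({
    "Disease": ["hypertensive_disease", np.nan, "diabetes", np.nan, np.nan],
    "Symptom": ["pain_chest", "shortness_of_breath", "polyuria", "polydypsia", np.nan],
})

df["Disease"] = df["Disease"].ffill()       # carry the disease name down
df = df.dropna(subset=["Symptom"])          # drop rows that have no symptom
grouped = (
    df.groupby("Disease", sort=False)["Symptom"]
      .agg(list)                            # one list of symptoms per disease
      .reset_index()
)
print(grouped)
```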

#

Also...did you get that data from National Library of Medicine?

lapis sequoia Jan 28, 2023, 9:57 PM

#

hasty mountain Also...did you get that data from National Library of Medicine?

https://people.dbmi.columbia.edu/~friedma/Projects/DiseaseSymptomKB/index.html

hasty mountain Jan 28, 2023, 9:57 PM

#

Thanks!

cosmic lynx Jan 29, 2023, 5:16 AM

#

How big of a jump is making a digit reader with a guide to an AI that plays Nim?

balmy junco Jan 29, 2023, 5:45 AM

#

I'm using TCN model from darts and I'm getting an odd error. If anybody could offer any advice, it would be much appreciated. My error is confusing to me because I think my data does extend this far back.
ValueError: For the given forecasting case, the provided past covariates at dataset index 0 do not extend far enough into the past. The past covariates must start at time step 2022-08-09 14:20:00, whereas now they start at time step 2022-08-09 14:51:00.

This occurs when I try to train the model after splitting the data.


row_count = df.shape[0]
last_train_row = int(0.8 * row_count)  
last_train_ts = df.iloc[last_train_row]['timestamp']  # This row has the timestamp of 2022-08-09 14:50:00
val_ts_data, train_ts_data = ts.split_after(last_train_ts)
val_cov_data, train_cov_data = covariates_scaled.split_after(last_train_ts)

tcn = TCNModel(input_chunk_length=31, output_chunk_length=30, n_epochs=10, dropout=0.1, dilation_base=2, weight_norm=True, kernel_size=5, num_filters=3, random_state=0, pl_trainer_kwargs={'accelerator': 'gpu', 'devices':[0]})
tcn.fit(series=train_ts_data, past_covariates=covariates_scaled, verbose=True)

Please let me know if I am overlooking something.

velvet rampart Jan 29, 2023, 9:12 AM

#

Please is it possible to produce a plot in pandas without matplotlib

proper mauve Jan 29, 2023, 10:19 AM

#

velvet rampart Please is it possible to produce a plot in pandas without matplotlib

You can use plotly (pandas can also use it as a plotting backend). seaborn works too, but it is built on top of matplotlib.

tawdry sand Jan 29, 2023, 10:53 AM

#

hi, does anyone know to fix this?

lavish kraken Jan 29, 2023, 1:32 PM

#

lapis sequoia https://people.dbmi.columbia.edu/~friedma/Projects/DiseaseSymptomKB/index.html

What library did you use?

lavish kraken Jan 29, 2023, 1:38 PM

#

prime hearth Yeah i do similar steps look, but 2nd step I also check for any correlation that...

Do you have similar tutorial for this? Share link if you know

hasty mountain Jan 29, 2023, 1:42 PM

#

cosmic lynx How big of a jump is making a digit reader with a guide to an AI that plays Nim?

A big one...but one that might be interesting to try

#

A digit reader is an AI that extracts features from an image and draws a conclusion from them.
An AI that plays a game is an AI that extracts features from a game state (which can be an image), takes an action (draws a conclusion) based on that, and checks whether that action was good or bad.

The problem is that there are some tricks involved in "deciding if that action was good or bad". AIs that play games (Reinforcement Learning) need some tricks to backpropagate based on the consequences of their actions (if you're using a Neural Network), and they're a bit unstable and might require gradient clipping or HuberLoss (which behaves like Mean Squared Error for small errors and like Mean Absolute Error for large ones)
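
For reference, a NumPy sketch of the Huber loss mentioned above, using the standard definition with a threshold `delta`. The function name and the sample errors are only for illustration.

```python
import numpy as np

def huber_loss(error, delta=1.0):
    """Quadratic for |error| <= delta, linear beyond that."""
    abs_err = np.abs(error)
    quadratic = 0.5 * error ** 2                 # MSE-like region
    linear = delta * (abs_err - 0.5 * delta)     # MAE-like region
    return np.where(abs_err <= delta, quadratic, linear)

errors = np.array([0.5, 3.0])
losses = huber_loss(errors)
print(losses)  # small error is squared, large error grows only linearly
```

Because large errors grow linearly, a single bad estimate produces a bounded gradient, which is why the loss is often paired with gradient clipping in value-based reinforcement learning.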

#

And you also have to choose how the reward function (the feedback) will work. Otherwise, you might get specification gaming...or might get vanishing gradients while the model can still get better

#

You could also not use Neural Networks for this and use a Q-Learning algorithm...but then it would be completely different from digit reader...
and there would be no magic at all...no fun
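
A rough sketch of what the tabular Q-learning route could look like, with no neural network involved. It assumes a simplified single-pile Nim (take 1 to 3 stones, whoever takes the last stone wins) and trains by self-play; the function names, hyperparameters, and reward scheme are illustrative choices, not a reference implementation.

```python
import random

def legal_moves(stones, max_take=3):
    """Moves available with `stones` left: take 1..max_take, but not more than remain."""
    return range(1, min(max_take, stones) + 1)

def train_nim(pile=10, max_take=3, episodes=20000, alpha=0.1, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # q[(stones, take)] = estimated value for the player about to move
    q = {(s, a): 0.0 for s in range(1, pile + 1) for a in legal_moves(s, max_take)}
    for _ in range(episodes):
        stones = pile
        while stones > 0:
            moves = list(legal_moves(stones, max_take))
            if rng.random() < epsilon:
                take = rng.choice(moves)                          # explore
            else:
                take = max(moves, key=lambda a: q[(stones, a)])   # exploit
            left = stones - take
            if left == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent moves next, so their best value counts against us.
                target = -max(q[(left, a)] for a in legal_moves(left, max_take))
            q[(stones, take)] += alpha * (target - q[(stones, take)])
            stones = left
    return q

def best_move(q, stones, max_take=3):
    """Greedy move according to the learned table."""
    return max(legal_moves(stones, max_take), key=lambda a: q[(stones, a)])

q_table = train_nim()
print({s: best_move(q_table, s) for s in range(1, 11)})
```

The reward function here is the simplest possible one (1 for the winning move), which is exactly the design decision the message above warns about: a different reward would lead the table to learn a different behavior.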

dapper ibex Jan 29, 2023, 2:50 PM

#

tawdry sand hi, does anyone know to fix this?

You don't have a function called plot accuracy. Also #❓｜how-to-get-help

feral hedge Jan 29, 2023, 3:05 PM

#

I want to train my GAN model on 100 - 1000 epochs of my dataset. My current issue is that one epoch takes 8 hours. 1 epoch = 450k generated images @ 256x256
So 100 epochs will take over 30 days. 1000 epochs would take like a year. I'm using a 1050 GPU that pretty much is the minimum you would need for running something like this. It has 4GB memory. So from what I can tell my current options are: buy a better GPU, or look in to data centers that can supply GPU power. I recently heard of neural processors, but I just looked at the nvidia's TESLA V100 and its like $10,000... the TESLA P100 is $5k.

do you have any advice for renting GPU in datacenters? rough estimate cost for what i've described?

and if I were to buy a piece of hardware for ML, don't care about gaming - which GPU or NPU should I go with? is there any like "go-to"s for different price ranges?
$500 gpu will take X many hours for 100 epochs? (estimate)
$1000 gpu will take X many hours for 100 epochs?
hard question to answer but the context is my 1050 is taking 8 hours per epoch.

cosmic lynx Jan 29, 2023, 3:08 PM

#

hasty mountain And you also have to chose how the reward function(the feedback) will work. Othe...

Okay thanks

hasty mountain Jan 29, 2023, 3:23 PM

#

feral hedge I want to train my GAN model on 100 - 1000 epochs of my dataset. My current issu...

Go for Google Colaboratory, Paperspace Gradient, Amazon SageMaker or even Kaggle

#

They have free GPUs, usually a Tesla T4. If your dataset is too big (which I suppose it is, since it's 450k images), you'll just have to pay for storage.

If you really want to buy a datacenter GPU, you can at least have an idea on how some GPUs will perform on your dataset.

#

Also... you may want to consider simply using Iterations as a metric instead of epochs. At least this is what I see people doing when a single epoch takes too much time(or at least what I see in Diffusion Models)

feral hedge Jan 29, 2023, 3:46 PM

#

hasty mountain Go for Google Colaboratory, Gradient's Paperspace, Amazon SageMaker of even Kagg...

Thanks for the info. I'll check out those dc options as thats probably the best first bet. And could you elaborate on a little bit about using iterations as a metric? I currently am aiming for 100-1000 epochs because from what I've read the model will improve each time. currently, after 1 epoch my model produces images that I can tell have "learned" at least something based on the dataset, but no realistic distinguishable objects. I use a batch size of 16 and calculate loss / optimize after each batch of 16.

hasty mountain Jan 29, 2023, 3:47 PM

#

feral hedge Thanks for the info. I'll check out those dc options as thats probably the best ...

Each iteration can be each time your model receives an input and generates an output

feral hedge Jan 29, 2023, 3:47 PM

#

yea, so each batch / iteration of the training loop

hasty mountain Jan 29, 2023, 3:48 PM

#

Yep. You're optimizing it after each batch, or after each iteration

feral hedge Jan 29, 2023, 3:48 PM

#

mhm

hasty mountain Jan 29, 2023, 3:48 PM

#

You could then simply try checking your model after N iterations instead of waiting for too many epochs
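
The numbers from this conversation can be turned into a quick back-of-the-envelope calculation. The image count, batch size, and hours per epoch come from the messages above; the checkpoint interval of 2000 iterations and the helper name are arbitrary assumptions.

```python
images_per_epoch = 450_000   # from the message above
batch_size = 16              # from the message above
hours_per_epoch = 8          # measured on the 1050

iters_per_epoch = images_per_epoch // batch_size              # 28125
seconds_per_iter = hours_per_epoch * 3600 / iters_per_epoch   # about 1.02 s

def should_checkpoint(iteration, every=2000):
    """Hypothetical helper: True once every `every` iterations."""
    return iteration > 0 and iteration % every == 0

checkpoints_per_epoch = sum(
    should_checkpoint(i) for i in range(1, iters_per_epoch + 1)
)
minutes_between_checkpoints = 2000 * seconds_per_iter / 60

print(iters_per_epoch, round(seconds_per_iter, 3), checkpoints_per_epoch)
```

With these assumptions the model could be inspected roughly every half hour instead of once every 8 hours.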

feral hedge Jan 29, 2023, 3:50 PM

#

true, but I don't see any good results after a single epoch, the loss is way too high thats why I wanted the GPU power to train faster and more epochs. I think I get what you mean just kind of depends on how fast it's learning.

#

after this segment of the dataset is done I'll check the model and see what it can recognize

hasty mountain Jan 29, 2023, 3:53 PM

#

What is the highest number of channels you're using in your convs and transconvs?

#

Convolution and Transposed Convolution layers tend to make the process waaaay slower if you use too many channels

#

When I tested a DCGAN, I did with 100 channels decreasing until 3 channels. But I couldn't make RGB images. Only with 1000 decreasing until 200 and then 3 the model would generate proper images.
But this made the process much, much slower.

feral hedge Jan 29, 2023, 4:21 PM

#

hasty mountain What is the highest number of channels you're using in your convs and transconvs...

for my generator, the first transposed conv takes 228 channels which are the embeds (128) and random noise (100). 2048 out channels. 6 more transposed convolutions reducing the channels to 3. that being said, I am still learning and this is just hobby to me. there are some things I want to break down and learn more about once I get somewhat of a working solution that I can study more. this generator is based on an older model from around 2017 on github. the original model only trained on and generated 64x46 images. I learned to produce the 256x256 images by learning ConvTranspose2d (still lots to learn) and modifying/creating additional convolutions to result in 256x256 rgb images on an entirely different dataset which I preprocessed in line with the model.

hasty mountain Jan 29, 2023, 4:23 PM

#

Hm... 6 transposed convolutions... Seems similar to a DCGAN...

feral hedge Jan 29, 2023, 4:24 PM

#

the original project was based on another DCGAN

hasty mountain Jan 29, 2023, 4:24 PM

#

But that's the problem... It deals with too many channels, which makes the process too slow.
Of course, more channels might help generating better images, but there's the downside of making it too slow.
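
To put a number on the channel cost: the weight count of a 2D convolution or transposed convolution grows with in_channels * out_channels * kernel_size^2. The sketch below uses the 228 and 2048 channels mentioned above; the 4x4 kernel and the smaller comparison layer are assumptions.

```python
def conv_transpose_weights(in_channels, out_channels, kernel_size):
    """Weight count of a 2D (transposed) convolution, ignoring the bias."""
    return in_channels * out_channels * kernel_size * kernel_size

# First generator layer described above, assuming a 4x4 kernel.
first_layer = conv_transpose_weights(228, 2048, 4)

# A hypothetical smaller layer for comparison.
small_layer = conv_transpose_weights(100, 512, 4)

print(first_layer, small_layer, round(first_layer / small_layer, 1))
```

Compute per forward pass scales with the same product multiplied by the output resolution, so wide layers at high resolution are the expensive ones.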

#

Also... Be careful with big images(bigger than 100x100). GANs that try generating big images are usually way more unstable...and GANs are too unstable by themselves.
Yet, it might be possible to generate big images with good quality...as Progressive Growing (ProGAN) demonstrated...

feral hedge Jan 29, 2023, 4:31 PM

#

hmm.. do you think it is necessarily important to fully understand GANs before diffusion models? considering no other ML experience. I guess it doesn't matter too much.

would you recommend using diffusion for larger images, and is this slower as many suggest?

feral hedge Jan 29, 2023, 4:32 PM

#

hasty mountain But that's the problem... It deals with too many channels, which makes the proce...

re: ^

hasty mountain Jan 29, 2023, 4:33 PM

#

feral hedge hmm.. do you think it is necessarily important to fully understand GANs before d...

I don't think fully understanding GANs is necessary before studying diffusion models.

But...yes, diffusion models are absurdly slow

#

I suppose that's even the motive to why you'll have some hard time trying to find a decent tutorial out there. Most tutorials are just "the diffusion models work like this:" then enters tons of math equations, poorly explains what each variable represents.
Then finishes with: "now let's make our own diffusion model. Install Hugging Face, import stable diffusion with pretrained weights and enjoy"

#

I was trying to make a diffusion model some days ago, but I gave up because it takes so much time to get some results that I don't know if I'm doing things correctly or simply being impatient

feral hedge Jan 29, 2023, 4:36 PM

#

wow. it does seem like the setback is computing availability. maybe once NPUs are more common @ home the price will drop to that of a GPU or less. they put them in some laptops/macbooks.

#

unfortunate 😦

hasty mountain Jan 29, 2023, 4:37 PM

#

feral hedge hmm.. do you think it is necessarily important to fully understand GANs before d...

Also, Diffusion Models might also get some difficulty for larger images. There was a Guided Diffusion, a model that came before Stable Diffusion and that surpassed BigGAN(which was state-of-the-art around 2020). And it required a SuperResolution model to make its 64x64 images become 256x256

feral hedge Jan 29, 2023, 4:37 PM

#

thats how i also feel even training this model

#

yea, i guess there is a lot to learn. and a lot of foreseeable issues when trying to do this by yourself in 2023 without enough resources

hasty mountain Jan 29, 2023, 4:39 PM

#

What I've been trying to do is try making a DCGAN, since it's the fundamental GAN model, make it work. When I get proper results, then I try to modify it in order to generate the outputs I desire.

#

Also...I'd recommend taking a look at the papers about SRGAN, ESRGAN and Real-ESRGAN. They teach many things about GANs, especially the ESRGAN paper.

#

For example...residual connections might be useful in GANs...but DCGAN is incompatible with residual connections, so it's necessary to use another architecture.

feral hedge Jan 29, 2023, 4:58 PM

#

cool, thanks for sharing those!

keen notch Jan 29, 2023, 5:29 PM

#

hey does anyone understand this error

#

Compilation is falling back to object mode WITH looplifting enabled because Function "run" failed type inference due to: Untyped global name 'electric_field': Cannot determine Numba type of <class 'function'>

arctic wedgeBOT Jan 29, 2023, 5:30 PM

#

Hey @keen notch!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

keen notch Jan 29, 2023, 5:30 PM

#

https://paste.pythondiscord.com/ajasojovih

#

how can i implement njit in the main functions

verbal venture Jan 29, 2023, 5:40 PM

#

Hey,

1. I'm looking to combine datasets that are of different input sizes. Should I scale the datasets to match 1 specific input size?
2. I'm also going to be using swaths of data (perhaps 10gb - maybe more). Am I able to host this data anywhere, or does it need to be stored locally on my device? I guess I can have 10gb worth of images but seems excessive.
3. Lastly, most image classification models I've seen are being trained on like 5 images. I was hoping to do my training on around 100k - why are more models not being done on quantities of that size?

I'm also a noob, so if anything obvious is missed that's why 🙂

mild dirge Jan 29, 2023, 5:57 PM

#

verbal venture Hey, 1. I'm looking to combine datasets that are of different input sizes. Shou...

1. I assume you mean image size here (?). In that case you could rescale them to the same size, or use a model that can take images of multiple sizes (fully convolutional f.e.)
2. You could maybe host it on a server, or even back it up on google drive I would imagine. But 10GB could also just easily be stored on your local device.
3. I assume you mean 5 mil, instead of 5. And it really depends on the task. I've trained classifiers on 20 images per class, so in that case 100k would be more than enough. But some other tasks...

#

More data is generally more good for the performance of the model

#

It allows the model to generalize better

verbal venture Jan 29, 2023, 5:58 PM

#

mild dirge 1. I assume you mean image size here (?). In that case you could rescale them to...

what do you mean by class?

mild dirge Jan 29, 2023, 5:58 PM

#

The model was used for classifications, so assigning a class to each image.

#

I.e. "This is an image of a tiger/lion/cat" etc.

#

Tiger, lion and cat would be examples of classes

hasty mountain Jan 29, 2023, 5:59 PM

#

10 GB can be stored in Google Drive, locally or in Amazon SageMaker

near void Jan 29, 2023, 6:00 PM

#

If i want to get into data science / AI, is it essential to know libraries like NumPy and Pandas?

hasty mountain Jan 29, 2023, 6:01 PM

#

near void If i want to get into data science / AI, is it essential to know libraries like ...

Numpy yes. Pandas I don't know, but will help

near void Jan 29, 2023, 6:01 PM

#

hasty mountain Numpy yes. Pandas I don't know, but will help

thanks

mild dirge Jan 29, 2023, 6:01 PM

#

For data science absolutely

#

pandas is def a must for ds

hasty mountain Jan 29, 2023, 6:01 PM

#

pithink

mild dirge Jan 29, 2023, 6:01 PM

#

If you plan on using python for it that is

near void Jan 29, 2023, 6:01 PM

#

mild dirge For data science absolutely

i'm still a little young, and for programming competitions are those libraries ever used

mild dirge Jan 29, 2023, 6:02 PM

#

If the data is cleaned for you then maybe not. But it doesn't hurt to at least know the basics of pandas.

hasty mountain Jan 29, 2023, 6:02 PM

#

I think the first time I used pandas for ML was...when I was trying to make a table for a project...

near void Jan 29, 2023, 6:02 PM

#

I'm currently prepping for the CCC lol and i'm wondering if there's any libraries for me to learn that would greatly benefit me

hasty mountain Jan 29, 2023, 6:02 PM

#

Oh, no...there was when I was studying NLP...

hasty mountain Jan 29, 2023, 6:03 PM

#

near void i'm still a little young, and for programming competitions are those libraries e...

For data science I don't know, but I often see people using numpy with tensorflow. There's also Pytorch, which uses many concepts from numpy

#

(In fact, I don't really know the difference between a pytorch tensor and a numpy array, besides the tensor being more easy to be allocated and assigned to graphs)

near void Jan 29, 2023, 6:04 PM

#

i'm gonna pretend i know what both of those are 🤣

mild dirge Jan 29, 2023, 6:05 PM

#

an array with multiple dimensions

#

I.e.
1d:
[1, 2, 3, 4]

2d:
[[1, 2],
[3, 4]]

3d:
[[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]]
etc.

near void Jan 29, 2023, 6:06 PM

#

mild dirge an array with multiple dimensions

like an array from java?

mild dirge Jan 29, 2023, 6:07 PM

#

Yeah probably. It's like a nested array.

#

Arrays within arrays.

near void Jan 29, 2023, 6:07 PM

#

yea ic

pine wolf Jan 29, 2023, 6:07 PM

#

numpy arrays are also ndimensional

verbal venture Jan 29, 2023, 6:07 PM

#

mild dirge 1. I assume you mean image size here (?). In that case you could rescale them to...

yeah I was thinking 100k per class

mild dirge Jan 29, 2023, 6:08 PM

#

Again, that may or may not be enough depending on the task and complexity of the data

verbal venture Jan 29, 2023, 6:08 PM

#

complexity of the data as in variabiliity?

hasty mountain Jan 29, 2023, 6:08 PM

#

pithink

pine wolf Jan 29, 2023, 6:08 PM

#

!e

import numpy as np
print(np.arange(27).reshape(3, 3, 3))

arctic wedgeBOT Jan 29, 2023, 6:08 PM

#

@pine wolf :white_check_mark: Your 3.11 eval job has completed with return code 0.

[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]

 [[18 19 20]
  [21 22 23]
  [24 25 26]]]

hasty mountain Jan 29, 2023, 6:08 PM

#

100k per class seems like an overkill

#

pithink

mild dirge Jan 29, 2023, 6:08 PM

#

Well for images the size f.e.

hasty mountain Jan 29, 2023, 6:08 PM

#

I guess not even CIFAR10 has that much

mild dirge Jan 29, 2023, 6:08 PM

#

A big image is more complex than a small image normally

hasty mountain Jan 29, 2023, 6:09 PM

#

LSUN...

verbal venture Jan 29, 2023, 6:09 PM

#

mild dirge A big image is more complex than a small image normally

so should I aim for big images?

agile cobalt Jan 29, 2023, 6:09 PM

#

for fine tuning, 100k should almost definitely be a giant overkill
training from scratch, perhaps not

mild dirge Jan 29, 2023, 6:09 PM

#

Depends on the task haha

#

What do you even want to do?

verbal venture Jan 29, 2023, 6:09 PM

#

it's an image classifier for cancer

#

but the datasets I'm seeing are like 10gb in size

near void Jan 29, 2023, 6:09 PM

#

pine wolf numpy arrays are also ndimensional

aren't python lists the same thing?

hasty mountain Jan 29, 2023, 6:09 PM

#

Oh, then it might not be an overkill

pine wolf Jan 29, 2023, 6:10 PM

#

near void aren't python lists the same thing?

you can nest python lists, but they aren't the same as np.arrays

mild dirge Jan 29, 2023, 6:10 PM

#

Is it a 2d image, or like an mri scan?

verbal venture Jan 29, 2023, 6:10 PM

#

2d image

mild dirge Jan 29, 2023, 6:11 PM

#

Are you planning on really getting into machine learning and data science and stuff? Or just a fun project for now?

wooden sail Jan 29, 2023, 6:11 PM

#

near void aren't python lists the same thing?

numpy lets you do maths on the arrays quite quickly. especially for (multi)linear transformations, it uses blas/lapack for speedy computations. doing these operations on python directly is slow by comparison
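
A tiny illustration of the difference, with made-up values: the nested list needs explicit loops for arithmetic, while the ndarray applies the operation to every element at once and supports matrix products directly.

```python
import numpy as np

nested = [[1, 2], [3, 4]]
arr = np.array(nested)

# Plain lists: arithmetic has to be written out element by element.
doubled_list = [[2 * v for v in row] for row in nested]

# ndarray: one vectorized operation over the whole array.
doubled_arr = arr * 2

# Note that `nested * 2` would repeat the outer list instead of doubling values.
repeated = nested * 2

# Matrix product, handed off to optimized linear algebra routines.
product = arr @ arr

print(doubled_arr.tolist(), product.tolist())
```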

mild dirge Jan 29, 2023, 6:11 PM

#

Because if you really want to learn how it works and be able to use it well, you might first want to start with the (for some) more boring stuff.

#

Like statistics, and linear algebra and stuff.

verbal venture Jan 29, 2023, 6:11 PM

#

I wouldn't really call it a fun project, but I'm not trying to spend months on this thing

#

if I need to tho don't care. my lin alg and soft eng skills are good

mild dirge Jan 29, 2023, 6:13 PM

#

Implementing a model isn't too hard. Plenty of examples online. If it works, then that's great. But if it doesn't, and you don't really understand why the models work as they do, it could be hard to get better performance.

verbal venture Jan 29, 2023, 6:13 PM

#

I was considering using coreML because it seems plug and play-ish

#

and I was building this for iOS

#

I can learn this thing from scratch, I guess I might need to anyway, but I'd like a roadmap. so far I've done a few CNN tutorials and they don't explain shit, lmao

mild dirge Jan 29, 2023, 6:15 PM

#

Yeah online tutorials have nice colorful images and a very general explanation. But they almost never go in-depth.

#

You'd really need to read a book for that ime.

mild dirge Jan 29, 2023, 6:16 PM

#

mild dirge Yeah online tutorials have nice colorful images and a very general explanation. ...

Talking about the bloggy type of tutorials here btw

hasty mountain Jan 29, 2023, 6:16 PM

#

There might be some coursera courses that explain those things better, too...with some math included...

#

But idk if there's that much to see... The model must be able to extract features, with Convolutions being more effective at extracting relations between neighboring pixels...downsampling making the lower convolutions extract global relations...

wooden sail Jan 29, 2023, 6:17 PM

#

andrew ng has nice coursera courses where they don't just brush the math aside

hasty mountain Jan 29, 2023, 6:17 PM

#

VGG architecture might be interesting to learn about that, since it uses this idea

hasty mountain Jan 29, 2023, 6:18 PM

#

wooden sail andrew ng has nice coursera courses where they don't just brush the math aside

I didn't know he had more courses than the one about Transformer pithink

#

interesting

wooden sail Jan 29, 2023, 6:18 PM

#

coursera is a nice pick since you can get the certificates for free if you apply for financial support saying you're a student (if you're a student, ofc)

verbal venture Jan 29, 2023, 6:18 PM

#

I'm going to just plug into a tf model for now I think. so what's the general advice regarding combining datasets of different input sizes

verbal venture Jan 29, 2023, 6:19 PM

#

wooden sail coursera is a nice pick since you can get the certificates for free if you apply...

yeah I did a few andrew ng courses and ngl he doesn't explain anything really meaningful

#

they're all beginner things which you can learn in an hour but none of them really explain why things work and why they don't work

#

which is the bread and butter

#

they're just like "yeah this is CNN classification do x do y and now you have a successful model"

wooden sail Jan 29, 2023, 6:20 PM

#

well, that's my impression from glossing over a few of his videos which do go over the math, but i read from papers usually

#

that sounds like a bad course, indeed

#

you got a dud 😛

verbal venture Jan 29, 2023, 6:20 PM

#

well to really know what he's talking about you need like a 50-60h course and not a 2h one, lol

hasty mountain Jan 29, 2023, 6:20 PM

#

Try taking a look at VGG paper and ResNet paper, then

#

Inception, YOLO...

#

What's interesting about those papers is that they sometimes provide a mini-class in the Introduction section

verbal venture Jan 29, 2023, 6:21 PM

#

okay. is there any advice regarding combining the datasets?

wooden sail Jan 29, 2023, 6:23 PM

#

rescale at runtime as needed. keras and pytorch have preprocessing functions for this

verbal venture Jan 29, 2023, 6:23 PM

#

if I rescale to the same size won't I lose a lot of the features?

wooden sail Jan 29, 2023, 6:25 PM

#

that's an interesting question. in general, if you downsample, you lose information, yes. to prevent aliasing, you have to lowpass filter as you downsample. as for upsampling, you can't generate new info from nothing. you can upscale by enforcing priors, though, like the knowledge that most images are sparse in some domain. incorporating this makes the upscaling expensive though, and is usually not done when upscaling on-the-fly.

#

on the other hand, it also does imply that you usually don't lose THAT much info by downsampling, unless you do it too aggressively

#

so the TL;DR is: try out the preprocessing functions that rescale images for you
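
The aliasing point can be seen in a NumPy-only sketch. This is not the keras or pytorch preprocessing itself, just an illustration with hypothetical helper names: naive striding throws the pattern away, while averaging blocks first (a crude low-pass filter) keeps the mean brightness.

```python
import numpy as np

def downsample_naive(img, factor):
    """Keep every `factor`-th pixel (no filtering, prone to aliasing)."""
    return img[::factor, ::factor]

def downsample_mean(img, factor):
    """Average each factor x factor block, i.e. low-pass filter, then subsample."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor        # crop to a multiple of factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# A 4x4 checkerboard: the finest pattern this grid can hold.
checker = np.indices((4, 4)).sum(axis=0) % 2

naive = downsample_naive(checker, 2)    # only ever samples the 0 squares
smooth = downsample_mean(checker, 2)    # every block averages to 0.5

print(naive.tolist(), smooth.tolist())
```

Library resize functions typically offer interpolation modes that do this kind of filtering, which is why using them is usually preferable to slicing arrays by hand.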

near void Jan 29, 2023, 6:28 PM

#

wooden sail numpy lets you do maths on the arrays quite quickly. especially for (multi)linea...

ic, would i be able to use numpy with computing competitions though? It isn't a built in python library right

wooden sail Jan 29, 2023, 6:28 PM

#

that i don't know. it isn't built-in, and i'd guess it depends on each competition's rules

nocturne eagle Jan 29, 2023, 6:43 PM

#

near void ic, would i be able to use numpy with computing competitions though? It isn't a ...

that would depend on the competition, don't you think?

mild dirge Jan 29, 2023, 7:10 PM

#

wooden sail that i don't know. it isn't built-in, and i'd guess it depends on each competiti...

I really doubt people would be forced to make million parameter neural networks with python for loops 👀

#

I hope at least

wooden sail Jan 29, 2023, 7:14 PM

#

yeah that'd be my hope as well lol

gloomy anvil Jan 29, 2023, 7:38 PM

#

Hello,

I need some help understanding how to forecast non-stationary timeseries with a VAR. I have a dataset that also has non-stationary asset prices that I want to predict. But a VAR needs stationary data, right? So I used augmented Dickey Fuller test to determine stationarity and differenced the non-stationary timeseries to turn them into I(0) stationary timeseries, which is basically the same as the daily absolute change which was also one feature in my full dataset (eliminated it of course).

So now I have only stationary timeseries. I fit the VAR and use it to forecast the next periods of the price or should I rather say: the absolute price change. So far so good.

Here is the thing though: I trained a few LSTMs with that very same dataset and here I do not have any problems with stationarity. I trained LSTMs to predict the actual price points of the assets and I also trained LSTMs that predict the relative price change in %.

So I would like to compare results of my LSTMs vs. the VAR models. If I create a stationary timeseries of the relative price change just like I did for the LSTMs, I could easily compare results of both types of models. But could I compare the very same results also to the LSTMs that predict the non-stationary prices? I hope you understand my dilemma: In my eyes absolute price change and relative price change are more or less the same. The actual price is in my eyes different though. Could I still compare an R2 score of my VAR that predicts absolute price change to an R2 score of an LSTM that predicts actual prices? And then likewise VAR that predicts relative price change to an LSTM that predicts relative price change?

And lastly, wouldn't a VAR that predicts absolute price change and a VAR that predicts relative price change basically yield the same results?

Sorry for asking these basic questions but I want to understand how to handle this hurdle of stationarity when comparing models.
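
One point that may help here: an R2 computed on price levels and an R2 computed on price changes are measured against different baselines, so they are not directly comparable. A common workaround is to map every model's forecast back to the same quantity before scoring. The sketch below shows the transformations with made-up prices and made-up one-step-ahead predictions.

```python
import numpy as np

prices = np.array([100.0, 102.0, 101.0, 105.0])   # made-up price series

abs_change = np.diff(prices)                 # first difference
rel_change = abs_change / prices[:-1]        # relative (percentage) change

# Both transformations can be inverted back to price levels.
from_abs = prices[0] + np.cumsum(abs_change)
from_rel = prices[0] * np.cumprod(1.0 + rel_change)

# One-step-ahead forecasts of the change become price forecasts by
# adding them to the previous observed price (hypothetical predictions).
pred_abs_change = np.array([1.5, 0.0, 3.0])
pred_prices = prices[:-1] + pred_abs_change

print(from_abs, from_rel, pred_prices)
```

Absolute and relative changes are related through division by the previous price, which is not a linear transformation, so a VAR fitted on one will generally not give identical forecasts to a VAR fitted on the other.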

tropic tiger Jan 29, 2023, 9:02 PM

#

Hey guys, I have a project from a course and it asks for exploring new capabilities of visualization using Python. Do you guys have any suggestions? Mainly I want to write code to perform them rather than using the platform myself for learning experiences.

Ideally I want it to be something that's useful and interesting/challenging enough to keep my brain going.

If you answer my question after few minutes plz tag me so I can get a notification. Thanks

cursive totem Jan 29, 2023, 9:40 PM

#

can someone check out my post in #1035199133436354600

serene scaffold Jan 29, 2023, 9:51 PM

#

cursive totem can someone check out my post in <#1035199133436354600>

You can at least link it and say what it's about

cursive totem Jan 29, 2023, 9:53 PM

#

#1069368344060371054

mild dirge Jan 29, 2023, 10:00 PM

#

I'm already helping but there is not enough info provided to be able to help

#

It's a key error

frail umbra Jan 29, 2023, 10:11 PM

#

tawdry sand hi, does anyone know to fix this?

You need to define sklearn dictionary key in accuracy_etc variable like an empty list instead of a dict

prime hearth Jan 29, 2023, 10:47 PM

#

hello, how can i please use tfidfvectorizer on new data?

#

for example i used tfidfvectorizer to transform some text into numerical values. But to predict new string values i'm not sure how to transform it into tfidf, maybe only bag of words i can see doing this

#

i cant do tfidf.fit_transform as it will add new dimensions

serene scaffold Jan 29, 2023, 10:55 PM

#

@prime hearth I can look into this more in a few hours, but it might be that you need documents containing all the tokens you ever plan to account for at the one time that you fit it.

#

Is there a way to see which row or column represents what token?

#data-science-and-ml

Create a new dataframe

Assign a unique id to each row