#data-science-and-ml | Python | Page 418

latent glacier Jul 7, 2022, 5:59 AM

#

okay thanks!!

young ridge Jul 7, 2022, 6:33 AM

#

Hello guys could anyone help me out with correlation coefficients in python

#

Short background, I need to do correlation analysis using ordinal data

#

but the thing is if i do spearman's correlation analysis

#

I'm not sure if it's possible to use Spearman's correlation with multiple ordinal columns

#

I've done some research but so far I haven't found anyone doing it with multiple ordinal columns

#

any advice?

#

or how do I calculate the correlation between multiple ordinal variables?
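
Spearman's correlation is defined for a pair of variables, so with several ordinal columns the usual approach is a pairwise correlation matrix. A minimal sketch, assuming the ordinal levels are already encoded as integers; the column names and values below are made up for illustration:

```python
import pandas as pd

# Hypothetical survey data: each column is an ordinal rating from 1 (low) to 5 (high).
ratings = pd.DataFrame(
    {
        "satisfaction": [1, 2, 3, 4, 5, 3],
        "ease_of_use": [2, 2, 3, 5, 5, 4],
        "price_rating": [5, 4, 3, 2, 1, 3],
    }
)

# Pairwise Spearman rank correlation between every pair of columns.
spearman_matrix = ratings.corr(method="spearman")
print(spearman_matrix.round(2))
```

Each cell of the result is the Spearman coefficient for one pair of columns, so the matrix is symmetric with ones on the diagonal.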

upbeat furnace Jul 7, 2022, 8:13 AM

#

Hi, I'm trying to install wget into site-packages on my D drive using !pip install wget. However, my Python is located on my C drive. Is it possible to install wget on the D drive instead?

#

serene scaffold Jul 7, 2022, 8:27 AM

#

@upbeat furnace I've only seen wget as a bash command. Not as a python package

unique flame Jul 7, 2022, 9:09 AM

#

IMO exams are there to test your ability to handle tough situations by putting you under a lot of stress, similar to actual life events. Surely you have met people who lost their cool or stayed home when the situation needed them. I for one would hate to work with such a person.

serene scaffold Jul 7, 2022, 9:16 AM

#

unique flame IMO exams are there to test your ability to handle tough situations by putting y...

I think exams are intended to make it easy for the instructor to assign grades to everyone. What you've said sounds like a retroactive justification for what I see as evidence of their failure

#

Titles are often arbitrary. A "data scientist" might still write papers.

versed gulch Jul 7, 2022, 9:18 AM

#

Hi guys,

Is there a way I can create a new column based on the x column in my dataframe that holds the difference between consecutive rows, i.e. the new column will be row1-row0 of the x column and so on?

serene scaffold Jul 7, 2022, 9:37 AM

#

@versed gulch yes, you would just do df['new'] = df['a'] - df['b']

#

If you're trying to do operations between elements of the same column, please be more specific about the expected input and output

versed gulch Jul 7, 2022, 9:41 AM

#

i.e. I want the first value to be Nan then do 65388.270 - 64624.593 and so on

serene scaffold Jul 7, 2022, 9:45 AM

#

versed gulch i.e. I want the first value to be Nan then do 65388.270 - 64624.593 and so on

What would the next one be? Two elements isn't enough to establish the rule

versed gulch Jul 7, 2022, 9:47 AM

#

then 66151.947-65388.270

serene scaffold Jul 7, 2022, 9:48 AM

#

@versed gulch look into rolling

#

Or even just diff
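
For the row-to-row difference described above, `Series.diff()` does this directly. A small sketch using the three values quoted in the conversation; the column name `x_diff` is just a placeholder:

```python
import pandas as pd

# The three x values mentioned in the chat.
df = pd.DataFrame({"x": [64624.593, 65388.270, 66151.947]})

# diff() subtracts the previous row from the current one; the first row has
# no predecessor, so it becomes NaN.
df["x_diff"] = df["x"].diff()
print(df)
```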

versed gulch Jul 7, 2022, 9:50 AM

#

ok thanks

#

Also is there a way to take the tile numbers that have the same y values and put them into a list?

steady basalt Jul 7, 2022, 11:09 AM

#

would anyone like to help me put a bunch of dicts and lists into a dataframe? I have the process done for one df, but instead of doing it 10 times I wanna do it in a loop in a single cell

#

and my brain's getting overloaded

pulsar cosmos Jul 7, 2022, 1:19 PM

#

Hey, has anyone worked with Celonis / pycelonis yet and encountered an issue?

#

I'm wondering if it is just a bug or I made a mistake in my query

steady basalt Jul 7, 2022, 2:08 PM

#

does anyone know why interpolating doesn't work on a list of dataframe objects?

#

let's say I have a list of dataframes [pd.DataFrame(list[0]), pd.DataFrame(list[1]), etc.], why doesn't for i in range(len(dataframes)): dataframes[i]['col'] = dataframes[i]['col'].interpolate(method='linear') work?

#

does pandas not create the dataframe as an object when reading from lists?

#

in fact, it's replacing them with None
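
The actual cause was never posted (the issue was reported as fixed shortly after), so the following is only a sketch of the pattern that normally works. One common way to end up with `None` is assigning the return value of a call made with `inplace=True`, since in-place methods return `None`. The input data here is invented:

```python
import numpy as np
import pandas as pd

# Hypothetical raw inputs: one dict of lists per dataframe.
raw = [
    {"col": [1.0, np.nan, 3.0]},
    {"col": [10.0, np.nan, np.nan, 40.0]},
]
dataframes = [pd.DataFrame(item) for item in raw]

# Assign the interpolated column back. Do not combine assignment with
# inplace=True, because in-place calls return None.
for frame in dataframes:
    frame["col"] = frame["col"].interpolate(method="linear")

print(dataframes[1])
```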

sour tide Jul 7, 2022, 2:20 PM

#

hello folks... so I have this dict data with me.. just wanted to know how to divide this into two parts, preferably 70:30

```
  0.         0.04303315 0.03849002 0.         0.        ]
 [0.         1.         0.         0.12309149 0.         0.
  0.         0.         0.         0.05913124 0.        ]
 [0.         0.         1.         0.         0.         0.
  0.         0.         0.         0.         0.        ]
 [0.         0.12309149 0.         1.         0.07216878 0.
  0.         0.         0.         0.         0.        ]
 [0.         0.         0.         0.07216878 1.         0.06454972
  0.         0.         0.         0.         0.        ]
 [0.0496904  0.         0.         0.         0.06454972 1.
  0.         0.         0.         0.         0.14322297]
 [0.         0.         0.         0.         0.         0.
  1.         0.         0.         0.         0.        ]
 [0.04303315 0.         0.         0.         0.         0.
  0.         1.         0.04472136 0.06201737 0.        ]
 [0.03849002 0.         0.         0.         0.         0.
  0.         0.04472136 1.         0.         0.05547002]
 [0.         0.05913124 0.         0.         0.         0.
  0.         0.06201737 0.         1.         0.        ]
 [0.         0.         0.         0.         0.         0.14322297
  0.         0.         0.05547002 0.         1.        ]]```

steady basalt Jul 7, 2022, 2:20 PM

#

nvm fixed

rotund dock Jul 7, 2022, 2:21 PM

#

Hi all! Anyone familiar with sympy? I'm trying to solve an integral and need some help

lapis sequoia Jul 7, 2022, 2:32 PM

#

How would i go about training a TTS model to replicate someone's voice, i have essentially a infinite amount of training data as the person is locked in my basement, and i'm just looking for a framework or a good guide.

The goal is not to make any TTS, the goal is to replicate the person's voice

sour tide Jul 7, 2022, 2:33 PM

#

sour tide helloo gfolks...so i have this dict data with me ..just wanted to know how to d...

Simply just wanna split this data into two parts and store it.. may I know how? [0. 0. 0. 0. 0. 0. 0. 0.0766965 0.14142136 0. 1. 0.08164966] [0. 0. 0. 0. 0. 0. 0. 0.0766965 0.14142136 0. 1. 0.08164966]

#

its stored in a variable

steady basalt Jul 7, 2022, 2:34 PM

#

lapis sequoia How would i go about training a TTS model to replicate someone's voice, i have e...

MonkaS

steady basalt Jul 7, 2022, 2:34 PM

#

sour tide its stored in a variable

Yes just split it by index

lapis sequoia Jul 7, 2022, 2:35 PM

#

steady basalt MonkaS

I assume you're talking about the emote

steady basalt Jul 7, 2022, 2:35 PM

#

A = thelist[lenlist/5] would give u the first 20% right?

#

Sorry I missed a :

#

Put the colon before

#

Lenlist

sour tide Jul 7, 2022, 2:36 PM

#

steady basalt A = thelist[lenlist/5] would give u the first 20% right?

so I have this stored in a variable called similarity.. so how do I put this here?

astral parrot Jul 7, 2022, 2:36 PM

#

can we use python to make ai

steady basalt Jul 7, 2022, 2:36 PM

#

To split it in half you index as yourlist[:len(yourlist)//2]
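
Slice indices have to be integers, so in Python 3 the length should be divided with `//` rather than `/`. A short sketch of a 70:30 split by index; the helper name `split_by_percent` and the sample data are placeholders, with `similarity` standing in for the variable mentioned above:

```python
def split_by_percent(items, percent):
    """Split a sequence into the first `percent` percent and the remainder."""
    cut = len(items) * percent // 100  # integer arithmetic keeps the index an int
    return items[:cut], items[cut:]

similarity = list(range(10))  # placeholder data
first_part, second_part = split_by_percent(similarity, 70)
print(first_part, second_part)
```

The same slicing works on a NumPy array, where it splits along the first axis (rows).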

sour tide Jul 7, 2022, 2:36 PM

#

sorry, it's my first time doing Python so I'm new to this

steady basalt Jul 7, 2022, 2:37 PM

#

I don’t understand what u want

#

I just gave u the code

sour tide Jul 7, 2022, 2:37 PM

#

steady basalt I just gave u the code

oh sorry, I mean where do I put my variable which I store the list in

sour tide Jul 7, 2022, 2:38 PM

#

steady basalt A = thelist[lenlist/5] would give u the first 20% right?

thelist will be that variable right?

steady basalt Jul 7, 2022, 2:38 PM

#

You store variables in python by just saying x = 1

#

So firsthalf = thecodeigaveyou means first half is that first half list

astral parrot Jul 7, 2022, 2:38 PM

#

can we use python to make ai

steady basalt Jul 7, 2022, 2:39 PM

#

Holy shit chat today

agile cobalt Jul 7, 2022, 2:39 PM

#

astral parrot can we use python to make ai

"make ai" is an extremely oversimplified way of putting it, that doesn't really matches the reality, but yes, Python is used for working with AI

steady basalt Jul 7, 2022, 2:39 PM

#

@sour tide go to main python channel not data for this it’s basic python

astral parrot Jul 7, 2022, 2:40 PM

#

agile cobalt "make ai" is an extremely oversimplified way of putting it, that doesn't really ...

like learn what the user is interested in, or some simple ai

steady basalt Jul 7, 2022, 2:40 PM

#

Yes python is the most popular for that

astral parrot Jul 7, 2022, 2:40 PM

#

O thanks

sour tide Jul 7, 2022, 2:42 PM

#

steady basalt @sour tide go to main python channel not data for this it’s basic pyt...

funny thing, they actually sent me here lol

steady basalt Jul 7, 2022, 2:49 PM

#

weird

hollow sentinel Jul 7, 2022, 2:51 PM

#

did you guys know postman can give you code for python request for APIs?

#

i had no idea

ocean swallow Jul 7, 2022, 3:42 PM

#

hey NLP people, is there a framework or tool where we can define a grammar (by notation or any other way) and, from a pool of words with POS tags, it stochastically creates sentences?

#

NLTK pos tags are too specific for my case. All I want to basically use is S NP V

dull granite Jul 7, 2022, 4:02 PM

#

spacy?

ocean swallow Jul 7, 2022, 4:12 PM

#

dull granite spacy?

which part in particular?

#

this looks good tbh

swift furnace Jul 7, 2022, 4:18 PM

#

Does anyone know if I need machine learning for image processing/computer vision?
How much can I do without machine learning?

#

Let's say I want to create a project in which I analyze an image of a heart, and then I as a result I want to know whether or not that person has some sort of disease, can I do that with only image processing/computer vision?

wooden sail Jul 7, 2022, 4:23 PM

#

computer vision often, but not always involves machine learning

#

the difference is how much math you do yourself 😛

#

same with image processing, which i would usually put in a separate category, as it deals with different tasks in general (they do have some overlap)

swift furnace Jul 7, 2022, 4:25 PM

#

wooden sail the difference is how much math you do yourself 😛

what if I'm bad at math? xD

wooden sail Jul 7, 2022, 4:26 PM

#

then image and signal processing in general are a bad idea, unless you're willing to sink in a lot of time

#

people get masters and phds in engineering and maths for signal processing/image processing/computer vision

#

especially if you wanna do it in a medical area

swift furnace Jul 7, 2022, 4:27 PM

#

I see!

#

Does that mean that if I use ML, then I wouldn't need to dive in too deep in math?

wooden sail Jul 7, 2022, 4:28 PM

#

hmm you dive into different math, but depending on how novel or old it is, you don't need to do it yourself

#

also consider that when i say "do math", i don't mean you're gonna go and multiply numbers and do integrals on paper, but rather that you'll formulate problems in a clever way and recognize good solution approaches

swift furnace Jul 7, 2022, 4:29 PM

#

I see!

#

When you said math, I was thinking of stuff such as calculus and linear algebra

wooden sail Jul 7, 2022, 4:30 PM

#

as a dumb example, noticing that you can find the coefficients of a polynomial of arbitrary degree by doing a linear regression, even though you'd normally associate "linear" with polynomials of order 1
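
The point about polynomials can be sketched in a few lines: the model is nonlinear in x but linear in its coefficients, so ordinary least squares on the columns 1, x, x^2 recovers them. The coefficients and sample points below are invented for illustration:

```python
import numpy as np

# Hypothetical quadratic: y = 2 - x + 0.5 * x^2
true_coeffs = np.array([2.0, -1.0, 0.5])
x = np.linspace(-3, 3, 25)
y = true_coeffs[0] + true_coeffs[1] * x + true_coeffs[2] * x**2

# Design matrix with columns [1, x, x^2]. The fit is linear in the
# coefficients even though the curve is not linear in x.
design = np.vander(x, N=3, increasing=True)
fitted, *_ = np.linalg.lstsq(design, y, rcond=None)
print(fitted)
```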

#

calculus and linear algebra are the bare basics, you won't get anywhere without those

swift furnace Jul 7, 2022, 4:31 PM

#

I see!

swift furnace Jul 7, 2022, 4:31 PM

#

wooden sail calculus and linear algebra are the bare basics, you won't get anywhere without ...

What topics should I know?

#

besides calculus and linear algebra

wooden sail Jul 7, 2022, 4:36 PM

#

probability and statistics in the multivariate case, some optimization

#

image processing methods often involve basic physics and differential equations, e.g. when you optimally detect edges in an image or try to find regions that satisfy some condition, try to denoise, etc

swift furnace Jul 7, 2022, 4:37 PM

#

I see! @wooden sail

wooden sail Jul 7, 2022, 4:37 PM

#

and statistical signal processing itself

swift furnace Jul 7, 2022, 4:38 PM

#

Image processing and computer vision are actually very interesting topics

wooden sail Jul 7, 2022, 4:38 PM

#

i'd say optimization and sigproc are applications of the other topics... in a very handwavy way, because there's a lot to those topics in and of themselves

swift furnace Jul 7, 2022, 4:38 PM

#

swift furnace Image processing and computer vision are actually very interesting topics

But, when should I use which, though?

swift furnace Jul 7, 2022, 4:38 PM

#

wooden sail i'd say optimization and sigproc are applications of the other topics... in a ve...

I see!

wooden sail Jul 7, 2022, 4:39 PM

#

swift furnace But, when should I use which, though?

being able to detect that is what makes you be good at it 😛 it depends on what you're doing

swift furnace Jul 7, 2022, 4:40 PM

#

wooden sail being able to detect that is what makes you be good at it 😛 it depends on what ...

I see, that makes sense

steady basalt Jul 7, 2022, 4:42 PM

#

anyone wanna help me code a weird nested loop list

#

for statement

velvet rover Jul 7, 2022, 5:17 PM

#

I want to plot a scatter plot with the drop down as Species in R, not in shiny. The dropdown should have the Species (setosa, versicolor, and virginica) and by selecting one the plot should change. Can someone suggest me here?

hollow sentinel Jul 7, 2022, 5:28 PM

#

we do R stuff here?

dusty valve Jul 7, 2022, 5:54 PM

#

can anyone recommend a good general tensorflow tutorial, like how to get a grasp of how to use it

odd meteor Jul 7, 2022, 6:06 PM

#

hollow sentinel we do R stuff here?

I'm afraid no, this is a Python community not R.

quick eagle Jul 7, 2022, 6:10 PM

#

I have a pressure log with oscillations from which I need to extract some timing info - anyone have suggestions on how to do so (I'm mostly on pandas):
basically - extract the 'timestamp' of each red dot, and the duration of the black bar:

#

in this particular example, crossing the '490' value would work, but unfortunately there's a significant low frequency component that makes absolute value approach useless:

mild dirge Jul 7, 2022, 6:12 PM

#

So using the derivative then maybe?

#

If the value suddenly decreases rapidly, then you found the red dot

#

And if it rapidly increases, that's the end of the black bar

quick eagle Jul 7, 2022, 6:16 PM

#

that would make sense - I've been shying away from adding a low pass filter because I don't want to lose timing resolution (this is a ~20Hz signal), but maybe some additional step to ensure an 'extended' drop is noted, as opposed to 'jiggles'? I'm not sure if there are adaptive peak/threshold detection tools out there (and whether those would be too complicated for this)?

mild dirge Jul 7, 2022, 6:17 PM

#

Not sure about low-pass filters, but if I were to do this task, I would probably just check for each point if the point x time steps ahead is at least 10-20 lower

#

If it is, then red point, and we start looking for the end of black bar

#

which is when the point x time steps ahead is at least 10-20 higher

#

That would probably already give "decent-ish" results
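
The look-ahead idea described above could be sketched like this. It assumes the log is a pandas Series indexed by time; the function name, `lookahead`, `threshold`, and the sample values are all illustrative and would need tuning on the real signal:

```python
import pandas as pd

def find_drops_and_recoveries(pressure, lookahead, threshold):
    """Return (drop_time, recovery_time) pairs from a look-ahead comparison."""
    # Change between each point and the point `lookahead` steps later.
    change = pressure.shift(-lookahead) - pressure
    events = []
    drop_time = None
    for timestamp, delta in change.items():
        if pd.isna(delta):
            continue  # not enough points ahead to compare
        if drop_time is None and delta <= -threshold:
            drop_time = timestamp  # sharp fall: the "red dot"
        elif drop_time is not None and delta >= threshold:
            events.append((drop_time, timestamp))  # sharp rise: end of the bar
            drop_time = None
    return events

# Made-up trace: flat at 500, a dip to 480, then back to 500.
trace = pd.Series([500, 500, 500, 480, 480, 480, 500, 500, 500])
events = find_drops_and_recoveries(trace, lookahead=1, threshold=15)
print(events)
```

Because only differences are compared, a slow drift in the baseline does not shift the detections the way a fixed level such as 490 would. The duration of each bar is the second element of a pair minus the first.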

#

If you have a lot of data, you could maybe even train a simple rolling regression model

hollow sentinel Jul 7, 2022, 6:23 PM

#

odd meteor I'm afraid no, this is a Python community not R.

what's up emyrs

odd meteor Jul 7, 2022, 6:26 PM

#

hollow sentinel what's up emyrs

I'm doing great myself I can't complain.

hollow sentinel Jul 7, 2022, 6:26 PM

#

that's good man

ocean swallow Jul 7, 2022, 6:49 PM

#

swift furnace what if I'm bad at math? xD

No. Don't be alarmed. If you are on the production side you don't need to know any math at all. Or even stats for that matter. There are many frameworks that run out of the box. Apart from that, whether you need ML or vanilla CV really depends on the problem.

#

What kind of things are you trying to detect from heart image?

swift furnace Jul 7, 2022, 6:56 PM

#

ocean swallow No. Don't be alarmed. If you are on the production side you don't need to know a...

I'm glad to hear that, thanks for the insight

swift furnace Jul 7, 2022, 6:56 PM

#

ocean swallow What kind of things are you trying to detect from heart image?

I don't really know, it was just something I thought of. Is there any beginner project you'd recommend me to do?

rough mountain Jul 7, 2022, 6:57 PM

#

I have a 1D array of vectorized words. How do I use it with an LSTM? It wants ndim 3, but I only have 2 (batch size, vector count)
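
Recurrent layers in frameworks such as Keras generally expect input shaped (batch, timesteps, features), so a 2D batch needs one extra axis. Which axis is right depends on what the vector represents, which the question does not say; this NumPy-only sketch shows both readings with made-up sizes:

```python
import numpy as np

# Hypothetical batch: 32 samples, each a vector of length 100.
batch = np.random.default_rng(0).random((32, 100)).astype("float32")

# Reading 1: each vector is a sequence of 100 steps with 1 feature per step.
as_sequence = batch[:, :, np.newaxis]      # shape (32, 100, 1)

# Reading 2: each vector is a single timestep with 100 features.
as_single_step = batch[:, np.newaxis, :]   # shape (32, 1, 100)

print(as_sequence.shape, as_single_step.shape)
```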

ocean swallow Jul 7, 2022, 7:37 PM

#

swift furnace I'm glad to hear that, thanks for the insight

For what, like for any image job?

#

MNIST is like the hello world dataset of statistics, and there are many tutorials.

#

For computer vision

swift furnace Jul 7, 2022, 7:39 PM

#

ocean swallow For what, like for any image job?

Yes, something basic that involves image processing/cv with ml

#

Are there interesting platforms that I should look up, related to the topics I've mentioned above?

misty flint Jul 7, 2022, 7:41 PM

#

fall cohort about to start https://fullstackdeeplearning.com/course/

Full Stack Deep Learning

The community for people building ML-powered products.

#

ZoomEyes

#

considering doing it

#

~~especially since im already running into problems at work with model deployment~~

#

kekHands

ocean swallow Jul 7, 2022, 7:42 PM

#

swift furnace Are there interesting platforms that I should look up, related to the topics I'v...

pyimagesearch for more opencv projects and for practical applications sentdex tutorials are great in my opinion. not really theorizing everything.

#

or like anything at all

swift furnace Jul 7, 2022, 7:42 PM

#

ocean swallow pyimagesearch for more opencv projects and for practical applications sentdex tu...

Thank you, I'll look it up! 🙂

swift furnace Jul 7, 2022, 7:42 PM

#

misty flint fall cohort about to start https://fullstackdeeplearning.com/course/

What is this?

misty flint Jul 7, 2022, 7:47 PM

#

oh this isnt for you. this is mostly for the lurkers in chat. DoggoKek

#

theres also an academic discount for the students and an accessibility discount for those from low income countries

#

ID_GhostSip

misty flint Jul 7, 2022, 9:13 PM

#

havent read it myself, but i heard many from my podcasts say its a classic

arctic wedgeBOT Jul 7, 2022, 9:15 PM

#

Hey @ancient fractal!

It looks like you tried to attach a Python file - please use a code-pasting service such as https://paste.pythondiscord.com

ancient fractal Jul 7, 2022, 9:19 PM

#

Can someone fix my python code? I am trying to Cluster a 2D array and calculate min and max value of each cluster. I got stuck, it does not output the results that I am looking for. Here is my code: ```py
import numpy as np
from collections import defaultdict
from scipy.cluster.vq import kmeans, vq

data = defaultdict(list)

arr =[[2, 230], [2, 233], [1, 676], [2, 233], [1, 698], [2, 233], [1, 685], [2, 234], [2, 236], [2, 232], [2, 261], [1, 674], [2, 262], [2, 236], [2, 267], [1, 690], [2, 261], [2, 231], [1, 540], [2, 231], [1, 696], [2, 233], [1, 528], [2, 231], [2, 232]]

for k in arr:
    data[k[0]].append(k[1])

data = dict(data)

new_data = defaultdict(list)

check_cluster_list = [len(x) for ii,x in data.items()]

def chunk(l: list, N: int):
    return [l[i:i+N] for i in range(0, len(l), N)]

arr_d = defaultdict(list)
for entry in arr:
    arr_d[entry[0]].append(entry[1])

chunks = {
    1: 3,
    2: 4,
}

for k, l in arr_d.items():
    number_of_clusters = chunks[k]

if number_of_clusters > min(check_cluster_list):
    print("Clusters cannot be larger than",min(check_cluster_list))
    raise Exception(f"Clusters cannot be larger than {min(check_cluster_list)}")


for indx, (id, y) in enumerate(data.items()):
  cluster_dict = defaultdict(list)

  codebook, _ = kmeans(np.array(y, dtype=float), number_of_clusters)
  cluster_indices, _ = vq(y, codebook)


  for i, val in enumerate(cluster_indices):
     cluster_dict[val].append(y[i])
  final_list = []
  for id_1,y_1 in cluster_dict.items():
    final_list.append([min(y_1), max(y_1)])
  new_data[id].append(final_list)


new_data = dict(new_data)
new_data = {id:y[0] for id,y in new_data.items()}
print(new_data)```

misty flint Jul 7, 2022, 9:41 PM

#

from the one and only andrew ng https://read.deeplearning.ai/the-batch/issue-152/

Autonomous Atlantic Crossing, AI in the Courtroom, and more

The Batch-AI News & Insights: Autonomous research ship crossed the Atlantic Ocean | ML is helping lawyers sift through documents to find evidence

#

published yesterday

#

More papers have been published on AI than any person can read in a lifetime. So, in your efforts to learn, it’s critical to prioritize topic selection. I believe the most important topics for a technical career in machine learning are:

Foundational machine learning skills.

Deep learning.

Math relevant to machine learning.

Software development.

#

he goes into the specifics in his article

#

PikaThink

tidal bough Jul 7, 2022, 9:57 PM

#

the most important topics for a technical career in machine learning are:

Foundational machine learning skills.
no shit

#

~~obligatory "this article was written by an AI"~~

misty flint Jul 7, 2022, 10:13 PM

#

tidal bough ~~obligatory "this article was written by an AI"~~

kekHands

#

well his audience is for early career peeps or students

#

so he needs to state the obvious

#

please dont hate andrew ng

#

feelsbongoman

#

hes done a lot for the community

tidal bough Jul 7, 2022, 10:15 PM

#

sure, i've done his ML course

misty flint Jul 7, 2022, 10:25 PM

#

tidal bough sure, i've done his ML course

hes doing a renewed version that uses python

#

unless youre referring to that one

#

this time theres RecSys + RL in the last module of the syllabus

#

ZoomEyes

thorny aurora Jul 7, 2022, 10:53 PM

#

yo, i built a model today which got an accuracy score of 90% and then ran some data on it from another year and got 80%, would this be a good model?

steady basalt Jul 7, 2022, 10:53 PM

#

@ano

#

@spare briar

#

how is y' the same as f'x?

#

unless y=fx

thorny aurora Jul 7, 2022, 10:54 PM

#

y and fx are the same thing in most contexts

steady basalt Jul 7, 2022, 10:54 PM

#

how ?

#

y is a variable and fx is a function of x?

thorny aurora Jul 7, 2022, 10:55 PM

#

is this a 2 dimensional field

steady basalt Jul 7, 2022, 10:55 PM

#

no idea

#

prob

thorny aurora Jul 7, 2022, 10:55 PM

#

do you know what a function is in general?

steady basalt Jul 7, 2022, 10:55 PM

#

is this just used as an example

steady basalt Jul 7, 2022, 10:55 PM

#

thorny aurora do you know what a function is in general?

yes its something which can be applied to a variable

#

but then this confuses me as its being equated to y

thorny aurora Jul 7, 2022, 10:56 PM

#

so basically f(x) means you input x into a formula, in this case, and the result is y

steady basalt Jul 7, 2022, 10:56 PM

#

then hes saying that if y = mx+c that multiplying by m and adding c is the function f?

thorny aurora Jul 7, 2022, 10:56 PM

#

yes

steady basalt Jul 7, 2022, 10:56 PM

#

so f in and of itself also exists, but ive never seen it

#

represented

thorny aurora Jul 7, 2022, 10:56 PM

#

so the times m and + c are basically like applying a function to x and then spitting out y

steady basalt Jul 7, 2022, 10:56 PM

#

how do u write f(x) without x just as f in text?

#

is it possible?

#

so you can have a function that u can apply to anything

#

like u said, *m+c

thorny aurora Jul 7, 2022, 10:57 PM

#

uh im not sure what you're asking

#

x is just an input, it can be any number

steady basalt Jul 7, 2022, 10:57 PM

#

f = *m+c

#

is that even real?

#

ive never seen someone write it that way

#

its always f(x)

thorny aurora Jul 7, 2022, 10:58 PM

#

i think it's just shorthand

steady basalt Jul 7, 2022, 10:58 PM

#

but can u write f on its own

#

without an x

#

just an arbitrary function

thorny aurora Jul 7, 2022, 10:58 PM

#

my calculus teacher did a lot

#

especially with derivatives

steady basalt Jul 7, 2022, 10:58 PM

#

ive never ever seen that, and someone told me that it doesnt exist

#

so ive always been confused

#

someone said that f on its own is nothing

thorny aurora Jul 7, 2022, 10:59 PM

#

i mean technically no, the function needs an input

#

without the input there's no output

steady basalt Jul 7, 2022, 10:59 PM

#

so this guy on the photo saying y = f(x)

#

hes just using it as an example for y = graphs?

thorny aurora Jul 7, 2022, 10:59 PM

#

yes

steady basalt Jul 7, 2022, 11:00 PM

#

its not always the case or its always the case in graphs

#

u cant have a graph without some sort of x

thorny aurora Jul 7, 2022, 11:00 PM

#

in 2 dimensions yes

#

in 3 dimensions it changes a bit

steady basalt Jul 7, 2022, 11:01 PM

#

what does it mean to say y vs y''

#

second derivative?

#

d2y/dx2 is physically on the graph what?

#

another location on the line?

#

shifted the dot so to speak?

worldly dawn Jul 7, 2022, 11:03 PM

#

steady basalt d2y/dx2 is physically on the graph what?

a physical analogy would be:

position
velocity is the derivative of the position
acceleration is the derivative of velocity
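
To make the analogy concrete, here is a small numerical sketch, assuming an object falling from rest so that position is 0.5 * g * t^2. Differentiating position once gives velocity (the slope of the position curve) and differentiating again gives acceleration (how fast that slope changes). The time grid and the value of g are illustrative:

```python
import numpy as np

g = 9.81                         # assumed constant acceleration in m/s^2
t = np.linspace(0.0, 2.0, 201)   # time samples in seconds
position = 0.5 * g * t**2        # y(t)

# y'(t): slope of the position curve at each time.
velocity = np.gradient(position, t)

# y''(t): slope of the velocity curve, i.e. how quickly the slope changes.
acceleration = np.gradient(velocity, t)

print(velocity[100], acceleration[100])  # values at t = 1.0 s
```

So y'' is not another point on the same curve. It measures how quickly the gradient itself is changing, which for this curve is the same constant everywhere.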

steady basalt Jul 7, 2022, 11:04 PM

#

the way i see it is a position yes

#

how does velocity come into play?

#

or acceleration?

#

the difference between the derivatives?

worldly dawn Jul 7, 2022, 11:05 PM

#

velocity is how fast you move

#

And acceleration is how much the velocity changes

steady basalt Jul 7, 2022, 11:05 PM

#

yeah i took physics

#

but in math they never explained this

#

it was just random numbers and no actual meaning

#

solving first and second order derivatives

#

to pass a question

#

without knowing literally what it means

#

how can a new point on the line relate to velocity?

#

you mean gradient?

#

there's a new gradient

#

not sure about the acceleration part

iron basalt Jul 7, 2022, 11:17 PM

#

steady basalt it was just random numbers and no actual meaning

https://www.youtube.com/watch?v=WUvTyaaNkzM&list=PL0-GT3co4r2wlh6UHTUeQsrf3mlS2lk6x

YouTube

3Blue1Brown

The essence of calculus

What might it feel like to invent calculus?
Help fund future projects: https://www.patreon.com/3blue1brown
An equally valuable form of support is to simply share some of the videos.
Special thanks to these supporters: http://3b1b.co/lessons/essence-of-calculus#thanks

In this first video of the series, we see how unraveling the nuances of a simp...

▶ Play video

steady basalt Jul 7, 2022, 11:18 PM

#

why the fuck didnt they just say so in school?

#

limit theorem wasnt touched at all

#

more like remember simple rules

#

finally learning about first principles properly

dull granite Jul 7, 2022, 11:27 PM

#

steady basalt without knowing literally what it means

Have you taken Calculus 3?

steady basalt Jul 7, 2022, 11:28 PM

#

no just 1

dull granite Jul 7, 2022, 11:28 PM

#

It expands on the uses of Calculus in three dimensions.

steady basalt Jul 7, 2022, 11:28 PM

#

btw, if the limit of fx is undefined

mint palm Jul 7, 2022, 11:28 PM

#

for inputs x1, x2, x3, ..., does simple feature engineering, such as making a new feature x = x1*x2, make any substantial improvement?
I feel like much more complex feature engineering might help more, but this kind of simple feature engineering doesn't do much.

steady basalt Jul 7, 2022, 11:28 PM

#

whats that written as

dull granite Jul 7, 2022, 11:29 PM

#

steady basalt btw, if the limit of fx is undefined

L'Hopital's rule.

steady basalt Jul 7, 2022, 11:29 PM

#

so u cant just say 0

dull granite Jul 7, 2022, 11:29 PM

#

No lol.

steady basalt Jul 7, 2022, 11:29 PM

#

its 1?

#

or whatever

#

the final stop is

#

wait

dull granite Jul 7, 2022, 11:29 PM

#

It's undefined.

#

1/0 is undefined.

steady basalt Jul 7, 2022, 11:29 PM

#

the limit is what causes the undefined on the x axis

#

right?

#

so you say

dull granite Jul 7, 2022, 11:30 PM

#

Study some calculus book dude.

#

Concepts are difficult ngl.

steady basalt Jul 7, 2022, 11:30 PM

#

if the lim fx = 1

#

u say approaches 1

dull granite Jul 7, 2022, 11:30 PM

#

But if you don't read it and do it yourself, you won't understand.

steady basalt Jul 7, 2022, 11:31 PM

#

oh so im wrong?

dull granite Jul 7, 2022, 11:33 PM

#

Don't understand what you're trying to say.

steady basalt Jul 7, 2022, 11:33 PM

#

u can have random points on the graph that literally arent even touching the curve

#

so long as the function defines it

#

?

dull granite Jul 8, 2022, 12:02 AM

#

Wut?

steady basalt Jul 8, 2022, 12:07 AM

#

Yup it’s true

spare briar Jul 8, 2022, 12:24 AM

#

I think what you are asking about are piecewise functions https://en.wikipedia.org/wiki/Piecewise

Piecewise

In mathematics, a piecewise-defined function (also called a piecewise function, a hybrid function, or definition by cases) is a function defined by multiple sub-functions, where each sub-function applies to a different interval in the domain. Piecewise definition is actually a way of expressing the function, rather than a characteristic of the f...
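
If the question was about a point that sits off the curve, a piecewise definition is one way to write it down. A hypothetical example: f(x) = x^2 everywhere except at x = 2, where f is defined to be 5. The limit as x approaches 2 is still 4, even though f(2) = 5:

```python
import numpy as np

def f(x):
    """Hypothetical piecewise function: x**2 everywhere except f(2) = 5."""
    x = np.asarray(x, dtype=float)
    return np.where(x == 2.0, 5.0, x**2)

print(f(2.0))                      # the isolated point
print(f([1.9, 1.99, 1.999]))       # values approaching 4 as x approaches 2
```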

barren wedge Jul 8, 2022, 3:36 AM

#

How to improve prediction in classification problems?

main fox Jul 8, 2022, 3:41 AM

#

barren wedge How to improve prediction in classification problems?

This is a broad question, but you could try feature engineering to make sure you are extracting as much relevant information from your data as you can. Also, test different models. If your dataset is small, maybe focus on Logistic Regression or Random Forest.

barren wedge Jul 8, 2022, 3:53 AM

#

main fox This is a broad question, but you could try feature engineering to make sure you...

yes, I used random forest
and I transform the object columns with a one-hot encoder

main fox Jul 8, 2022, 3:58 AM

#

What are you trying to classify?

barren wedge Jul 8, 2022, 4:05 AM

#

main fox What are you trying to classify?

it's a classification problem
the data contains numeric and object columns

#

I drop ID
because it is not important

faint cargo Jul 8, 2022, 7:02 AM

#

Need to Optimise my Program .......Help!!!

#

import numpy as np
# gderiv is not shown here and is assumed to be defined elsewhere.
def peak_valley_detector(d, ker_sz, sigma, width):
  ds = gderiv(d, sigma, ker_sz)
  # print(ds)
  for j in range(1, int((ker_sz/2))+1):
    ds = np.delete(ds, 0)
    ds = np.delete(ds, len(ds)-1)
  d1 = ds
  idx1 = np.zeros(len(d1))
  idx2 = np.zeros(len(d1))
  for i in range(1, len(d1)-2):
    if (np.sign(d1[i-1])>=np.sign(d1[i+1])) and (d[i]>=0.5*(np.amax(d)+np.amin(d))):
      idx1[i] = 1
    else:
      idx1[i] = 0
  for i in range(1, len(d1)-2):
    if (np.sign(d1[i+1])>np.sign(d1[i-1])) and (d[i]<0.5*(np.amax(d)+np.amin(d))):
      idx2[i+1] = 1
    else:
      idx2[i+1] = 0
  
  index1 = np.where(idx1 == 1)[0]  # np.where returns a tuple; take the index array
  index2 = np.where(idx2 == 1)[0]

  flag = 0
  
  # indexo2 = [1,len(d1)]
  # for k in index2:
  #   indexo2.append(k)

  # index2 = indexo2  
  
  index2 = np.append(index2, [1, len(d1)])
  index2 = np.sort(index2)

  # Amongst multiple close peaks detected, choose the highest peak, discard the rest.
  while not flag:
    flag = 1
    for i in range(1 , len(index1)-1):
      if abs(index1[i]-index1[i+1]) < width:
        flag = 0 
        if d[index1[i]] < d[index1[i+1]]:
          index1[i] = 9999
        else:
          index1[i+1]  = 9999

    irx1 = np.where(index1 == 9999)
    index1 = np.delete(index1 , irx1)

  flag = 0   
      
  # Amongst multiple close valleys detected, choose the lowest valley, discard the rest.
  while not flag:
    flag = 1
    for i in range(1 , len(index2)-2):
      if abs(index2[i]-index2[i+1]) < width:
        flag = 0 
        if d[index2[i]] > d[index2[i+1]]:
          index2[i] = 9999
        else:
          index2[i+1]  = 9999

    irx2 = np.where(index2 == 9999)
    # print(irx2)
    index2 = np.delete(index2 , irx2)
  return index1,index2
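
One likely hotspot in the function above is the pair of candidate-detection loops: they recompute `np.amax(d)` and `np.amin(d)` on every iteration, which makes them quadratic in the signal length. A possible vectorized sketch of just that part follows. It is not a drop-in replacement for the whole function; the helper name `candidate_masks` is made up, and it assumes `d` is at least as long as `d1`, as in the original:

```python
import numpy as np

def candidate_masks(d, d1):
    """Vectorized version of the two candidate loops (same index ranges)."""
    d = np.asarray(d, dtype=float)
    d1 = np.asarray(d1, dtype=float)
    n = len(d1)
    midpoint = 0.5 * (d.max() + d.min())  # computed once instead of per iteration
    sign = np.sign(d1)
    i = np.arange(1, n - 2)               # same range as the original loops
    peaks = np.zeros(n, dtype=bool)
    valleys = np.zeros(n, dtype=bool)
    peaks[i] = (sign[i - 1] >= sign[i + 1]) & (d[i] >= midpoint)
    valleys[i + 1] = (sign[i + 1] > sign[i - 1]) & (d[i] < midpoint)
    return peaks, valleys

# Random stand-in data; d1 is shorter than d, as after the kernel trimming.
rng = np.random.default_rng(0)
d_demo = rng.normal(size=50)
d1_demo = rng.normal(size=44)
peaks_demo, valleys_demo = candidate_masks(d_demo, d1_demo)
print(np.flatnonzero(peaks_demo), np.flatnonzero(valleys_demo))
```

`np.flatnonzero` on each mask gives index arrays comparable to `index1` and `index2` before the close-peak filtering step.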

sleek wolf Jul 8, 2022, 7:14 AM

#

are SQL related questions allowed in this channel? 8-)

hushed sail Jul 8, 2022, 8:01 AM

#

Hi everyone! I need some help with implementing this technology in my application. demo.py uses SSDLite, I need to change it to ResNeXt101. How can I do it? Thanks in advance 🙂

https://github.com/hukenovs/hagrid/

GitHub

GitHub - hukenovs/hagrid: HAnd Gesture Recognition Image Dataset

HAnd Gesture Recognition Image Dataset. Contribute to hukenovs/hagrid development by creating an account on GitHub.

brazen spire Jul 8, 2022, 8:23 AM

#

Is there any database somewhere with images for computer vision?

#

overlapping images for NeRF especially

steady basalt Jul 8, 2022, 9:01 AM

#

spare briar I think what you are asking about are piecewise functions https://en.wikipedia.o...

Where x=certain number

#

Fx under two conditions

mint palm Jul 8, 2022, 10:10 AM

#

just like we can change the degree of the fitting polynomial in the LogReg algo using a python library, can we also change the coefficients of that polynomial? or will we have to write all the code by hand?

rose agate Jul 8, 2022, 10:56 AM

#

Like a box plot?

hasty kiln Jul 8, 2022, 12:25 PM

#

sleek wolf are SQL related questions allowed in this channel? 8-)

You can use the #databases channel for that,
but if your question is about using SQL for ML or DS, I think you can ask it in this channel

serene scaffold Jul 8, 2022, 12:32 PM

#

@hasty kiln you are right lemon_hyperpleased

pliant star Jul 8, 2022, 12:44 PM

#

hey quick question guys:
axis.scatter(r.flatten(), g.flatten(), b.flatten(), facecolors=pixel_colors, marker='.')

this line seems to crash my code, any idea why that could be?

#

```py
import cv2
import matplotlib.pyplot as plt
from matplotlib import colors

img = cv2.imread('IMG_7659.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

pixel_colors = img.reshape(-1, 3).astype(int)
norm = colors.Normalize(vmin=-1.,vmax=1.)
norm.autoscale(pixel_colors)
pixel_colors = norm(pixel_colors).tolist()

r, g, b = cv2.split(img)

fig = plt.figure()
axis = fig.add_subplot(1, 1, 1, projection='3d')
axis.scatter(r.flatten(), g.flatten(), b.flatten(), facecolors=pixel_colors, marker='.')
axis.set_xlabel('Red')
axis.set_ylabel('Green')
axis.set_zlabel('Blue')
plt.show()
```
it just doesn't show the plot
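
With no traceback and no window, one possible cause is that the plot is simply very slow rather than broken: a photo has millions of pixels, and a 3D scatter draws one marker per pixel. Here is a hedged sketch, assuming an 8-bit RGB image, that samples a few thousand pixels and scales the colours to [0, 1] by dividing by 255. `sample_pixels` and `plot_rgb_scatter` are hypothetical helper names, not part of cv2 or matplotlib.

```python
import numpy as np


def sample_pixels(img_rgb, max_points=5000, seed=0):
    """Return up to max_points RGB pixels and matching colours in [0, 1]."""
    pixels = np.asarray(img_rgb).reshape(-1, 3)
    if len(pixels) > max_points:
        rng = np.random.default_rng(seed)
        keep = rng.choice(len(pixels), size=max_points, replace=False)
        pixels = pixels[keep]
    point_colors = pixels.astype(float) / 255.0  # assumes uint8 input
    return pixels, point_colors


def plot_rgb_scatter(img_rgb, max_points=5000):
    """3D scatter of the sampled pixels, each drawn in its own colour."""
    import matplotlib.pyplot as plt  # imported here so sampling works without it

    pixels, point_colors = sample_pixels(img_rgb, max_points)
    fig = plt.figure()
    axis = fig.add_subplot(1, 1, 1, projection="3d")
    axis.scatter(pixels[:, 0], pixels[:, 1], pixels[:, 2],
                 c=point_colors, marker=".")
    axis.set_xlabel("Red")
    axis.set_ylabel("Green")
    axis.set_zlabel("Blue")
    plt.show()
```

If the sampled version shows up quickly, the original was most likely just taking a very long time to render.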

hasty kiln Jul 8, 2022, 12:47 PM

#

serene scaffold @hasty kiln you are right lemon_hyperpleased

Thank you

mild dirge Jul 8, 2022, 1:04 PM

#

pliant star ```py img = cv2.imread('IMG_7659.jpg') img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB...

Show the error traceback

pliant star Jul 8, 2022, 1:04 PM

#

There is none

#

Doesnt show anything

mild dirge Jul 8, 2022, 1:06 PM

#

scatter normally takes an array of x coordinates, and an array of y coordinates

#

you supply r, g and b

#

Not sure what those are supposed to be

#

And you are saying that the line is "crashing" your code, what does that mean?

#

@pliant star

pliant star Jul 8, 2022, 1:16 PM

#

Yeah its just doing nothing and loading properly

#

Rgb are the color values of the images

#

I got them by cv2.split(img)

mild dirge Jul 8, 2022, 1:17 PM

#

So what are you trying to plot?

pliant star Jul 8, 2022, 1:17 PM

#

The rgb values of the images

#

I‘d like to group them later

mild dirge Jul 8, 2022, 1:17 PM

#

The image itself?

pliant star Jul 8, 2022, 1:17 PM

#

Using the hsv ones

mild dirge Jul 8, 2022, 1:17 PM

#

Don't really get it

pliant star Jul 8, 2022, 1:17 PM

#

yeah

mild dirge Jul 8, 2022, 1:18 PM

#

Alright, so just do plt.imshow(img) and then plt.show()

#

Or use cv2.imshow('image', img) and then cv2.waitKey(0) iirc

pliant star Jul 8, 2022, 1:18 PM

#

#

No i dont want to show the img itself

mild dirge Jul 8, 2022, 1:18 PM

#

This is a 3d scatter plot, you need to use something other than plt.scatter

pliant star Jul 8, 2022, 1:19 PM

#

I want sth like that, such that i can group the different pixel values

mild dirge Jul 8, 2022, 1:19 PM

#

scatter is for 2d

#

I thought at least

pliant star Jul 8, 2022, 1:19 PM

#

Ah okay

#

Idk

#

I want to group the different colors

mild dirge Jul 8, 2022, 1:19 PM

#

Oh hmm, nvm it seems it can be done with this as well

#

https://matplotlib.org/stable/gallery/mplot3d/scatter3d.html

#

Did you follow this?

#

And what does it show now?

#

Can you show your screen after it shows the plot

pliant star Jul 8, 2022, 1:20 PM

#

I need to group different colors in an image

#

Will do once i’m home

mild dirge Jul 8, 2022, 1:21 PM

#

I'm moving today, so I probably won't be able to look at it much later

#

But maybe someone else could help at that point ^^

hollow sentinel Jul 8, 2022, 3:56 PM

#

!pastebin

arctic wedgeBOT Jul 8, 2022, 3:56 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

hollow sentinel Jul 8, 2022, 3:57 PM

#

https://paste.pythondiscord.com/vabikibiha

#

pd.to_numeric(car_data["Price"])
pd.to_numeric(car_data["Doors"])

#

i thought the error was that it was strings

#

something something int can't be divided by string

#

but i don't think there are any strings here

brave sand Jul 8, 2022, 4:12 PM

#

@iron basalt Sorry for the ping, but I’ve been wondering if I could somehow “test” the monotonicity constraint? Could I generate some fake values? I want to see how the monotonicity constraint works. Because obviously it works but I need to know how. Any idea how to proceed with this?

hollow sentinel Jul 8, 2022, 4:12 PM

#

that would do it

#

price and doors is an object
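
Worth noting for anyone hitting the same thing: `pd.to_numeric` returns a new Series and leaves the DataFrame unchanged, so the result has to be assigned back. A minimal sketch with made-up values follows (the real `car_data` is in the paste above and may need extra cleaning first, such as stripping currency symbols):

```python
import pandas as pd

# Hypothetical stand-in for car_data: numbers stored as text.
car_data = pd.DataFrame({
    "Price": ["4000", "5000", "6000"],
    "Doors": ["4", "4", "3"],
})

# Calling pd.to_numeric(car_data["Price"]) on its own line discards the result.
# Assigning it back is what actually changes the column.
car_data["Price"] = pd.to_numeric(car_data["Price"])
car_data["Doors"] = pd.to_numeric(car_data["Doors"])

# Arithmetic between the two columns now works.
price_per_door = car_data["Price"] / car_data["Doors"]
```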

#

i did it

#

https://tenor.com/view/another-one-bites-the-dust-queen-dance-dancing-gif-8171986

Tenor

steady basalt Jul 8, 2022, 4:53 PM

#

Man I hate waiting so long for networks to train

#

It’s been 2 hours

#

Still only 10% in

#

Do any of u use a tuner or just eyeball it

iron basalt Jul 8, 2022, 6:01 PM

#

brave sand @iron basalt Sorry for the ping, but I’ve been wondering if I could som...

IIRC, the source has a boolean in QMix that lets you turn it off (on by default). And also you could check if the weights are ever negative or not. If you want to check the partial derivative, you could work it out by hand (or use a tool) to see that positive weights have the desired effect.

sour tide Jul 8, 2022, 6:06 PM

#

the output we get..is higher pg good or lower log

mint palm Jul 8, 2022, 7:04 PM

#

i want to classify racy or violent or border line videos( with audio), where can i get a good pre-trained model for that?

brave sand Jul 8, 2022, 7:29 PM

#

iron basalt IIRC, the source has a boolean in QMix that lets you turn it off (on by default)....

how does the partial derivative help?

iron basalt Jul 8, 2022, 7:30 PM

#

brave sand how does the partial derivative help?

brave sand Jul 8, 2022, 7:33 PM

#

iron basalt

so I work that out by hand to prove if the weights are always positive?

iron basalt Jul 8, 2022, 7:39 PM

#

brave sand so I work that out by hand to prove if the weights are always positive?

No, non-negative weights give that constraint. The weights being non-negative is given, it's what the code does. Show that given non-negative weights, you have that constraint. And then show that given that constraint your problem becomes more feasible. Then demonstrate it with an experiment (this is what the paper is all about, it's pretty straight forward even if the code seems like a lot (it's easy to say that you have some network in text, but it can be very annoying to code / the difference between stating something and actually doing it)).

#

(5) is just (1) again, not sure why they wrote it twice.

quartz raptor Jul 8, 2022, 7:41 PM

#

hey there, im currently building a timeseries database for pandas / dask dataframe data which can handle multiple billions of lines of dataframes. if anyone has a usecase for this and would like a specific feature hit me up! https://github.com/mercator-labs/oakstore

GitHub

GitHub - mercator-labs/oakstore: highspeed timeseries pandas datafr...

highspeed timeseries pandas dataframe database. Contribute to mercator-labs/oakstore development by creating an account on GitHub.

fallen portal Jul 8, 2022, 8:29 PM

#

Can anyone give me a simple explanation as to why we have to specify the DataFrame name multiple times when using Pandas methods? e.g.

my_df[(my_df.value == 10) & (my_df.object == 'Sally')]

why do we need to keep repeating my_df?

tidal bough Jul 8, 2022, 8:34 PM

#

fallen portal Can anyone give me a simple explanation as to why we have to specify the DataFra...

How else can it work? The way this works is just combining a few pandas features:

- a comparison with a Series returns a Series of booleans (the comparison results for each element)
- boolean Series can be elementwise ANDed using &
- dataframes can be indexed with a boolean Series to select only the rows for which the corresponding element is True.
- If a column's name is a valid Python identifier, like value, you can access that column not just as df["value"] (like usual), but as df.value.

It's not magic. The expression on the inside of the square brackets here knows nothing about what it's used for, and can't possibly replace, say, value with my_df.value.
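
The steps listed above can be spelled out one at a time. This is a small sketch with made-up data; the `value` and `object` column names come from the question, everything else is hypothetical. The last line uses `DataFrame.query`, which takes the condition as a string, so pandas itself resolves the column names and `my_df` is written only once.

```python
import pandas as pd

my_df = pd.DataFrame({
    "value": [10, 10, 3],
    "object": ["Sally", "Bob", "Sally"],
})

# 1. Each comparison is evaluated on its own and gives a boolean Series.
value_is_10 = my_df["value"] == 10
is_sally = my_df["object"] == "Sally"

# 2. & combines the two Series element by element.
mask = value_is_10 & is_sally

# 3. Indexing with the boolean Series keeps only the rows where it is True.
selected = my_df[mask]

# Alternative: the condition is a string, so the frame is named only once.
same = my_df.query("value == 10 and object == 'Sally'")
```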

lapis sequoia Jul 8, 2022, 8:49 PM

#

tidal bough Jul 8, 2022, 8:51 PM

#

~~heartbreaking: the worst person you know just made a great point~~

blazing lagoon Jul 8, 2022, 9:07 PM

#

I wanna start learning about ai programing but i don't know if im ready, i took a 20hour course before about python. Should i start learning data science and ai or should i learn something else before learning this?

quartz raptor Jul 8, 2022, 9:07 PM

#

learn some math (if you dont know already)

#

as in calculus and linear algebra

#

not necessarily before though

blazing lagoon Jul 8, 2022, 9:09 PM

#

i know a little bit of math, 10th grade math))

quartz raptor Jul 8, 2022, 9:10 PM

#

yeah so perhaps you know like derivatives, chain-rule, product-rule etc?

#

that stuff is important for ml

#

or perhaps you also had some stuff about vectors

blazing lagoon Jul 8, 2022, 9:13 PM

#

only a little bit about derivatives and chain rule

quartz raptor Jul 8, 2022, 9:16 PM

#

in general you wont have to apply math yourself directly however having deeper understanding and intuition for those things is very useful

blazing lagoon Jul 8, 2022, 9:26 PM

#

thank you

#

you know some good courses about machine learning , etc?

steady basalt Jul 8, 2022, 10:11 PM

#

blazing lagoon I wanna start learning about ai programing but i don't know if im ready, i took ...

focus on python and statistics first id say

#

in my experience starting similar to you once, the maths is really hard to just learn at will, just pick up the ideas over time and ull be ok

#

like we discussed earlier, i'd fail a linalg/calc2 exam 10/10 times but i still managed to impress interviewers for junior DS roles, if its not faang youll be fine

#

people like to pretend reality is that DS requires you to leave uni like zuckerberg, this isnt the case, alot of ds have had to learn for many years

#

Companies probably value your ability to produce and to make projects using tools more than your ability to do math off hand

#

Or so I’ve been told today

plain zephyr Jul 8, 2022, 10:36 PM

#

Check out this natural language query add-on for Pandas: https://pypi.org/project/askedith/

PyPI

askedith

Natural Language Query Engine for Pandas

mint palm Jul 8, 2022, 10:47 PM

#

youtube has violence, racism, hate detection model for video, what kind of algo is applied there?
anomaly etc. or is it like seperate algorithm for each of those??

#

i only see specific models for hate speech, or for gore etc

#

but i want to make a something that can detect all type of unpleasant behaviour

shrewd locust Jul 8, 2022, 11:03 PM

#

does someone know how to exclude that dtype int64?

#

having trouble when calculating the accuracy

tidal bough Jul 8, 2022, 11:10 PM

#

shrewd locust does someone know how to exclude that dtype int64?

Exclude from printing? Why do you need to?

shrewd locust Jul 8, 2022, 11:16 PM

#

yes, from printing

mild dirge Jul 8, 2022, 11:19 PM

#

Why would removing that from printing help you calculate the accuracy better?

steady basalt Jul 9, 2022, 12:13 AM

#

mint palm but i want to make a something that can detect all type of unpleasant behaviour

dont they need to be reported

rough mountain Jul 9, 2022, 2:27 AM

#

I have a text corpus. I want to train an AI in such a way that it takes in an input sentence and predicts the next sentence. How do I setup the training and testing data, as well as sentence pairs, without letting the AI "cheat"

surreal brook Jul 9, 2022, 2:33 AM

#

hey guys can anyone explain why its only capturing 100 items when there is 500 items on the website? https://www.pythonmorsels.com/p/2bfgr/

2bfgr - Python Pastebin - Python Morsels

A free Python-oriented pastebin service for sharing Python code snippets with anyone

untold quail Jul 9, 2022, 3:42 AM

#

How do I fix this?

#

I get this error when I pip install pyaudio

#

How to fix it without installing visual c++ 14.0

#

???

spare briar Jul 9, 2022, 3:50 AM

#

required

untold quail Jul 9, 2022, 3:51 AM

#

Yes

#

Is there no other way?

worldly dawn Jul 9, 2022, 4:27 AM

#

untold quail Is there no other way?

The first few answers to google for "pyaudio windows" yield some great results (including the very same error message you get). Rather than copy/pasting them, I would suggest you try that.
In general, it helps a lot to google the error messages too

#

(I don't use windows, so I can't really help further)

untold quail Jul 9, 2022, 4:29 AM

#

Thanks

lapis sequoia Jul 9, 2022, 4:56 AM

#

blazing lagoon I wanna start learning about ai programing but i don't know if im ready, i took ...

For u muaahHh ...
https://www.reddit.com/r/learnmachinelearning/comments/cxrpjz/comment/eyn8cna/?context=3

r/learnmachinelearning - Comment by u/MarcelDeSutter on ”A clear Ro...

391 votes and 67 comments so far on Reddit

#

Termux is not compatible with ai-python, is it?

mint palm Jul 9, 2022, 5:58 AM

#

steady basalt dont they need to be reported

Thats secondary, without detection, no reporting

blissful nymph Jul 9, 2022, 6:12 AM

#

resize doesn't work and causes this too

#

ocean swallow Jul 9, 2022, 8:08 AM

#

Trying to detect views from title, subscriber and days since published data

#

What I did for embedding data is I got each word's embedding (100 features) in title and then averaged it.

#

what else can I do for dimensional reduction of Embedding data

#

or like anything to increase the prediction quality

steady basalt Jul 9, 2022, 9:58 AM

#

What gpu is that

ocean swallow Jul 9, 2022, 10:55 AM

#

steady basalt What gpu is that

me?

#

it is 2070

steady basalt Jul 9, 2022, 11:59 AM

#

Fast

serene scaffold Jul 9, 2022, 12:08 PM

#

untold quail Is there no other way?

why don't you want to install it? even if you find a workaround for this library, the c++ build tools are necessary for a lot of installations. it's one of the first things I install when I get a new Windows machine.

lavish obsidian Jul 9, 2022, 12:08 PM

#

Hi folks
Someone by chance knows how to recognize objects in image by using
"image registration" method?
I'll be happy get any help, thanks in advance ! 🙏

serene scaffold Jul 9, 2022, 12:09 PM

#

rough mountain I have a text corpus. I want to train an AI in such a way that it takes in an in...

do you have to do it in terms of whole sentences? do you want each sentence to be copied exactly from the corpus?

rough mountain Jul 9, 2022, 3:05 PM

#

serene scaffold do you have to do it in terms of whole sentences? do you want each sentence to b...

I have to do it in whole sentences, and I'm not sure about any other way to do the last part. I think I got it working. (For every two sentences I put one in labels and one in inputs)

nimble valley Jul 9, 2022, 3:28 PM

#

Guys, how do you use AI GPT-3 to create music melodies, lyrics, and other things that help with programming, engineering, or even learning English? I am completely new to programming and have only recently discovered information about AI GPT-3; please point me in the direction of where I can read or watch (youtube) about it. I'm not sure if there is a UI for collaboration with this AI GPT-3. Thanks...

mint palm Jul 9, 2022, 4:00 PM

#

anyone CURRENTLY doing the cnn course of deep learning specialisation on coursera?? i need to see updated lab code.

pearl locust Jul 9, 2022, 4:09 PM

#

Using matplotlib with Nodezator (pip install nodezator) for data visualization

#

https://www.youtube.com/watch?v=GlQJvuU7Z_8

YouTube

Indie Python

Node Editor in Python/pygame - Nodezator App - Indie Python

I'm happy to announce the Nodezator app, a node editor for the Python programming language that turns Python functions into nodes. It is expected to be released in June 2022, the first app of the Indie Python project to be released. Visit its dedicated website: http://nodezator.com

http://indiepython.com
https://twitter.com/IndiePython

http://...

▶ Play video

brave sand Jul 9, 2022, 4:50 PM

#

can I Remote Desktop into my machine from a laptop? and see the visualizations?

#

bc I’m not sure if I should buy a laptop or make my pc smaller

serene scaffold Jul 9, 2022, 4:56 PM

#

nimble valley Guys, how do you use AI GPT-3 to create music melodies, lyrics, and other things...

If you're getting started with AI, you don't want to start with GPT-3.

#

@pearl locust is this self promotion?

brave sand Jul 9, 2022, 5:15 PM

#

what about normal extensions? do I have to setup a server?

nimble valley Jul 9, 2022, 5:28 PM

#

serene scaffold If you're getting started with AI, you don't want to start with GPT-3.

What can u recommend then? And if it a not a big deal for you, pls, can ya send me some links, where I can get started with gpt-3?

minor turret Jul 9, 2022, 5:49 PM

#

How can I identify if a word is a material or not?

#

Such as:
black titanium handle

#

I am able to identify the color

#

but I need to be able to identify the titanium

pearl locust Jul 9, 2022, 6:15 PM

#

serene scaffold @pearl locust is this self promotion?

It is. I made it brief so as not to disrupt your conversation. I'm just sharing this app, since I think it can be useful for someone, especially since it is free of charge and in the public domain. That's all.

#

Here's the github page: https://github.com/KennedyRichard/nodezator

GitHub

GitHub - KennedyRichard/nodezator: multi-purpose visual editor to c...

multi-purpose visual editor to connect Python functions visually (a node editor) - GitHub - KennedyRichard/nodezator: multi-purpose visual editor to connect Python functions visually (a node editor)

rough mountain Jul 9, 2022, 6:16 PM

#

For some reason pytorch is giving me a single data point instead of a full batch, and I don't know why.

```py
dataset = torch.utils.data.TensorDataset(seqs, labels)
dataloader = DataLoader(dataset, batch_size=32, drop_last=True, shuffle=True)

for epoch in range(100):
    for i, data in enumerate(tqdm(dataset, desc=f"Epoch: {epoch}")):
        print(data)
```

#

Data here is just a tuple of one tensor from seqs and one from labels
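
The loop above iterates over `dataset`, which yields one `(seq, label)` pair at a time; batches only come from iterating over `dataloader`. A small sketch with random stand-in tensors (the shapes and sizes are assumptions, not the real data):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 100 sequences of length 8 with one integer label each.
seqs = torch.randn(100, 8)
labels = torch.randint(0, 2, (100,))

dataset = TensorDataset(seqs, labels)
dataloader = DataLoader(dataset, batch_size=32, drop_last=True, shuffle=True)

# Iterating the dataset gives single examples.
single_seq, single_label = next(iter(dataset))

# Iterating the dataloader gives stacked batches.
batch_seqs, batch_labels = next(iter(dataloader))
```

In the training loop that means wrapping the loader instead: `enumerate(tqdm(dataloader, ...))`.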

gloomy anvil Jul 9, 2022, 6:34 PM

#

Hey y'all, is there a rough formula or rule of thumb on how many iterations to try out when training a Self-Organizing Map (SOM) / Kohonen Map?

rough mountain Jul 9, 2022, 6:35 PM

#

minor turret I am able to identify the color

How are you getting the color right now?

serene scaffold Jul 9, 2022, 7:01 PM

#

nimble valley What can u recommend then? And if it a not a big deal for you, pls, can ya send ...

I would start with a beginner data science book. the fundamentals of data science lend themselves well to AI.

I know it's tempting to jump to the newest and coolest things that are happening in AI, but you won't be able to understand them until you've been developing your knowledge for a long time. starting from the basics and working your way up can still be satisfying. you have to keep a positive attitude about learning

#

!resources data science

arctic wedgeBOT Jul 9, 2022, 7:01 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

primal gyro Jul 9, 2022, 7:11 PM

#

Hello. Can I ask a statistics question here?

#

Or would #algos-and-data-structs be better?

gloomy anvil Jul 9, 2022, 7:16 PM

#

primal gyro Hello. Can I ask a statistics question here?

Dont ask to ask 😉

primal gyro Jul 9, 2022, 7:17 PM

#

gloomy anvil Dont ask to ask 😉

Nvm I see the channel description says statistics.

#

If I have some z-scores and combine them into one z-score, is this a good approach? I have 1 sample so I can't use Stouffer's method.

pulsar hull Jul 9, 2022, 7:30 PM

#

I'm trying to make backpropagation myself, and one problem I have is that it doesn't work when a layer is using relu as its activation function instead of sigmoid, since the derivative I tried was relu(x)/x, and that can sometimes result in division by zero errors. Do I have the derivative wrong or is there something else I'm missing about it?
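
On the derivative itself: `relu(x)/x` gives the right value everywhere except at `x == 0`, where it divides by zero. ReLU's derivative is 1 for positive inputs and 0 for negative ones; at exactly 0 it is undefined, and the usual convention is to pick 0 there. A minimal NumPy sketch (the function names are just illustrative):

```python
import numpy as np


def relu(x):
    """max(x, 0), applied elementwise."""
    return np.maximum(x, 0.0)


def relu_grad(x):
    """Derivative of ReLU: 1 where x > 0, else 0 (0 is chosen at x == 0)."""
    return (np.asarray(x) > 0).astype(float)
```

In backprop the upstream gradient is multiplied elementwise by `relu_grad(z)`, where `z` is the layer's pre-activation input.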

nimble valley Jul 9, 2022, 7:32 PM

#

@serene scaffold man, thanks a lot, I utterly appreciate your swift responses and advices so so much, be blessed and happy

mint palm Jul 9, 2022, 7:34 PM

#

is YOLO still used in industry??

steady basalt Jul 9, 2022, 7:40 PM

#

nimble valley Guys, how do you use AI GPT-3 to create music melodies, lyrics, and other things...

u can use davinci on openais playground, but it isnt going to create melodies for u lol

serene scaffold Jul 9, 2022, 7:48 PM

#

primal gyro Hello. Can I ask a statistics question here?

statistics questions are better for here. #algos-and-data-structs is more about sorts, graph traversals, big-O, etc. as well as general CS theory stuff.

primal gyro Jul 9, 2022, 7:49 PM

#

serene scaffold statistics questions are better for here. #algos-and-data-structs is more about so...

Ok.

nimble valley Jul 9, 2022, 7:56 PM

#

steady basalt u can use davinci on openais playground, but it isnt going to create melodies fo...

What will davinci create then? In general, what can I do with davinci? What other benefits can it give? Thanks

steady basalt Jul 9, 2022, 8:02 PM

#

you can tell it to say something

misty flint Jul 9, 2022, 8:59 PM

#

speaking of R, no offense to R users, but i made it the 3rd round for a company and decided to cancel my upcoming interviews bc i found out they worked in R mostly (among other reasons)

#

if you do have to use R, for whatever reason, you can use R Studio or even Jupyter notebooks for it

#

DoggoKek

#

kekHands

#

it actually has some good stats packages

#

for monte carlo simulations, etc.

#

or bayesian stuff

#

but most of the time you dont need that

#

and if youre going to deploy, its better to be in python anyway

#

i feel like youre more marketable too

#

also you can switch to more software jobs easier if you want to later on

#

bro...

#

you should see javascript

#

kekHands

#

that dot notation

#

but yeah it is kinda heavy in R

#

oh the tidyverse is good too if you ever have to work in R

#

py_strong

ocean swallow Jul 9, 2022, 9:26 PM

#

ocean swallow Trying to detect views from title, subscriber and days since published data

oooooops. outputs are supposed to be between -1 and 1 and I used sigmoid. my bad

brave sand Jul 9, 2022, 9:49 PM

#

so guys should I buy a laptop to Remote Desktop into my pc or should I make my pc smaller? Rn I’m doing work in MARL

steady basalt Jul 9, 2022, 11:42 PM

#

misty flint for monte carlo simulations, etc.

how in python

steady basalt Jul 9, 2022, 11:43 PM

#

brave sand so guys should I buy a laptop to Remote Desktop into my pc or should I make my p...

might as well get a macbook pro

misty flint Jul 9, 2022, 11:54 PM

#

steady basalt how in python

hmm there are a couple specific libraries

#

but

steady basalt Jul 9, 2022, 11:55 PM

#

sm?

misty flint Jul 9, 2022, 11:55 PM

#

how to put this

#

you didnt like what i had to say about data engineering basics so ~~why would you listen to me about bayesian stats~~...

#

Oopsies

#

Run

steady basalt Jul 9, 2022, 11:55 PM

#

what data engineering does it require

#

cant u just use stats models or smtn

misty flint Jul 9, 2022, 11:56 PM

#

ok. try it

#

Oopsies

steady basalt Jul 9, 2022, 11:57 PM

#

https://pbpython.com/monte-carlo.html

Monte Carlo Simulation with Python

Performing Monte Carlo simulation using python with pandas and numpy.

#

#

lemon_blush

#

dude does it on a dataframe

#

what now?

brave sand Jul 9, 2022, 11:58 PM

#

steady basalt might as well get a macbook pro

it’s not supported

#

cuda isn’t supported

steady basalt Jul 9, 2022, 11:59 PM

#

https://pub.towardsai.net/monte-carlo-simulation-an-in-depth-tutorial-with-python-bcf6eb7856c8

Medium

Monte Carlo Simulation An In-depth Tutorial with Python

An in-depth tutorial on the Monte Carlo Simulation methods and applications with Python

#

no data engineering really just calculations
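
To illustrate the "just calculations" point, here is a generic Monte Carlo sketch in plain NumPy. It is not taken from either linked tutorial, and `estimate_pi` is a hypothetical name. It samples random points in the unit square and uses the share that lands inside the quarter circle to estimate pi.

```python
import numpy as np


def estimate_pi(n_samples, seed=0):
    """Monte Carlo estimate of pi from uniform points in the unit square."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_samples, 2))            # x, y in [0, 1)
    inside = (points ** 2).sum(axis=1) <= 1.0      # inside the quarter circle
    return 4.0 * inside.mean()                     # area ratio is pi / 4
```

The error shrinks roughly with the square root of the sample count, so ten times more samples buys only about one more significant digit of accuracy per hundredfold increase.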

steady basalt Jul 9, 2022, 11:59 PM

#

brave sand cuda isn’t supported

why u need that

brave sand Jul 10, 2022, 12:04 AM

#

steady basalt why u need that

ml?

#

I use pytorch

steady basalt Jul 10, 2022, 12:10 AM

#

Well it depends if u need something rly strong like RTX 3080 or better

#

If not the m1 pro has a gpu that’s alright

still dirge Jul 10, 2022, 12:15 AM

#

i must say, you're one of my biggest inspirations to pursue DS
this one is hella neat, what equation/function is that? 👀

eager wedge Jul 10, 2022, 3:41 AM

#

I created a segmentation model with train: 80%, val: 60% after 500 epochs. What is the problem? Is it overfitting or underfitting?

swift furnace Jul 10, 2022, 4:26 AM

#

Where to get started with Data Science?

tacit basin Jul 10, 2022, 5:03 AM

#

swift furnace Where to get started with Data Science?

https://www.pythondiscord.com/resources/?topics=data-science

Python Discord | Resources

We're a large, friendly community focused around the Python programming language. Our community is open to those who wish to learn the language, as well as those looking to help others.

burnt citrus Jul 10, 2022, 5:24 AM

#

having been on here for a while I see that many share my sentiment that pandas is very difficult to use so I made an alt for CSV parsing

#

https://github.com/TheArctesian/coalas/tree/main

GitHub

GitHub - TheArctesian/coalas

Contribute to TheArctesian/coalas development by creating an account on GitHub.

#

any feed back would be very nice

royal garnet Jul 10, 2022, 6:14 AM

#

So, I'm reading this: https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas

I need to read a dataframe row by row - and then depending on the contents of certain cells, write stuff to a list of lists that I'll eventually write to a csv.

But it sounds like iterating over a dataframe is not a great idea - so how would someone do that?

Stack Overflow

How to iterate over rows in a DataFrame in Pandas

I have a pandas dataframe, df:
c1 c2
0 10 100
1 11 110
2 12 120

How do I iterate over the rows of this dataframe? For every row, I want to be able to access its elements (values in cell...

#

For example, my df contains a column containing a uuid, a name, and then a bunch of other columns with boolean values - and based on those I want to write a line to a csv that has that uuid, name, and those boolean values.
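
For a case like this, selecting the columns is often enough and no explicit row loop is needed. A hedged sketch with invented data: `is_admin` and `is_active` are hypothetical stand-ins for the boolean columns, and the real frame may need filtering first.

```python
import pandas as pd

df = pd.DataFrame({
    "uuid": ["a1", "b2"],
    "name": ["Ann", "Bob"],
    "is_admin": [True, False],
    "is_active": [False, True],
})

bool_cols = ["is_admin", "is_active"]
cols = ["uuid", "name", *bool_cols]

# List of lists, one inner list per row, without iterating in Python.
rows = df[cols].values.tolist()

# Or skip the intermediate list and produce the CSV text directly.
# Passing a file path as the first argument writes to disk instead.
csv_text = df[cols].to_csv(index=False)
```

If a loop really is needed, `df.itertuples(index=False)` is generally much faster than `df.iterrows()`.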

haughty root Jul 10, 2022, 9:31 AM

#

Hello guys, I have a question

#

Supervised learning has performance measure techniques such as Cross-Val, Confusion-Matrix, Precision & Recall, ROC ..

#

How could we therefore evaluate a reinforced agent in RL ??

upbeat furnace Jul 10, 2022, 10:24 AM

#

Heyy guys, what does this mean? 😅 " tensorflow 2.9.1 requires protobuf<3.20,>=3.9.2, but you have protobuf 4.21.2 which is incompatible."

#

does it mean I have to use an older version of protobuf?

serene scaffold Jul 10, 2022, 10:36 AM

#

upbeat furnace Heyy guys, what does this mean? 😅 " tensorflow 2.9.1 requires protobuf<3.20,>=3...

do you know what those three-part version numbers mean?

serene scaffold Jul 10, 2022, 10:37 AM

#

royal garnet So, I'm reading this: https://stackoverflow.com/questions/16476924/how-to-iterat...

depending on the contents of certain cells
let's make this question not abstract. Please do print(df.head().to_dict('list')), show the text, and say which rows you don't want and why.

upbeat furnace Jul 10, 2022, 10:38 AM

#

serene scaffold do you know what those three-part version numbers mean?

Requires protobuf version Less than 3.2 or more or equal than 3.9.2 is what I think

#

But I thought that wouldn't make sense

serene scaffold Jul 10, 2022, 10:38 AM

#

upbeat furnace Requires protobuf version Less than 3.2 or more or equal than 3.9.2 is what I th...

for version numbers, 3.2 is not the same as 3.20

#

they're not decimal numbers.

upbeat furnace Jul 10, 2022, 10:39 AM

#

Ohh

serene scaffold Jul 10, 2022, 10:39 AM

#

when you have x.y.z, x, y, and z are each their own number

#

that's why we went from Python 3.9 to 3.10
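
The same idea in code: treat each part as its own integer and compare the parts in order. This is a simplified sketch for plain dotted versions only; `parse_version` and `satisfies` are hypothetical helpers, and real installers follow the fuller PEP 440 rules (for example via the `packaging` library).

```python
def parse_version(text):
    """'3.9.2' -> (3, 9, 2); each dotted part is a separate integer."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, lower_inclusive, upper_exclusive):
    """True if lower_inclusive <= version < upper_exclusive."""
    v = parse_version(version)
    return parse_version(lower_inclusive) <= v < parse_version(upper_exclusive)


# The requirement from the error message: protobuf<3.20,>=3.9.2
installed_ok = satisfies("4.21.2", "3.9.2", "3.20")   # the installed version
older_ok = satisfies("3.19.4", "3.9.2", "3.20")       # an older 3.x release
```

So the range is a single window from 3.9.2 up to, but not including, 3.20, and 4.21.2 falls outside it.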

upbeat furnace Jul 10, 2022, 10:40 AM

#

Ahh thanks! 😅

serene scaffold Jul 10, 2022, 10:40 AM

#

upbeat furnace Ahh thanks! 😅

look into "semantic versioning"

steady basalt Jul 10, 2022, 11:22 AM

#

do u think we shud have a pinned or channel description for all the people who asking where to start?

serene scaffold Jul 10, 2022, 11:25 AM

#

steady basalt do u think we shud have a pinned or channel description for all the peope who as...

I'd have to write it, and that is work.

eager wedge Jul 10, 2022, 1:48 PM

#

Can my segmented image be in gray scale if it is multiclass?

tidal bough Jul 10, 2022, 1:49 PM

#

still dirge i must say, you're one of my biggest inspirations to pursue DS this one is hella...

whoa, that's an old-ass plot of mine you found

#

this was simulating, IIRC, a particle moving in a combination of constant magnetic and electric fields, for an electrodynamics homework task

#

I realised after making it that the task in question was in fact exactly analytically solvable (rather than only approximately so like I thought) by just applying a fourier transform 😔

still dirge Jul 10, 2022, 1:52 PM

#

tidal bough whoa, that's an old-ass plot of mine you found

haha yeah was going through some videos/pics for some ✨ inspiration ✨ and was taken aback by that

#

do you perhaps have the task because i really want to try doing that

#

it's hella cool!

tidal bough Jul 10, 2022, 2:05 PM

#

still dirge do you perhaps have the task because i really want to try doing that

Here, dug it up (and translated to English). Looks like it was a variable field, not a constant one.

#

.latex A dielectric can be described using a model in which each atom consists of a stationary positive charge and an electron moving around close to it. Their interaction is described by a simplified potential $\frac{m \omega_0^2 r^2}{2}$, where $r$ is the distance between the electron and the center of the atom, $m$ is the electron mass. Find the dielectric permittivity tensor of the medium in a variable magnetic field $B$ which is pointing along axis $z$ and has frequency $\omega$. Represent the answer in terms of the following parameter:
\[
\omega_{LT}= \frac{2 \pi e^2 n}{m \omega_0},
\]

where $n$ is the concentration of atoms, $e$ is the electron charge.

strange elbowBOT Jul 10, 2022, 2:05 PM

#

$latex.png$

tidal bough Jul 10, 2022, 2:05 PM

#

yay, surprised the bot handled that correctly.

still dirge Jul 10, 2022, 2:06 PM

#

tidal bough .latex A dielectric can be described using a model in which each atom consists ...

wah, thank you!

tidal bough Jul 10, 2022, 2:17 PM

#

strange elbow

(note: the permittivity tensor is at frequency omega, too. So you have a magnetic field at that frequency, and apply an electric field at that frequency, and get some electrons moving at that frequency as a result.)

serene scaffold Jul 10, 2022, 2:31 PM

#

tidal bough (note: the permittivity tensor is at frequency omega, too. So you have a magneti...

what is this physics shit

steady basalt Jul 10, 2022, 2:34 PM

#

serene scaffold I'd have to write it, and that is work.

What about a link to a pre existing website ha

wooden sail Jul 10, 2022, 2:53 PM

#

ooh pretty nice

misty flint Jul 10, 2022, 4:31 PM

#

serene scaffold I'd have to write it, and that is work.

would rather you do the one for #career-advice instead

#

kekHands

sage fulcrum Jul 10, 2022, 4:45 PM

#

Hm

#

Fail to build pycocotools in linux ?

#

Without conda , does anyone have any fix

steady basalt Jul 10, 2022, 5:01 PM

#

is sklearn's SVR rbf kernel Gaussian?

night sequoia Jul 10, 2022, 5:02 PM

#

Hey all , this is my new notebook on Support Vector Machines (from the book Hand's on Machine Learning) , do leave an upvote if you learn something. Cheers! https://www.kaggle.com/code/supreeth888/support-vector-machines-hand-s-on-ml/notebook

Support Vector Machines - Hand's on ML

Explore and run machine learning code with Kaggle Notebooks | Using data from No attached data sources

steady basalt Jul 10, 2022, 5:02 PM

#

@night sequoia ur the perfect person to answer my question then

#

im literally writing a report on svr as we speak

#

if its ok can i use ur visualisation code i cba to write my own

night sequoia Jul 10, 2022, 5:04 PM

#

steady basalt is sklearn's SVR rbf kernel Gaussian?

Cool ! Check that notebook and let me know what you think ?

steady basalt Jul 10, 2022, 5:04 PM

#

so is sklearn using Gaussian rbf?

#

oh nvm

#

theres only one kernel

night sequoia Jul 10, 2022, 5:06 PM

#

steady basalt so is sklearn using Gaussian rbf?

yes, it does: pass kernel="rbf" (the default for SVR in scikit-learn), which is the Gaussian radial basis function kernel
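
For reference, the "rbf" kernel is K(x, x') = exp(-gamma * ||x - x'||^2). Here is a minimal pure-Python sketch of that formula; the function name `rbf_kernel_value` and the sample points are made up for illustration, and scikit-learn computes this internally:

```python
import math

def rbf_kernel_value(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Identical points have similarity 1; the value decays towards 0 with distance.
same = rbf_kernel_value([1.0, 2.0], [1.0, 2.0], gamma=0.5)
near = rbf_kernel_value([1.0, 2.0], [1.5, 2.0], gamma=0.5)
far = rbf_kernel_value([1.0, 2.0], [4.0, 6.0], gamma=0.5)
print(same, near, far)
```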

steady basalt Jul 10, 2022, 5:06 PM

#

do u think i shud visualise sample points to show how it works

#

or just use a randomised one like you

night sequoia Jul 10, 2022, 5:07 PM

#

u can use the make_moons dataset

iron basalt Jul 10, 2022, 5:20 PM

#

haughty root How could we therefore evaluate a reinforced agent in RL ??

https://arxiv.org/pdf/1912.05663.pdf

sharp sinew Jul 10, 2022, 5:55 PM

#

Can someone help to refer a text book of stats?

tacit basin Jul 10, 2022, 6:04 PM

#

sharp sinew Can someone help to refer a text book of stats?

https://www.statlearning.com/

Python code: https://github.com/JWarmenhoven/ISLR-python

An Introduction to Statistical Learning

GitHub

GitHub - JWarmenhoven/ISLR-python: An Introduction to Statistical L...

An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani, 2013): Python code - GitHub - JWarmenhoven/ISLR-python: An Introduction to Statistical Learning (James, Witten, Hastie, T...

sharp sinew Jul 10, 2022, 6:13 PM

#

@tacit basin I mean something for data science; the book above doesn't cover topics like the p-test, z-test, or chi-square test

tacit basin Jul 10, 2022, 6:16 PM

#

sharp sinew @tacit basin I mean to say for helping data science,this above book doe...

It's considered one of the best books on the subject.

steady basalt Jul 10, 2022, 6:48 PM

#

can i get ur guys take on normalising by dividing by 100 if values are between 80 and 300

#

my professor did it

#

he just put eveything on a 0-3 scale

#

lol

#

well theres v small values that are <1 and they were /100 also

#

so its the same scale

#

just not 0-1

tacit basin Jul 10, 2022, 6:55 PM

#

I thought they talk p value in stats learning. Was Long time ago when i read it though...

steady basalt Jul 10, 2022, 7:23 PM

#

yo does keras have attention function

#

oh damn it does

hollow sentinel Jul 10, 2022, 7:32 PM

#

tacit basin I thought they talk p value in stats learning. Was Long time ago when i read it ...

they do

nova matrix Jul 10, 2022, 8:52 PM

#

guys I am currently working on this binary classification dataset. I've made the model and want to now apply it to the actual test data. It contains around 1 mill rows and I fear it may restart my kernel due to insufficient memory. Is there anyway I could maybe get around this. Should I split the testing set or wot. I need a final csv of all the final scores

mild dirge Jul 10, 2022, 8:53 PM

#

How many columns and what type of data? @nova matrix

nova matrix Jul 10, 2022, 9:00 PM

#

mild dirge How many columns and what type of data? @nova matrix

It's a numerical data 190 columns

#

I mean it did have a few categorical strings but aim is to change them into int vals

mild dirge Jul 10, 2022, 9:01 PM

#

Yeah so that might be too much to load at once

#

Not sure in what format it is stored right now, but you could just load it in batches

nova matrix Jul 10, 2022, 9:02 PM

#

you know any good way to do that or any link to smth that shows it, cuz I've never done it like that b4

mild dirge Jul 10, 2022, 9:02 PM

#

How is it stored right now?

nova matrix Jul 10, 2022, 9:02 PM

#

csv format

mild dirge Jul 10, 2022, 9:03 PM

#

https://datascienceparichay.com/article/pandas-read-first-n-rows-csv/

Data Science Parichay

Pandas - Read only the first n rows of a CSV file - Data Science Pa...

To read only the first n rows of a CSV file to a dataframe pass n to the nrows parameter of the pandas read_csv() function.

#

This explains how to do it pretty well
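
Besides `nrows`, `pd.read_csv` also accepts `chunksize`, which yields the file piece by piece so the whole test set never has to sit in memory at once. A rough sketch, assuming pandas is installed; the column names and the `score` function are placeholders for the real features and model:

```python
import io
import pandas as pd

# Stand-in for the real file; in practice pass the CSV path instead.
csv_file = io.StringIO(
    "id,f1,f2\n1,0.1,0.2\n2,0.3,0.4\n3,0.5,0.6\n4,0.7,0.8\n5,0.9,1.0\n"
)

def score(chunk):
    # Placeholder for something like model.predict_proba(...) on the features.
    return chunk["f1"] + chunk["f2"]

parts = []
# chunksize makes read_csv yield DataFrames of at most that many rows.
for chunk in pd.read_csv(csv_file, chunksize=2):
    parts.append(pd.DataFrame({"id": chunk["id"], "score": score(chunk)}))

result = pd.concat(parts, ignore_index=True)
print(result)
```

Only the small id/score frames are kept, so the final `result` can be written out with `result.to_csv(...)` even when the raw features are large.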

tropic tiger Jul 11, 2022, 2:16 AM

#

I'm using pandas to do expectation-maximization but I'm stuck with how the table is rearranged by the merge method after multiplying two factors.

#

this is what I have.
What I want is to rearrange column Dunett into the pattern (False, mild, severe, False, mild, severe, and so on)

#

That way it's easier for me to do a normalization

#

Does anyone have any suggestion?

crude shadow Jul 11, 2022, 2:50 AM

#

tropic tiger Does anyone have any suggestion?

Have you tried df.sort_values

tropic tiger Jul 11, 2022, 2:52 AM

#

crude shadow Have you tried df.sort_values

Would it work though because I'm sorting Dunett, and Dunett only has False, mild, and severe

crude shadow Jul 11, 2022, 2:53 AM

#

Oo just realized you wanted to order by columns too

tropic tiger Jul 11, 2022, 3:02 AM

#

maybe jump in this thread

#

https://discord.com/channels/267624335836053506/697134086614941706

#

I asked a question on here

tropic tiger Jul 11, 2022, 4:04 AM

#

I did it!! Praise God

#

In the end, I just had to add a new column seems like

#

Wow this is something I learned. To manipulate something in ur df, it often helps to add more columns to store intermediate values

grave marten Jul 11, 2022, 6:40 AM

#

is use the mnist dataset

#

i*

#

why is it showing 1875 instead of 60000 while training

#

?

#

```python
# import libraries
import tensorflow as tf
from tensorflow import keras
from keras.datasets import mnist
from matplotlib import pyplot as plt

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images / 255.0
test_images = test_images / 255.0

class_names = [0,1,2,3,4,5,6,7,8,9]

model = keras.Sequential([keras.layers.Flatten(input_shape=(28,28)),
                          keras.layers.Dense(128,activation='relu'),
                          keras.layers.Dense(10, activation='softmax')])
model.compile(optimizer='adam',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(train_images,train_labels,epochs=3)
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose = 1)
print("Test accuracy : ", test_acc)


i = 0
plt.figure()
plt.title(f"the number is : {train_labels[i]}")
plt.imshow(train_images[i])
plt.colorbar()
plt.grid(False)
plt.show()
print(class_names[test_labels[i]])
```

this is the code

tacit basin Jul 11, 2022, 8:07 AM

#

grave marten ``` # import libraries import tensorflow as tf from tensorflow import keras from...

batch_size will default to 32, so 60000/32 = 1875
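
In other words, the progress bar counts batches (steps) per epoch, not individual samples. The arithmetic as a tiny sketch:

```python
import math

num_samples = 60000   # MNIST training images
batch_size = 32       # Keras default when batch_size is not passed to fit()

# The last batch may be smaller, hence the ceiling.
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 1875
```

Passing a different `batch_size` to `model.fit` would change that number accordingly.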

lost ivy Jul 11, 2022, 10:38 AM

#

Hi, I hope this is the right forum to pose this question

#

```python
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns

data = sheet.get_all_values()
df = pd.DataFrame(data)
df_sorted = df.sort_values("GAMEs")

# NAME    GAMEs    ATT    COMP    YRDs    PCT    TDs    INTs
# Mills    14    438    287    3468    65.5    18    8
# Lance    19    318    208    2947    65.4    30    1
# Trask    26    813    552    7386    67.9    69    15
# Wilson    30    837    566    7652    67.6    56    15
# Jones    30    556    413    6126    74.3    56    7
# Fields    34    618    423    5761    68.4    67    9
# Law     40    1138    758    10098    66.6    90    17
# Book    45    1141    728    8948    63.8    72    20
# Mond    46    1358    801    9661    59    71    27
# Eger    46    1476    923    11436    62.5    94    27

num0 = df_sorted["GAMEs"]
num1 = ["green" if (g < max(num0)) else "red" for g in num0]

plt.figure(figsize = (10, 10))

plt.bar(x = df_sorted["NAME"], height = df_sorted["GAMEs"], width = 0.4, color = num1)

plt.grid(color = 'red', alpha = 0.2, linestyle = '--', linewidth = 1)

plt.xlabel("QB Names (x)", fontsize = 12)
plt.ylabel("Number of Games Played (y)", fontsize = 12)

plt.xticks(rotation = 45, fontsize = 12)
plt.yticks(fontsize = 12)

plt.title("Total Number of Games Played", fontsize = 12, fontweight = "bold")

plt.show()
```

#

This starts at 14, I would like it to start at zero

#

adding plt.ylim(0, 50) does this

#

adding bottom = 0 or bottom = None, does not change the graph at all

#

I have tried everything and researched multiple sites that offer almost the same advice and none of it works. So decided to try here
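
One possible cause, assuming `sheet` is a gspread worksheet: `get_all_values()` returns every cell as a string, so the bar heights may be treated as category labels rather than numbers, and the axis then starts at the first label ("14") instead of 0. A sketch of converting the column before plotting; the three sample rows are a made-up subset of the table above:

```python
import pandas as pd

# Hypothetical rows shaped like a worksheet dump: header first, all strings.
data = [["NAME", "GAMEs"], ["Mills", "14"], ["Lance", "19"], ["Trask", "26"]]

df = pd.DataFrame(data[1:], columns=data[0])
df["GAMEs"] = pd.to_numeric(df["GAMEs"])   # strings -> integers
df_sorted = df.sort_values("GAMEs")

print(df_sorted.dtypes)
```

With numeric heights, `plt.bar` normally draws the bars up from 0 without needing `plt.ylim` or `bottom`.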

stoic viper Jul 11, 2022, 10:49 AM

#

2 dataframes in a list and this doesnt work....

```python
for df in df_list:
    df.loc[~(df==0).any(axis=1)]
```

#

why

lost ivy Jul 11, 2022, 10:50 AM

#

stoic viper 2 dataframes in a list and this doesnt work.... ```python for df in df_list: ...

Because they are the same df and you are telling it to produce the same output?

stoic viper Jul 11, 2022, 10:51 AM

#

no its not the same df

#

i just want rows with zeroes to be gone

lost ivy Jul 11, 2022, 10:55 AM

#

you want to replace 0's in the df with 1's correct?

stoic viper Jul 11, 2022, 10:56 AM

#

no just delete all 0

#

actually i want do delete rows that contain 1

#

in one row and zeroes in another, but i wrote a small example

#

and then use that on the list. Problem is it works without for, but not with for

lost ivy Jul 11, 2022, 11:03 AM

#

1 for df in df_list:
----> 2 df.loc[~(df==0).any(axis=1)]

AttributeError: 'int' object has no attribute 'loc'

tidal bough Jul 11, 2022, 11:05 AM

#

stoic viper 2 dataframes in a list and this doesnt work.... ```python for df in df_list: ...

df.loc doesn't modify the original df, so doing df.loc[something] and discarding the result is like doing nothing.
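
A minimal sketch of one way around that, assuming pandas is installed and using two made-up dataframes: build a new list from the filtered results instead of discarding them.

```python
import pandas as pd

df_list = [
    pd.DataFrame({"a": [1, 0, 3], "b": [4, 5, 6]}),
    pd.DataFrame({"a": [7, 8], "b": [0, 9]}),
]

# Boolean indexing returns a new DataFrame, so store the result.
df_list = [df.loc[~(df == 0).any(axis=1)] for df in df_list]

for df in df_list:
    print(df)
```

Reassigning `df` inside a plain `for` loop would only rebind the loop variable, which is why the list is rebuilt here.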

gloomy anvil Jul 11, 2022, 11:31 AM

#

So I evaluated a number of models in terms of binary classification for different data. Now that I am writing about it, I am unsure how to group these models thematically and especially regarding my table of contents. The models are:

Logistic Regression, Bernoulli Naive Bayes, Random Forest, Regularized Greedy Forest, XGB, Deep Neural Network, ROCKET, SVM, KNN, LSTM, GRU, RNN, Voting, Stacking, Bagging.

Easy ones:
Recurrent models: LSTM, GRU, RNN

Decision Trees: Random Forest, RGF, XGB

Ensemble Models: Voting, Stacking, Bagging

That leaves: Logistic Regression, Bernoulli Naive Bayes, Deep Neural Network, ROCKET, SVM, KNN,

As ROCKET uses convolutional kernels and basically resembles CNNs, I thought about grouping DNN and ROCKET as Neural Networks, but I mean the recurrent models above are neural networks as well.

SVM and KNN could be grouped as nonparametric models. That would leave Logistic Regression and Bernoulli. Logistic Regression despite its name could be categorized as generalized linear model in this binary classification approach. But Bernoulli is not really a linear model, is it?

I would love to categorize the models ideally in groups of 3 (give or take). Do you have suggestions on what kind of meta chapters / categories to choose? How would you go about categorizing these models?

fallow frost Jul 11, 2022, 12:48 PM

#

If im going to pursue a career in Data analytics which languages, libraries, and tools do I need to know ?

#

I already know Python pretty well (OOP, Numpy, Pandas), some bash, and SQL pretty decently, what should be my next priority ?

serene scaffold Jul 11, 2022, 12:53 PM

#

@fallow frost try doing some actual projects that leverage or build on all of these skills

fallow frost Jul 11, 2022, 12:54 PM

#

serene scaffold @fallow frost try doing some actual projects that leverage or build on...

I have! I made quite a few, and im planning to upload 2 more to my github

#

but do you think I could even get an Internship or an entry level job yet with just these skills ?

fallow frost Jul 11, 2022, 12:55 PM

#

fallow frost I have! I made quite a few, and im planning to upload 2 more to my github

https://github.com/shner-elmo

GitHub

shner-elmo - Overview

shner-elmo has 6 repositories available. Follow their code on GitHub.

serene scaffold Jul 11, 2022, 12:56 PM

#

fallow frost but do you think I could even get an Internship or an entry level job yet with j...

are you a CS student currently?

fallow frost Jul 11, 2022, 12:57 PM

#

Im self taught 😭

#

but I just enrolled in a boot camp

serene scaffold Jul 11, 2022, 12:57 PM

#

fallow frost Im self taught 😭

at least in the US, you're probably not going to get a data scientist job without a degree.

#

do you have a degree in something else?

fallow frost Jul 11, 2022, 12:57 PM

#

and it will take 7 months to finish, so I want to find something before

fallow frost Jul 11, 2022, 12:58 PM

#

serene scaffold do you have a degree in something else?

I dont have anything, not even a GED
but ik for a fact there are self taught data analyst in the US

#

which is why im doing the bootcamp, to get some kind of formal 'education'

serene scaffold Jul 11, 2022, 12:59 PM

#

fallow frost I dont have anything, not even a GED but ik for a fact there are self taught dat...

tbh, if I were you, I would quit the bootcamp (if you can get your money back) and enroll in university.

serene scaffold Jul 11, 2022, 12:59 PM

#

fallow frost which is why im doing the bootcamp, to get some kind of formal 'education'

sorry, but the bootcamp just isn't going to be valued the same as a degree by employers.

fallow frost Jul 11, 2022, 1:00 PM

#

ofc but you cant say that there is no chance at all

serene scaffold Jul 11, 2022, 1:00 PM

#

why would you go with the option that gives you a worse chance...?

fallow frost Jul 11, 2022, 1:01 PM

#

like for a senior data scientist I can understand, but for an ENTRY Data Analyst then It should be fairly easy if you have a solid portfolio

serene scaffold Jul 11, 2022, 1:01 PM

#

but for an ENTRY Data Analyst then It should be ~~fairly easy~~ not completely impossible if you have a solid portfolio

fallow frost Jul 11, 2022, 1:02 PM

#

serene scaffold why would you go with the option that gives you a worse chance...?

nah first of all I dont pay any money to the bootcamp until im hired

serene scaffold Jul 11, 2022, 1:02 PM

#

Anyway, I don't think I'm going to change your mind, but I hope this works out for you.

fallow frost Jul 11, 2022, 1:02 PM

#

so they will have to find me a job

#

Yeah probably

#

but what do you do ? @serene scaffold

#

Data scientist ? analyst ?

serene scaffold Jul 11, 2022, 1:03 PM

#

fallow frost but what do you do ? @serene scaffold

I'm an AI developer, namely for AIs that involve language.

fallow frost Jul 11, 2022, 1:03 PM

#

like NLP ?

serene scaffold Jul 11, 2022, 1:03 PM

#

yes

fallow frost Jul 11, 2022, 1:03 PM

#

ahh ok
and what kind of degree do you have ?

serene scaffold Jul 11, 2022, 1:03 PM

#

bachelors in CS, with AI and DS-related coursework.

#

I probably wouldn't have gotten this job with only a bachelors if I didn't also publish as an undergrad.

muted pendant Jul 11, 2022, 1:04 PM

#

serene scaffold bachelors in CS, with AI and DS-related coursework.

ayee im doing the same 🤝

fallow frost Jul 11, 2022, 1:04 PM

#

pardon my noobiness but wth is an undergrad

serene scaffold Jul 11, 2022, 1:05 PM

#

fallow frost pardon my noobiness but wth is an undergrad

a bachelors degree (ie not a masters or phd)

fallow frost Jul 11, 2022, 1:05 PM

#

so its the same as a bachelors ?

serene scaffold Jul 11, 2022, 1:05 PM

#

my degree is a bachelors, yes.

fallow frost Jul 11, 2022, 1:06 PM

#

ok

#

So what other tools should I study next

#

what do you recommend ?

serene scaffold Jul 11, 2022, 1:06 PM

#

I should add that I don't think university is the perfect mechanism for imparting knowledge. it's just the most widely recognized.

serene scaffold Jul 11, 2022, 1:07 PM

#

fallow frost So what other tools should I study next

I think that just learning arbitrary tools is the wrong way to do it. that's why I suggested to do more projects. preferably one that relies on some concept that you're not currently familiar with.

fallow frost Jul 11, 2022, 1:10 PM

#

serene scaffold I think that just learning arbitrary tools is the wrong way to do it. that's why...

I could keep making more projects, but for me it seems like I already mastered the basics of Python and Its already OP for being a data analyst which the max i'll do with it is clean data.
altough I do agree I should make more projects that are specifically related to data analysis using Pandas and SQL

#

but even then, just knowing those three tools isnt enough, right ?

serene scaffold Jul 11, 2022, 1:10 PM

#

fallow frost but even then, just knowing those three tools isnt enough, right ?

enough for what?

fallow frost Jul 11, 2022, 1:10 PM

#

so thats why im asking what else I should learn

#

to find a job

serene scaffold Jul 11, 2022, 1:11 PM

#

each job is going to have different criteria that they look for

fallow frost Jul 11, 2022, 1:11 PM

#

which imo I will be able to find even b4 I finish the boot camp if I manage to master those tools

#

and btw im not trying to get a 'good' job, I really dont give a shit as long as im hired

#

then I could leverage my work experience to get smth better instead of doing 3-4 years of college

wooden sail Jul 11, 2022, 1:12 PM

#

do you have some idea of the kind of position you'd like, or where you'd like to work? you could look at job descriptions and see how you compare to the sought-after profile

#

or whether you feel confident enough to do what the task descriptions contain

fallow frost Jul 11, 2022, 1:14 PM

#

wooden sail do you have some idea of the kind of position you'd like, or where you'd like to...

where it dosent matter, but I would love to make scripts/programs with python and work with pandas, since I really enjoy creating programs, and Im good at that imho

wooden sail Jul 11, 2022, 1:14 PM

#

hmm but it makes more sense imo to focus on the task, not the tool. what if you don't even get the choice to work with python, for whatever reason?

fallow frost Jul 11, 2022, 1:16 PM

#

wooden sail do you have some idea of the kind of position you'd like, or where you'd like to...

I'm looking at the job descriptions on LinkedIn and 95% of the job postings are by agencies who dont know jack shit about programming or data science, but most are like: 'we're looking for an Entry level data analyst with 2+ years of experience who must know the following 10 technologies, and the ideal candidate will also be familiar with these additional 10 tools'
so its fucking ridiculous

fallow frost Jul 11, 2022, 1:16 PM

#

wooden sail hmm but it makes more sense imo to focus on the task, not the tool. what if you ...

exactly, so what should I learn ?

#

Like there is so much that im not sure where to continue
Data visualization: Tableau, Power BI, ETL, ML, PySpark and so many more

wooden sail Jul 11, 2022, 1:21 PM

#

i guess i couldn't say, i guess my position is a little removed from the real world

#

looks like you'd wanna go into visualization next, in any case

limber token Jul 11, 2022, 1:28 PM

#

Is there a way to pass multiple separators to pandas' read_csv()? Something akin to this:

spiral furnace Jul 11, 2022, 1:29 PM

#

fallow frost I'm looking at the job descriptions on LinkedIn and 95% of the job postings are b...

so true

fallow frost Jul 11, 2022, 1:31 PM

#

limber token Is there a way to pass multiple separators to pandas' `read_csv()`? Something ak...

there is
https://stackoverflow.com/questions/48063620/pandas-read-csv-for-multiple-delimiters

Stack Overflow

pandas read_csv() for multiple delimiters

I have a file which has data as follows

1000000 183:0.6673;2:0.3535;359:0.304;363:0.1835
1000001 92:1.0
1000002 112:1.0
1000003 154435:0.746;30:0.3902;220:0.2803;238:0.2781;232:0.2717
1000004 118:...

limber token Jul 11, 2022, 1:32 PM

#

fallow frost there is https://stackoverflow.com/questions/48063620/pandas-read-csv-for-multi...

I've no idea how to use Regex, I need it to use both ; and , as separators

fallow frost Jul 11, 2022, 1:33 PM

#

spiral furnace so true

it sucks, Ill need to find a way to reach out to smaller companies directly

limber token Jul 11, 2022, 1:36 PM

#

fallow frost there is https://stackoverflow.com/questions/48063620/pandas-read-csv-for-multi...

pandas.errors.ParserError: Expected 1140 fields in line 1029, saw 1144. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

#

Used: df = pd.read_csv('products_2.csv', sep="\s+|;|,", engine='python')

serene scaffold Jul 11, 2022, 1:39 PM

#

limber token I've no idea how to use Regex, I need it to use both `;` and `,` as separators

@limber token chances are, this CSV actually represents nested tables

#

because what that error message is telling you is that if you use \s+|;|, as the delimiter, each row has different numbers of columns

#

and that's not allowed.

limber token Jul 11, 2022, 1:39 PM

#

Hm

#

Libre Office reads it perfectly, but it has a size limit

serene scaffold Jul 11, 2022, 1:40 PM

#

rekt

limber token Jul 11, 2022, 1:40 PM

#

And I prefer to manipulate data in Pandas anyway

#

What should I do?

serene scaffold Jul 11, 2022, 1:40 PM

#

see if you can figure out which delimiter is used the same number of times in each row

#

that's the real file-level delimiter. if you need to break up the nested data, you can use .str.split(';').explode()
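
A rough way to run that check with only the standard library is to tally, for each candidate delimiter, how many times it appears per line. The sample text below is made up; in practice you would iterate over the opened file, and quoted fields that contain a delimiter would throw the counts off:

```python
import io
from collections import Counter

# Hypothetical sample; in practice iterate over open("products_2.csv").
sample = io.StringIO(
    "sku,name,tags\n"
    "1,chair,wood;oak;brown\n"
    "2,lamp,metal\n"
    "3,desk,wood;pine\n"
)

candidates = [",", ";"]
# For each delimiter, tally how often each per-line count occurs.
counts = {d: Counter() for d in candidates}
for line in sample:
    for d in candidates:
        counts[d][line.count(d)] += 1

# A delimiter seen the same number of times on every line is the likely file-level one.
consistent = [d for d in candidates if len(counts[d]) == 1]
print(consistent)
```

Here only the comma is consistent, so the file would be read with `sep=","` and the `;`-separated column split afterwards.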

limber token Jul 11, 2022, 1:42 PM

#

I'm not sure what to do.
The file is an import file for my work, but it was a faulty import file that set a bunch of attribute values to empty strings when it shouldn't have; my job is to recover the lost attributes

serene scaffold Jul 11, 2022, 1:42 PM

#

limber token I'm not sure what to do. The file is an import file for my work, but it was a fa...

did you figure out which delimiter is used the same number of times in every row?

limber token Jul 11, 2022, 1:43 PM

#

serene scaffold did you figure out which delimiter is used the same number of times in every row...

Looks like it's the comma

regal warren Jul 11, 2022, 2:09 PM

#

Hi everyone. Can someone please let me know what is the best way and resource to learn Pyspark?

spiral furnace Jul 11, 2022, 2:56 PM

#

guys do you face problems lately using chrome with google colab?

mild dirge Jul 11, 2022, 4:03 PM

#

Nope 👍🏽

terse dagger Jul 11, 2022, 4:06 PM

#

Hey guys,

Does anyone know how to fix this error?

ValueError: cannot reindex on an axis with duplicate labels

#help-orange full context

serene scaffold Jul 11, 2022, 4:07 PM

#

terse dagger Hey guys, Does anyone know how to fix this error? ```ValueError: cannot reind...

please show the whole error message from Traceback:, and the relevant code, and print(df.head().to_dict())

terse dagger Jul 11, 2022, 4:08 PM

#

Ill do the print lets continue in orange channel

river cloud Jul 11, 2022, 4:34 PM

#

Hello everyone! I need your help, please. I have been investigating this for a long time but I don't have an answer. I need to export a dataframe to Excel, but in Excel the "number format" of the cells needs to be "Text", like in the pic. Whenever I export to Excel it puts "General" and not "Text"

spiral furnace Jul 11, 2022, 4:41 PM

#

river cloud Hello everyone!. I need your help please, I am investigate this to long time but...

I think "general" is the default option when you create an excel... try to convert it in your excel sheet instead

river cloud Jul 11, 2022, 4:49 PM

#

spiral furnace I think "general" is the default option when you create an excel... try to conve...

But I need to do it from python(pandas or other library) but automatically

spiral furnace Jul 11, 2022, 4:57 PM

#

river cloud But I need to do it from python(pandas or other library) but automatically

there are excel libraries for python

#

https://www.excelpython.org/

Python for Excel

Python for Excel compiles the best open-source Python libraries for working with Excel. It helps you choose the most suitable library for your use case.

#

have you also checked Pandas.ExcelWriter Method?

#

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.ExcelWriter.html

river cloud Jul 11, 2022, 5:05 PM

#

spiral furnace https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.ExcelWriter.ht...

I took that but the number format gives me "general"

spiral furnace Jul 11, 2022, 5:07 PM

#

river cloud I took that but the number format gives me "general"

maybe try with XlsxWriter

odd meteor Jul 11, 2022, 5:08 PM

#

regal warren Hi everyone. Can someone please let me know what is the best way and resource to...

I'd say, using YouTube or Buying courses on DataCamp/DataQuest/Coursera etc

spiral furnace Jul 11, 2022, 5:08 PM

#

but first you might have to create your excel file with pandas and then edit it with XlsxWriter

odd meteor Jul 11, 2022, 5:08 PM

#

spiral furnace guys do you face problems lately using chrome with google colab?

Nah

river cloud Jul 11, 2022, 5:11 PM

#

spiral furnace but first you might have to create your excel file with pandas and then edit it w...

I will try, thx u Obserdo!

paper quarry Jul 11, 2022, 5:14 PM

#

Hello! Does anyone here knows how to implement the NBEATS algorithm and can help me with it? Thanks in advance.

boreal cape Jul 11, 2022, 5:27 PM

#

Hello. Does anyone know what these mean or how to replicate this format?

1.346899999999999977e+02
1.322500000000000000e+02
1.300000000000000000e+02
1.335200000000000102e+02
1.303079999999999927e+02

mild dirge Jul 11, 2022, 5:27 PM

#

It's scientific notation

#

the top one is like 1.346899.. * 10 ^ 02

#

@boreal cape

boreal cape Jul 11, 2022, 5:28 PM

#

Hmm... These are supposed to be stock prices so I have no idea how it got turned into that.

wooden sail Jul 11, 2022, 5:29 PM

#

it's just a way to print a float

mild dirge Jul 11, 2022, 5:29 PM

#

Yeah, it doesn't change anything about the numbers

#

the number is still 134.6899...
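
A quick way to see this in plain Python. The guess that the file was written by `numpy.savetxt`, whose default number format is `"%.18e"`, is an assumption based on how the values look:

```python
# Parsing the scientific-notation text gives back the ordinary float.
value = float("1.346899999999999977e+02")
print(value)  # 134.69

# "%.18e" prints 18 digits after the decimal point plus an exponent,
# which is the same layout as the lines in the file.
formatted = "%.18e" % value
print(formatted)
```

So reading the file with `float(...)` (or `numpy.loadtxt`) recovers the prices, and formatting with `"%.18e"` reproduces the layout.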

boreal cape Jul 11, 2022, 5:29 PM

#

OH

#

Lightbulb

#

Y'all just solved 2 days of frustration

pliant star Jul 11, 2022, 6:28 PM

#

hey guys, any idea why the code keeps crashing me?

```python
def detect(self, image, original_shape=None):
    height, width, channels = image.shape

    blob = cv2.dnn.blobFromImage(image, 0.00392, (416, 416), (0, 0, 0), True, crop=False)

    self.net.setInput(blob)
    outs = self.net.forward(self.output_layers)
```

#

(tensorflow)   SoSe22/appliedcomputerscienceinsports/detector » python main.py                                                                        main deleted modified untracked
Traceback (most recent call last):
  File "/Users/s/dev/uni/SoSe22/appliedcomputerscienceinsports/detector/main.py", line 55, in <module>
    run()
  File "/Users/s/dev/uni/SoSe22/appliedcomputerscienceinsports/detector/main.py", line 32, in run
    processed_image = detector.detect(preprocessed_image, original_image.shape)
  File "/Users/s/dev/uni/SoSe22/appliedcomputerscienceinsports/detector/test_ui/detector/__init__.py", line 32, in detect
    blob = cv2.dnn.blobFromImage(image, 1, (416, 416), (0, 0, 0), True, crop=False)
cv2.error: OpenCV(4.6.0) /Users/xperience/actions-runner/_work/opencv-python/opencv-python/opencv/modules/imgproc/src/resize.cpp:3689: error: (-215:Assertion failed) !dsize.empty() in function 'resize'```

#

(3, 2160, 1216) that's my image.shape

mild dirge Jul 11, 2022, 6:32 PM

#

That normally means that the image is empty @pliant star maybe you didn't load it in correctly

#

Like a wrong path or something

#

Try cv2.imshow("image", image) and cv2.waitKey(0)

#

before the line that gives the error
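
Another thing worth checking: the reported shape (3, 2160, 1216) looks channels-first, while OpenCV functions generally expect height x width x channels, so `height, width, channels = image.shape` would give height 3. Whether this is what triggers the assertion is a guess, but converting the layout is cheap. A sketch with a dummy array, assuming the preprocessing step produced channels-first data:

```python
import numpy as np

# Dummy array with the reported channels-first shape (channels, height, width).
image_chw = np.zeros((3, 2160, 1216), dtype=np.uint8)

# Move the channel axis to the end: (height, width, channels).
image_hwc = np.ascontiguousarray(np.transpose(image_chw, (1, 2, 0)))

height, width, channels = image_hwc.shape
print(height, width, channels)  # 2160 1216 3
```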

mint palm Jul 11, 2022, 6:39 PM

#

is it possible to detect 3 classes using only 2 "anchor boxes" when using bounding box technique??

#

why does he take 2

#

am i right if i think, we assume that more than 2 classes are highly unlikely in single bounding box??

mild dirge Jul 11, 2022, 6:54 PM

#

Anchor boxes are a set of predefined bounding boxes of a certain height and width. These boxes are defined to capture the scale and aspect ratio of specific object classes you want to detect and are typically chosen based on object sizes in your training datasets.

#

This is what I found on anchor boxes

#

It seems that there may be 2 different shapes that are likely to contain a car/motorcycle/pedestrian etc.

#

@mint palm

#

Does that make sense?

#

So for each bounding box shape (2) you get a result for each class (8) whether that class is contained in a single box of the 3x3 grid

mint palm Jul 11, 2022, 7:09 PM

#

mild dirge It seems that there may be 2 different shapes that are likely to contain a car/m...

this seems to contradict what he says later; also, the 8 signifies the following:

is something detected
center of detection x
center of detection y
length of anchor box
width of anchor box
one hot representation for class1
one hot representation for class2
one hot representation for class3

these are stacked twice (as seen)
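
Under that reading, the number of anchor boxes is independent of the number of classes: each anchor slot carries its own class scores, and 2 anchors only limit a grid cell to at most 2 objects. A small sketch of the layout, assuming the 3x3 grid, 2 anchors and 3 classes mentioned above; `cell_target` is a hypothetical helper, not part of any library:

```python
GRID = 3          # 3x3 grid of cells
NUM_ANCHORS = 2   # anchor boxes per cell
NUM_CLASSES = 3

# Per anchor: objectness, x, y, height, width, then one value per class.
values_per_anchor = 5 + NUM_CLASSES
output_shape = (GRID, GRID, NUM_ANCHORS * values_per_anchor)
print(output_shape)  # (3, 3, 16)

def cell_target(objects):
    """objects: {anchor_index: (x, y, h, w, class_index)} for one grid cell."""
    target = [0.0] * (NUM_ANCHORS * values_per_anchor)
    for anchor, (x, y, h, w, cls) in objects.items():
        base = anchor * values_per_anchor
        target[base:base + 5] = [1.0, x, y, h, w]
        target[base + 5 + cls] = 1.0   # one-hot class entry
    return target

# One object of class index 2, assigned to anchor 1 of some cell.
t = cell_target({1: (0.5, 0.4, 0.9, 0.3, 2)})
print(t)
```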

mint palm Jul 11, 2022, 7:10 PM

#

mint palm am i right if i think, we assume that more than 2 classes are highly unlikely in...

i think, this is actually the reason now

rough mountain Jul 11, 2022, 8:20 PM

#

Anyone know of a tokenizer that will parse a sentence like this
"You're amazing," she said.
['"', "you're", "amazing", ",", '"', "she", "said", "."]
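
If a full NLP library is more than needed, a regular expression gets fairly close to that output. A minimal sketch; the pattern is an assumption about what counts as a token (straight apostrophes only, every other symbol split off on its own):

```python
import re

# A word (optionally with internal apostrophes) or any single non-space symbol.
TOKEN_RE = re.compile(r"\w+(?:'\w+)*|[^\w\s]")

def tokenize(text):
    return TOKEN_RE.findall(text.lower())

tokens = tokenize('"You\'re amazing," she said.')
print(tokens)
```

Curly quotes and apostrophes would need to be normalised first, since the pattern only keeps a straight `'` inside a word.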

#

" you're amazing , " she said .

serene scaffold Jul 11, 2022, 8:35 PM

#

rough mountain Anyone know of a tokenizer that will parse a sentence like this "You're amazing,...

you can probably do phrase chunking with spacy

rough mountain Jul 11, 2022, 8:36 PM

#

serene scaffold you can probably do phrase chunking with spacy

Could you elaborate, google isn't being useful today

serene scaffold Jul 11, 2022, 8:37 PM

#

rough mountain Could you elaborate, google isn't being useful today

spaCy (with the upper C) is a general-purpose NLP library. are you familiar with it?

rough mountain Jul 11, 2022, 8:38 PM

#

serene scaffold spaCy (with the upper C) is a general-purpose NLP library. are you familiar with...

oh, I some-how read scipy. I had forgotten about spacy.

#

I forgot how amazing spacy's docs are

fiery dust Jul 11, 2022, 9:12 PM

#

what do I need to know to learn and use ML/AI

serene scaffold Jul 11, 2022, 9:22 PM

#

fiery dust what do I need to know to learn and use ML/AI

linear algebra, calculus, and probability/statistics, to start.

fiery dust Jul 11, 2022, 9:24 PM

#

serene scaffold linear algebra, calculus, and probability/statistics, to start.

tysm

#

really 🙂

eager wedge Jul 11, 2022, 9:34 PM

#

could someone help me with 3d patch extraction?

grand vapor Jul 11, 2022, 9:39 PM

#

if you append dataframes to a list, such that each item in the list is a dataframe, is there a way to name these dataframes for reference? or do you just need to use the list index (listname[0], for example) and remember what each dataframe is meant to tie back to?

#

i think with my situation I may need to just keep track in my head what each df is. they're more or less in a certain order already, but it'd help to have a visual reminder

serene scaffold Jul 11, 2022, 10:04 PM

#

grand vapor i think with my situation I may need to just keep track in my head what each df ...

do all these dataframes have the same schema?

grand vapor Jul 11, 2022, 10:05 PM

#

serene scaffold do all these dataframes have the same schema?

yeah, if by same schema you mean the same columns

serene scaffold Jul 11, 2022, 10:07 PM

#

grand vapor yeah, if by same schema you mean the same columns

then you should probably have one dataframe with multi-indexed rows.
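A minimal sketch of the "one dataframe with multi-indexed rows" suggestion, assuming the frames share columns. The labels `"jan"`/`"feb"` and the level names `"source"`/`"row"` are invented for the example.

```python
import pandas as pd

# Keep the frames in a dict so each one has a readable name
# instead of a bare list position.
frames = {
    "jan": pd.DataFrame({"x": [1, 2], "y": [3, 4]}),
    "feb": pd.DataFrame({"x": [5, 6], "y": [7, 8]}),
}

# The dict keys become the outer level of a row MultiIndex.
combined = pd.concat(frames, names=["source", "row"])

# Pull one original frame back out by its label.
jan = combined.xs("jan", level="source")
print(combined)
```

If separate frames are still preferred, the dict alone already answers the naming question: `frames["jan"]` instead of `listname[0]`.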

rough mountain Jul 11, 2022, 10:07 PM

#

unrelated to my other question.

I wish to make a model that creates space ships for a video game. I've got the data from the save files and the game stores them as json.

  "Parts": [
    {
      "IDString": "cosmoteer.armor_1x2_wedge_L",
      "Location": [
        -14,
        6
      ],
      "Rotation": 0
    },
    {
      "IDString": "cosmoteer.armor_1x2_wedge_R",
      "Location": [
        -14,
        8
      ],
      "Rotation": 2
    },
etc...
Json is impractical for ML models, but I could turn the part list into an image, giving each tile an id. The big issue is parts come in a bunch of different sizes. (Luckily always rectangles.) So if I just tell it to predict a grid of part ids, it would likely start making impossible parts. But I can't see a better way.

#

Also having it output a rotation for every tile seems problematic, but inferencing the rotation from an output takes away control from the AI when multiple rotations could work.

#

I also need it to output a map of doors. Those are 1x1 so it's easier. my only issue here is this might be too much to throw at one model.
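The "turn the part list into an image" idea could be sketched roughly as below. Everything about part sizes is an assumption: `PART_SIZES` is a hypothetical lookup table (the real footprints would have to come from the game data), rotation is ignored, and `"Location"` is assumed to be the top-left tile of the part.

```python
import numpy as np

# Hypothetical footprints as (width, height) in tiles; not taken from the game.
PART_SIZES = {
    "cosmoteer.armor_1x2_wedge_L": (1, 2),
    "cosmoteer.armor_1x2_wedge_R": (1, 2),
}
# Integer id per part type; 0 is reserved for "empty tile".
PART_IDS = {name: i + 1 for i, name in enumerate(sorted(PART_SIZES))}


def parts_to_grid(parts):
    """Rasterize a parts list into a 2D array of part ids (rotation ignored)."""
    min_x = min(p["Location"][0] for p in parts)
    min_y = min(p["Location"][1] for p in parts)
    max_x = max(p["Location"][0] + PART_SIZES[p["IDString"]][0] for p in parts)
    max_y = max(p["Location"][1] + PART_SIZES[p["IDString"]][1] for p in parts)
    grid = np.zeros((max_y - min_y, max_x - min_x), dtype=np.int32)
    for p in parts:
        w, h = PART_SIZES[p["IDString"]]
        x, y = p["Location"][0] - min_x, p["Location"][1] - min_y
        grid[y:y + h, x:x + w] = PART_IDS[p["IDString"]]
    return grid


sample_parts = [
    {"IDString": "cosmoteer.armor_1x2_wedge_L", "Location": [-14, 6], "Rotation": 0},
    {"IDString": "cosmoteer.armor_1x2_wedge_R", "Location": [-14, 8], "Rotation": 2},
]
grid = parts_to_grid(sample_parts)
print(grid)
```

This only covers the encoding direction; it does nothing about a model predicting impossible part shapes.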

fiery dust Jul 11, 2022, 10:42 PM

#

is this worth reading? http://aima.cs.berkeley.edu/

chilly abyss Jul 11, 2022, 10:47 PM

#

Hi all

#

pls how can I split this date and time using pandas lib?

shrewd locust Jul 11, 2022, 11:04 PM

#

Unknown label type: 'continuous-multioutput' what's this error?

tawdry urchin Jul 11, 2022, 11:07 PM

#

@chilly abyss I was looking into this earlier today. People suggested using pd.to_datetime(df.date) but I could not get it to work. You can do
import datetime
df['date'].dt.normalize(), which sets all of the times to midnight. It obviously doesn't get rid of the time, but it may be useful

chilly abyss Jul 11, 2022, 11:09 PM

#

Thanks @tawdry urchin This was the error msg I got - 'AttributeError: Can only use .dt accessor with datetimelike values'
I am still checking online for solution, I think because of the '..+00' at the end of the ts (time series)

#

I will also try the solution you gave

tawdry urchin Jul 11, 2022, 11:12 PM

#

@chilly abyss Sorry you are right, you need to use both, so you do
df.date = pd.to_datetime(df.date)
df.date = df.date.dt.normalize()
I was able to get it working like this in mine, hopefully it helps you out
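For the original question (splitting date and time), a small sketch along the same lines. The column name `ts` and the timestamp strings are assumptions; the real data was not shown, only that it ends in something like `+00`.

```python
import pandas as pd

# Assumed input: timestamp strings with a UTC offset.
df = pd.DataFrame({"ts": ["2022-07-11 23:04:00+00:00", "2022-07-12 01:30:00+00:00"]})

# Parse first; the .dt accessor only works on datetime-like values,
# which is what the AttributeError above is complaining about.
df["ts"] = pd.to_datetime(df["ts"], utc=True)

# Split into separate date and time columns.
df["date"] = df["ts"].dt.date
df["time"] = df["ts"].dt.time
print(df)
```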

#

Im still learning myself, I basically have to learn python to create fixes in processes, so apologies if im missing anything

chilly abyss Jul 11, 2022, 11:13 PM

#

It's alright, we learn together. 🙂

#

I'm new to python too.

tawdry urchin Jul 11, 2022, 11:21 PM

#

haha sounds good, im pretty comfortable with pandas, and ive done some pretty neat projects with python, its the classes/functions that I am mostly unfamiliar with, the code i am writing isn't really set up to run forever because im working with pretty bad data

tropic matrix Jul 11, 2022, 11:34 PM

#

in pandas is there a way to specify dtype as "dict"? both strings and dictionaries are "objects" and it's making preprocessing a dataset rather annoying

#

i.e. select all columns that are a dict

serene scaffold Jul 11, 2022, 11:36 PM

#

tropic matrix in pandas is there a way to specify dtype as "dict"? both strings and dictionari...

no. anything other than a primitive type (including strings) are object.

you probably shouldn't have dicts as elements of the dataframe. that sounds like a bad version of a better data model.

tidal bough Jul 11, 2022, 11:36 PM

#

tropic matrix i.e. select all columns that are a dict

Not possible, I'm pretty sure. Dtypes are a numpy concept and essentially describe how an element is stored in memory. The dtype of object actually means "a pointer to some PyObject". Pointers to objects of different python types aren't different, so there's no dtypes for different python types.
(This is also exactly how python lists store elements.)
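Since there is no dict dtype to select on, one workaround is to inspect the values themselves. A minimal sketch with invented data; `columns_containing` is a made-up helper name.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["sword", "bow"],
    "price": [10.0, 12.5],
    "runes": [{"BLOOD_2": 1}, "none"],  # one dict hiding among strings
})


def columns_containing(frame, python_type):
    """Return object-dtype columns with at least one value of python_type."""
    object_cols = frame.select_dtypes(include="object").columns
    return [
        col for col in object_cols
        if frame[col].map(lambda value: isinstance(value, python_type)).any()
    ]


dict_columns = columns_containing(df, dict)
print(dict_columns)
```

This checks every cell, so it also catches a column that is mostly strings with a single stray dict.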

tropic matrix Jul 11, 2022, 11:37 PM

#

serene scaffold no. anything other than a primitive type (including strings) are `object`. you ...

can't do much about it as the dataset i'm running machine learning on has arbitrary dicts as an integral part

#

but thank you

tawdry urchin Jul 11, 2022, 11:38 PM

#

Could you give an example? Im not sure I quite understand

serene scaffold Jul 11, 2022, 11:38 PM

#

tropic matrix can't do much about it as the dataset i'm running machine learning on has arbitr...

it's pretty unlikely that there isn't a better way to do it, but we'd need to know what those dicts are (their key-value pairs, their types, and what they represent)

tropic matrix Jul 11, 2022, 11:43 PM

#

serene scaffold it's pretty unlikely that there isn't a better way to do it, but we'd need to kn...

i'll give an example of a few:

"attributes": {
  "mana_pool": 1,
  "undead_resistance": 1
}

there's around 15 different possible keys, and maximum of 2 can be present in one row, value is always an int

"runes": {
  "BLOOD_2": 1
}

there's an unknown amount of possible keys, but the value is always an int

"gems": {
  "AMETHYST_2": "FINE",
  "AMETHYST_1": "FINE",
  "AMETHYST_0": "FINE"
}

unknown amount of keys, value is always a string

the thing is that i'm able to easily process those dicts to make them better to work with (i.e. similar to one hot encoding, convert the key to a column and the value becomes the cell for that column). however, when i continue preprocessing and use pd.get_dummies to one hot encode all of the leftover strings, it's finding a dict in the dataset and i'm trying to figure out where it's hiding

tawdry urchin Jul 11, 2022, 11:47 PM

#

I have never dealt with a dataset like that and truthfully im not sure what I would do, but luckily im not the smartest one here 🙂

tropic matrix Jul 11, 2022, 11:47 PM

#

tawdry urchin I have never dealt with a dataset like that and truthfully im not sure what I wo...

haha it's all good

#

there must be somewhere where i'm forgetting to delete a column or smth after i preprocess it

#

but when i print the dataset columns none of them have dicts which is odd

tawdry urchin Jul 11, 2022, 11:49 PM

#

just make sure your dataframe names are consistent throughout

#

pretty easy to bring the wrong dataframe in and then youre reviewing the correct one and working with a prior version

tropic matrix Jul 11, 2022, 11:50 PM

#

tawdry urchin pretty easy to bring the wrong dataframe in and then youre reviewing the correct...

mhm

iron basalt Jul 11, 2022, 11:54 PM

#

tropic matrix i'll give an example of a few: ```py "attributes": { "mana_pool": 1, "undead...

Numpy is designed to contain homogeneous plain old data (POD). It looks like you could simplify runes and gems since they don't seem like they should even be dicts, but rather lists.

tropic matrix Jul 11, 2022, 11:55 PM

#

iron basalt Numpy is designed to contain homogeneous plain old data (POD). It looks like you...

i know that, that's what i'm doing... i can't change how the dataset is originally stored bc that's out of my control 😭

#

i'm just working with what i have

iron basalt Jul 11, 2022, 11:55 PM

#

Convert them to lists and store them somewhere.

#

Then just use that, so you don't have to convert each time.

#

You can transform it however you want and store it however you want.

#

Take the bad data format and fix it.

#

The attributes looks like an array of booleans. If that is the case, since there are only 15 at most, you can use a bitmask.

#

(bitset)

#

How many different values the runes and gems can be changes how they can be stored.

#

If there is some fixed set of values you can improve it a lot.

tropic matrix Jul 12, 2022, 12:03 AM

#

iron basalt The attributes looks like an array of booleans. If that is the case, since there...

not booleans, can range from 1-10

iron basalt Jul 12, 2022, 12:04 AM

#

tropic matrix not booleans, can range from 1-10

So 11 possible states for each item.

#

(Not there, or 1-10)

#

Well you can store those in a numpy array.

#

Runes and gems just look like two lists, that are dicts for some reason.

#

(But out of order?)
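The "store those in a numpy array" idea for attributes might look like the sketch below: a fixed-length vector with 0 for "not there" and 1-10 for the level. `ATTRIBUTE_KEYS` is a hypothetical, shortened key list (the real set has around 15 keys), and `life_regeneration` is an invented placeholder.

```python
import numpy as np

# Hypothetical fixed ordering of the possible attribute keys.
ATTRIBUTE_KEYS = ["mana_pool", "undead_resistance", "life_regeneration"]


def encode_attributes(attrs):
    """Encode an attributes dict as a fixed-length vector (0 = absent)."""
    vec = np.zeros(len(ATTRIBUTE_KEYS), dtype=np.uint8)
    for key, level in attrs.items():
        vec[ATTRIBUTE_KEYS.index(key)] = level
    return vec


encoded = encode_attributes({"mana_pool": 1, "undead_resistance": 1})
print(encoded)
```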

tropic matrix Jul 12, 2022, 12:08 AM

#

iron basalt Runes and gems just look like two lists, that are dicts for some reason.

i blame the devs that made the game i'm running machine learning on lmao

#

but what i think you're not realizing is that i'm writing code on doing all of this rn

#

and i can handle that and i've already written code that converts these dicts into a better format

#

the issue is that when trying to run pd.get_dummies it's saying there's still a dict dtype present, even though when I check the dataframe i don't find any

iron basalt Jul 12, 2022, 12:12 AM

#

tropic matrix the issue is that when trying to run pd.getdummies it's saying there's still a d...

Show code.

tropic matrix Jul 12, 2022, 12:12 AM

#

sheesh that was passive aggressive, one sec

tawdry urchin Jul 12, 2022, 12:13 AM

#

sounds like some stardew valley shit

tropic matrix Jul 12, 2022, 12:13 AM

#

tawdry urchin sounds like some stardew valley shit

hypixel skyblock 🙃

tropic matrix Jul 12, 2022, 12:14 AM

#

iron basalt Show code.

btw the code isn't pretty but it works, wrote it without the intention of it being read by other people

df = full_df.drop(['uuid'], axis=1, errors='ignore')

if verbose: print('Starting enchantments')
df = df.join(pd.DataFrame(list(df['enchantments'])).fillna(0).add_suffix('_enchantment')).drop('enchantments', axis=1)
if verbose: print('Encoded enchantments')

if verbose: print('Starting ability_scroll')
df = df.join(pd.DataFrame(ability_scroll_mlb.transform(df['ability_scroll']), columns=ability_scroll_mlb.classes_).add_suffix('_ability_scroll')).drop('ability_scroll', axis=1)
if verbose: print('Encoded ability_scroll')

if verbose: print('Starting gems')
df = df.join(pd.DataFrame(list(df['gems'])).fillna('').drop('unlocked_slots', axis=1, errors='ignore').add_suffix('_gem')).drop('gems', axis=1)
if verbose: print('Finished gems')
    
if verbose: print('Started runes')
df = df.join(pd.DataFrame(list(df['runes'])).fillna('').add_suffix('_rune')).drop('runes', axis=1)
if verbose: print('Finished runes')
    
if verbose: print('Started necromancer_souls')    
df = df.join(pd.DataFrame(necromancer_souls_mlb.transform(df['necromancer_souls'].apply(lambda x: list(map(lambda y: y['mob_id'], x)))), columns=necromancer_souls_mlb.classes_).add_suffix('_necromancer_soul')).drop('necromancer_souls', axis=1)
if verbose: print('Finished necromancer_souls')
    
if verbose: print('Started attributes')
df = df.join(pd.DataFrame(list(df['attributes'])).fillna(0).add_suffix('_attribute')).drop('attributes', axis=1)
if verbose: print('Finished attributes')

here's the code for converting all of those dicts/lists that ik are present in the dataset into a better format with pandas

#

@iron basalt

iron basalt Jul 12, 2022, 12:16 AM

#

tropic matrix btw the code isn't pretty but it works, wrote it without the intention of it bei...

Where does the get_dummies happen?

tropic matrix Jul 12, 2022, 12:18 AM

#

iron basalt Where does the get_dummies happen?

this is the code that contains it (it is run directly after the code block above):

X = df.drop('price', axis=1)
y = df[['price']]

if verbose: print('Started encoding')

X = pd.get_dummies(X) # here

df_columns = X.columns.tolist()

scaler_X = StandardScaler()
scaler_X.fit(X)

scaler_y = StandardScaler()
scaler_y.fit(y)

iron basalt Jul 12, 2022, 12:19 AM

#

What is X's type / how does it look like? Before get_dummies.

tropic matrix Jul 12, 2022, 12:20 AM

#

iron basalt What is X's type / how does it look like? Before get_dummies.

X should just be a df filled with ints/floats/strings assuming nothing wrong happened in the first code block

iron basalt Jul 12, 2022, 12:21 AM

#

tropic matrix X *should* just be a df filled with ints/floats/strings assuming nothing wrong h...

What is it in reality?

tropic matrix Jul 12, 2022, 12:21 AM

#

one moment

tropic matrix Jul 12, 2022, 12:29 AM

#

iron basalt What is it in reality?

it is what i believe it should be

set(X.dtypes)
# {dtype('int64'), dtype('float64'), dtype('O')}

iron basalt Jul 12, 2022, 12:29 AM

#

dtype('0') is something invalid, probably a dict.

#

Well, it can be in a numpy array, but it's kind of like object.

tropic matrix Jul 12, 2022, 12:30 AM

#

iron basalt dtype('0') is something invalid, probably a dict.

no it's just object (anything not a primitive, including strings)

#

when i checked all of the columns it's all either strings floats or ints

iron basalt Jul 12, 2022, 12:30 AM

#

So it's the strings.

tropic matrix Jul 12, 2022, 12:31 AM

#

iron basalt So it's the strings.

yeah... that's what i want, pd.get_dummies's purpose it to one hot encode all of the strings

iron basalt Jul 12, 2022, 12:33 AM

#

What does print(X.dtypes) show?

tropic matrix Jul 12, 2022, 12:34 AM

#

iron basalt What does print(X.dtypes) show?

well there's 300 columns by the time it gets to X, so when i print X.dtypes it shows a few int64s, a decent amount of float64s, and a whole lot of objects, and when i go to check all of the columns that show object they are all just strings

#

and i won't be able to send that here without flooding the channel, it doesn't even print everything to console, after some point it says "..." then goes to the end (but i checked the dtypes manually without relying on the console output)

iron basalt Jul 12, 2022, 12:35 AM

#

And what is the exact error it gives?

#

Do your strings have some max size?

#

If your numpy arrays at any point contain different types of objects they will have the "object" type (like regular Python lists), but if it's all strings, with some max length, it can actually store that.

#

>>> x = np.array([foo, "Hello"])
>>> x
array([<__main__.Foo object at 0x7f402d6a7f70>, 'Hello'], dtype=object)
>>> y = np.array(["Hello", "World"])
>>> y
array(['Hello', 'World'], dtype='<U5')
>>> 
With a unicode dtype.

limber token Jul 12, 2022, 12:49 AM

#

limber token

@serene scaffold how should I go about this now?

iron basalt Jul 12, 2022, 12:54 AM

#

tropic matrix well there's 300 columns by the time it gets to X, so when i print X.dtypes it s...

It's hard to spot something obviously wrong with it, so all I can say at this point is to make hand crafted test data and assume that the input is hostile, especially if there is no official format specification.

#

(And that "object" may mean mixed types (it's a design flaw of pandas, because Numpy was not really meant for this and it uses it))

tropic matrix Jul 12, 2022, 12:55 AM

#

iron basalt (And that "object" may mean mixed types (it's a design flaw of pandas, because N...

i doubt there are mixed types, due to the fact that i make sure that all columns have the same type throughout it before i go ahead and process it

#

it's not letting me paste the error here so i'll pastebin it

#

https://pastebin.com/zqdaGebm @iron basalt here's the error

#

this happens on the X = pd.get_dummies(X) line

iron basalt Jul 12, 2022, 12:56 AM

#

That is what I got when trying to get_dummies on a df that had both strings and dicts in the same column, dtype shows as "object" (for the column).

tropic matrix Jul 12, 2022, 12:57 AM

#

hm

#

..... you need to read up we've been discussing this for the past hour or two

iron basalt Jul 12, 2022, 12:58 AM

#

If it's all string it could in theory be something like <U... for the dtype, which would only be strings up to that length. It's both faster and in this case safer for detecting issues.

tropic matrix Jul 12, 2022, 12:58 AM

#

that's weird, i've never had pandas give me a unicode dtype

#

let me test smth rq

iron basalt Jul 12, 2022, 12:58 AM

#

Pandas uses object for strings, a design flaw to an extent.

#

They want dynamic strings.

#

(arbitrary length)

#

(This gives issues to databases too in terms of speed and safety and all that)

tropic matrix Jul 12, 2022, 12:59 AM

#

how could i convert a string column to a unicode dtype

#

?

#

test = pd.DataFrame.from_dict({
    'test': ['abc', 'def'],
    'test2': ['ghi', 'jkl']
})

print(test)

print(test.dtypes)

  test test2
0  abc   ghi
1  def   jkl
test     object
test2    object
dtype: object

iron basalt Jul 12, 2022, 1:00 AM

#

If you make a numpy array with strings it will do it.

tropic matrix Jul 12, 2022, 1:00 AM

#

test = pd.DataFrame.from_dict({
    'test': ['abc', 'def'],
    'test2': ['ghi', 'jkl']
})

print(np.array(test['test']))

array(['abc', 'def'], dtype=object)

#

@iron basalt hm

#

any way to coerce it into the unicode string dtype?

iron basalt Jul 12, 2022, 1:01 AM

#

Yeah if you explicitly do dtype=whatever for a numpy array

#

Pandas seems allergic to the idea though.
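A small sketch of coercing a column to a fixed-width unicode dtype on the numpy side. The series contents are made up. Note what happens to the dict: it is silently turned into its string form, so this coercion hides mixed types rather than exposing them.

```python
import numpy as np
import pandas as pd

col = pd.Series(["abc", "def", {"test": 1}])

# Go through an object array, then cast every element with str().
as_unicode = np.asarray(col, dtype=object).astype(str)

print(as_unicode.dtype)  # a '<U...' dtype sized to the longest string
print(as_unicode)
```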

tropic matrix Jul 12, 2022, 1:03 AM

#

iron basalt Pandas seems allergic to the idea though.

yep, even when i converted it into a unicode string, when i try to put it back in the dataframe it decides to become an object again

iron basalt Jul 12, 2022, 1:03 AM

#

Yeah,

>>> x = np.array(["Hello", "World"])
>>> x
array(['Hello', 'World'], dtype='<U5')

#

Pretty annoying.

#

Makes debugging Pandas even harder.

tropic matrix Jul 12, 2022, 1:05 AM

#

iron basalt Makes debugging Pandas even harder.

what's worse, there's no way to limit casting

#

if i set any casting limits other than unsafe it decides that strings aren't strings anymore

#

the only way for it to work is casting unsafe

#

and the thing is, that converts the dict to a string too

#

🤦

iron basalt Jul 12, 2022, 1:06 AM

#

Playing fast and loose with types.

#

Yeah, IDK, now gotta make a separate part that loops through it all and tries to find out the type to detect an issue, then try to narrow down what caused it and craft an example to test against. Ideally the input would have some spec. to make this way less hacky.

tropic matrix Jul 12, 2022, 1:12 AM

#

iron basalt Yeah, IDK, now gotta make a separate part that loops through it all and tries to...

seems i got something basic to work for that

#

test = pd.DataFrame.from_dict({
    'test': ['abc', 'def', {'test': 1}],
    'test2': ['ghi', 'jkl', {'test': 2}]
})

def test_func(cell):
    try:
        json.loads(cell)
    except:
        print(cell)
    
    return cell
    
test['test'].apply(test_func)

#

it excepts whenever it encounters a dict

#

sorry whenever it doesn't encounter a dict

#

so i can kinda flip flop it

#

wait a minute no this doesn't work

#

first i need to convert it all to a string

#

oh you're kidding me

#

it expects double quotes

#

ok @iron basalt this one actually works:

#

test = pd.DataFrame.from_dict({
    'test': ['abc', 'def', {'test': 1}],
    'test2': ['ghi', 'jkl', {'test': 2}]
})

test['test'] = np.array(test['test']).astype('unicode')

def test_func(cell):
    try:
        isinstance(eval(cell), dict)
        print(cell)
    except:
        pass
    
    return cell
    
test['test'].apply(test_func)

#

alright trying to apply that to the entire dataset

#

istg if it passes cleanly i'm gonna punch a wall

iron basalt Jul 12, 2022, 1:21 AM

#

Ok, just make sure there is nothing strange happening with the eval.

#

Don't want to suddenly start deleting root.
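On the `eval` concern: `ast.literal_eval` from the standard library only accepts Python literals, so it can stand in for `eval` in this kind of check. A minimal sketch; `parses_as_dict` is a made-up name, and unlike the snippet above it actually uses the `isinstance` result.

```python
import ast


def parses_as_dict(cell):
    """True if cell is a dict, or a string whose literal value is a dict."""
    if not isinstance(cell, str):
        return isinstance(cell, dict)
    try:
        # literal_eval refuses names, calls, and other non-literal syntax.
        return isinstance(ast.literal_eval(cell), dict)
    except (ValueError, SyntaxError, TypeError):
        return False


print(parses_as_dict("{'test': 1}"))       # string form of a dict
print(parses_as_dict("abc"))               # plain string
print(parses_as_dict("__import__('os')"))  # would be dangerous with eval
```

It can be used per column with something like `test['test'].map(parses_as_dict)` to get a boolean mask of suspicious cells.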

tropic matrix Jul 12, 2022, 1:22 AM

#

yeah that's not gonna be an issue thankfully

#

ISTFG

#

@iron basalt IT PASSED CLEANLY

#

IDEK WHAT TO SAY

iron basalt Jul 12, 2022, 1:23 AM

#

Uh, i'm out of ideas for now.

tropic matrix Jul 12, 2022, 1:23 AM

#

me too

#

maybe i should use sklearn to onehotencode now

#

maybe that will work better than get dummies 😭

iron basalt Jul 12, 2022, 1:24 AM

#

Personally I would have switched languages to something with static types and done it manually at this point. This seems a bit too complex for Pandas.

tropic matrix Jul 12, 2022, 1:24 AM

#

iron basalt Personally I would have switch languages to something with static types and done...

what language would you suggest?

#

R?

iron basalt Jul 12, 2022, 1:25 AM

#

tropic matrix what language would you suggest?

IDK, whichever you prefer for manual looping. Or think can handle it with its own equivalent of Pandas.

#

Or a different dataframe lib for Python.

tropic matrix Jul 12, 2022, 1:30 AM

#

iron basalt Or a different dataframe lib for Python.

thank you, ig i'll try polars now

iron basalt Jul 12, 2022, 1:35 AM

#

tropic matrix thank you, ig i'll try polars now

It can convert to a Pandas df, so you could do your work before the get_dummies with it.

#

Then if get_dummies is still bugging, IDK.

#

>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "A": [1, 2, 3, 4, 5],
...         "fruits": ["banana", "banana", "apple", "apple", "banana"],
...         "B": [5, 4, 3, 2, 1],
...         "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
...         "optional": [28, 300, None, 2, -30],
...     }
... )
>>> df
shape: (5, 5)
┌─────┬────────┬─────┬────────┬──────────┐
│ A   ┆ fruits ┆ B   ┆ cars   ┆ optional │
│ --- ┆ ---    ┆ --- ┆ ---    ┆ ---      │
│ i64 ┆ str    ┆ i64 ┆ str    ┆ i64      │
╞═════╪════════╪═════╪════════╪══════════╡
│ 1   ┆ banana ┆ 5   ┆ beetle ┆ 28       │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 2   ┆ banana ┆ 4   ┆ audi   ┆ 300      │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 3   ┆ apple  ┆ 3   ┆ beetle ┆ null     │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 4   ┆ apple  ┆ 2   ┆ beetle ┆ 2        │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 5   ┆ banana ┆ 1   ┆ beetle ┆ -30      │
└─────┴────────┴─────┴────────┴──────────┘
>>> 
Polars has an actual str type.

#

Rather than just object.

#

>>> df2 = df.to_pandas()
>>> df2
   A  fruits  B    cars  optional
0  1  banana  5  beetle      28.0
1  2  banana  4    audi     300.0
2  3   apple  3  beetle       NaN
3  4   apple  2  beetle       2.0
4  5  banana  1  beetle     -30.0
>>> df2.dtypes
A             int64
fruits       object
B             int64
cars         object
optional    float64
dtype: object
>>>

#

(Pandas just decided to throw everything under "object" because it's using numpy)

tropic matrix Jul 12, 2022, 2:06 AM

#

iron basalt ```py >>> import polars as pl >>> df = pl.DataFrame( ... { ... "A": ...

possible to go from pandas -> polars?

#

if so then i can do the first parts in pandas

#

possibly then after that convert to polars

#

unless polars doesn't have a get dummies equivalent

#

then i'll have to refactor my entire processing setup 😀

#

oh it seems it does

#

polars.from_pandas and polars.get_dummies both exist

iron basalt Jul 12, 2022, 2:10 AM

#

tropic matrix polars.from_pandas and polars.get_dummies both exist

Yeah, maybe try all polars. Should not be too much work to switch it.

#

I find polars easier to read. Feels like database queries.

tropic matrix Jul 12, 2022, 2:11 AM

#

iron basalt I find polars easier to read. Feels like database queries.

looking at the docs i agree

#

who tf decided pandas would be the standard

#

this seems so much better

#

If your Polars code looks like it could be Pandas code, it might run, but it likely runs slower than it should.
https://pola-rs.github.io/polars-book/user-guide/coming_from_pandas.html#key-syntax-differences

iron basalt Jul 12, 2022, 2:12 AM

#

Standards by popularity are not really standards, but also standardization should only be done when the thing being standardized has had a high level of effort and hindsight to it (and is not rapidly changing).

#

Most big Python libs are constantly changing (open source) and all that, not really stable stuff to make standards of.

#

More abstract standards do exist though and are fine, like the one for generic array-like API for Python libs.

tropic matrix Jul 12, 2022, 4:21 AM

#

@iron basalt might be a dumb question, but is polars able to store dicts?

#

because i keep getting an error when trying to convert a string representation of a dict into a dict saying "tuple must be same length"

charred light Jul 12, 2022, 4:30 AM

#

Hello darkness my old friend, over-fitting has come again.

iron basalt Jul 12, 2022, 4:30 AM

#

tropic matrix @iron basalt might be a dumb question, but is polars able to store dict...

IDK, storing a dict in a dataframe is a strange thing to do, not what they are really meant for.

charred light Jul 12, 2022, 4:31 AM

#

tropic matrix who tf decided pandas would be the standard

Don't hate on pandas pandaScreams

#

Also, there's pyspark for larger datasets. and SQL

iron basalt Jul 12, 2022, 4:46 AM

#

iron basalt IDK, storing a dict in a dataframe is a strange thing to do, not what they are r...

I have not read through the entire Apache Arrow Columnar Format yet. Will get to it at some point. Only have a rough idea of how it works.

iron basalt Jul 12, 2022, 4:50 AM

#

iron basalt Personally I would have switch languages to something with static types and done...

Manually would have been done with it, it might seem like more work, but you don't have to deal with any limitations / really learn anything new (and it can be done in pretty much any language).

#

Dataframes are really nice when your problem / data fits with them (same with databases).

#

Extracting data from some non-standard file format will always be a pain best done manually.

#

(If that file format is complex and/or highly structured / nested)

#

(CSV would be an example that is the opposite, well understood, simple, built-in support everywhere, fits well into what dataframes want to do)

violet gull Jul 12, 2022, 5:06 AM

#

im going to a machine learning event thingy and i was wondering what all i should bring on a flash drive

#

only thing i can think of is a matrix multiplication and normalizing algorithm

grand vapor Jul 12, 2022, 5:15 AM

#

if my dataframe uses a datetime index, is it possible to find the "row number" of a specific time? I am wanting to drop all rows that occur after a certain time, but am unsure of how to do it

violet gull Jul 12, 2022, 5:16 AM

#

violet gull only thing i can think of is a matrix multiplication and normalizing algorithm

ping with response so i can see it

grand vapor Jul 12, 2022, 5:16 AM

#

i would typically do something like

df.drop(df.tail(rows).index, inplace = True)

But because of the datetime index, I can't come up with an integer to put in place of "rows"
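With a sorted datetime index, the cutoff can be used directly as a label, so no row number is needed. A minimal sketch with invented timestamps; `searchsorted` is shown only in case the integer position is wanted anyway.

```python
import pandas as pd

idx = pd.date_range("2022-07-12 09:00", periods=6, freq="30min")
df = pd.DataFrame({"value": range(6)}, index=idx)

cutoff = pd.Timestamp("2022-07-12 10:00")

# Label-based slicing keeps everything up to and including the cutoff.
kept = df.loc[:cutoff]

# Number of rows at or before the cutoff, i.e. the "row number" asked about.
n_keep = df.index.searchsorted(cutoff, side="right")
print(kept)
print(n_keep)
```

`df[df.index <= cutoff]` gives the same rows and does not require the index to be sorted.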

royal garnet Jul 12, 2022, 6:16 AM

#

Is it possible for a function used with dataframe.apply to create a new dataframe?

#

I want to read each row in my original dataframe, and based on the values of certain columns, conditionally populate a new dataframe.
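One way to approach this without `apply` is to collect plain records while iterating and build the new dataframe once at the end. The column names and the `qty >= 5` condition are invented for the sketch.

```python
import pandas as pd

orders = pd.DataFrame({
    "item": ["a", "b", "c"],
    "qty": [1, 5, 10],
    "price": [2.0, 3.0, 4.0],
})

# Gather rows for the new frame as dicts, then construct it in one go.
records = []
for row in orders.itertuples(index=False):
    if row.qty >= 5:  # hypothetical condition on the source columns
        records.append({"item": row.item, "total": row.qty * row.price})

big_orders = pd.DataFrame(records)
print(big_orders)
```

When the condition is simple, a boolean mask such as `orders[orders["qty"] >= 5]` is usually shorter and faster than looping.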

stoic viper Jul 12, 2022, 6:37 AM

#

Hey,

#

let's say i have 2 dataframes in a list that share a column name, and i want to remove all rows with 1 in that column, with
for df in df_list:

#

It doesn't apply it to the dataframes. Works without the for loop

stoic viper Jul 12, 2022, 6:57 AM

#

df = df[df['Column_Name'] == 0]

#

example of what i do.
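The likely reason the loop has no effect: `df = df[...]` inside the loop only rebinds the loop variable to a new filtered frame, and the list keeps pointing at the originals. A minimal sketch that rebuilds the list instead, with invented data:

```python
import pandas as pd

df_list = [
    pd.DataFrame({"Column_Name": [0, 1, 0]}),
    pd.DataFrame({"Column_Name": [1, 1, 0]}),
]

# Build a new list of filtered frames and assign it back to the same name.
df_list = [df[df["Column_Name"] == 0] for df in df_list]

print([len(df) for df in df_list])
```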

lofty elk Jul 12, 2022, 8:20 AM

#

#

I am making a visualization here and the bar chart has lines on the bars
I want to remove these lines
here is the code

#


import pandas as pd 
import plotly 
import plotly.express as px 
import plotly.io as pio 

df = pd.read_csv("Caste.csv")
df = df[df['state_name']=='Maharashtra']
#df = df.groupby(['year','gender',],as_index=False)[['detenues','under_trial','convicts','others']].sum()


barchart = px.bar(data_frame=df, 
    x='year', 
    y='convicts', 
    color='gender', 
    opacity=1, orientation='v', 
    barmode='relative',
)

pio.show(barchart)
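A guess at the cause, since the chart itself is not shown: `px.bar` draws one segment per row, so if the CSV has several rows per year and gender, each bar is a stack of segments and their borders show up as lines. Aggregating first, as the commented-out `groupby` hints, gives one row per bar segment. The data below is made up; only the column names come from the code above.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2019, 2019, 2019, 2020],
    "gender": ["Male", "Male", "Female", "Male"],
    "convicts": [10, 5, 3, 7],
})

# One row per (year, gender) so each colored segment is a single rectangle.
totals = df.groupby(["year", "gender"], as_index=False)["convicts"].sum()
print(totals)

# The aggregated frame can then be passed to the same px.bar call, e.g.
# barchart = px.bar(totals, x="year", y="convicts", color="gender")
```

If the rows must stay separate, `barchart.update_traces(marker_line_width=0)` should hide the segment borders instead.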

serene briar Jul 12, 2022, 11:00 AM

#

@untold bloom cheers for your help in the help channel, was pulled into a meeting so wasn't able to follow through but thanks again ^^

terse dagger Jul 12, 2022, 11:22 AM

#

Guys help pls creating a dataframe to count members with diff activity statuses. I was here yesterday but need help again #help-pear

zealous burrow Jul 12, 2022, 12:48 PM

#

🌟 👉 hi guys, if you are into generative AI/generative art, you must try this Python library https://github.com/jina-ai/discoart super easy to use as long as you know a little bit Python and smoothly run with free GPU on Google colab!

lofty elk Jul 12, 2022, 1:07 PM

#

Dash is pretty cool

ancient pendant Jul 12, 2022, 1:13 PM

#

Hi Everyone,
I wanted to know how to find correlation between multiple variables in python?

#

I was looking at the numpy documentation, where it said that with np.corrcoef() you can find correlation, but only of 2 variables

#

No I have been told to use only np.corrcoef

wooden sail Jul 12, 2022, 1:27 PM

#

what's your question?

#

you can place the variables as columns (or rows) of the matrix (or matrices) you pass to np.corrcoef

wooden sail Jul 12, 2022, 1:34 PM

#

ancient pendant No I have been told to use only np.corrcoef

you can put as many variables as you want into the columns or rows of the matrix, as long as you specify which axis you put them on. in this example, we see that, as one would expect from independent random vectors with mean 0, their correlation goes to 0 with the number of observations

In [16]: import numpy as np

In [17]: x = np.random.normal(0,1,(20,2))

In [18]: np.corrcoef(x, rowvar=False)
Out[18]:
array([[1.        , 0.02356115],
       [0.02356115, 1.        ]])

In [19]: x = np.random.normal(0,1,(100,2))

In [20]: np.corrcoef(x, rowvar=False)
Out[20]:
array([[ 1.        , -0.06573243],
       [-0.06573243,  1.        ]])

*edit on second glance it doesn't behave so nicely even with so many samples, but you still get the idea of how to use the func

tacit basin Jul 12, 2022, 1:35 PM

#

ancient pendant I was looking at numpy document where it said with np.corrcoef() you can find co...

you can have as many variables as you want, by default array rows are variables and columns are observations

wooden sail Jul 12, 2022, 1:37 PM

#

In [25]: x = np.random.normal(0,1,(10000000,2))

In [26]: np.corrcoef(x, rowvar=False)
Out[26]:
array([[1.00000000e+00, 2.93622044e-04],
       [2.93622044e-04, 1.00000000e+00]])

ok, this is better

#

you could alternatively compute it "by hand" by normalizing the rows or columns, and then computing X^T X or X X^T depending on how you arrange your data
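The "by hand" route can be sketched as follows, assuming variables are in columns: center each column, scale it to unit length, then take X^T X. The random data and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 1, (100, 3))  # 100 observations of 3 variables

# Center each column, then normalize it to unit Euclidean length.
centered = x - x.mean(axis=0)
unit = centered / np.linalg.norm(centered, axis=0)

# Dot products of unit-length centered columns are Pearson correlations.
corr_by_hand = unit.T @ unit

print(np.allclose(corr_by_hand, np.corrcoef(x, rowvar=False)))
```

With variables in rows instead, the same steps apply along axis 1 and the product becomes X X^T.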

ancient pendant Jul 12, 2022, 1:42 PM

#

wooden sail what's your question?

So I have four variables(ABCD)
A, B, C, D
1 2 3 4
5 6 7 8
11 12 13 14

first I was told to find the Pearson correlation between the A and D variables
by using scipy.stats.pearsonr()

and now I have been told to find correlations between all variables using np.corrcoef()

wooden sail Jul 12, 2022, 1:44 PM

#

aight. so you have your variables arranged as columns, and several observations arranged as the rows. that means you'd also have to use rowvar=False

ancient pendant Jul 12, 2022, 1:45 PM

#

Ohhhkayyyy Thanks🙏 🤩

wooden sail Jul 12, 2022, 1:46 PM

#

In [32]: x = np.zeros((3,4))

In [33]: x[0,:] = np.arange(1,5)

In [34]: x[1,:] = np.arange(5,9)

In [35]: x[2,:] = np.arange(11,15)

In [36]: x
Out[36]:
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [11., 12., 13., 14.]])

In [37]: np.corrcoef(x, rowvar=False)
Out[37]:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

like so

ancient pendant Jul 12, 2022, 1:51 PM

#

So when we don't use rowvar=False,
does np.corrcoef() give the correlation between each element in the columns?

wooden sail Jul 12, 2022, 1:55 PM

#

yep

#

it would take the rows as defining variables, and each column would be an observation of the variables

#

the output would be 3x3
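
To make the difference concrete, here is a small sketch on the same 3x4 table from above. The variable names `by_rows`, `by_cols` and `a_d` are just for illustration.

```python
import numpy as np

# Same data as above: columns are A, B, C, D and each row is one observation.
x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [11, 12, 13, 14]], dtype=float)

by_rows = np.corrcoef(x)                # default rowvar=True: the 3 rows are the variables
by_cols = np.corrcoef(x, rowvar=False)  # the 4 columns (A, B, C, D) are the variables

print(by_rows.shape)  # (3, 3)
print(by_cols.shape)  # (4, 4)

# The A-vs-D correlation is the entry at row 0, column 3 of the 4x4 matrix.
a_d = by_cols[0, 3]
print(a_d)
```

So for the A, B, C, D layout, rowvar=False is the call that gives one row and column per variable.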

ancient pendant Jul 12, 2022, 1:57 PM

#

Okay got it thanks🎉

hollow sentinel Jul 12, 2022, 3:28 PM

#

import requests
import pprint as pp

# Change this to be your API key.
MY_API_KEY="bla bla"

url = "https://beta3.api.climatiq.io/search"
query="hotel room"

query_params = {
    # Free text query can be written as the "query" parameter
    "query": query,
    # You can also filter on region, year, source and more
    # "AU" is Australia
    "region": "AU"
}

# You must always specify your AUTH token in the "Authorization" header like this.
authorization_headers = {"Authorization": f"Bearer: {MY_API_KEY}"}

# This performs the request and returns the result as JSON
response = requests.get(url, params=query_params, headers=authorization_headers).json()

# And here you can do whatever you want with the results
print(response.keys())

#

so i'm trying to form the URL for this API request but idk how

#

bc printing the json out on my console is lagging my machine

#

i can't even look at the json without my computer dying
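
A rough sketch of two things that might help here, using only the standard library. First, the URL can be previewed by encoding the same query parameters onto the base URL (this should be close to what requests sends, but that is an assumption). Second, instead of printing the whole JSON, you can print only its shape. The `summarize` helper and the `sample` data are made up for illustration; the real response keys may differ.

```python
from urllib.parse import urlencode

url = "https://beta3.api.climatiq.io/search"
query_params = {"query": "hotel room", "region": "AU"}

# Preview the full request URL without sending anything.
full_url = f"{url}?{urlencode(query_params)}"
print(full_url)  # https://beta3.api.climatiq.io/search?query=hotel+room&region=AU

def summarize(obj, depth=0, max_depth=2, max_items=3):
    """Return lines describing the shape of parsed JSON without dumping it all."""
    indent = "  " * depth
    lines = []
    if isinstance(obj, dict):
        # Show only the first few keys and the type of each value.
        for key in list(obj)[:max_items]:
            value = obj[key]
            lines.append(f"{indent}{key}: {type(value).__name__}")
            if depth < max_depth:
                lines.extend(summarize(value, depth + 1, max_depth, max_items))
        if len(obj) > max_items:
            lines.append(f"{indent}... {len(obj) - max_items} more keys")
    elif isinstance(obj, list):
        # Report the length and describe only the first element.
        lines.append(f"{indent}list of {len(obj)} items")
        if obj and depth < max_depth:
            lines.extend(summarize(obj[0], depth + 1, max_depth, max_items))
    return lines

# Made-up stand-in for the parsed response; the real structure is not known here.
sample = {"results": [{"name": "hotel", "region": "AU"}] * 1000, "total": 1000}
print("\n".join(summarize(sample)))
```

With the real data you would call summarize(response) on the dict returned by .json() and only drill into the parts you care about.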