#data-science-and-ml | Python | Page 269

small tartan Nov 17, 2020, 12:47 AM

#

I ask this because if either the value of the max or min changes, it will influence the score for the rest of the records although they did not 'change' anything

boreal summit Nov 17, 2020, 12:47 AM

#

If the min and max of that column are very far apart with lots of values in between, better use StandardScaler.

small tartan Nov 17, 2020, 12:49 AM

#

for context, var a is a %, var b, is a number between 0-5000, var c is a value between 0-10000

#

ultimately joining the 3 to create a 'score' from 0-100

boreal summit Nov 17, 2020, 12:49 AM

#

I haven't used them much beyond basic knowledge; I'm also still learning. But I know that MinMax scaler squeezes values together when the range is very large. For instance, 0.15 and 2.4 could end up almost indistinguishable after scaling, which might affect results.

#

Try both and see which gives better results.

small tartan Nov 17, 2020, 12:50 AM

#

Fair enough

#

Thanks

boreal summit Nov 17, 2020, 12:50 AM

#

Yea, you can wait for the other experienced guys to come give you more insights or Google stuff on your own. Happy coding!

#

*More.

small tartan Nov 17, 2020, 12:51 AM

#

TY! Google has led me to a few, but it's always nice to ask you guys and get a bit of expert opinion

hollow gull Nov 17, 2020, 1:50 AM

#

The scaling shouldn't change in deployment with minmax scaler because you should have fit it on the training data set and you are applying it to any unseen data. I assume it will give values outside of 0 and 1 if your data goes outside of the range of the training data, but I am not sure. Maybe it just caps it.

somber torrent Nov 17, 2020, 2:28 AM

#

Do you guys know any websites that give away free data? I want to practice my data science skills

sweet plaza Nov 17, 2020, 2:31 AM

#

Can anyone guide me on how to set up the IPython kernel in VS Code? I'm having a freaking hard time doing it even with the main website and after downloading conda

hollow gull Nov 17, 2020, 2:31 AM

#

@somber torrent https://scikit-learn.org/stable/datasets/index.html

somber torrent Nov 17, 2020, 2:32 AM

#

thanks bro

heady hatch Nov 17, 2020, 2:42 AM

#

Hey all what's a way to convert a series of lists in a column of a groupby object to one giant list for each group?

ie.

a [1, 2]
a [2, 3]
b [1, 2]
b [2, 3]

into

a [1, 2, 2, 3]
b [1, 2, 2, 3]

green hemlock Nov 17, 2020, 4:13 AM

#

Do you guys know any websites that give away free data? I want to practice my data science skills
@somber torrent try kaggle or data.world

velvet thorn Nov 17, 2020, 4:40 AM

#

The scaling shouldn't change in deployment with minmax scaler because you should have fit it on the training data set and you are applying it to any unseen data. I assume it will give values outside of 0 and 1 if your data goes outside of the range of the training data, but I am not sure. Maybe it just caps it.
@hollow gull by default, it can give values outside the range of [0, 1]
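A minimal sketch of why this happens, written with plain NumPy rather than scikit-learn itself. Min-max scaling is `(x - min) / (max - min)` with `min` and `max` frozen from the training data, so anything outside the training range lands outside [0, 1]. The numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical training column and some new data seen later.
train = np.array([0.0, 2500.0, 5000.0])
new = np.array([-500.0, 1000.0, 6000.0])

# "Fit": remember the min and max of the training data only.
lo, hi = train.min(), train.max()

# "Transform": reuse the frozen statistics on unseen data.
scaled = (new - lo) / (hi - lo)
print(scaled)  # values below 0 and above 1 are possible
```

If capped values are wanted, `np.clip(scaled, 0, 1)` after the transform is one option.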

#

Hey all what's a way to convert a series of lists in a column of a groupby object to one giant list for each group?

ie.
a [1, 2]
a [2, 3]
b [1, 2]
b [2, 3]
into
a [1, 2, 2, 3]
b [1, 2, 2, 3]

@heady hatch write a custom aggregation function

#

and pass that into .agg
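A rough sketch of that suggestion, assuming the lists live in a column called `vals` and the group key in a column called `key` (both names are made up here):

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "vals": [[1, 2], [2, 3], [1, 2], [2, 3]],
})


def flatten(lists):
    """Concatenate every list in the group into one list."""
    return list(chain.from_iterable(lists))


# One flattened list per group.
merged = df.groupby("key")["vals"].agg(flatten)
print(merged)
```

`chain.from_iterable` avoids the quadratic cost of repeatedly adding lists together, which matters if the groups are large.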

#

Hey guys, i have a question around dataset and what actions to take. I have 3 variables that are not at all on the same scale. I need to normalize or standardize them so they result in something i can weight and then pull into a single score as a result.
@small tartan where do the weights come from

small tartan Nov 17, 2020, 4:41 AM

#

To clarify, i am not deploying this in a ML model. But a dashboard

#

I will manually adjust the weights to achieve my desired rack and stack

#

i have 50 records (with about 5 being added quarterly)

velvet thorn Nov 17, 2020, 4:41 AM

#

hm

#

well

#

you could also scale the weights to the data

small tartan Nov 17, 2020, 4:42 AM

#

I've picked the top 10 and bottom 10 based on understanding the data and what it represents. I need to apply the weights to basically get those top and bottom 10 in the correct area and let the rest work within the scale

#

I'm just getting really caught up in the standardizing of the data so its on an equal playing field

velvet thorn Nov 17, 2020, 4:43 AM

#

seems like

#

scaling them to 0-100 would make sense
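One way this could look, as a sketch only. It assumes the nominal upper bounds mentioned earlier (100, 5000, 10000) are fixed, and the weights are placeholders to be tuned by hand. Scaling against fixed bounds rather than the observed min and max means a new extreme record does not shift everyone else's score.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [10.0, 55.0, 90.0],       # already a percentage (0-100)
    "b": [250.0, 4000.0, 1200.0],  # nominal range 0-5000
    "c": [9000.0, 100.0, 5000.0],  # nominal range 0-10000
})

# Assumed fixed upper bounds and hand-picked weights (must sum to 1).
upper = {"a": 100.0, "b": 5000.0, "c": 10000.0}
weights = {"a": 0.5, "b": 0.3, "c": 0.2}

# Put every variable on a 0-100 scale, then take the weighted sum.
scaled = pd.DataFrame({col: df[col] / upper[col] * 100 for col in upper})
df["score"] = sum(scaled[col] * w for col, w in weights.items())
print(df)
```

Because the weights sum to 1 and each scaled column is within 0-100, the score stays within 0-100 as well.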

small tartan Nov 17, 2020, 4:43 AM

#

Since its not exactly dealing with metrics that are easy to just add together, hence standardizing. I did standardize so 95% is within 1 Standard deviation. and the outputs are mostly between -1 and 1

velvet thorn Nov 17, 2020, 4:43 AM

#

since the first value is a percentage 🤷‍♂️

small tartan Nov 17, 2020, 4:43 AM

#

Well i can build that metric to not be a percentage. It just comes that way raw

#

but yeah the first value being a percent is nice since thats inherently a 0-100 already ha

#

I'm building a score for spaces where content is held. The content age, usage, and some data about it is what is driving the variables. I'll have this score updated monthly with the backwards rolling 180 days worth of info

deep spire Nov 17, 2020, 5:22 AM

#

anyone have any luck using to_sql in pandas to load data into Snowflake when the dataframe has a datetime field? It keeps giving me this error: Failed processing pyformat-parameters; 255001: Binding data in type (timestamp) is not supported. and I haven't been able to find a solution online that doesn't involve manually converting each column that is a date, which isn't feasible for my use case

deep spire Nov 17, 2020, 5:42 AM

#

or is there any way to dynamically convert all datetime columns to strings in pandas without knowing every single column name that is a datetime?

green hemlock Nov 17, 2020, 6:25 AM

#

you can always loop after getting df.dtypes

#

and convert the columns whose dtype is datetime to strings

deep spire Nov 17, 2020, 6:26 AM

#

i cant even get the string conversion to work using astype(str) or unicode due to ascii-unicode errors

#

really wonder why pandas apparently had this working in 0.15 but then broke it in 0.24 🤷‍♂️

velvet thorn Nov 17, 2020, 6:42 AM

#

or is there any way to dynamically convert all datetime columns to strings in pandas without knowing every single column name that is a datetime?
@deep spire .select_dtypes

#

into .apply

#

into .dt.strftime

earnest forge Nov 17, 2020, 7:20 AM

#

I'm at beginning of my ML learning path, so it'd be nice of you if you help me clarify one question: is PCA the same as estimator?

deep spire Nov 17, 2020, 7:50 AM

#

@velvet thorn whats the proper way to call this and set those columns in the df? I'm trying df.select_dtypes(include='datetime64') = df.select_dtypes(include='datetime64').apply(lambda x: x.strftime('%Y-%m-%d %H:%M:%S')) which fails (as does apply(dt.strftime('%Y-%m-%d %H:%M:%S'))

But I think the error is with setting the columns to be the new column with converted type. getting SyntaxError: can't assign to function call

flat turtle Nov 17, 2020, 8:03 AM

#

hi :)
how do I correctly install the vaex lib?
I mean, I installed vaex with
pip install vaex, but when I try to start it in the shell,
the shell throws ModuleNotFoundError: No module named 'vaex.remote'.
Then I tried to install that one module,....
ERROR: Could not find a version that satisfies the requirement vaex.remote (from versions: none) ERROR: No matching distribution found for vaex.remote

lavish zinc Nov 17, 2020, 8:12 AM

#

how to increase the spacing between each x axis element?

📎 2020-11-17-134124_1256x79_scrot.png

#

or how to set x-axis each element text on multiple line?

earnest forge Nov 17, 2020, 8:43 AM

#

@lavish zinc you better rotate them 90°
plt.xticks(rotation=90)

#

Alternatively, you can set a huge width in plt.figure(figsize=(30, 5))

#

But second option is not recommended, of course

brave crest Nov 17, 2020, 9:08 AM

#

My validation accuracy and training accuracy print their own values. Am I meant to make them print together as an average of both?

velvet thorn Nov 17, 2020, 9:17 AM

#

@deep spire lambda s: s.dt.strftime

#

df.select_dtypes(include='datetime64')[:]
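Putting those pieces together, a sketch of one way to do it. Assigning to the result of `select_dtypes(...)` is not reliable because it returns a new frame, so this version grabs the column names first and assigns back through `df[...]`. The column names are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2],
    "created": pd.to_datetime(["2020-11-17 05:22:00", "2020-11-18 06:42:00"]),
    "updated": pd.to_datetime(["2020-11-17 07:50:00", "2020-11-18 09:17:00"]),
})

# Find the datetime columns without naming them by hand.
dt_cols = df.select_dtypes(include="datetime64").columns

# Format each datetime column as a string and write it back.
df[dt_cols] = df[dt_cols].apply(lambda s: s.dt.strftime("%Y-%m-%d %H:%M:%S"))
print(df)
```

Timezone-aware columns have a different dtype and would likely need `include="datetimetz"` as well.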

late torrent Nov 17, 2020, 9:31 AM

#

hi 🙂

#

for any JupyterLab users who like a good dark theme, and maybe want something that looks a bit more modern than what JupyterLab offers, I just published a build of One Dark Pro

#

📎 jlab-one-dark-pro.png

#

which can be installed in the Extension manager 😊

#

https://github.com/johnnybarrels/jupyterlab_onedarkpro

boreal summit Nov 17, 2020, 9:40 AM

#

@somber torrent try out kaggle.com for loads of datasets. You'll also find ML and data analysis examples of those datasets.

vapid sorrel Nov 17, 2020, 10:12 AM

#

Hi, does anyone know how I can use a TPU on Google Colab to train an ANN, please?

tawny cradle Nov 17, 2020, 10:16 AM

#

https://github.com/Nikhil0504/AI-SEWA/blob/main/Computer Vision/Paper.pdf

#

Hi people I made a paper for my school project on AI and ML

#

Can you please check it out and tell me if it’s good or not

delicate night Nov 17, 2020, 1:03 PM

#

I have a very good math background, and a lot of experience in swe, but not data sci. Are there any intermediate projects you guys could recommend? I want something that can allow me to get a feel for this field

bronze barn Nov 17, 2020, 1:07 PM

#

@hoary sluice thank you kind stranger for your suggestion to use HDBSCAN it worked nicely!

azure locust Nov 17, 2020, 1:11 PM

#

Hi, does anyone have an idea about calculating the reading time of an article by considering the syllables of the words as well, apart from considering the number of words and words per minute? or does anyone know how the reading time and speaking time is calculated in Grammarly?

vapid sorrel Nov 17, 2020, 2:42 PM

#

Hi guys, I have a question about NNs. I created a custom loss function which works on an ANN but doesn't when I put it in an RNN. Why?

def correlation(y_true, y_pred):
    corr = tfp.stats.correlation(y_true, y_pred, sample_axis=0, event_axis=None)
    return corr

torpid cave Nov 17, 2020, 2:56 PM

#

Hi guys, any scrapers here? Just looking for opinions on Scrapy vs BS

sharp herald Nov 17, 2020, 3:52 PM

#

@torpid cave what you mean by BS?

heady hatch Nov 17, 2020, 4:08 PM

#

Hmm I've never used BeautifulSoup as a scraper but more as a parser and I've never used Scrapy as a parser but as a scraper.

Often I just use requests + bs instead of Scrapy unless I need something heavy duty.

What are you scraping?

sharp herald Nov 17, 2020, 4:08 PM

#

Ah thats BeautifulSoup

#

they are not comparable

#

BS is an HTML parser, Scrapy is a web crawler framework which includes an HTML parser too

#

using scrapy just to parse HTML does not make sense

#

you can use BS to parse the HTML pages crawled by Scrapy

slender nymph Nov 17, 2020, 4:32 PM

#

hi how can i convert this in python

#

> set.seed(1)
> x <- w <- rnorm(100)
> for (t in 3:100) x[t] <- 0.666*x[t-1] - 0.333*x[t-2] + w[t]
> layout(1:2)
> plot(x, type="l")
> acf(x)

#

it is R

heady hatch Nov 17, 2020, 5:17 PM

#

What's x <- w <- rnorm(100)?

slender nymph Nov 17, 2020, 5:18 PM

#

x=w=np.random.normal(100 values)

heady hatch Nov 17, 2020, 5:19 PM

#

And what's x[t]? is that accessing x at index t?

#

if so

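import numpy as np

np.random.seed(1)  # reproducible, but not the same numbers as R's set.seed(1)

w = np.random.normal(size=100)
x = w.copy()  # separate array: in R, x and w do not share memory after assignment
for t in range(2, 100):  # R's 3:100 is 1-based, so start at index 2
  x[t] = 0.666 * x[t-1] - 0.333 * x[t-2] + w[t]

... plotting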
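For the `acf(x)` part, a rough NumPy stand-in is sketched below. The helper name `sample_acf` and the default of 20 lags are choices made for this example, not something from R or the thread. It simulates the same AR(2) series and returns the sample autocorrelation per lag, which could then be passed to a bar or stem plot.

```python
import numpy as np


def sample_acf(series, max_lag=20):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    return np.array([
        np.dot(centered[: len(x) - k], centered[k:]) / denom
        for k in range(max_lag + 1)
    ])


# Simulate the AR(2) process from the R snippet.
rng = np.random.default_rng(1)
w = rng.normal(size=100)
x = w.copy()
for t in range(2, 100):
    x[t] = 0.666 * x[t - 1] - 0.333 * x[t - 2] + w[t]

acf_values = sample_acf(x)
print(acf_values[:5])
```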

slender nymph Nov 17, 2020, 5:24 PM

#

thank you master @heady hatch

waxen birch Nov 17, 2020, 5:48 PM

#

Hello guys, I'm new to data science and python, however i have some experience with languages like C#, java or js. I have to do some tasks, is it a good place to ask some questions?

#

Right now i have to complete some programming tasks using pandas module

heady hatch Nov 17, 2020, 5:52 PM

#

Sounds relevant to data science, shoot your questions.

waxen birch Nov 17, 2020, 6:03 PM

#

okay so, i have a dataframe with columns (ID;Country;owns_car;gender;Age) and i have to create a new df that has columns Country, average goods, minimalAge and %ofWomen

#

so i don't know how to create a new df with the given columns and then populate the columns

#

i am a total pandas noob so maybe it is a simple task but i don't know the tool to achieve my goal 😄
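A hedged sketch of the usual tool for this kind of per-country summary: `groupby` with named aggregations. The sample rows are invented, gender is assumed to be coded as "F"/"M", and since no "goods" column appears in the listed input, the share of car owners is averaged here as a stand-in.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Country": ["PL", "PL", "DE", "DE", "DE"],
    "owns_car": [True, False, True, True, False],
    "gender": ["F", "M", "F", "F", "M"],
    "Age": [34, 22, 45, 51, 19],
})

summary = (
    df.assign(is_woman=df["gender"].eq("F"))  # helper boolean column
      .groupby("Country")
      .agg(
          car_share=("owns_car", "mean"),     # stand-in for "average goods"
          minimalAge=("Age", "min"),
          pct_women=("is_woman", lambda s: 100 * s.mean()),
      )
      .reset_index()
)
print(summary)
```

Each keyword in `.agg(...)` becomes a column of the new frame, so the new DataFrame is created and populated in one step.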

cerulean spindle Nov 17, 2020, 6:59 PM

#

I used to use jupyter notebook with VScode. It was really slow and sometimes made really weird errors (not on me). Did anyone have the same problem? Does anyone recommend alternatives?

heady hatch Nov 17, 2020, 7:51 PM

#

@waxen birch

We'll work on it one at a time, but I do recommend reading up on basic Pandas first then we'll break down the problem at hand.

#

@cerulean spindle

Hmm what do you mean by really slow? Comparing it to regular Jupyter instances?

lapis sequoia Nov 17, 2020, 7:53 PM

#

@cerulean spindle i usually use a docker image to run jupyter notebooks and just pass the url to localhost

waxen birch Nov 17, 2020, 8:06 PM

#

@heady hatch okay, do you know maybe some good source of basic pandas? Maybe some tutorial which is valuable? ;)

heady hatch Nov 17, 2020, 8:08 PM

#

@waxen birch Here are some to get started.

https://github.com/ajcr/100-pandas-puzzles

They also have links on how to get started.

#

I recommend getting the basics of pandas down first because otherwise you have to think about data transformation and pandas syntax at the same time.

#

Unless you feel comfortable enough to dive right in, then show us your data and we can go straight in.

waxen birch Nov 17, 2020, 8:10 PM

#

Ooo! This looks great! Thank you so much. I've read in one of the O'Reilly book that python community is really nice. I guess they were right :)

heady hatch Nov 17, 2020, 9:04 PM

#

@glad mulch I don't know if this will work, but you can try df.T

#

hahah

Another way would be

df.index = df.columns

#

Though I'm unsure how that will go.

#

I guess you can do a temp or xor exchange.

#

df.columns, df.index = df.index, df.columns

#

I think you can just

#

pd.DataFrame or

#

pd.concat(list)

#

I might need more information.

#

What do you mean by same indexes and how do you want the dataframe to look?

median dove Nov 17, 2020, 9:22 PM

#

Hey, how could I update a complete Pandas column following a condition? For example: update Sex: male, female, male, male, female... to Sex: 0, 1, 0, 0, 1...?

#

Yeah but df["sex"].map(...)?

#

It did, thanks
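For reference, a small sketch of the `.map` approach with a dictionary, assuming the column is called `Sex` and only contains the two values shown in the question:

```python
import pandas as pd

df = pd.DataFrame({"Sex": ["male", "female", "male", "male", "female"]})

# Replace each label with its code; labels missing from the dict become NaN.
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})
print(df)
```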

heady hatch Nov 17, 2020, 9:50 PM

#

@glad mulch oh 6 of those have the same indexes?

strong oasis Nov 18, 2020, 12:51 AM

#

Anybody else staring at some code not knowing where to start or is that just me?? (I'm still fairly new to python, but I spent some time away from it so I'm picking it back up and trying to learn ML 😓 )

hoary sluice Nov 18, 2020, 12:59 AM

#

@hoary sluice thank you kind stranger for your suggestion to use HDBSCAN it worked nicely!
@bronze barn no problem, I suggested HDBSCAN because I found myself with a similar problem... HDBSCAN improves on the way density clustering works by also building a hierarchical structure. It's very good at finding outliers... and together with UMAP it's great

cyan sun Nov 18, 2020, 1:41 AM

#

Trying to remove all columns that have a last row element of NaN.

📎 Screen_Shot_2020-11-17_at_8.40.30_PM.png

#

started by trying to create a droplist but i'm running into problems

#

droplist = [col for col in df.columns if ((df.loc[df['date'] == today][col]).isna()) == True]

heady hatch Nov 18, 2020, 1:57 AM

#

@cyan sun
You can try
df.iloc[-1].isna()

#

That should give you all the columns that have nan in the last row.

cosmic prairie Nov 18, 2020, 1:59 AM

#

Hello, was wondering if anyone could explain this piecewise fit function code using numpy https://paste.pythondiscord.com/yisowirelu.properties as I am struggling to grasp the idea

graceful glacier Nov 18, 2020, 2:02 AM

#

can anyone who knows SQL tell me why this works

#

📎 unknown.png

#

but this doesnt

#

📎 unknown.png

cyan sun Nov 18, 2020, 2:13 AM

#

@heady hatch thanks for the help but the drop method doesn't allow for boolean arrays - any suggestion on how to handle that?

heady hatch Nov 18, 2020, 2:15 AM

#

@cyan sun You don't need to use drop. You can just filter it out using boolean indexing.
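A short sketch of that boolean-indexing idea with a made-up frame. The mask from `df.iloc[-1].isna()` is indexed by column name, so it can be used directly on the column axis of `.loc`:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "date": ["2020-11-16", "2020-11-17"],
    "x": [1.0, np.nan],   # NaN in the last row, should go
    "y": [2.0, 3.0],
    "z": [np.nan, 4.0],   # NaN only in an earlier row, should stay
})

# True for every column whose last-row value is NaN.
last_is_nan = df.iloc[-1].isna()

# Keep the columns where the mask is False.
kept = df.loc[:, ~last_is_nan]

# Equivalent with drop: pass column names rather than the boolean mask.
dropped = df.drop(columns=last_is_nan[last_is_nan].index)
print(kept)
```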

minor star Nov 18, 2020, 2:16 AM

#

Yo so i have a csv file with a list of multipolygons that are 'community areas' of chicago. I am trying to find the center coordinate of each polygon, how should i go about doing this?

bleak spindle Nov 18, 2020, 2:18 AM

#

could you guys take a look at help-oxygen?

#

please

slate wagon Nov 18, 2020, 2:20 AM

#

Make a 10X4 dataframe with random numbers, you can use any names for columns names.

Use one easy built in function to show the basic statistics of all the columns such as count, mean, std and percentiles.

Transpose your dataframe.

Print the 3rd row and 5th and 6th columns from the transposed dataframe.

#

can someone help with this?
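A sketch of one possible reading of the exercise. The column names are arbitrary, and "the 3rd row and 5th and 6th columns" is interpreted here as the values of row 3 in columns 5 and 6 of the transposed frame (positions are 0-based in `iloc`).

```python
import numpy as np
import pandas as pd

# 10 rows x 4 columns of random numbers.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 4)), columns=["w", "x", "y", "z"])

# count, mean, std, min, quartiles and max for every column.
stats = df.describe()
print(stats)

# After transposing, the frame is 4 rows x 10 columns.
transposed = df.T

# 3rd row (position 2), 5th and 6th columns (positions 4 and 5).
picked = transposed.iloc[2, [4, 5]]
print(picked)
```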

cosmic prairie Nov 18, 2020, 2:21 AM

#

if anyone could take a look at help-copper as well that would be great

cyan sun Nov 18, 2020, 2:44 AM

#

@heady hatch got it working. Thanks again for your help 👍

heady hatch Nov 18, 2020, 2:44 AM

#

Nice nice.

shell berry Nov 18, 2020, 5:07 AM

#

Anyone here good at pytorch?

#

Willing to pay for some help

heady hatch Nov 18, 2020, 5:09 AM

#

@shell berry What do you need help with in pt?

stray spade Nov 18, 2020, 5:28 AM

#

Hye everyone

weary heart Nov 18, 2020, 5:47 AM

#

hi, does anyone know how to plot eeg using csv files?

#

and is willing to help me with a project?

pliant kestrel Nov 18, 2020, 6:04 AM

#

hi all, i have posted my issue in help-nickle but no one replied. I will summarize the essay i have written there in one sentence. any one expert in machine learning in python can give me a private tutoring to guide me in my project. I don't think i can learn everything in 10 days and submit my project.

velvet thorn Nov 18, 2020, 6:10 AM

#

hi all, i have posted my issue in help-nickle but no one replied. I will summarize the essay i have written there in one sentence. any one expert in machine learning in python can give me a private tutoring to guide me in my project. I don't think i can learn everything in 10 days and submit my project.
@pliant kestrel honestly

#

your problem

#

isn't really that advanced

#

but you're asking for quite a big commitment.

pliant kestrel Nov 18, 2020, 6:12 AM

#

i know man, i know, but i just feel totally alone in this shit that is driving me into despair

shell berry Nov 18, 2020, 6:21 AM

#

@heady hatch Are you familiar with pytorch lightning

heady hatch Nov 18, 2020, 6:21 AM

#

Nope.

shell berry Nov 18, 2020, 6:21 AM

#

oh rip

#

I guess my question can be generalized to basic pytorch too

heady hatch Nov 18, 2020, 6:22 AM

#

It looks super clean though.

velvet thorn Nov 18, 2020, 6:22 AM

#

i know man, i know, but i just feel totally alone in this shit that is driving me into despair
@pliant kestrel you can ask specific questions here

#

and have a reasonably high chance of an answer.

shell berry Nov 18, 2020, 6:22 AM

#

If I have a tensor of input tensors and a tensor of outputs tensors, how exactly should I feed it into the model

velvet thorn Nov 18, 2020, 6:22 AM

#

but I think your problems run deeper than that

#

and well

#

maybe this isn't exactly the place

shell berry Nov 18, 2020, 6:22 AM

#

[inputs] + [labels]

#

or

#

[(input, label), (input, label)]

#

I know I have to feed it into a DataLoader

pliant kestrel Nov 18, 2020, 6:23 AM

#

@velvet thorn if this is not the place, then where is the proper place? i tried asking on reddit, but nobody responded

#

i will try to see the codes that have been used in this course, and try to figure out as much as i can

velvet thorn Nov 18, 2020, 6:24 AM

#

like I said

#

this isn't really the place to find someone who is willing to commit to that long term

#

you might, but it's really unlikely.

#

especially for free

pliant kestrel Nov 18, 2020, 6:24 AM

#

what if not for free

heady hatch Nov 18, 2020, 6:24 AM

#

Interestingly @shell berry I might actually look into pytorch lightning. hahaha Thank you for this.

In terms of your question. Depends on your model.

velvet thorn Nov 18, 2020, 6:24 AM

#

what if not for free
@pliant kestrel then you need to take into account that you get what you pay for

shell berry Nov 18, 2020, 6:24 AM

#

Haha np, it is pretty clean

velvet thorn Nov 18, 2020, 6:24 AM

#

and if you want quality it's not going to be cheap

#

yeah.

shell berry Nov 18, 2020, 6:25 AM

#

I just want to use a straightforward MLP for now

#

Super basic

pliant kestrel Nov 18, 2020, 6:25 AM

#

what do u think the prices might be

shell berry Nov 18, 2020, 6:25 AM

#

    def __init__(self, input_size, hidden_size, output_size, dropout=False, dropout_p=0.1):
        super(MultiLayerPerceptron, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size, bias=True)
        self.fc2 = nn.Linear(hidden_size, output_size, bias=True)

        self.add_dropout = dropout
        self.dropout = nn.Dropout(dropout_p)

#

Super basic

pliant kestrel Nov 18, 2020, 6:26 AM

#

i have been laid off since july, so I can barely offer much

shell berry Nov 18, 2020, 6:27 AM

#

@pliant kestrel If you're using sklearn I can help you for free tomorrow.

heady hatch Nov 18, 2020, 6:28 AM

#

@shell berry So I see you have your layers set up.

For PT, you can either define a full on loop or a

def forward(self, x):
  # layers are attributes of the module, so they need self.
  x = self.layer1(x)
  x = self.layer2(x)
  x = self.layer3(x)
  return x

shell berry Nov 18, 2020, 6:28 AM

#

Yes, thank you

heady hatch Nov 18, 2020, 6:28 AM

#

And then when you're calling the model in your loop

shell berry Nov 18, 2020, 6:28 AM

#

I have the model done and working with basic pytorch

#

Im just using lightning now and trying to wrap my head around datamodule and dataloader haha

#

DataLoader takes in one input corresponding to data, so I assume it's a list of tuples of inputs and labels?

heady hatch Nov 18, 2020, 6:30 AM

#

So I think you use a dataloader with a dataset.

#

ie make your dataset subclass, then create a dataloader based on that dataset.

shell berry Nov 18, 2020, 6:31 AM

#

Yup, so I created my dataset with another class

#

Now I just want to put it into a dataloader

#

But my subclass just returns a list of inputs and outputs

#

should I zip them into tuples and then into the dataloader?

heady hatch Nov 18, 2020, 6:32 AM

#

I want to say yes, but if you don't mind let me read it up real quick.

#

Been working with tf largely recently.

shell berry Nov 18, 2020, 6:33 AM

#

Sure, thank you very much

heady hatch Nov 18, 2020, 6:33 AM

#

Ahh okay. Yea you bundle the two up together.

#

And when you're loading it, you would unpack it.

pliant kestrel Nov 18, 2020, 6:33 AM

#

@shell berry kind sir, what is it in sklearn that you are willing to explain

shell berry Nov 18, 2020, 6:34 AM

#

@heady hatch Thank you

#

If you're working with text data @pliant kestrel

heady hatch Nov 18, 2020, 6:34 AM

#

So let's say you have your dataset.

iterating through your dataset object from the dataloader

for idx, data in enumerate(dataloader):
  image, label = data[0], data[1]
  ...
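A minimal sketch of the bundling step in plain PyTorch, assuming the inputs and labels are already tensors with the same first dimension. The shapes and the multi-hot labels are made up. `TensorDataset` pairs the two tensors row by row, and the `DataLoader` then yields `(inputs, labels)` batches that can be unpacked directly.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical data: 8 samples, 4 features, 3 multi-hot labels each.
inputs = torch.randn(8, 4)
labels = torch.randint(0, 2, (8, 3)).float()

# Each dataset item is an (input, label) pair.
dataset = TensorDataset(inputs, labels)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

for batch_inputs, batch_labels in loader:
    print(batch_inputs.shape, batch_labels.shape)
```

A custom `Dataset` subclass whose `__getitem__` returns `(input, label)` should behave the same way when handed to `DataLoader`.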

pliant kestrel Nov 18, 2020, 6:34 AM

#

the file is in csv file

shell berry Nov 18, 2020, 6:34 AM

#

I can help reading it in, cleaning it, augmenting/tokenizing/etc., then putting it into a format sklearn can read, then using a SVC or decision tree or whatever you need on it

heady hatch Nov 18, 2020, 6:35 AM

#

Yea no problem, happy to help.

#

@glad mulch 👋 hello.

shell berry Nov 18, 2020, 6:35 AM

#

Also by the way @heady hatch

#

For my labels

#

Does it matter if I'm using a labexindexer or a multilabelbinarizer

#

Like let's say I have 5 labels

#

I can represent them as [0], [1], etc

#

or [1 0 0 0 0], [0 1 0 0 0]

#

I'm doing multilabel classification so I'm using the latter, but I see some people use the former

pliant kestrel Nov 18, 2020, 6:36 AM

#

@shell berry, sounds good

#

when are u free tomorrow

shell berry Nov 18, 2020, 6:37 AM

#

Just message me whenever tomorrow and Ill let you know, but please don't hedge your entire project on me helping, I don't want you to get screwed if I'm busy or something

heady hatch Nov 18, 2020, 6:37 AM

#

I have no idea what either of those are. hahaha

In terms of deep learning, you would end up using different losses.

If you have 5 labels, and depending on what the output is like you would either use sparse categorical crossentropy or categorical crossentropy.

shell berry Nov 18, 2020, 6:38 AM

#

oh lol my bad

#

Ive been doing alot of classical machine learning with sklearn before this so Im thinking of everything like that lol

heady hatch Nov 18, 2020, 6:38 AM

#

No worries, I had a similar transition too.

pliant kestrel Nov 18, 2020, 6:38 AM

#

no man, i just want the push to start, since i am lost and don't know where to go

shell berry Nov 18, 2020, 6:38 AM

#

What does your data look like @pliant kestrel

pliant kestrel Nov 18, 2020, 6:39 AM

#

the professor is supposed to post the project today, but he did not yet for some reason

shell berry Nov 18, 2020, 6:39 AM

#

Wait you don't have it?

#

Then why are you worrying

pliant kestrel Nov 18, 2020, 6:39 AM

#

it is due in two weeks

shell berry Nov 18, 2020, 6:40 AM

#

Yeah u havent even looked at it man

pliant kestrel Nov 18, 2020, 6:40 AM

#

it is supposed to be posted already

shell berry Nov 18, 2020, 6:40 AM

#

atleast look at it before you give up 😛

pliant kestrel Nov 18, 2020, 6:41 AM

#

since the begining, we had miniprojects, a guy in our group was doing the coding while the rest did the exercises in excel for better understanding of the concepts

#

the final project is individual work. i haven't looked at any code for ml, since the code our professor gave us was so old that the confusion matrix wasn't working, due to an update in pandas_ml that conflicts with something else, and since our beloved instructor did not update it, and me being a total noob in python, i gave up on learning

velvet thorn Nov 18, 2020, 6:43 AM

#

Does it matter if I'm using a labexindexer or a multilabelbinarizer
@shell berry in short, not really.

#

but

#

okay wait do you understand the difference between multi-class and multi-label?

shell berry Nov 18, 2020, 6:43 AM

#

Yup

#

@pliant kestrel So you're a complete python noob?

#

I recommend learning python before you start doing ML then

#

Also don't let one guy in the group do all the coding

pliant kestrel Nov 18, 2020, 6:44 AM

#

but now this individual project came like a 12 inch stick in the .. and now i need to do the following, 1-import the data ( i believe it is cleaned, since we are given training and test data) 2- do classification ( various classifiers) 3- using the confusion matrix, 4- write a report about each observation

#

well, i don't know any more man

shell berry Nov 18, 2020, 6:45 AM

#

How much python do you know?

pliant kestrel Nov 18, 2020, 6:45 AM

#

@glad mulch i think so

#

basics

shell berry Nov 18, 2020, 6:45 AM

#

Because all the stuff you told me can be done in like 50 lines lol

pliant kestrel Nov 18, 2020, 6:45 AM

#

i know, that is why i am a bit confused in this discussion 😄

velvet thorn Nov 18, 2020, 6:45 AM

#

Yup
@shell berry yeah so the latter representation can handle multi-label classification

#

since you can have multiple 1s

shell berry Nov 18, 2020, 6:46 AM

#

@velvet thorn Got it, thanks

#

Why can't the former? What if I have like

pliant kestrel Nov 18, 2020, 6:46 AM

#

@glad mulch how much of a pain was it

shell berry Nov 18, 2020, 6:46 AM

#

[5, 16, 20] for each label

velvet thorn Nov 18, 2020, 6:46 AM

#

[5, 16, 20] for each label
@shell berry then you'd have a variable-length target
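To make the difference concrete, a small NumPy sketch. The label ids and the total of 25 labels are invented. Lists of label ids have a different length per sample, while the multi-hot encoding always has one slot per possible label.

```python
import numpy as np

num_labels = 25  # assumed total number of distinct labels

# Variable-length targets: each sample lists the ids of its labels.
samples = [[5, 16, 20], [3], [0, 24]]

# Fixed-length multi-hot targets: one row per sample, one column per label.
targets = np.zeros((len(samples), num_labels), dtype=np.float32)
for row, label_ids in enumerate(samples):
    targets[row, label_ids] = 1.0

print(targets.shape)
```

scikit-learn's `MultiLabelBinarizer` produces this kind of matrix from lists of labels.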

shell berry Nov 18, 2020, 6:46 AM

#

Oh yes

#

Silly me 🙂

#

Thanks

velvet thorn Nov 18, 2020, 6:46 AM

#

yw

shell berry Nov 18, 2020, 6:46 AM

#

@pliant kestrel you're getting ahead of yourself

pliant kestrel Nov 18, 2020, 6:46 AM

#

no wait man, i am not, am i?

shell berry Nov 18, 2020, 6:46 AM

#

if you don't know programming then don't worry about the data yet

pliant kestrel Nov 18, 2020, 6:47 AM

#

but i have a project to deliver

#

and i have to do it by myself

shell berry Nov 18, 2020, 6:47 AM

#

I don't know what to tell you man

#

If you don't know programming

pliant kestrel Nov 18, 2020, 6:47 AM

#

tell me whatever u want

shell berry Nov 18, 2020, 6:47 AM

#

When did you start learning python?

pliant kestrel Nov 18, 2020, 6:47 AM

#

yesterday

shell berry Nov 18, 2020, 6:48 AM

#

yikes

pliant kestrel Nov 18, 2020, 6:48 AM

#

i had some circumstances this semester

#

being laid off and such

#

it was not funny

#

you get destroyed when u are alone

shell berry Nov 18, 2020, 6:48 AM

#

Yeah man Im not judging you or anything, I said that with kind intentions

#

I just dont know how you can do ML in python if you dont even know python

#

But your project seems pretty basic, you could look up tutorials and put together the pieces

pliant kestrel Nov 18, 2020, 6:49 AM

#

that is what i am trying to do at the moment.

shell berry Nov 18, 2020, 6:49 AM

#

You have two weeks?

pliant kestrel Nov 18, 2020, 6:50 AM

#

i will try to ask more clearer questions in the future

#

yea

shell berry Nov 18, 2020, 6:50 AM

#

Your project can be done in two hours

#

I'd spend a week just learning python itself

pliant kestrel Nov 18, 2020, 6:50 AM

#

that is a ray of hope

shell berry Nov 18, 2020, 6:50 AM

#

What is your masters degree in?

pliant kestrel Nov 18, 2020, 6:50 AM

#

well, i have another project in regression analysis that i am trying to solve also using JMP

#

engineering management

#

the course i am taking is called : data mining

shell berry Nov 18, 2020, 6:51 AM

#

They didn't have prereqs for that course?

pliant kestrel Nov 18, 2020, 6:51 AM

#

statistical learning in machine learning or close enough

shell berry Nov 18, 2020, 6:51 AM

#

I have never seen a course like this without programming courses as a prerequisite

pliant kestrel Nov 18, 2020, 6:52 AM

#

the prerequisite was that u should have taken a programming course in ur undergrad

#

i did, that was 9 years ago in Java

#

so i am a bit flexed on some concepts, but that is that, i have never dealt with python past this point

shell berry Nov 18, 2020, 6:53 AM

#

http://ct-chess.us/pdf/py3_hard.pdf

#

study that for 2-4 days and you'll be fine

pliant kestrel Nov 18, 2020, 6:54 AM

#

should i neglect the course i am taking in datacamp?

lapis sequoia Nov 18, 2020, 6:54 AM

#

So for handwritten dataset. Is it true that sprite sheet is more common than csv?

minor star Nov 18, 2020, 6:54 AM

#

Practice cant hurt

shell berry Nov 18, 2020, 6:54 AM

#

That is up to you

pliant kestrel Nov 18, 2020, 6:54 AM

#

ok i see, but can i continue to ask more questions on what to do in the future?

shell berry Nov 18, 2020, 6:56 AM

#

Sure

pliant kestrel Nov 18, 2020, 6:58 AM

#

sorry man i laughed a bit

shell berry Nov 18, 2020, 6:58 AM

#

are you sure you didnt open ur data incorrectly cause wtf

pliant kestrel Nov 18, 2020, 7:00 AM

#

ok guys, see you, going to watch some lectures. Thanks kali, gm, light

shell berry Nov 18, 2020, 7:00 AM

#

good luck, sorry to hear about ur lay off

pliant kestrel Nov 18, 2020, 7:01 AM

#

thanks, it is ok, hopefully things will be resolved soon

shell berry Nov 18, 2020, 7:02 AM

#

I'm in grad school for NLP 😛

#

how about you

#

oh nice

#

you look like a finance student with your suit 😛

#

pct?

#

Can't you just parse through and check for what you need at each row

#

Sorry, I know literally nothing about sql 😛

#

Parse through the dataframe and read each row

#

each row will have the values of each column

#

so itll be like

#

[date, ticker, price], [date, ticker, price]

#

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

minor star Nov 18, 2020, 7:14 AM

#

How am i able to find the Geolocation and Geocoding Limits for API usage? i have a dataset of approximately 300,000 entries it needs to run on. Will i be able to run it on all of them?

#

Google APIs

#

How much is 100QPS?

velvet thorn Nov 18, 2020, 7:16 AM

#

if i wanted to do pct changes in price for each ticker based on the date, how would i do that
@glad mulch groupby

#

then diff

#

oh, no, not diff

#

pct_change

#

yeah

#

so what's wrong with groupby
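
A minimal sketch of the groupby + pct_change suggestion above. The column names (date, ticker, price) follow the row layout mentioned earlier in the channel, and the numbers are made up for illustration:

```python
import pandas as pd

# Hypothetical long-format price table: one row per (date, ticker).
prices = pd.DataFrame({
    "date": pd.to_datetime(["2020-11-16", "2020-11-16", "2020-11-17", "2020-11-17"]),
    "ticker": ["AAA", "BBB", "AAA", "BBB"],
    "price": [100.0, 50.0, 110.0, 45.0],
})

# Sort so each ticker's rows are in date order, then compute the
# percentage change within each ticker (first row per ticker is NaN).
prices = prices.sort_values(["ticker", "date"])
prices["pct"] = prices.groupby("ticker")["price"].pct_change()
print(prices)
```

Sorting first matters because pct_change compares each row with the previous row of the same group, in whatever order the rows are in.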

shadow quiver Nov 18, 2020, 8:01 AM

#

In a saved keras model (via model.save()), there are two keys in the .h5 file: model_weights and optimizer_weights. Am I gonna ever have to use optimizer_weights If I'm never gonna continue the training on the model? I'm willing to use it for prediction only

shell berry Nov 18, 2020, 8:37 AM

#

📎 unknown.png

#

Getting a loss of almost 0 after only 200 epochs.. Is this fishy? Something is wrong, right?

#

For a multilabel dataset with 3k examples and ~40 labels

winged lark Nov 18, 2020, 9:44 AM

#

hello

#

Can anyone help me on how to load multiple models from checkpoints in TensorFlow 2.1?

#

Have two checkpoint directories, and I need to load in a model for each.

#

🙂

lapis sequoia Nov 18, 2020, 9:50 AM

#

Getting a loss of almost 0 after only 200 epochs.. Is this fishy? Something is wrong, right?
@shell berry do you have a test dataset or validation dataset? it's likely that your model has overfitted/overfit (unsure of correct grammar here)

#

if you can run a test with your model using the test or validation dataset and see what the loss there is then you could potentially get an answer

#

if your testing loss is super high and ur test accuracy is low then your model has overfit

#

if your model has overfit, well it depends on the model and what you want to actually do because there's various ways to counter overfitting but if it's a CNN for example you can add dropout layers

gaunt venture Nov 18, 2020, 11:02 AM

#

Hello, so i simply want to calculate the percentage of something, yet i keep getting a divide by zero error, when i'm not dividing by 0

📎 unknown.png

bitter harbor Nov 18, 2020, 11:16 AM

#

that equals 0

📎 unknown.png

#

you wouldn't get the error otherwise

#

oh no actually a = 0, you don't have any parentheses so I'm assuming it's doing ((number/a) + sa + nad + d+ sd)
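
The precedence issue described above can be reproduced in a few lines. The variable names mirror the ones in the message; the values are hypothetical, since the original screenshot is not available:

```python
# Hypothetical counts; only `a` being 0 matters for the error.
number, a, sa, nad, d, sd = 5, 0, 3, 4, 2, 1

# Without parentheses, division binds tighter than addition,
# so only `a` is the divisor and a == 0 raises the error.
try:
    number / a + sa + nad + d + sd
except ZeroDivisionError as exc:
    print("without parentheses:", exc)

# With parentheses, the whole sum is the divisor.
percentage = number / (a + sa + nad + d + sd) * 100
print(percentage)
```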

gaunt venture Nov 18, 2020, 11:23 AM

#

oh no...i removed the brackets when i moved it to a function, thank you

mild topaz Nov 18, 2020, 12:08 PM

#

i am saving an image at path_resources folder and then i am deleting it os.remove(path_resources/"im.png") this way
i am getting error at test_img = cv2.imread(path_resources/"im.png") this line

#

Traceback (most recent call last):
  File "E:\demo3\modules\recDoc1.py", line 211, in post
    test_img = cv2.imread(path_resources/"im.png")
SystemError: <built-in function imread> returned NULL without setting an error

surreal willow Nov 18, 2020, 12:46 PM

#

CAN I use Conda with the PyCharm Community Edition?

agile pollen Nov 18, 2020, 1:40 PM

#

what is the module for calculus?

#

anyone?

light warren Nov 18, 2020, 1:46 PM

#

hey, can anyone help me with my scatterplot, im trying to change the values of the x axis as my graph is coming out like this https://gyazo.com/68542f25c58330b3c0db3fed929eb9e4

bitter harbor Nov 18, 2020, 1:52 PM

#

what is the module for calculus?
@agile pollen scipy's got quite a bit for it as well as sympy

agile pollen Nov 18, 2020, 2:18 PM

#

ModuleNotFoundError: No module named 'Tkinter'

#

please help

bitter harbor Nov 18, 2020, 2:22 PM

#

lowercase t

#

this is the wrong channel for that tho

split eagle Nov 18, 2020, 2:40 PM

#

I am trying to apply a function to all columns in a df without typing out the column names. Is there a faster way to do this? I've looked to see if I could use the column numbers, but this hasn't worked.

torpid cave Nov 18, 2020, 2:41 PM

#

What function?

split eagle Nov 18, 2020, 2:42 PM

#

.astype(int)

#

I've been using .astype(int) on individual columns, but I need to do it to all 16 in my df. The column titles are long, and I am trying to save time.

torpid cave Nov 18, 2020, 2:44 PM

#

df = df.astype(int)

#

Assuming all your columns are numbers

split eagle Nov 18, 2020, 2:45 PM

#

The columns are objects.

torpid cave Nov 18, 2020, 2:45 PM

#

import pandas as pd
d = {'A':['1','2','3','4','5'], "B":['2','3','4','5','6'], 'C':['3','4','5','6','7']}
df = pd.DataFrame(d)

df = df.astype(int)
type(df.loc[1,'A'])

#

Just tried that and it worked

#

by numbers I meant, whatever dtype they are, at the bottom of their hearts they are numbers

#

Other way is creating a column index and then changing the type to that column index

#

col_index = df.columns[0:2]
df[col_index] = df[col_index].astype(int)

#

Then you can use slicers

#

To select your columns

#

Or use a loop

#

for col in df.columns:
    df[col] = df[col].astype(int)

#

Last one I don't approve as it is not vectorized

#

There are at least 3 other ways I can think of doing this, let me know if what I did earlier works or if I misunderstood your problem
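
One of those other ways, sketched here as an assumption rather than what was meant in the message: if some object columns hold strings that are not clean integers, astype(int) raises, whereas pd.to_numeric with errors="coerce" turns the bad entries into NaN. The frame below is made up for illustration:

```python
import pandas as pd

# Hypothetical object-dtype frame; "x" is a value that cannot be parsed.
df_obj = pd.DataFrame({"A": ["1", "2", "x"], "B": ["4", "5", "6"]})

# Apply pd.to_numeric to every column; unparseable entries become NaN.
converted = df_obj.apply(pd.to_numeric, errors="coerce")
print(converted.dtypes)
print(converted)
```

A column that ends up containing NaN becomes float, so a later astype(int) on it only works after the missing values are filled or dropped.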

split eagle Nov 18, 2020, 2:50 PM

#

I'll give these a shot and let you know.

#

Thanks.

torpid cave Nov 18, 2020, 2:50 PM

#

Nww

split eagle Nov 18, 2020, 2:54 PM

#

col_index = df.columns[0:2] df[col_index] = df[col_index].astype(int) Worked like a charm. Thanks again.

slender nymph Nov 18, 2020, 3:09 PM

#

hi, how can i convert this code in python. it's is matlab ```py
%Simulate AR(3)
T = 1000; %Set how many observations you need
y = ones(T,1); %Create a vector of dim Tx1 to store the simulations in
y(1) = 1; %Set the first obs. to 1
y(2) = 0.5; %Set the second obs. to 0.5
y(3) = 1.5; %Set the third obs. to 1.5
rho1 = 0.2; %Set the value of rho1 (coefficient on y(t-1))
rho2 = 0.2; %Set the value of rho2 (coefficient on y(t-2))
rho3 = 0.1; %Set the value of rho3 (coefficient on y(t-3))
sigma = 1; %Set the value of the s.d. of the error term
mu_e = 0; %Set the value of the mean of the error term
eps = normrnd(mu_e, sigma, T, 1); %Creat a vector of normal random numbers with mean, mu_e and s.d. sigma. Dimension is Tx1

for t=4:1000; %Start the loop running from obs. 4 to 1000
y(t) = rho1*y(t-1) + rho2*y(t-2) + rho3*y(t-3) + eps(t); %The AR(3) model
end```

torpid cave Nov 18, 2020, 3:17 PM

#

Well up to mu_e is exactly the same

#

After that I think it is better if you explain what you want to do

#

Nevermind, I see what you are trying to do

#

It sucks that I can do it in R but not Python

#

I think you could either work out the equation to get Y

#

And loop y

#

it would be easier to ignore y1, y2, y3.. and just do an AR(3) simulation

#

import statsmodels.api as sm
import numpy as np

arparams = np.array([0.2, 0.2, 0.1])
ar = np.r_[1, -arparams]
arma_process =sm.tsa.ArmaProcess(ar=ar, nobs=1000)

#

Something along those lines

slender nymph Nov 18, 2020, 3:34 PM

#

arparams is nothing like the example i put. you cannot put 3 (y) in the same array

#

i already tried it. i did like you. i took it from forums

burnt wharf Nov 18, 2020, 3:35 PM

#

cross_val_score of sklearn returning list of nan values...any help guys?

torpid cave Nov 18, 2020, 3:36 PM

#

Hmm

#

A loop should work with some calculus

halcyon vale Nov 18, 2020, 3:37 PM

#

https://www.linkedin.com/posts/thinam-tamang-3b12831a2_300daysofdata-machinelearning-deeplearning-activity-6734796720862973952-M4aE

Thinam Tamang on LinkedIn: #300DaysOfData #machinelearning #deeplea...

Day 9 of #300DaysOfData!

Reinforcement Learning :
In Reinforcement Learning, The Learning system called an agent in a particular context can observe the...

torpid cave Nov 18, 2020, 3:37 PM

#

I mean you are simulating an AR process

burnt wharf Nov 18, 2020, 3:37 PM

#

@torpid cave any help?

torpid cave Nov 18, 2020, 3:37 PM

#

I am thinking but it is 230 am here

#

I havent done much calculus in python tbh

#

I guess I would try to get Y on one side and roll the equations

#

Ok got it I think

#

I am using my tablet so I cant test this but I think the idea is quite clear

#

import numpy as np

T = 1000
rho_1, rho_2, rho_3 = 0.2, 0.2, 0.1
mu, sigma = 0, 1
error = np.random.normal(mu, sigma, T)
# First three observations in time order: y(1)=1, y(2)=0.5, y(3)=1.5 in the MATLAB code
y_list = [1, 0.5, 1.5]
for t in range(3, T):
    # y_list[t-1], y_list[t-2], y_list[t-3] are the three most recent values
    y = rho_1 * y_list[t-1] + rho_2 * y_list[t-2] + rho_3 * y_list[t-3] + error[t]
    y_list.append(y)

#

Damn

#

I forgot index starts at 0

#

Just fix that and you should be good I guess

#

*fixed

light warren Nov 18, 2020, 4:33 PM

#

can someone help me with this error

#

https://gyazo.com/c5a698d166fe7a44e867e23b12cad4e3

sharp herald Nov 18, 2020, 4:37 PM

#

In Poisson regression, what does "deviance" mean?

heady hatch Nov 18, 2020, 4:40 PM

#

https://gyazo.com/c5a698d166fe7a44e867e23b12cad4e3
@light warren
I think your data has infs or nans in them.

#

Check your dataframe.

df.info()

light warren Nov 18, 2020, 4:41 PM

#

yeah it does, do u know how i can make it skip those data rows?

heady hatch Nov 18, 2020, 4:43 PM

#

You can drop the na via df.dropna().

light warren Nov 18, 2020, 4:46 PM

#

would i just add that to the above code?

heady hatch Nov 18, 2020, 4:47 PM

#

You're going to need to reassign it.

df = df.dropna()
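
One caveat worth sketching: with default pandas settings, dropna() removes NaN but leaves inf in place, so if the data really has infs as well, they have to be turned into NaN first. The frame below is made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data with both inf and NaN values.
df = pd.DataFrame({
    "x": [1.0, np.inf, 3.0, np.nan],
    "y": [2.0, 4.0, -np.inf, 8.0],
})

# Turn +/-inf into NaN, then drop every row that has any NaN.
clean = df.replace([np.inf, -np.inf], np.nan).dropna()
print(clean)
```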

fast plover Nov 18, 2020, 4:47 PM

#

Ok so I have a pandas dataframe i need to split into quintiles as I need to get the average of the top/bottom 20% of the rows in it by a given key (INDEX, an integer that's a calculated score)

#

having difficulty finding the function i need in the docs
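
A possible approach, sketched under assumptions: pd.qcut splits a column into equal-sized quantile buckets. INDEX is the key named above; the value column and the data are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: INDEX is the score, value is what gets averaged.
rng = np.random.default_rng(0)
scores = pd.DataFrame({"INDEX": np.arange(1, 101), "value": rng.normal(size=100)})

# labels=False gives integer bucket codes: 0 = lowest 20%, 4 = highest 20%.
scores["quintile"] = pd.qcut(scores["INDEX"], 5, labels=False)

bottom_mean = scores.loc[scores["quintile"] == 0, "value"].mean()
top_mean = scores.loc[scores["quintile"] == 4, "value"].mean()
print(bottom_mean, top_mean)
```

If many rows share the same INDEX, qcut can fail on duplicate bin edges; passing duplicates="drop" or bucketing on scores["INDEX"].rank(method="first") instead are two ways around that.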

sharp herald Nov 18, 2020, 4:49 PM

#

Is there a way to do the following transformation without that for loop? I am spliting rows and changing column names

def df_split_rows(df: pd.DataFrame):
    raw_df = {'Attacker': [], 'Defender': [], 'AttackerAdvantage': [], 'Damage': []}
    for _, row in df.iterrows():
        raw_df['Attacker'].append(row['Player1'])
        raw_df['Defender'].append(row['Player2'])
        raw_df['AttackerAdvantage'].append(1)
        raw_df['Damage'].append(row['Player1_score'])
        raw_df['Attacker'].append(row['Player2'])
        raw_df['Defender'].append(row['Player1'])
        raw_df['AttackerAdvantage'].append(0)
        raw_df['Damage'].append(row['Player2_score'])
    return pd.DataFrame(raw_df)
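
A loop-free sketch of the same transformation, assuming the frame has the four columns used above and a default unique index. It builds the two "views" of each row by renaming columns, stacks them, and interleaves them so the row order matches the loop version:

```python
import pandas as pd

def df_split_rows_vectorized(df: pd.DataFrame) -> pd.DataFrame:
    cols = ["Attacker", "Defender", "Damage"]
    # Player1 attacking Player2.
    first = df.rename(columns={"Player1": "Attacker", "Player2": "Defender",
                               "Player1_score": "Damage"})[cols].assign(AttackerAdvantage=1)
    # Player2 attacking Player1.
    second = df.rename(columns={"Player2": "Attacker", "Player1": "Defender",
                                "Player2_score": "Damage"})[cols].assign(AttackerAdvantage=0)
    out = pd.concat([first, second])
    # A stable sort on the original index interleaves the two halves.
    out = out.sort_index(kind="mergesort").reset_index(drop=True)
    return out[["Attacker", "Defender", "AttackerAdvantage", "Damage"]]

# Made-up example data.
games = pd.DataFrame({"Player1": ["a", "c"], "Player2": ["b", "d"],
                      "Player1_score": [3, 5], "Player2_score": [1, 2]})
result = df_split_rows_vectorized(games)
print(result)
```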

rich silo Nov 18, 2020, 4:52 PM

#

Hey guys does anyone knows of a way to dynamically create from a list of numerical data, a dataframe with 2 columns (Bins, frequency) AKA a frequency table, with the only arguments being the list with the data and the number of desired bins?
The bins should be of equal size. For ease here is a random list:

lst = [111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139]
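
One way this could be done, as a sketch: pd.cut with an integer bins argument makes equal-width bins over the data range, and value_counts tallies them. The function name frequency_table is made up here:

```python
import pandas as pd

def frequency_table(data, n_bins):
    # Equal-width bins spanning min(data)..max(data).
    binned = pd.cut(pd.Series(data), bins=n_bins)
    # Count per bin, ordered by bin rather than by frequency.
    counts = binned.value_counts().sort_index()
    return pd.DataFrame({"Bins": counts.index.astype(str),
                         "Frequency": counts.to_numpy()})

lst = list(range(111, 140))  # same values as the list above
table = frequency_table(lst, 5)
print(table)
```

pd.cut widens the lowest edge slightly so the minimum value falls inside the first bin, which is why the first interval starts a little below 111.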

#

!code

arctic wedgeBOT Nov 18, 2020, 4:53 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

light warren Nov 18, 2020, 4:55 PM

#

@heady hatch thanks sooo much

bitter harbor Nov 18, 2020, 5:39 PM

#

@rich silo range(111, 140)?

cosmic prairie Nov 18, 2020, 5:41 PM

#

can anyone give me some help with running this code im on help-aluminium?

light warren Nov 18, 2020, 6:02 PM

#

https://gyazo.com/a5c28edcbd41640f4f3e8c003247b0e3 what does this error mean?

earnest herald Nov 18, 2020, 6:07 PM

#

Can someone point me out if I am wrong here. I am making an image recognition application (for verifying signatures).

I am currently looking at Tensorflow to get the work done but there are just so many libraries such as OpenCV and LSH.

I was hoping someone can point out what I should implement in my code. PS- I have to make a check for mirrored images as well

I just joined the industry so go easy on me. Cheers!

sage rock Nov 18, 2020, 6:14 PM

#

📎 Screenshot_2020-11-18_234201.png

#

📎 Screenshot_2020-11-18_234220.png

#

getting an error while importing matplotlib

#

How do i resolve this?

earnest forge Nov 18, 2020, 7:21 PM

#

I have faced this problem before, solution is simple: update matplotlib

deep ingot Nov 18, 2020, 8:00 PM

#

hi , im struggling on starting this question that i have been given on data science, i tried doing it on my own thou im getting more confused

this is my guide line

#

📎 Screenshot_20201118-122127.png 📎 Screenshot_20201118-122109.png

neon shard Nov 18, 2020, 8:09 PM

#

@deep ingot What part are you struggling with?

deep ingot Nov 18, 2020, 8:10 PM

#

@neon shard i am new to data science , basically i started reading my notes and now application is calling and i dont know where to start. i have knowledge on the topic but dont know how to apply

neon shard Nov 18, 2020, 8:11 PM

#

Specifically, what in the above document are you stuck on?

deep ingot Nov 18, 2020, 8:11 PM

#

📎 unknown.png

#

what i have done

#

i just like to get a whole example of how i can do this question so i can do it by myself again if that makes sense

neon shard Nov 18, 2020, 8:14 PM

#

I don't think anyone is just going to do your whole homework assignment for you. You need to break it down into chunks, try it, and then ask for help on parts that you're stuck on

#

It looks like you need to generate the student number with range() or a loop. Do you know how to do that?

deep ingot Nov 18, 2020, 8:15 PM

#

uhhm is it like so

#

for x in range(150):

#

studentNumber = input()

neon shard Nov 18, 2020, 8:16 PM

#

That will require the user to input 150 numbers. I think it's asking you to just generate the IDs

#

So, for each loop you can just automatically generate an ID

#

You could do this if the IDs can be 1-150

ids = []
for x in range(1, 151):
    ids.append(x)

#

Or this which does the same thing in a cleaner way. It's a list comprehension

ids = [x for x in range(1, 151)]

deep ingot Nov 18, 2020, 8:20 PM

#

i perfer the 1st way u did it cause i have done that prev

lapis sequoia Nov 18, 2020, 8:33 PM

#

Anyone know if you can use NumPy to solve algebra problems?

cosmic prairie Nov 18, 2020, 8:35 PM

#

Hello does anyone know how I can get this code to run struggling at the minute?

📎 unknown.png

blazing bridge Nov 18, 2020, 8:36 PM

#

Don’t quote me on this but I don’t think you can run matplotlib on repl.it

cosmic prairie Nov 18, 2020, 8:38 PM

#

is there anyway around it

#

just remove it

#

can you run it on Jupyter notebook

worn hinge Nov 18, 2020, 8:41 PM

#

You can run it on any real IDE

cosmic prairie Nov 18, 2020, 8:42 PM

#

sorry my coding language isn the best fairly new to the game lol what do you mean

#

do u mean a debugger

blazing bridge Nov 18, 2020, 8:46 PM

#

Integrated Development Environment. A place to write and run code

#

Like pycharm

#

If you really wanna do it in the web you can use google colab or kaggle notebooks

cosmic prairie Nov 18, 2020, 8:47 PM

#

i dont know why it isnt running cause he ran it on repl.it

blazing bridge Nov 18, 2020, 8:48 PM

#

Or the best thing to use when using matplotlib and numpy is Jupyter notebook

cosmic prairie Nov 18, 2020, 8:48 PM

#

yeah unless I use that

#

nearly sure I was told you can do it but

blazing bridge Nov 18, 2020, 8:51 PM

#

Are you using a video

#

Like who ran it

cosmic prairie Nov 18, 2020, 8:52 PM

#

ahhh he sent me the code just on an email and I copied it

blazing bridge Nov 18, 2020, 8:54 PM

#

Ok maybe in the terminal you can try doing pip install matplotlib

#

Or !pip install matplotlib

#

Not really sure about this one

#

Just some advice:

#

Do yourself a favour and start running code on an IDE rather than using these web based editors

cosmic prairie Nov 18, 2020, 8:55 PM

#

could it be I am using the wrong code for matplotlib?

blazing bridge Nov 18, 2020, 8:56 PM

#

It doesn’t look like it is

cosmic prairie Nov 18, 2020, 8:58 PM

#

def plot_it(x,y,p): #(uncomment to start working on this function - optional)

plot_it(x,y,p)

#

are those plot commands needed

blazing bridge Nov 18, 2020, 9:00 PM

#

No these are just functions and the error is with matplotlib before anything else

#

Python goes from the top to bottom and when it encounters an error it displays it and stops, so it doesn’t show the other errors

cosmic prairie Nov 18, 2020, 9:01 PM

#

yeah I noticed that first error it sees it just tells you that one when you could have 5 more

#

What do you think is the best way round this problem?

blazing bridge Nov 18, 2020, 9:03 PM

#

Honestly, just use google colab instead

#

Much better

#

Before you copy and paste the code

#

Write !pip install matplotlib in the first code cell

cosmic prairie Nov 18, 2020, 9:04 PM

#

and then just copy the rest

blazing bridge Nov 18, 2020, 9:05 PM

#

Yeah try that

cosmic prairie Nov 18, 2020, 9:06 PM

#

right I will try that and get back to you,thanks @blazing bridge ,

blazing bridge Nov 18, 2020, 9:06 PM

#

Ok gl

cosmic prairie Nov 18, 2020, 9:08 PM

#

Im guessing that means its all good?

📎 unknown.png

#

seem to be getting no output tho

blazing bridge Nov 18, 2020, 9:10 PM

#

That’s because the code has to be separate from the installation

#

With cells you only get one output

#

Break the code up to different outputs

cosmic prairie Nov 18, 2020, 9:11 PM

#

How do I do that lol ?

heady falcon Nov 18, 2020, 9:17 PM

#

How is the cost function minimized in a neural network?

cosmic prairie Nov 18, 2020, 9:19 PM

#

hahha my code skills aint the best got this off a mate, I just need to which variables to change so that I get an output

serene scaffold Nov 18, 2020, 9:45 PM

#

I have a dataframe like this:

2         DNA  False
3         DNA  False
4         DNA  False
...       ...    ...
8790  nonDRNA  False

I need to get the percentage of rows where the boolean value is True, grouped by the first column.

#

df.groupby('class').count() is a great start but that counts every row; I could divide another dataframe by this

#

I want to say it's x.groupby('class').sum() / x.groupby('class').count() but idk what is being summed

cyan matrix Nov 18, 2020, 10:03 PM

#

anyone utilize pdftotext often? trying to pull out data from a huge PDF but having trouble with it

lapis sequoia Nov 18, 2020, 10:09 PM

#

I successfully made a script to load npz files. I need some advices about how can I extract datetime corresponding to some of the vars of the npz. Thanks for the help

📎 numpy.PNG

velvet thorn Nov 18, 2020, 10:59 PM

#

i have a dataframe where i want to calculate the returns for each ticker
@glad mulch what do you mean not working

#

I want to say it's x.groupby('class').sum() / x.groupby('class').count() but idk what is being summed
@serene scaffold the boolean value

serene scaffold Nov 18, 2020, 11:00 PM

#

@velvet thorn so it's just summing all numeric-like values in the dataframe?

low oracle Nov 18, 2020, 11:35 PM

#

Hey, I need a little help... Anyone here use jupyter on AWS?

serene scaffold Nov 18, 2020, 11:39 PM

#

@low oracle go ahead and ask what you would ask if someone said yes

low oracle Nov 18, 2020, 11:43 PM

#

@serene scaffold lol alrighty then. So I'm trying to utilize a tsv file into jupyter, I have looked around (YouTube, sof, etc...) and cant figure out how to properly work with my tsv data file

serene scaffold Nov 18, 2020, 11:44 PM

#

are you using pandas?

low oracle Nov 18, 2020, 11:44 PM

#

Attempting to yes

serene scaffold Nov 18, 2020, 11:44 PM

#

@low oracle if you're using pandas, it's pd.read_csv but you have to specify that tabs are the delimiter
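
A minimal sketch of that, using an in-memory string in place of the real file (the actual path is not known here, so 'data.tsv' would go where the StringIO is):

```python
import io
import pandas as pd

# Stand-in for the contents of a .tsv file; columns are made up.
tsv_text = "name\tvalue\nalpha\t1\nbeta\t2\n"

# sep="\t" tells read_csv that tabs, not commas, separate the fields.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t")
print(df)
```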

#

!docs pandas.read_csv

arctic wedgeBOT Nov 18, 2020, 11:44 PM

#

`pandas.read_csv`

pandas.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, [...]```
Read a comma-separated values (csv) file into DataFrame.

Also supports optionally iterating or breaking of the file into chunks.

Additional help can be found in the online docs for [IO Tools](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html).

Parameters  **filepath\_or\_buffer**str, path object or file-like objectAny valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: <file://localhost/path/to/table.csv>.

If you want to pass in a path object, pandas accepts any `os.PathLike`.

By file-like object, we refer to objects with a `read()` method, such as a file handler (e.g. via builtin `open` function) or `StringIO`.... [read more](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html#pandas.read_csv)

serene scaffold Nov 18, 2020, 11:45 PM

#

and then you get to experience the joy of learning how to use pandas, which as you can see from my earlier question is not something I've fully accomplished myself.

#

but there are people who hang out in this channel who have

low oracle Nov 18, 2020, 11:47 PM

#

This is what I have so far

📎 Screenshot_2020-11-18_184704.png

hollow gull Nov 19, 2020, 12:47 AM

#

@low oracle those seem like really strange errors. Are you creating a spark session somewhere? Are you trying to use pyspark?

low oracle Nov 19, 2020, 12:48 AM

#

Yeah I made that mistake and changed to conda-python3

#

but I'm still having issues

#

<ipython-input-12-530695f4cce5> in <module>
      2 import matplotlib.pyplot as plt
      3 tsv_file = open('data.tsv')
----> 4 read_tsv = csv.reader(tsv_file, delimiter='\t')

NameError: name 'csv' is not defined

hollow gull Nov 19, 2020, 12:50 AM

#

you just haven't imported csv it looks like

low oracle Nov 19, 2020, 12:51 AM

#

ok that changed things

#

Now it's saying tsv_file does not exist

hollow gull Nov 19, 2020, 1:05 AM

#

okay, maybe your path is wrong.

slender nymph Nov 19, 2020, 1:35 AM

#

guys i need some help : well i need to do a list like y = 5 ; y = 0.98 *y; y = 0.90 *y

#

for i in range(1, 20):
    np.random.seed(1000)
    y[i] = y[i-1]+ np.random.normal(0,1,size=200)
after i need to use them here

#

someone can help me?

hollow gull Nov 19, 2020, 1:47 AM

#

@glad mulch there is a argument in df.dropna that lets you specify the axis. Think you want axis=1 for column or axis=0 for row, so axis=1 for you.

#

@slender nymph I don't really understand your question based on what you have said.

hollow gull Nov 19, 2020, 2:12 AM

#

!docs pandas.DataFrame.dropna

arctic wedgeBOT Nov 19, 2020, 2:12 AM

#

`pandas.DataFrame.dropna`

DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
Remove missing values.

See the [User Guide](../../user_guide/missing_data.html#missing-data) for more on which values are considered missing, and how to work with missing data.

Parameters  **axis**{0 or ‘index’, 1 or ‘columns’}, default 0Determine if rows or columns which contain missing values are removed.

• 0, or ‘index’ : Drop rows which contain missing values.

• 1, or ‘columns’ : Drop columns which contain missing value.

Changed in version 1.0.0: Pass tuple or list to drop on multiple axes. Only a single axis is allowed.

**how**{‘any’, ‘all’}, default ‘any’Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.

• ‘any’ : If any NA values are present, drop that row or column.

• ‘all’ : If all values are NA, drop that row or column.

**thresh**int, optionalRequire that many non-NA values.... [read more](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html#pandas.DataFrame.dropna)

ruby wyvern Nov 19, 2020, 2:39 AM

#

Can anyone help me out with this question?
https://stackoverflow.com/questions/64904587/how-to-generate-a-list-of-tokens-that-are-most-likely-to-occupy-the-place-of-a-m

Stack Overflow

How to generate a list of tokens that are most likely to occupy the...

How to generate a list of tokens that are most likely to occupy the place of a missing token in a given sentence?
I've found this StackOverflow answer, however, this only generates a possible word,...

#

Basically, I've found a StackOverflow answer, but this did not answer the question.

boreal summit Nov 19, 2020, 3:23 AM

#

Mehn, I was practicing and stuff, training a heavy dataset on my PC, now my PC is not responding. This is the first time. 😂

#

The activity light is on ATM. Normally, it blinks once every minute but now it's just on. Don't know if I should shut down and restart or leave it.

#

I first did some processing like oneHotEncoder and stuff, then I scaled it using standard scaler, used isomap to reduce it to 3 components, fit it using linear regression, make predictions and check the mean accuracy score. That's basically what I was doing.

#

Even the clock on my PC is not working anymore. I'll leave it for 10 more minutes.

#

@glad mulch not sure if the first guy answered your question well enough. If you use axis=1, it drops columns that contain null values; there are other parameters you can use to fine-tune this, like thresh and how. If you use axis=0, it drops any row that contains NaN, and you can set thresh and how for that too. You can read the documentation to get insights about the other parameters.
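
A small sketch of the axis and thresh behaviour described above, on a made-up frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [np.nan, np.nan, 9.0],
})

rows_dropped = df.dropna(axis=0)            # drop rows containing any NaN
cols_dropped = df.dropna(axis=1)            # drop columns containing any NaN
cols_thresh = df.dropna(axis=1, thresh=2)   # keep columns with at least 2 non-NaN values

print(rows_dropped, cols_dropped, cols_thresh, sep="\n\n")
```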

velvet thorn Nov 19, 2020, 4:04 AM

#

@velvet thorn so it's just summing all numeric-like values in the dataframe?
@serene scaffold yup

#

count, on the other hand, counts non-null values
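
Since True sums as 1 and False as 0, sum divided by count is just the mean of the boolean column, so the percentage per class can be sketched like this. The column name flag is hypothetical, as the real column name is not shown above:

```python
import pandas as pd

# Made-up data in the same shape as the frame in the question.
x = pd.DataFrame({
    "class": ["DNA", "DNA", "DNA", "nonDRNA", "nonDRNA"],
    "flag": [True, False, False, True, True],
})

# Mean of a boolean column is the fraction of True values.
pct_true = x.groupby("class")["flag"].mean() * 100
print(pct_true)
```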

#

@velvet thorn ok, lemme do a different question. if i wanted to skip the first date in my data frame how would i do that in multiindex
@glad mulch are you thinking of .iloc

serene scaffold Nov 19, 2020, 4:05 AM

#

how is iloc different from loc?

velvet thorn Nov 19, 2020, 4:08 AM

#

how is iloc different from loc?
@serene scaffold .loc takes boolean series or string indexers (labels, strictly speaking)

#

.iloc takes boolean arrays or positional indexers

#

so one common pattern is

#

selecting a subset of a DataFrame by applying a condition to the rows and taking only some columns

#

e.g. df.loc[df['value'] > 3000, ['colour', 'model']]
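
To make the label vs position difference concrete, here is a small sketch on a made-up frame with the same column names as the example above. Both selections return the same rows and columns:

```python
import pandas as pd

cars = pd.DataFrame(
    {"value": [2500, 4000, 5200],
     "colour": ["red", "blue", "green"],
     "model": ["A", "B", "C"]},
    index=["x", "y", "z"],
)

# .loc: boolean mask for the rows, column labels for the columns.
by_label = cars.loc[cars["value"] > 3000, ["colour", "model"]]

# .iloc: integer positions for both rows and columns.
by_position = cars.iloc[[1, 2], [1, 2]]

print(by_label)
print(by_position)
```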

serene scaffold Nov 19, 2020, 4:10 AM

#

huh, I didn't think that worked

velvet thorn Nov 19, 2020, 4:10 AM

#

whoops I forgot the .loc

#

LOL

serene scaffold Nov 19, 2020, 4:10 AM

#

I didn't think that worked, either

#

but the first one doesn't? (without the .loc)

velvet thorn Nov 19, 2020, 4:12 AM

#

but the first one doesn't? (without the .loc)
@serene scaffold nope

#

but with .loc it does

#

!e

import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
print(df.loc[df['a'] > 2, ['b']], end='\n\n')
print(df[df['a'] > 2, ['b']])

arctic wedgeBOT Nov 19, 2020, 4:14 AM

#

@velvet thorn :x: Your eval job has completed with return code 1.

001 |    b
002 | 1  4
003 | 
004 | Traceback (most recent call last):
005 |   File "<string>", line 5, in <module>
006 |   File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 2906, in __getitem__
007 |     indexer = self.columns.get_loc(key)
008 |   File "/usr/local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
009 |     return self._engine.get_loc(casted_key)
010 |   File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
011 |   File "pandas/_libs/index.pyx", line 75, in pandas._libs.index.IndexEngine.get_loc
... (truncated - too many lines)

Full output: https://paste.pythondiscord.com/ipoquzinuz.txt

serene scaffold Nov 19, 2020, 4:15 AM

#

@velvet thorn I'm learning lemon_hyperpleased

lapis sequoia Nov 19, 2020, 4:29 AM

#

h

undone flare Nov 19, 2020, 5:33 AM

#

OwO snekbox back

lapis sequoia Nov 19, 2020, 7:27 AM

#

(Noob) I successfully made a script to load npz files. I need some advices about how can I extract datetime corresponding to some of the vars of the npz. Thanks for the help

📎 numpy.PNG

cobalt jetty Nov 19, 2020, 8:31 AM

#

if the position of the dates are the same in the files, can't you loop through it and record them in a list?

#

the datetime module would be helpful if you need to reformat those dates later on.

lapis sequoia Nov 19, 2020, 9:02 AM

#

What do you mean by loop through it and record them in a list? How can I capture those data

cobalt jetty Nov 19, 2020, 9:12 AM

#

it seems that you have an array.
So your array must have items like rows? Using a for loop wouldn't you be able to go through that array?

📎 unknown.png

lapis sequoia Nov 19, 2020, 9:18 AM

#

Yes, I thought the same, the thing is that I have no clue about how to extract some of the columns of the npz

#

just regular slicing isn't it?

dim moss Nov 19, 2020, 9:20 AM

#

where can I learn data science

lapis sequoia Nov 19, 2020, 9:21 AM

#

where can I learn data science
@dim moss datacamp

dim moss Nov 19, 2020, 9:23 AM

#

is it free

cobalt jetty Nov 19, 2020, 9:24 AM

#

If you're using python, Nass, and you assigned that array to a variable name, type

type(var_name)

#

to see what you got.

lapis sequoia Nov 19, 2020, 9:24 AM

#

📎 vars.PNG

#

these are the vars included in the data files

#

📎 txt.PNG

cobalt jetty Nov 19, 2020, 9:25 AM

#

I'm not asking about what is in the file. I'm asking about the type of your data structure in your python shell.

lapis sequoia Nov 19, 2020, 9:26 AM

#

2 sec

#

I'll check it

cobalt jetty Nov 19, 2020, 9:26 AM

#

cuz the type/structure will impact what you can do with it.

dim moss Nov 19, 2020, 9:27 AM

#

is data camp free

cobalt jetty Nov 19, 2020, 9:28 AM

#

The best way to pick up data science is you build yourself your own project, cloneb.

#

Practice is important

dim moss Nov 19, 2020, 9:31 AM

#

I have not learnt datascience

#

I am looking for a free source

#

to learn data science

molten hamlet Nov 19, 2020, 9:37 AM

#

@dim moss check maybe some pinned messages

dim moss Nov 19, 2020, 10:08 AM

#

@dim moss check maybe some pinned messages
@molten hamlet nah

molten hamlet Nov 19, 2020, 10:09 AM

#

try kaggle

#

pros are there

#

i think

dim moss Nov 19, 2020, 10:09 AM

#

what is kaggle

#

oh it is google some community

#

I need a free data science learning source

lapis sequoia Nov 19, 2020, 10:25 AM

#

I need help with an array, is someone available ?

molten hamlet Nov 19, 2020, 10:33 AM

#

@dim moss go to kaggle

#

now

#

grab any set

#

and start some tutorial 😄

dim moss Nov 19, 2020, 10:35 AM

#

what?

lapis sequoia Nov 19, 2020, 10:36 AM

#

yo can i DM anyone with my colab link my shit is crashing

#

like im tryna run this GAN but its crashing

lapis sequoia Nov 19, 2020, 11:26 AM

#

@cobalt jetty sorry to bother you would you be able to help me

cobalt jetty Nov 19, 2020, 11:26 AM

#

Hey, I'm still in class for the next 5 hours. Maybe then. But understand I've never implemented a GAN.

lapis sequoia Nov 19, 2020, 11:28 AM

#

no worries, is it cool if i DM you and you can look at it when you're available?

cobalt jetty Nov 19, 2020, 11:31 AM

#

I'll ping you here when I'm available.

deep ingot Nov 19, 2020, 11:40 AM

#

Hi, I have to calculate the percentage for a maths score, but with a maximum of 130. Can someone assist me?

#

📎 20201119_134116.jpg
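
The attached image is not available here, so the following is only a minimal sketch of the arithmetic being asked about: a raw score divided by the maximum (130) and multiplied by 100. The function name `percentage` and the sample score are illustrative assumptions.

```python
def percentage(score, max_score=130):
    """Convert a raw score into a percentage of max_score."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return score / max_score * 100


# Example with a made-up score: 104 out of 130.
print(round(percentage(104), 1))  # 80.0
```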

sage rock Nov 19, 2020, 12:18 PM

#

I have faced this problem before, solution is simple: update matplotlib
@earnest forge This didnt work

#

ImportError: DLL load failed while importing ft2font: The specified procedure could not be found.

#

Still getting this

#

if anyone could help me out

boreal summit Nov 19, 2020, 1:25 PM

#

@dim moss first, if you know Python basics and stuff, you should learn data analysis before moving to data science.

#

I could help you with resources to learn data analysis and data science, PDF files.

prisma isle Nov 19, 2020, 1:44 PM

#

Is there a faster way to find a linear combination of several large numpy memmaps?

#

I can't load them all into memory
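
One common approach, sketched here under assumptions, is to process the memmaps in fixed-size chunks so that only one slice of each array is in RAM at a time. The helper name `combine_memmaps`, the file names and the chunk size are hypothetical, and whether this is faster than the original code depends on details not given in the chat.

```python
import os
import tempfile

import numpy as np


def combine_memmaps(inputs, weights, out, chunk=1_000_000):
    """Write sum(w_i * inputs[i]) into `out`, one chunk at a time."""
    n = out.shape[0]
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        # Only this slice of each input is pulled into RAM.
        acc = np.zeros(stop - start, dtype=out.dtype)
        for w, arr in zip(weights, inputs):
            acc += w * arr[start:stop]
        out[start:stop] = acc
    out.flush()
    return out


# Tiny demo with made-up data in a temporary directory.
tmp = tempfile.mkdtemp()
n = 10_000
a = np.memmap(os.path.join(tmp, "a.dat"), dtype="float64", mode="w+", shape=(n,))
b = np.memmap(os.path.join(tmp, "b.dat"), dtype="float64", mode="w+", shape=(n,))
a[:] = np.arange(n)
b[:] = 1.0
result = np.memmap(os.path.join(tmp, "out.dat"), dtype="float64", mode="w+", shape=(n,))
combine_memmaps([a, b], [2.0, -3.0], result, chunk=1024)
```

Larger chunks mean fewer Python-level iterations, so the chunk size is a trade-off between speed and peak memory.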

earnest forge Nov 19, 2020, 1:58 PM

#

@sage rock try to import matplotlib without %inline

lapis sequoia Nov 19, 2020, 1:59 PM

#

can anyone help me figure this error out

#

ValueError: Dimensions must be equal, but are 16 and 60000 for '{{node mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](generator/activation_17/Relu, mean_squared_error/Cast/x)' with input shapes: [16,28,28,1], [60000,28,28,1].

#

im inputting a dataset with batch size of 16

#

but i cant seem to batch the mnist dataset to the same size

earnest forge Nov 19, 2020, 3:10 PM

#

@lapis sequoia it'd be better if you provided an excerpt from your code

lapis sequoia Nov 19, 2020, 3:11 PM

#

i can paste it one sec

#

https://pastebin.com/VB0Ve6T0

Pastebin

ganfull - Pastebin.com


#

@earnest forge

#

i just changed the batch size to 16

earnest forge Nov 19, 2020, 3:13 PM

#

._.

#

i meant particular part of code

#

not the whole

lapis sequoia Nov 19, 2020, 3:13 PM

#

oh

earnest forge Nov 19, 2020, 3:14 PM

#

ValueError: Dimensions must be equal, but are 16 and 60000 for '{{node mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](generator/activation_17/Relu, mean_squared_error/Cast/x)' with input shapes: [16,28,28,1], [60000,28,28,1].
@lapis sequoia anyway, it says your second input is the size of 60000, not 16

lapis sequoia Nov 19, 2020, 3:14 PM

#

lol

earnest forge Nov 19, 2020, 3:14 PM

#

make sure you pass it the right data

halcyon vale Nov 19, 2020, 3:25 PM

#

https://github.com/ThinamXx/300Days__MachineLearningDeepLearning/blob/main/README.md

GitHub

ThinamXx/300Days__MachineLearningDeepLearning

I am sharing my Journey of 300DaysOfData in Machine Learning and Deep Learning. - ThinamXx/300Days__MachineLearningDeepLearning

lapis sequoia Nov 19, 2020, 4:16 PM

#

just wanted to say that i think ive gotten my model working and i want to thank everyone in here for all their help because i think i was on the verge of a mental breakdown i have a very good feeling that this shit will look like hot ass but we made it

tribal wind Nov 19, 2020, 4:21 PM

#

is anyone good with nltk that I could ask a beginner question?

hasty orchid Nov 19, 2020, 4:22 PM

#

When was save_fig() introduced in pyplot? Is it recent?

molten hamlet Nov 19, 2020, 5:00 PM

#

no

#

i think it was always in there

lapis sequoia Nov 19, 2020, 5:12 PM

#

Anyone do Monte Carlo sin?

#

Sim

glad kestrel Nov 19, 2020, 5:15 PM

#

Does anybody here know good dimensionality reduction techniques for binary data? I'm working on a data science project, with a dataset of 130 binary attributes. I'm looking for something that could be easily implemented in Python using sklearn or similar

#

@tribal wind I have some experience with nltk, so shoot your shot

lapis sequoia Nov 19, 2020, 5:40 PM

#

Anyone do Monte Carlo sim?

hasty orchid Nov 19, 2020, 5:57 PM

#

They are very easy to find through Google Scholar

cobalt jetty Nov 19, 2020, 6:14 PM

#

You should increase your batch size if your hardware allows it, @lapis sequoia

#

You're also missing

```
validation_split=0.2,
subset="validation"
```

in your preprocessing function to create test_ds

ancient venture Nov 19, 2020, 6:32 PM

#

Hi, I am very new to Python. By that I mean, I only started learning it about 1 month ago as part of my university physics course.

#

I have been given a csv file with wavelength (x) and 31 sets of observations (y)

#

I need to fit a linear + gaussian model for each observation (though if I do it for one I can probably repeat a similar thing for the remaining observations)

#

I have fit a linear model to the first observation data but I am struggling with fitting a gaussian fit

#

We are using the numpy and matplotlib packages

#

📎 unknown.png

#

📎 unknown.png

#

As you can see the linear fit is there

#

I am supposed to fit a gaussian to it, similar to how it is done in this example

#

📎 unknown.png

#

If someone could help me with using the scipy.optimize.curve_fit() function, it would be appreciated

#

I need to make initial guesses for the peak and width of the gaussian, and that should hopefully be automated so that I can repeat it for the other 31 observations
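
A minimal sketch of one way to do this with `scipy.optimize.curve_fit`, assuming the model is a straight line plus a single Gaussian peak. The data below is synthetic (the real CSV is not shown in the chat), and the helper names `line_plus_gaussian` and `initial_guess`, as well as the choice of width guess, are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit


def line_plus_gaussian(x, m, c, amp, mu, sigma):
    """Straight-line background plus a Gaussian peak."""
    return m * x + c + amp * np.exp(-((x - mu) ** 2) / (2 * sigma**2))


def initial_guess(x, y):
    """Rough starting values derived from the data itself."""
    m, c = np.polyfit(x, y, 1)         # background from a plain linear fit
    resid = y - (m * x + c)
    i = np.argmax(np.abs(resid))       # largest deviation = candidate peak or dip
    amp, mu = resid[i], x[i]
    sigma = (x.max() - x.min()) / 20   # arbitrary fraction of the x range
    return [m, c, amp, mu, sigma]


# Synthetic stand-in for one observation column (not the real data).
rng = np.random.default_rng(0)
x = np.linspace(500.0, 600.0, 200)
y = line_plus_gaussian(x, 0.02, 5.0, 3.0, 550.0, 4.0) + rng.normal(0, 0.1, x.size)

p0 = initial_guess(x, y)
popt, pcov = curve_fit(line_plus_gaussian, x, y, p0=p0)
print(popt)  # fitted m, c, amp, mu, sigma
```

Because `initial_guess` only looks at the data, the same two calls could be repeated in a loop over the remaining observation columns. If a fit still comes out as a plain line, the starting `mu` and `sigma` are the first things worth checking.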

ancient venture Nov 19, 2020, 8:23 PM

#

I tried something on the 2nd column of data

📎 unknown.png

#

but I still only get a linear fit

#

📎 unknown.png

lapis sequoia Nov 19, 2020, 8:41 PM

#

hi, i need help

#

ModuleNotFoundError: No module named 'scitools.std'

ivory panther Nov 19, 2020, 11:11 PM

#

Hello everybody, is there a Discord channel or a forum focused on Python pandas?

grave path Nov 19, 2020, 11:15 PM

#

How can I make them go next to each other

#

📎 unknown.png

velvet thorn Nov 20, 2020, 12:05 AM

#

grave path How can I make them go next to each other

create a Figure with multiple Axes and pass them manually

velvet thorn Nov 20, 2020, 12:05 AM

#

ivory panther Hello everybody, is there a Discord channel or a forum focused on Python pandas?

this is it, unless you mean a different server

velvet thorn Nov 20, 2020, 12:06 AM

#

hasty orchid When was save_fig() introduced in pyplot? Is it recent?

a long time ago

velvet thorn Nov 20, 2020, 12:06 AM

#

lapis sequoia ModuleNotFoundError: No module named 'scitools.std'

you're missing a dependency. did you try Googling?

hasty orchid Nov 20, 2020, 1:53 AM

#

velvet thorn a *long* time ago

Actually I think it is a function I was supposed to create... not sure if there is a save fig method?

#

Was sorta testing the new reply feature
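
For reference, pyplot's own function is spelled `savefig` (no underscore); a `save_fig` would have to be a user-defined wrapper. The helper below is a hypothetical sketch of such a wrapper, not the one from any particular course or book.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # headless backend so no window is needed
import matplotlib.pyplot as plt


def save_fig(name, directory=".", ext="png", dpi=150):
    """Hypothetical helper: save the current figure via plt.savefig."""
    path = os.path.join(directory, f"{name}.{ext}")
    plt.savefig(path, dpi=dpi, bbox_inches="tight")
    return path


out_dir = tempfile.mkdtemp()
plt.plot([0, 1, 2], [0, 1, 4])
saved = save_fig("demo", directory=out_dir)
plt.close()
```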

frozen gazelle Nov 20, 2020, 4:05 AM

#

Hey. I need some help with data science within 2 hours. Can I pay somebody here to help me 1:1?

#

Sorry if the wrong place. Have yet to find anyone who wants to help me 😦

#

Paying $60 for some quick q&a

#

📎 Screen_Shot_2020-11-19_at_10.png

dim moss Nov 20, 2020, 4:47 AM

#

boreal summit @dim moss first, if you know Python basics and stuff, you should lea...

I know python basics

north plinth Nov 20, 2020, 5:10 AM

#

Can anybody tell me how feed forward works in a conv model

velvet thorn Nov 20, 2020, 5:11 AM

#

north plinth Can anybody tell me how feed forward works in a conv model

what do you mean?

#

a convolutional layer?

north plinth Nov 20, 2020, 5:12 AM

#

I mean like we have 32 filter in the first conv layer

#

Yes

velvet thorn Nov 20, 2020, 5:12 AM

#

what dimension?

north plinth Nov 20, 2020, 5:12 AM

#

Conv2d

velvet thorn Nov 20, 2020, 5:12 AM

#

okay

#

so what specifically are you confused about

north plinth Nov 20, 2020, 5:12 AM

#

So we have 32 filters

#

That mean if i enter 1 img that img will be 32 imgs after passing the first conv layer

#

Isnt it?

velvet thorn Nov 20, 2020, 5:14 AM

#

uh

#

no?

#

okay, purely in the abstract sense

#

and assuming you're using same padding

north plinth Nov 20, 2020, 5:14 AM

#

Yaa

#

Then the maxpooling will shrink the imgs

velvet thorn Nov 20, 2020, 5:15 AM

#

say each image is of shape (w, h, c) (width, height, channels) and you pass it into a layer with k filters, the output will be of shape (w, h, k).

north plinth Nov 20, 2020, 5:15 AM

#

Okay that makes some sense

#

I thought each filter will be applied on the img and get a new img

velvet thorn Nov 20, 2020, 5:16 AM

#

no

north plinth Nov 20, 2020, 5:16 AM

#

But it is not that simple

velvet thorn Nov 20, 2020, 5:16 AM

#

each filter spans all the input channels and produces one output channel.

north plinth Nov 20, 2020, 5:18 AM

#

Thanks dude

velvet thorn Nov 20, 2020, 5:19 AM

#

therefore, the number of parameters for a layer with k filters of size (fw, fh) is (fw * fh) * fc * k + k, where fc is the number of input channels (c above) and the + k is one bias per filter

#

yw
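
The shape and parameter arithmetic above can be checked with a few lines of plain Python. This is a sketch assuming a standard 2D convolution with "same" padding, stride 1, and one bias per filter; the function names and the example layer sizes are made up for illustration.

```python
def conv2d_output_shape(w, h, c, k):
    """Output shape for k filters with 'same' padding and stride 1."""
    # The c input channels are consumed; each filter yields one output channel.
    return (w, h, k)


def conv2d_param_count(fw, fh, c, k):
    """Weights (fw * fh * c per filter, k filters) plus k biases."""
    return fw * fh * c * k + k


print(conv2d_output_shape(28, 28, 1, 32))  # (28, 28, 32)
print(conv2d_param_count(3, 3, 1, 32))     # 320
print(conv2d_param_count(3, 3, 32, 64))    # 18496
```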

whole mica Nov 20, 2020, 5:21 AM

#

Hey guys!

#

How are yall?

#

is there anyone here who is really good with neural networking?

velvet thorn Nov 20, 2020, 5:31 AM

#

whole mica is there anyone here who is really good with neural networking?

just ask your question

whole mica Nov 20, 2020, 5:36 AM

#

Ok ok. So i am wanting to build a game bot that plays through Pokemon Emerald all on its own. But i am having trouble getting it set up

velvet thorn Nov 20, 2020, 5:42 AM

#

whole mica Ok ok. So i am wanting to build a game bot that plays through Pokemon Emerald al...

well

#

maybe you could elaborate on what trouble you're having

#

in general it would be better to ask a question

#

which people can answer

#

without having to ask for more info.

whole mica Nov 20, 2020, 5:45 AM

#

just about everything. I cannot find any resources on how to create a game bot.

velvet thorn Nov 20, 2020, 5:45 AM

#

well

#

then this probably isn't a good place to start

#

it's not as difficult or specialised as it was

#

but it's still a fair bit of work.

whole mica Nov 20, 2020, 5:46 AM

#

I'm wanting it to play on an emulator and i want the bot to recognize the emulator and play through it

velvet thorn Nov 20, 2020, 5:46 AM

#

how much Python and DL experience do you have

north plinth Nov 20, 2020, 5:46 AM

#

Specific game bot should be easy

#

Start with pyautogui

whole mica Nov 20, 2020, 5:46 AM

#

uhhhhhh

#

ive made a classifier

north plinth Nov 20, 2020, 5:47 AM

#

So what is ur problem

whole mica Nov 20, 2020, 5:47 AM

#

how do i get it to recognize the emulator as a whole?

north plinth Nov 20, 2020, 5:48 AM

#

What u r talking bro

#

Take a screen of ur game

#

Feed it into ur classifier

#

Predict an action

whole mica Nov 20, 2020, 5:48 AM

#

uh

#

gonna be honest

#

do not really know how the classifier works

#

just kinda built it

north plinth Nov 20, 2020, 5:49 AM

#

It doesnt predict the right?

velvet thorn Nov 20, 2020, 5:49 AM

#

north plinth Feed it into ur classifier

I am going to predict that this will not work

#

at all

north plinth Nov 20, 2020, 5:49 AM

#

Or got error during training

whole mica Nov 20, 2020, 5:49 AM

#

no it does

#

it works perfect

north plinth Nov 20, 2020, 5:50 AM

#

velvet thorn I am going to predict that this will not work

Y bro?

whole mica Nov 20, 2020, 5:50 AM

#

i think itll work

velvet thorn Nov 20, 2020, 5:50 AM

#

many reasons, but the simplest one is

#

not much of the gamestate is exposed by the screen

velvet thorn Nov 20, 2020, 5:51 AM

#

whole mica i think itll work

sure, go ahead then

#

good luck!

north plinth Nov 20, 2020, 5:51 AM

#

Yaaa

whole mica Nov 20, 2020, 5:51 AM

#

i just need a lot of help

#

what if i can get the games code?

north plinth Nov 20, 2020, 5:51 AM

#

Is it a strategy game?

whole mica Nov 20, 2020, 5:51 AM

#

its pokemon

north plinth Nov 20, 2020, 5:52 AM

#

Never tried it out.. I've made a bot but it was for dino run

velvet thorn Nov 20, 2020, 5:52 AM

#

north plinth Never tried it out.. I've made a bot but it was for dino run

which is a totally different game from Pokemon

north plinth Nov 20, 2020, 5:52 AM

#

Which worked pretty nice as it can be easily predict through screen data

velvet thorn Nov 20, 2020, 5:52 AM

#

yes

#

which is my point

velvet thorn Nov 20, 2020, 5:53 AM

#

north plinth Which worked pretty nice as it can be easily predict through screen data

doing this would be like creating a poker AI that looks only at what cards you have

whole mica Nov 20, 2020, 5:53 AM

#

well if you have extensive knowledge of the game it should not be hard to know the game state, just coding it is the hard part for me

velvet thorn Nov 20, 2020, 5:53 AM

#

whole mica well if you have extensive knowledge of the game it should not be hard to know t...

hm are you sure you understand what my problem is

whole mica Nov 20, 2020, 5:54 AM

#

i think i do, but i am not sure

velvet thorn Nov 20, 2020, 5:54 AM

#

okay.

#

so

#

you said you think

#

it's enough to predict the next action

#

given a screencap.

#

that doesn't make sense for multiple reasons.

#

the first is that a screencap won't, for example, take into account what Pokemon you have

#

or where in the story you are (because you might need to backtrack, for example)

#

the second is that you're going to need a ton of training data that more or less requires you to play the game yourself

#

where will you get that?

whole mica Nov 20, 2020, 5:55 AM

#

well, if i play the game will that work?

velvet thorn Nov 20, 2020, 5:55 AM

#

probably not

#

not enough data

whole mica Nov 20, 2020, 5:55 AM

#

what if i get a few people to play it?

velvet thorn Nov 20, 2020, 5:56 AM

#

you could try it

#

but my guess is not enough data

#

if you said

#

write AI to handle a subset of the game

#

like battling

#

that would be much simpler

#

to play the whole game?

#

I think you underestimate the scope of that project.

#

by a lot.

#

unless you hardcode a ton of stuff, but even then

#

Pokemon vs Dino Run is like chess vs tic-tac-toe.

whole mica Nov 20, 2020, 5:57 AM

#

even if i know the game in and out?

#

that would not help ?

velvet thorn Nov 20, 2020, 5:57 AM

#

whole mica that would not help ?

how are you going to translate your knowledge into code?

#

it's not that it wouldn't help

#

that is the very least you need

#

and it's nowhere near enough

whole mica Nov 20, 2020, 5:57 AM

#

i do not know. That is why i need help.

north plinth Nov 20, 2020, 5:58 AM

#

Translate the image data into feature set as u know bout the game

velvet thorn Nov 20, 2020, 5:58 AM

#

whole mica i do not know. That is why i need help.

I would suggest you build an AI for a much simpler game first

whole mica Nov 20, 2020, 5:58 AM

#

The reason i want to do this big of a project is because i want to learn as i go

velvet thorn Nov 20, 2020, 5:58 AM

#

then you can properly appreciate how difficult this is

velvet thorn Nov 20, 2020, 5:58 AM

#

north plinth Translate the image data into feature set as u know bout the game

also it wouldn't be efficient to do this IMO

rotund sail Nov 20, 2020, 5:58 AM

#

I have a decently simple data science question that I posted in #help-croissant , if anyone would be able to help me out it would be much appreciated 🙂

velvet thorn Nov 20, 2020, 5:58 AM

#

would probably make more sense to just pull data from the game's memory

velvet thorn Nov 20, 2020, 5:59 AM

#

rotund sail I have a decently simple data science question that I posted in #help-croissant...

don't advertise your help channels please

#

but you can just post here

#

the question

#

@rotund sail pandas methods in general make copies; they do not modify inplace.

#

anyway, that's a bad way to do things

#

you should use vectorised filtering

#

penguin_data = penguins.loc[penguins['species'] != 'Chinstrap', ['species', 'flipper_length_mm']]

#

look up the .loc indexer.

#

in general, if you have a for loop in pandas code, you're doing things wrong.
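
A small runnable sketch of the vectorised filtering described above. The DataFrame here is a made-up stand-in for the penguins data (the real one is not shown in the chat), and the extra `body_mass_g` column is only there to show that `.loc` also selects columns.

```python
import pandas as pd

# Made-up stand-in for the penguins data discussed above.
penguins = pd.DataFrame({
    "species": ["Adelie", "Chinstrap", "Gentoo", "Chinstrap", "Adelie"],
    "flipper_length_mm": [181.0, 192.0, 217.0, 201.0, 186.0],
    "body_mass_g": [3750, 3500, 5000, 3800, 3650],
})

# Boolean mask: one True/False per row, computed in a single vectorised step.
mask = penguins["species"] != "Chinstrap"

# .loc[rows, columns] returns a new DataFrame; `penguins` itself is unchanged.
penguin_data = penguins.loc[mask, ["species", "flipper_length_mm"]]
print(penguin_data)
```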

whole mica Nov 20, 2020, 6:01 AM

#

well gm, what do you think would be easier?

velvet thorn Nov 20, 2020, 6:01 AM

#

also look up the inplace parameter.

rotund sail Nov 20, 2020, 6:01 AM

#

Oh really? I took a data science course at my university last year and they wanted it

velvet thorn Nov 20, 2020, 6:01 AM

#

whole mica well gm, what do you think would be easier?

there is no easy way.

velvet thorn Nov 20, 2020, 6:01 AM

#

rotund sail Oh really? I took a data science course at my university last year and they want...

nope, it's wrong

#

100%

rotund sail Nov 20, 2020, 6:01 AM

#

Essentially every project we did utilized a for loop lol

velvet thorn Nov 20, 2020, 6:01 AM

#

200%, actually

velvet thorn Nov 20, 2020, 6:01 AM

#

whole mica well gm, what do you think would be easier?

but I would suggest at least learning the basics of DL (and in particular RL)

#

and then

#

writing an AI for a simpler game.

#

well I mean

#

you don't have to take what I say as the truth

whole mica Nov 20, 2020, 6:02 AM

#

no! i mean an easier game

velvet thorn Nov 20, 2020, 6:02 AM

#

like...tic-tac-toe or something

#

that's a good start

north plinth Nov 20, 2020, 6:03 AM

#

Tic tac toe should be used to learn RL

whole mica Nov 20, 2020, 6:03 AM

#

ok cool! and yes im gonna take advice. I do not know what im doing haha !

north plinth Nov 20, 2020, 6:04 AM

#

Cos that would be too easy for DL

north plinth Nov 20, 2020, 6:04 AM

#

whole mica ok cool! and yes im gonna take advice. I do not know what im doing haha !

But eventually u will know

whole mica Nov 20, 2020, 6:04 AM

#

im trying, i dont have the money to go to school so im trying to learn online

velvet thorn Nov 20, 2020, 6:05 AM

#

you can try this

#

https://www.deeplearningbook.org/

#

it's free

rotund sail Nov 20, 2020, 6:07 AM

#

@velvet thorn Is data science your profession?

#

if you don't mind me asking that is

velvet thorn Nov 20, 2020, 6:08 AM

#

rotund sail if you don't mind me asking that is

not really?

#

just for fun

#

or rather

#

not right now

rotund sail Nov 20, 2020, 6:09 AM

#

Nice, was just curious since you seemed rather knowledgeable about it

velvet thorn Nov 20, 2020, 6:09 AM

#

but I used to be a DS/teach DS

#

thank you

#

I try 👋

rotund sail Nov 20, 2020, 6:09 AM

#

So why are for loops bad usage in DS?

velvet thorn Nov 20, 2020, 6:09 AM

#

rotund sail So why are for loops bad usage in DS?

no, not in DS

#

just in pandas (and not all the time, but in general)

#

okay, pandas DataFrames use numpy arrays for storage

#

these arrays have fixed sizes.

whole mica Nov 20, 2020, 6:10 AM

#

what do you do for work?

velvet thorn Nov 20, 2020, 6:10 AM

#

so every time you remove or add a column/row, you're actually creating a whole new array (and DataFrame wrapping it).

#

in that for loop, therefore

#

for every row that doesn't satisfy your condition, you create a new DataFrame

rotund sail Nov 20, 2020, 6:10 AM

#

oh, so it's just very inefficient

velvet thorn Nov 20, 2020, 6:10 AM

#

so say there are N such rows; you end up creating N - 1 throwaway DataFrames

#

yup

#

that's the first thing

#

secondly

#

modern processors have something called SIMD instructions

#

which basically let them perform arithmetic on more than one memory address at a time

#

if you use a for loop, this optimisation isn't triggered

rotund sail Nov 20, 2020, 6:12 AM

#

So, for efficiency purposes, I should read up on vectorization

velvet thorn Nov 20, 2020, 6:12 AM

#

as an illustration

velvet thorn Nov 20, 2020, 6:12 AM

#

rotund sail So, for efficiency purposes, I should read up on vectorization

import numpy as np; a = np.arange(10000000); b = [v + 1 for v in a]; print('comprehension done'); c = a + 1; print('vectorised done')

#

run this

#

and watch the prints

velvet thorn Nov 20, 2020, 6:13 AM

#

whole mica what do you do for work?

backend engineering, mostly

#

some frontend

#

I'm looking at going back into DS/ML though

rotund sail Nov 20, 2020, 6:14 AM

#

that was actually so fast

#

jesus

velvet thorn Nov 20, 2020, 6:15 AM

#

so there are two problems with the for loop.

#

one, it's not vectorised (slow)

#

two, it creates throwaway objects (slow AND wastes memory)
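
To put numbers on the first point rather than watching the prints, the same comparison can be timed with `timeit`. This is a sketch only: the array size is arbitrary and the actual timings depend on the machine, so no expected output is shown.

```python
import timeit

import numpy as np

a = np.arange(500_000)


def with_loop():
    # Python-level loop: one interpreter step per element.
    return [v + 1 for v in a]


def vectorised():
    # Single NumPy call: the loop runs in compiled code.
    return a + 1


loop_s = timeit.timeit(with_loop, number=1)
vec_s = timeit.timeit(vectorised, number=1)
print(f"loop: {loop_s:.4f}s  vectorised: {vec_s:.4f}s")
```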

whole mica Nov 20, 2020, 6:15 AM

#

well is it ok if i keep asking ya questions?

velvet thorn Nov 20, 2020, 6:15 AM

#

whole mica well is it ok if i keep asking ya questions?

don't ask me specifically, but you can post stuff here and whomever will answer

whole mica Nov 20, 2020, 6:16 AM

#

well its about jobs

velvet thorn Nov 20, 2020, 6:16 AM

#

try #career-advice

whole mica Nov 20, 2020, 6:16 AM

#

okie!

#

so, using that book you gave me should help a lot right?

#

I really wanna get to make a A.I to play pokemon

velvet thorn Nov 20, 2020, 6:18 AM

#

whole mica so, using that book you gave me should help a lot right?

it's a start.

whole mica Nov 20, 2020, 6:19 AM

#

and tic-tac-toe is a good game to start with?

round tulip Nov 20, 2020, 7:03 AM

#

for the 12 people who haven't seen this yet https://www.youtube.com/watch?v=aircAruvnKk&ab_channel=3Blue1Brown

YouTube

3Blue1Brown

But what is a Neural Network? | Deep learning, chapter 1

Home page: https://www.3blue1brown.com/
Brought to you by you: http://3b1b.co/nn1-thanks
Additional funding provided by Amplify Partners

Full playlist: http://3b1b.co/neural-networks

Typo correction: At 14 minutes 45 seconds, the last index on the bias vector is n, when it's supposed to in fact be a k. Thanks for the sharp eyes that caught th...


peak schooner Nov 20, 2020, 10:21 AM

#

are there any resources for converting excel spreadsheets to python

sand escarp Nov 20, 2020, 12:01 PM

#

Do you mean read excel files? Or is it something else I'm too inexperienced to understand? If it is the former, there is a read_excel() function in pandas.

#

Hello guys, do you happen to know any free resources for learning statistics and probability? What I want to do is supplement a course I'm learning on statistics to have a better intuitive understanding of the things, with some interactivity and stuff. I have in mind jupyter notebooks or webpages or books or anything at all. One I tried is Think Stats by Allen Downey but he doesn't delve too much into the mathematics so it isn't that helpful to me.

neon cave Nov 20, 2020, 1:12 PM

#

does anyone here know R?

#

I need help with R

earnest herald Nov 20, 2020, 1:41 PM

#

Hello (:
I am looking for a decent image comparison algorithm. I have looked into MSE and SSIM so far. Can someone recommend another algorithm except LSH or using OpenCV?

My end goal is to create a large scale image comparison (hand written signatures) algorithm most likely using Tensorflow.

Thanks

#

Ping me (:

rough cedar Nov 20, 2020, 1:55 PM

#

earnest herald Hello (: I am looking for a decent image comparison algorithm. I have looked int...

if i don't get it wrong, you want to use deep learning to compare images. a very fundamental way is that you can use a pretrained VGG model to generate features for two input images and calculate cos similarity
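
The cosine-similarity step on its own is short enough to sketch in plain Python. The two vectors below are made-up placeholders; in the approach described above they would be the feature vectors produced by the pretrained model for each image.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Placeholder feature vectors, not real model output.
features_a = [0.2, 0.9, 0.1]
features_b = [0.25, 0.8, 0.05]
print(cosine_similarity(features_a, features_b))
```

Values close to 1 mean the two feature vectors point in nearly the same direction; a threshold for "same signature" would have to be chosen empirically.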

earnest herald Nov 20, 2020, 2:00 PM

#

rough cedar if i don't get it wrong, you want to use deep learning to compare images. a very...

Sweet! I'll look into VGG right away.

There was also this one thing I couldn't find on the internet. How do I store multiple images in a file and import it in python rather than using img.open everytime

#

Thanks a lot for VGG heads up by the way (:

rough cedar Nov 20, 2020, 2:03 PM

#

earnest herald Sweet! I'll look into VGG right away. There was also this one thing I couldn't...

in tensorflow there is a module called tf.data.Dataset where you can store and read all the image in the form of arrays (or tensors)

#

you would encounter it anyway since you are going to use VGG

earnest herald Nov 20, 2020, 2:04 PM

#

Awesome! This makes things a lot easier for me. Thanks bud

rough cedar Nov 20, 2020, 2:04 PM

#

you are welcome

signal delta Nov 20, 2020, 2:24 PM

#

Hello,
Can someone help me in the help-nitrogen channel. I've described the problem I am facing in that channel

#

Thank you!

safe tapir Nov 20, 2020, 3:41 PM

#

Is there a way to make a nested dataframe accessor?

e.g.
@pd.api.extensions.register_dataframe_accessor("base.second.third")
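
The question goes unanswered in the log. One possible workaround, sketched here as an assumption rather than an official pandas feature, is to register only the top-level name and let that accessor expose the nested levels as properties. The class names `_Second` and `_Third` and the method `n_rows` are hypothetical.

```python
import pandas as pd


class _Third:
    """Innermost namespace: holds the actual methods."""

    def __init__(self, df):
        self._df = df

    def n_rows(self):
        return len(self._df)


class _Second:
    """Middle namespace: only forwards the DataFrame one level down."""

    def __init__(self, df):
        self._df = df

    @property
    def third(self):
        return _Third(self._df)


@pd.api.extensions.register_dataframe_accessor("base")
class BaseAccessor:
    """Top-level accessor, reachable as df.base."""

    def __init__(self, pandas_obj):
        self._df = pandas_obj

    @property
    def second(self):
        return _Second(self._df)


df = pd.DataFrame({"x": [1, 2, 3]})
print(df.base.second.third.n_rows())  # 3
```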

charred pagoda Nov 20, 2020, 4:52 PM

#

Can someone help me? I have a list of the times it takes my server to process and I want to find anomalies, but for some reason it's not working,
I'm doing the following if

stdev = statistics.stdev(times)
if min(currentTime) + stdev < statistics.mean(times):

#

But it's not working, it doesn't find the anomalies or marks normal values as anomalies
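
A common way to flag outliers, sketched here as an assumption about what was intended, is to compare each value's distance from the mean against a multiple of the standard deviation. The function name `find_anomalies`, the threshold of 2 and the sample timings are illustrative.

```python
import statistics


def find_anomalies(times, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(times)
    stdev = statistics.stdev(times)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [t for t in times if abs(t - mean) > threshold * stdev]


# Made-up processing times in seconds, with one obvious spike.
times = [0.21, 0.19, 0.20, 0.22, 0.18, 0.20, 0.21, 1.50, 0.19, 0.20]
print(find_anomalies(times))  # [1.5]
```

Note that a large spike inflates both the mean and the standard deviation, so on small samples a threshold of 3 may never trigger; a median-based measure is a possible alternative.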

vital drift Nov 20, 2020, 5:21 PM

#

Hi, can someone help me in the -krypton channel? I'm new to python and it's just some simple excel manipulation, but I don't start python in my grad program until next year.

gentle kindle Nov 20, 2020, 5:41 PM

#

Hello, I am looking for a way to automatically classify JSON data that may or may not have headers, using an external site like Wikipedia to determine context and collect tags. Is there a script I can look at?

#

Please ping me if anyone knows. Thanks.

heady hatch Nov 20, 2020, 7:08 PM

#

Heya anyone here familiar with protobuf and parsing tfrecords?

#

I'm having issue parsing tfrecords and not too sure how to work with it since I'm not familiar with protobufs at all.

snow maple Nov 20, 2020, 7:55 PM

#

so what is up

rich silo Nov 20, 2020, 8:03 PM

#

Hello guys, I am looking to create a function that takes continuous data and a bin count as arguments and creates a frequency table with 2 columns (pandas): the bin ranges and the frequency count.
The data input should be a list like range(0,1500).
Anyone has any ideas about this?

green hemlock Nov 20, 2020, 8:34 PM

#

do you need to create a function @rich silo , or can you use a function like pd.qcut()? (assuming that's what you meant by binning continuous values)

rich silo Nov 20, 2020, 8:36 PM

#

That's what i have tried at first but i couldnt get it to the format that i wanted meaning the 2 column table

#

Maybe i am missing something obvious......

green hemlock Nov 20, 2020, 8:37 PM

#

how would output look like? can you give an example, will make it easier to understand exact objective

rich silo Nov 20, 2020, 8:38 PM

#

Sure give me a sec

#

📎 unknown.png

#

something along those lines

green hemlock Nov 20, 2020, 8:46 PM

#

are you looking for something like this?

📎 unknown.png

#

@rich silo ^

rich silo Nov 20, 2020, 8:48 PM

#

Yeah kinda. can this be converted into a pandas table?

green hemlock Nov 20, 2020, 8:48 PM

#

definitely

#

📎 unknown.png

#

you can also perform some cleanup, to format the data, incase you need

rich silo Nov 20, 2020, 8:50 PM

#

Something else can the bound of the bins be open?
For example , to also have less than -0.05 and more than 50

green hemlock Nov 20, 2020, 8:53 PM

#

your min/max values become the bounds when you use cut

rich silo Nov 20, 2020, 8:53 PM

#

A i see. i think i got it from here

#

Thanks a lot for your helps

#

help

#

I am quite new to this

green hemlock Nov 20, 2020, 8:54 PM

#

sure, no problem

rich silo Nov 20, 2020, 9:04 PM

#

@green hemlock it seems that now it creates the bins but all of them have the same number of observations.

#

📎 unknown.png

#

Could i perhaps use numpy linspace to try and sort this?

green hemlock Nov 20, 2020, 9:37 PM

#

they are not same

#

Last 2 values are 386, rest are 387

#

And yeah, you can use linspace, but your problems should be possible to be solved by cut/qcut.

#

@rich silo

rich silo Nov 20, 2020, 9:40 PM

#

I have just done it using linspace now it looks like this:

📎 unknown.png

#

Is there anyway to sort the bins from lowest to greatest and also can those be formatted as percentage for example

green hemlock Nov 20, 2020, 9:43 PM

#

Can you try qcut and see the results

#

And by sorting lowest to greatest, do you mean frequency or 1st value of tuple?

rich silo Nov 20, 2020, 9:44 PM

#

the first value of the tuple

#

because i will be using this in a plotly graph

green hemlock Nov 20, 2020, 9:47 PM

#

Your best bet would be to extract the numbers by treating it as a string, or replacing the last ] with ), and then use ast.literal_eval to convert it into a Python tuple. That will make it easier to sort.

rich silo Nov 20, 2020, 9:48 PM

#

Hm that's what i was thinking as well although string manipulation is always painful to me

green hemlock Nov 20, 2020, 9:49 PM

#

I am not sure, if there is any other easier way, but if there is, let me know too

boreal summit Nov 20, 2020, 10:33 PM

#

On classification reports for sklearn, I'm finding it hard to wrap my head around what recall actually means. I know support is the total number of that variable present in that dataset, f1 is the harmonic mean btw precision and recall, predictions is the percentage of right predictions over the total of the dataset being predicted, but can't seem to understand recall.

#

I'd appreciate if someone explains it to me like a 10 years old. Tha is.

#

*thanks.

blissful pendant Nov 20, 2020, 11:56 PM

#

im trying to find a way to convert latex expressions into images however i cant find a way to do this, any help would be appreciated

velvet thorn Nov 21, 2020, 12:01 AM

#

boreal summit On classification reports for sklearn, I'm finding it hard to wrap my head aroun...

say I have 10 dogs and 10 cats in a room

#

and I tell you "go into the room and get me all the dogs".

boreal summit Nov 21, 2020, 12:01 AM

#

Listening...

velvet thorn Nov 21, 2020, 12:01 AM

#

but you're not very good at differentiating dogs and cats

#

so you can make the very safe play

#

and bring out every single animal

#

or

#

you could choose only those you are very sure are dogs.

#

in the first case, you get 10 dogs and 10 cats.

#

which is good in one sense, because you got every dog that was there to get

#

but you also have lots of stuff that I didn't ask for (cats)

#

in the second case, maybe you only got 2 dogs?

#

but you didn't get any cats

#

which is good in a different sense, because you didn't get any extraneous rubbish.

#

precision measures the second sense of goodness: of the stuff you got, how much was actually relevant?

#

recall measures the first sense of goodness: of the relevant stuff that was available, how much did you get?

#

and that's generally why you want measures that combine both, like f1 score: because you usually want results that are largely complete (get most of what you're looking for) and relevant (don't contain much of what you're not looking for)

#

make sense?

velvet thorn Nov 21, 2020, 12:05 AM

#

blissful pendant im trying to find a way to convert latex expressions into images however i cant f...

in what context?

#

like you pass a LaTeX expression to code and it generates a .png or .jpg or something like that?

boreal summit Nov 21, 2020, 12:05 AM

#

So recall is like the percentage of right predictions you got (dogs) over all the stuff I brought out (dogs and cats). So recall is 0.5 in the first instance?

velvet thorn Nov 21, 2020, 12:06 AM

#

rich silo Is there anyway to sort the bins from lowest to greatest and also can those be f...

@green hemlock yes and yes.

#

for the percentage thing, pass normalize=True to value_counts

#

to sort, call .sort_index().
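
Putting the pieces of this thread together, a minimal sketch of the two-column frequency table could look like the following. The screenshots are not available, so the function name `frequency_table`, the column names and the example bin edges are assumptions; `pd.cut`, `value_counts(normalize=...)` and `sort_index()` are the calls mentioned in the chat.

```python
import numpy as np
import pandas as pd


def frequency_table(data, bins, normalize=False):
    """Bin `data` with pd.cut and return a two-column frequency table."""
    binned = pd.cut(pd.Series(list(data)), bins=bins)
    # sort_index() orders the bins from lowest to highest interval.
    counts = binned.value_counts(normalize=normalize).sort_index()
    return counts.rename_axis("bin").reset_index(name="frequency")


# Five equal-width bins over the example input from the question.
table = frequency_table(range(0, 1500), bins=5)
print(table)

# Open-ended outer bins via explicit edges, with frequencies as fractions.
edges = [-np.inf, 0, 500, 1000, np.inf]
open_table = frequency_table(range(0, 1500), bins=edges, normalize=True)
print(open_table)
```

Because the bins stay as pandas `Interval` objects, sorting works without any string parsing, and `interval.left` gives the lower bound if a numeric axis is needed for plotting.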

velvet thorn Nov 21, 2020, 12:07 AM

#

boreal summit So recall is like the percentage of right predictions you got (dogs) over all th...

no, that's precision

#

recall is the percentage of correct predictions over all the correct predictions there are to make

#

so there were 10 dogs and you got all 10

#

recall is 1

#

in the second case, there were 10 dogs and you got 2

boreal summit Nov 21, 2020, 12:09 AM

#

So recall in the second case is 0.2?

blissful pendant Nov 21, 2020, 12:09 AM

#

velvet thorn in what context?

yes exactly, i give it a latex expression and then a png is rendered showing it eg:

#

sin(sqrt(x**2 + 20)) + 1

#

📎 Js3gz.png

velvet thorn Nov 21, 2020, 12:10 AM

#

blissful pendant yes exactly, i give it a latex expression and then a png is rendered showing it e...

did you Google "render latex with Python"?

velvet thorn Nov 21, 2020, 12:10 AM

#

boreal summit So recall in the second case is 0.2?

ye

#

p sure MPL can do that @blissful pendant

boreal summit Nov 21, 2020, 12:10 AM

#

Ooh, okay. Thanks. I really appreciate @velvet thorn 🙏🏿

blissful pendant Nov 21, 2020, 12:10 AM

#

yes but sympy wasn't working correctly, probably should have specified that

blissful pendant Nov 21, 2020, 12:11 AM

#

velvet thorn ye

can you save directly from it?

velvet thorn Nov 21, 2020, 12:11 AM

#

blissful pendant can you save directly from it?

probably need to do some stuff

#

but yeah just go try it out
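
One way this might look with matplotlib's built-in mathtext renderer, which handles a subset of LaTeX without needing a TeX installation. The function name `render_latex_to_png`, the output path, and the LaTeX spelling of the example expression are all assumptions for illustration.

```python
import os
import tempfile

from matplotlib.figure import Figure


def render_latex_to_png(expression, path, fontsize=20, dpi=200):
    """Render a mathtext/LaTeX-style expression to a PNG file."""
    # A tiny figure; bbox_inches="tight" sizes the saved image
    # around the text instead of the figure.
    fig = Figure(figsize=(0.01, 0.01))
    fig.text(0, 0, f"${expression}$", fontsize=fontsize)
    fig.savefig(path, dpi=dpi, bbox_inches="tight", pad_inches=0.1)
    return path


# Assumed LaTeX form of sin(sqrt(x**2 + 20)) + 1
output_path = os.path.join(tempfile.gettempdir(), "expression.png")
render_latex_to_png(r"\sin(\sqrt{x^2 + 20}) + 1", output_path)
```

Mathtext does not cover all of LaTeX, so more exotic expressions may need matplotlib's `text.usetex` setting and a real TeX install instead.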

blissful pendant Nov 21, 2020, 12:11 AM

#

ok thx

molten hamlet Nov 21, 2020, 12:37 AM

#

has anybody heard of geopandas?

austere swift Nov 21, 2020, 1:16 AM

#

!d pandas.DataFrame.dropna

arctic wedge BOT Nov 21, 2020, 1:16 AM

#

`pandas.DataFrame.dropna`

`DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)`

Remove missing values.

See the [User Guide](../../user_guide/missing_data.html#missing-data) for more on which values are considered missing, and how to work with missing data.

Parameters

**axis** {0 or ‘index’, 1 or ‘columns’}, default 0
Determine if rows or columns which contain missing values are removed.

• 0, or ‘index’ : Drop rows which contain missing values.

• 1, or ‘columns’ : Drop columns which contain missing value.

Changed in version 1.0.0: Pass tuple or list to drop on multiple axes. Only a single axis is allowed.

**how** {‘any’, ‘all’}, default ‘any’
Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.

• ‘any’ : If any NA values are present, drop that row or column.

• ‘all’ : If all values are NA, drop that row or column.

**thresh** int, optional
Require that many non-NA values.... [read more](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html#pandas.DataFrame.dropna)

velvet thorn Nov 21, 2020, 1:21 AM

#

@glad mulch show code

austere swift Nov 21, 2020, 1:22 AM

#

you still have a threshold on it, so those are probably just NaNs that weren't dropped since it was below the threshold

velvet thorn Nov 21, 2020, 1:22 AM

#

the code you're using to drop nulls

#

and how specifically it's not doing what you want it to do

#

how big is the original

austere swift Nov 21, 2020, 1:23 AM

#

if you wanna check how many nans there are left do df[column].isna().sum()

#

and see if that's below the threshold
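
A small sketch of why NaNs can survive a `dropna` call that uses `thresh`. The DataFrame here is made up, since the original code was not posted. It relies only on the documented behavior that `thresh` requires that many non-NA values for a row to be kept.

```python
import numpy as np
import pandas as pd

# Made-up data; the real DataFrame from the question is not shown.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [1.0, 2.0, np.nan, np.nan],
    "c": [1.0, 2.0, 3.0, np.nan],
})

# Keep rows with at least 2 non-NA values: only the all-NaN row is dropped,
# so rows that still contain a single NaN stay in the result.
kept = df.dropna(thresh=2)

# Count the NaNs left in one column, as suggested above.
remaining = kept["a"].isna().sum()

# For comparison: the default how='any' drops every row containing a NaN.
strict = df.dropna()

print(len(kept), remaining, len(strict))  # 3 1 1
```

If the goal is to have no NaNs left at all, calling `dropna()` without `thresh` (or with `subset=` for specific columns) is the stricter option.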