#data-science-and-ml | Python | Page 306

jolly ginkgo Apr 18, 2021, 8:23 PM

#

try plotly

#

and seaborn

lapis sequoia Apr 18, 2021, 8:23 PM

#

I'm just asking what "mat" stands for rooThink1

vale crown Apr 18, 2021, 8:50 PM

#

I think it stands for "Matlab", but not sure. Matplotlib looks like the Matlab plotting system and was created to resemble Matlab

lapis sequoia Apr 18, 2021, 8:54 PM

#

Sounds correct rooYes Matrix laboratory bearStudy

grave frost Apr 18, 2021, 10:07 PM

#

uh-huh

#

is matplotlib short for math or matrix
Does it honestly matter?

lapis sequoia Apr 18, 2021, 10:23 PM

#

I enjoy etymology ruwu

mint obsidian Apr 18, 2021, 10:49 PM

#

hey um my question wasn't answered in help so I figured maybe I could ask here?
so I'm new... if there was 5 columns for each person, is there a way to add a new column that counts how many columns were filled for each person?
considering the dataset comes from Excel that is

dapper halo Apr 18, 2021, 11:38 PM

#

mint obsidian hey um my question wasn't answered in help so I figured maybe I could ask here? ...

maybe just sum across rows? If you need it to be binary, then just set it to 1 if the value is above zero. Then input your new array as an additional column

#

So for missing data....is there a way to have a sort of initialization layer that will deactivate an input neuron based on a specific value?

sharp turret Apr 18, 2021, 11:40 PM

#

Hey fellows, if I were to run kmeans.predict() on a really large dataset, would it be faster to give it one huge numpy array, or a series of smaller numpy arrays?

mint obsidian Apr 18, 2021, 11:43 PM

#

dapper halo maybe just sum across rows? If you need it to be binary, then just set it to 1 i...

I'm not sure if I understood you correctly... as I said I just started on Python, could you explain it with a little less programming lingo? especially for the second sentence
EDIT: I'm sorry I don't want to hinder the traffic here, esp for the person who asked a question just now, is it still ok to discuss here?

dapper halo Apr 18, 2021, 11:48 PM

#

mint obsidian I'm not sure if I understood you correctly... as I said I just started on Python...

The second sentence was my own question haha, so ignore that.

My suggestion for you was just to sum the values across each row. Maybe normalize them or something by their own values if you just want a total count. I'm sure there are better ways to do it. Also not sure what your missing values are labeled as. Whether or not this is the appropriate channel to ask that question...i have no idea haha.

exotic maple Apr 18, 2021, 11:57 PM

#

mint obsidian hey um my question wasn't answered in help so I figured maybe I could ask here? ...

You mean a column that counts how many columns are filled per row per person?

velvet thorn Apr 19, 2021, 12:00 AM

#

mint obsidian I'm not sure if I understood you correctly... as I said I just started on Python...

yes it is

#

show a sample of your data please

velvet thorn Apr 19, 2021, 12:01 AM

#

lapis sequoia I'm just asking what "mat" stands for rooThink1

matrix

#

well

#

because the "mat" comes from MATLAB

#

so, yes, matrix laboratory

mint obsidian Apr 19, 2021, 12:08 AM

#

exotic maple You mean a column that counts how many columns are filled per row per person?

yes!

exotic maple Apr 19, 2021, 12:09 AM

#

mint obsidian yes!

Can you show sample data? Its better to see it

mint obsidian Apr 19, 2021, 12:09 AM

#

velvet thorn yes it is

I'm not sure how to do that but in case it helps, each row is a person with 43 columns, 5 of them are diagnoses (because max diagnoses in this data set is 5)
I just want to know how many diagnoses a person has (how many of the 5 columns are filled)

#

thanks for taking the time to answer btw

velvet thorn Apr 19, 2021, 12:11 AM

#

mint obsidian I'm not sure how to do that but in case it helps, each row is a person with 43 c...

by "unfilled" you mean empty?

#

like null

mint obsidian Apr 19, 2021, 12:11 AM

#

do I just send a screenshot? but the data is confidential

#

yes

velvet thorn Apr 19, 2021, 12:11 AM

#

mint obsidian do I just send a screenshot? but the data is confidential

no

#

for two reasons

#

confidential

#

screenshots in general are bad because hard to read + non-reproducible

#

data as text is better

#

anyway

#

I'm assuming

#

you're working with your data in Python

#

using pandas

#

is that correct?

mint obsidian Apr 19, 2021, 12:11 AM

#

haha I'm not sure how to do that yet, I just started Python like 3 days ago

velvet thorn Apr 19, 2021, 12:11 AM

#

or openpyxl?

mint obsidian Apr 19, 2021, 12:12 AM

#

yes and numpy

velvet thorn Apr 19, 2021, 12:12 AM

#

okay

#

what's your dataframe called?

mint obsidian Apr 19, 2021, 12:12 AM

#

i just called it excel lol

velvet thorn Apr 19, 2021, 12:12 AM

#

excel.notna().sum(axis=1) should get you the column you want

velvet thorn Apr 19, 2021, 12:13 AM

#

mint obsidian haha I'm not sure how to do that yet, I just started Python like 3 days ago

not sure if you should be working on something like this so early

#

fundamentals are important

mint obsidian Apr 19, 2021, 12:13 AM

#

I know

velvet thorn Apr 19, 2021, 12:13 AM

#

velvet thorn `excel.notna().sum(axis=1)` should get you the column you want

and you still need to know how to combine it with your original data

mint obsidian Apr 19, 2021, 12:13 AM

#

but the data set is very messy, I had to split the diagnoses from a single cell separated with either comma or ;

#

my colleagues work with only excel and that doesnt cut it so....

#

anyway, wouldnt the code u gave me count also the other columns that are not diagnoses?

velvet thorn Apr 19, 2021, 12:14 AM

#

mint obsidian anyway, wouldnt the code u gave me count also the other columns that are not dia...

what?

#

that counts the number of non-null values in each row

mint obsidian Apr 19, 2021, 12:15 AM

#

there are 43 columns in each row, only 5 of them are diagnoses

velvet thorn Apr 19, 2021, 12:15 AM

#

oh

#

well

#

then subset the DataFrame first
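
A minimal sketch of that suggestion, for anyone following along. The column names (`diag_1` and so on) and the sample rows are invented, since the real sheet is confidential; only `excel` as the DataFrame name comes from the conversation.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the confidential sheet: all column names are made up.
excel = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "age": [34, 51, 29],
    "diag_1": ["asthma", "flu", np.nan],
    "diag_2": ["eczema", np.nan, np.nan],
    "diag_3": [np.nan, np.nan, np.nan],
})

# Subset to the diagnosis columns only, then count non-null cells per row.
diag_cols = ["diag_1", "diag_2", "diag_3"]
excel["n_diagnoses"] = excel[diag_cols].notna().sum(axis=1)

print(excel[["patient_id", "n_diagnoses"]])
```

Assigning the result to `excel["n_diagnoses"]` is what attaches the count to the original data. One caveat: `notna()` only treats real missing values as empty, so cells holding an empty string would still be counted.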

mint obsidian Apr 19, 2021, 12:15 AM

#

I want to know how many of the 5 is filled

#

yeah I figured

#

thank you for the pointers

velvet thorn Apr 19, 2021, 12:15 AM

#

yw

#

feel free to ask further questions

mint obsidian Apr 19, 2021, 12:15 AM

#

and sorry for that guy who didnt get his question answered yet

sharp turret Apr 19, 2021, 12:32 AM

#

Hahah it's alright bro I figured it out experimentally

lavish tundra Apr 19, 2021, 1:32 AM

#

if u are a expert about pandas help me in #help-ramen pls
its a really hard problem

dapper halo Apr 19, 2021, 1:43 AM

#

typically you standardize data so if the domains of each input are different, they'll still be weighted equally. But if you want certain features to have a higher weight... is it common practice to maybe standardize particular features but leave others as raw? Or is this an inappropriate way to deal with this?

velvet thorn Apr 19, 2021, 1:52 AM

#

dapper halo typically you standardize data so if the domains of each input are different, th...

why do you want that

dapper halo Apr 19, 2021, 2:07 AM

#

velvet thorn why do you want that

My data physically has a higher dependency on certain parameters. Although thinking about it more shouldnt matter and should standardize them all. Idk not much forward progress has been made in a couple weeks and im just tryna think of stuff I could try and see if it potentially benefits my model. Still know very little with ML in general...so kinda the whole..if you know nothing everything is fair game whether its valid or not haha

velvet thorn Apr 19, 2021, 2:07 AM

#

dapper halo My data physically has a higher dependency on certain parameters. Although think...

then the model would learn that

#

if you already knew

#

how to weight your parameters

dapper halo Apr 19, 2021, 2:07 AM

#

velvet thorn then the model would learn that

Yeah hence my second sentence haha

velvet thorn Apr 19, 2021, 2:07 AM

#

you wouldn't be building a model, right

dapper halo Apr 19, 2021, 2:07 AM

#

this is true.

floral mantle Apr 19, 2021, 2:08 AM

#

In pandas dataframe I have a huge DF and my data has multiple formats for name. I want to grab the first instance and drop the rest...

Ex:
ID Name
1 Sean S
1 Sean—
1 Sean
2 Bob

#

Not sure how and it’s killing me. I’d do MAXIF in excel

velvet thorn Apr 19, 2021, 2:09 AM

#

floral mantle In pandas dataframe I have a huge DF and my data has multiple formats for name. ...

how do you determine the "first" instance

floral mantle Apr 19, 2021, 2:10 AM

#

Honestly I don’t care which one I grab

velvet thorn Apr 19, 2021, 2:10 AM

#

like

floral mantle Apr 19, 2021, 2:10 AM

#

I just can’t dump duplicates unless I pick one

velvet thorn Apr 19, 2021, 2:10 AM

#

based on that

#

what do you want the result to be?

#

deduplicate based on the ID column?

floral mantle Apr 19, 2021, 2:11 AM

#

I could drop the name column and deduplicate but then I need to bring a name back for reference

#

Without duplications if this makes sense

velvet thorn Apr 19, 2021, 2:11 AM

#

floral mantle I could drop the name column and deduplicate but then I need to bring a name bac...

so

#

yes or no?

floral mantle Apr 19, 2021, 2:11 AM

#

So
1 Sean S
2 Bob

Would be a good result

velvet thorn Apr 19, 2021, 2:11 AM

#

okay

#

df.drop_duplicates(subset=['ID'])

floral mantle Apr 19, 2021, 2:14 AM

#

Ok

#

So not sure how that works but yeah

#

That did what I wanted

#

That’s awesome. Thanks @velvet thorn

#

Pandas has some weird voodoo commands

velvet thorn Apr 19, 2021, 2:15 AM

#

floral mantle So not sure how that works but yeah

it drops duplicate rows

dapper halo Apr 19, 2021, 2:15 AM

#

built for convenience

velvet thorn Apr 19, 2021, 2:15 AM

#

but whether rows are duplicated

#

will be determined solely

#

by the values in the columns in subset

#

which is just ID

floral mantle Apr 19, 2021, 2:15 AM

#

Oh that’s genius. So basically it didn’t care about the name at all

velvet thorn Apr 19, 2021, 2:16 AM

#

so of each group of rows with the same ID value, the first will be taken
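
Roughly what that looks like on the sample data from the question (the frame is rebuilt by hand here, so treat it as an illustration rather than the real DF):

```python
import pandas as pd

# Sample rows modelled on the example in the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2],
    "Name": ["Sean S", "Sean-", "Sean", "Bob"],
})

# Only the ID column decides what counts as a duplicate;
# keep="first" (the default) keeps the first row of each ID group.
deduped = df.drop_duplicates(subset=["ID"], keep="first")

print(deduped)
```

Passing `keep="last"` instead would keep the last row of each group, which is one of the options the documentation covers.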

floral mantle Apr 19, 2021, 2:16 AM

#

And it just said screw you — take the first one you dumb python guy

#

I love it

velvet thorn Apr 19, 2021, 2:16 AM

#

I would suggest you look @ the documentation for that function

#

plenty of options

floral mantle Apr 19, 2021, 2:17 AM

#

Definitely will. I’m rewriting some long processes for my team to run more consistently

#

Getting out of crazy excel sheets

#

That was really helpful. I’ll dig into the docs more but needed this one asap

#

👍

lavish tundra Apr 19, 2021, 2:25 AM

#

velvet thorn plenty of options

ur code helped me a lot, ty

#

i only need to figure out now how to replace a str value in the dict's

brittle wing Apr 19, 2021, 3:07 AM

#

is it possible to quantize a tensorflow(keras) model without converting to tflite

floral mantle Apr 19, 2021, 5:07 AM

#

How about a pd function to add a column for earliest date?

Example: dataframe with all invoices paid by an account. I want the earliest since that’s customer start date.

Using:
loc now to filter to active accounts only, then .groupby(account ID) .agg(invoice date, min)

#

Sorry for the formatting... on mobile

stuck socket Apr 19, 2021, 5:22 AM

#

sean

#

@floral mantle r u there?

floral mantle Apr 19, 2021, 5:28 AM

#

@stuck socket yes

stuck socket Apr 19, 2021, 5:29 AM

#

how would u do to grab numbers from a column in this way: 1,2,3,4--2,3,4,5--3,4,5,6.. etc

worn bough Apr 19, 2021, 6:07 AM

#

If the basic Series is 1,2,3,4 you can just do series+1

#

If you want to shift the Series, you can use series.shift(1)

slim jackal Apr 19, 2021, 7:28 AM

#

Is anyone a data analyst here?

#

I need some help.

velvet thorn Apr 19, 2021, 7:35 AM

#

floral mantle How about a pd function to add a column for earliest date? Example: dataframe w...

looks about right

#

unless you’re asking how to combine the two

#

in which case .merge
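
A small sketch of the groupby-then-merge idea, with invented column names (`account_id`, `invoice_date`, `start_date`) and made-up invoices. The filter to active accounts is left out because the original doesn't show how that is stored.

```python
import pandas as pd

# Hypothetical invoice table; names and dates are assumptions for illustration.
invoices = pd.DataFrame({
    "account_id": [10, 10, 20, 20, 20],
    "invoice_date": pd.to_datetime(
        ["2020-03-01", "2019-11-15", "2021-01-10", "2020-06-30", "2020-12-01"]
    ),
})

# Earliest invoice per account, as its own small table.
start_dates = (
    invoices.groupby("account_id", as_index=False)["invoice_date"]
    .min()
    .rename(columns={"invoice_date": "start_date"})
)

# Bring the start date back onto every invoice row.
with_start = invoices.merge(start_dates, on="account_id", how="left")

# One-step alternative: transform broadcasts the group minimum to each row.
invoices["start_date"] = invoices.groupby("account_id")["invoice_date"].transform("min")

print(with_start)
```

Both routes should give the same column; `transform` skips the intermediate table when all you need is the extra column.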

tropic junco Apr 19, 2021, 8:02 AM

#

HOW DO I MAKE A CHAT BOT

#

@scarlet wasp

#

i need help

grave frost Apr 19, 2021, 8:34 AM

#

tropic junco HOW DO I MAKE A CHAT BOT

make a giant file of questions, and another for responses. simple if/else

tropic junco Apr 19, 2021, 8:37 AM

#

ok

austere swift Apr 19, 2021, 8:38 AM

#

so i'm loosely following this pytorch chatbot tutorial to learn more about seq2seq models and the like but I'm stuck at this part https://pytorch.org/tutorials/beginner/chatbot_tutorial.html#masked-loss

#

I can't seem to understand why they would use torch.gather

#

also

#

the code just doesnt work

#

brings up all sorts of cuda errors

grave frost Apr 19, 2021, 8:40 AM

#

austere swift the code just doesnt work

welcome to pytorch!

austere swift Apr 19, 2021, 8:41 AM

#

lol yeah i've used pytorch for a while now but I'm just now getting into more advanced NLP concepts

#

like attention and stuff

grave frost Apr 19, 2021, 8:41 AM

#

but seriously, are you not using lightning?

austere swift Apr 19, 2021, 8:41 AM

#

for some reason i just don't really like lightning that much tbh

#

plus i'm more used to normal pytorch

#

but i don't understand why they would give a tutorial code that literally just doesn't work lol

grave frost Apr 19, 2021, 8:42 AM

#

wheres the error BTW?

#

oof, that's so programming heavy

austere swift Apr 19, 2021, 8:50 AM

#

well with the original function they gave me it had the error RuntimeError: Size does not match at dimension 0 expected index [32, 1] to be smaller than src [1, 32] apart from dimension 1

#

I did a bunch of stuff to fix it but then I gave up and went back to the original one cus the stuff i did to fix it just ended up giving me a ton of cuda errors

#

theres also the issue that if i try to print any of those tensors my pc blue screens entirely

#

so i can't even visualize what it's doing :)

#

I don't know why it's doing that though

#

very weird

grave frost Apr 19, 2021, 8:53 AM

#

austere swift well with the original function they gave me it had the error `RuntimeError: Siz...

is that near the loss calc?

austere swift Apr 19, 2021, 8:53 AM

#

its on the gather function

grave frost Apr 19, 2021, 8:54 AM

#

for CUDA errors, switch to CPU to have a clearer traceback 🤷

grave frost Apr 19, 2021, 8:54 AM

#

austere swift its on the gather function

I thought so

#

basically, your guess is as good as mine - you passed out of bound index to the index arg of torch.gather

#

S.O has a really nice explanation, if you want to understand torch.gather
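
For readers who want the gist without leaving the log, here is a pure-Python picture of what `torch.gather(src, 1, index)` computes on 2-D data: `out[i][j] = src[i][index[i][j]]`. The helper `gather_dim1` and the sample numbers are made up for illustration and this is not the tutorial's code. The shape check mirrors the kind of complaint in the error above, where the index had more rows than the source.

```python
def gather_dim1(src, index):
    """Mimic torch.gather(src, 1, index) for 2-D nested lists."""
    # Outside the gather dimension, index must not be larger than src.
    if len(index) > len(src):
        raise ValueError("index has more rows than src")
    return [[src[i][j] for j in row] for i, row in enumerate(index)]


# Two samples, three classes: each row holds per-class probabilities.
probs = [
    [0.1, 0.2, 0.7],
    [0.5, 0.3, 0.2],
]
targets = [2, 0]                   # correct class for each sample
index = [[t] for t in targets]     # shape (2, 1), one column to pick per row

picked = gather_dim1(probs, index)
print(picked)
```

In a loss function this is a way to pull out, for every sample, the probability the model assigned to the correct class.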

scenic wedge Apr 19, 2021, 11:19 AM

#

Hey guys, im using a keras deeply connected net and when i convert my labels to categorical i keep getting the error "TypeError: 'NoneType' object is not callable" when i run model.fit(). Anyone know why this is happening?

ripe forge Apr 19, 2021, 11:25 AM

#

Full trace back? Sounds like something that shouldn't be a None is a None

scenic wedge Apr 19, 2021, 11:26 AM

#

this happens only when i change the labels to categorical and the loss function to categorical crossentropy

#

Oh, nevermind its fixed somehow after i compiled the model again. Thanks anyways!

misty thicket Apr 19, 2021, 11:54 AM

#

who good with pandas?

serene scaffold Apr 19, 2021, 11:57 AM

#

misty thicket who good with pandas?

Go ahead and ask your question.

#

Asking the actual question is going to be a lot faster than asking who knows about the topic of an unasked question.

misty thicket Apr 19, 2021, 11:58 AM

#

serene scaffold Go ahead and ask your question.

well its a big one

#

like I need constant help

#

VC?

serene scaffold Apr 19, 2021, 11:58 AM

#

Can you isolate a specific question for the moment?

misty thicket Apr 19, 2021, 11:59 AM

#

serene scaffold Can you isolate a specific question for the moment?

I have a csv file and wanna convert some specific columns unique values to numbers

serene scaffold Apr 19, 2021, 11:59 AM

#

misty thicket I have a csv file and wanna convert some specific columns unique values to numbe...

alright. Have you gotten as far as opening the CSV and selecting the column?

misty thicket Apr 19, 2021, 11:59 AM

#

serene scaffold alright. Have you gotten as far as opening the CSV and selecting the column?

yes

#

got its unique values too

serene scaffold Apr 19, 2021, 12:00 PM

#

misty thicket I have a csv file and wanna convert some specific columns unique values to numbe...

can the numbers be arbitrary?

misty thicket Apr 19, 2021, 12:00 PM

#

like I wanna assign a unique number to that unique value

serene scaffold Apr 19, 2021, 12:00 PM

#

misty thicket like I wanna assign a unique number to that unique value

yes--does the number matter?

misty thicket Apr 19, 2021, 12:00 PM

#

serene scaffold can the numbers be arbitrary?

no

#

but for the same value it should be same

serene scaffold Apr 19, 2021, 12:01 PM

#

alright. so make a dictionary mapping the unique values to numbers with enumerate, and then use the .replace method

misty thicket Apr 19, 2021, 12:01 PM

#

serene scaffold alright. so make a dictionary mapping the unique values to numbers with `enumera...

ok

#

.replace

serene scaffold Apr 19, 2021, 12:01 PM

#

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html

misty thicket Apr 19, 2021, 12:02 PM

#

thanks a lot

serene scaffold Apr 19, 2021, 12:02 PM

#

No problem

misty thicket Apr 19, 2021, 12:04 PM

#

serene scaffold No problem

what if that the data I want to convert exists multiple times?

serene scaffold Apr 19, 2021, 12:04 PM

#

misty thicket what if that the data I want to convert exists multiple times?

what do you mean by that?

misty thicket Apr 19, 2021, 12:05 PM

#

serene scaffold what do you mean by that?

column 1|
abcd
xyz
eces
abcd

#

like

#

I wanna convert every single one to a number

#

but

#

abcd exists twice

#

will

#

the .replace convert it too?

serene scaffold Apr 19, 2021, 12:06 PM

#

it would replace both instances of abcd with the same value

#

it just looks up abcd in the dict that you provide

misty thicket Apr 19, 2021, 12:06 PM

#

can you give a code for like this example please

#

if you can*

serene scaffold Apr 19, 2021, 12:10 PM

#

unique_vals = df['column'].unique()
num_mapping = dict(enumerate(unique_vals))

#

This is the first part

misty thicket Apr 19, 2021, 12:10 PM

#

['M Chinnaswamy Stadium' 'Punjab Cricket Association Stadium, Mohali'
 'Feroz Shah Kotla' 'Eden Gardens' 'Wankhede Stadium'
 'Sawai Mansingh Stadium' 'Rajiv Gandhi International Stadium, Uppal'
 'MA Chidambaram Stadium, Chepauk' 'Dr DY Patil Sports Academy' 'Newlands'
 "St George's Park" 'Kingsmead' 'SuperSport Park' 'Buffalo Park'
 'New Wanderers Stadium' 'De Beers Diamond Oval' 'OUTsurance Oval'
 'Brabourne Stadium' 'Sardar Patel Stadium, Motera' 'Barabati Stadium'
 'Vidarbha Cricket Association Stadium, Jamtha'
 'Himachal Pradesh Cricket Association Stadium' 'Nehru Stadium'
 'Holkar Cricket Stadium'
 'Dr. Y.S. Rajasekhara Reddy ACA-VDCA Cricket Stadium'
 'Subrata Roy Sahara Stadium'
 'Shaheed Veer Narayan Singh International Stadium'
 'JSCA International Stadium Complex' 'Sheikh Zayed Stadium'
 'Sharjah Cricket Stadium' 'Dubai International Cricket Stadium'
 'Maharashtra Cricket Association Stadium'
 'Punjab Cricket Association IS Bindra Stadium, Mohali'
 'Saurashtra Cricket Association Stadium' 'Green Park'
 'M.Chinnaswamy Stadium' 'MA Chidambaram Stadium' 'Arun Jaitley Stadium'
 'Rajiv Gandhi International Stadium'
 'Punjab Cricket Association IS Bindra Stadium'
 'MA Chidambaram Stadium, Chepauk, Chennai' 'Wankhede Stadium, Mumbai']

serene scaffold Apr 19, 2021, 12:10 PM

#

Did you look at the docs for .replace?

misty thicket Apr 19, 2021, 12:10 PM

#

serene scaffold ```py unique_vals = df['column'].unique() num_mapping = dict(enumerate(unique_va...

ok thanks

misty thicket Apr 19, 2021, 12:10 PM

#

serene scaffold Did you look at the docs for `.replace`?

yes

serene scaffold Apr 19, 2021, 12:10 PM

#

do you understand what dict(enumerate(unique_vals)) does?

misty thicket Apr 19, 2021, 12:12 PM

#

serene scaffold do you understand what `dict(enumerate(unique_vals))` does?

no

#

but it converts to dict I see

#

I know what dict does

serene scaffold Apr 19, 2021, 12:13 PM

#

!e

letters = 'abcdefg'
stuff = list(enumerate(letters))
print(stuff)
print(dict(stuff))

arctic wedgeBOT Apr 19, 2021, 12:13 PM

#

@serene scaffold :white_check_mark: Your eval job has completed with return code 0.

001 | [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f'), (6, 'g')]
002 | {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g'}

serene scaffold Apr 19, 2021, 12:13 PM

#

@misty thicket enumerate gives you tuples of ints and items from whatever iterable you pass to it

misty thicket Apr 19, 2021, 12:14 PM

#

serene scaffold @misty thicket `enumerate` gives you tuples of ints and items from whate...

got it

serene scaffold Apr 19, 2021, 12:14 PM

#

and you can put those in a dict as key-value pairs

misty thicket Apr 19, 2021, 12:14 PM

#

and it converts to dict by dict method

#

yea ok got it thanks

serene scaffold Apr 19, 2021, 12:14 PM

#

it's more a function than it is a method

misty thicket Apr 19, 2021, 12:14 PM

#

ok

misty thicket Apr 19, 2021, 12:16 PM

#

serene scaffold @misty thicket `enumerate` gives you tuples of ints and items from whate...

df.replace({0: 10, 1: 100})

#

then I use this right?

serene scaffold Apr 19, 2021, 12:17 PM

#

It occurs to me that I may have given you backwards instructions

misty thicket Apr 19, 2021, 12:17 PM

#

😐

serene scaffold Apr 19, 2021, 12:18 PM

#

unique_vals = df['column'].unique()
num_mapping = {v: k for k, v in enumerate(unique_vals)}

#

it's a small fix.

misty thicket Apr 19, 2021, 12:18 PM

#

serene scaffold ```py unique_vals = df['column'].unique() num_mapping = {v: k for k, v in enumerat...

ok
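
Putting the pieces together on the four-row example from earlier. This is a sketch: `column_code` is an invented name, and `Series.map` is used to apply the dictionary, though for this case `.replace` with the same dict should give the same numbers.

```python
import pandas as pd

# The small example column from the question above.
df = pd.DataFrame({"column": ["abcd", "xyz", "eces", "abcd"]})

# unique() keeps values in order of first appearance.
unique_vals = df["column"].unique()

# Keys are the original values, numbers are what they turn into.
num_mapping = {value: number for number, value in enumerate(unique_vals)}

# Look every cell up in the dict; repeated values get the same number.
df["column_code"] = df["column"].map(num_mapping)

print(num_mapping)
print(df)
```

Both occurrences of `abcd` end up with the same code because the lookup depends only on the value, not on the row.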

serene scaffold Apr 19, 2021, 12:22 PM

#

misty thicket ```python df.replace({0: 10, 1: 100}) ```

do you want to do that replacement in every column?

misty thicket Apr 19, 2021, 12:22 PM

#

serene scaffold do you want to do that replacement in every column?

yes

#

like there are 5-6 columns

serene scaffold Apr 19, 2021, 12:22 PM

#

misty thicket yes

then there's only one problem left

misty thicket Apr 19, 2021, 12:23 PM

#

yea

#

same numbers

serene scaffold Apr 19, 2021, 12:23 PM

#

most data frame methods return copies

#

you have to specify when you want to change a dataframe in place

misty thicket Apr 19, 2021, 12:23 PM

#

serene scaffold most data frame methods return copies

just to be sure
I'm totally not into data managment and stuff

serene scaffold Apr 19, 2021, 12:25 PM

#

>>> a = pd.DataFrame([[1, 2], [3, 4]])
>>> a
   0  1
0  1  2
1  3  4
>>> b = pd.DataFrame([[5, 6], [7, 9]])
>>> b
   0  1
0  5  6
1  7  9
>>> a.add(b)  # returns a new dataframe, does NOT change a
    0   1
0   6   8
1  10  13
>>> a  # a is the same
   0  1
0  1  2
1  3  4

#

most dataframe methods are the same way. it does whatever change you wanted to a copy rather than changing the one that you already have.
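
The same point applied to `.replace`, using the dict from a few messages up. The frame is a toy one invented for the demo.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2], "b": [1, 1, 0]})

# replace returns a new DataFrame; df itself is left untouched.
replaced = df.replace({0: 10, 1: 100})

print(df)        # still the original values
print(replaced)  # 0 -> 10 and 1 -> 100 in every column

# To keep the change, assign the result back to the name.
df = df.replace({0: 10, 1: 100})
```

Calling the method without storing its result is a common reason a replacement seems to "do nothing".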

misty thicket Apr 19, 2021, 12:34 PM

#

ok nvm I'll figure it out

#

thanks for the help bud

lavish tundra Apr 19, 2021, 12:51 PM

#

i have one pkl file like the img
i'm trying to replace the values inside the dict from "'s" to "s", but the values are dicts and i need to keep the type as dict
i know if i want change the keys i can use:

di['Name'] = di['Name'].map(lambda d: {k if k != 'EN-US' else 'en': v for k, v in d.items()})

but how about change the values?
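
One possible shape for that, assuming every value inside the dicts is a plain string. The sample rows and the helper `clean_text` are invented, since the real data was only posted as an image.

```python
import pandas as pd

# Hypothetical rows: each cell in "Name" is a dict of language -> text.
di = pd.DataFrame({
    "Name": [
        {"EN-US": "Hunter's Bow", "DE-DE": "Jagdbogen"},
        {"EN-US": "Mage's Staff", "DE-DE": "Magierstab"},
    ]
})


def clean_text(text):
    """Illustrative cleanup: turn "'s" into "s"; add more replacements here."""
    return text.replace("'s", "s")


# Same pattern as the key rename, but transforming v instead of k.
di["Name"] = di["Name"].map(lambda d: {k: clean_text(v) for k, v in d.items()})

print(di["Name"].tolist())
```

The cells stay dicts, so code that reads the language keys keeps working.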

topaz obsidian Apr 19, 2021, 12:58 PM

#

hiii!!

serene scaffold Apr 19, 2021, 1:00 PM

#

lavish tundra i have one pkl file like the img i'm trying to replace the values inside the dic...

So you just want to delete apostrophes?

lavish tundra Apr 19, 2021, 1:00 PM

#

ye

#

but not only delete the apostrophes, but replace some words too

velvet thorn Apr 19, 2021, 1:19 PM

#

lavish tundra i have one pkl file like the img i'm trying to replace the values inside the dic...

you didn't

#

answer my question

#

that time

#

why do you need to store as dict?

#

that's a bad data model IMO

lavish tundra Apr 19, 2021, 1:19 PM

#

cause i need to read the languages(keys) of the items

velvet thorn Apr 19, 2021, 1:20 PM

#

not a good reason

#

you can model the data differently

#

you should

#

paste

#

your data above

#

as text

lavish tundra Apr 19, 2021, 1:20 PM

#

i dont did it with different columns cause i need to check for one word fast in all languages too

velvet thorn Apr 19, 2021, 1:20 PM

#

not as a screenshot

velvet thorn Apr 19, 2021, 1:20 PM

#

lavish tundra i dont did it with different columns cause i need to check for one word fast in ...

it will be fast

#

in any case

#

did you profile?

#

or are you prematurely optimising?

lavish tundra Apr 19, 2021, 1:22 PM

#

i did a few of profiles

lavish tundra Apr 19, 2021, 1:26 PM

#

velvet thorn it will be fast

what u suggest?

grave frost Apr 19, 2021, 2:07 PM

#

"bylat, this shit slaps"

#

They are killing us by the suspense. just publish the code already!!!

ember jungle Apr 19, 2021, 2:36 PM

#

what is this called

empty patio Apr 19, 2021, 2:56 PM

#

anyone knows how backup R modules without installing every time(I'm using a VPS ) ?

serene scaffold Apr 19, 2021, 3:35 PM

#

empty patio anyone knows how backup R modules without installing every time(I'm using a VPS...

This server isn't really R oriented.

empty patio Apr 19, 2021, 3:39 PM

#

What if I had to do the same thing under python

#

does it have checksum authentication like R?

lapis sequoia Apr 19, 2021, 4:02 PM

#

Hey anyone knows how to combine kernel density estimator and naive bayes classifier?

lapis sequoia Apr 19, 2021, 4:03 PM

#

lapis sequoia Hey anyone knows how to combine kernel density estimator and naive bayes classif...

with sklearn

uncut monolith Apr 19, 2021, 5:02 PM

#

does anyone here have experience with dash and plotly? I'm using them for a physics assignment and having some trouble

uncut barn Apr 19, 2021, 5:19 PM

#

what are the ways that I can improve my solution when using k-means?

serene scaffold Apr 19, 2021, 5:33 PM

#

empty patio What if I had to do the same thing under python

I'm not sure what you mean

#

You can install python libraries with pip and save a list of what you have installed

grave frost Apr 19, 2021, 5:35 PM

#

uncut barn what are the ways that I can improve my solution when using k-means?

use neural net

uncut barn Apr 19, 2021, 5:45 PM

#

will look into that

lapis sequoia Apr 19, 2021, 7:22 PM

#

Guys, after the great community interaction our AutoDataCleaner (https://pypi.org/project/AutoDataCleaner/) received, we figured out that it is time for a free web-based end-to-end ML service. Drop your CSV/Excel, choose what column to predict and it will take you through a drag-and-drop wizard which will end in having your new model, python code downloadable and FastAPI web-based server to test out your predictions!

The drag-and-drop wizard will contain all necessary steps for data cleaning, EDA, feature engineering, automatic model selection and automatic hyperparameter optimization.

We will call this free service: var.blue (we already got the domain!)

Building something that ends up on a dusty shelf really sucks; that is why we would like to know the following:

Is this something you would appreciate and use?
If yes, what features would you like to see in var.blue.? This could be any statistical functions, specific data cleaning functions, data exploring practices, specific machine learning models. Literally anything that would make your ML project easier.
If no, what ML service would you appreciate?

We are trying to build a go-to place for ML projects; a place for pros to get setup quickly and for beginners to explore and learn.

We are eager to hear from you!

Shout out to the people who supported AutoDataCleaner by their valuable feedback:
u/0x256
u/EvenMoreConfusedNow
u/browneyesays
u/jiejenn

If you would like to jump on board and help, please DM.

#

Your input would be really helpful

lapis sequoia Apr 19, 2021, 8:00 PM

#

For what is useful ai?

fierce oracle Apr 19, 2021, 8:02 PM

#

I begin AI today and Bellman algorithm burnt my brain 😂😂 I wasn't ready for this

robust charm Apr 19, 2021, 8:15 PM

#

Hi, Can anyone help with this? I have created a CNN model that detects a certain object. I now want to test the model with random images with the object in the picture. When the CNN has detected the object I would like to draw a rectangle at the location. Could someone point me in the right direction.

#

So far Im reading about cascades using CV

ripe forge Apr 19, 2021, 9:00 PM

#

" I have created a CNN model that detects a certain object." did you train a classifier or an object detection model?

robust charm Apr 19, 2021, 9:09 PM

#

A classifier

serene scaffold Apr 19, 2021, 10:07 PM

#

lapis sequoia For what is useful ai?

It's useful for when you want to automate decision making for decisions that can't be reduced to a sequence of if statements

flint mason Apr 19, 2021, 10:08 PM

#

How can we parse java script using beautiful soup that we returned from an API

serene scaffold Apr 19, 2021, 10:08 PM

#

flint mason How can we parse java script using beautiful soup that we returned from an API

This doesn't sound like a data science question. Try asking in #web-development

flint mason Apr 19, 2021, 10:09 PM

#

Umm its web scrapping in python for data analysis and ML

serene scaffold Apr 19, 2021, 10:10 PM

#

flint mason Umm its web scrapping in python for data analysis and ML

Then this would be the channel to visit once you have the data. Though be sure that you're allowed to scrape the websites you're scraping from

flint mason Apr 19, 2021, 10:10 PM

#

Yeah I got developers account and access as student developer

velvet thorn Apr 19, 2021, 10:23 PM

#

depends on what kind of padding but

#

padding often means increasing the size of input data, adding zeros where necessary

#

masking is generally about removing certain (not necessarily contiguous) parts of data
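
A tiny NumPy illustration of the two ideas, with made-up numbers. Here the mask is used for subsetting, which is only one of the ways masking shows up in practice.

```python
import numpy as np

seq = np.array([5, 3, 8])

# Padding: grow the input to a fixed length by appending zeros.
padded = np.pad(seq, (0, 2))

# Masking: mark which positions hold real data and which are padding.
mask = np.arange(padded.size) < seq.size

# Use the mask so the padding does not affect a computation.
real_mean = padded[mask].mean()

print(padded)
print(mask)
print(real_mean)
```

Without the mask, the two padded zeros would drag the mean down.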

primal tulip Apr 19, 2021, 10:24 PM

#

I'm trying to gather data from a source while it streams, do some transformations, and stop the program if the source ends. I must be doing something wrong because I'm running out of memory.
How should I share the code? Is a screenshot ok?

exotic maple Apr 19, 2021, 10:37 PM

#

primal tulip I'm trying to gather data from a source while it streams, do some transformation...

if you're running out of memory it might be because you're pulling too much info at once, have you looked into Async or batch requests?

primal tulip Apr 19, 2021, 10:39 PM

#

I'm (trying) to use chunksize with Pandas' read_csv().

#

That is being passed as a module.

#

And called from this. The issue is at the While loop.

#

I'm not sure if I should change the open_csv() function to YIELD instead of RETURN. That way I could use a generator, but I'm not aware on how could I use it.
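
A sketch of the generator approach, since the actual code was shared as screenshots. `iter_chunks` is a hypothetical stand-in for `open_csv()`, the `dropna()` call stands in for whatever transforms are needed, and an in-memory string replaces the real file so the example runs on its own.

```python
import io

import pandas as pd


def iter_chunks(source, chunksize):
    """Yield one transformed chunk at a time instead of returning everything."""
    for chunk in pd.read_csv(source, chunksize=chunksize):
        # Placeholder transform: drop rows with missing values.
        yield chunk.dropna()


# Tiny in-memory CSV standing in for the large file.
csv_text = "a,b\n1,2\n3,\n5,6\n7,8\n"

total = 0
for chunk in iter_chunks(io.StringIO(csv_text), chunksize=2):
    # Reduce each chunk to a small result and let the chunk go.
    total += chunk["a"].sum()

print(total)
```

The loop ends by itself when the source is exhausted, so no explicit `while` condition is needed. Memory stays bounded only if chunks are not collected into a list or concatenated along the way.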

primal tulip Apr 19, 2021, 10:44 PM

#

exotic maple if you're running out of memory it might be because you're pulling too much info...

Oh and thanks for your answer by the way. Are those python packages?

grave frost Apr 19, 2021, 10:45 PM

#

velvet thorn masking is generally about removing certain (not necessarily contiguous) parts o...

I would say more accurately hiding it - the data is always present, just not visible for the model (like in an attention mask) for theoretical purposes

serene scaffold Apr 19, 2021, 11:04 PM

#

primal tulip I'm trying to gather data from a source while it streams, do some transformation...

Copying and pasting the file with markdown is strongly preferred to screenshots

#

!code

arctic wedgeBOT Apr 19, 2021, 11:04 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

exotic maple Apr 19, 2021, 11:04 PM

#

no. There are packages for async, but in general async is a "type" of programming. A batch request is what the name says: instead of requesting all the data at once, you request N responses, hold, and then continue
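
The batching idea might look like this in plain Python. This is only a sketch: `batched` is a hypothetical helper, and `range(7)` stands in for whatever stream or API actually supplies the records.

```python
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of at most batch_size items until the source runs out."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # source is exhausted, so the generator stops
        yield batch


batches = list(batched(range(7), 3))
print(batches)
```

Each batch can be processed and discarded before the next one is requested, which is what keeps memory use flat.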

primal tulip Apr 19, 2021, 11:05 PM

#

Thank you man I'll use it from now on.

exotic maple Apr 19, 2021, 11:05 PM

#

you're running out of memory reading a CSV with pandas?

#

what's the filesize lol

#

ive opened 12GB no problem

#

I thought you said you were requesting from an API

#

@primal tulip take a look at this article from KDNuggets

#

looks like your same problem

#

https://www.kdnuggets.com/2021/03/pandas-big-data-better-options.html

KDnuggets

Are You Still Using Pandas to Process Big Data in 2021? Here are tw...

When its time to handle a lot of data -- so much that you are in the realm of Big Data -- what tools can you use to wrangle the data, especially in a notebook environment? Pandas doesn’t handle really Big Data very well, but two other libraries do. So,…

primal tulip Apr 19, 2021, 11:08 PM

#

exotic maple no. There are packages for Async, but in general async is a "type" of programmin...

I'll definitely read about it. I was trying to use generators, but I'm failing miserably. Oh and thanks for the link.
I'm doing a test with the csv to isolate the problem. I do some transforms to the data then proceed, the chunksize is small (1k rows) and the csv is like 56 gb, and I'm on 16gb of ram.

velvet thorn Apr 20, 2021, 12:02 AM

#

grave frost I would say more accurately hiding it - the data is always present, just not vis...

depends on what kind of masking I guess

#

like

#

subsetting with a mask

#

but yes, in general I would say that is more correct

mystic turtle Apr 20, 2021, 12:17 AM

#

hi guys im new here, currently im doing my assignment and i faced this error, need some help to solve it

#

im doing nlp with textblob, but the error showing that is a name error with textblob

#

NameError: name 'TextBlob' is not defined

#

i had import the libraries textblob

#

please help me with this, thanks in advance

velvet thorn Apr 20, 2021, 12:27 AM

#

mystic turtle i had import the libraries textblob

you need to import the TextBlob object

#

from the textblob library

mystic turtle Apr 20, 2021, 12:28 AM

#

from textblob import TextBlob

#

this is the code right?i had done it

velvet thorn Apr 20, 2021, 12:29 AM

#

mystic turtle this is the code right?i had done it

show all your code as text

#

not as a screenshot

arctic wedgeBOT Apr 20, 2021, 12:33 AM

#

Hey @mystic turtle!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

mystic turtle Apr 20, 2021, 12:35 AM

#

!code-blocks

arctic wedgeBOT Apr 20, 2021, 12:35 AM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

mystic turtle Apr 20, 2021, 12:37 AM

#

```py
import textblob as Textblob

pol = lambda x: TextBlob(x).sentiment.polarity
sub = lambda x: TextBlob(x).sentiment.subjectivity

df1[b'Subjectivity'] = df1[b'comment'].apply(sub)
df1[b'Polarity'] = df1[b'comment'].apply(pol)
```

velvet thorn Apr 20, 2021, 12:39 AM

#

mystic turtle ```py import textblob as Textblob pol = lambda x: TextBlob(x).sentiment.polarit...

no

#

do you know what as does when importing?

ionic drum Apr 20, 2021, 12:39 AM

#

hey can someone help me out real quick with a program I'm trying to do?

velvet thorn Apr 20, 2021, 12:39 AM

#

I'm p sure you want from textblob import TextBlob
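
To illustrate what `as` does, here is a sketch that uses the standard-library `fractions` module as a stand-in so it runs without textblob installed. `import module as alias` binds the whole module to the alias, while `from module import Name` binds one object from inside it. In the posted code the module ended up under the name `Textblob`, so the differently cased `TextBlob` was never defined.

```python
import fractions as Fractions   # binds the MODULE under the name Fractions
from fractions import Fraction  # binds the CLASS Fraction directly

# The alias is a module, not the class, so calling it fails.
try:
    Fractions(1, 2)
    alias_is_callable = True
except TypeError:
    alias_is_callable = False

half = Fraction(1, 2)                 # works: this is the class
same_half = Fractions.Fraction(1, 2)  # also works: class reached via the module

print(alias_is_callable, half == same_half)
```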

ionic drum Apr 20, 2021, 12:40 AM

#

I need to calculate something from info in a file.

velvet thorn Apr 20, 2021, 12:40 AM

#

ionic drum I need to calculate something from info in a file.

just post your question

#

and someone will get around to it, or not

#

(hopefully the former) 😉

mystic turtle Apr 20, 2021, 12:41 AM

#

velvet thorn do you know what `as` does when importing?

not really

mystic turtle Apr 20, 2021, 12:41 AM

#

velvet thorn I'm p sure you want `from textblob import TextBlob`

this helps ,thanks for the answer

ionic drum Apr 20, 2021, 12:42 AM

#

AGE VEF HT SEX SMOKE
11 3.2220 72.0 1 0
10 2.5920 65.0 1 0
13 3.1930 70.0 1 0
11 1.6940 60.0 1 1
14 3.9570 72.0 1 1
11 2.3460 59.0 0 0
13 4.7890 69.0 1 1

#

ok I have this data and I need to calculate the average of the 3rd column for those who are smoke=1 and smoke = 0

velvet thorn Apr 20, 2021, 12:42 AM

#

mystic turtle not really

I suggest

#

you focus more on your Python basics

#

I mean

#

I get that machine learning is fun but

#

fundamentals are important

#

and how to import is REALLY fundamental

#

is that a DataFrame or what

ionic drum Apr 20, 2021, 12:43 AM

#

yeah I sent nothing haha

#

sent by accident

velvet thorn Apr 20, 2021, 12:43 AM

#

so

#

yes?

ionic drum Apr 20, 2021, 12:43 AM

#

yes

velvet thorn Apr 20, 2021, 12:43 AM

#

okay

#

that's quite a simple question

#

so instead of an answer

#

I'm going to give you a hint

#

you need a groupby

fickle tinsel Apr 20, 2021, 12:44 AM

#

hello

velvet thorn Apr 20, 2021, 12:44 AM

#

do you know what groupby is?

fickle tinsel Apr 20, 2021, 12:44 AM

#

i need help

velvet thorn Apr 20, 2021, 12:44 AM

#

fickle tinsel i need help

just post your question.

ionic drum Apr 20, 2021, 12:44 AM

#

nope but actually my prof said that we should do it just by using basic functions

fickle tinsel Apr 20, 2021, 12:44 AM

#

AUC area under the curve

mystic turtle Apr 20, 2021, 12:44 AM

#

velvet thorn you focus more on your Python basics

ya, i would like to do that first, but college are fast , they can't do baby sitting

velvet thorn Apr 20, 2021, 12:44 AM

#

mystic turtle ya, i would like to do that first, but college are fast , they can't do baby sit...

you need to learn faster then

#

on your own time

mystic turtle Apr 20, 2021, 12:44 AM

#

this is where i can't catch up with the syllabus

ionic drum Apr 20, 2021, 12:44 AM

#

it's a practice example

velvet thorn Apr 20, 2021, 12:45 AM

#

fickle tinsel AUC area under the curve

so what's your question

velvet thorn Apr 20, 2021, 12:45 AM

#

ionic drum nope but actually my prof said that we should do it just by using basic function...

what do you consider "basic"

#

you can also do that with filtering and aggregation

#

do you know how to filter a DataFrame?

ionic drum Apr 20, 2021, 12:45 AM

#

basically I can use the open/close stuff but I only use if/while/for, etx

#

etc

#

this is what I have

mystic turtle Apr 20, 2021, 12:45 AM

#

thanks for the advice, i will try as much as i can to improve my basics

velvet thorn Apr 20, 2021, 12:45 AM

#

huh

velvet thorn Apr 20, 2021, 12:45 AM

#

ionic drum basically I can use the open/close stuff but I only use if/while/for, etx

wait just to be clear

fickle tinsel Apr 20, 2021, 12:46 AM

#

velvet thorn Apr 20, 2021, 12:46 AM

#

you said it was a DataFrame?

#

like a pandas DataFrame?

ionic drum Apr 20, 2021, 12:46 AM

#

nonono

velvet thorn Apr 20, 2021, 12:46 AM

#

fickle tinsel

don't post stuff as a screenshot, it's super hard to read

ionic drum Apr 20, 2021, 12:46 AM

#

wait

fickle tinsel Apr 20, 2021, 12:46 AM

#

I am having trouble with the formula

#

oh

#

sorry

ionic drum Apr 20, 2021, 12:46 AM

#

ok now it's good

velvet thorn Apr 20, 2021, 12:46 AM

#

fickle tinsel I am having trouble with the formula

what trouble

#

are you having

#

do you understand the concept?

#

or do you have questions about it too

ionic drum Apr 20, 2021, 12:46 AM

#

```py
    return open(path,'rb').read().decode('utf-8')

# Function to write to the file
def writeFile(path,texte):
    f=open(path,'wb')
    f.write(texte.encode('utf-8'))
    f.close()

# Split the file's text into lines
def decouperEnLignes(contenu):
    lignes = contenu.split('\n')
    if lignes[-1] =='':
        lignes.pop()
    return lignes


path=input("Insert path of the data file.")
```

#

this is what I have

velvet thorn Apr 20, 2021, 12:47 AM

#

ionic drum this is what I have

do you know what a DataFrame is?

fickle tinsel Apr 20, 2021, 12:47 AM

#

basically you need to loop over two sets of arrays

ionic drum Apr 20, 2021, 12:47 AM

#

nope

velvet thorn Apr 20, 2021, 12:47 AM

#

ionic drum nope

it's a specific kind of object used to deal with data

ionic drum Apr 20, 2021, 12:47 AM

#

aaaah

velvet thorn Apr 20, 2021, 12:47 AM

#

and what you have is not a DataFrame

#

so

velvet thorn Apr 20, 2021, 12:47 AM

#

ionic drum yes

this is wrong

#

but anyway

ionic drum Apr 20, 2021, 12:47 AM

#

yeah sorry this is first year coding

velvet thorn Apr 20, 2021, 12:47 AM

#

I'm going to assume

#

what you have

fickle tinsel Apr 20, 2021, 12:47 AM

#

using a for loop to declare and initialize it

velvet thorn Apr 20, 2021, 12:47 AM

#

is a list of lists

#

is that correct?

ionic drum Apr 20, 2021, 12:47 AM

#

yes

velvet thorn Apr 20, 2021, 12:47 AM

#

sure?

#

like it's okay to be not sure

ionic drum Apr 20, 2021, 12:48 AM

#

yes I have list of lists

velvet thorn Apr 20, 2021, 12:48 AM

#

okay

velvet thorn Apr 20, 2021, 12:48 AM

#

ionic drum yes I have list of lists

do you know how to access an element in a list?

ionic drum Apr 20, 2021, 12:48 AM

#

yes

velvet thorn Apr 20, 2021, 12:48 AM

#

fickle tinsel basically you need to loop an two set of array

yup

#

have you written code already?

fickle tinsel Apr 20, 2021, 12:48 AM

#

no

velvet thorn Apr 20, 2021, 12:48 AM

#

okay

#

are you having trouble there?

fickle tinsel Apr 20, 2021, 12:48 AM

#

I already created the list

ionic drum Apr 20, 2021, 12:49 AM

#

here's the deal i'm in this class and I know how to do this but It uses methods not used in class

#

and they get all pissed

velvet thorn Apr 20, 2021, 12:49 AM

#

ionic drum here's the deal i'm in this class and I know how to do this but It uses methods ...

you can do everything

#

with a simple loop

fickle tinsel Apr 20, 2021, 12:49 AM

#

I got confused trying to lay down the logic

ionic drum Apr 20, 2021, 12:49 AM

#

ok 🙂

velvet thorn Apr 20, 2021, 12:49 AM

#

it's just

#

weird

#

for example...

#

!e

numbers = [3, 6, 1, 3]
accumulator = 0
count = 0

for number in numbers:
    accumulator += number
    count += 1

print(f'The mean is {accumulator / count}')

arctic wedgeBOT Apr 20, 2021, 12:50 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

The mean is 3.25

velvet thorn Apr 20, 2021, 12:50 AM

#

yes?

#

same concept

#

just scaled up because you also need to extract an element from the inner list
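
Scaled up to the table posted earlier, the same accumulator pattern could look like the sketch below. It assumes the "3rd column" means HT (index 2) and that SMOKE is the last column (index 4), and it builds the list of lists from a pasted string rather than from the file helpers in the chat.

```python
data_text = """AGE VEF HT SEX SMOKE
11 3.2220 72.0 1 0
10 2.5920 65.0 1 0
13 3.1930 70.0 1 0
11 1.6940 60.0 1 1
14 3.9570 72.0 1 1
11 2.3460 59.0 0 0
13 4.7890 69.0 1 1"""

# Skip the header line, then split each line into a list of strings.
rows = [line.split() for line in data_text.split("\n")[1:]]

sum_smokers = 0.0
count_smokers = 0
sum_non_smokers = 0.0
count_non_smokers = 0

for row in rows:
    height = float(row[2])  # extract the 3rd column from the inner list
    smoke = int(row[4])
    if smoke == 1:
        sum_smokers += height
        count_smokers += 1
    else:
        sum_non_smokers += height
        count_non_smokers += 1

mean_smokers = sum_smokers / count_smokers
mean_non_smokers = sum_non_smokers / count_non_smokers
print(mean_smokers, mean_non_smokers)
```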

velvet thorn Apr 20, 2021, 12:50 AM

#

fickle tinsel I got confused trying to lay down the logic

confused with what specifically?

ionic drum Apr 20, 2021, 12:50 AM

#

ok I see

fickle tinsel Apr 20, 2021, 12:50 AM

#

the formula

#

AUC

velvet thorn Apr 20, 2021, 12:51 AM

#

fickle tinsel AUC

where do A, delta-X, and y come from?

fickle tinsel Apr 20, 2021, 12:52 AM

#

that's the formula for Area under the curve

velvet thorn Apr 20, 2021, 12:52 AM

#

fickle tinsel that's the formula for Area under the curve

I mean

#

what I'm asking is

#

okay maybe show the whole assignment or something

fickle tinsel Apr 20, 2021, 12:53 AM

#

can I pm

#

or should I share screenshot

velvet thorn Apr 20, 2021, 12:54 AM

#

uh

#

I'm guessing

#

wait

#

screenshot I guess

fickle tinsel Apr 20, 2021, 12:55 AM

#

okay

#

velvet thorn Apr 20, 2021, 12:57 AM

#

fickle tinsel okay

why can't you take a screenshot

#

with your computer

velvet thorn Apr 20, 2021, 12:58 AM

#

fickle tinsel

okay so

#

you know how to create two lists

#

right?

#

one for x and one for y

fickle tinsel Apr 20, 2021, 12:58 AM

#

yes

#

i did that already using numpy

velvet thorn Apr 20, 2021, 12:59 AM

#

okay

#

oh you're supposed to use arrays

#

right

#

fair

velvet thorn Apr 20, 2021, 1:00 AM

#

fickle tinsel yes

do you know how to create

#

the delta-x array?

fickle tinsel Apr 20, 2021, 1:00 AM

#

no

velvet thorn Apr 20, 2021, 1:01 AM

#

fickle tinsel no

okay

fickle tinsel Apr 20, 2021, 1:01 AM

#

i do know it's the change in x

modern phoenix Apr 20, 2021, 1:01 AM

#

I have a pandas table that has a real number field 'success' and another field X that is an integer. I have a suspicion that success is correlated to X; is a scatter plot the best way to start with the hypothesis? Also, if there is a better channel for this, please let me know!

velvet thorn Apr 20, 2021, 1:01 AM

#

so

velvet thorn Apr 20, 2021, 1:01 AM

#

fickle tinsel i do know its the chage in x

it's basically

#

[x1 - x0, x2 - x1, x3 - x2...xn - x(n - 1)], right?

velvet thorn Apr 20, 2021, 1:01 AM

#

modern phoenix I have a pandas table that has a real number field 'success' and another field X...

a scatter plot would be a nice way to visualise it

#

but

fickle tinsel Apr 20, 2021, 1:01 AM

#

yes

velvet thorn Apr 20, 2021, 1:01 AM

#

what kind of correlation are you thinking of?

velvet thorn Apr 20, 2021, 1:01 AM

#

fickle tinsel yes

okay

#

so

#

how would you get

#

these two arrays?

#

[x1, x2, x3...xn]

#

and

modern phoenix Apr 20, 2021, 1:02 AM

#

@velvet thorn I don't have a stats background, but I suspect the higher X is the worse success will be on average

velvet thorn Apr 20, 2021, 1:02 AM

#

[x0, x1, x2...x(n - 1)]?

#

think about that

velvet thorn Apr 20, 2021, 1:02 AM

#

modern phoenix @velvet thorn I don't have a stats background, but I suspect the higher...

hm

#

that wasn't really what I meant

#

more like...

#

are you talking about linear correlation?

#

or nonlinear correlation

modern phoenix Apr 20, 2021, 1:03 AM

#

I don't understand the difference. They are independent variables but I'm trying to understand what is affecting my 'success' score

velvet thorn Apr 20, 2021, 1:03 AM

#

modern phoenix I don't understand the difference. They are independent variables but I'm trying...

okay

#

imagine this

#

hm

modern phoenix Apr 20, 2021, 1:03 AM

#

and so far in the 20+ cols, the thing that stands out is X (just eyeballing it)

velvet thorn Apr 20, 2021, 1:03 AM

#

modern phoenix I don't understand the difference. They are independent variables but I'm trying...

basically

fickle tinsel Apr 20, 2021, 1:03 AM

#

velvet thorn and

by having a for-loop function?

velvet thorn Apr 20, 2021, 1:03 AM

#

say you have two correlated variables

#

x and y

#

now, when x changes, y changes too

#

we can think of this roughly as "when x changes by an amount dx, y changes by an amount k * dx, on average"

#

where k is a fixed number.

velvet thorn Apr 20, 2021, 1:04 AM

#

velvet thorn where k is a fixed number.

if this is true, it's linear correlation

modern phoenix Apr 20, 2021, 1:05 AM

#

it's most likely not like that

velvet thorn Apr 20, 2021, 1:05 AM

#

if it is not, it's nonlinear
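
A small sketch of why the distinction matters (the `pearson_r` helper is hand-written here for illustration): Pearson's correlation coefficient only measures the linear kind, so a variable that depends completely on x through a curve can still score zero.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xs = [-2, -1, 0, 1, 2]
linear = [3 * x + 1 for x in xs]  # y changes by k * dx with k = 3
curved = [x ** 2 for x in xs]     # y depends on x, but not linearly

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)
print(r_linear, r_curved)
```

This is one reason a scatter plot is a useful first step: it shows a curved relationship that a single correlation number would hide.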

modern phoenix Apr 20, 2021, 1:05 AM

#

success is generally 0 or 1

velvet thorn Apr 20, 2021, 1:05 AM

#

but anyway, a scatterplot would do well

velvet thorn Apr 20, 2021, 1:05 AM

#

modern phoenix success is generally 0 or 1

in this case

modern phoenix Apr 20, 2021, 1:05 AM

#

but sometimes 0.24 but that's rare

velvet thorn Apr 20, 2021, 1:05 AM

#

it could be correlation

#

with the probability of success

modern phoenix Apr 20, 2021, 1:05 AM

#

and I have a feeling if X is very high, like 35000 then success is probably going to be 0

velvet thorn Apr 20, 2021, 1:05 AM

#

or something called the log-odds

#

anyway

#

so success

#

is categorical?

#

i.e. 1 or 0

modern phoenix Apr 20, 2021, 1:05 AM

#

pretty much but as I mentioned it can be fractional

velvet thorn Apr 20, 2021, 1:06 AM

#

why?

#

okay

#

never mind that

#

one simple thing you can do is

#

group by success value

#

and get the mean/median

#

of X

modern phoenix Apr 20, 2021, 1:06 AM

#

think of a headshot kill, you either miss or kill but sometimes they can get off wounded

velvet thorn Apr 20, 2021, 1:06 AM

#

fickle tinsel by having a for-loop function?

nope

#

try slicing

velvet thorn Apr 20, 2021, 1:06 AM

#

modern phoenix think of a headshot kill, you either miss or kill but somtimes they can get off ...

how many unique values

#

are there

#

for success?

modern phoenix Apr 20, 2021, 1:06 AM

#

not what I'm dealing with but will give you an idea

#

i.e, say anything > 0 is success

velvet thorn Apr 20, 2021, 1:07 AM

#

modern phoenix i.e, say anything > 0 is success

meaning 2?

fickle tinsel Apr 20, 2021, 1:07 AM

#

velvet thorn try slicing

slicing? how would you do that?

velvet thorn Apr 20, 2021, 1:07 AM

#

fickle tinsel slicing? how would you do that?

do you know what slicing is?

modern phoenix Apr 20, 2021, 1:07 AM

#

no, success is in the range [0, 1]

velvet thorn Apr 20, 2021, 1:07 AM

#

modern phoenix i.e, say anything > 0 is success

you just said this

#

which suggests quantisation

fickle tinsel Apr 20, 2021, 1:07 AM

#

velvet thorn do you know what slicing is?

no i have no idea

velvet thorn Apr 20, 2021, 1:07 AM

#

fickle tinsel no i have no idea

time to Google it 🙂
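
For reference, the slicing hint can be sketched with NumPy. The assignment's formula was shared as a screenshot and is not in the log, so the trapezoid rule used for the area here is an assumption, and the sample `x` and `y` values are made up.

```python
import numpy as np

x = np.array([0.0, 1.0, 3.0, 6.0])
y = np.array([0.0, 2.0, 2.0, 4.0])

# x[1:] is [x1, x2, ..., xn]; x[:-1] is [x0, x1, ..., x(n-1)].
delta_x = x[1:] - x[:-1]

# Assumed trapezoid rule: each strip's width times its average height.
mean_height = (y[1:] + y[:-1]) / 2
area = float(np.sum(delta_x * mean_height))

print(delta_x, area)
```

Subtracting the two slices gives every consecutive difference at once, with no explicit loop.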

fickle tinsel Apr 20, 2021, 1:07 AM

#

ok

modern phoenix Apr 20, 2021, 1:08 AM

#

velvet thorn meaning 2?

ah, sorry I misread thinking you meant success could be > 1

velvet thorn Apr 20, 2021, 1:08 AM

#

modern phoenix ah, sorry I misread thinking you meant success could be > 1

yeah so let's assume

#

0 or 1

#

i.e. round up

#

now

#

find the mean/median

#

of X

#

for each group

#

i.e. success = 0, and success > 0

#

that's a very quick and dirty way
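
A sketch of that quick check in pandas, with made-up numbers; only the column names `X` and `success` come from the discussion, and the `outcome` label column is a hypothetical helper.

```python
import pandas as pd

# Hypothetical data: success is mostly 0 or 1, occasionally fractional.
df = pd.DataFrame({
    "X": [100, 200, 35000, 150, 30000, 250],
    "success": [1.0, 0.24, 0.0, 1.0, 0.0, 1.0],
})

# Collapse success into two groups: exactly 0 versus anything above 0.
df["outcome"] = (df["success"] > 0).map({True: "success > 0", False: "success = 0"})

# Mean and median of X within each group.
summary = df.groupby("outcome")["X"].agg(["mean", "median"])
print(summary)
```

If the two rows of `summary` differ a lot, that hints at a relationship worth testing properly; similar values do not rule one out.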

modern phoenix Apr 20, 2021, 1:08 AM

#

@velvet thorn you mean mean, median of X? the dependent variable?

velvet thorn Apr 20, 2021, 1:09 AM

#

modern phoenix @velvet thorn you mean mean, median of X? the dependent variable?

yes

#

wait

#

I thought you said

#

success was the dependent variable

#

well anyway it doesn't matter

#

the idea is that

#

if the two groups

#

have a wildly different X value

#

that suggests that there's some relationship

#

either way

modern phoenix Apr 20, 2021, 1:11 AM

#

velvet thorn success was the dependent variable

sorry, my terminology is backwards.. don't know the term for X if X is hypothesized to affect success

velvet thorn Apr 20, 2021, 1:11 AM

#

modern phoenix sorry, my terminology is backwards.. don't know the term for X if X is hypothesi...

okay

#

basically

#

you have a hypothesis

#

that variable A

#

affects variable B

#

therefore

#

B is the dependent variable

#

because it depends on A, yes?

#

and by extension A is the independent variable

#

for obvious reasons

#

in other words, you ask

#

"if I were to increase A by 10%, how would B change?"

#

and not the other way round

modern phoenix Apr 20, 2021, 1:13 AM

#

thank you

#

trying a groupby on success > 0

#

@velvet thorn unfortunately mean, median don't really look different....

sour thunder Apr 20, 2021, 3:44 AM

#

velvet thorn B is the *dependent* variable

!!

velvet thorn Apr 20, 2021, 5:12 AM

#

modern phoenix @velvet thorn unfortunately mean, median don't really look different......

well

#

sometimes it can be hard to eyeball

inland isle Apr 20, 2021, 8:11 AM

#

i am learning ml from the past 2 months, shall i first clear the concepts of ml algos or shall i put more focus on the data manipulation (feature engineering) part?

royal lintel Apr 20, 2021, 9:25 AM

#

Hey, anyone knows if it is underfitting and if so, how to fix it?

#

I can provide code samples if needed

primal tulip Apr 20, 2021, 9:33 AM

#

royal lintel Hey, anyone knows if it is underfitting and if so, how to fix it?

I think it's a bit vague without the code and what are you actually implementing. But you could always run more iterations and see how it evolves and when compared to the random test sample, you gain some insight on the result.

royal lintel Apr 20, 2021, 9:49 AM

#

I'm working on a dataset from kaggle - https://www.kaggle.com/rashikrahmanpritom/heart-attack-analysis-prediction-dataset and got something like this after some time

```py
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow as tf
import pandas as pd
import keras
from sklearn.model_selection import train_test_split
import numpy as np

physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

df = pd.read_csv('datasets/heart.csv')
train_df, test_df = train_test_split(df, test_size=0.3, random_state=42)

train_x = train_df.drop('output', axis=1)
train_x = train_x.astype('float32')
train_x /= 255.0

train_y = train_df.output.values
train_y = train_y.astype('float32')

test_x = test_df.drop('output', axis=1)
test_x = test_x.astype('float32')
test_x /= 255.0

test_y = test_df.output.values
test_y = test_y.astype('float32')

model = keras.Sequential([
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(2, activation='sigmoid')
])

model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(),
    optimizer='adam',
    metrics=['accuracy']
)

model.fit(train_x, train_y, epochs=5, batch_size=2)
model.evaluate(test_x, test_y)
```

Heart Attack Analysis & Prediction Dataset

A dataset for heart attack classification

#

Layers which you see now are just by trying various things (more, less denses/dropouts etc.) but overall it always showed about 60% accuracy at fitting data, and 70% at eval

grave frost Apr 20, 2021, 10:26 AM

#

royal lintel Layers which you see now are just by trying various things (more, less denses/dr...

what's the problem?

royal lintel Apr 20, 2021, 10:27 AM

#

royal lintel Hey, anyone knows if it is underfitting and if so, how to fix it?

@grave frost

grave frost Apr 20, 2021, 10:28 AM

#

royal lintel @grave frost

lower learning rate

#

*slightly

royal lintel Apr 20, 2021, 10:30 AM

#

any ideas on how to improve it? Idk if loss/denses are done properly and cant find any solutions. I assume data is not made wrongly since it works, but can be wrong

grave frost Apr 20, 2021, 10:30 AM

#

lowering learning rate should help a bit - along with adjusting dense, dropout, conv etc.

royal lintel Apr 20, 2021, 11:20 AM

#

Is it okay when training set accuracy varies but overall increases? E.g. 50, 40, 60, 55, 70. Tutorials always show accuracy that increases 'linearly' in some way

ripe forge Apr 20, 2021, 11:45 AM

#

How is the test accuracy performing? That's the more important metric you should be looking at

grave frost Apr 20, 2021, 1:09 PM

#

Id rather use k-fold for smaller datasets
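
As a rough sketch of what k-fold does (the helper `k_fold_indices` is hand-written for illustration and does no shuffling; in practice scikit-learn's `KFold` would be the usual tool): the data is cut into k parts, and each part takes one turn as the test set while the rest is used for training.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(10, 3)
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Averaging the score over all folds gives a steadier estimate than a single train/test split, which matters most when the dataset is small.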

delicate garnet Apr 20, 2021, 2:14 PM

#

hi is someone good with streamlit?

glass cedar Apr 20, 2021, 2:36 PM

#

I have a question about CNNs. I'm putting the image of a frog into the convolutional and pooling layers of a CNN, and I'm plotting the output after every layer. With the image on the right, I'm wondering how the CNN is able to determine that it's a frog from something that blurry

#

For nearly any image, by the time it goes through to the right most image, it's become an amorphous blob, and I'm not sure how training weights of the kernels will help make any identifications

tidal bough Apr 20, 2021, 2:39 PM

#

The intermediate results of a neural network will generally not "mean" anything to humans.

burnt meadow Apr 20, 2021, 3:04 PM

#

hey guys, wondering if it's possible to use data from an excel sheet as inputs and outputs in pytorch and how to do so

glass cedar Apr 20, 2021, 3:20 PM

#

i think so, i've done it by saving it as a csv and reading it using pandas

#

then from pandas to pytorch should be trivial

burnt meadow Apr 20, 2021, 3:41 PM

#

Thank you appreciate it

digital aurora Apr 20, 2021, 3:45 PM

#

Hello people!!

#

Any data scientist available?

serene scaffold Apr 20, 2021, 3:51 PM

#

digital aurora Any data scientist available?

What is your question?

digital aurora Apr 20, 2021, 3:52 PM

#

Just wanted some guidance as a newcomer to this field

serene scaffold Apr 20, 2021, 3:52 PM

#

digital aurora Just wanted some guidance as a new Comer to this field

What is your current math and programming background?

digital aurora Apr 20, 2021, 3:53 PM

#

Currently pursuing Engineering in computer science.

serene scaffold Apr 20, 2021, 3:54 PM

#

digital aurora Currently pursuing Engineering in computer science.

Have you taken linear algebra and statistics? How much programming have you done?

digital aurora Apr 20, 2021, 3:54 PM

#

Currently in 1st year. I know python language completely.

digital aurora Apr 20, 2021, 3:55 PM

#

serene scaffold Have you taken linear algebra and statistics? How much programming have you done...

No, not linear algebra.

#

Although I have started studying statistics for data science from online resources.

#

Currently I am studying normal distribution in statistics.

serene scaffold Apr 20, 2021, 4:07 PM

#

digital aurora No, not linear algebra.

A lot of approaches to AI (especially those termed "deep learning") depend on linear algebra, so I would plan to take a course in it

#

Another thing you can do to get started is to get comfortable with numpy and pandas.

digital aurora Apr 20, 2021, 4:08 PM

#

Well, I would jump to it a bit later once my stats portion is complete.

digital aurora Apr 20, 2021, 4:09 PM

#

serene scaffold Another thing you can do to get started is to get comfortable with numpy and pan...

Can you please tell how much knowledge of python is required for it?

#

Like I know the complete basic python language.

#

Is it enough?

serene scaffold Apr 20, 2021, 4:10 PM

#

digital aurora Can you please tell how much knowledge of python is required for it?

Numpy and pandas use the language in a special way. Operations with numpy arrays or pandas dataframes that look atomic (like a + b) are actually iterative.
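
A short sketch of what that means: `a + b` on arrays is a single expression, but it is applied element by element, giving the same result as an explicit loop over both arrays.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

# Looks like one operation, but adds each pair of elements.
vectorized = a + b

# The equivalent explicit loop in plain Python.
looped = np.array([x + y for x, y in zip(a, b)])

print(vectorized, np.array_equal(vectorized, looped))
```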

digital aurora Apr 20, 2021, 4:11 PM

#

I don't know anything about data structures and algorithms...so do I need to study them also?

ripe forge Apr 20, 2021, 4:12 PM

#

Those are good to know in general

digital aurora Apr 20, 2021, 4:13 PM

#

But I guess they are not so important for data science field

serene scaffold Apr 20, 2021, 4:13 PM

#

digital aurora I don't know anything about Data structures and algorithms...so do I need study ...

Those are going to be part of your general CS education. They're a classic way of teaching runtime complexity.

glass cedar Apr 20, 2021, 4:14 PM

#

i don't think you'd use them algos & dat structs directly in data science, but it wouldn't be good not to know them

serene scaffold Apr 20, 2021, 4:14 PM

#

Understanding runtime complexity is key to understanding why, for example, you shouldn't keep appending a dataframe
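
A sketch of the point about appending, with made-up rows: growing a DataFrame one row at a time copies all existing rows on every step, so the total work grows roughly quadratically, while collecting the rows in a list and building the DataFrame once copies each row only once.

```python
import pandas as pd

rows = [{"step": i, "value": i * i} for i in range(5)]

# Slow pattern: every concat allocates a new frame and copies the old rows.
slow = pd.DataFrame([rows[0]])
for row in rows[1:]:
    slow = pd.concat([slow, pd.DataFrame([row])], ignore_index=True)

# Preferred pattern: accumulate plain Python objects, build the frame once.
fast = pd.DataFrame(rows)

print(fast)
```

Both frames hold the same data; only the cost of getting there differs, and the gap widens as the row count grows.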

digital aurora Apr 20, 2021, 4:15 PM

#

See, data structures and algorithms are a part of my CS curriculum, I will have to study it..but if its not much used in Data science field, I won't put much stress over it then.

serene scaffold Apr 20, 2021, 4:16 PM

#

digital aurora See, data structures and algorithms are a part of my CS curriculum, I will have ...

I would encourage you to learn them

ripe forge Apr 20, 2021, 4:16 PM

#

CS is used in data science. It's like the foundation for your work. You never really use those things directly, but rather build your understanding of things on top of the concepts you learn there

digital aurora Apr 20, 2021, 4:16 PM

#

serene scaffold I would encourage you to learn them

Okay..

digital aurora Apr 20, 2021, 4:17 PM

#

ripe forge Cs is used in data science. It's like.. Its like foundations for your work. You ...

I see!

#

So as of now, I should start with numpy and pandas then?

serene scaffold Apr 20, 2021, 4:17 PM

#

Those would be good to know, yes

ripe forge Apr 20, 2021, 4:17 PM

#

Definitely

serene scaffold Apr 20, 2021, 4:18 PM

#

If you have time after that, SQL is the other language you want to know.

#

Unlike R or Java, its use case doesn't really overlap with Python.

ripe forge Apr 20, 2021, 4:18 PM

#

hears sql, breaks down and cries

serene scaffold Apr 20, 2021, 4:19 PM

#

ripe forge *hears sql, breaks down and cries*

Don't worry, I won't let the bad language hurt you

#

(it's not that bad)

ripe forge Apr 20, 2021, 4:19 PM

#

. /sniff you promise?

serene scaffold Apr 20, 2021, 4:19 PM

#

ripe forge . /sniff you promise?

Of course I do 🤗

digital aurora Apr 20, 2021, 4:19 PM

#

Firstly, let me study pandas and numpy...

#

Will ask you for further guidance then..🙂

serene scaffold Apr 20, 2021, 4:20 PM

#

The reason I throw in sql is that pandas is about working with tabular data that's in live memory
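
For contrast, a sketch of the SQL side using the standard-library `sqlite3` module. The table and column names are hypothetical, and `:memory:` is used only so the example is self-contained; a real database would normally live on disk or on a server, and only the query result would be pulled into Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (smoke INTEGER, ht REAL)")
conn.executemany(
    "INSERT INTO measurements (smoke, ht) VALUES (?, ?)",
    [(0, 72.0), (0, 65.0), (1, 60.0), (1, 72.0)],
)

# The database does the grouping; Python only receives the summary rows.
rows = conn.execute(
    "SELECT smoke, AVG(ht) FROM measurements GROUP BY smoke ORDER BY smoke"
).fetchall()
conn.close()

print(rows)
```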

digital aurora Apr 20, 2021, 4:21 PM

#

Btw, do companies hire data scientist after bachelor's?

#

Like I see people getting hired after their masters

ripe forge Apr 20, 2021, 4:22 PM

#

You can, though you do need a bit of luck with these kinds of things. I suppose it would depend on your country too

digital aurora Apr 20, 2021, 4:22 PM

#

I see!

glass cedar Apr 20, 2021, 4:25 PM

#

i'd imagine having many projects and internships under your belt upon graduation would help your job prospects after bachelors

serene scaffold Apr 20, 2021, 4:26 PM

#

digital aurora Btw, do companies hire data scientist after bachelor's?

I'm finishing up my cs bachelor's and I'm only applying for data scientist positions. The only reason I'm even remotely competitive for those positions is that I have formal research experience.

#

The courses really weren't enough.

glass cedar Apr 20, 2021, 4:28 PM

#

i'd consider taking linear algebra, multi-variate calc, probability, algos & dat structs, and machine learning as early as possible to be competitive for internships + give you time to work on projects

glass cedar Apr 20, 2021, 4:29 PM

#

serene scaffold I'm finishing up my cs bachelor's and I'm only applying for data scientist posit...

^ research also valuable

digital aurora Apr 20, 2021, 4:32 PM

#

I see!!

prime vortex Apr 20, 2021, 4:45 PM

#

anyone know how to use zobrist hash for chess game?

#

im confused

#

i mean i know the theory but idk how to implement it
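
One possible starting point, as a simplified sketch: the board is represented here as a dict from square index (0 to 63) to a piece letter, which is an assumption about the data layout, and side to move, castling rights, en passant and captures are left out. Each (piece, square) pair gets a fixed random 64-bit key, a position's hash is the XOR of the keys of its pieces, and a move updates the hash by XOR-ing the piece out of its old square and into its new one.

```python
import random

PIECES = "PNBRQKpnbrqk"  # uppercase = white, lowercase = black
rng = random.Random(2021)  # fixed seed so the keys are reproducible

# One random 64-bit key per (piece, square) combination.
ZOBRIST = {
    (piece, square): rng.getrandbits(64)
    for piece in PIECES
    for square in range(64)
}


def zobrist_hash(board):
    """Hash a full position given as {square_index: piece_letter}."""
    h = 0
    for square, piece in board.items():
        h ^= ZOBRIST[(piece, square)]
    return h


def apply_quiet_move(h, piece, from_sq, to_sq):
    """Update a hash incrementally for a non-capturing move."""
    h ^= ZOBRIST[(piece, from_sq)]  # remove the piece from its old square
    h ^= ZOBRIST[(piece, to_sq)]    # place it on the new square
    return h


board = {4: "K", 60: "k", 12: "P"}
h_before = zobrist_hash(board)
h_after = apply_quiet_move(h_before, "P", 12, 28)

board_after = {4: "K", 60: "k", 28: "P"}
print(h_after == zobrist_hash(board_after))
```

Because XOR undoes itself, applying the same move update a second time restores the original hash, which is what makes take-backs in a search cheap.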

grave frost Apr 20, 2021, 5:05 PM

#

you guys forgot ML competitions too!

#

while they won't be much of a direct factor, most people in the industry now recognize the value of Kaggle. they won't care about your kernels, but saying things like ("top 2% out x-grand data scientists") is nice

stuck socket Apr 20, 2021, 7:23 PM

#

sup

exotic maple Apr 20, 2021, 7:44 PM

#

serene scaffold The courses really weren't enough.

I don't think courses, including formal university courses, are enough for any career. Ultimately you should always try to know more than what you are taught :p

exotic maple Apr 20, 2021, 7:45 PM

#

serene scaffold If you have time after that, SQL is the other language you want to know.

smiles in postgresql

dapper halo Apr 20, 2021, 8:39 PM

#

Coming in hot with a super dumb question. I know how I could reorder this, but I'm sure there's a one liner that i'd much rather have.

If I have a dataframe with values x,y,z and a presorted array [y,z,x]. Without using .index, is there a way to reorder the dataframe to match the presorted array?

lavish tundra Apr 20, 2021, 9:13 PM

#

i'm trying to get closest values from a wrong typed string
i know i can do this

difflib.get_close_matches("mamoth", db['en'].astype(str))

the problem is: i need to look for the closest results not only in one column but in multiple columns. does someone know how to do that?

stuck socket Apr 20, 2021, 9:28 PM

#

lumi

stuck socket Apr 20, 2021, 9:29 PM

#

lavish tundra i'm trying to get closest values from a wrong typed string i know i can do this ...

lambada

lavish tundra Apr 20, 2021, 9:32 PM

#

i solved it doing a loop
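For reference, one way to search several columns without an explicit loop is to pool their values into a single candidate list first. This is only a sketch: the `db` dataframe below is a hypothetical stand-in, and the helper name `close_matches_in_columns` is made up for illustration.

```python
import difflib

import pandas as pd

# Hypothetical stand-in for the `db` dataframe from the question.
db = pd.DataFrame({
    "en": ["mammoth", "tiger", "whale"],
    "pt": ["mamute", "tigre", "baleia"],
})


def close_matches_in_columns(word, frame, columns, n=3, cutoff=0.6):
    """Search several columns at once by pooling their values into one list."""
    candidates = pd.unique(frame[columns].astype(str).to_numpy().ravel())
    return difflib.get_close_matches(word, candidates, n=n, cutoff=cutoff)


matches = close_matches_in_columns("mamoth", db, ["en", "pt"])
```

The result is ordered from most to least similar, so the first entry is the best guess across all the listed columns.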

flint mason Apr 20, 2021, 9:34 PM

#

I want to color a bar plot based on a true or false value from a dataframe

#

When I use this I lose the names from the plot

exotic maple Apr 20, 2021, 9:35 PM

#

dapper halo Coming in hot with a super dumb question. I know how I could reorder this, but I...

I mean, wouldnt it be easier to create the Dataframe FROM the array?

dapper halo Apr 20, 2021, 9:36 PM

#

exotic maple I mean, wouldnt it be easier to create the Dataframe *FROM* the array?

Lot more to it unfortunately haha. BUT I brute forced my way through with only two lines. So I guess im okay hahah

exotic maple Apr 20, 2021, 9:36 PM

#

dapper halo Lot more to it unfortunately haha. BUT I brute forced my way through with only t...

how did you do it?

dapper halo Apr 20, 2021, 9:38 PM

#

exotic maple how did you do it?

wanting me to put my poor coding practices on full display 😦

#

idxx = []
for xx in range_idx.to_list():
    idxx.append(ioncopy['ions'][ioncopy['ions']==xx].index[0])
ioncopy = ioncopy.sort_index(idxx)

exotic maple Apr 20, 2021, 9:38 PM

#

lavish tundra i'm trying to get closest values from a wrong typed string i know i can do this ...

Define "closest". There are many ways to compare word similarity depending on what you want

dapper halo Apr 20, 2021, 9:38 PM

#

oh geeze that copied bad...nvm squished screen. And I lied, three lines with list initialization

exotic maple Apr 20, 2021, 9:40 PM

#

take a look at this @dapper halo https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reindex_like.html

dapper halo Apr 20, 2021, 9:41 PM

#

exotic maple take a look at this @dapper halo https://pandas.pydata.org/pandas-docs...

ooo think that should work fine. More or less what I was after
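For reference, a short sketch of reordering rows to match a presorted list of values. The data is hypothetical, and it assumes the values in the key column are unique.

```python
import pandas as pd

# Hypothetical data: an `ions` column and a presorted order to match.
ioncopy = pd.DataFrame({"ions": ["x", "y", "z"], "mass": [1.0, 2.0, 3.0]})
order = ["y", "z", "x"]

# Index by the key column, select rows in the desired order, then restore it.
reordered = ioncopy.set_index("ions").loc[order].reset_index()
```

If the order comes from another dataframe with the same labels, `reindex_like` from the link above does the same job in one call.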

exotic maple Apr 20, 2021, 9:42 PM

#

I always refer to the documentation first before trying to self-define stuff 😛 I mean it's always good to try a personal solution, but those in the doc are almost always optimized

flint mason Apr 20, 2021, 9:43 PM

#

plt.bar(popular_crypto_data['Name'], popular_crypto_data['percent_change_24h'], label = 'percent_change_24h', color = 'g' if popular_crypto_data['positive'] else 'r')

#

can someone check this
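A note on the snippet above: `'g' if series else 'r'` raises an error because a whole pandas column has no single truth value. A sketch of one fix is to build one colour per bar and pass the list to `color`. The dataframe contents here are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for `popular_crypto_data`.
popular_crypto_data = pd.DataFrame({
    "Name": ["AAA", "BBB", "CCC"],
    "percent_change_24h": [2.5, -1.2, 0.7],
})
popular_crypto_data["positive"] = popular_crypto_data["percent_change_24h"] > 0

# Map each row to its own colour instead of testing the whole column at once.
colors = popular_crypto_data["positive"].map({True: "g", False: "r"}).tolist()

fig, ax = plt.subplots()
bars = ax.bar(
    popular_crypto_data["Name"],
    popular_crypto_data["percent_change_24h"],
    color=colors,
)
ax.set_ylabel("percent_change_24h")
```

Passing the `Name` column as the x values keeps the names on the axis; call `plt.show()` to display the figure.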

dapper halo Apr 20, 2021, 9:43 PM

#

exotic maple I always refer to the documentation first before trying to self-define stuff 😛 ...

Yeah, I went there first couldnt find much so I said screw it after a bit and redefined.

I blame the fact that I stupidly use edge....shh

exotic maple Apr 20, 2021, 10:03 PM

#

Does anyone here confidently understand gradient descent?

I understand all the concepts and the overall intuition but for God's sake I'm always missing something.

As I get, it works like this:

A(i , j , k) = [ i , j , k]
where i, j, k are the dimensions / features / independent variables that form the vector A.

We get the partial derivative of A with respect to each variable, a la -> dA / di

Then for a randomly initialized value we calculate the gradient / derivative at the value. Then we move in a direction and calculate again.

that's usually where i stop before getting lost in steps >.<

grave frost Apr 20, 2021, 10:13 PM

#

exotic maple Does anyone here confidently understand gradient descent? I understand all the ...

I dunno the specifics too, but wouldn't you multiply it by the learning rate too, and then repeat until you reach the minima?

exotic maple Apr 20, 2021, 10:49 PM

#

grave frost I dunno the specifics too, but wouldn't you multiply it by the learning rate too...

In theory, I get the learning rate as the length of the "step" we take towards the minima, but still. Idk, I just feel I'm missing something lol

grave frost Apr 20, 2021, 10:50 PM

#

exotic maple In theory, I get the learning rate as the length of the "step" we take towards t...

yea, I forgot the finer details 😅 maybe... back to 3blue1brown?

rigid sundial Apr 20, 2021, 11:22 PM

#

guys i need to make a decision

#

I have to choose between data science and computer science

#

what should i choose

#

which is better overall

velvet thorn Apr 20, 2021, 11:59 PM

#

rigid sundial I have to choose between data science and computer science

try #career-advice

#

although that's not really a good question to ask

velvet thorn Apr 21, 2021, 12:00 AM

#

exotic maple Does anyone here confidently understand gradient descent? I understand all the ...

so what's your question

boreal summit Apr 21, 2021, 12:00 AM

#

What's the difference between a depth dimension and spatial dimension in CNN?

exotic maple Apr 21, 2021, 12:01 AM

#

It's not a question in itself. I'm just not understanding "how" the gradient changes with each iteration properly

boreal summit Apr 21, 2021, 12:01 AM

#

I was studying and couldn't find much online.

exotic maple Apr 21, 2021, 12:01 AM

#

I understand the derivative, the learning rate, etc. But not the whole thing together, i think @velvet thorn
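For reference, a minimal sketch of how the pieces fit together. What gets differentiated is a scalar loss, taken with respect to each parameter in turn. The gradient is recomputed at every new point, so it shrinks as the point approaches the minimum, and the steps shrink with it. The loss function here is an arbitrary bowl chosen for illustration.

```python
def loss(x, y):
    """A simple bowl-shaped function whose minimum is at (3, -1)."""
    return (x - 3) ** 2 + (y + 1) ** 2


def gradient(x, y):
    """Partial derivatives of the loss with respect to x and y."""
    return 2 * (x - 3), 2 * (y + 1)


def gradient_descent(x, y, learning_rate=0.1, steps=100):
    for _ in range(steps):
        dx, dy = gradient(x, y)   # slope at the current point
        x -= learning_rate * dx   # step against the slope
        y -= learning_rate * dy
    return x, y


x_min, y_min = gradient_descent(x=0.0, y=0.0)
```

With this loss and learning rate, each step multiplies the distance to the minimum by 0.8, which is why the updates get smaller on every iteration without the learning rate itself changing.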

solar geyser Apr 21, 2021, 12:02 AM

#

I am working on a university project which is based on NLP. I have to use two datasets and perform latent semantic analysis on the two. I have tried to make term-document matrix first but not sure if it's correct or not. I provide link to google colab where I have written the code, please if someone can review the code and guide me through what to do next would be really helpful. https://colab.research.google.com/drive/1fOAMztRAqogl738koEgYzDkXlTxVvn21?usp=sharing

Google Colaboratory

velvet thorn Apr 21, 2021, 12:02 AM

#

boreal summit What's the difference between a depth dimension and spatial dimension in CNN?

I'm assuming

#

you're talking about height/length/width

#

vs number of channels

#

okay so like

#

for an image

#

assuming it's greyscale

#

each pixel can be described by (x, y, v)

#

x-coordinate, y-coordinate, value (how black or white it is)

#

now

#

consider a standard RGB colour image

#

you still need one x-coordinate and one y-coordinate

#

but now you need 3 numbers for colour

#

R, G, B

boreal summit Apr 21, 2021, 12:04 AM

#

...following

velvet thorn Apr 21, 2021, 12:04 AM

#

so

#

in the colour dimension

#

you have 3 values that can vary independently, yes?

#

R, G, and B

#

and on the "position" side you have X and Y

#

which can also vary independently

#

so X and Y control "where" the pixel is

#

and RGB controls "what" it is

#

the canonical way to store such an image

#

is in an array with the shape (x, y, c)

#

such that for the pixel at (x, y) c is an array describing its colour

boreal summit Apr 21, 2021, 12:07 AM

#

So the RGB filter is the depth dimension, while each color coordinate is the spatial dimension?

velvet thorn Apr 21, 2021, 12:08 AM

#

boreal summit So the RGB filter is the depth dimension, while each color coordinate is the spat...

I have not heard the dimensions described that way, but that would be my guess

#

because

#

when you pass an image through a convolutional layer (assuming 2D here)

#

the c axis changes size

#

in particular, it will have a length equal to the number of filters in the layer
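For reference, a small NumPy sketch of the point above: the two spatial axes shrink or stay the same, while the channel (depth) axis ends up as long as the number of filters. The shapes are arbitrary examples, and the loop is a naive version written for clarity (it computes cross-correlation, as deep learning libraries typically do).

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))        # height, width, channels (RGB)
filters = rng.random((3, 3, 3, 5))   # kernel height, kernel width, in channels, 5 filters


def conv2d_valid(image, filters):
    """Naive 2D convolution (no padding, stride 1), written for clarity, not speed."""
    kh, kw, _, n_filters = filters.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w, n_filters))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw, :]  # covers every input channel
            # Each filter collapses the patch to one number.
            out[i, j, :] = np.tensordot(patch, filters, axes=3)
    return out


feature_map = conv2d_valid(image, filters)
```

Here the input has shape (8, 8, 3) and the output has shape (6, 6, 5): the 3 colour channels are gone, replaced by 5 channels, one per filter.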

boreal summit Apr 21, 2021, 12:10 AM

#

okay man, thanks.

barren iris Apr 21, 2021, 1:11 AM

#

Hello guys! Can anyone recommend a good package that can extract a table from an image (or an image inside a pdf?)

I have some old paper documents with tables that i have to turn into an Excel file, but they're very different from each other and it would take weeks to do it by hand.

Tabula-py works fantastic but it requires that the table is stored as tabulated text or something, it doesn't work with images.

soft dock Apr 21, 2021, 2:20 AM

#

Not exactly a package but this is the first thing I would follow
https://github.com/jainammm/TableNet

GitHub

jainammm/TableNet

Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images" - jainammm/TableNet

rotund dagger Apr 21, 2021, 3:59 AM

#

is there a way to go through a csv and write each row to its own csv?

glacial monolith Apr 21, 2021, 4:12 AM

#

rotund dagger is there a way to go through a csv and write each row to its own csv?

Are you trying to create a bunch of new files?

rotund dagger Apr 21, 2021, 4:14 AM

#

glacial monolith Are you trying to create a bunch of new files?

yes. i have a bunch of csv files that contain a company name for stocks. each file represents all of the companies for 1 day. what i need to do is get each company into its own csv so that i can add it to a dataframe and sort by date

glacial monolith Apr 21, 2021, 4:17 AM

#

rotund dagger yes. i have a bunch of csv files that contain a company name for stocks. each fi...

Check out PyFilesystem2 as a replacement for native file methods if you aren't familiar. It can make it easier to programmatically sort through existing and created files.

rotund dagger Apr 21, 2021, 4:17 AM

#

i will look at it. i haven't encountered it before

glacial monolith Apr 21, 2021, 4:19 AM

#

I'm happy with it so far. It also supports multiple storage modes with an API very close to the native methods and file handling.

#

So the experience of using Dropbox or S3 feels like you're accessing a subdirectory of your project.

barren iris Apr 21, 2021, 4:20 AM

#

soft dock Not exactly a package but this is the first thing I would follow https://github....

I'll definitely check it out! Thanks a lot

rotund dagger Apr 21, 2021, 4:21 AM

#

glacial monolith So the experience of using Dropbox or S3 feels like you're accessing a subdirect...

thank you for the info. i will go read up on it now and see if i can apply it here

glacial monolith Apr 21, 2021, 4:23 AM

#

rotund dagger i will look at it. i haven't encountered it before

Anyway, you will just need to load in your source files (are they small enough that you don't have to worry about memory management?) and write a script to reorganize the data, then ship a new list off to a PyFS enabled method to build the new CSV files in the designated output folder.

rotund dagger Apr 21, 2021, 4:24 AM

#

sounds accurate

#

yea the files are not large enough for memory trouble

glacial monolith Apr 21, 2021, 4:25 AM

#

That makes things easier.

#

PyFS is kind of like a framework. Some of what it gives you isn't actually different from the core libraries, but it has some powerful utilities like walking and its optimized copy handling. In this case, it will give you conceptual structure for the conversion of data locked up in the file system into fully manipulable Python data structures.

#

That's the boring hard part, and the library takes care of a lot that you would have to figure out as you start to expand your idea of what you want to do.

rotund dagger Apr 21, 2021, 4:32 AM

#

it sounds like a pretty powerful tool to leverage. from what im reading it should do what i need.

#

i will just need to try to implement it

glacial monolith Apr 21, 2021, 4:33 AM

#

I'm using it on every Python project going forward that deals with files.

#

At the simplest level, you implement it in the same way as a vanilla file: with a context manager, where you use reading and writing commands on the open object to exchange data with instance variables inside your program.

rotund dagger Apr 21, 2021, 4:35 AM

#

i just installed it now im playing around with it a bit.

#

they have a decent documentation for it it looks like

glacial monolith Apr 21, 2021, 4:59 AM

#

rotund dagger i just installed it now im playing around with it a bit.

Is this going to run once or be an ongoing process?

rotund dagger Apr 21, 2021, 5:00 AM

#

just once.

#

basically im doing a time series forecast on historical stock data.

#

i was in a group of three people in school, and they decided not to finish this assignment so im trying to complete it solo lol. got stuck on reading the data in because of the way the data is presented

arctic wedgeBOT Apr 21, 2021, 6:22 AM

#

Hey @severe cloud!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

severe cloud Apr 21, 2021, 6:27 AM

#

hi new ai/ml dev here working on my first project but i am getting this error
https://paste.pythondiscord.com/haqegoxeru.sql

steady oxide Apr 21, 2021, 6:28 AM

#

did you restart kernel?

severe cloud Apr 21, 2021, 6:30 AM

#

never played with kernel

#

i am on linux kernel 5.8.0-44

ripe forge Apr 21, 2021, 7:04 AM

#

Ah not that kernel. Confusing terminology but if you run jupyter notebooks or ipython repl, the place where the code runs and holds variables in memory is also called a kernel. Ipython kernel to be specific. Has no relation to Linux kernels*

#

So, first question to you would be, how are you running this code?

#

If I'm reading this correctly you're not dealing with kernels at all, just running python as a script from terminal

#

So the kernel suggestion doesn't apply to you

modern lion Apr 21, 2021, 7:48 AM

#

I hope this is the right forum for my question.. Does anyone have an example of how to do a 3D FFT from CSV?

raven knoll Apr 21, 2021, 8:23 AM

#

Is it possible to do text sentiment with unsupervised learning? I need to do a sentiment analysis on Twitter tweets

restive spade Apr 21, 2021, 9:01 AM

#

Yes

#

But you need to know the sentiment of tweets

#

So it's better to use supervised learning 😅

severe cloud Apr 21, 2021, 9:12 AM

#

ripe forge So, first question to you would be, how are you running this code?

env terminal main.py

raven knoll Apr 21, 2021, 9:57 AM

#

restive spade So it's better to use supervised learning 😅

I agree, but my project is about scraping twitter tweets about a company and getting the sentiment in the tweets. It's hard to label the data.

I am new to data science but I really love it. If there is anything I'm missing, I'd like to learn it. I haven't yet touched neural networks but that is my next topic

restive spade Apr 21, 2021, 10:02 AM

#

🤔 I think you have to learn how to use neural networks first, then train a model with already labeled data (to be found on the Internet), then use it with your data.

raven knoll Apr 21, 2021, 10:52 AM

#

That’s what I thought as well but the tweets are in Dutch and there are few trained models for that language, but thank you for confirming my thoughts

primal tulip Apr 21, 2021, 10:56 AM

#

https://nvlabs.github.io/GANcraft/ fun read for those that are into minecraft and want to learn AI.

GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds

grave frost Apr 21, 2021, 10:58 AM

#

raven knoll Is it possible to do text sentiment with unsupervised learning? I need to do a s...

yes

#

you can pre-train a model with the unsupervised data, and fine-tune on supervised to get the best accuracy. but you can fine-tune BERT tho, if your tweets are english

lapis sequoia Apr 21, 2021, 10:59 AM

#

Hey anyone has a clue how to apply OneHotEncoder in the model?

#

(sklearn)

primal tulip Apr 21, 2021, 11:01 AM

#

lapis sequoia Hey anyone has a clue how to apply OneHotEncoder in the model?

There you go.
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html

lapis sequoia Apr 21, 2021, 11:01 AM

#

which page you think I am on 🤣

primal tulip Apr 21, 2021, 11:01 AM

#

Then what's your actual question?

lapis sequoia Apr 21, 2021, 11:01 AM

#

just confused on how to implement it in a model

#

okay let me make an example

#

so I have my unique items in the columns, I transform them into a vector with 1,2,3,4 etc
Let's say we are using... SVC (example)

#

how do I implement my encoder in the SVC?

raven knoll Apr 21, 2021, 11:03 AM

#

grave frost you can pre-train a model with the unsupervised data, and fine-tune on supervise...

Alright I will try. This is my first unsupervised project so this will be one hell of an adventure

lapis sequoia Apr 21, 2021, 11:04 AM

#

```py
X = [['Male', 1], ['Female', 3], ['Female', 2]]
enc.fit(X)
```

#

so i pass this, but what's next?

#

Hi

grave frost Apr 21, 2021, 11:07 AM

#

raven knoll Alright I will try. This is my first unsupervised project so this will be one he...

wdym? you have no labelled data?

lapis sequoia Apr 21, 2021, 11:07 AM

#

I want to ask that from where i can learn ai

grave frost Apr 21, 2021, 11:08 AM

#

lapis sequoia I want to ask that from where i can learn ai

there are some resources pinned to this channel - raggy especially recommends one with good basics and maths

#

https://www.reddit.com/r/learnmachinelearning/wiki/resource

resource - learnmachinelearning

r/learnmachinelearning: A subreddit dedicated to learning machine learning

#

looks good too ^^^

primal tulip Apr 21, 2021, 11:10 AM

#

lapis sequoia so i pass this, but what's next?

I gave someone an example video on how to use the OneHotEncoder. I remember he was working with the Titanic dataset.
While I look for it, I'll tell you that you already have your data with the fit method, but if I recall correctly you need to transform it. Still if you call enc, you should have it properly encoded.

#

And just to make sure, you do OneHotEncoding in categorical data, not numerical values.

lapis sequoia Apr 21, 2021, 11:12 AM

#

primal tulip I gave someone an example video on how to use the OneHotEncoder. I remember he w...

okay and how would I use it? that's the issue haha, do I use it as X_train or X_test?

lapis sequoia Apr 21, 2021, 11:12 AM

#

primal tulip And just to make sure, you do OneHotEncoding in categorical data, not numerical ...

yeah i'm using it for a string feature
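For reference, a sketch of the usual way to wire an encoder into a model: put both in a scikit-learn pipeline, fit the pipeline on the training data, and let it apply the already fitted encoder to the test data. The toy data and column names are assumptions made up for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVC

# Hypothetical toy data in the shape of the snippet above.
X_train = pd.DataFrame({
    "sex": ["Male", "Female", "Female", "Male", "Female", "Male"],
    "level": [1, 3, 2, 1, 3, 2],
})
y_train = [0, 1, 1, 0, 1, 0]
X_test = pd.DataFrame({"sex": ["Female", "Male"], "level": [2, 1]})

# One-hot encode the string column; pass the numeric column through unchanged.
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["sex"])],
    remainder="passthrough",
)
model = make_pipeline(preprocess, SVC())

model.fit(X_train, y_train)          # the encoder is fitted on training data only
predictions = model.predict(X_test)  # the same fitted encoder transforms the test data
```

So the encoder is neither `X_train` nor `X_test`: it is a step that transforms both, and it should only ever be fitted on the training split.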

primal tulip Apr 21, 2021, 11:13 AM

#

Got it. It's a bit lengthy but you can skip until he goes into the good stuff.
https://www.youtube.com/watch?v=irHhDMbw3xo

YouTube

Data School

How do I encode categorical features using scikit-learn?

In order to include categorical features in your Machine Learning model, you have to encode them numerically using "dummy" or "one-hot" encoding. But how do you do this correctly using scikit-learn?

In this video, you'll learn how to use OneHotEncoder and ColumnTransformer to encode your categorical features and prepare your feature matrix in a...


#

Go to minute 10 and on.

lapis sequoia Apr 21, 2021, 11:14 AM

#

cheers, i'll check it out, yeah no worries will just do 2x

hard hound Apr 21, 2021, 12:14 PM

#

Hey any tips on cleaning data

raven knoll Apr 21, 2021, 12:43 PM

#

grave frost wdym? you have no labelled data?

My project is to scrape Twitter tweets about the company and visualize a lot of things including sentiment. My part is the sentiment. Those tweets aren’t labeled

serene scaffold Apr 21, 2021, 12:50 PM

#

raven knoll My project is to scrape Twitter tweets about the company and visualize a lot of ...

I believe tweepy gives you some sentiment scores, so it would be interesting to see how your system compares.

late shell Apr 21, 2021, 12:50 PM

#

I have this bollywood movie dataset, where each movie has a few features, like box office collection & stuff, and the target variable is the performance of the movie, namely 4 categories [flop, average, hit, super hit]. For this classification problem, how do I use the 1st column i.e movie name in a ML model (since it's string dtype)? I can't encode all of the movie names, right? so what do i do?

serene scaffold Apr 21, 2021, 12:51 PM

#

late shell I have this bollywood movie dataset, where each movie has a few features, like b...

think about this: how would the name of the movie inform your model?

late shell Apr 21, 2021, 12:52 PM

#

umm it won't ig.. like it's no use for predicting the performance? idk, sorry, I'm still a noobie

#

im not sure abt that

serene scaffold Apr 21, 2021, 12:53 PM

#

late shell umm it won't ig.. like it's no use for predicting the performance? idk, sorry, I...

You're right. Unless you have a theory that the name of a movie can predict its popularity, you don't want to consider it

late shell Apr 21, 2021, 12:54 PM

#

nice, but when I'm using the model to predict a bunch of movies, how will I know which performance prediction corresponds to which movie (I cant figure it out from the features , right.)?

serene scaffold Apr 21, 2021, 12:55 PM

#

late shell nice, but when I'm using the model to predict a bunch of movies, how will I know...

it depends on how you write the program, but that's not going to be an issue

late shell Apr 21, 2021, 12:56 PM

#

alright. thank you very much mate. 🙌

serene scaffold Apr 21, 2021, 12:57 PM

#

late shell alright. thank you very much mate. 🙌

Look at your other features. I can see some that pretty clearly correspond with what's in the Verdict column

late shell Apr 21, 2021, 12:58 PM

#

The Tcollection?

serene scaffold Apr 21, 2021, 1:00 PM

#

late shell The `Tcollection`?

That's the one

#

@late shell is this dataset on kaggle btw?

gentle sedge Apr 21, 2021, 1:44 PM

#

Hey! How do i get my bar plot from matplotlib to show more? Right now the window ends almost instantly after the biggest bar.

#

I want it to give the bars a little bit of space from the top.

candid sable Apr 21, 2021, 1:49 PM

#

Hi guys - what would the numbers on the right mean in an activation map?

floral nexus Apr 21, 2021, 2:01 PM

#

Hi guys, can someone help me to troubleshoot this issue with Pandas?
https://www.kaggle.com/questions-and-answers/233859

Pandas groupby indexes problem | Data Science and Machine Learning

Pandas groupby indexes problem.

late shell Apr 21, 2021, 2:22 PM

#

serene scaffold @late shell is this dataset on kaggle btw?

yeah, ig so. I don't remember clearly, coz i downloaded it a long time ago

#

I have a dataset that has a few nan values in it. im using python. I want to replace the nan values with the average of the value that lies one above & below the nan value. How can I do this? bcoz df.fillna() only provides 2 methods [ffill, bfill].

ripe forge Apr 21, 2021, 2:34 PM

#

Maybe do the task in separate steps. One way is to create two temp variables. One with ffill one with bfill. Then subset them on places where value is nan, and average the two. Then assign back to the slots where nan was present
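For reference, the steps above can be sketched in a couple of lines, since averaging the forward-filled and backward-filled versions leaves the existing values untouched. The series is made-up example data.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, 4.0, np.nan, 10.0])

# Where a value is present, ffill and bfill agree, so the average leaves it alone.
# Where it is missing, the average is the mean of the nearest neighbours.
filled = (s.ffill() + s.bfill()) / 2
```

A NaN at the very start or end has only one neighbour and stays NaN with this approach. For isolated gaps, `s.interpolate()` should give the same result.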

spark stag Apr 21, 2021, 2:38 PM

#

gentle sedge Hey! How do i get my bar plot from matplotlib to show more? Right now the window...

try plt.ylim(top=<the max value you want on the y-axis>)
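For reference, a small sketch of adding headroom above the tallest bar by setting the upper y-limit on the axes object. The data and the 15% margin are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt

names = ["a", "b", "c"]          # hypothetical data
values = [3.0, 7.5, 5.0]

fig, ax = plt.subplots()
ax.bar(names, values)

# Leave 15% of empty space above the tallest bar.
ax.set_ylim(top=max(values) * 1.15)
```

Computing the limit from the data keeps the spacing sensible if the values change; call `plt.show()` to display the figure.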

late shell Apr 21, 2021, 2:40 PM

#

ripe forge Maybe do the task in separate steps. One way is to create two temp variables. On...

ok lemme try, thanks

grave frost Apr 21, 2021, 2:45 PM

#

raven knoll My project is to scrape Twitter tweets about the company and visualize a lot of ...

there are plenty of trained models out there on tweets - just use one of them, and walk away with SOTA accuracy

grave frost Apr 21, 2021, 2:46 PM

#

serene scaffold I believe tweepy gives you some sentiment scores, so it would be interesting to ...

ehh...ancient rule based ⚰️

exotic maple Apr 21, 2021, 2:55 PM

#

grave frost there are plenty of trained models out there on tweets - just use one of them, an...

huggingface models?

grave frost Apr 21, 2021, 2:55 PM

#

exotic maple huggingface models?

ofc bruddah

stuck swallow Apr 21, 2021, 4:24 PM

#

I trained an opencv cascade sheet to find among us characters. Is there any way to use this cascade sheet to generate images? I cant find any info on this online.

serene mural Apr 21, 2021, 4:27 PM

#

Anyone here that can help in #help-cupcake ?

grave frost Apr 21, 2021, 4:30 PM

#

@iron basalt Finally got Hawkins' A thousand brains theory😌 (curse Brexit and world shipping) 🥳 🥳 I look forward to deep-diving fully into HTM in the coming months !!! 😎

ebon hound Apr 21, 2021, 4:59 PM

#

is there an easy way to add a function on top of a scatter plot in matplotlib?
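For reference, a sketch of one way to do this: draw the scatter first, then evaluate the function on a dense grid of x values and plot it on the same axes. The noisy data and the function `f` are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(scale=2.0, size=x.size)  # hypothetical noisy data


def f(x):
    """The function to overlay; here a straight line."""
    return 2 * x + 1


fig, ax = plt.subplots()
ax.scatter(x, y, label="data")
xs = np.linspace(x.min(), x.max(), 200)   # dense grid so the curve looks smooth
ax.plot(xs, f(xs), color="red", label="f(x)")
ax.legend()
```

Anything drawn on the same `ax` shares the axes, so the order of the calls only affects which layer ends up on top.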

serene mural Apr 21, 2021, 5:11 PM

#

```py
import nltk
from nltk import word_tokenize
from nltk import FreqDist

my_text = input("Enter something: ")

cuss_words = []  # my list of cuss words, omitted here since it's vulgar

tokens = word_tokenize(my_text)
text = nltk.Text(tokens)

fdist = FreqDist(text)
```
So what I am trying to do is take an input, say "abc fuck" or "abc" or "abc sh$t", and have it detect cuss words and bypassed cuss words, so it learns what the "bypassed cuss words" can be in the correct context. How can I achieve this?
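For reference, a rule-based sketch that catches simple disguises without any learning: undo common character swaps, then compare each token against the word list with fuzzy matching. It uses only the standard library rather than nltk, the word list is a mild placeholder, and the substitution table is an assumed, incomplete example.

```python
import difflib

# Placeholder word list; substitute the real one.
BLOCKED_WORDS = ["darn", "heck"]

# Common character swaps used to dodge filters (an assumed, incomplete table).
SUBSTITUTIONS = str.maketrans({"@": "a", "3": "e", "1": "i", "0": "o", "$": "s"})


def find_blocked_words(text, cutoff=0.8):
    """Return blocked words that appear in text, either exactly or in disguise."""
    found = []
    for token in text.lower().split():
        normalized = token.translate(SUBSTITUTIONS)
        match = difflib.get_close_matches(normalized, BLOCKED_WORDS, n=1, cutoff=cutoff)
        if match:
            found.append(match[0])
    return found
```

This does not learn new bypasses from context. That would need labelled examples and a trained classifier, which is a much larger project than a lookup like this.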

flint mason Apr 21, 2021, 6:52 PM

#

is there a library for sentiment analysis of a text

rotund dagger Apr 21, 2021, 7:04 PM

#

how can i add a date column to my dataframe where the date value in each row is set to the date stored on the csv file header.

#

basically i have a bunch of csv files that are named "NSYE-Thursday-August-02-2018" and so on and so forth, and i want each row to have a date value that matches that header date. i would also add a day value to show the day of week.
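For reference, a sketch of pulling the date out of a file name in that pattern using only the standard library. The helper name is made up, and it assumes every file name ends in month name, day and year separated by hyphens.

```python
from datetime import datetime
from pathlib import Path


def date_from_filename(path):
    """Parse names like 'NSYE-Thursday-August-02-2018.csv' into a datetime."""
    stem = Path(path).stem                     # drop the folder and the extension
    month, day, year = stem.split("-")[-3:]    # e.g. 'August', '02', '2018'
    return datetime.strptime(f"{month} {day} {year}", "%B %d %Y")


parsed = date_from_filename("NSYE-Thursday-August-02-2018.csv")
day_name = parsed.strftime("%A")
```

After reading a file into a dataframe, assigning the parsed value to a new column (for example `df["date"] = parsed`) repeats it on every row, and the day name can be added the same way.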

late shell Apr 21, 2021, 7:10 PM

#

I've been told that if I'm using the model for prediction only, then there is no need to get rid of multicollinearity? is that true?

ripe forge Apr 21, 2021, 7:25 PM

#

i mean, if youre using it for just predictions, it must have been trained on the features that you no longer have control over in the first place.

#

so even thinking about changing features seems like a nonsensical conversation.

modern beacon Apr 21, 2021, 7:34 PM

#

how can i easily write pcm data on an array? i want to write binary data using hearable sound but it all seems complicated

analog pike Apr 21, 2021, 8:11 PM

#

Does anyone here have any suggestions for a good website to start with learning ai, tensorflow, decision trees and that sort of stuff. If it helps I'd say my skill level is between amateur and novice with python

#

The only ai I've built was a text generator using markov chain analysis

rotund dagger Apr 21, 2021, 9:55 PM

#

analog pike Does anyone here have any suggestions for a good website to start with learning...

i have learned a ton from this. it goes on sale for about 10 bucks and it has everything you listed. https://www.udemy.com/share/101WaUBUoacFhaRHQ=/

Udemy

Learn Python for Data Science, Structures, Algorithms, Interviews

Learn how to use NumPy, Pandas, Seaborn , Matplotlib , Plotly , Scikit-Learn , Machine Learning, Tensorflow , and more!

analog pike Apr 21, 2021, 9:56 PM

#

@rotund dagger how did you like it?

rotund dagger Apr 21, 2021, 9:56 PM

#

i use it all the time. it was very well worth it.

analog pike Apr 21, 2021, 9:56 PM

#

Is it similar at all to code academy because I really just don't like them

rotund dagger Apr 21, 2021, 9:57 PM

#

i dont think so. i would say it is more like a youtube playlist but has more interaction. it provides resource files.

analog pike Apr 21, 2021, 9:57 PM

#

Alright, i'll check it out

#

thanks

rotund dagger Apr 21, 2021, 9:57 PM

#

np

#

i have a dataframe that looks like this but the company appears multiple times in the data frame.

#

i would like to make a dictionary with the key being the unique symbol. and the value being a dataframe with rows for each entry of the unique symbol from the original data frame.

#

for example: for symbol 'FCCY' i would like to add those entries to the values dataframe of the dictionary

exotic maple Apr 21, 2021, 10:18 PM

#

rotund dagger for example: for symbol 'FCCY' i would like to add those entries to the values d...

you could probably do something like this

#

"dict_name"[COMPANY] = Dataframe[dataframe["ticker"] == FCCY ]

#

basically

#

create an entry in the dictionary, and the value is the masked / resulting dataframe from filtering by ticker

#

now, if that's efficient, it's a different question...
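For reference, a sketch of a more efficient variant: `groupby` splits the dataframe once, and a dict comprehension collects one sub-dataframe per symbol. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data; `symbol` stands in for the ticker column.
df = pd.DataFrame({
    "symbol": ["FCCY", "ABCD", "FCCY", "ABCD"],
    "date": pd.to_datetime(["2018-08-02", "2018-08-02", "2018-08-03", "2018-08-03"]),
    "close": [10.0, 20.0, 11.0, 19.5],
})

# One pass over the data instead of one boolean filter per symbol.
by_symbol = {symbol: group.sort_values("date") for symbol, group in df.groupby("symbol")}
```

Each value is an ordinary dataframe sorted by date, so `by_symbol["FCCY"]` can be handed to a time series model directly.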

rotund dagger Apr 21, 2021, 10:20 PM

#

the end goal is to forecast stocks using holt winter time series.

#

the toughest time im having is importing in the data from all the csvs in such a way that i can use them in time series.

#

this was the only way i could think to do it, but im sure there is a more efficient way

rotund dagger Apr 21, 2021, 10:31 PM

#

exotic maple you could probably do something like this

looks like i was actually able to implement that in a loop and get exactly what i needed thank you much

lapis sequoia Apr 22, 2021, 12:48 AM

#

exotic maple now, if that's efficient, it's a different question...

I always find it difficult to figure out what the efficient way of doing stuff like that is

#

Always end up coding like that

#

then my whole kernel goes slow as hell

#

Sad

modern vine Apr 22, 2021, 1:50 AM

#

Good night! What is the best AI Area & Framework to find certain patterns on HTML documents? In this case it would be bidding items

serene scaffold Apr 22, 2021, 2:17 AM

#

modern vine Good night! What is the best AI Area & Framework to find certain patterns on HTM...

Are you sure you need AI for this?

#

What is a bidding item?

modern vine Apr 22, 2021, 2:38 AM

#

serene scaffold Are you sure you need AI for this?

Yeah, the customer is asking

modern vine Apr 22, 2021, 2:38 AM

#

serene scaffold What is a bidding item?

Like a public offer in Brazil

#

They're asking for an AI to find items in different HTMLs and then suggest a product from these items

#

I already have a algorithm to suggest a product using spacy, but I need another AI to find these texts to suggest a product

serene scaffold Apr 22, 2021, 3:06 AM

#

modern vine I already have a algorithm to suggest a product using spacy, but I need another ...

Can you give some real examples of what the documents look like and what you're labeling?

modern vine Apr 22, 2021, 3:07 AM

#

Of course

#

http://rir.ibiz.com.br/ri2/rdoform?ENTIDADE=RI_REL_LICITACAO_ITEM_DET&ACAO=REL&id_licitacao=17879770&id_seq=-1&NW_ATIVA=1&UC=d0a116fa08481cfbfcde3ff2e1b66e38&UP=id_licitacao

#

These are the items

#

But not every HTML is like this

#

I want to convert to Python Objects in a list

lapis sequoia Apr 22, 2021, 3:34 AM

#

what resources worked for you in learning ml?

exotic maple Apr 22, 2021, 3:47 AM

#

modern vine Good night! What is the best AI Area & Framework to find certain patterns on HTM...

this sounds more like a scraping project than a data science project

#

you can try using scrapy https://scrapy.org/ to scrape the pages and obtain the info you want. Then, you might be able to determine what kind of ML / AI you want to implement.

Scrapy | A Fast and Powerful Scraping and Web Crawling Framework

#

It seems you're building a recommender system of sorts?

visual spear Apr 22, 2021, 4:50 AM

#

how would i graph non-functions (matplotlib, numpy, sympy) from their equation inputted as a string?

#

like conic sections

#

example:
the input is

x^2/4+y^2/16=1

and it should graph an ellipse

velvet thorn Apr 22, 2021, 5:02 AM

#

visual spear how would i graph non-functions (matplotlib, numpy, sympy) from their equation i...

well

#

you need a step to parse the equation

#

and a way to decide what bounds you want

#

those are the main issues

#

sympy has a function for the former

#

but it uses eval
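
If avoiding `eval` matters, one possible sketch is to parse the string with the standard library's `ast` module and only allow arithmetic on `x` and `y`. The helper names (`_evaluate`, `implicit_function`) are made up for illustration, and the supported syntax is deliberately tiny: numbers, `x`, `y`, `+ - * /` and `^` for powers. The equation `lhs = rhs` becomes `f(x, y) = lhs - rhs`, so the curve is wherever `f(x, y) == 0`.

```python
import ast
import operator

# Whitelisted binary operators; anything else in the input is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def _evaluate(node, x, y):
    """Recursively evaluate a parsed expression containing only x, y and numbers."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, x, y)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name) and node.id in ("x", "y"):
        return x if node.id == "x" else y
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left = _evaluate(node.left, x, y)
        right = _evaluate(node.right, x, y)
        return _BIN_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand, x, y)
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def implicit_function(equation):
    """Turn 'lhs = rhs' into f(x, y) = lhs - rhs."""
    lhs, rhs = equation.replace("^", "**").split("=")
    lhs_tree = ast.parse(lhs.strip(), mode="eval")
    rhs_tree = ast.parse(rhs.strip(), mode="eval")

    def f(x, y):
        return _evaluate(lhs_tree, x, y) - _evaluate(rhs_tree, x, y)

    return f


f = implicit_function("x^2/4+y^2/16=1")
print(f(2, 0))  # 0.0, the point (2, 0) is on the ellipse
print(f(0, 4))  # 0.0, the point (0, 4) is on the ellipse
print(f(0, 0))  # -1.0, the origin is inside the ellipse
```

Since `f` only uses arithmetic operators, it should also accept NumPy arrays, so evaluating it on a `np.meshgrid` and passing the result to `plt.contour(X, Y, Z, levels=[0])` is one way to draw the curve. The plot bounds still have to be chosen separately.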

visual spear Apr 22, 2021, 5:03 AM

#

okay

rotund dagger Apr 22, 2021, 6:08 AM

#

i think this is happening because of the indexing. is there a simple way to fix it? the x axis is not showing the dates but appears to be "binning" the index. each row doesn't have its own index

autumn basin Apr 22, 2021, 6:49 AM

#

Just add an index

#

IMO that’s the simplest way

rotund dagger Apr 22, 2021, 6:52 AM

#

i got it working.

#

i just reset the index

#

however i need to figure out how to make it display all of the values of x. currently it shows 6 values of x out of 88 @autumn basin

whole mica Apr 22, 2021, 7:36 AM

#

Hey guys! I was wondering if there are any places i can learn AI equations/algorithms that i can incorporate into my trading bot

primal tulip Apr 22, 2021, 8:46 AM

#

lapis sequoia cheers, i'll check it out, yeah no worries will just do 2x

Hey man. Just checking in, did your doubt get solved? Found what the next step was?

grave frost Apr 22, 2021, 9:49 AM

#

whole mica Hey guys! I was wondering if there are any places i can learn AI equations/algor...

nothing that would help you substantially

#

it should have been the reverse - integrating your model into a trading bot. doesn't make sense to make a bot that way (unless you are using some off-the-shelf financial strategy which wouldn't yield much)

lapis sequoia Apr 22, 2021, 10:41 AM

#

primal tulip Hey man. Just checking in, did your doubt get solved? Found what the next step w...

yeah there was no issue with that, but now I'm dying in preprocessing hell 🤣

#

I have like 30 columns and good luck finding which one is actually producing a good result

lapis sequoia Apr 22, 2021, 10:43 AM

#

primal tulip Hey man. Just checking in, did your doubt get solved? Found what the next step w...

btw thanks man

raven knoll Apr 22, 2021, 10:44 AM

#

I'm still trying to start the Dutch text-sentiment part of my project, but I currently cannot find a pre-trained model or a labeled dataset.

I found one pre-trained model but I'm new to machine learning and I don't really understand the code

grave frost Apr 22, 2021, 11:10 AM

#

I found one pre-trained model but I'm new to machine learning and I don't really understand the code
well, then do the basics first, or find a tutorial on transfer learning if you already know the basics

broken warren Apr 22, 2021, 1:00 PM

#

Hello, I'd like to build an AI (neural network) to predict the 6th number of a given 5-number series. My current one (copied from the internet, but I do understand it) can handle something like 10,20,30,40,50, but on e.g. 10,0,10,0,10 it fails miserably. Do you guys have any advice on what I could do? I'm quite new to AI.

bronze skiff Apr 22, 2021, 1:40 PM

#

how many training examples did you give it

hard hound Apr 22, 2021, 1:49 PM

#

hey I was getting an error saying it was unable to convert to float. I searched for it on Stack Overflow but wasn't able to solve it

#

ValueError: could not convert string to float: 'Biggin

#

It's a house price dataset with a lot of parameters
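
This error typically means a text column (a place name, say) reached a step that expects numbers. One common fix is to one-hot encode such columns before fitting. Below is a minimal standard-library sketch of the idea. The rows, the column names (`suburb`, `rooms`, `price`) and the `one_hot` helper are hypothetical, not taken from the actual dataset.

```python
# Hypothetical rows, as they might come out of csv.DictReader (all strings).
rows = [
    {"suburb": "North", "rooms": "3", "price": "450000"},
    {"suburb": "South", "rooms": "2", "price": "380000"},
    {"suburb": "North", "rooms": "4", "price": "520000"},
]


def one_hot(rows, column):
    """Replace a text column with one 0/1 column per category; cast the rest to float."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Numeric columns convert cleanly once the text column is set aside.
        new_row = {key: float(value) for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1.0 if row[column] == category else 0.0
        encoded.append(new_row)
    return encoded


encoded = one_hot(rows, "suburb")
print(encoded[0])
```

With pandas, `pd.get_dummies` does the same job on a DataFrame column. Either way, checking which columns are still text before calling `fit` usually pinpoints the culprit.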

young dock Apr 22, 2021, 2:08 PM

#

i have a dataframe that for some reason was missing a few indexes, how do I reorder it so it's fixed?

It goes 47774, 47775, and then, 47778

#

I want the 47778 to become 47776, and 47779 to become 47777

#

nvm i'm dumb

#

if an explanatory variable only increases the adjusted r-squared by 0.01, is it still worth including in the regression?

#

what if it only increases it by 0.05?
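
For reference, adjusted R-squared is the usual R-squared with a penalty for each extra predictor, which is why it can go down when a weak variable is added. A small sketch of the standard formula follows; the sample numbers are made up purely for illustration.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical regression: 100 rows, 5 predictors, plain R^2 of 0.80.
before = adjusted_r2(0.80, n_samples=100, n_predictors=5)

# Adding a 6th predictor that lifts plain R^2 only to 0.801.
after = adjusted_r2(0.801, n_samples=100, n_predictors=6)

print(before, after)
print(after > before)  # False here: the gain does not cover the penalty
```

So any increase in the adjusted value already means the variable explained more than the penalty cost. Whether 0.01 or 0.05 is "worth it" still depends on the goal of the model (prediction vs. interpretation) and on how the variable behaves out of sample.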

daring peak Apr 22, 2021, 2:35 PM

#

I am making a game where 2 AIs battle each other. I have coded the game itself (might still change a few things), and I was wondering what modules I should use or how I get started with adding the AIs? (if code is needed I'll provide it)

bronze skiff Apr 22, 2021, 3:18 PM

#

daring peak I am making a game which has 2 ais battle each other, I have coded the game (mig...

do you know any ml at all? should probably start with baby steps

winged yew Apr 22, 2021, 3:21 PM

#

anyone

whole mica Apr 22, 2021, 3:26 PM

#

grave frost it should have been the reverse - integrating your model into a trading bot. doe...

Oh really? How come?

winged yew Apr 22, 2021, 3:28 PM

#

can anyone suggest which would be better for deploying machine learning models?
Django or Flask?

grave frost Apr 22, 2021, 3:53 PM

#

whole mica Oh really? How come?

because making a trading bot is much, much easier than a model that can trade very well and churn out a good profit

flint mason Apr 22, 2021, 4:37 PM

#

do I need permission if I want to scrape data off LinkedIn?

wicked mantle Apr 22, 2021, 5:10 PM

#

flint mason do I need permission if I want to scrape data off LinkedIn?

you can scrape everything you need.
did you mean the law about dynamic sites?

#

torch.Size([10, 1, 28, 28]): in this shape, 28, 28 is the 28x28 pixel image.
What are the 10 and the 1? dimensions?
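
For image batches, PyTorch layers such as `Conv2d` expect the layout (batch, channels, height, width), so a shape like this would normally read as 10 images, 1 channel (grayscale), 28x28 pixels each. A tiny sketch that just names the four positions, assuming that layout applies here (the `describe_image_batch` helper is made up for illustration):

```python
def describe_image_batch(shape):
    """Name the four axes of an image batch laid out as (N, C, H, W)."""
    batch, channels, height, width = shape
    return {
        "batch_size": batch,   # how many images are in the tensor
        "channels": channels,  # 1 = grayscale, 3 = RGB
        "height": height,      # pixels per column
        "width": width,        # pixels per row
    }


info = describe_image_batch((10, 1, 28, 28))
print(info)
# {'batch_size': 10, 'channels': 1, 'height': 28, 'width': 28}
```

The 10 is typically whatever `batch_size` was passed to the data loader, so it changes with that setting, while the 1 and the 28x28 come from the images themselves.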

daring peak Apr 22, 2021, 6:27 PM

#

bronze skiff do you know any ml at all? should probably start with baby steps

I don't, where could I start/what could I start with?

flint mason Apr 22, 2021, 7:00 PM

#

wicked mantle you can scrape everything you need. did you mean the law about dynamic sites?

yeah

serene mural Apr 22, 2021, 7:24 PM

#

How hard would it be to make a chat bot?

uncut orbit Apr 22, 2021, 7:40 PM

#

without ai it's quite simple

#

with ai it'll take some more work

#

but you can use Telegram and GPT-2

lapis sequoia Apr 22, 2021, 7:57 PM

#

hi, I've found this model but the dataset isn't there anymore. Do you know how I can find the dataset used to train this model? https://github.com/AbdulAhadSiddiqui11/Pokemon-Image-Classifier

GitHub

AbdulAhadSiddiqui11/Pokemon-Image-Classifier

Its a convNet built upon InceptionV3 and trained on 928 pokemon classes. - AbdulAhadSiddiqui11/Pokemon-Image-Classifier

grave frost Apr 22, 2021, 8:18 PM

#

serene mural How hard would it be to make a chat bot?

A decent one? few days at most. a good one? weeks

cold mantle Apr 22, 2021, 9:53 PM

#

my number classifier is not working, i upload an image, but it always says the image is a 2 or a 0

arctic wedgeBOT Apr 22, 2021, 9:54 PM

#

Hey @cold mantle!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.

Feel free to ask in #community-meta if you think this is a mistake.

cold mantle Apr 22, 2021, 9:56 PM

#

https://drive.google.com/file/d/119EAN4JKDAdGR24bGzLW32aa0snEHrb0/view?usp=sharing

Google Docs

Handwritten+digit+reader.ipynb

lapis sequoia Apr 22, 2021, 11:35 PM

#

Can anybody explain why we choose unit sizes of 16/32?

velvet thorn Apr 22, 2021, 11:43 PM

#

lapis sequoia Can anybody explain why we choose unit sizes of 16/32?

what?

lapis sequoia Apr 22, 2021, 11:43 PM

#

velvet thorn what?

unit sizes

#

why

velvet thorn Apr 22, 2021, 11:43 PM

#

what unit sizes

lapis sequoia Apr 22, 2021, 11:43 PM

#

Like in a Dense layer

velvet thorn Apr 22, 2021, 11:44 PM

#

ah

#

like number of neurons

lapis sequoia Apr 22, 2021, 11:44 PM

#

Intuitively I understand we are getting the input vectors and shoving them into neurons

#

But why multiple neurons?

velvet thorn Apr 22, 2021, 11:44 PM

#

hm

lapis sequoia Apr 22, 2021, 11:44 PM

#

Aren't they all doing the same thing?

velvet thorn Apr 22, 2021, 11:44 PM

#

are you asking

#

why multiple neurons

lapis sequoia Apr 22, 2021, 11:44 PM

#

Yes

velvet thorn Apr 22, 2021, 11:44 PM

#

or why powers of 2

lapis sequoia Apr 22, 2021, 11:44 PM

#

Why multiple neurons

velvet thorn Apr 22, 2021, 11:44 PM

#

lapis sequoia Yes

each neuron has its own weights

lapis sequoia Apr 22, 2021, 11:44 PM

#

What decides the weight?

velvet thorn Apr 22, 2021, 11:44 PM

#

in general, backpropagation of error

#

you can think of each neuron as learning a very limited aspect of the relationship between data and target
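
A rough way to see why multiple neurons are not "all doing the same thing": every neuron in a dense layer sees the same inputs but applies its own weights and bias, so each one produces a different output. A plain-Python sketch with made-up numbers (real libraries do this as a matrix multiply, usually followed by an activation function, which is left out here):

```python
def dense_forward(inputs, weights, biases):
    """One output per neuron: dot(inputs, that neuron's weights) + its bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


inputs = [1.0, 2.0]   # one sample with two features

# A layer with 3 units: one row of weights and one bias per neuron (arbitrary values).
weights = [
    [0.5, -1.0],
    [1.0, 1.0],
    [0.0, 2.0],
]
biases = [0.0, 0.5, -1.0]

outputs = dense_forward(inputs, weights, biases)
print(outputs)  # [-1.5, 3.5, 3.0]
```

The weights start out random, which is what lets the neurons drift toward different "aspects" of the data during training instead of staying identical.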

lapis sequoia Apr 22, 2021, 11:45 PM

#

Yeah but what part of my code is distinguishing the weights?

#

It kinda just seems that I shove it in and get answers out

velvet thorn Apr 22, 2021, 11:45 PM

#

lapis sequoia It kinda just seems that I shove it in and get answers out

unless you're building your own library

lapis sequoia Apr 22, 2021, 11:45 PM

#

Are there preferred weighting systems?

velvet thorn Apr 22, 2021, 11:45 PM

#

it will seem like that

velvet thorn Apr 22, 2021, 11:45 PM

#

lapis sequoia Yeah but what part of my code is distinguishing the weights?

it's all in the fit step

velvet thorn Apr 22, 2021, 11:46 PM

#

lapis sequoia Are there preferred weighting systems?

what do you mean by "weighting systems"

lapis sequoia Apr 22, 2021, 11:46 PM

#

Like, are there common weight architectures people use for more accurate results?

velvet thorn Apr 22, 2021, 11:46 PM

#

lapis sequoia Like, are there common weight architectures people use for more accurate results...

...what do you mean by "weight architecture"

#

"model architecture" makes sense

#

so does "weight initialisation method", but I'm not really sure what you mean by "weight architecture"

lapis sequoia Apr 22, 2021, 11:47 PM

#

Basically how much each neuron's function is changing

velvet thorn Apr 22, 2021, 11:47 PM

#

depending on how you mean that

#

you could be referring to learning rate

#

or optimiser

lapis sequoia Apr 22, 2021, 11:47 PM

#

So if we have a linear neuron y = cx + b let's say

velvet thorn Apr 22, 2021, 11:47 PM

#

(assuming we're still in the realm of gradient descent backpropagation)

lapis sequoia Apr 22, 2021, 11:47 PM

#

The weights would be all the values of c and b

#

Right?

velvet thorn Apr 22, 2021, 11:47 PM

#

well

#

that's an implementation detail; some use one single bias value per layer
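
To make "it's all in the fit step" concrete for the single linear neuron y = cx + b, here is a minimal sketch of what fitting does under the hood, assuming mean squared error and plain gradient descent. The toy data, learning rate and step count are made up for illustration; real libraries compute the gradients automatically via backpropagation.

```python
# Toy data generated from y = 2x + 1 (hypothetical target relationship).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

c, b = 0.0, 0.0        # the neuron's weight and bias, before any learning
learning_rate = 0.05   # how far each update moves c and b

for _ in range(2000):
    # Gradients of the mean squared error with respect to c and b.
    errors = [c * x + b - y for x, y in zip(xs, ys)]
    grad_c = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = sum(2 * e for e in errors) / len(xs)

    # Step against the gradient to reduce the error.
    c -= learning_rate * grad_c
    b -= learning_rate * grad_b

print(round(c, 3), round(b, 3))  # 2.0 1.0
```

Nothing in the code "decides" the weights up front: they are nudged a little on every pass until the predictions stop improving. The learning rate and the choice of optimiser control how big and how smart those nudges are.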