#data-science-and-ml | Python | Page 332

undone flare Aug 10, 2021, 12:28 PM

#

because the default axis is 0, which means rows

silk axle Aug 10, 2021, 12:28 PM

#

Ah yeah, thanks

#

Rather than dropping 40 and keeping 20, is there a way to specify which 20 to keep? @undone flare

#

So basically remove everything except what's passed

undone flare Aug 10, 2021, 12:34 PM

#

you could just create a new dataframe with those 20 columns

silk axle Aug 10, 2021, 12:34 PM

#

undone flare you could just create a new dataframe with those 20 columns

How?

undone flare Aug 10, 2021, 12:38 PM

#

silk axle How?

try new_df = df[["col1", "col2"]].copy()

silk axle Aug 10, 2021, 12:49 PM

#

Ended up doing a different approach and instead only reading the columns from the csv that I want (using usecols kwarg)

#

Seemed bad to read the entire database when I only need a fraction of it
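A minimal sketch of the two approaches mentioned above, assuming a small in-memory CSV (`csv_text` and its column names are made up for illustration): selecting columns after loading versus passing `usecols` to `pd.read_csv` so the unwanted columns are never loaded.

```python
import io

import pandas as pd

# Hypothetical stand-in for a wide CSV file on disk.
csv_text = "id,name,score,notes\n1,ann,0.5,x\n2,bob,0.7,y\n"

# Option 1: read everything, then keep only the wanted columns.
full = pd.read_csv(io.StringIO(csv_text))
subset = full[["id", "score"]].copy()

# Option 2: only parse the wanted columns in the first place.
lean = pd.read_csv(io.StringIO(csv_text), usecols=["id", "score"])

print(subset.equals(lean))
```

Both give the same frame here; the second avoids holding the full table in memory, which is the concern raised above.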

undone flare Aug 10, 2021, 12:49 PM

#

that works too

undone flare Aug 10, 2021, 1:15 PM

#

skewness before transformation

```
ph : 0.04891026669821542
Hardness : -0.08517383101708786
Solids : 0.595449442721807
Chloramines : 0.01296659647911324
Sulfate : -0.04652296251790013
Conductivity : 0.26666972862929905
Organic_carbon : -0.020002726567027108
Trihalomethanes : -0.051383722200829214
Turbidity : -0.03302682552748457
```

after log transformation

```
ph : -2.2032213464172155
Hardness : -0.8250213149082217
Solids : -1.230858118768609
Chloramines : -1.069749910885117
Sulfate : -0.692747912780153
Conductivity : -0.20033687775898243
Organic_carbon : -0.9940495304526159
Trihalomethanes : -1.2119564041594677
Turbidity : -0.702269975309455
```

#

the goal is to make something look like a normal distribution right?
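To make the question concrete, here is a small sketch on synthetic data (the two series and the seed are assumptions, not the water-quality columns above). It compares skewness before and after a plain `np.log`. A log transform tends to help when a column is strongly right-skewed; on a column that is already roughly symmetric it can introduce left skew, which looks consistent with the numbers posted above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# One strongly right-skewed column and one roughly symmetric column.
right_skewed = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=10_000))
symmetric = pd.Series(rng.normal(loc=10.0, scale=1.0, size=10_000))

report = {}
for name, col in {"right_skewed": right_skewed, "symmetric": symmetric}.items():
    # Store (skew before, skew after log) for each column.
    report[name] = (col.skew(), np.log(col).skew())

for name, (before, after) in report.items():
    print(f"{name}: before={before:.3f} after={after:.3f}")
```

So the goal is usually skewness close to 0, and it is worth checking each column before transforming it rather than applying the log everywhere.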

frosty ore Aug 10, 2021, 2:55 PM

#

Any way to install older versions of tensorflow like 2.3.1?

#

Without having to compile from source.

unborn glacier Aug 10, 2021, 2:57 PM

#

pip install tensorflow==2.3.1

#

That's also the way you're supposed to format requirements.txt files, with the exact version listed

frosty ore Aug 10, 2021, 3:03 PM

#

Of course

#

Go ahead and try pip install tensorflow==2.3.1 Let me know how it works.

hoary wigeon Aug 10, 2021, 3:25 PM

#

Hello everyone

#

i need help with deploying my ML model

undone flare Aug 10, 2021, 3:31 PM

#

hoary wigeon i need help with deploying my ML model

yea, you can ask here

hoary wigeon Aug 10, 2021, 3:32 PM

#

i have successfully deployed my application heroku

#

for tweet sentiment analysis

#

It collects a specified number of tweets, analyzes them, and generates reports

#

when i tried using 900 tweets it exceeded the memory limit, going over 640 MiB

#

when i tried with 600 tweets i faced server timeout, delay in response

#

how can i avoid it ?

undone flare Aug 10, 2021, 3:37 PM

#

different dynos offer different max ram

#

are you using a free one?

hoary wigeon Aug 10, 2021, 3:40 PM

#

yes

#

https://c6-tweepy-sentiment.herokuapp.com/

#

check this

#

for now im using 300 tweets

hoary wigeon Aug 10, 2021, 3:40 PM

#

undone flare are you using a free one?

can we extend service timeout ?

#

if not ram

#

so that app can work with 600 tweets

undone flare Aug 10, 2021, 3:43 PM

#

I don't know, haven't used heroku much

hoary wigeon Aug 10, 2021, 3:44 PM

#

im asking about heroku now

#

vectorizer does take time ?

undone flare Aug 10, 2021, 3:47 PM

#

Count Vectorizer or Tfidf?

hoary wigeon Aug 10, 2021, 4:06 PM

#

undone flare Count Vectorizer or Tfidf?

tfidf

undone flare Aug 10, 2021, 4:08 PM

#

hoary wigeon tfidf

did you set stop_words_ to None before pickling? (if you used it)

hoary wigeon Aug 10, 2021, 4:13 PM

#

i didn't set stop words in tfidf

undone flare Aug 10, 2021, 4:13 PM

#

ah alright

#

I don't think Tfidf is slow, I think nltk is slow

hoary wigeon Aug 10, 2021, 4:15 PM

#

actually twitter api is slow in sending response

#

then selecting part from json and creating a dict

#

converting dict to df

#

and vectorizing it, applying model

#

is taking time

undone flare Aug 10, 2021, 4:16 PM

#

can you not load json to pandas df?

hoary wigeon Aug 10, 2021, 4:16 PM

#

contains too much unwanted data

undone flare Aug 10, 2021, 4:16 PM

#

hmm

hoary wigeon Aug 10, 2021, 4:19 PM

#

undone flare hmm

https://hastebin.com/ogawanihaw.py

#

have a look

#

that's just one tweet result

arctic wedgeBOT Aug 10, 2021, 4:21 PM

#

Hey @hoary wigeon!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

undone flare Aug 10, 2021, 4:21 PM

#

yikes

hoary wigeon Aug 10, 2021, 4:21 PM

#

yep

#

so we cant do anything ?

undone flare Aug 10, 2021, 4:22 PM

#

not to my knowledge, maybe someone has done this same thing

hoary wigeon Aug 10, 2021, 4:22 PM

#

#

its just one tweet and like this im using 300 tweets

#

and creating a dataframe

undone flare Aug 10, 2021, 4:24 PM

#

that's gonna take some time ye

lapis sequoia Aug 10, 2021, 4:35 PM

#

How would you put the legend outside to the right?

plt.legend(loc='center right')

has no effect.

hardy hornet Aug 10, 2021, 4:35 PM

#

I've run my tsv file by Pandas on Jupyter

#

but i dont know why it appeared like this

#

anyone know to fix this?

lapis sequoia Aug 10, 2021, 4:37 PM

#

hardy hornet but i dont know why it appeared like this

filenotfound error?

hardy hornet Aug 10, 2021, 4:38 PM

#

but it is true file in my computer

lapis sequoia Aug 10, 2021, 4:38 PM

#

put the absolute path

hardy hornet Aug 10, 2021, 4:39 PM

#

yepp, i have done it

fading wigeon Aug 10, 2021, 4:40 PM

#

What sort of analysis would you use to figure out how close two groups of numbers are to one another?

hardy hornet Aug 10, 2021, 4:40 PM

#

lapis sequoia put the absolute path

thanks bro

lapis sequoia Aug 10, 2021, 4:40 PM

#

fading wigeon What sort of analysis would you use to figure out how close two groups of number...

eigenvalues?

fading wigeon Aug 10, 2021, 4:40 PM

#

Basically there is a "real" number and two algorithms that "guess" what that number is

#

and I have a ton of data

#

Trying to determine which algo is better

#

Oh, pandas has a .corr() method, I'll just use that, haha
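A hedged aside on the `.corr()` idea: correlation measures how linearly related two columns are, not how close the guesses are to the real value. The sketch below uses made-up numbers (`real`, `algo_a`, `algo_b` are hypothetical column names) and adds mean absolute error and root mean squared error, which are a swapped-in technique, not something proposed in the discussion.

```python
import pandas as pd

results = pd.DataFrame(
    {
        "real": [10.0, 12.0, 15.0, 20.0],
        "algo_a": [11.0, 13.0, 16.0, 21.0],  # always off by exactly +1
        "algo_b": [10.0, 12.5, 14.0, 23.0],
    }
)

# Correlation with the real value: captures linear association only.
corr = results.corr()["real"]

# Per-row errors of each algorithm against the real value.
errors = results[["algo_a", "algo_b"]].sub(results["real"], axis=0)
mae = errors.abs().mean()            # mean absolute error
rmse = (errors ** 2).mean() ** 0.5   # root mean squared error

print(corr, mae, rmse, sep="\n")
```

Here `algo_a` correlates perfectly with `real` even though every guess is off by one, so an error metric is the more direct way to rank the two algorithms.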

lapis sequoia Aug 10, 2021, 4:54 PM

#

Anyone know how I can move my matplotlib legend outside to the right of my graph?

silk axle Aug 10, 2021, 5:04 PM

#

lapis sequoia Anyone know how I can move my matplotlib legend outside to the right of my graph...

loc="right"?

lapis sequoia Aug 10, 2021, 5:07 PM

#

silk axle `loc="right"`?

Only moves right and inside the graph.

silk axle Aug 10, 2021, 5:07 PM

#

You can't move it outside

#

Or at least not with the default options

#

There might be a different method

lapis sequoia Aug 10, 2021, 5:07 PM

#

You can. I think it's done with a subplot.

#

https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.legend.html

#

bbox_to_anchor
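A short sketch of the `bbox_to_anchor` approach, assuming a simple two-line plot (the data and the exact offsets `1.02` and `0.75` are illustrative choices). `loc` picks which point of the legend box is pinned, and `bbox_to_anchor` says where to pin it in axes coordinates, so an x value above 1 lands outside the plot area.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], label="squares")
ax.plot([0, 1, 2], [0, 1, 2], label="linear")

# Pin the legend's left-center point just past the right edge of the axes.
legend = ax.legend(loc="center left", bbox_to_anchor=(1.02, 0.5))

# Shrink the axes a little so the legend is not clipped when saving.
fig.subplots_adjust(right=0.75)
```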

somber prism Aug 10, 2021, 5:32 PM

#

anyone know a good metrics for multiclass clf ?

gentle epoch Aug 10, 2021, 5:40 PM

#

I need an opinion about pandas

#

can I ask it here?

serene scaffold Aug 10, 2021, 5:40 PM

#

gentle epoch I need an opinion about pandas

Yes

gentle epoch Aug 10, 2021, 5:42 PM

#

I'm still rather new to programming in general and this is just my second program. I'm using pandas to parse a csv file and pick specific cells with iloc

#

I haven't read the whole documentation yet, just blog posts and Stack Overflow

#

my question is, is defining the type of data contained in each column with dtype={} really necessary?

#

@serene scaffold

serene scaffold Aug 10, 2021, 5:44 PM

#

gentle epoch my question is, is defining the type of data contained in each column with dtype...

not always

#

can you show what you've written so far?

#

!code

arctic wedgeBOT Aug 10, 2021, 5:44 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

gentle epoch Aug 10, 2021, 5:45 PM

#

here's a sample

#

```py
from os import path
import numpy as np
import random
import pandas as pd
from datetime import datetime as dt
from datetime import timedelta as td



def rollDice(numDice):
    results = []
    for x in range(numDice):
        results.append(random.choice(range(1,7)))

    return sum(results)

print(rollDice(3))


def HabMod_Table(path, rowN, colN):
    a = pd.read_csv(path)

    return a.iloc[rowN,colN]


table_path = R"H:\01 Libraries\Documents\Tosh0kan Studios\Coding\GURPS Space\Tables\12 - Habitability Modifiers Table.csv"
path = table_path

ovrl_wrldType = HabMod_Table(path,1,1)

print(type(ovrl_wrldType))
```

#

and here's the table:

```
"No atmosphere or Trace atmosphere"                                               ,0
"Non-breathable atmosphere, Very Thin or above, Suffocating, Toxic, and Corrosive",-2
"Non-breathable atmosphere, Very Thin or above, Suffocating and Toxic only"       ,-1
"Non-breathable atmosphere, Very Thin or above, Suffocating only"                 ,0
"Breathable atmosphere (Very Thin)"                                               ,1
"Breathable atmosphere (Thin)"                                                    ,2
"Breathable atmosphere (Standard or Dense)"                                       ,3
"Breathable atmosphere (Very Dense or Superdense)"                                ,1
"Breathable atmosphere is not Marginal"                                           ,1
"No liquid-water oceans, or Hydrographic Coverage 0%"                             ,0
"Liquid-water oceans, Hydrographic Coverage 1% to 59%"                            ,1
"Liquid-water oceans, Hydrographic Coverage 60% to 90%"                           ,2
"Liquid-water oceans, Hydrographic Coverage 91% to 99%"                           ,1
"Liquid-water oceans, Hydrographic Coverage 100%"                                 ,0
"Breathable atmosphere, climate type is Frozen or Very Cold"                      ,0
"Breathable atmosphere, climate type is Cold"                                     ,1
"Breathable atmosphere, climate type is Chilly, Cool, Normal, Warm, or Tropical"  ,2
"Breathable atmosphere, climate type is Hot"                                      ,1
"Breathable atmosphere, climate type is Very Hot or Infernal"                     ,0
```

serene scaffold Aug 10, 2021, 5:46 PM

#

doesn't seem like pandas is necessary for this.

gentle epoch Aug 10, 2021, 5:46 PM

#

agreed

#

but it was the first method I found when googling "how to pick a cell in a csv with python" so here we are

serene scaffold Aug 10, 2021, 5:47 PM

#

well, looks like you figured it out :lemon_hyperpleased:

gentle epoch Aug 10, 2021, 5:47 PM

#

yeah

serene scaffold Aug 10, 2021, 5:47 PM

#

Also, your code is not following pep8

gentle epoch Aug 10, 2021, 5:47 PM

#

what's pep8?

serene scaffold Aug 10, 2021, 5:47 PM

#

the style guide. you should never have variableNamesLikeThis

#

# not pep8
def rollDice(numDice):
    results = []
    for x in range(numDice):
        results.append(random.choice(range(1,7)))
    
    return sum(results)

# pep8
def roll_dice(num_dice):
    results = []
    for x in range(num_dice):
        results.append(random.choice(range(1,7)))
    
    return sum(results)

#

# not pep8
def HabMod_Table(path, rowN, colN):
    a = pd.read_csv(path)
    
    return a.iloc[rowN,colN]


# pep8
def hab_mod_table(path, row_n, col_n):
    a = pd.read_csv(path)
    return a.iloc[row_n,col_n]

gentle epoch Aug 10, 2021, 5:49 PM

#

I see

#

generally, when should I use dtype={}?

#

because pandas is too awesome to run tables any other way from now on

#

@serene scaffold

serene scaffold Aug 10, 2021, 6:01 PM

#

gentle epoch @serene scaffold

If you want numeric types to be stored a certain way.

gentle epoch Aug 10, 2021, 6:02 PM

#

serene scaffold If you want numeric types to be stored a certain way.

can you elaborate?

serene scaffold Aug 10, 2021, 6:08 PM

#

gentle epoch can you elaborate?

Like float or int32 or whatever
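As a rough illustration of that answer, the sketch below reads the same small CSV twice (`csv_text` and its column names are invented for the example): once letting pandas infer the types, and once with `dtype={...}` to store the numbers in narrower types.

```python
import io

import pandas as pd

# Hypothetical table; any small numeric CSV would do.
csv_text = "name,modifier,weight\nthin,2,0.5\ndense,3,1.5\n"

# Let pandas infer the column types.
inferred = pd.read_csv(io.StringIO(csv_text))

# Ask for specific storage types for the numeric columns.
explicit = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"modifier": "int32", "weight": "float32"},
)

print(inferred.dtypes, explicit.dtypes, sep="\n\n")
```

For small tables like the ones in this discussion the inferred types are usually fine; `dtype` mostly matters for memory use or when a column must keep a particular type.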

gentle epoch Aug 10, 2021, 6:14 PM

#

I see

#

also, it's reading numbers as str for some reason lol

#

so I guess I should just always use dtype

#

how annoying

serene scaffold Aug 10, 2021, 6:22 PM

#

gentle epoch so I guess I should just always use dtype

You don't always need to use it though.

#

I rarely do. I'd have to look at the source file and the code to understand why your numbers are being inferred as strings.

chilly geyser Aug 10, 2021, 6:26 PM

#

I'm able to get proper ints and floats from really simple examples.

A bit more 'adversarial' example might be csvs that store floats with surrounding ""

gentle epoch Aug 10, 2021, 6:28 PM

#

```
Pressure Range(Lower End),Pressure Range(Higher End),Pressure Category
9E10000000               ,0.009                     ,Trace
0.01                     ,0.5                       ,Very Thin
0.51                     ,0.8                       ,Thin
0.81                     ,1.2                       ,Standard
1.21                     ,1.5                       ,Dense
1.51                     ,10                        ,Very Dense
11                       ,9E10000000                ,Superdense
```

#

all values in this table are being read as strings

chilly geyser Aug 10, 2021, 6:29 PM

#

Those don't seem really...proper 👀

#

I think it'd be hard for a computer to know they are floats

gentle epoch Aug 10, 2021, 6:29 PM

#

I mean

#

how is 1.51 not proper? legit question

sonic scaffold Aug 10, 2021, 6:29 PM

#

What does it mean by ambiguous, i got ValueError saying the truth value of a Series is ambiguous

chilly geyser Aug 10, 2021, 6:30 PM

#

gentle epoch how is `1.51` not proper? legit question

It's not that `1.51` is bad, it's that `1.51                     ` (with the trailing spaces) is bad

#

I don't think space-padding in csvs is common, you generally get them really dense

hdr,hdr2
0,1.523523
234,4.5234
23,666.3453

chilly geyser Aug 10, 2021, 6:31 PM

#

sonic scaffold What does it mean by ambiguous, i got ValueError saying the truth value of a Ser...

You might be asking for `if pandas.Series` (an instance, not the class), which is taken to be ambiguous

#

So the elements of pandas.Series can be truthy or falsey

gentle epoch Aug 10, 2021, 6:32 PM

#

chilly geyser I don't think space-padding in csvs is common, you generally get them really den...

ah okay. it's something I added on Notepad++ so the columns are a bit clearer

chilly geyser Aug 10, 2021, 6:32 PM

#

And you should try to ask for the truthiness of the elements of the Series instead

sonic scaffold Aug 10, 2021, 6:32 PM

#

chilly geyser You might be asking for `if pandas.Series` (an instance, not the class) which is...

So that means they are neither true nor false?

chilly geyser Aug 10, 2021, 6:32 PM

#

sonic scaffold So that means they are neither true nor false?

lol I have no idea, I think they just raise an error actually

gentle epoch Aug 10, 2021, 6:32 PM

#

serene scaffold I rarely do. I'd have to look at the source file and the code to understand why ...

code

```py
import pandas as pd
from datetime import datetime as dt
from datetime import timedelta as td

from pandas.io.parsers import read_csv



def atmo_pressure_table(path, rowN, colN):
    a = pd.read_csv(path)

    return a.iloc[rowN,colN]


table_path = R"H:\01 Libraries\Documents\Tosh0kan Studios\Coding\GURPS Space\Tables\3 - Atmospheric Pressure Categories Table.csv"

ovrl_wrldType = atmo_pressure_table(table_path,5,0)

print(type(ovrl_wrldType))
```

gentle epoch Aug 10, 2021, 6:32 PM

#

gentle epoch ```Pressure Range(Lower End),Pressure Range(Higher End),Pressure Category 9E1000...

the table

sonic scaffold Aug 10, 2021, 6:33 PM

#

chilly geyser And you should try to ask for the truthiness of the *elements* of the `Series` i...

Ohh so i'll have to target the elements of the series right?

chilly geyser Aug 10, 2021, 6:33 PM

#

sonic scaffold Ohh so i'll have to target the elements of the series right?

That's probably what you want, right?

#

You generally want an `all` or `any` over the entire Series

sonic scaffold Aug 10, 2021, 6:33 PM

#

Yeah i want to get out the values that are False

chilly geyser Aug 10, 2021, 6:33 PM

#

Or some and of some kind, of the elements of the series
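A small sketch of the point being made, using an invented `prices` Series: putting a whole boolean Series in an `if` raises the "ambiguous" error, while `.any()` and `.all()` reduce it to a single answer.

```python
import pandas as pd

prices = pd.Series([0.0, 1.99, 0.0])
is_paid = prices > 0  # element-wise comparison gives a boolean Series

message = None
try:
    if is_paid:  # ambiguous: which element should decide?
        pass
except ValueError as exc:
    message = str(exc)

any_paid = is_paid.any()  # True if at least one element is True
all_paid = is_paid.all()  # True only if every element is True

print(message)
print(any_paid, all_paid)
```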

gentle epoch Aug 10, 2021, 6:34 PM

#

@chilly geyser padding removed

#

```
Pressure Range(Lower End),Pressure Range(Higher End),Pressure Category
9E10000000,0.009,Trace
0.01,0.5,Very Thin
0.51,0.8,Thin
0.81,1.2,Standard
1.21,1.5,Dense
1.51,10,Very Dense
11,9E10000000,Superdense
```

chilly geyser Aug 10, 2021, 6:34 PM

#

sonic scaffold Yeah i want to get out the values that are False

As in, the values are exactly False?

gentle epoch Aug 10, 2021, 6:34 PM

#

still being read as string

chilly geyser Aug 10, 2021, 6:34 PM

#

Hmm that's a good question

#

I didn't know it doesn't understand float notation xEy

#

I think that might be a cause

sonic scaffold Aug 10, 2021, 6:35 PM

#

chilly geyser As in, the values are exactly `False`?

print(data1[data1['Price']>0 & data1['Type']=='Free'])

#

This should target the elements idk why it's giving me that error

gentle epoch Aug 10, 2021, 6:36 PM

#

the result

```powershell
PS H:\01 Libraries\Documents\Tosh0kan Studios\Coding> & C:/Users/Tosh0kan/AppData/Local/Programs/Python/Python39/python.exe "h:/01 Libraries/Documents/Tosh0kan Studios/Coding/tester.py"
1.51
<class 'str'>
```

sonic scaffold Aug 10, 2021, 6:37 PM

#

sonic scaffold This should target the elements idk why it's giving me that error

nvm it worked i missed out brackets
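For reference, a sketch of the bracketed version, assuming a tiny stand-in for `data1` (the rows are invented). In Python `&` binds more tightly than `>` and `==`, so each comparison needs its own parentheses before the masks are combined.

```python
import pandas as pd

# Hypothetical stand-in for the data1 frame from the discussion.
data1 = pd.DataFrame(
    {"Price": [0.0, 2.99, 0.0, 4.99], "Type": ["Free", "Free", "Paid", "Paid"]}
)

# Parenthesize each comparison, then combine the boolean masks with &.
mislabelled = data1[(data1["Price"] > 0) & (data1["Type"] == "Free")]

print(mislabelled)
```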

chilly geyser Aug 10, 2021, 6:39 PM

#

gentle epoch the result ```powershell PS H:\01 Libraries\Documents\Tosh0kan Studios\Coding> &...

Wait I think I got it

chilly geyser Aug 10, 2021, 6:39 PM

#

gentle epoch the result ```powershell PS H:\01 Libraries\Documents\Tosh0kan Studios\Coding> &...

Not sure why yours is being str but when I changed the 9E10000000 to be within float limits like 3E100 it got read as a float properly

gentle epoch Aug 10, 2021, 6:40 PM

#

wait

#

there are limits to scientific notation?

chilly geyser Aug 10, 2021, 6:40 PM

#

Lmao yes

gentle epoch Aug 10, 2021, 6:40 PM

#

over which, it just becomes a string?

chilly geyser Aug 10, 2021, 6:40 PM

#

It's stored in a limited memory space

gentle epoch Aug 10, 2021, 6:40 PM

#

whaaaaaa?! lmao

#

oh

#

I see

chilly geyser Aug 10, 2021, 6:41 PM

#

There's an IEEE standard for this (IEEE 754); the double (i.e. float64) limit is about ±1.8E308

chilly geyser Aug 10, 2021, 6:41 PM

#

gentle epoch whaaaaaa?! lmao

The exact string inf will work as well

#

I'm not too sure about InF, Inf, inF, etc within the csv, can test that

gentle epoch Aug 10, 2021, 6:42 PM

#

InF, Inf, inF,
what are these?

#

ah

#

infinity

chilly geyser Aug 10, 2021, 6:42 PM

#

oh, they all seem to work, so I think it's auto .lower()-ing them or something

#

Yeah there are a lot of 'obvious things' done in pd I think

#

It's almost too convenient

gentle epoch Aug 10, 2021, 6:43 PM

#

because like

#

these tables are for a tabletop RPG

#

and I'm making a program to automate rolling them

chilly geyser Aug 10, 2021, 6:44 PM

#

is there a difference between inf and those numbers you were using

gentle epoch Aug 10, 2021, 6:44 PM

#

I didn't know how to refer to inf in a csv

#

so I used a ludicrous number lol

chilly geyser Aug 10, 2021, 6:44 PM

#

ah, so this should work

gentle epoch Aug 10, 2021, 6:45 PM

#

I put 9E10000000 in those two places because in the table is "0.009 or less"

chilly geyser Aug 10, 2021, 6:45 PM

#

I think there are dedicated functions for inf, like np.isinf()

gentle epoch Aug 10, 2021, 6:45 PM

#

so I just needed a really ridiculous number that would never come up

chilly geyser Aug 10, 2021, 6:45 PM

#

Oh, that will certainly fail that, no issue with infinity order checking

gentle epoch Aug 10, 2021, 6:46 PM

#

this is the original table

chilly geyser Aug 10, 2021, 6:46 PM

#

Functionally (to me?) positive infinity is just an entity that is greater than any number, and should(?) error if compared against another positive infinity

#

Yeah >10 seems like you can go inf on it

#

less than 0.01, you can use 0? or just -inf

gentle epoch Aug 10, 2021, 6:48 PM

#

I don't think it can read negative inf

#

I put -inf on the table

#

but it's reading print(ovrl_wrldType < 0) as false

#

when it should be true

#

so I'll go with just 0
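A minimal sketch of how `inf` and `-inf` cells behave, assuming a reasonably recent pandas and a cut-down, unpadded version of the table (`csv_text` and its short column names are made up). In this sketch both columns parse as floats and `-inf` compares below zero; the file in the discussion may have differed in ways the log doesn't show.

```python
import io
import math
import sys

import pandas as pd

# Largest finite float64, roughly 1.8e308; larger literals cannot be stored.
largest = sys.float_info.max

# Hypothetical two-row version of the pressure table, without padding.
csv_text = "low,high,category\n-inf,0.009,Trace\n11,inf,Superdense\n"
table = pd.read_csv(io.StringIO(csv_text))

print(table.dtypes)
print(table.loc[0, "low"] < 0, math.isinf(table.loc[1, "high"]))
```

Using `0` for the lower bound, as settled on above, also works since a pressure can't go below it.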

proven sigil Aug 10, 2021, 8:24 PM

#

gentle epoch ah okay. it's something I added on Notepad++ so the columns are a bit clearer

Use tab ('\t') as delimiter if you want that

proven sigil Aug 10, 2021, 8:36 PM

#

gentle epoch so I'll go with just 0

yes, pressure can't be negative.

prime hearth Aug 10, 2021, 9:04 PM

#

hello, i would like to know if ML certificates catch an employer's eye more than someone who didn't get one but has projects to showcase that they know ML?

I am currently pursuing a degree in CS, but school doesn't teach ML, so I'm learning it on my own.

quasi sparrow Aug 10, 2021, 9:20 PM

#

Hey guys, question about ML

#

What do you call a machine learning task that uses its output for another machine learning task?

gentle epoch Aug 10, 2021, 9:24 PM

#

is there any difference between:

```py
import pandas as pd

def csv_parser(path):
    a = pd.read_csv(path)

    return a

teeburu = csv_parser(R"H:\01 Libraries\Documents\Tosh0kan Studios\Coding\GURPS Space\Tables\1 - Overall Type Table.csv")

print(type(teeburu.iloc[0,1]))
print(type(teeburu.iloc[0,0]))
```

and:
```py
import pandas as pd

def csv_parser(path):
    a = pd.read_csv(path)

    return a

teeburu = csv_parser(R"H:\01 Libraries\Documents\Tosh0kan Studios\Coding\GURPS Space\Tables\1 - Overall Type Table.csv")

teeburu_df = pd.DataFrame(data=teeburu)

print(type(teeburu_df.iloc[0,1]))
print(type(teeburu_df.iloc[0,0]))
```

quasi sparrow Aug 10, 2021, 9:26 PM

#

I think the second one is redundant

#

You don't need to specify it as a dataframe if you have already opened the CSV with pandas.

#

Pandas will load it as a dataframe

gentle epoch Aug 10, 2021, 9:28 PM

#

quasi sparrow Pandas will load it as a dataframe

so, with the first, the csv is already being stored internally?

quasi sparrow Aug 10, 2021, 9:31 PM

#

import pandas as pd
teeburu=pd.read_csv(r"H:\01 Libraries\Documents\Tosh0kan Studios\Coding\GURPS Space\Tables\1 - Overall Type Table.csv",header=None,sep=';')

print(type(teeburu_df.iloc[0,1]))
print(type(teeburu_df.iloc[0,0]))

#

I normally read CSV documents like this

#

you can choose if you want headers or not, and the sep depends on what your CSV uses to separate columns

#

It could be comma, or this ;

#

oops, like this:

import pandas as pd
teeburu=pd.read_csv(r"H:\01 Libraries\Documents\Tosh0kan Studios\Coding\GURPS Space\Tables\1 - Overall Type Table.csv",header=None,sep=';')

print(type(teeburu.iloc[0,1]))
print(type(teeburu.iloc[0,0]))

#

if the file does not load, check for relative or absolute path.

quasi sparrow Aug 10, 2021, 9:34 PM

#

gentle epoch so, with the first, the csv is already being stored internally?

Yeah, it's in memory as an object.

gentle epoch Aug 10, 2021, 9:50 PM

#

alright

#

thank you

gentle epoch Aug 10, 2021, 9:52 PM

#

quasi sparrow Yeah, it's in memory as an object.

so, if I set header=0, then the first row will be the column names?

strange portal Aug 10, 2021, 9:53 PM

#

How can I do a very simple reinforcement learning, I already know a little about ML

quasi sparrow Aug 10, 2021, 9:54 PM

#

gentle epoch so, if I set `header=0`, then the first row will be the column names?

No, the arguments apply to the entire dataset. If set to false, the entire header row will be left out

#

you'll end up with a matrix of just features but no description/headers

gentle epoch Aug 10, 2021, 9:56 PM

#

quasi sparrow No, the arguments apply to the entire dataset. If set to false, the entire heade...

but that's set to 0, as in row 0.

#

yep

#

here

#

header : int, list of int, default ‘infer’

Row number(s) to use as the column names, and the start of the data. Default behavior is to infer the column names: if no names are passed the behavior is identical to header=0 and column names are inferred from the first line of the file,

#

straight from the documentation

quasi sparrow Aug 10, 2021, 9:57 PM

#

feature1|feature2|feature3|
---------------------------
0934828|349823849|348238943|
---------------------------
4327823|323848484|378327474|
---------------------------

if set to False it is loaded like this:

---------------------------
0934828|349823849|348238943|
---------------------------
4327823|323848484|378327474|
---------------------------

#

If set to zero, then you are telling pandas row 0 is the header row

gentle epoch Aug 10, 2021, 9:57 PM

#

I'm not asking about false. I was asking about setting header to 0

#

I don't think I ever asked anything about false

quasi sparrow Aug 10, 2021, 9:58 PM

#

Oh yeah. Set to zero points to the zero row as header row
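To pin down the two cases, here is a small sketch with an invented two-column CSV. `header=0` uses the first line as column names, as the quoted documentation says. For a file with no header line, the documented spelling is `header=None` rather than `False`; with it, pandas numbers the columns and keeps every line as data, so nothing is dropped.

```python
import io

import pandas as pd

# Hypothetical file contents with a header line.
csv_text = "feature1,feature2\n10,20\n30,40\n"

# header=0: line 0 supplies the column names, the rest is data.
with_header = pd.read_csv(io.StringIO(csv_text), header=0)

# header=None: columns are numbered and line 0 is kept as a data row.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(with_header)
print(no_header)
```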

quasi sparrow Aug 10, 2021, 9:58 PM

#

gentle epoch I don't think I ever asked anything about false

ok

strange portal Aug 10, 2021, 9:58 PM

#

I did something amazing with neural network, can I send it to you?

#

the code

gentle epoch Aug 10, 2021, 10:00 PM

#

proven sigil Use tab ('\t') as delimiter if you want that

how so?
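One way to read the tab suggestion, sketched with an invented excerpt of the table: save the file with tab characters between fields so the columns line up visually in an editor, then tell `pd.read_csv` about it with `sep="\t"`.

```python
import io

import pandas as pd

# Hypothetical tab-separated version of a few rows of the pressure table.
tsv_text = "low\thigh\tcategory\n0.01\t0.5\tVery Thin\n0.51\t0.8\tThin\n"

table = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(table.dtypes)
```

Because no padding spaces end up inside the fields, the numeric columns are still inferred as floats.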

strange portal Aug 10, 2021, 10:03 PM

#

😦

prime hearth Aug 10, 2021, 10:13 PM

#

sorry repost :
hello, i would like to know if ML certificates catch an employer's eye more than someone who didn't get one but has projects to showcase that they know ML?

I am currently pursuing a degree in CS, but school doesn't teach ML, so I'm learning it on my own.

strange portal Aug 10, 2021, 10:17 PM

#

prime hearth sorry repost : hello, i would like to please know if ML certificates gives more...

can i send a script using ML, do you want to see?

prime hearth Aug 10, 2021, 10:17 PM

#

you can just post it here

#

but i not sure what you want me to do

strange portal Aug 10, 2021, 10:19 PM

#

prime hearth but i not sure what you want me to do

Maybe with the script, you can get a sense of it, it's very basic, I send it and teach you step by step

visual field Aug 10, 2021, 10:28 PM

#

Today I was using pandas to import a csv into sqlite3 with ';' as the separator, and noticed that all 2000 records imported fine except for 3, where one of the columns contained a semicolon and so produced an extra column. I'm trying to find a way to import the csv data while ignoring that extra semicolon. Any thoughts or ideas?

prime hearth Aug 10, 2021, 10:33 PM

#

@strange portal oh sorry, it just my question is different, i was wondering about if ML certificate are really worth it when it comes down to internships

#

because i already have projects to showcase that i know ML, but there are other interns who have a Coursera certificate in ML

strange portal Aug 10, 2021, 10:34 PM

#

prime hearth @strange portal oh sorry, it just my question is different, i was wonderi...

Sorry, I can't help you now, I live in Brazil

prime hearth Aug 10, 2021, 10:35 PM

#

but yeah thats cool to see, you can share git repo here

strange portal Aug 10, 2021, 10:36 PM

#

prime hearth but yeah thats cool to see, you can share git repo here

can i share the zipped folder

strange portal Aug 10, 2021, 10:39 PM

#

prime hearth but yeah thats cool to see, you can share git repo here

send dm, ok?

grave frost Aug 10, 2021, 10:47 PM

#

prime hearth sorry repost : hello, i would like to please know if ML certificates gives more...

nothing major, but it might help

grave frost Aug 10, 2021, 10:48 PM

#

strange portal How can I do a very simple reinforcement learning, I already know a little about...

start with the basic math, and then move up with Q-learning to DQNs

serene scaffold Aug 10, 2021, 10:56 PM

#

gentle epoch is there any difference between: ```py import pandas as pd def csv_parser(path)...

your csv_parser is just a wrapper around pd.read_csv and doesn't actually afford you anything.

iron basalt Aug 10, 2021, 10:57 PM

#

strange portal How can I do a very simple reinforcement learning, I already know a little about...

https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf

strange portal Aug 10, 2021, 11:11 PM

#

grave frost start with the basic math, and then move up with Q-learning to DQNs

using keras?

iron basalt Aug 10, 2021, 11:15 PM

#

strange portal using keras?

You don't need any libraries to do basic RL.

#

(And I recommend you don't and instead start out with simple tabular implementations)

strange portal Aug 10, 2021, 11:16 PM

#

iron basalt (And I recommend you don't and instead start out with simple tabular implementat...

ok, thanks

iron basalt Aug 10, 2021, 11:18 PM

#

strange portal ok, thanks

The book I linked is written by the people that came up with the stuff in the first place (RL, in its modern form).

#

It's not a very hard read, although it does require some math.

quiet vault Aug 10, 2021, 11:35 PM

#

I am using keras and it is training models on my cpu. Is there a way to make it use my gpu?

lapis sequoia Aug 11, 2021, 12:08 AM

#

spank it

desert oar Aug 11, 2021, 1:13 AM

#

gentle epoch so, with the first, the csv is already being stored internally?

Don't overthink it. pd.read_csv reads the data and returns it as a DataFrame

gentle epoch Aug 11, 2021, 2:39 AM

#

desert oar Don't overthink it. `pd.read_csv` reads the data and returns it as a DataFrame

alright. thank you!

#

btw, i noticed something weird

#

I've been messing with themes in vscode

#

and for some reason, vscode thinks iloc is a variable, rather than part of a function, in regards to coloring

#

it works fine

#

it just the color

ripe forge Aug 11, 2021, 2:44 AM

#

Well, iloc isn't a function is it.

#

Ie. You don't use it with round brackets.
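A tiny sketch of that distinction, with a made-up two-column frame: `iloc` is an attribute that hands back an indexer object, and the lookup itself happens through square brackets, which may be why an editor colors it like a variable.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

indexer = df.iloc          # attribute access returns an indexer object
cell = df.iloc[1, 0]       # square brackets do the positional lookup
first_row = indexer[0]     # the stored indexer can be subscripted too

print(type(indexer).__name__, cell, first_row.tolist())
```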

desert oar Aug 11, 2021, 5:01 AM

#

what was that library that was some kind of wrapper around numba, jax, torch, and a few other python-optimizer tools?

#

transonic

#

!pypi transonic

arctic wedgeBOT Aug 11, 2021, 5:02 AM

#

transonic v0.4.10

Make your Python code fly at transonic speeds!

copper loom Aug 11, 2021, 5:10 AM

#

im using pytorch ....i have a folder of images that i want to pass to model one by one and get outputs ? custom dataloader just seems too much to do all i want is to pass files one by one to a model

gentle epoch Aug 11, 2021, 5:15 AM

#

ripe forge Well, iloc isn't a function is it.

actually, the textMate scope that changes that color is source.python. whatever that means

tawdry lily Aug 11, 2021, 5:16 AM

#

trying to get a notebook that was written with tensorflow 1 working locally on cuda 11 like: https://cdn.discordapp.com/attachments/777174797934264320/874379048787247224/redditsave.com_from_ttmrs_twitter_russell_after_he_loses_alyxs-g2yjxzkpu0g71.mp4

arctic wedgeBOT Aug 11, 2021, 5:48 AM

#

Hey @lapis sequoia!

It looks like you tried to attach a Python file - please use a code-pasting service such as https://paste.pythondiscord.com

vast elk Aug 11, 2021, 6:17 AM

#

Hello!, does anyone have ideas about data analytics business in pandemic era that has social impacts?

*sorry for broken English, TYSM

bold timber Aug 11, 2021, 6:22 AM

#

Hi, I have a question : whether Recommender System is part of Machine Learning?

desert oar Aug 11, 2021, 6:48 AM

#

bold timber Hi, I have a question : whether Recommender System is part of Machine Learning?

yes

desert oar Aug 11, 2021, 6:49 AM

#

vast elk Hello!, does anyone have ideas about data analytics business in pandemic era tha...

analyzing public health data could be good. someone recently here was working on hospital bed occupancy during covid

undone flare Aug 11, 2021, 7:22 AM

#

So this is a classification problem, I fixed the skewed data, scaled the data but still can't get better results what else can I do?

#

also weirdly enough scaled and non scaled data give me the same results

#

fathom tiger Aug 11, 2021, 7:39 AM

#

undone flare So this is a classification problem, I fixed the skewed data, scaled the data bu...

I think you can do a lot to get better performance, you can take some time to do feature engineering (this might be tedious if you don’t have domain knowledge). You can gather more data, you can tune the hyperparameters of your models, etc. I also think having a domain understanding of the problem would help a lot.

undone flare Aug 11, 2021, 7:41 AM

#

fathom tiger I think you can do a lot to get better performance, you can take some time to do...

I can't gather more data so that's not happening, I will try tuning the hyper parameters of the models. Also I have some domain knowledge

#

Thanks for the suggestions

fathom tiger Aug 11, 2021, 7:45 AM

#

That’s great having domain knowledge… you can spend time on feature engineering, it would help your models a lot. Cheers 🥂

lapis sequoia Aug 11, 2021, 7:52 AM

#

Has anyone here applied ML to heavy-tail data? I have a few questions

desert oar Aug 11, 2021, 8:02 AM

#

undone flare So this is a classification problem, I fixed the skewed data, scaled the data bu...

scaling won't affect linear model results, it's only for interpretability. but it also helps with numerical stability and it does have an effect in nonlinear models.

#

in a linear model, if you scale x up by 10, the parameter just scales down by 10 to compensate
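
A small numerical check of that claim, as a sketch under one explicit assumption: the model is plain, unregularized least squares. The data and the `fit_line` helper are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 3.0 * x + 1.0 + rng.normal(scale=0.1, size=200)

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # [slope, intercept]

slope, intercept = fit_line(x, y)
slope10, intercept10 = fit_line(10 * x, y)  # same data, x scaled up by 10

# The slope shrinks by 10 and the predictions do not change.
pred = slope * x + intercept
pred10 = slope10 * (10 * x) + intercept10
print(slope, slope10)
print(np.allclose(pred, pred10))
```

With a penalty on the coefficients this no longer holds exactly, because the penalty depends on coefficient size. scikit-learn's `LogisticRegression`, for example, applies an L2 penalty by default, so scaling can change its results.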

#

how many classes do you have?

#

what are the f1 scores, precision, recall, log or brier score, etc.?

undone flare Aug 11, 2021, 8:09 AM

#

desert oar how many classes do you have?

9 features

desert oar Aug 11, 2021, 8:09 AM

#

classes, not features

undone flare Aug 11, 2021, 8:11 AM

#

I don't know what you mean by classes, do you mean the categories the model outputs?

desert oar Aug 11, 2021, 8:15 AM

#

yes

#

those are usually called "classes"

#

the term "categories" is usually applied to features, as in "categorical features"

undone flare Aug 11, 2021, 8:15 AM

#

2 (water is potable or not)

desert oar Aug 11, 2021, 8:15 AM

#

oh, so it's binary

#

that's easier then

undone flare Aug 11, 2021, 8:16 AM

#

yea I think the features are making it harder

desert oar Aug 11, 2021, 8:16 AM

#

why do you think that?

undone flare Aug 11, 2021, 8:16 AM

#

tried many models with different hyper params but still can't get the mean score above a certain point

desert oar Aug 11, 2021, 8:19 AM

#

are you expecting higher accuracy? sometimes data is noisy or the features just aren't that tightly related to the outcome

undone flare Aug 11, 2021, 8:20 AM

#

features just aren't that tightly related to the outcome
I believe that's the case

desert oar Aug 11, 2021, 8:20 AM

#

are these ppm values?

#

not that it matters, maybe there isn't any interesting feature engineering to be done here

undone flare Aug 11, 2021, 8:21 AM

#

different units ppm, μg/L and mg/L

undone flare Aug 11, 2021, 8:22 AM

#

desert oar not that it matters, maybe there isn't any interesting feature engineering to be...

yea not that I can think of any

desert oar Aug 11, 2021, 8:22 AM

#

how is potability determined?

#

70% out-of-sample accuracy based on those 8 features actually sounds kind of good, but i'm not a water treatment expert either

undone flare Aug 11, 2021, 8:23 AM

#

mtft is the standard I think

#

also I think hard water doesn't really make drinking water unsafe

weary summit Aug 11, 2021, 8:31 AM

#

Hi
I have two numpy arrays: data, which has shape (3, n), and points, which has shape (3, m).

Until now, there was a for loop over n which produced the following output:
for i in range(n):
    diff = data[:, i, np.newaxis] - points  # shape is (3, m)

meaning, subtract each column n times.
I want to do that in a vectorized fashion, so that the result has shape (n, 3, m).

How can I achieve that?

tulip girder Aug 11, 2021, 8:52 AM

#

Minecraft ai bot possible?

tidal bough Aug 11, 2021, 8:53 AM

#

weary summit Hi I have two numpy arrays, data which has (3,n) dim points which has (3, m) d...

Hmm, that 3 is annoying, otherwise it'd be np.subtract.outer.

#

Oh, you can probably use einsum.
EDIT: ah, sadly no, doesn't support subtraction

#

you want res[i,j,k] = data[i,j] - points[i,k], I believe, where i is in range(3), j in range(n), k in range(m).

I think you might be able to broadcast data and points both to the output shape and subtract them like that.
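
A sketch of that broadcasting idea, with small made-up sizes for `n` and `m`. Inserting a length-1 axis into each array lets NumPy expand both to the common shape, and a transpose then gives the (n, 3, m) layout from the question.

```python
import numpy as np

n, m = 4, 5  # hypothetical sizes
rng = np.random.default_rng(0)
data = rng.normal(size=(3, n))
points = rng.normal(size=(3, m))

# res[i, j, k] = data[i, j] - points[i, k], shape (3, n, m)
res = data[:, :, np.newaxis] - points[:, np.newaxis, :]

# Move the n axis to the front to get shape (n, 3, m).
res_n3m = res.transpose(1, 0, 2)

# Reference: the original loop, stacked along a new leading axis.
loop = np.stack([data[:, i, np.newaxis] - points for i in range(n)])

print(res_n3m.shape)
print(np.allclose(res_n3m, loop))
```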

wispy bay Aug 11, 2021, 9:01 AM

#

hello! so i'm trying to make a speech recognition ai (kind of) using the google speech recognition module and it keeps showing this error:
Traceback (most recent call last):
  File "C:\Users\rorop\Desktop\ai.py", line 10, in <module>
    text=r.recognize_google(audio_data)
  File "C:\Users\rorop\Desktop\speech_recognition\__init__.py", line 822, in recognize_google
    assert isinstance(audio_data, AudioData), "``audio_data`` must be audio data"
AssertionError: ``audio_data`` must be audio data

#

here's my code: ` import speech_recognition as sr
from speech_recognition import AudioFile

r=sr.Recognizer()

audio=AudioFile('vs.wav')

audio_data=r.record
type(audio_data)
text=r.recognize_google(audio_data)
print(text)`

#

Can anybody help me please?

#

Thank you

#

I've been trying to fix this for days

#

I'm also quite new to python

#

I hope someone helps soon! 🙂
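
The assertion most likely fires because `audio_data=r.record` stores the `record` method itself and never calls it, so `recognize_google` receives a method, not an `AudioData` object. A minimal sketch of the usual pattern with the SpeechRecognition package follows. The function name `transcribe_wav` is made up for illustration, and the import sits inside the function so the snippet can be loaded without the package installed.

```python
def transcribe_wav(path):
    """Transcribe a WAV file with the SpeechRecognition package."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    recognizer = sr.Recognizer()
    # AudioFile has to be opened as a context manager before recording from it.
    with sr.AudioFile(path) as source:
        # record() is a method: calling it with the source returns AudioData.
        audio_data = recognizer.record(source)
    # recognize_google() sends the audio to a web API, so it needs network access.
    return recognizer.recognize_google(audio_data)

# Usage (needs the package, the file, and an internet connection):
# print(transcribe_wav("vs.wav"))
```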

weary summit Aug 11, 2021, 10:04 AM

#

tidal bough you want `res[i,j,k] = data[i,j] - points[i,k]`, I believe, where `i` is in rang...

And how would I do that?

bold timber Aug 11, 2021, 10:05 AM

#

What is the difference between splitting with train_test_split and k-fold cross-validation?

polar rapids Aug 11, 2021, 10:27 AM

#

Hi. I want to learn data science. Where should I start? I know Python at an intermediate level.
Of course, a basic question is: what does a data scientist do?

PS:
I hope my questions haven't broken any of the community rules :)

copper loom Aug 11, 2021, 10:39 AM

#

Does anyone know about ONNX operators?

late shell Aug 11, 2021, 12:08 PM

#

Hey, there's a dataset I'd like to use, but It's 900+ mb, I don't want it on my local disk. Is there a way to use the dataset for CNN training without having to download it?

undone flare Aug 11, 2021, 12:23 PM

#

late shell Hey, there's a dataset I'd like to use, but It's 900+ mb, I don't want it on my ...

do you have the csv link?

late shell Aug 11, 2021, 12:32 PM

#

undone flare do you have the csv link?

https://www.kaggle.com/tawsifurrahman/covid19-radiography-database?select=COVID-19_Radiography_Dataset
this is the dataset.

COVID-19 Radiography Database

COVID-19 Chest X-ray Database

undone flare Aug 11, 2021, 12:34 PM

#

uh oh

#

you can just create a kaggle notebook and add the data if you don't want to download the data set

#

@late shell https://buggyprogrammer.com/load-kaggle-dataset-in-colab-or-jupyter/

shadow frigate Aug 11, 2021, 1:12 PM

#

Hello 👋 in Pytorch, can I somehow measure a loss on a subset of a tensor? say I have

pred=[1,2,3,0]
labels=[1,1,3,nan]

Is it possible to tell the loss to only consider the first 3 values, without modifying the tensors? Possibly by passing a mask for those values to ignore ( mask=[1,1,1,0] in the example). I can't filter out the samples to ignore before sending them through the NN because I'd have to modify a large part of the model to do so.
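
One way to do this, sketched under the assumption of a regression-style loss (MSE here, since the question does not name one): build a boolean mask from the labels and index both tensors with it inside the loss call. Indexing creates temporary views for the loss only; `pred` and `labels` themselves are left untouched. Multiplying by a 0/1 mask would not work with `nan` labels, because `nan * 0` is still `nan`.

```python
import torch

pred = torch.tensor([1.0, 2.0, 3.0, 0.0], requires_grad=True)
labels = torch.tensor([1.0, 1.0, 3.0, float("nan")])

# True where a label exists, False where it is nan.
mask = ~torch.isnan(labels)

# Only the masked entries enter the loss.
loss = torch.nn.functional.mse_loss(pred[mask], labels[mask])
loss.backward()

print(loss.item())
print(pred.grad)  # the ignored position gets a zero gradient
```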

#

I didn't think of this situation when I prepared the model blobDerp

bold timber Aug 11, 2021, 2:48 PM

#

Hi, I have a question, how to evaluate RecommenderSystem?

lapis sequoia Aug 11, 2021, 3:21 PM

#

What kind of recommender system?

wheat spire Aug 11, 2021, 3:51 PM

#

Hi there, I am starting to study data science seriously from the beginning, and I'm aiming to cover the whole field within 6 to 12 months. If there is anyone who would like to join me, it would be great as we can study together. 🙂

flat hollow Aug 11, 2021, 3:52 PM

#

what are you going to use as learning resource?

wheat spire Aug 11, 2021, 3:52 PM

#

I have enrolled in a course on a website

#

Also, as usual Youtube has a lot of content

undone flare Aug 11, 2021, 3:59 PM

#

I have one hot encoded data and I want to revert this to just digits as in an 1D array with digits [0, 0, 0, 2, 5, 9...]

#

how can I achieve this?

#

do I check if the value is one and return it? I think there is probably more efficient way
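
A vectorized alternative to checking each value: `argmax` along the column axis returns the position of the 1 in every row. This sketch uses made-up labels and assumes that column k stands for digit k.

```python
import numpy as np

# Hypothetical one-hot labels for digits 0-9, shape (n_samples, 10).
digits = np.array([0, 0, 0, 2, 5, 9])
one_hot = np.eye(10, dtype=int)[digits]

# argmax along axis 1 gives the column index of the 1 in each row.
recovered = one_hot.argmax(axis=1)
print(recovered)  # [0 0 0 2 5 9]
```

For a DataFrame with the same layout, `df.to_numpy().argmax(axis=1)` does the same thing.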

late shell Aug 11, 2021, 4:01 PM

#

undone flare you can just create a kaggle notebook and add the data if you don't want to down...

oh, never used a kaggle notebook before, thanks 👍 .

late shell Aug 11, 2021, 4:03 PM

#

undone flare I have one hot encoded data and I want to revert this to just digits as in an 1D...

did you use the sklearn.preprocessing.OneHotEncoder class?

undone flare Aug 11, 2021, 4:04 PM

#

late shell did you use the `sklearn.preprocessing.OneHotEncoder` class?

nope the data was in the shape of (2062, 10)

#

so I made it a dataframe and it looks like one hot encoded

undone flare Aug 11, 2021, 4:04 PM

#

undone flare do I check if the value is one and return it? I think there is probably more eff...

I am doing this right now

umbral ferry Aug 11, 2021, 4:08 PM

#

In xgboost, how is it determining which feature will be the root of the tree? I have a few continuous variables and many categorical which I one hot encoded

desert oar Aug 11, 2021, 4:14 PM

#

umbral ferry In xgboost, how is it determining which feature will be the root of the tree? I ...

a decision tree uses the same splitting algorithm at every node, including the first/root node

#

the basic algorithm is on page 3 of https://arxiv.org/abs/1603.02754

#

there are other split-finding algorithms implemented in xgboost, but the others are all just approximations to the exact algorithm

umbral ferry Aug 11, 2021, 4:16 PM

#

like Gini impurity or gain? or is that unrelated to determine which to split at

desert oar Aug 11, 2021, 4:16 PM

#

think of a node in a decision tree as containing data points, not as containing a split on a feature. the edges between nodes are the splits.

#

yep, that's it. the algorithm just finds the split point with the greatest gain for each feature, then splits on the best feature
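
A rough sketch of that exact greedy search, using the split gain from the linked paper: half of (left score + right score - parent score) minus gamma, where a node's score is G²/(H + lambda) for its summed gradients G and Hessians H. The function name `best_split` and the toy data are made up, and this is not XGBoost's actual implementation; for instance, it ignores missing values.

```python
import numpy as np

def best_split(X, grad, hess, lam=1.0, gamma=0.0):
    """Return (gain, feature_index, threshold) of the best split, or None."""
    n_samples, n_features = X.shape
    G, H = grad.sum(), hess.sum()
    parent_score = G**2 / (H + lam)
    best = None
    for j in range(n_features):
        order = np.argsort(X[:, j])
        xs, gs, hs = X[order, j], grad[order], hess[order]
        # Cumulative sums give the left-side statistics for every cut point.
        G_L, H_L = np.cumsum(gs)[:-1], np.cumsum(hs)[:-1]
        G_R, H_R = G - G_L, H - H_L
        gain = 0.5 * (G_L**2 / (H_L + lam) + G_R**2 / (H_R + lam) - parent_score) - gamma
        # A cut is only valid between two distinct feature values.
        valid = xs[:-1] < xs[1:]
        if not valid.any():
            continue
        gain = np.where(valid, gain, -np.inf)
        i = int(np.argmax(gain))
        if best is None or gain[i] > best[0]:
            best = (float(gain[i]), j, float((xs[i] + xs[i + 1]) / 2))
    return best

# Toy data: feature 0 is noise, feature 1 fully determines the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 1] > 0.3, 5.0, -5.0)
grad, hess = -y, np.ones_like(y)  # squared-error loss at prediction 0

gain, feature, threshold = best_split(X, grad, hess)
print(feature, threshold)
```

The same search runs again on each child node with only that node's rows, which is why the root needs no special treatment.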

umbral ferry Aug 11, 2021, 4:18 PM

#

here's the first tree of my model

undone flare Aug 11, 2021, 4:18 PM

#

I keep getting

```
ValueError: Expected 2D array, got 1D array instead:
array=[6 9 3 9 0 5 8 2 5 9 4 9 7 1 3 3 0 5 0 7 0 8 3 6 9 2 7 3 5 9 8 5 4 6 4 6 3
 1 9 2 7 7 3 1 1 2 0 7 8 9 1 9 6 2 1 0 6 8 2 8 8 7 2 7 5 9 2 3 6 4 1 1 5 7
 4 9 9 4 3 8 8 9 2 0 9 0 0 4 1 5 5 4 7 4 7 4 2 2 8 7 2 0 9 0 2 1 7 8 8 7 2
 8 3 3 2 2 6 1 5 5 5 0 1 5 8 2 6 5 1 0 3 1 9 9 8 3 8 9 2 2 2 6 2 6 6 1 6 2
 5 4 9 2 1 2 6 2 6 6 1 1 7 5 9 8 6 2 4 7 6 9 8 7 2 9 1 6 7 6 0 6 1 7 4 8 4
 3 2 2 4 2 8 6 8 3 2 0 8 8 8 5 4 7 0 8 2 4 2].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
```
can anyone tell why? I tried many things but still end up with this

umbral ferry Aug 11, 2021, 4:18 PM

#

ahhh so it's gain, ok

undone flare Aug 11, 2021, 4:19 PM

#

shapes

umbral ferry Aug 11, 2021, 4:19 PM

#

and it will compare gains of various features, and if it's a continuous feature, it will find which split in the continuous feature has the highest gain

#

and use the largest gain as splitting point

desert oar Aug 11, 2021, 4:19 PM

#

umbral ferry here's the first tree of my model

i think that visualization is just a little weird. unless i'm really badly misunderstanding something, NTON<585 is supposed to be the splitting criterion after the first node. it is not "the" first node as such.

desert oar Aug 11, 2021, 4:20 PM

#

undone flare I keep getting ``` ValueError: Expected 2D array, got 1D array instead: array=[6...

show the code that caused this

umbral ferry Aug 11, 2021, 4:21 PM

#

the first node would contain all the data right, I'd call that the 0th node, so it decided to split using NTON at the first

#

and then it's the same for further nodes, where it will compare gain from each feature using the new subset of data points?

undone flare Aug 11, 2021, 4:23 PM

#

desert oar show the code that caused this

clf1 = LogisticRegression(random_state=42).fit(X_train, y_train.values.ravel())
y_pred1 = clf1.predict(X_test)
clf1.score(y_test, y_pred1)

#

if I make y a list then also it gives the same thing
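
A likely cause, judging from the snippet alone: `score` expects the feature matrix and the true labels, `score(X, y)`, and runs `predict` internally. In `clf1.score(y_test, y_pred1)` the 1D label array lands in the `X` slot, which would explain why the array printed in the error holds digits 0-9. The sketch below uses scikit-learn's bundled digits data as a stand-in for the sign-language arrays, and `max_iter=5000` is only there to let the solver converge.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)  # X: (1797, 64), y: (1797,)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = LogisticRegression(max_iter=5000, random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# score() takes features and true labels, and predicts internally.
acc_from_score = clf.score(X_test, y_test)
# accuracy_score() compares true labels with predictions you already have.
acc_from_metric = accuracy_score(y_test, y_pred)

# Passing labels where features are expected raises a ValueError.
error_message = None
try:
    clf.score(y_test, y_pred)
except ValueError as err:
    error_message = str(err)

print(acc_from_score, acc_from_metric)
print(error_message is not None)
```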

umbral ferry Aug 11, 2021, 4:24 PM

#

on and on until it reaches max depth, reaches minimum data points in the node, or gets pruned based on what regularization I use

desert oar Aug 11, 2021, 4:24 PM

#

undone flare ```py clf1 = LogisticRegression(random_state=42).fit(X_train, y_train.values.rav...

what is y_train, a Series?

#

.values is discouraged these days; .to_numpy() is the recommended way

undone flare Aug 11, 2021, 4:25 PM

#

desert oar what is `y_train`, a `Series`?

def get_digit(row):
    for c in df.columns:
        if row[c]==1:
            return c

y = df.apply(get_digit, axis=1)
y = np.array(y)

desert oar Aug 11, 2021, 4:25 PM

#

that's... weird

#

what does this df look like

#

oh, the one-hot encoded data you posted in the screenshot above?

undone flare Aug 11, 2021, 4:26 PM

#

I have two npy files

undone flare Aug 11, 2021, 4:26 PM

#

desert oar oh, the one-hot encoded data you posted in the screenshot above?

yea that's the y

#

X had 3 dimensions so I reduced it to two by X.reshape(2062, 64*64)

grave frost Aug 11, 2021, 4:27 PM

#

undone flare I keep getting ``` ValueError: Expected 2D array, got 1D array instead: array=[6...

what's the task?

undone flare Aug 11, 2021, 4:27 PM

#

grave frost what's the task?

Sign Language digits classification

grave frost Aug 11, 2021, 4:27 PM

#

undone flare Sign Language digits classification

image?

undone flare Aug 11, 2021, 4:27 PM

#

nope npy files

grave frost Aug 11, 2021, 4:28 PM

#

tried just converting it to 2d

#

its literally what the error says

undone flare Aug 11, 2021, 4:28 PM

#

undone flare X had 3 dimensions so I reduced it to two by `X.reshape(2062, 64*64)`

did

#

y doesn't need to be 2D does it?

#

(2062,) is the dimension

grave frost Aug 11, 2021, 4:28 PM

#

oh no my bad

#

y can be 1d

desert oar Aug 11, 2021, 4:28 PM

#

LogisticRegression y should only be 1d actually

undone flare Aug 11, 2021, 4:29 PM

#

yea

grave frost Aug 11, 2021, 4:29 PM

#

x has to be 2d

undone flare Aug 11, 2021, 4:29 PM

#

I don't know why it's giving me this error

desert oar Aug 11, 2021, 4:29 PM

#

i don't think they even support multiclass-onehot or multilabel

grave frost Aug 11, 2021, 4:29 PM

#

because its an image

undone flare Aug 11, 2021, 4:29 PM

#

I tried doing it with digits dataset of sklearn and it worked but doesn't wanna work with this data

grave frost Aug 11, 2021, 4:29 PM

#

BTW what is there in the npy files? why dont they give images in jpegs?

desert oar Aug 11, 2021, 4:29 PM

#

X should probably have shape like (n_images, image_height * image_width), no?

undone flare Aug 11, 2021, 4:30 PM

#

desert oar `X` should probably have shape like `(n_images, image_height * image_width)`, no...

yes

#

image is 64 x 64

undone flare Aug 11, 2021, 4:30 PM

#

grave frost BTW what is there in the npy files? why dont they give images in jpegs?

This is the set https://www.kaggle.com/ardamavi/sign-language-digits-dataset

grave frost Aug 11, 2021, 4:31 PM

#

undone flare This is the set https://www.kaggle.com/ardamavi/sign-language-digits-dataset

Image size: 64x64
Color space: Grayscale
it should be 3D?

desert oar Aug 11, 2021, 4:31 PM

#

it looks like they're flattening it to 2d

grave frost Aug 11, 2021, 4:31 PM

#

(img_height, img_width, channel)

undone flare Aug 11, 2021, 4:32 PM

#

grave frost > Image size: 64x64 > Color space: Grayscale it should be 3D?

it was 3D I flattened to 2D

grave frost Aug 11, 2021, 4:32 PM

#

undone flare it was 3D I flattened to 2D

reshape or flatten?

#

check your vars

undone flare Aug 11, 2021, 4:32 PM

#

reshape

X = X.reshape(2062, 64 * 64)
X.shape

#

This is the image

#

this is when X has 3 dimensions but you can't provide array with 3 dims to LogisticRegression

lapis sequoia Aug 11, 2021, 4:47 PM

#

Hi ! What prerequisites do I need to get started in ml ?

umbral ferry Aug 11, 2021, 4:51 PM

#

Good knowledge of pandas will really help

#

other than that it's just curiosity and decent reading comprehension lol

grave frost Aug 11, 2021, 5:10 PM

#

undone flare reshape ```py X = X.reshape(2062, 64 * 64) X.shape ```

what's the output of X.shape

#

because that's not what you are passing

undone flare Aug 11, 2021, 5:11 PM

#

undone flare shapes

.

grave frost Aug 11, 2021, 5:19 PM

#

undone flare .

clf1 = LogisticRegression(random_state=42).fit(X_train, y_train.values.ravel())
y_pred1 = clf1.predict(X_test)
clf1.score(y_test, y_pred1)
you are obviously doing some more processing, since you don't pass X?

desert oar Aug 11, 2021, 5:25 PM

#

@undone flare
code: https://paste.pythondiscord.com/kigelewune.py
output: https://paste.pythondiscord.com/okafexanum.yaml

#

curious what's considered "good" out-of-sample accuracy on this problem, i assume it's in the high 90s

#

https://www.kaggle.com/esercicek/sign-language-with-cnn low 90s with a basic CNN, seems reasonable

Sign Language with CNN

Explore and run machine learning code with Kaggle Notebooks | Using data from Sign Language Digits Dataset

undone flare Aug 11, 2021, 5:28 PM

#

What was I doing wrong tho? The dimensions were wrong?

desert oar Aug 11, 2021, 5:29 PM

#

not sure. take a look at how i did it maybe and compare

raw temple Aug 11, 2021, 5:29 PM

#

Hi, I'm looking at coding to implement an AR model for time series and in this image, can anyone tell me what does the -100 signify in the code
train_data = df['Consumption'][:len(df)-100]

desert oar Aug 11, 2021, 5:30 PM

#

raw temple Hi, I'm looking at coding to implement an AR model for time series and in this i...

that's a sloppy way of saying "everything but the last 100 elements". :n means "take elements until n", and they're using len(df)-100 as the "n". but this is bad style, they should have written

train_data = df['Consumption'].iloc[:-100]
test_data = df['Consumption'].iloc[-100:]

which is equivalent

#

indexing or slicing with a negative number means "count from the end"

undone flare Aug 11, 2021, 5:32 PM

#

desert oar <@!298841376563527690> code: https://paste.pythondiscord.com/kigelewune.py outp...

Can you tell what are the shapes of x and y? I am on mobile right now

desert oar Aug 11, 2021, 5:32 PM

#

!eval @raw temple
```python
import numpy as np
import pandas as pd

x_py = list(range(10))
x_np = np.array(x_py)
x_pd = pd.Series(x_np, index=list('abcdefghij'))

print(x_py[:-3])
print()
print(x_np[:-3])
print()
print(x_pd.iloc[:-3])
```

arctic wedgeBOT Aug 11, 2021, 5:33 PM

#

@desert oar :white_check_mark: Your eval job has completed with return code 0.

001 | [0, 1, 2, 3, 4, 5, 6]
002 | 
003 | [0 1 2 3 4 5 6]
004 | 
005 | a    0
006 | b    1
007 | c    2
008 | d    3
009 | e    4
010 | f    5
011 | g    6
... (truncated - too many lines)

Full output: https://paste.pythondiscord.com/vorubegiya.txt?noredirect

desert oar Aug 11, 2021, 5:33 PM

#

undone flare Can you tell what are the shapes of x and y? I am on mobile right now

In [92]: x.shape
Out[92]: (2062, 4096)

In [93]: y.shape
Out[93]: (2062,)

undone flare Aug 11, 2021, 5:33 PM

#

Hmm I have the same shapes

desert oar Aug 11, 2021, 5:33 PM

#

In [95]: x.iloc[:10,:5]
Out[95]:
pixel           0,0       0,1       0,2       0,3       0,4
image_num
0          0.466667  0.474510  0.478431  0.482353  0.486275
1          0.596078  0.607843  0.619608  0.631373  0.643137
2          0.588235  0.603922  0.619608  0.631373  0.643137
3          0.556863  0.568627  0.584314  0.600000  0.611765
4          0.580392  0.576471  0.592157  0.607843  0.615686
5          0.517647  0.529412  0.552941  0.615686  0.635294
6          0.427451  0.439216  0.454902  0.474510  0.490196
7          0.564706  0.576471  0.588235  0.603922  0.615686
8          0.498039  0.509804  0.521569  0.537255  0.549020
9          0.501961  0.517647  0.533333  0.545098  0.564706

undone flare Aug 11, 2021, 5:34 PM

#

Yea same

#

I will try again tomorrow thanks for the help

raw temple Aug 11, 2021, 5:35 PM

#

@desert oar thanks for your help. so if I only have 55 points of data, then would I just change the value to 50?

#

or maybe -15? as I want all points included until the last 15?

#

is that what it means?

desert oar Aug 11, 2021, 5:37 PM

#

raw temple or maybe -15? as I want all points included until the last 15?

:-15 seems reasonable, yes

raw temple Aug 11, 2021, 5:38 PM

#

desert oar `:-15` seems reasonable, yes

great! thanks so much for your help

rigid zodiac Aug 11, 2021, 6:49 PM

#

Hi everyone, a pretty random question about neural networks. I saw some sample code on Towards Data Science that has model.add(Dense(64, activation=tf.nn.relu, kernel_initializer='uniform', input_dim = input_dim)) # fully-connected layer with 64 hidden units, where 64 is the number of layers.

Why do they choose 64 layers? Does it hurt if I choose more layers than that?

unborn glacier Aug 11, 2021, 6:50 PM

#

I think you mean the number of nodes?

chilly geyser Aug 11, 2021, 6:50 PM

#

64 is definitely the number of nodes
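
To show what the 64 controls, here is a plain NumPy sketch of a single fully-connected layer. The input size, batch size, and weight range are made up. The layer is one weight matrix with 64 columns, one per node.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, units, batch = 8, 64, 5  # hypothetical sizes; 64 matches Dense(64, ...)

x = rng.normal(size=(batch, input_dim))
W = rng.uniform(-0.05, 0.05, size=(input_dim, units))  # one column per node
b = np.zeros(units)

hidden = np.maximum(x @ W + b, 0.0)  # ReLU activation
print(hidden.shape)     # (5, 64): 64 outputs per sample, still one layer
print(W.size + b.size)  # number of trainable parameters in this layer
```

Adding layers means stacking more `model.add(Dense(...))` calls; changing the 64 only makes this one layer wider or narrower.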

unborn glacier Aug 11, 2021, 6:52 PM

#

There are general rules on how to determine the number of nodes, and layers. Here's an explanation:
https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw

Cross Validated

How to choose the number of hidden layers and nodes in a feedforwar...

Is there a standard and accepted method for selecting the number of layers, and the number of nodes in each layer, in a feed-forward neural network? I'm interested in automated ways of building neu...

chilly geyser Aug 11, 2021, 6:53 PM

#

unborn glacier There are general rules on how to determine the number of nodes, and layers. Her...

Ah, seems like sensible rules of thumb.

I would still consider experimenting on a grid or random-search basis though

rigid zodiac Aug 11, 2021, 6:58 PM

#

thanks yall

lusty stag Aug 11, 2021, 7:07 PM

#

how do you guys define class imbalance?
I have 10 classes of observations
max class has 1100 observations
min class has 750 observations
should I consider oversampling or keep it as it is?

fallen trellis Aug 11, 2021, 7:35 PM

#

I want to plot 2 classes with barely any variance but a big interclass difference

#

Any way to do this? Boxplots fail miserably

#

desert oar Aug 11, 2021, 7:36 PM

#

what are you trying to show? maybe you just want a table

fallen trellis Aug 11, 2021, 7:36 PM

#

Loading times

desert oar Aug 11, 2021, 7:36 PM

#

lusty stag how do you guys define class imbalance? I have 10 classes of observations max cl...

i wouldn't worry about this difference

fallen trellis Aug 11, 2021, 7:36 PM

#

left with precompilation, right without

#

A table could work but Id like to have a nice looking figure

#

and a logarithmic axis is definitely not nice, sadly

desert oar Aug 11, 2021, 7:38 PM

#

make 3 plots:

1) average loading time as a bar chart with some kind of error bar showing that the errors are small relative to the difference between groups
2-3) kernel density plot or histogram for each group, separately, or faceted together so you can compare the distributions without worrying about scale

fallen trellis Aug 11, 2021, 7:38 PM

#

I feel like I should just summarize the results textually..

desert oar Aug 11, 2021, 7:39 PM

#

and yeah, it's almost never bad to include a table with some combination of mean, std dev, median, min, max, 25%, 75%, 10%, 90%

#

(in general i wouldn't recommend using tuples for "array-like" things)

#

!eval
```python
import pandas as pd

data = pd.DataFrame({
    'without_precomp': [90, 91, 92],
    'with_precomp': [10, 11, 12],
})

def p25(x):
    return x.quantile(0.25)

def p75(x):
    return x.quantile(0.75)

table = data.agg([
    'mean',
    'std',
    'min',
    p25,
    'median',
    p75,
    'max',
]).transpose()

print(table)
```

arctic wedgeBOT Aug 11, 2021, 7:42 PM

#

@desert oar :white_check_mark: Your eval job has completed with return code 0.

001 |                  mean  std   min   p25  median   p75   max
002 | without_precomp  91.0  1.0  90.0  90.5    91.0  91.5  92.0
003 | with_precomp     11.0  1.0  10.0  10.5    11.0  11.5  12.0

fallen trellis Aug 11, 2021, 7:54 PM

#

great

#

thank you, as always

#

Ill see to get that stuff visualized, as well

molten hamlet Aug 11, 2021, 8:19 PM

#

Hey guys, I stumbled upon this problem: I use the Deep Q-learning algorithm (reinforcement learning) and my outputs (actions), as you can see, are the same for every input, which is very bad. Can someone help me identify the cause, or name the problem I'm facing? Is this due to the learning rate? The loss function? Too small a model? I used Huber and MAE loss functions, but nothing works in the long run...

thorny willow Aug 11, 2021, 8:22 PM

#

does anyone know a doc or guide that explains the installation process of keras and tensorflow in anaconda on linux? I'm facing a few problems

#

or, if you're willing, provide all the lines of code here?

fallen trellis Aug 11, 2021, 8:30 PM

#

@desert oar sorry for the ping, but regarding my loading-time testing:

With the std calculated, can I use e.g., chi² to say that "std is not relevant so the difference between mean_with and mean_without is equal to the compilation time"?

desert oar Aug 11, 2021, 8:39 PM

#

fallen trellis <@!389497659087650836> sorry for the ping, but regarding my loading-time testing...

i'm not sure you need any kind of test for that, if the difference is that big

fallen trellis Aug 11, 2021, 8:40 PM

#

Ok, just want to make sure my prof doesnt bicker at me

#

But I think thats a valid argument

rigid zodiac Aug 11, 2021, 8:49 PM

#

Has anyone worked with RNN, CNN, or ConvLSTM? I need some help with how to set them up. I have my optimal grid-search results and epoch count, but I don't know how to use those deep learning models.

agile jolt Aug 11, 2021, 9:05 PM

#

can someone explain this 2 rows

molten hamlet Aug 11, 2021, 9:14 PM

#

agile jolt can someone explain this 2 rows

input number 31, 32

agile jolt Aug 11, 2021, 10:27 PM

#

hahaha, thanks but i meant like.. what does data = [trace] and line 32 mean?

slow vigil Aug 11, 2021, 10:45 PM

#

Anyone here have a lot of experience with Spark?

velvet thorn Aug 11, 2021, 11:00 PM

#

slow vigil Anyone here have a lot of experience with Spark?

you should just ask your question.

slow vigil Aug 11, 2021, 11:05 PM

#

Well I have to pull a lot of data in from an api, and I have to use a ton of separate requests due to the API limitations. I want to write all the data to a parquet file, and I think the way to do that is each time I do an api call I append the returned data to a spark dataframe that gets written into a parquet file at the end. I'm new to spark so I'm pretty sure I should be using partitions to write the file since the dataframe will be too large to fit in memory. I guess I'm just wondering if I have the idea right, and also wondering if there is a fast way to make concurrent api calls with Spark since I know it's basically designed for big data ingestion

#

Everything I can find is all about doing one api call or working with one JSON file that already exists

#

So I'm having trouble visualizing how the pieces go together in my situation

#

I have to pay for the API access so I'm trying to get this all sorted out without having access to the data

velvet thorn Aug 11, 2021, 11:11 PM

#

slow vigil Well I have to pull a lot of data in from an api, and I have to use a ton of sep...

concurrent API calls is a separate thing entirely.

#

in general

#

for that kind of stuff

#

you want to do async IO

#

as for the partitioning...

#

Spark will do that for you

#

(basically)

#

do you have a cluster or what?

slow vigil Aug 11, 2021, 11:21 PM

#

No it's all local, but I have probably about 20-30 gigs of data at least

#

and I will probably move it to a cloud service eventually

#

So am I right about just adding everything to a dataframe?

#

Should I even be using Spark? I do want to use the parquet format that's why I'm looking at it

#

Like I just don't understand how it goes about it. In my head I picture the data getting added to the dataframe as it comes in after each api call, then once the dataframe reaches a certain size it writes it to a parquet file... and then what? it basically starts a new dataframe and once that reaches the same size it appends it to the same parquet file?

#

It's stock data, so I have to loop over a list of tickers and do multiple api calls for each ticker, and there are thousands of tickers.

velvet thorn Aug 11, 2021, 11:28 PM

#

slow vigil Should I even be using Spark? I do want to use the parquet format that's why I'm...

either that or some other distributed thing

#

like dask

velvet thorn Aug 11, 2021, 11:28 PM

#

slow vigil So am I right about just adding everything to a dataframe?

what format is the data in

#

when it comes in?

slow vigil Aug 11, 2021, 11:28 PM

#

json

velvet thorn Aug 11, 2021, 11:28 PM

#

I would say

#

store it in memory first

#

when you hit a certain limit

#

write that to disk

#

then

#

you'll have multiple dataframes

#

once you're done with the API

#

concatenate them

#

ALTERNATIVELY

#

you can use spark-streaming

#

I haven't

#

but I'm fairly sure

#

it would work here?
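
A sketch of the buffer-and-flush idea combined with async IO, using only the standard library. Everything here is a stand-in: `fetch` fakes the API call with a short sleep, the tickers are invented, and chunks are written as JSON-lines files. With pandas plus a Parquet engine such as pyarrow installed, `flush` could call `pd.DataFrame(rows).to_parquet(path)` instead.

```python
import asyncio
import json
import tempfile
from pathlib import Path

async def fetch(ticker, page, limit):
    """Stand-in for a real HTTP request; returns fake rows for one API call."""
    async with limit:              # cap the number of in-flight requests
        await asyncio.sleep(0.01)  # simulates network latency
        return [{"ticker": ticker, "page": page, "price": 100.0 + page}]

def flush(rows, out_dir, part):
    """Write one buffered chunk to its own part file and return the path."""
    path = Path(out_dir) / f"part-{part:05d}.jsonl"
    with path.open("w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    return path

async def ingest(tickers, pages, out_dir, chunk_rows=4, max_concurrent=3):
    limit = asyncio.Semaphore(max_concurrent)
    tasks = [asyncio.create_task(fetch(t, p, limit))
             for t in tickers for p in range(pages)]
    buffer, parts = [], []
    for finished in asyncio.as_completed(tasks):
        buffer.extend(await finished)
        if len(buffer) >= chunk_rows:  # memory limit reached: write and reset
            parts.append(flush(buffer, out_dir, len(parts)))
            buffer = []
    if buffer:                         # write whatever is left at the end
        parts.append(flush(buffer, out_dir, len(parts)))
    return parts

out_dir = tempfile.mkdtemp()
parts = asyncio.run(ingest(["AAA", "BBB", "CCC"], pages=3, out_dir=out_dir))
total_rows = sum(len(p.read_text().splitlines()) for p in parts)
print(len(parts), total_rows)
```

Writing separate part files avoids appending to one growing file. Readers such as Spark or Dask can generally load a directory of Parquet part files as a single dataset, so a final concatenation step may not be needed.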

slow vigil Aug 11, 2021, 11:30 PM

#

interesting

#

This looks similar to websockets or something

#

Thank you for your help

quiet vault Aug 12, 2021, 12:37 AM

#

Would Cuda version 11.4 work with the newest version of Tensorflow?

serene scaffold Aug 12, 2021, 12:43 AM

#

quiet vault Would Cuda version 11.4 work with the newest version of Tensorflow?

for what OS?

quiet vault Aug 12, 2021, 12:43 AM

#

windows

#

#

the website says 11.2

#

but i cant get it

#

because the version is too old for my gpu

#

so im asking if anyone has tried using 11.4

serene scaffold Aug 12, 2021, 12:45 AM

#

quiet vault because the version is too old for my gpu

did you get an error message?

quiet vault Aug 12, 2021, 12:45 AM

#

no

#

im just asking before i download

#

wait

#

wait

#

no

#

i read it wrong

#

yes

serene scaffold Aug 12, 2021, 12:45 AM

#

quiet vault

can you give the link for this page?

quiet vault Aug 12, 2021, 12:46 AM

#

serene scaffold can you give the link for this page?

https://www.tensorflow.org/install/source_windows

TensorFlow

Build from source on Windows | TensorFlow

serene scaffold Aug 12, 2021, 12:46 AM

#

yes, you did get an error message? if so, show.

quiet vault Aug 12, 2021, 12:46 AM

#

i read ur message wrong

#

i did get an error

serene scaffold Aug 12, 2021, 12:46 AM

#

if you're asking for help that's in any way related to an error message, always show the error message.

quiet vault Aug 12, 2021, 12:46 AM

#

will do in the future, sorry about that

serene scaffold Aug 12, 2021, 12:46 AM

#

no problem

quiet vault Aug 12, 2021, 12:46 AM

#

serene scaffold Aug 12, 2021, 12:47 AM

#

Please do text next time.

quiet vault Aug 12, 2021, 12:47 AM

#

i cant copy paste it

serene scaffold Aug 12, 2021, 12:47 AM

#

Anyway, try installing and running tensorflow and see what happens.

quiet vault Aug 12, 2021, 12:47 AM

#

but it hasnt installed yet

serene scaffold Aug 12, 2021, 12:47 AM

#

did you try to install it?

quiet vault Aug 12, 2021, 12:48 AM

#

i cant

serene scaffold Aug 12, 2021, 12:48 AM

#

what happened when you tried?

quiet vault Aug 12, 2021, 12:48 AM

#

i cant

#

it doesnt give me an option

#

look at the pic

serene scaffold Aug 12, 2021, 12:48 AM

#

that's not how you install tensorflow

quiet vault Aug 12, 2021, 12:48 AM

#

i have tensorflow installed. im talking about cuda

#

i cant install cuda because of the error message

serene scaffold Aug 12, 2021, 12:49 AM

#

try doing something with tensorflow so we see what error message you get from tensorflow.

#

like python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

quiet vault Aug 12, 2021, 12:50 AM

#

I did not get a code breaking error

#

But I got the usual warnings

serene scaffold Aug 12, 2021, 12:51 AM

#

can you show the warnings?

quiet vault Aug 12, 2021, 12:51 AM

#

2021-08-11 19:50:14.363291: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-08-11 19:50:14.364295: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2021-08-11 19:50:14.365360: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2021-08-11 19:50:14.366408: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2021-08-11 19:50:14.367435: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2021-08-11 19:50:14.368440: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2021-08-11 19:50:14.369441: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2021-08-11 19:50:14.370436: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2021-08-11 19:50:14.370627: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1835] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-08-11 19:50:14.473931: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

TensorFlow

GPU support | TensorFlow

serene scaffold Aug 12, 2021, 12:51 AM

#

thank you

#

can you do pip freeze | grep tensorflow

quiet vault Aug 12, 2021, 12:52 AM

#

```
grep : The term 'grep' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:14
+ pip freeze | grep tensorflow
+              ~~~~
    + CategoryInfo          : ObjectNotFound: (grep:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException
```

quiet vault Aug 12, 2021, 12:52 AM

#

serene scaffold can you do `pip freeze | grep tensorflow`

like in the terminal?

serene scaffold Aug 12, 2021, 12:52 AM

#

yes but I guess windows doesn't have grep

#

do you know the three-number version number of tensorflow that you have?

quiet vault Aug 12, 2021, 12:52 AM

#

yeah, ive never heard of grep before

serene scaffold Aug 12, 2021, 12:53 AM

#

should be like x.y.z

quiet vault Aug 12, 2021, 12:53 AM

#

serene scaffold do you know the three-number version number of tensorflow that you have?

hold on

serene scaffold Aug 12, 2021, 12:53 AM

#

if you do pip freeze I guess you can just look for tensorflow
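
Since `grep` is not available in PowerShell, `pip freeze | findstr tensorflow` or `pip show tensorflow` are the usual Windows equivalents. The same lookup can also be done from Python on any OS; a small sketch, assuming Python 3.8+ (the helper name `installed_version` is made up for illustration):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the x.y.z version if tensorflow is installed, otherwise None.
print("tensorflow:", installed_version("tensorflow"))
```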

quiet vault Aug 12, 2021, 12:53 AM

#

2.6.0

serene scaffold Aug 12, 2021, 12:53 AM

#

great, let me see

#

@quiet vault can you use 2.5?

ebon walrus Aug 12, 2021, 12:55 AM

#

@quiet vault

quiet vault Aug 12, 2021, 12:55 AM

#

bruh

ebon walrus Aug 12, 2021, 12:55 AM

#

https://tenor.com/view/floppa-im-still-amazed-at-it-gif-21363568

Tenor

serene scaffold Aug 12, 2021, 12:55 AM

#

!mute 314448333739524096 investigating

arctic wedgeBOT Aug 12, 2021, 12:55 AM

#

:incoming_envelope: :ok_hand: applied mute to @ebon walrus until <t:1628733351:f> (59 minutes and 59 seconds).

quiet vault Aug 12, 2021, 12:55 AM

#

serene scaffold @quiet vault can you use 2.5?

sure, how do i make the version go down?

quiet vault Aug 12, 2021, 12:56 AM

#

serene scaffold !mute 314448333739524096 investigating

lol

serene scaffold Aug 12, 2021, 12:56 AM

#

quiet vault sure, how do i make the version go down?

looks to me like 2.5.0 is compatible with cuda 11.2

#

which I realize doesn't solve your problem

quiet vault Aug 12, 2021, 12:56 AM

#

yeah

#

i just want to get cuda in the first place

#

should i just try to use 2.6.0 tensorflow with 11.4 cuda

#

see what happens

serene scaffold Aug 12, 2021, 12:57 AM

#

quiet vault should i just try to use 2.6.0 tensorflow with 11.4 cuda

I thought we already tried that and it didn't work

quiet vault Aug 12, 2021, 12:58 AM

#

no

#

i was asking if someone had tried it

#

before installing

serene scaffold Aug 12, 2021, 12:59 AM

#

it's unlikely that anyone will have tried what you did with the exact same OS, tensorflow version, cuda version, and remember that they did it with those exact versions.

quiet vault Aug 12, 2021, 12:59 AM

#

true

#

well

#

im gonna be the first

#

epic

serene scaffold Aug 12, 2021, 1:02 AM

#

Truly a pioneer joe_salute

quiet vault Aug 12, 2021, 1:04 AM

#

salute

#

Well

#

My power went out

#

Right before installation finished

#

Hope that didn’t fuck anything up

serene scaffold Aug 12, 2021, 1:10 AM

#

quiet vault Hope that didn’t fuck anything up

worst case, you just delete that virtual environment. it's fine.

#

I mean I suppose there are worse cases but suffice to say you won't have any data loss.

quiet vault Aug 12, 2021, 2:01 AM

#

Yeah

#

Still not back tho 😦

sour spindle Aug 12, 2021, 2:38 AM

#

How is this for my first time making an AI stock predictor?

unborn glacier Aug 12, 2021, 2:40 AM

#

Did you train it on that data???

#

Or is that data that it's never seen before?

silk rune Aug 12, 2021, 2:41 AM

#

or better yet, how did you train it lol

sour spindle Aug 12, 2021, 2:42 AM

#

unborn glacier Or is that data that it's never seen before?

Its data that it hasn't seen before

unborn glacier Aug 12, 2021, 2:42 AM

#

K what's your bitcoin wallet I'd like to buy your code

sour spindle Aug 12, 2021, 2:42 AM

#

silk rune or better yet, how did you train it lol

I made my own dataset with 60 parameters

sour spindle Aug 12, 2021, 2:42 AM

#

unborn glacier K what's your bitcoin wallet I'd like to buy your code

Sorry but i wont sell it

unborn glacier Aug 12, 2021, 2:42 AM

#

Lol I'm kidding anyway

sour spindle Aug 12, 2021, 2:43 AM

#

Oh

#

Ok

unborn glacier Aug 12, 2021, 2:43 AM

#

But yeah if that's legit you should continue testing and use it if it works into the future

silk rune Aug 12, 2021, 2:44 AM

#

sour spindle I made my own dataset with 60 parameters

thats sick

sour spindle Aug 12, 2021, 2:44 AM

#

Yeah it was a hassle and it took me like 5 days to make

unborn glacier Aug 12, 2021, 2:44 AM

#

5 days only??

sour spindle Aug 12, 2021, 2:44 AM

#

Yeah

silk rune Aug 12, 2021, 2:45 AM

#

so is it based on training or does it look at parameters and make predictions as time goes on

sour spindle Aug 12, 2021, 2:47 AM

#

It takes a few closing prices (the present closing price and the closing prices from the 2 days before) and then predicts the next day's price

silk rune Aug 12, 2021, 2:47 AM

#

ooo

sour spindle Aug 12, 2021, 2:47 AM

#

I just need to make it scalable for some bots i will make

silk rune Aug 12, 2021, 2:48 AM

#

yeah, it looks promising already so thats awesome

sour spindle Aug 12, 2021, 2:48 AM

#

Thanks.

unborn glacier Aug 12, 2021, 2:49 AM

#

So what's apple gonna be tomorrow?

sour spindle Aug 12, 2021, 2:49 AM

#

I havent tested it on present data yet because i am sceptical of my code

quiet vault Aug 12, 2021, 2:50 AM

#

sour spindle How is this for my first time making an AI stock predictor?

did u use a cnn?

unborn glacier Aug 12, 2021, 2:50 AM

#

Yeah good to have a skeptical attitude haha

sour spindle Aug 12, 2021, 2:50 AM

#

quiet vault did u use a cnn?

Ann

quiet vault Aug 12, 2021, 2:50 AM

#

ah

#

nice job

sour spindle Aug 12, 2021, 2:50 AM

#

Thanks

unborn glacier Aug 12, 2021, 2:51 AM

#

When you say 60 parameters do you mean stuff like the weather and other stock data and things?

sour spindle Aug 12, 2021, 2:51 AM

#

Just stock data only

#

I might also make an article reader and add it in

quiet vault Aug 12, 2021, 2:52 AM

#

one more question from me

sour spindle Aug 12, 2021, 2:52 AM

#

Ok

quiet vault Aug 12, 2021, 2:53 AM

#

How did u make this graph? Did you use walk forward validation?

#

two questions i guess

sour spindle Aug 12, 2021, 2:53 AM

#

??

unborn glacier Aug 12, 2021, 2:53 AM

#

And you're sure you haven't fed it something like the next day's Google stock price to predict the same day's Apple price? (I.e. giving it future data)

sour spindle Aug 12, 2021, 2:54 AM

#

No

#

The testing data is from 2018 to 2020
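
For reference, the walk-forward validation asked about above can be sketched roughly like this: train on a window of past days, test on the days right after it, then slide the window forward so the model never sees data from its own future. This is only an illustration, not the code behind the graph; `walk_forward_splits` and the window sizes are hypothetical.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs that move forward in time."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # slide the window by one test block


splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
for train, test in splits:
    print("train", train, "-> test", test)
```

Every test index is strictly later than every train index in the same split, which is the property that guards against the future-data leak mentioned above.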

quiet vault Aug 12, 2021, 2:54 AM

#

dog in the fog be getting interrogated rn

silk rune Aug 12, 2021, 2:54 AM

#

lol

unborn glacier Aug 12, 2021, 2:54 AM

#

I mean if someone makes an accurate stock predictor that's like a billion dollar tool so...

#

I worked at a finance company and they basically laughed at the idea that was even possible

sour spindle Aug 12, 2021, 2:55 AM

#

I just need to edit something in my code one sec and i will come back with a new graph. Just to make sure

quiet vault Aug 12, 2021, 2:56 AM

#

Just by the way, I was working on something similar and I thought I had the perfect model with these results:

#

https://cdn.discordapp.com/attachments/833734782164402186/854911167394349076/unknown.png

#

but

#

i realized that tensorflow was reusing backend graphs which meant it was cheating kind of

sour spindle Aug 12, 2021, 2:57 AM

#

I have no idea what i am looking at. I just watched some tutorials and then made this from scratch

quiet vault Aug 12, 2021, 2:58 AM

#

its the results of some testing

#

u know what

#

nvm

unborn glacier Aug 12, 2021, 2:58 AM

#

quiet vault i realized that tensorflow was reusing backend graphs which meant it was cheatin...

Funny my model does so well on data it has seen before...
lol

quiet vault Aug 12, 2021, 2:59 AM

#

yeah

#

well for me this was not surprising

#

i was using a basic neural network

#

just like 2 dense layers

#

well

serene scaffold Aug 12, 2021, 3:00 AM

#

@quiet vault did you figure out your thing?

quiet vault Aug 12, 2021, 3:00 AM

#

ah

#

kind of

serene scaffold Aug 12, 2021, 3:00 AM

#

lemon_hyperpleased

quiet vault Aug 12, 2021, 3:00 AM

#

i came here originally to ask a question but i got distracted

#

the power came back

#

and then i did everything and i think it worked

serene scaffold Aug 12, 2021, 3:01 AM

#

You had the power all along bb

quiet vault Aug 12, 2021, 3:01 AM

#

serene scaffold You had the power all along bb

i didnt but ok

serene scaffold Aug 12, 2021, 3:01 AM

#

No I mean
You have the power.
Like as a person

quiet vault Aug 12, 2021, 3:01 AM

#

oh

#

lol

#

anyway

#

2021-08-11 21:49:13.705324: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-11 21:49:14.151069: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3993 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1660, pci bus id: 0000:01:00.0, compute capability: 7.5
2021-08-11 21:49:14.333000: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2021-08-11 21:49:15.356801: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8202

#

this comes up when i import tensorflow

#

is everything ok

serene scaffold Aug 12, 2021, 3:02 AM

#

Oh I'm going to sleep but hopefully someone knows

quiet vault Aug 12, 2021, 3:02 AM

#

nonono

#

pls

#

7781meongcute

unborn glacier Aug 12, 2021, 3:03 AM

#

If you google that error "None of the MLIR Optimization Passes are enabled" it says it's fine

#

So...
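
Besides reading the log (the "Created device ... GPU:0" line is the good sign), TensorFlow can be asked directly which GPUs it sees. A small sketch, assuming TensorFlow 2.x; the wrapper `visible_gpus` is a made-up helper that returns an empty list when TensorFlow is not installed at all:

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see ([] if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        return []  # TensorFlow is not installed in this environment
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
print("GPUs visible to TensorFlow:", gpus or "none")
```

An empty result on a machine with an NVIDIA card usually means the CUDA/cuDNN libraries were not found, as in the earlier `cudart64_110.dll not found` warnings.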

quiet vault Aug 12, 2021, 3:03 AM

#

aight

#

epic

serene scaffold Aug 12, 2021, 3:04 AM

#

unborn glacier If you google that error "None of the MLIR Optimization Passes are enabled" it s...

That's exactly the part I wanted to single out but I can't select text on mobile
Thanks lemon_hyperpleased

quiet vault Aug 12, 2021, 3:04 AM

#

This means that i am the first person to make tensorflow 2.6.0 work with cuda 11.4

sour spindle Aug 12, 2021, 3:04 AM

#

This is gonna take awhile to train

unborn glacier Aug 12, 2021, 3:06 AM

#

Are you using a local GPU or colab or AWS or something?

sour spindle Aug 12, 2021, 3:06 AM

#

Colab

sudden canyon Aug 12, 2021, 3:15 AM

#

Statistics question here. I was thinking about architecting a code running/code contest system where students submit solutions to various problems, and I wanted to calculate how many cores I'd need to allocate for a particular contest.

Let's say that I have S students solving problems concurrently. This is a perfect system, so each student's workflow is like this: they are working on the code for I seconds, then they submit the code for testing and wait for the testing system to run all the tests. After the student gets the results, they start the next iteration and so on.

Each problem has M tests in it that it runs (a test consists of running a program with a certain input and checking that the output matches the predefined one). Each test takes T_avg on average and T_worst in the worst case to run (there's a hard upper limit, but many submissions will run very quickly). I have C cores at my disposal, in other words I can run C total tests in parallel.

If I'm willing to accept that students will wait for X seconds for the results of the test, how many cores (C) do I need?

unborn glacier Aug 12, 2021, 3:17 AM

#

Why not simulate it to get a pretty solid estimation

#

I mean it's definitely a solvable stats problem, but it's pretty trivial to run a simulation

undone flare Aug 12, 2021, 3:22 AM

#

@desert oar The code which I have: https://hastebin.com/ajidowofir.py
Output: https://hastebin.com/sedosoposa.yaml
The commented-out lines in the code give the same reshape error as yesterday

#

ValueError: Expected 2D array, got 1D array instead:

Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
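
What the two reshape calls in that message do, as a tiny sketch. Which array in the linked script actually triggers the error is not visible here, so `y` below is just a hypothetical 1D array:

```python
import numpy as np

y = np.array([0, 1, 1, 0])       # 1D array, shape (4,)
one_feature = y.reshape(-1, 1)   # 4 samples with 1 feature each, shape (4, 1)
one_sample = y.reshape(1, -1)    # 1 sample with 4 features, shape (1, 4)

print(y.shape, one_feature.shape, one_sample.shape)
```

scikit-learn estimators expect the feature matrix `X` to be 2D (samples by features), so a single column usually needs the `reshape(-1, 1)` form.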

undone flare Aug 12, 2021, 3:44 AM

#

However it works on your model

#

like

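import numpy as np
import scipy.stats
from sklearn.experimental import enable_halving_search_cv  # noqa: F401 (enables the import below)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import HalvingRandomSearchCV

base_model = LogisticRegression(max_iter=100)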
legacy_random_state = np.random.RandomState()
tune_model = HalvingRandomSearchCV(
    base_model,
    param_distributions={
        "C": scipy.stats.expon(scale=100),
        "class_weight": ["balanced", None],
    },
    cv=3,
    n_jobs=3,
    random_state=legacy_random_state,
    verbose=1,
)

tune_model.fit(X_train, y_train)
pred_train = tune_model.predict(X_train)
score_train = tune_model.score(X_train, y_train)

pred_test = tune_model.predict(X_test)
score_test = tune_model.score(X_test, y_test)

print("Train:", score_train)
print("Test:", score_test)

#

so I don't even know what I am doing wrong

unborn glacier Aug 12, 2021, 3:56 AM

#

sudden canyon Statistics question here. I was thinking about architecting a code running/code ...

import random
import scipy.stats as stats
from collections import Counter


S = 100 #students
I = 200 #time per problem
sI = 30 #std.dev of I
maxI = 3600 #maximum time
minI = 60 #minimum time
M = 10 #number of problems
T = 3 #typical time
Tf = 30 #timeout time
pTf = 3 #percent chance the student's code times out

a, b = minI, maxI
mu, sigma = I, sI
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

max_cores = []
for i in range(100):
    all_students = []
    for student in range(S):
        in_use = []
        for problem in range(M):
            problem_time = T
            if random.random() * 100 < pTf:  # exactly pTf percent chance of a timeout
                problem_time = Tf
            solve_time = int(dist.rvs())
            if in_use==[]:
                in_use+=(list(range(solve_time,solve_time+problem_time)))
            else:
                in_use+=(list(range(in_use[-1]+solve_time,in_use[-1]+solve_time+problem_time)))
        all_students+=in_use
    max_cores.append(Counter(all_students).most_common(1)[0][1])

print(max_cores)

#

PAINFULLY slow and not optimized, but it works!

sudden canyon Aug 12, 2021, 3:58 AM

#

unborn glacier ```py import random import scipy.stats as stats from collections import Counter ...

hm... I guess that works if I can make a big lookup table, thanks

#

downloading scipy

unborn glacier Aug 12, 2021, 3:59 AM

#

That gives the max cores to never have a collision, if you have a queue or something it would change the complexity

#

Also I think there is a way to have more python instances than cores, just that everyone's code would slow down a bit

undone flare Aug 12, 2021, 3:59 AM

#

100 cases tho rite? ._.

sudden canyon Aug 12, 2021, 4:00 AM

#

well, there's no reason to run 7 processes on 4 cores, right

#

(in this context)

unborn glacier Aug 12, 2021, 4:00 AM

#

Well it's better to be sharing a core with someone in an infinite while loop than to be queuing behind them

sudden canyon Aug 12, 2021, 4:02 AM

#

but it will have the same throughput, or maybe a lower throughput, right?

unborn glacier Aug 12, 2021, 4:03 AM

#

Yes

#

But like if 10 people use up all 10 available cores, for example, and are all running while loops that last forever, I'll never get the chance to run something as simple as print("hello")

#

But if I have an available thread on a shared cpu, then my print("hello") will take one nanosecond longer to run, but I won't have to wait forever to start it

sudden canyon Aug 12, 2021, 4:08 AM

#

the issue is, I'll have several machines, not one machine

unborn glacier Aug 12, 2021, 4:09 AM

#

Yeah, I'm not really sure how you'd route traffic effectively

sudden canyon Aug 12, 2021, 4:10 AM

#

The numbers are something like:

maximum test time = 5s
test cases per problem = 20
students = 1000 * N

unborn glacier Aug 12, 2021, 4:11 AM

#

1000 * N ? Like you have multiple thousands of students?

sudden canyon Aug 12, 2021, 4:11 AM

#

yeah, that's the theoretical idea

#

I'm not making anything practical yet, just wondering how many servers that would need

unborn glacier Aug 12, 2021, 4:12 AM

#

So now the question becomes, how many cores do you need to run my poorly optimized code for that many students

#

Lol

sudden canyon Aug 12, 2021, 4:12 AM

#

brainmon

#

write another simulation to find out

unborn glacier Aug 12, 2021, 4:12 AM

#

Haha

#

Honestly you might be able to generalize the simulation results to an approximation

sudden canyon Aug 12, 2021, 4:15 AM

#

Not sure if my logic is correct, but in a perfect world
```
N = number of students
i = iteration time
t = time waiting in queue
c = number of cores
r = time to run a test
M = tests per problem
```
The system can process `capacity = c/r` tests per second, while students submit `throughput = N * M / (i + t)` tests per second
`c/r = N * M / (i + t)`
`c = N * M * r / (i + t)`
So if 
```py
N = 1000
i = 300  # 5 minutes
t = 10   # 10 s wait on average
r = 5    # 5 s per test
M = 20
```
then I need 323 cores (ouch)
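
The arithmetic behind the 323, as a quick sketch of the formula `c = N * M * r / (i + t)` above. The function and parameter names are made up for readability, and it only captures the steady-state average, not bursts or queueing effects:

```python
import math


def cores_needed(students, think_time, wait_time, test_time, tests_per_problem):
    """Smallest whole number of cores c with c / r >= N * M / (i + t)."""
    # Tests submitted per second across all students.
    demand = students * tests_per_problem / (think_time + wait_time)
    # Each core finishes 1 / test_time tests per second.
    return math.ceil(demand * test_time)


cores = cores_needed(students=1000, think_time=300, wait_time=10,
                     test_time=5, tests_per_problem=20)
print(cores)  # 100000 / 310 = 322.58..., rounded up
```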

inland zephyr Aug 12, 2021, 4:15 AM

#

i want to ask about keras loss function. There is a categorical cross-entropy and sparse categorical cross-entropy. What is the difference between them, and which function should i use if the data is slightly imbalanced (2:1)?

unborn glacier Aug 12, 2021, 4:19 AM

#

Was 323 what you calculated theoretically?

sudden canyon Aug 12, 2021, 4:19 AM

#

yeah, from these numbers on my ""model""

unborn glacier Aug 12, 2021, 4:20 AM

#

The simulation predicts a very similar number, so your theory is correct!

sudden canyon Aug 12, 2021, 4:20 AM

#

lemon_exploding_head

unborn glacier Aug 12, 2021, 4:20 AM

#

-> [310, 293, 280, 293, 318, 309, 294, 323, 297, 301]

#

Weird it actually has 323 as one of the values

unborn glacier Aug 12, 2021, 4:21 AM

#

inland zephyr i want to ask about keras loss function. There is a categorical cross-entropy an...

Does this answer it:
https://stackoverflow.com/questions/58565394/what-is-the-difference-between-sparse-categorical-crossentropy-and-categorical-c

Stack Overflow

What is the difference between sparse_categorical_crossentropy and ...


odd falcon Aug 12, 2021, 4:42 AM

#

why is it giving such an error? can someone help me?

austere swift Aug 12, 2021, 5:25 AM

#

odd falcon why is it giving such an error? can someone help me?

you're trying to convert the string 'labels' into an integer, which doesn't work

#

what is the line where that error shows up?

iron basalt Aug 12, 2021, 6:25 AM

#

inland zephyr i want to ask about keras loss function. There is a categorical cross-entropy an...

This vector (a) is dense:  [0.1, 0.5, 0.3, 1.0, 0.8, 0.6, 0.1, 0.7]
This vector (b) is sparse: [0.0, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 1.0]
The vector b can be compressed. It can instead be stored like this:
nnz = [0.4, 1.0]
indices = [4, 7]
Where "nnz" is the non-zero values, and indices is the indices of those non-zero values.
Consider adding b to a (changing a in-place). When using the non-compressed form of b, there are n iterations, where n is the length of the two vectors (8 iterations).
Now consider adding b to a, but this time using the compressed form of b. Adding the zero values from b to a is pointless as it leaves those values unchanged.
So there are m iterations where m is the number of non-zero values in b (2 iterations). This provides a very large speedup with large enough sizes and if b is sparse enough (e.g. 1% non-zero).
In addition, since only the non-zero values of b are stored and b has mostly zero values, there is a large reduction in memory usage for b.
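
The two ways of adding b to a described above, as a minimal pure-Python sketch (the function names are made up; real sparse libraries do the same thing with arrays):

```python
def add_dense(a, b):
    """Add b to a in place, visiting every position (n iterations)."""
    for k in range(len(a)):
        a[k] += b[k]


def add_sparse(a, indices, nnz):
    """Add a compressed vector to a in place (one iteration per non-zero)."""
    for k, value in zip(indices, nnz):
        a[k] += value


a1 = [0.1, 0.5, 0.3, 1.0, 0.8, 0.6, 0.1, 0.7]
a2 = list(a1)
b = [0.0, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 1.0]

add_dense(a1, b)                                  # 8 iterations
add_sparse(a2, indices=[4, 7], nnz=[0.4, 1.0])    # 2 iterations

print(a1 == a2)
```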

#

Now consider having categories of color: red, blue, green.
Each category can be stored as one-hot encoding:
red = [1, 0, 0]
blue = [0, 1, 0]
green = [0, 0, 1]
Each of these can be stored in a compressed form. While for only 3 categories and batch size 1 this does not provide much benefit, for larger sizes there are gains to be had.
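
For one-hot vectors the compressed form is just the position of the single 1, which is why class labels are often stored as plain integers. A small sketch, with hypothetical helper names:

```python
CATEGORIES = ["red", "blue", "green"]


def one_hot(index, num_classes):
    """Expand an integer class index into a one-hot list."""
    vec = [0] * num_classes
    vec[index] = 1
    return vec


def to_index(one_hot_vec):
    """Compress a one-hot list back into its integer class index."""
    return one_hot_vec.index(1)


encoded = {name: one_hot(i, len(CATEGORIES)) for i, name in enumerate(CATEGORIES)}
print(encoded)
print(to_index(encoded["green"]))
```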

inland zephyr Aug 12, 2021, 6:35 AM

#

my class are coded 0 and 1 but the proportion is 2:1 for 0 and 1 class. I using the sparse with 2 outputs at the dense. When i try to use sparse, the accuracy is higher compared with the dense one. Of course, in the dense one i only use 1 output and in the sparse one there are 2

iron basalt Aug 12, 2021, 6:36 AM

#

inland zephyr my class are coded 0 and 1 but the proportion is 2:1 for 0 and 1 class. I using ...

You are doing binary classification?

inland zephyr Aug 12, 2021, 6:36 AM

#

Its binary but i hot encode my class as 0 and 1 (in integer)

iron basalt Aug 12, 2021, 6:37 AM

#

If it's binary classification there is no need for one-hot encoding.

#

It's just one output, true or false

inland zephyr Aug 12, 2021, 6:40 AM

#

I'm sorry, to clarify the last statement: in the source data (the CSV dataset) the class is set to 0 or 1 as an integer (although class 0 and class 1 come from different CSV files)

iron basalt Aug 12, 2021, 6:41 AM

#

So you have multiple classes and you have one-hot encoded each?

inland zephyr Aug 12, 2021, 6:41 AM

#

yup

iron basalt Aug 12, 2021, 6:41 AM

#

what is this "proportion" then?

inland zephyr Aug 12, 2021, 6:41 AM

#

2:1

iron basalt Aug 12, 2021, 6:41 AM

#

of what

inland zephyr Aug 12, 2021, 6:42 AM

#

2 from class 0 and 1 from class 1

#

so if theres 14 class 0 so there would be 12 for class 1

iron basalt Aug 12, 2021, 6:42 AM

#

I thought there was more than 2 classes.

inland zephyr Aug 12, 2021, 6:43 AM

#

so the case is: the normal-condition data is labeled 1, and the sick data 0. There is a 2-row difference between class 0 and 1, with class 0 being the bigger one

iron basalt Aug 12, 2021, 6:43 AM

#

What do you mean by class? Category? Sub-category? Or something else.

inland zephyr Aug 12, 2021, 6:43 AM

#

its category

iron basalt Aug 12, 2021, 6:44 AM

#

So the proportion is the amount of each class in the dataset (labelled)?

inland zephyr Aug 12, 2021, 6:46 AM

#

there is only one dataset (combine of class 0 and 1 in separated CSV), which has 40 class 0 or sick and 36 class 1 or healthy

iron basalt Aug 12, 2021, 6:46 AM

#

So it's binary classification.

#

Two categories, sick and not sick.

#

True/False

#

Use binary cross-entropy loss.

#

https://towardsdatascience.com/understanding-binary-cross-entropy-log-loss-a-visual-explanation-a3ac6025181a

Medium

Understanding binary cross-entropy / log loss: a visual explanation

Have you ever thought about what exactly does it mean to use this loss function?

inland zephyr Aug 12, 2021, 7:09 AM

#

thanks @iron basalt now i know how to do. But there is small issue... about the net i used and the class to feed the CNN. I read since i have two class, i need to define my net

model = Sequential([
    InputLayer(input_shape=(f_leng,1)),
    Conv1D(filters= 128,kernel_size=3,activation='relu'),
    Conv1D(filters=128,kernel_size=5,activation='relu'),
    MaxPool1D(pool_size=10),
    Conv1D(filters= 256,kernel_size=3,activation='relu'),
    Conv1D(filters=256,kernel_size=5,activation='relu'),
    MaxPool1D(pool_size=2),
    Dropout(rate=0.3),
    Flatten(),
    Dense(2,activation='softmax')
])
model.compile(loss = tf.keras.losses.BinaryCrossentropy(),optimizer='Adam', metrics=["accuracy"])

like this, since using Dense(1,activation='softmax') gives a bad result (it always fails to classify the 2nd class). But when i feed the labels with
trainY = trainY.reshape(trainY.shape[0], 1, 1) since trainY is a 1D array,
it always gives the error ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1)). This error does not happen when i use sparse categorical

odd falcon Aug 12, 2021, 7:10 AM

#

austere swift you're trying to convert the string `'labels'` into an integer, which doesn't wo...

line 24, error in loop

austere swift Aug 12, 2021, 7:12 AM

#

inland zephyr thanks @iron basalt now i know how to do. But there is small issue... ...

sparse categorical uses index labels with one hot outputs, categorical uses onehot labels with onehot outputs

#

i'm assuming from the error that you have index labels so you need to use sparse categorical
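
To make the index-label vs one-hot-label distinction concrete, here is a per-sample sketch in plain Python. These are not the Keras implementations, just the textbook formula applied to already-normalized probabilities; the function names mirror the Keras loss names for readability only:

```python
import math


def categorical_crossentropy(one_hot_label, probs):
    """Cross-entropy for a one-hot label such as [0, 1]."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


def sparse_categorical_crossentropy(index_label, probs):
    """Cross-entropy for an integer (index) label such as 1."""
    return -math.log(probs[index_label])


probs = [0.3, 0.7]  # model output for one sample with 2 classes
dense_loss = categorical_crossentropy([0, 1], probs)
sparse_loss = sparse_categorical_crossentropy(1, probs)

print(dense_loss, sparse_loss)
```

Both give the same number for the same sample; the only difference is the shape the label is expected to arrive in.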

austere swift Aug 12, 2021, 7:12 AM

#

odd falcon line 24, error in loop

can you give the full traceback?

inland zephyr Aug 12, 2021, 7:13 AM

#

austere swift sparse categorical uses index labels with one hot outputs, categorical uses oneh...

now i'm using binary

austere swift Aug 12, 2021, 7:13 AM

#

inland zephyr now i'm using binary

if your labels are 0 or 1, then that would be considered index labels

inland zephyr Aug 12, 2021, 7:13 AM

#

and my class is 0 and 1... which arranged in 1D array

austere swift Aug 12, 2021, 7:13 AM

#

one hot would be [1, 0] or [0, 1]

inland zephyr Aug 12, 2021, 7:13 AM

#

ow so thats the problem

#

is it okay if using sparse categorical
for index labels things?

iron basalt Aug 12, 2021, 7:15 AM

#

inland zephyr thanks @iron basalt now i know how to do. But there is small issue... ...

https://dataaspirant.com/difference-between-softmax-function-and-sigmoid-function/

Dataaspirant

Difference Between Softmax Function and Sigmoid Function

Understand the fundamental differences between softmax function and sigmoid function with the in details explanation and the implementation in Python.

austere swift Aug 12, 2021, 7:15 AM

#

inland zephyr is it okay if using sparse categorical for index labels things?

yes

#

you can't use categorical with index labels anyways

iron basalt Aug 12, 2021, 7:16 AM

#

You can either have 2 outputs with softmax, or 1 output with sigmoid, both should work. I prefer the sigmoid route because it's less computation being done.

#

For the 2 output softmax version, sparse can be used but you won't really gain anything from it.

#

As for keras API specific stuff, idk, I have not used it in a long time.

austere swift Aug 12, 2021, 7:18 AM

#

yeah i was discussing it from the point of view that you don't wanna change the model, if you want to change the model to use sigmoid then you can change the loss function as well

iron basalt Aug 12, 2021, 7:19 AM

#

To explain the difference between the two, with softmax you would get something like [0.3, 0.7] as output (probability of each class), and with sigmoid you would just get output of like [0.7]

#

You know the probability of the other class is just 1.0 - 0.7
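
A small numeric sketch of the relationship, in plain Python rather than Keras. For two classes, softmax over two logits gives the same probability as a sigmoid over their difference, and softmax over a single logit is always 1.0, which would explain why Dense(1, activation='softmax') kept predicting one class. The logit values below are arbitrary:

```python
import math


def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]  # shift by max for stability
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Squash a single logit into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


two_class = softmax([0.2, 1.5])   # [P(class 0), P(class 1)]
single = sigmoid(1.5 - 0.2)       # P(class 1) from one output
lonely = softmax([3.7])           # softmax over a single output

print(two_class, single, lonely)
```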

austere swift Aug 12, 2021, 7:19 AM

#

in most cases people usually go the sigmoid route for binary stuff and softmax for more than that

inland zephyr Aug 12, 2021, 7:20 AM

#

so for the binary case, softmax is needlessly costly in performance since sigmoid is enough

iron basalt Aug 12, 2021, 7:20 AM

#

Softmax is for when you have 3 or more because then it can give you something like [0.1, 0.2, 0.7]

austere swift Aug 12, 2021, 7:20 AM

#

^

iron basalt Aug 12, 2021, 7:20 AM

#

Sigmoid does not work for that

austere swift Aug 12, 2021, 7:20 AM

#

iron basalt Sigmoid does not work for that

technically it can work but it will give weird results so its usually just avoided

iron basalt Aug 12, 2021, 7:20 AM

#

Yeah, but when I say work, I mean also work well. A lot of things "work" in ML

austere swift Aug 12, 2021, 7:21 AM

#

yeah

iron basalt Aug 12, 2021, 7:21 AM

#

Technically you could have 3 sigmoid outputs and get away with it on simple stuff.

austere swift Aug 12, 2021, 7:21 AM

#

because softmax will always have the results add up to 1, while sigmoid they will not

odd falcon Aug 12, 2021, 7:21 AM

#

austere swift can you give the full traceback?

I have a sentiment analysis dataset with 2 columns, comment and label (POS, NEG, NEU). Then I encode label to labels (2, 0, 1) and I use a CNN model to classify them.

austere swift Aug 12, 2021, 7:22 AM

#

so softmax will be like the probabilities of each class, which will add up to 1
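
To illustrate the point about sums, here is a small sketch that applies both activations to the same three logits. The logits are made up; nothing here comes from the models discussed in the chat:

```python
import math

logits = [2.0, 1.0, 0.1]  # hypothetical outputs of a 3-class model

# Independent sigmoids: each output is squashed on its own.
sigmoid_out = [1.0 / (1.0 + math.exp(-v)) for v in logits]

# Softmax: outputs are normalised against each other.
exps = [math.exp(v) for v in logits]
softmax_out = [e / sum(exps) for e in exps]

print(sum(sigmoid_out))  # not constrained to 1
print(sum(softmax_out))  # 1 up to floating point error
```

Three independent sigmoids fit multi-label problems, where several classes can be true at once. Softmax fits the case where exactly one class is true.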

odd falcon Aug 12, 2021, 7:22 AM

#

it's define function train model

austere swift Aug 12, 2021, 7:22 AM

#

odd falcon I have a sentiment analysis dataset with 2 columns comment and label(POS, NEG, N...

by traceback i mean the error output

inland zephyr Aug 12, 2021, 7:26 AM

#

austere swift you can't use categorical with index labels anyways

should i convert again the index labels or not? i directly feed the class without further encoding to the model

iron basalt Aug 12, 2021, 7:26 AM

#

The classes are 0 and 1 so they are directly the targets in the case of using 1 sigmoid output.

#

They act as the probabilities

inland zephyr Aug 12, 2021, 7:27 AM

#

and now it works, will be waiting for the result...

iron basalt Aug 12, 2021, 7:27 AM

#

probability of sick is 1 or 0

inland zephyr Aug 12, 2021, 7:28 AM

#

the proba for sick is 0 and healthy is 1
so far it works with sigmoid and the sparse one

#

actually in this project i combine it with signal processing since the source data is an ECG record. I store it as CSV because it's the proper way to represent the data

inner pebble Aug 12, 2021, 7:52 AM

#

Hello everyone,
I have a question regarding regex.
I'd like to check if a variable follows a certain pattern before converting it from str to date.
The pattern is this one:
2022-05-20 13:21:29

I can't manage to write the regex needed to check it.
How would you proceed?
Thanks.

#

I tried this:
\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}

#

oops never mind, the problem was not coming from my regex, it works 👍
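
For reference, a minimal sketch of using that pattern from Python. The helper name `looks_like_timestamp` is hypothetical. `re.fullmatch` is used so that trailing text is rejected:

```python
import re

# Same pattern as in the message above, compiled once.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}")

def looks_like_timestamp(text):
    # fullmatch requires the whole string to fit the pattern.
    return TIMESTAMP.fullmatch(text) is not None

print(looks_like_timestamp("2022-05-20 13:21:29"))
print(looks_like_timestamp("2022-05-20"))
```

Note that the pattern only checks the shape of the string, so something like `2022-99-99 99:99:99` still passes.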

velvet thorn Aug 12, 2021, 9:17 AM

#

inner pebble Hello everyone, I have a question regarding regex. I d like to check if a variab...

why do you need to validate dates manually?

inner pebble Aug 12, 2021, 9:17 AM

#

Because I have some strings which don't follow the correct format

#

I realised it when attempting to mutate str to date

#

but I wanted to extract a list of all the incorrectly formatted values

#

Do you know a better way to proceed?

velvet thorn Aug 12, 2021, 9:22 AM

#

inner pebble but Wanted to extract a list of all the non appropriate formats

why not just use strptime in a loop

#

with try-except

inner pebble Aug 12, 2021, 9:22 AM

#

I don't know this, I'm gonna check it 🙂 thanks

#

This is great, what if I use assert instead of a try-except? Would I see the incorrect format appearing?
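
A minimal sketch of the strptime loop suggested above, assuming the values are strings and the expected format is the one from the earlier example. The helper name `split_valid_invalid` and the sample values are hypothetical:

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"  # matches 2022-05-20 13:21:29

def split_valid_invalid(values, fmt=FORMAT):
    # Parse what can be parsed and collect everything else for inspection.
    parsed, rejected = [], []
    for value in values:
        try:
            parsed.append(datetime.strptime(value, fmt))
        except ValueError:
            rejected.append(value)
    return parsed, rejected

samples = ["2022-05-20 13:21:29", "20/05/2022", "2022-13-01 00:00:00"]
parsed, rejected = split_valid_invalid(samples)
print(rejected)
```

On the assert question: `strptime` already raises `ValueError` on a bad string, so a bare assert would stop at the first failure. Catching the exception lets the loop continue and list every offending value.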

chilly geyser Aug 12, 2021, 10:14 AM

#

sudden canyon Statistics question here. I was thinking about architecting a code running/code ...

Not sure if I should ping, but pinging anyway.
You can try
https://en.wikipedia.org/wiki/M/M/c_queue
If you assume times are Markovian,
else
https://en.wikipedia.org/wiki/M/G/k_queue
(for which you're less likely to get analytical solutions)
There's also G/M/k queues, which should relate to M/G/k queues (but don't ask me how)

G/G/k sounds very hopeless, you might want to approximate (IIRC - heavy traffic approximation might make the problem analytically easier), or even just use the raw M/M/k queues for approximation.

#

The problem is kind of different given that you have finite students, but infinite students with a distribution of visiting times might still be a good idea

vital widget Aug 12, 2021, 10:22 AM

#

hi um, idk if this is the right place to ask but, how much python should i learn to be able to start learning AI/ML stuff?

austere swift Aug 12, 2021, 10:41 AM

#

a basic understanding should be fine, you can learn more advanced stuff as you go on

undone flare Aug 12, 2021, 10:52 AM

#

vital widget hi um, idk if this is the right place to ask but, how much python should i learn...

I would say basic fundamentals and functions

somber prism Aug 12, 2021, 10:52 AM

#

can someone explain to me what the class_weight parameter is in sklearn clf algorithms

#

it has options like either None or balanced

undone flare Aug 12, 2021, 10:55 AM

#

somber prism can someone explain me what is class_weight parameter in sklearn clf algorithms

I found this helpful: https://stackoverflow.com/questions/30972029/how-does-the-class-weight-parameter-in-scikit-learn-work
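basically class_weight="balanced" means that each class gets a weight inversely proportional to how often it appears, so mistakes on the smaller class cost more. The effect is similar to replicating the smaller class until it has the same number of samples as the larger class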
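
For reference, scikit-learn documents the "balanced" mode as weighting each class by `n_samples / (n_classes * count_of_that_class)`. The sketch below reproduces that formula in plain Python. The helper name `balanced_class_weights` and the 90/10 label split are made up for illustration:

```python
from collections import Counter

def balanced_class_weights(labels):
    # weight = n_samples / (n_classes * count of that class)
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

labels = [0] * 90 + [1] * 10  # hypothetical imbalanced binary target
weights = balanced_class_weights(labels)
print(weights)
```

No rows are duplicated. The weights only scale how much each sample contributes to the loss, which is why the result can differ from oversampling methods such as SMOTE.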

somber prism Aug 12, 2021, 11:00 AM

#

undone flare I found this helpful: https://stackoverflow.com/questions/30972029/how-does-the-...

so its used when the target labels are imbalanced ?

#

i tried it on an imbalanced dataset, after changing the parameter to 'balanced' my f1 score went down

undone flare Aug 12, 2021, 11:02 AM

#

if your dataset is imbalanced try using a technique like SMOTE

somber prism Aug 12, 2021, 11:04 AM

#

yeh going to try that now

undone flare Aug 12, 2021, 11:11 AM

#

somber prism i tried it to some imbalanced dataset , after changing the parameter to 'balance...

what is the average param set for your f1 score?

somber prism Aug 12, 2021, 11:13 AM

#

undone flare what is the average param set for your f1 score?

binary

undone flare Aug 12, 2021, 11:13 AM

#

oh binary classification okay

somber prism Aug 12, 2021, 11:14 AM

#

undone flare oh binary classification okay

if its more than 2 clf then i should change that to weighted right ?

undone flare Aug 12, 2021, 11:15 AM

#

well if you have multiclass labels then only this parm is required

undone flare Aug 12, 2021, 11:16 AM

#

somber prism if its more than 2 clf then i should change that to weighted right ?

do you want to calculate metrics for each label?

somber prism Aug 12, 2021, 11:16 AM

#

undone flare do you want to calculate metrics for each label?

yes

undone flare Aug 12, 2021, 11:17 AM

#

then set it to None, that returns one score per label

#

also want the average?

#

bcuz that is what weighted will do
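
A small sketch of what the different averaging modes compute, written in plain Python so the arithmetic is visible. The labels and predictions are made up. In scikit-learn, `f1_score` with `average=None`, `"macro"` and `"weighted"` corresponds to these three quantities:

```python
def per_class_f1(y_true, y_pred, labels):
    # F1 per label = 2*TP / (2*TP + FP + FN)
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

y_true = [0, 0, 0, 0, 1, 1, 2]  # hypothetical 3-class targets
y_pred = [0, 0, 0, 1, 1, 2, 2]  # hypothetical predictions
labels = [0, 1, 2]

scores = per_class_f1(y_true, y_pred, labels)      # like average=None
macro = sum(scores.values()) / len(labels)         # like average="macro"
support = {label: y_true.count(label) for label in labels}
weighted = sum(scores[l] * support[l] for l in labels) / len(y_true)  # like average="weighted"

print(scores, macro, weighted)
```

The weighted average leans towards the classes with more true samples, so it can hide a poor score on a rare class that the per-label view makes obvious.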

vital widget Aug 12, 2021, 11:19 AM

#

austere swift a basic understanding should be fine, you can learn more advanced stuff as you g...

ooo okay then

vital widget Aug 12, 2021, 11:19 AM

#

undone flare I would say basic fundamentals and functions

yep thanks i'll look into it

grave frost Aug 12, 2021, 11:59 AM

#

undone flare well if you have multiclass labels then only this parm is required

uh, no

#

you can use class_weights for n classes, be it 2 or 10

#

its useful for baselines if you don't want to do any augmentation

#

but its impact on accuracy is kinda inconsistent

#

try training once without the param and once with it to see if there is any difference in accuracy scores (setting a seed ofc) @somber prism if you don't see any major difference, use some augmentation

undone flare Aug 12, 2021, 12:06 PM

#

grave frost you can use `class_weights` for n classes, be it 2 or 10

I was talking about f1_score average parameter

grave frost Aug 12, 2021, 12:08 PM

#

undone flare I was talking about f1_score `average` parameter

ayy, my bad

lapis sequoia Aug 12, 2021, 12:58 PM

#

where do i learn data science for free

carmine tide Aug 12, 2021, 1:09 PM

#

Hello! I have some data for x and y axes and I also have their errors. The thing is that I can't find a way to fit a curve that takes into consideration both x and y errors. The function curve_fit only considers y error. I also tried using odr but it doesn't give me the correct curve. I have searched a lot but I can't find anything that works, so I would appreciate some help. Thanks in advance

chilly geyser Aug 12, 2021, 2:13 PM

#

How do you know it's not giving you the 'correct curve'?

late shell Aug 12, 2021, 2:20 PM

#

Hello, I have a dataset of X-ray images that fall under one of the classes : covid or non-covid. The assignment requires me to perform EDA on these images. Can someone help me with this. Except plotting the mean and S.D of the pixels of each image in a scatter plot, I don't know what kind of EDA can I run on images.

carmine tide Aug 12, 2021, 2:29 PM

#

chilly geyser How do you know it's not giving you the 'correct curve'?

Well you can easily tell that it doesn't correctly follow the points. But I also checked with OriginLab which should give me the correct curve and they were different.

sudden canyon Aug 12, 2021, 2:46 PM

#

chilly geyser Not sure if I should ping, but pinging anyway. You can try https://en.wikipedia....

Yes, please ping me always!
That's... a bit over my head to be honest, but I'll try reading it again tomorrow

#

thanks

frozen hound Aug 12, 2021, 2:59 PM

#

This is the appropriate channel for TF questions I assume?

chilly geyser Aug 12, 2021, 3:02 PM

#

sudden canyon Yes, please ping me always! That's... a bit over my head to be honest, but I'll ...

Basically you can get statistics for waiting time or time in queue if you use such models, and I think you can compare those against X

chilly geyser Aug 12, 2021, 3:02 PM

#

frozen hound This is the appropriate channel for TF questions I assume?

Yes, but whether someone answers... that's a little difficult to say. It depends on your question

#

And like, if anyone is well-experienced and willing to answer

austere loom Aug 12, 2021, 3:31 PM

#

Anyone using the new VSCode notebooks?

umbral ferry Aug 12, 2021, 3:41 PM

#

So I know on xgboost, you can use a few different metrics to determine feature importance, but is there a way to determine the importance of a set of features? For example, could you pick a certain feature, and then look at all the features the algo decided to split on immediately after your picked feature, and compare the gain? Like say every time it split on "Color" and then "Size", the gain from both of them is larger on average than "Color" and then "Weight". And then you'd interpret that as those two features interacting with each other somehow

desert oar Aug 12, 2021, 4:01 PM

#

umbral ferry So I know on xgboost, you can use a few different metrics to determine feature i...

what about the average importance of a set of features? all of the importance scores used in xgboost are "additive" and behave linearly, so you can just add up the scores https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.Booster.get_score

#

well, if you add the averages it won't be quite the same

#

but you can sum the total_gains for each of the features, and sum the number of splits (called weight in xgboost) for each of the features, and divide to get an average
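
A sketch of that calculation, assuming two dicts shaped like what `Booster.get_score(importance_type="total_gain")` and `Booster.get_score(importance_type="weight")` return. The numbers and the helper name `average_gain` are made up for illustration:

```python
# Hypothetical importance dicts, keyed by feature name.
total_gain = {"Color": 120.0, "Size": 80.0, "Weight": 30.0}
weight = {"Color": 10, "Size": 5, "Weight": 6}  # number of splits per feature

def average_gain(features, total_gain, weight):
    # Pool the gain and the split counts first, then divide once.
    gain = sum(total_gain.get(f, 0.0) for f in features)
    splits = sum(weight.get(f, 0) for f in features)
    return gain / splits if splits else 0.0

group = average_gain(["Color", "Size"], total_gain, weight)

# Averaging the per-feature averages gives a different number.
mean_of_means = (120.0 / 10 + 80.0 / 5) / 2

print(group, mean_of_means)
```

This gives the pooled importance of a set of features. It does not measure how two features interact, which is the question raised in the reply that follows.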

umbral ferry Aug 12, 2021, 4:08 PM

#

I'm not sure if that would have the effect I'm looking for, which is determining the relationship between two (or more) features. I'm less concerned about the absolute value, and more about for this specific feature, which feature in combination is the most important

#

isn't total_gain/weight the same as just gain?

#

gain is the average gain per split (according to docs)

undone flare Aug 12, 2021, 4:11 PM

#

does anyone have a tutorial on image classification with machine learning (no CNN)

unborn glacier Aug 12, 2021, 4:25 PM

#

undone flare does anyone have a tutorial on image classification with machine learning (no CN...

https://medium.com/analytics-vidhya/applying-ann-digit-and-fashion-mnist-13accfc44660

#

If you just search MNIST ANN or basic MNIST NN you'll find a bunch

undone flare Aug 12, 2021, 4:28 PM

#

thanks

chilly geyser Aug 12, 2021, 4:33 PM

#

umbral ferry So I know on xgboost, you can use a few different metrics to determine feature i...

Are you looking for some non-linear way for measuring importance?

Not sure if it's a good idea. Suppose you have a definition, then you have a powerset of features to search over (which isn't tractable)

grave frost Aug 12, 2021, 5:12 PM

#

undone flare does anyone have a tutorial on image classification with machine learning (no CN...

there are plenty that use traditional sklearn algos

undone flare Aug 12, 2021, 5:13 PM

#

grave frost there are plenty that use traditional sklearn algos

I am trying to do one and not getting that good results

#

so I was trying to explore more

grave frost Aug 12, 2021, 5:13 PM

#

undone flare I am trying to do one and not getting that good results

what did you try?

undone flare Aug 12, 2021, 5:13 PM

#

didn't wanna do CNN just yet

grave frost Aug 12, 2021, 5:13 PM

#

I believe SVM with a bit of preprocessing does good

undone flare Aug 12, 2021, 5:14 PM

#

grave frost I believe SVM with a bit of preprocessing does good

yea that gave the highest result of all of them I tried ~85%

#

I think it can be better

#

like in the 90+

grave frost Aug 12, 2021, 5:14 PM

#

undone flare didn't wanna do CNN just yet

nobody uses MNIST with CNNs lol

#

its too overkill - MLPs are better

undone flare Aug 12, 2021, 5:15 PM

#

I am doing sign lang digits classification, is CNN overkill?

grave frost Aug 12, 2021, 5:16 PM

#

no

#

I recommend you learn the basics of NNs first rather than jumping straight to CNNs

undone flare Aug 12, 2021, 5:16 PM

#

I am

grave frost Aug 12, 2021, 5:16 PM

#

well, you are using SVM in one and CNNs in another 🤷

undone flare Aug 12, 2021, 5:17 PM

#

No?

#

ANN

grave frost Aug 12, 2021, 5:17 PM

#

yes, calling it MLP's is maybe more accurate

#

"Artificial Neural Network" isnt very specific

undone flare Aug 12, 2021, 5:18 PM

#

alright my bad

grave frost Aug 12, 2021, 5:18 PM

#

cool - for images, CNNs are the only archs (bar some esoteric ones)

#

ViT is not something you would use unless you want to win or smthing

#

it would be overkill anyways. CNNs are always the best for images

sonic scaffold Aug 12, 2021, 5:23 PM

#

I'm getting an error while making a box plot for my series

#

ValueError: The number of FixedLocator locations (2), usually from a call to set_ticks, does not match the number of ticklabels (1).

#

Can someone help pls

undone flare Aug 12, 2021, 5:25 PM

#

grave frost I believe SVM with a bit of preprocessing does good

here are the results from SVC

#

I don't know why I did mae

vague stratus Aug 12, 2021, 5:28 PM

#

undone flare here are the results from SVC

Please normalise the data before making a confusion matrix

#

Also I would suggest use log_loss available in sklearn as the loss function

undone flare Aug 12, 2021, 5:29 PM

#

vague stratus Please normalise the data before making a confusion matrix

It already is?

#

0-9 are digits

vague stratus Aug 12, 2021, 5:30 PM

#

undone flare It already is?

No, the confusion matrix isn't normalised. Like instead of a colorbar from 0 to 40 it would show from 0 to 1

undone flare Aug 12, 2021, 5:30 PM

#

vague stratus Aug 12, 2021, 5:31 PM

#

The benefit would be like: let's say in the test cases there were only 30 8s and 40 9s, then without normalising the 9 will be brighter than the 8 even if the accuracy in both classes was the same

chilly geyser Aug 12, 2021, 5:32 PM

#

sonic scaffold Can someone help pls

I'd recommend you give more detail, because I don't think anyone can help you without additional detail

vague stratus Aug 12, 2021, 5:32 PM

#

undone flare

you have normalised the pixel values which you should have, i am saying you could normalise the number of data from each class before making a confusion matrix so that it would not be confusing

undone flare Aug 12, 2021, 5:33 PM

#

oh how to do that?

desert oar Aug 12, 2021, 5:34 PM

#

undone flare yea that gave the highest result of all of them I tried ~85%

you mean the one i wrote for you? 😉

#

i think PCA + SVM is the "classic" MNIST solution

#

or something like preprocessing to binarize the data

#

i think you could do something similar with sign language

#

use some kind of image processing to extract an "outline"

#

then PCA + SVM or keep using RBF
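
A minimal sketch of the PCA + SVM idea, using scikit-learn's small bundled digits dataset as a stand-in (not MNIST and not the sign language data from the chat). The 30 components and the split settings are arbitrary assumptions, not tuned values:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# 8x8 digit images flattened to 64 features per sample.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# PCA compresses the pixels into broader features, then an RBF SVM classifies them.
model = make_pipeline(PCA(n_components=30, random_state=0), SVC(kernel="rbf"))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(accuracy)
```

Putting both steps in one pipeline keeps the PCA fit restricted to the training split, which avoids leaking test data into the features.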

vague stratus Aug 12, 2021, 5:36 PM

#

Well we are avoiding CNN so does that mean we are avoiding NNs too?

desert oar Aug 12, 2021, 5:36 PM

#

true a boring feedforward NN could be in play

vague stratus Aug 12, 2021, 5:36 PM

#

Because I have tried using autoencoders and then SVM; it works well too

desert oar Aug 12, 2021, 5:36 PM

#

"let's pretend it's 2008 again"

undone flare Aug 12, 2021, 5:36 PM

#

desert oar you mean the one i wrote for you? 😉

yea lol

#

this is what I got

vague stratus Aug 12, 2021, 5:37 PM

#

undone flare oh how to do that?

while creating the confusion matrix cm
you can use:
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
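
A runnable version of that line on a made-up 2x2 matrix, assuming rows are true classes and columns are predictions:

```python
import numpy as np

# Hypothetical counts: 30 samples of class 0, 40 samples of class 1.
cm = np.array([[28, 2],
               [4, 36]])

# Divide each row by its total so every row sums to 1.
cm_normalized = cm.astype(float) / cm.sum(axis=1)[:, np.newaxis]

print(cm_normalized)
```

Recent scikit-learn versions can also do this directly through the `normalize="true"` argument of `confusion_matrix`.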

vague stratus Aug 12, 2021, 5:38 PM

#

undone flare this is what I got

did you try ada-boosting the models?

desert oar Aug 12, 2021, 5:39 PM

#

undone flare this is what I got

you should note that it's specifically SVM with an RBF kernel

undone flare Aug 12, 2021, 5:39 PM

#

nope

desert oar Aug 12, 2021, 5:39 PM

#

a linear SVM wouldn't be much better than ridge regression, if at all

undone flare Aug 12, 2021, 5:39 PM

#

yea

undone flare Aug 12, 2021, 5:39 PM

#

vague stratus did you try ada-boosting the models?

I don't know what that is

desert oar Aug 12, 2021, 5:39 PM

#

use "SVM + RBF" in the graph so you don't forget (and other people don't wonder about it)

#

and yeah gradient boosting could be a good option

undone flare Aug 12, 2021, 5:39 PM

#

vague stratus while creating the confusion matrix cm you can use: cm = cm.astype('float') / c...

thanks

desert oar Aug 12, 2021, 5:40 PM

#

vague stratus did you try ada-boosting the models?

does boosting work with non-weak learners?

#

i'm not surprised the random forest didn't do well... feature splitting doesn't make sense on pixels, they're too "specific"

#

you need to extract bigger features like with PCA

#

then you can try something like ensembling the SVM-RBF with the PCA+RF setup

#

this is why CNNs are so cool, they learn useful features from "fine-grained" data like this

#

and why deep learning works so well on this kind of data, where the data is very "high resolution"

undone flare Aug 12, 2021, 5:43 PM

#

I was going to learn about CNNs, think this is the right time?

quiet vault Aug 12, 2021, 5:54 PM

#

ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: (None, 12)
X.shape = (118, 12, 1)
Y.shape = (118,)
The input shape of the model is input_shape(12, 1)
Does anyone know why I am getting this error

# Imports assumed to come from tf.keras; the original message did not show them.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Conv1D, Dense, Flatten, MaxPooling1D

n_steps, n_features = 12, 1  # matches the input shape stated above

model = Sequential()
model.add(Conv1D(128, 7, activation='relu', input_shape=(n_steps, n_features)))
model.add(Conv1D(128, 7, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=3, padding='same'))
model.add(Conv1D(256, 5, padding='same'))
model.add(Conv1D(256, 5, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=3, padding='same'))
model.add(Conv1D(512, 3, padding='same'))
model.add(Conv1D(512, 3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=3, padding='same'))
model.add(Conv1D(512, 1, padding='same'))
model.add(Conv1D(512, 1, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=3, padding='same'))
model.add(Flatten())
# Single sigmoid unit for binary classification.
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Here is the code for the model
can someone help

grave frost Aug 12, 2021, 7:42 PM

#

undone flare this is what I got

looks decent ig, but what's stochastic gradient descent lol

unborn glacier Aug 12, 2021, 7:50 PM

#

quiet vault ValueError: Input 0 of layer sequential is incompatible with the layer: : expect...

You need to define the input shape with one of these methods, I think

import numpy as np
import tensorflow as tf

# With explicit InputLayer.
model = tf.keras.Sequential([
  tf.keras.layers.InputLayer(input_shape=(4,)),
  tf.keras.layers.Dense(8)])
model.compile(tf.optimizers.RMSprop(0.001), loss='mse')
model.fit(np.zeros((10, 4)),
          np.ones((10, 8)))

# Without InputLayer, letting the first layer take the input_shape.
# Keras will add an input for the model behind the scenes.
model = tf.keras.Sequential([
  tf.keras.layers.Dense(8, input_shape=(4,))])
model.compile(tf.optimizers.RMSprop(0.001), loss='mse')
model.fit(np.zeros((10, 4)),
          np.ones((10, 8)))

#

https://www.tensorflow.org/api_docs/python/tf/keras/layers/InputLayer

#

That's just an example not the exact code

quiet vault Aug 12, 2021, 7:50 PM

#

i fixed

#

it was weird

#

thanks tho

grave frost Aug 12, 2021, 7:51 PM

#

most probably you put the wrong shapes
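
The error text ("expected min_ndim=3, found ndim=2. Full shape received: (None, 12)") suggests the array that actually reached the model was 2-D. The chat never states what the real fix was, so the following is only a sketch of one common remedy, adding a trailing feature axis, on a dummy array with the sizes from the question:

```python
import numpy as np

# Dummy data with the sizes from the question: 118 samples, 12 steps.
X_2d = np.zeros((118, 12))

# Conv1D expects (samples, steps, features), so add a features axis of size 1.
X_3d = X_2d[..., np.newaxis]  # equivalent to np.expand_dims(X_2d, axis=-1)

print(X_2d.shape, X_3d.shape)
```

Printing the shape of the exact array passed to `fit` or `predict` is usually the quickest way to confirm whether this is the cause.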

quiet vault Aug 12, 2021, 7:55 PM

#

maybe

glad mulch Aug 12, 2021, 8:37 PM

#

question i have data that looks like this. When i convert it to date time its in the format Year-01-01 but i want it to be Year-12-31, how would i go about doing that
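
The data referred to above is not visible in the log, so the sketch below assumes a pandas column of integer years. It is one possible approach, not necessarily the solution posted later:

```python
import pandas as pd

# Hypothetical column of years standing in for the data in the question.
years = pd.Series([2018, 2019, 2020])

# Build the date string explicitly so each value lands on 31 December.
year_end = pd.to_datetime(years.astype(str) + "-12-31")

print(year_end)
```

Appending the month and day before parsing avoids the default of 1 January that comes from parsing a bare year.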

iron basalt Aug 12, 2021, 8:39 PM

#

undone flare I was going to learn about CNNs, think this is the right time?

#

Note that MNIST is a trivial task and does not at all represent a real computer vision task.

#

Just about anything will work on it. Also some digits are mislabeled so keep that in mind. Don't expect 100% accuracy ever.

#

Because it is so simple, it does make for a good bug check.

#

Fashion MNIST and other datasets that use the MNIST name are more of a real task and you will notice the large drop in accuracy.

glad mulch Aug 12, 2021, 9:09 PM

#

my solution to above for anyone who cares

rigid zodiac Aug 12, 2021, 9:33 PM

#

Hey guys, sorry to keep asking this type of question. So this is what I have:
```
def spike_then_quiet(c, i, a, b):
    # Spike at row i on columns a and b, smaller values at i+1,
    # then |value| < 20 on both columns for rows i+2 .. i+7.
    return (abs(c[a].iloc[i]) >= 50 and abs(c[b].iloc[i]) >= 70
            and c[a].iloc[i + 1] < abs(c[a].iloc[i])
            and c[b].iloc[i + 1] < abs(c[b].iloc[i])
            and all(abs(c[a].iloc[i + k]) < 20 and abs(c[b].iloc[i + k]) < 20
                    for k in range(2, 8)))

c['cat'] = 0
cat_col = c.columns.get_loc('cat')
# Stop 7 rows before the end because the checks look ahead to row i+7.
for i in range(len(c) - 7):
    if spike_then_quiet(c, i, 'ay', 'az') or spike_then_quiet(c, i, 'ay', 'ax'):
        c.iloc[i, cat_col] = 1
```

How can I also set c['cat'].iloc[i+1] and so on up to i+7 to 1?
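
One possible way to also flag the rows that follow each detected spike, sketched with NumPy on a plain 0/1 array. `extend_marks` is a hypothetical helper, and the default window of 7 follow-up rows is taken from the question:

```python
import numpy as np

def extend_marks(hits, width=7):
    # Mark every hit plus the `width` rows that come right after it.
    hits = np.asarray(hits, dtype=bool)
    out = hits.copy()
    for k in range(1, width + 1):
        # Shift the hits down by k rows and merge them in.
        out[k:] |= hits[:-k]
    return out.astype(int)

marks = extend_marks([0, 1, 0, 0, 0, 0], width=2)
print(marks)
```

Assuming the `cat` column already holds only 0 and 1, something like `c['cat'] = extend_marks(c['cat'].to_numpy())` would apply it after the detection loop has finished, so the extended marks do not interfere with the detection itself.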

umbral ferry Aug 12, 2021, 9:46 PM

#

so my model is fitting really really well to the training data, but it's also fitting well to the test data (RMSE on train is 1, on test it's 6). Is that ok? Or do I want to reduce how well it fits the training data?

modest mulch Aug 12, 2021, 10:07 PM

#

Hiya, does anyone know the simplest way to model a "gesture is not among known gestures" label in hand gesture recognition? A threshold is one way but it's not really that robust. I thought of estimating uncertainty using bayesian neural networks but was wondering if there happened to be a simpler, fairly robust method?
would using sigmoid instead of softmax for the last layer help? if all the probabilities are less than 0.5, it means either there was no gesture, or the gesture is not known well enough?

grave frost Aug 12, 2021, 10:23 PM

#

modest mulch Hiya, anyone knows what is the simplest way to kind of model like a "gesture i...

add another category

modest mulch Aug 12, 2021, 10:23 PM

#

grave frost add another category

a bit difficult for a huge dataset, I'm already using a pre trained model

#

I would have to modify the dataset, and retrain the model

hasty mountain Aug 12, 2021, 10:47 PM

#

Hey guys, I want to read a video using matplotlib.image, can someone give me an idea on how to do that?
I've tried using imageio, which can use a reader and then iterate through the reader to get frames and an array with the pixels. However, I gave up using this library because it doesn't return a proper array that I can use in my algorithms.

Here's the code I've used so far:

data = imageio.get_reader(r'video_sample.mp4', 'ffmpeg')

for frame, rgb in enumerate(data):
    X = rgb
    y = frame

I'm out of ideas on how to iterate through a video using matplotlib.image

velvet thorn Aug 12, 2021, 10:47 PM

#

desert oar does boosting work with non-weak learners?

it shouldn’t

#

because each successive learner increases variance and decreases bias

#

if you start with strong learners you’re probably going to get mad overfitting

grave frost Aug 12, 2021, 10:52 PM

#

modest mulch a bit difficult for a huge dataset, I'm already using a pre trained model

should have planned it out before then - that sort of thing can only be interpreted by the confidence values
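
A minimal sketch of rejecting a prediction based on its confidence value. The threshold of 0.8, the logits and the gesture names are all made up, and as noted earlier in the conversation a plain threshold is not very robust, so treat this as a baseline only:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_or_reject(logits, labels, threshold=0.8):
    # Return the top label, or "unknown" when the model is not confident enough.
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return labels[best], probs[best]

labels = ["fist", "palm", "point"]  # hypothetical gesture classes
confident = predict_or_reject([5.0, 0.0, 0.0], labels)
unsure = predict_or_reject([1.0, 0.9, 0.8], labels)
print(confident, unsure)
```

The threshold would normally be picked on held-out data that includes examples of unknown gestures, since softmax outputs can be overconfident on inputs far from the training set.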

grave frost Aug 12, 2021, 10:53 PM

#

hasty mountain Hey guys, I want to read a video using matplotlib.image, can someone give me an ...

that's not how enumerate works

#

it simply converts the data you are iterating over to a tuple while providing its count as the second elem

#

at least, that's what I understand ¯_(ツ)_/¯

hasty mountain Aug 12, 2021, 10:56 PM

#

grave frost that's not how `enumerate` works

Enumerate is working fine and it returns an array, but it returns an imageio array, not a "proper" array

velvet thorn Aug 12, 2021, 10:57 PM

#

hasty mountain Hey guys, I want to read a video using matplotlib.image, can someone give me an ...

why did this not work for you?

#

ah okay

#

wait hold up

velvet thorn Aug 12, 2021, 10:58 PM

#

hasty mountain Enumerate is working fine and it returns an array, but it returns an imageio arr...

it should return a subclass

#

which you can use as if it were a numpy array

velvet thorn Aug 12, 2021, 10:58 PM

#

grave frost it simply converts the data your are interating over to a tuple while providing ...

what do you mean by this

grave frost Aug 12, 2021, 11:02 PM

#

velvet thorn what do you mean by this

as in RGB = the index of the iterator?

#

does that provide any useful information?

velvet thorn Aug 12, 2021, 11:02 PM

#

grave frost as in RGB = the index of the iterator?

rgb is the array representing each frame

grave frost Aug 12, 2021, 11:03 PM

#

velvet thorn `rgb` is the array representing each frame

I thought it was the counter 🤔

hasty mountain Aug 12, 2021, 11:04 PM

#

velvet thorn which you can use as if it were a `numpy` array

When I use print(type(X)) it returns <class 'imageio.core.util.Array'>. If I try using X in my neural network, it returns the following error:
ValueError: Failed to find data adapter that can handle input: <class 'imageio.core.util.Array'>

velvet thorn Aug 12, 2021, 11:04 PM

#

hasty mountain When I use `print(type(X))` it returns `<class 'imageio.core.util.Array'`. If I ...

convert it then

grave frost Aug 12, 2021, 11:04 PM

#

can you print the shapes?

velvet thorn Aug 12, 2021, 11:04 PM

#

np.asarray
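
A sketch of what `np.asarray` does to an ndarray subclass. `FrameArray` is a made-up stand-in for the imageio array type, so imageio itself is not needed to run it:

```python
import numpy as np

class FrameArray(np.ndarray):
    """Hypothetical ndarray subclass standing in for the imageio frame type."""

# A dummy 2x3 RGB frame viewed as the subclass.
frame = np.zeros((2, 3, 3), dtype=np.uint8).view(FrameArray)

# np.asarray returns a base-class ndarray with the same data and shape.
plain = np.asarray(frame)

print(type(frame).__name__, type(plain).__name__, plain.shape)
```

`np.asanyarray` would keep the subclass instead, so `np.asarray` is the one to use when a library checks for the exact `ndarray` type.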

velvet thorn Aug 12, 2021, 11:04 PM

#

grave frost I thought it was the counter 🤔

no, index comes first
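
A quick illustration of the order `enumerate` yields, using placeholder strings in place of video frames:

```python
frames = ["frame-a", "frame-b", "frame-c"]  # placeholders for real frame arrays

# enumerate yields (index, item) pairs, with the index first.
pairs = list(enumerate(frames))

for index, frame in pairs:
    print(index, frame)
```

So in `for frame, rgb in enumerate(data)`, `frame` is the running index and `rgb` is the pixel array for that frame.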

grave frost Aug 12, 2021, 11:05 PM

#

velvet thorn no, index comes first

ahh, lightbulb

velvet thorn Aug 12, 2021, 11:05 PM

#

hasty mountain When I use `print(type(X))` it returns `<class 'imageio.core.util.Array'`. If I ...

this is one of those dynamic typing things I guess

#

🙏

hasty mountain Aug 12, 2021, 11:06 PM

#

velvet thorn convert it then

I've tried that, but I'll try again, just to make sure