warm verge Mar 15, 2022, 1:53 PM

#

Can you send your variables?

#

Like copy and paste a sample

#

Are you trying to convert your labels to floats?

lapis sequoia Mar 15, 2022, 2:11 PM

#

hey, i was wondering how could i make labels and test images ? because ever since i started learning they're given to me

upper spindle Mar 15, 2022, 2:18 PM

#

can someone help me with my lstm model, this is what it predicts

#

import tensorflow as tf
from tensorflow.keras import Sequential

lstm_5 = Sequential([
    tf.keras.layers.InputLayer(input_shape=[n_past, n_dims]),

    # ADDING 1st LSTM LAYER
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.Dropout(0.2),

    # ADDING 2nd LSTM LAYER
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dropout(0.2),

    # DENSE OUTPUT LAYER
    tf.keras.layers.Dense(1)
])

lstm_5.compile(loss='mse', optimizer="adam")

multivariate = lstm_5.fit(mat_X_train, mat_y_train, 
                  validation_split=0.2, shuffle=True,
                  verbose=0, batch_size=batch_size, epochs=200)

# FORECASTING ON VALIDATION SET
multivariate_prediction = lstm(lstm_5, validation_index)

# SCALING OUTPUT TO MINMAXSCALER FITTED TO TRAINING CURRENT VOLUME
multivariate_prediction_scaled = scale(vol_scaler, multivariate_prediction)

#

have i used too many epochs or are my units incorrect

dusk tide Mar 15, 2022, 2:25 PM

#

Has anyone tried the book The Hundred-Page Machine Learning Book by Andriy Burkov?

regal gale Mar 15, 2022, 2:42 PM

#

any kind soul can help to give some feedback to a self-check assignment from a regression textbook? Unfortunately I have to pay for the answer and I am not willing to, hopefully someone can let me know if there's any glaring issue

woven fractal Mar 15, 2022, 3:19 PM

#

Any newbies want to do some NumPy practice? I was thinking about doing some live coding.

civic ivy Mar 15, 2022, 3:58 PM

#

So question. i want to have it so an AI does things repeatedly (wake up, go to the window, open fridge, sleep, repeat), but doing so brings down a variable called happiness. then i want another map like (wake up, then leave), and i want the AI to choose this one when happiness reaches a certain point, while still having a chance to not choose it. Ik i could do something like

if happi == 0:
  map1 = False
  map2 = True
  while map2:
    ...  # other code stuff

but i want the AI to choose map2 instead of doing map1. I am wondering if this is possible?
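
One way to read the question above is as a random choice whose odds depend on happiness. The sketch below is only an illustration under assumptions: the names `choose_routine` and `simulate`, the threshold of 3, the 70% chance, and the rule that leaving resets happiness are all made up for the example, not taken from the project.

```python
import random

def choose_routine(happiness, threshold=3, leave_chance=0.7, rng=random):
    """Return "map2" (wake up, leave) with probability leave_chance once
    happiness is at or below threshold; otherwise return "map1"."""
    if happiness <= threshold and rng.random() < leave_chance:
        return "map2"
    return "map1"

def simulate(days=10, happiness=5, seed=0):
    """Run the daily choice for a number of days and return the routines picked."""
    rng = random.Random(seed)
    history = []
    for _ in range(days):
        routine = choose_routine(happiness, rng=rng)
        history.append(routine)
        if routine == "map1":
            happiness = max(0, happiness - 1)  # repeating the routine lowers happiness
        else:
            happiness = 5                      # assumption: leaving resets happiness
    return history

print(simulate())
```

Because the choice is a probability rather than a hard `if`, the agent can stay on map1 for a while even after happiness has dropped, which seems to be the "chance to not choose it" part.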

serene scaffold Mar 15, 2022, 4:01 PM

#

civic ivy So question. i want to have it so an AI does things repeatedly(wake up, go to th...

this doesn't really sound like AI in the sense that this channel is concerned with AI. If something just follows the same steps repeatedly, that's just a regular program.

regal gale Mar 15, 2022, 4:06 PM

#

Anyone know autocorrelation function (ACF) and partial autocorrelation function (PACF) #🤡help-banana

civic ivy Mar 15, 2022, 4:06 PM

#

that is a good point, but if it was possible i was going to see if i could use this in my project SLOAM as a way to emulate a form of emotion that is caused by repetitive actions. SLOAM is a Self Learning Optical Auditory Machine. it will learn from, well, optical and auditory responses.

bitter pilot Mar 15, 2022, 4:10 PM

#

Hello Everyone,
I have a data science use case (I am not a super beginner tho), however I need some guidance on where to start/look.
I have a dataset of people with some work information (department, skills, responsibilities, etc.) and also the trainings they have taken.
I need to train a system that can suggest to new or even existing employees which training they should take, based on the model.
For example, if I switched from the Junior to the Senior range, the system would recommend which trainings I should follow.
Any pointer in the right direction would be useful

odd meteor Mar 15, 2022, 4:11 PM

#

Just a quick question. What makes you believe your clustering "didn't make sense"?

Your observation/answer to this will determine the kind of solution I'll suggest you try.

urban lance Mar 15, 2022, 4:16 PM

#

odd meteor Just a quick question. What makes you believe your clustering "didn't make sense...

I made new datasets by their clusters and did a profile report on them.
I compared the values of every feature and they didn't differ all that much

for ex. "days_since_last_visit" ranged from 0-13 in one cluster and from 0-7 in another

While I would expect some division

odd meteor Mar 15, 2022, 4:17 PM

#

It could possibly be the problem of exploding gradient.

regal gale Mar 15, 2022, 4:19 PM

#

Anyone know autocorrelation function (ACF) and partial autocorrelation function (PACF)

odd meteor Mar 15, 2022, 4:20 PM

#

urban lance I made new datasets by their clusters and did a profile report on them. I compar...

Which clustering algorithm did you use? KMeans?

urban lance Mar 15, 2022, 4:20 PM

#

no not KMeans, it wasn't well suited for my problem

#

I used Chi2 distance with gaussian mixture models and hierarchical clustering (My data was/is count of the occurrences within a time interval)

regal gale Mar 15, 2022, 4:20 PM

#

I need some help with the autocorrelation function (ACF) and partial autocorrelation function (PACF)

upper spindle Mar 15, 2022, 4:27 PM

#

regal gale I need some help with the autocorrelation function (ACF) and partial autocorre...

im an econ student, have done time series last year, if that helps

regal gale Mar 15, 2022, 4:28 PM

#

@upper spindle omg life savior

upper spindle Mar 15, 2022, 4:28 PM

#

wasnt the best at it, but i understood acf and pacf partially

regal gale Mar 15, 2022, 4:28 PM

#

Do you have 30 mins or smth #🤡help-banana

odd meteor Mar 15, 2022, 4:28 PM

#

urban lance I used Chi2 distance with gaussian mixture models and hierarchical clustering (My...

Hmmm I have not used the chi-square distance before ... Probably it's the case of not using the right number of clusters. I usually would use KMeans to get the right number of clusters and re-confirm using Silhouette plot or dendrogram from hierarchical clustering.
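
The "pick the number of clusters with KMeans, then confirm with a silhouette score" idea can be sketched as below. This is a minimal illustration on made-up toy data, not urban lance's setup: the helper name `best_k_by_silhouette` and the blobs are hypothetical, and it uses plain Euclidean KMeans. For a custom distance such as chi-square, `silhouette_score` also accepts a precomputed distance matrix via `metric="precomputed"`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_k_by_silhouette(X, k_values=range(2, 7), seed=0):
    """Fit KMeans for each candidate k and return the k with the highest
    mean silhouette score, together with all scores."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Hypothetical toy data: three well-separated 2-D blobs
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

best_k, scores = best_k_by_silhouette(X)
print(best_k, scores)
```

If every candidate k scores low on the real data, that is itself a hint that the features may not separate into clear groups.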

odd meteor Mar 15, 2022, 4:31 PM

#

urban lance I used Chi2 distance with gaussian mixture models and hierarchical clustering (My...

Alternatively, try using DBSCAN algorithm for your clustering and check if it's much better suited for the kind of data you have.

urban lance Mar 15, 2022, 4:36 PM

#

DBSCAN had even worse results 😅

#

I've now tampered some more with the features, tomorrow I'll see if it worked or not

regal gale Mar 15, 2022, 4:45 PM

#

Anyone know autocorrelation function (ACF) and partial autocorrelation function (PACF)

odd meteor Mar 15, 2022, 4:46 PM

#

urban lance DBSCAN had even worse results 😅

If your data is high-dimensional, try reducing it with t-SNE before applying your preferred clustering algorithm. Maybe you'll get a more customer-friendly result that way.

Nonetheless, may the force be with you 😊

urban lance Mar 15, 2022, 4:49 PM

#

I read that t-SNE might find false patterns

odd meteor Mar 15, 2022, 4:50 PM

#

urban lance I read that t-SNE might find false patterns

It doesn't find patterns. It's used for dimensionality reduction.

urban lance Mar 15, 2022, 4:51 PM

#

I mean that it might not be able to reconstruct the data in 2d

odd meteor Mar 15, 2022, 4:53 PM

#

urban lance I mean that it might not be able to reconstruct the data in 2d

If the data was in 2d you wouldn't need to decompose the data in the first place.

#

You can use PCA as well but t-SNE is better than PCA.
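
For reference, the PCA option mentioned above can be sketched in a few lines of NumPy. This is an illustration under assumptions: `pca_reduce` and the toy data are hypothetical, and which method is "better" depends on the goal. t-SNE is mostly used for visualization and its output distances between groups are not reliably meaningful, while PCA is linear and reports how much variance each component keeps.

```python
import numpy as np

def pca_reduce(X, n_components=2):
    """Project the rows of X onto the top principal components using SVD.
    Returns the projected data and the explained-variance ratio per component."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    return X_centered @ Vt[:n_components].T, explained[:n_components]

# Hypothetical data: 200 samples, 10 features, most variance in 2 latent directions
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2)) * np.array([5.0, 3.0])
X = latent @ rng.normal(size=(2, 10)) + rng.normal(scale=0.1, size=(200, 10))

Z, explained = pca_reduce(X)
print(Z.shape, explained.sum())
```

The reduced array `Z` can then be handed to whichever clustering algorithm is being tried.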

regal gale Mar 15, 2022, 5:03 PM

#

Anyone know autocorrelation function (ACF) and partial autocorrelation function (PACF)

lapis sequoia Mar 15, 2022, 5:58 PM

#

odd meteor It could possibly be the problem of exploding gradient.

i do remember model giving nice values(and decreasing loss more and more)

#

so it could be a vanishing gradient too right?

misty flint Mar 15, 2022, 6:30 PM

#

learning that i can replace pandas' default matplotlib viz library with the plotly one has greatly improved my mood

#

DoggoKek

#

whoever came up with that feature. i am glad. shiroGomen

spiral gale Mar 15, 2022, 6:33 PM

#

hey, i am stuck with assigning weights to my multinomial one-vs-rest logistic regression with sklearn. I know that I can assign weights but how would I go about it in a multilabel setting with e.g. seven possible outcomes / labels?
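
A possible starting point, assuming "seven possible outcomes" means each sample has exactly one of seven classes: sklearn's `LogisticRegression` takes a `class_weight` argument that is either a dict mapping class to weight or the string `"balanced"`. The sketch below computes the same inverse-frequency weights by hand so the numbers are visible; the helper name and the label counts are invented for the example.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so rarer classes get larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical labels for seven classes with uneven frequencies
y = [0] * 50 + [1] * 20 + [2] * 10 + [3] * 10 + [4] * 5 + [5] * 3 + [6] * 2
weights = balanced_class_weights(y)
print(weights)
```

The resulting dict could be passed as `class_weight=weights`, or edited by hand first if some classes matter more than their frequency suggests.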

regal gale Mar 15, 2022, 6:37 PM

#

Anyone familiar with bootstrapping (sampling with replacement)

frosty jackal Mar 15, 2022, 6:52 PM

#

Which platform is good for data science colab or jupyter

cosmic pewter Mar 15, 2022, 6:53 PM

#

i'm VERY NEW to tensorflow

and I want to make a rock paper scissors game using tensorflow (basically just detect rock, paper or scissors)
I've trained my model but i wonder how can i trigger a function when the CONFIDENCE is above 70% or something like that, like a returned list with data that I can check
||or maybe it already returns something i can check, like a confidence level, but i just don't know how to access it, educate me pls||

anyone have tutorials or article about this matter?

there's a ref : this project use tensorflow trained model to check for laughing face and trigger "LOSE"
https://github.com/andypotato/do-not-laugh

**
tl;dr : wanna make a rock paper scissors game using my already trained model
how can i check the confidence score so i can use it to trigger win/lose

please suggest what i need to know and what i need to learn
**
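
On the confidence question: a classifier whose last layer is a softmax typically returns one score per class for each input, and the largest score can be treated as the confidence. The sketch below shows only that thresholding step in plain Python. The class order, the raw scores and the function names are assumptions for illustration; with a Keras model the scores would come from something like `model.predict(...)`.

```python
import math

CLASS_NAMES = ["rock", "paper", "scissors"]  # assumption: order of the model's outputs

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [x - max(logits) for x in logits]  # subtract max for numerical stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def confident_prediction(probs, threshold=0.7):
    """Return (label, confidence) if the top class reaches the threshold,
    otherwise (None, confidence)."""
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] >= threshold:
        return CLASS_NAMES[best], probs[best]
    return None, probs[best]

probs = softmax([0.2, 3.1, 0.4])  # hypothetical raw model output for one frame
label, confidence = confident_prediction(probs)
if label is not None:
    print(f"detected {label} ({confidence:.0%})")
```

When `label` is `None` the game can simply ignore the frame and wait for a clearer one.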

GitHub

GitHub - andypotato/do-not-laugh: A simple AI game based on Vue.js ...

A simple AI game based on Vue.js and Electron. Contribute to andypotato/do-not-laugh development by creating an account on GitHub.

prisma mist Mar 15, 2022, 7:04 PM

#

any way to make pytesseract.image_to_string() faster?

jolly knoll Mar 15, 2022, 8:20 PM

#

Guys, how do you showcase a Tableau workbook when you no longer have Tableau?

tacit basin Mar 15, 2022, 8:34 PM

#

jolly knoll Guys, how do you showcase a Tableau workbook when you no longer have Tableau?

there is tableau community or something like that which is free

karmic valley Mar 15, 2022, 8:35 PM

#

#data-science-and-ml message

#

please

#

if anyone could help me

hollow sentinel Mar 15, 2022, 8:45 PM

#

damnit

#

dead kernel

#

why

serene scaffold Mar 15, 2022, 9:06 PM

#

because fuck notebooks, that's why

hollow sentinel Mar 15, 2022, 9:11 PM

#

https://youtu.be/6D33x_eQt2U

YouTube

GalacticsTutorials

The kernel appears to have died. It will restart automatically. [MA...

Apologies for the audio distortion in the beginning of the video: I am a robot.

In this video, I'll show you how to fix the error message "The kernel appears to have died. It will restart automatically." in JupyterNotebook if you've recently updated anaconda.

My issue is with MATPLOTLIB, and was fixed by typing in the anaconda terminal: conda...


#

bless

#

i created a new environment and then just copy pasted that and it worked

iron basalt Mar 15, 2022, 9:32 PM

#

odd meteor You can use PCA as well but t-SNE is better than PCA.

You can also try UMAP: https://umap-learn.readthedocs.io/en/latest/ @urban lance

#

https://www.youtube.com/watch?v=nq6iPZVUxZU

YouTube

Enthought

UMAP Uniform Manifold Approximation and Projection for Dimension Re...

This talk will present a new approach to dimension reduction called UMAP. UMAP is grounded in manifold learning and topology, making an effort to preserve the topological structure of the data. The resulting algorithm can provide both 2D visualisations of data of comparable quality to t-SNE, and general purpose dimension reduction. UMAP has been...


graceful glacier Mar 15, 2022, 10:56 PM

#

what would be the best method of getting the second column given the first column

tall blaze Mar 15, 2022, 10:57 PM

#

graceful glacier what would be the best method of getting the second column given the first colum...

Is this in pandas?

graceful glacier Mar 15, 2022, 10:57 PM

#

yes

misty flint Mar 15, 2022, 10:58 PM

#

serene scaffold because fuck notebooks, that's why

why the hate for notebooks ID_BoomKek

#

i mean theyre obv not meant for dev or production

tall blaze Mar 15, 2022, 10:58 PM

#

I would separate the text column by the space delimiter. Then write a lambda function to multiply the newly created numeric column based on if the first column is minus or not

misty flint Mar 15, 2022, 10:58 PM

#

but theyre decent for experiments

tall blaze Mar 15, 2022, 10:58 PM

#

tall blaze I would separate the text column by the space delimiter. Then write a lambda fun...

Does this help or do you want to see code?

serene scaffold Mar 15, 2022, 10:59 PM

#

@misty flint we were shitting on notebooks in #pedagogy earlier

misty flint Mar 15, 2022, 10:59 PM

#

DoggoKek

#

honestly they can be kinda annoying tbh

serene scaffold Mar 15, 2022, 11:00 PM

#

~~honestly~~ they ~~can~~ be ~~kinda~~ annoying tbh

misty flint Mar 15, 2022, 11:00 PM

#

kekHands

graceful glacier Mar 15, 2022, 11:01 PM

#

tall blaze Does this help or do you want to see code?

i dont think the code is necessary

#

thats along the lines of what i was thinking as well

misty flint Mar 15, 2022, 11:01 PM

#

maybe they will die out in the future when better tooling comes out

#

blobhyperthink

tall blaze Mar 15, 2022, 11:08 PM

#

graceful glacier i dont think the code is necessary

Snap I started writing it. I’m on my phone and it’s not the most efficient but it gets the job done:
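import pandas as pd

# Split each string on the space into a sign word and a number
df[['col1', 'col2']] = df['string'].str.split(' ', expand=True)

# Convert the numeric part from text to numbers
df['col2'] = pd.to_numeric(df['col2'])

# Map the sign word to -1 or 1
df['col1'] = df['col1'].apply(lambda x: -1 if x == 'Minus' else 1)

# Signed result
df['final_col'] = df['col1'] * df['col2']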

graceful glacier Mar 15, 2022, 11:09 PM

#

thanks

grave frost Mar 15, 2022, 11:10 PM

#

hollow sentinel dead kernel

do cpr

hollow sentinel Mar 15, 2022, 11:10 PM

#

grave frost do cpr

HAHAHAHHA

#

that made me almost spit out my coffee

#

good one

#

CPR failed, trying paddles

grave frost Mar 15, 2022, 11:10 PM

#

thanks lol

grave frost Mar 15, 2022, 11:11 PM

#

serene scaffold > ~~honestly~~ they ~~can~~ be ~~kinda~~ annoying tbh

honestly they can be kinda annoying tbh

I don't get the hate towards notebooks, they're lightweight, useful and pretty nice. if one wants to go hardcore, just use git with sublime text?

tall blaze Mar 15, 2022, 11:12 PM

#

grave frost > honestly they can be kinda annoying tbh I don't get the hate towards noteb...

Crashing a kernel > crashing your computer

grave frost Mar 15, 2022, 11:12 PM

#

tall blaze Crashing a kernel > crashing your computer

facts

#

if you mean to say its more convenient to crash a kernel

tall blaze Mar 15, 2022, 11:14 PM

#

grave frost if you mean to say its more convenient to crash a kernel

Lol yea, for standard development they might not be as useful. But for handling big data it’s a lifesaver to run on a kernel

grave frost Mar 15, 2022, 11:15 PM

#

I find colab more useful. the fact that I can totally mess my env up by downloading 10 versions of torch and do a simple reset blows my mind

tall blaze Mar 15, 2022, 11:15 PM

#

grave frost I find colab more useful. the fact that I can totally mess my env up by download...

I like colab but with their pricing lately you might as well just open up a Databricks account

grave frost Mar 15, 2022, 11:16 PM

#

a big F U to docker nerds, nerding over why learning 100 commands and debugging errors is better than colab

grave frost Mar 15, 2022, 11:16 PM

#

tall blaze I like colab but with their pricing lately you might as well just open up a data...

I doubt it

#

colab pro+ is heavily undercutting its competition

tall blaze Mar 15, 2022, 11:17 PM

#

grave frost I doubt it

Maybe, depending on if you are using it a ton. I have a databricks account with aws cloud and never go over 50 a month for my personal stuff

misty flint Mar 15, 2022, 11:18 PM

#

tall blaze Maybe, depending on if you are using it a ton. I have a databricks account with ...

ZoomEyes

#

takes notes

#

blobpoll

tall blaze Mar 15, 2022, 11:18 PM

#

Just be superrrrr careful to terminate clusters when you're done

grave frost Mar 15, 2022, 11:18 PM

#

tall blaze Maybe, depending on if you are using it a ton. I have a databricks account with ...

so...its just AWS but with a better UI

tall blaze Mar 15, 2022, 11:19 PM

#

Pretty much

grave frost Mar 15, 2022, 11:19 PM

#

useless

#

AWS costs a shitton

misty flint Mar 15, 2022, 11:19 PM

#

databricks has a ton of ML specific stuff tho

#

blobhyperthink

grave frost Mar 15, 2022, 11:19 PM

#

GCP is soo cheap

#

or maybe y'all are just rich

tall blaze Mar 15, 2022, 11:20 PM

#

misty flint blobhyperthink

True but the data storage

misty flint Mar 15, 2022, 11:20 PM

#

i mean i think 50/mo is cheap

tall blaze Mar 15, 2022, 11:20 PM

#

Heaven

grave frost Mar 15, 2022, 11:20 PM

#

well, depends

tall blaze Mar 15, 2022, 11:20 PM

#

Spark sql pass throughs with hive storage

grave frost Mar 15, 2022, 11:20 PM

#

Google's TPUs are dirt cheap, and AWS doesn't do much for pricing for A100s

tall blaze Mar 15, 2022, 11:21 PM

#

grave frost Google's TPUs are dirt cheap, and AWS doesn't do much for pricing for A100s

It’s not terrible

#

But if you're training all bloody month go with colab, although I think they have ways of limiting usage

grave frost Mar 15, 2022, 11:21 PM

#

what are you training that takes a month?

#

even GPT2 took a few weeks

tall blaze Mar 15, 2022, 11:22 PM

#

grave frost what are you training that takes a month?

I meant multiple projects

grave frost Mar 15, 2022, 11:22 PM

#

I suppose

#

Colab is for experimentation anyways, which is why I love the colab+kaggle combo

#

got the data https://cloud-gpus.com/

Cloud GPU Comparison

Cloud GPU Price and Feature Comparison

#

GCP is just objectively cheap all around. it has a pretty UI, helpful support and plenty of integrations

tall blaze Mar 15, 2022, 11:26 PM

#

I think google is trying to capture space in the market, they’ll prolly jack up prices

#

Just don’t use Microsoft lol

grave frost Mar 15, 2022, 11:26 PM

#

¯_(ツ)_/¯

#

I doubt it

#

I think they're just getting more competitive with thinner margins - and their TPUs ofc

tall blaze Mar 15, 2022, 11:26 PM

#

Maybe

grave frost Mar 15, 2022, 11:27 PM

#

TPUs are a pain to get working, but they just outperform every single GPU out there no biggie

#

they're criminally underrated IMO

#

(but I hope it stays that way lol)

serene scaffold Mar 15, 2022, 11:47 PM

#

grave frost > honestly they can be kinda annoying tbh I don't get the hate towards noteb...

the nonlinear execution order and discouragement of modularity.

iron basalt Mar 15, 2022, 11:49 PM

#

grave frost (but I hope it stays that way lol)

I would say that they are everywhere now due to being part of Apple's new SoCs. But in typical Apple fashion they are very closed off in terms of access. You have to use their own tools to do specific operations on it while in reality it's pretty generic and could probably support something like OpenCL.

iron basalt Mar 15, 2022, 11:50 PM

#

serene scaffold the nonlinear execution order and discouragement of modularity.

The thing is that that kind of thing where you can visualize stuff and have code everywhere in random spots is better served by functional reactive programming / declarative programming.

#

Since it all readjusts as you change anything then.

#

(see Enso)

#

(or shader graphs in Blender or flow graphs in Unreal, etc)

serene scaffold Mar 15, 2022, 11:52 PM

#

regardless, I've decided to position myself as the anti-notebook, anti-jupyter guy to spread awareness of their limitations/issues.

iron basalt Mar 15, 2022, 11:52 PM

#

serene scaffold regardless, I've decided to position myself as the anti-notebook, anti-jupyter g...

I agree, they are kind of like a bad repl in a way.

#

Rather than what they actually want to be which is this pipes and filters thing.

inland mantle Mar 16, 2022, 12:03 AM

#

@serene scaffold This is just a personal question. Because you are a data scientist with linguistics, do you work on NLP?

raven cloud Mar 16, 2022, 12:35 AM

#

how would you go about counting objects (with tensorflow for example) in a video where the camera is moving randomly ?
As an example , say the camera is pointing towards an object and counts it and afterwards it moves away from that object facing anywhere else away from the object and later faces again to the same object but you would not add it to the count since its already been counted.

karmic valley Mar 16, 2022, 12:49 AM

#

i have an image and i want python to work out the average pixel intensity below the blue line and the average pixel intensity above the blue line. I know the code to work out the average pixel intensity of the full image, but don't know how to do it for below and above the blue line. If you have any ideas please help. Another way would be to get python to split the image into 2, one with everything below the blue line and the other with everything above it, then i can run my code on each. But i don't know how to do that

#

just @ me if you have any thoughts

misty flint Mar 16, 2022, 1:00 AM

#

~~fourier transformation~~

#

RunFail

#

jk

#

idk anything about signal processing

#

even tho i took one class

#

made me avoid it more monkaCHRIST

elfin jungle Mar 16, 2022, 1:02 AM

#

Got a machine learning question for you guys
I have an airbnb data set that has prices of properties throughout the year (300+ entries for a single ID). Applying ml on the data would result in heavy overfitting and not capture the true goal of measuring price change throughout the year. I know in R there is lm.clustering which accounts for multiple entries, is there any equivalent in python? @tender hearth

serene scaffold Mar 16, 2022, 1:03 AM

#

inland mantle @serene scaffold This is just a personal question. Because you are a data ...

Yes

misty flint Mar 16, 2022, 1:07 AM

#

serene scaffold Yes

stelercus i know you might be biased and may or may not be able to look into the future, but do you think specializing in NLP is good in terms of future job growth or should i try to specialize in another set of ML algos 🔮

#

PikaThink

karmic valley Mar 16, 2022, 1:18 AM

#

misty flint idk anything about signal processing

okay no problem thanks for at least replying

misty flint Mar 16, 2022, 1:20 AM

#

karmic valley okay no problem thanks for at least replying

i joke but seriously maybe look into the direction of signal processing bc im like 75% sure your answer is at least in that direction

desert oar Mar 16, 2022, 1:30 AM

#

karmic valley i have an image and i want python to work out the average pixel intensity below ...

there is probably something in cv2 that can help segment this image

#

heck you can probably even do it in photoshop, if you don't need to automate it for more than one image

#

then you can literally just look at the pixel rgb values

#

of course you can use gimp if you don't want to use adobe proprietary shitware 🙂
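
If it does need automating, the blue-line question can be approached with a color mask in NumPy. This is a sketch under assumptions: the image is taken to be an RGB array (cv2 loads images as BGR, so channels would need reordering), the line is assumed to be close to pure blue, and the function name and thresholds are made up for the example.

```python
import numpy as np

def mean_intensity_above_below(img, blue_min=200, other_max=80):
    """img: H x W x 3 uint8 RGB array containing a roughly blue line.
    Returns (mean intensity above the line, mean intensity below it)."""
    r, g, b = (img[..., i].astype(int) for i in range(3))
    is_blue = (b >= blue_min) & (r <= other_max) & (g <= other_max)
    gray = img.mean(axis=2)                    # simple per-pixel intensity
    rows = np.arange(img.shape[0])[:, None]    # column vector of row indices

    has_line = is_blue.any(axis=0)             # columns where the line was found
    top = np.argmax(is_blue, axis=0)           # first blue row in each column
    bottom = img.shape[0] - 1 - np.argmax(is_blue[::-1], axis=0)  # last blue row

    above = (rows < top) & has_line            # pixels above the line
    below = (rows > bottom) & has_line         # pixels below the line
    return gray[above].mean(), gray[below].mean()

# Synthetic test image: bright top, dark bottom, 2-pixel blue line in between
img = np.zeros((100, 60, 3), dtype=np.uint8)
img[:40] = 200
img[42:] = 50
img[40:42] = (0, 0, 255)

above, below = mean_intensity_above_below(img)
print(above, below)
```

Working column by column means the line does not have to be straight or horizontal, as long as it crosses each column that should be counted.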

carmine oasis Mar 16, 2022, 1:58 AM

#

can someone help in #help-potato

neon imp Mar 16, 2022, 2:31 AM

#

misty flint stelercus i know you might be biased and may or may not be able to look into the...

specialising in ML algos is a really bad idea

#

unless you are postdoc

#

my experience is that it's very hard to find work as a specialist

#

there are too many postdoc kaggle grandmasters for rent

#

I think people have an extremely tilted idea of how competitive the post doc grind is and how much talent there is in the market atm

#

ML algorithm work is being advanced by huge research teams, and the impact of a lone ranger in the field is becoming super low

iron basalt Mar 16, 2022, 2:35 AM

#

neon imp ML algorithm work is being advanced by huge research teams, and the impact of a ...

It's as high as ever if you think outside the box, but if you want to do incremental improvements then yeah that is already done in parallel by large teams.

#

(Actually applies to more than just ML)

#

(incremental vs different axis)

neon imp Mar 16, 2022, 2:37 AM

#

I think you underestimate how hard it is to do research outside a research group

#

of highly motivated peers

#

but that's sorta not industry

#

I think in terms of industry the applications of NLP are extremely saturated right now kinda. The capacity for NLP projects is maxxed out.

iron basalt Mar 16, 2022, 2:38 AM

#

I don't but I know it's hard for many to do so. Solo is not only hard for motivation, but also confidence. You may fail and have wasted your time or not, but if you can get over that fear it's fine.

neon imp Mar 16, 2022, 2:39 AM

#

To be successful in research you need to know 100 other successful researchers

#

just to help you course correct etc

#

and learn of opportunity

#

and yeah, motivation confidence etc

iron basalt Mar 16, 2022, 2:40 AM

#

Eh, not exactly. You have one more trick up your sleeve and that is the results. If it clearly works (and you made sure it actually works and are not tricking yourself) then it works.

neon imp Mar 16, 2022, 2:40 AM

#

Yes, but the low hanging fruit is very picked

#

in the 21st century every research area has a small army of wannabes assaulting the easy stuff

#

so you need quite the edge to tackle stuff.

#

and the network and mentorship is part of that

iron basalt Mar 16, 2022, 2:41 AM

#

It's actually not because of again the axis issue. The focus is different. There are many things in ML for which there are only a handful of people working on it.

#

It's just not even being attempted often.

neon imp Mar 16, 2022, 2:42 AM

#

like?

iron basalt Mar 16, 2022, 2:43 AM

#

Online learning, causal modelling, non-backpropagation based methods. These things have relatively few people working on them.

#

There are still many, but these big teams in the ML world are often focused on their specific stuff which others follow making it seem like that is all there is.

#

And that is a losing strategy because they are already doing it, and in a large group.

neon imp Mar 16, 2022, 2:45 AM

#

Hmm, online learning I think is not very compatible with large scale backprop

#

also to be frank unsupervised online learning in industry is very...

#

risky

iron basalt Mar 16, 2022, 2:45 AM

#

Yeah online learning is an example where incremental improvements by just using backprop does not work.

neon imp Mar 16, 2022, 2:46 AM

#

but yes I guess so

iron basalt Mar 16, 2022, 2:46 AM

#

Risky, but it must also happen, because an AGI can do online learning.

#

But that's research, it's risky and you will probably fail, but that is how all interesting things happen. Not being afraid of failure is the first step. It's why kids are considered creative, they lack that fear in their shielded environment.

neon imp Mar 16, 2022, 2:47 AM

#

That's a very pure research mentality haha

#

the flipside is that successful research requires extremely insane feedback loops to be done well

iron basalt Mar 16, 2022, 2:48 AM

#

Having a group to work with is of course much better, but if there is not such group, and you know that it must be done (e.g. for AGI), then it is what it is. You need to start that group, someone has to do it.

neon imp Mar 16, 2022, 2:48 AM

#

So you need to be on top of your game to really do any thing, just because the intellectual capital being deployed is significant so you have to be really cutting edge to find delta

#

Artificial general intelligence talk kinda doesn't motivate me much

#

that entire field has skipped so much foundational knowledge

#

about neuroscience

#

we know so little about problem solving techniques that don't involve Von Neumann state machines

iron basalt Mar 16, 2022, 2:50 AM

#

This is why our approach is based on modern neuroscience and results. If it does not work, even with a nice pretty theory, it goes into the bin. There are things however that we know we must have, like online learning.

neon imp Mar 16, 2022, 2:50 AM

#

but it's clear that we will need to use models that are not von neumann state machines to get anywhere close

iron basalt Mar 16, 2022, 2:50 AM

#

We also are very into non-von neumann machines.

neon imp Mar 16, 2022, 2:51 AM

#

yea, but that is pure research.

iron basalt Mar 16, 2022, 2:51 AM

#

The reason for the neuroscience is that while it may not be necessary, it's the current real world example out there of it working (the human brain).

neon imp Mar 16, 2022, 2:51 AM

#

I don't doubt it's valuable, but you must bring a pure research mindset to it

#

neuroscience is very necessary

#

we need other models of computation haha

#

depending on your definition of neuroscience

iron basalt Mar 16, 2022, 2:52 AM

#

For non-research, yeah you can join a big group and help with the incremental improvements. And maybe just keep tabs on the pure research people's stuff.

neon imp Mar 16, 2022, 2:52 AM

#

Well non research means that it's commercial applications

#

if that makes sense

#

and commercially all this stuff is just miles and miles off

inland zephyr Mar 16, 2022, 2:52 AM

#

hello all
i'm now in progress on my personal project with image classification with cnn. i want to build an airplane tail classifier, starting with 20 different airlines, with 20 different tail images per airline at 120x120 px, manually cropped using PS (i don't even care about the aircraft type). what i want to ask is: is it feasible to do this with only 20 images per airline, which i plan to split into train and test?

misty flint Mar 16, 2022, 2:52 AM

#

neon imp specialising in ML algos is a really bad idea

thats not what i actually meant sorry but i understand your concern

neon imp Mar 16, 2022, 2:52 AM

#

allg

iron basalt Mar 16, 2022, 2:52 AM

#

Not exactly, there are already non-backprop methods in use and have been long before deep learning actually. And they can outperform them.

neon imp Mar 16, 2022, 2:53 AM

#

inland zephyr hello all i'm now in progress on my personal project with image classification w...

5000 images per class is my rule of thumb

iron basalt Mar 16, 2022, 2:53 AM

#

Especially when it comes to computation cost**

neon imp Mar 16, 2022, 2:53 AM

#

but you can cheat and go with a few hundred if you're really good at oversampling techniques

misty flint Mar 16, 2022, 2:53 AM

#

i did research in my past life. not a fan. kekHands

#

if i can avoid it, i will.

neon imp Mar 16, 2022, 2:53 AM

#

where are you coming from then?

inland zephyr Mar 16, 2022, 2:53 AM

#

neon imp 5000 images per class is my rule of thumb

i know with a cnn more images could help

misty flint Mar 16, 2022, 2:53 AM

#

secret RunFail

neon imp Mar 16, 2022, 2:54 AM

#

well where do you think about going

#

I think the tl;dr of a MLE career is Kaggle good

#

Pump the weights in the Kaggle gym it's really real world relevant

neon imp Mar 16, 2022, 2:54 AM

#

inland zephyr i know with a cnn more images could help

So I have two pieces of advice

misty flint Mar 16, 2022, 2:54 AM

#

tbh im probs gonna go the route of technical PM

#

so yeah

#

Oopsies

neon imp Mar 16, 2022, 2:54 AM

#

#1 don't use photoshop for labelling

inland zephyr Mar 16, 2022, 2:54 AM

#

maybe i will bring the sampling to albumentations or keras preprocessing to help

neon imp Mar 16, 2022, 2:54 AM

#

invest the effort up front to configure a labelling tool

#

#2 get a bit more source data, say 50 distinct original images for each
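
To make the oversampling idea concrete, here is a small NumPy-only augmentation sketch that turns one crop into several slightly different training images. It is an illustration, not a recommendation of specific settings: the function name, shift range and brightness range are assumptions, and libraries such as Albumentations or the Keras preprocessing layers offer much richer transforms.

```python
import numpy as np

def augment(image, rng, max_shift=8):
    """Return a randomly shifted and brightness-jittered copy of an
    H x W x C uint8 image."""
    h, w = image.shape[:2]
    out = image.astype(np.float32)

    # Random translation: pad by edge replication, then crop a shifted window
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    padded = np.pad(out, ((max_shift, max_shift), (max_shift, max_shift), (0, 0)), mode="edge")
    y0, x0 = max_shift + dy, max_shift + dx
    out = padded[y0:y0 + h, x0:x0 + w]

    # Random brightness scaling
    out = out * rng.uniform(0.8, 1.2)
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
# Stand-in for one 120x120 tail crop (random pixels, not a real image)
tail = rng.integers(0, 256, size=(120, 120, 3)).astype(np.uint8)
batch = np.stack([augment(tail, rng) for _ in range(10)])
print(batch.shape)
```

Horizontal flips are left out on purpose here, on the assumption that airline logos and lettering are not mirror-symmetric. Augmented copies of the same source image should stay on one side of the train/test split.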

neon imp Mar 16, 2022, 2:55 AM

#

misty flint tbh im probs gonna go the route of technical PM

Don't

#

the world doesn't need more pms

#

I'm really serious

inland zephyr Mar 16, 2022, 2:56 AM

#

what is PM? project manager?

neon imp Mar 16, 2022, 2:56 AM

#

in this context yea

misty flint Mar 16, 2022, 2:56 AM

#

the world doesnt need a lot of things but it still gets it. ill do what makes me happy. i wouldve continued my past life if i didnt care about my happiness DoggoKek

neon imp Mar 16, 2022, 2:57 AM

#

I think the hardest lesson I've learnt in life is that happiness doesn't get solved by the easy path

#

most happiness is hard earnt

#

if you've got a good path ahead of you for the hard yards of that domain sure, but just keep that in mind

#

I am of the view that it's very hard to succeed in any domain in this industry if you can't succeed at the technical IC path

misty flint Mar 16, 2022, 2:58 AM

#

i know myself and my personality and i would suffer if i went down the IC route

#

everyone is different. dif strokes for dif folks.

neon imp Mar 16, 2022, 2:59 AM

#

Perhaps, but I think the fundamental thing you need is the ability to focus and achieve difficult hard work

#

I think the skills to be a successful IC are a lot less rare than people think

#

well, a lot less specialised

#

but what happens is that highly successful ICs are highly successful because of a range of skills that are successful in any domain

misty flint Mar 16, 2022, 3:00 AM

#

i dont doubt your words, but i think success in life looks dif for everybody, yknow?

neon imp Mar 16, 2022, 3:00 AM

#

hence they are given 10 direct reports because they're highly successful people

neon imp Mar 16, 2022, 3:00 AM

#

misty flint i dont doubt your words, but i think success in life looks dif for everybody, yk...

I think the output of people's success

#

is different for everyone

#

I think the inputs of success are extremely similar for everyone

iron basalt Mar 16, 2022, 3:01 AM

#

neon imp I think the hardest lesson I've learnt in life that happiness doesn't get solved...

To add to that and the thing about research. I could have done web development or some "normal" programming job, but I would have been miserable. Instead of having a relatively easy route I do AGI research (although we actually have immediate applications) which is risky and niche.

misty flint Mar 16, 2022, 3:01 AM

#

i think we are talking past each other. lets just say i wouldve stayed in my original path had i cared about what others thought and what society thought of "having a successful career" meant

#

Oopsies

#

but to each their own, you do bring up some valid points.

neon imp Mar 16, 2022, 3:02 AM

#

That's fair Squiggle haha. I was trying to think from a industry mindset

#

AGI research is of course extremely interesting as long as you do it right

neon imp Mar 16, 2022, 3:03 AM

#

misty flint i think we are talking past each other. lets just say i wouldve stayed in my ori...

I mean that's the thing, you're being kinda mysterious about what you want so I can't give you advice

#

I'm just saying that you need to figure out what you want and then you need to really work on improving your input goods to success kinda

#

but I can't give more advice to you really without knowing more, I know you said ML nat lang stuff but yea

#

My experience in life is just that

#

successful people are rarely bad at a particular area

#

The successful researchers I know are often surprisingly solid full stack devs,

misty flint Mar 16, 2022, 3:06 AM

#

and im saying that success in life =/= success in career

#

thats what i mean

#

you get me?

iron basalt Mar 16, 2022, 3:07 AM

#

neon imp The successful researchers I know are often surprisingly solid full stack devs,

You may be onto something with that, myself and my circle of researchers all do stuff from web to simulation to ml and more.

misty flint Mar 16, 2022, 3:07 AM

#

anyway, im getting off-topic for this channel, so ill see myself out RunFail

iron basalt Mar 16, 2022, 3:09 AM

#

(often out of need because we need to simulate things, have a web UI for it, etc. Or because people now in ML often come from other things such as game dev)

#

"jack of all trades master of none" - lame, self limiting phrase, "jack of all trades master of one" - much better.

#

(a large math knowledge base is probably the underlying thing here (and/or physics))

neon imp Mar 16, 2022, 3:15 AM

#

and im saying that success in life =/= success in career
thats what i mean
you get me?

#

I understand, but I completely disagree

#

The #1 factor to success IMO is not having a mentality of.

#

"I'm not good at that and that's fine"

#

Most successful people are "Jack of all Trades AND Master of Something"

#

oh that's what you already said squiggle haha

#

jack of all master of one

iron basalt Mar 16, 2022, 3:17 AM

#

Yeah it's what can be seen when looking at pretty much all famous researchers (and lesser known ones). For example Newton was good at much more than just physics.

#

But that was his one master thing.

neon imp Mar 16, 2022, 3:17 AM

#

I don't like Newton or older researchers as an example

#

back in the 19th century you could be a weird shit and still successful

#

for various different reasons.

iron basalt Mar 16, 2022, 3:18 AM

#

I mean yeah, Newton and such are weird.

neon imp Mar 16, 2022, 3:18 AM

#

Modern successful researcher just looks like Alan Kay kinda

#

Just... they're not bad at stuff.

iron basalt Mar 16, 2022, 3:19 AM

#

Yeah. Well, the thing is there is of course always some stuff that is out of your hands, but using that as an excuse is not going to help you.

#

The thing is though, you can often see such people, successful or not, doing many different things.

#

And often just as a sort of problem along the way to trying to solve a different problem.

#

It really comes down to whether or not you throw up hands when there is a problem or you constantly push for a solution.

neon imp Mar 16, 2022, 3:23 AM

#

in a way

#

I think the biggest attribute is investing in your personal productive capacity kinda

#

I think it's less about charging through problems

#

because there are a lot of failed people who pushed really hard on bad problems

#

Survivorship bias is a really strong thing

#

I think it's about investing in your personal capacity to do really good work, and to find and identify really important and profitable work to do

#

and there's a lot involved in doing that successfully for obvious reasons

iron basalt Mar 16, 2022, 3:25 AM

#

The thing is that it's a multi-armed bandit problem and it's really hard to tell whether you should keep going or not. At some point the whole different approach thing kicks in for some, and others just keep pushing the same way. Knowing more about seemingly unrelated things may just happen to give you the alternative solution and math happens to be generic enough to often allow for such bridges.

neon imp Mar 16, 2022, 3:26 AM

#

I just would try to not worry about the problem

#

rather than the process

#

be process minded

#

Always, be very process minded

#

Don't sweat about the win sweat about having an amazingly good process

iron basalt Mar 16, 2022, 3:28 AM

#

I agree.

neon imp Mar 16, 2022, 3:29 AM

#

breadth of knowledge is also a very good point

pastel valley Mar 16, 2022, 5:48 AM

#

is 87% training accuracy considered underfitting?

alpine bay Mar 16, 2022, 6:06 AM

#

#

alpine bay Mar 16, 2022, 6:06 AM

#

alpine bay

I want to print this in google colab, can anyone help??

pastel valley Mar 16, 2022, 6:17 AM

#

alpine bay I want to print this in google colab any can help??

i just saw this on net

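import matplotlib.pyplot as plt

# con_mat is assumed to be a 2D array of counts, e.g. from sklearn.metrics.confusion_matrix(y_true, y_pred)
fig, ax = plt.subplots(figsize=(8, 8))
ax.matshow(con_mat, cmap=plt.cm.Blues, alpha=0.3)
# write each count in the middle of its cell
for i in range(con_mat.shape[0]):
    for j in range(con_mat.shape[1]):
        ax.text(x=j, y=i, s=con_mat[i, j], va='center', ha='center', size='xx-large')

plt.xlabel('Predictions', fontsize=18)
plt.ylabel('Actuals', fontsize=18)
plt.title('Confusion Matrix', fontsize=18)
plt.show()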

alpine bay Mar 16, 2022, 6:24 AM

#

silver sun Mar 16, 2022, 6:24 AM

#

Im getting this error on my CNN deep learning code. Any help? I tried to fix it but Im stuck. ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 24), found shape=(None, 8)

regal gale Mar 16, 2022, 6:53 AM

#

Hi

#

Anyone familiar with fitting a logistic regression model using a 70%-30% train-test split of the data and reporting AUC?

stone marlin Mar 16, 2022, 7:18 AM

#

This feels like another homework question, Jessica.

lapis sequoia Mar 16, 2022, 7:20 AM

#

silver sun Im getting this error on my CNN deep learning code. Any help? I tried to fix it ...

hm seems like you are either giving the wrong input or you need to edit architecture a bit more, why dont you share more details.

regal gale Mar 16, 2022, 7:36 AM

#

@stone marlin i have the answer

#

just need some explanation @stone marlin

pastel valley Mar 16, 2022, 9:02 AM

#

alpine bay

try ctrl v or paste on notepad first then copy paste again

lapis sequoia Mar 16, 2022, 9:30 AM

#

When my pipeline gets really convoluted, it'd be nice to see all the steps graphically and be able to inspect intermediate results

tacit basin Mar 16, 2022, 9:40 AM

#

lapis sequoia When my pipeline gets really convoluted, it'd be nice to see all the steps graph...

scikit learn pipelines have graphical representation: https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html#using-the-prediction-pipeline-in-a-grid-search

scikit-learn

Column Transformer with Mixed Types

This example illustrates how to apply different preprocessing and feature extraction pipelines to different subsets of features, using ColumnTransformer. This is particularly handy for the case of ...

inland zephyr Mar 16, 2022, 10:06 AM

#

My model get emotional damage for now

Epoch 2/500
5/5 [==============================] - 1s 116ms/step - loss: 33.5003 - accuracy: 0.0507 - false_negatives: 139.0000 - categorical_crossentropy: 33.5003 - val_loss: 236863552.0000 - val_accuracy: 0.8100 - val_false_negatives: 57.0000 - val_categorical_crossentropy: 236863568.0000
Epoch 3/500
5/5 [==============================] - 1s 116ms/step - loss: 7.6320 - accuracy: 0.0300 - false_negatives: 131.0000 - categorical_crossentropy: 7.6320 - val_loss: 361702784.0000 - val_accuracy: 0.8233 - val_false_negatives: 53.0000 - val_categorical_crossentropy: 361702784.0000
Epoch 4/500
5/5 [==============================] - 1s 123ms/step - loss: 2.8214 - accuracy: 0.0343 - false_negatives: 127.0000 - categorical_crossentropy: 2.8214 - val_loss: 139447792.0000 - val_accuracy: 0.8100 - val_false_negatives: 57.0000 - val_categorical_crossentropy: 139447792.0000
Epoch 5/500
5/5 [==============================] - 1s 142ms/step - loss: 4.0497 - accuracy: 0.0379 - false_negatives: 131.0000 - categorical_crossentropy: 4.0497 - val_loss: 51209328.0000 - val_accuracy: 0.8233 - val_false_negatives: 53.0000 - val_categorical_crossentropy: 51209328.0000

I need to tweak it more ...

solemn dragon Mar 16, 2022, 10:11 AM

#

Hey guys how would you go about using groupby only on rows that are "1T" apart in a pandas timeseries ?
So in my mind it would be something like this :
deltaTimeThreshold = np.timedelta64(1, 'm')

Not valid code obviously : df = df.groupby('sn')(if df.date -df.date.shift() <= deltaTimeThreshold)
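
One way this could be sketched, assuming the goal is to chain together rows of the same `sn` whose timestamps are at most one minute apart: take the gap to the previous row, flag where a new run starts, and use a cumulative sum as the group key. The sample data, the `value` column and the `run_id` name are made up for illustration; only `sn` and `date` come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names 'sn' and 'date' come from the question.
df = pd.DataFrame({
    "sn": ["a", "a", "a", "a", "b", "b"],
    "date": pd.to_datetime([
        "2022-03-16 10:00:00",
        "2022-03-16 10:00:30",
        "2022-03-16 10:01:10",
        "2022-03-16 10:05:00",
        "2022-03-16 10:00:00",
        "2022-03-16 10:03:00",
    ]),
    "value": [1, 2, 3, 4, 5, 6],
}).sort_values(["sn", "date"])

threshold = pd.Timedelta(minutes=1)

# Gap to the previous row within the same 'sn' (NaT for the first row of each 'sn').
gap = df.groupby("sn")["date"].diff()

# A new run starts at the first row of each 'sn' or whenever the gap exceeds the threshold.
new_run = gap.isna() | (gap > threshold)
df["run_id"] = new_run.cumsum()

# Rows chained together by gaps of at most 1 minute now share a run_id.
result = df.groupby(["sn", "run_id"])["value"].sum()
print(result)
```

The cumulative sum trick keeps everything vectorised, so there is no need for a Python loop over rows. Swap `sum()` for whatever aggregation is actually needed.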

misty flint Mar 16, 2022, 10:50 AM

#

pithink

#

im curious, what happens if you try it

urban lance Mar 16, 2022, 10:52 AM

#

on the topic of groupby
does a groupby sum change nan to 0 🤔
if a group has only nans, I want it to remain nan in the sum

df = df.groupby(["x", pd.Grouper(key="y", freq="M")]).agg({
    "feature": "sum",
})

#

ex:

features = [nan, nan, nan, nan, nan]
output : nan

features = [nan, nan, 0.14, 0, nan]
output : 0.14

inland zephyr Mar 16, 2022, 11:41 AM

#

https://stackoverflow.com/questions/26145585/pandas-aggregation-ignoring-nans @urban lance using nansum if you want to ignore that nan

Stack Overflow

Pandas aggregation ignoring NaN's

I aggregate my Pandas dataframe: data. Specifically, I want to get the average and sum amounts by tuples of [origin and type]. For averaging and summing I tried the numpy functions below:

import n...
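
A small sketch for the "all-NaN group should stay NaN" case. Worth noting that `np.nansum` also returns 0.0 for an all-NaN input, so on its own it would not keep the NaN; the `min_count` argument of pandas' `sum` is what asks for "at least one real value". The data below is hypothetical and only mirrors the two examples in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" is all-NaN, group "b" mixes NaN and numbers.
df = pd.DataFrame({
    "x": ["a"] * 5 + ["b"] * 5,
    "feature": [np.nan] * 5 + [np.nan, np.nan, 0.14, 0.0, np.nan],
})

# Default behaviour: NaNs are skipped and an all-NaN group sums to 0.0.
default_sum = df.groupby("x")["feature"].sum()

# min_count=1 requires at least one non-NaN value, otherwise the result is NaN.
strict_sum = df.groupby("x")["feature"].sum(min_count=1)

# The same idea inside .agg(), using a lambda on each group's Series.
agg_sum = df.groupby("x").agg({"feature": lambda s: s.sum(min_count=1)})
print(strict_sum)
```

The lambda form is slower than the built-in `"sum"` string on large frames, so calling `.sum(min_count=1)` directly is probably the better default when only one column is aggregated.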

mint palm Mar 16, 2022, 11:48 AM

#

#

i actually was trying to run a NN model from github, but i am getting this error .......all i changed was remove the encoding part as i had the encoded dataset for the same

upper spindle Mar 16, 2022, 12:03 PM

#

could someone help at #help-pretzel

mint palm Mar 16, 2022, 12:08 PM

#

nobody helping

#

😢

lapis sequoia Mar 16, 2022, 12:09 PM

#

mint palm i actually was trying to run a NN model from github, but i am getting this error...

i assume that you have either changed the model a bit, or the data.

#

its simple shape error

mint palm Mar 16, 2022, 12:10 PM

#

yes but i can even show you my code, all i did is remove encoding

lapis sequoia Mar 16, 2022, 12:10 PM

#

mint palm yes but i can even show you my code, all i did is remove encoding

yeah showing code will help.

#

also given the error already, what is the shape of your X and y?

inland zephyr Mar 16, 2022, 12:15 PM

#

s_label=[]
s_image=[]
fig = plt.figure(figsize=(5, 20))
k = 0
sample = paths.list_images(r'this is the path')
for s in sample:
    s_label = s.split(os.path.sep)[-2]
    s_image = cv2.imread(s)
    s_image = np.array(s_image, dtype="float") / 255.0
    fig.add_subplot(2, 5, (k + 1))
    plt.imshow(s_image)
    plt.axis='off'
    plt.title=s_label
plt.savefig(r'this is also a path', bbox_inches='tight')

I want to plot 10 image into a plot of 2x5 area but when i run this, i only got the last image

#

the structure look like this, but only the last folder shown in the plot

lapis sequoia Mar 16, 2022, 12:20 PM

#

inland zephyr ``` s_label=[] s_image=[] fig = plt.figure(figsize=(5, 20)) k = 0 sample = paths...

hm i think i have solved this before. hold on.

#

you want to save all of them right? its out of loop.

#

put it in the loop.

inland zephyr Mar 16, 2022, 12:22 PM

#

actually i want to make it as a montage

#

so all in one image

lapis sequoia Mar 16, 2022, 12:23 PM

#

oh, so then you should use...

#

subplot

inland zephyr Mar 16, 2022, 12:35 PM

#

it works

#

but now the axis make it annoying

#

nvm

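import os
import cv2
import matplotlib.pyplot as plt
from imutils import paths  # assuming `paths` is imutils.paths

fig = plt.figure()
r = 2  # rows in the montage
c = 5  # columns in the montage
i = 1  # subplot index, 1-based
sample = paths.list_images(r'')
print(sample)
for s in sample:
    plt.subplot(r,c,i)
    plt.title(s.split(os.path.sep)[-2])  # parent folder name is the label
    plt.axis('off')
    s_img = cv2.imread(s)
    plt.imshow(cv2.cvtColor(s_img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR, matplotlib expects RGB
    i=i+1

plt.savefig(r'', bbox_inches='tight')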

#

and the output seems fine

#

lapis sequoia Mar 16, 2022, 12:46 PM

#

cheers!

misty flint Mar 16, 2022, 12:47 PM

#

praise

inland zephyr Mar 16, 2022, 12:50 PM

#

i dont know, even though each tail looks very distinct, i still cannot make the model predict all of them easily

#

visually

#

it was easy to classify them since I carefully chose the easy ones and left out the hard-to-separate patterns

mint palm Mar 16, 2022, 12:51 PM

#

lapis sequoia yeah showing code will help.

import tensorflow.keras
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

import numpy
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt
from keras.utils.vis_utils import plot_model

seed = 9
numpy.random.seed(seed)

data = pd.read_csv("C:\\Users\\rahul\\PycharmProjects\\pythonProject1\\complete.csv")
X = data.iloc[:, 0:8]
Y = data.iloc[:, 8]

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.001, random_state=seed)
# create model
model = Sequential()
model.add(Dense(8, input_dim=8, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(3, activation='tanh'))
model.add(Dense(3, activation='softmax'))
print(model.summary())
# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model

history = model.fit(X_train, Y_train, validation_split=0.3, epochs=16, batch_size=128)

# evaluate the model
scores = model.evaluate(X_test, Y_test)

print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

plot_model(model, to_file='model.png')

# Plot training & validation accuracy values
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

# Plot training & validation loss values
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper right')
plt.show()

#

my code^

lapis sequoia Mar 16, 2022, 12:52 PM

#

mint palm ```python import tensorflow.keras from keras.models import Sequential from keras...

i bet your dimensions of Y are (something, 1)

mint palm Mar 16, 2022, 12:53 PM

#

from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

import numpy
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt
from keras.utils import plot_model

seed = 9
numpy.random.seed(seed)

# load datasets
#csv files were filtered based on the data.
input_file = "C:\\XXX...csv"
test_file = "C:\\XXX.csv"

dataset = pd.read_csv(input_file).values

# read training data
datasetTest = pd.read_csv(test_file).values

# split into input (X) and output (Y) variables
X = dataset[:,0:8].astype("int32")
Y = dataset[:,8]
XT = datasetTest[:,0:8].astype("int32")

encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)

(X_train, X_test, Y_train, Y_test) = train_test_split(X, dummy_y, test_size=0.001, random_state=seed)
# create model
model = Sequential()
model.add(Dense(8, input_dim=8, init='normal', activation='relu'))
model.add(Dense(4, init='normal', activation='relu'))
model.add(Dense(3, init='normal', activation='tanh'))
model.add(Dense(3, init='normal', activation='softmax'))
print(model.summary())
# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model

history = model.fit(X_train, Y_train, validation_split=0.3, epochs=16, batch_size=128)

# evaluate the model
scores = model.evaluate(X_test, Y_test)

print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

plot_model(model, to_file='model.png')

# Plot training & validation accuracy values
plt.plot(history.history['acc'])

(removed some lines to fit in)

#

dataset is 63168 by 9

lapis sequoia Mar 16, 2022, 12:54 PM

#

i repeat, what is the shape of your Y?

mint palm Mar 16, 2022, 12:54 PM

#

where last column is Y

#

63168 by 1

lapis sequoia Mar 16, 2022, 12:54 PM

#

model.add(Dense(3, init='normal', activation='softmax'))

your model expects output of vector of 3 not 1.

mint palm Mar 16, 2022, 12:55 PM

#

i didnt change it but...

#

i will try

lapis sequoia Mar 16, 2022, 12:55 PM

#

mint palm i didnt change it but...

no hold on listen.

#

you are not using encoded y right?

mint palm Mar 16, 2022, 12:55 PM

#

yes

#

yes

#

i have encoded data

lapis sequoia Mar 16, 2022, 12:56 PM

#

so your encoded y has the shape of (63168, 3) that is why it works for that, and not for this.

mint palm Mar 16, 2022, 12:56 PM

#

no Y is encoded to either 1 or 2 or 3

#

for each 63K example

lapis sequoia Mar 16, 2022, 12:57 PM

#

whats the shape of encoded_Y can you print it?

mint palm Mar 16, 2022, 12:57 PM

#

yes i will verify

mint palm Mar 16, 2022, 12:59 PM

#

lapis sequoia whats the shape of `encoded_Y` can you print it?

#

its 1 column

lapis sequoia Mar 16, 2022, 1:00 PM

#

i said print encoded_Y.shape lemon_grumpy

mint palm Mar 16, 2022, 1:00 PM

#

i removed my encoded

lapis sequoia Mar 16, 2022, 1:00 PM

#

okay...okay do you first of all understand what is the issue?

mint palm Mar 16, 2022, 1:00 PM

#

yeah

lapis sequoia Mar 16, 2022, 1:01 PM

#

and why removing encoding affects your code?

mint palm Mar 16, 2022, 1:01 PM

#

it want different size

#

i am giving it 3 by something

#

it need 1 by something

lapis sequoia Mar 16, 2022, 1:01 PM

#

yes.

#

do you know the simplest way to resolve it?

mint palm Mar 16, 2022, 1:01 PM

#

i think i should try printing his encoded Y too

#

i will know if error is after that or somewhere else

mint palm Mar 16, 2022, 1:02 PM

#

lapis sequoia do you know the simplest way to resolve it?

no

lapis sequoia Mar 16, 2022, 1:08 PM

#

mint palm no

hm you can use to_categorical

urban lance Mar 16, 2022, 1:08 PM

#

I'm guessing those 2 purple clusters are classified as being of the same cluster right? 😬

lapis sequoia Mar 16, 2022, 1:08 PM

#

https://www.tensorflow.org/api_docs/python/tf/keras/utils/to_categorical

TensorFlow

tf.keras.utils.to_categorical | TensorFlow Core v2.8.0

Converts a class vector (integers) to binary class matrix.

#

Y = to_categorical(Y)

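this will convert your Y to (something, 3) so then you're good, as long as the labels are 0, 1, 2. if they are 1, 2, 3 subtract 1 first, otherwise to_categorical gives you 4 columns.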
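
To see the target shape the 3-unit softmax layer is asking for, here is a NumPy-only sketch of the one-hot step. It is not the Keras implementation, just the same idea; the label values are hypothetical and assume the 1..3 encoding mentioned earlier in the thread.

```python
import numpy as np

# Hypothetical labels in the 1..3 encoding described above.
Y = np.array([1, 2, 3, 1, 3])

# Shift to 0..2 so each class maps to one of exactly 3 columns.
Y_zero_based = Y - 1
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i.
Y_onehot = np.eye(num_classes, dtype="float32")[Y_zero_based]

print(Y_onehot.shape)
```

With `categorical_crossentropy` the targets need this (n, 3) layout. Keeping the integer labels and switching the loss to `sparse_categorical_crossentropy` would be the other common route.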

#

@mint palm

mint palm Mar 16, 2022, 1:09 PM

#

ok thank you....i will try all that you said

lapis sequoia Mar 16, 2022, 1:10 PM

#

alright, ping me here if you're still stuck.

urban lance Mar 16, 2022, 1:11 PM

#

urban lance I'm guessing those 2 purple clusters are classified as being of the same cluster...

well there must be 2 clusters and so they aren't all visible in 2d space (being covered by other datapoints)

#

I'm using Kmeans so how does this work 🤔

#

my input is a distance matrix of my dataset

lapis sequoia Mar 16, 2022, 1:14 PM

#

urban lance I'm guessing those 2 purple clusters are classified as being of the same cluster...

depends on how you show your dots you need to show different colors for different clusters Im not sure if you did that.

urban lance Mar 16, 2022, 1:16 PM

#

of course I did

lapis sequoia Mar 16, 2022, 1:17 PM

#

urban lance of course I did

oh lol no offense. how many dimensions are there actually?

urban lance Mar 16, 2022, 1:18 PM

#

it's 2 dimensional data

#

(with chi2 distance)

#

lapis sequoia Mar 16, 2022, 1:19 PM

#

if it's 2 dimensional then by the definition of kmeans, there are 2 possibilities, something is wrong with your algo or something is wrong with your visualization.

urban lance Mar 16, 2022, 1:19 PM

#

this is what I input

#

probably the plot then, thought finally I had something cause I clearly saw 3 clusters now :/

#

been trying to cluster this data for almost 2 weeks 😅

urban lance Mar 16, 2022, 1:29 PM

#

urban lance I'm guessing those 2 purple clusters are classified as being of the same cluster...

this is the same data just with different x-y values

#

the elbow method tells me the optimal number of clusters is 3

#

and this is what happens when I input the original data set

coral garden Mar 16, 2022, 1:31 PM

#

what is that zombie infestation

urban lance Mar 16, 2022, 1:31 PM

#

haha

#

ye

lapis sequoia Mar 16, 2022, 1:31 PM

#

urban lance and this is what happens when I input the original data set

okay that seems like k means output.

coral garden Mar 16, 2022, 1:31 PM

#

;_)

urban lance Mar 16, 2022, 1:31 PM

#

lapis sequoia okay that seems like k means output.

it is

#

but kmeans isn't optimal for my dataset;

#

I tried with chi2 distance and just counts of my dataset

#

lapis sequoia Mar 16, 2022, 1:32 PM

#

coral garden what is that zombie infestation

i was gonna say zombie infestation is not a part of #data-science-and-ml or something but then i got the context lol

urban lance Mar 16, 2022, 1:32 PM

#

urban lance

this is what the data looks like in that case

urban lance Mar 16, 2022, 1:33 PM

#

urban lance

clusters I made looked fine on the 2d plot

coral garden Mar 16, 2022, 1:33 PM

#

lapis sequoia i was gonna say zombie infestation is not a part of #data-science-and-ml or som...

lol

urban lance Mar 16, 2022, 1:33 PM

#

but when I did a profile report, I saw that the clusters didn't make sense

#

I know what you're all thinking, outliers! haha

lapis sequoia Mar 16, 2022, 1:34 PM

#

urban lance but when I did a profile report, I saw that the clusters didn't make sense

I'm not sure what do you mean by profile report here?

urban lance Mar 16, 2022, 1:35 PM

#

pandas profiling report

#

lapis sequoia Mar 16, 2022, 1:36 PM

#

...so how does this relate with kmeans?

#

wtf @strange elbow

urban lance Mar 16, 2022, 1:38 PM

#

I made a different dataset for each cluster, then looked at the values for all features but they weren't really different
for ex:
in cluster 1, the num_visits ranges from 1-27
and in cluster 2 they'd range from 1-24
and cluster 3: 1-32
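
A possible way to do this comparison without splitting into separate datasets, sketched with made-up numbers: min-max ranges are driven by a few extreme rows, so clusters can overlap in range and still differ in the middle. The `cluster` column is an assumption standing in for the KMeans labels; `num_visits` is the feature named above.

```python
import pandas as pd

# Hypothetical data; 'num_visits' is from the discussion, 'cluster' stands in for the KMeans labels.
df = pd.DataFrame({
    "cluster": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "num_visits": [1, 5, 27, 1, 4, 24, 1, 6, 32],
})

# One row per cluster with the extremes and the median side by side.
summary = df.groupby("cluster")["num_visits"].agg(["min", "median", "max"])
print(summary)
```

If the medians and quartiles are also nearly identical across clusters, that would support the suspicion that the features are not separating the groups.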

#

I tried with count values + chi2 distance and hierarchical clustering/GMM
and redid the feature engineering so I had the right data from kmeans

#

should add that the data is normalized

#

@lapis sequoia

lapis sequoia Mar 16, 2022, 1:44 PM

#

urban lance @lapis sequoia

hm yeah I've been thinking about it, but I'm lost.

#

i cannot understand whats the issue

urban lance Mar 16, 2022, 1:51 PM

#

the clusters don't make sense so something must be off with the feature engineering
I'm trying to predict where in the customer journey a certain user is so the data should make sense 🤔

#

(tried with 3 datasets from different companies)

#

all the same results

karmic valley Mar 16, 2022, 2:05 PM

#

i want to work out the gradient of colour from this image. so basically im thinking of creating a line or box that starts at e.g. y=0 and goes up until it hits the black colour, and it should find the pixel colours in terms of whiteness at all points along that line. so it should go from high white pixel values to medium grey pixel values to blackish pixel values. i can use some more basic software like imagej to draw a line and plot pixel values, but the problem is that in my region of interest there are lots of random pixels with completely different grey levels, so the graph would have lots of noise (jumping from very high whiteness to sudden low whiteness). i want some way to smooth the image or change odd pixels to have the same colour as their neighbouring pixels. i cant smooth much though, because essentially i want to get an accurate plot of the changing pixel intensity along the line. could anyone suggest some code?
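
One option that might fit, sketched under assumptions: instead of a one-pixel line, take a strip a few pixels wide and use the median across the strip for each row. Odd pixels get voted out by their neighbours in the same row, while nothing is blurred along the direction of the gradient. The image here is synthetic, and the function name and strip bounds are made up for illustration.

```python
import numpy as np

def intensity_profile(gray, x_start, x_stop):
    """Median intensity per row over columns x_start..x_stop-1, bottom row first."""
    strip = gray[:, x_start:x_stop]
    profile = np.median(strip, axis=1)
    return profile[::-1]  # reverse so index 0 is the bottom of the image

# Synthetic stand-in image: bright at the bottom fading to dark at the top, plus speckle.
rng = np.random.default_rng(0)
height, width = 100, 40
ramp = np.linspace(0, 255, height).reshape(-1, 1)  # row 0 (top) dark, last row bright
image = np.repeat(ramp, width, axis=1)
speckle = rng.random((height, width)) < 0.05       # roughly 5% odd pixels
image[speckle] = rng.integers(0, 256, speckle.sum())

profile = intensity_profile(image, 10, 30)
```

For a real image, load it as a 2D greyscale array first and pick the strip columns around the line of interest. A wider strip tolerates more bad pixels but assumes the gradient is the same across its width.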

grave frost Mar 16, 2022, 2:14 PM

#

iron basalt I would say that they are everywhere now due to being part of Apple's new SOCs. ...

wait what - TPUs merging with Apple Silicon?

#

idk how they're exactly closed - their speciality is bf16 ops, that's what they're designed to do and Jax, Pytorch or TF works very well IMO

mint palm Mar 16, 2022, 2:49 PM

#

Will double encoding do anything at all?

serene scaffold Mar 16, 2022, 2:50 PM

#

mint palm Will double encoding do anything at all?

for what

mint palm Mar 16, 2022, 2:51 PM

#

for encoding string to integers

mint palm Mar 16, 2022, 2:51 PM

#

serene scaffold for what

for unique representation of set of possible value of any attribute

lean harbor Mar 16, 2022, 2:53 PM

#

so i've built this classification model on the fruits360 dataset, and it's pretty accurate, but i'm not sure how to match up folder names to the numbered classes that pytorch outputs. does anyone have experience with this or know how to get that classification? thanks https://colab.research.google.com/drive/1WWtTrG57chcm2xf5bHlP7NCiFE_YpHin?usp=sharing

Google Colaboratory
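
A sketch of how the index-to-folder mapping usually comes about, assuming the notebook loads the data with torchvision's `ImageFolder` (not verifiable from the message alone). `ImageFolder` numbers classes by sorted subfolder name and exposes the result as `dataset.classes` and `dataset.class_to_idx`. The helper below only imitates that ordering with the standard library; the folder names are hypothetical.

```python
import os
import tempfile

def folder_class_mapping(root):
    """Map class index -> folder name using sorted subfolder names, as ImageFolder does."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {index: name for index, name in enumerate(classes)}

# Hypothetical folder layout standing in for the dataset's training directory.
with tempfile.TemporaryDirectory() as root:
    for name in ["Banana", "Apple Red 1", "Cherry 1"]:
        os.makedirs(os.path.join(root, name))
    idx_to_class = folder_class_mapping(root)

print(idx_to_class)
```

With a real `ImageFolder` dataset the lookup would be `dataset.classes[predicted_index]`, where `predicted_index` is the argmax of the model output.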

serene scaffold Mar 16, 2022, 2:53 PM

#

it depends on the algorithm, but generally speaking, you can't just assign words to arbitrary integers, as this tells the algorithm that a word with a higher number is "more" than another word, which makes no sense.
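
To make the point concrete, a minimal sketch comparing integer codes with one-hot columns. The `network_type` column and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical categorical column; the values are made up for illustration.
df = pd.DataFrame({"network_type": ["wifi", "4g", "wifi", "ethernet"]})

# Integer codes impose an order (ethernet < wifi?) that has no real meaning.
codes = df["network_type"].astype("category").cat.codes

# One-hot columns give every category its own 0/1 indicator instead.
one_hot = pd.get_dummies(df["network_type"], prefix="network_type")
print(one_hot)
```

Tree-based models often cope fine with the integer codes, which is the "depends on the algorithm" part. For a neural network fed raw numbers, one-hot or an embedding layer is usually the safer representation.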

mint palm Mar 16, 2022, 2:54 PM

#

serene scaffold it depends on the algorithm, but generally speaking, you can't just assign words...

#

i am talking about this

serene scaffold Mar 16, 2022, 2:55 PM

#

I don't look at screenshots of code, sorry

#

!code

arctic wedgeBOT Mar 16, 2022, 2:55 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

serene scaffold Mar 16, 2022, 2:55 PM

#

^ that and the paste bin are the only way I will look at code.

mint palm Mar 16, 2022, 2:56 PM

#

serene scaffold ^ that and the paste bin are the only way I will look at code.

import tensorflow.keras
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

import numpy
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt
from keras.utils.vis_utils import plot_model

seed = 9
numpy.random.seed(seed)

input_file = "C:\\Users\\rahul\\PycharmProjects\\pythonProject1\\complete.csv"
test_file = "C:\\Users\\rahul\\PycharmProjects\\pythonProject1\\complete.csv"

dataset = pd.read_csv(input_file).values

# read training data
datasetTest = pd.read_csv(test_file).values

# split into input (X) and output (Y) variables
X = dataset[:,0:8].astype("int32")
Y = dataset[:,8]
XT = datasetTest[:,0:8].astype("int32")
print(Y)
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)
print(dummy_y)
(X_train, X_test, Y_train, Y_test) = train_test_split(X, dummy_y, test_size=0.5, random_state=seed)

serene scaffold Mar 16, 2022, 2:56 PM

#

what's the algorithm?

mint palm Mar 16, 2022, 2:58 PM

#

serene scaffold what's the algorithm?

now see

serene scaffold Mar 16, 2022, 2:58 PM

#

so you're using a neural network, built with keras

#

what is the model intended to do?

mint palm Mar 16, 2022, 2:58 PM

#

i have a dataset thats basically all strings

#

so i have to predict an output that falls in one of the 3 categories

#

for which i use onehot representation

serene scaffold Mar 16, 2022, 3:00 PM

#

"strings". just saying that you have strings is uninformative--what kind of strings? what do they represent? where do they come from?

#

and what are the categories?

mint palm Mar 16, 2022, 3:00 PM

#

1,1,1,1,0,2,1,5,1
1,1,1,1,1,2,1,5,1
1,1,1,1,2,2,1,5,1
1,1,1,1,3,2,1,5,1
1,1,1,1,4,2,1,5,1
1,1,1,1,5,2,1,5,1
1,1,1,1,6,2,1,5,1
1,1,1,1,7,2,1,5,1
1,1,1,1,8,2,1,5,1
1,1,1,1,9,2,1,5,1
1,1,1,1,10,2,1,5,1
1,1,1,1,11,2,1,5,1

#

this is a sample

serene scaffold Mar 16, 2022, 3:00 PM

#

after you encode them?

mint palm Mar 16, 2022, 3:00 PM

#

encoded sample^

lapis sequoia Mar 16, 2022, 3:00 PM

#

Is there a good format to save a table with images? I'll typically plot a data frame and save each plot to a file, but it might be nice to just save the whole table to a single file which I can open/annotate. The question is, is there a format which already has visualizers, which allow you to filter/maybe annotate?

serene scaffold Mar 16, 2022, 3:01 PM

#

lapis sequoia Is there a good format to save a table with images? I'll typically plot a data f...

you can pickle the dataframe and load it again if you need to generate additional figures later, if that's what you mean.
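
A minimal sketch of the pickle round trip, with a made-up frame and file name. It only covers saving and reloading the data behind the plots, not the GUI part of the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame standing in for the data behind the plots.
df = pd.DataFrame({"name": ["a", "b"], "score": [0.1, 0.9]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "plots_source.pkl")
    df.to_pickle(path)               # save the full DataFrame, dtypes included
    restored = pd.read_pickle(path)  # load it back later to regenerate figures

print(restored.equals(df))
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.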

mint palm Mar 16, 2022, 3:02 PM

#

serene scaffold after you encode them?

yes columns are weekdays, network types, etc etc

serene scaffold Mar 16, 2022, 3:02 PM

#

mint palm yes columns are weekdays, network types, etc etc

don't say "etc etc". I can't possibly guess what they are unless you tell me.

#

remember: I know nothing about what you're trying to do. only you do

lapis sequoia Mar 16, 2022, 3:02 PM

#

serene scaffold you can pickle the dataframe and load it again if you need to generate additiona...

I want a file format which I can open with a GUI, to examine plots (probably sorting/filtering) and maybe annotate them

#

Something a little better than opening a folder full of plots with the file explorer

karmic valley Mar 16, 2022, 3:04 PM

#

#data-science-and-ml message can anyone help me

mint palm Mar 16, 2022, 3:04 PM

#

serene scaffold remember: I know nothing about what you're trying to do. only you do

lapis sequoia Mar 16, 2022, 3:04 PM

#

I've cooked up something like this, just wondering if there's something mature and off-the-shelf out there

serene scaffold Mar 16, 2022, 3:04 PM

#

karmic valley https://discord.com/channels/267624335836053506/366673247892275221/9536552735312...

I'm sorry that you haven't gotten help with that question after all this time. It's looking unlikely that anyone can help with that, so you might need to do some more investigation on your own and come back with a more pointed question later.

mint palm Mar 16, 2022, 3:06 PM

#

mint palm

the entries in the left-hand column are the column names for the encoded data

#

almost

serene scaffold Mar 16, 2022, 3:06 PM

#

mint palm

what are the categories you're predicting for?

#

I should really get back to work, but I would recommend reading about feature encoding. you have to be intentional about how you represent each feature for the network, or it won't understand what you're telling it to do.

mint palm Mar 16, 2022, 3:16 PM

#

serene scaffold what are the categories you're predicting for?

like my weekday attribute is 0 to 6......for sunday to saturday
time is 0 to 23
like this so on
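
Integer codes such as weekday 0 to 6 and hour 0 to 23 imply an ordering and a distance between values, which a network may take literally. One common alternative is one-hot encoding. The sketch below is only an illustration of that idea: the helper names `one_hot` and `encode_row` are made up here, and whether one-hot is the right choice for this dataset is an assumption.

```python
def one_hot(value, n_categories):
    """Return a list of n_categories zeros with a single 1 at index `value`."""
    if not 0 <= value < n_categories:
        raise ValueError(f"{value} is outside 0..{n_categories - 1}")
    vec = [0] * n_categories
    vec[value] = 1
    return vec


def encode_row(weekday, hour):
    """Concatenate a one-hot weekday (7 slots) and a one-hot hour (24 slots)."""
    return one_hot(weekday, 7) + one_hot(hour, 24)


row = encode_row(weekday=0, hour=5)  # Sunday, 05:00
print(len(row))  # 31 binary features instead of 2 integer codes
```

Each categorical column becomes a block of 0/1 features, so no weekday looks "bigger" than another to the model.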

#

actually the main problem was....when i run this code it says 100% accuracy lol

crisp sluice Mar 16, 2022, 3:21 PM

#

has anyone had any experience with openai gym? i installed in python 3.9 and tried running sample code from the docs on the openai gym site but the example isn't popping up for me...

#

here's the code:

#


```py
import gym

env = gym.make('CartPole-v1')
observation = env.reset()
for _ in range(1000):
    env.render()
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action) # take a random action
    if done:
        observation = env.reset()
env.close()
```

misty flint Mar 16, 2022, 3:38 PM

#

karmic valley i want to work out the gradient of colour from this image. so basically im think...

look at various edge detection methods like sobel or canny. if those arent fine-detailed enough, maybe look at specific ones or adjust/tweak them to fit your needs.

i know matlab does this type of stuff really well, but i think you might be able to get by with python's opencv library

#

you might have to mess around with kernel sizes
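
To make the Sobel suggestion concrete, here is a minimal pure-Python sketch of the 3x3 Sobel operator on a tiny grayscale grid. The function name `sobel_magnitude` and the toy image are assumptions for illustration; in practice OpenCV's `cv2.Sobel` or `cv2.Canny` would do this far faster on real images.

```python
# 3x3 Sobel kernels for horizontal (GX) and vertical (GY) intensity change.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Return the gradient magnitude for interior pixels of a 2-D grayscale grid.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# Toy image: dark left half, bright right half, so there is one vertical edge.
image = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel_magnitude(image)
print(edges[1])
```

The large values in the interior columns mark where the brightness changes; a flat image would give zeros everywhere.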

karmic valley Mar 16, 2022, 3:43 PM

#

misty flint look at various edge detection methods like sobel or canny. if those arent fine-...

Okay I'll try to read up on this. Can I ask you again once I read up

misty flint Mar 16, 2022, 3:45 PM

#

im not a CV person, so im not the best person to ask sorry. i really didnt like my image processing/feature engineering class kekHands

silver sun Mar 16, 2022, 4:15 PM

#

lapis sequoia hm seems like you are either giving the wrong input or you need to edit architec...

My X train input has shape (700227, 8) and my y train input has shape (700227, 11). Here is my model architecture:

model.add(Dense(2000, activation='relu', input_dim=24))
model.add(Dense(1500, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(800, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(400, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(150, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(12, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

My error is:

ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 24), found shape=(None, 8)
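
The error message already points at the cause: the first layer was told to expect 24 features, but the data has 8. The last layer has a similar mismatch (12 outputs against 11 label columns). A small sanity check like the sketch below can catch both before training; `shape_problems` is a hypothetical helper written for this example, not part of Keras.

```python
def shape_problems(x_shape, y_shape, input_dim, output_units):
    """Return human-readable mismatches between the data and the layer sizes."""
    problems = []
    if x_shape[0] != y_shape[0]:
        problems.append(f"X has {x_shape[0]} rows but y has {y_shape[0]}")
    if x_shape[1] != input_dim:
        problems.append(
            f"first layer expects {input_dim} features, X has {x_shape[1]}")
    if y_shape[1] != output_units:
        problems.append(
            f"last layer has {output_units} units, y has {y_shape[1]} columns")
    return problems


# Shapes and layer sizes taken from the message above.
problems = shape_problems((700227, 8), (700227, 11),
                          input_dim=24, output_units=12)
for p in problems:
    print(p)
```

In Keras terms, the usual way to avoid hard-coding these numbers is `input_dim=X_train.shape[1]` on the first `Dense` layer and `Dense(y_train.shape[1], activation='softmax')` on the last.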

karmic valley Mar 16, 2022, 4:22 PM

#

misty flint im not a CV person, so im not the best person to ask sorry. i really didnt like ...

haha no worries!

#

is there any matlab discord server

mint palm Mar 16, 2022, 4:36 PM

#

#

this is a part from github code

#

why are there separate input_file and test_file

#

he later splits the data set into X_train, Y_train and X_test, Y_test

lean kindle Mar 16, 2022, 4:51 PM

#

Hello all, I am trying to extract invoice data from an image of an invoice and export it to an Excel file. But I want to extract only a few fields from the invoice, not the entire invoice. Can anyone please advise how I can do that?

import pytesseract  # pip install pytesseract
from PIL import Image  # pip install Pillow

# set tesseract cmd to be the path to your tesseract engine executable
# (where you installed tesseract from above urls)
# and start doing it

# your saved images on desktop
list_with_many_images = [
    "PartI_Data/Img1.PNG",
    "PartI_Data/Img1.PNG",
    "PartI_Data/Img1.PNG",
]

# create a function that returns the text
def image_to_str(path):
    """Return the text found in an image as a string."""
    return pytesseract.image_to_string(Image.open(path))

# now pure action + csv part
with open("images_content.csv", "w+", encoding="utf-8") as file:
    file.write("ImagePath, ImageText\n")  # newline so the header gets its own row
    for image_path in list_with_many_images:
        text = image_to_str(image_path)
        line = f"{image_path}, {text}\n"
        file.write(line)
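
One possible way to keep only a few fields is to run regular expressions over the OCR text instead of saving all of it. The sketch below assumes the invoice text contains labels like "Invoice No", "Date" and "Total"; the sample text, field names and patterns are all invented for illustration and would need adjusting to a real invoice layout.

```python
import re

# Hypothetical OCR output; a real invoice will need its own patterns.
ocr_text = """ACME Corp
Invoice No: INV-1042
Date: 2022-03-16
Total: $1,250.00
"""

# Field name -> regex with one capture group (all patterns are illustrative).
FIELD_PATTERNS = {
    "invoice_number": r"Invoice\s*No[.:]?\s*(\S+)",
    "date": r"Date[.:]?\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total[.:]?\s*\$?([\d,]+\.\d{2})",
}


def extract_fields(text, patterns=FIELD_PATTERNS):
    """Return {field: captured text or None} for each pattern."""
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        fields[name] = match.group(1) if match else None
    return fields


fields = extract_fields(ocr_text)
print(fields)
```

When writing the result out, the standard library `csv` module is a safer choice than joining strings by hand, since OCR text often contains commas and line breaks that need quoting.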

lapis sequoia Mar 16, 2022, 5:06 PM

#

mint palm ```python import tensorflow.keras from keras.models import Sequential from keras...

and whats the error?

lapis sequoia Mar 16, 2022, 5:07 PM

#

lean kindle Hello all, I am trying to perform invoice data extraction from an image of invoi...

!code first of all

arctic wedgeBOT Mar 16, 2022, 5:07 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

mint palm Mar 16, 2022, 5:11 PM

#

lapis sequoia and whats the error?

hi so i had one doubt, if i apply LabelEncoder on already encoded dataset, will it change anything?

#

i think NO

lapis sequoia Mar 16, 2022, 5:11 PM

#

mint palm i think NO

I'm not familiar with LabelEncoder, what is it?

mint palm Mar 16, 2022, 5:12 PM

#

it's like giving each string a unique integer value, e.g. in a dataset containing sunday to saturday we can substitute 0 to 6
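
On the earlier question of re-encoding data that is already encoded: scikit-learn's `LabelEncoder` maps each distinct value to its position among the sorted distinct values. The pure-Python sketch below imitates only that mapping rule (the function names are made up, and this is not scikit-learn's implementation) to show when a second pass changes anything.

```python
def fit_label_mapping(values):
    """Map each distinct value to its rank among the sorted distinct values."""
    classes = sorted(set(values))
    return {value: index for index, value in enumerate(classes)}


def transform(values, mapping):
    """Replace every value by its integer code."""
    return [mapping[v] for v in values]


# Strings are coded in sorted (alphabetical) order, not calendar order.
days = ["sunday", "monday", "sunday", "saturday"]
day_codes = transform(days, fit_label_mapping(days))

# Codes that already cover 0..n-1 without gaps come out unchanged ...
dense = [0, 1, 2, 3, 4, 5, 6, 3]
dense_again = transform(dense, fit_label_mapping(dense))

# ... but codes with gaps get renumbered.
sparse = [1, 2, 5, 2]
sparse_again = transform(sparse, fit_label_mapping(sparse))

print(day_codes, dense_again, sparse_again)
```

So a second encoding pass is harmless only when the existing codes are already the contiguous integers 0 to n-1.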

#

well leaving that aside for a while, my main issue is i am getting 200% accuracy on running github code without any change

lapis sequoia Mar 16, 2022, 5:14 PM

#

mint palm its like giving unique value to represent a string like in dataset containing su...

200% ACCURACY!!!!

#

what in the world

mint palm Mar 16, 2022, 5:14 PM

#

100

#

sorry

#

lmao

#

its 100%

#

it seems like my test set is a part of train set
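
A quick way to test that suspicion is to count how many test rows also occur, value for value, in the training rows. This is a minimal sketch with a made-up helper name and a few rows in the style of the sample posted earlier; with a real CSV the rows would come from the loaded files.

```python
def count_leaked_rows(train_rows, test_rows):
    """Count test rows that appear verbatim in the training rows."""
    train_set = {tuple(row) for row in train_rows}
    return sum(1 for row in test_rows if tuple(row) in train_set)


train = [
    [1, 1, 1, 1, 0, 2, 1, 5, 1],
    [1, 1, 1, 1, 1, 2, 1, 5, 1],
    [1, 1, 1, 1, 2, 2, 1, 5, 1],
]
test = [
    [1, 1, 1, 1, 1, 2, 1, 5, 1],  # duplicate of a training row
    [1, 1, 1, 1, 9, 2, 1, 5, 1],  # not seen in training
]

leaked = count_leaked_rows(train, test)
print(f"{leaked} of {len(test)} test rows also appear in train")
```

A high overlap would explain a suspiciously perfect score, although a dataset with very few distinct rows can also be trivially separable without any leakage.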

lapis sequoia Mar 16, 2022, 5:15 PM

#

lol

mint palm Mar 16, 2022, 5:16 PM

#

https://github.com/adtmv7/DeepSlice/blob/master/Source/DeepSlice

lapis sequoia Mar 16, 2022, 5:16 PM

#

seems like private

mint palm Mar 16, 2022, 5:16 PM

#

no

#

its public

lapis sequoia Mar 16, 2022, 5:16 PM

#

modest shuttle Mar 16, 2022, 5:17 PM

#

Hello, I'm a beginner in AI.
What are the best methods for detecting fake (modified) images? (without CUDA)

serene scaffold Mar 16, 2022, 5:17 PM

#

modest shuttle Hello, I'm beginner in AI, What is best methods for fake image(modified) detecti...

the best method is going to involve CUDA, just so you know.

lapis sequoia Mar 16, 2022, 5:17 PM

#

serene scaffold the best method is going to involve CUDA, just so you know.

hm isn't cuda just for GPU?

serene scaffold Mar 16, 2022, 5:17 PM

#

lapis sequoia hm isn't cuda just for GPU?

yes

mint palm Mar 16, 2022, 5:17 PM

#

https://github.com/adtmv7/DeepSlice/blob/master/Source/DeepSlice.py

GitHub

DeepSlice/DeepSlice.py at master · adtmv7/DeepSlice

DeepSlice: A Deep Learning Approach towards an Efficient and Reliable Network Slicing in 5G Networks - DeepSlice/DeepSlice.py at master · adtmv7/DeepSlice

modest shuttle Mar 16, 2022, 5:18 PM

#

serene scaffold the best method is going to involve CUDA, just so you know.

I know that but i don't have CUDA GPU

lapis sequoia Mar 16, 2022, 5:18 PM

#

so involve as in to make it faster right?

modest shuttle Mar 16, 2022, 5:18 PM

#

serene scaffold the best method is going to involve CUDA, just so you know.

do you have any other suggestion?

serene scaffold Mar 16, 2022, 5:19 PM

#

modest shuttle do you have any other suggestion?

what are the fake images you're trying to detect, anyway?

lapis sequoia Mar 16, 2022, 5:19 PM

#

mint palm https://github.com/adtmv7/DeepSlice/blob/master/Source/DeepSlice.py

if test_size=0.001, thats like...0.1% you do realise how much test data you are giving right?

modest shuttle Mar 16, 2022, 5:19 PM

#

serene scaffold what are the fake images you're trying to detect, anyway?

photoshopped, etc...

serene scaffold Mar 16, 2022, 5:20 PM

#

also, you can experiment with GPU computation using google colab.

lapis sequoia Mar 16, 2022, 5:20 PM

#

^for certain hours

serene scaffold Mar 16, 2022, 5:20 PM

#

modest shuttle photoshped and etc...

just detecting if an image has passed through photoshop in some way is probably going to be too broad.

mint palm Mar 16, 2022, 5:20 PM

#

lapis sequoia if test_size=0.001, thats like...0.1% you do realise how much test data you are ...

yes that's just 65 examples, i don't get what input_file and test_file mean

#

input_file is ok

#

but if he used train_test_split then why does he need test_file

lapis sequoia Mar 16, 2022, 5:21 PM

#

oh yes so if you have different testdata why are you bothering to split train data?

mint palm Mar 16, 2022, 5:22 PM

#

lapis sequoia oh yes so if you have different testdata why are you bothering to split train da...

i am not bothering, the git user is!

lapis sequoia Mar 16, 2022, 5:23 PM

#

ok so lets see, I'll assume he has just split train data to get some scores, but if you do look below, he has commented out the part

#from sklearn.metrics import confusion_matrix
#y_pred_keras = model.predict_classes(XT)

#csv = open("C:\\DeepSlice\\5G\\output.csv", "w")
#"w" indicates that you're writing strings to the file

#pd.DataFrame(y_pred_keras).to_csv("C:\\DeepSlice\\5G\\output.csv")
#cm = confusion_matrix(Y_test, y_pred_keras, labels=[0, 1, 2])

#

which is for...testing i assume

#

now why he did above thing, well thats the thing I am unaware of.

mint palm Mar 16, 2022, 5:24 PM

#

so we are stuck

lapis sequoia Mar 16, 2022, 5:25 PM

#

you* are lemon_pensive

mint palm Mar 16, 2022, 5:25 PM

#

lapis sequoia ok so lets see, I'll assume `he` has just split train data to get some scores, b...

i also tried to train for test_size=0.5,

#

it still gives 100 %accuracy

lapis sequoia Mar 16, 2022, 5:26 PM

#

i mean... it literally depends on dataset and the problem, just for the sake of argument, give it like 0.1?

#

also just ping me here if you ask another question, I'm reading a novel.

mint palm Mar 16, 2022, 5:27 PM

#

lapis sequoia i mean... it literally depends on dataset and the problem, just for the sake of ...

in 5/16 epoch it had 100% acc

#

lmao

#

lapis sequoia Mar 16, 2022, 5:28 PM

#

i mean, i don't know what the problem is, maybe it could be solved by some linear function for all i know.

mint palm Mar 16, 2022, 5:29 PM

#

can't be....an author mentioned that after applying CNN + LSTM it gave 95% accuracy

#

how can it be 100 with linear function lmao

lapis sequoia Mar 16, 2022, 5:30 PM

#

mint palm how can it be 100 with linear function lmao

as i said, i am not aware of even the problem.

mint palm Mar 16, 2022, 5:31 PM

#

yeah i understand, i am not complaining, you are fair at your place

exotic thicket Mar 16, 2022, 7:07 PM

#

#

How come the solution got (20,20)? Would someone mind helping me with this solution?

agile cobalt Mar 16, 2022, 7:12 PM

#

that isn't really fit for this channel
but either way, it's just (2+2, 3+1) = (4,4) then (4*5, 4*5) = (20, 20).
You could've asked in a general help channel though (#❓｜how-to-get-help)

barren vigil Mar 16, 2022, 7:34 PM

#

hello sorry which chat talks about computational neuroscience?

#

૮₍ ˶•⤙•˶ ₎ა
./づ~ 🍓

karmic valley Mar 16, 2022, 7:44 PM

#

could anyone help me with something. i want to get the pixel whiteness (in terms of greyscale) from bottom to top along a single imaginary line in my image. so i get many values of numbers

serene scaffold Mar 16, 2022, 7:45 PM

#

karmic valley could anyone help me with something. i want to get the pixel whiteness (in terms...

if the image is already a 2d array of grayscale values, then the pixel "whiteness" along a given vertical or horizontal line would just be one column or row of the array.

karmic valley Mar 16, 2022, 7:52 PM

#

oh okay, would you be able to help me with the code

serene scaffold Mar 16, 2022, 7:52 PM

#

I can't do that rn. sorry.

karmic valley Mar 16, 2022, 7:53 PM

#

sure, no problem. would i be able to come back at a time that suits you to get more help

serene scaffold Mar 16, 2022, 7:56 PM

#

karmic valley sure, no problem. would i be able to come back at a time that suits you to get m...

It's never guaranteed that I'm available at any particular time. But if you loaded the image with PIL, you can read it into a numpy array. If you get a 3d array, that means one of the dimensions is RGB values, so you'd have to look into how to convert that to a 2d array of grayscale values.
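
The two steps described above (collapse RGB to grayscale, then read one column from bottom to top) can be sketched without any libraries. The tiny image and both function names below are assumptions for illustration; the weights are the standard ITU-R BT.601 luma weights.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def column_profile_bottom_to_top(gray_image, x):
    """Return the values of column x, starting at the bottom row."""
    return [row[x] for row in reversed(gray_image)]


# Toy 3x2 RGB image: row 0 is the top of the picture.
rgb = [
    [(0, 0, 0), (255, 255, 255)],
    [(10, 10, 10), (128, 128, 128)],
    [(20, 20, 20), (0, 0, 0)],
]

gray = to_grayscale(rgb)
profile = column_profile_bottom_to_top(gray, x=1)
print([round(v) for v in profile])
```

With Pillow and NumPy the same idea is usually written as `np.asarray(Image.open(path).convert("L"))[::-1, x]`, where `path` and `x` stand for your own file and column.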

iron basalt Mar 16, 2022, 8:20 PM

#

grave frost wait what - TPUs merging with Apple Silicon?

They call them neural cores or whatever, but the thing is that you can't just write programs for them without using their high level ML library which is Pytorch-like (and can convert Pytorch, TF models). Which is limiting because for those who want to run their Pytorch models and such there are missing functions (and you just have to hope that Apple adds the missing stuff), and for those that want to just get as much compute as possible out of the SOC, they would have to now hack on this high level API rather than just being able to generate instructions for it directly. People are already actively reverse engineering it, but there is not really any (good) reason for Apple to make it this painful.

#

(The neural cores are more or less just CUDA-like cores ripped out of the GPU)

#

In addition, in the high level API, you can't control where the ANN runs. Apple's driver decides it dynamically and can place it either on the CPU, GPU, or neural cores. This might sound nice, but in practice programmers often know where they want it to run and the driver will just make everything worse by trying to be smart (many have already run into this issue). It would be fine if that was the default setting and you can still force it to run where you want.

minor elbow Mar 16, 2022, 8:44 PM

#

Apple hardware has not really been that good for scientific computing. It partly stems from the apple/nvidia rift and NV being completely ahead of the game wrt CUDA, but also historically the higher end apple hardware has been designed to be really good at photo/video work

#

i say this as a huge apple fanboy who just got an m1 pro macbook

#

i think they are broadening things tho with the arm/m1 stuff

grave frost Mar 16, 2022, 8:46 PM

#

iron basalt In addition, in the high level API, you can't control where the ANN runs. Apple'...

I do agree but you don't have these restrictions on XLA at all, so I don't see any basis for arguing - they're pretty flexible for training large DL models. Its not like anything is closed source 🤔 one could still compile and execute custom operations

minor elbow Mar 16, 2022, 8:48 PM

#

its still very early days for desktop ARM. As I see it building massive deep learning models is still a very niche thing and most orgs are going to either do it via cloud or dedicated specialized hardware

#

also ipython nb are for sharing/documenting research output for other ppl, they arent good as units of computation imo

#

stuff like anaconda are obsolete and shouldnt be used IMO

#

it seems to me if you want to do serious GPU compute, you want to use CUDA cause thats what all the libraries support best, which means you're going to use Nvidia hw, which rules out apple.

#

I am not sure how correct that view is

iron basalt Mar 16, 2022, 8:53 PM

#

grave frost I do agree but you don't have these restrictions on XLA at all, so I don't see a...

Not sure if XLA supports the neural cores on Apple's SOCs, nor if it ever will since it's closed off. Can't just throw LLVM at it (LLVM has to be allowed to target it in the first place). Also some want to do non-ML stuff with the neural cores, because it's compute power that would otherwise be wasted if they are not doing any ML.

grave frost Mar 16, 2022, 8:55 PM

#

there hasn't been any information released about the neural cores AFAIK

iron basalt Mar 16, 2022, 8:56 PM

#

We know that you can use Apple's core-ml lib or whatever it's called which is Pytorch-like and convert Pytorch / TF models to itself. Unofficially people have been reverse engineering it for some time.

minor elbow Mar 16, 2022, 8:56 PM

#

if u want to use other ppls built models coreml is good imo

#

its a decent solution to a tricky problem

grave frost Mar 16, 2022, 8:57 PM

#

and I still don't see how it is relevant to the original point - TPUs are pretty customizable w/ Jax, and nothing like Apple's SOC at all. they have plenty of information available and work directly with XLA as well as the Jax team

iron basalt Mar 16, 2022, 8:57 PM

#

I thought you wanted more info on TPUs being a thing on Apple silicon.

grave frost Mar 16, 2022, 8:58 PM

#

I don't see any operation you can't do with jax, just that it won't be any faster or optimized unless its precision-agnostic

iron basalt Mar 16, 2022, 8:58 PM

#

TPUs are Google's specific terminology for them, but it's more or less the same thing as Apple's neural cores. 16 bit floats and all.

grave frost Mar 16, 2022, 8:58 PM

#

iron basalt I would say that they are everywhere now due to being part of Apple's new SOCs. ...

uhh, no I was actually looking for an elaboration of this 🙂

iron basalt Mar 16, 2022, 8:59 PM

#

TPUs are much more open, but the general hardware idea of them could be considered to be everywhere now that they are in Apple silicon.

#

And will continue to be everywhere like how a GPU is now.

grave frost Mar 16, 2022, 9:00 PM

#

well, again AFAIK fundamentally TPU structure is pretty different to other hardware alternatives

#

its more about the hardware architecture 🤔 rather than direct customizations of the chips themselves - they have different memory systems and other complex stuff which I didn't get

iron basalt Mar 16, 2022, 9:03 PM

#

TPUs do two things well, fast low precision floats for matrix multiplies, and convolutions. They were also designed to fit nicely into their data center racks. Other than that, they are basically just stripped down GPU cores.

#

Apple's neural cores do these two things well also.

#

And are also stripped down GPU cores.

#

So other than name, TPU vs neural core, they are more or less the same thing. My guess is that the rename was to both avoid confusion with Google's stuff and to sell it better.

#

There are probably differences between the two, but they both have the same goal of those fast low precision float operations. And both come from having previously used the GPU and so they are still GPU-like to save R&D time.

misty flint Mar 16, 2022, 9:30 PM

#

may or may not get to use gpt-3 at my company

#

pog

frank moth Mar 17, 2022, 12:52 AM

#

Hello, I am trying to use plot_predict with python 3.10.3 and statsmodels 0.13.2. My advisor ran the exact same code and it worked but when I run it I get the following error. I have tried uninstalling and reinstalling everything 3 times with python 3.9.11, 3.9.9 and statsmodels 0.12.2 which is the version that the advisor uses. None of it is working, how can I get it to work?
Thank you

fig, ax = plt.subplots()
ax = adtrain.loc['2020-05-02':].plot(ax=ax)
fig = result_whole.plot_predict(start = '2021-05-02', end = "2022-02-14", dynamic=True, ax=ax, plot_insample=False)
plt.show()

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [7], in <cell line: 6>()
      4 ax = adtrain.loc['2020-05-02':].plot(ax=ax)
      5 ## fig = result_whole.plot_predict(start = '2020-05-02', end = "2022-02-14", dynamic=True, ax=ax, plot_insample=False)
----> 6 fig = result_whole.plot_predict(start = '2021-05-02', end = "2022-02-14", dynamic=True, ax=ax, plot_insample=False)
      7 plt.show()

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\statsmodels\base\wrapper.py:34, in ResultsWrapper.__getattribute__(self, attr)
     31 except AttributeError:
     32     pass
---> 34 obj = getattr(results, attr)
     35 data = results.model.data
     36 how = self._wrap_attrs.get(attr)

AttributeError: 'ARIMAResults' object has no attribute 'plot_predict'

#

If I try to install statsmodels 0.12.2 with the current python version I get a very long error: ``` note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.```
I found this solution ( https://stackoverflow.com/questions/71009659/note-this-error-originates-from-a-subprocess-and-is-likely-not-a-problem-with ) but Idk what plugin to take from the site. The one from the stackoverflow solution does not work.


eager wedge Mar 17, 2022, 1:21 AM

#

How do I change the learning rate of my CNN using tensorflow?

serene scaffold Mar 17, 2022, 1:22 AM

#

eager wedge How do I change the learning rate of my CNN using tensorflow?

I'm not a tensorflow user, but isn't it one of the parameters when you go to compile the model, or something like that?

eager wedge Mar 17, 2022, 1:22 AM

#

ill check it out

#

Is there a way to check out the current learning rate

serene scaffold Mar 17, 2022, 1:23 AM

#

eager wedge Is there a way to check out the current learning rate

can you show the code that creates the model? so I have an entry point for where I should be looking in the docs.

eager wedge Mar 17, 2022, 1:24 AM

#

cnn = tf.keras.models.Sequential()
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'))
cnn.add(tf.keras.layers.MaxPooling2D())
cnn.add(tf.keras.layers.Conv2D(32, 3, activation='relu'))
cnn.add(tf.keras.layers.MaxPooling2D())
cnn.add(tf.keras.layers.Conv2D(32, 3, activation='relu'))
cnn.add(tf.keras.layers.MaxPooling2D())
cnn.add(tf.keras.layers.Flatten())
cnn.add(tf.keras.layers.Dense(units=255, activation='relu'))
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
cnn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
cnn.fit(x=train_set,validation_data=test_set,epochs=25)

serene scaffold Mar 17, 2022, 1:25 AM

#

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(layers.Dense(64, kernel_initializer='uniform', input_shape=(10,)))
model.add(layers.Activation('softmax'))

opt = keras.optimizers.Adam(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=opt)

I found this.
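
For the question about checking the current learning rate, here is a minimal sketch assuming TensorFlow 2.x with eager execution. It works on a bare optimizer; after `compile`, the same attribute should be reachable as `cnn.optimizer.learning_rate`. If the learning rate was given as a schedule rather than a number, the attribute will not be a plain variable and this sketch does not apply.

```python
import tensorflow as tf

# Build the optimizer explicitly so the learning rate is under our control.
opt = tf.keras.optimizers.Adam(learning_rate=0.01)

# Read the current value (stored as float32, so expect tiny rounding error).
current_lr = float(opt.learning_rate.numpy())
print(current_lr)

# Change it in place, e.g. between two calls to fit().
opt.learning_rate.assign(0.001)
new_lr = float(opt.learning_rate.numpy())
print(new_lr)
```

Passing `optimizer='adam'` as a string, as in the CNN above, uses Adam's default learning rate of 0.001.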

eager wedge Mar 17, 2022, 1:25 AM

#

ok thx

serene scaffold Mar 17, 2022, 1:31 AM

#

tbh I'm not clear on what an "optimizer" is. is it a way of representing backpropagation as an object?

#

has OOP gone too far?

desert oar Mar 17, 2022, 1:39 AM

#

serene scaffold tbh I'm not clear on what an "optimizer" is. is it a way of representing backpro...

no, it's the actual optimization algorithm

#

i guess you could say it's an implementation of the strategy pattern, if you want to think about it in OO terms

#

basically it just changes the weight update algorithm

#

although in principle you could have l-bfgs or something like that, i don't think there's a "stochastic" l-bfgs version, so you'd have to set batch size = training set

serene scaffold Mar 17, 2022, 1:43 AM

#

desert oar basically it just changes the weight update algorithm

interesting. so these are algorithms for updating the weights that are "smarter" than backprop?

desert oar Mar 17, 2022, 1:43 AM

#

no, backprop is backprop

#

"backprop" just is how you calculate the gradient

serene scaffold Mar 17, 2022, 1:44 AM

#

tangerine_think

desert oar Mar 17, 2022, 1:44 AM

#

so backprop is always backprop

#

but how the weights are updated, is the optimizer

#

so SGD is the basic algorithm

serene scaffold Mar 17, 2022, 1:45 AM

#

gradient descent that conforms to a defined probabilistic distribution?!

desert oar Mar 17, 2022, 1:45 AM

#

what do you mean

serene scaffold Mar 17, 2022, 1:46 AM

#

just expanding out "SGD" to have the definition of "stochastic" in it.

desert oar Mar 17, 2022, 1:46 AM

#

oh

serene scaffold Mar 17, 2022, 1:46 AM

#

I'm still listening, if there is more that you were planning to say.

desert oar Mar 17, 2022, 1:47 AM

#

sorry i got pulled into a dota match!

#

ping me in an hour lol

#

tldr you can have fancier weight updates than sgd
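
To illustrate the split described above, the sketch below keeps the gradient computation fixed (standing in for what backprop produces) and swaps only the weight update rule. The toy loss, the function names and the hyperparameters are assumptions chosen for the example, and there are no mini-batches here, so the first rule is plain gradient descent rather than true SGD.

```python
def gradient(w):
    """Gradient of the toy loss L(w) = (w - 3)^2; backprop's job is this number."""
    return 2.0 * (w - 3.0)


def sgd_step(w, grad, lr):
    """Plain update: move against the gradient."""
    return w - lr * grad


def momentum_step(w, velocity, grad, lr, beta=0.9):
    """Fancier update: keep a running velocity and move along it."""
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity


w_sgd = 0.0
w_mom, v = 0.0, 0.0
for _ in range(200):
    w_sgd = sgd_step(w_sgd, gradient(w_sgd), lr=0.1)
    w_mom, v = momentum_step(w_mom, v, gradient(w_mom), lr=0.1)

print(w_sgd, w_mom)  # both end up close to the minimum at w = 3
```

Both loops call the same `gradient` function; only the update differs, which is the part an "optimizer" object encapsulates.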

serene scaffold Mar 17, 2022, 1:48 AM

#

!remind 1h lick the salt

arctic wedgeBOT Mar 17, 2022, 1:48 AM

#

Affirmative!

Your reminder will arrive on <t:1647485281:F>!

ionic palm Mar 17, 2022, 1:50 AM

#

Is simulated annealing in category of machine learning?

serene scaffold Mar 17, 2022, 1:51 AM

#

ionic palm Is simulated annealing in category of machine learning?

I suppose it's kind of like unsupervised learning

desert oar Mar 17, 2022, 1:51 AM

#

it's an optimization algorithm

serene scaffold Mar 17, 2022, 1:52 AM

#

are you playing dota or not?!

ionic palm Mar 17, 2022, 1:54 AM

#

Can it be an algorithm and unsupervised learning at the same time?

iron basalt Mar 17, 2022, 1:56 AM

#

ionic palm Is simulated annealing in category of machine learning?

It's not a category, it's a method.

serene scaffold Mar 17, 2022, 1:57 AM

#

yes. informally, unsupervised learning is when you don't tell it what the answers are.

odd meteor Mar 17, 2022, 1:57 AM

#

ionic palm Can it be an algorithm and unsupervised learning at the same time?

There are various algorithms for unsupervised learning.

desert oar Mar 17, 2022, 1:58 AM

#

@serene scaffold an optimization algorithm in general is an algorithm for finding a local or global maximum or minimum

#

so gradient descent is a general category of optimization algorithms

ionic palm Mar 17, 2022, 1:59 AM

#

Okay a method of unsupervised learning

iron basalt Mar 17, 2022, 1:59 AM

#

Unsupervised is the type of algorithm, not a specific one. There are several.

ionic palm Mar 17, 2022, 2:00 AM

#

Huh, so it's not machine learning then

desert oar Mar 17, 2022, 2:01 AM

#

simulated annealing is another optimization algorithm
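
As a rough illustration of simulated annealing as a general-purpose optimizer, here is a minimal sketch that minimizes a one-dimensional function. The cooling schedule, step size, seed and function name are all arbitrary choices for this example, not a reference implementation.

```python
import math
import random


def simulated_annealing(f, x0, steps=5000, step_size=0.5, t0=1.0, seed=0):
    """Minimize f starting from x0; return (best_x, best_value)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for k in range(steps):
        # Linear cooling; the small constant keeps the temperature above zero.
        temperature = t0 * (1.0 - k / steps) + 1e-9
        candidate = x + rng.uniform(-step_size, step_size)
        f_candidate = f(candidate)
        delta = f_candidate - fx
        # Always accept improvements; accept worse moves with prob exp(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, fx = candidate, f_candidate
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx


best_x, best_fx = simulated_annealing(lambda x: (x - 3.0) ** 2, x0=0.0)
print(best_x, best_fx)
```

Nothing here involves data or labels: it only searches for a minimum of `f`, which is why it is classed as an optimization method rather than a kind of learning.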

iron basalt Mar 17, 2022, 2:01 AM

#

There are categories of machine learning, and there are categories of optimization.

grave frost Mar 17, 2022, 2:01 AM

#

@ionic palm you are probably looking for k-means clustering as an unsupervised algorithm

serene scaffold Mar 17, 2022, 2:01 AM

#

grave frost @ionic palm you are probably looking for k-means clustering as an uns...

Zachary never made this about unsupervised learning. I did.

grave frost Mar 17, 2022, 2:02 AM

#

I was referring to this

Can it be an algorithm and unsupervised learning at the same time?

iron basalt Mar 17, 2022, 2:02 AM

#

So you pick which category of machine learning you want / need. Then it will probably involve some optimization problem which can be solved by picking a method from some category of optimization algorithms.

ionic palm Mar 17, 2022, 2:02 AM

#

Wtf is unsupervised algorithm

grave frost Mar 17, 2022, 2:02 AM

#

no labels

#

say you have a dataset of cats and dogs - in unsupervised learning, you just lump photos that look like cats together and photos that look like dogs together, but you don't know which of the two sets is cats and which is dogs

#

this is an example btw
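
The "lump similar things together without labels" idea can be sketched with a tiny one-dimensional k-means. The data, the function name and the deterministic starting centroids are assumptions made for this example (it needs `k >= 2`); real use would normally go through a library such as scikit-learn.

```python
def kmeans_1d(points, k=2, iterations=10):
    """Group numbers into k clusters; return (centroids, clusters)."""
    # Deterministic start: spread the initial centroids between min and max.
    lo, hi = min(points), max(points)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# No labels anywhere: just numbers that happen to form two groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centroids, clusters = kmeans_1d(points)
print(centroids, clusters)
```

The algorithm finds the two groups, but it has no way of knowing what either group means, which matches the cats-and-dogs description above.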

serene scaffold Mar 17, 2022, 2:03 AM

#

what was your actual question going to be, anyway, @ionic palm?

ionic palm Mar 17, 2022, 2:03 AM

#

Is simulated annealing in category of machine learning?

#

Now I understand: simulated annealing is an optimization method of unsupervised learning

odd meteor Mar 17, 2022, 2:07 AM

#

ionic palm Wtf is unsupervised algorithm

😀 It's a type of Machine Learning. So apparently there are basically 5 types of ML (some even say 4 types)

Supervised Learning
Unsupervised Learning
Semi-Supervised Learning
Self-Supervised Learning
Reinforcement Learning

serene scaffold Mar 17, 2022, 2:08 AM

#

odd meteor 😀 It's a type of Machine Learning. So apparently there are basically 5 types of...

if you ever feel like writing an explanation for each in one comment, let me know and I'll pin it.

odd meteor Mar 17, 2022, 2:09 AM

#

serene scaffold if you ever feel like writing an explanation for each in one comment, let me kno...

Alright I'll do that.

serene scaffold Mar 17, 2022, 2:09 AM

#

odd meteor Alright I'll do that.

no pressure. I say I'm gonna do stuff and then decide not to increasingly lately.

ionic palm Mar 17, 2022, 2:09 AM

#

Im sorry for asking that much, thank you so much

serene scaffold Mar 17, 2022, 2:10 AM

#

ionic palm Im sorry for asking that much, thank you so much

nah, we like talking about this stuff.

#

it's long debugging sessions that are more draining for us

odd meteor Mar 17, 2022, 2:10 AM

#

serene scaffold no pressure. I say I'm gonna do stuff and then decide not to increasingly lately...

Sure! It's 3:00 am here so I'll do that perhaps tonight or tomorrow

iron basalt Mar 17, 2022, 2:23 AM

#

odd meteor 😀 It's a type of Machine Learning. So apparently there are basically 5 types of...

There are more, but these are the best known, so I think this list is fine. Some like to say there are only 3 (supervised, unsupervised, RL). But as time goes on there will probably be even more.

#

(some don't even have proper names yet beyond temporary made up names)

#

(you could argue there are just two: unsupervised (external input only (and those generated by itself)), and additional input / structure / supervised (e.g. labels, embedded vector spaces, knowledge graphs, reward signals, etc. (can be hand crafted)))

#

(but you could also go more extreme and say it's all just inputs, which is not very useful because it does not distinguish anything (how many clusters do I choose?))

arctic wedgeBOT Mar 17, 2022, 2:48 AM

#

serene scaffold !remind 1h lick the salt

It has arrived!

Here's your reminder: lick the salt
[Jump back to when you created the reminder](#data-science-and-ml message)

misty flint Mar 17, 2022, 3:36 AM

#

as much as people seem to dislike the idea of full stack DS, i really think there is power in being able to prototype stuff, especially "data apps"

#

thats my hot take for tonight kekHands

serene scaffold Mar 17, 2022, 3:38 AM

#

@misty flint if full stack were applied to DS, I would think it should mean "a data scientist who is also a software engineer"

#

Because there are a lot of data scientists who shit out terrible code. Especially if they only use notebooks.

#

There's my hot take as well.

iron basalt Mar 17, 2022, 3:39 AM

#

misty flint as much as people seem to dislike the idea of full stack DS, i really think ther...

Why would I hire someone that can only do DS versus someone that can do DS and write an app?

#

(Assuming they do DS just as well)

serene scaffold Mar 17, 2022, 3:40 AM

#

I'm not sure what rex means by disliking the concept of full stack ds

misty flint Mar 17, 2022, 3:41 AM

#

PikaThink

#

im trying to understand that myself

#

it seems like those who are more "A-type Analyst" data scientists dislike the concept of full stack

serene scaffold Mar 17, 2022, 3:41 AM

#

There are a lot of concepts I hate as well. Including Keurig and Applebee's.

misty flint Mar 17, 2022, 3:41 AM

#

but the "B-type Builder" data scientists obv are all for it

mint palm Mar 17, 2022, 3:41 AM

#

(X_train, X_test, Y_train, Y_test) = train_test_split(X, dummy_y, test_size=0.001, random_state=seed)
does this shuffle the dataset before dividing it between train and test?

serene scaffold Mar 17, 2022, 3:43 AM

#

@mint palm the docs probably specify if the partitions are random, arbitrary, or deterministic.

misty flint Mar 17, 2022, 3:45 AM

#

i think full-stack DS can prove business value to average companies much easier than someone who doesnt; i think maybe as this field develops more they can help add ML features to apps and such (outside of data-driven orgs)

#

otherwise you have business people always asking DS whats their business value

#

since many dont truly understand the concept of R&D

#

kekHands

#

and experimentation

#

"what did you do this quarter?"
"...our experiments were inconclusive"
"..."

#

kekHands

iron basalt Mar 17, 2022, 3:47 AM

#

I'm not sure who would be opposed to this. If you can do more work for me then sure, go ahead.

misty flint Mar 17, 2022, 3:48 AM

#

DoggoKek

#

oh, im all for this squiggle

lone merlin Mar 17, 2022, 3:48 AM

#

I am new to the data science world. I want to ask some questions. If I want to be Business Analytics Specialist, is it better to focus on the ML aspects or I need to learn something else? RIght now I am still an undergrad student at math. Thank you!

iron basalt Mar 17, 2022, 3:48 AM

#

iron basalt "jack of all trades master of none" - lame, self limiting phrase, "jack of all t...

.

misty flint Mar 17, 2022, 3:49 AM

#

T-shaped for life

iron basalt Mar 17, 2022, 3:50 AM

#

Is this your chart?

ionic palm Mar 17, 2022, 3:50 AM

#

Sadly.. yes

misty flint Mar 17, 2022, 3:50 AM

#

lone merlin I am new to the data science world. I want to ask some questions. If I want to ...

Business Analytics doesnt necessarily use ML. they might use some simpler statistical models like linear regression, etc; usually these "business" titles require some domain expertise in the same industry (i.e. real estate experience, logistics, etc.)

iron basalt Mar 17, 2022, 3:50 AM

#

Simulated annealing is outside all of that, in another bubble, under "optimization".

misty flint Mar 17, 2022, 3:51 AM

#

kekHands

#

optimization, oof

#

thats dissertation stuff right there

#

RunFail

ionic palm Mar 17, 2022, 3:51 AM

#

I am so sorry

iron basalt Mar 17, 2022, 3:51 AM

#

What are you sorry for?

#

Getting any kind of set chart with intersections, subsets, and such will be difficult and not really work for this.

#

Not everything fits in those.

misty flint Mar 17, 2022, 3:53 AM

#

as humans, we like to put everything into boxes PikaThink

iron basalt Mar 17, 2022, 3:54 AM

#

It's more like pick type of ML, then pick algorithm in that type, and that may involve annealing, etc, which is its own separate thing.

ionic palm Mar 17, 2022, 3:55 AM

#

ML Type + Algorithm = ML?

iron basalt Mar 17, 2022, 3:56 AM

#

What you had is kind of like putting linear algebra under chemistry. Chemistry makes use of it, but it's not a subset of it, linear algebra exists on its own.

iron basalt Mar 17, 2022, 3:57 AM

#

ionic palm ML Type + Algorithm = ML?

ML is any program that "learns", which basically means it starts out with some base amount of knowledge and gets more data over time.

#

Note that this is super loose and can pretty much include any program that stores inputs or information about those inputs.

#

A really simple example is a program that takes two numbers as input, X and Y.

#

The program then stores those as pairs.

#

And if you give it one, it can give you the other.

#

It learned to associate them.
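
A minimal sketch of that toy "learner", assuming plain Python dictionaries are enough to hold the pairs (the `PairMemory` class and its method names are made up for illustration):

```python
class PairMemory:
    """Toy 'learner': memorizes (x, y) pairs and recalls one from the other."""

    def __init__(self):
        self._x_to_y = {}
        self._y_to_x = {}

    def learn(self, x, y):
        # "Training" is just storing the association in both directions.
        self._x_to_y[x] = y
        self._y_to_x[y] = x

    def recall(self, value):
        # Look the value up as an x first, then as a y; None if never seen.
        if value in self._x_to_y:
            return self._x_to_y[value]
        return self._y_to_x.get(value)


memory = PairMemory()
memory.learn(3, 9)
memory.learn(4, 16)

print(memory.recall(3))   # the y stored with x = 3
print(memory.recall(16))  # the x stored with y = 16
print(memory.recall(5))   # never seen, so nothing to recall
```

This only memorizes. It has nothing to say about an input it has never seen.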

#

The complexity of ML is how to do more with less. And how to infer stuff based on what was stored.

#

And also what if your input is noisy / not exact? Etc.

inland zephyr Mar 17, 2022, 4:01 AM

#

DS research the model

#

ML Engineer make it to production level

iron basalt Mar 17, 2022, 4:01 AM

#

What if the input is too complex to really deal with directly (e.g. an image)?

inland zephyr Mar 17, 2022, 4:02 AM

#

There is still confusion about the job responsibilities and capabilities for ML and DS

iron basalt Mar 17, 2022, 4:02 AM

#

(how do you store the relevant stuff about it?)

#

ML is about machine learning, nothing more and nothing less. DS often involves standard statistics, forecasting, business stuff. DS can make use of ML since ML happens to also often make use of statistics to function well.

#

DS is more like, I have this job, and I need to pick the right tools and such. That might include picking an ML based tool.

inland zephyr Mar 17, 2022, 4:04 AM

#

from my previous code
i hate that big gap between rows... now find the way to tighten the gap

inland zephyr Mar 17, 2022, 4:04 AM

#

iron basalt DS is more like, I have this job, and I need to pick the right tools and such. T...

i thought MLE just the implementer of a model...

iron basalt Mar 17, 2022, 4:05 AM

#

inland zephyr i thought MLE just the implementer of a model...

Yeah they implement the models, and then some DS somewhere might find it applicable to their task.

#

However, many DS are also often MLEs, etc. People are not limited to one thing, it's just their job title.

inland zephyr Mar 17, 2022, 4:06 AM

#

oh now i get the point

misty flint Mar 17, 2022, 4:06 AM

#

also this field is so new that the boundaries tend to blur quite often

inland zephyr Mar 17, 2022, 4:06 AM

#

DS in title
full stack Data and Modelling is the real job desk

misty flint Mar 17, 2022, 4:06 AM

#

and it differs per company too

iron basalt Mar 17, 2022, 4:06 AM

#

misty flint also this field is so new that the boundaries tend to blur quite often

Yes, all of this is fuzzy sets really, so don't worry too much about it.

inland zephyr Mar 17, 2022, 4:07 AM

#

The more distinctive is DS and DA

#

DS could also handle job of MLE

#

and sometimes DA too

iron basalt Mar 17, 2022, 4:07 AM

#

Yes, DS is broad.

misty flint Mar 17, 2022, 4:08 AM

#

for companies, i think in general for a "data team" if they can cover the majority of the skills between all the roles, i think thats sufficient for most business use cases

iron basalt Mar 17, 2022, 4:08 AM

#

(in part due to companies not knowing what DS is and often just want a statistician that knows Python or R)

misty flint Mar 17, 2022, 4:08 AM

#

obv if you are a data-driven SaaS company, thats very dif then

iron basalt Mar 17, 2022, 4:09 AM

#

(but also acts somewhat as an accountant? idk, it's weird)

misty flint Mar 17, 2022, 4:09 AM

#

iron basalt (but also acts somewhat as an accountant? idk, it's weird)

(and for some reason reports to the CFO. strange)

inland zephyr Mar 17, 2022, 4:09 AM

#

iron basalt (in part due to companies not knowing what DS is and often just want a statistic...

at least could operate Excel fluently

lone merlin Mar 17, 2022, 4:09 AM

#

I rarely find a job title that uses 'Machine Learning Engineer'. Sometimes it's also business analytics, and many people don't know what Business Analytics means. That's why I'm often confused, haha

mint palm Mar 17, 2022, 4:10 AM

#

serene scaffold @mint palm the docs probably specify if the partitions are random, ar...

i read the docs,
random_state = int is for reproducing same division
shuffle = bool is for shuffling the dataset
but i didnt use shuffle, and its still shuffled

misty flint Mar 17, 2022, 4:10 AM

#

misty flint (and for some reason reports to the CFO. strange)

i heard about this on a podcast today and it comes from business just not understanding data teams kekHands

inland zephyr Mar 17, 2022, 4:10 AM

#

I find an MLE job
and it need higher degree like Master or PhD

iron basalt Mar 17, 2022, 4:11 AM

#

I mean really it's a nebulous "handles data" person. Which often involves statistics, some spreadsheets (or getting stuff from databases / any tables), and some graphing.

#

(from what the companies know / POV)

misty flint Mar 17, 2022, 4:13 AM

#

inland zephyr I find an MLE job and it need higher degree like Master or PhD

many times companies look for graduate degrees for their DS roles too

#

DoggoKek

iron basalt Mar 17, 2022, 4:15 AM

#

inland zephyr I find an MLE job and it need higher degree like Master or PhD

Many companies will ask for PhD but there is not really enough competition / supply so if you apply you may get accepted anyhow (without any degree even).

#

They will often make the requirements much bigger than needed.

misty flint Mar 17, 2022, 4:17 AM

#

yknow what i heard about that on a separate podcast

#

that particular element as well as increasing number of years allows companies to do one critical thing for job postings

#

~~decrease the amount of job applicants~~

#

kekHands

#

dunno if thats actually true but thats what the podcast guest advocated for

#

hes a director-level so maybe Oopsies

misty flint Mar 17, 2022, 5:13 AM

#

image_f46958c9-7b83-4671-a459-7a5ff05d35c520220317_001250.jpg

#

interesting

#

pithink

#

i do want to try out airflow sometime

#

get bit familiar with it

#

see if i can use it for this one project

glossy hinge Mar 17, 2022, 5:31 AM

#

can anyone suggest me a good YouTube video on reinforcement learning ?

lapis sequoia Mar 17, 2022, 5:39 AM

#

misty flint

where did you draw this?

#

i need a nice platform, I'm using draw.io but I dont know...it seems okay.

tough barn Mar 17, 2022, 6:54 AM

#

please anyone help me with this 2 csv , I want a single row of each which contains only numeric data so that i can again convert that into another and use to test my model, I am getting this type of csv as a feature extracted from a package. So please help me out in this. Thank you

📎 t2.csv 📎 t3.csv

#

I tried but it gives weird output and in t2.csv unable to open using pandas and also unable to convert to float from string

tacit basin Mar 17, 2022, 7:08 AM

#

tough barn please anyone help me with this 2 csv , I want a single row of each which contai...

each you mean c(...) ?
based on the input can you show what you want as output?

tough barn Mar 17, 2022, 7:38 AM

#

like 0.0406, 0.0363, 0.0278, 0.0206, 0.1041, -0.0145, -6e-04, 0.0654, 0.04, 0.086, 0.0775, 0.0018, 0.0285, 0.109, 0.0569, 0.0169, 0.0484, 0.161, 0.0248, 0.0696, 0.0285, 0.0367, 0.0438, 0.0269, 0.0758, 0.0389, 0.0049, 0.0367, 0.0325, 0.0796, 0.0778, 0.0334, 0.0589, 0.0939, 0.0919, 0.026, 0.0331, 0.0943, 0.0247, 0.0616, 0.014, 0.0314, 0.0409, 0.0419, 0.0949, 0.0409, 0.0249, 0.0614, 0.0345, 0.066, 0.0485, 0.0438, 0.031, 0.0688, 0.1064, 0.0406, 0.0488, 0.0868, 0.0314, 0.0431, 0.0329, 0.0514, 0.0432, 0.0533,
0.0747, 0.0552, 0.0489, 0.0638, 0.045, 0.0484, 0.031, 0.0579, 0.0085, 0.0498, 0.1074, 0.0454, 0.0442, 0.0902, 0.0173, 0.0316, 0.0124, 0.0327, 0.0582, 0.0438, 0.116, 0.0352, 0.029, 0.0849, 0.0306, 0.0418, 0.0375, 0.0412, 0.0265, 0.0628, 0.0717, 0.0515, 0.0487, 0.0904, 0.0454, 0.0399, 0.0316, 0.0517, 0.0459, 0.0304, 0.0781, 0.0454, 0.0245, 0.0442, 0.0446, 0.0643, 0.0492, 0.0501, 0.0239, 0.0616, 0.0838, 0.0381, 0.0484, 0.1091, 0.0281, 0.0469, 0.0461, 0.0619, 0.0493, 0.0503, 0.0629, 0.0572, 0.0522, 0.0611, ...... in a single row without c() for t2.csv and t3.csv

#

separate output for t2 and t3

#

like c(1,2,3,4),c(5,6,7,8,..)
should give output as 1,2,3,4,5,6,7,8...

tacit basin Mar 17, 2022, 7:52 AM

#

tough barn separate output for t2 and t3

what is t2 and t3?

tough barn Mar 17, 2022, 7:53 AM

#

diff files t2.csv and t3.csv

#

which are uploaded

tacit basin Mar 17, 2022, 8:26 AM

#

tough barn which are uploaded

not the most elegant solution, but you can:
split on ","
replace "c(" with ""
replace ")" with ""
change all items to float

tough barn Mar 17, 2022, 8:37 AM

#

okay I have to try this, if you have code ref you can share

tacit basin Mar 17, 2022, 8:44 AM

#

tough barn okay I have to try this, if you have code ref you can share

its ugly but should work 😉
t2a = t2.split(",")
t2b = [item.replace("c(", "") for item in t2a]
t2c = [item.replace(")", "") for item in t2b]
t2d = [float(item) for item in t2c[3:]]
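
A slightly more defensive sketch of the same idea, assuming the file holds R-style `c(...)` vectors mixed with a few non-numeric fields such as headers (the real layout of t2.csv and t3.csv is not shown here, so the sample strings and the `flatten_r_vectors` name are made up):

```python
def flatten_r_vectors(text):
    """Strip R-style c(...) wrappers and return all numbers as one flat list."""
    cleaned = text.replace("c(", "").replace(")", "")
    values = []
    # Treat line breaks like commas so several rows collapse into one.
    for token in cleaned.replace("\n", ",").split(","):
        token = token.strip().strip('"')
        if not token:
            continue
        try:
            values.append(float(token))
        except ValueError:
            pass  # skip headers or other non-numeric fields
    return values


print(flatten_r_vectors("c(1,2,3,4),c(5,6,7,8)"))
print(flatten_r_vectors('"x"\n"c(0.0406, -6e-04)"'))
```

Skipping anything that fails `float()` avoids hard-coding how many leading items to drop, at the cost of silently ignoring malformed numbers.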

tacit basin Mar 17, 2022, 8:45 AM

#

tough barn okay I have to try this, if you have code ref you can share

also asking in the general channel may help, but probably simplify the input a bit, just to show the format

tough barn Mar 17, 2022, 8:45 AM

#

can we perform t2.split() directly on csv

tacit basin Mar 17, 2022, 8:45 AM

#

tough barn can we perform t2.split() directly on csv

i meant to read file first

tough barn Mar 17, 2022, 8:45 AM

#

by which func

tacit basin Mar 17, 2022, 8:47 AM

#

tough barn by which func

with open(file) as f:
    t2 = f.read()

tough barn Mar 17, 2022, 8:47 AM

#

thank you for saving my day

#

🤝

mint palm Mar 17, 2022, 9:09 AM

#

@lapis sequoia

#

this is the unencoded matrix

#

i found it
the green box in Y(prediction)

#

Y can be either eMBB, URLLC, mMTC

#

as you can see the rest0(X[0:8,:]) are categorical data, thats why author might have encoded it

stark breach Mar 17, 2022, 9:16 AM

#

Hey, I wanted to begin ML but don't know where to start, or whether I should do algorithms and data structures and delve into competitive coding first

#

Also i don't comprehend uni level math

#

like the stuff in andrew Ng

lapis sequoia Mar 17, 2022, 9:26 AM

#

mint palm as you can see the rest0(``X[0:8,:]``) are categorical data, thats why author mi...

hm?

mint palm Mar 17, 2022, 9:27 AM

#

yes

lapis sequoia Mar 17, 2022, 9:28 AM

#

yes what?

#

I'm sorry I don't even remember what was the question. it was all yesterday.

mint palm Mar 17, 2022, 9:29 AM

#

ok, the problem was:
is ok to encode X before solving

mint palm Mar 17, 2022, 9:29 AM

#

mint palm @lapis sequoia

yesterday i only had encoded data but now i have this description of dataaset

exotic thicket Mar 17, 2022, 9:58 AM

#

#

How come the 2nd question got an answer of 77mm? Would someone mind interpreting it?

odd meteor Mar 17, 2022, 12:02 PM

#

mint palm ``(X_train, X_test, Y_train, Y_test) = train_test_split(X, dummy_y, test_size=0....

Yes it does. By default, it shuffles your sample observations before splitting them into train and test sets. You can also disable shuffling prior to splitting.

train_test_split(x, y, test_size =0.15, shuffle=False, random_state = 2022)

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
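
A quick way to check this yourself, sketched on made-up toy arrays (the variable names are only for illustration). Note that `random_state` only makes the shuffle reproducible; with `shuffle=False` there is nothing random left for it to control.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)  # 10 toy samples, 2 features each
y = np.arange(10)                 # label i belongs to row i

# Default behaviour (shuffle=True): rows are permuted before the split.
# The same random_state gives the same permutation on every call.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
_, _, _, y_te_again = train_test_split(X, y, test_size=0.3, random_state=0)

# shuffle=False: the first rows go to train and the last rows to test.
X_tr_o, X_te_o, y_tr_o, y_te_o = train_test_split(
    X, y, test_size=0.3, shuffle=False
)

print("shuffled test labels:", y_te)
print("ordered test labels: ", y_te_o)
```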


odd meteor Mar 17, 2022, 12:10 PM

#

lone merlin I am new to the data science world. I want to ask some questions. If I want to ...

Where I'm from, you basically need to be good in SQL, Excel, PowerBI and/or Tableau. Any other skill like python, etc is an added advantage.

So, you really don't need to focus on ML to excel in your job as a Business Analyst.

iron shell Mar 17, 2022, 12:18 PM

#

Hey, I have a doubt. I was training a model and split the data into a train part and a validation part, and I normalized both. I also have another dataset to use as a test set that simulates "real data". Should I normalize it too?

exotic thicket Mar 17, 2022, 12:25 PM

#

@lone merlin hello dude do u know any best discord communities on mathematics I had stuck with linear algebra, coordinate geometry, a random process that is all based on problems on computer vision and image processing problems.. Plz would u mind helping me to get what I'm looking for desperately since I took the CV and IP course which is all abt mathematical and physical underpinnings

odd meteor Mar 17, 2022, 12:29 PM

#

stark breach Hey ,wanted to begin ML ,don't know where to start and if i should to algorithms...

I think a large number of people who recently got into ML had at some point used the popular Andrew NG's Machine Learning course on Coursera.

You can start from there and see if that works for you. However, if you happen to be like me, who was practically dozing off each time I went beyond 20 mins in Andrew Ng's course, then feel free to drop the course and try Udemy, DataQuest, DataCamp, books, a bootcamp, etc.

I guess the moral of my story is: try as many resources as possible before you eventually settle on one, and waste no time dropping any material that doesn't work for you.

Only after you've gained some form of experience in learning ML would you truly appreciate using Hackathons to validate and reinforce what you've learned.

All the best ✌️

exotic thicket Mar 17, 2022, 12:31 PM

#

@odd meteor @catcatgurl hello dude do u know any best discord communities on mathematics I had stuck with linear algebra, coordinate geometry, a random process that is all based on problems on computer vision and image processing problems.. Plz would u mind helping me to get what I'm looking for desperately since I took the CV and IP course which is all abt mathematical and physical underpinnings

jaunty mural Mar 17, 2022, 12:33 PM

#

any suggestion, how to improve looking this graph?

#

and this one


odd meteor Mar 17, 2022, 12:35 PM

#

iron shell Hey I'm in doubt, I was training a model and I used a split part to be train and...

Hi Patrick, what is sauce for the goose is sauce for the gander. To ensure equity and fairness, and to give your new data an equal playing ground, yes it's necessary!

Assume you have done all the preprocessing and wrapped it in a pipeline; even your new test dataset is supposed to pass through the same furnace (pipeline) to get it into its most useful state for your ML model.
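
One detail worth spelling out: "the same pipeline" usually means reusing the statistics learned from the training split, not recomputing them on the new data. A minimal sketch with NumPy only, on made-up random data (the array names and the `normalize` helper are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
valid = rng.normal(loc=50.0, scale=10.0, size=(20, 3))
test = rng.normal(loc=50.0, scale=10.0, size=(20, 3))  # stands in for "real data"

# Learn the scaling statistics from the training split only.
mean = train.mean(axis=0)
std = train.std(axis=0)


def normalize(data):
    """Apply the training-set mean and std to any split."""
    return (data - mean) / std


train_n, valid_n, test_n = normalize(train), normalize(valid), normalize(test)

print(train_n.mean(axis=0).round(6))  # essentially zero by construction
print(test_n.mean(axis=0).round(2))   # close to zero, but not exactly
```

The test split ends up only approximately centered, which is expected: the model sees it on the same scale it was trained on.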

nimble jolt Mar 17, 2022, 12:35 PM

#

Hey guys, I'm a beginner in python and I have some module installation problems. I don't know how to solve them, can any of you help?

#

Or is this not the place to ask

odd meteor Mar 17, 2022, 12:38 PM

#

exotic thicket @odd meteor @catcatgurl hello dude do u know any best discord communit...

Unfortunately, I don't know any. However, I'm sure Salt and Stelercus would have a better answer to your question.

odd meteor Mar 17, 2022, 12:40 PM

#

nimble jolt Hey guys I'm a beginner in python and I have some module installation problems,I...

What's the name of the module and what error message did you get?

nimble jolt Mar 17, 2022, 12:41 PM

#

The module name is pywhatkit

#

The error message is line 3, in <module>import pywhatkit

odd meteor Mar 17, 2022, 12:43 PM

#

nimble jolt The module name is pywhatkit

Forgive my ignorance and laziness to use google, what's the work of this library?

nimble jolt Mar 17, 2022, 12:44 PM

#

I don't understand still a beginner

#

Can you explain in simpler terms

#

plz

jaunty mural Mar 17, 2022, 12:48 PM

#

nimble jolt I don't understand still a beginner

wrong channel, go to #python-discussion. and you need to install this module:
run pip install pywhatkit

odd meteor Mar 17, 2022, 12:48 PM

#

nimble jolt The error message is line 3, in <module>import pywhatkit

1. Did you successfully install this library on your machine?
2. If the answer to #1 is yes, then try to add the location where this library was downloaded on your machine to your PATH.
3. Alternatively, if you use Anaconda, you can also use conda to install the library.

nimble jolt Mar 17, 2022, 12:48 PM

#

Ok

#

But what if i'm using pycharm

jaunty mural Mar 17, 2022, 12:50 PM

#

nimble jolt But what if i'm using pycharm

point to the error and pycharm would suggest you to install it

nimble jolt Mar 17, 2022, 12:50 PM

#

OK

odd meteor Mar 17, 2022, 12:51 PM

#

nimble jolt But what if i'm using pycharm

It doesn't matter. If you used PyCharm to install the library, check if it actually download the library in your default python environment or in a new one entirely.

nimble jolt Mar 17, 2022, 12:51 PM

#

But i've tried installing it in both pycharm and the command prompt and they both say it has been installed.

nimble jolt Mar 17, 2022, 12:52 PM

#

odd meteor It doesn't matter. If you used PyCharm to install the library, check if it actua...

Ok

nimble jolt Mar 17, 2022, 12:52 PM

#

jaunty mural point to the error and pycharm would suggest you to install it

I will retry

#

Oh and thanks for your help.

odd meteor Mar 17, 2022, 1:00 PM

#

jaunty mural any suggestion, how to improve looking this graph?

I don't know if there's any other better way to visualise a 3d dataset other than probably making it interactive. That way you can interact, rotate, and view each dimension of the data with ease.

Isn't this plotly? 😀

If not, try using plotly or cufflinks to plot and visualize the data

sand fossil Mar 17, 2022, 1:01 PM

#

hello peoples

urban lance Mar 17, 2022, 1:02 PM

#

what clustering methods are great for hyperbolic data?

jaunty mural Mar 17, 2022, 1:05 PM

#

odd meteor I don't know if there's any other better way to visualise a 3d dataset other tha...

yeah you're quite right) only some improvments are change alpha value, size and color

exotic thicket Mar 17, 2022, 1:19 PM

#

exotic thicket

@serene scaffold @digital radish plz any of u help me with this problem

jaunty mural Mar 17, 2022, 1:19 PM

#

@odd meteor no, plotly is good for zooming live, but I need this plots to include in my paper, so

exotic thicket Mar 17, 2022, 1:19 PM

#

exotic thicket @odd meteor @catcatgurl hello dude do u know any best discord communit...

@serene scaffold...

ember lark Mar 17, 2022, 1:27 PM

#

anyone here have any experience with pyttsx3 for text to speech? Currently weighing different options for a voice engine for my AI but for windows pyttsx3 uses sapi5 engine which brings robotic voices. Other txt to speech engines I have found need a file to read from to turn it to speech. Does anyone have a recommendation for natural sounding voices using pyttsx3 or a free API that can achieve the same thing?

misty flint Mar 17, 2022, 1:31 PM

#

anyone work with gpt3 before

odd meteor Mar 17, 2022, 1:32 PM

#

jaunty mural @odd meteor no, plotly is good for zooming live, but I need this plot...

If you want to use the plot on paper, I don't suppose the interactivity of the plot would still be possible. If that's the case, then you can visualize your 3D data using Seaborn, Plotly, Cufflinks, etc.

You only need to get the 3d plot then copy it and paste it ( or download and upload the image) on your MSWord or Notion.

serene scaffold Mar 17, 2022, 1:42 PM

#

@exotic thicket it's impolite to ping random people to draw attention to your question. Please refrain in the future--this is a warning.

#

@lapis sequoia did you see what I just said to pari?

lapis sequoia Mar 17, 2022, 1:43 PM

#

serene scaffold @lapis sequoia did you see what I just said to pari?

oh i apologise, i had a question about where did they draw the diagram.

#

I'll delete it if you want.

serene scaffold Mar 17, 2022, 1:44 PM

#

lapis sequoia oh i apologise, i had a question about where did they draw the diagram.

no; never delete a message in which you pinged someone

lapis sequoia Mar 17, 2022, 1:44 PM

#

damn, i just..did.

serene scaffold Mar 17, 2022, 1:44 PM

#

now they've been pinged and they'll never be able to figure out where it came from.

#

anyway, you can ping someone if they've already engaged with your specific question.

lapis sequoia Mar 17, 2022, 1:45 PM

#

serene scaffold anyway, you can ping someone if they've already engaged with your specific quest...

no they have not exactly engaged. I asked them before a while and seen them online now, so wanted to ask, but hm, I'll wait when they are actually typing or something.

lapis sequoia Mar 17, 2022, 1:46 PM

#

jaunty mural any suggestion, how to improve looking this graph?

shouldn't that be velocity? also as much better in terms of what? I can look if we can save in some lossless way or not.

#

high definition was possible in matlab as much i remember.

jaunty mural Mar 17, 2022, 1:47 PM

#

odd meteor If you want to use the plot on paper, I don't suppose the interactivity of the p...

it's fine with matplotlib only. how would seaborn improve the look? it only sets the style. Anyway

jaunty mural Mar 17, 2022, 1:48 PM

#

lapis sequoia shouldn't that be `velocity`? also as much better in terms of what? I can look i...

no it's welocity of magnetic field, not speed velocity)

lapis sequoia Mar 17, 2022, 1:48 PM

#

ah alr, my bad lol.

jaunty mural Mar 17, 2022, 1:48 PM

#

it's ok

lapis sequoia Mar 17, 2022, 1:48 PM

#

so yeah what do you mean by improve here?

jaunty mural Mar 17, 2022, 1:50 PM

#

lapis sequoia so yeah what do you mean by improve here?

good looking, I added dpi property, increased the size of scatter points, added (but commented this line) that changed the angle of view

mint palm Mar 17, 2022, 1:50 PM

#

i used one hot encoding on all Y.
is it correct?

jaunty mural Mar 17, 2022, 1:50 PM

#

changed transparency value

mint palm Mar 17, 2022, 1:50 PM

#

my output isnt one hot though

#

output is still is probability(floats)
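
For what it's worth, probability outputs are what you would normally expect from a classifier trained on one-hot targets; the predicted class is the position of the largest probability. A small sketch with NumPy, using the three class names mentioned earlier and made-up probabilities (nothing here comes from the actual model):

```python
import numpy as np

classes = ["eMBB", "URLLC", "mMTC"]
labels = ["URLLC", "eMBB", "mMTC", "eMBB"]

# One-hot encode the targets: one row per sample, one column per class.
class_ids = np.array([classes.index(name) for name in labels])
one_hot = np.eye(len(classes))[class_ids]

# Made-up model outputs: each row is a probability distribution over classes.
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.5, 0.2],
])

# Turn probabilities back into class names by taking the arg-max per row.
predicted = [classes[i] for i in probs.argmax(axis=1)]
accuracy = float(np.mean(probs.argmax(axis=1) == class_ids))

print(one_hot)
print(predicted, accuracy)
```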

lapis sequoia Mar 17, 2022, 1:51 PM

#

jaunty mural changed transparency value

you could perhaps include more than one angle?

odd meteor Mar 17, 2022, 1:51 PM

#

jaunty mural it's fine with matplib only, seaborn how can it improve look-up and only set the...

Are you just interested in the aesthetics? Or are you more interested in the quality of .png file generated from each of those visualization libraries?

If Matplotlib is okay you can use that as well.

jaunty mural Mar 17, 2022, 1:53 PM

#

odd meteor Are you just interested in the aesthetics? Or are you more interested in the qua...

aesthetics depends on data, but the data is that I've got after research so the here's the picture as it is

lapis sequoia Mar 17, 2022, 1:56 PM

#

https://jakevdp.github.io/PythonDataScienceHandbook/04.12-three-dimensional-plotting.html
this has some interesting ways to show them

Three-Dimensional Plotting in Matplotlib | Python Data Science Hand...

odd meteor Mar 17, 2022, 2:01 PM

#

jaunty mural aesthetics depends on data, but the data is that I've got after research so the ...

I'd argue it depends on customization of your plot not the data. We simply cannot panel-beat the data just for aesthetics.

But then again, nobody's ever gotten an A++ from just plotting a picturesque dreamscape 😀

I personally think, you are good to go with what you have already (I mean the initial image you sent.) I honestly think it's looking nice.

jaunty mural Mar 17, 2022, 2:02 PM

#

lapis sequoia https://jakevdp.github.io/PythonDataScienceHandbook/04.12-three-dimensional-plot...

i've already read the official documentation) to find out some ways

serene scaffold Mar 17, 2022, 2:02 PM

#

!otn a panel beat the data

arctic wedgeBOT Mar 17, 2022, 2:02 PM

#

:ok_hand: Added panel-beat-the-data to the names list.

jaunty mural Mar 17, 2022, 2:03 PM

#

odd meteor I'd argue it depends on customization of your plot not the data. We simply canno...

yeap!) only useful command(method) ax.view_init to rotate and get the better angle of view
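
A rough sketch that pulls together the tweaks mentioned in this thread (dpi, marker size, transparency, and `ax.view_init`), assuming Matplotlib's 3D axes and random stand-in data since the real dataset isn't shown; the output path is just a temp file:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y, z = rng.normal(size=(3, 200))  # stand-in for the real measurements

fig = plt.figure(figsize=(6, 5))
ax = fig.add_subplot(projection="3d")

# Larger, semi-transparent points coloured by z are easier to read in print.
points = ax.scatter(x, y, z, c=z, cmap="viridis", s=25, alpha=0.6)
fig.colorbar(points, ax=ax, shrink=0.6, label="z")

# Pick the camera angle: elevation above the xy-plane and rotation about z.
ax.view_init(elev=20, azim=35)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")

# A high dpi keeps the figure sharp when it is placed in a paper.
out_path = os.path.join(tempfile.gettempdir(), "scatter3d.png")
fig.savefig(out_path, dpi=300, bbox_inches="tight")
```

Saving the same figure from two or three `view_init` angles is a cheap substitute for interactivity on paper.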

mint palm Mar 17, 2022, 2:11 PM

#

copper dirge Mar 17, 2022, 2:24 PM

#

G'day everyone, not sure if it would be able to be done or not, but is it possible to create a VERY simple machine learning algorithm using ONLY numpy?

#

Something that can categorise messages as spam or not for example

#

If so, please ping/pm me 🙂

lapis sequoia Mar 17, 2022, 2:27 PM

#

copper dirge Something that can categorise messages as spam or not for example

sure you can. hm you can use naive bayes for spam detection for example.
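
To give an idea of what that could look like, here is a minimal multinomial naive Bayes sketch using only NumPy. The tiny training set, the whitespace tokenizer, and all function names are made up for illustration; a real spam filter would need far more data and better text handling.

```python
import numpy as np


def tokenize(text):
    return text.lower().split()


def train_naive_bayes(messages, labels, alpha=1.0):
    """Count words per class and turn the counts into log-probabilities."""
    vocab = sorted({word for msg in messages for word in tokenize(msg)})
    index = {word: i for i, word in enumerate(vocab)}
    classes = sorted(set(labels))

    counts = np.zeros((len(classes), len(vocab)))
    for msg, label in zip(messages, labels):
        for word in tokenize(msg):
            counts[classes.index(label), index[word]] += 1

    label_arr = np.array(labels)
    log_prior = np.log(np.array([(label_arr == c).mean() for c in classes]))
    smoothed = counts + alpha  # Laplace smoothing avoids log(0)
    log_likelihood = np.log(smoothed / smoothed.sum(axis=1, keepdims=True))
    return {"classes": classes, "index": index,
            "log_prior": log_prior, "log_likelihood": log_likelihood}


def classify(model, text):
    """Return the class with the highest log-posterior score."""
    scores = model["log_prior"].copy()
    for word in tokenize(text):
        j = model["index"].get(word)
        if j is not None:  # words never seen in training are ignored
            scores += model["log_likelihood"][:, j]
    return model["classes"][int(np.argmax(scores))]


messages = ["win free money now", "free prize claim now",
            "lunch at noon tomorrow", "meeting notes attached"]
labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(messages, labels)

print(classify(model, "claim your free money"))
print(classify(model, "notes from the meeting"))
```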

copper dirge Mar 17, 2022, 2:35 PM

#

That's pretty cool

#

I'm not sure what to look for, do you think that you'd be able to nudge me in the right direction @lapis sequoia ?

#

Would it also be possible to do the following:

#

1. Export the programs "learning" to a text file of some sort
2. Read this file when a "classify" function is run?

I assume this would be better rather than training the bot every time you want to classify something?
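
One way this could work, sketched with JSON as the "text file of some sort". The parameter dictionary below is a made-up stand-in for whatever the training step produces, and `save_model` / `load_model` are hypothetical helper names:

```python
import json
import os
import tempfile

import numpy as np

# Hypothetical learned parameters; in practice these come from training.
model = {
    "classes": ["ham", "spam"],
    "vocab": ["free", "meeting"],
    "log_prior": np.log([0.5, 0.5]),
    "log_likelihood": np.log([[0.2, 0.8], [0.9, 0.1]]),
}


def save_model(model, path):
    """Write the parameters as JSON, converting arrays to plain lists."""
    serializable = {
        key: (value.tolist() if isinstance(value, np.ndarray) else value)
        for key, value in model.items()
    }
    with open(path, "w") as f:
        json.dump(serializable, f)


def load_model(path):
    """Read the JSON file back and restore the NumPy arrays."""
    with open(path) as f:
        raw = json.load(f)
    raw["log_prior"] = np.array(raw["log_prior"])
    raw["log_likelihood"] = np.array(raw["log_likelihood"])
    return raw


path = os.path.join(tempfile.gettempdir(), "spam_model.json")
save_model(model, path)      # done once, after training
restored = load_model(path)  # done whenever "classify" needs the model
print(restored["classes"], restored["log_likelihood"].shape)
```

Training once and loading the saved parameters at classification time is the usual split between training and inference.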

lapis sequoia Mar 17, 2022, 2:41 PM

#

Anyone with experience in (geo)pandas that wants to help me figuring out why my plot does not show? #help-cookie

#

Hi, does anyone here use kaggle, want to ask for their opinions on the cost on data since I don't have any unlimited internet plans

tacit basin Mar 17, 2022, 3:06 PM

#

copper dirge ``` 1. Export the programs "learning" to a text file of some sort 2. Read this f...

Correct, you can train the model once and then use it for inference

tacit basin Mar 17, 2022, 3:07 PM

#

lapis sequoia Hi, does anyone here use kaggle, want to ask for their opinions on the cost on d...

I use kaggle from time to time

lapis sequoia Mar 17, 2022, 3:09 PM

#

tacit basin I use kaggle from time to time

I am thinking of signing up to kaggle to learn pytorch on their jupyter notebooks. i was afraid on whether it would cost money to use their gpu but someone told me that there aren't any charges.
I would like to ask, does kaggle need a constant internet connection to code

tacit basin Mar 17, 2022, 3:11 PM

#

lapis sequoia I am thinking of signing up to kaggle to learn pytorch on their jupyter notebook...

You can run code in so called commit mode. This means it's run in the background and results are saved. You get 38 GPU hours per week and some tpu hours as well. Cpu hours unlimited i think. Max session time is 12 hrs

lapis sequoia Mar 17, 2022, 3:12 PM

#

tacit basin You can run code in so called commit mode. This means it's run in the background...

and just to be sure, if I go over the 38 hours, it just slows down with no hidden cost right? and the datasets that I download for the training are saved in the notebooks and not locally?

tacit basin Mar 17, 2022, 3:15 PM

#

lapis sequoia and just to be sure, if I go over the 38 hours, it just slows down with no hidde...

If you want to use GPU above 38 hrs they try to sell GCP cloud. Kaggle is a Google company. But once the 38 hrs are used you will not be able to use GPU compute for the rest of that week. You can still access your data.

#

There are more free GPU options: google colab, paperspace gradient, AWS sagemaker studio lab

lapis sequoia Mar 17, 2022, 3:18 PM

#

tacit basin There are more free GPU options: google colab, paperspace gradient, AWS sagemake...

thank you for answering all my questions 😄 . I actually was scared of hidden costs because of a reddit post where the comments were saying google cloud deep learning something can have hidden charges

tacit basin Mar 17, 2022, 3:19 PM

#

lapis sequoia thank you for answering all my questions 😄 . I actually was scared of hidden co...

Can you share the link?

lapis sequoia Mar 17, 2022, 3:19 PM

#

https://www.reddit.com/r/googlecloud/comments/j6slno/how_to_avoid_being_charged_on_google_clouds_free/?utm_source=share&utm_medium=web2x&context=3

r/googlecloud - How to avoid being charged on Google Cloud's Free t...

9 votes and 26 comments so far on Reddit

tacit basin Mar 17, 2022, 3:19 PM

#

lapis sequoia thank you for answering all my questions 😄 . I actually was scared of hidden co...

You don't need credit card to use kaggle so they can't charge you :)

lapis sequoia Mar 17, 2022, 3:20 PM

#

tacit basin Can you share the link?

here's the link, not sure if the google cloud in this one is the same used for their other services

https://www.reddit.com/r/googlecloud/comments/j6slno/how_to_avoid_being_charged_on_google_clouds_free/?utm_source=share&utm_medium=web2x&context=3


lapis sequoia Mar 17, 2022, 3:20 PM

#

tacit basin You don't need credit card to use kaggle so they can't charge you :)

ahhhh, thats great to hear

tacit basin Mar 17, 2022, 3:21 PM

#

lapis sequoia here's the link, not sure if the google cloud in this one is the same used for t...

Yeah, for gcp you need to provide a cc, so you need to be careful with what you use. If you run out of free credit your card will be charged

#

But kaggle is a data science competition platform. They provide free GPU hours so ppl can learn or start competitions, but it's not used as a paid cloud computing resource like, for example, gcp, AWS, Azure

lapis sequoia Mar 17, 2022, 3:24 PM

#

tacit basin Yeah, for gcp you need to provide a cc, so you need to be careful with what you use. If...

that is the thing that scares me, I was googling around remote services to practice ML and google cloud deep learning something came up, thought I would look at reddit before I try the account. so when kaggle came up, I thought it might have similar problems

lapis sequoia Mar 17, 2022, 3:24 PM

#

tacit basin But kaggle is a data science competition platform. They provide free GPU hours so ...

that is wonderful to hear, thank you so much for taking the time to answer

#

i'll setup an account and try it only for learning data science.

tacit basin Mar 17, 2022, 3:26 PM

#

lapis sequoia i'll setup an account and try it only for learning data science.

Nice. Kaggle also has nice free courses (with certs). Once you create an account you will get access to compute, datasets, competitions, courses, discussion forums. It's a great platform.

#

See you on some competition leaderboard soon. Good luck!

lapis sequoia Mar 17, 2022, 3:27 PM

#

tacit basin Nice. Kaggle also has nice free courses (with certs). Once you create an ac...

yeah, the certificates dont interest me much but they are a plus 😄

lapis sequoia Mar 17, 2022, 3:28 PM

#

tacit basin See you on some competition leaderboard soon. Good luck!

https://tenor.com/view/sweating-nervous-paranoid-gif-4974019

Tenor

#

Ohhhh

lapis sequoia Mar 17, 2022, 3:29 PM

#

tacit basin See you on some competition leaderboard soon. Good luck!

...thank ...you ...i ...will...try

steady basalt Mar 17, 2022, 3:42 PM

#

@lapis sequoia

lapis sequoia Mar 17, 2022, 4:09 PM

#

Can anyone point me to good educational documentation on heatmapping etc. with matplotlib?

#

https://www.bigendiandata.com/2017-06-27-Mapping_in_Jupyter/

How to plot data on maps in Jupyter using Matplotlib, Plotly, and B...

If you’re trying to plot geographical data on a map then you’ll need to select a plotting library that provides the features you want in your map. And if you haven’t plotted geo data before then you’ll probably find it helpful to see examples that show different ways to do...

#

Ive found this but it isnt well explained

steady basalt Mar 17, 2022, 4:25 PM

#

I’d normally recommend seaborn for that, but matplotlib is really straightforward; it only takes one line of code @lapis sequoia

#

Nevermind, opened your document to find something that looks far from the heatmap I am used to
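For anyone unsure which "heatmap" is meant here: below is a minimal sketch of the grid-of-colours kind (not the map kind), using plain matplotlib. The data, the column labels and the helper name `plot_heatmap` are all made up for illustration. The `Agg` backend line is only there so it runs without a display, and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np


def plot_heatmap(matrix, labels):
    """Draw a square matrix as a colour grid with labelled ticks and a colourbar."""
    fig, ax = plt.subplots(figsize=(4, 4))
    # imshow is the "one line" that actually draws the heatmap
    image = ax.imshow(matrix, cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(labels)))
    ax.set_yticks(range(len(labels)))
    ax.set_xticklabels(labels, rotation=45, ha="right")
    ax.set_yticklabels(labels)
    fig.colorbar(image, ax=ax, label="correlation")
    return fig, ax


rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))        # made-up data: 200 rows, 4 columns
corr = np.corrcoef(data, rowvar=False)  # 4x4 correlation matrix between columns
fig, ax = plot_heatmap(corr, ["a", "b", "c", "d"])
```

In a notebook, `plt.show()` (or just ending the cell with `fig`) displays it. Plotting values on an actual geographic map is a different task and needs a mapping library on top.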

misty flint Mar 17, 2022, 4:35 PM

#

kekHands

#

yeah im used to the other heatmap term

#

not the geographical one

#

kekHands

agile cobalt Mar 17, 2022, 4:42 PM

#

maybe look up Choropleth charts?
(and relevant documentation / stackoverflow questions for whichever libraries you use)

mint palm Mar 17, 2022, 4:53 PM

#

#

Sterlercus
i found the data

steady basalt Mar 17, 2022, 5:05 PM

#

Anyone else find the titanic Kaggle impossible to beat 80% accuracy?

#

I see scores even higher

serene scaffold Mar 17, 2022, 5:06 PM

#

steady basalt Anyone else find the titanic Kaggle impossible to beat 80% accuracy?

the Titanic dataset is difficult because there really aren't that many data points.
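A rough way to see why few data points matter: the accuracy you measure on a small test set is noisy. The sketch below uses the plain binomial standard error. The 80% figure comes from the question above; the 418-row test set size is an assumption (check the competition page), and `accuracy_standard_error` is just an illustrative helper.

```python
import math


def accuracy_standard_error(accuracy, n_samples):
    """Binomial standard error of an accuracy measured on n_samples independent rows."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


# Assumed numbers: a model that is truly 80% accurate, scored on 418 rows.
se = accuracy_standard_error(0.80, 418)
low, high = 0.80 - 1.96 * se, 0.80 + 1.96 * se
print(f"standard error: {se:.3f}, rough 95% range: {low:.3f} to {high:.3f}")
```

Under those assumptions the noise is about two percentage points either way, so small leaderboard differences around 80% may not mean much.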

steady basalt Mar 17, 2022, 5:10 PM

#

Have you done the space titanic too?

#

It’s better

#

It’s about a spaceship called titanic

misty flint Mar 17, 2022, 5:12 PM

#

blobhyperthink

#

~~do the majority of the people die as well~~

#

kekHands

steady basalt Mar 17, 2022, 5:21 PM

#

No they disappear

misty flint Mar 17, 2022, 5:24 PM

#

CL6_ThinkingIntensifies

stark breach Mar 17, 2022, 5:33 PM

#

odd meteor I think a large number of people who recently got into ML had at some point used...

Thanks

tacit basin Mar 17, 2022, 5:55 PM

#

steady basalt I see scores even higher

100 % scores are cheating

lapis sequoia Mar 17, 2022, 6:30 PM

#

steady basalt @lapis sequoia

why was i pinged?

misty flint Mar 17, 2022, 7:35 PM

#

ahhhhh

#

endless query optimizations

#

kekHands

#

if only people had all their requirements listed at the beginning

#

tragic

#

this is the real data scientist life

#

kekHands

exotic thicket Mar 17, 2022, 7:48 PM

#

serene scaffold <@!855723635469975613> it's impolite to ping random people to draw attention to ...

Sorry I was desperate at that moment..

steady basalt Mar 17, 2022, 7:52 PM

#

Ah you guys.. how long does RFE take to run on an 8000-row, 8-feature dataset

#

On Kaggle CPUs

#

It’s been trying to select 3 features for 8 minutes now

#

Step=1

#

logistic regression estimator
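For reference, here is a minimal sketch of that setup on synthetic data of the same shape (8000 rows, 8 numeric features, pick 3, step=1, logistic regression). The data is generated with `make_classification`, so this is not the actual notebook. The scaling step and `max_iter=1000` are assumptions added so the solver converges quickly.

```python
import time

from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 8000 rows, 8 numeric features.
X, y = make_classification(
    n_samples=8000, n_features=8, n_informative=3, n_redundant=2, random_state=0
)
# Scaled in one go for brevity; with real data, fit the scaler on the training split only.
X_scaled = StandardScaler().fit_transform(X)

# RFE refits the estimator once per elimination round (8 -> 3 features, one at a time).
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3, step=1)

start = time.perf_counter()
selector.fit(X_scaled, y)
elapsed = time.perf_counter() - start

print("selected mask:", selector.support_)
print("ranking (1 = kept):", selector.ranking_)
print(f"fit time: {elapsed:.2f} s")
```

With only 8 columns this is just a handful of logistic regression fits, so it would normally be expected to finish quickly. If it runs for half an hour, the usual suspects are many more columns than expected (e.g. after one-hot encoding), unscaled features slowing convergence, or a cross-validated variant such as `RFECV` multiplying the number of fits.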

steady basalt Mar 17, 2022, 8:14 PM

#

Okay it’s been running for 25 mins now, and on my Mac's CPU 15 mins

#

It’s sklearn btw

#

Is there even any point in doing this or should it just be manual selection based on a correlation heatmap
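If going the manual route, one rough sketch is to rank features by absolute correlation with the target instead of eyeballing a heatmap. The DataFrame below is made up (column names `signal`, `noise_a`, `noise_b` and `target` are hypothetical), and plain correlation only picks up linear, one-feature-at-a-time relationships, so treat it as a first look rather than a replacement for RFE.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_rows = 500

# Made-up data: one column that drives the target and two that are pure noise.
frame = pd.DataFrame({
    "signal": rng.normal(size=n_rows),
    "noise_a": rng.normal(size=n_rows),
    "noise_b": rng.normal(size=n_rows),
})
frame["target"] = (frame["signal"] + 0.5 * rng.normal(size=n_rows) > 0).astype(int)

# Correlation of every feature with the target, strongest first.
corr_with_target = frame.corr()["target"].drop("target")
ranked = corr_with_target.abs().sort_values(ascending=False)
print(ranked)
```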

#

#

okay my laptops about to set on fire holy shit

#

AAA HELP

tacit basin Mar 17, 2022, 8:23 PM

#

living on the edge :))

steady basalt Mar 17, 2022, 8:25 PM

#

Bro, i am on 170%

#

Did i do something wrong?

#

its been running for 30 mins

#

no, more actually

#

does this feature selection only work on data with same data type or wat?

tacit basin Mar 17, 2022, 8:26 PM

#

gpu memory 15.7 out of 15.9, almost the dreaded out of memory error lol

steady basalt Mar 17, 2022, 8:26 PM

#

the gpu accelerator doesnt even work with mine

#

is it normal for feature selection to run for hours? when theres only 8 features in total?

#

okay well, im actually running with one hot encoded version so quite a lot more

tacit basin Mar 17, 2022, 8:27 PM

#

not sure why you select features from X after you've already split off X_train
not sure selecting on X_train will speed things up massively, as it's still 70% of the data, but it's the correct way: don't use test data to set up features etc

steady basalt Mar 17, 2022, 8:28 PM

#

I have read before that it's better to do it this way

tacit basin Mar 17, 2022, 8:28 PM

#

can you share a link?

#

to where you read that

steady basalt Mar 17, 2022, 8:29 PM

#

I just googled 'feature selection before or after test train split'

#

first result

tacit basin Mar 17, 2022, 8:29 PM

#

because it's a bit like cheating: in practice you don't know the test data, that's the reason to split
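A minimal sketch of the split-first approach, assuming synthetic data and a 70/30 split: putting scaling and RFE inside a scikit-learn `Pipeline` means they are only ever fitted on `X_train`, and the test rows are touched once, at scoring time. The step names (`scale`, `select`, `model`) and all parameter values are illustrative choices, not the settings from the notebook being discussed.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 2000 rows, 8 features.
X, y = make_classification(n_samples=2000, n_features=8, n_informative=3, random_state=0)

# Split first, so nothing downstream ever sees the test rows while fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)),
    ("model", LogisticRegression(max_iter=1000)),
])

# Scaler statistics and the feature choice are learned from X_train only.
pipeline.fit(X_train, y_train)

# The test set is used once, with the already-fitted scaler and feature mask.
test_accuracy = pipeline.score(X_test, y_test)
chosen = pipeline.named_steps["select"].support_
print("selected mask:", chosen)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

The same pipeline object can be passed to `cross_val_score`, which repeats the selection inside each fold instead of leaking it across folds.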