#data-science-and-ml | Python | Page 362

delicate sphinx Dec 20, 2021, 10:50 PM

#

The weights for my tensorflow model are 57MB haha

stone marlin Dec 20, 2021, 10:50 PM

#

Wait, okay, so you're just putting in the tokenized words (as int indexes) into a NN?

delicate sphinx Dec 20, 2021, 10:51 PM

#

yeah through an embedding layer

#

Yeah the tokenizer has a vocabulary I can index to get words with a given int

#

i.e. tokenizer.sequences_to_texts([[581, 20, 14, 3414]]) = ["what are you doing"]

stone marlin Dec 20, 2021, 10:52 PM

#

Right, okay, so far so good. IIRC, Keras requires integer stuff for this kind of thing. I think this is what Word2Vec does?

delicate sphinx Dec 20, 2021, 10:52 PM

#

I only use every word in my project which is around 28,000 unique ones

#

it keeps count of the unique words

#

though there is another part of it that counts occurrences of the words I think

delicate sphinx Dec 20, 2021, 10:53 PM

#

stone marlin Right, okay, so far so good. IIRC, Keras requires integer stuff for this kind o...

Yeah I think the tokenizer was like a word2vec idk

stone marlin Dec 20, 2021, 10:53 PM

#

Okay, so, you've got the embedding layer. What are you hoping to get out of your NN at the end? You end up with a 16-element thing, what is that supposed to be?

delicate sphinx Dec 20, 2021, 10:53 PM

#

an answer

#

i.e.

"what are you doing" --> model --> "nothing"

#

hopefully between lengths of 1 and 16 as my longest answer in my dataset is 12 (and I want to keep it in powers of 2)

stone marlin Dec 20, 2021, 10:54 PM

#

Okay, so you're using like SGD to train the NN? This is where my knowledge of NLP / NNs gets a bit fuzzy, apologies.

delicate sphinx Dec 20, 2021, 10:54 PM

#

CCE and Adam

#

Optimizer and loss

stone marlin Dec 20, 2021, 10:55 PM

#

Okay, same dealio. Okay.

delicate sphinx Dec 20, 2021, 10:55 PM

#

CCE loss

#

myb I always say them the wrong way around haha

stone marlin Dec 20, 2021, 10:55 PM

#

Okay, so now I'm gonna be totally in the dark. Does your output, right now (on a test set) correspond to up to 16 words (index of words) which you've tokenized?

#

If you get back like, [1, 4, 6, 7, ...] is that going to be those words?

delicate sphinx Dec 20, 2021, 10:56 PM

#

Yeah I just realised the way I'm doing it is definitely wrong

#

as my answer is tokenized too

#

which means as long as it outputs float I'll never get good enough answers

stone marlin Dec 20, 2021, 10:56 PM

#

Right, the softmax is gonna be wacky.

delicate sphinx Dec 20, 2021, 10:56 PM

#

because the output expects float and I have int

#

so I need to either:

convert the stuff to int just before it's output or
find a way to convert from float to int and do the inverse of that to my answers so the model can actually learn it

#

either way my answer format is wrong :/

stone marlin Dec 20, 2021, 10:58 PM

#

Yeah, it feels like it's giving you a "fuzzy" answer, but that makes no sense with an integer corpus. Hm.

delicate sphinx Dec 20, 2021, 10:58 PM

#

yeah it's my processing that converts it to float

stone marlin Dec 20, 2021, 10:58 PM

#

I'm not knowledgeable enough in this field to be able to help right now, but I'll take a look at it a bit later and see if I can mess around with a toy model.

delicate sphinx Dec 20, 2021, 10:58 PM

#

tbh with you I combine an image and a question model into one

#

so it's fairly complex so dw too much

#

but if you can make a sort of model that converts integers to floats then that's fair

stone marlin Dec 20, 2021, 10:59 PM

#

Haha, I'm just going to do the question part, mostly so that I know it better. :']

delicate sphinx Dec 20, 2021, 10:59 PM

#

but because of my image input I think I have to avoid just processing on ints

stone marlin Dec 20, 2021, 10:59 PM

#

I haven't done much with NNs since I've never needed them for work, but they seem pret cool.

delicate sphinx Dec 20, 2021, 10:59 PM

#

Yeah Tensorflow has lots of documentation and all as well

stone marlin Dec 20, 2021, 10:59 PM

#

Can you process your image in another NN and pick out relevant features, then pass those in?

delicate sphinx Dec 20, 2021, 11:00 PM

#

sadly the only project that appealed to me like it could've helped me used custom encoder/decoders which I don't want to steal

stone marlin Dec 20, 2021, 11:00 PM

#

Object detection or whatnot? Or, possibly, figure out the "topic" of the question from your NN and feed that into an Object Detection NN?

delicate sphinx Dec 20, 2021, 11:00 PM

#

stone marlin Can you process your image in another NN and pick out relevant features, then pa...

my image is preprocessed in InceptionV3 to give me features from the image

#

and that's passed into an image model and my image model and question models combine into my merged model

stone marlin Dec 20, 2021, 11:00 PM

#

Ahh, okay. Wild.

delicate sphinx Dec 20, 2021, 11:01 PM

#

yeah, I can find help for either part on its own, but the only help I can find online for my whole project is quite different and doesn't just use tensorflow, which I'd like to do

#

the way I've managed my model so far, this is the output

#

I think softmax is only really useful for one-hot-encoding because it represents a probability on each word being correct

arctic crown Dec 20, 2021, 11:27 PM

#

@stone marlin which algo can i use for this?

stone marlin Dec 20, 2021, 11:29 PM

#

There are a number of ways to architect this, but if you're specifically talking about one thing, you can record the times you've done this in a db or something and use linear regression.

#

At least for a proof of concept.

arctic crown Dec 20, 2021, 11:29 PM

#

stone marlin There are a number of ways to architecture this, but if you're specifically talk...

yea thats what i was thinking store the data in some sort of db and then use linear regression

#

but

stone marlin Dec 20, 2021, 11:29 PM

#

You could, for example, store "time lights went on in the last 30 days" or something, and regress on those.

arctic crown Dec 20, 2021, 11:30 PM

#

but in linear regression lets say one axis is the time what would be the other axis?

stone marlin Dec 20, 2021, 11:30 PM

#

x-axis is the day, y-axis is the time you put the light on or whatever.

#

Honestly, for your thing, you could literally just take the mean of the last N days.

arctic crown Dec 20, 2021, 11:31 PM

#

arctic crown Dec 20, 2021, 11:31 PM

#

stone marlin Honestly, for your thing, you could literally just take the mean of the last N d...

im sorry?

stone marlin Dec 20, 2021, 11:31 PM

#

Your y-axis would be the time. Your x-axis would be the day number.

arctic crown Dec 20, 2021, 11:32 PM

#

hmm

stone marlin Dec 20, 2021, 11:32 PM

#

If you want a 1-dimensional linear regression, that's pret much just the mean. So, you could do the last 7 days and be like, [7, 7, 7, 8, 7, 7, 6] for time to wake up, and it would take the mean of those and turn the light on then.
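A minimal sketch of the two options discussed here, using the seven wake-up hours from the message above. The helper name `fit_line` is made up for illustration, and numbering the days 1 to 7 is an assumption about how the x-axis would be encoded.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

days = [1, 2, 3, 4, 5, 6, 7]    # x-axis: day number (assumed encoding)
hours = [7, 7, 7, 8, 7, 7, 6]   # y-axis: hour the light went on

slope, intercept = fit_line(days, hours)
prediction_day_8 = slope * 8 + intercept   # regression forecast for the next day
baseline = mean(hours)                     # the "just take the mean" option
```

With this data the fitted line predicts roughly 6.6 for day 8, while the plain mean stays at 7.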

arctic crown Dec 20, 2021, 11:33 PM

#

what do you mean by "mean"?

stone marlin Dec 20, 2021, 11:33 PM

#

Average.

desert oar Dec 20, 2021, 11:35 PM

#

there's nothing wrong imo if the index isn't "1 per row". if anything, it's good practice to try to use meaningful indices whenever you can, instead of just integer row numbers

delicate sphinx Dec 20, 2021, 11:35 PM

#

mean as in: mean of [5,5,6,9,4] = (5+5+6+9+4) / 5 = 29/5 = 5.8 (mean = sum_of_list / len_of_list)

#

tbh I would've said just do the mean of it too, but I'm really no genius haha, and obviously you can use AI to prevent outliers contaminating it

#

but for the most part mean should work

arctic crown Dec 20, 2021, 11:37 PM

#

ah

stone marlin Dec 20, 2021, 11:37 PM

#

Yeah, mean is fine, median is prob better tbh.

arctic crown Dec 20, 2021, 11:37 PM

#

but then i dont even need to use ml

stone marlin Dec 20, 2021, 11:37 PM

#

Median's robust to outliers, so that'll be better.

#

Yes. You don't.

#

Haha, not every problem needs ML to solve!

arctic crown Dec 20, 2021, 11:38 PM

#

yea lol

delicate sphinx Dec 20, 2021, 11:39 PM

#

If it's some sort of "ML is required" project you've been given, Linear regression is probably best bet

arctic crown Dec 20, 2021, 11:40 PM

#

delicate sphinx mean as in: mean of [5,5,6,9,4] = 5+5+6+9+4 / 5 = 29/5 = 5.8 ...

but in my case it can also be: [5:30,5:40,6,6:30]

delicate sphinx Dec 20, 2021, 11:40 PM

#

But you might be trying to use a workshop of tools to hammer a nail in place here haha

stone marlin Dec 20, 2021, 11:40 PM

#

Yeah, you can translate that to military time or whatever.

delicate sphinx Dec 20, 2021, 11:40 PM

#

yeah 5:30 = 0530 in military time

#

much easier to manage

#

(or just do 5 * 60 + 30)

#

as long as you're consistent

#

If you want to bring in machine learning that will probably have to be a more complex project, i.e. learning whether or not a value is an outlier

#

i.e. if on the weekends they turn on the lights at 11:00 because they slept in longer than usual

#

then that shouldn't change the times for the rest of the week
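A small sketch of the "5 * 60 + 30" idea, assuming times are written as 24-hour "H:MM" strings. Minutes since midnight are used rather than 0530-style integers, because those do not average cleanly (530 and 630 average to 580, which is not a clock time). The sample week and the helper names `to_minutes` and `to_clock` are made up for illustration.

```python
from statistics import mean, median

def to_minutes(clock):
    """Convert 'H:MM' (24-hour) to minutes since midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)

def to_clock(total_minutes):
    """Convert minutes since midnight back to 'H:MM' (fractions are truncated)."""
    total = int(total_minutes)
    return f"{total // 60}:{total % 60:02d}"

# Hypothetical week: four normal mornings plus one weekend lie-in.
times = ["5:30", "5:40", "6:00", "6:30", "11:00"]
minutes = [to_minutes(t) for t in times]

mean_time = to_clock(mean(minutes))      # pulled later by the 11:00 outlier
median_time = to_clock(median(minutes))  # stays with the typical mornings
```

Here the mean lands at 6:56 while the median stays at 6:00, which is the outlier robustness mentioned earlier.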

arctic crown Dec 20, 2021, 11:56 PM

#

you know when some people say their ai can improve over time

#

what do they exactly mean?

#

like it improves in what?

#

@delicate sphinx

#

@stone marlin

#

sorry for the pings

delicate sphinx Dec 20, 2021, 11:57 PM

#

It depends on what they look to improve

#

And it depends on if they let it continue to learn

arctic crown Dec 20, 2021, 11:57 PM

#

what does it learn tho?

delicate sphinx Dec 20, 2021, 11:57 PM

#

It learns based on the information you give it

loud cave Dec 20, 2021, 11:57 PM

#

Usually they are referring to 'online' learning, which is a method that can continually update itself

delicate sphinx Dec 20, 2021, 11:57 PM

#

I've not heard of online learning for AI but I'm familiar with the idea behind it

arctic crown Dec 20, 2021, 11:58 PM

#

im still a bit confused on what it learns

loud cave Dec 20, 2021, 11:59 PM

#

You might already be familiar with methods that partition a large data set into train, test, etc. So that type of model learns once on the training data and that's it. If you have newer data that you want it to learn from, you have to make an entirely new model

#

I guess I'm jumping into this conversation without context. I assumed we're using 'AI' and 'ML' interchangeably but from Tentenmen's answer it sounds like we're distinguishing them

delicate sphinx Dec 21, 2021, 12:00 AM

#

I'm not as well versed as you with all the lingo and jargon so you're probably better

#

at guessing the topic with your assumption

loud cave Dec 21, 2021, 12:01 AM

#

I have no idea 😛

delicate sphinx Dec 21, 2021, 12:01 AM

#

peace i forgot

#

how well versed are u with TF

loud cave Dec 21, 2021, 12:02 AM

#

I have made some models but probably still a beginner in the big picture of things

arctic crown Dec 21, 2021, 12:02 AM

#

same but im learning sklearn

delicate sphinx Dec 21, 2021, 12:02 AM

#

have you ever used things like Embedding layer, Tokenizer or TextVectorization?

#

I've been stuck for like 2 weeks on how to actually get valid answers from a model

#

at first I just had <unk> <unk> <unk>, ....

#

then I had the same array over and over

arctic crown Dec 21, 2021, 12:03 AM

#

also my ai is a personal assistant and if i want it to improve what can i make it improve in?

delicate sphinx Dec 21, 2021, 12:03 AM

#

and now I'm just getting floats but have no idea how to translate that into words

loud cave Dec 21, 2021, 12:08 AM

#

@arctic crown An example of online vs offline learning would be the linear regression model someone mentioned to you above. If you trained (AKA fit) that type of model, it would be 'learning' a value for a coefficient that minimizes the squared error between the line and all of the ground truth values (AKA labels, AKA 'y'). If you had an algorithm that fit the line given a dataset and couldn't update the coefficient after, that would be 'offline' learning. If you had an algorithm that could update the coefficient after each new example, that would be 'online' learning, so it could improve over time. I think SKLearn has some online learning algorithms in it
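A rough sketch of the difference described above, in plain Python. `fit_offline` computes the coefficient once from a whole dataset; `OnlineLine` keeps a coefficient and nudges it after every example as it arrives. Both names, the learning rate, and the toy data (y = 2x) are assumptions made for illustration; in scikit-learn the comparable idea is the `partial_fit` method on estimators such as `SGDRegressor`.

```python
def fit_offline(xs, ys):
    """Least-squares coefficient for y = w * x, computed once from the full dataset."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

class OnlineLine:
    """Holds a single coefficient and updates it after each new example."""

    def __init__(self, learning_rate=0.01):
        self.w = 0.0
        self.learning_rate = learning_rate

    def update(self, x, y):
        error = y - self.w * x                      # how wrong the current line is
        self.w += self.learning_rate * error * x    # gradient step on squared error
        return self.w

# Toy data stream where the true coefficient is 2.
xs = [1.0, 2.0, 3.0, 4.0] * 50
ys = [2.0 * x for x in xs]

offline_w = fit_offline(xs, ys)   # needs all the data up front

model = OnlineLine()
for x, y in zip(xs, ys):          # examples are consumed in arrival order
    model.update(x, y)
```

Both approaches end up near a coefficient of 2 here; the online version just gets there one example at a time and can keep going when new data shows up.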

#

@delicate sphinx I actually have used those at a previous job. Or at least worked on a project where someone else used them and I inherited the model they made. When you say you don't know how to translate that into words, what do you mean?

delicate sphinx Dec 21, 2021, 12:09 AM

#

So I can get float outputs, but I've no idea how to get that back into integer or string form

#

My current output is basically a softmax probability which should probably be used more for one-hot encoding

#

but apparently softmax could also work with other methods

#

if I one-hot encode I'm gonna destroy my processing time

#

(each answer / prediction would take 447,999 0's and one "1" value if I use one-hot encoding for most answers, since 16 * 28,000 = 448,000)
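One common way to turn softmax floats back into words is to take the index of the largest probability at each answer position (argmax) and look that index up in the tokenizer's vocabulary. The sketch below assumes the model emits one probability row per answer position and that index 0 is reserved for padding, as the Keras `Tokenizer` does; the five-word vocabulary and the `decode` helper are made up for illustration.

```python
# Hypothetical vocabulary; with a Keras Tokenizer this would be tokenizer.index_word.
index_word = {1: "nothing", 2: "what", 3: "are", 4: "you", 5: "doing"}

def decode(probabilities, index_word):
    """Pick the most likely word index per position and join the words."""
    words = []
    for row in probabilities:
        best = max(range(len(row)), key=lambda i: row[i])  # argmax over the vocabulary
        if best != 0:                                      # index 0 = padding, skip it
            words.append(index_word[best])
    return " ".join(words)

# Three answer positions, six classes each (padding + five words).
softmax_output = [
    [0.02, 0.90, 0.02, 0.02, 0.02, 0.02],  # most likely: index 1
    [0.95, 0.01, 0.01, 0.01, 0.01, 0.01],  # most likely: padding
    [0.95, 0.01, 0.01, 0.01, 0.01, 0.01],  # most likely: padding
]

answer = decode(softmax_output, index_word)
```

The floats are never converted to integers directly; the integer is the position of the largest float.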

loud cave Dec 21, 2021, 12:11 AM

#

What are you training the model to do? I guess the input is some words/sentences, but what is the ground truth/label that you are training it against?

delicate sphinx Dec 21, 2021, 12:11 AM

#

Preprocessed image features + Question (LSTM) model --> a merged model that should output an answer

loud cave Dec 21, 2021, 12:11 AM

#

Oh, I'm scrolling back up

#

and see some of the detail now

delicate sphinx Dec 21, 2021, 12:12 AM

#

I also made a help in #help-cookie but I'm not sure if it's any use really

#

I've looked at TextVectorization, Embedding, Tokenizer but I can't understand any of them :/

loud cave Dec 21, 2021, 12:12 AM

#

Is the answer supposed to be a single word?

delicate sphinx Dec 21, 2021, 12:12 AM

#

everytime I've used them I've ended with float values I don't know how to translate back

#

the longest answer my dataset uses is 12 words long

#

and to keep my model easy to use, I try to make everything a power of 2

#

so the output length is 16 (16 words maximum)

#

my inputs:

delicate sphinx Dec 21, 2021, 12:13 AM

#

delicate sphinx My current output is basically a softmax probability which should probably be us...

my output

loud cave Dec 21, 2021, 12:14 AM

#

so input1 is the RGB/greyscale values of the image, input2 is words of the question and output is words of the answer?

delicate sphinx Dec 21, 2021, 12:14 AM

#

input1 is: (36,2048,3)

#

where 3 is the channels (RGB)

#

so yeah

#

question input2 is (32)

#

longest question is 24 so I pad it to 32 (with masking in place)

iron basalt Dec 21, 2021, 12:15 AM

#

loud cave Usually they are referring to 'online' learning, which is a method that can cont...

That's incremental learning. Online learning requires the learning to happen in the order that the data arrives. For example, waiting 10 seconds to receive a bunch of data, then randomizing it and learning it as a batch, is not online learning; that randomizing is what most DL requires to work well (the i.i.d. assumption).

#

Real life data that a robot receives in real time for example, is almost always not i.i.d.

loud cave Dec 21, 2021, 12:16 AM

#

so there is some vocabulary of answer words defined?

iron basalt Dec 21, 2021, 12:16 AM

#

(Though it's not binary, it can be more or less)

delicate sphinx Dec 21, 2021, 12:16 AM

#

loud cave so there is some vocabulary of answer words defined?

I have tokenized all ~28,000 words

#

Which includes all questions and answers in my training, validation and test datasets

#

(28,000 unique words)

#

Peace is it cool if I DM? It's 00:19am here and I need to walk my dog before I code until 4am and walk him x-x

#

if not that's perfectly fine and don't worry

loud cave Dec 21, 2021, 12:18 AM

#

So is the output supposed to be a vector of something like 16 X 28001? A probability for each word in each index of the answer, plus one more word for 'none'?

delicate sphinx Dec 21, 2021, 12:18 AM

#

well, I was thinking of one hot encoding it

#

but that requires 16 * 28,000 values in lists

#

which is for most cases 447,999 0's

#

with one "1" value

#

and that is just such a waste imo, so I was hoping to use some sort of TextVectorization as apparently that's more of a dynamic approach that only uses as much as is needed

#

that's why I took the Tokenizer approach to begin with
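A possible way around the one-hot cost: categorical cross-entropy only ever uses the predicted probability of the correct class, so the label can stay a single integer per word. Keras exposes this as the `sparse_categorical_crossentropy` loss, which takes integer class indexes as targets. The plain-Python sketch below only illustrates that the two forms give the same number; the function names and the three-class example are made up.

```python
import math

def cross_entropy_one_hot(probs, one_hot):
    """Categorical cross-entropy with a one-hot target vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def cross_entropy_sparse(probs, label):
    """Same loss, but the target is just the integer index of the correct class."""
    return -math.log(probs[label])

probs = [0.1, 0.7, 0.2]      # softmax output over a toy 3-word vocabulary
label = 1                    # integer label, as produced by a tokenizer
one_hot = [0.0, 1.0, 0.0]    # the same label, one-hot encoded

loss_dense = cross_entropy_one_hot(probs, one_hot)
loss_sparse = cross_entropy_sparse(probs, label)
```

Under that assumption the targets for one answer are 16 integers instead of a 16 x 28,000 matrix, while the model's output is still a probability per word.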

loud cave Dec 21, 2021, 12:20 AM

#

It's almost bedtime for me, you can send a message if you like but I may be asleep by the time you return. But I assume this isn't necessarily something you must finish in the next couple hours so if you haven't resolved it by the time I see your message I can still try to help

delicate sphinx Dec 21, 2021, 12:21 AM

#

Yeah I mean I've been up till 4am every day this week trying to code this fix

#

I need to finish my model soon. If not by the end of this week I probably will have to leave it where it's at

#

I've asked for help on this every day for the past week but understandably I'm not catching the attention of many who understand TF, and of course TF is hard in itself, so understandably I haven't been able to fix it

loud cave Dec 21, 2021, 12:22 AM

#

I guess the only suggestion I can give is to try to make a simple dummy model using the embedding/vectorizer with a very limited vocabulary/answer size for the sake of easy debugging/understanding, and once you have that expand it into the real size

delicate sphinx Dec 21, 2021, 12:23 AM

#

Yeah I guess I could :/ i sort of got tunnel vision with it all

loud cave Dec 21, 2021, 12:23 AM

#

I guess 28,000 isn't that big a vocabulary really. But you could restrict it to like 5 words

#

It's not open source is it? If you put it on github or something I would try running it myself

delicate sphinx Dec 21, 2021, 12:24 AM

#

I can give u a drive link to it but its not public yet

loud cave Dec 21, 2021, 12:24 AM

#

Does the output answer have to be grammatical? or is like just a list of tags?

delicate sphinx Dec 21, 2021, 12:24 AM

#

When I finish it I plan to put it on my website and make like 20 questions on stack overflow for every issue I had to ask for help for

#

So that I can answer them myself

#

The output can be grammatical but for the most part is one worded answers

arctic crown Dec 21, 2021, 12:29 AM

#

loud cave @arctic crown An example of online vs offline learning would be the lin...

Sorry but what do you mean by “coefficient”

arctic crown Dec 21, 2021, 12:30 AM

#

arctic crown also my ai is a personal assistant and if i want it to improve what can i make i...

Like can it learn from my habits or something?

delicate sphinx Dec 21, 2021, 12:57 AM

#

arctic crown Sorry but why do you mean by “coefficient”

coefficient as far as I'm aware, is a mathematical term

#

so y = mx + c

#

m would be a coefficient

#

or: 10 = 5y + 2

#

5 would be a coefficient

desert oar Dec 21, 2021, 1:05 AM

#

arctic crown Sorry but why do you mean by “coefficient”

older stats and science jargon for the weights in a regression model

arctic wedgeBOT Dec 21, 2021, 1:08 AM

#

:incoming_envelope: :ok_hand: applied mute to @balmy bolt until <t:1640049534:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

arctic crown Dec 21, 2021, 1:27 AM

#

if i want my personal assistant to improve automatically what can i make it improve in?

iron basalt Dec 21, 2021, 1:59 AM

#

desert oar older stats and science jargon for the weights in a regression model

coefficient - "that which unites in action with something else to produce a given effect"

#

ML people call them weights because they see it as a weighted sum.

#

https://en.wikipedia.org/wiki/Weighted_sum_model

Weighted sum model

In decision theory, the weighted sum model (WSM), also called weighted linear combination (WLC) or simple additive weighting (SAW), is the best known and simplest multi-criteria decision analysis (MCDA) / multi-criteria decision making method for evaluating a number of alternatives in terms of a number of decision criteria.

desert oar Dec 21, 2021, 3:45 AM

#

iron basalt coefficient - "that which unites in action with something else to produce a give...

i never really thought about what the word "coefficient" meant. in math i always learned that a "coefficient" was "a constant factor in a term" and that was that

iron basalt Dec 21, 2021, 4:07 AM

#

desert oar i never really thought about what the word "coefficient" meant. in math i always...

The definition given is the original math definition (from the 1600s).

desert oar Dec 21, 2021, 4:08 AM

#

yeah i had no idea it was such a general term

#

interesting

#

makes sense given the latin etymology

iron basalt Dec 21, 2021, 4:09 AM

#

There is not really any good word to come up with anyhow for something abstract like that, but better than no word.

#

Weight also does not really make sense.

lapis sequoia Dec 21, 2021, 5:14 AM

#

desert oar yeah i had no idea it was such a general term

"Misinformation cannot be corrected" said by @elder thunder

#

dont trust people online

elder thunder Dec 21, 2021, 5:15 AM

#

What?

lapis sequoia Dec 21, 2021, 5:15 AM

#

you said every information is not correct

#

so i told people not to trust them

elder thunder Dec 21, 2021, 5:15 AM

#

No I didn't

lapis sequoia Dec 21, 2021, 5:15 AM

#

yes u did

#

"Misinformation cannot be corrected"

#

u said that

elder thunder Dec 21, 2021, 5:15 AM

#

I told you to not DM people because misinformation cannot be corrected

lapis sequoia Dec 21, 2021, 5:16 AM

#

who?

#

cares

#

so u go fuck off and leave me the fuck alone

elder thunder Dec 21, 2021, 5:16 AM

#

Also check out #discord-bots ^

obsidian fulcrum Dec 21, 2021, 5:17 AM

#

lapis sequoia so u go fuck off and leave me the fuck alone

elder thunder Dec 21, 2021, 5:21 AM

#

He was dealt with

sour quarry Dec 21, 2021, 8:10 AM

#

Have any of you had any experience with Google colab or similar tools? I'm trying to train some models but I'm a beginner. I'm looking for something that's relatively easy to use but will work well enough for what I'm trying to do. Do any of you have any suggestions? Thank you. (:

lone drum Dec 21, 2021, 8:49 AM

#

Hello
I am trying to select rows for specific date using df.loc method
My date column is object type
I am not able to get specific rows data from dataframe
Ping me when replying

stray pewter Dec 21, 2021, 11:21 AM

#

Hello, I was assigned a task "Route Planning and ETA estimation in Urban Traffic Network Using Artificial Intelligence".
No dataset was given. I was planning to use data from OpenStreetMap.org. My question is How do you approach these types of problem?

teal mortar Dec 21, 2021, 11:53 AM

#

sour quarry Have any of you had any experience with Google colab or similar tools? I'm tryin...

you can use google colab or kaggle notebooks, would recommend kaggle if dataset you use is on kaggle, colab when you get your dataset from github or google drive

rigid zodiac Dec 21, 2021, 12:04 PM

#

Hey guys I'm trying to do CNN model and I receive this error message.

The response is supposed to have 2 classes, either Fall or non-Fall.
https://paste.pythondiscord.com/keminotixa.sql

#

!paste

hybrid mica Dec 21, 2021, 12:57 PM

#

does anyone know of a dataset for a positive/negative sentiment analysis?

serene scaffold Dec 21, 2021, 1:06 PM

#

hybrid mica does anyone know of a dataset for a positive/negative sentiment analysis?

https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews/data

Sentiment Analysis on Movie Reviews

Classify the sentiment of sentences from the Rotten Tomatoes dataset

#

I didn't check to see if the reviews are classified strictly as positive or negative, but if it's on a scale of some kind, you can discretize it.

#

(for example, you could say data['is_positive'] = data['sentiment'] >= 2.5)

#

I dunno, try it

#

(rather, I don't remember because I'm on my work computer. if it requires an account then I must have one on my home computer)

coral sage Dec 21, 2021, 1:33 PM

#

I have a Pandas Series with each value as a Series of authors and the number of messages they sent in a given day, how do I convert it into a DataFrame?

#

#

Excuse the terrible drawing but that's basically what I wanna do on a bigger scale

serene scaffold Dec 21, 2021, 1:44 PM

#

coral sage I have a Pandas Series with each value as a Series of authors and the number of ...

so your series has multiple levels of indexing?

#

can you do print(series.to_dict()) and paste the text into the chat?

coral sage Dec 21, 2021, 1:45 PM

#

serene scaffold so your series has multiple levels of indexing?

It's a bunch of series within a series

#

The index is a bunch of dates

serene scaffold Dec 21, 2021, 1:46 PM

#

coral sage It's a bunch of series within a series

that's fine; please do the statement I showed you and show the text.

coral sage Dec 21, 2021, 1:46 PM

#

serene scaffold that's fine; please do the statement I showed you and show the text.

Yeah sure give me a minute

serene scaffold Dec 21, 2021, 1:51 PM

#

Please ping me if you come back; otherwise I'm going to do something else.

#

!docs pandas.DataFrame.unstack

arctic wedgeBOT Dec 21, 2021, 1:53 PM

#

pandas.DataFrame.unstack


DataFrame.unstack(level=-1, fill_value=None)
Pivot a level of the (necessarily hierarchical) index labels.

Returns a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels.

If the index is not a MultiIndex, the output will be a Series (the analogue of stack when the columns are not a MultiIndex).

serene scaffold Dec 21, 2021, 1:53 PM

#

The solution will probably involve that. Good luck!

lapis sequoia Dec 21, 2021, 1:54 PM

#

Hi all

#

Has anyone built a tool for forecasting in Python?

serene scaffold Dec 21, 2021, 1:55 PM

#

lapis sequoia Anyone has build a tool for forecasting in python?

there's tslearn for time series stuff

lapis sequoia Dec 21, 2021, 1:56 PM

#

Yes but I need to build an exe

serene scaffold Dec 21, 2021, 1:56 PM

#

why does it need to be an exe?

lapis sequoia Dec 21, 2021, 1:56 PM

#

Requirement

serene scaffold Dec 21, 2021, 1:56 PM

#

well, once you have something working, you can ask how to make it an exe in #tools-and-devops

lapis sequoia Dec 21, 2021, 1:57 PM

#

Does tslearn give me some dashboard sort of things

serene scaffold Dec 21, 2021, 1:58 PM

#

if you want to make something with a UI, you can ask about that in #user-interfaces. AI libraries are about the actual AI component, and then there are other libraries for making interfaces.

coral sage Dec 21, 2021, 2:07 PM

#

serene scaffold Please ping me if you come back; otherwise I'm going to do something else.

{(datetime.date(2021, 6, 20), 'Name1'): 398,
 (datetime.date(2021, 6, 20), 'Name2'): 3,
 (datetime.date(2021, 6, 20), 'Name3'): 180,
 (datetime.date(2021, 6, 20), 'Name4'): 99,
 (datetime.date(2021, 6, 20), 'Name5'): 120,
 (datetime.date(2021, 6, 20), 'Name6'): 1,
 (datetime.date(2021, 6, 20), 'Name7'): 1347,
 (datetime.date(2021, 6, 20), 'Name8'): 893,
 (datetime.date(2021, 6, 20), 'Name9'): 207,
 ...

Sorry, I got preoccupied by something IRL. This is what I get when I use the .to_dict() method on the series.

serene scaffold Dec 21, 2021, 2:08 PM

#

coral sage ```py {(datetime.date(2021, 6, 20), 'Name1'): 398, (datetime.date(2021, 6, 20),...

In [12]: series.to_frame()
Out[12]:
                     0
2021-06-20 Name1   398
           Name2     3
           Name3   180
           Name4    99
           Name5   120
           Name6     1
           Name7  1347
           Name8   893
           Name9   207

In [13]: series.to_frame().unstack(level=1)
Out[13]:
               0
           Name1 Name2 Name3 Name4 Name5 Name6 Name7 Name8 Name9
2021-06-20   398     3   180    99   120     1  1347   893   207

There's only one row because the sample data only has one unique date.

#

to_frame turns the Series into a DataFrame with two levels of indexing. The first level is the date and the second level is the name

#

Unstacking the second level (which is 1, because the numbering starts at 0) achieves the desired result.

#

actually it looks like you can just do series.unstack(level=1) and it's converted to a DataFrame as part of that

#

Yay!
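For anyone who wants to reproduce this, here is a copy/pastable sketch that rebuilds a small Series from a dict with (date, name) tuple keys and unstacks it. It assumes pandas is installed; the first two entries come from the dict posted above, and the second date and its counts are made up so that the result has more than one row.

```python
import datetime

import pandas as pd

# Tuple keys become a two-level MultiIndex: (date, author name).
data = {
    (datetime.date(2021, 6, 20), "Name1"): 398,
    (datetime.date(2021, 6, 20), "Name2"): 3,
    (datetime.date(2021, 6, 21), "Name1"): 12,   # made-up second date
    (datetime.date(2021, 6, 21), "Name2"): 7,    # made-up second date
}
series = pd.Series(data)

# Pivot the second index level (the names) into columns: one row per date.
frame = series.unstack(level=1)
print(frame)
```

If some author has no entry on a given date, that cell comes out as NaN unless a `fill_value` is passed to `unstack`.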

coral sage Dec 21, 2021, 2:11 PM

#

ahh yeah

#

Thank you very much :D, both worked

serene scaffold Dec 21, 2021, 2:11 PM

#

lemon_hyperpleased

#

also remember to have a copy/pastable example ready whenever you have a pandas question

#

🐼 lemon_warpaint

coral sage Dec 21, 2021, 2:12 PM

#

ah thanks for the tip, I'll remember that for next time :D

serene scaffold Dec 21, 2021, 2:12 PM

#

lemon_hyperpleased

coral sage Dec 21, 2021, 2:12 PM

#

lemon_happy

rigid zodiac Dec 21, 2021, 2:42 PM

#

Hey guys I'm trying to do CNN model and I receive this error message.

The response is supposed to have 2 classes, either Fall or non-Fall.
https://paste.pythondiscord.com/keminotixa.sql

delicate sphinx Dec 21, 2021, 2:44 PM

#

rigid zodiac Hey guys I'm trying to do CNN model and I receive this error message. The respo...

Your x train is probably the wrong size for the model input

rigid zodiac Dec 21, 2021, 2:45 PM

#

delicate sphinx Your x train is probably the wrong size for the model input

this is my x_train and y_train shape

#

delicate sphinx Dec 21, 2021, 2:45 PM

#

So your model is probably outputting shape 10 my bad

#

I'm not sure 10 and 2 are good values because if it was a power of 2 (i.e. 16) it's much easier to compress to your desired shape

#

(I.e. 16 > 8 > 4 > 2)

#

Instead of 10 > 5

#

(The > is an arrow im just on phone)

rigid zodiac Dec 21, 2021, 2:47 PM

#

I'm still a bit confused about this. So what does the output shape mean? Is it the number of classes for my outcome?

delicate sphinx Dec 21, 2021, 2:47 PM

#

So maybe add another layer thats size 2

#

Can you do model.summary()

#

And show the final layer(s)

#

The output shape section will likely say (None, 10)

#

If you add another layer that outputs shape (None, 2) it should work, otherwise maybe try and make a layer that goes into 16 then go to 2 for the output

rigid zodiac Dec 21, 2021, 2:49 PM

#

delicate sphinx If you add another layer that outputs shape (None, 2) it should work, otherwise ...

how can I do that? it's my first time try with CNN model

delicate sphinx Dec 21, 2021, 2:49 PM

#

Can you send pictures of your model code or your model.summary()

#

Just so I know if what I'm saying might fix it or not

rigid zodiac Dec 21, 2021, 2:51 PM

#

delicate sphinx Can you send pictures of your model code or your model.summary()

delicate sphinx Dec 21, 2021, 2:51 PM

#

Ok so can you add a Dense(2)

#

As it's sequential you should be able to just do model.add(tf.keras.layers.Dense(2))

#

Just put that right at the bottom

rigid zodiac Dec 21, 2021, 2:52 PM

#

before the model.compile() and model.fit () right

delicate sphinx Dec 21, 2021, 2:52 PM

#

Also just to give you a bit of extra info that may help you understand it

#

The (None, x) has None because thats your batch size

#

So that's a variable output size

#

If you're new to tf I just figured it worth saying that :)

delicate sphinx Dec 21, 2021, 2:53 PM

#

rigid zodiac before the model.compile() and model.fit () right

Yes

rigid zodiac Dec 21, 2021, 2:54 PM

#

delicate sphinx Yes

thank you so much, so I will adjust my outputting as 16

delicate sphinx Dec 21, 2021, 2:55 PM

#

delicate sphinx The (None, x) has None because thats your batch size

I see you are a tad confused.

Doing batch size 50 gives (50,x),
batch size 100 gives (100, x) etc
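To make the shapes concrete, here is a plain-Python imitation of what a final fully connected layer does to the shape of its input. It is not Keras code: `dense` and `shape` are made-up helpers, the weights are arbitrary, and 16 input features is just the example size discussed in this thread. The point is that the first dimension follows the batch size (the `None` in `model.summary()`), while the second is fixed by the number of units in the last layer, which has to match the 2 classes.

```python
def dense(batch, weights, bias):
    """Fully connected layer: (batch_size, n_in) -> (batch_size, n_out)."""
    columns = list(zip(*weights))  # one column of weights per output unit
    return [
        [sum(x * w for x, w in zip(row, col)) + b for col, b in zip(columns, bias)]
        for row in batch
    ]

def shape(matrix):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))

n_in, n_classes = 16, 2
weights = [[0.1] * n_classes for _ in range(n_in)]   # arbitrary values
bias = [0.0] * n_classes

shapes = {}
for batch_size in (50, 100):
    features = [[1.0] * n_in for _ in range(batch_size)]
    shapes[batch_size] = shape(dense(features, weights, bias))

print(shapes)
```

Whatever the batch size, the output has 2 columns, one per class, which is the shape the labels need to line up with.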

delicate sphinx Dec 21, 2021, 2:56 PM

#

rigid zodiac thank you so much, so I will adjust my outputting as 16

Yeah maybe 16 was wrong of me to say. But you can replace the 10 with 16 and add a layer dense 2

#

Sorry I woke up a few minutes ago haha

rigid zodiac Dec 21, 2021, 2:57 PM

#

delicate sphinx Sorry I woke up a few minutes ago haha

lol well I'm glad that you woke up on time. I haven't learned or built this kind of model before. What book / site should I read to know more about this?

rigid zodiac Dec 21, 2021, 2:58 PM

#

delicate sphinx Yeah maybe 16 was wrong of me to say. But you can replace the 10 with 16 and add...

maybe 16 is not ok

delicate sphinx Dec 21, 2021, 2:59 PM

#

rigid zodiac lol well i'm glad that you wake up on time. I havent learn or do this model befo...

Tensorflow has lots of documented guides that really give you a boost in your learning

#

Yeah sorry I meant 16 and 2

#

You could keep the 10 if you really want but personally I love using powers of 2 for all my layers due to the way tensorflow can squash stuff

#

But add another Dense(2) layer just after the 16

sleek sentinel Dec 21, 2021, 3:01 PM

#

Hi

#

Is there a module for detecting the font in an image?

rigid zodiac Dec 21, 2021, 3:02 PM

#

delicate sphinx Yeah sorry I meant 16 and 2

When I add model.add(Dense(2)) it gives me this error

delicate sphinx Dec 21, 2021, 3:04 PM

#

Model summary?
Might need to change a dense layer from 10 to 16

rigid zodiac Dec 21, 2021, 3:05 PM

#

delicate sphinx Model summary? Might need to change a dense layer from 10 to 16

it doesnt run at all. for the previous run of the model, I have this summary

rigid zodiac Dec 21, 2021, 3:07 PM

#

delicate sphinx Model summary? Might need to change a dense layer from 10 to 16

I see, to be able to run it I will need to change the n_classes = 16 and then change the dense layer to 16 also

delicate sphinx Dec 21, 2021, 3:07 PM

#

I thought your dense(16) worked it was just the output size issue

delicate sphinx Dec 21, 2021, 3:08 PM

#

rigid zodiac it doesnt run at all. for the previous run of the model, I have this summary

Dense_7 should probably be 16 and then add a dense_8 thats size 2 (your dense layers will be more like dense_10 because this model.summary() is outdated)

#

But in that model.summary() id recommend size of 16 into a new layer of 2

rigid zodiac Dec 21, 2021, 3:09 PM

#

dense_8 with size 2 is not working. I put it in as you said before

rigid zodiac Dec 21, 2021, 3:11 PM

#

delicate sphinx Dense_7 should probably be 16 and then add a dense_8 thats size 2 (your dense la...

but if I change Dense_6 to 32 and Dense_7 to 16 then it works lol

delicate sphinx Dec 21, 2021, 3:28 PM

#

Sorry had to give a lift to someone

#

Yeah tensorflow squishes them by halving and multiplying

#

I.e. a layer of 255 would rarely work because it wouldn't be compatible with other layers fully, I.e. 255 would halve to 127.5 and either go to 127 or 128

#

But if you want another layer of 255 the best that layer could give you is 254 or 256

#

Which is why I try to keep all my outputs to a power of 2

rigid zodiac Dec 21, 2021, 3:32 PM

#

delicate sphinx Which is why I try to keep all my outputs to a power of 2

I see but somehow it just wont work with mine 😦

delicate sphinx Dec 21, 2021, 3:33 PM

#

whats the model.summary() now

rigid zodiac Dec 21, 2021, 3:34 PM

#

hold on

rigid zodiac Dec 21, 2021, 3:36 PM

#

delicate sphinx whats the model.summary() now

but when I run model.compile() I just get the error: Shapes (None, 16) and (None, 2) are incompatible

delicate sphinx Dec 21, 2021, 3:36 PM

#

hmm weird

#

a very bad solve, but maybe just try another dense after 16 of 8 then 4 then 2

#

but idk

rigid zodiac Dec 21, 2021, 3:37 PM

#

let me try with 8

rigid zodiac Dec 21, 2021, 3:39 PM

#

delicate sphinx but idk

I think the reason it won't run is the number of classes. I need to change it to 2 since I only have 2 classes
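
That fits the error message: assuming the model is compiled with a categorical cross-entropy loss (the chat does not say which loss is used), each prediction is compared with a one-hot label element by element, so the last layer needs as many units as there are classes. A minimal sketch in plain Python, where `one_hot` and `cross_entropy` are illustrative helpers and not TensorFlow functions:

```python
import math


def one_hot(label, n_classes):
    """Turn an integer class label into a one-hot list of length n_classes."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]


def cross_entropy(y_true, y_pred):
    """Categorical cross-entropy for a single sample."""
    if len(y_true) != len(y_pred):
        raise ValueError(
            f"Shapes ({len(y_pred)},) and ({len(y_true)},) are incompatible"
        )
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


label = one_hot(1, n_classes=2)  # two classes -> label width 2

# A 2-unit output lines up with the labels.
print(cross_entropy(label, [0.25, 0.75]))

# A 16-unit output does not, whatever the sizes of the hidden layers are.
try:
    cross_entropy(label, [1 / 16] * 16)
except ValueError as err:
    print(err)
```

In this sketch only the width of the final output matters for the mismatch; the hidden layer sizes in between can be any positive integer.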

delicate sphinx Dec 21, 2021, 3:40 PM

#

yeah I don't know all that n_classes stuff you mentioned earlier

#

but that's a shout, I just assumed you hadn't made a change

small mist Dec 21, 2021, 3:41 PM

#

print (hello world )

rigid zodiac Dec 21, 2021, 3:41 PM

#

it works now, but the accuracy is shit so far. I will need to wait 3 hours to know

delicate sphinx Dec 21, 2021, 3:42 PM

#

you can always split up the dataset, your batch size was 16,000 or so a minute ago?

delicate sphinx Dec 21, 2021, 3:42 PM

#

small mist print (hello world )

you're missing quotation marks print("Hello world")

delicate sphinx Dec 21, 2021, 4:13 PM

#

I've only learnt about it in uni and not done it myself as it sounds really complex, but I think eigenvalues are used for general face recognition

#

so it would make sense that you can expand on that for emotions (i.e. smiling)

#

I've no clue about the actual method to solve it all, I can't even seem to figure out how to one-hot encode my own work but yeah that sort of stuff is all I know

#

Personal + uneducated opinion, but lower face would be my go to

warm verge Dec 21, 2021, 4:20 PM

#

delicate sphinx Personal + uneducated opinion, but lower face would be my go to

Like the mouth?

#

I wanna see if I can detect that yeah

delicate sphinx Dec 21, 2021, 4:21 PM

#

grayscale may be a shout, just because colour would require more data (i.e. 10,000 with and 10,000 without lipstick etc)

#

yeah for sure

warm verge Dec 21, 2021, 4:22 PM

#

Then I can detect only the face using Haar cascade

delicate sphinx Dec 21, 2021, 4:23 PM

#

and it can give you some images to send to friends and give them nightmares

warm verge Dec 21, 2021, 4:23 PM

#

So once I have a grayscale face, it should be the best thing to apply PCA to

delicate sphinx Dec 21, 2021, 4:24 PM

#

forehead being scrunched can be surprised

#

though the issue with that could bias older people as surprised (due to wrinkles)

warm verge Dec 21, 2021, 4:24 PM

#

Forehead would be good yes, I haven't thought about that

delicate sphinx Dec 21, 2021, 4:25 PM

#

Yeah you'd have to account for elderly people though

warm verge Dec 21, 2021, 4:25 PM

#

The proposed system performs better than existing technique for facial emotion recognition when Gradient Filter, PCA and PSO has been considered for feature extraction with random forest classification technique.

delicate sphinx Dec 21, 2021, 4:25 PM

#

though I'm not 100% on this perhaps if their eyes are not wide open but their forehead is scrunched then they're old

warm verge Dec 21, 2021, 4:25 PM

#

delicate sphinx Yeah you'd have to account for elderly people though

I think my data sets are all 'younger' people so should be fine in that sense

delicate sphinx Dec 21, 2021, 4:25 PM

#

but if eyes are wide open and forehead scrunched then they're surprised

warm verge Dec 21, 2021, 4:25 PM

#

Although I would talk about its potential inaccuracy for wrinkles

#

Also I keep seeing 'random forest classifier' everywhere but I have no idea what it is

#

Google/Wikipedia/ELI5 not very helpful

delicate sphinx Dec 21, 2021, 4:29 PM

#

Outside of my already limited knowledge I'm afraid

warm verge Dec 21, 2021, 4:32 PM

#

Nw, thank you very much for the input! Got some new ideas now 😄

delicate sphinx Dec 21, 2021, 4:34 PM

#

No worries, if you want to really expand it you can use RGB image to detect for wrinkles so you can better predict if their emotion is biased or not

#

but that's a self-thought idea so the practicality of it god knows

warm verge Dec 21, 2021, 4:35 PM

#

RGB is definitely something that has big trade-offs so I'll try both ways

delicate sphinx Dec 21, 2021, 4:35 PM

#

yeah you'd only want to preprocess bias weights with it

#

though you don't need to preprocess it and could just measure the amount of wrinkles and use that as a secondary accuracy score

#

i.e. "due to the wrinkles detected in the image this emotion guess may only be 30% accurate"

#

though in a less-harsh way so people in their 20-40s don't feel like you're calling them old haha

warm verge Dec 21, 2021, 4:46 PM

#

That's a good point

#

I already have so many ethical considerations, ageism would be a good one to talk about

#

Good as in there's a lot to say, not that ageism is good

delicate sphinx Dec 21, 2021, 4:48 PM

#

haha im glad

mortal silo Dec 21, 2021, 5:11 PM

#

for theano to work with g++, do I have to have all 500mb of mingw-w64 files from conda? Aren't there any lightweight options or maybe I can precompile it somehow? I'm asking because I need to be able to run the program in any environment.

warped rapids Dec 21, 2021, 6:14 PM

#

Does anyone have experience with matplotlib?

#

There's one thing I have to fix

#

And I have been stuck forever

hoary flame Dec 21, 2021, 6:21 PM

#

Anyone experienced with Mask R-CNN? I am trying to figure out the input format for the system. My dataset does not include a JSON file; instead, each instance's pixels are labelled with a grayscale value.

warm verge Dec 21, 2021, 6:23 PM

#

warped rapids Does anyone have experience with matplotlib?

Post question

warped rapids Dec 21, 2021, 6:23 PM

#

I already have

#

https://discordapp.com/channels/267624335836053506/366673702533988363/922906945667596288

mortal silo Dec 21, 2021, 6:40 PM

#

mortal silo for theano to work with g++, do I have to have all 500mb of mingw-w64 files from...

pls help .-.

delicate sphinx Dec 21, 2021, 6:54 PM

#

mortal silo pls help .-.

Might be worth opening a help channel and asking there as more people might see it

#

while also keeping it in here

#

personally I have no knowledge of it 😦

mortal silo Dec 21, 2021, 6:55 PM

#

I tried. no one seems to know this 😦
Well time to try again

warped rapids Dec 21, 2021, 7:44 PM

#

@warm verge

warped rapids Dec 21, 2021, 7:44 PM

#

warped rapids https://discordapp.com/channels/267624335836053506/366673702533988363/9229069456...

Here

warm verge Dec 21, 2021, 7:44 PM

#

warped rapids @warm verge

Why's it in web dev

warped rapids Dec 21, 2021, 7:45 PM

#

Its flask

warm verge Dec 21, 2021, 7:45 PM

#

You asked about matplotlib?

warped rapids Dec 21, 2021, 7:45 PM

#

Yes

#

But its integrated in flask

#

In my example

#

So it all correlates

#

Plots data to webbrowser

#

But do you have any idea?

warm verge Dec 21, 2021, 7:48 PM

#

https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.plot.html

#

It's one of those methods but I can't remember which

#

Lets you plot above

warped rapids Dec 21, 2021, 7:52 PM

#

No, it's text method

#

I just need texts

warm verge Dec 21, 2021, 7:56 PM

#

On the axes

warped rapids Dec 21, 2021, 7:57 PM

#

No

#

I think what you mean are labels

#

I just want text

#

https://matplotlib.org/stable/tutorials/text/text_intro.html

#

Like that

#

Or like this @warm verge

#

The percentages

warm verge Dec 21, 2021, 8:00 PM

#

So you can either plot them in line with a datum point, such as 0.05 on that example graph, or use the axes and plot above

warped rapids Dec 21, 2021, 8:00 PM

#

Yeah but inputs differ

#

plt.text(mostLeftSection, stats.norm.pdf(mostLeftSection, u, o), "test")

#

This is what I have

#

To try and have text in the mostleftsection

#

It works for color

#

mostLeftSection = np.linspace(u - 3 * o, u - 1 * o, 100)

#

plt.fill_between(mostLeftSection, 0, stats.norm.pdf(mostLeftSection, u, o), facecolor='#49393d')
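
One likely catch in the snippet above: `plt.text` expects a single x and a single y, while `mostLeftSection` is an array of 100 points (which is fine for `fill_between`). A minimal sketch, assuming the goal is one label centred in the shaded region. `label_position` is a hypothetical helper, and it uses the standard library's `statistics.NormalDist` in place of `stats.norm.pdf` so that it runs without SciPy:

```python
from statistics import NormalDist


def label_position(u, o, lo_sigmas=-3, hi_sigmas=-1):
    """Return one (x, y) point for a label inside the band
    [u + lo_sigmas * o, u + hi_sigmas * o] under the normal curve."""
    x = u + (lo_sigmas + hi_sigmas) / 2 * o   # midpoint of the band
    y = NormalDist(mu=u, sigma=o).pdf(x) / 2  # halfway up the curve at x
    return x, y


x, y = label_position(u=0.0, o=1.0)
print(x, y)

# With matplotlib, the two scalars can then be passed straight to text():
# plt.text(x, y, "test", ha="center", va="center")
```

The same helper can be called with other `lo_sigmas` / `hi_sigmas` values to place the percentage labels in the remaining sections.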

novel acorn Dec 21, 2021, 8:19 PM

#

Hello everyone, hope you're doing great!! 😄

Is there a way to get the function behind a neural network? In this image below, I want to get b

#

This is the network

warped rapids Dec 21, 2021, 8:39 PM

#

Does anyone have experience with matplotlib and would like to help? :)

summer anchor Dec 21, 2021, 9:02 PM

#

Greetings, I'd like to have some advice regarding our application.
We serve an API (Flask) that analyses images after testing them through about 10 models.
Currently the code base is a bit messy, I am trying to use a pattern that would allow us to plug-in and out different models, also increase reusability of the source code by implementing a class based code where we separate concerns for preprocessing, configuration and finally running the models against input image. Anyways, I am newer on the ML/DL domain although I have fundamental understanding of OOP and design patterns. Any recommendation is appreciated regarding "organization" of the project, thanks in advance!
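
One possible shape for that separation, sketched in plain Python with hypothetical names throughout (`ModelPlugin`, `Pipeline` and `BrightnessCheck` are illustrations, not part of Flask or any ML library): each model implements the same small interface, and the API layer only talks to a registry of them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Protocol


class ModelPlugin(Protocol):
    """Interface every pluggable model is assumed to implement."""
    name: str

    def preprocess(self, image: Any) -> Any: ...
    def predict(self, prepared: Any) -> Any: ...


@dataclass
class BrightnessCheck:
    """Toy 'model': reports whether an image (a list of 0-255 pixel values)
    is bright enough."""
    name: str = "brightness"
    threshold: float = 0.5  # configuration lives with the model

    def preprocess(self, image):
        return [p / 255 for p in image]

    def predict(self, prepared):
        return sum(prepared) / len(prepared) >= self.threshold


@dataclass
class Pipeline:
    """Registry that runs every registered model against one input."""
    models: Dict[str, ModelPlugin] = field(default_factory=dict)

    def register(self, model: ModelPlugin) -> None:
        self.models[model.name] = model

    def unregister(self, name: str) -> None:
        self.models.pop(name, None)

    def analyse(self, image: Any) -> Dict[str, Any]:
        return {
            name: model.predict(model.preprocess(image))
            for name, model in self.models.items()
        }


pipeline = Pipeline()
pipeline.register(BrightnessCheck())
print(pipeline.analyse([255, 255, 0, 0]))  # {'brightness': True}
```

A Flask view would then only need to call `pipeline.analyse(...)`, so adding or removing a model does not touch the route code.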

arctic wedgeBOT Dec 21, 2021, 9:11 PM

#

:incoming_envelope: :ok_hand: applied mute to @lapis sequoia until <t:1640121661:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

odd meteor Dec 21, 2021, 9:22 PM

#

novel acorn Hello everyone, hope you're doing great!! 😄 Is there a way to get the function...

from this image, b is simply your output (the result predicted by your network).

Where:

X1, X2,..., Xn = n nodes/neurons in your input layer
W1, W2, ..., Wn = The weights of the respective neurons in the input layer

Sigma = Your hidden layer with one neuron (remember each neuron in a hidden layer has its own bias weight; because we only have 1 neuron in our hidden layer, that's why we have W0 there)

a = sum of inputs x weights + bias

g(x) = the activation function. It should be written g(a) though: a is passed into the activation function to give us our yield

b = yield a.k.a output
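
The same computation as a minimal Python sketch. The sigmoid is only an assumed choice for g (the image may use a different activation), and `neuron` is a hypothetical helper name:

```python
import math


def sigmoid(a):
    """One common choice for the activation g; ReLU or tanh plug in the same way."""
    return 1.0 / (1.0 + math.exp(-a))


def neuron(inputs, weights, bias, g=sigmoid):
    """Compute b = g(a), where a = sum(x_i * w_i) + bias."""
    a = sum(x * w for x, w in zip(inputs, weights)) + bias
    return g(a)


# Arbitrary example values for x1..x3, w1..w3 and the bias weight w0.
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.0]
w0 = 0.0

b = neuron(x, w, w0)
print(b)  # a = 0.5 - 0.5 + 0.0 + 0.0 = 0.0, and sigmoid(0.0) = 0.5
```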

odd meteor Dec 21, 2021, 9:30 PM

#

novel acorn This is the network

You've built your neural network architecture here, and have even made your neural net fertile and ready for training by setting the loss function (the one your optimization function should drive to its global minimum).

So the next stage is to train your model (neural nets) by calling the fit method on it.

Then to get 'b', call your predict method on the model.

That's how to get the yield. ✌️

novel acorn Dec 21, 2021, 10:00 PM

#

thank you, is much clearer now!

desert bear Dec 21, 2021, 10:31 PM

#

Hey, I have developed my own neural network based on gradient descent for back propagation. I'm now trying to test it, and to my surprise I always get similar results when it comes to worsening of the model.
No matter the data shuffling, performance of my NN drops always after third epoch:

#

Is this concerning to you?

gentle swift Dec 21, 2021, 10:43 PM

#

Saw this on Product Hunt today: https://www.producthunt.com/posts/modern-data-stack

All things related to modern data stack for engineering in a single place. Nice resource!

Product Hunt

Modern Data Stack - One platform for all things about the modern d...

A platform for everything you need to know about the Modern Data Stack ⭐️ Companies & Categories shaping the Modern Data Stack 📚 Data stacks of the world's top companies 📖 Resources to get updates on the latest in this space 🛠 Jobs in data engineering

loud cave Dec 21, 2021, 11:43 PM

#

summer anchor Greetings, I'd like to have some advice regarding our application. We serve an A...

Try BentoML

#

It is a framework for deploying ML models

rose pasture Dec 21, 2021, 11:51 PM

#

Hey guys how do you judge if whether a data science bootcamp is good or not?

odd meteor Dec 22, 2021, 12:28 AM

#

rose pasture Hey guys how do you judge if whether a data science bootcamp is good or not?

I think this question is subjective, so here's how I'd gauge the program:

1. Richness of the curriculum.
2. Duration of the bootcamp.
3. Do they assume everyone is a novice and willing to start from the basics, or do they assume we all know what we're doing 😂.
4. Pay attention to the stated prerequisite(s), if there are any.
5. Where exactly are the people from their previous cohort now? Are the majority of them now employed as Data Scientists, or still looking to enroll in another data science bootcamp? (This is the point where you really need to put on your Investigative Journalist + FBI cap) 😂
6. The experience of your instructors. Are they Data Scientists or ML Engineers at well-to-do companies?
7. Last but not least... the teaching style. Do they assign mentors to students, what kinda projects are you gon be working on as your Capstone project... etc.

==================
I'm not affiliated with FourthBrain but I'll always recommend their Bootcamp if you can finesse the payment.

https://FourthBrain.ai

FourthBrain

Become a Machine Learning Engineer with our live online program.

delicate sphinx Dec 22, 2021, 12:50 AM

#

(TENSORFLOW) does anyone know if my combination of TextVectorization + Embedding layer mean I can remove a mask_zero = True parameter

#

as mask_zero = True is stopping me from correctly loading my model from a json

#

but I want to be sure I don't need it before I remove it

rose pasture Dec 22, 2021, 2:14 AM

#

odd meteor I think this question is subjective so here's how I'll guage the program: 1. Ri...

What do you think of this one?

https://concordiabootcamps.ca/courses/data-science-full-time/

Looks like ill have to put my fbi cap on and find out what their projects are ahah

Concordia Bootcamps

Data Science Bootcamp - Concordia Bootcamps

Learn Python, SQL, Foundations of Data Modeling, & Machine Learning. Join Our Online Data Science Bootcamp and Be Job-Ready in Just 12-weeks.

wicked grove Dec 22, 2021, 2:17 AM

#

```python
def flip(image):
    image=cv2.flip(image,1)
    return image
#train_datagen = ImageDataGenerator(preprocessing_function=orth_rot,horizontal_flip=False)


j=0
my_img=os.listdir(train_path)
for i,image_name in enumerate(my_img):
    if(image_name.split('.')[1]=='jpg'):
        print(train_path+image_name)
        x=cv2.imread(os.path.join(train_path,image_name))
        x=cv2.cvtColor(x,cv2.COLOR_BGR2RGB)

        y=crop_square(x,512)
        im_flip=flip(y)
        plt.imshow(im_flip)
        plt.show()

        cv2.imwrite(os.path.join(save_path,str(j),'_',image_name),im_flip)
        break
```

#

can someone please tell me why cv2.imwrite is not working

delicate sphinx Dec 22, 2021, 2:23 AM

#

can we get an output and the error output

#

Idk about cv2 but if it's something trivial I might be able to help

wicked grove Dec 22, 2021, 2:24 AM

#

there is no error

#

the image just doesnt get saved to the location

delicate sphinx Dec 22, 2021, 2:24 AM

#

does cv2 require opening a file to write to?

#

and it definitely prints just after the if? (if not then the if statement is likely wrong but as you're asking about cv2.imwrite I'll assume it definitely reaches that point)

stone marlin Dec 22, 2021, 2:27 AM

#

How's it goin' tonight, y'all? I've got a few days to kill so I wanted to get grounded with some of those NN things y'all have been chattin' about. :'] Haha.

[Note: I've been workin' in ML/DS for a while, so I've got a pret good background in Python and general ML/DS architecture stuff, DDB nonsense, etc. Just don't know much about that sweet, sweet NN stuff!]

Anyone got a tutorial series they dig?
Anyone got a preferred framework? Is TF still the standard?

I was told "deeplearnin.ai" is a good place to start, but, you know, wanted to survey a bit.

wicked grove Dec 22, 2021, 2:28 AM

#

delicate sphinx and it definitely prints just after the if? (if not then the if statement is lik...

yes it reached that point cause i have plotted the image and that gets displayed
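
One thing worth checking in the snippet above (a guess from reading the code, not a confirmed diagnosis): `os.path.join(save_path, str(j), '_', image_name)` treats every argument as a separate path component, so it points into nested folders under `save_path` that probably do not exist. `cv2.imwrite` generally signals failure by returning `False` rather than raising, which would explain the missing error. A small sketch of the path difference, using `posixpath` so the output is the same on every OS, with made-up example names:

```python
import posixpath  # same behaviour as os.path on Linux/macOS

save_path = "augmented"  # hypothetical output folder
image_name = "cat.jpg"   # hypothetical file name
j = 0

# What the original call builds: each argument becomes its own directory level.
nested = posixpath.join(save_path, str(j), "_", image_name)
print(nested)  # augmented/0/_/cat.jpg

# What was probably intended: one file name with a numeric prefix.
flat = posixpath.join(save_path, f"{j}_{image_name}")
print(flat)    # augmented/0_cat.jpg

# With OpenCV, the return value can then be checked explicitly:
# ok = cv2.imwrite(flat, im_flip)
# if not ok:
#     print("write failed:", flat)
```

Separately, because the image was converted to RGB for plotting, converting it back with `cv2.cvtColor(im_flip, cv2.COLOR_RGB2BGR)` before saving should keep the colours right.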

delicate sphinx Dec 22, 2021, 2:32 AM

#

stone marlin How's it goin' tonight, y'all? I've got a few days to kill so I wanted to get g...

TF has lots of guides that are nice

odd meteor Dec 22, 2021, 2:51 AM

#

stone marlin How's it goin' tonight, y'all? I've got a few days to kill so I wanted to get g...

Free Resources

Deeplearning.ai = PyTorch
Neuromatch.io = PyTorch

(ooh, I almost forgot to mention, GANs was covered here too 😀)
https://deeplearning.neuromatch.io/tutorials/intro.html

Andrew Ng's Deep Learning course on Coursera.
University of Youtube

stone marlin Dec 22, 2021, 2:54 AM

#

I think DeepLearning.ai links to Ng's DL course on coursera. Or at least that's what it linked to me. PyTorch is also fine for me, I don't mind either way with TF vs. PyTorch. Seems like DL.ai is a good place to start then!

odd meteor Dec 22, 2021, 2:56 AM

#

stone marlin How's it goin' tonight, y'all? I've got a few days to kill so I wanted to get g...

Researchers are leaning towards JAX lately. Personally, I find TensorFlow & Keras easier.

PyTorch is still probably the most popular Framework that's widely used.

stone marlin Dec 22, 2021, 2:58 AM

#

Yeah, I'm gonna see what it's all about, I'm not too worried about what I start with. Cool, thanks! I'm gonna try that then, and see how it goes.

odd meteor Dec 22, 2021, 3:03 AM

#

rose pasture What do you think of this one? https://concordiabootcamps.ca/courses/data-scien...

Bro I don't have much time to look into this at the moment. I also don't wanna influence or by any chance polarise your decision.

Do your due diligence. ✌️

odd meteor Dec 22, 2021, 3:04 AM

#

stone marlin Yeah, I'm gonna see what it's all about, I'm not too worried about what I start ...

That's cool

delicate sphinx Dec 22, 2021, 3:28 AM

#

Well, atleast my model doesn't overfit to just "yes" now ,-,

celest geyser Dec 22, 2021, 4:12 AM

#

Not sure if this is the right channel but does anyone know of good resources to start learning tacotron 2 (or other) for voice synthesis??

arctic crown Dec 22, 2021, 6:41 AM

#

how can i make my ai assistant copy my habits?
suppose i set an alarm for 7am 5 days straight, i want the program to set an alarm at that same time on the 6th day, in case i forget to set it myself?

woeful falcon Dec 22, 2021, 7:15 AM

#

what would be the best way for a beginner in ai and neural stuff to learn the basics, like a small goal to work towards (eg- make a tictactoe playing neural net) etc ?

olive patio Dec 22, 2021, 8:14 AM

#

https://colab.research.google.com/drive/1gQO_RddY0aBYtTQ2HTDcP6PXVEsJYuJL?usp=sharing

this is a colab project i've been working on. i'm using the dataset - https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

I was able to train the model and get accuracy and loss for validation sets. However, when I try to predict test set images the accuracy remains at 0.625. I ran it on 20 epochs and 4 epochs and 2 epochs and the value did not change. I know for a fact that the accuracy should be around 92% (I'm using a project for reference with the same model but some different code). Could someone please go through the notebook and see where I'm going wrong? I just cant figure it out. Thanks very much!

Google Colaboratory

Chest X-Ray Images (Pneumonia)

5,863 images, 2 categories

lapis sequoia Dec 22, 2021, 8:41 AM

#

Hi, anyone familiar with LSTM models? I am using an LSTM to predict stock price movement, however, the prediction result is very similar to the original test set, why?

#

features=['Open','High','Low','Volume']
scaler=MinMaxScaler()
feature_transform=scaler.fit_transform(df[features])
feature_transform= pd.DataFrame(columns=features, data=feature_transform, index=df.index)
feature_transform.head()
timesplit= TimeSeriesSplit(n_splits=10,test_size=6126)
for train_index, test_index in timesplit.split(feature_transform):
        X_train, X_test = feature_transform[:len(train_index)], feature_transform[len(train_index): (len(train_index)+len(test_index))]
        y_train, y_test = output_var[:len(train_index)].values.ravel(), output_var[len(train_index): (len(train_index)+len(test_index))].values.ravel()
trainX =np.array(X_train)
testX =np.array(X_test)
X_train = trainX.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = testX.reshape(X_test.shape[0], 1, X_test.shape[1])
lstm = Sequential()
lstm.add(LSTM(32, input_shape=(1, trainX.shape[1]), activation='relu', return_sequences=False))
lstm.add(Dense(1))
lstm.compile(loss='mean_squared_error', optimizer='adam')

history=lstm.fit(X_train, y_train, epochs=5, batch_size=8, verbose=1, shuffle=False)
y_pred= lstm.predict(X_test)
predictions=list(chain.from_iterable(y_pred.tolist()))

#

Here is the code I have used

#

Thanks. So my dataset is sorted by time, so there shouldn't be problem in input?

#

and I take a look at test_set, they took latest 6126 data which is fine, I just need it to predict next 6126 data

#

uh, I have 290K on train side, only 6126 on test side

#

well it because I can only get 6126 data for now to verify the prediction result, maybe more for future

#

but the thing I don't quite understand is still why the prediction looks basically same as test_set

#

well, maybe I will try another model and to see how it becomes, I use type() to check and didn't find out any problems in type

#

thanks

odd meteor Dec 22, 2021, 9:37 AM

#

olive patio https://colab.research.google.com/drive/1gQO_RddY0aBYtTQ2HTDcP6PXVEsJYuJL?usp=sh...

I haven't checked the code on Colab but try this

Plot the learning curves to aid your understanding of what's really going on.
Could be a case of Overfitting. If you didn't add Batch Normalization, or Dropout layer or Early Stopping callback, you might wanna do that and see if there's an improvement.
Might be the problem of selecting the right number of epoch vs. Batch size.

A quick way to find out if this has a hand in what's happening is to train your neural net with 2 different batch_sizes and a constant epochs.

For example use:

epochs = 5, batch_size = 1
epochs = 5, batch_size = X_train.shape[0]

lapis sequoia Dec 22, 2021, 9:50 AM

#

hi can anyone recommend me some resources to learn reinforcement learning

teal mortar Dec 22, 2021, 9:54 AM

#

lapis sequoia hi can anyone recommend me some resources to learn reinforcement learning

https://www.youtube.com/watch?v=TCCjZe0y4Qc&list=PLqYmG7hTraZDVH599EItlEWsUOsJbAodm&ab_channel=DeepMind

YouTube

DeepMind

2021 DeepMind x UCL RL Lecture Series - Introduction to Reinforceme...

Research Scientist Hado van Hasselt introduces the reinforcement learning course and explains how reinforcement learning relates to AI.

Slides: https://dpmd.ai/introslides
Full video lecture series: https://dpmd.ai/DeepMindxUCL21

amber lark Dec 22, 2021, 1:11 PM

#

How is data science connected to neural networks?
Can I learn NN without the knowledge of data science?

delicate sphinx Dec 22, 2021, 1:23 PM

#

NN is a part of data science I believe

#

I never explicitly learned about data science but I had learnt about what a NN is and how they work

#

so I think it's sort of hand-in-hand (though Data Science likely contains more than just NN obviously)

wary bison Dec 22, 2021, 1:28 PM

#

Hi guys need some advice on a thing. I have a univariate time series dataset on which I need to apply anomaly detection. Everything needs to be unsupervised - the thresholds for the anomalies also need to come from the model/algorithm. Any suggestions/recommendations?

lapis sequoia Dec 22, 2021, 1:29 PM

#

Hi could anyone please tell me about the math in machine learning?

delicate sphinx Dec 22, 2021, 1:31 PM

#

lapis sequoia Hi could anyone please tell me about the math in machine learning?

That's quite a difficult question to answer as in my opinion atleast it depends on what you intend to do with it all, if you use something like Tensorflow then you can avoid having to write any actual functions and the most math you get really is in figuring what numbers to put for which layer

#

if you plan to write from scratch or to apply your own functions to it, then you can expect lots of maths

lapis sequoia Dec 22, 2021, 1:32 PM

#

delicate sphinx That's quite a difficult question to answer as in my opinion atleast it depends ...

Ok thank you very much

#

I'm actually a beginner

delicate sphinx Dec 22, 2021, 1:32 PM

#

It can get really mathematical but there are many libraries/packages out there that you can use that will do it largely for you

#

i.e. tensorflow has libraries that can apply loss functions, optimization functions, etc. all without me doing anything more than importing a loss and optimizer module

#

but tensorflow also allows you to customise your input/output functions and more, in which from the look of some things you'd need a very mathematical brain to interpret well enough to make yourself

lapis sequoia Dec 22, 2021, 1:34 PM

#

Oh ok thank you very much 👍

delicate sphinx Dec 22, 2021, 1:35 PM

#

if you want to do stuff with data you can probably expect lots of disgusting variables to do things quick and efficiently

#

but it's just like any project, to you it makes sense and to others you've just written a piece of space language

chilly torrent Dec 22, 2021, 1:54 PM

#

I am struggling so hard with numpy.random.lognormal. For some reason, I am getting really high numbers when I am using a small mean. Does anyone have any idea what's going on?

# output 3.4214334251929405e+30

The output is about 30 orders of magnitude bigger than the input!

ancient sorrel Dec 22, 2021, 2:21 PM

#

I need some help, idk if it's related to ai or data science but it seems the closest domain. I have a 3d obj with a mtl texture and i have a script which takes photos of it from different angles. i'm using pyrender. for some reason the object i get in the photo is untextured. can anyone give me a hand here?

delicate sphinx Dec 22, 2021, 2:42 PM

#

chilly torrent I am struggling so hard with numpy.random.lognormal. For some reason, I am getti...

https://stackoverflow.com/questions/51609299/python-np-lognormal-gives-infinite-results-for-big-average-and-st-dev

Stack Overflow

Python np.lognormal gives infinite results for big average and St Dev

I am trying to draw the lognormal distribution for my data. using the following code:

mu, sigma = 136519., 50405. # mean and standard deviation
hs = np.random.lognormal(mu, sigma, 1000) #mean, s d...

#

does that help?

tidal bough Dec 22, 2021, 3:23 PM

#

chilly torrent I am struggling so hard with numpy.random.lognormal. For some reason, I am getti...

> mean : float or array_like of floats, optional
> Mean value of the underlying normal distribution. Default is 0.
> sigma : float or array_like of floats, optional
> Standard deviation of the underlying normal distribution. Must be non-negative. Default is 1.

These are parameters of the underlying normal distribution. So for a mean of 70, the mean of the result will be around exp(70) ~= 10^30
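
If the goal is a lognormal sample whose own mean and standard deviation are given (say a mean of 70), the parameters have to be converted first. A minimal sketch using the standard moment formulas for the lognormal distribution: `lognormal_params` is a hypothetical helper, the standard deviation of 10 is an assumed example value, and the standard library's `random.lognormvariate` stands in for `numpy.random.lognormal` (both take the mean and sigma of the underlying normal):

```python
import math
import random


def lognormal_params(mean, std):
    """Convert the desired mean/std of the lognormal samples into the
    mu/sigma of the underlying normal distribution."""
    variance_ratio = 1.0 + (std / mean) ** 2
    mu = math.log(mean / math.sqrt(variance_ratio))
    sigma = math.sqrt(math.log(variance_ratio))
    return mu, sigma


mu, sigma = lognormal_params(mean=70.0, std=10.0)

# Sanity check against the closed-form mean exp(mu + sigma^2 / 2).
print(math.exp(mu + sigma ** 2 / 2))  # approximately 70

random.seed(0)
samples = [random.lognormvariate(mu, sigma) for _ in range(100_000)]
print(sum(samples) / len(samples))    # close to 70, not 1e30
```

Note that the conversion uses the natural logarithm, since the lognormal is defined through exp.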

low spear Dec 22, 2021, 3:26 PM

#

can anyone help me with this? this is code i obtained from github and i tried to learn it by using my own input but i bumped into the error. this code is about vehicle detection using faster rcnn

delicate sphinx Dec 22, 2021, 3:28 PM

#

low spear anyone can help me on this? this is code i obtained from github and i tried to ...

if you give some input examples and their shapes then maybe, but from a quick glance, although I've not normalized batches, I'd assume it to be with input shape being incorrect

#

and if you want to help further a model.summary() can show the output shapes expected

#

making sure they all match is really good

normal violet Dec 22, 2021, 4:04 PM

#

hey if anyone here uses flask could they please look at this?

#

https://stackoverflow.com/questions/70451567/no-module-named-flask-in-vscode-terminal/70451670#70451670

Stack Overflow

'No module named flask' in VScode terminal

While attempting to create my first flask application, I created a virtual environment and used pip install Flask to attempt to install the flask module in the VScode terminal. This appears to have

#

please upvote if you can

chilly torrent Dec 22, 2021, 5:26 PM

#

delicate sphinx https://stackoverflow.com/questions/51609299/python-np-lognormal-gives-infinite-...

This was super helpful, thank you. I guess I will need to take the natural log of my mean to adjust my current approach

chilly torrent Dec 22, 2021, 5:27 PM

#

tidal bough > meanfloat or array_like of floats, optional > > Mean value of the underly...

Also very helpful, thank you

wooden night Dec 22, 2021, 5:54 PM

#

Any tips on implementing novel improvements from papers? (I have RL algorithms in mind specifically but I suppose it can apply for any paper-about-a-concept)
I'd consider myself a fairly competent python dev and I know the basics of RL and deep learning - enough to play around with stable_baselines3 and to get the gist of the algorithms - but when I read a paper my eyes glazeth over and I'm just unable to start writing an implementation

#

I've implemented tabular RL from scratch (trivial) and A2C from scratch (was very difficult). Is it just an experience thing, and I should be working my way up by implementing the actor-critic/Q-learning ladder myself from DQN/A2C up to TD3 and SAC?

desert oar Dec 22, 2021, 6:41 PM

#

@wooden night unglaze your eyes and start writing 🙂 i don't have experience specifically with reinforcement learning, but sometimes there's nothing you can do but sit down and write an implementation

#

trying to figure out test cases is probably the hardest part (you do want to test your code, right?)

#

these algorithms can be very very difficult to verify and audit

#

writing tests sometimes is a matter of verifying that certain mathematical properties hold (modulo floating point error)

bronze skiff Dec 22, 2021, 6:43 PM

#

as the physicists say, "shut up and calculate!"

wooden night Dec 22, 2021, 6:43 PM

#

desert oar these algorithms can be very very difficult to verify and audit

yeah that's what I struggled with the most, to my eyes my implementation ~= the paper/reference implementation but the reference implementation worked and mine didn't 😆

desert oar Dec 22, 2021, 6:47 PM

#

wooden night yeah that's what I struggled with the most, to my eyes my implementation ~= the ...

if this is "for work" and you aren't trying to replicate the paper for any kind of academic purposes, can you just use the reference implementation?

#

as another option, use the reference implementation to generate test data (known-correct input-output pairs)

#

plus maybe there are bugs in the reference implementation

wooden night Dec 22, 2021, 6:49 PM

#

desert oar if this is "for work" and you aren't trying to replicate the paper for any kind ...

Partially for work and partially for my own edification I suppose - I'm using a baseline implementation of algorithm A, but there's cool improvement B that I read about in a paper that I'd like to use

#

I don't have to use cool improvement B per se but I think it would really help, but sadly it hasn't been added to the package I'm using

desert oar Dec 22, 2021, 6:51 PM

#

maybe your employer will let you dedicate time to contributing the new version upstream even

arctic crown Dec 22, 2021, 6:55 PM

#

in ml what is support vector machines?
and how does it work?

hybrid mica Dec 22, 2021, 6:59 PM

#

in artificial neural networks, how does one know how many hidden layers are needed, and how many neurons?

desert oar Dec 22, 2021, 7:04 PM

#

hybrid mica in artificial neural networks, how does one know how many hidden layers are need...

you don't. as far as i know, there is still no generally useful theoretical approach to network architecture

boreal summit Dec 22, 2021, 7:07 PM

#

Mehn, it's so frustrating when much of the code in the book you're studying ain't running.

#

And this book was released this year. ☹️

bronze skiff Dec 22, 2021, 7:07 PM

#

code changes fast

#

that's why math == best lang

boreal summit Dec 22, 2021, 7:08 PM

#

I type in the exact code and think I'm wrong, then I go to the GitHub to copy and paste, yet it still doesn't run.

boreal summit Dec 22, 2021, 7:08 PM

#

bronze skiff that's why math == best lang

True though.

bronze skiff Dec 22, 2021, 7:08 PM

#

rustlang? no. mathlang

boreal summit Dec 22, 2021, 7:08 PM

#

bronze skiff rustlang? no. mathlang

I'm also learning Rust prog Lang by the side.

#

I used to learn c# before university, so I wanted to get back that feeling.

#

Python is plain, and doesn't really feel like programming IMO.

desert oar Dec 22, 2021, 7:11 PM

#

arctic crown in ml what is support vector machines? and how does it work?

there are 2 primary ways to interpret an SVM (assuming classification):

1) the modern way: a linear model with a specific loss function called "hinge loss"
2) the traditional way: an algorithm that finds a "separating hyperplane" that divides two classes, such that the hyperplane has the greatest possible "margin" between it and the data

it also turns out that the SVM model is amenable to something called the "kernel trick", which lets you embed your data in an arbitrarily complicated space without having to actually transform the data, as long as you can compute inner products in that space in terms of the original data values.

this "kernel SVM" technique is part of why SVM was so popular before gradient boosting and neural networks rolled around. it allowed you to develop a fairly complicated "feature space" that possibly encoded high-order relationships between data points, in which the classes were much easier to separate. this is not entirely unlike what gradient boosting and neural networks do.
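The "linear model with hinge loss" reading can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the toy data, learning rate, regularization strength, and function names are all made up, and it uses simple subgradient descent rather than the solvers a real SVM library would use. It also leaves out the kernel trick entirely.

```python
def hinge_loss(w, b, points, labels):
    """Average hinge loss max(0, 1 - y * (w.x + b)); labels must be -1 or +1."""
    total = 0.0
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        total += max(0.0, 1.0 - y * score)
    return total / len(points)


def train_linear_svm(points, labels, lr=0.1, reg=0.01, epochs=200):
    """Fit w and b by subgradient descent on hinge loss plus an L2 penalty."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1.0:
                # Inside the margin or misclassified: move the hyperplane.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Correct with enough margin: only the L2 penalty shrinks w.
                w = [wi - lr * reg * wi for wi in w]
    return w, b


# Made-up, linearly separable 2-D data.
points = [(2.0, 2.0), (3.0, 3.0), (2.5, 3.5), (-2.0, -2.0), (-3.0, -1.0), (-1.5, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
predictions = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1 for x in points]
print(w, b, predictions, hinge_loss(w, b, points, labels))
```

The learned `w` and `b` define the separating hyperplane `w.x + b = 0`; the L2 penalty on `w` is what pushes toward a wide margin.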

#

but i highly recommend reading a book instead of asking people online 🙂

desert oar Dec 22, 2021, 7:12 PM

#

boreal summit Mehn, it's so frustrating when much of the code in the book you're studying ain'...

what book? not all books are good

boreal summit Dec 22, 2021, 7:13 PM

#

Applied Data science using Pyspark by Ramcharan Karla, Sundar Krishnan

#

One of the reasons I try to read recent books is so I'm sure I'm not reading old stuff. This book is copyrighted 2021, so I feel everything should work fine. But there are some hiccups.

bronze skiff Dec 22, 2021, 7:15 PM

#

do your pyspark versions and jvm match with the book's?

#

they are using spark 3, which is quite stable

boreal summit Dec 22, 2021, 7:16 PM

#

I can't confirm that ATM, but the latest Pyspark version is 3.0. if the book was released this year, then I think it should be compatible with the latest release.

bronze skiff Dec 22, 2021, 7:16 PM

#

i find it hard to believe you're experiencing hiccups unless you are using a completely off version

#

spark 2 to spark 3 has quite a lot of changes

#

especially with pyspark

desert oar Dec 22, 2021, 7:17 PM

#

that, and it might be useful to post the actual errors you are getting

bronze skiff Dec 22, 2021, 7:17 PM

#

which basically sucked until this year, haha

boreal summit Dec 22, 2021, 7:18 PM

#

I already sent the author a LinkedIn connect request. I'll Inbox him when he accepts, but I'll make sure I confirm the Pyspark version used in the book tomorrow so I can be certain that's not what's causing the issue.

bronze skiff Dec 22, 2021, 7:20 PM

#

also, what errors are you getting?

arctic crown Dec 22, 2021, 7:34 PM

#

desert oar there are 2 primary ways to interpret an SVM (assuming classification): 1) the ...

whats a hyperplane

bronze skiff Dec 22, 2021, 7:37 PM

#

arctic crown whats a hyperplane

a codimension 1 subspace of an affine space defined by a linear equation

desert oar Dec 22, 2021, 7:39 PM

#

arctic crown whats a hyperplane

the generalization of a plane to more than 2 dimensions

#

like how a plane is the generalization of a line

#

a hyperplane is the generalization of a plane, beyond what we as humans can visualize. but a lot of the properties are the same, and yes it helps if you know the linear algebra to avoid being stuck too much on low-dimensional intuition
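A small sketch of why the same idea works in any number of dimensions: a hyperplane is the set of points where `w.x + b = 0`, and the sign of `w.x + b` tells you which side a point is on. The function name and the example numbers are made up for illustration.

```python
def side_of_hyperplane(w, b, x):
    """Return +1, -1 or 0 depending on where x lies relative to w.x + b = 0."""
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    return (value > 0) - (value < 0)


# In 2-D the "hyperplane" is a line: x + y - 1 = 0.
line_side = side_of_hyperplane([1, 1], -1, [2, 2])

# In 3-D it is an ordinary plane: z = 0.
plane_side = side_of_hyperplane([0, 0, 1], 0, [5, 5, -2])

# In 5-D nothing changes except the length of the lists.
on_hyperplane = side_of_hyperplane([1, 1, 1, 1, 1], -5, [1, 1, 1, 1, 1])

print(line_side, plane_side, on_hyperplane)
```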

stone marlin Dec 22, 2021, 7:40 PM

#

boreal summit I already sent the author a LinkedIn connect request. I'll Inbox him when he acc...

I've been through the first part of this book and had no issue. Check to make sure the PySpark + etc. versions you're using match the ones he mentions at the beginning of the book, and when copy-pasting make sure to copy the entire thing and not only the snippets. We have no idea what errors you're running into, so beyond this advice I can't say much.

arctic crown Dec 22, 2021, 7:40 PM

#

desert oar the generalization of a plane to more than 2 dimensions

super sorry but what is generalization

desert oar Dec 22, 2021, 7:40 PM

#

arctic crown super sorry but what is generalization

a "more general" version of something

#

a plane is specifically a 2-dimensional object in a 3-dimensional space. a generalization would be an n-dimensional object in an (n+1)-dimensional space

#

it's no longer specifically "2" and "3", ergo it is "more general" and therefore a "generalization"

#

this is a common practice in math: finding generalizations of things

boreal summit Dec 22, 2021, 7:51 PM

#

bronze skiff also, what errors are you getting?

I'll cross check the versions tomorrow and report back here. Thanks everyone. @stone marlin

hybrid mica Dec 22, 2021, 8:07 PM

#

I built my first artificial neural network in python today, and I got an accuracy rate of 0.86. However, when I used the same dataset and calculated the accuracy rate with kernel svm and random forest (without any deep learning), they were both higher than 0.86. Is this normal?

bronze skiff Dec 22, 2021, 8:10 PM

#

yeah, why not?

#

neural nets aren't the best thing for every single dataset you see, especially tabular datasets

serene scaffold Dec 22, 2021, 9:17 PM

#

hybrid mica I built my first artificial neural network in python today, and I got an accurac...

what do you mean by "accuracy"? (tp + tn) / (tp + tn + fn + fp)?

it's possible that a different NN architecture would have outperformed the one that you created, but functor (named "guido van pasta" currently) is right: NNs aren't the be-all-end-all of AI.
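For reference, that accuracy formula is easy to compute by hand, and it is worth comparing against a "always predict the majority class" baseline before reading much into a number like 0.86. The labels below are made up purely for illustration.

```python
def accuracy(y_true, y_pred):
    """(tp + tn) / (tp + tn + fn + fp) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return (tp + tn) / (tp + tn + fn + fp)


# Made-up labels: 8 positives, 2 negatives.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_model = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]
y_always_positive = [1] * len(y_true)

model_accuracy = accuracy(y_true, y_model)
baseline_accuracy = accuracy(y_true, y_always_positive)
print(model_accuracy, baseline_accuracy)
```

In this toy case the model and the trivial baseline score the same, which is why accuracy alone can be misleading on imbalanced data.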

delicate sphinx Dec 22, 2021, 9:35 PM

#

my model in tensorflow keeps outputting "yes" which is the most common single-word answer of my dataset

#

any ideas on how to fix?

#

I have image features and LSTMs combined into one model that then creates a network of dense layers, has a dropout of 0.5, activations tanh, and denses of 16,128 for each of those dropout,activation pairs.

The layer then has a dense layer of 256 and outputs via a dense layer of shape 16,23000. Finally, it is activated by a softmax

#

the dense layer of 256 uses a kernel_regularizer l1_l2 (elastic net)

serene scaffold Dec 22, 2021, 9:45 PM

#

delicate sphinx my model in tensorflow keeps outputting "yes" which is the most common single-wo...

what is this model intended to do?

delicate sphinx Dec 22, 2021, 9:46 PM

#

Image + Question --> Answer

#

the answer can be up to 16 separate words

serene scaffold Dec 22, 2021, 9:46 PM

#

and for what percentage of the training instances is the answer "yes"?

delicate sphinx Dec 22, 2021, 9:46 PM

#

question is up to 32 separate words but a TextVec and an Embedding layer before the LSTM layers so it's represented as a dense vec

#

A large amount, it would make the single most common answer, I don't have exact counts of how many answers are yes but I could probably find it fairly quick
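Getting the exact counts is usually a one-liner with `collections.Counter`. A minimal sketch, assuming the answers are available as a plain list of strings; the `answers` list here is a tiny stand-in, not the real dataset.

```python
from collections import Counter

# Stand-in data; in practice this would be the full list of training answers.
answers = ["yes", "no", "yes", "kettles", "yes", "tree", "no", "ladder", "1", "yes"]

counts = Counter(answers)
most_common_answer, occurrences = counts.most_common(1)[0]
share = occurrences / len(answers)

print(f"{most_common_answer!r} makes up {share:.0%} of the answers")
```

The share of the most common answer is also the accuracy a model gets by always predicting it, so it is a useful baseline to compare against.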

#

though I'm trying to run another test on my model so have about 10 minutes before that will finish the first epoch

serene scaffold Dec 22, 2021, 9:47 PM

#

so the possible answers are a mix of yes/no and qualitative questions?

delicate sphinx Dec 22, 2021, 9:47 PM

#

yes I have 23,000 potential answers

serene scaffold Dec 22, 2021, 9:47 PM

#

I feel like those two classes of questions should be handled separately.

delicate sphinx Dec 22, 2021, 9:48 PM

#

(the vocab size will include words like "what" which for this example we can say isn't a possible answer, but I've included that in my output size regardless)

#

yeah it would be good if I could but idk how to :/

#

I was hoping it would work as just any old classifier

serene scaffold Dec 22, 2021, 9:49 PM

#

where did you get the idea to do this?

delicate sphinx Dec 22, 2021, 9:49 PM

#

the first 10 of 248,000 questions

#

Visual Question Answering

serene scaffold Dec 22, 2021, 9:49 PM

#

delicate sphinx the first 10 of 248,000 questions

sorry, I don't read screenshots of text.

delicate sphinx Dec 22, 2021, 9:49 PM

#

No worries

serene scaffold Dec 22, 2021, 9:49 PM

#

if you provide it as text, we can continue.

delicate sphinx Dec 22, 2021, 9:50 PM

#

it's nothing of actual importance, just an example of the proportion of yes/no questions

#

Is the food napping on the table?
True answer:  no
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 1 , Answer:  yes                

What has been upcycled to make lights?
True answer:  kettles
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 2 , Answer:  yes                

Is this an Spanish town?
True answer:  no
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 3 , Answer:  yes                

Are there shadows on the sidewalk?
True answer:  yes
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 4 , Answer:  yes                

What is in the top right corner?
True answer:  tree
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 5 , Answer:  yes                

Is it cold outside?
True answer:  yes
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 6 , Answer:  yes                

What is leaning against the house?
True answer:  ladder
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 7 , Answer:  yes                

How many windows can you see?
True answer:  1
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 8 , Answer:  yes                

Is this in a park?
True answer:  yes
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 9 , Answer:  yes

serene scaffold Dec 22, 2021, 9:50 PM

#

interesting; what do the lists of ints mean?

delicate sphinx Dec 22, 2021, 9:50 PM

#

those are the very first 10 so obviously the density of yes/no questions won't be represented by this

#

the list of ints are the int versions of the output (softmax output, then np.argmax(output))

#

so [9, 0,0,0,...] means that the answer is "whatever word is represented by 9, and the rest is unknown" as 0 is an unknown token
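The decoding step described here can be sketched as follows. The three-word vocabulary and the probabilities are hypothetical (in the real model the vocabulary has about 23,000 entries and "yes" is index 9); only the shape of the computation is meant to match.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def decode(probabilities, index_to_word, unknown_index=0):
    """Turn per-position softmax outputs into token ids and a text answer."""
    token_ids = [argmax(step) for step in probabilities]
    words = [index_to_word[i] for i in token_ids if i != unknown_index]
    return token_ids, " ".join(words)


# Hypothetical tiny vocabulary; index 0 is the unknown token.
index_to_word = {0: "[UNK]", 1: "no", 2: "yes"}

# One row of probabilities per output position (3 positions instead of 16).
probabilities = [
    [0.10, 0.20, 0.70],
    [0.80, 0.10, 0.10],
    [0.90, 0.05, 0.05],
]

token_ids, answer = decode(probabilities, index_to_word)
print(token_ids, answer)
```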

serene scaffold Dec 22, 2021, 9:52 PM

#

Is the food napping on the table?
True answer:  no

this doesn't even make sense?

#

Are colorless green ideas sleeping furiously?
True answer: sometimes

????

delicate sphinx Dec 22, 2021, 9:52 PM

#

The questions are created to give the model "commonsense knowledge"

serene scaffold Dec 22, 2021, 9:53 PM

#

I see

#

and you said there were images. how are those images being represented?

delicate sphinx Dec 22, 2021, 9:53 PM

#

i.e. a more trained and adapted model would look at object pairings such as "food napping" and decide "what the heck, food can't sleep???"

#

the images are preprocessed via the InceptionV3 model

#

they are then flattened in my model and that is about all I do with them before passing them to the merged model (that is constructed of Dense layers, Activations and Dropouts)

#

Someone recommended regularization but that hasn't really changed the output from "yes"

#

(The images are loaded with np.load() to load them as numpy arrays which are then converted to tensors)

modest mulch Dec 22, 2021, 10:06 PM

#

Anyone knows how I might be able to make a mask out of shapes/pictures in an image which contains text? so like it's an image from a pdf, so it could contain texts or pictures or shapes. Am I able to make a mask out of everything that is NOT a text?

sleek tapir Dec 22, 2021, 10:53 PM

#

i'm learning ml rn

#

is KNN related to topology in any way

#

like distances and metric spaces

bronze skiff Dec 22, 2021, 10:54 PM

#

if your definition of topology is distances and metric spaces, sure

#

knn... uses distances

#

and distances make up metric spaces

sleek tapir Dec 22, 2021, 10:54 PM

#

well i learn topology at uni

#

how bout manifolds and differential geometry

#

Differential geometry finds applications throughout mathematics and the natural sciences. Most prominently the language of differential geometry was used by Albert Einstein in his theory of general relativity, and subsequently by physicists in the development of quantum field theory and the standard model of particle physics. Outside of physics, differential geometry finds applications in chemistry, economics, engineering, control theory, computer graphics and computer vision, and recently in machine learning.

#

from wiki

bronze skiff Dec 22, 2021, 10:56 PM

#

i mean, i'm confused by your question

#

are you asking if KNN is related to parts of mathematics?

#

it's just a supervised learning algorithm-- it's hard to say if it relates to very large swathes of a very large field

sleek tapir Dec 22, 2021, 10:58 PM

#

im deciding which subjects

#

to take

#

yea knn is related to other parts of mathematics

#

for next year uni

merry ridge Dec 22, 2021, 11:02 PM

#

It doesn't really use topology no. You could argue it uses point set topology, but by that same argument calculus is topology

sleek tapir Dec 22, 2021, 11:03 PM

#

well i dont classify calculus a part of topology

#

i classify topology and calc as part of real analysis

#

then there is complex analysis

#

im not even a pure major

#

im a stats major

#

together with cs

#

but ik a bit of topology etc.

merry ridge Dec 22, 2021, 11:04 PM

#

Saying topology is part of real analysis is one way to offend every topologist.

sleek tapir Dec 22, 2021, 11:05 PM

#

well my uni

#

does it as part of real analysis

#

my uni does a bit of topology in multivariable calc

#

(calc 2 or 3 in the us i think)

#

we learn calc 1 and probably a bit of calc 2 in High school

merry ridge Dec 22, 2021, 11:07 PM

#

That is probably point set topology, I don't see any reason to introduce a student to anything from topology in multivariable calculus.

sleek tapir Dec 22, 2021, 11:08 PM

#

yea its point set topology i think

#

it was hard when i did it

#

its only very basic obviously

#

open and closed sets, intersections, etc.

short wren Dec 22, 2021, 11:36 PM

#

Hey! I'm making a neural network to identify the presence of pneumonia in lung x-rays (1 or 0), so what activation functions and loss functions do you recommend?
I have relu for everything except output, which is softmax, and binary_crossentropy for loss
(tensorflow keras)
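One thing worth checking with a softmax output and `binary_crossentropy`: if the output layer has a single unit, softmax normalizes over that one value and always returns 1.0, so the network cannot express "no pneumonia". A single sigmoid unit is the usual choice there. The sketch below shows the arithmetic in plain Python; the logit values are made up, and whether this applies depends on how many output units the model actually has.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


single_logits = [-3.0, 0.0, 4.0]

# Softmax over one unit: the output is 1.0 whatever the logit is.
softmax_outputs = [softmax([z])[0] for z in single_logits]

# Sigmoid over one unit: the output actually varies with the logit.
sigmoid_outputs = [sigmoid(z) for z in single_logits]

# Softmax over two units is equivalent to a sigmoid of the logit difference.
two_unit_probability = softmax([0.0, 4.0])[1]

print(softmax_outputs, sigmoid_outputs, two_unit_probability)
```

So the consistent pairings are one sigmoid unit with `binary_crossentropy`, or two softmax units with a categorical cross-entropy loss.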

#

No matter what I try, my test accuracy is 75% or below, but training almost always ends at 95% accuracy

#

and my model has a dropout of 50%

alpine rain Dec 22, 2021, 11:40 PM

#

I have no idea which would be better, but I would suggest that you test multiple activation functions to see which provides the best result on one set of x-rays, and then test them on another set of x-rays to make sure the one that seems to be the best is really the best and it wasn't just overfitted

#

the difference in % between training and test data might be smaller if you have a larger training set

short wren Dec 22, 2021, 11:43 PM

#

thanks!

short wren Dec 22, 2021, 11:44 PM

#

alpine rain the difference in % between training and test data might be smaller if you have ...

5219 images for training and 619 for testing

alpine rain Dec 22, 2021, 11:44 PM

#

that sounds good

#

not sure what kind of difference you're looking for in the images though, maybe the difference between a pneumonia patient's image and a regular person's image is small enough to need more training data... but the ratio between the training and testing images is good

willow linden Dec 23, 2021, 12:03 AM

#

short wren and my model has a dropout of 50%

maybe try different values of dropout or other kinds of regularization

short wren Dec 23, 2021, 12:04 AM

#

willow linden maybe try different values of dropout or other kinds of regularization

i tried 35, 30, 40, 75, 85, 55, etc

willow linden Dec 23, 2021, 12:04 AM

#

data augmentation?

short wren Dec 23, 2021, 12:11 AM

#

willow linden data augmentation?

haven't tried it

#

i looked into it

#

is layers.RandomFlip("horizontal_and_vertical"), sufficient

willow linden Dec 23, 2021, 12:12 AM

#

I don't know, that's the magic of AI: spending thousands of hours on hyperparameter tuning

#

haha

#

but maybe you should do a bit more of augmentation

alpine rain Dec 23, 2021, 12:13 AM

#

RandomFlip sounds like something that will randomly flip your image either horizontally or vertically, judging by the parameters you added

#

I'm not sure if that's in any way beneficial to you

#

why would you want to train your AI to recognize pneumonia in a picture of lungs upside-down? it has no added value

willow linden Dec 23, 2021, 12:15 AM

#

true, maybe play more with contrast

#

and that kind of stuff

alpine rain Dec 23, 2021, 12:15 AM

#

willow linden true, maybe play more with contrast

yes, select the values that have the potential to enhance the picture, and tune those, forget the rest
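As a rough idea of what a contrast augmentation does, here is a plain-Python sketch that scales each pixel's distance from the image mean by a random factor. The function names, the factor range, and the tiny 2x2 "image" are assumptions for illustration; Keras also ships a `RandomContrast` preprocessing layer that could be used inside the model instead.

```python
import random


def adjust_contrast(image, factor):
    """Scale each pixel's distance from the image mean by `factor`,
    then clip to the [0, 1] range. `image` is a 2-D list of floats."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(1.0, max(0.0, (p - mean) * factor + mean)) for p in row]
        for row in image
    ]


def random_contrast(image, lower=0.8, upper=1.2, rng=random):
    """Apply a contrast factor drawn uniformly from [lower, upper]."""
    return adjust_contrast(image, rng.uniform(lower, upper))


# Tiny made-up grayscale image with values in [0, 1].
image = [[0.2, 0.4], [0.6, 0.8]]

stronger = adjust_contrast(image, 2.0)
augmented = random_contrast(image, rng=random.Random(0))
print(stronger, augmented)
```

Small factor ranges keep the augmented X-rays plausible; whether it helps the test accuracy is something only an experiment can tell.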

#

I think creating software to automatically detect an illness from an X-ray is a really cool idea. I honestly wish I could help you more, but I can only give you pointers

short wren Dec 23, 2021, 12:26 AM

#

k ty

#

is there anything that is obviously wrong in this?

#

```py
import tensorflow as tf


def get_model():  # name assumed; the original `def` line was cut off in the paste
    # IMG_WIDTH, IMG_HEIGHT and NUM_CATEGORIES are constants defined elsewhere
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(
            45, (5, 5), activation="relu", input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)
        ),
        tf.keras.layers.MaxPooling2D(pool_size=(3, 3)),
        tf.keras.layers.Conv2D(
            45, (2, 2), activation="relu"
        ),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(100, activation="relu"),
        tf.keras.layers.Dense(60, activation="relu"),
        tf.keras.layers.Dense(60, activation="relu"),
        tf.keras.layers.Dropout(0.35),
        tf.keras.layers.Dense(30, activation="relu"),
        tf.keras.layers.Dense(NUM_CATEGORIES, activation="softmax")
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )
    model.summary()
    return model
```

alpine rain Dec 23, 2021, 12:31 AM

#

do you get an obvious error message from it?

willow linden Dec 23, 2021, 12:34 AM

#

short wren ```py model = tf.keras.models.Sequential([ tf.keras.layers.Conv2D( ...

this probably has not much to do with your accuracy, but I believe x-rays have no color so, couldn't you convert your images to grayscale with only one channel?

charred umbra Dec 23, 2021, 1:19 AM

#

short wren Hey! I'm making a neural network to identify the presence of pneumonia in lung x...

Yeah there are many activation functions that were recently proposed such as Swish, Mish, and Phish that you could try as an alternative to ReLU

charred umbra Dec 23, 2021, 1:22 AM

#

alpine rain I have no idea which would be better, but I would suggest that you test multiple...

Yeah, Swish, Mish, and Phish have been shown to be better than ReLU in certain situations. I'm not entirely sure how good each of them is in relation to the others, but trying them out couldn't hurt either

alpine rain Dec 23, 2021, 1:23 AM

#

Swish, Mish, and Phish? I thought this was the beginning of a joke and you're telling me these are actually names of features in tensorflow? 😄

charred umbra Dec 23, 2021, 1:24 AM

#

alpine rain Swish, Mish, and Phish? I thought this was the beginning of a joke and you're te...

Swish is in tensorflow, but Mish and Phish were proposed recently. Mish came out last year I think, and Phish was proposed only a couple days ago!

#

I dont think Mish and Phish are in there, but you could probably use them still

#

Mish is defined as f(x) = x * tanh(softplus(x))

#

and Phish is f(x) = x * tanh(GELU(x))

#

Mish and Phish could probably be added if enough people start using them like Swish though
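Since both are just compositions of standard functions, they are easy to write by hand. A minimal sketch in plain Python using the definitions above; Swish is taken here as `x * sigmoid(x)` (the beta = 1 case), and GELU as the exact erf-based form, both of which are assumptions about which variant is meant.

```python
import math


def softplus(x):
    """log(1 + exp(x)), written so it stays stable for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def gelu(x):
    """Exact GELU: x times the standard normal CDF of x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def swish(x):
    """x * sigmoid(x); fine for moderate x, exp overflows for very negative x."""
    return x / (1.0 + math.exp(-x))


def mish(x):
    return x * math.tanh(softplus(x))


def phish(x):
    return x * math.tanh(gelu(x))


for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(value, swish(value), mish(value), phish(value))
```

In a framework, the same expressions written with tensor ops can be passed as a custom activation wherever a callable is accepted.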

charred umbra Dec 23, 2021, 1:26 AM

#

alpine rain Swish, Mish, and Phish? I thought this was the beginning of a joke and you're te...

Lol yeah the names are funny, but they seem to be pretty good activations actually

alpine rain Dec 23, 2021, 1:28 AM

#

"avengers, activate Phish"... doesn't quite have the same ring to it

charred umbra Dec 23, 2021, 1:31 AM

#

alpine rain "avengers, activate Phish"... doesn't quite have the same ring to it

I think Mish is named after its creator. Not sure what Phish is named for though.

alpine rain Dec 23, 2021, 1:31 AM

#

after fish, probably 🙂

#

it can be really funny though, how some programming things get their names... like you'd think phish comes from fish, but maybe it's some very elaborate thing involving 13 people over the course of 4 and a half years plus a dog and two chocolate cakes that ended up being this name...

charred umbra Dec 23, 2021, 1:33 AM

#

alpine rain after fish, probably 🙂

I looked it up and its named after the guy who proposed it

alpine rain Dec 23, 2021, 1:33 AM

#

the guy's name was Phish?

charred umbra Dec 23, 2021, 1:34 AM

#

alpine rain the guy's name was Phish?

His name is Philip, and the guy who named mish is Misra

#

I dont think so. It would be hilarious if it was though

alpine rain Dec 23, 2021, 1:34 AM

#

so it wasn't quite "named after" those people so much as those people chose the name for it based on their initials...

charred umbra Dec 23, 2021, 1:35 AM

#

alpine rain so it wasn't quite "named after" those people so much as those people chose the ...

yeah, that makes more sense

alpine rain Dec 23, 2021, 1:35 AM

#

also I imagine Philip was simping on Misra a little bit so he wanted to follow in that pattern Misra set with her name 🙂

charred umbra Dec 23, 2021, 1:35 AM

#

I kinda wanna test out Mish and Swish in actual neural nets and see how they do

charred umbra Dec 23, 2021, 1:36 AM

#

alpine rain also I imagine Philip was simping on Misra a little bit so he wanted to follow i...

he probably did it because of that or for pure comedic effect

alpine rain Dec 23, 2021, 1:36 AM

#

both are good

charred umbra Dec 23, 2021, 1:36 AM

#

alpine rain also I imagine Philip was simping on Misra a little bit so he wanted to follow i...

I like the name because my name is also Philip lol

alpine rain Dec 23, 2021, 1:37 AM

#

I like the whole Swish, Mish, Phish thing too, I think it's funny

#

also not very serious

#

so I don't take it seriously

#

but I probably should

#

but I'm using a programming language named after 5 british idiots so whatevs 🙂

charred umbra Dec 23, 2021, 1:38 AM

#

https://www.semanticscholar.org/paper/Swish%3A-a-Self-Gated-Activation-Function-Ramachandran-Zoph/4f57f486adea0bf95c252620a4e8af39232ef8bc

[PDF] Swish: a Self-Gated Activation Function | Semantic Scholar

The experiments show that Swish tends to work better than ReLU on deeper models across a number of challenging datasets, and its simplicity and its similarity to ReLU make it easy for practitioners to replace ReLUs with Swish units in any neural network. The choice of activation functions in deep networks has a significant effect on the training...

#

https://arxiv.org/abs/1908.08681

arXiv.org

Mish: A Self Regularized Non-Monotonic Activation Function

We propose $\textit{Mish}$, a novel self-regularized non-monotonic activation
function which can be mathematically defined as: $f(x)=x\tanh(softplus(x))$. As
activation functions play a crucial...

#

https://www.techrxiv.org/articles/preprint/Phish_A_Novel_Hyper-Optimizable_Activation_Function/17283824

figshare

Phish: A Novel Hyper-Optimizable Activation Function

Deep-learning models estimate values using backpropagation. The activation function within hidden layers is a critical component to minimizing loss in deep neural-networks. Rectified Linear (ReLU) has been the dominant activation function for the past decade. Swish and Mish are newer activation functions that have shown to yield better results t...

#

Swish, Mish, and Phish respectively

alpine rain Dec 23, 2021, 1:39 AM

#

if I'd make an activation function, I'd just call it `good` and then people would say good AF 😄

charred umbra Dec 23, 2021, 1:39 AM

#

alpine rain if I'd make an activation function, I'd just call it `good` and then people woul...

Either that or "Hish"

alpine rain Dec 23, 2021, 1:39 AM

#

Heruk is not my real name

charred umbra Dec 23, 2021, 1:39 AM

#

ah ok

alpine rain Dec 23, 2021, 1:40 AM

#

maybe I'd call it Wish 😄

#

you WISH this was a good activation function but it isn't 😄

charred umbra Dec 23, 2021, 1:41 AM

#

Swish and Phish seem to be on par, but Mish is seemingly better than both of those

alpine rain Dec 23, 2021, 1:42 AM

#

I have no idea

stone marlin Dec 23, 2021, 1:49 AM

#

I missed this, but: topology? Part of real analysis? Color me offended. 🦂

alpine rain Dec 23, 2021, 1:54 AM

#

offended by what?

willow linden Dec 23, 2021, 1:59 AM

#

Hey guys, anyone knows a good source to learn about "probability density function"? or maybe just probability in general

stone marlin Dec 23, 2021, 2:04 AM

#

I should'a referred back to this message, but I'm just kiddin'. https://discordapp.com/channels/267624335836053506/366673247892275221/923350429004357744

raven herald Dec 23, 2021, 2:05 AM

#

willow linden Hey guys, anyone knows a good source to learn about "probability density functio...

https://www.probabilitycourse.com/chapter4/4_1_1_pdf.php

Probability Density Function | PDF | Distributions

Definitions and examples of the Probability Density Function

analog hull Dec 23, 2021, 2:06 AM

#

charred umbra Swish and Phish seem to be on par, but Mish is seemingly better than both of tho...

Could be open to testing in more applications outside of classification maybe

charred umbra Dec 23, 2021, 2:06 AM

#

analog hull Could be open to testing in more applications outside of classification maybe

I think it would be interesting to see how it affects generators in GANs

#

or maybe in autoencoders

analog hull Dec 23, 2021, 2:07 AM

#

Do you think it could be implemented into the discriminator as well

charred umbra Dec 23, 2021, 2:07 AM

#

I would think that having it in the discriminator would make the generator more accurate

#

since the idea of a GAN is for the discriminator to have high loss and an accuracy of 1/2 everywhere

#

so a function that minimizes loss would make the generator more accurate in making the fake images

#

but im not sure

analog hull Dec 23, 2021, 2:08 AM

#

Anyways it's really cool we have a new activation function

willow linden Dec 23, 2021, 2:22 AM

#

raven herald https://www.probabilitycourse.com/chapter4/4_1_1_pdf.php

ty I'll look into that

daring tiger Dec 23, 2021, 2:28 AM

#

charred umbra Swish and Phish seem to be on par, but Mish is seemingly better than both of tho...

dude phish is so good

daring tiger Dec 23, 2021, 2:31 AM

#

willow linden ty I'll look into that

Vouch its really good

#

Phish is quite revolutionary

#

Surely the mods here should make an announcement about it, it's very useful and applicable to the real world

#

At the very least it should be added to tensorflow

daring tiger Dec 23, 2021, 2:39 AM

#

alpine rain but I'm using a programming language named after 5 british idiots so whatevs 🙂

Lmao 😂

bronze skiff Dec 23, 2021, 3:08 AM

#

willow linden Hey guys, anyone knows a good source to learn about "probability density functio...

get a copy of ross's text "introduction to probability models" and read diligently

#

any lesser source is just copium

stone marlin Dec 23, 2021, 4:06 AM

#

I've been through Billingsley Prob + Measure like, ten years ago, so it might be nice to refresh --- is Ross' mostly theoretical, or is it more of an applied text?

desert oar Dec 23, 2021, 4:23 AM

#

bronze skiff get a copy of ross's text "introduction to probability models" and read diligent...

i found a copy of this online, it seems kind of like a "handbook" of various basic probability models, rather than a useful resource for learning probability

#

i guess the first chapter is a good rundown of probability theory, but probably not something you can effectively self-study from if you've never seen the material before

bronze skiff Dec 23, 2021, 4:29 AM

#

it's a canonical text for undergraduate probability, so it's definitely something you can self-study from

desert oar Dec 23, 2021, 4:30 AM

#

i missed the exercises at the end of the chapter

#

yeah these are pretty good

#

i stand corrected

tight dove Dec 23, 2021, 4:35 AM

#

I have a dataframe in pandas

#

I'm trying to "collapse" multiple rows into one since they have pretty much the same values

#

How do I go about this?

#

PS : I had used groupby initially to transform the data

arctic crown Dec 23, 2021, 4:37 AM

#

anyone have a dataset for a chatbot?
like the intents.json

odd meteor Dec 23, 2021, 6:38 AM

#

charred umbra Swish is in tensorflow, but Mish and Phish were proposed recently. Mish came out...

I just checked. It seems Mish has been added to TensorFlow but not Phish

austere swift Dec 23, 2021, 6:46 AM

#

well if it was only proposed a couple days ago i wouldn't expect it to be

lapis sequoia Dec 23, 2021, 7:43 AM

#

can anybody suggest me a hands-on course on reinforcement learning

lapis sequoia Dec 23, 2021, 8:00 AM

#

Yo Guys

#

Where can i Learn AI

#

using python as a programming language specifically

#

i'm willing to enroll for a paid course just need good recommendation

minor mica Dec 23, 2021, 8:09 AM

#

👋 Hi, I'm a full-stack web developer with a little bit of experience in Python, but most of my programming experience is in Javascript. I've always had an interest in machine learning and NLP, so I decided I want to make a chat bot with Python, create an API for it, and turn it into a full-stack project for my portfolio. To be more specific, I wanna make something that is like CleverBot in the sense that its goal is just to have conversations with humans that are as natural as possible. However, rather than taking the pure ML-based model that CleverBot uses, I wanted to try something like a rule-based approach that uses AI to augment the quality of its responses over time. So basically, I guess I'm just looking for any tips/advice? Tools that might help me? Dunno, just kinda playing with the idea at this point haha. 😅

bold timber Dec 23, 2021, 8:31 AM

#

Does k-prototypes need feature scaling?

green zinc Dec 23, 2021, 9:00 AM

#

lapis sequoia Where can i Learn AI

there is this free course which is highly recommended https://es.coursera.org/learn/machine-learning, but does not use python ( you can ignore the programming parts) and it will help you learn the base of ml / ai. then you can look for specific tutorials on how to develop a ml/ai solution in python

iron basalt Dec 23, 2021, 9:08 AM

#

lapis sequoia i'm willing to enroll for a paid course just need good recommendation

https://www.youtube.com/watch?v=TjZBTDzGeGg&list=PLUl4u3cNGP63gFHB6xb-kVBiQHYe_4hSi

YouTube

MIT OpenCourseWare

1. Introduction and Scope

MIT 6.034 Artificial Intelligence, Fall 2010
View the complete course: http://ocw.mit.edu/6-034F10
Instructor: Patrick Winston

In this lecture, Prof. Winston introduces artificial intelligence and provides a brief history of the field. The last ten minutes are devoted to information about the course at MIT.

License: Creative Commons BY-NC-SA
...

odd meteor Dec 23, 2021, 9:08 AM

#

lapis sequoia Where can i Learn AI

Paid

Graduate Studies in University
Udacity.com
Coursera.com
Udemy.com
DataQuest.io
DataCamp.com

BootCamp

FourthBrain.ai

Free Resources

University of YouTube
FreeCodeCamp.org
Neuromatch.io
🏓 Check Pinned Post For More

iron basalt Dec 23, 2021, 9:08 AM

#

There are several open courses now.

boreal summit Dec 23, 2021, 9:19 AM

#

Hello everyone, the Pyspark version used in the book is 3.0.1, while the version on my laptop is 3.1.2

boreal summit Dec 23, 2021, 10:02 AM

#

@bronze skiff @stone marlin

arctic wedgeBOT Dec 23, 2021, 10:50 AM

#

:incoming_envelope: :ok_hand: applied mute to @wintry quarry until <t:1640257217:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

serene scaffold Dec 23, 2021, 1:33 PM

#

boreal summit @bronze skiff @stone marlin

Do you have standing permission to ping those people?

serene scaffold Dec 23, 2021, 1:34 PM

#

tight dove I'm trying to "collapse" multiple rows to one since they pretty much the same va...

Sounds like you're looking for drop_duplicates?

tight dove Dec 23, 2021, 2:07 PM

#

serene scaffold Sounds like you're looking for drop_duplicates?

Yes

#

What I did was to do a groupby, then agg with the column names

serene scaffold Dec 23, 2021, 2:10 PM

#

tight dove What I did was to do a groupby, then agg with the column names

I don't know enough about what you're trying to do to understand this context.

#

If you still have a question, I can try to help, though keep in mind that I do not look at screenshots of text.

tight dove Dec 23, 2021, 2:13 PM

#

i already dropped duplicates -

tight dove Dec 23, 2021, 2:13 PM

#

serene scaffold I don't know enough about what you're trying to do to understand this context.

processed_customers = processed_customer \
      .groupby('customer_id',as_index=False) \
      .agg({
          'customer_name':'first',
          'total_invoice_count': 'first',
          'total_invoiced_amount': 'first',
          'unpaid_amount': 'first',
          'unpaid_count': 'first',
          'first_invoice_date': 'first',
          'first_invoice_amount': 'first',
          'last_payment_date': 'first',
          'last_payment_amount': 'first',
          'customer_segment': 'first'
      })

#

That's what I did

serene scaffold Dec 23, 2021, 2:14 PM

#

tight dove ```py processed_customers = processed_customer \ .groupby('customer_id',as...

When you're working with tabular data, showing what you did with the data isn't enough; you have to show what the data itself is.

Try print(processed_customer.head(30).to_dict('list'))

tight dove Dec 23, 2021, 2:15 PM

#

Okay. Usually i use display() but I will give this a try as well

serene scaffold Dec 23, 2021, 2:15 PM

#

tight dove Okay. Usually i use display() but I will give this a try as well

That displays it in a format that can't be copied and pasted effectively.

tight dove Dec 23, 2021, 2:19 PM

#

serene scaffold When you're working with tabular data, showing what you did with the data isn't ...

I just did. Let me paste what I got on pastebin

#

https://pastebin.com/KKAiPhxi

Pastebin

{'customer_id': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4,...

serene scaffold Dec 23, 2021, 2:24 PM

#

tight dove I just did. Let me paste what I got on pastebin

Thanks! So what is your question?

tight dove Dec 23, 2021, 2:24 PM

#

serene scaffold Thanks! So what is your question?

I wanted to drop duplicates, as of the time I asked

serene scaffold Dec 23, 2021, 2:25 PM

#

tight dove I wanted to drop duplicates as at the time I asked

you just have to do .drop_duplicates() on the DataFrame that has duplicates.

tight dove Dec 23, 2021, 2:25 PM

#

Okay. Thanks!

serene scaffold Dec 23, 2021, 2:26 PM

#

In [4]: df.groupby('customer_id',as_index=False) \
   ...:       .agg({
   ...:           'customer_name':'first',
   ...:           'total_invoice_count': 'first',
   ...:           'total_invoiced_amount': 'first',
   ...:           'unpaid_amount': 'first',
   ...:           'unpaid_count': 'first',
   ...:           'first_invoice_date': 'first',
   ...:           'first_invoice_amount': 'first',
   ...:           'last_payment_date': 'first',
   ...:           'last_payment_amount': 'first',
   ...:           'customer_segment': 'first'
   ...:       }).drop_duplicates()
Out[4]:
   customer_id customer_name  total_invoice_count  ...  last_payment_date  last_payment_amount  customer_segment
0            1     Microsoft                    6  ...         2021-06-01                 1000               Low
1            2         Apple                    4  ...         2021-08-15                 3000              High
2            3        Google                    4  ...         2021-04-01                 1000               Low
3            4       Netflix                    4  ...         2021-07-31                 2500              High
4            5          Meta                    2  ...         2021-07-15                  500               Low

[5 rows x 11 columns]
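
To make the exchange above concrete, here is a minimal sketch on a tiny made-up frame (the values are hypothetical, not the pastebin data). It assumes the repeated rows are exact copies of each other; in that case `drop_duplicates()` alone gives the same result as grouping by `customer_id` and taking the first value of each column.

```python
import pandas as pd

# Hypothetical sample: each customer's row is repeated verbatim.
df = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "customer_name": ["Microsoft", "Microsoft", "Apple", "Apple", "Apple"],
    "customer_segment": ["Low", "Low", "High", "High", "High"],
})

# Option 1: drop rows that are exact copies of an earlier row.
deduped = df.drop_duplicates().reset_index(drop=True)

# Option 2: keep the first value of every column per customer_id.
first_per_customer = df.groupby("customer_id", as_index=False).first()

print(deduped)
print(first_per_customer)
```

If the rows differ in some column (say, one row per invoice), the two options stop being equivalent: `drop_duplicates()` keeps every distinct row, while the groupby still returns one row per customer.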

charred umbra Dec 23, 2021, 2:48 PM

#

odd meteor I just checked. It seems Mish has been added to TensorFlow but not Phish

It could probably be added if people started using it a lot

#

Actually, it seems that you can just run Phish, by creating it like this:

#

"""Tensorflow-Keras Implementation of phish"""

## Import Necessary Modules
import tensorflow as tf
from tensorflow.keras.layers import Activation
from tensorflow.keras.utils import get_custom_objects


class Phish(Activation):

    def __init__(self, activation, **kwargs):
        super(Phish, self).__init__(activation, **kwargs)
        self.__name__ = "phish"


def phish(x):
    return x*tf.math.tanh(tf.nn.gelu(x))


get_custom_objects().update({"phish": Phish(phish)})

#

and calling "phish" as a string literal in the dense layers
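
For sanity-checking values without TensorFlow, the same formula can be sketched in plain Python. This assumes the exact (erf-based) GELU, which is what `tf.nn.gelu` computes by default; `gelu` and `phish` below are illustrative helpers, not library functions.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def phish(x: float) -> float:
    """Phish as written in the snippet above: x * tanh(gelu(x))."""
    return x * math.tanh(gelu(x))


for x in (-2.0, 0.0, 2.0):
    print(x, phish(x))
```

For large positive inputs the output approaches `x`, and for negative inputs it stays small, which is the same general shape as Swish and Mish.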

charred umbra Dec 23, 2021, 2:55 PM

#

austere swift well if it was only proposed a couple days ago i wouldn't expect it to be

I think this is how people would use it until it's added officially

polar narwhal Dec 23, 2021, 3:54 PM

#

Is there anyone out there who knows opencv and can help?

bronze skiff Dec 23, 2021, 4:04 PM

#

boreal summit @bronze skiff @stone marlin

please don't ever ping me at 5 in the morning

#

and yes, so what is your error with that spark version

analog pike Dec 23, 2021, 4:20 PM

#

I don't know if i'd ask this here or no, i'm trying to work through the RL guides on tensorflow's website, but every time I try and import everything related to tf-agents I keep getting "AttributeError: module 'tf_agents.trajectories.trajectory' has no attribute 'Transition'"

#

I made sure my tf agents and tensorflow versions were correct

bronze skiff Dec 23, 2021, 4:23 PM

#

are you using an ide and forgot to link your venv interpreter to it?

analog pike Dec 23, 2021, 4:23 PM

#

no i don't believe so, I'm using the anaconda prompt to install the packages for jupyter notebook

#

everything else I installed works just fine

lilac crest Dec 23, 2021, 4:25 PM

#

i am not sure if i should ask here but could someone help me with matplotlib stuff?

analog pike Dec 23, 2021, 4:25 PM

#

the weird thing is if I just do import tf_agents it works fine, but when I get to trying to import specific things from it is when it decides to give an error

boreal summit Dec 23, 2021, 4:28 PM

#

serene scaffold Do you have standing permission to ping those people?

Wasn't aware that I needed permission to ping people.

stone marlin Dec 23, 2021, 4:28 PM

#

I turn my pings off for exactly this reason.

boreal summit Dec 23, 2021, 4:28 PM

#

bronze skiff please don't ever ping me at 5 in the morning

I'm sorry, I'm on GMT+1. It's currently 17:28 in my country.

hazy escarp Dec 23, 2021, 5:59 PM

#

Hey, I need some help with NEAT, anybody who knows it? Maybe DMs or smth idk

serene scaffold Dec 23, 2021, 6:36 PM

#

boreal summit Wasn't aware that I needed permission to ping people.

it's bad etiquette to ping people you aren't talking to. If you have a question, it should be posed generally, not to specific people who haven't volunteered to answer it.

boreal summit Dec 23, 2021, 6:42 PM

#

serene scaffold it's bad etiquette to ping people you aren't talking to. If you have a question,...

Noted.

barren seal Dec 23, 2021, 7:11 PM

#

hello

hazy escarp Dec 23, 2021, 7:11 PM

#

hi

barren seal Dec 23, 2021, 7:12 PM

#

I want to implement Fast Fourier Transformation.

#

import scipy as sp
import matplotlib.pyplot as plt
listA = sp.ones(500)
listA[100:300] = -1
f = sp.fft(listA)
plt.plot(f)

#

but it's giving me "AttributeError: module 'scipy' has no attribute 'fft'"

#

anyone can help me with it?

#

if I use `import scipy.fft as fft` and call fft(listA)

#

it says "TypeError: 'module' object is not callable"

charred umbra Dec 23, 2021, 7:19 PM

#

barren seal import scipy as sp import matplotlib.pyplot as plt listA = sp.ones(500) listA[10...

You could try the one in numpy as an alternative

#

if you can't get the scipy one to work

barren seal Dec 23, 2021, 7:20 PM

#

does numpy have fft ?

desert bear Dec 23, 2021, 7:21 PM

#

Hey, does anyone know any github repo or module that can generate a 2d map like one below? I'm looking for a playground on which I can test some reinforcement learning algorithms

charred umbra Dec 23, 2021, 7:31 PM

#

barren seal does numpy have fft ?

yeah it does

#

I used it once
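
A minimal sketch of that alternative, assuming NumPy is installed. Two things go wrong in the original snippet: the NumPy aliases in the top-level `scipy` namespace (such as `scipy.ones`) are deprecated, and `scipy.fft` is a module, so `import scipy.fft as fft` followed by `fft(...)` calls the module itself. The function lives at `scipy.fft.fft`; the NumPy equivalents are used here.

```python
import numpy as np

signal = np.ones(500)
signal[100:300] = -1           # same square pulse as in the snippet above

spectrum = np.fft.fft(signal)  # complex array, same length as the input
magnitude = np.abs(spectrum)   # real-valued magnitudes, suitable for plotting

print(spectrum.shape, magnitude[0])
```

Passing `magnitude` to `plt.plot` avoids the warning matplotlib raises when asked to plot complex values directly.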

barren seal Dec 23, 2021, 7:49 PM

#

charred umbra I used it once

Thanks man!

round field Dec 23, 2021, 8:13 PM

#

how do i start learning the python coding?

hazy escarp Dec 23, 2021, 8:18 PM

#

just find some tutorials on ytb ig

delicate sphinx Dec 23, 2021, 10:17 PM

#

round field how do i start learning the python coding?

As this is your first message in this entire discord, I assume you mean in general? If so, maybe check out something like HackerRank (if they're still around), or Codecademy etc. Or give yourself little projects to do, you can work up to things like PyTorch or Tensorflow for the Data Science

delicate sphinx Dec 23, 2021, 10:50 PM

#

how can i shuffle different datasets by x amount? i.e. I need order to be maintained, so my (input1, input2, output) need to be such that when I fetch them after being shuffled, I get (input1_list[x], input2_list[x], output_list[x])

#

tensorflow
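
Independent of the framework, the usual way to keep several lists aligned is to shuffle one shared index order and apply it to all of them. A minimal sketch with made-up data (the list names follow the question; the contents are hypothetical):

```python
import random

input1_list = ["img0", "img1", "img2", "img3"]  # hypothetical data
input2_list = ["q0", "q1", "q2", "q3"]
output_list = ["a0", "a1", "a2", "a3"]

rng = random.Random(0)
order = list(range(len(output_list)))
rng.shuffle(order)  # one permutation shared by all three lists

# Every tuple still holds the x-th element of each list.
shuffled = [(input1_list[i], input2_list[i], output_list[i]) for i in order]
print(shuffled)
```

Zipping the datasets before shuffling achieves the same thing in a `tf.data` pipeline, because each zipped element is moved as one unit.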

vivid plank Dec 23, 2021, 11:13 PM

#

desert bear Hey, does anyone know any github repo or module that can generate a 2d map like ...

Not exactly similar but here are some maze environments:
https://github.com/MattChanTK/gym-maze
https://github.com/maximecb/gym-minigrid

limber hemlock Dec 23, 2021, 11:16 PM

#

iron basalt https://www.youtube.com/watch?v=TjZBTDzGeGg&list=PLUl4u3cNGP63gFHB6xb-kVBiQHYe_4...

Thks man is this the famous MIT course everybody did ?

arctic wedgeBOT Dec 23, 2021, 11:21 PM

#

:incoming_envelope: :ok_hand: applied mute to @lapis sequoia until <t:1640302301:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).

sweet owl Dec 23, 2021, 11:25 PM

#

so basically, for whoever wants to help: I'm trying to create a "touchscreen" by using a side-facing camera. For now I want to see if the hand is touching the display, and idk where to start since I know cv2 but I'm extremely weak with ai stuff

#

This is quite similar to virtual mice however I would use a side facing camera to factor if the fingers are touching the screen or not

vivid plank Dec 23, 2021, 11:26 PM

#

https://facebookresearch.github.io/PyTouch/ ? @sweet owl

PyTouch | PyTouch

A Machine Learning Library for Touch Processing

sweet owl Dec 23, 2021, 11:26 PM

#

Ur a god

#

Thanks

vivid plank Dec 23, 2021, 11:28 PM

#

Paper might be an interesting read though https://arxiv.org/abs/2105.12791

arXiv.org

PyTouch: A Machine Learning Library for Touch Processing

With the increased availability of rich tactile sensors, there is an equally
proportional need for open-source and integrated software capable of
efficiently and effectively processing raw touch...

delicate sphinx Dec 23, 2021, 11:29 PM

#

delicate sphinx how can i shuffle different datasets by x amount? i.e. I need order to be mainta...

Using Dataset.zip(x,y,z).batch(i).shuffle(i) but it's taking like 3.5 minutes to shuffle 255 data points, any ideas on what can be quicker? I have 248,000 data points I need shuffled overall

sweet owl Dec 23, 2021, 11:33 PM

#

@vivid plank its kinda weird tho and a process like why do I need to make a PR

grave frost Dec 24, 2021, 12:33 AM

#

delicate sphinx how can i shuffle different datasets by x amount? i.e. I need order to be mainta...

shuffle should maintain order

#

what's `i`? put the whole snippet

delicate sphinx Dec 24, 2021, 12:34 AM

#

grave frost `shuffle` should maintain order

The issue is that to shuffle my zip of 3 different datasets gives me a long waiting time before any shuffling has happened

#

`i` would be the batch size

grave frost Dec 24, 2021, 12:35 AM

#

delicate sphinx The issue is that to shuffle my zip of 3 different datasets gives me a long wait...

doesn't matter - its a onetime cost only

delicate sphinx Dec 24, 2021, 12:35 AM

#

grave frost doesn't matter - its a onetime cost only

unfortunately not

#

to cache all of the images, questions and answers I'd need about 64GB of RAM

grave frost Dec 24, 2021, 12:35 AM

#

reduce the buffer size then

delicate sphinx Dec 24, 2021, 12:36 AM

#

Would that not drastically lower the point of the shuffle if I could only shuffle it < 255

#

as 1) that's a relatively small value (as a perfect shuffle according to tensorflow would be 248,001) and 2) less than batch size from what I understand means some values will be kept where they are

grave frost Dec 24, 2021, 12:37 AM

#

if your batch size is 16, a buffer size of 32 would do?
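
To see what the buffer size trades off, here is a pure-Python sketch of buffer-based shuffling in the style the `tf.data` documentation describes: fill a buffer, emit a random element from it, and refill from the stream. This is an illustration under that assumption, not TensorFlow's implementation, and `buffered_shuffle` is a made-up name. It assumes `buffer_size >= 1`.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)       # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]             # emit a random buffered element
        buffer[idx] = item            # and replace it with the new one
    rng.shuffle(buffer)               # drain whatever is left at the end
    yield from buffer


out = list(buffered_shuffle(range(100), buffer_size=8))
print(out[:10])
```

Memory use and start-up time grow with the buffer, while mixing quality shrinks with it: an element can only be emitted at most `buffer_size - 1` positions earlier than where it sat in the input. Also note that in a pipeline like the `Dataset.zip(...).batch(i).shuffle(i)` one mentioned earlier, the buffer holds `i` whole batches rather than `i` elements, which may explain the long fill time; shuffling before batching is the more usual order.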

delicate sphinx Dec 24, 2021, 12:37 AM

#

It likely would be, I was thinking bigger = better in terms of batches though

#

I'm not sure if it actually would be

#

so was trying to keep it fairly high

grave frost Dec 24, 2021, 12:38 AM

#

it probably won't be

delicate sphinx Dec 24, 2021, 12:38 AM

#

My model is a mess lol

grave frost Dec 24, 2021, 12:38 AM

#

try using prefetch too

delicate sphinx Dec 24, 2021, 12:38 AM

#

Even using precision and recall metrics for my model all it outputs is "yes"

#

yeah I prefetch my images, should prefetch my one hot answers too

grave frost Dec 24, 2021, 12:39 AM

#

all models are like that in the start 🙂 DL is not that easy

delicate sphinx Dec 24, 2021, 12:39 AM

#

the actual model processing takes about 0.8 seconds and loading all of the images (when I use the full dataset) takes about 4 seconds

#

so I always end up waiting for the I/O bottleneck anyway

grave frost Dec 24, 2021, 12:39 AM

#

but over time, you'll immediately recognize what the problem is

delicate sphinx Dec 24, 2021, 12:40 AM

#

Yeah, I'm just getting annoyed haha, no matter what I do it seems to only output "yes" so I'm doing all I can to change that

grave frost Dec 24, 2021, 12:40 AM

#

yea, my model is bottlenecking too 😜 but I am a bit lazy and can wait more

delicate sphinx Dec 24, 2021, 12:40 AM

#

I try kernel_regularizer, changing metrics to Precision and Recall (which I think is best), changing loss/optimizer etc. etc.

#

one change after the other and something I'm doing always ends up favouring the most common answer lol

grave frost Dec 24, 2021, 12:41 AM

#

well, then speed shouldn't be a priority for you at this stage

delicate sphinx Dec 24, 2021, 12:41 AM

#

yeah I'm using 10% of the full dataset

#

so it can do 1 epoch in about a minute as opposed to 45minutes

grave frost Dec 24, 2021, 12:42 AM

#

You can't expect to shave off much from 45 mins tbh

#

prolly 30 mins max

delicate sphinx Dec 24, 2021, 12:45 AM

#

I can reduce that time but I'm trying to do extra processing to see if I can get a few varying results

#

but I still just get [9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] (once decoded that's "yes" with 15 spaces)

#

because like a big brain idiot I'm trying to also output up to 16 words ,-,

#

anyways, I'ma sleep cuz it's 1am rn, ty for the advice man

grave frost Dec 24, 2021, 12:55 AM

#

hm..what's the task?

delicate sphinx Dec 24, 2021, 1:05 AM

#

Visual Question Answering

#

Image + Question --> Answer

#

I've just thrown in like 6 metrics and giving it a quick run through now

grave frost Dec 24, 2021, 1:07 AM

#

you're backpropping through 6 metrics?

delicate sphinx Dec 24, 2021, 1:07 AM

#

https://www.tensorflow.org/tutorials/structured_data/imbalanced_data

TensorFlow

Classification on imbalanced data | TensorFlow Core

grave frost Dec 24, 2021, 1:08 AM

#

I doubt you'd get good accuracy without carefully adjusting your architecture

delicate sphinx Dec 24, 2021, 1:08 AM

#

I'm using the ones given from this

#

ah :/

#

from that site I'm using:

METRICS = [
      keras.metrics.TruePositives(name='tp'),
      keras.metrics.FalsePositives(name='fp'),
      keras.metrics.TrueNegatives(name='tn'),
      keras.metrics.FalseNegatives(name='fn'), 
      keras.metrics.Precision(name='precision'),
      keras.metrics.Recall(name='recall'),
]

grave frost Dec 24, 2021, 1:08 AM

#

wuz the loss

delicate sphinx Dec 24, 2021, 1:08 AM

#

but from a quick test it still outputs "yes"

#

tbh I seem to have deleted the code that gets loss lmao, how do I get loss from train_on_batch again

grave frost Dec 24, 2021, 1:09 AM

#

you can paste the code here, someone might look over it and help you out

arctic wedgeBOT Dec 24, 2021, 1:10 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

delicate sphinx Dec 24, 2021, 1:11 AM

#

delicate sphinx from that site I'm using: ``` METRICS = [ keras.metrics.TruePositives(nam...

so I'm assuming the way this is formatted is the list holds the returns from each metric used

#

[0.6561643481254578, 3789.0, 86.0, 100498472.0, 291.0, 0.9778064489364624, 0.9286764860153198]

#

The issue is much like the credit card fraud imbalanced data set I have a HUGE amount of "yes" answers

#

except it's not a binary classification task and has 23,000 possibilities

#

Loss from last batch trained:  0.5620800852775574
TruePos from last batch trained:  3801.0
FalsePos from last batch trained:  24.0
TrueNeg from last batch trained:  100498536.0
FalseNeg from last batch trained: 279.0
Precision from last batch trained:  0.9937254786491394
Recall from last batch trained:  0.9316176176071167

#

Yeah it's scoring quite high just by guessing "yes"
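
The precision and recall in the log follow directly from the four counts. A quick check using the last-batch numbers above, which also shows how plain accuracy would be swamped by the roughly 100 million true negatives (with one-hot targets over thousands of classes, almost every entry is a correct "not this class"):

```python
# Counts copied from the "last batch trained" log above.
tp, fp, tn, fn = 3801.0, 24.0, 100498536.0, 279.0

precision = tp / (tp + fp)                  # of predicted positives, how many were right
recall = tp / (tp + fn)                     # of actual positives, how many were found
accuracy = (tp + tn) / (tp + fp + tn + fn)  # dominated by true negatives

print(f"precision={precision:.4f} recall={recall:.4f} accuracy={accuracy:.6f}")
```

Because these counts are pooled over every class, a model that answers "yes" for everything can still score well; per-class recall would make the problem visible.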

desert oar Dec 24, 2021, 3:58 AM

#

imbalanced classification is hard in general

#

especially with a huge number of classes
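
One common mitigation, also covered in the TensorFlow imbalanced-data tutorial linked earlier, is to weight classes inversely to their frequency. A minimal sketch with hypothetical answer labels; the `n_samples / (n_classes * count)` scheme is one widely used choice and is an assumption here, not something prescribed in this conversation.

```python
from collections import Counter

answers = ["yes"] * 8 + ["no"] * 3 + ["blue"]  # hypothetical labels

counts = Counter(answers)
n_samples, n_classes = len(answers), len(counts)

# Rare classes get large weights, common classes get small ones.
class_weight = {
    label: n_samples / (n_classes * count) for label, count in counts.items()
}
print(class_weight)
```

Keras training methods accept a `class_weight` dictionary keyed by class index, so the labels would first need mapping to the indices used in the one-hot targets.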

odd coral Dec 24, 2021, 4:07 AM

#

Anyone here code counterfactual regret minimization

coral sage Dec 24, 2021, 5:26 AM

#

I'd like to have custom images instead of labels on my squarify tree map, How can I do that, I couldn't find anything on it when I checked.

#

This is the test data I'm working with for now

#

I plan on downloading the custom emoji images using requests and storing them in a folder with their emoji_id as the filename

#

This is what the treemap looks like, but instead of text labels I'd like the images with the respective filename

drifting mason Dec 24, 2021, 6:53 AM

#

Hello, I wanted a small help

#

I am searching proteins in a disease let us say x

#

in google

#

there are about thousand

#

how do I automate the search and fetch only the proteins in the disease as per google

#

any idea?

#

pls help

bold timber Dec 24, 2021, 8:36 AM

#

can anyone explain the silhouette score to me?

grave frost Dec 24, 2021, 12:28 PM

#

delicate sphinx ``` [0.6561643481254578, 3789.0, 86.0, 100498472.0, 291.0, 0.9778064489364624, 0...

what's the loss you are using?

delicate sphinx Dec 24, 2021, 1:09 PM

#

grave frost what's the loss you are using?

I use Adam and CategoricalCrossEntropy

#

For my loss and optimizer

#

I think one issue i have is I try to output up to 16 words in 16 different lists because I have questions between 1 word and 12 words in my dataset

#

So if I run it too long with a kernel_regularizer, even with a tiny value it ends up forgetting even the most common answer and just outputs nothing

#

I'm not sure if its natively masked but without the regulariser it seems to still remember to output yes

lapis sequoia Dec 24, 2021, 1:13 PM

#

I am wondering, are there any machine learning models that use past information to predict future outcomes, with time as the independent variable?

#

like doing time series forecast on stock price, is there any model based on time on machine learning?

delicate sphinx Dec 24, 2021, 1:16 PM

#

I'm not sure about the time side or int of it but in text there are models called RNNs that maintain cell state to adjust future biases

#

And I think you may be able to create dense networks with a bias_initializer so you might be able to assign higher bias to ones that have historically scored better

#

Bias weights I think you can also get from previous iterations of the network, though as far as I'm aware you'd have to build the model and compile it every time

lapis sequoia Dec 24, 2021, 1:20 PM

#

thanks

desert oar Dec 24, 2021, 2:51 PM

#

lapis sequoia like doing time series forecast on stock price, is there any model based on time...

https://otexts.com/fpp3/

Forecasting: Principles and Practice (3rd ed)

3rd edition

mighty spoke Dec 24, 2021, 3:06 PM

#

Hi does anyone know about how to use power spectral density on the price of the particular stock vs the time series ?

lapis sequoia Dec 24, 2021, 3:23 PM

#

desert oar https://otexts.com/fpp3/

thanks

lapis sequoia Dec 24, 2021, 6:19 PM

#

It is not possible to transfer your consciousness to the virtual world. It is impossible. Maybe it will be possible to create a virtual, limited double of a person, their limited imitation of a character, etc., but it will be a lie, it will be a computer processed set of 0 and 1, not a real person or his actual consciousness. Man will never reach the intellectual level to create an artificial intelligence equal to his own. It is logically impossible. AI can do things faster and more accurately, but it won't be true intelligence, just a limited imitation of it. AI is not really intelligence, but an algorithm that mimics the characteristics of intelligence in a limited way. And that artificial intelligence does certain things faster and more accurately is due to the speed of computers and these algorithms, and the numerical nature of computers that compute everything much faster than humans with satisfactory accuracy, hence the efficiency and accuracy of these algorithms, but it is not intelligence.
Can someone tell me if im wrong or right?

cedar brook Dec 24, 2021, 6:57 PM

#

Does anyone have any book recommendations for something between ISLR and the elements of statistical learning wrt. difficulty/complexity?

burnt knot Dec 24, 2021, 7:12 PM

#

So for the past few months I've been trying to build a bilingual voice cloning machine
The other day I ran into an issue that I can't figure out if it's either an obstacle or a permanent shutdown

#

I need to synthesise my audio, which requires trading my metadata files

#

But the process isn't streamlined in the slightest, meaning I have to take all my hundreds of datasets and manually write them in to be synthesised

#

Which I think might not be what I'm meant to do if this is supposed to be dealing with one metadata file

#

I don't know, I feel as if there's still a chance for me to make this work but I feel as if everything I've been doing since August has been a complete mess

narrow wren Dec 24, 2021, 11:37 PM

#

Code design for a ML project

Hi everyone - I'd love your thoughts on something I'm working on:

I'm writing some code for a ML project that I'm working on. The rough pipeline is something like this:

1. Pull raw training data from a SQL database
2. Perform feature engineering
3. Train the model on the data (hyperparameters have already been defined and stored in a config file)
4. Save and upload trained model onto a server.
5. Use trained model to score on new observations

All of this code will reside in a GitHub repo, and the same repo will house code for other models as well (in different folders within the repo) that are similar but have different features for the raw data for those models. My question is for point number 2 above:

How do I write modularized code for feature engineering? Should I define a specific class for the raw data that I pull for this model, and then define specific methods for each feature that I add to this dataframe? Or am I thinking about this wrongly? I apologise if my question doesn't make a lot of sense, but I'd appreciate any thoughts on this. Also, if you have links to a good public repo that I can look at to get some inspiration, that would be awesome too.

TIA

serene scaffold Dec 25, 2021, 12:32 AM

#

narrow wren Code design for a ML project Hi everyone - I'd love your thoughts on something ...

What do you mean by feature engineering? Representing the data from the SQL table in a way that can be passed to the model?

narrow wren Dec 25, 2021, 12:34 AM

#

serene scaffold What do you mean by feature engineering? Representing the data from the SQL tabl...

I mean, adding additional derived columns/features from the existing set of columns, which i intend to do using Python. SQL (step 1) will only be used to pull in the raw data from a database, and nothing else.

serene scaffold Dec 25, 2021, 12:37 AM

#

narrow wren I mean, adding additional derived columns/features from the existing set of colu...

If the derived features are derived only in terms of what is already in the DataFrame, I don't think you need to design anything extra.

narrow wren Dec 25, 2021, 12:42 AM

#

serene scaffold If the derived features are derived only in terms of what is already in the Data...

Yes, but where will the code for this step ideally reside? In a separate file like feature_engineering.py? And if this file also contains code to perform feature engineering for other dataframes to be used for other models, what's the best way to separate out the code for different models?

serene scaffold Dec 25, 2021, 12:44 AM

#

narrow wren Yes, but where will the code for this step ideally reside? In a separate file li...

I guess you can make a function that takes a dataframe and returns a series, and make one function like that for each derived feature, and then call all of them in a call to pd.concat
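
A minimal sketch of that suggestion, assuming pandas. The column names, the two feature functions, and the `add_features` helper are all hypothetical, chosen only to show the shape of the approach: one function per derived feature, each taking the dataframe and returning a named series.

```python
import pandas as pd


def unpaid_ratio(df: pd.DataFrame) -> pd.Series:
    """Share of the invoiced amount that is still unpaid."""
    return (df["unpaid_amount"] / df["total_invoiced_amount"]).rename("unpaid_ratio")


def avg_invoice_amount(df: pd.DataFrame) -> pd.Series:
    """Mean amount per invoice."""
    return (df["total_invoiced_amount"] / df["total_invoice_count"]).rename(
        "avg_invoice_amount"
    )


# One list of feature functions per model keeps the models separate.
MODEL1_FEATURES = [unpaid_ratio, avg_invoice_amount]


def add_features(df: pd.DataFrame, feature_funcs) -> pd.DataFrame:
    """Return df with one extra column per feature function."""
    return pd.concat([df] + [func(df) for func in feature_funcs], axis=1)


raw = pd.DataFrame({
    "total_invoice_count": [4, 2],
    "total_invoiced_amount": [1000.0, 500.0],
    "unpaid_amount": [250.0, 0.0],
})
features = add_features(raw, MODEL1_FEATURES)
print(features)
```

Each function stays independently testable, and a per-model list such as `MODEL1_FEATURES` documents which features belong to which model without any class hierarchy.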

narrow wren Dec 25, 2021, 12:55 AM

#

serene scaffold I guess you can make a function that takes a dataframe and returns a series, and...

Yes, that was my initial thought. But this function will only be applicable to a specific dataframe. For example let's say the repo in this case hosts code for 5 different models. And let's say I have a file named get_raw_data.py which has different functions like get_raw_data_for_model1(), get_raw_data_for_model2(), ..., get_raw_data_for_model5(), each of which runs a SQL query to pull in raw data and returns a pandas dataframe required for each of the models. Each of these dataframes has different columns and they are quite different from each other. So the functions I create in feature_engineering.py are not applicable to all 5 dataframes. In this case, how do I separate out these functions? More concretely, if I have a function create_feature_foo() in feature_engineering.py that is applicable only to dataframe 5, what's the most pythonic way to do it?

iron basalt Dec 25, 2021, 1:36 AM

#

narrow wren Yes, that was my initial thought. But this function will only be applicable to a ...

Make 5 different functions if they clearly have no overlap. There is no way to make that less work or smaller in terms of amount of code.

narrow wren Dec 25, 2021, 1:40 AM

#

iron basalt Make 5 different functions if they clearly have no overlap. There is no way to m...

Yes, we will have 5 different functions. But what's the best way to indicate that a function is applicable to only a certain dataframe (and not the others?) - is it okay to define these dataframes as separate classes (that inherit from pd.DataFrame) and define these functions as methods for these classes?

iron basalt Dec 25, 2021, 1:42 AM

#

narrow wren Yes, we will have 5 different functions. But what's the best way to indicate tha...

You can either have documented pre-conditions (recommended because it makes the code shorter), or you will have to have the functions check the conditions in code (the dataframe format / columns, etc).

#

The second method that takes more code makes sense if you want to make this a bit more robust against people new to the project that don't really yet know what they are doing or don't really care.

#

To that end you can have a dataframe format checking tool (in code) that takes in a given expected format.
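
One way such a format checker might look, as a rough sketch: the helper name `check_dataframe_format`, the `MODEL5_FORMAT` mapping, and the columns in it are all assumptions made up for this example, not part of any library.

```python
import pandas as pd


def check_dataframe_format(df: pd.DataFrame, expected_dtypes: dict[str, str]) -> None:
    """Raise ValueError if df lacks an expected column or a column has the wrong dtype."""
    missing = [col for col in expected_dtypes if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    wrong = {
        col: str(df[col].dtype)
        for col, dtype in expected_dtypes.items()
        if str(df[col].dtype) != dtype
    }
    if wrong:
        raise ValueError(f"unexpected dtypes: {wrong}")


# Hypothetical expected format for the dataframe used by model 5.
MODEL5_FORMAT = {"customer_id": "int64", "amount": "float64"}


def create_feature_foo(df: pd.DataFrame) -> pd.Series:
    """Pre-condition: df follows MODEL5_FORMAT (checked in code here)."""
    check_dataframe_format(df, MODEL5_FORMAT)
    return (df["amount"] * 2).rename("foo")
```

With documented pre-conditions only, the function would keep the docstring and drop the `check_dataframe_format` call.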

narrow wren Dec 25, 2021, 1:48 AM

#

iron basalt The second method that takes more code makes sense if you want to make this a bi...

Got it - that makes sense! 🙂 Can you tell me what's wrong/undesirable with the classes/method option? Just trying to understand the pros and cons of everything.

iron basalt Dec 25, 2021, 1:48 AM

#

narrow wren Got it - that makes sense! 🙂 Can you tell me what's wrong/undesirable with the ...

Don't make an object unless it needs to be one.

#

That's how you get those massive OOP projects with crazy class names that make no sense (cough Java libraries cough).

#

And they don't do much (pretty empty, just some getters and setters).

#

The time to make an object (often a plain data object with no methods at all) is when you find yourself passing the same arguments around together to different functions all the time.

#

So you can think of an object / struct as being a shared stack frame for the variables.
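
As an illustration only (the `PlotStyle` name and its fields are invented for this sketch), grouping arguments that always travel together into a method-free data object could look like this:

```python
from dataclasses import dataclass


@dataclass
class PlotStyle:
    """Plain data object: three values that were always passed together."""
    title: str
    width: int
    height: int


def describe_canvas(style: PlotStyle) -> str:
    # Previously: describe_canvas(title, width, height)
    return f"{style.title}: {style.width}x{style.height}"


def canvas_area(style: PlotStyle) -> int:
    # Previously: canvas_area(width, height)
    return style.width * style.height


style = PlotStyle(title="Passengers per terminal", width=8, height=6)
print(describe_canvas(style), canvas_area(style))
```

The class has no behaviour of its own; it only shortens the signatures of the functions that share those variables.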

#

Actually the way to get the optimal structure for a project is to first write out the entire thing flat (step 1), that is, no functions, no classes, just all inline. Then you look at which parts repeat or are very similar other than some differences in parameters. Take those repeat parts and make a function out of them (step 2). Note that this effectively compresses the code. Then look at your functions and see if they tend to have a bunch of arguments that can be grouped together / are passed around together. If they are, make an object out of them (with those variables as the members) (step 3). Again this compresses the code. Now go back to step 1. Extra step: sometimes it's worth splitting up the flat code even if it does not actually compress it further because you want to be able to read what is happening as high-level steps (like a story), this is often done in the main function / file.

#

This method will give you the optimal code in terms of size while still being readable. No function nor object is unnecessary.

#

All programming paradigms teach this method in an indirect way.

#

Note that this method uses hindsight. It does not try to predict which functions or classes need to be made and then make them (no upfront diagram (UML), only maybe after making it). It lets the code itself decide how it wants to look.

#

Prediction of classes and functions can be incorrect and lead to bad design.

safe elk Dec 25, 2021, 3:22 AM

#

Done UML, not too great, yep

#

Evolve your code as needed

#

If you have UML you have to update that too, which is more work

#

Document with UML when things are stable

narrow wren Dec 25, 2021, 3:28 AM

#

iron basalt Prediction of classes and functions can be incorrect and lead to bad design.

Thank you so much for the detailed explanation! I really appreciate it! 🙂

stone marlin Dec 25, 2021, 4:14 AM

#

Hm, I've done a significant amount of EDA with OOP and I thought it was fine, I don't think it's generally undesirable. But if someone is starting out, doing an easier or smaller project, or even just working alone, etc., I agree that it's sometimes sort'a overkill. Having said that, before anyone yells at me for liking OOP or whatever, I'll note my experience here.

My experience with Objects in EDA: For much of the data I worked with, there were components, subcomponents, etc., and these were typically all part of the same "thing" --- though I've done this even with travel data and other types of data, not just physically-modeled data. I can use classmethods to parse the appropriate parts of the data into objects which each have their own methods --- mainly cleaning and descriptor methods, as well as plotting methods. I'm also able to isolate different parts for inter-component feature engineering, and for component-to-component feature engineering it is enough to check for the class type to see if the components "go together". If one also forces an additive-only structure with feature engineering (at least until returning the df) then it's also trivial to add new features in the class.

I also feel that readability is significantly easier than the normal "imperative" EDA where it's just a bunch of imperative code with comments above it saying what something does --- especially if you want to modify one single part of the imperative code and you didn't realize something lower required some strict size or something. Though, to be fair, I'm a huge stickler for Python's type-hinting + documentation --- usually my commit stuff runs mypy + sphinx and throws an error if something isn't documented --- though I'm on the extreme side of this, I know.

#

My bias, though, is to err on the side of verbosity and telling people exactly what needs to go in and come out. I know not everyone likes this, but it's been okay for me so far.

#

(This also works well with larger data, when you need to format / feature engineer around chunks and you need to make a DAG structure to do all that nonsense. But, again, that's kind of the "shared code" that Squiggle talks about above, just a specific point I wanted to note.)

iron basalt Dec 25, 2021, 5:57 AM

#

stone marlin Hm, I've done a significant amount of EDA with OOP and I thought it was fine, I ...

In Python one does need to fight the language a bit here by not being lazy on the type hinting. I like to also distinguish between API specification (what are the functions and classes, pre-conditions, post-conditions, side-effects, return values, errors, is the operation atomic?, security concerns, TODO, etc), and documentation (an extensive set of documents which can link to or contain the API specification as well). API specification can be auto generated by "documentation" tools, but documentation is often like writing a book and takes a lot of time. Documentation is often best done afterwards, once one actually knows what the thing looks like (rather than just predicting it; give it time to settle), because having to change it is costly in time.

#

*The entire program does not need to be complete to write documentation, parts can be documented. I just look at the rate at which they were changed over time (commits). If the part has not been touched in a while (solidified), then I document.

stone marlin Dec 25, 2021, 6:02 AM

#

I don't disagree --- Python is great for prototyping, so to do anything for "production" (type-hinting, etc.) it does take a bit of work and boilerplate to make sure everyone's on the right page. Moreover, there's a lot of DS people that don't even know type-hinting is a thing, or that they can lint/format.

I'm not sure I understand the distinction you're making exactly between API docs and standard documentation, but it might be different here because we build our APIs separate from our EDA tooling and both take a different method of documenting (APIs use swagger, autogen usually; EDA uses self-written numpy docstyle). Either way, it's good to separate those otherwise it gets confusing, I agree.

The only thing I disagree with here (and mildly so) is that I make my peeps document when they're submitting a PR, even if things change soon, as well as write unit tests --- but, having said that, it's usually at a point then where things have "settled". Also, these are people who are doing this kind of thing a lot, so we already know the gist of what functions we should have. I also found that if I don't force them to document right away, it will literally never get documented, but maybe that's just me not badgering people enough later.

#

That's an interesting strategy --- I do not know if it would work for me in a team setting, but I can imagine if it's a solo project that isn't quite clear yet (initial stages of EDA, etc.) then, yeah, totally, I can see that being a reasonable way to go. Especially for early EDA when you're just scratchin' around at stuff.

#

One additional downside (at first) of my methodology is: it takes a LONG time to get people used to documenting, linting, type-hinting, etc. Eventually, it's second nature, but there's a pretty significant ramp-up time.

iron basalt Dec 25, 2021, 6:05 AM

#

stone marlin I don't disagree --- Python is great for prototyping, so to do anything for "pro...

Yeah, I agree about writing a bit explaining what is going on and writing some tests, I just don't consider it documentation under my own terminology. Documentation is more "serious", one needs to sit down and spend a lot of time on just it. Open up LaTeX maybe, Word, make some diagrams (maybe even animations), etc.

#

(It can take weeks)

stone marlin Dec 25, 2021, 6:06 AM

#

Ahh, I understand. Yes, I agree --- sort of like, reporting or "long-term" documentation of tools for wide-spread use or something.

iron basalt Dec 25, 2021, 6:06 AM

#

(Most projects can't afford this unless it's legacy / will stick around for a long time and not change too much / or they just have a lot of employees like the big popular game engines for example)

stone marlin Dec 25, 2021, 6:07 AM

#

Yes, that documentation, I agree, should not be done until the tool is in a usable state and is not being used only by the team that made it. I misunderstood --- in the above, when I say "documentation" I mean doing google/numpy/whatever style docstrings and other minor things like that in a Python file, as well as Swaggering the API (or whatever doc system is used).

iron basalt Dec 25, 2021, 6:07 AM

#

It's the sort of thing one creates to make sure that future employees can understand it all in total long after you are gone.

#

(An example would be like hardware documentation like Intel's official docs for their chips)

stone marlin Dec 25, 2021, 6:09 AM

#

Yeah, that is a much more serious endeavor, one that I've luckily not had to do much. I don't think I'd require my team to do documentation like that on most of, if not all of, our EDA code. For a full ML project, probably only a small bit on how to run the pipeline (since it's generally standard pipelining).

iron basalt Dec 25, 2021, 6:17 AM

#

stone marlin Yeah, that is a much more serious endeavor, one that I've luckily not had to do ...

I think writing at least one comment per file (at the top; I like to include example code) is very nice. For the functions themselves, I like documenting the various assumptions being made (pre-conditions, post-conditions, etc.; every function has more of these than one might think), and in some cases covering up bad language design (like how C does not have multiple return values, so I need to say which argument is actually an output). I prefer very long function names (sentences sometimes) and variable names, so if those do not already let you know what it does then maybe a comment still, but often they are (and are supposed to be) sufficient.
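
A small sketch of what that style might look like in a Python file. The module and the function `mean_of_present_values` are hypothetical, chosen only to show a top-of-file comment with example code plus documented pre- and post-conditions:

```python
"""Small statistics helpers (hypothetical module).

Example:
    >>> mean_of_present_values([1.0, None, 3.0])
    2.0
"""
from typing import Optional


def mean_of_present_values(values: list[Optional[float]]) -> float:
    """Return the arithmetic mean of the entries that are not None.

    Pre-conditions:
        - ``values`` contains at least one entry that is not None.
    Post-conditions:
        - ``values`` is not modified.
    Raises:
        ValueError: if every entry is None (or the list is empty).
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("values must contain at least one non-None entry")
    return sum(present) / len(present)
```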

stone marlin Dec 25, 2021, 6:20 AM

#

Agree with everything here. Especially in the context of Python.

#

Seems legit. I wanted above to note that these things are options, but for someone starting out in their journey / doing a smaller project, yeah, probably focusing on other things is more reasonable, as y'all noted above.

lapis sequoia Dec 25, 2021, 6:28 AM

#

Hi all, has anyone used SHAP with FB's Prophet?

safe elk Dec 25, 2021, 6:41 AM

#

Lol we all agree

hoary wigeon Dec 25, 2021, 8:40 AM

#

Hey, I need a good project topic for developing a deep learning model.

Can anyone suggest some good topics?

trim cedar Dec 25, 2021, 8:48 AM

#

lapis sequoia ``` It is not possible to transfer your consciousness to the virtual world. It i...

Google's AlphaZero is said to have learned chess on its own by playing against itself, and it then also defeated the best chess engine of the time, Stockfish! In that case, it has also incorporated self-learning and used its previous learning in subsequent attempts to play even better, etc.
Other than this, human intelligence is capable of creative content creation like poetry or art, which code like GPT-3 is already capable of to some basic extent, right?
Would this suggest the contrary to you?

austere swift Dec 25, 2021, 8:48 AM

#

hoary wigeon Hey I need some good project topic for developing Deep Learning Model. Can anyo...

what I usually do is check out Kaggle for datasets/notebooks and just try to train a model with those

bold timber Dec 25, 2021, 9:02 AM

#

Hi, I am wondering about this plot. Why isn't the 'other' value in the 'Terminal' feature being plotted?

trim cedar Dec 25, 2021, 9:18 AM

#

bold timber Hi, I am so wondering about this plot. Why can't plotting an 'other' value in th...

Hi, I do not know seaborn. But could this error be because 'Others' does not have enough passengers to show?

bold timber Dec 25, 2021, 9:32 AM

#

trim cedar Hi, I do not know seaborn. But could this error be because 'Others' does not ha...

Does it mean the 'Other' value is very small?

trim cedar Dec 25, 2021, 9:39 AM

#

bold timber it means the 'Other' value is very small?

Could be. You could take its ~~sum~~ value_counts or something and check.

rose loom Dec 25, 2021, 9:57 AM

#

hi guys, i have homework for artificial intelligence class. Its topic is "artificial intelligence in law". They asked us to design an artificial intelligence program to help lawyers. But everything I can think of has been done. I want to get your opinion too. Can you share your ideas with me?

bold timber Dec 25, 2021, 10:20 AM

#

trim cedar Could be. You could take its ~~sum~~ value_count or something and check.

ok thank you for the answer

trim cedar Dec 25, 2021, 10:39 AM

#

rose loom hi guys, i have homework for artificial intelligence class. Its topic is "artifi...

Crime prediction? 🤔
Oops! Our clients are not law enforcement, but lawyers! Sorry!
Hmm... How about an app that predicts the probability of bail, etc.?
It could provide the history of such successful cases, plus the applied sections and guidelines drawn from verdicts in the data.

rose loom Dec 25, 2021, 10:45 AM

#

I think that's a good idea. Thanks😊

rose loom Dec 25, 2021, 10:57 AM

#

rose loom hi guys, i have homework for artificial intelligence class. Its topic is "artifi...

Anyone else have any ideas to share?🤗

odd meteor Dec 25, 2021, 11:18 AM

#

bold timber Hi, I am so wondering about this plot. Why can't plotting an 'other' value in th...

The 'other' category count is relatively small when compared to other categories therein.

Use value_counts to view the proportion of each category in the 'Terminal' column.
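
For example, a quick check along those lines, using a made-up dataframe (only the 'Terminal' column name comes from the question; the values and counts are assumptions):

```python
import pandas as pd

# Hypothetical data: 'Other' is deliberately rare compared to the other categories.
df = pd.DataFrame({"Terminal": ["A"] * 60 + ["B"] * 39 + ["Other"] * 1})

counts = df["Terminal"].value_counts()                # absolute count per category
shares = df["Terminal"].value_counts(normalize=True)  # proportion per category
print(counts)
print(shares)
```

If the share of 'Other' is tiny, its bar can be too short to be visible at the plot's scale.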

odd meteor Dec 25, 2021, 11:27 AM

#

rose loom hi guys, i have homework for artificial intelligence class. Its topic is "artifi...

Hmm, interesting... 😀

You could explore topic modelling for fraud detection using LDA. I mean Latent Dirichlet Allocation, the LDA model used for topic modelling (not to be mistaken for Linear Discriminant Analysis).

If you then want to make your work more alluring (or maybe sophisticated), delve into self-supervised vs semi-supervised learning and compare the results with those from your LDA topic model.

bold timber Dec 25, 2021, 11:52 AM

#

odd meteor The 'other' category count is relatively small when compared to other categories...

Okay, thank you

#

Should we scale the data before determining clusters?

#

I want to use DBSCAN, but I am confused about when to scale the data?

odd meteor Dec 25, 2021, 12:32 PM

#

bold timber I want to use DBSCAN, but I am confused about when to scale the data?

Yes it's advisable to scale your data first before applying any clustering algorithm on it.
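
A minimal sketch of scaling before DBSCAN with scikit-learn. The data is synthetic and the `eps` and `min_samples` values are arbitrary assumptions that would need tuning on real data:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Synthetic data: two groups that differ in feature 0,
# while feature 1 is uninformative but on a much larger scale.
rng = np.random.default_rng(0)
group_a = np.column_stack([rng.normal(0.0, 0.1, 50), rng.normal(0.0, 1000.0, 50)])
group_b = np.column_stack([rng.normal(5.0, 0.1, 50), rng.normal(0.0, 1000.0, 50)])
X = np.vstack([group_a, group_b])

# Standardize each column to mean 0 and standard deviation 1,
# so no single feature dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)

# Cluster on the scaled data; a label of -1 marks points treated as noise.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_scaled)
print(sorted(set(labels)))
```

Because DBSCAN works on distances, an unscaled feature with a large range would otherwise decide on its own which points count as neighbours.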

bold timber Dec 25, 2021, 12:32 PM

#

odd meteor Yes it's advisable to scale your data first before applying any clustering algor...

Ok thank you for the answer

#

but do you know about silhouette score?

#

Is it true that silhouette score is the method for evaluating a clustering when the data has a label?

odd meteor Dec 25, 2021, 12:44 PM

#

bold timber but di you know about silhouette score?

I only know some people use silhouette metric in gauging performance of their clustering model but I haven't used it before so I don't know for sure.

Have you sifted through Google yet? I'm sure it'll know the right answer.

bold timber Dec 25, 2021, 12:56 PM

#

odd meteor I only know some people use silhouette metric in gauging performance of their cl...

I've looked on Google but I'm still confused, because the examples given only use 2 columns (feature and label), while I have 15 features and 1 label
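
For what it's worth, scikit-learn's `silhouette_score` takes the whole feature matrix plus the cluster assignments, so it is not limited to one feature column. A rough sketch on synthetic 15-feature data (the data and the DBSCAN parameters are assumptions for illustration):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a 15-feature table: two well-separated blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 15)), rng.normal(10.0, 1.0, (100, 15))])

X_scaled = StandardScaler().fit_transform(X)
cluster_labels = DBSCAN(eps=2.0, min_samples=5).fit_predict(X_scaled)

# Score only points assigned to a cluster (-1 is noise);
# the silhouette score needs at least 2 clusters to be defined.
mask = cluster_labels != -1
score = None
if len(set(cluster_labels[mask])) >= 2:
    score = silhouette_score(X_scaled[mask], cluster_labels[mask])
print(score)
```

Note that the labels passed in are the cluster assignments produced by the algorithm, not the dataset's own label column.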

#

But I'm wondering about this: when a column is in datetime format like this, is it advisable to drop it, or to keep it included when scaling?

safe elk Dec 25, 2021, 1:09 PM

#

https://www.prisonpolicy.org/blog/2017/04/18/automated-justice/

Prison Policy Initiative Blog

Stephen Raher

Automated Justice: A Review of Weapons of Math Destruction

Stephen Raher reviews Cathy O’Neil's book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.

safe elk Dec 25, 2021, 1:11 PM

#

rose loom Anyone else have any ideas to share?🤗

See above, great book

#

“weapons of math destruction” (WMDs). She defines WMDs as opaque mathematical models that embed human prejudice, misunderstanding, and bias into the software systems that automate numerous aspects of our lives. Her book covers several types of these models and the frustrating injustices they can perpetrate. In addition to case studies about credit scoring, online advertising, employment, and insurance, O’Neil discusses the use of WMDs in the criminal justice system

#

Explore possible bias in models that causes inequality and suggest ways to fix the models

#

The title is funny too

#

Oh yeah, I read the book, it is informative and entertaining

spark apex Dec 25, 2021, 1:23 PM

#

Hey,
I want to do pose detection on the web,
so I tried MoveNet with ONNX and TFjs.
For PC:
ONNX: ~20 FPS
TFjs: ~40 FPS

For mobile:
ONNX: 2 FPS
TFjs: 6 FPS

What can I do to improve speed? I am doing all inference on the client side, in the browser.

Are there any other models I can try?
Or something else entirely I can try?

Thank you in advance for the help

grave frost Dec 25, 2021, 2:33 PM

#

spark apex hey, I am want pose detection on web So i tried move net with ONNX and TFjs. ...

make them sparse - check out neuralmagic

austere swift Dec 25, 2021, 2:38 PM

#

spark apex hey, I am want pose detection on web So i tried move net with ONNX and TFjs. ...

if you're doing inference only then try lowering precision

#

there are also things like TensorRT which do lower-level optimizations as well

spark apex Dec 25, 2021, 2:40 PM

#

austere swift if you're doing inference only then try lowering precision

I was thinking of using an int8 model; currently I am using float32.
But I don't know how to use a TFLite model with TFjs