#data-science-and-ml
Wait, okay, so you're just putting in the tokenized words (as int indexes) into a NN?
yeah through an embedding layer
Yeah the tokenizer has a vocabulary I can index to get words with a given int
i.e. tokenizer.sequences_to_texts([[581, 20, 14, 3414]]) = ['what are you doing']
Right, okay, so far so good. IIRC, Keras requires integer stuff for this kind of thing. I think this is what Word2Vec does?
I only use the words that appear in my project, which is around 28,000 unique ones
it keeps count of the unique words
though there is another part of it that counts occurrences of the words I think
Yeah I think the tokenizer was like a word2vec idk
Okay, so, you've got the embedding layer. What are you hoping to get out of your NN at the end? You end up with a 16-element thing, what is that supposed to be?
an answer
i.e.
"what are you doing" --> model --> "nothing"
hopefully between lengths of 1 and 16 as my longest answer in my dataset is 12 (and I want to keep it in powers of 2)
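A minimal sketch of the index round trip being described, in plain Python rather than the Keras Tokenizer. The toy vocabulary, the 0 reserved for padding, and the helper names `encode` and `decode` are assumptions for illustration only.

```python
# Toy word index; 0 is reserved for padding (an assumption for this sketch).
word_index = {"what": 1, "are": 2, "you": 3, "doing": 4, "nothing": 5}
index_word = {i: w for w, i in word_index.items()}

MAX_LEN = 16  # fixed answer length from the discussion


def encode(text, max_len=MAX_LEN):
    """Turn a sentence into a fixed-length list of integer indexes."""
    ids = [word_index[w] for w in text.lower().split()]
    return ids + [0] * (max_len - len(ids))  # pad on the right with zeros


def decode(ids):
    """Turn integer indexes back into a sentence, skipping padding."""
    return " ".join(index_word[i] for i in ids if i != 0)


encoded = encode("what are you doing")
print(encoded)
print(decode(encoded))
```

The point is only that words and ints map both ways, so anything the model outputs has to end up as valid indexes before it can be turned back into text.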
Okay, so you're using like SGD to train the NN? This is where my knowledge of NLP / NNs gets a bit fuzzy, apologies.
Okay, same dealio. Okay.
Okay, so now I'm gonna be totally in the dark. Does your output, right now (on a test set) correspond to up to 16 words (index of words) which you've tokenized?
If you get back like, [1, 4, 6, 7, ...] is that going to be those words?
Yeah I just realised the way I'm doing it is definitely wrong
as my answer is tokenized too
which means as long as it outputs float I'll never get good enough answers
Right, the softmax is gonna be wacky.
because the output expects float and I have int
so I need to either:
- convert the stuff to int just before it's output, or
- find a way to convert from float to int and do the inverse of that to my answers so the model can actually learn it
either way my answer format is wrong :/
Yeah, it feels like it's giving you a "fuzzy" answer, but that makes no sense with an integer corpus. Hm.
yeah it's my processing that converts it to float
I'm not knowledgeable enough in this field to be able to help right now, but I'll take a look at it a bit later and see if I can mess around with a toy model.
tbh with you I combine an image and a question model into one
so it's fairly complex so dw too much
but if you can make a sort of model that converts integers to floats then that's fair
Haha, I'm just going to do the question part, mostly so that I know it better. :']
but because of my image input I think I have to avoid just processing on ints
I haven't done much with NNs since I've never needed them for work, but they seem pret cool.
Yeah Tensorflow has lots of documentation and all as well
Can you process your image in another NN and pick out relevant features, then pass those in?
sadly the only project that appealed to me like it could've helped me used custom encoder/decoders which I don't want to steal
Object detection or whatnot? Or, possibly, figure out the "topic" of the question from your NN and feed that into an Object Detection NN?
my image is preprocessed in InceptionV3 to give me features from the image
and that's passed into an image model and my image model and question models combine into my merged model
Ahh, okay. Wild.
yeah, I can find help for either 1 thing but the only help I can find online for my project are quite different and don't just use tensorflow which I'd like to do
the way I've managed my model so far, this is the output
I think softmax is only really useful for one-hot encoding because it represents a probability of each word being correct
@stone marlin which algo can i use for this?
There are a number of ways to architect this, but if you're specifically talking about one thing, you can record the times you've done this in a db or something and use linear regression.
At least for a proof of concept.
yea thats what i was thinking store the data in some sort of db and then use linear regression
but
You could, for example, store "time lights went on in the last 30 days" or something, and regress on those.
but in linear regression lets say one axis is the time what would be the other axis?
x-axis is the day, y-axis is the time you put the light on or whatever.
Honestly, for your thing, you could literally just take the mean of the last N days.
im sorry?
Your y-axis would be the time. Your x-axis would be the day number.
hmm
If you want a 1-dimensional linear regression, that's pret much just the mean. So, you could do the last 7 days and be like, [7, 7, 7, 8, 7, 7, 6] for time to wake up, and it would take the mean of those and turn the light on then.
what do you mean by "mean"?
Average.
there's nothing wrong imo if the index isn't "1 per row". if anything, it's good practice to try to use meaningful indices whenever you can, instead of just integer row numbers
mean as in: mean of [5,5,6,9,4] = (5+5+6+9+4) / 5 = 29/5 = 5.8 (mean = sum_of_list / len_of_list)
tbh I would've said just do the mean of it too, but I'm really no genius haha, and obviously you can use AI to prevent outliers contaminating it
but for the most part mean should work
ah
Yeah, mean is fine, median is prob better tbh.
but then i dont even need to use ml
Median's robust to outliers, so that'll be better.
Yes. You don't.
Haha, not every problem needs ML to solve!
yea lol
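To make the mean vs. median point concrete, here is a small sketch using only the standard library. The first list is the week from the example above; the second list, with one 11:00 lie-in, is made up for illustration.

```python
from statistics import mean, median

wake_hours = [7, 7, 7, 8, 7, 7, 6]       # the week from the example above
with_outlier = [7, 7, 7, 8, 7, 7, 11]    # same idea, but one late morning (made up)

print(mean(wake_hours), median(wake_hours))
print(round(mean(with_outlier), 2), median(with_outlier))
```

The single late morning pulls the mean up noticeably, while the median stays at the usual hour, which is what "robust to outliers" means here.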
If it's some sort of "ML is required" project you've been given, Linear regression is probably best bet
but in my case it can also be: [5:30,5:40,6,6:30]
But you might be trying to use a workshop of tools to hammer a nail in place here haha
Yeah, you can translate that to military time or whatever.
yeah 5:30 = 0530 in military time
much easier to manage
(or just do 5 * 60 + 30)
as long as you're consistent
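A quick sketch of the `5 * 60 + 30` idea, assuming times are written as "H:MM" or just "H". Minutes since midnight are safer to average than 0530-style numbers, since the midpoint of 530 and 600 is 565, which is not a real clock time. The helper names are hypothetical.

```python
def to_minutes(clock):
    """Convert 'H:MM' (or just 'H') to minutes since midnight."""
    hours, _, minutes = clock.partition(":")
    return int(hours) * 60 + int(minutes or 0)


def to_clock(total_minutes):
    """Convert minutes since midnight back to 'H:MM'."""
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


times = ["5:30", "5:40", "6", "6:30"]
minutes = [to_minutes(t) for t in times]
average = round(sum(minutes) / len(minutes))
print(minutes, to_clock(average))
```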
If you want to bring in machine learning that will probably have to be a more complex project, i.e. learning whether or not a value is an outlier
i.e. if on the weekends they turn on the lights at 11:00 because they slept in longer than usual
then that shouldn't change the times for the rest of the week
you know when some people say their ai can improve over time
what do they exactly mean?
like it improves in what?
@delicate sphinx
@stone marlin
sorry for the pings
It depends on what they look to improve
And it depends on if they let it continue to learn
what does it learn tho?
It learns based on the information you give it
Usually they are referring to 'online' learning, which is a method that can continually update itself
I've not heard of online learning for AI but I'm familiar with the idea behind it
im still a bit confused on what it learns
You might already be familiar with methods that partition a large data set into train, test, etc. So that type of model learns once on the training data and that's it. If you have newer data that you want it to learn from, you have to make an entirely new model
I guess I'm jumping into this conversation without context. I assumed we're using 'AI' and 'ML' interchangeably but from Tentenmen's answer it sounds like we're distinguishing them
I'm not as well versed as you with all the lingo and jargon so you're probably better
at guessing the topic with your assumption
I have no idea 😛
I have made some models but probably still a beginner in the big picture of things
same but im learning sklearn
have you ever used things like Embedding layer, Tokenizer or TextVectorization?
I've been stuck for like 2 weeks on how to actually get valid answers from a model
at first I just had <unk> <unk> <unk>, ....
then I had the same array over and over
also my ai is a personal assistant and if i want it to improve what can i make it improve in?
and now I'm just getting floats but have no idea how to translate that into words
@arctic crown An example of online vs offline learning would be the linear regression model someone mentioned to you above. If you trained (AKA fit) that type of model, it would be 'learning' values for a coefficient that minimize the squared error between the line and all of the ground truth values (AKA labels, AKA 'y'). If you had an algorithm that fit the line given a dataset and couldn't update the coefficient afterwards, that would be 'offline' learning. If you had an algorithm that could update the coefficient after each new example, that would be 'online' learning, so it could improve over time. I think SKLearn has some online learning algorithms in it.
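A rough sketch of the "update the coefficient after each new example" idea, in plain Python. The class name, learning rate, and the made-up data stream are all assumptions; scikit-learn offers the same idea through `partial_fit` on estimators such as `SGDRegressor`.

```python
class OnlineLinearRegression:
    """y ≈ w * x + b, updated one example at a time with SGD on squared error."""

    def __init__(self, learning_rate=0.1):
        self.w = 0.0  # the coefficient (slope)
        self.b = 0.0  # the intercept
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def update(self, x, y):
        """Nudge w and b to reduce the squared error on one new example."""
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error


model = OnlineLinearRegression()
# Pretend these arrive one by one over time; generated from y = 2x + 1 (made-up data).
stream = [(i / 10, 2 * (i / 10) + 1) for i in range(10)] * 300
for x, y in stream:
    model.update(x, y)

print(round(model.w, 2), round(model.b, 2))
```

An offline version would compute `w` and `b` once from the whole dataset and then never touch them again; here every new `(x, y)` pair moves them a little.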
@delicate sphinx I actually have used those at a previous job. Or at least worked on a project where someone else used them and I inherited the model they made. When you say you don't know how to translate that into words, what do you mean?
So I can get float outputs, but I've no idea how to get that back into integer or string form
My current output is basically a softmax probability which should probably be used more for one-hot encoding
but apparently softmax could also work with other methods
if I one-hot encode I'm gonna destroy my processing time
(each answer / prediction would take 447,999 0's and one "1" value if I use one-hot encoding for most answers)
What are you training the model to do? I guess the input is some words/sentences, but what is the ground truth/label that you are training it against?
Preprocessed image features + Question (LSTM) model --> a merged model that should output an answer
I also made a help in #help-cookie but I'm not sure if it's any use really
I've looked at TextVectorization, Embedding, Tokenizer but I can't understand any of them :/
Is the answer supposed to be a single word?
everytime I've used them I've ended with float values I don't know how to translate back
the longest answer my dataset uses is 12 words long
and to keep my model easy to use, I try to make everything a power of 2
so the output length is 16 (16 words maximum)
my inputs:
my output
so input1 is the RGB/greyscale values of the image, input2 is words of the question and output is words of the answer?
input1 is: (36,2048,3)
where 3 is the channels (RGB)
so yeah
question input2 is (32)
longest question is 24 so I pad it to 32 (with masking in place)
That's incremental learning. Online learning requires the learning to happen in the order that the data arrives. For example, waiting 10 seconds to receive a bunch of data, then shuffling it and learning from it in a batch, is not online learning. That shuffled batching is what most DL requires to work well (the i.i.d. assumption).
Real life data that a robot receives in real time for example, is almost always not i.i.d.
so there is some vocabulary of answer words defined?
(Though it's not binary, it can be more or less)
I have tokenized all ~28,000 words
Which includes all questions and answers in my training, validation and test datasets
(28,000 unique words)
Peace is it cool if I DM? It's 00:19am here and I need to walk my dog before I code until 4am and walk him x-x
if not that's perfectly fine and don't worry
So is the output supposed to be a vector of something like 16 X 28001? A probability for each word in each index of the answer, plus one more word for 'none'?
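A small sketch of how a "probability for each word at each position" output turns back into words: take the most likely index per position, then look it up. The 5-word vocabulary, the 3 positions, and the probabilities are made up; the real case would be 16 positions over the full vocabulary.

```python
# Toy setup: 3 answer positions and a 5-entry vocabulary (index 0 = "none"/padding).
vocab = ["<none>", "nothing", "yes", "no", "red"]

# One row of probabilities per answer position, as a softmax layer would give (made-up numbers).
probs = [
    [0.05, 0.80, 0.05, 0.05, 0.05],  # position 0: most likely "nothing"
    [0.90, 0.02, 0.03, 0.03, 0.02],  # position 1: most likely "<none>"
    [0.85, 0.05, 0.04, 0.03, 0.03],  # position 2: most likely "<none>"
]


def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


token_ids = [argmax(row) for row in probs]
answer = " ".join(vocab[i] for i in token_ids if i != 0)
print(token_ids, answer)
```

So the floats never need to be "converted" to ints directly; the int is simply the position of the largest float.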
well, I was thinking of one hot encoding it
but that requires 16 * 28,000 values in lists
which is for most cases 447,999 0's
with one "1" value
and that is just such a waste imo, so I was hoping to use some sort of TextVectorization as apparently that's more of a dynamic approach that only uses as much as is needed
that's why I took the Tokenizer approach to begin with
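On the worry about wasted zeros: with cross-entropy, the one-hot target only ever picks out one probability, so the same loss can be computed from the integer index alone. Keras exposes this as the `sparse_categorical_crossentropy` loss, which takes integer targets. The sketch below uses a made-up 3-word vocabulary just to show the two losses agree.

```python
import math

probs = [0.1, 0.7, 0.2]   # softmax output over a 3-word toy vocabulary (made up)
target_id = 1             # the correct word, stored as a plain integer

# One-hot version: build the full vector, then sum -t * log(p) over every entry.
one_hot = [1.0 if i == target_id else 0.0 for i in range(len(probs))]
loss_one_hot = -sum(t * math.log(p) for t, p in zip(one_hot, probs))

# Sparse version: just look up the probability of the correct index.
loss_sparse = -math.log(probs[target_id])

print(loss_one_hot, loss_sparse)
```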
It's almost bedtime for me, you can send a message if you like but I may be asleep by the time you return. But I assume this isn't necessarily something you must finish in the next couple hours so if you haven't resolved it by the time I see your message I can still try to help
Yeah I mean I've been up till 4am every day this week trying to code this fix
I need to finish my model soon. If not by the end of this week I probably will have to leave it where it's at
I've asked for help on this every day for the past week, but understandably I'm not catching the attention of many who understand TF, and of course TF is hard in itself, so understandably I haven't been able to fix it
I guess the only suggestion I can give is to try to make a simple dummy model using the embedding/vectorizer with a very limited vocabulary/answer size for the sake of easy debugging/understanding, and once you have that expand it into the real size
Yeah I guess I could :/ i sort of got tunnel vision with it all
I guess 28,000 isn't that big a vocabulary really. But you could restrict it to like 5 words
It's not open source is it? If you put it on github or something I would try running it myself
I can give u a drive link to it but its not public yet
Does the output answer have to be grammatical? or is like just a list of tags?
When I finish it I plan to put it on my website and make like 20 questions on stack overflow for every issue I had to ask for help for
So that I can answer them myself
The output can be grammatical but for the most part is one worded answers
Sorry but what do you mean by “coefficient”
Like can it learn from my habits or something?
coefficient as far as I'm aware, is a mathematical term
so y = mx + c
m would be a coefficient
or: 10 = 5y + 2
5 would be a coefficient
older stats and science jargon for the weights in a regression model
:incoming_envelope: :ok_hand: applied mute to @balmy bolt until <t:1640049534:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).
if i want my personal assistant to improve automatically what can i make it improve in?
coefficient - "that which unites in action with something else to produce a given effect"
ML people call them weights because they see it as a weighted sum.
In decision theory, the weighted sum model (WSM), also called weighted linear combination (WLC) or simple additive weighting (SAW), is the best known and simplest multi-criteria decision analysis (MCDA) / multi-criteria decision making method for evaluating a number of alternatives in terms of a number of decision criteria.
i never really thought about what the word "coefficient" meant. in math i always learned that a "coefficient" was "a constant factor in a term" and that was that
The definition given is the original math definition (from the 1600s).
yeah i had no idea it was such a general term
interesting
makes sense given the latin etymology
There is not really any good word to come up with anyhow for something abstract like that, but better than no word.
Weight also does not really make sense.
"Misinformation cannot be corrected" said by @elder thunder
dont trust people online
What?
No I didn't
I told you to not DM people because misinformation cannot be corrected
Also check out #discord-bots ^
He was dealt with
Have any of you had any experience with Google colab or similar tools? I'm trying to train some models but I'm a beginner. I'm looking for something that's relatively easy to use but will work well enough for what I'm trying to do. Do any of you have any suggestions? Thank you. (:
Hello
I am trying to select rows for specific date using df.loc method
My date column is object type
I am not able to get specific rows data from dataframe
Ping me when replying
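One common cause of this is comparing a string column against a date. A minimal sketch, assuming the column holds ISO-style date strings; the frame and column names are made up.

```python
import pandas as pd

# Made-up frame whose date column is stored as strings (dtype object).
df = pd.DataFrame({
    "date": ["2021-12-20", "2021-12-21", "2021-12-21"],
    "value": [10, 20, 30],
})

# Parse the strings once so comparisons are done on real timestamps.
df["date"] = pd.to_datetime(df["date"])

# A boolean mask inside .loc selects the rows for one specific day.
rows = df.loc[df["date"] == pd.Timestamp("2021-12-21")]
print(rows)
```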
Hello, I was assigned a task "Route Planning and ETA estimation in Urban Traffic Network Using Artificial Intelligence".
No dataset was given. I was planning to use data from OpenStreetMap.org. My question is How do you approach these types of problem?
you can use google colab or kaggle notebooks, would recommend kaggle if dataset you use is on kaggle, colab when you get your dataset from github or google drive
Hey guys I'm trying to do CNN model and I receive this error message.
The response is supposed to have 2 classes, either Fall or non-Fall.
https://paste.pythondiscord.com/keminotixa.sql
!paste
does anyone know of a dataset for a positive/negative sentiment analysis?
I didn't check to see if the reviews are classified strictly as positive or negative, but if it's on a scale of some kind, you can discretize it.
(for example, you could say data['is_positive'] = data['sentiment'] >= 2.5)
I dunno, try it
(rather, I don't remember because I'm on my work computer. if it requires an account then I must have one on my home computer)
I have a Pandas Series with each value as a Series of authors and the number of messages they sent in a given day, how do I convert it into a DataFrame?
Excuse the terrible drawing but that's basically what I wanna do on a bigger scale
so your series has multiple levels of indexing?
can you do print(series.to_dict()) and paste the text into the chat?
It's a bunch of series within a series
The index is a bunch of dates
that's fine; please do the statement I showed you and show the text.
Yeah sure give me a minute
Please ping me if you come back; otherwise I'm going to do something else.
!docs pandas.DataFrame.unstack
```
DataFrame.unstack(level=-1, fill_value=None)
```
Pivot a level of the (necessarily hierarchical) index labels.
Returns a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels.
If the index is not a MultiIndex, the output will be a Series (the analogue of stack when the columns are not a MultiIndex).
The solution will probably involve that. Good luck!
there's tslearn for time series stuff
Yes but I need to build an exe
why does it need to be an exe?
Requirement
well, once you have something working, you can ask how to make it an exe in #tools-and-devops
Does tslearn give me some dashboard sort of things
if you want to make something with a UI, you can ask about that in #user-interfaces. AI libraries are about the actual AI component, and then there are other libraries for making interfaces.
{(datetime.date(2021, 6, 20), 'Name1'): 398,
(datetime.date(2021, 6, 20), 'Name2'): 3,
(datetime.date(2021, 6, 20), 'Name3'): 180,
(datetime.date(2021, 6, 20), 'Name4'): 99,
(datetime.date(2021, 6, 20), 'Name5'): 120,
(datetime.date(2021, 6, 20), 'Name6'): 1,
(datetime.date(2021, 6, 20), 'Name7'): 1347,
(datetime.date(2021, 6, 20), 'Name8'): 893,
(datetime.date(2021, 6, 20), 'Name9'): 207,
...
Sorry, I got preoccupied by something IRL. This is what I get when I use the .to_dict() method on the series.
In [12]: series.to_frame()
Out[12]:
                     0
2021-06-20 Name1   398
           Name2     3
           Name3   180
           Name4    99
           Name5   120
           Name6     1
           Name7  1347
           Name8   893
           Name9   207
In [13]: series.to_frame().unstack(level=1)
Out[13]:
                0
            Name1  Name2  Name3  Name4  Name5  Name6  Name7  Name8  Name9
2021-06-20    398      3    180     99    120      1   1347    893    207
There's only one row because the sample data only has one unique date.
to_frame turns the Series into a DataFrame with two levels of indexing. The first level is the date and the second level is the name
Unstacking the second level (which is 1, because the numbering starts at 0) achieves the desired result.
actually it looks like you can just do series.unstack(level=1) and it's converted to a DataFrame as part of that
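A copy/pastable version of that last step, with a second made-up date added so the result has more than one row. The counts and names are placeholders in the same shape as the dict pasted earlier.

```python
import datetime

import pandas as pd

# Made-up counts with a (date, author) MultiIndex, like the dict pasted above.
counts = {
    (datetime.date(2021, 6, 20), "Name1"): 398,
    (datetime.date(2021, 6, 20), "Name2"): 3,
    (datetime.date(2021, 6, 21), "Name1"): 12,
    (datetime.date(2021, 6, 21), "Name2"): 40,
}
series = pd.Series(counts)

# Move the second index level (authors) into the columns.
frame = series.unstack(level=1)
print(frame)
```

Each date becomes a row and each author a column. Passing `fill_value=0` to `unstack` is one way to handle authors who sent nothing on a given day.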
Yay!

also remember to have a copy/pastable example ready whenever you have a pandas question
🐼 
ah thanks for the tip, I'll remember that for next time :D


Hey guys I'm trying to do CNN model and I receive this error message.
The response is supposed to have 2 classes, either Fall or non-Fall.
https://paste.pythondiscord.com/keminotixa.sql
Your x train is probably the wrong size for the model input
this is my x_train and y_train shape
So your model is probably outputting shape 10 my bad
I'm not sure 10 and 2 are good values because if it was a power of 2 (i.e. 16) its much easier to compress to your desired shape
(I.e. 16 > 8 > 4 > 2)
Instead of 10 > 5
(The > is an arrow im just on phone)
I'm still a bit confused about this. So what does the output shape mean? Is it the number of classes for my outcome?
So maybe add another layer thats size 2
Can you do model.summary()
And show the final layer(s)
The output shape section will likely say (None, 10)
If you add another layer that outputs shape (None, 2) it should work, otherwise maybe try and make a layer that goes into 16 then go to 2 for the output
how can I do that? it's my first time try with CNN model
Can you send pictures of your model code or your model.summary()
Just so I know if what I'm saying might fix it or not
Ok so can you add a Dense(2)
As its sequential you should be able to just do model.add(tf.keras.layers.Dense(2))
Just put that right at the bottom
before the model.compile() and model.fit () right
Also just to give you a bit of extra info that may help you understand it
The (None, x) has None because thats your batch size
So that's a variable output size
If you're new to tf I just figured it worth saying that :)
Yes
thank you so much, so I will adjust my outputting as 16
I see you are a tad confused.
Doing batch size 50 gives (50,x),
batch size 100 gives (100, x) etc
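To show what the shapes are doing, here is a NumPy stand-in for a dense layer (not real Keras code; `dense` is a hypothetical helper with random weights). It illustrates two things: the first dimension is just the batch size and passes straight through, and, as far as shapes go, a layer can map any input width to any output width, so 10 to 2 works the same way as 16 to 2.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(inputs, units):
    """Bare-bones dense layer: inputs @ weights + bias, with random weights."""
    weights = rng.normal(size=(inputs.shape[1], units))
    bias = np.zeros(units)
    return inputs @ weights + bias


for batch_size in (50, 100):
    features = rng.normal(size=(batch_size, 10))  # output of a 10-unit layer
    logits = dense(features, 2)                   # final layer with 2 units
    print(features.shape, "->", logits.shape)
```

In Keras the `None` in `(None, 2)` is that batch dimension left unspecified until data is actually fed in.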
Yeah maybe 16 was wrong of me to say. But you can replace the 10 with 16 and add a layer dense 2
Sorry I woke up a few minutes ago haha
lol well i'm glad that you wake up on time. I havent learn or do this model before.. What book / site should I read to know more about this
may be 16 is not ok
Tensorflow has lots of documented guides that really give you a boost in your learning
Yeah sorry I meant 16 and 2
You could keep the 10 if you really want but personally I love using powers of 2 for all my layers due to the way tensorflow can squash stuff
But do another layer dense 2 just after the 1y
16
When I add model.add(Dense(2)) it gives me this error
Model summary?
Might need to change a dense layer from 10 to 16
it doesnt run at all. for the previous run of the model, I have this summary
I see, to be able to run it I will need to change the n_classes = 16 and then change the dense layer to 16 also
I thought your dense(16) worked it was just the output size issue
Dense_7 should probably be 16 and then add a dense_8 thats size 2 (your dense layers will be more like dense_10 because this model.summary() is outdated)
But in that model.summary() id recommend size of 16 into a new layer of 2
The dense_8 of size 2 is not working. I put it in as you said before
but if I change Dense_6 to 32 and Dense_7 to 16 then it works lol
Sorry had to give a lift to someone
Yeah tensorflow squishes them by halving and multiplying
I.e. a layer of 255 would rarely work because it wouldn't be compatible with other layers fully, I.e. 255 would halve to 127.5 and either go to 127 or 128
But if you want another layer of 255 the best that layer could give you is 254 or 256
Which is why I try to keep all my outputs to a power of 2
I see but somehow it just wont work with mine 😦
whats the model.summary() now
hold on
but when I run model.compile() I just get the error: Shapes (None, 16) and (None, 2) are incompatible
hmm weird
a very bad solve, but maybe just try another dense after 16 of 8 then 4 then 2
but idk
let me try with 8
I think the reason why it wont run because of the number of class. I need to change it to 2 since I only have 2 classes
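That matches the error message: the label array and the model output need the same width. A small NumPy sketch of the check, assuming the labels are one-hot encoded from integer classes; the label values and variable names are made up.

```python
import numpy as np

n_classes = 2                       # fall / non-fall
labels = np.array([0, 1, 1, 0])     # made-up integer class labels

# One-hot encode: row i has a 1 in column labels[i].
one_hot = np.eye(n_classes)[labels]
print(one_hot.shape)

model_output_width = 2              # units in the final Dense layer (assumed)
assert one_hot.shape[1] == model_output_width, "label width must match the output layer"
```

If `n_classes` were 16 here, the labels would have 16 columns and could never line up with a 2-unit output layer.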
yeah I don't know all that n_classes stuff you mentioned earlier
but that's a shout, I just assumed you hadn't made a chnage
change
print (hello world )
it works now, but the accuracy is shit so far. I will need to wait 3 hours to know
you can always split up the dataset, your batch size was 16,000 or so a minute ago?
you're missing quotation marks print("Hello world")
I've only learnt about it in uni and not done it myself as it sounds really complex, but I think eigenvalues are used for general face recognition
so it would make sense that you can expand on that for emotions (i.e. smiling)
I've no clue about the actual method to solve it all, I can't even seem to figure out how to one-hot encode my own work but yeah that sort of stuff is all I know
Personal + uneducated opinion, but lower face would be my go to
Like the mouth?
I wanna see if I can detect that yeah
grayscale may be a shout, just because colour would require more data (i.e. 10,000 with and 10,000 without lipstick etc)
yeah for sure
Then I can detect only the face using Haar cascade
and it can give you some images to send to friends and give them nightmares
So once I have a grayscale face, it should be the best thing to apply PCA to
forehead being scrunched can be surprised
though the issue with that could bias older people as surprised (due to wrinkles)
Forehead would be good yes, I haven't thought about that
Yeah you'd have to account for elderly people though
The proposed system performs better than existing technique for facial emotion recognition when Gradient Filter, PCA and PSO has been considered for feature extraction with random forest classification technique.
though I'm not 100% on this perhaps if their eyes are not wide open but their forehead is scrunched then they're old
I think my data sets are all 'younger' people so should be fine in that sense
but if eyes are wide open and forehead scrunched then they're surprised
Although I would talk about its potential inaccuracy for wrinkles
Also I keep seeing 'random forest classifier' everywhere but I have no idea what it is
Google/Wikipedia/ELI5 not very helpful
Outside of my already limited knowledge I'm afraid
Nw, thank you very much for the input! Got some new ideas now 😄
No worries, if you want to really expand it you can use RGB image to detect for wrinkles so you can better predict if their emotion is biased or not
but that's a self-thought idea so the practicality of it god knows
RGB is definitely something that has big trade-offs so I'll try both ways
yeah you'd only want to preprocess bias weights with it
though you don't need to preprocess it and could just measure the amount of wrinkles and use that as a secondary accuracy score
i.e. "due to the wrinkles detected in the image this emotion guess may only be 30% accurate"
though in a less-harsh way so people in their 20-40s don't feel like you're calling them old haha
That's a good point
I already have so many ethical considerations, ageism would be a good one to talk about
Good as in there's a lot to say, not that ageism is good
haha im glad
for theano to work with g++, do I have to have all 500mb of mingw-w64 files from conda? Aren't there any lightweight options or maybe I can precompile it somehow? I'm asking because I need to be able to run the program in any environment.
Does anyone have experience with matplotlib?
There's one thing I have to fix
And I have been stuck forever
Anyone experienced with Mask R-CNN? I am trying to figure out the input format for the system. My dataset does not include a JSON file, but each instance's pixels are labelled with a grayscale value.
Post question
pls help .-.
Might be worth opening a help channel and asking there as more people might see it
while also keeping it in here
personally I have no knowledge of it 😦
I tried. no one seems to know this 😦
Well time to try again
@warm verge
Here
Why's it in web dev
Its flask
You asked about matplotlib?
Yes
But its integrated in flask
In my example
So it all correlates
Plots data to webbrowser
But do you have any idea?
On the axes
No
I think what you mean are labels
I just want text
Like that
Or like this @warm verge
The percentages
So you can either plot them in line with a datum point, such as 0.05 on that example graph, or use the axes and plot above
Yeah but inputs differ
plt.text(mostLeftSection, stats.norm.pdf(mostLeftSection, u, o), "test")
This is what I have
To try and have text in the mostleftsection
It works for color
mostLeftSection = np.linspace(u - 3 * o, u - 1 * o, 100)
plt.fill_between(mostLeftSection, 0, stats.norm.pdf(mostLeftSection, u, o), facecolor='#49393d')
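One likely reason the `plt.text` call misbehaves is that it wants a single x and a single y, while `mostLeftSection` is an array of 100 points (which is fine for `fill_between`). A sketch of picking one position per section and computing the percentage to show, using only the standard library; the helper names and the choice to place the label halfway up the curve are assumptions.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative probability of N(mu, sigma^2) up to x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


def section_label(lo, hi, mu, sigma):
    """Return (x, y, text) for a percentage label centred in the section [lo, hi]."""
    x_mid = (lo + hi) / 2
    y_mid = normal_pdf(x_mid, mu, sigma) / 2      # halfway up the shaded area
    share = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
    return x_mid, y_mid, f"{share:.1%}"


u, o = 0.0, 1.0                                   # mean and std dev, named as in the chat
x, y, text = section_label(u - 3 * o, u - 1 * o, u, o)
print(x, y, text)
# With matplotlib this would then be: plt.text(x, y, text, ha="center")
```

Because the position is computed from `u` and `o`, it keeps working when the inputs differ.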
Hello everyone, hope you're doing great!! 😄
Is there a way to get the function behind a neural network? In this image below, I want to get b
This is the network
Does anyone have experience with matplotlib and would like to help? :)
Greetings, I'd like to have some advice regarding our application.
We serve an API (Flask) that analyses images after testing them through about 10 models.
Currently the code base is a bit messy. I am trying to use a pattern that would allow us to plug different models in and out, and also increase reusability of the source code by implementing class-based code where we separate concerns for preprocessing, configuration and finally running the models against the input image. Anyway, I am new to the ML/DL domain, although I have a fundamental understanding of OOP and design patterns. Any recommendation regarding the "organization" of the project is appreciated, thanks in advance!
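One possible shape for this, sketched under assumptions: a shared base class that separates preprocessing from prediction, plus a small registry so models can be plugged in or out by name. Every class and function name here is hypothetical, and the "model" is a stand-in that just averages pixel values.

```python
from abc import ABC, abstractmethod


class ImageModel(ABC):
    """Common interface every pluggable model implements (hypothetical design)."""

    name: str

    @abstractmethod
    def preprocess(self, image):
        """Turn the raw image into whatever this model expects."""

    @abstractmethod
    def predict(self, prepared):
        """Run the model on the preprocessed input."""

    def run(self, image):
        return self.predict(self.preprocess(image))


REGISTRY = {}


def register(model_cls):
    """Class decorator that makes a model available to the API by name."""
    REGISTRY[model_cls.name] = model_cls
    return model_cls


@register
class BrightnessModel(ImageModel):
    """Stand-in 'model': reports the mean pixel value of a flat list of pixels."""

    name = "brightness"

    def preprocess(self, image):
        return [p / 255 for p in image]

    def predict(self, prepared):
        return sum(prepared) / len(prepared)


def analyse(image, model_names):
    """Run the selected models on one image and collect results by name."""
    return {name: REGISTRY[name]().run(image) for name in model_names}


print(analyse([0, 255], ["brightness"]))
```

A Flask route would then only need to call something like `analyse` with the list of enabled model names, which could come from configuration.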
:incoming_envelope: :ok_hand: applied mute to @lapis sequoia until <t:1640121661:f> (9 minutes and 59 seconds) (reason: duplicates rule: sent 4 duplicated messages in 10s).
from this image, b is simply your output (the result predicted by your network).
Where:
X1, X2,..., Xn = n nodes/neurons in your input layer
W1, W2, ,..., Wn = The weight of the respective neurons in the input layer
Sigma = Your hidden layer with one neuron (remember each neuron in a hidden layer has its own bias weight, so because we only have 1 neuron in our hidden layer, that's why we have (W0) there)
a = sum of inputs x weights + bias
g(x) = it should be g(a) though. So here, a is then passed into the activation function to give us our yield
b = yield a.k.a output
You've built your neural network architecture here, and have even made your neural nets fertile and ready for training by setting the loss function (the one your optimization function should drive to its global minimum).
So the next stage is to train your model (neural nets) by calling the fit method on it.
Then to get 'b', call your predict method on the model.
That's how to get the yield. ✌️
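The same picture written out as a few lines of plain Python, assuming a sigmoid for the activation g (the diagram does not say which activation is used) and made-up values for the inputs and weights.

```python
import math


def sigmoid(a):
    """One common choice for the activation function g."""
    return 1 / (1 + math.exp(-a))


def neuron(inputs, weights, bias_weight):
    """b = g(a), where a = sum(x_i * w_i) + w0."""
    a = sum(x * w for x, w in zip(inputs, weights)) + bias_weight
    return sigmoid(a)


x = [1.0, 2.0, 3.0]        # X1..Xn (made-up values)
w = [0.5, -0.25, 0.0]      # W1..Wn (made-up values)
w0 = 0.0                   # bias weight

b = neuron(x, w, w0)
print(b)
```

Training changes `w` and `w0`; calling predict on a trained model is this forward computation with the learned values.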
thank you, is much clearer now!
Hey, I have developed my own neural network based on gradient descent for back propagation. I'm now trying to test it, and to my surprise I always get similar results when it comes to worsening of the model.
No matter the data shuffling, performance of my NN drops always after third epoch:
Is this concerning to you?
Saw this on Product Hunt today: https://www.producthunt.com/posts/modern-data-stack
All things related to modern data stack for engineering in a single place. Nice resource!
Try BentoML
It is a framework for deploying ML models
Hey guys how do you judge whether a data science bootcamp is good or not?
I think this question is subjective so here's how I'd gauge the program:
- Richness of the curriculum
- Duration of the bootcamp
- Do they assume everyone is a novice and willing to start from the basics, or do they assume we all know what we're doing 😂. Pay attention to the stated prerequisite(s), if there are any.
- Where exactly are the people from their previous cohort now? Are the majority of them now employed as Data Scientists, or still looking to enroll in another data science bootcamp? (This is the point where you really need to put on your Investigative Journalist + FBI cap) 😂
- The experience of your instructors. Are they Data Scientists or ML Engineers at well-to-do companies?
- Last but not least... the teaching style. Do they assign mentors to students, what kinda projects are you gon be working on as your Capstone project... etc.
==================
I'm not affiliated with FourthBrain but I'll always recommend their Bootcamp if you can finesse the payment.
(TENSORFLOW) does anyone know if my combination of TextVectorization + Embedding layer mean I can remove a mask_zero = True parameter
as mask_zero = True is stopping me from correctly loading my model from a json
but I want to be sure I don't need it before I remove it
What do you think of this one?
https://concordiabootcamps.ca/courses/data-science-full-time/
Looks like ill have to put my fbi cap on and find out what their projects are ahah
```python
def flip(image):
    image=cv2.flip(image,1)
    return image
#train_datagen = ImageDataGenerator(preprocessing_function=orth_rot,horizontal_flip=False)
j=0
my_img=os.listdir(train_path)
for i,image_name in enumerate(my_img):
    if(image_name.split('.')[1]=='jpg'):
        print(train_path+image_name)
        x=cv2.imread(os.path.join(train_path,image_name))
        x=cv2.cvtColor(x,cv2.COLOR_BGR2RGB)
        y=crop_square(x,512)
        im_flip=flip(y)
        plt.imshow(im_flip)
        plt.show()
        cv2.imwrite(os.path.join(save_path,str(j),'_',image_name),im_flip)
        break
```
can someone please tell me why cv2.imwrite is not working
can we get an output and the error output
Idk about cv2 but if it's something trivial I might be able to help
does cv2 require opening a file to write to?
and it definitely prints just after the if? (if not then the if statement is likely wrong but as you're asking about cv2.imwrite I'll assume it definitely reaches that point)
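One possible cause, sketched under the assumption that `save_path` exists but has no per-index subfolders: `os.path.join(save_path, str(j), '_', image_name)` treats every argument as its own path component, so it points at `save_path/0/_/name.jpg` rather than `save_path/0_name.jpg`. As far as I know, `cv2.imwrite` returns `False` instead of raising when the target folder is missing, so this can fail silently. The folder and file names below are made up for illustration.

```python
import os

save_path = "out"          # hypothetical output folder
j = 0
image_name = "cat.jpg"     # hypothetical file name

# Each argument becomes its own path component (nested folders).
nested = os.path.join(save_path, str(j), "_", image_name)

# Building the file name first keeps everything in one folder.
flat = os.path.join(save_path, f"{j}_{image_name}")

print(nested)
print(flat)
```

Checking the return value (`ok = cv2.imwrite(...)`) would confirm whether the write succeeded. Since the image was converted to RGB for plotting, converting back with `cv2.COLOR_RGB2BGR` before saving may also be worth trying.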
How's it goin' tonight, y'all? I've got a few days to kill so I wanted to get grounded with some of those NN things y'all have been chattin' about. :'] Haha.
[Note: I've been workin' in ML/DS for a while, so I've got a pret good background in Python and general ML/DS architecture stuff, DDB nonsense, etc. Just don't know much about that sweet, sweet NN stuff!]
- Anyone got a tutorial series they dig?
- Anyone got a preferred framework? Is TF still the standard?
I was told "deeplearnin.ai" is a good place to start, but, you know, wanted to survey a bit.
yes it reached that point cause i have plotted the image and that gets displayed
TF has lots of guides that are nice
Free Resources
- Deeplearning.ai = PyTorch
- Neuromatch.io = PyTorch (ooh, I almost forgot to mention, GANs are covered here too 😀)
https://deeplearning.neuromatch.io/tutorials/intro.html
- Andrew Ng's Deep Learning course on Coursera.
- University of YouTube
I think DeepLearning.ai links to Ng's DL course on coursera. Or at least that's what it linked to me. PyTorch is also fine for me, I don't mind either way with TF vs. PyTorch. Seems like DL.ai is a good place to start then!
Researchers are leaning towards JAX lately. Personally, I find TensorFlow & Keras easier.
PyTorch is still probably the most popular Framework that's widely used.
Yeah, I'm gonna see what it's all about, I'm not too worried about what I start with. Cool, thanks! I'm gonna try that then, and see how it goes.
Bro I don't have much time to look into this at the moment. I also don't wanna influence or by any chance polarise your decision.
Do your due diligence. ✌️
That's cool
Well, at least my model doesn't overfit to just "yes" now ,-,
Not sure if this is the right channel but does anyone know of good resources to start learning tacotron 2 (or other) for voice synthesis??
how can i make my ai assistant copy my habits?
suppose i set an alarm to ring at 7am for 5 days straight, i want the program to set an alarm at that same time on the 6th day, in case i forget to set it myself?
what would be the best way for a beginner in ai and neural stuff to learn the basics, like a small goal to work towards (eg- make a tictactoe playing neural net) etc ?
https://colab.research.google.com/drive/1gQO_RddY0aBYtTQ2HTDcP6PXVEsJYuJL?usp=sharing
this is a colab project i've been working on. i'm using the dataset - https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
I was able to train the model and get accuracy and loss for validation sets. However, when I try to predict test set images the accuracy remains at 0.625. I ran it on 20 epochs and 4 epochs and 2 epochs and the value did not change. I know for a fact that the accuracy should be around 92% (I'm using a project for reference with the same model but some different code). Could someone please go through the notebook and see where I'm going wrong? I just can't figure it out. Thanks very much!
Hi, is anyone familiar with LSTM models? I am using an LSTM to predict stock price movement; however, the predictions are very similar to the original test set. Why?
features=['Open','High','Low','Volume']
scaler=MinMaxScaler()
feature_transform=scaler.fit_transform(df[features])
feature_transform= pd.DataFrame(columns=features, data=feature_transform, index=df.index)
feature_transform.head()
timesplit= TimeSeriesSplit(n_splits=10,test_size=6126)
for train_index, test_index in timesplit.split(feature_transform):
    X_train, X_test = feature_transform[:len(train_index)], feature_transform[len(train_index): (len(train_index)+len(test_index))]
    y_train, y_test = output_var[:len(train_index)].values.ravel(), output_var[len(train_index): (len(train_index)+len(test_index))].values.ravel()
trainX =np.array(X_train)
testX =np.array(X_test)
X_train = trainX.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = testX.reshape(X_test.shape[0], 1, X_test.shape[1])
lstm = Sequential()
lstm.add(LSTM(32, input_shape=(1, trainX.shape[1]), activation='relu', return_sequences=False))
lstm.add(Dense(1))
lstm.compile(loss='mean_squared_error', optimizer='adam')
history=lstm.fit(X_train, y_train, epochs=5, batch_size=8, verbose=1, shuffle=False)
y_pred= lstm.predict(X_test)
predictions=list(chain.from_iterable(y_pred.tolist()))
Here is the code I have used
Thanks. So my dataset is sorted by time, so there shouldn't be problem in input?
and I take a look at test_set, they took latest 6126 data which is fine, I just need it to predict next 6126 data
uh, I have 290K on train side, only 6126 on test side
well it because I can only get 6126 data for now to verify the prediction result, maybe more for future
but the thing I don't quite understand is still why the prediction looks basically same as test_set
well, maybe I will try another model and to see how it becomes, I use type() to check and didn't find out any problems in type
thanks
I haven't checked the code on Colab, but try this:
- Plot the learning curves to aid your understanding of what's really going on.
- Could be a case of overfitting. If you didn't add Batch Normalization, a Dropout layer, or an Early Stopping callback, you might wanna do that and see if there's an improvement.
- Might be a problem of choosing the right number of epochs vs. batch size. A quick way to find out if this has a hand in what's happening is to train your neural net with 2 different batch sizes and a constant number of epochs. For example use:
  - epochs = 5, batch_size = 1
  - epochs = 5, batch_size = X_train.shape[0]
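To see why those two extremes are informative: with the number of epochs held fixed, the batch size decides how many weight updates the network gets. A rough sketch of the arithmetic (the training-set size is a made-up number, not taken from the notebook):

```python
import math

def updates_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of gradient updates in one pass over the data."""
    return math.ceil(n_samples / batch_size)

n_samples = 5216  # hypothetical training-set size
for batch_size in (1, 32, n_samples):
    print(batch_size, updates_per_epoch(n_samples, batch_size))
```

With `batch_size = 1` the model is updated once per sample, while with `batch_size = X_train.shape[0]` it is updated only once per epoch, so 5 epochs means just 5 updates in total.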
hi can anyone recommend me some resources to learn reinforcement learning
Research Scientist Hado van Hasselt introduces the reinforcement learning course and explains how reinforcement learning relates to AI.
Slides: https://dpmd.ai/introslides
Full video lecture series: https://dpmd.ai/DeepMindxUCL21
How is data science connected to neural networks?
Can I learn NN without the knowledge of data science?
NN is a part of data science I believe
I never explicitly learned about data science but I had learnt about what a NN is and how they work
so I think it's sort of hand-in-hand (though Data Science likely contains more than just NN obviously)
Hi guys need some advice on a thing. I have a univariate time series dataset on which I need to apply anomaly detection. Everything needs to be unsupervised - the thresholds for the anomalies also need to come from the model/algorithm. Any suggestions/recommendations?
Hi could anyone please tell me about the math in machine learning?
That's quite a difficult question to answer, as in my opinion at least it depends on what you intend to do with it all. If you use something like TensorFlow then you can avoid having to write any actual functions, and the most math you get really is in figuring out what numbers to put for which layer
if you plan to write from scratch or to apply your own functions to it, then you can expect lots of maths
Ok thank you very much
I'm actually a beginner
It can get really mathematical but there are many libraries/packages out there that you can use that will do it largely for you
i.e. tensorflow has libraries that can apply loss functions, optimization functions, etc. all without me doing anything more than importing a loss and optimizer module
but tensorflow also allows you to customise your input/output functions and more, in which from the look of some things you'd need a very mathematical brain to interpret well enough to make yourself
Oh ok thank you very much 👍
if you want to do stuff with data you can probably expect lots of disgusting variables to do things quick and efficiently
but it's just like any project, to you it makes sense and to others you've just written a piece of space language
I am struggling so hard with numpy.random.lognormal. For some reason, I am getting really high numbers when I am using a small mean. Does anyone have any idea what's going on?
# output 3.4214334251929405e+30
The output is about 30 orders of magnitude bigger than the input!
I need some help idk if it's related to ai or data science but it seems the closest domain. I have a 3d obj with a mtl texture and i have a script which takes photos of it from different angles. i'm using pyrender. for some reason the object i get in the photo is untextured. can anyone give me a hand here?
does that help?
mean : float or array_like of floats, optional
Mean value of the underlying normal distribution. Default is 0.
sigma : float or array_like of floats, optional
Standard deviation of the underlying normal distribution. Must be non-negative. Default is 1.
These are parameters of the underlying normal distribution. So for a mean of 70, the mean of the result will be around exp(70) ~= 10^30
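If the goal is for the samples themselves to have a given mean and standard deviation, one option is to convert those targets into the underlying normal's `mu` and `sigma` first. A minimal sketch, assuming the standard log-normal moment formulas (mean = exp(mu + sigma^2 / 2)); the helper name `lognormal_params` and the target values are made up, and the standard library's `random.lognormvariate` stands in for NumPy here because it takes the same two parameters.

```python
import math
import random

def lognormal_params(target_mean, target_std):
    """Return (mu, sigma) of the underlying normal for a desired output mean/std."""
    sigma2 = math.log(1.0 + (target_std / target_mean) ** 2)
    mu = math.log(target_mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

mu, sigma = lognormal_params(target_mean=70.0, target_std=10.0)

# numpy.random.lognormal(mean=mu, sigma=sigma, size=...) would be the NumPy equivalent.
rng = random.Random(0)
samples = [rng.lognormvariate(mu, sigma) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(mu, sigma, sample_mean)
```

Note that the conversion uses the natural log plus a variance correction, not log10.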
anyone can help me on this? this is code i obtained from github and i tried to learn it by using my own input but i bumped into the error. this code is about vehicles detection using faster rcnn
if you give some input examples and their shapes then maybe, but from a quick glance, although I've not normalized batches, I'd assume it to be with input shape being incorrect
and if you want to help further a model.summary() can show the output shapes expected
making sure they all match is really good
hey if anyone here uses flask could they please look at this?
please upvote if you can
This was super helpful, thank you. I guess I will need to take the log10 of my mean to adjust my current approach
Also very helpful, thank you
Any tips on implementing novel improvements from papers? (I have RL algorithms in mind specifically but I suppose it can apply for any paper-about-a-concept)
I'd consider myself a fairly competent python dev and I know the basics of RL and deep learning - enough to play around with stable_baselines3 and to get the gist of the algorithms - but when I read a paper my eyes glazeth over and I'm just unable to start writing an implementation
I've implemented tabular RL from scratch (trivial) and A2C from scratch (was very difficult). Is it just an experience thing, and I should be working my way up by implementing the actor-critic/Q-learning ladder myself from DQN/A2C up to TD3 and SAC?
@wooden night unglaze your eyes and start writing 🙂 i don't have experience specifically with reinforcement learning, but sometimes there's nothing you can do but sit down and write an implementation
trying to figure out test cases is probably the hardest part (you do want to test your code, right?)
these algorithms can be very very difficult to verify and audit
writing tests sometimes is a matter of verifying that certain mathematical properties hold (modulo floating point error)
as the physicists say, "shut up and calculate!"
yeah that's what I struggled with the most, to my eyes my implementation ~= the paper/reference implementation but the reference implementation worked and mine didn't 😆
if this is "for work" and you aren't trying to replicate the paper for any kind of academic purposes, can you just use the reference implementation?
as another option, use the reference implementation to generate test data (known-correct input-output pairs)
plus maybe there are bugs in the reference implementation
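A small sketch of what "verify a mathematical property, modulo floating point error" can look like for RL code. The example checks a fast discounted-return routine against a slow, obviously-correct sum; both function names and the episode are made up for illustration, and the same pattern works with pairs generated by a reference implementation.

```python
import math

def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every step, working backwards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def brute_force_return(rewards, gamma, t):
    """Slow reference: sum of gamma^k * r_{t+k}."""
    return sum(gamma ** k * r for k, r in enumerate(rewards[t:]))

rewards = [1.0, 0.0, -2.0, 3.0]   # made-up episode
gamma = 0.9
fast = discounted_returns(rewards, gamma)

# Property: both formulations must agree up to floating point error.
for t in range(len(rewards)):
    expected = brute_force_return(rewards, gamma, t)
    assert math.isclose(fast[t], expected, rel_tol=1e-9, abs_tol=1e-12)
```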
Partially for work and partially for my own edification I suppose - I'm using a baseline implementation of algorithm A, but there's cool improvement B that I read about in a paper that I'd like to use
I don't have to use cool improvement B per se but I think it would really help, but sadly it hasn't been added to the package I'm using
maybe your employer will let you dedicate time to contributing the new version upstream even
in ml what is support vector machines?
and how does it work?
in artificial neural networks, how does one know how many hidden layers are needed, and how many neurons?
you don't. as far as i know, there is still no generally useful theoretical approach to network architecture
Mehn, it's so frustrating when much of the code in the book you're studying ain't running.
And this book was released this year. ☹️
I type in the exact code and think I'm wrong, then I go to the GitHub to copy and paste yet still doesn't run.
True though.
rustlang? no. mathlang
I'm also learning Rust prog Lang by the side.
I used to learn c# before university, so I wanted to get back that feeling.
Python is plain, and doesn't really feel like programming IMO.
there are 2 primary ways to interpret an SVM (assuming classification):
- the modern way: a linear model with a specific loss function called "hinge loss"
- the traditional way: an algorithm that finds a "separating hyperplane" that divides two classes, such that the hyperplane has the greatest possible "margin" between it and the data
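To make the "linear model with hinge loss" view concrete, here is a rough sketch: plain subgradient descent on hinge loss plus an L2 penalty. This is not how production SVM libraries solve the problem, and the dataset, learning rate, and function names are all made up for illustration.

```python
def hinge_loss(w, b, x, y):
    """Hinge loss max(0, 1 - y * (w.x + b)) for one example with label y in {-1, +1}."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

def train_linear_svm(data, lr=0.1, reg=0.01, epochs=200):
    """Subgradient descent on hinge loss with an L2 penalty on w."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            # The L2 penalty shrinks w a little on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if hinge_loss(w, b, x, y) > 0.0:
                # Point is inside the margin: push the boundary away from it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# Tiny made-up, linearly separable dataset
data = [([2.0, 2.0], 1), ([3.0, 1.5], 1), ([-2.0, -1.0], -1), ([-1.5, -3.0], -1)]
w, b = train_linear_svm(data)
predictions = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1 for x, _ in data]
print(predictions)
```

Points that are classified correctly with a margin of at least 1 contribute zero loss, which is why only the points near the boundary end up shaping the solution.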
it also turns out that the SVM model is amenable to something called the "kernel trick", which lets you embed your data in an arbitrarily complicated space without having to actually transform the data, as long as you can compute inner products in that space in terms of the original data values.
this "kernel SVM" technique is part of why SVM was so popular before gradient boosting and neural networks rolled around. it allowed you to develop a fairly complicated "feature space" that possibly encoded high-order relationships between data points, in which the classes were much easier to separate. this is not entirely unlike what gradient boosting and neural networks do.
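A tiny illustration of the kernel trick, using the degree-2 polynomial kernel in 2-D as an assumed example: the kernel value computed from the original points equals the inner product in an explicit 3-D feature space, so the embedding never has to be built. The points are arbitrary.

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel computed from the original 2-D points."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def feature_map(x):
    """Explicit 3-D embedding whose inner product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

x, y = (1.0, 2.0), (3.0, -1.0)   # arbitrary example points
via_kernel = poly_kernel(x, y)
via_embedding = dot(feature_map(x), feature_map(y))
print(via_kernel, via_embedding)
```

For kernels like the Gaussian/RBF kernel the implied feature space is infinite-dimensional, so computing inner products through the kernel is the only practical route.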
but i highly recommend reading a book instead of asking people online 🙂
what book? not all books are good
Applied Data science using Pyspark by Ramcharan Karla, Sundar Krishnan
One of the reasons I try to read recent books is so I'm sure I'm not reading old stuff. This book is copyrighted 2021, so I feel everything should work fine. But there are some hiccups.
do your pyspark versions and jvm match with the book's?
they are using spark 3, which is quite stable
I can't confirm that ATM, but the latest Pyspark version is 3.0. if the book was released this year, then I think it should be compatible with the latest release.
i find it hard to believe you're experiencing hiccups unless you are using a completely off version
spark 2 to spark 3 has quite a lot of changes
especially with pyspark
that, and it might be useful to post the actual errors you are getting
which basically sucked until this year, haha
I already sent the author a LinkedIn connect request. I'll Inbox him when he accepts, but I'll make sure I confirm the Pyspark version used in the book tomorrow so I can be certain that's not what's causing the issue.
also, what errors are you getting?
whats a hyperplane
a codimension 1 subspace of an affine space defined by a linear equation
the generalization of a plane to more than 2 dimensions
like how a plane is the generalization of a line
a hyperplane is the generalization of a plane, beyond what we as humans can visualize. but a lot of the properties are the same, and yes it helps if you know the linear algebra to avoid being stuck too much on low-dimensional intuition
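In code, a hyperplane is just the set of points where `w.x + b = 0`, and the same check works in any number of dimensions. A minimal sketch with made-up weights and points:

```python
def side_of_hyperplane(w, b, x):
    """Sign of w.x + b: +1 or -1 for the two sides, 0 if x lies on the hyperplane."""
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    return (value > 0) - (value < 0)

# A line in 2-D and a hyperplane in 5-D, handled by the same function.
print(side_of_hyperplane([1.0, -1.0], 0.0, [2.0, 1.0]))
print(side_of_hyperplane([1.0, 1.0, 1.0, 1.0, 1.0], -10.0, [1.0] * 5))
```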
I've been through the first part of this book and had no issue. Check to make sure the PySpark + etc. versions you're using match the ones he mentions at the beginning of the book, and when copy-pasting make sure to copy the entire thing and not only the snippets. We have no idea what errors you're running into, so beyond this advice I can't say much.
super sorry but what is generalization
a "more general" version of something
a plane is specifically a 2-dimensional object in a 3-dimensional space. a generalization would be an n-dimensional object in an (n+1)-dimensional space
it's no longer specifically "2" and "3", ergo it is "more general" and therefore a "generalization"
this is a common practice in math: finding generalizations of things
I'll cross check the versions tomorrow and report back here. Thanks everyone. @stone marlin
I built my first artificial neural network in python today, and I got an accuracy rate of 0.86. However, when I used the same dataset and calculated the accuracy rate with kernel svm and random forest (without any deep learning), they were both higher than 0.86. Is this normal?
yeah, why not?
neural nets aren't the best thing for every single dataset you see, especially tabular datasets
what do you mean by "accuracy"? (tp + tn) / (tp + tn + fn + fp)?
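For reference, that formula in code, with made-up confusion-matrix counts. The second call shows why the definition matters: on an imbalanced dataset, a model that always predicts the majority class can still score high.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Made-up counts for a binary classifier on 100 test rows
print(accuracy(tp=40, tn=46, fp=6, fn=8))

# A classifier that always predicts the negative class on a 90/10 split
print(accuracy(tp=0, tn=90, fp=0, fn=10))
```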
it's possible that a different NN architecture would have outperformed the one that you created, but functor (named "guido van pasta" currently) is right: NNs aren't the be-all-end-all of AI.
my model in tensorflow keeps outputting "yes" which is the most common single-word answer of my dataset
any ideas on how to fix?
I have image features and LSTMs combined into one model that then creates a network of dense layers, has a dropout of 0.5, activations tanh, and denses of 16,128 for each of those dropout,activation pairs.
The layer then has a dense layer of 256 and outputs via a dense layer of shape 16,23000. Finally, it is activated by a softmax
the dense layer of 256 uses a kernel_regularizer l1_l2 (elastic net)
what is this model intended to do?
and for what percentage of the training instances is the answer "yes"?
question is up to 32 separate words but a TextVec and an Embedding layer before the LSTM layers so it's represented as a dense vec
A large amount, it would make the single most common answer, I don't have exact counts of how many answers are yes but I could probably find it fairly quick
though I'm trying to run another test on my model so have about 10 minutes before that will finish the first epoch
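Getting the exact share of "yes" answers is quick with `collections.Counter`. The sketch below uses a made-up answer list; it also derives inverse-frequency weights, which is one common heuristic for countering a dominant class, not necessarily the right fix for this model.

```python
from collections import Counter

# Hypothetical training answers; in practice this would be the full answer column.
answers = ["yes", "no", "yes", "tree", "yes", "ladder", "no", "yes", "1", "yes"]

counts = Counter(answers)
most_common_answer, most_common_count = counts.most_common(1)[0]

# Accuracy a model would get by always predicting the most common answer.
baseline_accuracy = most_common_count / len(answers)

# Inverse-frequency weights: rarer answers get a larger weight.
class_weights = {ans: len(answers) / (len(counts) * n) for ans, n in counts.items()}

print(most_common_answer, baseline_accuracy)
print(class_weights)
```

If the baseline accuracy is already high, a model that always says "yes" is sitting at an easy-to-reach local optimum, which would match the behaviour described.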
so the possible answers are a mix of yes/no and qualitative questions?
yes I have 23,000 potential answers
I feel like those two classes of questions should be handled separately.
(the vocab size will include words like "what" which for this example we can say isn't a possible answer, but I've included that in my output size regardless)
yeah it would be good if I could but idk how to :/
I was hoping it would work as just any old classifier
where did you get the idea to do this?
sorry, I don't read screenshots of text.
No worries
if you provide it as text, we can continue.
it's nothing of actual importance, just an example of the proportion of yes/no questions
Is the food napping on the table?
True answer: no
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 1 , Answer: yes
What has been upcycled to make lights?
True answer: kettles
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 2 , Answer: yes
Is this an Spanish town?
True answer: no
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 3 , Answer: yes
Are there shadows on the sidewalk?
True answer: yes
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 4 , Answer: yes
What is in the top right corner?
True answer: tree
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 5 , Answer: yes
Is it cold outside?
True answer: yes
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 6 , Answer: yes
What is leaning against the house?
True answer: ladder
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 7 , Answer: yes
How many windows can you see?
True answer: 1
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 8 , Answer: yes
Is this in a park?
True answer: yes
[9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Index: 9 , Answer: yes
interesting; what do the lists of ints mean?
those are the very first 10 so obviously the density of yes/no questions won't be represented by this
the list of ints are the int versions of the output (softmax output, then np.argmax(output))
so [9, 0,0,0,...] means that the answer is "whatever word is represented by 9, and the rest is unknown" as 0 is an unknown token
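The decoding step described above, sketched with a toy vocabulary and fake softmax rows (the real output would have 16 positions and about 23,000 columns; the names and numbers here are made up):

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical 4-word vocabulary; index 0 is the unknown token.
vocab = ["[UNK]", "yes", "no", "tree"]

# Fake softmax output for 3 positions (each row sums to 1).
probs = [
    [0.05, 0.80, 0.10, 0.05],
    [0.90, 0.04, 0.03, 0.03],
    [0.85, 0.05, 0.05, 0.05],
]

token_ids = [argmax(row) for row in probs]
words = [vocab[i] for i in token_ids if i != 0]   # drop unknown tokens
print(token_ids, words)
```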
Is the food napping on the table?
True answer: no
this doesn't even make sense?
Are colorless green ideas sleeping furiously?
True answer: sometimes
????
The questions are created to give the model "commonsense knowledge"
i.e. a more trained and adapted model would look at object pairings such as "food napping" and decide "what the heck, food can't sleep???"
the images are preprocessed via the InceptionV3 model
they are then flattened in my model and that is about all I do with them before passing them to the merged model (that is constructed of Dense layers, Activations and Dropouts)
Someone recommended regularization but that hasn't really changed the output from "yes"
(The images are loaded with np.load() to load them as numpy arrays which are then converted to tensors)
Anyone knows how I might be able to make a mask out of shapes/pictures in an image which contains text? so like it's an image from a pdf, so it could contain texts or pictures or shapes. Am I able to make a mask out of everything that is NOT a text?
im learning ml rn
is KNN related to topology in any way
like distances and metric spaces
if your definition of topology is distances and metric spaces, sure
knn... uses distances
and distances make up metric spaces
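A bare-bones kNN classifier shows how little it needs beyond a distance function. This sketch assumes Euclidean distance via `math.dist`, but any metric could be swapped in; the points and labels are made up.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to query."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    top_labels = [label for _, label in by_distance[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Made-up 2-D points with two classes
train = [((0.0, 0.0), "a"), ((0.5, 0.2), "a"), ((0.1, 0.6), "a"),
         ((5.0, 5.0), "b"), ((5.5, 4.8), "b"), ((4.9, 5.3), "b")]

print(knn_predict(train, (0.3, 0.3)))
print(knn_predict(train, (5.1, 5.1)))
```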
well i learn topology at uni
how bout manifolds and differential geometry
Differential geometry finds applications throughout mathematics and the natural sciences. Most prominently the language of differential geometry was used by Albert Einstein in his theory of general relativity, and subsequently by physicists in the development of quantum field theory and the standard model of particle physics. Outside of physics, differential geometry finds applications in chemistry, economics, engineering, control theory, computer graphics and computer vision, and recently in machine learning.
from wiki
i mean, i'm confused by your question
are you asking if KNN is related to parts of mathematics?
it's just a supervised learning algorithm-- it's hard to say if it relates to very large swathes of a very large field
im deciding which subjects
to take
yea knn is related to other parts of mathematics
for next year uni
It doesn't really use topology no. You could argue it uses point set topology, but by that same argument calculus is topology
well i dont classify calculus a part of topology
i classify topology and calc as part of real analysis
then there is complex analysis
im not even a pure major
im a stats major
together with cs
but ik a bit of topology etc.
Saying topology is part of real analysis is one way to offend every topologist.
well my uni
does it as part of real analysis
my uni does a bit of topology in multivariable calc
(calc 2 or 3 in the us i think)
we learn calc 1 and probably a bit of calc 2 in High school
That is probably point set topology, I don't see any reason to introduce a student to anything from topology in multivariable calculus.
yea its point set topology i think
it was hard when i did it
its only very basic obviously
open and closed sets, intersections, etc.
Hey! I'm making a neural network to identify the presence of pneumonia in lung x-rays (1 or 0), so what activation functions and loss functions do you recommend?
I have relu for everything except output, which is softmax, and binary_crossentropy for loss
(tensorflow keras)
No matter what I try, my accuracy is 75% or below, but training almost always ends at 95% accuracy
and my model has a dropout of 50%
I have no idea which would be better, but I would suggest that you test multiple activation functions to see which provides the best result on one set of x-rays, and then test them on another set of x-rays to make sure the one that seems to be the best is really the best and it wasn't just overfitted
the difference in % between training and test data might be smaller if you have a larger training set
thanks!
5219 images for training and 619 for testing
that sounds good
not sure what kind of difference you're looking for in the images though, maybe the difference between a pneumonia patient's image and a regular person's image is small enough to need more training data... but the ratio between the training and testing images is good
maybe try different values of dropout or other kinds of normalization
i tried 35, 30, 40, 75, 85, 55, etc
data augmentation?
haven't tried it
i looked into it
is layers.RandomFlip("horizontal_and_vertical"), sufficient
I don't know, that's the magic of AI spending thousands of hours on hyperparameter tuning
haha
but maybe you should do a bit more of augmentation
RandomFlip sounds like something that will randomly flip your image either horizontally or vertically, judging by the parameters you added
I'm not sure if that's in any way beneficial to you
why would you want to train your AI to recognize pneumonia in a picture of lungs upside-down? it has no added value
yes, select the values that have the potential to enhance the picture, and tune those, forget the rest
I think creating a software to automatically detect an illness from an X-ray is a really cool idea. I honestly wish I could help you more, but I can only give you pointers
k ty
is there anything that is obviously wrong in this?
```python
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(
        45, (5, 5), activation="relu", input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)
    ),
    tf.keras.layers.MaxPooling2D(pool_size=(3, 3)),
    tf.keras.layers.Conv2D(
        45, (2, 2), activation="relu"
    ),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(60, activation="relu"),
    tf.keras.layers.Dense(60, activation="relu"),
    tf.keras.layers.Dropout(0.35),
    tf.keras.layers.Dense(30, activation="relu"),
    tf.keras.layers.Dense(NUM_CATEGORIES, activation="softmax")
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)
model.summary()
return model
```
do you get an obvious error message from it?
this probably has not much to do with your accuracy, but I believe x-rays have no color so, couldn't you convert your images to grayscale with only one channel?
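One thing that might be worth checking is the pairing of a softmax output with `binary_crossentropy`. The value of `NUM_CATEGORIES` isn't shown, so this is an assumption: if it is 1, softmax over a single unit always outputs 1.0 regardless of the input, and a sigmoid output is the usual choice for a 0/1 label. If it is 2 with one-hot labels, `categorical_crossentropy` is the more common pairing. The plain-Python sketch below only demonstrates the single-unit case.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for z in (-3.0, 0.0, 3.0):
    # softmax over a single unit is constant; sigmoid actually depends on z
    print(z, softmax([z]), sigmoid(z))
```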
Yeah there are many activation functions that were recently proposed such as Swish, Mish, and Phish that you could try as an alternative to ReLU
Yeah, Swish, Mish, and Phish have been shown to be better than ReLU in certain situations. I'm not entirely sure how good each of them is in relation to the others, but trying them out couldn't hurt either
Swish, Mish, and Phish? I thought this was the beginning of a joke and you're telling me these are actually names of features in tensorflow? 😄
Swish is in tensorflow, but Mish and Phish were proposed recently. Mish came out last year I think, and Phish was proposed only a couple days ago!
I dont think Mish and Phish are in there, but you could probably use them still
Mish is defined as f(x) = x * tanh(softplus(x))
and Phish is f(x) = x * tanh(GELU(x))
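Those definitions are short enough to try out directly. A plain-Python sketch following the formulas as quoted here, with Swish as x * sigmoid(x) and the exact erf-based form of GELU assumed; it is not numerically hardened for very large inputs.

```python
import math

def softplus(x):
    return math.log1p(math.exp(x))

def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def swish(x):
    return x / (1.0 + math.exp(-x))       # x * sigmoid(x)

def mish(x):
    return x * math.tanh(softplus(x))

def phish(x):
    return x * math.tanh(gelu(x))

for x in (-2.0, 0.0, 1.0, 3.0):
    print(x, swish(x), mish(x), phish(x))
```

In a framework like TensorFlow, the same expressions could be wrapped as a custom activation function even where they are not built in.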
Mish and Phish could probably be added if enough people start using them like Swish though
Lol yeah the names are funny, but they seem to be pretty good activations actually
"avengers, activate Phish"... doesn't quite have the same ring to it
I think Mish is named after its creator. Not sure what Phish is named for though.
after fish, probably 🙂
it can be really funny though, how some programming things get their names... like you'd think phish comes from fish, but maybe it's some very elaborate thing involving 13 people over the course of 4 and a half years plus a dog and two chocolate cakes that ended up being this name...
I looked it up and it's named after the guy who proposed it
the guy's name was Phish?
His name is Philip, and the guy who named mish is Misra
I dont think so. It would be hilarious if it was though
so it wasn't quite "named after" those people so much as those people chose the name for it based on their initials...
yeah, that makes more sense
also I imagine Philip was simping on Misra a little bit so he wanted to follow in that pattern Misra set with her name 🙂
I kinda wanna test out Mish and Swish in actual neural nets and see how they do
he probably did it because of that or for pure comedic effect
both are good
I like the name because my name is also Philip lol
I like the whole Swish, Mish, Phish thing too, I think it's funny
also not very serious
so I don't take it seriously
but I probably should
but I'm using a programming language named after 5 british idiots so whatevs 🙂
The experiments show that Swish tends to work better than ReLU on deeper models across a number of challenging datasets, and its simplicity and its similarity to ReLU make it easy for practitioners to replace ReLUs with Swish units in any neural network. The choice of activation functions in deep networks has a significant effect on the training...
Deep-learning models estimate values using backpropagation. The activation function within hidden layers is a critical component to minimizing loss in deep neural-networks. Rectified Linear (ReLU) has been the dominant activation function for the past decade. Swish and Mish are newer activation functions that have shown to yield better results t...
Swish, Mish, and Phish respectively
if I'd make an activation function, I'd just call it good and then people would say good AF 😄
Either that or "Hish"
Heruk is not my real name
ah ok
maybe I'd call it Wish 😄
you WISH this was a good activation function but it isn't 😄
Swish and Phish seem to be on par, but Mish is seemingly better than both of those
I have no idea
I missed this, but: topology? Part of real analysis? Color me offended. 🦂
offended by what?
Hey guys, anyone knows a good source to learn about "probability density function"? or maybe just probability in general
I should'a referred back to this message, but I'm just kiddin'. https://discordapp.com/channels/267624335836053506/366673247892275221/923350429004357744
Definitions and examples of the Probability Density Function
Could be open to testing in more applications outside of classification maybe
I think it would be interesting to see how it affects generators in GANS
or maybe in autoencoders
Do you think it could be implemented into the discriminator as well
I would think that having it in the discriminator would make the generator more accurate
since the idea of a GAN is for the discriminator to have high loss and an accuracy of 1/2 everywhere
so a function that minimizes loss would make the generator more accurate in making the fake images
but im not sure
Anyways it's really cool we have a new activation function
ty I'll look into that
dude phish is so good
Vouch its really good
Phish is quite revolutionary
Surely the mods here should make an announcement about it, its very useful and applicable to the real world
At the very least it should be added to tensorflow
Lmao 😂
get a copy of ross's text "introduction to probability models" and read diligently
any lesser source is just copium
I've been through Billingsley Prob + Measure like, ten years ago, so it might be nice to refresh --- is Ross' mostly theoretical, or is it more of an applied text?
i found a copy of this online, it seems kind of like a "handbook" of various basic probability models, rather than a useful resource for learning probability
i guess the first chapter is a good rundown of probability theory, but probably not something you can effectively self-study from if you've never seen the material before
its a canonical text for undergraduate probability, so its definitely something you can self study from
i missed the exercises at the end of the chapter
yeah these are pretty good
i stand corrected
I have a dataframe in pandas
I'm trying to "collapse" multiple rows into one since they have pretty much the same values
How do I go about this?
PS : I had used groupby initially to transform the data
I just checked. It seems Mish has been added to TensorFlow but not Phish
well if it was only proposed a couple days ago i wouldn't expect it to be
can anybody suggest me a hands-on course on reinforcement learning
Yo Guys
Where can i Learn AI
using python as a programming language specifically
i'm willing to enroll for a paid course just need good recommendation
👋 Hi, I'm a full-stack web developer with a little bit of experience in Python, but most of my programming experience is in Javascript. I've always had an interest in machine learning and NLP, so I decided I want to make a chat bot with Python, create an API for it, and turn it into a full-stack project for my portfolio. To be more specific, I wanna make something that is like CleverBot in the sense that its goal is just to have conversations with humans that are as natural as possible. However, rather than taking the pure ML-based model that CleverBot uses, I wanted to try something like a rule-based approach that uses AI to augment the quality of its responses over time. So basically, I guess I'm just looking for any tips/advice? Tools that might help me? Dunno, just kinda playing with the idea at this point haha. 😅
does k-prototypes need feature scaling?
there is this free course which is highly recommended https://es.coursera.org/learn/machine-learning, but does not use python ( you can ignore the programming parts) and it will help you learn the base of ml / ai. then you can look for specific tutorials on how to develop a ml/ai solution in python
MIT 6.034 Artificial Intelligence, Fall 2010
View the complete course: http://ocw.mit.edu/6-034F10
Instructor: Patrick Winston
In this lecture, Prof. Winston introduces artificial intelligence and provides a brief history of the field. The last ten minutes are devoted to information about the course at MIT.
License: Creative Commons BY-NC-SA
...
Paid
Graduate Studies in University
Udacity.com
Coursera.com
Udemy.com
DataQuest.io
DataCamp.com
BootCamp
Free Resources
University of YouTube
FreeCodeCamp.org
Neuromatch.io
🏓 Check Pinned Post For More
There are several open courses now.
Hello everyone, the Pyspark version used in the book is 3.0.1, while the version on my laptop is 3.1.2
@bronze skiff @stone marlin
Do you have standing permission to ping those people?
Sounds like you're looking for drop_duplicates?
Yes
What I did was to do a groupby, then agg with the column names
I don't know enough about what you're trying to do to understand this context.
If you still have a question, I can try to help, though keep in mind that I do not look at screenshots of text.
i already dropped duplicates -
processed_customers = processed_customer \
.groupby('customer_id',as_index=False) \
.agg({
'customer_name':'first',
'total_invoice_count': 'first',
'total_invoiced_amount': 'first',
'unpaid_amount': 'first',
'unpaid_count': 'first',
'first_invoice_date': 'first',
'first_invoice_amount': 'first',
'last_payment_date': 'first',
'last_payment_amount': 'first',
'customer_segment': 'first'
})
That's what I did
When you're working with tabular data, showing what you did with the data isn't enough; you have to show what the data itself is.
Try print(processed_customer.head(30).to_dict('list'))
Okay. Usually i use display() but I will give this a try as well
That displays it in a format that can't be copied and pasted effectively.
I just did. Let me paste what I got on pastebin
Thanks! So what is your question?
I wanted to drop duplicates as at the time I asked
you just have to do .drop_duplicates() on the DataFrame that has duplicates.
Okay. Thanks!
In [4]: df.groupby('customer_id',as_index=False) \
...: .agg({
...: 'customer_name':'first',
...: 'total_invoice_count': 'first',
...: 'total_invoiced_amount': 'first',
...: 'unpaid_amount': 'first',
...: 'unpaid_count': 'first',
...: 'first_invoice_date': 'first',
...: 'first_invoice_amount': 'first',
...: 'last_payment_date': 'first',
...: 'last_payment_amount': 'first',
...: 'customer_segment': 'first'
...: }).drop_duplicates()
Out[4]:
customer_id customer_name total_invoice_count ... last_payment_date last_payment_amount customer_segment
0 1 Microsoft 6 ... 2021-06-01 1000 Low
1 2 Apple 4 ... 2021-08-15 3000 High
2 3 Google 4 ... 2021-04-01 1000 Low
3 4 Netflix 4 ... 2021-07-31 2500 High
4 5 Meta 2 ... 2021-07-15 500 Low
[5 rows x 11 columns]
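A minimal sketch of the two approaches discussed above, on a made-up frame (the values are illustrative, not the pasted data). `drop_duplicates` handles rows that are identical in every column; `groupby(...).first()` collapses rows that share a key, taking the first non-null value per column.

```python
import pandas as pd

# Hypothetical data: customer 1 appears twice with identical values.
df = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "customer_name": ["Microsoft", "Microsoft", "Apple"],
    "total_invoice_count": [6, 6, 4],
})

# Rows identical in every column: drop_duplicates is enough.
deduped = df.drop_duplicates().reset_index(drop=True)

# Rows that share a key but might differ elsewhere: keep the first
# non-null value per column for each customer_id.
collapsed = df.groupby("customer_id", as_index=False).first()

print(deduped)
print(collapsed)
```

If the rows per customer are truly identical, both give the same result and `drop_duplicates` is the shorter option.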
It could probably be added if people started using it a lot
Actually, it seems that you can just run Phish, by creating it like this:
```python
"""Tensorflow-Keras Implementation of phish"""

## Import Necessary Modules
import tensorflow as tf
from tensorflow.keras.layers import Activation
from tensorflow.keras.utils import get_custom_objects


class Phish(Activation):
    def __init__(self, activation, **kwargs):
        super(Phish, self).__init__(activation, **kwargs)
        self.__name__ = "phish"


def phish(x):
    return x * tf.math.tanh(tf.nn.gelu(x))


get_custom_objects().update({"phish": Phish(phish)})
```
and calling "phish" as a string literal in the dense layers
I think this is how people would use it until it's added officially
Is there anyone out there who knows opencv and can help?
please don't ever ping me at 5 in the morning
and yes, so what is your error with that spark version
I don't know if i'd ask this here or no, i'm trying to work through the RL guides on tensorflow's website, but every time I try and import everything related to tf-agents I keep getting "AttributeError: module 'tf_agents.trajectories.trajectory' has no attribute 'Transition'"
I made sure my tf agents and tensorflow versions were correct
are you using an ide and forgot to link your venv interpreter to it?
no i don't believe so, I'm using the anaconda prompt to install the packages for jupyter notebook
everything else I installed works just fine
i am not sure if i should ask here but could someone help me with matplotlib stuff?
the weird thing is if I just do import tf_agents it works fine, but when I get to trying to import specific things from it is when it decides to give an error
Wasn't aware that I needed permission to ping people.
I turn my pings off for exactly this reason.
I'm sorry, I'm on GMT+1. It's currently 17:28 in my country.
Hey, I need some help with NEAT, anybody who knows it? Maybe DMs or smth idk
it's bad etiquette to ping people you aren't talking to. If you have a question, it should be posed generally, not to specific people who haven't volunteered to answer it.
Noted.
hello
hi
I want to implement a Fast Fourier Transform.
import scipy as sp
import matplotlib.pyplot as plt
listA = sp.ones(500)
listA[100:300] = -1
f = sp.fft(listA)
plt.plot(f)
but it's giving me: "AttributeError: module 'scipy' has no attribute 'fft'"
anyone can help me with it?
if I use -import scipy.fft as fft
it says--TypeError: 'module' object is not callable
You could try the one in numpy as an alternative
if you can't get the scipy one to work
does numpy have fft ?
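Both errors above have the same cause: `scipy.fft` is a submodule, so it has to be imported explicitly, and after `import scipy.fft as fft` the name `fft` is the module, not the function (the call would be `fft.fft(...)`). NumPy does ship its own FFT as `np.fft.fft`. A minimal sketch with NumPy, reusing the signal from the question:

```python
import numpy as np

# Test signal from the question: ones with a block of -1 in the middle.
signal = np.ones(500)
signal[100:300] = -1

# np.fft.fft is the function; scipy.fft.fft(signal) is the SciPy equivalent.
spectrum = np.fft.fft(signal)

# The result is complex, so plot the magnitude rather than the raw values.
magnitude = np.abs(spectrum)
freqs = np.fft.fftfreq(signal.size)  # frequency axis in cycles per sample

print(magnitude[:5])
```

To plot it, `plt.plot(freqs, magnitude)` with `matplotlib.pyplot` imported as `plt` would avoid the warning that comes from plotting complex values directly.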
Hey, does anyone know any github repo or module that can generate a 2d map like one below? I'm looking for a playground on which I can test some reinforcement learning algorithms
Thanks man!
how do i start learning the python coding?
just find some tutorials on ytb ig
As this is your first message in this entire discord, I assume you mean in general? If so, maybe check out something like HackerRank (if they're still around), or Codecademy etc. Or give yourself little projects to do; you can work up to things like PyTorch or TensorFlow for the data science
how can i shuffle different datasets by x amount? i.e. I need order to be maintained, so my (input1, input2, output) need to be such that when I fetch them after being shuffled, I get (input1_list[x], input2_list[x], output_list[x])
tensorflow
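The usual way to keep several arrays aligned while shuffling is to shuffle one index permutation and apply it to all of them. A minimal sketch with NumPy, assuming the three inputs fit in memory as arrays (the array names and contents here are made up):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical aligned data: row i of each array belongs together.
input1 = np.arange(10)
input2 = input1 * 10
output = input1 * 100

# One permutation, applied to every array, keeps rows aligned.
perm = rng.permutation(len(input1))
input1_s, input2_s, output_s = input1[perm], input2[perm], output[perm]

print(input1_s, input2_s, output_s)
```

With `tf.data`, the equivalent idea is to zip the datasets into one dataset of tuples first and shuffle that single dataset, so each tuple moves as a unit.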
Not exactly similar but here are some maze environments:
https://github.com/MattChanTK/gym-maze
https://github.com/maximecb/gym-minigrid
Thks man is this the famous MIT course everybody did ?
so basically, for whoever wants to help: I'm trying to create a "touchscreen" using a side-facing camera. For now I want to detect whether the hand is touching the display, and I don't know where to start, since I know cv2 but I'm extremely weak with AI stuff
This is quite similar to virtual mice however I would use a side facing camera to factor if the fingers are touching the screen or not
https://facebookresearch.github.io/PyTouch/ ? @sweet owl
A Machine Learning Library for Touch Processing
Paper might be an interesting read though https://arxiv.org/abs/2105.12791
Using Dataset.zip(x,y,z).batch(i).shuffle(i) but it's taking like 3.5 minutes to shuffle 255 data points, any ideas on what can be quicker? I have 248,000 data points I need shuffled overall
@vivid plank its kinda weird tho and a process like why do I need to make a PR
shuffle should maintain order
what's i? post the whole snippet
The issue is that to shuffle my zip of 3 different datasets gives me a long waiting time before any shuffling has happened
i would be the batch sizing
doesn't matter - its a onetime cost only
unfortunately not
to cache all of the images, questions and answers I'd need about 64GB of RAM
reduce the buffer size then
Would that not drastically lower the point of the shuffle if I could only shuffle it < 255
as 1) that's a relatively small value (as a perfect shuffle according to tensorflow would be 248,001) and 2) less than batch size from what I understand means some values will be kept where they are
if your batch size is 16, a buffer size of 32 would do?
It likely would be, I was thinking bigger = better in terms of batches though
I'm not sure if it actually would be
so was trying to keep it fairly high
it probably won't be
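To make the buffer-size trade-off concrete: `tf.data`'s `shuffle(buffer_size)` fills a buffer with that many elements and then repeatedly emits a random one and refills the slot, so a small buffer only shuffles locally. The sketch below is a pure-Python simulation of that idea (the function `buffered_shuffle` is hypothetical, not TensorFlow's implementation).

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in the order a fixed-size shuffle buffer would emit them."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


small = list(buffered_shuffle(range(100), buffer_size=8))
full = list(buffered_shuffle(range(100), buffer_size=100))
print(small[:10])
print(full[:10])
```

With `buffer_size=8`, the element emitted at position `k` can come from at most 7 places further along in the input, while a buffer as large as the dataset gives a full shuffle. Note also that the order of calls matters: `.batch(i).shuffle(i)` shuffles whole batches, whereas `.shuffle(n).batch(i)` shuffles individual examples before batching.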
My model is a mess lol
try using prefetch too
Even using precision and recall metrics for my model all it outputs is "yes"
yeah I prefetch my images, should prefetch my one hot answers too
all models are like that in the start 🙂 DL is not that easy
the actual model processing takes about 0.8 seconds and loading all of the images (when I use the full dataset) takes about 4 seconds
so I always end up waiting for the I/O bottleneck anyway
but over time, you'd immediately recognize what the problem is
Yeah, I'm just getting annoyed haha, no matter what I do it seems to only output "yes" so I'm doing all I can to change that
yea, my model is bottlenecking too 😜 but I am a bit lazy and can wait more
I try kernel_regularizer, changing metrics to Precision and Recall (which I think is best), changing loss/optimizer etc. etc.
one change after the other and something I'm doing always ends up favouring the most common answer lol
well, then speed shouldn't be a priority for you at this stage
yeah I'm using 10% of the full dataset
so it can do 1 epoch in about a minute as opposed to 45minutes
I can reduce that time but I'm trying to do extra processing to see if I can get a few varying results
but I still just get [9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] (once decoded that's "yes" with 15 spaces)
because like a big brain idiot I'm trying to also output up to 16 words ,-,
anyways, I'ma sleep cuz it's 1am rn, ty for the advice man
hm..what's the task?
Visual Question Answering
Image + Question --> Answer
I've just thrown in like 6 metrics and giving it a quick run through now
you're backpropping through 6 metrics?
I doubt you'd get good accuracy without carefully adjusting your architecture
I'm using the ones given from this
ah :/
from that site I'm using:
METRICS = [
keras.metrics.TruePositives(name='tp'),
keras.metrics.FalsePositives(name='fp'),
keras.metrics.TrueNegatives(name='tn'),
keras.metrics.FalseNegatives(name='fn'),
keras.metrics.Precision(name='precision'),
keras.metrics.Recall(name='recall'),
]
wuz the loss
but from a quick test it still outputs "yes"
tbh I seem to have deleted the code that gets loss lmao, how do I get loss from train_on_batch again
you can paste the code here, someone might look over it and help you out
Pasting large amounts of code
If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.
so I'm assuming the way this is formatted is the list holds the returns from each metric used
[0.6561643481254578, 3789.0, 86.0, 100498472.0, 291.0, 0.9778064489364624, 0.9286764860153198]
The issue is much like the credit card fraud imbalanced data set I have a HUGE amount of "yes" answers
except it's not a binary classification task and has 23,000 possibilities
Loss from last batch trained: 0.5620800852775574
TruePos from last batch trained: 3801.0
FalsePos from last batch trained: 24.0
TrueNeg from last batch trained: 100498536.0
FalseNeg from last batch trained: 279.0
Precision from last batch trained: 0.9937254786491394
Recall from last batch trained: 0.9316176176071167
Yeah it's scoring quite high just by guessing "yes"
imbalanced classification is hard in general
especially with a huge number of classes
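One common mitigation for this kind of imbalance is to weight the loss by inverse class frequency, so that mistakes on rare answers cost more than mistakes on "yes". A minimal sketch of the "balanced" heuristic, on made-up labels (the label ids and counts are assumptions for illustration):

```python
import numpy as np

# Hypothetical integer labels: class 0 ("yes") dominates.
labels = np.array([0] * 90 + [1] * 7 + [2] * 3)
num_classes = 3

counts = np.bincount(labels, minlength=num_classes)

# "Balanced" heuristic: n_samples / (n_classes * count_of_class).
# np.maximum guards against division by zero for unseen classes.
weights = labels.size / (num_classes * np.maximum(counts, 1))
class_weight = {i: float(w) for i, w in enumerate(weights)}

print(class_weight)
```

In Keras, a dictionary like this can be passed as the `class_weight` argument when training. Whether it helps with 23,000 classes and multi-word outputs is something to test rather than assume.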
Anyone here code counterfactual regret minimization
I'd like to have custom images instead of labels on my squarify tree map, How can I do that, I couldn't find anything on it when I checked.
This is the test data I'm working with for now
I plan on downloading the custom emoji images using requests and storing them in a folder with their emoji_id as the filename
This is what the treemap looks like, but instead of text labels I'd like the images with the respective filename
Hello, I wanted a small help
I am searching proteins in a disease let us say x
in google
there are about thousand
how do I automate the search and fetch only the proteins in the disease as per google
any idea?
pls help
can anyone explain the silhouette score to me?
what's the loss you are using?
I use Adam and CategoricalCrossEntropy
For my loss and optimizer
I think one issue I have is that I try to output up to 16 words in 16 different lists, because the answers in my dataset are between 1 word and 12 words long
So if I run it too long with a kernel_regularizer, even with a tiny value it ends up forgetting even the most common answer and just outputs nothing
I'm not sure if its natively masked but without the regulariser it seems to still remember to output yes
I am wondering, are there any machine learning models that use past information to predict future outcomes, with time as the independent variable?
like doing a time series forecast on a stock price: is there any machine learning model based on time?
I'm not sure about the time side or int of it but in text there are models called RNNs that maintain cell state to adjust future biases
And I think you may be able to create dense networks with a bias_initializer so you might be able to assign higher bias to ones that have historically scored better
Bias weights I think you can also get from previous iterations of the network though as far as I'm aware you'd have to build the model and compile it everytime
thanks
Hi does anyone know about how to use power spectral density on the price of the particular stock vs the time series ?
thanks
It is not possible to transfer your consciousness to the virtual world. It is impossible. Maybe it will be possible to create a virtual, limited double of a person, their limited imitation of a character, etc., but it will be a lie, it will be a computer processed set of 0 and 1, not a real person or his actual consciousness. Man will never reach the intellectual level to create an artificial intelligence equal to his own. It is logically impossible. AI can do things faster and more accurately, but it won't be true intelligence, just a limited imitation of it. AI is not really intelligence, but an algorithm that mimics the characteristics of intelligence in a limited way. And that artificial intelligence does certain things faster and more accurately is due to the speed of computers and these algorithms, and the numerical nature of computers that compute everything much faster than humans with satisfactory accuracy, hence the efficiency and accuracy of these algorithms, but it is not intelligence.
Can someone tell me if im wrong or right?
Does anyone have any book recommendations for something between ISLR and the elements of statistical learning wrt. difficulty/complexity?
So for the past few months I've been trying to build a bilingual voice cloning machine
The other day I ran into an issue that I can't figure out if it's either an obstacle or a permanent shutdown
I need to synthesise my audio, which requires trading my metadata files
But the process isn't streamlined in the slightest, meaning I have to take all my hundreds of datasets and manually write them in to be synthesised
Which I think might not be what I'm meant to do if this is supposed to be dealing with one metadata file
I don't know, I feel as if there's still a chance for me to make this work, but I feel as if everything I've been doing since August has been a complete mess
Code design for a ML project
Hi everyone - I'd love your thoughts on something I'm working on:
I'm writing some code for a ML project that I'm working on. The rough pipeline is something like this:
- Pull raw training data from a SQL database
- Perform feature engineering
- Train the model on the data (hyperparameters have already been defined and stored in a config file)
- Save and upload trained model onto a server.
- Use trained model to score on new observations
All of this code will reside in a GitHub repo, and the same repo will house code for other models as well (in different folders within the repo) that are similar but have different features for the raw data for those models. My question is for point number 2 above:
How do I write modularized code for feature engineering? Should I define a specific class for the raw data that I pull for this model, and then define specific methods for each feature that I add to this dataframe? Or am I thinking about this wrongly? I apologise if my question doesn't make a lot of sense, but I'd appreciate any thoughts on this. Also, if you have links to a good public repo that I can look at to get some inspiration, that would be awesome too.
TIA
What do you mean by feature engineering? Representing the data from the SQL table in a way that can be passed to the model?
I mean, adding additional derived columns/features from the existing set of columns, which i intend to do using Python. SQL (step 1) will only be used to pull in the raw data from a database, and nothing else.
If the derived features are derived only in terms of what is already in the DataFrame, I don't think you need to design anything extra.
Yes, but where will the code for this step ideally reside? In a separate file like feature_engineering.py? And if this file also contains code to perform feature engineering for other dataframes to be used for other models, what's the best way to separate out the code for different models?
I guess you can make a function that takes a dataframe and returns a series, and make one function like that for each derived feature, and then call all of them in a call to pd.concat
Yes, that was my initial thought. But this function will only be applicable to a specific dataframe. For example let's say the repo in this case hosts code for 5 different models. And let's say I have a file named get_raw_data.py which has different functions like get_raw_data_for_model1(), get_raw_data_for_model2(), ..., get_raw_data_for_model5(), each of which runs a SQL query to pull in raw data and returns a pandas dataframe required for each of the models. Each of these dataframes has different columns and they are quite different from each other. So the functions I create in feature_engineering.py are not applicable to all 5 dataframes. In this case, how do I separate out these functions? More concretely, if I have a function create_feature_foo() in feature_engineering.py that is applicable only to dataframe 5, what's the most pythonic way to do it?
Make 5 different functions if they clearly have no overlap. There is no way to make that less work or smaller in terms of amount of code.
Yes, we will have 5 different functions. But what's the best way to indicate that a function is applicable to only a certain dataframe (and not the others?) - is it okay to define these dataframes as separate classes (that inherit from pd.DataFrame) and define these functions as methods for these classes?
You can either have documented pre-conditions (recommended because it makes the code shorter), or you will have to have the functions check the conditions in code (the dataframe format / columns, etc).
The second method that takes more code makes sense if you want to make this a bit more robust against people new to the project that don't really yet know what they are doing or don't really care.
To that end you can have a dataframe format checking tool (in code) that takes in a given expected format.
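The suggestions above (one function per derived feature, combined with `pd.concat`, plus a small format check) could be sketched roughly as follows. All names here (`check_columns`, `unpaid_ratio`, `avg_invoice_amount`, `MODEL1_FEATURES`, `add_features`) and the column names are hypothetical, chosen only for illustration.

```python
import pandas as pd


def check_columns(df, required):
    """Raise if the dataframe lacks any column a feature function relies on."""
    missing = set(required) - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")


def unpaid_ratio(df):
    """Pre-condition: df has 'unpaid_amount' and 'total_invoiced_amount'."""
    check_columns(df, {"unpaid_amount", "total_invoiced_amount"})
    return (df["unpaid_amount"] / df["total_invoiced_amount"]).rename("unpaid_ratio")


def avg_invoice_amount(df):
    """Pre-condition: df has 'total_invoiced_amount' and 'total_invoice_count'."""
    check_columns(df, {"total_invoiced_amount", "total_invoice_count"})
    return (df["total_invoiced_amount"] / df["total_invoice_count"]).rename("avg_invoice_amount")


# Each model lists the feature functions that apply to its dataframe.
MODEL1_FEATURES = [unpaid_ratio, avg_invoice_amount]


def add_features(df, feature_funcs):
    """Return df with one new column per feature function (additive only)."""
    return pd.concat([df] + [func(df) for func in feature_funcs], axis=1)


raw = pd.DataFrame({
    "total_invoiced_amount": [100.0, 200.0],
    "unpaid_amount": [50.0, 0.0],
    "total_invoice_count": [4, 2],
})
features = add_features(raw, MODEL1_FEATURES)
print(features)
```

Keeping a per-model list of plain functions makes it explicit which features belong to which dataframe without subclassing `pd.DataFrame`, and the in-code check can be dropped in favour of documented pre-conditions if shorter code is preferred.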
Got it - that makes sense! 🙂 Can you tell me what's wrong/undesirable with the classes/method option? Just trying to understand the pros and cons of everything.
Don't make an object unless it needs to be one.
That's how you get those massive OOP projects with crazy class names that make no sense (cough Java libraries cough).
And they don't do much (pretty empty, just some getters and setters).
The time to make an object (often with no methods actually (data object)), is when you find yourself passing around the same arguments to different functions together all the time.
So you can think of an object / struct as being a shared stack frame for the variables.
Actually the way to get the optimal structure for a project is to first write out the entire thing flat (step 1), that is, no functions, no classes, just all inline. Then you look at which parts repeat or are very similar other than some differences in parameters. Take those repeat parts and make a function out of them (step 2). Note that this effectively compresses the code. Then look at your functions and see if they tend to have a bunch of arguments that can be grouped together / are passed around together. If they are, make an object out of them (with those variables as the members) (step 3). Again this compresses the code. Now go back to step 1. Extra step: sometimes it's worth splitting up the flat code even if it does not actually compress it further because you want to be able to read what is happening as high-level steps (like a story), this is often done in the main function / file.
This method will give you the optimal code in terms of size while still being readable. No function nor object is unnecessary.
All programming paradigms teach this method in an indirect way.
Note that this method uses hindsight. It does not try to predict which functions or classes need to be made and then make them (no upfront diagram (UML), only maybe after making it). It lets the code itself decide how it wants to look.
Prediction of classes and functions can be incorrect and lead to bad design.
Done UML not too great yep
Evolve your code as needed
If you have UML you have to update that too more work
Document with UML when things are stable
Thank you so much for the detailed explanation! I really appreciate it! 🙂
Hm, I've done a significant amount of EDA with OOP and I thought it was fine, I don't think it's generally undesirable. But if someone is starting out, doing an easier or smaller project, or even just working alone, etc., I agree that it's sometimes sort'a overkill. Having said that, before anyone yells at me for liking OOP or whatever, I'll note my experience here.
My experience with Objects in EDA: For much of the data I worked with, there were components, subcomponents, etc., and these were typically all part of the same "thing" --- though I've done this even with travel data and other types of data, not just physically-modeled data. I can use classmethods to parse the appropriate parts of the data into objects which each have their own methods --- mainly cleaning and descriptor methods, as well as plotting methods. I'm also able to isolate different parts for inter-component feature engineering, and for component-to-component feature engineering it is enough to check for the class type to see if the components "go together". If one also forces an additive-only structure with feature engineering (at least until returning the df) then it's also trivial to add new features in the class.
I also feel that readability is significantly easier than the normal "imperative" EDA where it's just a bunch of imperative code with comments above it saying what something does --- especially if you want to modify one single part of the imperative code and you didn't realize something lower required some strict size or something. Though, to be fair, I'm a huge stickler for Python's type-hinting + documentation --- usually my commit stuff runs mypy + sphinx and throws an error if something isn't documented --- though I'm on the extreme side of this, I know.
My bias, though, is to err on the side of verbosity and telling people exactly what needs to go in and come out. I know not everyone likes this, but it's been okay for me so far.
(This also works well with larger data, when you need to format / feature engineer around chunks and you need to make a DAG structure to do all that nonsense. But, again, that's kind of the "shared code" that Squiggle talks about above, just a specific point I wanted to note.)
In Python one does need to fight the language a bit here by not being lazy on the type hinting. I like to also distinguish between API specification (what are the functions and classes, pre-conditions, post-conditions, side-effects, return values, errors, is the operation atomic?, security concerns, TODO, etc), and documentation (an extensive set of documents which can link to or contain the API specification as well). API specification can be auto generated by "documentation" tools, but documentation is often like writing a book and takes a lot of time. Documentation is often best done after when one actually knows what it looks like (rather than just prediction | giving it time to settle), because having to change it is a costly (in time) thing to do.
*The entire program does not need to be complete to write documentation, parts can be documented. I just look at the rate at which they were changed over time (commits). If the part has not been touched in a while (solidified), then I document.
I don't disagree --- Python is great for prototyping, so to do anything for "production" (type-hinting, etc.) it does take a bit of work and boilerplate to make sure everyone's on the right page. Moreover, there's a lot of DS people that don't even know type-hinting is a thing, or that they can lint/format.
I'm not sure I understand the distinction you're making exactly between API docs and standard documentation, but it might be different here because we build our APIs separate from our EDA tooling and both take a different method of documenting (APIs use swagger, autogen usually; EDA uses self-written numpy docstyle). Either way, it's good to separate those otherwise it gets confusing, I agree.
The only thing I disagree with here (and mildly so) is that I make my peeps document when they're submitting a PR, even if things change soon, as well as write unit tests --- but, having said that, it's usually at a point then where things have "settled". Also, these are people who are doing this kind of thing a lot, so we already know the gist of what functions we should have. I also found that if I don't force them to document right away, it will literally never get documented, but maybe that's just me not badgering people enough later.
That's an interesting strategy --- I do not know if it would work for me in a team setting, but I can imagine if it's a solo project that isn't quite clear yet (initial stages of EDA, etc.) then, yeah, totally, I can see that being a reasonable way to go. Especially for early EDA when you're just scratchin' around at stuff.
One additional downside (at first) of my methodology is: it takes a LONG time to get people used to documenting, linting, type-hinting, etc. Eventually, it's second nature, but there's a pretty significant ramp-up time.
Yeah, I agree about writing a bit explaining what is going on and writing some tests, I just don't consider it documentation under my own terminology. Documentation is more "serious", one needs to sit down and spend a lot of time on just it. Open up LaTeX maybe, Word, make some diagrams (maybe even animations), etc.
(It can take weeks)
Ahh, I understand. Yes, I agree --- sort of like, reporting or "long-term" documentation of tools for wide-spread use or something.
(Most projects can't afford this unless it's legacy / will stick around for a long time and not change too much / or they just have a lot of employees like the big popular game engines for example)
Yes, that documentation, I agree, should not be done until the tool is in a usable state and is not being used only by the team that made it. I misunderstood --- in the above, when I say "documentation" I mean doing google/numpy/whatever style docstrings and other minor things like that in a Python file, as well as Swaggering the API (or whatever doc system is used).
It's the sort of thing one creates to make sure that future employees can understand it all in total long after you are gone.
(An example would be like hardware documentation like Intel's official docs for their chips)
Yeah, that is a much more serious endeavor, one that I've luckily not had to do much. I don't think I'd require my team to do documentation like that on most of, if not all of, our EDA code. For a full ML project, probably only a small bit on how to run the pipeline (since it's generally standard pipelining).
I think the writing of some comment at least per file (at the top (I like to include example code)) is very nice. For the functions themselves, I like the various assumptions being made (pre-conditions, post, etc (every function has more of these than one might think)), and in some cases to cover up bad language design (like how C does not have multiple return values so I need to say which argument is actually an output). I prefer very long function names (sentences sometimes), and variable names so if those do not already let you know what it does then maybe a comment still, but often they are (and are supposed to be) sufficient.
Agree with everything here. Especially in the context of Python.
Seems legit. I wanted above to note that these things are options, but for someone starting out in their journey / doing a smaller project, yeah, probably focusing on other things is more reasonable, as y'all noted above.
Hi All anyone has used Shap for fb's prophet?
Lol we all agree
Hey I need some good project topic for developing Deep Learning Model.
Can anyone suggest me some good topics ?
Google's AlphaZero is said to have learned chess on its own by playing against itself, and it then defeated the best chess engine at the time, Stockfish! In that case, it has also incorporated self-learning and utilization of its previous learning in subsequent attempts, to play even better etc.
Other than this, human intelligence is capable of creative content creation like poetry or art, which code like GPT-3 is already capable of to some basic extent, right?
Would this suggest the contrary to you?
what i usually do is check out kaggle for datasets/notebooks and just try to train a model with those
Hi, I'm wondering about this plot. Why isn't the 'Other' value in the 'Terminal' feature being plotted?
Hi, I do not know seaborn. But could this error be because 'Others' does not have enough passengers to show?
Does it mean the 'Other' value is very small?
Could be. You could check its count with value_counts or something.
Hi guys, I have homework for my artificial intelligence class. The topic is "artificial intelligence in law". They asked us to design an artificial intelligence program to help lawyers. But everything I can think of has been done. I want to get your opinion too. Can you share your ideas with me?
ok thank you for the answer
Crime Prediction? 🤔
Oops! Our clients are not law enforcement! But Lawyers! Sorry!
Hmm.. How about an app that predicts the probability of bail, etc.?
It could draw on data to show the history of similar successful cases, along with the sections applied and the guidelines from the verdicts?
I think that's a good idea. Thanks😊
Anyone else have any ideas to share?🤗
The 'other' category count is relatively small when compared to other categories therein.
Use value_counts to view the proportion of each category in the 'Terminal' column.
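A minimal sketch of that check, assuming the data sits in a pandas DataFrame. The DataFrame `df` and its values are made up for illustration; only the 'Terminal' column name comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for the passenger data; the real values are not shown in the chat.
df = pd.DataFrame({"Terminal": ["A"] * 50 + ["B"] * 45 + ["Other"] * 5})

# Absolute number of rows per category.
counts = df["Terminal"].value_counts()

# Fraction of rows per category (normalize=True divides by the total).
shares = df["Terminal"].value_counts(normalize=True)

print(counts)
print(shares)
```

If 'Other' turns out to be a tiny fraction, its bar may simply be too small to see at the plot's scale.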
Hmm, interesting... 😀
You could explore topic modelling for fraud detection using LDA (not to be mistaken for Linear Discriminant Analysis; I mean Latent Dirichlet Allocation, the other LDA, which is used for topic modelling).
If you then want to make your work more alluring (or maybe, sophisticated), delve into self-supervised vs semi-supervised learning and compare against the results from your LDA topic model.
Okay, thank you
Should we scale the data before clustering?
I want to use DBSCAN, but I'm confused about when to scale the data.
Yes, it's advisable to scale your data before applying a clustering algorithm to it, especially a distance-based one like DBSCAN.
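A small sketch of the order of operations, assuming scikit-learn. The data is synthetic, and the `eps` and `min_samples` values are arbitrary picks for this toy set, not recommendations.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Made-up data: feature 0 lives on a small scale, feature 1 on a much larger one.
blob_a = rng.normal(loc=[0.0, 0.0], scale=[0.1, 100.0], size=(100, 2))
blob_b = rng.normal(loc=[2.0, 0.0], scale=[0.1, 100.0], size=(100, 2))
X = np.vstack([blob_a, blob_b])

# Scale first, so every feature has mean 0 and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Then cluster. Note that eps is now measured in scaled units.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_scaled)

n_clusters = len(set(labels) - {-1})  # label -1 marks noise points
print(n_clusters)
```

Without the scaling step, distances would be dominated by the large-scale feature and the same `eps` would mean something very different.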
Ok thank you for the answer
But do you know about the silhouette score?
Is it true that the silhouette score is the method for evaluating clusters when the data has a label?
I only know that some people use the silhouette metric to gauge the performance of their clustering model, but I haven't used it before so I don't know for sure.
Have you searched Google yet? I'm sure you'll find the right answer there.
I've looked on Google but I'm still confused, because the examples it gives only use 2 columns (feature and label), but I have 15 features and 1 label
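On the 15-feature case: scikit-learn's `silhouette_score` takes the whole feature matrix plus the cluster assignments produced by the algorithm, so the number of columns does not matter and no ground-truth label is needed. A sketch with made-up data (200 rows, 15 features) and an `eps` chosen only for this toy set:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Made-up data: two well-separated groups, 15 features each.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 15)),
    rng.normal(8.0, 1.0, size=(100, 15)),
])
X_scaled = StandardScaler().fit_transform(X)

# Cluster assignments come from the model, not from a label column.
cluster_labels = DBSCAN(eps=2.0, min_samples=5).fit_predict(X_scaled)

# Silhouette ranges from -1 to 1; higher means tighter, better-separated clusters.
score = silhouette_score(X_scaled, cluster_labels)
print(score)
```

Note that `silhouette_score` raises an error if the model finds only one cluster, and that DBSCAN's noise points (label -1) are treated as a cluster of their own unless you filter them out first.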
But I'm wondering about this: when a column is in datetime format like this, is it advisable to drop it, or keep it in when scaling?
See above, great book
“weapons of math destruction” (WMDs). She defines WMDs as opaque mathematical models that embed human prejudice, misunderstanding, and bias into the software systems that automate numerous aspects of our lives. Her book covers several types of these models and the frustrating injustices they can perpetrate. In addition to case studies about credit scoring, online advertising, employment, and insurance, O’Neil discusses the use of WMDs in the criminal justice system
Explore possible bias in models that causes inequality and suggest ways to fix the models
The title is funny too
Oh yeah, I read the book, it is informative and entertaining
hey,
I want to do pose detection on the web
So I tried MoveNet with ONNX and TFjs.
For PC:
ONNX: ~20 FPS
TFjs: ~ 40 FPS
For mobile:
ONNX: 2 FPS
TFjs: 6 FPS
What can I do to improve speed? I am doing all inference on the client side, in the browser
Are there any other models I can try?
Or something else completely?
Thanks in advance for the help
make them sparse - check out neuralmagic
if you're doing inference only then try lowering precision
there's also things like TensorRT which do lower-level optimizations as well
I was thinking of using an int8 model; currently I am using float32
But I don't know how to use a TFLite model with TFjs
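To make the float32 vs int8 trade-off concrete, here is a plain NumPy sketch of affine quantization. It only illustrates the arithmetic behind "lowering precision"; it is not how TFLite or TFjs implement it, and it does not cover loading a TFLite model in TFjs. The helper names `quantize_int8` and `dequantize` are hypothetical.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Map a float array onto the int8 range [-128, 127]; returns (q, scale, zero_point)."""
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / 255.0
    if scale == 0.0:
        scale = 1.0  # constant array: any scale works, avoid dividing by zero
    zero_point = int(round(-128 - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Approximate reconstruction of the original floats."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
weights = rng.normal(size=(256, 256)).astype(np.float32)  # stand-in for one layer's weights

q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = float(np.abs(weights - restored).max())

print(weights.nbytes, q.nbytes)  # storage in bytes for float32 vs int8
print(max_error, scale)          # worst-case rounding error next to the step size
```

The int8 copy needs a quarter of the memory, at the cost of a rounding error on the order of one quantization step; whether that costs accuracy on a pose model is something you would have to measure.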