#data-science-and-ml | Python | Page 186

feral lodge May 29, 2018, 9:22 PM

#

Lemme have a quick look

#

Soon sleep time for me though, heads up

placid snow May 29, 2018, 9:23 PM

#

Returns the coefficient of determination R^2 of the prediction

feral lodge May 29, 2018, 9:28 PM

#

Ah I see! Looks like you can choose the error measure yourself

#

I guess you have this one http://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html#sklearn.metrics.r2_score

#

Then hold on, let me reread your older messages

placid snow May 29, 2018, 9:28 PM

#

I use the built in .score methods of the model classes

#

```
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
```
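
For reference, the R^2 value mentioned above can be computed by hand. This is a minimal pure-Python sketch, not scikit-learn's own code; `r2` is a hypothetical helper name and the numbers are made up for illustration.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]                 # made-up targets
perfect = r2(y, y)                       # exact predictions
baseline = r2(y, [2.5, 2.5, 2.5, 2.5])   # always predicting the mean
worse = r2(y, [4.0, 3.0, 2.0, 1.0])      # worse than predicting the mean
print(perfect, baseline, worse)
```

A score of 1 is a perfect fit, 0 matches always predicting the mean, and a negative score means the model does worse than that baseline.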

feral lodge May 29, 2018, 9:30 PM

#

Aight, so you can ignore my error measure up there then

#

I've pretty much never used R² though

#

But anyway, so you just average your score over your 100 experiments then? Sounds fine

placid snow May 29, 2018, 9:31 PM

#

Yeah, avg, min and max

feral lodge May 29, 2018, 9:32 PM

#

👌

#

Then just have a go at changing the data to a set of integers 0-17 and try fitting the model again, evaluating the score just like before

placid snow May 29, 2018, 9:33 PM

#

Should i stick to manual data splitting?

feral lodge May 29, 2018, 9:34 PM

#

Trust the machine 🤖👌

#

Just do random draws or something to split your data into training/testing sets

#

Sometimes they'll be bad, sometimes good, c'est la vie

#

If the model is good then the average score will reflect that
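
The "average over many random splits" idea can be sketched as follows. This is only an illustration under assumptions: the data is synthetic, the model is a one-variable least-squares line, and `fit_line` and `r2` are hypothetical helpers rather than anything from the assignment.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = k*x + m; returns (k, m)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    k = sxy / sxx
    return k, my - k * mx

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

rng = random.Random(0)
# Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise
data = [(x, 2 * x + 1 + rng.gauss(0, 1)) for x in range(40)]

scores = []
for _ in range(100):                      # 100 experiments
    shuffled = data[:]
    rng.shuffle(shuffled)                 # random draw for the split
    train, test = shuffled[:30], shuffled[30:]
    k, m = fit_line([x for x, _ in train], [y for _, y in train])
    preds = [k * x + m for x, _ in test]
    scores.append(r2([y for _, y in test], preds))

summary = {"avg": sum(scores) / len(scores), "min": min(scores), "max": max(scores)}
print(summary)
```

Individual splits can be lucky or unlucky, which is why the min and max are worth reporting next to the average.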

placid snow May 29, 2018, 9:36 PM

#

Hm

#

I guess

#

Thank you! Imma go back to tinkering the knobs then

#

Before I go GWcmeisterPeepoE

feral lodge May 29, 2018, 9:38 PM

#

The "legit" way to split your data over a series of experiments using a limited data set is cross validation however: https://en.wikipedia.org/wiki/Cross-validation_(statistics)

Cross-validation (statistics)

Cross-validation, sometimes called rotation estimation, or out-of-sample testing is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the g...

#

Definitely have a look at that, I think i saw something in the docs about it
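
A minimal sketch of what k-fold cross-validation does to the indices, in plain Python. `k_fold_indices` is a hypothetical helper written for illustration; scikit-learn ships its own tools for this in `sklearn.model_selection` (for example `KFold` and `cross_val_score`).

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]          # k roughly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(10, 5))
for train_idx, test_idx in splits:
    print(sorted(test_idx), "held out;", len(train_idx), "used for training")
```

Unlike repeated random draws, each data point ends up in the test set exactly once, which helps when the data set is small.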

placid snow May 29, 2018, 9:39 PM

#

Since the dataset has 6 * 3 features I can see that giving 18 combos

#

But what do I do with data that share features ?

feral lodge May 29, 2018, 9:39 PM

#

🤔 I don't understand

placid snow May 29, 2018, 9:39 PM

#

```
M,trinervia,0.4627
M,trinervia,0.4135
```

#

Etc

hollow lantern May 29, 2018, 9:39 PM

#

avg?

#

😃

placid snow May 29, 2018, 9:40 PM

#

just give both the same category value and keep it as is?

feral lodge May 29, 2018, 9:40 PM

#

Oh

#

Yes, that will be "translated" to

15, 0.4627
15, 0.4135

Or whatever integer you assign that combo
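
The "translation" of a combo to an integer could look like the sketch below. The first two rows are the ones quoted above; the third row and the column names `level`, `species` and `weight` are assumptions added for illustration, and the resulting codes are arbitrary (0, 1, ...) rather than the 15 used in the example.

```python
rows = [
    ("M", "trinervia", 0.4627),
    ("M", "trinervia", 0.4135),
    ("L", "trinervia", 0.3900),   # made-up row, only here to show a second combo
]

codes = {}      # (level, species) -> integer label
encoded = []
for level, species, weight in rows:
    combo = (level, species)
    code = codes.setdefault(combo, len(codes))   # new combos get the next free integer
    encoded.append((code, weight))

print(encoded)
```

Rows that share a combo simply share a code and keep their own weights.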

placid snow May 29, 2018, 9:41 PM

#

alright

#

Just wanted to make sure peepoSmile

feral lodge May 29, 2018, 9:41 PM

#

There's nothing weird about that just so you know. A data set will always be a flawed representation of reality

#

Say we wanted to predict the weight of a human being, using the features "nationality" and "gender" for instance

#

Certainly we can have two swedish men with different weights

placid snow May 29, 2018, 9:43 PM

#

Yeah, you're right

#

Just had one of those "but what if something bad happens" moments

feral lodge May 29, 2018, 9:44 PM

#

The issue is that we lack features, such as "profession", "salary", etc. Those would help us, and they exist IRL -- but we don't have access to them

#

They are so-called "latent features" https://en.wikipedia.org/wiki/Latent_variable

Latent variable

In statistics, latent variables (from Latin: present participle of lateo (“lie hidden”), as opposed to observable variables), are variables that are not directly observed but are rather inferred (through a mathematical model) from other variables that are observed (direct...

placid snow May 29, 2018, 9:44 PM

#

I see

feral lodge May 29, 2018, 9:45 PM

#

As you do more ML you'll encounter this more often. If you plot the weight of all swedish men, for instance, you would quickly see groups, or clusters, forming

#

Why these clusters arise is because of latent features. Maybe most of the men in one cluster were recently divorced, or something

#

No way to know, but we can at least find the clusters and make educated guesses

hollow lantern May 29, 2018, 9:46 PM

#

that probably would make you eat more

feral lodge May 29, 2018, 9:46 PM

#

exactly

#

Finding clusters like this is the main problem in unsupervised learning -- finding patterns that arise due to latent features https://en.wikipedia.org/wiki/Unsupervised_learning

Unsupervised learning

Unsupervised machine learning is the machine learning task of inferring a function to describe hidden structure from "unlabeled" data (a classification or categorization is not included in the observations). Since the examples given to the learner are unlabeled, there is no e...

#

That's a tangent though, not directly related to your problem 😃

#

The point is that without knowing these latent features, we will always see variation in the data

#

Of course, if we had ALL conceivable data, we would have 0 variation

hollow lantern May 29, 2018, 9:49 PM

#

good explanations 👍🏽

feral lodge May 29, 2018, 9:49 PM

#

Cheers 😄

#

Good night friendo

hollow lantern May 29, 2018, 9:49 PM

#

good night

placid snow May 29, 2018, 9:52 PM

#

👍🏾

small ore May 29, 2018, 9:53 PM

#

Wow. I learnt a lot today. Thanks to you guys. Esp Slandon

fathom current May 30, 2018, 3:15 AM

#

I want to be able to train a program to identify a object in a picture

#

By feeding it pictures of what I want it to identify

#

Is keras and tensor flow the tools to do that ?

feral lodge May 30, 2018, 6:46 AM

#

Yes those will work! Pytorch is another library that I find easier. You'll want to have a look at convolutional neural networks

#

Hope you have a lot of training data :)

feral lodge May 30, 2018, 7:10 AM

#

@small ore By the way, after waking up I suddenly realized what you meant by "Is arbitrary numbering a good input training set?" and the answer is no. If we number the combinations 0-17 and regress upon those, we should first have sorted the combinations by, say, average weight or something, so that combination 5 will correspond to a higher weight than combination 4, etc. Otherwise there will be no correlation between the input and the weight, and our linear estimation 'y = kx + m' will fail miserably, which I suspect it did. Completely slipped my mind yesterday 😖

placid snow May 30, 2018, 7:18 AM

#

I tried categorizing the combos of level and species btw

#

Didn't really see any improvements FeelsSadMan

feral lodge May 30, 2018, 7:32 AM

#

Did you try sorting the list of combinations by average weight before assigning the new label?

#

If we don't, there won't be any correlation between the label and the output

#

But even then, in the best case, your regression will just be another way of immediately mapping a combination to its average weight :/
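
Sorting the combinations by average weight before numbering them might be sketched like this. The rows and combo names are made up; in a real experiment the averages should be computed on the training split only, so the test targets don't leak into the labels.

```python
from collections import defaultdict

# (combo, weight) pairs; values are made up for illustration
rows = [("A", 3.0), ("B", 1.0), ("A", 5.0), ("C", 2.0), ("B", 1.5)]

by_combo = defaultdict(list)
for combo, weight in rows:
    by_combo[combo].append(weight)

means = {c: sum(w) / len(w) for c, w in by_combo.items()}
ordered = sorted(means, key=means.get)               # lightest combo first
label = {combo: i for i, combo in enumerate(ordered)}
print(means, label)
```

With this ordering a higher label always means a higher average weight, so a line 'y = kx + m' at least has a monotone trend to fit.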

#

It's such a weird assignment tbh :/

placid snow May 30, 2018, 11:22 AM

#

Yup

#

Was even stated that the highest possible score from all hand ins would be the top grade for the assignment, so if nobody got higher than -0.5.. that would be an A

feral lodge May 30, 2018, 12:49 PM

#

They pit you against each other like fighting roosters 🤔 Is the data available online?

placid snow May 30, 2018, 12:56 PM

#

Not that im aware of

#

We just got a csv with it

#

But i think I've found a solution imma stick to

fathom current May 30, 2018, 1:09 PM

#

So to make a model that can distinguish between let's say different fish. I would need thousands of pictures of each fish I want it to recognise?

feral lodge May 30, 2018, 1:43 PM

#

Quite possible!

placid snow May 30, 2018, 1:46 PM

#

The more the merrier

feral lodge May 30, 2018, 1:51 PM

#

https://arxiv.org/pdf/1805.10106.pdf

In this paper they do some work on recognizing fish. In total they use 27000 images

fathom current May 30, 2018, 2:07 PM

#

Thnx

feral lodge May 30, 2018, 2:20 PM

#

no problemo 🐠

velvet anchor May 30, 2018, 11:55 PM

#

Any ideas on how to go about effectively knowing what to fiddle with to help a network train? Right now I have a network with about 400,000 images in the training set across two classes that are very similar (photoshopped vs not photoshopped) of the same category essentially and its having a hard time not over fitting into one section or the other. Do I reduce data? change activation functions? scale the images larger? all of the above?

small ore May 31, 2018, 3:34 AM

#

@feral lodge The sorting would indeed make it better, but arbitrary numbers still do not make much sense to me. It is like using a step of 1 to distinguish between species, while the actual meaningful things may not even have any reasonable correlation to that function. One-hot encoding seems better, esp. since there is not much data.

#

Number of dimensions are also reasonable
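
For comparison, a minimal one-hot encoding sketch in plain Python. The category values are made up and `one_hot` is a hypothetical helper.

```python
def one_hot(values):
    """Return (categories, vectors) with one 0/1 column per distinct category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    vectors = [[1 if index[v] == i else 0 for i in range(len(categories))]
               for v in values]
    return categories, vectors

levels = ["L", "M", "H", "M"]          # made-up category column
cats, vectors = one_hot(levels)
print(cats)
print(vectors)
```

Encoding two categorical columns with 6 and 3 distinct values this way gives 6 + 3 = 9 input columns, with no ordering or distance implied between categories.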

#

@fathom current From what little I have read, do not apply your recognition codes directly to the original image. Dumb it down ( Grey scale, etc and perhaps other dumbing masks) to make the problem solve in reasonable time.
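
As an illustration of the grey-scale step, here is a small pure-Python sketch using the common luminance weights 0.299, 0.587 and 0.114. The tiny 2x2 "image" is made up, and `to_grayscale` is a hypothetical helper; a real pipeline would use an imaging library instead.

```python
def to_grayscale(image):
    """Convert rows of (r, g, b) pixels to rows of single luminance values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in image]

img = [
    [(255, 255, 255), (0, 0, 0)],     # white, black
    [(255, 0, 0), (0, 0, 255)],       # red, blue
]
gray = to_grayscale(img)
print(gray)
```

This cuts the input from three channels to one, at the cost of throwing away colour information that some problems need.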

fathom current May 31, 2018, 3:37 AM

#

sadly there's an app already that does everything i was considering doing

small ore May 31, 2018, 3:37 AM

#

Some common sense even says things like outlines may be sufficient to determine what these are. ( Not always)

fathom current May 31, 2018, 3:38 AM

#

so i may not be doing any machine learning stuff any time soon 😦

#

back to the drawing board

#

but thank you for the info

small ore May 31, 2018, 3:43 AM

#

Google photos is one app I can think of which does not need a great many photos to recognize faces and search image by faces. Maybe it incrementally betters the prediction model for each face when you add more photos to its database. And going by its silent speed it perhaps dumbs down images a lot and uses only a few bits of information for making indexes. Recently someone known to me was impressed when it could detect a childhood photo of theirs based on a few 10s of their adulthood photos

feral lodge May 31, 2018, 8:58 AM

#

@small ore I won't defend it since I definitely agree with you that it's a very bad representation of the data. 😄 It enforces some strange stuff like a uniform distance 1 between the data points, which is almost certainly false. But I also think we should be very skeptical of a representation that requires fitting in nine dimensions despite originally having only two features 😕 In the end I think neither will work well -- the arbitrary-number representation because it simplifies and assumes too much, and the one-hot representation because it's too high dimensional -- because linear regression simply is a poor model for the data

feral lodge May 31, 2018, 12:09 PM

#

@velvet anchor How different are the photoshopped and real versions of the images?

#

If the data is like, half children's sketches of animals and half photographs of animals it might be better to first detect whether or not it's a sketch and then use one of two convolutional networks to classify which animal it is

#

However if they're fairly similar it's probably better to preprocess the images, keeping only black-and-white outlines, and train on those

#

https://jgeekstudies.org/2017/03/12/who-is-that-neural-network/ Here's a fun blog entry showcasing some image preprocessing techniques

Journal of Geek Studies

JGS editor

Who is that Neural Network?

Henrique M. Soares Independent researcher. São Paulo, SP, Brazil. Email: hemagso (at) gmail (dot) com Pokémon has been an enormous success around the globe for more than 20 years. In this paper, I …

#

Also, in the arxiv paper i linked above they do some preprocessing on real life fish images (page 3) If your photoshop images are realistic-looking I imagine this preprocessing may produce similar results for them as for real-life images

📎 unknown.png

#

Those guys also saw big performance jumps when switching between activation functions, since you mentioned those

📎 unknown.png

polar acorn May 31, 2018, 1:19 PM

#

So I'm trying out some algorithms on this classification problem. But the data cleaning is a hassle. I usually just structure everything in a long script 😔. I have several data sources that need tying together, and each source has several separate datasets for certain time periods. In addition, each dataset for each source has meta info like time offset etc. Would it be wise to create a class for handling all the data sources for each time period? With some function that returns a workable dataset?

#

And in the case that I want to add several of these time period datasets together into a larger dataset: should this also be in a class, or should I just have a function that iterates over some IDs, creates instances of the class, calls the desired output and merges it?
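
One possible shape for this, sketched under assumptions: a small class per source and time period that carries its metadata, plus a plain function for merging. All names (`PeriodDataset`, `cleaned`, `merge_periods`), the record layout and the values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PeriodDataset:
    source: str
    period: str
    time_offset: float                            # metadata: added to raw timestamps
    records: list = field(default_factory=list)   # (timestamp, value) pairs

    def cleaned(self):
        """Return records with the time offset applied, sorted by time."""
        return sorted((t + self.time_offset, v) for t, v in self.records)

def merge_periods(datasets):
    """Plain function: concatenate cleaned periods into one time-ordered list."""
    merged = []
    for ds in datasets:
        merged.extend(ds.cleaned())
    return sorted(merged)

a = PeriodDataset("sensor_a", "2018-01", 10.0, [(5.0, 1.2), (0.0, 0.9)])
b = PeriodDataset("sensor_a", "2018-02", -2.0, [(20.0, 1.5)])
combined = merge_periods([a, b])
print(combined)
```

Keeping the merge as a function rather than another class keeps the per-period object small, and each piece can be tested on its own.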

astral harbor May 31, 2018, 1:43 PM

#

could this motherboard be used to build a gpu-based machine learning supercomputer? or is pci-e x1 over usb too limiting? https://edgeup.asus.com/2018/asus-h370-mining-master-20-gpus-one-motherboard-pcie-over-usb/

Edge Up

Geoff Gasior

The ASUS H370 Mining Master connects 20 GPUs to one motherboard wi...

We ported PCIe over USB to help miners build more efficient farms that connect up to 20 GPUs to a single socket. Find out how the H370 Mining Master uses innovative engineering and refined diagnostics to crunch more cryptocurrency with less downtime.

velvet anchor May 31, 2018, 3:32 PM

#

@feral lodge not very

#

Well kinda. They’re deepfake images

#

So only a small portion has been touched

feral lodge May 31, 2018, 3:54 PM

#

Ah

#

Is it humans?

velvet anchor May 31, 2018, 4:12 PM

#

Yes

feral lodge May 31, 2018, 4:12 PM

#

Tricky 🤔

velvet anchor May 31, 2018, 4:12 PM

#

Right now I'm setup as 7 convolutional layers in a binary classification problem

feral lodge May 31, 2018, 4:12 PM

#

You just feed the images without preprocessing?

velvet anchor May 31, 2018, 4:12 PM

#

I've done a ton of preprocessing

#

Been working at this problem for like 4 months 😛

feral lodge May 31, 2018, 4:13 PM

#

Aw shucks 🤔

velvet anchor May 31, 2018, 4:14 PM

#

I've tried RGB, Greyscale, a Gradient (which is a self created technique of drawing vectors of brightness change within the image), etc etc

#

all the different activation functions, more layers, less layers, etc etc

feral lodge May 31, 2018, 4:14 PM

#

I know there are entropy-based algorithms for detecting tampering with images

#

But that's like for detecting photoshop editing

#

Interesting but not really my area. I'll see if I can find some papers

#

Good project though, lord knows we'll need to be able to detect deepfake media in the near future

velvet anchor May 31, 2018, 4:17 PM

#

Yeah it's a research project with one of my professors and it's just a 2 man team with a limited GPU

#

I'm having to run a batch size of 2 @ 250x250 to even train networks lol

#

It's a super interesting problem because

feral lodge May 31, 2018, 4:18 PM

#

I've read about cloud-based deep learning using tensorflow on google cloud, is that an option? https://cloud.google.com/ml-engine/docs/pricing To increase the computational power

Google Cloud

Pricing | Cloud Machine Learning Engine (Cloud ML Engine) ...

velvet anchor May 31, 2018, 4:18 PM

#

you're essentially trying to detect the noise within an image

#

but that gets lost in training a lot of time

#

cloud based isn't really an option because of pricing but a new GPU may be in the department's future soonish

feral lodge May 31, 2018, 4:20 PM

#

Aight 🤔

#

https://arxiv.org/pdf/1805.04096.pdf I guess this is too simple for you?

#

Looks like it detects copy-and-paste homemade fake images 😄

velvet anchor May 31, 2018, 4:21 PM

#

Yeah

feral lodge May 31, 2018, 4:22 PM

#

https://arxiv.org/pdf/1711.10394.pdf This looks like it handles a bit more challenging stuff. No humans though 😕

velvet anchor May 31, 2018, 4:23 PM

#

And it may turn out that what we’re trying to do is impossible given team size (basically only me) and the hardware. That’s okay too. I just don’t want it to be the case

#

I’ve tried every combination of settings though. I have a master python script that generates models and tests accuracy and it’s either always 1 or always 0 over the test set.
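
A quick way to spot this kind of collapse is to look at the distribution of predictions rather than accuracy alone. The sketch below uses made-up labels; `accuracy` is a hypothetical helper.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [0] * 6 + [1] * 4          # made-up test labels: 60% class 0
always_zero = [0] * 10              # a collapsed model's predictions

acc = accuracy(y_true, always_zero)
pred_counts = Counter(always_zero)
collapsed = len(pred_counts) == 1   # only one class is ever predicted
print(acc, dict(pred_counts), collapsed)
```

A collapsed model scores exactly the share of the class it always predicts, so an accuracy close to the class balance is a warning sign.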

hasty maple May 31, 2018, 4:24 PM

#

Have you tried selu activation and alpha dropout?

velvet anchor May 31, 2018, 4:25 PM

#

Yes on selu no on alpha dropout

hasty maple May 31, 2018, 4:25 PM

#

try alpha dropout after selu, also have you tried Bayesian search for your parameters?

velvet anchor May 31, 2018, 4:26 PM

#

Yes on Bayesian

hasty maple May 31, 2018, 4:26 PM

#

what about other dropouts, batch normalization?

velvet anchor May 31, 2018, 4:27 PM

#

like keras.layers.BatchNormalization?

#

and for drop outs right now im dropping out 0.25 after 3rd layer then 0.1 after the 5th

#

Id post my code but i'm not at the workstation right now to have access to it

hasty maple May 31, 2018, 4:28 PM

#

no github?

velvet anchor May 31, 2018, 4:29 PM

#

Nah I only work on it when I'm getting paid so i didnt have it uploaded remotely to remove temptation 😛

hasty maple May 31, 2018, 4:30 PM

#

ayy

velvet anchor May 31, 2018, 4:35 PM

#

Is there a better ML framework than Keras for images? Like one thats easier to use?

#

I like keras because you can just feed it np.arrays but if theres another im open to switchin

hasty maple May 31, 2018, 4:36 PM

#

pytorch,tensorflow

velvet anchor May 31, 2018, 4:37 PM

#

might look at pytorch but im also more well versed in keras. probably not worth trying to replace 4months of keras knowledge

feral lodge May 31, 2018, 4:48 PM

#

When you say you did Bayesian search, does that mean variational inference?

velvet anchor May 31, 2018, 4:52 PM

#

Maybe, any links for exactly what you're referring to to be sure?

feral lodge May 31, 2018, 4:55 PM

#

https://arxiv.org/pdf/1506.02158.pdf Like this one for instance. Instead of producing point estimates of the weights, place a prior distribution over them, and train to compute the posterior

#

Sort of in its infancy and limited by computation power and bias due to choice of prior, but in theory good against overfitting

velvet anchor May 31, 2018, 4:57 PM

#

Might work if I had more hardware

#

but our machine has a 970 I think

velvet anchor May 31, 2018, 6:01 PM

#

https://hastebin.com/osuduyowin.py

#

here's what I'm working with in my current iteration, just trying distinct activation functions to see what changes

#

The epochs and stuff are low for rapid testing just to see results on a model, i up them to something reasonable if I get promising results

hasty maple May 31, 2018, 6:18 PM

#

have bigger kernels at the start and reduce them as you go deeper in the conv2d layers

velvet anchor May 31, 2018, 6:18 PM

#

Tried that combination as well, tried at one point going from like 64 to 2

hasty maple May 31, 2018, 6:19 PM

#

they should gradually decrease, not decrease at once

velvet anchor May 31, 2018, 6:23 PM

#

Yeah I did

#

I stepped like 64 -> 58 -> ... -> 2 at one point

#

I've just been messing around with this variation for a bit

#

Examples from training images are like

hasty maple May 31, 2018, 6:25 PM

#

All I ever did with images was on MNIST. My experiments showed that having the kernels decrease slowly and the window size decrease gradually helped. Sandwich in dropouts/batch norms as much as possible and they'd do well

velvet anchor May 31, 2018, 6:33 PM

#

One of the big problems I found with using large kernels and stuff was overfitting

#

Because the images are so close to a real image

#

📎 FS-Bean-1.jpg

#

Like that

hasty maple May 31, 2018, 6:49 PM

#

Damn that looks difficult to say it's a fake.

velvet anchor May 31, 2018, 6:49 PM

#

Exactly

#

Sometimes theyre more obvious like this

#

📎 FS-Obama-3.jpg

#

📎 frame218.jpg

#

And then theres these which are basically impossible

hasty maple May 31, 2018, 6:53 PM

#

It's pretty much impossible to solve lol. I think natural images have some sort of static noise in them, maybe artificially created ones don't, you could try to extract that as a feature and feed it to the network.

velvet anchor May 31, 2018, 6:54 PM

#

That's what I tried to do by creating a gradient to measure light

grave axle May 31, 2018, 7:29 PM

#

Hey guys! Just wanted to drop in and ask if there's a relevant channel for python for finance purposes?

#

I have an ongoing project that involves some quant with py, so if there's someone who's good with that please tag them here or tag a relevant channel as I couldnt find one

spark summit May 31, 2018, 8:06 PM

#

@velvet anchor and @hasty maple you might find some interest in research around caricatures

#

the human mind abstracts things in ways that we have very difficult time reproducing in AI

small ore Jun 1, 2018, 12:41 AM

#

@grave axle I would be interested to learn about the same topic too. If you learn about a server/channel or come across/know of any material, please let me know

fresh otter Jun 1, 2018, 7:08 PM

#

Anyone here knowledgeable in Keras? Specifically multilabel class prediction

#

Getting some strange results from predict_generator

velvet anchor Jun 1, 2018, 7:23 PM

#

Im a bit of a keras noob but it's what i've been using for my project

fresh otter Jun 1, 2018, 7:28 PM

#

predict_generator is returning incorrect values i believe

#

I have a model trained on ~1 million images

#

number of classes is 228

#

BATCH = 64
STEPS = len(test_paths) // BATCH

test_seq = TestBatchSequence(test_paths, BATCH)

probs = model.predict_generator(
    test_seq,
    steps = STEPS + 1,
    workers = 5,
    verbose = 1
)
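
One thing worth checking in the snippet above is the step count. `len(test_paths) // BATCH + 1` asks for one batch too many whenever the number of samples is an exact multiple of the batch size; what happens then depends on how `TestBatchSequence` is implemented, which isn't shown. A rounding-up sketch, with `steps_for` as a hypothetical helper:

```python
import math

def steps_for(n_samples, batch_size):
    """Number of batches needed to cover every sample exactly once."""
    return math.ceil(n_samples / batch_size)

BATCH = 64
floor_plus_one_exact = 128 // BATCH + 1   # one step more than there are batches
ceil_exact = steps_for(128, BATCH)        # evenly divisible case
ceil_ragged = steps_for(130, BATCH)       # last batch is partial
print(floor_plus_one_exact, ceil_exact, ceil_ragged)
```

After predicting, `len(probs)` should equal `len(test_paths)`; a mismatch means rows and images are no longer aligned.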

#

Example of the resultant probability array (index 0)

#

Prob: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

#

Shouldn't that be a probability for the classes, not just 1?

#

Also, each subsequent image has the exact same result array

velvet anchor Jun 1, 2018, 7:31 PM

#

If the results are the same for every image, I think it's an issue of over/under fitting. As for 0 or 1, I think some activation functions return only 0 or 1 as a result. @hasty maple might be able to answer definitively if he's around

fresh otter Jun 1, 2018, 7:32 PM

#

indeed

velvet anchor Jun 1, 2018, 7:33 PM

#

are you using softmax / sigmoid in your output layer?

fresh otter Jun 1, 2018, 7:33 PM

#

I'm using sigmoid activation on my last layer, going by this: https://stackoverflow.com/questions/44164749/how-does-keras-handle-multilabel-classification

Stack Overflow

How does Keras handle multilabel classification?

I am unsure how to interpret the default behavior of Keras in the following situation:

My Y (ground truth) was set up using scikit-learn's MultilabelBinarizer().

Therefore, to give a random examp...

hasty maple Jun 1, 2018, 7:38 PM

#

I'm not that used to keras either lol. All I did was mnist data set using it.

velvet anchor Jun 1, 2018, 7:40 PM

#

I know for sure the reason it's 1 is because of how sigmoid handles probability

#

📎 05euYS7InCmDP08ir..png

#

its output looks like this

#

so it will drag output up to 1 or 0 outside a very small subset of values

#

and the reason it's the same is because of presumably some set of images resulting in an overfit, have you tried changing kernel / window size?
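
The saturation effect can be shown with a few lines of plain Python. The logits are made up; the point is that a sigmoid output layer scores each label independently and large-magnitude inputs print as exactly 0 or 1 at low precision, while softmax produces one distribution that sums to 1.

```python
import math

def sigmoid(x):
    """Per-label probability, independent of the other labels."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    """One probability distribution over all classes."""
    m = max(xs)                               # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [12.0, -12.0, 0.5]                   # made-up pre-activation values
per_label = [sigmoid(z) for z in logits]
printed = [round(p, 2) for p in per_label]    # what a short printout would show
dist = softmax(logits)
print(printed, sum(per_label), sum(dist))
```

So several exact-looking 1s in a multilabel output are not impossible by themselves; identical rows for every image are the more suspicious symptom.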

fresh otter Jun 1, 2018, 7:44 PM

#

yeah it might be overfitting, you're right

#

I oughta compare test results with validation to determine this

prime thistle Jun 1, 2018, 7:49 PM

#

kfold that B

lapis sequoia Jun 2, 2018, 12:00 PM

#

what kind of companies do you data scientists guys work for?

#

dont the big companies all use java?

earnest prawn Jun 2, 2018, 12:08 PM

#

not really

lapis sequoia Jun 2, 2018, 12:24 PM

#

guys

#

how do you guys kaggle

naive hornet Jun 2, 2018, 4:44 PM

#

@lapis sequoia I'm no data scientist, but my understanding was that Python is a very strong incumbent in the data science industry because of the many specialized tools written for it and its ease of use. bar R, it might even be the most popular

#

("data science" being a massive and sweeping field that it's probably not fair to lump every portion of under the same umbrella)

placid snow Jun 2, 2018, 5:15 PM

#

I know statoil, the big norwegian oil company uses machine learning to determine if theres oil based on soil samples or something like that. Was a guest lecture about it in my python ML course. Doesnt quite imply that they use python.. but GWcmeisterPeepoShrug

velvet anchor Jun 2, 2018, 5:19 PM

#

Is there a TF equivalent to keras' .flow_from_directory()? In essence, I want to try a tf model with some images, but the few examples I could find either use the built-in .dataset() module or a pickled file, and neither is very helpful in that regard

lapis sequoia Jun 3, 2018, 5:26 PM

#

I think I gonna do a data science bootcamp, there are lot of companies offering that now, costs about 10000$ for 3-5 months, only basic coding background required, almost 100% of students find a job as data scientist afterwards

lapis sequoia Jun 3, 2018, 7:27 PM

#

hello guys

#

i'm having some trouble with training this data set using stacked algorithms

lapis sequoia Jun 3, 2018, 7:59 PM

#

wait nvm

#

i think i know how to fix it

polar acorn Jun 3, 2018, 8:47 PM

#

What is currently the most widely used/best python library for HMM?

lapis sequoia Jun 3, 2018, 9:14 PM

#

hmm?

polar acorn Jun 3, 2018, 9:18 PM

#

Hidden Markov Models

polar acorn Jun 3, 2018, 9:46 PM

#

Seems the answer is pomegranate anyhow..

lapis sequoia Jun 3, 2018, 9:51 PM

#

Data science is simply a cool name for parts of statistics

lapis sequoia Jun 4, 2018, 1:33 AM

#

is data science artificial intelligence?

velvet anchor Jun 4, 2018, 1:48 AM

#

Can be

#

Data science is extremely broad

lapis sequoia Jun 4, 2018, 1:56 AM

#

true

#

by the way

lean ledge Jun 4, 2018, 1:56 AM

#

A part of AI is data science, a part is not

lapis sequoia Jun 4, 2018, 1:56 AM

#

can you guys help me with something

lean ledge Jun 4, 2018, 1:56 AM

#

AI just means something that can make decisions. Doesn't have to be very statistical but it tends to be

lapis sequoia Jun 4, 2018, 1:57 AM

#

oh okay

#

by the way

lapis sequoia Jun 4, 2018, 1:22 PM

#

is data science artificial intelligence?
no

elder basalt Jun 4, 2018, 4:58 PM

#

in linear regression if i z score my data and calculate the constants for my function, how do i invert the z-score constants that is true for my z scored data to original data?

silk schooner Jun 4, 2018, 5:52 PM

#

multiply by standard dev. and then add mean.
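
For the coefficients themselves, the conversion follows from substituting the z-score definitions into the fitted line. If the fit on z-scored data is z_y = a * z_x + b, then on the original scale slope = a * sy / sx and intercept = my + sy * b - slope * mx. A sketch with made-up data and a hypothetical `fit_line` helper, checked against a direct fit:

```python
import statistics as st

def fit_line(xs, ys):
    """Least-squares fit of y = k*x + m; returns (k, m)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    k = sxy / sxx
    return k, my - k * mx

x = [1.0, 2.0, 3.0, 4.0, 5.0]          # made-up data
y = [3.1, 4.9, 7.2, 8.8, 11.0]

mx, sx = st.mean(x), st.pstdev(x)
my, sy = st.mean(y), st.pstdev(y)
zx = [(v - mx) / sx for v in x]        # z-scored inputs
zy = [(v - my) / sy for v in y]        # z-scored targets

a, b = fit_line(zx, zy)                # coefficients in z-score space
slope = a * sy / sx                    # back on the original scale
intercept = my + sy * b - slope * mx

direct_slope, direct_intercept = fit_line(x, y)
print(slope, intercept, direct_slope, direct_intercept)
```

A single prediction made in z-units converts back the simpler way: multiply by sy and add my.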

velvet anchor Jun 4, 2018, 6:07 PM

#

Is there a dataset of normal photos containing people. Not just cropped faces like the LFW or Essex set?

sacred summit Jun 4, 2018, 11:45 PM

#

@velvet anchor This has a wide list of photos containing people: http://image-net.org/synset?wnid=n07942152

velvet anchor Jun 4, 2018, 11:46 PM

#

that's P E R F E C T

#

ty

sacred summit Jun 4, 2018, 11:46 PM

#

No problem

lapis sequoia Jun 4, 2018, 11:48 PM

#

what are some features that you guys commonly use when interpreting data

velvet anchor Jun 4, 2018, 11:49 PM

#

Depends on the data set

small pumice Jun 4, 2018, 11:58 PM

#

Would this be a good channel to ask for help on a neural network program?

sacred summit Jun 5, 2018, 12:00 AM

#

@small pumice Yes

small pumice Jun 5, 2018, 12:00 AM

#

Ok

#

I'm working on a project in which a character in a text-based roguelike-styled world has to find its way around

#

To control it, I made a neural network from scratch

#

no libraries

#

But it isn't working

#

Wait-nevermind

#

I think I figured it out

sacred summit Jun 5, 2018, 12:10 AM

#

Alrighty

lapis sequoia Jun 5, 2018, 12:29 AM

#

@small pumice how did you do it?

#

in one of my future projects i plan on doing that as well

#

well except apply the neural network to something else

small pumice Jun 5, 2018, 12:29 AM

#

Don’t ask me

#

It doesn’t really wprk

#

Work

lapis sequoia Jun 5, 2018, 12:30 AM

#

oh

small pumice Jun 5, 2018, 12:30 AM

#

I’ll post the code later though

#

Maybe someone will know how to fix it

velvet anchor Jun 5, 2018, 12:45 AM

#

Maybe. Coding a NN from scratch though is a very big task. I think you’ll likely save a bunch of time getting tensorflow / Keras / pytorch to work for you than you will coding one from the ground up

sacred summit Jun 5, 2018, 12:47 AM

#

Yeah, probably start from the basics, NN are actually pretty hard to make

velvet anchor Jun 5, 2018, 12:49 AM

#

I’m not saying it’s impossible but it’s definitely a several month task for a team of researchers to get a working version

small pumice Jun 5, 2018, 12:56 AM

#

Yeah

#

It’s just so hard to find TensorFlow tutorials that explain the concepts that are being coded in

velvet anchor Jun 5, 2018, 1:06 AM

#

Try looking for Keras info. I had the same problem

#

It runs as basically a tensorflow wrapper. Almost. Either tensor or thano

sacred summit Jun 5, 2018, 1:35 AM

#

You mean thanos :p

velvet anchor Jun 5, 2018, 1:35 AM

#

maybe. im not familiar with it just seen it before

#

I think its actually theano but im not 100%

small pumice Jun 5, 2018, 1:42 AM

#

snaps fingers

#

Also the recent TensorFlow update makes most tutorials outdated

silk schooner Jun 5, 2018, 1:47 AM

#

a simple neural network with only one or two layers is not hard to code up by yourself if you wanted to... you should use numpy for all the matrix operations

small pumice Jun 5, 2018, 1:49 AM

#

Mine has 2 hidden layers

#

Input layer has 190 neurons, second layer has 16, third has 16, and output layer has 4.
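
A forward pass for a network of that shape (190, 16, 16, 4) fits in a few lines. This is only a sketch under assumptions: random untrained weights, tanh hidden activations and an argmax over the four outputs are all illustrative choices, not the project's actual code, and plain lists are used where numpy would normally replace the loops.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Propagate one input vector through all layers."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]     # hidden activation (an assumption)
    return x

rng = random.Random(0)
sizes = [190, 16, 16, 4]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

observation = [rng.random() for _ in range(190)]    # stand-in for the game state
outputs = forward(observation, layers)
action = max(range(4), key=lambda i: outputs[i])    # index of the largest output
print(n_params, outputs, action)
```

The forward pass is the easy half; training the weights (backpropagation or some other search) is where most from-scratch implementations go wrong.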

silk schooner Jun 5, 2018, 1:52 AM

#

here is a multi-layer neural net i coded with just numpy, if you're interested

https://github.com/jaym910/deep_neuralnet

GitHub

jaym910/deep_neuralnet

deep_neuralnet - Deep neural net implementation in python/numpy

small pumice Jun 5, 2018, 1:52 AM

#

Cool

silk schooner Jun 5, 2018, 1:52 AM

#

its not documented well or anything and its a few years old but maybe you'll find it useful

#

lemme know if u have any specific questions

#

i'd really only recommend coding one up yourself though if you are interested in the challenge or learning more about them or something.... if you just want one to use, id use libraries like other people recommended

small pumice Jun 5, 2018, 1:57 AM

#

Yeah

#

Thanks

velvet anchor Jun 5, 2018, 3:22 PM

#

If I have a NN detecting photoshop, should the training set classes be normal humans and a separate class of photoshopped humans, or would it make more sense to use unphotoshopped images of all possible objects plus a class of photoshopped humans?

velvet anchor Jun 5, 2018, 7:26 PM

#

that new dataset improved things dramatically @hasty maple BTW. with some preprocessing and stuff I've gotten 80% accuracy 😮

hasty maple Jun 5, 2018, 7:27 PM

#

Are you sure it's not overfit on the train/test set :P

velvet anchor Jun 5, 2018, 7:29 PM

#

I mean It could be but I just ran it over a test set of ~12,000 images not included in the training set

#

need more testing to confirm but its a start

nova viper Jun 6, 2018, 2:22 PM

#

Anyone know a server dedicated to Machine Learning ?

austere quartz Jun 6, 2018, 2:55 PM

#

@feral lodge

feral lodge Jun 6, 2018, 3:00 PM

#

I can't link invites, but search for artificial intelligence here: https://discordlist.me/

#

It's got chats for several branches of AI, including machine learning @nova viper

nova viper Jun 6, 2018, 3:02 PM

#

@feral lodge
Thanks for your help

feral lodge Jun 6, 2018, 3:05 PM

#

No problemo friendo

#

Ty for the mention @austere quartz 😏

austere quartz Jun 6, 2018, 3:10 PM

#

😏

hasty maple Jun 6, 2018, 3:14 PM

#

/r/learnmachinelearning has a discord server as well. You could look into that if you're interested @nova viper

feral lodge Jun 6, 2018, 3:35 PM

#

I didn't know about that one, cheers! Here's the reddit post: https://www.old.reddit.com/r/learnmachinelearning/comments/6mfwmf/introducing_rlearnmachinelearning_discord_chatroom/

worn cosmos Jun 6, 2018, 3:35 PM

#

Anyone know of a good way to get certified/prove skills in Python to a potential employer? My MS is in biostatistics but I'd like to get into data science.

velvet anchor Jun 6, 2018, 3:40 PM

#

One of the best ways to prove skills is with a github to showcase but I'm not entirely sure on certification

feral lodge Jun 6, 2018, 3:40 PM

#

Looks like there are organizations that issue python certificates, but looks like a hassle and sometimes expensive 🤔 I'd say it's probably better to have a personal project or two which you can show the employer

#

Agreed on github

velvet anchor Jun 6, 2018, 3:40 PM

#

I know some companies like microsoft are now additionally offering like a certification in their machine learning programs but i don't know how much value they hold to employers

#

https://www.microsoft.com/en-us/learning/mcsa-machine-learning.aspx like this

Microsoft Learning

worn cosmos Jun 6, 2018, 3:42 PM

#

Alright, thanks for the advice guys. I've got some basic stuff up on GitHub but it's not really data-science related. Do you have any advice on something I might want to look into project-wise? Or would this be a better question for a career counselor or someone in the industry?

feral lodge Jun 6, 2018, 3:43 PM

#

Biostatistics sounds like a perfect application area for machine learning if that's what you're interested in

worn cosmos Jun 6, 2018, 3:44 PM

#

Definitely. We do pretty much everything in SAS though, and I'm already looking into getting certified for that soon. (I'm currently in college) I'd like to expand my skillset though.

feral lodge Jun 6, 2018, 3:45 PM

#

Python has several excellent libraries for statistical analysis

#

Pandas, SciPy, Numpy among others

velvet anchor Jun 6, 2018, 3:47 PM

#

scikit-learn

#

I listen to a python podcast thats been covering a lot of data science applications lately and I hear scikit come up ALL the time

feral lodge Jun 6, 2018, 3:48 PM

#

What's the pod?

velvet anchor Jun 6, 2018, 3:48 PM

#

I think its talk python to me

feral lodge Jun 6, 2018, 3:48 PM

#

Thanks 👌

velvet anchor Jun 6, 2018, 3:48 PM

#

He's been interviewing people who are using it in geoscience or at the allen institute, etc etc and they talk about their software stack and stuff

#

its pretty high level so not like super detailed but 3 you mentioned and scikit are what I hear every episode

lapis sequoia Jun 6, 2018, 3:49 PM

#

I consider going to a data scientist boot camp, but i saw on the curriculum they dont teach python or java, they only use R, is that any good for a job? ;/

velvet anchor Jun 6, 2018, 3:49 PM

#

R and Python are pretty neck and neck for data science

#

I dont have much personal experience with it but it was the original language and pythons currently trying to overtake its spot

feral lodge Jun 6, 2018, 3:50 PM

#

R is a bit like MATLAB if you've ever used it

#

As a programmer I like to think of it as a very advanced, programmable, calculator

#

Great for stats, but you can't really integrate it with a bigger program

#

Which you of course can with python

lapis sequoia Jun 6, 2018, 3:51 PM

#

yeah i dont think its a very versatile programming language, more something for academia and mathematicians

feral lodge Jun 6, 2018, 3:51 PM

#

That's my opinion yeah

#

It can be used in the field though, one of my professors has done a lot of work for the central bank of Sweden, and he works almost exclusively in R afaik

velvet anchor Jun 6, 2018, 3:59 PM

#

seems like both have a pretty even split though

#

I have an interview @ twitch soon as a data analyst and they use Python

hasty maple Jun 6, 2018, 4:05 PM

#

Good luck! Wish I had an interview too >.<

velvet anchor Jun 6, 2018, 4:07 PM

#

Ive prolly sent out 50+ applications haha

#

just looking for something entry level for my last semester

#

dont super care what it is

hasty maple Jun 6, 2018, 4:10 PM

#

:o nice

#

I sent out like 5 a couple weeks ago. I guess I should send more

worn cosmos Jun 6, 2018, 4:27 PM

#

Okay awesome, thanks for the help guys

velvet anchor Jun 6, 2018, 4:54 PM

#

shotgun technique @hasty maple

hasty maple Jun 6, 2018, 5:35 PM

#

@velvet anchor have you done any Data Science projects before applying for all these jobs? I just did one, studied for like 4 months, did one kaggle comp, got good results and that's about it 😂

velvet anchor Jun 6, 2018, 5:36 PM

#

Just this research fellowship over deepfakes

#

But I also almost have a BS in Math and CS

hasty maple Jun 6, 2018, 6:08 PM

#

ah you have a data analytics preferred major

velvet anchor Jun 6, 2018, 6:30 PM

#

Yeah something like that:P

past gazelle Jun 6, 2018, 8:28 PM

#

Are there any data-science centric Discords beyond the Python realm?

velvet anchor Jun 6, 2018, 8:28 PM

#

like for R?

past gazelle Jun 6, 2018, 8:29 PM

#

Just in general, like broader topics than just language-specific stuff

#

I don't have any particular questions I'm more just curious

velvet anchor Jun 6, 2018, 8:29 PM

#

i think /r/learnmachinelearning has a discord

past gazelle Jun 6, 2018, 8:30 PM

#

Ah, cool, thanks!

stoic gyro Jun 6, 2018, 9:15 PM

#

📎 unknown.png

#

i have this problem

#

how can i solve it?

velvet anchor Jun 6, 2018, 9:16 PM

#

youre either not training well enough or displaying too low of confidence

stoic gyro Jun 6, 2018, 9:17 PM

#

ok ok

#

i will figure this out tomorrow

velvet anchor Jun 6, 2018, 9:37 PM

#

Is there a book for tensorflow or pytorch that’s most recommended?

sacred summit Jun 6, 2018, 11:21 PM

#

@velvet anchor I have a pdf on Machine Learning with Tensorflow

#

in python btw

velvet anchor Jun 6, 2018, 11:22 PM

#

That’ll work. I’ve been using Keras mainly just wanna expand my horizons a bit

sacred summit Jun 6, 2018, 11:23 PM

#

I'll dm you

velvet anchor Jun 6, 2018, 11:23 PM

#

K

small ore Jun 7, 2018, 5:23 AM

#

I read some of the posts above. I am unable to see a search option on https://discordlist.me/ . Can someone help or just pass me the invites for the relevant servers please?

DiscordList.me - Advertise your Discord server or find one to join!

Discord Servers - The best list of communities! Advertise your Discord servers or find one you can join! Gaming, anime, programming and many other categories!

#

Also how do I add /r/learnmachinelearning?

#

@hasty maple @velvet anchor

feral lodge Jun 7, 2018, 9:41 AM

#

Remove the [[[....]]] stuff if you didn't notice it 👌

small ore Jun 7, 2018, 9:46 AM

#

Thanks a lot

feral lodge Jun 7, 2018, 9:46 AM

#

No problemo 👌

#

@velvet anchor I've been recommended this book before: https://www.amazon.com/gp/product/1491962291/ref=as_li_qf_sp_asin_il_tl?ie=UTF8&tag=jackchanamazo-20&camp=1789&creative=9325&linkCode=as2&creativeASIN=1491962291&linkId=dba20e99d63c85b4bc0c89940c05cff0 Haven't read it though

small ore Jun 7, 2018, 9:47 AM

#

Is that the Tensor flow pdf?

feral lodge Jun 7, 2018, 9:48 AM

#

Possibly, but I have no idea! That pdf could be anything really

hearty hazel Jun 7, 2018, 9:48 AM

#

!mute @feral lodge 3d Bypassing the spam filter

crimson lightBOT Jun 7, 2018, 9:48 AM

#

:ok_hand: Slandön#5361 is now muted for 3 days (Bypassing the spam filter)

small ore Jun 7, 2018, 9:49 AM

#

😦

#

Too harsh a punishment

hearty hazel Jun 7, 2018, 9:49 AM

#

He knew what he was doing

#

He could have just DM'd you

#

These things are in place for a reason

small ore Jun 7, 2018, 9:53 AM

#

He says he agrees with you but I still think 3 days of muting Slandon is a loss to us more than him

hearty hazel Jun 7, 2018, 9:55 AM

#

!unmute @feral lodge

crimson lightBOT Jun 7, 2018, 9:55 AM

#

:ok_hand: Slandön#5361 is now unmuted

hearty hazel Jun 7, 2018, 9:55 AM

#

Let's make sure that doesn't happen again

feral lodge Jun 7, 2018, 9:56 AM

#

Lesson learned! Thanks boys

hot karma Jun 7, 2018, 2:21 PM

#

plt.scatter(x_factor[:,0],x_factor[:,0])

#

What is [:,0] ?

placid snow Jun 7, 2018, 2:23 PM

#

a slice

#

Is x_factor a numpy or pandas array?

hot karma Jun 7, 2018, 2:23 PM

#

Matplotlib

#

Y there is 0 in [:,0]?

proven crater Jun 7, 2018, 2:24 PM

#

how does that slice work with the comma though. it looks weird

placid snow Jun 7, 2018, 2:25 PM

#

A slice is an object with start, end and step.

#

Wait, with a comma

hot karma Jun 7, 2018, 2:25 PM

#

Oh 0 is step

placid snow Jun 7, 2018, 2:25 PM

#

I totally missed that comma

proven crater Jun 7, 2018, 2:26 PM

#

if it would be the step it would be [::0].
Also a step of 0 would be an infinite loop I guess,

placid snow Jun 7, 2018, 2:26 PM

#

I think it's referencing in a 2d manner with , just ignoring x and only giving y?

#

something like [x,y:x,0] ?

proven crater Jun 7, 2018, 2:27 PM

#

yea it must be something like that

hearty hazel Jun 7, 2018, 2:27 PM

#

It's a tuple

#

one second

proven crater Jun 7, 2018, 2:28 PM

#

and specifically implemented by the class of whatever x_factor is

placid snow Jun 7, 2018, 2:28 PM

#

^

#

Hence why i asked if it's numpy or pandas, I believe they use syntax like that?

proven crater Jun 7, 2018, 2:29 PM

#

They also (ab)use the __getitem__ to make the user write nicer syntax yes. Doesn't often look like this though.

placid snow Jun 7, 2018, 2:29 PM

#

data.iloc[:, 0] pandas dataframe

hearty hazel Jun 7, 2018, 2:29 PM

#

OK

#

It's an empty slice

#

followed by a 0

#

in a tuple

#

(slice(None, None, None), 0)
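
A small sketch that makes this visible: a toy class (hypothetical, written only for this illustration) whose `__getitem__` returns whatever index it receives. It also checks the earlier guess about a step of 0.

```python
class ShowIndex:
    """Toy container that just returns whatever index it is given."""

    def __getitem__(self, index):
        return index


probe = ShowIndex()

single = probe[:]        # a bare colon becomes one slice object
pair = probe[:, 0]       # a comma makes the index a tuple: (slice, int)
ranged = probe[1:3, 0]   # start and stop are filled in, step stays None

# A step of 0 is not an infinite loop: built-in sequences reject it with ValueError.
try:
    [1, 2, 3][::0]
    zero_step_error = None
except ValueError as exc:
    zero_step_error = str(exc)
```

What the tuple means (rows and columns, or something else) is then entirely up to the class that receives it.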

placid snow Jun 7, 2018, 2:31 PM

#

Right, but dependent on what lib it is, it could use 2 slices

hot karma Jun 7, 2018, 2:31 PM

#

plt.scatter(x_factor[:,0],x_factor[:,1])

hearty hazel Jun 7, 2018, 2:31 PM

#

Yeah we really need to know what x_factor is

hot karma Jun 7, 2018, 2:31 PM

#

At second x_factor there is 1 [:,1]

placid snow Jun 7, 2018, 2:32 PM

#

They are used to request most likely different indexed columns in a table

#

So 0 would get first columns data, 1 would get 2nd column

hearty hazel Jun 7, 2018, 2:33 PM

#

that syntax is kinda nasty

hot karma Jun 7, 2018, 2:33 PM

#

X_factor is a variable carying data

proven crater Jun 7, 2018, 2:33 PM

#

It is :P

placid snow Jun 7, 2018, 2:33 PM

#

Would still need to know what x_factor is to give a proper answer

proven crater Jun 7, 2018, 2:33 PM

#

it's like 2d_data[row_selecting, column_selecting]

placid snow Jun 7, 2018, 2:33 PM

#

You can always print(type(x_factor)) if you have no idea GWcmeisterPeepoShrug

hot karma Jun 7, 2018, 2:35 PM

#

F = factorAnalysis(n_components=2)

#

X_factor=f.fit_transform(iris.data)

placid snow Jun 7, 2018, 2:36 PM

#

What lib is factorAnalysis from

#

sklearn?

hot karma Jun 7, 2018, 2:37 PM

#

Sklearn.decomposition

placid snow Jun 7, 2018, 2:38 PM

#

Also did you mean x_factor = ?

#

Else it's a different variable

hot karma Jun 7, 2018, 2:39 PM

#

X_factor=F.fit_transform(iris.data)

placid snow Jun 7, 2018, 2:39 PM

#

with a capital X?

#

Anyways

#

Returns: X_new : numpy array of shape [n_samples, n_features_new]

hot karma Jun 7, 2018, 2:41 PM

#

Small

placid snow Jun 7, 2018, 2:41 PM

#

So yes it's a numpy array

#

Therefore its array[row slicing, col slicing] so x_factor[1:3, 0] for instance would be row 1 and 2 with only data from column 0

#

[: ,0] says give me all the rows, with only data in the first column
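
A quick sketch of those index forms on a small array. The values are made up and stand in for the real x_factor, which would have one row per iris sample.

```python
import numpy as np

# Stand-in for x_factor: 4 samples (rows) by 2 components (columns).
x_factor = np.array([[0.1, 1.0],
                     [0.2, 2.0],
                     [0.3, 3.0],
                     [0.4, 4.0]])

first_column = x_factor[:, 0]    # all rows, column 0
second_column = x_factor[:, 1]   # all rows, column 1
some_rows = x_factor[1:3, 0]     # rows 1 and 2 (stop index excluded), column 0
```

So plt.scatter(x_factor[:,0], x_factor[:,1]) plots the first component against the second, one point per row.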

hot karma Jun 7, 2018, 2:44 PM

#

Thanks

#

From sklearn import datasets

#

Import numpy as np

#

Iris= datasets.load_iris()

#

C= np.corrcoef(iris.data.T)

#

What is the capital T?

placid snow Jun 7, 2018, 2:50 PM

#

~~to me it seems like an alias for target~~

#

I misread something, lemme try again

#

Yeah, it seems to be the ~~target~~

#

It's an array of all data split into multiple lists

#

first list is target

#

I actually don't know what I'm doing. But it's at least every column of the dataset split into lists

feral lodge Jun 7, 2018, 2:57 PM

#

📎 unknown.png

#

It's the transpose I think

placid snow Jun 7, 2018, 2:58 PM

#

Something like that yeah

#

It's data, but transposed
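
A short sketch of why the transpose is there, using random numbers as a stand-in for iris.data (150 rows, 4 columns). np.corrcoef treats each row as one variable by default, so the features have to be moved onto the rows first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for iris.data: 150 samples (rows) by 4 features (columns).
data = rng.normal(size=(150, 4))

# .T swaps the axes, giving 4 rows of 150 values each.
transposed_shape = data.T.shape

# With features on the rows, the result is a 4x4 feature correlation matrix.
c = np.corrcoef(data.T)

# Equivalent without the transpose: tell corrcoef that columns are the variables.
c_alt = np.corrcoef(data, rowvar=False)
```

Without the .T (or rowvar=False) the call would correlate the 150 samples with each other and return a 150x150 matrix instead.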

velvet anchor Jun 7, 2018, 3:01 PM

#

#slandonisfree

#

Also ty for that link @feral lodge gonna pick it up after work

hot karma Jun 7, 2018, 3:02 PM

#

Can anyone give me a cheat sheet for sklearn?

velvet anchor Jun 7, 2018, 3:04 PM

#

What kind of cheat sheet

hot karma Jun 7, 2018, 3:04 PM

#

Sklearn library

placid snow Jun 7, 2018, 3:05 PM

#

Their docs are fairly good imo

#

http://scikit-learn.org/stable/documentation.html

velvet anchor Jun 7, 2018, 3:09 PM

#

Yeah. Their docs are one of the best

#

I’m not sure if there are any quick reference pages though that list like functions often used or whatever

hot karma Jun 7, 2018, 3:13 PM

#

Ow ,That's a good site.

velvet anchor Jun 7, 2018, 6:04 PM

#

Is there a reference for all the output values of different keras activation functions?

#

there's https://keras.io/activations/ but I wanted like a set of possible values

feral lodge Jun 7, 2018, 6:11 PM

#

https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions

#

That OK? Shows the range and a bunch of other properties
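
For a quick feel of those ranges, a small sketch that evaluates three common activations on a few inputs. The functions are written out by hand here instead of calling Keras, and the choice of these three is only an example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))   # output stays strictly between 0 and 1


def tanh(x):
    return math.tanh(x)                 # output stays strictly between -1 and 1


def relu(x):
    return max(0.0, x)                  # output is 0 for negatives, x otherwise


inputs = [-5.0, -1.0, 0.0, 1.0, 5.0]
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)]:
    outputs = [fn(x) for x in inputs]
    print(f"{name:8s} min={min(outputs):.4f} max={max(outputs):.4f}")
```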

velvet anchor Jun 7, 2018, 6:12 PM

#

Yeah that's actually perfect

feral lodge Jun 7, 2018, 6:14 PM

#

Didn't know there was such variety actually 🤔

velvet anchor Jun 7, 2018, 8:32 PM

#

Yeah picking the right activation functions is certainly difficult

#

because theyre all so different

feral lodge Jun 7, 2018, 8:37 PM

#

https://arxiv.org/pdf/1804.02763.pdf

Looks like there's been some work on comparing some of them

velvet anchor Jun 7, 2018, 8:50 PM

#

Noice

#

Ill get paid to read this tomorrow

worn cosmos Jun 7, 2018, 10:37 PM

#

hey, I'm trying to do some time series analysis.

#

year
1998-01-01    71
1998-01-01    60
1998-01-01    65
1998-01-01    83
1998-01-01    72
Name: yieldpercol, dtype: int64

#

this is ts.head()

#

My issue is that I need to combine all the data from each individual year together

#

that is, add up all the entries for 1998, then all the ones for 1999, etc

#

And I'm having trouble figuring out how to do that with pandas

small pumice Jun 7, 2018, 11:07 PM

#

There are so many TensorFlow and Keras tutorials that jump straight to things like MNIST recognition. Does anyone know some good tutorials on neural networks using simple data that you make?
For example, a tutorial that shows how to make a neural network that can add two numbers together? I know it’s simple, but it would be a good way to get the concept down.

placid snow Jun 7, 2018, 11:15 PM

#

The Js tutorial on NN from coding train and 3blue1brown are pretty good

#

https://youtu.be/aircAruvnKk and https://youtu.be/XJ7HLz9VYz0

YouTube

3Blue1Brown

But what *is* a Neural Network? | Chapter 1, deep learning

Subscribe to stay notified about new videos: http://3b1b.co/subscribe Support more videos like this on Patreon: https://www.patreon.com/3blue1brown Special t...


YouTube

The Coding Train

10.1: Introduction to Neural Networks - The Nature of Code

Welcome to Chapter 10 of The Nature of Code: Neural Networks. (http://natureofcode.com/book/chapter-10-neural-networks/) In this video, I provide a brief int...


#

Both not python related, but explain the concepts fairly well
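
On the "add two numbers" idea, a minimal numpy sketch is possible without any framework. This is an assumption about what such a tutorial might look like, not taken from either video: a single linear neuron trained with plain gradient descent on made-up data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: pairs of numbers in [-1, 1] and their sums.
x = rng.uniform(-1.0, 1.0, size=(200, 2))
y = x.sum(axis=1)

# One linear neuron: prediction = x @ w + b.
w = rng.normal(0.0, 0.1, size=2)
b = 0.0
learning_rate = 0.1

for _ in range(2000):
    error = x @ w + b - y
    # Gradients of the mean squared error with respect to w and b.
    w -= learning_rate * 2.0 * (x.T @ error) / len(y)
    b -= learning_rate * 2.0 * error.mean()

prediction = np.array([0.3, 0.4]) @ w + b
```

After training, both weights should sit close to 1 and the bias close to 0, which is exactly what "adding two numbers" means for a linear neuron.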

feral lodge Jun 8, 2018, 10:36 AM

#

@worn cosmos By "add up", do you mean sum? In that case you can probably do something with pandas cumulative sum https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.cumsum.html. If you mean "for each year Y, create a vector/dataframe of all data from the first year to year Y", then you should be able to loop through range(0, length_of_data) and for each index create a slice from 0 to index with data.iloc[:index] or something like that:

https://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges
https://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-integer
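
If "add up" means one total per year, grouping on the year of the index may be the more direct route. A sketch under the assumption that ts is a Series with a datetime index, as the ts.head() output suggests; the 1999 values are made up.

```python
import pandas as pd

# Stand-in for ts: a Series of yields indexed by year-start dates.
ts = pd.Series(
    [71, 60, 65, 83, 72, 50, 55],
    index=pd.to_datetime(["1998-01-01"] * 5 + ["1999-01-01"] * 2),
    name="yieldpercol",
)
ts.index.name = "year"

# One total per calendar year.
yearly_total = ts.groupby(ts.index.year).sum()

# Running total across years, if a cumulative view is what's wanted.
running_total = yearly_total.cumsum()
```

Here cumsum is applied after the grouping, so it accumulates the yearly totals instead of the individual rows.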

trail current Jun 8, 2018, 6:14 PM

#

hey, i'm doing some pretty basic data acquisition and i can't get my live plot to work properly, would anyone mind helping out?

naive swallow Jun 9, 2018, 12:44 AM

#

bot.tags['ask']

arctic wedgeBOT Jun 9, 2018, 12:44 AM

#

ask

Asking good questions will yield a much higher chance of a quick response:

• Don't ask to ask your question, just go ahead and tell us your problem.
• Try to solve the problem on your own first, we're not going to write code for you.
• Show us the code you've tried and any errors or unexpected results it's giving
• Keep your patience while we're helping you.

You can find a much more detailed explanation on our website.

junior hemlock Jun 9, 2018, 11:11 AM

#

Guys I need a little help

#

Can aiohttp and urllib read from websites like YT ?

earnest prawn Jun 9, 2018, 11:11 AM

#

aiohttp and urllib can read from anything which is reachable with http or https

junior hemlock Jun 9, 2018, 11:11 AM

#

Oh thanks

honest relic Jun 9, 2018, 1:35 PM

#

http://jakevdp.github.io/blog/2014/10/16/how-bad-is-your-colormap/

#

https://i.redd.it/r8t2tazc1z211.jpg

#

That's actually hilarious.

hasty maple Jun 9, 2018, 2:23 PM

#

😂

lapis sequoia Jun 9, 2018, 3:33 PM

#

hello

#

how do you guys apply models to the training.csv file

#

and what is a model made of/

#

?

feral lodge Jun 10, 2018, 1:01 PM

#

Hey! It sounds to me like you're unsure what a model is, so this answer is pretty basic. Sorry if I misunderstood 😄 When we observe and take measurements of stuff in the real world there's often good reason to assume that those observations follow a predictable pattern, even if they are seemingly random and independent of each other. A statistical model is a way of mathematically concretizing those patterns, so we can better understand and work more effectively with our data. We will never have enough data to completely accurately model the complexities of real-world relationships, but a simple mathematical model often captures the essential underlying patterns of the observations. What model is suitable depends on the nature of the data:

#

Sometimes our observations are only non-negative integers, like if we were counting the number of spam emails a person receives every day. This kind of data should probably be modelled with a Poisson or binomial distribution.
https://en.wikipedia.org/wiki/Poisson_distribution
https://en.wikipedia.org/wiki/Binomial_distribution

Sometimes they're real numbers without limits, like when measuring the temperature at 21:00 each day in January. That kind of data probably follows a normal distribution or a Cauchy distribution.
https://en.wikipedia.org/wiki/Normal_distribution
https://en.wikipedia.org/wiki/Cauchy_distribution

Sometimes the data consists of real numbers with explicit limits, like if we have a list of estimations of the probability of a turtle egg of a certain species containing a female turtle. Since a probability can only be between 0 and 1, that kind of data can likely be nicely modeled with a beta distribution.
https://en.wikipedia.org/wiki/Beta_distribution
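
To get a feel for the three kinds of data described above, here is a small sketch that draws samples from a Poisson, a normal and a beta distribution with numpy. All parameter values are arbitrary assumptions picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Counts (e.g. spam emails per day): non-negative integers.
spam_per_day = rng.poisson(lam=4.0, size=1000)

# Unbounded real values (e.g. evening temperature in degrees C).
temperature = rng.normal(loc=-2.0, scale=5.0, size=1000)

# Values confined to [0, 1] (e.g. estimated probabilities).
p_female = rng.beta(a=2.0, b=5.0, size=1000)

# Print the observed range of each sample to compare their supports.
for name, sample in [("poisson", spam_per_day), ("normal", temperature), ("beta", p_female)]:
    print(name, sample.min(), sample.max())
```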

#

Instead of only trying to model the distribution of possible values in your data, you're often also interested in modelling the relationship between the different features in your data. In what way does the chance of getting lung cancer increase with each year of smoking? How does it correlate with age? Income? Weight? To model these kinds of relationships we often use a technique called regression. With linear regression we assume the relationship is linear, with polynomial regression we assume the relationship is some polynomial.
https://en.wikipedia.org/wiki/Linear_regression
https://en.wikipedia.org/wiki/Polynomial_regression

If the relationship between features is very complex and high-dimensional, we often need to use a more complex model, like a support vector machine or a neural network.
https://en.wikipedia.org/wiki/Support_vector_machine
https://en.wikipedia.org/wiki/Artificial_neural_network
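
As a small illustration of linear versus polynomial regression, the sketch below fits both to the same synthetic data with np.polyfit. The data is generated from a made-up quadratic plus noise, so the quadratic fit is expected to do better here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with a curved relationship plus noise.
x = np.linspace(0.0, 10.0, 50)
y = 0.5 * x**2 - 2.0 * x + 3.0 + rng.normal(0.0, 1.0, size=x.size)

# Fit a straight line (degree 1) and a quadratic (degree 2) by least squares.
linear_coeffs = np.polyfit(x, y, deg=1)
quadratic_coeffs = np.polyfit(x, y, deg=2)


def mean_squared_error(coeffs):
    """Average squared gap between the fitted curve and the data."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


linear_mse = mean_squared_error(linear_coeffs)
quadratic_mse = mean_squared_error(quadratic_coeffs)
```

Comparing the two errors on the training data only shows which curve fits better, not which generalises better; for that you would still hold out a test set.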

#

Since we haven't seen your training.csv file and don't know what kind of analysis you want to do on this data, we can't really say what kind of model or approach is suitable for you. There're plenty of tutorials online for machine learning in python, maybe check this one out?

https://towardsdatascience.com/simple-and-multiple-linear-regression-in-python-c928425168f9

#

Sorry if this was too verbose or below your level friendo, let us know if you wanted some other kind of direction 😄

#

@lapis sequoia

lapis sequoia Jun 10, 2018, 1:09 PM

#

oh its fine!

#

im kind of new to the whole scene

#

plus im just a high school student lol

feral lodge Jun 10, 2018, 1:10 PM

#

Never too early to start 😄

#

https://www.coursera.org/learn/machine-learning I haven't looked through this course myself, but my friends who have liked it a lot! Might be a good intro to machine learning, and it's free I believe? You enroll and then get access to the video lectures

placid snow Jun 10, 2018, 3:14 PM

#

Just as a headsup Slandon, you can cut down the length of messages by removing the embed sent by each link by just wrapping them in <> <www.google.com> wont send the embed for www.google.com

#

Unless you wanted them ofc 😅

feral lodge Jun 10, 2018, 3:31 PM

#

Oh I had no idea, thanks! I've got link previews turned off, so I always forget they exist

placid snow Jun 10, 2018, 3:31 PM

#

Oh, makes sense. You had quite the wall there ':P

#

Every wiki link had a pretty big image with them

feral lodge Jun 10, 2018, 3:33 PM

#

Live and learn 🤦

hasty maple Jun 10, 2018, 3:34 PM

#

What's a good book for statistics, I'm primarily looking for something small and concise just to go over the concepts, not anything with a ton of derivations and stuff.

young aurora Jun 10, 2018, 8:32 PM

#

Hey all. I'm using Python to do some basic data visualization for a pet project I'm working on. I'm wondering what the best way to plot a timeline (historical, as in multi-year) for the reign of multiple emperors would be. The columns I've got that I think are actually of interest are as follows: Start (in years), End (in years), and ruler name. Basically I'd like to get a chart with all the rulers mapped onto it (just straight lines) but separated so you can see them individually, as they're sometimes overlapping.

#

I'm happy to share the dataset. It's a collection of all Chinese emperors. I'm gonna be releasing it onto my blog once I've finished this last part, but I'm having trouble finding the proper package to use. I found a suggestion that I could use a Gantt chart for this, but can't figure out how to actually work that with my data. I've got it all in a pandas dataframe.

#

Each row is an emperor, with each emperor having a start date, end date, and length of reign. Not sure if that will help with answering the question.

feral lodge Jun 10, 2018, 8:44 PM

#

This link https://plot.ly/python/gantt/#use-a-pandas-dataframe seems to cover plotting pandas data in Gantt charts using Plotly @young aurora

young aurora Jun 10, 2018, 8:45 PM

#

I'll try this! Thank you. I'll report whether it worked or not afterwards.

feral lodge Jun 10, 2018, 8:46 PM

#

Hope it helps!

young aurora Jun 10, 2018, 9:02 PM

#

So I feel like an idiot - but my plot is totally blank.

#

This is the code I used to create it - don't know if this is what you'd need to help.

#

fig = ff.create_gantt(ThreeSovereigns, colors=['#333F44', '#93e4c1', '#93e4c1'], show_colorbar=True, group_tasks=True)
py.iplot(fig, filename='gantt-group-tasks-together', world_readable=True)

young aurora Jun 10, 2018, 9:23 PM

#

Yeah, that one definitely wont work. It just isn't doing what it needs to do. Any other ideas?

feral lodge Jun 10, 2018, 10:40 PM

#

How is ThreeSovereigns defined?

#

@young aurora

young aurora Jun 10, 2018, 10:41 PM

#

ThreeSovereigns = ChinaEmpire[ChinaEmpire.DynastyCode == '00a']

feral lodge Jun 10, 2018, 10:41 PM

#

In one of their earlier snippets they define their data like this

📎 unknown.png

young aurora Jun 10, 2018, 10:41 PM

#

It works fine for creating matplotlib charts etc.

feral lodge Jun 10, 2018, 10:42 PM

#

If you just copy-paste their code, using their example data, does it plot correctly?

young aurora Jun 10, 2018, 10:43 PM

#

Yes. I was hoping there was a more elegant solution than hardcoding in the start and end date - I think the issue may be that my start and end dates aren't in datetime

#

That being said, they're only years (e.g. -2023 Start, -1500 End) so I'm not sure how to convert them into datetime if that's what this requires

#

If it isn't and hardcoding is what it wants, I can do that too. It just seems extremely... bad

feral lodge Jun 10, 2018, 10:45 PM

#

As long as it fits the pattern it should be fine! That is, the data should be of the form [ {"Task" : <Name>, "Start" : <Start time>, "Finish" : <End time>} ]

#

So if you can process your data file and parse it to such a list i imagine it'll work. So then the issue is the dates... If you only have the years you can probably just set the month and day to be the first of January or something
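
A sketch of that conversion without hardcoding anything, going from a dataframe to the list-of-dicts shape described above. The column names (Name, StartYear, EndYear) and the two rows are hypothetical placeholders, and only positive years are handled; BC years are left out here since it is unclear how the plotting side treats them.

```python
import pandas as pd

# Stand-in for the emperor data; column names and rows are placeholders.
emperors = pd.DataFrame({
    "Name": ["Ruler A", "Ruler B"],
    "StartYear": [1368, 1398],
    "EndYear": [1398, 1402],
})


def year_to_date(year):
    """Turn a bare (positive) year into a date string on 1 January."""
    return f"{int(year):04d}-01-01"


# One dict per row, using the Task/Start/Finish keys from the message above.
tasks = [
    {
        "Task": row.Name,
        "Start": year_to_date(row.StartYear),
        "Finish": year_to_date(row.EndYear),
    }
    for row in emperors.itertuples(index=False)
]
```

The resulting tasks list has the same shape as the example data in the Plotly snippet, so it could be passed in its place.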

young aurora Jun 10, 2018, 10:47 PM

#

Yeah, that's totally fine - it's thousands of years and also mythology.

feral lodge Jun 10, 2018, 10:48 PM

#

No idea if it'll handle BC dates nicely though 🤔 I have an early meeting so I'll have to leave, but if you ask in the help channels someone should be able to help

young aurora Jun 10, 2018, 10:48 PM

#

Okay, thank you!

feral lodge Jun 10, 2018, 10:48 PM

#

This has become a sub-problem of the first, so if you show a snip of your data and explain about the negative dates that's probably enough to go on

young aurora Jun 10, 2018, 10:49 PM

#

Okay, cool. Pyplot just isn't playing nice with the numbers, either.

feral lodge Jun 10, 2018, 10:49 PM

#

No problem, hope you can solve it! Feel free to DM me a link to the blog when you're done 😄

young aurora Jun 10, 2018, 10:49 PM

#

-9000 maps, -10000 does not.

#

Will do!

lapis sequoia Jun 10, 2018, 11:06 PM

#

Guys

#

Is the Yahoo scraper of pandas actually broken

#

The Yahoo data reader for stock prices

small ore Jun 11, 2018, 9:41 AM

#

<@&267628507062992896> I suggest pinning Slandon's message above. The one with a lot of links and explaining basic 'model'

hearty hazel Jun 11, 2018, 9:46 AM

#

I agree

small ore Jun 11, 2018, 9:54 AM

#

I think the entire message isn't pinned. It prolly is internally split into two messages

hearty hazel Jun 11, 2018, 9:54 AM

#

Not gonna pin the whole thing

worn cosmos Jun 11, 2018, 5:14 PM

#

Hey, anyone here know much about logistic regression? I'm getting a huge upper limit for a Wald confidence interval and I don't know if it's reasonable or not.

#

fractional_shortening | 75.604| 0.252| >999.999 |

#

the values being the point estimate, and lower/upper Wald CI's respectively

feral lodge Jun 11, 2018, 11:12 PM

#

Looks like it works, but implies a high standard error SE(β-hat) = SD(β-hat)/sqrt(n) for the maximum likelihood estimation β-hat of your coefficient β. n is the sample size.

The Wald 95% confidence interval is usually a Gaussian centered on β-hat whose standard deviation is the standard error of the mean, SE(β-hat). That means that 95% of its density is contained within the two points β-hat ± 1.96 * SE(β-hat), so those two points are where the upper and lower CI limits usually lie.

Since this is logistic regression though, we're working in transformed space. That means the confidence interval is transformed as well, so the interval is rather exp(β-hat ± 1.96 * SE(β-hat)). So for you we have this:

> betaHat <- 75.604
> lower <- 0.252
> 
> # Now, because lower <- exp(betaHat - 1.96*stdErr)
> 
> stdErr <- (betaHat - log(lower))/1.96
> 
> stdErr
[1] 39.2767
> # Pretty high!
> 
> 
> # Checking lower and upper CI limits:
> 
> exp(betaHat - 1.96*stdErr)  # Lower limit
[1] 0.252
> 
> exp(betaHat + 1.96*stdErr)  # Upper limit
[1] 1.85097e+66
> # Very big!
>
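
Since this is a Python channel, here is the same back-of-the-envelope check as the R session above, redone with the standard library. It follows the same reading of the numbers (75.604 treated as β-hat, 0.252 as the lower limit) and makes no claim beyond that.

```python
import math

beta_hat = 75.604   # reported point estimate, treated as beta-hat as in the R session
lower = 0.252       # reported lower Wald limit
z = 1.96            # normal quantile for a 95% interval

# Solve lower = exp(beta_hat - z * std_err) for std_err.
std_err = (beta_hat - math.log(lower)) / z

# Rebuild both limits from the recovered standard error.
lower_check = math.exp(beta_hat - z * std_err)
upper = math.exp(beta_hat + z * std_err)

print(std_err, lower_check, upper)
```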

velvet anchor Jun 12, 2018, 3:45 AM

#

@hasty maple I can find the book we use for my Statistics class at Uni if you want

#

Found it. this is what we used, it was pretty nice, I'm sure you can find old versions quite easily

#

https://www.amazon.com/Probability-Statistical-Inference-Robert-Hogg/dp/0321923278

flat umbra Jun 12, 2018, 4:02 AM

#

Hi, this looks like the scientific and numerical python channel on the python discord server. Is that correct?

velvet anchor Jun 12, 2018, 4:03 AM

#

Yeah

hasty maple Jun 12, 2018, 4:00 PM

#

Wew that's an expensive book

velvet anchor Jun 12, 2018, 4:14 PM

#

Honestly its not so bad. its pricy new but used is reasonable

#

Is there a way to ensure a keras model is free from the GPU to train again?

#

Such as like

While(this):
    train a model
    test accuracy
    free resources, to tweak settings
    lets do the timewarp again

hasty maple Jun 12, 2018, 6:02 PM

#

I've had to manually restart my notebook session to get the GPU freed from the clutches of Keras, let me know if you find a better way 😂

velvet anchor Jun 12, 2018, 6:06 PM

#

Right now I’m running multiple scripts that call a second one to free it

#

So like master.py passes arguments to model.py but I want to simplify it

#

So it frees when model.py exits
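
A minimal sketch of that master/worker pattern using only the standard library. The inline TRIAL_SCRIPT is a placeholder standing in for model.py: it does no training and its "accuracy" is a made-up number. The idea is only that each trial runs in its own Python process, so whatever that process held is released when it exits.

```python
import json
import subprocess
import sys

# Placeholder for model.py: a real script would build, train and evaluate a
# Keras model here. This stand-in only echoes a made-up "accuracy" value.
TRIAL_SCRIPT = """
import json, sys
params = json.loads(sys.argv[1])
accuracy = 1.0 - params["learning_rate"]  # placeholder, not a real metric
print(json.dumps({"params": params, "accuracy": accuracy}))
"""


def run_trial(params):
    """Run one trial in a fresh Python process and return its JSON result."""
    completed = subprocess.run(
        [sys.executable, "-c", TRIAL_SCRIPT, json.dumps(params)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(completed.stdout)


# Loop over settings; each iteration starts and ends its own process.
results = [run_trial({"learning_rate": lr}) for lr in (0.1, 0.01)]
best = max(results, key=lambda r: r["accuracy"])
```

Passing the settings in as JSON and reading the result back from stdout keeps the master script free of any GPU-holding imports.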

hasty maple Jun 12, 2018, 6:17 PM

#

:o script calling a script, is this something different from an import? iirc import does just that

velvet anchor Jun 12, 2018, 6:19 PM

#

I wasn't importing it but it would probably also work

#

Just looking to rewrite my toolkits as we move forward towards an adversarial network with 6 months of python knowledge

#

instead of the trash thats like taped together

hasty maple Jun 12, 2018, 6:21 PM

#

lol

velvet anchor Jun 12, 2018, 6:22 PM

#

I took this research position with actually 0 knowledge of python really

#

So it was like, quick sketch of what I wanted to do in C#, Port to python for Keras & TF options, hold the codebase together with prayer

hasty maple Jun 12, 2018, 6:25 PM

#

You can learn python in a week tbh, it's not that difficult

velvet anchor Jun 12, 2018, 6:26 PM

#

Yeah for basic stuff

#

but decorators, generators, etc etc are stuff that are also easy but not immediately apparent when you need them plus all the other libraries that are core parts of writing correct python definitely takes more than a week

hasty maple Jun 12, 2018, 6:28 PM

#

ah true, I never learnt classes, decorators, generators and the like as I haven't found any use for them yet

velvet anchor Jun 12, 2018, 6:34 PM

#

They're nice

#

Generators are nice for datasets where you don't have a standard way of iterating but dont need (or can't fit) the whole set in memory
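
A small sketch of that idea: a generator that hands out a dataset in batches without ever holding all of it in memory. The function name and the use of range() as a stand-in for data read from disk are assumptions for illustration.

```python
def batches(samples, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    batch = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # leftover items that did not fill a whole batch
        yield batch


# range() stands in for a large dataset read lazily from disk.
all_batches = []
for batch in batches(range(10), batch_size=4):
    all_batches.append(batch)
```

Only one batch exists at a time inside the generator, so the same loop works whether the source has ten items or ten million.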

#

https://paste.pydis.com/vaqofeyoxi.coffeescript monkeys with typewriter method of CNN training

#

@hasty maple

lilac shadow Jun 12, 2018, 7:38 PM

#

i have an interest in learning about neural networks, but i don't really know enough maths to do much with it (though i seem to pick stuff up quite quickly). i don't really have the willpower to learn a whole bunch of mathsy stuff straight up because i'd prefer to apply it and see what it does, rather than simply knowing what it does. essentially, is it possible to work with some of the more simpler aspects of neural nets without having a lot of mathematical knowledge at the beginning?

velvet anchor Jun 12, 2018, 7:39 PM

#

Yeah neural networks don't require a ton of math knowledge for categorization that's more of a data analytics type of problem

#

but not having a strong math background won't hurt too bad with NNs

lilac shadow Jun 12, 2018, 7:39 PM

#

i see

#

because i had a look at a sort of "hello world" example a while back and, even though i didn't know exactly how everything worked, i believe i got the general idea at least. i guess if i fiddled with stuff more to see what values affect certain things, i would be able to develop a better understanding of how stuff works together like that

velvet anchor Jun 12, 2018, 7:48 PM

#

https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py is a simple example btw @lilac shadow

GitHub

keras-team/keras

keras - Deep Learning for humans

lilac shadow Jun 12, 2018, 7:48 PM

#

ooh okay i'll take a look ^^ thanks!

#

i'll have to look up what a lot of this stuff does haha

velvet anchor Jun 12, 2018, 7:49 PM

#

And then mine as well with the pydis link above is an example of a network

#

but it's not standard because im kind of just room full of monkeys with typewritering my parameters

lilac shadow Jun 12, 2018, 7:50 PM

#

lol

velvet anchor Jun 12, 2018, 7:51 PM

#

@feral lodge gonna come in and lay down the science on why what I said has been wrong 😛

lilac shadow Jun 12, 2018, 7:51 PM

#

haha slandon is amazing at this stuff from what i've seen

#

👍

feral lodge Jun 12, 2018, 7:51 PM

#

If you don't mind a little bit of reading i would recommend checking out at least the first few chapters of this book: http://neuralnetworksanddeeplearning.com/chap1.html I think it's a wonderful introduction to NNs! The math is light also

lilac shadow Jun 12, 2018, 7:52 PM

#

ooh, now that is interesting

#

i've always been fascinated at NN's recognising stuff in images

#

and things like that

velvet anchor Jun 12, 2018, 7:53 PM

#

The math behind categorization is largely abstracted from the user

#

for Keras at least you just kind of throw your image as np.array() to the model and it tells you if it fits in one of your categories

lilac shadow Jun 12, 2018, 7:55 PM

#

i like to know how stuff works behind the scenes too, yeah?

feral lodge Jun 12, 2018, 7:55 PM

#

Image stuff is usually handled using what's called a convolutional neural network -- an example of which you'll find in Clay's github link up there. One thing I like very much about the learning example in the book there is that it shows that even plain old feed-forward neural networks can be used for images (though in a limited capacity)

lilac shadow Jun 12, 2018, 7:55 PM

#

i see

#

all this terminology is going straight over my head so i'm going to be doing a lot of googles when i get round to looking at this in detail :D

velvet anchor Jun 12, 2018, 7:56 PM

#

It's not super difficult to pick up

#

there's just a LOT of techniques for it

lilac shadow Jun 12, 2018, 7:57 PM

#

oh, that makes sense

#

so you just need to find the best algorithms to do what you need to do, basically?

#

obviously easier said than done

velvet anchor Jun 12, 2018, 7:58 PM

#

Best parameters, activation functions, etc

lilac shadow Jun 12, 2018, 7:58 PM

#

yeah

velvet anchor Jun 12, 2018, 7:58 PM

#

some is pretty easy but when dealing with images that are really close together it gets difficult to distinguish

#

I've been working on this image classification problem for like 3-4 months now for example

lilac shadow Jun 12, 2018, 7:59 PM

#

i imagine it's a fuckton of optimisations to do as well

velvet anchor Jun 12, 2018, 8:01 PM

#

@feral lodge what do you know about genetic algorithms?

feral lodge Jun 12, 2018, 8:03 PM

#

A big part of it is also figuring out clever ways to preprocess your data. Clay's working with images, for example, but he can't just toss his images through the net -- he also has to preprocess them with stuff like this, to remove unnecessary noise and bring out important features

📎 unknown.png

#

Not much at all actually, just what I remember from the AI intro course 😄

#

I'm sure I can google a bit to look smart though 😏 👌

lilac shadow Jun 12, 2018, 8:05 PM

#

oh yeah, i know about pre-processing ^^ i always imagined it to be a way of making the data more "standard" i guess you could say

velvet anchor Jun 12, 2018, 8:05 PM

#

My boss was just explaining them to me and mentioned how they might be helpful to randomly tweak parameters

#

It can be. It can also be used to help guide the important parts too

lilac shadow Jun 12, 2018, 8:06 PM

#

yeah, that makes sense

velvet anchor Jun 12, 2018, 8:07 PM

#

I didn’t really have a question about them though Slandon. Just didn’t know if you’d heard of them

feral lodge Jun 12, 2018, 8:07 PM

#

That's super interesting, haven't seen it before. Here's a paper on it you can get paid for reading https://arxiv.org/pdf/1712.06567.pdf

velvet anchor Jun 12, 2018, 8:08 PM

#

😂. I’ve got it almost implemented actually already

feral lodge Jun 12, 2018, 8:08 PM

#

Nice!

velvet anchor Jun 12, 2018, 8:08 PM

#

Just gotta make it object oriented and pretty

feral lodge Jun 12, 2018, 8:09 PM

#

Got this stuff on an open github?

velvet anchor Jun 12, 2018, 8:10 PM

#

Nah

#

Not until we publish

#

I keep the network stuff kinda open but the pre processing is hidden 😛

feral lodge Jun 12, 2018, 8:11 PM

#

That's cool to be part of a publication before your graduation 😮

velvet anchor Jun 12, 2018, 8:12 PM

#

Yeah

#

and lead author 🤑

#

Pre processing is kept secret because it's the magic I guess (and it's ugly AF code wise because I wrote it Day 1 of learning Python so it takes around 72hours to do 12,000 images LOL)

feral lodge Jun 12, 2018, 8:14 PM

#

😄

#

That's real nice though man, hope it turns out well

velvet anchor Jun 12, 2018, 8:15 PM

#

But if you want to read into it, it's based on light measurement and ELA

feral lodge Jun 12, 2018, 8:15 PM

#

I can read the paper laterz 😏

velvet anchor Jun 12, 2018, 8:15 PM

#

ELA is such a genius method of image forensics honestly

#

but it only works on jpgs 😦

#

https://en.wikipedia.org/wiki/Error_level_analysis

Error level analysis

Error level analysis is the analysis of compression artifacts in digital data with lossy compression such as JPEG.

#

but I keep all the stuff off public repos too @feral lodge so I don't work on them without getting paid hahaha

feral lodge Jun 12, 2018, 8:18 PM

#

Don't you have the thirst for knowledge? 🤓

#

ELA looks cool, never heard of it. I've hardly worked on images at all

velvet anchor Jun 12, 2018, 8:19 PM

#

I do but I also have a thirst for not starving

#

Although with that being said, I'm currently working on my classes while off the clock xd

feral lodge Jun 12, 2018, 8:24 PM

#

hey me too

#

But i have some exams in August

#

What're your courses?

velvet anchor Jun 12, 2018, 8:27 PM

#

I meant classes like OOP classes my mistake haha but

#

I'm taking C++, Algorithms, Philosophy and Tech writing this summer

#

then I finish in the fall with Operating Systems, Programming Languages, Senior Design, and Assembly 4

feral lodge Jun 12, 2018, 8:35 PM

#

I have ascended 👼

#

That's some good stuff though! How far into your education are you?

velvet anchor Jun 12, 2018, 8:36 PM

#

I graduate in dec

feral lodge Jun 12, 2018, 8:37 PM

#

Master's?

velvet anchor Jun 12, 2018, 8:37 PM

#

Bachelors

#

But a degree in math and CS

feral lodge Jun 12, 2018, 8:44 PM

#

That's great! I didn't touch ML until after my bachelor's were completed

velvet anchor Jun 12, 2018, 8:44 PM

#

Here's my genetic algorithm btw @feral lodge but I haven't been able to test it yet 😛

#

import netparams
import random


class Genetic:

    def __init__(self):
        self._population = []
        self.actfunc = ['relu', 'selu', 'linear', 'tanh', 'softmax', 'elu', 'softplus', 'softsign', 'sigmoid']
        self.paramlist = ['window1', 'window2', 'window3', 'window4', 'window5', 'window6', 'conv_depth_1', 'conv_depth_2',
                     'conv_depth_3', 'conv_depth_4', 'conv_depth_5', 'conv_depth_6']
        self.actlist = ['activation1', 'activation2', 'activation3', 'activation4', 'activation5', 'activation6',
                   'activation7', 'activation8']
        # Build the population last: createpops() reads the lists above.
        self.createpops()

    def createpops(self):
        for _ in range(9):
            child = netparams.NetworkParams()
            for attrib in self.paramlist:
                child.setval(attrib, random.randint(1, 36))

            for slot in self.actlist:
                child.setval(slot, random.choice(self.actfunc))

            child.setval('hidden', random.randint(400, 1600))
            self._population.append(child)

    def evolve(self):
        # random.sample returns two distinct parents in one call.
        parent1, parent2 = random.sample(self._population, 2)

        child = netparams.NetworkParams()
        # random.choice takes a single sequence, so the two candidate values go in a list.
        # 'hidden' is inherited too so the child is a complete parameter set.
        for attrib in self.paramlist + self.actlist + ['hidden']:
            child.setval(attrib, random.choice([parent1.getval(attrib), parent2.getval(attrib)]))

        return child

    def compare(self, child):
        # Every member needs a 'fit' score before this is called.
        # Replace only the least fit member, and never mutate the list while iterating over it.
        worst = min(self._population, key=lambda member: member.getval('fit'))
        if worst.getval('fit') < child.getval('fit'):
            self._population.remove(worst)
            self._population.append(child)

#

class NetworkParams:

    def __init__(self, **kwargs):
        for key,value in kwargs.items():
            setattr(self,key,value)

    def getval(self, networkparam):
        return getattr(self, networkparam)

    def setval(self, networkparam, value):
        setattr(self,networkparam, value)

round current Jun 12, 2018, 8:45 PM

#

Would this be an appropriate place to ask for help concerning Matplotlib and Python?

velvet anchor Jun 12, 2018, 8:45 PM

#

Sure

#

actually, maybe.

#

this is more for analytics / ML so it depends on what you're asking about within it 😛

round current Jun 12, 2018, 8:46 PM

#

Aperture redirected my help request to this channel

velvet anchor Jun 12, 2018, 8:46 PM

#

Ask away 😃

round current Jun 12, 2018, 8:46 PM

#

It is about plotting a polar plot essentially.

#

I am working on generating a Radar PPI Scope using matplotlib. I need fine control of how major and minor ticks are handled along with tick labeling. Since plt.polar does not offer sufficient control over these parameters (to my knowledge), I have opted to use a PolarAxes transformation and AxisArtist functions to get the control I need. However, I have run into difficulties with how tick label printing and minor tick marks are handled. The picture below is an example PPI template that I seek to recreate.

#

📎 Radar_PPI_Outline.jpg

naive swallow Jun 12, 2018, 8:47 PM

#

~~the channel description was always like this~~

round current Jun 12, 2018, 8:48 PM

#

And this is what I have recreated thus far. North bearing corresponding to 0°.

#

📎 Radar_PPI.png

#

I can't figure out how to get minor ticks to print every 1°. Additionally, I cannot get the major axis tick labels to print every 15° starting at 0.
My current Code: https://paste.pythondiscord.com/urozeduzov.py
These issues have been stumping me for the last couple of days, so I figured it was time to ask for some advice. 😛

velvet anchor Jun 12, 2018, 8:50 PM

#

Lemme get matplot installed and such and i'll take a look

#

and by that I mean we're taking our break in class now so BRB 10

#

Damn can't get matplot installed

round current Jun 12, 2018, 9:05 PM

#

Which dependency manager are you using?

velvet anchor Jun 12, 2018, 9:05 PM

#

pip

#

Keeps failing for no reason

#

(in a virtual environment)

#

not in a virtual environment it says operation isn't permitted

round current Jun 12, 2018, 9:08 PM

#

Interesting. I am running it in a virtualized environment right now. I use MiniConda though as my dependency manager.

young aurora Jun 12, 2018, 10:03 PM

#

Here's a cool thing I made:

#

📎 LengthofReignSeparatedbyDynasty.png

naive swallow Jun 12, 2018, 10:08 PM

#

ooh that's some pretty pretty data

young aurora Jun 12, 2018, 10:09 PM

#

You bet it is. It's also problematic because it includes mythical beings, but hey, what can you do

dreamy tartan Jun 13, 2018, 11:51 AM

#

Hi everyone,

Has anyone tried to predict words from letters? Or give word suggestions.

I want to train a model for my language with my own data and i want to predict words from letters or give word suggestions.

I'm open to all suggestions

small ore Jun 13, 2018, 12:44 PM

#

@lilac shadow Andrew Ng's course is good for someone who fears math. He even teaches basic matrix multiplication and skips over derivations which require the simplest of PDs and straight away goes to the final result and concentrates more on discussing it

feral lodge Jun 13, 2018, 1:28 PM

#

I've heard Ng's course is fantastic 👌

#

@dreamy tartan Do you mean like autocorrect, or, easier, autocompletion? Or does your language use other symbols like фкушщк αιερξςονςγ বুগবডু্সুকতু and you want to predict those kinds of words using the abc alphabet? 😄

#

Can you show us a small example of what the program should be able to do?

#

@round current I've never used matplotlib, but this guy https://stackoverflow.com/a/44657941 seems to have created major and minor ticks using some other approach than transformation

#

I'm not sure how to interpret the graph @young aurora, could you explain? For Yuan for instance, the bar goes between 3ish and 12ish years, but wikipedia says the dynasty lasted from 1260 to 1368 🤔 Those dates seem to be pretty exact, so why do you have errors bars?

#

📎 unknown.png

young aurora Jun 13, 2018, 1:46 PM

#

Oh, so this isn’t the length of the dynasty - this is the length of time for individual rulers!

#

Also, it’s all based on the traditional dating used by Chinese historians (AKA the old one) rather than newer dating methods.

#

I should be more clear with the title/X label.

feral lodge Jun 13, 2018, 1:49 PM

#

Hmm, but these were the Yuan emperors -- only Kublai lasted for 34 years, but I'm interpreting the graph to say he ruled for maybe 2 years 😄

📎 unknown.png

#

And Temür lasted 35 years, but the tick dividing the Yuan bar in two is far from the middle 😄

young aurora Jun 13, 2018, 1:51 PM

#

So this is important to understand for chinese dating

feral lodge Jun 13, 2018, 1:51 PM

#

Lay it on me

young aurora Jun 13, 2018, 1:51 PM

#

The emperors aren’t necessarily what you see on Wikipedia

#

These are taken not from historians in the modern, technical sense, but rather from court records made and changed much later

#

Think of it as “edited history”

#

I’ll go check the data for them, though, and give you a more complete answer

#

It’s 398 total emperors, haha

feral lodge Jun 13, 2018, 1:54 PM

#

Very interesting! Send it as a DM so we don't scare away new questions here

dreamy tartan Jun 13, 2018, 2:01 PM

#

@feral lodge my language uses the latin alphabet 😃 Peter Norvig's approach helped me a lot and i think it solved my problem. I'm building a spell checker with it now. For autocorrect and autocompletion i'm guessing i need something like this too. Am i correct?

round current Jun 13, 2018, 2:06 PM

#

@feral lodge That is what I ended up doing. I generated small line segments prior to the transformation to create the minor and major ticks. Far from an elegant solution, but workable. I am satisfied with the end result.

📎 Radar_PPI.png

young aurora Jun 13, 2018, 2:12 PM

#

That looks cool!

feral lodge Jun 13, 2018, 2:16 PM

#

This is the Peter Norvig approach, right? https://norvig.com/spell-correct.html Looks like it already functions as an autocorrector! For autocompletion a good starting approach is to just keep track of the letters the user has written, and keep a list of all words in the dictionary that begin with that sequence, sorted by how common the word is (if you have that info).
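
A rough sketch of that autocompletion idea, assuming word frequencies come from counting a training text. `build_counts` and `complete` are hypothetical helper names, and the linear scan over all words is only meant for small vocabularies.

```python
from collections import Counter


def build_counts(text):
    """Count how often each lowercase word appears in a training text."""
    return Counter(text.lower().split())


def complete(prefix, counts, limit=3):
    """Return up to `limit` known words starting with `prefix`, most common first."""
    matches = [word for word in counts if word.startswith(prefix)]
    # Sort by descending frequency, then alphabetically so ties are stable.
    matches.sort(key=lambda word: (-counts[word], word))
    return matches[:limit]


counts = build_counts("the then the there these the then")
print(complete("the", counts))
```

For a large dictionary, a sorted word list with binary search or a trie would avoid scanning every word on each keystroke.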

#

Yeah, that looks awesome, great job 😮

velvet anchor Jun 13, 2018, 2:22 PM

#

Nice job @round current glad you got it working

small ore Jun 13, 2018, 3:16 PM

#

Slandon! You know the universe

hasty maple Jun 13, 2018, 3:33 PM

#

I don't know why I was tagged to that haste bin Clay 😕

velvet anchor Jun 13, 2018, 3:33 PM

#

Oh just because it was a dumb solution to just like random parameters and rerun it

#

😛

#

what makes more sense from an OOP perspective for an evolutionary algorithm? wrapping the model + parameters inside of an overarching simulation class or just letting the model be a procedural setup that calls parental gene manipulation as needed?

#

also because doing it that way does free up the model memory Ichi, it seems just running del model and gc.collect() will clear the GPU allocation allowing for another model to run again

#

and, I thought, you mentioned being interested in a solution from within the same script

hasty maple Jun 13, 2018, 3:42 PM

#

ah yeah I was, but it was hard to follow the code as I checked back a day later and wasn't really sure why I was tagged.

I'll keep del model and gc.collect() in mind. Do I need to import anything to run gc.collect()?

velvet anchor Jun 13, 2018, 3:47 PM

#

You just need import gc, it's in the standard library so there's nothing to install
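
A small sketch of the `del` plus `gc.collect()` pattern. It uses a stand-in object rather than a real Keras model, so it only demonstrates the Python side. `FakeModel` is hypothetical, and whether GPU memory is actually returned depends on the backend (with Keras, `keras.backend.clear_session()` is often called as well).

```python
import gc
import weakref


class FakeModel:
    """Hypothetical stand-in for a trained model holding large buffers."""

    def __init__(self):
        self.weights = [0.0] * 1000000
        self.me = self  # a reference cycle, so refcounting alone cannot free it


model = FakeModel()
probe = weakref.ref(model)  # lets us observe whether the object is still alive

del model     # drops the name; the cycle keeps the object around for now
gc.collect()  # the cycle collector reclaims it

print(probe() is None)
```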

#

Once I get back to the office i'm gonna be rewriting it to add in the genetic / evolutionary algorithm instead of just rand()ing parameters

hasty maple Jun 13, 2018, 3:50 PM

#

hard in the sense I didn't know why I was given the code, the code was easy to understand

velvet anchor Jun 13, 2018, 3:55 PM

#

Yeah just because you'd been the person i'd been primarily talking to about it 😛

hasty maple Jun 13, 2018, 4:19 PM

#

Good luck with the Genetic Algorithms

velvet anchor Jun 13, 2018, 4:30 PM

#

Is there a quick way to calculate how much space a network will take up in memory?

#

Is it just input size^2 * layers?

small ore Jun 13, 2018, 4:46 PM

#

Can someone tell me as to what genetic algorithm is and how and where they are useful?

velvet anchor Jun 13, 2018, 4:47 PM

#

Ok so a genetic algorithm (also called evolutionary algorithm, or a few other terms) is a way used to create a set of parameters that gets better over time

#

It works, at a really high level, like this. Create a population of a number of parameters. Let’s use 4 in this example

#

These parameters may be a list like:
1 layer, 3x3 window, 200x200 input size, sigmoid output function.
3 layers, 10x10 window, 400x400 input size, soft max output function
And 2 more with a different set of parameters you want to adjust

#

Now you take these sets of parameters and compute a score for them for how well they match an optimal output. So for my case, as an example, I’m scoring based on how accurate they are at identifying images.

Now that I have 4 sets of parameters and a score. I can create children. So I’ll take 2 random sets from my population and just randomly pick parameters. So I may take dads input size, layers and moms output function and window

You’ll take this new child set of parameters and score it. If it’s better than one of the other 4, you replace the lowest score and run it again

#

You can also implement “mutations” to your population. So you could take the parameters and add 1 to them or whatever
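
The steps above can be sketched roughly like this. It is a toy version under stated assumptions: `SPACE` and `fitness` are made up for illustration (the real score would be image classification accuracy), and mutation here re-draws a parameter instead of adding 1 to it.

```python
import random

# Hypothetical search space: each parameter maps to the values it may take.
SPACE = {
    "layers": [1, 2, 3, 4],
    "window": [3, 5, 7, 10],
    "activation": ["sigmoid", "softmax", "relu"],
}


def random_individual(rng):
    return {name: rng.choice(values) for name, values in SPACE.items()}


def crossover(mom, dad, rng):
    """Build a child by taking each parameter from one parent at random."""
    return {name: rng.choice([mom[name], dad[name]]) for name in SPACE}


def mutate(child, rng, rate=0.1):
    """With probability `rate`, re-draw a parameter from the search space."""
    return {name: (rng.choice(SPACE[name]) if rng.random() < rate else value)
            for name, value in child.items()}


def fitness(ind):
    # Toy score standing in for "accuracy at identifying images".
    return ind["layers"] + ind["window"] + (ind["activation"] == "relu")


def evolve(generations=200, pop_size=4, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        mom, dad = rng.sample(population, 2)  # two distinct random parents
        child = mutate(crossover(mom, dad, rng), rng)
        worst = min(population, key=fitness)
        if fitness(child) > fitness(worst):   # survival of the fittest
            population[population.index(worst)] = child
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```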

#

Does that make sense @small ore

small ore Jun 13, 2018, 4:55 PM

#

Still reading and trying to make sense 😁

#

So, if I understood it right, you set the parameters for each layer(including activation function, window, etc) in the beginning quite randomly and then use the "genetic algorithm" to change parameters ( is that what you call mutation) and see if it scores better?

velvet anchor Jun 13, 2018, 5:01 PM

#

Yeah exactly. You just randomly set parameters. Make children from them and see if it’s better. Then with survival of the fittest you replace the lowest score with the new one if it’s better

#

And repeat until you’re satisfied. And that whole process is called a genetic / evolutionary algorithm

#

It’s useful for optimization. There’s a few techniques like the one above, simulated annealing, swarming, etc

small ore Jun 13, 2018, 5:03 PM

#

Wow. And here I am finding it difficult to understand even the basic NN well

#

There is loads to learn flop

velvet anchor Jun 13, 2018, 5:04 PM

#

There’s a lot to learn but it’s not too hard once you get it 😃

#

I’ve been working with Keras for a few months now and I’m like just barely scratching the surface kinda.

small ore Jun 13, 2018, 5:06 PM

#

I have read forward and backward propagation twice now and while I understand everything that is said, I am yet to figure what are the knowns and what are unknowns in each step

velvet anchor Jun 13, 2018, 5:53 PM

#

So if I'm understanding this correctly. a network with input shape 100x100 in RGB and 3 convolutional layers with 3 filters, with a dense output will take up

100x100x3 = 30000 +
100x100x3 = 30000 +
100x100x3 = 30000 +
100x100x3 = 30000 +
1x100x100 = 10000

Then multiply the batch size by the total amount?
did i do that right? does window size matter at all?
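
A back-of-the-envelope sketch for that kind of estimate. Everything in it is an assumption for illustration: float32 values, 'same' padding with stride 1 and no pooling, and a hypothetical 2-unit dense output. Under those assumptions the window size changes the weight count but not the activation sizes, and only the activations scale with batch size. Training needs more than this (gradients, optimizer state, framework workspace), so treat the number as a lower bound.

```python
BYTES_PER_FLOAT32 = 4


def conv_params(window, in_channels, filters):
    """Weights plus one bias per filter for a 2D convolution."""
    return (window * window * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per output unit."""
    return (inputs + 1) * units


def estimate_bytes(height, width, channels, conv_filters, window, dense_units, batch_size):
    """Rough float32 footprint of the weights plus one batch of activations."""
    params = 0
    activations = height * width * channels  # the input itself
    in_channels = channels
    for filters in conv_filters:
        params += conv_params(window, in_channels, filters)
        activations += height * width * filters  # 'same' padding keeps height x width
        in_channels = filters
    params += dense_params(height * width * in_channels, dense_units)
    activations += dense_units
    # Weights are stored once; activations are stored once per sample in the batch.
    return (params + batch_size * activations) * BYTES_PER_FLOAT32


print(estimate_bytes(100, 100, 3, [3, 3, 3], window=3, dense_units=2, batch_size=32))
```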

velvet anchor Jun 13, 2018, 6:33 PM

#

I know there’s model.timeline() or model.summary() but I believe that the model gets loaded into memory first fully before computing that information. So I’m trying to avoid OOM errors instead of wrapping everything in a try except

velvet anchor Jun 13, 2018, 8:08 PM

#

https://paste.pythondiscord.com/ufigumaron.py
https://paste.pythondiscord.com/hekadomeca.rb
https://paste.pythondiscord.com/ezalocedoj.rb

it lives!

#

@small ore those hastebin links above are an example of an evolutionary algorithm (version 1.0) if you wanted to see one fully written

small ore Jun 13, 2018, 8:23 PM

#

Oh wow. Not sure if I will understand it. I will take a look at it. Thank you

#

Are those ruby files? 😮

velvet anchor Jun 13, 2018, 8:30 PM

#

No it’s python

small ore Jun 13, 2018, 8:33 PM

#

Okay. It does look like python classes and methods but the extension in those hastebins made me think it could be ruby

#

Thanks for getting me interested in it

velvet anchor Jun 13, 2018, 8:39 PM

#

No problem. Haha. Gave me a reason to come into work today and finish it

velvet anchor Jun 13, 2018, 9:12 PM

#

Yeah IDK why they got the ruby extension that's weird.

lilac shadow Jun 13, 2018, 10:23 PM

#

if you edit them to have a .py URL, it'll have python syntax highlighting

velvet anchor Jun 13, 2018, 10:28 PM

#

Also don't hate on my awful use of kwargs. it's got a reason as this build continues fleshing out

quiet gyro Jun 14, 2018, 2:46 AM

#

@small ore To answer the question you asked this morning at a very high level...

Genetic Algorithms are useful for optimizing over extremely large search spaces. They don't necessarily give you the absolute best possible value. However, they often get close, with significantly less computational cost (less computer power).

Think of it as finding a solution that's 90% as good as the best, in 2 days on your laptop, instead of the absolute best in a year on a supercomputer.

velvet anchor Jun 14, 2018, 2:52 AM

#

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.330.1662&rep=rep1&type=pdf paper about it if you want even more high level

small ore Jun 14, 2018, 7:52 AM

#

@quiet gyro Thank you

lapis sequoia Jun 14, 2018, 11:36 AM

#

hey data scientists, I wonder ... I have some time series data that I use as a regressor in a GLM for some other measurement. the problem is that there are some high intensity peaks in this time series that kinda mess up the regression

#

i know that music processing people usually apply some "dynamic range compression" to make the whole recording have a similar amplitude. is that used for other data as well or would it change too much?

small ore Jun 14, 2018, 12:36 PM

#

My two cents: Does the data representing those peak really matter to the representation of the data and prediction if they are removed?

#

@lapis sequoia

velvet anchor Jun 14, 2018, 1:40 PM

#

Yeah that type of processing is used in other places Hypo

velvet anchor Jun 14, 2018, 2:26 PM

#

It might change too much but smoothing out data from massive peaks is always part of the challenge but do keep in mind it might alter the rest of your results too so you might have to redo your formulas with the compressed data
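
One simple way to compress a regressor's range is a signed log transform. This is a swapped-in illustration, not necessarily what audio compressors do and not a recommendation for this particular GLM. `compress` and its `scale` parameter are hypothetical. The transform keeps the ordering of the values and shrinks large peaks much more than small values.

```python
import math


def compress(values, scale=1.0):
    """Signed log compression: near-linear for small values, logarithmic for peaks."""
    return [math.copysign(scale * math.log1p(abs(v) / scale), v) for v in values]


series = [0.5, -1.0, 2.0, 150.0, 1.5]
compressed = compress(series)
print([round(v, 3) for v in compressed])
```

Since the transform is nonlinear, the fitted coefficients then describe the compressed regressor, so earlier results would need to be refit.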

bleak geode Jun 14, 2018, 11:05 PM

#

Any pandas people here? I've got a little thing I'm wondering if I can pick your minds.
I've got two tables, one for hospitals and one for patients.

#

Each patient record has a foreign key back to the hospital table.

#

each patient record also has a mortality result (ALIVE/EXPIRED)

#

I'm trying to add a column to the hospital table of the mortality ratio, that is n_expired / n_total

#

Currently I was thinking about going down the hospital table with iterrows() and grabbing all the patient records for the corresponding hospital and calculate the ratio one by one. I was wondering if there was a more pandas-y/pythonic way of doing this..

velvet anchor Jun 14, 2018, 11:37 PM

#

Not that I know of. #databases may have a clever way to do it that I’m unaware of, but your method is how I’d approach it.

feral lodge Jun 15, 2018, 4:31 AM

#

This would be the R-y way of doing it, and pandas is meant to resemble R, afaik:

import pandas as pd

apples = pd.DataFrame({'color': ["red", "green", "red", "yellow", "red", "green", "yellow"], 'taste': ["nasty", "tasty", "nasty", "tasty", "tasty", "tasty", "nasty"]})

color_stats = pd.DataFrame({'color': ["red", "green", "yellow"]})

def compute_tasty_ratio(c):
    c_apples = apples[apples['color'] == c]
    n_tasty = (c_apples['taste']=="tasty").sum()
    n_tot = c_apples.shape[0]
    return n_tasty/n_tot

color_stats['ratio'] = color_stats.color.apply(compute_tasty_ratio)

print(apples)
print(color_stats)

In this example we add a column to show the ratio of tastiness of apples of different colors

#

@bleak geode

#

Prints

    color  taste
0     red  nasty
1   green  tasty
2     red  nasty
3  yellow  tasty
4     red  tasty
5   green  tasty
6  yellow  nasty

    color     ratio
0     red  0.333333
1   green  1.000000
2  yellow  0.500000
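
For the hospital and patient tables, a grouped version avoids going row by row. This is a sketch with hypothetical column names (`hospital_id`, `outcome`), since the real tables were not shown.

```python
import pandas as pd

# Hypothetical tables and column names.
hospitals = pd.DataFrame({"hospital_id": [1, 2, 3], "name": ["North", "South", "East"]})
patients = pd.DataFrame({
    "hospital_id": [1, 1, 1, 2, 2],
    "outcome": ["ALIVE", "EXPIRED", "ALIVE", "EXPIRED", "EXPIRED"],
})

# 1.0 for expired patients, 0.0 otherwise; the mean of that column within
# each hospital is n_expired / n_total.
expired = (patients["outcome"] == "EXPIRED").astype(float)
ratio = expired.groupby(patients["hospital_id"]).mean()

# Look each hospital's id up in the per-hospital result; hospitals with no
# patient records end up as NaN.
hospitals["mortality_ratio"] = hospitals["hospital_id"].map(ratio)
print(hospitals)
```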

velvet anchor Jun 15, 2018, 2:27 PM

#

😮 @feral lodge I thought you died

feral lodge Jun 15, 2018, 2:47 PM

#

I live!! Got a 70 hour work week during the summer though, so I'll probably not be as active :O

#

Besides, you keep answering all questions for me 👌

velvet anchor Jun 15, 2018, 2:51 PM

#

Ok so @feral lodge general question here. If I wanted to release the evolutionary algorithm onto pip as a framework what makes the most sense. Allowing users to submit their own class and a list of keys to create children of? Or another way?

feral lodge Jun 15, 2018, 2:56 PM

#

Let me check my textbook and refamiliarize myself with GAs 🤔

#

Is it a general GA, or is it specifically for training NN weights?

velvet anchor Jun 15, 2018, 2:58 PM

#

I want to at some point release another version specifically for NN weights but there’s a lack of GA frameworks for Python in general so having a general version too isn’t a bad idea

feral lodge Jun 15, 2018, 2:59 PM

#

Good initiative!

velvet anchor Jun 15, 2018, 3:00 PM

#

At some point for NN weights I envision a constructor where you say like ‘Genetic(pop=X,conv_layers=y,...)’ and have it generate full models

#

But for a regular GA it’s much simpler

feral lodge Jun 15, 2018, 3:04 PM

#

The way I see it the algorithm needs two things: how each individual member of the population is formatted (like, is it a binary vector? A vector of floats? A mix? How long is the vector?) and a fitness function

#

What did you mean by their own class?

#

The inputs to their pseudocode is an initial population and a fitness function

📎 unknown.png

#

Would your framework be fancier than this?

lilac shadow Jun 15, 2018, 3:12 PM

#

genetic algorithms amaze me ^^

feral lodge Jun 15, 2018, 3:13 PM

#

@lilac shadow http://boxcar2d.com/ Check this out if you haven't 😄

velvet anchor Jun 15, 2018, 3:13 PM

#

Not really. But I wasn’t going to implement the fitness function from within (you can see my framework as it is up a few lines). Essentially I envisioned a way so that the user has a simulation class that scores fitness.

From within simulation they instantiate my genetic class providing a list of attributes to be randomized (the things we care about) and a class to contain them. My GA framework will handle the evolutions / population control / etc so that

#

From the users perspective all they have to do would be
GeneticF = Genetic(...)
For X in GeneticGenerator
X.fit = function result.
X.compare()

#

And then after a predetermined number of iterations it would spit out the population of most fit results

#

Where the generator would handle evolving from the pop / mutations / etc

#

The reason you’d want a class to contain values vs a dict doesn’t super matter but it gives you freedom to apply @properties so you can apply processing to specific results later
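
A rough sketch of what that interface could look like. All names here (`Individual`, `Genetic.children`, `Genetic.compare`) are hypothetical and only illustrate the division of labour described: the framework produces candidates and manages the population, and the user scores them.

```python
import random


class Individual:
    """Container for one parameter set; `fit` is filled in by the user."""

    def __init__(self, **params):
        self.params = params
        self.fit = None


class Genetic:
    def __init__(self, space, pop_size=4, seed=None):
        self.space = space
        self.rng = random.Random(seed)
        self.population = [
            Individual(**{key: self.rng.choice(values) for key, values in space.items()})
            for _ in range(pop_size)
        ]

    def children(self, count):
        """Yield the unscored founders first, then `count` crossover children."""
        for founder in self.population:
            yield founder
        for _ in range(count):
            mom, dad = self.rng.sample(self.population, 2)
            yield Individual(**{key: self.rng.choice([mom.params[key], dad.params[key]])
                                for key in self.space})

    def compare(self, child):
        """Replace the least fit member if the scored child beats it."""
        if child in self.population:  # founders are already members
            return
        worst = min(self.population, key=lambda ind: ind.fit)
        if child.fit > worst.fit:
            self.population[self.population.index(worst)] = child


def fitness(ind):
    # User-supplied toy score; the framework never sees this function.
    return ind.params["layers"] * ind.params["window"]


ga = Genetic({"layers": [1, 2, 3], "window": [3, 5, 10]}, seed=0)
for child in ga.children(count=50):
    child.fit = fitness(child)
    ga.compare(child)

best = max(ga.population, key=lambda ind: ind.fit)
print(best.params, best.fit)
```

Because the generator is lazy, each founder is scored by the loop body before the first crossover child is produced, so `compare` always has a fully scored population to work with.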

feral lodge Jun 15, 2018, 3:25 PM

#

Oh, I see 🤔 So they test fitness themselves, and your class functions as a way to produce new individuals which to test?

velvet anchor Jun 15, 2018, 3:25 PM

#

Yeah

#

Unless there’s a better way

feral lodge Jun 15, 2018, 3:26 PM

#

Seems fine to me!

#

I can't think of any direct improvement of what you have right now, except generalizing the code in Genetic

#

I think 🤔

velvet anchor Jun 15, 2018, 3:27 PM

#

Yeah that comes too

#

I just needed something workable and I had 2 hours to write and test before class 😂

feral lodge Jun 15, 2018, 3:29 PM

#

Or maybe I still don't get it completely, the reproduction rates are based on the fitness evaluation right?

#

📎 unknown.png

#

The percentages in that screenie

#

Does the user have to input the fitness evaluation for each individual?

velvet anchor Jun 15, 2018, 3:29 PM

#

Not exactly. It’s just kind of random sampling

feral lodge Jun 15, 2018, 3:29 PM

#

But the point is to choose/evolve fit individuals 😮

velvet anchor Jun 15, 2018, 3:29 PM

#

The user would supply fit = function result inside a loop or whatever

#

And then the framework would handle making sure it fit within the population

#

It is the goal to evolve fit individuals. Yes.

#

However more fit parents doesn’t always equal a more fit offspring

#

So just evolving from the two most fit every time doesn’t guarantee the best result

#

So you randomly sample parents from across your population that are within the ”fittest”

#

So like in your picture above, unless I misunderstood what you asked, you don’t mate just the best two and the worst two. You just mate all of them kinda and see what the best results are. Make a new population of best performers that’s the same size. Repeat

feral lodge Jun 15, 2018, 3:35 PM

#

Oh no, definitely not just pair the best two

#

But I think our two algorithms are slightly different

#

"In this particular variant of the genetic algorithm, the probability of being chosen for reproducing is directly proportional to the fitness score, and the percentages are shown next to the raw scores."

I was thinking this
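
Fitness-proportional selection as described in that quote can be sketched with `random.choices`. The scores below are made up, `select_parents` is a hypothetical helper, and it assumes non-negative scores with at least two of them positive.

```python
import random


def select_parents(population, scores, rng=random):
    """Pick two distinct parents with probability proportional to fitness."""
    indices = range(len(population))
    first = rng.choices(indices, weights=scores, k=1)[0]
    # Zero out the first pick so the second parent is a different individual.
    remaining = [0 if i == first else score for i, score in enumerate(scores)]
    second = rng.choices(indices, weights=remaining, k=1)[0]
    return population[first], population[second]


population = ["a", "b", "c", "d"]
scores = [24, 23, 20, 11]  # made-up raw fitness scores

# Selection probability of each individual: score / sum of all scores.
print([round(score / sum(scores), 2) for score in scores])
print(select_parents(population, scores, random.Random(0)))
```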

velvet anchor Jun 15, 2018, 3:36 PM

#

Maybe. I’m on mobile too so I could just be representing my ideas incompletely

#

Yeah nah. I’m not choosing based on fitness scores in any way. Though i suppose it wouldn’t be hard to implement. I was thinking of just mating all of them together so 1&2, 1&3, 1&4. 2&3, 2&4, 3&4. Make new population of best 4. Repeat
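
A rough sketch of the all-pairs scheme described here, under a few assumptions that are not from the chat: individuals are bit strings, mating is a single-point crossover, fitness is the number of 1 bits, and parents compete with their children for the next generation. All names are hypothetical.

```python
import random
from itertools import combinations


def crossover(a, b, rng):
    """Single-point crossover: head of a, tail of b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def next_generation(population, fitness, rng):
    """Mate every unordered pair once, then keep the best len(population)."""
    children = [crossover(a, b, rng) for a, b in combinations(population, 2)]
    pool = population + children
    return sorted(pool, key=fitness, reverse=True)[:len(population)]


def fitness(bits):
    """Toy fitness: count of 1 bits."""
    return bits.count("1")


# Hypothetical 8-bit individuals
rng = random.Random(1)
population = ["11110000", "00001111", "10101010", "00000000"]
new_population = next_generation(population, fitness, rng)
print(new_population)
```

With 4 individuals this scores 6 children per generation (1&2, 1&3, 1&4, 2&3, 2&4, 3&4) before trimming back to 4.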

feral lodge Jun 15, 2018, 3:38 PM

#

I see 🤔 Is this approach based on a paper or something?

#

It seems time consuming

velvet anchor Jun 15, 2018, 3:39 PM

#

Nah it’s not. Just how my professor explained how it worked. 😂

#

But I guess at the same time GAs aren’t not time consuming either.

feral lodge Jun 15, 2018, 3:42 PM

#

Oh sure, they're basically a random search! But I was thinking your approach adds a lot of extra randomness and time/space requirements, while disregarding a big part of the "genetic" aspect 🤔

velvet anchor Jun 15, 2018, 3:43 PM

#

Yeah could be for sure. Definitely wouldn’t hurt to add in percentage to be chosen

feral lodge Jun 15, 2018, 3:44 PM

#

But, the Genetic class has no knowledge of the fitness function then?

#

That's all handled by the user?

velvet anchor Jun 15, 2018, 3:45 PM

#

Yeah

feral lodge Jun 15, 2018, 3:46 PM

#

So how do you choose the best children after pairing all individuals?

velvet anchor Jun 15, 2018, 3:47 PM

#

For child in babies:
If child.fit > worst population
Replace worst with baby

feral lodge Jun 15, 2018, 3:48 PM

#

And child.fit is computed how? o:

velvet anchor Jun 15, 2018, 3:48 PM

#

The generator function would return a child object to the user

#

The user would score the file off their fitness function

#

Set child.fit equal to its result

feral lodge Jun 15, 2018, 3:49 PM

#

gotcha 👌

velvet anchor Jun 15, 2018, 3:50 PM

#

Then the generator function could, using the child’s newly given fitness score compute a new child object to supply
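
A hedged sketch of how that hand-off could look with a Python generator. `Individual`, `breed`, and `user_fitness` are hypothetical names, not the actual Genetic class discussed here, and the crossover is an assumed single-point one. The framework never sees the fitness function; it only reads `child.fit` after the user has set it.

```python
import random
from itertools import combinations


class Individual:
    def __init__(self, genes, fit=None):
        self.genes = genes
        self.fit = fit  # filled in by the user, never by the framework


def breed(population, rng):
    """Yield one child per pair; the caller must set child.fit before asking
    for the next child, otherwise the comparison below fails."""
    for a, b in combinations(list(population), 2):
        point = rng.randrange(1, len(a.genes))
        child = Individual(a.genes[:point] + b.genes[point:])
        yield child
        # Execution resumes here once the user has scored the child
        worst = min(population, key=lambda ind: ind.fit)
        if child.fit > worst.fit:
            population[population.index(worst)] = child


def user_fitness(genes):
    """Stand-in for whatever the user actually measures."""
    return genes.count("1")


rng = random.Random(2)
population = [Individual(g, user_fitness(g))
              for g in ["11110000", "00001111", "10101010", "00000000"]]
for child in breed(population, rng):
    child.fit = user_fitness(child.genes)  # user side: score the child
print([(ind.genes, ind.fit) for ind in population])
```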

feral lodge Jun 15, 2018, 3:56 PM

#

You said you pair the individuals [1,2,3,4] like this: [1,2], [1,3], [1,4], [2,3], [2,4], [3,4] right?

velvet anchor Jun 15, 2018, 3:56 PM

#

Ye

feral lodge Jun 15, 2018, 3:56 PM

#

Does each pairing only generate one offspring?

velvet anchor Jun 15, 2018, 3:56 PM

#

Right now

#

But there’s nothing making that not be the case later

#

But it seems that most implementations only generate one off spring

feral lodge Jun 15, 2018, 4:01 PM

#

Indeed! But I think most implementations don't choose parents like this 😄 In your case, if 1 is the individual [1111 1111], we can never conceive a child with 1111 in the second half

#

But one bigger thing I was thinking regarding that, is that you generate (N choose 2) children each generation leap, which the user has to test before settling on the N best ones, which become the next generation

#

Whereas in the figure up there, they generate N children each generation leap

velvet anchor Jun 15, 2018, 4:04 PM

#

Right

#

It’s kinda hard to say which is correct I think. The % gets you fewer but possibly better guesses, but this way is a more complete sampling. Hard to say; both have their pros and cons

feral lodge Jun 15, 2018, 4:07 PM

#

Sure! And I'm definitely no expert

#

But if we compare the complexities of f(x) = x choose 2 and g(x) = x we get this

#

📎 unknown.png

#

So for x choose 2 to be a reasonable choice, then the best child must be very similar to a member of the original population

#

Whereas the blue line will quickly move through populations, finding descendants very different from the initial pop
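
To put numbers on that comparison: x choose 2 is x(x-1)/2, so the all-pairs scheme grows quadratically while one child per population slot grows linearly. A small sketch (the function name and the population sizes are arbitrary; `math.comb` needs Python 3.8+):

```python
from math import comb


def evaluations_per_generation(n):
    """Children the user has to score per generation under each scheme."""
    return {"all_pairs": comb(n, 2), "one_per_slot": n}


for n in (4, 10, 100, 1000):
    print(n, evaluations_per_generation(n))
```

At a population of 1000, that is 499500 fitness evaluations per generation versus 1000, which matters a lot when each evaluation means training a network.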

velvet anchor Jun 15, 2018, 4:09 PM

#

Right

lilac shadow Jun 15, 2018, 4:10 PM

#

nerds! :D

feral lodge Jun 15, 2018, 4:11 PM

#

math

📎 unknown.png

velvet anchor Jun 15, 2018, 4:11 PM

#

GWseremePeepoGGERS

#

I wonder if there’s any justification for just random choice. Where each individual has 1/n choice of being a parent

feral lodge Jun 15, 2018, 4:13 PM

#

That starts to approach beam search a bit imo 🤔

velvet anchor Jun 15, 2018, 4:14 PM

#

I was just thinking in terms of NNs on that where

#

Given certain problems you have to beam because small changes can give drastic results

#

Especially with breeding and activation functions

feral lodge Jun 15, 2018, 4:19 PM

#

Sure, that's a pickle

lilac shadow Jun 15, 2018, 4:20 PM

#

ooh i like pickles :^)

feral lodge Jun 15, 2018, 4:20 PM

#

📎 unknown.png

#

Are you sure you want to have the activation function as part of the GA though? When I first heard you explaining the application of GAs in NNs i figured you were just going to evolve the weights

velvet anchor Jun 15, 2018, 4:21 PM

#

It can go both ways

#

Not gonna evolve the output function to keep the uh

#

Answer range the same

#

But everything else is fair game I think

feral lodge Jun 15, 2018, 4:22 PM

#

You da boss 👌

velvet anchor Jun 15, 2018, 4:22 PM

#

Right like obviously it makes no sense to score fitness on networks with sigmoid

#

And then suddenly breed a tanh answer

#

So your last dense layer would stay the same

#

But I think activation functions on the convolution layers can be helpful
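
One way to express "everything but the output layer is fair game" is a mutation step that treats the last gene as frozen. This is only a sketch: the genome layout (a list of activation names, one per layer) and the pool of candidate activations are assumptions for illustration.

```python
import random

HIDDEN_ACTIVATIONS = ["relu", "tanh", "sigmoid", "elu"]  # hypothetical gene pool


def mutate_activations(genome, rate, rng):
    """Randomly swap hidden-layer activations; the last (output) gene is frozen."""
    *hidden, output = genome
    mutated = [rng.choice(HIDDEN_ACTIVATIONS) if rng.random() < rate else act
               for act in hidden]
    return mutated + [output]


rng = random.Random(3)
genome = ["relu", "relu", "tanh", "sigmoid"]  # last entry = output layer
child = mutate_activations(genome, rate=0.5, rng=rng)
print(child)
```

Because the output activation never changes, every individual's answers stay in the same range and their fitness scores remain comparable.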

feral lodge Jun 15, 2018, 4:34 PM

#

Worth a shot! Have you tried training with it yet?

velvet anchor Jun 15, 2018, 4:35 PM

#

It’s running over the weekend

#

I had some bugs to quash which I think have all been taken care of

#

Was gonna go into work today in a couple hours and see if it’s still running over night

velvet anchor Jun 16, 2018, 5:54 PM

#

I feel like I’m missing something with scikit-learn in not knowing what it’s capable of. Is there a resource for showing all the advanced stuff it can do?

spring radish Jun 16, 2018, 6:55 PM

#

see: http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html

#

's got a cool flowchart

small pumice Jun 16, 2018, 8:05 PM

#

There are quite a few tutorials on stock price prediction with machine learning, but many of them are outdated or use the Quandl library, which only has stock data to March of this year. Does anyone know of any tutorials that don’t have these problems?

spring radish Jun 16, 2018, 8:13 PM

#

note: if there was a tutorial that worked for actually getting a profitable model, everyone would be doing it

#

i had a group of friends work on a machine learning model for stock prediction for half a year, and they only got vaguely positive predictions that hypothetically made them money but were never tested live...

#

they couldn't find data without paying big bucks, either.

quiet gyro Jun 16, 2018, 9:01 PM

#

Yeah, anything that's real-time or any useful aggregate analytics is often behind a paywall

young aurora Jun 16, 2018, 10:39 PM

#

@feral lodge, check this out:

📎 xiarulers.png

small ore Jun 16, 2018, 11:42 PM

#

@small pumice DM me for data if it is only for testing your code and learning. I do not know how credible the data is though

velvet anchor Jun 17, 2018, 3:32 AM

#

I like that Joseph

#

I think I liked the vertical timeline more but the round data has a cool feel to it

hasty maple Jun 17, 2018, 6:34 AM

#

Donut chart 😄

lapis sequoia Jun 17, 2018, 11:45 AM

#

https://twitter.com/StuartJRitchie/status/1007699738307723269

Stuart Ritchie (@StuartJRitchie)

Machine learning (and possibly human civilisation) has peaked: a paper describing an algorithm that can automatically generate internet memes. Oh, and the paper is called "Dank Learning". via @samfreis https://t.co/btyqNumBfi

young aurora Jun 17, 2018, 7:03 PM

#

So the round one and the vertical timeline are actually representing the data differently

quiet gyro Jun 18, 2018, 2:33 AM

#

http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html

velvet anchor Jun 18, 2018, 4:54 AM

#

My project this semester is to write some type of technical manuscript. Going to wind up doing a tutorial / user's manual for Keras. Is there any interest in that being posted here? Ideally it'll cover types of networks / optimizations / when to use techniques such as forwards / backwards propagation, etc.

quiet gyro Jun 18, 2018, 5:12 AM

#

Maybe send it to the Keras developers?

#

I'm sure they'd love to add it to their documentation

velvet anchor Jun 18, 2018, 5:14 AM

#

Maybe. I just HATED having to have 5-10 tabs open while researching at the start

#

and found no concrete starting point

#

every tutorial is the same copy pasted MNIST flower petal model with a sentence variation

quiet gyro Jun 18, 2018, 5:23 AM

#

Definitely send it to them then

#

Depending on how their docs are done, you could just take pieces of content from whatever you make and add it, then submit a pull request or something

#

The bane of all software projects is lack of good documentation

velvet anchor Jun 18, 2018, 5:24 AM

#

and no users

quiet gyro Jun 18, 2018, 5:25 AM

#

Often because they have no idea how to use the awesome thing you built and don't have the time or desire to retrace your steps to figure out how it works 😉

velvet anchor Jun 18, 2018, 5:26 AM

#

True

small ore Jun 18, 2018, 11:58 AM

#

People will also like a quick reference guide for switching from one ML module to the other

velvet anchor Jun 18, 2018, 5:50 PM

#

Yeah. It’ll all be in Keras scope though, because unless you’re doing natural language modeling, which the Azure platform is set to excel at, there’s not a huge reason to use like pytorch here, Keras there, etc to my understanding of it

hasty maple Jun 18, 2018, 5:59 PM

#

Aren't the Keras docs easy enough to use? I didn't have any trouble for the most part

velvet anchor Jun 18, 2018, 6:18 PM

#

Yeah the docs are nice but piecing together the docs into a coherent structure was :puke: cuz there wasnt like any nice tutorials

#

except the datasets ones thats used everywhere

#

theyre more geared towards people with some type of knowledge about ML in general not for newbies

hasty maple Jun 18, 2018, 6:44 PM

#

IMO ML shouldn't be like you can pick a library, learn it and use it. You should understand some of the principles and ideas before being able to use them, so in a way the current state of resources is good to filter out overenthusiastic hardly working entrants to this field

velvet anchor Jun 18, 2018, 6:46 PM

#

I dunno. I guess in theory thats not wrong, but theres value in being an entrant building something and understanding it

#

even if you dont know the innerworkings behind it

hasty maple Jun 18, 2018, 6:54 PM

#

the cycle should be learn-->understand-->build, you don't need innerworking level understanding but at least the surface level, so that you would know where else a certain ML concept might be applied to

velvet anchor Jun 18, 2018, 7:19 PM

#

I need a data set of headshots that are larger than 64x64. Anyone have any ideas?

#

Ideally I’d like 400x400 or larger

#

I might scrape insta for them

feral lodge Jun 18, 2018, 7:22 PM

#

No idea, maybe there's one here: https://www.kairos.com/blog/60-facial-recognition-databases

velvet anchor Jun 18, 2018, 7:22 PM

#

Checked all those. Using bits and pieces from like 4 different ones there

#

Thinking scraping insta or Facebook might be best

#

But idk if that’s okay to use in research 😂

feral lodge Jun 18, 2018, 7:25 PM

#

http://discovery.cs.wayne.edu/lab_website/index.php/lsdl/
Probably have to cite this paper if you use: https://arxiv.org/pdf/1706.08690.pdf

#

Reminds me of an industrious little company called cambridge analytica 😏

velvet anchor Jun 18, 2018, 7:27 PM

#

5 million faces. Machine learning intensifies.

#

Computing gradients of all of those will take a month though. Lol.

#

Probably worth

feral lodge Jun 18, 2018, 7:28 PM

#

No problem, looks like half of them are obama's face 😛

velvet anchor Jun 18, 2018, 7:29 PM

#

Worth. That works since half my deepfake is trump 😂

feral lodge Jun 18, 2018, 7:29 PM

#

😄

velvet anchor Jun 18, 2018, 7:30 PM

#

Dlib is such a great library

feral lodge Jun 18, 2018, 7:30 PM

#

Never used it

#

I've only ever used opencv

#

Oh, it's not only image processing

velvet anchor Jun 18, 2018, 7:39 PM

#

Nah. It’s just quick to double check a dataset and make sure the images have faces

feral lodge Jun 18, 2018, 7:41 PM

#

Cool! Wonder if it's good enough to detect on images like these

📎 unknown.png

velvet anchor Jun 18, 2018, 7:42 PM

#

I can test

#

Some of those obscured look like the subject just ripped a fat vape

feral lodge Jun 18, 2018, 7:43 PM

#

aw ye

velvet anchor Jun 18, 2018, 7:45 PM

#

Go green

lilac shadow Jun 18, 2018, 8:14 PM

#

lmao

velvet anchor Jun 18, 2018, 8:47 PM

#

@feral lodge Found 2595 correct faces out of 10049 total images

#

with dlib

feral lodge Jun 18, 2018, 8:48 PM

#

Oh snap

small ore Jun 18, 2018, 8:48 PM

#

Was that on those obscured faces?

velvet anchor Jun 18, 2018, 8:48 PM

#

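import glob
import os

import dlib
from skimage import io

# Directory containing the .jpg images to check
faces_dir = "/faces/dir"

detector = dlib.get_frontal_face_detector()
correct = 0
total = 0
# os.path.join supplies the path separator that dir + "*.jpg" was missing
for path in glob.glob(os.path.join(faces_dir, "*.jpg")):
    img = io.imread(path)
    total += 1
    # Second argument: how many times to upsample the image before detecting
    faces = detector(img, 1)
    if len(faces) > 0:
        correct += 1
print("Found {} correct faces out of {} total images".format(correct, total))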

#

yeah

small ore Jun 18, 2018, 8:49 PM

#

No wonder

velvet anchor Jun 18, 2018, 8:49 PM

#

25% isnt terrible when you can literally only see half the face

small ore Jun 18, 2018, 8:51 PM

#

Maybe you should give parts of the face as training sets

velvet anchor Jun 18, 2018, 8:52 PM

#

im just using dlibs default model

#

because slandon was curious if it works

#

enlarging the images gave about 10% more correct

#

Found 3636 correct faces out of 10049 total images
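
The two counts above work out to roughly a quarter and a bit over a third of the images. A small sketch of that arithmetic, plus a helper for trying several upsampling levels in one go. It assumes "enlarging" and "resampling" refer to the second argument of the dlib detector call; `detection_rate` and `sweep_upsampling` are made-up helper names, and the detector is passed in so the helper itself does not depend on dlib.

```python
def detection_rate(found, total):
    """Fraction of images in which at least one face was detected."""
    return found / total


def sweep_upsampling(images, detector, levels):
    """Count images with at least one detection for each upsampling level."""
    return {level: sum(1 for img in images if len(detector(img, level)) > 0)
            for level in levels}


# Counts reported above, out of 10049 images
print("{:.1%}".format(detection_rate(2595, 10049)))
print("{:.1%}".format(detection_rate(3636, 10049)))

# With dlib installed, the real detector would be passed in like this:
#   import dlib
#   counts = sweep_upsampling(images, dlib.get_frontal_face_detector(), range(3))
```

Each upsampling step enlarges the image before detection, so higher levels get slow quickly on a 10k-image set.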

small ore Jun 18, 2018, 8:52 PM

#

I am just throwing random thoughts 😃

velvet anchor Jun 18, 2018, 8:53 PM

#

Yeah im just messing around with things while I keep tweaking my research project

small ore Jun 18, 2018, 8:58 PM

#

<@&267628507062992896> Worth pinning the code-block above

south quest Jun 18, 2018, 8:59 PM

#

uhhhhhh

#

That's probably not the kind of thing we'd pin

velvet anchor Jun 18, 2018, 8:59 PM

#

xd

#

I upped the resampling to 10 to see if it changed anything and its taken like 20 minutes to run

velvet anchor Jun 18, 2018, 9:31 PM

#

40 minutes still going strong

weak kiln Jun 18, 2018, 9:56 PM

#

no idea why we would pin that.

velvet anchor Jun 18, 2018, 9:56 PM

#

same

naive swallow Jun 18, 2018, 10:05 PM

#

same

small ore Jun 18, 2018, 10:29 PM

#

Well, if someone is trying to write their own image recognition, then that code above will serve as an evaluation standard to measure their own code against

velvet anchor Jun 18, 2018, 10:33 PM

#

@feral lodge set it up to check every resampling rate dlib offers, seems to get about 10% better each sampling. didn't have time to do 10 fully, but 0 was 1100, 1 was 2200, etc will report tomorrow with exact results. it looks promising though

#

not that it matters for anything but it's cool none the less

#

It also works on the CPU so it wont impact my keras training xd

young aurora Jun 18, 2018, 11:52 PM

#

Hey all - I'm trying to do two subplots - each being a LineCollection - in matplotlib. Just to make it easy since I am not providing the underlying data, here's the function I created to make the first line collection - I only need to duplicate this so that there are two of them side by side.

```python
df_lines = pd.DataFrame({'x1': Xia['Start'],
                             'x2': Xia['Finish']})

segs = np.zeros((len(df_lines), 2,2))
segs[:,:,1] = df_lines[["x1","x2"]].values


fig, ax = plt.subplots(figsize=(3,20))

colors = [mcolors.to_rgba(c)
          for c in plt.rcParams['axes.prop_cycle'].by_key()['color']]

line_segments = LineCollection(segs, colors=colors, linewidths=7)
ax.add_collection(line_segments)

ax.set_ylim(-1,1)
plt.title('Xia Dynasty', fontsize = '25')
plt.ylabel('Year', fontsize = '20')
plt.yticks(fontsize = '15')
plt.xticks(range(len(begin)), "")

plt.ylim(-2230, -1750)
plt.xlim(-.3,1)

for i in range(18):
    plt.text(.1, begin.iloc[i] + length.iloc[i]/2, event.iloc[i], ha='left', fontsize = '14', rotation=0)

plt.gca().invert_yaxis()
plt.show()
fig.savefig('xiadynasty.png', dpi=100)```

velvet anchor Jun 18, 2018, 11:52 PM

#

code blocks 😦

young aurora Jun 18, 2018, 11:54 PM

#

I've tried this solution, and I get an error about the image being too big to create.

```python
begin = Xia['Start']
end = Xia['Finish']
length = Xia['Length']

event2 = XiaXSZCP['Dynasty']
begin2 = XiaXSZCP['Start']
end2 = XiaXSZCP['Finish']
length2 = XiaXSZCP['Length']

df_lines = pd.DataFrame({'y1': Xia['Start'], 
                             'y2': Xia['Finish']})

df_lines2 = pd.DataFrame({'y1': XiaXSZCP['Start'], 
                             'y2': XiaXSZCP['Finish']})

segs = np.zeros((len(df_lines), 2,2))
segs[:,:,1] = df_lines[["y1","y2"]].values

segs2 = np.zeros((len(df_lines2), 2,2))
segs2[:,:,1] = df_lines2[["y1","y2"]].values

colors = [mcolors.to_rgba(c)
          for c in plt.rcParams['axes.prop_cycle'].by_key()['color']]

plt.subplots(figsize=(3,6))
ax1 = plt.subplot(1,2,1)
line_segments = LineCollection(segs, colors=colors, linewidths=7)
ax1.add_collection(line_segments)
for i in range(18):
    plt.text(.1, begin.iloc[i] + length.iloc[i]/2, event.iloc[i], ha='left', fontsize = '12', rotation=0)
plt.title('Xia Dynasty', fontsize = '25')


ax2 = plt.subplot(1,2,2)
line_segments2 = LineCollection(segs2, colors=colors, linewidths=7)
ax2.add_collection(line_segments2)
for i in range(1):
    plt.text(.1, begin2.iloc[i] + length2.iloc[i]/2, event2.iloc[i], ha='left', fontsize = '12', rotation=0)


plt.gca().invert_yaxis()
plt.show()
fig.savefig('xiadynasty.png', dpi=100)```

velvet anchor Jun 18, 2018, 11:56 PM

#

I should really get around to learning matplotlib so I can help with these

#

have you tried lowering your dpi maybe? or does the error occur sooner than that

young aurora Jun 18, 2018, 11:58 PM

#

It's sooner - removing it or limiting it to a tiny amount still spits out a "this is way too big" error. Here's the error message when dpi = 100

#


<Figure size 216x432 with 2 Axes>```

manic mason Jun 18, 2018, 11:59 PM

#

Thats not a dpi problem

velvet anchor Jun 19, 2018, 12:00 AM

#

Yeah was just a quick troubleshooting thing to be sure :p

manic mason Jun 19, 2018, 12:01 AM

#

Sorry, I don't know matplotlib, wish I could help more

#

Maybe a list is too big, or a loop doesnt have an end condition

velvet anchor Jun 19, 2018, 12:03 AM

#

Stack seems to think that it could be a stray text coordinate. Make sure they're all being given within the bounds of the image. but I'm not sure about MPL either, @feral lodge normally handles these questions hahaha
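
In line with the stray-coordinate guess: the two-subplot attempt never sets y limits, and adding a LineCollection may not rescale the view on its own, so labels placed around y = -2000 can land far outside axes that still span 0 to 1. A minimal sketch of side-by-side timelines that sets the limits per axes is below. The rows are made-up placeholders (the real `Xia` and `XiaXSZCP` frames are not shown in the chat), `timeline` is a hypothetical helper, and the headless `Agg` backend is only there so the sketch runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection


def timeline(ax, starts, finishes, labels, title):
    """Draw one vertical timeline (segments at x=0) on the given axes."""
    segs = np.zeros((len(starts), 2, 2))
    segs[:, 0, 1] = starts     # y of each segment's first point
    segs[:, 1, 1] = finishes   # y of each segment's second point
    ax.add_collection(LineCollection(segs, linewidths=7))
    # Set the view explicitly so the segments and labels are inside the axes
    ys = segs[:, :, 1]
    ax.set_xlim(-0.3, 1)
    ax.set_ylim(ys.min(), ys.max())
    ax.invert_yaxis()  # invert each axes, not just the current one
    for start, finish, label in zip(starts, finishes, labels):
        ax.text(0.1, (start + finish) / 2, label, ha="left", va="center")
    ax.set_title(title)
    ax.set_xticks([])


# Made-up placeholder rows for illustration only
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(6, 6))
timeline(ax1, [-2200, -2100], [-2100, -1900], ["ruler A", "ruler B"], "Xia Dynasty")
timeline(ax2, [-2200], [-1750], ["Xia"], "Dynasties")

buf = io.BytesIO()  # a filename such as 'xiadynasty.png' works here too
fig.savefig(buf, format="png", dpi=100)
```

Keeping a reference to `fig` from `plt.subplots` also avoids calling `savefig` on a figure object left over from an earlier cell.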