#data-science-and-ml | Python | Page 267

velvet thorn Nov 10, 2020, 12:20 AM

#

but I would do everything differently

smoky fractal Nov 10, 2020, 12:20 AM

#

It's really just a proof of concept program at this point

#

what would you do differently?

velvet thorn Nov 10, 2020, 12:21 AM

#

btw, not directly functionality related, but

#

snake case is preferred for Python

#

@smoky fractal maybe something like this:

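diff = symbol_data['close'].diff().tail(bars)  # bar-to-bar changes, not raw prices

# reindex guarantees all three sign groups (-1, 0, 1) exist before unpacking
down_mean, _, up_mean = diff.groupby(np.sign(diff)).mean().reindex([-1.0, 0.0, 1.0])
rsi = 100 - 100 / (1 + up_mean / -down_mean)  # down_mean is negative, so negate it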

drifting hemlock Nov 10, 2020, 12:29 AM

#

Has anyone used a Data Discovery/Catalog service like Amundsen?

smoky fractal Nov 10, 2020, 12:29 AM

#

@velvet thorn I'm still a bit confused about the ups/downs, is that a continuous value at each point or is it just one point that gets returned?

#

because initially I had them as variables but now they are columns in the dataframe

velvet thorn Nov 10, 2020, 12:30 AM

#

I think what you mean to ask is - is that a Series or a single value?

#

and the answer is the former

#

wait, go back

#

you mean in yours?

smoky fractal Nov 10, 2020, 12:30 AM

#

Yes that's exactly what I was asking, thanks. And those three lines you posted do the same thing as my code? Just trying to

velvet thorn Nov 10, 2020, 12:30 AM

#

or mine

smoky fractal Nov 10, 2020, 12:31 AM

#

in my original code, I had them as single values. Now I have them as series.

velvet thorn Nov 10, 2020, 12:32 AM

#

oh wait actually I need to clarify your algorithm

#

RS is supposed to have the value of the mean increase divided by the value of the mean decrease, right

smoky fractal Nov 10, 2020, 12:33 AM

#

yeah

velvet thorn Nov 10, 2020, 12:34 AM

#

okay, then I need to change my code

#

sec

smoky fractal Nov 10, 2020, 12:34 AM

#

I've pretty much just been trying to implement this in python https://www.macroption.com/rsi-calculation/

velvet thorn Nov 10, 2020, 12:36 AM

#

okay, I edited the code

#

in my original code, I had them as single values. Now I have them as series.
@smoky fractal at that point

#

they should be single values

#

because they are means

smoky fractal Nov 10, 2020, 12:41 AM

#

so what I think I am doing now is taking the RSI at every point and putting it in that column. At the end of my code now I am returning symbol_data['RSI'][-1] to get the most recent value

#

Because while yes it is a mean it is still helpful to see it plotted over time as a series

velvet thorn Nov 10, 2020, 12:45 AM

#

so what I think I am doing now is taking the RSI at every point and putting it in that column. At the end of my code now I am returning symbol_data['RSI'][-1] to get the most recent value
@smoky fractal RSI given the last X points?

smoky fractal Nov 10, 2020, 12:46 AM

#

yes the function takes in the bars variable as the amount of points to consider

velvet thorn Nov 10, 2020, 12:46 AM

#

yeah, but that returns a single value, right

#

the function

smoky fractal Nov 10, 2020, 12:47 AM

#

Yes the function returns a single value. To give some context, this utility will be a filter to decide whether to buy a stock or not. If RSI >=X, do/don't buy. So I always want the most recent value

#

But in other contexts I might want to plot the RSI over time to spot trends

velvet thorn Nov 10, 2020, 12:48 AM

#

@smoky fractal then look into window functions
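A rough sketch of what the window-function route could look like with pandas `rolling`. It assumes the simple-average RSI variant, where gains and losses are averaged over every bar in the window (a down bar counts as a zero gain), which is slightly different from the mean-of-ups over mean-of-downs version discussed above. The function name `rolling_rsi` and the sample prices are made up for illustration.

```python
import pandas as pd

def rolling_rsi(close: pd.Series, bars: int = 14) -> pd.Series:
    """RSI at every point, using simple moving averages of gains and losses."""
    change = close.diff()
    gain = change.clip(lower=0)        # up moves, zero on down bars
    loss = (-change).clip(lower=0)     # down moves as positive numbers
    avg_gain = gain.rolling(bars).mean()
    avg_loss = loss.rolling(bars).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

# Hypothetical prices that alternate up and down by the same amount
close = pd.Series([10, 11, 10, 11, 10, 11], dtype=float)
rsi = rolling_rsi(close, bars=2)
latest = rsi.iloc[-1]   # most recent value, for the buy/don't-buy filter
print(rsi)
```

The full `rsi` Series can be plotted to spot trends, while `rsi.iloc[-1]` gives the single most recent value.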

candid merlin Nov 10, 2020, 3:41 AM

#

I was brainstorming ideas for a Python module to write, but I'm stuck.

#

what kind of python module do you guys think should have been already available?

hasty grail Nov 10, 2020, 4:54 AM

#

Is this related to data science?

serene scaffold Nov 10, 2020, 6:06 AM

#

My advisor asked me to help one of her students install tensorflow on windows 10; he was getting errors related to Windows not being able to find C++ files

#

For some reason I'm able to install tensorflow but idk what I installed that enables me to do that.

bitter harbor Nov 10, 2020, 6:07 AM

#

is it the cpp build tools?

serene scaffold Nov 10, 2020, 6:08 AM

#

could be. I asked him to install visual studio and that didn't work.

bitter harbor Nov 10, 2020, 6:08 AM

#

it's separate from vs
you still have to download the build tools

serene scaffold Nov 10, 2020, 6:08 AM

#

I see

bitter harbor Nov 10, 2020, 6:09 AM

#

https://visualstudio.microsoft.com/visual-cpp-build-tools/
pretty sure that's the one

not being able to find C++ files
idk what else that'd be

serene scaffold Nov 10, 2020, 6:09 AM

#

I'm not referring to the IDE so I may be using the wrong term

#

I'm not sure why they throw around the word "visual" so much

bitter harbor Nov 10, 2020, 6:10 AM

#

wasn't visual cpp ms's version of the language or smthing

serene scaffold Nov 10, 2020, 6:11 AM

#

C# wasn't enough for them?

#

wow

bitter harbor Nov 10, 2020, 6:11 AM

#

or maybe it was c# im not sure

#

it was one of the c's

undone flare Nov 10, 2020, 6:19 AM

#

So I want to use data from MySQL what library is good for that?

#

```py
from sqlalchemy import create_engine
engine = create_engine("mysql:///:memory:")
```
Would something like this work?

bitter harbor Nov 10, 2020, 6:24 AM

#

might be a good #databases question

undone flare Nov 10, 2020, 6:25 AM

#

related to pandas

#

I want to read that using pandas

bitter harbor Nov 10, 2020, 6:30 AM

#

```py
import pandas as pd
from sqlalchemy import create_engine

sqlEngine = create_engine('mysql+pymysql://*', pool_recycle=3600)
dbConnection = sqlEngine.connect()
frame = pd.read_sql("select * from whatever", dbConnection)

pd.set_option('display.expand_frame_repr', False)

dbConnection.close()
```
± whatever options you need

winged stratus Nov 10, 2020, 6:30 AM

#

Hey guys, does anyone have a small classification dataset? I want to build a neural network just using numpy and the MNIST one seems a bit much for me

bitter harbor Nov 10, 2020, 6:32 AM

#

https://www.kaggle.com could probably find one here


winged stratus Nov 10, 2020, 6:32 AM

#

yeah, i searched on kaggle but not really sure which one i should practice on

bitter harbor Nov 10, 2020, 6:33 AM

#

well what are you using it for

winged stratus Nov 10, 2020, 6:33 AM

#

some keywords to search for will be helpful

#

well what are you using it for
@bitter harbor i want to build a small neural network

bitter harbor Nov 10, 2020, 6:33 AM

#

for..?

winged stratus Nov 10, 2020, 6:33 AM

#

classification, i just learned them so i want to practice

bitter harbor Nov 10, 2020, 6:34 AM

#

classification of what tho?

winged stratus Nov 10, 2020, 6:35 AM

#

any classification dataset

undone flare Nov 10, 2020, 6:35 AM

#

```py
sqlEngine = create_engine('mysql+pymysql://*', pool_recycle=3600)
dbConnection = sqlEngine.connect()
frame = pd.read_sql("select * from whatever", dbConnection)

pd.set_option('display.expand_frame_repr', False)

dbConnection.close()
```
± whatever options you need

@bitter harbor umm what is pymysql?

#

and pool_recycle?

bitter harbor Nov 10, 2020, 6:36 AM

#

idk that was the first thing that came up with the google search
I haven't worked with sql like at all
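For context on the two questions: in a SQLAlchemy URL such as `mysql+pymysql://...`, `pymysql` names the Python driver package that SQLAlchemy uses to talk to MySQL, and `pool_recycle=3600` tells the connection pool to replace connections older than 3600 seconds, which helps when the server drops idle connections. The sketch below shows only the `pd.read_sql` part of the pattern. It assumes no MySQL server is at hand, so it swaps in the standard-library `sqlite3` module as a stand-in, and the table contents are made up.

```python
import sqlite3
import pandas as pd

# Stand-in database: an in-memory SQLite table instead of a real MySQL server
conn = sqlite3.connect(":memory:")
conn.execute("create table whatever (id integer, name text)")
conn.executemany("insert into whatever values (?, ?)", [(1, "a"), (2, "b")])

# Same call shape as in the snippet above: a query string plus an open connection
frame = pd.read_sql("select * from whatever", conn)
conn.close()

print(frame)
```

With a real MySQL database the connection would come from the SQLAlchemy engine instead, and the `read_sql` call stays the same.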

undone flare Nov 10, 2020, 6:36 AM

#

ok

winged stratus Nov 10, 2020, 6:36 AM

#

should i just use a normal breast cancer dataset?

undone flare Nov 10, 2020, 6:36 AM

#

@winged stratus you want a data set?

winged stratus Nov 10, 2020, 6:36 AM

#

i practiced logistic regression on that

#

@winged stratus you want a data set?
@undone flare yeah

bitter harbor Nov 10, 2020, 6:36 AM

#

https://www.kaggle.com/olgabelitskaya/classification-of-handwritten-letters?select=letters.csv

#

why not just use the mnist set?

undone flare Nov 10, 2020, 6:37 AM

#

what type of operations do you wanna practice on that?

winged stratus Nov 10, 2020, 6:37 AM

#

@undone flare a neural network

#

i really dont know how to build them, so im practicing

bitter harbor Nov 10, 2020, 6:38 AM

#

we understand that, it's just that 'classification' is a pretty broad term

#

why not just use the mnist set?

winged stratus Nov 10, 2020, 6:39 AM

#

mnist has something like 784 inputs per training example right?

#

it may take quite a while to train

#

im just looking for something small and simple

#

it's ok ill find one

#

thanks for your time guys

undone flare Nov 10, 2020, 6:41 AM

#

@winged stratus https://www.kaggle.com/kaggle/sf-salaries how about this?

SF Salaries

Explore San Francisco city employee salary data

#

I haven't worked with neural networks so idk what type of dataset is good for that

bitter harbor Nov 10, 2020, 6:42 AM

#

that's not a lot of inputs tbh
the one you sent arnav has 110811

quiet pine Nov 10, 2020, 7:10 AM

#

hi im new to python/numpy and i was trying to understand how to represent different probability functions via numpy

#

rn im confused as to how i can alter the probability of np random (if its possible)

velvet thorn Nov 10, 2020, 7:12 AM

#

@quiet pine what exactly do you want to do?

quiet pine Nov 10, 2020, 7:12 AM

#

i want to return 1 or 0 given a probability ratio

#

basically implement a bernoulli rv

#

ik random returns [0, 1)? i believe

velvet thorn Nov 10, 2020, 7:13 AM

#

np.random.binomial

#

alternatively, np.random.choice

#

(but the former would be more appropriate)

quiet pine Nov 10, 2020, 7:13 AM

#

ah yeah im trying to do it w rand specifically because i want to learn how to implement these probabilities

velvet thorn Nov 10, 2020, 7:13 AM

#

ah, okay

#

so in that case

#

the output of np.random.rand is uniformly distributed, right

#

in the range [0, 1)

quiet pine Nov 10, 2020, 7:14 AM

#

ye

velvet thorn Nov 10, 2020, 7:14 AM

#

so think about this.

#

what's the probability

#

that the output will be >= 0.7?

quiet pine Nov 10, 2020, 7:14 AM

#

.3? no

velvet thorn Nov 10, 2020, 7:14 AM

#

yup

#

so now

#

let's say your probability of success is 0.3

#

wouldn't you say that the above calculation

#

could represent a Bernoulli RV?

quiet pine Nov 10, 2020, 7:16 AM

#

yes, altho i mean i want to influence the probability

#

by p

velvet thorn Nov 10, 2020, 7:16 AM

#

yup

#

so in that case

#

we set 0.7

#

as an arbitrary bound

#

but it doesn't have to be 0.7, right?

#

or rather, 0.7 and 0.3 are arbitrarily chosen

#

to put it another way...given a uniform distribution in the range [0, 1), and a number p also in that range, what is the probability that a randomly drawn value will be >= p?

#

think about that and relate it to the nature of a Bernoulli RV

quiet pine Nov 10, 2020, 7:18 AM

#

if the value > p then it has probability 1-p and if its less than p it has probability p?

#

or hm wait let me write this out before i come to a conclusion 1sec

#

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
^
P(X <= 0.6) = 0.6, P(X >= 0.6) = 0.4 right okay.

#

ok so if its less than or equal to the value then it represents the probability

#

ohh okay i got it now thanks 👍

#

lol that was confusing for how simple it is
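The idea worked out above can be sketched in a few lines: a uniform draw on [0, 1) falls below `p` with probability `p`, so comparing the draw against `p` gives a Bernoulli sample. This is only an illustration. The helper name `bernoulli` is made up, and it uses `np.random.default_rng` in place of the `np.random.rand` mentioned in the chat (both draw uniformly from [0, 1)).

```python
import numpy as np

def bernoulli(p, size, rng):
    """Return `size` samples that are 1 with probability p, else 0."""
    # rng.random draws uniformly from [0, 1), so P(draw < p) == p
    return (rng.random(size) < p).astype(int)

rng = np.random.default_rng(0)
samples = bernoulli(0.3, 100_000, rng)
print(samples.mean())  # sample mean should land near p
```

Using `draw >= 1 - p` as the success condition, as in the 0.7 example above, gives the same distribution.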

velvet thorn Nov 10, 2020, 7:31 AM

#

ye yw 👋

grave frost Nov 10, 2020, 9:20 AM

#

Anyone know why pytorch checkpoint is recognized as a folder in my Ubuntu and a file in my VM? And if checkpoint is in the form of a folder, how do I load it?

#

The checkpoint is supposed to be 20 Gigs, but it was in the download phase where I believe it lost the correct format

#

Im gonna try to compress it and see if that works

boreal summit Nov 10, 2020, 10:21 AM

#

For those who use VS Code to write Python, I just discovered my IntelliSense is case sensitive and won't come up unless you use the exact case for the word you need. Any workaround for this?

undone flare Nov 10, 2020, 11:37 AM

#

I have a dataset and it has a column called 'CC Exp Date'
and it has dates like 01/20, 04/22, 23/25.... in different rows
How many people have a credit card that expires in 2025?
so
I tried using regex
but I failed miserably lol

#

nvm got it 👍
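For anyone landing on the same question, one way to do it without regex is to split on the slash and compare the year part. This is a sketch that assumes every value is an MM/YY string. The sample rows are made up, and it is not necessarily the solution that was found above.

```python
import pandas as pd

# Hypothetical sample of the 'CC Exp Date' column
df = pd.DataFrame({"CC Exp Date": ["01/20", "04/22", "03/25", "11/25"]})

# The part after "/" is the two-digit year
expires_2025 = df["CC Exp Date"].str.split("/").str[1] == "25"
count_2025 = int(expires_2025.sum())
print(count_2025)
```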

grave frost Nov 10, 2020, 12:06 PM

#

For those who use VS Code to write Python, I just discovered my IntelliSense is case sensitive and won't come up unless you use the exact case for the word you need. Any workaround for this?
@boreal summit Use Kite

grave frost Nov 10, 2020, 12:39 PM

#

Can anyone confirm whether a 20Gb checkpoint would necessarily use 20Gb RAM or is there a way to reduce the memory taken by the loading checkpoint? I have 8GB mem on my system but can use Kaggle/Colab for 16Gb too.

boreal summit Nov 10, 2020, 1:10 PM

#

@grave frost thanks, I'll give it a try.

lapis sequoia Nov 10, 2020, 2:01 PM

#

can anyone link me to some good ai tutorials using python

prisma isle Nov 10, 2020, 2:32 PM

#

I need to use scipy's optimise functions to perform gradient descent.
Only issue is, my function is calculated by finding a linear combination of several large matrices

#

Large enough that I can't store them all in memory, so they're numpy memmapped

#

However, there's still a significant memory usage spike while calculating each function value

#

Is there a way to make sure it won't overcap my memory while running? Also, can I keep storing the iteratively computed values to disk, so that in the case it does fail, I can atleast start closer to the minima
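On the second part of the question, `scipy.optimize.minimize` accepts a `callback` that is called after each iteration with the current parameter vector, so each iterate can be written to disk and reloaded as the starting point after a crash. The sketch below only covers that checkpointing idea, not the memory spike. The objective is a cheap stand-in for the memmapped computation, the file name is made up, and it uses L-BFGS-B rather than plain gradient descent.

```python
import os
import tempfile

import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Stand-in for the expensive linear combination of memmapped matrices
    return float(np.sum((x - 3.0) ** 2))

checkpoint_path = os.path.join(tempfile.gettempdir(), "scipy_demo_iterate.npy")

def save_iterate(xk):
    # Called by scipy after each iteration with the current parameters
    np.save(checkpoint_path, xk)

# Resume from the last saved iterate if one exists, otherwise start fresh
if os.path.exists(checkpoint_path):
    x0 = np.load(checkpoint_path)
else:
    x0 = np.zeros(4)

result = minimize(objective, x0, method="L-BFGS-B", callback=save_iterate)
print(result.x)
```

If the process dies partway through, rerunning the script picks up from the last saved iterate instead of from zeros.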

undone flare Nov 10, 2020, 3:00 PM

#

does legend() have default loc set to 0?

autumn imp Nov 10, 2020, 3:52 PM

#

anyone here know Pandas? Can you take a look https://stackoverflow.com/questions/64762942/how-do-i-compare-2-separate-csv-files-using-pandas-and-store-it-into-a-dictionar

Stack Overflow

How do I compare 2 separate csv files using Pandas and store it int...

So I have 2 csv files. The first one is SortedPower.csv which has
ID, Stat, Level, Power, Level_Area
12, A, Silver, 546.0, 3
11, A, Silver, 546.0, 3
13, A, Silver, 561.0, 3
14, A, Sil...

open stratus Nov 10, 2020, 4:00 PM

#

hey... i'm just getting into machine learning (TensorFlow for now) do i need to get anaconda for that or can i just use my usual python 3.8 with tensorflow?

#

i already have multiple versions of python i'd rather not install more... is anaconda a requirement for machine learning or is it optional??

hollow sentinel Nov 10, 2020, 4:36 PM

#

optional but recommended bc jupyter notebook is great @open stratus

glacial rune Nov 10, 2020, 4:54 PM

#

I have a list of dictionaries:

[{'store': 'a', 'buy': '1.1312', 'sell': '1.1518'}, 
{'store': 'b', 'buy': '1.1315', 'sell': '1.1517'}, 
{'store': 'c', 'buy': '1.1316', 'sell': '1.1518'},
etc.]

all of the buys and sells are strings. What is the most performant way to convert them all to floats? I made a try_float method but iteratively that takes quite a while

#

if anyone has any performant data structures they'd recommend for processing a list of dictionaries that would be great - I'm trying to see if I can use a numpy array

pearl vine Nov 10, 2020, 5:02 PM

#

One possibility is to write a dict wrapper that applies a float conversion as the 'buy' and 'sell' values are accessed.
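Another option, if converting everything up front is acceptable, is to load the list into a pandas DataFrame and convert both columns in one vectorized step. This is a sketch using the rows from the question. It assumes every string is a valid number. If some are not, `pd.to_numeric(..., errors="coerce")` could replace `astype(float)` so bad values become NaN.

```python
import pandas as pd

rows = [
    {"store": "a", "buy": "1.1312", "sell": "1.1518"},
    {"store": "b", "buy": "1.1315", "sell": "1.1517"},
    {"store": "c", "buy": "1.1316", "sell": "1.1518"},
]

df = pd.DataFrame(rows)
# Convert both price columns from strings to floats in one step
df[["buy", "sell"]] = df[["buy", "sell"]].astype(float)

# Plain float64 NumPy array, if an array is preferred over a DataFrame
prices = df[["buy", "sell"]].to_numpy()
print(prices)
```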

lapis sequoia Nov 10, 2020, 5:29 PM

#

how can i scrape and download mp3 here https://www.ldoceonline.com/dictionary/absent

absent | meaning of absent in Longman Dictionary of Contemporary En...

absent meaning, definition, what is absent: not at work, school, a meeting etc, beca...: Learn more.

#

i can scrape text no problem but i couldnt scrape and download mp3 file

#

i just wanna download first class="speaker exafile fas fa-volume-up hideOnAmp"

prisma isle Nov 10, 2020, 5:30 PM

#

optional but recommended bc jupyter notebook is great @open stratus
@hollow sentinel you can get jupyter without anaconda too

hallow orbit Nov 10, 2020, 5:33 PM

#

so I'm trying to make a financial option tool, and I need to decide between using any of:

Hidden Markov Models
Naive Bayes
Bayesian Networks
Markov Networks

If I'm inputting past sets of observations, such as price points, bid/ask prices, strike prices, etc, and I input the next day's return on investment for different combinations of those observations, and I want to make the tool predict a return on investment given a unique set of observations, what model should I use? I'm currently going with Bayesian Networks because they're good at inference, but I'm not totally confident.

hollow sentinel Nov 10, 2020, 5:42 PM

#

@prisma isle my bad haha

fallow prism Nov 10, 2020, 6:10 PM

#

boys i need help, i want to fix spelling errors in spanish text and i don't know how to start, does anybody have an idea?

grave frost Nov 10, 2020, 6:22 PM

#

@fallow prism Do you want to use ML for that or is any other tool good enough?

#

Does anyone know how to load a big Pytorch checkpoint (20Gigs) without taking 20~ish Gb RAM? I only have 16G

fallow prism Nov 10, 2020, 6:35 PM

#

@fallow prism Do you want to use ML for that or is any other tool good enough?
@grave frost yes, i want to use it for ML, specifically NLP

molten hamlet Nov 10, 2020, 6:37 PM

#

Hi guys,
any idea if I could do this in one line?

hue = np.mean(hsv[:, :, 0])
saturation = np.mean(hsv[:, :, 1])
value = np.mean(hsv[:, :, 2])

hue, sat, val = np.mean(hsv, ....)

pale thunder Nov 10, 2020, 6:39 PM

#

hue, sat, val = (np.mean(hsv[:, :, n]) for n in range(3)), with this few elements it does not matter that you are using a python comp

#

other than that, maybe np.moveaxis with some clever axis arg for the np.mean

molten hamlet Nov 10, 2020, 6:41 PM

#

mean returns matrices if fed an axis, or a scalar if not
😐 I thought I would find some numpy solution
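There is a pure NumPy way: `mean` accepts a tuple of axes, so averaging over the two spatial axes at once leaves one value per channel. A small sketch, with a made-up 2x4x3 array standing in for the real HSV image:

```python
import numpy as np

# Stand-in for an H x W x 3 HSV image
hsv = np.arange(24, dtype=float).reshape(2, 4, 3)

# Average over rows and columns together; one value per channel remains
hue, sat, val = hsv.mean(axis=(0, 1))
print(hue, sat, val)
```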

#

it does not matter that you are using a python comp
@pale thunder
hey, can you elaborate on that compiler? what did you have in mind?

pale thunder Nov 10, 2020, 6:44 PM

#

that was short for comprehension

molten hamlet Nov 10, 2020, 6:45 PM

#

ah right

#

~~I just checked, and you can iterate natively on last axis, so for matrix in hsv 🙂~~

#

i was wrong

hallow orbit Nov 10, 2020, 7:46 PM

#

so I'm trying to make a financial option tool, and I need to decide between using any of:

Hidden Markov Models
Naive Bayes
Bayesian Networks
Markov Networks

If I'm inputting past sets of observations, such as price points, bid/ask prices, strike prices, etc, and I input the next day's return on investment for different combinations of those observations, and I want to make the tool predict a return on investment given a unique set of observations, what model should I use? I'm currently going with Bayesian Networks because they're good at inference, but I'm not totally confident.

heady hatch Nov 10, 2020, 7:49 PM

#

Have you considered time series model?

hallow orbit Nov 10, 2020, 7:50 PM

#

no, what's that?

heady hatch Nov 10, 2020, 7:50 PM

#

But also I think a clarification on your problem will be helpful too.

From my understanding, you want to predict a return on investment based on some past data.

hallow orbit Nov 10, 2020, 7:51 PM

#

yes

heady hatch Nov 10, 2020, 7:51 PM

#

So time series models learn from past data to predict future patterns.

hallow orbit Nov 10, 2020, 7:51 PM

#

ah

#

are there any other names for time series models?

#

i'm using pomegranate, and it doesn't list time series models as an option

#

it's possible that pomegranate doesnt support it though

heady hatch Nov 10, 2020, 7:52 PM

#

I’m not too sure what pomegranate is, but you can look up ARIMA models.

hallow orbit Nov 10, 2020, 7:52 PM

#

ok

#

do you know any ML libraries that implement time series?

#

o nvm, found one that looks good

#

https://pypi.org/project/pmdarima/ if anyone comes upon this later

PyPI

pmdarima

Python's forecast::auto.arima equivalent

#

oh wait I did a dumb, when I agreed with "you want to predict a return on investment based on some past data", I interpreted past data to mean a set of observations that has just been collected, not data from a significant time ago. the thing being predicted is totally independent of past data @heady hatch

heady hatch Nov 10, 2020, 8:17 PM

#

Oh I see.

#

If that's the case you can try many more algorithms. If you think the relationship is linear or can be transformed into linear, you can try linear models. If not then try some tree based or ensemble models.

#

I think I should clarify what you mean by totally independent of the past data.

#

As in there's no relationship or no time relationship?

#

Because it would be hard to do a prediction on features that don't have any relationship at all.

hallow orbit Nov 10, 2020, 8:25 PM

#

oh

#

there's no time relationship

#

which of these would (most likely) be the best though:

Hidden Markov Models
Naive Bayes
Bayesian Networks
Markov Networks

(the person I'm doing this for wants to stick to these models)

wintry atlas Nov 10, 2020, 8:31 PM

#

Hi all,

I am running the following code:

import math
from scipy import stats

o=float(input("Enter Odds(O):"))
r=float(input("Provide ROI(R):"))

s=abs(math.sqrt(abs(r*(o-r))))
print("\nStandard Deviation(S.D.)="+str(s)+"")

n=float(input("Enter n:"))

t=(math.sqrt(n)*(r-1))/s

print("\nT-score="+str(t)+"\n")

p=round((stats.t.sf(t,n))*100,3)

print("\nP-value="+str(p))

#

for which I'm entering:
Enter Odds(O):4.76

Provide ROI(R):0.1163

Standard Deviation(S.D.)=0.734889318196965

Enter n:8854

T-score=-113.14951036753682

P-value=100.0

#

I just can't quite understand the p-value here
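A likely explanation for the 100%: `stats.t.sf(t, df)` is the upper-tail probability P(T > t), and for a t-score as negative as -113 almost all of the distribution lies above it, so the result is essentially 1. The sketch below compares the tails using the numbers from the post. It assumes a one-sample t-test, so it uses n - 1 degrees of freedom where the original code passes n.

```python
from scipy import stats

t_score = -113.14951036753682
n = 8854
df = n - 1  # assumption: one-sample t-test

upper_tail = stats.t.sf(t_score, df)           # P(T > t), near 1 for a very negative t
lower_tail = stats.t.cdf(t_score, df)          # P(T < t), near 0
two_sided = 2 * stats.t.sf(abs(t_score), df)   # P(|T| > |t|)
print(upper_tail, lower_tail, two_sided)
```

Which of the three is the right p-value depends on the hypothesis being tested, which the post does not state.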

spice cedar Nov 10, 2020, 8:33 PM

#

Hello, a quick question.
I have a DataFrame with a Time vector, where the time has been given as
00:04
00:08
00:11
and so on, which is an object datatype.
How do i change this to a normal time vector, like 4,8,11, etc.
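One possible approach, assuming the strings are HH:MM and the goal is total minutes (if they are actually MM:SS the same arithmetic gives seconds): split on the colon, convert both parts to integers, and combine them. The column name and the last sample row are made up.

```python
import pandas as pd

df = pd.DataFrame({"Time": ["00:04", "00:08", "00:11", "01:02"]})

# Split "HH:MM" into two integer columns, then combine into total minutes
parts = df["Time"].str.split(":", expand=True).astype(int)
df["minutes"] = parts[0] * 60 + parts[1]
print(df)
```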

heady hatch Nov 10, 2020, 8:44 PM

#

@hallow orbit hard to say without knowing the relationship between your features.

But I would start with naive bayes since that's relatively simple in capturing relationship in a probabilistic way.

mental timber Nov 10, 2020, 8:44 PM

#

Can someone help me to understand Random Forests Classifiers?

heady hatch Nov 10, 2020, 8:45 PM

#

Do you have any specific questions?

mental timber Nov 10, 2020, 8:45 PM

#

ah yes. I want to use random forest to predict results from a dataset

midnight rain Nov 10, 2020, 8:47 PM

#

https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

#

has a nice example
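A minimal sketch in the spirit of the linked scikit-learn page, assuming scikit-learn is installed. The synthetic dataset from `make_classification` is a stand-in for the real data, and the parameter values are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in dataset: 500 rows, 8 features, 2 classes
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)      # predicted class per test row
accuracy = clf.score(X_test, y_test)   # fraction of correct predictions
print(accuracy)
```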

mental timber Nov 10, 2020, 8:47 PM

#

oh ty 😄

midnight rain Nov 10, 2020, 8:48 PM

#

and here is a good write up on different ensemble models: https://scikit-learn.org/stable/modules/ensemble.html

heady hatch Nov 10, 2020, 8:48 PM

#

Hey @midnight rain , have you worked much with rf?

I would love to get your opinion on outliers and imbalanced data with rf.

midnight rain Nov 10, 2020, 8:49 PM

#

ive done a bit of work with isolation forests

mental timber Nov 10, 2020, 8:49 PM

#

Ty for the info. I'll look into it

midnight rain Nov 10, 2020, 8:49 PM

#

but im not a datascientist im a machine learning engineer

heady hatch Nov 10, 2020, 8:49 PM

#

Ahh.

midnight rain Nov 10, 2020, 8:49 PM

#

so i do more support than primary modeling

#

if you want to use a RF i really recommend trying an isolation forest

heady hatch Nov 10, 2020, 8:49 PM

#

Do you mind if I ask you regarding your responsibility as a mle?

midnight rain Nov 10, 2020, 8:50 PM

#

they work very well and i think they tend to work much better than SVMs in production environments

heady hatch Nov 10, 2020, 8:50 PM

#

Ahh.

midnight rain Nov 10, 2020, 8:50 PM

#

sure whats up

heady hatch Nov 10, 2020, 8:50 PM

#

What's your responsibility like as an mle?

#

Because I've come across a wide definition and would love to add yours to my knowledge base.

midnight rain Nov 10, 2020, 8:50 PM

#

mmm right now im integrating data science models into a large project we are working on

#

i take the jupyter notebooks from the data scientists and then i turn them into a production ready model by optimizing the code as much as possible and adding production ready error handling etc.

#

im also managing the data pipelines for productionizing the models.

#

and im working on a large project in Neo4J to create an interface for us to query and get insights out of the data produced from the models and our other data scraping

heady hatch Nov 10, 2020, 8:52 PM

#

Oh that's pretty cool.

Do you add monitoring and testing/debugging for the models?

midnight rain Nov 10, 2020, 8:53 PM

#

i dont do any monitoring at my current job yet

#

eventually i'll add a feedback loop from our BI work, but thats further down the product pipeline

hallow orbit Nov 10, 2020, 8:54 PM

#

@heady hatch thanks!

heady hatch Nov 10, 2020, 8:54 PM

#

Are you the sole/few engineers on your team?

hallow orbit Nov 10, 2020, 8:55 PM

#

oh wait you're still in the convo, sorry for ping

heady hatch Nov 10, 2020, 8:55 PM

#

Yea no problem Theelx.

#

Btw pantsforbirds, thank you so much for the information.

It's really nice to hear what other people are doing and working with so I can evaluate myself and see where I fit in.

midnight rain Nov 10, 2020, 8:55 PM

#

yeah im the only MLE on the team right now

heady hatch Nov 10, 2020, 8:55 PM

#

Ahh.

midnight rain Nov 10, 2020, 8:56 PM

#

its a smaller VC firm that im working at now

heady hatch Nov 10, 2020, 8:56 PM

#

Oh what are deadlines like?

midnight rain Nov 10, 2020, 8:56 PM

#

we are semi researching right now so not terrible

#

also have you guys seen https://www.gdeltproject.org/ ?

The GDELT Project

hallow orbit Nov 10, 2020, 8:58 PM

#

oh that looks cool

heady hatch Nov 10, 2020, 8:58 PM

#

That's pretty insane.

#

I wonder how they get the data in real time.

midnight rain Nov 10, 2020, 9:00 PM

#

i have no idea the latency on it

hallow orbit Nov 10, 2020, 9:02 PM

#

they might have a bunch of different scrapers set up for each news site

midnight rain Nov 10, 2020, 9:02 PM

#

the data scale is so insane

hollow sentinel Nov 10, 2020, 10:19 PM

#

is there like a machine learning project idea generator somewhere

#

I'm getting bored

molten hamlet Nov 10, 2020, 10:19 PM

#

generator?

hollow sentinel Nov 10, 2020, 10:20 PM

#

yeah that spits out ideas to do

#

there's one like that online with videogames

molten hamlet Nov 10, 2020, 10:21 PM

#

make mosaic generator 😄

📎 chef_small_lin.png

hollow sentinel Nov 10, 2020, 10:23 PM

#

lol don't think i'm at that level yet

#

I'm having a hard time staying motivated to do the Ng course

molten hamlet Nov 10, 2020, 10:23 PM

#

its not ML actually ;D

hollow sentinel Nov 10, 2020, 10:23 PM

#

oh my bad then

molten hamlet Nov 10, 2020, 10:24 PM

#

have you been on open ai gym ?

hollow sentinel Nov 10, 2020, 10:24 PM

#

no what's that

molten hamlet Nov 10, 2020, 10:25 PM

#

https://tenor.com/view/reinforcement-learning-cartpole-v0-tensorflow-open-ai-gif-18474251

Tenor

#

place with many environments

graceful glacier Nov 10, 2020, 10:25 PM

#

any recommendations for a SQL IDE?

molten hamlet Nov 10, 2020, 10:25 PM

#

#databases maybe

#

😕

hollow sentinel Nov 10, 2020, 10:53 PM

#

i need to practice cleaning data

molten hamlet Nov 10, 2020, 10:58 PM

#

I think regularization is that term

#

but I do not know it

#

😦

hollow sentinel Nov 10, 2020, 11:04 PM

#

lol well i suck at it

#

My best idea is to find kaggle datasets and clean them

molten hamlet Nov 10, 2020, 11:09 PM

#

I downloaded some fruits from kaggle and just classified it 😐

hollow sentinel Nov 10, 2020, 11:11 PM

#

oh that's cool

#

lol idk how to do that yet

candid lodge Nov 11, 2020, 12:50 AM

#

hi @velvet thorn

velvet thorn Nov 11, 2020, 12:50 AM

#

okay so about resizability of numpy arrays

#

in the general case, you cannot resize them, because the memory of a numpy array must be contiguous

#

so it's best (IMO) to treat them as static.

#

however, @serene scaffold is actually right in that you can technically resize them with the .resize method.

olive dove Nov 11, 2020, 12:51 AM

#

I agree

candid lodge Nov 11, 2020, 12:51 AM

#

so what is the best alternative of vector<T> in C++ for python?

velvet thorn Nov 11, 2020, 12:51 AM

#

but you reaaaaaaally shouldn't do that because of views and stuff

candid lodge Nov 11, 2020, 12:51 AM

#

I am trying to make a tile map

#

which requires a 2D layout

#

and store values

olive dove Nov 11, 2020, 12:52 AM

#

They can start with a list, append to that, then switch to np array right
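A small sketch of that list-first approach for a tile map: grow ordinary Python lists while the size is still changing, then convert once to a 2D array. The tile values are placeholders.

```python
import numpy as np

rows = []
for y in range(3):
    # Appending to a list is in place and cheap, unlike np.append
    rows.append([y * 10 + x for x in range(4)])

# Convert once, after the layout is known
tile_map = np.array(rows)
print(tile_map.shape)
print(tile_map[2, 3])
```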

candid lodge Nov 11, 2020, 12:52 AM

#

but list is slow right

velvet thorn Nov 11, 2020, 12:52 AM

#

but you reaaaaaaally shouldn't do that because of views and stuff
@velvet thorn because say you have an array b that is a view into an array a; if you resize a, the behaviour of b is undefined.

olive dove Nov 11, 2020, 12:52 AM

#

Or you could np.full numbers

velvet thorn Nov 11, 2020, 12:52 AM

#

so what is the best alternative of vector<T> in C++ for python?
@candid lodge why do you need resizability?

#

if it's a tile map

candid lodge Nov 11, 2020, 12:53 AM

#

i need a map that is resizable?

velvet thorn Nov 11, 2020, 12:53 AM

#

what kind of calculations

#

are you doing

#

as in

#

why not just create a new array

#

copying the values

candid lodge Nov 11, 2020, 12:53 AM

#

oh

velvet thorn Nov 11, 2020, 12:53 AM

#

so what np.append does is create a new array

candid lodge Nov 11, 2020, 12:53 AM

#

so np.append(array, values)?

velvet thorn Nov 11, 2020, 12:53 AM

#

with the passed values added to the end

#

!e

import numpy as np

a = np.array([1, 2, 3])
print(a)
print(np.append(a, [4, 5]))
print(a)

arctic wedgeBOT Nov 11, 2020, 12:54 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

001 | [1 2 3]
002 | [1 2 3 4 5]
003 | [1 2 3]

velvet thorn Nov 11, 2020, 12:54 AM

#

you can see that a does not change, because append returns a new array

#

this is unlike the behaviour of native Python list.append

candid lodge Nov 11, 2020, 12:54 AM

#

ohh

#

that's so alike of vector<T> in C++

#

it creates a new object

velvet thorn Nov 11, 2020, 12:55 AM

#

you can append inplace to vectors in C++, right?

candid lodge Nov 11, 2020, 12:55 AM

#

yes

velvet thorn Nov 11, 2020, 12:55 AM

#

yeah, you can't for numpy arrays (in the general case)

#

so they're really more like Python lists, except faster

#

and statically typed

candid lodge Nov 11, 2020, 12:55 AM

#

ohh okay thank you

#

do you know how to make a numpy array and initialise the size

#

when created

velvet thorn Nov 11, 2020, 12:57 AM

#

uh

#

you want an array of zeroes?

candid lodge Nov 11, 2020, 12:57 AM

#

a = np.array([0, 0, 0, 0, 0, 0, 0, 0])

#

yes

velvet thorn Nov 11, 2020, 12:57 AM

#

!e

import numpy as np

a = np.zeros((3, 5))
print(a)

arctic wedgeBOT Nov 11, 2020, 12:58 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

001 | [[0. 0. 0. 0. 0.]
002 |  [0. 0. 0. 0. 0.]
003 |  [0. 0. 0. 0. 0.]]

velvet thorn Nov 11, 2020, 12:58 AM

#

there you go

candid lodge Nov 11, 2020, 12:58 AM

#

not 0 but what if different?

velvet thorn Nov 11, 2020, 12:58 AM

#

what do you mean

#

a.fill(value) (the fill method on the array)

candid lodge Nov 11, 2020, 12:58 AM

#

what are the parameters?

velvet thorn Nov 11, 2020, 12:58 AM

#

you can check the docs

candid lodge Nov 11, 2020, 12:58 AM

#

alright

velvet thorn Nov 11, 2020, 12:58 AM

#

or np.full, if you want a new array

#

a.fill is inplace
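To make the difference concrete, here is a short sketch: `np.full` builds a new array of a given shape with every cell set to one value, while the `fill` method overwrites an existing array without creating a new one. The shape and values are arbitrary.

```python
import numpy as np

# New 3x5 array with every cell set to 7
tiles = np.full((3, 5), 7)
same_object = tiles

# Overwrite every cell in place; no new array is created
tiles.fill(2)
print(tiles)
print(same_object is tiles)
```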

candid lodge Nov 11, 2020, 12:59 AM

#

ohh

velvet thorn Nov 11, 2020, 12:59 AM

#

@serene scaffold btw sorry to ping you but was just wondering - why did you say arrays could be resized?

#

like were you thinking of the same thing I was or was there something else

prime girder Nov 11, 2020, 1:00 AM

#

Hey I'm new here👋
What happens here?

velvet thorn Nov 11, 2020, 1:00 AM

#

Hey I'm new here👋
What happens here?
@prime girder we talk about data science

serene scaffold Nov 11, 2020, 1:01 AM

#

I figured that if you can change the data in an array without creating a new object, then you can also change the size.

velvet thorn Nov 11, 2020, 1:01 AM

#

ah, okay

serene scaffold Nov 11, 2020, 1:01 AM

#

@prime girder we talk about data science
@velvet thorn and we don't talk about fight club

velvet thorn Nov 11, 2020, 1:01 AM

#

thanks for explaining

#

@velvet thorn and we don't talk about fight club
@serene scaffold yes but we ALSO don't talk about fight club

prime girder Nov 11, 2020, 1:02 AM

#

I feel like there is a lot of context I am missing

velvet thorn Nov 11, 2020, 1:03 AM

#

but yeah, we discuss data science/machine learning/statistics/etc. and the Python libraries incidental thereto here

prime girder Nov 11, 2020, 1:04 AM

#

Well I dabble in those

#

Most goes way over my head

serene scaffold Nov 11, 2020, 1:08 AM

#

@prime girder let me pull up the rules for you

#

&rules

arctic wedgeBOT Nov 11, 2020, 1:09 AM

#

Python Discord Rules
We have a small but strict set of rules on our server. Please read over them and take them on board. If you don't understand a rule or need to report an incident, please send a direct message to @sonic vapor!
Rule 1
Do not talk about fight club.
Rule 2
DO NOT TALK ABOUT FIGHT CLUB.
Rule 3
Listen to and respect staff members and their instructions.
Rule 4
This is an English-speaking server, so please speak English to the best of your ability.
Rule 5
Do not provide or request help on projects that may break laws, breach terms of services, be considered malicious or inappropriate. Do not help with ongoing exams. Do not provide or request solutions for graded assignments, although general guidance is okay.
Rule 6
No spamming or unapproved advertising, including requests for paid work. Open-source projects can be shared with others in #python-discussion and code reviews can be asked for in a help channel.

prime girder Nov 11, 2020, 1:10 AM

#

Well there goes my gameplan to get files to hack people with

candid lodge Nov 11, 2020, 1:32 AM

#

what is the better code of this

#

test = [5, 2, 6, 3]

counter = 0
for e in test:
    print(e)
    print(counter)
    counter += 1

#

i want to keep track of the index

autumn locust Nov 11, 2020, 1:38 AM

#

test = [5, 2, 6, 3]

for i, value in enumerate(test):
  print(value)
  print(i)

#

@candid lodge
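
A small sketch of what enumerate produces for the list above, in case it helps to see the pairs directly. The names `pairs` and `numbered` are illustrative only.

```python
test = [5, 2, 6, 3]

# enumerate yields (index, value) pairs, so no manual counter is needed.
pairs = list(enumerate(test))

# The optional start argument shifts the first index (here, 1-based numbering).
numbered = list(enumerate(test, start=1))

for i, value in pairs:
    print(i, value)
```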

hollow sentinel Nov 11, 2020, 1:55 AM

#

WHY ARE WE SCREAMING

serene scaffold Nov 11, 2020, 1:58 AM

#

@lapis sequoia please don't disturb the developers.

boreal summit Nov 11, 2020, 2:07 AM

#

Guys, each time I try to run GridSearchCV, I always get this invalid parameters error, even though I'm certain all the hyperparameters are spelled and labelled correctly. I use VS Code.

#

Never mind about this again, just found out I spelt neighbours wrongly. I used neighbours instead of neighbors

prime girder Nov 11, 2020, 2:41 AM

#

Never mind about this again, just found out I spelt neighbours wrongly. I used neighbours instead of neighbors
@boreal summit
Amen to that I hate American spelling

#

Cuz a "u" is soo hard to type

boreal summit Nov 11, 2020, 3:01 AM

#

@prime girder I've been wondering why the code couldn't run even after doing everything right for the past 2 days. I'm a bit relieved now.

#

Thanks.

heady hatch Nov 11, 2020, 3:10 AM

#

@velvet thorn I think I'm beginning to understand why databases prefer atomic values.

Working with data structures inside columns is a pain.

potent fern Nov 11, 2020, 7:28 AM

#

Hi

#

Does anyone know how to focus on a specific object out of multiple objects?

velvet thorn Nov 11, 2020, 7:36 AM

#

@velvet thorn I think I'm beginning to understand why databases prefer atomic values.

Working with data structures inside columns is a pain.
@heady hatch by a lot

#

I mean there are databases that do well with those

#

just not SQL

potent fern Nov 11, 2020, 11:28 AM

#

Hi. Is anyone familiar with pyzbar? 😞

#

I am getting an error...

while trying to implement this code:

#

import cv2
import numpy as py
import pyzbar

cap = cv2.VideoCapture(0)
cap.set(3,640)
cap.set(4,480)

while True:
    success, img = cap.read()
    for barcode in decode(img):
        print(barcode.data)
        mydata = barcode.data.decode('utf-8')
        print(mydata)

    cv2.imshow('Result', img)
    cv2.waitKey(1)

undone flare Nov 11, 2020, 12:33 PM

#

Hey where can I see all the datasets available in seaborn?

smoky bobcat Nov 11, 2020, 12:56 PM

#

can someone help me with the standardisation process and explain how it's done?

mild topaz Nov 11, 2020, 1:31 PM

#

i have code which saves an image and then continues executing
now i do not want to save this image, i want to continue executing directly
how can i do this?
my code here:
with open("imagetosave2.png", "wb") as test_img:
    test_img.write(image_data)
test_img = image.load_img("imagetosave2.png", target_size = (64, 64))
here i do not want to save img2.png

#

ping me when u have ans

fierce shadow Nov 11, 2020, 2:07 PM

#

@mild topaz what does the image data consist of?

#

numpy arrays?

#

or what?

#

btw is this channel about data science or for machine learning as well?

mild topaz Nov 11, 2020, 2:11 PM

#

see my code here:
image_data = base64.b64decode(data["image"])
print(type(image_data))
data = io.BytesIO(image_data)
try:
    test_img = Image(io.BytesIO(image_data))
except Exception as e:
    logger.debug({"status": "invalid", "message": "Provide valid base64 string"})
    return {"status": "invalid", "message": "Provide valid base64 string"}
test_img = open("img2.png", "rb")
image_data = test_img.read()
test_img.close()
test_img = image.load_img("img2.png", target_size = (64, 64))

#

@fierce shadow

#

btw is this channel about data science or for machine learning as well?
both

fierce shadow Nov 11, 2020, 2:13 PM

#

never worked with that base64 stuff... but I am pretty sure you might have to use PIL.Image

#

it has many functions to convert images

mild topaz Nov 11, 2020, 2:26 PM

#

see here i am not converting any image @fierce shadow

marsh chasm Nov 11, 2020, 3:03 PM

#

hi! I'm learning some supervised learning stuff and i have a project to find the best classifier for some data; i'm trying svm's rn and I find that the poly kernel takes a long time; I can't seem to find why some kernels take longer than others; it seems to be data dependent too since my friend with a different data set had the same problem but for her the linear kernel was the one that took a long time to run; is there a reason why?

lapis sequoia Nov 11, 2020, 3:40 PM

#

I'm trying to work with some JSON data but can't figure out how to gather all the information and use it, for example in a function that just gets all the JSON sections without me specifying the real key names, like this one: 2020-11-11

{
"status": 200,
"type": "stack",
"data": {
"2020-11-11": {
"total_cases": 166707,
"deaths": 6082,
"recovered": 0,
"critical": 129,
"tested": 2431770,
"death_ratio": 0.03648317107260043,
"recovery_ratio": 0
},
"2020-11-10": {
"total_cases": 162240,
"deaths": 6057,
"recovered": 0,
"critical": 92,
"tested": 2431770,
"death_ratio": 0.0373335798816568,
"recovery_ratio": 0
},
"2020-11-09": {
"total_cases": 146461,
"deaths": 6022,
"recovered": 0,
"critical": 92,
"tested": 2431770,
"death_ratio": 0.04111674780316944,
"recovery_ratio": 0
},
"2020-11-08": {
"total_cases": 146461,
"deaths": 6022,
"recovered": 0,
"critical": 92,
"tested": 2431770,
"death_ratio": 0.04111674780316944,
"recovery_ratio": 0
},
"2020-11-07": {
"total_cases": 146461,
"deaths": 6022,
"recovered": 0,
"critical": 92,
"tested": 2431770,
"death_ratio": 0.04111674780316944,
"recovery_ratio": 0
},
"2020-11-06": {
"total_cases": 146461,
"deaths": 6022,
"recovered": 0,
"critical": 92,
"tested": 2431770,
"death_ratio": 0.04111674780316944,
"recovery_ratio": 0
},
"2020-11-05": {
"total_cases": 141764,
"deaths": 6002,
"recovered": 0,
"critical": 90,
"tested": 2242469,
"death_ratio": 0.0423379701475692,
"recovery_ratio": 0
},
"2020-11-04": {
"total_cases": 137730,
"deaths": 5997,
"recovered": 0,
"critical": 73,
"tested": 2242469,
"death_ratio": 0.04354171204530603,
"recovery_ratio": 0
}
}
}
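
One possible approach, sketched on a shortened and hypothetical copy of the payload above: iterate over the "data" object with .items() so the date keys never have to be written out. The helper name `collect` is an assumption made for illustration, not part of any API.

```python
import json

# A shortened, hypothetical copy of the payload above (two dates only).
raw = '''
{"status": 200, "type": "stack", "data": {
  "2020-11-11": {"total_cases": 166707, "deaths": 6082},
  "2020-11-10": {"total_cases": 162240, "deaths": 6057}
}}
'''
payload = json.loads(raw)


def collect(payload, field):
    """Return {date: value} for one field, without hard-coding any date."""
    return {date: stats[field] for date, stats in payload["data"].items()}


deaths_by_date = collect(payload, "deaths")
print(deaths_by_date)
```

The same loop works for any number of dates, since the keys are discovered at run time.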

quick helm Nov 11, 2020, 3:45 PM

#

does anyone know something about Hugging Face and text classification with ELECTRA?

smoky bobcat Nov 11, 2020, 4:29 PM

#

how much test size is suggested? 0.3?

#

while doing train test split

boreal summit Nov 11, 2020, 5:04 PM

#

@marsh chasm could be that your data is high dimensional. You could reduce the dimensionality using PCA or some other dimensionality reduction technique.

#

Also, your data might be too complex for the model you're using to train it.

cerulean spindle Nov 11, 2020, 6:21 PM

#

@marsh chasm you could definitely try PCA, but you should also check to see if there are a lot of zeros in your dataset. Sometimes these zeros are treated as a placeholder or null value. You could use the following code to check:

print(np.sum(data == 0)/(data.size))

If this results in a large %, you should consider using the TruncatedSVD dimensionality reduction technique.
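
A small sketch of that check on a made-up array, assuming `data` is a numeric numpy array. The per-column variant is an addition for illustration, not something suggested above.

```python
import numpy as np

# Hypothetical feature matrix; in practice `data` would be your own array.
data = np.array([[0.0, 1.5, 0.0],
                 [2.0, 0.0, 0.0]])

# Fraction of entries equal to zero (the mean of a boolean mask).
zero_fraction = np.mean(data == 0)

# Same quantity per column, to spot individual features that are mostly zero.
zero_fraction_per_column = np.mean(data == 0, axis=0)

print(zero_fraction)
print(zero_fraction_per_column)
```

The result is a fraction between 0 and 1, so multiply by 100 if a percentage is wanted.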

mortal pendant Nov 11, 2020, 6:22 PM

#

With textgenrnn, is it possible to continue training a pre-trained model (possibly with more data)? So, like, I'll train on a dataset with 10000 datapoints for 5 epochs one day, and then the next day I can continue to train on that same data (possibly now with 10200 datapoints) from the hdf5 file for another 5 epochs to get even better results?

marsh chasm Nov 11, 2020, 6:23 PM

#

@marsh chasm could be that your data is high dimensional. You could reduce the dimensionality using PCA or some other dimensionality reduction technique.
@boreal summit yeah I’ll try PCA thanks

#

@marsh chasm you could definitely try PCA, but you should also check to see if there are a lot of zeros in your dataset. Sometimes these zeros are treated as a placeholder or null value. You could use the following code to check:
print(np.sum(data == 0)/(data.size))
If this results in a large %, you should consider using the TruncatedSVD dimensionality reduction technique.
@cerulean spindle ok cool! Thanks so much

cerulean spindle Nov 11, 2020, 6:24 PM

#

@marsh chasm Are you using MNIST dataset?

marsh chasm Nov 11, 2020, 6:24 PM

#

No I’m using the Wisconsin breast cancer data set

#

On kaggle

cerulean spindle Nov 11, 2020, 6:24 PM

#

oh ok

azure holly Nov 11, 2020, 7:27 PM

#

Does anyone here mess with Tensorflow? Pretty much learned what I can from the entire Python Crash Course book and was wanting to move into ML. Only been doing Python for like 8 months. Should I learn about something else before Tensorflow and ML or just go straight into it?

heady hatch Nov 11, 2020, 7:41 PM

#

Are you familiar with ML foundation? Such as train, validation, test split, overfitting, underfitting, imbalanced datasets, model evaluation, optimization, different kinds of ml problems, data cleaning, transformation, selection, etc etc?

#

Along with mathematical foundation such as linear algebra, probabilities, statistics, and calculus?

#

Or you can also dive straight in and go with a top down approach instead of a bottom up.

#

Ultimately depends on your learning style and how much you are willing to adapt.

#

There's fastai, which teaches it from a top-down perspective: you learn the models as tools and then learn how to take them apart.

raw vigil Nov 11, 2020, 7:52 PM

#

Does anyone have any good datasets for chatbot?

#

Please ping me

#

thanks

livid temple Nov 11, 2020, 7:55 PM

#

I've been using pandas/python/jupyter for many years now, but i recently saw an R notebook and it looked really really clean/easy. Can someone explain to me what benefit R might have to someone who already knows python/pandas/jupyter well?

cerulean spindle Nov 11, 2020, 7:56 PM

#

I believe R is a more statistically minded approach, but I'm not sure.

livid temple Nov 11, 2020, 8:02 PM

#

my initial thoughts were that it looked cleaner, but python is maybe more granular?

torpid cave Nov 11, 2020, 8:51 PM

#

Anyone here who knows both R and Python?

#

I have something I have been doing in R for quite a while but implementing it in Python is a hassle

hearty jewel Nov 11, 2020, 9:03 PM

#

def children(data):
    if data=0:
        return 'childless'
    if 1 <= data =<3:
        return '1-3 children'
    if data > 3:
        return '4+ children'

#

im getting a syntax error with this

#

can anyone help lol

#

says data=0 is syntax error

torpid cave Nov 11, 2020, 9:04 PM

#

data == 0

hearty jewel Nov 11, 2020, 9:04 PM

#

ty

#

yes

torpid cave Nov 11, 2020, 9:04 PM

#

= is for assignment, == is for comparison

#

nww

hearty jewel Nov 11, 2020, 9:04 PM

#

now im getting a new syntax error

#

for the =<3

torpid cave Nov 11, 2020, 9:05 PM

#

welp

#

it is badly written

#

Haha

hearty jewel Nov 11, 2020, 9:05 PM

#

im noob

#

lol

torpid cave Nov 11, 2020, 9:05 PM

#

no worries

hearty jewel Nov 11, 2020, 9:06 PM

#

whats wrong with the <= 3

torpid cave Nov 11, 2020, 9:06 PM

#

You should do

#

if data >= 1 and data <= 3

hearty jewel Nov 11, 2020, 9:06 PM

#

it wouldnt be and right

#

would be &>?

torpid cave Nov 11, 2020, 9:07 PM

#

and

#

actually

hearty jewel Nov 11, 2020, 9:07 PM

#

if data>=1 and data<=3:

torpid cave Nov 11, 2020, 9:07 PM

#

Let me check, I have been coding in R

#

and got the syntax confused for both

hearty jewel Nov 11, 2020, 9:08 PM

#

i got it

#

it worked

#

thanks bro

#

❤️

torpid cave Nov 11, 2020, 9:08 PM

#

def children(data):
    if data == 0:
        return 'childless'
    if data >= 1 and data <= 3:
        return '1-3 children'
    if data > 3:
        return '4+ children'
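
For reference, a sketch of the same function using a chained comparison, which Python does support. The original error came from writing =< instead of <= and = instead of ==. The final `return None` for unexpected input is an addition for illustration.

```python
def children(data):
    """Bucket a child count into a label (same buckets as above)."""
    if data == 0:            # == compares; = would be assignment
        return 'childless'
    if 1 <= data <= 3:       # chained comparison; the operator is <=, not =<
        return '1-3 children'
    if data > 3:
        return '4+ children'
    return None              # negative or otherwise unexpected input


print(children(0), children(2), children(5))
```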

hearty jewel Nov 11, 2020, 9:08 PM

#

u a god

torpid cave Nov 11, 2020, 9:08 PM

#

Keep on working on Python

hearty jewel Nov 11, 2020, 9:08 PM

#

i will one day become a god like you

torpid cave Nov 11, 2020, 9:08 PM

#

I am not a god haha

#

Just learn that comparison syntax and you should be fine

#

I just wished someone helped me with my issue, I am overcomplicating my code

heady hatch Nov 11, 2020, 9:14 PM

#

@torpid cave I'm not familiar with R, but might be able to help you translate.

What are you trying to do in terms of code?

torpid cave Nov 11, 2020, 9:14 PM

#

I have one dataframe with responses, and another dataframe with keys

#

I just need to translate responses to keys

#

My initial approach (works in R) was:
df.apply(lambda x: df2[df2['key'] == x]['code'].item())

heady hatch Nov 11, 2020, 9:16 PM

#

Hm could you give me an example of the dataframes?

torpid cave Nov 11, 2020, 9:16 PM

#

let me get a reprex

#

one sec

hearty jewel Nov 11, 2020, 9:17 PM

#

for column in insurance.columns:
    pivot = insurance.pivot_table('charges', index=column)
    display(pivot)
    pivot.plot.bar(stacked=False)

#

oscar im getting an error

#

with the new columns we just made

torpid cave Nov 11, 2020, 9:18 PM

#

one sec @hearty jewel

hearty jewel Nov 11, 2020, 9:18 PM

#

that code worked with all columns except the new columns

#

ValueError: Grouper for 'charges' not 1-dimensional

heady hatch Nov 11, 2020, 9:19 PM

#

Do you have two columns named "charges"?

hearty jewel Nov 11, 2020, 9:20 PM

#

No

#

📎 unknown.png

torpid cave Nov 11, 2020, 9:21 PM

#

df = pd.DataFrame(dict(
    Sample1 = [5,2,10,2,2],
    Sample2 = [5,5,5,10,10]))

df2 = pd.DataFrame(dict(
    Keys = ['A','B','C'],
    Values = ['5', '2', '10'] ))

#

@heady hatch

#

What I am trying to do is just convert df into df2 letters

heady hatch Nov 11, 2020, 9:22 PM

#

so change all the 5 to 'A'?

torpid cave Nov 11, 2020, 9:22 PM

#

yep

heady hatch Nov 11, 2020, 9:22 PM

#

Your apply makes sense.

torpid cave Nov 11, 2020, 9:23 PM

#

I am doing this Frankenstein

def TranslateList(list1):
    def LookValue(value):
        value = str(value)
        value = info[1][info[1]['IDNumber'] == value]['Sample'].item()
        return value
    
    translation = []
    for item in list1:
        translation.append(LookValue(item))
    return(translation)

def CreateTranslatedTable():
    # I am writing this now
    pass

#

But a one-liner should do it

#

Not sure why I can't get it right

#

@hearty jewel what are you trying to do? I have some more time before I start work

heady hatch Nov 11, 2020, 9:24 PM

#

Another way of doing it is to create a dictionary to index into.

df2_dict = df2.set_index('Values')['Keys'].to_dict()

df1.apply(lambda x: df2_dict[x])

torpid cave Nov 11, 2020, 9:25 PM

#

Let me try

#

I never think about dictionaries

heady hatch Nov 11, 2020, 9:25 PM

#

This is assuming that the values to keys mapping is unique.

torpid cave Nov 11, 2020, 9:25 PM

#

AH yes

#

I control the data input

heady hatch Nov 11, 2020, 9:27 PM

#

Yea let me know how that goes.

torpid cave Nov 11, 2020, 9:28 PM

#

TypeError: 'Series' objects are mutable, thus they cannot be hashed

#

damn it

#

haha

#

My df is quite complex, let me check the code

#

But I think a dict should be the way

heady hatch Nov 11, 2020, 9:31 PM

#

Hmm, do your values or keys contain Series objects?

torpid cave Nov 11, 2020, 9:32 PM

#

not sure

#

I think I got it

heady hatch Nov 11, 2020, 9:33 PM

#

Nice nice nice.

torpid cave Nov 11, 2020, 9:33 PM

#

Nevermind

heady hatch Nov 11, 2020, 9:33 PM

#

Oh hahaha

#

Do you mind sharing the structure of your dataframe?

torpid cave Nov 11, 2020, 9:34 PM

#

info[0][['Sample1','Sample2','Sample3']].apply(lambda x: info_dict.get(x))

#

Tried that

heady hatch Nov 11, 2020, 9:34 PM

#

Like an actual structure.

#

Oh hm.

torpid cave Nov 11, 2020, 9:34 PM

#

so info is a list of dataframes

#

df 0 is the one I am using

#

df[0]

#

and the columns with the keys are the ones I am interested in

heady hatch Nov 11, 2020, 9:35 PM

#

To double check, info_dict is okay? Like you can create it and index into the hashes.

torpid cave Nov 11, 2020, 9:35 PM

#

so info_dict is

hearty jewel Nov 11, 2020, 9:35 PM

#

i want to pivot between each column and charges

torpid cave Nov 11, 2020, 9:35 PM

#

info_dict = info[1].drop(['Title'],axis=1).set_index('IDNumber')['Sample'].to_dict()

hearty jewel Nov 11, 2020, 9:35 PM

#

and show a graph

hearty jewel Nov 11, 2020, 9:35 PM

#

in a for loop

heady hatch Nov 11, 2020, 9:36 PM

#

And print info_dict for yourself, is that what you're expecting?

torpid cave Nov 11, 2020, 9:36 PM

#

{'329': 'A', '587': 'A', '433': 'B', '274': 'B'}

#

print(info_dict)
{'329': 'A', '587': 'A', '433': 'B', '274': 'B'}

heady hatch Nov 11, 2020, 9:36 PM

#

And that's what you want?

torpid cave Nov 11, 2020, 9:36 PM

#

Yep

#

key to value

#

I might just solve it and post it to SO

heady hatch Nov 11, 2020, 9:37 PM

#

Cool. Now hmm for info.

You said info is a list of dataframes?

torpid cave Nov 11, 2020, 9:37 PM

#

yes

#

info is a list that contains 2 dataframes

#

I just did it like that to have some order, I dont like having so many variables in the code

#

Should not affect anything

heady hatch Nov 11, 2020, 9:39 PM

#

OH

#

You probably need applymap.

torpid cave Nov 11, 2020, 9:39 PM

#

@hearty jewel maybe try showing me what output you are looking for. I could help with data manipulation.
For graphs I use ggplot in R and the plots I did in Python are just copy/paste so I won't be able to help there

#

@heady hatch how is that?

#

nvm I will just read the docs

heady hatch Nov 11, 2020, 9:39 PM

#

applymap is elementwise; apply passes a whole Series (one column or row) at a time.

torpid cave Nov 11, 2020, 9:39 PM

#

oh

#

damn

heady hatch Nov 11, 2020, 9:40 PM

#

And I think the error is tripping up when you do dict.get(SeriesObject).
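
A sketch of the elementwise lookup on the small example frames posted earlier, assuming pandas is installed. It maps each column with Series.map rather than calling applymap, since newer pandas releases rename applymap to DataFrame.map. The astype(str) step is an assumption needed because the lookup keys here are strings while the frame holds integers.

```python
import pandas as pd

df = pd.DataFrame({"Sample1": [5, 2, 10, 2, 2],
                   "Sample2": [5, 5, 5, 10, 10]})
df2 = pd.DataFrame({"Keys": ["A", "B", "C"],
                    "Values": ["5", "2", "10"]})

# Build a plain dict: value (as string) -> letter.
lookup = df2.set_index("Values")["Keys"].to_dict()

# apply hands each column (a Series) to the lambda; Series.map then
# performs the dictionary lookup one element at a time.
translated = df.apply(lambda col: col.astype(str).map(lookup))

print(translated)
```

Passing a whole Series to dict.get or dict[...] fails because a Series is not hashable, which matches the TypeError seen above.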

torpid cave Nov 11, 2020, 9:40 PM

#

fck

#

That was it

#

works

#

hahahaha

heady hatch Nov 11, 2020, 9:40 PM

#

🙂

#

Yea sorry in my head I was thinking you're working with one column.

torpid cave Nov 11, 2020, 9:40 PM

#

I thought, I should not be coding something this simple that hard

#

Why is Python not as simple as R

#

many thanks, you just won lots of internet points

heady hatch Nov 11, 2020, 9:41 PM

#

Hope your journey is swell from here on.

torpid cave Nov 11, 2020, 9:42 PM

#

info[0][['Sample1','Sample2','Sample3','Diff']].applymap(LookValue)

Works like a charm

lapis sequoia Nov 12, 2020, 1:05 AM

#

Hey guys trying to train my gan getting a weird traceback

#

https://pastebin.com/jdd9hKt4

Pastebin

GAN model final (hopefully) - Pastebin.com

Pastebin.com is the number one paste tool since 2002. Pastebin is a website where you can store text online for a set period of time.

#

https://pastebin.com/dTCzrF20

Pastebin

gan error traceback full - Pastebin.com


#

Can someone help me figure out what's going in

#

*on

#

The traceback would suggest it's a type error, but I've just loaded type8 numpy arrays into tf.data.Dataset.from_tensor_slices or whatever

#

Should I upload my h5py files and convert the numpy array to type 64 or whatever it's saying is appropriate

hasty grail Nov 12, 2020, 1:42 AM

#

You should train your models with float32 inputs in general. If you're using images, it's recommended that you rescale them to the [0, 1] range.

lapis sequoia Nov 12, 2020, 1:48 AM

#

For gans they recommend you normalise between -1 and 1 tho

#

Most documentation I've seen does that

#

Which is what I've done

#

All my data is between those numbers

#

Anyway I think I'm gonna try reshape with numpy to float32 cos they're float64

hasty grail Nov 12, 2020, 1:50 AM

#

That would be called type casting, not reshaping. [-1, 1] would also work, I personally don't have much experience with GANs

lapis sequoia Nov 12, 2020, 1:56 AM

#

My bad

#

It is asking me to cast it to a supported type

#

TypeError: Failed to convert object of type <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'> to Tensor. Contents: <BatchDataset shapes: ((1, 256, 256, 1), (1,)), types: (tf.float64, tf.int32)>. Consider casting elements to a supported type.

hasty grail Nov 12, 2020, 2:04 AM

#

you're dealing with a dataset, which doesn't have a dtype of its own

#

you should map the dataset with a function that casts its elements to the correct dtype

#

# Suppose your dataset is the variable `ds`
cast_ds = ds.map(lambda x, y: (tf.cast(x, tf.float32), y))

lapis sequoia Nov 12, 2020, 2:07 AM

#

I mean I cast the thing as float32 before storing it in the tf.dataset file

#

Didn't work

#

But idk

#

How do I implement that into the code

hasty grail Nov 12, 2020, 2:14 AM

#

Implement what?

lapis sequoia Nov 12, 2020, 2:14 AM

#

The cast ds

hasty grail Nov 12, 2020, 2:14 AM

#

I just did it above

lapis sequoia Nov 12, 2020, 2:15 AM

#

Like would copying what you did work

#

Ok I'll try it

hasty grail Nov 12, 2020, 2:16 AM

#

That was an example

#

You will have to use your own variable names and stuff

summer island Nov 12, 2020, 2:22 AM

#

Hi, can anyone help me? I would like to get an Excel sheet from a complex nested JSON file.

hollow gull Nov 12, 2020, 2:28 AM

#

Python has a json library that might be useful. That can help you turn it into a Python dict. Then pandas can build a dataframe from a dict, and a pandas dataframe can be saved to a csv or workbook.
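
A rough sketch of that pipeline, assuming pandas is installed. The JSON content and field names are made up for illustration, and pd.json_normalize is one way (not the only one) to flatten nested objects.

```python
import io
import json

import pandas as pd

# Hypothetical nested JSON; real files will have their own structure.
raw = ('[{"id": 1, "user": {"name": "Ann", "city": "Oslo"}}, '
       '{"id": 2, "user": {"name": "Bo", "city": "Rome"}}]')
records = json.loads(raw)

# json_normalize flattens nested dicts into dotted column names.
frame = pd.json_normalize(records)

# Write CSV to an in-memory buffer; frame.to_excel("out.xlsx") would write a
# workbook instead, but needs an Excel engine such as openpyxl installed.
buffer = io.StringIO()
frame.to_csv(buffer, index=False)
print(buffer.getvalue())
```

Deeply nested lists inside the JSON usually need extra handling, so treat this as a starting point.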

marsh chasm Nov 12, 2020, 2:29 AM

#

Hi! I was wondering if people here are familiar with the validation_curve functionality of sklearn. Basically I was wondering if for the x axis on a validation curve I can plot instead of a hyperparameter a combination of hyperparameters like so:

#

📎 Screen_Shot_2020-11-11_at_9.29.08_PM.png

#

my teacher somehow managed to produce that plot, unfortunately i can't see how given the limitation of the validation_curve function with param_name and param_range

hollow gull Nov 12, 2020, 2:31 AM

#

The x-axis format reminds me of what matplotlib.pyplot does by default if you have a multi-index dataframe, but I am not sure on that. Even if what I said is correct I am not sure if it will help you. The short answer is I don't have any experience with the validation_curve function.

marsh chasm Nov 12, 2020, 2:32 AM

#

yeah the thing is i feel like i need the validation_curve in order to plot both the training and validation score (every time i try to look up how to plot them without validation_curve google just tries to show me validation_curve xD)

#

pls ping me if you could help! i'd greatly appreciate it. i asked in a help channel before but my helper and i got stuck xD

#

I can just use the gridsearch features

#

ugh i totally forgot about that

#

oh wait but still that doesnt show me the training vs validation score

#

hm

mortal pendant Nov 12, 2020, 2:54 AM

#

https://colab.research.google.com/drive/1tn4l65t47I6G1MV5fcZ0Q-W91Rtwns9e?usp=sharing https://hastebin.com/eyopocaxir.csharp Any ideas what's going wrong?
I'm following this tutorial https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/text/text_generation.ipynb but changing as much as possible to try and show to myself I actually understand what is going on

graceful glacier Nov 12, 2020, 3:37 AM

#

any resources for sql projects besides kaggle?

torpid cave Nov 12, 2020, 3:57 AM

#

Hi guys, anyone here knows how to do math in Python?

#

I mean solving equations by calling out other variables

zealous holly Nov 12, 2020, 4:08 AM

#

is this a good channel for web scraping

velvet thorn Nov 12, 2020, 4:13 AM

#

I mean solving equations by calling out other variables
@torpid cave are you talking about symbolic math?

undone flare Nov 12, 2020, 6:56 AM

#

How can I move the legend?

📎 unknown.png

lapis sequoia Nov 12, 2020, 7:10 AM

#

Guess you used matplotlib

#

https://matplotlib.org/3.2.1/api/_as_gen/matplotlib.pyplot.legend.html

undone flare Nov 12, 2020, 7:35 AM

#

no seaborn

mental timber Nov 12, 2020, 7:48 AM

#

getting this error not sure how to fix. Note: im still learning python so sorry for my stupidity xd

📎 unknown.png

hasty grail Nov 12, 2020, 7:49 AM

#

that only works if you're in a Jupyter notebook

#

inside a regular script, it's nonsense

mental timber Nov 12, 2020, 7:49 AM

#

so it doesnt work in spyder... damn...

undone flare Nov 12, 2020, 7:53 AM

#

What is better : sns.factorplot(kind='bar') or the sns.barplot()

mental timber Nov 12, 2020, 7:53 AM

#

whats the difference between them if you dont mind me asking

undone flare Nov 12, 2020, 7:55 AM

#

There is actually no difference

#

it's just that factorplot has a kind attribute

#

factorplot/catplot

#

so like if you set kind to violin it will act as sns.violinplot()

mental timber Nov 12, 2020, 7:56 AM

#

I see. Ty

undone flare Nov 12, 2020, 7:57 AM

#

so what would you prefer?

mental timber Nov 12, 2020, 7:57 AM

#

I'm making a random forest code to predict something using a dataset i found. So just going around and researching different codes and reading which one would be best and easiest

undone flare Nov 12, 2020, 7:59 AM

#

I mean would you use factorplot() or the specific kind plots

mental timber Nov 12, 2020, 7:59 AM

#

hmm, I prolly would since it'll make it easier to understand for me

undone flare Nov 12, 2020, 8:00 AM

#

so you would use specific kind plots?

mental timber Nov 12, 2020, 8:00 AM

#

i guess ye

undone flare Nov 12, 2020, 8:00 AM

#

okay

#

I would switch between those and see what suits me xD

mental timber Nov 12, 2020, 8:01 AM

#

ah ok xD

lapis sequoia Nov 12, 2020, 8:10 AM

#

Hey guys I'm still having trouble with my model

#

I'm confused because shouldn't a numpy array that is stored in a tf.dataset be a tensor

hasty grail Nov 12, 2020, 8:11 AM

#

It should

lapis sequoia Nov 12, 2020, 8:11 AM

#

It doesn't make sense that the error I'm getting says BatchDataset cannot be converted to a tensor

hasty grail Nov 12, 2020, 8:12 AM

#

you need to distinguish between a Dataset and an element of a Dataset

#

a Dataset is a Dataset

#

an element of a Dataset is a Tensor

#

Datasets are basically a better version of a vanilla Python generator when it comes to iterating over data

#

they are not convertible to ndarrays directly

lapis sequoia Nov 12, 2020, 8:13 AM

#

But even if I cast it as float32 before loading it with from_tensor_slices I still get the same thing

hasty grail Nov 12, 2020, 8:13 AM

#

Dataset is not a tensor

#

as such, it doesn't have a dtype

#

so you can't cast it

#

you can only cast the elements of the dataset

lapis sequoia Nov 12, 2020, 8:14 AM

#

Yeah I meant cast the image array

hasty grail Nov 12, 2020, 8:14 AM

#

in that case you need to map the dataset

#

as I have shown previously

#

the mapping function is applied to each element of the dataset

#

either that, or you cast the array before converting it into a dataset

lapis sequoia Nov 12, 2020, 8:16 AM

#

Well I tried the latter and it didn't work, I got the same thing

hasty grail Nov 12, 2020, 8:16 AM

#

how did you do it

lapis sequoia Nov 12, 2020, 8:16 AM

#

Just saying float32 instead of float64

#

g_x_in = g_x_in.astype('float32')

hasty grail Nov 12, 2020, 8:17 AM

#

and then?

lapis sequoia Nov 12, 2020, 8:18 AM

#

Same error traceback

hasty grail Nov 12, 2020, 8:18 AM

#

what does the error say again?

lapis sequoia Nov 12, 2020, 8:18 AM

#

Just float32 instead of 64

#

One sec

#

TypeError: Failed to convert object of type <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'> to Tensor. Contents: <BatchDataset shapes: ((1, 256, 256, 1), types: (tf.float32)>. Consider casting elements to a supported type.

#

That's the error I get when I cast it as float32 before converting

hasty grail Nov 12, 2020, 8:21 AM

#

can you provide your code again?

#

Seems that you're passing a dataset to tf.cast which doesn't work because, as mentioned above, datasets don't have a dtype

lapis sequoia Nov 12, 2020, 8:22 AM

#

https://pastebin.com/3uMk9Zry

Pastebin

gan final (hopefully) - Pastebin.com


hasty grail Nov 12, 2020, 8:24 AM

#

gen_output = generator(g_dataset, training=True)

#

this doesn't work

#

you need to pass in a tensor

#

not a dataset

lapis sequoia Nov 12, 2020, 8:25 AM

#

Aight

hasty grail Nov 12, 2020, 8:25 AM

#

such as the input argument of the function this code lies within

lapis sequoia Nov 12, 2020, 8:28 AM

#

Yeah so now I'm getting a thing saying content is larger than 2gb but my training data (g_x_in) is 300mb

#

Cannot create a tensor proto whose content is larger than 2gb

weary heart Nov 12, 2020, 8:29 AM

#

Hi, i'm new to data science and i want to create some ML project but i need some datasets, is there any recommendation site for good datasets that have more than 5k data? other than kaggle and UCI

lapis sequoia Nov 12, 2020, 8:32 AM

#

My batch size is 1 bro lmao

#

I got no idea wtf to do

hasty grail Nov 12, 2020, 8:35 AM

#

can you print input?

#

your fit function also has a loop that doesn't make sense

#

    # Train
    for n, (g_dataset) in train_ds.enumerate():
      print('.', end='')
      if (n+1) % 100 == 0:
        print()
      train_step(g_dataset, target_dataset, epoch)
    print()

#

train_ds.enumerate() yields the train step, and an element of train_ds

#

so using g_dataset is misleading

#

datasets don't yield datasets

heady hatch Nov 12, 2020, 8:38 AM

#

@weary heart what kind of data are you looking for? Have you considered scraping?

lapis sequoia Nov 12, 2020, 8:39 AM

#

Ok changing that thanks

#

https://pastebin.com/a5HwxmQm

Pastebin

print g_x_in - Pastebin.com


#

That's what I get when I print the generator input

weary heart Nov 12, 2020, 8:41 AM

#

i'm looking for some kind of e-commerce datasets or something like that. i haven't tried scraping atm

hasty grail Nov 12, 2020, 8:44 AM

#

looks correct

lapis sequoia Nov 12, 2020, 8:45 AM

#

Idk man I'm so confused as to why it doesn't work

hasty grail Nov 12, 2020, 8:50 AM

#

look at the call stack and determine the type of each variable that is relevant

#

check that they have the correct type (don't confuse Dataset with Tensor!)

lapis sequoia Nov 12, 2020, 8:53 AM

#

Well yeah I've changed all calls to tensors

#

Where appropriate at least

#

The problem is the size?

#

But it's totally below 2gb

#

I've looked up the error on Google and I can't find anything that's relevant

#

This stinks

hasty grail Nov 12, 2020, 8:59 AM

#

can you show the error log?

heady hatch Nov 12, 2020, 9:00 AM

#

Sorry to nitpick but is this correct?
g_x_in = np.array(g_x_in) - 127.5/127.5

#

It looks like array - 1
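A quick check of the precedence point, using a three-value stand-in array (the name `pixels` is an assumption for illustration):

```python
import numpy as np

pixels = np.array([0.0, 127.5, 255.0])

# Division binds tighter than subtraction, so this is pixels - (127.5 / 127.5),
# i.e. pixels - 1.
shifted = pixels - 127.5 / 127.5

# With parentheses the range [0, 255] is mapped onto [-1, 1].
scaled = (pixels - 127.5) / 127.5

print(shifted)
print(scaled)
```

`pixels / 127.5 - 1` gives the same result as the parenthesised form.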

hasty grail Nov 12, 2020, 9:00 AM

#

does that make sense to you?

heady hatch Nov 12, 2020, 9:01 AM

#

Oh lmao I didn't mean that as in I wrote it. I picked it from @lapis sequoia 's code.

hasty grail Nov 12, 2020, 9:01 AM

#

nvm thought kash said that

#

lol

lapis sequoia Nov 12, 2020, 9:02 AM

#

Let me put brackets around that

#

I'll grab the error logs one sec dude

#

https://pastebin.com/3kdSLQ6r

Pastebin

2gb!!!! - Pastebin.com


#

I'll be right back homies

hasty grail Nov 12, 2020, 9:09 AM

#

oh

#

you need to zip the two datasets

#

if you're passing in a dataset, the y parameter in tf.keras.Model.fit will be ignored

#

according to the docs:

#

A tf.data dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).

undone flare Nov 12, 2020, 9:13 AM

#

g.map(sns.displot,'total_bill')
What is wrong with this? It draws the displot separately from the grid

#

but when I do

g.map(sns.distplot,'total_bill')
It works fine, but a warning says distplot will be deprecated in a future release

#

g.map(sns.histplot,'total_bill')
Works, but I just want to know why displot won't work?

lapis sequoia Nov 12, 2020, 9:20 AM

#

@hasty grail you know how to zip it?

hasty grail Nov 12, 2020, 9:21 AM

#

tf.data.Dataset.zip

lapis sequoia Nov 12, 2020, 9:22 AM

#

So would I do smth like g_dataset = tf.data.Dataset.zip(g_dataset)

unique flicker Nov 12, 2020, 9:24 AM

#

async def masscloneemoji(self, ctx, emoji: discord.PartialEmoji, name=None):

What should I change here if I want to be able to add multiple emojis at once?

hasty grail Nov 12, 2020, 9:25 AM

#

look at the example in the docs @lapis sequoia

#

zip is in the sense of the vanilla Python zip

lapis sequoia Nov 12, 2020, 9:26 AM

#

Yeah ok will do

undone flare Nov 12, 2020, 9:26 AM

#

anyone know why displot doesn't work with grid?

#

g.map(sns.displot,'total_bill')
What is wrong with this? It draws the displot separately from the grid

This is what I am talking about

lapis sequoia Nov 12, 2020, 9:38 AM

#

@hasty grail same error

hasty grail Nov 12, 2020, 9:39 AM

#

code?

lapis sequoia Nov 12, 2020, 9:39 AM

#

Sec

#

https://pastebin.com/veRPANyZ

Pastebin

gan gang - Pastebin.com


hasty grail Nov 12, 2020, 9:46 AM

#

g_dataset = tf.data.Dataset.zip(g_dataset) that's not how you zip datasets

#

did you read the docs?

#

you need to zip the sample and label datasets together into a single dataset

#

and pass only that dataset to fit

lapis sequoia Nov 12, 2020, 9:48 AM

#

I think it won't work because g_x_in is an ndarray

#

I wasn't sure what to do because it said you need to put dataset objects in it

#

G_y_in is a dataset

#

I can comment out the line where it's g_x_in/127.5 - 1

#

That'll change g_x_in to a dataset and the zip will work but then the issue is how do I normalise all the data

hasty grail Nov 12, 2020, 10:05 AM

#

You can normalize the data, then convert it into a dataset

#

Or use .map on the dataset to apply a mapping function to each element

lapis sequoia Nov 12, 2020, 10:20 AM

#

BUFFER_SIZE = 5000

gen_input = h5py.File('/content/gdrive/My Drive/Colab Notebooks/files/training_mnist_raw.h5','r')
g_x_in = gen_input.get('images')
g_x_in = np.array(g_x_in)/127.5 - 1 
g_y_in = gen_input.get('labels')
g_dataset = tf.data.Dataset.from_tensor_slices(g_x_in)
g_dataset = g_dataset.shuffle(BUFFER_SIZE)
g_dataset = g_dataset.batch(BATCH_SIZE,drop_remainder=True)
g_dataset = tf.data.Dataset.zip((g_dataset,g_y_in))

 gives me

```TypeError                                 Traceback (most recent call last)
<ipython-input-19-262507c436a9> in <module>()
      8 g_dataset = tf.data.Dataset.from_tensor_slices(g_x_in)
      9 
---> 10 g_dataset = tf.data.Dataset.zip((g_dataset,g_y_in))
     11 
     12 

1 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py in zip(datasets)
    998       Dataset: A `Dataset`.
    999     """
-> 1000     return ZipDataset(datasets)
   1001 
   1002   def concatenate(self, dataset):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py in __init__(self, datasets)
   3481           message = ("The argument to `Dataset.zip()` must be a nested "
   3482                      "structure of `Dataset` objects.")
-> 3483         raise TypeError(message)
   3484     self._datasets = datasets
   3485     self._structure = nest.pack_sequence_as(

TypeError: The argument to `Dataset.zip()` must be a nested structure of `Dataset` objects.```

#

I don't know if it's different because of the fact that it's a batch dataset

#

Because g_dataset before being zipped is a batch dataset object
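For reference: in the snippet above, `g_y_in` is still what `gen_input.get('labels')` returned (an h5py object), not a `tf.data.Dataset`, which would explain the TypeError regardless of batching. A minimal sketch of zipping two real datasets, assuming small random stand-in arrays in place of the HDF5 contents (the shapes and the names `images`, `labels`, `image_ds`, `label_ds`, `paired` are made up for illustration):

```python
import numpy as np
import tensorflow as tf

BATCH_SIZE = 4
BUFFER_SIZE = 16

# Stand-ins for the arrays read from the HDF5 file (shapes are assumptions).
images = np.random.randint(0, 256, size=(16, 8, 8)).astype("float32")
labels = np.arange(16, dtype="int64")

# Normalise first, then wrap each array in its own Dataset.
image_ds = tf.data.Dataset.from_tensor_slices(images / 127.5 - 1.0)
label_ds = tf.data.Dataset.from_tensor_slices(labels)

# zip pairs element i of one dataset with element i of the other,
# so shuffle and batch only after zipping to keep the pairs aligned.
paired = tf.data.Dataset.zip((image_ds, label_ds))
paired = paired.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

x, y = next(iter(paired))
print(x.shape, y.shape)
```

With the real file, the labels would first need converting to an array (for example with `np.array(...)`, as is already done for the images) before `from_tensor_slices` can wrap them.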

cobalt jetty Nov 12, 2020, 10:24 AM

#

what is inside your zip?

#

structure wise

lapis sequoia Nov 12, 2020, 10:26 AM

#

Images and labels

cobalt jetty Nov 12, 2020, 10:29 AM

#

The point seems to be that the content of your zip isn't properly structured to be accepted by tensorflow.
Since you want to use MNIST, it seems, you should find it easier to do this:

import tensorflow as tf
import tensorflow_datasets as tfds
(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)
def normalize_img(image, label):
  """
  Normalizes images: `uint8` -> `float32`.
  """
  return tf.cast(image, tf.float32) / 255., label

ds_train = ds_train.map(normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_train = ds_train.cache()
ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)
ds_train = ds_train.batch(128)
ds_train = ds_train.prefetch(tf.data.experimental.AUTOTUNE)

ds_test = ds_test.map(normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_test = ds_test.batch(128)
ds_test = ds_test.cache()
ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)

lapis sequoia Nov 12, 2020, 10:59 AM

#

@cobalt jetty thanks man, my data is actually like a smaller version of mnist and the images are distorted

#

It's like 5000 images

#

Do you think the same code would work with just my unloaded image and label dataset from my h5py files? The images are 256*256

cobalt jetty Nov 12, 2020, 11:12 AM

#

The mnist images are formatted in 28x28 natively, where do you get a 256x256 mnist dataset?

lapis sequoia Nov 12, 2020, 11:14 AM

#

So these are actually photos I took with a camera that captured the mnist images that were projected with a laser using a DMD

#

And then I distorted those images using scattering media

#

Lol

#

Those images were 256

#

This project is kinda like a superresolution project

#

I'm essentially correcting the distortion with a gan

bold olive Nov 12, 2020, 11:27 AM

#

Do the sk-learn classifiers not work with just a single feature?

cobalt jetty Nov 12, 2020, 11:58 AM

#

since you want to work on a dataset that is comparable to the MNIST, I would resize your secondary dataset to 28x28.

#

You might also want to look at the following: https://keras.io/examples/vision/image_classification_from_scratch/

Keras documentation: Image classification from scratch

#

there are some pretty nice pre-processing functions from tf/keras shown there

#

especially tf.keras.preprocessing.image_dataset_from_directory, which might ease your workflow.

lapis sequoia Nov 12, 2020, 12:17 PM

#

I don't think I would need to do that though, my generator downsamples the image and outputs a 28*28 image

#

I've seen precedent of this working in a paper which was the inspiration for this project

cobalt jetty Nov 12, 2020, 12:18 PM

#

mhm

#

I've used that page to help preprocess a relatively chonky dataset to train a NSFW detector.

lapis sequoia Nov 12, 2020, 12:19 PM

#

📎 received_1689185424578742.webp

cobalt jetty Nov 12, 2020, 12:19 PM

#

You're trying to create a LeNet-like network?

lapis sequoia Nov 12, 2020, 12:20 PM

#

I've never heard of that I thought this was based off the original SRGAN

#

Just appropriated to a physics context

#

Basically the generated output is 28*28 anyway

cobalt jetty Nov 12, 2020, 12:25 PM

#

So what are you trying to achieve? At first glance it seems like you're trying to compress a 128² picture into a 28² one.

lapis sequoia Nov 12, 2020, 12:28 PM

#

My images (which are 256*256) go through the generator and are processed to eventually look like the target images (mnist) and then go through the discriminator which decides if the image is fake or real

#

It's essentially a network that is making predictions of what the image looked like prior to distortion

cobalt jetty Nov 12, 2020, 12:29 PM

#

that's actually neat.

lapis sequoia Nov 12, 2020, 12:30 PM

#

Would be if I could get it to work lol

#

I just can't seem to figure out how to load in data. I've been trying to zip data but I feel like I shouldn't even need to; my data isn't over 2gb

#

My training dataset is 313 mb

cobalt jetty Nov 12, 2020, 12:34 PM

#

what's the issue? You can't load your data in memory?

#

I don't understand why you'd need to zip the data to work with it

lapis sequoia Nov 12, 2020, 12:35 PM

#

Yeah exactly lol

#

But no the issue is when I run fit

#

I'll paste the traceback

#

https://pastebin.com/3kdSLQ6r

Pastebin

2gb!!!! - Pastebin.com


#

This is an error I get when I try to run my code

cobalt jetty Nov 12, 2020, 12:36 PM

#

can I get the code of the cell where it happens?

lapis sequoia Nov 12, 2020, 12:36 PM

#

Sure one sec

cobalt jetty Nov 12, 2020, 12:37 PM

#

also I'm in class right now, I might not answer quickly.

grave frost Nov 12, 2020, 12:37 PM

#

Anyone know how to load a large checkpoint without eating up all the RAM? (I have about a 20G Pytorch checkpoint) Would prefer a solution that can make it work in about 16G of memory....

lapis sequoia Nov 12, 2020, 12:37 PM

#

No problem I really appreciate your help all the same

#

https://pastebin.com/DB1Xbp7a

Pastebin

gantrainstep - Pastebin.com


#

Also it's 11:40 pm for me so I might knock soon, if you want you can just DM me

#

Really appreciate everything tho guys thank you all

cobalt jetty Nov 12, 2020, 12:41 PM

#

you have an issue in how your g_dataset or dataset are constructed.

#

you're trying to pass a file which tensorflow cannot parse as a tensor because it's too big.

#

However if your dataset is only 313mb, you might have a preprocessing issue

#

you have to look in your previous cells

#

how you process and get your two variables.

lapis sequoia Nov 12, 2020, 12:44 PM

#

I don't know I just used sentdex's method for storing data in h5py

#

I mean he stored his in pickle files but I just used hdf5

#

Cos pickle sux

#

Lol

cobalt jetty Nov 12, 2020, 12:45 PM

#

tbh, your dataset is small enough you can just use keras to load pictures directly in batches of like 16, 32, 64

#

no need to perform some preprocessing like that.

smoky bobcat Nov 12, 2020, 12:46 PM

#

is logistic regression and naive bayes good for numerical classification dataset?

#

supervised learning classification

cobalt jetty Nov 12, 2020, 12:47 PM

#

depends on the complexity of your dataset, but I'd say that logistic regression is a good start for binary classification iirc.

lapis sequoia Nov 12, 2020, 12:47 PM

#

Yeah I was thinking about that roms

smoky bobcat Nov 12, 2020, 12:47 PM

#

depends on the complexity of your dataset, but I'd say that logistic regression is a good start for binary classification iirc.
@cobalt jetty the target is binary yeah

cobalt jetty Nov 12, 2020, 12:48 PM

#

then I'd go back to the Keras page I sent you and use the functions there.

lapis sequoia Nov 12, 2020, 12:48 PM

#

I would use keras preprocessing right

#

Yeah

smoky bobcat Nov 12, 2020, 12:48 PM

#

@cobalt jetty what about naive bayes? is that good for binary target as well?

cobalt jetty Nov 12, 2020, 12:49 PM

#

I've never used naive bayes, so I can't tell.

smoky bobcat Nov 12, 2020, 12:49 PM

#

i got decision tree, logistic regression and naive bayes on my list

#

ook

cobalt jetty Nov 12, 2020, 12:49 PM

#

but like logistic regression is super easy to implement

smoky bobcat Nov 12, 2020, 12:49 PM

#

all of them are easy, same code just changes the model function

#

make the model, fit the model and test predictions

#

tell me if i'm skipping something

cobalt jetty Nov 12, 2020, 12:51 PM

#

like with scikit-learn

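import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

dataset = pd.read_csv("AAAAAAAAAAAAH.csv", sep=";")
train_set, test_set = train_test_split(dataset, test_size=0.2)
model = LogisticRegression()
# "label" stands for the target column; every other column is used as a feature
model.fit(train_set.drop(columns="label"), train_set["label"])
pred = model.predict(test_set.drop(columns="label"))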

smoky bobcat Nov 12, 2020, 12:52 PM

#

yeah that's what i was saying, are there harder ways to implement the learning algorithms?

cobalt jetty Nov 12, 2020, 12:54 PM

#

You could implement the logreg from scratch

#

doing something from scratch without using libraries is always the hardest.

grave frost Nov 12, 2020, 1:26 PM

#

Anyone know how to load a large checkpoint without eating up all the RAM? (I have about a 20G Pytorch checkpoint) Would prefer a solution that can make it work in about 16G of memory....

cobalt jetty Nov 12, 2020, 1:32 PM

#

if you're talking about all the checkpoints in Pytorch, isn't there a load function where you can specify which checkpoint you want to load.

#

I'd be surprised if a model would weigh 20gb

#

I've not used Pytorch much so I can't really help

fallow prism Nov 12, 2020, 1:39 PM

#

hello everybody, does anyone know how PyCharm compares to other Python editors like VS Code or Sublime? what is the difference between scientific Python development and pure Python development? give me your opinions 🤓 🤓 🤓

austere swift Nov 12, 2020, 1:40 PM

#

yeah I wouldn't think a checkpoint would be 20gb, usually mine only ever go up to like 200mb

#

but I don't think theres any way to do that anyways, you really do have to load everything into memory anyways since the model itself has to be stored in memory

#

so even if you could somehow load the checkpoint without an oom error, you'd still have to have the whole model in your memory anyways

grave frost Nov 12, 2020, 2:03 PM

#

@austere swift ik, but model can also be read through SSD which may impact performance but wouldn't matter much since I am not training, only inferencing. So, that wouldn't really impact time taken that much

austere swift Nov 12, 2020, 2:04 PM

#

I’ve never read the model off disk so I wouldn’t know how to do that lol

spark nimbus Nov 12, 2020, 2:07 PM

#

Making progress on research for some jupyter notebooks I'm working on 👌

📎 Screenshot_20201112_150703.png

grave frost Nov 12, 2020, 2:08 PM

#

With an online calc, puts it to 6s to load the model which doesn't sound that bad, and I can have it done in a day or so

spark nimbus Nov 12, 2020, 2:08 PM

#

Does anyone know of things relating to audio you'd like to be explained in a simplified way?

grave frost Nov 12, 2020, 2:09 PM

#

@austere swift Are you sure that a 200MB model would take exactly 200MB of RAM? Perhaps there are some clever memory tricks done along the way to save memory..?

cobalt jetty Nov 12, 2020, 2:09 PM

#

tbh, I'm always intrigued at how people can splice voice out of a clip (with music for instance) or vice versa.

#

but not enough to read up on that.

grave frost Nov 12, 2020, 2:09 PM

#

@cobalt jetty Using Machine Learning

spark nimbus Nov 12, 2020, 2:10 PM

#

@cobalt jetty either by using a bandpass, machine learning or trying to recreate the music part and subtracting it

cobalt jetty Nov 12, 2020, 2:10 PM

#

I'm answering Mart, but not just ML

#

splicing voice out is older than ML

spark nimbus Nov 12, 2020, 2:10 PM

#

Bandpass and a bit of manual sample editing is probably what they did back in the day

cobalt jetty Nov 12, 2020, 2:10 PM

#

mhm

#

an uneducated guess of mine was that voice and instruments are usually not recorded with the same mics and so voice and instruments would be recorded on different subparts of let's say a magnetic band.

#

so one could only read those parts.

#

but I was wrong, I see.

crude marsh Nov 12, 2020, 2:13 PM

#

Guys. Anyone up? I need some help

spark nimbus Nov 12, 2020, 2:13 PM

#

That's usually the case for source files, but due to size constraints on tapes/cds it all had to be put on one channel (two for stereo), and that was usually stored as interleaved data at best

crude marsh Nov 12, 2020, 2:14 PM

#

Can someone help me? I just need to know how to go about an Idea I have, I will code it myself

#

I just need the framework

spark nimbus Nov 12, 2020, 2:15 PM

#

@crude marsh what's the issue?

crude marsh Nov 12, 2020, 2:15 PM

#

I have this code that basically scrapes a share price from the web and then prints out the price

#

I need it to record the prices in an excel file

spark nimbus Nov 12, 2020, 2:16 PM

#

but yeah, the interactive jupyter notebooks I'm working on should hopefully help with understanding some more abstract concepts

📎 Screenshot_20201112_151443.png

#

@crude marsh try openpyxl, or if even something simple works, you could export it as CSV using the built-in csv library

crude marsh Nov 12, 2020, 2:17 PM

#

So, in open csv, Does it have to be in a table form?

spark nimbus Nov 12, 2020, 2:17 PM

#

!docs csv.writer

arctic wedgeBOT Nov 12, 2020, 2:17 PM

#

`csv.writer`

csv.writer(csvfile, dialect='excel', **fmtparams)
Return a writer object responsible for converting the user’s data into delimited strings on the given file-like object. *csvfile* can be any object with a `write()` method. If *csvfile* is a file object, it should be opened with `newline=''`. An optional *dialect* parameter can be given which is used to define a set of parameters specific to a particular CSV dialect. It may be an instance of a subclass of the `Dialect` class or one of the strings returned by the `list_dialects()` function. The other optional *fmtparams* keyword arguments can be given to override individual formatting parameters in the current dialect. For full details about the dialect and formatting parameters, see section Dialects and Formatting Parameters... [read more](https://docs.python.org/3/library/csv.html#csv.writer)

crude marsh Nov 12, 2020, 2:18 PM

#

Whats this?

spark nimbus Nov 12, 2020, 2:18 PM

#

The documentation for something that writes to a CSV file, there's an example if you click read more
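A minimal sketch of recording one scraped price per run with `csv.writer`. The helper name `append_price`, the file name `prices.csv`, and the price value are placeholders for illustration, not real market data:

```python
import csv
from datetime import datetime


def append_price(csv_path, symbol, price):
    """Append one timestamped price row to csv_path (hypothetical helper)."""
    # newline='' is what the csv docs recommend when opening the file.
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        timestamp = datetime.now().isoformat(timespec="seconds")
        writer.writerow([timestamp, symbol, price])


# The price string here is a placeholder for whatever the scraper returns.
append_price("prices.csv", "HDFCBANK.NS", "1,400.00")
```

Opening in append mode means each run adds a row instead of overwriting the file, and the writer quotes the price automatically because it contains a comma.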

crude marsh Nov 12, 2020, 2:18 PM

#

I am not able to access the file

spark nimbus Nov 12, 2020, 2:19 PM

#

Uhh...

crude marsh Nov 12, 2020, 2:19 PM

#

I can only read !docs csv.writer

spark nimbus Nov 12, 2020, 2:19 PM

#

🤔

#

do you have embeds disabled somehow?

crude marsh Nov 12, 2020, 2:19 PM

#

I dont know. How to enable them?

spark nimbus Nov 12, 2020, 2:19 PM

#

they should be enabled by default

#

can you screenshot what you see?

crude marsh Nov 12, 2020, 2:20 PM

#

Hmm. Yeah sure.

#

📎 Screenshot_24.png

#

This is what I see.

#

AAh, just a min. found out whats wrong

#

now I can see

#

@spark nimbus U still there?

spark nimbus Nov 12, 2020, 2:23 PM

#

Yeah

crude marsh Nov 12, 2020, 2:23 PM

#

Yeah, I can read it now. I will post the code here for your reference

quiet breach Nov 12, 2020, 2:24 PM

#

does anyone know whether there's a better way to select rows in a dataframe based on a column value than what I'm currently using?

part_df = df[df['path'].str.startswith(directory_path, na=False)]

I'm dealing with 2+ million rows and the command above takes about 11 seconds to complete

crude marsh Nov 12, 2020, 2:24 PM

#

# Imports

import bs4
import requests

#Custom Function
def get_share_price(share_url):
    res = requests.get(share_url)
    res.raise_for_status()

#Element finder
    soup = bs4.BeautifulSoup(res.text, features="html.parser")
    elems = soup.select('#quote-header-info > div.My\(6px\).Pos\(r\).smartphone_Mt\(6px\) > div.D\(ib\).Va\(m\).Maw\(65\%\).Ov\(h\) > div > span.Trsdu\(0\.3s\).Fw\(b\).Fz\(36px\).Mb\(-4px\).D\(ib\)')
    return elems[0].text.strip()

#Get price

price = get_share_price('https://in.finance.yahoo.com/quote/HDFCBANK.NS/history/')

#Call

print('The price of HDFC bank share is ' + price)

#

@quiet breach Sorry mate, I am intermediate

#

Why don't u use copy, paste and ctrlF to do it quickly?

#

@quiet breach

quiet breach Nov 12, 2020, 2:26 PM

#

how do you mean?

crude marsh Nov 12, 2020, 2:26 PM

#

Like, you need to type this line several times, right?

#

is that your qn?

quiet breach Nov 12, 2020, 2:26 PM

#

huh? no

#

I have a dataframe of 2 million rows

#

where I must select a subset containing only the rows where the value in the 'path' column matches a string

#

the initial scope of what I'm working on required this to happen 50 times

#

so I was fine with it taking 10 seconds per run

#

now it needs to run 120k times :)
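One possible approach for repeated prefix lookups, offered as a sketch rather than a measured fix: sort the paths once, then find each prefix range with two binary searches instead of scanning every row. This swaps `str.startswith` for the standard-library `bisect` module; the toy frame and the helper name `rows_with_prefix` are assumptions, and only the `path` column name comes from the question:

```python
from bisect import bisect_left

import pandas as pd

# Toy stand-in for the 2M-row frame.
df = pd.DataFrame({
    "path": ["/a/x.txt", "/b/y.txt", None, "/a/sub/z.txt", "/ab/w.txt"],
    "size": [1, 2, 3, 4, 5],
})

# One-off cost: drop missing paths and sort by path.
sorted_df = df.dropna(subset=["path"]).sort_values("path").reset_index(drop=True)
sorted_paths = sorted_df["path"].tolist()


def rows_with_prefix(prefix):
    """Return rows whose path starts with prefix, via two binary searches."""
    lo = bisect_left(sorted_paths, prefix)
    # Appending the largest code point gives an upper bound for the prefix range.
    hi = bisect_left(sorted_paths, prefix + "\U0010ffff")
    return sorted_df.iloc[lo:hi]


part_df = rows_with_prefix("/a/")
print(part_df)
```

The returned rows come back in sorted-path order with a fresh index, so anything that relies on the original row order or index would need adjusting.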

crude marsh Nov 12, 2020, 2:28 PM

#

Ahh, I see, since I am an intermediate, I cant really help a lot, but from what I know, you can scrape the code using python to return only the values that meet a specific criteria

spark nimbus Nov 12, 2020, 2:29 PM

#

@crude marsh doesn't yahoo finance have an API so you don't need to scrape?

crude marsh Nov 12, 2020, 2:29 PM

#

Yeah, but I am doing it to have some basic experience with web scraping

spark nimbus Nov 12, 2020, 2:29 PM

#

ah

crude marsh Nov 12, 2020, 2:29 PM

#

then I can move on to some complex projects with confidence

#

Like scraping wikipedia

#

Aight, Imma go see if CSV works

spark nimbus Nov 12, 2020, 2:30 PM

#

concern

#

wikimedia api exists

cobalt jetty Nov 12, 2020, 2:30 PM

#

Yahoo doesn't support an API anymore IIRC. Last year their API was removed -- maybe it's changed since.

#

That caused me issues when I tried to recreate my own VIX index calculator.

crude marsh Nov 12, 2020, 2:31 PM

#

Yeah, but I want it to create a chart connecting different articles with each other

#

You know what I mean

#

?

cobalt jetty Nov 12, 2020, 2:31 PM

#

Not really. What do you mean by articles?

crude marsh Nov 12, 2020, 2:32 PM

#

Wait a min. I will show you

#

I can't find the image

#

It basically shows how one article leads to another

#

like there are links to other articles right?

#

those

cobalt jetty Nov 12, 2020, 2:34 PM

#

what do you mean by article here?

crude marsh Nov 12, 2020, 2:34 PM

#

any wikipedia page

#

a mindmap/ flowchart

cobalt jetty Nov 12, 2020, 2:34 PM

#

aren't you working with Yahoo Finance, tho?

crude marsh Nov 12, 2020, 2:34 PM

#

Yeah, this is my future project

cobalt jetty Nov 12, 2020, 2:34 PM

#

based on your snippet above.

crude marsh Nov 12, 2020, 2:34 PM

#

I plan on building it

#

Not yet though

#

How can I make my code transfer all the data to csv file?

#

Any idea. You see my code above

#

What should I edit so that it transfers it to a CSV file?

cobalt jetty Nov 12, 2020, 2:37 PM

#

transfer the stock data you scraped into a pandas DataFrame, then just use the method .to_csv('file.csv')

crude marsh Nov 12, 2020, 2:38 PM

#

Ahh. I see

#

I just need to convert it to a table and then use .to_csv('file.csv')

#

Right?

#

I just need to convert it to a table and then use .to_csv('file.csv')
Using pandas

quiet breach Nov 12, 2020, 2:39 PM

#

table?

#

dataframe

#

then indeed, df.to_csv(path, options)
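A minimal sketch of the DataFrame route. The records are placeholders standing in for scraped values (not real market data), and the column names `symbol` and `price` are assumptions:

```python
import pandas as pd

# Placeholder rows standing in for scraped values.
records = [
    {"symbol": "HDFCBANK.NS", "price": "1,400.00"},
    {"symbol": "HDFCBANK.NS", "price": "1,401.50"},
]

df = pd.DataFrame(records)
# Scraped prices arrive as text; strip thousands separators before converting.
df["price"] = df["price"].str.replace(",", "", regex=False).astype(float)

# index=False keeps the row numbers out of the file.
df.to_csv("file.csv", index=False)
```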

crude marsh Nov 12, 2020, 2:40 PM

#

Okay

#

Arigato(Thanks)

marsh chasm Nov 12, 2020, 4:07 PM

#

@remote valley i ended up figuring it out; i didn't realize gridsearch had the ability to return test scores and training scores; this way i don't have to use the validation_curve function and can just directly plot it using matplotlib

remote valley Nov 12, 2020, 4:30 PM

#

@marsh chasm nice. thanks for telling me. gridsearch does the validation curve stuff for the whole set of parameters and plots with correct axis labels for the parameter set? sounds way easier.

ivory panther Nov 12, 2020, 4:34 PM

#

Anybody have experience using MultiIndex in pandas?

mortal pendant Nov 12, 2020, 4:57 PM

#

https://colab.research.google.com/drive/1tn4l65t47I6G1MV5fcZ0Q-W91Rtwns9e?usp=sharing https://hastebin.com/eyopocaxir.csharp Any ideas what's going wrong?
I'm following this tutorial https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/text/text_generation.ipynb but changing as much as possible to try and show to myself I actually understand what is going on

grave path Nov 12, 2020, 5:24 PM

#

Hello guys, what does it mean when my cross validation score is less than my model accuracy score?

heady hatch Nov 12, 2020, 5:34 PM

#

@grave path could you give more information on what you mean by cv score vs accuracy score? Is your cv using a different metric?

grave path Nov 12, 2020, 5:35 PM

#

Hello Nine, so I split my data first and scaled it, and my accuracy was 84%. Then I applied cross validation to the scaled model and the accuracy was 79%

#

print("{:.3f}".format(scores.mean()))

#

@heady hatch

heady hatch Nov 12, 2020, 5:40 PM

#

@grave path So you're saying the scores for the scaled model was lower than the score of the model unscaled?

#

Could you clarify what you mean by your accuracy was 84%?

grave path Nov 12, 2020, 5:40 PM

#

Model score = 74%
Model Scaled = 84%
Cross Validation for model scaled = 79%

#

because I have training and testing data

heady hatch Nov 12, 2020, 5:41 PM

#

Right so what's model score and what's model scaled?

#

Models themselves don't have a score unless you're talking about oob.

grave path Nov 12, 2020, 5:41 PM

#

Model = LogisticRegression()

heady hatch Nov 12, 2020, 5:41 PM

#

Okay.

#

Is this score on the training data?

grave path Nov 12, 2020, 5:41 PM

#

So my question is that how come my cross validation accuracy was lower

#

Yes Nine

heady hatch Nov 12, 2020, 5:41 PM

#

So

#

Just to clarify.

#

You're asking why your training score was higher than your cv score?

grave path Nov 12, 2020, 5:42 PM

#

isn't cross validation supposed to split the dataset and try to find the best accuracy

#

Yes

heady hatch Nov 12, 2020, 5:42 PM

#

Okay I guess here, give me this information.

#

cv score of model not scaled
cv score of the model scaled.

#

cv isn't trying to find the best accuracy.

#

cv is trying to see how your model will generalize on a validation set.

grave path Nov 12, 2020, 5:43 PM

#

cv score of model not scaled:71%
cv score of the model scaled:79%

heady hatch Nov 12, 2020, 5:44 PM

#

Okay so what I'm seeing here is your model is generalizing better on the validation set.

#

Meaning that it's capturing more signal when the data is scaled.

#

It doesn't really make sense to compare one model's score on training data to another model's cv score.

#

Often times in classical ml, if training score is higher than validation score, that could mean your model is overfitting.

grave path Nov 12, 2020, 5:46 PM

#

what do you mean when you say cv is trying to see how your model will generalize on a validation set

heady hatch Nov 12, 2020, 5:46 PM

#

So do you know how cross validation works, especially in default where it's kfold?

grave path Nov 12, 2020, 5:47 PM

#

kfold is the number of splits right?

heady hatch Nov 12, 2020, 5:47 PM

#

Mhm!

grave path Nov 12, 2020, 5:47 PM

#

might have misunderstood what it does then

heady hatch Nov 12, 2020, 5:48 PM

#

By default, I think cross validation in sklearn uses 5 fold.

#

https://www.researchgate.net/profile/Fabian_Pedregosa/publication/278826818/figure/fig10/AS:614336141750297@1523480558954/The-technique-of-KFold-cross-validation-illustrated-here-for-the-case-K-4-involves.png

grave path Nov 12, 2020, 5:48 PM

#

but is 79% considered good when cross validating ?

heady hatch Nov 12, 2020, 5:48 PM

#

Metric evaluation is another topic. hahaha

#

It depends on the problem.

grave path Nov 12, 2020, 5:49 PM

#

yes so we split the dataset and each time we change the training and testing split and then the cv will be the mean of these right?

heady hatch Nov 12, 2020, 5:49 PM

#

Right.

grave path Nov 12, 2020, 5:50 PM

#

oh I think you cleared something for me then

heady hatch Nov 12, 2020, 5:50 PM

#

It splits the data into k sections; with 5 folds, that's 5 sections.

#

It treats one of them as the validation set: your model trains on the rest and is tested on the validation set. This repeats until each section has been the validation set once.

grave path Nov 12, 2020, 5:50 PM

#

So this is an overall accuracy since it tests more possible outcomes and the scaled has nothing to do with it

#

I should compare it on the split I did manually right

heady hatch Nov 12, 2020, 5:51 PM

#

Hm could you clarify on what you meant by scaled has nothing to do with it?

#

It seems like scaling does better, since your data respond well to scaling.

#

Isn't that what your cv showed? 71 vs 79.

grave path Nov 12, 2020, 5:51 PM

#

Yes but that doesn't have to be interpreted in a bad way

#

or does it?

heady hatch Nov 12, 2020, 5:52 PM

#

I'm not sure. What do you mean by interpreted in a bad way?

grave path Nov 12, 2020, 5:52 PM

#

I was thinking that cv must be higher than Scaled data or something is wrong

heady hatch Nov 12, 2020, 5:53 PM

#

You can think of scaled vs not scaled model as two different models.

#

Because cv is higher in model that scales the data, that's a sign that your data provides more signal when it's scaled.

#

Or that your model isn't able to capture the signal properly when it's not scaled.

#

cv is just a scoring method to tell you how your model generalizes on data it wasn't trained on.

#

Because you don't want to just train it on the training set and test it on the training set.
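A small sketch of the comparison being discussed, on synthetic data generated only for illustration (the variable names and the pipeline are assumptions, not the code from the question):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Putting the scaler inside the pipeline refits it on each training fold,
# so the validation fold never influences the scaling.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# One accuracy per fold: train on 4 folds, score on the held-out fold.
cv_scores = cross_val_score(scaled_model, X, y, cv=5)

# Accuracy on the same data the model was fitted on.
train_score = scaled_model.fit(X, y).score(X, y)

print("cv mean: {:.3f}".format(cv_scores.mean()))
print("train:   {:.3f}".format(train_score))
```

The two numbers answer different questions: the cv mean estimates performance on unseen data, while the training score only says how well the model fits data it has already seen.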

grave path Nov 12, 2020, 5:56 PM

#

Yeah I see your point: since cv trains and tests on everything eventually, it will give you the generalized score

#

since testing it wouldn't make sense since its not new data

slender eagle Nov 12, 2020, 5:56 PM

#

Does anyone here also know Abstract graph transformation, and deriving binary logic from it?

grave path Nov 12, 2020, 5:56 PM

#

Thanks a lot for the help Nine

heady hatch Nov 12, 2020, 5:56 PM

#

Yea no problem, glad to be of help.

molten hamlet Nov 12, 2020, 6:48 PM

#

how do I do this in jupyter? turn text black? I do not see myself copying and pasting cells to make them raw

📎 Screenshot_from_2020-11-12_19-47-50.png

#

uh

#

I have to run cell, nvm

#

thanks guys, you are awesome 😄

plush zenith Nov 12, 2020, 8:24 PM

#

Hi can i ask here something about Matrix operations in an iteration?

heady hatch Nov 12, 2020, 8:33 PM

#

Try it, and if people don't answer, try somewhere else.

plush zenith Nov 12, 2020, 8:34 PM

#

okey

#

thanks

#

im having this error

#

No loop matching the specified signature and casting was found for ufunc inv

#

    J=sp.lambdify([x, y],[dp1,dp2], "numpy")
    f=sp.lambdify([x, y],[dp1,dp2], "numpy")
    v = v0
    print(v)
    for i in range(20):
        Jr=np.array(J(v[0], v[1]))
        fi=np.array(f(v[0],v[1]))
        J_inv=np.linalg.inv(Jr)
        #print(J_inv)
        print("")
        v = v - J_inv @ fi
        print("v")
        print(v)
        print("")
    return

#

basically this is what I want to iterate

#

J and f are two matrices (J is from derivatives), but the thing is I don't know how to make the iteration work without errors

heady hatch Nov 12, 2020, 8:36 PM

#

What kind of errors are you getting?

plush zenith Nov 12, 2020, 8:37 PM

#

TypeError: No loop matching the specified signature and casting was found for ufunc inv
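One common cause of this TypeError is `np.linalg.inv` receiving an array with `dtype=object`, which can happen when the lambdified expressions return nested or mixed structures. Note also that `J` and `f` in the snippet above are built from the same `[dp1, dp2]` list. A minimal sketch of a Newton iteration that builds the Jacobian explicitly and forces float arrays; the system `F` and the starting point are hypothetical, chosen only for illustration:

```python
import numpy as np
import sympy as sp

x, y = sp.symbols("x y")

# Hypothetical system: x^2 + y^2 = 4 and x*y = 1.
F = sp.Matrix([x**2 + y**2 - 4, x * y - 1])
J = F.jacobian([x, y])  # 2x2 matrix of partial derivatives

f_num = sp.lambdify([x, y], F, "numpy")
J_num = sp.lambdify([x, y], J, "numpy")


def newton(v0, steps=20):
    """Run a fixed number of Newton steps from v0 and return the final point."""
    v = np.asarray(v0, dtype=float)
    for _ in range(steps):
        Jr = np.asarray(J_num(*v), dtype=float)          # shape (2, 2)
        fi = np.asarray(f_num(*v), dtype=float).ravel()  # shape (2,)
        # Solving J * step = f avoids forming the inverse explicitly.
        v = v - np.linalg.solve(Jr, fi)
    return v


root = newton([2.0, 0.5])
print(root)
```

Printing `Jr.dtype` and `Jr.shape` inside the loop of the original code is a quick way to check whether the array reaching `inv` is a square float matrix.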