#data-science-and-ml

solemn hull
#
```python
player_df.loc[player_df.SEASON_ID == '2019-20', :]
```
#

@dire pollen l

muted oyster
#

yes this should work

dire pollen
#

```python
from nba_api.stats.endpoints import playercareerstats

# Anthony Davis
career = playercareerstats.PlayerCareerStats(player_id='203076')
player_df = career.get_data_fames()[0]
player_df.loc[player_df.SEASON_ID == '2019-20', :]
```

#

like this?

solemn hull
#

looks right, oh wait i typoed

#

get_data_fames, should be get_data_frames

dire pollen
#

wow brilliant, it work!!

muted oyster
#

nice

dire pollen
#

yeah i saw the error haha

solemn hull
#

hurray PartyGlasses

muted oyster
#

i was working on similar thing when i saw your question

dire pollen
#

I think this was the easy part how can I get this row and other rows and join them?

solemn hull
#

so try to take that last line

this_year = player_df.loc[player_df.SEASON_ID == '2019-20',:]
type(this_year)
#

im guessing its also dataframe but what does that say

tidal bough
#

ML questions get posted in this channel once in a while, so here's mine:
When creating AIs for playing board games, it's common to take advantage of symmetry to reduce the number of possible states. How is that actually done in practice? I had to take advantage of symmetry in one case before(not ML, just a metaheuristic optimization task), but I achieved rather meager results. Is there some sort of hashing algorithm for a 2d array of values that is invariant under rotations/reflections?
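
One common way to approach the question above, sketched under the assumption that the board is a square NumPy array: generate all eight rotations/reflections, pick one canonical representative (here, the variant with the smallest byte string), and use that as the lookup key. The function names `symmetries` and `canonical_key` are made up for illustration.

```python
import numpy as np

def symmetries(board):
    """Yield the 8 rotations/reflections of a square 2-D array."""
    for k in range(4):
        rotated = np.rot90(board, k)
        yield rotated
        yield np.fliplr(rotated)

def canonical_key(board):
    """Return a bytes key that is identical for all 8 symmetric variants."""
    return min(s.tobytes() for s in symmetries(board))

board = np.array([[1, 0, 0],
                  [0, 2, 0],
                  [0, 0, 0]])

# Symmetric positions share one key, so they share one entry in the table.
seen = {canonical_key(board): "evaluation goes here"}
print(canonical_key(np.rot90(board)) in seen)
```

This costs eight array copies per lookup, which may be part of why the gains can feel meager on small boards.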

muted oyster
#

@solemn hull what if i want to get 3 months of 3rd quarter using similar code ?

dire pollen
#

@solemn hull Im not sure I quite get it what you are trying to say

solemn hull
#

so you can build a function to parse each specific player/dataframe, then iterate through the players or days etc

dire pollen
#

🤔 I think I need to learn more python, I try to understand

muted oyster
#
DF1 = DF.loc[DF['Month'] == '07',:]
DF1

i also want 08 and 09

solemn hull
#
```python
from nba_api.stats.endpoints import playercareerstats

# Anthony Davis
def get_player_current_year(player_id):
    # Fetch the player's career stats and keep only the 2019-20 season row.
    career = playercareerstats.PlayerCareerStats(player_id=player_id)
    player_df = career.get_data_frames()[0]
    return player_df.loc[player_df.SEASON_ID == '2019-20', :]

player_results = []
for player_id in ['203076', ...]:
    player_results.append(get_player_current_year(player_id))
print(player_results)
```
dire pollen
#

So I can put different IDs at the same time and I would get the row I want?

solemn hull
#

so i think pandas has specific syntax for multiple conditionals.. no idea if this will work but

DF1 = DF.loc[DF['Month'] in ['07', '08', '09'],:]
#

it will call the api for each player, get the year 2019-20 then build up a list

#

and at the end print the list.. there is probably a better way to do it though, im a pandas newb

#

and yeah carly, it will get only that row for each player

tidal bough
#

Hmm. Maybe DF.loc["07"<=DF['Month']<="09",:]? Not quite the same, mind.

muted oyster
#

@solemn hull oh sorry, my doubt was a separate thing

solemn hull
#

i think they are strings so comparison wont compare the digits

muted oyster
#

not related to Carly's

solemn hull
#

no worries

muted oyster
#

im asking in general.. if i want to get 3 values out of rows 07, 08, 09 are for 3 months of 3rd quarter

#
DF1 = DF.loc[DF['Month'] == '07',:]
DF1

if i give this it will onl return for month of july

solemn hull
#

did you try the above DF['Month'] in ['07', '08', '09']

muted oyster
#

yes is an error

solemn hull
#

ah xD

tidal bough
#

what you definitely can do is

DF1 = DF.loc[(DF['Month'] == '07') | (DF['Month'] == '08') | (DF['Month'] == '09'),:]
#

shame the other way doesn't work, though

#

there's probably a way to make it work.
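
For reference, pandas does have elementwise versions of both ideas: `Series.isin` for "one of these values" and `Series.between` for an inclusive range. A minimal sketch with an invented `DF` (the `Sales` column and its values are assumptions). Zero-padded strings such as `'07'` sort the same way as the numbers they represent, so the range form works on them too.

```python
import pandas as pd

DF = pd.DataFrame({
    "Month": ["06", "07", "08", "09", "10"],
    "Sales": [10, 20, 30, 40, 50],   # made-up values
})

# isin builds a boolean mask: True where Month is one of the listed values.
q3 = DF.loc[DF["Month"].isin(["07", "08", "09"]), :]

# between is the elementwise form of "07" <= Month <= "09" (both ends included).
q3_range = DF.loc[DF["Month"].between("07", "09"), :]

print(q3)
```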

solemn hull
#

freakin pandas, eating up all the bamboo and making strange syntaxes 🐼

muted oyster
#

oh yes it worked. what does | did here ? @tidal bough

solemn hull
#

| is or

muted oyster
#

so we cant simply pass or ?

solemn hull
#

try and see

muted oyster
#

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

solemn hull
#

its some pandas specific syntax

#

i guess

#

different rules than normal python

muted oyster
#

I see, thanks 🙂 I saw that guy's question and was similar to what I was doing otherwise im a noob myself lol

solemn hull
#

heh, i think im gonna go learn the basics

dire pollen
#
@solemn hull It worked but the data got a weird look, is there a way to get the data and make it look 'pretty'?

solemn hull
#

for that, i think you need to use the pandas method of joining data rows.. someone said merge before.

dire pollen
#

Yeah, I will take a look but you definitely helped a lot!

solemn hull
#

or instead of print, you could do

for result in player_results:
    print(result)
#

that way its not printing a list but each specific item individually...

#

awsum glad u got it working

muted oyster
#

u can convert it to dataframe if u want tabular form

#

pd.DataFrame

#

Another question, can we plot trend graph for 2 separate values in same graph ?

dire pollen
#

I want to give format to the data to display it nicer ultimately

tidal bough
#

@muted oyster | is bitwise OR, which for Series is overloaded to act elementwise.

#

or isn't really the same thing
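
A small sketch of the difference, using a made-up Series: `|` combines two boolean Series row by row, while `or` asks Python for the truth value of a whole Series, which is what triggers the `ValueError` quoted earlier. The parentheses matter because `|` binds more tightly than `==`.

```python
import pandas as pd

month = pd.Series(["06", "07", "08"])

# Elementwise OR: each comparison yields a boolean Series, | merges them per row.
mask = (month == "07") | (month == "08")

# Plain `or` needs a single True/False for the left operand, so pandas raises.
try:
    (month == "07") or (month == "08")
except ValueError as exc:
    message = str(exc)

print(mask.tolist())
print(message)
```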

dire pollen
muted oyster
#

@tidal bough is it only for pandas or used in other libraries too ?

solemn hull
#

try

```python
import pandas
from nba_api.stats.endpoints import playercareerstats

# Anthony Davis
def get_player_current_year(player_id):
    career = playercareerstats.PlayerCareerStats(player_id=player_id)
    player_df = career.get_data_frames()[0]
    return player_df.loc[player_df.SEASON_ID == '2019-20', :]

player_results = pandas.DataFrame()
for player_id in ['203076', ...]:
    player_results.append(get_player_current_year(player_id))
# if you're using jupyter you can call display()
display(player_results)
```
tidal bough
#

@muted oyster Well, having | work elementwise on Series is just a Pandas thing.

#

the operator itself is of course used often when working with bits.

muted oyster
#

ok I understood. thx : -)

dire pollen
#

@solemn hull I got no result from that

solemn hull
#

😱

dire pollen
#

Im not sure which kind of result I would get

#

Not even an error

solemn hull
#

lol, dang.. ok, i guess go back to list []

```python
import pandas
pandas.set_option('display.max_rows', None)
pandas.set_option('display.max_columns', None)
pandas.set_option('display.width', None)
pandas.set_option('display.max_colwidth', None)  # None means no truncation

from nba_api.stats.endpoints import playercareerstats

# Anthony Davis
def get_player_current_year(player_id):
    career = playercareerstats.PlayerCareerStats(player_id=player_id)
    player_df = career.get_data_frames()[0]
    return player_df.loc[player_df.SEASON_ID == '2019-20', :]

player_results = []
for player_id in ['203076', ...]:
    player_results.append(get_player_current_year(player_id))
for result in player_results:
    print(result)
```
@dire pollen
#

thats supposed to remove the abbreviating '...' stuff
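
A likely reason the `pandas.DataFrame()` version showed nothing: `DataFrame.append` returned a new frame instead of modifying `player_results` in place, so the loop threw every row away (the method was later removed in pandas 2.0). A sketch of the usual alternative, collecting the one-row frames in a list and calling `pandas.concat` once. `fake_player_row` and the second player ID are stand-ins invented so the example runs without the API.

```python
import pandas as pd

def fake_player_row(player_id):
    # Stand-in for get_player_current_year(); columns and values are invented.
    return pd.DataFrame({"PLAYER_ID": [player_id],
                         "SEASON_ID": ["2019-20"],
                         "PTS": [0]})

rows = [fake_player_row(pid) for pid in ["203076", "000001"]]

# One concat at the end stacks the one-row frames into a single table.
player_results = pd.concat(rows, ignore_index=True)
print(player_results)
```

The result is a single DataFrame, so it prints as one table rather than as a list of separate frames.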

dire pollen
#

Oh I see, well anyways thank you for your help I will try to take a look about the other stuff!

solemn hull
#

np dogeblanky2

ripe forge
#

Terminology question : I came across the term "interval" for a column data type. (for context, this terminology is used in sas documentation). Does interval data refer to continuous data?

lapis sequoia
#

Could be referring to timestamp data

#

usually interval refers to the interval between two given dates

#

or whatever time periods are required

desert oar
#

pandas has a "time period" data type

muted oyster
#

like a state wise counts of closed and open and its total at the end

velvet thorn
#

I have a dataframe like this which i want to convert into:
@muted oyster groupby count unstack

muted oyster
#
DF2.groupby(['State', 'Final_Status' == 'Open' | 'Final_Status' == 'Closed']).size().unstack(fill_value=0)
#

do u mean like this ? but its giving error

#

but is giving total closed and open values for all states and not individually

#

ok I figured out to get closed and open in rows and sort of this code worked:

DF3 = DF2.groupby('State')['Final_Status'].value_counts()
DF3 = pd.DataFrame(DF3)
DF3
#

like closed and open in columns instead of rows

#

ok figured it out lol

#

thanks buddy @velvet thorn

velvet thorn
#

groupby just 'State' actually, but yeah

muted oyster
#

I added .unstack().fillna(0) so it worked
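
Put together, the state-wise open/closed table with a total looks roughly like this. The sample rows in `DF2` are invented; `pd.crosstab` with `margins=True` is an alternative that also adds a total row.

```python
import pandas as pd

DF2 = pd.DataFrame({
    "State": ["NY", "NY", "CA", "CA", "CA"],
    "Final_Status": ["Open", "Closed", "Closed", "Closed", "Open"],
})

# Count each status per state, then pivot the status level into columns.
counts = DF2.groupby("State")["Final_Status"].value_counts().unstack(fill_value=0)
counts["Total"] = counts.sum(axis=1)

# Same table in one call, with a Total row and column.
with_totals = pd.crosstab(DF2["State"], DF2["Final_Status"],
                          margins=True, margins_name="Total")

print(counts)
print(with_totals)
```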

plucky cairn
#

i'm pretty inexperienced in ds/ml coming from an econ background. i want to fit a supervised learning model to associate bodies of text with items from a list of shorter texts. in the training set i know which large texts should be associated with the short texts and in the out-of-sample dataset i have groupings in each list

#

does that make sense

#

where the matched pairs are 'name' and 'strategy'

#

then in the unmatched set i want to associate strategies with the 'name' fields

calm wagon
#

is it?

tidal bough
#

it's from 2019, so hardly 🤔

bold olive
#

df.sparse.to_dense() is returning sparse not found? Am I missing something?

tidal bough
#

hmm

#

Maybe you have an old version?

#

check pandas.__version__

bold olive
#

1.1.0

#

Shouldn't be a problem I guess

#

Really strange

drowsy kite
#

thats happened to me but it was because i renamed my dataframe

#

@bold olive

bold olive
#

Nope, not renaming my dataset anywhere

#

It is actually the output value of a multilabel classification which is in sparse format and I need to convert it into a dense matrix for the metrics

#

Weird thing is that it worked before but when I got back to it and tried running it again, it's returning this error

drowsy kite
#

story of my life

#

you could try restart the runtime and clear any outputs

bold olive
#

Tried it, no luck! The function works alone in a separate instance though

#

This is so weird
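
One thing that can produce this kind of error, offered as a guess since the actual object is not shown: the `.sparse` accessor only exists on pandas objects whose columns have a sparse dtype. Multilabel outputs are often SciPy sparse matrices instead, and those are converted with `.toarray()`. The data below is made up.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Made-up multilabel predictions stored as a SciPy sparse matrix.
y_pred = sparse.csr_matrix(np.array([[1, 0, 1], [0, 1, 0]]))

dense_from_scipy = y_pred.toarray()      # SciPy matrices use .toarray()

# The pandas accessor applies to frames built from sparse data.
sparse_df = pd.DataFrame.sparse.from_spmatrix(y_pred)
dense_df = sparse_df.sparse.to_dense()

# A regular (dense) frame has no usable .sparse accessor.
try:
    pd.DataFrame(dense_from_scipy).sparse
except AttributeError as exc:
    message = str(exc)

print(type(y_pred).__name__, message)
```

Printing `type(...)` of the object right before the failing call is a quick way to see which of these cases applies.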

keen root
#

Hi, this is probably an annoying question, but I have to ask: Does anyone recommend any book to learn Machine Learning? I've looked online but there's just so much stuff!! It's hard to distinguish from hyped stuff, oversimplified things and the actual useful things. So I was looking for something to hook into that would get me through things. I come from a physics background and I'm comfortable with python (if it helps in some way)

tidal bough
#

@keen root I personally:

  1. Did this amazing coursera course: https://www.coursera.org/learn/machine-learning as an overview of the field.
  2. Am now doing the Practical Reinforcement Learning course from this specialization (just because that's what I'm interested in): https://www.coursera.org/specializations/aml

For reading material, I found useful the materials the AI discord suggests:

MACHINE LEARNING
Before you start specialising in any particular field, it's important to learn the core theory of Machine Learning for a broad exposure to ideas and techniques that you can likely apply to any field.

Core
• Bishop - Pattern Recognition and Machine Learning
  • Also check out Model-Based Machine Learning by the same author
• Tibshirani, Friedman, Hastie - The Elements of Statistical Learning
• ColumbiaX on edX - Machine Learning
#

The first course is free. The ones from the Advanced specialization aren't, but coursera's audit mode allows free access to basically everything from the course except quizzes for some reason (programming assignments are available).

keen root
#

@tidal bough Thank you, that's amazing, I'll follow the first course, seems to be quite complete, however it is based on MATLAB/Octave, will it be crucial to understand the contents if I've never worked with them?

trail walrus
#

ooh, I finished that specialization on coursera, not all courses are equally good, but overall I learned a lot

keen root
#

Also, did you find it important to follow some book at the same time?

trail walrus
#

nah, if you want to know something google is your friend.

#

but I do recommend to supplement the material in courses by looking things up whenever you're curious or confused about something

tidal bough
#

@keen root I've never worked with Octave before that course. I didn't find it hard to learn - it's very nice in its native support of matrix and vector calculations.

Also, did you find it important to follow some book at the same time?
I didn't read any ML books until my Practical RL course. The first course provides its own materials, which are quite enough.

keen root
#

Got it, thank you

odd yoke
#

The Elements of Statistical Learning is absolutely fantastic

muted oyster
#

@keen root there are lot of books from OReilly

lapis sequoia
#

@muted oyster O'Reilly is pretty good

#

I learnt the basics from those books

muted oyster
#

yes i started with Head First Python brain friendly

lapis sequoia
#

Also, I recommend that you guys check out StatQuest with Josh Starmer on youtube

#

He covers basic statistics and ML. The channel is amazing for beginners and experts alike.

muted oyster
#

sure, anytime! I'm new to everything and everything helps : )

#

and also most of the O'Reilly books are available in pdfs just a google search would do

#

@keen root

keen root
#

That's great, thank you :)

pearl crystal
#

Can I say joint distribution instead of multivariate distribution, or is it better to say multivariate distribution for multi-dimensional distributions and joint distribution for the joint distribution of different random variables?

muted oyster
#

if i pass this

DF3.set_index('Date').plot();

I get a plot of very small size, how can i enlarge it ?

#

i guess i should ask in help section 😅

muted oyster
#

got it, but had to change it to something much messy

safe sparrow
#

Anyone here with experience with LSTM layers in keras?

#

Im not sure how to interpret the shapes, input and output

plucky cairn
#

@pearl crystal either is fine, multivariate distribution implies that the random variables covary - so it's really the same thing as explicitly saying that the distributions are joint. a multivariate distribution with zero covariance wouldn't really be multivariate, it would just be a collection of univariate distributions

#

you can use .plot(figsize=(width, height)) where width and height are in inches

#

or you can just use matplot lib and build the graph yourself

#

which will probably end up looking better

ancient lichen
#

hey I'm trying to make a classifier to identify if a burger is burger king or mcdonalds. Can some people help me build a dataset? 1 is the worst, and 100 is the best

  1. Can you give me a rating on a scale of 1-100 on how good a burger king burger tastes?
  2. Can you give me a rating on a scale of 1-100 on how healthy a burger king burger is?
  3. Can you give me a rating on a scale of 1-100 on how good a mcdonalds burger tastes?
  4. Can you give me a rating on a scale of 1-100 on how healthy a mcdonalds burger is?

Anyone willing to take a few seconds and think back to when they've had a burger would be great!
muted oyster
#

which will probably end up looking better
@plucky cairn yes I plot it using matplotlib. It's messy bcoz i wanted 12 lines on graph and for every line i had to copy that code 12 times and passing every column in it.
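
Both the size and the copy-pasting can be handled without much code. A sketch with made-up data: `DF3` here is a stand-in with a `Date` column and 12 invented series. Either loop over the columns with matplotlib, or let pandas draw every column in one call with `figsize`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data: one row per date, one column per series.
dates = pd.date_range("2020-01-01", periods=30, freq="D")
DF3 = pd.DataFrame(np.random.default_rng(0).random((30, 12)),
                   columns=[f"series_{i}" for i in range(12)])
DF3.insert(0, "Date", dates)
indexed = DF3.set_index("Date")

# Option 1: matplotlib directly, one line per column via a loop.
fig, ax = plt.subplots(figsize=(12, 6))   # width, height in inches
for column in indexed.columns:
    ax.plot(indexed.index, indexed[column], label=column)
ax.legend(ncol=3, fontsize="small")

# Option 2: pandas plots every column on the same axes in one call.
ax2 = indexed.plot(figsize=(12, 6))
```

In a notebook the figures show up on their own; in a script, `plt.show()` or `fig.savefig(...)` would be needed.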

tidal bough
#

Interesting. Of all the scipy solvers, only DOP853, supposedly a very precise RK solver, has any problems with this equation.

#

this is dy/dt = 1/y - 1/(1-y) + 10*abs(y-0.5) + np.cos(t/10), from y(0)=0.6

ancient lichen
#

anyone want to rate burger king and mcdonalds burgers?

muted oyster
#

Interesting. Of all the scipy solvers, only DOP853, supposedly a very precise RK solver, has any problems with this equation.
@tidal bough this looks interesting. What are the legends about ? As u mentioned dop853 is the only one ? Or also LSODA

#

And what are these things actually ?

tidal bough
#

I only really know how the RK ones work

#

RK23 is actually oscillating a bit too

#

RK45 oscillates less

#

the rest are nigh-perfect
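
For anyone wanting to reproduce the comparison, a minimal sketch with `scipy.integrate.solve_ivp`. The names in the plot legend are the `method` options. The time span is an assumption, since the original range was not stated, and default tolerances are used.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    # dy/dt = 1/y - 1/(1-y) + 10*|y - 0.5| + cos(t/10), as quoted above
    return 1 / y - 1 / (1 - y) + 10 * np.abs(y - 0.5) + np.cos(t / 10)

t_span = (0.0, 100.0)                 # assumed range
t_eval = np.linspace(*t_span, 501)

solutions = {}
for method in ["RK23", "RK45", "DOP853", "Radau", "BDF", "LSODA"]:
    solutions[method] = solve_ivp(rhs, t_span, [0.6], method=method, t_eval=t_eval)

# nfev (number of right-hand-side evaluations) hints at how hard each solver worked.
for method, sol in solutions.items():
    print(method, sol.success, sol.nfev)
```

RK23, RK45 and DOP853 are explicit Runge-Kutta methods; Radau and BDF are implicit and aimed at stiff problems; LSODA switches between the two styles automatically.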

polar berry
#

hey what is a good way to learn python for machine learning if I have absolutely no experience with coding at all
pls ping

rapid ridge
#

someone uses nginx here?

tidal bough
ancient lichen
#

anyone know any good datasets for training very basic classifiers?

tidal bough
#

What kind of classifiers?

#

Like, any ones? Check out the Titanic dataset, it's a classic.

#

oh god, I found a really unforgiving equation

#

dy/dt= y**2 - 50/(1-y)**2 + 50*np.cos(t/5) - 10*np.sin(t/10)

polar berry
#

@tidal bough where is resources?

tidal bough
#

!resources

arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

polar berry
#

@tidal bough which one is the best?

velvet thorn
#

@tidal bough which one is the best?
@polar berry do you really expect them to be able to tell

polar berry
#

@velvet thorn idk bro

#

gonna use codecademy

bitter harbor
#

IBM's always a good choice

thin solstice
#

Thought this is kinda relevant, since it's using numpy with large amounts of data;

Just wondering, I've got two arrays, both of the same shape. They look like this;

```python
a = [[1,2],[6,4,2]]
b = [[3,4],[5,3,4]]
```

Both a and b are numpy arrays, and I was wondering how I'd go about adding them together, to get a result like so:
```python
[[4,6],[11,7,6]]
```

Would this be possible?
fervent bridge
#
```python
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(58, 78, 3)))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add(tf.keras.layers.Conv2D(128, (3, 3), activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(1024, activation='relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(196, activation='softmax'))
model.compile(optimizer=tf.keras.optimizers.Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
```
Isn't my shape supposed to get smaller per layer in a CNN? If so, why do I get this error?
```python
OOM when allocating tensor with shape[479232,1024] and type float on
```
What's with the input shape of `479232`?
desert parcel
#

@desert parcel did you read what I said above
@velvet thorn I just did

#

It does return those

#

when you said you should return loss.item() and the other stuff

polar berry
#

@bitter harbor they're both IBM?

bitter harbor
#

You're asking about courses that I bet not many people here have looked at, look at the reviews as it'll probably mostly be up to your/the general opinion

velvet thorn
#

when you said you should return loss.item() and the other stuff
@desert parcel huh

#

no, it's literally returning a string

#

don't you see the .format call?

#
@fervent bridge you cut out half the error...

#

anyway, a CNN does decrease the size of the input (channels last) along the 2nd and 3rd dimensions (assuming you don't have padding), while increasing the size of the 4th dimension (assuming the number of filters increases)

#

but then you flatten.

#

(which is a prerequisite for passing to a dense layer if you want to work with all the dimensions, but)

#

say you have an input image of size 640x480x3

#

after going through the first three convolutional layers it would end up being 634x474x128 = 38466048.

#

in a vanilla CNN the operation that decreases the size of the image is not really convolution, but pooling.

#

look up MaxPooling2D
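
The `479232` in the error can be reproduced by hand for the 58x78 input in the posted model. A sketch in plain Python, assuming unpadded stride-1 3x3 convolutions (the Keras defaults) and, for the second half, a 2x2 max pool after each convolution. The helper names are made up.

```python
def conv_valid(size, kernel=3):
    """Output length of an unpadded, stride-1 convolution along one axis."""
    return size - kernel + 1

def pool(size, window=2):
    """Output length of non-overlapping pooling along one axis."""
    return size // window

h, w = 58, 78
for _ in range(3):                    # three 3x3 Conv2D layers, no pooling
    h, w = conv_valid(h), conv_valid(w)
flat_no_pool = h * w * 128            # 52 * 72 * 128
dense_weights = flat_no_pool * 1024   # weights in the Dense(1024) kernel

h, w = 58, 78
for _ in range(3):                    # same convs, each followed by 2x2 pooling
    h, w = pool(conv_valid(h)), pool(conv_valid(w))
flat_with_pool = h * w * 128          # 5 * 8 * 128

print(flat_no_pool, dense_weights * 4 / 1e9, flat_with_pool)
```

Without pooling the flattened vector has 479232 entries, so the first dense layer alone needs close to 2 GB of float32 weights. With pooling the flattened size drops to 5120.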

#

@thin solstice how can those be arrays?

#

they're not the right shape

#

unless you're saying they're object arrays (print(a.dtype)), which is not what numpy should be used for, in general
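
If the rows really do have different lengths, one simple option is to keep them as a list of rows and add them pairwise. This is a sketch, not the only way to do it.

```python
import numpy as np

a = [[1, 2], [6, 4, 2]]
b = [[3, 4], [5, 3, 4]]

# Each pair of rows has matching length, so row-wise addition is well defined.
result = [np.add(row_a, row_b) for row_a, row_b in zip(a, b)]
as_lists = [row.tolist() for row in result]
print(as_lists)
```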

desert parcel
#

don't you see the .format call?
@velvet thorn oh.. Now I See

velvet thorn
#

yes.

#

which is why I said you really would benefit from some more work on your fundamentals

#

don't try to dive into DS (and DL) so quickly...

desert parcel
#

my fundamentals are good

#

... despite

#

everything...

#

lol

velvet thorn
#

incidentally, if you had had type hints there

#

your IDE would have made that clear

desert parcel
#

I'm using google colab

#

and I use a text editor

velvet thorn
#

not really sure if it has support for type hints

desert parcel
#

I don't have an IDE installed on my machine

velvet thorn
#

but you can try mypy

desert parcel
#

what's mypy

#

lemme search it up

velvet thorn
#

but anyway, I'm not going to argue with you about your learning path...?

desert parcel
#

yeah sounds fair lol

velvet thorn
#

the last thing I'll say is that that would have been a trivial error to debug

desert parcel
#

I'm good in a sense that

velvet thorn
#

IMO

#

but well, to each their own

desert parcel
#

I agree with that

#

like... um

velvet thorn
#

not that I mind solving this kind of problem since I love procrastinating 🙂

desert parcel
#

I know what to do ut sometimes

#

but*

#

but sometimes I just don't look too into detail like return and print

#

I just assumed they were interchangeable well... until now of course

velvet thorn
#

that...is a very scary statement

desert parcel
#

well I mean

#

Don't let it be..?

#

xd

velvet thorn
#

how long have you spent using Python

desert parcel
#

about 4 months

velvet thorn
#

hm.

desert parcel
#

but I don't really know anymore lol

velvet thorn
#

well, I hope it works out for you

desert parcel
#

haven't been keeping track

#

It worked

#

so all my issues with my code

#

was because I was returning a string

#

ohhhh no wonder I couldn't get any where

velvet thorn
#

yes

#

I would suggest you look up type hinting

desert parcel
#

it's like

velvet thorn
#

trivial way to prevent such errors

desert parcel
#

def (x: str, y: int)

#

something like that right?

velvet thorn
#

you're missing a function name

desert parcel
#

yeah I know lol

velvet thorn
#

and you can annotate the return type too

#

but yes

#

that's the basic idea
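
A small illustration of the idea, with hypothetical function names: the annotations state that one function returns a string and the other a number. A checker such as mypy would then complain if `training_step` returned the formatted string by mistake.

```python
def format_loss(loss: float) -> str:
    """Build a human-readable message; the return type is a string."""
    return "loss: {:.4f}".format(loss)

def training_step(loss: float) -> float:
    """Return the numeric loss so callers can do arithmetic with it."""
    print(format_loss(loss))   # printing is a side effect, not the return value
    return loss

total = training_step(0.25) + training_step(0.5)
print(total)
```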

desert parcel
#

I use this in my functions

#

or whatever they are used in

#

I don't need to keep track of it too much

graceful ice
#

How to join 2 df's using pandas on the basis of a column but the column data matches partially is that possible

#

hello work (df a) ------- hello world data.

velvet thorn
#

not really.

#

not without a fair bit of processing

#

that's quite a high-level problem

thin solstice
#

unless you're saying they're object arrays (print(a.dtype)), which is not what numpy should be used for, in general
@velvet thorn in that case, oops

graceful ice
#

can anybody help

soft dock
#

You may be able to incorporate something like Levenshtein distance as a conditional check whether to join a specific column, but I think it would be kinda awkward depending on the structure of the two dataframes.

graceful ice
#

let me give you an example to be more specific

bitter harbor
#

about 4 months
@desert parcel I'm honestly impressed that you got away with that for so long

velvet thorn
#

you cannot just use Levenshtein distance or some other difference metric

#

or rather, not alone

#

I would suggest some form of clustering

#

then join on the cluster IDs

#

which is why I said "not without a fair bit of processing"

graceful ice
#

In one df there is a column named model whose value is Galaxy s2;
in the other df there is a column named model whose value is Galaxy s2 a

#

I want to match these 2

velvet thorn
#

yes, we understand the problem.

#

it's not a simple problem

#

it is not difficult to find the distance between two rows given a specific column.

graceful ice
#

@velvet thorn you are taking it ina complex manner

#

wait let me thik a bit out of it

velvet thorn
#

just use some form of string metric

#

@velvet thorn you are taking it ina complex manner
@graceful ice do you understand why I say it is not simple...?

desert parcel
#

@desert parcel I'm honestly impressed that you got away with that for so long
@bitter harbor lol so am I

#

I learnt some selenium and other stuff

bitter harbor
#

hello work (df a)
would these be three different columns

velvet thorn
#

no, "hello work" is the value in a column in one dataframe, and "hello world" is the value in an identically named column in the other dataframe

#

it was a pretty poorly formatted example TBH

bitter harbor
#

yikes ya I thought it was comparing 'words' not the phrase

#

would it work if you checked if the indices of the characters in the column were equal on both objects+/had the same spacing (ei (data) > data - different indices but the letters are the same spacing)

velvet thorn
#

and the thing is

#

(I'm not so sure about this)

#

because their use of terminology isn't very clear

#

but they might want to join the rows

#

as opposed to match them

graceful ice
#

I did it

velvet thorn
#

and the thing is...where do you stop?

#

because with a high enough threshold any number of strings can be matched

bitter harbor
#

ah ya I just read it

#

I'm confused now

graceful ice
#
df['PartialModel'] =df['Model'].apply(lambda x: difflib.get_close_matches(x, invoiceDf['Model'])[0] if len(difflib.get_close_matches(x, invoiceDf['Model']))>0 else "Unknown")
#

read this

velvet thorn
#

that's literally not what you originally said though

#

you said "join"

graceful ice
#

yes

#

I took this apprach

velvet thorn
#

good for you then

graceful ice
#

@velvet thorn I will join with this
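
A sketch of how that approach can be carried through to the join, with invented data. `closest_model`, the `Price` and `Invoice` columns, and the `0.6` cutoff are assumptions. The cutoff is where the "where do you stop?" question shows up: lower it and unrelated models start matching.

```python
import difflib
import pandas as pd

df = pd.DataFrame({"Model": ["Galaxy s2", "Pixel 4", "Nokia 3310"],
                   "Price": [100, 200, 50]})
invoiceDf = pd.DataFrame({"Model": ["Galaxy s2 a", "Pixel 4 XL"],
                          "Invoice": ["A-1", "B-2"]})

def closest_model(name, candidates, cutoff=0.6):
    # Best fuzzy match above the cutoff, or None when nothing is close enough.
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

candidates = invoiceDf["Model"].tolist()
df["PartialModel"] = df["Model"].map(lambda m: closest_model(m, candidates))

# Join on the matched name; rows without a match keep NaN in the invoice columns.
joined = df.merge(invoiceDf, left_on="PartialModel", right_on="Model",
                  how="left", suffixes=("", "_invoice"))
print(joined)
```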

#

@velvet thorn why are you getting agry

velvet thorn
#

huh?

#

what do you mean

graceful ice
#

angry

#

never mind

#

@velvet thorn thanks for your help

#

and time

velvet thorn
#

@velvet thorn thanks for your help
@graceful ice yw

lofty scarab
#

Hello all! I had a question involving route optimization and distance matrices.

#

It's similar to the traveling salesman problem but if you had multiple salesmen

bitter harbor
#

@lofty scarab So no overlapping of the salesmen?

thin solstice
pale thunder
#

you can use the @ operator

thin solstice
#

thanks! :)

pale thunder
#

or np.matmul

thin solstice
#

wait... it seems to only return an array with one value?..

#

lemme show you what I've got...

#
>>> a = np.array([0.0019, -0.01])
>>> ht = np.array([[-0.09],[0.04]])
>>> a@ht
array([-0.000571])
# shouldn't this array be shaped as (2,2)?
#

@pale thunder ^

#

since here on this website, matrix multiplication of two arrays returns an array shaped as 2,2

#

& this is what I get when I try that same thing in python:

>>> A = np.array([1,2])
>>> B = np.array([3,4])
>>> A@B
11
>>> 
red pike
#

try np.matmul

bitter harbor
#

*!!!

thin solstice
#

same thing

>>> np.matmul(A,B)
11
#
>>> A*B
array([3, 8]) 
#

still not a 2,2 array

pale thunder
#
In [17]: A = np.array([[1,2]])

In [18]: B = np.array([[3],[4]])

In [19]: A @ B
Out[19]: array([[11]])

In [20]: B @ A
Out[20]:
array([[3, 6],
       [4, 8]])
thin solstice
#

ohh

#
>>> B @ A
11
@pale thunder same issue
#

wait no

#

I see

#

thank you

bitter harbor
#

this is why linear algebra is hard 😛

thin solstice
#

yeah haha

pale thunder
#

you need to have a row and a column, so they need to be 2D.
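
The shapes make the difference visible. A short sketch using the same numbers as above: 1-D arrays give an inner product, while explicit row and column shapes give either a 1x1 or a 2x2 result depending on the order.

```python
import numpy as np

A = np.array([1, 2])          # shape (2,): 1-D, neither a row nor a column
B = np.array([3, 4])

dot = A @ B                   # 1-D @ 1-D is the inner product, a scalar

row = A.reshape(1, 2)         # shape (1, 2)
col = B.reshape(2, 1)         # shape (2, 1)

inner = row @ col             # (1, 2) @ (2, 1) -> shape (1, 1)
outer = col @ row             # (2, 1) @ (1, 2) -> shape (2, 2)
same_outer = np.outer(B, A)   # outer product straight from the 1-D arrays

print(dot, inner.shape, outer.shape)
```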

thin solstice
#

I'm only in year 10 and I'm struggling to wrap my head around matrix multiplication :P

pale thunder
#

good luck!

thin solstice
#

haven't done anything like this in school lol, thanks! :)

bitter harbor
#

id suggest watching 3b1b's series on it

pale thunder
#

another useful thing is transpose, which you do as A.T

lost yoke
#

@thin solstice try with B = np.array([[3],[4]])

#

oh already said. sorry.

#

also, np.array([3,4]).T

#

that transposes ("turn the other way") the vector
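
One caveat worth checking in a REPL: `.T` leaves a 1-D array unchanged, because there is only one axis to reorder. The array needs two axes before transposing does anything.

```python
import numpy as np

v = np.array([3, 4])
unchanged = v.T               # still shape (2,)

column = v[:, np.newaxis]     # shape (2, 1)
row = column.T                # shape (1, 2); .T works once there are two axes

print(unchanged.shape, column.shape, row.shape)
```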

thin solstice
#

yup, I used that :)

#

thanks

molten hamlet
#

Someone is using plotly? I wonder if there is a way to not use browser to plot

lapis sequoia
#

Matplotlib has always been the go-to plotting library for all my use cases.

covert rover
#

Hey ppl can anyone tell me why everyone uses Jupyter Notebook for data science? I use Spyder but if that many people uses Jupyter it has to be a reason right?

tidal bough
#

@covert rover Main advantage for me is the cell structure.

#

It really is convenient. It's a nice balance between running entire programs, and running single lines of code (REPL).

#

So you can have a cell with imports, one that calculates stuff, one that plots stuff...

#

And if you want to plot with different settings, you just change the plotting cell and rerun that cell without recalculating the data.

covert rover
#

@tidal bough that's convincing
thanks!

solemn topaz
#

How would I go about testing image recognition code that I wrote using OpenCV?

#

Are there any libraries/tools that can help with this?

#

I haven't found any good resources online

#

The only thing I can think of is just to have a folder with a bunch of test images and some json file with text or numbers or whatever that I expect to find in each of them. Is there a better way?
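
The folder-plus-JSON idea can be sketched in a few lines. Everything here is hypothetical: `recognize` is a placeholder for the real OpenCV-based function, and the file names and expected values are invented. A test runner such as pytest could execute one case per manifest entry instead of the plain loop.

```python
import json
import tempfile
from pathlib import Path

def recognize(image_path):
    # Placeholder for the real OpenCV-based function under test.
    return Path(image_path).stem.upper()

def run_cases(image_dir, manifest_path, recognizer):
    """Compare recognizer output with the expected value for every listed image."""
    expected = json.loads(Path(manifest_path).read_text())
    failures = {}
    for filename, want in expected.items():
        got = recognizer(Path(image_dir) / filename)
        if got != want:
            failures[filename] = (want, got)
    return failures

with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "expected.json"
    manifest.write_text(json.dumps({"stop.png": "STOP", "exit.png": "EXIT"}))
    failures = run_cases(tmp, manifest, recognize)

print(failures)   # an empty dict means every image matched its expected value
```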

deft harbor
#

Data pipeline

dire pollen
#

anyone knows how to export a pandas dataframe to csv? Im reading the doc but I got this 'list' object has no attribute 'to_csv'

solemn topaz
#

@deft harbor could you elaborate?

desert oar
#

@dire pollen then it's not a dataframe, it's a list

#

read the error message
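
For context, `get_data_frames()` from the earlier nba_api snippet returns a list of DataFrames, which would explain the error. A minimal sketch, with a stand-in list instead of a real API call and made-up numbers:

```python
import pandas as pd

# Stand-in for career.get_data_frames(): a list holding one or more DataFrames.
# The values are illustrative only.
frames = [pd.DataFrame({"SEASON_ID": ["2018-19", "2019-20"], "PTS": [100, 200]})]

player_df = frames[0]                        # pick one DataFrame out of the list
csv_text = player_df.to_csv(index=False)     # no path given, so a string is returned
# player_df.to_csv("player.csv", index=False) would write a file instead
```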

molten hamlet
#

Decision tree prototype. I mean, it's functioning and predicting; the plotting is the prototype part 😄

tidal bough
#

that's not a tree, it has cycles 😛

molten hamlet
#

nooo

#

I can't draw arrows, I'm learning plotly :d

#

its all one direction starting from want

raven mulch
#

In this video we continue on the topic of Lipschitz continuity by presenting a paper which proposes a projection method to enforce it! If you enjoy this video consider watching the others I have on the topic! 🙂 I would love to have a discussion here or in the comment section; the goal of this YouTube channel is to create knowledge and interesting discussions in this area of ML.

Video: https://www.youtube.com/watch?v=9kxhEdiTwek

Paper: https://arxiv.org/abs/1804.04368

Abstract: We investigate the effect of explicitly enforcing the Lipschitz continuity of neural networks with respect to their inputs. To this end, we provide a simple technique for computing an upper bound to the Lipschitz constant---for multiple p-norms---of a feed forward neural network composed of commonly used layer types. Our technique is then used to formulate training a neural network with a bounded Lipschitz constant as a constrained optimisation problem that can be solved using projected stochastic gradient methods. Our evaluation study shows that the performance of the resulting models exceeds that of models trained with other common regularisers. We also provide evidence that the hyperparameters are intuitive to tune, demonstrate how the choice of norm for computing the Lipschitz constant impacts the resulting model, and show that the performance gains provided by our method are particularly noticeable when only a small amount of training data is available.

desert oar
#

@raven mulch has this technique been adopted at all? it's interesting but I haven't heard of it before

lapis sequoia
#

if i have a dataset with a lot of missing values and i want to calculate (cramers) correlation, is it important to impute the missing values first?

desert oar
#

you need to do something with them. either drop them or impute them @lapis sequoia

raven mulch
#

Similar techniques have had great success with GANs

desert oar
#

imputation is kind of a can of worms but maybe you can get away with mean/median/mode imputation
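
A sketch of the "drop them" option for Cramér's V, assuming pandas and NumPy. The helper name `cramers_v` and the toy data are made up, and this is the plain (uncorrected) statistic, with no bias or continuity correction:

```python
import numpy as np
import pandas as pd

def cramers_v(df, a, b):
    """Cramér's V between two categorical columns, using only complete rows."""
    complete = df.dropna(subset=[a, b])                  # listwise deletion
    observed = pd.crosstab(complete[a], complete[b]).to_numpy()
    n = observed.sum()
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    k = min(observed.shape) - 1
    return float(np.sqrt(chi2 / (n * k)))

toy = pd.DataFrame({
    "a": ["x", "x", "y", "y", None, "x"],
    "b": ["p", "p", "q", "q", "q", None],
})
v = cramers_v(toy, "a", "b")   # the four complete rows are perfectly associated
```

Dropping rows is only reasonable when the missingness is unrelated to the values; otherwise the imputation route mentioned above may be the safer choice.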

lapis sequoia
#

ty

raven mulch
#

And experimental section shows very promising results with feed forward nets and conv nets

desert oar
#

looks like you're interested in regularization, i see you have a video on another "obscure" technique

raven mulch
#

My main area of interest is ML security

#

That’s what I do research in at my uni

#

But I’m interested in this stuff too yeah

#

Which is quite related

desert oar
#

i suspect regularization would be an important topic in that area

raven mulch
#

Yep

desert oar
#

very interesting

lapis sequoia
#

Hi.
Recently I've started exploring graph-like data (complete beginner). Does anyone have a resource recommendation for modelling 'labeled property graph' data? I want to learn how to properly represent such data in python.

pearl crystal
#

Nowadays, should we still use criteria like AIC to compare models?
AIC = 2k - 2ln(L)
We can compare models based on accuracy on test data and use cross-validation techniques. So why do we need these absurd criteria?

desert oar
#

@pearl crystal the criteria aren't absurd. they are meant for cases where you don't necessarily have enough data, or good enough data, to use cross validation or a train/test split

#

also they use different goodness of fit criteria, in this case the likelihood of the model

#

that said, there are some nice asymptotic results relating model fit criteria like AIC DIC and WAIC with LOOCV

#

in a lot of today's machine learning problems, you don't usually need these criteria. and not all of them are actually good criteria. but to call them "absurd" is imo ignorant of their intended purpose
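
To make the formula above concrete, here is a sketch of AIC = 2k - 2ln(L) for a least-squares fit, assuming Gaussian errors with the variance estimated by maximum likelihood. The function name `gaussian_aic` and the synthetic data are assumptions for illustration:

```python
import numpy as np

def gaussian_aic(y, y_hat, n_coef):
    """AIC = 2k - 2 ln(L) for a least-squares fit with Gaussian errors."""
    n = len(y)
    sigma2 = float(np.sum((y - y_hat) ** 2)) / n         # ML estimate of the noise variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = n_coef + 1                                       # coefficients plus the variance
    return 2 * k - 2 * log_lik

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=x.size)        # truly linear data plus noise

aic_by_degree = {}
for degree in (1, 5):
    coefs = np.polyfit(x, y, degree)
    aic_by_degree[degree] = gaussian_aic(y, np.polyval(coefs, x), degree + 1)
```

Lower is better; the 2k term penalises the extra coefficients of the degree-5 fit, which is the role a held-out test set would otherwise play.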

pearl crystal
#

@desert oar
Ben Lambert is a great and expert data scientist. I have seen some of his videos. They were perfect. thanks

desert oar
#

👍 indeed

#

one place you still see AIC used is in time series modeling

#

although its not necessarily ideal there either

#

but in time series work it's often much harder to cross-validate or otherwise hold out test data

pearl crystal
#

I do not know why his videos in youtube do not have enough views

desert oar
#

he's a pretty well respected researcher, so he probably just doesn't spend effort promoting his work

#

i agree i really like his content

lapis sequoia
#

anyone here use eta-squared before?

solid aurora
#

matplotlib's imshow() on a 3-d array treats the third axis as RGB, right?

tidal bough
#

As 10 seconds of googling show, yes:
https://matplotlib.org/api/_as_gen/matplotlib.pyplot.imshow.html

The image data. Supported array shapes are:

(M, N): an image with scalar data. The values are mapped to colors using normalization and a colormap. See parameters norm, cmap, vmin, vmax.
(M, N, 3): an image with RGB values (0-1 float or 0-255 int).
(M, N, 4): an image with RGBA values (0-1 float or 0-255 int), i.e. including transparency.
solid aurora
#

right, oops

#

sorry

tidal bough
#

if shape[2] is anything other than 3 or 4, the extra channels are not ignored: imshow raises a TypeError about an invalid shape for image data.

molten hamlet
modest rune
#
exog : array_like
A nobs x k array where nobs is the number of observations and k is the number of regressors. An intercept is not included by default and should be added by the user. See statsmodels.tools.add_constant.
#

I googled nobs array and got nothing.

#

Here is an example of this array being constructed...

nsample = 50
sig = 0.5
x = np.linspace(0, 20, nsample)
X = np.column_stack((x, np.sin(x), (x-5)**2, np.ones(nsample)))
beta = [0.5, 0.5, -0.02, 5.]

y_true = np.dot(X, beta)
y = y_true + sig * np.random.normal(size=nsample)

X is exog, the nobs x k array

#

To restate my request for help: What is nobs? how are the 4 elements in each array element for nobs used? Any suggestions on something I could read to inform myself?

odd yoke
#

nobs is the number of observations as per the text you posted
and I'm unsure what beta is in the snippet as it is unused

rapid ridge
odd yoke
#

in the code you posted, nobs = 50, k = 4

#

the k comes from the 4 elements in the argument to np.column_stack

modest rune
#

oh, ignore beta. it is used to calculate y_true and I accidentally left out the line of code that explains how it was used

#

added that line back

#

what is k?

#

number of regressors? What do I need to read to better understand that?

odd yoke
modest rune
#

yes, I sorted that much out

#

which was an initial source of confusion

odd yoke
#

regressors are basically what you would call features

modest rune
#

my remaining confusion is about the four k elements... 1: x, 2: sin(x), 3: (x-5)^2, 4: 1

#

hey hey hey... I wouldn't call them anything 😉 I don't even understand what they are

#

what do you mean by "you would call features"

odd yoke
#

the independent variables

#

the "things" that make up the observations

#

like if you had a pandas dataset, you'd have some column "output", and 4 other columns that would correspond to these

#

these would be used to predict the output

#

i'm not exactly a stats major so pardon my lack of proper terminology

modest rune
#

x is raw data. that part is easy.

odd yoke
#

i'm guessing the OLS curve is the one generated from the model ?

modest rune
#

yes

odd yoke
#

if so, then i'm guessing it'd be something like OLS would try to fit y = ax + b sin(x) + c (x-5)^2 + d

modest rune
#

That is what I was worried about, because that means that OLS was given a ton of data about what the plot should look like. So, what work is OLS actually doing if most of the curve is already defined?

odd yoke
#

the model estimated the coefficients a, b, c, d
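
A sketch of what that means for the snippet posted above, using plain NumPy least squares rather than statsmodels (an assumption made to keep the example dependency-free; `beta_clean` and `beta_noisy` are made-up names). The columns of X fix the shape of the curve, and the fit only chooses the four weights:

```python
import numpy as np

nsample = 50
x = np.linspace(0, 20, nsample)
# exog: one row per observation (nobs = 50), one column per regressor (k = 4)
X = np.column_stack((x, np.sin(x), (x - 5) ** 2, np.ones(nsample)))
beta = np.array([0.5, 0.5, -0.02, 5.0])
y_true = X @ beta

rng = np.random.default_rng(0)
y = y_true + 0.5 * rng.normal(size=nsample)

# Least squares picks the a, b, c, d that minimise the squared error.
beta_clean, *_ = np.linalg.lstsq(X, y_true, rcond=None)   # noise-free: recovers beta
beta_noisy, *_ = np.linalg.lstsq(X, y, rcond=None)        # noisy: an estimate of beta
```

So the work OLS does is finding `beta`; the functions `x`, `sin(x)` and `(x-5)^2` are supplied by whoever builds X.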

modest rune
#

Interesting. OK! That helps a ton!

odd yoke
#

do you have the code that gets the result of the model, and plot it ?

modest rune
#

Often times, a person's confusion is more about their lack of ability to view the problem from the right perspective.

#

It is completely copy and paste from the example I linked above.

odd yoke
#

ah yeah i see it now

modest rune
#

Except I naively swapped the data out with a Google option chain volatility smile expecting to get a nice curve fit without changing much in the code. Finally figured out that it wasn't working because I wasn't defining the k parameters.

#

But, I think you gave me enough of a hint... I have an idea of what I need to do now.

odd yoke
#

and if we do graph it, we can see it matches the graph from above

modest rune
#

does the number of k parameters define the number of orders of a polynomial equation that is used?

odd yoke
#

yes

modest rune
#

cool

odd yoke
#

if you wanted to use a polynomial that is

modest rune
#

so, it wouldn't be y = ax + b sin(x) + c (x-5)^2 + d it would be y = ax^3 + b sin(x)^2 + c (x-5)^2 + d?

#

or not.

#

it is not how many orders then, it just defines an equation.

#

you could make it a polynomial or not, your choice

odd yoke
#

no this model accepts anything apparently, you'd have to replace sin(x) , (x - 5)**2 etc with actual x ** 2, x ** 3 etc

#

you could make it a polynomial or not, your choice
@modest rune exactly
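
For the polynomial case mentioned above, the columns are simply powers of x. A minimal sketch with NumPy least squares and a made-up cubic (the coefficients and the name `X_poly` are illustrative assumptions):

```python
import numpy as np

x = np.linspace(-2, 2, 30)
y = 1.0 - 2.0 * x + 0.5 * x ** 3                 # made-up cubic, no noise

# One column per term: x, x^2, x^3 and a constant
X_poly = np.column_stack((x, x ** 2, x ** 3, np.ones_like(x)))
coef, *_ = np.linalg.lstsq(X_poly, y, rcond=None)
# coef is ordered like the columns: [x, x^2, x^3, constant]
```

The number of columns k is the number of terms in the equation, so a cubic needs four columns here, not three.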

modest rune
#

cool. I think I get it. Thanks!

#

@odd yoke Thanks! I was able to make progress! Have more to learn now, but I was able to get a decent curve fit.

odd yoke
#

nice

thin solstice
#

okey, I've got a question about something...
I'm trying to write a neural network library, and I've got something so far. it works at predicting, it's got weights & biases, a mutate function, etc. it's fully functional if you use a genetic algorithm to train it, but personally I'd like to incorporate backpropagation, however I'm having some trouble

arctic wedgeBOT
thin solstice
#

there's my code, and it seems to be very strange during the training process;

#

```python
if __name__ == '__main__':

    n = Network( [2,3,1] )

    tests = [
        [[1,0],[1]],
        [[0,1],[1]],
        [[1,1],[0]],
        [[0,0],[0]]
    ]

    for i in range(2500):
        test = random.choice(tests)
        print('\n')
        print(test)
        print(n.feedforward(test[0]))
        n.train( test[0], test[1] )
        print(n.feedforward(test[0]))

    for test in tests:
        print(test, n.feedforward(test[0]))
```
I'm attempting to teach it the XOR problem, but I'll send a sample of what happens when it is run..
#

this is during training;

```
[[1, 0], [1]]
[0.43385151]
[0.44138086]


[[1, 1], [0]]
[0.44142151]
[0.43554815]


[[0, 0], [0]]
[0.43553338]
[0.42969438]
```
#

the first line is the test, second line is the network's guess, and the third line is (hopefully) the improved network's guess

#

and as you can tell, it is improving, but only per question

#

by the end of the 2,500 training examples, it seems like it hasn't learnt a thing, apart from making all the answers equal for some odd reason

#
```
[[1, 0], [1]] [0.45336777]
[[0, 1], [1]] [0.45343231]
[[1, 1], [0]] [0.45340774]
[[0, 0], [0]] [0.45339234]
```
#

the first list is the test, and the second list is the network's guesses

#

as you can see, they seem to converge to 0.45

#

any help is appreciated greatly, and please @ me in replies, thanks :)
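
The `Network` class itself isn't shown here, so the cause can't be pinned down from the log. One common way to debug hand-written backpropagation is a gradient check: compare the analytic gradients against finite differences. The sketch below is an independent 2-3-1 sigmoid network with a squared-error loss, not the code from the paste; every name in it is an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, x):
    """2-3-1 network with sigmoid activations; returns hidden and output activations."""
    W1, b1, W2, b2 = params
    hidden = sigmoid(W1 @ x + b1)
    output = sigmoid(W2 @ hidden + b2)
    return hidden, output

def loss(params, x, target):
    _, output = forward(params, x)
    return 0.5 * float(np.sum((output - target) ** 2))

def backprop(params, x, target):
    """Analytic gradients of the loss, in the same order as params."""
    W1, b1, W2, b2 = params
    hidden, output = forward(params, x)
    delta_out = (output - target) * output * (1 - output)
    delta_hidden = (W2.T @ delta_out) * hidden * (1 - hidden)
    return [np.outer(delta_hidden, x), delta_hidden,
            np.outer(delta_out, hidden), delta_out]

def numerical_gradients(params, x, target, eps=1e-6):
    """Central finite differences, one parameter at a time."""
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            original = p[idx]
            p[idx] = original + eps
            plus = loss(params, x, target)
            p[idx] = original - eps
            minus = loss(params, x, target)
            p[idx] = original                    # restore before moving on
            g[idx] = (plus - minus) / (2 * eps)
        grads.append(g)
    return grads

rng = np.random.default_rng(0)
params = [rng.normal(size=(3, 2)), rng.normal(size=3),
          rng.normal(size=(1, 3)), rng.normal(size=1)]
x, target = np.array([1.0, 0.0]), np.array([1.0])

analytic = backprop(params, x, target)
numeric = numerical_gradients(params, x, target)
max_error = max(np.max(np.abs(a - n)) for a, n in zip(analytic, numeric))
```

If `max_error` is tiny, the gradients are fine and the learning rate, initialisation or update sign (`p -= lr * grad`) are the next suspects; if it is large, the backward pass has a bug.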

muted oyster
#

a quick question, how do i get max value of a column along with corresponding row element?

#

max value I know, DF['column'].max()

#

DF['column'].idxmax()
this gives index of the max value but i want value from another column which falls in same row as max value

#

🥴 idk if someone will understand what im trying to say

#

Ok got it, sometimes just need to revisit the basics:
DF[DF.Column == DF.Column.max()]
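
Both routes side by side, as a small sketch with made-up column names (`name`, `score`) and data:

```python
import pandas as pd

DF = pd.DataFrame({"name": ["a", "b", "c"], "score": [3, 9, 5]})

# Route 1: idxmax gives the row label, .loc then reads any other column
row_label = DF["score"].idxmax()
best_name = DF.loc[row_label, "name"]

# Route 2: boolean mask, which keeps every row that ties for the maximum
best_rows = DF[DF["score"] == DF["score"].max()]
```

The mask version returns a DataFrame (possibly several rows), while the `idxmax` version returns a single value from the first maximal row.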

lapis sequoia
#

Hello, how are you guys? I want to learn data science and artificial intelligence, and I know that I have to start learning linear algebra, differentiation and integration, statistics, probabilities, and data analysis. Is there anything more I should learn?

static aurora
#

I want to make every cell that's >0.4 yellow and I'm trying to do it like this -->
```python
df.style.applymap(lambda x: 'background-color : yellow' if x > 0.4 else '')
```

#

@lapis sequoia no, that's everything

bitter harbor
#

Hello, how are you guys? I want to learn data science and artificial intelligence, and I know that I have to start learning linear algebra, differentiation and integration, statistics, probabilities, and data analysis. Is there anything more I should learn?
@lapis sequoia I'd suggest basic neural network architecture

viral scroll
#

Hi Guys,

I have a pandas dataset with a datetime field and a value field.

I would like to get the sum of the records sorted week wise in such a way so that the all the records before that week should be included in the sum.

Week 1 should have sum of values for week 1 dates
Week 2 should have sum of values for week 1 dates+week 2 dates
Week 3 should have sum of values for week 1 + week 2 + week 3 dates
and so on

#

Your help will be very much appreciated

Thanks in advance 🙂

velvet thorn
#

Your help will be very much appreciated

Thanks in advance 🙂
@viral scroll sort, groupby, sum, cumsum
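
One way to spell out that recipe, sketched with made-up data and column names (`date`, `value`); `resample` does the weekly grouping here, with weeks ending on Sunday by default:

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-08", "2020-01-15"]),
    "value": [1, 2, 3, 4],
})

weekly = df.sort_values("date").resample("W", on="date")["value"].sum()  # per-week totals
running = weekly.cumsum()                                                # week 1, week 1+2, ...
```

`weekly` holds each week's own sum and `running` holds the "everything up to and including this week" totals asked for above.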

viral scroll
#

Ohh...that was fast...Thanks a lot 🙂

#

let me give it a try

solar jungle
#

Hello, so I had this question about neural networks,
when we merge outputs from 2 different layers, we usually use 'add' layer

#

In keras there are many such layers like 'add', 'multiply', 'average', etc.

#

does anybody have a practical explanation of which one to use to merge when ?

jovial oriole
#

Im working with a dataframe in pandas,
I dont know how to search by a specific year

Basically my question is In 2016, which person sold the most in each category?

#group = df.groupby(["Category", "person"]).sum()
#group["Ship Date"].to_datetime()
#total_sales = group["Units Sold"].groupby(level=0, group_keys=False)
#total_sales.nlargest(1)

But how do I group by the specific year as well

#

the data type of the date column is datetime64[ns]

velvet thorn
#

df.groupby(df['Ship Date'].dt.year)

jovial oriole
#

not working

desert oar
#

@velvet thorn use resample instead

#

df.resample('1Y', on='date')

#

I think

#

Off the top of my head

lapis sequoia
#

hey there

#

I'm having a hard time understanding sync and async..

#

I looked up simple explanations, it says : sync is when request 1 -> response, before you run request2..

#

async is request1 and request2 get executed at the same time without waiting for either to complete

#

I don't have anything I personally do to correlate this with, so this explanation isn't useful..

#

anyway, I'm ultimately trying to understand this in relation to model training:

#

Synchronous training has all worker training on different subsets of input data and incrementally combines results. In asynchronous training, workers operate independently and update variables asynchronously

#

@desert oar

desert oar
jolly hinge
#

Hello fam,

sacred sierra
#

Hey guys, if anyone is good with pandas, I am having some issues with duplicate values that I've tried to describe out in #help-popcorn channel, not sure if there is a more appropriate channel to post this too so apologies if this isn't the place

modest rune
#

I am trying to wrap my head around something with regard to surface fitting. Libraries like scikit-learn and statsmodels provide the tools to fit a curve, but not the tools to fit a surface (3D surface). I get the feeling that, given 3 axes X, Y, and Z, there is a way to curve fit Z with respect to X for every value of Y, then separately curve fit Z with respect to Y for every value of X, and then somehow combine those curve fits to form a surface fit.

Like I mentioned above, the scikit-learn and statsmodels libraries have lots of curve fitting algorithms but no surface fitting algorithms.

#

I think scikit has a few surface fit functions, but not for the vast majority of their curve fit algorithms (e.g. OLS, RLM, LOESS, LOWESS, etc.)

tidal bough
#

Many curve fitting methods work on any number of dimensions. It's weird if scikit can't do it, lemme check...

modest rune
#

They might and maybe they simply lack examples showing how to do it with an extra dimension.

#

Being a noob in this area, I am certain I don't understand much of the documentation.

tidal bough
#

In fact, it looks like the first example is that case:

>>> from sklearn.linear_model import Ridge
>>> import numpy as np
>>> n_samples, n_features = 10, 5
>>> rng = np.random.RandomState(0)
>>> y = rng.randn(n_samples)
>>> X = rng.randn(n_samples, n_features)
>>> clf = Ridge(alpha=1.0)
>>> clf.fit(X, y)
Ridge()
modest rune
#

I looked, didn't see an example of them using that regression for surface fitting. That doesn't mean it can't though. I think there is a high probability what I want to do is supported and easy, I am just missing a piece of the mental puzzle

#

I think you are right. I bet I can pass a properly dimensioned array to pull this off. But, I guess I don't quite get how to do that... An example would be sweet. I am a bit surprised I can't find one if it is supposedly easy.

bitter fiber
#

What is special about the ridge model though?

velvet thorn
#

@velvet thorn use resample instead
@desert oar depends on what you wanna do I guess?

#

like if you wanted to transform instead of aggregate you couldn’t resample

bitter fiber
#

ridge sounds like an esoteric statistical model.

odd yoke
#

I see it very often at work, alongside lasso and elasticnet

tidal bough
#

lemme try LinearRegression

odd yoke
#

it's a scarily simple way to regularize linear models, and it generally doesn't cost anything other than adding a few characters to your code to specify you want to use ridge

velvet thorn
#

ridge sounds like an esoteric statistical model.
@bitter fiber it is mega common

#

at least IME

odd yoke
#

i have the same experience

modest rune
#

My data has lots of outliers, so i was hoping to use something stable like lowess or robust linear models

velvet thorn
#

not working
@jovial oriole what do you mean not working

#

actually .groupby(pd.Grouper(key='Ship Date', freq='Y')) would have been more appropriate (newer pandas spells the yearly alias 'YE')

bitter fiber
#

I've used the facebook ai model prophet since 2016 > for time series specifically

jovial oriole
#

@velvet thorn I got it in the end , I did
dfbyyear2014= df[df['Order Date'].dt.strftime('%Y') == '2014']
So I reworked the dataframe, basically a pre-applied filter

bitter fiber
#

way better than any other model i've tried

tidal bough
#

As you can see, it correctly finds out the coefficients:

[ 0.00000000e+00  1.00000000e+00  1.11022302e-15  3.33066907e-16
  1.00000000e+00 -1.00000000e+00]
#

wait, or does it

#

that's not correct at all, lol

velvet thorn
#

@velvet thorn I got it in the end , I did
dfbyyear2014= df[df['Order Date'].dt.strftime('%Y') == '2014']
So I reworked the dataframe, basically a pre-applied filter
@jovial oriole that's not grouping by though

tidal bough
#

well, actually it's not completely wrong

#

it estimated 0 + X + XY-Y^2

#

real answer is 2 + X + XY - Y^2

#

and I'm not sure how it missed the bias

#

...or maybe it's the model as a whole that does it?

jovial oriole
#

group = dfbyyear2014.groupby(["Category", "person"]).sum()

odd yoke
#

the intercept is in lin.intercept_ @tidal bough

tidal bough
#

ah, nice

#

still kinda misleading, since the coef_ also has it, but it's 0 always 😅

odd yoke
#

uh, does it ?

velvet thorn
#

group = dfbyyear2014.groupby(["Category", "person"]).sum()
@jovial oriole okay, so you actually wanted to filter and then groupby I guess
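
Put together for the original 2016 question, filter-then-groupby could look like the sketch below. The column names come from the messages above; the rows are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Ship Date": pd.to_datetime(
        ["2016-01-05", "2016-03-10", "2016-07-01", "2015-02-02", "2016-09-09"]),
    "Category": ["Toys", "Toys", "Books", "Toys", "Books"],
    "person": ["ann", "bob", "ann", "bob", "cat"],
    "Units Sold": [5, 7, 3, 100, 4],
})

in_2016 = df[df["Ship Date"].dt.year == 2016]                       # filter first
totals = in_2016.groupby(["Category", "person"], as_index=False)["Units Sold"].sum()
best = totals.loc[totals.groupby("Category")["Units Sold"].idxmax()]  # top seller per category
```

Comparing `.dt.year` to an integer avoids the round trip through `strftime` strings, though both give the same filter.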

odd yoke
#

coef_ has 6 elems, it fits a x2y2 + b x2y + c xy2 + d xy + e x + f y

#

looks fine to me

#

then we have the + g with the intercept

tidal bough
#

so yeah, successful fitting.

#

max absolute error of 3.552713678800501e-15

velvet thorn
#

still kinda misleading, since the coef_ also has it, but it's 0 always 😅
@tidal bough no, the bias is not in the coefficient vector

#

I would guess they just both happen to be 0

tidal bough
#

@odd yoke

coef_ has 6 elems
that's the problem, they correspond to 1, x, y, x^2, xy and y^2

#

and yet the first of these is actually always 0, while the actual bias is in intercept_

odd yoke
#

multi variate polynomials are 1, x, y, x^2y, xy^2, x^2y^2

#

and xy

tidal bough
#

Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].
That's the output of PolynomialFeatures

#

(that's exactly my case)

velvet thorn
#

ah, okay, because you have an explicit constant feature

odd yoke
#

ah I see, PolynomialFeatures has a include_bias parameter, and so does LinearRegression (fit_intercept)

#

lin = LinearRegression(fit_intercept=False) fixes it

tidal bough
#

ah, that makes sense

#

the slightly more efficient way is probably include_bias=False in the features.

#

(I'm assuming that to add the constant to the result array is cheaper than having the input have one more column).

modest rune
#

Thanks y'all. I'm afraid I need some time to process what you all are saying. It seems you have confirmed it is easy and possible, I just need to figure out the confusion in my head. If I can't make sense of things, I might come back with sample input data and some code tomorrow or Monday.

tidal bough
#

@modest rune Basically, the general idea is that you can do polynomial curve fitting with only linear regression by generating tons of new features (like, if you have arrays of X and Y coordinates, you also generate arrays of the products X*Y, X**2, Y**2 (for order=2)) and then fitting a plane to this 5-dimensional data. PolynomialFeatures for the former, LinearRegression for the latter (or something with regularization like Ridge)

So in general, you just do

lin = LinearRegression()
model = make_pipeline(PolynomialFeatures(degree), lin)

And then pass it the input and output: the input an array of shape (n,m), the output of shape (n,k), where:
n is the number of points - in my case, a total of 10000 points.
m is the number of dimensions of each input point - in my case, 2.
k is the number of dimensions of each output point. In my case, it's 1. Having it >1 is the same as having several models with the same inputs, but predicting different parameters of the output.
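
The same idea written out by hand with NumPy only, as a sketch: the feature columns follow the `[1, a, b, a^2, ab, b^2]` order quoted above, the surface is the `2 + X + XY - Y^2` from this discussion, and the sample points are random. This stands in for the scikit-learn pipeline rather than reproducing it.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=1000)
Y = rng.uniform(-1, 1, size=1000)
Z = 2 + X + X * Y - Y ** 2                      # surface to recover

# Degree-2 features in the order [1, a, b, a^2, ab, b^2]; one row per point
features = np.column_stack((np.ones_like(X), X, Y, X ** 2, X * Y, Y ** 2))
coef, *_ = np.linalg.lstsq(features, Z, rcond=None)
max_abs_error = np.max(np.abs(features @ coef - Z))
```

Because there is no separate intercept term in this version, the constant 2 shows up as the first coefficient instead of in something like `intercept_`.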

modest rune
#

@tidal bough thankyou so much!

lapis sequoia
#

@desert oar I dont think it particularly belongs in async, because it's about model training strategies.. and ok

winter citrus
#

i'm trying to translate natural language text into text for a program..anyone know where I can get started?

soft dock
prime elm
#

is it possible to make a dictionary where the keys increase by an increment

say

a = 5

...
Results in dict = {1: None, 2: None, 3: None, 4: None, 5: None}
lapis sequoia
#

@winter citrus you can use google translate api

desert oar
#

@bitter fiber ridge regression is L2 regularization, if that's something you are familiar with

velvet thorn
#

is it possible to make a dictionary where the keys increase by an increment

say

a = 5

...
Results in dict = {1: None, 2: None, 3: None, 4: None, 5: None}

@prime elm ...you want all the values to be None?

#
>>> dict.fromkeys(range(5))
{0: None, 1: None, 2: None, 3: None, 4: None}

adapt as necessary
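
Adapted to the 1-to-`a` keys from the question, plus a variant with a step (the name `stepped` and its range are made up):

```python
a = 5

d = dict.fromkeys(range(1, a + 1))             # keys 1..a, every value None
stepped = {k: None for k in range(0, 20, 5)}   # keys increasing by 5 instead of 1
```

`dict.fromkeys` shares one default value across all keys, which is fine for `None` but not for a mutable default such as a list; the comprehension form avoids that.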

desert parcel
#
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

inputs = np.array([
    [1, 2, 3, 4, 5, 6], 
    [7, 8, 9, 10, 11, 12],
    [11, 22, 33, 44, 55, 66],
    [100, 200, 330, 400, 500, 123],
    [99, 123, 33, 32, 12, 44],
    [9999, 123123, 123123, 444343, 5555, 66699]
    ], dtype='float32')

targets = np.array([
    [4], [6], [8], [10], [12], [14]
    ], dtype='float32')

inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)

print(inputs.shape)
print(targets.shape)

train_ds = TensorDataset(inputs, targets)
train_dl = DataLoader(train_ds, shuffle=True)
#

There is a problem with converting the inputs into tensors from numpy arrays

#

Because it says it expected a numpy array but a tensor was given instead

#

But having only one row works just fine

desert parcel
#
tensor([[  -2.3799],
        [  -6.7324],
        [ -23.5662],
        [-145.1938],
        [ -52.2061],
        [1322.5454]]

Some of my predictions still have negative values even though I used mse_loss

desert parcel
#

never mind I figured it out

bitter harbor
#

when computing limits, is it possible to do it all with factoring/one function, or do other methods have to be implemented?

desert parcel
#

I have an error with the final line in this file

#

Error output:
```
RuntimeError Traceback (most recent call last)

<ipython-input-55-90c5585d3b40> in <module>()
1 opt = torch.optim.Adam(model.parameters(), lr=7)
----> 2 fit(5, model, loss_fn, opt, train_dl, eval_dl, accuracy)

3 frames

<ipython-input-49-afd130f584e4> in forward(self, xb)
18
19 def forward(self, xb):
---> 20 xb = xb.reshape(-1, 784)
21 outputs = self.linear(xb)
22 return outputs

RuntimeError: shape '[-1, 784]' is invalid for input of size 200
```

#

I have tried stack overflow but the solutions that are covered are part of a more advanced model and I am unable to follow along with it.

#

And there may be a few lines of code that are not needed so don't mind those too much

#

ping me btw

velvet thorn
#

ping me btw
@desert parcel uh

#

do you understand

#

what reshaping does?

desert parcel
#

yeah

#

at least I think so

#

doesn't it just well

#

reshape a tensor

#

into a different shape

pearl crystal
#

Hi. I have already watched "Udemy_The_Data_Science_Course_2020_Complete_Data_Science_Bootcamp_2020". It was simple and I think it was for beginners and at a shallow level. Could you suggest a better online course (with deeper coverage) for becoming a data scientist? I have an M.S. degree in artificial intelligence, thanks.
I do not know where I can ask similar questions about it, here or another channel

desert parcel
#

do you understand
@velvet thorn yeah I think so

velvet thorn
#

yeah so

#

how can you reshape a tensor of shape (200) into one of shape (-1, 784)? 200 isn't a multiple of 784

#

it doesn't make sense

#

they have different numbers of elements
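
The element-count rule can be seen without PyTorch. The sketch below uses NumPy as a stand-in (an assumption made for portability; `torch.Tensor.reshape` enforces the same rule, raising RuntimeError as in the traceback above):

```python
import numpy as np

batch = np.zeros((4, 1, 28, 28))      # 4 MNIST-sized images: 4 * 784 elements
flat = batch.reshape(-1, 784)         # -1 is inferred as 4

bad = np.zeros(200)                   # 200 elements, not a multiple of 784
try:
    bad.reshape(-1, 784)
    error = None
except ValueError as exc:             # NumPy raises ValueError here
    error = exc
```

An input of size 200 reaching `forward` suggests something other than the image batch was passed in, which would fit the missing-argument fix reported later in the log.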

desert parcel
#

I was following the tutorial

#

And the tutorial didn't have a problem

velvet thorn
#

presumably it has different data

#

that's the only explanation

desert parcel
#

so then

#

it's the MNIST dataset though

#

So I don't think that's the case

#

unless there are multiple instances

#

or I made an error

#

so what can I do then

#

instead of making it into 784

#

I just change 784 to 200?

#

But the MNIST dataset is a 1x28x28

#

changing it from (-1, 784) to (-1, 200) just gave a matrix multiplication error

#

@velvet thorn

velvet thorn
#

it's 784

#

because

#

784 is 1 * 28 * 28

desert parcel
#

I understand that

#

But I'm not sure what to do

gentle tide
#

I have this endpoint code to get the average stock closing price given a stock name, month, and year

@app.route('/stock=<stock>/date=<date>/average', methods = ['GET'])
def average(stock, date):
    if request.method == 'GET':
        dict = {'FB': 0, 'AAPL': 0, 'NFLX': 0, 'GOOG': 0}
        if stock not in dict:
            return "This stock does not exist. List of stocks (case sensitive): \nFB \nAAPL \nNFLX \nGOOG \n"
        try:
            dt = datetime.datetime.strptime(date, "%Y-%m")
        except:
            return "Please enter a valid year and month \nExample: 2020-12 \n"
        df['date'] = pd.to_datetime(df['date'])
        by_stock_month_year = df[(df["company_ticker"] == stock) & (df['date'].dt.month == dt.month) & (df['date'].dt.year == dt.year)]
        if by_stock_month_year.empty:
            return "There is no available price for that date \n"
        prices = by_stock_month_year["closing_price"]
        data = {}
        data['price'] = round(prices.mean(), 2)
        return json.dumps(data, indent = 2)
    else:
        return "Only GET methods are supported \n"

For this csv file

company_ticker,date,closing_price
AAPL,1989-09-19,1.54
AAPL,1989-09-20,1.59
AAPL,1994-12-08,1.28
AAPL,2019-11-15,265.76
GOOG,2004-08-19,49.98
GOOG,2004-08-20,53.95
GOOG,2019-11-15,1334.87

Is there a way to make this cleaner
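
One possible tidy-up, sketched as a plain function so it runs without Flask. The names `average_close` and `KNOWN_TICKERS` are made up, and raising exceptions instead of returning message strings is a design assumption (the route would translate them into responses). Changes relative to the code above: `dict` is no longer shadowed, only `ValueError` is caught, and dates are parsed once when the CSV is loaded.

```python
import datetime
import io

import pandas as pd

CSV = """company_ticker,date,closing_price
AAPL,1989-09-19,1.54
AAPL,1989-09-20,1.59
AAPL,1994-12-08,1.28
AAPL,2019-11-15,265.76
GOOG,2004-08-19,49.98
GOOG,2004-08-20,53.95
GOOG,2019-11-15,1334.87
"""

KNOWN_TICKERS = {"FB", "AAPL", "NFLX", "GOOG"}

def average_close(df, stock, month):
    """Mean closing price for `stock` in `month` (formatted YYYY-MM), rounded to 2 places."""
    if stock not in KNOWN_TICKERS:
        raise ValueError(f"unknown stock {stock!r}")
    try:
        dt = datetime.datetime.strptime(month, "%Y-%m")
    except ValueError:
        raise ValueError("month must look like 2020-12") from None
    mask = (
        (df["company_ticker"] == stock)
        & (df["date"].dt.year == dt.year)
        & (df["date"].dt.month == dt.month)
    )
    prices = df.loc[mask, "closing_price"]
    if prices.empty:
        raise LookupError("no price available for that month")
    return round(float(prices.mean()), 2)

# Parse the date column once at load time instead of on every request
df = pd.read_csv(io.StringIO(CSV), parse_dates=["date"])
```

Since the route is already registered with `methods = ['GET']`, the `request.method` check inside the handler should not be needed either.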

desert parcel
#

784 is 1 * 28 * 28
@velvet thorn I understand why it's 784 but I have no idea what to do next

#

Someone in SO helped me out

#

the solution worked

#

the problem was

#

I left out an argument in a function loss_batch

sweet ember
#

Hi, I am trying to scrape emails from yelp by crawling into individual listing. Using bs4 and selenium for it but not able to scrape them. Where do I ask this?

desert oar
#

that might be against yelp terms of service, in which case we can't help with that on this server @sweet ember

#

!rules 5

arctic wedgeBOT
#

5. Do not provide or request help on projects that may break laws, breach terms of services, be considered malicious/inappropriate or be for graded coursework/exams.

calm wagon
#

how many of these should i have?

net = tflearn.fully_connected(net, 12)

and of what size?

tidal bough
#

depends on the task

#

hmm, I wonder what are the advantages of using tflearn, really

calm wagon
#

;)

#

im making a chat bot in python

tidal bough
#

it claims to be higher-level than TF itself, but doesn't TF have its own Sequential class that allows building models the same way?

calm wagon
#

so how many should i have?

#

@tidal bough

bitter harbor
#

is tflearn separate from tf?

calm wagon
#

wdym

tidal bough
#

it's a wrapper over TF, basically

#

TFlearn is a modular and transparent deep learning library built on top of Tensorflow. It was designed to provide a higher-level API to TensorFlow in order to facilitate and speed-up experimentations, while remaining fully transparent and compatible with it.

calm wagon
#

how many of these should i have?

net = tflearn.fully_connected(net, 12)

and of what size?
@calm wagon

tidal bough
#

@calm wagon you should probably find some existing simple implementation/guide and see how they do it

bitter harbor
#

speed-up experimentations ok

lapis sequoia
#

is it possible to build a face-recognition using Tensorflow

tidal bough
#

that, or just guess. 3 layers of 100 neurons or something.

#

is it possible to build a face-recognition using Tensorflow
well, yes, this is one of the things ML tends to be used for 😛

bitter harbor
#

that does seem redundant tho

lyric canopy
#

!tempmute 739406136981192784 1d Be silent.

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied mute to @lapis sequoia until 2020-08-24 16:54 (23 hours and 59 minutes).

lapis sequoia
#

Is Tensorflow specifically made for face-recognition stuff like that?

bitter harbor
#

tf is specifically made for machine learning yes

tidal bough
#

Tensorflow is a pretty low-level framework for machine learning and neural networks.

#

It's not, like

from tensorflow import FaceRecognition
FaceRecognition().recognize(faces)

😛

lapis sequoia
#

TensorFlow or Opencv which one is great for face-recognition?

tidal bough
bitter harbor
#

opencv can't do recognition on its own

lapis sequoia
#

ik

bitter harbor
#

whereas tf can train on images

lapis sequoia
#

numpy support

#

opencv and numpy

bitter harbor
#

not necessarily live ones

#

idk what you're on about

sweet ember
#

Thanks @desert oar I was just trying projects on webscraping/crawling for my github. I ll try with someother site

molten hamlet
#

cvlib can detect faces

#

and then max pooling extracts numbers from each frame separately 😛

#

for example, image (32x32x1)
image -> convolution of 10 filters -> result is 10 x (30, 30, 1)
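
The shape arithmetic in that example can be checked with a small helper. This is only a sketch: it assumes a 3x3 kernel with stride 1 and no padding (the kernel size is not stated above, but that is what turns 32 into 30), and `conv_output_size` is a made-up name, not a library function. The same formula also covers a pooling window.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size of a convolution (or pooling) output along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed setup: 32x32x1 image, 10 filters of 3x3, stride 1, no padding.
height = conv_output_size(32, kernel=3)
width = conv_output_size(32, kernel=3)
n_filters = 10

# One (height, width) feature map per filter.
print((height, width, n_filters))

# Assumed 2x2 max pooling with stride 2, applied to each feature map separately.
pooled = conv_output_size(height, kernel=2, stride=2)
print((pooled, pooled, n_filters))
```

With padding=1 the same 3x3 kernel would keep the 32x32 size, which is the usual "same" padding case.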

desert oar
#

@sweet ember wikipedia.org is a good place to start. you also dont need selenium for that which makes it a lot simpler

modest rune
#

Recommendations for the best way to write dataframes and numpy arrays to a file? I assume numpy and pandas have builtin functionality to do this. Should I use those or is there something better?

#

Whatever direction I go, I'd like something that can handle large datasets and I can be reasonably confident won't break as I upgrade my library versions over the years.

#

Human readable files would be a plus, but not at the cost of huge file sizes.

#

I assume pickling is a bad idea because pickle files are likely to fail to load if you try to load a pickle file saved in a previous library version?

tidal bough
#

Recommendations for best way to write dataframes and numpy arrays to a file?
numpy has save, which saves to a numpy's own npz binary format

modest rune
#

I saw that. I am leaning that direction. But, I think I would prefer something that works with numpy AND pandas.

#

specifically, I think I would have to use a different function to save a dataframe to file. Which isn't the end of the world, but not ideal.

bitter harbor
#

you can specify the file type with save

#

I usually send it to a .dat

modest rune
#

The idea that json is human readable is very attractive... maybe the files wouldn't be too big in json. If I had to guess, the largest files I would save would be maybe 1 billion doubles.

tidal bough
#

honestly, a dataframe/array in json will probably not be very readable 😛

bitter harbor
#

^^

tidal bough
#

by the way, if they are 2d, you can use just csv

bitter harbor
#

or excel's version

modest rune
#

honestly, a dataframe/array in json will probably not be very readable 😛
@tidal bough

Good point, I hadn't even contemplated what it would look like.

#

Interesting point from this stack exchange discussion: "Another useful point is that although ASCII CSV encoding isn't very efficient, using a file compression utility (like zip, gzip, etc.) on your ascii file will typically bring the file size down to something similar to the size of a binary file."
https://scicomp.stackexchange.com/questions/8404/binary-vs-ascii-file-size
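
A rough way to check that claim on your own data, using only the standard library. This is a sketch with made-up random values, not a benchmark: random doubles printed at full precision are close to a worst case for text compression, so real measurements may compress much better.

```python
import gzip
import random
import struct

random.seed(0)
values = [random.random() for _ in range(10_000)]

# ASCII CSV-style text: one full-precision decimal string per value.
csv_bytes = "\n".join(repr(v) for v in values).encode("ascii")

# Raw binary: 8 bytes per IEEE 754 double.
binary_bytes = struct.pack(f"<{len(values)}d", *values)

# The same text after gzip compression.
gzip_bytes = gzip.compress(csv_bytes)

print("csv   :", len(csv_bytes))
print("binary:", len(binary_bytes))
print("gzip  :", len(gzip_bytes))
```

Swapping `values` for a slice of the real dataset gives a quick feel for whether compressed text is good enough or a binary format is worth it.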

bitter harbor
#

.npy is binary, .npz is compressed @tidal bough btw

#

so german if you care about efficiency/compression i'd suggest looking at numpy's file types

desert oar
#

csv is fine depending on what you have

#

npy is good if you have only numerical data
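
For the numeric-only case, a minimal sketch of the two NumPy formats mentioned above: `np.save` writes a single array to `.npy`, and `np.savez_compressed` writes several named arrays to a compressed `.npz` archive. The file names and array contents are placeholders.

```python
import os
import tempfile

import numpy as np

arr = np.arange(12, dtype=np.float64).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "arr.npy")
    npz_path = os.path.join(tmp, "arrs.npz")

    # .npy: one array, uncompressed binary.
    np.save(npy_path, arr)
    loaded = np.load(npy_path)

    # .npz: an archive of named arrays; savez_compressed also compresses it.
    np.savez_compressed(npz_path, first=arr, second=arr * 2)
    with np.load(npz_path) as archive:
        second = archive["second"]

print(np.array_equal(arr, loaded), second.shape)
```

Both calls keep dtype and shape, which a CSV round trip does not guarantee.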

molten hamlet
#

cvlib is awesome

#

lemon_thinking 😆

#

I saw scissors

#

toothbrush pretty close

#

;d

modest rune
#
while True:
    if cvlib.object == 'person' and cvlib.wields == 'baseball bat':
        police_state.send_swat_team()
molten hamlet
#

omg dont do this @modest rune

#

ah I see

#

xD

#

I always miss jokes

modest rune
#

hahaha! I just received an email from the Central Intelligence Agency asking me to code their new Auto-Policing robots

#

hahaha

#

what did it label the books?

molten hamlet
#

yes

#

nice ai

#

I got two bottles

#

😄

modest rune
#
while True:
    if cvlib.object == 'person' and cvlib.bottles >= 2 and cvlib.ethnicity != 'russian':
        bar.refuse_service()
tidal bough
modest rune
#

And these examples are why I tell my brother that AI is going to cause a disaster at some point.

molten hamlet
#

xD

#

you can use slavic instead, im not russian 😄

#

russian can understand polish but we can't understand theirs :d

modest rune
#

Petty Officer Dirk: "General Dukes, the AI has detected the launch of 56 thermonuclear warheads. But, I am pretty sure it is just a bunch of pencils that fell out of a teacher's satchel. The AI is recommending a counter-offensive."

General Dukes: "Son, trust the AI, launch the counter-offensive."

#

@molten hamlet you could be American for all I know 🙂 I didn't mean to insinuate you were Russian. Was only making a joke that Russians can hold their liquor.

molten hamlet
#

nah chill 😄

#

im fine

#

I love that korean soju

#

its cheap in korea

#

but not cheap here due to import 😄

#

is there anything specific I should know about detecting road signs or keeping a car on the road between the lane line and the edge? 😄

#

got an interview tomorrow

modest rune
#

Can't help you with that. Maybe someone else can chime in.

molten hamlet
#

you know any models?

#

I just read that yolo is fast, but is it popular? 😄

modest rune
#

Nope, zero experience with machine learning. Other than that I have started trying to get better at curve fitting.

gentle tide
#
"dates": "[[\"2004-08-20\", 53.95], [\"2019-11-15\", 1600.63]]"

Does anyone know how to get rid of that weird formatting on the dates?
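
Escaped quotes like that usually mean the list was serialized to a JSON string once and then serialized again as part of the enclosing object. A minimal sketch of undoing it on the reading side, assuming the payload has the shape shown above (the surrounding `{...}` is an assumption):

```python
import json

# Assumed payload: the outer document is JSON, but the value of "dates"
# is itself a JSON document stored as a string.
raw = '{"dates": "[[\\"2004-08-20\\", 53.95], [\\"2019-11-15\\", 1600.63]]"}'

outer = json.loads(raw)
# Decode the inner string a second time to get real nested lists.
dates = json.loads(outer["dates"])

print(dates)
```

If the writing side is under your control, the cleaner fix is to avoid calling `json.dumps` on the list before putting it into the object that gets dumped.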

untold hare
#

@modest rune you have no idea how close that was to actually happening. A Soviet early warning system got confused by sun reflecting off clouds and assumed NATO had launched a first strike.

modest rune
#

@untold hare wow.

#

I think a similar thing happened with America's early warning system.

untold hare
#

Lots of incidents yeah. There is a good book about this, lemme see if I can find it. Basically a must read if you do data science and ML for defense companies

modest rune
#

Cool! Thanks, I might kindle that.

untold hare
#

Do it

languid warren
#

Hey, can someone help me fix this:

#
print("Train data:")
for i in tqdm(range(0, X_train_windowed.shape[0] - seq_len+1)):
        X_train_Conv_LSTM[i] = current_seq_X
        y_train_Conv_LSTM[i] = y_train[i + seq_len - 1]

(262, 3, 50, 50, 3) X_train_Conv_LSTM.shape = (1, 3, 50, 50, 3) current_seq_X.shape
(262, 1) y_train_Conv_LSTM.shape            = (264,) y_train.shape

cupy\core\core.pyx in cupy.core.core.ndarray.__setitem__()

cupy\core\_routines_indexing.pyx in cupy.core._routines_indexing._ndarray_setitem()

cupy\core\_routines_indexing.pyx in cupy.core._routines_indexing._scatter_op()

cupy\core\_kernel.pyx in cupy.core._kernel.ufunc.__call__()

cupy\core\_kernel.pyx in cupy.core._kernel._get_out_args()

ValueError: Out shape is mismatched
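
Going only by the shapes printed above: each `X_train_Conv_LSTM[i]` slot is `(3, 50, 50, 3)`, while `current_seq_X` carries an extra leading axis, `(1, 3, 50, 50, 3)`. One possible fix is to drop that axis before assigning. The sketch below uses NumPy as a stand-in for CuPy and filler arrays, since how `current_seq_X` is built is not shown; it illustrates the shapes only.

```python
import numpy as np

seq_len = 3
X_train_Conv_LSTM = np.zeros((262, seq_len, 50, 50, 3))

# Stand-in for one window with the extra leading axis seen in the output above.
current_seq_X = np.ones((1, seq_len, 50, 50, 3))

# Drop the leading axis so the value matches the (3, 50, 50, 3) slot exactly.
window = current_seq_X[0]
X_train_Conv_LSTM[0] = window

print(window.shape, X_train_Conv_LSTM[0].shape)
```

`current_seq_X.reshape(seq_len, 50, 50, 3)` would do the same job if indexing with `[0]` feels too implicit.
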
desert oar
#

is there some library that lets you create an index on a column in a pandas dataframe that isnt the index of the dataframe?

#

e.g. some data structure that keeps a sorted collection of rows, or a hash table, and does binary search or a hash lookup to find the dataframe rows that you want (or whatever other index implementation is out there, trees etc)

#
df = pd.DataFrame(...)
product_category_index = ColumnIndex(df['product_category'], algorithm='b-tree')
df_pants = df.iloc[product_category_index('pants')]

something like that

#

would be a fun project if nobody has done this already

peak bolt
#

Could someone help me in the help voice channel?

velvet thorn
#

is there some library that lets you create an index on a column in a pandas dataframe that isnt the index of the dataframe?
@desert oar hm.

#

not possible in general, because the index would need to update with the DataFrame

desert oar
#

good point

#

or you could just, not bother

velvet thorn
#

like you could hack it, but it'd be prone to breaking with pandas updates

desert oar
#

and the caller would re-index as desired

velvet thorn
#

and the caller would re-index as desired
@desert oar then what would the benefit over normal pandas indexing be

#

since filtering is at worst linear, and index-building is at best linear

desert oar
#

for big datasets where you already have an index but need to do repeated lookups on non-index fields, or a variety of fields

#

not that uncommon in my work

velvet thorn
#

I see

#

fair enough

#

okay I'm going to need you to stop talking about use cases

#

because I don't think I need another side project

desert oar
#

lol

velvet thorn
#

this is a pretty cool idea

#

I'll see what I can do in an hour

desert oar
#

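from abc import ABCMeta, abstractmethod
from bisect import bisect_left
from typing import Any, Callable, Generic, List, Optional, Sequence, TypeVar

_T = TypeVar("_T")


class BaseIndex(Generic[_T], metaclass=ABCMeta):
    def __init__(self, data: Sequence[_T]):
        self.data = data

    @abstractmethod
    def lookup(self, val: _T) -> Optional[int]:
        """Return a position in self.data matching val, or None if absent."""


class BinsearchIndex(BaseIndex[_T]):
    def __init__(self, data: Sequence[_T], sort_key: Optional[Callable[[_T], Any]] = None):
        super().__init__(data)
        self.sort_key: Callable[[_T], Any] = sort_key if sort_key is not None else (lambda x: x)
        # Positions into data, ordered by the sort key of the value at each position.
        self.order: List[int] = sorted(range(len(data)), key=lambda i: self.sort_key(data[i]))
        self.keys_sorted: List[Any] = [self.sort_key(data[i]) for i in self.order]

    def lookup(self, val: _T) -> Optional[int]:
        # https://docs.python.org/3/library/bisect.html#searching-sorted-lists
        key = self.sort_key(val)
        i = bisect_left(self.keys_sorted, key)
        if i >= len(self.keys_sorted) or self.keys_sorted[i] != key:
            return None
        # Map the position in the sorted keys back to a position in the original data.
        return self.order[i]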

i slapped this together, not sure if it actually works

velvet thorn
#

what do you see this line doing though

#

df_pants = df.iloc[product_category_index('pants')]

desert oar
#

yeah

#

looking it up in the index in < O(n) time

#

then looking it up in the dataframe in O(1) time

velvet thorn
#

I mean

desert oar
#

idk if it actually works that way

velvet thorn
#

what's the expected output

#

the column pants sorted by the value of product_category?

#

i.e. df.sort_values(by='product_category')['pants']?

desert oar
#

it would be equivalent to df.loc[df['product_category'] == 'pants']

velvet thorn
#

wouldn't that just be df[df['product_category'] == 'pants']

#

but okay I get it

#

if you say that the index doesn't need to change with the DataFrame

#

then it seems to me that you could just use a dict

#

where the keys are unique values of the given category and the values are row numbers

#

which would reduce lookups to constant time
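
A minimal sketch of that dict idea, using a plain list as a stand-in for the dataframe column. `build_value_index` is a made-up helper name and the category values are invented. Each value maps to a list of positions, so duplicate values are handled too.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def build_value_index(column: Sequence[Hashable]) -> Dict[Hashable, List[int]]:
    """Map each distinct value to the list of row positions holding it."""
    index: Dict[Hashable, List[int]] = defaultdict(list)
    for position, value in enumerate(column):
        index[value].append(position)
    return dict(index)


# Hypothetical column values, standing in for df['product_category'].
product_category = ["pants", "shirts", "pants", "hats", "shirts", "pants"]
index = build_value_index(product_category)

# Average-case constant-time lookup; the positions could then be passed to df.iloc.
print(index.get("pants", []))
```

As discussed above, this is a snapshot: it goes stale as soon as the column changes, so the caller has to rebuild it after any mutation.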

desert oar
#

thats what i was thinking too

#

that was on my TODO list

#

you could use a B-tree or whatever

#

but yeah a dict is easy

#

also this doesnt support range index lookups (yet)

#

eg if there is more than 1 row with that value

velvet thorn
#

time to poke around pandas source code

#

and see what they do with __setattribute__

desert oar
#

and obviously something like this is kinda useless except on pretty large dataframes

#

heh have fun 😄

velvet thorn
#

indeed

#

how big are your dataframes?

#

honestly I don't think I've ever been at the point that this would be a necessary optimisation

desert oar
#

not that big anymore

#

but ive worked on problems with > 1bn rows in memory

#

or where the lookups just needed to be faster than they were

velvet thorn
#

and you needed to index on arbitrary columns

#

such that a multi-level index wouldn't have worked?

desert oar
#

@velvet thorn thats an interesting option, i still like this separate index idea though 😛

#

im curious if it can actually produce any speed improvements on bigger datasets

flat quest
#

and a new project is born

soft dock
#

I'm working on a project generating guitar hero charts based on tablature but honestly I don't think I'll ever finish

flat quest
#

using ML?

Does it even need ML for that?

desert parcel
#

does anyone know what nan means when you're calculating your loss?

molten hamlet
#

not a number

#

probably too small

#

or some other numerical error

desert parcel
#

Hmm alright

velvet thorn
#

also possibly too big

#

or divide by 0

molten hamlet
#

ah right

#

division by zero is the most likely

velvet thorn
#

yeah

molten hamlet
#

due to numerical error, some numbers just get smaller than epsilon

velvet thorn
#

too big, as distinct from that, usually comes when your learning rate is too high

#

so gradient descent becomes gradient ascent 🎢

desert parcel
#

hmm well I have that right now

#

I messed around with different lrs

#

but it still didn't work after I changed it

velvet thorn
#

do you get nan loss immediately?

#

or after a while

desert parcel
#

Immediately

velvet thorn
#

then it's not that

desert parcel
#

Then what could it be

velvet thorn
#

too big, as distinct from that, usually comes when your learning rate is too high
@velvet thorn not this

#

the other stuff we said

desert parcel
#

huh

velvet thorn
#

okay

#

if the loss starts out finite

desert parcel
#

so my loss could be too large?

velvet thorn
#

but becomes nan after a while

#

(and you see it going up real quick)

#

that suggests that your learning rate is too high

#

because your model's parameters bounce out of the valley of low loss into the skies of float overflow

#

but if your loss starts out as nan

#

that implies that the problem is something else

#

e.g. division by 0 somewhere
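
The "starts finite, shoots up, then turns into nan" pattern can be reproduced with a toy problem. This is a sketch on loss(x) = x * x in plain Python, not anyone's actual model; the learning rates 0.1 and 1.5 are arbitrary picks for "small enough" and "too large".

```python
import math
from typing import List


def descend(lr: float, steps: int, x: float = 1.0) -> List[float]:
    """Run gradient descent on loss(x) = x * x and return the loss per step."""
    losses = []
    for _ in range(steps):
        losses.append(x * x)   # loss at the current parameter
        grad = 2.0 * x         # derivative of x * x
        x = x - lr * grad      # gradient descent update
    return losses


stable = descend(lr=0.1, steps=1100)
diverging = descend(lr=1.5, steps=1100)

print(stable[-1] < stable[0])        # loss shrinks with a small step
print(diverging[1] > diverging[0])   # loss grows from the very first update
print(math.isnan(diverging[-1]))     # and eventually overflows to inf, then nan
```

With the large step, each update overshoots the minimum and lands further away on the other side, so the loss climbs until the floats overflow. A loss that is nan on the very first step never goes through that climb, which is why it points at something else, such as a division by zero or bad input values.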

desert parcel
#

because your model's parameters bounce out of the valley of low loss into the skies of float overflow
@velvet thorn what does that mean

velvet thorn
#

do you know how gradient descent works?

desert parcel
#

yeah

velvet thorn
#

then you should understand that...?

desert parcel
#

My english isn't the best lol

velvet thorn
#

if your learning rate is too high

#

okay never mind

#

let me draw this

desert parcel
#

An increasing gradient requires a low learning rate right?

#

and a decreasing gradient is the opposite of that

velvet thorn
#

basically

#

hm. I should take drawing classes.

desert parcel
#

Naw it's alright lol

#

It's good enough

velvet thorn
#

basically if you adjust your weights by too much each iteration it is possible that you will "bounce" to the other side of the loss landscape

desert parcel
#

So you wanna check my code? Could it be because of the way I added in my input data?

velvet thorn
#

increasing loss in the process

#

So you wanna check my code? Could it be because of the way I added in my input data?
@desert parcel no thank you

desert parcel
#

Because I've never done it this way before

velvet thorn
#

hard to say, could be a few things

desert parcel
#

lol

#

is my code that bad

velvet thorn
#

I don't like debugging DL code

#

it's very time-consuming

desert parcel
velvet thorn
#

because of the level of abstraction

#

nothing about you personally

desert parcel
#

well I did that there are no issues but I'm just wondering

velvet thorn
#

why is your target 2D

#

any reason?

desert parcel
#

Because I'm only trying to predict one thing

velvet thorn
#

yes, so it should be 1D, right

desert parcel
velvet thorn
#

yes, do you not see it is 2D

#

9 is the first dimension

#

1 is the second dimension

desert parcel
#

Oh yeah

velvet thorn
#

(9, 1) is different from (9,)

#

TBH I don't have much experience with Torch so I don't know how it would handle such things

#

but it's at least a little strange IMO
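
One reason the difference matters: under NumPy-style broadcasting, combining a `(9, 1)` array with a `(9,)` array silently produces a `(9, 9)` result instead of nine element-wise values. The sketch below uses NumPy with invented values; PyTorch documents NumPy-style broadcasting too, but the Torch behaviour is not checked here.

```python
import numpy as np

preds = np.arange(9, dtype=float).reshape(9, 1)   # shape (9, 1)
targets = np.arange(9, dtype=float)               # shape (9,)

# Broadcasting pairs every prediction with every target.
diff = preds - targets
print(diff.shape)

# Flattening the predictions gives the intended element-wise result.
diff_flat = preds.reshape(-1) - targets
print(diff_flat.shape)
```

If both arrays are `(9, 1)`, or both `(9,)`, the subtraction is element-wise and this problem does not arise.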

desert parcel
#

Well, but I have two 2D tensors, so that should be fine, right?

velvet thorn
#

what?

#

didn't get that, sorry

#

are you Malaysian btw

desert parcel
#

Oh nice

#

yeah you're right

random perch
#

I'm trying to set up an upstream for tensorflow by doing
git remote add upstream git@github.com:tensorflow/tensorflow.git
however when i try to run
git pull upstream master
I get the error seen in the screen shot. If anyone knows what im doing wrong please lmk. Sorry if im intruding in a conversation

desert parcel
#

well I meant that I have two 2D tensors multiplying them together should be fine

velvet thorn
#

well I meant that I have two 2D tensors multiplying them together should be fine
@desert parcel okay, you have kind of lost me

#

which two tensors are you multiplying together

desert parcel
#

preds and targets

velvet thorn
#

I'm trying to set up an upstream for tensorflow by doing
git remote add upstream git@github.com:tensorflow/tensorflow.git
however when i try to run
git pull upstream master
I get the error seen in the screen shot. If anyone knows what im doing wrong please lmk. Sorry if im intruding in a conversation
@random perch you can't pull directly from the TF repo

desert parcel
#

preds = model(inputs)

#

model = (11, 1)

velvet thorn
#

preds = model(inputs)
@desert parcel ah, okay

#

that seems reasonable

#

could be something else in the data

#

hard to say from here

#

just experiment a little

random perch
#

@random perch you can't pull directly from the TF repo
@velvet thorn How do I update my forked repo to match the TF repo

velvet thorn
#

okay it's been a while since I actually forked a repo

#

so I don't wanna tell you the wrong thing that I'm not sure about

desert parcel
#

Lol i'm not even familiar with git

velvet thorn
#

think that's more appropriate

random perch
#

ite bet

velvet thorn
#

actually

#

I feel like what I said might be wrong

#

about not being able to pull directly

#

hm

#

let me try

desert parcel
#

so if it's not the tensor issue

#

then it's the data?

velvet thorn
#

could be your model too...?

#

@random perch never mind

#

I'm p sure I'm wrong

desert parcel
#

Hmm

velvet thorn
#

it's a different issue

desert parcel
#

well then I'm not sure how to proceed then

velvet thorn
#

Git can't access your credentials

#

are you using Windows or *nix

#

oh okay I think I get it

desert parcel
#

do you mean unix?

velvet thorn
#

*nix = Unix, Linux, etc.

#

it's because

random perch
#

Im using mac

velvet thorn
#

you're trying to connect using SSH

random perch
#

so unix

velvet thorn
#

the SIMPLEST way to fix this

#

is

#

do this instead

#

git remote add upstream https://github.com/tensorflow/tensorflow.git

random perch
#

oh mm

velvet thorn
#

although I would suggest you look into setting up SSH keys

random perch
#

yeah that actually might work lol

velvet thorn
#

like you notice

#

the URL is different

random perch
#

i did set up my SSH key

#

but idk why its being wack

velvet thorn
#

no Mac experience, sorry

random perch
#

what do u use

velvet thorn
#

Ubuntu

desert parcel
velvet thorn
#

why is there

#

a nan

#

in the input?

#

if you have nan inputs of course the output will be nan too
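
A small illustration of why: any arithmetic that touches a nan yields nan, so a single bad input value poisons the prediction and the loss. The helper names and numbers below are invented for the sketch; the point is the cheap check on the inputs before training.

```python
import math
from typing import List, Sequence


def weighted_sum(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """A single linear unit without bias: sum of input * weight."""
    return sum(x * w for x, w in zip(inputs, weights))


def nan_positions(values: Sequence[float]) -> List[int]:
    """Positions of nan entries, as a cheap check before training."""
    return [i for i, v in enumerate(values) if math.isnan(v)]


weights = [0.5, -1.0, 2.0]
clean = [1.0, 2.0, 3.0]
dirty = [1.0, float("nan"), 3.0]

print(weighted_sum(clean, weights))
print(math.isnan(weighted_sum(dirty, weights)))
print(nan_positions(dirty))
```

Running a check like `nan_positions` on the inputs and targets first rules this cause out before looking at the learning rate or the model.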

desert parcel
#

oh yeah

velvet thorn
#

well

desert parcel
#

I just saw that

#

huh

#

ohh

#

I didn't see that

velvet thorn
#

I mean

random perch
#

git remote add upstream https://github.com/tensorflow/tensorflow.git
@velvet thorn 10/10 it worked ty!

velvet thorn
#

yw

#

I'm not sure how to put this in a non-condescending/offensive way

#

but this is really basic debugging

#

so...yeah...

desert parcel
#

Lol I don't get offended easily

#

so no worries

velvet thorn
#

like to reiterate

#

you really should take a step back and work on more basic things (like coding and mathematics)

#

this isn't even an architecture problem

desert parcel
#

My maths is alright

#

Well the tutorial didn't really cover too much of the math side other than when it is talking about calculating the loss