velvet thorn Feb 19, 2021, 12:21 AM

#

linear regression assumes HOMOSCEDASTICITY of RESIDUALS

young dock Feb 19, 2021, 12:21 AM

#

oh, so lin reg probably isnt the best for a heteroscedastic dataset

velvet thorn Feb 19, 2021, 12:21 AM

#

young dock oh, so lin reg probably isnt the best for a heteroscedastic dataset

of residuals

young dock Feb 19, 2021, 12:21 AM

#

uh

velvet thorn Feb 19, 2021, 12:21 AM

#

errors

young dock Feb 19, 2021, 12:22 AM

#

sorry im new to all this

velvet thorn Feb 19, 2021, 12:22 AM

#

y_hat - y

#

y_hat is predicted y

#

anyway

#

just try it out and see how it gors

#

goes

#

linear regression is p robust

#

to assumption violation
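
A minimal sketch of what "homoscedasticity of residuals" means in practice, assuming synthetic data whose noise is made to grow with x on purpose. The data, the variable names and the half-split check are illustrative assumptions, not anything posted in the channel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data whose noise grows with x (heteroscedastic by construction).
x = np.linspace(1, 10, 200)
y = 2.0 * x + rng.normal(scale=0.5 * x)

# Ordinary least-squares line fit.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

# Residuals, using the same sign convention as the chat (y_hat - y).
residuals = y_hat - y

# Crude check: compare the residual spread for small x versus large x.
half = len(x) // 2
spread_low = residuals[:half].std()
spread_high = residuals[half:].std()
print(spread_low, spread_high)
```

If the two spreads differ a lot, the residuals are not homoscedastic. The fitted slope can still be reasonable, which is the sense in which the fit is fairly robust to this violation.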

young dock Feb 19, 2021, 12:22 AM

#

I see

#

i got the score of the model, which was 0.36

#

i think thats the r^2

#

is that good?

velvet thorn Feb 19, 2021, 12:24 AM

#

uh

#

depends on your use case, but generally no

#

not horrible

#

but not great

young dock Feb 19, 2021, 12:24 AM

#

ok

#

basically it means that the variance in the x variable explains 36% of variance in the y variable

#

is that correct?
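
A rough sketch of where an R^2 like 0.36 comes from, assuming a single feature and a least-squares line with an intercept. The synthetic data below is an assumption, built so that the population R^2 is 0.5625 / 1.5625 = 0.36.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 0.75 * x + rng.normal(size=500)

slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

# R^2 = 1 - (unexplained variation) / (total variation in y).
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# For a one-feature least-squares fit with an intercept,
# R^2 equals the squared Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]
print(r2, r ** 2)
```

So it is the fitted model, rather than the variance of x by itself, that accounts for that share of the variance in y.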

feral shard Feb 19, 2021, 12:30 AM

#

This is off topic, but I can't help but think of the meme from chernobyl

#

3.6 roentgen, not great, not terrible

#

lol

young dock Feb 19, 2021, 12:30 AM

#

lol

feral shard Feb 19, 2021, 12:30 AM

#

did you watch it?

young dock Feb 19, 2021, 12:31 AM

#

i've only seen the meme around

feral shard Feb 19, 2021, 12:31 AM

#

hilarious that your R^2 was .36

young dock Feb 19, 2021, 12:32 AM

#

yeah same digits lol

feral shard Feb 19, 2021, 12:32 AM

#

anyway, i would actually say that .36 is more on the terrible side

#

you want like 0.7 or higher

young dock Feb 19, 2021, 12:33 AM

#

fair

#

i guess i should try other types of reg in that case?

feral shard Feb 19, 2021, 12:36 AM

#

yeah you could try that

#

there sure is a lot of variance though

young dock Feb 19, 2021, 12:37 AM

#

yeah

velvet thorn Feb 19, 2021, 12:55 AM

#

feral shard anyway, i would actually say that .36 is more on the terrible side

I would say it depends on the problem

#

because some are just harder

iron basalt Feb 19, 2021, 1:33 AM

#

Hello, is anyone here very familiar with numpy?

velvet thorn Feb 19, 2021, 1:34 AM

#

iron basalt Hello, is anyone here very familiar with numpy?

just ask.

iron basalt Feb 19, 2021, 1:34 AM

#

I am currently wondering about this numpy code and i'm pretty confused about the resulting shape.

#

>> a = np.ones((784,))
>> b = np.ones((784,1))
>> a.shape
(784,)
>> b.shape
(784, 1)
>> a = a - b
>> a.shape
(784, 784)

velvet thorn Feb 19, 2021, 1:36 AM

#

iron basalt I am currently wondering about this numpy code and i'm pretty confused about the...

this is an example of broadcasting.

#

simplest example

#

!e

import numpy as np

a = np.array([1, 2, 3, 4, 5])
print(a - 1)

arctic wedgeBOT Feb 19, 2021, 1:36 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

[0 1 2 3 4]

velvet thorn Feb 19, 2021, 1:37 AM

#

so, a is an array of shape (5,), but 1 is a scalar

#

how can you subtract a scalar from an array? you broadcast it - duplicating it across axes

#

now, scale that concept up.

#

!e

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

b = np.array([5, 10, 15])

print(a - b)

arctic wedgeBOT Feb 19, 2021, 1:37 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

[[ -4  -8 -12]
 [ -1  -5  -9]]

iron basalt Feb 19, 2021, 1:37 AM

#

yes, but with the code above I was expecting element-wise subtraction, with (784,) being treated the same as (784,1).

velvet thorn Feb 19, 2021, 1:38 AM

#

iron basalt yes, but with the code above I expecting element-wise subtraction, with (784,) b...

nope

#

a singleton dimension isn't the same as no dimension.

#

although I must say that is a bit of an edge case

iron basalt Feb 19, 2021, 1:40 AM

#

so it does element-wise subtraction but per axis? broadcasted up?

#

I think I get it, just not sure how to describe it in text.

velvet thorn Feb 19, 2021, 1:40 AM

#

iron basalt so it does element-wise subtraction but per axis? broadcasted up?

yes

#

I think what you're imagining is for

#

!e

import numpy as np

a = np.array([1, 2, 3])
b = np.array([[4, 5, 6]])

print(a.shape)
print(b.shape)

print((a - b).shape)
print(a - b)

arctic wedgeBOT Feb 19, 2021, 1:41 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

(3,)
(1, 3)
(1, 3)
[[-3 -3 -3]]

velvet thorn Feb 19, 2021, 1:41 AM

#

is this what you would expect? @iron basalt

iron basalt Feb 19, 2021, 1:42 AM

#

yes

velvet thorn Feb 19, 2021, 1:42 AM

#

note the shapes

iron basalt Feb 19, 2021, 1:42 AM

#

leading dimension is 1 this time

velvet thorn Feb 19, 2021, 1:42 AM

#

yes

#

in your original case

#

the 784 and 1 axes are matched

#

leading to a 2nd axis of size 784

iron basalt Feb 19, 2021, 1:47 AM

#

so in my case it took the last axis from a and matched with the last axis of b, because a only has one axis?

#

It matches from "right" to "left"?

#

In your last example 3 matches with 3?

serene scaffold Feb 19, 2021, 1:55 AM

#

These days if I hear that a product uses "deep learning and AI" I assume that they either used some off-the-shelf AI solution for something that didn't need it, or the AI that they're using isn't very effective. But maybe that's because I see how much AI doesn't work before it does.

#

Is this something a lot of people start to feel after they've been working with AI for a while?

velvet thorn Feb 19, 2021, 1:58 AM

#

iron basalt It matches from "right" to "left"?

yes
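
The right-to-left matching can be sketched on the original (784,) and (784, 1) arrays. The two workarounds at the end are suggestions for getting the element-wise result that was expected; they were not posted in the channel.

```python
import numpy as np

a = np.ones((784,))
b = np.ones((784, 1))

# Shapes are aligned from the right; a missing leading axis counts as size 1:
#   a:      (784,) -> (  1, 784)
#   b:                (784,   1)
#   result:           (784, 784)
print(np.broadcast(a, b).shape)

# Two ways to get an element-wise subtraction instead:
flat = a - b.ravel()             # drop b's singleton axis -> (784,)
column = a[:, np.newaxis] - b    # give a a trailing axis  -> (784, 1)
print(flat.shape, column.shape)
```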

iron basalt Feb 19, 2021, 2:00 AM

#

@velvet thorn thank you, numpy's broadcasting was something I never really fully learned.

misty flint Feb 19, 2021, 2:00 AM

#

serene scaffold These days if I hear that a product uses "deep learning and AI" I assume that th...

i feel the same tbh DoggoKek

iron basalt Feb 19, 2021, 2:00 AM

#

@serene scaffold Yes

magic summit Feb 19, 2021, 2:03 AM

#

sorry for the crappy paint drawing

#

but

#

how would i graph something like this with matplotlib

storm lintel Feb 19, 2021, 2:04 AM

#

anyone good with webscraping here?

misty flint Feb 19, 2021, 2:05 AM

#

uhh youre probably looking for the subplot function

magic summit Feb 19, 2021, 2:08 AM

#

misty flint uhh youre probably looking for the subplot function

I guess i can just wing it with just swapping the axis for the left graph when plotting

lapis sequoia Feb 19, 2021, 2:08 AM

#

hello

iron basalt Feb 19, 2021, 2:11 AM

#

magic summit I guess i can just wing it with just swapping the axis for the left graph when p...

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(1, 10, num=10)
>>> x
array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
>>> y_1 = x
>>> y_2 = 2*x
>>> y_3 = x**2
>>> plt.subplot(1, 2, 1)
<AxesSubplot:>
>>> plt.plot(x, y_1)
[<matplotlib.lines.Line2D object at 0x7fd3513eb1f0>]
>>> plt.title("Left")
Text(0.5, 1.0, 'Left')
>>> plt.subplot(1, 2, 2)
<AxesSubplot:>
>>> plt.plot(x, y_2)
[<matplotlib.lines.Line2D object at 0x7fd3512939a0>]
>>> plt.plot(x, y_3)
[<matplotlib.lines.Line2D object at 0x7fd351293d00>]
>>> plt.title("Right")
Text(0.5, 1.0, 'Right')
>>> plt.tight_layout(4)
<stdin>:1: MatplotlibDeprecationWarning: Passing the pad parameter of tight_layout() positionally is deprecated since Matplotlib 3.3; the parameter will become keyword-only two minor releases later.
>>> plt.show()

#

Except swap the axes on the left one.

velvet thorn Feb 19, 2021, 2:12 AM

#

iron basalt @velvet thorn thank you, numpy's broadcasting was something I never rea...

yw! there’s a p comprehensive tutorial/explanation in the docs

#

it might help to read through it

lapis sequoia Feb 19, 2021, 2:12 AM

#

i m new to this

#

and i have a question

iron basalt Feb 19, 2021, 2:12 AM

#

lapis sequoia Feb 19, 2021, 2:12 AM

#

if you re willing to help me

#

can Python help me analyze soccer matches and predict the outcome?

magic summit Feb 19, 2021, 2:14 AM

#

iron basalt

ahh i see. what do you mean by swapping axes

iron basalt Feb 19, 2021, 2:23 AM

#

import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(1, 10, num=10)
y_1 = x
y_2 = 2*x
y_3 = x**2

ax = plt.subplot(1, 2, 1)
plt.plot(y_3, x)
ax.invert_xaxis()
plt.title("Left")

plt.subplot(1, 2, 2)
plt.plot(x, y_2)
plt.plot(x, y_3)
plt.title("Right")

plt.tight_layout(pad=4)  # pad passed by keyword; positional use is deprecated since Matplotlib 3.3
plt.show()

#

#

@magic summit

magic summit Feb 19, 2021, 2:32 AM

#

thank you very much

next tree Feb 19, 2021, 2:46 AM

#

could i get some help with mongolite

#

Screen_Shot_2021-02-18_at_6.46.53_PM.png

#

my classmate on piazza said im supposed to sum over purchaseMethod rather than items

#

and that the $sum:1 sums over the rows

#

but my total items in the df is all 0

#

so im doing something wrong with the total_item sum part of the function

misty flint Feb 19, 2021, 3:19 AM

#

..excel..?

#

what is this?

#

confuseddog

#

oh mongolite

#

nice edit kannaSus

next tree Feb 19, 2021, 3:30 AM

#

lolll mango lite

lapis sequoia Feb 19, 2021, 3:59 AM

#

can someone help me get started with data science?

#

i have an idea for a project

storm lintel Feb 19, 2021, 4:01 AM

#

so my docker code is in the wrong time zone

#

how do i change time zone

#

its in utc rn

cerulean spindle Feb 19, 2021, 4:22 AM

#

does anyone know how to lower loss on a tensorflow model? My loss is really high and then goes to nan.

hasty grail Feb 19, 2021, 4:58 AM

#

cerulean spindle does anyone know how to lower loss on a tensorflow model? My loss is really high...

It depends. You'll have to provide more information on what you're doing exactly

storm lintel Feb 19, 2021, 4:58 AM

#

i cant figure out how to change this darn time zone on docker

hasty grail Feb 19, 2021, 4:58 AM

#

Especially the details of your model, your learning rate, and which loss function you're using

cerulean spindle Feb 19, 2021, 4:59 AM

#

I figured it out nvm

hasty grail Feb 19, 2021, 4:59 AM

#

Ok cool

austere swift Feb 19, 2021, 5:47 AM

#

anybody know some good pip packages for gradcam in pytorch?

#

https://github.com/vickyliin/gradcam_plus_plus-pytorch this is the only one i could find

GitHub

vickyliin/gradcam_plus_plus-pytorch

A Simple pytorch implementation of GradCAM and GradCAM++ - vickyliin/gradcam_plus_plus-pytorch

#

but im trying to see if there are other better ones

misty flint Feb 19, 2021, 6:01 AM

#

too much scattering

#

DoggoKek

#

if i expand figsize, i wonder if this will be fixed pithink

#

oh it helped

#

#

too many columns for a scatter matrix; better to do it individually pithink

#

actually im going to see what tableau can do with this

meager shoal Feb 19, 2021, 6:10 AM

#

Trying to load yolov5 weights into pytorch, and gives this error:

#

The code for loading this in is almost exactly copied from the yolov5 detect.py script https://github.com/ultralytics/yolov5/blob/master/detect.py

GitHub

ultralytics/yolov5

YOLOv5 in PyTorch > ONNX > CoreML > TFLite. Contribute to ultralytics/yolov5 development by creating an account on GitHub.

#

!code

arctic wedgeBOT Feb 19, 2021, 6:12 AM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

meager shoal Feb 19, 2021, 6:12 AM

#

!git clone https://github.com/ultralytics/yolov5
%cd yolov5
%pip install -qr requirements.txt
import torch
import cv2
from models.experimental import attempt_load
model = attempt_load('/content/last.pt')
img = cv2.imread('/content/test.jpg')
img = torch.from_numpy(img).to('cuda')
img = img.float()
img /= 255.0
if img.ndimension() == 3:
  img = img.unsqueeze(0)
model(img)

topaz oracle Feb 19, 2021, 6:33 AM

#

i am looking to learn more about data science but I don't know where to begin

#

I remember watching those sentdex videos but otherwise I don't know much

spring mortar Feb 19, 2021, 6:51 AM

#

I’ll give it a shot in Linux in a second where I can easily check for folder permissions. I still don’t get permissions in windows after using that OS for all my life. Thanks for the heads up!

misty flint Feb 19, 2021, 7:00 AM

#

@topaz oracle https://www.youtube.com/watch?v=3Mm1U1CbzNw&list=PL2zq7klxX5ATMsmyRazei7ZXkP1GHt-vs&ab_channel=KenJee

YouTube

Ken Jee

Is Data Science Right For You?

In this video I help you to answer if data science is a good fit for you. I provide 5 questions that you should ask yourself that will assess your fit for the field.

#DataScience #DataScienceJobs #DataScienceCareers

Questions to Ask Yourself:

Am I prepared to seriously commit to learning? Data Science has a steep learning curve. You also ha...

▶ Play video

topaz oracle Feb 19, 2021, 7:00 AM

#

thanks

misty flint Feb 19, 2021, 7:00 AM

#

my fave DS YTber so far

#

for sklearn's OrdinalEncoder function, is there a way to reverse the 1's and 0's? to make abnormal coded as 1 and normal coded as 0?

astral path Feb 19, 2021, 7:05 AM

#

is there a way to do a multiple regression of every column in a subset of a dataframe as a function of all the other columns? e.g. if I have a dataframe with columns "age", "pclass", "sex", "embarked", "fare", "sibsp", "parch", I would want to perform multiple regression of age as a function of "pclass", "sex", "embarked", "fare", "sibsp", "parch", pclass as a function of "age", "sex", "embarked", "fare", "sibsp", "parch", and so on...
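
One possible way to do this is a loop that takes each column in turn as the target. The sketch below assumes every column is already numeric (categorical ones such as "sex" or "embarked" would need encoding first) and uses made-up data. The function name `regress_each_on_rest` is hypothetical.

```python
import numpy as np

def regress_each_on_rest(data, names):
    """Fit one least-squares model per column, using all other columns as predictors."""
    results = {}
    for j, name in enumerate(names):
        y = data[:, j]
        X = np.delete(data, j, axis=1)
        X = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        results[name] = coef                       # [intercept, coefficients of the other columns]
    return results

rng = np.random.default_rng(0)
names = ["age", "fare", "sibsp"]
data = rng.normal(size=(50, 3))
# Make "fare" an almost exact linear function of "age" so the fit is easy to read.
data[:, 1] = 2.0 * data[:, 0] + 1.0 + 0.01 * rng.normal(size=50)

for name, coef in regress_each_on_rest(data, names).items():
    print(name, coef.round(3))
```

Each entry holds the intercept followed by the coefficients of the remaining columns, in their original order.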

twin moth Feb 19, 2021, 7:49 AM

#

Could anyone here help us choose a good ML algorithm for our scenario?

hasty grail Feb 19, 2021, 7:53 AM

#

About your scenario...

#

@twin moth

twin moth Feb 19, 2021, 8:13 AM

#

hasty grail About your scenario...

Heya!
We're trying to predict the weather/specific statistics using heat maps we took from NASA's website

#

An example of our dataframe

Y,X,Year,Month,Land_Surface_Temperature Color Index,Land_Surface_Temperature Is Valuable,Vegetation Color Index,Vegetation Is Valuable
28,160,2000,3,-1,False,15.0,True
28,161,2000,3,-1,False,10.0,True
28,162,2000,3,-1,False,10.0,True
28,176,2000,3,-1,False,19.0,True
28,177,2000,3,-1,False,11.0,True
28,178,2000,3,-1,False,15.0,True
28,179,2000,3,-1,False,14.0,True
28,180,2000,3,5,True,16.0,True
28,181,2000,3,-1,False,14.0,True
28,182,2000,3,0,True,19.0,True
28,183,2000,3,0,True,12.0,True
28,184,2000,3,2,True,14.0,True
28,185,2000,3,0,True,11.0,True
28,186,2000,3,-1,False,18.0,True
28,187,2000,3,0,True,15.0,True
28,188,2000,3,-1,False,17.0,True
28,189,2000,3,-1,False,17.0,True
28,190,2000,3,-1,False,15.0,True
28,191,2000,3,-1,False,18.0,True
28,192,2000,3,-1,False,17.0,True
28,193,2000,3,-1,False,21.0,True
28,194,2000,3,-1,False,25.0,True
28,195,2000,3,-1,False,29.0,True
28,196,2000,3,-1,False,35.0,True
28,197,2000,3,-1,False,29.0,True

#

Basically it contains a row for each entry * month * year (~Feb 2000-Dec2020)

hasty grail Feb 19, 2021, 8:15 AM

#

Hmm

twin moth Feb 19, 2021, 8:15 AM

#

We tried running a couple of ML algorithms on the data, mostly linear models and we got really bad accuracy

#

0.35 was the max

hasty grail Feb 19, 2021, 8:16 AM

#

I'm thinking of graph-based methods

twin moth Feb 19, 2021, 8:16 AM

#

We even got a negative value once

hasty grail Feb 19, 2021, 8:16 AM

#

I don't think negative accuracy is possible xD

twin moth Feb 19, 2021, 8:16 AM

#

We didn't either

#

But here it is

#

And yes, the print statement is okay, we printed other algorithms as well and they returned normal values

hasty grail Feb 19, 2021, 8:19 AM

#

Weird lol

#

Maybe LassoLars is not implemented correctly
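
Assuming the value came from a scikit-learn regressor's `score` method (the printout itself is not shown here), the number is R^2 rather than accuracy, and R^2 can go below zero whenever a model does worse than always predicting the mean. A small sketch with a hand-written `r_squared` helper (hypothetical name):

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])

print(r_squared(y_true, y_true))                      # 1.0  (perfect predictions)
print(r_squared(y_true, np.full(4, y_true.mean())))   # 0.0  (always predict the mean)
print(r_squared(y_true, y_true[::-1]))                # -3.0 (worse than the mean)
```

A negative score would then point to a poor fit on the evaluation data, not necessarily to a broken implementation.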

iron basalt Feb 19, 2021, 8:23 AM

#

So is each pixel you referred to here one of those entries?

twin moth Feb 19, 2021, 8:28 AM

#

hasty grail Maybe LassoLars is not implemented correctly

Might be, I could try to send you the code if you'd like

hasty grail Feb 19, 2021, 8:29 AM

#

Is it a custom implementation?

twin moth Feb 19, 2021, 8:29 AM

#

iron basalt So is each pixel you referred to here one of those entries?

Each of those rows is a single pixel in a single month, we also cleaned values which were colorless throughout all of the maps used

twin moth Feb 19, 2021, 8:30 AM

#

hasty grail Is it a custom implementation?

Nope, but maybe we didn't call it correctly

hasty grail Feb 19, 2021, 8:30 AM

#

Hmm I don't use sklearn that much so idk whether I can help

twin moth Feb 19, 2021, 8:32 AM

#

😦

#

So how would you approach it?

hasty grail Feb 19, 2021, 8:32 AM

#

Regardless of whether your loss function is correct

#

https://discordapp.com/channels/267624335836053506/366673247892275221/812236042265427978

twin moth Feb 19, 2021, 8:33 AM

#

We'll try to research it, I personally never heard of it

#

Would we need to change the data structure?

hasty grail Feb 19, 2021, 8:34 AM

#

Yeah you probably need an adjacency matrix

#

to represent your stations as nodes in the graph

twin moth Feb 19, 2021, 8:34 AM

#

We have about 36 hours to turn it in

hasty grail Feb 19, 2021, 8:34 AM

#

wait a second

twin moth Feb 19, 2021, 8:34 AM

#

😅

hasty grail Feb 19, 2021, 8:35 AM

#

nvm I looked at your data again and it seems that your heat map has a value for each point

#

(i.e. it is a dense 2d map)

#

maybe you want to use Conv-LSTM then

twin moth Feb 19, 2021, 8:36 AM

#

BTW, we took each and every pixel from a map like that
https://earthobservatory.nasa.gov/global-maps/MOD_NDVI_M

We calculate the scale index for each of the pixels using the given scale

Vegetation

climate change, global climate change, global warming, natural hazards, Earth, environment, remote sensing, atmosphere, land processes, oceans, volcanoes, land cover, Earth science data, NASA, environmental processes, Blue Marble, global maps

hasty grail Feb 19, 2021, 8:37 AM

#

turn your data into an "image" (according to the x/y values) and store the metadata as channels

twin moth Feb 19, 2021, 8:37 AM

#

And we only leave the colored pixels, so we have about ~12MM lines

hasty grail Feb 19, 2021, 8:38 AM

#

only 36 hours though...

#

Idk whether you can train your model in time

#

oof

twin moth Feb 19, 2021, 8:38 AM

#

hasty grail only 36 hours though...

Yup, we're short, real short

hasty grail Feb 19, 2021, 8:39 AM

#

This sort of problem pretty much requires deep learning

twin moth Feb 19, 2021, 8:39 AM

#

hasty grail turn your data into an "image" (according to the x/y values) and store the metad...

But each of those pixels contain data for multiple map types

hasty grail Feb 19, 2021, 8:39 AM

#

It's harder than image classification already xD

twin moth Feb 19, 2021, 8:39 AM

#

hasty grail This sort of problem pretty much requires deep learning

We didn't learn it so we can't really use it

hasty grail Feb 19, 2021, 8:40 AM

#

Since you've also got the time dimension to worry about

twin moth Feb 19, 2021, 8:40 AM

#

twin moth We didn't learn it so we can't really use it

I mean I guess we can but we never tried it so it'll take more time and it might be an issue

nova widget Feb 19, 2021, 8:40 AM

#

Is it prediction per coordinate or per time?

hasty grail Feb 19, 2021, 8:40 AM

#

I think they need both

twin moth Feb 19, 2021, 8:41 AM

#

hasty grail I think they need both

Yup

hasty grail Feb 19, 2021, 8:41 AM

#

Idk how you're even supposed to do this using traditional ML methods

twin moth Feb 19, 2021, 8:41 AM

#

And we have multiple maps, so either take a single one at a time or take them all in favor of a more successful prediction

twin moth Feb 19, 2021, 8:41 AM

#

hasty grail Idk how you're even supposed to do this using traditional ML methods

We weren't given that project, we came up with it 😅

hasty grail Feb 19, 2021, 8:42 AM

#

twin moth But each of those pixels contain data for multiple map types

That's what the channel dimension of the "image" is for

#

e.g. in RGB images you have R, G and B maps

#

just extend this to your scenario
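
A minimal sketch of turning the long per-pixel rows into "images" with channels, assuming the coordinates and month index are small non-negative integers. The toy rows and the two-channel layout are assumptions for illustration, not the real data.

```python
import numpy as np

# Hypothetical long-format rows: (y, x, month_index, temperature_index, vegetation_index).
rows = np.array([
    [0, 0, 0, -1, 15.0],
    [0, 1, 0,  5, 10.0],
    [1, 0, 0,  0, 19.0],
    [1, 1, 0,  2, 11.0],
    [0, 0, 1,  0, 14.0],
    [1, 1, 1,  3, 12.0],
])

ys = rows[:, 0].astype(int)
xs = rows[:, 1].astype(int)
ts = rows[:, 2].astype(int)

# One "image" per time step, one channel per map type; missing pixels stay NaN.
grid = np.full((ts.max() + 1, ys.max() + 1, xs.max() + 1, 2), np.nan)
grid[ts, ys, xs] = rows[:, 3:]

print(grid.shape)  # (time, height, width, channels)
```

With six map types the last axis would have size 6 instead of 2, in the same way an RGB image has size 3.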

twin moth Feb 19, 2021, 8:42 AM

#

But we have about 6 maps

#

How would you use it?

nova widget Feb 19, 2021, 8:43 AM

#

Is there a time series?

twin moth Feb 19, 2021, 8:44 AM

#

BTW, we couldn't use MLP since it took way too much RAM, got any idea if DL algorithms will be more lax on it?

hasty grail Feb 19, 2021, 8:44 AM

#

As I mentioned above, I would go for training a Conv-LSTM model on your data

twin moth Feb 19, 2021, 8:44 AM

#

nova widget Is there a time series?

Kinda

hasty grail Feb 19, 2021, 8:44 AM

#

Convolution is more memory efficient than MLP (Dense)
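
A back-of-the-envelope sketch of why that is, counting trainable parameters only (activation memory is a separate cost). The map size, channel count and layer widths are hypothetical.

```python
def dense_params(height, width, channels, units):
    """Weights plus biases for a fully connected layer on a flattened image."""
    return height * width * channels * units + units

def conv2d_params(kernel, in_channels, out_channels):
    """Weights plus biases for a square 2-D convolution; independent of image size."""
    return kernel * kernel * in_channels * out_channels + out_channels

# Hypothetical sizes: a 180 x 360 map with 6 channels (one per map type).
print(dense_params(180, 360, 6, units=64))               # 24883264
print(conv2d_params(3, in_channels=6, out_channels=64))  # 3520
```

The dense layer grows with the image size, while the convolution reuses the same small kernel at every position.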

twin moth Feb 19, 2021, 8:44 AM

#

hasty grail Convolution is more memory efficient than MLP (Dense)

Amazing

hasty grail Feb 19, 2021, 8:44 AM

#

It should be ok as long as you don't use too many timesteps / maps that are too large

iron basalt Feb 19, 2021, 8:45 AM

#

My solution would be to use a generative predictive model.

hasty grail Feb 19, 2021, 8:45 AM

#

I am not confident that you can get a decent model from all this in 36 hours though

nova widget Feb 19, 2021, 8:45 AM

#

Just make it micro first

twin moth Feb 19, 2021, 8:45 AM

#

hasty grail It should be ok as long as you don't use too many timesteps / maps that are too ...

Our data is about 1GB, as a CSV

iron basalt Feb 19, 2021, 8:45 AM

#

It works for both the time aspect and learns all the correlations.

hasty grail Feb 19, 2021, 8:46 AM

#

twin moth Our data is about 1GB, as a CSV

I am referring to the size of your batch

twin moth Feb 19, 2021, 8:46 AM

#

hasty grail I am not confident that you can get a decent model from all this in 36 hours tho...

All good, we're trying our hardest, worst case we won't get it high enough, otherwise we might be able to get a time extension or maybe even just a lucky streak

hasty grail Feb 19, 2021, 8:46 AM

#

Should have researched the problem beforehand xD

#

but anyways

twin moth Feb 19, 2021, 8:47 AM

#

hasty grail I am referring to the size of your batch

Isn't the size of the CSV relevant though?

hasty grail Feb 19, 2021, 8:47 AM

#

Not really - if you use lazy loading to feed your data into the model, you don't need to fit the entire thing in memory
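
Lazy loading can be sketched with the standard library alone: read the CSV a batch at a time instead of all at once. The helper name `iter_batches` and the in-memory sample are assumptions; with a real file, the open file handle would be passed in instead.

```python
import csv
import io
from itertools import islice

def iter_batches(lines, batch_size):
    """Yield lists of parsed rows without reading the whole file into memory."""
    reader = csv.DictReader(lines)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch

# A small in-memory stand-in for the real 1 GB CSV.
sample = io.StringIO(
    "Y,X,Year,Month,Vegetation\n"
    "28,160,2000,3,15.0\n"
    "28,161,2000,3,10.0\n"
    "28,162,2000,3,10.0\n"
    "28,176,2000,3,19.0\n"
    "28,177,2000,3,11.0\n"
)

for batch in iter_batches(sample, batch_size=2):
    print(len(batch), batch[0]["X"])
```

Only one batch is held in memory at a time, so the batch size, not the file size, sets the memory footprint of the input pipeline.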

misty flint Feb 19, 2021, 8:48 AM

#

buy some cloud GPUs for the model

#

Sip

hasty grail Feb 19, 2021, 8:48 AM

#

that's kinda cheating

misty flint Feb 19, 2021, 8:48 AM

#

ID_BoomKek

hasty grail Feb 19, 2021, 8:48 AM

#

I assume this is for a course project

twin moth Feb 19, 2021, 8:48 AM

#

hasty grail I assume this is for a course project

True

misty flint Feb 19, 2021, 8:49 AM

#

is it cheating? what if you tell your prof pithink

iron basalt Feb 19, 2021, 8:49 AM

#

If you need speed, then your best bet is dimensionality reduction via sparse methods.

hasty grail Feb 19, 2021, 8:49 AM

#

However if you want to keep within the bounds of your course, I think it is worthwhile to show that traditional ML methods are unable to solve such a difficult problem 😛

misty flint Feb 19, 2021, 8:51 AM

#

gl tho. 36 hours amegablobsweats

hasty grail Feb 19, 2021, 8:51 AM

#

(Well not really, since the images are so small that you could still fit an MLP in memory to basically brute force it)

misty flint Feb 19, 2021, 8:51 AM

#

assuming you have to turn in a report/presentation too? NervousSip

iron basalt Feb 19, 2021, 8:52 AM

#

MLPs are fine for MNIST, they get like 97-98%, which is pretty much the max since MNIST has mislabelled data.

twin moth Feb 19, 2021, 8:52 AM

#

misty flint is it cheating? what if you tell your prof pithink

What is? Using a cloud server for fast computing? Why would it be?

twin moth Feb 19, 2021, 8:52 AM

#

misty flint assuming you have to turn in a report/presentation too? NervousSip

True

#

And we did some complex calculations for the data so that might get us some points haha

misty flint Feb 19, 2021, 8:53 AM

#

im already stressed on your behalf amegablobsweats

hasty grail Feb 19, 2021, 8:53 AM

#

twin moth What is? Using a cloud server for fast computing? Why would it be?

Because you get way more resources than the other groups

#

to solve your problem

iron basalt Feb 19, 2021, 8:53 AM

#

I would just fail with grace and say why it's not really do-able, so you gain something out of it and they do too.

twin moth Feb 19, 2021, 8:54 AM

#

hasty grail to solve your problem

Yeah, but the issue is the code writing and the presentation

#

No one cares how you run it

iron basalt Feb 19, 2021, 8:54 AM

#

ML is not this all mighty can do everything thing, no matter how much people may hype it up to be.

#

Very much WIP.

hasty grail Feb 19, 2021, 8:55 AM

#

twin moth No one cares how you run it

Well, if you're only using traditional ML, sure

misty flint Feb 19, 2021, 8:55 AM

#

if this was for my AI class, my prof would be okay with it but thats bc she gave everyone cloud credits to use

#

Sip

hasty grail Feb 19, 2021, 8:55 AM

#

But in DL, models can take hours or even days to train

#

And then there's hyperparameter tuning, so you have to repeat the training process many times

#

so differences in resources could matter a lot

iron basalt Feb 19, 2021, 8:57 AM

#

(Unless you use very modern ML which can run on the CPU due to exploitation of sparse operations, but that is some bleeding edge / very not common place, and needs much more research)

#

(non-differentiable models are very hard to grasp since all commonly used techniques are out the window)

#

(no backprop)

hasty grail Feb 19, 2021, 8:59 AM

#

But yeah, better to focus on the process than the results @twin moth

misty flint Feb 19, 2021, 9:00 AM

#

gl dude

#

cattohug

twin moth Feb 19, 2021, 9:07 AM

#

hasty grail But yeah, better to focus on the process than the results @twin moth

So convolution it is?

twin moth Feb 19, 2021, 9:07 AM

#

misty flint gl dude

Thanks mate 🙂

hasty grail Feb 19, 2021, 9:10 AM

#

Stick to what was taught in the course

#

You don't have the time for DL methods

#

Especially since you have not dealt with DL before

twin moth Feb 19, 2021, 9:12 AM

#

We weren't taught conv

#

But I don't think that anyone would care if we used it

hasty grail Feb 19, 2021, 9:14 AM

#

You don't have the time for DL methods

twin moth Feb 19, 2021, 9:14 AM

#

Oh, Conv is a DL method?

#

So just try to do ML, stick with the highest percentage and show that ML is not an option for such a scenario?

hasty grail Feb 19, 2021, 9:15 AM

#

in its formulation, not necessarily

#

but working models tend to be deep

hasty grail Feb 19, 2021, 9:15 AM

#

twin moth So just try to do ML, stick with the highest percentage and show that ML is not ...

Yeah

#

Also DL is a form of ML, to be precise, so you should refer to them as "traditional ML methods" 😛

twin moth Feb 19, 2021, 9:16 AM

#

lol, true

#

Got any traditional ML methods you'd recommend? 😛

iron basalt Feb 19, 2021, 9:19 AM

#

IMO it's more like "common traditional ML methods"

#

And not the improved versions, some of them still have new variants popping up each year.

#

A "traditional" ML method (just based on time period it was invented) that could actually handle something like this problem would be Adaptive Resonance Theory methods. But very few people know of it.

#

(And it has many modern variants that drastically improve on the original models)

twin moth Feb 19, 2021, 9:22 AM

#

iron basalt A "traditional" ML method (just based on time period it was invented) that could...

How difficult would it be to implement it?

iron basalt Feb 19, 2021, 9:23 AM

#

The implementation is actually trivial which makes it very elegant, but it would take some reading.

#

(There are some python implementations on github I think)

#

(with numpy)

#

You don't have time for that either though, just stick to the course knowledge.

twin moth Feb 19, 2021, 9:26 AM

#

I'd love some names if you know them off the top of your head

twin moth Feb 19, 2021, 9:27 AM

#

iron basalt You don't have time for that either though, just stick to the course knowledge.

How come?

iron basalt Feb 19, 2021, 9:30 AM

#

ART could have its own entire course, and many more for its variants. It builds on a lot of ideas that are much more neuro-science-ish (biologically plausible), which would take you down the rabbit hole of spiking neural models, and much more.

hasty grail Feb 19, 2021, 9:30 AM

#

If you're going to do a presentation using out-of-class materials, you're probably going to be asked on them in Q&A

#

Better to stick to what you actually understand

iron basalt Feb 19, 2021, 9:31 AM

#

There is an entire other community within ML that does biologically plausible models that are very much like real neural networks.

hasty grail Feb 19, 2021, 9:32 AM

#

(Personally I jumped right into DL so won't be of much assistance in this situation)

twin moth Feb 19, 2021, 9:33 AM

#

😩

iron basalt Feb 19, 2021, 9:34 AM

#

You would need to understand DL too though, since DL is based on an idealization of old neuroscience from which you then can learn why the new neuroscience makes more sense and what you can do with it (how to improve upon DL).

twin moth Feb 19, 2021, 9:34 AM

#

If I send you the list of all of the methods we were taught, would you be able to tell me what should be most fitting for our situation?

iron basalt Feb 19, 2021, 9:34 AM

#

sure

twin moth Feb 19, 2021, 9:35 AM

#

iron basalt You would need to understand DL too though, since DL is based on an idealization...

We'd be taking an ML course next year

hasty grail Feb 19, 2021, 9:35 AM

#

you mean DL?

twin moth Feb 19, 2021, 9:36 AM

#

Nope, the current course is an introduction to DS

#

The next will be ML

hasty grail Feb 19, 2021, 9:37 AM

#

huh

twin moth Feb 19, 2021, 9:37 AM

#

So I guess that we'll learn ML in depth and maybe even DL

twin moth Feb 19, 2021, 9:37 AM

#

hasty grail huh

Yeah, we took it kinda far lol

iron basalt Feb 19, 2021, 9:40 AM

#

TBH this task seems way outside the scope of your course. I have been told by others similar stories in which they get an ML task that is outside the scope of the course.

#

(unless the entire point is to show that the methods are insufficient)

hasty grail Feb 19, 2021, 9:41 AM

#

Yeah, I mentioned that earlier

twin moth Feb 19, 2021, 9:41 AM

#

Again, we came up with this task

iron basalt Feb 19, 2021, 9:42 AM

#

Showing that the methods you learned do not work and why should be fine then. If grading were up to me I would give you full credit if you can give me all the reasons why and also show the best results you got.

twin moth Feb 19, 2021, 9:43 AM

#

We were only told to think of a research question and try to answer it using DS

torpid pilot Feb 19, 2021, 11:53 AM

#

anyone?

#

#data-science-and-ml

hasty grail Feb 19, 2021, 12:04 PM

#

Don't ask to ask

#

Just ask your question, if anyone can help you they will answer

rotund dock Feb 19, 2021, 12:34 PM

#

Hi guys! I have this data frame, Im trying to group by season and year how can I do it?

#

df.groupby('[Season','Date])['25900MS'].mean()

#

thats not working

#

got it.... P25900MS.groupby([P25900MS.iloc[:,0].dt.year, P25900MS.iloc[:,2]]).mean().reset_index()
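
The same grouping can also be written with column names, which may be easier to read than positional `iloc`. The frame below is a hypothetical stand-in and the column names "Date", "Season" and "25900MS" are assumed from the first attempt.

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question; column names are assumed.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2019-01-15", "2019-02-10", "2019-07-04", "2020-01-20", "2020-07-30"]
    ),
    "Season": ["Winter", "Winter", "Summer", "Winter", "Summer"],
    "25900MS": [1.0, 3.0, 10.0, 5.0, 20.0],
})

# groupby takes a list of keys; each key can be a column name
# or a Series aligned to the frame's index (here, the year of each date).
year = df["Date"].dt.year.rename("Year")
seasonal_mean = df.groupby([year, "Season"])["25900MS"].mean().reset_index()
print(seasonal_mean)
```

The first attempt failed because the brackets sat inside the quotes, so the keys were never passed as a list.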

keen root Feb 19, 2021, 1:00 PM

#

Hi, I want to perform a multiclass classification. I have a very large dataset, and the number of inputs on the machine can easily extend beyond the 1000 inputs. So far I've used scikit's learn API, with the RidgeClassifier, but if I'm not mistaken this method relies on doing a lot of linear algebra, and if the number of inputs can get quite large I presume that the training time will scale up quite a bit. So I was thinking of maybe implementing a NN, maybe just a simple Perceptron, would that be better? Are there any advantages?

delicate yarrow Feb 19, 2021, 1:13 PM

#

#

help

#

GOT IT! has to be .csv (I'm a newb sorry)

bold olive Feb 19, 2021, 1:32 PM

#

How can I run out of memory with 3D CNN (TensorFlow Keras), even with a batch size of 1, when each of my images is only ~14mb in size?

#

This is both for CPU and GPU.

#

Model too big a possibility or something else?

thin kindle Feb 19, 2021, 1:33 PM

#

Hello guys, I have code written with tensorflow 1.14, and I need to migrate to 2.0, but I don't know the equivalent of tf.contrib in 2.0. Can someone help me?

hasty grail Feb 19, 2021, 1:41 PM

#

bold olive How can I run out of memory with 3D CNN (TensorFlow Keras), even with a batch si...

What does your model look like?

hasty grail Feb 19, 2021, 1:41 PM

#

thin kindle Hello guys, I have a code written with tensorflow 1.14, and I need to migrate to...

Could you elaborate?

thin kindle Feb 19, 2021, 1:44 PM

#

hasty grail Could you elaborate?

in my code I used tf.contrib, but after upgrading to tensorflow 2.0, running my code says there is no module named tf.contrib

#

@hasty grail do you know the equivalent of tf.contrib in tensorflow 2.0?

hasty grail Feb 19, 2021, 1:45 PM

#

thin kindle @hasty grail do you know the equivalent of tf.contrib in tensorflow ...

https://www.tensorflow.org/guide/upgrade#compatibility_modules

#

If you can't find the function in tfa (TensorFlow Addons) then I'm afraid you're out of luck

#

Take a look at the source code of the original function and see if you can implement it yourself

keen kestrel Feb 19, 2021, 1:50 PM

#

Could you share your experience in writing a new custom layer in PyTorch? I usually create a Jupyter notebook and code the layer with dummy input so that I get instant feedback if I mess up the dimensions. Maybe there is a better way?

thin kindle Feb 19, 2021, 2:02 PM

#

hasty grail If you can't find the function in `tfa` (TensorFlow Addons) then I'm afraid you'...

thx

hasty grail Feb 19, 2021, 2:03 PM

#

np

shadow ridge Feb 19, 2021, 2:10 PM

#

Who, where should I report this?
URL: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html

Screen_Shot_2021-02-19_at_7.08.58_AM.png

hasty grail Feb 19, 2021, 2:12 PM

#

no idea

shadow ridge Feb 19, 2021, 2:13 PM

#

The site ahead may contain harmful programs

hasty grail Feb 19, 2021, 2:14 PM

#

replace pandas-docs with just docs

austere swift Feb 19, 2021, 2:16 PM

#

!d pandas.concat

arctic wedgeBOT Feb 19, 2021, 2:16 PM

#

`pandas.concat`

pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)
Concatenate pandas objects along a particular axis with optional set logic along the other axes.

Can also add a layer of hierarchical indexing on the concatenation axis, which may be useful if the labels are the same (or overlapping) on the passed axis number.

Parameters

**objs**: a sequence or mapping of Series or DataFrame objects. If a mapping is passed, the sorted keys will be used as the keys argument, unless it is passed, in which case the values will be selected (see below). Any None objects will be dropped silently unless they are all None in which case a ValueError will be raised.

**axis**: {0/’index’, 1/’columns’}, default 0. The axis to concatenate along.

**join**: {‘inner’, ‘outer’}, default ‘outer’. How to handle indexes on other axis (or axes).... [read more](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html#pandas.concat)

austere swift Feb 19, 2021, 2:16 PM

#

hmm

#

i think pandas site got hijacked actually

#

cus even old links that ive visited have that message

#

shadow ridge Feb 19, 2021, 2:18 PM

#

I just submitted an issue on github

#

@austere swift thanks for the concat doc

austere swift Feb 19, 2021, 2:18 PM

#

np

shadow ridge Feb 19, 2021, 2:22 PM

#

@austere swift @hasty grail
https://github.com/pandas-dev/pandas/issues/39862

GitHub

BUG: Docs website is blocked by google safe browsing · Issue #39862...

I hope it's ok to skip the template as this is not a direct code bug but a website bug. Sorry for the german locales, but I guess the message is pretty clear anyways. The pandas docs are bl...

hasty grail Feb 19, 2021, 2:32 PM

#

Nice

lapis sequoia Feb 19, 2021, 3:08 PM

#

No issue at all with safari

#

same on chrome :/

lapis sequoia Feb 19, 2021, 3:09 PM

#

shadow ridge Who, where should I report this? URL: https://pandas.pydata.org/pandas-docs/s...

are you sure you don't have some weird plugin or something?

viscid dagger Feb 19, 2021, 3:43 PM

#

does anyone know how to change the .jupyter directory location to somewhere else in linux

lapis sequoia Feb 19, 2021, 3:50 PM

#

you mean change the software's directory or change the open folder location? :/

#

@viscid dagger

viscid dagger Feb 19, 2021, 3:50 PM

#

no the config folder

#

if that makes sense

#

@lapis sequoia

lapis sequoia Feb 19, 2021, 3:51 PM

#

ow yeah, you could make a shortcut to it

viscid dagger Feb 19, 2021, 3:51 PM

#

the .jupyter directory in the home folder

lapis sequoia Feb 19, 2021, 3:51 PM

#

yeah i figured

viscid dagger Feb 19, 2021, 3:51 PM

#

no i actually wanna move it to another directory

#

cause my home directory is so cluttered

lapis sequoia Feb 19, 2021, 3:52 PM

#

no i can't really help you with that, maybe someone else could, never had that desire so never faced that issue

devout scroll Feb 19, 2021, 3:55 PM

#

Hey does someone know if it's possible to specify dtypes when writing a pandas dataframe to a feather file? I'm trying this because feather infers the wrong dtype for one of my columns, later resulting in an error. I only found this on stackoverflow, which does not answer the question: https://stackoverflow.com/questions/41439564/is-it-possible-to-specify-column-types-when-saving-a-pandas-dataframe-to-feather

Stack Overflow

is it possible to specify column types when saving a pandas DataFra...

Currently, if a column happens to have only nulls, an exception is thrown with the error:
Invalid: Unable to infer type of object array, were all null
It is possible to specify the type of the ...

viscid dagger Feb 19, 2021, 4:00 PM

#

lapis sequoia no i can't really help you with that, maybe someone else could, never had that d...

lol okay chill

lapis sequoia Feb 19, 2021, 4:02 PM

#

ow didn't mean to sound rude :/

viscid dagger Feb 19, 2021, 4:23 PM

#

lapis sequoia ow didn't mean to sound rude :/

lol no it's chill

#

actually i found it

#

export JUPYTER_CONFIG_DIR="${XDG_CONFIG_HOME:-$HOME/.config}/jupyter"

#

i have to set this env variable that jupyter uses

#

but thanks anyway for trying to help me out

#

@lapis sequoia

lapis sequoia Feb 19, 2021, 4:25 PM

#

yeah sorry, but never faced this issue hence

misty flint Feb 19, 2021, 4:48 PM

#

lapis sequoia are you sure you don't have some weird plugin or something?

i also have this when i tried to look up something on pandas

#

☹️

lapis sequoia Feb 19, 2021, 4:51 PM

#

do you guys have vpns or something? or wtf is wrong with my computer?

shadow ridge Feb 19, 2021, 4:51 PM

#

misty flint i also have this when i tried to look up something on pandas

See link to issue above

lapis sequoia Feb 19, 2021, 4:51 PM

#

i tried on safari, chrome

shadow ridge Feb 19, 2021, 4:52 PM

#

lapis sequoia i tried on safari, chrome

If you go to pandas.pydata.org and make your way to the docs, there is no warning.

misty flint Feb 19, 2021, 4:53 PM

#

shadow ridge See link to issue above

oh missed that. thanks

astral path Feb 19, 2021, 4:53 PM

#

is there a way to do a multiple regression of every column in a subset of a dataframe as a function of all the other columns? e.g. if I have a dataframe with columns "age", "pclass", "sex", "embarked", "fare", "sibsp", "parch", I would want to perform multiple regression of age as a function of "pclass", "sex", "embarked", "fare", "sibsp", "parch". pclass as a function of "age", "sex", "embarked", "fare", "sibsp", "parch", and so on...

lapis sequoia Feb 19, 2021, 4:54 PM

#

are you having issue with https://pandas.pydata.org?

shadow ridge Feb 19, 2021, 4:55 PM

#

lapis sequoia are you having issue with https://pandas.pydata.org?

I don’t

#

https://github.com/pandas-dev/pandas/issues/39862#issuecomment-782114976

GitHub

BUG: Docs website is blocked by google safe browsing · Issue #39862...

I hope it's ok to skip the template as this is not a direct code bug but a website bug. Sorry for the german locales, but I guess the message is pretty clear anyways. The pandas docs are bl...

grave frost Feb 19, 2021, 5:21 PM

#

astral path is there a way to do a multiple regression of every column in a subset of a data...

Could you explain your issues more simply?

astral path Feb 19, 2021, 5:23 PM

#

I have several variables (some that take on values from 1-10000, some that only take on 1 and 0), and want to run multiple regression on each of these variables with values in the columns of a dataframe to find correlations between each variable

#

@grave frost
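
A rough sketch of one way to do this: loop over the columns, treat each one in turn as the target, and fit ordinary least squares on the rest. The column names come from the question; the data is random, `regress_each_on_rest` is a made-up helper name, and categorical columns such as "sex" or "embarked" are assumed to be numerically encoded beforehand (they are left out here).

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the column names come from the question,
# the values are random and "fare" is built to depend on "pclass".
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.uniform(1, 80, n),
    "pclass": rng.integers(1, 4, n).astype(float),
    "sibsp": rng.integers(0, 5, n).astype(float),
})
df["fare"] = 100.0 - 25.0 * df["pclass"] + rng.normal(0, 1, n)


def regress_each_on_rest(frame):
    """Fit ordinary least squares of every column on all the others.

    Returns {target: (coefficients as a Series, R^2)}.
    """
    results = {}
    for target in frame.columns:
        predictors = frame.columns.drop(target)
        # Design matrix with an intercept column appended.
        X = np.column_stack([frame[predictors].to_numpy(), np.ones(len(frame))])
        y = frame[target].to_numpy()
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        residual = y - X @ coef
        r2 = 1.0 - residual.var() / y.var()
        results[target] = (pd.Series(coef, index=[*predictors, "intercept"]), r2)
    return results


fits = regress_each_on_rest(df)
for target, (coef, r2) in fits.items():
    print(f"{target}: R^2 = {r2:.3f}")
```

For targets that only take the values 0 and 1, a logistic regression is usually the more natural model; plain least squares is used here only to keep the sketch short.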

misty flint Feb 19, 2021, 5:29 PM

#

pretty sure pandas.plot has parameters you can insert

#

X and Y

#

might be what youre looking for

astral path Feb 19, 2021, 5:30 PM

#

woah

#

im getting that error too now

misty flint Feb 19, 2021, 5:31 PM

#

oh yeah you have to navigate from home page instead https://pandas.pydata.org/

astral path Feb 19, 2021, 5:31 PM

#

ok 👍

misty flint Feb 19, 2021, 5:31 PM

#

~~so many tabs~~ ID_BoomKek

astral path Feb 19, 2021, 5:31 PM

#

what would plot do in this scenario? isn't it just for plotting?

misty flint Feb 19, 2021, 5:32 PM

#

~~why are half of those sound cloud~~

#

DoggoKek

astral path Feb 19, 2021, 5:32 PM

#

misty flint ~~so many tabs~~ ID_BoomKek

oh no... this is my window with the least amount of tabs

misty flint Feb 19, 2021, 5:32 PM

#

Pika

#

second-hand stressed amegablobsweats

astral path Feb 19, 2021, 5:33 PM

#

lol

young dock Feb 19, 2021, 6:19 PM

#

so I did quantile regression two different ways, I'm confused why they are different

#

#

#

one is a straight line and the other is bumpy (to state the obvious lol)

#

former is gradient boosting regressor from sklearn with loss=quantile, and the latter is quantreg from statsmodels

#

Idk why they are different

charred umbra Feb 19, 2021, 6:36 PM

#

Maybe it's because one of them treats the data as a time series? That would explain why it's wavy as opposed to just a regular linear regression

#

I honestly don't know
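
One likely reason, though the plots themselves are not preserved here: gradient boosting is built from decision trees, so its prediction is a step function of x, whereas statsmodels' quantreg fits a linear model and so draws a straight line. The sketch below shows only the scikit-learn side, on synthetic data, with hyperparameters picked for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic heteroscedastic data: the noise grows with x.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 2.0 * x + rng.normal(0, 0.5 + 0.5 * x)

# Tree-based 0.9-quantile model. 50 trees of depth 2 have at most
# 3 split points each, so the fitted curve has at most 151 flat pieces.
gbr = GradientBoostingRegressor(loss="quantile", alpha=0.9,
                                n_estimators=50, max_depth=2, random_state=0)
gbr.fit(x.reshape(-1, 1), y)

grid = np.linspace(0, 10, 1000).reshape(-1, 1)
pred = gbr.predict(grid)
n_levels = len(np.unique(pred))
print(f"{n_levels} distinct predicted values on a grid of {len(grid)} points")
```

Both models estimate the same conditional quantile; they differ in the shape of function they are allowed to fit.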

misty flint Feb 19, 2021, 6:40 PM

#

pithink

outer fulcrum Feb 19, 2021, 6:52 PM

#

Hey guys, does anyone here know Grafana well?

graceful scaffold Feb 19, 2021, 7:17 PM

#

# Use the kfold cross validation to create two lists: train and holdout, which have the indices of those elements of the X matrix
# that will be used for the training and holdout (validation) at each iteration (fold of the cross validator)

Cvals = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6]
k_fold = KFold(n_splits=5)

results_l2=[]
for C in Cvals:
    # instantiate a logistic regression with L2 penalty and the proper C value for this iteration of the loop
    model = LogisticRegression(penalty='l2',C=C)
    
    # collect the predicted y values and true y values of each hold out set
    predicteds=[]
    trueys=[]
    train=[]       
    holdout=[]    #WTF ARE THESE TWO LISTS FOR?
    for train, holdout in k_fold.split(X):  ##I ONLY HAD TO ADD THIS LINE
        model.fit(X[train],y[train])
        predicteds.append( model.predict(X[holdout]) )
        trueys.append( y[holdout] )

#

Can someone help me with this please?

#

idk if the for loop is OK
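
As far as I can tell the loop is OK, assuming X and y are NumPy arrays: `k_fold.split(X)` yields a pair of index arrays per fold, and the loop variables `train` and `holdout` simply rebind those names, so the two empty lists created before the loop are never used. A self-contained sketch on synthetic data (the data and the shortened C grid are assumptions, not part of the exercise):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

# Synthetic two-class data standing in for the X and y of the exercise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

Cvals = [1e-3, 1e-1, 1e1]  # shortened grid, for illustration only
k_fold = KFold(n_splits=5)

results_l2 = []
for C in Cvals:
    # L2 is the default penalty of LogisticRegression.
    model = LogisticRegression(C=C, max_iter=1000)
    predicteds, trueys = [], []
    # split() yields index arrays; no need to pre-create train/holdout lists.
    for train, holdout in k_fold.split(X):
        model.fit(X[train], y[train])
        predicteds.append(model.predict(X[holdout]))
        trueys.append(y[holdout])
    # Pool the held-out predictions of all folds into one accuracy per C.
    acc = accuracy_score(np.concatenate(trueys), np.concatenate(predicteds))
    results_l2.append((C, acc))

print(results_l2)
```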

lapis sequoia Feb 19, 2021, 7:19 PM

#

Hey anyone has a clue how i can select string index in pivot table?

#

^ pandas

#

i wanna do a project

iron basalt Feb 19, 2021, 7:32 PM

#

keen root Hi, I want to perform a multiclass classification. I have a very large dataset, ...

When you say number of inputs, do you mean the dimensionality of the input, the number of samples, or are you working with a time series?

keen root Feb 19, 2021, 7:40 PM

#

I mean the number of neurons, or the number of parameters of the perceptron, which I believe to be the dimensionality of the input as you put it

iron basalt Feb 19, 2021, 7:44 PM

#

To confirm, the dimensionality of the input in case of image input would be the dimensions of the image multiplied together (e.g. a grayscale image that is 32x32 pixels would be 32 rows * 32 columns * 1 channels = dimensionality of 1024).

#

@keen root

#

Is that what you mean?

#

(Not that you have an image as input, but something like that)
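
The arithmetic above, as a tiny sketch. The batch of 10 blank images is hypothetical; only the 32 x 32 x 1 shape comes from the example.

```python
import numpy as np

# A hypothetical batch of 10 grayscale images, 32 x 32 pixels, 1 channel.
images = np.zeros((10, 32, 32, 1))

# Input dimensionality = rows * columns * channels of a single image.
n_features = int(np.prod(images.shape[1:]))

# Flatten each image into one row, the layout scikit-learn estimators expect.
X = images.reshape(len(images), -1)
print(n_features, X.shape)
```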

stray roost Feb 19, 2021, 7:46 PM

#

Hi yall. I recently got into machine learning and AI. Can yall give me some interesting projects to try to finish without watching any tutorials?

#

I wanna test my skills and see how much can I do alone

iron basalt Feb 19, 2021, 7:47 PM

#

stray roost Hi yall. I recently got into machine learning and AI. Can yall give me some inte...

Classify MNIST. Note it has mislabelled data so don't pull your hair out over not getting 100% accuracy.

stray roost Feb 19, 2021, 7:48 PM

#

iron basalt Classify MNIST. Note it has mislabelled data so don't pull your hair out over ...

is that hand written digits?

iron basalt Feb 19, 2021, 7:48 PM

#

yes

stray roost Feb 19, 2021, 7:48 PM

#

I already watched a tutorial on that one hahah

#

I might try to redo it by myself tho

#

See how good is my memory

iron basalt Feb 19, 2021, 7:49 PM

#

Then try something a bit harder, fashion-MNIST.

stray roost Feb 19, 2021, 7:50 PM

#

I might check that one out.

#

Thank you man

stray roost Feb 19, 2021, 8:08 PM

#

So basically I tried it and it overfits

#

while training, the acc was 90 and after in testing it was 30

#

how do I know what to change in my code

keen root Feb 19, 2021, 8:09 PM

#

iron basalt To confirm, the dimensionality of the input in case of image input would be the ...

Yes, that's it

#

(Sorry for the delay)

iron basalt Feb 19, 2021, 8:10 PM

#

stray roost So basically I tried it and it overfits

Research / learn about regularization.

stray roost Feb 19, 2021, 8:10 PM

#

will do

iron basalt Feb 19, 2021, 8:12 PM

#

@keen root Yeah so 1000 inputs is actually not that bad, try just throwing an MLP at the problem, scikit has one: https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html

#

If that does not work, try using a dimensionality reduction algorithm and then feeding that into an MLP.
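
A sketch of what that second suggestion could look like with scikit-learn. The data is synthetic, PCA is just one possible choice of dimensionality reduction, and the component count and layer size are illustrative guesses rather than tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 1000 input features, 3 classes.
X, y = make_classification(n_samples=600, n_features=1000, n_informative=20,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale, reduce 1000 dimensions to 50, then classify with a small MLP.
clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=50),
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0),
)
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Putting the steps in one pipeline means the scaler and PCA are fitted on the training split only, which avoids leaking information from the held-out data.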

keen root Feb 19, 2021, 8:14 PM

#

That's pretty awesome, I didn't know that existed in scikit

iron basalt Feb 19, 2021, 8:14 PM

#

The key thing it does for you is called softmax. In case you want to learn more about it, just search for softmax.

keen root Feb 19, 2021, 8:15 PM

#

That's the generalization of the logistic curve, right?

iron basalt Feb 19, 2021, 8:15 PM

#

generalization of logistic regression yes
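
To make the connection concrete, a small NumPy sketch with made-up scores: softmax turns a vector of class scores into probabilities, and with two classes it reduces to the logistic curve.

```python
import numpy as np


def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()


def sigmoid(t):
    """The logistic curve used in two-class logistic regression."""
    return 1.0 / (1.0 + np.exp(-t))


scores = np.array([2.0, 0.5, -1.0])   # made-up scores for three classes
probs = softmax(scores)
print(probs, probs.sum())

# With two classes and scores [t, 0], softmax reduces to the logistic curve.
t = 1.3
two_class = softmax(np.array([t, 0.0]))
print(two_class[0], sigmoid(t))
```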

keen root Feb 19, 2021, 8:15 PM

#

Ok ok, awesome. Thank you. I'll give it a go then

iron basalt Feb 19, 2021, 8:16 PM

#

generally if you have a multi-class prediction problem it's the go to

stray roost Feb 19, 2021, 8:17 PM

#

One quick question

#

So basically I decided to see how did other people make their fashion_mnist code, changed mine to be like theirs and it still overfits

iron basalt Feb 19, 2021, 8:18 PM

#

Yup, at that point you gotta try harder. No nice out the box solution.

#

ML is ~~kind of~~ open-ended, lots of room for improvement.

#

*very

#

Just see what seems to work and what does not, and after that you have to get creative.

#

(Try to come up with reasons why it works or does not and then test those ideas / science)

storm lintel Feb 19, 2021, 9:03 PM

#

anyone know how to find a hidden option on target? like it's oos rn and i cant find the website's html for the add to cart button

charred umbra Feb 19, 2021, 9:40 PM

#

The numbers MNIST can technically get 100% accuracy on a 20:80 test-train split

#

There is already a configuration of convolution and pooling that had achieved a 100% CCR

charred umbra Feb 19, 2021, 9:43 PM

#

stray roost Hi yall. I recently got into machine learning and AI. Can yall give me some inte...

Make a support vector machine [SVM] to classify between dogs and cats (aim for a CCR above 80%)

astral path Feb 19, 2021, 9:55 PM

#

#

just a nice looking heatmap i made

stray roost Feb 19, 2021, 10:05 PM

#

charred umbra Make support vector machine [SVM] to classify between dogs and cats (aim for a C...

Oh that's interesting. I did that but with CNN using a tutorial

#

I might try it

misty flint Feb 19, 2021, 10:08 PM

#

astral path just a nice looking heatmap i made

pandas/matplotlib?

#

Sip

astral path Feb 19, 2021, 10:10 PM

#

seaborn!

#

and pandas

#

but matplotlib is under the hood of seaborn

charred umbra Feb 19, 2021, 10:11 PM

#

stray roost Oh that's interesting. I did that but with CNN using a tutorial

Yeah the idea is to use the CNN to feature extract, but instead of using an MLP to classify use an SVM

opal ferry Feb 19, 2021, 10:26 PM

#

Not sure if this is widespread or has been asked a lot lately, but is chrome saying the pandas documentation site is “dangerous” for you guys?

#

Pandas.pydata.org

tidal bough Feb 19, 2021, 10:34 PM

#

yeah, people have noticed

#

it's pretty weird

opal ferry Feb 19, 2021, 10:34 PM

#

...am I safe to still view the pandas docs?

tidal bough Feb 19, 2021, 10:35 PM

#

There's an issue for it:
https://github.com/pandas-dev/pandas/issues/39862

GitHub

BUG: Docs website is blocked by google safe browsing · Issue #39862...

I hope it's ok to skip the template as this is not a direct code bug but a website bug. Sorry for the german locales, but I guess the message is pretty clear anyways. The pandas docs are bl...

opal ferry Feb 19, 2021, 10:35 PM

#

I was just reading that, no real insight in the thread tho

tidal bough Feb 19, 2021, 10:36 PM

#

lol, yeah, the pandas devs are really confused

velvet thorn Feb 19, 2021, 11:15 PM

#

good thing I neither use Chrome nor read documentation 🥴

tidal bough Feb 19, 2021, 11:17 PM

#

it shows for me in Firefox too

#

it's just pretty weird, only some paths are affected

lapis sequoia Feb 19, 2021, 11:45 PM

#

@shadow ridge just got it too

#

date_range

prisma willow Feb 19, 2021, 11:47 PM

#

New question

Regarding machine learning and AI, I was wondering where what's being learnt is saved/stored. Apparently ML can do a linear distribution, but I don't get how the machine is learning anything, or how test/train works, because nothing seems to be remembered; the program is just running an approximation on some data... Or with a chess AI, how does the program remember and train against itself? Where would each trial be stored, and in what format?
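
One way to picture it, as a minimal sketch with made-up data: what a model "learns" is a set of numbers (parameters), and those numbers can be written to a file and loaded again later. Here the model is a straight line, so there are only two of them; the file name `model.json` is arbitrary.

```python
import json
import os
import tempfile

import numpy as np

# Made-up training data that roughly follows y = 2x + 1.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# "Learning" here means choosing the slope and intercept that fit best.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

# The learned state is just these two numbers; write them to a file.
path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(path, "w") as f:
    json.dump({"slope": float(slope), "intercept": float(intercept)}, f)

# Later (or in another program): reload the numbers and predict, no retraining.
with open(path) as f:
    params = json.load(f)
prediction = params["slope"] * 10.0 + params["intercept"]
print(params, prediction)
```

A neural network or a chess engine stores far more numbers (its weights), but the idea is the same: training adjusts them, and a saved file keeps them between runs.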

plain jungle Feb 19, 2021, 11:47 PM

#

I made a post a few weeks ago of an AI playing the chrome Dino, now this is the frogger

magic panther Feb 19, 2021, 11:57 PM

#

guys and girls, if I have a set of input parameters and I want to minimize one of them, how do i go about making an objective function? what do people do to find a relationship between my variables?

digital crescent Feb 20, 2021, 12:20 AM

#

I would like to do a lot of realtime high-speed data analysis. One of my analysis techniques will probably use either fixed-length time series where old data is dropped off and replaced with new data or time series that grow as new data arrives. Will I be hindered by using Pandas? Should I try to focus exclusively on Numpy? Maybe focus on something else entirely? The new data will (at the start of the analysis) be entirely on a SQL server. New data will also arrive to the SQL server periodically.

#

I'm pretty new to Python. Just want to make sure that I'm practicing with the approaches, techniques, and packages that will serve me best over the long-run

iron basalt Feb 20, 2021, 12:23 AM

#

When you say speed do you mean throughput or latency? @digital crescent

digital crescent Feb 20, 2021, 12:24 AM

#

iron basalt When you say speed do you mean throughput or latency? @digital crescent

I don't think I know the difference well enough. I would like to be able to pull the first set of data + any smaller "new" data quickly from SQL, update my dataset in Python quickly, and run a new updated analysis as fast as possible. So every step in the process is important to me in terms of speed

#

I've heard a bit in terms of certain packages used to interact with SQL servers being faster/slower, but I guess I was mostly concerned with me inefficiently manipulating the data during the analysis part

velvet thorn Feb 20, 2021, 12:26 AM

#

digital crescent I don't think I know the difference well enough. I would like to be able to pull...

throughput = how much gets done in a period of time
latency = how long it takes for one unit of work to get done

iron basalt Feb 20, 2021, 12:26 AM

#

Latency would be the time from new data entering the system (before it even gets to the database) to the output of the analysis being updated. Throughput would be how many points in the time series you can process per unit time.
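
A small sketch of how the two could be measured for a processing loop. `analyse` is a made-up stand-in for one unit of analysis work, not anything from the thread.

```python
import time


def analyse(point):
    """Hypothetical stand-in for one unit of analysis work."""
    return sum(i * point for i in range(1000))


points = list(range(2000))
latencies = []

start = time.perf_counter()
for p in points:
    t0 = time.perf_counter()
    analyse(p)
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

# Latency: how long one unit of work takes.
mean_latency = sum(latencies) / len(latencies)
# Throughput: how many units of work finish per second.
throughput = len(points) / elapsed

print(f"mean latency: {mean_latency * 1e6:.1f} microseconds per point")
print(f"throughput:   {throughput:.0f} points per second")
```

In a simple serial loop like this one, throughput is roughly one over the latency. The two come apart once work is batched or run in parallel, which is why optimizing for one can look very different from optimizing for the other.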

velvet thorn Feb 20, 2021, 12:27 AM

#

digital crescent I don't think I know the difference well enough. I would like to be able to pull...

so from this it sounds like you want low latency at least

iron basalt Feb 20, 2021, 12:27 AM

#

Optimizing for one is very different than the other.

digital crescent Feb 20, 2021, 12:27 AM

#

If we go by Squiggle's definition, I'd say I'm mostly concerned with the speed of analysis

velvet thorn Feb 20, 2021, 12:27 AM

#

whether you also need high throughput will depend on how much input is coming in at any one time

digital crescent Feb 20, 2021, 12:27 AM

#

I think in general it will take much more time for me to analyze the data than to move it from SQL to Python

#

Thousands of new data points per second potentially but the analysis could involve as many computations as I wanted (so thousands, tens of thousands, hundreds of thousands, millions)

velvet thorn Feb 20, 2021, 12:29 AM

#

what kind of analysis?

#

you might want to look into Spark

iron basalt Feb 20, 2021, 12:29 AM

#

So you want low latency? You do not have it backing up in terms of how many points it can process? By that I mean does the database get more points than it gives to the analysis system (think like how much water is flowing into a container versus flowing out).

velvet thorn Feb 20, 2021, 12:29 AM

#

in particular, Spark streaming

digital crescent Feb 20, 2021, 12:29 AM

#

A mixture of things. Regressions, pattern recognition, random stuff with probability distributions

velvet thorn Feb 20, 2021, 12:29 AM

#

do you want a managed solution?

#

(you probably do, right...)

digital crescent Feb 20, 2021, 12:30 AM

#

iron basalt So you want low latency? You do not have it backing up in terms of how many poin...

The database will have more data points that I will be using, yes

velvet thorn Feb 20, 2021, 12:30 AM

#

are you willing to spend $$?

digital crescent Feb 20, 2021, 12:30 AM

#

But I would like the flexibility to basically decide to stop looking at one portion of the dataset and look at a completely different portion

digital crescent Feb 20, 2021, 12:30 AM

#

velvet thorn are you willing to spend $$?

Probably not, no

velvet thorn Feb 20, 2021, 12:30 AM

#

digital crescent Probably not, no

huh.

#

well

#

cheap, good, fast, choose 2

digital crescent Feb 20, 2021, 12:31 AM

#

The SQL server will be on my computer, and I will be doing the analysis mostly in Python on my computer as well

#

(Just to clarify that the "server" isn't like a separate set of hardware)

velvet thorn Feb 20, 2021, 12:31 AM

#

of course it depends on your exact requirements and it's defo possible to run analyses locally without incurring additional costs

#

but @ some point you might need to scale up

#

hard to say without knowing numbers

iron basalt Feb 20, 2021, 12:32 AM

#

Then without spending anything I think a solution may be to have the data points go directly to the analysis system to reduce the latency of having to go through a database system. However, at the same time the data points are also sent to the database to be stored for later.

velvet thorn Feb 20, 2021, 12:32 AM

#

digital crescent Thousands of new data points per second potentially but the analysis could invol...

thousands per second is not impossible

#

but it would require some configuration and data engineering, at least

iron basalt Feb 20, 2021, 12:32 AM

#

oh on the same system

digital crescent Feb 20, 2021, 12:33 AM

#

I guess my main concern is just the Python tools I should use for analysis provided that the data I'm analyzing is constantly changing (i.e. changing in size, looking at a completely different set of data points, stuff like that)

#

And I see stuff like with Pandas that say that adding extra rows is super slow

#

And it makes me wonder about other concerns I should have

velvet thorn Feb 20, 2021, 12:33 AM

#

digital crescent And I see stuff like with Pandas that say that adding extra rows is super slow

adding rows is slow (relatively) because you need to reallocate the backing array

#

pandas isn't really meant for data that changes
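
For the fixed-length case, where old data is dropped as new data arrives, one common way around the reallocation is to allocate the array once and overwrite the oldest slots in place. A minimal sketch, assuming a single numeric series; `RollingWindow` is a made-up name, not a pandas or NumPy class.

```python
import numpy as np


class RollingWindow:
    """Fixed-length window over a stream, stored in one preallocated array."""

    def __init__(self, capacity):
        self._data = np.empty(capacity)
        self._capacity = capacity
        self._next = 0      # slot the next value is written to
        self._count = 0     # number of valid values, at most capacity

    def extend(self, values):
        """Write new values in place, overwriting the oldest when full."""
        for v in np.asarray(values, dtype=float).ravel():
            self._data[self._next] = v
            self._next = (self._next + 1) % self._capacity
            self._count = min(self._count + 1, self._capacity)

    def view(self):
        """Return the current window as a new array, oldest value first."""
        if self._count < self._capacity:
            return self._data[:self._count].copy()
        return np.roll(self._data, -self._next)


window = RollingWindow(capacity=5)
window.extend([1, 2, 3, 4, 5])
window.extend([6, 7])          # drops 1 and 2
print(window.view())           # [3. 4. 5. 6. 7.]
```

The per-value Python loop keeps the sketch easy to read; for thousands of points per second, writing whole slices at once would be the next thing to try.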

digital crescent Feb 20, 2021, 12:33 AM

#

And which preferred approaches I should be considering for realtime analysis

velvet thorn Feb 20, 2021, 12:33 AM

#

which is why I said

#

look into Spark streaming

#

which adds a lot of overhead

#

but shrugs

#

is a bit heavyweight for local usage, too

#

I mean

#

you could defo build abstractions around numpy that would allow you to do this kind of thing but

iron basalt Feb 20, 2021, 12:34 AM

#

Yeah this is starting to sound like a database question, which could be asked in the databases section. There are database systems that are designed for quickly adding new types of data and such, but I am not exactly an expert on them.

velvet thorn Feb 20, 2021, 12:35 AM

#

so

#

this goes back to the kind of analyses you need to perform

#

but yeah, probably Spark.

digital crescent Feb 20, 2021, 12:35 AM

#

Huh. I didn't think this was really a databases question primarily, but then again I don't know much about these kinds of things

velvet thorn Feb 20, 2021, 12:35 AM

#

digital crescent Huh. I didn't think this was really a databases question primarily, but then aga...

it's a data engineering problem, specifically.

#

you're basically asking "how do I construct an ETL pipeline -> data warehouse that will meet my needs?"

digital crescent Feb 20, 2021, 12:36 AM

#

Does ETL include an analysis step?

velvet thorn Feb 20, 2021, 12:36 AM

#

possibly, as part of the T step, but

#

depends on the complexity of the analysis

#

ideally that would come after

digital crescent Feb 20, 2021, 12:37 AM

#

I'm more concerned with the speed of the analysis part than the speed of the "ETL" part

velvet thorn Feb 20, 2021, 12:37 AM

#

digital crescent Thousands of new data points per second potentially but the analysis could invol...

when you say "computations" you mean "data points"?

#

the reason pandas is fast (relatively speaking) is that it holds the entire dataset in memory.

digital crescent Feb 20, 2021, 12:37 AM

#

velvet thorn when you say "computations" you mean "data points"?

No, I meant the number of computations required to perform the analysis. Basically you can take 10000 points and look at them a million different ways

velvet thorn Feb 20, 2021, 12:37 AM

#

digital crescent No, I meant the number of computations required to perform the analysis. Basical...

so how many points @ any one time?

digital crescent Feb 20, 2021, 12:38 AM

#

But if I want to add 100 or replace 100 and look at it differently, I feel like I could run into problems with Pandas

velvet thorn Feb 20, 2021, 12:38 AM

#

and how often will the subset of data to be looked at change?

#

relative to the number of analyses being run

iron basalt Feb 20, 2021, 12:38 AM

#

The first step is to get upper bounds on things. You can't do as many computations as you want; computation is a finite resource.

digital crescent Feb 20, 2021, 12:38 AM

#

velvet thorn so how many points @ any one time?

Could be 500 - 10000, I suspect

iron basalt Feb 20, 2021, 12:38 AM

#

Even if those upper bounds are massive

velvet thorn Feb 20, 2021, 12:38 AM

#

digital crescent Could be 500 - 10000, I suspect

this is very manageable

digital crescent Feb 20, 2021, 12:40 AM

#

velvet thorn and how often will the subset of data to be looked at change?

It will be a balancing act. I would like to produce updated analyses as fast as possible (ideally within a few seconds or a fraction of a second), so I'm aware that I won't be able to do the best analysis nonstop. But say I've got 5000 data points and add/replace 300. I would like to run some regressions or do some pattern recognition or generate some new probability distributions as fast as possible

#

But it will be constantly running. And the faster I can analyze, the better analysis I can do if the goal is to produce an updated analysis on a rolling, realtime, and almost infinite basis

velvet thorn Feb 20, 2021, 12:41 AM

#

digital crescent It will be a balancing act. I would like to produce updated analyses as fast as ...

a million computations is a lot

#

if it's anything complex like regression analysis, you don't have enough compute for that

#

nowhere near

digital crescent Feb 20, 2021, 12:42 AM

#

I'm trying to think in my head how many computations would be required to do a simple linear regression with 10,000 coordinate pairs

#

Probably a lot

velvet thorn Feb 20, 2021, 12:42 AM

#

digital crescent I'm trying to think in my head how many computations would be required to do a...

do you mean like primitive computations...?

iron basalt Feb 20, 2021, 12:42 AM

#

digital crescent I'm trying to think in my head how many computations would be required to do a...

Don't imagine, just try it out, test it.

digital crescent Feb 20, 2021, 12:42 AM

#

Might not take a lot of memory though

velvet thorn Feb 20, 2021, 12:42 AM

#

no, 10,000 is very small

#

but yeah, just try it.

digital crescent Feb 20, 2021, 12:42 AM

#

I mean, I can almost count them in my head

iron basalt Feb 20, 2021, 12:42 AM

#

Just make sure you are measuring it correctly.

digital crescent Feb 20, 2021, 12:43 AM

#

Yeah. I was abstracting it into the stuff I would do on paper which obviously isn't the same as what goes on in a computer

#

But again, I feel like this is somewhat beside the point, right?

iron basalt Feb 20, 2021, 12:43 AM

#

A simple timing can tell a lot
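
Such a timing could look like the sketch below: a simple linear regression on 10,000 synthetic coordinate pairs, timed with `time.perf_counter`. The data and the choice of `np.polyfit` are assumptions made for illustration; the number printed will depend on the machine.

```python
import time

import numpy as np

# 10,000 synthetic coordinate pairs scattered around y = 3x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 10_000)
y = 3.0 * x + 1.0 + rng.normal(0, 5, x.size)

# Time many repetitions and report the best, to reduce timer noise.
timings = []
for _ in range(20):
    t0 = time.perf_counter()
    slope, intercept = np.polyfit(x, y, deg=1)
    timings.append(time.perf_counter() - t0)

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"best of 20 fits: {min(timings) * 1e3:.3f} ms")
```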

digital crescent Feb 20, 2021, 12:45 AM

#

Regardless of the analysis, if the analysis is taking up the bulk of the time (and not the data-fetching part), is there a generally preferred way to handle the data and intermediate calculations in Python? Or is it really not that simple? Like if I give you 10,000 data points and tell you that every so often the analysis will randomly be performed on a somewhat differently sized database and sometimes the analysis itself will be slightly different, what tools do you use to run the analysis? Not Pandas? Yes to Pandas? Only Numpy? Python Lists?

velvet thorn Feb 20, 2021, 12:45 AM

#

digital crescent Regardless of the analysis, if the analysis is taking up the bulk of the time (a...

for numeric data, never lists.

#

it depends on the analysis.

#

but numpy is generally faster

digital crescent Feb 20, 2021, 12:46 AM

#

I guess I'm just worried that Pandas seems almost useless if speed is at all an issue if you decide to add some data to your existing dataset

velvet thorn Feb 20, 2021, 12:46 AM

#

pandas can provide better abstractions though

velvet thorn Feb 20, 2021, 12:46 AM

#

digital crescent I guess I'm just worried that Pandas seems almost useless if speed is at all an ...

pandas is backed by numpy arrays.

iron basalt Feb 20, 2021, 12:47 AM

#

It's not that simple, but like gm wrote, there are definitely some things NOT to do. Python itself is pretty slow, so all the speed must come from the C libraries.

velvet thorn Feb 20, 2021, 12:47 AM

#

they have the exact same problem.

digital crescent Feb 20, 2021, 12:47 AM

#

Is there even an efficient way to handle data of a changing size or is that kind of a problem that can't be solved?

velvet thorn Feb 20, 2021, 12:48 AM

#

digital crescent Is there even an efficient way to handle data of a changing size or is that kind...

there is

#

but

#

that is not necessarily the correct question

digital crescent Feb 20, 2021, 12:48 AM

#

velvet thorn `pandas` is backed by `numpy` arrays.

I've just read that Pandas is like Numpy + overhead and is slightly slower. I have no idea if that's relevant to me though

velvet thorn Feb 20, 2021, 12:48 AM

#

"how often will the dataset size change?"

#

like let's say

#

running analyses takes 15s

#

then you change once

#

and that takes 0.5s

digital crescent Feb 20, 2021, 12:49 AM

#

Gotcha. Thanks. You guys have given me some stuff to consider

#

It is almost like I should just spend more time thinking about ways to efficiently structure stuff with the tools I have rather than look for a tool that magically solves these problems

velvet thorn Feb 20, 2021, 12:51 AM

#

data engineering is an art

#

not one I particularly like, but it is important

iron basalt Feb 20, 2021, 12:53 AM

#

No library can magically overcome the limitations set by the hardware itself. In general, the less you know up front (which types of data you will have, how much, etc), the slower the solution will be, but with the trade-off of hack-ability / extensibility.

digital crescent Feb 20, 2021, 12:53 AM

#

Here is an example of what I mean (not necessarily one that applies to my project but I think it is in the same line of thought): Imagine your dataset for analysis will be anywhere from 5000 - 6000 rows. Maybe you could just make a 6000-row Pandas table and fill in the ones you don't need with zero or something like that. Or track the rows that aren't being used. And then have some kind of flag to ignore the portion of the vector calculations done on the unneeded rows

#

Something like that

velvet thorn Feb 20, 2021, 12:54 AM

#

digital crescent Here is an example of what I mean (not necessarily one that applies to my projec...

you could

#

or you could also use a numpy masked array

#

which I think might be a more appropriate abstraction, BUT

digital crescent Feb 20, 2021, 12:54 AM

#

Not saying this is the best way to do things. But it is what I mean in terms of just coming up with better solutions rather than looking for better tools

velvet thorn Feb 20, 2021, 12:54 AM

#

shrugs

velvet thorn Feb 20, 2021, 12:54 AM

#

digital crescent Not saying this is the best way to do things. But it is what I mean in terms of ...

okay so you see

#

this will boil down to performing a filtering operation at the start of every set of analyses, probably

#

because you want to retrieve the subset that you'll use

iron basalt Feb 20, 2021, 12:55 AM

#

digital crescent Here is an example of what I mean (not necessarily one that applies to my projec...

That is a technique known as pre-allocation, it works (faster and makes the code more simple), but you will be using up more memory.
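A rough sketch of the pre-allocation idea, assuming a fixed upper bound of 6000 rows as in the example above. `RowBuffer` is a hypothetical helper written for illustration, not something NumPy or pandas provides; a NumPy masked array (`np.ma`) would be another way to express "ignore the unused rows".

```python
import numpy as np

class RowBuffer:
    """Fixed-capacity 2D buffer that tracks how many rows are in use."""

    def __init__(self, capacity, n_cols):
        self._data = np.zeros((capacity, n_cols), dtype=np.float64)  # allocated once
        self._used = 0

    def append(self, row):
        if self._used == self._data.shape[0]:
            raise OverflowError("buffer is full; choose a larger capacity")
        self._data[self._used] = row
        self._used += 1

    def active(self):
        """View (not a copy) of the rows currently in use."""
        return self._data[: self._used]

buf = RowBuffer(capacity=6000, n_cols=2)
for row in ([1.0, 10.0], [2.0, 20.0], [3.0, 30.0]):
    buf.append(row)

column_means = buf.active().mean(axis=0)  # unused rows never enter the calculation
```

Slicing off the active rows plays the role of the filtering step at the start of each analysis, and because it is a view it costs no extra copy.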

digital crescent Feb 20, 2021, 12:55 AM

#

Just it doesn't seem like this stuff is written out anywhere. Or at least I don't see a good guide saying "this is how you should do x/y/z analysis in Python if you want to do it quickly"

#

Which is fine

#

I just wanted to make sure that I wasn't missing anything obvious

#

I.e. If it was as simple as like "oh, realtime data analysis with changing amounts of data? Do/don't use pandas"

#

Or do/don't use numpy

#

But I think you guys have gotten me somewhere 🙂

#

So thanks

velvet thorn Feb 20, 2021, 12:56 AM

#

yw

atomic obsidian Feb 20, 2021, 12:57 AM

#

Is sql ever going to become obsolete due to libraries in languages or is learning it valuable?

prisma willow Feb 20, 2021, 12:58 AM

#

python implements an sql package

velvet thorn Feb 20, 2021, 12:58 AM

#

atomic obsidian Is sql ever going to become obsolete due to libraries in languages or is learnin...

no idea whether it'll ever become obsolete

#

probably, in a hundred years+?

iron basalt Feb 20, 2021, 12:58 AM

#

SQL might, but relational databases probably not (speculation). It will probably stick around for a long time in case anyone needs to manually query things.

velvet thorn Feb 20, 2021, 12:58 AM

#

but the fact that ORMs can abstract away the need to know raw SQL is not in itself a reason not to learn SQL

velvet thorn Feb 20, 2021, 12:59 AM

#

atomic obsidian Is sql ever going to become obsolete due to libraries in languages or is learnin...

...why do you ask

prisma willow Feb 20, 2021, 1:01 AM

#

question
what does it mean when people say SQL Server and SQL Developer have their own database? Aren't we, the users, the ones creating the database? What does it mean for them to come with one?

#

I'm using "database" synonymously with tabular data

storm lintel Feb 20, 2021, 1:27 AM

#

```
tag = self.soup.body.find('div', class_='fulfillment-add-to-cart-button')
if tag and 'add to cart' in tag.text.lower():
    self.alert_subject = alert_subject
    self.alert_content = f'{alert_content.strip()}\n{self.url}'
```

#

anyone see anything wrong with this?

#

its saying its in stock but its not idk if the code is messed up or sum

#

im not making a auto buy bot btw

misty flint Feb 20, 2021, 1:45 AM

#

prisma willow python implements an sql package

ID_blurryeyes

plain jungle Feb 20, 2021, 2:03 AM

#

SQL is going to stick around for a long time for the same reasons that Java is sticking around. There are better languages for the job, but so many businesses already use it that they'd never think of not using it #Mongo

#

That being said, SQL definitely does have times where it shines and if you are looking to get into it with python, try SQLite3
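For anyone who wants to try that, here is a minimal sketch with the standard-library `sqlite3` module. The table and values are invented for illustration; the in-memory database means nothing is written to disk.

```python
import sqlite3

# In-memory database: nothing is written to disk, handy for experiments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings (sensor, value) VALUES (?, ?)",
    [("a", 1.5), ("a", 2.5), ("b", 10.0)],
)
conn.commit()

# Aggregate per sensor in SQL instead of looping in Python.
rows = conn.execute(
    "SELECT sensor, AVG(value) FROM readings GROUP BY sensor ORDER BY sensor"
).fetchall()
print(rows)
conn.close()
```

The `?` placeholders let the driver handle quoting, which is the habit worth picking up early.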

prisma willow Feb 20, 2021, 2:05 AM

#

@misty flint spill the beans

#

@misty flint ppl cant better themselves with inside jokes

lapis sequoia Feb 20, 2021, 2:11 AM

#

Hello everyone,

I am having an issue with RollingOLS from statsmodels.
```
mod = RollingOLS(Y, X, window=75, min_nobs=None, expanding=True)
fit = mod.fit()
```

#

When i want to get the AIC

#

i get a list of multiple values

#

and i think my X and Y are in the wrong format

#

since i have X and Y two lists

#

of numbers

#

How should i proceed ?

misty flint Feb 20, 2021, 2:28 AM

#

prisma willow @misty flint ppl cant better themselves with inside jokes

oh that was just me being hopeful

#

DoggoKek

misty flint Feb 20, 2021, 2:28 AM

#

plain jungle SQL is going to stick around for a long time because of the same reasons that Ja...

so why people used Azure at first and now that theyve upped their services, people are using it for reals?

#

Sip

odd aspen Feb 20, 2021, 2:35 AM

#

lapis sequoia Hello evveryone, I am having an issue with RollingOLS from statsmodels . ''' m...

Can I see how you created the X and Y arrays?

plain jungle Feb 20, 2021, 2:36 AM

#

lol, I mean just as Fortran and COBOL are still around for banks, SQL is here to stay for a while @misty flint

lapis sequoia Feb 20, 2021, 2:39 AM

#

odd aspen Can I see how you created the X and Y arrays?

Both are numpy array

#

1D

#

containing numbers

#

np.size(X) give 529

#

and np.size(Y) give 529 too

#

@odd aspen Do I have to use a pandas DataFrame?

#

shape gives (529,)

austere swift Feb 20, 2021, 3:16 AM

#

How would i put the labels for the bars in a matplotlib bar graph above the bars themselves

#

something like this with the "text field" being the label

flat-chart-graph-simply-color-editable-infographics-elements-vector-id538088027.png

misty flint Feb 20, 2021, 4:01 AM

#

the more i use matplotlib, the more i realize i hate it

#

DoggoKek

keen kestrel Feb 20, 2021, 4:09 AM

#

I use altair for statistical plot, the API is easier to remember

tight jewel Feb 20, 2021, 4:15 AM

#

austere swift something like this with the "text field" being the label

Woah, dont tell me this is made in python ....

still salmon Feb 20, 2021, 4:21 AM

#

I want to specify in pandas dataframe timezone as CDT IST EDT etc , instead of Region/Country, is there a way to do that?? All examples I came across specify timezone as Region/Country Ex - "Africa/Douala"

austere swift Feb 20, 2021, 4:21 AM

#

tight jewel Woah, dont tell me this is made in python ....

I don’t know I just found that on the internet

keen kestrel Feb 20, 2021, 4:22 AM

#

It looks like it is made in excel lol

#

The label on top, this example will do? https://matplotlib.org/stable/gallery/lines_bars_and_markers/barchart.html#sphx-glr-gallery-lines-bars-and-markers-barchart-py

bold olive Feb 20, 2021, 4:36 AM

#

hasty grail What does your model look like?

A standard 3 layer CNN.

honest adder Feb 20, 2021, 5:15 AM

#

not quite sure where to put this

#

from MTM import matchTemplates
import cv2
r10 = 'out2.png'
lt = [('small', r10)]
image = 'rank101.png'
Hits = matchTemplates(lt, image, score_threshold=0.5, method=cv2.TM_CCOEFF_NORMED, maxOverlap=0)

#

MTM is a library for template detection, so i don't have to fuss about with more code than i have to

#

but it keeps outputting AttributeError: 'str' object has no attribute 'dtype' at line 7

#

For the example, they use coins from `from skimage.data import coins`... how do i make my image have a dtype

iron basalt Feb 20, 2021, 5:31 AM

#

honest adder For the example, they use coin from from skimage.data import coins... how do i m...

You misunderstand the API. The matchTemplates function wants numpy arrays not filepaths to image files. image should be a numpy array, as should r10. The reason the error says that it can't find dtype is because you gave it a string instead of a numpy array, strings don't have dtypes.

#

Basically it wants the actual image data, not a string telling it where to find the image data.

hasty grail Feb 20, 2021, 5:59 AM

#

bold olive A standard 3 layer CNN.

Could you print the summary?

analog pike Feb 20, 2021, 6:33 AM

#

Don't know if this is the right place but can someone tell me why this is throwing up an error/what I can do to fix it

#

https://gyazo.com/a172528d60920a0f041cbee44ba6c252

Gyazo

fierce shadow Feb 20, 2021, 6:34 AM

#

bold olive How can I run out of memory with 3D CNN (TensorFlow Keras), even with a batch si...

btw why are you using 3d cnn when you want to process images?

analog pike Feb 20, 2021, 6:34 AM

#

```
x = 0
index = ufos[ufos['country'] == 'us']
for value in index['datetime']:
    if ("24:" in value):
        value.replace('24:00', '00:00')
        index.iloc[x,0] = value
    x+=1
```
there we go

#

pd.to_datetime doesn't like when values are as 24:00 but im having trouble reassigning the values back into the dataframe once i change 24:00 to 00:00
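One likely culprit in the loop above is that `str.replace` returns a new string rather than changing `value` in place, so the original text is what gets written back. A vectorized sketch of the same idea follows; the toy table, its column order, and the `month/day/year hour:minute` format are assumptions, not the real dataset. Strictly speaking 24:00 is midnight at the end of the day, so mapping it to 00:00 keeps the same calendar date, which is harmless if only the year matters.

```python
import pandas as pd

# Toy stand-in for the UFO table; the column layout and date format are assumed.
ufos = pd.DataFrame({
    "datetime": ["10/10/1949 20:30", "10/10/1955 24:00", "10/11/1960 24:00"],
    "country": ["us", "us", "gb"],
})

is_us = ufos["country"] == "us"

# Series.str.replace returns a new Series, so the result is assigned back via .loc.
ufos.loc[is_us, "datetime"] = ufos.loc[is_us, "datetime"].str.replace(
    "24:00", "00:00", regex=False
)

parsed = pd.to_datetime(ufos.loc[is_us, "datetime"], format="%m/%d/%Y %H:%M")
```

Assigning through `ufos.loc[...]` writes into the original frame instead of a filtered copy.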

bold olive Feb 20, 2021, 6:35 AM

#

fierce shadow btw why are you using 3d cnn when you want to process images?

I'm processing 3D volumetric images.

fierce shadow Feb 20, 2021, 6:35 AM

#

oh okay

bold olive Feb 20, 2021, 6:36 AM

#

Hang on, I'll get the model summary!

bold olive Feb 20, 2021, 6:42 AM

#

hasty grail Could you print the summary?

```
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv3d (Conv3D)              (None, 222, 126, 222, 32) 896       
_________________________________________________________________
activation (Activation)      (None, 222, 126, 222, 32) 0         
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 111, 63, 111, 32)  0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 109, 61, 109, 32)  27680     
_________________________________________________________________
activation_1 (Activation)    (None, 109, 61, 109, 32)  0         
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 54, 30, 54, 32)    0         
_________________________________________________________________
flatten (Flatten)            (None, 2799360)           0         
_________________________________________________________________
dense (Dense)                (None, 32)                89579552  
_________________________________________________________________
activation_2 (Activation)    (None, 32)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 33        
_________________________________________________________________
activation_3 (Activation)    (None, 1)                 0         
=================================================================
Total params: 89,608,161
Trainable params: 89,608,161
Non-trainable params: 0
_________________________________________________________________```

#

And even with a batch size of 1, I get this OOM error during the first epoch:

hasty grail Feb 20, 2021, 6:43 AM

#

yeah that definitely looks too big

bold olive Feb 20, 2021, 6:43 AM

#

hasty grail Feb 20, 2021, 6:43 AM

#

200 cubed is huge

bold olive Feb 20, 2021, 6:43 AM

#

What can I do to avoid this? I mean I have to process the images somehow.

#

Does it have to do with how I am shaping my data?

hasty grail Feb 20, 2021, 6:44 AM

#

it basically means that your data is too large

#

what is the kernel size of each CNN layer?

bold olive Feb 20, 2021, 6:44 AM

#

The tensor (or X) you mean?

#

Because each image file is only around 14mb

hasty grail Feb 20, 2021, 6:44 AM

#

yes

#

because when you input it into a CNN layer

#

inside the convolution operation, your 3d image is multiplied by each value in the kernel, resulting in a total size = (size of image) x (size of kernel)

#

kernel size, not number of channels

bold olive Feb 20, 2021, 6:47 AM

#

Sorry - (3,3,3)

hasty grail Feb 20, 2021, 6:47 AM

#

ok that's as small as it's going to get

bold olive Feb 20, 2021, 6:47 AM

#

Yeah

hasty grail Feb 20, 2021, 6:47 AM

#

so yeah, probably your input is too large

bold olive Feb 20, 2021, 6:48 AM

#

Okay so how do I make this work now basically?

hasty grail Feb 20, 2021, 6:48 AM

#

you'll need to downsample it before the CNN layers

bold olive Feb 20, 2021, 6:49 AM

#

I have 38 images, each with a dimension of (766, 200, 760)

#

After creating X and resizing, the shape of X is (39, 224, 128, 224)

#

Then I add one more channel to make it a 5D tensor fit for Conv3D so it becomes (39, 224, 128, 224, 1)

hasty grail Feb 20, 2021, 6:51 AM

#

you'll have to resize your images to be even smaller

bold olive Feb 20, 2021, 6:53 AM

#

Hmm, perhaps (128, 64, 128) will do the trick?

hasty grail Feb 20, 2021, 6:53 AM

#

try it

bold olive Feb 20, 2021, 6:53 AM

#

There is an example of volumetric CT image classification on Keras

#

https://keras.io/examples/vision/3D_image_classification/

Keras documentation: 3D Image Classification from CT Scans

#

They are using the same size without any problems

hasty grail Feb 20, 2021, 6:54 AM

#

Yeah

bold olive Feb 20, 2021, 7:00 AM

#

Yes, seems to work now!

#

Need to increase the accuracy but that's another issue

#

Great, so the cubed size was far too large

hasty grail Feb 20, 2021, 7:04 AM

#

yeah, by halving each dimension you're now using 1/8th of the original memory
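The arithmetic behind that can be sketched in a few lines. This only counts one float32 feature map for one sample; actual training memory is higher because gradients, the other layers, and framework overhead are stored as well. The shapes come from the model summary above, and the helper names are made up for this illustration.

```python
from math import prod

def conv3d_valid_shape(spatial, kernel=3):
    """Spatial output size of an unpadded, stride-1 convolution."""
    return tuple(d - kernel + 1 for d in spatial)

def activation_bytes(spatial, channels, bytes_per_value=4):
    """Memory for one sample's feature map, assuming float32 values."""
    return prod(spatial) * channels * bytes_per_value

full = (224, 128, 224)
halved = tuple(d // 2 for d in full)  # (112, 64, 112)

first_conv = conv3d_valid_shape(full)  # (222, 126, 222), as in the summary
print(activation_bytes(first_conv, 32) / 2**20, "MiB for the first Conv3D output")
print(activation_bytes(full, 1) / activation_bytes(halved, 1), "x fewer input values")
```

Because the volume is the product of three sides, halving each side divides the element count by 2 x 2 x 2 = 8.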

short heart Feb 20, 2021, 7:38 AM

#

can somebody help wiht sklearn

floral flare Feb 20, 2021, 8:20 AM

#

Any Idea why this may be happening (The red letters is the heuristic algorithm used to expand Manhattan distance as cost, Misplaced Tiles as cost, and BFS as cost)

#

U can see that 3rd last for manhattan and the 4th last for tiles has a drop in nodes expanded

#

whereas it keeps going up for BFS

velvet thorn Feb 20, 2021, 8:39 AM

#

@bold olive how thick is your FC layer?

#

...how many neurons does your Dense layer have?

#

man that was incoherent

floral flare Feb 20, 2021, 8:44 AM

#

how thick xd lol i like that

velvet thorn Feb 20, 2021, 8:45 AM

#

floral flare how thick xd lol i like that

it's not even correct

#

it should be "how wide"

#

🥴

floral flare Feb 20, 2021, 8:56 AM

#

Ah

bold olive Feb 20, 2021, 9:57 AM

#

"thick" is alright with me joe_maverick

bold olive Feb 20, 2021, 9:57 AM

#

velvet thorn man that was incoherent

64

thin remnant Feb 20, 2021, 10:33 AM

#

I have a dataset that contains names of natural reservoirs. I've also got a few columns about area, but they don't seem useful for my situation since I need longitude and latitude... I found this website, filled in the names of a few reservoirs, and it seems to return the right longitude and latitude. Since the dataset is quite big, I obviously don't want to do this manually. Could someone help me with how to make a Python script that requests this for each name in the dataset?

#

website: https://www.latlong.net/convert-address-to-lat-long.html

Get Lat Long from Address Convert Address to Coordinates

A handy tool to get lat long from address, helps you to convert address to coordinates (latitude longitude) on map, also calculates the gps coordinates.

hexed parrot Feb 20, 2021, 11:40 AM

#

Can I get a seed from a picture's pixel colors? I mean, if you generate 256x256 random color values from a seed, can you reverse it? You input a 256x256 image and get the seed?

grave frost Feb 20, 2021, 11:56 AM

#

@hexed parrot numpy has a seed https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.seed.html

#

why don't you just generate an array that way?

hexed parrot Feb 20, 2021, 12:12 PM

#

but can you recover the seed with the list of generated numbers?

#

or there is no way?

grave frost Feb 20, 2021, 1:33 PM

#

Off the top of my head, you could maybe Bruteforce if a dedicated function to do that is not available.
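A toy sketch of what brute-forcing could look like, using the standard-library `random` module rather than NumPy. It assumes you know exactly how the values were generated and that the seed lies in a small range; for an arbitrary image there is usually no seed at all that reproduces it, and realistic seed spaces are far too large to search this way.

```python
import random

def generate_pixels(seed, count=16):
    """Produce `count` pseudo-random 8-bit values from a given seed."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(count)]

def recover_seed(pixels, max_seed):
    """Try every seed in [0, max_seed) and return the first one that matches."""
    for candidate in range(max_seed):
        if generate_pixels(candidate, len(pixels)) == pixels:
            return candidate
    return None  # no seed in the searched range reproduces the data

target = generate_pixels(seed=1234)
found = recover_seed(target, max_seed=5000)
```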

honest adder Feb 20, 2021, 1:46 PM

#

iron basalt You misunderstand the API. The `matchTemplates` function wants numpy arrays not ...

ahhh ok so i have to open the image file and then give it the string

rotund rampart Feb 20, 2021, 2:07 PM

#

hello guys, i want to create a model using Keras or Tensorflow that synthesizes two images, joining a body to a head. I don't know much about deep learning and honestly I'm a little lost. Any tips on how to get started?

grave frost Feb 20, 2021, 2:09 PM

#

Can you elaborate on what exactly you want to do?

rotund rampart Feb 20, 2021, 2:14 PM

#

input 1

#

input 2

#

output

#

i dont know exactly how to describe this

#

Ignore the colors. The first two I took a picture of the computer screen

rotund rampart Feb 20, 2021, 2:20 PM

#

rotund rampart output

in this case that was made with photoshop. but it takes a looong time to edit and thats why im searching a way to do this with deep learning

lapis sequoia Feb 20, 2021, 2:57 PM

#

would anyone review my recently worked on neural network?

#

Also how to backpropagate

woeful estuary Feb 20, 2021, 3:04 PM

#

lapis sequoia Also how to backpropagate

What ai framework are you using?

lapis sequoia Feb 20, 2021, 3:05 PM

#

python

#

As in pure python with a few simple modules like math and random

#

So not really any framework and from scratch

woeful estuary Feb 20, 2021, 3:06 PM

#

Oh, not even numpy?

#

I think i can't help with this one

#

Try asking someone else

lapis sequoia Feb 20, 2021, 3:07 PM

#

Nope not even numpy. Thats alright thankyou for taking an interest 🙂

misty flint Feb 20, 2021, 3:25 PM

#

from scratch... NervousSip

analog pike Feb 20, 2021, 3:27 PM

#

what's the easiest way to remove the time from a column in pandas? The csv i downloaded has hours going from 0 to 24 inclusive and it's kinda messing with to_datetime

#

And I only need the year anyways

acoustic forge Feb 20, 2021, 4:53 PM

#

So I'm working on a project with a couple of friends. We want to be able remove the background of a picture (usually, but not necessarily a portrait), somewhat like remove.bg. Would PyTorch be a good fit for this project?

nova widget Feb 20, 2021, 4:53 PM

#

@analog pike you slice with loc
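If only the year is needed, one option is to drop the time part as text before parsing, so the 24:00 values never reach `to_datetime`. This is a sketch under the assumption that the column holds `month/day/year hour:minute` strings; the toy frame is invented.

```python
import pandas as pd

# Assumed layout: "month/day/year hour:minute" strings, with hours up to 24.
df = pd.DataFrame({"datetime": ["10/10/1949 20:30", "12/31/1999 24:00"]})

# Keep only the part before the space, i.e. drop the time entirely.
date_only = df["datetime"].str.split(" ").str[0]

# Parse the date part and pull out the year as an integer column.
df["year"] = pd.to_datetime(date_only, format="%m/%d/%Y").dt.year
```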

#

@acoustic forge CV2 https://stackoverflow.com/questions/63001988/how-to-remove-background-of-images-in-python

Stack Overflow

how to remove background of images in python

I have a dataset that contains full width human images I want to remove all the backgrounds in those Images and just leave the full width person,
my questions:
is there any python code that does th...

acoustic forge Feb 20, 2021, 5:07 PM

#

@nova widget Yeah, that suggestion that he made doesn't work

misty flint Feb 20, 2021, 5:50 PM

#

that link might not work but i would go down the route of opencv @acoustic forge

#

they have some useful modules in their documentation

acoustic forge Feb 20, 2021, 5:51 PM

#

Okay, I'm gonna check it out @misty flint. Thanks 🙂

iron basalt Feb 20, 2021, 7:02 PM

#

thin remnant im having a dataset that contains names of natural reservoires. I've also got a ...

Consider using https://pypi.org/project/geopy/ instead.

PyPI

geopy

Python Geocoding Toolbox

iron basalt Feb 20, 2021, 7:03 PM

#

lapis sequoia would anyone review my recently worked on neural network?

Show code. What type of neural network?

lapis sequoia Feb 20, 2021, 7:09 PM

#

iron basalt Consider using https://pypi.org/project/geopy/ instead.

thanks

#

will do!

#

thanks

#

https://paste.pythondiscord.com/elihutusix.lua

#

It's rather long and might be difficult to understand. I only know a limited amount about the math behind neural networks, backpropagation and so forth. This is my attempt so far.

#

I have been working on another that uses a genetic/fitness approach.

iron basalt Feb 20, 2021, 7:17 PM

#

How familiar are you with linear algebra?

lapis sequoia Feb 20, 2021, 7:18 PM

#

somewhat familiar, only what I know from studying it in maths education

#

but I think most of the math behind machine learning is beyond me.

#

I can understand what sigmoid and other activation functions do

iron basalt Feb 20, 2021, 7:19 PM

#

Ok, my first note is that this code would be much smaller and simpler if you made use of matrices.

#

(Which is their entire purpose)

lapis sequoia Feb 20, 2021, 7:20 PM

#

I just thought I'd give it a try! Still not sure if it works as intended but we can hope.

#

Right that makes sense.

#

I used a one dimensional array for most of the weights etc

#

thankyou

iron basalt Feb 20, 2021, 7:20 PM

#

You can simply implement matrix multiplication and transpose yourself, it does not need to be fast.

#

As long as you get the idea
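A minimal pure-Python sketch of those two operations, with matrices stored as lists of rows. It is meant to show the idea, not to be fast; the example matrices are arbitrary.

```python
def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]

a = [[1, 2, 3],
     [4, 5, 6]]
b = [[7, 8],
     [9, 10],
     [11, 12]]
product = matmul(a, b)
```

With these two helpers, a layer's weighted sums become a single `matmul` call instead of a loop over neurons.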

lapis sequoia Feb 20, 2021, 7:21 PM

#

that sounds like a good idea. I do see what you mean. Then I wouldn't have to loop through each neuron individually?

iron basalt Feb 20, 2021, 7:22 PM

#

yeah, that's why matrices are cool, they make everything easier to think about and code, since you are thinking at a higher level.

#

By that I mean like as in programming higher level.

#

Like assembly vs python

lapis sequoia Feb 20, 2021, 7:23 PM

#

ohh right I see what you mean now. They sound super cool actually! I will try it out thanks

#

that will be useful

iron basalt Feb 20, 2021, 7:23 PM

#

It's also why they were invented in math, nobody wants to manually juggle all those numbers.

lapis sequoia Feb 20, 2021, 7:24 PM

#

It is quite annoying and one of the problems that took me the longest. I have reworked it a few times!

#

That may be a much better approach

iron basalt Feb 20, 2021, 7:27 PM

#

A sign that you may not be doing things the best way is when your objects are too small. For example, a neuron does not really need to be its own object unless you intend to create a neuron by itself outside of a neural network. Another sign is when something never exists by itself, only in a group / cluster. Rather than making it its own object, just have the data held by the object that manages the group / cluster.

#

Generally you will always be working with groups of neurons.

lapis sequoia Feb 20, 2021, 7:29 PM

#

I see what you mean. It would be much simpler to store all the weights inside a matrix in a single layer object than in a neuron. I will try playing around with different lists to see what I can do. You are right, I do not plan to use a single neuron on its own. Thankyou for that explanation!
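One way that could look, as a sketch rather than a rewrite of the pasted code: a hypothetical `Layer` class that owns one weight matrix (a row per output neuron) and has no separate neuron objects. The sigmoid activation and the fixed random seed are choices made for the illustration.

```python
import math
import random

class Layer:
    """A fully connected layer that owns the weights of all its neurons."""

    def __init__(self, n_inputs, n_outputs, rng=None):
        rng = rng or random.Random(0)
        # One row of weights per output neuron; no separate Neuron objects.
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                        for _ in range(n_outputs)]
        self.biases = [0.0] * n_outputs

    def forward(self, inputs):
        """Weighted sum plus bias for every neuron, squashed by a sigmoid."""
        outputs = []
        for row, bias in zip(self.weights, self.biases):
            z = sum(w * x for w, x in zip(row, inputs)) + bias
            outputs.append(1.0 / (1.0 + math.exp(-z)))
        return outputs

layer = Layer(n_inputs=3, n_outputs=2)
activations = layer.forward([0.5, -0.25, 1.0])
```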

iron basalt Feb 20, 2021, 7:30 PM

#

Btw that group vs single thing idea applies to pretty much all programming.

#

(Computers like groups of things)

#

(Contiguous)

lapis sequoia Feb 20, 2021, 7:31 PM

#

thankyou for that advice! That is the sort of thing that will really help me improve.

#

Groups do seem to be used a lot in programming, list logic is essential to a lot of software it seems. Or at least, it is used often for challenges etc

#

Thankyou so much for all your help it has been really helpful 🙂

#

I might redo the neural network using a different method with matrices now, thanks 🙂

iron basalt Feb 20, 2021, 7:40 PM

#

On line 101, you use this count = 0. To keep track of the current layer index correct?

thin remnant Feb 20, 2021, 7:43 PM

#

I'm looping over a geo API call and want to append some of the JSON results to my dataframe. Since I only need latitude and longitude from the JSON results, I select them with response['latitude']. The thing is, for some responses there is no 'latitude' value. How can I keep my code from crashing and just continue to the next record?

iron basalt Feb 20, 2021, 7:43 PM

#

@lapis sequoia

lapis sequoia Feb 20, 2021, 7:44 PM

#

Oh yes pretty sure I do

#

let me check

#

That is correct

#

It doesn't need to be 1 I don't think as the number of the weights for one neuron in one layer of the neural network should always equal the number of neurons in the next

iron basalt Feb 20, 2021, 7:46 PM

#

@thin remnant Use python dict's get function, you can set an optional return value for when there is no entry .get(key, ret_val_when_not_there), e.g. lat = response.get('latitude', None) ... if lat is None: ...
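A short sketch of that pattern inside a loop. The two response dicts are made up; real responses from the geocoding service may be shaped differently.

```python
# Two made-up responses: one complete, one missing the coordinates.
responses = [
    {"name": "Lake A", "latitude": 51.2, "longitude": 4.4},
    {"name": "Lake B"},
]

coords = []
for response in responses:
    lat = response.get("latitude")   # None when the key is absent
    lon = response.get("longitude")
    if lat is None or lon is None:
        continue  # skip this record instead of raising KeyError
    coords.append((response["name"], lat, lon))
```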

#

@lapis sequoia Use enumerate instead, also the if statement if(count != len(self.layers)): will always be True.

#

https://www.tutorialspoint.com/enumerate-in-python

Enumerate() in Python

When using the iterators, we need to keep track of the number of items in the iterator. This is achieved by an in-built method called enumerate(). The enumerate ...
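A small comparison of the manual counter and `enumerate`, using placeholder strings instead of real layer objects. It also shows why the `count != len(self.layers)` check can never fail inside the loop: the largest index reached is `len(layers) - 1`.

```python
layers = ["input", "hidden", "output"]  # placeholder layer objects

# Manual counter, as in the code under review.
count = 0
manual = []
for layer in layers:
    manual.append((count, layer))
    count += 1

# enumerate yields the index alongside each item.
indexed = [(i, layer) for i, layer in enumerate(layers)]

# Inside the loop the index never reaches len(layers), so this is always True.
always_true = all(i != len(layers) for i, _ in enumerate(layers))
```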

lapis sequoia Feb 20, 2021, 7:48 PM

#

ohh right that is a good idea thankyou I will try that!

#

Oh I see what you mean about the if statement...

#

thankyou 🙂

#

As I am using len() rather than it counting from 0

iron basalt Feb 20, 2021, 7:49 PM

#

on 117 if(i + 1 != len(self.layers)): you are using this to make it only loop up until the layer before the last right?

lapis sequoia Feb 20, 2021, 7:49 PM

#

Yes that is also correct

#

as the neurons in the last layer do not need weights

iron basalt Feb 20, 2021, 7:50 PM

#

just change the range then

lapis sequoia Feb 20, 2021, 7:50 PM

#

they are not connected to neurons in the next layer

iron basalt Feb 20, 2021, 7:50 PM

#

for i in range(0, len(self.layers)): to for i in range(0, len(self.layers) - 1):
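A sketch of the shortened range in use, with made-up layer sizes standing in for `self.layers`. The `zip` variant is an alternative that pairs each layer with the next one and avoids index arithmetic altogether.

```python
layer_sizes = [4, 3, 2]  # neurons per layer; values are made up

# Stop one short of the end so the output layer gets no outgoing weights.
shapes = []
for i in range(len(layer_sizes) - 1):
    shapes.append((layer_sizes[i], layer_sizes[i + 1]))

# Equivalent pairing of each layer with the next one, without index arithmetic.
shapes_zip = list(zip(layer_sizes, layer_sizes[1:]))
```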

lapis sequoia Feb 20, 2021, 7:50 PM

#

rightttt I see

#

that would also work very well

#

I do not think about these things sometimes that is a nice and simple solution!

thin remnant Feb 20, 2021, 7:50 PM

#

#

squigle

#

still the same

iron basalt Feb 20, 2021, 7:51 PM

#

You got a list index out of range, so data is an empty list, check to make sure len of data is greater than zero.

#

so data['data'] is the actual data

#

which is a list

#

and it was empty, but you tried to access the element at index 0.

static grail Feb 20, 2021, 7:53 PM

#

thin remnant

wow what IDE is that

thin remnant Feb 20, 2021, 7:54 PM

#

jupyter notebook xd

iron basalt Feb 20, 2021, 7:54 PM

#

@lapis sequoia Python has a bunch of loop control that allows you to avoid having to put if statements inside loops to control where they loop.

thin remnant Feb 20, 2021, 7:54 PM

#

#

@iron basalt rip

#

this gives more error

lapis sequoia Feb 20, 2021, 7:55 PM

#

@thin remnant you are trying to access the index using a string datatype

iron basalt Feb 20, 2021, 7:55 PM

#

data['data'] is a list, and data['data'][0] seems to also be a list (i'm guessing length 2 for lat and long).

#

I recommend trying to print out the types of things

#

e.g. print(type(data['data'][0]))

lapis sequoia Feb 20, 2021, 7:56 PM

#

These features seem really useful, I will have a think next time about how I can use loop control instead!

#

Also I found printing literally every variable helps

#

I have a "debug" mode boolean that I can enable and disable to print everything

#

Sometimes it can be helpful

thin remnant Feb 20, 2021, 7:57 PM

#

ive tried some things but couldnt figure it out

iron basalt Feb 20, 2021, 7:57 PM

#

You should also be able to print the json itself probably or save it to a file. Then analyze it.

thin remnant Feb 20, 2021, 7:57 PM

#

this is what i did to check stuff

#

#

do you have any idea how to check if lat and lon are there, and otherwise just not care instead of crashing xd

iron basalt Feb 20, 2021, 7:58 PM

#

can you display just data for me?

#

the whole thing

#

or is it really long?

thin remnant Feb 20, 2021, 7:59 PM

#

gimme a sec, ill make a picture

lapis sequoia Feb 20, 2021, 8:00 PM

#

You can check whether a given key exists in a dictionary using:

#

```
if key in dictionary:
    print("will execute if this key is present")
```

thin remnant Feb 20, 2021, 8:01 PM

#

iron basalt Feb 20, 2021, 8:02 PM

#

My hunch is that since you are in a Jupyter notebook, it may be an out-of-order cell execution thing (or something else). Create a new file on your PC and run the script there to make sure nothing strange is going on with the notebook.

#

It could also be that not all responses are the same. Some could be giving different structures.

#

I would wrap the loop body in a try/except and, on error, print the current and previous data to compare a valid and an invalid response.
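Roughly, that debugging loop could look like the sketch below. The payloads are invented and only mimic the `data['data'][0]` access discussed above; the real responses may fail in other ways.

```python
# Made-up payloads; the second one mimics an empty result list.
payloads = [
    {"data": [{"latitude": 50.8, "longitude": 4.3}]},
    {"data": []},
    {"data": [{"latitude": 52.4, "longitude": 4.9}]},
]

results = []
failures = []
previous = None
for index, data in enumerate(payloads):
    try:
        first = data["data"][0]
        results.append((first["latitude"], first["longitude"]))
    except (KeyError, IndexError, TypeError) as exc:
        # Record enough context to compare the bad payload with the last good one.
        failures.append((index, type(exc).__name__, data, previous))
        continue
    previous = data
```

Collecting failures instead of printing inside the loop keeps the output short when the dataset has thousands of rows.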

thin remnant Feb 20, 2021, 8:07 PM

#

My linux is rebooting, the window froze

iron basalt Feb 20, 2021, 8:09 PM

#

You are using a vm?

thin remnant Feb 20, 2021, 8:10 PM

#

no

#

i run linux as main

#

anyway, this is what it looks like when i run the script

iron basalt Feb 20, 2021, 8:10 PM

#

ok can you print data?

#

or display it somehow

thin remnant Feb 20, 2021, 8:11 PM

#

thats gonna give me a huge output since its a loop

#

but i got an idea

#

i can just print the index number

#

and then next run just print the data of that index where it stopped

iron basalt Feb 20, 2021, 8:12 PM

#

yeah

thin remnant Feb 20, 2021, 8:17 PM

#

#

im jsoning it real quick

#

sec

#

it's weird

iron basalt Feb 20, 2021, 8:20 PM

#

So, yeah, there is no guarantee for anything, just gotta do a ton of checks on everything. A bunch of if key in x, if len(y) > 0, and maybe even if isinstance(z, (typeA, typeB, ...)).
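Those checks can be gathered into one helper so the loop stays readable. This is a sketch: `extract_lat_lon` is a hypothetical function, and the response layout (a `data` list whose first entry is a dict with `latitude` and `longitude`) is an assumption based on the snippets in this conversation.

```python
def extract_lat_lon(payload):
    """Return (lat, lon) from a geocoding response, or None if anything is off."""
    if not isinstance(payload, dict) or "data" not in payload:
        return None
    entries = payload["data"]
    if not isinstance(entries, list) or len(entries) == 0:
        return None
    first = entries[0]
    if not isinstance(first, dict):
        return None  # e.g. an empty list where a result object was expected
    lat, lon = first.get("latitude"), first.get("longitude")
    if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
        return None
    return lat, lon
```

The caller then only has to test for `None` and move on to the next record.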

thin remnant Feb 20, 2021, 8:20 PM

#

sometimes it stops faster

iron basalt Feb 20, 2021, 8:20 PM

#

Yeah it's random it seems.

thin remnant Feb 20, 2021, 8:20 PM

#

sometimes it stops at index 5 and sometimes index 9

#

it shouldnt be random haha

iron basalt Feb 20, 2021, 8:20 PM

#

Server is not always giving the same thing.

#

Why not just use geopy though?

thin remnant Feb 20, 2021, 8:21 PM

#

i dont know how xd

#

is it easy ?

iron basalt Feb 20, 2021, 8:22 PM

#

yes

#

Much easier than doing all this.

thin remnant Feb 20, 2021, 8:22 PM

#

i think i have the right checks to make it work now

iron basalt Feb 20, 2021, 8:22 PM

#

>>> from geopy.geocoders import Nominatim
>>> geolocator = Nominatim(user_agent="specify_your_app_name_here")
>>> location = geolocator.geocode("175 5th Avenue NYC")
>>> print(location.address)
Flatiron Building, 175, 5th Avenue, Flatiron, New York, NYC, New York, ...
>>> print((location.latitude, location.longitude))
(40.7410861, -73.9896297241625)
>>> print(location.raw)
{'place_id': '9167009604', 'type': 'attraction', ...}

thin remnant Feb 20, 2021, 8:22 PM

#

ill take a look at geopy later this evening

#

mmm

#

wow

#

that looks pretty easy yea haha

#

ill give it a shot i guess

iron basalt Feb 20, 2021, 8:23 PM

#

They did what you are doing right now, but wrapped it up for you with a bow tie.

thin remnant Feb 20, 2021, 8:23 PM

#

haha your sample code does look easy yea

#

but what is the max amount of calls ?

#

cause im doing it for each record in a dataset

iron basalt Feb 20, 2021, 8:24 PM

#

Depends on which site you choose

#

They chose Nominatim in this example.

#

https://nominatim.org/

Nominatim

Open source geocoding with OpenStreetMap data
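
#

For a few thousand records the thing to respect is the provider's rate limit. Nominatim's public server asks for at most one request per second, and geopy ships a `RateLimiter` helper in `geopy.extra.rate_limiter` for that. The sketch below shows the same idea using only the standard library so it runs without network access; `throttle` and `fake_geocode` are hypothetical names, not geopy functions:

```python
import time


def throttle(func, min_delay_seconds):
    """Wrap func so consecutive calls are at least min_delay_seconds apart."""
    last_call = None

    def wrapper(*args, **kwargs):
        nonlocal last_call
        if last_call is not None:
            wait = min_delay_seconds - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()
        return func(*args, **kwargs)

    return wrapper


def fake_geocode(address):
    # Stand-in for a real geocoder call; returns a dummy value
    return (len(address), 0.0)


# Use 1.0 for Nominatim's public server; a small delay keeps the demo fast
geocode = throttle(fake_geocode, min_delay_seconds=0.05)
addresses = ["175 5th Avenue NYC", "Some other address"]

start = time.monotonic()
results = [geocode(a) for a in addresses]
elapsed = time.monotonic() - start
print(results, round(elapsed, 2))
```

With geopy itself the equivalent would be `geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)`, so 5000 records would take a bit under an hour and a half on the public server.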

thin remnant Feb 20, 2021, 8:24 PM

#

my dataset has 5000 +- records

#

so i used positionstack

#

but ill take a look into those things as well

#

thanks a lot for the help though!

iron basalt Feb 20, 2021, 8:25 PM

#

It can use google's geocoding

#

I assume there is a paid tier for that or something

thin remnant Feb 20, 2021, 8:26 PM

#

yea its with a lot of these geocoding sites

#

almost all of them

iron basalt Feb 20, 2021, 8:26 PM

#

free tier probably too and probably a lot of requests, because google is big

thin remnant Feb 20, 2021, 8:26 PM

#

positionstack didn't have very good docs imo but their calls are at least free

#

if you do some filtering yourself..

iron basalt Feb 20, 2021, 8:27 PM

#

here is the list of all the ones it has

#

https://github.com/geopy/geopy/tree/master/geopy/geocoders

GitHub

geopy/geopy

Geocoding library for Python. Contribute to geopy/geopy development by creating an account on GitHub.

thin remnant Feb 20, 2021, 8:27 PM

#

Thanks

#

you need to receive a reward haah

#

Can I send you a trophy or sth ? 😄

iron basalt Feb 20, 2021, 8:31 PM

#

np, I gain more practice from this stuff.

thin remnant Feb 20, 2021, 8:32 PM

#

me too 😄

#

i wouldnt have ever touched this stuff if it werent for my gf

#

she doesn't know anything about datascience and has to do data analysis/linear regression in SPSS

#

and so she doesnt know data cleaning or python at all

#

and the school didnt give her the data

#

so that sucked haha

#

But I knew it was possible in python so i wanted to try it

arctic wedgeBOT Feb 20, 2021, 8:35 PM

#

Hey @cerulean spindle!

It looks like you tried to attach a Python file - please use a code-pasting service such as https://paste.pythondiscord.com

cerulean spindle Feb 20, 2021, 8:36 PM

#

I'm new to TensorFlow and I am having trouble lowering the loss of my model. Please ping me if you know how to improve the model. https://paste.pythondiscord.com/gosebesocu.py

iron basalt Feb 20, 2021, 8:37 PM

#

@thin remnant I recommend just trying to look on https://pypi.org/ to see if there is already a package for what you are trying to do, then go to their github page and see if the README has a simple example. If it seems overly complex for what you are trying to do then try doing it yourself.

PyPI

The Python Package Index (PyPI) is a repository of software for the Python programming language.

analog pike Feb 20, 2021, 9:16 PM

#

so im still having troubles with my csv, now its throwing a SettingWithCopyWarning

#

x = 0
for item in index['datetime']:
    index.iloc[x,0] = item[:-6]
    x+=1

#

im just trying to shave the end of a string in each row of a column in pandas

#

since datetime isnt cooperating

astral path Feb 20, 2021, 9:28 PM

#

If I have a column in my dataset which contains short string descriptions using keywords, how could I include that in a heatmap/correlogram to show relationships between the keyword and other variables? e.g. I could use this to find that, for example, descriptions that contain the word "red" and "dress" have a smaller value in a column called stock than a description that includes "green" and "bag"

#

example of data

iron basalt Feb 20, 2021, 9:30 PM

#

@analog pike Try modifying your column like so:

tawny geode Feb 20, 2021, 9:30 PM

#

I just started data science in uni so I'm willing to get help

iron basalt Feb 20, 2021, 9:30 PM

#

import pandas as pd

df = pd.DataFrame(
    [[1, 2], [4, 5], [7, 8]],
    index=['cobra', 'viper', 'sidewinder'],
    columns=['max_speed', 'shield']
)

print(df)

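# Take an explicit copy so the assignments below don't touch a slice of df
column = df.iloc[:, 0].copy()

print("-------------------")
print(column)

# Use .iloc for positional assignment; the index here holds labels, not integers
for i, val in enumerate(column):
    column.iloc[i] = val + 1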

print("-------------------")
print(column)

analog pike Feb 20, 2021, 9:33 PM

#

ah elite dangerous

#

@iron basalt the problem is that these are the values im trying to modify: https://gyazo.com/931714db8974dfa5c1524a31f69d0255

#

im trying to just strip off the time portion and whatever im trying just doesnt seem to want to work

#

since pd.to_datetime only uses values 0-23 for time and for some reason the csv goes 0-24

#

and I don't need the times anyways just the year

iron basalt Feb 20, 2021, 9:38 PM

#

just the middle part? the year?

analog pike Feb 20, 2021, 9:38 PM

#

ye

#

since im just trying to get frequency per year

iron basalt Feb 20, 2021, 9:38 PM

#

Are all entries always formatted like this?

analog pike Feb 20, 2021, 9:38 PM

#

yeah I downloaded the cleaned one

grave frost Feb 20, 2021, 9:39 PM

#

cerulean spindle I'm new to TensorFlow and I am having trouble lowering the loss of my model. Ple...

if you google it, there are about a million fixes for such a problem

analog pike Feb 20, 2021, 9:39 PM

#

so I wouldn't have to deal with all the data cleaning

iron basalt Feb 20, 2021, 9:39 PM

#

val.split()[1] is the year then

#

(Assuming each entry is a string)
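
#

A quick sketch of pulling the year out with pandas string methods instead of a Python loop. It assumes entries look like `MM/DD/YYYY HH:MM` (the screenshot is not reproduced here, so that format and the sample values are assumptions). Under that assumption the date is the first whitespace-separated piece and the year is its last `/`-separated field:

```python
import pandas as pd

# Hypothetical sample values; the real column contents are assumed
datetimes = pd.Series(["10/10/1949 20:30", "10/10/1955 24:00", "11/02/1949 09:15"])

# Date part: everything before the first whitespace
dates = datetimes.str.split().str[0]

# Year: last "/"-separated field, converted to an integer
years = dates.str.split("/").str[-1].astype(int)

# Sightings per year, ordered by year
counts = years.value_counts().sort_index()
print(counts)
```

Because the time part is dropped before any parsing, a `24:00` entry never reaches `pd.to_datetime`.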

analog pike Feb 20, 2021, 9:41 PM

#

they are

#

yet settingwithcopywarning is messing with me again

#

A value is trying to be set on a copy of a slice from a DataFrame

iron basalt Feb 20, 2021, 9:41 PM

#

are you modifying the column like I did above?

analog pike Feb 20, 2021, 9:41 PM

#

this is my whole code atm

```py
import matplotlib.pyplot as plt
import pandas as pd
import DateTime as dt
ufos = pd.read_csv("scrubbed.csv",low_memory=False)
countries = ufos['country'].unique()
print(countries)
fig,ax = plt.subplots()

index = ufos[ufos['country'] == 'us']
print(index['datetime'])

column = index.iloc[:, 0]

print("-------------------")
print(column)

for i, val in enumerate(column):
    column[i] = val.split()[1]

index['datetime'] = pd.to_datetime(index['datetime'])

index['year'] = index['datetime'].dt.year
```

#

damn the highlighting didnt work

iron basalt Feb 20, 2021, 9:42 PM

#

just edit your message

analog pike Feb 20, 2021, 9:43 PM

#

there we go

iron basalt Feb 20, 2021, 9:43 PM

#

what does print(index['datetime']) look like?

analog pike Feb 20, 2021, 9:43 PM

#

thats the image i posted before

#

it gives just the list of dates and times

#

for each entry

iron basalt Feb 20, 2021, 9:44 PM

#

How many columns are there? just one?

analog pike Feb 20, 2021, 9:44 PM

#

no, though now that I think about it i really only need the one column

#

since im not doing by state or anything and this is just the US

iron basalt Feb 20, 2021, 9:45 PM

#

so when you print column what do you get?

analog pike Feb 20, 2021, 9:46 PM

#

same thing as printing index['datetime']

iron basalt Feb 20, 2021, 9:46 PM

#

so there is only 1 column in index['datetime']?

analog pike Feb 20, 2021, 9:46 PM

#

oh shoot wait a minute i think I know why

#

https://gyazo.com/ffb5b4a1e1c8215dcf11c2d9621a0eed

#

after I made the copy with only US pandas doesn't fix the rows

#

:/

#

it leaves the gaps where the other countries were

#

Damn I forgot how i fix that

misty flint Feb 20, 2021, 9:48 PM

#

sounds like that would be a kata if katas did arrays

#

DoggoKek

#

i did a similar thing but it was just elements and a list

analog pike Feb 20, 2021, 9:49 PM

#

pain

misty flint Feb 20, 2021, 9:49 PM

#

theres probably a function
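
#

Most likely `reset_index`. A small sketch with made-up rows standing in for the CSV (the column names match the code above, the values are assumptions): `.copy()` gives an independent frame, which is the usual way to stop the SettingWithCopyWarning, and `reset_index(drop=True)` renumbers the rows 0..n-1 so the gaps left by the other countries disappear.

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents
ufos = pd.DataFrame({
    "datetime": ["10/10/1949 20:30", "10/10/1955 17:00",
                 "10/10/1956 21:00", "10/10/1960 24:00"],
    "country": ["us", "gb", "us", "us"],
})

# Filter, copy so later assignments hit a real frame, then renumber the rows
us = ufos[ufos["country"] == "us"].copy().reset_index(drop=True)

# Vectorized version of shaving the last 6 characters off each string
us["date"] = us["datetime"].str[:-6]
print(us)
```

Assigning a whole column at once like this also avoids writing row by row through `iloc`.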

analog pike Feb 20, 2021, 9:49 PM

#

Probably

#

just have to find it

misty flint Feb 20, 2021, 9:49 PM

#

ye

analog pike Feb 20, 2021, 9:49 PM

#

I want to say sort would do it

#

but i don't think so

#

I just want to visualize the number of ufo sightings in the us per year bro