hearty tusk Jul 31, 2021, 3:16 PM

#

are your scores maybe below 0.7, since you are setting the y limits to 0.7-1.01?

icy pine Jul 31, 2021, 4:22 PM

#

Hello, fellow coders.

I'm putting together a team of python users to make a downloadable AI assistant (kind of like Siri, Cortana or Alexa) that you can download on your computer. All in python.

I think this isn't a one-man project so I need some team members. Please contact me if you have experience regarding this area (I'm new to this but I'm a fast learner) or if you have any questions. I'm very new to this but It's a project I definitely want to undertake because it seems overall like a fun project, especially since I'm only a teen.

What I'm hoping for the final result to be (I will update it, fix it, and add more features as we go too): I'm trying to make it able to tell the weather and time, do math calculations, play mini-games, search the web, play YouTube music, and read recent news, all using voice commands and speaking in a voice that should sound somewhat natural. I'm also trying to add some sort of machine learning so the AI can learn more about you and slightly change its questions and statements to fit your personality.

If you think this is impossible or I'm having high hopes and I am a complete idiot, please feel free to tell me, since I'm open to judgement and improvement.

You can DM me at DarkMist#0074.

Note: I'm not offering payment of any kind or anything. I am just hoping that this will be a fun experience to everyone and a wonderful project. I will make like a poster of everyone in the team with their names and contribution and everything to kind of honor them and thank them for their help. This is a TEAM, by the way, not a company or a giant corporation, so I will probably accept a max of 15 members or so.

Thank you for reading. It should have taken a ton of time unless you are Mr Howard Berg. Let me know if you have questions!

DarkMist

arctic wedgeBOT Jul 31, 2021, 4:23 PM

#

Rules

6. Do not post unapproved advertising.

grave frost Jul 31, 2021, 5:00 PM

#

icy pine Hello, fellow coders. I'm putting together a team of python users to make a do...

you cant get it to learn though - anything else is doable

#

and there are multiple projects that have made similar things, check them out too

quasi sparrow Jul 31, 2021, 5:20 PM

#

Does anybody know of a workaround to train a model for XGBoost regression on multi-output?

#

The library XGBoost currently does not support multi-output regression.

grave frost Jul 31, 2021, 5:23 PM

#

quasi sparrow Does anybody know of a workaround to train a model for XGBoost regression on mul...

that's a problem - tried other things like LightGBM?

#

nvm they dont have it either

#

is it for a kaggle comp?

grizzled barn Jul 31, 2021, 6:47 PM

#

Anyone here involved with AI projects and know a good place to start? Not necessarily learning about what it is, but how to actually make projects involving it.

serene scaffold Jul 31, 2021, 6:50 PM

#

grizzled barn Anyone here involved with AI projects and know a good place to start? Not necess...

you could build a classifier for a dataset on Kaggle.

grizzled barn Jul 31, 2021, 6:51 PM

#

serene scaffold you could build a classifier for a dataset on Kaggle.

Yea I think Ill do that, it looks neat. Thank you.

lapis sequoia Jul 31, 2021, 8:45 PM

#

I have a function like the following

def myfunc(c, h, alpha, beta, delta):
    # perform some calculations
    return s, t, x, y, z

where input parameters are

c = 0.53
h = 0.07
alpha = 0.6
beta = 1
delta = 0.8

The alpha, beta, delta inputs are initial values in the range from 0 to 1. I would like to adjust these input values such that the outputs s, t, and the sum of x, y, z are close to some values such as

s = 0.34
t = 0.20
sum(x, y, z) = 0.45

Is there an optimization function in SciPy or other Python package that would do something like this?

chilly geyser Jul 31, 2021, 8:47 PM

#

If you can shift everything to a single objective then yes, I think you can use one of the ready-made ones

#

If you are doing multi-objective optimization, I'm not too sure if any are ready-made

tidal bough Jul 31, 2021, 9:05 PM

#

try just minimizing something like:

c = 0.53
h = 0.07
def cost(alpha, beta, delta):
    s, t, x, y, z = myfunc(c, h, alpha, beta, delta)
    return (s-0.34)**2 + (t-0.20)**2 + (0.45 - (x+y+z))**2

with scipy.optimize for a starter

#

I think it even has multiobjective ones

lapis sequoia Jul 31, 2021, 9:31 PM

#

@tidal bough So something like this:

from scipy.optimize import minimize
c = 0.53
h = 0.07
def cost(params):
    # minimize passes the parameters as a single array, so unpack it here
    alpha, beta, delta = params
    s, t, x, y, z = myfunc(c, h, alpha, beta, delta)
    return (s-0.34)**2 + (t-0.20)**2 + (0.45 - (x+y+z))**2
x0 = [0.6, 1, 0.8]
res = minimize(cost, x0, method='Nelder-Mead', tol=1e-6)

tidal bough Jul 31, 2021, 10:10 PM

#

Yeah, basically
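
A minimal sketch of the approach above, assuming a stand-in `myfunc` (hypothetical, since the real calculation isn't shown in the thread). One detail worth noting: `scipy.optimize.minimize` calls the objective with a single array of parameters, so `cost` unpacks that array rather than taking three separate arguments.

```python
from scipy.optimize import minimize

c = 0.53
h = 0.07

def myfunc(c, h, alpha, beta, delta):
    # Hypothetical stand-in for the real calculation, for illustration only.
    s = c * alpha
    t = h + 0.2 * beta
    x, y, z = 0.1 * delta, 0.2 * delta, 0.3 * delta
    return s, t, x, y, z

def cost(params):
    # minimize() passes all free parameters as one array.
    alpha, beta, delta = params
    s, t, x, y, z = myfunc(c, h, alpha, beta, delta)
    # Sum of squared distances from the target values.
    return (s - 0.34) ** 2 + (t - 0.20) ** 2 + (0.45 - (x + y + z)) ** 2

x0 = [0.6, 1, 0.8]  # initial alpha, beta, delta
res = minimize(cost, x0, method="Nelder-Mead", tol=1e-6)
alpha_opt, beta_opt, delta_opt = res.x
print(res.success, res.x, res.fun)
```

Nelder-Mead as used here is unconstrained. If the inputs have to stay between 0 and 1, one option is to pass `bounds=[(0, 1)] * 3` together with a method that supports bounds, such as `L-BFGS-B`.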

radiant kayak Aug 1, 2021, 1:34 AM

#

Hi

serene scaffold Aug 1, 2021, 2:00 AM

#

radiant kayak Hi

Hi

rancid widget Aug 1, 2021, 3:18 AM

#

hearty tusk are your scores maybe below 0.7, since you are setting the y limits to 0.7-1.01?

Oh yes, you are right. Thank you so much! @hearty tusk

real torrent Aug 1, 2021, 3:38 AM

#

Is it possible to change the location of the origin in a 2D Matplotlib plot?

velvet thorn Aug 1, 2021, 3:53 AM

#

real torrent Is it possible to change the location of the origin in a 2D Matplotlib plot?

what do you mean by "origin"

#

if you mean in the mathematical sense

#

look into ax.axis/plt.axis
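
A rough sketch covering two readings of the question, assuming matplotlib's object-oriented API and made-up data: `ax.axis([...])` changes which region is visible (and so where (0, 0) lands in the figure), while repositioning the spines draws the axis lines through the data origin.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([-1, 0, 1, 2], [1, 0, 1, 4])  # made-up data

# Set the visible region as [xmin, xmax, ymin, ymax].
ax.axis([-2, 8, -1, 5])

# Draw the left and bottom axis lines through x=0 and y=0.
ax.spines["left"].set_position("zero")
ax.spines["bottom"].set_position("zero")
ax.spines["right"].set_visible(False)
ax.spines["top"].set_visible(False)
```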

stoic hill Aug 1, 2021, 5:50 AM

#

hello guys, can anyone help me? i am trying to run code in google colab on a dataset containing 29 lakh (2.9 million) records and fit that data to a random forest classifier. when i run the code, my session crashes with a memory error. i am running it on 16GB ram and a GPU, but it still crashes. is there any way to run it?

ornate jasper Aug 1, 2021, 5:58 AM

#

Hi

austere swift Aug 1, 2021, 6:10 AM

#

stoic hill hello guys can anyone help me, i am trying to run a code in google colab with a...

can you show the code?

arctic wedgeBOT Aug 1, 2021, 6:13 AM

#

Hey @stoic hill!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.

Feel free to ask in #community-meta if you think this is a mistake.

austere swift Aug 1, 2021, 6:13 AM

#

send the colab link

stoic hill Aug 1, 2021, 6:13 AM

#

yeh ok

#

https://colab.research.google.com/drive/1ViOFt-3DZm6toxl0j0RyRPsrrLw3K2sy?usp=sharing

Google Colaboratory

#

when i try to fit it crashes

austere swift Aug 1, 2021, 6:15 AM

#

random forest doesnt use gpu btw

#

so try using a non gpu session with maybe more ram

stoic hill Aug 1, 2021, 6:15 AM

#

ik even on cpu with 16gb of ram it crashes

#

i think the only problem is the ram

#

and i dont have that;(

austere swift Aug 1, 2021, 6:16 AM

#

i have a server with more than enough ram to run that, and it's not currently loaded with much, so if you send me the data i can run it

#

if you're ok with that

stoic hill Aug 1, 2021, 6:16 AM

#

sure i just want the model

#

file

late shell Aug 1, 2021, 7:54 AM

#

Hello, I'm just getting started with CNN, and I was wondering what would happen if I throw in a (greyscale) image (flattened into an array) to normal feed forward NN. Wouldn't the network learn the weights to classify the image or what would happen?

cinder barn Aug 1, 2021, 8:51 AM

#

what IDE do y'all use for tensorflow?

#

anyone use vscode here?

chilly geyser Aug 1, 2021, 8:58 AM

#

A lot of people use vsc

serene scaffold Aug 1, 2021, 9:22 AM

#

cinder barn what IDE do y'all use for tensorflow?

it's ultimately up to personal choice, but I like PyCharm.

cinder barn Aug 1, 2021, 9:25 AM

#

Anybody know why I get this error when I run the code

serene scaffold Aug 1, 2021, 9:29 AM

#

cinder barn Anybody know why I get this error when I run the code

Try Googling the "Could not load dynamic library ..." part. That appears to be the salient point.

#

One of the skills you'll develop is identifying the salient part of error messages. Often you can find an exact solution by googling the salient part.

dire echo Aug 1, 2021, 10:23 AM

#

cinder barn Anybody know why I get this error when I run the code

Mine is way frustrating

#

pip install tensorflow

Required satisfaction

Import tensorflow

ERROR: 
Module "Tensorflow" not found

#

And thats why i use mobile ide ;D

cinder barn Aug 1, 2021, 10:26 AM

#

Yea I wish installing tensorflow was miles easier

#

It's a roadblock that prevents many beginners from starting to learn

dire echo Aug 1, 2021, 10:27 AM

#

My favorite AI module is

#

Random

#

;|

#

Lol

dire echo Aug 1, 2021, 10:27 AM

#

cinder barn It's a roadblock that prevents many beginners from starting to learn

Yea

#

I mean is google sooo

hoary wigeon Aug 1, 2021, 10:57 AM

#

i need help

#

This is first time I'm training LogisticRegression Model over 0.95*1.6 Million rows and 0.5 Million columns of data with penalty='elasticnet', l1_ratio=0.5, solver=saga

#

How much time it can take ?

#

it already took 4 hr and is still in progress, i want to track progress to see if it is really working or just stuck...

lament halo Aug 1, 2021, 11:16 AM

#

What's up! Does anyone know quant trading?

hoary wigeon Aug 1, 2021, 11:24 AM

#

hoary wigeon it already took 4 hr and still in progress, i want to track progress if its is r...

System usage stats

long shard Aug 1, 2021, 12:53 PM

#

Correlation matrix captures linear relation between 2 features in a dataset. how to capture non linear relations between features? And how to address/eliminate them?

dull turtle Aug 1, 2021, 1:13 PM

#

hello

#

how can i separate a pandas dataframe

#

                  4
0  02-03-2020 09:19
1  02-03-2020 09:20
2  02-03-2020 09:20
3  02-03-2020 09:21
4  02-03-2020 09:21
5  02-03-2020 09:22
6  02-03-2020 09:22
7  02-03-2020 09:23
8  02-03-2020 09:23

#

how can i separate date and time in the above data?
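
One possible way to do this, sketched on a small sample that mirrors the frame above (the column is assumed to be labelled with the integer `4` and to hold plain strings). Splitting on the space avoids having to guess whether `02-03-2020` is day-first or month-first.

```python
import pandas as pd

# Small sample mirroring the frame above; the column is labelled 4.
df = pd.DataFrame({4: ["02-03-2020 09:19", "02-03-2020 09:20", "02-03-2020 09:21"]})

# Split each string on the space into two new columns.
df[["date", "time"]] = df[4].str.split(" ", expand=True)
print(df)
```

If real datetime values are needed, `pd.to_datetime(df[4], format="%d-%m-%Y %H:%M")` would parse the column, assuming the dates are day-first; `.dt.date` and `.dt.time` then give the two parts.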

grave breach Aug 1, 2021, 1:32 PM

#

cinder barn Yea I wish installing tensorflow was miles easier

tensorflow works correctly

#

you just have to get cuda up and running

granite star Aug 1, 2021, 2:20 PM

#

hello

#

I got a dataset from Kaggle about water potability and the data set gives potability as 0 or 1 (like True or False)

#

it got: ph Hardness Solids Chloramines Sulfate Conductivity Organic_carbon Trihalomethanes Turbidity attributes and I wonder how can i apply multiple linear regression to it

#

actually it doesnt have to be linear regression

#

but i think i have trouble with 0 or 1 value of potability. When i try to apply multiple linear regression to it, it gives me absurd results

granite star Aug 1, 2021, 2:26 PM

#

granite star it got: ph Hardness Solids Chloramines Sulfate Conductivity Or...

i want to find coefficients of these attributes

#

https://www.kaggle.com/adityakadiwal/water-potability here is the dataset

Water Quality

Drinking water potability

#

I am very new to data science and thank you in advance for your help

#

oh I found the solution. I think I must use binary classification not regression 😅
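
A minimal sketch of that idea, assuming scikit-learn and synthetic stand-in data (three made-up features rather than the real water-quality columns). Logistic regression is one binary classifier that still yields a coefficient per attribute, which is what was being asked for.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 200 samples, 3 features, a 0/1 target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Scale features so the coefficients are comparable across attributes.
X_scaled = StandardScaler().fit_transform(X)

model = LogisticRegression()
model.fit(X_scaled, y)

print(model.coef_)       # one coefficient per feature, on the log-odds scale
print(model.intercept_)
proba = model.predict_proba(X_scaled)[:, 1]  # estimated probability of class 1
```

Unlike linear regression, the predicted values stay between 0 and 1, so a 0/1 target no longer produces out-of-range results.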

ocean swallow Aug 1, 2021, 2:57 PM

#

are there any memory leaks in numpy? My pipeline data is made up of 40 images in total and about 1000 objects that hold array views of those 40 images, about 100 MB in total as JPEG. My memory consumption is dramatically higher than it should be (10-14 gigabytes with the pipeline data; with only the libraries loaded it is about 3 GB).

#

aren't numpy images memory efficient since it uses view

young valve Aug 1, 2021, 3:59 PM

#

Hi, is it okay to conduct an elbow test using a data frame with no normalized variables (i.e all dummies), or is it better to include the data frame with normalized variables?

#

Would there be any significant differences? Thank you

lapis sequoia Aug 1, 2021, 5:39 PM

#

Hi

#

I have made a Jarvis like
To make my things easier like opening an app or listening to songs

#

Voice recognition

#

Can someone help me to put apps

#

def JARVIS(self):
    wish()
    while True:
        self.query = self.STT()
        if 'good bye' in self.query:
            sys.exit()
        elif 'open google' in self.query:
            webbrowser.open('www.google.co.in')
            speak("opening google")
        elif 'open youtube' in self.query:
            webbrowser.open("www.youtube.com")
        elif 'play music' in self.query:
            speak("playing music from pc")
            self.music_dir = "./music"
            self.musics = os.listdir(self.music_dir)
            os.startfile(os.path.join(self.music_dir, self.musics[0]))

#

How do I put spotify or any other app

quasi sparrow Aug 1, 2021, 5:45 PM

#

Does anybody know of a good example on sklearn.model_selection.TimeSeriesSplit?

#

I'm trying to use this method along with a model

grave breach Aug 1, 2021, 5:55 PM

#

lapis sequoia How do I put spotify or any other app

Check #❓｜how-to-get-help, this channel is about data science and ai

remote fossil Aug 1, 2021, 6:01 PM

#

when comparing models should you keep hyperparameters the same or optimise for each

ancient fog Aug 1, 2021, 7:04 PM

#

help

#

i need help

#

HOW DO I DO THIS 8PUZZLE THING

quasi sparrow Aug 1, 2021, 7:09 PM

#

Sorry I was not clear. I'm working on a boosted trees model and using XGBoost implementation for Python.
XGBoost can only predict one target, so I'm using scikit_learn multioutput regression as a wrapper to train a model with 3 target outputs.

import xgboost as xgb
from sklearn.multioutput import MultiOutputRegressor
multioutputregressor = MultiOutputRegressor(xgb.XGBRegressor(max_depth=3, n_estimators=100, n_jobs=2,
                           objective='reg:squarederror', booster='gbtree',
                           random_state=42, learning_rate=0.05)).fit(x_train, y_train)

Time series must be validated using walk forward validation . I want to use the scikit implementation on my problem but I can't find a good example online on how to implement this validation on my model.
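
A sketch of how walk-forward validation could be wired up with `sklearn.model_selection.TimeSeriesSplit`, under a few assumptions: the data is synthetic, the rows are already in time order, and `GradientBoostingRegressor` is used as a stand-in so the example runs without XGBoost installed. Swapping `xgb.XGBRegressor(...)` into the wrapper should work the same way.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
from sklearn.multioutput import MultiOutputRegressor

# Synthetic, time-ordered stand-in data: 120 rows, 4 features, 3 targets.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 4))
Y = X[:, :3] * 2.0 + rng.normal(scale=0.1, size=(120, 3))

tscv = TimeSeriesSplit(n_splits=5)
fold_scores = []
for train_idx, test_idx in tscv.split(X):
    # Each fold trains only on rows that come before its test rows.
    model = MultiOutputRegressor(
        GradientBoostingRegressor(max_depth=3, n_estimators=50,
                                  learning_rate=0.05, random_state=42)
    )
    model.fit(X[train_idx], Y[train_idx])
    pred = model.predict(X[test_idx])
    fold_scores.append(mean_squared_error(Y[test_idx], pred))

print(fold_scores)
```

The training window grows with every fold while the test window always lies after it, which is the walk-forward property; the per-fold errors can then be averaged or inspected individually.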

cinder barn Aug 1, 2021, 7:35 PM

#

grave breach tensorflow works correctly

Ohh, thanks for letting me know!

#

I stg I’ve been trying to setup tensorflow for over a year

#

Tomorrow is the day I will finally finish

grave breach Aug 1, 2021, 7:37 PM

#

the problem is with cuda

#

tensorflow is set up

slate hollow Aug 1, 2021, 11:02 PM

#

is there any difference

#

between https://keras.io/api/layers/pooling_layers/global_max_pooling2d/

Keras documentation: GlobalMaxPooling2D layer

#

and https://keras.io/api/layers/pooling_layers/max_pooling2d/

Keras documentation: MaxPooling2D layer

grave frost Aug 1, 2021, 11:19 PM

#

slate hollow between https://keras.io/api/layers/pooling_layers/global_max_pooling2d/

global just does it over the whole input

#

look it up for a better explanation

slate hollow Aug 1, 2021, 11:46 PM

#

doesn't the normal maxpool also do it over the whole input
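
The difference shows up in the output shapes. The sketch below imitates both operations with plain NumPy rather than calling Keras, assuming a channels-last input and a 2x2 window with stride 2 (the Keras `MaxPooling2D` default): regular max pooling slides a small window over the input and keeps a smaller grid, while global max pooling collapses the whole height and width to one value per channel.

```python
import numpy as np

# One 4x4 "image" with 3 channels, in (batch, height, width, channels) layout.
x = np.arange(1 * 4 * 4 * 3, dtype=float).reshape(1, 4, 4, 3)

# 2x2 max pooling with stride 2: take the max inside each non-overlapping 2x2 window.
# Axes after reshape: (batch, row block, row in block, col block, col in block, channels)
windows = x.reshape(1, 2, 2, 2, 2, 3)
max_pooled = windows.max(axis=(2, 4))

# Global max pooling: take the max over the entire height and width.
global_pooled = x.max(axis=(1, 2))

print(max_pooled.shape)     # (1, 2, 2, 3)
print(global_pooled.shape)  # (1, 3)
```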

blissful nymph Aug 2, 2021, 12:16 AM

#

can someone do me a favor and translate this keras nn to pytorch?

model = Sequential()
model.add(Dense(128, input_shape=(len(train_x[0]),), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(train_y[0]), activation='softmax'))

dire echo Aug 2, 2021, 5:13 AM

#

!e

import random as r
print(r.randint(1, 100))
#what come out will be decision

arctic wedgeBOT Aug 2, 2021, 5:13 AM

#

@dire echo :white_check_mark: Your eval job has completed with return code 0.

dire echo Aug 2, 2021, 5:13 AM

#

there, the most simple AI i can think of

austere swift Aug 2, 2021, 5:52 AM

#

ai involves intelligence

#

that's random

#

lol

#

technically the simplest ai you can do is using one parameter, so linear regression
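
For illustration, a one-parameter model of that kind could look like the sketch below: a line through the origin, `y = w * x`, fitted by least squares on made-up data.

```python
# Made-up data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Least-squares solution for the single weight in y = w * x.
w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def predict(x):
    # Apply the learned weight to a new input.
    return w * x

print(w)
```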

austere swift Aug 2, 2021, 6:05 AM

#

blissful nymph can someone do me a favor and translate this keras nn to pytorch? ``` model = Se...

why do you need to?

#

why not keep it in keras?

arctic wedgeBOT Aug 2, 2021, 6:42 AM

#

Hey @young valve!

It looks like you tried to attach file type(s) that we do not allow (.html). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a.

Feel free to ask in #community-meta if you think this is a mistake.

proven sigil Aug 2, 2021, 7:44 AM

#

Pyspark:

from pyspark.sql import functions as F

def remove_null_columns(df, label_col, null_threshold=0.8):
    total_rows = df.count()
    cols_to_drop = []
    for c in df.columns:
        if c == label_col:
            continue
            
        null_values = df.select(F.count(F.when(F.isnull(c), c)).alias(c)).collect()[0][0]
        if null_values / total_rows > null_threshold:
            cols_to_drop.append(c)
    
    df = df.drop(*cols_to_drop)
    return df

df = remove_null_columns(df, label_col)
print(len(df.columns))

I'm removing columns which have more than 80% null values. How to optimise this code?

velvet thorn Aug 2, 2021, 8:33 AM

#

proven sigil Pyspark: ``` def remove_null_columns(df, label_col, null_threshold=0.8): tot...

null_values = df.select(F.count(F.when(F.isnull(c), c)).alias(c))

#

this part

#

is the problem

#

you're calling collect once per column

#

you should write a query that selects the null percentage for all columns, filters out the ones that fall above the threshold, and then collect that

#

then drop

#

alternatively you can write it as a select but that's a bit more complex

#

I wouldn't recommend that

proven sigil Aug 2, 2021, 8:52 AM

#

Thank you so much!

velvet thorn Aug 2, 2021, 8:53 AM

#

yw 👋

cinder barn Aug 2, 2021, 10:04 AM

#

Why doesn't it print the chart

#

https://www.kaggle.com/alexisbcook/hello-seaborn This is the tutorial I am following

Hello, Seaborn

Explore and run machine learning code with Kaggle Notebooks | Using data from Interesting Data to Visualize

brazen jackal Aug 2, 2021, 10:16 AM

#

cinder barn Why doesn't it print the chart

add plt.show()

cinder barn Aug 2, 2021, 10:30 AM

#

omg

cinder barn Aug 2, 2021, 10:30 AM

#

brazen jackal add `plt.show()`

thankyou sm

uncut barn Aug 2, 2021, 10:39 AM

#

does anyone know how to open NDPI files in python?

indigo skiff Aug 2, 2021, 1:38 PM

#

hey guys anyone familiar with text generation pipelines? Need quick help to understand it more

desert oar Aug 2, 2021, 1:54 PM

#

can you be more specific @indigo skiff ?

desert oar Aug 2, 2021, 1:55 PM

#

cinder barn Why doesn't it print the chart

!code in the future, can you please share your code as text with code formatting, instead of a screenshot? see below 👇

arctic wedgeBOT Aug 2, 2021, 1:55 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

weak sentinel Aug 2, 2021, 2:23 PM

#

Does anyone know how efficiency of pd.cumsum() scales as compared to np.cumsum()

#

I found someone on StackOverflow saying pandas was faster, but my understanding was that pandas is just a layer built on top of numpy

chilly geyser Aug 2, 2021, 2:29 PM

#

I think Numpy is likely to be faster, but actually, just test it out

weak sentinel Aug 2, 2021, 2:31 PM

#

My dataset I’m currently developing with is really small, but when this goes into prod and uses live data it’s going to be thousands of rows

#

I’ll try a test though

#

Do you suggest %timeit?

chilly geyser Aug 2, 2021, 2:34 PM

#

Thousands of rows is generally not performance critical to me, unless you are doing >quadratic stuff

#

But well, you can make fake data with 50000 rows and see

grand lion Aug 2, 2021, 2:40 PM

#

How would I plot data on a United States Map by coordinate?

chilly geyser Aug 2, 2021, 2:47 PM

#

!e

from timeit import repeat
setup=(
"""
from numpy.random import default_rng
from pandas import DataFrame
x = default_rng().standard_normal(size=(30000, 20))
df = DataFrame(x)
"""
)
print(repeat("x.cumsum(axis=0)", setup, number=10, repeat=5))
print(repeat("df.cumsum()", setup, number=10, repeat=5))

arctic wedgeBOT Aug 2, 2021, 2:48 PM

#

@chilly geyser :white_check_mark: Your eval job has completed with return code 0.

001 | [0.19380801357328892, 0.1688365377485752, 0.17779459105804563, 0.20028469525277615, 0.2744292030110955]
002 | [0.2766013741493225, 0.28942475002259016, 0.25081070279702544, 0.2581129721365869, 0.2731569781899452]

chilly geyser Aug 2, 2021, 2:48 PM

#

@weak sentinel Minibot 'benchmark' seems to say np is slightly faster.
I also tested with colab, with numpy being also slightly faster

chilly geyser Aug 2, 2021, 3:07 PM

#

I further tested with C++ with compiler optimizations - it's a lot faster if you go that route, so there's that if what you're doing is somehow performance critical

indigo skiff Aug 2, 2021, 3:28 PM

#

desert oar can you be more specific @indigo skiff ?

My goal is to generate product description using language models however im not sure about the pipe line and it would be really helpful if i can know more details or someone who has gone through it and have a quick chat with him.

cinder barn Aug 2, 2021, 3:42 PM

#

desert oar !code in the future, can you please share your code as _text_ with code formatti...

Oh yea, sure. I just sent it as a ss to show the output

visual heart Aug 2, 2021, 4:17 PM

#

Hello

grave frost Aug 2, 2021, 4:35 PM

#

indigo skiff My goal is to generate product description using language models however im not ...

take in the product image from a CNN, identify what it is and generate a desc. I guess?

timber skiff Aug 2, 2021, 4:43 PM

#

Hey, anyone know why my OLS trend is so whacky on my plotly scatter plots?

#

Like, in mid-April it jumps up when all of the observations were actually below average

blissful nymph Aug 2, 2021, 4:55 PM

#

austere swift why do you need to?

tensorflow has problems on my computer, pytorch works perfectly. i actually managed to translate this thing yesterday so it's fine

quasi sparrow Aug 2, 2021, 4:56 PM

#

What's the incentive of publishing articles on medium?

#

"towards data science" blog, to be more specific

blissful nymph Aug 2, 2021, 4:58 PM

#

@quasi sparrow money and probably reputation

desert oar Aug 2, 2021, 4:58 PM

#

TDS has a lot of clout nowadays

#

lots of people subscribe to it

#

otherwise, medium is just a blogging platform with some social media elements

quasi sparrow Aug 2, 2021, 5:02 PM

#

Yeah, it's hard to navigate TDS. Most of the examples are toy programs.

candid wraith Aug 2, 2021, 5:19 PM

#

hey how do i get time.sleep(60) to stop all functions and read the script logically so i can make my script stop where i want it to ?

austere swift Aug 2, 2021, 5:20 PM

#

blissful nymph tensorflow has problems on my computer pytorch works perfectly, i actually mana...

why not fix tensorflow instead?

blissful nymph Aug 2, 2021, 5:20 PM

#

Dunno i use pytorch quite a bit

#

not tensorflow as much

austere swift Aug 2, 2021, 5:20 PM

#

that's like asking someone to help move your stuff into a new house because your old house has a leak

#

lol

blissful nymph Aug 2, 2021, 5:20 PM

#

true

austere swift Aug 2, 2021, 5:21 PM

#

what's the issue with tensorflow?

#

like what error do you get when you run it

blissful nymph Aug 2, 2021, 5:21 PM

#

2021-08-02 10:20:52.830264: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-08-02 10:20:52.830756: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

austere swift Aug 2, 2021, 5:21 PM

#

did you install cuda

wooden rapids Aug 2, 2021, 5:22 PM

#

hey guys any references to intuitively understand json/html/css parsing

#

so far its a lot of brute force try and try again

austere swift Aug 2, 2021, 5:23 PM

#

there's already prebuilt modules in python for html and json parsing
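
As a small illustration using only the standard library (the sample JSON, HTML, and the `LinkCollector` class name are made up for this sketch): `json.loads` turns JSON text into dicts and lists, and `html.parser.HTMLParser` calls a handler for each tag it encounters.

```python
import json
from html.parser import HTMLParser

# JSON: parse a string into Python dicts and lists.
record = json.loads('{"name": "widget", "tags": ["a", "b"], "price": 9.5}')
print(record["tags"][0])

# HTML: subclass HTMLParser and collect the href of every <a> tag.
class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

collector = LinkCollector()
collector.feed('<p>See <a href="https://example.com/docs">docs</a> and <a href="/faq">FAQ</a>.</p>')
print(collector.links)
```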

#

also, that doesn't really sound like it would belong in this channel

wooden rapids Aug 2, 2021, 5:32 PM

#

oh sorry - which channel is more appropriate? for context im in a data science course so i thought that this is something a lot of native users could speak about

cinder barn Aug 2, 2021, 5:33 PM

#

blissful nymph 2021-08-02 10:20:52.830264: W tensorflow/stream_executor/platform/default/dso_lo...

I have the same issue lol

#

I gave up

#

and just used google colab

austere swift Aug 2, 2021, 5:33 PM

#

cinder barn I have the same issue lol

did you install cuda lol

cinder barn Aug 2, 2021, 5:33 PM

#

so much easier

cinder barn Aug 2, 2021, 5:33 PM

#

austere swift did you install cuda lol

cuda toolkit?

austere swift Aug 2, 2021, 5:34 PM

#

yeah cuda 11.2

#

that's the latest version tensorflow accepts

cinder barn Aug 2, 2021, 5:34 PM

#

austere swift yeah cuda 11.2

ohhh

#

I downloaded 11.4

austere swift Aug 2, 2021, 5:34 PM

#

there's your issue :)

cinder barn Aug 2, 2021, 5:34 PM

#

thanks bro

#

Could you send me the link to download 11.2

#

I could only find 11.4

austere swift Aug 2, 2021, 5:34 PM

#

https://developer.nvidia.com/cuda-11.2.1-download-archive

NVIDIA Developer

CUDA Toolkit 11.2 Update 1 Downloads

#

you have to go to the archives to see it

cinder barn Aug 2, 2021, 5:35 PM

#

austere swift https://developer.nvidia.com/cuda-11.2.1-download-archive

king

austere swift Aug 2, 2021, 5:35 PM

#

the update 1 part doesn't matter

#

then you'll also need to install cudnn from here https://developer.nvidia.com/cudnn-download-survey (you have to make a developer account to get it, its free though)

sterile prawn Aug 2, 2021, 5:53 PM

#

Some lstm generated tweets:

#

Heya really tweeting are understanding upon with are you and of it you here and.
I am murder of rogers time as shorter with burgers and with me and here once blahhh.
Stop crazy twitter very hit turkey little homemade turkey upset his food haircut.
Mite goodmorning because roof with development yay amp twittering a his paparazzi.
Willieday tweet guys cousin are ian getting hopes and i im gonna for with so im here yet xxx.
Even sooo the shame create home food visit hit your massive myself starbucks.
Dats sounded yay hence hopes proud brit of ease you movies pain like you they are on at tomorrow.
Lauren hate with bugs wouldnt yet doing word and there do to about that and cya.
Iranelection alot the rumor recorded torture approach printed are of it even love and he are around yay.
Httptwitpic doing jenna and does perfect news line political newcastle while going on your song and in proud dani

#

made with an LSTM autoencoder and dcgan

serene scaffold Aug 2, 2021, 6:08 PM

#

sterile prawn Heya really tweeting are understanding upon with are you and of it you here and....

I feel like markov chains with ngrams are better than this, but it's still interesting.
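
For comparison, a bigram Markov chain generator can be sketched in a few lines of standard-library Python. The corpus here is a tiny made-up one, and the `generate` helper is hypothetical; a real run would use the tweet dataset and possibly longer n-grams.

```python
import random
from collections import defaultdict

# Tiny made-up corpus; a real one would be thousands of tweets.
corpus = "i love this song . i love my dog . my dog loves this song .".split()

# Bigram model: map each word to the list of words that followed it.
followers = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    followers[current_word].append(next_word)

def generate(start, length, seed=0):
    # Walk the chain, picking each next word at random from the observed followers.
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        options = followers.get(words[-1])
        if not options:  # dead end: no word ever followed this one
            break
        words.append(rng.choice(options))
    return " ".join(words)

print(generate("i", 8))
```

Because every adjacent word pair in the output was seen in the corpus, the text tends to be locally coherent even though the model has no notion of longer-range meaning.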

#

do you have the source code? I'd love to see it.

sterile prawn Aug 2, 2021, 6:08 PM

#

totes this is more for learning

#

about gans and seq2seq autoencoders

#

i'd fine tune gpt-2 if i was like going for quality tweets

sterile prawn Aug 2, 2021, 6:09 PM

#

serene scaffold I feel like markov chains with ngrams are better than this, but it's still inter...

https://colab.research.google.com/drive/1lPGdE3lVCpkVnJUIjB7OA2-A_sgzzi6E?usp=sharing

Google Colaboratory

#

sure here's the lstm autoencoder

serene scaffold Aug 2, 2021, 6:09 PM

#

sterile prawn i'd fine tune gpt-2 if i was like going for quality tweets

I have a bad taste in my mouth for anything GPT because I had been researching NLP for two years before I even heard of it, yet people talked about GPT as if it permanently solved all of NLP.

sterile prawn Aug 2, 2021, 6:10 PM

#

and the gan is literally the keras dcgan

#

just plugged into the latent space

sterile prawn Aug 2, 2021, 6:10 PM

#

serene scaffold I have a bad taste in my mouth for anything GPT because I had been researching N...

yeah ik but gpt-2 is still probs the best tool i would have for generating tweets coherently

#

but gpt is a bit overbearing rn

serene scaffold Aug 2, 2021, 6:10 PM

#

the problem is that there's more to NLP than generating text.

sterile prawn Aug 2, 2021, 6:10 PM

#

ofc

#

this is legit just text gen which is what gpt-2 is FOR

#

so ofc it would work

serene scaffold Aug 2, 2021, 6:11 PM

#

right. and it's still very interesting lemon_hyperpleased

sterile prawn Aug 2, 2021, 6:11 PM

#

serene scaffold the problem is that there's more to NLP than generating text.

yeah classification, recognition, text-to-speech

sterile prawn Aug 2, 2021, 6:11 PM

#

serene scaffold right. and it's still very interesting lemon_hyperpleased

exactly - great for anything text-generationy, but doesn't do everything

#

it isn't AGI for NLP

serene scaffold Aug 2, 2021, 6:11 PM

#

what is AGI?

sterile prawn Aug 2, 2021, 6:11 PM

#

artificial general intelligence

chilly geyser Aug 2, 2021, 6:48 PM

#

serene scaffold I have a bad taste in my mouth for anything GPT because I had been researching N...

Replace GPT with BERT ezgame

chilly geyser Aug 2, 2021, 6:49 PM

#

quasi sparrow Yeah, it's hard to navigate TDS. Most of the examples are toy programs.

TDS is very annoying with "subscribe or clap or whatever to read this 20%-of-the-time-useful bit"

#

I don't even know how it's so high up when stackoverflow is even better, especially when the result is quite pertinent

cinder barn Aug 2, 2021, 6:58 PM

#

# Imports
import cv2
print("imported cv2")

# Loading pre-trained data
trainedFaceData = cv2.CascadeClassifier('FaceDetection/haarcascade_frontalface_default.xml')
print("loaded pre-trained data")

# launch webcam
webcam = cv2.VideoCapture(1)
print("Webcam launched")

# loop all frames
while True:
    successFrameRead, frame = webcam.read()
    # stop if the webcam returned no frame
    if not successFrameRead:
        break
    # Converting to grayscale
    grayscaleImg = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    print("Converted to grayscale")
    # Detecting faces
    faceCoordinates= trainedFaceData.detectMultiScale(grayscaleImg)

    # Print location of face
    print(faceCoordinates)

    # Draw rectangle around face
    for (x, y, w, h) in faceCoordinates:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 1)


    # show image (window name, and what you want to show)
    cv2.imshow('Face Detector app', frame)
    print("Displaying image")
    
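    # wait 1 ms for a key press and exit the loop once any key is pressed
    print("Press any key to exit")
    if cv2.waitKey(1) != -1:
        break

# release the webcam and close the display window
webcam.release()
cv2.destroyAllWindows()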

cinder barn Aug 2, 2021, 6:58 PM

#

cinder barn ```python # Imports import cv2 print("imported cv2") # Loading pre-trained data...

How do I print the certainty?

grave frost Aug 2, 2021, 7:52 PM

#

serene scaffold I have a bad taste in my mouth for anything GPT because I had been researching N...

well, transformers help 😦

#

plus they are SOTA, so its hard to argue with that ¯_(ツ)_/¯

sterile prawn Aug 2, 2021, 8:23 PM

#

yea i only use lstms cuz they are easier 😐

pearl violet Aug 2, 2021, 9:24 PM

#

how about music and AI, there is someone very good at it here?

serene scaffold Aug 2, 2021, 10:00 PM

#

pearl violet how about music and AI, there is someone very good at it here?

Perhaps. You're more likely to get answers when you just put the question out there.

sterile prawn Aug 3, 2021, 12:32 AM

#

pearl violet how about music and AI, there is someone very good at it here?

i've DONE music and ai

#

very badly

#

but basically

#

encode music as text -> generate text w/ lstm -> decode bacc to musak

#

carykh has great vid on it

proven sigil Aug 3, 2021, 12:57 AM

#

velvet thorn you should write a query that selects the null percentage for all columns, filte...

def remove_null_columns(df, label_col, null_threshold=config['null_threshold']):
    all_features = df.columns
    if label_col in all_features:
        all_features.remove(label_col)
    df.createOrReplaceTempView('remove_null_columns')
    query = 'select ' + ', '.join(['count(`%s`) * 1.0 / count(*) as `%s`'%(i, i) \
                                   for i in all_features]) + ' from remove_null_columns'
    non_null_count = spark.sql(query).collect()[0].asDict()
    columns_to_drop = [k for k, v in non_null_count.items() if (1 - v) > null_threshold]
    
    df = df.drop(*columns_to_drop)
    return df

new_df = remove_null_columns(df, label_col, 0.1)
print(len(new_df.columns))

Does this look good?

#

It runs much faster compared to earlier. Should I change anything further?

velvet thorn Aug 3, 2021, 1:32 AM

#

proven sigil ```py def remove_null_columns(df, label_col, null_threshold=config['null_thresho...

you forgot your ```py

proven sigil Aug 3, 2021, 1:36 AM

#

I don't understand

#

Got it

#

Updated

serene scaffold Aug 3, 2021, 1:49 AM

#

proven sigil ```py def remove_null_columns(df, label_col, null_threshold=config['null_thresho...

I feel like... there must be an easier way. Are you just trying to drop columns that contain a NaN or what?

velvet thorn Aug 3, 2021, 1:51 AM

#

proven sigil I don't understand

well

#

I would suggest

#

you refrain from writing your own SQL

#

you can do that with the Spark DSL

#

but that's not a huge problem

velvet thorn Aug 3, 2021, 1:52 AM

#

serene scaffold I feel like... there must be an easier way. Are you just trying to drop columns ...

columns that have more nulls than a certain percentage

serene scaffold Aug 3, 2021, 1:54 AM

#

df.loc[:, df.isna().sum() / len(df) < THRESHOLD]

#

yes?

proven sigil Aug 3, 2021, 2:21 AM

#

serene scaffold ```py df.loc[:, df.isna().sum() / len(df) < THRESHOLD] ```

That doesn't work on pyspark

#

haha

desert oar Aug 3, 2021, 2:35 AM

#

@velvet thorn i imagine the sql version would be a lot faster, no?

#

otherwise you end up doing a for loop over columns with a collect-ing operation (count) in each iteration of the loop

#

or is there some pyspark magic i don't know about

velvet thorn Aug 3, 2021, 2:37 AM

#

desert oar @velvet thorn i imagine the sql version would be a lot faster, no?

no

velvet thorn Aug 3, 2021, 2:37 AM

#

desert oar otherwise you end up doing a for loop over columns with a `collect`-ing operatio...

this is actually

#

what they did originally

#

and I said

#

do it once, with one collect

#

it's been like 2 years since I worked with PySpark

#

but

#

you can defo express it with the Spark DSL

desert oar Aug 3, 2021, 2:37 AM

#

how would you do that with the DSL? you can't count without "collecting"

#

maybe the scala version lets you do some map/filter stuff over columns

velvet thorn Aug 3, 2021, 2:38 AM

#

like you write a query that counts the nulls in each column, collect that, then drop

#

not sure if I'm expressing myself properly

desert oar Aug 3, 2021, 2:38 AM

#

oh, i see

#

yeah easy enough

velvet thorn Aug 3, 2021, 2:39 AM

#

df.select((F.count(F.isnull(F.col(col)) / len(df) < 0.8).alias(col) for col in df.columns)?

#

something like that?

#

I don't really remember but that should work

#

wait as is reserved in Python right

desert oar Aug 3, 2021, 2:40 AM

#

yeah it's .alias in pyspark

velvet thorn Aug 3, 2021, 2:42 AM

#

desert oar maybe the scala version lets you do some map/filter stuff over columns

I remember there was some cool stuff in Scala not available in Python

desert oar Aug 3, 2021, 2:42 AM

#

F.count

velvet thorn Aug 3, 2021, 2:42 AM

#

but it's been a LONG time since I did any sort of Spark

desert oar Aug 3, 2021, 2:42 AM

#

that was it

velvet thorn Aug 3, 2021, 2:42 AM

#

bet it's like version 2.8 already

#

3.2?

desert oar Aug 3, 2021, 2:43 AM

#

yeah 3.something

velvet thorn Aug 3, 2021, 2:43 AM

#

3.1.2

#

wow

desert oar Aug 3, 2021, 2:47 AM

#

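from pyspark.sql import functions as F

def null_frac(df, colname):
    # fraction of rows where this column is null (average of a 0/1 flag)
    return F.avg(df[colname].isNull().cast("double"))

col_null_fracs = df.select(
    *(null_frac(df, c).alias(c) for c in df.columns)
).first()

bad_columns = [c for c, f in col_null_fracs.asDict().items() if f > 0.8]
df = df.drop(*bad_columns)  # drop() takes column names as separate arguments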

#

@proven sigil ☝️

#

it's probably good to stay fresh on pyspark

#

i haven't used it in over a year

dire echo Aug 3, 2021, 2:52 AM

#

Natural language...

#

can we use unused human brain as cpu, LOL

#

It also come with free big hardrive and built in learning module

proven sigil Aug 3, 2021, 3:17 AM

#

desert oar ```python def null_frac(df, colname): return df[colname].isNull() / F.count(...

That's amazing. Thanks!

halcyon vale Aug 3, 2021, 6:16 AM

#

Tokenization: Subword Tokenization splits words into smaller parts based on the most commonly occurring sub strings. Word Tokenization splits a sentence on spaces as well as applying language specific rules to try to separate parts of meaning even when there are no spaces. Subword Tokenization provides a way to easily scale between character tokenization i.e. using a small subword vocab and word tokenization i.e using a large subword vocab and handles every human language without needing language specific algorithms to be developed. On my Journey of Machine Learning and Deep Learning, I have read and implemented from the book Deep Learning for Coders with Fastai and PyTorch. Here, I have read about Word Tokenization, Subword Tokenization, Setup Method, Vocabulary, Numericalization with Fastai, Embedding Matrices and few more topics related to the same from here. I have presented the implementation of Subword Tokenization and Numericalization using Fastai and PyTorch here in the snapshot. I hope you will gain some insights and work on the same. I hope you will also spend some time learning the topics from the Book mentioned below. Excited about the days ahead !!
https://www.linkedin.com/posts/thinam-tamang-3b12831a2_300daysofdata-66daysofdata-machinelearning-activity-6828204194089041920-gyVw

Thinam Tamang on LinkedIn: #300DaysOfData #66DaysOfData #machinelea...

🏆 Day 233 of #300DaysOfData!

📑 Tokenization :
Subword Tokenization splits words into smaller parts based on the most commonly occurring sub strings...

arctic wedgeBOT Aug 3, 2021, 9:15 AM

#

@raven steeple Please don't try to ping @everyone or @here. Your message has been removed. If you believe this was a mistake, please let staff know!

ebon rock Aug 3, 2021, 10:46 AM

#

Hey folks! I am a data science aspirant and I have been learning SQL from the past month for a data analyst role.

#

I know JOINS, CASE, CTEs, Aggregate Functions as I have used all these to solve problems in Hackerrank.

#

What is the next step?

#

What more do I need to know?

uncut barn Aug 3, 2021, 11:31 AM

#

does anyone know how to check if the pixels have more than 8 bits per channel for a given image?

sterile prawn Aug 3, 2021, 11:40 AM

#

uncut barn does anyone know how to check if the pixels have more than 8 bits per channel fo...

maybe check the size of the image

#

if it's above what an 8 bit channel img should be

#

it has 10 bits?

#

like the size in mem of the img idk

uncut barn Aug 3, 2021, 11:43 AM

#

sterile prawn maybe check the size of the image

so im using openslide

sterile prawn Aug 3, 2021, 11:43 AM

#

oh

uncut barn Aug 3, 2021, 11:43 AM

#

and the only thing I can get out of it are the dimensions

sterile prawn Aug 3, 2021, 11:43 AM

#

isn't that a C library?

uncut barn Aug 3, 2021, 11:43 AM

#

no python has it too

sterile prawn Aug 3, 2021, 11:43 AM

#

uncut barn and the only thing I can get out of it are the dimensions

you cant like... read the size in memory

uncut barn Aug 3, 2021, 11:43 AM

#

which are only (width, height)

sterile prawn Aug 3, 2021, 11:43 AM

#

that's it

#

so ONLY with the width and height

#

can you tell if it has more than 8 bits

uncut barn Aug 3, 2021, 11:44 AM

#

but they're color imgs

#

that i dont know

sterile prawn Aug 3, 2021, 11:44 AM

#

well is there any difference in the width and height between 8 bit and 10 bits

#

?

uncut barn Aug 3, 2021, 11:44 AM

#

I converted it to a thumbnail and that turned this to an array and gave 3 dimensions, last being 3 which is colour

sterile prawn Aug 3, 2021, 11:45 AM

#

so

#

(width, height, channels)?

uncut barn Aug 3, 2021, 11:45 AM

#

sterile prawn (width, height, channels)?

yh for thumbnail

uncut barn Aug 3, 2021, 11:46 AM

#

sterile prawn well is there any difference in the width and height between 8 bit and 10 bits

not sure what you mean as there's different levels in this library but generally width and height are different for each level

sterile prawn Aug 3, 2021, 11:46 AM

#

ok

#

so i dont think your problem can be resolved

#

with just width, height and channels

#

look for someone who knows openslide

sterile prawn Aug 3, 2021, 11:48 AM

#

uncut barn not sure what you mean as there's different levels in this library but generally...

https://openslide.org/api/python/

#

maybe there's something here

abstract falcon Aug 3, 2021, 11:51 AM

#

Can anybody share a good resource on Entity Extraction Model..??

desert oar Aug 3, 2021, 12:10 PM

#

ebon rock What more do I need to know?

that sounds like a pretty good basis for data analysis. maybe also take a look at window functions. otherwise you know more than enough to get started, and you should start focusing on other things like excel skills, light-duty data processing with python, basic command line stuff, and probability/statistics
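
For anyone wondering what a window function looks like next to the GROUP BY style aggregates mentioned above, here is a small sketch using Python's built-in `sqlite3` (it assumes the bundled SQLite is 3.25 or newer, which is when window functions were added). The `sales` table and its values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", 1, 100), ("north", 2, 150), ("south", 1, 80), ("south", 2, 120)],
)

# SUM(...) OVER (...) keeps every row and adds a per-region running total,
# where GROUP BY would collapse each region to a single row.
rows = conn.execute(
    """
    SELECT region, month, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM sales
    ORDER BY region, month
    """
).fetchall()

for row in rows:
    print(row)
```

The same `OVER (PARTITION BY ... ORDER BY ...)` clause works with ranking functions such as `ROW_NUMBER()` and `RANK()`, which come up a lot in analyst interview questions.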

random elk Aug 3, 2021, 1:48 PM

#

Hello, everyone! I'm a beginner python programmer who is coming from a project management background and looking to transition into a data analyst career. Could you help me build a roadmap of which skills to develop? So far my python projects are very basic and I wanted to build projects that get slowly more complex.
So far I know the basics of the language and I've played a little bit with some analytics concepts in my last project. I don't want to write a giant text wall here with all my questions and curiosities, but here is my github and I would really appreciate some recommendations on what to work on: https://github.com/renatolew

GitHub

renatolew - Overview

Data Analyst and Python Developer looking to contribute to meaningful projects! - renatolew

#

I am sorry if this is not the right way to ask this. This is my first time participating in a programming community and I don't know how things work yet

sterile prawn Aug 3, 2021, 1:49 PM

#

how about if you want to get into machine learning

#

i see you already built a recipe analyzer

#

how about an RNN/LSTM to write original recipes?

#

but for a roadmap

#

i'd say learn SQL, numpy, pandas, then get into scikit, machine learning, then finally learn tensorflow, keras, deep learning

#

all while making projects

random elk Aug 3, 2021, 1:50 PM

#

Thank you. Do you recommend any specific projects for me?

sterile prawn Aug 3, 2021, 1:50 PM

#

to start - idk its really up to you?

#

but your recipe thing is a great example

#

for me getting started with neural nets

random elk Aug 3, 2021, 1:51 PM

#

One of my issues has been finding good projects, because the last two ones I tried were way more complex than I anticipated and I got stuck pretty quickly

sterile prawn Aug 3, 2021, 1:51 PM

#

i experimented with the iris dataset

#

or i built a simple webscraper

#

an LSTM to generate text

#

all pretty easy with plenty of examples

random elk Aug 3, 2021, 1:51 PM

#

Thank you. I'll look into it!

modern pine Aug 3, 2021, 1:51 PM

#

How about a stock prediction ai bot?

#

Any examples for the code?

sterile prawn Aug 3, 2021, 1:52 PM

#

modern pine How about a stock prediction ai bot?

those dont work too well and are kinda difficult

#

with simple LSTMs

#

though an idea i had

#

was to run an LSTM on wall street journal article headlines

#

and use that to predict how the DOW/ S & P 500 would perform in the next week

#

if anyone wants to try that

modern pine Aug 3, 2021, 1:57 PM

#

Oic...with LSTMs ....how about DNN?

#

Fully connected deep neural network

sterile prawn Aug 3, 2021, 2:01 PM

#

modern pine Oic...with LSTMs ....how about DNN?

i mean.... you could... but text is time series data

#

so lstms yeet DNNs

inland zephyr Aug 3, 2021, 2:07 PM

#

Hello, I want to ask about making a custom matrix with pandas.
I want to make something similar to a confusion matrix plot, but showing the average distance between the actual and predicted class, using the data available here https://paste.pythondiscord.com/ehogipafip.css but I cannot find good advice on how to do this.

#

i want to make it like this so i can analyze which entities are closely related to each other

modern pine Aug 3, 2021, 2:10 PM

#

sterile prawn i mean.... you could... but text is time series data

Ok thanks

sterile prawn Aug 3, 2021, 2:14 PM

#

gl

quiet vault Aug 3, 2021, 3:43 PM

#

I am doing a walk forward validation to evaluate a model. To get the best accuracy of how the model really performs on new data, I retrain the model every timestep. I am testing for 7 timesteps and repeating the walk forward validation 3 times. The model seems to be getting better every time it goes through the 7 days. Could it be possible that the model has past knowledge and is basically "cheating"?

#

I am working with keras.

acoustic halo Aug 3, 2021, 3:51 PM

#

quiet vault I am doing a walk forward validation to evaluate a model. To get the best accura...

Do you have a separate validation set to test against?

quiet vault Aug 3, 2021, 3:58 PM

#

no

#

Its time series data

#

Thats why I have walk forward validation

#

So just to be safe, is there a way to restart models completely?

acoustic halo Aug 3, 2021, 4:00 PM

#

You can still (and definitely should) have a separate validation set

quiet vault Aug 3, 2021, 4:01 PM

#

how

acoustic halo Aug 3, 2021, 4:01 PM

#

Have you got a sample of your dataset?

quiet vault Aug 3, 2021, 4:01 PM

#

yes

#

I have 131 datapoints

#

how many do u need?

acoustic halo Aug 3, 2021, 4:03 PM

#

like 5

quiet vault Aug 3, 2021, 4:03 PM

#

-0.10000228881836648
0.6800003051757955
0.5600013732910085
-0.8400001525878906
-0.7400016784667969
0.9099998474121094

acoustic halo Aug 3, 2021, 4:03 PM

#

Okay gimme a minute, I need to check some code where I have done something similar before

quiet vault Aug 3, 2021, 4:04 PM

#

alright

acoustic halo Aug 3, 2021, 4:08 PM

#

So if you have 131 datapoints, you could only use the first 100 for training and testing

#

Then at the end of each epoch, validate on the remaining 31

quiet vault Aug 3, 2021, 4:10 PM

#

That is not a very accurate way to test though

#

Walk forward validation is a great way for testing models with time series data because it represents how I would make predictions with the model

#

So the problem here is not really the way of testing.

#

I'm looking for a way to completely delete a model

acoustic halo Aug 3, 2021, 4:33 PM

#

Either way, the same still applies, you train the model on the first 100 or so datapoints, then once you have finished, you can use those 100 to try and predict the final 30 or however many you want to use as a final benchmark, otherwise there is no way to know the actual performance of the model, this would apply to both sliding windows and expanding windows

#

Not sure what you mean by deleting the model, since if you are saving them, they are usually just saved in a .h5 file

lapis sequoia Aug 3, 2021, 4:58 PM

#

Statement: We cannot determine overfitting based on one hypothesis only

Why is that? Isn't it the case that if we have a hypothesis with very low E_in and have a high E_out that its a sign of overfitting? Why can't we in virtue of that conclude that H1 is overfitting the data?

wicked basin Aug 3, 2021, 5:00 PM

#

how do i start the basics of machine learning

quiet vault Aug 3, 2021, 5:08 PM

#

I don't think you are understanding what walk forward validation is

#

The point is not to get the best model

#

it is to test to see how well the model performs

#

To get the best understanding of the variance of predictions, I am running the walk forward validation 3 times. After every time step, I want to make a new model and train it with the data it has. Then make a prediction and compare it with the answer
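
A rough sketch of that procedure, with the key point being that the model is rebuilt from scratch inside the loop so nothing carries over between time steps or between repeats. `build_model`, `fit` and `predict` are hypothetical stand-ins (a running mean instead of a Keras network), and the last four values of `series` are made up. With Keras, the analogous move would be to call the model-building function inside the loop, and `tf.keras.backend.clear_session()` is the usual way to release old state between builds.

```python
def build_model():
    """Hypothetical stand-in for a function that builds and compiles a new model."""
    return {"mean": 0.0}  # fresh, untrained state


def fit(model, history):
    model["mean"] = sum(history) / len(history)


def predict(model):
    return model["mean"]


def walk_forward(series, n_test):
    """Return the absolute error of a one-step forecast for each test step."""
    errors = []
    for t in range(len(series) - n_test, len(series)):
        model = build_model()      # new model: nothing carried over from the last step
        fit(model, series[:t])     # train only on data observed before time t
        errors.append(abs(predict(model) - series[t]))
    return errors


series = [-0.10, 0.68, 0.56, -0.84, -0.74, 0.91, 0.30, -0.20, 0.45, 0.05]
errors = walk_forward(series, n_test=7)
repeat = walk_forward(series, n_test=7)   # a second full pass starts from scratch too
print(errors)
```

If a model object created outside the loop is reused, each call to fit continues from the previous weights, which would explain scores improving from one pass to the next.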

quiet vault Aug 3, 2021, 5:12 PM

#

wicked basin how do i start the basics of machine learning

I recommend the SoloLearn Course which is free

wicked basin Aug 3, 2021, 5:14 PM

#

quiet vault I recommend the SoloLearn Course which is free

Thanks

quiet vault Aug 3, 2021, 5:16 PM

#

np

#

it does not go into deep learning right away which is good

#

at the end it does introduce u to neural networks

flint musk Aug 3, 2021, 5:57 PM

#

If logistic regression is classified between two features, and KNN is classified between more than two features, between how many features a decision tree classified?

sterile prawn Aug 3, 2021, 6:00 PM

#

this is where the tweet lstm is now:

#

So greece calories shooting tunnel tragedy and lovin crazy touching.
Just dreading full marathon is as or pool rock move spain beers.
Taking up drunken other tragedy high projects is alot less that no if i feel maybe even worry.
Was into a decisions chilled apples lol but take dream that sore workout i.
Just dreading full thunderstorm is fun or pool rock move spain rehearsal.
Here at my appropriate soft more animation today i was a drunken and.
Just off in class is id miss central places meet downtown weeks.
Was up the disc murder a misery comcast blogging pirates.
Chilling reality tonight october possibly on your annoying theme music for flowers.
Iranelection greece murder ideal loud and is i lovin with touching.

quiet vault Aug 3, 2021, 6:01 PM

#

better than I can write

inland zephyr Aug 3, 2021, 6:02 PM

#

i want to ask about the arcface embedding algorithm, about the output vector. Are the vector values normalized between 0 and 1 or not? i need to know so i can decide which similarity method to use for inference.

flint musk Aug 3, 2021, 6:03 PM

#

sterile prawn So greece calories shooting tunnel tragedy and lovin crazy touching. Just dreadi...

is it a yes or a no?😂

sterile prawn Aug 3, 2021, 6:05 PM

#

idk

#

i really dk

acoustic halo Aug 3, 2021, 6:05 PM

#

quiet vault I don't think you are understanding what walk forward validation is

You are probably right, I thought it was similar to k-fold but with time series data, but it sounds like something else, you got a resource I can look it up on?

quiet vault Aug 3, 2021, 6:05 PM

#

Sure. Can you wait like half an hour?

#

I'm in a ranked game lol

sterile prawn Aug 3, 2021, 6:06 PM

#

the lstm is an autoencoder lstm with dcgan if anyone wants to give it a shot themselves

acoustic halo Aug 3, 2021, 6:06 PM

#

Yeah no worries, ranked siege is way more important

quiet vault Aug 3, 2021, 6:08 PM

#

yes, sorry

quiet vault Aug 3, 2021, 6:36 PM

#

@acoustic halo https://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/

Machine Learning Mastery

How To Backtest Machine Learning Models for Time Series Forecasting

k-fold Cross Validation Does Not Work For Time Series Data and Techniques That You Can Use Instead. The goal of […]

acoustic halo Aug 3, 2021, 7:06 PM

#

Okay it was what I thought it was, nevermind

umbral ferry Aug 3, 2021, 8:46 PM

#

is there any benefit to over fitting? maybe it can give you some insight into your data like in unsupervised learning

unborn glacier Aug 3, 2021, 9:23 PM

#

umbral ferry is there any benefit to over fitting? maybe it can give you some insight into yo...

Yes, the ability to over-fit suggests that your model is sufficiently complex to handle the data. If you just do a single variable linear regression on complicated data, you'll never be able to get a proper fit to, lets say, sinusoidal data. So if you are concerned that you don't have enough layers or you didn't choose a complex enough ML model, the ability to over-fit suggests that it is complex enough but you need more data, or data that better predicts the test samples

#

It can also give you an idea of when to stop training, because if you are over-fitting, you've gone too far

umbral ferry Aug 3, 2021, 9:44 PM

#

you can tell when you've overfit?

acoustic halo Aug 3, 2021, 9:50 PM

#

Yes, in super simple terms if your model seems to be doing really well on the training data, but then does worse on new data it has never seen before, it's likely because it has overfit to the training data

umbral ferry Aug 3, 2021, 9:52 PM

#

what do you mean by doing well on training data? I thought you just kind of feed the training data in and out pops a model

#

I am sort of seeing on my test data, that most predictions are pretty close to the actual, but a few are far off

#

I think that points to over fitting possibly

acoustic halo Aug 3, 2021, 10:00 PM

#

Okay so let's say you train a model on your training data

#

Then after training you test your model on the training data again and get 99% correct predictions

#

Then you test your test data in the model, which only predicts correctly 30% of the time

#

That suggests the model has learnt the training data so well that it doesn't generalize to new data, aka it has overfitted
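
The train-versus-test gap described above can be reproduced with a tiny sketch. It assumes made-up data whose labels are coin flips (so there is nothing real to learn) and a 1-nearest-neighbour "model" that simply memorises the training set; the exact test score depends on the seed.

```python
import random

random.seed(0)


def make_data(n):
    # Labels are pure coin flips, so x carries no information about y.
    return [(random.random(), random.randint(0, 1)) for _ in range(n)]


train, test = make_data(200), make_data(200)


def predict(x):
    # 1-nearest-neighbour: copy the label of the closest training point.
    return min(train, key=lambda point: abs(point[0] - x))[1]


def accuracy(data):
    return sum(predict(x) == y for x, y in data) / len(data)


train_acc = accuracy(train)   # every training point finds itself, so this is perfect
test_acc = accuracy(test)     # unseen points: expected to sit near chance
print(train_acc, test_acc)
```

A perfect score on data the model has already seen next to a near-chance score on new data is the overfitting signature.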

umbral ferry Aug 3, 2021, 10:22 PM

#

oooh ok

#

ty

dusty cloud Aug 4, 2021, 12:03 AM

#

Hi guys, when it comes to faster python code, whats the difference between numba and transonic?

stark bough Aug 4, 2021, 12:20 AM

#

hello Everyone!

#

🙂

#

i am new in this group glad to be here

desert oar Aug 4, 2021, 12:23 AM

#

dusty cloud Hi guys, when it comes to faster python code, whats the difference between numba...

transonic appears to be a "wrapper" around several packages, of which numba is one

cyan sun Aug 4, 2021, 12:44 AM

#

If any of you are feeling extra generous tonight, mind joining me a #☕help-coffee ??need a pandas Q answered thx

lapis sequoia Aug 4, 2021, 2:02 AM

#

Anyone have some good tips for how I could speed up my yolov5 model? Using 720p images for my input data and only two classes, I have about 3000 training images and 10% of those are for validation

What can I do to speed up inference no matter how small?

tidal sonnet Aug 4, 2021, 4:01 AM

#

So like, I've been trying to figure out how many heights are within one standard deviation for this given set... but I seem to be doing something wrong, in a process I thought was fairly straightforward

#

code:

from math import sqrt
players = [180, 172, 178, 185, 190, 195, 192, 200, 210, 190]

mean = sum(players) / len(players)

pre_variant = [(number - mean) * (number - mean) for number in players]

variance = sum(pre_variant) / len(pre_variant)

std = sqrt(variance)

valid = [player for player in players if player in range(int(mean-std), int(mean+std))]

print(len(valid))

#

Find the mean, then the variance which is the average of the squares of the difference of each value and the mean, find the standard_deviation which is the square root of the Variance, and then find all numbers in that range...
Where did I go wrong?

#

I tried doing this same method with different data, another question that I knew the correct answer for, and it worked fine

#

Nvm... I got it to work, was a problem with my last list comp

#

from math import sqrt
data = [180, 172, 178, 185, 190, 195, 192, 200, 210, 190]

mean = sum(data) / len(data)

pre_variance = list(map(lambda number: (number - mean) * (number - mean), data))

variance = sum(pre_variance) / len(pre_variance)

std = sqrt(variance)

result = list(filter(lambda number: number > (mean - std) and number < (mean + std), data))

print(len(result))
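
For comparison, a short sketch that puts the first attempt and the fixed filter side by side, using `statistics.pstdev` for the population standard deviation. It suggests why the first list comprehension miscounted: `int()` truncates both bounds and `range()` only contains integers.

```python
from statistics import mean, pstdev

data = [180, 172, 178, 185, 190, 195, 192, 200, 210, 190]
mu, sigma = mean(data), pstdev(data)   # pstdev = population standard deviation

# First attempt: int() truncates both bounds and range() only holds integers,
# so the lower bound slides down to 178 and lets one extra height in.
first_try = [h for h in data if h in range(int(mu - sigma), int(mu + sigma))]

# Fixed version: compare against the untruncated bounds.
fixed = [h for h in data if mu - sigma < h < mu + sigma]

print(len(first_try), len(fixed))  # 7 6
```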


white venture Aug 4, 2021, 5:20 AM

#

What exactly is TensorFlow? And why do many say that it makes it easy to do ML when looking at it makes my head hurt?

acoustic halo Aug 4, 2021, 5:24 AM

#

Many use something like keras which provides a simpler interface for tensorflow, or pytorch instead

#

Tensorflow can be difficult to understand if you are not already familiar with the concept of tensors

white venture Aug 4, 2021, 5:26 AM

#

acoustic halo Many use something like keras which provides a simpler interface for tensorflow,...

Where should I start learning the fundamentals of ML so I can build some cool applications with it?

acoustic halo Aug 4, 2021, 5:29 AM

#

https://www.reddit.com/r/learnmachinelearning/wiki/index

index - learnmachinelearning

r/learnmachinelearning: A subreddit dedicated to learning machine learning

#

Theres also a bunch of other resources in the pinned messages

#

Codecademy has a decent course as well if you have access to that

limpid osprey Aug 4, 2021, 8:11 AM

#

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

data = pd.read_csv('/content/student-por.csv')
print(data)
X =  data.drop(columns=['grade'])
print(X)
y = data['grade']
print(y)
model = DecisionTreeClassifier()
model.fit(X , y)
model.predict([[18, 2, 2, 0, 1, 0, 0, 0, 1, 1, 0, 0, 4, 3, 4, 1, 1, 3, 4]])

What am i doing wrong here

ValueError                                Traceback (most recent call last)
<ipython-input-33-4fc4f82f40e4> in <module>()
      9 print(y)
     10 model = DecisionTreeClassifier()
---> 11 model.fit(X , y)
     12 model.predict([[18, 2, 2, 0, 1, 0, 0, 0, 1, 1, 0, 0, 4, 3, 4, 1, 1, 3, 4]])

2 frames
/usr/local/lib/python3.7/dist-packages/sklearn/tree/_classes.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
    875             sample_weight=sample_weight,
    876             check_input=check_input,
--> 877             X_idx_sorted=X_idx_sorted)
    878         return self
    879 

/usr/local/lib/python3.7/dist-packages/sklearn/tree/_classes.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
    171 
    172         if is_classification:
--> 173             check_classification_targets(y)
    174             y = np.copy(y)
    175 

/usr/local/lib/python3.7/dist-packages/sklearn/utils/multiclass.py in check_classification_targets(y)
    167     if y_type not in ['binary', 'multiclass', 'multiclass-multioutput',
    168                       'multilabel-indicator', 'multilabel-sequences']:
--> 169         raise ValueError("Unknown label type: %r" % y_type)
    170 
    171 

ValueError: Unknown label type: 'continuous'

This works with other data i have but not with this one for some reason
current data : https://paste.pythondiscord.com/pogoqinuko.apache
old data : https://paste.pythondiscord.com/ayazepezoz.apache

Is it that it can only train with 2 parameters
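
A hedged sketch of what the traceback seems to point at: the number of columns is probably not the issue. `Unknown label type: 'continuous'` is what scikit-learn reports when a classifier is given non-integer float targets, so this assumes the `grade` column in the new file holds fractional values. The data below is made up; `type_of_target` is the scikit-learn helper that the check in the traceback relies on.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.utils.multiclass import type_of_target

X = [[1], [2], [3], [4]]
y = [10.5, 11.25, 12.0, 13.75]   # made-up fractional grades

label_type = type_of_target(y)    # how scikit-learn categorises these targets
print(label_type)

# A classifier expects discrete classes, so fitting it on these targets fails.
try:
    DecisionTreeClassifier().fit(X, y)
    classifier_error = None
except ValueError as exc:
    classifier_error = str(exc)
print(classifier_error)

# A regressor accepts continuous targets directly.
reg = DecisionTreeRegressor(random_state=0).fit(X, y)
prediction = reg.predict([[2]])[0]
print(prediction)
```

If the grades are really meant as categories, converting them to integers or bins before fitting the classifier is the other option.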

late shell Aug 4, 2021, 8:47 AM

#

Hello, I'm just getting started with keras and was trying out this code:

model = keras.Sequential(name='shit', layers=[
    keras.Input(shape=(2,)),
    keras.layers.Dense(3, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid'),
])

But I get this error :
WARNING:tensorflow:Please add `keras.layers.InputLayer` instead of `keras.Input` to Sequential model. `keras.Input` is intended to be used by Functional model.

What does it mean by "Functional model" and why am I getting this error?

acoustic halo Aug 4, 2021, 8:52 AM

#

Because as it says, you need to use InputLayer, not Input

#

you could also just do keras.layers.Dense(3, activation='relu', input_shape=(2,))

#

And skip defining the input layer

#

Functional models in keras are just models more complex than sequential models, eg:

#

austere swift Aug 4, 2021, 9:32 AM

#

late shell Hello, I'm just getting started with keras and was trying out this code: ```py m...

well the error pretty much explains itself, keras.Input is meant for a different kind of model, and so for sequential models you should replace it with keras.layers.InputLayer

#

a functional model is a model that is made kind of like this:

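from tensorflow import keras

# shapes and layer sizes are just example values
inputs = keras.Input(shape=(2,))
x = keras.layers.Dense(3, activation='relu')(inputs)
x = keras.layers.Dense(3, activation='relu')(x)
# ...
outputs = keras.layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs=inputs, outputs=outputs)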

#

if you'd like to learn more about those here's the documentation link for the functional api https://keras.io/guides/functional_api/

Keras documentation: The Functional API

late shell Aug 4, 2021, 10:41 AM

#

Cool 👍 , thanks a lot @acoustic halo & @austere swift

acoustic halo Aug 4, 2021, 10:46 AM

#

Actually got a functional API question myself, could you implement NEAT with it by creating layers with single inputs and outputs to effectively act as single nodes?

dusk spear Aug 4, 2021, 11:12 AM

#

Hello, I've got an error while running a NN. Can you please help me?

#

It's giving me this error

#

WARNING:tensorflow:Model was constructed with shape (None, 300, 1) for input KerasTensor(type_spec=TensorSpec(shape=(None, 300, 1), dtype=tf.float32, name='gru_9_input'), name='gru_9_input', description="created by layer 'gru_9_input'"), but it was called on an input with incompatible shape (None, 14, 1).

#

I think it's something to do with the way I've input the data, but I cannot figure it out

grave frost Aug 4, 2021, 11:13 AM

#

acoustic halo Actually got a functional API question myself, could you implement NEAT with it ...

with the functional API - you can modify your NN any way you want as long as autograd does its job ¯_(ツ)_/¯

acoustic halo Aug 4, 2021, 11:14 AM

#

That's what I was thinking, not sure what the performance would be of having potentially hundreds of layers, even if they are small

grave frost Aug 4, 2021, 11:14 AM

#

well, you can always use their profiler to optimize them further if they take too much time

#

though use pytorch or jax then cuz it would be more controllable

acoustic halo Aug 4, 2021, 11:15 AM

#

Not that I plan on doing this, it's mostly just a thought experiment

#

But it also makes me wonder if there are any papers that apply a method like NEAT to layers as opposed to individual nodes

glacial sparrow Aug 4, 2021, 12:30 PM

#

I have a dataset (100k,52) with labelled anomalies and I'm trying iforest on various variations of the dataset. So far none returns anything sensible. When I plot the feature space in 2D with PCA or TSNE with different colours for normal and anomaly points, must I be able to visually confirm anomaly regions? In my case normal and anomaly points are mixed e.g. with PCA normal points form a circle and anomaly points are scattered towards the middle or with TSNE everything seems to be mixed altogether

carmine tide Aug 4, 2021, 12:35 PM

#

Hello! So I am not sure if this is the right room to ask my question but I didn't find a more appropriate one. I want help with fitting a curve on data with errors on both the x and y axes. From what I've read scipy's curve_fit cannot deal with x errors (correct me if I'm wrong). I tried using odr but I think that it didn't give me the correct curve. I could be wrong and it could actually be the best fit curve but I would appreciate a second opinion. Thanks!

wheat sun Aug 4, 2021, 1:11 PM

#

I was trying to make an ML algorithm that predicts the value of for example f(x) = 2x for a specific input to x (overkill but I'm trying out the power of ML) so that when I fit this array into the model (DecisionTreeRegressor),

it should hopefully predict a y value of 40 for x = 20, 60 for 30, 90 for 45, so on and so forth.

However, when I try to predict 6 for example with the model trained on the array above, it returns 10 and not 12. It goes for any number higher than the numbers in the array.

Can another model solve this issue?

length = 50
slope = 2
b = 0
step = 1

# Initialize array
data = {'x': [(i * step) for i in range(length)], 'y': [(i * step) * slope + b for i in range(length)]}
df = pd.DataFrame(data)

# Define
from sklearn.tree import DecisionTreeRegressor

model = DecisionTreeRegressor(random_state=1)

y = df.y
X = df.x
X = X.values.reshape(-1, 1)

# Fit
model.fit(X, y)

# Predict
print(model.predict([[10]]))

Thanks!

sterile prawn Aug 4, 2021, 1:18 PM

#

Anyone here ever do any roguelike stuff with Python that can be read by a lotta shit

there is nothing against it, but it is really screechy

@nekitdev it's already there for you to call it like a regular pattern..maybe if u can help me grab this guys token he keeps logging on my account on steam just saying it out loud for me to learn python?
im new in python that technically it is for
You are not allowed to use that to run in the background while other code is where it doesnt work now
and the webdriver in your code without seeing the code
lmao i use repl .it
that would be a great choice 😄
You need to use requires some notion of OOP and to be compatible with both, Linux and Windows support multiple IPs on same NIC
How do you become better than him including me 😂😅

#

markov chains = somewhat realistic python discord messages

#

When i look at to get a simple 1 or 0. 1 means it is assigned to a class
None
I don't appreciate the tone you're taking with cepo. Do you understand what you're asking for is .... Twisted indeed .... badum tsss
how to do a project skskkss
Isn't there a nice project to do with said bot, Selenium is a testing tool, not for scraping
how do i call a function inside a for loop breaks
well pandas complains about the python language. This python course i'm auditing just got a quick question if u have a point where all the keys in the dictionary, instead you should loop backward so if someone asked me to teach him, he didn't take it seriously
if i want to do something on button click, not refresh or redirect the page, instead; update the content in the github student program, which is easier to read i think
It'll work for all of these errors
** does anyone know of the styling guidelines

desert oar Aug 4, 2021, 1:43 PM

#

wheat sun I was trying to make an ML algorithm that predicts the value of for example f(x)...

DecisionTreeRegressor is really bad for this kind of task. i'd encourage you to look at what exactly a decision tree is, and try to figure out why it's so bad.

wheat sun Aug 4, 2021, 1:43 PM

#

desert oar `DecisionTreeRegressor` is _really bad_ for this kind of task. i'd encourage you...

Decision tree, I understand

#

You're looking at a tree of possibilities, but they're only limited to the trained values or something

desert oar Aug 4, 2021, 1:44 PM

#

also you might want to practice using numpy/pandas for working with data more efficiently:

import numpy as np
import pandas as pd

length = 50
slope = 2
b = 0
step = 1

data = pd.DataFrame({'x': np.arange(0, length, step)})
data['y'] = data['x'] * slope + b

wheat sun Aug 4, 2021, 1:45 PM

#

That makes intuitive sense

#

So the np.arange function kinda functions exactly like the range function with a min, max and step?

desert oar Aug 4, 2021, 1:48 PM

#

wheat sun You're looking at a tree of possibilities, but they're only limited to the train...

more or less, yeah. a tree has a fixed set of "split points". so it fails in 2 areas:

it can't extrapolate out of range of the training data - this is a problem in every model, but decision trees always predict the same value on out-of-range data
it can't predict a continuous range of values, unless you have an infinitely deep tree, which isn't possible
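Both points can be seen without scikit-learn. The sketch below is a hand-rolled stand-in, not the library's code: it assumes a fully grown 1-D tree has one leaf per training point, with split points halfway between neighbouring x values.

```python
from bisect import bisect_left

# Training data from the example above: x = 0..49, y = 2x.
xs = list(range(50))
ys = [2 * x for x in xs]

# Assumption: one leaf per training point, split points at the midpoints.
split_points = [(a + b) / 2 for a, b in zip(xs, xs[1:])]


def tree_like_predict(x):
    """Return the value stored in the leaf that x falls into."""
    leaf = bisect_left(split_points, x)  # x <= split point goes to the left leaf
    return ys[leaf]


print(tree_like_predict(20))    # inside the training range
print(tree_like_predict(20.4))  # same leaf as 20, so the same answer
print(tree_like_predict(100))   # beyond the last split point
print(tree_like_predict(1000))  # still the last leaf
```

Everything past the last split point lands in the last leaf, and anything between two split points gets that leaf's single stored value.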

#

and for the sake of the exercise: if you know that the underlying function has the form f(x) = ax + b, what model is definitely the best choice for learning this function f?

desert oar Aug 4, 2021, 1:49 PM

#

wheat sun So the np.arange function kinda functions exactly like the range function with a...

!d numpy.arange

arctic wedgeBOT Aug 4, 2021, 1:49 PM

#

numpy.arange


numpy.arange([start, ]stop, [step, ]dtype=None, *, like=None)
Return evenly spaced values within a given interval.

Values are generated within the half-open interval `[start, stop)` (in other words, the interval including *start* but excluding *stop*). For integer arguments the function is equivalent to the Python built-in *range* function, but returns an ndarray rather than a list.

When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use [`numpy.linspace`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html#numpy.linspace "numpy.linspace") for these cases.
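To make the doc entry concrete, a quick sketch (the values are picked only for illustration):

```python
import numpy as np

# Integer arguments: same values as the built-in range, but as an ndarray.
print(np.arange(0, 10, 2))
print(list(range(0, 10, 2)))

# Non-integer steps: linspace takes the number of points instead of a step,
# so both end points and the length are explicit.
points = np.linspace(0.0, 1.0, 11)
print(len(points), points[0], points[-1])
```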

wheat sun Aug 4, 2021, 1:50 PM

#

desert oar and for the sake of the exercise: if you know that the underlying function has t...

No ML is actually necessary, the function can literally be implemented as a simple function

desert oar Aug 4, 2021, 1:50 PM

#

sure, i'm just checking for understanding 🙂

wheat sun Aug 4, 2021, 1:50 PM

#

I knew it was overkill, I just wanted to check what ML can do

desert oar Aug 4, 2021, 1:50 PM

#

in that case a tree isn't a great idea either. you'll want something that can learn "complicated" functions - random forest, gradient boosting, neural network

#

if you have the ability to arbitrarily create test inputs and outputs, you might do even better with a gaussian process model

#

depends on the situation

wheat sun Aug 4, 2021, 1:52 PM

#

So a tree is better for let's say bool values or limited-choice values like gender, eye color, country of origin, etc

desert oar Aug 4, 2021, 1:53 PM

#

yeah. i don't think i've ever seen a single regression tree used in serious work

#

they used classification trees quite a bit when i worked in insurance

#

i guess a single regression tree isn't bad if you want to "cut" the target into categories/levels, but you don't know what those categories/levels should be

#

pretty specific use case

wheat sun Aug 4, 2021, 1:55 PM

#

I just started kaggle's intro ML course today and it was the first model in the tutorial

desert oar Aug 4, 2021, 1:55 PM

#

what, a regression tree?

#

can you link it?

wheat sun Aug 4, 2021, 1:56 PM

#

desert oar can you link it?

https://www.kaggle.com/dansbecker/your-first-machine-learning-model

Your First Machine Learning Model

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

desert oar Aug 4, 2021, 1:57 PM

#

it's definitely better if you have more features: that means you have more splits, and more possible values the tree can predict

#

i think this tutorial is meant to show off scikit-learn moreso than decision trees

#

but price is probably an okay situation to use a decision tree: you don't really care about the exact price, and it probably makes sense to group prices into "levels" anyway

wheat sun Aug 4, 2021, 1:58 PM

#

Yeah right

#

I used a regression tree for the Titanic competition and got a 74% score

desert oar Aug 4, 2021, 1:58 PM

#

i thought the titanic competition was a binary classification task?

#

i remember doing it in ~2015 when i was first learning python and branching out from "traditional" stats and econometrics

wheat sun Aug 4, 2021, 1:59 PM

#

Yeah, the target variable is survival

desert oar Aug 4, 2021, 2:00 PM

#

a classification tree is a sensible intuitive choice for that problem

#

personally i came from the statistics world, so i didn't even consider it and went right for logistic regression (which is also a sensible choice, for other reasons)

wheat sun Aug 4, 2021, 2:01 PM

#

I just wanted to dive right in to doing ML stuff with whatever code I learned

desert oar Aug 4, 2021, 2:01 PM

#

it's also worth considering what the advantages and disadvantages of a decision tree and logistic regression are

wheat sun Aug 4, 2021, 2:02 PM

#

I still don't know what kinds of ML models there are

desert oar Aug 4, 2021, 2:02 PM

#

i'd argue that the decision tree is a lot easier to understand. but knowing stats and probability is very valuable for doing serious ML or more general data science work

wheat sun Aug 4, 2021, 2:02 PM

#

desert oar i'd argue that the decision tree is a lot easier to understand. but knowing stat...

I know, that's why I've been practicing on Khan Academy

desert oar Aug 4, 2021, 2:02 PM

#

that's good

#

i think i found a better YT channel for stats.. i will try to find it

#

there are pretty much 3 main categories of ML models in common use: trees and ensembles of trees, neural networks, and statistical models (especially linear ones). many other types exist, but these are the 3 that you will see over and over.

wheat sun Aug 4, 2021, 2:04 PM

#

From what I understand, neural networks have layers and utilize linear algebra for getting data from one layer to the next

#

Are trees in ML models represented by actual tree data structures?

serene scaffold Aug 4, 2021, 2:18 PM

#

wheat sun Are trees in ML models represented by actual tree data structures?

are you thinking of random forests?

wheat sun Aug 4, 2021, 2:19 PM

#

Hmm

serene scaffold Aug 4, 2021, 2:20 PM

#

yes, my understanding is that they are acyclic directed graphs.

#

(which is a specific kind of tree--trees can be undirected)

umbral ferry Aug 4, 2021, 2:23 PM

#

so I trained my model trying to get it to overfit (just to see how it performs), and maybe as expected, it performs well on the training data, and worse but also well on test data (never before seen)

#

is that ok? like the only thing that matters is how well it does on never before seen data?

acoustic halo Aug 4, 2021, 2:25 PM

#

I mean, it's okay as long as you are happy with it, but reducing the overfitting may make the result better on the test data

desert oar Aug 4, 2021, 2:26 PM

#

wheat sun Are trees in ML models represented by actual tree data structures?

not usually, but how they are stored is a good question. it probably varies across implementations.

#

a "tree" data structure holds data - a decision tree is an algorithm, and you store various parameters that make the algorithm work

desert oar Aug 4, 2021, 2:27 PM

#

umbral ferry is that ok? like the only thing that matters is how well it does on never before...

in a lot of real-world tasks, yes, this is the most important thing

umbral ferry Aug 4, 2021, 2:27 PM

#

actually, I'm adding parameters to reduce overfitting, and it's really not doing anything

#

huh

acoustic halo Aug 4, 2021, 2:28 PM

#

reduce parameters, don't increase them

wheat sun Aug 4, 2021, 2:28 PM

#

desert oar a "tree" data structure holds _data_ - a decision tree is an _algorithm_, and yo...

Hmm ok

#

I meant to ask how ML models store what they get from fitting

#

Like, it concluded that if sex == male, not survived and if sex == female, survived

desert oar Aug 4, 2021, 2:40 PM

#

wheat sun I meant to ask how ML models store what they get from fitting

the python side of the scikit-learn decision tree implementation is pretty readable (the core tree-building code is in cython), you could read through it if you're curious https://github.com/scikit-learn/scikit-learn/blob/82df48934eba1df9a1ed3be98aaace8eada59e6e/sklearn/tree/_classes.py#L445-L494

GitHub

scikit-learn/_classes.py at 82df48934eba1df9a1ed3be98aaace8eada59e6...

scikit-learn: machine learning in Python. Contribute to scikit-learn/scikit-learn development by creating an account on GitHub.
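On the "how is the fit stored" part: in scikit-learn the fitted tree ends up in the estimator's `tree_` attribute as a set of parallel arrays indexed by node. A small sketch with made-up data (the 0/1 encoding of sex is an assumption for the example):

```python
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up): one feature, sex encoded as 0 = male, 1 = female,
# and the label is 1 for "survived".
X = [[0], [0], [1], [1]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
tree = clf.tree_  # the fitted structure lives in this attribute

# Each node is an index into parallel arrays rather than a linked object.
for node in range(tree.node_count):
    if tree.children_left[node] == -1:  # -1 marks a leaf
        print(f"node {node}: leaf, value={tree.value[node].ravel()}")
    else:
        print(
            f"node {node}: feature {tree.feature[node]} <= {tree.threshold[node]}"
            f" -> left {tree.children_left[node]}, right {tree.children_right[node]}"
        )
```

So the "if sex == male" rule is just a feature index and a threshold on the root node, plus the class values stored on the two leaves.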

wheat sun Aug 4, 2021, 2:43 PM

#

The __init__ method is shorter than I expected

old grove Aug 4, 2021, 3:07 PM

#

What is Covariance ? Can anyone explain with example.. all i know is If one var increases and the second also increases then it's positive covariance, but what if one moves up and the other down... Can anyone explain with ease what covariance is and how it differs from correlation?

desert oar Aug 4, 2021, 3:10 PM

#

old grove What is Covariance ? Can anyone explain with example.. all i know is If one var ...

correlation is covariance normalized to lie in [-1, 1]

desert oar Aug 4, 2021, 3:12 PM

#

old grove What is Covariance ? Can anyone explain with example.. all i know is If one var ...

covariance between X and Y means that, when X is "high", then Y is "high", and when X is "low", then Y is "low" (that's positive covariance; if Y tends to be "low" when X is "high", the covariance is negative)

old grove Aug 4, 2021, 3:13 PM

#

desert oar covariance between X and Y means that, when X is "high", then Y is "high", and w...

Ok... so what if one is high and other seem to go down then ? is it cov or corr ?

desert oar Aug 4, 2021, 3:13 PM

#

old grove Ok... so what if one is high and other seem to go down then ? is it cov or corr ...

correlation is computed from covariance

chilly geyser Aug 4, 2021, 3:13 PM

#

Covariance takes variance units

#

Pearson correlation measures linear correlation

old grove Aug 4, 2021, 3:14 PM

#

ok

old grove Aug 4, 2021, 3:17 PM

#

desert oar correlation is computed from covariance

and what about the range ? Like corr range between 0 and 1 so covariance lies in 1,-1 ?

desert oar Aug 4, 2021, 3:17 PM

#

old grove and what about the range ? Like corr range between 0 and 1 so covariance lies in...

no, correlation ranges from -1 to 1 like i told you

#

covariance is unbounded
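A small worked example may help here. The numbers are made up, and the two helper functions are written out by hand just to show the formulas:

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


x = [1, 2, 3, 4, 5]
up = [2, 4, 6, 8, 10]                # rises with x
down = [10, 8, 6, 4, 2]              # falls as x rises
up_big = [200, 400, 600, 800, 1000]  # same pattern as `up`, different units

print(covariance(x, up), correlation(x, up))
print(covariance(x, down), correlation(x, down))
print(covariance(x, up_big), correlation(x, up_big))
```

One variable going up while the other goes down gives a negative covariance and a negative correlation. Rescaling a variable changes the covariance by the same factor but leaves the correlation where it was, which is why only correlation has a fixed range.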

old grove Aug 4, 2021, 3:17 PM

#

ok

desert oar Aug 4, 2021, 3:40 PM

#

@fading burrow where did you get that diagram?

#

@fading burrow is this a question about neural networks or about pooling covid tests?

#

or something else

#

the calculations for that table appear to be in the source paper https://www.medrxiv.org/content/10.1101/2020.04.06.20052159v1

Efficient and Practical Sample Pooling for High-Throughput PCR Diag...

In the global effort to combat the COVID-19 pandemic, governments and public health agencies are striving to rapidly increase the volume and rate of diagnostic testing. The most common form of testing today employs Polymerase Chain Reaction in order to identify the presence of viral RNA in individual patient samples one by one. This process has ...

#

i see

#

i think p in that table is the expected frequency of positive results

#

skimming the paper, it sounds like they derived this table from numerical simulations

#

they have an appendix with some derivations though

#

...but i think the appendix is missing

#

oh its here https://www.medrxiv.org/content/10.1101/2020.04.06.20052159v1.supplementary-material?versioned=true https://www.medrxiv.org/content/medrxiv/suppl/2020/04/14/2020.04.06.20052159.DC2/2020.04.06.20052159-1.pdf

#

yeah there are some derivations and formulas in that appendix

shut dock Aug 4, 2021, 4:06 PM

#

anyone using lazypredict on a regular basis? seems like a major time saver but may also mislead if you don't know why one model would be chosen over another https://towardsdatascience.com/lazy-predict-fit-and-evaluate-all-the-models-from-scikit-learn-with-a-single-line-of-code-7fe510c7281

Medium

Lazy Predict: fit and evaluate all the models from scikit-learn wit...

The easiest way to see which models work best for your dataset!

grave frost Aug 4, 2021, 5:34 PM

#

shut dock anyone using lazypredict on a regular basis? seems like a major time saver but m...

the days, where you need a wrapper for scikit-learn 🤕

torpid scarab Aug 4, 2021, 6:12 PM

#

hello

#

Anyone knows any good site with project ideas on AI? searching hard for my thesis

#

Thanks

sterile prawn Aug 4, 2021, 6:13 PM

#

torpid scarab hello

i'd love to help you brainstorm

#

but for a phd thesis? i wouldn't look for an idea on a website

#

what do you want to do?

#

nlp?

#

computer vision?

torpid scarab Aug 4, 2021, 6:13 PM

#

MSc

sterile prawn Aug 4, 2021, 6:13 PM

#

generative ai?

#

ok its good to start with an area of machine learning

#

regression?

#

neural net architectures?

torpid scarab Aug 4, 2021, 6:15 PM

#

ye computer vision or sound i guess..I would love to create some easy hardware too, like connect it with arduino

desert oar Aug 4, 2021, 6:15 PM

#

i still wouldn't look too hard for a masters thesis in a python discord server

sterile prawn Aug 4, 2021, 6:15 PM

#

yea

#

we are not big brain ai people

torpid scarab Aug 4, 2021, 6:15 PM

#

desert oar _i still wouldn't look too hard for a masters thesis in a python discord server_

where would you look?

sterile prawn Aug 4, 2021, 6:15 PM

#

torpid scarab where would you look?

professors , university resources, prior papers, arxiv, etc.

desert oar Aug 4, 2021, 6:16 PM

#

there are also online communities more specifically focused on that field

#

that said, some hardware stuff could be interesting. not all theses have to be "implement an AI"

torpid scarab Aug 4, 2021, 6:16 PM

#

ye...professors gave us some cases, didnt like any too much 😛

desert oar Aug 4, 2021, 6:16 PM

#

your contribution could be "i got AI stuff to run on this tiny arduino and it's something that other people will find useful, here is the source code"

#

and your thesis wouldn't be "here is my cool machine learning model", it would be "here is how i got XYZ to run on an embedded system"

#

but it depends on your skillset

sterile prawn Aug 4, 2021, 6:17 PM

#

if you want an ml thing in audio/computer vision - how about a model that generates audio from a silent video (with lip movements) that's a few-shot learner

#

so you give it a few short clips of someone speaking

#

and it can figure out the rest

torpid scarab Aug 4, 2021, 6:17 PM

#

I just have this idea of thesis would be something I like you know..trying to think ideas but I get stuck on implementation side..like how I ll manage to collect data

desert oar Aug 4, 2021, 6:18 PM

#

collecting data is always the hardest part

sterile prawn Aug 4, 2021, 6:18 PM

#

just an example

torpid scarab Aug 4, 2021, 6:18 PM

#

sterile prawn just an example

nice one..how you come up with this for example?

sterile prawn Aug 4, 2021, 6:18 PM

#

alright

#

i thought "what's a cross between audio and computer vision"

#

lip sync audio generation

torpid scarab Aug 4, 2021, 6:18 PM

#

desert oar collecting data is _always_ the hardest part

ye i know..had to collect and get the metadata through spotipy from 7.5k songs on another project

sterile prawn Aug 4, 2021, 6:19 PM

#

what's state of the art? "generating new audio from lip sync video after training on a specific speaker"

#

how could that be improved? "rather than having to train on an individual speaker, make the training few-shot so it could work on anyone"

torpid scarab Aug 4, 2021, 6:20 PM

#

that's a cool idea

#

Do you have anything of this coolness for AI and environment I like? I would love to use it for dunno maybe predict wildfires or detect stray animals and try to protect them..seems hard to find data though

sterile prawn Aug 4, 2021, 6:21 PM

#

https://www.youtube.com/watch?v=wg3upHE8qJw

YouTube

Two Minute Papers

Can an AI Learn Lip Reading?

📝 The paper "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis"...

#

i believe this is SOTA

sterile prawn Aug 4, 2021, 6:21 PM

#

torpid scarab Do you have anything of this coolness for AI and environment I like? I would lov...

https://www.youtube.com/watch?v=AGCH1GR7pPU

YouTube

Two Minute Papers

Burning Down an Entire Virtual Forest! 🌲🔥

📝 The paper "Fire in Paradise: Mesoscale Simulation of Wildfires" is available here:
http://computationalsciences.org/publications/haedrich-2021-wildfires.html

#

seems like a good starting point

torpid scarab Aug 4, 2021, 6:22 PM

#

thank you! both!

sterile prawn Aug 4, 2021, 6:22 PM

#

ok good luck!

#

@inland zephyr what are you saying lol

inland zephyr Aug 4, 2021, 6:28 PM

#

hello, i need help with the keras.backend methods.
I'm now trying to validate the results of my model by looping 100 times through different combinations of train/valid/test. I wrapped my model training and evaluation in one function and call it inside a for loop with different data

import tensorflow as tf
from tensorflow.keras.models import Sequential

def train_and_test(trainX, trainY, validX, validY, testX, f_leng):
    # Reset Keras global state before building a new model
    tf.keras.backend.reset_uids()
    tf.keras.backend.clear_session()
    stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, mode='min')
    model = Sequential()
    ...
    model.compile(loss='sparse_categorical_crossentropy', optimizer='Adam', metrics=["accuracy"])
    model.summary()
    with tf.device('/device:GPU:0'):
        history = model.fit(x=trainX,
                            y=trainY,
                            validation_data=(validX, validY),
                            epochs=50,
                            shuffle=True,
                            batch_size=10,
                            callbacks=[stop_early])
    # Predicted class = index of the highest-scoring output
    ypred = model.predict(x=testX)
    ypred = ypred.argmax(axis=-1)
    return history, ypred

with the flow like this:

for each wavelet method:
  tf.keras.backend.reset_uids()
  tf.keras.backend.clear_session()
  for each level:
      set train x,train y
      call train_and_test function
  create pandas summary

I'm glad it works well, but I'm pretty suspicious because the result jumps drastically from 70% to 90% for each decomposition level. The same happens with the other kinds of wavelet I used.

#

lmao i'm pretty happy but suspicious with my CNN result

#

see, both curves are nearly identical even though the wavelets are different

#

i'm pretty afraid that the same model keeps training on top of its previous weights in every for loop iteration rather than starting as a fresh untrained model

#

wait i find the issue

#

i think the problem lies in how i define the model

#

dunno if i'm wrong, but does adding each layer with .add versus defining the layers in a list have a different effect on the model definition?

umbral ferry Aug 4, 2021, 7:21 PM

#

I know a good test/train split is 80/20, but if I decrease my test size and get better results on my test set, does that mean that a smaller test set is better for my specific dataset?

unborn glacier Aug 4, 2021, 7:22 PM

#

It's probably just random chance that it's better

#

Or the fact that now you have more training data

#

You want a wide range of samples in your test set (hence making it 20% of the data) so that you can get a realistic picture of what it will do in the real world

umbral ferry Aug 4, 2021, 7:27 PM

#

I'm also confused, say you did all the tuning and stuff, how do you go about generating the final model? do you run it a bunch of times until you get a low scoring metric? Do you train it on 100% of the data or a larger fraction?

#

how do you shape the best deliverable

unborn glacier Aug 4, 2021, 7:28 PM

#

I think time is best spent trying a range of model types to get the best result, not fine-tuning the metrics with the same model

#

The point of the test data is to give you a preview of what it might do in the real world, but if you keep iterating you can get an artificial good result on the test data

#

Which it sounds like you may have done

#

That's also why people do test-train-validate, but if you have very limited amounts of data, it's probably not worth doing that
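For anyone unsure what the three-way split looks like in practice, here is a minimal sketch in plain Python. The 60/20/20 fractions and the helper name are just assumptions for the example:

```python
import random


def three_way_split(items, valid_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then cut into train / validation / test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    n_valid = int(len(items) * valid_frac)
    test = items[:n_test]
    valid = items[n_test:n_test + n_valid]
    train = items[n_test + n_valid:]
    return train, valid, test


train, valid, test = three_way_split(range(100))
print(len(train), len(valid), len(test))
```

The idea is to compare models and settings on the validation part as often as you like, and to look at the test part only once at the end, so the test score is not something you have iterated against.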

#

One place where fine-tuning might be worthwhile is in things like batch-size, as I think that can help eliminate over-fit

umbral ferry Aug 4, 2021, 7:31 PM

#

I'm getting quite good fit on my training, and worse but still good on my test lol

inland zephyr Aug 4, 2021, 7:32 PM

#

i use train/valid/test even when the data is very limited. the reason is that i need validation data to check whether the model performs about the same on both train and valid... which means checking whether the model has overfitted or not

unborn glacier Aug 4, 2021, 7:33 PM

#

It's not the end of the world to have slight overfit, it's typical to have a model perform better on the train than the test. You don't want crazy large differences though

umbral ferry Aug 4, 2021, 7:35 PM

#

I have large differences 😬, RMSE on train is 2, test is 6.5

inland zephyr Aug 4, 2021, 7:35 PM

#

when i use a 50-20-30 train/test/valid split with a random configuration, sometimes my model overfits on validation after training, or underfits... and that is reflected in the test result...

#

This is what happens in my case after 50 epochs: even when the train acc is almost 100, the validation acc is very low, around 50 or 60 at the lowest val loss, and the test follows the valid result, so underfitting happens... but it doesn't happen often

#

which scare me anytime

uncut orbit Aug 4, 2021, 7:56 PM

#

sterile prawn Aug 4, 2021, 7:57 PM

#

~~i just use 100% of data for training~~

uncut orbit Aug 4, 2021, 8:02 PM

#

its been an hour

uncut orbit Aug 4, 2021, 8:02 PM

#

sterile prawn ~~i just use 100% of data for training~~

why?

sterile prawn Aug 4, 2021, 8:02 PM

#

oh that was a joke

#

now and then im too lazy to make a validation callback

#

and regret it later

uncut orbit Aug 4, 2021, 8:05 PM

#

lmao

#

i took 10000 images of fake people

#

10000 images of real people

#

and its taking 10000 years to finish

grave frost Aug 4, 2021, 8:13 PM

#

uncut orbit and its taking 10000 years to finish

its probably running on CPU

#

BART usually takes about 1-2 years

uncut orbit Aug 4, 2021, 8:23 PM

#

grave frost its probably running on CPU

no gpu i think, the data is huge

lapis sequoia Aug 4, 2021, 8:24 PM

#

uncut orbit its been an hour

an hour for 4 epochs, each epoch takes approx. 827s. You wait around 3308s for 4 epochs (an hour), to reach 50 epochs, 41350s. 38042s you must wait, around 10.5 hours.

You must wait around 10 hours and 30 minutes for this to complete, correct me if i am wrong pwease.
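The arithmetic checks out. Written out as a sketch, assuming every epoch keeps taking about 827s and nothing stops training early:

```python
seconds_per_epoch = 827   # figure quoted above
epochs_done = 4
epochs_total = 50

elapsed = seconds_per_epoch * epochs_done
total = seconds_per_epoch * epochs_total
remaining = total - elapsed

hours, rest = divmod(remaining, 3600)
minutes = rest // 60
print(elapsed, total, remaining)
print(f"about {hours} h {minutes} min left")
```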

uncut orbit Aug 4, 2021, 8:25 PM

#

i will definitely

lapis sequoia Aug 4, 2021, 8:25 PM

#

(Since posting)

chilly geyser Aug 4, 2021, 8:25 PM

#

Well that's certainly average case

#

And I think that's a reasonable prediction

uncut orbit Aug 4, 2021, 8:25 PM

#

but i dont know why i put the epochs so high because now its probably going to overfit

lapis sequoia Aug 4, 2021, 8:26 PM

#

Mmm

uncut orbit Aug 4, 2021, 8:29 PM

#

before i was running with 200 deepfakes and real people each

#

the score was ok

waxen veldt Aug 4, 2021, 10:48 PM

#

whats the best way to find a mentor

#

also quick question

#

# lets say I have a list 
lst = [1, 2, 3, 4, 5]
data = pd.DataFrame({'Numbers': [3, 42, 5, 345, 36]})

#

I want to filter for the rows that have any of the numbers in lst

#

how can i do that?

#

the desired rows that I would want has the numbers, 3 and 5

#

# My solution 
data[data['Numbers'].apply(lambda x: x in lst)]

#

But i feel like there would be a faster way

iron basalt Aug 4, 2021, 11:00 PM

#

data[data['Numbers'].isin(lst)]
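Put together with the data from the question, a short sketch showing that both versions pick the same rows:

```python
import pandas as pd

lst = [1, 2, 3, 4, 5]
data = pd.DataFrame({'Numbers': [3, 42, 5, 345, 36]})

# Boolean mask built in one vectorised call instead of a Python-level lambda.
mask = data['Numbers'].isin(lst)
filtered = data[mask]
print(filtered)

# The apply-based version from the question selects the same rows.
same = data[data['Numbers'].apply(lambda x: x in lst)]
print(filtered.equals(same))
```

`isin` does the membership test inside pandas rather than calling a Python function once per row, which is where the speed-up on larger frames usually comes from.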

waxen veldt Aug 4, 2021, 11:01 PM

#

ahhhhhhhhhhhh

#

i seee

#

thanks

#

also what are good visualizations to see correlations between two categorical columns?

iron basalt Aug 4, 2021, 11:04 PM

#

A table.

waxen veldt Aug 4, 2021, 11:04 PM

#

im familiar with visualizations such as pointplot, violinplot, barplot, etc but those are all for categorical columns and also numerical columns

waxen veldt Aug 4, 2021, 11:04 PM

#

iron basalt A table.

huh

iron basalt Aug 4, 2021, 11:04 PM

#

(With coloring, red = high)

waxen veldt Aug 4, 2021, 11:04 PM

#

hmm

#

i'm thinking countplot with a hue as the other categorical

#

but is there a better one

waxen veldt Aug 4, 2021, 11:05 PM

#

iron basalt A table.

you mean cross tab?

iron basalt Aug 4, 2021, 11:05 PM

#

rows are one column, cols are the other

#

cross

waxen veldt Aug 4, 2021, 11:05 PM

#

yeah so cross tab

#

but is there like visualizations?

#

cuz a table isn't a visualization

iron basalt Aug 4, 2021, 11:06 PM

#

It is, but there are others

#

Can do a graph, close nodes are highly correlated.

#

Graphs are useful when there are many components and you want an overview from a distance

#

Can zoom out

exotic maple Aug 4, 2021, 11:09 PM

#

Isnt a heatmap prolly the best for crosstabs kind of data?

iron basalt Aug 4, 2021, 11:09 PM

#

(But a very small cell size in a matrix works too)

#

Yea, colored table.

#

Color is pre-cognitive so it helps a lot.

#

The graph approach can give you insight into clusters.

#

A couple of things all correlated with each other.

waxen veldt Aug 4, 2021, 11:12 PM

#

bet thanks

#

so for the heatmap

iron basalt Aug 4, 2021, 11:13 PM

#

Graph renderings are kinda hard to find though compared to a simple table.

waxen veldt Aug 4, 2021, 11:13 PM

#

would you first create a cross tab and then call sns.heatmap()

#

or pivot table with aggfunc=np.count if that will even work
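The crosstab route can be sketched like this. The column names and values are made up, and the plotting call is left to the note after the block so the snippet only needs pandas:

```python
import pandas as pd

# Made-up data with two categorical columns.
df = pd.DataFrame({
    'eye_color': ['blue', 'brown', 'brown', 'green', 'blue', 'brown'],
    'country':   ['NO',   'IT',    'IT',    'IE',    'NO',   'NO'],
})

# Rows are one column, columns are the other, cells are counts.
counts = pd.crosstab(df['eye_color'], df['country'])
print(counts)

# Row-normalised shares are often easier to compare than raw counts.
shares = pd.crosstab(df['eye_color'], df['country'], normalize='index')
print(shares.round(2))
```

Either table can then be passed to seaborn, for example `sns.heatmap(counts, annot=True, fmt="d")`. As far as I know numpy has no `np.count`, so the pivot table variant would need a different aggfunc to run.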

lapis sequoia Aug 4, 2021, 11:15 PM

#

hey can someone help me out. I'm new to tensorflow and am getting a dimension error on validation data.

basically im using an imagedatagenerator on my training data, when I try to evaluate the model based on the evaluation data however, it throws an error.

here is the error, and im guessing its to do with the output since its a 10x1 array output.
ValueError: Shapes (None, 10, 2) and (None, 10) are incompatible

#

model.fit(datagen.flow(x=x_train, y=y_train, shuffle=True, batch_size=32), epochs=1,
callbacks=[callback], validation_data=(x_test, y_test))

#

here is my model.fit line, i bet its something here

#

could someone help

#

the error occurs at the end of the epoch

junior lintel Aug 4, 2021, 11:23 PM

#

‘‘’py
Test’’’

lapis sequoia Aug 4, 2021, 11:23 PM

#

wow ok, im just stupid nvm

lapis sequoia Aug 5, 2021, 1:00 AM

#

What am I doing wrong? Why is the target variable and its column duplicated?

dummies = pd.get_dummies(df2, columns= ['payment_type','category_name'])
  
df3 = pd.concat([df2,dummies], axis='columns')
X = df3.loc[:,df3.columns != 'price']
# Target
y = df3['price']

serene scaffold Aug 5, 2021, 1:27 AM

#

lapis sequoia What am I doing wrong? Why is the target variable and its column duplicated? ```...

perhaps the same column appears in both df2 and dummies?

#

also, the definition of X could just be X = df3.drop('price', axis='columns')
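A small reproduction of what is probably going on, using a made-up stand-in for `df2` (only the three column names come from the question):

```python
import pandas as pd

# Made-up stand-in for df2.
df2 = pd.DataFrame({
    'price': [10.0, 12.5, 8.0],
    'payment_type': ['card', 'cash', 'card'],
    'category_name': ['books', 'toys', 'books'],
})

# With columns=..., get_dummies returns the whole frame: untouched columns
# such as price are kept, and only the listed ones are replaced by dummies.
dummies = pd.get_dummies(df2, columns=['payment_type', 'category_name'])
print(list(dummies.columns))

# Concatenating df2 back on therefore repeats price (and everything else).
df3 = pd.concat([df2, dummies], axis='columns')
print(list(df3.columns).count('price'))

# Using the get_dummies result directly avoids the duplicate.
X = dummies.drop('price', axis='columns')
y = dummies['price']
```

If that matches the real data, dropping the `pd.concat` step and working from `dummies` should be enough.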

lapis sequoia Aug 5, 2021, 2:02 AM

#

Hey not sure if this is the right place but I've been reading around but couldn't find 1 exact answer.

I am reading large files in python and looking for the fastest way to do so.

serene scaffold Aug 5, 2021, 2:09 AM

#

lapis sequoia Hey not sure if this is the right place but I've been reading around but couldn'...

are you using the open function? and are you sure that your approach really truly is too slow?

lapis sequoia Aug 5, 2021, 2:17 AM

#

serene scaffold are you using the `open` function? and are you sure that your approach really tr...

Well I'm just using the normal with open function and then readlines(), pretty much as basic as I can. I was trying to find a faster way to check over it

serene scaffold Aug 5, 2021, 2:18 AM

#

lapis sequoia Well I'm just using the normal with open function and then readlines(), pretty m...

but how do you know that what you're doing is too slow?

lapis sequoia Aug 5, 2021, 2:19 AM

#

I don't tbh
I've had my code running for 5 hours and it processed 1.85mil lines, every line I am checking a list with a length of 26k if that item in the list is equal to the line of the file
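Not something anyone proposed in this exchange, but worth a try before parallelising: the check described above can be sketched with a `set` in place of the 26k-item list, so each line costs one hash lookup instead of a scan of the whole list. The function name and the sample data are made up.

```python
def count_matching_lines(lines, wanted_items):
    """Count lines whose text (minus the newline) is one of wanted_items."""
    wanted = set(wanted_items)  # hashing makes each `in` check roughly constant time
    matches = 0
    for line in lines:          # a file object can be passed here and is read lazily
        if line.rstrip("\n") in wanted:
            matches += 1
    return matches


# Tiny in-memory stand-in for the big file and the 26k-item list.
fake_file = ["apple\n", "pear\n", "kiwi\n", "apple\n"]
print(count_matching_lines(fake_file, ["apple", "kiwi"]))
```

With a real file this would be called inside `with open(...) as f:` and given `f` directly, which also avoids loading every line into memory the way `readlines()` does.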

serene scaffold Aug 5, 2021, 2:19 AM

#

lapis sequoia I don't tbh I've had my code running for 5 hours and it processed 1.85mil lines,...

can you do a few lines at a time in parallel?

lapis sequoia Aug 5, 2021, 2:20 AM

#

serene scaffold can you do a few lines at a time in parallel?

How would I go about that?

serene scaffold Aug 5, 2021, 2:20 AM

#

lapis sequoia How would I go about that?

how many cores do you have?

lapis sequoia Aug 5, 2021, 2:20 AM

#

Like multiprocessing?

serene scaffold Aug 5, 2021, 2:20 AM

#

yes

lapis sequoia Aug 5, 2021, 2:20 AM

#

I got 4 cores

serene scaffold Aug 5, 2021, 2:20 AM

#

if processing each line doesn't depend on knowing about previous lines, you can do it in parallel

lapis sequoia Aug 5, 2021, 2:21 AM

#

It's using about 25% of my CPU

serene scaffold Aug 5, 2021, 2:21 AM

#

so maybe you could let it run overnight with three cores 🤷‍♂️

lapis sequoia Aug 5, 2021, 2:21 AM

#

hmm i could look into it

#

I just don't want it to kill my CPU over time lol

#

I have no idea if it will I'm kinda new to this

serene scaffold Aug 5, 2021, 2:35 AM

#

@lapis sequoia if each process uses at most 25% then running three instances will give you some clearance.

#

But it's up to you to know what the peak memory usage is
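
Separate from the multiprocessing idea, and not something suggested in the chat: if every line is compared against a 26k-item list, a `set` usually helps a lot, because membership tests on a set are constant time on average instead of scanning the whole list. A minimal sketch, assuming each line should match an item exactly once its trailing newline is stripped; `count_matching_lines`, `fake_file` and `wanted_items` are hypothetical names for illustration.

```python
import io

def count_matching_lines(lines, wanted_items):
    """Count lines whose text (without the newline) is one of wanted_items."""
    wanted = set(wanted_items)  # built once; lookups are O(1) on average
    # Iterating the file object reads line by line instead of loading
    # everything into memory the way readlines() does.
    return sum(1 for line in lines if line.rstrip("\n") in wanted)

# Hypothetical stand-ins for the real file and the 26k-item list.
fake_file = io.StringIO("apple\nbanana\ncherry\napple\n")
wanted_items = ["apple", "cherry", "durian"]

matches = count_matching_lines(fake_file, wanted_items)
print(matches)
```

With a real file, `with open(path) as f:` and passing `f` in place of `fake_file` should behave the same way.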

lapis sequoia Aug 5, 2021, 2:37 AM

#

alright thx

odd falcon Aug 5, 2021, 3:34 AM

#

hello everyone

#

excuse me, has anyone worked with LSTMs before?

native bay Aug 5, 2021, 5:41 AM

#

odd falcon excuseme, has anyone worked with lstm before?

yup

odd falcon Aug 5, 2021, 5:43 AM

#

help me, understand lstm, please!!

royal crest Aug 5, 2021, 5:43 AM

#

uhhh

native bay Aug 5, 2021, 5:43 AM

#

ok so the full form is long short-term memory. it's mostly used for sequence tasks like generating text (quotes, for example), and you can also generate maths questions with it

#

and it is smart enough to also understand the grammar used in a sentence once you have enough data

#

https://www.youtube.com/watch?v=LfnrRPFhkuY&t=616s

YouTube

codebasics

Simple Explanation of LSTM | Deep Learning Tutorial 36 (Tensorflow,...

LSTM or long short term memory is a special type of RNN that solves traditional RNN's short term memory problem. In this video I will give a very simple explanation of LSTM using some real life examples so that you can understand this difficult topic easily. Also refer to following blogs to explore math and understand few more details.

http://c...

▶ Play video

#

this video will be more helpful 🙂

ruby patio Aug 5, 2021, 9:03 AM

#

i have seen it

scarlet python Aug 5, 2021, 9:10 AM

#

@ruby patio great!

ruby patio Aug 5, 2021, 9:11 AM

#

scarlet python <@872514906405085225> great!

yeah bro

grave frost Aug 5, 2021, 10:13 AM

#

native bay and it is smart enough to also understand the grammar used in a sentence once yo...

it does not understand grammar lol - who said that?

native bay Aug 5, 2021, 10:14 AM

#

grave frost it does not understand grammar lol - who said that?

yes it doesnt i meant the patterns can make it learn grammar

grave frost Aug 5, 2021, 10:14 AM

#

ruby patio yeah bro

start with the official paper on arxiv, check out yannic kilcher or stack overflow if you have any doubts - or post them here

#

karpathy also wrote a ton of blogs on it, you can see them also

carmine tide Aug 5, 2021, 10:28 AM

#

Hello, can someone help me a bit with curve fitting? I try fitting a curve on data with x and y errors using odr but it doesn't give me the correct curve. Thanks!

red mortar Aug 5, 2021, 10:47 AM

#

how many GBs of data is recommended for machine learning (I'm choosing between 8 and 16)? I usually use Google colab for ML because of the free GPU, however i realized that in the future if i am doing things with larger datasets, google colab might not work because uploading the dataset to drive takes a really long time sometimes.

#

16gb data seems like overkill because i probably won't even do machine learning locally for the next few years, so i probably won't get it, but i am just wondering what other people think

grave frost Aug 5, 2021, 10:54 AM

#

red mortar 16gb data seems like overkill because i probably won't even do machine learning ...

use google cloud if you have the bucks

solid lintel Aug 5, 2021, 11:07 AM

#

If you guys are looking for implementation of Machine Learning algorithms on python, I've made a github repository which you can follow (https://github.com/vanshhhhh/Hands-On-Machine-Learning)
If you find this helpful please do give it a star on github 🙂

GitHub

GitHub - vanshhhhh/Hands-On-Machine-Learning: This repository conta...

This repository contains the implementation of all the Machine Learning algorithms like Regression, Classification, Clustering etc. All of this has been Implemented in python - GitHub - vanshhhhh/H...

undone flare Aug 5, 2021, 11:34 AM

#

Does this mean as n_student increases, posttest decreases?

rigid zodiac Aug 5, 2021, 1:05 PM

#

Hi guys, I have a quick question. If I have a data frame from 1 to 10 and I'm trying to create some sort of groups like
(0,1,2), (1,2,3), (2,3,4) ... (8,9,10), how can I do it?

chilly geyser Aug 5, 2021, 1:59 PM

#

rigid zodiac Hi guys, i have a quick question. if I have a data frame from 1 to 10, and I tr...

There's this
https://stackoverflow.com/questions/54280228/how-to-iterate-n-wise-over-an-iterator-efficiently

Stack Overflow

How to iterate "n-wise" over an iterator efficiently

Possibly a duplicate, but I couldn't find anything.

I have a very long iterator (10000 items) and I need to iterate over it ~500 items at a time. So if my iterator was range(10000), it would look ...

desert oar Aug 5, 2021, 2:42 PM

#

rigid zodiac Hi guys, i have a quick question. if I have a data frame from 1 to 10, and I tr...

what kinds of groups? you want a list of dataframes of 3 rows each?

#

or do you just need to perform a calculation on 3 rows at a time?

#

don't make us guess!

rigid zodiac Aug 5, 2021, 2:44 PM

#

idk what is it called, like mathematical term. My goal is for a list of data ranging from 1 to 10, I will create kinda a partition or group like
Group1: (1,2,3)
group2: (2,3,4)
etc.

desert oar Aug 5, 2021, 2:44 PM

#

yes, and what are you trying to do with those groups? how do you want to store them? do you even need to store them, or do you just want to perform a calculation on them?

#

are these dataframe rows? index values? etc

rigid zodiac Aug 5, 2021, 2:45 PM

#

Once I have those groups, I will use them as the positions to identify data. Example: data['c'].iloc[list(group1)]

#

idk whether it is possible to do that or not

desert oar Aug 5, 2021, 2:46 PM

#

and what do you want to do with that data?

#

you can make a list of overlapping triples, like [(0,1,2), (1,2,3), (2,3,4), ...] pretty easily. but you usually don't/shouldn't need to do this explicitly with pandas, see e.g. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html#pandas.DataFrame.rolling and https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.rolling.Rolling.apply.html
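
A minimal sketch of what `rolling` does with overlapping windows of 3 rows, on toy data (a single `value` column holding 0 to 10, invented here). The spread function passed to `apply` is just an example calculation, not what the fall data needs.

```python
import pandas as pd

# Toy data: one column with the values 0..10.
df = pd.DataFrame({"value": range(11)})

# Each result row summarizes the current row and the two before it.
# The first two rows are NaN because their windows are incomplete.
rolling_max = df["value"].rolling(window=3).max()

# Rolling.apply runs any function on each window; raw=True passes
# the window as a NumPy array.
spread = df["value"].rolling(window=3).apply(
    lambda window: window.max() - window.min(), raw=True
)

print(pd.DataFrame({"value": df["value"], "max3": rolling_max, "spread3": spread}))
```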

#

so i ask again - what are you actually trying to do?

rigid zodiac Aug 5, 2021, 2:50 PM

#

Thank you, I will look into those documents. The reason I'm doing this is:
in the fall data that I have, I noticed that the fall usually happens at the second smallest data point, and the next 14 data points after it have to be less than the 2nd smallest. Specifically, data points #2 up to #14 have to be between 0 and 20.

#

that's why I want to get the sequence of data then isolate it,

desert oar Aug 5, 2021, 3:32 PM

#

!eval @rigid zodiac if you really just want the overlapping tuples, you can do something like this

from itertools import islice

def infinite_windows(window_size):
    start = 0
    while True:
        yield tuple(range(start, start+window_size))
        start += 1

windows = list(islice(infinite_windows(5), 5))
print(windows)

arctic wedgeBOT Aug 5, 2021, 3:32 PM

#

@desert oar :white_check_mark: Your eval job has completed with return code 0.

[(0, 1, 2, 3, 4), (1, 2, 3, 4, 5), (2, 3, 4, 5, 6), (3, 4, 5, 6, 7), (4, 5, 6, 7, 8)]

desert oar Aug 5, 2021, 3:33 PM

#

but it sounds like you should probably be using DataFrame.rolling, i just don't really understand what you're trying to do

rigid zodiac Aug 5, 2021, 3:33 PM

#

desert oar but it sounds like you should probably be using `DataFrame.rolling`, i just don'...

Thank you, I will try both.

desert oar Aug 5, 2021, 3:35 PM

#

!eval @rigid zodiac you might also want your "windows" to be (start, stop) pairs, which you can use with df.iloc[start : stop]:

window_size = 5
n_windows = 5
window_bounds = [(start, start+window_size) for start in range(n_windows)]
print(window_bounds)

arctic wedgeBOT Aug 5, 2021, 3:35 PM

#

@desert oar :white_check_mark: Your eval job has completed with return code 0.

[(0, 5), (1, 6), (2, 7), (3, 8), (4, 9)]

desert oar Aug 5, 2021, 3:36 PM

#

!eval as opposed to the way i did it before:

window_size = 5
n_windows = 5
window_bounds = [(start, start+window_size) for start in range(n_windows)]
window_indices = [list(range(start, stop)) for start, stop in window_bounds]
print(window_indices)

arctic wedgeBOT Aug 5, 2021, 3:36 PM

#

@desert oar :white_check_mark: Your eval job has completed with return code 0.

[[0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7], [4, 5, 6, 7, 8]]

rigid zodiac Aug 5, 2021, 3:40 PM

#

desert oar !eval as opposed to the way i did it before: ```python window_size = 5 n_windows...

with this one I dont have to do the def right?

desert oar Aug 5, 2021, 3:42 PM

#

i'm just showing a couple different ways to do the same thing

#

it's a good exercise to figure out what these different versions do

odd patio Aug 5, 2021, 3:45 PM

#

Are there any other packages besides pyautogui and opencv?
I want a command like locateOnScreen in pyautogui

rigid zodiac Aug 5, 2021, 3:46 PM

#

desert oar it's a good exercise to figure out what these different versions do

i tried this but it wouldn't work with a DataFrame

odd patio Aug 5, 2021, 3:46 PM

#

So are there any other packages that do this?

desert oar Aug 5, 2021, 3:46 PM

#

!eval here's another one @rigid zodiac :

from collections.abc import Iterator
from itertools import count, islice

def infinite_windows(window_size: int) -> Iterator[tuple[int, int]]:
    for window_start in count():
        window_stop = window_start + window_size
        yield (window_start, window_stop)

def window_to_indices(window: tuple[int, int]) -> list[int]:
    start, stop = window
    return list(range(start, stop))

window_size = 5
n_windows = 5
windows = infinite_windows(window_size)
windows = islice(windows, n_windows)
windows = map(window_to_indices, windows)
windows = list(windows)
print(windows)

arctic wedgeBOT Aug 5, 2021, 3:46 PM

#

@desert oar :white_check_mark: Your eval job has completed with return code 0.

[[0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7], [4, 5, 6, 7, 8]]

desert oar Aug 5, 2021, 3:47 PM

#

rigid zodiac i tried this but it wouldn't work with a DataFrame

for a dataframe i told you that you probably want to use .rolling. but of course you can write a loop and use .iloc as well

rigid zodiac Aug 5, 2021, 3:48 PM

#

desert oar for a dataframe i told you that you _probably_ want to use `.rolling`. but of co...

I see it now, thank you so much

agile jolt Aug 5, 2021, 4:05 PM

#

hi, im new to jupyter notebook so i'm wondering what's wrong in here

#

#

#

solved it, nvm!

inland zephyr Aug 5, 2021, 4:35 PM

#

hello, I need a suggestion about this case. When I run my CNN model, the training loss decreases, but the validation loss rises every epoch. I'm using 38 samples for training and 12 for validation. I know this is heavily overfitted.

```
Epoch 1/50
1/1 [==============================] - 5s 5s/step - loss: 0.7031 - accuracy: 0.4000 - val_loss: 6.6536 - val_accuracy: 0.5000
Epoch 2/50
1/1 [==============================] - 0s 260ms/step - loss: 0.0332 - accuracy: 1.0000 - val_loss: 15.3232 - val_accuracy: 0.5000
Epoch 3/50
1/1 [==============================] - 0s 239ms/step - loss: 6.5698e-04 - accuracy: 1.0000 - val_loss: 25.3745 - val_accuracy: 0.5000
Epoch 4/50
1/1 [==============================] - 0s 244ms/step - loss: 3.0219e-06 - accuracy: 1.0000 - val_loss: 36.1942 - val_accuracy: 0.5000
Epoch 5/50
1/1 [==============================] - 0s 262ms/step - loss: 0.0000e+00 - accuracy: 1.0000 - val_loss: 47.4359 - val_accuracy: 0.5000
Epoch 6/50
1/1 [==============================] - 0s 257ms/step - loss: 0.0000e+00 - accuracy: 1.0000 - val_loss: 58.8482 - val_accuracy: 0.5000
Epoch 7/50
1/1 [==============================] - 0s 255ms/step - loss: 0.0000e+00 - accuracy: 1.0000 - val_loss: 70.2395 - val_accuracy: 0.5000
Epoch 8/50
1/1 [==============================] - 0s 273ms/step - loss: 0.0000e+00 - accuracy: 1.0000 - val_loss: 81.4671 - val_accuracy: 0.5000
Epoch 9/50
1/1 [==============================] - 0s 252ms/step - loss: 0.0000e+00 - accuracy: 1.0000 - val_loss: 92.4239 - val_accuracy: 0.5000
Epoch 10/50
1/1 [==============================] - 0s 268ms/step - loss: 0.0000e+00 - accuracy: 1.0000 - val_loss: 103.0329 - val_accuracy: 0.5000
```

#

i cannot add more data because the dataset is limited (it's actually 1D data)

odd falcon Aug 5, 2021, 5:18 PM

#

can someone help me find documentation on implementing sentiment analysis using a CNN with PyTorch? 🥸

sterile prawn Aug 5, 2021, 5:26 PM

#

inland zephyr hello i need suggestion about this case. When i try to run my CNN model, the los...

this seems to be a case of chronic overfitting

#

if you can't add any more data

#

then this will keep happening

#

try using a model with fewer parameters

#

maybe a simple one-layer perceptron

#

the fewer parameters

#

the more info will be extracted from the data

#

or try finetuning a larger network

#

either way more data helps

fiery bane Aug 5, 2021, 5:36 PM

#

inland zephyr hello i need suggestion about this case. When i try to run my CNN model, the los...

maybe not use CNN?

inland zephyr Aug 5, 2021, 5:43 PM

#

nvm i have found the problem

#

one class has the same proportion of data as the other class (since it's binary classification), but the samples for that problematic class are repeated and identical due to a preprocessing issue

#

so even with 12 samples, it only ever shows 2 distinct samples because of the repetition

shrewd cradle Aug 5, 2021, 5:57 PM

#

Hello, any idea how to extract last date of the week from yyyyww format data?
for example: My dataset has 201501 and I need last date from 1st week of 2015 i.e. 4-1-2015
using python^
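
A minimal sketch, assuming the `yyyyww` values follow ISO week numbering (weeks run Monday to Sunday), which matches the example in the question: week 1 of 2015 ends on Sunday 4-1-2015. The function name `last_day_of_iso_week` is made up for illustration; `date.fromisocalendar` needs Python 3.8 or newer.

```python
from datetime import date

def last_day_of_iso_week(yyyyww: int) -> date:
    """Return the Sunday that ends the given ISO week, e.g. 201501."""
    year, week = divmod(yyyyww, 100)  # 201501 -> (2015, 1)
    # ISO weekday 7 is Sunday, the last day of an ISO week.
    return date.fromisocalendar(year, week, 7)

last_day = last_day_of_iso_week(201501)
print(last_day.strftime("%d-%m-%Y"))
```

If the dataset uses a different week convention (for example weeks starting on Sunday), the result will be off by a day or a week, so it is worth checking a few known rows first.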

thorny coral Aug 5, 2021, 6:25 PM

#

how hard is it to learn ML and AI

trim stream Aug 5, 2021, 6:34 PM

#

thorny coral how hard is it to learn ML and AI

that depends, what are you using it for?

odd falcon Aug 5, 2021, 6:34 PM

#

how to data text activate in CNN ?

earnest herald Aug 5, 2021, 6:47 PM

#

I'm getting this error:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: (128, 20, 20)
and I know how to fix it:

currentStates = currentStates.reshape(-1, *env.STATE_SHAPE) # env.STATE_SHAPE = (20, 20, 1)
currentQsList = self.model.predict(currentStates)

and currentStates.shape returns (128, 20, 20, 1) as expected so why is it saying Full shape received: (128, 20, 20)

earnest herald Aug 5, 2021, 7:42 PM

#

I'm convinced this is unsolvable

lapis sequoia Aug 5, 2021, 8:21 PM

#

hey anyone know a GOOD text to speech engine that sounds like a google home?

sterile prawn Aug 5, 2021, 8:39 PM

#

thorny coral how hard is it to learn ML and AI

hard

#

with work defo doable

#

but you need some math knowledge

#

some programming knowledge

#

a dash of natural aptitude doesn't hurt

#

and most of all persistence

#

lots and lots of persistence

desert oar Aug 5, 2021, 9:46 PM

#

@earnest herald are you running this in a notebook?

earnest herald Aug 5, 2021, 9:47 PM

#

desert oar <@!495971074320629770> are you running this in a notebook?

no, my own environment

desert oar Aug 5, 2021, 9:48 PM

#

what is self.model?

#

is it trying to do some batching or something?

#

i assume self.model is some sklearn-style wrapper around a keras model, but i have no idea what its predict method does

#

if you can share what libraries you are using and the full traceback, maybe someone can help

earnest herald Aug 5, 2021, 9:51 PM

#

I'm on my phone now, can we speak in DM?

desert oar Aug 5, 2021, 9:52 PM

#

i'd rather not

#

you can @ me when you get back to a computer

#

i'm on here most days

#

can't guarantee an answer though

earnest herald Aug 5, 2021, 9:53 PM

#

okay,

#

self.model is a keras Sequential model with 1 Conv2d layer and has input shape 20,20,1

#

the predict method takes a batch of states (captures of a game in array form) and returns the output layer's values

#

but as a batch since currentStates is a batch of states

desert oar Aug 5, 2021, 11:44 PM

#

Can you show the actual code

grizzled barn Aug 6, 2021, 3:08 AM

#

Was thinking of expanding my knowledge into the machine learning/Ai category. Anyone have any tips beforehand/stuff I should know?

royal crest Aug 6, 2021, 3:31 AM

#

linalg

honest venture Aug 6, 2021, 4:10 AM

#

grizzled barn Was thinking of expanding my knowledge into the machine learning/Ai category. An...

What do you want to work on?

rough otter Aug 6, 2021, 4:59 AM

#

im not very experienced with unsupervised learning-if input data is not labeled then how do you determine whether a model is accurate or not?

grand breach Aug 6, 2021, 8:10 AM

#

Is keras a wrapper around tensorflow

serene scaffold Aug 6, 2021, 8:11 AM

#

@grand breach yes

vital compass Aug 6, 2021, 8:27 AM

#

which modules are important to learn data science

serene scaffold Aug 6, 2021, 9:16 AM

#

vital compass which modules are important to learn data science

numpy: doing math, especially in large batches
pandas: manipulating tabular data
sklearn: lots of data science tools and models to work with
pytorch or tensorflow: deep learning stuff
matplotlib: data visualization

but focus on learning data science in general and doing projects. don't try to "learn libraries".

undone flare Aug 6, 2021, 9:27 AM

#

Can I share my kaggle notebook? If anyone could tell me how to improve it and what bad practices I should avoid doing?

#

It's a simple linear regression problem

serene scaffold Aug 6, 2021, 9:42 AM

#

undone flare Can I share my kaggle notebook? If anyone could tell me how to improve it and wh...

you can post the link here, yes

undone flare Aug 6, 2021, 9:43 AM

#

thanks, https://www.kaggle.com/arnavrangwani/post-test-prediction

serene scaffold Aug 6, 2021, 9:54 AM

#

undone flare thanks, https://www.kaggle.com/arnavrangwani/post-test-prediction

where does the part that you wrote start?

undone flare Aug 6, 2021, 9:54 AM

#

hmm?

serene scaffold Aug 6, 2021, 9:55 AM

#

or did you write the whole thing? I thought it was like a template or something

undone flare Aug 6, 2021, 9:55 AM

#

no no I wrote the whole thing

serene scaffold Aug 6, 2021, 9:55 AM

#

ahhh

#

def score(y_test, y_pred):
    """Helper function for evaluation metrics."""
    print(f"""Explained Variance: {explained_variance_score(y_test, y_pred) * 100:.2f}%
MAE: {round(mean_absolute_error(y_test, y_pred), 2):.2f}""")

I find this difficult to read

#

maybe make variables and then put those in the f string?
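
One way that suggestion could look, as a sketch only. The notebook presumably imports `explained_variance_score` and `mean_absolute_error` from `sklearn.metrics`; the two `_`-prefixed helpers here are hypothetical pure-Python stand-ins so the example runs on its own, and the return value is added only to make the function easy to check.

```python
from statistics import fmean, pvariance

def _explained_variance(y_true, y_pred):
    """Stand-in metric: 1 - Var(y_true - y_pred) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - pvariance(residuals) / pvariance(y_true)

def _mean_absolute_error(y_true, y_pred):
    """Stand-in metric: average absolute difference."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))

def score(y_test, y_pred):
    """Helper function for evaluation metrics."""
    # Compute each metric first, then format it on its own line.
    explained_pct = _explained_variance(y_test, y_pred) * 100
    mae = _mean_absolute_error(y_test, y_pred)
    print(f"Explained Variance: {explained_pct:.2f}%")
    print(f"MAE: {mae:.2f}")
    return explained_pct, mae

result = score([1, 2, 3, 4], [2, 3, 4, 5])
```

The `:.2f` format already rounds for display, so the extra `round(..., 2)` from the original is not needed.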

undone flare Aug 6, 2021, 9:58 AM

#

yea, will do that

serene scaffold Aug 6, 2021, 9:59 AM

#

For all the cells where you go over the value counts for each feature, it might be interesting to show both the counts and the percent share
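
A small sketch of showing counts and percent share side by side, on an invented toy column (the name `feature` and its values are made up): `value_counts(normalize=True)` returns fractions, which can be scaled to percentages.

```python
import pandas as pd

# Toy categorical column; the values are invented for illustration.
s = pd.Series(["a", "b", "a", "c", "a", "b"], name="feature")

counts = s.value_counts()
percent = s.value_counts(normalize=True).mul(100).round(2)

# Both Series are indexed by category, so they line up row by row.
summary = pd.DataFrame({"count": counts, "percent": percent})
print(summary)
```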

undone flare Aug 6, 2021, 10:00 AM

#

thanks, will do

serene scaffold Aug 6, 2021, 10:01 AM

#

pretty good, I think 👍

undone flare Aug 6, 2021, 10:01 AM

#

should I also consider the standard deviation or mse in this case?

vital compass Aug 6, 2021, 10:07 AM

#

serene scaffold * numpy: doing math, especially in large batches * pandas: manipulating tabular ...

Thanks

serene scaffold Aug 6, 2021, 11:09 AM

#

undone flare should I also consider the standard deviation or mse in this case?

I'm not really sure. Eventually my ignorance about stats is going to catch up with me.

rigid zodiac Aug 6, 2021, 11:11 AM

#

I need some help. I'm trying to create a loop like the following, but it just keeps running forever.

```py
for i in range(len(c)):
    if (c['ay'].iloc[i] == second_smallest(c['ay'])) and (c['ay'].iloc[i] < -20) and (c['az'].iloc[i] == second_smallest(c['az'])) and (c['az'].iloc[i] < -20):
        for j in range(1, len(c)):
            if (c['ay'].iloc[j] < abs(c['ay'].iloc[i])) and (c['az'].iloc[j] < abs(c['az'].iloc[i])): # frame 1 after minimum
                for k in range(2,len(c)):
                    if (abs(c['ay'].iloc[k]) < 10 ) and (abs(c['az'].iloc[k]) < 10): # frame 2
                        for n in range(3, len(c)):
                            if (abs(c['ay'].iloc[n]) < 10 ) and (abs(c['az'].iloc[n]) < 10):# frame 3 
                                for m in range(4, len(c)):
                                    if (abs(c['ay'].iloc[m]) < 10 ) and (abs(c['az'].iloc[m]) < 10):# frame 4 
                                        for b in range(5, len(c)):
                                            if (abs(c['ay'].iloc[b]) < 10 ) and (abs(c['az'].iloc[b]) < 10):# frame 5 
                                                for v in range(6, len(c)):
                                                    if (abs(c['ay'].iloc[v]) < 10 ) and (abs(c['az'].iloc[v]) < 10):# frame 6 
                                                        for h in range(7, len(c)):
                                                            if (abs(c['ay'].iloc[h]) < 10 ) and (abs(c['az'].iloc[h]) < 10):# frame 7 
                                                                c['cat'].iloc[h] = 1
```

serene scaffold Aug 6, 2021, 11:17 AM

#

what is this supposed to do?

winged stratus Aug 6, 2021, 11:17 AM

#

omg

serene scaffold Aug 6, 2021, 11:18 AM

#

there must be a better way

winged stratus Aug 6, 2021, 11:18 AM

#

please don't tell me you had to write this by hand

rigid zodiac Aug 6, 2021, 11:19 AM

#

i use copy and paste

#

well my logic is: if I find the second smallest, and the next number is less than the second smallest, and the subsequent 6 numbers range between 0 and 20, then the category is 1

#

but god forbid the pc didnt think like I do

#

like it will follow this logic i >j>k>n>m>b>v>h

rigid zodiac Aug 6, 2021, 11:24 AM

#

serene scaffold there must be a better way

I hope so cause I have absolutely no idea how to 😦

acoustic halo Aug 6, 2021, 11:31 AM

#

rigid zodiac I need some help. I trying to create a loop like the following, but it just keep...

are you sure its running forever, and not for a long time? what does len(c) evaluate to?

rigid zodiac Aug 6, 2021, 11:32 AM

#

c has 67 data

acoustic halo Aug 6, 2021, 11:32 AM

#

Because it looks like it runs O(n^8)

#

Yeah thats why lol
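
For a rough sense of scale, assuming the worst case where every nested condition passes and each of the 8 loops runs about `len(c)` times (the real loops start at slightly different offsets and only nest when the `if` above them is true, so this is an upper bound):

```python
n = 67       # len(c) from the chat
levels = 8   # nested for loops: i, j, k, n, m, b, v, h

upper_bound = n ** levels
print(f"{upper_bound:,} innermost iterations at most")
```

That is on the order of 4 × 10^14 steps, which is why it looks like it never finishes.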

rigid zodiac Aug 6, 2021, 11:32 AM

#

is there any better way to do this? I've been googling my ass off for like the entire week

acoustic halo Aug 6, 2021, 11:37 AM

#

What exactly is it doing?

rigid zodiac Aug 6, 2021, 11:40 AM

#

it will categorize whether object will fall or not

rigid zodiac Aug 6, 2021, 11:42 AM

#

acoustic halo What exactly is it doing?

so for the 1st line, I'm trying to say that if there exists a second smallest in y and z, and the one right after that is less than its absolute value, and the absolute values of the 6 frames after that are between 0 and 20, then cat = 1

#

Like this image

#

but in the y and z acceleration. I also have to add about 3 more conditions similar to it as a failsafe, because sometimes we don't have a second smallest, only a smallest

acoustic halo Aug 6, 2021, 11:53 AM

#

So let me see if I understand, you want to label data as 1 if it is a second smallest value, and the following 6 points don't go above 20?

rigid zodiac Aug 6, 2021, 11:54 AM

#

acoustic halo So let me see if I understand, you want to label data as 1 if it is a second sma...

Yeah, second smallest value, and the one right after it is smaller than the abs(second smallest) and the following 6 points don't go above 20?

acoustic halo Aug 6, 2021, 11:57 AM

#

Okay, well, your nested for loops are unnecessary because they repeat the same checks. For example, let's just look at a bit of it:

```py
    if (abs(c['ay'].iloc[m]) < 10 ) and (abs(c['az'].iloc[m]) < 10):# frame 4 
        for b in range(5, len(c)):
            if (abs(c['ay'].iloc[b]) < 10 ) and (abs(c['az'].iloc[b]) < 10):# frame 5 
```

Let's say m is 10 and b is 11. On the next m loop, it checks b = 11 again

#

You could just have a single for loop, starting at second_smallest, and ending at the number of points you want to check

#

In fact, give me a minute and I'll rewrite it to show you

#

```py
for i in range(len(c)):
    if (c['ay'].iloc[i] == second_smallest(c['ay'])) and (c['ay'].iloc[i] < -20) and (c['az'].iloc[i] == second_smallest(c['az'])) and (c['az'].iloc[i] < -20)
    and (c['ay'].iloc[i+1] < abs(c['ay'].iloc[i])) and (c['az'].iloc[i+1] < abs(c['az'].iloc[i])) # frame 1 after minimum
    and (abs(c['ay'].iloc[i+2]) < 10 ) and (abs(c['az'].iloc[i+2]) < 10) # frame 2
    and (abs(c['ay'].iloc[i+3]) < 10 ) and (abs(c['az'].iloc[i+3]) < 10)# frame 3 
    and (abs(c['ay'].iloc[i+4]) < 10 ) and (abs(c['az'].iloc[i+4]) < 10)# frame 4 
    and (abs(c['ay'].iloc[i+5]) < 10 ) and (abs(c['az'].iloc[i+5]) < 10)# frame 5 
    and (abs(c['ay'].iloc[i+6]) < 10 ) and (abs(c['az'].iloc[i+6]) < 10)# frame 6 
    and(abs(c['ay'].iloc[i+7]) < 10 ) and (abs(c['az'].iloc[i+7]) < 10):# frame 7 
        c['cat'].iloc[h] = 1
```

#

You can probably find a way to shorten the condition as well using another loop, but I'll leave that to you to figure out. This gets rid of the nested for loops

#

And I think should accomplish what you want to achieve

rigid zodiac Aug 6, 2021, 12:09 PM

#

thank you so much let me try it

acoustic halo Aug 6, 2021, 12:10 PM

#

then you can alter the giant condition with a for j in range 7 to shorten it

rigid zodiac Aug 6, 2021, 12:11 PM

#

acoustic halo then you can alter the giant condition with a for j in range 7 to shorten it

I got this

acoustic halo Aug 6, 2021, 12:11 PM

#

You probably need to fix the indentation / newlines

#

It may be that the comments are cutting up the condition

#

```py
c['cat'] = np.nan
for i in range(len(c)):
    if (c['ay'].iloc[i] == second_smallest(c['ay'])) and (c['ay'].iloc[i] < -20) and (c['az'].iloc[i] == second_smallest(c['az'])) and (c['az'].iloc[i] < -20) and (c['ay'].iloc[i+1] < abs(c['ay'].iloc[i])) and (c['az'].iloc[i+1] < abs(c['az'].iloc[i])) and (abs(c['ay'].iloc[i+2]) < 10 ) and (abs(c['az'].iloc[i+2]) < 10) and (abs(c['ay'].iloc[i+3]) < 10 ) and (abs(c['az'].iloc[i+3]) < 10) and (abs(c['ay'].iloc[i+4]) < 10 ) and (abs(c['az'].iloc[i+4]) < 10) and (abs(c['ay'].iloc[i+5]) < 10 ) and (abs(c['az'].iloc[i+5]) < 10) and (abs(c['ay'].iloc[i+6]) < 10 ) and (abs(c['az'].iloc[i+6]) < 10) and(abs(c['ay'].iloc[i+7]) < 10 ) and (abs(c['az'].iloc[i+7]) < 10):
        c['cat'].iloc[h] = 1
```

rigid zodiac Aug 6, 2021, 12:16 PM

#

acoustic halo ```c['cat'] = np.nan for i in range(len(c)): if (c['ay'].iloc[i] == second_s...

it works but it categorizes the wrong thing..

#

acoustic halo Aug 6, 2021, 12:17 PM

#

Which should be 1?

rigid zodiac Aug 6, 2021, 12:18 PM

#

the ay at -310

acoustic halo Aug 6, 2021, 12:21 PM

#

is -310 the second smallest in ay?

rigid zodiac Aug 6, 2021, 12:21 PM

#

yeah

#

acoustic halo Aug 6, 2021, 12:22 PM

#

Okay but look at the conditions

#

and (c['ay'].iloc[i+1] < abs(c['ay'].iloc[i])

#

c['ay'].iloc[i+1] is 207.523

#

Which is obviously bigger

rigid zodiac Aug 6, 2021, 12:24 PM

#

yeah that's why from the second line I switch it back to its absolute value and compare the rest

acoustic halo Aug 6, 2021, 12:24 PM

#

Yeah so you need to do the same here then

rigid zodiac Aug 6, 2021, 12:26 PM

#

it has to pair with the az in order to work

winged stratus Aug 6, 2021, 12:26 PM

#

looks like those `and`s could get some help from `all`

rigid zodiac Aug 6, 2021, 12:26 PM

#

because sometimes ay happens without az or ax, and the whole thing would be considered a fall

winged stratus Aug 6, 2021, 12:27 PM

#

so
```py
if all([
    c['ay'].iloc[i] == second_smallest(c['ay']),
    c['ay'].iloc[i] < -20,
    ...
]):
    ...
```

acoustic halo Aug 6, 2021, 12:27 PM

#

a loop would be better since the conditions are basically the same with incremented indices but yeah

winged stratus Aug 6, 2021, 12:27 PM

#

and put the c['ay'].iloc into a separate variable

#

ay = c['ay'].iloc
ay[i], ay[i+1], ...

#

this should also help

rigid zodiac Aug 6, 2021, 12:29 PM

#

so what will it look like?

winged stratus Aug 6, 2021, 12:29 PM

#

and as spagoose said, it should be a loop

winged stratus Aug 6, 2021, 12:29 PM

#

rigid zodiac so what will it look like?

wait a min

undone flare Aug 6, 2021, 12:30 PM

#

I can safely drop a few categorical rows which are null if I have a big data set right? and it doesn't remove any of the unique values

winged stratus Aug 6, 2021, 12:32 PM

#

undone flare I can safely drop a few categorical rows which are null if I have a big data set...

yeah,

rigid zodiac Aug 6, 2021, 12:36 PM

#

undone flare I can safely drop a few categorical rows which are null if I have a big data set...

you can either remove it or use KNN to approximate (impute) it

acoustic halo Aug 6, 2021, 12:36 PM

#

@rigid zodiac I realised the code i sent still has the h variable in it

#

So obviously you need to remove that

rigid zodiac Aug 6, 2021, 12:37 PM

#

acoustic halo <@!380930360407621646> I realised the code i sent still has the h variable in it

so just i?

acoustic halo Aug 6, 2021, 12:37 PM

#

i+7 no?

#

Thats how you were doing it originally

#

```py
for i in range(len(c)):
    condition_name = all([c['ay'].iloc[i] == second_smallest(c['ay']),
    c['ay'].iloc[i] < -20,
    c['az'].iloc[i] == second_smallest(c['az']),
    c['az'].iloc[i] < -20])

    for j in range(1, 8):
        condition_name &= (c['ay'].iloc[i+j] < abs(c['ay'].iloc[i])) and (c['az'].iloc[i+j] < abs(c['az'].iloc[i]))

    if condition_name:
        c['cat'].iloc[i+7] = 1
```

#

But it could be i if thats where you wanted the label to be

rigid zodiac Aug 6, 2021, 12:40 PM

#

it says out of bounds.

#

the previous code that you sent works. I will see whether it works for the other data

#

```py
for i in range(len(c)):
    if (c['ay'].iloc[i] == second_smallest(c['ay'])) and (c['ay'].iloc[i] < -20) and (c['az'].iloc[i] == second_smallest(c['az'])) and (c['az'].iloc[i] < -20) and (c['ay'].iloc[i+1] < abs(c['ay'].iloc[i])) and (c['az'].iloc[i+1] < abs(c['az'].iloc[i])) and (abs(c['ay'].iloc[i+2]) < 10 ) and (abs(c['az'].iloc[i+2]) < 10) and (abs(c['ay'].iloc[i+3]) < 10 ) and (abs(c['az'].iloc[i+3]) < 10) and (abs(c['ay'].iloc[i+4]) < 10 ) and (abs(c['az'].iloc[i+4]) < 10) and (abs(c['ay'].iloc[i+5]) < 10 ) and (abs(c['az'].iloc[i+5]) < 10) and (abs(c['ay'].iloc[i+6]) < 10 ) and (abs(c['az'].iloc[i+6]) < 10) and(abs(c['ay'].iloc[i+7]) < 10 ) and (abs(c['az'].iloc[i+7]) < 10):
        c['cat'].iloc[i] = 1
```

acoustic halo Aug 6, 2021, 12:42 PM

#

I'm just trying to wing it in notepad, so theres bound to be errors, you should easily be able to resolve something like an out of bounds error though
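
A hedged sketch of how the single loop could avoid the out-of-bounds error: stop the outer loop 7 rows before the end so `i + 7` is always a valid position. It uses plain lists instead of the DataFrame `c` to stay self-contained. `second_smallest` is never shown in the chat, so the version here (second-smallest distinct value) is an assumption, and `label_falls` plus the toy readings are made up for illustration. The thresholds (-20, 10) and the 7-frame lookahead are taken from the posted code.

```python
def second_smallest(values):
    """Assumed helper: the second-smallest distinct value."""
    distinct = sorted(set(values))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct values")
    return distinct[1]

def label_falls(ay, az, n_after=7, dip=-20, quiet=10):
    """Return a list with 1 at fall positions and None elsewhere."""
    labels = [None] * len(ay)
    # Compute the targets once instead of on every iteration.
    ay_target = second_smallest(ay)
    az_target = second_smallest(az)
    # Stop early so that i + n_after never runs past the end.
    for i in range(len(ay) - n_after):
        is_dip = (ay[i] == ay_target and ay[i] < dip
                  and az[i] == az_target and az[i] < dip)
        if not is_dip:
            continue
        # Frame 1 after the dip: below the dip's absolute value.
        frame1_ok = ay[i + 1] < abs(ay[i]) and az[i + 1] < abs(az[i])
        # Frames 2..n_after: both axes stay within the quiet band.
        rest_ok = all(abs(ay[i + j]) < quiet and abs(az[i + j]) < quiet
                      for j in range(2, n_after + 1))
        if frame1_ok and rest_ok:
            labels[i] = 1
    return labels

# Invented toy readings: the dip is at position 3.
ay = [0, -400, 5, -310, 200, 1, 2, 3, 4, 5, 6, 0]
az = [0, -500, 3, -300, 100, 1, 1, 1, 1, 1, 1, 0]
labels = label_falls(ay, az)
print(labels)
```

With a DataFrame, the same idea would be `for i in range(len(c) - 7):` with the `.iloc` lookups kept as they are.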

undone flare Aug 6, 2021, 12:51 PM

#

rigid zodiac you can either remove it or use KNN approximate it

eh can't bother just for a few columns

elfin storm Aug 6, 2021, 12:53 PM

#

i have to make a system map on good health and well being i want some idea or examples

rigid zodiac Aug 6, 2021, 12:54 PM

#

undone flare eh can't bother just for a few columns

i'm not quite sure what you mean

rigid zodiac Aug 6, 2021, 12:56 PM

#

acoustic halo I'm just trying to wing it in notepad, so theres bound to be errors, you should ...

do i have to add a break behind cat? In order to jump to the elif?

undone flare Aug 6, 2021, 12:56 PM

#

rigid zodiac I'm not quite sure what you mean

rows sorry

acoustic halo Aug 6, 2021, 12:58 PM

#

You don't have an elif?

rigid zodiac Aug 6, 2021, 12:58 PM

#

undone flare rows sorry

Well, my ultimate goal is just to identify the initial fall. So I think that if all of the conditions are satisfied, the ML algorithm will say the object fell

rigid zodiac Aug 6, 2021, 12:59 PM

#

acoustic halo You don't have an elif?

I do, I'm just saying if I want to add the elif with it, should I put the break beneath c['cat'].iloc[i] = 1?

acoustic halo Aug 6, 2021, 1:00 PM

#

Depends on what you're trying to do exactly; a break will exit the for loop

rigid zodiac Aug 6, 2021, 1:01 PM

#

Well, because sometimes the fall happens at the second-smallest value. Occasionally, it happens at the smallest

#

So I have

```python
c['cat'] = np.nan
for i in range(len(c)):
    if (c['ay'].iloc[i] == second_smallest(c['ay'])) and (c['ay'].iloc[i] < -20) and (c['az'].iloc[i] == second_smallest(c['az'])) and (c['az'].iloc[i] < -20) and (c['ay'].iloc[i+1] < abs(c['ay'].iloc[i])) and (c['az'].iloc[i+1] < abs(c['az'].iloc[i])) and (abs(c['ay'].iloc[i+2]) < 10 ) and (abs(c['az'].iloc[i+2]) < 10) and (abs(c['ay'].iloc[i+3]) < 10 ) and (abs(c['az'].iloc[i+3]) < 10) and (abs(c['ay'].iloc[i+4]) < 10 ) and (abs(c['az'].iloc[i+4]) < 10) and (abs(c['ay'].iloc[i+5]) < 10 ) and (abs(c['az'].iloc[i+5]) < 10) and (abs(c['ay'].iloc[i+6]) < 10 ) and (abs(c['az'].iloc[i+6]) < 10) and(abs(c['ay'].iloc[i+7]) < 10 ) and (abs(c['az'].iloc[i+7]) < 10):
        c['cat'].iloc[i] = 1

    elif (c['ay'].iloc[i] == min(c['ay'])) and (c['ay'].iloc[i] < -20) and (c['az'].iloc[i] == min(c['az'])) and (c['az'].iloc[i] < -20) and (c['ay'].iloc[i+1] < abs(c['ay'].iloc[i])) and (c['az'].iloc[i+1] < abs(c['az'].iloc[i])) and (abs(c['ay'].iloc[i+2]) < 10 ) and (abs(c['az'].iloc[i+2]) < 10) and (abs(c['ay'].iloc[i+3]) < 10 ) and (abs(c['az'].iloc[i+3]) < 10) and (abs(c['ay'].iloc[i+4]) < 10 ) and (abs(c['az'].iloc[i+4]) < 10) and (abs(c['ay'].iloc[i+5]) < 10 ) and (abs(c['az'].iloc[i+5]) < 10) and (abs(c['ay'].iloc[i+6]) < 10 ) and (abs(c['az'].iloc[i+6]) < 10) and(abs(c['ay'].iloc[i+7]) < 10 ) and (abs(c['az'].iloc[i+7]) < 10):
        c['cat'].iloc[i] = 1
```

#

I was just wondering, do I need a "break" in order for that to jump to the elif?

acoustic halo Aug 6, 2021, 1:02 PM

#

no, no break needed for that
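
For anyone following along, a minimal sketch of the difference, using made-up numbers rather than the accelerometer data (the `values` list and the two rules are hypothetical): the `elif` branch is tried automatically whenever the `if` test is false, while `break` leaves the loop entirely.

```python
# Hypothetical readings, only here to show the control flow.
values = [5, -30, -25, 8, -40]

labels = []
for v in values:
    if v == -30:
        labels.append("first rule")
    elif v == -25:  # reached automatically whenever the `if` test is False
        labels.append("second rule")
    else:
        labels.append(None)

# `break` is different: it stops the whole loop at the first match.
seen = []
for v in values:
    if v < -20:
        seen.append(v)
        break  # no later rows are visited after this
```

So with `break` under the label assignment, only the first matching row would ever be labelled.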

#

but you have two conditions that have the same result sooo

chilly geyser Aug 6, 2021, 1:03 PM

#

Please refactor...

#

Why is the code like that?

rigid zodiac Aug 6, 2021, 1:03 PM

#

agh, so it will be (c['ay'].iloc[i] <= second_smallest(c['ay']))

acoustic halo Aug 6, 2021, 1:04 PM

#

Yeah you need to simplify the conditions, I only suggested that so you could see how I altered it from your original code

rigid zodiac Aug 6, 2021, 1:05 PM

#

chilly geyser Why is the code like that?

Because I'm trying to say: if a point is the second smallest and satisfies the conditions that I set, and the one right after it is < abs of the second smallest, and the next 7 data points are between 0 and 20... then it has to be a fall

chilly geyser Aug 6, 2021, 1:05 PM

#

um what

#

I don't understand

#

I pasted it into VSC and I still don't understand

rigid zodiac Aug 6, 2021, 1:06 PM

#

chilly geyser Why is the code like that?

Condition that: second smallest value, and the one right after it is smaller than the abs(second smallest) and the following 6 points don't go above 20?

#

wait, you mean my issue or yours lol

chilly geyser Aug 6, 2021, 1:06 PM

#

Right

#

This feels like one of those trend-change rules of thumb

acoustic halo Aug 6, 2021, 1:07 PM

#

it is basically

chilly geyser Aug 6, 2021, 1:08 PM

#

Would be better if you can show an example dataframe

acoustic halo Aug 6, 2021, 1:08 PM

#

But @rigid zodiac this simplifies what i said originally:

```python
for i in range(len(c) - 7):  # stop 7 rows early so i+j stays in bounds
    condition_name = all([
        c['ay'].iloc[i] == second_smallest(c['ay']),
        c['ay'].iloc[i] < -20,
        c['az'].iloc[i] == second_smallest(c['az']),
        c['az'].iloc[i] < -20,
    ])

    # the 7 rows after i must stay below the magnitude of row i
    for j in range(1, 8):
        condition_name &= (c['ay'].iloc[i+j] < abs(c['ay'].iloc[i])) and (c['az'].iloc[i+j] < abs(c['az'].iloc[i]))

    if condition_name:
        c['cat'].iloc[i] = 1
```

chilly geyser Aug 6, 2021, 1:08 PM

#

why is second_smallest a dataframe too

acoustic halo Aug 6, 2021, 1:08 PM

#

Then you just add any extra conditions to that

#

I would be tempted to remove the i loop as well, but idk exactly what your labelling rules are
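
A rough sketch of what dropping the i loop could look like, assuming the labelling rules are exactly the ones in the loop version above. The `second_smallest` definition, the `label_falls` name, its `window` and `threshold` parameters, and the sample frame are all assumptions for illustration, not the real helper or data.

```python
import numpy as np
import pandas as pd


def second_smallest(s: pd.Series) -> float:
    # Assumed definition: second-smallest distinct value in the column.
    return s.drop_duplicates().nsmallest(2).iloc[-1]


def label_falls(c: pd.DataFrame, window: int = 7, threshold: float = -20) -> pd.Series:
    # Conditions on row i itself.
    mask = (
        (c["ay"] == second_smallest(c["ay"]))
        & (c["az"] == second_smallest(c["az"]))
        & (c["ay"] < threshold)
        & (c["az"] < threshold)
    )
    # Conditions on rows i+1 .. i+window: shift(-j) lines row i+j up with row i.
    # Rows near the end compare against NaN, which is False, so no index error.
    for j in range(1, window + 1):
        mask &= c["ay"].shift(-j) < c["ay"].abs()
        mask &= c["az"].shift(-j) < c["az"].abs()
    # 1 where every condition holds, NaN elsewhere (like c['cat'] = np.nan).
    return pd.Series(np.where(mask, 1.0, np.nan), index=c.index)


# Hypothetical data: the second-smallest ay and az both sit at row 3.
c = pd.DataFrame({
    "ay": [0, -50, 1, -30, 2, 3, 1, 0, 2, 1, 3, 0],
    "az": [1, -60, 0, -35, 1, 2, 0, 1, 3, 2, 1, 0],
})
c["cat"] = label_falls(c)
```

Extra rules (for example the `min` alternative) would be OR-ed into `mask` with `|` before the shift loop.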

rigid zodiac Aug 6, 2021, 1:13 PM

#

acoustic halo But @rigid zodiac this simplifies what i said originally: c['cat']...

so to add the new condition in there i just need to create new condition_name?

acoustic halo Aug 6, 2021, 1:14 PM

#

condition_name is just the name of the variable I picked because I don't know what a label of 1 represents, you should rename it

#

But add it to the same variable with an OR, since the result is the same, the label 1

rigid zodiac Aug 6, 2021, 1:23 PM

#

acoustic halo But add it to the same variable with an OR, since the result is the same, the la...

I'm not quite sure what you mean there

acoustic halo Aug 6, 2021, 1:24 PM

#

Let's say you add your new alternate condition, the result is the same: c['cat'].iloc[i] = 1 if it's true

#

So why make a new condition, just make the original => original condition OR new condition

#

You can make a new condition if you want and it's easier for you to read, the end result is the same, but it's less code otherwise

rigid zodiac Aug 6, 2021, 1:30 PM

#

acoustic halo You can make a new condition if you want and it's easier for you to read, the end...

idk how to, to be honest

acoustic halo Aug 6, 2021, 1:36 PM

#

@rigid zodiac like this for example:

```python
for i in range(len(c) - 7):  # stop 7 rows early so i+j stays in bounds
    condition_name = all([
        # either both columns hit their second smallest, or both hit their minimum
        (c['ay'].iloc[i] == second_smallest(c['ay']) and c['az'].iloc[i] == second_smallest(c['az'])) or (c['ay'].iloc[i] == min(c['ay']) and c['az'].iloc[i] == min(c['az'])),
        c['ay'].iloc[i] < -20,
        c['az'].iloc[i] < -20,
    ])

    # the 7 rows after i must stay below the magnitude of row i
    for j in range(1, 8):
        condition_name &= (c['ay'].iloc[i+j] < abs(c['ay'].iloc[i])) and (c['az'].iloc[i+j] < abs(c['az'].iloc[i]))

    if condition_name:
        c['cat'].iloc[i] = 1
```

rigid zodiac Aug 6, 2021, 1:40 PM

#

So that's for ay and az, right? What if I want to add a second condition for ay and ax within that?

acoustic halo Aug 6, 2021, 1:41 PM

#

Yeah, but like I said, if you are not confident in doing it that way, do it however is easiest for you to understand

rigid zodiac Aug 6, 2021, 2:28 PM

#

acoustic halo Yeah, but like I said, if you are not confident in doing it that way, do it howe...

thank you so so much, you saved me from loop hell

real wigeon Aug 6, 2021, 2:48 PM

#

hello, I tried to google this but am having issues

#

im trying to use sqlalchemy to query my db

#

im just trying to get all the values for a specific column

undone flare Aug 6, 2021, 2:52 PM

#

```python
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=42)
model = RidgeCV(alphas=np.arange(0, 1, 0.01), cv=cv, scoring='neg_mean_absolute_error')
model.fit(X_train, y_train)
print('alpha: %f' % model.alpha_)
```
any better way cuz this takes way too long

#

I guess my mistake was trying a step of 0.01

chilly geyser Aug 6, 2021, 2:53 PM

#

Yeah that's 100 cases

#

You could probably throw it to some online host
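
A cheaper search is also possible locally. The snippet below is only a sketch: the synthetic `make_regression` data stands in for the real `X_train` / `y_train`, and the 13-point log grid and the 5x2 fold setup are arbitrary choices. The idea is that 100 alphas times 30 folds is about 3000 fits, while a coarse log-spaced grid with fewer folds is about 130.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import RepeatedKFold

# Synthetic stand-in for X_train / y_train (hypothetical data).
X_train, y_train = make_regression(
    n_samples=200, n_features=10, noise=10.0, random_state=42
)

# 13 log-spaced candidates instead of 100 linearly spaced ones.
alphas = np.logspace(-3, 3, 13)

# 5 splits x 2 repeats = 10 fits per alpha, versus 10 x 3 = 30 before.
cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=42)
model = RidgeCV(alphas=alphas, cv=cv, scoring="neg_mean_absolute_error")
model.fit(X_train, y_train)
print("alpha: %f" % model.alpha_)
```

Once the coarse grid picks a region, a second, finer grid around `model.alpha_` can narrow it down. Leaving `cv` at its default of `None` is another option, since `RidgeCV` then uses its efficient leave-one-out scheme.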

undone flare Aug 6, 2021, 2:55 PM

#

I- great

#data-science-and-ml

there is nothing against it, but it is really screechy