#data-science-and-ml | Python | Page 307

velvet thorn Apr 22, 2021, 11:47 PM

#

but generally, that is true

lapis sequoia Apr 22, 2021, 11:47 PM

#

So if I set my unit=32

#

I'm guessing we have 32 different combinations of bs and cs

#

Right?

velvet thorn Apr 22, 2021, 11:48 PM

#

subject to the above caveat

#

but

#

"combination" is not really

#

an appropriate word

#

or rather, it's ambiguous

#

but, yes, in essence you have 32 (w, b) tuple-equivalents

#

which are independent

lapis sequoia Apr 22, 2021, 11:49 PM

#

So what decides which 32 values of b,c you get?

velvet thorn Apr 22, 2021, 11:49 PM

#

this

#

subject to initial conditions

lapis sequoia Apr 22, 2021, 11:49 PM

#

what's backpropagation of error

velvet thorn Apr 22, 2021, 11:49 PM

#

hm

#

how did you learn about deep learning?

lapis sequoia Apr 22, 2021, 11:50 PM

#

kaggle

velvet thorn Apr 22, 2021, 11:50 PM

#

I would suggest

#

you pick up a book, or a video course, or something like that

#

it's important to have a theoretical foundation

lapis sequoia Apr 22, 2021, 11:50 PM

#

I already am

#

It's just a hole in my knowledge I need fixing

velvet thorn Apr 22, 2021, 11:50 PM

#

backpropagation is one of the most basic aspects of neural networks

#

basically

lapis sequoia Apr 22, 2021, 11:51 PM

#

Oh, the loss function?

velvet thorn Apr 22, 2021, 11:51 PM

#

it's the application of the chain rule, given the application of a loss function to a neural network's prediction vs ground truth, to successively update the weights (including bias) of preceding layers

#

the layer closest to the end is updated first

#

and then the weight updates are propagated backwards throughout the network
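A minimal NumPy sketch of the description above, assuming a tiny two-layer network (sigmoid hidden layer, linear output) trained with mean squared error. The sizes, names, and learning rate are illustrative and not from the conversation. The gradient for the output layer is computed first, then the error is pushed back to the hidden layer with the chain rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8 samples, 3 features, 1 target (illustrative only).
X = rng.normal(size=(8, 3))
y = rng.normal(size=(8, 1))

# Two layers: 3 -> 4 (sigmoid) -> 1 (linear).
W1, b1 = rng.normal(size=(3, 4)) * 0.5, np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)) * 0.5, np.zeros((1, 1))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(X):
    h = sigmoid(X @ W1 + b1)
    return h, h @ W2 + b2


def mse(pred, y):
    return float(np.mean((pred - y) ** 2))


lr = 0.05
losses = []
for _ in range(200):
    h, pred = forward(X)
    losses.append(mse(pred, y))

    # Chain rule, starting at the loss and moving backwards.
    d_pred = 2.0 * (pred - y) / len(X)       # dL/dpred
    dW2 = h.T @ d_pred                       # output-layer weights
    db2 = d_pred.sum(axis=0, keepdims=True)
    d_h = d_pred @ W2.T                      # error propagated to the hidden layer
    d_z1 = d_h * h * (1.0 - h)               # through the sigmoid
    dW1 = X.T @ d_z1                         # hidden-layer weights
    db1 = d_z1.sum(axis=0, keepdims=True)

    # Gradient-descent update of every weight and bias.
    W2 -= lr * dW2
    b2 -= lr * db2
    W1 -= lr * dW1
    b1 -= lr * db1

print(losses[0], losses[-1])
```

Frameworks such as Keras do this bookkeeping automatically; the sketch only shows where the chain rule enters.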

lapis sequoia Apr 22, 2021, 11:52 PM

#

So the number of layers is the number of times the gradient is calculated and applied?

velvet thorn Apr 22, 2021, 11:53 PM

#

hm

#

you can think of it that way

#

but that is not always true

#

because you work at a higher level of abstraction than that

#

sometimes layers may incorporate multiple such mathematical operations

#

each of which requires one backpropagation step

#

consider for example

#

RNNs

lapis sequoia Apr 22, 2021, 11:54 PM

#

Yeah I'm gunna go look up backpropagation a bit

#

I think I'm there with the individual components, I just dont have much intuition as to how it all ties together

#

Thanks 🙂

velvet thorn Apr 22, 2021, 11:55 PM

#

yw 👋

#

if you ever need it, .reset_index()

turbid drift Apr 23, 2021, 3:58 AM

#

Can someone suggest a good roadmap for deep learning? Thanks!

quasi sparrow Apr 23, 2021, 4:01 AM

#

Deep learning with Python is a good start

#

That’s the name of the book

stuck socket Apr 23, 2021, 5:08 AM

#

sup

#

watcha doin

near nymph Apr 23, 2021, 8:31 AM

#

Heyo, does anyone here know how to download a .json file from a html link and convert it into dataframe or csv format?

dense relic Apr 23, 2021, 9:14 AM

#

use requests library?
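A rough sketch of that suggestion, assuming the link returns a JSON list of records. `fetch_json_as_frame` and the sample payload are hypothetical names and data made up for illustration. `requests` is imported lazily so the conversion part can be tried offline.

```python
import json

import pandas as pd


def records_to_frame(payload):
    """Flatten parsed JSON (a list of dicts, possibly nested) into a DataFrame."""
    return pd.json_normalize(payload)


def fetch_json_as_frame(url):
    """Download JSON from `url` and convert it; needs the requests package."""
    import requests  # imported lazily so the rest works without it

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return records_to_frame(response.json())


# Offline demonstration with a hypothetical payload.
sample = json.loads(
    '[{"id": 1, "user": {"name": "a"}}, {"id": 2, "user": {"name": "b"}}]'
)
df = records_to_frame(sample)
csv_text = df.to_csv(index=False)
print(csv_text)
```

If the JSON is nested differently (for example, the records sit under a key), the payload would need to be indexed first, e.g. `records_to_frame(data["items"])`, where `"items"` is again only an assumed key.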

mint palm Apr 23, 2021, 10:50 AM

#

#

i know the derivation of the above two equation

#

but not able to derive for third one (marked with 2 arrows)

#

it is for one hidden layer NN

#

we are using sigmoid for first layer and tanh for output layer

#

this is cost function j

mint palm Apr 23, 2021, 11:16 AM

#

does someone know how to derive it

warm wharf Apr 23, 2021, 11:55 AM

#

hi im having trouble converting this architecture into code

#

#

```
(2) The convolutional layer is followed by a max pooling layer. The pooling is 2x2 with stride 2.
(3) After max pooling, the layer is connected to the next convolutional layer, with 64 output feature maps. The convolution kernels are of 5x5 in size. Use stride 1 for convolution. The activation is ReLU.
(4) The second convolutional layer is followed by a max pooling layer. The pooling is 2x2 with stride 2.
(5) After max pooling, the layer is connected to another convolutional layer, with 128 output feature maps. The convolution kernels are of 5x5 in size. Use stride 1 for convolution. The activation is ReLU.
(6) After convolutional layer, there is fully connected layer with 3072 nodes and ReLU activation function.
(7) The fully connected layer is followed by another fully connected layer with 2048 nodes and ReLU activation function, then connected to the last fully connected layer with 10 output nodes (corresponding to the 10 classes). Use the SoftMax activation for the last layer.
```

#

so far i have:

#

```py
keras.layers.Conv2D(64, (5, 5), 1, padding='same', activation='relu',
                    input_shape=(32, 32, 3)),
keras.layers.MaxPooling2D((2, 2), 2),
keras.layers.Conv2D(64, (5, 5), 2, padding='same', activation='relu',
                    input_shape=(32, 32, 3)),
keras.layers.MaxPooling2D((2, 2), 2),
keras.layers.Conv2D(64, (5, 5), 2, padding='same', activation='relu',
                    input_shape=(32, 32, 3)),
```

#

i don't know how to make a fully connected layer

#

or know if my input_shape arguments are correct

#

let me know 😎

#

@ me when you respond tysm

cobalt creek Apr 23, 2021, 12:06 PM

#

@warm wharf just add dense layers

#

before that flatten the result

warm wharf Apr 23, 2021, 12:07 PM

#

```py
model = keras.Sequential([
    keras.layers.Conv2D(64, (5, 5), 1, padding='same', activation='relu',
                        input_shape=(32, 32, 3)),
    keras.layers.MaxPooling2D((2, 2), 2),
    keras.layers.Conv2D(64, (5, 5), 2, padding='same', activation='relu',
                        input_shape=(32, 32, 3)),
    keras.layers.MaxPooling2D((2, 2), 2),
    keras.layers.Conv2D(64, (5, 5), 2, padding='same', activation='relu',
                        input_shape=(32, 32, 3)),
    keras.layers.Flatten(),
    keras.layers.Dense(3072, activation='relu'),
    keras.layers.Dense(2048, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
```

#

something like this?
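For comparison, a hedged sketch of how steps (2) to (7) of the pasted description might map onto Keras layers. Step (1) is missing from the paste, so the 32 filters in the first convolution are an assumption, as is `padding='same'` (the description does not mention padding). The differences from the attempt above: the input shape is declared once, both later convolutions use stride 1, and the third convolution has 128 filters.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),  # 32x32 RGB images; only declared once
    # Step (1) is not in the pasted text; 32 filters is an assumption.
    keras.layers.Conv2D(32, (5, 5), strides=1, padding='same', activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=2),            # (2)
    keras.layers.Conv2D(64, (5, 5), strides=1, padding='same', activation='relu'),   # (3)
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=2),            # (4)
    keras.layers.Conv2D(128, (5, 5), strides=1, padding='same', activation='relu'),  # (5)
    keras.layers.Flatten(),
    keras.layers.Dense(3072, activation='relu'),                       # (6)
    keras.layers.Dense(2048, activation='relu'),                       # (7)
    keras.layers.Dense(10, activation='softmax'),                      # (7) output
])
model.summary()
```

`model.summary()` prints the output shape of every layer, which is a quick way to check that the shapes follow the description before starting a long training run.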

cobalt creek Apr 23, 2021, 12:21 PM

#

#

why is this 2444 when i have 78200 images on dataset

arctic crown Apr 23, 2021, 12:30 PM

#

please help

#

#

but it doesn't go higher than 0.8
i have been training a hotword 5000 times

cobalt creek Apr 23, 2021, 12:39 PM

#

try changing hyperparameters

grave frost Apr 23, 2021, 12:45 PM

#

warm wharf

is that supposed to be VGG~ish?

mint palm Apr 23, 2021, 12:46 PM

#

maybe use bigger network

warm wharf Apr 23, 2021, 12:47 PM

#

grave frost is that supposed to be VGG~ish?

architecture description doesn't mention it, but it looks similar kinda

mint palm Apr 23, 2021, 12:48 PM

#

is it coursera

warm wharf Apr 23, 2021, 12:48 PM

#

uni class

grave frost Apr 23, 2021, 12:49 PM

#

@lapis sequoia You can search it up on StackOverflow, it's a hardware reason in GPU - batches in the power of 2 can be efficiently calculated by (4 CUDA cores in parallel?) in the end, it boils down to the GPU architecture and what Nvidia has adopted

warm wharf Apr 23, 2021, 12:49 PM

#


```py
keras.backend.clear_session()
model = keras.Sequential([
    keras.layers.Conv2D(64, (5, 5), 1, padding='same', activation='relu',
                        input_shape=(32, 32, 3)),
    keras.layers.MaxPooling2D((2, 2), 2),
    keras.layers.Conv2D(64, (5, 5), 2, padding='same', activation='relu',
                        input_shape=(32, 32, 3)),
    keras.layers.MaxPooling2D((2, 2), 2),
    keras.layers.Conv2D(64, (5, 5), 2, padding='same', activation='relu',
                        input_shape=(32, 32, 3)),
    keras.layers.Flatten(),
    keras.layers.Dense(3072, activation='relu'),
    keras.layers.Dense(2048, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

lr_schedule = keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-4 * 10**(epoch / 10))
optimizer = keras.optimizers.SGD(
    learning_rate=0.01, momentum=0.0, nesterov=False, name="SGD"
)

model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

#

i ended up setting up my model like this im not exactly sure if its correct but i really hope so cause training 20 epochs is taking forever even on colab

grave frost Apr 23, 2021, 12:49 PM

#

warm wharf `keras.backend.clear_session() model = keras.Sequential([` ...

what's the prob?

warm wharf Apr 23, 2021, 12:50 PM

#

im not exactly sure if i set it up correctly and wanted to make sure before i spent the time training the model

#

specifically the input shape param

#

i did it mostly looking at a kaggle notebook and kinda guessing

grave frost Apr 23, 2021, 12:51 PM

#

uni eh? what's the end aim? any baselines?

warm wharf Apr 23, 2021, 12:52 PM

#

its using svhn dataset the google street view house numbers

#

but its more a learning activity or something

#

first intro to NN

grave frost Apr 23, 2021, 12:52 PM

#

warm wharf first intro to NN

wow, that's not a first intro to NN

warm wharf Apr 23, 2021, 12:52 PM

#

requirement is to train the model and plot loss functions

grave frost Apr 23, 2021, 12:52 PM

#

first intro should be Dense architectures only

#

do you know how conv layers work?

warm wharf Apr 23, 2021, 12:53 PM

#

yeah they had a few modules on DL

#

this is the application project

grave frost Apr 23, 2021, 12:53 PM

#

so is your knowledge in CNN's fully fleshed out?

warm wharf Apr 23, 2021, 12:53 PM

#

not particularly, but that may be my fault i am a little behind

grave frost Apr 23, 2021, 12:54 PM

#

yea, I suggest you take things slow and learn the basics first

warm wharf Apr 23, 2021, 12:54 PM

#

i have a little experience with them cause i took the andrew ng DL coursera course a few years back but it has been a while

mint palm Apr 23, 2021, 12:55 PM

#

andrew ng is the goat

grave frost Apr 23, 2021, 12:55 PM

#

better learn DL from the ground up

#

Andrew NG's course is shit - it's just spoon feeding you code

warm wharf Apr 23, 2021, 12:55 PM

#

yeah i got that vibe when i took it

grave frost Apr 23, 2021, 12:55 PM

#

though I didn't complete it, so I might be biased

#

but learning NN's from the ground up is much better

mint palm Apr 23, 2021, 12:56 PM

#

mint palm

see this i aint feeding from spoon.......(he said you may not wonder how to derive cuz its complex......but i did wonder)

#

i think its good if you see deeper with the course side by side

warm wharf Apr 23, 2021, 12:57 PM

#

i don't mean to flood the channel but is this normal? ik its early in the training of the model but accuracy hasn't changed in 4 epochs but the loss is going down

grave frost Apr 23, 2021, 12:57 PM

#

mint palm i think its good if you see deeper with the course side by side

I don't see much from ground up skimming over his syllabus - and one slide does not represent his entire course

grave frost Apr 23, 2021, 12:59 PM

#

warm wharf i don't mean to flood the channel but is this normal? ik its early in the traini...

it's just starting to overfit

warm wharf Apr 23, 2021, 12:59 PM

#

o not good

grave frost Apr 23, 2021, 12:59 PM

#

run it over a few more epochs to see whether val accuracy increases

mint palm Apr 23, 2021, 1:01 PM

#

wait can we tell overfitting just by this?

warm wharf Apr 23, 2021, 1:02 PM

#

sorry for the stupid question but hows it even possible for the loss to decrease and the accuracy to remain the same? isn't the loss function measuring accuracy in a way by calculating error?

mint palm Apr 23, 2021, 1:03 PM

#

yeah

#

you're right
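On the question of how loss can fall while accuracy stays flat: cross-entropy looks at how much probability the true class gets, while accuracy only looks at which class has the highest probability. A small sketch with made-up predictions for three samples (the numbers are hypothetical):

```python
import numpy as np


def cross_entropy(probs, labels):
    # Mean negative log-probability assigned to the true class.
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels])))


def accuracy(probs, labels):
    # Fraction of samples whose most probable class is the true one.
    return float(np.mean(probs.argmax(axis=1) == labels))


labels = np.array([0, 1, 1])

# Hypothetical predictions at two points in training.
early = np.array([[0.40, 0.60], [0.45, 0.55], [0.60, 0.40]])
later = np.array([[0.49, 0.51], [0.10, 0.90], [0.55, 0.45]])

for name, probs in [("early", early), ("later", later)]:
    print(name, cross_entropy(probs, labels), accuracy(probs, labels))
```

In both snapshots only the second sample is classified correctly, so accuracy is unchanged, yet the loss drops because the probabilities moved towards the true classes without crossing the decision threshold.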

cobalt creek Apr 23, 2021, 1:04 PM

#

can someone help me🙄

arctic crown Apr 23, 2021, 1:04 PM

#

@cobalt creek can i dm you?

cobalt creek Apr 23, 2021, 1:05 PM

#

ye maybe

cobalt creek Apr 23, 2021, 1:06 PM

#

cobalt creek

can someone help with this

arctic crown Apr 23, 2021, 1:07 PM

#

cobalt creek try changing hyperparameters

how?

cobalt creek Apr 23, 2021, 1:09 PM

#

What model are u using

arctic crown Apr 23, 2021, 1:12 PM

#

ls hotword

cobalt creek Apr 23, 2021, 1:14 PM

#

I haven't used it, but roughly, hyperparameters are the values that affect accuracy when you change them

arctic crown Apr 23, 2021, 1:19 PM

#

hmm

#

can i send you the code?

#

@cobalt creek

cobalt creek Apr 23, 2021, 1:24 PM

#

can someone help me please

cobalt creek Apr 23, 2021, 1:24 PM

#

arctic crown @cobalt creek

how can i help anyway

#

why is this 2000 when the training partition is 65000... i'm confused, pls help

cobalt creek Apr 23, 2021, 1:48 PM

#

is there some default value of batch size, i just set it to 1, i have 65000 on the counter

#

exactly what i was expecting
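The screenshots are not part of the log, so this is an assumption: if the number shown was the Keras progress-bar counter, it counts batches (steps) per epoch, not images. `model.fit` uses `batch_size=32` when none is given for array inputs, which would reproduce the numbers mentioned:

```python
import math


def steps_per_epoch(num_samples, batch_size=32):
    # One step per batch; the last batch may be smaller, hence the ceiling.
    return math.ceil(num_samples / batch_size)


print(steps_per_epoch(78200))      # 2444
print(steps_per_epoch(65000))      # 2032
print(steps_per_epoch(65000, 1))   # 65000
```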

short heart Apr 23, 2021, 2:10 PM

#

Ive got RL algorithm to choose from buying,selling or holding things. How do I prevent it from choosing actions like buying when it has no money, or selling when it doesnt have anything? Cause it can choose these things for a lot of iterations and gets 0 reward, which breaks everything I assume
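One common approach to this is action masking: compute which actions are legal in the current state and exclude the rest before choosing. A minimal sketch, assuming a discrete buy/sell/hold action space and hypothetical state fields `cash`, `holdings`, and `price` (none of these names come from the question):

```python
import numpy as np

BUY, SELL, HOLD = 0, 1, 2


def valid_action_mask(cash, holdings, price):
    """True where an action is allowed in the current state."""
    return np.array([cash >= price, holdings > 0, True])


def masked_greedy_action(q_values, mask):
    # Invalid actions get -inf so argmax can never select them.
    return int(np.argmax(np.where(mask, q_values, -np.inf)))


def masked_sample(logits, mask, rng):
    # Softmax over valid actions only; invalid ones get probability 0.
    masked = np.where(mask, logits, -np.inf)
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))


q = np.array([5.0, 1.0, 0.5])            # the agent "wants" to buy
mask = valid_action_mask(cash=0.0, holdings=3, price=10.0)
print(masked_greedy_action(q, mask))     # 1 (SELL), since BUY is masked out
```

HOLD is always valid here, so at least one action stays selectable in every state.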

sinful briar Apr 23, 2021, 2:19 PM

#

@drifting void this one

lapis sequoia Apr 23, 2021, 2:22 PM

#

If i have an image, and its mask, what operation do i need to apply the mask but leave the background white?

drifting void Apr 23, 2021, 2:39 PM

#

Sorry, my lab got destroyed and I couldn't get the sample data...
So my case is the following. I am generating a lot of data in a form I choose, last version is something like that:
[
'0001c06e32a85a5d92c9cb784ff6a492df1d0055',
'00088f45a8bc798ceb2b5a37505f787fad19d9af',
[89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99],
[350, 351, 352, 353, 354, 355, 356, 357, 99],
-9.5,
1.0
]

Since I have many of these, I do that in parallel and append to a file. I chose msgpack but now the file is so large that I cannot read it back...

#

I use Dask for other use case and it worked well with reading many parquet files. So maybe I should write in several parquet files instead of msgpack single file.
My question is how do you usually write big files and what do you use for reading and searching later on

arctic crown Apr 23, 2021, 3:05 PM

#

can someone please help me

#

i am making a personal assistant and i want to add a hotword in it

#

please help

wicked meadow Apr 23, 2021, 3:32 PM

#

Hey this is a fairly basic question I think. I was told to post in here.

I currently have pandas 0.20.3 installed. I want 0.24 or a newer version. I tried updating pandas but it apparently only sees 0.20.3 as the newest version.

Currently working on corporate servers so I can't download anything. Anybody know how to get the new version of pandas in my situation?

lapis sequoia Apr 23, 2021, 3:35 PM

#

If i have an image, and its mask, what operation do i need to apply the mask but leave the background white?
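A sketch of one way to do this with NumPy, assuming the image is an `(H, W, 3)` `uint8` array and the mask is an `(H, W)` array where nonzero means "keep". The function name is made up for illustration.

```python
import numpy as np


def apply_mask_white_background(image, mask):
    """Keep pixels where mask is nonzero; paint everything else white.

    image: (H, W, 3) uint8 array; mask: (H, W) array, nonzero = foreground.
    """
    keep = mask.astype(bool)[..., np.newaxis]   # (H, W, 1), broadcasts over channels
    return np.where(keep, image, 255).astype(image.dtype)


image = np.full((2, 2, 3), 10, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
result = apply_mask_white_background(image, mask)
print(result[..., 0])
```

A plain bitwise AND with the mask would leave the background black, which is why the background value is written explicitly here.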

tidal bough Apr 23, 2021, 3:39 PM

#

wicked meadow Hey this is a fairly basic question I think. I was told to post in here. I curr...

What Python version do you have?

wicked meadow Apr 23, 2021, 3:45 PM

#

tidal bough What Python version do you have?

3.7.4

tidal bough Apr 23, 2021, 3:45 PM

#

That's really strange, hmm

wicked meadow Apr 23, 2021, 3:46 PM

#

Probably just how it's all set up here

tidal bough Apr 23, 2021, 3:46 PM

#

Try updating pip, perhaps. python -m pip install --upgrade pip

#

I've had weird behaviour from old pips

wicked meadow Apr 23, 2021, 3:47 PM

#

Hmm that's giving me an error in the prompt. Says unable to get local issuer certificate

tidal bough Apr 23, 2021, 3:50 PM

#

try also pip install --upgrade pip, I guess, but that ends up badly for me sometimes

wicked meadow Apr 23, 2021, 3:51 PM

#

It gives me that same error

tidal bough Apr 23, 2021, 3:52 PM

#

hmm, weird

#

might want to open a help channel

#

something is wrong with your pip, possibly

wicked meadow Apr 23, 2021, 3:52 PM

#

#

Fyi

#

Okay I may have to do that

#

Thanks so far!

rough otter Apr 23, 2021, 4:10 PM

#

in a regression model, would you keep variables that have low correlation with the target variable?

lapis sequoia Apr 23, 2021, 4:12 PM

#

imagine my classifier classifies melons and water melons

#

how can i make it infer a melon colored with red as a melon and not a water melon?

tidal bough Apr 23, 2021, 4:13 PM

#

wait, what

#

watermelons are the ones with red insides, not melons

lapis sequoia Apr 23, 2021, 4:13 PM

#

thats what i mean

#

if i paint a melon with red, like, manually on photoshop

#

the cnn will think it is water melon, but it is actually a melon

#

or orange - lemon

tidal bough Apr 23, 2021, 4:14 PM

#

Well, include in the training set such trick examples.

lapis sequoia Apr 23, 2021, 4:14 PM

#

like my question is, how can i make it not rely that much on colors but on shape

supple turtle Apr 23, 2021, 4:14 PM

#

hey guys i wrote my first blog. It would be great if you check it out https://www.analyticsvidhya.com/blog/2021/04/exploratory-analysis-using-univariate-bivariate-and-multivariate-analysis-techniques/

Analytics Vidhya

khushis

Exploratory Analysis | Univariate, Bivariate, and Multivariate Anal...

Exploratory Analysis is the preliminary analysis of data to discover relationships between measures in the data and to gain an insight

lapis sequoia Apr 23, 2021, 4:15 PM

#

would it be good to randomly paint some images in the training set???

tidal bough Apr 23, 2021, 4:15 PM

#

Possibly, yeah

lapis sequoia Apr 23, 2021, 4:15 PM

#

like, idk, then maybe when he sees an orange, maybe it will think it is a lemon that has been painted

#

:/

#

how will it not mess up with real and fake?

tidal bough Apr 23, 2021, 4:17 PM

#

Include enough examples and eventually it will learn.

You could also just grayscale the image and so abandon matching on color entirely, but that might reduce accuracy on normal examples.
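Both ideas can be sketched in NumPy, assuming `(H, W, 3)` RGB arrays. The channel shuffle is only one crude stand-in for "painted" trick examples, chosen here for brevity; it is not something proposed in the conversation.

```python
import numpy as np


def to_grayscale(image):
    """Luma-weighted grayscale of an (H, W, 3) RGB array, kept as 3 channels."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = image.astype(np.float64) @ weights          # (H, W)
    return np.repeat(gray[..., np.newaxis], 3, axis=-1).astype(image.dtype)


def random_channel_shuffle(image, rng):
    """Crude colour augmentation: permute the RGB channels at random."""
    return image[..., rng.permutation(3)]


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
gray = to_grayscale(image)
shuffled = random_channel_shuffle(image, rng)
```

Grayscale removes colour as a cue entirely, while colour augmentation keeps it but makes it unreliable, so the network has to lean more on shape and texture.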

lapis sequoia Apr 23, 2021, 4:20 PM

#

#

i just made it right now

#

if u hadnt the original photos, it will be hard even for u to see what is a lemon and what is an orange

tidal bough Apr 23, 2021, 4:20 PM

#

I'd be fooled by that too, yeah

lapis sequoia Apr 23, 2021, 4:21 PM

#

so there is no actual way?

lapis sequoia Apr 23, 2021, 4:21 PM

#

tidal bough Include enough examples and eventually it will learn. You could also just grays...

and on black and white, since u have less data to analyze, u need more examples, right?

broken warren Apr 23, 2021, 4:32 PM

#

Hey i'm trying to build an ai that predicts a 6. number to a given 4 number series. what is the best neural Net i could use for that? (I heard that RNN or specifically LSTM is good for the task)

lapis sequoia Apr 23, 2021, 5:00 PM

#

hello I need help with an AI in open cv someone knows about this topic

exotic maple Apr 23, 2021, 7:40 PM

#

rough otter in a regression model, would you keep variables that have low correlation with t...

that's a very circumstantial question. If you have no limitations in processing power, data collection, etc, would you keep it all?

Feature engineering / selection is practically a field of its own within ML / AI, so I dont think its a trivial question

dapper halo Apr 23, 2021, 7:49 PM

#

Could anyone direct me to information on feeding a bayesian network distributions as inputs? All I see are on using bayesian networks to produce a distribution as an output

bronze skiff Apr 23, 2021, 7:56 PM

#

to be fair, by "producing a distribution" we usually mean "parameters of a predefined family of distributions"

#

so you can also say that your inputs follow a parameterized family and feed those in

distant trout Apr 23, 2021, 11:59 PM

#

Hi guys how can i get 8 peak values at every charts? I have this values saved in txt file and numpy dataFrame

lapis sequoia Apr 24, 2021, 12:03 AM

#

How many times do neurons get backpropagated in a neural network?

fleet vault Apr 24, 2021, 1:54 AM

#

im not sure if a beautifulsoup question belongs here, but #help-carrot

serene scaffold Apr 24, 2021, 2:01 AM

#

fleet vault im not sure if a beautifulsoup question belongs here, but #help-carrot

that would be a #web-development question.

rare shell Apr 24, 2021, 2:01 AM

#

Hey guys! I had a quick Numpy question. If I have an array such as

[
[0], [1], 
[1, 0], 
[1, 1],
[1, 0, 0], 
[1, 0, 1], 
[1, 1, 0], 
[1, 1, 1]
]

And I wanted to populate the blank spaces in the matrix with any number, let's say 8, so it becomes an 8 by 3 matrix such as

[
[0, 8, 8],  
[1, 8, 8], 
[1, 0, 8], 
[1, 1, 8], 
[1, 0, 0], 
[1, 0, 1], 
[1, 1, 0], 
[1, 1, 1]]

How would I do something like this?

serene scaffold Apr 24, 2021, 2:02 AM

#

rare shell Hey guys! I had a quick Numpy question. If I have an array such as [ [0],...

how is numpy letting you make a non-rectangular array like that?

rare shell Apr 24, 2021, 2:02 AM

#

Its not in the first place I have to specify dtype=object

velvet thorn Apr 24, 2021, 2:03 AM

#

serene scaffold how is numpy letting you make a non-rectangular array like that?

it's an array of lists

velvet thorn Apr 24, 2021, 2:04 AM

#

rare shell Hey guys! I had a quick Numpy question. If I have an array such as [ [0],...

why is your data like that?

#

anyway, this is probably what I would do

rare shell Apr 24, 2021, 2:05 AM

#

Well it's actually a step for solving a problem in a question on my CS assignment so yea

serene scaffold Apr 24, 2021, 2:05 AM

#

I would find another way to approach the problem so that you end up with nans instead of an improper matrix.

velvet thorn Apr 24, 2021, 2:05 AM

#

!e

import numpy as np

data = [
[0], [1], 
[1, 0], 
[1, 1],
[1, 0, 0], 
[1, 0, 1], 
[1, 1, 0], 
[1, 1, 1]
]

max_length = len(max(data, key=len))
repeated_element = 8

a = np.array([row + [repeated_element] * (max_length - len(row)) for row in data])
print(a)

arctic wedgeBOT Apr 24, 2021, 2:06 AM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

001 | [[0 8 8]
002 |  [1 8 8]
003 |  [1 0 8]
004 |  [1 1 8]
005 |  [1 0 0]
006 |  [1 0 1]
007 |  [1 1 0]
008 |  [1 1 1]]

rare shell Apr 24, 2021, 2:06 AM

#

wow

velvet thorn Apr 24, 2021, 2:07 AM

#

lapis sequoia How many times do neurons get backpropagated in a neural network?

neurons don't get backpropagated.

velvet thorn Apr 24, 2021, 2:07 AM

#

distant trout Hi guys how can i get 8 peak values at every charts? I have this values saved in...

define "peak"

lapis sequoia Apr 24, 2021, 2:07 AM

#

velvet thorn neurons don't get backpropagated.

Sorry, the weights

#

How many times do they get recalculated

rare shell Apr 24, 2021, 2:07 AM

#

velvet thorn neurons don't get backpropagated.

Thanks that helped 🙂

velvet thorn Apr 24, 2021, 2:08 AM

#

dapper halo Could anyone direct me to information on feeding a bayesian network distribution...

that would depend on the library

velvet thorn Apr 24, 2021, 2:08 AM

#

lapis sequoia Sorry, the weights

weights don't get backpropagated either...

#

...but if I understand the thrust of your question correctly

#

that would depend on the architecture of the network.

lapis sequoia Apr 24, 2021, 2:08 AM

#

is it defined in the model?

#

like in keras

velvet thorn Apr 24, 2021, 2:09 AM

#

lapis sequoia how can i make it infer a melon colored with red as a melon and not a water melo...

are you talking about resilience to adversarial examples?

distant trout Apr 24, 2021, 2:11 AM

#

velvet thorn define "peak"

I am solving this problem on help-cake, could u check it out? I am struggling with one thing

rare shell Apr 24, 2021, 2:12 AM

#

@velvet thorn In your code how would I replace the number 8 with something different?

#

Nm i got it

distant trout Apr 24, 2021, 2:26 AM

#

How can i check when a series of booleans like "True, True, True, True, False, False" is changing from True to False?

velvet thorn Apr 24, 2021, 3:05 AM

#

distant trout How can i check when a series of booleans like "True, True, True, True, False, False...

compare to a slice
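A short sketch of the slice comparison, assuming the booleans are in a NumPy array: compare every element with its successor by lining up `flags[:-1]` against `flags[1:]`.

```python
import numpy as np

flags = np.array([True, True, True, True, False, False])

# Element i is True when flags[i] is True and flags[i + 1] is False.
drops = flags[:-1] & ~flags[1:]
drop_positions = np.flatnonzero(drops) + 1   # index of the first False after a True

print(drop_positions)   # [4]
```

With a pandas Series the same comparison can be written with `.shift()`, though the NumPy form above avoids the missing value that shifting introduces at the edge.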

bright aurora Apr 24, 2021, 4:29 AM

#

Guys can you please help me with this question

#

https://www.reddit.com/r/learnmachinelearning/comments/mx3ud1/pytorch_lstm_sine_wave_prediction_using_adam_and/?utm_medium=android_app&utm_source=share


#

I'm struggling to implement a sine wave predictor using LSTM in pytorch. If someone can help me understand why it's not working

stuck socket Apr 24, 2021, 5:06 AM

#

wtf

#

how r ya guys

#

@bright aurora u there?

#

woao

#

highjshgsjdf

#

where did u learn pytorch=?

bright aurora Apr 24, 2021, 5:45 AM

#

stuck socket @bright aurora u there?

What's up

whole mica Apr 24, 2021, 6:15 AM

#

What’s up guys?

prime vortex Apr 24, 2021, 8:14 AM

#

anyone free to look at my beginner data analysis code? I really appreciate the help

#

https://github.com/GervinFung/SomeDataAnalysis

lapis sequoia Apr 24, 2021, 9:43 AM

#

Would someone mind having a look at a short notebook where I'm toying with some data exploration/visualization? I wound up coming to almost the opposite conclusion I expected when I started and I'm wondering if I inverted something somewhere or made some really stupid mistake in sorting and filtering my data?

#

https://www.kaggle.com/cephalopoda/class-performance

balmy junco Apr 24, 2021, 11:15 AM

#

How does one typically use non-image data with image data when training models using pytorch? I have used pytorch for image classification, but never for image classification with non-image features. Any thoughts?

crisp wing Apr 24, 2021, 12:30 PM

#

Sorry, asked this in help section, but since I'm in deep dung, and the question probably was a bit too specific, I'd ask it here if ok:

I did SVD on some precentered data,

# done in python

# T: amount of samples with time, kind of our "variable" with this type of data
# X: data put inside a np.ndarray
# X.shape = (T=109, N_Lat*N_Lon=alot)
# X has mean ~ 0
U, s, V, = svd(X)

# mean ~ 0, std ~ var ~ 1
# but min ~ -2, max ~2.5
# retain three components
standardised_PCs = sqrt(T) * U[:, 0:3]

# Since standardised, I'd assume this would result in the correlation matrix, but...
standardised_PCs.T @ standardised_PCs
array([[ 1.09000000e+02,  1.45674989e-14, -8.57975238e-15],
       [ 1.45674989e-14,  1.09000000e+02,  2.23983947e-14],
       [-8.57975238e-15,  2.23983947e-14,  1.09000000e+02]])

The diagonals are equal to T rather than 1.
I feel like I misunderstand the approach or result somehow. Everywhere I look I feel they say you'd get a correlation matrix using standardised PCs

My reference for this approach is (eq. 16)
http://www.ehu.eus/eolo/pyclimate/downloads/matrix.pdf
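A possible reading of that result, sketched with random stand-in data (the sizes are hypothetical): the columns of `U` are orthonormal, so `sqrt(T) * U` has columns whose squared norm is `T`, and the plain product `PCs.T @ PCs` is `T` times the identity. Dividing by `T`, the usual sample normalisation, gives the identity, which is what a correlation matrix of uncorrelated, unit-variance series looks like.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 109, 500                       # hypothetical sizes
X = rng.normal(size=(T, N))
X -= X.mean(axis=0)                   # centre each column in time

U, s, Vt = np.linalg.svd(X, full_matrices=False)
pcs = np.sqrt(T) * U[:, :3]           # standardised PCs, first three

gram = pcs.T @ pcs                    # T on the diagonal, ~0 elsewhere
corr = gram / T                       # identity: unit variance, uncorrelated

print(np.round(np.diag(gram), 6))
print(np.round(corr, 6))
```

Whether the reference divides by `T` or `T - 1` is not checked here; the sketch only shows that the missing factor is the sample count.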

ripe forge Apr 24, 2021, 1:32 PM

#

lapis sequoia is it defined in the model?

Indirectly. How often a backpropagation is fired depends on your batch size, and your learning algorithm. So, you can pretend it happens once for every batch but there's exceptions too. Now that means number of epochs also affects it. And then finally you throw a gpu into the mix and it all goes to shit

lapis sequoia Apr 24, 2021, 1:49 PM

#

ripe forge Indirectly. How often a backpropagation is fired depends on your batch size, an...

Makes sense. Helps with the intuition anyway. Thanks.

lavish tundra Apr 24, 2021, 1:57 PM

#

i'm having a problem with chinese and korean words using seaborn+matplotlib, someone know how i can fix that?

crisp wing Apr 24, 2021, 2:30 PM

#

lavish tundra i'm having a problem with chinese and korean words using seaborn+matplotlib, som...

Can't help you directly, but found this:
https://stackoverflow.com/questions/58172176/python-seaborn-plot-shows-data-names-as-ㅁㅁㅁㅁ-how-can-i-fix-this

Stack Overflow

Python seaborn plot shows data names as ㅁㅁㅁㅁ. How can I fix this?

I am trying to draw out some graph using Python Seaborn, but it looks like it cannot read the data names. Data is in Korean.

So this happens...

import seaborn as sns
import matplotlib.pyplot as p...

lavish tundra Apr 24, 2021, 2:43 PM

#

ty

arctic wedgeBOT Apr 24, 2021, 3:03 PM

#

Command Help

!eval [code]
Can also use: e

*Run Python code and get the results.

This command supports multiple lines of code, including code wrapped inside a formatted code
block. Code can be re-evaluated by editing the original message within 10 seconds and
clicking the reaction that subsequently appears.

We've done our best to make this sandboxed, but do let us know if you manage to find an
issue with it!*

cobalt creek Apr 24, 2021, 4:54 PM

#

what do u guys prefer to save some ML model? i read about h5, pickle, YAML, json...which one should i prefer

tidal bough Apr 24, 2021, 5:35 PM

#

probably h5 or pickle, would prefer h5

#

storing giant arrays of numerical data in YAML or JSON is a crime against efficiency

#

like, how'd you encode them, as base64?
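To put a rough number on the efficiency point, a small sketch comparing a text encoding with a binary one for the same array. The array here is random stand-in data, not a real model, and `.npy` is used as the binary format only because it needs no extra packages.

```python
import io
import json

import numpy as np

weights = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)

# Text encoding: every float becomes a decimal string.
json_bytes = len(json.dumps(weights.tolist()).encode("utf-8"))

# Binary encoding: NumPy's .npy format, written to memory instead of disk.
buffer = io.BytesIO()
np.save(buffer, weights)
npy_bytes = buffer.getbuffer().nbytes

print(json_bytes, npy_bytes)
```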

terse hull Apr 24, 2021, 5:54 PM

#

is R better than python in datascience

#

?

exotic maple Apr 24, 2021, 5:55 PM

#

its more widespread and has a growing community, so yes

#

R is stale

terse hull Apr 24, 2021, 5:55 PM

#

oh

#

does R perform better than python though

#

like in terms of computing speed

#

i assume it would

#

considering python needs so much dependencies

ripe forge Apr 24, 2021, 6:01 PM

#

Bad assumption, you're assuming number of dependencies decides programming speed.

safe tapir Apr 24, 2021, 6:02 PM

#

Is there a go-to lib for a/b testing?

terse hull Apr 24, 2021, 6:24 PM

#

ripe forge Bad assumption, you're assuming number of dependencies decides programming speed...

i mean wouldnt numpy be slower compared to what it would be if it was directly a python thing

crisp wing Apr 24, 2021, 6:40 PM

#

I imagine most performance-driven stuff in python as well as r is basically a wrapper around lower level language functionality.

vague vector Apr 24, 2021, 6:40 PM

#

Hey guys, I need to make a Visualisation project in Tableau
I chose the London Underground, Bus and Overground usage data compared to daily covid cases. The data looks like this:

#

#

I need ideas for the visualisation
the tricky part is that I dont have dates, rather time periods. Date from to Date to. How can we handle it while visualizing?

crisp wing Apr 24, 2021, 6:40 PM

#

crisp wing I imagine most performance-driven stuff in python as well as r is basically a wr...

Numpy and dask are two examples for instance

vague vector Apr 24, 2021, 6:41 PM

#

Any Data Engineering and Visualisation expert here?

ripe forge Apr 24, 2021, 6:51 PM

#

terse hull i mean wouldnt numpy be slower compared to what it would be if it was directly...

No, numpy is actually going to be a lot faster than native python or R. (for context, numpy is written in C). This is kinda why the python ecosystem is so strong, you have python acting as glue language with heavy lifting written in low lvl languages. Otherwise python wouldn't be dominating the ds space right now

iron basalt Apr 24, 2021, 6:52 PM

#

It's a very fun read. I really like Jeff's story, need more people like him.

stuck socket Apr 24, 2021, 6:56 PM

#

guys, how can i add time series into my environment

#

???

iron basalt Apr 24, 2021, 7:07 PM

#

terse hull i mean wouldnt numpy be slower compared to what it would be if it was directly...

Both R and Python end up calling some C code (for datascience stuff), that C code probably involves calling BLAS/CBLAS (e.g. calling numpy) which will result in mostly identical speed. The overhead of Python and R for calling a C function may be different, but it's irrelevant to any data science task. For example, if python took say 0.2 milliseconds to call a C function which got the mean of 20000 data points and R did the same but with 0.1 milliseconds overhead, it would not matter since something like 99.9...% of the time is spent in the C function (actually computing the mean). So it's a micro-optimization at best. If you are worried about speed and want to get serious about it, consider learning C to make fast things. As it will probably result in you learning about the relative speeds of things in modern computing and the C community is more focused on such things while Python/R is focused on using the things made by those people to be productive without too much work (Systems programmers make the fast systems which Python and R programmers use for their specific use cases).

#

(People that know both Python and C are the engines of the Python community that let everyone be very productive with Python (and there are a lot of them -> python is very big / used everywhere))

grave frost Apr 24, 2021, 7:40 PM

#

iron basalt It's a very fun read. I really like Jeff's story, need more people like him.

yea, I was pretty taken aback when I learnt he was the palm guy. you wouldn't think someone in Neuroscience would dip into mobile phones/portables.

iron basalt Apr 24, 2021, 7:41 PM

#

grave frost yea, I was pretty taken aback when I learnt he was the palm guy. you wouldn't th...

Yeah it's what makes me trust his opinions much more, he has actual experience making things (software especially, a bunch of people in AI never actually programmed which is strange (Often lacking a grasp of computational complexity and such)).

grave frost Apr 24, 2021, 7:44 PM

#

iron basalt Yeah it's what makes me trust his opinions much more, he has actual experience m...

a bunch of people in AI never actually programmed which is strange

#

I wouldn't believe it lol. it's such a fundamental thing when working with heavy computation

iron basalt Apr 24, 2021, 7:46 PM

#

Yeah they put out a bunch of theory stuff (typically some crazy equations and such), but don't know that if such a thing could be computed it would be easy. One cannot ignore the physical reality of implementing an idea.

lapis sequoia Apr 24, 2021, 8:30 PM

#

hey y'all, could someone of you take pity on me and have a look at my problem that posted over here on r/learnpython? https://www.reddit.com/r/learnpython/comments/mxr8yw/merging_pandas_dataframes_how_can_i_split_my_big/

r/learnpython - Merging Pandas Dataframes // How can I split my big...

0 votes and 0 comments so far on Reddit

late shell Apr 24, 2021, 9:04 PM

#

Can someone please help me with sklearn.preprocessing.OneHotEncoder. I can't figure out how to use its categories parameter.

exotic maple Apr 24, 2021, 9:27 PM

#

late shell Can someone please help me with `sklearn.preprocessing.OneHotEncoder`. I can't f...

Categories is what the encoder "learns" after you instantiate it and fit it. That is

#

if your column had 3 options A,B,C those are going to be the categories the encoder is going to fit

#

I think you can also pass a list of your own categories, if you already have them or if you want to exclude unknowns
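
A small sketch of both cases, assuming a single column with the values A, B and C. The extra category "D" and the unknown value "Z" are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# One column, four rows.
X = np.array([["A"], ["C"], ["B"], ["A"]])

# Default (categories="auto"): the categories are learned from the data in fit().
auto_enc = OneHotEncoder()
auto_enc.fit(X)
print(auto_enc.categories_)

# Explicit categories: one list per column. With handle_unknown="ignore",
# a value outside the list is encoded as a row of zeros instead of raising.
fixed_enc = OneHotEncoder(categories=[["A", "B", "C", "D"]], handle_unknown="ignore")
encoded = fixed_enc.fit_transform(X).toarray()
unknown = fixed_enc.transform(np.array([["Z"]])).toarray()
print(encoded)
print(unknown)
```

Note that `categories` takes a list of lists (one inner list per column), which is an easy thing to trip over.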

lapis sequoia Apr 24, 2021, 10:21 PM

#

Hey ! I just followed the tutorial of Tech With Tim to create an AI playing Flappy Bird using the NEAT algorithm

#

everything works as intended and I now want to check if I understood correctly by coding a snake game

#

But I'm wondering about something : something the snake game has that flappy bird doesn't is collectibles

#

Basically If I have X snakes playing around at any given time but only one apple for them to eat, this will cause issues. My question is : should I give each snake its own Apple that other snakes can't eat ? In this case, should all the apples be at the same position (apple #1 will be at 54;60 for every snake then apple #2 at 100;100, etc) or will a random position for each snake work just fine ?

#

Thanks for your answers lol I'm only starting out with ML

grave frost Apr 24, 2021, 11:08 PM

#

lapis sequoia Thanks for your answers lol I'm only starting out with ML

you are starting ML with NEAT !??

lavish tundra Apr 25, 2021, 12:26 AM

#

can someone who understands Asian (CJK) fonts give me a hand?
i'm trying to set the Noto Sans CJK font family using seaborn

sns.set_style({'font.family':'NotoSansCJK-Medium.ttc'})

i tried this too

sns.set_style({'font.family':'Noto Sans CJK'})

idk if the problem is with the font or with the code
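
One thing worth checking, as a hedged sketch: `font.family` expects a family name that matplotlib has registered, not a file name such as `NotoSansCJK-Medium.ttc`, and Noto CJK families are often registered with a region suffix. The helper name `installed_families` is made up here, and whether any CJK font is found depends entirely on the machine.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
from matplotlib import font_manager


def installed_families(keyword):
    """Family names known to matplotlib that contain `keyword` (case-insensitive)."""
    return sorted(
        {
            entry.name
            for entry in font_manager.fontManager.ttflist
            if keyword.lower() in entry.name.lower()
        }
    )


cjk_families = installed_families("CJK")
print(cjk_families)

# Only switch the font if a matching family is actually installed.
if cjk_families:
    plt.rcParams["font.family"] = cjk_families[0]
```

If the list is empty, matplotlib has not picked up the font at all, which would point at the installation rather than the seaborn call.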

dapper halo Apr 25, 2021, 4:48 AM

#

Following along one of keras tutorials with my own data....really just trying to use datasets instead of a dataframe....but I keep getting the error that the model expects 3 inputs, but only receives 1 input tensor when it tries to fit the model to the dataset. Stackoverflow solution was to ensure that the second part of the tuple for the dataset needs to be the targets, which I have done....so not really sure what to do to resolve this. Anyone know how to resolve this?

`metal = 'N_SiII'
dataset = tf.data.Dataset.from_tensor_slices((Dataframe[['N_H','Redshift',metal]].values,Dataframe[['Metallicity','Density']].values))

def get_train_and_test_splits(dataset,train_size,batch_size=1):
    train_dataset = (dataset.take(train_size).shuffle(buffer_size=train_size).batch(batch_size))
    test_dataset = dataset.skip(train_size).batch(batch_size,drop_remainder=True)
    return train_dataset, test_dataset

def run_experiment(model, loss, train_dataset, test_dataset):
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=learning_rate),
        loss=loss,
        metrics=[keras.metrics.RootMeanSquaredError()],
    )

    model.fit(train_dataset, epochs=num_epochs, validation_data=test_dataset)

run_experiment(baseline_model, mse_loss, train_dataset, test_dataset)`

terse hull Apr 25, 2021, 5:09 AM

#

iron basalt Both R and Python end up calling some C code (for datascience stuff), that C cod...

ah alright

also woah you explained it very nicely 😳

somber prism Apr 25, 2021, 7:53 AM

#

anyone have a tip on what to learn/do after finishing 'ml for stanford' ?

inland isle Apr 25, 2021, 8:45 AM

#

what is data warehousing?

#

how to do it using python ?

lapis sequoia Apr 25, 2021, 8:46 AM

#

grave frost you are starting ML with NEAT !??

Yes, is that a bad way to start ? I think I understood the concepts

mint palm Apr 25, 2021, 10:10 AM

#

#

what are we actually doing in this L2 regularisation

#

i dont get the sigma part

#

are we squaring the numbers in weight parameter W and adding them?

grave frost Apr 25, 2021, 10:43 AM

#

oof, I just read about a "ML scientist" (not a Data Scientist) who doesn't know any aspect of DL or anything in NLP, CNN's etc. And is wondering why he got fired from his company

safe tapir Apr 25, 2021, 12:57 PM

#

Is there a way to use kwargs for the parameters in scipy.stats distributions? I can't find them in the docs:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html

Example: we are looking for parameters mu and sigma2 here
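
A minimal sketch of the keyword form. scipy.stats distributions take `loc` and `scale` as keyword arguments; for `norm`, `loc` is the mean and `scale` is the standard deviation, so a variance has to be square-rooted first. The values of `mu` and `sigma2` below are made up.

```python
from scipy import stats

mu = 2.0      # hypothetical mean
sigma2 = 9.0  # hypothetical variance

# "Frozen" distribution: parameters fixed once, then reused.
dist = stats.norm(loc=mu, scale=sigma2 ** 0.5)
print(dist.mean(), dist.var())

# The same keywords also work on the individual methods.
density_at_mean = stats.norm.pdf(mu, loc=mu, scale=sigma2 ** 0.5)
print(density_at_mean)
```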

velvet thorn Apr 25, 2021, 1:03 PM

#

mint palm

sigma?

#

you mean lambda?

#

or omega?

mint palm Apr 25, 2021, 1:03 PM

#

the lambda/2m term

velvet thorn Apr 25, 2021, 1:03 PM

#

anyway yes just take the sum of the squares of the weights

mint palm Apr 25, 2021, 1:03 PM

#

ok but what it does??

velvet thorn Apr 25, 2021, 1:04 PM

#

mint palm ok but what it does??

it applies a penalty that scales with the magnitudes of the weights

mint palm Apr 25, 2021, 1:04 PM

#

so the purpose is to decrease the overfitting right?

velvet thorn Apr 25, 2021, 1:06 PM

#

mint palm so the purpose is to decrease the overfitting right?

uh.

mint palm Apr 25, 2021, 1:06 PM

#

overfitting i saw it

velvet thorn Apr 25, 2021, 1:06 PM

#

that is what it is commonly used for, yes

mint palm Apr 25, 2021, 1:07 PM

#

if we are adding to term how is it penalising

velvet thorn Apr 25, 2021, 1:07 PM

#

mint palm if we are adding to term how is it penalising

that’s the loss function

#

higher = worse

mint palm Apr 25, 2021, 1:08 PM

#

oh so we are increasing loss function so that dW and db increase?

velvet thorn Apr 25, 2021, 1:08 PM

#

mint palm oh so we are increasing loss function so that ``dW`` and ``db`` increase?

sorry, didn’t understand that

mint palm Apr 25, 2021, 1:08 PM

#

i wanna know why are we adding it to loss function
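
A small numpy sketch of the term discussed above, assuming the penalty has the form (lambda / 2m) * sum(W^2) as in the lambda/2m message. The weights, `lam`, `m` and the data loss value are made up. The reason adding it "penalises": its gradient is (lambda / m) * W, so every gradient step also pulls each weight a little toward zero, and large weights are pulled hardest.

```python
import numpy as np


def l2_penalty(W, lam, m):
    """(lambda / 2m) * sum of squared weights."""
    return lam / (2 * m) * np.sum(W ** 2)


def l2_penalty_grad(W, lam, m):
    """Gradient of the penalty with respect to W: (lambda / m) * W."""
    return lam / m * W


W = np.array([[0.5, -2.0], [3.0, 0.1]])  # hypothetical weights
lam, m = 0.7, 10                          # hypothetical lambda and batch size
data_loss = 0.25                          # hypothetical unregularised loss

total_loss = data_loss + l2_penalty(W, lam, m)
extra_dW = l2_penalty_grad(W, lam, m)     # added to the usual dW

print(total_loss)
print(extra_dW)
```

The bias `b` is usually left out of the penalty, so `db` is unchanged in that setup.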

somber prism Apr 25, 2021, 2:05 PM

#

anyone know how to make it less ugly

#

how can i make them not overlap on each value

mint palm Apr 25, 2021, 2:08 PM

#

use alpha

muted oyster Apr 25, 2021, 2:08 PM

#

xticks yticks

mint palm Apr 25, 2021, 2:08 PM

#

@somber prism

muted oyster Apr 25, 2021, 2:09 PM

#

anyone knows how do we create a function like grid search

tidal bough Apr 25, 2021, 2:12 PM

#

Well, you can use numpy's linspace and meshgrid (or a similar method) to generate the sets of parameters, then you evaluate the function on all of them and pick the best results.
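
A minimal sketch of that idea, assuming two numeric parameters and a made-up `objective` to minimise; the ranges and grid size are arbitrary.

```python
import numpy as np


def objective(a, b):
    """Hypothetical function to minimise; its minimum is at a=1, b=-2."""
    return (a - 1.0) ** 2 + (b + 2.0) ** 2


# Candidate values for each parameter (step of 0.5 between -3 and 3).
a_values = np.linspace(-3, 3, 13)
b_values = np.linspace(-3, 3, 13)

# Every (a, b) pair, laid out as two 2D arrays of the same shape.
A, B = np.meshgrid(a_values, b_values)
scores = objective(A, B)

# Position of the lowest score, mapped back to the parameter values.
best = np.unravel_index(np.argmin(scores), scores.shape)
best_a, best_b = A[best], B[best]
print(best_a, best_b, scores[best])
```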

muted oyster Apr 25, 2021, 2:19 PM

#

i need to pass strings, trying with some for loops first, maybe a function is not necessary
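
For string parameters, nested for loops can be replaced with `itertools.product`. A sketch under assumptions: the parameter names, their values and the `evaluate` function are all hypothetical stand-ins for a real train-and-score step.

```python
from itertools import product

# Hypothetical string-valued parameters to search over.
param_grid = {
    "kernel": ["linear", "rbf"],
    "strategy": ["mean", "median"],
}


def evaluate(params):
    """Hypothetical score (higher is better); replace with real training/validation."""
    return len(params["kernel"]) + len(params["strategy"])


names = list(param_grid)
# One dict per combination, e.g. {"kernel": "linear", "strategy": "mean"}.
candidates = [dict(zip(names, combo)) for combo in product(*param_grid.values())]
best_params = max(candidates, key=evaluate)
print(best_params)
```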

somber prism Apr 25, 2021, 2:25 PM

#

mint palm use alpha

?

mint palm Apr 25, 2021, 3:41 PM

#

@somber prism nvm dont use alpha.......its for discriminating how dense the overlapping plot is...........instead you can use plt.setp

#

u can rotate it through an angle and it would be much clearer

#

something like this :

#

the usage is something like this:
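
The screenshots that followed are not preserved here. A minimal sketch of rotating tick labels with `plt.setp`, assuming a bar chart with long category labels; the data and the 45 degree angle are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

labels = [f"category_{i}" for i in range(12)]  # hypothetical long labels
values = list(range(12))

fig, ax = plt.subplots()
ax.bar(labels, values)

# Rotate every x tick label and right-align it so it ends under its tick.
plt.setp(ax.get_xticklabels(), rotation=45, ha="right")
fig.tight_layout()
```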

somber prism Apr 25, 2021, 4:13 PM

#

mint palm the usage is something like this:

ok

grave wasp Apr 25, 2021, 5:06 PM

#

I have a program with face recognition by adding my custom database with images. Reading the video and doing face recognition. Libs that i am using are cv2, face_recognition. My question is can i use sklearn for classification report?

dapper halo Apr 25, 2021, 5:35 PM

#

Doing a bayesian regression. I fit the model with separated dataframes x_train, y_train of shapes (samples,3).

Trying to look at output distributions instead of the deterministic values from model.predict(). So I feed the model x_test. Get error that x_test has no rank.
Tried converting the dataframe to a series oriented dictionary and it spits out the error "expected one input tensor and got 3".

Any suggestions on what I need to convert my testing dataframe to, to view the output distributions?

frozen marten Apr 25, 2021, 6:04 PM

#

within unet there is something called backbone_name parameter which takes resnet, vggnet.. unable to understand the fundamental difference between unet and (vggnet, resnet) ... Aren't the latter also models like unet?
base_model = Unet(backbone_name='resnet34', encoder_weights='imagenet')

#

anyone online??

#

help me out with this guys....

#

😩 😩

lapis sequoia Apr 25, 2021, 8:19 PM

#

velvet thorn are you talking about resilience to adversarial examples?

idk what am i talking about 😄 my model failed predicting something cuz it wasnt colored as it is supposed to be

#

hey

#

if I want to learn AI

#

but I don't exactly have the mathematical background to understand it all

#

but I still want to understand the math instead of just using pre-built pipelines and treating it as black box

#

where should I look

#

youtube

#

😄

#

BRUH oK thank you 😁

#

but seriously are there books or courses that could do the trick

#

u can look for something like maths under a neural network or something

#

and then building my own nn from scratch

lapis sequoia Apr 25, 2021, 8:21 PM

#

lapis sequoia and then building my own nn from scratch

ppl do this and implement everything on their own

#

h u h

#

alright that sounds like a good idea

lapis sequoia Apr 25, 2021, 8:21 PM

#

lapis sequoia but seriously are there books or courses that could do the trick

im pretty sure there are, but i know none

#

oK then anyways thanks again :)

iron basalt Apr 25, 2021, 8:22 PM

#

lapis sequoia oK then anyways thanks again :)

Pattern Recognition and Machine Learning by Bishop.

#

ISBN-10: 0387310738

#

You should know linear algebra and multivariate calculus for that book. There are tons of books for both of those things. For linear algebra try Linear Algebra Done Right. For multivariate calculus, idk, do whichever.

#

There are also links for the math in the pins.

lapis sequoia Apr 25, 2021, 8:46 PM

#

Hey ! I'm currently trying to apply the NEAT algorithm to a snake game I coded with python. For now I already have the "base" : I have a snake object with its own food, so I can spawn as many snakes as I want at once and each snake will only be able to eat its own food. Now, I'm wondering about which inputs I should give each snake for the algorithm to work
the obvious ones are easy : position of the food, position of the head, current direction and current length of the snake
but for it to be efficient, the snake should be able to know the position of each of its body parts for it to be able to avoid them properly
the problem is that the number of body parts can change, and from what I know, the number of inputs should be fixed. How should I proceed ?

#

I've seen this on the web but here the snake still does not have information about the position of its body

#

My goal is to make it learn to avoid self enclosing
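
One common way to keep the input size fixed, sketched here as an option and not as what the linked example does: instead of feeding every body segment, cast "rays" from the head in 8 directions and report how many steps away the nearest wall or body segment is. The grid size, coordinates and function name are assumptions.

```python
# 8 directions as (dx, dy): up, up-right, right, down-right, down, down-left, left, up-left.
DIRECTIONS = [(0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1)]


def obstacle_distances(head, body, grid_width, grid_height):
    """Return 8 numbers: steps from the head to the first wall or body cell per direction."""
    occupied = set(body)
    distances = []
    for dx, dy in DIRECTIONS:
        x, y = head
        steps = 0
        while True:
            x += dx
            y += dy
            steps += 1
            hit_wall = not (0 <= x < grid_width and 0 <= y < grid_height)
            if hit_wall or (x, y) in occupied:
                break
        distances.append(steps)
    return distances


head = (5, 5)
body = [(5, 6), (5, 7), (4, 7)]  # hypothetical body segments, any length
inputs = obstacle_distances(head, body, grid_width=10, grid_height=10)
print(inputs)
```

The output always has 8 values no matter how long the snake gets, so it can be concatenated with the food and direction inputs. It does not by itself guarantee the snake learns to avoid enclosing itself.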

merry frost Apr 25, 2021, 9:14 PM

#

any pandas experts in here willing to educate me?

exotic maple Apr 25, 2021, 9:39 PM

#

merry frost any pandas experts in here willing to educate me?

if you have a question ask it and if someone can help they most likely will

merry frost Apr 25, 2021, 9:40 PM

#

exotic maple if you have a question ask it and if someone can help they most likely will

Fair enough

tough surge Apr 25, 2021, 10:15 PM

#

Hey guys a question regarding anaconda..

#

I have initially installed anaconda on different drive and now I have reinstalled windows and deleted the .anaconda2 and .anaconda3 hidden folders inside AppData.

#

The problem is that now i dont know how to make it work with pycharm

#

Maybe i should create all the envs, one by one using the .yml files. But I cant find them inside the env's folder

serene scaffold Apr 25, 2021, 10:34 PM

#

merry frost any pandas experts in here willing to educate me?

What question do you have about pandas?

merry frost Apr 25, 2021, 10:46 PM

#

serene scaffold What question do you have about pandas?

I am trying to calculate the average sales across months. I have a pivot table created with pandas and if i was in excel i would use as sum if to aggregate each row but I am new to python any help would be greatly appreciated

velvet thorn Apr 25, 2021, 11:07 PM

#

merry frost I am trying to calculate the average sales across months. I have a pivot table c...

show your data

#

as text

#

also if you have a question, just ask it.

#

no need for a preface

velvet thorn Apr 25, 2021, 11:09 PM

#

lapis sequoia where should I look

you can try https://www.deeplearningbook.org/

merry frost Apr 25, 2021, 11:20 PM

#

velvet thorn show your data

The data is very large and I am afraid I lack the ability to reduce it to a smaller understandable form

velvet thorn Apr 25, 2021, 11:21 PM

#

merry frost The data is very large and I am afraid I lack the ability to reduce it to a smal...

show a subset of the data

#

and an expected result

#

otherwise it's hard to help you

merry frost Apr 25, 2021, 11:22 PM

#

velvet thorn you can try https://www.deeplearningbook.org/

I understand, that's why I was more looking to pick someone's brain so I can understand what happens to the dataframe when I apply a pandas function. I am 43 and teaching myself how to code

#

I can show you a screenshot of a pivot table in excel if that is helpful, i figured you wanted data you could manipulate

exotic maple Apr 25, 2021, 11:24 PM

#

merry frost I am trying to calculate the average sales across months. I have a pivot table c...

you can use aggregation in pandas too

#

for example

#

df.groupby(COLUMN).agg(function(s) to aggregate with)

velvet thorn Apr 25, 2021, 11:25 PM

#

merry frost I understand thats why I was more looking to pick someones brain so I can unders...

did you mean to reply to something else

exotic maple Apr 25, 2021, 11:25 PM

#

pivot table pretty much does the same, but I find groupby cleaner to read

velvet thorn Apr 25, 2021, 11:25 PM

#

merry frost I am trying to calculate the average sales across months. I have a pivot table c...

I'm not really sure what "as sum if" is

#

maybe you can give me an example

#

of what you want to do

#

and what the shape of your data is like

velvet thorn Apr 25, 2021, 11:25 PM

#

exotic maple pivot table pretty much does the same, but I find groupby cleaner to read

same

exotic maple Apr 25, 2021, 11:25 PM

#

SUMIF is an excel function that does SUM when the IF is true

velvet thorn Apr 25, 2021, 11:25 PM

#

uh

#

okay

#

so you mean like this?

merry frost Apr 25, 2021, 11:26 PM

#

you guys type faster than i can think lol

velvet thorn Apr 25, 2021, 11:26 PM

#

!e

import pandas as pd

s = pd.Series([2, 5, 4, 3, 8])  # data
evens = s[s % 2 == 0]  # keep only the even values
print(evens.sum())

arctic wedgeBOT Apr 25, 2021, 11:26 PM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

velvet thorn Apr 25, 2021, 11:26 PM

#

i.e. 2 + 4 + 8, since all of those are even

#

like that?

exotic maple Apr 25, 2021, 11:26 PM

#

velvet thorn !e ```py import pandas as pd s = pd.Series([2, 5, 4, 3, 8]) # data evens = s[s...

Pretty much yes

#

excel has it in a single function

velvet thorn Apr 25, 2021, 11:26 PM

#

merry frost you guys type faster than i can think lol

so there's no direct equivalent of SUMIF

exotic maple Apr 25, 2021, 11:26 PM

#

so he probably needs to make a custom function

velvet thorn Apr 25, 2021, 11:26 PM

#

but you can apply filters

#

which is what I did

#

s[s % 2 == 0] this is basically "get me the subset of the data where the remainder when divided by 2 is 0"

#

and you can apply any condition in the same way

#

even something much more complex
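
The same filter-then-sum pattern on a DataFrame, as a sketch of a SUMIF / SUMIFS equivalent. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical sales rows.
sales = pd.DataFrame(
    {
        "rep": ["ann", "bob", "ann", "cat", "bob"],
        "type": ["renewal", "new", "new", "new", "renewal"],
        "amount": [100, 250, 75, 300, 50],
    }
)

# Like SUMIF: sum "amount" where type is "new".
new_total = sales.loc[sales["type"] == "new", "amount"].sum()

# Like SUMIFS: combine conditions with & (and) or | (or), each in parentheses.
ann_new_total = sales.loc[
    (sales["type"] == "new") & (sales["rep"] == "ann"), "amount"
].sum()

print(new_total, ann_new_total)
```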

merry frost Apr 25, 2021, 11:27 PM

#

exotic maple so he probably needs to make a custom function

Thank you for reading my mind i would not be able to respond fast enough to be helpful

exotic maple Apr 25, 2021, 11:28 PM

#

merry frost Thank you for reading my mind i would not be able to respond fast enough to be h...

I'm self-taught as well, i've been there trying to replicate some excel functionalities lol

#

pandas > excel though

#

especially for much larger files

velvet thorn Apr 25, 2021, 11:28 PM

#

merry frost Thank you for reading my mind i would not be able to respond fast enough to be h...

again, it depends on what you want to do specifically

exotic maple Apr 25, 2021, 11:28 PM

#

@merry frost the best way to get help is to include a subset or visual sample of your data and what result you expect to see

merry frost Apr 25, 2021, 11:29 PM

#

I have sales data with 13 different revenue types and 700+ reps I need to use historical data to create a sales goal

exotic maple Apr 25, 2021, 11:30 PM

#

how is the data structured? the revenue types are columns and the reps rows?

#

so its a 700x13 table?

merry frost Apr 25, 2021, 11:30 PM

#

exotic maple @merry frost the best way to get help is to include a subset or visual...

stupid question can i post a photo here?

exotic maple Apr 25, 2021, 11:30 PM

#

its not a stupid question, and yes, but its not preferred.

#

I dont remember how people post their df's here thou lol

merry frost Apr 25, 2021, 11:31 PM

#

each row is an event they went to with the date of the event, the revenue type, rep name, number of new members.

exotic maple Apr 25, 2021, 11:31 PM

#

so for example if you want the total revenue per rep (regardless of date)

#

you can do

#

data.groupby("rep").agg(sum)

#

that will give you the rep, and the sum of each revenue (assuming they are columns)

velvet thorn Apr 25, 2021, 11:32 PM

#

merry frost stupid question can i post a photo here?

you can but

#

it's harder to read

#

so not everyone will

velvet thorn Apr 25, 2021, 11:33 PM

#

exotic maple data.groupby("rep").agg(sum)

you can just do .sum() btw

merry frost Apr 25, 2021, 11:33 PM

#

how would i then take that and get an average of each monthly total for each rep in each revenue type
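
A sketch of one way to do this, assuming the data is in the "one row per event" shape described above. The column names (`date`, `rep`, `type`, `new_members`) and the values are made up.

```python
import pandas as pd

# Hypothetical events: one row per event.
events = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2021-01-05", "2021-01-20", "2021-02-03", "2021-02-10", "2021-01-15", "2021-02-25"]
        ),
        "rep": ["ann", "ann", "ann", "ann", "bob", "bob"],
        "type": ["new", "new", "new", "new", "new", "new"],
        "new_members": [4, 6, 2, 4, 3, 5],
    }
)

# Step 1: total per rep, per revenue type, per calendar month.
monthly = (
    events.assign(month=events["date"].dt.to_period("M"))
    .groupby(["rep", "type", "month"])["new_members"]
    .sum()
)

# Step 2: average of those monthly totals for each rep and revenue type.
average_monthly = monthly.groupby(level=["rep", "type"]).mean()
print(average_monthly)
```

Months with no events for a rep do not appear in `monthly`, so they are not counted as zeros in the average; whether that is wanted depends on the goal.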

velvet thorn Apr 25, 2021, 11:33 PM

#

also I believe .agg(sum) would be slower than .agg('sum')

#

that's my guess though

exotic maple Apr 25, 2021, 11:33 PM

#

velvet thorn you can just do `.sum()` btw

Ik, but i still find that cleaner :p

exotic maple Apr 25, 2021, 11:33 PM

#

velvet thorn also I *believe* `.agg(sum)` would be slower than `.agg('sum')`

interesting. I would like to test it.

velvet thorn Apr 25, 2021, 11:34 PM

#

exotic maple interesting. I would like to test it.

my guess is that .agg(sum) would use the builtin sum

#

which sums as Python objects

#

-> slow

#

whereas .agg('sum') would use C summation

#

it might be special-cased

#

no idea
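
A quick way to test the guess, as a sketch only. The table size and repeat count are arbitrary, and the outcome depends on the pandas version: some versions map the builtin `sum` to the optimised implementation (and newer ones may warn about it), in which case both timings come out about the same.

```python
import timeit
import warnings

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "rep": rng.integers(0, 700, size=50_000),    # hypothetical rep ids
        "amount": rng.integers(0, 100, size=50_000),
    }
)
grouped = df.groupby("rep")["amount"]

with warnings.catch_warnings():
    # Some pandas versions emit a FutureWarning when given the builtin sum.
    warnings.simplefilter("ignore")
    builtin_result = grouped.agg(sum)
    builtin_time = timeit.timeit(lambda: grouped.agg(sum), number=5)

string_result = grouped.agg("sum")
string_time = timeit.timeit(lambda: grouped.agg("sum"), number=5)

print(f"agg(sum): {builtin_time:.4f}s, agg('sum'): {string_time:.4f}s")
```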

velvet thorn Apr 25, 2021, 11:34 PM

#

merry frost how would i then take that and get an average of each monthly total for each rep...

what is a revenue type

#

is it a column?

merry frost Apr 25, 2021, 11:34 PM

#

yes

exotic maple Apr 25, 2021, 11:34 PM

#

velvet thorn whereas `.agg('sum')` would use C summation

you could always pass np.sum, which is what i do haha

velvet thorn Apr 25, 2021, 11:34 PM

#

or is it a value in a column for each row

velvet thorn Apr 25, 2021, 11:35 PM

#

merry frost yes

df.groupby('rep').mean()

#

which is

#

take all the rows

#

group them by representative

#

then take the mean of all remaining columns

merry frost Apr 25, 2021, 11:35 PM

#

it took me this long but i cleaned the data, i hope this pastes ok

#

velvet thorn Apr 25, 2021, 11:36 PM

#

!e

import pandas as pd

df = pd.DataFrame([['a', 5, 8], ['b', 3, 6], ['a', 2, 7], ['b', 1, 7], ['a', 4, 3], ['c', 2, 6]], columns=['rep', 'type_a', 'type_b'])

print(df.groupby('rep').mean())

arctic wedgeBOT Apr 25, 2021, 11:36 PM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

001 |        type_a  type_b
002 | rep                  
003 | a    3.666667     6.0
004 | b    2.000000     6.5
005 | c    2.000000     6.0

velvet thorn Apr 25, 2021, 11:37 PM

#

!e

import pandas as pd

df = pd.DataFrame([['a', 5, 8], ['b', 3, 6], ['a', 2, 7], ['b', 1, 7], ['a', 4, 3], ['c', 2, 6]], columns=['rep', 'type_a', 'type_b'])
print(df)

arctic wedgeBOT Apr 25, 2021, 11:37 PM

#

@velvet thorn :white_check_mark: Your eval job has completed with return code 0.

001 |   rep  type_a  type_b
002 | 0   a       5       8
003 | 1   b       3       6
004 | 2   a       2       7
005 | 3   b       1       7
006 | 4   a       4       3
007 | 5   c       2       6

velvet thorn Apr 25, 2021, 11:37 PM

#

^ the original

velvet thorn Apr 25, 2021, 11:37 PM

#

merry frost it took me this long but i cleaned the data, i hope this pastes ok

revenue type is the first column?

merry frost Apr 25, 2021, 11:37 PM

#

no

velvet thorn Apr 25, 2021, 11:37 PM

#

then?

merry frost Apr 25, 2021, 11:38 PM

#

i think its a hash from the elastic search the data is exported from

velvet thorn Apr 25, 2021, 11:44 PM

#

huh

#

so what's the type

solid blaze Apr 26, 2021, 12:02 AM

#

Anyone have any idea of how to solve this? The wording is confusing to say the least. It doesn't help that I'm self taught.

merry frost Apr 26, 2021, 12:13 AM

#

velvet thorn so what's the type

sorry i dont think i understood your question. Type is not the first column; the first column is the hash i spoke of ('id'). if you are asking which column the type is in, that would be the 5th column

velvet thorn Apr 26, 2021, 1:08 AM

#

merry frost sorry i dont think i understood your question Type is not the first column the f...

but the first column is type

#

in your screenshot

merry frost Apr 26, 2021, 1:09 AM

#

Correct i cut 45 columns of useless information

lapis sequoia Apr 26, 2021, 1:10 AM

#

Do I need GPU for image recognition model training?

velvet thorn Apr 26, 2021, 1:14 AM

#

lapis sequoia Do I need GPU for image recognition model training?

not necessarily

velvet thorn Apr 26, 2021, 1:14 AM

#

merry frost Correct i cut 45 columns of useless information

yeah, I was taking reference from your screenshot

velvet thorn Apr 26, 2021, 1:14 AM

#

lapis sequoia Do I need GPU for image recognition model training?

it depends on the complexity of your model. you rarely NEED a GPU, but it can help a lot.

merry frost Apr 26, 2021, 1:28 AM

#

velvet thorn yeah, I was taking reference from your screenshot

Not sure if this helps but this is what I'm trying to accomplish

Goal = MidQuintile(MonthlySum(new membersByRepByType))

wind bobcat Apr 26, 2021, 2:37 AM

#

I am sorry if this channel is inappropriate for this question :C

May i ask for recommendations for python packages that helps extract or convert music into some sort of data?

slim ivy Apr 26, 2021, 4:09 AM

#

how can i remove rows in pandas by the name of the column

hollow grove Apr 26, 2021, 4:33 AM

#

i needed some help doing something specific with tensorflow

#

im really not sure how this all work since i didnt write the code but, this is the code and what i want to do is serve it as with a Flask API, as in i want it to take image data as input and get the output. How should i approach it? should i build a model file and then somehow run it? if i were to simply import that to the flask main.py it would do a lot of computation on each request so im not sure how to do it.

#

ive never worked with tf before

#

i mean yeah its from google collab but i need know what changes i need to make

ripe forge Apr 26, 2021, 6:43 AM

#

I'd say step 1, get familiar with the code. You should be able to tell yourself what each line is doing before proceeding.

#

No point trying to build on top of something you don't understand, especially when the code is right there

gentle birch Apr 26, 2021, 6:45 AM

#

um i have a question
being RLY new to trying to learn tensorflow, are there any good resources to use to actually learn the code and how it works?

#

i understand the basics of how neural nets work but other than premade tensorflow code i cant find any good resources for learning tensorflow

iron basalt Apr 26, 2021, 6:58 AM

#

gentle birch i understand the basics of how neural nets work but other than premade tensorflo...

Are you trying to learn how tensorflow itself is implemented? Or make a network with just TF's basic building blocks?

gentle birch Apr 26, 2021, 7:04 AM

#

im trying to learn how to use tensorflow to write neural nets
so i guess the second one u asked

iron basalt Apr 26, 2021, 7:05 AM

#

gentle birch im trying to learn how to use tensorflow to write neural nets so i guess the se...

https://www.kaggle.com/hbaderts/simple-feed-forward-neural-network-with-tensorflow

Simple feed-forward neural network with TensorFlow

Explore and run machine learning code with Kaggle Notebooks | Using data from Titanic - Machine Learning from Disaster

gentle birch Apr 26, 2021, 7:05 AM

#

im trying to learn how to actually use tensorflow to implement convolutional nets and gans and such, but theres nothing i can find that explains what the code actually does like what attributes do what etc

iron basalt Apr 26, 2021, 7:05 AM

#

Take a look at that simple feed forward neural network example.

#

That code just uses basic concepts from TF, not an entire prebuilt network.

#

Prebuilt networks are really just a bunch of those basic units combined and made into a class.

#

Fundamentally, TF and Pytorch are just fancy automatic differentiation tools that make running stuff on the GPU (and CPU with threading and vector operations) easier (for the most part).

#

(I actually have my own which is very much like pytorch and it was not hard to make, the real gains from using pytorch or TF is that lots of other people have already made a bunch of models for you)
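
To make "automatic differentiation" concrete, here is a toy reverse-mode sketch for scalars. It is not how TF or PyTorch are implemented (they work on tensors with many more operations); the `Value` class and everything in it is made up for illustration.

```python
class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward_fn = None  # set by the operation that created this value

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward_fn = backward_fn
        return out

    def backward(self):
        """Apply the chain rule from this value back to everything it depends on."""
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()


x, w, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b   # a one-neuron "network" without activation
loss = y * y    # toy loss
loss.backward()
print(loss.data, w.grad, x.grad, b.grad)
```

A framework adds tensor operations, GPU kernels and optimisers on top of this basic bookkeeping.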

gentle birch Apr 26, 2021, 7:09 AM

#

damn that wouldve taken a lot of learning and math to do that though?

iron basalt Apr 26, 2021, 7:10 AM

#

Not anything anybody that's really into ML would not know.

minor charm Apr 26, 2021, 7:10 AM

#

I second using pre-built stuff like TF. Very easy to set up

gentle birch Apr 26, 2021, 7:11 AM

#

ok i understand what u mean by TF code uses basic blocks and adds them together to do bigger tasks,
what i dont get is just what each attribute and function actually does, and the tensorflow documentation isn't very good at explaining it at a level a beginner would understand

iron basalt Apr 26, 2021, 7:12 AM

#

Give me a concrete example of what is causing you trouble.

gentle birch Apr 26, 2021, 7:12 AM

#

for example, the conv2Dtranspose function, what does it actually do to make an image??
that makes no sense to me, cause as far as i know a transpose of a matrix is just rotating the matrix, why does that help?

#

its used in a bunch of GAN code i saw to essentially morph the input data toward turning it into an image
but i cant find a good explanation as to what its actually doing to the data

iron basalt Apr 26, 2021, 7:17 AM

#

https://datascience.stackexchange.com/questions/6107/what-are-deconvolutional-layers

Data Science Stack Exchange

What are deconvolutional layers?

I recently read Fully Convolutional Networks for Semantic Segmentation by Jonathan Long, Evan Shelhamer, Trevor Darrell. I don't understand what "deconvolutional layers" do / how they work.

The re...

#

That has an animation
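
A naive numpy sketch of what a transposed convolution does, assuming a single channel, no padding and stride 2. It is meant to show the idea, not to match any framework's layer exactly. The "transpose" refers to the transpose of the convolution written as a matrix, not to transposing the image: each input pixel stamps a scaled copy of the kernel onto a larger output, and overlapping stamps are summed.

```python
import numpy as np


def conv2d_transpose(x, kernel, stride=2):
    """Naive single-channel transposed convolution, no padding."""
    h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw))
    for i in range(h):
        for j in range(w):
            # Stamp a copy of the kernel, scaled by the input pixel, onto the output.
            top, left = i * stride, j * stride
            out[top:top + kh, left:left + kw] += x[i, j] * kernel
    return out


x = np.array([[1.0, 2.0], [3.0, 4.0]])  # tiny 2x2 "image"
kernel = np.ones((3, 3))                 # hypothetical kernel; a real layer learns it
y = conv2d_transpose(x, kernel, stride=2)
print(y.shape)
print(y)
```

The 2x2 input becomes a 5x5 output, which is why these layers are used in GAN generators to grow a small feature map into an image.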

gentle birch Apr 26, 2021, 7:20 AM

#

ah thankyou
this helps this specific issue
guess i should search more on stack overflow when i get these questions

iron basalt Apr 26, 2021, 7:20 AM

#

That's more of a general deep learning question rather than an TF question.

#

Note that a lot of things in DL and ML have terrible names. Like "convolution" does not make sense to begin with (but many misuses later it just became accepted that it means a specific thing in the context of ML/image processing).

#

And even worse, very popular papers will use different definitions for the same word (even in the same context).

#

So it's important to kind of be on the same wave-length in terms of jargon to be able to quickly understand what is going on. This can only be done by just having followed a bunch of projects and read a bunch of ideas. It's kind of like playing baseball and not knowing all the baseball terminology that was made up just for baseball. https://en.wikipedia.org/wiki/Glossary_of_baseball_terms

Glossary of baseball terms

This is an alphabetical list of selected unofficial and specialized terms, phrases, and other jargon used in baseball, along with their definitions, including illustrative examples for many entries.

#

It's annoying to have to learn it all, but not really any way around it.

primal tulip Apr 26, 2021, 9:15 AM

#

iron basalt It's annoying to have to learn it all, but not really any way around it.

Unless you build an NLP AI around it 😏 ... Oh, wait.

arctic wedgeBOT Apr 26, 2021, 10:38 AM

#

Hey @rose thicket!

Uh-oh! It looks like your message got zapped by our spam filter. We currently don't allow .txt attachments, so here are some tips to help you travel safely:

• If you attempted to send a message longer than 2000 characters, try shortening your message to fit within the character limit or use a pasting service (see below)

• If you tried to show someone your code, you can use codeblocks
(run !code-blocks in #bot-commands for more information) or use a pasting service like:

https://paste.pythondiscord.com

rose thicket Apr 26, 2021, 10:40 AM

#

duh

#

see this error

#

https://paste.pythondiscord.com/yejasarahu.yaml

digital aurora Apr 26, 2021, 10:47 AM

#

Any data scientist available??

#

Need some help

#

What did you mean?😅

mystic harbor Apr 26, 2021, 10:53 AM

#

Links are allowed,just that particular one isn't

#

but yeah,

mystic harbor Apr 26, 2021, 10:53 AM

#

digital aurora Need some help

you should ask your question

digital aurora Apr 26, 2021, 10:53 AM

#

I just needed some guidance.

#

I am a beginner in this field

#

I'm done with python and numpy

#

But have no clue, what to do next

frozen marten Apr 26, 2021, 10:55 AM

#

this question can be answered even by learners... not necessarily data scientists

digital aurora Apr 26, 2021, 10:59 AM

#

frozen marten this question can be answered even by learners... not necessarily data scientist...

So can you??

tough surge Apr 26, 2021, 11:29 AM

#

Another question regarding anaconda. I have exported multiple envs as .yml files and imported those files using anaconda navigator. But in every environment the packages are not installed. Should I do a conda install "something" ?

serene scaffold Apr 26, 2021, 12:00 PM

#

tough surge Another question regarding anaconda. I have exported multiple envs as .yml files...

Is there a reason you're using anaconda? I don't really know anyone who uses it and it's easier to get help if you use venv

mental nova Apr 26, 2021, 12:14 PM

#

tough surge Another question regarding anaconda. I have exported multiple envs as .yml files...

https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#create-env-from-file

tough surge Apr 26, 2021, 12:23 PM

#

serene scaffold Is there a reason you're using anaconda? I don't really know anyone who uses it ...

i just stuck to it from the beginning. Was not having any problem until now, when I had to do a win 10 reinstall on my machine

tough surge Apr 26, 2021, 12:40 PM

#

mental nova https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.h...

tried it just now but when i do conda env list it shows no installed packages.

#

Ill try to export my env as a requirements.txt

mental nova Apr 26, 2021, 12:43 PM

#

tough surge tried it just now but when i do `conda env list` its shows none installed packag...

conda env list only shows whether the environment was installed, not its packages. If you want to see the packages you need to use conda list -n <environment name>

lapis sequoia Apr 26, 2021, 1:29 PM

#

I need software that can recognize money. I'd like to work with OpenCV, or what do you recommend?

drifting void Apr 26, 2021, 1:37 PM

#

Hi, how do you check for values such as those in Dask dataframe:
df[(df['val0']==val0) & (df['val1']==val1)].compute()
the above is super slow so perhaps there's another way?

serene scaffold Apr 26, 2021, 2:41 PM

#

drifting void Hi, how do you check for values such as those in Dask dataframe: ``` df[(df['val...

Why is it that you're using Dask, in this case?

#

Is the data larger than your RAM, or are you doing operations in parallel?

drifting void Apr 26, 2021, 2:52 PM

#

Yes it is a lot of data in parquet files

#

I want to add more data in case it doesn’t exist

#

I guess I shouldn’t be doing that with dask but rather use a different data structure to do the check

serene scaffold Apr 26, 2021, 2:59 PM

#

drifting void I guess I shouldn’t be doing that with dask but rather use a different data stru...

I'm wondering if you should be putting all this in a proper database and querying it.
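A minimal sketch of the "put it in a database and query it" idea, using only the standard-library `sqlite3` module. The table name `entries`, the `payload` column, and the helper `add_if_missing` are hypothetical; `val0` and `val1` come from the question above. A unique index on the pair lets the database answer "does this row already exist?" without scanning all rows.

```python
import sqlite3

# In-memory database for the sketch; pass a file path to persist the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (val0 TEXT, val1 TEXT, payload TEXT)")
# The unique index makes (val0, val1) lookups fast and rejects duplicates.
conn.execute("CREATE UNIQUE INDEX idx_entries_vals ON entries (val0, val1)")


def add_if_missing(conn, val0, val1, payload):
    """Insert the row only when (val0, val1) is not stored yet.

    Returns True if a row was inserted, False if it already existed.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO entries (val0, val1, payload) VALUES (?, ?, ?)",
        (val0, val1, payload),
    )
    return cur.rowcount == 1


inserted_first = add_if_missing(conn, "a", "b", "first")
inserted_again = add_if_missing(conn, "a", "b", "duplicate")
row_count = conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
print(inserted_first, inserted_again, row_count)
```

Whether SQLite or a server database fits better depends on the data volume and access pattern, which the conversation does not settle.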

drifting void Apr 26, 2021, 3:02 PM

#

I was considering that too. The data will be growing to (if I calculated that correctly) 20-30 millions of entries

serene scaffold Apr 26, 2021, 3:07 PM

#

drifting void I was considering that too. The data will be growing to (if I calculated that co...

I think there's a point at which you have as much RAM as you have.

#

FYI, I probably won't know if you've responded unless you ping/reply to me

drifting void Apr 26, 2021, 4:01 PM

#

serene scaffold I think there's a point at which you have as much RAM as you have.

I am not sure what you mean...

serene scaffold Apr 26, 2021, 4:01 PM

#

drifting void I am not sure what you mean...

If you're working with more data than you can fit in live memory, the operation is going to be slowed down by disk reads.

drifting void Apr 26, 2021, 4:03 PM

#

Yes, true. I thought Dask would help here.

#

I should probably start using a database. It may be useful

serene scaffold Apr 26, 2021, 4:19 PM

#

drifting void Yes, true. I thought Dask would help here.

I checked the docs for Dask and it said that you can use it to parallelize certain operations, but that it's not an alternative to a database

#

On the flip side, I had somehow never heard of Dask, so thanks for bringing that to my awareness!

grave frost Apr 26, 2021, 4:21 PM

#

https://pages.awscloud.com/NAMER-field-OE-AWS-is-How-May-4-2021-reg-event.html?sc_channel=em&sc_campaign=NAMER_FIELD_WEBINAR_aws-is-how-event-3_20210504_7014z0000014Shv.DG2 - Free Trial Abandoners&sc_publisher=aws&sc_medium=em_362184&sc_content=event_ev_field&sc_country=us&sc_geo=namer&sc_outcome=event&trk=em_362184

BlackBerry 🤣 😂 "Innovation" 🤣🤣
Im dying

Behind the Innovation. AWS is How.

Free, virtual event on May 4, 2021.

digital aurora Apr 26, 2021, 4:28 PM

#

Guys, what all do i need to study under Stats for DS, anybody?

serene scaffold Apr 26, 2021, 4:33 PM

#

digital aurora Guys, what all do i need to study under Stats for DS, anybody?

I would probably start with probability theory.

short heart Apr 26, 2021, 4:46 PM

#

How do I decrease discount factor in reinforcement learning

tough surge Apr 26, 2021, 4:52 PM

#

mental nova `conda env list ` only shows if the environment was installed not the packages. ...

Worked fine this time. Tried again with .yml file and all packages are there. Thanks guys 💪

weak remnant Apr 26, 2021, 5:09 PM

#

guys can anyone assist me on how to train models efficiently if i have a low grade GPU and buying a new one is not an option

#

also i've tried google colab and looking for other suggestions

mint palm Apr 26, 2021, 5:18 PM

#

do we compute cost after setting output values that are greater than 0.5 to 1 and others to 0, or before that?

#

in NN

livid jetty Apr 26, 2021, 5:23 PM

#

What data I need to build a machine learning model which can predict future coronavirus cases count?

serene scaffold Apr 26, 2021, 5:26 PM

#

livid jetty What data I need to build a machine learning model which can predict future coro...

You have to know what factors account for the number of coronavirus cases, and then see if you can obtain reliable data for those factors

livid jetty Apr 26, 2021, 5:28 PM

#

serene scaffold You have to know what factors account for the number of coronavirus cases, and t...

And which algorithm is best for my model?

mint palm Apr 26, 2021, 5:31 PM

#

mint palm do we compute cost after setting output values that are greater then 0.5 to 1 an...

in reference to binary classification
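The question above went unanswered in the channel. As a hedged sketch: for binary classification the cost is usually computed on the raw output probabilities, and the 0.5 threshold is applied only afterwards to get class predictions, because a thresholded output is piecewise constant and gives no useful gradient. The function name `binary_cross_entropy` and the sample numbers are illustrative assumptions.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean cross-entropy computed on probabilities, before any thresholding."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4]  # network outputs after the sigmoid

cost = binary_cross_entropy(y_true, y_prob)    # used for training
y_pred = [1 if p > 0.5 else 0 for p in y_prob]  # used for reporting accuracy
accuracy = sum(int(a == b) for a, b in zip(y_true, y_pred)) / len(y_true)
print(cost, y_pred, accuracy)
```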

serene scaffold Apr 26, 2021, 5:39 PM

#

livid jetty And which algorithm is best for my model?

You'll want to look into algorithms that look at data points in chronological order, rather than those that predict based on each data point in isolation.

livid jetty Apr 26, 2021, 5:49 PM

#

serene scaffold You'll want to look into algorithms that look at data points in chronological or...

Thanks sir

grave frost Apr 26, 2021, 5:57 PM

#

livid jetty And which algorithm is best for my model?

Time series predictions? RNN it is

iron basalt Apr 26, 2021, 5:58 PM

#

drifting void I should probably start using a database. It may be useful

Dask is cute, but a database should be the go to for more serious storage and processing of data beyond just some spreadsheets and things that fit in memory.

grave frost Apr 26, 2021, 5:59 PM

#

weak remnant guys can anyone assist me on how to train models efficiently if i have a low gra...

Apart from cloud options like GCP, nothing else. You could try doing CPU-only training too, if you can afford a beefier CPU

cyan ridge Apr 26, 2021, 6:04 PM

#

what can you suggest who is taking a data science career? actually I'm a second year student im so confused whether I wanted to be a software eng. or a data sci.

drifting void Apr 26, 2021, 6:20 PM

#

iron basalt Dask is cute, but a database should be the go to for more serious storage and pr...

Thanks! Yes, I am looking into neo4j right now. Might be a good fit for my needs

dusk hornet Apr 26, 2021, 7:26 PM

#

Can someone help me with python code for AI face recognizer

lavish tundra Apr 26, 2021, 8:08 PM

#

do we have a chinese/korean/japanese data scientist here? . _. i'm having a problem using an asian font for a data visualisation ; -;

#

=/ i'm hard stuck on this problem using seaborn

iron basalt Apr 26, 2021, 8:30 PM

#

lavish tundra =/ i'm hard stuck on this problem using seaborn

Change the font to one that supports the characters you need.

lavish tundra Apr 26, 2021, 8:34 PM

#

iron basalt Change the font to one that supports the characters you need.

that's the problem, no matter what i try it doesn't change

grave frost Apr 26, 2021, 8:41 PM

#

https://www.reddit.com/r/MachineLearning/comments/myr072/d_huawei_just_announced_that_they_trained_a_200/?%24deep_link=true&correlation_id=6da94216-b609-4479-aa67-62df273ea4c8&post_fullname=t3_myr072&post_index=5&ref=email_digest&ref_campaign=email_digest&ref_source=email&utm_content=post_title&%243p=e_as&_branch_match_id=893197452729060697

r/MachineLearning - [D] Huawei just announced that they trained a 2...

49 votes and 28 comments so far on Reddit

#

technically, it's bigger than GPT-3 ^^

iron basalt Apr 26, 2021, 8:57 PM

#

lavish tundra thats the problem, dont matter what i try he dont change

import seaborn as sns
from matplotlib.pyplot import show
import numpy as np

sns.set(font="Noto Sans CJK JP")
sns.heatmap(np.array([[1,2,3]]), annot=np.array([['ë', 'bădărău', 'いえ']]), fmt='')
show()

#

exotic maple Apr 26, 2021, 9:04 PM

#

grave frost https://www.reddit.com/r/MachineLearning/comments/myr072/d_huawei_just_announced...

-scared american noises-

#

nah tbh that looks pretty impressive. especially because it seems they did entirely with chinese tech. No Tensorflow / pytorch, no nvidia, etc

lavish tundra Apr 26, 2021, 9:14 PM

#

iron basalt ```py import seaborn as sns from matplotlib.pyplot import show import numpy as n...

i tried this but dont works for me
i think i'll just give up, i tried to search everywhere for a solution, but nothing works

weak remnant Apr 26, 2021, 9:23 PM

#

grave frost Apart from cloud options like GCP, nothing else. You could try doing CPU-only tr...

i was looking for a virtual GPU that could provide close enough if not equal to a physical GPU

lavish tundra Apr 26, 2021, 9:26 PM

#

iron basalt ```py import seaborn as sns from matplotlib.pyplot import show import numpy as n...

if i run ur code:

iron basalt Apr 26, 2021, 9:32 PM

#

lavish tundra if i run ur code:

you need the font

#

you don't have the font

lavish tundra Apr 26, 2021, 9:32 PM

#

i thought i had it

lavish tundra Apr 26, 2021, 9:33 PM

#

iron basalt you need the font

do u know where i can find the original one for i install?

deft basin Apr 26, 2021, 9:34 PM

#

woah

iron basalt Apr 26, 2021, 9:35 PM

#

lavish tundra do u know where i can find the original one for i install?

No, just use another CJK font.

lavish tundra Apr 26, 2021, 9:38 PM

#

OMG I CANT BELIEVE

#

i tried to user a different font and it works

lavish tundra Apr 26, 2021, 9:38 PM

#

iron basalt No, just use another CJK font.

u are my new religion, ty for the help

#

a angel on my life

lavish tundra Apr 26, 2021, 9:56 PM

#

idk why but some fonts works for chinese words and dont work for korea words

torpid ember Apr 26, 2021, 9:57 PM

#

```py
def cannex_format_over1y(url,product):
    curr_dt = datetime.datetime.today().strftime('%Y-%m-%d')
    # curr_dt = (datetime.datetime.today() - BDay(2)).strftime('%Y-%m-%d')
    curr_dt_str = datetime.datetime.today().strftime("%Y%m%d")
    df_html = pd.read_html(url,header=1)
    header = df_html[0].iloc[0]
    cols = ['Financial Institution'] # only forward fill on Financial Institution column
    df = (df_html[0].iloc[1:])
    df[cols].fillna(method='ffill')
    df.columns = header
    df.insert(0, 'Date', curr_dt)
    # df.to_csv(csv_path)
    df.rename(columns={df.columns[5]: "1Y",
                       df.columns[6]: "2Y",
                       df.columns[7]: "3Y",
                       df.columns[8]: "4Y",
                       df.columns[9]: "5Y",
                       df.columns[10]: "6Y"},inplace = True)
    df.insert(loc=2,column='product',value=product)
    return df

cannex_format_over1y(gic_nonreg_1to6y_url,'Non-registered GIC').replace('-','')```

#

hey guys i have a really simple code that im getting a warning on. Im wondering if you guys can help me figure out what needs to change to avoid the warning

#

this is the warning:

```
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().rename(
```

#

im struggling to understand what the issue is, but i assume it has to do with the

```py
df[cols].fillna(method='ffill')
```
portion of the code
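A possible way to avoid the warning, sketched on a small made-up frame rather than the real `pd.read_html` result. Two things stand out in the posted function: `df_html[0].iloc[1:]` is a slice of another frame, so the later `rename(..., inplace=True)` triggers the copy warning, and `df[cols].fillna(...)` discards its result, so nothing is forward-filled. The column names and values below are assumptions for illustration.

```python
import pandas as pd

# Stand-in for df_html[0]; the real table comes from pd.read_html(url, header=1).
raw = pd.DataFrame({
    "Financial Institution": ["skip me", "Bank A", None, "Bank B"],
    "rate": [0.0, 1.0, 1.5, 2.0],
})

# .copy() makes df an independent frame instead of a slice of `raw`.
df = raw.iloc[1:].copy()

cols = ["Financial Institution"]
# ffill() returns a new object, so the result has to be assigned back.
df[cols] = df[cols].ffill()

# Returning a new frame from rename avoids inplace=True altogether.
df = df.rename(columns={"rate": "1Y"})
print(df)
```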

iron basalt Apr 26, 2021, 10:04 PM

#

lavish tundra idk why but some fonts works for chinese words and dont work for korea words

Getting a font that can do multiple will not be easy, but they exist.

#

A font only has the glyphs that the typographer created for the font. Making a font with glyphs for multiple languages is a ton of work.

#

Maybe ask reddit: https://www.reddit.com/r/typography/

r/typography

#

Few fonts are free and even fewer are free and good.

grave frost Apr 26, 2021, 10:25 PM

#

weak remnant i was looking for a virtual GPU that could provide close enough if not equal to ...

wdym?

#

you have full control over your GPU in the cloud. There is nothing you can't do physically that you can do using a terminal 🤷

grave frost Apr 26, 2021, 10:28 PM

#

exotic maple nah tbh that looks pretty impressive. especially because it seems they did entir...

ikr - not even CUDA 👀 that impressive af, but thank god US bans these chineese companies from operating

exotic maple Apr 26, 2021, 10:29 PM

#

grave frost ikr - not even CUDA 👀 that impressive af, but thank god US bans these chineese ...

I mean, if you ask me that's a stupid idea

#

If I was on the CCP id say "The U.S is unreliable as partner for this, let's throw bullshit cash at it and develop our own" and boom. goodbye Tensorflow / PyTorch AI monopoly on the U.S

lavish tundra Apr 26, 2021, 10:29 PM

#

iron basalt Getting a font that can do multiple will not be easy, but they exist.

well.. i found some fonts where has options to chinese/korean/japanese... but when i try to make the seaborn use it he dont use . _.

exotic maple Apr 26, 2021, 10:30 PM

#

that's just my take thou

grave frost Apr 26, 2021, 10:30 PM

#

I trust no chineese AI dev - especially when you see what they use AI in China like. it's literally like 1984, and all research done helps them 😞

lavish tundra Apr 26, 2021, 10:31 PM

#

i tried

sns.set(font="Arial Unicode MS")

sns.set(font="ArialUnicodeMS")

sns.set(font="arial-unicode-ms")

but none of those work
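One way to check which family names matplotlib (and therefore seaborn) can actually see is to ask its font manager. This is a sketch: the helper names `installed_font_names` and `is_installed` are made up here, and which fonts show up depends entirely on the machine. The `font=` argument has to match one of the listed names exactly.

```python
from matplotlib import font_manager


def installed_font_names():
    """Family names of every font file matplotlib has registered."""
    return sorted({entry.name for entry in font_manager.fontManager.ttflist})


def is_installed(family):
    """True if matplotlib can resolve `family` without falling back."""
    try:
        font_manager.findfont(family, fallback_to_default=False)
        return True
    except ValueError:
        return False


names = installed_font_names()
# Candidates that may cover Chinese/Japanese/Korean glyphs on this machine.
print([n for n in names if "CJK" in n or "Unicode" in n])
print(is_installed("Arial Unicode MS"))
```

If a font was installed after matplotlib first ran, its font cache may need to be rebuilt before the new family appears in the list.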

exotic maple Apr 26, 2021, 10:32 PM

#

grave frost I trust no chineese AI dev - especially when you see what they use AI in China l...

I have no doubt China does shady shit with AI, but some of those claims I find them absurdedly exaggerated (they would require AI beyond state-of-the-art)

grave frost Apr 26, 2021, 10:32 PM

#

exotic maple I have no doubt China does shady shit with AI, but some of those claims I find t...

like?

exotic maple Apr 26, 2021, 10:32 PM

#

grave frost like?

social credit stuff, which ended up being mistranslations and exagerrations

#

its pretty much a credit score lol

grave frost Apr 26, 2021, 10:32 PM

#

The documentaries I have seen are pretty demonstrative of those tech used

grave frost Apr 26, 2021, 10:33 PM

#

exotic maple its pretty much a credit score lol

it's in prototype stage

#

one doc actually interviewed the chineese guy making it - and he was answering those question very carefully

#

he expressly stated that those technologies will "benefit" chineese citizens

exotic maple Apr 26, 2021, 10:35 PM

#

Idk I find it still exagerrated. Think it like this from another perspective:

JP Morgan and all other U.S banks have credit scores. Literally your whole life, including employment is based on this credit score, and this is not called "dystopian". Either all these types of scores are dystopian (they are) or none are, but cherry picking because "X GOOD" "Y BAD" its annoying.

In fact, US credit score sounds worse than a social score lol.

crisp wing Apr 26, 2021, 10:36 PM

#

Can you lose variance even though you do a full reconstruction with SVD? Like X = s @ V.T
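This question was not answered in the channel. A small numerical check, assuming NumPy and a random matrix as stand-in data: a full reconstruction needs all three factors, `U @ diag(s) @ Vt`, and reproduces `X` up to floating-point error. Dropping `U`, as in `s @ V.T`, gives a different matrix, although it has the same `X.T @ X` and so the same total sum of squares.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))  # stand-in data matrix

# np.linalg.svd returns V already transposed (Vt).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

X_full = U @ np.diag(s) @ Vt   # full reconstruction
Z = np.diag(s) @ Vt            # what "s @ V.T" computes when U is left out

reconstruction_ok = np.allclose(X, X_full)
same_gram = np.allclose(X.T @ X, Z.T @ Z)
energy_ok = np.isclose((s ** 2).sum(), (X ** 2).sum())
print(reconstruction_ok, same_gram, energy_ok)
```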

grave frost Apr 26, 2021, 10:36 PM

#

it's not only credit scores or anything - it's a lot of tracking tech too, which is perfectly plausible to build with the proper investment

#

I mean, just look at what NSA did in earlier times. no one could have believed that such resources would be poured just to track common people.

exotic maple Apr 26, 2021, 10:37 PM

#

grave frost it's not only credit scores or anything - it's a lot of tracking tech too, which...

but this doesnt make the chinese government dystopian. People willingly give this up.

I didnt see any U.S protests over the NSA leaks that basically all Facebook, Cisco, etc, have backdoors for the U.S gov.

the sad truth is: people give away privacy for convenience

#

you dont need government intervention when people give it away on their own

grave frost Apr 26, 2021, 10:38 PM

#

ngl, people do give up their privacy. but you would be wrong that there were no protests or any opposition

exotic maple Apr 26, 2021, 10:38 PM

#

grave frost ngl, people do give up their privacy. but you would be wrong that there were no ...

*no significant

#

if there was no change, the protest was irrelevant.

#

I would know that, living in 3rd world semi dictatorship country

grave frost Apr 26, 2021, 10:39 PM

#

anyways, I for one support US's mass surveillance

exotic maple Apr 26, 2021, 10:39 PM

#

I dont. Neither chinese nor US. screw both

grave frost Apr 26, 2021, 10:39 PM

#

china's is really bad - but then you never know when USA might be too

grave frost Apr 26, 2021, 10:40 PM

#

exotic maple I dont. Neither chinese nor US. screw both

well.....if it guranteed safety for your kids 🤷

exotic maple Apr 26, 2021, 10:40 PM

#

grave frost well.....if it guranteed safety for your kids 🤷

we go back to square 1 lol.

#

I know a lot of chinese. they dont care about privacy, HR or whatever, as long as they're safe and prosper

grave frost Apr 26, 2021, 10:41 PM

#

if giving up a part of your daily privacy can prevent some mass shootings (maybe with your family involved) would you pay the price?

exotic maple Apr 26, 2021, 10:41 PM

#

I feel a lot of americans think the same

exotic maple Apr 26, 2021, 10:41 PM

#

grave frost if giving up a part of your daily privacy can prevent some mass shootings (maybe...

i would yes,

#

in fact i do

#

but i dont call that dystopian

grave frost Apr 26, 2021, 10:42 PM

#

no, but what china does is defintely wrong - and their research funds all go into that "dystopian" research

exotic maple Apr 26, 2021, 10:42 PM

#

we are a bit off topic here i think, not in the domain of the channel

#

if you'd like we can continue discussing political perspectives via DM and not spam the channel

grave frost Apr 26, 2021, 10:43 PM

#

Did you read that research where some chineese uni made a model to classify criminals based on their faces? with a lot of SOTA work, they got 85% accuracy in predicting criminals alone from their faces

#

may not be gov sponsored (didn't check), but still

exotic maple Apr 26, 2021, 10:44 PM

#

grave frost may not be gov sponsored (didn't check), but still

everything is pretty much gov sponsored in China.

grave frost Apr 26, 2021, 10:44 PM

#

true. who would have even thought of making a model to do that unless specifically directed??

#

in any case, chineese life is just ....depressing to say the least

exotic maple Apr 26, 2021, 10:45 PM

#

grave frost true. who would have even thought of making a model to do that unless specifical...

I mean, i would try random crap if i was bored, but that was oddly specific

sharp nimbus Apr 26, 2021, 10:46 PM

#

!ot

arctic wedgeBOT Apr 26, 2021, 10:46 PM

#

Off-topic channels

There are three off-topic channels:
• #ot0-psvm’s-eternal-disapproval
• #ot1-perplexing-regexing
• #ot2-never-nester’s-nightmare

Their names change randomly every 24 hours, but you can always find them under the OFF-TOPIC/GENERAL category in the channel list.

Please read our off-topic etiquette before participating in conversations.

grave frost Apr 26, 2021, 10:47 PM

#

https://www.technologyreview.com/2016/11/22/107128/neural-network-learns-to-identify-criminals-by-their-faces/
2016?? oh shit....

MIT Technology Review

Neural Network Learns to Identify Criminals by Their Faces

Soon after the invention of photography, a few criminologists began to notice patterns in mugshots they took of criminals. Offenders, they said, had particular facial features that allowed them to be identified as law breakers. One of the most influential voices in this debate was Cesare Lombroso, an Italian criminologist, who believed that crim...

exotic maple Apr 26, 2021, 10:52 PM

#

@grave frost we are off topic, you want we can conitnue via DM, lets stop spamming here

grave frost Apr 26, 2021, 10:55 PM

#

exotic maple @grave frost we are off topic, you want we can conitnue via DM, lets s...

some day later 🙂 Ive got Homework to do

exotic maple Apr 26, 2021, 10:55 PM

#

grave frost some day later 🙂 Ive got Homework to do

sure! good luck man (y)

molten hamlet Apr 27, 2021, 1:01 AM

#

https://sklearn.org/datasets/index.html#rcv1-dataset

#

How do you even detokenize this dataset? 😐

#

i want words! i mean, I solved my problem, but can't check it

#

got it

#

from some issues xD

hexed heath Apr 27, 2021, 3:08 AM

#

Hi, I am using implicit package (https://github.com/benfred/implicit) to create a recommender system. I am using the implicit least square algorithm.
I was able to make predictions for already existing users, or to find similar items, no prob. But I don't get how can I get predictions for a new user which was not in input data? the idea is that I have a set of items (each one existing in input data), and I want recommendations based on this set. I could get recommendations for each items and sum them up, but it doesn't feel right. This seems like a common usage, so I think I am missing something ^^'. Any ideas? Thanks 🙂

GitHub

benfred/implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets - benfred/implicit

lapis sequoia Apr 27, 2021, 3:32 AM

#

is anyone available to help me?

scenic elbow Apr 27, 2021, 3:35 AM

#

@lapis sequoia Possibly, what is it that you're trying to do?

minor charm Apr 27, 2021, 3:47 AM

#

anyone familiar with tensorflow? Having some issues getting logs to write for a customtensorboard

carmine iron Apr 27, 2021, 3:50 AM

#

why is this returning 11

```py
costs = [1,3,2,4,1]
coins = 8
max = 0
while max < coins:
    # print(max)
    for i in costs:
        max +=i
max
```

royal crypt Apr 27, 2021, 3:54 AM

#

carmine iron why is this returning 11 ```costs = [1,3,2,4,1] coins = 8 max = 0 while max < c...

because on the first pass of the "while" loop the condition "max < coins" is true, so the inner loop goes through all elements in "costs" and adds each one to "max". The result of 11 is expected 😄

iron basalt Apr 27, 2021, 3:55 AM

#

carmine iron why is this returning 11 ```costs = [1,3,2,4,1] coins = 8 max = 0 while max < c...

The while loop only goes through one iteration.
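The two replies can be checked with a short trace. The first half mirrors the posted loop; the second half is one possible rewrite, assuming the goal was to stop spending before the total exceeds `coins` (the thread does not state the goal). The names `total`, `spent`, and `bought` are illustrative.

```python
costs = [1, 3, 2, 4, 1]
coins = 8

# Same shape as the posted code: the inner for loop adds every cost
# before the while condition is checked again.
total = 0
while total < coins:
    for cost in costs:
        total += cost
print(total)  # 11

# Possible fix: check the budget before each purchase.
spent = 0
bought = 0
for cost in costs:
    if spent + cost > coins:
        break
    spent += cost
    bought += 1
print(spent, bought)  # 6 3
```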

whole mica Apr 27, 2021, 5:17 AM

#

anyone here use machine learning for finance?

tall loom Apr 27, 2021, 6:53 AM

#

Hello guys. I need help in understanding this dataset .
http://www.timeseriesclassification.com/description.php?Dataset=WordSynonyms

What are those features? And what is being classified?

lapis sequoia Apr 27, 2021, 7:28 AM

#

Each case is a word. A series is formed by taking the height profile of the word

#

From http://www.timeseriesclassification.com/description.php?Dataset=FiftyWords

#

WordSynonyms remapped FiftyWords to 25 classes

#

But the data is the same (and I think flipped)

tall loom Apr 27, 2021, 7:35 AM

#

@lapis sequoia What do the classes tell and what is height profile of a word?

lapis sequoia Apr 27, 2021, 9:21 AM

#

hi, i'm new here

#

anyone want's to help me with some code suggestions?

#

i'm working on CNN project, and i have to prepare a dataset for my boss, who gave me 2 .HID files with inside some specific image filenames from a big Dataset of Images. I've converted every line of the .HID file in a element of a list, and i have a dictionary with all image filenames. But to check if the names in the .HID are matched with the names of the Dataset, i have to join ".jpg" string at the end of every elements of the list, cause the list elements are image filenames without the extension. Is right my
reasoning? Someone who can help me to do this? Cause the problem is that you can't concatenate list elements with string...

#

`import cv2
from PIL import Image
import os

path_file = 'E:\Work\AU(13)-SottoCampioniA e B\SetA.HID'
path_image = 'E:\Work\AU13_face'
work_dir_tr = 'E:\Work\x'

image_file_names = [i for i in os.listdir(path_image)]
#images = [i for i in os.listdir(path_image) if i[-3:]=='JPG' or 'jpg']

file1 = open(path_file, 'r')

list_of_lists = []

for line in file1:
    #print(line_list)
    stripped_line = line.strip()
    line_list = stripped_line.split()
    #print(line_list)
    list_of_lists.append(line_list)
list_of_lists = [line_list + ".jpg" for line_list in list_of_lists]

file1.close()

print(list_of_lists)

#============================================================================
#############################################################################
#============================================================================

#result = any(elem in line for elem in list_of_lists)
os.listdir(path_image)

wIDTH = 100
hEIGHT = 100

for i,image in enumerate(image_file_names):
    #print(image)
    if any(elem in list_of_lists for elem in image_file_names):
        print(i,'matched')
        # im = Image.open(image)
        # im = im.convert('L')
        # I = Image.open(path_image+"/"+ image)
        # I = I.resize((wIDTH,hEIGHT), Image.BICUBIC)
        # I.save(work_dir_tr+'/'+ image)`
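A sketch of the matching step described above, with no file access so it runs anywhere. In the posted code `line.split()` returns a list, which is why adding `".jpg"` to it fails; appending the extension to each string token works. The helper names and sample file names are hypothetical, and comparing case-insensitively is an assumption based on the commented-out `'JPG' or 'jpg'` check.

```python
def names_from_hid(lines, extension=".jpg"):
    """Turn each whitespace-separated token in the .HID lines into a file name."""
    names = set()
    for line in lines:
        for token in line.split():   # split() yields strings, one per token
            names.add(token + extension)
    return names


def matched_images(hid_lines, image_file_names):
    """Return the dataset file names that are listed in the .HID file."""
    wanted = {name.lower() for name in names_from_hid(hid_lines)}
    return [name for name in image_file_names if name.lower() in wanted]


# Stand-ins for the .HID content and os.listdir(path_image).
hid_lines = ["img001\n", "img003\n"]
image_file_names = ["img001.jpg", "img002.jpg", "img003.JPG"]

matches = matched_images(hid_lines, image_file_names)
print(matches)
```

With real data, `hid_lines` would come from iterating over the opened .HID file and `image_file_names` from `os.listdir(path_image)`.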

kindred radish Apr 27, 2021, 10:02 AM

#

lapis sequoia `import cv2 from PIL import Image import os path_file = 'E:\Work\AU(13)-SottoCa...

This looks like something that should go in one of the help channels

#

As it's not necessarily directly related to AI

lapis sequoia Apr 27, 2021, 10:03 AM

#

yes but, there is no people who answer me

kindred radish Apr 27, 2021, 10:03 AM

#

wait :)

#

It took me two weeks but I've only just noticed that the reason my model isn't training isn't necessarily my data. It's my code for my model:

#

#

Orange is sklearn.linear_model.linearRegression() blue is my own OLS algorithm

#

kill me ;-;

#

idk how i didn't notice what a shit job it was doing

olive shore Apr 27, 2021, 12:25 PM

#

has anyone used hugging face before or is good with AI

#

I am trying to create a personal assistant app that would answer questions based on information I trained it on. I want to lets say upload a book or a dataset of a lot of research papers then when I ask a question it would give it

#

is that possible without having context or something

#

just train it with some data?

#

is this what i am looking for?

#

https://github.com/google-research/google-research/blob/master/t5_closed_book_qa/README.md

GitHub

google-research/google-research

Google Research. Contribute to google-research/google-research development by creating an account on GitHub.

#

or is this something else

arctic wedgeBOT Apr 27, 2021, 12:38 PM

#

Hey @lapis sequoia!

It looks like you tried to attach a Python file - please use a code-pasting service such as https://paste.pythondiscord.com

lapis sequoia Apr 27, 2021, 12:39 PM

#

import cv2
import numpy as np

def dibujar(mask,color):
    _,contornos,_ = cv2.findContours(mask, cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
    print ("ya deje de joder")
    for c in contornos:
        area = cv2.contourArea(c)
        if area > 3000:
            M = cv2.moments(c)
            if (M["m00"]==0): M["m00"]=1
            x = int(M["m10"]/M["m00"])
            y = int(M['m01']/M['m00'])
            nuevoContorno = cv2.convexHull(c)
            cv2.circle(frame,(x,y),7,(0,255,0),-1)
            cv2.putText(frame,'{},{}'.format(x,y),(x+10,y), font, 0.75,(0,255,0),1,cv2.LINE_AA)
            cv2.drawContours(frame, [nuevoContorno], 0, color, 3)

cap = cv2.VideoCapture(0)

azulBajo = np.array([100,100,20],np.uint8)
azulAlto = np.array([125,255,255],np.uint8)

amarilloBajo = np.array([15,100,20],np.uint8)
amarilloAlto = np.array([45,255,255],np.uint8)

redBajo1 = np.array([0,100,20],np.uint8)
redAlto1 = np.array([5,255,255],np.uint8)

redBajo2 = np.array([175,100,20],np.uint8)
redAlto2 = np.array([179,255,255],np.uint8)

font = cv2.FONT_HERSHEY_SIMPLEX
while True:

    ret,frame = cap.read()

    if ret == True:
        frameHSV = cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)
        maskAzul = cv2.inRange(frameHSV,azulBajo,azulAlto)
        maskAmarillo = cv2.inRange(frameHSV,amarilloBajo,amarilloAlto)
        maskRed1 = cv2.inRange(frameHSV,redBajo1,redAlto1)
        maskRed2 = cv2.inRange(frameHSV,redBajo2,redAlto2)
        maskRed = cv2.add(maskRed1,maskRed2)
        dibujar(maskAzul,(255,0,0))
        dibujar(maskAmarillo,(0,255,255))
        dibujar(maskRed,(0,0,255))
        cv2.imshow('frame',frame)
        if cv2.waitKey(1) & 0xFF == ord('s'):
            break
cap.release()
cv2.destroyAllWindows()

#

Someone help me i dont know what is wrong

#

Hello, Do you know some good sources for finding a plan for how to become a data scientist? I mean a real plan, not how to become a senior data scientist. I found some articles but there is too much to learn, you need a lifetime to learn this stuff. I started with linear algebra and also learned the basics for ANN, but there is much more. I need a good plan because I want to find a job in a year maybe.

late shell Apr 27, 2021, 1:32 PM

#

Hello, I'm a noobie to ML and was learning about Decision Tree Regression and was testing out something on my own. The Decision Tree algorithm for Regression works in a way that, for each node of the tree, iterates through all the values of all the features trying to find the split that decreases the SSR the most. At each iteration the algo considers only 2 points at a time, takes their average, makes the split at that average, and then makes predictions using that split and calculates the SSR. And then selects the split which decreases the SSR the most. I was wondering, does the number of observations considered at the time of a split (i.e. 2 right now) affect the model in any way. I believe its a trade-off between speed/time taken by model to train and accuracy of the model. So I wrote a notebook for testing it out whether this trade-off is significant enough to be considered. Can someone please go through this notebook and let me know if I'm just wasting my time doing silly & useless things or should I continue this exploration. It'll be really valuable to me if someone gives a feedback on this. Thanks
Here is the notebook : https://github.com/Noobie20/ML/blob/master/Regression/Decision Tree Regression/n_obs_split.ipynb
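The split search described in the message above can be sketched in plain Python for a single feature. This is an illustrative version, not the notebook's code: candidate thresholds are the midpoints between consecutive sorted feature values, and the chosen split is the one with the lowest combined sum of squared residuals (SSR). The function names and the sample data are assumptions.

```python
def ssr(values):
    """Sum of squared residuals around the mean of `values`."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_ssr) for the best split on one feature."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(len(pairs) - 1):
        x_lo, x_hi = pairs[i][0], pairs[i + 1][0]
        if x_lo == x_hi:
            continue  # no threshold fits between equal values
        threshold = (x_lo + x_hi) / 2  # average of two neighbouring points
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        total = ssr(left) + ssr(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
threshold, total_ssr = best_split(xs, ys)
print(threshold, total_ssr)
```

Averaging over more than two neighbouring points, as the message proposes, would mean fewer candidate thresholds to evaluate per node.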

GitHub

Noobie20/ML

notebook learnings. Contribute to Noobie20/ML development by creating an account on GitHub.

untold verge Apr 27, 2021, 2:58 PM

#

is there a way to data mine facebook?

tame lichen Apr 27, 2021, 3:29 PM

#

hi so if im prediciting sales for a company whats the best type of model to use for something like that?

mint palm Apr 27, 2021, 3:59 PM

#

#

used above, what does int64 do?

#

does it limit the number of digits to such that they are 64bits and fasten up the model?

serene scaffold Apr 27, 2021, 4:16 PM

#

mint palm does it limit the number of digits to such that they are 64bits and fasten up th...

I think it's more about your system architecture. What type is A2?

mint palm Apr 27, 2021, 4:17 PM

#

a2 is just activated vector after applying activation function

#

for layer2

serene scaffold Apr 27, 2021, 4:18 PM

#

mint palm a2 is just activated vector after applying activation function

in that case, A2 > 0 returns a boolean array, and the call to np.int64 is converting bools to 64-bit ints.

mint palm Apr 27, 2021, 4:18 PM

#

oh yess so that instead of true false we get numbers

#

right?

serene scaffold Apr 27, 2021, 4:18 PM

#

sounds right to me

mint palm Apr 27, 2021, 4:19 PM

#

i did see something like that in lecture

serene scaffold Apr 27, 2021, 4:20 PM

#

it looks like that line is just a fancy way of setting certain values in dA2 to 0

#

I think dA2[A2 <= 0] = 0 would have the same effect.
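
A quick check of that equivalence. The line from the lecture is not visible in this log, so `np.multiply(dA2, np.int64(A2 > 0))` below is a guess at what it looked like, and the arrays are made-up values.

```python
import numpy as np

A2 = np.array([[0.0, 1.5, 0.0], [2.0, 0.0, 0.3]])    # ReLU outputs (never negative)
dA2 = np.array([[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]])

# version with the cast: the boolean mask becomes an array of 0s and 1s
masked_a = np.multiply(dA2, np.int64(A2 > 0))

# version with boolean indexing, done on a copy so dA2 itself is untouched
masked_b = dA2.copy()
masked_b[A2 <= 0] = 0

print(np.array_equal(masked_a, masked_b))
```

NumPy also treats booleans as 0 and 1 in arithmetic, so `dA2 * (A2 > 0)` gives the same values without the explicit cast.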

mint palm Apr 27, 2021, 4:21 PM

#

he did it in a less fancy manner once in forward propagation

#

why would he make it more fancy here lol

#

😆

#

he did this earlier

#

in forward prop of dropout

serene scaffold Apr 27, 2021, 4:24 PM

#

Can you see how that could be simplified?

mint palm Apr 27, 2021, 4:25 PM

#

no cuz i dont get what int64 is doing there

serene scaffold Apr 27, 2021, 4:25 PM

#

there is no int64 in that one

mint palm Apr 27, 2021, 4:25 PM

#

mint palm

ya in this i understand everything

#

it's setting values less than the probability to 0 and leaving the others unchanged

serene scaffold Apr 27, 2021, 4:26 PM

#

so do you see how you could simplify it, knowing that dA2[A2 <= 0] = 0 is an alternative to the other one?

mint palm Apr 27, 2021, 4:26 PM

#

its sort of same i guess i understand

lilac needle Apr 27, 2021, 4:51 PM

#

mint palm oh yess so that instead of true false we get numbers

True == 1
False == 0

mint palm Apr 27, 2021, 4:52 PM

#

is saw some usage but does fit into bool concept

#

its not in bool

lilac needle Apr 27, 2021, 4:53 PM

#

That’s why when you sum a series of true and false, you’ll get the total count of True values

mint palm Apr 27, 2021, 4:57 PM

#

lilac needle That’s why when you sum a series of true and false, you’ll get the total count o...

i didnt follow

lilac needle Apr 27, 2021, 5:26 PM

#

mint palm i didnt follow

In Python 3.x True and False are keywords and will always be equal to 1 and 0.
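
A small illustration of both points in plain Python (the `flags` list is made up):

```python
flags = [True, False, True, True]

print(True == 1, False == 0)      # both comparisons hold
print(isinstance(True, int))      # bool is a subclass of int
print(sum(flags))                 # adds 1 for every True, so this counts them
```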

haughty pagoda Apr 27, 2021, 5:51 PM

#

guys

#

can anyone help me with opencv?

#

i wanna detect angular velocity

#

of a rotating object

#

Abyone?

#

*anyone

grave frost Apr 27, 2021, 6:08 PM

#

damn I hate pandas

#

I want to drop the second row, but it messes up the index

#

0               column_2
2                     ....
3                     ....

1 is missing from the index due to the deletion, causing a KeyError. anyone know what this problem is called and its solution?

primal tulip Apr 27, 2021, 6:44 PM

#

grave frost 0 column_2 2 .... 3 ...

You could always reset the index, but that's ill-advised since you want to be able to trace back the changes from the original dataset back to your actual work.

grave frost Apr 27, 2021, 6:45 PM

#

primal tulip You could always reset the index, but that's ill-advised since you want to be ab...

🤷 Im just double iterating over it now rather than indexing

primal tulip Apr 27, 2021, 6:45 PM

#

In any case, if you want to go that way, https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.html
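
A minimal sketch of what happens to the index after a drop, on a made-up four-row frame. It shows two ways around the missing label: position-based access with `.iloc`, or renumbering with `reset_index`.

```python
import pandas as pd

df = pd.DataFrame({"column_2": ["a", "b", "c", "d"]})

dropped = df.drop(index=1)            # labels are now 0, 2, 3
print(list(dropped.index))

# position-based access still works even though label 1 is gone
print(dropped.iloc[1]["column_2"])

# renumber from 0; drop=True discards the old labels instead of keeping them as a column
renumbered = dropped.reset_index(drop=True)
print(list(renumbered.index))
```

Leaving out `drop=True` keeps the old labels in a new `index` column, which is one way to stay traceable to the original dataset.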

nova widget Apr 27, 2021, 6:46 PM

#

@grave frost just do "for row in dataset:"

#

or "for row in range(len(dataset))"

grave frost Apr 27, 2021, 6:53 PM

#

nova widget @grave frost just do "for row in dataset:"

something like that - I wanted to iterate over double columns, so I converted it to a dict to make everything easy to work with

#

no more pandas 🥳

mint palm Apr 27, 2021, 8:24 PM

#

lilac needle In Python 3.x True and False are keywords and will always be equal to 1 and 0.

ya but still it doesnt make sense to me about using int64

tepid rapids Apr 27, 2021, 8:49 PM

#

hey im working on a KNN algorithm that tries to predict whether a youtube video will trend based on the title. Does anyone know where i can get a dataset that includes trending and non trending? i can only find trending so far...

velvet linden Apr 27, 2021, 9:11 PM

#

so if i have a program
that checks a csv file, and it is like if this input is found in the a column then go the value next to it in the b column, and check if the next input the user types matches that.
but i dont know how to do that
any help?

real basalt Apr 27, 2021, 10:15 PM

#

velvet linden so if i have a program that checks a csv file, and it is like if this input is f...

that's what i am trying to do lol

exotic maple Apr 27, 2021, 10:41 PM

#

velvet linden so if i have a program that checks a csv file, and it is like if this input is f...

You could probably do something trivial like this (not sure if the best python way to do it)

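# find the index label(s) of the rows where "column" matches value
idx = df[df["column"] == value].index
# retrieve the matching value(s) from "column-b" (.at only takes a single label, and uses [] not ())
val = df.loc[idx, "column-b"]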
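
Putting the lookup together with the second user input, a self-contained sketch might look like this. The column names `a` and `b` follow the question; the CSV content and the `matches` helper are made up for illustration.

```python
import io
import pandas as pd

# stand-in for the real CSV file
csv_text = "a,b\nhello,world\nping,pong\n"
df = pd.read_csv(io.StringIO(csv_text))


def matches(df, first_input, second_input):
    """Return True if `first_input` is in column a and `second_input`
    equals the value next to it in column b."""
    rows = df.loc[df["a"] == first_input, "b"]
    if rows.empty:
        return False          # first input not found in column a
    return rows.iloc[0] == second_input


print(matches(df, "ping", "pong"))
print(matches(df, "ping", "world"))
```

With a real file, `pd.read_csv("your_file.csv")` would replace the `io.StringIO` stand-in, and the two arguments would come from `input()`.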

maiden sigil Apr 28, 2021, 3:09 AM

#

how to filter date multiple coloumn

copper willow Apr 28, 2021, 3:20 AM

#

Hi guys, I hope this is the right channel to ask for opinions about this: I want to create a whatsapp bot using Python that report to the users the status of delivery of their product. Any ideas of how can I do that?

flint mason Apr 28, 2021, 6:22 AM

#

df['Percent'] = clean_values(df['changesPercentage'])

Note: clean_values removes brackets and symbols out of the number and convert the string to float. Is there something wrong with the syntax
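
The assignment itself is valid syntax. One possible cause of trouble, assuming `clean_values` is written to handle a single string, is that it receives the whole column at once. The sketch below uses a hypothetical stand-in for `clean_values` and applies it element by element.

```python
import pandas as pd


def clean_values(text):
    """Hypothetical stand-in: strip brackets and symbols, return a float."""
    for ch in "()%+,":
        text = text.replace(ch, "")
    return float(text)


df = pd.DataFrame({"changesPercentage": ["(+1.25%)", "(-0.40%)"]})

# apply the string function to each element of the column
df["Percent"] = df["changesPercentage"].apply(clean_values)
print(df["Percent"].tolist())
```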

raven knoll Apr 28, 2021, 10:55 AM

#

I am working on an unsupervised text sentiment project but this is the first time I am doing this. I got some feedback the last time I posted here but I still have some questions.

Currently I have a dataset to train the model but I don't know how to make the model.

I have preprocessed the data. (stemmed, lemmatized, removed stopwords)
I have used word2vec
Used KMeans to create 2 clusters (but the clusters are not good and I don't know what I can do about it.)
Now I don't know what to do

lapis sequoia Apr 28, 2021, 10:58 AM

#

you cannot believe how many hours it took me to realize that fit_transform is actually fitting and then transforming the DataFrame 🤣

#

I was applying that nonstop on test dataset
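
For anyone hitting the same thing, the usual pattern is to fit on the training data only and reuse those statistics on the test data. A minimal sketch with scikit-learn's `StandardScaler` and made-up arrays (the scaler is just an example; the log does not say which transformer was involved):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[10.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)   # learns mean/std from train, then scales it
X_test_scaled = scaler.transform(X_test)         # reuses the train statistics, no refit

print(scaler.mean_, X_test_scaled)
```

Calling `fit_transform` on the test set would recompute the mean and standard deviation from the test data, so train and test would end up on different scales.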

primal tulip Apr 28, 2021, 11:01 AM

#

lapis sequoia you cannot believe how many hours it took me to realize that fit_transform is ac...

Oh boy hahahaha. I guarantee you're not doing that mistake again. Emotional trauma is the best teacher sometimes lol.

lapis sequoia Apr 28, 2021, 11:01 AM

#

yeah, i was literally hugging the documentation at night and praying to it in hopes of finding an answer

primal tulip Apr 28, 2021, 11:31 AM

#

lapis sequoia yeah, i was literally hugging the documentation at night and praying to it in ho...

I have like 5 days fighting over a read-streaming-data program I kinda need for work to do some aggregations on huge datasets.
I read a bit on https://wiki.haskell.org/Lazy_evaluation Haskell's lazy approach, and when combined with Pandas it can deal with the data in chunks decently. Even so, it was going really slow, so something was amiss. I have a padding function for UnicodeEncodeErrors that just prints a '?' for each invalid char it finds, but the issue was that I passed it the whole chunk of the dataframe and it cast the chunk to str, instead of just the invalid value. Since almost every chunk had one weird char, I was casting everything to a string, printing each char one by one in that chunk and then casting it back to pd.DataFrame. To read 1 million records and 30ish rows, it took 52 minutes lol. I haven't fixed it yet, but hopefully it'll run in 20 seconds (ish) if everything goes accordingly.

untold cove Apr 28, 2021, 11:32 AM

#

Hi all, is it possible to have 2 y axis for 1 x value? https://stackoverflow.com/questions/66545695/python-plotly-dash-question-custom-labels-and-color-based-on-values

primal tulip Apr 28, 2021, 11:33 AM

#

untold cove Hi all, is it possible to have 2 y axis for 1 x value? https://stackoverflow.com...

yes https://plotly.com/python/multiple-axes/

untold cove Apr 28, 2021, 11:34 AM

#

I want 2 bars with px.bar.

#

@primal tulip doesn’t provide a bar example nor with plotly.express unfortunately

lapis sequoia Apr 28, 2021, 11:37 AM

#

primal tulip I have like 5 days fighting over a read-streaming-data program I kinda need for ...

sheesh yeah i've noticed string like work in pandas with huge datasets is a bit slow, datetime modules are even slower.

fast saffron Apr 28, 2021, 11:39 AM

#

I need help to fit multiple columns in a linear

#

LinearRegression

#

Like comparing X to different Ys (different columns)

primal tulip Apr 28, 2021, 11:46 AM

#

untold cove @primal tulip doesn’t provide a bar example nor with plotly.express unf...

I have to do something like this in the near future for my program. I'll get back to you with the answer (if I have it) in case you haven't found it yet.

mint palm Apr 28, 2021, 2:36 PM

#

#

How is this momentum equation derived.....i need reason to why the equations are like this....

#

?

noble sand Apr 28, 2021, 2:53 PM

#

I'm trying to plot information/facts about companies from a dictionary item onto a timeline like this, how would I do that? At the moment, I've got one item stored in the dictionary, it gets plotted on the graph but doesn't get labelled with its name and as well annotating other info too...

winged yew Apr 28, 2021, 3:07 PM

#

anyone have ML projects >>>

#

???

knotty kayak Apr 28, 2021, 3:38 PM

#

does anyone know why using multiprocessing while using matplotlib opens up multiple plots

kindred radish Apr 28, 2021, 4:24 PM

#

Quick question about plotting data

#

Say i've got this data:

#

#

You could easily just draw a straight line through it and say they're linearly correlated

#

but is this gap in the middle problematic?

#

If i plotted a straight line through both regions individually it would say that they're not linearly correlated

sweet cobalt Apr 28, 2021, 4:48 PM

#

knotty kayak does anyone know why using multiprocessing while using matplotlib opens up multi...

is plt.show() in the function?

#

If so thats probably why

knotty kayak Apr 28, 2021, 4:48 PM

#

nope

#

it can be any function, even with a function with just print and itll do the same thing

sweet cobalt Apr 28, 2021, 4:49 PM

#

So nothing to do with matplotlib is in the function t

#

Just the fact that you imported matplotlib causes the multiple plotting

misty flint Apr 28, 2021, 5:25 PM

#

kindred radish but is this gap in the middle problematic?

depends on the data and context of your problem

#

2 data clusters seem pretty significant depending on your problem

#

like if you were in retail and it represented 2 different demographics

#

otherwise, you could probs use a linear model...just probs not the strongest is all

dapper halo Apr 28, 2021, 5:49 PM

#

My colab continuously crashes when I simply take the difference between predicted values and true values.

Worked around that by throwing it into a loop to compute difference of each element (wouldnt this take more RAM??).

Now it gets stuck when attempting to plot the histogram of those difference. Any tips for reducing the load....which i honestly don't even get why its having a problem with plotting yet trains just fine

late shell Apr 28, 2021, 7:01 PM

#

hello, can someone help me understand the non-decreasing property of R^2 in regression models? I can't understand why R^2 can never decrease when new predictors are added. I found this explanation on stackexchange. At the end of the answer, the author says: "Or if extra estimated coefficient (β_{p+1}) takes a nonzero value, the SSE will reduce." Why would the SSE necessarily decrease? Isn't it possible that the new combination of coefficients (β's) would make even worse predictions? What if the model, after adding a new predictor, makes worse predictions than the model where the new predictor wasn't present? Worse predictions would increase the SSE, and as a result R^2 would decrease. Where am I wrong?

molten hamlet Apr 28, 2021, 7:03 PM

#

with some stretching and maybe some of it is redundant but this is pandas version haha https://pastebin.com/vzBKKTNy

#

@desert oar , just saying I solved it if you were curious hah

desert oar Apr 28, 2021, 7:04 PM

#

this looks like a much better use of pandas functionality

#

glad you figured it out

fierce grove Apr 28, 2021, 9:23 PM

#

@late shell The basic idea is Sum of Squares total (SSTotal) equals to Sum of squares of individual factors + Sum of squares of their interactions (if any) + Sum of squares of error (SSE)
SST = (SS1 + SS2 + ... + SSn) + (SS12 + SS13 + ...) + SSE.
So by adding a new predictor, say n+1, it comes in the form of SS(n+1) and its interactions with others (if any), and since SSTotal remains constant the SSE has to decrease or stay the same.
Thus in the R^2 formula the SSE either decreases or remains constant. So the R^2 either increases or remains constant.

#

Hope it helped you.
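
Another way to see it: least squares picks the coefficients that minimize the SSE on the training data, and setting the new coefficient to 0 reproduces the old model exactly, so the minimum with the extra predictor cannot be worse. A small numerical sketch with NumPy on made-up data (`r_squared` is a helper written for this example; this is in-sample R^2 only, not performance on new data):

```python
import numpy as np


def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return in-sample R^2."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    sse = np.sum(residuals ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - sse / sst


rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
y = 2 * x1 + rng.normal(size=50)
noise = rng.normal(size=50)           # a predictor unrelated to y

r2_small = r_squared(x1.reshape(-1, 1), y)
r2_big = r_squared(np.column_stack([x1, noise]), y)
print(r2_small, r2_big)
```

Even the pure-noise column does not lower R^2, which is why adjusted R^2 or a held-out set is normally used to judge whether a predictor is worth keeping.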

oak jungle Apr 28, 2021, 10:05 PM

#

Hey, I was wondering if this channel included neural networks and machine learning, or if this is just for standard a.i.

tidal bough Apr 28, 2021, 10:11 PM

#

oak jungle Hey, I was wondering if this channel included neural networks and machine learni...

Sure, the channel description mentions ML and there are often ML people hanging out here.

oak jungle Apr 28, 2021, 10:11 PM

#

Ok, thanks, forgot to check that

#

For some reason

wicked mantle Apr 28, 2021, 10:33 PM

#

can one CNN model be used for all types of images? 🤔 (for recognition)
For example i have a model which is good at a dogs, cats, ducks dataset. As a result, can i just change the dataset to other images? Without changing the fundamental CNN model

ripe forge Apr 28, 2021, 11:00 PM

#

Sure, with no guarantees whether it will perform just as well or not.

thick kestrel Apr 28, 2021, 11:44 PM

#

Hello people... first time posting here... I am working on a model that predicts whether a person was arrested based on some variable information...
My target variable has multi-class data and I chose to convert the classification to numerical values prior to fitting the data to the model.

new_target_values = {'Arrest':0,
'Field Contact':1,
'Citation / Infraction':2,
'Offense Report':3,
'Referred for Prosecution':4}

I got ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].

Should I just do a binary classification and have arrest be 1 while the rest are 0s or should i try fitting in a multiclass model
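
That error comes from a scikit-learn metric whose `average` parameter defaults to `'binary'`. The log does not show which metric was called, so `f1_score` below is an assumption, and the labels are made-up integer codes like the ones in the mapping.

```python
from sklearn.metrics import f1_score

# made-up labels using integer class codes
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

per_class = f1_score(y_true, y_pred, average=None)      # one score per class
macro = f1_score(y_true, y_pred, average="macro")       # unweighted mean of per-class scores
micro = f1_score(y_true, y_pred, average="micro")       # from global TP/FP/FN counts
print(per_class, macro, micro)
```

Keeping all five classes and choosing an `average` setting is one option; collapsing to Arrest vs. everything else is a different modelling question, not just a fix for the error.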

tame lichen Apr 28, 2021, 11:47 PM

#

question what applications is a random forest model good for?

serene scaffold Apr 28, 2021, 11:50 PM

#

tame lichen question what applications is a random forest model good for?

it can sometimes be good for classification tasks.

tame lichen Apr 29, 2021, 12:09 AM

#

serene scaffold it can sometimes be good for classification tasks.

what would be a good model to use to predict sales?

#

kinda general I know

velvet thorn Apr 29, 2021, 2:12 AM

#

tame lichen what would be a good model to use to predict sales?

if you wanna use classical ML, try some sort of boosting

lavish sleet Apr 29, 2021, 2:30 AM

#

Does anyone have a code snippet for multi word keyword analysis

slate hollow Apr 29, 2021, 2:48 AM

#

Epoch 1/100
1250/1250 [==============================] - 1s 400us/step - loss: 128.7992
Epoch 2/100
1250/1250 [==============================] - 0s 397us/step - loss: 1.5939
Epoch 3/100
1250/1250 [==============================] - 1s 406us/step - loss: 1.4500
Epoch 4/100
1250/1250 [==============================] - 0s 385us/step - loss: 1.3226
Epoch 5/100
1250/1250 [==============================] - 0s 398us/step - loss: 1.1951
so uh my `X_train` size is 40k, so why is it only 1250

#

(ping 2 reply thx)

velvet thorn Apr 29, 2021, 3:07 AM

#

slate hollow Epoch 1/100 1250/1250 [==============================] - 1s 400us/step - l...

show your training code

slate hollow Apr 29, 2021, 3:07 AM

#

wait uh

#

here

#

import tensorflow as tf
import numpy as np
keras = tf.keras


def func(inp: np.ndarray) -> np.ndarray:
    return np.array([inp[0] * 2, inp[1] + inp[0] * 3, inp[0] * 1 + inp[1] * 10, inp[1], inp[0] + inp[1]])


training = []
for x in range(200):
    for y in range(200):
        training.append([x, y])
X_train = np.array(training)
y_train = np.array([func(x) for x in X_train])

model = keras.models.Sequential(layers=[
    keras.layers.Dense(5, input_shape=(2,)),
    # keras.layers.Dense(5)
])

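# decay the learning rate for the first 20 epochs, then hold it constant
lr_decay = keras.callbacks.LearningRateScheduler(lambda e, lr: lr * np.exp(-0.1) if e < 20 else lr)
# `learning_rate` is the current name of the argument; `lr` is a deprecated alias
model.compile(optimizer=keras.optimizers.SGD(learning_rate=1e-3), loss=keras.losses.MeanAbsoluteError())
# callbacks is documented as a list of callback instances
model.fit(X_train, y_train, epochs=100, callbacks=[lr_decay])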

#

it's really rudimentary, but i'm just learning

velvet thorn Apr 29, 2021, 3:08 AM

#

slate hollow Epoch 1/100 1250/1250 [==============================] - 1s 400us/step - l...

default batch size is 32

#

40000 / 32 = 1250
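
The same arithmetic as a tiny script. The numbers come from the training code above; rounding up covers the case where the last batch is only partly full.

```python
import math

n_samples = 200 * 200     # 40,000 (x, y) pairs built by the nested loops
batch_size = 32           # Keras default when batch_size is not passed to fit()

steps_per_epoch = math.ceil(n_samples / batch_size)
print(steps_per_epoch)
```

Passing `batch_size=...` to `model.fit` changes the number shown in the progress bar accordingly.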

slate hollow Apr 29, 2021, 3:08 AM

#

oh

#

so that's like how many batches?

velvet thorn Apr 29, 2021, 3:08 AM

#

ye

#

I don't remember it being like that

slate hollow Apr 29, 2021, 3:08 AM

#

also another question, the loss is the sum of the losses

velvet thorn Apr 29, 2021, 3:08 AM

#

but it's been a while

slate hollow Apr 29, 2021, 3:09 AM

#

over all

#

over all training instances right

velvet thorn Apr 29, 2021, 3:09 AM

#

uh

#

the loss is defined

#

as a function over arrays

#

like f(actual, predicted)

slate hollow Apr 29, 2021, 3:10 AM

#

yeah

#

but there's so many instances

#

is it averaged or summed

velvet thorn Apr 29, 2021, 3:10 AM

#

slate hollow is it averaged or summed

no

#

that's my point

slate hollow Apr 29, 2021, 3:10 AM

#

wait what

velvet thorn Apr 29, 2021, 3:10 AM

#

not a function applied over individual values in actual and predicted

slate hollow Apr 29, 2021, 3:10 AM

#

i'm confused

velvet thorn Apr 29, 2021, 3:10 AM

#

you have an array of actual values, and an array of predicted values

#

and the loss function

#

does whatever it wants

#

so

#

in the case of mean absolute error

slate hollow Apr 29, 2021, 3:11 AM

#

it depends on the loss function

velvet thorn Apr 29, 2021, 3:11 AM

#

it's the mean of the absolute differences

#

(I would presume)
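
Going by its name, mean absolute error averages the absolute differences instead of summing them. A plain-Python sketch of both quantities on made-up numbers (this illustrates the formula only and does not call Keras):

```python
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
total = sum(abs_errors)              # sum of absolute differences
mae = total / len(abs_errors)        # mean absolute error

print(total, mae)
```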

slate hollow Apr 29, 2021, 3:11 AM

#

yeah ok then

#

thx!

velvet thorn Apr 29, 2021, 3:11 AM

#

yw 👋

slate hollow Apr 29, 2021, 3:46 AM

#

velvet thorn yw 👋

hey, uh, i have another question
when tweaking the parameters, why is it lr * gradient? wouldn't just a general direction be enough? (positive, negative, or 0)?

shy tundra Apr 29, 2021, 3:53 AM

#

Hey guys, so i want to build a 3D model of a place for a project and i want to run an AI simulation through it based on customer shopping patterns. What is a good program for me to use which supports AI in the 3D model

exotic maple Apr 29, 2021, 3:53 AM

#

slate hollow hey, uh, i have another question when tweaking the parameters, why is it `lr * g...

the learning rate is an adjustment to fine-tune how much your function is changing based on the observed gradient.

if you simply set it to 1 you have no flexibility and your model might never reach (or take forever to reach) the global minimum of the cost function. Too high an LR can make you "fly over" the minimum, and too small an LR may take too long and use too many computational resources to converge to a solution
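
On the "wouldn't just a direction be enough" part of the question, a toy comparison may help. The sketch below minimizes f(x) = x**2 twice: once stepping by `lr * sign(gradient)` and once by `lr * gradient`. The function, starting point and learning rate are made up for illustration.

```python
def descend(x, lr, steps, use_sign):
    """Minimize f(x) = x**2 (gradient 2*x) and return the final x."""
    for _ in range(steps):
        grad = 2 * x
        if use_sign:
            # direction only: always move by exactly lr
            direction = (grad > 0) - (grad < 0)
            x -= lr * direction
        else:
            # direction and magnitude: steps shrink as the slope flattens
            x -= lr * grad
    return x


sign_result = descend(1.0, lr=0.3, steps=20, use_sign=True)
grad_result = descend(1.0, lr=0.3, steps=20, use_sign=False)
print(sign_result, grad_result)
```

With the sign only, every step has the same size, so x ends up bouncing back and forth around the minimum. With the gradient, the steps get smaller as the slope flattens, so x settles close to 0.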

slate hollow Apr 29, 2021, 3:54 AM

#

exotic maple the learning rate is an adjustment to fine-tune how much your function is changi...

but just because a function is like really steep where the params are now, doesn't mean it's steep for a long time

exotic maple Apr 29, 2021, 3:55 AM

#

slate hollow but just because a function is like really steep where the params are now, doesn...

But you dont (generally) know that, neither does the algo.

slate hollow Apr 29, 2021, 3:55 AM

#

i mean like take this hypothetical cost function:
    /
---/ <-- we're here
i mean it's steep, but that doesn't mean we should

#

oh ok

#

so it's just generally agreed upon, and it's worked for most models?

exotic maple Apr 29, 2021, 3:55 AM

#

#

this is what LR does

#

Learning rate just determines the "step size" how large is your jump

#

#

and this is a good visualization of what happens with different learning rates

#

#

if you look at the right image it "jumps" over the minima because the step size (LR) is too large
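
Since the images are not part of this log, here is the same idea as numbers: gradient descent on f(x) = x**2 with three made-up learning rates.

```python
def run(lr, steps=10, x=1.0):
    """Gradient descent on f(x) = x**2; returns every x visited."""
    path = [x]
    for _ in range(steps):
        x -= lr * 2 * x       # gradient of x**2 is 2*x
        path.append(x)
    return path


small = run(lr=0.01)   # tiny steps: still far from 0 after 10 updates
good = run(lr=0.4)     # reaches the minimum quickly
large = run(lr=1.1)    # each step jumps across 0 and lands farther away

for name, path in [("small", small), ("good", good), ("large", large)]:
    print(name, round(path[-1], 4))
```

The `large` run is the "jumps over the minimum" case: x changes sign on every step and its magnitude grows.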

slate hollow Apr 29, 2021, 4:18 AM

#

hey- for the sklearn housing data set, ik the target variable is the mean house price, but what's the unit for that?

#

$100k or something?

delicate lodge Apr 29, 2021, 6:15 AM

#

Hi ,
I am developing a recommendation system
I have a question...
that suppose we have the product list so how we can do synthetic grouping of that list.
for example
we have
milk , 1L
milk, 500ml
milk,2L
I want that my system consider it as same
any idea ...

ripe forge Apr 29, 2021, 6:39 AM

#

slate hollow hey, uh, i have another question when tweaking the parameters, why is it `lr * g...

You're driving on the road. You ask, are we there yet?. I tell you, no, but your destination is ahead. You ask, how far are we?. I say, "your destination is ahead". You say.. That's not very useful on its own is it. I say, yep. Too bad.

#

Here's the real kicker. My direction information felt incomplete when we even knew our destination. Now imagine the same scenario where we don't even know where the destination is.. Oh and we teleport randomly to different roads and keep asking the same question.....

#

Best part is, that assumes that we would even know we're there when we arrive. Which we don't. Sounds like fun

late shell Apr 29, 2021, 7:21 AM

#

fierce grove @late shell The basic idea is Sum of Squares total (SSTotal) equals to...

thank you

hardy jetty Apr 29, 2021, 10:23 AM

#

What is the difference between np.min(array) and array.min()? I timed it on a numpy.ndarray, and array.min() is a bit faster. Would've thought it would be the same speed.
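
A way to reproduce the comparison with `timeit`. Both calls return the same value. As far as I know, `np.min` is a Python-level wrapper function around the same reduction, so a small fixed per-call overhead would be the likely cause, but treat that as an assumption and measure on your own machine.

```python
import timeit
import numpy as np

array = np.arange(1000.0)

# same result either way
print(np.min(array) == array.min())

# time many calls of each form; absolute numbers depend on the machine
t_func = timeit.timeit(lambda: np.min(array), number=10_000)
t_method = timeit.timeit(lambda: array.min(), number=10_000)
print(f"np.min(array): {t_func:.4f}s  array.min(): {t_method:.4f}s")
```

If the gap is a fixed per-call cost, it should matter less and less as the array gets larger.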

lapis sequoia Apr 29, 2021, 10:46 AM

#

Hey everyone, I have a question about the credit risk notebook from pysurvival

#

this one https://github.com/square/pysurvival/blob/master/notebooks/Credit Risk - Computing the speed of repayment of loans.ipynb

#

the goal in this notebook is to predict the speed of a repayment loan

#

but at the end

#

we finish by plotting this graph that I don't understand

#

#

I'm not sure what the y-axis is, and I don't understand how the high risk line can be faster to repay the loan than the low and medium risk

#

shouldn't it be the opposite?

#

Also I'm not sure I understand what the "T=6.0" means (the actual time)

#

I looked at the code but it didn't help me that much, can you help me please?

lapis sequoia Apr 29, 2021, 11:34 AM

#

Hello. Anybody with data science experience? I want Simpsons transcripts for a machine learning task. I want them all in .txt files for all the episodes named ep1.txt, ep2.txt, ep3.txt, ep4.txt ... and so on. I found a script dataset of Simpsons here: https://www.kaggle.com/prashant111/the-simpsons-dataset?select=simpsons_script_lines.csv
but it is one csv file that is not split. How do I get the data in the kaggle link into my format?
Can anybody tell me a script to get the data in the format I want? Or is the data available in the format I want anywhere? I'd appreciate any sort of help!

serene scaffold Apr 29, 2021, 12:01 PM

#

@lapis sequoia do you know pandas?

uncut barn Apr 29, 2021, 12:03 PM

#

What is the difference between the correlation coefficient and the p-value in relation to how good the regression model is?

tame sleet Apr 29, 2021, 12:10 PM

#

I need some help with numpy's random.randn
why and what does it print

serene scaffold Apr 29, 2021, 12:23 PM

#

tame sleet I need some help with numpy's random.randn why and what does it print

it returns an array where the elements have standard normal distribution
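
A short illustration: the arguments to `randn` are the dimensions of the array, and every element is an independent draw from the normal distribution with mean 0 and standard deviation 1 (the "standard" normal). The seed is only there to make the run repeatable.

```python
import numpy as np

np.random.seed(0)                  # only so the run is repeatable
samples = np.random.randn(3, 2)    # 3x2 array of draws from N(0, 1)
print(samples.shape)

big = np.random.randn(100_000)
print(big.mean(), big.std())       # close to 0 and 1 for a large sample
```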

serene scaffold Apr 29, 2021, 12:24 PM

#

uncut barn What is the difference between the correlation coefficient and the p-value in re...

one uses the p-value to figure out if the model's performance is just the result of random chance (to simplify it a bit)

#

whereas the correlation coefficient is a measure of how strongly related two variables are.

uncut barn Apr 29, 2021, 12:27 PM

#

@serene scaffold ah ok thanks

#

has anyone read the storks delivers babies paper?

lapis sequoia Apr 29, 2021, 12:28 PM

#

serene scaffold @lapis sequoia do you know pandas?

yes

serene scaffold Apr 29, 2021, 12:28 PM

#

lapis sequoia yes

is there a column that gives you the episode number?

lapis sequoia Apr 29, 2021, 12:29 PM

#

serene scaffold is there a column that gives you the episode number?

yes there is

serene scaffold Apr 29, 2021, 12:29 PM

#

lapis sequoia yes there is

so you can keep selecting rows by episode number and write each slice of the data to file like you wanted.

lapis sequoia Apr 29, 2021, 12:30 PM

#

serene scaffold so you can keep selecting rows by episode number and write each slice of the dat...

yes, but how exactly would I implement that in code?

serene scaffold Apr 29, 2021, 12:30 PM

#

lapis sequoia yes, but how exactly would I implement that in code?

well, it's not much of a learning experience if I give you the code

lapis sequoia Apr 29, 2021, 12:31 PM

#

serene scaffold well, it's not much of a learning experience if I give you the code

pweez I'll be able to learn from the code

serene scaffold Apr 29, 2021, 12:31 PM

#

lapis sequoia pweez I'll be able to learn from the code

how much programming experience would you say you have?

lapis sequoia Apr 29, 2021, 12:32 PM

#

serene scaffold how much programming experience would you say you have?

If beginner is 0, Intermediate is 0.5, and expert is 1, I'd say I'm 0.7

#

but I'm new to dealing with csv files

tame sleet Apr 29, 2021, 12:34 PM

#

serene scaffold it returns an array where the elements have standard normal distribution

what does standard normal distribution mean?

lapis sequoia Apr 29, 2021, 12:34 PM

#

@serene scaffold are you there?

serene scaffold Apr 29, 2021, 12:35 PM

#

!docs pandas.DataFrame.groupby

arctic wedgeBOT Apr 29, 2021, 12:35 PM

#

pandas.DataFrame.groupby


DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=<object object>, observed=False, dropna=True)
Group DataFrame using a mapper or by a Series of columns.

A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.

lapis sequoia Apr 29, 2021, 12:36 PM

#

how do I use pd.dataframe.groupby to achieve what I want?

serene scaffold Apr 29, 2021, 12:37 PM

#

lapis sequoia how do I use pd.dataframe.groupby to achieve what I want?

what column are you trying to group things by?

serene scaffold Apr 29, 2021, 12:38 PM

#

tame sleet what does standard normal distribution mean?

while I'm averse to telling people to "just google it", I personally wouldn't be able to give you a better answer than an online resource.

lapis sequoia Apr 29, 2021, 12:40 PM

#

?
the original dataset looks like this, preview it here pls: https://www.kaggle.com/prashant111/the-simpsons-dataset?select=simpsons_script_lines.csv
I just want files like 'ep1.txt' 'ep2.txt' and so on containing scripts like this:

Homer Simpson: Hello
Moe: Hello
*Character: Dialogue*

serene scaffold Apr 29, 2021, 12:40 PM

#

lapis sequoia ? the original dataset looks like this, preview it here pls: https://www.kaggle....

right, I know what the data looks like. I downloaded it so I can help you. but I'm not going to write the code for you

#

look at the data--if you want to handle each episode separately, what column gives you that information?

lapis sequoia Apr 29, 2021, 12:41 PM

#

serene scaffold look at the data--if you want to handle each episode separately, what column giv...

episode number

serene scaffold Apr 29, 2021, 12:41 PM

#

are you sure?

lapis sequoia Apr 29, 2021, 12:41 PM

#

episode id

serene scaffold Apr 29, 2021, 12:41 PM

#

right

lapis sequoia Apr 29, 2021, 12:41 PM

#

🙂

serene scaffold Apr 29, 2021, 12:42 PM

#

strictly speaking it is episode_id. the underscore is necessary

lapis sequoia Apr 29, 2021, 12:42 PM

#

serene scaffold strictly speaking it is `episode_id`. the underscore is necessary

yes

#

wait

serene scaffold Apr 29, 2021, 12:42 PM

#

so you need to select rows by each episode id and write out each slice.

lapis sequoia Apr 29, 2021, 12:42 PM

#

let me try to write the code
can I ask you if I have any problems while writing the code

serene scaffold Apr 29, 2021, 12:43 PM

#

lapis sequoia let me try to write the code can I ask you if I have any problems while writing ...

I need to do homework but if you ping me I'll try to look at it

lapis sequoia Apr 29, 2021, 12:43 PM

#

serene scaffold I need to do homework but if you ping me I'll try to look at it

thank you

serene scaffold Apr 29, 2021, 12:43 PM

#

I'll give you one more hint

#

https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.GroupBy.__iter__.html#pandas.core.groupby.GroupBy.__iter__
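
For readers following along, the hint can be sketched on a tiny stand-in frame. `episode_id` is the column named in the conversation; `raw_character_text` and `spoken_words` are assumed column names, and the output goes to a temporary directory.

```python
import tempfile
from pathlib import Path

import pandas as pd

# tiny stand-in for simpsons_script_lines.csv; besides episode_id,
# the column names here are assumptions about the dataset
lines = pd.DataFrame({
    "episode_id": [1, 1, 2],
    "raw_character_text": ["Homer Simpson", "Moe", "Bart Simpson"],
    "spoken_words": ["Hello", "Hello", "Ay caramba"],
})

out_dir = Path(tempfile.mkdtemp())

# iterating over a GroupBy yields (key, sub-DataFrame) pairs
for episode_id, episode in lines.groupby("episode_id"):
    script = "\n".join(
        f"{who}: {what}"
        for who, what in zip(episode["raw_character_text"], episode["spoken_words"])
    )
    (out_dir / f"ep{episode_id}.txt").write_text(script + "\n")

print(sorted(p.name for p in out_dir.iterdir()))
```

With the real file, `pd.read_csv(...)` would replace the hand-built frame, and rows with missing speaker or dialogue values would need handling first.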

lapis sequoia Apr 29, 2021, 12:50 PM

#

hello
my computer is very weak
it crashed because the csv file was too large
and repl.it can't load the csv file

serene scaffold Apr 29, 2021, 12:52 PM

#

lapis sequoia hello my computer is very weak it crashed because the csv file was too large and...

let me see if there's an option

#

@lapis sequoia you can just read in a certain number of rows at a time, but that means you'll need to be appending to the outputted files
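
A sketch of that chunked approach, using `read_csv`'s `chunksize` parameter on an in-memory stand-in for the big file. Column names other than `episode_id` are assumptions, and the chunk size of 2 is only small so that an episode spans two chunks in the example.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# stand-in for the large CSV; column names other than episode_id are assumptions
csv_text = (
    "episode_id,raw_character_text,spoken_words\n"
    "1,Homer Simpson,Hello\n"
    "1,Moe,Hello\n"
    "1,Homer Simpson,Goodbye\n"
    "2,Bart Simpson,Ay caramba\n"
)
out_dir = Path(tempfile.mkdtemp())

# chunksize makes read_csv yield DataFrames of at most that many rows
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    for episode_id, episode in chunk.groupby("episode_id"):
        # append mode, because one episode can span several chunks
        with open(out_dir / f"ep{episode_id}.txt", "a") as fh:
            for who, what in zip(episode["raw_character_text"], episode["spoken_words"]):
                fh.write(f"{who}: {what}\n")

print((out_dir / "ep1.txt").read_text())
```

Because the files are opened in append mode, old output files should be deleted before rerunning, otherwise lines get duplicated.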

tawdry hamlet Apr 29, 2021, 12:58 PM

#

Yo, is it alright if I ask for some advice on how to do some down-and-dirty outlier detection in a t-SNE plot? I am currently evaluating a weird machine learning method I jury-rigged together and am trying to generate some evidence that what the system is flagging as abnormal is actually abnormal

lapis sequoia Apr 29, 2021, 1:44 PM

#

@serene scaffold ARe you there

#

I wrote my code

#

but was waiting for your homework to finish

lapis sequoia Apr 29, 2021, 1:45 PM

#

serene scaffold @lapis sequoia you can just read in a certain number of rows at a time, b...

Oh don't worry I wrote the code

#

but I need your help

#

a lil

#

are the episode ids random unique ids? or are they the number of the episode?

#

I wrote this, so I can get all the dialogues of a particular episode to write to my txt files: https://replit.com/@BleepLogger/freeprocess#main.py

sick wedge Apr 29, 2021, 1:59 PM

#

Sorry to interrupt you Pinkie, hoping someone else can chime in. I'm trying to catch up on this course but I'm really stuck on the basics. At the moment I'm on this exercise:

Exercise 6:
Please import from seaborn the famous Anscombe’s quartet. Then plot them with
matplot. And calculate their means, variances correlations and linear fitting
coefficients. For linear regression, you can use the sklearn lib. Can you have a more
concise way to plot the data?

And I'm given the code

import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import linear_model

anscombe = sns.load_dataset("anscombe")
print(anscombe)

# create subsets and subplots of the anscombe data
dataset_1 = anscombe[anscombe['dataset'] == 'I']
dataset_2= anscombe[anscombe['dataset'] == 'II']
dataset_3 = anscombe[anscombe['dataset'] == 'III']
dataset_4 = anscombe[anscombe['dataset'] == 'IV']
fig = plt.figure()

axes1 = fig.add_subplot(2, 2, 1)
axes2 = fig.add_subplot(2, 2, 2)
axes3 = fig.add_subplot(2, 2, 3)
axes4 = fig.add_subplot(2, 2, 4)

axes1.plot(dataset_1['x'], dataset_1['y'], 'o')
axes2.plot(dataset_2['x'], dataset_2['y'], 'o')
axes3.plot(dataset_3['x'], dataset_3['y'], 'o')
axes4.plot(dataset_4['x'], dataset_4['y'], 'o')

#linear regression model
regr = linear_model.LinearRegression()
regr.fit(dataset_1['x'].values.reshape(-1,1), dataset_1['y'].values.reshape(-1,1))
axes1.plot(dataset_1['x'].values.reshape(-1,1), regr.predict(dataset_1['x'].values.reshape(-1,1)), 'r')
plt.show()

I really just barely have a clue how this code even works. I understand it is plotting graphs at least, and I know Anscombe's quartet will have the same means, variances, medians, etc., but can anyone guide me through calculating those values? Would appreciate any help

#

I didn't receive much support from my lecturer since face-to-face teaching is not allowed :\
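
One possible way to get the numbers the exercise asks for, as a hedged sketch rather than the course's intended solution: group the frame by its `dataset` column and compute the statistics per group. It uses `np.polyfit` for the straight-line fit instead of the sklearn model in the given code, and the helper name `summarize` and the small demo frame are made up so the result can be checked by hand.

```python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Per-dataset statistics for a frame with columns 'dataset', 'x', 'y'."""
    rows = []
    for name, g in df.groupby("dataset"):
        # Degree-1 least-squares fit; coefficients come back as [slope, intercept]
        slope, intercept = np.polyfit(g["x"].to_numpy(), g["y"].to_numpy(), deg=1)
        rows.append({
            "dataset": name,
            "mean_x": g["x"].mean(),
            "mean_y": g["y"].mean(),
            "var_x": g["x"].var(),            # sample variance (ddof=1)
            "var_y": g["y"].var(),
            "corr_xy": g["x"].corr(g["y"]),   # Pearson correlation
            "slope": slope,
            "intercept": intercept,
        })
    return pd.DataFrame(rows).set_index("dataset")


# Hypothetical stand-in data with the same column layout as the anscombe frame
demo = pd.DataFrame({
    "dataset": ["A"] * 3 + ["B"] * 3,
    "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "y": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
})
summary = summarize(demo)
print(summary)
```

With seaborn available, `summarize(sns.load_dataset("anscombe"))` would apply the same function to the real data, and `sns.lmplot(data=anscombe, x="x", y="y", col="dataset", col_wrap=2)` is one more concise way to draw the four panels with fitted lines.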

tame lichen Apr 29, 2021, 2:56 PM

#

velvet thorn if you wanna use classical ML, try some sort of boosting

how complicated would something like this be to set up for a noob?

#

like where would I find the code for an algorithm like this?

uncut barn Apr 29, 2021, 3:00 PM

#

what are the possible relationships between correlation and causation?

lapis sequoia Apr 29, 2021, 3:20 PM

#

@serene scaffold Are you online

serene scaffold Apr 29, 2021, 3:20 PM

#

lapis sequoia @serene scaffold Are you online

yes

lapis sequoia Apr 29, 2021, 3:20 PM

#

lapis sequoia I wrote this, so I can get all the dialogues of a particular episode to write to...

👆

lapis sequoia Apr 29, 2021, 3:22 PM

#

serene scaffold yes

⤴️ .

serene scaffold Apr 29, 2021, 3:24 PM

#

lapis sequoia 👆

alright, what's next?

lapis sequoia Apr 29, 2021, 3:25 PM

#

serene scaffold alright, what's next?

I have written the function to get the script of an episode by knowing its ID. How do I use this?

serene scaffold Apr 29, 2021, 3:26 PM

#

lapis sequoia I have written the function to get the script of an episode by knowing it's ID. ...

right, so once you have all those CSVs, what do you want to do with them?

golden pawn Apr 29, 2021, 4:09 PM

#

[PANDAS] Hello men. I'm having big trouble: I have no idea how to write code to print this:
The most popular girl's name and boy's name in every year (two records per year)
How can I do that? That's the Excel sheet I have read in. Liczba means amount, Plec means sex, Imie means name and Rok means year. And that's the code I was trying to do something with:

```py
print(f"{df1.loc[(df1.groupby('Rok')) & (df1.Plec == 'M')]['Liczba'].idxmax()}")
```
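
A hedged sketch of one way to get two records per year: group by both year and sex, take the row index of the largest `Liczba` in each group with `idxmax`, then look those rows up. The column names come from the message above; the sample rows and the `'K'` code for girls are assumptions (only `'M'` appears in the original attempt).

```python
import pandas as pd

# Made-up rows with the same columns as the sheet described above
df1 = pd.DataFrame({
    "Rok": [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001],
    "Plec": ["K", "K", "M", "M", "K", "K", "M", "M"],
    "Imie": ["ANNA", "JULIA", "JAKUB", "JAN", "ANNA", "JULIA", "JAKUB", "JAN"],
    "Liczba": [120, 150, 200, 180, 160, 140, 170, 190],
})

# For every (year, sex) pair, idxmax returns the index label of the largest count
top_idx = df1.groupby(["Rok", "Plec"])["Liczba"].idxmax()

# Pull out those rows: one girl's name and one boy's name per year
top_names = (
    df1.loc[top_idx]
    .sort_values(["Rok", "Plec"])
    .reset_index(drop=True)
)
print(top_names)
```

The original attempt fails because a `groupby` object cannot be combined with a boolean mask using `&`; grouping on both columns avoids the mask altogether.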

lapis sequoia Apr 29, 2021, 4:13 PM

#

serene scaffold right, so once you have all those CSVs, what do you want to do with them?

I don't want many CSVs, I want many txts. I need them for a machine learning project, specifically few-shot learning with EleutherAI's GPT-Neo.
I finished the code to generate all my txts. Can you please check my code, correct it, and explain any errors to me? Also tell me how I can improve it, why it isn't working if it isn't, and let me know if it works as expected.

#

Here is the finished code: https://replit.com/@BleepLogger/freeprocess#main.py

#

@serene scaffold Are you there?

serene scaffold Apr 29, 2021, 4:22 PM

#

lapis sequoia I don't want many csvs, I want many txts. I need them for a machine learning pro...

What do you want in the text files?

#

Just all the dialogue in a given episode as one continuous stream of text?

lapis sequoia Apr 29, 2021, 4:23 PM

#

serene scaffold What do you want in the text files?

the scripts for all the episodes

lapis sequoia Apr 29, 2021, 4:23 PM

#

serene scaffold Just all the dialogue in a given episode as one continuous stream of text?

yes

serene scaffold Apr 29, 2021, 4:24 PM

#

That's easy to do if you can fit the whole csv in your ram

lapis sequoia Apr 29, 2021, 4:24 PM

#

did you check out my code

lapis sequoia Apr 29, 2021, 4:24 PM

#

serene scaffold That's easy to do if you can fit the whole csv in your ram

oh is the method used in my code fine

lapis sequoia Apr 29, 2021, 4:24 PM

#

serene scaffold That's easy to do if you can fit the whole csv in your ram

I can, I have 13 GB of RAM

#

https://replit.com/@BleepLogger/freeprocess#main.py

serene scaffold Apr 29, 2021, 4:25 PM

#

One moment

lapis sequoia Apr 29, 2021, 4:26 PM

#

serene scaffold One moment

take as long as you want

#

np.full((25, 25), "white", dtype="object")
raises
ValueError: Object arrays cannot be loaded when allow_pickle=False

lapis sequoia Apr 29, 2021, 4:29 PM

#

lapis sequoia `np.full((25, 25), "white", dtype="object")` raises `ValueError: Object arrays c...

set allow_pickle to True

#

then it will work

#

where?

#

i'm not seeing that as an argument in np.full

lapis sequoia Apr 29, 2021, 4:30 PM

#

lapis sequoia where?

https://stackoverflow.com/questions/55890813/how-to-fix-object-arrays-cannot-be-loaded-when-allow-pickle-false-for-imdb-loa

Stack Overflow

How to fix 'Object arrays cannot be loaded when allow_pickle=False'...

I'm trying to implement the binary classification example using the IMDb dataset in Google Colab. I have implemented this model before. But when I tried to do it again after a few days, it returned a

lapis sequoia Apr 29, 2021, 4:31 PM

#

lapis sequoia https://stackoverflow.com/questions/55890813/how-to-fix-object-arrays-cannot-be-...

i'm not using np.load tho

lapis sequoia Apr 29, 2021, 4:31 PM

#

lapis sequoia i'm not seeing that as an argument in np.full

argument in np.load()

lapis sequoia Apr 29, 2021, 4:31 PM

#

lapis sequoia i'm not using np.load tho

wait a sec

#

try the argument in np.full

#

or remove pickle data from your file

#

@lapis sequoia wait I looked it up

#

there is no allow_pickle argument in numpy.full()

serene scaffold Apr 29, 2021, 4:34 PM

#

@lapis sequoia your code can at least be greatly simplified

lapis sequoia Apr 29, 2021, 4:34 PM

#

try opening an issue on GitHub

lapis sequoia Apr 29, 2021, 4:34 PM

#

serene scaffold @lapis sequoia your code can at least be greatly simplified

great! how?

lapis sequoia Apr 29, 2021, 4:35 PM

#

lapis sequoia there is no allow_pickle argument in numpy.full()

okay, let's try a different solution then...what is the correct dtype for a string with variable length?
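
On the dtype question, a small sketch assuming stock NumPy: `object` is the dtype that holds variable-length Python strings, whereas leaving `dtype` out gives a fixed-width Unicode dtype that silently truncates longer values. In stock NumPy the `allow_pickle` message is produced when a saved object array is read back with `np.load`, not by `np.full` itself, so the save/load round trip below is only a guess at where the error in the original setup came from.

```python
import io

import numpy as np

# Object arrays store references to Python strings of any length
grid = np.full((25, 25), "white", dtype=object)
grid[0, 0] = "a much longer colour name"

# Without dtype=object NumPy infers a fixed width from the fill value
fixed = np.full((2, 2), "white")
fixed[0, 0] = "turquoise"  # silently cut down to the first 5 characters

# Saving an object array pickles it; loading refuses unless allow_pickle=True
buffer = io.BytesIO()
np.save(buffer, grid)

buffer.seek(0)
try:
    np.load(buffer)
    load_error = None
except ValueError as exc:
    load_error = str(exc)

buffer.seek(0)
restored = np.load(buffer, allow_pickle=True)

print(fixed[0, 0])
print(load_error)
print(restored[0, 0])
```

Only pass `allow_pickle=True` for files from a trusted source, since unpickling can execute arbitrary code.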

serene scaffold Apr 29, 2021, 4:35 PM

#

import pandas as pd

df = pd.read_csv('simpsons_script_lines.csv')
episode_ids = df['episode_id'].unique()

for id_ in episode_ids:
    ...

#

see if you can go from there

lapis sequoia Apr 29, 2021, 4:36 PM

#

serene scaffold ```py import pandas as pd df = pd.read_csv('simpsons_script_lines.csv') episode...

oh there's a unique function

serene scaffold Apr 29, 2021, 4:37 PM

#

it's a method but yes

lapis sequoia Apr 29, 2021, 4:37 PM

#

I thought I had to make it a list and then make it a set to make it unique

serene scaffold Apr 29, 2021, 4:37 PM

#

that's the verbose way to do it

#

it's easier to just let pandas do it for you 😁
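
One way the skeleton above could be finished, as a sketch under stated assumptions: the `episode_id` column comes from the snippet, but the dialogue column name `spoken_words`, the helper name `write_episode_scripts` and the output file naming are assumptions, and the demo frame is made up. It uses `groupby` in place of the explicit loop over `unique()` ids, which gives the same per-episode split.

```python
from pathlib import Path
import tempfile

import pandas as pd


def write_episode_scripts(df, out_dir, text_col="spoken_words"):
    """Write one text file per episode, one line of dialogue per row."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    # Iterating a grouped column yields (episode id, Series of that episode's lines)
    for id_, lines in df.groupby("episode_id", sort=True)[text_col]:
        # Rows without dialogue (NaN) are skipped
        text = "\n".join(lines.dropna().astype(str))
        path = out_dir / f"episode_{id_}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


# Hypothetical stand-in for pd.read_csv('simpsons_script_lines.csv')
demo = pd.DataFrame({
    "episode_id": [1, 1, 2, 1, 2],
    "spoken_words": ["Hello.", "Hi there.", "New episode.", None, "Bye."],
})
with tempfile.TemporaryDirectory() as tmp:
    written = write_episode_scripts(demo, tmp)
    names = [p.name for p in written]
    first_script = written[0].read_text(encoding="utf-8")

print(names)
print(first_script)
```

Rows keep their original order within each episode, so if the CSV is not already in script order it would need sorting first.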

lapis sequoia Apr 29, 2021, 4:38 PM

#

I understand it's not simple, but does my code at least run properly? Can you check the txt files that it generates and whether they make any sense?