young granite Feb 12, 2023, 10:21 PM

#

but drawing is math

wheat snow Feb 12, 2023, 10:21 PM

#

da functions

#

hmm, ok i wanna ask for sum help now ^^

young granite Feb 12, 2023, 10:22 PM

#

shes lazy

#

she didnt want to calc. 10.000 sine functions for me

#

im very displeased nera

#

nera as a demihuman ai model would u feel insulted by sexual comments

#

its just a thought experiment

#

cause u identify as a half human half ai

wheat snow Feb 12, 2023, 10:30 PM

#

i got the following code: (analysis of personal netflix data)
```py
# keep only sessions longer than one minute; .copy() avoids pandas' SettingWithCopyWarning
df_vd_ac = df_vd[df_vd['Duration'] > pd.Timedelta(minutes=1)].copy()
df_vd_ac['Hour'] = df_vd_ac['Start Time'].dt.hour
df_vd_ac['Hour'] = pd.Categorical(df_vd_ac['Hour'], categories=list(range(24)), ordered=True)

# watch time per day, converted to hours, then summed per weekday (0 = Monday)
Week = df_vd_ac.groupby(df_vd_ac["Start Time"].dt.date)['Duration'].sum()
Week.index = pd.to_datetime(Week.index)
Week = Week.dt.total_seconds() / 60 / 60
Week = Week.groupby([Week.index.weekday]).sum()
print(Week.dtype)

# number of sessions started in each hour of the day
user_by_hour = df_vd_ac['Hour'].value_counts()
user_by_hour = user_by_hour.sort_index()
print(user_by_hour)

fig_ep_started, (ax1, ax2) = plt.subplots(nrows=2, ncols=1, figsize=(17, 12))
plt.tight_layout(pad=5.0)

Week.plot(kind='bar', ax=ax1, title="Total Watchtime of " + cb_person.get() + " by Weekday")
# German title: "When does <person> start Netflix sessions the most? (by hour)"
user_by_hour.plot(kind='bar', ax=ax2, title="Wann startet " + cb_person.get() + " Netflix Sessions am meisten? (by hour)")

plt.show()
```
which results perfectly in the attached ss

But now i want to embed this into my tkinter GUI (which currently looks like the second picture). it already works for all other charts; to embed them i created this function
```py
def embeded_plot(figure):
    canvas = FigureCanvasTkAgg(figure, master=master)
    # render the figure onto the Tk canvas
    canvas.draw()
    # place() alone positions the widget; a pack() call before it gets overridden by place() anyway
    canvas.get_tk_widget().place(x=250, y=350, width=1000, height=400)
```

wheat snow Feb 12, 2023, 10:30 PM

#

young granite cause u identify as a half human half ai

marriage w chatgpt when

young granite Feb 12, 2023, 10:31 PM

#

wheat snow marriage w chatgpt when

sign me in

#

text dirty to me AI

wheat snow Feb 12, 2023, 10:31 PM

#

young granite text dirty to me AI

excuse me I AM A HUMAN ||AI|| BEEING

young granite Feb 12, 2023, 10:31 PM

#

😂
did u assume my identity

#

how nasty

wheat snow Feb 12, 2023, 10:32 PM

#

anyways

#

i need to copypasta this whole shit into a help forum if no one answers

young granite Feb 12, 2023, 10:32 PM

#

ask chatgpt 🗿

wheat snow Feb 12, 2023, 10:32 PM

#

SO, the error lies somewhere in the implementation of the graph

wheat snow Feb 12, 2023, 10:33 PM

#

young granite ask chatgpt 🗿

bro i chatted with him over the whole function the last 3 hours

young granite Feb 12, 2023, 10:33 PM

#

hahahahaha

#

hes dumb

#

chat with nera

wheat snow Feb 12, 2023, 10:33 PM

#

young granite chat with nera

@deep spire

wheat snow Feb 12, 2023, 10:33 PM

#

young granite hes dumb

smh, ye he always wants to solve it complicated asf

young granite Feb 12, 2023, 10:34 PM

#

wheat snow smh, ye he always wants to solve it complicated asf

hes not that smart ye

wheat snow Feb 12, 2023, 10:34 PM

#

like for some shit he wanted to check whether my array is out of bounds with

if len(self.ax_pos) > 0:
    s_edge = self.ax_pos[0] - 0.25 + self.lim_offset
else:
    # handle the case where self.ax_pos is empty
    pass

instead of telling me where the fking error is

young granite Feb 12, 2023, 10:34 PM

#

hehehehe

#

classic

#

u need to ask smarter then 🗿

wheat snow Feb 12, 2023, 10:35 PM

#

fun fact, i didnt even idle in my VSC that 8 hours were actually work time lmao

#

i dont know what im doing with my sunday

young granite Feb 12, 2023, 10:35 PM

#

hahaha

wheat snow Feb 12, 2023, 10:35 PM

#

young granite u need to ask smarter then 🗿

bro idfk how

young granite Feb 12, 2023, 10:35 PM

#

just german things

wheat snow Feb 12, 2023, 10:35 PM

#

young granite just german things

brainmon

young granite Feb 12, 2023, 10:35 PM

#

can relate

wheat snow Feb 12, 2023, 10:36 PM

#

imagine asking chatgpt to explain code in german

young granite Feb 12, 2023, 10:36 PM

#

bruhhhh

#

do it once and ull never do it again

wheat snow Feb 12, 2023, 10:37 PM

#

LMAOOO

young granite Feb 12, 2023, 10:37 PM

#

hahahahahaa

#

who doesnt know the almighty FÜR schleife

#

@wheat snow but tbh its the most german thing to use a piechart to compare watchtimes and probably adjust payments on that 🗿

wheat snow Feb 12, 2023, 10:41 PM

#

young granite @wheat snow but tbh its the most german thing to use a piechart to com...

fuck off

young granite Feb 12, 2023, 10:41 PM

#

hahahaha

wheat snow Feb 12, 2023, 10:41 PM

#

my code is too long

#

i created way too much useless shit

#

idk what to do with it

young granite Feb 12, 2023, 10:42 PM

#

hahahah

#

delete

#

down for project

wheat snow Feb 12, 2023, 10:42 PM

#

not like that i watched 2400h of netflix over all, and that data is last updated in july 2021

wheat snow Feb 12, 2023, 10:42 PM

#

young granite delete

hell nah ive been on that for 2 years with long pauses

young granite Feb 12, 2023, 10:42 PM

#

rlly?

wheat snow Feb 12, 2023, 10:42 PM

#

look at dis

young granite Feb 12, 2023, 10:43 PM

#

thats inefficient af bro

#

first project?

wheat snow Feb 12, 2023, 10:43 PM

#

first big one yeah

young granite Feb 12, 2023, 10:43 PM

#

then its fine

wheat snow Feb 12, 2023, 10:43 PM

#

dis my year watchtime development

young granite Feb 12, 2023, 10:44 PM

#

ugly matplot replace it with plotly

wheat snow Feb 12, 2023, 10:44 PM

#

young granite then its fine

and im thinking about enriching it with IMDB data so i can align for each title a numeric alpha value so i can create some charts that give u information about genre and stuff

young granite Feb 12, 2023, 10:45 PM

#

wheat snow and im thinking about enriching it with IMDB data so i can align for each title a...

i do need to do that for my definitely_not_porn_folder

wheat snow Feb 12, 2023, 10:45 PM

#

young granite i do need to do that for my definitely_not_porn_folder

ah yes

young granite Feb 12, 2023, 10:45 PM

#

if u can code python u can just swap libs lel

#

brainmon

#

always only 30sec

#

watchtime lemon_angrysad

wheat snow Feb 12, 2023, 10:46 PM

#

young granite if u can code python u can just swap libs lel

bro no, i aint understanding shit of matplotlib and OOP with matplotlib so far

wheat snow Feb 12, 2023, 10:46 PM

#

young granite always only 30sec

das week

#

okay if u have time to talk w me u can also help me

young granite Feb 12, 2023, 10:47 PM

#

if im able to sure

wheat snow Feb 12, 2023, 10:47 PM

#

okidoki

young granite Feb 12, 2023, 10:50 PM

#

wheat snow okidoki

open help channel so we stop spamming here and tag me

wheat snow Feb 12, 2023, 10:50 PM

#

young granite open help channel so we stop spamming here and tag me

ouff okay ima ping u there

cerulean ginkgo Feb 12, 2023, 11:09 PM

#

Hey guys, I'm stuck with an error here. When I run the entire project it raises this: NameError: name 'tensorflow' is not defined

#

Here's the problematic code

#

import tensorflow as tf
from tensorflow import keras

from base.base_model import BaseModel

from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout, BatchNormalization, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import TensorBoard, ModelCheckpoint
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.imagenet_utils import preprocess_input, decode_predictions
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator


class ConvMnistModel(BaseModel):

    def __init__(self, config):
        super(ConvMnistModel, self).__init__(config)
        self.build_model()

    def build_model(self):
        width_shape = 224
        height_shape = 224
        image_input = Input(shape=(width_shape, height_shape,3))
        conv_base = VGG16(input_tensor=image_input,include_top=False,weights='imagenet')
        
        #First model
        for layer in conv_base.layers:
            layer.trainable = False

        self.model = Sequential()
        self.model.add(conv_base)
        self.model.add(tensorflow.keras.layers.Flatten())

        self.model.add(tensorflow.keras.layers.Dense(64,activation='relu'))
        self.model.add(tensorflow.keras.layers.Dense(1,activation='sigmoid')) 

            
        self.model.summary()
        

        self.model.compile(
              loss='binary_crossentropy',
              optimizer=self.config.model.optimizer,
              metrics=['acc'],
        )

agile cobalt Feb 12, 2023, 11:11 PM

#

cerulean ginkgo ```python import tensorflow as tf from tensorflow import keras from base.base_m...

you are importing it as tf
that aliases it and does not add the original name to the namespace

cerulean ginkgo Feb 12, 2023, 11:11 PM

#

I think it's a bad use of the libraries when I call the methods

agile cobalt Feb 12, 2023, 11:12 PM

#

either import it without aliasing, or use tf.thing instead of tensorflow.thing in your code

#

you might want to just from keras.layers import Flatten, Dense and use them without having to include the namespace though
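
A minimal sketch of the aliasing point above. It uses the standard-library `fractions` module as a stand-in for tensorflow (an assumption, chosen only so the snippet runs without extra installs): `import X as Y` binds only the name `Y`, so `X` itself stays undefined.

```python
import fractions as fr  # binds only the alias "fr"; the name "fractions" is not defined

# the alias works as expected
half = fr.Fraction(1, 2)

# the original module name was never bound, so using it raises NameError
try:
    fractions.Fraction(1, 2)
    error_text = ""
except NameError as err:
    error_text = str(err)

print(half)
print(error_text)
```

The same applies to `import tensorflow as tf`: after that line, `tf.keras.layers.Flatten()` resolves, while `tensorflow.keras.layers.Flatten()` does not.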

cerulean ginkgo Feb 12, 2023, 11:14 PM

#

I'll clean the code

#

valeu valeu 🇧🇷

#

thanks for the help

hasty mountain Feb 12, 2023, 11:35 PM

#

Made with Stable Diffusion? Or did you make your own generative model?

#

Gracious. Did you use GANs? What was the Generator and the Discriminator architectures?

#

And how many epochs?

#

Oh, I see...
DCGAN is a good one to begin with, but...eh. For complex datasets it tends to be more meh

#

At least for mine it's having some trouble

#

I see

#

Well...that's the thing... I began studying the theory(and math) in neural networks exactly to learn how to make my GANs work

#

But the bad part is...GANs are quite...oblivious.
Even when you study the theory, you discover that, in practice, it's complicated.
There are so many alternative loss functions that were developed for them, but there's also a Google paper that says they might not even be able to provide better performance, depending on the task and on the dataset.

#

They're...crazy.

#

Oh, I'm trying to make an unconditioned GAN first, then stick to conditional

#

Though I've read that conditional GANs tend to be more stable

#

Meh... Ready-made AI is not fun grumpchib

#

I could simply use a Stable Diffusion with pretrained weights and voilá, but...meh

#

Well... I'm trying to make one that can generate anime fanarts

#

But I'm afraid the results aren't exactly what I desire for now

#

But...perhaps this might be because I'm not using enough epochs...or maybe not enough parameters.
When I tested SRGAN, I remember it required thousands of epochs to provide a good output. And I'm just sticking to...50 epochs

#

I also can't make my Generator too robust, because it has a ResNet architecture, and the Discriminator follows a VGG-Like architecture...
And ResNet is surprisingly effective

#

I had to use a Text2Speech dataset for my Transformer implementation because my pc couldn't handle a proper Machine Translation dataset, since those are too heavy yert

#

Nah, I was just trying to produce better outputs.

#

I know that residual blocks are good for that, but DCGAN is incompatible with skip connections, soooo...resnet

#

I'm also using a growing GAN, and it seems that resnet tends to be more stable when passing from one level to another(from 16x16 images to 32x32, for example)

hasty mountain Feb 13, 2023, 12:33 AM

#

I just get a bit annoyed by the fact that I can't make a GAN work in an unsupervised configuration.
I mean...there are the real images and the fake images, and they will both have a different entropy (a difference that tends to be minimized by the generator), so something like this should demonstrate some results. But my tests don't provide anything...

#

But then...I might be doing things wrong. I'll insist a bit more.

#

Yeah, that's what I see the most when it comes to GANs

#

For most AIs, but especially for generative AIs

cerulean ginkgo Feb 13, 2023, 12:52 AM

#

cerulean ginkgo Hey guys I'm stuck with an error here when I run the entire project raise me thi...

Hi guys (again), I got another error in the code, but now it's in another part of the project

novel python Feb 13, 2023, 12:52 AM

#

anyone used to plotly here? getting some errors can't figure out by myself

cerulean ginkgo Feb 13, 2023, 12:53 AM

#

cerulean ginkgo Hi guys (again) I got another error in code, but now is in another part of the p...

Now I got this error and can't figure out what's going on there: <class 'trainers.simple_mnist_trainer.SimpleMnistModelTrainer'>
Start training the model.
'Sequential' object has no attribute 'trainer'

#

this is the code of the SimpleMnistModelTrainer

#

import tensorflow as tf
from tensorflow import keras

from base.base_trainer import BaseTrain
import os
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard


class SimpleMnistModelTrainer(BaseTrain):
    def __init__(self, model, train_datagen,val_datagen, config):
        super(SimpleMnistModelTrainer, self).__init__(model, train_datagen,val_datagen, config)
        self.model = model
        self.train_datagen = train_datagen
        self.val_datagen = val_datagen
        self.loss = []
        self.acc = []
        self.val_loss = []
        self.val_acc = []
        
      
        

    def train(self):
        history = self.model.fit_generator(
            self.train_datagen,
            validation_data = self.val_datagen,
            epochs=self.config.trainer.num_epochs,
            verbose=self.config.trainer.verbose_training,
            batch_size=self.config.trainer.batch_size,
            callbacks=self.model.callbacks
        )
        
        self.loss.extend(history.history['loss'])
        self.acc.extend(history.history['acc'])
        self.val_loss.extend(history.history['val_loss'])
        self.val_acc.extend(history.history['val_acc'])
        
        return history

#

Also, I modified the code of the model definition that @agile cobalt helped me fix. I made some other changes and now I've got this

arctic wedgeBOT Feb 13, 2023, 12:55 AM

#

Hey @cerulean ginkgo!

It looks like you tried to attach a Python file - please use a code-pasting service such as https://paste.pythondiscord.com

cerulean ginkgo Feb 13, 2023, 12:56 AM

#

class ConvMnistModel(BaseModel):
    def __init__(self, config):
        super(ConvMnistModel, self).__init__(config)
        self.build_model()

    def build_model(self):
        width_shape = 224
        height_shape = 224
        image_input = Input(shape=(width_shape, height_shape,3))
        conv_base = VGG16(input_tensor=image_input,include_top=False,weights='imagenet')
        
        #First model
        for layer in conv_base.layers:
            layer.trainable = False

        self.model = Sequential()
        self.model.add(conv_base)
        self.model.add(Flatten())

        self.model.add(Dense(64,activation='relu'))
        self.model.add(Dense(1,activation='sigmoid')) 

        
        self.model.callbacks = []
        self.model.callbacks.append(
            ModelCheckpoint(
                filepath=os.path.join(self.config.callbacks.checkpoint_dir, '%s-{epoch:02d}-{val_loss:.2f}.hdf5' % self.config.exp.name),
                monitor=self.config.callbacks.checkpoint_monitor,
                mode=self.config.callbacks.checkpoint_mode,
                save_best_only=self.config.callbacks.checkpoint_save_best_only,
                save_weights_only=self.config.callbacks.checkpoint_save_weights_only,
                verbose=self.config.callbacks.checkpoint_verbose,
            )
        )

        self.model.callbacks.append(
            TensorBoard(
                log_dir=self.config.callbacks.tensorboard_log_dir,
                write_graph=self.config.callbacks.tensorboard_write_graph,
            )
        )
            
        self.model.summary()
        

        

        self.model.compile(
              loss='binary_crossentropy',
              optimizer=self.config.model.optimizer,
              metrics=['acc'],
        )

agile cobalt Feb 13, 2023, 12:57 AM

#

what is your 'config'?

frigid flint Feb 13, 2023, 1:00 AM

#

Does anyone know how to fix this issue?

agile cobalt Feb 13, 2023, 1:01 AM

#

cerulean ginkgo Also I modified the code of the model definition that @agile cobalt help...

one way or the other, I'm not really familiar with the tensorflow / keras API - first double check if you are passing the correct arguments in the correct order

cerulean ginkgo Feb 13, 2023, 1:04 AM

#

agile cobalt what is your 'config'?

here is where I instantiate config, I'll send the code

#


from utils.config import process_config
from utils.dirs import create_dirs
from utils.args import get_args
from utils import factory
import sys

def main():
    # capture the config path from the run arguments
    # then process the json configuration file
    try:
        args = get_args()
        config = process_config(args.config)

        # create the experiments dirs
        create_dirs([config.callbacks.tensorboard_log_dir, config.callbacks.checkpoint_dir])

        print('Create the data generator.')
        data_loader = factory.create("data_loader."+config.data_loader.name)(config)

        print('Create the model.')
        model = factory.create("models."+config.model.name)(config)

        print('Create the trainer')
        trainer = factory.create("trainers."+config.trainer.name)(config,model.model, data_loader.get_train_data(),data_loader.get_val_data())

        print('Start training the model.')
        trainer.train()

    except Exception as e:
        print(e)
        sys.exit(1)

if __name__ == '__main__':
    main()
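
One thing that stands out when the trainer's `__init__(self, model, train_datagen, val_datagen, config)` signature is put next to the call `(config, model.model, data_loader.get_train_data(), data_loader.get_val_data())` in `main()`: the positional arguments are in a different order. Whether that is the actual cause depends on `BaseTrain` and `factory`, which aren't shown here. The sketch below uses hypothetical stand-ins (`FakeModel`, `Trainer`, a `SimpleNamespace` config) just to show how such a mismatch can produce a "has no attribute 'trainer'" error, and how keyword arguments sidestep it.

```python
from types import SimpleNamespace


class FakeModel:
    """Hypothetical stand-in for a Keras Sequential model."""


class Trainer:
    """Hypothetical stand-in with the same parameter order as the posted trainer."""

    def __init__(self, model, train_datagen, val_datagen, config):
        self.model = model
        self.train_datagen = train_datagen
        self.val_datagen = val_datagen
        self.config = config

    def num_epochs(self):
        return self.config.trainer.num_epochs


config = SimpleNamespace(trainer=SimpleNamespace(num_epochs=5))
model, train_data, val_data = FakeModel(), ["train"], ["val"]

# positional call in the order used in main(): config, model, train data, val data.
# Every value lands in the wrong slot; self.config ends up holding val_data.
wrong = Trainer(config, model, train_data, val_data)
try:
    wrong.num_epochs()
    error_text = ""
except AttributeError as err:
    error_text = str(err)

# keyword arguments make the call independent of parameter order
right = Trainer(model=model, train_datagen=train_data,
                val_datagen=val_data, config=config)
epochs = right.num_epochs()

print(error_text)
print(epochs)
```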

cerulean ginkgo Feb 13, 2023, 1:06 AM

#

agile cobalt one way or the other, I'm not really familiar with the tensorflow / keras API - ...

yeah me too, I'm a newbie with these libraries, but I think the passing is correct pithink

manic jolt Feb 13, 2023, 6:00 AM

#

Is there a way to train your ai locally?

#

Sry if this is a very stupid question im very new to ai

wooden sail Feb 13, 2023, 6:05 AM

#

yes, exactly the same way as you'd do it remotely 😛

#

but you can only realistically train very small and simple models on your own computer, especially if it's a laptop

lapis sequoia Feb 13, 2023, 6:30 AM

#

How to insert line by line in a python file.
I am trying with writelines but it is printing on a single line.
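
A small sketch of why that happens: `writelines()` writes the strings back to back and does not add line breaks, so each item needs its own `"\n"`. The file name and the sample strings are made up for the example.

```python
import os
import tempfile

lines = ["first", "second", "third"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")  # hypothetical file name

    # writelines() concatenates the items; no newline is added between them
    with open(path, "w") as f:
        f.writelines(lines)
    with open(path) as f:
        joined = f.read()

    # appending "\n" to every item puts each string on its own line
    with open(path, "w") as f:
        f.writelines(line + "\n" for line in lines)
    with open(path) as f:
        separated = f.read()

print(repr(joined))
print(repr(separated))
```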

manic jolt Feb 13, 2023, 6:37 AM

#

wooden sail but you can only realistically train very small and simple models on your own co...

Can i do it with cpus too? I have a server with two cpus

flat cobalt Feb 13, 2023, 6:38 AM

#

Hey thanks for the reply. I did look into NER but I couldn't figure out how to actually split the blog to the four portions I had mentioned. It would be great if you could help me out Thanks

austere swift Feb 13, 2023, 6:38 AM

#

manic jolt Can i do it with cpus too? I have a server with two cpus

yes but it will be a lot slower than gpus

manic jolt Feb 13, 2023, 6:39 AM

#

Oh ok

wooden sail Feb 13, 2023, 6:39 AM

#

manic jolt Can i do it with cpus too? I have a server with two cpus

yes

austere swift Feb 13, 2023, 6:39 AM

#

for basic networks its probably fine, but for anything decently large you wouldn't want to

manic jolt Feb 13, 2023, 6:39 AM

#

wooden sail yes, exactly the same way as you'd do it remotely 😛

So is there a tutorial to do so?

austere swift Feb 13, 2023, 6:41 AM

#

manic jolt So is there a tutorial to do so?

depending on what framework you're using, you can check their documentation for installation tutorials

manic jolt Feb 13, 2023, 6:41 AM

#

Ok thanks

austere swift Feb 13, 2023, 6:43 AM

#

if you're using gpus then check nvidias cuda installation page

last halo Feb 13, 2023, 9:19 AM

#

I want to share an idea for jupyter python syntax, hope this is the right channel!
Consider the following code as a cell in a jupyter notebook.
What do you think about inline cell-loops that runs cells again for varying parameters?
That would make it easy to quickly create and remove for-loops.
In the below example, the code cell would run 4 times (for x=10;x=11;x=12;x=13)

z = 10
x = 10 # @modify x = [11,12,13]
y = x+z
print(y)

# out
20,21,22,23

wooden sail Feb 13, 2023, 9:22 AM

#

this sounds like a very cumbersome way of recreating functions

boreal gale Feb 13, 2023, 9:23 AM

#

What do you think about inline cell-loops that runs cells again for varying parameters?
are you perhaps looking for papermill? (https://github.com/nteract/papermill)

otherwise a for loop will suffice?
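
For comparison, a sketch of the function-plus-loop version of the example cell. The name `run_cell` is made up for illustration.

```python
def run_cell(x, z=10):
    """Body of the example cell, wrapped in a function."""
    y = x + z
    return y


# the default x = 10 plus the values listed in the "@modify" comment
results = [run_cell(x) for x in [10, 11, 12, 13]]
print(results)
```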

last halo Feb 13, 2023, 9:45 AM

#

@wooden sail :
this idea is not for replacing for loops, it's more for quick debugging and fine-tuning parameters.
Normally, one would have to do: write the for loop statement, indent the code, run the code, delete the loop statement, unindent the code.
This workflow would be replaced by only one line containing the @modify statement.
@boreal gale :
Thanks a lot for pointing me to paper mill, that will be useful to me! 🙂

manic jolt Feb 13, 2023, 11:45 AM

#

How do you save the state of your ai after you've trained it? Obviously you don't train your ai each time you use it

tidal bough Feb 13, 2023, 11:49 AM

#

you dump the weights into a file, very generally speaking. all the serious ML libraries support fancy ways of doing that (and e.g. even automatically storing snapshots of weights every few epochs of the training process)
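
A bare-bones sketch of "dump the weights into a file", using only the standard library. The weight values and the file name are placeholders; a real model would hold arrays or tensors, and you would normally use the library's own mechanism instead (for example `model.save_weights(...)` / `model.load_weights(...)` in Keras, or `torch.save(model.state_dict(), ...)` in PyTorch).

```python
import os
import pickle
import tempfile

# hypothetical "trained" parameters
weights = {"layer1": [[0.1, -0.2], [0.4, 0.3]], "bias1": [0.0, 0.5]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.pkl")  # hypothetical file name

    # after training: write the weights to disk
    with open(path, "wb") as f:
        pickle.dump(weights, f)

    # later, instead of retraining: read them back
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == weights)
```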

violet pier Feb 13, 2023, 12:26 PM

#

Hi! I made a recommendation system with a Flask app, hopefully it's useful 👍
https://www.kaggle.com/code/wojteksy/santander-hybrid-recommendation-system

Also I made a custom transformer:
https://www.kaggle.com/code/wojteksy/housing-prices-pipelines-custom-transformer

And EDA and model comparison for beginners:
https://www.kaggle.com/code/wojteksy/iris-eda-model-comparison

late shell Feb 13, 2023, 12:50 PM

#

hello, I wanna build a model that generates a sentence using only a given set of words. basically have to generate ordered tokens with a set of unordered tokens. Im really a noob at nlp, does anyone have anything that you can point me to that might help? Thanks

serene scaffold Feb 13, 2023, 1:12 PM

#

late shell hello, I wanna build a model that generates a sentence using only a given set of...

you need a way of knowing if it "makes sense" to put one token after another, and you can't do that if all you have to start with is the set of tokens. do you have example text (a corpus) to use as training data?

late shell Feb 13, 2023, 1:13 PM

#

yes i have collected 25 children stories from the web (coz in the end, my goal is to generate stories but Im starting with sentences rn)

serene scaffold Feb 13, 2023, 1:13 PM

#

late shell yes i have collected 25 children stories from the web (coz in the end, my goal i...

a simple way to do it would be to count ngrams and do markov chains

#

but if you want to generate stories, you would have to use a really sophisticated model, like GPT-3.
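
The n-gram counting and Markov chain idea can be sketched in a few lines. This is a toy bigram version on a made-up corpus, not a recipe for story generation; the function names are invented for the example.

```python
import random
from collections import defaultdict


def build_bigram_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        # repeated followers stay in the list, so frequent pairs are picked more often
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=0):
    """Walk the chain from `start`, choosing each next word at random."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        followers = chain.get(out[-1])
        if not followers:  # dead end: this word never had a successor
            break
        out.append(rng.choice(followers))
    return out


corpus = "the cat sat on the mat and the dog sat on the rug"  # toy corpus
chain = build_bigram_chain(corpus)
sentence = generate(chain, start="the", length=8)
print(" ".join(sentence))
```

Every adjacent word pair in the output was seen in the corpus, but nothing ties the start of a sentence to its end, which is why this approach runs out of steam for whole stories.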

late shell Feb 13, 2023, 1:14 PM

#

yeah i read about it. But I dont think markov chains would be able to capture the connections within a story.

#

yeah

serene scaffold Feb 13, 2023, 1:14 PM

#

it wouldn't. you won't be able to train a model that can on your computer.

late shell Feb 13, 2023, 1:16 PM

#

ok, but can i train gpt-3 for such a different task?

serene scaffold Feb 13, 2023, 1:16 PM

#

you won't be able to train GPT-3 on your computer, either.

#

what you're trying to do is actually exceptionally challenging. you might pick a more attainable project.

#

and generating text with markov chains is actually fun

late shell Feb 13, 2023, 1:16 PM

#

actually this is the project Im proposing in my SOP for Masters.

serene scaffold Feb 13, 2023, 1:17 PM

#

oh

late shell Feb 13, 2023, 1:17 PM

#

serene scaffold and generating text with markov chains is actually fun

yup i tried that out, was pretty cool.

serene scaffold Feb 13, 2023, 1:17 PM

#

well, I guess you should pick something challenging, then.

late shell Feb 13, 2023, 1:17 PM

#

lol

serene scaffold Feb 13, 2023, 1:17 PM

#

does your university have a high-performance computer?

late shell Feb 13, 2023, 1:18 PM

#

it does, its only accessible to PhD scholars though but I can try to write an application or something, but before that I need to convince them that this task is possible in the first place and how im gonna do it. I dont have to build it right now. Just have to explain them HOW Im gonna do it.

#

or maybe make a prototype or something with a very small training data set, if possible.

serene scaffold Feb 13, 2023, 1:21 PM

#

late shell or maybe make a prototype or something with a very small training data set, if p...

GPT-2 is the most advanced GPT model that you can actually get ahold of. I guess I would start by figuring out how to generate text with it.

#

(you can interact with GPT-3 and ChatGPT, but the actual model isn't available to anyone else.)

late shell Feb 13, 2023, 1:22 PM

#

oh okay. thanks a lot mate. Will surely look into GPT-2.

woeful falcon Feb 13, 2023, 1:26 PM

#

Suppose I want to train an OCR, 26 small case + 26 Capital letters = 52 categories. The dataset contains 30x30 jpg Images. Which model should I choose, how many layers and neurons, and what activation functions, etc...

hasty mountain Feb 13, 2023, 3:30 PM

#

serene scaffold (you can interact with GPT-3 and ChatGPT, but the actual model isn't available t...

GPT-3's paper isn't available to the public, right?

#

I'd like to know what it has different from GPT-2. The idea of unsupervised learning that GPT-2 brought was quite interesting...
And I'm fascinated by unsupervised learning neural networks

serene scaffold Feb 13, 2023, 3:32 PM

#

hasty mountain GPT-3's paper isn't available to the public, right?

this appears to be the GPT-3 paper https://arxiv.org/pdf/2005.14165.pdf
but it's the model itself that isn't public.

#

"the model" as in the actual trained weights, not a description of the architecture.

hasty mountain Feb 13, 2023, 3:33 PM

#

Well, the description is enough to me hyperlemon

#

But then...what is it with this "few-shot" thing? I don't get the difference between a normal model evaluation and a "few-shot"/"first-shot" evaluation

#

I think YOLO has this idea... "You Only Look Once"...

#

Oh, ok...

https://en.wikipedia.org/wiki/One-shot_learning

One-shot learning

One-shot learning is an object categorization problem, found mostly in computer vision. Whereas most machine learning-based object categorization algorithms require training on hundreds or thousands of examples, one-shot learning aims to classify objects from one, or only a few, examples. The term few-shot learning is also used for these problem...

#

Exquisite...

#

It seems that I have one more thing to get fascinated with, then

#

Of course I'll try to apply this to GANs because why not

main kestrel Feb 13, 2023, 3:42 PM

#

Would you say data science gives broad employment prospects? I'm starting to take a course about machine learning algorithms in both python (which I'm familiar with) and R. I was hoping to use it for some sort of fun projects later on using a raspberry pi. My main goal is to do some serious stuff, maybe including an engineering diploma project.

boreal gale Feb 13, 2023, 3:42 PM

#

woeful falcon Suppose I want to train an OCR, 26 small case + 26 Capital letters = 52 categori...

the architecture you want is most likely going to be some form of CNN (convolutional neural network), because a CNN mimics some of the mechanisms found in biological eyes, e.g. in the way it first recognises lines/edges before this knowledge is used to predict the final output category.

other than that, the number of layers, activation functions, initialisations, etc. are just up to you, kinda. you can experiment with these, but just remember to use a CNN, without that it's unlikely you will get a sensible result (unless you are doing research into new architectures, then by all means try everything you can think of.)
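
To make the "recognises lines/edges" part concrete, here is a plain-Python sketch of the sliding-window operation a convolutional layer performs. The kernel is hand-written for illustration (in a real CNN the kernel values are learned), and the image is a made-up 4x6 grid that is dark on the left and bright on the right.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


# dark left half (0), bright right half (1): one vertical edge in the middle
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# hand-written vertical-edge kernel: responds where brightness rises from left to right
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

feature_map = conv2d_valid(image, vertical_edge)
for row in feature_map:
    print(row)
```

The response is non-zero only in the columns that straddle the edge and zero over the flat regions.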

boreal gale Feb 13, 2023, 4:03 PM

#

main kestrel Would you say data science gives broad employment prospects? Starting to take co...

broad is a bit of a relative term.. but sure, i think it's fair to say data science gives broad employment prospects. there are many components in the life cycle of any data science project, the more experience you get in various part of the lifecycle, the more "hire-able" you are in other domains of software engineering.

e.g. you did some web scraping to source data to train your ML model? that probably ticks some boxes of a backend engineer skillset.
e.g. you maintained the deployment of your ML model? that probably ticks some boxes of a devops engineer skillset.

likewise, if you can only write and train half-optimised model but it produces good result in terms of metrics, then you will only be hired as someone who continually write this sort of stuff without much possibility to branch out.

lapis sequoia Feb 13, 2023, 4:38 PM

#

How do I shift the City column up

boreal gale Feb 13, 2023, 4:41 PM

#

lapis sequoia How do I shift the City column up

it would be easier to answer that if you provide more context.
but i am guessing df.reset_index()

the lowered text of City implies it's a named index in pandas dataframe, by resetting index, you make the index into one of the columns, hence "shifting it up"

lapis sequoia Feb 13, 2023, 4:44 PM

#

Thanks!

austere prawn Feb 13, 2023, 5:13 PM

#

Do you remember which version of the documentation this was? I was thinking about experimenting to avoid generating uuid html id attributes altogether in the pandas html output if possible.

austere prawn Feb 13, 2023, 5:15 PM

#

austere prawn Do you remember which version of the documentation this was? I was thinking abou...

Oh I found it in 1.5.3 now (the newest), it was moved to a subpage in the docs.

austere prawn Feb 13, 2023, 5:47 PM

#

https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.render.html mentions self.template.render, does anyone know what that is?

tacit nacelle Feb 13, 2023, 5:48 PM

#

hey, I'm working on traffic flow prediction using a Kalman filter. these are the state prediction model and the flow observation equations: x(k+1) = M(k)x(k) + w(k); z(k) = Hx(k) + v(k). most documents I've read either just mention the names M(k), w(k), H, v(k) with a little explanation, or give a complex method for identifying them. if someone is familiar with the Kalman filter, please suggest some values I can start with, then I'll change them till I get more accurate values

#

for the Kalman code implementation in python I've found an example online

#

import numpy as np
from collections import namedtuple

State = namedtuple('State', 'X, P')


def predict(state, F, Q):
    """Perform the predict step

    x_pred = Fx
    P_pred = F P F^T + Q

    :param state: State namedtuple
    :param F: Transition matrix
    :param Q: Process Covariance
    :return: The prior as a State namedtuple
    """

    

    x_pred = np.matmul(F, state.X)
    p_pred = np.matmul(F, np.matmul(state.P, F.T)) + Q
    return State(x_pred, p_pred)


def update(prior, z, R, H):
    """Perform update step

    S = H P_prior H^T + R
    K = P_prior H^T S^-1
    y = z - H x_prior
    x = x_prior + Ky
    P = (I - KH) P_prior

    :param prior: State namedtuple holding the prior mean and covariance
    :param z: measurement vector
    :param R: measurement covariance matrix
    :param H: measurement matrix
    :return: Returns the posterior mean and covariance as State namedtuple
    """

    

    z_pred = np.matmul(H, prior.X)
    y = z - z_pred
    S = np.matmul(H, np.matmul(prior.P, H.T)) + R
    K = np.matmul(prior.P, np.matmul(H.T, np.linalg.inv(S)))
    x_posterior = prior.X + np.matmul(K, y)
    p_posterior = np.matmul((np.identity(prior.P.shape[0]) - np.matmul(K, H)), prior.P)
    return State(x_posterior, p_posterior)
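
A minimal sketch of how the two functions above might be driven for a scalar flow series, assuming a random-walk state model (so M = F = 1 and H = 1). The values of Q and R, the initial state and the measurements are illustrative assumptions, not tuned for real traffic data. A common starting point is to set R near the variance of the sensor noise and Q smaller than R, then adjust. The functions are repeated in compact form so the block runs on its own.

```py
import numpy as np
from collections import namedtuple

State = namedtuple('State', 'X, P')


def predict(state, F, Q):
    # Prior mean and covariance: x_pred = F x, P_pred = F P F^T + Q
    return State(F @ state.X, F @ state.P @ F.T + Q)


def update(prior, z, R, H):
    y = z - H @ prior.X                     # innovation
    S = H @ prior.P @ H.T + R               # innovation covariance
    K = prior.P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x = prior.X + K @ y
    P = (np.identity(prior.P.shape[0]) - K @ H) @ prior.P
    return State(x, P)


# Assumed starting values for a 1-D random-walk model (hypothetical numbers)
F = np.array([[1.0]])    # state transition, M(k) in the question
H = np.array([[1.0]])    # the flow is observed directly
Q = np.array([[1.0]])    # process noise covariance, covariance of w(k)
R = np.array([[25.0]])   # measurement noise covariance, covariance of v(k)

flows = [100.0, 104.0, 98.0, 110.0, 107.0]   # made-up measurements
state = State(np.array([[flows[0]]]), np.array([[100.0]]))

estimates = []
for z in flows[1:]:
    prior = predict(state, F, Q)
    state = update(prior, np.array([[z]]), R, H)
    estimates.append(state.X[0, 0])

print(estimates)
```

Raising Q relative to R makes the filter follow the measurements more closely; lowering it smooths more.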

compact wraith Feb 13, 2023, 6:37 PM

#

guys i need help my chatbot is offline

import discord
import asyncio
import torch
from transformers import pipeline

client = discord.Client(intents=discord.Intents.default())
TOKEN = 'somerandomnumbersandletters'

generator = pipeline('conversational', model='EleutherAI/gpt-neo-2.7B')
prompt = 'This is an ai chatbot based on gpt neo'

@client.event
async def on_message(message):
    res = generator(prompt, max_length=40, do_sample=True, temperature=0.9)
    if message.author == client.user:
        return
    await asyncio.sleep(4)
    async with message.channel.typing():
        await asyncio.sleep(2)
    await message.channel.send(res[0]['generated_text'])

client.run(TOKEN)

bright heath Feb 13, 2023, 6:38 PM

#

how do I get just Self from "[""Self""]" using pandas? Can anyone please help me!
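
One possible approach, sketched under the assumption that the doubled quotes are just CSV escaping, so that after loading each cell holds the text `["Self"]`. The column name `relation` and the second row are hypothetical.

```py
import json

import pandas as pd

# Hypothetical column holding JSON-style one-element lists as strings
df = pd.DataFrame({"relation": ['["Self"]', '["Spouse"]']})

# Option 1: parse each cell as JSON and take the first element
df["relation_clean"] = df["relation"].apply(lambda s: json.loads(s)[0])

# Option 2: strip the bracket and quote characters from both ends
df["relation_stripped"] = df["relation"].str.strip('[]"')

print(df)
```

Parsing is the safer choice if some cells could contain more than one element; stripping is enough when every cell has exactly one.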

supple knoll Feb 13, 2023, 7:18 PM

#

Hi! I'm trying to plot this dataframe. Was wondering if someone would happen to know how I can make the X and Y axis show the correct values? For some reason it's showing the X axis as "sample number" instead of time

umbral charm Feb 13, 2023, 7:33 PM

#

supple knoll Hi! I'm trying to plot this dataframe. Was wondering if someone would happen to ...

plt.plot(x, y)

worn hollow Feb 13, 2023, 7:52 PM

#

i have a pandas dataframe of geographical data that is a slice from a bigger frame, with some modifications made:

relevant = bigger_df[['days', 'lat', 'lon']]
# the days columns are floats, make ints
relevant['days'] = relevant['days'].apply(math.floor)
but this keeps throwing a `SettingWithCopyWarning`. how do i stop that?

young granite Feb 13, 2023, 8:03 PM

#

worn hollow i have a pandas dataframe of geographical data that is a slice from a bigger fra...

pandas wants u to use .iloc/loc

tawny spire Feb 13, 2023, 9:24 PM

#

i'm not sure how to interpret my random forest cross validation results 😛 does anyone know what to look for?

iron quest Feb 13, 2023, 9:34 PM

#

you can run https://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html and see which features are the most important, right? (i am a data science noob so not sure if im correct here lol)

scikit-learn

Feature importances with a forest of trees

This example shows the use of a forest of trees to evaluate the importance of features on an artificial classification task. The blue bars are the feature importances of the forest, along with thei...

worn hollow Feb 13, 2023, 9:42 PM

#

young granite pandas wants u to use .iloc/loc

Same problem with .loc

boreal gale Feb 13, 2023, 9:50 PM

#

tawny spire i'm not sure how to interpret my random forest cross validation results 😛 does ...

do you mean the numbers you get in cell 73?
those are the mean accuracy of the test set in that particular fold in CV.
i would say that's pretty good. too good perhaps, if i were you, i would double check you didn't leak your target into your features.

charred light Feb 13, 2023, 9:52 PM

#

worn hollow Same problem with .loc

It's a warning, as long as the code is correct you can safely ignore it.
If you really want to turn it off, you can do pd.options.mode.chained_assignment = None # default='warn'

Sometimes it can be a false positive due to usage. See https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas and https://www.dataquest.io/blog/settingwithcopywarning/
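
A common way to avoid the warning rather than silence it, assuming `relevant` is meant to be an independent frame, is to take an explicit `.copy()` of the selection before modifying it. The data below is made up for illustration.

```py
import numpy as np
import pandas as pd

# Made-up stand-in for the bigger frame
bigger_df = pd.DataFrame({
    "days": [1.2, 3.9, 7.5],
    "lat": [52.1, 48.8, 40.7],
    "lon": [4.3, 2.3, -74.0],
    "other": ["a", "b", "c"],
})

# .copy() makes it explicit that `relevant` owns its data,
# so assigning into it is no longer ambiguous
relevant = bigger_df[["days", "lat", "lon"]].copy()
relevant["days"] = np.floor(relevant["days"]).astype(int)

print(relevant.dtypes)
```

The original frame is left untouched, which is usually what is wanted when working on a slice.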

#

If I were to get high 80s -> 90s % accuracy, I would check for target leakage (as mentioned above) and check if model is overfitting.

agile cobalt Feb 13, 2023, 10:19 PM

#

@serene scaffold have you seen https://arxiv.org/abs/2302.04907 ?
someone brought it up in another server ; I don't understand it 100% myself, but seems interesting - not to the point of reducing all transformer models by 1/16, but may still help to get a considerable reduction on some?

(pinged since I thought you may be interested ; let me know if I should ping or not for that kind of thing. ~~though there's also the chance you already saw it since that's literally what you work with...~~)

arXiv.org

Binarized Neural Machine Translation

The rapid scaling of language models is motivating research using
low-bitwidth quantization. In this work, we propose a novel binarization
technique for Transformers applied to machine translation...

serene scaffold Feb 13, 2023, 10:20 PM

#

agile cobalt @serene scaffold have you seen https://arxiv.org/abs/2302.04907 ? someone b...

you think I read papers?? LOL that's nerd shit

#

jk
when you say "reducing all transformer models", do you mean in terms of disk space?

agile cobalt Feb 13, 2023, 10:21 PM

#

the paper is about using booleans instead of float16

#

so yeah, disk space and memory usage

serene scaffold Feb 13, 2023, 10:22 PM

#

inb4 NVIDIA lobbies to have this paper deleted

#

anyway, feel free to ping me with papers if you think I'll like them 😄

tawny spire Feb 13, 2023, 10:24 PM

#

boreal gale do you mean the numbers you get in cell 73? those are the mean accuracy of the t...

that's kind, thank you 🙂 i'm certain i didn't, the dataset is really clean

iron basalt Feb 13, 2023, 10:28 PM

#

agile cobalt the paper is about using booleans instead of float16

This is a common technique in NNs. Going from f32 -> f16 -> b8 (and we went to single bit weights too).

austere swift Feb 13, 2023, 10:29 PM

#

iron basalt This is a common technique in NNs. Going from f32 -> f16 -> b8 (and we went to s...

Lowering precision has been around for a while but I’ve never seen anything go down to boolean precision

#

Lowest I’ve seen is int8

iron basalt Feb 13, 2023, 10:30 PM

#

When NNs first started it was often binary weights.

austere swift Feb 13, 2023, 10:30 PM

#

I’m just curious how they handle gradients (I’ll read the paper later tonight to figure that out ig)

hasty mountain Feb 13, 2023, 11:52 PM

#

Hey guys, I'm trying to implement PPO and I'm having some trouble with the way its surrogate loss is calculated.
The surrogate loss is given by:
surrogate_loss = ratio * advantage, where ratio = current_action_distribution/previous_action_distribution and advantage = (predicted reward - possible_rewards)
Then, the final loss is final_loss = surrogate_loss + (0.5 * value_loss), right?

Problem is...my model is having the habit of eventually predicting the same reward for all possible actions, so the advantage becomes 0. When this happens, the model will simply output always the same action, even after it performed some random sampling(it performs the random action, then goes back to the same action as before).
Any idea on how to solve this? Maybe I'm doing something wrong?

#

Even with the value_loss still trying to compensate this, the model simply won't change...at least for some time.

#

Oh... I guess the problem isn't that the advantage is necessarily 0...it's just a number that is so small that it's considered 0

#

I'm getting... vanishing advantages

#

Oh...the advantage is also an exponential moving average...so maybe this is what I'm getting wrong...

#

What is it with Reinforcement Learning and so many EMAs?
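
For reference, a NumPy sketch of the clipped surrogate objective as described in the PPO paper, with per-batch advantage normalization, a common practice that can help when raw advantages become very small. `clip_eps = 0.2` and the sample numbers are assumptions. In a real implementation the advantage would typically be the return minus a value estimate (or GAE), and the loss would be computed in an autodiff framework.

```py
import numpy as np


def ppo_clipped_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    # Normalize advantages so that their scale does not vanish
    adv = (advantages - advantages.mean()) / (advantages.std() + 1e-8)
    # Probability ratio between the current and the previous policy
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # The objective is maximized, so the loss is its negative
    return -np.mean(np.minimum(unclipped, clipped))


# Made-up log-probabilities of the taken actions and tiny raw advantages
logp_old = np.log(np.array([0.25, 0.50, 0.10]))
logp_new = np.log(np.array([0.30, 0.45, 0.20]))
advantages = np.array([1e-6, -2e-6, 3e-6])

loss = ppo_clipped_loss(logp_new, logp_old, advantages)
print(loss)
```

The value loss and an optional entropy bonus would be added on top of this term.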

patent lynx Feb 14, 2023, 2:21 AM

#

tawny spire that's kind, thank you 🙂 i'm certain i didn't, the dataset is really clean

Play around with these parameters: increase min_samples_split and min_samples_leaf, and reduce max_depth. That should help the model generalise a bit.

hoary wigeon Feb 14, 2023, 4:41 AM

#

Hi everyone!

#

Here's seasonal decompose of target variable that I must use for building a time series model.

Screenshot_2023-02-14_at_10.09.38_AM.png

#

Building a time series model on this data is possible?

#

Data from past 3 years on Day Level

outer fulcrum Feb 14, 2023, 10:43 AM

#

Hey !

#

I'm struggling to read this kind of multi index / subcolumns CSV with pandas. Do you have any ideas?

#

cinder schooner Feb 14, 2023, 10:52 AM

#

Greetings, so I'm new to GNNs and trying to work on a little side project where I use GNNs to build a Spotify playlist recommender. I don't have any errors to show, but while training the model and trying hyperparameter tuning I'm not getting results, only a very, very low recall@K, and I don't know how to go about debugging this. Would anyone have some advice or guidance for me?

flint jetty Feb 14, 2023, 11:04 AM

#

OpenCV. These are basically image processing applications

tidal sonnet Feb 14, 2023, 11:38 AM

#

The image is fake?

atomic tide Feb 14, 2023, 12:45 PM

#

outer fulcrum I'm struggling to read this kind of multi index / subcolumns CSV with pandas. Do...

What are you having trouble with specifically?

uneven mist Feb 14, 2023, 1:01 PM

#

Anyone has some recommendations on starting working with Neural Networks? Some good learning sources. Currently I am trying to understand the raw Neural Network idea in pure Python code and then trying to write a small flappy bird game with Pygame using a Neural Network. Then I'll move on working with Jupyter Notebook and TensorFlow.

hoary wigeon Feb 14, 2023, 3:07 PM

#

outer fulcrum I'm struggling to read this kind of multi index / subcolumns CSV with pandas. Do...

use this df.columns = df.columns.swaplevel(0, 1)
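
If the CSV has two header rows (a top-level label with sub-columns beneath it), `pd.read_csv` can build the column MultiIndex directly with `header=[0, 1]`. The attached file isn't visible here, so the layout below is an assumption.

```py
import io

import pandas as pd

# Hypothetical CSV with two header rows: group label, then sub-column name
csv_text = """sensor,sensor,weather,weather
temp,humidity,wind,rain
21.5,40,3.2,0.0
22.1,38,2.8,0.1
"""

df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])
print(df.columns.tolist())

# Selecting a top-level label returns all of its sub-columns
print(df["sensor"])

# Swapping the levels, as suggested above, puts the sub-column name first
swapped = df.copy()
swapped.columns = swapped.columns.swaplevel(0, 1)
print(swapped.columns.tolist())
```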

prime hearth Feb 14, 2023, 3:08 PM

#

hello I would like to ask if i trained my model( LDA) with bigrams, for testing data do i need to transform it into bigrams or it not needed? I know i need to clean testing data like remove stopwords or punctuation

wispy wolf Feb 14, 2023, 4:04 PM

#

Facial recognition

wise pelican Feb 14, 2023, 5:17 PM

#

Are there any alternatives to matplotlib that look better and can write out a plot to a video file with transparency? Not finding any other libraries out there with those kind of features, and I'm kind of tired of how matplotlib plots look
The issue is that I'm making animated scrolling graphs, where the y-axis is some metric and the x-axis is time, and I scroll through the x-axis
I was thinking my next best bet would be to use plotly or seaborn to render out individual PNG's, then stitch them back together with some external program like ffmpeg

foggy maple Feb 14, 2023, 5:25 PM

#

anyone interested in a team for a competition on Kaggle?

mild dirge Feb 14, 2023, 5:27 PM

#

wise pelican Are there any alternatives to `matplotlib` that look better and can write out a ...

You can change the style of a matplotlib plot

#

You are just using the default look

#

https://matplotlib.org/stable/gallery/style_sheets/style_sheets_reference.html

wise pelican Feb 14, 2023, 5:30 PM

#

It's not so much the style as it is the way stuff like lines on a line graph looks
Let me see if I can find a graph of mine that shows what I'm talking about

#

Here's an older graph video I made that also showcases the scrolling animation I was talking about
Things like the lines in the graph, the text labels on either side of the graph, and even the numerical axis ticks are all just very aliased and jagged. I remember using seaborn to generate a single image of this graph and being able to smooth out the lines (not the data for them, just make them look cleaner) but seaborn doesn't seem to let you export animations like this

mild dirge Feb 14, 2023, 5:40 PM

#

That is just the resolution probably

#

You can up the resolution

wise pelican Feb 14, 2023, 5:40 PM

#

Happens with 4k as well

mild dirge Feb 14, 2023, 5:40 PM

#

hmm

wise pelican Feb 14, 2023, 5:40 PM

#

I can go higher and higher with resolution but past 4k is already the point where it's not worth the time to write out that file

mild dirge Feb 14, 2023, 5:41 PM

#

There is an antialiased param apparently (in plt.plot())

#

have you tried that?

#

"but seaborn doesn't seem to let you export animations like this", you could always manually do it by generating the set of graphs, and combining them with ffmpeg f.e.

wise pelican Feb 14, 2023, 5:45 PM

#

Yep that graph is actually with antialiased=True
And that idea with seaborn is what I mentioned at the end of my first message, was just hoping I wouldn't have to go that route

mild dirge Feb 14, 2023, 5:45 PM

#

Do you have an example of the seaborn plot?

#

Where you think it is smooth enough?

wise pelican Feb 14, 2023, 5:47 PM

#

I unfortunately do not, I was rewriting a lot of this specific project and cleared out all the old samples & graphs
Had to get that example video from a discord DM I had with a friend that I was showing the development process to

mild dirge Feb 14, 2023, 5:47 PM

#

Well it doesn't seem like you can change the magnitude of anti-aliasing, you could increase the line width to make it less noticeable

#

But I'm not sure if it is possible to make it any more smooth with matplotlib

#

Tbh, I don't really think it is that much of a problem, the graph is there to convey information, it doesn't have to look super pretty. It seems like a tiny detail to me.

sterile tundra Feb 14, 2023, 8:40 PM

#

Hmmm

hasty mountain Feb 14, 2023, 11:34 PM

#

Hey guys, about PPO...
What's the difference between calculating my Ratio using exp(log(current_policy_action_prob) - log(previous_policy_action_prob)) and simply making current_policy_action_prob/previous_policy_action_prob?
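
Mathematically the two are the same, since exp(log a - log b) = a / b. The log form is usually preferred because policies typically expose log-probabilities, and because dividing raw probabilities can underflow. A NumPy sketch with made-up values:

```py
import numpy as np

# Made-up log-probabilities of the taken actions
logp_new = np.log(np.array([0.30, 0.45, 0.20]))
logp_old = np.log(np.array([0.25, 0.50, 0.10]))

ratio_log = np.exp(logp_new - logp_old)           # exp of the log difference
ratio_div = np.exp(logp_new) / np.exp(logp_old)   # plain division
same = np.allclose(ratio_log, ratio_div)

# With extremely small probabilities, exp() underflows to 0.0 in float64,
# so the division becomes 0/0 while the log form stays finite
tiny_new, tiny_old = -800.0, -801.0
with np.errstate(invalid="ignore", divide="ignore"):
    bad = np.exp(tiny_new) / np.exp(tiny_old)
good = np.exp(tiny_new - tiny_old)

print(same, bad, good)
```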

#

Oh, and I hope my RL model's grads are normal...

tender knot Feb 15, 2023, 4:07 AM

#

PS C:\Users\Downloads\compvision> python face_recognition.py
python : The term 'python' is not recognized as the name of a cmdlet, function, script file, or operable program. Check
the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1

python face_recognition.py

  + CategoryInfo          : ObjectNotFound: (python:String) [], CommandNotFoundException
  + FullyQualifiedErrorId : CommandNotFoundException

#

hey i tried to run python face_recognition.py in my vscode terminal but theres a problem

#

it was weird since I always get to run my code

midnight kayak Feb 15, 2023, 4:08 AM

#

tender knot PS C:\Users\Downloads\compvision> python face_recognition.py python : The term '...

assuming you're on Windows, search for add and edit environment variables and check that the python bin is in the path

tender knot Feb 15, 2023, 4:09 AM

#

C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\Scripts

#

I already had it

low island Feb 15, 2023, 4:14 AM

#

#

Hi, I already installed sklearn but I still cannot import it

#

What should I do

midnight kayak Feb 15, 2023, 4:35 AM

#

tender knot C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8...

Is there Scripts\ directory populated with .bin files?

low island Feb 15, 2023, 4:46 AM

#

I will check

patent lynx Feb 15, 2023, 5:08 AM

#

Run pip install scikit-learn again (the package on PyPI is named scikit-learn)

tender knot Feb 15, 2023, 5:15 AM

#

midnight kayak Is there Scripts\ directory populated with .bin files?

no

midnight kayak Feb 15, 2023, 5:50 AM

#

tender knot no

Then your env probably isn’t pointing to the right path. I’d probably try a fresh install of python and see if that fixes it

deft pewter Feb 15, 2023, 6:43 AM

#

low island

probably have multiple python versions

#

if on mac make sure to do pip3 install

misty vector Feb 15, 2023, 6:49 AM

#

hey I installed the SimplePyGUI

#

but cannot use it on the mac, and the environment is VS Code

#

any ideas with that

dense yarrow Feb 15, 2023, 7:09 AM

#

energy_data["Date", "U.S. Regular Conventional"].plot(x="Location", y="Date", kind="line", legend=True,
                 label="Time Series of Gas Prices")

#

i'm trying to plot these two columns but getting "key error"
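
A likely cause, sketched with made-up prices: `df["a", "b"]` asks for a single column whose label is the tuple `("a", "b")`, which raises `KeyError` on a frame with ordinary columns. Selecting several columns needs a list inside the brackets. Separately, `x="Location"` refers to a column that is not part of the selection.

```py
import pandas as pd

# Made-up data using the column names from the question
energy_data = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-02", "2023-01-09", "2023-01-16"]),
    "U.S. Regular Conventional": [3.12, 3.18, 3.25],
})

# Single brackets with two labels are treated as one tuple key
try:
    energy_data["Date", "U.S. Regular Conventional"]
    raised = False
except KeyError:
    raised = True

# Double brackets select both columns
subset = energy_data[["Date", "U.S. Regular Conventional"]]
print(raised, subset.shape)
```

From there, `subset.plot(x="Date", y="U.S. Regular Conventional", kind="line", legend=True, label="Time Series of Gas Prices")` should draw the series, assuming matplotlib is installed.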

lapis sequoia Feb 15, 2023, 7:58 AM

#

Hi all. I have retrieved the JSON data from an API and I want to extract the information on capital cities, which is part of the dict under the key 'capital'

#

But I get key error when I try to access it. Any solutions?

inner hemlock Feb 15, 2023, 10:21 AM

#

lapis sequoia Hi all. I have retrieved the JSON data from an API and I want to extract the inf...

print i and see if you need to go deeper into the dict

tidal bough Feb 15, 2023, 10:23 AM

#

perhaps not all the dicts in the list it returns have that key.

lapis sequoia Feb 15, 2023, 10:24 AM

#

inner hemlock print `i` and see if you need to go deeper into the dict

Tried and it did not work. When I check any i from the JSON (essentially what I did in the first screenshot), I can find the key and its value. It's only when I iterate that it raises the key error

lapis sequoia Feb 15, 2023, 10:25 AM

#

tidal bough perhaps not all the dicts in the list it returns have that key.

I actually thought the same way, given there's 250 entries in the data set

inner hemlock Feb 15, 2023, 10:26 AM

#

does it fail on the first loop where it is confirmed to be there?

boreal gale Feb 15, 2023, 10:26 AM

#

Antarctica doesn't have a capital TIL.
if you print i after the failing cell block, you can see what's the i - that's how i knew that's what causing the issue

inner hemlock Feb 15, 2023, 10:26 AM

#

maybe have it print the iteration

tidal bough Feb 15, 2023, 10:26 AM

#

lapis sequoia I actually thought the same way, given there's 250 entries in the data set

well, why not make it show the dict it fails on?

#

like, in this case you can literally just do print(i) in the next cell, because of how loops work

lapis sequoia Feb 15, 2023, 10:28 AM

#

boreal gale Antarctica doesn't have a capital TIL. if you print `i` after the failing cell b...

Interesting

#

Lemme check the iteration at which the issue comes up

#

Thank you!

boreal gale Feb 15, 2023, 10:31 AM

#

also TIL south africa has 3 capitals..? that's another edge case you need to be careful about
ditto for Palestine, which has 2..
according to this API anyway!
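
A small sketch of handling both edge cases, assuming each entry is a dict whose `capital` key, when present, holds a list of names. The three entries are simplified, hypothetical stand-ins for the API response.

```py
# Simplified, hypothetical stand-ins for entries in the API response
countries = [
    {"name": "France", "capital": ["Paris"]},
    {"name": "Antarctica"},  # no "capital" key at all
    {"name": "South Africa", "capital": ["Pretoria", "Bloemfontein", "Cape Town"]},
]

capitals = {}
for country in countries:
    # .get returns the fallback instead of raising KeyError
    capitals[country["name"]] = country.get("capital", [])

# Entries without any capital, for inspection
missing = [name for name, caps in capitals.items() if not caps]

print(capitals)
print(missing)
```

Keeping the value as a list covers countries with several capitals without special-casing them.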

lapis sequoia Feb 15, 2023, 10:37 AM

#

boreal gale Antarctica doesn't have a capital TIL. if you print `i` after the failing cell b...

Can I see your code for this?

boreal gale Feb 15, 2023, 10:39 AM

#

for checking what's failing?
literally the same as yours but with just a single i in the next block and manually run that after the above failed

lapis sequoia Feb 15, 2023, 10:39 AM

#

Okay. Thanks

long widget Feb 15, 2023, 10:44 AM

#

Does anyone know an existing model to extract statements/claims from a 'medium to large' text like a research paper?

lapis sequoia Feb 15, 2023, 12:06 PM

#

can someone suggest me a playlist for how to create a neural network from scratch for chatbot (like gpt).
i am trying create a chatbot with my own 100billnn structure.
and how to train 100 billion NN in online for free of cost and how to make chat bot like gpt
somebody give me suggestions on this

mild dirge Feb 15, 2023, 12:24 PM

#

lapis sequoia can someone suggest me a playlist for how to create a neural network from scratc...

100 bil nn? what's that

lapis sequoia Feb 15, 2023, 12:25 PM

#

i want to make a chatbot

#

with 100bill

mild dirge Feb 15, 2023, 12:25 PM

#

100 bil what?

lapis sequoia Feb 15, 2023, 12:25 PM

#

neural network

mild dirge Feb 15, 2023, 12:25 PM

#

🧱 wall

#

I have no clue what a 100 bil nn is, that is why I ask the question lol

tidal bough Feb 15, 2023, 12:26 PM

#

parameter count

atomic tide Feb 15, 2023, 12:26 PM

#

100 billion weights?

lapis sequoia Feb 15, 2023, 12:26 PM

#

no parameters

tidal bough Feb 15, 2023, 12:26 PM

#

chatgpt has something like 175B for comparison

mild dirge Feb 15, 2023, 12:26 PM

#

If it is 100 bil params, then that wouldn't be very possible

tidal bough Feb 15, 2023, 12:26 PM

#

hence my 🥴 at the idea of training a model this big for free

mild dirge Feb 15, 2023, 12:27 PM

#

mild dirge If it is 100 bil params, then that wouldn't be very possible

For just some random person, you need a very big investment

tidal bough Feb 15, 2023, 12:27 PM

#

I've seen estimations in the vicinity of ~$100K for training a chatgpt clone

lapis sequoia Feb 15, 2023, 12:27 PM

#

i want it for freee of cost

tidal bough Feb 15, 2023, 12:27 PM

#

hence, 🥴

mild dirge Feb 15, 2023, 12:27 PM

#

Not going to happen

#

You should look into learning how nns work, and what it takes to train a network of that size

lapis sequoia Feb 15, 2023, 12:28 PM

#

100bill or ok 100 trill is enough

atomic tide Feb 15, 2023, 12:28 PM

#

cathmm

mild dirge Feb 15, 2023, 12:28 PM

#

If you're just going to troll, just do it elsewhere

lapis sequoia Feb 15, 2023, 12:28 PM

#

mild dirge You should look into learning how nns work, and what it takes to train a network...

is there any free server

lapis sequoia Feb 15, 2023, 12:29 PM

#

mild dirge If you're just going to troll, just do it elsewhere

no i am newbie

tender knot Feb 15, 2023, 12:29 PM

#

hey i know its kinda off the point but do u know how my gcc compiler just download and then stuck like this?

nocturne eagle Feb 15, 2023, 12:29 PM

#

ChatGPT has a few 100 bil weights. This is apx the complexity of a mouse or vole brain. The human brain is apx 1000x to 10000x more complex. Just to give you an idea of scale.

#

sorry if my reply is a bit late

tidal bough Feb 15, 2023, 12:30 PM

#

~~sell your parents' house, spend it on cloud compute~~ 🥴

lapis sequoia Feb 15, 2023, 12:30 PM

#

k

#

but my hobby is to make a sentient

nocturne eagle Feb 15, 2023, 12:32 PM

#

lapis sequoia can someone suggest me a playlist for how to create a neural network from scratc...

The cost involved is in training the model. I suspect that the $100k cost estimate is in using a trained model. IMO, The training probably took 100's to 1000's of years of GPU time. or a few mil $. The methods we currently use to train NN's are crude and unrefined. Essentially we brute force it and it's vastly slower than how biology does it.

lapis sequoia Feb 15, 2023, 12:33 PM

#

mmm

serene scaffold Feb 15, 2023, 12:33 PM

#

lapis sequoia but my hobby is to make a sentient

You would first need to establish what would make an AI sentient, and get everyone else to agree with you. and you'd probably need a PhD in CS.

nocturne eagle Feb 15, 2023, 12:33 PM

#

lapis sequoia but my hobby is to make a sentient

If your goal is a true AI, you want to focus your research in two areas: 1) self-modifying connection structure (i.e. the topology) and 2) improved training methodologies

lapis sequoia Feb 15, 2023, 12:34 PM

#

ok thx

#

i took ai as hobby

nocturne eagle Feb 15, 2023, 12:34 PM

#

#1 and #2 are sort of interrelated

dim palm Feb 15, 2023, 1:15 PM

#

Hi everyone !
I am a French student at Polytech Nantes (France) in the third year of an engineering degree called "Data Engineering and Artificial Intelligence".
I am looking for a 9-week internship abroad in data science for summer 2023, from July 3rd to September 1st.
I would be grateful for any opportunity! Please send me a message 🙂 my DMs are open

junior schooner Feb 15, 2023, 2:10 PM

#

I'm relatively new to python and working with data. I've got a data set as CSV that I want to experiment with by visualising the data, looking for patterns, juxtaposing with additional data etc.

I'm at that stage where I only have a vague idea of what I want to do because I only have a vague idea of the possibilities, tools and methods available. Can anyone point me in the direction of where I can look?

serene scaffold Feb 15, 2023, 2:13 PM

#

junior schooner I've relatively new to python and working with data, I've got a data set as CSV ...

try doing the kaggle pandas tutorial.

solemn atlas Feb 15, 2023, 2:22 PM

#

I am trying to understand multivariate regression
For simple linear regression it's y= mx+ c ,but i don't know what will be the formula for multivariate regression
And i want to compare with simple linear regression formula so i can understand the maths

celest vine Feb 15, 2023, 3:17 PM

#

CNN + LSTM + XGBoost for stock market prediction?

wooden sail Feb 15, 2023, 3:17 PM

#

solemn atlas I am trying to understand multivariate regression For simple linear regression i...

for the multivariate case, it is the intersection of hyperplanes

#

recall that a (hyper)plane can be written as <n,v> = c, where n is the normal vector to the plane, c is a constant that depends on how far away the plane is from the origin, and all v satisfying the equation are on the plane. <n, v> is a dot product

#

for n and v with length N, the hyperplane is N-1 dimensional. a vector equation y = Mx + c, where y, x and c are vectors and M is a matrix with r rows and k columns, defines the intersection of r hyperplanes, each being k-1 dimensional

#

this might seem kinda weird and complex at first, but note that a line as in your usual y = mx + c in linear regression is a 1-dimensional hyperplane in a 2D space, which fits exactly what we described above

solemn atlas Feb 15, 2023, 3:40 PM

#

wooden sail for the multivariate case, it is the intersection of hyperplanes

so its formula will be y = mx1+mx2+c right?
here y is dependent variable and x1 and x2 is independent variable

wooden sail Feb 15, 2023, 3:45 PM

#

solemn atlas so its formula will be y = mx1+mx2+c right? here y is dependent variable and x1 ...

it could have arbitrarily many x_i's, but yes

solemn atlas Feb 15, 2023, 3:46 PM

#

wooden sail it could have arbitrarily many x_i's, but yes

thanks literally i was trying to figure out this for literally 2h banging my head over my keyboard and searching on internet

wooden sail Feb 15, 2023, 3:48 PM

#

all good. if your y is a scalar here, then it's really just the equation of a hyperplane

#

ah btw, the m's you wrote there should also have indices, i.e. m1 x1 + m2 x2. there's no reason in general why m1 = m2
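
To make the formula concrete: with two inputs the model is y = m1*x1 + m2*x2 + c, and fitting it means finding m1, m2 and c by least squares. A NumPy sketch on synthetic, noise-free data; the true coefficients 2, -3 and 5 are arbitrary choices for the demonstration.

```py
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, size=50)
x2 = rng.uniform(0, 10, size=50)

# Synthetic target generated from known coefficients
y = 2.0 * x1 - 3.0 * x2 + 5.0

# Design matrix: one column per input plus a column of ones for the intercept
A = np.column_stack([x1, x2, np.ones_like(x1)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
m1, m2, c = coef

print(m1, m2, c)
```

Simple linear regression is the same setup with only the `x1` column and the column of ones.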

solemn atlas Feb 15, 2023, 4:46 PM

#

ok

clever owl Feb 15, 2023, 4:48 PM

#

I've got a dataframe where I wanna groupby multiple columns on the same condition. So for every column after key, group on whether it is pos or neg.

import pandas as pd

df = pd.DataFrame({'key': ['john', 'john', 'jack', 'jamie', 'jamie', 'jamie'],
                   'col2': [5,-5,6,7,8,-10],
                   'col3': [6,-7,8,9,2,-90]
                   })

df = df.groupby([df["key"], *[df[column] < 0 for column in df.columns[1:]]]).sum()

The above code works, but I just wanna make sure there isn't a more pandas way of writing, *[df[column] < 0 for column in df.columns[1:]] (create a mask for every column after key for whether the col is pos or neg)
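
One alternative that some may find more idiomatic: build all the masks at once with `DataFrame.lt`, join them as named columns, and group by labels. This is a sketch and not necessarily better than the original. The `_neg` suffix is an arbitrary naming choice.

```py
import pandas as pd

df = pd.DataFrame({'key': ['john', 'john', 'jack', 'jamie', 'jamie', 'jamie'],
                   'col2': [5, -5, 6, 7, 8, -10],
                   'col3': [6, -7, 8, 9, 2, -90]})

# Boolean mask for every column except 'key', computed in one call
value_cols = df.columns.drop('key')
neg = df[value_cols].lt(0).add_suffix('_neg')

# Group by the key and by each mask column, then sum the value columns
out = df.join(neg).groupby(['key', *neg.columns]).sum()
print(out)
```

The index levels are then named `col2_neg` and `col3_neg`, which avoids having index levels and value columns share the same names.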

manic jolt Feb 15, 2023, 5:17 PM

#

When running tensorflow in a docker container, how can I pull the tensorflow package for cpu and jupyter?

serene scaffold Feb 15, 2023, 5:18 PM

#

manic jolt When running tensorflow in a docker container, how can I pull the tensorflow pac...

you can expose whichever port jupyter is using, so that you can view it in your browser.

manic jolt Feb 15, 2023, 5:18 PM

#

I mean which is the image for cpu use, sry

serene scaffold Feb 15, 2023, 5:19 PM

#

oh, I see what you mean

manic jolt Feb 15, 2023, 5:19 PM

#

yeah there are only gpu packages and some with now type

serene scaffold Feb 15, 2023, 5:21 PM

#

manic jolt yeah there are only gpu packages and some with now type

do you know how to mount volumes with docker?

#

docker run -it --rm -v $(realpath ~/notebooks):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:latest-jupyter

in particular, do you know what the -v $(realpath ~/notebooks):/tf/notebooks part is doing?

manic jolt Feb 15, 2023, 5:22 PM

#

yes, theoretically, why?

#

No sry

serene scaffold Feb 15, 2023, 5:23 PM

#

inside the container, there's a /tf/notebooks directory for notebooks. and -v /your/path:/tf/notebooks exposes /tf/notebooks/ as wherever /your/path is

manic jolt Feb 15, 2023, 5:24 PM

#

so in this case there will be a notebooks folder in my home dir?

serene scaffold Feb 15, 2023, 5:24 PM

#

you decide where you want /your/path to be

manic jolt Feb 15, 2023, 5:25 PM

#

so that is what i should specify after realpath?

serene scaffold Feb 15, 2023, 5:25 PM

#

you can replace all of $(realpath ~/notebooks) with whatever path you want

manic jolt Feb 15, 2023, 5:25 PM

#

Ok thank you very much 👍

#

One quick question. How can I make the container start at boot

boreal gale Feb 15, 2023, 5:33 PM

#

don't think you can truly do that, but restart policy gets you almost all the way there

#

https://docs.docker.com/config/containers/start-containers-automatically/#use-a-restart-policy

i.e. add --restart always

novel python Feb 15, 2023, 7:48 PM

#

what's your guys opinion on using polars over pandas? I've seen some people changing lately because of performance, but not sure to what extent it really makes that much of a difference.

charred light Feb 15, 2023, 8:13 PM

#

novel python what's your guys opinion on using polars over pandas? I've seen some people chan...

If you want, go for it. I doubt it will really take hold. Pandas is very useful for what it does & provides. As for larger datasets, most will just use spark (pyspark) instead.

serene scaffold Feb 15, 2023, 8:24 PM

#

novel python what's your guys opinion on using polars over pandas? I've seen some people chan...

it has some features that I wish pandas had, but I don't feel motivated to switch.

boreal gale Feb 15, 2023, 8:25 PM

#

serene scaffold it has some features that I wish pandas had, but I don't feel motivated to switc...

what would those features be? i haven't had the chance to take a good look at polars

serene scaffold Feb 15, 2023, 8:26 PM

#

boreal gale what would those features be? i haven't had the chance to take a good look at po...

self-referencing. you know operations like df[df[x] == y]? that works if the dataframe you're operating on has a variable assigned to it, but that isn't always the case, like if you're doing a bunch of chained operations. and polars has pl.col('x')

valid void Feb 15, 2023, 8:29 PM

#

guys tell me plz what ides is most comfortable for python and ai

serene scaffold Feb 15, 2023, 8:31 PM

#

valid void guys tell me plz what ides is most comfortable for python and ai

it's just a matter of preference. most people pick between PyCharm and visual studio code.

valid void Feb 15, 2023, 8:32 PM

#

thz

#

thx

#

both ides are good tbh

#

but vs code is more customizable

#

than pycharm

charred light Feb 15, 2023, 8:33 PM

#

I prefer VS Code for DS , PyCharm more for pure engineering.

valid void Feb 15, 2023, 8:33 PM

#

exactly

#

and what is the best starting point for learning python

#

i mean some books, courses etc.

boreal gale Feb 15, 2023, 8:34 PM

#

serene scaffold self-referencing. you know operations like `df[df[x] == y]`? that works if the d...

i see! pl.col('x') sounds very spark-ish to me, yeah i quite like being able to do that as well.

i would say pandas do have similar utils albeit in a slightly different style, though it might be a little bit more limiting
e.g.

import pandas as pd
df = pd.DataFrame({"x": [2,3]})
print(df)
print(df.eval("x1 = x + 2").query("x1 == 5"))
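
Besides `eval`/`query`, plain pandas can refer to the intermediate frame inside a chain by passing a callable to `.assign` and `.loc`, which avoids string expressions. A minimal sketch on the same toy frame; the extra `label` column is added here only to make the result easier to read.

```py
import pandas as pd

df = pd.DataFrame({"x": [2, 3], "label": ["a", "b"]})

result = (
    df.assign(x1=lambda d: d["x"] + 2)   # d is the frame at this point in the chain
      .loc[lambda d: d["x1"] == 5]       # filter on the column that was just created
)
print(result)
```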

serene scaffold Feb 15, 2023, 8:35 PM

#

boreal gale i see! `pl.col('x')` sounds very spark-ish to me, yeah i quite like being able t...

I don't like crossing namespaces like that.

boreal gale Feb 15, 2023, 8:36 PM

#

that's fair!

charred light Feb 15, 2023, 8:36 PM

#

Ewww what is that

#

print(df.eval("x1 = x + 2").query("x1 == 5"))

boreal gale Feb 15, 2023, 8:37 PM

#

well that's just a demo ha, don't judge 😛

charred light Feb 15, 2023, 8:38 PM

#

I think I've seen .eval and .query in the past, but I have chosen to suppress them.

agile cobalt Feb 15, 2023, 8:39 PM

#

~~query is not that bad iirc, but eval probably is~~
these methods are actually not that bad

boreal gale Feb 15, 2023, 8:40 PM

#

i don't use them often fwiw, maybe once or twice when i can't be bothered writing another line

#

df.eval uses pd.eval under the hood, i had look at the source before, thought it was cool 🤷‍♂️

agile cobalt Feb 15, 2023, 8:43 PM

#

seems like they recommend using eval/query for moderately large dataframes (>10K rows), though you must have numexpr installed for it to be effective https://pandas.pydata.org/docs/user_guide/enhancingperf.html

charred light Feb 15, 2023, 8:46 PM

#

Is there an advantage compared to just the default filtering?

deft spire Feb 15, 2023, 8:46 PM

#

The size of a matrix like
2 0
5 9
6 2

Is 3x2 right? First rows then cols?
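
A quick way to check this, assuming NumPy is available: `shape` reports rows first, then columns.

```python
import numpy as np

m = np.array([[2, 0],
              [5, 9],
              [6, 2]])

rows, cols = m.shape  # shape is (number of rows, number of columns)
print(m.shape)        # (3, 2)
```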

charred light Feb 15, 2023, 8:47 PM

#

Still sounds like speeding up a hippo, just use pyspark if data is that large.

split prism Feb 15, 2023, 9:02 PM

#

Hey all did you see this: https://www.phoronix.com/news/Intel-AVX-512-Quicksort-Numpy

Intel Publishes Blazing Fast AVX-512 Sorting Library, Numpy Switchi...

Intel recently published an open-source C++ header file library for high performance SIMD-based sorting, which initially is focused on providing a lightning fast AVX-512 quicksort implementation

bold wadi Feb 15, 2023, 9:28 PM

#

i need help i am performing A Fast multivariate EMD-LSTM model aided with Time Dependent Intrinsic Cross-Correlation for monthly rainfall prediction and i have so much error . can you help me..i'll share codes please

jaunty geyser Feb 15, 2023, 10:26 PM

#

what does this error mean? UserWarning: X does not have valid feature names, but DecisionTreeClassifier was fitted with feature names
warnings.warn(
array(['HipHop', 'Dance'], dtype=object)

novel python Feb 15, 2023, 10:40 PM

#

you are most likely predicting on an X array that has no feature names

#

either fit on plain arrays without headers (e.g. x.values), or pass predict a DataFrame with the same column names, and it should go away

jaunty geyser Feb 15, 2023, 10:45 PM

#

@novel python
```py
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

music_data = pd.read_csv('music.csv.zip')
x = music_data.drop(columns=['genre'])
y = music_data['genre']

model = DecisionTreeClassifier()
model.fit(x,y)
predictions = model.predict([[21,1],[22,0]])
predictions
```
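
A minimal sketch of one way to make the warning go away: predict on a DataFrame that carries the same column names the model was fitted with. The toy data and the column names `age` and `gender` are assumptions standing in for `music.csv`, which isn't shown here.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# toy stand-in for music.csv; column names and values are assumptions
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 21, 22],
    "gender": [1, 1, 1, 0, 0, 0],
    "genre": ["HipHop", "HipHop", "HipHop", "Dance", "Dance", "Dance"],
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# wrap the new samples in a DataFrame with the same columns as X,
# so predict() sees the feature names it was fitted with
new_samples = pd.DataFrame([[21, 1], [22, 0]], columns=X.columns)
predictions = model.predict(new_samples)
print(predictions)
```

The other direction also works: fit on `X.values` so no feature names are stored in the first place.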

bold timber Feb 16, 2023, 1:07 AM

#

Hello guys, do you know how to fix this problem?

boreal gale Feb 16, 2023, 1:10 AM

#

bold timber Hello guys, do you know how to fix this problem?

so by looking at the repo, the offending lines are these https://github.com/facebookresearch/Kats/blob/main/kats/compat/compat.py#L16-L19
what version of packaging do you have? i suspect the author didn't specify which version of packaging kats requires and you have a stale (or too bleeding edge!) one installed

arctic wedgeBOT Feb 16, 2023, 1:10 AM

#

`kats/compat/compat.py` lines 16 to 19
```py
from packaging import version as pv


V = Union[str, "Version", pv.Version, pv.LegacyVersion]
```

boreal gale Feb 16, 2023, 1:11 AM

#

together with these two
https://github.com/facebookresearch/Kats/blob/main/setup.py#L16-L18
https://github.com/facebookresearch/Kats/blob/main/requirements.txt#L7
i suspect you have packaging>=22?

arctic wedgeBOT Feb 16, 2023, 1:12 AM

#

`setup.py` lines 16 to 18
```py
# read dependency requirements
with open("requirements.txt", "r") as f:
    install_requires = f.read().splitlines()
```
`requirements.txt` line 7
```txt
packaging<22
```

bold timber Feb 16, 2023, 1:12 AM

#

boreal gale so by looking at the repo, the offending lines are these https://github.com/face...

how to see the package version do I have? can you guide me?

boreal gale Feb 16, 2023, 1:15 AM

#

sure!
usually packages store their version info at the root level under __version__ "dunder" variable

so in this case, we can try

import packaging
print(packaging.__version__)

which does work according to my test

if this doesn't work then, you look like you are using some form of jupyterlab, and so you can run pip command to find that info out too!
e.g. by putting this in a cell and run it
!pip freeze
the ! instructs jupyter to run this as a shell command outside of python
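
Another option that doesn't rely on a `__version__` attribute is the standard library's `importlib.metadata`. This is only a sketch: `installed_version` and `major` are hypothetical helper names, and the `< 22` check mirrors the `packaging<22` line in the kats requirements.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def major(version_string):
    """Leading integer of a version string such as '21.3' (helper for this sketch)."""
    return int(version_string.split(".")[0])


found = installed_version("packaging")
if found is None:
    print("packaging is not installed")
elif major(found) < 22:
    print(found, "satisfies packaging<22")
else:
    print(found, "is too new for packaging<22")
```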

bold timber Feb 16, 2023, 1:16 AM

#

boreal gale sure! usually packages store their version info at the root level under `__versi...

this is my version, what's next?

boreal gale Feb 16, 2023, 1:17 AM

#

sweet, so we confirmed you have a packaging that's too new, that's worrying

#

could you run a !pip check before proceeding? this would probably flag that you have a packaging that's too new for kats (hopefully)

bold timber Feb 16, 2023, 1:19 AM

#

boreal gale sweet, so we confirmed you have a `packaging` that's too new, that's worrying

like this?

boreal gale Feb 16, 2023, 1:20 AM

#

missing the check in !pip check 😉

bold timber Feb 16, 2023, 1:23 AM

#

boreal gale missing the `check` in `!pip check` 😉

No problem 😅

This is what I get

#

and this is the last rows

boreal gale Feb 16, 2023, 1:26 AM

#

bold timber No problem 😅 This is what I get

the last rows is probably fine, given you are in a hosted jupyterlab (i presume).

however this is still not a !pip check

can you do these two things please?

1. run !pip check and post output, i want to see if there is package incompatibility issues already flagged
2. post more output of the first screenshot here, i want to see if packaging is reinstalled

bold timber Feb 16, 2023, 1:27 AM

#

boreal gale the last rows is probably fine, given you are in a hosted jupyterlab (i presume)...

I'm so sorry. I've wrong captured it. I mean, like this

bold timber Feb 16, 2023, 1:29 AM

#

boreal gale the last rows is probably fine, given you are in a hosted jupyterlab (i presume)...

Sorry, I've lost focused because burned out to fix this🙏 🙏

boreal gale Feb 16, 2023, 1:29 AM

#

ah, this is so odd!

#

could you restart your kernel and retry first? since packaging incompatibility issues is not flagged now

#

if it still doesn't work, let's do a !pip install "packaging<22" (quoted, so the shell doesn't treat < as a redirect) and restart kernel and retry
though after this you really need to check if other packages are still working (by again !pip check), as by doing this you could be introducing other incompatibility issues.

bold timber Feb 16, 2023, 1:31 AM

#

boreal gale ah, this is so odd!

I've restart kernel, what should I do now?

boreal gale Feb 16, 2023, 1:33 AM

#

try the kats import again

bold timber Feb 16, 2023, 1:34 AM

#

boreal gale try the kats import again

This is what I get

boreal gale Feb 16, 2023, 1:35 AM

#

!pip install packaging==21.3 - you had a space after == which is wrong

#

after you pip install, you should restart kernel just to be safe and retry the import

bold timber Feb 16, 2023, 1:35 AM

#

boreal gale `!pip install packaging==21.3` - you had a space after `==` which is wrong

OMG thank you! Wait a minute

bold timber Feb 16, 2023, 1:37 AM

#

boreal gale `!pip install packaging==21.3` - you had a space after `==` which is wrong

Yow maaan, you're so amazing! Thank you so much!!!

boreal gale Feb 16, 2023, 1:38 AM

#

🙌 🙌 don't forget to pip check again just in case 😉

bold timber Feb 16, 2023, 1:38 AM

#

boreal gale 🙌 🙌 don't forget to pip check again just in case 😉

Yeah of course, thank you!!!

errant forum Feb 16, 2023, 2:36 AM

#

hi, has anyone dealt with ensemble model using mlens ? My problem is that when I use predict_proba with ensemble model it gives me the direct prediction output instead of the probability

#

<@&267628507062992896> need help on this

vagrant kite Feb 16, 2023, 2:43 AM

#

errant forum <@&267628507062992896> need help on this

please be patient and do not ping roles for help

errant forum Feb 16, 2023, 2:45 AM

#

Okay

odd meteor Feb 16, 2023, 3:49 AM

#

errant forum hi, has anyone dealt with ensemble model using mlens ? My problem is that when I...

Mlxtend, yeah. Mlens? No. I've not used mlens; I'm actually hearing it for the first time. Hopefully, someone who's familiar with it will respond.

bleak zealot Feb 16, 2023, 3:58 AM

#

Evening guys

I got problems reference keras even though i have tensorflow installed in my environment, (general chat couldnt help me) so i hope maybe since its keras other know it better in here?

#

"from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM"

Simply wont reference any of them

#

its in pycharm

odd meteor Feb 16, 2023, 4:28 AM

#

bleak zealot "from tensorflow.keras.models import Sequential from tensorflow.keras.layers imp...

Where was your TensorFlow installed? In your Conda environment or In a different environment outside Conda?

Check that the environment where you installed TensorFlow is actually the active one in your IDE

bleak zealot Feb 16, 2023, 4:31 AM

#

odd meteor Where was your TensorFlow installed? In your Conda environment or In a different...

Name: tensorflow
Version: 2.12.0rc0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: C:\Users\Main\PycharmProjects\pythonProject6\venv\Lib\site-packages
Requires: tensorflow-intel
Required-by:
(venv) PS C:\Users\Main\PycharmProjects\pythonProject6>

When using py -m pip show tensorflow in terminal.

TensorFlow

An end-to-end open source machine learning platform for everyone. Discover TensorFlow's flexible ecosystem of tools, libraries and community resources.

#

And got pycharm 3.2

#

I guess ill check tomorrow, its like 5 am here 🙂 Probably just too tired and staring myself blind at this.

#

And its of course in project6 i got the code also...

#

I even tried just right click the keras and dense and take action directly on in the code, and install from there. Didnt work either.

lapis sequoia Feb 16, 2023, 4:58 AM

#

apologies in advance if ML talk is not appropriate here

#

I thought high recall means your model is correctly predicting true positives?

Precision measures the proportion of predicted positives that are actually true positives, so a high recall and low precision would mean that the model is predicting too many positive cases, including false positives.

#

oh wait im dumb

#

recall is the proportion of actual positives the model correctly identifies: true positives out of all the actual positives in the data you evaluate on

#

therefore, a high recall would mean the model is accurately predicting or identifying the positive cases in the data
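
A small worked example of the two ratios, in plain Python. The labels are made up, and `precision_recall` is a hypothetical helper for illustration rather than a library function.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute (precision, recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # TP / predicted positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # TP / actual positives
    return precision, recall


# a model that calls everything positive: it finds every real positive
# (recall 1.0) but most of its positive calls are wrong (precision 0.25)
y_true = [1, 0, 0, 0, 1, 0, 0, 0]
y_pred = [1] * 8
print(precision_recall(y_true, y_pred))  # (0.25, 1.0)
```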

dim palm Feb 16, 2023, 6:27 AM

#

Hi everyone !
I am a French student from Polytech Nantes (France) on third year of engineer degree called "Data Engineering and Artificial Intelligence".
I am seeking for a 9-week internship abroad in the area of data science for summer 2023 from July 3rd to September 1st.
I would be grateful for any opportunity ! Please send me a message 🙂 my DMs are open

long widget Feb 16, 2023, 9:45 AM

#

what is the best/fastest way to check whether or not a text is written in english?

umbral charm Feb 16, 2023, 9:51 AM

#

long widget what is the best/fastest way to check whether or not a text is written in englis...

?? google translate?

long widget Feb 16, 2023, 9:52 AM

#

umbral charm ?? google translate?

google translate has an api?

umbral charm Feb 16, 2023, 9:52 AM

#

long widget google translate has an api?

OH

#

ok i just thought u meant in general

long widget Feb 16, 2023, 9:52 AM

#

no xd

umbral charm Feb 16, 2023, 9:53 AM

#

Hm

#

You could check to see if they have any letters outside of the english alphabet

#

such as accents and russian and mandarin characters

long widget Feb 16, 2023, 9:56 AM

#

the problem with that is that tons of languages use that alphabet

umbral charm Feb 16, 2023, 9:57 AM

#

Yea

#

i think you will need like a library or API then

umbral charm Feb 16, 2023, 9:58 AM

#

long widget the problem with that is that tons of languages use that alphabet

https://stackoverflow.com/questions/43377265/determine-if-text-is-in-english

Stack Overflow

Determine if text is in English?

I am using both Nltk and Scikit Learn to do some text processing. However, within my list of documents I have some documents that are not in English. For example, the following could be true:

[ "t...

long widget Feb 16, 2023, 9:59 AM

#

I saw that but according to the comments those solutions were very slow

#

idk if there is a faster way

umbral charm Feb 16, 2023, 10:01 AM

#

Make your own dictionary and compare

#

XD
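
Joke aside, a crude version of the dictionary idea can be sketched with the standard library only: count how many words are very common English words. The word list and the 0.2 threshold are arbitrary assumptions and `looks_english` is a made-up name, so treat this as a fast first-pass filter and not a real language detector.

```python
import re

# tiny hand-picked set of very common English words (an assumption of this sketch)
ENGLISH_STOPWORDS = frozenset({
    "the", "and", "is", "in", "to", "of", "that", "it", "for", "on",
    "with", "as", "this", "was", "are", "be", "at", "have", "not", "you",
})


def looks_english(text, threshold=0.2):
    """Return True if enough of the words are common English words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return False
    hits = sum(word in ENGLISH_STOPWORDS for word in words)
    return hits / len(words) >= threshold


print(looks_english("The cat is sitting on the mat and it is happy"))
print(looks_english("Le chat est sur la table"))
```

Set lookups keep this cheap per document, but short texts and mixed-language texts will be misjudged.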

desert pulsar Feb 16, 2023, 11:28 AM

#

Hi Im struggling with splitting into train, test and validation

#

could someone please help me

#

```py
def DataLoaderCreator(dataset,batch_size,shuffle_dataset=True,random_seed=42):
    train_split=0.6
    val_split = 0.2
    dataset_size = len(dataset)
    indices = list(range(dataset_size))
    if shuffle_dataset:
        np.random.seed(random_seed)
        np.random.shuffle(indices)
    train_end = int(train_split * dataset_size)
    val_end = int(val_split * dataset_size)
    train_indices = indices[:train_end]
    val_indices=indices[train_end:val_end]
    test_indices=indices[val_end:]
    train_sampler = SubsetRandomSampler(train_indices)
    valid_sampler = SubsetRandomSampler(val_indices)
    test_sampler = SubsetRandomSampler(test_indices)
    train_data_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, 
                                           sampler=train_sampler)
    val_data_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                                sampler=valid_sampler)
    test_data_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                                sampler=test_sampler)
    return train_data_loader,val_data_loader, test_data_loader
```
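
One likely culprit in the function as posted: `val_end` is computed as 20% of the dataset, which is smaller than `train_end` (60%), so `indices[train_end:val_end]` comes out empty and the test slice overlaps the training slice. A minimal sketch of the index arithmetic follows, using the standard library's `random` in place of `np.random` so it runs on its own. `split_indices` is a made-up name.

```python
import random


def split_indices(dataset_size, train_split=0.6, val_split=0.2, shuffle=True, seed=42):
    """Return disjoint (train, val, test) index lists covering the whole dataset."""
    indices = list(range(dataset_size))
    if shuffle:
        random.Random(seed).shuffle(indices)
    train_end = int(train_split * dataset_size)
    # val_end is an absolute position, so it has to be offset by train_end
    val_end = train_end + int(val_split * dataset_size)
    return indices[:train_end], indices[train_end:val_end], indices[val_end:]


train_idx, val_idx, test_idx = split_indices(10)
print(len(train_idx), len(val_idx), len(test_idx))  # 6 2 2
```

The three lists can then be passed to `SubsetRandomSampler` exactly as in the original function.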

deft spire Feb 16, 2023, 2:56 PM

#

Hey, could someone tell me what layers should be in a tensorflow model that accepts text and gives a probability of text being a spam text or not? I can think of Input -> TextVectorization -> (idk maybe Dense with relu) -> Dense(2) but I am very inexperienced so might be nonsense

dim palm Feb 16, 2023, 3:34 PM

#

Hi everyone !
I am a French student from Polytech Nantes (France) on third year of engineer degree called "Data Engineering and Artificial Intelligence".
I am seeking for a 9-week internship abroad in the area of data science for summer 2023 from July 3rd to September 1st.
I would be grateful for any opportunity ! Please send me a message 🙂 my DMs are open

serene scaffold Feb 16, 2023, 3:35 PM

#

dim palm Hi everyone ! I am a French student from Polytech Nantes (France) on third year...

This isn't a place to seek out internships, but you can ask for internship hunting advice in #career-advice

dim palm Feb 16, 2023, 3:35 PM

#

serene scaffold This isn't a place to seek out internships, but you can ask for internship hunti...

thanks and sorry

serene scaffold Feb 16, 2023, 3:36 PM

#

No problem. Hope you find one 😄

trail zodiac Feb 16, 2023, 4:01 PM

#

Hey folks, I've a bit of an odd question- does anyone know of a resource of a timeline of major research papers in AI?

I'm finding myself frustrated by a gap between modern tools (which I can use, but do not fully understand), and older concepts I read the research on in Uni (which I understand, but are all out of date). What I'd like to do is progressively close the gap from what I understand to modern tools, but I'm having a hard time finding any such resource and diving into recent papers has me going to google every third sentence because I lack the context for the paper.

drifting kelp Feb 16, 2023, 4:20 PM

#

Hello, could someone help me?

long widget Feb 16, 2023, 4:47 PM

#

drifting kelp Hello, could someone help me?

Best to ask the question instead of asking for help

drifting kelp Feb 16, 2023, 4:47 PM

#

Yea. But i think i found a solution.

#

For sure no the best.

#

*not

cinder schooner Feb 16, 2023, 5:56 PM

#

deft spire Hey, could someone tell me what layers should be in a tensorflow model that acce...

I don't know if this answers your question, but you can try a transformers architecture. You can take a pretrained one and fine tune it for spam classification

bold pumice Feb 16, 2023, 6:17 PM

#

https://github.com/pranftw/neograd - A deep learning framework I created from scratch using only Python and NumPy

GitHub

GitHub - pranftw/neograd: A deep learning framework created from sc...

A deep learning framework created from scratch with Python and NumPy - GitHub - pranftw/neograd: A deep learning framework created from scratch with Python and NumPy

odd meteor Feb 16, 2023, 6:20 PM

#

bleak zealot Name: tensorflow Version: 2.12.0rc0 ...

Have you resolved this yet? If no, create a virtual environment and do the work therein (if you're not using venv yet.) This should fix the issue with ease.

bleak zealot Feb 16, 2023, 6:21 PM

#

odd meteor Have you resolved this yet? If no, create a virtual environment and do the work ...

Yes i resolved it thanks.

odd meteor Feb 16, 2023, 6:23 PM

#

long widget what is the best/fastest way to check whether or not a text is written in englis...

You can use langdetect https://pypi.org/project/langdetect/ or one of its equivalents to get it done.

PyPI

langdetect

Language detection library ported from Google's language-detection.

odd meteor Feb 16, 2023, 6:24 PM

#

bleak zealot Yes i resolved it thanks.

awesome

odd meteor Feb 16, 2023, 6:27 PM

#

trail zodiac Hey folks, I've a bit of an odd question- does anyone know of a resource of a ti...

I don't know if this will be of help but try Elicit. https://elicit.org/

sharp wave Feb 16, 2023, 6:28 PM

#

Is anyone good with decision trees for machine learning? I appreciate any help 🙂

odd meteor Feb 16, 2023, 6:32 PM

#

sharp wave Is anyone good with decision trees for machine learning? I appreciate any help �...

Hi cvspharmacyluvr, don't ask question to ask question. If you had mentioned what you need more clarity on as regards using Decision Trees, I'm sure a lot of people will respond swiftly.

#

Pope Stelercus 🙌🏾 😎 It's been a minute. I trust you're doing great

trail zodiac Feb 16, 2023, 6:46 PM

#

odd meteor I don't know if this will be of help but try Elicit. https://elicit.org/

Definitely looks like a neat tool but unfortunately it doesn't really solve my problem. Appreciate the link though, bookmarked for later use

sleek shuttle Feb 16, 2023, 8:25 PM

#

Hi guys, I want to do a project for my course of NLP at university and I was thinking to do a model that understand the topic of documents and article. If you have some advise on how to start and about libraries I really appreciate it. Thanks in advance 🙂

vestal cave Feb 16, 2023, 9:39 PM

#

Hi everyone, I'm trying to sort out how memory is being used in Python.

import hashlib
from _compat import to_bytes

sha256_hashobj = hashlib.sha256()
sha384_hashobj = hashlib.sha384()

sha256_hashobj.update(to_bytes("Bob is here"))
sha384_hashobj.update(to_bytes("Bob is here"))

print(id(sha256_hashobj))
print(id(sha384_hashobj))

Both objects have different references, but the byte data itself is the same. While I'm storing the binary data in each hash object, is the reference to that byte data the same for both?

#

This gets a bit more complicated if I'm trying to calculate multiple checksums for a very large file (>1TB) from a stream at the time of read/writing that file to disk. It feels like it's going to be really slow because Python is going to run out of RAM (from my current understanding) - and swap to Disk memory.... and not much I can do about it

agile cobalt Feb 16, 2023, 9:48 PM

#

vestal cave This gets a bit more complicated if I'm trying to calculate multiple checksums f...

it looks like there is a section about hashing entire files in the documentation (new in 3.11?) - have you checked it?
https://docs.python.org/3/library/hashlib.html#file-hashing

Python documentation

hashlib — Secure hashes and message digests

Source code: Lib/hashlib.py This module implements a common interface to many different secure hash and message digest algorithms. Included are the FIPS secure hash algorithms SHA1, SHA224, SHA256,...

vestal cave Feb 16, 2023, 9:49 PM

#

Ahhh, thank you. I was using 3.9 docs

agile cobalt Feb 16, 2023, 9:50 PM

#

I'm not sure if it supports getting multiple different hashes at once, but worst case scenario you can still try peeking at the source code for ideas if doing one hash type at a time isn't viable

vestal cave Feb 16, 2023, 9:51 PM

#

That's exactly what I'm doing to see how they're getting it done!

agile cobalt Feb 16, 2023, 9:51 PM

#

still... for >1 TB you might as well consider looking for options specifically meant to work with large data

vestal cave Feb 16, 2023, 9:52 PM

#

Thank you... I agree, I think this is going to be problematic

#

Especially if a requirement is to return 4 checksums 😫
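
For what it's worth, a hash object only keeps a small fixed-size internal state and does not hold on to the bytes fed into it, so memory use should not grow with file size as long as the file is read in chunks. Below is a sketch of computing several digests in one pass over a stream. The function name `multi_digest`, the four algorithms and the 1 MiB chunk size are assumptions.

```python
import hashlib
import io


def multi_digest(stream, algorithms=("sha1", "sha256", "sha384", "sha512"),
                 chunk_size=1024 * 1024):
    """Read a binary stream once, feeding each chunk to every hash object."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# in-memory stand-in for a large file opened with open(path, "rb")
digests = multi_digest(io.BytesIO(b"Bob is here"))
print(digests["sha256"])
```

Only one chunk is in memory at a time, and the same chunk can also be written to disk inside the loop if the hashing has to happen while the file is being copied.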

soft badge Feb 16, 2023, 9:54 PM

#

guys, to become an expert in AI, can you be self-taught studying on the internet or do you need to study at a university?

agile cobalt Feb 16, 2023, 9:56 PM

#

'self-taught' might not be impossible per se, but you are better off taking the safe route and going to a university

soft badge Feb 16, 2023, 9:58 PM

#

What is the biggest learning difficulty in this area?

modest hazel Feb 16, 2023, 10:28 PM

#

Guys, what book/course would you recommend to develop fundamentals? (pandas/numpy/matplotlib)

#

Like to learn it and to be ready to build my own projects

#

👀

dense crane Feb 17, 2023, 12:39 AM

#

https://www.youtube.com/watch?v=t-8ZIWCRvP0

YouTube

PyData Bydgoszcz

Jan Kanty Milczek - Nienawidzę pandasów ("I hate pandas") (PyData Bydgoszcz #5)

PyData Bydgoszcz meetup 5
Tuesday, 17.05.2022
Eljazz club, Kręta 3, Bydgoszcz
https://fb.me/e/1KcjbZDsp

Jan Kanty Milczek
Award-winning computer scientist and data analyst. As Principal Data Scientist at deepsense.ai, he leads projects in network security and systems monitoring.

"I hate pandas"
Pandas are like a monkey wrench - not ...


#

he just literally roasted the pandas

#

so i have a question for the experts here do you still use pandas or do you prefer to use something else instead?

drifting kelp Feb 17, 2023, 12:42 AM

#

Hello, how can i create a cmap, matplotlib, with black when the value is 0, red when is -1 and blue when its 1?
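
One way this could be done, assuming the data only ever contains -1, 0 and 1: a `ListedColormap` with three colors plus a `BoundaryNorm` whose bin edges sit halfway between the values.

```python
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

# colors listed in the order of the bins: -1 -> red, 0 -> black, 1 -> blue
cmap = ListedColormap(["red", "black", "blue"])
# bin edges halfway between the expected values
norm = BoundaryNorm([-1.5, -0.5, 0.5, 1.5], cmap.N)

data = np.array([[-1, 0, 1],
                 [1, -1, 0]])
rgba = cmap(norm(data))  # RGBA colors, shape (2, 3, 4)
print(rgba[0])

# for plotting: plt.imshow(data, cmap=cmap, norm=norm)
```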

hasty mountain Feb 17, 2023, 12:53 AM

#

soft badge What is the biggest learning difficulty in this area?

Calculus

#

yert

#

And things related to Monte-Carlo and Probability Distributions in uncertain scenarios

#

Something which I find too complicated...but ironically I'm always being dragged to it.

patent lynx Feb 17, 2023, 12:58 AM

#

Manipulating Time series and forecasting is a 💀

bleak zealot Feb 17, 2023, 1:21 AM

#

Any helpers who can help with this?

The dataframe looks so weird or maybe its just me?

predictors = ["Close", "Volume", "Open", "High", "Low", ]
model.fit(train[predictors], train["Target"])
preds = model.predict(test[predictors])
preds = pd.Series(preds, index=test.index, name="Prediction")
precision = precision_score(test["Target"], preds)
combined = pd.concat([test['Target'], preds], axis=1)
combined.plot()
plt.show()

mild dirge Feb 17, 2023, 1:25 AM

#

Do you not want to take an average, instead of plot every value?

bleak zealot Feb 17, 2023, 1:25 AM

#

But the problem is, if i change the date to lets say 1990 to get longer prediction, i cant even see the predictions, since its just a orange square

#

I have to zoom in so much, and then when i zoom in i have no data

mild dirge Feb 17, 2023, 1:26 AM

#

I'm not really sure what it is you are trying to plot here?

hasty mountain Feb 17, 2023, 1:27 AM

#

Stock prices and a target price?

bleak zealot Feb 17, 2023, 1:27 AM

#

yes

hasty mountain Feb 17, 2023, 1:27 AM

#

Hm...maybe you plot it in a confusing way.

bleak zealot Feb 17, 2023, 1:27 AM

#

the code is working, so its not that.

patent lynx Feb 17, 2023, 1:28 AM

#

What is the model you are using? Auto arima?

hasty mountain Feb 17, 2023, 1:28 AM

#

Try using a scatter plot. When in doubt, I tend to use scatter plot, since it tends to not mess that much with the image...just pollutes it a bit...or too much

bleak zealot Feb 17, 2023, 1:28 AM

#

randomforrest

#

(if thats what you mean)

#

oh sorry im tired

#

using pandas

#

Else im confused about what you ask about?

patent lynx Feb 17, 2023, 1:30 AM

#

bleak zealot randomforrest

Yeah this is what i meant

bleak zealot Feb 17, 2023, 1:30 AM

#

Right fair enough

#

Using randomforrestclassifier

patent lynx Feb 17, 2023, 1:31 AM

#

Hmm shouldnt it be a random forest regression since this is more like forecasting problem and you handled a time series

#

Is this an ensemble methods?

bleak zealot Feb 17, 2023, 1:33 AM

#

prediction

sp500["Tomorrow"] = sp500["Close"].shift(-1)
sp500["Target"] = (sp500["Tomorrow"] > sp500["Close"]).astype(int)

Train the model higher estimatores = better accuracy, higher split less accuracy

model = RandomForestClassifier(n_estimators=100, min_samples_split=100, random_state=1)

train = sp500.iloc[:-100]
test = sp500.iloc[:-100]

This the code for it. (im still learning) so im not exactly sure what you mean by that question?
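
Side note on the snippet as posted: `train` and `test` use the same slice (`iloc[:-100]`), so the model would be scored on rows it was trained on. A small sketch of a chronological hold-out follows, with made-up prices standing in for the `sp500` frame and a hypothetical `n_test` in place of the 100.

```python
import pandas as pd

# toy closing prices standing in for the real sp500 data (values are made up)
sp500 = pd.DataFrame({"Close": [10.0, 11.0, 10.5, 10.7, 10.6, 10.9]})

sp500["Tomorrow"] = sp500["Close"].shift(-1)
sp500["Target"] = (sp500["Tomorrow"] > sp500["Close"]).astype(int)
# the last row has no "tomorrow" yet, so its target is meaningless
sp500 = sp500.dropna(subset=["Tomorrow"])

n_test = 2  # 100 in the original snippet
train = sp500.iloc[:-n_test]  # everything except the last n_test rows
test = sp500.iloc[-n_test:]   # only the last n_test rows

print(len(train), len(test))  # 3 2
```

Since `Target` is a 0/1 label (up or down), a classifier fits that target, while predicting the price itself would be the regression version of the task.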

patent lynx Feb 17, 2023, 1:35 AM

#

Yeah in short you are using the wrong model not suited for this problem

#

Randomforestclassifiers are used for binary outcomes/categorical classification.

bleak zealot Feb 17, 2023, 1:37 AM

#

Yes exactly, later down i put the functions to make the binary outcomes, but its more the barchart im not really happy with (its seems way to big for my screen) or something, lemme just take a screenshoot of it, when i change my prediction date to 1990 instead

#

This from 1990, maybe its just me, but how can i even check the dates ?

mild dirge Feb 17, 2023, 1:41 AM

#

It's not a bar chart

#

It's a line graph

#

Can you try scatter instead of plot

bleak zealot Feb 17, 2023, 1:42 AM

#

Yes give me 5

#

Hmm give me 5 min more, when i try to run it as scatter my computer almost crash 😄

#

This should be the right scatter right?

plt.scatter(test.index, test["Target"], color="blue", label="Actual")
plt.scatter(test.index, preds, color="red", label="Predicted")
plt.legend()
plt.show()

mild dirge Feb 17, 2023, 1:51 AM

#

Yeah, but it seems like you will show the value for every single example, which isn't very useful information

#

You may want to show average prediction per day or something. But my guess is that all predictions are either 0 or 1, which means it makes even less sense.

bleak zealot Feb 17, 2023, 1:52 AM

#

Yeh it show value for each day since 1990

mild dirge Feb 17, 2023, 1:52 AM

#

Is this a regression task or classification?

hasty mountain Feb 17, 2023, 1:52 AM

#

How about trying a histogram?

#

Histogram could be like a candlestick graph pithink

bleak zealot Feb 17, 2023, 1:53 AM

#

Oh that would be so much better, lemme just google how to incorporate that, never worked with that

bleak zealot Feb 17, 2023, 1:53 AM

#

mild dirge Is this a regression task or classification?

What do you mean by that?

mild dirge Feb 17, 2023, 1:54 AM

#

Before trying out any model, look up what those two terms mean.

#

It's the very first step for deciding what model to use

hasty mountain Feb 17, 2023, 1:55 AM

#

Oh, I've just seen the RandomForestClassifier...

#

You might want to review that. Predicting numbers, prices, target prices = regression

#

Classification = classifying data between...uh...classes...

#

pithink

livid elk Feb 17, 2023, 2:01 AM

#

binning might be a good word idek

patent lynx Feb 17, 2023, 2:03 AM

#

bleak zealot What do you mean by that?

https://medium.com/@maryamuzakariya/project-predict-stock-prices-using-random-forest-regression-model-in-python-fbe4edf01664

Idk what you are trying to target, but i think this article might help

Medium

Project: Predict Stock Prices Using Random Forest Regression Model ...

Predict the S&P500 stock prices and accordingly deciding on when to buy, sell or hold a stock.

bleak zealot Feb 17, 2023, 2:07 AM

#

Yeh sorry i didnt know what those two were called, but its regression.

But im targeting the binary if it goes up or down, this comes in my function later down the code, the only thing was just that i wasnt really happy with the graph that it showed, it was all.

red moon Feb 17, 2023, 4:08 AM

#

I'm doing machine learning, so I got this data set:
https://archive.ics.uci.edu/ml/datasets/Algerian+Forest+Fires+Dataset++
I downloaded the CSV, and I deleted the first data, leaving the Sidi-Bel Abbes Region Dataset (I also deleted the words “Sidi-Bel Abbes Region Dataset”).
Now, I read the csv file into Jupyter, and I'm trying to write a model to see how the month relates to temperature, using this code:
import numpy as np
import pandas as pd
import matplotlib
dataset = pd.read_csv('Algerian_forest_fires_dataset_UPDATE.csv', delimiter = ";")
mydata = dataset[["month", "Temperature"]]
mydata = mydata.dropna()
mydata
But, this is giving me a key error, with a screenshot of the error shown below... How can I fix this?

#

lone vine Feb 17, 2023, 4:20 AM

#

Hi all,

I created a pypi package that allows you to access data from ETF DB, one of the large ETF data providers out there.

https://github.com/lvxhnat/pyetf-scraper

Will love some feedback, and do give it a star if you like it 😉
Also looking for contributors who can help maintain and improve on the current package. Do reach out to me if interested, thanks! 🙂

patent lynx Feb 17, 2023, 5:38 AM

#

red moon

That's a KeyError, not a syntax error: check the column name spelling, or print dataset.columns first

surreal swan Feb 17, 2023, 5:44 AM

#

HI guys i need good ML Model for NER and RE for japanese language!

#

Anyone has idea?

red moon Feb 17, 2023, 6:17 AM

#

patent lynx Syntax error check for column name spelling or dataset.columns first

yea i checked for spelling. wdym dataset.columns?

#

oh wait i think ik what u mean ill try that

red moon Feb 17, 2023, 6:38 AM

#

patent lynx Syntax error check for column name spelling or dataset.columns first

hey uhhh so i tried it

#

#

and it still won't work (when i uncomment the first line it gives me error)

#

k found my error... i set the delimiter wrong
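
For anyone hitting the same thing, here is a small reproduction of how a wrong delimiter turns into a KeyError. The rows are made up, and the assumption is that the file is comma-separated, which the thread doesn't state outright.

```python
import io

import pandas as pd

csv_text = "day,month,Temperature\n1,6,32\n2,6,30\n"  # made-up rows

# wrong separator: the whole header is read as one single column name,
# so dataset[["month", "Temperature"]] would raise a KeyError
wrong = pd.read_csv(io.StringIO(csv_text), delimiter=";")
print(list(wrong.columns))  # ['day,month,Temperature']

# comma is the default separator, so no delimiter argument is needed
right = pd.read_csv(io.StringIO(csv_text))
mydata = right[["month", "Temperature"]].dropna()
print(list(mydata.columns))  # ['month', 'Temperature']
```

Printing `dataset.columns` right after reading is the quickest way to spot this.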

sleek shuttle Feb 17, 2023, 9:19 AM

#

Hi guys, I want to do a project for my course of NLP at university and I was thinking to do a model that understand the topic of documents and article. If you have some advise on how to start and about libraries I really appreciate it. Thanks in advance 🙂

frigid lion Feb 17, 2023, 9:37 AM

#

guys im just starting out and doing some udemy course on data science
whats the diffrence between these 2 things?

#

#

like does it matter if i get the data by loc or just by passing the columns i want?
if there is no diffrence what is the conventional way

worldly atlas Feb 17, 2023, 11:48 AM

#

Hello, is 8 GB RAM enough? I have just started learning Data Analytics, and doing some data cleaning in Excel (about 100,000 rows). It's kinda laggy.

Would 8 GB be enough or should I try to go for 16 GB?

safe vortex Feb 17, 2023, 12:11 PM

#

patent lynx https://medium.com/@maryamuzakariya/project-predict-stock-prices-using-random-fo...

Thank you

patent lynx Feb 17, 2023, 12:11 PM

#

worldly atlas Hello, is 8 GB RAM enough? I have just started learning Data Analytics, and doin...

There will be limitations as you go for complex ML or deep learning approaches. Even me with 16GB laptop crashed some kernels when running it 💀. You may adjust the parameters, scale it, transform data, feature selection to make it less computationally heavy. But the best thing to do at your current case is to take a sample (reduce size of your training data) of the data to analyze it.

atomic tide Feb 17, 2023, 12:13 PM

#

frigid lion guys im just starting out and doing some udemy course on data science whats the ...

On this particular data-frame, there is no difference. But if you select a range of columns, you will get all the columns between them as well. There just happen not to be any columns between 'smoker' and 'day' in this data-frame.

#

I think you should generally prefer to use loc though, to avoid chained indexing (which is not the case here). See: https://pandas.pydata.org/docs/user_guide/indexing.html#returning-a-view-versus-a-copy
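A minimal sketch of the difference described above, assuming a small made-up data frame (the column names and values are only illustrative, not the course data):

```python
import pandas as pd

# Hypothetical stand-in for the tips-style data frame from the question.
df = pd.DataFrame({
    "total_bill": [16.99, 10.34],
    "smoker": ["No", "Yes"],
    "day": ["Sun", "Sat"],
    "size": [2, 3],
})

by_list = df[["smoker", "day"]]              # exactly these two columns
by_loc_list = df.loc[:, ["smoker", "day"]]   # same result via loc
by_loc_slice = df.loc[:, "smoker":"day"]     # label slice, both ends included

# Put a column between them: the label slice now picks it up as well.
df.insert(2, "sex", ["F", "M"])
wider = df.loc[:, "smoker":"day"]
print(list(wider.columns))
```

As long as the two columns are adjacent, all three selections agree. Once something sits between them, only the slice changes.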

mossy lance Feb 17, 2023, 12:19 PM

#

sleek shuttle Hi guys, I want to do a project for my course of NLP at university and I was thi...

yeah sure man, you're after something like "text/sense disambiguation" or "text summarisation"? i'd suggest looking at how you extract semantic information from text, you'll probably find lots of downstream tasks there

fiery dust Feb 17, 2023, 2:10 PM

#

what does it take to create and train Ai that generates good quality images. What it takes in terms of PC specs, AI knowledge, etc etc.

#

pretty general question but I'm interested in doing research to start it as a personal project 🙂

#

and when I say images it can be any random thing as long as its an image, it can be a logo, a drawing of something, everything!

hasty mountain Feb 17, 2023, 2:13 PM

#

worldly atlas Hello, is 8 GB RAM enough? I have just started learning Data Analytics, and doin...

Learn about chunks of data. This will be really helpful

#

numpy.load has the argument mmap_mode, which lets you load a NumPy array on demand (the entire array stays stored on disk without being loaded into your RAM, and only the samples you access are read)
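A short sketch of what that looks like, assuming the data is already saved as a .npy file (the file name and array here are made up for the example):

```python
import os
import tempfile

import numpy as np

# Write a sample array to disk (stand-in for a large dataset).
path = os.path.join(tempfile.mkdtemp(), "big_array.npy")
np.save(path, np.arange(1_000_000, dtype=np.float32))

# mmap_mode="r" maps the file read-only instead of reading it all into RAM.
arr = np.load(path, mmap_mode="r")

# Only the slice that is actually accessed gets read from disk.
chunk = np.asarray(arr[1000:1010])
print(type(arr).__name__, chunk[:3])
```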

#

For PyTorch, I've seen this tends to be buggy, so I still have to learn how to do it... my language models depend on this

mild dirge Feb 17, 2023, 2:17 PM

#

With pytorch you make a dataloader and dataset class

hasty mountain Feb 17, 2023, 2:18 PM

#

mild dirge With pytorch you make a dataloader and dataset class

Yes, but it still seems a bit confusing
https://discuss.pytorch.org/t/load-data-in-chunks-using-dataset/123219

PyTorch Forums

Load data in chunks using Dataset

I am wondering if I can modify get_item in Dataset to accept multiple indices instead of one index at a time to improve data loading speed from disk using H5 file. My dataset looks something like this class HDFDataset(Dataset): def init(self, path): self.path = path def len(self): return self.len ...

#

I admit that I have yet to test it, though

worldly atlas Feb 17, 2023, 2:19 PM

#

I'm just beginning, I'm doing Google Data Analytics course and this is a case study they gave. It was laggy when I was sorting, filtering and applying some formulas in the excel. So I asked if 8GB ram will be enough in future.

mild dirge Feb 17, 2023, 2:19 PM

#

hasty mountain Yes, but it still seems a bit confusing https://discuss.pytorch.org/t/load-data-...

It's quite alright if you just try an example. The dataset just has a getitem method that returns 1 single image/sample. Dataloader does rest of the work making the batches

hasty mountain Feb 17, 2023, 2:22 PM

#

mild dirge It's quite alright if you just try an example. The dataset just has a getitem me...

Yes, but I usually use the Dataset class with the entire data being loaded in the init function(np.load()). Should I load my data in the __getitem__() function?

mild dirge Feb 17, 2023, 2:22 PM

#

Yes, the idea is that you load only 1 image.

#

You could preload them if you know the dataset is small

#

But if it is big you would load them 1 by 1

#

And you can use caching to make it quicker
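The pattern described above can be sketched without PyTorch at all. This is only an illustration under assumptions: the class name, the helper iter_batches, and the sample file are made up, and the batching loop is a rough stand-in for what a DataLoader does. In PyTorch the same class would typically subclass torch.utils.data.Dataset and be handed to DataLoader(dataset, batch_size=..., num_workers=...).

```python
import os
import tempfile
from functools import lru_cache

import numpy as np


class LazyArrayDataset:
    """Map-style dataset sketch: one sample is loaded per __getitem__ call."""

    def __init__(self, path):
        # Memory-map the file; nothing is read into RAM yet.
        self.data = np.load(path, mmap_mode="r")

    def __len__(self):
        return self.data.shape[0]

    @lru_cache(maxsize=1024)  # simple in-memory cache for repeated indices
    def _load(self, idx):
        # Copy a single row out of the memory map.
        return np.array(self.data[idx])

    def __getitem__(self, idx):
        return self._load(idx)


def iter_batches(dataset, batch_size):
    """Stack single samples into batches, roughly what a DataLoader does."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield np.stack([dataset[i] for i in range(start, stop)])


path = os.path.join(tempfile.mkdtemp(), "samples.npy")
np.save(path, np.arange(20, dtype=np.float32).reshape(10, 2))

dataset = LazyArrayDataset(path)
batches = list(iter_batches(dataset, batch_size=4))
print([b.shape for b in batches])
```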

hasty mountain Feb 17, 2023, 2:23 PM

#

Only 1 at once is a bit...meh...

mild dirge Feb 17, 2023, 2:23 PM

#

mild dirge And you can use caching to make it quicker

^

#

Otherwise you'd have to manually make those batches

#

Which is more meh imo

hasty mountain Feb 17, 2023, 2:23 PM

#

pithink

mild dirge Feb 17, 2023, 2:24 PM

#

Loading of the data is also often not really the bottleneck of a training process

hasty mountain Feb 17, 2023, 2:24 PM

#

I've never used caching. I'll take a look

mild dirge Feb 17, 2023, 2:26 PM

#

mild dirge Loading of the data is also often not really the bottleneck of a training proces...

And you can also use multiprocessing for loading the data by setting num_workers in the dataloader.

hasty mountain Feb 17, 2023, 2:27 PM

#

Hm... I've seen that the indices in __getitem__() can be a list of indices.

mild dirge Feb 17, 2023, 2:28 PM

#

That could be true, been a bit since I've used pytorch

lapis sequoia Feb 17, 2023, 3:58 PM

#

hello

#

is anyone here

#

Suggest me how to make a Neural network (NN) that creates more NN if required like our brain...

serene scaffold Feb 17, 2023, 4:05 PM

#

lapis sequoia Suggest me how to make a Neural network (NN) that creates more NN if required li...

it doesn't really work like that

#

neural networks are about approximating a function that fits the training data

lapis sequoia Feb 17, 2023, 4:08 PM

#

k

hasty mountain Feb 17, 2023, 4:55 PM

#

serene scaffold neural networks are about approximating a function that fits the training data

I think that actually there's a type of self-building neural network...

#

I don't remember how they're named, but I think it's something involving Reinforcement Learning to make it decide the best number of layers and features

tidal bough Feb 17, 2023, 5:00 PM

#

it also just isn't true that NNs necessarily approximate a given function (RL is hard to describe like that), but I assume Stelercus is going for a basic explanation here

hasty mountain Feb 17, 2023, 5:00 PM

#

Maybe something like this:
https://en.wikipedia.org/wiki/Neuroevolution_of_augmenting_topologies

?

Neuroevolution of augmenting topologies

NeuroEvolution of Augmenting Topologies (NEAT) is a genetic algorithm (GA) for the generation of evolving artificial neural networks (a neuroevolution technique) developed by Kenneth Stanley and Risto Miikkulainen in 2002 while at The University of Texas at Austin. It alters both the weighting parameters and structures of networks, attempting to...

hasty mountain Feb 17, 2023, 5:00 PM

#

tidal bough it also just isn't true that NNs necessarily approximate a given function (RL is...

Yeah, probably pithink

boreal gale Feb 17, 2023, 5:07 PM

#

tidal bough it also just isn't true that NNs necessarily approximate a given function (RL is...

i might be wrong, but NNs in RL are still approximating something, so the above comment is still kinda true. e.g. in deep Q-learning, the NN is used to approximate the Q-values of traditional Q-learning (though it's a significant upgrade, since given enough data you can deal with a continuous state space, whereas tabular Q-learning is bound to a discrete state space)

agile cobalt Feb 17, 2023, 5:07 PM

#

tidal bough it also just isn't true that NNs necessarily approximate a given function (RL is...

uh, if I recall correctly, in RL you have the NN approximate the cost function recursive case?
haven't messed around with it much though

boreal gale Feb 17, 2023, 5:09 PM

#

lapis sequoia Suggest me how to make a Neural network (NN) that creates more NN if required li...

sounds like you would be interested in reading up on Google's AutoML?
not sure if that's exactly what you wanted but it might be of interest.

hasty mountain Feb 17, 2023, 5:09 PM

#

boreal gale i might be wrong, NNs in RL is still approximating something, so the above comme...

I thought Q-Learning didn't use NNs at all pithink

tidal bough Feb 17, 2023, 5:09 PM

#

I was more thinking, like, RL algorithms do tend to involve e.g. learning an expected-loss function, but they also involve choosing between exploring to get better knowledge of it and exploiting the parts you already know for utility.

tidal bough Feb 17, 2023, 5:10 PM

#

hasty mountain I thought Q-Learning didn't use NNs at all pithink

that's called deep-q-learning - it's like q-learning but with an NN outputting the q-function.
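For contrast, a minimal sketch of the plain (tabular) Q-learning update, with no network involved. The state and action names and the values of ALPHA and GAMMA are made up for illustration. Deep Q-learning keeps the same target but replaces the lookup table with a network that estimates Q(state, action).

```python
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9  # illustrative learning rate and discount factor

# Tabular Q-learning: one stored value per (state, action) pair.
q_table = defaultdict(float)


def q_update(state, action, reward, next_state, actions):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q_table[(next_state, a)] for a in actions)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])


q_update("s0", "right", 1.0, "s1", ["left", "right"])
print(q_table[("s0", "right")])
```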

agile cobalt Feb 17, 2023, 5:10 PM

#

they are not necessarily about approximating a given function, but about creating one that fits the data (inputs and outputs) you have

hasty mountain Feb 17, 2023, 5:11 PM

#

Uuuh... Idk, I just try to make a NN that can select the best action given a certain state(sometimes random sampling, of course) while also trying to predict the reward yert

#

That could be a function, but... yert

tidal bough Feb 17, 2023, 5:11 PM

#

tidal bough that's called deep-q-learning - it's like q-learning but with an NN outputting t...

(see e.g. this problem from the Practical RL course (formerly on coursera) about implementing an Atari Breakout agent with it: https://github.com/yandexdataschool/Practical_RL/blob/master/week04_approx_rl/homework_pytorch_debug.ipynb, https://github.com/yandexdataschool/Practical_RL/blob/master/week04_approx_rl/homework_pytorch_main.ipynb)

oak cosmos Feb 17, 2023, 6:54 PM

#

how do i flip that?

#

serene scaffold Feb 17, 2023, 7:18 PM

#

oak cosmos

What code made this

oak cosmos Feb 17, 2023, 7:41 PM

#

serene scaffold What code made this

df_sa = df_vd[df_vd['Profile Name'] == user].copy()  # copy avoids SettingWithCopyWarning
df_sa['Duration'] = df_sa['Duration'].dt.total_seconds().div(3600)  # watch time in hours
df_sa_c = df_sa.groupby(['Title_clean'])['Duration'].sum()
df_sa_c = df_sa_c.sort_values(ascending=False)
df_clear = df_sa_c.head(10)  # top 10 titles by watch time
figtop10, ax = plt.subplots()
label = df_clear.index
y_pos = np.arange(len(label))
ax.barh(label, df_clear, color='red')
addlabels(label, df_clear)

embeded_plot(figtop10)

#

i think i need to simply reverse it

#

lemme think of smth rq

serene scaffold Feb 17, 2023, 7:42 PM

#

oak cosmos ```py df_sa= df_vd[df_vd['Profile Name']== user] df_sa['Duration']= df_s...

The problem is that you used barh

#

Just use bar

#

The h stands for horizontal.

oak cosmos Feb 17, 2023, 7:43 PM

#

nah, i want it to be barh

#

that was intendez

serene scaffold Feb 17, 2023, 7:43 PM

#

Then idk what you mean by flip

oak cosmos Feb 17, 2023, 7:43 PM

#

intended*

#

like so that the highest value is on top

#

i think i can use
```py
df_clear = df_clear.iloc[::-1]
```

serene scaffold Feb 17, 2023, 7:44 PM

#

Or change ascending to true

oak cosmos Feb 17, 2023, 7:45 PM

#

serene scaffold Or change ascending to true

since it is a fd with like 20k rows i would only get the values with 0, but if i think further, we could simply say ascending true and then take the tail of the df

#

df*

#

yup works
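A small sketch of the two options from this exchange, using a made-up Series of watch-time totals (titles and hours are placeholders). Since barh draws the first row at the bottom, the largest value has to come last to end up on top.

```python
import pandas as pd

# Hypothetical watch-time totals per title, in hours.
hours = pd.Series({"A": 5.0, "B": 12.5, "C": 0.0, "D": 8.0, "E": 1.5})

# Sort ascending and keep the last n rows: the largest value is the last
# row, which barh draws as the top bar.
top3 = hours.sort_values(ascending=True).tail(3)

# Equivalent: take the head of a descending sort and reverse it.
top3_alt = hours.sort_values(ascending=False).head(3).iloc[::-1]
print(list(top3.index))
```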

#

ty for the idea mate

serene scaffold Feb 17, 2023, 7:48 PM

#

Yw

hasty mountain Feb 17, 2023, 9:38 PM

#

Hey @serene scaffold , do you have a tip or trick for loading text files that are too heavy?
I've downloaded the CC100 English dataset, but...the .xz file has a size of 85 GB, and the .txt is 320 GB, so...is it possible to use pickle or open() without blowing up my HD or my RAM?

charred light Feb 17, 2023, 9:41 PM

#

hasty mountain Hey @serene scaffold , do you have a tip or trick for loading text files th...

Generators and/or Chunking.

hasty mountain Feb 17, 2023, 9:42 PM

#

charred light Generators and/or Chunking.

Yes, but is it possible to load chunks of data without decompressing my .xz file?

charred light Feb 17, 2023, 9:44 PM

#

hasty mountain Yes, but is it possible to load chunks of data without decompressing my .xz file...

Without unzipping/decompressing? I don't know enough to help you there.

tidal bough Feb 17, 2023, 9:44 PM

#

hasty mountain Yes, but is it possible to load chunks of data without decompressing my .xz file...

xz is LZMA I believe, which does support incremental decompression: https://docs.python.org/3/library/lzma.html#lzma.LZMADecompressor

hasty mountain Feb 17, 2023, 9:44 PM

#

Oh, I've just seen that Python has the tarfile library

hasty mountain Feb 17, 2023, 9:45 PM

#

tidal bough `xz` is LZMA I believe, which does support incremental decompression: <https://d...

Nice. I didn't know about this library either

charred light Feb 17, 2023, 9:46 PM

#

Hmm, I found this saying XZ is not supported. Although this is not the official docs.

tidal bough Feb 17, 2023, 9:48 PM

#

ah, interesting, it seems to say lzma can stream-decode, but can't seek to specific parts, which may well be true.
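A minimal sketch of stream-decoding an .xz file with the standard-library lzma module, so the full text never has to exist on disk or in RAM. The tiny file built here is only a stand-in for the real corpus, and iter_lines is a made-up helper name. Opening in text mode ("rt") also returns str lines rather than bytes.

```python
import lzma
import os
import tempfile

# Build a small .xz file as a stand-in for the real corpus.
path = os.path.join(tempfile.mkdtemp(), "corpus.txt.xz")
with lzma.open(path, "wt", encoding="utf-8") as f:
    for i in range(1000):
        f.write(f"line {i}\n")


def iter_lines(xz_path):
    """Yield decoded lines one at a time; the file is decompressed as a stream."""
    with lzma.open(xz_path, "rt", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


first_three = []
for line in iter_lines(path):
    first_three.append(line)
    if len(first_three) == 3:
        break
print(first_three)
```

This reads sequentially from the start, which matches the point above: streaming works, but jumping to an arbitrary position in the compressed file does not come for free.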

hasty mountain Feb 17, 2023, 9:51 PM

#

pithink

lapis sequoia Feb 17, 2023, 9:51 PM

#

Hello, is it appropriate to ask for direction here?

serene scaffold Feb 17, 2023, 10:07 PM

#

lapis sequoia Hello, is it appropriate to ask for direction here?

If it's about data science or ai, yes

hasty mountain Feb 17, 2023, 10:10 PM

#

charred light Hmm, I found this saying XZ is not supported. Although this is not the official ...

pithink

lapis sequoia Feb 17, 2023, 10:10 PM

#

serene scaffold If it's about data science or ai, yes

Thanks lemon_happy

hasty mountain Feb 17, 2023, 10:10 PM

#

It's just annoying those bytes characters.
Good thing codecs can easily solve this.

charred light Feb 17, 2023, 10:10 PM

#

hasty mountain pithink

CL9_Shrug

lapis sequoia Feb 17, 2023, 10:10 PM

#

I was having problems with Lasso regression in Pandas.

#

I was wondering if there is any optimal method for it

#

I don't think my approach was really working.

charred light Feb 17, 2023, 10:15 PM

#

lapis sequoia I was having problems with Lasso regression in Pandas.

Start with explaining what problem you're trying to solve, and where it's not working.

hasty mountain Feb 17, 2023, 10:45 PM

#

hasty mountain pithink

This is fabulous. Now I can see how unsupervised learners language models really are, as Radford said in GPT-2 paper brainmon

lapis sequoia Feb 17, 2023, 10:45 PM

#

I tried to write in an explanation, but I don't think I am educated enough. Sorry.

#

I will try to get back when I have a better idea what I am doing

simple tapir Feb 17, 2023, 10:53 PM

#

What does the value of the random seed actually do? (It's generally set to 42.) Is it that the higher it is, the more randomness a tensor has?
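A small sketch of what a seed does, using the standard-library random module as a stand-in (the helper name sample is made up). The seed only picks the starting point of the pseudo-random sequence, so the same seed reproduces the same numbers; a larger seed does not mean more randomness, and 42 is just a popular convention. Framework functions such as torch.manual_seed follow the same idea.

```python
import random


def sample(seed, n=3):
    rng = random.Random(seed)  # independent generator with a fixed seed
    return [rng.random() for _ in range(n)]


same_a = sample(42)
same_b = sample(42)
other = sample(7)

print(same_a == same_b)  # same seed, same sequence
print(same_a == other)   # different seed, different sequence
```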

oak cosmos Feb 17, 2023, 11:22 PM

#

anybody of u guys ever worked with like BIG BIG data

#

like 800mB

#

idf´k how to process that shit, im so stuck rn

hasty mountain Feb 17, 2023, 11:32 PM

#

oak cosmos anybody of u guys ever worked with like BIG BIG data

Uh... Right now I'm trying to work with 80 GB pithink

oak cosmos Feb 17, 2023, 11:33 PM

#

hasty mountain Uh... Right now I'm trying to work with 80 Gb pithink

holy fuck

hasty mountain Feb 17, 2023, 11:33 PM

#

Is your problem loading the data? Or saving it?

oak cosmos Feb 17, 2023, 11:33 PM

#

damn, well would u mind helping me out rq?

#

loading and a lil bit of filtering

#

it has around 8mil rows

hasty mountain Feb 17, 2023, 11:33 PM

#

Is it a numpy array?

oak cosmos Feb 17, 2023, 11:34 PM

#

hasty mountain Is it a numpy array?

its a tsv i transformed to a csv

#

ever heard of IMDb movie dataset?

hasty mountain Feb 17, 2023, 11:34 PM

#

No.
Hm...I don't really know how to deal with csv...

#

https://numpy.org/doc/stable/reference/generated/numpy.load.html

oak cosmos Feb 17, 2023, 11:35 PM

#

hmm sadge

hasty mountain Feb 17, 2023, 11:35 PM

#

NumPy has this mmap_mode argument, which lets you open your array in "read" mode, so you don't have to load everything into your RAM at once. I suppose pandas might have something like this, too

oak cosmos Feb 17, 2023, 11:35 PM

#

oh well, that's interesting

#

any keyword i can look for?

hasty mountain Feb 17, 2023, 11:37 PM

#

Uh, again, I don't know how to deal with pandas.
But, if pandas doesn't have this option (which I think is unlikely), you might be able to use open(path, 'r')

#

Try to load your data in read mode, then use chunks of data for what you want to do

#

That way, you can deal with small parts of your data "on demand", without occuppying your entire RAM
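For the pandas side of this, a sketch under assumptions: read_csv accepts a chunksize argument that returns an iterator of DataFrames, so each piece can be filtered before the next one is read. The in-memory TSV and its column names below are made up; with a real file, the path would go where io.StringIO(tsv) is. Passing sep="\t" reads a TSV directly, without converting it to CSV first.

```python
import io

import pandas as pd

# Small in-memory TSV standing in for a large file on disk (hypothetical columns).
tsv = "title\tyear\tgenres\n" + "".join(
    f"movie{i}\t{1990 + i % 30}\tDrama\n" for i in range(1000)
)

filtered_parts = []
# chunksize makes read_csv yield DataFrames of up to 250 rows each.
for chunk in pd.read_csv(io.StringIO(tsv), sep="\t", chunksize=250):
    filtered_parts.append(chunk[chunk["year"] >= 2015])

# Only the filtered rows are kept in memory and combined at the end.
result = pd.concat(filtered_parts, ignore_index=True)
print(len(result))
```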

oak cosmos Feb 17, 2023, 11:39 PM

#

hasty mountain Uh, again, I don't know how to deal with pandas. But, if pandas doesn't have thi...

i asked the AI, it said i can import mmap so i can say it should only load in 10k lines per tick

hasty mountain Feb 17, 2023, 11:39 PM

#

It seems Python also has a mmap built-in module
https://docs.python.org/3/library/mmap.html

Python documentation

mmap — Memory-mapped file support

Availability: not Emscripten, not WASI. This module does not work or is not available on WebAssembly platforms wasm32-emscripten and wasm32-wasi. See WebAssembly platforms for more information. Mem...

oak cosmos Feb 17, 2023, 11:40 PM

#

yeye

#

das what he recommended me

hasty mountain Feb 17, 2023, 11:41 PM

#

iter() and yield might be interesting to know about, too

oak cosmos Feb 17, 2023, 11:41 PM

#

oh i read about iter in my pandas book, it is a bit slow tho

hasty mountain Feb 17, 2023, 11:41 PM

#

Well, it can be useful to get chunks in a dataloader function

charred light Feb 17, 2023, 11:50 PM

#

Uh, need some help brainstorming:
I'm working on an image classifier for dog breeds. Only problem is there are 120 classes, so the model is generally going to suck. I'm thinking of first filtering by a type of dog (e.g. large, medium, small dog breeds, or terriers, hounds, toy), then, out of those classes, running a secondary model for the specific breed.

Issue is, the only "upper grouping" I've found is https://www.akc.org/public-education/resources/general-tips-information/dog-breeds-sorted-groups/. Just curious if anyone knows a better way to group.
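One way to wire up the two-stage idea, sketched with a tiny hand-written mapping. The dictionary below is only an illustrative excerpt (in practice it would be built from whichever grouping is chosen, such as the AKC list), and both helper names are made up.

```python
# Hypothetical excerpt of a breed -> group mapping.
BREED_TO_GROUP = {
    "beagle": "hound",
    "basset_hound": "hound",
    "airedale_terrier": "terrier",
    "chihuahua": "toy",
    "pug": "toy",
}


def group_labels(breed_labels):
    """Coarse labels for training the first-stage (group) model."""
    return [BREED_TO_GROUP[b] for b in breed_labels]


def breeds_in_group(group):
    """Candidate classes for the second-stage model of one group."""
    return sorted(b for b, g in BREED_TO_GROUP.items() if g == group)


print(group_labels(["pug", "beagle"]))
print(breeds_in_group("hound"))
```

The existing breed labels are reused as they are; the mapping only derives a second, coarser label from each of them.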

jolly sparrow Feb 18, 2023, 12:04 AM

#

import cv2

# Open the video
cap = cv2.VideoCapture('nom_de_la_video.mp4')

# Read the video frame by frame
while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Process each frame
    # ...

    # Display the frame
    cv2.imshow('video', frame)

    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the video and close the window
cap.release()
cv2.destroyAllWindows()

#

Please, I need help. I want to put my video in the first step

hasty mountain Feb 18, 2023, 12:12 AM

#

charred light Uh, need some help brainstorming: I'm working on a image classification for dog ...

Hm... This "upper grouping" is a quite keen idea.
but have you heard the word of unsupervised learning?

charred light Feb 18, 2023, 12:34 AM

#

hasty mountain Hm... This "upper grouping" is a quite keen idea. *but have you heard the word o...

Not applicable here? It's a CNN.

hasty mountain Feb 18, 2023, 12:35 AM

#

charred light Not applicable here? It's a CNN.

Oh, trust me, it is hyperlemon

#

Make the CNN output the features it extracted, use those features to determine the degree of entropy in the data and make the model generate pseudolabels based on that.

#

I think the idea is to make a model to generate pseudolabels, and then train a classifier based on those pseudo-labels

charred light Feb 18, 2023, 12:38 AM

#

Dude what?

hasty mountain Feb 18, 2023, 12:38 AM

#

If you have 120 classes, you can make 120 different pseudo-labels.

charred light Feb 18, 2023, 12:38 AM

#

These are predefined dog breeds. What are you talking about?

hasty mountain Feb 18, 2023, 12:38 AM

#

charred light These are predefined dog breeds. What are you talking about?

https://lilianweng.github.io/posts/2021-12-05-semi-supervised/

https://www.sciencedirect.com/science/article/pii/S0031320323000651

Learning with not Enough Data Part 1: Semi-Supervised Learning

When facing a limited amount of labeled data for supervised learning tasks, four approaches are commonly discussed.
Pre-training + fine-tuning: Pre-train a powerful task-agnostic model on a large unsupervised data corpus, e.g. pre-training LMs on free text, or pre-training vision models on unlabelled images via self-supervised learning, and the...

MinEnt: Minimum entropy for self-supervised representation learning

Self-supervised representation learning is becoming more and more popular due to its superior performance. According to the information entropy theory…

charred light Feb 18, 2023, 12:38 AM

#

To start, they are already labeled.

hasty mountain Feb 18, 2023, 12:39 AM

#

charred light To start, they are already labeled.

It appears that training with pseudo-labels and applying a fine-tuning with labels tend to work better

hasty mountain Feb 18, 2023, 12:40 AM

#

hasty mountain https://lilianweng.github.io/posts/2021-12-05-semi-supervised/ https://www.scie...

Self-supervised representation learning is becoming more and more popular due to its superior performance. According to the information entropy theory, the smaller the information entropy of a feature, the more certain it is and the less redundant it is.

#

charred light Feb 18, 2023, 12:41 AM

#

I have labels on all my images. That's not the task here.

#

My original point is to find high-level* groupings that make sense to dog breeds.

hasty mountain Feb 18, 2023, 12:42 AM

#

charred light My original point is to find high-level* groupings that make sense to dog breeds...

You can apply unsupervised learning for N iterations and, after those iterations, you can apply a supervised fine-tuning using a small fraction of your labels

charred light Feb 18, 2023, 12:44 AM

#

Bruh, I think you're missing the point that there is NO unlabeled data AT ALL.

hasty mountain Feb 18, 2023, 12:44 AM

#

I think you're missing the point that you can simply ignore the labels for unsupervised learning, and use those labels to fine-tune your model

charred light Feb 18, 2023, 12:44 AM

#

Why would I do that?

hasty mountain Feb 18, 2023, 12:45 AM

#

Because the model might perform better

#

shipit

charred light Feb 18, 2023, 12:45 AM

#

Why would I want to sandbag myself and remove the labels? If I needed more data points, I can just do image augmentation.

hasty mountain Feb 18, 2023, 12:45 AM

#

For instance, GPT-2 had its entire dataset labeled...but they applied unsupervised learning on it, and fine-tuned it using 5% of the labels

charred light Feb 18, 2023, 12:46 AM

#

And that's not what I'm trying to do.

hasty mountain Feb 18, 2023, 12:47 AM

#

But that can help with what you're trying to do

#

py_guido

charred light Feb 18, 2023, 12:47 AM

#

???

hasty mountain Feb 18, 2023, 12:47 AM

#

hasty mountain

Take a look at this image. MNIST dataset has all its labels. However, the model was able to classify the images better when using pseudolabels

#

Rather than when directly using the labels

charred light Feb 18, 2023, 12:48 AM

#

Dude

hasty mountain Feb 18, 2023, 12:48 AM

#

On MNIST

#

MNIST is labeled. All you have to do is ignore the labels when training

charred light Feb 18, 2023, 12:49 AM

#

They are comparing:
labeled data VS labeled data + unlabeled data w/ "Pseudo-labeling".
I've already told you all my data is labeled

#

You're literally saying:
Un-label some data, and it might perform better.

No, no it won't.

hasty mountain Feb 18, 2023, 12:50 AM

#

It might, actually

charred light Feb 18, 2023, 12:50 AM

#

excuseme

hasty mountain Feb 18, 2023, 12:50 AM

#

hasty mountain

That's exactly what the image is showing

#

MNIST is a labeled dataset. The researcher trained the model with all its labels, and then trained with just a few labels

#

In the MinEnt paper:

#

The degree of information entropy helps the model classify the image

#

Images with similar degree of information entropy tend to be similar

#

(or, to be within the same class)

charred light Feb 18, 2023, 12:53 AM

#

hasty mountain That's exactly what the image is showing

Fig. 9. t-SNE visualization of outputs on MNIST test set by models training (a) without and (b) with pseudo labeling on 60000 unlabeled samples, in addition to 600 labeled data.
No it's literally not

#

It's 600 LABELED + 60k UNLABELED vs 600 LABELED + 60k PSEUDO LABELED.

#

I'm walking away from this conversation now. Sounds like trolling to me.

hasty mountain Feb 18, 2023, 12:55 AM

#

Sigh...
Okay, reject the might of unsupervised learning in neural networks

hasty mountain Feb 18, 2023, 12:57 AM

#

charred light It's 600 LABELED + 60k UNLABELED vs 600 LABELED + 60k PSEUDO LABELED.

How would someone train a neural network in unlabeled data, without using pseudolabels?

charred light Feb 18, 2023, 12:59 AM

#

hasty mountain How would someone train a neural network in unlabeled data, without using pseudo...

Not relevant, it's not what I asked <#data-science-and-ml message>
Don't ping me again.

hasty mountain Feb 18, 2023, 1:00 AM

#

Meh

#

At least give it a try, in a small, quick test

#

Use the MinEnt

junior stone Feb 18, 2023, 2:55 AM

#

I made a TV series/shows/sitcom AI video mini search engine. You can find the name of the show (episode and season) and also links to stream it from. For now it only works with Shorts; video input isn't supported yet.
Anyone want to try? I tested it with Rick and Morty shorts and Family Guy, and it's pretty good, though sometimes it returns results for other shows. But as long as the transcript/dialogue is clear and it's not an edit, it should be good 🙂 Try it out?
Where should I post the link? https://sulynajimsj-testseriesavid-main-obefxn.streamlit.app/
Try this link? https://www.youtube.com/shorts/v3IS1ikLDJQ
For some reason it works really well for Family Guy and Rick and Morty haha

Streamlit

Not found · Streamlit

This app was built in Streamlit! Check it out and visit https://streamlit.io for more awesome community apps. 🎈

YouTube

NOT FamilyGuy

Talking Gun

#

DEMO 🙂 Try it out

patent lynx Feb 18, 2023, 3:31 AM

#

junior stone I made an TV Series/Shows/Sitcom AI Video mini search engine). You can find the ...

Damn what approaches did you use?

#

Lda?

coarse elk Feb 18, 2023, 5:44 AM

#

junior stone DEMO 🙂 Try it out

how big a database is required to make this? dude, this is on a different level

manic jolt Feb 18, 2023, 8:03 AM

#

Which versions of tensorflow were compiled with AVX? Cause I cant run it with docker since my machine doesn't support it

wooden sail Feb 18, 2023, 8:05 AM

#

my understanding is that the default tf one can download does not use avx

#

what error are you getting exactly?

manic jolt Feb 18, 2023, 8:06 AM

#

When I run
sudo docker run -it --rm tensorflow/tensorflow:latest-jupyter python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
I get
The TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine.

wooden sail Feb 18, 2023, 8:06 AM

#

huh, yeah. that's the opposite of what i would've expected. 1 sec

#

all righty, here are some workarounds https://github.com/yaroslavvb/tensorflow-community-wheels/issues/209

GitHub

TensorFlow 2.8.0 No AVX, No GPU, Python 3.7, 3.8, 3.9, 3.10, Ubuntu...

Built on Ubuntu 18.04. Builds are not tested and provided as is. Example configuration for Python 3.7 and westmere: PYTHON_VERSION=python3.7 PYTHON_BIN_PATH=$(which $PYTHON_VERSION) \ PYTHON_LIB_PA...

#

a handful of non avx builds are provided there, give them a try. otherwise, you'll have to compile from source

manic jolt Feb 18, 2023, 8:10 AM

#

But can I still run it from docker?

#

Or do I have to create my own Docker image with tensorflow as base and then perform these steps?

wooden sail Feb 18, 2023, 8:13 AM

#

you would have to copy the whl into your docker image and pip install it

#

probably not using tensorflow as base

#

but you can try first using it as base. that'll save you installing blas and other libs

manic jolt Feb 18, 2023, 8:15 AM

#

Ok thanks. Because I also use the image with jupyter, so I don't have to install that too

#

```
FROM tensorflow/tensorflow:latest-jupyter

RUN pip install --ignore-installed --upgrade tensorflow-2.8.0-cp37-cp37m-linux_x86_64.whl
```
would that be enough?

wooden sail Feb 18, 2023, 8:19 AM

#

you'd have to copy the whl too

manic jolt Feb 18, 2023, 8:19 AM

#

Oh yes sry

wooden sail Feb 18, 2023, 8:19 AM

#

but otherwise, i would hope that's enough

manic jolt Feb 18, 2023, 8:19 AM

#

Ok thanks

wooden sail Feb 18, 2023, 8:24 AM

#

lemme know if it works, cuz then that github repo might be worth pinning here

manic jolt Feb 18, 2023, 8:25 AM

#

At the moment I get this during build
#0 1.592 ERROR: tensorflow-2.8.0-cp37-cp37m-linux_x86_64.whl is not a supported wheel on this platform.

wooden sail Feb 18, 2023, 8:25 AM

#

which python and gcc versions come in the tf image?

manic jolt Feb 18, 2023, 8:26 AM

#

good question

#

How can I look that up? Just bash into the container?

#

Sry if this is a stupid question, Im pretty new to docker

wooden sail Feb 18, 2023, 8:26 AM

#

yeah i think that's the easiest way

#

you can run it in interactive mode with the -it flag, i think that would give you access to its shell

manic jolt Feb 18, 2023, 8:29 AM

#

Python: Python 3.8.10
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

#

I see the problem

#

I downloaded the wrong wheel for Python 3.8 😂
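A quick way to spot this kind of mismatch, sketched in Python. The two helper names are made up, and the check is deliberately simplified: it only compares the Python tag in the wheel filename against the interpreter, while pip's real compatibility check also looks at the ABI and platform tags.

```python
import sys


def cpython_tag(version_info=sys.version_info):
    """Tag such as 'cp38' for CPython 3.8."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_matches(wheel_name, tag):
    # Wheel filenames end in {python tag}-{abi tag}-{platform tag}.whl
    parts = wheel_name[: -len(".whl")].split("-")
    return parts[-3] == tag


wheel = "tensorflow-2.8.0-cp37-cp37m-linux_x86_64.whl"
print(cpython_tag((3, 8, 10)), wheel_matches(wheel, cpython_tag((3, 8, 10))))
```

With the image's Python 3.8.10, the cp37 wheel from the build above does not match, which is consistent with the "not a supported wheel on this platform" error.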

#

But quick question @wooden sail

#

What is the GCC Compiler Option

wooden sail Feb 18, 2023, 8:32 AM

#

right, that's also what i was trying to figure out lol

#

can you try running this: echo | gcc -### -E - -march=native

manic jolt Feb 18, 2023, 8:33 AM

#

What do you need from that output? Its pretty big

wooden sail Feb 18, 2023, 8:33 AM

#

hmm

#

echo | gcc -dM -E - -march=native

#

does this give a smaller output?

manic jolt Feb 18, 2023, 8:34 AM

#

No even longer

wooden sail Feb 18, 2023, 8:34 AM

#

how about gcc -march=native -Q --help=target | grep march

manic jolt Feb 18, 2023, 8:35 AM

#

Now its smaller:

  Known valid arguments for -march= option:

wooden sail Feb 18, 2023, 8:36 AM

#

there we go

#

so you'd want the westmere python 3.8 whl

#

that has to do with the architecture of your cpu

manic jolt Feb 18, 2023, 8:37 AM

#

Ok thanks

wooden sail Feb 18, 2023, 8:45 AM

#

any luck now?

manic jolt Feb 18, 2023, 8:46 AM

#

Building finished without errors

#

But when I run it I get this error: AssertionError: Duplicate registrations for type 'experimentalOptimizer

wooden sail Feb 18, 2023, 8:49 AM

#

ok, it seems you have to remove tensorflow and its components first, then install the wheel 😩

#

but we're going in the right direction lol

manic jolt Feb 18, 2023, 8:49 AM

#

Yes i think so too lol

#

How can I remove it? is the name just tensorflow?

wooden sail Feb 18, 2023, 8:50 AM

#

keras
keras-nightly
keras-preprocessing
tensorboard
tensorflow
tb-nightly
tf-nightly

google says these are the ones that should be removed

manic jolt Feb 18, 2023, 8:50 AM

#

Ok thanks

#

Another question, in jupyter notebook, is it supposed to only display python 3 under the selection when I want to create a new notebook?

wooden sail Feb 18, 2023, 8:56 AM

#

if that is all you have installed, yes

manic jolt Feb 18, 2023, 8:57 AM

#

Should I consider using anaconda?

wooden sail Feb 18, 2023, 8:57 AM

#

this is what i get on mine, for example

manic jolt Feb 18, 2023, 8:58 AM

#

Ok yeah same, but only Python 3

wooden sail Feb 18, 2023, 8:58 AM

#

anaconda is just python + the conda package and environment manager, along with extra goodies like IDEs and jupyter

#

conda does make it very easy to install intel-optimized modules, but otherwise it makes no big difference

manic jolt Feb 18, 2023, 8:59 AM

#

Oh ok

#

Yeah its building now

#

May take some time

wooden sail Feb 18, 2023, 9:02 AM

#

awesome, lemme know how it goes

manic jolt Feb 18, 2023, 9:02 AM

#

Yes I will msg when its finished

#

IT WORKS

#

Or at least printing the version

wooden sail Feb 18, 2023, 9:08 AM

#

noice

manic jolt Feb 18, 2023, 9:10 AM

#

How can I train a model in jupyter without having jupyter opened?

wooden sail Feb 18, 2023, 9:16 AM

#

wdym by train in jupyter?

#

you can just not use jupyter 😛 that's just a special repl environment

manic jolt Feb 18, 2023, 9:17 AM

#

Yeah true

#

So just execute it in the terminal?

cobalt dawn Feb 18, 2023, 9:21 AM

#

gm

wooden sail Feb 18, 2023, 9:25 AM

#

manic jolt So just execute it in the terminal?

that's what i would say. otherwise you need to have jupyter open

manic jolt Feb 18, 2023, 9:40 AM

#

Ok thank you very much

lapis sequoia Feb 18, 2023, 10:48 AM

#

i want help

#

can some one help me

simple tapir Feb 18, 2023, 12:32 PM

#

Don't ask to ask, just ask

lapis sequoia Feb 18, 2023, 1:47 PM

#

Hi guys

#

I am trying to build some Machine Learning models for stock predictions.

#

Anyone here good with ML ?

serene scaffold Feb 18, 2023, 1:51 PM

#

lapis sequoia Anyone here good with ML ?

don't ask for an expert. ask a question. what do you need help with?

oak cosmos Feb 18, 2023, 2:53 PM

#

Lets say i have a column genres
which has a bunch of different options

#

how would i print all options that appear in that column out?

serene scaffold Feb 18, 2023, 2:54 PM

#

oak cosmos how would i print all options that appear in that column out?

so you want to print all the unique elements once? you can do print(genres.unique())
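
A minimal sketch of that suggestion, assuming the data lives in a pandas DataFrame with a column called `genres` (the sample values here are made up for illustration):

```python
import pandas as pd

# Hypothetical data; only the column name "genres" comes from the question.
df = pd.DataFrame({"genres": ["drama", "comedy", "drama", "horror", "comedy"]})

# Series.unique() returns each distinct value once, in order of first appearance.
unique_genres = df["genres"].unique()
print(unique_genres)
```

If you also want to know how often each option occurs, `df["genres"].value_counts()` is the usual companion call.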

oak cosmos Feb 18, 2023, 2:55 PM

#

serene scaffold so you want to print all the unique elements once? you can do `print(genres.uniq...

ah okay

#

ty

serene scaffold Feb 18, 2023, 2:56 PM

#

oak cosmos ah okay

no problem. throughout data science/AI, the word "unique" is used to refer to elements in a collection without duplicates.

oak cosmos Feb 18, 2023, 2:56 PM

#

serene scaffold no problem. throughout data science/AI, the word "unique" is used to refer to el...

gtk!

grizzled hill Feb 18, 2023, 2:58 PM

#

Hello guys, I have a presentation about the LSTM model next Tuesday. Can someone help me explain LSTM in a simple way to the professor, and also the dense layer?

nocturne eagle Feb 18, 2023, 3:19 PM

#

isn't that your job?

grizzled hill Feb 18, 2023, 3:30 PM

#

I know i did some research but i want to explain it in a simple way

hasty mountain Feb 18, 2023, 4:01 PM

#

grizzled hill Hello guys i have a presentstion about lstm next Tuesday about LSTM model , can ...

You could try something like this:

Consider the following operation:

5 * W = 10

A neural network is basically a mathematical model that tries to find the value of W such that the result of this multiplication is as close as possible to 10. In this case, the perfect value for W would be 2.
Not only that, but, in order for the neural network to be generalist and useful in a classification task, for example, it should find the value W such that for any value that will be multiplied by W, you can get a result that is as close as possible to the right answer.

A Dense layer does this directly by using the formula Result = Input * W, while a LSTM layer does this by trying to simulate the idea of memory

To find the best value for W, a series of derivative calculations is performed
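
The "series of derivative calculations" can be sketched as plain gradient descent on the toy problem above. This is only an illustration: the squared-error loss, the starting guess, and the learning rate are assumptions, not something stated in the explanation.

```python
# Toy version of the example above: find W such that 5 * W is close to 10.
x, target = 5.0, 10.0
w = 0.0               # arbitrary starting guess (assumption)
learning_rate = 0.01  # assumed step size, small enough to converge here

for _ in range(200):
    prediction = x * w
    # Squared-error loss: (prediction - target) ** 2
    # Its derivative with respect to w is 2 * (prediction - target) * x
    gradient = 2.0 * (prediction - target) * x
    w -= learning_rate * gradient

print(w)  # ends up very close to 2
```

Each step moves W a little in the direction that reduces the error, which is the same idea real networks apply to millions of weights at once.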

#

I don't know if this is correct ~~Edd might be triggered by this~~, but I find it an easy introductory explanation

#

Unfortunately I don't know how to explain LSTMs in an easy way and without having to also explain residual connections and vanishing gradients

tidal bough Feb 18, 2023, 4:08 PM

#

this explanation mentions nothing about nonlinearities, which are crucial for NNs to work at all. it also limits itself to only one input and output, and so doesn't explain why a dense layer is, well, called dense - each output is affected by each input.

wooden sail Feb 18, 2023, 4:09 PM

#

it also barely addressed the lstm part of it, i think it's a little short

hasty mountain Feb 18, 2023, 4:09 PM

#

Isn't the nonlinearity part like...simply adding a bias to the multiplication?
And the part of dense...I guess you could simply apply matrices, then...

wooden sail Feb 18, 2023, 4:10 PM

#

hasty mountain Isn't the nonlinearity part like...simply adding a bias to the multiplication? A...

the bias, while really affine rather than linear, can anyway be rewritten as a linear transformation in an n+1 dimensional vector space. reptile means the activation function, which is where the power of neural networks comes from
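
A small numerical sketch of that rewrite, using NumPy and made-up shapes (4 inputs, 3 outputs): appending a constant 1 to the input and the bias as an extra column of the weight matrix turns the affine map into a purely linear one.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))  # weights: 4 inputs -> 3 outputs (arbitrary sizes)
b = rng.normal(size=3)       # biases
x = rng.normal(size=4)       # one input vector

affine = W @ x + b

# Append the bias as an extra column of W and a constant 1 to the input.
W_aug = np.hstack([W, b[:, None]])  # shape (3, 5)
x_aug = np.append(x, 1.0)           # shape (5,)
linear = W_aug @ x_aug

same = np.allclose(affine, linear)
print(same)
```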

tidal bough Feb 18, 2023, 4:10 PM

#

no? the only reason NNs can approximate any function is that they have activation functions after each layer. Without them, it's easy to show that any number of linear layers without activations is the same as just having one linear layer, and so such an NN is linear.

wooden sail Feb 18, 2023, 4:10 PM

#

otherwise, all of the layers could be rewritten as a single affine transformation, no need for layers

hasty mountain Feb 18, 2023, 4:11 PM

#

wooden sail the bias, while really affine rather than linear, can anyway be rewritten as a l...

Oh... Are activation functions that important? pithink

wooden sail Feb 18, 2023, 4:11 PM

#

they are the whole reason neural networks are interesting

#

without them, it's just a simple matrix-vector mult

hasty mountain Feb 18, 2023, 4:11 PM

#

I see

wooden sail Feb 18, 2023, 4:12 PM

#

matrix multiplication is defined the way it is so that it corresponds to the application of a linear transformation in a given domain and codomain basis

#

it's also associative, meaning you can throw parentheses around the whole thing, and you end up with a single matrix

tidal bough Feb 18, 2023, 4:13 PM

#

grizzled hill Hello guys i have a presentstion about lstm next Tuesday about LSTM model , can ...

you could, say, watch 3b1b's videos on NNs and see if that way of explanation hits it better for you.

hasty mountain Feb 18, 2023, 4:13 PM

#

hasty mountain You could try something like this: ``` Consider the following operation: 5 * W...

Take this explanation and also add an explanation about activation functions py_guido 👍

#

If your presentation is in the area of physiology and it involves neuroanatomy in some degree, you could make a comparison between an activation function and a trigger zone in a neuron

wooden sail Feb 18, 2023, 4:15 PM

#

this looks pretty good tbh https://colah.github.io/posts/2015-08-Understanding-LSTMs/

queen cradle Feb 18, 2023, 4:16 PM

#

hasty mountain If your presentation is in the area of physiology and it involves neuroanatomy i...

That's a little risky to do, since there are neuromorphic computing devices that work like actual neurons, and they're very different from neural networks.

#

They're only of research interest so far, but they do exist.

tidal bough Feb 18, 2023, 4:18 PM

#

tidal bough no? the only reason NNs can appoximate *any* function is that they have activati...

(example proof: a single linear layer mapping n inputs to k outputs consists of a kxn matrix, let's call it A, and k biases, let's call them the vector a. It works like out1 = A@inp + a.
The next layer is the same - it's some matrix B and biases b. So out2 = B@out1 + b = B@(A@inp + a) + b = B@A@inp + B@a + b = (B@A)@inp + (B@a + b).
And this is equivalent to a single linear layer with a matrix B@A and biases vector B@a + b.
This can be applied to a network of any number of linear layers without activations, to conclude that such a network acts as just a single linear layer. And hence, it definitely doesn't have any good approximation properties - there's only so well you can approximate an arbitrary function by a linear one.
So neural networks without activations are uninteresting.
)
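
The proof above can be checked numerically. This sketch uses NumPy with arbitrary sizes and random values (all assumptions for illustration) and the same names `A`, `a`, `B`, `b`, `inp` as the message:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 4, 3, 2  # arbitrary layer sizes
A, a = rng.normal(size=(k, n)), rng.normal(size=k)  # first layer
B, b = rng.normal(size=(m, k)), rng.normal(size=m)  # second layer
inp = rng.normal(size=n)

# Two stacked linear layers without an activation in between...
out2 = B @ (A @ inp + a) + b
# ...versus one layer with matrix B@A and bias B@a + b.
collapsed = (B @ A) @ inp + (B @ a + b)

layers_collapse = np.allclose(out2, collapsed)
print(layers_collapse)

# With a ReLU between the layers, the single-layer rewrite no longer
# applies in general, because max(., 0) is not a linear operation.
with_relu = B @ np.maximum(A @ inp + a, 0.0) + b
print(with_relu, collapsed)
```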

hasty mountain Feb 18, 2023, 4:18 PM

#

queen cradle That's a little risky to do, since there are neuromorphic computing devices that...

Indeed, but the ReLU activation function works in quite a similar way to a trigger zone in a neuron.
In the neuron, if a stimulus is strong enough to open voltage-dependent ion channels, then an action potential is triggered. If it isn't, then no action potential is generated and no information is passed.

#

It's more or less like in a neural network: when the function input is lower than 0, the ReLU makes it become 0, so no information is passed to the next layer.
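
For reference, a minimal NumPy sketch of the ReLU behaviour described here (the input values are made up):

```python
import numpy as np

def relu(x):
    # Negative inputs become 0; positive inputs pass through unchanged.
    return np.maximum(x, 0.0)

inputs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
outputs = relu(inputs)
print(outputs)
```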

queen cradle Feb 18, 2023, 4:20 PM

#

Actual neurons end up creating spike trains, though, which is a quite different mode of operation.

hasty mountain Feb 18, 2023, 4:21 PM

#

Yes, but each spike train corresponds to a single action potential, doesn't it?

#

An action potential could be compared to a number. The higher the number of action potentials generated, the higher the number that is output from the neural network. Thus, the stimulus can be interpreted as having higher intensity(like pain, light...)

queen cradle Feb 18, 2023, 4:22 PM

#

I think they're analogous but don't correspond exactly.

#

This is well outside my area of expertise, though. Maybe there's a way to make them line up perfectly that I don't know about.

hasty mountain Feb 18, 2023, 4:23 PM

#

Exactly is a very strong term

#

I prefer just saying that they're analogous

queen cradle Feb 18, 2023, 4:25 PM

#

Sure, I agree with that.

mint palm Feb 18, 2023, 4:25 PM

#

need help, to improve my model I introduced negative samples, now in order to incorporate them, i only need to introduce them at place where loss is calculated?
or do i need a triplet loss?
is there no other way to use negative sample to utilise them?

mortal robin Feb 18, 2023, 4:27 PM

#

Most important thing to learn for data analytics? other than programming language

wooden sail Feb 18, 2023, 4:28 PM

#

math

nocturne eagle Feb 18, 2023, 4:28 PM

#

statistics

queen cradle Feb 18, 2023, 4:30 PM

#

Math, statistics, and data cleaning skills.

nocturne eagle Feb 18, 2023, 4:30 PM

#

good data cleaning requires domain knowledge

grizzled hill Feb 18, 2023, 5:15 PM

#

hasty mountain You could try something like this: ``` Consider the following operation: 5 * W...

Thx!

hasty mountain Feb 18, 2023, 5:54 PM

#

Damn guys. Thanks for the tip about activation functions. It seems my model accuracy did improve consistently after I added a strategically positioned ReLU to it. It wasn't much, around 2~3% better accuracy, but it's something.
It also seems to optimize better as I add more layers(an effect that was strangely reduced until now)

grizzled hill Feb 18, 2023, 5:56 PM

#

As a computer science student should i go deeply in mathematics of neural network ?

#

Even in my university they didnt teach us the mathematics of it

#

We just learned how to calculate entropy and basic maths of it

hasty mountain Feb 18, 2023, 5:56 PM

#

A computer science degree that doesn't teach the maths of neural networks? yert

grizzled hill Feb 18, 2023, 5:57 PM

#

Sadly yes

hasty mountain Feb 18, 2023, 5:57 PM

#

I'd say go for it

#

The math can be quite useful when making a model or debugging it

#

I think there's some books in the pins

#

And there's the 3b1b's guy videos that the folks here tend to recommend

grizzled hill Feb 18, 2023, 6:00 PM

#

hasty mountain I think there's some books in the pins

Thx i just saw them

grizzled hill Feb 18, 2023, 6:01 PM

#

hasty mountain And there's the 3b1b's guy videos that the folks here tend to recommend

Yup i checked that channel

hasty mountain Feb 18, 2023, 6:18 PM

#

hasty mountain Damn guys. Thanks for the tip about activation functions. It seems my model accu...

Okay, it wasn't just 2~3% improvement. At least on my current run, it has improved more than 200% of what it was before... And it seems I'll have to review my entire project

#

brainmon

mint palm Feb 18, 2023, 7:20 PM

#

Currently my video-retrieval model trains on cross-entropy. So, for a batch of size B containing B videos and B texts, it treats the correct pair as positive (minimising the loss between them) and all other pairs as negatives (maximising the loss).
I have now introduced some extra negative texts for each video.
So for each video I have 1 positive text, B-1 negative texts, plus a few extra negative texts that I am adding.
I want to ask: how should I modify the current loss function?
Currently it builds a matrix that has the correct pairs on the diagonal, with all other places filled by negative pairs.
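
One possible way to do this, sketched with NumPy under stated assumptions: keep the B x B similarity matrix as it is and append the extra negatives as additional columns, so the softmax for each video runs over B + K candidates while the target stays on the diagonal. The function name, the shapes, and the idea that each video has exactly K extra negatives are all hypothetical; nobody in the chat confirmed this is how the model is set up.

```python
import numpy as np

def cross_entropy_with_extra_negatives(sim, extra_sim):
    """sim: (B, B) video-to-text similarities, correct pairs on the diagonal.
    extra_sim: (B, K) similarity of each video to its K extra negative texts."""
    logits = np.concatenate([sim, extra_sim], axis=1)    # (B, B + K)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(sim.shape[0])
    # The positive for video i is still column i; extra columns are never targets.
    return -log_probs[idx, idx].mean()

rng = np.random.default_rng(0)
sim = rng.normal(size=(4, 4))    # made-up similarities for B = 4
extra = rng.normal(size=(4, 2))  # made-up similarities for K = 2 extra negatives

loss_base = cross_entropy_with_extra_negatives(sim, np.empty((4, 0)))
loss_extra = cross_entropy_with_extra_negatives(sim, extra)
print(loss_base, loss_extra)
```

With this layout no triplet loss is needed; the extra negatives only enlarge the softmax denominator. If the model also uses a text-to-video loss over the transposed matrix, that direction would stay unchanged in this sketch, since the extra texts have no matching video.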

cerulean kayak Feb 18, 2023, 7:22 PM

#

does anybody here know a thing or two about Jupyter notebook?

iron basalt Feb 18, 2023, 7:28 PM

#

hasty mountain Indeed, but the ReLU activation function works in a quite similar way than a tri...

*This is an out of date understanding of how neurons work, but it's what deep learning is still based on originally.

#

*In some cases it's still correct, but in others not.

hasty mountain Feb 18, 2023, 7:29 PM

#

iron basalt *This is an out of date understanding of how neurons work, but it's what deep le...

The biological neuron or the artificial one?

iron basalt Feb 18, 2023, 7:29 PM

#

hasty mountain The biological neuron or the artificial one?

The biological one.

#

Well, also can't say one here. There are many types.

hasty mountain Feb 18, 2023, 7:29 PM

#

Aaawn... Then my grad teacher told me an obsolete info grumpchib

iron basalt Feb 18, 2023, 7:30 PM

#

Neurons still have these kinds of action potentials above a threshold, but they can also be sensitive to a specific range, so not just above some value.

hasty mountain Feb 18, 2023, 7:31 PM

#

So...it's more like a sigmoid function than a ReLU function?

iron basalt Feb 18, 2023, 7:31 PM

#

For reference, to reproduce the behavior (input/output responses) of a single Pyramidal cell takes a CNN with approximately 5-8 hidden layers.

#

That is, a single neuron has vast computational complexity.

hasty mountain Feb 18, 2023, 7:33 PM

#

I wonder how many layers it would take to reproduce the behavior of a Broca's Area pithink

iron basalt Feb 18, 2023, 7:39 PM

#

wooden sail they are the whole reason neural networks are interesting

@hasty mountain non-linearity is what lets networks become "bendy" (not a line) like this: https://www.desmos.com/calculator/xm6x1obhry

#

Without that, not super interesting.

cerulean kayak Feb 18, 2023, 7:39 PM

#

anyways, does anyone know how to change jupyter font size in a markdown cell without using the header modifier? Because the header modifier does not allow bold text.

at me if you know

tidal bough Feb 18, 2023, 7:41 PM

#

doesn't jupyter's markdown support HTML tags?

hasty mountain Feb 18, 2023, 7:42 PM

#

iron basalt Without that, not super interesting.

Interesting... I'll have to review my models, then.
Simply adding an extra ReLU to one of my models improved its performance dramatically, so...interesting... joe_salute

iron basalt Feb 18, 2023, 7:42 PM

#

(And a real neuron has many non-linearities, which is why it can do so much)

tidal bough Feb 18, 2023, 7:44 PM

#

tidal bough doesn't jupyter's markdown support HTML tags?

yup, it does, @cerulean kayak

#

so you can do stuff like wrapping text in a `<font>` tag with a `size` attribute: bigg text

#

and allows mixing with bold like e.g. your_bold_text_here

#

(at least, works in vscode's jupyter support, haven't tried in jupyter notebooks)

iron basalt Feb 18, 2023, 7:52 PM

#

hasty mountain Aaawn... Then my grad teacher told me an obsolete info <:grumpchib:5522142571488...

It's not obsolete if they were trying to explain deep learning. Deep learning is based on 1940s neuroscience.

wooden sail Feb 18, 2023, 7:53 PM

#

hasty mountain Interesting... I'll have to review my models, then. Simply adding an extra ReLU ...

there's a paper that i can't for the life of me find again which shows that the relu is optimal in a sense, and families of splines in general are

hasty mountain Feb 18, 2023, 7:53 PM

#

iron basalt It's not obsolete if they were trying to explain deep learning. Deep learning is...

They weren't. It was a physiology class grumpchib

iron basalt Feb 18, 2023, 7:53 PM

#

hasty mountain They weren't. It was a physiology class <:grumpchib:552214257148887060>

They may also start with the more simple older version. Otherwise they could have the entire class be about just the neuron.

wooden sail Feb 18, 2023, 7:53 PM

#

it's the type of function you end up with when you do joint optimization on the task and also the activation function at the same time (under some sparsity-equivalent constraints)

hasty mountain Feb 18, 2023, 7:54 PM

#

iron basalt They may also start with the more simple older version. Otherwise they could hav...

The class was about the neuron action potential. I just made the correlation with deep learning by myself.

hasty mountain Feb 18, 2023, 7:54 PM

#

wooden sail it's the type of function you end up with when you do joint optimization on the ...

pithink

iron basalt Feb 18, 2023, 7:54 PM

#

hasty mountain The class was about the neuron action potential. I just made the correlation wit...

I meant the whole semester / course.

hasty mountain Feb 18, 2023, 7:54 PM

#

I just get troubled with the possibility of ReLU producing "dead neurons"

#

Which is why I got used to preferring PReLU lately

iron basalt Feb 18, 2023, 7:55 PM

#

They could present a new type of neuron each day and still not cover all of them.

#

(Or as some have put it, that there are so many types that assigning them classes is starting to lose meaning)

iron basalt Feb 18, 2023, 7:57 PM

#

hasty mountain I just get troubled with the possibility of ReLU producing "dead neurons"

That should not be an issue really. There will only be so many dead neurons in practice.

hasty mountain Feb 18, 2023, 7:58 PM

#

Hm... Yes, and the neural network should be able to overcome the dead neuron problem through optimization, shouldn't it? And the "dead neuron" should also help with filtering info, I suppose?

#

But idk... I'm crazy over GANs, and people tend to prefer LeakyReLU over the ReLU in Discriminators exactly to avoid dead neurons and vanishing gradients in the generator
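
To make the "dead neuron" concern concrete, here is a small NumPy sketch comparing the gradients of ReLU and LeakyReLU. The slope of 0.01 and the sample values are assumptions for illustration:

```python
import numpy as np

def relu_grad(x):
    # Gradient is 1 for positive inputs and 0 otherwise, so a unit whose
    # pre-activation is always negative receives no weight updates.
    return (x > 0).astype(float)

def leaky_relu_grad(x, negative_slope=0.01):
    # A small non-zero slope on the negative side keeps some gradient flowing.
    return np.where(x > 0, 1.0, negative_slope)

pre_activations = np.array([-3.0, -1.0, 2.0])
print(relu_grad(pre_activations))
print(leaky_relu_grad(pre_activations))
```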

iron basalt Feb 18, 2023, 8:00 PM

#

To some extent having dead neurons over time helps, because it acts as a mild sparsity factor. Either way, the benefits of ReLU are simply worth it.

hasty mountain Feb 18, 2023, 8:01 PM

#

New hyperparameter discovered: Testing ReLU or its Leaky variations

#

Except GeLU and SiLU, though. I'll leave those aberrations to OpenAI and their diffusions

iron basalt Feb 18, 2023, 8:03 PM

#

There are many optimization methods where parts of it become "dead." But that is not really an issue, only maybe if you want continual learning, but even then it's actually often better to leave it dead and add on more / have "growing" abilities.

#

(Having things come back alive again can ruin any attempt at online learning (other parts were built up assuming that it's dead, making it alive again disrupts that))

#

(Imagine you have a neuron that was optimized with some dead neurons as inputs, and then after a while those suddenly come alive again, now that neuron's trained output is ruined ("catastrophic forgetting"))

#

(As a runtime performance / memory optimization you can have a pruning pass that just deletes the dead parts)
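
A rough sketch of what such a pruning pass could look like for a two-layer ReLU network, using NumPy. The function name, the layout of the weights, and the choice to call a unit "dead" when it outputs 0 for every row of a reference batch are all assumptions; a unit that is silent on this batch could still fire on other inputs.

```python
import numpy as np

def prune_dead_units(W1, b1, W2, X):
    """Network: hidden = relu(X @ W1 + b1), out = hidden @ W2.
    Drops hidden units that output 0 for every row of the reference batch X."""
    hidden = np.maximum(X @ W1 + b1, 0.0)
    alive = hidden.max(axis=0) > 0
    return W1[:, alive], b1[alive], W2[alive, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=4)
W2 = rng.normal(size=(4, 2))
b1[1] = -1e3  # force hidden unit 1 to be dead on this batch

W1p, b1p, W2p = prune_dead_units(W1, b1, W2, X)

before = np.maximum(X @ W1 + b1, 0.0) @ W2
after = np.maximum(X @ W1p + b1p, 0.0) @ W2p
print(W1.shape, W1p.shape, np.allclose(before, after))
```

On the reference batch the outputs are unchanged, because the removed units contributed exactly zero.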

lapis sequoia Feb 18, 2023, 8:21 PM

#

hi. idk if this is related to datascience but what is output normalisation, gating, amplification and competition when it comes to things like machine learning and networks

iron basalt Feb 18, 2023, 8:28 PM

#

hasty mountain Indeed, but the ReLU activation function works in a quite similar way than a tri...

To be a bit more clear on this, the thing is that the dendrites can be selective on input ranges, so yes, if there is enough stimulus coming down the dendrites reaching the soma it will trigger the axon, that is correct (still the same as in text books (threshold)). But "stimulus" to me here means inputs to the neuron, not internally to a different part of itself, which is selective, so the simplified model in deep learning does not work.

#

(So the activation function part of perceptrons is correct (threshold), but the input to that is not (it's not linear / affine))

#

(You can imagine a chain reaction of a bunch of stuff happening in the dendrites before the action potential in the axon)

#

(So your teacher is probably not wrong, but it's much less detail than needed to model it)

cerulean kayak Feb 18, 2023, 8:37 PM

#

tidal bough and allows mixing with bold like e.g. `your_bold<...

okay so it does work:
why? because of the fact that the `<font>` tag denotes style="bold" shouldn't all the text be bold? Why are the `<b>` opening and closing tags needed? better yet why is the style part of the font tag needed?

iron basalt Feb 18, 2023, 8:40 PM

#

There are also dendrites / inputs to neurons that will never trigger an output axon spike, they can only "prime" the neuron so that the next time it would fire, it fires a bit earlier. This is part of why a single neuron can do sequence prediction.

#

(regardless of the "weights" of these inputs)

#

*This is kind of the issue with trying to discuss neurons, they are so complex, to be correct I must now also include that dendrites are also outputs (backpropagation).

#

*When I say output I usually mean the axon.

tidal bough Feb 18, 2023, 8:46 PM

#

cerulean kayak okay so it does work: why? because of the fact that the `` tag denotes `st...

better yet why is the style part of the font tag needed?
That's just how that tag works: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/font
shouldn't all the text be bold?
Oh, whoops, I accidentally left a non-working part there. Ignore the style="bold", it doesn't work. It's only the b tag that provides bolding here.

cerulean kayak Feb 18, 2023, 8:47 PM

#

tidal bough > better yet why is the style part of the font tag needed? That's just how that...

I am confused. So this is without style.

#

and this is with it

tidal bough Feb 18, 2023, 8:48 PM

#

Yeah, like I said, I left that part there by accident - it doesn't do anything as you see.

cerulean kayak Feb 18, 2023, 8:49 PM

#

tidal bough Yeah, like I said, I left that part there by accident - it doesn't do anything a...

okay so it is not "just how the tag works"?
If it sounds like Im insulting you because you messed up I am not, so please don't. I am just not getting it because I know you edited it for a reason.

tidal bough Feb 18, 2023, 8:50 PM

#

Yeah, I can't read apparently, I thought you were talking about the size part

cerulean kayak Feb 18, 2023, 8:51 PM

#

okay, thank you. You've been a huge help

hasty mountain Feb 18, 2023, 8:54 PM

#

iron basalt There are also dendrites / inputs to neurons that will never trigger an output a...

Yes, it's when their trigger zone isn't triggered, because the stimulus(the input) didn't manage to produce an ionic current intense enough to open any ionic channel. It's like a negative value being passed to ReLU

tidal bough Feb 18, 2023, 8:55 PM

#

cerulean kayak okay, thank you. You've been a huge help

I played around more and figured out how to do it via a CSS style:

<p style="font-size: 3rem">yay</p>

#

or I guess I'm supposed to use a div and not a p, but the result is the same here

winter ledge Feb 18, 2023, 8:59 PM

#

Are cv and opencv2 the same thing?

tidal bough Feb 18, 2023, 9:03 PM

#

yes, and the pypi package is called opencv-python

cerulean kayak Feb 18, 2023, 9:09 PM

#

@tidal bough btw: if I want to do a line break, if I do my text like this:

Numpy is a library for creating arrays of numbers that are more efficent than Pythonic lists.
Numpy arrays are usually created by either creating a pythonic list and converting it to a Numpy array or by using the arange method.
The below will be an example of the former:

where the text is separated by a line break it comes out like this

#

and only when I do it like this

Numpy is a library for creating arrays of numbers that are more efficent than Pythonic lists.

Numpy arrays are usually created by either creating a pythonic list and converting it to a Numpy array or by using the arange method.

The below will be an example of the former:

it'll come out with line breaks, but it looks like they skipped a line. I just want one line break, so what the heck?

please at me if you know

tidal bough Feb 18, 2023, 9:20 PM

#

this a markdown thing I believe

#

you can use \ at the end of a line for a forced linebreak

sly nymph Feb 18, 2023, 10:33 PM

#

Where can I find a database of 500-ish images of bees?

#

I dont wanna do it all by hand ;-;