#data-science-and-ml | Python | Page 135

verbal venture Jul 17, 2024, 1:57 AM

#

the training data is text + image embeddings of image clips (embedded vectors of image frames) and their respective captions. the other 2 names are video datasets

violet gull Jul 17, 2024, 1:59 AM

#

Stelercus 👉 👈 ducky_sphere

serene scaffold Jul 17, 2024, 2:29 AM

#

violet gull Stelercus 👉 👈 ducky_sphere

I appreciate that you think I'm the expert, but I am not.

violet gull Jul 17, 2024, 2:30 AM

#

but orange name

serene scaffold Jul 17, 2024, 2:30 AM

#

mods are orange
admins are tomato

violet gull Jul 17, 2024, 2:30 AM

#

but tomato

serene scaffold Jul 17, 2024, 2:31 AM

#

anyway, if the experiment setup gives the model a clear signal, and the model architecture is appropriate, the model will converge on something. it just won't necessarily be the best possible model.

violet gull Jul 17, 2024, 2:32 AM

#

serene scaffold anyway, if the experiment setup gives the model a clear signal, and the model ar...

my example proves it wont

serene scaffold Jul 17, 2024, 2:32 AM

#

what won't what?

violet gull Jul 17, 2024, 2:33 AM

#

serene scaffold what won't what?

huh

#

my example shows how convergence is impossible (clearly it's not), so my reasoning must be wrong somewhere

serene scaffold Jul 17, 2024, 2:33 AM

#

you said "it won't", which is short for "x will not y", but idk what x and y are.

violet gull Jul 17, 2024, 2:34 AM

#

serene scaffold you said "it won't", which is short for "x will not y", but idk what x and y are...

my example proves that even with a model having a clear signal and an appropriate architecture that the model will not converge on something

serene scaffold Jul 17, 2024, 2:35 AM

#

I think one of those two things might not be true. or the hyperparameters are bad (which I guess is a third condition that I didn't mention)

violet gull Jul 17, 2024, 2:35 AM

#

my example doesn't involve hyperparameters

serene scaffold Jul 17, 2024, 2:36 AM

#

well, I've never done reinforcement learning
ask me about interactive LLMs.

violet gull Jul 17, 2024, 2:36 AM

#

🤧 lemon_sentimental

autumn comet Jul 17, 2024, 2:37 AM

#

Right. I actually had validation code before (which I accidentally left out of the post here)

    import torch

    # Check if CUDA is available
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Determine number of GPUs
    if device.type == 'cuda':
        num_gpus = torch.cuda.device_count()
        print(f"There are {num_gpus} CUDA devices available.")
    else:
        num_gpus = 0
        print("No CUDA devices available.")

When I run this on my local machine, I get:

No CUDA devices available.
Initializing a new model.
Parameters to be optimized: 7041970

When I run it on a new pod (with different hardware) I get:

There are 2 CUDA devices available.
Initializing a new model.
Parameters to be optimized: 7041970

violet gull Jul 17, 2024, 2:37 AM

#

ill keep reading books until i find an answer to this

serene scaffold Jul 17, 2024, 2:38 AM

#

autumn comet Right. I actually had validation code before (which I accidentally left out of t...

okay, so make sure it's always device = torch.device('cuda'). If you don't have a GPU available, there's no point trying to continue.

In the case where you have 2 CUDA devices, that appears to be where you end up with the "tensors are on different devices" problem. I've never run an experiment on more than one GPU.

autumn comet Jul 17, 2024, 3:47 AM

#

Thanks for the reply.

Yeah, I suppose I can bail out if there is no CUDA.

But as far as the "tensors are on different devices" problem goes, I am stumped. To make it clearer: I am now calling .cuda() on everything in train.py that will accept it, but it's still not working. Also, I am now working with only one GPU, in the hope that I can get it running on one before trying to get it to run on multiple.

# ...

    train_data = torch.load("assets/output/train.pt").cuda()
    valid_data = torch.load("assets/output/valid.pt").cuda()

    # ...

    if update:
        try:
            model = torch.load("assets/models/model.pt").cuda()
            print("Loaded existing model to continue training.")
        except FileNotFoundError:
            print("No existing model found. Initializing a new model.")
            model = GPTLanguageModel(vocab_size=len(vocab)).cuda()

    else:
        print("Initializing a new model.")
        model = GPTLanguageModel(vocab_size=len(vocab)).cuda()

# ...
            train_loss = estimate_loss(model, train_data).cuda()
            valid_loss = estimate_loss(model, valid_data).cuda()

            time = current_time().cuda()
# ...

        # sample batch of data
        x_batch, y_batch = get_batch(train_data).cuda()

        # evaluate the loss
        logits, loss = model(x_batch, y_batch).cuda()
        # ...

    torch.save(model, "assets/models/model.pt").cuda()
    print("Model saved")
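For comparison with the snippet above, here is a minimal sketch of the usual single-device pattern. `TinyModel` is a hypothetical stand-in for `GPTLanguageModel`, and this `get_batch` is an assumed helper, not the one from train.py. The idea is to pick one device, move the model there once, and move each batch tensor individually. `.cuda()` and `.to()` exist on tensors and modules only, so they cannot be called on a tuple (what a two-value `get_batch` or `model(...)` returns) or on the `None` that `torch.save` returns.

```python
import torch
import torch.nn as nn

# Fall back to CPU so the sketch also runs on machines without CUDA.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyModel(nn.Module):
    """Hypothetical stand-in for GPTLanguageModel."""

    def __init__(self, vocab_size: int, dim: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x, y=None):
        logits = self.head(self.embed(x))  # (batch, time, vocab)
        loss = None
        if y is not None:
            loss = nn.functional.cross_entropy(
                logits.view(-1, logits.size(-1)), y.view(-1)
            )
        return logits, loss


def get_batch(data, block_size=8, batch_size=4):
    """Assumed helper: sample (input, target) windows from a 1-D token tensor."""
    starts = torch.randint(len(data) - block_size - 1, (batch_size,)).tolist()
    x = torch.stack([data[s : s + block_size] for s in starts])
    y = torch.stack([data[s + 1 : s + block_size + 1] for s in starts])
    # Move each tensor separately; the returned tuple itself has no .to()/.cuda().
    return x.to(device), y.to(device)


vocab_size = 50
train_data = torch.randint(vocab_size, (1000,))  # stays on CPU until batched
model = TinyModel(vocab_size).to(device)         # parameters moved once

x_batch, y_batch = get_batch(train_data)
logits, loss = model(x_batch, y_batch)           # outputs already live on `device`
```

If a saved model or dataset was written during a GPU run, `torch.load(path, map_location=device)` places it on the chosen device at load time.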

digital heath Jul 17, 2024, 6:32 AM

#

Hello

#

How are you

#

I need help guys

frosty fulcrum Jul 17, 2024, 6:39 AM

#

Does anyone know how I can normalize the points so I can use the mask with Roboflow?

frosty fulcrum Jul 17, 2024, 7:22 AM

#

fixed

past meteor Jul 17, 2024, 8:20 AM

#

Because off-policy algorithms can update a target policy that differs from the behaviour policy. Q-learning, which Deep Q-learning is built on, is off-policy in this sense.

#

I'd check out the paper I sent you

wild loom Jul 17, 2024, 8:23 AM

#

Hey guys, I've been training an AI COCO model on image detection lately in Google Colab. I was wondering if anyone had a link or two explaining how I can download the model I've trained, so that I can import it into a new file and just plug in an image to be detected, rather than re-running the entire model on Colab every time I restart my PC.

lapis sequoia Jul 17, 2024, 9:04 AM

#

just download the weights/checkpoints, and the configuration file @wild loom

#

you can run !find . -type f -name *.{ckpt,pth} i think

#

otherwise maybe this !find . -type f \( -name "*.ckpt" -o -name "*.pth" \)

#

those are the common ones, but it may be .keras etc., depending on the framework and format.

wild loom Jul 17, 2024, 9:21 AM

#

okay thank you for your help I'll try that out

remote stream Jul 17, 2024, 9:30 AM

#

Guys which field or topic of machine learning should i focus more

#

because i think that i understand most of the supervised learning topics and their codes are pretty similar

#

same goes for unsupervised

deep sleet Jul 17, 2024, 10:06 AM

#

is making a custom loss function good in certain cases?

mild dirge Jul 17, 2024, 10:09 AM

#

deep sleet is making a custom loss function good in certain cases?

Yeah, it can be useful if you want to penalize certain behaviour of your model

#

For example, increasing the loss for underrepresented classes and decreasing it for overrepresented classes.
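A rough sketch of that idea in plain Python: a cross-entropy in which each sample's term is scaled by a weight for its true class. The probabilities, labels, and weights are made up for illustration, and normalizing by the sum of weights is one convention among several.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log p(true class), with one weight per class."""
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


# Hypothetical data: class 1 is rare, so it gets a larger weight.
probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]]  # predicted class probabilities
labels = [0, 0, 1]                            # true classes
uniform = weighted_cross_entropy(probs, labels, [1.0, 1.0])
weighted = weighted_cross_entropy(probs, labels, [1.0, 5.0])
```

With the larger weight on class 1, the poorly predicted rare sample dominates the loss, so `weighted` comes out higher than `uniform`.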

deep sleet Jul 17, 2024, 10:12 AM

#

oh

#

Ty

unique spoke Jul 17, 2024, 10:17 AM

#

How to do semantic segmentation?

#

Is it a better option than bounding boxes for object detection

#

So basically, I am trying to make a project and I have already used the COCO dataset

#

I want to combine it with the Mapillary vistas dataset which is more specific for street objects

#

Haven't found any resources on object detection using bounding boxes, only results for semantic segmentation

deep sleet Jul 17, 2024, 10:36 AM

#

if the loss is shown to be nan what does that mean?

unique spoke Jul 17, 2024, 10:43 AM

#

nvm researched it

wooden sail Jul 17, 2024, 11:11 AM

#

deep sleet if the loss is shown to be nan what does that mean?

it means something like a division by zero or multiplication/division by infinity happened somewhere
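A small illustration of how NaN shows up and spreads, using only Python floats. Note one difference that is easy to trip over: plain Python raises `ZeroDivisionError` on `1.0 / 0.0`, whereas array libraries typically return `inf` or `nan` instead, which is why the problem tends to surface silently in a loss value.

```python
import math

inf = float("inf")

# Operations with no well-defined result produce NaN.
a = inf - inf
b = inf * 0.0
c = inf / inf

# NaN then propagates through every later computation.
loss = 0.5 + a

# A common trigger: the log of a probability that underflowed to 0 is -inf,
# and multiplying that by a zero label gives NaN.
log_p = -inf          # stands in for log(0.0)
term = 0.0 * log_p
```

Checking intermediate values with `math.isnan` (or the array library's equivalent) is a simple way to find the first step where things go wrong.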

deep sleet Jul 17, 2024, 11:13 AM

#

wooden sail it means something like a division by zero or multiplication/division by infinit...

oh ol

unique spoke Jul 17, 2024, 11:47 AM

#

Can you guys suggest some datasets I can train a model on, or a preexisting model I can use, for object detection (NOT semantic segmentation) of objects on the street? I'm already using the COCO dataset and want to combine it with another.

hollow sentinel Jul 17, 2024, 11:57 AM

#

unique spoke Can you guys suggest some datasets which I can train a model on or use a preexis...

kaggle? google datasets?

unique spoke Jul 17, 2024, 12:35 PM

#

hollow sentinel kaggle? google datasets?

decided to use open images

#

Any tutorials you guys can suggest or any links to code a custom object detector - Specifically for tensorflow. Like Object Detection API (A good tutorial which you find useful for this would be much appreciated)

#

Also if I already know tensorflow , should I still learn pytorch?

#

YoloV8 and others require pytorch

lapis sequoia Jul 17, 2024, 12:53 PM

#

A question about TorchRL:

    checkpoint = torch.load('ppo_model.pth')
    actor_net.load_state_dict(checkpoint['actor_net_state_dict'])
    value_net.load_state_dict(checkpoint['value_net_state_dict'])
    ProbabilisticActor_policy_module.load_state_dict(checkpoint['probabilistic_actor_state_dict'])
    scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
    Adam_optimizer_used.load_state_dict(checkpoint['optimizer_state_dict'])
    GAE_advantage_module.load_state_dict(checkpoint['gae_state_dict'])
    maximum_average_reward = checkpoint['maximum_reward_tensor']

For some reason the average reward decreases after I load the model. I saved the states of Actor Network, Value Network, ProbabilisticActor state, Cosine Annealing Learning Rate Scheduler state, Adam optimiser state, Generalised Advantage Estimator State and maximum average reward number.

Do I need to save

sub-batch ReplayBuffer state
SyncDataCollector experience collector class state
ClipPPOLoss class state
Do I need to save the model before .step() method of my Cosine Annealing LR Scheduler or after?
https://discord.com/channels/267624335836053506/1263119686867091487

oblique isle Jul 17, 2024, 2:11 PM

#

Hello guys, hope you're good. I have a tiny problem: I have a zip that contains well-structured Python code files (a CTGAN model), and I want to expose it through an API so I can use it in a desktop app. Where should I deploy the code first?

violet gull Jul 17, 2024, 2:24 PM

#

violet gull ill keep reading books until i find an answer to this

Book says I’m correct

#

Stochastic approximation theory proves that if the random exploration chance doesn’t decay to zero convergence is impossible

deep sleet Jul 17, 2024, 2:43 PM

#

So most cost functions allow us to reach local minima, and which local minimum you end up in depends on the random weights and biases that were set initially. Isn't that inefficient? You are missing out on much better minima, so maybe something like trying several random initializations would give a higher chance of reaching a lower local minimum.

lapis sequoia Jul 17, 2024, 2:46 PM

#

violet gull Stochastic approximation theory proves that if the random exploration chance doe...

So what if your exploration coefficient (entropy) = 0% until you are stuck at the same reward too long, at which point exploration chance will start increasing until a better reward is reached?

mild dirge Jul 17, 2024, 2:51 PM

#

deep sleet So most cost functions allow us to reach local minima and which local minima you...

This is something that is commonly done yeah.

#

But if the search space is so big, you can't find every local minimum (and thus finding the global minimum is almost never possible)

#

But it's great that you came to these conclusions by yourself ok_handbutflipped

deep sleet Jul 17, 2024, 2:53 PM

#

mild dirge This is something that is commonly done yeah.

oh that makes more sense

#

Thx man

deep sleet Jul 17, 2024, 2:54 PM

#

mild dirge But if the search space is so big, you can't find every local minimum (and thus ...

Yeah he mentioned that

deep sleet Jul 17, 2024, 2:54 PM

#

mild dirge But it's great that you come to these conclusion by yourself <:ok_handbutflipped...

❤️

violet gull Jul 17, 2024, 2:56 PM

#

lapis sequoia So what if your exploration coefficient (entropy) = 0% until you are stuck at th...

I’ve never heard of that but maybe

lapis sequoia Jul 17, 2024, 2:57 PM

#

violet gull I’ve never heard of that but maybe

yeah it's called adaptive learning

cedar tusk Jul 17, 2024, 4:05 PM

#

finding local min/max is very easy. finding global min/max requires us to be able to either try all numbers or differentiate the function itself.

#

the most widely used method is to have multiple different starting points so that a large area of numbers is covered
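A toy sketch of the multiple-starting-points idea on a 1-D function with two minima. The function, learning rate, and start points are all assumptions chosen for illustration; the starts are fixed here for reproducibility, whereas in practice they would be drawn at random (as random initial weights are).

```python
def f(x):
    # Toy "loss" with a global minimum near x = -1.30 and a local one near x = 1.13.
    return x**4 - 3 * x**2 + x


def grad_f(x):
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


# Each start slides into whichever basin it happens to begin in.
starts = [-2.0, -1.0, 0.0, 1.0, 2.0]
results = [gradient_descent(x0) for x0 in starts]

# Keep the run that ended with the lowest loss.
best = min(results, key=f)
```

Runs that start on the right get stuck in the shallower minimum, and only comparing across restarts reveals the deeper one. Nothing here guarantees the best run found is the global minimum; it is only the best among the basins that were sampled.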

scenic parcel Jul 17, 2024, 4:26 PM

#

#

Is this a common philosophy?

violet gull Jul 17, 2024, 4:34 PM

#

Chaining? Yes

agile cobalt Jul 17, 2024, 4:42 PM

#

scenic parcel

you should absolutely never use inplace

and yes, chaining is pretty common

scenic parcel Jul 17, 2024, 4:51 PM

#

agile cobalt you should absolutely **never** use `inplace` and yes, chaining is pretty commo...

Never? Why not? I've read two articles on it now and I think it offers performance benefits for methods like drop, and fillna

#

Specifically this https://sourcery.ai/blog/pandas-inplace/

When is inplace in Pandas faster?

And when is the inplace argument misleading?

agile cobalt Jul 17, 2024, 4:54 PM

#

scenic parcel Never? Why not? I've read two articles on it now and I think it offers performan...

the gain is negligible compared to the headaches it can cause

in particular, that exact article you linked is saying that you should not use it for drop

scenic parcel Jul 17, 2024, 4:55 PM

#

agile cobalt the gain is negligible compared to the headaches it can cause in particular, th...

I think they made a mistake there, since drop is shown in their chart as being possible to do without making a copy. They then say that if that's the case, it's OK to use inplace

agile cobalt Jul 17, 2024, 4:55 PM

#

scenic parcel I think they made a mistake there, since `drop` is shown in their chart as being...

drop is not going to copy data one way or the other

scenic parcel Jul 17, 2024, 4:56 PM

#

drop is the only method that has a green checkmark in their chart, that is later listed as a method to avoid using inplace with

agile cobalt Jul 17, 2024, 4:56 PM

#

that confusion is all the more reason to just never use it
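To make the two styles concrete, a small sketch with a made-up DataFrame. It only shows the behavioural difference (chained calls return new objects, `inplace=True` calls return `None`) and says nothing about performance.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", "z"], "tmp": [0, 0, 0]})

# Chained style: each call returns a new DataFrame; `df` is untouched.
clean = (
    df.drop(columns=["tmp"])
      .fillna({"a": 0.0})
      .rename(columns={"a": "value"})
)

# inplace style: the call mutates df2 and returns None, so it cannot be chained.
df2 = df.copy()
result = df2.drop(columns=["tmp"], inplace=True)
```

Because `result` is `None`, writing `df2 = df2.drop(..., inplace=True)` silently replaces the DataFrame with `None`, which is one of the headaches mentioned above.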

desert cedar Jul 17, 2024, 5:05 PM

#

hi! for a project, i'm interested in creating a small application that would be able to read/parse through links and mark down certain info like the article name, the original language the article is in, etc. how would i be able to do this? could someone redirect me to a youtube video or an api i'd be able to use? i feel like i could possibly incorporate the usage of ai to automate this process. thanks!

cedar tusk Jul 17, 2024, 6:09 PM

#

is there anywhere i can ask R related questions? if anyone is knowledgeable i can ask by dm as well, i have a least squares question

serene scaffold Jul 17, 2024, 6:12 PM

#

cedar tusk is there anywhere i can ask R related questions? if anyone is knowledgable i can...

https://discord.gg/XjbcrrzM

lapis sequoia Jul 17, 2024, 6:26 PM

#

remote stream because i think that i understand most of the supervised learning topics and the...

depends on what you want to do. hyper graph neural nets are on the rise

lapis sequoia Jul 17, 2024, 6:29 PM

#

unique spoke Can you guys suggest some datasets which I can train a model on or use a preexis...

collect from roboflow

lapis sequoia Jul 17, 2024, 6:31 PM

#

unique spoke Also if I already know tensorflow , should I still learn pytorch?

They are similar, use whatever you want. I use keras 3 currently, it's great.

serene scaffold Jul 17, 2024, 6:33 PM

#

unique spoke Also if I already know tensorflow , should I still learn pytorch?

if you understand neural networks and python well enough, you should be able to switch between tensorflow and pytorch if the situation requires it.

#

that is: no.

serene grail Jul 17, 2024, 6:37 PM

#

lapis sequoia depends on what you want to do. hyper graph neural nets are on the rise

I have no idea what those are, I assume you need a solid understanding of graph theory to get into those?

cedar tusk Jul 17, 2024, 6:38 PM

#

serene scaffold if you understand neural networks and python well enough, you should be able to ...

honestly, there is no situation where i would require tensorflow over pytorch

serene scaffold Jul 17, 2024, 6:38 PM

#

cedar tusk honestly, there is no situation where i would require tensorflow over pytorch

I've never used tensorflow at my job.

#

it seems like the only people who use tensorflow are tutorial authors

cedar tusk Jul 17, 2024, 6:38 PM

#

serene scaffold I've never used tensorflow at my job.

rightfully so, that package is VERY overrated

lapis sequoia Jul 17, 2024, 6:40 PM

#

serene grail I have no idea what those are, I assume you need a solid understanding of graph ...

i'm just starting with it, yes, you do need some background

#

it's a bit more complex than graph neural nets

cedar tusk Jul 17, 2024, 6:41 PM

#

i hate the fact that with neural nets intuition is just out the window

lapis sequoia Jul 17, 2024, 6:42 PM

#

wdym?

cedar tusk Jul 17, 2024, 6:42 PM

#

there are neurons which take input and have different coefficients, those coefficients produce a result

#

an oversimplification

#

but even then the intuition of the researcher towards the data is never better than the neuron itself since neuron really do not explain anything

lapis sequoia Jul 17, 2024, 6:44 PM

#

yeah that's like saying that cars are like horses

cedar tusk Jul 17, 2024, 6:44 PM

#

obviously nets are made to think instead of the researchers

#

to mimic the brain

fervent shore Jul 17, 2024, 6:44 PM

#

cedar tusk rightfully so, that package is VERY overrated

yeah google has been lacking on maintaining the framework, a lot of the data processing functions especially don't work the way they're documented (bc they don't maintain the documentation either) or don't work at all

unkempt apex Jul 17, 2024, 6:45 PM

#

is it necessary to convert images to Grayscale ( for CNN )
because I am dealing with weather images!

lapis sequoia Jul 17, 2024, 6:45 PM

#

necessary no, i don't think so, but may be wrong

fervent shore Jul 17, 2024, 6:45 PM

#

unkempt apex is it necessary to convert images to Grayscale ( for CNN ) because I am dealing ...

not necessary but a 3D CNN is a little harder to implement, adding RGB channels makes it 3D while grayscale keeps it 2D

cedar tusk Jul 17, 2024, 6:46 PM

#

lapis sequoia necessary no, i don't think so, but may be wrong

a mixed cnn where one model takes the grayscale and the other the rgb values of the pixels maybe?

#

but then you would need more than 1 layer of input neurons :/

lapis sequoia Jul 17, 2024, 6:46 PM

#

an image is a 3 D tensor, you can feed that to a CNN network

unkempt apex Jul 17, 2024, 6:46 PM

#

fervent shore not necessary but a 3D CNN is a little harder to implement, adding RGB channels ...

what is this?
I am directly loading dataset using dataset!
and my first input is 3[channels] for CNN

fervent shore Jul 17, 2024, 6:46 PM

#

lapis sequoia an image is a 3 D tensor, you can feed that to a CNN network

^ and use 3D convolution

lapis sequoia Jul 17, 2024, 6:46 PM

#

well, it's 2D

#

the D in a convolution are the dimensions of the window, not of the cube

fervent shore Jul 17, 2024, 6:47 PM

#

unkempt apex what is this? I am directly loading dataset using dataset! and my first input is...

is that with converting the image to grayscale?

lapis sequoia Jul 17, 2024, 6:47 PM

#

2 D means you specify the kernel window size

fervent shore Jul 17, 2024, 6:47 PM

#

I mean wouldn't you add another dimension of corresponding kernels for the added RGB dimension?

lapis sequoia Jul 17, 2024, 6:47 PM

#

1D would still be 3D, but you specify 1 dimension

#

no, that's a bit confusing at first

#

the depth is always automatically set to the depth of the input data

#

you only specify the area for 2D, and the height for 1D convolutions.

fervent shore Jul 17, 2024, 6:51 PM

#

yeah so it would go 1D -> [x], 2D -> [x,y], 3D -> [x,y,z]
and so should the kernels, if an image has a dimensional depth of 3 then the kernel matrix would be 3D

unkempt apex Jul 17, 2024, 6:51 PM

#

okay, after the debate, just explain it to me in simple terms!

lapis sequoia Jul 17, 2024, 6:52 PM

#

3D convs you set the depth, they aren't very common id say

fervent shore Jul 17, 2024, 6:52 PM

#

oh wait nvm I see where 2D is used for RGB

#

https://www.kaggle.com/code/shivamb/3d-convolutions-understanding-use-case

3D Convolutions : Understanding + Use Case

Explore and run machine learning code with Kaggle Notebooks | Using data from 3D MNIST

#

https://discuss.pytorch.org/t/3d-vs-2d-convolution-when-using-grayscale-images/67689

PyTorch Forums

3d vs 2d convolution when using grayscale images

When we do 2d convolution with RGB images we are, actually, doing 3d convolution. For this we still use the pytorch 2d_conv layers. When we do 3d convolution of a set of RGB images, we are doing 4d convolution and can use the 3d conv layer. My question is: what is the difference, if any, between using the 3d conv layer for a set of grayscale i...

wooden sail Jul 17, 2024, 6:53 PM

#

interestingly, what stuff like pytorch will do is apply 2d convolutions separately to each layer of color, then add the results up

fervent shore Jul 17, 2024, 6:53 PM

#

yeah I saw that on the forum

wooden sail Jul 17, 2024, 6:53 PM

#

however, this turns out to be equivalent to doing a 3d convolution if you ignore all of the outputs that aren't fully overlapping
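A NumPy sketch of that equivalence for a single output channel, with random made-up data. Two caveats: the operation here is cross-correlation (no kernel flip), which is what deep-learning libraries generally call convolution, and a real layer has one such kernel per output channel plus a bias.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.normal(size=(3, 6, 6))   # (channels, height, width), e.g. RGB
kernel = rng.normal(size=(3, 3, 3))  # one 3x3 slice per input channel


def correlate2d_valid(x, k):
    """Plain 2-D cross-correlation, keeping only fully overlapping positions."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


# View 1: a separate 2-D pass per colour channel, summed over channels.
per_channel_sum = sum(correlate2d_valid(image[c], kernel[c]) for c in range(3))

# View 2: one 3-D window slid over height and width only. Its depth equals the
# channel count, so there is exactly one fully overlapping depth position.
volume = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        volume[i, j] = np.sum(image[:, i:i + 3, j:j + 3] * kernel)
```

Both views give the same 4x4 output, which is why the layer is called 2-D even though the kernel spans all channels: the "2" counts the directions the window moves in.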

lapis sequoia Jul 17, 2024, 6:53 PM

#

yes, that's what i was trying to say

fervent shore Jul 17, 2024, 6:53 PM

#

I see now

wooden sail Jul 17, 2024, 6:53 PM

#

you're free to interpret it as you like for this one

lapis sequoia Jul 17, 2024, 6:54 PM

#

but each has != weights

fervent shore Jul 17, 2024, 6:54 PM

#

true

lapis sequoia Jul 17, 2024, 6:54 PM

#

it's not the same kernel

wooden sail Jul 17, 2024, 6:54 PM

#

(you can make it have the same weights)

lapis sequoia Jul 17, 2024, 6:54 PM

#

sure, but that's not commonly the case

wooden sail Jul 17, 2024, 6:55 PM

#

in any case, the way pytorch does it by default can be written on paper both as several 2d and a single 3d convolution

#

just kind of a boring 3d conv

lapis sequoia Jul 17, 2024, 6:55 PM

#

yes, it's one of a single step, if i understand correctly?

#

like the same transition happens from 1D to 2D convolution imho

fervent shore Jul 17, 2024, 6:56 PM

#

every higher dimensional convolution is just a giant 1D convolution Pepe_Hmmm

wooden sail Jul 17, 2024, 6:57 PM

#

well, you just had to trigger the topic that made me a meme

lapis sequoia Jul 17, 2024, 6:57 PM

#

@unkempt apex so imho you'd just use conv2D to be brief. others may disagree

wooden sail Jul 17, 2024, 6:57 PM

#

N-D convolution can be represented as a multi-level block-toeplitz matrix

#

the number of dimensions and the order of unrolling the multi-way array determines whether you have blocks that are toeplitz, or toeplitz blocks

#

so in that sense yes, you can unfold multidim convolutions into an operation that looks like a 1d conv

lapis sequoia Jul 17, 2024, 7:00 PM

#

fervent shore yeah so it would go 1D -> [x], 2D -> [x,y], 3D -> [x,y,z] and so should the kern...

where is the image from? text has so many weird words lol

fervent shore Jul 17, 2024, 7:01 PM

#

its from kaggle, this article https://www.kaggle.com/code/shivamb/3d-convolutions-understanding-use-case


lapis sequoia Jul 17, 2024, 7:02 PM

#

interesting, there are several misspellings in that single par

#

but the image looks right

#

ptrblck the nvidia guy from pytorch forums, he is so clever lol

iron basalt Jul 17, 2024, 7:04 PM

#

wooden sail N-D convolution can be represented as a multi-level block-toeplitz matrix

To add to this / explain it a bit if you look it up. You can represent a ton of operations as a matrix multiplication by just setting a bunch of entries to 0 (often representing lack of connection / edge / interaction) (which then can be skipped for performance reasons). It's kind of like adding 0 to an expression. So why do this? So you can write it down in linear algebra form to analyze it.

fervent shore Jul 17, 2024, 7:04 PM

#

lapis sequoia **ptrblck** the nvidia guy from pytorch forums, he is so clever lol

yeah the amount of times his answers carried my torch projects 💀

serene grail Jul 17, 2024, 7:04 PM

#

iron basalt To add to this / explain it a bit if you look it up. You can represent a ton of ...

chocojNoted

lapis sequoia Jul 17, 2024, 7:04 PM

#

fervent shore yeah the amount of times his answers carried my torch projects 💀

nice

lapis sequoia Jul 17, 2024, 7:06 PM

#

iron basalt To add to this / explain it a bit if you look it up. You can represent a ton of ...

i'll have to read, guess i have 0s in my brain

#

i assume that if there is a zero you may skip reading the other matrix's row, or col, but not anything else

iron basalt Jul 17, 2024, 7:09 PM

#

lapis sequoia i assume that if there is a zero you may skip reading the other matrix's row, or...

You skip the reading / loading / multiply / add.

lapis sequoia Jul 17, 2024, 7:09 PM

#

yeah

iron basalt Jul 17, 2024, 7:10 PM

#

For example. If you have say two vectors: [0, 1, 0, 0] and [1, 2, 3, 4] and you want to element-wise multiply them. You can skip almost all of the work if you know the index of the 1 in the first one-hot vector.
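The same example as a tiny sketch. The assumption is that the one-hot vector is stored as just the index of its 1, which is what makes the shortcut possible.

```python
one_hot = [0, 1, 0, 0]
values = [1, 2, 3, 4]

# Dense version: four multiplications, three of them by zero.
dense = [a * b for a, b in zip(one_hot, values)]

# Sparse version: keep only the index of the 1 instead of the full vector.
hot_index = 1
sparse = [0] * len(values)
sparse[hot_index] = values[hot_index]  # one read, no multiplications
```

Both lists come out identical; the sparse path simply never touches the entries that are known to be zero.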

lapis sequoia Jul 17, 2024, 7:10 PM

#

matrix mult optimisation is quite nice

#

yea, if 1234 is a matrix, still more savings

iron basalt Jul 17, 2024, 7:11 PM

#

Yes, if your matrix is like 80% zeros, you get massive gains.

lapis sequoia Jul 17, 2024, 7:11 PM

#

but that's unlikely if those are actual weights, but i guess it's useful somehow, like you indicated..

iron basalt Jul 17, 2024, 7:11 PM

#

Not if your weights are sparse...

lapis sequoia Jul 17, 2024, 7:12 PM

#

why would they?

#

ah maybe in graphs, but only in the first layers

wooden sail Jul 17, 2024, 7:12 PM

#

if the conv kernel is comparatively small wrt the image, you immediately have humongous sparsity (and you'll notice this is super often the case in CNNs)

fervent shore Jul 17, 2024, 7:12 PM

#

iron basalt For example. If you have say two vectors: `[0, 1, 0, 0]` and `[1, 2, 3, 4]` and ...

so something along the lines of a piecewise?
{ 0 if V1_n == 0 or V2_n == 0
{ V1_n * V2_n if V1_n != 0 and V2_n != 0

iron basalt Jul 17, 2024, 7:12 PM

#

wooden sail if the conv kernel is comparatively small wrt the image, you immediately have hu...

And it gets even better, as it's shared weights.

lapis sequoia Jul 17, 2024, 7:13 PM

#

wooden sail if the conv kernel is comparatively small wrt the image, you immediately have hu...

but are you assuming some of the entries in the kernel are 0s?

iron basalt Jul 17, 2024, 7:13 PM

#

lapis sequoia ah maybe in graphs, but only in the first layers

As you may have guessed, graph problems usually don't have everything connected to everything, often the opposite, they are sparse.

wooden sail Jul 17, 2024, 7:13 PM

#

lapis sequoia but are you assuming some of the entries in the kernel are 0s?

they don't need to be

#

the linear transformation acting on the image is the same size as the image, and everything outside the kernel size is automatically 0

iron basalt Jul 17, 2024, 7:14 PM

#

I like to show this video to get across the idea, it's well made: https://www.youtube.com/watch?v=0fHkKcy0x_U

YouTube

Physics for the Birds

Solving the "Lights Out" Problem

Ever run into this funny little puzzle? It appears in Legend of Zelda: Link's Awakening, LEGO Star Wars: The Skywalker Saga, and in a 1995 electronic toy called Lights Out. It turns out that this game has some pretty rich math. In this video, we'll learn about modular arithmetic and the matrix inverse. We'll also learn about substitution ciphers...


lapis sequoia Jul 17, 2024, 7:14 PM

#

yeah i had to write some graph parsing a while ago

#

actually, that happens in vector encoding of characters

iron basalt Jul 17, 2024, 7:15 PM

#

iron basalt I like to show this video to get across the idea, it's well made: https://www.yo...

(this also is an example of why you want to have it (the problem) in linear algebra form)

lapis sequoia Jul 17, 2024, 7:16 PM

#

wooden sail the linear transformation acting on the image is the same size as the image, and...

uhm..i don't think i get that but thanks for trying to explain

wooden sail Jul 17, 2024, 7:17 PM

#

maybe i can cook something up. in the 1D case, imagine we have a vector of length 15 and we want to convolve it with a convolution kernel [1,2,1]

#

the matricized transformation would look like this

lapis sequoia Jul 17, 2024, 7:19 PM

#

gonna take some time 🙂

wooden sail Jul 17, 2024, 7:22 PM

#

!e

import numpy as np
import scipy.linalg as slin
import matplotlib.pyplot as plt

N = 15
kernel = np.zeros(2*N - 1)
kernel[13] = 1
kernel[14] = 2
kernel[15] = 1

M = slin.toeplitz(kernel[N-1:], np.flipud(kernel[:N]))
plt.imshow(M)
plt.savefig("biggest_oof.png")

#

ugh

#

from my terminal

#

you get a nice sparse, toeplitz matrix representing the convolution
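As a quick check of that claim, the sketch below rebuilds the same banded matrix for the kernel [1, 2, 1] with `np.eye` (instead of `scipy.linalg.toeplitz`) and compares the matrix product against `np.convolve`. The test vector is arbitrary.

```python
import numpy as np

N = 15
kernel = np.array([1.0, 2.0, 1.0])

# Banded Toeplitz matrix: 2 on the main diagonal, 1 on both neighbouring diagonals.
M = 2 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)

x = np.arange(N, dtype=float)
via_matrix = M @ x
via_convolve = np.convolve(x, kernel, mode="same")

# Fraction of entries in M that are exactly zero.
sparsity = np.mean(M == 0)
```

The two results agree, and only 43 of the 225 entries of `M` are nonzero. For a fixed kernel size the number of nonzeros grows linearly with `N` while the matrix grows quadratically, so the sparsity only increases for longer signals.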

unkempt apex Jul 17, 2024, 7:54 PM

#

what should be count of epochs?
for training CNN?

#

I have dataset with approx. 1200 images which are divided into 4 classes

lapis sequoia Jul 17, 2024, 8:05 PM

#

put many (say 200) and use the "early stopping callback" (search for it) @unkempt apex

#

with patience
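The logic behind early stopping with patience, sketched framework-free. The function name and the validation losses are hypothetical; in Keras the ready-made equivalent is the `EarlyStopping` callback with its `patience` argument.

```python
def train_with_early_stopping(val_losses, patience=3, max_epochs=200):
    """Return (stopping epoch, best epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, val_loss in enumerate(val_losses[:max_epochs]):
        if val_loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch = val_loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return min(len(val_losses), max_epochs) - 1, best_epoch


# Hypothetical validation curve: improves, then starts to overfit.
losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.52, 0.6, 0.7, 0.9]
stopped_at, best_epoch = train_with_early_stopping(losses, patience=3)
```

Here training stops at epoch 6, three epochs after the best loss at epoch 3, so the large epoch count acts only as an upper bound. Restoring the weights from the best epoch is usually done alongside this.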

unkempt apex Jul 17, 2024, 8:25 PM

#

lapis sequoia put many (say 200) and use the "early stop callback" (i.e search about it.) <@84...

200?, ohh I was literally only training on 10 epochs and analyzing output

#

earlystopping is good!

lapis sequoia Jul 17, 2024, 8:38 PM

#

great 🙂

violet gull Jul 17, 2024, 9:04 PM

#

In RL if I have an agent with the pure goal of staying alive and is rewarded after every day it’s still alive. Is there an analytical difference in making the new reward per day constant (1, 1, 1) vs increasing (1, 2, 3) ?

small wedge Jul 17, 2024, 9:18 PM

#

violet gull In RL if I have an agent with the pure goal of staying alive and is rewarded aft...

do agent scores get reset each day?

violet gull Jul 17, 2024, 9:21 PM

#

small wedge do agent scores get reset each day?

No they get reset on death

#

End of episode

small wedge Jul 17, 2024, 9:21 PM

#

regardless of the answer to that, there is one difference depending on your implementation. Say you are rewarding agents x score per day survived and 5 score for getting berries, if x increases proportionally, that will make berries less impactful as a source of reward and thus change what the "optimal policies" are for that task during the training.

violet gull Jul 17, 2024, 9:21 PM

#

Book says rewarding for getting berries is wrong

small wedge Jul 17, 2024, 9:21 PM

#

idk what you're actually doing I'm just giving a hypothetical

violet gull Jul 17, 2024, 9:22 PM

#

I am only rewarding for days survived

small wedge Jul 17, 2024, 9:22 PM

#

what kind of RL are you talking about here, q learning?

violet gull Jul 17, 2024, 9:22 PM

#

Deep q

small wedge Jul 17, 2024, 9:23 PM

#

if days alive is the only source of reward and death setting the agent score to 0 for the rest of the sim is the only punishment, I can't think of any analytical reason that increasing the reward over time would change the training compared to keeping it constant

#

but shrug maybe there is one idk
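One place where a difference can show up is discounting. The sketch below is only an illustration: the discount factor `gamma = 0.9` and the helper names are assumptions, not something from the chat. Both schemes still rank "survive longer" higher, but the returns that a deep Q-network would regress on have a different scale (bounded by 1/(1-gamma) for constant rewards versus 1/(1-gamma)^2 for the 1, 2, 3 scheme).

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one episode."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def constant_rewards(days):
    # Reward of 1 for every day survived: 1, 1, 1, ...
    return [1.0] * days


def increasing_rewards(days):
    # Reward grows with the day index: 1, 2, 3, ...
    return [float(d) for d in range(1, days + 1)]


gamma = 0.9  # assumed discount factor, not given in the chat
for days in (5, 10, 20):
    const = discounted_return(constant_rewards(days), gamma)
    incr = discounted_return(increasing_rewards(days), gamma)
    print(f"{days:>2} days  constant={const:7.3f}  increasing={incr:7.3f}")
```

Whether that scale difference matters in practice likely depends on the learning rate and any reward clipping or normalisation in the training setup.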

violet gull Jul 17, 2024, 9:24 PM

#

small wedge if days alive is the only source of reward and death setting the agent score to ...

Ty

lapis sequoia Jul 17, 2024, 9:41 PM

#

https://eurekalabs.ai/

#

there is their discord invite in the website.

hearty crown Jul 17, 2024, 9:46 PM

#

Hello everyone, can anyone tell me where to find or buy Spanish proxies?

violet gull Jul 17, 2024, 9:52 PM

#

hearty crown Hello everyone, can anyone tell me where to find or buy Spanish proxies?

eBay

hearty crown Jul 17, 2024, 9:53 PM

#

thinkmon

lapis sequoia Jul 17, 2024, 9:59 PM

#

https://www.proxynova.com/proxy-server-list/country-es/

lapis sequoia Jul 17, 2024, 10:13 PM

#

fervent shore https://discuss.pytorch.org/t/3d-vs-2d-convolution-when-using-grayscale-images/6...

sorry, i had some delay to read it, but that's a nice expl from the post:

In a 3-dimensional convolution, you would use a 4-dimensional filter, which still uses all input channel, but moves in all 3 volumetric dimensions.
The method is very similar to a 2-dimensional convolution with an additional depth dimension the filter moves along.
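The quoted description can be sketched with a naive NumPy loop. This is an illustrative single-filter version with no padding or stride (the function name and shapes are assumptions); deep learning libraries implement the same idea far more efficiently.

```python
import numpy as np


def conv3d_single(volume, kernel):
    """Valid 3D cross-correlation with one filter.

    volume: (C, D, H, W), kernel: (C, kD, kH, kW).
    The filter is 4-dimensional: it spans all input channels and slides
    along depth, height and width.
    """
    c, d, h, w = volume.shape
    kc, kd, kh, kw = kernel.shape
    assert c == kc, "the filter must cover every input channel"
    out = np.zeros((d - kd + 1, h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):          # move along depth
        for j in range(out.shape[1]):      # move along height
            for k in range(out.shape[2]):  # move along width
                patch = volume[:, i:i + kd, j:j + kh, k:k + kw]
                out[i, j, k] = np.sum(patch * kernel)
    return out


volume = np.ones((2, 4, 5, 5))  # e.g. 2 channels, 4 frames of 5x5 pixels
kernel = np.ones((2, 2, 3, 3))
result = conv3d_single(volume, kernel)
print(result.shape)
```

A 2D convolution is the same loop without the depth index.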

#

thanks !

fervent shore Jul 17, 2024, 10:14 PM

#

ah I see what its getting at with 3D convolution being used mostly for something like video convolution

lapis sequoia Jul 17, 2024, 10:16 PM

#

Could be useful for interesting problems. Video seems likely indeed @fervent shore

#

maybe chemical reactions, for example

#

ahh, the blogpost talks about drug discovery, interesting! i'm just tweaking / rewriting a net for a similar purpose.

ocean pawn Jul 17, 2024, 10:20 PM

#

I know I shouldn't ask to ask, but would anyone mind to do some sorta code review for a non-linear regression?

lapis sequoia Jul 17, 2024, 10:21 PM

#

i don't have enough knowledge, also, it may be better in the #algos-and-data-structs ? not a problem for me though

ocean pawn Jul 17, 2024, 10:22 PM

#

lapis sequoia i don't have enough knowledge, also, it may be better in the <#65040190985286455...

pithink regression is ai, right?

#

ducky_concerned

lapis sequoia Jul 17, 2024, 10:22 PM

#

I mean...barely but yeah...

#

it is normally included in most books, i guess i consider ai=dl which is unfair

ocean pawn Jul 17, 2024, 10:23 PM

#

It's fine either way, it seemed to be producing reasonable result

#

So the code is right?

lapis sequoia Jul 17, 2024, 10:23 PM

#

by non-linear you mean polynomial?

ocean pawn Jul 17, 2024, 10:23 PM

#

lapis sequoia by non-linear you mean polynomial?

Yes

lapis sequoia Jul 17, 2024, 10:24 PM

#

i may be able to read it then

ocean pawn Jul 17, 2024, 10:24 PM

#

Oh, would you mind?

lapis sequoia Jul 17, 2024, 10:24 PM

#

i wouldn't if you put it on some github ill check it out as codespace

#

would be good if it's got unittests

#

or whatever is called in python

#

that's a neat way to test it, adding some tests

ocean pawn Jul 17, 2024, 10:26 PM

#

https://github.com/sunnyayyl/machine-learning
Do note, the code is quite bad

lapis sequoia Jul 17, 2024, 10:26 PM

#

don't worry im a very mediocre coder

ocean pawn Jul 17, 2024, 10:26 PM

#

This is the first time I'm implementing these kinda of algorithm

lapis sequoia Jul 17, 2024, 10:26 PM

#

probably won't say much either

#

why did you call it non-linear though? I think in terms of linear algebra is still linear, but may be wrong

ocean pawn Jul 17, 2024, 10:27 PM

#

lapis sequoia why did you call it non-linear though? I think in terms of linear algebra is sti...

Just me being stupid

lapis sequoia Jul 17, 2024, 10:27 PM

#

oh np, i just wasn't sure

#

iirc it's a similar sol to linear regression

ocean pawn Jul 17, 2024, 10:27 PM

#

In my brain, not straight line must be non-linear

#

I'm probably wrong

lapis sequoia Jul 17, 2024, 10:28 PM

#

yeah, in terms of linear algebra it isn't non-linear, bc it's linear in terms of the coefficients

ocean pawn Jul 17, 2024, 10:28 PM

#

lapis sequoia yeah, in terms of linear algebra it isn't bc it's linear in terms of the coeffic...

Huh, so when is it not linear?

lapis sequoia Jul 17, 2024, 10:28 PM

#

you don't use x1,x2,...xn as variables but as constants.
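A small check of the "linear in the coefficients" point, as a sketch with made-up weights: the powers of x are treated as fixed features, so superposition holds in (w, b) but not in x.

```python
import numpy as np


def design_matrix(x, degree):
    """Columns are x**1, x**2, ..., x**degree (treated as fixed features)."""
    return np.stack([x ** p for p in range(1, degree + 1)], axis=1)


def predict(w, b, x):
    # Linear in (w, b): a dot product with fixed features plus a bias.
    return design_matrix(x, len(w)) @ w + b


x = np.linspace(-2.0, 2.0, 5)
w1, b1 = np.array([1.0, 2.0, 6.0]), 1.0   # illustrative coefficients
w2, b2 = np.array([0.5, -1.0, 3.0]), -2.0

# Superposition holds in the coefficients...
lhs = predict(2 * w1 + 3 * w2, 2 * b1 + 3 * b2, x)
rhs = 2 * predict(w1, b1, x) + 3 * predict(w2, b2, x)
linear_in_coefficients = np.allclose(lhs, rhs)

# ...but not in x, because of the powers.
linear_in_x = np.allclose(predict(w1, b1, 2 * x), 2 * predict(w1, b1, x))

print(linear_in_coefficients, linear_in_x)
```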

#

deep linearning is that

#

lol, deep learning

#

i'll go for a walk though, i'll eventually sit and read it

ocean pawn Jul 17, 2024, 10:29 PM

#

It's fine, I don't expect anyone to check it for me

#

It looks like it's working

lapis sequoia Jul 17, 2024, 10:30 PM

#

np, i've got nothing to do

ocean pawn Jul 17, 2024, 10:30 PM

#

The graph looks right

lapis sequoia Jul 17, 2024, 10:30 PM

#

looks perfect

#

what i found to be a problem is when you have large values and they are closely spaced, i think

#

there is some conditions where it failed (other code, not yours.)

ocean pawn Jul 17, 2024, 10:31 PM

#

Oh I can't find dataset, I'm just randomly generating value

lapis sequoia Jul 17, 2024, 10:31 PM

#

i'll link you one

ocean pawn Jul 17, 2024, 10:31 PM

#

Oh, thanks!

lapis sequoia Jul 17, 2024, 10:32 PM

#

check their datasets https://github.com/mljs/regression-polynomial

ocean pawn Jul 17, 2024, 10:32 PM

#

It's kinda sad, they're used by 1.8k, but only have 15 stars

lapis sequoia Jul 17, 2024, 10:33 PM

#

why sad?

ocean pawn Jul 17, 2024, 10:34 PM

#

Kinda funny, I suppose? Usually, I would kinda expect used project to have a considerable amount of star

#

As appreciation, I suppose

lapis sequoia Jul 17, 2024, 10:34 PM

#

oh, JS isn't like that for math

#

but will eventually get there

spare forum Jul 17, 2024, 10:35 PM

#

ocean pawn

Where are the axis title tho

ocean pawn Jul 17, 2024, 10:36 PM

#

spare forum Where are the axis title tho

ducky_sus

#

jam_cuneiform_this

#

His fault

spare forum Jul 17, 2024, 10:36 PM

#

ducky_concerned

ocean pawn Jul 17, 2024, 10:36 PM

#

Dunno what to name the axis tho, it's random data

#

@partial(jit, static_argnames="data_size")
def generate_data(key: Array, data_size: int) -> tuple[float32, float32]:
    _, subkey = random.split(key)
    x = jnp.sort(
        random.uniform(key=subkey, shape=(data_size,), minval=-500, maxval=500)
    )
    # x = jnp.arange(-200.0, 200.0, step=30.0)
    y = 2 * jnp.pow(x, 2) + 6 * jnp.pow(x, 3) + 1
    return x, y

#

x and y, I suppose

spare forum Jul 17, 2024, 10:37 PM

#

Yeah, just don't leave it blank (good habit)

ocean pawn Jul 17, 2024, 10:38 PM

#

spare forum Yeah, just don't leave it blank (good habit)

Fair enough

lapis sequoia Jul 17, 2024, 10:38 PM

#

units can be au arbitrary units

ocean pawn Jul 17, 2024, 10:38 PM

#

Imagine if I got x and y flipped, it'll be so embarrassing

ocean pawn Jul 17, 2024, 10:39 PM

#

ocean pawn

Oh and the 1e8, is that *10^8?

lapis sequoia Jul 17, 2024, 10:39 PM

#

out of curiosity, what happens if you feed a parabola lying on its side

#

like y = +- sqrt(x)

ocean pawn Jul 17, 2024, 10:40 PM

#

lapis sequoia like `y = +- sqrt(x)`

Let me see

lapis sequoia Jul 17, 2024, 10:40 PM

#

i guess it throws a straight line or smth

#

or it could fail to solve

ocean pawn Jul 17, 2024, 10:41 PM

#

Do numpy sqrt return both positive and negative?

#

Or just positive

lapis sequoia Jul 17, 2024, 10:41 PM

#

idk

#

ducky_concerned

ocean pawn Jul 17, 2024, 10:42 PM

#

lapis sequoia i guess it throws a straight line or smth

Looks like it

#

Wait

#

It's because I got nan

lapis sequoia Jul 17, 2024, 10:42 PM

#

x has to be positive..!
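For reference, a quick check of the sqrt question above using plain NumPy: `np.sqrt` returns only the non-negative root, and a negative input yields nan, which would explain the nan seen in the fit. As far as I know `jnp.sqrt` behaves the same way.

```python
import numpy as np

x = np.array([4.0, 9.0, -1.0])
with np.errstate(invalid="ignore"):  # silence the RuntimeWarning for -1
    root = np.sqrt(x)

print(root)            # principal (non-negative) roots; nan for the negative input
print(np.isnan(root))  # marks which entries are nan
```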

ocean pawn Jul 17, 2024, 10:43 PM

#

Ohhhhh

#

Woops

#

I was imagining thing

#

/j

#

Your guess is correct

#

It's a straight line

#

@partial(jit, static_argnames="data_size")
def generate_data(key: Array, data_size: int) -> tuple[float32, float32]:
    _, subkey = random.split(key)
    x = jnp.sort(random.uniform(key=subkey, shape=(data_size,), minval=0, maxval=500))
    # x = jnp.arange(-200.0, 200.0, step=30.0)
    # y = 2 * jnp.pow(x, 2) + 6 * jnp.pow(x, 3) + 1
    y1 = +jnp.sqrt(x)
    y2 = -jnp.sqrt(x)
    return x, jnp.add(y1, y2)

lapis sequoia Jul 17, 2024, 10:45 PM

#

but the points should be a parabola

ocean pawn Jul 17, 2024, 10:45 PM

#

lapis sequoia but the points should be a parabola

Just realised it

#

Weird

#

Why is y all 0

lapis sequoia Jul 17, 2024, 10:46 PM

#

why are you adding y1, y2?

#

id return x,y1,y2 maybe, in this case

ocean pawn Jul 17, 2024, 10:47 PM

#

lapis sequoia why are you adding y1, y2?

I assume sqrt(4) only return 2

#

Not -2

lapis sequoia Jul 17, 2024, 10:47 PM

#

but you get 2 points right (x,y1), (x,y2)

ocean pawn Jul 17, 2024, 10:48 PM

#

Fixed it

#

Wait

#

How do I do the other half

lapis sequoia Jul 17, 2024, 10:49 PM

#

also, you can return (x,y1), (x, -y1) isn't it?

#

no need to calc 2 sqrt

ocean pawn Jul 17, 2024, 10:50 PM

#

Oh I can just *-1

lapis sequoia Jul 17, 2024, 10:50 PM

#

yes

spare forum Jul 17, 2024, 10:50 PM

#

abs

ocean pawn Jul 17, 2024, 10:50 PM

#

I am stupid

#

I am looking for concat, not add
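The add-versus-concat mix-up can be sketched like this. Plain NumPy is used here to keep it self-contained; `jnp.concatenate` has the same name in JAX. The variable names are illustrative.

```python
import numpy as np

x = np.linspace(0.0, 500.0, 6)
y_upper = np.sqrt(x)

# Adding the branches cancels them: sqrt(x) + (-sqrt(x)) == 0 everywhere.
summed = y_upper + (-y_upper)

# Concatenating keeps both branches as separate points (x, y) and (x, -y).
x_both = np.concatenate([x, x])
y_both = np.concatenate([y_upper, -y_upper])

print(summed)
print(x_both.shape, y_both.shape)
```

Only one square root is computed; the lower branch is just its negation.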

#

Fixed

lapis sequoia Jul 17, 2024, 10:51 PM

#

nice

ocean pawn Jul 17, 2024, 10:52 PM

#

Not a perfect straight line it seemed

w: [-2.7694678e-06  5.7525358e-06 -2.8457648e-06] b: 1.0132458783118636e-07

lapis sequoia Jul 17, 2024, 10:52 PM

#

now you can set y2=0

ocean pawn Jul 17, 2024, 10:52 PM

#

Gradient descent must be going mad

lapis sequoia Jul 17, 2024, 10:52 PM

#

no, just removing it, should fit

ocean pawn Jul 17, 2024, 10:52 PM

#

lapis sequoia now you can set y2=0

What do you mean?

lapis sequoia Jul 17, 2024, 10:52 PM

#

ohh that's gradient descent?

ocean pawn Jul 17, 2024, 10:52 PM

#

lapis sequoia ohh that's gradient descent?

Yup

lapis sequoia Jul 17, 2024, 10:53 PM

#

you can solve it with a single linear algebra formula, it's got exact solution i think

ocean pawn Jul 17, 2024, 10:53 PM

#

@jit
def grad_decend(
    w: Array,
    b: float32,
    learning_rate: float32,
    x_train: Array,
    y_train: Array,
):
    w_grad = jacfwd(lambda w: cost(w, b, x_train, y_train))(w)
    b_grad = grad(cost, argnums=1)(w, b, x_train, y_train)
    temp_w = w - learning_rate * w_grad
    temp_b = b - learning_rate * b_grad
    return temp_w, temp_b, w_grad, b_grad

lapis sequoia Jul 17, 2024, 10:53 PM

#

but it's great however you do it

#

looks good

ocean pawn Jul 17, 2024, 10:54 PM

#

lapis sequoia looks good

Thanks

lapis sequoia Jul 17, 2024, 10:54 PM

#

grad_descent is i think

spare forum Jul 17, 2024, 10:54 PM

#

lapis sequoia you can solve it with a single linear algebra formula, it's got exact solution i...

It does

ocean pawn Jul 17, 2024, 10:54 PM

#

lapis sequoia grad_descent is i think

My good ol' reliable spelling (mistake)

ocean pawn Jul 17, 2024, 10:55 PM

#

lapis sequoia grad_descent is i think

Apparently even PyCharm is shouting at me for the spelling mistake

lapis sequoia Jul 17, 2024, 10:55 PM

#

makes sense

spare forum Jul 17, 2024, 10:55 PM

#

Linear and polynomial regression with least-squares error has a closed-form solution

ocean pawn Jul 17, 2024, 10:56 PM

#

Can I, a newbie to ml, implement that tho

#

Gradient descent seems to be quite simple

#

(and good enough)

lapis sequoia Jul 17, 2024, 10:56 PM

#

yes, you can, but it can be tricky with edge cases

#

check wikipedia

spare forum Jul 17, 2024, 10:56 PM

#

With numpy it should be fine

lapis sequoia Jul 17, 2024, 10:56 PM

#

you have to calc a couple of matrix transpositions and are done

#

in numpy this means X.T I think

ocean pawn Jul 17, 2024, 10:57 PM

#

What's it called?

lapis sequoia Jul 17, 2024, 10:57 PM

#

try linear regression wikipedia

#

and links to other methods

#

or polynomial regression directly...

ocean pawn Jul 17, 2024, 10:58 PM

#

Least-squares estimation

#

This?

lapis sequoia Jul 17, 2024, 10:58 PM

#

tbh, these days you may be better off doing gradient descent, but idk

spare forum Jul 17, 2024, 10:58 PM

#

ocean pawn This?

Y

ocean pawn Jul 17, 2024, 10:58 PM

#

Looks intimidating, but I'll have a look

lapis sequoia Jul 17, 2024, 10:59 PM

#

it's basic linear algebra unless you dig into it

#

not saying it's easy, but it's that

spare forum Jul 17, 2024, 10:59 PM

#

Wikipedia might have full maths details, maybe you can find simpler things

ocean pawn Jul 17, 2024, 11:00 PM

#

||Funny thing, I've only have basic math knowledge, I am only in high school||
I do surprisingly know more than what school teach

#

Oh

#

So are you getting the derivative and set it to 0 and solve?

lapis sequoia Jul 17, 2024, 11:02 PM

#

yes, that's one approach

ocean pawn Jul 17, 2024, 11:02 PM

#

lapis sequoia yes, that's one approach

How would you implement it w/o manually doing the solving?

lapis sequoia Jul 17, 2024, 11:03 PM

#

probably other users know better about the exact details, i don't remember/know that much

spare forum Jul 17, 2024, 11:04 PM

#

I've literally done this a few years ago like 4 years or smthing I have no idea if I can find it somewhere

lapis sequoia Jul 17, 2024, 11:04 PM

#

but you don't need to code the derivatives

ocean pawn Jul 17, 2024, 11:05 PM

#

That's a big if, really, I haven't even been taught how to do derivatives yet (I know some differentiation), but there's nothing stopping me from self-teaching ngl

#

I mean my school is still teaching you how to use if statement

lapis sequoia Jul 17, 2024, 11:05 PM

#

you may wait until learning basics of matrices, imho

ocean pawn Jul 17, 2024, 11:06 PM

#

lapis sequoia you may wait until learning basics of matrices, imho

I know how to do them

#

Even though it's not taught

#

I know multiplication at least

lapis sequoia Jul 17, 2024, 11:06 PM

#

that's good..

ocean pawn Jul 17, 2024, 11:06 PM

#

pithink Doesn't hurt learning something new

lapis sequoia Jul 17, 2024, 11:07 PM

#

spare forum Wikipedia might have full maths details, maybe you can find simpler things

this is a good idea @ocean pawn

ocean pawn Jul 17, 2024, 11:07 PM

#

If I can learn python when I'm 11 or something, why can't I learn ml now (cope)

ocean pawn Jul 17, 2024, 11:07 PM

#

lapis sequoia this is a good idea @ocean pawn

Any good resources?

#

Wikipedia is really intimidating

lapis sequoia Jul 17, 2024, 11:08 PM

#

uhmm...

#

i took a look but most look intimidating tbh

ocean pawn Jul 17, 2024, 11:09 PM

#

Looks like calculus is where I start off with

ocean pawn Jul 17, 2024, 11:09 PM

#

lapis sequoia i took a look but most look intimidating tbh

It's fine, I can understand most math notation

#

I hope, at least

spare forum Jul 17, 2024, 11:10 PM

#

You can fool around with libraries, implementing is another story

ocean pawn Jul 17, 2024, 11:10 PM

#

spare forum You can fool around with libraries, implementing is another story

I kinda wanna understand it

#

The reason I can never understand keras/tf

#

is because I have no idea why I am doing certain things

#

By knowing how it works

#

I actually understand what and why

#

I guess I'm just really stubborn and wanna understand it

lapis sequoia Jul 17, 2024, 11:11 PM

#

if you type"least squares matrix formula"

#

and go to images/videos

#

you will realise what of those may fit your level. try it

ocean pawn Jul 17, 2024, 11:12 PM

#

Thanks!

lapis sequoia Jul 17, 2024, 11:12 PM

#

in the end, the sol looks like this: (A.T A)^(-1) A.T b

#

or similar.
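A minimal NumPy sketch of that closed-form solution, under some assumptions: the cubic matches the chat's generated data, but the x range is narrowed to [-5, 5] (the original -500..500 range makes the matrix badly conditioned), and the bias is folded into the design matrix as a column of ones.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-5.0, 5.0, size=50)
y = 1.0 + 2.0 * x**2 + 6.0 * x**3  # noiseless cubic, like the chat's data

# Design matrix A with columns 1, x, x^2, x^3.
A = np.vander(x, N=4, increasing=True)

# Normal equations: solve (A^T A) coef = A^T y instead of forming the inverse.
coef_normal = np.linalg.solve(A.T @ A, A.T @ y)

# np.linalg.lstsq solves the same least-squares problem directly from A.
coef_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.round(coef_normal, 4))
print(np.round(coef_lstsq, 4))
```

Both should recover coefficients close to 1, 0, 2, 6. For real data, `np.linalg.lstsq` is usually the safer choice because squaring the matrix in A^T A also squares its condition number.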

ocean pawn Jul 17, 2024, 11:13 PM

#

Oh and to get the derivative for gradient descent, calculus is used, right?

lapis sequoia Jul 17, 2024, 11:14 PM

#

gradient descent isn't used here

ocean pawn Jul 17, 2024, 11:14 PM

#

lapis sequoia gradient descent isn't used here

No I meant

#

In case where gradient descent is used

#

How do they get derivative?

#

Is it calculus?

lapis sequoia Jul 17, 2024, 11:14 PM

#

yes, the derivative is calculus

ocean pawn Jul 17, 2024, 11:15 PM

#

Currently, I only understand what the derivative mean, but I don't know how to do it myself

#

So I'll see
Thanks, everyone

#

(I do wanna do it myself, cause, why not, I am thankful for Jax's autograd tho)

spare forum Jul 17, 2024, 11:15 PM

#

You only need derivatives and partial derivatives (not that much more complicated)

lapis sequoia Jul 17, 2024, 11:17 PM

#

ur welcome 🙂

ocean pawn Jul 17, 2024, 11:18 PM

#

Worse case scenario: I'll understand it in two year (I'll eventually learn them in school)

lapis sequoia Jul 17, 2024, 11:34 PM

#

certainly, i think we just didn't want to make you waste time and feel frustrated. it's important to have a good learning path.

ocean pawn Jul 17, 2024, 11:39 PM

#

lapis sequoia certainly, i think we just didn't want to make you waste time and feel frustrate...

It should be fine, I know differentiation, which should be enough to start learning about derivative

serene scaffold Jul 17, 2024, 11:52 PM

#

ocean pawn It should be fine, I know differentiation, which should be enough to start learn...

Differentiation is calculating the derivative.

ocean pawn Jul 17, 2024, 11:53 PM

#

serene scaffold Differentiation is calculating the derivative.

Oh, I know basic differentiation, but not all of the rules

#

Yeah, my sentence is stupid

lapis sequoia Jul 18, 2024, 12:21 AM

#

Nice debate at mlst https://www.youtube.com/watch?v=8LxTWIaInok

(Hotz vs Leahy)

YouTube

Machine Learning Street Talk

MLST Live: George Hotz and Connor Leahy on AI Safety

UPLOADED HQ VERSION HERE: https://www.youtube.com/watch?v=iFUmWho7fBE

#

did u guys watch it?

#

Any of you web scrape frequently?

#

i do sometimes, can't help w anything very intricate though

visual violet Jul 18, 2024, 1:05 AM

#

which text preprocessing for nlp ml tasks are yall using, given that tf.keras.preprocessing.text.Tokenizer is deprecated

lapis sequoia Jul 18, 2024, 1:05 AM

#

https://keras.io/api/keras_nlp/tokenizers/tokenizer/

Keras documentation: Tokenizer base class

#

?

visual violet Jul 18, 2024, 1:05 AM

#

wait wut. keras 3!

lapis sequoia Jul 18, 2024, 1:05 AM

#

yup

visual violet Jul 18, 2024, 1:05 AM

#

hmm the newest version

#

so prob legit

lapis sequoia Jul 18, 2024, 1:06 AM

#

tf 2.16 uses keras3 already

#

under the hood

visual violet Jul 18, 2024, 1:06 AM

#

why is https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text/Tokenizer deprecated 😦

TensorFlow

tf.keras.preprocessing.text.Tokenizer | TensorFlow v2.16.1

DEPRECATED.

#

it is still keras no?

lapis sequoia Jul 18, 2024, 1:06 AM

#

because keras is not part of tf

visual violet Jul 18, 2024, 1:06 AM

#

hmm i guess. but not sure why tf.keras

lapis sequoia Jul 18, 2024, 1:06 AM

#

it is not very well defined actually, but you'd rather do import keras

#

should work

visual violet Jul 18, 2024, 1:07 AM

#

sounds good.

lapis sequoia Jul 18, 2024, 1:07 AM

#

yeah, read the getting started

visual violet Jul 18, 2024, 1:07 AM

#

thoughts on spacy & nltk?

#

they seem to do similar things, for like tokenization at least

lapis sequoia Jul 18, 2024, 1:07 AM

#

no idea what that is sorry, maybe others know

visual violet Jul 18, 2024, 1:08 AM

#

no worries

visual violet Jul 18, 2024, 1:10 AM

#

lapis sequoia tf 2.16 uses keras3 already

just curious. how did u figure out this information

lapis sequoia Jul 18, 2024, 1:11 AM

#

actually, from here https://keras.io/getting_started/
at the bottom of the page

Keras documentation: Getting started with Keras

#

Starting with TensorFlow 2.16, doing pip install tensorflow will install Keras 3. When you have TensorFlow >= 2.16 and Keras 3, then by default from tensorflow import keras (tf.keras) will be Keras 3.

visual violet Jul 18, 2024, 1:13 AM

#

thanks!

lapis sequoia Jul 18, 2024, 1:14 AM

#

welcome !

#

i was so excited about keras 3, cuz it's multibackend, but you will know everything from that page

visual violet Jul 18, 2024, 1:15 AM

#

https://stackoverflow.com/questions/55492666/what-is-better-to-use-keras-preprocessing-tokenizer-or-nltk-tokenize

Stack Overflow

What is better to use keras.preprocessing.tokenizer or nltk.tokenize

I'm working on multiclass classification problem with Keras.
Tried to use Keras tokenize but think that nltk.tokenizer would be better solution for my problem. I did't find any article witch can de...

#

i forgot to google lol

visual violet Jul 18, 2024, 1:16 AM

#

lapis sequoia i was so excited about keras 3, cuz it's multibackend, but you will know everyth...

for real

#

altho i think it is kinda awkward

#

like i am sure tensorflow can do the same thing as pytorch and vice versa. who would bother learning both of them and using 2 for one task lol

#

i could be wrong though

lapis sequoia Jul 18, 2024, 1:18 AM

#

that's the point

#

you learn only keras

visual violet Jul 18, 2024, 1:20 AM

#

i read from somewhere that some people think keras is shit

#

cuz it is like "tensorflow wrapper"

#

i think it is more than that but i could be wrong

lapis sequoia Jul 18, 2024, 1:21 AM

#

to me it's a symbolic computation layer

#

independent of any backend, and extremely powerful

#

it's got almost as many gh stars as pytorch (i know, that's not always the best metric.)

fervent shore Jul 18, 2024, 1:27 AM

#

visual violet cuz it is like "tensorflow wrapper"

Iirc it’s the other way around

#

Tensorflow is a Keras wrapper

lapis sequoia Jul 18, 2024, 1:29 AM

#

the low level calculations are carried out by tf (or other backends now..)

#

matmul and such

obsidian mesa Jul 18, 2024, 5:21 AM

#

I see a wonderful course in udemy for all AI/ML enthusiast in a discounted price of only 9.9 USD, lesser than the udemy original deal's program price which is 13 USD. Discount coupon is only valid till 2 days with unlimited redemptions within 2 days. Sharing here as it might help anyone from this community.

https://www.udemy.com/course/certified-artificial-intelligence-developer-program/?couponCode=CAIDPROGRAMDISCOUNT

coupon code: CAIDPROGRAMDISCOUNT

unkempt apex Jul 18, 2024, 5:41 AM

#

lapis sequoia Any of you web scrape frequently?

yeah I do!

#

selenium , lxml

lapis sequoia Jul 18, 2024, 5:57 AM

#

Helllllllo

#

Is there anyone that can help me? I am learning Python currently and want to learn AI/ML for fun

https://youtube.com/playlist?list=PLnfmfrpiDBPgmrfT_04epUpubx33OU8GV&si=31V6xy_vyJ1QUuSL

Here is the playlist that I created on my own. Is it a good playlist or not? I want to learn it but have no guidance. Can anyone guide me? I am a complete beginner; I started learning Python just 2 weeks ago

YouTube

🤡🤡

#

Anyone can DM me if they want to guide me

bright garden Jul 18, 2024, 6:34 AM

#

it seems like both my validation loss (binary cross entropy) and accuracy curves are going up together, does anyone know how to interpret this?

#

I would imagine the model is overfitting since my train loss is decreasing (and accuracy is 100%)

bright garden Jul 18, 2024, 6:35 AM

#

bright garden it seems like both my validation loss (binary cross entropy) and accuracy curves...

but these curves still don't make much sense to me

lapis sequoia Jul 18, 2024, 6:43 AM

#

Are the labels correct?

#

Oh i guess binary classification

#

Hard to say without reading the code for me

bright garden Jul 18, 2024, 6:45 AM

#

lapis sequoia Are the labels correct?

Yepp the labels are correct. It's just basically whether the stock market moves up/down in the next 5 days (so I guess that should be pretty random)

lapis sequoia Jul 18, 2024, 6:45 AM

#

(cant open the images here)

#

Sounds like overfitting

bright garden Jul 18, 2024, 6:47 AM

#

lapis sequoia Sounds like overfitting

Yeah that's my guess too, pretty odd behaviour nonetheless

lapis sequoia Jul 18, 2024, 6:48 AM

#

If the data is random, it memorises the input, so acc and loss for training improve; the other 2 shouldn't, although you said validation accuracy goes up?

bright garden Jul 18, 2024, 6:48 AM

#

lapis sequoia If data is random it memorises the input, then acc and loss for training improve...

Yep, training accuracy goes to 100% and loss goes to near-zero. But that doesn't explain why validation accuracy keeps going up too 😂

lapis sequoia Jul 18, 2024, 6:49 AM

#

Sorry i meant validation acc

bright garden Jul 18, 2024, 6:49 AM

#

A better question rather is whether I should care about the validation loss going up or if I should just care about the accuracy since that's what I'm after anyway

lapis sequoia Jul 18, 2024, 6:50 AM

#

Uhmm i think you should care but am not an expert

bright garden Jul 18, 2024, 6:50 AM

#

def _step(self, batch, _, prefix: str):
        x, y = batch
        y_hat: torch.Tensor = self.forward(x).squeeze()

        loss = self.criterion(y_hat, y)
        acc = 100 * (y_hat.round() == y).float().mean()
        self.log(f"{prefix}_loss", loss)
        self.log(f"{prefix}_acc", acc)

        return loss

#

same code for both validation & training steps

lapis sequoia Jul 18, 2024, 6:50 AM

#

Is the validation dataset too small?

bright garden Jul 18, 2024, 6:51 AM

#

lapis sequoia Is the validation dataset too small?

total dataset is about 2500 rows, did a 60-40 split

#

similar results with 80-20 split too

lapis sequoia Jul 18, 2024, 6:51 AM

#

Sounds smallish for the task isnt it?

bright garden Jul 18, 2024, 6:51 AM

#

lapis sequoia Sounds smallish for the task isnt it?

It is, but unfortunately all I've got to work with

#

which explains why the training accuracy so easily goes up to 100%

lapis sequoia Jul 18, 2024, 6:52 AM

#

Maybe you can add a bit of noise

#

Idk what is the std strategy

bright garden Jul 18, 2024, 6:52 AM

#

Other answers explain well how accuracy and loss are not necessarily exactly (inversely) correlated, as loss measures a difference between raw output (float) and a class (0 or 1 in the case of binary classification), while accuracy measures the difference between thresholded output (0 or 1) and class. So if raw outputs change, loss changes but accuracy is more "resilient" as outputs need to go over/under a threshold to actually change accuracy.

https://stats.stackexchange.com/questions/282160/how-is-it-possible-that-validation-loss-is-increasing-while-validation-accuracy

Cross Validated

How is it possible that validation loss is increasing while validat...

I am training a simple neural network on the CIFAR10 dataset. After some time, validation loss started to increase, whereas validation accuracy is also increasing. The test loss and test accuracy
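The quoted explanation can be reproduced with a tiny example. The probabilities below are made up for illustration: between the two "epochs" one mistake is fixed (accuracy goes up), while the remaining wrong prediction becomes very confident (mean loss goes up).

```python
import math


def bce(probs, labels):
    """Mean binary cross-entropy for predicted probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def accuracy(probs, labels):
    """Fraction correct after thresholding at 0.5."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


labels = [1, 1, 1, 0]

# Earlier epoch: two hesitant mistakes, nothing confident.
early = [0.6, 0.6, 0.4, 0.6]
# Later epoch: one mistake fixed, but the remaining one is very confident.
late = [0.9, 0.9, 0.6, 0.99]

for name, probs in (("early", early), ("late", late)):
    print(f"{name}: loss={bce(probs, labels):.3f}  acc={accuracy(probs, labels):.2f}")
```

So rising validation loss alongside rising accuracy is consistent with a few increasingly overconfident wrong predictions, which is one common face of overfitting.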

lapis sequoia Jul 18, 2024, 6:52 AM

#

Id try to get 10x that n at least, I'd try reducing the network params as well

bright garden Jul 18, 2024, 6:53 AM

#

This makes sense too, maybe the model is just learning to predict values close to 0.5

bright garden Jul 18, 2024, 6:53 AM

#

lapis sequoia Id try to get 10x that n at least, I'd try reducing the network params as well

Oh yeah, I'll try a smaller network

lapis sequoia Jul 18, 2024, 6:53 AM

#

Yes, so that you can get an idea about the behaviour

bright garden Jul 18, 2024, 6:54 AM

#

Yeah no, similar results with smaller models too

lapis sequoia Jul 18, 2024, 6:54 AM

#

bright garden > Other answers explain well how accuracy and loss are not necessarily exactly (...

Yeah, that's a good post

bright garden Jul 18, 2024, 6:55 AM

#

Maybe it's a better strategy predicting amount of movement in the stock than classifying up/down, that might get rid of the "bad predictions getting worse" problem

lapis sequoia Jul 18, 2024, 6:56 AM

#

Possibly, or using rnns, unless it is an rnn

bright garden Jul 18, 2024, 6:56 AM

#

lapis sequoia Possibly, or using rnns, unless it is an rnn

a simple MLP for now

lapis sequoia Jul 18, 2024, 6:56 AM

#

Cause those should capture more info

#

So normally you want smth that takes context info i think

bright garden Jul 18, 2024, 6:56 AM

#

lapis sequoia Cause those should capture more info

I doubt there's enough data to train one well, but worth a shot

lapis sequoia Jul 18, 2024, 6:59 AM

#

Actually rnn would be if you need to predict a continuation of the sequence of the real values i think, so you might be fine. But i wonder whether some rnn like structure wouldn't do better.

#

Maybe smone else can help further

lusty lotus Jul 18, 2024, 7:55 AM

#

hello everyone! i hope everyone is doing well.
i am learning Bellman equations and I have some questions on different concepts on RL. I would greatly appreciate it if anyone could answer some of my questions :D

https://docs.google.com/document/d/1JlRIBYSIJKypfkLOEA0P7AOLKCUhmjquXbunD40WqhE/edit

graceful garden Jul 18, 2024, 8:26 AM

#

Hey Ive been following a book to build my own LLM and its all coded in Python, ive made a blog post about it and it would be great to get some input from this community about it. https://theclouddude.co.uk/building-my-own-llm-a-journey-into-language-models-building-a-tokenizer

As Im quite new to Python....

Jason's Blog

Building a Tokenizer for an LLM

In this blog post, I talk about how to build a tokenizer to be used for an LLM.

spare forum Jul 18, 2024, 11:27 AM

#

bright garden which explains why the training accuracy so easily goes up to 100%

That's really not a good thing

#

Need regularization, early_stopping for example etc

unkempt apex Jul 18, 2024, 12:16 PM

#

```python
early_stopper = EarlyStopper(patience = 5, min_delta = 0.01)
```
is this good?

#

but after 10 epochs the training stops
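The `EarlyStopper` class used above was not posted, so the version below is a hypothetical sketch of the usual pattern, with made-up validation losses. It shows how `patience=5, min_delta=0.01` can end training around epoch 10 even though the loss is still creeping down: improvements smaller than `min_delta` do not reset the counter.

```python
class EarlyStopper:
    """Hypothetical early stopper; the class used in the chat may differ."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        # An epoch only counts as progress if it beats the best by min_delta.
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def first_stop_epoch(losses, patience, min_delta):
    """Return the 1-based epoch at which training would stop, or None."""
    stopper = EarlyStopper(patience=patience, min_delta=min_delta)
    for epoch, loss in enumerate(losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return None


# Made-up validation losses that keep improving, but slowly after epoch 5.
losses = [0.90, 0.70, 0.60, 0.55, 0.52, 0.515, 0.514, 0.513, 0.512, 0.511]

print(first_stop_epoch(losses, patience=5, min_delta=0.01))
print(first_stop_epoch(losses, patience=5, min_delta=0.0001))
```

With the smaller `min_delta` the same loss curve never triggers a stop, so if training still halts early after lowering it, the validation loss has probably stopped improving altogether and a larger `patience` only delays the stop.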

lapis sequoia Jul 18, 2024, 12:23 PM

#

https://stackoverflow.com/questions/50284898/keras-earlystopping-which-min-delta-and-patience-to-use

Stack Overflow

Keras EarlyStopping: Which min_delta and patience to use?

I am new to deep learning and Keras and one of the improvement I try to make to my model training process is to make use of Keras's keras.callbacks.EarlyStopping callback function.

Based on the ou...

#

I'd use smaller than 0.0001 i think but post says better

unkempt apex Jul 18, 2024, 12:24 PM

#

0.0001 ?

lapis sequoia Jul 18, 2024, 12:24 PM

#

Yeah try

misty shuttle Jul 18, 2024, 12:26 PM

#

Which is better for ML/AI beginners- tensorflow or pytorch? I am open to learning both though

#

the question is basically what do i learn first

unkempt apex Jul 18, 2024, 12:26 PM

#

pytorch

misty shuttle Jul 18, 2024, 12:26 PM

#

aight ty

unkempt apex Jul 18, 2024, 12:28 PM

#

lapis sequoia Yeah try

..

#

again stopping at 14

#

goal is 100

spare forum Jul 18, 2024, 12:34 PM

#

unkempt apex pytorch

tf is absolutely good tho

unkempt apex Jul 18, 2024, 12:34 PM

#

yeah but it is predicting cloudy image, as rainy!!😂

spare forum Jul 18, 2024, 1:18 PM

#

Skill issue ducky_concerned

unkempt apex Jul 18, 2024, 1:19 PM

#

yeah!💀

spare forum Jul 18, 2024, 1:20 PM

#

Not even joking, if the problem is tf, the problem probably isn't tf ducky_concerned

#

Pytorch is good too tho

unkempt apex Jul 18, 2024, 1:22 PM

#

tf??

#

what should be dropout rate for CNN?

#

0.5 is current!

#

because the model is getting overfitted too quickly!! should I increase that?

#

ignore that!!

#

current accuracy is good enough actually, which is 93

lapis sequoia Jul 18, 2024, 1:32 PM

#

tensorflow=tf

unkempt apex Jul 18, 2024, 1:34 PM

#

I don't use that!
only pytorch!

spare forum Jul 18, 2024, 1:59 PM

#

The misunderstanding was big here lol

#

I was responding to you saying pytorch as an absolute choice just saying tf=tensorflow is also good

lunar wharf Jul 18, 2024, 2:40 PM

#

lapis sequoia tensorflow=tf

tf is tensorflow

verbal oar Jul 18, 2024, 2:45 PM

#

how can I workaround input?
maybe hardcode it?

#

i mean when deploying i dont have access to type text

#

but still i want to show to someone result

#

type name of product

#

```python
text = input("type name of product (e.g beer): ")
test = pred(text)

print("predicted label index e.g 0 - chips: ", test)

predicted = ""
if test == y[test]:
    predicted = yTxt[test]

print("do you want to add", predicted + "?")

accuracy = np.sum(y == preds) / len(y)

print("Model Accuracy = {}".format(accuracy))
```

#

it stops on input

#

no problem when running locally but then i type text to input

#

here I dont have possibility

#

here on render deployed

#

hmm maybe replace input with HTML input

#

i mean change to graphical from console

#

so then I must use e.g streamlit
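
A rough sketch of the "hardcode it" idea, for a host with no console to type into: take the product name from a command-line argument or an environment variable, and fall back to a fixed demo value. The names `get_product_name` and `PRODUCT_NAME` are made up for this sketch; nothing here is specific to Render or to the original script.

```python
import os
import sys


def get_product_name(argv=None, env=None, default="beer"):
    """Return the product name without ever blocking on input()."""
    argv = sys.argv[1:] if argv is None else argv
    env = os.environ if env is None else env
    if argv:                      # 1) first command-line argument
        return argv[0]
    if env.get("PRODUCT_NAME"):   # 2) hypothetical environment variable
        return env["PRODUCT_NAME"]
    return default                # 3) hardcoded fallback for demos


if __name__ == "__main__":
    text = get_product_name()
    print("product name:", text)  # pass `text` to pred() as before
```

Moving to a web form later (Streamlit or plain HTML) would only swap out this one function; the prediction code after it can stay as it is.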

lapis sequoia Jul 18, 2024, 4:19 PM

#

Guys what do you think of this Value Network Architecture for Inverse Double Pendulum v.4?


```python
value_net = nn.Sequential(
    nn.LazyLinear(num_cells, device=device), # num_cells = 256
    nn.Tanh(),
    nn.ReLU(),
    nn.LazyLinear(num_cells, device=device), # num_cells = 256
    nn.Tanh(),
    nn.LazyLinear(num_cells, device=device), # num_cells = 256
    nn.Tanh(),
    nn.LazyLinear(1, device=device),
)
```
Is there anything that can be improved?

ocean pawn Jul 18, 2024, 5:47 PM

#

lapis sequoia certainly, i think we just didn't want to make you waste time and feel frustrate...

Hey, thanks for your help yesterday

#

I think I managed to understand derivative

#

Seemed like power rule and chain rule is enough

#

For the time being at least

#

(I even managed to understand how to do derivative for sigmoid function)

#

I assume I'll need it for binary classification? (Right?)
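
For reference, the sigmoid derivative that falls out of the power rule and chain rule is σ'(z) = σ(z)(1 − σ(z)), and it does show up in binary classification (logistic regression). A small sketch that checks the formula against a finite difference; the helper names are just for illustration:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    # Chain rule + power rule on (1 + e^-z)^-1 gives s * (1 - s).
    s = sigmoid(z)
    return s * (1.0 - s)


def numeric_grad(f, z, h=1e-6):
    # Central finite difference, used only to check the formula.
    return (f(z + h) - f(z - h)) / (2.0 * h)


for z in (-2.0, 0.0, 3.0):
    print(z, sigmoid_grad(z), numeric_grad(sigmoid, z))
```

The two printed columns should agree to several decimal places for each `z`.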

wild coral Jul 18, 2024, 7:31 PM

#

guys , if anyone has studied deep learning, does anyone know why mini batch gradient descent is said to be more efficient than batch gradient descent? Im watching andrew ngs deep learning course 2. Without any parallelization, I would think mini batch is inherently the same if not slower than batch computing because you are purposely breaking up the already vectorized operations in favor for linearly looping over the training sample

serene scaffold Jul 18, 2024, 7:46 PM

#

wild coral guys , if anyone has studied deep learning, does anyone know why mini batch grad...

it sounds like you misunderstand what aspects of model training can be parallelized

#

I'll elaborate when I get a chance

#

in the meantime, can you tell me what you think the difference is between batch and mini-batch?

wild coral Jul 18, 2024, 7:54 PM

#

serene scaffold in the meantime, can you tell me what you think the difference is between batch ...

batch is you consider the entire parameter vector theta and compute gradients on theta

mini batch is you split up theta into subsets and compute gradient descent on those subsets

small wedge Jul 18, 2024, 7:55 PM

#

Right but you make updates after each minibatch you calculated

#

Where you are on the gradient therefore changes, so you cannot parallelize those calculations

wild coral Jul 18, 2024, 7:56 PM

#

right so i understand you can parallelize each mini-batches gradient computation, but im clarifying whether mini batch is inherently faster than batch (without parallelizing each batches gradient computation)

#

and also in what manner is it faster, in computation time, or for convergence

small wedge Jul 18, 2024, 7:57 PM

#

That was a typo, you can't parallelize minibatches

formal flume Jul 18, 2024, 7:58 PM

#

Anyone here familiar with discretization schemes in quant finance for the heston model? I am trying to implement a model but having a few issues (one really)

small wedge Jul 18, 2024, 7:58 PM

#

It's faster in convergence time because you trade the accuracy of your gradient estimate for the speed of your convergence. Say you have a 10,000 sample dataset and in batch you use every sample before taking a step. Now say in mini batch you take a step (update your model) after every 20 samples, you have made 500 updates to your model in a single epoch instead of 1.

wild coral Jul 18, 2024, 7:59 PM

#

so you are updating more frequently but each epoch of updates is slower

small wedge Jul 18, 2024, 8:00 PM

#

The amount of wait time/compute is technically more in mini batch (per epoch) than in batch yes. But the actual convergence speed is faster for mini batch.

wild coral Jul 18, 2024, 8:01 PM

#

so when you say converges faster, you mean it takes less number of epochs to converge?

small wedge Jul 18, 2024, 8:02 PM

#

That's one way to put it yes.

wild coral Jul 18, 2024, 8:32 PM

#

small wedge That's one way to put it yes.

so less epochs that are slower, is somehow faster than more epochs that are faster?

small wedge Jul 18, 2024, 8:54 PM

#

wild coral so less epochs that are slower, is somehow faster than more epochs that are fast...

"slower" and "faster" referring to the amount of compute you have to do yes, but its sort of misleading to put it that way here. In general when we talk about speed in training we mean convergence speed. In terms of convergence one mini batch epoch can be "faster" than multiple batch epochs, because the number of steps you take down the gradient in mini batch is so much more.

#

one way to think about this is an analogue to float precision. We generally use 32 bit floats as the default for machine learning models because, although you can get more precise gradient estimates using 64 bit floating points, that extra precision doesn't help us converge any faster really. mini batch GD is taking that idea to the extreme, we don't need every sample to get a gradient estimation that lets us step in the right direction; assuming our dataset is properly balanced then that handful of samples should be a good enough idea of where to go for us to safely take a step.

wild coral Jul 18, 2024, 8:58 PM

#

small wedge one way to think about this is an analogue to float precision. We generally use...

wait so in minibatch do you use every mini batch in an epoch or no?

small wedge Jul 18, 2024, 8:58 PM

#

yeah you make updates on every mini batch per epoch

wild coral Jul 18, 2024, 8:58 PM

#

ok

#

is MB faster in runtime compared to Batch GD? as in takes less time to converge

small wedge Jul 18, 2024, 9:00 PM

#

yes, mini batch gives us much faster convergence

wild coral Jul 18, 2024, 9:03 PM

#

and why is that? I feel like I am going in circles, but you are still just breaking up a larger task into smaller tasks but a lot more tasks

small wedge Jul 18, 2024, 9:03 PM

#

think of it like this

#

a batch epoch is: compute gradient estimate across all samples -> update model

#

two steps

#

a mini batch epoch is: compute gradient estimate on batch 1 -> update model -> compute gradient estimate on batch 2 -> update model -> ...

#

you are basically having lots of little "batch epochs" in a single "minibatch epoch"

#

you are basically getting the same result per 2 steps, just with mini batch you do a hell of a lot less computational work to get that result, and do it many times

wild coral Jul 18, 2024, 9:10 PM

#

small wedge you are basically getting the same result per 2 steps, just with mini batch you ...

ok let me use an analogy, if I have to take a flight from Los Angeles to New York, I could take one 6 hour flight or I could take three 2 hour flights, but the total flight time is still the same, 6 hours.

(In practice, the overhead for setting up the flight and layovers itself is actually not insigificant, so you would spend more time taking 3 flights than 1)

small wedge Jul 18, 2024, 9:12 PM

#

I don't think it's a great analogy because the speed of the flights is the same or worse when you break it up. A better analogy might be:

You have 10 seconds to get as far down a hill as you can, you can lay on your stomach and measure exactly the angle of the hill in front of you then gently take a step, or you can just take a glance and jump down the hill. Both will get you in about the same place, but you can do the second one way more times in 10 seconds than you can do the first

wild coral Jul 18, 2024, 9:16 PM

#

but in mb gradient descent, your step size is smaller than in b gradient descent

small wedge Jul 18, 2024, 9:16 PM

#

what makes you say that? pithink

wild coral Jul 18, 2024, 9:16 PM

#

thats what andrew ng said lol

#

not exactly sure why that is the case

small wedge Jul 18, 2024, 9:17 PM

#

the magnitude of your step down the gradient might be smaller, or the noise of your data might mean on average the step you take doesn't get you as close to the local/global minimum

#

but 100 mini batch steps vs 10 batch steps favors mini batch greatly

#

you will be much farther down the gradient in mb

wild coral Jul 18, 2024, 9:17 PM

#

ok wait side question, at the end of all epochs you would have an array of costs for each mnibatch, how do you reconcile that at the end for a scalar cost

small wedge Jul 18, 2024, 9:18 PM

#

no you do the updates per mini batch, you don't calculate all the costs then update at the end

#

it's the exact same mathematical algorithm as batch gd you just do it on way less samples, over and over until you run out of samples
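
A minimal sketch of that point, in plain Python so it runs anywhere: one epoch of batch GD and one epoch of mini-batch GD on the same made-up 1-D regression problem (true slope 3). Both touch every sample once, so the arithmetic per epoch is about the same; the difference is how many weight updates that arithmetic buys. The data, learning rate, and batch size of 20 are assumptions for the demo.

```python
import random


def make_data(n=1000, seed=0):
    # Synthetic data: y = 3x + noise (made up for this sketch).
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def grad(w, xs, ys):
    # d/dw of the mean squared error 1/m * sum((w*x - y)^2)
    m = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / m


def run_epoch(w, xs, ys, batch_size, lr=0.1):
    # One pass over the data; one weight update per (mini-)batch.
    updates = 0
    for start in range(0, len(xs), batch_size):
        xb = xs[start:start + batch_size]
        yb = ys[start:start + batch_size]
        w -= lr * grad(w, xb, yb)
        updates += 1
    return w, updates


xs, ys = make_data()
w_batch, n_batch = run_epoch(0.0, xs, ys, batch_size=len(xs))
w_mini, n_mini = run_epoch(0.0, xs, ys, batch_size=20)
print("batch:     ", n_batch, "update(s), w =", round(w_batch, 3))
print("mini-batch:", n_mini, "update(s), w =", round(w_mini, 3))
```

After a single epoch the mini-batch weight should sit much closer to the true slope, because it got 50 noisy steps where full batch got one precise step.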

wild coral Jul 18, 2024, 9:20 PM

#

oh right

#

then that is not parallelizable then?

small wedge Jul 18, 2024, 9:20 PM

#

it is not

wild coral Jul 18, 2024, 9:20 PM

#

between epochs?

small wedge Jul 18, 2024, 9:20 PM

#

you cannot parallelize mini batch

#

ig you could technically parallelize a single minibatch calculation across multiple processors and aggregate but I don't think anyone does that, you don't need parallelization because it's so much faster than regular GD

past meteor Jul 18, 2024, 9:25 PM

#

wild coral then that is not parallelizable then?

Yes and no. You really should view most of stochastic gradient descent as matrix multiplication. You multiply your batch, a tensor, with your weights (a matrix or a tensor), calculate the loss and subsequently do your update. Multiplying matrices is parallelizable

gray citrus Jul 18, 2024, 9:41 PM

#

spent the last 3 hours trying to figure out
why read_csv() wont read all the rows in my text file.

tried polars , tried pandas nothing worked,

pandas did work but only kinda ,
my raw data had 40k rows , but it was only reading 36k and dropping shit with no warnings or errors what so ever.

after trying a billion things setting an argument 'quoting=3' and it just fucking works,
but the lines that were being dropped had no quotes in them to begin with
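
One plausible explanation, sketched with the standard-library `csv` module as a stand-in for pandas (pandas' `quoting` argument takes the same `csv.QUOTE_*` constants, and `3` is `csv.QUOTE_NONE`): a single field that starts with a stray quote makes the parser treat everything up to the next quote character as one field, so the lines in between vanish even though they contain no quotes themselves. The data below is made up.

```python
import csv
import io

# Made-up tab-separated data: row 2 has a field that *starts* with a stray quote,
# and the next quote character only shows up two lines later.
raw = '1\tfoo\n2\t"bar\n3\tbaz\n4\tqux"\n5\tend\n'

default_rows = list(csv.reader(io.StringIO(raw), delimiter="\t"))
literal_rows = list(
    csv.reader(io.StringIO(raw), delimiter="\t", quoting=csv.QUOTE_NONE)
)

print(csv.QUOTE_NONE)          # the integer behind quoting=3
print(len(default_rows))       # rows 2-4 collapse into one quoted field
print(len(literal_rows))       # every physical line is its own row
```

If this is what happened, searching the file for lines with an odd number of `"` characters should point at the rows just before each gap.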

I wanna die

small wedge Jul 18, 2024, 9:44 PM

#

https://tenor.com/view/todd-howard-it-just-works-bethesda-this-all-just-works-gif-20598651

wild coral Jul 18, 2024, 11:16 PM

#

past meteor Yes and no. You really should view most of stochastic gradient descent as matrix...

i mean you could just compute -alpha * dtheta for each parallel process, and just add them all up after the parallel processes finish to get the final cost no?

past meteor Jul 18, 2024, 11:25 PM

#

wild coral i mean you could just compute -alpha * dtheta for each parallel process, and jus...

you could absolutely do that, but it's not worth the overhead unless your dataset is larger than memory and you're doing it distributed

#

Matrix multiplication with numpy is already multithreaded and with jax/torch you can go a step further and run it on a gpu for even more parallelism EDIT: fixed it, thanks to yo
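
A small sketch of the "compute partial results per process and add them up" idea, in plain Python with made-up numbers: the gradient of one mini-batch can be split across shards and summed, and the result matches the unsharded gradient. The list comprehension over shards stands in for real workers; what stays serial is the weight update between mini-batches.

```python
def grad_sum(w, xs, ys):
    # Un-normalised gradient of squared error for y ≈ w * x on one shard.
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


# One mini-batch of made-up data and the current weight.
xs = [0.5, -1.0, 2.0, 1.5, -0.5, 3.0]
ys = [1.4, -3.1, 6.2, 4.4, -1.6, 9.1]
w, alpha = 1.0, 0.01

# "Worker" view: split the batch into shards, compute partial sums separately.
shards = [(xs[i:i + 2], ys[i:i + 2]) for i in range(0, len(xs), 2)]
partials = [grad_sum(w, sx, sy) for sx, sy in shards]  # could run in parallel

full_grad = grad_sum(w, xs, ys) / len(xs)
sharded_grad = sum(partials) / len(xs)

# The weight update itself is a single serial step per mini-batch.
w_new = w - alpha * sharded_grad
print(full_grad, sharded_grad, w_new)
```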

hearty depot Jul 18, 2024, 11:28 PM

#

^

#

and also askshually numpy is cpu only 🤓

past meteor Jul 18, 2024, 11:29 PM

#

Ah, the second numpy is meant to be torch

wild coral Jul 18, 2024, 11:32 PM

#

past meteor Ah, the second numpy is meant to be torch

your saying that running in a for loop serially for all mini batches is not significantly slower than running mini batches in parallel?

past meteor Jul 18, 2024, 11:33 PM

#

wild coral your saying that running in a for loop serially for all mini batches is not sign...

Oh, I think I get your question now

#

You need to do it serially because you do an update of the weights after each batch

wild coral Jul 18, 2024, 11:34 PM

#

if you could imagine one is O(N) would the other not be O(N / n) where n is number of processes

wild coral Jul 18, 2024, 11:34 PM

#

past meteor You need to do it serially because you do an update of the weights after each ba...

o rite totally forgot about that lol

past meteor Jul 18, 2024, 11:35 PM

#

code it up 😄

#

shouldn't take longer than a couple of hours

#

at most

glass ridge Jul 18, 2024, 11:54 PM

#

guys, i was wondering what is the best module for me to learn and start with as a beginner, pytorch or tensorflow

serene scaffold Jul 19, 2024, 12:15 AM

#

glass ridge guys , i was wondering what is the best module fr me to learn and start with as ...

Those are both for the same thing. You shouldn't start with either. Start by learning about what "data" is in the context of data science and AI and how to manipulate it.

hearty depot Jul 19, 2024, 12:20 AM

#

glass ridge guys , i was wondering what is the best module fr me to learn and start with as ...

pytorch is more used. tensorflow is still maintained by google,
but they have shifted more towards jax

late lichen Jul 19, 2024, 12:39 AM

#

Uhm guys can you give me resources about back prop?

#

Or gradient descent?

#

Ping pls

small wedge Jul 19, 2024, 12:42 AM

#

late lichen Uhm guys can you give me resources about back prop?

Like math resources on the basic implementation?

#

https://arxiv.org/pdf/1802.01528

hearty depot Jul 19, 2024, 12:44 AM

#

late lichen Uhm guys can you give me resources about back prop?

all u really need to know is autograd for the very basics

#

which is just chain rule
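
To make "just chain rule" concrete, here is a hand-written forward and backward pass for a single sigmoid neuron with a squared-error loss, followed by plain gradient descent on its weight and bias. It is a toy under stated assumptions (one made-up training example, learning rate 0.5), not how an autograd library is implemented.

```python
import math


def forward_backward(w, b, x, y):
    # Forward pass: z = w*x + b, a = sigmoid(z), loss = (a - y)^2
    z = w * x + b
    a = 1.0 / (1.0 + math.exp(-z))
    loss = (a - y) ** 2
    # Backward pass: multiply local derivatives along the chain.
    dloss_da = 2.0 * (a - y)
    da_dz = a * (1.0 - a)
    dloss_dz = dloss_da * da_dz
    return loss, dloss_dz * x, dloss_dz  # loss, dL/dw, dL/db


w, b, lr = 0.0, 0.0, 0.5
x, y = 2.0, 1.0  # one made-up training example
losses = []
for _ in range(200):
    loss, dw, db = forward_backward(w, b, x, y)
    losses.append(loss)
    w -= lr * dw  # gradient descent step on the weight
    b -= lr * db  # ... and on the bias
print(losses[0], losses[-1])
```

The second printed loss should be well below the first; a real network repeats the same multiply-the-local-derivatives step layer by layer.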

true gulch Jul 19, 2024, 1:04 AM

#

Hello, are you guys familiar with prototype_path and class name script?

hollow sentinel Jul 19, 2024, 1:26 AM

#

true gulch Hello, are you guys familiar with prototype_path and class name script?

you can just ask the question

late lichen Jul 19, 2024, 1:58 AM

#

small wedge Like math resources on the basic implementation?

I don't mind if it's complex implementation as long as it explain how it can calculate for optimizing the weights and biases

small wedge Jul 19, 2024, 2:03 AM

#

yeah the paper I linked does that

marble turtle Jul 19, 2024, 8:46 AM

#

Hey guys, Can anyone help me with setting up the apache spark environment?

lapis sequoia Jul 19, 2024, 9:59 AM

#

small wedge <https://arxiv.org/pdf/1802.01528>

nice resource, ty

urban pendant Jul 19, 2024, 10:57 AM

#

Hello guys

hollow sentinel Jul 19, 2024, 11:11 AM

#

urban pendant Hello guys

supppppp

orchid forge Jul 19, 2024, 1:57 PM

#

hello

true gulch Jul 19, 2024, 1:58 PM

#

hollow sentinel you can just ask the question

I’m having trouble understanding how it’s supposed to work

#

The path in the script isn’t working I’m guessing it’s because of .npy file

orchid forge Jul 19, 2024, 2:00 PM

#

I'm making a project on an fred website economy data, can someone help me generate questions regarding it?

#

i am new in making personal project i haven't got confidence till now to create my own questions for a project that i'm making

#

this is how the data looks

dark minnow Jul 19, 2024, 2:21 PM

#

Hi! I want to ask a question, so I want to create a chatbot and i want this chatbot to get information from the websites using ai. i know python a little bit i completed some python 3 courses and im going to start the intermediate course and like review python. However, I don't know where to start building the chatbot and which ai should i use etc.

can someone tell me what do i need to create a chatbot? some says use openai assistant other says use gemini and while others say use dialogflow etc.

topaz abyss Jul 19, 2024, 2:30 PM

#

webscraping?

#

you want it to webscrape?

left tartan Jul 19, 2024, 3:05 PM

#

orchid forge I'm making a project on an fred website economy data, can someone help me genera...

What about it? What's your question?

lapis sequoia Jul 19, 2024, 4:06 PM

#

keras guides are really superb...

#

im reading this one for curiosity.. https://keras.io/guides/serialization_and_saving/

Keras documentation: Save, serialize, and export models

deep sleet Jul 19, 2024, 4:41 PM

#

If you are deciding the batch size in a model then what's the use of number of steps per epoch param?

#

isn't it automatically decided by dividing the number of data points in the dataset by the batch size?
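
For a dataset of known size, that default is just a ceiling division. A tiny sketch (the `drop_last` flag is only a parameter of this helper, named after the similar option in PyTorch's `DataLoader`):

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_last=False):
    # Number of weight updates needed to see every sample once.
    if drop_last:
        return num_samples // batch_size      # ignore the short final batch
    return math.ceil(num_samples / batch_size)  # keep the short final batch


print(steps_per_epoch(10_000, 32))
print(steps_per_epoch(10_000, 32, drop_last=True))
```

As far as I know, setting the parameter by hand mostly matters when the data source has no known length, for example a generator or a dataset that repeats forever, where the framework cannot do this division itself.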

lapis sequoia Jul 19, 2024, 4:47 PM

#

i'd think so

orchid forge Jul 19, 2024, 4:47 PM

#

left tartan What about it? What's your question?

your name reminds me of Angelina Jolie and Billy bob marriage, it was wild

lapis sequoia Jul 19, 2024, 4:47 PM

#

the steps per epoch may be useful when you feed the whole data set

deep sleet Jul 19, 2024, 4:53 PM

#

lapis sequoia the steps per epoch may be useful when you feed the whole data set

Yeah I thought the same

slender meadow Jul 19, 2024, 5:21 PM

#

Hello respected members , as you all can see that i have just joined in this community

#

I want to become an AI engineer but i dont knows the proper steps

#

Can anyone guide me about the roadmap to become one

serene scaffold Jul 19, 2024, 5:26 PM

#

slender meadow I want to become an AI engineer but i dont knows the proper steps

what stage are you at currently? are you in high school or what?

slender meadow Jul 19, 2024, 5:26 PM

#

I just complete my high school

serene scaffold Jul 19, 2024, 5:26 PM

#

slender meadow I just complete my high school

will you be going to college/university for computer science?

slender meadow Jul 19, 2024, 5:27 PM

#

yes

#

but its 50\50 percent

serene scaffold Jul 19, 2024, 5:27 PM

#

what do you mean by that?

slender meadow Jul 19, 2024, 5:28 PM

#

like i am still not sure whether i will get a seat in the desired uni

#

cz thats the only good in uni in my city

serene scaffold Jul 19, 2024, 5:28 PM

#

what country are you in?

slender meadow Jul 19, 2024, 5:28 PM

#

so i m trying not to depend on my uni too much

#

INDIA

#

I hail from north eastern india which doesnt have good educational instituition thats y

serene scaffold Jul 19, 2024, 5:30 PM

#

you will need a university education in CS with an emphasis in AI to be able to have a career in this space.

slender meadow Jul 19, 2024, 5:30 PM

#

Yes i have opted for that

#

Most probably i will get it

serene scaffold Jul 19, 2024, 5:30 PM

#

okay, so take as many AI-related courses as you can, and make sure you're taking the appropriate math prerequisites

slender meadow Jul 19, 2024, 5:30 PM

#

I have aplied for CS in Data science and AI

#

Yes the uni has already set courses which include discrete maths , linear algebra etc

lapis sequoia Jul 19, 2024, 5:31 PM

#

serene scaffold okay, so take as many AI-related courses as you can, and make sure you're taking...

What would you do if your uni doesnt provide you with such classes ?

serene scaffold Jul 19, 2024, 5:32 PM

#

lapis sequoia What would you do if your uni doesnt provide you with such classes ?

find another uni

lapis sequoia Jul 19, 2024, 5:32 PM

#

Nice

bleak reef Jul 19, 2024, 5:32 PM

#

lol

lapis sequoia Jul 19, 2024, 5:32 PM

#

I mean wouldnt you recommend just learn the topic by yourself with books ?

slender meadow Jul 19, 2024, 5:32 PM

#

Sir may i ask from which country do u hail frm

serene scaffold Jul 19, 2024, 5:32 PM

#

lapis sequoia I mean wouldnt you recommend just learn the topic by yourself with books ?

you can do that if you want, but your chances of getting a job are so low that you need another plan.

bleak reef Jul 19, 2024, 5:32 PM

#

it takes alot of effort and hardwork to do it without a good uni

bleak reef Jul 19, 2024, 5:33 PM

#

serene scaffold you can do that if you want, but your chances of getting a job are so low that y...

yep

serene scaffold Jul 19, 2024, 5:33 PM

#

bleak reef it takes alot of effort and hardwork to do it without a good uni

and luck.

slender meadow Jul 19, 2024, 5:33 PM

#

i also think so

bleak reef Jul 19, 2024, 5:33 PM

#

serene scaffold and luck.

true that

lapis sequoia Jul 19, 2024, 5:33 PM

#

Okay. I just have one DS class which teaches the basic algorithms with Neural Networks.

slender meadow Jul 19, 2024, 5:33 PM

#

what aside from uni studies what can i do to achieve my desired goal

lapis sequoia Jul 19, 2024, 5:33 PM

#

I have modules like Data Driven Decision where you work with python mostly to predict something

serene scaffold Jul 19, 2024, 5:34 PM

#

slender meadow what aside from uni studies what can i do to achieve my desired goal

see if you can work with research professors who specialize in AI.

lapis sequoia Jul 19, 2024, 5:34 PM

#

But it's not a lot of theory

slender meadow Jul 19, 2024, 5:34 PM

#

Like i dont have any coding background

serene scaffold Jul 19, 2024, 5:34 PM

#

slender meadow Like i dont have any coding background

that's fine. they'll teach you when you get there.

slender meadow Jul 19, 2024, 5:34 PM

#

and i want to learn Python

bleak reef Jul 19, 2024, 5:34 PM

#

slender meadow what aside from uni studies what can i do to achieve my desired goal

there are online courses by good profs

slender meadow Jul 19, 2024, 5:34 PM

#

by myself

serene scaffold Jul 19, 2024, 5:34 PM

#

slender meadow and i want to learn Python

!resources

arctic wedgeBOT Jul 19, 2024, 5:34 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

bleak reef Jul 19, 2024, 5:35 PM

#

slender meadow and i want to learn Python

that's easy

slender meadow Jul 19, 2024, 5:35 PM

#

Is there any steps or routine i need to follow like a roadmap

bleak reef Jul 19, 2024, 5:36 PM

#

slender meadow Is there any steps or routine i need to follow like a roadmap

for clear basics I'd recommend apna college. youre from India So im guessing yk hindi

slender meadow Jul 19, 2024, 5:36 PM

#

@serene scaffold which country are u from

#

@bleak reef yes i k them

#

@serene scaffold are u a working professional

#

@serene scaffold what type of work u do like ai or web developer etc

serene scaffold Jul 19, 2024, 5:38 PM

#

slender meadow @serene scaffold what type of work u do like ai or web developer etc

I'm a computational linguist for a lab.

slender meadow Jul 19, 2024, 5:38 PM

#

@bleak reef are u also from india

bleak reef Jul 19, 2024, 5:38 PM

#

slender meadow @bleak reef are u also from india

yeah

slender meadow Jul 19, 2024, 5:38 PM

#

@serene scaffold sounds pretty impressive

#

@bleak reef are u a student

bleak reef Jul 19, 2024, 5:39 PM

#

yeah in 3rd year of my CS degree

slender meadow Jul 19, 2024, 5:40 PM

#

@bleak reef is there anything i need to be aware of before starting my cs degree as i will start my first sem from next month

#

@serene scaffold is there any roadmap to learn python

serene scaffold Jul 19, 2024, 5:40 PM

#

slender meadow @serene scaffold is there any roadmap to learn python

!resources

arctic wedgeBOT Jul 19, 2024, 5:40 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

bleak reef Jul 19, 2024, 5:40 PM

#

slender meadow @bleak reef is there anything i need to be aware of before starting my ...

profs will guide you there just fine

slender meadow Jul 19, 2024, 5:41 PM

#

@bleak reef actually i dont believe in the teaching faculty of uni cz i heard the faculty is average

lapis sequoia Jul 19, 2024, 5:42 PM

#

I would recommend watching some random python yt course to grasp the basics and then look up libraries you probably need.

#

And work with code in general

bleak reef Jul 19, 2024, 5:42 PM

#

slender meadow @bleak reef actually i dont believe in the teaching faculty of uni cz ...

But if you want to be better than others you should know how to communicate well and network with professionals

slender meadow Jul 19, 2024, 5:43 PM

#

oh i see

bleak reef Jul 19, 2024, 5:43 PM

#

and be good at maths

slender meadow Jul 19, 2024, 5:43 PM

#

hmm

bleak reef Jul 19, 2024, 5:43 PM

#

lapis sequoia I would recommend watch some random python yt course to gasp the basics and then...

yeah

slender meadow Jul 19, 2024, 5:49 PM

#

Is there tips to learn code better

serene grail Jul 19, 2024, 5:58 PM

#

slender meadow Is there tips to learn code better

Don't just read or watch books/courses, actually type in the code yourself and try to apply what you learn by making your own projects
Active practice is far better than just passively absorbing material

deep sleet Jul 19, 2024, 6:22 PM

#

serene scaffold I'm a computational linguist for a lab.

What is that?

serene scaffold Jul 19, 2024, 6:23 PM

#

deep sleet What is that?

I specialize in language technology

deep sleet Jul 19, 2024, 6:25 PM

#

oh so stuff similar to chat gpt?

#

sorry for my ignorance

serene scaffold Jul 19, 2024, 6:27 PM

#

yes

slender meadow Jul 19, 2024, 6:28 PM

#

@serene grail thnx

umbral blaze Jul 19, 2024, 7:33 PM

#

Hello everybody. I was wondering if anyone knew good resources to learn how to build AI applications using Python before college?

past meteor Jul 19, 2024, 7:37 PM

#

@warm copper the slopes and the means are functionally the same

warm copper Jul 19, 2024, 7:37 PM

#

For instance, if you're analysing the impact of different factors on delivery times, regression could help you. It can easily predict how changes in these factors affect the overall supply chain. If you're comparing the average delivery times across different regions or transportation methods, ANOVA would be more appropriate.

past meteor Jul 19, 2024, 7:37 PM

#

if you have a binary variable yes/no and you make a dummy the intercept is the mean of yes or no

#

and the "YES" variable is the difference of the mean

#

Can you not see how this is the same?

#

It's presented differently in R but it's fundamentally the same thing repackaged

warm copper Jul 19, 2024, 7:38 PM

#

https://www.youtube.com/watch?v=_BfiRvm39lc

YouTube: When to Use Regression Instead of ANOVA (GraphPad Software)

past meteor Jul 19, 2024, 7:39 PM

#

Hence why more and more programs don't teach ANOVA anymore altogether

warm copper Jul 19, 2024, 7:39 PM

#

I literally used ANOVA during my internship

past meteor Jul 19, 2024, 7:39 PM

#

Don't take this the wrong way but are you reading what I'm saying?

warm copper Jul 19, 2024, 7:40 PM

#

I am but I still use them for different purposes

past meteor Jul 19, 2024, 7:40 PM

#

If you ONLY have 3 categorical variables and you make 2 dummies

#

red, blue and green. You make a column for red and a column for green

#

It's clear that beta_0 is the mean of blue, yes?

#

It's also clear that the coefficients beta_1 and beta_2 express the differences between the means of red and green and the mean of blue, right?
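
A quick way to see this without any stats library, on made-up numbers: take the coefficients straight from the group means (blue as the baseline), build the dummy-coded design rows, and check that the residuals are orthogonal to every column. That orthogonality is exactly the normal-equations condition for a least-squares fit, so the group-mean coefficients are the regression coefficients.

```python
# Made-up delivery times per colour group.
groups = {
    "blue": [4.0, 5.0, 6.0],
    "red": [7.0, 9.0],
    "green": [1.0, 2.0, 3.0, 2.0],
}


def mean(values):
    return sum(values) / len(values)


# Candidate coefficients read straight off the group means (blue = baseline).
beta_0 = mean(groups["blue"])
beta_red = mean(groups["red"]) - beta_0
beta_green = mean(groups["green"]) - beta_0

# Build the dummy-coded design rows [1, is_red, is_green] and the residuals.
rows, residuals = [], []
for colour, ys in groups.items():
    for y in ys:
        row = [1.0, float(colour == "red"), float(colour == "green")]
        fitted = beta_0 * row[0] + beta_red * row[1] + beta_green * row[2]
        rows.append(row)
        residuals.append(y - fitted)

# Least squares is solved exactly when X^T r = 0 (the normal equations).
normal_eq = [sum(r[j] * e for r, e in zip(rows, residuals)) for j in range(3)]
print(beta_0, beta_red, beta_green, normal_eq)
```

The F-test in a one-way ANOVA table is then the overall F-test of this same regression, which is why the two get described as one model presented two ways.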

warm copper Jul 19, 2024, 7:41 PM

#

I used ANOVA in DOE a lot too

#

Instead of Regression

#

Yeah I understand that

#

but you cant do logistic regression with ANOVA

#

multiple regression in my bulletpoint doesnt refer to linear regression

#

it can be logistic or polynomial regression too

past meteor Jul 19, 2024, 7:44 PM

#

Exactly

#

All I'm saying is that if it's a stats heavy data scientist role and you don't see that linear regression subsumes anova

#

you may have an issue during the interview

#

And for that reason I avoid those roles like the plague

warm copper Jul 19, 2024, 7:45 PM

#

I mean ANOVA can be used as a linear regression for sure. You can use them interchangebly

#

but the output i get from each of them are different

#

but they achieve the same thing

past meteor Jul 19, 2024, 7:46 PM

#

Other fun questions I get at interviews is how random forest and xgboost work

#

For some reason every technical interview has asked that

warm copper Jul 19, 2024, 7:46 PM

#

ANOVA table puts out more statistical stuff

#

like p values, f score degree of freedom

#

you don't see that kind of stuff on linear regression output

past meteor Jul 19, 2024, 7:48 PM

#

Depends on the library of course

#

We used Stata (🤢) for this stuff in uni and afaik you get F scores there with your regression output

warm copper Jul 19, 2024, 7:48 PM

#

https://github.com/KadirOrcunAltunel/HeartFailureAnalysis/blob/main/Methods.md

#

this is from my undergrad thesis @past meteor

past meteor Jul 19, 2024, 7:49 PM

#

I'm very picky, are you sure you want me to look? 😂

warm copper Jul 19, 2024, 7:49 PM

#

yup

past meteor Jul 19, 2024, 7:52 PM

#

My first question is, did the people that die specifically die due to heart failure or could it be anything

#

If you had a car crash in september are you added to the group that passed?

warm copper Jul 19, 2024, 7:53 PM

#

they died specifically from it

#

https://github.com/KadirOrcunAltunel/HeartFailureAnalysis/blob/main/Dataset.md

past meteor Jul 19, 2024, 7:54 PM

#

stepwise regression is a no-go

warm copper Jul 19, 2024, 7:54 PM

#

dataset part of project states that

past meteor Jul 19, 2024, 7:54 PM

#

many papers on why you shouldn't use stepwise

warm copper Jul 19, 2024, 7:54 PM

#

Well we only learnt stepwise, logistic and linear regression in that course

past meteor Jul 19, 2024, 7:54 PM

#

damn

warm copper Jul 19, 2024, 7:54 PM

#

It was called Applied Regression Analysis

past meteor Jul 19, 2024, 7:54 PM

#

stepwise + AIC = 🥴

#

No Lasso?

warm copper Jul 19, 2024, 7:55 PM

#

no

past meteor Jul 19, 2024, 7:55 PM

#

damn

warm copper Jul 19, 2024, 7:55 PM

#

The only lasso I know is the regularization technique

past meteor Jul 19, 2024, 7:55 PM

#

Yeah

warm copper Jul 19, 2024, 7:55 PM

#

which I learnt in ML

past meteor Jul 19, 2024, 7:55 PM

#

that one

#

https://www.reddit.com/r/statistics/comments/7bvo6m/why_is_stepwise_regression_criticized/

From what I've been taught about stepwise regression, the problem is how very atheoretical it is - and also how dependent it is on sample characteristics. Since predictors are often at least a little correlated, using the exact same set of variables with stepwise selection on two different datasets will often get you a VERY different solution.

Hell, if you run it forward and backward on the same dataset you'll often get very different solutions.

For models that are inherently multivariate, pretending they aren't (doing a bunch of pairwise comparisons, one variable at a time) is generally not the best way to go.

#

I guess that if your prof is teaching you stepwise, then I can see why you did it

warm copper Jul 19, 2024, 7:56 PM

#

I would use DT on this dataset now

past meteor Jul 19, 2024, 7:57 PM

#

I'm mostly missing a residual analysis

warm copper Jul 19, 2024, 7:57 PM

#

😄

past meteor Jul 19, 2024, 7:58 PM

#

Your regression formula assumes a linear relationship between each variable with 0 interactions

warm copper Jul 19, 2024, 7:58 PM

#

Homoscedasticity

#

I learnt all those later in my degree

past meteor Jul 19, 2024, 7:59 PM

#

Well, you could plot the residuals wrt each variable and find out if it's homoscedastic or not indeed

#

isn't the bachelor thesis at the very end?

warm copper Jul 19, 2024, 7:59 PM

#

Assumptions of Linearity

#

Nag

past meteor Jul 19, 2024, 7:59 PM

#

strange

warm copper Jul 19, 2024, 7:59 PM

#

Nah this was more like a class project

#

It was a very tough class tho

#

We had 24 peeps when we started and ended with 8

past meteor Jul 19, 2024, 8:00 PM

#

I glossed over all the statistical tests because it's been too long for me and I never use them at work

#

yeah, I think it's mostly stepwise and no iterative, "data driven" approach to modelling

#

but I guess you saw that in other classes

warm copper Jul 19, 2024, 8:01 PM

#

yup

#

This is my current project for deep learning @past meteor

past meteor Jul 19, 2024, 8:01 PM

#

Also with a tad of omitted variable bias https://en.wikipedia.org/wiki/Omitted-variable_bias

warm copper Jul 19, 2024, 8:02 PM

#

https://colab.research.google.com/drive/1a_wldtua97Co9iqgPuuptvGimAjElmGm#scrollTo=DW_kYEyRqUj-

Google Colab

#

does it let you see it

past meteor Jul 19, 2024, 8:02 PM

#

Mostly matters for interpretation of your results

#

Hence why I'm always afraid of interpreting regression coefficients, it's risky business if you're not a statistician pur sang
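
A small simulation of omitted-variable bias, as a sketch under made-up numbers. The true effects `beta_x`, `beta_z` and the correlation strength `delta` are hypothetical. When `z` is left out of the regression, the slope on `x` absorbs part of `z`'s effect and lands near `beta_x + beta_z * delta` instead of `beta_x`.

```python
import random


def ols_slope(xs, ys):
    """Slope of a one-variable least-squares fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)


rng = random.Random(42)
n = 20000
beta_x, beta_z, delta = 1.0, 2.0, 0.5   # hypothetical true values

xs = [rng.gauss(0, 1) for _ in range(n)]
zs = [delta * x + rng.gauss(0, 1) for x in xs]                        # z is correlated with x
ys = [beta_x * x + beta_z * z + rng.gauss(0, 1) for x, z in zip(xs, zs)]

slope_without_z = ols_slope(xs, ys)       # regression that omits z
expected_biased = beta_x + beta_z * delta
print(f"true effect of x: {beta_x}, estimated without z: {slope_without_z:.2f}")
```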

warm copper Jul 19, 2024, 8:02 PM

#

https://colab.research.google.com/drive/1a_wldtua97Co9iqgPuuptvGimAjElmGm?usp=sharing

Google Colab

past meteor Jul 19, 2024, 8:02 PM

#

Or maybe I take this too seriously 🤷

warm copper Jul 19, 2024, 8:03 PM

#

try now lol

#

im using transformers

#

to detect colon cancer

#

I managed 99 percent accuracy

#

👽

past meteor Jul 19, 2024, 8:07 PM

#

I'm not gonna read this ngl haha

warm copper Jul 19, 2024, 8:07 PM

#

LOL

past meteor Jul 19, 2024, 8:07 PM

#

it's very notebooky

warm copper Jul 19, 2024, 8:07 PM

#

yusss

past meteor Jul 19, 2024, 8:07 PM

#

you need to read it top to bottom

#

And can't join in the middle and see what's going on

warm copper Jul 19, 2024, 8:08 PM

#

i said to myself

past meteor Jul 19, 2024, 8:08 PM

#

it's 10 pm for me on a friday, aint gonna do that rn 🤣

warm copper Jul 19, 2024, 8:08 PM

#

if I cant get a job i will become a college prof

past meteor Jul 19, 2024, 8:08 PM

#

What I'd look for here is if you're leaking data or not

warm copper Jul 19, 2024, 8:08 PM

#

i will do a phd and stay in the college

past meteor Jul 19, 2024, 8:08 PM

#

If you get 99 % accuracy you should be worried, not happy imo
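
One cheap leak check, sketched with hypothetical file names and toy bytes: hash every training item and see whether any test item has the same content. This only catches exact duplicates. It would not catch near-duplicates, augmented copies, or images from the same patient landing in both splits.

```python
import hashlib


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def find_leaks(train_items, test_items):
    """Return the keys of test items whose bytes also appear in the training set.

    Both arguments are iterables of (key, bytes) pairs.
    """
    train_hashes = {content_hash(data) for _, data in train_items}
    return [key for key, data in test_items if content_hash(data) in train_hashes]


# Toy stand-ins for image files read from disk.
train = [("train/a.png", b"\x00\x01"), ("train/b.png", b"\x02\x03")]
test = [("test/c.png", b"\x02\x03"), ("test/d.png", b"\x04\x05")]

leaks = find_leaks(train, test)
print(leaks)
```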

warm copper Jul 19, 2024, 8:09 PM

#

transformers are really powerful

#

basically you are training your model on a pretrained model

#

I used ViT

past meteor Jul 19, 2024, 8:10 PM

#

which is what you do with say resnet as well

warm copper Jul 19, 2024, 8:11 PM

#

yup

#

what is worrying me is the fluctuations in epochs @past meteor

#

Epoch: 4 | train loss: 0.2172 | test accuracy: 0.94
Epoch: 4 | train loss: 0.0596 | test accuracy: 1.00

#

something is off

agile cobalt Jul 19, 2024, 8:15 PM

#

```python
# Get the next batch for testing purposes
test = next(iter(test_loader))
test_x = test[0]
```
that `iter()` is either redundant or a bug
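
A small sketch of what `next(iter(...))` does, assuming `test_loader` behaves like a re-iterable container. A PyTorch `DataLoader` is an iterable rather than an iterator, so `next()` needs the `iter()` call. The list `fake_loader` is a hypothetical stand-in. Each call builds a fresh iterator, so without shuffling it keeps handing back the first batch, and any accuracy computed from it covers one batch rather than the whole test set.

```python
# Hypothetical stand-in for a loader that yields (inputs, labels) batches.
fake_loader = [("batch0_x", "batch0_y"), ("batch1_x", "batch1_y"), ("batch2_x", "batch2_y")]

first = next(iter(fake_loader))
second = next(iter(fake_loader))   # a brand-new iterator, so this is batch 0 again
print(first == second)

# Looping over the loader itself visits every batch exactly once.
seen = [inputs for inputs, _ in fake_loader]
print(seen)

# Calling next() directly on a plain iterable (not an iterator) fails.
error = None
try:
    next(fake_loader)
except TypeError as exc:
    error = exc
    print("next() without iter() fails:", exc)
```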

warm copper Jul 19, 2024, 8:15 PM

#

it's redundant

#

but what do you think about the epochs @agile cobalt

#

why such high fluctuations?

#

Epoch:  6 | train loss: 0.1894 | test accuracy: 0.94
Epoch:  6 | train loss: 0.0428 | test accuracy: 1.00
Epoch:  6 | train loss: 0.1503 | test accuracy: 0.94
Epoch:  7 | train loss: 0.3682 | test accuracy: 0.94
Epoch:  7 | train loss: 0.3046 | test accuracy: 0.94

agile cobalt Jul 19, 2024, 8:17 PM

#

idk batch size too small?

warm copper Jul 19, 2024, 8:17 PM

#

its 16
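
One hedged reading of the logs: if accuracy is scored on a single batch of `n` images, it can only take the values `k / n`. That the notebook scores one batch is an inference from the `next(iter(test_loader))` snippet earlier in the conversation, not something confirmed. The sketch lists which two-decimal values could be printed for a given batch size.

```python
def printable_accuracies(batch_size: int) -> list[str]:
    """Every accuracy string (two decimals) possible when scoring one batch."""
    return [f"{k / batch_size:.2f}" for k in range(batch_size + 1)]


acc16 = printable_accuracies(16)
acc8 = printable_accuracies(8)
print(acc16[-3:])   # the three highest values for a batch of 16
print(acc8[-3:])    # the three highest values for a batch of 8
```

With 16 images, a single wrong prediction moves the printed value from 1.00 to 0.94, which is the size of the jump in the logs. Scoring on more samples would shrink those steps.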

agile cobalt Jul 19, 2024, 8:17 PM

#

don't you have 8 classes or so

warm copper Jul 19, 2024, 8:17 PM

#

yup

#

why?

agile cobalt Jul 19, 2024, 8:19 PM

#

never mind, I got a bit confused
(thinking about how many classes it'll see each iteration, but that probably shouldn't matter)

warm copper Jul 19, 2024, 8:20 PM

#

lol

#

I changed my batch size to 64 from 16 now

#

and LR to 0.01 from 0.00001

#

Made it worse lol

agile cobalt Jul 19, 2024, 8:22 PM

#

maybe try smaller just to see what happens

warm copper Jul 19, 2024, 8:22 PM

#

yeah maybe like 8?

#

for batch_size?

agile cobalt Jul 19, 2024, 8:22 PM

#

or even 4

0.01 learning rate is probably too high though
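
A toy illustration of why an overly large learning rate hurts, using plain gradient descent on `f(w) = w**2`. This is a sketch only: the step size at which things blow up depends on the curvature of the loss, so the numbers here say nothing about which learning rate suits a ViT.

```python
def gradient_descent(lr: float, steps: int = 20, w: float = 1.0) -> float:
    """Minimise f(w) = w**2 with fixed-step gradient descent; return the final w."""
    for _ in range(steps):
        grad = 2 * w      # derivative of w**2
        w -= lr * grad
    return w


small_step = gradient_descent(lr=0.1)   # shrinks towards the minimum at 0
large_step = gradient_descent(lr=1.1)   # overshoots further on every step
print(f"lr=0.1 -> w={small_step:.4f}")
print(f"lr=1.1 -> w={large_step:.4f}")
```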

warm copper Jul 19, 2024, 8:23 PM

#

EPOCHS = 50
BATCH_SIZE = 8
LEARNING_RATE = 0.000001

#

Im gonna do this lol

#

Epoch:  5 | train loss: 0.6605 | test accuracy: 1.00
Epoch:  5 | train loss: 1.0521 | test accuracy: 0.75
Epoch:  5 | train loss: 0.5068 | test accuracy: 1.00
Epoch:  5 | train loss: 0.7508 | test accuracy: 1.00

#

lol 🥲

past meteor Jul 19, 2024, 8:36 PM

#

warm copper what is worrying me is the fluctuations in epochs @past meteor

Why is a single epoch printed out more than once?

#

That in and of itself is a bit strange

warm copper Jul 19, 2024, 8:36 PM

#

I increased dropout layer to 0.5

#

it goes through each batch

#

you can make it print once

#

it gives more detail on what's going on in each batch

past meteor Jul 19, 2024, 8:37 PM

#

evaluating on the validation set each batch is strange

#

I'd just have 1 line per epoch tbf
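
A framework-agnostic sketch of that loop shape: train for an epoch, then score the whole validation set once and print one line. `train_one_epoch`, `predict` and the toy batches are hypothetical stand-ins. In PyTorch the evaluation pass would typically also run under `model.eval()` and `torch.no_grad()`.

```python
def evaluate(predict, batches):
    """Accuracy accumulated over every batch, not just one of them."""
    correct = total = 0
    for inputs, labels in batches:
        preds = predict(inputs)
        correct += sum(int(p == y) for p, y in zip(preds, labels))
        total += len(labels)
    return correct / total


def train(num_epochs, train_one_epoch, predict, val_batches):
    history = []
    for epoch in range(num_epochs):
        train_loss = train_one_epoch()              # mean loss over the epoch
        val_acc = evaluate(predict, val_batches)    # one full validation pass
        history.append((epoch, train_loss, val_acc))
        print(f"Epoch: {epoch:3d} | train loss: {train_loss:.4f} | val accuracy: {val_acc:.2f}")
    return history


# Toy stand-ins so the sketch runs on its own.
val_batches = [([1, 2, 3], [1, 0, 1]), ([4, 5], [1, 1])]
predict = lambda xs: [x % 2 for x in xs]    # "model" that predicts parity
fake_losses = iter([0.9, 0.5, 0.3])
history = train(3, lambda: next(fake_losses), predict, val_batches)
```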

warm copper Jul 19, 2024, 8:43 PM

#

fixed it

past meteor Jul 19, 2024, 8:49 PM

#

Anyway, if your model's performance is really good odds are you shouldn't be celebrating but rather looking for where you have a leak

#

If you've searched exhaustively and found nothing, then you can celebrate

harsh sun Jul 19, 2024, 8:53 PM

#

I did 50 epochs of training on my model, and on the 11th epoch I got 92% training and validation accuracy. How can I like select the epoch with the best training and validation when I train it next so that I can make that configuration the permanent one?
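
One common way to handle this, sketched with a hypothetical `BestCheckpoint` helper: after every epoch, compare the validation score with the best seen so far and keep a copy of the model state when it improves. Here `state` is a plain dict; in PyTorch it would typically be `model.state_dict()`, written out with `torch.save`. Selecting on the validation score rather than the training score is the usual choice.

```python
import copy


class BestCheckpoint:
    """Keeps a snapshot of the state from the epoch with the best validation score."""

    def __init__(self):
        self.best_score = float("-inf")
        self.best_epoch = None
        self.best_state = None

    def update(self, epoch, val_score, state):
        """Store a copy of `state` if `val_score` beats the best so far."""
        if val_score > self.best_score:
            self.best_score = val_score
            self.best_epoch = epoch
            self.best_state = copy.deepcopy(state)   # snapshot, not a live reference
            return True
        return False


# Toy run: the "weights" change every epoch, validation peaks at epoch 3.
tracker = BestCheckpoint()
state = {"w": 0.0}
val_scores = [0.70, 0.85, 0.92, 0.90, 0.88]
for epoch, score in enumerate(val_scores, start=1):
    state["w"] += 1.0                    # pretend training updated the weights
    tracker.update(epoch, score, state)

print(tracker.best_epoch, tracker.best_score, tracker.best_state)
```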

#

Also, when doing the hyperparameter search, the best parameters change on every run. So I finally ran it, saw good parameters, hard-coded those in, and got better results. Why isn't that the standard instead of them changing every time?
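
A sketch of why the result moves between runs and how a seed pins it down. Random search samples configurations at random, and the score of each configuration is itself noisy, so an unseeded search can crown a different winner every time. `noisy_objective` is a hypothetical stand-in for "train a model and return validation accuracy", and the search space values are made up.

```python
import random


def random_search(objective, space, n_trials, seed):
    """Sample n_trials configurations from `space` and return the best (params, score)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params, rng)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def noisy_objective(params, rng):
    """Hypothetical score: a smooth preference plus run-to-run noise."""
    base = 0.9 - abs(params["lr"] - 1e-3) * 50 - abs(params["batch_size"] - 32) / 1000
    return base + rng.gauss(0, 0.02)


space = {"lr": [1e-4, 1e-3, 1e-2], "batch_size": [16, 32, 64]}
run_a = random_search(noisy_objective, space, n_trials=10, seed=0)
run_b = random_search(noisy_objective, space, n_trials=10, seed=0)   # same seed as run_a
run_c = random_search(noisy_objective, space, n_trials=10, seed=1)   # different seed
print(run_a)
print(run_c)
```

Hard-coding the winning configuration afterwards is normal practice. If the winner keeps changing between seeds, though, the candidates may simply be within noise of each other.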

glass ridge Jul 19, 2024, 9:31 PM

#

warm copper fixed it

what library do u use for deep learning

warm copper Jul 19, 2024, 9:31 PM

#

i use pytorch

glass ridge Jul 19, 2024, 9:32 PM

#

warm copper i use pytorch

ok , from where did u learn it

warm copper Jul 19, 2024, 9:32 PM

#

college

#

😄

#

work

glass ridge Jul 19, 2024, 9:33 PM

#

warm copper college

ok , still at school 😦

warm copper Jul 19, 2024, 9:33 PM

#

are you?

glass ridge Jul 19, 2024, 9:33 PM

#

warm copper are you?

yeah

#

1 year to graduate and choose a specialization

past meteor Jul 19, 2024, 9:34 PM

#

Sadly the answer will be the same as I give you with numpy 😅

#

The documentation

glass ridge Jul 19, 2024, 9:34 PM

#

past meteor Sadly the answer will be the same as I give you with numpy 😅

i've only read the quick start

#

then see neural nine video

past meteor Jul 19, 2024, 9:35 PM

#

The pytorch documentation has a "learn" section

#

it's how I learnt pytorch. It's mostly the same as Numpy and Tensorflow

#

no books, no videos, just the docs

glass ridge Jul 19, 2024, 9:35 PM

#

past meteor The pytorch documentation has a "learn" section

is it possible to learn it from courses

past meteor Jul 19, 2024, 9:35 PM

#

that's how you should learn to use tools imo. If the tools have bad docs, just use a different one if you have a choice

glass ridge Jul 19, 2024, 9:35 PM

#

past meteor The pytorch documentation has a "learn" section

m now learning pandas

past meteor Jul 19, 2024, 9:36 PM

#

I really don't like the idea of learning from courses

#

I explained last time already why not

#

You should pick a book like the ones in the pinned post I list

#

They all have exercises

glass ridge Jul 19, 2024, 9:37 PM

#

past meteor You should pick a book like the ones in the pinned post I list

?

past meteor Jul 19, 2024, 9:37 PM

#

do you know how to find pinned messages on discord?

#

no problem if you don't

glass ridge Jul 19, 2024, 9:37 PM

#

i know

past meteor Jul 19, 2024, 9:38 PM

#

The second pinned message is about books I recommend

glass ridge Jul 19, 2024, 9:38 PM

#

from Raggy?

past meteor Jul 19, 2024, 9:38 PM

#

#data-science-and-ml message

glass ridge Jul 19, 2024, 9:39 PM

#

https://mml-book.github.io/book/mml-book.pdf

glass ridge Jul 19, 2024, 9:42 PM

#

past meteor https://discord.com/channels/267624335836053506/366673247892275221/1150186929053...

do i need to learn API's

serene scaffold Jul 19, 2024, 9:43 PM

#

glass ridge do i need to learn API's

"APIs" are a broad concept. if you're learning how to use <x programming thing>, you're learning the api of x.

past meteor Jul 19, 2024, 9:44 PM

#

glass ridge do i need to learn API's

Do you mean backend APIs / Web development or what stelercus is talking about (the correct usage of the term API)

glass ridge Jul 19, 2024, 9:45 PM

#

past meteor Do you mean backend APIs / Web development or what stelercus is talking about (t...

is there an API concept in ML (i just heard it so sorry for any confusion)

past meteor Jul 19, 2024, 9:46 PM

#

Can you first clarify what you mean by API

#

it doesn't need to be perfect, just describe it in your own words so I understand what you mean

warm copper Jul 19, 2024, 9:49 PM

#

Epoch:  31 | train loss: 0.1147 | test accuracy: 1.00
Epoch:  32 | train loss: 0.0760 | test accuracy: 0.75
Epoch:  33 | train loss: 0.0796 | test accuracy: 0.88
Epoch:  34 | train loss: 0.0718 | test accuracy: 0.88
Epoch:  35 | train loss: 0.0762 | test accuracy: 0.88
Epoch:  36 | train loss: 0.0627 | test accuracy: 1.00
Epoch:  37 | train loss: 0.0548 | test accuracy: 0.88
Epoch:  38 | train loss: 0.0607 | test accuracy: 1.00
Epoch:  39 | train loss: 0.0512 | test accuracy: 0.88
Epoch:  40 | train loss: 0.0575 | test accuracy: 1.00
Epoch:  41 | train loss: 0.0567 | test accuracy: 1.00
Epoch:  42 | train loss: 0.0506 | test accuracy: 1.00
Epoch:  43 | train loss: 0.0588 | test accuracy: 1.00
Epoch:  44 | train loss: 0.0361 | test accuracy: 1.00
Epoch:  45 | train loss: 0.0434 | test accuracy: 1.00
Epoch:  46 | train loss: 0.0401 | test accuracy: 1.00
Epoch:  47 | train loss: 0.0331 | test accuracy: 1.00
Epoch:  48 | train loss: 0.0440 | test accuracy: 1.00
Epoch:  49 | train loss: 0.0330 | test accuracy: 1.00

glass ridge Jul 19, 2024, 9:49 PM

#

past meteor Can you first clarify what you mean with API first

i just heard it on a data scientist roadmap

warm copper Jul 19, 2024, 9:49 PM

#

Pretty good accuracy! @past meteor

#

I would probably get a lower train loss if I ran it for another 10 epochs

glass ridge Jul 19, 2024, 9:51 PM

#

past meteor Do you mean backend APIs / Web development or what stelercus is talking about (t...

it could just be a misunderstanding

past meteor Jul 19, 2024, 9:52 PM

#

So, they're using API in the colloquial but incorrect sense: a way for the outside world to interact with your ML models over the internet

serene scaffold Jul 19, 2024, 9:52 PM

#

glass ridge i just heard it on a data scientist roadmap

Throw away that roadmap.
Pick a basic ML concept (not a library) and write some code that exemplifies that concept. You'll inevitably need at least one ML library to accomplish it. Just learn whatever minimal amount of that library's API that you need to do it.

past meteor Jul 19, 2024, 9:52 PM

#

Don't worry about it

warm copper Jul 19, 2024, 9:52 PM

#

i deem my model as a success

#

😛

past meteor Jul 19, 2024, 9:52 PM

#

Stelercus is spot on

#

I started learning data science in exactly that way. I downloaded all of my data from facebook (... I use messenger a lot) and did a data analysis project with it

glass ridge Jul 19, 2024, 9:53 PM

#

serene scaffold Throw away that roadmap. Pick a basic ML concept (not a library) and write some ...

u mean like learning from projects

serene scaffold Jul 19, 2024, 9:53 PM

#

glass ridge u mean like learning from projects

sure.

warm copper Jul 19, 2024, 9:53 PM

#

use kaggle @glass ridge

past meteor Jul 19, 2024, 9:54 PM

#

Along the way I learnt the basics of pandas, storing data in DBs with Python, what JSON is, making ML models using sklearn, ...

warm copper Jul 19, 2024, 9:54 PM

#

https://www.kaggle.com

Kaggle: Your Machine Learning and Data Science Community

Kaggle is the world’s largest data science community with powerful tools and resources to help you achieve your data science goals.

#

https://huggingface.co

Hugging Face – The AI community building the future.

past meteor Jul 19, 2024, 9:54 PM

#

This is a project you can easily copy for whatever platform you use because of GDPR, like you can ask discord for all of your data

#

And then you can do some analysis on that

#

Doesn't need to be advanced, but it's more productive than roadmaps and whatnot

warm copper Jul 19, 2024, 9:55 PM

#

https://roboflow.com/universe

Roboflow Universe: The Largest Community of Vision Datasets

The largest resource of computer vision datasets and pre-trained models.

#

these are good places

past meteor Jul 19, 2024, 9:55 PM

#

warm copper I would probably get a lower train loss if I ran it for another 10 epochs

1. A very low train loss is absolutely a bad thing
2. A very high accuracy is very suspicious

glass ridge Jul 19, 2024, 9:55 PM

#

past meteor This is a project you can easily copy for whatever platform you use because of G...

what project

past meteor Jul 19, 2024, 9:56 PM

#

glass ridge what project

I just told you 😭

warm copper Jul 19, 2024, 9:56 PM

#

it was really high

#

went from 180 percent to 3 percent

#

why would it be a bad thing

#

you aim to minimize train loss

past meteor Jul 19, 2024, 9:57 PM

#

I recently had a very good model and I presented it to my clients (both are PhD + multiple post doc tier data scientists)

#

their obvious reaction was "where did you make an error?"

glass ridge Jul 19, 2024, 9:58 PM

#

past meteor I started learning data science in exactly that way. I downloaded all of my data...

what do u mean by data analytics, is it cleaning and representing data

past meteor Jul 19, 2024, 9:58 PM

#

And that is with me presenting my results with loads of skepticism

warm copper Jul 19, 2024, 9:58 PM

#

first of all transformers usually lead to very low train loss

past meteor Jul 19, 2024, 9:58 PM

#

warm copper you aim to minimize train loss

you don't

#

that's not the point of training models, it's the exact opposite 😔

warm copper Jul 19, 2024, 9:58 PM

#

??????

past meteor Jul 19, 2024, 9:59 PM

#

The point is not minimising the train loss

#

it's training something that generalizes

warm copper Jul 19, 2024, 9:59 PM

#

you need high accuracy

past meteor Jul 19, 2024, 9:59 PM

#

If you make the training loss go to 0

#

you're typically not generalizing
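
An extreme toy case of that point, as a sketch: a "model" that memorises its training set scores perfectly on it, yet does no better than chance on new data. The labels here are deliberately pure noise, so there is nothing to generalise. Real models sit somewhere between this and a well-fitted one, which is why held-out performance is the thing to watch.

```python
import random

rng = random.Random(0)


def make_data(start, n):
    """Unique integer inputs with coin-flip labels that carry no signal."""
    return [(i, rng.randrange(2)) for i in range(start, start + n)]


train = make_data(0, 1000)
test = make_data(1000, 1000)    # inputs never seen during "training"

memory = dict(train)            # the "model": a lookup table of training examples


def predict(x):
    return memory.get(x, 0)     # unseen inputs fall back to class 0


def accuracy(data):
    return sum(predict(x) == y for x, y in data) / len(data)


train_acc = accuracy(train)
test_acc = accuracy(test)
print(f"train accuracy: {train_acc:.2f} | test accuracy: {test_acc:.2f}")
```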

warm copper Jul 19, 2024, 9:59 PM

#

Im doing image classification

#

lower the loss the better the accuracy

past meteor Jul 19, 2024, 9:59 PM

#

it doesn't matter if it's classification, regression or clustering

#

I'm sorry but this is patently false

warm copper Jul 19, 2024, 10:00 PM

#

Im supposed to aim 95 percent accuracy

#

for the project

past meteor Jul 19, 2024, 10:00 PM

#

I'm being hard on you because you're interviewing

#

If you say this in an interview it's over

warm copper Jul 19, 2024, 10:00 PM

#

alright then lets have a shitty classification model that doesn't accurately classify images

serene scaffold Jul 19, 2024, 10:00 PM

#

warm copper lower the loss the better the accuracy

that's not guaranteed to be the case
and high accuracy doesn't mean that the model will perform well on instances outside the dataset
accuracy might also be the wrong metric
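
A minimal sketch of how accuracy can mislead on imbalanced data. The 5% positive rate and the always-negative "model" are hypothetical. The classifier never finds a single positive case, yet its accuracy looks excellent, which is why metrics such as recall are often reported alongside it.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


y_true = [1] * 5 + [0] * 95    # hypothetical dataset: 5 positives out of 100
y_pred = [0] * 100             # a "model" that always predicts negative

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy: {acc:.2f} | recall on positives: {rec:.2f}")
```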

warm copper Jul 19, 2024, 10:01 PM

#

first of all

#

the model is already given