#data-science-and-ml


worldly dawn
#

that sounds like a class I would have loved

past meteor
#

And you could construct the algo however you wanted

#

And just like you, EAs are things I use at work but super infrequently

worldly dawn
#

I think they are underrated and DL has taken quite a bit of the spotlight. But they are still worth knowing and having in your toolbox

#

Also it was cool to see some of the papers on EA applied to DL architectures

#

or even the evolution of weights

past meteor
#

To me they occupy a very different space

worldly dawn
#

which is?

#

and how do you use them at work?

past meteor
#

Needing data vs not needing data

worldly dawn
#

Interesting

past meteor
#

But I agree, they're important to have in your toolbox

#

My colleagues would try and solve typical problems like TSP with machine learning

#

Or nurse scheduling, (3D) knapsack etc.

worldly dawn
#

oh yeah they are fun

lethal pendant
#

hello

#

can anyone tell me what should i do to start

#

ill do any suggestion

past meteor
lethal pendant
#

ok

worldly dawn
toxic mortar
#

Follow-up on this: What benchmarks you look for when evaluating a DL/ML classification paper. I was wondering things such as dataset, accuracy, confusion matrix. Is there anything else that can give me insights about their attempt? Thanks man 😄

hoary merlin
#

which social media site is good for a sentiment analysis project? i was going for twitter but its api is paid now

jaunty helm
# toxic mortar Follow-up on this: What benchmarks you look for when evaluating a DL/ML classif...

accuracy may not be great when dealing with imbalanced data
e.g. say you have a house fire dataset where only 1/100 houses actually caught fire, you can get 99% accuracy by always guessing no fire, but that model's terrible in practice (it's more important to correctly predict those that may catch fire to deal with it preemptively, than to correctly predict those that don't catch fire)
so you have 3 more stats, precision, recall, and f1.
to put it simply, if I were hunting a group of 100 ducks and got 60, I'd have 60% recall (which measures how many ducks I got out of the total); but if in the process I also mistakenly shot 60 geese, then I'd have 50% precision (which measures how good I was at shooting actual ducks and not something else)
I could be more cautious and only shoot those that I'm sure are ducks, maybe then I'd get 20 ducks and 0 geese, then I'd have 20% recall and 100% precision
f1 is like a healthy blend of precision and recall
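
A minimal sketch of the duck-hunting numbers above, assuming the usual definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = harmonic mean of the two). The function name `precision_recall_f1` is made up for illustration.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) computed from raw counts."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# 100 ducks in total: 60 ducks shot, 60 geese shot by mistake, 40 ducks missed.
aggressive = precision_recall_f1(60, 60, 40)

# Cautious hunter: 20 ducks shot, 0 geese, 80 ducks missed.
cautious = precision_recall_f1(20, 0, 80)

# House-fire model that always says "no fire": 1 real fire in 100 houses.
always_no_fire = precision_recall_f1(0, 0, 1)

print(aggressive)
print(cautious)
print(always_no_fire)
```

The always-"no fire" model scores 99% accuracy but zero on all three of these, which is the imbalance problem described above.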

toxic mortar
jaunty helm
toxic mortar
#

Like why would I explicitly look for f1,recall and precision when I can see them from the cm?

ionic valley
#

So I've finished my Youtube Dislikes Predictor with linear regression, but I'd like to expand upon it. In particular, I am looking at these two unused features. From a glance, am I likely to get any possible insight from these variables at a significant level? If so, what techniques do you recommend I try?

toxic mortar
#

Also research papers might be pretty faked. Am i missing some metrics that can evaluate attempt

#

And help me identify weak links within the model

narrow tiger
#

if i am using ollama locally
does it restart the specified model each time i make a request?
or are all the models that I pulled always running
i can see that ollama is always running

sick raft
hallow sphinx
#

How does go mod tidy search for the module? Like in what order?

arctic silo
#

I want to transfer a large amount of data from Kaggle into a cloud database. How can I do it? I want to build a pipeline.

spare magnet
#

is it fine to learn nlp even i dont master python yet

unkempt apex
spare magnet
unkempt apex
#

no one have mastered yet !

#

go then start NLP!

hearty token
#

I wonder how good it is to fine tune a small English pre trained model for a non-english language task

#

Or if it would just be better to pre train a very small model entirely on the other language and specialize the task

small wedge
#

Depends on how close the language is to English, but I'd say it's probably better to train a model from scratch

valid basalt
#

Hi everyone, Im Anna. I'm currently finishing my MBA with a focus on quantitative finance and for my dissertation I'd like to do a paper on "Machine Learning for Financial Market Forecasting: Unveiling Hidden Patterns for Informed Decisions" using LLM. I have a grounding in data science and I'm currently completing a course on the subject. However, I need help with the practical part of the model and training. If someone is interested in the subject and wants to help me, feel free to write me inbox so we can talk about it. Thanks

cloud flower
#

Could someone help me get started with the first task please? 🙂

#

This is how far I've gotten 🤦:

class Interval:
    def __init__()
spring field
#

is it meant to say that c or d can't be 0 or that there can't be a 0 in that range? cuz I don't see why there couldn't be a 0 somewhere in that range

spring field
cloud flower
cloud flower
spring field
#

well, let's break it down a bit
well, first, how much do you know about classes in general?
how many instance attributes would a single interval need?

#

or even earlier, how many arguments would __init__ need to take? (excluding self and assuming a simple case like the task suggests)

cloud flower
#

it's just a bunch of information, would help if it was broken down

spring field
#

well, naively speaking, you pretty much need to grasp this bit

#

like, what part of the math do you not understand? cuz I find it explained relatively clearly

serene scaffold
spring field
#

tbf, they are technically involved for the first two as well, but they can be simplified in those two cases

pallid badge
spring field
#

for multiplication I can think of cases where the smallest (left) values can multiply and return a larger value than other combinations

#

same for division

#

like, you can't just simplify them to simple arithmetic because there are several possible outcomes depending on the values used

#

I suppose they can be alternatively written as piecewise functions as well
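
The task statement itself is not shown in the log, so the following is only a sketch under the assumption that it asks for standard closed-interval arithmetic on [a, b] and [c, d]. The class layout and method choices are illustrative, not the course's required solution. It shows why multiplication and division need the min and max over all four endpoint combinations, and why a divisor interval containing 0 is rejected.

```python
class Interval:
    """Closed interval [lo, hi] with basic interval arithmetic."""

    def __init__(self, lo, hi):
        if lo > hi:
            raise ValueError("lower bound must not exceed upper bound")
        self.lo = lo
        self.hi = hi

    def __repr__(self):
        return f"Interval({self.lo}, {self.hi})"

    def __eq__(self, other):
        return isinstance(other, Interval) and (self.lo, self.hi) == (other.lo, other.hi)

    def __add__(self, other):
        # Smallest sum uses both lower bounds, largest uses both upper bounds.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Smallest difference subtracts the largest value of `other`.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # Signs can flip the ordering, so check all four endpoint products.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        # If 0 lies in [c, d] the quotient is unbounded, so refuse.
        if other.lo <= 0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains 0")
        quotients = [self.lo / other.lo, self.lo / other.hi,
                     self.hi / other.lo, self.hi / other.hi]
        return Interval(min(quotients), max(quotients))


total = Interval(1, 2) + Interval(3, 4)
difference = Interval(1, 2) - Interval(3, 4)
# Here the two LEFT endpoints (-3 and -5) give the largest product.
product = Interval(-3, -2) * Interval(-5, 1)
quotient = Interval(1, 2) / Interval(2, 4)
print(total, difference, product, quotient)
```

So `__init__` takes two arguments besides `self`, and each instance stores two attributes, the lower and upper bound.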

cloud flower
#

@spring field could you dumb this down for me

#

for starters

spring field
#

I don't even know where to begin breaking it down
like, do you have at least some intuition on what an interval is or what +/- is?

cloud flower
#

yes, [a, b] could mean that there's some parameter t that is within the bounds of a and b, e.g. a≤t≤b

#

and yes, i know how arithmetic operators like + and - works

#

does the task have to do with error margins?

calm osprey
spring field
spring field
#

I don't understand why there can't be a 0 somewhere in the range (a, b)

wooden sail
#

i don't see any question at all, the first image is all definitions

spring field
#

well, you replied to my question

calm osprey
wooden sail
#

naturally if you want to divide a by b, b can't be 0

cloud flower
#

Overwhelming the student

#

Teachers need to learn to condense the content

wooden sail
#

if 0 is in the interval, you'd have to split it in two because 0 wouldn't be in the domain anyway

spring field
wooden sail
#

you'd also have half open intervals

iron basalt
cloud flower
wooden sail
spring field
wooden sail
cloud flower
#

This task is problematic

#

The course too

iron basalt
#

When you do allow division by zero, it's usually boring (e.g. all numbers are now equal to zero).

cloud flower
#

I’m abouta give my teacher some problems

iron basalt
#

(An interval)

cloud flower
#

TLDR here’s how you can find out how to calculate the error margin, or the range within which the value you’re seeking is in?

iron basalt
#

(Or with more dimensions, a bounding box)

calm osprey
iron basalt
spring field
iron basalt
#

Like if I throw a ball, and want to say it could end up in this interval, depending on this interval of mass, starting velocity, etc. And I want to calculate that interval.

calm osprey
#

It’s complex but you can find more there
It’s not okay if I post the link here

cloud flower
spring field
#

sure, but I'm not sure how that's exactly related to the topic at hand

cloud flower
#

me neither

calm osprey
cloud flower
#

Alright bro but what does it have to do with the theory

calm osprey
calm osprey
spring field
wooden sail
#

it's just formulated in a way you're not used to

#

you've done this all along whenever you got those questions about domain and range of functions

#

elementary arithmetic operators are binary functions too, and this shows one way of studying the domain and range

spring field
#

yeah, but using two ranges arithmetically threw me off a bit

calm osprey
#

@cloud flower
It’s proportional to the actual figure by a slight difference

cloud flower
#

When you guys are done chatting about the math stuff I will repost my question and what progress I’ve made🖐️

calm osprey
#

Just remembered 🧠 my brains now working harder

mental rampart
#

why does tensorflow above 2.10 not have native gpu support on windows

umbral delta
#

so i have this very simple code: `print(data["rank"])`, `raster["rank"] = data["rank"]`, `print(raster["rank"])`, which somehow outputs
```
idx
0 65.686275
1 77.450980
2 80.392157
3 37.254902
4 68.627451
...
576 68.000000
577 51.000000
578 46.000000
579 83.000000
580 84.000000
Name: rank, Length: 581, dtype: float64
0001_U_0018_2010 NaN
0010_L_0002_2010 NaN
0012_L_0048_2010 NaN
0013_U_0016_2010 NaN
0015_L_0050_2010 NaN
..
0072_L_0031_2010 NaN
0008_U_0012_2010 NaN
0009_L_0086_2010 NaN
0009_U_0009_2010 NaN
0009_U_0034_2010 NaN
```

left tartan
#

Because the indices aren't aligned

#

Reset the index of raster then see what happens
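
A small sketch of the alignment behaviour behind those NaNs, assuming pandas is installed. The two frames are hypothetical stand-ins: `data` has the default 0..n-1 index, `raster` is indexed by string IDs like the ones in the output.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question.
data = pd.DataFrame({"rank": [65.7, 77.5, 80.4]})  # index 0, 1, 2
raster = pd.DataFrame(
    {"cell": [10, 20, 30]},
    index=["0001_U_0018_2010", "0010_L_0002_2010", "0012_L_0048_2010"],
)

# Column assignment aligns on index LABELS. No label of `data` appears in
# `raster`, so every row of the new column is NaN.
misaligned = raster.copy()
misaligned["rank"] = data["rank"]

# Option A: reset raster's index so both frames share 0..n-1.
by_reset = raster.reset_index(drop=True)
by_reset["rank"] = data["rank"]

# Option B: keep raster's index and assign by position instead of by label.
by_position = raster.copy()
by_position["rank"] = data["rank"].to_numpy()

print(misaligned)
print(by_reset)
print(by_position)
```

Both options assume the rows of the two frames are already in the same order and have the same length; if they are not, the frames need a shared key to join on.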

rich moth
#

damn guys, I dont think its possible to run this on my system. I've tried a lot of different things, but I would have to find something with more GPU memory.

```
Traceback (most recent call last):
  File "/home/plunder/CATMANDODO63.py", line 692, in <module>
    main()
  File "/home/plunder/CATMANDODO63.py", line 687, in main
    model = train(model, train_dataloader, optimizer, criterion, tokenizer, device, epochs, val_dataloader, num_frames)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/plunder/CATMANDODO63.py", line 423, in train
    scaler.step(optimizer)
  File "/home/plunder/miniconda3/envs/qusar/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py", line 416, in step
    retval = self._maybe_opt_step(optimizer, optimizer_state, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/plunder/miniconda3/envs/qusar/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py", line 315, in _maybe_opt_step
    retval = optimizer.step(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/plunder/miniconda3/envs/qusar/lib/python3.11/site-packages/torch/optim/optimizer.py", line 373, in wrapper
    out = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/plunder/miniconda3/envs/qusar/lib/python3.11/site-packages/torch/optim/optimizer.py", line 76, in _use_grad
    ret = func(self, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/plunder/miniconda3/envs/qusar/lib/python3.11/site-packages/torch/optim/adam.py", line 163, in step
    adam(
  File "/home/plunder/miniconda3/envs/qusar/lib/python3.11/site-packages/torch/optim/adam.py", line 311, in adam
    func(params,
  File "/home/plunder/miniconda3/envs/qusar/lib/python3.11/site-packages/torch/optim/adam.py", line 565, in _multi_tensor_adam
    exp_avg_sq_sqrt = torch._foreach_sqrt(device_exp_avg_sqs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 GiB. GPU 0 has a total capacty of 23.99 GiB of which 0 bytes is free. Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. Of the allocated memory 66.75 GiB is allocated by PyTorch, and 15.41 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
(qusar) plunder@localhost:~$
```

I feel like the code is right there, I just dont have the resources to run it.
#

I tried so many different things. But I need commercial equipment at this point I guess. Switching the batch size, number of frames, accumulation steps in the training.. I can get it to go for a bit, but it runs out of memory.

#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

rich moth
#

Anyone have any suggestions?

#

Maybe I can use a gpt2 distill instead...

agile cobalt
rich moth
#

like 10-15 seconds.

agile cobalt
#

how many fps?

rich moth
#

25/s

agile cobalt
#

so you are effectively working with a batch size of 375, using an extremely large model in consumer grade GPUs?

#

I'm not sure if I understood your loading code

rich moth
#

I had a batch size of 16, tried to reduce it all the way down to 2. It runs for a bit. Input to VideoEncoder: batch_size=2, num_frames=16, channels=3, height=128, width=128

#

So no 375 batch sizes 🙂

agile cobalt
#

I must have misunderstood how it integrates with the loader then

lapis sequoia
#

is this the correct way to do sentiment analysis

#

from textblob import TextBlob

def polarity(text):
    return TextBlob(text).polarity

df['polarity'] = df['lyric'].apply(polarity)

def sentiment(label):
    if label < 0:
        return "Negative"
    elif label == 0:
        return "Neutral"
    elif label >= 0:
        return "Positive"

df['sentiment'] = df['polarity'].apply(sentiment)

rich moth
#

I think I got it to work incorporating gradient accumulation and tensor management.

#

Keep my fingers crossed!
Epoch 1/1: 16%|█████████ | 231/1402 [15:59<1:33:01, 4.77s/batch, Batch Loss=0.00133]
Input to VideoEncoder: batch_size=4, num_frames=8, channels=3, height=128, width=128
After view reshape: torch.Size([4, 24, 128, 128])
After conv2d_layers: torch.Size([4, 512, 128, 128])
After view reshape before fc: torch.Size([4, 8388608])
After fc layer: torch.Size([4, 512])
Input to VectorQuantizer: torch.Size([4, 512])
After flattening and reshaping: torch.Size([32, 64])
Distances shape: torch.Size([32, 512])
Encoding indices shape: torch.Size([32])
Quantized tensor shape: torch.Size([4, 512])
Commitment loss: 1.738540959195234e-05, Codebook loss: 1.738540959195234e-05, VQ loss: 2.1731761080445722e-05
Input to VideoDecoder: torch.Size([4, 512])
After fc layer: torch.Size([4, 131072])
After view reshape: torch.Size([4, 512, 16, 16])
After conv_reduce: torch.Size([4, 512, 16, 16])
After conv2d_transpose_layers: torch.Size([4, 24, 128, 128])
Channels: 3, Expected size: 1572864, Actual size: 1572864
Final output shape: torch.Size([4, 8, 3, 128, 128])
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Epoch 1/1: 17%|█████████ | 232/1402 [16:04<1:32:57, 4.77s/batch, Batch Loss=0.00223]
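
A back-of-the-envelope check on the shapes printed in this log, offered as a guess at where the memory goes. The model code is not shown, so this assumes the step from `[4, 8388608]` to `[4, 512]` is a single dense layer stored in float32 and trained with Adam. Under those assumptions the layer's weights alone come to the same 16 GiB figure as the "Tried to allocate 16.00 GiB" in the earlier traceback.

```python
BYTES_PER_FLOAT32 = 4
GIB = 1024 ** 3

# Shapes copied from the log: the conv output [batch, 512, 128, 128] is
# flattened to [batch, 8388608] and mapped to [batch, 512].
flattened_features = 512 * 128 * 128
fc_outputs = 512

# Assumption: one dense layer with a full weight matrix (biases ignored).
fc_weights = flattened_features * fc_outputs
weight_gib = fc_weights * BYTES_PER_FLOAT32 / GIB

# Assumption: float32 training with Adam, which keeps the weights, their
# gradients, and two moment buffers of the same shape.
adam_training_gib = 4 * weight_gib

print(flattened_features, fc_weights, weight_gib, adam_training_gib)
```

If that reading is right, batch size and frame count barely matter for this layer, since its cost is fixed by the flattened size. Shrinking the feature map before flattening (pooling or strided convolutions) would be the lever to try.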

peak ridge
#

How much math, is enough math
to get started

#

i've studied a bit of calculus in high-school, also linear algebra,
do u have any specific resource for Stats and Probability to be specific? @final kiln

#

the problem with khan academy is

#

they have tooo toooo tooo deept in content

#

it literally took me idk how many hours to just do 1 out of 3 unit of Linear Algebra

#

also,
_ practicing frequently with py and jupyter notebooks_
how can i practice math with kaggle 🧐 i mean from python/coding itself

#

how can i practice math from coding

#

ohh

#

how can i do stats and probability from python

#

ohh

#

okay

#

ohh

devout python
#

Hey folks - I have to routinely run some data manipulation of my postgres database (likely once per day), I am considering making a python script deployed on k8 as a cronjob, but it feels a bit like overkill. Are there simpler methods?

hallow sphinx
#

Why do peeps code AI in notebook, and not just python files like every programmer does?

mint goblet
#

Hey there im using Autogen AI conversable agents to generate images. In my output i have a few urls that i need to get out of the conversation but i can't seem to find a solution for this problem. In the image you can see one of the agents talking and giving an output url "ai_generated_img" with the link that i want to save in a variable

hallow sphinx
#

Can we consider supervised learning, as where result is known, and unsupervised where the output is not known?

agile cobalt
#

it's not necessarily "known" / "unknown", but rather labelled or not

take outlier detection an example - you may train a model without telling it which records are outliers, then validate that it is catching all records you know that are outliers

another example would be the unsupervised pre-training step of GPT models and alike, I wouldn't consider there to be any unknowns, but it's still considered unsupervised
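
A toy sketch of the outlier-detection example above, using only the standard library. The detector is fitted on unlabelled readings (a median/MAD rule, chosen here purely for illustration), and the known outliers are used only afterwards to validate it. The data and the names `fit_outlier_detector` and `known_outlier_indices` are made up.

```python
import statistics


def fit_outlier_detector(values, threshold=3.5):
    """Fit a median/MAD rule from unlabelled values and return a predicate."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)

    def is_outlier(value):
        if mad == 0:
            return value != median
        modified_z = 0.6745 * (value - median) / mad
        return abs(modified_z) > threshold

    return is_outlier


# Training data: no labels are passed to the detector.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 55.0, 10.2, 9.7, -30.0, 10.1]
is_outlier = fit_outlier_detector(readings)

# Validation: labels we happen to know, used only to check the result.
known_outlier_indices = {5, 8}
flagged = {i for i, value in enumerate(readings) if is_outlier(value)}
caught_all_known = known_outlier_indices <= flagged

print(flagged, caught_all_known)
```

The labels exist and are "known", yet the training step never sees them, which is why this still counts as unsupervised.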

hallow sphinx
#

I was reading a book "100 pages ML book" and it has this formula of Support Vector Machine (SVM)

y = sign(wx - b)

Then it continues to say that
```math
wx - b >= +1 if y = +1, and
wx - b <= -1 if y = -1
```

But shouldn't it be using limits here?
Since, `sign(0.01) = +1` and `sign(-0.01) = -1`
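
For reference, a hedged restatement in LaTeX, assuming the book is describing the usual hard-margin SVM. The `sign` rule is how a trained model predicts, so any positive value does map to +1. The ±1 inequalities are constraints placed on the training points, conventionally obtained by fixing the scale of w and b so that the closest points sit exactly on wx - b = ±1.

```math
\begin{aligned}
\text{prediction:}\quad & y = \operatorname{sign}(\mathbf{w}\mathbf{x} - b) \\
\text{training constraints:}\quad & \mathbf{w}\mathbf{x}_i - b \ge +1 \ \text{if } y_i = +1, \qquad \mathbf{w}\mathbf{x}_i - b \le -1 \ \text{if } y_i = -1 \\
\text{combined:}\quad & y_i\,(\mathbf{w}\mathbf{x}_i - b) \ge 1 \\
\text{margin width:}\quad & \frac{2}{\lVert \mathbf{w} \rVert}
\end{aligned}
```

Under that reading no limit is needed: a new point with wx - b = 0.01 is still classified as +1, it just falls inside the margin that the training points were kept out of.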
wooden sail
hallow sphinx
#

Do you have to remember lots of formulas for ML?

serene grail
#

I'm not an ML guy, but I think in math in general what matters is understanding the concepts behind the formula, not memorizing the formula

hallow sphinx
#

o.O

wooden sail
#

i agree with that

#

i was also about to write this as well. it just happens inevitably

tepid pecan
#

So... I have two training/learning questions. So my IT (A) and a co-worker (B) would like to learn python data entry, transformation, and clean up.
The packages I know about are scikit-learn, scipy, and pandas.

#

Q1) I primarily write in a mix of R and python (using both together), but I use pipe notation in R (see link). My boss also uses R notation, but does not know pandas. Only myself and IT person A have used pandas. https://www.r-bloggers.com/2021/05/the-new-r-pipe/
Do the scientific python packages have a general pipe operator or function for general python and data clean up? Because I'd like to convert my code over to pure python so they can learn better.

#

Q2) Are there any new scientific packages or resources since 2022 that people recommend to make coding easier?

stable rover
#

how does YOLO object detection predict bounding boxes? i understand that it uses a grid and each cell predicts a bounding box(es) and class but i don't understand how its possible. does each grid cell run its own classifier so to speak?

frigid cove
#

Is it possible to fine-tune a model using 100 classes, in a laptop with a RTX 4060?

cosmic lynx
#

what is the least jarring way to step into learning more about the engineering side of AI?
or do I just need to go headfirst into learning the scary half of calculus?
I know a little bit about the big three (most comfortable in Calc, okay in stats and shaky in linear algebra)

wooden sail
#

i would argue that the linalg is the most important component, and the stats right after

#

the calc is often used more as a tool in helping out with getting nice results for the other two

cosmic lynx
#

okay thank you

lapis sequoia
#

Can running neural networks dismantle your pc?

#

And, when is ImageDataGenerator better? Under what circumstances?

lapis sequoia
cosmic lynx
past meteor
#

Being a consumer of GenAI libraries and services needs absolutely 0 math

#

And it's, for better or for worse, what many companies consider to be the "engineering side of AI"

lapis sequoia
cosmic lynx
cosmic lynx
lapis sequoia
#

I thought calc 3 was a walk in the park. Like, dirt easy.

#

Ok, like what then?

#

Who said I didn’t?

#

No, I just took so many partials and optimization problems with constraints to the point it was easy when I took it.

#

Bro, I took advanced calculus 1 and a grad level optimization class. What I meant by hard, depends on the person. I don’t understand finance at all, but calculus never ever gave me a problem.

#

Ok, give me a problem. I don’t know how to answer that.

#

Like, end of calc 3 with cramers rule? I don’t know. Not even bad

#

All I am saying is it just depends on the person. I agree, abstract algebra is cancer, I hate Game Theory, but that is because there are not as many books on it as there are for calculus, matrix and linear algebra, stats/probability. I just think people should decide for themselves.

#

I don’t remember much, this was a while ago. I took calc1-3 in 2016-2017, I just remember game theory was insanely hard to me.

serene grail
#

Sounds tough as hell.
But I've also heard that real analysis is very hard in general, personally I have no idea what it is

#

That sucks

lapis sequoia
#

Real analysis is stupid hard tho in all honesty

#

Stats, like I took stats 1-3, metrics 1-3, did well in all of them and I don’t remember anything from it. Like, basic concepts.

#

It’s not that I don’t get it, it’s just easier to explain concepts. I don’t know calc1-3, matrix and linear algebra, optimization were always much easier to me compared to even like finance stuff. I swear, it took me so long to understand the concept of a bond.

obsidian mesa
#

Is udemy free course on AI/ML is recognized as advertisement?

#

I mean I am new here, not much aware of rules, hence asking...

past meteor
#

Can you remove this message? ads are against the rule and it's an advertisement

lapis sequoia
#

Any of you ever take Game Theory? It is a literal branch of mathematics.

wary cosmos
#

The decoder in a transformer is trained on a whole sentence at once. It has a mask during self attention to prevent an earlier timestep from looking forward and directly seeing what it should be outputting.

As far as I am aware the feedforward layer near the end is just an MLP, and they work by having every neuron in a subsequent layer see all the outputs of the previous layer.

How is looking forward also prevented in the feedforward layer?
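
One way to see it, assuming the standard Transformer design where the feed-forward block is position-wise: the same small MLP is applied to each position's vector separately, so it never mixes positions and needs no mask. The pure-Python sketch below uses toy weights and hypothetical names to show that changing a later position leaves earlier outputs untouched.

```python
def feed_forward(token_vector, w1, b1, w2, b2):
    """Two-layer MLP with ReLU, applied to ONE position's vector."""
    hidden = [
        max(0.0, sum(x * w for x, w in zip(token_vector, unit_weights)) + bias)
        for unit_weights, bias in zip(w1, b1)
    ]
    return [
        sum(h * w for h, w in zip(hidden, unit_weights)) + bias
        for unit_weights, bias in zip(w2, b2)
    ]


def position_wise_ffn(sequence, w1, b1, w2, b2):
    """Apply the same MLP to every position independently."""
    return [feed_forward(vector, w1, b1, w2, b2) for vector in sequence]


# Toy weights: model width 2, hidden width 3 (each row is one unit's weights).
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 1.0, 1.0], [0.5, -0.5, 0.0]]
b2 = [0.0, 0.1]

sequence = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
changed_future = [[1.0, 2.0], [3.0, 4.0], [-9.0, 9.0]]  # only position 2 differs

out_a = position_wise_ffn(sequence, w1, b1, w2, b2)
out_b = position_wise_ffn(changed_future, w1, b1, w2, b2)

earlier_positions_unchanged = out_a[:2] == out_b[:2]
last_position_changed = out_a[2] != out_b[2]
print(earlier_positions_unchanged, last_position_changed)
```

Every neuron does see all outputs of the previous layer, but only within a single position's vector; the mixing across positions happens in attention, which is where the mask lives.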

#

Thx did not know that

I assume the encoder is also only by embedding?

#

Thx

lapis sequoia
#

Thank you for acknowledging it as a branch of mathematics. It was until the 70s or something and all the pioneers of game, pretty much had an insane influence or partial differential equations.

wary cosmos
#

Isn’t game theory just a complicated optimization function

#

Broadly speaking

iron basalt
#

Invented / pushed by the cold war to beat the soviets.

#

(Its modern form (although like all math, you can find it waaaay earlier))

lapis sequoia
#

Bertrand Duopolies were just limits pretty much, but for price wars. Industrial Organization is my favorite topic of all time.

iron basalt
#

It's really fun and will change your perspective on all the actions taken by nations and such (a better understanding of why they are doing what they do (they calculate things, especially in wars)).

lapis sequoia
#

Yes

lapis sequoia
lapis sequoia
lapis sequoia
# iron basalt Combinatorial game theory.

Do you know what I am talking about tho? I get they are using combinations of all possible moves, but, the person could not move at a specific node so I never understood roll they got second mover advantage in rollback equilibrium.

odd meteor
iron basalt
narrow tiger
#

so i am trying to create custom tool for agent but why does it get stuck like this?

#

what does this even mean. it runs the tool perfectly

#
```python
@tool
def get_prices(query: str):
    """Can be used to get current market prices of any crypto asset"""
    return 69.69

prices_tool = Tool(name='crypto prices', func=get_prices, description="Can be used to get current prices of crypto assets")

x = [prices_tool]
agent = initialize_agent(x, llm, AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_parsing_errors=True)
```
This is how i am using it (hardcoded for now bcz it is faster)
#

LLMs have totally replaced regex for me

serene scaffold
narrow tiger
#

no like most of the times llms are just using a natural language string to get some data from that string

#

used to do regex for that and now everyone wants to use llm for it

#

also llm write regex really good for some reason,

buoyant vine
#

🤨

lapis sequoia
lapis sequoia
junior ibex
#

Anyone understand why the d2 vector for language is 0, and why on d1 “what” is 0.25 and candy 0.125 if they both show up once?

iron basalt
lapis sequoia
iron basalt
lapis sequoia
lapis sequoia
iron basalt
lapis sequoia
#

(Up,low)

iron basalt
#

White plays low if black plays up, and white plays high if black plays down. This is the only subgame perfect equilibrium.
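
The game tree being discussed is an image that is not in the log, so the payoffs below are invented. They are chosen only so that backward induction reproduces the strategy described here (White answers Up with Low and Down with High) and the (Up, Low) outcome mentioned above. Payoffs are written (Black, White), with Black moving first.

```python
# Hypothetical payoffs, written (Black, White).
payoffs = {
    ("Up", "High"): (0, 0),
    ("Up", "Low"): (2, 1),
    ("Down", "High"): (1, 2),
    ("Down", "Low"): (0, 0),
}
black_moves = ["Up", "Down"]
white_moves = ["High", "Low"]

# Step 1: solve each subgame. For every Black move, find White's best reply.
white_reply = {
    black: max(white_moves, key=lambda white: payoffs[(black, white)][1])
    for black in black_moves
}

# Step 2: Black anticipates those replies and picks the best branch.
black_choice = max(black_moves, key=lambda black: payoffs[(black, white_reply[black])][0])

outcome = (black_choice, white_reply[black_choice])
print(white_reply, outcome, payoffs[outcome])
```

With these made-up numbers the first mover ends up with the larger payoff, which is one simple picture of first-mover advantage.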

lapis sequoia
#

Yes.

#

!but that is purely sequential. I am talking when it becomes simultaneous

arctic wedgeBOT
#
Microsoft Visual C++ Build Tools

When you install a library through pip on Windows, sometimes you may encounter this error:

error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/

This means the library you're installing has code written in other languages and needs additional tools to install. To install these tools, follow the following steps: (Requires 6GB+ disk space)

1. Open https://visualstudio.microsoft.com/visual-cpp-build-tools/.
2. Click Download Build Tools >. A file named vs_BuildTools or vs_BuildTools.exe should start downloading. If no downloads start after a few seconds, click click here to retry.
3. Run the downloaded file. Click Continue to proceed.
4. Choose C++ build tools and press Install. You may need a reboot after the installation.
5. Try installing the library via pip again.

lapis sequoia
iron basalt
#

It's first mover advantage.

#

This has second mover advantage:

lapis sequoia
#

Oh, white has the payoffs to the right in that game.

iron basalt
#

Yeah.

#

(Black, White)

lapis sequoia
#

Yeah, I was reading it (white, black) in terms of payoffs

serene grail
#

I was a bit confused too, used to the chess order

lapis sequoia
lapis sequoia
#

I do get it, it is just weird when the game starts out sequential and turns simultaneous.

lapis sequoia
#

I drew the starting game table and worked out what happens when each person moves first. They have the same payoffs depending on the order of the moves and there is no pure strategy Nash equilibrium in the original game.

iron matrix
#

Hi! Apologies if this is a dumb question, I'm not a data scientist, just a programmer 🙂

#

I have two 2D spaces. In each space there are some number of points, each identified by (x,y) and some (text) label. The labels are not guaranteed to be unique (there might be two points in the same space with the same label), and there's no guarantee that every label in one space also exists in the other, but there will be a lot of correlation.

#

What I want to do is find a best-match transformation matrix to convert coordinates from one space into the other. This seems like the sort of thing that ought to exist (image processing has to have this covered, right?), but I'm not sure what I should be looking for. Can someone point me in the right direction, please?
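
One possible starting point, sketched with the standard library only: pair up points whose labels appear in both spaces, then fit an affine map by least squares. Useful search terms are "point set registration" and "affine least squares"; OpenCV's `cv2.estimateAffine2D` is a robust off-the-shelf alternative. This sketch assumes labels are unique within each space and that the matches are correct, which the question says is not guaranteed, so treat it as the simple case. `solve_3x3` and `fit_affine` are illustrative names.

```python
def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("points are degenerate (for example all collinear)")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    solution = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(a[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (a[row][n] - tail) / a[row][row]
    return solution


def fit_affine(pairs):
    """Least-squares affine map from source (x, y) to target (u, v).

    Returns two rows [a, b, c] such that u = a*x + b*y + c, likewise for v.
    Needs at least three matched points that are not collinear.
    """
    normal = [[0.0] * 3 for _ in range(3)]
    rhs_u = [0.0] * 3
    rhs_v = [0.0] * 3
    for (x, y), (u, v) in pairs:
        basis = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                normal[i][j] += basis[i] * basis[j]
            rhs_u[i] += basis[i] * u
            rhs_v[i] += basis[i] * v
    return solve_3x3(normal, rhs_u), solve_3x3(normal, rhs_v)


# Made-up example: the target space is the source scaled by 2 and shifted by (2, 3).
source = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0), "D": (1.0, 1.0)}
target = {"A": (2.0, 3.0), "B": (4.0, 3.0), "C": (2.0, 5.0), "D": (4.0, 5.0), "E": (9.0, 9.0)}

# Only labels present in both spaces contribute a pair; "E" is ignored.
pairs = [(source[label], target[label]) for label in source.keys() & target.keys()]
row_u, row_v = fit_affine(pairs)
print(row_u, row_v)
```

Duplicate labels and wrong matches would need something more robust on top, such as trying candidate pairings and keeping the fit with the most inliers (the RANSAC idea).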

lapis sequoia
# iron basalt Yeah.

not trying to blow you up about game man, I just kind of forgot how much fun it was (and how much fun just doing math on paper was instead of programming all of the time) but like, with grim trigger, can that be used to detect if a company will collude or defect on a collusion data set?

stable rover
#

how are convolutional layers (output tensor) flattened into fully connected layers?

#

for example, in here, are there really 1690 * 10 * classes (10) weights to learn just for the final layer?

opal magnet
#

Anyone here

#

And would be willing to help

unkempt apex
#

everyone is!

lapis sequoia
#

its just 13x13x10 flattened into 1690 inputs, and 10 outputs, so the weights are 1690*10 matrix
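
The arithmetic above, spelled out as a tiny sketch. The only assumption beyond the message is one bias per output, which the message does not mention.

```python
height, width, channels = 13, 13, 10  # conv output shape from the example
num_classes = 10

# Flattening just lays the 13 x 13 x 10 values out in one long vector.
feature_map = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
flat = [value for row in feature_map for cell in row for value in cell]

flattened_inputs = len(flat)
weights = flattened_inputs * num_classes  # one weight per (input, class) pair
biases = num_classes                      # assumption: one bias per output
total_parameters = weights + biases

print(flattened_inputs, weights, total_parameters)
```

So the final layer has 1690 * 10 weights, not 1690 * 10 * 10: the factor of 10 for the classes is already the second dimension of the weight matrix.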

toxic mortar
#

Hi guys,

I've integrated a RAG LLM-based OpenAI chatbot into my app. Now, I want to implement algorithms within the chatbot, like procedural steps that my chatbot will do sequentially. For example, the chatbot first asks the user for name, then save that information, then ask for their age, then again saves it, and finally when he has all the necessary informations he computes something based on this and then provides the result.

Is there a tool or method to achieve this within Langchain framework? Or can you point me to some keyword name. Thanks 😄

fiery vigil
#

Hi everyone,

a very basic question from me, to decide between two implementations of Tensorflow: C API or Python?

I understand that Python is still implementing code written in C, but I am wondering if the size of data I am dealing with, makes it more time-consuming? I am trying to simulate a recursive iteration process of the type:

x[i+1] = f(x[i])

where x[i] is a vector with millions of components. Conventional iterative methods lead to "stagnation", kind of like vanishing gradient. So x[i+1] ~ x[i] after a while. I have tried updating the iteration scheme itself, include higher-order terms, but it is already slower.

So I was thinking maybe with some architecture of NNs, it might be possible to "jump ahead" in the iteration scheme, to accelerate convergence and get out of stagnating solutions.

Now comes the issue of what implementation to use: the standard, well-documented Python Tensorflow, or the C API that barely has any documentation, just a loose clump of github pages. I thought that even if training is done in C/C++, for a NN defined in Python, maybe it would be faster? Or is it not so? Even if there's some degree of speed-up in C++, is it possible to implement it as quickly and consistently as in Python (using Python 3, if that matters). All CPU too, btw, no GPU. Got a whole cluster of hundreds of CPUs.

Thank you for any help/insights/suggestions!

left tartan
fiery vigil
left tartan
#

But, perhaps in some isolated use case, one might find ways of improving... but I'd think you'd be at some expert level and a year or more of experience

fiery vigil
# left tartan I don't have direct experience, so I'm speaking to generalities: the Python inte...

Yeah, I just need to read it from others with more experience. I have tried Keras here and there, and now I have to go a level lower and use Tensorflow directly. I hope whatever model is implemented in Python can handle the data. Of course, people do image processing and all in Tensorflow Python, but this recursive iteration scheme might need some matrix multiplication operations (that are already millions x millions ~ quadrillions of unique elements).

wooden sail
#

matrix multiplication never happens directly in python

tidal bough
fiery vigil
tidal bough
#

...what do you mean? Are you writing your own matrix multiplication function from scratch?

#

Since numpy's matmul implementation is already in a compiled language, and can use multiple cores if you have the right numpy build (I think it has to be the MKL one and not the BLAS one).

fiery vigil
tidal bough
#

Ah, I see. Is your matrix sparse?

fiery vigil
tidal bough
#

Hmm, and yet it works better to manually split it into chunks than to let numpy handle it?

fiery vigil
tidal bough
#

Oh wait, quadrillions of elements - so you can't even- yeah, okay, that makes sense

#

I wonder if something like dask has a streaming matmul implementation that'd work here, but it makes sense that you made your own. Anyway, naively it seems to me like there's nothing there that can't be done efficiently in python (do a partial matmul via numpy, afterwards (or even in the process) start loading the next chunk, etc), but I might not be thinking about some details of your implementation.

fiery vigil
#

Using for loops to cut chunks then parallelise the chunks. For loops already bad enough. OTOH, for loops in C are just simple, all static data types. Another advantage of C is that now I don't need to define the whole matrix. Just parallelise the for loops, do some aggressive compilation, and it works out. Was hoping to get that compiler magic for NN training too.

tidal bough
#

For loops already bad enough
I don't really get why. The overhead of things like looping in python becomes noticeable when the body of the loop takes very little time - but for you it's a pretty big matmul, so it seems to me that it shouldn't matter.

fiery vigil
#

I did the chunks thing in python first. It is painful. I was thinking dask etc, but too much abstraction is going to make it slower. Trying to keep it simple. I wanted to try and "strip" all the OOP stuff on numpy matrices, if that made it faster. Was thinking of Cython because of it. Had some discussion here, found even Cython is at best a bit slower than pure C, so switched to C.

fiery vigil
tidal bough
#

oh, if you're using multiprocessing I actually have a guess what was happening - it was probably the fact that arguments to functions generally need to be serialized in order to send them to another process. So you might have been eating the serialization overhead on all of that data, which is quite a lot.
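
A rough illustration of that overhead, assuming the chunks are passed to workers as ordinary function arguments (which `multiprocessing` pickles). The array size is made up for the example.

```python
import pickle
from array import array

chunk = array("d", [0.0] * 100_000)      # 100k float64 values of raw data
payload = pickle.dumps(chunk, protocol=pickle.HIGHEST_PROTOCOL)

raw_bytes = chunk.itemsize * len(chunk)
print(raw_bytes, len(payload))            # the pickled payload is at least as large as the data
```

Every call that ships a chunk to another process pays for building and copying a payload of this size, on top of the actual computation.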

fiery vigil
#

Now with tensorflow, the situation is different. Not seeing much discussion on the C API, just bits and pieces scattered around. The official Tensorflow website has only a barebones page on the C API as well. So I have to contend with doing this in Python.

tidal bough
#

I don't know much about the tensorflow C API (when I tried using libtorch my experience was that there was basically no docs about the C++ side and I had to read the python ones instead, so it's not much better either), but since you're basically just working with raw arrays, all you really need is to extract the pointer to the data and some shape information, and then work on that.

fiery vigil
fiery vigil
# tidal bough I don't know much about the tensorflow C API (when I tried using `libtorch` my e...

The indexing and slicing part becomes expensive. But if there's no other option...
I was thinking that maybe I will try to avoid the matrix stuff altogether. The tensors in the DNNs are kind of emulating that part anyway. So I could maybe try to emulate a massive matrix with quadrillion elements with say, 200 neurons with some 10-100 rank tensors? If such a thing is possible. It will be approximate, but the hope is that the DNNs find the most essential features to sufficiently emulate just right enough information to feed into the recursive iteration.

tidal bough
#

Maybe? I'd expect that if it's the kind of task that it needs an iteration scheme to compute, it would also be unstable and amplify the approximation errors exponentially as the iterations go on. But if that's not the case for yours, maybe it'll work.

fiery vigil
jaunty helm
#

in pytorch, is there any disadvantage to using the lazy modules vs. the non-lazy ones? (e.g. nn.LazyLinear vs nn.Linear)

#

I see... to me right now they just feel like nicer-to-use counterparts to their non-lazy siblings

past meteor
#

The API is unstable

#

I typically use non-lazy as it's a sanity check while I'm implementing the net

mild yarrow
#

Hey Guys! Consider you are implementing a fashion recommender system which recommends user what to wear on that day on the basis of various factors which will make user look better on that particular day ... What are the requirements I need to have in my mind already for creating one

tidal bough
mild yarrow
split bone
#
import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder


# Code



# Data loading

veriler = pd.read_csv('D:\\Project\\eksikVeriler.csv')
print(veriler)


ulke = veriler.iloc[:, 0:1].values
print(ulke)

from sklearn import preprocessing

le = preprocessing.LabelEncoder()

ulke[:,0] = le.fit_transform(veriler.iloc[:,0])

print(ulke)

ohe = preprocessing.OneHotEncoder()
ulke = ohe.fit_transform(ulke).toarray()
print(ulke)

print(list(range(22)))
sonuc = pd.DataFrame(data=ulke, index=range(22), columns=['fr', 'tr', 'us',''])
print(sonuc)

sonuc2 = pd.DataFrame(data=ulke, index=range(22), columns=['boy', 'kilo', 'yas',''])
print(sonuc2)

cinsiyet = veriler.iloc[:,-1].values
print(cinsiyet)

sonuc3 = pd.DataFrame(data=ulke, index=range(22), columns=['cinsiyet','','',''])
print(sonuc3)

s=pd.concat([sonuc, sonuc2], axis=1)
print(s)

s2=pd.concat([s, sonuc3], axis=1)
print(s2)
umbral delta
#

so im trying to train a nn using tensorflow, but it just returns the same output for all inputs. what could be causing this?

split bone
#

Please look at my project file

#
     fr   tr   us       boy  kilo  yas       cinsiyet               
0   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
1   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
2   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
3   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
4   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
5   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
6   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
7   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
8   0.0  1.0  0.0  0.0  0.0   1.0  0.0  0.0       0.0  1.0  0.0  0.0
9   0.0  0.0  1.0  0.0  0.0   0.0  1.0  0.0       0.0  0.0  1.0  0.0
10  0.0  0.0  1.0  0.0  0.0   0.0  1.0  0.0       0.0  0.0  1.0  0.0
11  0.0  0.0  1.0  0.0  0.0   0.0  1.0  0.0       0.0  0.0  1.0  0.0
12  0.0  0.0  1.0  0.0  0.0   0.0  1.0  0.0       0.0  0.0  1.0  0.0
13  0.0  0.0  1.0  0.0  0.0   0.0  1.0  0.0       0.0  0.0  1.0  0.0
14  0.0  0.0  1.0  0.0  0.0   0.0  1.0  0.0       0.0  0.0  1.0  0.0
15  0.0  0.0  0.0  1.0  0.0   0.0  0.0  1.0       0.0  0.0  0.0  1.0
16  1.0  0.0  0.0  0.0  1.0   0.0  0.0  0.0       1.0  0.0  0.0  0.0
17  1.0  0.0  0.0  0.0  1.0   0.0  0.0  0.0       1.0  0.0  0.0  0.0
18  1.0  0.0  0.0  0.0  1.0   0.0  0.0  0.0       1.0  0.0  0.0  0.0
19  1.0  0.0  0.0  0.0  1.0   0.0  0.0  0.0       1.0  0.0  0.0  0.0
20  1.0  0.0  0.0  0.0  1.0   0.0  0.0  0.0       1.0  0.0  0.0  0.0
21  1.0  0.0  0.0  0.0  1.0   0.0  0.0  0.0       1.0  0.0  0.0  0.0
#

This is my result

#

Why are my results like this

abstract wasp
#

hi i need help, im trying to clean some text but i get this error:
import nltk
import nltk.corpus
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
import re
import matplotlib

def preprocessing_text(text):
    text = text.lower()
    text = re.sub(r"(@\[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)|^rt|http.+?", "", text)
    tokens = word_tokenize(text)
    return tokens

def remove_stopwords(tokens):
    stop_words = set(stopwords.words('english'))
    filtered_tokens = [word for word in tokens if word not in stop_words]
    return filtered_tokens

def cleaning_text(text):
    tokens = preprocessing_text(text)
    filtered_tokens = remove_stopwords(tokens)
    lemmatized_tokens = lemmatization(filtered_tokens)
    cleaned_text = ' '.join(lemmatized_tokens)
    return cleaned_text

text = '/Users/avatarvaleria/UCSD/NLP/HH_english_transcripts.txt'
with open(text, 'r') as file:
    text = file.read()
print(text)
cleaned_data = cleaning_text(text)
print(cleaned_data)

#

`TypeError Traceback (most recent call last)
Cell In[59], line 1
----> 1 cleaned_data = cleaning_text(text)
2 print(cleaned_data)

Cell In[57], line 4, in cleaning_text(text)
2 tokens = preprocessing_text(text)
3 filtered_tokens = remove_stopwords(tokens)
----> 4 lemmatized_tokens = lemmatization(filtered_tokens)
5 cleaned_text = ' '.join(lemmatized_tokens)
6 return cleaned_text
Cell In[56], line 3, in lemmatization(tokens)
1 def lemmatization(tokens):
2 lemmatizer = WordNetLemmatizer()
----> 3 lemmatized_tokens = [lemmatizer.lemmatize(tokens) for token in tokens]
4 return lemmatized_tokens
Cell In[56], line 3, in <listcomp>(.0)
1 def lemmatization(tokens):
2 lemmatizer = WordNetLemmatizer()
----> 3 lemmatized_tokens = [lemmatizer.lemmatize(tokens) for token in tokens]
4 return lemmatized_tokens
File /opt/anaconda3/lib/python3.11/site-packages/nltk/stem/wordnet.py:45, in WordNetLemmatizer.lemmatize(self, word, pos)
33 def lemmatize(self, word: str, pos: str = "n") -> str:
34 """Lemmatize word using WordNet's built-in morphy function.
35 Returns the input word unchanged if it cannot be found in WordNet.
36
(...)
43 :return: The lemma of word, for the given pos.
44 """
---> 45 lemmas = wn._morphy(word, pos)
46 return min(lemmas, key=len) if lemmas else word
File /opt/anaconda3/lib/python3.11/site-packages/nltk/corpus/reader/wordnet.py:2096, in WordNetCorpusReader._morphy(self, form, pos, check_exceptions)
2094 # 0. Check the exception lists
2095 if check_exceptions:
-> 2096 if form in exceptions:
2097 return filter_forms([form] + exceptions[form])
2099 # 1. Apply rules once to the input to get y1, y2, y3, etc.
TypeError: unhashable type: 'list'`

rich moth
#

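Make a new def that handles the lemmatization of each token in the list. The traceback shows `lemmatizer.lemmatize(tokens)` being called with the whole list instead of a single `token`.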
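
A minimal sketch of that fix. The lemmatizer is passed in as a callable so the example runs without NLTK data; `toy_lemmatize` is a hypothetical stand-in, and with NLTK you would pass `WordNetLemmatizer().lemmatize` instead.

```python
def lemmatization(tokens, lemmatize):
    """Apply `lemmatize` to each token (a str), never to the whole list."""
    return [lemmatize(token) for token in tokens]


# Hypothetical stand-in so the sketch runs without NLTK data;
# it only strips a trailing "s" and is not a real lemmatizer.
def toy_lemmatize(word):
    return word[:-1] if word.endswith("s") else word


print(lemmatization(["cats", "dog", "trees"], toy_lemmatize))
```

The only change relative to the function in the traceback is the loop variable: `token` is handed to the lemmatizer, not `tokens`.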

umbral delta
#

im trying to change the learning rate in tf as such:
```py
from keras import backend as K
K.set_value(model.optimizer.learning_rate, 0.01)
```

#

but i get an error
```
AttributeError: module 'keras.api.backend' has no attribute 'set_value'
```

rich moth
umbral delta
umbral delta
mental rampart
#

how to find gtx 1650 gpu device plugin for tensorflow gpu support

mental rampart
#

latest ig and yes windows 11

#

i think i need device plugin for tensorflow to access my gpu
To use a particular device, like one would a native device in TensorFlow, users only have to install the device plug-in package for that device.

#

what is WSL2?

#

also is tensorflow intel works as device plugin? not sure

#

oh ic

#

so like

#

does pytorch provides all the functionality of tensorflow?

#

hmm lemme check

#

its so easy to make sequential models on tensorflow

rich moth
#

i finally got the evaluation stage working correctly, i couldnt be happier!
Epoch 1/1: 100%|█████████████████████████████████████████████| 351/351 [5:15:18<00:00, 53.90s/batch, Batch Loss=3.68e-5]
Evaluation: 0%| | 0/88 [00:00<?, ?it/s]
Input to VideoEncoder: batch_size=16, num_frames=16, channels=3, height=128, width=128
After view reshape: torch.Size([16, 48, 128, 128])
After conv2d_layers: torch.Size([16, 512, 128, 128])
After view reshape before fc: torch.Size([16, 8388608])
Input to VideoDecoder: torch.Size([16, 512])
After fc layer: torch.Size([16, 131072])
After view reshape: torch.Size([16, 512, 16, 16])
After conv_reduce: torch.Size([16, 512, 16, 16])
After conv2d_transpose_layers: torch.Size([16, 48, 128, 128])
Channels: 3, Expected size: 12582912, Actual size: 12582912
Final output shape: torch.Size([16, 16, 3, 128, 128])
Evaluation: 1%|▊ | 1/88 [00:10<15:15, 10.52s/it]
Input to VideoEncoder: batch_size=16, num_frames=16, channels=3, height=128, width=128

#

Epoch 1 Metrics: {'Total Loss': tensor(8.0492e-07, device='cuda:0'), 'PSNR': 2.810752446743289, 'SSIM': 0.06163982837892718}

#

Seems alright for the first epoch.

#

Jesus the pytorch model is 16.6 gigs

serene scaffold
rich moth
serene scaffold
rich moth
serene scaffold
#

@rich moth in fact, the size of a model is constant. more parameters -> larger model.
epochs are just a complete pass over the training data. More epochs only means more chances to adjust the parameters. It doesn't add additional parameters.
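
A back-of-the-envelope sketch of where the size could come from, using the shapes printed in the log above. The layer widths are assumptions read off that log (a flattened 8388608-feature tensor feeding an fc layer, a 512-wide decoder input, a 131072-wide decoder fc output), not the actual model definition.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


BYTES_PER_FLOAT32 = 4

encoder_fc = linear_params(8_388_608, 512)   # assumed: flattened features -> 512
decoder_fc = linear_params(512, 131_072)     # assumed: 512 -> decoder feature map

for name, n in [("encoder fc", encoder_fc), ("decoder fc", decoder_fc)]:
    print(f"{name}: {n:,} params, {n * BYTES_PER_FLOAT32 / 2**30:.2f} GiB")
```

Under these assumptions the single encoder fc layer alone accounts for about 16 GiB in float32, regardless of how many epochs are run.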

hearty depot
serene scaffold
serene grail
#

What does quantizing even do? I've heard that it makes models smaller/possible to run on worse hardware but also makes them perform worse.
Is it like lowering the resolution of the model in a way?

hearty depot
small wedge
#

it's lowering the number of bits used to represent weights and biases in the model

hearty depot
#

so if standard model uses fp 64 u can lower it to like fp16

small wedge
#

quantization can even happen on ridiculously small bit precisions like 8, 4, 3, 2, 1

#

at that point you don't even use float though

#

they just use ints instead
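
A toy sketch of the idea, assuming simple symmetric int8 quantization with one scale per tensor. Real schemes (like the K-quants in llama.cpp) are more elaborate; the weights here are made up.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0   # fall back to 1.0 if all weights are 0
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 0.31]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q, restored)
```

Each weight is stored as one small integer plus a shared scale, and the rounding error per weight is at most half a quantization step.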

hearty depot
serene grail
hearty depot
jaunty helm
jaunty helm
# serene grail What does quantizing even do? I've heard that it makes models smaller/possible t...

great answers above already
here's a pull request from 1 year ago (so maybe a bit outdated) adding K-quantization to llama.cpp
from the first graph you can see that mid-high quantization(quants that keep more bits, compressing the model less) actually doesn't hurt the quality too much(lower perplexity is better) but lowers the hardware reqs significantly(note that the x-axis is in log scale)
so from a running-LLMs-locally perspective at least, there's almost no reason not to run a quantized model

hard nest
#

In NN training, the validation and test dataset can have any batch size right? Like I can just use the biggest allowed by my pc to accelerate the learning

mild dirge
mild dirge
#

Does that make sense? @hard nest

hard nest
hard nest
plucky island
#

does anyone here use mmdetection?

severe hare
plucky island
severe hare
ember pawn
#

hello

#

i wanted to ask if someone can help me with a tad bit of issue that i am running into

#

how do you go about doing str.extract

#

in pyspark.pandas ?

#

it is saying that is not supported in the documentation

austere agate
#

whole lotta shit that is out of my knowledge that I dont understand

#

But soon

vagrant root
#

has anyone tried this?

severe hare
severe hare
vagrant root
#

It's not math tho

rich moth
# hearty depot Time to quanitize

Im using vector quantization during the training. It consists of a codebook with learnable embeddings. In the forward pass the input features are mapped to the closest embeddings in the codebook, quantizing the features. It actually allows me to perform this on my system by compressing it to lower dimensions. I imagine if they were in their original format this would take an insane amount of time.
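
A minimal sketch of the codebook lookup described here, in plain Python. The 2-D codebook and feature values are hypothetical; in training the entries would be learnable and the lookup would run on tensors.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


# Hypothetical 2-D codebook; in training these entries would be learnable.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
features = [[0.9, 1.2], [-0.2, 0.1], [-1.1, 1.7]]

indices = [nearest_code(f, codebook) for f in features]
quantized = [codebook[i] for i in indices]
print(indices, quantized)
```

Only the integer indices need to be kept downstream, which is where the compression comes from.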

haughty cradle
#

do we still don't have a better way other than gradient descent for training AI? since from what I learned it seems to not even guarantee you to get the lowest valley, it just guaranteed you getting to the lowest point of the nearest valley

serene scaffold
haughty cradle
#

I see... I guess i need to learn more first pithink

serene scaffold
#

but in general, I don't think there even can be an optimization algorithm that is guaranteed to find the global minimum

haughty cradle
#

I see... 😔

tidal bough
#

just guaranteed you getting to the lowest point of the nearest valley
that's only true for normal gradient descent; the fancy ones are a bit better about it (or worse, if you're unlucky)

#

but generally speaking, yes. optimization is hard.

past meteor
#

Are we talking of neural networks?

#

Because if the problem is convex you have strong guarantees with basic SGD
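
A small sketch of the difference, using plain (full-batch) gradient descent on two hand-picked 1-D functions. Both functions and the step size are assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, lr=0.01, steps=2000):
    """Plain gradient descent on a 1-D function given its derivative."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


# Convex: f(x) = (x - 3)^2 has one minimum, at x = 3.
convex_grad = lambda x: 2 * (x - 3)

# Non-convex: f(x) = x^4 - 3x^2 + x has two valleys (near x = -1.30 and x = 1.13).
nonconvex_grad = lambda x: 4 * x**3 - 6 * x + 1

for start in (-2.0, 2.0):
    print(start, gradient_descent(convex_grad, start), gradient_descent(nonconvex_grad, start))
```

On the convex function both starting points end at the same minimum. On the non-convex one each start settles in its nearest valley, and only the left one is the global minimum.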

hearty depot
#

yeah

hearty depot
wild loom
#

hey guys so I am currently working on a project around training my own Faster RCNN model and it's running as we speak, but it's taking ages and I have no reference for when it's going to stop training. Do you guys know any way on google colab to either monitor its training speed as it runs, or a free / cheap way to speed it up?

vagrant root
vagrant root
wild loom
#
[07/11 16:25:03 d2.engine.train_loop]: Starting training from iteration 0
[07/11 16:35:38 d2.utils.events]:  eta: 2:26:54  iter: 19  total_loss: 1.874  loss_cls: 0.6037  loss_box_reg: 0.5858  loss_mask: 0.6831  loss_rpn_cls: 0.005265  loss_rpn_loc: 0.009363    time: 31.3418  last_time: 25.0513  data_time: 0.0441  last_data_time: 0.0071   lr: 1.6068e-05  
[07/11 16:45:58 d2.utils.events]:  eta: 2:13:44  iter: 39  total_loss: 1.661  loss_cls: 0.4723  loss_box_reg: 0.5756  loss_mask: 0.6377  loss_rpn_cls: 0.007119  loss_rpn_loc: 0.01208    time: 31.0541  last_time: 36.0582  data_time: 0.0089  last_data_time: 0.0081   lr: 3.2718e-05 
#

this is what it's outputting currently, and you can see the eta logged every 20 iterations

tidal bough
vagrant root
#

10 minutes for 20 iterations (~30 s/iteration)

wild loom
#

I am currently trying to train it on 300 iterations so that it can serve as a base point for where I continue from and what I'm doing wrong, but if it's going to take 30 seconds per iteration I'm gonna have to leave it overnight then, because that sounds horrid
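
A rough sketch of the arithmetic behind those eta values, assuming a steady cost per iteration. The iteration counts and per-iteration times are taken from the log lines above; the `- 1` is an assumption about how the remaining iterations are counted.

```python
from datetime import timedelta


def eta(max_iter, current_iter, seconds_per_iter):
    """Remaining wall-clock time at a steady per-iteration cost."""
    remaining = max_iter - current_iter - 1
    return timedelta(seconds=round(remaining * seconds_per_iter))


print(eta(300, 19, 31.3418))
print(eta(300, 39, 31.0541))
```

Both estimates land within a minute of the logged etas (2:26:54 and 2:13:44); the trainer uses its own timing statistics, so an exact match is not expected.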

vagrant root
#

how big is the model being trained

wild loom
#

it's around 120 images in the training portion

#

this is my first real attempt I would say at working with training my own model as well

vagrant root
#

how many layers are in your model?

wild loom
#

does it also make sense that every time it returns these eta's they get shorter and shorter?

#
import os
import detectron2
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
from detectron2 import model_zoo

# Setup configuration
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")  # Use pre-trained weights
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 300    # Adjust the number of iterations as needed
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # Number of classes (excluding background)

# Use CPU for training
cfg.MODEL.DEVICE = "cpu"

# Create output directory
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

# Train the model
trainer = DefaultTrainer(cfg) 
trainer.resume_or_load(resume=False)
trainer.train()
vagrant root
#

yea eta is estimated time of arrival(completion)

wild loom
#

that's the configurations for the training

vagrant root
#

so yeah the total time left is decreasing every time

wild loom
#

shit

#

thanks for your help lmfao

vagrant root
#

and you are using a cpu instead of cuda?

wild loom
#

if you have any idea of how I could decrease this please let me know

wild loom
#

idk if there was another way around it

vagrant root
#

colab has a gpu

#

runtime>>change runtime type>> T4 GPU

#

# Use GPU for training

cfg.MODEL.DEVICE = "cuda"

hearty depot
wild loom
#

Okay

vagrant root
wild loom
#

I will make sure to do so

vagrant root
hearty depot
vagrant root
hearty depot
#

For metal it’s like mps iirc

#

I’d check documentation tho for the library u r using

vagrant root
#

oh

hearty depot
vagrant root
#

i thought there was no hardware acceleration for my m1

#

lol

wild loom
#

I have an M2 Mac so would that change anything

wild loom
haughty cradle
#

never thought NN would be this complex 🥹

where to learn how to make NN like this?

#

how to know it's ok to literally add a sin or cos formula to a bias

severe hare
#

^ Transformers and NNs are 2 different things.

haughty cradle
#

how do they even know self-attention can be made like that

ember pawn
#

if pyspark.pandas is worth using

#

or should i use pyspark directly

haughty cradle
#

is there something similar to it?

severe hare
# ember pawn or should i use pyspark directly

It's fine for large databases- you probably don't need to add pandas, at least not right away
https://www.linkedin.com/pulse/leveraging-pyspark-integrating-diverse-data-sources-guide-ramesh-0ljnc/


haughty cradle
#

thx

ember pawn
#

and found out there is this pyspark.pandas

#

but like i do not know if it is i should use it or not

#

and i tried to use it

hearty depot
ember pawn
#

i only get worse and worse errors

severe hare
haughty cradle
#

is there maybe others NN model other than Transformer?

severe hare
hearty depot
#

function on spark df

ember pawn
#

it is pandas api

#

for pyspark

wooden sail
severe hare
ember pawn
#

ohh i wanted to ask, are there any courses that you would recommend for deep learning
i have done the coursera deep learning specialization

haughty cradle
wooden sail
#

as for the answer to "how do people come up with this" and "how to know when to use which activation function", it's "by studying a lot of math"

severe hare
wooden sail
#

you usually need a very good statistical or optimization or other math-based motivation to come up with a new architecture that works well and know why it works

haughty cradle
#

damn... so I need to learn the math

#

I see...

#

pithink I guess I need to learn more

wooden sail
#

if you want to make new stuff without stumbling around in the dark, yes

severe hare
vagrant root
hearty depot
wooden sail
#

if you jus want to use stuff, you can just fish up the hottest stuff used atm

haughty cradle
severe hare
hearty depot
ember pawn
ember pawn
#

for linear algebra, if there is any book i would love to buy it. i tend to do my deep learning on a laptop where i can like code it out a lil

vagrant root
haughty cradle
#

I feel like each connection between neural is just a linear algebra tbh

hearty depot
ember pawn
vagrant root
hearty depot
severe hare
ember pawn
#

yes
i am college graduate level

vagrant root
severe hare
serene grail
ember pawn
#

and i wanted to ask like HMMMM
how do you think a neural network reaches the optimal solution? honestly, whenever i try to sum it up in my head and explain it i simply cannot, because it is just dot products happening and a lot of it is random

hearty depot
# ember pawn yes i am college graduate level

ok then i would suggest gilbert strang or axler linear alg
strang is a lot more computational whereas axler is more proof based and formal

as for calculus paul's online math notes is a good intro and if u want to do deep dive read spivak

haughty cradle
#

damn... that is matrix... does this mean I need to learn FFT 😭
I never understand FFT (Fast Fourier Transform)

small wedge
hearty depot
#

also ur prob gonna learn in college

#

at some pooint

small wedge
vagrant root
# ember pawn and i wanted to ask like HMMMM how do you think a neural network reaches the opt...

This is the most step-by-step spelled-out explanation of backpropagation and training of neural networks. It only assumes basic knowledge of Python and a vague recollection of calculus from high school.

hearty depot
#

the loss function: in certain cases the loss is guaranteed to decrease sufficiently at each step

haughty cradle
#

you use Gradient descent for that no? or it just don't always work in real practice? pithink

ember pawn
#

hmmmmmmm idk it just seems odd to me how it all works out
you can never see an output being reached with a single neuron unit, so what exactly is the point of stacking them up
like if you were to spread out a neural network with the same number of neurons as the one with hidden layers, will it work the same ?
if not then what functionality is the hidden layer adding ?

#

i know the question seems stupid but i cannot understand it

small wedge
haughty cradle
hearty depot
vagrant root
hearty depot
small wedge
#

you can get lost in the sauce with optimization algorithms (quasi-hyperbolic momentum 🥴 )

vagrant root
wild loom
#

@hearty depot & @vagrant root if I hook up my google colab to a local runtime do you guys think it would run faster

vagrant root
#

adding more units will make it more precise

ember pawn
#

but we can have a non linear activation regardless of hidden layer or not

hearty depot
vagrant root
hearty depot
#

^

wild loom
#

okay thank you

hearty depot
#

p sure colab has tpu

small wedge
#

I'm lazy I just stick to my genetic algorithms so I don't have to do math peepoSit

vagrant root
#

the nvidia gpu(cuda) is designed for ai matrix multiplication

ember pawn
#

so if a neural network is a one dim array with non linear activation, and the number of neuron units matches that of the network with the hidden layer, the performance will be the same ?

hearty depot
small wedge
vagrant root
#

yo @final kiln how is the job search going?

#

lessgooo

#

congrats

#

goodluck

hearty depot
#

The continuous case right?

haughty cradle
#

doesn't one hidden layer just mean it only passes through 2 linear functions? that means it should behave like an X^2 polynomial graph, no?
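
A quick check of that intuition with hypothetical scalar weights: composing two linear (affine) layers gives another straight line, not a parabola. Only a nonlinearity between them changes the shape.

```python
def affine(w, b):
    """A one-input, one-output 'layer': x -> w * x + b."""
    return lambda x: w * x + b


layer1 = affine(2.0, 1.0)     # hypothetical weights
layer2 = affine(-3.0, 0.5)

stacked = lambda x: layer2(layer1(x))
collapsed = affine(-3.0 * 2.0, -3.0 * 1.0 + 0.5)   # w2*w1, w2*b1 + b2

relu = lambda z: max(0.0, z)
with_relu = lambda x: layer2(relu(layer1(x)))

for x in (-2.0, 0.0, 1.5):
    print(x, stacked(x), collapsed(x), with_relu(x))
```

`stacked` and `collapsed` agree at every input, so the two layers without an activation are equivalent to one. With the ReLU in between, the output bends where `layer1(x)` crosses zero.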

hearty depot
#

Lowk a blessing and a curse, some of this beliefs make for the worst papers
Mfers be writing papers on emergent properties of llms and then be using mcq for the metric 😭
No shit there is a sharp linear increase in accuracy, now try that a non linear metric

haughty cradle
#

I see...

haughty cradle
haughty cradle
#

I found this meme on that X link, what does this mean?

#

KAN superior?

hearty depot
serene grail
#

I'm a noob but I've never even heard of that, is KAN new?

hearty depot
#

like u make kan from mlp layers

hearty depot
#

to be an alt to mlp

serene grail
#

I don't know much math yet, sounds like something similar to a Fourier Transform?

vagrant root
serene grail
#

Interesting, thanks

haughty cradle
#

I still can't understand Fourier and I have studying it for like 4 years 😭

#

I get the basic but once you get into the compression and accelerating stuff 💀

haughty cradle
#

I understand that image, but putting it into practice is another things

severe hare
# haughty cradle I understand that image, but putting it into practice is another things

Nearly everything in our life can be represented as a waveform. From the images displayed on our phone screens to sound waves coming from our headphones, they can all be represented as a waveform. …

haughty cradle
iron basalt
# haughty cradle KAN superior?

It's a more recent hyped paper that in practice is just the same thing again but worse, several of these pop up over time in ML. You can also show it to be the same mathematically.

#

You might get some neat concepts from such papers, but don't let the hype get to you.

#

(Kolmogorov's work)

serene grail
#

I mean, I like that people way smarter than me are investigating approaches that are different from the current ones

iron basalt
iron basalt
#

But if there is a demonstration, that I can reasonably reproduce, I am very interested.

haughty cradle
#

wow I don't know such things exist

vagrant root
#

🙂

#

notice how all the different waves are made of the same frequencies with different amplitudes

#

that's what the fourier transform does: it decomposes a waveform into waves of multiple frequencies
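
A small sketch of that decomposition using a naive discrete Fourier transform (not the fast version). The test signal is made up: two sine waves with frequencies 3 and 10 and amplitudes 1.0 and 0.5.

```python
import cmath
import math


def dft_amplitudes(signal):
    """Amplitude of each frequency bin 0..N/2 of a real signal (naive O(N^2) DFT)."""
    n = len(signal)
    amps = []
    for k in range(n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(signal))
        scale = 1 if k in (0, n // 2) else 2   # interior bins are split with their mirror image
        amps.append(scale * abs(coeff) / n)
    return amps


n = 64
signal = [1.0 * math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 10 * t / n)
          for t in range(n)]
amps = dft_amplitudes(signal)
print([(k, round(a, 3)) for k, a in enumerate(amps) if a > 0.01])
```

Only bins 3 and 10 come out non-negligible, with the amplitudes that went into the signal. The FFT computes the same coefficients, just faster.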

ornate bronze
#

science

hearty depot
#

this book is nice imo

#

also a lot of good examples in code

river cape
#

Why is ReLU mostly used in the hidden layers of a neural network than compared to Sigmoid?

hearty depot
#

One problem w sigmoids r vanishing gradients, relu r less prone to this
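
A simplified sketch of why: backpropagation multiplies one activation derivative per layer, and the sigmoid's derivative never exceeds 0.25. This ignores the weight terms in the chain rule, so it is only an illustration of the trend.

```python
import math


def sigmoid_grad(z):
    """Derivative of the sigmoid at z."""
    s = 1 / (1 + math.exp(-z))
    return s * (1 - s)            # never exceeds 0.25


def relu_grad(z):
    """Derivative of ReLU at z (taking 0 at z = 0)."""
    return 1.0 if z > 0 else 0.0


depth = 20
z = 0.0                            # sigmoid's best case
print(sigmoid_grad(z) ** depth)    # 0.25**20
print(relu_grad(1.0) ** depth)     # stays 1.0 while the unit is active
```

Even in the sigmoid's best case the product shrinks below 1e-12 after 20 layers, while an active ReLU passes the gradient through unchanged.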

harsh sun
#

Hello. I am having difficulty conceptualizing specific parts of neural networks. I have gone over the math, and I understand how the math works in terms of calculating the values. Here is my question:

What is the significance of the prediction any one neuron makes (linear regression predictions and activation, in this case ReLU)? So like, when that value is produced and then passed into another layer with the dot product of further weights, at the end of the day, how do all of those numbers come together to form a cohesive output?

#

Essentially what does each part mean to that final whole? Cause i dont get how each neuron relates to its output.

wooden sail
harsh sun
wooden sail
#

nope

#

in fact, especially if you approach it from the perspective that NNs were inspired by the brain, which we also don't understand, the idea is that complex organized behavior "emerges" from simple interactions in "inexplicable" manners

#

the theorems involved in justifying neural networks are claims of existence of good approximators, but they are not constructive (they don't tell you exactly how to build such a network)

#

you can read into explainable AI if you like

harsh sun
# wooden sail you can read into explainable AI if you like

so what ur saying is, they did linear regression and got a prediction. and then used that prediction in tandem with tons of other predictions and were like hm so if we make this a big chain and we do tons of dot products with tons of weights then we can get better predictions?

#

then with that, how would ReLU make it non-linear? just by adding holes in the data?

wooden sail
wooden sail
#

sure, "adding holes in the data" in this case is a nonlinearity, but it isn't always

#

all nonlinearity means here is that you applied a function for which it is NOT true that T(aB + cD) = aT(B) + cT(D), where T is a transformation, a and c are scalars, and B and D are vectors
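
A spot check of that definition on two transformations, with made-up vectors and scalars: ReLU, and multiplication by a fixed 0/1 mask. One passing point does not prove linearity, but one failing point is enough to show a function is nonlinear.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def mask(v, keep=(1, 0, 1)):
    """Multiply elementwise by a fixed binary vector (a diagonal 0/1 matrix)."""
    return [k * x for k, x in zip(keep, v)]


def is_linear_on(T, a, B, c, D):
    """Check T(aB + cD) == aT(B) + cT(D) for one choice of a, B, c, D."""
    lhs = T([a * b + c * d for b, d in zip(B, D)])
    rhs = [a * tb + c * td for tb, td in zip(T(B), T(D))]
    return lhs == rhs


a, c = 2.0, -1.0
B, D = [1.0, -2.0, 3.0], [4.0, 0.5, -1.0]
print(is_linear_on(mask, a, B, c, D))
print(is_linear_on(relu, a, B, c, D))
```

The fixed mask satisfies the identity here, while ReLU breaks it, because which entries ReLU zeroes out depends on the input itself.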

#

"punching holes in the data" can also be done by multiplying with binary matrices, but this is a linear transformation, so the idea has to be defined more carefully

harsh sun
hearty depot
#

one of problems with a lot of nns r that they r flexible

wooden sail
hearty depot
#

but hard to interpret when compared to classic strategies like linear reg

wooden sail
#

the composition of all of the layers of a network makes a prediction

harsh sun
wooden sail
#

you're trying to interpret stuff that makes no sense

harsh sun
#

i mean linear regression makes sense

#

you are producing a line of best fit

wooden sail
#

you have already assigned (a weird) meaning to these things in your head, that's what's throwing you off

wooden sail
hearty depot
wooden sail
#

the layers are not doing linear regression

harsh sun
wooden sail
#

no

wooden sail
#

if you find one you win a prize, cuz researchers haven't so far

#

a lot of deep learning is "lmao it worked, look"

harsh sun
# wooden sail no

thats so mind boggling. how would people know then that doing that math, and doing that math in layers, produces a prediction?

wooden sail
#

because there are several theorems saying that if you compose some number of nonlinear functions, you can get arbitrarily close to any other function

harsh sun
#

i js cant conceptualize that

hearty depot
wooden sail
#

it doesn't tell you which functions nor how many to compose

#

nor what each of them mean

hearty depot
#

also iirc non-linearity is another reason why relus exist iirc

hearty depot
wooden sail
#

those theorems motivate you to try composing simple functions. the training procedure does not give you nicely interpretable layers, they do arbitrary shit

harsh sun
wooden sail
#

properties that are nice for some optimization strategies, but not others

harsh sun
#

hm

hearty depot
#

also calculation for relu r a lot easier in terms of flops compared to like sigmoid

harsh sun
#

my professor showed us how neural nets worked for XOR, and he said that neural networks for XOR produce a line (when using non-linearity), but it produces a fat line

harsh sun
#

i mean from what hes conveyed it seems rather simple to implement like a XOR neural network with no libraries

wooden sail
#

for one, it has the nice property of producing outputs in the range 0 to 1, which you want in this case

serene grail
#

What makes a line fat?

iron basalt
# harsh sun i js cant conceptualize that

Imagine you get some input and you compute a ton of random functions on that input. If you have enough of those one or more of them will probably compute the correct answer to the problem. There are neural networks that operate on this as a foundation itself. So you can see why big networks would generally work, even if your method is random init except for the last bit.

hearty depot
iron basalt
wooden sail
harsh sun
# serene grail What makes a line fat?

he showed us a graph of a line with a slope of id say approximately -x and he conveyed it as two lines that are parallel with different y intercepts, and the space between those two lines are corresponding to two XOR values and the space outside of those two lines correspond to two other XOR values
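
One way to read that picture, as a sketch and not necessarily the professor's exact diagram: the two parallel lines are x1 + x2 = 0.5 and x1 + x2 = 1.5 (both with slope -1), and a point counts as XOR = 1 only when it falls in the band between them. The thresholds 0.5 and 1.5 are assumed values.

```python
def xor_band(x1, x2, lower=0.5, upper=1.5):
    """Return 1 if (x1, x2) lies between the lines x1 + x2 = lower and x1 + x2 = upper."""
    s = x1 + x2
    return 1 if lower < s < upper else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_band(a, b))
```

A single straight line cannot separate these four points, which is why the "fat line" (a band) needs two boundaries.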

iron basalt
#

Sigmoid was specifically chosen for several nice properties and also it kind of mimics real neuron activations which is why we still call them neural networks even though it's far removed from that at this point.

harsh sun
wooden sail
#

not in the way you wanted, no

serene grail
harsh sun
#

but, in the end, the neural network produced three values. those three values were probabilities of dog, cat, and then smth else

hearty depot
harsh sun
#

i was dumbfounded to find that after reading 10-20 articles none of them address why

#

so ur assertions make sense

#

they only address the math to do so

#

which IMO is relatively simple considering I learned the calculus and linear algebra for it two days ago

#

so @wooden sail the dot product acting as a method of assigning similarity in terms of vector math isnt relevant at all to how the neural net produces these predictions?

iron basalt
wooden sail
harsh sun
wooden sail
#

yes

harsh sun
#

interesting

wooden sail
#

you could use other functions instead. what even IS a neural network?

harsh sun
#

thats rlly interesting actually

hearty depot
wooden sail
#

it's often a lot more useful to just think of it as function composition

harsh sun
wooden sail
harsh sun
#

ik

#

but it makes me feel happier inside

wooden sail
#

not to mention it's not even true (anymore)

#

double whamie

harsh sun
#

thats how he developed the concept

#

he was a neurologist or smth

wooden sail
iron basalt
#

Like when you look at a puddle of water, and you splash it, and that causes some output. You can give some general properties and even explain how each individual particle works on its own, but if I gave you some random state in that dynamical process and asked you what it "means" for the output, there is nothing to really say other than it will cause the output to happen eventually / has a lot of random information.

hearty depot
wooden sail
#

carrying that forward will only hurt you

harsh sun
harsh sun
#

i find it ironic that the tool thats meant to make sense out of things that dont always make sense itself doesnt make sense

#

😐

hearty depot
# harsh sun hm

if u want something closer to a brain model, look at ml papers on assembly calculus
that is prob what ur looking for

harsh sun
#

im js taking a class

wooden sail
# harsh sun 😐

AI was never a tool to make sense of things, it's a tool for replacing one problem with another

harsh sun
#

wdym

#

i mean isnt it inherently meant to solve things that arent classicly solvable using standard decision trees?

wooden sail
#

it addresses the problem of not knowing what a function is, and replaces it with "maybe if i have enough examples of inputs and outputs, i can get something similar to it"

wooden sail
hearty depot
wooden sail
#

(not technically true about the 0 guarantees, but the guarantees are usually not of the kind one would want. for some architectures, you can explicitly state under which conditions you'll fit the training data exactly and get overfitting)

harsh sun
#

i didnt realize how present ML is everywhere

#

now transformers are going crazy

wooden sail
hearty depot
iron basalt
#

(It's about complexity)

harsh sun
#

so we dont even know how neural networks come up with patterns.

wooden sail
#

100% autonomous vehicles are illegal in most places

harsh sun
#

do we even know if they come up with patterns?

harsh sun
#

my cousin is building a new autopilot system with his company and he was telling me abt it

wooden sail
#

i'm just trying to make sure you're not romanticizing AI as something it isn't

harsh sun
#

it was rlly interesting the problems and also solutions

harsh sun
#

but im also more wrapping my head around this

hearty depot
harsh sun
#

so theres no need to deflect its usefulness. it obviously has downsides too

harsh sun
#

(besides linear regression btw)

hearty depot
#

examples being shap and saliency maps

iron basalt
# harsh sun so we dont even know how neural networks come up with patterns.

Well the problem is what "how" means here. I can give you a general overview, but what you are probably asking for is a step by step demonstration, which gets back to the thing where I could say what each node in the network is computing, but I can't assign a label to that that has meaning to a human at a high level.

harsh sun
harsh sun
iron basalt
#

The calculation happening is the best description I can give.

harsh sun
#

i understand conceptually how they work how you have your features and with your data, you run the NN over and over again to get the weights and bias that minimize the error. then at the end u can do other mathematical techniques to get the answer into different formats

#

and i know the mathematical functions used

iron basalt
hearty depot
# harsh sun ah well i know that already

for stuff like cnn, u can visualize the convolutional layers by the overall values of the convolutions when doing forward step
this is kinda hacky but it can help u see what ur network thinks is important or not

harsh sun
#

(like same category yk what i mean)

harsh sun
#

personally i find linear regression the most intuitive

#

that was rlly intuitive to me and made a lot of sense

iron basalt
#

If you have sparse neural networks, and especially not distributed, it becomes a lot easier to tell because unlike dense not everything is hooked up to everything / a giant soup. Convolutions (networks) have sparse weights (shared weights basically). This makes them better for visualization / understanding, but still not great due to dense activations.

#

Even with sparsity, while better, it's still not explainable when complex enough.

iron basalt
hearty depot
harsh sun
harsh sun
iron basalt
#

(It's probably right and seems to hold so far from people testing various pruned networks)

iron basalt
late lichen
#

I want to write a simple AI and I want to write the logic that runs it, my biggest question is how I will do gradient optimization?

harsh sun
late lichen
#

I previously made DAGs neuron network but now I want to try MLP

late lichen
#

Yeah

harsh sun
late lichen
#

Yes

#

I have watched 3blue1brown vid

harsh sun
#

okay. so do u know the sum of squared residuals that will produce the curve that gradient descent will attempt to optimize?

late lichen
#

But for some reason I still don't understand it, maybe because he didn't explain how to work with biases too

harsh sun
#

sum of squared residuals is what im trying to reference

#

do you know what that is

late lichen
#

Yes

harsh sun
iron basalt
harsh sun
#

its just another value that can help make the prediction more correct

#

so instead of training just weights with features, you can have an additional value that is determined and factored into the overall calculation

#

they arent always necessary

iron basalt
#

But that is a human problem, because it's about what means something to a human.

harsh sun
late lichen
#

I got the idea of back propagation but I feel like it only updates the weights and never the biases

harsh sun
#

gradient descent comes in once u understand sum of squared residuals
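
A small sketch of how the two fit together, assuming a one-feature line y = w * x + b and made-up data points. `ssr` and `gradient_descent` are hypothetical helper names, and the learning rate is picked by hand for this data.

```python
def ssr(w, b, xs, ys):
    """Sum of squared residuals for the line y = w * x + b."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))

def gradient_descent(xs, ys, lr=0.01, steps=5000):
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Partial derivatives of the SSR with respect to w and b.
        dw = sum(-2 * x * (y - (w * x + b)) for x, y in zip(xs, ys))
        db = sum(-2 * (y - (w * x + b)) for x, y in zip(xs, ys))
        # Step downhill on the SSR curve; weight and bias move together.
        w -= lr * dw
        b -= lr * db
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # made-up points that lie on y = 2x + 1
w, b = gradient_descent(xs, ys)
print(w, b, ssr(w, b, xs, ys))
```

The bias gets its own derivative and its own update, exactly like the weight does.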

harsh sun
#

which updates it as the model is trained

#

i can find the formula for u if u need tho

#

like the math associated with updating it

late lichen
#

Let's say we got the perfect weights, wouldn't that already work? I mean if we ramp up the bias, the descendant node will see that node as active, won't it?

hearty depot
harsh sun
iron basalt
#

Unless you have to start sparse or it won't run fast enough.

harsh sun
#

@iron basalt btw, I read here that this is a thing, but ive never heard of this in my studies so far

Each node connects to others, and has its own associated weight and threshold. If the output of any individual node is above the specified threshold value, that node is activated, sending data to the next layer of the network.
is this js another form of an activation function?

#

like relu?

#

looks to be that way

harsh sun
iron basalt
harsh sun
#

ok ty

late lichen
#

Wait I think I get it, so we perform back prop on weights and after that we do it on biases?

iron basalt
#

First perceptron implementation.

late lichen
#

I was thinking that even if we have perfect weights we always lack something if we don't tell how much the node wants to be activated

iron basalt
#

Then they added more layers, although as you can see the physical implementation is not really a great idea.

harsh sun
iron basalt
harsh sun
#

NNs

iron basalt
iron basalt
late lichen
#

But if we ramp up the bias, won't it increase the final val more than just the weights?

hearty depot
#

they r performed at the same time via chain rule

late lichen
#

If we need to increase the gradient of a single node, how will we know that we need to increase the weights instead of the bias?

late lichen
hearty depot
#

we calculate gradient with respect to that and update weights and bias layerwise using an algo called autodiff

#

basically top down approach
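
A hand-derived sketch of what that looks like for the smallest possible case, assuming one hidden node and one output node, sigmoid activations, and a squared-error loss (none of these choices come from the chat). An autodiff library would produce the same numbers automatically; `gradients` and `sgd_step` are illustrative names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)      # hidden node
    y = sigmoid(w2 * h + b2)      # output node
    return h, y

def loss(params, x, target):
    _, y = forward(params, x)
    return (y - target) ** 2

def gradients(params, x, target):
    """Chain rule, output node first and then back to the hidden node."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    delta2 = 2 * (y - target) * y * (1 - y)   # dLoss/d(output pre-activation)
    delta1 = delta2 * w2 * h * (1 - h)        # dLoss/d(hidden pre-activation)
    # A bias gradient is the delta itself; a weight gradient is delta times its input.
    return [delta1 * x, delta1, delta2 * h, delta2]

def sgd_step(params, x, target, lr=0.5):
    grads = gradients(params, x, target)
    # Weights and biases are all updated in the same step.
    return [p - lr * g for p, g in zip(params, grads)]

params = [0.3, -0.1, 0.8, 0.2]    # w1, b1, w2, b2 (arbitrary starting values)
print(loss(params, 1.5, 1.0), loss(sgd_step(params, 1.5, 1.0), 1.5, 1.0))
```

So there is no separate pass for biases: each node's delta is computed once, and both its weight and its bias read their gradient off that same delta.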

late lichen
#

I don't know much on "convex loss function"

late lichen
hearty depot
late lichen
#

Cool

#

Let's say we have 2 nodes, a hidden node and an output node

What does the formula look like for updating the bias and weights?

#

@hearty depot

narrow tiger
#
LangChainDeprecationWarning: The class `LLMChain` was deprecated in LangChain 0.1.17 and will be removed in 1.0. Use RunnableSequence, e.g., `prompt | llm` instead.
how do i resolve these
severe hare
#

`prompt | llm` pipes your prompt template into the llm and gives you a RunnableSequence. you build that instead of an LLMChain and call .invoke() on it.

late lichen
#

Thats supposed to be English enough....

severe hare
#

Eh that doesn't show up in the docs very much though. They have OpaquePrompts but it's supposed to work with API calls more I think

harsh sun
#

btw, why do neural networks not solve non-linear things without a non-linear activation. doesnt it discover patterns based off of the values it calculates?

small wedge
#

neural networks without nonlinear activations are just linear function estimators

late lichen
#

¯_(ツ)_/¯

small wedge
#

adding more layers does nothing if you don't add nonlinearity, it's just making a bigger linear function

late lichen
#

Hmmm

#

Kinda makes sense

#

gelu is good function for hidden layers instead of relu?

small wedge
harsh sun
#

im having trouble conceptualizing NNs

small wedge
serene scaffold
small wedge
#

^^

#

people understand neural networks enough to build them, what we don't understand is the actual data stored inside of massive models after training; the values of the weights and biases and what they relate to are viewed as a black box. Say you are given a single parameter from GPT4's weights and you can change it up or down, we don't have a way to know exactly how that change will affect the output without testing that output. If we did, we could do "AI brain surgery" and fine tune models by hand instead of having to train them further through things like RLHF to align them.

worldly dawn
harsh sun
harsh sun
#

and ive read 10-20 articles and it never explained how each node is significant to the total output

serene scaffold
harsh sun
#

theres a lot so its hard to rephrase it now

harsh sun
worldly dawn
small wedge
#

okay so why don't we take a simple model and lay it out as a single function that will show you what's happening

harsh sun
#

i know the math for the neural networks

#

but like, the patterns that the neural net discovers as it trains. i dont see how it results in a line

worldly dawn
#

to be honest, I am not clear on the hold up for you

small wedge
#

it has nothing to do with the patterns it learns, the neural network without nonlinearity is only able to output a line. its job is just to find the best line that fits.

#

if I give you a linear function like x * w + b how could you ever represent anything other than a line by changing w and b? (these are scalars in the example)

harsh sun
#

because the equation is linear per node

iron basalt
#

s(x) is the activation function, change it to just x and see what happens.

harsh sun
harsh sun
#

otherwise no matter what its a straight line

iron basalt
harsh sun
iron basalt
#

This is XOR, the boundary between the two classes is that curve.

#

Depending on which side the point lies, it spits out that class.

harsh sun
#

depending on the task

iron basalt
#

Try making up your own points in the table, can you manually find the solution (weights)?

small wedge
# harsh sun because the equation is linear per node

say we add another layer to my example: l1 = x * w1 + b1, l2 = l1 * w2 + b2. we can break this down to (x * w1 + b1) * w2 + b2. no matter how many multiplications or additions you add to this function, the output will always be linear
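
The expansion above can be checked numerically. This is a sketch with scalar weights as in the example; the function names and the sample values are made up for illustration.

```python
def two_linear_layers(x, w1, b1, w2, b2):
    l1 = x * w1 + b1
    return l1 * w2 + b2

def collapsed(x, w1, b1, w2, b2):
    # (x * w1 + b1) * w2 + b2 == x * (w1 * w2) + (b1 * w2 + b2)
    # so two stacked linear layers are one line with a new slope and intercept.
    return x * (w1 * w2) + (b1 * w2 + b2)

def relu(z):
    return max(0.0, z)

def with_relu(x, w1, b1, w2, b2):
    # Same two layers, but with a nonlinearity in between.
    return relu(x * w1 + b1) * w2 + b2

for x in (-2.0, 0.0, 3.5):
    print(x, two_linear_layers(x, 2.0, -1.0, 0.5, 4.0), collapsed(x, 2.0, -1.0, 0.5, 4.0))
```

With `with_relu` the collapse no longer works: the output bends where the ReLU switches on, so it is not a single line anymore.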

harsh sun
#

this is like ground breaking

serene scaffold
#

A lot of people know about desmos.

harsh sun
#

so @iron basalt by making it non-linear, your output would inherently be shaped weirdly. training it just manipulates the "thing" so that it fits the solutions?

harsh sun
iron basalt
#

Try ReLU btw.

#

You can visually see the line-ness of ReLU.

harsh sun
iron basalt
harsh sun
iron basalt
harsh sun
harsh sun
iron basalt
#

It's the literal code.

harsh sun
# iron basalt Yeah, that is the "how."

yeah but like how does it contribute to the solution. like how does doing the dot product between the current layer and the weight of a specific node, how does that value come to impact this in the end.

harsh sun
#

conceptually is what im asking

iron basalt
#

Are you asking why, not how?

harsh sun
# iron basalt I'm not sure how to answer that / what you are asking for. The math shown there ...

actually i think i know why. lmk if this is true. the value you get from the dot product is weighted. so when it trains you can shape the data. then plugging that in again to another node above means that the value passed into that node is a reflection of many previous weights. therefore, by changing the weight, you are having an even greater impact because just one node later in the nn uses many weights.

#

does that sound right?

iron basalt
#

Try some stuff like f(x)=x, f(x)=cos(x), f(x)=cos(2x), f(x)=x+cos(2x). Try adding parameter sliders, like f(x)=ax+bcos(cx). Then try making another function, g(x), and try making it make use of f(x).

#

Then know that a NN is just a bunch of these combinations of various functions to solve the problem. But instead of having cos(x) in there (although some NNs do), we have just something simple, like ReLU(dot product and bias), and by composing a ton of them we can also get what we want.
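
A tiny sketch of that composition, with weights picked by hand instead of trained (they are one possible solution, assumed for illustration): two ReLU(dot product + bias) units feeding a linear output are enough to compute XOR.

```python
def relu(z):
    return max(0.0, z)

def neuron(inputs, weights, bias):
    """ReLU(dot product + bias), the basic building block."""
    return relu(sum(i * w for i, w in zip(inputs, weights)) + bias)

def xor_net(x1, x2):
    h1 = neuron((x1, x2), (1.0, 1.0), 0.0)    # grows with the number of inputs that are on
    h2 = neuron((x1, x2), (1.0, 1.0), -1.0)   # fires only when both inputs are on
    return h1 - 2.0 * h2                      # linear output layer

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

Training does the same job as picking these numbers by hand, it just searches for them with gradient descent.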

serene grail
#

So many functions combine into one function that is supposed to approximate the data?

ionic valley
snow garden
#

hey yall, i'm new here and i was just wondering how long yall have been programming/working with AI and what its like in your opinion 🙂

unkempt apex
#

yeah so start with you!

rich moth
harsh sun
#

Depending on which AI method u use, there are differing levels of complexity

#

I’d say linear regression is pretty easy

#

Neural networks, now that I get them, are a bit more complex. CNNs are weird I’m still working em out.

ionic valley
#

is there any point in accounting for multicollinearity when using methods like ridge and lasso?

wooden sail
# ionic valley https://discord.com/channels/267624335836053506/1261143782179606619 Any help?
  • Does lasso avoid multicollinearity?
    it CAN, but not always. it depends on whether the matrix involved in the computation has a large enough "kruskal rank"

  • if there are two highly related terms, wouldn't lasso just pick one, and shrink the other to zero, essentially rendering techniques like vif useless?
    related to the previous point, it favors terms that result in a lower L1 norm when deciding which terms to ignore. it can only do this in a useful way if the kruskal rank is high enough, and the result you get is only useful if what you want is a sparse solution. VIF is a statistical criterion based on covariance. if you interpret LASSO as the maximum a posteriori estimator, what it does is assume the solution vector follows a Laplace distribution, but says nothing about the covariance directly. they do different things, you have to decide which approach works for your problem.

  • Lasso is strong for a large number of predictors, but what's the limit?
    idk what you mean with this. usually what matters is the ratio of nonzero predictors vs the total, which is also related to the kruskal rank. there's a thing called "phase transition maps" which plot out at what level of sparsity L1 regularization methods break down.

  • What stops me from simply "feature engineering" tons of useless variables on the basis that one of them might be good for the model? After all, anything deemed useless will shrink to zero given a good lambda value.
    the more correlated the features are with each other, the worse L1 regularization works, so adding extra features is pretty much always bad no matter how you look at it
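
A toy sketch of the "lasso just picks one" case, using a hand-written coordinate descent instead of a library. It assumes the most extreme form of collinearity, two identical columns, where the lasso solution is not unique and the update order decides which feature survives. Real correlated features behave less cleanly, in line with the "it CAN, but not always" answer. `lasso_cd` and the data are made up for illustration.

```python
def soft_threshold(rho, lam):
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(columns, y, lam, sweeps=50):
    """Coordinate descent for 0.5 * ||y - Xw||^2 + lam * ||w||_1."""
    w = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Residual with feature j's own contribution left out.
            partial = [
                yi - sum(w[k] * columns[k][i] for k in range(len(columns)) if k != j)
                for i, yi in enumerate(y)
            ]
            rho = sum(c * r for c, r in zip(col, partial))
            z = sum(c * c for c in col)
            w[j] = soft_threshold(rho, lam) / z
    return w

x1 = [1.0, 2.0, 3.0, 4.0]
x2 = list(x1)                    # exact duplicate of x1
y = [3.0 * v for v in x1]
print(lasso_cd([x1, x2], y, lam=1.0))
```

Here the first column absorbs almost all of the signal and the duplicate is left at (numerically) zero; swapping the column order would swap which one is kept.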

deep sleet
#

Does google colab offer GPUs for free access?

wooden sail
#

yes but they're shared and sometimes you don't get immediate access

#

it can tell you to wait if you've been using too much compute time

deep sleet
#

but how do I know if I have access to one or not

#

I am using device = torch.device("cuda" if torch.cuda.is_available() else "cpu") and it says CPU

wooden sail
#

the code won't run 😛

#

you have to change the runtime type to be able to see the gpu
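
In Colab that setting lives under Runtime > Change runtime type. After switching, a quick check along these lines shows what PyTorch can actually see. It is a sketch: `describe_device` is a made-up helper, and it only uses `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
def describe_device():
    """Report which device PyTorch would use, without failing if torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # get_device_name(0) names the first visible GPU.
        return "cuda: " + torch.cuda.get_device_name(0)
    return "cpu"

print(describe_device())
```

If this still prints `cpu` after changing the runtime type, the session probably did not get a GPU assigned.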

deep sleet
#

oh

wooden sail
#

sure

deep sleet
wooden sail
#

tpu is not a gpu

#

i've never played with tpu's myself so i'm not familiar with the specs and i can't really comment on what is better here

#

try them out and see what happens

deep sleet
#

oh

#

I think I will google what they are first

vagrant root
#

gpu is specific for matmuls

deep sleet
#

oh

vagrant root
#

ye

deep sleet
#

Thx

vagrant root
#

go for t4 gpus

#

tpus are deprecated

deep sleet
#

Noted but what does deprecated mean?

vagrant root
#

not up to the technology

#

left to die projects basically

wooden sail
#

not quite right, they're way more power efficient

vagrant root
wooden sail
#

i would also imagine they work great for anything using XLA (which pytorch does not use by default)

wooden sail
vagrant root
#

moving towards tensor chips?

vagrant root
wooden sail
haughty cradle
#

I know some of them but where can I learn the others?

#

I only know Input, Hidden, Output, and Recurrent cell

#

don't know others things

vagrant root
vagrant root
haughty cradle
#

thx

vagrant root
# haughty cradle thx

FROM CLAUDE:
Input Cell: The entry point for data into a neural network.
Backfed Input Cell: An input cell that receives feedback from later layers.
Noisy Input Cell: Adds random noise to input data for improved generalization.
Hidden Cell: Processes and transforms data in intermediate network layers.
Probabilistic Hidden Cell: Incorporates randomness for modeling uncertainty.
Spiking Hidden Cell: Mimics biological neurons by firing at specific thresholds.
Capsule Cell: Groups neurons to represent entities and preserve hierarchical relationships.
Output Cell: Produces the final prediction or result of the network.
Match Input Output Cell: Attempts to recreate specific input patterns in its output.
Recurrent Cell: Processes sequential data using self-looping connections.
Memory Cell: Stores information over time in recurrent networks.
Gated Memory Cell: Controls information flow with additional mechanisms.
Kernel: A small matrix of weights used in convolutional operations.
Convolution or Pool: Applies kernels or reduces spatial dimensions of data.

haughty cradle
#

thx ❤️

vagrant root
past meteor
#

I think I had a meta overfitting problem 🥴

#

The optimization algorithm you use for hyperparam tuning can also overfit if the tuning set is fixed. After some time it may find hyperparams that don't generalize to other datasets

#

That's really interesting

lapis sequoia
#

if some one want data science and coding related courses dm me

past meteor
#

I have a 3 way split

#

It's on the holdout test you can see if the hyperparam tuning "overfit"
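
A bare-bones sketch of that kind of split, assuming index-based splitting and fractions picked for illustration (the chat does not give the actual proportions). `three_way_split` and `pick_best` are hypothetical helpers.

```python
import random

def three_way_split(n_samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle indices once and cut them into train / validation / test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_frac)
    n_val = round(n_samples * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test

def pick_best(candidates, score_on_val):
    """Hyperparameters are chosen on validation only; the test set stays untouched."""
    return max(candidates, key=score_on_val)

train, val, test = three_way_split(2000)
print(len(train), len(val), len(test))
```

The test indices are scored once, at the very end. A big gap between the best validation score and that final test score is the sign that the tuning itself overfit.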

#

This is 100% how you should do it

#

But

#

I don't cross validate neural nets

#

All the rest, yes

#

Takes too much time

#

Because you mostly do DL right?

#

Doing the entire thing is way better, improves skin quality, sleep at night, blood pressure etc.

#

But it takes too long for deep nets

#

I'd argue that not training a model means you still need to evaluate

#

Unless it's low stakes stuff

#

That's not what I meant 😮

#

If I were doing RAGs in the real world I'd spend a lot of time thinking about how to quantify their performance

#

That's the most high value literature out there right now for "industry ML"

#

You're not training anything but you do need to iterate and be able to quantify the performance gain somehow, otherwise it's anecdotal and small sample based

#

How will you evaluate a RAG?

#

Evaluation is the thing I take the most seriously 😄

#

There was talk of collaborating with the emergency services to make a model to detect people falling

#

95-99% of such a project is data collection + evaluation framework stuff

#

I don't think it's something that data people take seriously tho

#

My colleague just had some videos of him falling and was making the models on that

vagrant root
#

is a 2000 sample data valid in your opinion?

#

to form a hypothesis

past meteor
vagrant root
past meteor
#

I don't want to sound annoying but your question isn't specific enough 🙂

vagrant root
#

would you consider a research on 2000 element dataset valid? if it provides good result

past meteor
#

"To form a hypothesis" what do you exactly mean with this?

#

Are you training models? What are you doing

#

Or statistical tests? Or data analysis?

vagrant root
#

i have a method proposal which takes 2000 samples and trains the model on it

#

the model performs with an accuracy of 85 on testing and 94 on validation

#

would you consider the method to be a valid research or is the data too small to infer
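
One rough way to put a number on "too small" is the sampling noise of the reported accuracy. This sketch uses the normal approximation to a binomial proportion. The chat does not say how many of the 2000 samples are in the test set, so the 400 below is an assumed value, and `accuracy_interval` is an illustrative name.

```python
import math

def accuracy_interval(accuracy, n_test, z=1.96):
    """Normal-approximation 95% interval for an accuracy measured on n_test samples."""
    se = math.sqrt(accuracy * (1.0 - accuracy) / n_test)
    return accuracy - z * se, accuracy + z * se

low, high = accuracy_interval(0.85, 400)   # 400 is an assumed test-set size
print(round(low, 3), round(high, 3))
```

This only covers sampling noise on the test set. It says nothing about whether the split is representative or whether the validation set leaked into model selection.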