#data-science-and-ml


serene scaffold
#

don't try to learn "fast". that probably means you're willing to cut corners and develop only a superficial understanding.

left tartan
#

Fast is Slow. Slow is Fast.

#

I've got all the dad-isms memorized.

past meteor
#

yes you need FC layers, each neuron is basically a dimension of your output vector

#

How many do you need? That's a hyperparameter 🙂

spring field
#

so does seaborn also have an unintuitive API?
alternatives? plotly?

serene scaffold
spring field
calm thicket
#

i am also a plotly shill

young granite
past meteor
#

Matplotlib is OK if you read their docs then it makes sense

#

BUT it’s awfully verbose to do some things seaborn does easily out of the box

#

The annoying thing is that imo, to use seaborn you must know matplotlib or you’ll not be able to customise it properly

serene scaffold
#

@upbeat prism why did you delete your message

#

I was formulating my response

torpid ginkgo
#

Hi, can anyone help me with code? It's quite long. I tried to train a model for computer vision. For some reason it does not run to completion

serene scaffold
thorny geode
#

hi, beginner here, does a smaller data set cause predictions to capture more noise (the randomness of the data that causes irreducible error)?
so the smaller the dataset, the more the error reflects the variation in the data, is that true?

#

@serene scaffold ._.

serene scaffold
thorny geode
#

so i usually ask my teachers to confirm whether my exact understanding of the material is correct or not

serene scaffold
#

I've never heard anyone describe a prediction as "capturing" something.

thorny geode
#

perhaps you may need to see yourself

#

wait

#

this is the definition of f, and epsilon (which is the irreducible error i was talking about earlier)

upbeat prism
#

lol I thought I deleted a different one >.<

serene scaffold
upbeat prism
serene scaffold
thorny geode
serene scaffold
#

@thorny geode what does all this have to do with the size of the dataset?

thorny geode
#

this is also what i meant from "capturing the noise" (the green curve)

serene scaffold
#

it's not that all the data points are "noise". it's that the green curve isn't general.

thorny geode
# serene scaffold @thorny geode what does all this have to do with the size of the dataset...

my hypothesis is that as the data set gets larger, the errors balance each other out, so the overall prediction captures the true f much more than the noise (since the variance of the estimate decreases as the sample size increases). but in an extreme case, where there are only 3 data points, even a slight variation in the data can cause the prediction to "wobble" considerably, such that the prediction captures much more noise than the underlying relationship itself
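The "wobble" described above can be checked numerically. The following is a minimal sketch, assuming a hypothetical true function f(x) = 2x + 1 with Gaussian noise as epsilon (neither comes from the discussion); it fits a straight line to many noisy datasets and compares how much the fitted slope varies for 3 points versus 300 points.

```python
import random
import statistics

def fit_slope(xs, ys):
    """Ordinary least-squares slope of a straight-line fit."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

def slope_spread(n_points, n_trials=500, noise_sd=1.0, seed=0):
    """Standard deviation of the fitted slope across many noisy datasets."""
    rng = random.Random(seed)
    xs = [i / (n_points - 1) for i in range(n_points)]  # evenly spaced in [0, 1]
    slopes = []
    for _ in range(n_trials):
        # Assumed true f(x) = 2x + 1; rng.gauss plays the role of epsilon.
        ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise_sd) for x in xs]
        slopes.append(fit_slope(xs, ys))
    return statistics.stdev(slopes)

spread_small = slope_spread(3)
spread_large = slope_spread(300)
print(f"slope spread with   3 points: {spread_small:.3f}")
print(f"slope spread with 300 points: {spread_large:.3f}")
```

The noise level is identical in both runs; only the number of points changes. The irreducible error per observation stays the same, while the variance of the fitted model shrinks with more data.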

thorny geode
serene scaffold
upbeat prism
#

Let me ask a similar question but more concrete. I wrote a PyTorch tensor type using tensor subclasses. I basically just log any function call that is made on the tensor. My tensor is called LoggingTensor. So if you have a tensor t of type LoggingTensor and you do t + 1, you'll see a print statement saying that the addition function was used.

I have this network:

class MyModel(nn.Module):
    def __init__(self, bias=False):
        super(MyModel, self).__init__()
        self.input = nn.Linear(1, 2, bias=bias)
        self.hidden1 = nn.Linear(2, 2, bias=bias)
        self.hidden2 = nn.Linear(2, 2, bias=bias)
        self.output = nn.Linear(2, 2, bias=bias)

        self.init_weights()

    def forward(self, x):
        v = self.input(x)
        h = self.hidden1(v)
        k = self.hidden2(h)
        o = self.output(k)

        return o

    def init_weights(self):
        # [[w1], [w2]]
        self.input.weight = torch.nn.Parameter(torch.tensor([[0.2], [0.3]]))
        # [[w3, w4], [w5, w6]]
        self.hidden1.weight = torch.nn.Parameter(torch.tensor([[0.4, 0.5], [0.6, 0.7]]))
        # [[w7, w8], [w9, w10]]
        self.hidden2.weight = torch.nn.Parameter(
            torch.tensor([[0.8, 0.9], [0.22, 0.33]])
        )
        # [[w11, w12], [w13, w14]]
        self.output.weight = torch.nn.Parameter(
            torch.tensor([[0.44, 0.55], [0.66, 0.77]])
        )

It's very basic. 🙂

Now I do:

    bias = False
    model = MyModel(bias=bias)
    x = torch.tensor([6.9], requires_grad=True)
    breakpoint()
    prediction = model(x)
    grad_out = LoggingTensorOld(torch.ones_like(prediction))
    L = prediction.sum()
    prediction.backward(gradient=grad_out)
thorny geode
upbeat prism
#

Note that the input tensor is a normal torch tensor but I define the "output gradients" as one and of type LoggingTensor. I pass it to backward . Now this allows me to see a log of all the function calls during backprop. I see this: https://bpa.st/FU4Q

As you can see, we have: (My notation is simplified)

  • dL/do * W
  • dL/dk * W
  • dL/dh * W

But also

  • dL/dx * x
  • dL/dh * v
  • dL/dk * h

and 2-3 other things I'm not sure about yet.

Question: Why exactly do I see forward values show up?

serene scaffold
upbeat prism
#

hmm maybe chatgpt to the rescue for me

#

god maybe I'm just way too tired.

#

Does backprop also compute dL/dw?

serene scaffold
upbeat prism
#

okay so it's really just as simple as

#

"we derive with respect to the weight duh"

#

okay good, very close to figuring out my issue
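Why forward values appear during backprop can be seen in a hand-written gradient for a small linear network. This is a plain-Python sketch, not the PyTorch implementation: it reuses the first two weight matrices and the input 6.9 from the model above, drops the remaining layers for brevity, and uses L = sum(outputs) as an assumed loss. The gradient with respect to each weight is the upstream gradient multiplied by the value that entered that layer in the forward pass.

```python
import copy

def matvec(W, x):
    """Matrix-vector product for nested lists."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def loss(W1, W2, x):
    """L = sum(W2 @ (W1 @ x)): two linear layers without biases."""
    return sum(matvec(W2, matvec(W1, x)))

W1 = [[0.2], [0.3]]            # input layer weights
W2 = [[0.4, 0.5], [0.6, 0.7]]  # hidden1 layer weights
x = [6.9]

# Forward pass: the intermediate activation v is kept because backprop reuses it.
v = matvec(W1, x)
dL_dh = [1.0, 1.0]  # gradient of sum(h) with respect to h

# dL/dW2[i][j] = dL/dh[i] * v[j]   (forward value v shows up here)
dL_dW2 = [[g * vj for vj in v] for g in dL_dh]
# dL/dv[j] = sum_i W2[i][j] * dL/dh[i]   (upstream gradient times weights)
dL_dv = [sum(W2[i][j] * dL_dh[i] for i in range(len(W2))) for j in range(len(v))]
# dL/dW1[i][j] = dL/dv[i] * x[j]   (forward value x shows up here)
dL_dW1 = [[g * xj for xj in x] for g in dL_dv]

def numeric_grad(which, i, j, eps=1e-6):
    """Central finite difference of the loss with respect to one weight."""
    params = {"W1": W1, "W2": W2}
    plus, minus = copy.deepcopy(params), copy.deepcopy(params)
    plus[which][i][j] += eps
    minus[which][i][j] -= eps
    return (loss(plus["W1"], plus["W2"], x) - loss(minus["W1"], minus["W2"], x)) / (2 * eps)

print("analytic dL/dW1:", dL_dW1)
print("numeric  dL/dW1:", [[numeric_grad("W1", i, 0)] for i in range(2)])
```

The "gradient times weights" products propagate the gradient backwards between layers, while the "gradient times forward value" products are the weight gradients themselves, which is what an optimizer needs.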

#

what about derivatives between nodes e.g. dv_i/dk_j? I think that's something you could theoretically compute during the forward pass right?

thorny geode
serene scaffold
serene scaffold
thorny geode
serene scaffold
#

also keep in mind that when you view the network as one grand function, the nodes aren't really discrete things in the function. the weights are.

serene scaffold
thorny geode
#

but i think from conversing with you i can relate this to the fact that the variance decreases with sample size.... so technically the irreducible error becomes less important

upbeat prism
serene scaffold
upbeat prism
#

god I swear fucking with the backprop implementation in torch makes me lose my mind 😆 but yeah, I think I finally solved all my theoretical issues. No idea why I got so confused. Now I can finally continue coding, at 02:43 in the morning 😄

#

thanks

upbeat prism
deep veldt
#

i'm confused about which loss i should use for a siamese nn: contrastive, triplet, cross-entropy, which fits best?

rich moth
#

I made this financial model for crypto and in the optuna studies ive seen some really interesting behavior with this thing.

frigid ingot
#

can i have a look at the code?

young granite
#

the residual plot also seems to lack generalization

rich moth
#

you'll see that the validation loss begins to go negative.

young granite
worldly dawn
young granite
worldly dawn
young granite
#

looks like a overfit to me

rich moth
# young granite u say it like its a good thing?

Usually that's the case, but in my model it's actually doing better on validation while using simpler and simpler representations. This is an outline of what I'm using to compute the loss

    # Quantum-aware uncertainty scaling
    quantum_uncertainty = torch.exp(-log_var) * quantum_scale
    
    # Enhanced negative log likelihood with quantum effects
    nll_loss = 0.5 * (quantum_uncertainty * error**2 + log_var + math.log(2 * math.pi))
worldly dawn
#

or the error vs predicted to not be a giant V ?

rich moth
#

I start the model with a tiny dim size of 32, and I built a module that actively seeks to reduce dimensional usage when it spots simpler patterns. think of it like compression: instead of using a huge dim to represent market patterns, it's learning to be more efficient

as far as the predictions vs actuals plot, instead of trying to predict every tiny price movement (which would give you that diagonal line), it's learning to recognize particular patterns where it can make reliable predictions. it shows the model is most accurate (smallest errors) around certain market conditions, and it knows to express more uncertainty when conditions don't match its learned patterns. but thats why the validation loss going negative is a good thing.
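On the sign of the loss: a Gaussian negative log-likelihood of the form shown in the outline above can be negative purely because the predicted variance is small, since a narrow density exceeds 1 near its mean. A minimal sketch, assuming `quantum_scale` is 1 (its actual value is not given in the discussion):

```python
import math

def gaussian_nll(error, log_var):
    """Per-sample Gaussian negative log-likelihood with variance exp(log_var).

    Same form as the outlined loss, with the scale factor assumed to be 1.
    """
    return 0.5 * (math.exp(-log_var) * error ** 2 + log_var + math.log(2 * math.pi))

# Zero error, shrinking predicted variance: the loss crosses below zero.
for var in (1.0, 0.1, 0.01):
    print(f"variance={var:<5} error=0.0  nll={gaussian_nll(0.0, math.log(var)):+.3f}")

# Same small variance, but a noticeable error: the loss is large and positive.
print(f"variance=0.01  error=0.5  nll={gaussian_nll(0.5, math.log(0.01)):+.3f}")
```

A negative value therefore says that the predicted variance is small relative to the errors on that data; on its own it does not show whether the point predictions carry any signal.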

worldly dawn
#

For the V, it sounds like you are predicting returns, which should be centered around 0 (and also explains why you have a lot of points centered there). So the V could mean the further you are from the center, the more wrong you can be, by definition

rich moth
# worldly dawn Not sure to follow. Your model has predictions and they do not match the actual...

Its actually intentional due to how i built the architecture. The horizontal banding happens because the model separates market conditions into different "regimes" or patterns. Look at the color intensity in those bands - the brighter colors show where the model is most confident.

So when you see those horizontal bands, that's the model saying "under these specific market conditions, I expect returns in this range with high confidence." The spacing between bands suggests it's identified distinct market states where different return ranges are likely.

Think of those horizontal bands like prediction zones so when the model sees certain market conditions, it's saying "I expect returns to fall in this specific range." The fact that they're distinct bands rather than a scattered diagonal line means the model has identified clear, separate market states where different return ranges are likely.

worldly dawn
#

If your model predicts a gain of 100% tomorrow but the stock actually loses 100%, you lose money, even if the model was super confident about it

young granite
#

better approach then would be to predict such events as a timeseries data giving u dates to invest

#

but honestly as recursive pointed out a confident model on wrong predictions brings u nothing

worldly dawn
#

note also you can use/build a model to classify the market regime and use that in your model

#

but regardless, your predictions need to have some predictive power

#

Otherwise, how do you plan to use that model?

#

If you do not believe the feedback you got here, I would recommend you run a backtest of your model on a trading strategy to see the results for yourself (make sure the backtest uses data from after the period your model was trained on)

rich moth
rich moth
#

2.75 - 3.5 sharpe ratio in the backtest with a 70% win rate? What do you want to know?

worldly dawn
#

like win rate? sharpe ratio? max drawdown?
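For reference, the three metrics named above can be computed from a series of per-period strategy returns. This is a sketch under stated assumptions: the return values are made up, `periods_per_year=252` is a convention for daily equity data (a market that trades every day would use a different number), and drawdown is reported as a fraction that is zero or negative.

```python
import math
import statistics

def win_rate(returns):
    """Fraction of periods with a strictly positive return."""
    return sum(1 for r in returns if r > 0) / len(returns)

def sharpe_ratio(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Annualised mean excess return divided by its standard deviation."""
    excess = [r - risk_free_per_period for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)

def max_drawdown(returns):
    """Largest peak-to-trough drop of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst

# Made-up daily returns, for illustration only.
daily_returns = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02, -0.005]

print(f"win rate:     {win_rate(daily_returns):.2%}")
print(f"sharpe ratio: {sharpe_ratio(daily_returns):.2f}")
print(f"max drawdown: {max_drawdown(daily_returns):.2%}")
```

These numbers only mean something when the returns come from data the model never saw during training or hyperparameter search.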

rich moth
#

The max drawdown is actually in the negative. lol

worldly dawn
rich moth
#

Im sure it does hehe

worldly dawn
#

but hey, I would be happy to be proven wrong

young granite
#

soon u are millionaire

#

but good for u if u backtested and it works, its fine to gamba a bit i guess

past meteor
#

When that's fixed, you can tell in an instant if a model is doing well or not

rich moth
#

I was showing my friends this but im actually for the entire image. I had to make it discord friendly, lol

#

lemme locate it

past meteor
#

Either way, I'm always extremely skeptical of this stuff

#

It's the data equivalent of turning iron into gold

past meteor
#

And the follow up is "if I can do it, why can't an army of PhDs who have dedicated their lives to it at {insert massive firm here} do it?"

young granite
worldly dawn
young granite
worldly dawn
worldly dawn
#

I kinda do that in a month and am not even looking at intra days, more chill stuff

rich moth
#

look, i get it. i'd be skeptical too. heres one, you have to save it to zoom in though

#

im running optuna trials so im still trying to narrow down the best parameters.

worldly dawn
rich moth
#

obviously bitcoin doesnt go that far dude, its a hobby project, its not perfect geez lol but my architecture does sing

worldly dawn
past meteor
#

Or at least simulate it in the real world

rich moth
#

I do simulate real world.

past meteor
#

And the results are the same there?

rich moth
#

lol, that's why I implemented market regime detection, transaction costs, and realistic position sizing. those were the real world results my friend. you should have seen them before, on those results I had a sharpe in the 4's

worldly dawn
rich moth
#

I plan to, but I'm going to get everything right before I chuck real money at it

odd meteor
# deep veldt im confused in which loss should i use for siamese nn, contrastive, triplet, cro...

I've seen contrastive loss used most of the time, however, triplet loss can as well be used depending on how your Siamese network is structured.

The best fit depends on your task.

When to choose between Contrastive and Triplet loss

  • Triplet Loss: when working on fine-grained distinctions or ranking tasks, when working with a very large dataset with diverse examples, or when working on tasks that require nuanced relative comparisons.

  • Contrastive Loss: when you're trying to determine whether two inputs are similar (binary similarity tasks), when working with a small dataset, or when working on simpler similarity/dissimilarity tasks.
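The difference between the two losses is easiest to see on single examples. The sketch below uses one common formulation of each on plain Python lists; the `margin` default of 1.0 and the function names are illustrative assumptions, and library implementations may differ in details such as scaling factors or the distance used.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Pairwise loss: pull matching pairs together, push others beyond the margin."""
    d = euclidean(emb_a, emb_b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Relative loss: the positive must be closer to the anchor than the negative, by a margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

anchor = [0.0, 0.0]
near = [0.1, 0.0]
far = [3.0, 4.0]

print("contrastive, matching pair, close   :", contrastive_loss(anchor, near, same=True))
print("contrastive, different pair, far    :", contrastive_loss(anchor, far, same=False))
print("triplet, positive near, negative far:", triplet_loss(anchor, near, far))
print("triplet, all embeddings identical   :", triplet_loss(anchor, anchor, anchor))
```

Contrastive loss needs a pair plus a same/different label, while triplet loss needs three inputs and only a relative ordering. With this formulation, a triplet loss that sits at the margin value is what collapsed (identical) embeddings produce.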

desert cove
#

I need help!

So i need a problem statement for the major project of college but I ain't got no idea. So please help me. Drop Some real life problems that can be solved or optimized using AI but no one has yet attempted to do so.

thorny geode
# serene scaffold you don't need to be sorry. I should be sorry for not knowing. but I'm not sorry...

i finally got the quote i need !!!!!

Overfitting is especially likely in cases where learning was performed too long or where training examples are rare, causing the learner to adjust to very specific random features of the training data that have no causal relation to the target function. In this process of overfitting, the performance on the training examples still increases while the performance on unseen data becomes worse.
wikipedia

thorny geode
serene scaffold
thorny geode
deep veldt
thorny geode
past meteor
thorny geode
past meteor
errant bison
#

From where can i start learning llm

serene scaffold
trim saddle
odd meteor
# deep veldt i used tripletloss but the loss doesnt improve, its always around 1 or 0.8 ive t...

Unfortunately, there's no shortcut to this part in ML. You just have to experiment more to figure out how to improve your model performance.

To avoid wasting much effort, start your debugging by verifying that your model is at least able to overfit on a small subset of your data (n <= 100 samples) .

If you're unable to overfit on that small subset, then the issue is most likely one of these

  • your model architecture is too simple to fit the data
  • you didn't set up the training process properly
  • there's indeed a bug in your implementation of Siamese network.
lusty relic
#

Today I launched my AI project for business owners, where AI handles inventory and chats with customers about business products, availability and services in realtime. I would like anyone to try it out https://app.cognova.io/?ref=dsc

Cognova

AI-Driven Chat For Smarter Decisions

serene scaffold
#

do you have any other features available? otherwise, you're limited to figuring out if there's some sort of time-based cycle (weekly, monthly, yearly, etc.)

mighty lake
#

there's other features but they're redundant, I used them to remove any outliers

#

the only actual data that would matter is the price and sold_at I'm pretty sure

serene scaffold
mighty lake
#

even after removing obvious outliers there's still some stragglers that I'll remove

serene scaffold
#

why is there more than one point per day?

mighty lake
#

kus multiple sell per day

serene scaffold
#

what is the thing that's being traded? a stock?

mighty lake
#

nah, just an online product being sold

serene scaffold
#

okay, what online product?

mighty lake
#

why does that matter?

spring field
#

could be seasonal for example

serene scaffold
mighty lake
#

it's just a gaming thing, available for purchase 24/7 and is non-seasonal. all of this stuff I've accounted for I'm fairly sure

#

the problem is idk which method of forecasting to use

tawdry ore
#

Hey guys how do you manage your python virtual environments

mighty lake
#

I'm only able to determine price trends currently, I can't really accurately determine what the specific price would be in x days

serene scaffold
#

I'm not sure that you have enough data to produce a forecasting model that isn't essentially random.

tawdry ore
#

I have this problem where I don't know what to do exactly. Making one big env for a topic like data science means downloading a lot of libs into one place, and some projects don't need most of them. However, if I make small envs for each subtopic, I will have to install extra libs that will probably be in the rest of these envs too, because most projects need mostly the same libs.

spring field
#

it does have a global cache btw
and it's very speed

mighty lake
tawdry ore
#

for example how would you group these

serene scaffold
calm thicket
#

even pip has a global cache. it's usually fine to have multiple venvs with the same packages

spring field
#

ah, hmm, thought it was a uv thing
oh well, uv is speeeeeeed

tawdry ore
#

pandas
numpy
matplotlib
seaborn
streamlit
shiny
requests
beautifulsoup
scrapy
tensorflow
sklearn
keras
scrapy

would you install it in one env or make small ones for each category?

calm thicket
#

i would not do categories. create a venv for each project

serene scaffold
tawdry ore
calm thicket
#

the point of caching is that it doesn't take up extra space (also the packages are not even that big to begin with)

mighty lake
serene scaffold
serene scaffold
tawdry ore
#

about 25MB or something

spring field
mighty lake
calm thicket
#

if you're really concerned about local storage, just use google colab or something

tawdry ore
#

I make a lot of small projects and this would be inefficient!

#

I want to make a big env, but I am worried about the performance!

serene scaffold
tawdry ore
#

So, what do you recommend?

serene scaffold
#

make a venv for each project

tawdry ore
#

That is a storage space problem. I have limited storage on that Linux boot.

serene scaffold
#

how much?

tawdry ore
#

44.11 GiB / 58.81 GiB (75%) - ext4

serene scaffold
#

okay, well having just one environment won't affect the performance of your code. but if you need to reproduce the results later, you need to keep track of all the library versions that you used

#

which you can do with pip freeze and write the result to file.
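From a shell, `pip freeze > requirements.txt` is the usual way to do this. The sketch below is a rough standard-library stand-in using `importlib.metadata`, not a reimplementation of `pip freeze` (it ignores editable installs and packages installed from URLs); the output file name is a hypothetical choice.

```python
import tempfile
from importlib import metadata
from pathlib import Path

def snapshot_installed(path):
    """Write one 'name==version' line per distribution visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            pins.add(f"{name}=={version}")
    lines = sorted(pins, key=str.lower)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines

# Hypothetical output location; a real project would keep this next to the code.
out_file = Path(tempfile.gettempdir()) / "requirements-snapshot.txt"
pinned = snapshot_installed(out_file)
print(f"wrote {len(pinned)} pinned packages to {out_file}")
```

Run with the environment's own interpreter, this records the versions that environment actually contains.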

tawdry ore
#

I mostly import all the dependencies at the beginning of the code. This is not primarily an issue.

serene scaffold
#

right, but each environment can only have one version of each library

tawdry ore
#

I just heard people saying that having a lot of dependencies would be heavy while loading the env

serene scaffold
#

they were wrong.

tawdry ore
#

Ok, then if this is the case I prefer having everything related to each topic in one place

#

Gemini just told me that having a lot of dependencies may make the env heavy

serene scaffold
#

what does it mean for the env to be "heavy"?

tawdry ore
#

takes time to activate

calm thicket
#

activating a venv is basically just setting some environment variables

tawdry ore
#

Does this disprove it?

serene scaffold
#

actually activating the venv is instantaneous no matter how much you have installed, for the reason PSVM just said.
starting a python process with that venv is different. having more libraries installed might make an imperceptibly small difference.
having more libraries installed increases the number of places python has to look when you import stuff. and this will have an imperceptibly small difference as well.

none of this is the kind of thing you need to optimize around.
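What activation changes can be inspected from inside Python. A small sketch using only the standard library: the activate script sets the `VIRTUAL_ENV` variable and puts the venv's `bin` (or `Scripts`) directory first on `PATH`, while the interpreter itself detects a venv by `sys.prefix` differing from `sys.base_prefix`.

```python
import os
import sys

def in_virtualenv():
    """True when this interpreter runs from a venv (its prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix

print("running inside a venv:", in_virtualenv())
print("VIRTUAL_ENV variable :", os.environ.get("VIRTUAL_ENV"))

# Directories searched on import. Installing more packages adds entries inside
# site-packages, but does not add more search directories by itself.
print("import search path entries:", len(sys.path))
for entry in sys.path:
    print("  ", entry)
```

The list printed at the end has the same length whether the environment holds ten packages or a thousand, which is why environment size is not something to optimize around.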

serene scaffold
tawdry ore
#

the increase in time taken to activate an env depending on the number of libs

serene scaffold
tawdry ore
#

yeah, thanks

serene scaffold
#

remember that starting a python process with a venv is not "activating" the venv.

tawdry ore
#

I am not talking about the efficiency of the code. I am talking about the time to activate the .venv

serene scaffold
#

like source .venv/bin/activate? that is always instantaneous.

tawdry ore
#

alr

#

thanks

lapis sequoia
#

can you use a Adam optimizer for a LSTM?

young granite
#

@serene scaffold are u not on mobile today :D?

serene scaffold
young granite
#
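import pathlib

import torch
import torchvision
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

img_path = pathlib.Path('./data/img/Test')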

for folder in img_path.iterdir():
    if folder.is_dir():
        main_name = folder.name
        for subfolder in folder.iterdir():
            if subfolder.is_dir():
                sub_name = subfolder.name
                subfolder.rename(img_path / f"{main_name}_{sub_name}")


dataset = torchvision.datasets.ImageFolder(img_path)
lable_dict = dataset.class_to_idx

# PIL to Torch img
data_transformer = transforms.Compose([transforms.ToTensor(),
                                       transforms.Resize((46,46))])

img_paths, img_lables = [i[0] for i in dataset.imgs], [i[1] for i in dataset.imgs]

class CustomDataset(Dataset):
    def __init__(self, image_paths, labels, transform):
        self.image_paths = image_paths
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        # Load the image
        image = Image.open(self.image_paths[idx]).convert("RGB")
        image = self.transform(image)  # Apply transformations
        return image, torch.tensor(self.labels[idx], dtype=torch.long)
    
img_dataset = CustomDataset(img_paths, img_lables, data_transformer)
classes = dataset.classes
serene scaffold
serene scaffold
serene scaffold
left tartan
mellow pecan
#

Does anyone here works in the field of BIO DATA SCIENCE?

serene scaffold
rich moth
#

I'm finally finding the sweet spots in the model's parameters. i gained nearly two hundred epochs even with a patience of 10 set. it's kinda hard to see but there's a minus sign in front of the y=-0.03

rich moth
#

print the first 5 path and label pairs, lets see whats going on with the data

rich moth
#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

toxic palm
thorny geode
rich river
#

if I defined a pytorch model in a file, how do I use it in another file?

past meteor
wooden sail
#

you can check chapters 4 and 5 here. sadly it's not searchable, but also i didn't find overfitting mentioned explicitly. the ideal is discussed in depth though: under some conditions, you can always generate a sequence of functions that converges exactly to the data points you observed, and this is often not what you want. you want a way of restricting the possible functions to ones that are somehow "simple"
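The point about functions that converge exactly to the observed data can be illustrated with polynomial interpolation. A minimal sketch, assuming a hypothetical noisy straight line as the data source: the Lagrange polynomial through 8 points reproduces every observation exactly, yet between the points it can stray from the underlying line.

```python
import random

def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the unique degree-(n-1) polynomial through the n points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

rng = random.Random(0)
xs = [i / 7 for i in range(8)]
# Assumed data source: the line 2x + 1 plus Gaussian noise.
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.3) for x in xs]

# Error on the observed points themselves.
train_error = max(abs(lagrange_interpolate(xs, ys, x) - y) for x, y in zip(xs, ys))

# Error against the noise-free line, halfway between observed points.
midpoints = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
off_grid_error = max(
    abs(lagrange_interpolate(xs, ys, m) - (2.0 * m + 1.0)) for m in midpoints
)

print(f"max error on observed points: {train_error:.2e}")
print(f"max error between points    : {off_grid_error:.3f}")
```

Restricting the candidate functions, for example to low-order polynomials, gives up the exact fit in exchange for behaviour that does not depend on each individual noisy point.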

#

the discussion on the order of polynomials in ch 4 is good

past meteor
#

I think if you really want the technical definition googling "generalization gap" is a good start

gritty vessel
#

Anyone here has idea about datacubes

#

I want to learn how to create datacubes but can't find any useful resources

past meteor
gritty vessel
#

Like perform OLAP operations on it and store it like this

past meteor
#

I doubt people think of the image above when talking about OLAP

#

And you're better off googling either of "data engineering" and "dimensional modelling"

gritty vessel
#

Okie

#

Will check that

gritty vessel
past meteor
#

What kind of data do you have?

hidden pelican
#

kaboom can i dm you i have a question

scarlet anchor
#

Hey, does anyone here have a time series forecasting project with a dataset where dates repeat and there is an hourly pattern in the data?

gritty vessel
#

Latitude longitude time and different sensors data at those

#

Coordinates and time

gritty vessel
past meteor
#

What types of queries

gritty vessel
#

Convert hourly or minutes data to daily or vice versa

hidden pelican
gritty vessel
#

Change resolutions of data suppose one satellite has 2x2km resolution

#

And another has 5x5km

#

So we have to interpolate or some other technique

#

To bring them to the same grid
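The two operations described here, changing the time resolution and bringing data onto a different spatial grid, can be sketched in plain Python. This is a simplified one-dimensional illustration with made-up values; libraries such as xarray offer resampling and interpolation for real N-dimensional data, and linear interpolation is only one of several possible regridding techniques.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timedelta

def daily_mean(samples):
    """Aggregate (timestamp, value) pairs to one mean value per calendar day."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts.date()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in sorted(buckets.items())}

def regrid_linear(src_coords, src_values, dst_coords):
    """Linearly interpolate values from an ascending source grid onto target coordinates."""
    out = []
    for x in dst_coords:
        if x <= src_coords[0]:
            out.append(src_values[0])       # clamp below the source grid
        elif x >= src_coords[-1]:
            out.append(src_values[-1])      # clamp above the source grid
        else:
            hi = bisect.bisect_right(src_coords, x)
            lo = hi - 1
            w = (x - src_coords[lo]) / (src_coords[hi] - src_coords[lo])
            out.append((1 - w) * src_values[lo] + w * src_values[hi])
    return out

# Made-up hourly sensor readings covering two days.
start = datetime(2024, 1, 1)
hourly = [(start + timedelta(hours=h), float(h)) for h in range(48)]
per_day = daily_mean(hourly)
print(per_day)

# Made-up 1D transect: values on a 2 km grid, resampled onto a 5 km grid.
coords_2km = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
values_2km = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
coords_5km = [0.0, 5.0, 10.0]
on_5km = regrid_linear(coords_2km, values_2km, coords_5km)
print(on_5km)
```

Going from daily back to hourly is a different problem: aggregation discards information, so the finer series has to be interpolated or modelled, not recovered.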

past meteor
#

hmm, then I don't think OLAP is a good fit. How did you get there in the first place?

gritty vessel
#

What do you mean?

past meteor
#

Just curious to see where you got the idea to use OLAP from + what their train of thought was?

gritty vessel
#

Is a good choice suppose I have to slice a part of data

#

Or convert data to

#

Like monthly

past meteor
#

Have you considered time series DBs?

gritty vessel
#

Nup

#

But will we able to perform similar operations

#

In time series dB?

past meteor
#

Yes

gritty vessel
#

And multiple dimensions as well?

#

Though do I have to create a datacube as My guide told me to do so

#

I did create something similar

#

By storing it in ndimarrays

past meteor
#

Aside from time series DBs

#

I feel like there should be some geospatial DB or so

#

I've not worked with GIS but upsampling and downsampling satellite images seems like a standard use case

gritty vessel
#

Datacubes are used widely tho

#

While handling Geospatial data

gritty vessel
#

And like to check the trends

#

How it's behaving weekly

#

Monthly ,quarterly

plush kettle
#

Hello

#

Does any of you know about computer vision

unkempt apex
#

dont ask to ask

plush kettle
#

Alright, so I would like to make a face recognition system. The input is only one image. How do I augment the picture in a way that the face is facing every angle like for example I am looking up

#

Also can mediapipe’s face mesh be used for facial recognition? By identifying who the face belongs to with face mesh?

unkempt apex
#

you only want to detect face or you wanna detect particular face?

plush kettle
#

Particular

#

Also, is there any python library or GAN to remove accessories like goggles from faces

flat hawk
#

Hi, do you guys know any good Data Science conferences in Europe for 2025?

unkempt apex
rich moth
#

I think I might have stumbled on an interesting new approach for financial time series architectures, hear me out. Rather than manually tuning the dim sizes using the standard optimization methods, I've been experimenting with letting the network discover its own ideal capacity through a continuous feedback mechanism. The idea came to me thinking about financial markets and their immutable property. So it got me thinking, what if it could adapt its capacity based on the inherent complexity of the market state it's processing. I'm testing this theory to check it out.

toxic mortar
#

hahahaa

#

But if that is you, use xgboost. Now

odd meteor
# deep veldt how do you overfit a model?

Do you understand the concept of bias-variance tradeoff?

Overfitting occurs when your model performs exceptionally well on the train data but poorly on the test data.

I suggested you check if your model can overfit on a small subset of your data because it's actually a good sanity check.

If you train your model on a small subset of your data and it cannot achieve 99% - 100% accuracy, then it's a clear indication that your model might have those 3 issues I mentioned in my last response.

The idea is, a small dataset is easy to memorize, even for a relatively simple model. So if your model cannot even overfit on a small dataset, then it'll likely struggle to generalize on even larger dataset.

So in essence, confirming your model can overfit on a small subset of your dataset is a sanity check used to catch bullsh!t very early w/o having to waste time and energy training a model for so long only to end up with an abysmal performance.

If you try that and you're able to overfit on a small subset of your data, then that proves that indeed your training pipeline (data loading, model architecture, optimizer, etc) is truly working correctly.
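The sanity check described above can be written as a small test. The sketch below uses a hand-rolled logistic regression on eight made-up 2D points as a stand-in for a real model and training loop; the learning rate and step count are illustrative choices. A working pipeline memorises the tiny subset, while a deliberately broken one (learning rate of zero, so the weights never update) does not.

```python
import math

def train_logistic(points, labels, lr=0.2, steps=5000):
    """Full-batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b, n = [0.0, 0.0], 0.0, len(points)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for (x1, x2), y in zip(points, labels):
            p = 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))
            grad_w[0] += (p - y) * x1 / n
            grad_w[1] += (p - y) * x2 / n
            grad_b += (p - y) / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b

def accuracy(w, b, points, labels):
    """Fraction of points whose predicted class matches the label."""
    correct = 0
    for (x1, x2), y in zip(points, labels):
        predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        correct += predicted == y
    return correct / len(points)

# Made-up tiny subset: two well-separated clusters.
tiny_x = [(0, 0), (0, 1), (1, 0), (1, 1), (3, 3), (3, 4), (4, 3), (4, 4)]
tiny_y = [0, 0, 0, 0, 1, 1, 1, 1]

good_acc = accuracy(*train_logistic(tiny_x, tiny_y), tiny_x, tiny_y)
broken_acc = accuracy(*train_logistic(tiny_x, tiny_y, lr=0.0), tiny_x, tiny_y)

print(f"training accuracy, working pipeline: {good_acc:.2f}")
print(f"training accuracy, lr=0 (bug)      : {broken_acc:.2f}")
```

In a real project the same idea applies with the actual model and data loader: train on roughly 100 samples, and treat anything far below perfect training accuracy as a sign of a bug or an underpowered architecture.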

upbeat prism
#
(Pdb) C = torch.tensor([[0,1]])
(Pdb) id(C)
5098297024
(Pdb) C[0], type(C[0]), id(C[0])
(tensor([0, 1]), <class 'torch.Tensor'>, 5099183120)
(Pdb) C[0][0], type(C[0][0]), id(C[0][0])
(tensor(0), <class 'torch.Tensor'>, 5372607632)
(Pdb) C[0][1], type(C[0][1]), id(C[0][1])
(tensor(1), <class 'torch.Tensor'>, 5369838736)

I always thought torch.tensor is just a memory wrapper with a bunch of extra info. but it seems that each element of C and C itself is a tensor object?

odd meteor
# flat hawk Hi, do you guys know any good Data Science conferences in Europe for 2025?

Both ECAI and ECIR 2025 are scheduled to be held in Italy.

Sadly, most popular (top) AI conferences are held in North America. (I could go on to rant on how tiring it is but unfortunately, that's where we found ourselves)

Meanwhile, I use https://openreview.net and https://aideadlin.es to keep tabs on stuff.

odd meteor
odd meteor
# upbeat prism ``` (Pdb) C = torch.tensor([[0,1]]) (Pdb) id(C) 5098297024 (Pdb) C[0], type(C[0]...

C is a rank-2 tensor while C[0][0] is a rank-0 tensor; a scalar, yeah, but still a tensor. However, calling C[0][0].item() will yield a normal scalar object (an instance of the int class).

Also, a tensor is more than just a memory wrapper. It also has other functionalities it encapsulates like autograd, PyTorch's computational graph, metadata.

I think the tensor library can be summarized as a multi-dimensional array with GPU support.

serene scaffold
warm copper
#

The highest in the class hahahaha

#

Teacher commented “you used so many different models and introduced focal loss to address imbalance which was very unique”

serene scaffold
warm copper
#

Thank you 🙏

warm copper
# serene scaffold nice!

I wasn’t sure if I was gonna get a high score because I focused on F1-score but teacher said that’s more important than accuracy in imbalanced dataset
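Why F1 can matter more than accuracy on imbalanced data shows up in a small made-up example. In the sketch below, 5% of the labels are positive; a classifier that always predicts the majority class gets high accuracy and an F1 of zero, while a hypothetical model that finds some of the positives scores lower on accuracy but higher on F1.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up imbalanced labels: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95

always_negative = [0] * 100
# Hypothetical model: 3 true positives, 2 missed positives, 5 false alarms.
some_model = [1, 1, 1, 0, 0] + [1] * 5 + [0] * 90

for name, pred in (("always negative", always_negative), ("some model", some_model)):
    print(f"{name:16s} accuracy={accuracy(y_true, pred):.2f}  f1={f1_score(y_true, pred):.2f}")
```

Accuracy rewards the majority-class guess here, whereas F1 only moves when the minority class is actually detected.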

brisk cypress
#

guys if im trying to train an ai but its a pain to get the data since i need to get it manually is it possible to like generate extra data to train it on or do i just use a smaller dataset?

agile cobalt
scenic parcel
#

Is this accurate

serene scaffold
thorny geode
#

i made a ball

thorny geode
brisk cypress
#

it gives output but idk if it is correct

#

or if its just being wrong

serene scaffold
brisk cypress
#

right lemme explain

#

So I have a bunch of pickle files which I want to extract into JSON files. These JSON files include multiple economic values such as price, demand, etc of different items (this is all ingame in a very unusual game). I wish to write a program that gets these pickle files, turns them into JSON and then uses the information to tell me whether it would be optimal to buy or sell each item, shows me the average price change and also shows me a "profitability" factor, and the best way I thought to do this was to use AI to like try and do it best as it can.

#

this is a small snippet of output

brisk cypress
serene scaffold
brisk cypress
#

oh ok

#

gn

brisk cypress
#

i seem to have an error

deep veldt
#

best was 61

odd meteor
deep veldt
odd meteor
odd meteor
wooden sail
#

any of you have a favorite book from which to fish out definitions of probability distributions?

deep veldt
thorny geode
#

matplotlib is taking such a long time to learn ._.

#

i literally spend 2 days of my holiday only reading 10 pages of book

serene scaffold
thorny geode
#

i read that plotly is easier and more interactive

serene scaffold
thorny geode
serene scaffold
#

I've been using matplotlib to varying extents for years, and I still don't feel like I understand it.

serene scaffold
thorny geode
thorny geode
#

its no problem learning it later on right? since its much easier anyway

thorny geode
serene scaffold
thorny geode
#

this is pretty funny, im 65 pages away from using matplotlib a second time

lapis sequoia
#

why is BERT driving me bloody mad?

#

A week ago, it was going fine, I guess I just forgot what I knew. Threw it out the window of my mind

rich moth
#

Use a sentence transformer on the original data to enrich it.

brisk cypress
#

but its 2am so im not gonna fix it rn, gn

fallow coyote
#

can someone explain to me what is model bias? just having trouble trying to define it

serene scaffold
fallow coyote
#

I somewhat figured it out. From what I searched up it can involve a number of errors like algorithm errors, the selection of training data used and whatnot. The ISLP book doesnt really explain model bias that well at all

#

Also, can anyone recommend websites or books that are good at explaining what the Bayes classifier and KNN are? Something which can be somewhat understandable. I want to learn a bit more about these two classification methods before I go on to the Classification chapter of ISLP

weary timber
serene scaffold
weary timber
#

sorry i just replied

serene scaffold
unkempt apex
#

there are lots of architecture

#

but prefer to go with building projects

stable isle
#

What yall working on? Is anyone working with training a model on say the Python grammar spec?

serene scaffold
stable isle
#

On the python grammar spec+code so it understands python code...

serene scaffold
#

For models like ChatGPT, their task is to generate text. They aren't trained on formal representations of English syntax.

stable isle
#

Maybe they should be...

serene scaffold
#

Nope.

stable isle
#

lol I can see you're not a candidate to assist me with my project lol

#

And you're a Computational Linguist!

serene scaffold
#

What do you want your model to do?

calm thicket
stable isle
serene scaffold
#

This conversation can't go anywhere until we've established what the inputs and outputs are for the model.

stable isle
calm thicket
#

even if it would be possible to learn the rules of ebnf grammar from the grammar, it still doesn't have the semantics

serene scaffold
#

There are probably ways that you can improve their performance wrt semantics

stable isle
#

I'm trying to find the paper on the guy who did this with another language to detect 'code smells' in a program

serene scaffold
#

"code smell" is syntactically correct code that's considered distastefully designed

stable isle
#

He used the language's grammar spec...

serene scaffold
#

If you find the paper, please post it in this chat.

stable isle
stable isle
serene scaffold
stable isle
#

That's something but it's not the paper I'm talking about...I'm still searching for it

stable isle
#

I can't find it. Crazy...

spring field
# serene scaffold Yes. But whether or not something is a code smell isn't a syntax question. So I'...

I think I can see a link to why one might think that learning grammar could help, here's an isolated example:
Given code is

for i in range(len(seq)):
    item = seq[i]

Model learns that, based on grammar rules (is this grammar related at all? anyway), this can be simplified to

for item in seq:
    ...

Now, obviously, such a leap might be several levels of indirection deep for a model to even get there and simply "simplifying an expression while maintaining grammar compatibility" is certainly not a great metric for making code better.
It'd be much easier to just tell the model that one is bad and the other is good and let it figure out the rest or something. So yeah, idk what it would do with grammar exactly, but I can see a thread of thought leading somewhere in a somewhat logical direction I guess

serene scaffold
serene scaffold
#

I'll tell you another time if I remember.

#

(as a reminder for myself, my coworker whose name starts with J suggested it, and I thought it was a bad idea.)

spring field
#

was it Joe? /s

serene scaffold
#

@worldly wagon, don't bother learning nltk. I work professionally as a linguist and do not use it.

worldly wagon
#

but i do need nltk, its for my research/research paper

serene scaffold
worldly wagon
serene scaffold
worldly wagon
#

most of my work before was in predictive modelling

worldly wagon
#

i'm honestly worried about the performance too but hoping pandas, numpy and some c++ can carry there

serene scaffold
#

and don't worry about performance until you've confirmed that the performance isn't good enough.

worldly wagon
worldly wagon
serene scaffold
#

@worldly wagon what are you doing that you care about phonemes? (I am a linguist and have an IPA chart on my wall next to me right now.)

worldly wagon
serene scaffold
worldly wagon
#

the plan is for my UG to produce meta data for any given text input

serene scaffold
worldly wagon
serene scaffold
worldly wagon
serene scaffold
worldly wagon
#

thank you thank you

serene scaffold
left tartan
rich moth
#

Pretty happy with how the uncertainty distribution and calibration are looking after 570 epochs, but im curious what ya'll think.. Happy Holidays and all that jazz too.

left tartan
left tartan
rich moth
rich river
#

they are under the same package

#

why would this happen?

deep veldt
#

Should i make different fc layers for resnet when i train it on a different dataset? since it only has 1 fc layer

past meteor
deep veldt
#

not sure how to describe it

past meteor
#

Sure, the architecture can differ from use case to use case

#

Start large and decrease

river zenith
#

I think this is the right area for my question? I have a dataset of book covers and from each of those covers, I’ve programmed a code to take the most (maybe top 5? It’s all a blur at this point) dominant colors in that image. How do I cluster similar enough colors together to be able to get enough color groups to compare across the different subgenres? I’ve tried to use kmeans a number of different ways and I’m just about to give up on the whole thing altogether.

ETA: for example if I have 5 different rgb codes, but they’re all super similar, I want them to count as one color. And I don’t want to hand code it. Afaik, I need some sort of machine learning.
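One possible way to make "similar enough colors count as one", sketched with a greedy distance threshold instead of k-means so that the number of groups does not have to be chosen up front. The function name, the threshold of 30, and the sample colors are all assumptions for illustration.

```python
import numpy as np

def group_similar_colors(colors, threshold=30.0):
    """Greedy grouping: each color joins the first group whose running mean
    is within `threshold` (Euclidean distance in RGB), else starts a new group."""
    sums, counts, labels = [], [], []
    for c in np.asarray(colors, dtype=float):
        for g in range(len(sums)):
            if np.linalg.norm(c - sums[g] / counts[g]) <= threshold:
                sums[g] += c
                counts[g] += 1
                labels.append(g)
                break
        else:
            # No existing group is close enough: start a new one.
            sums.append(c.copy())
            counts.append(1)
            labels.append(len(sums) - 1)
    centers = np.array([s / n for s, n in zip(sums, counts)])
    return labels, centers

# Hypothetical dominant colors from one cover: three near-identical reds, two blues.
colors = [(200, 30, 30), (205, 28, 35), (198, 33, 29), (20, 40, 180), (25, 45, 175)]
labels, centers = group_similar_colors(colors)
print(labels)
print(centers.round(1))
```

Euclidean distance in RGB does not match perceived color difference very well, so converting to a space such as CIELAB before grouping may give groups that look more natural. The result also depends on the order of the input, which is the price of the greedy approach.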

smoky robin
#

How do I deal with this data distribution

toxic mortar
#

Maybe even apply log transformation

#

or segment data to two groups, bell curve and spike
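A rough sketch of the log-transform suggestion. The plot from the question is not available here, so the right-skewed sample below is a made-up stand-in, and whether a log transform helps depends on what the real distribution looks like.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical right-skewed data standing in for the plotted feature.
x = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

def skewness(a):
    """Sample skewness: the third standardized moment."""
    a = np.asarray(a, dtype=float)
    return np.mean((a - a.mean()) ** 3) / a.std() ** 3

x_log = np.log1p(x)  # log(1 + x), which is safe when the data contains zeros
skew_before = skewness(x)
skew_after = skewness(x_log)
print(skew_before, skew_after)
```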

odd meteor
#

Merry Christmas everyone 🎉🎉🎉😄. Today is a good day to allow your pc to rest. Go enjoy your time with your family and loved ones.

toxic mortar
smoky robin
jaunty helm
smoky robin
jaunty helm
smoky robin
#

Hmm

#

Ok will try this thanks

deep veldt
#

does having more fc layers increase the accuracy? also does the input_features param have a specific rule? i always see people set it to 4096 or 2048

fast thorn
#

guys this 2025 im starting with ai ml...........how to get started??

#

can you recommend me some yt channels ,, videos or books maybe?

fast thorn
spice lava
#

I am learning Data Science.. Anyone wants to collaborate ?

pearl barn
#

In the new anaconda 10.24 I can't change between command and edit mode?? It's always blue when I run Jupyter notebook??

past meteor
serene scaffold
pearl barn
#

I'm taking a course for data analysis and still learning the fundamentals. The instructor set up anaconda and switches between the two modes, while I'm stuck in Command mode and don't know how to change to edit mode. He uses an old version of anaconda.

serene scaffold
#

@azure osprey I removed your message for containing self promotion.

young granite
#

if one knows a good approach to filter large amounts of text to features other than nltk and rapidfuzz would be interested in ur opinion

serene scaffold
#

Please be as specific as you possibly can.

young granite
# serene scaffold What kind of text and what kind of features, for what task?

i got a description of and want to find (over many descriptions) shared wording, therefore i splitted the original description into features using re, however this can lead to bad fragments etc., therefore i wanted to know if theres a better solution for that task.
Currently its working and im able to filter frequently used words by this approach but i have to do a manual selection of those words and also do manual cropping, once thats done i use a list of those frequently used words to check the original descriptions for the occurrence.

serene scaffold
#

Also, you wrote "I got a description of", but there's nothing after the of.

young granite
serene scaffold
young granite
serene scaffold
young granite
#

do u know if works for text like: "thisIs a string whichCan have linked Words and also integers12g..."

serene scaffold
#

And medial digits?

young granite
#

i would say also, however digits can hold units following them, i tried to do that with re but thats not suitable i think

serene scaffold
#

But not for letters coming after digits.

#

There wouldn't be an easy way to not split medial capitals that are supposed to be there. Like for McGonigle.
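The splitting rules discussed above can be sketched with the standard `re` module. This is only one reading of the rules (split at a lowercase-to-uppercase boundary and before digits, but never between a digit and a following letter, so units stay attached); the function name is made up.

```python
import re

# Split at lower->UPPER boundaries and at letter->digit boundaries,
# but not at digit->letter, so a unit stays attached to its number.
BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])|(?<=[A-Za-z])(?=\d)")

def split_linked_words(text):
    return BOUNDARY.sub(" ", text).split()

tokens = split_linked_words("thisIs a string whichCan have linked Words and also integers12g")
print(tokens)

# Known limitation: names with medial capitals are split too.
name_tokens = split_linked_words("McGonigle")
print(name_tokens)
```

No dictionary is needed for this, but as noted, there is no cheap way for a pattern like this to know that "McGonigle" should stay in one piece.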

young granite
#

i was thinking to maybe give it a kind of limit but am unsure

#

i want to avoid a large dict which needs to be loaded and updated all time

young granite
worldly wagon
teal flame
#

Hello is there any way to bypass the shape of an output being equal to the yTrain data shape in a CNN model when training......I'm trying to follow a CNN architecture stated in a research paper, but the input shape is equal to the output shape which is not equal to the shape of my yTrain data thus causing a mismatch....Idk if I reshape it, that won't let the model learn well.

i.e. input_shape = output_shape = [?, 1280, 1]
y_Train shape = [, 5,1]

The research paper I'm talking about is this

Leong, Zi Xian & Zhu, Tieyuan. (2021). Direct Velocity Inversion of Ground Penetrating Radar Data Using GPRNet. Journal of Geophysical Research: Solid Earth. 126. 10.1029/2020JB021047. 
#

Or if there's any suggestion on how I can do it without need to bypass , thanks

rich moth
#

Check out these new wild visuals. the training and val graphs are pretty smooth, it seems its learning without overfitting. Finally got the complexity metrics working right too, but im still running optuna trials to tune parameters, playing with base/hidden/max dims. Tweaking those gives wild different outputs.

#

The complexity metric is cool, it's like the pulse of the market

whole valley
#

guys how does linear algebra play in data science

serene scaffold
rich moth
rich moth
young granite
wicked pine
#

guys...

#

help in python-help

pearl barn
#

Does the new anaconda update 10.24 not offer switching between edit and command mode? It always shows a blue bar with no In/Out or pencil. Can anyone confirm this, or is it just a bug in my app?

weak oxide
#

I must know

#

Definitely installing immediately

#

Once I know

wild solar
wild solar
#
import matplotlib.pyplot as plt
import numpy as np

plt.style.use('dark_background')

fig, ax = plt.subplots()

L = 6
x = np.linspace(0, L)
ncolors = len(plt.rcParams['axes.prop_cycle'])
shift = np.linspace(0, L, ncolors, endpoint=False)
for s in shift:
    ax.plot(x, np.sin(x + s), 'o-')
ax.set_xlabel('x-axis')
ax.set_ylabel('y-axis')
ax.set_title("'dark_background' style sheet")

plt.show()
weak oxide
#

I never used dark background

#

I always used five thirty eight

#

Or seaborn styles

#

I'll look at it

#

I assumed it was interactive as well

#

I just got Plotly on my side too

wild solar
weak oxide
#

Altair is pretty good as well

wild solar
#

never heard of it, will check it out

weak oxide
#

You must

#

I have a thing for visualizations

#

I know machine learning is important and all

#

But the graphs are just so pleasing

wild solar
#

just checked it out rn

#

it looks so clean

weak oxide
#

Yep

#

But I need to learn how to use dark backgrounds

wild solar
#

i really like the altair.param method

wild solar
weak oxide
#

For Plotly you just use the plotly_dark template

weak oxide
#

Something about them seems so alluring

wild solar
weak oxide
#

Bloomberg used dark visualizations

#

Which is probably why

#

And economics is my profession

wild solar
weak oxide
#

Yeah so my point on this channel will probably be for yfinance and other related packages

wild solar
#

I hope you find the help and the assist you need

wicked pine
#

Quick question, does randomforest get affected by outliers?

Should i remove outliers?

weak oxide
#

Only wished that yfinance wasn't discontinued by Yahoo itself

wild solar
weak oxide
#

It's a bit buggy

#

It still works

weak oxide
wicked pine
#

just want confirmation, i cant trust gpt anymore

weak oxide
#

Bootstrap aggregation is what the Internet calls it

wicked pine
#

So i found a dataset on kaggle and wanted to make my own notebook.
out of 100,000 records i ended up with 10,000.

My original idea was make a neural network but the number of records is low isnt it?

idk

weak oxide
#

Can you describe the dataset? Or give a df.head

wicked pine
#

wait

#

9 features

weak oxide
#

Hmmm interesting

wicked pine
weak oxide
#

My initial thought was to do a logistic regression with #diabetes part

wicked pine
#

but its classification problem, either 0 or 1

wicked pine
#

am second year data science student

wicked pine
weak oxide
wicked pine
weak oxide
wicked pine
#

i will try both then. Extra 2 hours no worries 🚯

abstract wasp
#

Hi, what are the best tools to use for ml model deployment? Docker, Amazon Sagemaker, etc? Idk much about this so if you guys have some online course suggestions that I can take to learn this it would be great, ty!

spring field
abstract wasp
spring field
#

Ah, cuz there aren't any for that I don't think. I may have misread what resources you were looking for specifically.

rich moth
calm thicket
abstract wasp
odd meteor
# abstract wasp Hi, what are the best tools to use for ml model deployment? Docker, Amazon Sagem...

MLOps is quite broad, so there's really no acclaimed "best" tool out there, because the notion of what's best can be very subjective.

You might find this roadmap interesting. https://marvelousmlops.substack.com/p/mlops-roadmap-2024?utm_source=substack&utm_medium=web&utm_content=embedded-post&triedRedirect=true

The MLOps engineer role is different from an ML engineer role.

serene scaffold
dry raft
#

hey guys i need help with my pytorch train function

#
# Train function
def train_combined(model, train_loader, criterion, optimizer, epochs=30):
  model.train()
  for epoch in range(epochs):
    running_loss=0.0
    correct=0
    total=0
    for image, label in train_loader:
      image, label= image.to(device), label.to(device)
      optimizer.zero_grad()

      fwd_output=model(image)
      loss=criterion(fwd_output, label)

      loss.backward()
      optimizer.step()
      running_loss += loss.item()  # accumulate so the printed epoch loss is not always 0

      # Calculate accuracy
      _, predicted = torch.max(fwd_output.data, 1)
      total += label.size(0)
      correct += (predicted == label).sum().item()

    train_acc = 100 * correct / total
    print(f"Epoch [{epoch+1}/{epochs}], Loss: {running_loss / len(train_loader):.4f}, Accuracy: {train_acc:.2f}%")
#

I keep using cross-entropy loss and it keeps yapping about: RuntimeError: 0D or 1D target tensor expected, multi-target not supported
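One common way to hit this message, which may or may not be what happened here since the label shape isn't shown, is a batch of class indices with an extra trailing dimension, i.e. shape `(N, 1)` instead of `(N,)`. The sizes below are made up.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(4, 9)                    # batch of 4, 9 classes (made-up sizes)
labels = torch.tensor([[0], [3], [8], [1]])   # shape (4, 1): class indices with an extra dim

try:
    criterion(logits, labels)
    raised = False
except RuntimeError as err:
    raised = True
    print(err)

# Dropping the trailing dimension gives the expected 1D tensor of class indices.
loss = criterion(logits, labels.squeeze(1))
print(labels.shape, labels.squeeze(1).shape, loss.item())
```

Printing `label.shape` and `label.dtype` for one batch from the loader is a quick way to check whether this applies.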

#

I dont really understand how though

#

btw ping me, i may get off in a bit 💀(hopefully stelercus could impart me some great wisdom)

unkempt wigeon
#

What page should I read up to on mastering pytorch?

serene scaffold
warm copper
#

what the hell

#

I went to the codes and they were in lisp

warm copper
#

😒

serene scaffold
#

@unkempt wigeon did you not end up buying it?

unkempt wigeon
#

I did. I'm asking what page of the book I should read up to?

serene scaffold
#

so "mastering pytorch" is the name of the book? is it an O'Reilly book?

iron basalt
#

LISP was the AI language.

warm copper
#

This class looks like a blast

serene scaffold
#

which one is it @unkempt wigeon

warm copper
#

a real blast that will blast my head

#

Im getting flashback from my data structures and algorithms course

#

🥲

serene scaffold
#

@unkempt wigeon the "Who this book is for" section of the book says "Working knowledge of deep learning with Python programming is required", so you should not have purchased the book.

warm copper
#

Omg I just looked at this @iron basalt

#

RL is all about algorithms?

iron basalt
#

And also a branch of ML.

#

It's for AI, not just ML.

warm copper
#

I mean I thought these algos would be in packages

#

Cliff Walking xD

iron basalt
#

The algorithms are abstract and can be implemented in many ways.

warm copper
#

good thing is that this class doesn't have exams

#

just a project and homework

#

I would be screwed I think

iron basalt
#

The book will give them to you like this:

#

I implemented them in C when I first went through it.

warm copper
#

oh okay 😄

left tartan
unkempt wigeon
serene scaffold
unkempt wigeon
wicked pine
spring field
# wicked pine No, the data was imbalanced so i had to drop most of the records to make the maj...

If the data is representative, you shouldn't try to solve the imbalance
https://stats.stackexchange.com/questions/283170/when-is-unbalanced-data-really-a-problem-in-machine-learning

wicked pine
spring field
#

That the sample, when scaled up to the size of the population, (roughly) matches the population

wicked pine
#

The author stated that it depends on the learning algorithm or model you want to use

#

like he mentioned, random forest is not affected by imbalance

#

but am sure neural network does get affected

spring field
#

I mean, another way to emphasize a minority would be to use a weighted cost function so that the loss is more affected by the minority than the majority and, I'm sure that would have a slightly different effect than solving the imbalance by throwing data out
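A minimal NumPy sketch of the weighted-loss idea, assuming inverse-frequency weights (one common heuristic among several) and a made-up 90/10 label split. The function names are hypothetical, and the weight formula assumes every class appears at least once.

```python
import numpy as np

def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * count_per_class)."""
    counts = np.bincount(y)
    return len(y) / (len(counts) * counts)

def weighted_cross_entropy(probs, y, weights):
    """Weighted mean of -log p(true class), normalized by the total weight."""
    w = weights[y]
    nll = -np.log(probs[np.arange(len(y)), y])
    return np.sum(w * nll) / np.sum(w)

# Hypothetical labels: 90 negatives, 10 positives.
y = np.array([0] * 90 + [1] * 10)
weights = balanced_class_weights(y)
print(weights)

# A model that leans towards class 0 for every sample.
probs = np.tile([0.9, 0.1], (len(y), 1))
unweighted = weighted_cross_entropy(probs, y, np.ones(2))
weighted = weighted_cross_entropy(probs, y, weights)
print(unweighted, weighted)
```

With the weights applied, mistakes on the minority class dominate the loss, and no records have to be thrown away to get there.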

abstract wasp
shrewd kestrel
#

hey

#

what would be ideal system for machine learning in team environment

wild solar
#

Hi everyone, I’m encountering an issue with timing inconsistencies in my PyTorch GPU matrix multiplication loop. Here’s the situation:

I'm iterating 1000 times and timing each matrix multiplication operation torch.matmul on the GPU.
Some iterations show 0.0 ms, which is unrealistic, while others show high variability (e.g., 188 ms vs. <1 ms).
My total execution time for 1000 iterations is ~496 seconds, which seems inefficient.
I suspect issues with CUDA context initialization, memory allocation within the loop, improper timing measurements, or CUDA/PyTorch compatibility.

Details:
Timing: time.time(), without torch.cuda.synchronize().
Installed PyTorch for CUDA version 12.4; CUDA version on PC is 12.7.
GPU: NVIDIA RTX 3060 Ti.
PC specs: Intel i5-12400F, 16GB RAM, SSD+HDD storage.

#

this is my code:

import torch
import numpy as np
import time
import matplotlib.pyplot as plt

times = [] # this variable is for storing the time taken for each iteration

# Perform large matrix multiplication on GPU 1000 times
start_gpu_100 = time.time()
for i in range(1000):
    # Define large matrices as random tensors directly on GPU
    large_matrix1_torch = torch.rand((10*i, 10*i), device='cuda')
    large_matrix2_torch = torch.rand((10*i, 10*i), device='cuda')
    
    print("Starting Iteration: ", i)
    iter_start = time.time()
    result_gpu_100 = torch.matmul(large_matrix1_torch, large_matrix2_torch)
    iter_time = (time.time() - iter_start) * 1000  # Convert to milliseconds
    print(f"Ending Iteration: {i} in {iter_time} ms")
    
    # Collect iteration times
    times.append(iter_time)

end_gpu_100 = time.time()

plt.plot(times)
plt.xlabel('Iteration')
plt.ylabel('Time (ms)')
plt.title('Time taken for each iteration on GPU')
plt.show()

print("Time taken on GPU (1000 times):", (end_gpu_100 - start_gpu_100) * 1000, "ms")
#

final time taken:
Time taken on GPU (1000 times): 496804.4521808624 ms

#

I also made this plot, i really wonder what is the cause of these spikes, and why isn't it a linear relationship with the matrix size?
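A likely contributor, given that the timings were taken with `time.time()` and no `torch.cuda.synchronize()`: CUDA kernels are launched asynchronously, so the clock can stop before the multiplication has actually run, and the wait shows up in some later iteration instead. Two other things are worth keeping in mind: multiplying two n x n matrices takes on the order of n^3 multiply-adds, so the curve would not be linear in n anyway, and `range(1000)` starts at `i = 0`, which gives empty matrices. The sketch below is one way to time it under those assumptions; the sizes and repeat count are arbitrary, and it falls back to CPU so it still runs without a GPU.

```python
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"  # CPU fallback so the sketch still runs

def sync():
    # CUDA kernels run asynchronously; wait for them before reading the clock.
    if device == "cuda":
        torch.cuda.synchronize()

def time_matmul(n, repeats=3):
    """Return the best wall-clock time in ms for an n x n matmul."""
    a = torch.rand((n, n), device=device)
    b = torch.rand((n, n), device=device)
    torch.matmul(a, b)  # warm-up: the first call pays one-time setup costs
    sync()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        torch.matmul(a, b)
        sync()
        best = min(best, (time.perf_counter() - start) * 1000)
    return best

sizes = [64, 128, 256]  # hypothetical sizes, kept small here
times = [time_matmul(n) for n in sizes]
for n, t in zip(sizes, times):
    print(f"{n}x{n}: {t:.3f} ms")
```

`time.perf_counter()` is used because it has finer resolution than `time.time()` on some platforms, which may also explain some of the 0.0 ms readings.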

dry raft
#

Btw I am using a vision transformer for this

deep veldt
#

how do you make a model that generate text and replies like chatgpt? what is it actually called in deeplearning

limpid zenith
deep veldt
#

yes

limpid zenith
deep veldt
limpid zenith
#

there are many terms, autoregressive sequence to sequence models and transformer models are the common ones

#

but it depends on the kind of LLM being used

inland crown
serene scaffold
dry raft
# serene scaffold Okay, but what about the error message? Are you going to show it?
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-64-79153cd7a2a3> in <cell line: 1>()
----> 1 train_combined(new_vit_pathmnist, val_pathmnist_dataloader, loss_fn, optimizer_fn)

4 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   3477     if size_average is not None or reduce is not None:
   3478         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3479     return torch._C._nn.cross_entropy_loss(
   3480         input,
   3481         target,

RuntimeError: 0D or 1D target tensor expected, multi-target not supported
#

I can send you the notebook if you want

serene scaffold
#

No; I'll look at this when I'm at my computer

dry raft
deep veldt
dry raft
#

Pytorch is a bit weird not going to lie

#

But it's worth it in the end

inland crown
# deep veldt yeah im looking into it can you link it again

FREE GIVEAWAY OF JENSEN-HUANG-SIGNED ORIN NANO SUPER! See Below!
Join Dave as he explores NVIDIA's Jetson Orin Nano Super, a compact AI powerhouse with 1024 CUDA cores and 6 ARM cores for just $249. Learn why this could be the best AI board for your projects in robotics, IoT, or AI development. Free Sample of my Book on the Spectrum: https://a...

desert oar
deep veldt
desert oar
#

the -> points to the line where the error occurred. the ----> above it points to the function that the error occurred inside of. that's usually all you need to understand the error message at the bottom.

upbeat prism
#

PyTorch has a nice doc page for functions like LayerNorm() with the math and everything. Do they somewhere also have docs about the derivative/backward functions of it?

wild solar
# deep veldt is that device needed for llms?

LLMs require a huge amount of computation power. The Jetson Orin Nano Super can run small language models (SLMs). Just to be clear, i don't have a jetson orin nano super on me, but from its spec you can tell that it can probably run something like a 2-7B parameter language model.

dry raft
#

i will check sir

#

alright sir

#

this is the label

#

these are random predictions:

#

i very confused sir

#

this is what it is for mnist

upbeat prism
#

https://pytorch.org/docs/stable/generated/torch.amax.html states that

amax/amin evenly distributes gradient between equal values, while max(dim)/min(dim) propagates gradient only to a single index in the source tensor.

What exactly do they mean by that? Do they mean that if we have an array [1,2,3,4,4,3,2,1] then we have two max values (two 4s), which might have different gradients, but amax then just takes the mean of them and reports that? What does max do in that case? The word propagate doesn't really mean much.
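A quick experiment with the array from the question. "Propagate" here refers to the backward pass: which input positions receive the gradient of the output. Which of the tied positions `max(dim)` picks is treated below as an implementation detail, so the comments avoid naming one.

```python
import torch

values = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]

x = torch.tensor(values, requires_grad=True)
x.amax(dim=0).backward()
grad_amax = x.grad.clone()
print(grad_amax)  # the two tied 4s share the gradient equally

y = torch.tensor(values, requires_grad=True)
y.max(dim=0).values.backward()
grad_max = y.grad.clone()
print(grad_max)   # a single position receives the whole gradient
```

So the averaging is not over incoming gradients from the inputs; it is the one outgoing gradient being split across the tied maxima.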

dry raft
#

Oh crap

#

Never mind sir I just realized that my data set was multi-class not single class

#

💀😭 stelercus can rest now

inland crown
# deep veldt is that device needed for llms?

It isn't needed specifically. It's just a little computer for cheap that is almost designed for the job. @ $250, if it's a serious project and you need more power, you could stack these as I understand (I also don't have one)

serene scaffold
dry raft
deep veldt
serene scaffold
#

and even then, sometimes you have to quantize (which means to store each parameter at a lower precision, at the cost of performance)
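To make "store each parameter at a lower precision" concrete, here is a minimal sketch of symmetric int8 quantization in NumPy. Real LLM quantization schemes are more elaborate (per-channel or per-block scales, 4-bit formats, and so on); the random weights are a stand-in for model parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=1000).astype(np.float32)  # stand-in for model parameters

# Symmetric int8 quantization: map [-max|w|, +max|w|] onto [-127, 127].
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)   # stored at 1 byte per parameter
dequantized = q.astype(np.float32) * scale      # values the model would compute with

max_error = np.abs(weights - dequantized).max()
print(weights.nbytes, q.nbytes, max_error)
```

Memory drops by 4x compared with float32, and each weight moves by at most about half a quantization step, which is where the loss in output quality comes from.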

candid raven
unkempt wigeon
#

For a convolutional neural network, in my case is it just taking a photo and converting it into an array?

serene scaffold
unkempt wigeon
#

3d is for color

serene scaffold
unkempt wigeon
#

I'm asking do I convert it in each coordinate into an array

serene scaffold
#

if you load the image with PIL, it will just automatically be an array.
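Strictly speaking, PIL hands back an `Image` object rather than an array, but NumPy converts it in one call. A small sketch, using an in-memory image as a stand-in for `Image.open(...)` on a real photo:

```python
import numpy as np
from PIL import Image

# Stand-in for Image.open("photo.jpg"): a 4x2 red RGB image created in memory.
img = Image.new("RGB", (4, 2), color=(255, 0, 0))  # size is (width, height)

arr = np.asarray(img)
print(type(img).__name__, arr.shape, arr.dtype)
```

The array comes out as (height, width, channels), with the third axis holding the color channels mentioned above.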

unkempt wigeon
#

Is that why we need for the conversation

desert oar
#

usually the answer to "how do i fix the error" either lies in the error message itself, or the immediately surrounding code, or both

unkempt apex
eager hamlet
#

hi, if im starting out and want to do some of my own deep learning projects do you guys recommend I go thoroughly through the theory first, like a textbook or a course then start development or go head first into trying to build a project?

desert oar
#

if so, you might want to check out "dive into deep learning" or "fast ai" courses, both free online

sage valley
#

hey guys can you please provide me a beginner roadmap for machine learning, as i am a beginner and new to programming, though i have some basic knowledge of programming

eager hamlet
eager hamlet
keen dew
keen dew
#

Np

eager hamlet
keen dew
#

search on yt for the roadmaps

sage valley
eager hamlet
#

I'm replying to someone else

#

But its a well known course for machine learning

acoustic seal
#

heyo

im working on an ml model and need a bit of guidance,

serene scaffold
acoustic seal
#

i need a bit of help with the venv as it says "ipykernel" is missing and when vsc installs it, still doesn't work.

agile cobalt
#

double check if you have the correct environment selected
maybe try reloading/restarting vscode

acoustic seal
#

and i think it's the ve problem, but the cuda and other things still dont work and when i have per say py 3.7 and the terminal still says it's on 3.12 or something

eager hamlet
lapis sequoia
#

I have questions on this following code

#
df['Sentiment'] = df['Sentiment'].map({"positive":2,"negative":0,"neutral":1})

MODEL_NAME = 'bert-base-uncased'
tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)



class Finance_Dataset(Dataset):
    def __init__(self,Sentence,targets,tokenizer,max_len):
        self.Sentence = Sentence
        self.targets = targets
        self.tokenizer = tokenizer
        self.max_len = max_len
        
    def __len__(self):
        return len(self.Sentence)
    
    def __getitem__(self,idx):
        Sentence = str(self.Sentence[idx])
        Sentence = " ".join(Sentence.split())
        target = self.targets[idx]
        
        
        encoding = self.tokenizer.encode_plus(
            Sentence,
            max_length=self.max_len,
            padding="max_length",
            return_attention_mask=True,
            return_token_type_ids=True,
            add_special_tokens=True,
            truncation=True,
            return_tensors='pt',
            )
        attention_mask = encoding['attention_mask'] 
        input_ids = encoding['input_ids']
        token_type_ids = encoding['token_type_ids']
        
        return {
            "Sentence":Sentence,
            "attention_mask":torch.tensor(attention_mask,dtype=torch.long),
            "input_ids":torch.tensor(input_ids,dtype=torch.long),
            "targets":torch.tensor(target,dtype=torch.long),
            "token_type_ids":torch.tensor(token_type_ids,dtype=torch.float)
            }


from sklearn.model_selection import train_test_split
df_train,df_val = train_test_split(df,test_size=.20,random_state=42)

BATCH_SIZE_TRAIN = 8
VAL_BATCH_SIZE = 4
MAX_LEN = 200
num_epochs = 1

#
def get_dataloader(df,tokenizer,batch_size,max_len):
    ds = Finance_Dataset(
        Sentence = df['Sentence'].to_numpy(),
        targets = df['Sentiment'].to_numpy(),
        tokenizer=tokenizer,
        max_len=max_len
    )
    return torch.utils.data.DataLoader(
        ds,
        batch_size=batch_size,
        num_workers=0
        )


train_dataloader = get_dataloader(df_train, tokenizer=tokenizer, batch_size=BATCH_SIZE_TRAIN, max_len=MAX_LEN)
val_dataloader = get_dataloader(df_val, tokenizer=tokenizer, batch_size=VAL_BATCH_SIZE, max_len=MAX_LEN)

training_batch = next(iter(train_dataloader))
training_batch.keys()

attention_mask = training_batch['attention_mask']
input_ids = training_batch['input_ids']
targets = training_batch['targets']
token_type_ids = training_batch['token_type_ids']


 



bert_model = BertModel.from_pretrained(MODEL_NAME)


class BERTClass(nn.Module):
    def __init__(self):
        super(BERTClass, self).__init__()
        self.l1 = BertModel.from_pretrained(MODEL_NAME)
        self.l2 = torch.nn.Dropout(0.1)
        self.l3 = torch.nn.Linear(768, 3)
    
    def forward(self, input_ids, attention_mask, token_type_ids):
        _, output_1= self.l1(token_type_ids=token_type_ids, attention_mask=attention_mask, return_dict=False)
        output_2 = self.l2(output_1)
        output = self.l3(output_2)
        return output

model = BERTClass()
model.to(device)

loss_fn = torch.nn.BCEWithLogitsLoss()

optimizer = torch.optim.Adam(model.parameters(),lr=1e-5)```
#
def training_epoch(epochs):
    model.train()
    for d in train_dataloader:
        input_ids = d['input_ids'].to(device,dtype=torch.long)
        attention_mask = d['attention_mask'].to(device,dtype=torch.long)
        targets = d['targets'].to(device,dtype=torch.long)
        token_type_ids = d['token_type_ids'].to(device,dtype=torch.long)
        
        outputs = model(
            attention_mask=attention_mask,
            input_ids=input_ids,token_type_ids=token_type_ids)
        optimizer.zero_grad()
        loss = loss_fn(outputs,targets.float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()```py
serene scaffold
#

@lapis sequoia please put a py after the three backticks so that there's color.

#

```py
code goes here
```

#

just keep that in mind for the future. what is your question?

#

@lapis sequoia please stop trying to fix it. please just ask your question.

lapis sequoia
#

ok, I am trying to run it in a training loop, I keep getting the same error

serene scaffold
lapis sequoia
#

raise ValueError("You have to specify either input_ids or inputs_embeds")

ValueError: You have to specify either input_ids or inputs_embeds

serene scaffold
#

don't worry about how long it is. just show the whole thing.

lapis sequoia
#

print(training_epoch(epoch))

Cell In[200], line 153 in training_epoch
outputs = model(

File ~\anaconda3\envs\pytorch_env\Lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl
return self._call_impl(*args, **kwargs)

File ~\anaconda3\envs\pytorch_env\Lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl
return forward_call(*args, **kwargs)

Cell In[200], line 131 in forward
_, output_1= self.l1(token_type_ids=token_type_ids, attention_mask=attention_mask, return_dict=False)

File ~\anaconda3\envs\pytorch_env\Lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl
return self._call_impl(*args, **kwargs)

File ~\anaconda3\envs\pytorch_env\Lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl
return forward_call(*args, **kwargs)

File ~\anaconda3\envs\pytorch_env\Lib\site-packages\transformers\models\bert\modeling_bert.py:1062 in forward
raise ValueError("You have to specify either input_ids or inputs_embeds")

ValueError: You have to specify either input_ids or inputs_embeds

serene scaffold
#

```py
code goes here on a new line
```

lapis sequoia
#
Cell In[200], line 153 in training_epoch
    outputs = model(```
#
```py
def training_epoch(epochs):
    model.train()
    for d in train_dataloader:
        input_ids = d['input_ids'].to(device,dtype=torch.long)
        attention_mask = d['attention_mask'].to(device,dtype=torch.long)
        targets = d['targets'].to(device,dtype=torch.long)
        token_type_ids = d['token_type_ids'].to(device,dtype=torch.long)

        outputs = model(
            attention_mask=attention_mask,
            input_ids=input_ids,token_type_ids=token_type_ids)
        optimizer.zero_grad()
```
serene scaffold
# lapis sequoia ```py Cell In[200], line 153 in training_epoch outputs = model(```

You have this
```py
outputs = model(
    attention_mask=attention_mask,
    input_ids=input_ids,
    token_type_ids=token_type_ids
)
```
Which goes to `BERTClass.forward`
```py
    def forward(self, input_ids, attention_mask, token_type_ids):
        _, output_1= self.l1(token_type_ids=token_type_ids, attention_mask=attention_mask, return_dict=False)
        output_2 = self.l2(output_1)
        output = self.l3(output_2)
        return output
```
Look at what you pass to `self.l1`, which is the BERT model that you're wrapping.
lapis sequoia
#

you define it outside first?

serene scaffold
#

before we continue, do you understand that the py must go on the same line as the three backticks?

#

```py
code
```

lapis sequoia
#

yes, just exhausted

serene scaffold
#

@lapis sequoia self.l1 is the BERT model itself. l2 and l3 are just two additional layers to turn it into a classifier. right?

lapis sequoia
#

yes

serene scaffold
#

the BERT model requires you to specify either input_ids or inputs_embeds.

#

`self.l1(token_type_ids=token_type_ids, attention_mask=attention_mask, return_dict=False)`

lapis sequoia
#

before the nn.Module?

serene scaffold
#

do you see?

lapis sequoia
#

I do, is that defined in the class or outside of it first? Or does it all go through with the training epoch?

serene scaffold
#

you need to pass them through to the BERT model.

#

self.l1( ) is where you pass things through the BERT model.

lapis sequoia
#

is defining these variables first, messing it up?

serene scaffold
#

No. defining variables does not, in itself, mess anything up.

lapis sequoia
#

in the class, for the dataset, does anything not have to be there?

serene scaffold
#

The BERT model is telling you "You have to specify either input_ids or inputs_embeds"

#

`self.l1(token_type_ids=token_type_ids, attention_mask=attention_mask, return_dict=False)`

#

you need to specify either input_ids or inputs_embeds

#

you currently do neither

#

do you see the solution?

lapis sequoia
#

yes

#

they were not in there

serene scaffold
#

show how this line should be modified to solve the problem.

`_, output_1= self.l1(token_type_ids=token_type_ids, attention_mask=attention_mask, return_dict=False)`
lapis sequoia
#

do you only need those two?

serene scaffold
#

do you know what the solution is?

lapis sequoia
#

you still need the input ids, or do you not?

serene scaffold
lapis sequoia
#

not both

serene scaffold
#

you currently do neither.

#

can you imagine what code you would need to insert that would "specify input_ids"?

#

the only parameters you have available are input_ids, attention_mask, token_type_ids, none of which are embeds.

lapis sequoia
#

targets

serene scaffold
#

targets?

#

Just replace the forward method with this.

```py
    def forward(self, input_ids, attention_mask, token_type_ids):
        _, output_1= self.l1(input_ids=input_ids, token_type_ids=token_type_ids, attention_mask=attention_mask, return_dict=False)
        output_2 = self.l2(output_1)
        output = self.l3(output_2)
        return output
```
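
To see why that one keyword argument matters, here is a rough sketch with no torch or transformers at all. `StubEncoder`, `BrokenWrapper` and `FixedWrapper` are made-up names for illustration; the stub only imitates the check that raised the error in the traceback, it is not the real BERT code.

```python
class StubEncoder:
    """Hypothetical stand-in for the wrapped BERT model (not a transformers class)."""

    def __call__(self, input_ids=None, inputs_embeds=None,
                 attention_mask=None, token_type_ids=None, return_dict=False):
        # Imitates the check that produced the ValueError in the traceback.
        if input_ids is None and inputs_embeds is None:
            raise ValueError("You have to specify either input_ids or inputs_embeds")
        sequence_output = [[float(i) for i in row] for row in input_ids]
        pooled_output = [sum(row) / len(row) for row in sequence_output]
        return sequence_output, pooled_output


class BrokenWrapper:
    def __init__(self):
        self.l1 = StubEncoder()

    def forward(self, input_ids, attention_mask, token_type_ids):
        # input_ids arrives here but is never passed on to self.l1.
        _, pooled = self.l1(token_type_ids=token_type_ids,
                            attention_mask=attention_mask, return_dict=False)
        return pooled


class FixedWrapper(BrokenWrapper):
    def forward(self, input_ids, attention_mask, token_type_ids):
        # Same call, with input_ids forwarded to the wrapped encoder.
        _, pooled = self.l1(input_ids=input_ids, token_type_ids=token_type_ids,
                            attention_mask=attention_mask, return_dict=False)
        return pooled


batch = dict(input_ids=[[101, 2023, 102]],
             attention_mask=[[1, 1, 1]],
             token_type_ids=[[0, 0, 0]])

try:
    BrokenWrapper().forward(**batch)
except ValueError as err:
    broken_error = str(err)

fixed_output = FixedWrapper().forward(**batch)
print(broken_error)
print(fixed_output)
```

The training loop was already passing `input_ids` into `forward`; the value just stopped there instead of reaching the inner model.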
lapis sequoia
#

I did too much too fast

serene scaffold
#

are you following a tutorial, or something?

lapis sequoia
#

no, when I do it normally, and chill, and just ask someone what is off instead of overdoing it and going to another source, I start doubting what I know and just make all of it a mess. Yes, it was from a Hugging Face Colab

#
```py
    def __getitem__(self,idx):
        Sentence = str(self.Sentence[idx])
        Sentence = " ".join(Sentence.split())
        target = self.targets[idx]


        encoding = self.tokenizer.encode_plus(
            Sentence,
            max_length=self.max_len,
            padding="max_length",
            return_attention_mask=True,
            return_token_type_ids=True,
            add_special_tokens=True,
            truncation=True,
            return_tensors='pt',
            )
        attention_mask = encoding['attention_mask']
        input_ids = encoding['input_ids']
        token_type_ids = encoding['token_type_ids']

        return {
            "Sentence":Sentence,
            "attention_mask":torch.tensor(attention_mask,dtype=torch.long),
            "input_ids":torch.tensor(input_ids,dtype=torch.long),
            "targets":torch.tensor(target,dtype=torch.long),
            "token_type_ids":torch.tensor(token_type_ids,dtype=torch.long)
            }


from sklearn.model_selection import train_test_split
df_train,df_val = train_test_split(df,test_size=.20,random_state=42)

BATCH_SIZE_TRAIN = 8
VAL_BATCH_SIZE = 4
MAX_LEN = 200
num_epochs = 1

def get_dataloader(df,tokenizer,batch_size,max_len):
    ds = Finance_Dataset(
        Sentence = df['Sentence'].to_numpy(),
        targets = df['Sentiment'].to_numpy(),
        tokenizer=tokenizer,
        max_len=max_len
    )
    return torch.utils.data.DataLoader(
        ds,
        batch_size=batch_size,
        num_workers=0
        )


train_dataloader = get_dataloader(df_train, tokenizer=tokenizer, batch_size=BATCH_SIZE_TRAIN, max_len=MAX_LEN)
val_dataloader = get_dataloader(df_val, tokenizer=tokenizer, batch_size=VAL_BATCH_SIZE, max_len=MAX_LEN)

training_batch = next(iter(train_dataloader))
training_batch.keys()
```
#

I got it to work, thank you

quick sundial
#

is datacamp a good resource for beginners looking to get into data science? i've been looking into courses, but confused on which one to commit to. data camp is currently $159 for the year.

serene scaffold
#

if nothing else, people tend to value what they pay for, so buying it might trick your brain into sticking to it.

past meteor
#

I did a lot of data camp years ago when I was in uni, each prof can apply for a free license for all of their students

#

It doesn't really teach you anything imo, but at the very least it's good at exposing you to many new concepts

#

And keeping you "busy" / motivated with some milestones

lapis sequoia
#

do all of you just tune bert like it is nothing?

serene scaffold
#

I've done several projects that involve fine-tuning BERT for classification tasks, with varying degrees of success

lapis sequoia
#

It's definitely not "a walk in the park" or anything, right?

serene scaffold
#

No

lapis sequoia
#

Is it one of the easier ones compared to T5 or BART? I've never fine-tuned those 2, I've only used them.

serene scaffold
#

I mean, there's a point at which you know enough about ML that you can adapt ML code without understanding how the model works "all the way down"

lapis sequoia
#

Yeah, I was just overwriting it

serene scaffold
#

I've heard of BART but idk what it is

lapis sequoia
#

Oh

#

Yeah, the model "BART"

fading wigeon
#

I'm trying to wrap my head around transfer learning. My understanding is that you use the architecture for a similar task and maybe even the layer parameters. Then you just train the output layer parameters. How does that work out?

I understand you could optionally just retrain all the parameters and just use the original task to initialize them. This wouldn't improve the model right? Only in theory reduce training time/resources?

serene scaffold
fading wigeon
#

Ah okay

#

Well, it seems like at the very least you have to retrain the output layer

#

Unless you are looking for exactly what the original model was

#

Have you ever used it and if so, how?

serene scaffold
fading wigeon
#

Ohhh

#

Yeah, I'm focused on neural networks.

#

It's just my current topic of learning, so forget that other machine learning stuff exists 😄

#

Tbh I wasn't even aware it extended beyond neural nets

serene scaffold
#

in that case, it would be more helpful to focus on fine-tuning, since that's a more specifically-defined concept.
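
For the "keep the pretrained layers, only train a new output layer" idea asked about above, here is a toy sketch in plain Python. Everything in it is made up for illustration: `FROZEN_W` stands in for weights that would come from pretraining, and the task (is x0 bigger than x1) is just a placeholder. Real fine-tuning would use a framework, but the moving parts are the same: the frozen layer is only read, the head is the only thing updated.

```python
import math
import random

random.seed(0)

# Pretend these weights came from pretraining on another task (hypothetical values).
FROZEN_W = [[0.8, -0.5], [0.3, 0.9], [-0.6, 0.4]]  # 3 hidden units, 2 inputs


def features(x):
    """Frozen 'pretrained' layer: tanh(W x). It is never updated below."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def predict(head_w, head_b, x):
    """New output layer (the 'head'): logistic regression on the frozen features."""
    z = sum(w * f for w, f in zip(head_w, features(x))) + head_b
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(head_w, head_b, data):
    total = 0.0
    for x, y in data:
        p = predict(head_w, head_b, x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)


# Toy data for the new task: label is 1 when x0 > x1.
data = []
for _ in range(40):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    data.append((x, 1 if x[0] > x[1] else 0))

head_w, head_b = [0.0, 0.0, 0.0], 0.0
frozen_before = [row[:] for row in FROZEN_W]
loss_before = mean_loss(head_w, head_b, data)

lr = 0.5
for _ in range(200):
    grad_w, grad_b = [0.0, 0.0, 0.0], 0.0
    for x, y in data:
        f = features(x)
        err = predict(head_w, head_b, x) - y  # dLoss/dz for sigmoid + cross-entropy
        grad_w = [g + err * fi for g, fi in zip(grad_w, f)]
        grad_b += err
    # Only the head's parameters are updated; FROZEN_W is left alone.
    head_w = [w - lr * g / len(data) for w, g in zip(head_w, grad_w)]
    head_b -= lr * grad_b / len(data)

loss_after = mean_loss(head_w, head_b, data)
print(loss_before, loss_after)
```

Full fine-tuning would also update `FROZEN_W`, starting from the pretrained values instead of random ones.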

serene scaffold
fading wigeon
#

Ah okay

serene scaffold
#

"using" can mean anything.

fading wigeon
#

I gotta assume regardless of context it needs some fine tuning, right?

#

Or at least experience some level of poorer performance

serene scaffold
#

no, because there are types of models where "tuning" isn't even a thing.

fading wigeon
#

Oh. What sort of models? Do they lack like... hyper parameters?

#

I'm trying to give myself a crash course on all this so I can find better jobs, haha, so it's definitely very new

serene scaffold
#

idk, everything I do involves deep neural networks. which I don't purport to fully understand.

fading wigeon
#

Haha, fair enough!

#

I learned how to make them in numpy or even do them by hand. Obviously not particularly useful or practical, but...

#

Was still fascinating all the same

serene scaffold
#

how to make what in numpy?

fading wigeon
#

Oh, a neural network

serene scaffold
#

nice

fading wigeon
#

I've been doing a lot of learning on how to evaluate different neural networks or choose between them, but I still have no idea how to like... architect a network for a given problem.

#

How do you make that decision?

serene scaffold
#

I don't, because I work in language technology, and the only thing to do that's relevant is to wrap or augment a generative language model (ChatGPT, Llama, etc)

fading wigeon
#

Ahh gotcha

#

I guess no need to reinvent the wheel for this sort of thing, if other people have spent months/weeks determining an ideal architecture

#

or have pretrained a network

#

Do you work with like... audio data? Or just like text?

serene scaffold
fading wigeon
#

Also, if you don't mind me asking, what's your job title?

serene scaffold
#

computational linguist

fading wigeon
#

Nice

serene scaffold
#

(I have formal training in linguistics, in addition to CS)

fading wigeon
#

Oh, badass

serene scaffold
fading wigeon
#

Tell me more about the market?

I understand about the degree thing, most are asking for PhDs

serene scaffold
lapis sequoia
#

Linguistics

#

Does it help that much with natural language processing?

serene scaffold
serene scaffold
lapis sequoia
#

Does it help with knowing the context of the word and stopping recurrence?

serene scaffold
#

like, repetitive use of the same word?

lapis sequoia
#

Knowing the context of a word embedding and not throwing out words that matter

serene scaffold
#

what were you doing that inspired you to ask this question?

lapis sequoia
#

I don't know, the word "pull" may be a stop word in most cases but if it pertains to a bunch of people deadlifting weight, it matters and has context.

serene scaffold
#

stopwords are usually a finite set of words that are the same in every context
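
As a rough illustration, stop-word removal is typically just a lookup against a fixed set. The `STOP_WORDS` set below is a tiny made-up sample; real lists, such as the ones shipped with NLTK or spaCy, are longer, but they are still fixed lists of function words.

```python
# A tiny illustrative stop-word set (hypothetical sample, not a real library's list).
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "is", "in", "on"}


def remove_stop_words(text):
    """Lowercase, split on whitespace, and drop tokens found in STOP_WORDS."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


tokens = remove_stop_words("The pull of the deadlift is the hardest part")
print(tokens)
```

Content words like "pull" are not on the list, so they survive the filter no matter what the text is about.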

fading wigeon
lapis sequoia
#

Just asking how it helps, I know it does and I know people literally go to university for linguistics to get better at NLP.

serene scaffold
fading wigeon
#

Ooof. Alright

#

Any idea if it's region based?

#

Or global?

lapis sequoia
#

Are you talking about uni for ML?

serene scaffold
serene scaffold
lapis sequoia
#

Yes

serene scaffold
#

what would it mean for NLP to be > CV?

#

"greater" in what sense?

lapis sequoia
#

You prefer it

#

To computer vision

serene scaffold
#

NLP is the one that I do.

lapis sequoia
#

Same

#

Because I am not an engineer or something

serene scaffold
lapis sequoia
#

Why

serene scaffold
#

being an engineer has no bearing on whether one would prefer NLP or CV.

lapis sequoia
#

True. I only based that on most people I know who are engineers use computer vision much more than others in DL/ML stuff.

quick sundial
rich moth
#

I genuinely think I'm breaking new ground here, guys. Check out that V shape in the error vs. actuals; that's the uncertainty calibration at work. On smaller market moves it nails the predictions (tight at the bottom!), bigger moves show linearly increasing error, and the red line hugging the V means the model knows how uncertain it should be. The complexity shows the adaptive dimensionality is working: see how it's adjusting its computational capacity based on market conditions. I'm still doing intense parameter discovery right now, but I've implemented a dynamic range for the model's dimensions, i.e. 256 for the base dim and 512 for the max dim. So far it seems that the parameter controlling the maximum expansion limit, relative to the complexity scores, plays a huge role in it all.

#

Interestingly enough, because of the adaptive dim size instead of a big fixed one, I can increase the seq length substantially without hitting performance hiccups in training

inland crown
#

Is this from 1 script or modular scripts? Are you threading? How long has this been training?

past meteor
past meteor
weary timber
#

!code

arctic wedgeBOT
#
Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

weary timber
#

https://paste.pythondiscord.com/T36A
this doesn't train properly. I looked at everything I could and it doesn't have anything different from working ones (I tried with another dataset and it still doesn't work)
by "doesn't train properly" I mean the loss isn't going down

odd meteor
past meteor
#

It could be anything, your learning rate, batch size, amount of neurons, ...

weary timber
#

i tried to fix this for 2 days with no result, so idk, i figured i might as well stop wasting time

past meteor
#

What I'd suggest you do is to take it step by step

#

Comment out the data augmentation for now

#

Make your network larger. If you can't at least get the training loss to 0 with a large net that means you have a bug
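
Here's that sanity check in miniature, as a plain-Python sketch and not your actual network: a tiny logistic model on four easy points should push the training loss close to zero. The "buggy" variant is a made-up example of a pipeline bug (the inputs get zeroed before reaching the model), and there the loss never moves.

```python
import math


def train(xs, ys, steps=500, lr=1.0):
    """Fit p = sigmoid(w*x + b) by full-batch gradient descent; return final mean loss."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += (p - y)
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    loss = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss / len(xs)


xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

healthy_loss = train(xs, ys)            # easy, separable data: loss heads toward 0
buggy_loss = train([0.0] * 4, ys)       # inputs accidentally zeroed: loss is stuck
print(healthy_loss, buggy_loss)
```

If your real setup behaves like the second case even on a handful of samples, the problem is more likely in the data path or the loss than in the hyperparameters.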

weary timber
#

okey

#

and i think the problem is with the dataset

#

or me augmenting it

#

since i copied a conv net i made and tried that with the set and its not working still

past meteor
#

Yeah, so isolate the parts you don't trust one by one 😄

#

Augmentation is a solution and not something you do when you don't have to

#

It's a form of regularization

#

Basically, when you make your large net that can get 0 training loss (but likely has a decently bad validation loss) you start adding regularization such as augmentation

#

Also, try 3e-4 for your learning rate for Adam

weary timber
#

0.00003?

past meteor
#

it's a value. A default that is often used for Adam, a nice starting point
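
For reference, `3e-4` is scientific notation for 0.0003, so it has one zero fewer than 0.00003 (which would be `3e-5`). A quick check in plain Python:

```python
lr = 3e-4                      # scientific notation for 3 * 10**-4
same_value = (lr == 0.0003)    # both literals denote the same float
ten_times_smaller = 3e-5       # this is what 0.00003 would be
print(f"{lr:.4f}", f"{ten_times_smaller:.5f}")
```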

past meteor
weary timber
#

and you were right about augmentation

#

i will only use when i need it

past meteor
upbeat prism
#

hi, I run a BERT as a classifier on a very very simple task. I have a script that trains it until I get validation accuracy of 1.0. Then I have a second script with which I want to run some experiment. I load the pretrained BERT classifier model and check the accuracy again, just to be sure. It is 1.0. Now I put the model into training mode and take 1 sample and do a forward pass: the prediction is wrong.

I'm a bit confused. Sure, during training the training acc is "only" 98% 99% or whatever but still. Now I know that setting the model to training mode will enable the dropout layers and what not but I still expect a correct prediction - otherwise, how should I be able to train.

Any explanation for this?

My guess would be: I don't store the optimizer state, so I end up with a different model state (if I'm in train mode) but that's too deep for me to really be able to judge it.

#

In fact, my test accuracy in train mode is only 63%

#

actually, optimizer state doesn't matter in my case since all I wanna do is do a prediction in train mode. I dont wanna continue training. hm.

#

hmm I think I misunderstand something

#

so I set dropout to 0, retrained my model and now it seems to work. dropout was 0.2. But I still don't fully understand. It makes sense to me that the randomness of dropout can fuck with your results and that's kinda the point, but what I struggle with understanding is: we do training with dropout enabled and we get, let's say, training accuracy of 99%. Then we store the model and load the model, shouldn't we have the same accuracy?

#

hmm actually with dropout at 0.2 the training accuracy converges to 69%. I think I have to go back and study some theory! ^^

jaunty helm
#

> how should I be able to train.
you're still training, the model will just never converge to a still point
which is actually a good thing cause that's probably just overfitting on training data

#

this is torch's page on dropout which has a link to a paper for ig more theoretical stuff
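
Roughly, this is what a dropout layer does in each mode. It's a plain-Python sketch of the idea (zero with probability p, scale the survivors by 1/(1-p) during training, pass everything through unchanged in eval), not torch's actual implementation; the `dropout` helper is made up for illustration.

```python
import random


def dropout(values, p, training, rng):
    """Inverted dropout: in training, zero each value with probability p and
    scale the survivors by 1/(1-p); in eval, return the input unchanged."""
    if not training or p == 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10

eval_out = dropout(activations, p=0.2, training=False, rng=rng)
train_out_a = dropout(activations, p=0.2, training=True, rng=rng)
train_out_b = dropout(activations, p=0.2, training=True, rng=rng)
print(eval_out)
print(train_out_a)
print(train_out_b)
```

So the same saved weights give a deterministic output in eval mode and a noisy one in train mode, which is why accuracy measured in train mode can come out lower than the eval number.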

lapis sequoia
#

ok, NER with a Halo 2 dataset. Should I just use spaCy or transformers?

serene scaffold
lapis sequoia
#

characters from h2 and their lines

serene scaffold
lapis sequoia
serene scaffold
# lapis sequoia Halo 2

okay. please try to be specific when you talk about non-programming things on this server. we have 400k people from every part of the world.

can you give a link for the halo 2 dataset, so that I know how it's structured?

serene scaffold
lapis sequoia
#

Kaggle

serene scaffold
lapis sequoia
#

Ok

lapis sequoia
dry raft
#

hey guys

#

i am trying to do binary classification on images, and I am very confused on how you are supposed to build the model and also what loss function to use

#

if anyone can help, it would be good

serene scaffold
dry raft
#

well, I am using a vision transformer that has an MLP head with 2 classes

#

and I am using a pneumonia dataset that is binary classification

#

the labels in the dataset usually are like this: [0]

#

and my output from the vit is like this: [0.3, 0.4]

serene scaffold
#

so the images are xrays of lungs, and the classes are "HAS PNEUMONIA" and "DOES NOT HAVE PNEUMONIA"?

dry raft
#

yup

serene scaffold
#

okay, that's what I was asking.

serene scaffold
dry raft
serene scaffold
#

I still don't know what that is.

dry raft
#

the first result for searching it up should be its github

#

all you need to know is that it is a vision transformer with not a lot of parameters

#

and there is an MLP head that I am modifying for binary classification

serene scaffold
#

if you don't have much experience with image classification, I would start with a convolutional neural network.

dry raft
#

I have done a cnn for multi-class problems like MNIST

#

but for binary classification, I am confused in general

#

it's more so about the MLP

#

wait i found something on kaggle

odd meteor
# serene scaffold I still don't know what that is.

ViT (Vision Transformer) is a kind of transformer architecture designed for computer vision tasks. So I'm guessing TinyViT is most likely a small-scale implementation of this vision transformer architecture.

weary timber
past meteor
#

But it seems to me yours already has that baked in
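
With a 2-logit head and integer labels like the `[0.3, 0.4]` and `[0]` above, softmax cross-entropy is a common choice (in PyTorch that is what `torch.nn.CrossEntropyLoss` computes from raw logits and class indices). Here's a plain-Python sketch of the math, which also shows it matches a single-logit sigmoid loss on the difference of the two logits. The helper names are made up for illustration.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example; `label` is the true class index."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def binary_cross_entropy_with_logit(logit, label):
    """Sigmoid cross-entropy for one example with a single logit."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


logits = [0.3, 0.4]   # two-logit output like the one quoted above
label = 0             # integer class index, like the [0] label

two_logit_loss = cross_entropy(logits, label)
one_logit_loss = binary_cross_entropy_with_logit(logits[1] - logits[0], label)
predicted_class = max(range(len(logits)), key=lambda i: logits[i])
print(two_logit_loss, one_logit_loss, predicted_class)
```

So either design works: keep two outputs with cross-entropy, or shrink the head to one output and use a sigmoid-based binary loss.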

dry raft
#

yeah

rich moth
lapis sequoia
#

two years ago on this day, I wrote my first line of python code. I do not know, thought I would just say that.

plush kettle
#

Which is better in face recognition? Deepface or face_recognition library for python?

thorny geode
#

do you guys learn regression models

left tartan
thorny geode
left tartan
thorny geode
#

not really stuck, but i just barely got through slicing and boolean indexing

#

i cant believe this short text is really confusing to learn lol

#

guess which book it is

left tartan
#

Do you get how the boolean indexing works? It's a pretty important idea

#

That's a pretty crazy example tho. Wth

#

Reminds me of how Uni's teach nested loops right after introducing loops, before students 'get it'

mortal bolt
#

hey guys, I'm looking to learn more about the OpenAI API and how to use it to create AI agents that use RAG and function calling. I don't have much experience with these concepts yet, so I'd appreciate any guidance. Does anyone have suggestions on where to start or resources I could use to learn?

left tartan
mortal bolt
left tartan
#

Their developer docs are really good, plus they have examples on GitHub