#data-science-and-ml

1 messages · Page 172 of 1

median depot
#

Sorry I just say with my pritiction

tepid tartan
#

Oh ok

median depot
#

What about you

tepid tartan
median depot
tepid tartan
#

I might do a dbt Fundamentals and a data certification

wet dome
#

For a digit classification project is there really much data exploration you can do?

serene scaffold
#

some people do a closed 4 that looks like a 9

tepid tartan
#

What's the best data analytics certificate since I'm about to get a diploma in a month. School doesn't teach that much and I need the skill. Any of these things worth it? Trying to get much within a month. Is Python mandatory?

serene scaffold
tepid tartan
#

@serene scaffold, is ML hard in jobs? I took a course and usually use six similar charts and calculations, which I copy and paste but make some small changes to it.

serene scaffold
tepid tartan
#

Like graphs

serene scaffold
#

making data visualizations isn't ML.

tepid tartan
#

I'm using the phone but this is one of the homework

#

Plots

serene scaffold
tepid tartan
#

Oh ok

serene scaffold
tepid tartan
serene scaffold
tepid tartan
serene scaffold
tepid tartan
#

👍🏻

tepid tartan
#

@serene scaffold, What's the best approach to get some skills for data analytics? My goal is to project by learning the DBT Fundamental if you recommend and maybe a certification for my resume

tepid tartan
#

Good approach?

serene scaffold
woeful lodge
#

how much python should i know before starting data science

serene scaffold
woeful lodge
tepid tartan
#

I need to know where to start

#

@serene scaffold Tell me where to start. I know bits by bits from different area

arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

tepid tartan
#

This?

tepid tartan
#

I know that data science use Python and Data analysts not really @serene scaffold

#

I struggle with Python. Took 3 course and it always a headache for me

sharp dawn
#

Hey guys, I've been very interested in studying and learning about ML/AI architectures, algorithms, and implementation. I've been reading research papers and been digging into some open source code bases to both deepen my understanding of the architecture, but also grow knowledge and comptence with python and these types of codebases in general.
Im just wondering does anyone here have any similar interests and positions that can offer me advice in this process.
Thank you

magic dune
#

sorry for late reply thats annoying how does it compare to other thinking models and also you think we would be able it with system commands or finetuning?

#

what model are you using or you using chatgpt api?

jaunty helm
# magic dune sorry for late reply thats annoying how does it compare to other thinking models...

I've only done some simple tests on the website, usually I wait a few weeks so the software issues are resolved before running local
for example a fix to the chat template a few days ago by unsloth reports that this shoots it up in the polyglot benchmark from 40ish to 60ish

my initial impressions:

  • the sorry I can't help with that trigger frequency isn't too bad. if your question is complex and needs reasoning, it seems to actually reason about that rather than getting stuck on thinking about policies.
  • if you can get past the censor, its actual abilities aren't too bad.
  • people have said that these feel like phi, as in their inherent world knowledge is really bad and are meant for agentic use. haven't tested this myself yet.
  • when it comes to creative writing... yeah probably not gonna test further, all the issues about oss show up here and it's not worth it

the most interesting thing about the recent open models is they're all large MoEs with small active parameters - this means you can reasonably run them on consumer hardware, a good chunk of the model can go on normal ram and still run at respectable speeds. these include gpt-oss of course, but also glm, qwen, ernie, etc.
this also means that gpt-oss is less competing against other reasoning models, but these MoE models; on that front I think I see more people preferring qwen's or ernie's similar sized model over the oss 20b
the oss 120b currently has a niche of being very big but having tiny active params even compared against other MoEs, which might mean you can do pure cpu inference with this one (though I don't have enough normal ram to test myself)

tepid tartan
#

Shall look at a job in data entry or reporting assistance/clerk in order to become a data analyst since most require some type of experience?

wet dome
#

ive got a plot of precision and recall values for different thresholds

#

does the threshold value we now choose just depend on our goal?
as in if we want high precision we may choose a higher threshold and vice versa for recall?

tepid tartan
#

I’ll be only finding jobs that need Atleast 1-3 experience 🤔

wet dome
#

what are some models i can try for digit classification?
Im currently doing support vector machine classifier (SVC), random forest classifier, K nearest neighbours classifier

magic dune
buoyant reef
#

I gemini-2.5-flash good at coding?

#

or 2.5-pro

calm cipher
#

also do hyperparameter tuning if you haven't already, it's possible you might be able to squeeze a little extra performance out of the classifier with better hypermarameters

tender umbra
#

what is better approach to combining function calling and structured outputs?

So i have a llm, that has some tools it can call. After it thinks task is finished it has to give final combined output in a structured response format.

  1. Use structured output explictly. I validated that gemini-2.5 flash can call both tools and give final structured output.
  2. Create a function called give_final_output(response_format: ResponseFormat). where ResponseFormat is pydantic representation of output. Then i can stop llm when it asks to call this function and use the arguments as final output.
ionic dirge
#

Please, I am trying to learn python and leaning towards AI Engineering. Where can I find resources to help me master python. Would be better if these resource is in a jupyter notebook or colab or in any other interactive form that allows me to read code and practice mine right after each learned concept. Would be nice if it contains loads of exercises for me to do.

swift tree
ionic dirge
regal galleon
#

You can pass your mathematics. It will be important in the future. There is a lot of difference between 90 %and 95 %model success. But if you are just learning, you won't feel that much difference.

University education dives into a lot of mathematics. (Including statistics)

ionic dirge
vestal saffron
#

Hello everyone. I've been working as a Software Developer and Data Engineer for a few years now. However I have a master in AI with a focus on ML and Deep Learning. I miss doing maths and analysis and I've been looking at job-postings as an AI/ML specialist. Many of these require practical experience with PyTorch and Tensorflow. Now I obiouvly have used Tensorflow as part of my studies, but that was before Keras was part of it and I'm sure both libraries have evolved a lot in the past 6 years.

My question is, how would go about "relearning" and catching up with Modern ML practices and to learn PyTorch (I believe it's the more popular framework now, right?) to an adequate level?

young granite
#

Hi guys,

anyone can suggest more advanced Object Detection like YOLO or Detectron, which are up to date and still maintained?

young granite
unborn spruce
#

What's keras?

misty yarrow
#

Hey i have to write a review paper on it give me a suggestion my domain will he al/ml

peak field
#

Any experienced AI/Ml engineers?
I've been tryna train models on a dataset I've used and it just doesn't stop overfitting. Reaches like 99% accuracy within the 1st epoch. Idk what to do, I tried GroupShuffleSplit, dropout, Data augmentation, early stopping. Idk what to do. Im just exhausted.

tepid tartan
#

Which one is good enough to learn. I wasn’t fun of reading? I rather do tutorials and watching videos

proven silo
# peak field Any experienced AI/Ml engineers? I've been tryna train models on a dataset I've...

Overfitting often means you have too many parameters for the problem. Consider reducing the width of your layers (assuming you’re making NNs), and maybe working with a training set that you introduce some sort of ‘noise’ to.

The other thing is that 99% on epoch 1 is only a problem if you’re not also getting 99% on your test set 😄 Some problems are easily solved with these methods.

calm cipher
torpid holly
#

guys, I am looking to train a vehicle price prediction model(among various types of vehicles) using randomforest. I scraped 200k+ data from a dealership website, filtered 50K with no price information and did further processing like - deleted unneccessary fields -
[ 'title', 'retail_price', 'location', 'state', 'is_auction', 'is_for_sale', 'dealer', 'description', 'has_images', 'image_count', 'video_count', '360_image_count', 'specs.Serial Number', 'specs.Stock Number', "listing_date"]

So now the dataset has 150K records with [Category, Manufacturer, Year, Model, Condition(1: new, 0: used), Price, Hours] (1305 records doesn't have Model info, 35773 doesn't have Hours info)

The result is MAE: 30K USD, which is not good.

Any tweeks to improve the result?
I appreciate your help in advance.

calm cipher
#

I'm wondering if it might be a data issue

#

What is Hours?

#

I guess I'm wondering if it's reasonable to predict the price of a car from the information you have in the data, and if more might help

#

For example, I'm curious if state is relevant, since I'd think a car in California or New York might be more expensive than a car in Utah or Alabama

#

Also I'm wondering if is_auction has implications for the price, that is, if it means the price listed is a starting bid rather than the final sale price

#

It's possible you could get better results with fine tuning or trying different types of models, but I am curious about your feature selection process and how some of the removed features were determined to be unnecessary

spring field
#

me when I fine-tune a model trained on a naturally massively imbalanced dataset with a uniformly balanced dataset and there's indication of improvement during training and validation, but benchmarking on a fresh naturally massively imbalanced dataset shows significant drop in performance across a slew of metrics

#

so now I'll be using a different architecture (but that has more reasons beyond what I stated above of course)

#

yes

#

previous experience indicated that to be the case, yes

unkempt apex
#

yo @spring field what happen?

spring field
#

the core issue was bad performance for one of the classes (with the lowest representation) in the imbalanced dataset, now, the model that was being fine-tuned was actually fed sub-par data of that class in particular during the initial training
however, given the low representation, fine-tuning on a naturally imbalanced dataset would likely produce only marginal improvements, so a decision was made to balance the fine-tuning dataset

the experience that such an approach would work came from previous experiments and likely some papers, though those other experiments were done on a different architecture

but again, there are other reasons beyond these latest failed fine-tuning experiments to move to different architecture

spring field
unkempt apex
#

I thought your cat was testing the keyboard

spring field
#

lol

#

ah, I suppose a crucial detail here is also that the new architecture will be initially trained only on that one class

#

yes

#

UNET do be cool, but nowadays transformers have taken over pretty much everything, lol

#

There's Mamba though, wonder how it will fare in this environment

past bramble
vagrant oyster
#

how can I increase performance of catboost classification model, any new features or addition

serene scaffold
#

@vagrant oyster if you ask a question in more than one place, please direct people to one of those places, to reduce duplication of effort.

vagrant oyster
# serene scaffold if there was a one-size-fits-all solution to this, everyone would just do that e...

Thanks for pushing me to be specific — here’s what I’ve done so far:

Preprocessing / Feature Engineering:

All features converted to categorical strings

Log-transform of skewed numerics (abs(skew) > 1)

Sine/cosine transforms for cyclical time features

Numeric features binned into 15 categories

Missing values filled with sentinel "NA"

Passed all feature indices to cat_features

Model setup:

CatBoostClassifier (Logloss, AUC, learning_rate=0.08, depth=7, iterations=1000, od_type='Iter', od_wait=100)

Stratified 5-fold CV + 80/20 validation split

Current best validation ROC AUC: 0.89

Tried already:

Hyperparam sweeps (depth, learning_rate, l2_leaf_reg, border_count)

Feature removal based on permutation importance

Various categorical binning/grouping methods

Looking for:
Specific CatBoost tricks beyond the basics — e.g., CTR features, one_hot_max_size tuning, target statistics on grouped features, interaction features between high-importance vars, or advanced quantization settings.

serene scaffold
vagrant oyster
serene scaffold
vagrant oyster
vagrant oyster
serene scaffold
vagrant oyster
serene scaffold
#

how many positive samples are there and how many negative samples are there?

serene scaffold
vagrant oyster
serene scaffold
vagrant oyster
serene scaffold
orchid light
#

Training looking good

#

and 3gb checkpoints ......

obsidian plume
#

anyone there

#

need help

#

im working on mosdac [ISRO] data scraping i need to scrape all the data from the missions section in json format so i can create a knowledge graph but the sub sections in the missions catogory are so unpredictable some migh contain images sub headings and tables other time its just text idk how to do it please HELPPPPPPPPPPPPPPPPPP!!!!

#

please anyone bro

wide carbon
#

Hello guys

obsidian plume
past bramble
obsidian plume
#

can anyone please help me if youre online

wide carbon
#

hello

#

@past bramble @obsidian plume

#

Greetings!
The Programming Club shall be organizing CredTech - a FinTech Hackathon in association with Deep Root Investments from 16th to 22nd August 2025. The hackathon, based on the applications of Machine Learning and Development in Finance involves tackling real-world challenges by utilizing cutting-edge technology.

Deep Root Investments is a forward-looking investment management firm specializing in credit risk strategies enhanced by artificial intelligence and machine learning. They combine deep expertise in credit science with cutting-edge technology to uncover mispriced credit opportunities and deliver differentiated risk-adjusted returns. Core Specializations of the firm include:

  1. Credit Risk Arbitrage
  2. AI/ML-Powered Credit Assessment
  3. Advanced Credit Risk Measurement

This is happening and i am looking for team members of indian origin
Prerequisites
Some ai/ml knowledge and enthusiasm

solar thistle
#

I do too now days later actually just wanted to say ty

paper jay
#

guys what languages do i need for ai n machine learning ik python is one of em but im currently learning c coz of college so is that significant in learning it or should i just take it as another academic sub

jaunty helm
fervent bridge
#

Can anyone provide me a link of Resources to AGI advancements and Open Source LLM advancements in relations to like Gemini competitors etc handling large codebases?

paper jay
jaunty helm
paper jay
wide carbon
#

Look use sklearn

#

Its fast enough for you to make models simple ones as well as ensemble models

viral basin
#

anyone wanna help me fix my auto email generator? 😭

wide carbon
#

Hello what is the problem??

#

Auto email generator??

proven harbor
#

Hey everyone im new to this community and discord thing

#

I am a cse student and i am clueless of what to do with my life

#

I heard about data analysis and wanted to know if it is a good option?

iron basalt
#

(And how things like compilers, linkers, etc work)

#

After that you can get into others pretty easily and start reading some open source projects (the ML libraries themselves (or even Python itself)).

#

This is not nearly as difficult as the math needed, it's just an extra detail that you need to learn if you want to work on these libraries or make your own.

#

(Or just want to improve at programming in general)

lapis sequoia
#

Best config.py for prompting let’s go. Any suggestions????

tacit basin
digital forge
#

hey.. i couldnt work out the sync net part.. i couldnt even run it on server with gpu without crashing,, is there any other way to identify the speaking person in video

wide carbon
#

Hello

#

Hey everyone
Anyone here interested to participate in a credtech hackathon
Pre requisites - Some ai/ml knowledge (sklearn) and enthusiasm
This is the info about the hackathon
Greetings!
The Programming Club shall be organizing CredTech - a FinTech Hackathon in association with Deep Root Investments from 16th to 22nd August 2025. The hackathon, based on the applications of Machine Learning and Development in Finance involves tackling real-world challenges by utilizing cutting-edge technology.

Deep Root Investments is a forward-looking investment management firm specializing in credit risk strategies enhanced by artificial intelligence and machine learning. They combine deep expertise in credit science with cutting-edge technology to uncover mispriced credit opportunities and deliver differentiated risk-adjusted returns. Core Specializations of the firm include:

  1. Credit Risk Arbitrage
  2. AI/ML-Powered Credit Assessment
  3. Advanced Credit Risk Measurement
wide carbon
#

Lemme tell one sec

#

@serene dew

#

Its not quite of a much big hackathon but these are according to 1st,2nd and 3rd
571 Usd
285 Usd
171 Usd

summer maple
#

anyone one know how to do the FinBERT fine turning?

serene scaffold
inland mulch
#

well i am just starting out machine learning
i had a simple problem to make a single neuron learn that when input >5 the activation must be one otherwise zero
i was trying to analyize the behaviour of stochastic and mini batch descent and too my surprise the mini batch approach was less effective , can someone explain me why ?

#

the boundary condition is b/W =- 5 (ideal solution), i have obviously filtered out the outliers

proven pier
#

I'm reading a book and they leverage conda. I am most familiar with simply setting up a python venv

#

Is it necessary to swap to conda for this? Or can I stick to just using venvs?

serene scaffold
proven pier
# serene scaffold No, don't switch to conda.

When you get a moment could you expand why? I'm looking at a very large conda yaml file and there's a lot of "dependencies" that are prior to the - pip: section of dependencies, and that's both confusing and concerning

serene scaffold
proven pier
#

I see, that's refreshing to hear. This book I'm reading seems to use conda a lot however.. maybe I can slug through it and try making it work with venvs instead

serene scaffold
#

How old is the book?

proven pier
#

August 2023

#

"Machine Learning Engineering with Python" 2nd edition

serene scaffold
#

Yeah, seems pretty unlikely that their code requires it.

inland mulch
jaunty helm
#

you can run a good chunk of the model on cpu and still have nice speed
it's a big advantage with these small active param moes

twilit topaz
#

Anyone here can help?

#

How do you get rid the the subplot error on the bottom?

#

For the figure I did

ax1,ax2 = plt.subplots(2,1,  
layout='constrained')
ax1 = plt.subplot(2,1,1)
ax1.plot(PPID, color='red', label= 'PPI Energy')
ax1.set_xlabel("PPI Energy")
ax1. set_ylabel("Index 1982-1984=100")
ax1. legend()```
I then did the same for the second plot
Except add a super title on top 
 but I keep getting extra numbers in the x axis 
I followed the Stack Overflow suggestion but it didn't work
spiral peak
twilit topaz
spiral peak
#

That is your ax2?

twilit topaz
#

Yeah

spiral peak
#

Can you post the code for that?

twilit topaz
#

Sure give a minute

twilit topaz
spiral peak
#

I just want to see the code, I can always fake data if needed

twilit topaz
spiral peak
# twilit topaz

Can you comment out the sharex=ax1 for a second and see what the bottom graph looks like?

twilit topaz
twilit topaz
#

The labels correct themselves but the zoom is gone

#

So I'm not sure how to keep the shared zoom and make sure the axis doesn't create a overlap.

spiral peak
#

Okay, so the secondary ticks/labels only appear with the sharex

spiral peak
twilit topaz
#

I get a error

spiral peak
#

Ah sorry, it's when you create the subplots

#

plt.subplots(2, 1, layout="constrained", sharex=True)

twilit topaz
#

Thanks

#

I did lose my x axis on my top plot but it's not a big deal since they have the same axis

inland mulch
proven silo
#

@inland mulch You've not posted any code so it's impossible for anyone to know what you might have done wrong

inland mulch
#
def activation(inputx,weight,bias):
    def sigmoid(x):
        return (1/(1+np.e**-x))
    return sigmoid((weight*inputx) + bias)
def modelN():
    global weight,bias 
    weight=np.random.uniform(-1.5,1.5)
    bias=np.random.uniform(-5,5)
    weight_0=weight
    bias_0=bias
    def gradient(param,inputx,activation,expected):
        def rectify(param):
            if param =="W" : return 2*(activation-expected)*inputx*activation*(1-activation)
            if param == "b" :return 2*(activation-expected)*1*activation*(1-activation)
        rectifyVal=rectify(param=param) 
        return rectifyVal
    def correction(param,hyperparam,rectifyVal):
        return (param - (hyperparam*rectifyVal))
    def epoch(number):
        global weight,bias 
        if number == 0 : return
        for j in range(number):
            for i in range(len(trainL)):
                training_data=trainL[i][0]
                expected=trainL[i][1]
                activation_value=activation(inputx=training_data,weight=weight,bias=bias)
                rectify_W=gradient(param="W",inputx=training_data,expected=expected,activation=activation_value)
                rectify_b=gradient(param="b",inputx=training_data,expected=expected,activation=activation_value)
                weight=correction(param=weight,hyperparam=0.1,rectifyVal=rectify_W)
                bias=correction(param=bias,hyperparam=0.1,rectifyVal=rectify_b)
    epoch(100)
    return [bias/weight,weight,bias,weight_0,bias_0] # used for graph plotting
proven silo
#

I don't have time to dig into it fully but I don't see where either the stochastic vs minibatch aspect comes in here. You're iterating over each of the training items individually so there's no batch involved.

full thorn
#

Guys I’ve learnt the math (statistics) and python behind AI and Machine Learning. I also know neural networks and the weights and biases... but idk how to continue to be able to make things like the MNIST data set number recogniser and other basic AI projects. But how do I learn these things? Any tutorials and stuff to learn this? (I don’t like YT vids for learning but idm anything)

calm cipher
#

For SGD the model is considering all points at once, for minibatch it's a smaller random subset

inland mulch
calm cipher
#

So the weights tend to get pulled around more in suboptimal directions with minibatch

inland mulch
calm cipher
#

No sorry I'm the one that got them mixed up, sgd is one at a time

calm cipher
#

Hm how are you evaluating the performance of the model other than whether it finds the ideal weights?

inland mulch
#

yeah so i was told that sgd should give us most noise because its not a true desent and makes the path (descent noisy)
but considering over 1000 final values
i dont see that , it makes the solutions most close and dense to ideal

inland mulch
#

because the number of epocs are same for all : 100

proven silo
#

I don’t really understand your gradient operation (possibly because it’s hard to read the code on my phone) but if you were processing a minibatch I’d expect to see code where it calculates gradients in several instances, averages them, and applies that. It would probably benefit from a different learning rate too.

jaunty helm
inland mulch
#

yup i got my signal
tbh i am overcomplicating a simple problem

inland mulch
# proven silo I don’t really understand your gradient operation (possibly because it’s hard to...

well for mini batch i had other code

def mini_batch(number):
        global weight,bias
        if number == 0: return
        for k in range(number):
            for i in range(int(len(trainL)/batch_size)): # since divide give 3.0
                rectify_W_sum=0
                rectify_b_sum=0
                for j in range(batch_size):
                    training_data=trainL[(4*i)+j][0]
                    expected=trainL[(4*i)+j][1]
                    activation_value=activation(input=training_data,weight=weight,bias=bias)
                    rectify_W=gradient(param="W",input=training_data,expected=expected,activation=activation_value)
                    rectify_b=gradient(param="b",input=training_data,expected=expected,activation=activation_value)
                    rectify_W_sum+=rectify_W
                    rectify_b_sum+=rectify_b
                weight=correction(param=weight,hyperparam=0.1,rectifyVal=(rectify_W_sum/batch_size))
                bias=correction(param=bias,hyperparam=0.1,rectifyVal=rectify_b_sum/batch_size)
#

but dw, i should not think very much for this simple problem

proven silo
#

First thing I would do is refactor the code so that the two solutions share as much code as possible.

The other alternative is just to use minibatch with a batch size of 1 - that is just stochastic gradient descent anyway

jaunty helm
inland mulch
proven silo
#

Yes, the ‘ideal’ is to train on the whole set at once, i.e. a 100% batch size, but it’s disproportionately slow to do that

jaunty helm
# inland mulch unrelated but doesnt large batch size push the solution to be more precise , i.e...

more precise is one way to look at it, and also more deterministic
at the extreme when batch_size == training_dataset_size, you follow a path completely determined by the gradient, which means you take 0 "wrong" steps and it converges fast, but it may lead you to a local optimum
at batch_size < training_dataset_size, you can imagine you sometimes take "wrong" steps, converging slower, but these wrong steps can also knock you out of local optimums

proven silo
#

Training on the whole set as one batch converges fast in terms of iterations but incredibly slow in terms of real world time

#

(I’m ignoring the local optima problem though)

molten badger
#

guys you have a complete roadmap for ai?

jaunty helm
#

yeah, it is usually unreasonably expensive (computationally, but also lots of memory) to actually calculate the gradient for the entire dataset, so you really don't see batch training anywhere

inland mulch
#

ah i see
also i am moving on from this simple problem and advancing towards the 3b1b neural network playlist , where will try to code the digit recognition model myself , that will def take a while , cuz i will be DEFINITELY experimenting with EVERYTHING there cuz its so complex
but after that i dont have any definite path for where to go

#

any suggestions?

jaunty helm
molten badger
#

do i need to fully master python to start AI and ML?

#

??

proven silo
#

No such thing as a complete road map, as it’s not a solved problem.

No such thing as fully mastering a programming language either.

Just get stuck in and learn as you go.

calm cipher
#

With any kind of descent algorithm it's going to descend on a solution but not necessarily any one in particular

#

The solution it finds is more up to the random initial weights

inland mulch
#

yup there are , the solution is b/W =-5 , and ultimatley , i have deduced considering the simplicity of the problem that comparing the performance of different descents is moot cuz the differnce is tiny

calm cipher
#

I do think it's interesting that some approaches seem to land on the "ideal" more often than others though

jaunty helm
#

the field is vast that a definitive path I don't think exists

inland mulch
jaunty helm
lavish wraith
#

Hello

#

I am trying to making the dashboard using data but i feel so boring to understand the column and purpose of value in normal when i start learning it feel so excited now i feel bored to make project why it happens to me

wary dust
#

Hello all I would love to review your portfolio or resume especially if you are an entry level professional for Data Science or analytics. I have been exploring working with data and just completed an analyst internship over the summer. No degree so I would love to get an idea of how to show my experience and skills as valuable. If there’s a resource you would like to point me to that would be great too!

cedar tusk
#

my vae implementation for curvature is actually working!

brazen rain
#

hi im a beginner at coding python , im aware of the basics, how can i learn python specific to data engineering which will help me excel in it?

brazen rain
#

thx

cedar tusk
#

making my own diffuaion model

#

for image gen

buoyant vine
#

It's been a while since I've last poked my head into building models, does anyone have any references or interesting posts to read through specifcally looking at training and/or distilling a encoder-decoder model for predicting next words? Or more specifically predicting associated words and phrases to an input

#

In my mind i'm thinking it might be possible to use one of the OSS generative AI models and retune & distil it to target that application, but I can't remember basically anything around that lol

#

thinking like input -> output:

  • Tiger Woods -> Golf, Sports Champion
  • Prada -> Shoes, Designer brand
  • Advertising -> Ad, Marketing
full thorn
agile cobalt
buoyant vine
#

I could do a training setup similar to GLOVE I guess

#

biggest pain would be getting the dictionary coverage though

#

a normal LLM pipeline would probably be easier to ship and deploy though

subtle cairn
#

Hi GenAI enthusiasts, I think you all faced similar problems while working for any AI agent.

I really love to hear from you on how you approach the solution of the below cases.

  • Data storing and retrieval approach also called agent memory
    - Long term
    - Short term
  • Reduce token while LLM call
  • Measure accuracy (testing)

Any suggestion from any of you can help all of us.

haughty plank
#

Hey guys, I’m new here and just released my first open source package onto PyPi. It’s a declarative, object oriented approach to creating LLM agents. I would love any feedback if anyone wants to take a look!

https://rmikulec.github.io/pyAgentic

quick note: it only supports OpenAI right now, but I have plans in the roadmap to add other services and local model support

haughty plank
#

Ya! It totally does have some similarities. I think where pydantic-ai took a more functional "fastapi" - type approach, this sticks with object oriented code. I did this to have an easy way to create inheritance hierarchies, as well as the ability to create mixins with AgentExtentions

Another feature that is nice, user's dont need to create an instance of the Agent itself in order to gain access to things like the pydantic response model, tools definitions, etc.

Obviously it cant compete with pydantic-ai cause that is already impressive itself, but I thought i'd give a shot at it as a fun side-project

#

For sure, tool-calling can definitely be finicky. Working on implementing structured output support right now actually, its pretty amazing how well it works haha

scenic parcel
#

Is it true that llms are just better than sota ocr engines a lot of the time

serene scaffold
lapis sequoia
#

do i have to learn R language for ml or is python enough

serene scaffold
scenic parcel
#

Like, it uses a completely different technique than traditional OCR

hollow flume
#

Hi

calm cipher
#

I am guessing but they're probably good enough at this point that images with some words in them can probably embed the content of the text in the embedding space

#

but if you're looking at something like a PDF or image that is nothing but text, at the end of the day an image embedding is of a fixed size and can only contain so much information

#

and the shape of the embedding space also has to describe everything else that the image model is trained to recognize

severe ridge
arctic wedgeBOT
#

src/transformers/models/mllama/modeling_mllama.py lines 164 to 177

# Copied from transformers.models.clip.modeling_clip.CLIPMLP with CLIP->MllamaVision
class MllamaVisionMLP(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.config = config
        self.activation_fn = ACT2FN[config.hidden_act]
        self.fc1 = nn.Linear(config.hidden_size, config.intermediate_size)
        self.fc2 = nn.Linear(config.intermediate_size, config.hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        hidden_states = self.fc1(hidden_states)
        hidden_states = self.activation_fn(hidden_states)
        hidden_states = self.fc2(hidden_states)
        return hidden_states```
runic parcel
#

Hi, has anyone here enrolled for Andrew NG Coursera, courses of machine learning and deep learning and related. They have stopped the audit option and I really wanted to complete the course.

shell pebble
#

hello is there anyone did langchain project?

maiden arch
#

How is my deep learning model doing ?

slender meadow
#

Guys, what is the best way of learning AIML ?

slender meadow
#

i see, can u share it with me how u are doing it?

#

i see

#

i think u hv a great approach

maiden arch
#

Then wat should I do is there a better way to predict

spring field
#

how close to infinity have you got?

reef nest
#

hello..i am new so i wanna learn AI do rn i am doing python and ml so any guidance

#

resoures or roadmap

#

??

lucid elbow
reef nest
maiden arch
#

is there any other strategy then making an ML or deep learning model ?

#

which is better ?

#

I see but do you recommed a place to look for articles ?

#

What bout where to use which ML algo or deep learning one I have some idea but needs more indept knowledge

molten badger
#

guys which things should i download in my laptop before starting AI and Machine learning , i already know the basics of python , i just started

maiden arch
#

if it takes a lot of memory suddenly

molten badger
#

what should i learn first for ai and ML ? i already know the basics of python , and i dont wanna spend more time on python cause my main goal is AI .

serene scaffold
cyan wasp
#

new to ai in python what should i learn first like i wanna make a chatbot or a nureal network

molten badger
#

yea but whats pandas , my university will start in two weeks and i wanna get head start , which course on youtube is best?

serene scaffold
reef nest
#

😭 ,,now i am confuse can someone plz guide me

ivory umbra
#

tryna familiarize myself in data science, is the free first chapter from datacamp enough? cuz im broke

serene scaffold
ivory umbra
serene scaffold
#

it might be a good place to start.

ivory umbra
#

ooh thanks, thats good to know.

stable wind
#

Guys new to ml where do i start(ik python on its own) jyst no ml libraries

#

Working on pytorch rn

mild dirge
gritty ivy
#

I am not sure if this is the right place to ask this, but has anyone made a neural network with asm?

proven silo
gritty ivy
proven silo
#

Probably because it's not worth doing or sharing.

gritty ivy
#

okay

proven silo
#

NNs are perfectly suited to running on GPUs. Even the most efficient CPU implementation can't come close. So there's not much point. Better to ensure the CPU part is optimised for ease of use, e.g. written in Python, and that the actual processing is offloaded to the GPU.

hexed maple
#

hey guys, im using LASSO within a DML framework, should i fit my nuisance fits separately?

opaque condor
#

How should I go about making a image set

opaque condor
#

I'm using cats also how long should I take for each image to find?

#

Because I was investing in making a web scraper so I can scrape the web for all these images and get my data set without driving myself nuts

#

What I mean is going and finding each cat image to put in my data set or should I just use one made

opaque condor
#

Well I would gain the knowledge of how to make my own dataset

opaque condor
#

How can I evaluate the quality

inland mulch
knotty breach
glacial dune
#

hi guys

quasi lotus
#

how would I make my own models or know how to use existing ML algos I have learned basics of Machine Learning I want to dive deeper into algo trading, quant, HFT etc

#

I am looking into research papers is there any recommendations ?

mellow vector
#

so I'm going over neural networks and find myself in a spot of bother, I'm using the iris dataset for context and not setting aside any validation or test data, just trying to sort out the moving pieces and this code wall is mostly fine

#
class MyNetwork(nn.Module):
    def __init__(self, n_hidden_layers, neurons_per_layer, n_features, n_predictions):
        super().__init__()
        self.layer_dict = nn.ModuleDict()
        self.n_h_l = n_hidden_layers

        self.layer_dict['input'] = nn.Linear(n_features, neurons_per_layer)
        
        # hidden layers
        for _ in range(self.n_h_l):
            self.layer_dict[f'hidden_{_}'] = nn.Linear(neurons_per_layer, neurons_per_layer)

        self.layer_dict['output'] = nn.Linear(neurons_per_layer, n_predictions)

    def forward(self, features):

        features = self.layer_dict['input'](features)
        features.relu_()

        for _ in range(self.n_h_l):
            features = self.layer_dict[f'hidden_{_}'](features)
            features.relu_()

        features = self.layer_dict['output'](features)

        return features

my_model = MyNetwork(2,3,4,3)

def train_my_model(the_model, lr = 0.01, n_epochs = 3000):
    loss_func = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(the_model.parameters(), lr = lr)

    for i in range(n_epochs):
        y_hat = the_model(features)
        loss = loss_func(y_hat, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    predictions = the_model(features)
    greatest_prediction = torch.argmax(predictions, axis=1)
    accuracy = 100*torch.mean((greatest_prediction == labels).float())
    
    print(accuracy)

train_my_model(my_model)
#

what I'm confused by is sort of my own labels, I have features feeding into the forward method, and then again when calculating y_hat and I'm a bit confused by what's happening throughout

proven silo
#

It’s the same thing. When you ‘call’ your model, Pytorch uses the model’s ’forward’ method to determine the output

mellow vector
#

not data set but tensor...

#

I sus i'm shadowing but i don't know what's under the hood so i have to ask

#

oh, so my course used convention "x" which was not at all helpful for learning how to write my model class

proven silo
#

X is fine, that’s the standard input to any mathematical function. Not super descriptive, but still.

I’m not entirely sure what you mean about your data set. I guess it’s a bit confusing because you refer to ‘features’ which are usually coming from a data set you iterate over and pass minibatches to ‘forward’ in turn.

mellow vector
#

the instructor used def forward(self, x): this was the same thing claude reported is the convention when i dug into it a bit more

#

had just been a few days since I watched the vid, figure this is code worth commiting to memory

proven silo
# mellow vector the instructor used `def forward(self, x):` this was the same thing claude repor...

I'm at my PC now rather than on my phone so I can see your code a bit clearer now 😄
I wouldn't have a layer_dict - I'm unsure what value that gives you over simply having each layer as a member variable. It probably complicates matters a bit.
I guess it allows you to have that loop inside forward but, to be honest, to begin with I'd just have a list of hidden layers instead.
Not that it really affects the problem - I'm just a firm believer in simplifying the code when it doesn't do what is expected.
The only other big question I have is that I don't see any sort of batching going on. Each epoch is meant to go over the whole training set, but you don't have a training set, just features, and it's not clear what that is exactly. Most likely there's a loop missing here, where you extract those input features from each element in the dataset, either one by one or in batches

mellow vector
#

nah it was never meant too, it's just the pace this instructor is setting. It's a comfortable pace but I do find myself screaming at my screen sometimes.

#

I'm sure he'll incorporate batches and validation in the next section.

waxen kindle
mellow vector
#

it was my forward method that confused me

waxen kindle
#

does it still confuses you or is everything ok ?

mellow vector
#

nah I'm good

#

thanks

proven silo
mellow vector
#

welp, I'll know where to ask when I get confused again

hardy sandal
mellow vector
#

this isn't really much coding, i spent years learning core python and years working in a statistics heavy field

#

but you could totally skip Data structures and various other subjects I've dug into

#

only took about a week to learn the maths and a week writing my first ANN, already being comfortable with python and statistics

hexed maple
#

data science ai bros, am i able to scale my outcome and controls for LASSO, compute weighted penalty loadings (RLASSO), then rescale residuals found from this prediction

#

idk the theory , but in terms of tight confidence intervals, it fcking works

shell pebble
#

Can any of you DM me if you know about LangChain agents?

serene scaffold
strange falcon
#

Hello, I'm hoping someone can help me with a weird LangChain RAG issue.

I'm building a chatbot to read PDFs. My code successfully finds and loads the FAISS vector store from Google Cloud Storage, and the logs even say it's finding the right document chunks

But when the RAG chain tries to build the prompt, the context is completely empty. The final answer is always "no text was provided."

I feel like I've fixed everything else (GCS paths, file IDs, unpickling errors). Has anyone had this problem before, where the system finds the documents but their actual text content is missing? I'm at a dead end and would appreciate any ideas.

Thank you!

serene scaffold
#

Don't wait for someone to ask you to show the code. Just include it in your first message. You can use our paste bin if you need to.

#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

ebon panther
#

Hello guys, i'm from Brazil and want to learn python for data science, which path do you recommend for me to start studying? and which courses??

lapis sequoia
#

with ELO scoring, when it is transitive (A > B > C; -> A>C) are they referring to separate LLMs and there performance? Like, say for example in terms of food I guess, eggs > Waffles > Pizza(E>P; would have to always hold); it would mean you like eggs more than pizza and waffles more than pizza but transitivity would be violated if someone liked pizza more than eggs or the person aid the likes waffles more than eggs. They are just doing that with the performance of separate llms, right?

lapis sequoia
hollow fossil
#

guys could u tell me wht im doing wrong here ?
im practicing using neural networks for predictions but it has only close to 80% accuracy

#

wht can i improve to make it better ?

#

i mean thats only 30% than a random guess
idk just doesn't feel sufficent

#

especially if u were to use this for any real world purposes

calm cipher
#

also since this is a simple binary classification problem I'd do more of an investigation into the precision and recall and not focus solely on the accuracy

hollow fossil
#

oh im sorry i just used the share button

#

dont really know much abt kaggle sharing lemme fix it

#

ok now you should be able to see it

#

seems like u can only view it when im not running it which is a little inconvinient

calm cipher
#

hm yeah I would definitely look into a confusion matrix

#

I just noticed the dataset is imbalanced and about 80% of the data is the negative class

#

I mean in general 80% isn't terrible as far as accuracy goes, but also the model could achieve 80% accuracy by only predicting the negative class

hollow fossil
#

i forgot to test for class imabalances

calm cipher
#

try outputting the f1 score during training or something else that balances precision and recall

#

but also if you are going to do a neural network, I do recommend comparing it against a simple logistic regressor just to give you some context for the results you see

calm cipher
#

also this is a smaller point but I strongly recommend a train, test, and validation set so you can meaningfully compare the results

hollow fossil
#

thanks for the advise 🙂

hollow fossil
calm cipher
#

you used two splits, train/test

hollow fossil
#

i ran it for close to 120 epochs and there were no differences in accuracy or validation scores

calm cipher
#

if you're going to do hyperparameter tuning, a baseline like I suggested, or tweak the neural network like @final kiln suggested, you need a validation set to compare them

#

train on the training set, use the validation set to find the best-performing model, then perform a final evaluation on the test set only on the best-performing model

#

tweaking the neural network is ok but you really need to figure out the imbalance and find a baseline

#

ehhh

#

this looks like an introductory project, trying to encode the surname is a lot to deal with

#

100% the imbalance is the biggest issue here

#

all this stuff about tweaking the network and doing embeddings of people's names and stuff isn't going to help if 80% of the dataset is the negative class and no one has seen the confusion matrix yet

#

the most important thing right now is understanding what that 80% accuracy figure means and getting a better picture of the predictions the model is making, maybe doing some precision/recall curve work, and then tweak the network

hollow fossil
sharp crow
#

Does anyone have a really messy dataset(s) with atleast 100k records? I need one for my project.

hollow fossil
#

as u said the class imbalance probably contributes since there's a really low precision for one of the classes

calm cipher
#

But it correctly identified 62% of the positive class overall

#

So if in reading this right, it's saying a lot of things are positive that aren't

calm cipher
#

There are also techniques for balancing the dataset

#

I would start with that just to make sure you're giving the model the best opportunity to learn the data, but I would also consider other models besides neural networks too

#

Unless this is just an exercise to learn neural networks

marsh sequoia
#

Newb here, sorry in advance if this is the wrong channel for this question! I'm working on a simple RAG PoC: Currently I'm using the basics, RecursiveCharacterTextSplitter from langchain_text_splitters, along with WebBaseLoader from langchain_community.document_loaders to extract content from a webpage, turn it into embeddings, and index in a vector store. Basically RAG 101 stuff.

My question is related to trying to figure out a useful way to identify where in a page a particular "chunk" is pulled from. A very long web page for example, might have 10 "chunks" in it, but when one of those chunks if indexed / added to a vector database, it isn't really related to a particular part of the page. I am familiar with the web / HTML / etc, so realize this is perhaps an impossible problem to solve (there may not be a reasonable "hook" that a tool could use to programmatically identify a "chunk" in a way that's useful in the output), but I was curious if anybody is aware of any tooling / strategy to help identify where on a page a chunk would come from, so for example in the final UI I might be able to somehow identify where in a particular webpage the content was referenced from?

vital cradle
#

hi i wanna be a data analyst/scientist i have some python fundamentals and now im learning pandas
but im not so perfect in excel (i know some formulas and pivot tables a little bit) also i did not learned tableu or powerbi whatever
so am i doing wrong
cause i feed more happy when im working about python
(btw im not even a university stundent i will start this year)

hollow fossil
hollow fossil
#

thanks guys for the help 🙂

calm cipher
# hollow fossil oh i see will look into it

One more thing, it's a useful exercise to put yourself in the position of the bank and think about what precision and recall mean for you in this problem, and which is more important

#

Assuming you aren't already familiar with the precision and recall trade-off, I I highly recommend using this problem as an excuse to study it

abstract wasp
#

hi i need some help, im trying to use rag for my llm and im on the data prep section, im was about to chunk but should i tokenize first, then chunk then embed? i thought that if i tokenize after wouldnt the chunks end up becoming larger in size? i was planning on adding some overlap too... also should i use an embedding model or do it manually? im a beginner w rag so was thinking if doing it from scratch as much as possible would help me understand the process better or idk what do u think?

abstract wasp
agile cobalt
#

same goes for whenever or not to embed

How do you want to search later?
if hardcoded filters or traditional full text search suffice for your use case, no need
If you want semantic search, you'll need to embed and preferably create a index in whichever vector db you choose

#

depending on your use case, the ideal chunk size could be anywhere from a single sentence to an entire pdf

abstract wasp
torpid ember
#

trying to run from pyspark.sql import SparkSession/spark = SparkSession.builder.appName('test').getOrCreate() just hangs in my ipynb file forever anyone know a fix?

slender crown
#

hi

serene scaffold
slender crown
#

does people generally talk about ML or data science at here

slender crown
serene scaffold
slender crown
#

anyways

#

i've seen lots of LLM NLP models on huggingface before

serene scaffold
#

I'm on a walk right now, but you don't need me specifically. There are lots of knowledgeable people in this community

slender crown
#

and they all had the transformers arch

#

dunno

#

ur the only person here?

serene scaffold
#

People come and go pretty quickly throughout the day. They respond when they check the channel and see a message that they know how to answer

agile cobalt
rugged stratus
#

Sup

stiff grove
#

!paste

#

is my explaination correct?

dim torrent
#

need help with ai agent pretty decent already just need finishign stuff

serene scaffold
calm cipher
# stiff grove is my explaination correct?

I agree with @final kiln , and also I would recommend that if you aren't already familiar with lists and dictionaries and other Python language features, you should take the time to understand them before getting too far into Pandas

#

They are very much their own thing independent of Pandas, and it's better to understand them on their own terms instead of how they are used by Pandas

#

To give you an example, to reword your explanation of {}, the dictionary's keys are used as column names and its values become the values inside each column

#

If that doesn't make sense, especially the key/value terminology, definitely go study dictionaries

echo iris
#

I don't understand oversampling. Can someone explain? I understand that if we have 7000 of x and 4000 of y then there will be bias as there will be a greater incentive to prioritise x, but from what i know, oversampling will change it so that there is 7000 of x and 7000 of y. But won't that alter the original data and lead to inaccurate results?

#

and won't that change the meaning of the data?

fair tulip
#

anyone beginner in data science? like who knows pandas and matplotlib

jaunty helm
pine prism
#

how does AI exactly work because i thought of making an AI and so far all im
doing is just giving it responses that im writing manually and making the AI randomly choose from the responses it's allowed this seems very basic and I feel like im doing something wrong in a way

arctic wedgeBOT
pine prism
#

the letters took so long to do btw

jaunty helm
pine prism
pine prism
jaunty helm
jaunty helm
pine prism
#

like if i ask how to start javascript it shows the sources it pulls from like reddit and then shows me the summarized result

jaunty helm
pine prism
jaunty helm
pine prism
jaunty helm
#

right now you can also download a model trained by others and just run it; running it requires a lot less (but still a considerable amount) of compute compared to training

pine prism
jaunty helm
#

hence the reason people are using ML approaches

pine prism
#

ima get to work on manually writing my billions of if statements

gritty vessel
#

I found the reason I was creating sequences on the fly that's why it was taking so much time after saving sequences in start and saving them now it takes only 3-4 mins per epoch .I reduced image size as well to 256 x 256

#

yes you are correct I fixed that as well now I am predict t9,t10,t11 and input is t to t8

#

yes first 8 are input and later 3 are outputs so its rolling window first sequence is t to t8 as inputs and t9 to t11 as outputs,next sequence will be t1 to t9 as inputs and then t10 to t12 as outputs and so on for whole data set

#

these are the plots

#

from what I realised

#

Epoch [25/25] Train Loss: 12.5875 Train MAE: 12.5875 Val Loss: 8.5209 Val MAE: 8.5209 LR: 0.000100 my Loss and Mae both are same for all epochs 😭

waxen kindle
#

are you sure you are not plotting twice the same thing ?

#

what loss do you use ?

gritty vessel
#

mae

#

L1Loss

waxen kindle
#

so it's normal for both to be the same, if they are the same thing

gritty vessel
#

But still Train loss and val loss should be different right?

#

as data is different

waxen kindle
#

train and loss are different

#

but of course your two plots are identical

#

(Or are you using different data for both plots ?)

gritty vessel
#

yes

#

So like after training loop I added val loop for predictions and evaluation

#

I might have messed something

waxen kindle
#

but if you do that, you compute val on the val data and train on the train data

#

so both images are the same but show 2 different plots

gritty vessel
#

that should not be the case

#

as what I did to stating 70% as training and than from remaining 30% took 15% val and 15% test

#

I doubt it should be same

waxen kindle
#

do you mind sharing your code ?

#

at least the part where you compute the 4 values

gritty vessel
#

sure

#

Before that let me try again

#

I used gpt to do it

#

I will first write it on my own

calm cipher
gritty vessel
calm cipher
#

ok yeah you're just doing the same computation twice

gritty vessel
#

but data is different

calm cipher
#

wait how so

gritty vessel
#

for train its first 70% data

calm cipher
#

you have "loss over epochs" and "MAE over epochs"

gritty vessel
#

and for val 70:85

calm cipher
#

i'm saying the "loss over epochs" and "MAE over epochs" graphs look the same because they're both computing the same thing, but the train/validation loss in each graph is different because they are different datasets

#

right? or is there some difference I'm not seeing

gritty vessel
#

yes

#

They are computing same thing

calm cipher
#

ok

gritty vessel
#

but data is different so results should be different right?

#

'''pbar = tqdm(loader, desc="Training", leave=False)
for X, y in pbar:
X, y = X.to(device), y.to(device)
batch_size = X.size(0)

    optimizer.zero_grad()
    outputs = model(X)
    loss = loss_fn(outputs, y)
    loss.backward()
    optimizer.step()

    # Track
    total_loss += loss.item() * batch_size
    total_sq_error += compute_mse_sum(outputs, y)
    num_samples += batch_size

    pbar.set_postfix({"batch_loss": f"{loss.item():.4f}"})

avg_loss = total_loss / num_samples
avg_mse = total_sq_error / num_samples
return avg_loss, avg_mse'''
calm cipher
#

there are two dimensions to think about here

#

one is L1 vs MAE, the other is training vs validation

gritty vessel
#

this is for training I replaced mae with mse

calm cipher
#

oh

gritty vessel
#

and this is for val def validate_one_epoch(model, loader, loss_fn, device):
model.eval()
total_loss = 0.0
total_sq_error = 0.0
num_samples = 0

with torch.no_grad():
    pbar = tqdm(loader, desc="Validation", leave=False)
    for X, y in pbar:
        X, y = X.to(device), y.to(device)
        batch_size = X.size(0)

        outputs = model(X)
        loss = loss_fn(outputs, y)

        total_loss += loss.item() * batch_size
        total_sq_error += compute_mse_sum(outputs, y)
        num_samples += batch_size

        pbar.set_postfix({"batch_loss": f"{loss.item():.4f}"})

avg_loss = total_loss / num_samples
avg_mse = total_sq_error / num_samples
return avg_loss, avg_mse
#

so in loader I give train_loader and val_loader

calm cipher
#

what is compute_mse_sum

gritty vessel
#

train_loss, train_mse = train_one_epoch(model, train_loader, loss_fn, optimizer, device)
val_loss, val_mse = validate_one_epoch(model, val_loader, loss_fn, device)

gritty vessel
#

def compute_mse_sum(outputs, targets):
return torch.sum((outputs - targets) ** 2).item()

calm cipher
#

why not just use the mse_loss() function? this seems like it's more complicated than it needs to be

gritty vessel
#

got it but they will do same thing i guess

calm cipher
#

I see

waxen kindle
#

and same for validation

#

so of course the plots are identical

gritty vessel
#

train_loss, train_mse = train_one_epoch(model, train_loader, loss_fn, optimizer, device)
val_loss, val_mse = validate_one_epoch(model, val_loader, loss_fn, device)

#

different loaders

waxen kindle
#

are you sure you are using it ?

gritty vessel
#

here
pbar = tqdm(loader, desc="Validation", leave=False)
for X, y in pbar:

#

loader

#

its before the loop

waxen kindle
#

ok but you are computing both the loss and the mse in the same function

total_loss += loss.item() * batch_size
total_sq_error += compute_mse_sum(outputs, y)
gritty vessel
#

oh ok so we dont do it each batch wise?

waxen kindle
#

no you can do it for each batch that's not the point

gritty vessel
#

and then avg out

#

at last

waxen kindle
#

but those two lines are one after the other, working with the same loader

gritty vessel
#

yes those are for trainloss and trainsqerror

waxen kindle
#

yes

#

on the same loader

gritty vessel
#

oh ok

#

I will check it

waxen kindle
#

they are part of the same for loop

#

litteraly using the same inputs/outputs

gritty vessel
#

I am little confused as I am doing trainloss and trainsqerror and after that I am doing valloss and valsqerror

waxen kindle
#

yes that's fine

#

but you are doing it on the same loader, so you get the same function

gritty vessel
#

def train_one_epoch(model, loader, loss_fn, optimizer, device):
model.train()
total_loss = 0.0
total_sq_error = 0.0
num_samples = 0

pbar = tqdm(loader, desc="Training", leave=False)
for X, y in pbar:
    X, y = X.to(device), y.to(device)
    batch_size = X.size(0)

    optimizer.zero_grad()
    outputs = model(X)
    loss = loss_fn(outputs, y)
    loss.backward()
    optimizer.step()

    # Track
    total_loss += loss.item() * batch_size
    total_sq_error += compute_mse_sum(outputs, y)
    num_samples += batch_size

    pbar.set_postfix({"batch_loss": f"{loss.item():.4f}"})

avg_loss = total_loss / num_samples
avg_mse = total_sq_error / num_samples
return avg_loss, avg_mse

def validate_one_epoch(model, loader, loss_fn, device):
model.eval()
total_loss = 0.0
total_sq_error = 0.0
num_samples = 0

with torch.no_grad():
    pbar = tqdm(loader, desc="Validation", leave=False)
    for X, y in pbar:
        X, y = X.to(device), y.to(device)
        batch_size = X.size(0)

        outputs = model(X)
        loss = loss_fn(outputs, y)

        total_loss += loss.item() * batch_size
        total_sq_error += compute_mse_sum(outputs, y)
        num_samples += batch_size

        pbar.set_postfix({"batch_loss": f"{loss.item():.4f}"})

avg_loss = total_loss / num_samples
avg_mse = total_sq_error / num_samples
return avg_loss, avg_mse
#

for epoch in range(1, num_epochs + 1):
print(f"\nEpoch {epoch}/{num_epochs}")

train_loss, train_mse = train_one_epoch(model, train_loader, loss_fn, optimizer, device)
val_loss, val_mse = validate_one_epoch(model, val_loader, loss_fn, device)
waxen kindle
#

so, only one loader for mse and loss

#

1 for train, one for val, and on each of them you compute the loss and the mse

gritty vessel
#

yes

#

loader are different see the names

waxen kindle
#

so if you use the mse as the loss, you'll get the exact same values

gritty vessel
#

okie I a m confused I will go through the code once properly and update you

waxen kindle
#

to simplify, here is what you are doing:

import mse_function
loss_function = mse_function
for i in range(epochs):

     #training
     output = model(x_train)
     loss = loss_function(output, y_train)
     mse = mse_function(output, y_train)
     loss.backward()
     optimizer.step()
     loss_list_train.append(loss.item())
     mse_list_train.append(loss.item())
     
     #validation
     output = model(x_val)
     loss = loss_function(output, y_val)
     mse = mse_function(output, y_val)

     loss_list_val.append(loss.item())
     mse_list_val.append(loss.item())

So obviously, the content of both loss_list_val and mse_list_val are the same

#

(I removed the batching for the sake of simplicity)

gritty vessel
#

Oh okay

#

they are calculated by same function

#

as loss function and mse function are same

#

Oh dang that was so dumb

waxen kindle
#

yes they are, that's what we talked about at the very beginning

waxen kindle
# gritty vessel

do that but with mse as the loss and mae on the right and you'll get different plots

gritty vessel
#

got it I am updating the code and keep it on run

#

Epoch 1/50 [Train]: 22%|███▊ | 168/757 [00:58<03:23, 2.90it/s, Batch Loss=78844.4609, Batch MAE=280.3940]

#

lets see what happens looks crazy high lol

waxen kindle
#

you are accumulating an MSE so it's normal

gritty vessel
#

yes

waxen kindle
#

MSE is square of MAE in order of magnitude

gritty vessel
#

yes

#

I kept it for 50epochs it will take around 2 hours

#

check this out

#

This paper proposes a simple modification to the mean squared error loss function that eliminates the problem of overly-smooth fine scales in data-driven weather forecasts.

#

Imma go to sleep now

#

no

#

Majority are hard coded

#

Deterministic models

#

check this one The WRF model

#

aah okay yeah

#

🙂

#

here from data-driven they meant ml dl models

#

haha

#

nah its fine

#

now that I think of everything is data driven approach

safe merlin
#

I want to start ML but I dont really see anyone talk any code. Its great to understand the concept on how an ML works but I never seem to pick up how to actually code one. Do you guys have any recommendations to learning actual code?

latent steppe
#

hi, i have copied an api based ai assistant from a youtuber
now i want to add some features like it can help making 3d models and can perform functions on command while editing

magic dune
#

Or are you asking how to use them in practice

opaque condor
#

How many images do I need again for making my own dataset because may as well learn how to make one from ground zero in case I can't find one

gritty vessel
#

results are fine as well

#

will look in different loss functions

#

It looks like its smoothing the image

#

Looking in that

#

yes

#

its actually good as its able to detect the structure

#

input are simialr images

#

this is an example

#

as you can see there are some changes

#

just a min let me look

#

these are predictions on train -first sequence

gritty vessel
#

yes

#

satellite imagery

#

of INSAT

#

so these are clouds you can see it forming .changing shapes and dispersing

#

wait really?

#

wait its gone

#

yes

#

exactly

#

yeah

#

I am gonna go now and look what I can do to get sharp image

#

now I see its not actually learning

#

in input features edge is same

#

30epochs after that it went like flat

#

I was in logs

#

its movie around 7.1 to 7.5

#

I will try that but its from paper

#

one with 1024 bottleneck

#

alright imma go play with it

snow cypress
#

my code for getting mood based on color

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import numpy as np
import pandas as pd

''' ----- CONFIG ----- '''
MODEL_PATH = r"Main\Machine Learning\model\ColorMood_SVMV1.pkl"  # folder to save/load model
DATASET_PATH = r'Main\Machine Learning\data\ColorMoodDataSet.csv'

# region PREPARE Data

''' ----- LOAD DATA -----'''
ds = pd.read_csv(DATASET_PATH, comment='#')

X = ds.drop('mood',axis=1)
y = ds['mood']

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# endregion

# region SVM Classifier

# Initialize the SVM classifier with One-vs-One strategy
svm_ovo = SVC(decision_function_shape='ovo')
svm_ovo.fit(X_train, y_train)

# endregion

# region Test

# Predict and evaluate the One-vs-One model
y_pred_ovo = svm_ovo.predict(X_test)
print("One-vs-One Accuracy:", accuracy_score(y_test, y_pred_ovo))

# endregion

# region Save Model

import joblib

# save
joblib.dump(svm_ovo, MODEL_PATH) 
print("✅ Model saved at:", MODEL_PATH)

# endregion

this has accuracy of 1

but for the dataset

PrevNote,Note,Tempo,Mood,Dur,NextNote
60, 65, 120, 2, 0.5, 67
62, 67, 120, 2, 0.5, 69
65, 70, 120, 2, 0.5, 60
67, 60, 120, 2, 0.5, 62
69, 62, 120, 2, 0.5, 64
65, 60, 120, 2, 0.5, 60
62, 67, 120, 2, 0.5, 62
65, 70, 120, 2, 0.5, 65
67, 60, 120, 2, 0.5, 67
69, 62, 120, 2, 0.5, 69

it has accuracy of 0 any particular method i should use to train my model on music notes?

calm cipher
snow cypress
calm cipher
#

You're wanting to change it to predict something else? What is this new data that's getting a 0 on accuracy?

calm cipher
#

Oh I see you're giving mood, note, etc as input and trying to predict nextnote

#

How many different notes are in the dataset, it's all c4 to a4? How much data do you have and how are you preprocessing mood? Also I would reiterate that standardizing the features is a good idea

snow cypress
# calm cipher How many different notes are in the dataset, it's all c4 to a4? How much data do...

https://youtu.be/4bCrNl4Bx1M?si=OWj-lCMf_VQ2ili2

I found this video that might help me
Not sure how the code would work tho

In this episode of the AI show Erika explains how to create deep learning models with music as the input. She begins by describing the problem of generating music by specifically describing how she generated the appropriate features from a midi file. She then describes the deep learning model she used in order to generate music. Learn more:

Blo...

▶ Play video
signal raptor
#

hey guys

#

I am completely new to coding other than the basics and just want to learn by observing i dont have much value to provide

crystal axle
#

Are good resources for AI in general pined into the channel?

I want to create an AI that plays a bit complicated match 3 game but not sure how to approach this complex project and I didn't find much on this topic it's mostly entertainment or something unrelated

calm cipher
#

but that is a big complicated machine learning topic and I wouldn't recommend making that your introduction to machine learning

#

assuming you're familiar with Python already, I really like the book "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow", and I think it has a section on reinforcement learning

#

there are also some resources in the pinned messages

crystal axle
serene scaffold
crystal axle
#

It's based on turn you've a set number of movies and you must reach to the score

calm cipher
crystal axle
#

Yes rather by getting less score or not getting extra movies or not using items you have at disposal

serene scaffold
calm cipher
#

so a really good concept to familiarize yourself with is something called the state space, because this is going to determine whether you can reasonably solve the game with an algorithm versus needing to use a probabilistic method like reinforcement learning

calm cipher
#

if the state space of the game is relatively small you can solve it just by doing an exhaustive search over all possible outcomes

#

if it's too large to do that, the heuristic idea starts becoming more appealing

crystal axle
#

This sounds cool and all but I have zero experience with designing stuff with python, at most I just know some syntax I thought of going through books first to get a bit of an understanding of reinforcement learning

calm cipher
#

and then of course if it is quite large and complicated it could be a good use case for reinforcement learning

calm cipher
crystal axle
#

By very solid you mean I should study it for couple of years before going into reinforcement learning? I wanted to do that but to me learning stuff without a clear roadmap is confusing and discourage me easily

calm cipher
#

idk what it says, you have to have an account to use it

snow cypress
# snow cypress https://youtu.be/4bCrNl4Bx1M?si=OWj-lCMf_VQ2ili2 I found this video that might ...

Ok I have had an idea didn't try it yet but though of using markov chain

So

Sudo code

Key = (previous note, curr note, type, number, mood) 
Value = {next note : prob}

For data in getdata(data.csv):
        dict[key][Next note] += 1

Convert to prob

For key, next_counts in dict.items()
          Total = sum(next_count.values()
          Chain[key] = {note: count/total for note, count in next_count.items()


This should give a set of notes and their probability that can play based on the input params(key) 

Am I going in the right direction?

#

Also if I want to predict a data that was not in this set how do I do that

calm cipher
#

this is a more limited variant of what you were doing with the sklearn model before

#

totally worth doing as an exercise, but the support vector machine was estimating the relationship between the input data and output note in a way that this method can't

hybrid pumice
#

Anyone know how realistic it is get into machine learning if you are majoring in electrical and comp engineering?

serene scaffold
calm cipher
hybrid pumice
#

I'm trying to us ece as a fail safe

serene scaffold
#

but also, if you know you want to do ML, why are you in ECE?

hybrid pumice
#

fail safe

#

i am a freshman in uni

serene scaffold
#

I don't understand your reasoning. if your goal is to get a job in A, why would you study B for fear that you won't get a job in A?

hybrid pumice
#

i kind of expected compujter engineering to give me a foundation

serene scaffold
#

are jobs in ECE more plentiful, or something?

hybrid pumice
#

yeah

#

hmmm

#

I'm also not 100% sure if ML is for me

#

i'm still kind of in the deciding phase for what I want to pursue

serene scaffold
hybrid pumice
#

hmmm

serene scaffold
hybrid pumice
#

there is

#

I already spoke w people about it

serene scaffold
#

great

hybrid pumice
#

I'm also taking data structures

serene scaffold
#

that's not really part of ML, but yay

hybrid pumice
#

and some cs classes for comp engineering

#

thing is

hybrid pumice
#

is it possible to learn machine learning on my own?

serene scaffold
#

I'm confused. are you taking a data structures course and an ML course? two different courses?

serene scaffold
hybrid pumice
#

I can take an ML elective as well

serene scaffold
#

great

hybrid pumice
#

so if i get a certification

serene scaffold
#

no

hybrid pumice
#

hmmm

serene scaffold
#

it needs to be from a university

#

coursera et al are fine for upskilling, but nobody cares about those ML certs.

hybrid pumice
#

so if lets say I want to go into machine learning what major would I look at

serene scaffold
#

very likely CS.
On the scale of academic history, ML is very new, and has historically just been a niche area of computer science (which itself is also very new).

#

if you see "masters degree in {data science, machine learning, artificial intelligence}", it's probably low-key predatory

hybrid pumice
#

wdym predatory

serene scaffold
# hybrid pumice wdym predatory

they're designed to extract money from people who are desperate to switch into a currently-hyped career type, and don't necessarily impart a credential that employers perceive as having any value.

hybrid pumice
#

so mainly comp sci is what i'm looking for

serene scaffold
#

yes. if CS is part of the engineering school of your university, it should be easy to switch.

hybrid pumice
#

it is...

#

it's just a pain bc i have no idea what i should do

serene scaffold
hybrid pumice
#

thats the thing though

#

i enjoy a ton of things engineering and coding related

#

to be honest I'm mainly going into ML for the money but I also know that i'm interested in this type of thing as well

#

its the same thing as electrical and computer engineering except the job market is way more consistent but also doesn't reach values as high as ML

snow cypress
#

so i tried ml to identify music notes as in i feed the previous note and current note and it predicts the next note i was stuck for a while but then i tried markov chain

it works but it cant predict new data only ones i train it on

i am using chordonomicon for the dataset just smaller set for testing that has been transfromed in this format

is chordonomicon good enough to get most of the notes right but if i wanted to generlize do i use SVM?

Type,Number,PrevNote,CurrNote,Nextnote
1,1,F,C,E7
1,1,C,E7,Am
1,1,E7,Am,C
...
import pandas as pd
from collections import defaultdict

''' ----- CONFIG ----- '''
DATASET_PATH = r"Main\Machine Learning\data\chordonomicon_SmallPrepared.csv"

ds = pd.read_csv(DATASET_PATH)

# region CREATE TRANSITION DICT

# Dictionary: (Type, Number, PrevNote, CurrNote) -> {NextNote : count}
transitions = defaultdict(lambda: defaultdict(int))

for _, row in ds.iterrows():
    key = (row["Type"], row["Number"], row["PrevNote"], row["CurrNote"])
    transitions[key][row["NextNote"]] += 1

# endregion

# region CONVERT COUNTS TO PROBABILITY

# Dictionary: (Type, Number, PrevNote, CurrNote) -> {NextNote : probability}
markov_chain = {}

for key, next_count in transitions.items():
    Total = sum(next_count.values())

    markov_chain[key] = {note: count/Total for note,count in next_count.items()}

# endregion

# region LOOKUP

state = (2, 1, 'F', 'G')   # Type, Number, PrevNote, CurrNote
print(markov_chain.get(state, {}))

# endregion
hollow fossil
#

model = keras.Sequential([
Input(shape=(12,)),
Dense(6, activation='relu'),
Dropout(0.1),
Dense(6, activation='relu'),
Dropout(0.1),
Dense(1, activation='sigmoid')
])
model.compile(optimizer='adamW', loss=BinaryFocalLoss(gamma=2), metrics=['accuracy'])

this model performed better with 85% accuracy both on test and training data than the model below

model = keras.Sequential([
Input(shape=(12,)),
Dense(100, activation='relu', kernel_regularizer='l2'),
Dropout(0.2),
Dense(100, activation='relu', kernel_regularizer='l2'),
Dense(1, activation='sigmoid', kernel_regularizer='l2')
])
model.compile(optimizer='adamW', loss=BinaryFocalLoss(gamma=2), metrics=['accuracy'])
this model only got 79% accuracy, despite having more neurons, why is this ? i even added regularizers

waxen kindle
#

Could be overfitting, could be randomness (did you try different seeds ?)

#

It is also weird that your last layer is a Dense(1) while you compute an accuracy
Dense(1) indicate you are doing a regression, but accuracy is a classification metric. What are you trying to achieve ?

late bolt
#

12 -> 100 -> dropout => 100 -> 1?

late bolt
#

first one total param : 127
second one total param : 11,501

gentle stone
#

Hello guys I'm new in data science

#

Is kaggle a good platform for learn data science?

waxen kindle
#

I don't know about the courses, but the datasets and examples are pretty good

gentle stone
#

Honestly it is pretty hard to learn data science especially in logic from kaggle for the first time(my opinion), I am self-taught.

waxen kindle
#

You should rely on other sources too, youtube have some good courses, maybe someone else here have some recommendations

gentle stone
hollow fossil
#

oh that makes sense

#

was trying to reduce overfitting using the regularizers

lilac canopy
#

Hello everyone, is there any quant traders in the server ?

hollow fossil
#

with neural networks it feels like im just taking shots in the dark sometimes
like add a dropout layer here, arbitrarily change regularizers, add or change the amount of layers in the NN
i get the concepts behind each of them, but the amount of variability seems honestly overwhelming
so I can't understand how to properly approach making a neural network

#

any ideas on how to improve this ?

hollow fossil
#

will look into it thanks

#

what are some of the main youtubers or courses (free ones) that you'd reccomend for learning data science ?

#

im currently using 3blue1brown for theoretical stuff and the mathematics of it, freecodeacademy, codebasics and stat quest

calm cipher
#

I'm working on a problem in pytorch that involves comparing a model trained with both autoregression and teacher forcing, and because of how the model works, I'm incrementally accumulating a hidden tensor of outputs then attending to it before giving that output to a decoder. so something like this in pseudocode:

context = torch.zeros(batch, num_timesteps, 32, device=inputs.device)
for i in range(num_timesteps):
  context[:, i] = encoder(inputs)
  output = decoder(attention(context))

I'm training in mixed-precision mode and I noticed the context tensor is float32, but the encoder outputs are float16, so the encoder outputs are being upcast to float32. then when the context is attended to, the results are cast back down to float16

#

I don't see a lot in the documentation about whether it's good practice to create tensors like this of the correct dtype, is it ok to ignore this or should I be proactively creating the context tensor as float16 if mixed precision is active?

waxen kindle
#

It probably doesn't matter, your memory usage isn't optimized but it doesn't really matter

calm cipher
#

I was mainly looking at this from a performance perspective because the main loop of the model is quite slow and I'm trying to squeeze out as much performance as possible

#

I tried doing %timeit and it does not seem to make a difference

#
tensor_32_hist = torch.zeros(1000, 1000, 1000, dtype=torch.float, device="cuda")
torch_16_in = torch.rand(1000, 1000, device="cuda")

def run():
    for i in range(1000):
        tensor_32_hist[:, i] = torch_16_in

%timeit run()

14.8 ms ± 250 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#
tensor_16_hist = torch.zeros(1000, 1000, 1000, dtype=torch.half, device="cuda")
torch_16_in = torch.rand(1000, 1000, device="cuda")

def run():
    for i in range(1000):
        tensor_16_hist[:, i] = torch_16_in

%timeit run()

11 ms ± 138 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#

i'm not at a scale where that few ms is going to make a difference lol

#

this is a more memory efficient solution and it is quite slow

torch_16_in = torch.rand(1000, 1, 1000, device="cuda")

def run():
    hist = []
    for i in range(1000):
        hist.append(torch_16_in)
        torch.concat(hist, dim=1)

%timeit run()

4.7 s ± 260 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)

#

even then it's eventually going to take up the same amount of space as the faster solution, but not for the whole loop

#

oh whoops I forgot to set torch_16_in to the Half type but it still doesn't make a difference when I do

cerulean violet
#

Guys wth is markov model

desert oar
# cerulean violet Guys wth is markov model

did you look it up? this channel isn't a good replacement for a search engine. but if you already tried to look it up and don't understand something, this channel is a good resource for asking specific questions.

cerulean violet
#

And this isn't helping anyway

desert oar
#

Pr(X[t] | X[0], X[1], ..., X[t-1]) = Pr(X[t] | X[t-1])

#

a Markov model is broadly a probability model with that property

cerulean violet
#

Is it something like the anchor affect where one guess affectsthe others to anchor near the first one

desert oar
#

it's not about a guess, it's about the actual outcome

#

think of it like following directions on a map

cerulean violet
desert oar
#

it doesn't matter how you got to where you are, right? all that matters is where you are now, and the next turn you need to make.

#

the next turn you need to make is independent of the 100 turns before you got to where you are

#

that's one example of the markov property

cerulean violet
#

So next turn ain't dependent of the previous

desert oar
#

yes, but it does depend on the current location

cerulean violet
#

Oo

desert oar
#

(this is actually how Kalman filtering works if you've heard of that, the Kalman filter is a Markov model)

cerulean violet
#

Lemme check what it is

desert oar
#

don't worry about it. it might be more of a distraction than a help, if you don't already know what it is

#

there is also a very important type of model called a hidden markov model where you don't actually observe the "state" X. instead of you observe something else "Y" that depends on X but is not the same thing as X

#

that is: X[t] only depends on X[t-1], and Y[t] only depends on X[t]

cerulean violet
#

I'm already confused tbh

desert oar
#

are you not very familiar with math notation?

cerulean violet
#

Yea

#

I am familiar

desert oar
#

ok. in any case stick with the Markov model for now, don't worry about the "hidden" version until you understand the regular version

cerulean violet
#

Yeah okay thx mate cheers

desert oar
cerulean violet
#

I was looking that up only but then when they started explaining nuclear fission with markov chain i rlly got confused

desert oar
#

yeah, it can get hard to think about

#

it can be a challenging topic. remember to slow down and work through things step by step. a good textbook is worth more than 100 blog posts and wikipedia pages

cerulean violet
#

Should I get a book on it

#

I found a book on it by Joshua Chapman and ima start it now

delicate imp
#

can anyone help me learn ai/ml

#

im beginner and im so fascinated to learn it

#

anybody their to help me

#

??

serene scaffold
modest vigil
#

I am also working on a LSTM+PPO, been having difficulty getting DDP/torchrun (that creates a run script) parellelization to utilize the full CPU. 1mil model params, 50 epochs, taking 10hrs a trial at 12% CPU on a 16core lol. The more DDP processeses it just uses more RAM and the same CPU lol. Any suggestions?

calm cipher
#

also some details around the model could be relevant, I'm not familiar with PPO specifically but looking it up it seems like it's a RL thing, how are you computing the cost function and training the LSTM? I'm assuming it isn't possible to train with teacher forcing if the state needs to be modified and fed back into the LSTM at each timestep

modest vigil
#

train_sampler = DistributedSampler(train_dataset, num_replicas=world_size, rank=rank)
val_sampler = DistributedSampler(val_dataset, num_replicas=world_size, rank=rank)

train_loader = DataLoader(
    train_dataset, batch_size=batch_size, sampler=train_sampler,
    shuffle=(train_sampler is None), num_workers=num_workerz, #tried 4 too same result
    pin_memory=True if device.type == 'cuda' else False,
    persistent_workers=True if num_workerz > 0 else False
)
            
val_loader = DataLoader(
    val_dataset, batch_size=batch_size, sampler=val_sampler,
    shuffle=False, num_workers=num_workerz, #tried 4 too
    pin_memory=True if device.type == 'cuda' else False,
    persistent_workers=True if num_workerz > 0 else False
)```
#

actually this one isnt a ppo just an lstm

calm cipher
#

what about the model code?

modest vigil
#

the model.train or the model class or the LSTM data class?

calm cipher
#

how about just the constructor forward function and we can go more into it if necessary?

modest vigil
arctic wedgeBOT
calm cipher
#

I guess I'm just trying to figure out how the LSTM is being used and if it should be extremely optimized, or if you're doing a step-by-step thing

#

yeah this looks pretty straightforward

modest vigil
arctic wedgeBOT
calm cipher
#

i'm assuming this was written by a LLM too

#

sry dood I don't really want to try to parse through this :\

#

good luck

modest vigil
#

all good lol

boreal yoke
#

hey where's the resource page?

mellow vector
#

!res

arctic wedgeBOT
#
Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

mellow vector
#

this?

boreal yoke
#

yes it used to be there in the # list?

mellow vector
keen wind
#

what do you guys think are the best coursera courses for learning AIML

#

or maybe is there one that isn't from there, or a playlist, anything?

oak elbow
#

mmm. someone does maybe know how to write an algo which will see relations between ordered number sets?

#

i mean, dataset like that:

8 77 90 34
6 55 80 17
44 99 100 51
22 111 70 34
...
#

as far i saw (by googling) it should be unsupervised model to assign tags to have relations but unsure here…

silk acorn
#

!rule 9 We don't allow offering payment

arctic wedgeBOT
#

9. Do not offer or ask for paid work of any kind.

modest vigil
#

ah okay

waxen kindle
#

Like on google collab

jaunty helm
#

are you looking to cluster? this is probably what the googling results mean by unsupervised model, you'll get groups where data instances in same/different groups will be similar/different according to a metric
and using your 4 column dataset as example, if you're looking for something like trying to find if there's relation between say column 0 and 2, then have you tried just statistical measures like pearson correlation, info gain, etc?

oak elbow
#

i'm looking for something that will find relations between numbers in dataset

jaunty helm
oak elbow
#

i'm trying to find correlations between sets of numbers (less or more random, like, for a game :) )

#

so it can for example see that in that example first numbers are dividable by 2 and last ones by 17…

jaunty helm
oak elbow
#

and like calculate some expressions like idk. 2n*7+20 (where n as in number in array of sets) for each piece of set…

jaunty helm
oak elbow
#

i know…

#

like just wanted to make something like "akinator" but for maths where player inputs n numbers related and it says what should be next, then to keep results in file and make answers also personalized for player…

#

something like recently was added to excel i mean, doesn't need to be exact, as i say - it's just for a game…

waxen kindle
#

you can make test manually mulitple rules and say you don't know if none of them work

#

I mean, make a few regressions, test some divisibility, check some easy known series like fibonacci

#

that kind of things

oak elbow
#

i'd rather have it say something than nothing even if is far from that number tbh…

jaunty helm
jaunty helm
waxen kindle
jaunty helm
# oak elbow like just wanted to make something like "akinator" but for maths where player in...

not to say this can't be interesting though, when it comes to optimizing rule checking
for example, you can check gcd(a, b, c, d) != 1 for if the numbers are divisible by something (other than the trivial case of divisible by 1)
for a harder example, the berlekamp-massey algorithm will automatically find the shortest linear recurrence of a sequence of numbers (like the fibonacci sequence of F_n = F_n-1 + F_n-2)

oak elbow
#

thx

fresh harbor
#

i am building expense tracking using llms. i discovered langchain and langgraph. which should i use? also should i use 2 llms, one for tool call and one for actual user interaction?

final cobalt
#

I'm planning a rather ambitious AI, and I wanted to get you guys' opinion

#

I want to build a diffusion model for magic the gathering decks. Diffusion is a great choice because it excels at learning the implicit structure of data while also exploring with maximum creativity. A deck of magic the gathering has an internal structure, but not one that can be easily explained in words

#

The problem with diffusion though is that can't output real cards. It'll output continuous values. Specifically, I'll pass in a 100 x D matrix where 100 is the number of cards and D is the number of values in each card's vector

#

I'll add noise, then denoise. During inference I'll have the use select some seed cards and diffuse the rest from noise

#

What I'm thinking is that to convert the diffuser's output to actual cards, I could use a transformer and iterative refinement

serene scaffold
#

@final cobalt this makes sense to me.

serene scaffold
#

I just figured that the dimensions of the desired image is an inference parameter.

final cobalt
#

Note: not stable diffusion, just diffusion in general

#

You can add noise to and then remove it from a tensor of any shape

exotic star
#

I started the khan academy course for linear algebra and after this one i plan to also go trough both the one for statistics and calc but there's a lot of content and i realised that its gonna take a lot of time to learn this part. So i wonder if i should just focus on the math and nothing else for a while or also do something else as well? I learned the basics of pandas numpy and a bit matplot so far tho nothing too much

#

did a pandas project on whats the biggest winning factor in LOL and some other little programs as well

modest vigil
#

What do you guys use for parallellization, just DDP?

final cobalt
#

It was the only class I've ever had to take twice - though in my defence, my first teacher spoke in broken English and was very difficult to understand

#

XD It was so stressful that one day I just stood up and walked out

exotic star
final cobalt
exotic star
#

many sources recommend that u first learn the math but that will take a while so i came here to ask

#

i watched the first 2 videos and they were great but i thought i can go trough khan academy and use 3blue as a second resource to clarify and solidify the concepts i'll learn

#

the visuals are really helpful

final cobalt
#

I guess I'm just saying that if you wanted to you could start using ML libraries without learning the math first. Advance on two fronts

exotic star
final cobalt
#

"The Math" is three terms of calculus and at least one of linear algebra

exotic star
#

but i'll be starting school next week and there's so much content to cover so ig i'll try to do 2h a day

final cobalt
#

So, three terms worth of work

exotic star
#

i got linear algebra starting this year as a subject i think so that's great

#

thats true i'll most likely go to uni so it wont be a waste of time for sure

final cobalt
#

I have a special loathing for linear algebra

#

Though this might have more to do with that teacher I had than anything

exotic star
#

thats very true

final cobalt
#

He was very, very Ukrainian

#

Very impatient, and spoke in very broken English

#

There was also this girl in the class who was very autistic. Now, don't think I've got anything against that, but one of her quirks was that when she was frustrated she just talked endlessly

exotic star
final cobalt
#

So on one hand you could barely understand the teacher, on the other hand you had this girl talking non stop for the entire lecture

exotic star
#

ego is a huge factor

#

some teachers dont understand they are on the same side with the students

#

at least they should be

final cobalt
#

Oh!

#

Sadly he doesn't have a linear algebra course, but

#

Professor leonard (aka professor sexy, as I call him) is your god with respect to calculus

#

I didn't even bother going to my calculus classes. I just learned from this guy and showed up for the exams

exotic star
final cobalt
#

Glory unto professor leonard

exotic star
#

some schools are trying to implement more collaboration but id think its working too well

#

it is, i hope more schools start doing that

#

thanks a lot for the answers

pine prism
#

does anyone have any free courses that i could sign up for to learn more about machine learning beyond google searches

final cobalt
#

You can ask it to mak you one. Better yet, though, rather than trying to "learn ai" I suggest you try to build one

#

Autoencoders for images are a good start

#

You'll need a powerful GPU though

calm cipher
# final cobalt I strongly recommend ChatGPT

ehhh don't do this, if you don't already know about machine learning you aren't going to be equipped to protect yourself against things it tells you to do that are incorrect or unusual

#

if you're into books I strongly recommend "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow", and there are more resources pinned in the channel

final cobalt
#

You're not exactly wrong, but I would argue that learning from LLMs is a skill in and of itself

#

One that we'll all have to get good at

calm cipher
#

yeah, and a major part of working effectively with LLMs is understanding what you're asking about to the point that when it tells you incorrect things, you can recognize it and work around it

zenith topaz
#

hey, i wanna use langchain/langgraph for my project. But since its paid service, can u guys give me a review about langchain and langgraph. Is it a good and recommended framework?

swift tree
#

definitely agree w you that you shouldn’t be using LLMs to learn at first, only to supplement

tacit basin
jaunty helm
long plover
calm cipher
#

also any study of machine learning is going to include the classical non-neural models like linear regression, SVMs, decision trees, and so on, and most implementations of them don't even use a GPU

#

and even then some smaller neural networks don't benefit from a GPU, you probably won't need it until you start studying very deep neural networks or any of the more complex types like RNNs and CNNs

final cobalt
final cobalt
#

Is summarizes the basic concepts you'll be working with - convolution, linear layers, the symbolic nature NNs

#

And it opens the door for lots of fun stuff, like using compressed latents are conditioning for other models, diffusion, etc

final cobalt
#

It basically means making a graph which works as a map from current state A to desired state B (and scoring each edge on the way from A to B), and then using A* pathfinding to find the shortest path

#

Bonus: nodes along the path can be made to degrade into primitives when prerequisites aren't met

calm cipher
#

not to say it couldn't make use of machine learning, but just that in a typical sense of looking at what machine learning is trying to accomplish, a support vector machine or decision tree is more similar to a neural network

final cobalt
#

Machine Learning, no

#

But Artificial Intelligence, sure

dreamy elbow
#

Is there someone who can help me? I have an AI that I built in Google AI Studio that I'm trying to connect to streamer.bot - I have no idea if I have the coding or syntax right, I'm completely new to this.

grim girder
#

Chat what kind of model can i use in my dc bot?

#

transformers couldn't be downloaded

agile cobalt
#

what kind of model for which task?
if you just want to have machine learning for the sake of having it, you can use something simple from scikit-learn

if you want large language models (aka chatgpt-ish chatbot), use an API such as Google Gemini's instead of trying to host it locally

opal linden
#

Hey folks, any resource recommendation for anomaly detection in time series data? (Prometheus)

pine prism
livid rune
final cobalt
#

XD They really are unavoidable these days

#

A sensible attitude, but it seems LLMs have won out

#

It seems you can even rely on them as complicated if statements, filters for data, in living code

#

The only question these days is having a reasonable GPU

#

Also, why tf am I awake at 7AM!?

#

I'm building an MtG platform

#

And it seems ChatGPT/DeepSeek is capable of acting as an AI opponent

#

Magic the Gathering

waxen kindle
#

it doesn't look like you need a LLM for that

#

make actions in a card game

#

s** sorry that was not for you then

#

my bad

final cobalt
#

I'd like to reiterate my amazement and displeasure

#

Why tf am I awake at 7AM!?!?!?!?!?

#

As a rule that's good reasoning

#

But I drank 7 very strong beers last night

#

I should be conked

waxen kindle
#

it starts to be a little off topic....

#

Sure, here are simple solutions to solve your alcohol induced circadian rhythm disruption:

  • Stop drinking
  • Stop sleeping

Would you want an example of how to use these solutions or maybe some solutions to fix your drug addiction-induced cholesterol?

jade harness
#

Hi people, I'm new to AI programming, having learned Python these last few years mostly to gain access to ML libraries such as Torch and Tensorflow.

I want to explore writing real time AI agents to play games, similar to this now 10 year old video about using neural networks to play Mario. https://www.youtube.com/watch?v=qv6UVOQ0F44

However, I wanted to ask if anyone is experienced in this area and would recommend the best way of going about this in the modern era (not 10 years ago). What libraries are favored, are there any good resources for this topic?

I want to explore building and training AI agents to play relatively simple 2d games at a high level of skill, with my highest goal being that they be competetive against high skill human players in PVP.

When I first began researching this, OpenAI playground was a common recommendation, but that was quite a while ago.

So far I have found myself favoring working with Tensorflow, as it's being integrated with Unity, indicating an emphasis on game development. I am not a Unity developer myself though.

Using the high level API Keras, I have gotten the impression that it is not much harder to prototype in than with Torch.

Architecturally all I really know is that I will be serializing my complete game state and running it separately from the rendering, so that a neural network can have parameters to act upon to achieve the goal, and so the game simulation can run quickly to facilitate many repetitions and training.

Anyways any recommendations greatly appreciated. Thanks.

MarI/O is a program made of neural networks and genetic algorithms that kicks butt at Super Mario World.
Source Code: https://gist.github.com/SethBling/598639f8d5e8afb5453a0b9519be51ff
"NEAT" Paper: http://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf
Some relevant Wikipedia links:
https://en.wikipedia.org/wiki/Neuroevolution
https://en.wik...

▶ Play video
calm cipher
#

my usual recommendation for this is starting with the simplest formulation of it, which is multi-armed bandits, and that will get you started on really important concepts like rewards, regret, and the exploration/exploitation tradeoff

#

you can play simple games like tic-tac toe with Q-learning, which can be done with or without neural methods, and will introduce you to more complex action selection

#

I can recommend a textbook but I don't know if you're intending to do a very rigorous study of it or if you're wanting to be more hands on

#

note that this is just reinforcement learning which I have some experience with, but the mario thing you posted was made with genetic algorithms, which isn't quite the same thing

calm cipher
torpid linden
#

hey can i get some guidence on making a chatbot for a project

iron basalt