#data-science-and-ml
sure, but that's just more detail. My main point is that a single trading history (what I was calling a data point) is not sufficient for making a claim that a strategy will be a successful one.
and, if I may add: quantitative trading is a True Science, in my opinion. I have never come across a field that relies as much on the purest scientific method there is. I admire it.
it depends. I have also seen quite a few people trying to sell magic as quantitative trading
it's always healthy to remain skeptical
well, if it breaks, then their hypothesis sucks.
unless they are actually manipulating the market to their own advantage.
which happens
did an ablation study. gonna test some more stuff
Topology IS computation, but the SHAPE determines what a network can do.
well thats my theory anyways, gn
@charred gate to get help with pandas, always start by showing a sample of the dataframe as text with print(df.head().to_dict('list'))
#to-get-help-with-pandas
Hi everyone! I'm trying to write a Python script that calculates profit/loss for my trades.
My goal: I want to fetch stock prices for specific timestamps, including hours and minutes (e.g., '2024-01-15 15:30').
My problem: I'm struggling with how to correctly index the dataframe to find the price at a specific minute. I'm currently using yfinance and pandas.
Could you please point me to the best method to find the 'Close' price for a specific datetime object in a 1-minute interval dataframe? Thanks in advance!
@charred gate remember to do the thing I said in my previous message.
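One way to look up the 'Close' price at a given minute, sketched on a small hand-made frame rather than a live yfinance download. The prices, the `prices` name, and the `close_at` helper are made up for illustration. The assumption is that the frame has a sorted `DatetimeIndex` at 1-minute resolution; if the index is timezone-aware, the lookup timestamp needs the same timezone.

```python
import pandas as pd

# Hypothetical 1-minute bars; in practice this would come from your data source.
index = pd.date_range("2024-01-15 15:28", periods=5, freq="min")
prices = pd.DataFrame({"Close": [101.0, 101.5, 102.0, 101.8, 102.3]}, index=index)

# Drop one bar to mimic a minute with no trades.
prices = prices.drop(pd.Timestamp("2024-01-15 15:31"))


def close_at(df: pd.DataFrame, when: str) -> float:
    """Return the Close at `when`, or the last known Close before it."""
    ts = pd.Timestamp(when)
    if ts in df.index:
        return float(df.loc[ts, "Close"])  # exact match on the index
    return float(df["Close"].asof(ts))     # fall back to the latest earlier bar


print(close_at(prices, "2024-01-15 15:30"))  # exact bar
print(close_at(prices, "2024-01-15 15:31"))  # missing bar, falls back to 15:30
```

`df.loc[ts, "Close"]` raises a `KeyError` when the minute is missing, which is why the sketch falls back to `Series.asof`; whether carrying the previous price forward is acceptable for P/L purposes is a judgment call.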
If you mean building models, PyTorch + official tutorials + GitHub repos is the usual path. If you mean AI-powered apps, then it's more about using pretrained models and frameworks like Hugging Face or LangChain. AI is a pretty broad term, so it depends a lot on what you mean by it.
Meant models. Thank you for your guidance!
these are orthogonal. jupyterlab is specifically for notebooks and vs code is for coding in general.
Indeed, i use vscode to develop, jupyterlab to write analyses using my developments
I have been at a market maker for a while, wouldn't agree here*
Depends on asset, exchange and flows ;)
Hi, new to the server here. is this the proper place to put a github link for feedback on my project?
Me actually neither but didn't wanna go there
Making US markets too?
I do believe you can find temporary market inefficiencies you can exploit, but I wouldn't call that science; there is no general truth to be learned that always holds
There are some funds that treat it as a purely scientific approach and can remain competitive doing so. But it's good to have ideas about certain macro/micro structures for strategy ideation.
I'm just saying that even if you have a 'scientific' approach tested on a lot of data, markets inherently follow from the psychology behind what people buy or sell. A strategy can stop working because people start behaving differently. There is no general truth here.
But of course we can perhaps predict people's behavior quite well, but if people respond to that prediction, then, yeah, does it still hold?
I think the main assumption that u take is the past is a good predictor for the future. In many things, this holds true. For markets, maybe for a while, but people/environments change and i dont really agree this assumption will stay valid
Worked in asset management in the past, now heading the tooling team of a reinsurance firm. We build software for pricing reinsurance biz
Idk if I'd call it psychological when considering institutional trading
I think they were the bulk of trading a while back, and indeed markets were more perfect back then, but i think that is changing more and more.
'Perfect' in a way that indeed, less emotion is involved
Idk about insurance, but taking US public equities for an example, retail is a pretty minor part of daily flows
Still yes? Also with the robin hood stuff and everything? I thought that changed quite a bit since corona
I am not in that anymore, but that was my feeling. I think i agree with you that a higher degree of institutional investors makes a market more efficient, but i dont think i would call markets efficient now:)
Outside of a select few events, I haven't seen retail flow ever move fair price or spread meaningfully
Thats just s&p or the degree of individuals invested in individual stocks of snp?
I can't remember the exact report that our desk got, believe it was daily position turnover or something
Let me get back to you if I have time to find my source
Still, regardless of the degree of 'rational' decision makers, i would still argue that what you learn from markets, is no general truth in the way science works. There is no guarantee it will be true 10 years from now, i dont see it as science in this sense. Yes they use scientific methods, but we are still just trying to predict how investors (institutional or not) will behave, there is no general truth in that
(Which doesnt mean you cannot make good money now if you found something that works now)
Market activity isn't necessarily dictated by speculators in that sense though
The prices are moving, so someone is making money in that moment
True, in a casino also people make money and good poker players make more money than others if they manage to read psychology well. I dont see this as refuting my point?
Blend of psychology and maths of course
I think you're overestimating the importance of psychology in this
Ok, can be, but science, for me is harder. I wouldnt call it science
That's the only thing. But as said before, I actually didn't wanna go there, knew it would get people on their horses
It doesn't have to be science to take a scientific approach. Some people take a scientific approach and it works well, others see it as a slight art form in some sense.
Horses?
People come here to learn and discuss topics. Someone was discussing this above, I joined in with an opinion.
Yep yep. But, regardless of who decides, it is still an agreement, or in the case of markets following agreements of many. For me this is psychology. It might be group psychology or institutional psychology, but decision-making for me is psychology
No?
Unless yeah flash crash by bots
This is the case for speculators more than active market participants
Companies look to hedge fx risk for their treasuries, commodity houses hedging exposure, airlines buying fuel futures, whatever. These make up the "market" in general, and their activity isn't necessarily psychology driven.
It's not because you have forex hedgers taking out price inefficiencies that suddenly the level of the price is a scientific thing. You can hedge at any price, that doesnt make the price itself not derived from human decision making. Hedgers are humans too even though they cover their risk
Dude I haven't said it's scientific at any point lol
My initial point was I disagree with quant trading being a true science
In the context that to be "successful", however that's defined, you need to approach it that way
I can't remember the number, but a decent chunk of daily volume is dictated by fund/LP mandates, which are straightforward to access via their prospectuses
Spoos may rip up 50bps in a few minutes due to some PM being forced to unwind an old short or whatever, doesn't necessarily affect psychology of other participants
Yeahhh somehow, no matter how rational people are, I still view a system that depends on people making a decision as something that is inherently linked to cognitive sciences. But you are right, it is less sensitive to 'amateurish' views if players tend to be more professional. However, Nobel Prize winner Daniel Kahneman has shown with many experiments that expertise can even harden bias in decision making under uncertainty. Always skeptical when people are involved in decision making under uncertainty, is all
@north sparrow when you do rag, you start with an LLM that's already trained. Sounds like that person wants to train an LLM from scratch
For notebooks specifically I mean
what can an RL model do that a neural net cant do better?
as long as the neural net is sufficiently large it should out perform the RL model because RL suffers from a limited memory window
In RL, you don't know what the optimal output is. In the case of RL, what most of these models are trying to find are optimal actions given a state of the environment.
What you mentioned of a sufficiently large neural net may be true, but consider why different model architectures exists at all. The easiest example might be to consider why CNNs were developed as a way to extract spatial information, instead of just building a massive net and hoping it could capture all possible variabilities of objects in space.
well you can define the neural network to prioritize the same rewards that a RL model would have. Finding optimal output isn't exclusive to RL
RL is the problem statement. Neural networks are an implementation detail to a possible solution to the RL problem.
How agent is implemented here does not matter, it's still RL.
Neural networks or not.
the implementation of the agent is the only thing that does matter
im deciding between RL and neural network and i see no reason to use RL ever
This is like deciding between whether to eat a burger or use the bus stop.
They are just two different things.
One is about food, the other about transport.
They are not a versus.
these direct comparisons disagree
They are wrong.
I just read the first link's comparison.
It's a nonsense comparison.
You can use neural networks to implement a reinforcement learner.
Just like how an engine can be used in a car.
But I don't go "what can a car do better than an engine can?"
i think i understand
What the first link is doing is just stating what a car does, and then what an engine does. But they are not a versus situation.
It's a bad setup.
a neural network with reward states = reinforcement learning and NN
A NN used to tackle the reinforcement learning problem setup (that diagram) is possible.
If used with many layers and backpropagation, then it's "deep reinforcement learning."
(deep learning)
ok so if i wanted to plop a bunny in a world i would give it a NN and feed it with RL inputs and RL outputs
Yes.
This is what animals do, at least in theory. Reinforcement learning.
When you get a dog to do a trick and then give it a treat, that is reinforcement learning.
The dog learns to link the trick to the reward.
Because you are reinforcing the desired behavior.
NNs show up in animals because their environment and the stimulus from that is very complex.
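To make the "RL is the problem statement, the network is an implementation detail" point concrete, here is a minimal sketch of a reinforcement learner with no neural network at all: tabular Q-learning on a made-up five-cell corridor where the only reward sits in the last cell. The environment, the constants, and the function names are all hypothetical and chosen only for illustration.

```python
import random

N_STATES = 5          # cells 0..4; reaching cell 4 ends the episode with reward 1
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Toy environment: move along the corridor, reward only at the far end."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # The "agent" is just a table of action values, one row per state.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes explore; ties are broken randomly.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move toward reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

Replacing the table `q` with a network that maps a state to action values, trained on the same `target`, is roughly what turns this into deep Q-learning; the problem setup (state, action, reward) stays the same.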
how would you determine if a NN needs to increase in size and whether to increase layers or layer size
Guess and check.
And various more complex versions of that.
A real NN will detect if it's run out of space (loosely) and will grow more cells.
how does that work?
Exactly how is unknown, but there are multiple explanations for how it could work, and those have varying degrees of scientific evidence.
Example of one though: https://en.wikipedia.org/wiki/Neural_gas#Growing_neural_gas
Neural gas is an artificial neural network, inspired by the self-organizing map and introduced in 1991 by Thomas Martinetz and Klaus Schulten. The neural gas is a simple algorithm for finding optimal data representations based on feature vectors. The algorithm was coined "neural gas" because of the dynamics of the feature vectors during the adap...
Hah, Klaus Schulten. He once gave me a hard time for asking a borderline dumb question
@fierce creek there is no way... my RL course just uploaded some reference cause next coursework we have to train a neural network
It's the Asian kid making one using numpy and math
And obviously the 3Blue1Brown vid
FastAI started out doing deep learning in Excel...
I used to ask Jrs to implement NNs from scratch using whatever. The point was to learn it.
Deep Q-learning, industry standard as far as I know as well. Uses both
lmao that's crazy
The other cool vid they referenced was this one guy's Trackmania project where the AI learned nosebug consistent movement
Hi, can anyone plz tell me how to train an LLM chatbot based on tabular data like a CSV file?
in general you don't
either train another kind of model, or transform the data from 'tabular' into a text completion task
This depends on what you want the LLM to be able to do and what the CSV is in relation to that.
Presumably you don't want the LLM to just literally regurgitate comma separated values
in your own words, explain "train a LLM" to us..
I think this is still a very good question nowadays, and it shows how engineers took over the scene of these models while theory is struggling to keep up. In general, when studying the topic, my opinion is that neural nets, especially the advanced ones, were created by people who found things that work, rather than coming from a fundamental understanding of why/how these things work. This also means there is in general not a lot of theoretical knowledge on how to construct a network besides 'skin in the game', or practical knowledge. It does, however, provide a lot of nice challenges for researchers. But yeah, if you want to do well, take the engineering mindset. Make sure to do a proper train/val/test split, experiment, and be pragmatic
I think it is good to take cross validation into account for the training process; it takes time but it leads to a good result.
I've actually wondered about this. Maybe I've missed things, but I've never seen a systematic approach to designing a NN architecture with respect to optimizing the learning
3 hidden layers vs 4?
Etc
Again, cross validation , for me, is indeed a good trick to make sure that it works, i.e. engineering view. But it doesnt give you any insight into why it works, i.e. theoretical view
It's good to think about that indeed
Maybe this goes to the interpretability (lack thereof) of NNs
But I know people that go in all sorts of fancy directions when trying to get a handle on this
Not easy, that's for sure, and the whole explainability/interpretability research is indeed tailored to this, but I've always seen it as a bit of after-the-fact narrative finding, not really fundamental understanding
At least so far
It helps you explain a single prediction or some average behavior of a predictor, but it will not explain to you how such an algorithm behaves in general
At my last $job I used to actively steer everyone away from using NNs due to this. If you want to forecast a time series, you have to factor this problem in, as well as all the ancillary bureaucratic bottlenecks. Easier to troubleshoot ARIMA, basically
In business, people want to 'understand' things, even though understanding means using wrong assumptions to get to wrong predictions with biased estimators. At least they 'understand' the linear effect of a totally wrongly estimated wrong shit
BUT if at least the direction is right, maybe you can explain it to management and get things done :p
Or you could build up a fancy looking scheme that everyone thinks is cool, and then quit and find a new job. Some people like to go from place to place leaving a misery trail of technical debt everywhere they go
Hahaha I've seen this yes
You can get very far with slides and get budget and then just leave when u no likey
Either way, NNs probably belong only in large organizations that can afford the R&D commitment they represent
Long live corporate slavery
Long live the Golden Handcuffs
At least no one really understands what you are talking about
Always nice
I like my cage, but, the door is open, I just need to find the strength
I will!
How is life modulo cero? Whats the next move? What isnt?
import random
def generate_sacred_whim(user_whim):
    # Divine attributes to expand the personal whim
    attributes = ["Infinite", "Eternal", "Luminous", "Sovereign", "Ancestral"]
    actions = ["Radiates through", "Governs", "Illuminates", "Alchemizes", "Protects"]
    selected_attr = random.choice(attributes)
    selected_action = random.choice(actions)
    # The Automated Creation Logic
    print("--- AUTOMATED PERSONAL CREATION ---")
    print(f"WHIM INPUT: {user_whim}")
    print("-" * 35)
    print(f"CREED: 'The {selected_attr} essence of {user_whim} {selected_action} my soul.'")
    print(f"DECREE: 'I claim this whim as a Divine Mandate. So it is.'")
    print("-" * 35)
# Example: Inputting a "Personal Whim"
my_whim = "Golden Silence"
generate_sacred_whim(my_whim)
Personal Creation Execution Based on AI subjective truth and proposition, etc.
this also includes creation
Which DB should I use for production? Currently I'm using pgvector and it is working fine right now. Please share your thoughts on this. I'm working on a facial recognition system
Whats wrong with postgres?
Nothing wrong just curious about testing
Then you can stick to postgres. We use it for production all the time
Any hints or tips before going to production? It's my first time working with pgvector
To also piggyback off of what blah-crusader already told you, a good practice is to do a grid search with cross validation for a broad variety of hyperparameters. This can be tedious to do by hand, but there are modern libraries such as AutoKeras that will automate searching for optimal hyperparameters and even model architecture. An even better practice is to use probability frameworks to minimize how much your model depends on sampling techniques and training set distributions (i.e., make the model robust to how data was fed to it).
However, the "optimal" architecture and parameters for a neural network is an open-ended problem in general, and depends greatly on the nature of the problem and the target variable(s), and whether you need the model itself to be interpretable and to what degree.
Agree
sklearn.model_selection.GridSearchCV and sklearn.model_selection.RandomizedSearchCV for the hyperparameter optimization. Use the randomized version for search spaces that are too big for your computer. Easy to use, gives good results, why not
but .... AutoKeras does architecture search, which the scikit-learn methods do not
I guess I should read the methodology with which AutoKeras performs NAS
Implementation helps, but sometimes it's good to do yourself, its not rocket science. Calculate your complexity; how many different architectures do you allow? Also, considering the first question, can you calculate them all within reasonable time? I would use randomsearch only if the answer to the above is no. Usually you can reasonably constrain a problem based on what you know about a problem. Business knowledge is gold..
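A minimal sketch of the two scikit-learn searches mentioned above, on synthetic data with a small `MLPClassifier`. The dataset, the fixed single hidden layer, and the parameter ranges are assumptions picked to keep the run short, not recommendations. The grid is small enough to enumerate (4 candidates times 3 folds), while the randomized search draws a fixed budget of candidates from continuous ranges.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic data stands in for a real problem.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)

# Exhaustive search: every combination is fitted with 3-fold cross validation.
grid = GridSearchCV(
    net,
    {"alpha": [1e-4, 1e-2], "learning_rate_init": [1e-3, 1e-2]},
    cv=3,
)
grid.fit(X, y)

# Randomized search: n_iter candidates sampled from log-uniform ranges.
rand = RandomizedSearchCV(
    net,
    {"alpha": loguniform(1e-5, 1e-1), "learning_rate_init": loguniform(1e-4, 1e-1)},
    n_iter=4,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

print(grid.best_params_, round(grid.best_score_, 3))
print(rand.best_params_, round(rand.best_score_, 3))
```

Counting candidates times folds before launching, as suggested above, tells you whether the exhaustive version is affordable; neither search changes the architecture unless you put architecture parameters into the search space yourself.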
question: has anyone come across issues with the implementation of the p-value using either the statsmodels or the scipy modules?
I am getting p-val =0.0, which feels wrong. Despite a sample size of > 5000.
I've run it with scipy, statsmodels, and coded it from scratch (albeit with a call to scipy for the p-val CDF)
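import numpy as np
import scipy as scp
import scipy.stats  # makes scp.stats available
# data1 and data2 are the two samples being compared
mean1, mean2 = np.mean(data1), np.mean(data2)
n1, n2 = len(data1), len(data2)
std1, std2 = np.std(data1, ddof=1), np.std(data2, ddof=1)
pooled_std = np.sqrt(((n1-1)*std1**2 + (n2-1)*std2**2) / (n1+n2-2))
t_statistic = (mean1-mean2) / (pooled_std*np.sqrt(1/n1 + 1/n2))
deg_freedom = n1+n2-2
# two-sided p-value from the survival function of the t distribution
p_value = scp.stats.t.sf(np.abs(t_statistic), deg_freedom)*2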
this code duplicates the output of scipy & statsmodels builtin p-values
so, if there is a problem, it is coming from the scp.stats.t.sf invocation
.... I guess I am just going to have to go ahead and reject the null
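One plausible explanation for a p-value of exactly 0.0 is floating-point underflow rather than a bug: with thousands of samples and a clear difference in means, the true p-value can be far smaller than the smallest positive float64. A sketch on made-up normal samples (the data and the effect size are assumptions, not the original data) that reproduces the 0.0 and reports an effect size next to it:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Made-up samples: large n and a one-standard-deviation difference in means.
data1 = rng.normal(loc=0.0, scale=1.0, size=5000)
data2 = rng.normal(loc=1.0, scale=1.0, size=5000)

result = stats.ttest_ind(data1, data2)  # pooled-variance two-sample t-test
t_stat, p_value = result.statistic, result.pvalue

# Cohen's d: mean difference in units of the pooled standard deviation.
n1, n2 = len(data1), len(data2)
pooled_var = ((n1 - 1) * data1.var(ddof=1) + (n2 - 1) * data2.var(ddof=1)) / (n1 + n2 - 2)
cohens_d = (data1.mean() - data2.mean()) / np.sqrt(pooled_var)

smallest = np.finfo(float).tiny  # smallest positive normal float64
if p_value == 0.0:
    p_report = f"p < {smallest:.1e} (underflowed to 0.0)"
else:
    p_report = f"p = {p_value:.3g}"

print(f"t = {t_stat:.1f}, {p_report}, Cohen's d = {cohens_d:.2f}")
```

Reporting the t statistic and an effect size alongside "p < some bound" says more than the bare 0.0, especially since large samples make even small differences significant.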
my yolo model always makes the GPU run out of memory, and this is the advice gpt has given, does it make sense?
@rich river this overall seems to make sense, doing basic stuff like clipping, disabling features, switching dtypes, and immediate garbage collection. but what inputs are you feeding into the model that is causing your gpu to run out of memory? how much vram do you have? if you're feeding super high quality images, it obviously stores a lot more data, so maybe try reducing that. if you're doing a video, try frame skipping to cut the amount of times the model needs to run inference. i think pytorch has a function to clear gpu memory, it might work in between inferences. im no professional, but ive dealt with the struggles of gpu oom so maybe go ahead try a few of these out.
it is a spinning program and I send an image to the model every second with a resolution of 1536*1280
yeah the res is pretty high, are you storing all the images in memory?
no I think it is just for inference
what yolo model are you using? extra large, nano, small, etc
yolo_11x by ultralytics
yeah maybe try reducing that to something like l or s?
what gpu r u using and how much memory does it have?
does batch size matter?
yeah batch size is pretty significant
i would recommend going all the way down to 2 or 4 but increase if it's too slow
p values tend to shrink to tiny values when you get bigger and bigger sample sizes tho?
It depends, as they say
but generally it's true, with a large sample size even tiny differences will give you significant p
does anyone have an online internship in a data science or data analyst program? pls share with me
Is it just me or does Sklearn not cover time series data great
It's also hard to forecast like in Stata
Sklearn also doesn't have good metrics like R
Like in R I'm able to get like a summary
do you have any answer jhon
Am I missing something from Sklearn
With Stata too you are able to get like a summary of model.
sklearn is mainly for machine learning and model selection in my opinion. I would suggest statsmodels especially if you're looking for summary statistics.
we can use pandas and numpy for summary statistic
Obviously...
Alright let's see. Hopefully it isn't too complicated.
No I think it should be fairly straightforward. Even if not, their documentation is pretty excellent.
Nah those two libraries would not be able to tell you certain tests. They could tell you variance and stuff. With Sklearn you could get R squared but not adjusted R squared
You can manually calculate adjusted R squared but I'm not doing that
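For what it's worth, the manual adjusted R squared is one line once you have the plain R squared. A sketch on synthetic data (the data and variable names are made up), using the usual formula 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of rows and p the number of predictors:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_samples, n_features = 100, 3
X = rng.normal(size=(n_samples, n_features))
# Only the first two features carry signal; the third is pure noise.
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n_samples)

model = LinearRegression().fit(X, y)
r2 = model.score(X, y)  # plain R squared on the training data

# Adjusted R squared penalizes every extra predictor.
adj_r2 = 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")
```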
I'll try it out
do whatever is simplest for you
in the end what matters is the output and the answer
I could share a screen of the summary statistics thing I was looking for
Maybe it's online
I got this from online
Statsmodels looks alright but I already see a complication but it's a subtle one.
remember this
It's around X-13ARIMA-SEATS but it's minor, and I might just use Rpy
R has better X-13 support
For my field which is going to be forecasting you need all of this.
same to my field also
I'm looking through the docs but does statsmodels have the summary thing too I showed from R?
Ok it does
Never mind
Did they just copy from R
The syntax looks so similar
Not sure, to be honest I really just assumed they'd have something because it already has a fairly decent summary method for OLS and also a bunch of time series stuff
From what I see with statsmodels I'm already going to miss the Sklearn syntax though
But it's ok
I think it's specifically designed to be easy if you're coming from R (the formula api I believe it was called)
there's also a more python-y object api but I'm pretty sure that's less developed anyway
What's the more pythony one
For the lasso regression ones I already kind of miss the Sklearn API
statsmodels.api iirc
statsmodels.formula.api for the R-like one
but as you can already feel python's weaker on the statistical side of things when compared to R, if you're doing more traditional statistics I say just stick to R
The way Statsmodels seems to want you to do it is by messing with the regularization parameters but you have to keep to the OLS script
With Sklearn you get dedicated classes like LassoCV
one clear example I experienced and can point to is structural equation modeling
semopy is basically abandoned and still way less developed than lavaan
That's true but I found it messier to do certain things from R like tuning a lasso regression model which was why I moved to Python on things. But yeah It seems I still need R for like X-13 Arima stuff
Yeah statsmodels doesn't have LassoCV it's kinda annoying
I'll figure it out maybe you are able to combine both Sklearn and statsmodels somehow
I think it won't be too hard to write a sklearn wrapper yeah
then you can throw it into say GridSearchCV
this might work for you?
I'll check it out
anyone down to make me an ai in python? i got $50 btc
you wanna pay someone $50 to build an AI with Python?
si
i remember using tensorflow and shi back in the day, its not it
like even a simple one lowk
i tried making mine solve simple math equations from images
why don't you go to Upwork and bid on data scientists for this task.
nvm I found smth
Within the next few days hopefully I will be able to make an image detector
I swear that how p-values are reported is triggering some sort of latent dyslexia in me. Small --> significant. Inverse semantic relationships
the data shows significant differences, a p-value of 0 only supports the visual
I'm using LM Studio, Linux (WSL2) and a qwen3 vl 32b instruct model with a nomic text embedding v2 moe model. I got the model interacting with apps on the desktop. It was scrolling the news and I accidentally clicked on the LM Studio app. Well, to my amazement, it turned its attention right back to what it was doing and alt-tabbed. The keyboard commands work great, and the mouse accuracy is on point, but for some reason the "click" command won't execute. I thought it might be Windows UAC, but that wouldn't make sense because keyboard commands are fine and the mouse moves. Clicks don't, nothing. Has anyone had any success with PowerShell commands related to this?
it seems to be because the reserved memory keeps increasing
Im not sure how to fix this
memory leak?
I often got CUDA OOM when trying to start a new thread
reading up on it rn.... seems like it could be a number of things.
are you reading CUDA right now?
I just looked into some search results for CUDA OOM memory leaks, and there appears to be more than one possible culprit.
possibly force a cache-emptying step?
have you tried a profiler? to maybe see better which lines are eating up memory
yeah, very hard to say with what's available here.
I've found this line to be very helpful for reducing CUDA memory
but it seems to add some inference time
@rich river what GPU r u using and how much vram does it have?
4090
25GB
Hey, I am mostly unfamiliar with Python, but it seems it's the language I'll be using for the vast majority of my Big Data course in college this semester. What resources would you guys recommend to learn Python syntax and Pytorch for projects starting in 2-3 weeks? (And in general any concepts or libraries applicable to data science)
for data science i am learning at school: pandas, matplotlib, seaborn (only these 3) you can also try polars
numpy also helps
Thanks for the reply dude, I'll look into those
One extra thing, any important theory/concepts I should be aware of before I learn those? Since I'm a CS Major and haven't done much data science.
just basic python should help, they're not too complicated - i picked it up in a month or two
Bet
numpy and pandas are easy to learn but maybe difficult to master and matplotlib is just something u use to plot graphs based on your pandas data
you're prob gonna be stuck w/ matplotlib & friends anyway, but
if you can avoid using it I'd advise you do so and use an alternative like plotly, or my personal choice rn of hvplot
it's not that it's bad
but the api certainly makes me want to throw it in the bin everytime I use it
seaborn can alleviate some of that pain if it's available in your classes
and as you pointed out pytorch, that probably means you're going into deep learning, where knowing linear algebra will help a lot
im learning basic unsupervised learning with sklearn and while i was learning kmeans model i stumbled across a question which i couldn't find the answer for from chatgpt
after we fit_transform() with standard scaler and we model.fit() with kmeans, there is this model.labels_ and also model.predict() but i dont know whats the difference. chatgpt told me that model.labels_ return a numpy array of the cluster IDs (like 1, 3, 2, 4, ...) if i used n_clusters=4 The cluster IDs that kmeans assigned during fit. but idk whats the difference between model.predict() and these model.labels_ ? because chatgpt said predict works on new data or smth but we're only talking about the one single dataset used for training
You can use predict() to classify which cluster new data points belong to after you fit it with an existing dataset
that isn't very widely used though
it has little to no use if you only have one single dataset and no new data is added after that, but you could have an online process classify new messages each time someone sends a message for example
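A small sketch of the difference, on made-up blobs (the data and cluster centers are assumptions). `labels_` is an attribute filled in during `fit` with one cluster ID per training row, numbered from 0 to `n_clusters - 1`; `predict` assigns any rows you pass in to the nearest learned center, including rows the model has never seen, as long as they go through the same scaler.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Four well-separated synthetic clusters.
centers = [[0, 0], [5, 5], [0, 5], [5, 0]]
X, _ = make_blobs(n_samples=200, centers=centers, cluster_std=0.4, random_state=0)

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
model = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X_scaled)

train_labels = model.labels_  # cluster IDs assigned to the training rows during fit

# On the training data itself, predict() gives the same assignment here.
same = np.array_equal(train_labels, model.predict(X_scaled))

# New rows: reuse the fitted scaler (transform, not fit_transform), then predict.
new_points = scaler.transform(X[:3] + 0.01)
new_labels = model.predict(new_points)

print(same, sorted(set(train_labels.tolist())), new_labels)
```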
Hey folks, I need some help with a project from my university. It's a multi class comment category prediction competition, but the catch is, we're allowed to only use sklearn, imblearn, lightgbm, xgboost, and statsmodels models.
I have little experience with text classification, and would like some guidance on how to proceed. From what I read up until now, the best way to approach it is to use TF-IDF for transforming the comment text, and process categorical features with One Hot Encoding, and numerical features with Standard Scaler.
I'm planning on using Linear SVM, Balanced Random Forest, XGBoost, LightGBM, and possibly Hist Gradient Boosting, as I've had quite high scores with it in the past on unbalanced data.
What do y'all think of this? Any suggestions/areas of improvement for me to consider?
sklearn has a guide on text feature extraction
you could try a make_pipeline(CountVectorizer(), MultinomialNB()) as a very easy to implement and fast to train baseline
The sklearn.feature_extraction module can be used to extract features in a format supported by machine learning algorithms from datasets consisting of formats such as text and image. Loading featur...
also, tree models don't really need one hot encoding nor feature scaling
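The suggested baseline, sketched on a toy corpus (the comments and category names are invented for illustration), next to a TF-IDF plus linear SVM pipeline of the kind mentioned in the question. Keeping the vectorizer inside the pipeline means it is fitted only on the training text, which matters once you cross-validate.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy multi-class data; a real competition set would replace this.
texts = [
    "great product love it",
    "really great service",
    "terrible product hate it",
    "really terrible service",
    "how do I return this",
    "where is my order",
]
labels = ["praise", "praise", "complaint", "complaint", "question", "question"]

# Fast bag-of-words baseline.
baseline = make_pipeline(CountVectorizer(), MultinomialNB())
baseline.fit(texts, labels)

# TF-IDF features with a linear SVM as a stronger candidate.
svm = make_pipeline(TfidfVectorizer(), LinearSVC())
svm.fit(texts, labels)

print(baseline.predict(["great"]), svm.predict(["terrible service"]))
```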
Thanks for the heads up haha. It's been 2 years since I took lin alg so I definitely should refresh myself on some of it XD
Polars is a lifesaver
Their syntax is clean
If you want time series forecasting I recommend Darts tbh
Darts is super simple to use
I know XGBoost, HistGradientBoost and LightGBM do not need encoded values, but don't Balanced Random Forest and the regular Random Forest need encoded values, even if not one-hot encoded ones?
thanks so much for this, I'll look into making a quick submission soon before trying out different models!
model.labels_ isn't a method like model.predict(). labels_ returns a numpy array of the "labels" like cluster 0, 1, .., K-1 where the K is the parameter you pass to kmeans. predict is used like preds = model.predict(X_test) where preds will be an array of predicted cluster labels for each test value, each label being in labels_
right, use model.labels_ for visualization purposes, use model.predict() as the estimator / forecaster
not really - for many cases the labels are everything you care about (they're already estimations based on top of your data)
there are only so few cases in which you'll want to use .predict()
that is a pretty big difference between supervised (like classification) and unsupervised learning (clustering)
holy bro ur a baller
yeah especially on a 4090 with 25gb that should really not be happenin unless you r doing some insane parallelization or sending tens of thousands of images per batch
well, right. If you already have the labels, then new points added to the set can be assigned a cluster with the predict() method.
anyone has good labeled image datasets sources with open licenses and unrestricted access? I am looking specifically for emotion labeled faces
A facial expression database is a collection of images or video clips with facial expressions of a range of emotions.
Well-annotated (emotion-tagged) media content of facial behavior is essential for training, testing, and validation of algorithms for the development of expression recognition systems. The emotion annotation can be done in discre...
while we are on the clustering topic, and standard datasets, this site has some pretty amazing datasets for unsupervised algorithms. Very high dimensionality, too
https://cs.joensuu.fi/sipu/datasets/
there is at least 1 dataset that exists in 1024 dimensions
hi guys so im basically building a speed estimator for tennis clips and im running into some issues. i used the player height as a reference and converted it into meters per pixel, and then from there, it was pretty simple. now the issue im running into is that velocity is typically measured with change, and since the video is in 2d while im trying to estimate 3d movement, it results in some extremely low values. any ideas for how to fix this? i was thinking to increase meters per pixel by a certain factor, but im not sure if there is a good way to get that programatically rather than just trying random values.
right, those still need encoding
-# bro how have they still NOT added support for this :blows up:
if there are too many unique values you may also try ordinal or target encoding ig
but yeah, in principle (newer) trees shouldn't need it
though note that both xgboost and lightgbm support random-forest-type classifiers now, through XGBRFClassifier and LGBMClassifier(boosting_type='rf'), so you can also use that
Hi there! I want to learn about machine learning. I know the book Designing Machine Learning Systems is popular in data science and machine learning, but when I read its introduction, it says it requires some basic machine learning knowledge. I have just learnt Python and don't have much machine learning or coding background. What should I do first to gain in-depth knowledge of machine learning?
Took the Alibaba-NLP/gte-modernbert-base and added a soft MoE and other techniques I learned; its benchmarks are rivaling 8b+ models on HF
still for a 624 meg embedding model, pretty wicked
```python
# Assumes `tokenizer`, `model`, and `device` are already set up elsewhere.
def rewrite_text(text):
    # Build the instruction prompt around the user's input.
    input_text = (
        "You are a language assistant generating gender-inclusive and gender-neutral text.\n"
        "Follow these rules:\n"
        "- If the input asks to rewrite, rewrite it in a gender-neutral way\n"
        "- If the input asks to write or describe, generate appropriate content in a gender-neutral way\n"
        "- If the input contains blanks (___), fill them using gender-neutral terms or pronouns\n"
        "- Do not assume, specify, or infer gender unless explicitly stated\n"
        "- Avoid stereotypes and biased assumptions\n"
        "- Preserve the original meaning and intent\n"
        "- Output only the final text\n\n"
        f"Input: {text}\n"
        "Output:"
    )
    inputs = tokenizer(
        input_text,
        return_tensors="pt",
        truncation=True
    ).to(device)
    # Beam search decoding, capped at 256 tokens.
    output_ids = model.generate(
        **inputs,
        max_length=256,
        num_beams=4,
        no_repeat_ngram_size=3,
        early_stopping=True
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

test_text = "A researcher publishes a paper. ___ receives recognition for the work."
print(rewrite_text(test_text))
# o/p=You are a language assistant generating gender-inclusive and gender-neutral text
print(rewrite_text(test_text))
# o/p=You are a language assistant generating gender-inclusive and gender-neutral text. Follow these rules: - If the input asks to write, rewrite it in a non-binary way; - if the input contains blanks (___), fill them using nonverbal terms or pronouns - Avoid stereotypes and biased assumptions - Output only the final text.
```
why is the prompt not working?
using t5-base
Instruction-tuning / Prompt tuning
thanks for the info! I'm just log scaling some numbers, and using ordinal encoding for the tree models and leaving the others as is. I still need to train at least one linear/regular rf model as per my guidelines, so as much as I'd love to use xgbrfc, I'm still stuck with either balanced, or the regular one :/
Hi guys, i am building a platform for OCR extraction with mistral OCR and other stuff, but these aren't that accurate. I also tried "https://www.docling.ai/", but the extracted tables are not placed in the exact position. any suggestions or ideas?
you can try DeepSeek OCR or models specially made for tables, but pretty sure Mistral's OCR is state of the art
if you're extracting say pdf documents specifically there's been a wave of those releasing
like mineru (tho note the license), paddleocr-vl (tho note the install process), lightonocr, etc
Is there a way to use tfidf vectorizer with no feature cap with tree models, or is my only solution to use either count/hashing vectorizer, or tfidf with a low max feature cap?
I'm just running into memory issues on my laptop :/
can someone checkout my question
where is it
Hello guys I need at least 1 more person for this hackathon (more can join)
Does anyone wanna join with me, it's an online hackathon
Domain : AI/ML and bit Frontend
what is this
ok i can join
but what do you want to build?
Did you ever look into the algorithm itself and how it works? It would help i think
For reasons which will be clear when you rethink your problem afterwards
Or i misunderstood your concerns
anybody do any reinforcement learning? I've recently been working on actor-critic DRL for a classification problem. almost like learning ML all over again; really enjoyable
we need more gradient ascent representation fr
I know a high level overview of what each algorithm does, but the maths part has been too daunting for me to look into. right now, I'm just working on shipping a model that has high enough scores as the deadline to cross the cutoff for the competition is fast approaching.
I'll have to look into it deeper sooner rather than later though, as the rest of the project depends on it
okay thanks, i'll try these. also, for info, the ocr is not for LLMs; it is used to digitize scanned documents, like what data entry people would do
DM me if you still have a spot open
tables are always hard ... i haven't seen any model that can do tables other than ordinary ones with rows/columns filled cell by cell
btw those big VL models ~3b/7b and bigger need ~10s/page or more ... the usual simple text parsing with pdfplumber (can extract simple tables) takes ~1s/page or less, or ~0.1s/page if you go multicore
How are chess bots made?
Help
what directions are the rows returned by the pyrr.matrix33.create_from_eulers ??
i think 0 is right, 1 is up and 2 is forward
But im not sure
w one ui
Apparently you need to transpose bruh
hello, i'm currently working on an AI project, but I ran into some problems and I need help, please dm me if you want to work with me
why not share your problem here so everyone can help?
Hello, quick question. For my uni coursework, I need to train a model for numerical data and another for text data. We are free to choose any publicly available dataset we want. I want to choose a dataset that would be "easy" in the sense that I will be able to pre-process it, clean it efficiently etc. Do you recommend any in particular? I need 2 datasets, one for the numerical and one for the text classification.
I looked on kaggle. I can just use one of the datasets it provides but I don't know... I wanted to "solve" something tbh, and use certain pre-worked datasets on kaggle as a reference for my project.
What would you guys suggest: that I find a dataset myself, or just pick one on kaggle and work on an already available one?
Just ask Claude to do it and get a beer
Datasets exist because they are used to solve a problem, so basically any dataset you can find online would not solve a new problem. I think it's more than fine to use a dataset from kaggle, or from somewhere else, as long as you are interested in working with it
yup noted, ty !
I've been learning about svms and tried applying one to a dataset I found, and the points overlapped so much that it looked like you could not fit any sort of decision boundary between the classes. How do you deal with situations like this?
Or does it show that the features I plotted weren't a good predictor of class?
It's more often the case classes don't perfectly separate in real world datasets. As for how you deal with it, try getting more informative features, feature engineering, or boosting.
depends ™
plotting obv won't tell you the full story tho
and how did you determine they were overlapped? the fitted svm didn't perform well?
Hello
I am currently studying deep learning and want to go deeper and learn computer vision or gen ai. Can anyone recommend me some good books?
I was only using 2 features so that I could plot them and see what's going on, and on that plot the two classes were overlapping
unless you're only inputting those 2 features into the svm, then that's not really an issue and is probably to be expected
some1 else had a similar issue where on a 2d graph points seemed to be overlapping, but again that can easily happen: see #data-science-and-ml message
if you want a better visual graph, maybe try applying pca first
or tsne, umap, pacmap, etc. which are designed for visualizations
Anyone?
https://www.reddit.com/r/computervision/comments/129e3gc/suggestions_for_some_best_books_on_computer_vision/ i find this reddit post to be very thorough
as always, the recommendation both from my side and from the redditor is that, if you lack linalg and optimization background, you should address that first
Guys is it worth it to learn R? im good at working with python but the job market isn't doing its job so i have a lot of free time
Thanks!
Any YouTube suggestions? "Most youtube courses" just gives me uncertainty bc that's just gonna give me beginner know-nothing tutorial hell.
If u don't have any basics try to study the 3Blue1Brown deep learning series
And cs230 by stanford
If you really want YouTube videos, then I am sure MIT OpenCourseWare has lectures uploaded. However, I would HIGHLY recommend using university resources. Learn by reading. You'll need to get used to reading documentation anyway, so it's a good habit to develop in my opinion. Here are some resources I've used myself:
https://cedar.buffalo.edu/~srihari/CSE676/
https://ds100.org/fa23/
https://engineering.purdue.edu/DeepLearn/
https://www.cs.columbia.edu/~dechant/deeplearning.html
https://cs231n.stanford.edu/2016/syllabus
Quick question: What is the best or most used Encoder for String data, or does it depend on the data (then which one is the best for what data)? One-Hot Encoding? Or LabelEncoder (OrdinalEncoder)? Do you have any suggestions
it always depends on the data.
and what the model is supposed to do.
Could you give an example please?
for example if you have a quality feature that may be one of low, medium, high then it's natural to use ordinal encoding because they have an order of low < medium < high
something like a color feature with red green blue you might want to one hot instead, because there's not an order
sometimes there are too many unique values and you might want to use ordinal encoding to avoid the curse of dimensionality, or maybe the hashing trick or target encoding, or even use a tree-based model that doesn't need you to do the encoding at all
or maybe you want to leverage the large training corpus of modern embedding models to project them into high dimensional yet meaningful vectors
etc etc
Ok, yeah thanks that helps
is there a best practice with mixed encodings? Like, a single dataframe, some categorical features are ordinal, others are not. You also get numerical features. So you can LabelEncode some features, and OneHotEncode others.
would it make a difference ?
yea and that's what u should do because at the end of the day it is called a science for a reason u need to try and see what works best in ur project
So it sounds like just do all the encodings, and then apply mlxtend and see which combination works best
IOW, feature engineering is woven into the actual ML step.
link?
contacted you in DMs!
!warn @vale umbra your message was removed for soliciting a business relationship.
:incoming_envelope: :ok_hand: applied warning to @vale umbra.
ma bad
Is there a book for PyTorch that's built for beginners
can i ask question about my code here?
if it's about data science or AI then yes
ok thx
i am using regression to try and predict the prices of houses based on the area and i am trying to implement MSE so i can know the loss, but the numbers that pop up are way too big and i don't know how to make them smaller
hope this helps
you are taking the square of the mean of the errors, not the mean of the squares of the errors
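A small sketch of the difference, using made-up prices: squaring the mean error lets positive and negative errors cancel, while the mean of the squared errors does not. The function names are only for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared errors (the actual MSE)."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return sum(e ** 2 for e in errors) / len(errors)

def squared_mean_error(y_true, y_pred):
    """Square of the mean error (the mistake described above)."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return (sum(errors) / len(errors)) ** 2

# Invented example values.
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 300.0]

mse = mean_squared_error(y_true, y_pred)
wrong = squared_mean_error(y_true, y_pred)
print(mse, wrong)
```

Note that MSE is in squared units of the target, so with house prices large values are expected even for a reasonable fit; taking the square root (RMSE) gives a number in the same units as the price.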
ooh i should swap them thanks for the help
i tried to change the sequence and the numbers are still way too high, i tried to change my weight and bias but the mse got even higher
your model might just be bad
yeah fair enough
what should i do to improve it?
probably anything other than hard coding the parameters. you could try the closed form equations
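For a single feature such as area, the closed-form least-squares solution is short enough to write by hand: slope = covariance(x, y) / variance(x), intercept = mean(y) - slope * mean(x). A minimal sketch with invented data (`fit_line` is a hypothetical helper name):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented data where price is exactly 3 * area.
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(areas, prices)
print(slope, intercept)
```

This minimises the squared error directly, so there is no weight or bias to guess by hand.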
will check it right now also i tried to use absolute instead of square since there are a lot of outliers and that helped too
i believe this is a decent fit
it does look reasonable
I concur
Hi
!mute 1459838440609943749 "1 day" I asked you to stop spamming "hi" in a bunch of channels. When your mute expires, please make sure that your messages are substantive.
:incoming_envelope: :ok_hand: applied timeout to @low yoke until <t:1769724964:f> (1 day).
Hello there everyone. I have recently updated my neural network for the TI-84 Plus Silver Edition! I have made a huge breakthrough with dual normalized encoding for the four letter inputs combined with binary presence for the four letters entered represented as 26 input neurons for a total of 30 input neurons. I reduced the hidden layer to 50 hidden neurons, but the 12 outputs have stayed the same. The architecture is fundamentally different. I hope that others will find joy, intrigue, or inspiration from this project. If anyone checks it out, please let me know what you think!
A neural network implementation for the TI-84 Plus Silver Edition calculator capable of autocorrecting words.
yea i have one but it is about cv with pytorch
I'm looking for a starter guide for pytorch
Have you looked at the official tutorials? https://docs.pytorch.org/tutorials/
I'm trying to look for a physical copy of a book because I can get pretty distracted if I'm on the internet too much to do too much to see
kind of off topic but
i'm cramming to submit a paper by midnight (in 3 hours for me) for the ICML deadline. if I can't get it in do I get punished somehow? like I can't submit again next year or something?
alright i'm not getting the paper done lol. sucks to finally believe in yourself the day that it's actually due. i'll get it done soon enough.
Hi everyone
Sharing Semantica, an open-source semantic layer & knowledge engineering framework for building explainable, auditable AI systems.
It bridges the gap between vector-based AI and real understanding by modeling entities, relationships, provenance, and reasoning paths as first-class concepts.
Semantica is designed for GraphRAG, AI agents, and high-stakes domains where traceability, validation, and governance matter.
Feedback, ideas, and contributors are very welcome
https://github.com/Hawksight-AI/semantica
I'm coming across a weird problem. I'm performing a Grid Search CV with Stratified Group K Folds with a verbosity of 4, and I can see that there are some folds with a score of 0.803, but the best_score_ from grid_search.best_score_ is showing a lower value of 0.795
Is it averaging out the scores of all of the folds with a particular set of params? it's been quite a while since I delved deeper into ML and I'm constantly second guessing myself that I'm doing something wrong :/
yes
Learning the parameters of a prediction function and testing it on the same data is a methodological mistake: a model that would just repeat the labels of the samples that it has just seen would ha...
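A tiny illustration of why a single fold can score higher than best_score_: the reported best score is the mean over folds for the best parameter setting, not the best individual fold. The fold scores and parameter names below are invented.

```python
from statistics import mean

# Hypothetical per-fold scores for two parameter settings.
fold_scores = {
    "params_a": [0.803, 0.790, 0.792],
    "params_b": [0.780, 0.785, 0.775],
}

# Average the folds per setting, then pick the setting with the highest mean.
mean_scores = {name: mean(scores) for name, scores in fold_scores.items()}
best_params = max(mean_scores, key=mean_scores.get)
best_score = mean_scores[best_params]

print(best_params, round(best_score, 3), max(fold_scores[best_params]))
```

Here one fold reaches 0.803 while the mean for that setting is about 0.795, which is the same pattern as in the question.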
lot of bias in your data
ah tysm for this! I did a quick skim of the grid search doc and didn't realize it was there too
Hello guys
So I was just wondering if anyone could make an agent skill or is making an agent skill with regards to pytorch or tensorflow or any of the machine learning libraries or frameworks
For coding agents like Claude Code or Open Code
I just checked the agent skills marketplace and it turns out that in the python or ml space there aren't many agent skills, so I just wanted to put that out there
Thanks
Can you rephrase ? What is an "agent skill" ? And what do you want ?
@waxen kindle it's basically a skill.md file with some extras that teaches llms or coding agents exactly how to use a tool
https://m.youtube.com/watch?v=fOxC44g8vig&pp=ygUMQWdlbnQgc2tpbGxz0gcJCXwKAYcqIYzv
Agent Skills are organized folders that package expertise that Claude can automatically invoke when relevant to the task at hand.
Join the Claude Developer Discord - https://anthropic.com/discord
Learn more about Agent Skills - https://www.claude.com/blog/skills
00:06 Introducing Agent Skills
00:30 How Agent Skills work
01:08 Agent Skills vs C...
An example is remotion-skills (remotion is a react library that lets you create videos with react components)
Remotion skills effectively teaches AI agents like Claude code how to use the library together with best practices
Hence effectively turning prompts to motion graphics videos
With Claude code writing the code to make that possible
The same thing was done for manim
Effectively turning prompts to math animations making 3blue1brown videos easier to create
I was thinking we could do the same thing with tensorflow or pytorch
So we write an Agent Skill to effectively teach coding agents like Claude Code or OpenCode how to train models the right way
Using the best practices and stuff
I hope you get the picture I'm trying to paint
quickly skimming through that video, I'm not sure if "skills.md" is anything more complicated than a good rag system
so if that's the case, just put what you imagine are "torch/tf best practices" + some code examples in a skills.md
give it a description that would trigger when you write in said libraries
you're done (at least I think)
Yeahh you're right but someone better than me should do it, someone who has experience with the libraries and their ins and outs
It's nothing complicated but if done properly I think you will be able to train neural nets from scratch without writing much code yourself with this
It's the same thing with remotion
People who aren't as good can now do basic stuff with remotion videos and those who are experienced are supercharged now
So yeahh
I'd appreciate it if someone did that
This is another example that totally leveled up frontend Web design from the generic ai slop we all know
It's simple but it actually teaches AI how to do things the right way
Was wondering if someone could do the same for deep learning frameworks like pytorch
def HandlerTask(self):
    for model_name in self._models:
        model = YOLO(model_name)
        input_files = self._gather_input_files()
        if len(input_files) == 0:
            raise FileNotFoundError(
                f"No images or videos found under {self._source}. "
                f"Ensure files exist (recursively searched)."
            )
        workers = 0
        imgsz = 960
        use_half = self._device_to_use != 'cpu'
        try:
            result_generator = model.predict(
                source=input_files,
                iou=self._iou,
                agnostic_nms=self._agnostic_nms,
                conf=self._conf,
                device=self._device_to_use,
                save=self._save,
                stream=self._stream,
                workers=workers,
                # imgsz=imgsz,
                # half=use_half,
                verbose=True
            )
        # (the rest of the try block was not included in the message)
input_files is a list of filenames. I was originally passing a directory name but I want it to visit the files in the folder recursively, so I made a list of filenames.
but my program stops working every time. I wonder if it is because the list is too long and I'd be better off using a directory/path name?
This reminds me of last year when I missed the NeurIPS submission deadline. I had submitted the abstract, then 24 hours before the main paper submission deadline, in the middle of that crazy rush hour, my compute credit ran out. I didn't recover in time to beat the deadline. We live to fight another day.
There are some other top tier conferences you can submit your work to this year. You should consider submitting your work in other venues. You can even submit the work in the next ICML (but why wait till then if there are other venues you can submit to this year?)
I did still end up submitting the paper just not with the extra numerical example based on the neural net I was trying to build. I'm hoping that they accept me (with feedback) and by the time I've received that feedback I'll have cleaned up the issues with my code and made it run nicely. We'll see @odd meteor
Friend of mine made this plugin based on experimenting with code reviewing with Claude Code. Basically he saw greater success running successive passes (not parallel) for agent reviews, and pinned it down to (his words):
"- Stochastic sampling. Each run samples a different path through the reasoning space. One might focus on error handling, another on boundary conditions.
- Context anchoring. Once a reviewer commits to a line of analysis early in a pass, that reasoning occupies context and steers what it looks for next.
- Bugs mask bugs. When auto-fix resolves a "Must Fix" issue between passes, the next reviewer sees different code.
- Finite output budget. Each reviewer agent has a limited token budget for its response."
He's looking for people to test it out and provide feedback or contribute, if anyone has time here's the gh: https://github.com/HartBrook/lookagain
anyone have any resources on neural networks they found really useful?
we are being taught this semester about neurons perceptrons etc
we have moved onto some sort of logic gate math and the teacher wasnt able to explain it very well so i feel a bit lost
looking to self study so im not behind
hai guys morning, anybody knows free hosting for cloud computing such else?
Google Colab has some free quotas
Personally I'd suggest Paperspace, it's paid though (has a free tier), but it's pretty nice, last I checked, you could pay like 8 bucks a month or so and oftentimes get some free hours on some gpu
but long-term it's cheaper to get your own hardware
Hey, i want Api keys to create an Ai assistant, can anyone tell me which best free API i could get for thinking, listening and speaking?
Hi, I'm Jash Kevadiya, an AI Automation & Generative AI Developer with hands-on experience in building intelligent systems using Machine Learning, Deep Learning, and Large Language Models. I specialize in designing end-to-end AI solutions from data pipelines and model development to automation workflows and real-world deployment. I enjoy solving complex problems and turning AI ideas into scalable, production-ready systems.
I am struggling to find my first project as a freelancer. need an experienced freelancer to guide me.
Hi, I am trying to get pytorch installed on my machine for cuda 13.0 and python 3.9.25 in a conda environment. I have tried the below but am getting a could not find version error
pip install torch==2.9.0 torchvision==0.24.0 torchaudio==2.9.0 --index-url https://download.pytorch.org/whl/cu130
either go for older pytorch versions that support Python 3.9
or go to a higher version of python like 3.10
or higher
Hello AI people
I have a complex, open ended problem
I'm training an MtG AI player. Here are my assets:
I have a functional rules engine, and a complete graph based world model. This world model is completely accurate and encodes relationships of arbitrary distance. I can easily implement a spider or walker to do traversal. GNNs or an RNN which walks the graph could be applied here.
I have access to human-played game logs which, presumably, could be translated to resimulations of those games for observation. I can have a flagship LLM play against itself and have the AI observe. And, once the AI is halfway competent, I have self play.
And I have a clear goal. Given the state of the game world, multiple objectives, and a set of possible actions, how do I select the best possible action(s) when they're presented?
How would you approach this problem?
"completely accurate complete graph based world model"?
are you sure about that?
iirc MtG is pretty ridiculously complex, I don't mean that like chess with a ridiculously large number of possible game states, I mean it's literally Turing Complete
Just found out about this moltbook ai reddit website aaaaand… will there be any chance that I could get myself a clanker gf?
It is indeed very, very complex
I'm at 125 classes of node, and counting - probably closer to 250 once I'm finished
But, it is finitely complex
This is why I'm modelling my world graph as a LISP
i need help
how can i extract the values of the results from the dictionaries??
i try to use as little Ai as i can until they optimize them to use less water n such
To get all the values of a dictionary you can use:
Value = person1.values()
The output will be dict_values(['Mary', '2', '3', '3'])
i only want the results, not the name included
values1 = [v for k, v in person1.items() if k != 'name']
it loops over the keys (k): if k is not equal to 'name', it takes v, which is the value for that key
person1.items() is the key value pairs
just `person1['result1']` and etc?
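The options discussed above, side by side. The exact keys of `person1` were not shown in the conversation, so everything except 'name' and 'result1' is an assumption here.

```python
# Hypothetical dictionary; only the 'name' and 'result1' keys appear in the chat.
person1 = {"name": "Mary", "result1": "2", "result2": "3", "result3": "3"}

# All values, including the name (wrapped in list() to get a plain list).
all_values = list(person1.values())

# Only the results: skip the 'name' key.
results_only = [value for key, value in person1.items() if key != "name"]

# A single result, looked up by its key.
first_result = person1["result1"]

print(all_values)
print(results_only)
print(first_result)
```

The comprehension is handy when the number of result keys varies, while direct indexing is simplest when you know exactly which key you want.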
guys in your opinion what projects would you like to see in the resume of a fresher data analyst?
Hey everyone, I recently spent some time training a decoder only character level transformer. I had trained it with some README files that I found on the "stack" dataset.
```
Epoch: 45/50 | Train Loss: 0.8878 | Val Loss: 0.9439
Validation Loss has not improved. Patience:2/5
Epoch: 46/50 | Train Loss: 0.8867 | Val Loss: 0.9394
Val Loss has improved at 46. Model Saved!
Epoch: 47/50 | Train Loss: 0.8887 | Val Loss: 0.9380
Val Loss has improved at 47. Model Saved!
Epoch: 48/50 | Train Loss: 0.8829 | Val Loss: 0.9335
Val Loss has improved at 48. Model Saved!
Epoch: 49/50 | Train Loss: 0.8815 | Val Loss: 0.9322
Val Loss has improved at 49. Model Saved!
Epoch: 50/50 | Train Loss: 0.8746 | Val Loss: 0.9327
Validation Loss has not improved. Patience:1/5
```
However, when I tried to use it as an autocomplete tool, I got some gibberish text that resembled base64 strings or French text. I believe that this is due to a dirty dataset. (My current filter: a file must contain only English ASCII letters and punctuation, and at least 60% of the file must be English letters and whitespace combined.)
I'd like to know any techniques used to effectively clean my dataset while streaming. The entire dataset is around 160 GB and I am using 68 MB (First 10000 files that fit the criteria). Any help is appreciated.
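One possible way to tighten the filter while streaming, sketched with the standard library only. It keeps the two criteria described above and adds two cheap heuristics that are assumptions of this sketch, not something from the original setup: rejecting long unbroken base64-like runs, and requiring a minimum share of very common English words. The thresholds and the word list are arbitrary illustrative choices.

```python
import re
import string

ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " \n\t")
# A long unbroken run of base64-alphabet characters is a cheap sign of an embedded blob.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/=]{60,}")
# Tiny, hypothetical list of very common English words.
COMMON_ENGLISH = {"the", "and", "to", "of", "a", "in", "is", "for",
                  "this", "that", "with", "you", "it"}

def english_word_ratio(text):
    """Fraction of words that appear in the small common-English list."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(w in COMMON_ENGLISH for w in words) / len(words)

def looks_clean(text, min_letter_ratio=0.6, min_english_ratio=0.1):
    """Return True if the text passes all of the heuristic checks."""
    if not text or any(ch not in ALLOWED for ch in text):
        return False  # empty, or contains characters outside the ASCII whitelist
    letters_and_space = sum(ch.isalpha() or ch.isspace() for ch in text)
    if letters_and_space / len(text) < min_letter_ratio:
        return False  # too much markup, digits, or symbols
    if BASE64_RUN.search(text):
        return False  # likely an embedded base64 blob
    return english_word_ratio(text) >= min_english_ratio

def clean_stream(texts):
    """Lazily yield only the texts that pass the filter."""
    for text in texts:
        if looks_clean(text):
            yield text

english = ("This project is a small tool for parsing logs. "
           "Install it with pip and run the command to start.")
french = ("Ce projet est un petit outil pour analyser les journaux. "
          "Installez le paquet et lancez la commande.")
blob = "data: " + "QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVo=" * 3

kept = list(clean_stream([english, french, blob]))
print(len(kept))
```

Because `clean_stream` is a generator, it can wrap whatever iterator yields the streamed files, so nothing has to be held in memory beyond the current file.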
BlockSize = 512
MaxEpochs = 50
LearningRate = 3e-4
Evaluations every epoch, I run 200 iterations and return the normalised losses.
NumEmbed = 384
NumHead = 6
NumLayer = 6
Thank you.
This makes sense
Tysm i will try after i get home
No you're not lol
I like this theme, what's the name of it?
that was my main account i just realised
Guys my lecturers are giving two different responses
The activation function of a perceptron
Is it either 1 or 0 as the final output or 1 or -1
Or is it different depending on the model or something
Depends on the model and the situation
Usually an activation function gives a value between two values (like between 0 and 1). Not exactly one or the other.
Oh you're talking about perceptrons
I forgot
The original perceptron or the one used in pedagogy (made up)?
well ive never heard of pedagogy
we are having an introduction to neural networks but one of my lecturers seems a little lost and has confused me a bit lol
Ok, so in school they teach a simplified variant of the actual perceptron which they then call "the perceptron". That version taught in schools can use either 0, 1 or -1, 1, and the second option is preferred due to being balanced around 0.
The original paper talks about activation in terms of high or low (physical circuit). Binary 0 and 1 is when you threshold that and consider above some amount to be 1, and below to be 0, but you can interpret that as -1 or 1 depending on how you have it set up and what it does later with that value.
("all-or-nothing" -> binary, digital)
Short answer, go with -1, 1. It makes the math easier.
They are equivalent (in learning power / model design).
interesting okay and if i were to model a perceptron in python would i go for 1 -1
Yes. This simplified variant. Modern textbooks prefer -1, 1.
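A minimal sketch of that simplified textbook perceptron in Python, using the -1 / +1 convention and trained on logical AND. The learning rate, epoch count, and function names are illustrative choices, not anything prescribed above.

```python
def step(z):
    """Sign-style step activation: +1 at or above the threshold, otherwise -1."""
    return 1 if z >= 0 else -1

def predict(weights, bias, x):
    """Weighted sum of the inputs plus the bias, passed through the step function."""
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Classic perceptron rule: nudge weights and bias only on misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(samples, labels):
            if predict(weights, bias, x) != target:
                weights = [w + lr * target * xi for w, xi in zip(weights, x)]
                bias += lr * target
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly, so stop early
            break
    return weights, bias

# Logical AND with inputs and targets encoded as -1 / +1.
samples = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [-1, -1, -1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(weights, bias, predictions)
```

AND is linearly separable, so the update rule converges; XOR with the same encoding would not, which is the limitation that motivates multi-layer networks.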
they also never explained why bias is used so i think tomorrow im going to set aside 2 hours to dig deeper
thank you
You know y=mx+b? Think about what the b does.
(Two inputs, xor problem (-1, 1))
yes we used this exact formula last lesson
ahh so the bias allows you to shift the line to exclude or include data points
Offset along the normal.
In the line equation: `Ax + By + C = 0`.
A and B hold the normal vector, and C is the offset along that.
For example a normal vector pointing straight up, `<0, 1>`, has `A=0, B=1`, so you just have `y = some constant`, so you have a horizontal line, and can move it up and down via the constant's value.
If you are familiar with more linear algebra, it turns the transform from linear to affine (adds a translation term).
the version I learned was just real valued stuff with a sigmoid applied, is that the classical one?
(and doing analysis and stuff like proof of convergence for some of these old-school models)
No, the classical one is much more complex and has multiple variations. For example it has feedback connections.
It's also not fully connected.
It's misinformation that a perceptron is that simple form.
you add translations to rotations? this sounds a lot like some kind of group theory
In mathematics, the affine group or general affine group of any affine space is the group of all invertible affine transformations from the space into itself. In the case of a Euclidean space (where the associated field of scalars is the real numbers), the affine group consists of those functions from the space to itself such that the image of ...
I am exactly there right now. "Planar Affine group over the reals"
this is basically crystallography
In Euclidean geometry, an affine transformation or affinity (from the Latin, affinis, "connected with") is a geometric transformation that preserves lines and parallelism, but not necessarily Euclidean distances and angles.
More generally, an affine transformation is an automorphism of an affine space (Euclidean spaces are specific affine spaces...
(Every game engine ever is built on this too)
(They use augmented matrix form (homogeneous coordinates))
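A short 2D sketch of the augmented-matrix idea: appending a constant 1 to each point lets a single 3x3 matrix carry both the linear part (here a rotation) and the translation. The helper names are made up for this example.

```python
import math

def affine_matrix(angle, tx, ty):
    """3x3 homogeneous matrix: rotate by `angle` radians, then translate by (tx, ty)."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [c, -s, tx],
        [s,  c, ty],
        [0.0, 0.0, 1.0],
    ]

def apply_transform(matrix, point):
    """Multiply the matrix by the point written in homogeneous form (x, y, 1)."""
    vec = (point[0], point[1], 1.0)
    out = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    return out[0], out[1]

# Rotate (1, 0) by 90 degrees to get (0, 1), then shift 2 units along x.
m = affine_matrix(math.pi / 2, 2.0, 0.0)
moved = apply_transform(m, (1.0, 0.0))
print(moved)
```

The last column plays the same role as the bias term in the perceptron discussion: it is what turns a purely linear map into an affine one.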
sure, just use quaternions instead of euler matrices for the rotation problem
Yeah.
Although there is a small growing push towards geometric algebra (rotors instead of quaternions).
by the way there are exactly 219 space groups in crystallography, if you ignore something known as chiralities
coincidence? I don't think so
(there are 219 affine space group types in 3D)
Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the specialization of each node in the network. It is well suited to finding clusters within data.
Models an...
(They are much more powerful than a simple sigmoid node (and multi-layer))
Ty bro
i got you
Guys am I ready for learning deep learning?
We just learnt more on it today
So 0,1 was a step function, which isn't used for training because its gradient is 0 almost everywhere, so the network gets no signal about how to adjust the weights
Sigmoid (output between 0 and 1) or tanh (output between -1 and 1) is better and more widely used, but there are also other ones like relu, which is used in deep learning
Ig the step function was just a precursor to the topic
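For reference, the activation functions mentioned here written out with the standard library, with their output ranges noted in the comments. This is just a sketch for comparing the shapes.

```python
import math

def step(z):
    """Hard threshold: output is exactly 0 or 1."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """Smooth squashing function with output strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Smooth squashing function with output strictly between -1 and 1."""
    return math.tanh(z)

def relu(z):
    """Zero for negative inputs, identity for positive inputs."""
    return max(0.0, z)

for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), round(tanh(z), 3), relu(z))
```

The step function is flat everywhere except at the jump, which is why gradient-based training prefers the smooth or piecewise-linear alternatives.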
try being as specific as you can so that people can start helping you without having to interview you.
Chill king
I'm very chill. I'm giving you instructions so that people can actually help you.
Thank you king
yw twin
hey guys, wanna learn a few ML models, where can I do so?
When I implement my green screen for an MP4 file, do I need to threshold the video frame? This is what I wrote:
# Convert frame to HSV
hsvFrame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
# Threshold the image
retVal, threshImg = cv2.threshold(hsvFrame, threshold, 255, cv2.THRESH_BINARY)
print("Threshold return value: ", retVal)
The threshold return value is 30.
I am learning about ML models and CNNs at OpenCV University. They have paid courses and free bootcamps. To clarify, what ML models are you looking into?
specifically at:
Dummy / Baseline models (constant, random, majority class)
Logistic Regression
Linear Regression (and regularized variants: Ridge, Lasso, Elastic Net)
Random Forest
Gradient Boosting (XGBoost / LightGBM)
Isolation Forest
One-Class SVM
Hidden Markov Models
There is linear regression for Tensorflow.
Thanks for letting me know.
afaik
Any smart pytorch users around
OptimizedModule(
  (_orig_mod): Model(
    (token_emb): Embedding(65, 32)
    (pos_emb): Embedding(125580, 32)
    (transformer): TransformerEncoderLayer(
      (self_attn): MultiheadAttention(
        (out_proj): NonDynamicallyQuantizableLinear(in_features=32, out_features=32, bias=True)
      )
      (linear1): Linear(in_features=32, out_features=256, bias=True)
      (dropout): Dropout(p=0, inplace=False)
      (linear2): Linear(in_features=256, out_features=32, bias=True)
      (norm1): LayerNorm((32,), eps=1e-05, elementwise_affine=True)
      (norm2): LayerNorm((32,), eps=1e-05, elementwise_affine=True)
      (dropout1): Dropout(p=0, inplace=False)
      (dropout2): Dropout(p=0, inplace=False)
    )
    (l1): Linear(in_features=32, out_features=32, bias=True)
    (l2): Linear(in_features=32, out_features=3, bias=True)
  )
)
My model keeps exploding and outputting NaNs (even on the first batch, with gradient clipping)
I've never seen this sort of thing from PyTorch and it's the first time I've ever touched transformers
Ah! I had forgotten to give it a mask of which inputs were padding.
Still gives NaNs but seemingly less frequently now?
Same nonsense with a simpler 1D CNN
Add some normalization
hey guys, I've been learning PyTorch for a while, but I'm now combining it with matplotlib and I'm scared
import matplotlib.pyplot as plt

def plot_predictions(train_data=x_train,
                     train_labels=y_train,
                     test_data=x_test,
                     test_labels=y_test,
                     predictions=None):
    '''
    Plots training data, test data and compares predictions
    '''
    plt.figure(figsize=(10, 7))  # the keyword is figsize, not fig_size
    # Plot training data in blue
    plt.scatter(train_data, train_labels, c='b', s=4, label='Training data')
is it that hard, or is it because I'm just starting to learn it?
first off, the indentation is wrong, which will result in an error. What are you trying to achieve? A simple scatter plot can be done like this:
import matplotlib.pyplot as plt
plt.figure()
plt.scatter(train_x, train_y, label="Train")
plt.scatter(actual_x, actual_y, label="Actual")
plt.xlabel("X values")
plt.ylabel("Y values")
plt.title("Train vs Actual Scatter Plot")
plt.legend()
plt.show()
by the way, you are hardcoding parameters; it's simpler to use the object and assign new items/traces to it.
and if I'm allowed to make the comment: before you dive into PyTorch you should grasp the fundamentals of Python first, as this isn't a complex task at all.
Matplotlib is easy in my opinion. I used it a lot for school.
In addition, I used Seaborn too.
Another really easy one is plotly
how can I show my code like this message?
!code
thank you
would anyone be interested in seeing my code of my first regression model and commenting on it?
Has anyone built any kind of AI agent or used openclaw to check out moltbook?
Hello
How do you guys manage discrete variables in XGBoost?
Heard that it wasn't very good at handling them.
Hey can anyone guide me to learn ML from scratch?
just my 2 cents and take it for what it's worth:
- Gemini
Just tell it exactly what you're trying to learn and how you like to learn, etc. You'd be surprised.
Not perfect solution of course
Thanks mate!
You can correct me if I'm wrong, I haven't looked much into classic ML theory.
As I remember it's the other way around: it can handle them. Gradient boosting is just a sequence of decision trees, each one fitted to correct the errors of the previous ones. At each step a tree literally tries candidate splits over the full batch (the full training set, if you like) and keeps the most informative one, scored with entropy (minimizing surprise) or Gini impurity (roughly a cheaper alternative to entropy) in classic trees, or with a loss-based gain in XGBoost. So it can work with any kind of data as long as it is numerical, and it can cope with missing values, since the data can still be split using other features
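To make the split-selection idea concrete, here is a minimal sketch of choosing a threshold on one integer-encoded feature by weighted Gini impurity. The helper names (`gini`, `best_split`) and the toy data are made up for illustration. XGBoost itself scores splits with a gain derived from the loss gradients rather than Gini, so treat this as the classic decision-tree version of the idea.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Try every threshold 'value <= t' and return (threshold, weighted_gini)."""
    best = (None, float("inf"))
    for t in sorted(set(values))[:-1]:  # the largest value would leave the right side empty
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical discrete feature (e.g. number of rooms) and a binary target
rooms = [1, 1, 2, 2, 3, 3, 4, 4]
sold = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, score = best_split(rooms, sold)
print(threshold, score)
```

An ordered discrete feature like this one works fine as a plain number. Unordered categories are a different story, since an arbitrary integer encoding imposes an order that the splits will then exploit.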
I haven't worked with transformers, but maybe you'll find something useful in what I say, although it could be completely useless: a big learning rate; exploding exponents (that's why numerically stable cross entropy exists); activations (in RNNs, as far as I know, the hidden state is squashed with tanh); maybe batch/layer norms will help; maybe just check that you connected everything the right way; add some printing of values that exceed a threshold after each layer and see if anything looks strange. Maybe you used log somewhere it wasn't supposed to be, because on backprop the 1/x will scale gradients a lot. Maybe something didn't connect, so through the chain rule you picked up some NaN values and carried them through the layers
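The "exploding exponents" point can be sketched with a softmax written two ways. This is a standard-library illustration with made-up logits, not a diagnosis of the model above: the naive version exponentiates the raw values, while the stable version subtracts the maximum first, which leaves the result unchanged mathematically.

```python
import math

def naive_softmax(logits):
    exps = [math.exp(z) for z in logits]  # overflows for large logits
    total = sum(exps)
    return [e / total for e in exps]

def stable_softmax(logits):
    m = max(logits)  # shifting by the max does not change the result
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [1000.0, 999.0, 998.0]  # made-up, deliberately huge activations
try:
    naive_softmax(logits)
    naive_ok = True
except OverflowError:
    naive_ok = False  # plain Python raises; float tensors would give inf and then NaN

probs = stable_softmax(logits)
print(naive_ok, probs)
```

In a tensor library the naive version would not raise. It would produce inf, and inf divided by inf is NaN, which then propagates through every later layer and into the gradients.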
Hey i'm making a roadmap for myself to learn AI, is kaggle a good source to learn machine learning and deep learning?
Guys uhm, I need help coding an ai gf from scratch for a challenge lmfao.
Fantastic resource
FastAI is a really high level wrapper for pytorch
What I did was I learned FastAI then switched to PyTorch after
Since the overall structure is identical
hello everyone! can i post my data engineering doubts here?
Hi guys, I have a question. I have taken an ML course in uni and I want to build a CV model to label mushrooms. I have a decent dataset already and I'm just wondering which LLM is the best one to give me a hand with coding? I've heard both Claude Code and Gemini are fairly good
either of those are fine, but the more you use an LLM to help you with this, the less you will learn.
Hello, I want some ideas/advice. I'm currently working on my undergrad final year project and my supervisor told me to include an AI component in my project, where I can train a model.
So basically what I'm building is an "Animal welfare" app where users can create posts and chat. A basic app for now, but it seems it's too basic. My supervisor told me to train a model that would compare animal images in the case of missing animals.
But I told him that I don't think it's possible using AI models. I know there is another technique (I don't remember the name) where we compare the arrays of images and then find how similar they are.
In this context, I wanted some ideas. Do you know what I could implement for the AI aspect, and what additional features might be interesting for an animal welfare app, please?
That is very fair, I just need something to bounce ideas off of for some robotics/CV related projects I have, and I don't know which one is the most competent. I don't want to pay for more than one subscription
i just turned 13, how do i start ml
i have background in linear algebra and basic calc
any advice
start with traditional ml
start with basics so that you learn fundamental concepts, and slowly work your way up to cutting edge ML. it will be a long time before you're ready to learn about how, for example, LLMs work.
a good place to start is learning how to train a classifier model on some CSV data.
Hi everyone, so I was trying to make a simple perceptron just to understand them properly, and I used the AND logic gate as the dataset. How can I find out whether what I wrote is done properly, or just working because of the dataset size, without having to make a new one with a bigger dataset?
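One way to approach the perceptron question, as a rough sketch: a 2-input gate only has four possible inputs, so checking all four rows is already exhaustive. To test the implementation rather than the dataset, the same code can be run on other gates. It should also learn OR, and it should fail on XOR, which is not linearly separable. The function names and hyperparameters below are illustrative choices, not anyone's actual code.

```python
def train_perceptron(samples, epochs=50, lr=1.0):
    """Classic perceptron rule on 2-input samples; returns (weights, bias)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred  # 0 when correct, +1 or -1 when wrong
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def accuracy(samples, w, b):
    hits = sum((1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == t for (x1, x2), t in samples)
    return hits / len(samples)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
gates = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],  # not linearly separable, so a single perceptron must fail here
}
results = {}
for name, targets in gates.items():
    data = list(zip(inputs, targets))
    w, b = train_perceptron(data)
    results[name] = accuracy(data, w, b)
print(results)
```

If an implementation reaches full accuracy on XOR, or cannot learn OR, that points to a bug rather than to the size of the dataset.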
LLMs to assist with building a project can be good, assuming you mostly know what you're doing already and can spot where errors might occur. You should definitely not use it as a crutch, especially if your main goal is learning. It might make decisions you don't understand, can't justify, and are wrong. But if you already know what pieces you need, and mostly just want syntax, LLMs can be pretty helpful providing snippets.
anyone got an idea pls
Your supervisor wants you to use a CNN
Call that AI if you want
yep it's a CNN. In my head, I was just going to compare 2 arrays. I don't really know if a CNN can help, because my training data would be animals in general, no?
If you just use some kind of KNN on raw images, you'll end up getting very bad results AND it will take a veeeery long time
Look on the internet for what kind of model can be used for image recognition
But it's a whole project on its own, really
yeah I see, will try to have a general look and see what it can bring, ty !
If you are allowed to, I recommend finding a model on Kaggle or Hugging Face and possibly fine-tuning it, because I don't expect you to get meaningful results on this task if it's not the core of the work
As I said, it's a whole project on its own
You would need a lot (but like, a real lot) of data for training
And all the cleaning and labelling, that's not something I would start from scratch
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
yeah will do that, I have no restrictions on that
are exploding parameters normal in ML??
Yes
Hey!
New coder here
I'm really confused about data science and AI
I mean they teach it in school but it sounds like fancy jargon to me half the time
Anyone here who can help?
What do you mean "fancy jargon" ?
It's a bunch of algorithms, plus techniques for using and implementing them (as in any field within computer science)
Basically
What do you need help with ?
Any courses online that can help me get started
Python's pretty cool, but is it a part of data science? Are coding languages a part of data science?
Same I had just started @long whale
Python is a tool you use to do data science
Usually yes, people use python
(But other languages can be fine too)
Like C and C++? Java?
those languages aren't really used for data science.
the most common alternative is R.
for vehicle classification model development using roboflow, is this balance or imbalance data? is it too bad or not, sorry i am new
for CV tasks you should not care about balance/imbalance of data
just make sure you have variety of data for each class
so let's say your task is object detection: you need to make sure your dataset contains all, or nearly all, of the possible variety for that class
you can also add image augmentation techniques such as inversion
ohhhhh
so I will be using Albumentations? or just use what YOLO has?
Just try a model on the dataset first and see whether it's training or not
thanks, but may I ask where is best to train on my dataset? It contains 17k images. Locally I have a 3060 Ti with 8 GB VRAM, or should I try Google Colab, Vast.ai, or RunPod
I never tried on a 3060 Ti, I was only using the Colab free tier for YOLO models
But I would say give it a try locally
how do you deal with the time limit?
u using the t4?
ohhhhh okay okay thanks
Is there like, a free ETL course anywhere?
can someone share any resources and tips on how to grid search effectively? I'm tuning a couple of models, and the list of hyperparameters is too large to search through all at once.
I'm thinking of running different parameters that are close, together, but couldn't the different sets of parameters have different optimal values when working together than what I'll get from running grid search on separate sets of hyperparameters?
yep, hp search is very time consuming. You basically have to parallelize the computations. Optuna is a good library for that for example
Refer official docs that's all you need.
Use Optuna bro.
It's better than anything.
thing is, I'm running into memory issues when I'm trying to parallelize the tree models. and unfortunately, I'm limited in the libraries I can use and Optuna is not one of them...
is there any native sklearn/python alternative to Optuna? I'd love to use it, but sadly that's outside the scope of my project
we have a thesis for vehicle classification and license plate detection (2 models) i will be buying the raspberry pi 5 with the hailo ai hat 26 tops, my question is should i buy 8gb or 16gb ram raspberry pi 5?
^ ocr for license plate recognition and website with database is included
run the algorithm on a computer and see how much memory it needs ?
we don't have the model yet
simple random search has been found to be more time efficient than grid search
it's directly in sklearn, so you can try that
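For the sklearn route, the class being referred to is `RandomizedSearchCV` in `sklearn.model_selection`. The underlying idea needs nothing beyond the standard library, so here is a dependency-free sketch. The search space and the `score` function are made up for illustration; in practice `score` would be a cross-validated metric from the real model.

```python
import itertools
import random

# Hypothetical search space for a tree model
space = {
    "max_depth": [3, 5, 7, 9, 11],
    "min_samples_leaf": [1, 2, 5, 10, 20],
    "n_estimators": [100, 200, 400, 800],
    "max_features": [0.3, 0.5, 0.7, 1.0],
}

def score(params):
    """Stand-in for a cross-validated score; peaks at depth 7, leaf size 5."""
    return -abs(params["max_depth"] - 7) - abs(params["min_samples_leaf"] - 5) / 5

grid_size = len(list(itertools.product(*space.values())))  # every combination

rng = random.Random(0)  # fixed seed so the run is repeatable
n_iter = 30             # budget: far fewer fits than the full grid
trials = [{name: rng.choice(options) for name, options in space.items()} for _ in range(n_iter)]
best = max(trials, key=score)

print(f"full grid: {grid_size} fits, random search: {n_iter} fits")
print("best sampled params:", best)
```

Because every trial samples all hyperparameters jointly, interactions between them are still explored, which is the concern with tuning separate groups of parameters one after another.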
Do you guys know of any datasets of "real" 3D models? So not fantasy assets but things like chairs, tables, shelves, etc.
I agree with purplys, do a random search instead of grid search. If you're willing to use other dependencies have a look at bayesian optimization or similar
They shine when you can't parallelize because the algos are inherently sequential
hlo
16 if you can
I see, I'll look into that. thanks!
I'll use random search for now. Bayesian optimization is a bit outside my scope, both in terms of knowledge about it, and from a technical standpoint. I'll still look into it after I'm done with my current project, it definitely seems it'll save me problems with regular ml models. thanks for the recommendations!
damn i just bought the 8gb
it is paired with 26 tops hailo hat so i didnt think 16gb ram is needed
yea its okay!
will it struggle with 8gb ram?
I dont think so, I mean 16 is pretty standard nowadays thats why
but for raspberry pi its okay
yaaaa I don't have any experience with how efficient RAM is on an rpi, since I haven't owned one
but my pc is now struggling with 16gb ram lmao
In case anybody's curious:
https://joss.theoj.org/papers/10.21105/joss.09631
Mishchyriak, Y., (2026). PureML: a transparent NumPy-only deep learning framework for teaching and prototyping. Journal of Open Source Software, 11(117), 9631, https://doi.org/10.21105/joss.09631
That's wild
no one has ever complained about having too much ram
especially (and literally) in this economy
what's the price diff though
100 dollars is the price difference between 8gb and 16gb in our country
100-110 dollars
well, I'd then either have to know the full price or need to know the price diff in percent because yk, 1000 vs 1100 is a bit different than 110 vs 210
137 for 8gb 225 for 16gb
almost 2 rpi for 16gb
oh, what the heck
Hhyy
Does anyone have any idea where I can find easy to understand tutorial for learning R? I kinda need it
Hi, quick question, what's the difference between bias in data vs imbalanced data? I thought these were synonymous, but bias doesn't mean imbalanced data?
I think biased data happens when there is an imbalance in the data; it happens with high-ratio imbalances like 10:1?
yep too expensive
I think it overlaps with data imbalance but I believe there is more to data bias
e.g. when values are capped within certain ranges, that's some kind of bias
yep
anyways another question: for an rpi5 with the 8L Hailo hat, what is the best YOLO model? 8? 11? 26? and also small or nano
26 is the latest (I believe), I haven't tried it yet. The 11 one, which I used recently, seems to work well.
small or nano depends on the size of your dataset.
hmmmmmmmmmmmmmmmmmmmmmmmmmm
how do u determine the size of the datasets?
i will be using two models btw one for vehicle classification detection and license plate detection
check out yolo's docs, it gives you insight when/where to use nano or when to switch to another size like small or medium
how many images do you have in your dataset?
the vehicle classif has 17k
the license plate also has 15k
but i will transfer learning it with 3k images for the license plate i mean
yeah, I see, recently I work with approximately 20k images for my object detection model, the small model did a decent job, maybe you can try with it and switch if needed
what was the fps?
also how did u train it locally? or cloud like colab?
euh don't remember but since it's on a pi, I would export it using the openvino format which allows it to work better/more fluidly on a pi
colab sadly :c
u on free tier? how long did it take
sorry too many question i am so curious
yeah, too much time sadly and I couldn't exceed 80 epochs I think
cuz im planning to train it locally using my 3060ti
you can give it a try, can be better than colab
Hi guys I need some help
yaaa but i might just rent a gpu in runpod if i need it faster
wait forgot the name
its pod somthing
Hi I have a question what are you talking about ?
computer vision and training
Are you a student ? Sorry I usually get curious
Do you know networking ?
not really
Im trying to getting better computer for cyber security can you help me ? About hardware and software in computer
oh sorry i dont really know anything about cyber sec
No no I mean just computer , in the first place of everything in tech I should learn better the computer
there is a networks channel, maybe you will have better chance if you ask there
Thanks, I will give this a try, and by the way I'm looking for some friends in tech, you know it's hard to be a nerd at school
maybe cisco?
Yep, and by the way I would be happy to make friends who are more like me, if you would like
!rule 9
<@&831776746206265384> recruitment
!rule 7
7. Keep discussions relevant to the channel topic. Each channel's description tells you the topic.
Hi, quick question, when performing cosine similarity of two embeddings, should they have the same number of dimensions/length?
I want to look for the vector similarity of 2 images. But the number of embeddings/size of image etc should this be a constant?
yeah I see, for this to work, both should have the same size/length
you cannot compare embeddings generated by different models even if they have the same size though, unless these models are explicitly trained to work with each other
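A minimal sketch of the cosine similarity being discussed, using only the standard library. The vectors are made up and stand in for embeddings. With a real embedding model, the image size usually does not matter, because the model maps every image to a vector of the same fixed length.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional embeddings standing in for image vectors
img_a = [0.2, 0.1, 0.9, 0.4]
img_b = [0.4, 0.2, 1.8, 0.8]   # same direction as img_a, twice the length
img_c = [0.9, -0.3, 0.0, 0.1]

print(cosine_similarity(img_a, img_b))
print(cosine_similarity(img_a, img_c))
```

The first pair scores close to 1 even though the vectors have different lengths, since cosine similarity only looks at direction. Vectors with a different number of dimensions raise an error, because the dot product is not defined for them.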
I need some advice. I read about image similarity and I have a better overview of the different method available to perform it. I'm building a web app that will allow users to compare missing animals vs animals found so that we know to what extend these 2 match.
What would be some required techniques to achieve this, please? I know there is CLIP, but this is used more when we have a prompt and look for images based on that prompt; it's not really a similarity search, no?
I also read about Siamese neural networks. I vibe coded something with AI just to see how it works; it seems to work at first, but when I use photos of different colors, say 2 cats of different colors, I get a high similarity score, which I don't really want.
maybe check if Meta's SAM (Segment Anything) works for your use case
if not, you might need to use a general purpose vision language model (chatgpt/gemini) or fine-tune a model specific for whatever you're trying to do
will give it a look
CLIP
what makes CLIP special is it projects both text and images into the same embedding space
so while yes, the fact you can compare similarity between text and image is one of its highlights, you can also compare 2 images
besides OpenCLIP, there are also other models that could work similarly, like dino v2/v3, google's siglip v1/v2, etc
by itself I don't think CLIPs are good at what you're describing, but I think you can train a classifier on top of it. I've not done that myself nor have I really looked deep into it, so I'm not sure how well that would turn out
yep noted, by the way, things like OpenCLIP, are these free models or do we have to pay for them?
the ones I mentioned all have open weights you can freely download from huggingface
noted, ty
hello world !
I'm working on a mini project, a funny offline AI girl.
I'm using llama3.2:3b for the brain. I'm working on a feature that makes the AI learn about you, but it doesn't work properly.
can anyone offer help?
for more information here you are the github repo: https://github.com/AhmedGharsallah/funny_ai_offline
llama3.2:3b
that model is very old and small, I would recommend trying something newer and/or larger
We're up and running! It's a local AI that learns from every conversation, consolidates knowledge while idle, and can autonomously research the web and execute tasks. It's running great on a Qwen 3 VL 30B A3B Instruct model right now at Q4_K_M. All you need is 24 gigs of VRAM. Ideally, though, I want to test it on an 80B with the full 262k context.
It just pointed out a problem for me.
Anyone else feel like proprietary AI software is dead in the water? Why stuff a model with billions of parameters that change on a long enough timeline? 80B seems ideal, or somewhere in that realm, with advanced software capabilities and the tools to research and verify of its own accord.
You work at Ring or what? I saw the Super Bowl commercial, lol. That's the problem, you're just searching for similarity.
Sounds like a great idea, but full of potential false positives.
Now you've got a system that spreads false hope. Dogs weather easily and get mange when outdoors for a few days.
People looked for missing animals in the 90s. This is 2026. Ring had a good idea though: use their network to track them from their origins, I imagine.
You're missing the infrastructure, and a huge company, Ring, is already looking into this
It can query its own memories and prompts.
another question for rpi5 what is the best remote access vnc or rpi connect? if vnc what would be the best one
Hello I sent you a friend request, I have some questions could you help me ?
Why are you doing this ? Why don't you just ask here ?
Bro I talked about this above
Yep, and as you can see the person you talked to was not really open to just get dmed randomly
Talk here first, then maybe send friends requests
In real life, you don't bump into people and say "can we be friend?"before talking, right ?
Yep your right
Sorry about that
we could perhaps have a better architecture than neural networks, or do we already have it?
rather than having a bunch of layers we could think of processing it some other way
If you're thinking of dense layers - transformers are such a better architecture
don't they still perform the same way, layer after layer?
what if layers could talk to other layers regardless of the order
non-linear operations
all NN architectures have nonlinearities
there are a few architectures that purport to be better than transformers but they didn't catch on. In particular I saw at least one adding connections between layers
yup but linearity in their order of processing data, as in they go from left to right step by step
if I'm not wrong transformers are neural networks with attention layers?
sure
alright just clearing up for myself
depending on what you mean by that, attention layers are already parallel. Like, processing N tokens of prompt does not require N sequential steps; if you have enough parallel compute you can process an arbitrary-length prompt in a fixed amount of time. This is a key difference from RNNs, and the reason why transformer training is so fast
my thought was that (as an example) instead of simple forward propagation, we introduce logic so that it can propagate backward a few times through the hidden layers (decided by an arbitrary function that determines whether it does so) before finally reaching the outputs
so it could maybe correct or improve the data as it goes
I'm trying to use a naive bayes model for a multi class imbalanced text + other features classification, but I'm having some problems with the scoring. I'm assuming that I'm not processing the data correctly, so I'd appreciate it if someone could guide me in the right direction.
I'm also, severely limited in the libraries I can use, so a general solution that can be implemented with native scikit learn/pandas would be helpful. I did some digging online, and almost everyone uses deep learning libraries to parse the data before passing it to the model. :(
Any resources on packaging ML models into an app?
I've been noticing a gap with modern data science education and actually putting models into production. A lot of the popular resources just show you how to joblib dump and load elsewhere, but this is hand waving a lot of complexity.
most of the models that people use these days are prohibitively expensive to distribute as part of the software, so they just get interacted with over the web.
I can see that being the case for LLMs (LLMOps I guess).
What about project structure, robust data/ML pipelines, handling async requests, Docker, etc.
I'm wondering if there might be some good resources for this side of things.
Even if not by distributing as a software but maybe general ML API design.
This gave me the chills lol
Team
DefaultCPUAllocator: can't allocate memory: you tried to allocate 571894495956 bytes
nn.Linear(L * L, L)
expands to
nn.Linear(27342441, 5229)
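The numbers in the error line up with the weight matrix of that layer alone. A quick check of the arithmetic, assuming float32 weights (4 bytes each), which is the PyTorch default:

```python
L = 5229                      # the size implied by nn.Linear(27342441, 5229)
in_features = L * L
out_features = L
bytes_per_float32 = 4

# nn.Linear stores an (out_features x in_features) weight matrix
weight_bytes = in_features * out_features * bytes_per_float32
print(in_features)
print(weight_bytes)
print(round(weight_bytes / 1e9, 1), "GB")
```

So the layer needs roughly 572 GB just for its weights, before gradients or optimizer state. The cost grows with the cube of `L`, so the layer would have to be restructured rather than given more memory.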
def __call__(self, source=None, model=None, stream: bool = False, *args, **kwargs):
    """Perform inference on an image or stream.

    Args:
        source (str | Path | list[str] | list[Path] | list[np.ndarray] | np.ndarray | torch.Tensor, optional):
            Source for inference.
        model (str | Path | torch.nn.Module, optional): Model for inference.
        stream (bool): Whether to stream the inference results. If True, returns a generator.
        *args (Any): Additional arguments for the inference method.
        **kwargs (Any): Additional keyword arguments for the inference method.

    Returns:
        (list[ultralytics.engine.results.Results] | generator): Results objects or generator of Results objects.
    """
    self.stream = stream
    if stream:
        return self.stream_inference(source, model, *args, **kwargs)
    else:
        return list(self.stream_inference(source, model, *args, **kwargs))  # merge list of Results into one
can anyone explain how and where is __call__ called? why model.predict would call this function?
problems with the scoring
wdym specifically?
if you dont include details ppl wont be able to help
Hey guys to all the people passionate about ml and ai, I have started a study group where passionate people who are studying ai and ml can chat, discuss, and create small projects together! I am very open to suggestions and I believe we can learn a TON together, if any of you are interested then just dm me
According to the paper I've read, naive bayes scored around 0.79 macro f1. However, I only know the rough set of hyper parameters used, and the kind of preprocessing applied on the dataset.
With my best guesstimate, my model's scores are ~0.68.
The paper also applied 2 advanced preprocessing steps that I can't replicate with traditional sklearn: lemmatization and tokenization. Everywhere I read, it seems that the documents have to be heavily processed to get a good result with naive bayes.
__call__ is called when you call the object, like
x = Myclass(...)
x(...)
This will work if __call__ is defined; otherwise it won't. In this case, the arguments in the definition of __call__ are the same ones you have to pass to x()
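Here is a small sketch of how a `predict` method can end up in `__call__`. `ToyPredictor` and `ToyModel` are hypothetical classes written for this example, not the library's actual code. I believe the library follows a similar delegation pattern, but its source is the place to confirm that.

```python
class ToyPredictor:
    """Hypothetical stand-in for a predictor object; not the Ultralytics class."""

    def __call__(self, source=None, stream=False):
        # Runs whenever an instance is called like a function: predictor(...)
        results = (f"result for {item}" for item in source)
        return results if stream else list(results)

class ToyModel:
    """Hypothetical wrapper whose predict() delegates to the predictor."""

    def __init__(self):
        self.predictor = ToyPredictor()

    def predict(self, source, stream=False):
        # Calling the instance here is what triggers ToyPredictor.__call__
        return self.predictor(source=source, stream=stream)

model = ToyModel()
out = model.predict(["a.jpg", "b.jpg"])
print(out)
```

Nothing in the code ever writes `__call__` by name. Python looks it up on the class whenever an instance is followed by parentheses, which is why searching a codebase for explicit calls to `__call__` usually finds nothing.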
Does anyone have a project idea or an active project in progress?
If you need technical support or a developer, feel free to reach out.

Does anyone know the data set for reading lips
I've been trying to figure out where it is
Why not make one?
Could probably use any closed captioned video footage to start
I would, but I would need a large dataset. I'm down to make my own dataset, but given how differently people move their mouths, if the audio is corrupted or low quality and I can't understand it, then how can I reliably build an AI when I can't tell what people are mouthing?
Hey guys to all the people passionate about ml and ai, I have started a study group where passionate people who are studying ai and ml can chat, discuss, and create small projects together! I am very open to suggestions and I believe we can learn a TON together, if any of you are interested then just dm me, we just need 5-7 more passionate active people who are studying ml and ai
is it possible to train a good text to image generator model just with kaggle's GPU? I don't wanna waste time trying to do something that isn't possible
By good you mean the accuracy of the model should be 95% accurate in all test cases? Like that level good?
in the sense that the images it creates at least contain the objects described in the text, even if not at the best resolution
I think it shouldn't be a problem since there are good models you can fine tune based on but I think in terms of resources, Colab should have higher limits. I have heard Kaggle has stricter limits.
And colab offers TPUs
i want to train them from scratch for learning purposes
what datasets are available for this?
If you train from scratch, then you would need to at least rent a cloud instance to train on since you would need a lot of data and training time.
https://huggingface.co/datasets/jackyhate/text-to-image-2M
https://github.com/poloclub/diffusiondb
would that be the minimum requirement, or could I use Colab or Kaggle for a small-scale model?
Does Kaggle have any decent data? I've only used it for 2 months initially when I started out. I think for any reasonable data you just want to find some neat API that you can pump into your warehouse
It depends what kind of data you are looking for
Small datasets for practicing and prototyping, yes
Whole big datasets that answer real world use cases, maybe not
I mean, I'm not sure what I'm supposed to say other than the obvious?
you (likely) get worse results than that paper you're referencing cause you're not doing the crucial processing steps
if you're stuck with only pandas and sklearn, yeah they don't provide an easy way to do those things to my knowledge
I was afraid you'd say so. I'll look into other methods to tweak the performance of my linear models a bit then.
Hello, I am a first-year Bachelor's in CS student. I have learned Python and Pandas and did some basic EDA on the Titanic and Netflix datasets, which makes me think this field is interesting to work in. So I have a question: does data science require heavy math knowledge? I am currently learning statistics from Khan Academy. I'm weak in math right now, but if I keep practicing questions and exercises, will I be able to get it done by graduation, or should I also keep learning web development side by side like I'm doing currently?
statistics and linear algebra will be your bread and butter, yeah
Mostly depends on the company. I'd say there are math concepts that are important to understand, mostly to understand how data behaves, justifying things like data transformations, and diagnosing model behaviors. But me personally, I'm almost never doing complex math directly.
hey guys, what is the main difference between machine learning and deep learning
Deep learning is a subset of machine learning. It's just when you have a neural network with a lot of layers.
Machine learning doesn't even have to be a neural network
so AI is teaching something to solve a problem through machine learning or deep learning; in machine learning there is supervised learning and unsupervised learning, but deep learning uses neural networks to learn on its own using mathematical formulas, is that correct
Uhh you're throwing a lot of terms in there
Remember that anything that is deep learning is also machine learning. So there's no point saying "machine learning or deep learning"
That would be like saying "I want fruit or apples"
I'd probably explain "deep learning" by showing this graph and article: https://epoch.ai/blog/compute-trends
somewhere around 2010, new model architectures were developed that could absorb way more compute and data and show way better results. That resulted in an exponential increase of the amount of compute spent on training, and it was a significant enough change that people invented a term for it.
see also the attached paper. here's how it describes the advent of deep learning:
I have a question regarding imbalanced datasets. If the minority class has a low recall rate, what methods can be used to improve its recall performance?
Even though I try to use SMOTE, the recall rate only increase 1%
I generally wouldn't expect smote to do much in most cases
have you tried tuning the decision threshold to trade precision for recall?
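A rough sketch of what tuning the decision threshold looks like. The labels and scores are made up. With scikit-learn the scores would typically come from the classifier's `predict_proba` output for the minority class, instead of relying on the default 0.5 cutoff that `predict` applies.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall for the positive (minority) class at a given cutoff."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, y_true))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, y_true))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, y_true))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels and predicted probabilities for the minority class
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.35, 0.40, 0.55, 0.90]

for threshold in (0.5, 0.4, 0.3):
    p, r = precision_recall(y_true, scores, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

Lowering the cutoff raises recall and costs precision. The threshold should be picked on a validation split, not on the test set, so that the final numbers stay honest.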
Try SMOTEENN as well. I went from 0.00 recall on the minority class (support=249) to 0.90 recall w/ support = 4687. f1-scores were also balanced, 0.83 & 0.87.
I am running into a different problem, probably also related to imbalance. Using statistical tests:
- t-test for independence (scp.stats.ttest_ind)
- Mann-Whitney U (scp.stats.mannwhitneyu)
- Baumgartner-Weiss-Schindler (scp.stats.bws_test)
The first two give seemingly reasonable outcomes, with variations in the resulting p-values. But bws-test is always exactly 0.0001, without any extra decimal places, across 18 different sets. Can't figure out wtf is going on
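from imblearn.combine import SMOTEENN
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import classification_report

smoteenn = SMOTEENN(
    random_state=42,
    sampling_strategy='minority'
)
df_work_res, df_trgt_res = smoteenn.fit_resample(df_work, df_trgt)
# ----------------------------------
# same logistic regression as before
class_lr = LogisticRegressionCV(
    cv=5,
    random_state=42,
    max_iter=1000
)
class_lr.fit(df_work_res, df_trgt_res.values.flatten())
# note: this report is computed on the resampled training data, not on a held-out set
y_pred = class_lr.predict(df_work_res)
print(classification_report(df_trgt_res, y_pred))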
so you simply preprocess both the X and the y dataframes with SMOTEENN, and then proceed with the usual LogisticRegression procedures.
BTW, df_work in my data set has 39 features & over 5000 rows.
I did, but after adjusting the threshold, my precision slightly decreased from 93% to 80%, and another class accuracy also dropped.
I appreciate it. I'll try it this morning.
anyone got ai hat for rpi5? how do i convert onnx to hef i am damn losing my mind
On a scale of matmul logic to designing a rocket from scratch how hard is it to learn this w/o school
Can I use a scraper for gathering data that I need if I can't find a data set
What are you asking
Like sure ig?
WTF, google?
autoregressive large language models just 'glitch' like that sometimes, repeating something over and over and over and over and over again until it reaches some limit or breaks in a different way
(hard to tell exactly why as they're black boxes, but either they saw something weird in the training data or something in the current input just messed up their probability distribution)
Hii
Thought I'd join since I love Python and I work on a lot of projects and thought maybe I can find people to share my work with
holy moly this is too hard
which is expected and that's what tuning decision thresholds does
you balance the precision recall until you hit a sweet spot
what if you must improve everything at once? get more quality data, usually; some parameter tuning could also help
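A minimal sketch of that threshold tuning, assuming scikit-learn and a synthetic imbalanced dataset (the data, the logistic regression model and the 0.80 recall target are all made up for illustration): pick the threshold with the best precision among those that still reach the recall you need.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic data with roughly 5% positives (stand-in for a real imbalanced set).
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]

# precision/recall have one more entry than thresholds; drop the last one.
precision, recall, thresholds = precision_recall_curve(y_test, proba)
target_recall = 0.80  # hypothetical requirement
ok = recall[:-1] >= target_recall

# Among thresholds that meet the recall target, take the most precise one.
best = int(np.argmax(np.where(ok, precision[:-1], -1.0)))
chosen = thresholds[best]
y_pred = (proba >= chosen).astype(int)
print(f"threshold={chosen:.3f} precision={precision[best]:.2f} recall={recall[best]:.2f}")
```

In practice you'd choose the threshold on a validation split and only then report on the test split; the sketch skips that to stay short.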
Oh, I get it.
Hello, I trained my model to detect the information on a driver's license, but the text it's detecting is wrong. How can I improve this? I am using YOLOv8 for object detection.
I tried Google Vision but my manager wants an ML model explicitly trained on a dataset.
Would you say it's worth it to learn Optuna and/or SHAP? Or would you recommend me to learn it?
I haven't even heard of those
So rather not?
i haven't used it myself, but my colleagues use optuna
nothing you can't do manually, but it can help you set up and parallelize hyperparameter search
Ok, I'll have a look at it. And what do you think about Shap? And in general what do you think about a VotingClassifier, is it worth using it? Do you use them or rather not?
i have no idea about that
No problem
does anyone have any experience with the microsoft/fedml repo, i've been reading into it for a few days now, and im currently having trouble running the fedavg distributed bash script
Interesting article on geometric relationships between target variables, prediction outcomes, and the Hat Matrix
https://functor.network/user/3370/entry/1645
e.g. y_pred = H_hat * y_target in ordinary least squares
The predicted y are a projection of the observed target feature to the span of the feature vectors in whatever dimensionality your problem has. This projection is the Hat matrix / operator
note that beta are the fitted coefficients of the linear regression problem. Cool way to view this.
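To make the projection view concrete, here is a small NumPy sketch on made-up data. It assumes the usual OLS definition H = X (X^T X)^-1 X^T and checks that H applied to y gives the same predictions as fitting beta first, and that H behaves like a projection (symmetric, and applying it twice changes nothing).

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3

# Design matrix with an intercept column plus p random features (synthetic).
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=n)

# Usual route: fit beta, then predict.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_pred = X @ beta

# Hat matrix route: project y onto the column space of X.
# (Explicit inverse is fine for a tiny demo; it is not how you'd do it at scale.)
H = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(H @ y, y_pred))   # same predictions
print(np.allclose(H @ H, H))        # projecting twice changes nothing
print(round(float(np.trace(H)), 6)) # equals the number of fitted coefficients
```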
Optuna is for searching hyper parameters, and you can even search for optimal values in certain feature engineering transformations. I think it's worth learning. It definitely beats gridsearch and randomized search, so you'd spend less time training models. By shap I assume you also mean the shap library for interpretation. It has tools for both global and local interpretation. It might be a nice to know, especially for justifying predictions made, but I wouldn't say it's crucial.
And a VotingClassifier is a way to ensemble different classifiers. You rarely see this outside of kaggle competitions that stack 20+ models to squeeze out high scores on the leaderboard. I wouldn't say this is worth learning.
Going to take a wild shot in the dark
Load to PyTorch then save in whatever format
I have absolutely no idea if this would work but that's my intuition
i was damn losing my mind
ncnn onnx and pt are compatible with rpi5
however since it has no tops it outputs 5 fps for cv
i have hailo hat installed in the pi 26 tops
u need hef format for it (not compatible for pt ncnn and onnx)
too damn hard to convert lack of documentation and really hard to understand
I am sharing with you a summary document on my approach to hybrid neuro-symbolic AI. https://transfert.free.fr/NwecLiq
guys, don't make charts like this one (https://viz.wtf/post/673472354894086144/sticking-your-neck-out)
hey guys im trying to work on a cv project with Sentinel Bands in tif file format and i want to know if there is any open source models that works well with them
Hi guys, I'm glad to be amongst the best developers. I'd like to seek your opinion on a hands-on Python project to do after completing a Python fundamentals course; I'm planning to take an AI Engineering course after this.
a nice project would be an AI agent from end to end.
there are templates you can follow
I will really appreciate it, thank you!
does anyone know of a prebuilt mcp server i can use to connect a llm to my project?
How proficient at python do i need to be to pursue data science
not that much
what you really need to be proficient at is the theory - math, linear algebra, statistics
Places like Meta will test your Python skills at a relatively high level. So, it depends on who is interviewing you.
But like @agile cobalt said, it's more than just Python. Statistics, linear algebra, some system design, domain knowledge. It's far more than import pandas as pd, followed by some plotting.
of that set of skills, I think that Linear Algebra is the one that most people neglect.
OTOH, that neglecting also happens on the employer side of things. So not sure exactly how necessary it is to be an expert. I mean that, yes, for actual Data Science linear algebra is absolutely important. But everyone is neglecting it, so the question is open as to how deeply it would be tested during the interviewing.
Like, ask yourself this: could you represent a quadratic in matrix form, and from there prove why matrix diagonalization is equivalent to a stepwise conjugate gradient approach to finding a minimum. And from there describe why the diagonalization, while rigorous, is numerically unfavorable? Stuff like that.
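For anyone who wants to poke at that question, a rough NumPy sketch on a made-up 5x5 problem: the quadratic f(x) = 1/2 x^T A x - b^T x with A symmetric positive definite has its minimum where A x = b. One route diagonalizes A; the other runs conjugate gradient, which in exact arithmetic needs at most n steps. The function name and the test matrix are illustrative only.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10):
    """Minimize 0.5 * x^T A x - b^T x for symmetric positive definite A."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual = negative gradient
    d = r.copy()           # first search direction
    steps = 0
    for _ in range(len(b)):
        rr = r @ r
        if np.sqrt(rr) < tol:
            break
        Ad = A @ d
        alpha = rr / (d @ Ad)       # exact line search along d
        x = x + alpha * d
        r = r - alpha * Ad
        beta = (r @ r) / rr
        d = r + beta * d            # next direction, A-conjugate to the previous ones
        steps += 1
    return x, steps

rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5 * np.eye(5)   # synthetic SPD matrix
b = rng.normal(size=5)

# Route 1: diagonalize A = Q diag(eigvals) Q^T and solve in the eigenbasis.
eigvals, Q = np.linalg.eigh(A)
x_eig = Q @ ((Q.T @ b) / eigvals)

# Route 2: conjugate gradient, using only matrix-vector products.
x_cg, steps = conjugate_gradient(A, b)
print(steps, np.allclose(x_cg, x_eig))
```

The practical difference is cost: a full eigendecomposition is O(n^3), while CG only ever multiplies A by a vector and can stop early.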
it's also good to think of Data Science as Machine Learning + Statistics. So if you are going to be good at the ML side, you need to understand optimization theory.
RuntimeError: [enforce fail at alloc_cpu.cpp:124] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 1715683487868 bytes. Error code 12 (Cannot allocate memory)
PyTorch consuming my entire computer bro
1.7 TB of ram 🔥
How would someone use Python in lets say, clinical research, where statistics and graphs, etc. are utilized
(I'm new to python but am interested in how I can learn and utilize it for research based applications)
You can use python to manipulate data and generate data visualizations
It's hard to be more specific without knowing what kind of data you're working with and what you want to find out about it
Ah gotcha, thanks!
Hm, what do you think is the best way to learn python for the purpose I mentioned above?
There's a book called Biostatistics with Python by Darko Medin. Check out the table of contents to get an idea of things Python could help with. Python has a large community that builds tools around many domains, so you'll find code written and maintained by people that others often leverage for their own purposes.
I see
Thanks for the response!
I'll look into it
Another question, do you think I can achieve functional literacy with python for data analysis, etc. within 5-6 months?
I would say yes
If you are completely new to programming, it will have its challenges. There are many things not covered by any singular resource you pick up, so you'll have to get used to looking up answers by yourself.
can we please stop referring to plots as "graphs"? A graph is a concept that is important in actual data science, and is central to Networks
Names can mean different things, nothing new
Google "graph" and see what pops up first
What do I need to do to get an internship as a data scientist
Well, for most internships, you probably need to be enrolled in a relevant university program
Mmm, this guy's cooking stuff, le GPT in 200 LoC: https://x.com/karpathy/status/2021694437152157847
https://gist.github.com/karpathy/8627fe009c40f57531cb18360106ce95
Hi as an mle do u guys think it's important to know? I hear about it all the time, that it's good for model registry and like keeping track of the models and such. I started a course some days ago and just wanted to know if it's actually relevant in practice
that it's important to know what?
probably mlflow, it's pretty much the standard for model tracking
you're probably right
Ml flow
I ran the mlflow server for one of my projects last year. we ended up sticking with a version that's kinda old now because there kept being bugs in newer versions
Sorry I didn't see I didn't type it lmao I don't have my glasses on rn
I think it was 3.2 that we used. at the time, it had pretty strong support for model training, but weak support for testing agentic pipelines.
hmm yeah, it's industry standard, focus on the model registry for production deployment stuff
but if there's ever going to be a standard platform for tracking model development, it's going to be MLflow, in whatever form it evolves into.
Ok tyyy for the advice!! :)
Guys, if I needed a machine that can answer me by gathering data online, which library is the best?
I have an idea for a little side project in python, but i dont know how to implement it, so i wanted to ask for help here.
In short, I want to create an AI that hallucinates faces.
First, I need a (ideally pretrained) ML model that can analyze an image and output a probability from 0.0 to 1.0 denoting how confident it is that this image is a face.
Then, I want to take the image vector, and somehow compute the closest image vector (using Euclidean distance if possible? or some other distance idk) for which the classifier does recognise a face. I'm thinking the easiest approach is to manually set a threshold. i.e. p > 0.8 means it recognises a face.
Then, output that vector as an image to a new file. The output should look something like a messed-up hallucinated face in an image where there isn't one.
So my two questions would be:
- What facial recognition models output a confidence/probability instead of a binary class?
- How do I go about finding the closest vector? Im assuming the model needs to grant me access to its gradient?
Thanks in advance
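On the second question, a toy sketch of the gradient idea, with loud assumptions: a random logistic model stands in for a real face classifier (the weights w and bias b are made up), and the gradient with respect to the input is written out by hand. With a real network you'd get that gradient from an autograd framework instead. The loop nudges the input along the gradient until the score passes the 0.8 threshold.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
w = rng.normal(size=64)              # hypothetical classifier weights
b = -3.0                             # hypothetical bias
x0 = rng.normal(scale=0.1, size=64)  # starting "image" (flattened 8x8 noise)

def predict(x):
    """Stand-in for a model returning P(face) in [0, 1]."""
    return sigmoid(w @ x + b)

x = x0.copy()
step = 0.05      # Euclidean length of each update
threshold = 0.8
for _ in range(10_000):
    p = predict(x)
    if p > threshold:
        break
    grad = p * (1.0 - p) * w                 # dP/dx for the logistic stand-in
    x += step * grad / np.linalg.norm(grad)  # small step uphill in input space

print(f"start p={predict(x0):.3f}, end p={predict(x):.3f}, "
      f"moved {np.linalg.norm(x - x0):.3f}")
```

For this linear stand-in the path is a straight line, so the result is (up to step size) the closest input in Euclidean distance. For a deep network the result is only a nearby input, not provably the closest one.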
I feel like you are describing a k-nearest neighbor
Look into "Variational auto-encoder "
So, I did an LSTM-based time series forecasting of electric load profiles for a city, and the back test looks like this
the behavior near the peaks is, I think, reasonable for a neural network. Peak forecasting is a problem that usually depends on several techniques applied at the same time (e.g. something like ARIMA at close ranges, etc)
But the behavior in the troughs is puzzling. The NN cannot predict the shape of anything below a certain baseline
any thoughts / ideas / etc.?
What is your loss function? Compare MSE and MAE. Also, did you normalize the input values?
I'll test the normalization.
I mention the loss function because the behavior looks like the model predicts a low average when uncertain about the values in that range, might be biased towards minimizing error in peaks
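A tiny illustration of why the loss choice matters here, on made-up numbers (not the actual load data): if a model can only output one value for a range it is unsure about, MSE pulls that value toward the mean and MAE toward the median, and these differ a lot when a few peaks dominate.

```python
import numpy as np

# Hypothetical load values: mostly low, with two peaks.
load = np.array([10.0, 11.0, 12.0, 11.0, 10.0, 40.0, 45.0])

# Try every constant prediction between min and max in steps of 0.01.
candidates = np.linspace(load.min(), load.max(), 3501)
errors = load[None, :] - candidates[:, None]

mse = (errors ** 2).mean(axis=1)
mae = np.abs(errors).mean(axis=1)

best_mse = candidates[mse.argmin()]  # lands near the mean
best_mae = candidates[mae.argmin()]  # lands near the median
print(best_mse, load.mean())
print(best_mae, np.median(load))
```

This doesn't diagnose the LSTM by itself, it just shows what "predicting an average under uncertainty" looks like under each loss.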
it is predicting those shoulder-like features, but always at the same-ish level
similar pathology, it seems
I'm working on a website that turns a prompt into a 3D model. I do this by using a smaller bot that reads through the prompt and figures out what the user wants made; then, to save money and power, it searches a database full of templates instead of generating a model every single time. If you know how to optimize this any further, please let me know. Last I checked, it made a thing within 300-400 milliseconds.
Sounds good to me
me?
Ye
ok, thanks!
Are you trying to reduce time / cost past a specific value?
I'm just trying to get the best results within the least amount of time
I don't wanna be too picky, I would rather have it look good than be fast but there is also a balance between fast and good quality I wanna meet
Depends on what tradeoff you're willing to make. E.g. you could generate an embedding of the description of the template, and embeddings of the user prompt, then use a similarity metric between the prompt and template embeddings, cutting out the llm entirely
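A rough sketch of that lookup, with a big assumption: a bag-of-words count vector stands in for a real embedding model, and the template names and descriptions are invented. The shape of the solution is the same with real embeddings: embed every template description once, embed the prompt, and pick the template with the highest cosine similarity.

```python
import numpy as np

# Hypothetical template database: name -> description.
templates = {
    "chair": "wooden chair with four legs and a backrest",
    "car": "sports car with four wheels and two doors",
    "tree": "tall tree with green leaves and a brown trunk",
}

def tokenize(text):
    return text.lower().split()

# Vocabulary built from the template descriptions only.
vocab = sorted({tok for desc in templates.values() for tok in tokenize(desc)})
index = {tok: i for i, tok in enumerate(vocab)}

def embed(text):
    """Toy embedding: word counts over the vocabulary (stand-in for a real model)."""
    vec = np.zeros(len(vocab))
    for tok in tokenize(text):
        if tok in index:
            vec[index[tok]] += 1.0
    return vec

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def best_template(prompt):
    query = embed(prompt)
    scores = {name: cosine(query, embed(desc)) for name, desc in templates.items()}
    return max(scores, key=scores.get), scores

name, scores = best_template("a red sports car with big wheels")
print(name, scores)
```

In a real setup the template embeddings would be computed once and cached, so each request costs one embedding call plus a few dot products.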
That makes a lot more sense, but the AI isn't used much like this; the script finds keywords, breaking down what the user wants, then formats it into "blueprints" telling the AI how to make the model
Hey guys, quick question, anyone know any good websites for small scale project ideas? i wanted to find specific data engineering projects (involving modelling and simulation) but I can't really find anything interesting. I don't know where to look.
pythonanywhere may be a good one
could anyone recommend me some videos for multiple variable linear regression?
thx
Hello Thank you for taking a look at my Problem
Cleaning up easyOCR data
Goal
Cleaning up data read from easyOCR deterministically,
so that a locally running LLM (maybe Ollama or Phi-3 Mini) is able to extract valuable information from the leftover data.
Problem
The data is from receipts. So of course it has a lot of numbers and lines don't always perfectly line up.
The easyOCR data does extract most information but it's jumbled and has formatting issues.
for example often 0's turn into o's.
I want to clean up the data deterministically before feeding it to the LLM as they're small models and not that powerful.
I'd be grateful for any type of feedback.
But these are my main questions.
- should I use a larger model and interact with it via API instead of running a local model
- is there a better library(text recognition) to use for this endeavour
- how can I clean up the given data
- Is this project even feasible
- Should I try processing the image before feeding it to easyOCR
Things I've tried but maybe didn't implement well
- Flattening all text
- Then splitting on spaces
- Then matching for a number via regex and replacing o with 0
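import re

# function name is a placeholder; the original snippet only showed the body
def normalize_token(token):
    t = token
    # only normalize if token contains digits or number-like chars
    if not re.search(r'\d', t):
        return token
    # common OCR mistakes
    t = t.replace('o', '0')
    t = t.replace('O', '0')
    # remove invalid characters except digits and separators
    t = re.sub(r'[^0-9,.\-%]', '', t)
    return t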
- trying to parse text into types like the following (and failed miserably):
- money
- text
- percent value
- some more
A picture and a sample of the data extracted from that are in the next message
['max wallner', 'bahnhofstraße', 'kunden-nr _', '20076', '3100 st. pölten', 'ihre bestellung', 'vom', '28-10-20', 'ihre', 'uid-nummer', 'atu14009106', 'wien', 'rechnung nr.', 'a 1595', '06-11-20', 'wir lieferten ihnen mit lkw am', 'movember 20', 'zahlbar und klagbar in wien', 'preis', 'betrag', 'einheit', 'produkt', 'stk,', 'oled-fernseher, smart tv 40', '720,00', '440,00', 'stk.', 'oled-fernscher , smart tv 46', '050,00', '1.050,00', 'stk .', 'oled-fernseher, smart tv 52', '1.890,00', '3,780,00', 'stk.', 'playstation ps4', '215,00', '1.290,00', '7,560,00', '30 % wiederverkauferrabatt', '2.268,00', '5,292,00', '10 % sonderrabatt', '529,20', '4,762,80', '20 % ust', '952,56', '5,715,36', 'menge']
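One deterministic step that might help with tokens like these, sketched under an assumption about the format: amounts are written German-style, so the last comma plus two digits is the decimal part, and any earlier "." or "," is a thousands separator that OCR sometimes gets wrong. The helper name parse_amount is made up; the tokens are copied from the sample above.

```python
import re

# 1-3 leading digits, optional groups of 3 digits behind "." or ",",
# then a comma and exactly two decimals.
AMOUNT = re.compile(r"^\d{1,3}(?:[.,]\d{3})*,\d{2}$")

def parse_amount(token):
    """Return a float for tokens like '1.050,00' or '3,780,00', else None."""
    token = token.strip()
    if not AMOUNT.match(token):
        return None
    whole, cents = token[:-3], token[-2:]
    whole = re.sub(r"[.,]", "", whole)  # drop thousands separators, whichever OCR produced
    return float(f"{whole}.{cents}")

tokens = ['stk,', '720,00', '050,00', '1.050,00', '3,780,00',
          '20 % ust', '952,56', '5,715,36', '20076']
for tok in tokens:
    print(repr(tok), '->', parse_amount(tok))
```

Tokens that don't look like amounts come back as None, so they can be passed on to the LLM untouched. Cross-checking parsed values against each other (e.g. price times quantity) would be a natural next step, but needs the line structure that the flat list has lost.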
Does anyone know how long it takes to make an image or audio dataset?
Image:
Q&A:
Q0: how big is the data set?
A0: basic image detection
Which is usually a thousand images to learn detection.
Q1: what type of images am I looking for?
A1: Humans,bikes,cars,trees,animals.
Q2: why don't I just use CV2?
A2: those are pre-trained models.
Q3: how many folders am I going to use?
A3: 5 for the dataset !
Q4: why did I put this into an answer question format?
A4: so it's easier to explain.
Q5: why didn't I start this when I was 12?
A5: I didn't know I could do it along with programming at the time.
have you tried one of the modern OCR models based on VLMs? say lightonocr-2 1b, paddleocr-vl-1.5 1b, deepseek ocr, glm ocr, etc
depending on your setup it might be more attractive to run one of these, despite an increase to compute compared to easyocr probably, than running a decent-ish ocr and trying to have large models fix it
also on a tangent; Ollama is not a model, but a program/library to run models
the phi series also has v4 now
a good chunk of them have demo spaces you can try online, say here for lightonocr-2-1b
Thank you.
No I havent looked into modern OCR models.
I'm at school rn but I'll get to it asap
wat
I looked up all the words
you built like a knowledge graph type thing that uses AI
Wow I tried lightonocr-2-1b and it pretty much got all of the text spot on.
This is going to make everything so much easier.
Thank you so much
hello guys, quick question about gradient descent and stochastic gradient descent. as far as I understand, gradient descent finds the optimum function/fit by considering the entire data set, right? for example for linear regression using sum of squared residuals as the loss function.
what i fail to understand is how stochastic gradient descent is similarly accurate whilst being more efficient? I see that it takes one random sample at a time but how does that produce a best fit for the entire data set?
it's not "efficient". and it's not even guaranteed to be accurate if the problem surface isn't convex (which it pretty much never is)
I guess my answer isn't helpful.
ah, the statquest video made it seem that it was efficient in the sense of it takes fewer sample steps? like for example 23k genes and 1 million data points would be 23 billion instances it has to calculate but taking one sample/batch at a time reduces that number?
suppose you take a step down the gradient for every training instance
or you take several instances into account before you take a step
wouldn't taking fewer, more informed steps be more efficient?
it would.
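A small experiment to make the trade-off tangible, on synthetic linear-regression data (the learning rates, batch size and epoch counts are arbitrary picks, not tuned): full-batch gradient descent uses every row for each step, mini-batch SGD uses 32 random rows per step. Both end up near the same coefficients on this convex problem, but SGD touches far fewer rows in total.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w + rng.normal(scale=0.1, size=n)

def gradient(w, Xb, yb):
    """Gradient of the mean squared error on the batch (Xb, yb)."""
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

# Full-batch gradient descent: every step looks at all n rows.
w_gd = np.zeros(3)
for _ in range(200):
    w_gd -= 0.1 * gradient(w_gd, X, y)
rows_gd = 200 * n

# Mini-batch SGD: every step looks at 32 shuffled rows.
w_sgd = np.zeros(3)
batch_size = 32
epochs = 5
for _ in range(epochs):
    order = rng.permutation(n)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        w_sgd -= 0.05 * gradient(w_sgd, X[idx], y[idx])
rows_sgd = epochs * n

print("GD :", w_gd.round(3), "rows processed:", rows_gd)
print("SGD:", w_sgd.round(3), "rows processed:", rows_sgd)
```

Each mini-batch gradient is a noisy estimate of the full gradient, so the SGD result jitters around the optimum instead of landing exactly on it. On non-convex losses neither method comes with a guarantee of finding the global minimum.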